Parameter estimation

The probabilities and utilities specified in the effects of probabilistic rules can be automatically optimised from dialogue data. OpenDial relies on Bayesian learning to perform this parameter estimation. Each (univariate or multivariate) parameter is therefore associated with a specific distribution that is progressively narrowed down as more data points are observed in order to provide the best "fit" for the observed data.
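The narrowing effect described above can be illustrated with a minimal sketch of a conjugate Bayesian update on a single probability. This is not OpenDial's inference code: the Beta prior, the observation list and the function names are assumptions chosen for illustration only.

```python
def beta_mean_var(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    var = (alpha * beta) / (total ** 2 * (total + 1))
    return mean, var

def update(alpha, beta, observations):
    """Conjugate update: each True adds 1 to alpha, each False adds 1 to beta."""
    successes = sum(1 for o in observations if o)
    failures = len(observations) - successes
    return alpha + successes, beta + failures

# Hypothetical flat prior and hypothetical observations.
prior = (1.0, 1.0)
data = [True, True, False, True, True, True, False, True]

posterior = update(prior[0], prior[1], data)
prior_mean, prior_var = beta_mean_var(*prior)
post_mean, post_var = beta_mean_var(*posterior)

print("prior:     mean=%.3f var=%.4f" % (prior_mean, prior_var))
print("posterior: mean=%.3f var=%.4f" % (post_mean, post_var))
```

The posterior variance is smaller than the prior variance, which is the sense in which the distribution is "narrowed down" as data points accumulate.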

1. Data collection

Recording dialogues

OpenDial comes bundled with specific functions to export and import dialogues right from the user interface. To record a particular interaction, click on Interactions -> Save Dialogue As.... The function records the sequence of dialogue turns (from the user and from the system) in an XML file. Every turn contains a particular utterance (encoded as a probability distribution in the case of uncertain user inputs) as well as other optional contextual variables. The recorded XML file has the following skeleton:


  <variable id="u_u">
    <!-- distribution for the user utterance -->
  </variable>
  <!-- other contextual variables -->

  <variable id="u_m">
    <!-- system utterance -->
  </variable>
  <!-- other contextual variables -->

  <!-- etc. -->

The order of user and system turns is arbitrary (user turns can follow other user turns without any system utterance, and vice versa). The (domain-specific) contextual variables to include in each turn are specified in the system settings.
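As a rough illustration of how such a transcript could be inspected outside OpenDial, the sketch below reads the variables of each turn with Python's standard XML parser. Only the variable elements and their ids (u_u, u_m) come from the skeleton; the wrapper element names (interaction, userTurn, systemTurn), the value elements and the prob attribute are assumptions made for this example and may not match the files actually written by OpenDial.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcript: element names other than <variable> are assumptions.
SAMPLE = """
<interaction>
  <userTurn>
    <variable id="u_u">
      <value prob="0.7">move left</value>
      <value prob="0.2">move back</value>
    </variable>
  </userTurn>
  <systemTurn>
    <variable id="u_m">
      <value>OK, moving left</value>
    </variable>
  </systemTurn>
</interaction>
"""

def read_turns(xml_text):
    """Return a list of (turn tag, {variable id: [(value, probability), ...]})."""
    root = ET.fromstring(xml_text)
    turns = []
    for turn in root:
        variables = {}
        for var in turn.iter("variable"):
            variables[var.get("id")] = [
                (v.text.strip(), float(v.get("prob", "1.0")))
                for v in var.iter("value")
            ]
        turns.append((turn.tag, variables))
    return turns

turns = read_turns(SAMPLE)
for tag, variables in turns:
    print(tag, variables)
```

Because the turns are read in document order, the sketch makes no assumption about user and system turns alternating.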

OpenDial can be paused/resumed at any time during the data collection using the Interactions -> Pause/Resume button.

Recording Wizard-of-Oz interactions

Wizard-of-Oz interactions can also be recorded through the OpenDial interface. A Wizard-of-Oz interaction is an interaction in which a human user is asked to interact with a system that is remotely operated by an unseen human agent. Such experiments can be run via the "remote connection" functionality integrated in the latest versions of OpenDial. To connect two OpenDial systems with one another, use the following procedure:

  • Start OpenDial on the two machines A and B (where A and B have mutually accessible IP addresses).
  • Click on Help -> About on machine A. Copy the local address (IP and port) for the machine.
  • Click on Interaction -> Connect to Remote Client on machine B. Copy the address and port for machine A into the field and click OK.
  • The connection between the two clients is now established. To use this remote connection for a Wizard-of-Oz dialogue, the machine playing the role of the Wizard should change its role by clicking on Interaction -> Interaction Role -> System.
  • At the end of the interaction, simply save the transcript into XML by clicking on Interaction -> Save Dialogue As....

If one wishes to conduct more advanced Wizard-of-Oz experiments (where e.g. the wizard can only choose among the set of actions available from a predefined dialogue domain), the module WizardControl can be used. Using this module, OpenDial will trigger the domain models as usual upon the reception of new user inputs, but will not select the highest-utility actions. Instead, the system will show a list of possible actions on the right side of the chat window. The wizard is then expected to select the most appropriate action in this list. Note that the action selection box only displays actions that are relevant in the current situation (i.e. for which at least one utility rule specifies a utility). Using the domain from the step-by-step example, three actions will therefore be available to the wizard after observing the user input a_u=Request(Left), namely Move(Left), AskRepeat, and the default None (representing the absence of any action).
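The filtering of the action list can be pictured with the following sketch. It is not the WizardControl implementation: the rule representation, the conditions and the utility value attached to AskRepeat are hypothetical and only mimic the shape of the step-by-step example.

```python
def relevant_actions(rules, state):
    """Collect the actions that receive a utility from at least one triggered rule."""
    actions = []
    for condition, utilities in rules:
        if condition(state):
            for action in utilities(state):
                if action not in actions:
                    actions.append(action)
    # The default None action (absence of any action) is always offered.
    return actions + ["None"]

# Hypothetical utility rules, each a (condition, utilities) pair of functions.
RULES = [
    # If the user requests a movement X, Move(X) gets a utility.
    (lambda s: s["a_u"].startswith("Request("),
     lambda s: {"Move(" + s["a_u"][len("Request("):-1] + ")": 1.0}),
    # Asking to repeat gets a utility whenever there is a user dialogue act.
    (lambda s: s["a_u"] != "None",
     lambda s: {"AskRepeat": 0.3}),  # hypothetical utility value
]

state = {"a_u": "Request(Left)"}
choices = relevant_actions(RULES, state)
print(choices)
```

With the input a_u=Request(Left), the sketch yields the same three choices as listed above: Move(Left), AskRepeat and None.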

Importing dialogues

Previously recorded dialogues can be imported via Interactions -> Import Dialogue From.... This function does more than simply display the dialogue history in the chat window - it actually "replays" the full interaction, performing a dialogue update at each step (including the update of parameter distributions). This import function is therefore a crucial tool for estimating model parameters from previously recorded dialogue data.

Two alternative import modes are available:

  • Standard Transcript simply replays the dialogue from start to finish, without making any particular assumption on the system actions.
  • Wizard-of-Oz Transcript is used when the system actions come from a wizard (or some other kind of gold standard). In this mode, the posterior distributions over the domain parameters are updated in order to reflect the choice of actions observed in the dialogue (see below).

2. Parameter estimation

With unannotated dialogues

Some parameters can be directly learned from raw, unannotated dialogues. Take for instance the dialogue domain described in the step-by-step example. The parameter theta_repeatpredict (which reflects the probability that the user will comply with the system request to repeat the instruction) can be automatically estimated through repeated interactions with users. Each observed dialogue act a_u after a system request AskRepeat will thus trigger a Bayesian update of the parameter.

One can easily test this learning mechanism by starting OpenDial with the domain domains/examples/example-step-by-step_params.xml and entering a few instructions with a low probability (in order to trigger the AskRepeat response). The state viewer shows how the prior prediction a_u^p and the actual dialogue act a_u' are combined.

The parameter theta_repeatpredict is automatically refined as a result of this update.

Once the interaction is complete, the posterior parameter distributions can be exported by clicking on Domains -> Export -> Parameters. As the posterior distribution of theta_repeatpredict may not be a Dirichlet distribution anymore due to the partial observability of the graphical model, the posterior distribution is encoded as a multivariate Gaussian distribution (with a diagonal covariance).
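The reason why the posterior may leave the Dirichlet family, and what a Gaussian approximation with diagonal covariance amounts to, can be sketched as follows. The code is an illustration under stated assumptions and not OpenDial's export routine: the two-dimensional parameter, the flat prior and the 0.7/0.3 observation confidence are all hypothetical.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one sample from a Dirichlet by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def fit_diagonal_gaussian(samples, weights):
    """Moment-match weighted samples with a per-dimension mean and variance."""
    total = sum(weights)
    dims = range(len(samples[0]))
    mean = [sum(w * s[d] for s, w in zip(samples, weights)) / total for d in dims]
    var = [sum(w * (s[d] - mean[d]) ** 2 for s, w in zip(samples, weights)) / total
           for d in dims]
    return mean, var

rng = random.Random(0)
# Hypothetical flat prior over (user repeats, user does something else).
prior_samples = [sample_dirichlet([1.0, 1.0], rng) for _ in range(5000)]

# Hypothetical uncertain observation: the user repeated with confidence 0.7.
# The likelihood is a mixture over both outcomes, so the exact posterior is a
# mixture of Dirichlets rather than a single Dirichlet.
weights = [0.7 * theta[0] + 0.3 * theta[1] for theta in prior_samples]

mean, var = fit_diagonal_gaussian(prior_samples, weights)
print("mean:", mean)
print("variance per dimension:", var)
```

Keeping only one variance per dimension is what "diagonal covariance" means here: correlations between the dimensions of the parameter are discarded.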

With Wizard-of-Oz dialogues

The parameter theta_repeatpredict could be directly estimated from unannotated dialogues. This is not the case for the utility theta_repeat. Indeed, there is no way the system could learn the utility of the AskRepeat action without receiving some feedback on the desirability of this action.

One simple way to estimate such utility parameters is to collect Wizard-of-Oz data (cf. explanations above) and estimate posterior parameter distributions from them. The most likely parameter values are in this case those that provide the best "fit" for the wizard decisions - in other words, the parameter values that best imitate the wizard's conversational behaviour in similar situations.[1]
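The idea of fitting a utility parameter to wizard decisions can be sketched with a simple sample-reweighting scheme. This is only an illustration and not the estimation procedure used by OpenDial (see [1] for that): the softmax likelihood, its temperature, the uniform prior and the list of wizard decisions are all assumptions.

```python
import math
import random

def wizard_likelihood(theta_repeat, move_utility, wizard_action, temperature=0.1):
    """Probability of the wizard's choice under a softmax over the two utilities."""
    utilities = {"Move": move_utility, "AskRepeat": theta_repeat}
    exps = {a: math.exp(u / temperature) for a, u in utilities.items()}
    return exps[wizard_action] / sum(exps.values())

def posterior_mean(samples, decisions):
    """Weight each sampled parameter value by how well it explains the wizard."""
    weights = []
    for theta in samples:
        w = 1.0
        for move_utility, action in decisions:
            w *= wizard_likelihood(theta, move_utility, action)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(samples, weights)) / total

rng = random.Random(0)
# Hypothetical uniform prior over the utility of AskRepeat.
samples = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

# Hypothetical wizard data: (expected utility of Move, action chosen by the wizard).
decisions = [(0.8, "Move"), (0.6, "Move"), (0.1, "AskRepeat"),
             (0.0, "AskRepeat"), (0.4, "Move")]

estimate = posterior_mean(samples, decisions)
print("posterior mean of theta_repeat: %.3f" % estimate)
```

In this toy data the wizard asks for a repetition when moving is worth at most 0.1 and moves when it is worth at least 0.4, so the parameter values that best imitate the wizard lie between those two bounds.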

In practice, you can simply import a previously recorded Wizard-of-Oz dialogue (for instance the one in domains/examples/woz-dialogue.xml). At the end of the interaction, the parameters will be automatically updated to reflect the Wizard-of-Oz decisions, as illustrated in the screenshots below.

Parameter distributions before learning:

Parameter distributions after learning:

(Notice that the theta_repeat distribution has narrowed down and that its mean is now centered around 0.2-0.3)

With simulated dialogues

The last method for parameter estimation is simulation. A simulator automatically generates user inputs in accordance with an internal model of the user behaviour.

The easiest way to build a user simulator in OpenDial is to use the Simulator module. The simulator module takes as parameter a dialogue domain specification in the same format as a standard dialogue domain. The simulator is triggered after each system action and automatically updates its internal state and generates new user inputs in accordance with the specified model.

An example of such a simulator domain is provided in domains/examples/example-simulator.xml. This simulator domain is composed of:

  • A reward model that determines the utility of the previous system action.
  • A simulator model that determines the next dialogue act from the user.
  • An error model that introduces noise and errors into the actual dialogue act perceived by the system.

Some state variables in this domain specification are labelled with an ^o suffix. This suffix indicates which variables are intended to be part of the generated output of the simulator (the remaining variables are internal to the simulator). In this particular example, the only output variable is the user dialogue act. More complex domains may however include other contextual variables. One should also note the use of a Dirichlet parameter called error, which is used for the simulator's error model. The first dimension of this parameter determines the confidence probability for the dialogue act actually selected by the simulator, the second dimension the confidence probability for another, erroneous dialogue act, and the third dimension the probability of no dialogue act. At runtime, the simulator samples a particular (multivariate) value from this Dirichlet and uses it to construct the final N-best list for the dialogue act.
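The construction of a noisy N-best list from such a three-dimensional Dirichlet can be sketched as below. The code is a stand-alone illustration rather than the Simulator module itself: the Dirichlet counts, the set of alternative dialogue acts and the function names are hypothetical.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one sample from a Dirichlet by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def noisy_nbest(true_act, other_acts, error_alphas, rng):
    """Build an N-best list around the dialogue act chosen by the simulator."""
    # Dimensions: selected act, one erroneous act, no dialogue act.
    p_true, p_wrong, p_none = sample_dirichlet(error_alphas, rng)
    wrong_act = rng.choice(other_acts)
    return {true_act: p_true, wrong_act: p_wrong, "None": p_none}

rng = random.Random(0)
# Hypothetical Dirichlet counts favouring the correct dialogue act.
error_alphas = [5.0, 1.0, 1.0]
nbest = noisy_nbest("Request(Left)",
                    ["Request(Right)", "Request(Forward)"],
                    error_alphas, rng)
print(nbest)
```

Since a fresh value is sampled for every simulated turn, the confidence attached to the correct dialogue act varies from turn to turn, which exposes the system to both clean and noisy inputs.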

To start the simulator, simply add the parameter -Dsimulator=path/to/the/simulator/domain on the command line. A simulated dialogue will then start. The simulation can be interrupted at any time by clicking on Interactions -> Pause/Resume.

More advanced simulators can be constructed as separate modules. The section External modules describes the implementation of such modules in more detail.

[1] See Lison (2014), chapter 5 for details.