Dialogue Control by POMDP Using Dialogue Data Statistics
Partially Observable Markov Decision Processes (POMDPs) are applied to action control to manage and support users' natural dialogue with conversational agents. The agent's actions must be determined, by probabilistic methods, from noisy sensor data in the real world, and the agent must choose its actions flexibly to reach a target dialogue sequence with the user while retaining as many statistical characteristics of the data as possible. This issue is addressed by two approaches: the POMDP probabilities are acquired automatically from Dynamic Bayesian Networks (DBNs) trained on a large amount of dialogue data, and the POMDP rewards are obtained from human evaluations and the predictive probabilities of agent actions. Using these probabilities and rewards, POMDP value iteration computes a policy that generates an action sequence maximizing both the predictive distributions of actions and the users' evaluations.
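The value-iteration step described above can be sketched on a toy problem. The following is a minimal, self-contained example, not the paper's implementation: the states, actions, observations, and all probabilities and rewards below are invented for illustration (in the paper, the probabilities would come from trained DBNs and the rewards from human evaluations). It performs exact alpha-vector backups for a two-state dialogue POMDP, without pruning.

```python
import itertools

# Hypothetical toy dialogue POMDP (all names and numbers are illustrative).
S = ["wants_info", "wants_end"]   # hidden user states
A = ["ask", "confirm"]            # agent actions
O = ["yes", "no"]                 # observed user responses
gamma = 0.9                       # discount factor

# T[a][s][s']: state-transition probabilities (assumed values; in the paper
# these would be estimated by DBNs trained on dialogue data).
T = {
    "ask":     [[0.8, 0.2], [0.3, 0.7]],
    "confirm": [[0.6, 0.4], [0.1, 0.9]],
}
# Z[a][s'][o]: observation probabilities for the noisy sensor channel.
Z = {
    "ask":     [[0.7, 0.3], [0.2, 0.8]],
    "confirm": [[0.9, 0.1], [0.3, 0.7]],
}
# R[a][s]: immediate rewards (in the paper, derived from human evaluations
# and agent-action predictive probabilities; here just made up).
R = {
    "ask":     [1.0, 0.0],
    "confirm": [0.0, 1.5],
}

def backup(alphas):
    """One exact value-iteration backup: returns the new alpha-vector set."""
    new = []
    for a in A:
        # Project each old alpha vector through (T, Z) for each observation:
        # gamma * sum_{s'} T(s,a,s') * Z(a,s',o) * alpha(s')
        proj = {
            o: [[gamma * sum(T[a][s][sp] * Z[a][sp][oi] * al[sp]
                             for sp in range(len(S)))
                 for s in range(len(S))]
                for al in alphas]
            for oi, o in enumerate(O)
        }
        # Cross-sum over observations, then add the immediate reward.
        for choice in itertools.product(*(proj[o] for o in O)):
            new.append(tuple(R[a][s] + sum(v[s] for v in choice)
                             for s in range(len(S))))
    return new

def value(belief, alphas):
    """Value of a belief point: max over alpha vectors of the dot product."""
    return max(sum(b * al[s] for s, b in enumerate(belief)) for al in alphas)

alphas = [(0.0, 0.0)]        # V0 = 0
for _ in range(3):           # three backups -> horizon-3 value function
    alphas = backup(alphas)
print(round(value([0.5, 0.5], alphas), 3))
```

The policy is read off by taking, at the current belief, the action of the alpha vector that attains the maximum in `value`; without pruning, the vector set grows as |A|·|V|^|O| per backup, so real solvers use point-based or pruning variants.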
Keywords: Partially Observable Markov Decision Process (POMDP) · Dialogue management · Multi-modal interaction · Dynamic Bayesian Network (DBN) · Expectation-Maximization (EM) algorithm