Learning optimal discourse strategies in a spoken dialogue system

Fromer, Jeanne C., 1975-

Advisor

Robert C. Berwick.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Participants in a conversation can often realize their conversational goals in multiple ways by employing different discourse strategies. For example, one can usually present requested information in various ways; different presentation methods are preferred and most effective in varying contexts. One can also manage conversations, or assume initiative, to varying degrees by directing questions, issuing commands, restricting potential responses, and controlling discussion topics in different ways. Agents that converse with users in natural language and possess different discourse strategies need to choose and realize the optimal strategy from competing strategies.

Previous work in natural language generation has selected discourse strategies by using heuristics based on discourse focus, medium, style, and the content of previous utterances. Recent work suggests that an agent can learn which strategies are optimal. This thesis investigates the issues involved in learning optimal discourse strategies on the basis of experience gained through conversations between human users and natural language agents.

A spoken dialogue agent, ELVIS, is implemented as a testbed for learning optimal discourse strategies. ELVIS provides telephone-based voice access to a caller's email. Within ELVIS, various discourse strategies for the distribution of initiative, reading messages, and summarizing messages are implemented. Actual users interact with discourse strategy-based variations of ELVIS. Their conversations are used to derive a dialogue performance function for ELVIS using the PARADISE dialogue evaluation framework. This performance function is then used with reinforcement learning techniques, such as adaptive dynamic programming, Q-learning, temporal difference learning, and temporal difference Q-learning, to determine the optimal discourse strategies for ELVIS to use in different contexts.

This thesis reports and compares learning results and describes how the particular reinforcement algorithm, local reward functions, and the system state space representation affect the efficiency and outcome of learning. This thesis concludes by suggesting how it may be possible to automate online learning in spoken dialogue systems by extending the presented evaluation and learning techniques.
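
To make the learning step in the abstract concrete, the following is a minimal sketch of tabular Q-learning choosing among competing discourse strategies. It is an illustration under stated assumptions, not the thesis's implementation: the state names, strategy names, and reward values are all hypothetical, and the fixed per-step rewards merely stand in for a performance function such as one derived with PARADISE.

```python
import random

# Hypothetical toy dialogue model (NOT taken from the thesis).
# TRANSITIONS[state][strategy] = (next_state, reward); next_state None ends the dialogue.
# The reward numbers are invented stand-ins for a dialogue performance score.
TRANSITIONS = {
    "choose_initiative": {
        "system_initiative": ("choose_summary", 0.0),
        "mixed_initiative": ("choose_summary", -0.1),
    },
    "choose_summary": {
        "summarize_by_sender": ("choose_reading", 0.0),
        "summarize_all": ("choose_reading", -0.2),
    },
    "choose_reading": {
        "read_first": (None, 1.0),
        "read_choice_prompt": (None, 0.5),
    },
}


def q_learn(transitions, start, episodes=2000, alpha=0.1, gamma=1.0,
            epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration over simulated dialogues."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(episodes):
        state = start
        while state is not None:
            actions = list(q[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)              # explore
            else:
                action = max(actions, key=q[state].get)   # exploit
            next_state, reward = transitions[state][action]
            # Value of the best strategy in the next state (0 at dialogue end).
            best_next = max(q[next_state].values()) if next_state is not None else 0.0
            # Standard Q-learning update toward the one-step bootstrapped target.
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued strategy in each state."""
    return {s: max(acts, key=acts.get) for s, acts in q.items()}
```

Calling `greedy_policy(q_learn(TRANSITIONS, "choose_initiative"))` returns one preferred strategy per state for this toy model. In a real system the rewards would come from evaluated user dialogues rather than a fixed table, and the choice of state representation and reward placement would affect how quickly the estimates settle, as the abstract notes.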

Description

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998.

Includes bibliographical references (p. 123-129).

Date issued

1998

URI

http://hdl.handle.net/1721.1/47703

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science

Collections

Graduate Theses