Nuggeteer: Automatic Nugget-Based Evaluation Using Descriptions and Judgements

Author(s)

Marton, Gregory

Download: nuggeteer.pdf (230.8Kb)

Other Contributors

Infolab

Advisor

Boris Katz

Abstract

TREC Definition and Relationship questions are evaluated on the basis of information nuggets that may be contained in system responses. Human evaluators provide informal descriptions of each nugget, and judgements (assignments of nuggets to responses) for each response submitted by participants. The best present automatic evaluation for these kinds of questions is Pourpre. Pourpre uses a stemmed unigram similarity of responses with nugget descriptions, yielding an aggregate result that is difficult to interpret, but is useful for relative comparison. Nuggeteer, by contrast, uses both the human descriptions and the human judgements, and makes binary decisions about each response, so that the end result is as interpretable as the official score. I explore n-gram length, use of judgements, stemming, and term weighting, and provide a new algorithm quantitatively comparable to, and qualitatively better than the state of the art.
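
To make the contrast in the abstract concrete, the following is a minimal illustrative sketch of scoring a response against a nugget description by unigram overlap and then turning that score into a binary decision. It is not the algorithm of either Pourpre or Nuggeteer: the function names, the 0.5 threshold, and the example strings are assumptions made up for illustration, stemming and term weighting are omitted, and the human judgements that Nuggeteer also uses are not modelled.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens (no stemming in this sketch)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unigram_recall(response, description):
    """Fraction of the description's unigrams that also occur in the response."""
    desc = tokens(description)
    if not desc:
        return 0.0
    return len(desc & tokens(response)) / len(desc)


def contains_nugget(response, description, threshold=0.5):
    """Binary decision; the threshold value is a hypothetical choice."""
    return unigram_recall(response, description) >= threshold


# Made-up example strings, not taken from any TREC data.
description = "born in Vienna"
print(contains_nugget("He was born in Vienna in 1889.", description))
print(contains_nugget("He studied engineering.", description))
```

A per-response yes/no decision of this kind is what allows nugget counts to be tallied the same way a human assessor's assignments would be, whereas an aggregate similarity score can only be used to rank systems relative to one another.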

Date issued

2006-01-09

URI

http://hdl.handle.net/1721.1/30604

Series/Report no.

Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory

Keywords

natural language, question answering

Collections

CSAIL Work Products