Student Research Abstract: Evaluating Dialogue Summarization Using LLMs
Author(s)
Wang, Alison
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
With the surge in audio data available today, there is a growing need for effective dialogue summarization. This study conducts two experiments using two LLMs, BART and Mistral, to assess dialogue summarization. The first experiment evaluates model performance, while the second examines the impact of upstream errors from Automatic Speech Recognition (ASR) and Machine Translation (MT) on summarization performance. Results indicate that SummaC, a commonly used evaluation metric, is unreliable for dialogue summarization. Additionally, Mistral's summarization performance is more sensitive to upstream errors than BART's.
Description
SAC ’25, March 31-April 4, 2025, Catania, Italy
Date issued
2025-05-14
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
ACM | The 40th ACM/SIGAPP Symposium on Applied Computing
Citation
Wang, Alison. 2025. "Student Research Abstract: Evaluating Dialogue Summarization Using LLMs."
Version: Final published version
ISBN
979-8-4007-0629-5