Speaker diarization in a meeting scenario

Oseni-Adegbite, Adedotun J.

Download: 1192966863-MIT.pdf (1.584 MB)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

James Glass and Hao Tang.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Given the large amount of time workers spend in meetings, statistics that can drive more effective meetings are desirable. Workplaces differ in the types of meetings they hold and in who attends them, so the more agnostic a system is to a meeting's content and participants, the more meeting scenarios its statistics can be applied to. We propose a system that provides these statistics as a summary of who speaks during the meeting and when, while respecting the participants' privacy. The system aims to run completely online and locally, so no audio needs to be stored on, or transmitted from, the device running it. The summary is produced by displaying where speech originates in the room and labeling the speaker. Time stamps are provided for every occurrence of a speaker's speech, allowing a breakdown of how each speaker contributed to the meeting. We created a dataset of recordings that emulate meeting-like scenarios to run experiments on. In an offline scenario, the system achieved a diarization error rate (DER) of 27.8% with no overlapping speech, 44.3% with small amounts of overlap, and 50.0% with large amounts of overlap. When run online, it achieved DERs of 16.9%, 37.2%, and 45.6% with no overlap, small amounts of overlap, and large amounts of overlap, respectively.
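For readers unfamiliar with the metric, DER is conventionally the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below is a simplified frame-level illustration of that definition, not the scoring procedure used in the thesis, which this record does not describe. It assumes at most one speaker per frame (no overlap), no scoring collar, and hypothesis labels already mapped to reference speakers; the function name `frame_der` and the toy labels are hypothetical.

```python
from typing import Optional, Sequence


def frame_der(
    reference: Sequence[Optional[str]],
    hypothesis: Sequence[Optional[str]],
) -> float:
    """Frame-level diarization error rate.

    Each element is a speaker label for one fixed-length frame, or None
    for non-speech. Assumes hypothesis labels are already mapped to the
    reference speaker names (an assumption; real scorers search for the
    best mapping and also handle overlapping speech).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must cover the same frames")

    missed = false_alarm = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            speech += 1
            if hyp is None:
                missed += 1        # speech in reference, silence in output
            elif hyp != ref:
                confusion += 1     # speech attributed to the wrong speaker
        elif hyp is not None:
            false_alarm += 1       # output claims speech where there is none

    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / speech


# Toy example: 8 reference speech frames, one error of each kind.
ref = ["A", "A", "A", "B", "B", None, None, "A", "A", "A"]
hyp = ["A", "A", "B", "B", "B", "B", None, None, "A", "A"]
print(f"DER = {frame_der(ref, hyp):.1%}")  # DER = 37.5%
```

Because false alarms count in the numerator but not the denominator, DER can exceed 100% under this definition.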

Description

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May 2020

Cataloged from the official PDF of thesis.

Includes bibliographical references (pages 81-84).

Date issued

2020

URI

https://hdl.handle.net/1721.1/127463

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses