Speaker diarization in a meeting scenario

Oseni-Adegbite, Adedotun J.

dc.contributor.advisor	James Glass and Hao Tang.	en_US
dc.contributor.author	Oseni-Adegbite, Adedotun J.	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2020-09-15T21:59:07Z
dc.date.available	2020-09-15T21:59:07Z
dc.date.copyright	2020	en_US
dc.date.issued	2020	en_US
dc.identifier.uri	https://hdl.handle.net/1721.1/127463
dc.description	Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020	en_US
dc.description	Cataloged from the official PDF of thesis.	en_US
dc.description	Includes bibliographical references (pages 81-84).	en_US
dc.description.abstract	Given the large amount of time workers spend in meetings, statistics that help drive more effective meetings are desirable. Workplaces differ in the types of meetings they hold and in the workers present, so the more agnostic these statistics are to a meeting's content and participants, the more meeting scenarios they can be applied to. We propose a system that provides these statistics as a summary of who is speaking within the meeting and at what times, whilst respecting the participants' privacy. The system aims to run completely online and locally, so no audio needs to be stored on or transmitted from the device running the system. This is accomplished by displaying where speech originates in the room and labeling the speaker. Time stamp labels are provided for all occurrences of a speaker's speech, allowing a breakdown of how each speaker contributed to the meeting. We created a dataset of emulated meeting-like recordings on which to run experiments. In an offline scenario, the system achieved a diarization error rate (DER) of 27.8% with no overlap in the speech, 44.3% with small amounts of overlap, and 50.0% with large amounts of overlap. When run online, it achieved DERs of 16.9%, 37.2%, and 45.6% with no overlap, small amounts of overlap, and large amounts of overlap, respectively.	en_US
dc.description.statementofresponsibility	by Adedotun J Oseni-Adegbite.	en_US
dc.format.extent	84 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Speaker diarization in a meeting scenario	en_US
dc.type	Thesis	en_US
dc.description.degree	M. Eng.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.identifier.oclc	1192966863	en_US
dc.description.collection	M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science	en_US
dspace.imported	2020-09-15T21:59:07Z	en_US
mit.thesis.degree	Master	en_US
mit.thesis.department	EECS	en_US

Files in this item

Name: 1192966863-MIT.pdf
Size: 1.584 MB
Format: PDF

This item appears in the following Collection(s)

Graduate Theses
