Smile: a system to support machine learning on EEG data at scale

Cao, Lei; Tao, Wenbo; An, Sungtae; Jin, Jing; Yan, Yizhou; Liu, Xiaoyu; Ge, Wendong; Sah, Adam; Battle, Leilani; Sun, Jimeng; Chang, Remco; Westover, Brandon; Madden, Samuel; Stonebraker, Michael

dc.contributor.author	Cao, Lei
dc.contributor.author	Tao, Wenbo
dc.contributor.author	An, Sungtae
dc.contributor.author	Jin, Jing
dc.contributor.author	Yan, Yizhou
dc.contributor.author	Liu, Xiaoyu
dc.contributor.author	Ge, Wendong
dc.contributor.author	Sah, Adam
dc.contributor.author	Battle, Leilani
dc.contributor.author	Sun, Jimeng
dc.contributor.author	Chang, Remco
dc.contributor.author	Westover, Brandon
dc.contributor.author	Madden, Samuel
dc.contributor.author	Stonebraker, Michael
dc.date.accessioned	2021-11-05T15:34:30Z
dc.date.available	2021-11-05T15:34:30Z
dc.date.issued	2019
dc.identifier.uri	https://hdl.handle.net/1721.1/137531
dc.description.abstract	© 2019 VLDB Endowment. In order to reduce the possibility of neural injury from seizures and sidestep the need for a neurologist to spend hours manually reviewing EEG recordings, it is critical to automatically detect and classify "interictal-ictal continuum" (IIC) patterns from EEG data. However, existing IIC classification techniques have been shown to be insufficiently accurate and robust for clinical use because of the lack of high-quality labels of EEG segments as training data. Obtaining high-quality labeled data is traditionally a manual process carried out by trained clinicians that can be tedious, time-consuming, and error-prone. In this work, we propose Smile, an industrial-scale system that provides an end-to-end solution to the IIC pattern classification problem. The core components of Smile include a visualization-based time series labeling module and a deep-learning-based active learning module. The labeling module enables users to explore and label 350 million EEG segments (30 TB) at interactive speed. The multiple coordinated views allow users to examine the EEG signals in both the time domain and the frequency domain simultaneously. The active learning module first trains a deep neural network that automatically extracts both the local features of each segment itself and the long-term dynamics of the EEG signals to classify IIC patterns. Then, leveraging the output of the deep learning model, the EEG segments that can best improve the model are selected and presented to clinicians to label. This process is iterated until the clinicians and the models show a high degree of agreement. Our initial experimental results show that Smile allows clinicians to label EEG segments at will with a response time below 500 ms. The accuracy of the model improves progressively as more high-quality labels are acquired over time.	en_US
dc.language.iso	en
dc.publisher	VLDB Endowment	en_US
dc.relation.isversionof	10.14778/3352063.3352138	en_US
dc.rights	Creative Commons Attribution-NonCommercial-NoDerivs License	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	en_US
dc.source	VLDB Endowment	en_US
dc.title	Smile: a system to support machine learning on EEG data at scale	en_US
dc.type	Article	en_US
dc.identifier.citation	Cao, Lei, Tao, Wenbo, An, Sungtae, Jin, Jing, Yan, Yizhou et al. 2019. "Smile: a system to support machine learning on EEG data at scale." Proceedings of the VLDB Endowment, 12 (12).
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal	Proceedings of the VLDB Endowment	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2021-01-29T17:48:42Z
dspace.orderedauthors	Cao, L; Tao, W; An, S; Jin, J; Yan, Y; Liu, X; Ge, W; Sah, A; Battle, L; Sun, J; Chang, R; Westover, B; Madden, S; Stonebraker, M	en_US
dspace.date.submission	2021-01-29T17:49:06Z
mit.journal.volume	12	en_US
mit.journal.issue	12	en_US
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: 3352063.3352138.pdf
Size: 1.097Mb
Format: PDF
Description: Published version

This item appears in the following Collection(s)

MIT Open Access Articles