Smile: a system to support machine learning on EEG data at scale
Author(s)
Cao, Lei; Tao, Wenbo; An, Sungtae; Jin, Jing; Yan, Yizhou; Liu, Xiaoyu; Ge, Wendong; Sah, Adam; Battle, Leilani; Sun, Jimeng; Chang, Remco; Westover, Brandon; Madden, Samuel; Stonebraker, Michael; ...
Publisher with Creative Commons License
Creative Commons Attribution
Abstract
© 2019 VLDB Endowment. To reduce the possibility of neural injury from seizures and to spare neurologists from spending hours manually reviewing EEG recordings, it is critical to automatically detect and classify "interictal-ictal continuum" (IIC) patterns in EEG data. However, existing IIC classification techniques have been shown to be neither accurate nor robust enough for clinical use, because high-quality labels of EEG segments are lacking as training data. Obtaining high-quality labeled data is traditionally a manual process performed by trained clinicians, which can be tedious, time-consuming, and error-prone. In this work, we propose Smile, an industrial-scale system that provides an end-to-end solution to the IIC pattern classification problem. The core components of Smile include a visualization-based time series labeling module and a deep-learning-based active learning module. The labeling module enables users to explore and label 350 million EEG segments (30 TB) at interactive speed. Its multiple coordinated views allow users to examine the EEG signals in the time domain and the frequency domain simultaneously. The active learning module first trains a deep neural network that classifies IIC patterns by automatically extracting both the local features of each segment and the long-term dynamics of the EEG signals. Then, leveraging the output of the deep learning model, it selects the EEG segments that can best improve the model and presents them to clinicians for labeling. This process is iterated until the clinicians and the models show a high degree of agreement. Our initial experimental results show that Smile allows clinicians to label EEG segments at will with a response time below 500 ms. The accuracy of the model improves progressively as more high-quality labels are acquired over time.
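The select-and-label loop described in the abstract can be illustrated with a small sketch. The abstract does not say how Smile decides which segments "can best improve the model", so the entropy-based uncertainty sampling below is an assumption chosen for illustration, not the authors' method. The function names, the three-class toy predictions, and the agreement measure are likewise hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(predictions, k):
    """Return the indices of the k segments with the most uncertain predictions.

    predictions: one class-probability list per EEG segment
    (hypothetical model output; the selection rule is an assumption).
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


def agreement_rate(model_labels, clinician_labels):
    """Fraction of segments where model and clinician assign the same label."""
    matches = sum(m == c for m, c in zip(model_labels, clinician_labels))
    return matches / len(clinician_labels)


# Toy example: three segments scored against three hypothetical classes.
predictions = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, so most uncertain
    [0.70, 0.20, 0.10],  # moderately uncertain
]
queue = select_for_labeling(predictions, k=2)
print(queue)
```

In a loop of this kind, the segments in `queue` would be shown to a clinician, the new labels added to the training set, and the model retrained. The loop would stop once a measure such as `agreement_rate` exceeds a chosen threshold, which mirrors the stopping condition the abstract describes only in general terms.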
Date issued
2019
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Journal
Proceedings of the VLDB Endowment
Publisher
VLDB Endowment
Citation
Cao, Lei, Tao, Wenbo, An, Sungtae, Jin, Jing, Yan, Yizhou et al. 2019. "Smile: a system to support machine learning on EEG data at scale." Proceedings of the VLDB Endowment, 12 (12).
Version: Final published version