Datathons and Software to Promote Reproducible Research
Author(s)
Celi, Leo Anthony G.; Lokhandwala, Sharukh; Montgomery, Robert; Moses, Christopher A; Pollard, Tom Joseph; Stretch, Robert; Spitz, Daniel; Naumann, Tristan Josef; ...
Terms of use
Creative Commons Attribution
Abstract
Background: Datathons facilitate collaboration between clinicians, statisticians, and data scientists in order to answer important clinical questions. Previous datathons have resulted in numerous publications of interest to the critical care community and serve as a viable model for interdisciplinary collaboration.
Objective: We report on Chatto, an open-source software suite created by members of our group in the context of the second international Critical Care Datathon, held in September 2015.
Methods: Datathon participants formed teams to discuss potential research questions and the methods required to address them. They were provided with the Chatto suite of tools to facilitate their teamwork. Each multidisciplinary team spent the next 2 days with clinicians working alongside data scientists to write code, extract and analyze data, and reformulate their queries in real time as needed. All projects were then presented on the last day of the datathon to a panel of judges that consisted of clinicians and scientists.
Results: Chatto was particularly effective in the datathon setting, enabling teams to reduce the time spent configuring their research environments to just a few minutes, a process that would normally take hours to days. Chatto continued to serve as a useful research tool after the conclusion of the datathon.
Conclusions: This suite of tools fulfills two purposes: (1) facilitation of interdisciplinary teamwork through archiving and version control of datasets, analytical code, and team discussions, and (2) advancement of research reproducibility by functioning postpublication as an online environment in which independent investigators can rerun or modify analyses with relative ease. With the introduction of Chatto, we hope to solve a variety of challenges presented by collaborative data mining projects while improving research reproducibility.
Date issued
2016-08
Department
Massachusetts Institute of Technology. Institute for Medical Engineering & Science; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; MIT Critical Data (Laboratory)
Journal
Journal of Medical Internet Research
Publisher
Gunther Eysenbach, JMIR
Citation
Celi, Leo Anthony et al. “Datathons and Software to Promote Reproducible Research.” Journal of Medical Internet Research 18.8 (2016): e230.
Version: Final published version
ISSN
1438-8871