Prediction and analysis of degree of suicidal ideation in online content
Author(s)
Jones, Noah C. (Noah Corinthian)
Download: 1193027006-MIT.pdf (11.37Mb)
Other Contributors
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Advisor
Rosalind Picard.
Abstract
Machine learning (ML) has increasingly been used to address the growing burden of mental illness and lack of access to quality mental health care. Recently, such models have been applied to online data, such as social media postings, to augment mental health screening. Despite the potential of these methods, online ML classifiers still perform poorly in multi-class settings. In this thesis, we propose the use of novel document embeddings and mental-health-based user embeddings for triaged suicide risk screening. We apply machine learning to infer suicide risk and urgency on a dataset of Reddit users whose risk and urgency labels were derived from crowdsourced consensus. We show that the document embedding approach outperforms count-based baselines and a method based on word importance, in which important words were identified by domain experts. We examine interpretable features and methods that help to discern and explain risk labels. Finally, using a Latent Dirichlet Allocation (LDA) topic model, we find that users labeled at risk for suicide post about different topics in the rest of Reddit than non-suicidal users do.
Description
Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, May, 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 51-57).
Date issued
2020
Department
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Publisher
Massachusetts Institute of Technology
Keywords
Program in Media Arts and Sciences