Prediction and analysis of degree of suicidal ideation in online content
Author(s): Jones, Noah C. (Noah Corinthian)
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Machine learning (ML) has increasingly been used to address the growing burden of mental illness and the lack of access to quality mental health care. Recently, such models have been applied to online data, such as social media postings, to augment mental health screening. Despite the potential of these methods, online ML classifiers still perform poorly in multi-class settings. In this thesis, we propose the use of novel document embeddings and mental-health-based user embeddings for triaged suicide risk screening. Machine learning to infer suicide risk and urgency is applied to a dataset of Reddit users in which the risk and urgency labels were derived from crowdsourced consensus. We show that the document embedding approach outperforms count-based baselines and a method based on word importance, where important words were identified by domain experts. We examine interpretable features and methods that help to discern and explain risk labels. Finally, using a Latent Dirichlet Allocation (LDA) topic model, we find that, in their posts to the rest of Reddit, users labeled at risk for suicide discuss different topics than non-suicidal users.
Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, May, 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 51-57).
Department: Program in Media Arts and Sciences (Massachusetts Institute of Technology)