Shallow Sparsely-Connected Autoencoders for Gene Set Projection

Gold, Maxwell P.; Lenail, Alexander; Fraenkel, Ernest

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

When analyzing biological data, it can be helpful to consider gene sets, or predefined groups of biologically related genes. Methods exist for identifying gene sets that differ between conditions, but large public datasets from consortium projects and single-cell RNA-Sequencing have opened the door to gene set analysis using more sophisticated machine learning techniques, such as autoencoders and variational autoencoders. We present shallow sparsely-connected autoencoders (SSCAs) and shallow sparsely-connected variational autoencoders (SSCVAs) as tools for projecting gene-level data onto gene sets. We tested these approaches on single-cell RNA-Sequencing data from blood cells and on RNA-Sequencing data from breast cancer patients. Both SSCA and SSCVA can recover known biological features from these datasets, and the SSCVA method often outperforms SSCA (and six existing gene set scoring algorithms) on classification and prediction tasks.
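The idea of projecting gene-level data onto gene sets can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that "sparsely-connected" means each hidden node stands for one gene set and is wired only to the genes in that set, in both the encoder and the decoder. The gene names, set names, tanh activation, and random weights are hypothetical, and no training loop is shown.

```python
import numpy as np


def build_mask(genes, gene_sets):
    """Return a binary (n_genes, n_sets) matrix: 1 where a gene belongs to a set."""
    index = {gene: i for i, gene in enumerate(genes)}
    mask = np.zeros((len(genes), len(gene_sets)))
    for j, members in enumerate(gene_sets.values()):
        for gene in members:
            if gene in index:  # ignore set members absent from the data
                mask[index[gene], j] = 1.0
    return mask


def encode(x, w_enc, b_enc, mask):
    # Multiplying by the mask zeroes every gene-to-set weight outside the set,
    # so each hidden node only sees its member genes.
    return np.tanh(x @ (w_enc * mask) + b_enc)


def decode(h, w_dec, b_dec, mask):
    # Assumed mirror of the encoder: a set node only reconstructs its own genes.
    return h @ (w_dec * mask.T) + b_dec


# Hypothetical toy data: 4 genes, 2 gene sets, 3 samples.
genes = ["G1", "G2", "G3", "G4"]
gene_sets = {"set_a": ["G1", "G2"], "set_b": ["G3", "G4", "G5"]}
mask = build_mask(genes, gene_sets)

rng = np.random.default_rng(0)
w_enc = rng.normal(size=mask.shape)    # untrained weights, for shape checking only
w_dec = rng.normal(size=mask.T.shape)
b_enc = np.zeros(len(gene_sets))
b_dec = np.zeros(len(genes))

x = rng.normal(size=(3, len(genes)))          # samples x genes
scores = encode(x, w_enc, b_enc, mask)        # samples x gene sets
recon = decode(scores, w_dec, b_dec, mask)    # samples x genes
```

In this sketch the hidden activations in `scores` play the role of per-sample gene set scores. A trained model would fit the unmasked weights by minimizing reconstruction error between `recon` and `x`.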

Date issued

2019-03

URI

https://hdl.handle.net/1721.1/125231

Department

Massachusetts Institute of Technology. Department of Biological Engineering

Journal

Pacific Symposium on Biocomputing 2019

Publisher

World Scientific Pub Co Pte Ltd

Citation

Gold, Maxwell P., Alexander LeNail, and Ernest Fraenkel. "Shallow Sparsely-Connected Autoencoders for Gene Set Projection." Paper presented at the Pacific Symposium on Biocomputing 2019 (Kohala Coast, Hawaii, USA, 3-7 January 2019) 24 (2019): 374-385. © 2019 The Author(s)

Version: Author's final manuscript

ISBN

9789813279810

Collections

MIT Open Access Articles