A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis
Author(s)
Shand, Megan; Soto, Jose; Lichtenstein, Lee; Benjamin, David; Farjoun, Yossi; Brody, Yehuda; Maruvka, Yosef; Blainey, Paul C.; Banks, Eric; ...
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Abstract
Existing cancer benchmark data sets for human sequencing data use germline variants, synthetic methods, or expensive validations, none of which are satisfactory for providing a large collection of true somatic variation across a whole genome. Here we propose a data set, Lineage-derived Somatic Truth (LinST), of short somatic mutations in the HT115 colon cancer cell line that are validated using a known cell lineage. The data set includes thousands of mutations and a high-confidence region covering 2.7 gigabases per sample.
Date issued
2020-12
Department
Massachusetts Institute of Technology. Department of Biological Engineering; Koch Institute for Integrative Cancer Research at MIT
Journal
Communications Biology
Publisher
Springer Science and Business Media LLC
Citation
Shand, Megan, Soto, Jose, Lichtenstein, Lee, Benjamin, David, Farjoun, Yossi et al. 2020. "A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis." Communications Biology, 3 (1).
Version: Final published version
ISSN
2399-3642