Long-range Genomics Benchmark Technology and More
Author(s)
Polen, McKinley
Advisor
Kellis, Manolis
Abstract
The transformer architecture has emerged as a popular choice in various domains, owing to its ability to capture long-range dependencies and its capacity for parallel processing. In genomics, where dependencies often span more than 100,000 base pairs, the computational complexity of the attention mechanism, a core feature of the transformer architecture, is quadratic in sequence length and poses a significant bottleneck. With the goal of creating a genomics foundation model (FM), this paper aims to address the challenges associated with long-range dependencies in genomics. Our survey encompasses modifications to the attention mechanism, the creation of a genomics long-range benchmark (GLRB), and the evaluation of various transformer and non-transformer architectures. Together, these efforts lay the groundwork for a robust genomics foundation model, opening new possibilities for genomics research and applications.
Date issued
2024-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology