DSpace@MIT (MIT Libraries)

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

Author(s)
Wang, Hanrui; Zhang, Zhekai; Han, Song
Download: Accepted version (2.525Mb)
Open Access Policy

Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Metadata
Date issued
2021
URI
https://hdl.handle.net/1721.1/143674
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA)
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Wang, Hanrui, Zhang, Zhekai and Han, Song. 2021. "SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning." 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).
Version: Author's final manuscript

Collections
  • MIT Open Access Articles
