AMC: AutoML for Model Compression and Acceleration on Mobile Devices

Lin, Ji; Liu, Zhijian; Wang, Hanrui; Han, Song

dc.contributor.author	Lin, Ji
dc.contributor.author	Liu, Zhijian
dc.contributor.author	Wang, Hanrui
dc.contributor.author	Han, Song
dc.date.accessioned	2021-01-26T18:39:53Z
dc.date.available	2021-01-26T18:39:53Z
dc.date.issued	2018-10
dc.identifier.isbn	9783030012205
dc.identifier.uri	https://hdl.handle.net/1721.1/129576
dc.description.abstract	Model compression is an effective technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets. Conventional model compression techniques rely on hand-crafted features and require domain experts to explore the large design space trading off among model size, speed, and accuracy, which is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Model Compression (AMC), which leverages reinforcement learning to efficiently sample the design space and can improve the model compression quality. We achieved state-of-the-art model compression results in a fully automated way without any human effort. Under 4× FLOPs reduction, we achieved 2.7% better accuracy than the hand-crafted model compression method for VGG-16 on ImageNet. We applied this automated, push-the-button compression pipeline to MobileNet-V1 and achieved a speedup of 1.53× on the GPU (Titan Xp) and 1.95× on an Android phone (Google Pixel 1), with negligible loss of accuracy.	en_US
dc.language.iso	en
dc.publisher	Springer Science and Business Media LLC	en_US
dc.relation.isversionof	10.1007/978-3-030-01234-2_48	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	arXiv	en_US
dc.title	AMC: AutoML for Model Compression and Acceleration on Mobile Devices	en_US
dc.type	Article	en_US
dc.identifier.citation	He, Yihui et al. "AMC: AutoML for Model Compression and Acceleration on Mobile Devices." Computer vision -- ECCV 2018 : 15th European Conference, Lecture Notes in Computer Science, 11211, Springer, 2018, 815-832 © 2018 The Author(s)	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.relation.journal	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2020-12-17T15:33:56Z
dspace.orderedauthors	He, Y; Lin, J; Liu, Z; Wang, H; Li, L-J; Han, S	en_US
dspace.date.submission	2020-12-17T15:34:00Z
mit.journal.volume	11211	en_US
mit.license	OPEN_ACCESS_POLICY
mit.metadata.status	Complete

Files in this item

Name: 1802.03494.pdf
Size: 792.6Kb
Format: PDF
Description: Accepted version

This item appears in the following Collection(s)

MIT Open Access Articles
