GAN Compression: Efficient Architectures for Interactive Conditional GANs

Li, Muyang; Lin, Ji; Ding, Yaoyao; Liu, Zhijian; Han, Song

dc.contributor.author	Li, Muyang
dc.contributor.author	Lin, Ji
dc.contributor.author	Ding, Yaoyao
dc.contributor.author	Liu, Zhijian
dc.contributor.author	Han, Song
dc.date.accessioned	2021-01-19T17:04:47Z
dc.date.available	2021-01-19T17:04:47Z
dc.date.issued	2020-06
dc.identifier.issn	1063-6919
dc.identifier.uri	https://hdl.handle.net/1721.1/129446
dc.description.abstract	Conditional Generative Adversarial Networks (cGANs) have enabled controllable image synthesis for many computer vision and graphics applications. However, recent cGANs are 1-2 orders of magnitude more computationally intensive than modern recognition CNNs. For example, GauGAN consumes 281G MACs per image, compared to 0.44G MACs for MobileNet-v3, making interactive deployment difficult. In this work, we propose a general-purpose compression framework for reducing the inference time and model size of the generator in cGANs. Directly applying existing CNN compression methods yields poor performance due to the difficulty of GAN training and the differences in generator architectures. We address these challenges in two ways. First, to stabilize GAN training, we transfer knowledge of multiple intermediate representations of the original model to its compressed model, and unify unpaired and paired learning. Second, instead of reusing existing CNN designs, our method automatically finds efficient architectures via neural architecture search (NAS). To accelerate the search process, we decouple model training and architecture search via weight sharing. Experiments demonstrate the effectiveness of our method across different supervision settings (paired and unpaired), model architectures, and learning methods (e.g., pix2pix, GauGAN, CycleGAN). Without losing image quality, we reduce the computation of CycleGAN by more than 20x and GauGAN by 9x, paving the way for interactive image synthesis. The code and demo are publicly available.	en_US
dc.description.sponsorship	National Science Foundation (U.S.). Career (Award 1943349)	en_US
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)	en_US
dc.relation.isversionof	10.1109/CVPR42600.2020.00533	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	arXiv	en_US
dc.title	GAN Compression: Efficient Architectures for Interactive Conditional GANs	en_US
dc.type	Article	en_US
dc.identifier.citation	Li, Muyang et al. “GAN Compression: Efficient Architectures for Interactive Conditional GANs.” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020 (June 2020) © 2020 The Author(s)	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.relation.journal	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2020-12-17T16:34:00Z
dspace.orderedauthors	Li, M; Lin, J; Ding, Y; Liu, Z; Zhu, J-Y; Han, S	en_US
dspace.date.submission	2020-12-17T16:34:07Z
mit.journal.volume	2020	en_US
mit.license	OPEN_ACCESS_POLICY
mit.metadata.status	Complete

Files in this item

Name: 2003.08936.pdf
Size: 5.297Mb
Format: PDF
Description: Accepted version


This item appears in the following Collection(s)

MIT Open Access Articles
