dc.contributor.advisor: Kim, Yoon
dc.contributor.author: Nrusimha, Aniruddha
dc.date.accessioned: 2024-08-21T18:53:35Z
dc.date.available: 2024-08-21T18:53:35Z
dc.date.issued: 2024-05
dc.date.submitted: 2024-07-10T12:59:47.470Z
dc.identifier.uri: https://hdl.handle.net/1721.1/156280
dc.description.abstract: We consider the problem of accurate quantization for language models, where both the weights and activations are quantized to 4 bits per parameter with uniform quantization, the lowest bitwidth format natively supported by existing GPU hardware. In this context, the key challenge is activation quantization: it is known that language models contain outlier channels whose values on average are orders of magnitude higher than those of other channels, which prevents accurate low-bitwidth quantization with known techniques. We systematically study this phenomenon and find that these outlier channels emerge early in training, and that they occur more frequently in layers with residual streams. We then propose a simple strategy that regularizes a layer’s inputs via quantization-aware training (QAT) and its outputs via activation kurtosis regularization. We show that regularizing both the inputs and outputs is crucial for preventing a model from "migrating" the difficulty in input quantization to the weights, which makes post-training quantization (PTQ) of weights more difficult. When combined with weight PTQ, we show that our approach can obtain a W4A4 model with integer quantization that performs competitively with the standard-precision W16A16 baseline.
dc.publisher: Massachusetts Institute of Technology
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Electrical Engineering and Computer Science
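
The abstract's claim that a single outlier channel hampers 4-bit uniform quantization can be illustrated with a small sketch. This is not the thesis's implementation: the function names (`quantize_int4`, `kurtosis`), the symmetric per-tensor scaling, the synthetic Gaussian activations, and the outlier value of 100 are all assumptions chosen for illustration. The kurtosis function computes the standard fourth standardized moment; how the thesis turns kurtosis into a regularization term is not specified in the abstract and is not modeled here.

```python
import random


def quantize_int4(values):
    """Symmetric per-tensor uniform quantization to signed 4-bit integers,
    followed by dequantization back to floats (an assumed, simplified scheme)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    scale = max_abs / 7  # map the largest magnitude onto integer level 7
    return [max(-8, min(7, round(v / scale))) * scale for v in values]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def kurtosis(values):
    """Fourth standardized moment; roughly 3 for Gaussian data,
    much larger when a few values dominate."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / (var ** 2)


rng = random.Random(0)
# Hypothetical activations: 1024 well-behaved values...
activations = [rng.gauss(0.0, 1.0) for _ in range(1024)]
# ...and the same values with one entry replaced by a large outlier.
with_outlier = activations[:-1] + [100.0]

err_plain = mean_squared_error(activations, quantize_int4(activations))
err_outlier = mean_squared_error(with_outlier, quantize_int4(with_outlier))

print("quantization MSE without outlier:", err_plain)
print("quantization MSE with outlier:   ", err_outlier)
print("kurtosis without outlier:", kurtosis(activations))
print("kurtosis with outlier:   ", kurtosis(with_outlier))
```

In this sketch the outlier stretches the quantization scale so far that most ordinary values round to zero, which raises the reconstruction error, and the same outlier sharply raises the kurtosis. That is one plausible reading of why penalizing activation kurtosis could make activations easier to quantize.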