Investigating Fine-Tuning of Language Models for Multiple-Choice Questions
Author(s)
Wang, Ivy A.
Advisor
Hemberg, Erik
O’Reilly, Una-May
Abstract
This thesis investigates the positional and contextual bias of large language models (LLMs) when they are used to answer multiple-choice questions (MCQs). Given the increasing use of generative language models in fields ranging from cybersecurity to biomedical research, it is important to understand the causes of their behavior in order to mitigate biases and prevent errors. One established method of improving LLM performance is fine-tuning, in which a model is further trained on data from a specified distribution or subject area. We specifically investigate how properties of the training data relate to positional bias in a fine-tuned model's ability to answer MCQs correctly. To improve training efficiency, we used parameter-efficient fine-tuning, specifically LoRA (Low-Rank Adaptation), which freezes the model's weight matrices and learns low-rank updates to them, greatly reducing the number of trainable parameters. We verify that the LLM performs best when its fine-tuning data has the same qualities and distribution as the test data. In our experiments, we scaled and balanced our fine-tuning datasets and found that both processes improve accuracy on MCQ test sets.
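The abstract does not include implementation details, but the LoRA idea it describes can be illustrated with a minimal sketch: a frozen linear layer augmented with a trainable low-rank update ΔW = BA, scaled by α/r. The rank `r=8`, scale `alpha=16.0`, and layer size below are illustrative choices, not values from the thesis, and this PyTorch snippet is a sketch of the general technique rather than the author's code.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer with a trainable low-rank update (LoRA sketch).

    Instead of updating the full weight matrix W (d_out x d_in) during
    fine-tuning, LoRA learns two small matrices A (r x d_in) and B (d_out x r)
    so the effective weight becomes W + (alpha / r) * B @ A. Only A and B are
    trained, which greatly reduces the number of trainable parameters.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights

        d_out, d_in = base.weight.shape
        # A starts with small random values and B with zeros, so the adapted
        # layer is initially identical to the pretrained one.
        self.lora_a = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(d_out, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)


if __name__ == "__main__":
    # Hypothetical 768-wide layer, roughly the hidden size of a small LLM.
    layer = LoRALinear(nn.Linear(768, 768), r=8)
    out = layer(torch.randn(2, 768))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable params: {trainable} / {total}")
```

In this sketch only about 2% of the layer's parameters are trainable, which is the efficiency gain the abstract refers to.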
Date issued
2024-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology