DSpace@MIT (MIT Libraries)

Savaal: A system for automatically generating high-quality questions from unseen documents

Author(s)
Chandler, Joseph A.
Advisor
Balakrishnan, Hari
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Assessing human understanding through exams and quizzes is fundamental to learning and advancement in both educational and professional settings. However, current approaches to automatically generating challenging questions from educational materials and documents are insufficient, producing superficial or often irrelevant questions. While LLMs have been shown to excel at tasks like question answering, their use for question generation is underexplored for general domains and at scale. This work presents Savaal, a scalable question-generation system that generates higher-order questions from documents, as well as a real-world system implementation for general use. Savaal has three goals: (i) scalability, generating hundreds of questions from any document; (ii) depth of understanding, synthesizing higher-order concepts to test learners’ understanding of the material; and (iii) domain independence, generalizing broadly to any field. Rather than naively providing the entire document in context to an LLM, Savaal breaks question generation into a three-stage pipeline. We demonstrate that Savaal outperforms the direct prompting baseline as evaluated by 76 human experts on 71 documents across conference papers and PhD dissertations. We additionally contribute a general system for serving Savaal in real-world scenarios. We demonstrate that our system is scalable, enabling fault-tolerant, horizontal scaling of each individual component in response to fluctuations in usage. Moreover, our architecture supports interactive use and group collaboration, reflecting real-world organizations like classrooms or enterprises. We hope that the system enables scalable question generation for educational and corporate use cases.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162563
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
