Monte Carlo Tree Search Applications to Neural Theorem Proving

Author(s)
LaBelle, Ethan
Thesis PDF (2.959 MB)
Advisor
Solar-Lezama, Armando
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
A common problem of LLM inference is hallucination, where models generate false information. Another is the tradeoff between model size and computational cost: larger models use more VRAM and require longer training and inference times. This work explores search and verification as solutions to these problems, following Yang et al.'s recent contribution, LeanDojo: Theorem Proving with Retrieval-Augmented Language Models. In that work, Yang et al. introduce LeanDojo, an environment for programmatic interaction with the Lean theorem-proving language, alongside ReProver, a ByT5-Small transformer-based automated theorem prover fine-tuned on the open-source Lean mathlib. The smaller model requires fewer resources and enables faster inference, which, when combined with search, improves the effective performance of the model. We use the language model to generate a space of partial proof trees in Lean. Because the core language model can be interchanged with a larger or more performant model, this work focuses on search algorithms for finding novel proofs under the same computational budget. Three classes of algorithms are explored: best-first search, random walk, and Monte Carlo Tree Search. Search algorithms are evaluated on the random-split test set of the LeanDojo Benchmark. Finally, we present common failure modes of the various methods, search results for algorithm variants, and novel proofs discovered relative to the baseline. Across our trials, we show that the search space defined by ReProver's tactic generator contains proofs for approximately 55.0% of theorems in the LeanDojo Benchmark random test split; in Yang et al.'s evaluations, ReProver achieves a 51.2% Pass@1 solve rate on this benchmark.
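
The abstract describes proof search over tactic states produced by LeanDojo and a tactic generator. As a minimal illustrative sketch of one of the three algorithm classes mentioned (best-first search), the Python below expands partial proofs in order of cumulative log-probability. The `Dojo`/`run_tac`/`Theorem` usage follows LeanDojo's published Python interface but should be checked against the installed version; `generate_tactics` is a hypothetical stand-in for ReProver's ByT5-Small tactic generator, and the repository and theorem in the usage example are purely illustrative, not taken from the thesis.

```python
# Best-first proof search sketch over LeanDojo tactic states.
# Assumptions: generate_tactics() is a hypothetical wrapper around the
# tactic generator; Dojo/run_tac follow LeanDojo's documented interface.
import heapq
from lean_dojo import Dojo, LeanGitRepo, Theorem, TacticState, ProofFinished


def generate_tactics(goal_text: str, k: int = 8) -> list[tuple[str, float]]:
    """Hypothetical: return up to k (tactic, log_probability) suggestions
    for the pretty-printed goal, e.g. from a fine-tuned ByT5-Small model."""
    raise NotImplementedError  # plug in the tactic generator here


def best_first_search(theorem: Theorem, budget: int = 100) -> list[str] | None:
    """Expand tactic states in order of cumulative log-probability until a
    ProofFinished result is reached or the expansion budget is exhausted."""
    with Dojo(theorem) as (dojo, init_state):
        # Frontier entries: (negated score, tie-breaker, state, tactics so far).
        frontier = [(0.0, 0, init_state, [])]
        counter = 1
        for _ in range(budget):
            if not frontier:
                break
            neg_score, _, state, proof = heapq.heappop(frontier)
            for tactic, logp in generate_tactics(state.pp):
                result = dojo.run_tac(state, tactic)
                if isinstance(result, ProofFinished):
                    return proof + [tactic]          # complete proof found
                if isinstance(result, TacticState):  # valid partial proof tree
                    heapq.heappush(
                        frontier,
                        (neg_score - logp, counter, result, proof + [tactic]),
                    )
                    counter += 1
    return None


if __name__ == "__main__":
    # Illustrative repository, commit, and theorem; substitute real values.
    repo = LeanGitRepo("https://github.com/leanprover-community/mathlib4", "main")
    thm = Theorem(repo, "Mathlib/Algebra/Group/Basic.lean", "mul_left_cancel")
    print(best_first_search(thm))
```

Random walk and Monte Carlo Tree Search variants would reuse the same environment loop but replace the priority-queue selection rule with random sampling or UCT-style selection and backpropagation, respectively.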
Date issued
2024-05
URI
https://hdl.handle.net/1721.1/156761
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
