MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

LLM-Directed Agent Models in Cyberspace

Author(s)
Laney, Samuel P.
Thumbnail
DownloadThesis PDF (2.742Mb)
Advisor
O'Reilly, Una-May
Vilas-Boas, Felipe
Yu, Chris
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
Network penetration testing, a proactive method for identifying vulnerabilities in cy- berspace, has long been the domain of human experts. However, rapid advancements in machine learning have opened up new possibilities for automating many of these tasks. This thesis aims to explore the application of Large Language Models (LLMs) for automating penetration tests and Cyber Capture the Flag (CTF) challenges, bridging the gap between static tools and dynamic human intuition in cybersecurity. This work provides an evaluation framework for assessing the performance of LLMs in autonomously solving CTF challenges, with an emphasis on understanding the capabilities, limitations, and best prompting strategies for LLMs in this domain. Notably, this thesis presents an agent configuration that offers a 102% improvement in challenge completion on a database of PicoCTF challenges compared to the published baseline. By analyzing a variety of agent strategies, response formats, and historical action representations in the context of CTF challenges, this work aims to provide insights into the best practices and limitations in leveraging LLMs for cybersecurity tasks. Additionally, this work proposes a hierarchical architecture to guide an LLM-enabled agent in performing complex, multi-step penetration testing tasks with strategic foresight. This proof of concept approach shows success in entry level challenges. While LLMs exhibit impressive capabilities, they are limited out of the box in their ability to solve complex, multi-step tasks requiring exploration, necessitating approaches such as those described in this work to improve performance in these areas.
Date issued
2024-05
URI
https://hdl.handle.net/1721.1/156291
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.