An Intent-based Neural Monte Carlo Tree Search Framework for Synthesis of Printed Circuit Boards
Author(s)
Kaphle, Arpan
Download: Thesis PDF (4.449 MB)
Advisor
Wu, Cathy
Hogan, Taylor
Roberto, Luke
Abstract
PCB synthesis is a difficult joint optimization problem that has so far eluded automation in the Electronic Design Automation (EDA) industry. Algorithms that intelligently learn to solve these problems have not yet been widely explored. Cadence Design Systems, with which the author is affiliated, works on solving such problems. This thesis proposes augmenting Monte Carlo Tree Search (MCTS) to improve search and to self-generate datasets, culminating in a process called the Learning Feedback System (LFS). This process uses past data to accelerate MCTS with deep RL models on new or similar board configurations. The generated datasets are consumed by dataset-driven Reinforcement Learning (RL) algorithms, known as 'offline' and 'off-policy' algorithms, to solve this problem in a useful, simplified scope. The problem scope begins where other approaches have left a design with constraint violations. Two baselines are used: an algorithmically improved version of MCTS, and that version further accelerated with PPO, a purely online, non-demonstrator-based deep RL algorithm. The results show that MCTS enables smooth self-generation of datasets, a process inspired by AlphaGo Zero. In addition, we find that off-policy and expert-based RL algorithms such as Adversarial Inverse Reinforcement Learning (AIRL) and Generative Adversarial Imitation Learning (GAIL) can exploit the generated dataset to improve board solving over time and, once properly tuned, perform far better than the baseline trained for the same amount of time. We also find that the complexity of the problem is related to the performance of the baseline. Exploring offline Conservative Q-Learning (CQL) within this MCTS-connected environment, we find that its performance was not up to par, but it was still able to generalize reasonable actions. Comparing overall actions per episode, we find that all approaches can be tuned to further accelerate MCTS's decision making and help it prune better in larger state spaces. Among the methods tried, the proposed neural-accelerated MCTS feedback loop appears to perform best with expert-based RL methods.
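As a rough illustrative sketch (not the thesis implementation), the LFS loop described above can be pictured as MCTS self-play writing (state, action) demonstrations into a buffer that an off-policy or expert-based learner later consumes. All names below (BoardEnv, mcts_search, collect_episode) are hypothetical stand-ins.

```python
# Illustrative sketch only: hypothetical stand-ins for the LFS dataset
# self-generation phase, not the thesis code.
import random
from collections import deque

class BoardEnv:
    """Toy stand-in for a PCB constraint-repair environment."""
    def __init__(self, n_violations=5):
        self.n_violations = n_violations

    def reset(self):
        self.remaining = self.n_violations
        return self.remaining

    def step(self, action):
        # Assume each legal action resolves one constraint violation.
        self.remaining -= 1
        done = self.remaining == 0
        return self.remaining, (1.0 if done else 0.0), done

def mcts_search(state, n_actions=4):
    """Placeholder for the augmented MCTS policy; returns a chosen action."""
    return random.randrange(n_actions)

def collect_episode(env, dataset):
    """Self-generate (state, action) demonstrations, as in the LFS loop."""
    state, done = env.reset(), False
    while not done:
        action = mcts_search(state)
        dataset.append((state, action))
        state, _, done = env.step(action)

if __name__ == "__main__":
    dataset = deque(maxlen=10_000)   # demonstration / replay buffer
    env = BoardEnv()
    for _ in range(100):             # dataset self-generation phase
        collect_episode(env, dataset)
    # The collected demonstrations would then train an off-policy or
    # expert-based learner (e.g. GAIL/AIRL), which in turn biases MCTS.
    print(f"Collected {len(dataset)} state-action demonstrations")
```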
Date issued
2022-05

Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher
Massachusetts Institute of Technology