DSpace@MIT (MIT Libraries)

Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia

Author(s)
Delelegn, Yonatan
Download
Thesis PDF (3.755 MB)
Advisor
Edelman, Alan
Rackauckas, Christopher
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Traditional deep learning models typically consist of explicitly defined layers, such as the fully connected and self-attention layers found in Transformers, which have been pivotal in recent advances in computer vision and large language models. Selecting an appropriate architecture is critical for these models. However, even with an optimal architecture, these models may fail to capture intricate relationships and dependencies within hidden states due to the inherent limitations of the chosen layers. Furthermore, in several scientific applications, particularly those simulating physical systems, there is a pressing need to integrate domain-specific knowledge into the modeling process, a task for which explicit neural networks may not be ideally suited. Recent studies, such as [2] and [4], have highlighted the potential of implicit layers to capture more complex relationships and learn more stringent constraints than traditional neural networks. Beyond capturing intricate relationships, implicit layers offer the advantage of decoupling the solution process from the layer definition, thus facilitating faster training and the seamless integration of domain-specific knowledge. For implicit models to rival state-of-the-art performance, robust and efficient solvers are required for the forward pass. In this project, we focus on exploring stable and efficient solvers, specifically pseudo-transient methods, for solving neural complementarity problems. We aim to derive the sensitivity analysis of these problems, implement it in Julia, and delve into the applications of differentiable complementarity problems in fields such as economics, game theory, and optimization.
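To make the abstract's central idea concrete: a scalar nonlinear complementarity problem asks for x >= 0 with g(x) >= 0 and x * g(x) = 0. A standard route is to recast it as root finding via the Fischer-Burmeister function phi(a, b) = a + b - sqrt(a^2 + b^2), whose zeros are exactly the complementarity solutions, and then drive the residual to zero with pseudo-transient continuation, a damped Newton iteration with an adaptive pseudo-timestep. The thesis implements such solvers in Julia; the sketch below is only an illustrative scalar version in Python, and the function names, the timestep rule, and the toy problem are this sketch's own assumptions, not the thesis code.

```python
import math

def fischer_burmeister(a, b):
    # phi(a, b) = 0  <=>  a >= 0, b >= 0, and a * b = 0
    return a + b - math.sqrt(a * a + b * b)

def pseudo_transient_solve(F, x0, dt0=1.0, tol=1e-10, max_iter=100):
    """Scalar pseudo-transient continuation for F(x) = 0.

    Each step solves (1/dt + F'(x)) * s = -F(x), i.e. an implicit Euler
    step for the flow x' = -F(x). The pseudo-timestep dt grows by
    switched evolution relaxation (SER) as the residual shrinks, so the
    iteration transitions into a plain Newton step near the solution.
    """
    x, dt = x0, dt0
    for _ in range(max_iter):
        r = F(x)
        if abs(r) < tol:
            return x
        h = 1e-7 * max(abs(x), 1.0)
        dF = (F(x + h) - F(x - h)) / (2.0 * h)      # finite-difference Jacobian
        x_new = x - r / (1.0 / dt + dF)             # damped Newton step
        dt = dt * abs(r) / max(abs(F(x_new)), tol)  # SER timestep update
        x = x_new
    return x

# Toy NCP: x >= 0, g(x) = x - 1 >= 0, x * g(x) = 0  (solution: x = 1)
g = lambda x: x - 1.0
root = pseudo_transient_solve(lambda x: fischer_burmeister(x, g(x)), x0=5.0)
```

Far from the solution, the finite pseudo-timestep damps the Newton step and keeps the iteration stable; as the residual falls, the SER update enlarges dt, recovering fast Newton-like convergence near the root. This is the stability/efficiency trade-off the abstract attributes to pseudo-transient methods.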
Date issued
2024-02
URI
https://hdl.handle.net/1721.1/153881
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.