Show simple item record

dc.contributor.advisor: Jegelka, Stefanie
dc.contributor.author: Xu, Keyulu
dc.date.accessioned: 2022-01-14T15:04:32Z
dc.date.available: 2022-01-14T15:04:32Z
dc.date.issued: 2021-06
dc.date.submitted: 2021-06-23T19:41:02.425Z
dc.identifier.uri: https://hdl.handle.net/1721.1/139331
dc.description.abstract: Artificial intelligence can be more powerful than human intelligence. Many problems that are challenging from a human perspective, such as seeking statistical patterns in complex, structured objects like drug molecules and the global financial system, may be tractable for machines. Advances in deep learning have shown that the key to solving such tasks is learning a good representation. Given representations of the world, the second aspect of intelligence is reasoning. Learning to reason means learning to implement a correct reasoning process, both within and outside the training distribution. In this thesis, we address the fundamental problem of modeling intelligence that can learn to represent and reason about the world. We study both questions through the lens of graph neural networks, a class of neural networks that operate on graphs. First, we can abstract many objects in the world as graphs and learn their representations with graph neural networks. Second, we show how graph neural networks exploit the algorithmic structure of reasoning processes to improve generalization. This thesis consists of four parts, each studying one aspect of the theoretical landscape of learning: representation power, generalization, extrapolation, and optimization. In Part I, we characterize the expressive power of graph neural networks for representing graphs and build maximally powerful graph neural networks. In Part II, we analyze generalization and show implications for what reasoning a neural network can sample-efficiently learn; our analysis takes into account the training algorithm, the network structure, and the task structure. In Part III, we study how neural networks extrapolate and under what conditions they learn the correct reasoning outside the training distribution. In Part IV, we prove global convergence rates and develop normalization methods that accelerate the training of graph neural networks. Our techniques and insights go beyond graph neural networks and extend broadly to deep learning models.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Modeling Intelligence via Graph Neural Networks
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy

