| dc.contributor.advisor | Joshua B. Tenenbaum. | en_US |
| dc.contributor.author | Hartman, William R., M. Eng. Massachusetts Institute of Technology. | en_US |
| dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
| dc.date.accessioned | 2019-12-05T18:07:33Z | |
| dc.date.available | 2019-12-05T18:07:33Z | |
| dc.date.copyright | 2019 | en_US |
| dc.date.issued | 2019 | en_US |
| dc.identifier.uri | https://hdl.handle.net/1721.1/123176 | |
| dc.description | This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. | en_US |
| dc.description | Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 | en_US |
| dc.description | Cataloged from student-submitted PDF version of thesis. | en_US |
| dc.description | Includes bibliographical references (pages 41-42). | en_US |
| dc.description.abstract | You are using your brain to understand this sentence, but can you explain precisely how you do it? Although we constantly experience language processing first-hand, we still do not know exactly how it is done. Linguists and computer scientists have been working separately to discover the mechanisms necessary for high-performing language processing, but neither group has yet found the holy grail. In this work, we take the first steps toward bridging the gap between these two approaches by developing a method that discovers, with statistical significance, brain-like neural network sub-architectures in their simplest form. Rather than only evaluating established NLP models for brain-likeness, our objective is to find new architectures and computations that are especially brain-like. The method randomly generates a large and varied collection of neural network architectures in pursuit of architectures that mimic fMRI data in tasks such as language modeling, translation, and summarization. Because all hyper-parameters are fixed across models, the method can identify the sub-architectures associated with the most brain-like models and return them in their simplest form. The returned sub-architectures enable two important analyses. First, because they are pruned to the most brain-like components, the computations these smaller sub-architectures perform are easier to interpret than those of the architecture as a whole, and interpretability is crucial for understanding the mechanisms intrinsic to language processing. Second, the sub-architectures may help improve future architecture samples; for instance, their brain-like computations may be defined as unit operations in order to bias more models to include them. | en_US |
| dc.description.statementofresponsibility | by William R. Hartman. | en_US |
| dc.format.extent | 42 pages | en_US |
| dc.language.iso | eng | en_US |
| dc.publisher | Massachusetts Institute of Technology | en_US |
| dc.rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. | en_US |
| dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
| dc.subject | Electrical Engineering and Computer Science. | en_US |
| dc.title | Uncovering brain-like computations for natural language processing | en_US |
| dc.type | Thesis | en_US |
| dc.description.degree | M. Eng. | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
| dc.identifier.oclc | 1129456672 | en_US |
| dc.description.collection | M.Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science | en_US |
| dspace.imported | 2019-12-05T18:07:32Z | en_US |
| mit.thesis.degree | Master | en_US |
| mit.thesis.department | EECS | en_US |