Deriving neural architectures from sequence and graph kernels

Lei, Tao; Jin, Wengong; Barzilay, Regina; Jaakkola, Tommi S

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

The design of neural architectures for structured objects is typically guided by experimental insights rather than a formal process. In this work, we appeal to kernels over combinatorial structures, such as sequences and graphs, to derive appropriate neural operations. We introduce a class of deep recurrent neural operations and formally characterize their associated kernel spaces. Our recurrent modules compare the input to virtual reference objects (cf. filters in CNN) via the kernels. Similar to traditional neural operations, these reference objects are parameterized and directly optimized in end-to-end training. We empirically evaluate the proposed class of neural architectures on standard applications such as language modeling and molecular graph regression, achieving state-of-the-art results across these applications.

Date issued

2017

URI

https://hdl.handle.net/1721.1/130480

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory

Journal

Proceedings of the 34th International Conference on Machine Learning

Publisher

MLResearch Press

Citation

Lei, Tao et al. "Deriving neural architectures from sequence and graph kernels." Proceedings of the 34th International Conference on Machine Learning, August 2017, Sydney, Australia, MLResearch Press, 2017. © 2017 The author(s)

Version: Author's final manuscript

Collections

MIT Open Access Articles