dc.contributor.author: Jing, Li
dc.contributor.author: Gulcehre, Caglar
dc.contributor.author: Peurifoy, John
dc.contributor.author: Shen, Yichen
dc.contributor.author: Tegmark, Max
dc.contributor.author: Soljacic, Marin
dc.contributor.author: Bengio, Yoshua
dc.date.accessioned: 2021-10-27T20:10:57Z
dc.date.available: 2021-10-27T20:10:57Z
dc.date.issued: 2019
dc.identifier.uri: https://hdl.handle.net/1721.1/135148
dc.description.abstract: © 2019 Massachusetts Institute of Technology. We present a novel recurrent neural network (RNN)-based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in their memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to gated recurrent unit RNNs with a reset gate and an update gate. Our model is able to outperform long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget. This ability plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks.
dc.language.iso: en
dc.publisher: MIT Press - Journals
dc.relation.isversionof: 10.1162/neco_a_01174
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
dc.source: MIT Press
dc.title: Gated Orthogonal Recurrent Units: On Learning to Forget
dc.type: Article
dc.contributor.department: Sloan School of Management
dc.contributor.department: Massachusetts Institute of Technology. Department of Physics
dc.relation.journal: Neural Computation
dc.eprint.version: Final published version
dc.type.uri: http://purl.org/eprint/type/JournalArticle
eprint.status: http://purl.org/eprint/status/PeerReviewed
dc.date.updated: 2019-06-05T12:08:35Z
dspace.orderedauthors: Jing, L; Gulcehre, C; Peurifoy, J; Shen, Y; Tegmark, M; Soljacic, M; Bengio, Y
dspace.date.submission: 2019-06-05T12:08:36Z
mit.journal.volume: 31
mit.journal.issue: 4
mit.metadata.status: Authority Work and Publication Information Needed

