Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

Marin, Javier; Biswas, Aritro; Ofli, Ferda; Hynes, Nicholas; Salvador, Amaia; Aytar, Yusuf; Weber, Ingmar; Torralba, Antonio

dc.contributor.author	Marin, Javier
dc.contributor.author	Biswas, Aritro
dc.contributor.author	Ofli, Ferda
dc.contributor.author	Hynes, Nicholas
dc.contributor.author	Salvador, Amaia
dc.contributor.author	Aytar, Yusuf
dc.contributor.author	Weber, Ingmar
dc.contributor.author	Torralba, Antonio
dc.date.accessioned	2021-04-01T19:28:34Z
dc.date.available	2021-04-01T19:28:34Z
dc.date.issued	2021-01
dc.identifier.issn	0162-8828
dc.identifier.issn	2160-9292
dc.identifier.issn	1939-3539
dc.identifier.uri	https://hdl.handle.net/1721.1/130340
dc.description.abstract	In this paper, we introduce Recipe1M+, a new large-scale, structured corpus of over one million cooking recipes and 13 million food images. As the largest publicly available collection of recipe data, Recipe1M+ affords the ability to train high-capacity models on aligned, multimodal data. Using these data, we train a neural network to learn a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Moreover, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M+ dataset and food and cooking in general. Code, data and models are publicly available at http://im2recipe.csail.mit.edu.	en_US
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)	en_US
dc.relation.isversionof	http://dx.doi.org/10.1109/tpami.2019.2927476	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	MIT web domain	en_US
dc.title	Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images	en_US
dc.type	Article	en_US
dc.identifier.citation	Marin, Javier et al. "Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images." IEEE Transactions on Pattern Analysis and Machine Intelligence (January 2021): 187 - 203 © 2021 IEEE	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.relation.journal	IEEE Transactions on Pattern Analysis and Machine Intelligence	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2021-01-28T15:58:54Z
dspace.orderedauthors	Marin, J; Biswas, A; Ofli, F; Hynes, N; Salvador, A; Aytar, Y; Weber, I; Torralba, A	en_US
dspace.date.submission	2021-01-28T15:59:01Z
mit.journal.volume	43	en_US
mit.journal.issue	1	en_US
mit.license	OPEN_ACCESS_POLICY
mit.metadata.status	Complete

Files in this item

Name: tpami19.pdf
Size: 16.36Mb
Format: PDF
Description: Accepted version

This item appears in the following Collection(s)

MIT Open Access Articles
