Show simple item record

dc.contributor.author: Papadopoulos, Dim P
dc.contributor.author: Tamaazousti, Youssef
dc.contributor.author: Ofli, Ferda
dc.contributor.author: Weber, Ingmar
dc.contributor.author: Torralba, Antonio
dc.date.accessioned: 2021-09-27T17:17:37Z
dc.date.available: 2021-09-27T17:17:37Z
dc.date.issued: 2020-01
dc.date.submitted: 2019-06
dc.identifier.isbn: 978-1-7281-3293-8
dc.identifier.issn: 2575-7075
dc.identifier.uri: https://hdl.handle.net/1721.1/132649
dc.description.abstract: A food recipe is an ordered set of instructions for preparing a particular dish. From a visual perspective, every instruction step can be seen as a way to change the visual appearance of the dish by adding extra objects (e.g., adding an ingredient) or changing the appearance of the existing ones (e.g., cooking the dish). In this paper, we aim to teach a machine how to make a pizza by building a generative model that mirrors this step-by-step procedure. To do so, we learn composable module operations which are able to either add or remove a particular ingredient. Each operator is designed as a Generative Adversarial Network (GAN). Given only weak image-level supervision, the operators are trained to generate a visual layer that needs to be added to or removed from the existing image. The proposed model is able to decompose an image into an ordered sequence of layers by applying sequentially in the right order the corresponding removing modules. Experimental results on synthetic and real pizza images demonstrate that our proposed model is able to: (1) segment pizza toppings in a weakly-supervised fashion, (2) remove them by revealing what is occluded underneath them (i.e., inpainting), and (3) infer the ordering of the toppings without any depth ordering supervision. Code, data, and models are available online. [en_US]
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/cvpr.2019.00819 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: arXiv [en_US]
dc.title: How to Make a Pizza: Learning a Compositional Layer-Based GAN Model [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Papadopoulos, Dim P. et al. "How to Make a Pizza: Learning a Compositional Layer-Based GAN Model." 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019, Long Beach, CA, USA, Institute of Electrical and Electronics Engineers, January 2020. © 2019 IEEE [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.relation.journal: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.date.submission: 2021-01-28T13:53:51Z
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Complete [en_US]
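
The layer-based idea in the abstract (each operator produces a visual layer that is added to or removed from the existing image) can be illustrated with ordinary mask-based compositing. The sketch below is not the authors' released code. The names `composite`, `add_topping`, and `remove_topping` are hypothetical, the 1x4 grayscale "images" are toy data, and the masks and layer contents are written by hand here, whereas in the paper they are produced by trained GAN modules.

```python
from typing import List

Image = List[List[float]]  # toy grayscale image, values in [0, 1]


def composite(base: Image, layer: Image, mask: Image) -> Image:
    """Blend `layer` over `base`: mask=1 takes the layer, mask=0 keeps the base."""
    return [
        [m * l + (1.0 - m) * b for b, l, m in zip(b_row, l_row, m_row)]
        for b_row, l_row, m_row in zip(base, layer, mask)
    ]


def add_topping(image: Image, appearance: Image, mask: Image) -> Image:
    """Adding operator (assumed form): paint the topping where its mask is on."""
    return composite(image, appearance, mask)


def remove_topping(image: Image, underneath: Image, mask: Image) -> Image:
    """Removing operator (assumed form): reveal what lies under the topping."""
    return composite(image, underneath, mask)


# A plain pizza base, then two overlapping toppings added in order.
base = [[0.2, 0.2, 0.2, 0.2]]
pepperoni, pepperoni_mask = [[0.8] * 4], [[1.0, 1.0, 0.0, 0.0]]
olives, olives_mask = [[0.5] * 4], [[0.0, 1.0, 1.0, 0.0]]

with_pepperoni = add_topping(base, pepperoni, pepperoni_mask)
full_pizza = add_topping(with_pepperoni, olives, olives_mask)

# Decomposition runs the removals in reverse order: the topmost layer first.
# Here the "inpainted" content is simply the known earlier image.
without_olives = remove_topping(full_pizza, with_pepperoni, olives_mask)
plain = remove_topping(without_olives, base, pepperoni_mask)
```

Removing the toppings in the reverse of the order they were added recovers each intermediate image, which is the sense in which a sequence of removals yields an ordered set of layers.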

