DSpace@MIT (MIT Libraries)


How to Make a Pizza: Learning a Compositional Layer-Based GAN Model

Author(s)
Papadopoulos, Dim P; Tamaazousti, Youssef; Ofli, Ferda; Weber, Ingmar; Torralba, Antonio
Download: 1906.02839(1).pdf (7.707Mb)
Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract
A food recipe is an ordered set of instructions for preparing a particular dish. From a visual perspective, every instruction step can be seen as a way to change the visual appearance of the dish by adding extra objects (e.g., adding an ingredient) or changing the appearance of the existing ones (e.g., cooking the dish). In this paper, we aim to teach a machine how to make a pizza by building a generative model that mirrors this step-by-step procedure. To do so, we learn composable module operations that can either add or remove a particular ingredient. Each operator is designed as a Generative Adversarial Network (GAN). Given only weak image-level supervision, the operators are trained to generate a visual layer that needs to be added to or removed from the existing image. The proposed model can decompose an image into an ordered sequence of layers by applying the corresponding removing modules sequentially, in the right order. Experimental results on synthetic and real pizza images demonstrate that our proposed model is able to: (1) segment pizza toppings in a weakly-supervised fashion, (2) remove them by revealing what is occluded underneath them (i.e., inpainting), and (3) infer the ordering of the toppings without any depth ordering supervision. Code, data, and models are available online.
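The layer-based view in the abstract can be illustrated with a toy sketch. This is not the authors' GAN model: it assumes each ingredient layer is already known as an appearance map plus a binary mask, and it uses ordinary mask-based compositing as a stand-in for the learned adding and removing modules. The names `Layer`, `add_layer`, `compose`, and `peel` are hypothetical and do not come from the paper or its released code.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

Image = List[List[float]]  # tiny grayscale image, values in [0, 1]


@dataclass
class Layer:
    """Hypothetical visual layer: per-pixel appearance plus a binary mask."""
    name: str
    appearance: Image
    mask: Image  # 1.0 where the ingredient is visible, 0.0 elsewhere


def add_layer(image: Image, layer: Layer) -> Image:
    """Composite one layer on top: mask * appearance + (1 - mask) * image."""
    return [
        [m * a + (1.0 - m) * p for p, a, m in zip(img_row, app_row, mask_row)]
        for img_row, app_row, mask_row in zip(image, layer.appearance, layer.mask)
    ]


def compose(base: Image, layers: List[Layer]) -> Image:
    """Apply the 'add' step for each layer, bottom to top."""
    image = base
    for layer in layers:
        image = add_layer(image, layer)
    return image


def peel(base: Image, layers: List[Layer]) -> Iterator[Tuple[str, Image]]:
    """Remove layers top-first, yielding (removed name, remaining image).

    Assumption: the layers and their order are given. In the paper these are
    what the model has to infer from weak image-level labels.
    """
    for k in range(len(layers) - 1, -1, -1):
        yield layers[k].name, compose(base, layers[:k])


# 2x2 toy "pizza": uniform dough, pepperoni on the top row, one olive on top of it.
base = [[0.2, 0.2], [0.2, 0.2]]
pepperoni = Layer("pepperoni", [[0.8, 0.8], [0.8, 0.8]], [[1.0, 1.0], [0.0, 0.0]])
olives = Layer("olives", [[0.1, 0.1], [0.1, 0.1]], [[1.0, 0.0], [0.0, 0.0]])

pizza = compose(base, [pepperoni, olives])
steps = list(peel(base, [pepperoni, olives]))
```

In this toy setting, removing the olive layer exposes the pepperoni pixel underneath it, which is the effect the abstract describes as revealing occluded content. The sketch recovers that pixel only because the lower layers are stored explicitly; the paper's removing modules instead have to generate the hidden content.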
Date issued
2020-01
URI
https://hdl.handle.net/1721.1/132649
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Papadopoulos, Dim P. et al. "How to Make a Pizza: Learning a Compositional Layer-Based GAN Model." 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019, Long Beach, CA, USA, Institute of Electrical and Electronics Engineers, January 2020. © 2019 IEEE
Version: Author's final manuscript
ISBN
978-1-7281-3293-8
ISSN
2575-7075

Collections
  • MIT Open Access Articles
