Show simple item record

dc.contributor.authorLiao, Qianli
dc.contributor.authorPoggio, Tomaso
dc.date.accessioned2018-01-01T00:03:52Z
dc.date.available2018-01-01T00:03:52Z
dc.date.issued2017-12-31
dc.identifier.urihttp://hdl.handle.net/1721.1/113002
dc.description.abstractWe provide more detailed explanation of the ideas behind a recent paper on “Object-Oriented Deep Learning” [1] and extend it to handle 3D inputs/outputs. Similar to [1], every layer of the system takes in a list of “objects/symbols”, processes it and outputs another list of objects/symbols. In this report, the properties of the objects/symbols are extended to contain 3D information — including 3D orientations (i.e., rotation quaternion or yaw, pitch and roll) and one extra coordinate dimension (z-axis or depth). The resultant model is a novel end-to-end interpretable 3D representation that systematically factors out common 3D transformations such as translation and 3D rotation. As first proposed by [1] and discussed in more detail in [2], it offers a “symbolic disentanglement” solution to the problem of transformation invariance/equivariance. To demonstrate the effectiveness of the model, we show that it can achieve perfect performance on the task of 3D invariant recognition by training on one rotation of a 3D object and test it on 3D rotations (i.e., at arbitrary angles of yaw, pitch and roll). Furthermore, in a more realistic case where depth information is not given (similar to viewpoint invariant object recognition from 2D vision) our model generalizes reasonably well to novel viewpoints while ConvNets fail to generalize.en_US
dc.description.sponsorshipThis work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF - 1231216.en_US
dc.language.isoen_USen_US
dc.relation.ispartofseriesCBMM Memo Series;075
dc.rightsAttribution-NonCommercial-ShareAlike 3.0 United States*
dc.rights.urihttp://creativecommons.org/licenses/by-nc-sa/3.0/us/*
dc.title3D Object-Oriented Learning: An End-to-end Transformation-Disentangled 3D Representationen_US
dc.typeTechnical Reporten_US
dc.typeWorking Paperen_US
dc.typeOtheren_US


Files in this item

Thumbnail
Thumbnail

This item appears in the following Collection(s)

Show simple item record