Show simple item record

dc.contributor.author: Yen-Chen, Lin
dc.contributor.author: Florence, Pete
dc.contributor.author: Barron, Jonathan T.
dc.contributor.author: Lin, Tsung-Yi
dc.contributor.author: Rodriguez, Alberto
dc.contributor.author: Isola, Phillip
dc.date.accessioned: 2024-03-08T16:58:27Z
dc.date.available: 2024-03-08T16:58:27Z
dc.date.issued: 2022-05-23
dc.identifier.uri: https://hdl.handle.net/1721.1/153644
dc.description: 2022 International Conference on Robotics and Automation (ICRA), 23-27 May 2022 en_US
dc.description.abstract: Thin, reflective objects such as forks and whisks are common in our daily lives, but they are particularly challenging for robot perception because it is hard to reconstruct them using commodity RGB-D cameras or multi-view stereo techniques. While traditional pipelines struggle with objects like these, Neural Radiance Fields (NeRFs) have recently been shown to be remarkably effective for performing view synthesis on objects with thin structures or reflective materials. In this paper, we explore the use of NeRF as a new source of supervision for robust robot vision systems. In particular, we demonstrate that a NeRF representation of a scene can be used to train dense object descriptors. We use an optimized NeRF to extract dense correspondences between multiple views of an object, and then use these correspondences as training data for learning a view-invariant representation of the object. NeRF's use of a density field allows us to reformulate the correspondence problem with a novel distribution-of-depths formulation, as opposed to the conventional approach of using a depth map. Dense correspondence models supervised with our method significantly outperform off-the-shelf learned descriptors by 106% (PCK@3px metric, more than doubling performance) and outperform our baseline supervised with multi-view stereo by 29%. Furthermore, we demonstrate that the learned dense descriptors enable robots to perform accurate 6-degree-of-freedom (6-DoF) pick-and-place of thin and reflective objects. en_US
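The distribution-of-depths idea described in the abstract can be sketched as follows: rather than committing to a single depth-map value per pixel, the volume-rendering weights along a NeRF ray define a probability distribution over candidate depths, from which a depth is sampled and the pixel reprojected into a second view to form a correspondence. This is a minimal illustrative sketch under standard pinhole-camera assumptions, not the authors' implementation; the function names, camera matrices, and sample values below are hypothetical.

```python
import numpy as np

def render_weights(sigmas, deltas):
    # Standard NeRF volume-rendering weights: w_i = T_i * (1 - exp(-sigma_i * delta_i)),
    # where T_i is the accumulated transmittance up to sample i.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    return trans * alphas

def sample_depth(depths, weights, rng):
    # Treat the normalized weights as a categorical distribution over depth
    # (the "distribution of depths"), instead of collapsing to one depth value.
    p = weights / weights.sum()
    return rng.choice(depths, p=p)

def reproject(uv, depth, K, T_src_to_tgt):
    # Back-project pixel uv at the sampled depth, transform into the target
    # camera frame, and project to get the corresponding target pixel.
    ray = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    point_src = ray * depth
    point_tgt = (T_src_to_tgt @ np.append(point_src, 1.0))[:3]
    proj = K @ point_tgt
    return proj[:2] / proj[2]

# Toy example: a sharp density spike at the third sample dominates the
# depth distribution, so the sampled depth lands there.
rng = np.random.default_rng(0)
depths = np.array([1.0, 2.0, 3.0, 4.0])
sigmas = np.array([0.0, 0.0, 50.0, 0.0])   # density spike at depth 3.0
deltas = np.full(4, 0.1)                   # sample spacing along the ray
weights = render_weights(sigmas, deltas)
d = sample_depth(depths, weights, rng)

K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])            # toy intrinsics
uv_tgt = reproject((10.0, 20.0), d, K, np.eye(4))  # identity pose for sanity check
```

With an identity relative pose the reprojection must return the original pixel, which is a useful sanity check before plugging in real camera poses.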
dc.language.iso: en_US
dc.publisher: IEEE en_US
dc.relation.isversionof: 10.1109/icra46639.2022.9812291 en_US
dc.rights: Creative Commons Attribution-Noncommercial-ShareAlike en_US
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International *
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ en_US
dc.source: IEEE en_US
dc.title: NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance Fields en_US
dc.type: Article en_US
dc.identifier.citation: Yen-Chen, Lin, Florence, Pete, Barron, Jonathan T., Lin, Tsung-Yi, Rodriguez, Alberto et al. 2022. "NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance Fields."
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.department: Massachusetts Institute of Technology. Department of Mechanical Engineering
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.eprint.version: Author's final manuscript en_US
dc.type.uri: http://purl.org/eprint/type/ConferencePaper en_US
eprint.status: http://purl.org/eprint/status/NonPeerReviewed en_US
dspace.date.submission: 2024-03-08T16:55:47Z
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed en_US

