dc.contributor.author | Zeng, Andy | |
dc.contributor.author | Song, Shuran | |
dc.contributor.author | Suo, Daniel | |
dc.contributor.author | Walker, Ed | |
dc.contributor.author | Xiao, Jianxiong | |
dc.contributor.author | Yu, Kuan-Ting | |
dc.contributor.author | Rodriguez Garcia, Alberto | |
dc.date.accessioned | 2019-03-29T19:46:15Z | |
dc.date.available | 2019-03-29T19:46:15Z | |
dc.date.issued | 2017-07 | |
dc.date.submitted | 2017-06 | |
dc.identifier.isbn | 978-1-5090-4633-1 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/121121 | |
dc.description.abstract | Robot warehouse automation has attracted significant interest in recent years, perhaps most visibly in the Amazon Picking Challenge (APC) [1]. A fully autonomous warehouse pick-and-place system requires robust vision that reliably recognizes and locates objects amid cluttered environments, self-occlusions, sensor noise, and a large variety of objects. In this paper we present an approach that leverages multi-view RGB-D data and self-supervised, data-driven learning to overcome those difficulties. The approach was part of the MIT-Princeton Team system that took 3rd and 4th place in the stowing and picking tasks, respectively, at APC 2016. In the proposed approach, we segment and label multiple views of a scene with a fully convolutional neural network, and then fit pre-scanned 3D object models to the resulting segmentation to obtain the 6D object pose. Training a deep neural network for segmentation typically requires a large amount of training data. We propose a self-supervised method to generate a large labeled dataset without tedious manual segmentation. We demonstrate that our system can reliably estimate the 6D pose of objects under a variety of scenarios. All code, data, and benchmarks are available at http://apc.cs.princeton.edu/ | en_US |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1109/ICRA.2017.7989165 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
dc.source | arXiv | en_US |
dc.title | Multi-view self-supervised deep learning for 6D pose estimation in the Amazon Picking Challenge | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Zeng, Andy, Kuan-Ting Yu, Shuran Song, Daniel Suo, Ed Walker, Alberto Rodriguez, and Jianxiong Xiao. “Multi-View Self-Supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge.” 2017 IEEE International Conference on Robotics and Automation (ICRA), 29 May - 3 June, 2017, Singapore, Singapore, IEEE, 2017. © 2017 IEEE | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Mechanical Engineering | en_US |
dc.contributor.mitauthor | Yu, Kuan-Ting | |
dc.contributor.mitauthor | Rodriguez Garcia, Alberto | |
dc.relation.journal | 2017 IEEE International Conference on Robotics and Automation (ICRA) | en_US |
dc.eprint.version | Author's final manuscript | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2018-12-17T18:13:34Z | |
dspace.orderedauthors | Zeng, Andy; Yu, Kuan-Ting; Song, Shuran; Suo, Daniel; Walker, Ed; Rodriguez, Alberto; Xiao, Jianxiong | en_US |
dspace.embargo.terms | N | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-8954-2310 | |
dc.identifier.orcid | https://orcid.org/0000-0002-1119-4512 | |
mit.license | OPEN_ACCESS_POLICY | en_US |