Label Fusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes

Marion, James Patrick; Florence, Peter Raymond; Manuelli, Lucas; Tedrake, Russell L

dc.contributor.author	Marion, James Patrick
dc.contributor.author	Florence, Peter Raymond
dc.contributor.author	Manuelli, Lucas
dc.contributor.author	Tedrake, Russell L
dc.date.accessioned	2021-01-14T20:45:25Z
dc.date.available	2021-01-14T20:45:25Z
dc.date.issued	2018-09
dc.date.submitted	2018-05
dc.identifier.isbn	9781538630815
dc.identifier.issn	2577-087X
dc.identifier.uri	https://hdl.handle.net/1721.1/129426
dc.description.abstract	Deep neural network (DNN) architectures have been shown to outperform traditional pipelines for object segmentation and pose estimation using RGBD data, but the performance of these DNN pipelines is directly tied to how representative the training data is of the true data. Hence, a key requirement for employing these methods in practice is to have a large set of labeled data for your specific robotic manipulation task, a requirement that is not generally satisfied by existing datasets. In this paper, we develop a pipeline to rapidly generate high-quality RGBD data with pixelwise labels and object poses. We use an RGBD camera to collect video of a scene from multiple viewpoints and leverage existing reconstruction techniques to produce a dense 3D reconstruction. We label the 3D reconstruction using human-assisted ICP fitting of object meshes. By reprojecting the results of labeling the 3D scene, we can produce labels for each RGBD image of the scene. This pipeline enabled us to collect over 1,000,000 labeled object instances in just a few days. We use this dataset to answer questions related to how much training data is required, and of what quality the data must be, to achieve high performance from a DNN architecture. Our dataset and annotation pipeline are available at labelfusion.csail.mit.edu.	en_US
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)	en_US
dc.relation.isversionof	http://dx.doi.org/10.1109/icra.2018.8460950	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	arXiv	en_US
dc.title	Label Fusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes	en_US
dc.type	Article	en_US
dc.identifier.citation	Marion, Pat et al. "Label Fusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes." 2018 IEEE International Conference on Robotics and Automation, May 2018, Brisbane, Australia, Institute of Electrical and Electronics Engineers, September 2018. © 2018 IEEE	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.relation.journal	2018 IEEE International Conference on Robotics and Automation	en_US
dc.eprint.version	Original manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2019-07-16T12:01:18Z
dspace.date.submission	2019-07-16T12:01:56Z
mit.metadata.status	Complete

Files in this item

Name: 1707.04796.pdf
Size: 5.712Mb
Format: PDF
Description: Submitted version

This item appears in the following Collection(s)

MIT Open Access Articles
