
dc.contributor.author: Lu, Ziqi
dc.contributor.author: Zhang, Yihao
dc.contributor.author: Doherty, Kevin
dc.contributor.author: Severinsen, Odin
dc.contributor.author: Yang, Ethan
dc.contributor.author: Leonard, John
dc.date.accessioned: 2024-03-13T19:43:57Z
dc.date.available: 2024-03-13T19:43:57Z
dc.date.issued: 2022-10-23
dc.identifier.uri: https://hdl.handle.net/1721.1/153750
dc.description: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 23-27, 2022, Kyoto, Japan
dc.description.abstract: Recent progress in object pose prediction provides a promising path for robots to build object-level scene representations during navigation. However, when we deploy a robot in a novel environment, out-of-distribution data can degrade prediction performance. To mitigate the domain gap, we can perform self-training in the target domain, using predictions on robot-captured images as pseudo labels to fine-tune the object pose estimator. Unfortunately, the pose predictions are typically outlier-corrupted, and their uncertainties are hard to quantify, which can result in low-quality pseudo-labeled data. To address this problem, we propose a SLAM-supported self-training method that leverages the robot's understanding of 3D scene geometry to enhance object pose inference. Combining the pose predictions with robot odometry, we formulate and solve a pose graph optimization to refine the object pose estimates and make the pseudo labels more consistent across frames. We incorporate the pose prediction covariances as variables in the optimization to model their uncertainties automatically. This automatic covariance tuning (ACT) process can fit 6D pose prediction noise at the component level, leading to higher-quality pseudo training data. We test our method with the deep object pose estimator (DOPE) on the YCB-Video dataset and in real robot experiments, achieving accuracy improvements in pose prediction of 34.3% and 17.8%, respectively.
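
For context, the pose-graph refinement step the abstract describes can be sketched with an off-the-shelf factor-graph library. The snippet below is a minimal, hypothetical illustration (not the authors' released code), assuming GTSAM's Python bindings: camera poses are chained by odometry factors, each per-frame object pose prediction adds a camera-to-object factor, and optimizing the graph yields an object pose estimate that is consistent across frames and can be reprojected to form pseudo labels. The toy inputs and fixed noise sigmas are invented for illustration; the paper's ACT instead treats the prediction covariances as optimization variables.

import gtsam
import numpy as np

X = gtsam.symbol_shorthand.X  # camera pose keys
L = gtsam.symbol_shorthand.L  # object (landmark) pose key

# Toy trajectory: three camera frames stepping 0.1 m along x, one static object.
odometry = [gtsam.Pose3(gtsam.Rot3(), np.array([0.1, 0.0, 0.0]))] * 2
world_T_obj = gtsam.Pose3(gtsam.Rot3(), np.array([1.0, 0.0, 0.0]))
world_T_cam = [gtsam.Pose3()]
for rel in odometry:
    world_T_cam.append(world_T_cam[-1].compose(rel))
# Per-frame camera-to-object "predictions" (in practice, from an estimator like DOPE).
predictions = {k: world_T_cam[k].inverse().compose(world_T_obj) for k in range(3)}

graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-6))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-2))
pred_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-1))  # fixed here; ACT would tune it

# Anchor the first camera frame, then chain frames with odometry factors.
graph.add(gtsam.PriorFactorPose3(X(0), gtsam.Pose3(), prior_noise))
initial.insert(X(0), gtsam.Pose3())
for k, rel in enumerate(odometry):
    graph.add(gtsam.BetweenFactorPose3(X(k), X(k + 1), rel, odom_noise))
    initial.insert(X(k + 1), initial.atPose3(X(k)).compose(rel))

# Each frame's pose prediction constrains the shared object landmark.
for k, cam_T_obj in predictions.items():
    graph.add(gtsam.BetweenFactorPose3(X(k), L(0), cam_T_obj, pred_noise))
initial.insert(L(0), world_T_cam[0].compose(predictions[0]))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print("refined world-frame object pose:\n", result.atPose3(L(0)))

The refined world-frame object pose, composed with each optimized camera pose, gives the cross-frame-consistent per-image labels that the self-training loop would use for fine-tuning.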
dc.language.iso: en
dc.publisher: IEEE
dc.relation.isversionof: 10.1109/iros47612.2022.9981145
dc.rights: Creative Commons Attribution-Noncommercial-ShareAlike
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source: arxiv
dc.title: SLAM-Supported Self-Training for 6D Object Pose Estimation
dc.type: Article
dc.identifier.citation: Lu, Ziqi, Zhang, Yihao, Doherty, Kevin, Severinsen, Odin, Yang, Ethan et al. 2022. "SLAM-Supported Self-Training for 6D Object Pose Estimation."
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.eprint.version: Author's final manuscript
dc.type.uri: http://purl.org/eprint/type/ConferencePaper
eprint.status: http://purl.org/eprint/status/NonPeerReviewed
dc.date.updated: 2024-03-13T19:18:03Z
dspace.orderedauthors: Lu, Z; Zhang, Y; Doherty, K; Severinsen, O; Yang, E; Leonard, J
dspace.date.submission: 2024-03-13T19:18:05Z
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed

