RobotP: A Benchmark Dataset for 6D Object Pose Estimation

Honglin Yuan et al. Sensors (Basel). 2021 Feb 11;21(4):1299.
doi: 10.3390/s21041299.

Abstract

Deep learning has achieved great success on robotic vision tasks. However, compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset, consisting of commonly used objects, for benchmarking 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups through the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested on the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and that photo-realistic images are helpful in increasing the performance of pose estimation algorithms.
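The abstract notes that segmentation masks and 2D bounding boxes are derived automatically from the reconstructed 3D models and their ground truth poses. A minimal sketch of that idea, assuming a standard pinhole camera model; the intrinsics `K`, pose `(R, t)`, and toy cube below are hypothetical illustration values, not taken from the paper:

```python
import numpy as np

def project_points(points, R, t, K):
    """Project 3D model points into the image under pose (R, t) with intrinsics K."""
    cam = points @ R.T + t          # world frame -> camera frame
    uv = cam @ K.T                  # camera frame -> homogeneous pixel coordinates
    return uv[:, :2] / uv[:, 2:3]   # perspective divide

def bbox_from_projection(points, R, t, K):
    """2D bounding box (x_min, y_min, x_max, y_max) of the projected model."""
    uv = project_points(points, R, t, K)
    return (*uv.min(axis=0), *uv.max(axis=0))

# Toy example: a 10 cm cube half a meter in front of the camera.
cube = np.array([[x, y, z] for x in (0.0, 0.1) for y in (0.0, 0.1) for z in (0.0, 0.1)])
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 0.5])
print(bbox_from_projection(cube, R, t, K))  # -> (320.0, 240.0, 420.0, 340.0)
```

Rasterizing the projected model instead of taking its extremes would give a per-pixel segmentation mask by the same principle.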

Keywords: 3D reconstruction; 6D pose estimation; benchmark dataset; sensors.


Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Figures

Figure 1
Scene examples and visualizations of poses estimated by the approach proposed in our benchmark.
Figure 2
Everyday objects in our dataset.
Figure 3
Different three-dimensional (3D) cameras. Left: time-of-flight camera. Middle: structured-light camera. Right: depth-from-stereo camera.
Figure 4
The pipeline of the pose estimation process. The inputs are RGB images, and the initial poses of these images are estimated by Structure from Motion (SfM). The initial poses are then refined locally and globally.
Figure 5
The local pose groups, clustered based on angle and distance similarities.
Figure 6
(a) The captured depth image: the red rectangle marks the invalid depth band on the left. (b) Misalignment of color-and-depth image pairs: the images are captured when the object is close to the camera, resulting in large misalignment.
Figure 7
Depth images estimated by COLMAP, showing better alignment.
Figure 8
Snapshots from our simulator showing a robot synthesizing data. Green points and red lines are positions and view directions of input cameras, black lines are the view directions of the virtual camera, and the long line is the whole trajectory of the virtual camera.
Figure 9
Reprojection error comparison with and without pose refinement for different objects.
Figure 10
Examples of depth alignment results on the table1 and table2 scenarios. The first column shows the aligned depth image, the second column shows the matching between captured depth and color images, and the third column shows the matching between aligned depth and color images. Black indicates missing information.
Figure 11
The depth fusion results on the table1 and table2 scenarios. The first column shows color images, the second column shows depth images estimated by COLMAP, and the third column shows depth images generated by our approach.
Figure 12
Examples of three-dimensional (3D) point clouds for objects in our dataset. The point clouds in (a) and (b) are generated by COLMAP and our approach, respectively.
Figure 13
Examples of segmentation masks and bounding boxes for different objects.
Figure 14
Examples of synthesized color-and-depth image pairs.
Figure 15
Examples of accuracy performance. Each 3D model is projected to the image plane with the estimated 6D pose.
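Figure 9 compares reprojection error with and without pose refinement. As a hedged illustration of the metric itself (not the paper's implementation), the reprojection error of a pose can be computed as the mean pixel distance between observed 2D points and the projections of their 3D counterparts; the intrinsics and synthetic points below are illustrative assumptions:

```python
import numpy as np

def project(points3d, R, t, K):
    """Pinhole projection of 3D points under pose (R, t) with intrinsics K."""
    cam = points3d @ R.T + t        # world frame -> camera frame
    uv = cam @ K.T                  # camera frame -> homogeneous pixels
    return uv[:, :2] / uv[:, 2:3]   # perspective divide

def mean_reprojection_error(points3d, points2d, R, t, K):
    """Average pixel distance between observations and reprojected 3D points."""
    return np.linalg.norm(project(points3d, R, t, K) - points2d, axis=1).mean()

# Synthetic check: the exact pose reprojects perfectly; a perturbed pose does not.
rng = np.random.default_rng(0)
pts3d = rng.uniform(-0.1, 0.1, (50, 3)) + np.array([0.0, 0.0, 0.8])
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
obs = project(pts3d, R, t, K)

print(mean_reprojection_error(pts3d, obs, R, t, K))                  # 0: exact pose
print(mean_reprojection_error(pts3d, obs, R, t + np.array([0.0, 0.0, 0.05]), K))
```

Pose refinement, as compared in the figure, is precisely the process of reducing this quantity over the estimated poses.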
