A survey on 3D object detection in real time for autonomous driving

Marcelo Contreras¹, Aayush Jain², Neel P Bhatt¹, Arunava Banerjee¹, Ehsan Hashemi¹

PMID: 38510560
PMCID: PMC10950960
DOI: 10.3389/frobt.2024.1212070

Review

Marcelo Contreras et al. Front Robot AI. 2024 Mar 6:11:1212070. doi: 10.3389/frobt.2024.1212070. eCollection 2024.

Affiliations

¹ University of Alberta, Edmonton, AB, Canada.
² Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India.

Abstract

This survey reviews advances in 3D object detection approaches for autonomous driving. It first gives a brief introduction to 2D object detection and identifies the drawbacks of existing methodologies in highly dynamic environments. Subsequently, this paper reviews state-of-the-art 3D object detection techniques that utilize monocular and stereo vision for reliable detection in urban settings. Based on the depth inference basis, learning schemes, and internal representation, this work presents a method taxonomy of three classes: model-based and geometrically constrained approaches, end-to-end learning methodologies, and hybrid methods. A dedicated segment highlights the current trend of multi-view detectors as end-to-end methods, owing to their improved robustness. Detectors from the last two classes were specifically selected to exploit the autonomous driving context in terms of geometry, scene content, and instance distribution. To assess the effectiveness of each method, 3D object detection datasets for autonomous vehicles are described along with their unique features (e.g., varying weather conditions, multi-modality, and multi-camera perspectives) and their respective metrics associated with different difficulty categories. In addition, we include multi-modal visual datasets, i.e., V2X datasets, which may tackle the problems of single-view occlusion. Finally, the current research trends in object detection are summarized, followed by a discussion of the possible scope for future research in this domain.

Keywords: 3D object detection; automated driving systems (ADS); autonomous navigation; robot perception; visual navigation; visual-aided decision.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**FIGURE 1**
A hybrid electric vehicle at the NODE lab equipped with multi-modal sensors and data fusion systems for perception, motion planning, autonomous navigation, and controls in perceptually-degraded conditions.

**FIGURE 2**
The structure of existing 3D object detection methodologies (all sharing the same input of monocular or stereo images and the same output of the 3D detection header): **(A)** Methods using geometrical constraints take ROI features from the backbone output, or combine them with 2D bounding boxes, to fit constraints on the loss function or the space projection. **(B)** End-to-end learning methods update all layer parameters using backpropagation. These methods are categorized depending on their utilization of an ROI or feature pyramid network regression with an optimal 2D detection. **(C)** Hybrid methods combine depth estimation from a standalone pretrained network with a change of representation to leverage detailed features for 3D detection. The 3D backbone can be taken from existing methods for LiDAR, BEV, or voxel points.

**FIGURE 3**
Taxonomy of monocular 3D object detection frameworks: i) Geometric methods consider spatial relationships between several objects and perspective consistency; ii) End-to-end learning frameworks are categorized based on their utilization of internal features; and iii) Hybrid methods are classified by 3D representation and its augmentation with other techniques such as segmentation or 2D detection.

**FIGURE 4**
Taxonomy of stereo 3D object detection approaches. Non-geometrical methods are widely utilized for stereo-vision-based 3D object detection, since previously trained depth estimators or end-to-end depth cost volumes achieve better results than geometric methods applied to stereo cameras. For the remaining categories, the inner classification remains the same as for monocular 3D object detection frameworks.
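
As an illustration of the "change of representation" mentioned for hybrid methods in Figure 2(C), the sketch below back-projects a per-pixel depth map into a 3D point cloud in the camera frame using the standard pinhole camera model. It is a minimal sketch and not the pipeline of any surveyed detector: the function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions introduced here, and the depth map is assumed to come from a separate estimator.

```python
import numpy as np


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) camera-frame point cloud.

    Hypothetical helper: fx, fy are focal lengths in pixels and cx, cy the
    principal point; pixels with non-positive depth are treated as invalid.
    """
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx  # pinhole model: u = fx * x / z + cx
    y = (v - cy) * z / fy  # pinhole model: v = fy * y / z + cy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # keep only pixels with valid depth


# Toy 2x2 depth map (metres) with one invalid pixel.
toy_depth = np.array([[2.0, 0.0],
                      [2.0, 2.0]])
toy_points = depth_to_points(toy_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

A point cloud of this kind could then be passed to a 3D backbone designed for LiDAR, BEV, or voxel inputs, which is the role the caption assigns to the 3D backbone in hybrid methods.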
