Potential Obstacle Detection Using RGB to Depth Image Encoder-Decoder Network: Application to Unmanned Aerial Vehicles

Tomasz Hachaj. Sensors (Basel). 2022 Sep 5;22(17):6703. doi: 10.3390/s22176703.
Abstract

In this work, a new method is proposed that allows the use of a single RGB camera for the real-time detection of objects that could be potential collision sources for Unmanned Aerial Vehicles. For this purpose, a new network with an encoder-decoder architecture has been developed, which allows rapid distance estimation from a single image by performing RGB to depth mapping. Compared with other existing RGB to depth mapping methods, the proposed network achieves a satisfactory trade-off between complexity and accuracy. With only 6.3 million parameters, it achieves accuracy close to that of models with more than five times as many parameters, which allows the proposed network to operate in real time. A dedicated algorithm uses the distance predictions made by the network while compensating for measurement inaccuracies. The entire solution has been implemented and tested in practice in an indoor environment using a micro-drone equipped with a front-facing RGB camera. All data, source code, and pretrained network weights are available for download. Thus, the results can easily be reproduced, and the solution can be tested and quickly deployed in practice.
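The decision step described above — using per-pixel distance predictions while compensating for measurement inaccuracies — can be sketched as follows. This is a minimal illustrative sketch, not the paper's Algorithm 1: the function name, the median-filter noise compensation, and all thresholds (`safety_dist`, `corridor_frac`, `min_obstacle_frac`) are assumptions chosen for the example.

```python
import numpy as np

def check_flight_corridor(depth_map, safety_dist=1.0, corridor_frac=0.4,
                          min_obstacle_frac=0.05, kernel=3):
    """Flag a potential obstacle in the central flight corridor of a depth map.

    depth_map         -- 2-D array of predicted per-pixel distances in metres
    safety_dist       -- distances below this are treated as obstacles
    corridor_frac     -- fraction of image width/height forming the corridor
    min_obstacle_frac -- fraction of corridor pixels that must be "close"
    kernel            -- median-filter window size used to damp depth noise
    """
    d = np.asarray(depth_map, dtype=float)
    h, w = d.shape

    # Median filter to compensate for isolated depth-estimation errors.
    pad = kernel // 2
    padded = np.pad(d, pad, mode="edge")
    smoothed = np.empty_like(d)
    for i in range(h):
        for j in range(w):
            smoothed[i, j] = np.median(padded[i:i + kernel, j:j + kernel])

    # Central corridor assumed to lie along the drone's flight direction.
    ch, cw = int(h * corridor_frac), int(w * corridor_frac)
    top, left = (h - ch) // 2, (w - cw) // 2
    corridor = smoothed[top:top + ch, left:left + cw]

    # Collision is flagged only if enough corridor pixels are too close,
    # so a few noisy pixels do not trigger a false alarm.
    close_frac = np.mean(corridor < safety_dist)
    return bool(close_frac >= min_obstacle_frac)
```

Requiring a minimum fraction of close pixels, rather than reacting to any single close pixel, is one simple way to make the decision robust to the distance-estimation inaccuracies the abstract mentions.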

Keywords: RGB to depth mapping; Unmanned Aerial Vehicles; deep neural network; depth prediction; encoder–decoder network; obstacle detection.


Conflict of interest statement

The author declares no conflict of interest.

Figures

Figure 1
Deep encoder–decoder architecture of proposed network for RGB to depth image prediction.
Figure 2
Examples of distance estimation inaccuracies visualized with point clouds. Top row (a–c) contains RGB images; bottom row (d–f) contains depth estimations.
Figure 3
Diagram of the architecture of the system combining Algorithm 1 with UAV.
Figure 4
Loss curves for the network introduced in Section 2.1. Training was performed using the dataset from Section 2.4.
Figure 5
Different types of obstacles used during Algorithm 1 testing. In each sub-image, on the left is an RGB image, at the top right is a depth map estimated by the proposed encoder–decoder network, and at the bottom right are potential obstacles in the drone’s path as detected by Algorithm 1. If the rectangle is red, the algorithm predicts that the drone may collide with the obstacle; if the rectangle is green, the algorithm decides that there are no obstacles in the path. (a) Static obstacle 60 × 50 × 75 cm. (b) Static obstacle 60 × 40 × 80 cm. (c) Static obstacle with height 120 cm. (d) A second static obstacle with height 120 cm. (e) Static obstacle with a 45 cm deep “valley”. (f) Dynamic obstacle that is not on the drone’s flight trajectory. (g) Dynamic obstacle on the drone’s flight trajectory. (h) Dynamic obstacle that is not on the drone’s flight trajectory. (i) Dynamic obstacle on the drone’s flight trajectory.
Figure 6
Example errors of Algorithm 1. (a) Misestimation of obstacle height. (b) Potential collision with a window. (c) Potential collision with a wall. (d) Misjudging the size of a complex obstacle. (e) Misjudging the distance to a dynamic obstacle from a hovering drone. (f) Misjudging the distance to a dynamic obstacle from a moving drone.
