Towards Autonomous Drone Racing without GPU Using an OAK-D Smart Camera

Leticia Oyuki Rojas-Perez et al. Sensors (Basel). 2021 Nov 9;21(22):7436. doi: 10.3390/s21227436.
Abstract

Recent advances have shown for the first time that it is possible to beat a human with an autonomous drone in a drone race. However, this solution relies heavily on external sensors, specifically on the use of a motion capture system. Thus, a truly autonomous solution demands performing computationally intensive tasks such as gate detection, drone localisation, and state estimation on board. To this end, other solutions rely on specialised hardware such as graphics processing units (GPUs), whose onboard versions are not as powerful as those available for desktop and server computers. An alternative is to combine specialised hardware with smart sensors capable of processing specific tasks on-chip, alleviating the need for the onboard processor to perform these computations. Motivated by this, we present the initial results of adapting a novel smart camera, known as the OpenCV AI Kit or OAK-D, as part of a solution for autonomous drone racing (ADR) running entirely on board. This smart camera performs neural inference on-chip without using a GPU. It can also perform depth estimation with a stereo rig and run neural network models using images from a 4K colour camera as the input. Additionally, seeking to limit the payload to 200 g, we present a new 3D-printed design of the camera's back case, reducing the original weight by 40% and thus enabling the drone to carry it in tandem with a host onboard computer, the Intel Compute Stick, on which we run a controller based on gate detection. The latter is performed by a neural model running on the OAK-D at an operating frequency of 40 Hz, enabling the drone to fly at a speed of 2 m/s. We deem these initial results promising towards the development of a truly autonomous solution that will run intensive computational tasks fully on board.
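The paper does not include code, but the on-chip inference described above can be illustrated with the DepthAI Python API that drives the OAK-D. The following is a minimal sketch, not the authors' implementation; the blob file name gate_detector.blob and the 300x300 detector input are assumptions.

    import depthai as dai

    # Build a DepthAI pipeline: colour camera -> on-chip detection network -> host output.
    pipeline = dai.Pipeline()

    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(300, 300)          # assumed detector input resolution
    cam.setInterleaved(False)

    nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
    nn.setBlobPath("gate_detector.blob")  # hypothetical compiled gate-detection model
    nn.setConfidenceThreshold(0.5)
    cam.preview.link(nn.input)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("detections")
    nn.out.link(xout.input)

    # Inference runs on the camera's Myriad X VPU; the host only dequeues results.
    with dai.Device(pipeline) as device:
        queue = device.getOutputQueue("detections", maxSize=4, blocking=False)
        while True:
            for det in queue.get().detections:
                print(det.label, det.confidence, det.xmin, det.ymin, det.xmax, det.ymax)

Because the detection network node executes entirely on the camera, the host computer (here, the Intel Compute Stick) remains free to run the flight controller.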

Keywords: Autonomous Drone Racing; CNN; OAK-D; deep learning; smart camera.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
OAK-D is a camera capable of running neural networks while simultaneously providing depth from two stereo cameras and colour information from a single 4K camera in the centre. The camera weighs 117 g, and its dimensions are 110 mm in width, 54.5 mm in height, and 33 mm in length.
Figure 2
General diagram of the internal configuration of the OAK-D camera to obtain colour camera images, an object of interest, and their spatial position with respect to the camera.
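The configuration described in Figure 2 corresponds roughly to DepthAI's spatial detection node, which fuses the on-chip stereo depth map with the colour-camera detections so that the camera itself reports the object's 3D position. Below is a minimal sketch extending the previous one, under the same assumptions (hypothetical gate_detector.blob).

    import depthai as dai

    pipeline = dai.Pipeline()

    # Colour camera feeding the detector.
    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(300, 300)
    cam.setInterleaved(False)

    # Stereo pair providing the depth map used for spatial localisation.
    left = pipeline.create(dai.node.MonoCamera)
    right = pipeline.create(dai.node.MonoCamera)
    left.setBoardSocket(dai.CameraBoardSocket.LEFT)
    right.setBoardSocket(dai.CameraBoardSocket.RIGHT)
    stereo = pipeline.create(dai.node.StereoDepth)
    left.out.link(stereo.left)
    right.out.link(stereo.right)

    # Spatial detection network: combines detections with depth on the camera itself.
    nn = pipeline.create(dai.node.MobileNetSpatialDetectionNetwork)
    nn.setBlobPath("gate_detector.blob")   # hypothetical model blob
    nn.setConfidenceThreshold(0.5)
    cam.preview.link(nn.input)
    stereo.depth.link(nn.inputDepth)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("spatial")
    nn.out.link(xout.input)

    with dai.Device(pipeline) as device:
        queue = device.getOutputQueue("spatial", maxSize=4, blocking=False)
        while True:
            for det in queue.get().detections:
                # 3D position of the detected object, in millimetres, relative to the camera.
                print(det.spatialCoordinates.x, det.spatialCoordinates.y, det.spatialCoordinates.z)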
Figure 3
First mount design enabling the Bebop 2.0 Power Edition to carry the Intel Compute Stick and the OAK-D smart camera with its original case. In this design, the total weight was 283 g, 117 g for the OAK-D sensor, plus 166 g for the onboard computer and the mount. This exceeds the maximum payload set by us (200 g) by 83 g, which resulted in unstable flights.
Figure 4
New design of the back case for the OAK-D smart camera. This new back case was 3D printed, which reduced the weight by almost 40%.
Figure 5
The 3D-printed mount to carry the OAK-D smart camera (with the new back case) and the Intel Compute Stick onboard the drone.
Figure 6
Front and side views of the Bebop 2.0 Power Edition carrying the OAK-D and the Intel Compute Stick.
Figure 7
Example images from the database, which was collected from both real and simulated scenarios. These include background images and images with gates in both simulated and real scenes.
Figure 8
Distribution of labels in 9106 images: background, 2534 labels; gates, 6572 labels.
Figure 9
Communication system based on the Robot Operating System (ROS). Nodes running on the Intel Compute Stick (full onboard processing) are marked in green. Nodes running on the ground control station (GCS) are marked in orange. The GCS is only used for visualisation of imagery and telemetry transmitted by the system running on board the drone.
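As a rough illustration of the onboard side of this ROS graph, the sketch below shows a hypothetical controller node that subscribes to a gate-centring error in pixels and publishes velocity commands to the Bebop driver; it is not the authors' code, and the topic names and gains are assumptions.

    #!/usr/bin/env python
    import rospy
    from geometry_msgs.msg import Point, Twist

    K_YAW = 0.002    # proportional gain on horizontal pixel error (assumed)
    K_ALT = 0.002    # proportional gain on vertical pixel error (assumed)
    FWD_SPEED = 0.5  # normalised forward speed command (assumed)

    cmd_pub = None

    def on_gate_error(err):
        # err.x, err.y: gate-centre offset in pixels from the image centre.
        cmd = Twist()
        cmd.linear.x = FWD_SPEED         # keep moving towards the gate
        cmd.angular.z = -K_YAW * err.x   # yaw to reduce horizontal error
        cmd.linear.z = -K_ALT * err.y    # climb/descend to reduce vertical error
        cmd_pub.publish(cmd)

    if __name__ == "__main__":
        rospy.init_node("gate_controller")
        cmd_pub = rospy.Publisher("bebop/cmd_vel", Twist, queue_size=1)
        rospy.Subscriber("gate_detector/error", Point, on_gate_error)
        rospy.spin()

In the architecture of Figure 9, such a node would run on the Intel Compute Stick, while the GCS only subscribes to the image and telemetry topics for visualisation.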
Figure 10
The GCS and the Bebop 2.0 Power Edition with the Intel Compute Stick and the OAK-D camera on board.
Figure 11
Indoor environment for our experiments, where the drone had to fly through gates. All processing was performed on board using an OAK-D smart camera and an Intel Compute Stick processor.
Figure 12
Schematic view illustrating different combinations of gate positions for our experiments. In total, we performed 10 runs in this indoor environment with the drone successfully crossing the gates, with the gate detector and controller running at 40 Hz with a flight speed of 2.0 m/s.
Figure 13
Examples of the GCS while the drone flew autonomously in an indoor environment. Note that the square at the centre of the image turned blue when the drone identified that it was centred with respect to the gate and it was then ready to cross the gate. A video illustrating these experiments can be found at https://youtu.be/P1187astpe0 (accessed on 1 November 2021).
Figure 14
Exterior views of the drone flying autonomously in an indoor environment. A video illustrating these experiments can be found at https://youtu.be/P1187astpe0 (accessed on 1 November 2021).
Figure 15
Gate detection error in the X and Y image axes for one example run. The error approaches zero as the controller commands the drone to centre itself with respect to the gate. The plot on the left shows the error in the X axis decreasing twice, corresponding to the moments when the drone crosses the two gates. On the right, note that the drone exhibits some oscillation in pitch, reflected in the Y-axis error; this is caused by braking when the drone centres on the gate and decides to cross it.
