Evaluation of the Azure Kinect and Its Comparison to Kinect V1 and Kinect V2

Michal Tölgyessy et al. Sensors (Basel). 2021 Jan 8;21(2):413. doi: 10.3390/s21020413.

Abstract

The Azure Kinect is the successor of the Kinect v1 and Kinect v2. In this paper, we perform a brief data analysis and comparison of all Kinect versions, with a focus on precision (repeatability) and various aspects of noise of these three sensors. We then thoroughly evaluate the new Azure Kinect: its warm-up time, precision (and the sources of its variability), accuracy (measured thoroughly, using a robotic arm), reflectivity (using 18 different materials), and the multipath and flying-pixel phenomena. Furthermore, we validate its performance in both indoor and outdoor environments, including direct- and indirect-sunlight conditions. We conclude with a discussion of its improvements in the context of the evolution of the Kinect sensor. We show that it is crucial to design experiments carefully when measuring accuracy, since the RGB and depth cameras are not aligned. Our measurements confirm the officially stated values, namely a standard deviation ≤17 mm and a distance error <11 mm at up to 3.5 m from the sensor in all four supported modes. The device, however, has to warm up for at least 40–50 min to give stable results. Due to its time-of-flight technology, the Azure Kinect cannot be reliably used in direct sunlight; it is therefore suited mostly to indoor applications.
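The two headline metrics in the abstract, precision (per-pixel repeatability over repeated frames of a static scene) and accuracy (deviation of the measured distance from a known ground truth), can be computed from a stack of depth frames. The sketch below is not the authors' pipeline; it is a minimal NumPy illustration run on synthetic frames standing in for SDK captures, with the frame shape, noise level, and ground-truth distance chosen arbitrarily.

```python
import numpy as np

def precision_map(frames):
    """Per-pixel standard deviation (precision/repeatability) over a stack
    of repeated depth frames of a static scene, in input units (here mm).
    Invalid pixels, which the depth sensor reports as 0, are excluded."""
    stack = np.array(frames, dtype=np.float64)  # copy so we can mask in place
    stack[stack == 0] = np.nan
    return np.nanstd(stack, axis=0)

def accuracy_error(frames, true_distance_mm):
    """Mean absolute distance error of the time-averaged depth map against a
    known ground-truth distance (e.g. a flat plate placed by a robotic arm)."""
    stack = np.array(frames, dtype=np.float64)
    stack[stack == 0] = np.nan
    mean_depth = np.nanmean(stack, axis=0)
    return float(np.nanmean(np.abs(mean_depth - true_distance_mm)))

# Synthetic stand-in: 100 frames of a flat wall at 2000 mm with 3 mm noise.
rng = np.random.default_rng(0)
frames = 2000 + rng.normal(0, 3, size=(100, 240, 320))
print(round(float(np.nanmean(precision_map(frames))), 1))  # close to the injected 3 mm
print(accuracy_error(frames, 2000.0))                      # small, since the mean is unbiased
```

Averaging many frames drives the accuracy error well below the single-frame noise, which is why warm-up drift (Figures 14 and 15) rather than shot noise dominates long measurements.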

Keywords: 3D scanning; Azure Kinect; HRI (human–robot interaction); Kinect; SLAM (simultaneous localization and mapping); depth imaging; gesture recognition; mapping; object recognition; robotics.


Conflict of interest statement

The authors declare there are no conflicts of interest.

Figures

Figure 1. From left to right: Kinect v1, Kinect v2, Azure Kinect.
Figure 2. Schematic of the Azure Kinect.
Figure 3. Sensor placement for testing purposes.
Figure 4. Typical data measurements acquired from Kinect v1 (a), Kinect v2 (b), Azure Kinect in narrow field-of-view (NFOV) binned mode (c), and Azure Kinect in wide field-of-view (WFOV) mode (d) (axes represent image pixel positions).
Figure 5. Typical depth noise of Kinect v1 in mm (values over 2 mm were limited to 2 mm for better visual clarity). Picture axes represent pixel positions.
Figure 6. Typical depth noise of Kinect v2 in mm (values over 2 mm were limited to 2 mm for better visual clarity). Picture axes represent pixel positions.
Figure 7. Typical depth noise of Azure Kinect in NFOV binned mode in mm (values over 2 mm were limited to 2 mm for better visual clarity). Picture axes represent pixel positions.
Figure 8. Typical depth noise of Azure Kinect in WFOV binned mode in mm (values over 2 mm were limited to 2 mm for better visual clarity). Picture axes represent pixel positions.
Figure 9. Plane vs. Euclidean distance of a 3D point from the sensor chip.
Figure 10. Test plate composed of plastic reflective material and cork.
Figure 11. Depth noise of Kinect v1 in the presence of an object with different reflectivity (values over 5 mm were limited to 5 mm for better visual clarity). Picture axes represent pixel positions.
Figure 12. Depth noise of Kinect v2 in the presence of an object with different reflectivity in the bottom area (values over 1.5 mm were limited to 1.5 mm for better visual clarity). Picture axes represent pixel positions.
Figure 13. Depth noise of Azure Kinect in the presence of an object with different reflectivity in the bottom area (values over 1.3 mm were limited to 1.3 mm for better visual clarity). Picture axes represent pixel positions.
Figure 14. Measured distance while warming up the Azure Kinect. Each point represents the average distance for that particular minute.
Figure 15. Measured standard deviation while warming up the Azure Kinect.
Figure 16. Scheme for accuracy measurements using a robotic manipulator.
Figure 17. Picture of the actual laboratory experiment.
Figure 18. Selected depth points for fine-tuning of the plate alignment. Picture axes represent pixel positions.
Figure 19. Accuracy of the Azure Kinect for all modes.
Figure 20. Precision of the Azure Kinect for all modes.
Figure 21. Layout of tested specimens: a: felt, b: office carpet (wave pattern), c: leatherette, d: bubble-relief styrofoam, e: cork, f: office carpet, g: polyurethane foam, h: carpet with short fibres, i: anti-slip mat, j: soft foam with wave pattern, k: felt with pattern, l: spruce wood, m: sandpaper, n: wallpaper, o: bubble foam, p: plush, q: fake grass, r: aluminum thermofoil.
Figure 22. Infrared image of tested specimens.
Figure 23. Standard deviation of tested specimens.
Figure 24. Average distance of tested specimens.
Figure 25. Correlation of noise and growing Euclidean distance from the sensor (blue curve: standard deviation of original data; orange curve: original data compensated with the real Euclidean distance from the lens).
Figure 26. Standard deviation of the noise of the Azure Kinect with respect to the distance from the object (a wall) and the relative angle between the object and the sensor.
Figure 27. Truncated standard deviation of the noise of the Azure Kinect with respect to the distance from the object and the relative angle between the object and the sensor.
Figure 28. Standard deviation of the noise of the Azure Kinect with respect to the relative angle between the object and the sensor, measured at different distances.
Figure 29. Standard deviation of the noise of the Azure Kinect with respect to the distance from the object, with variable relative angles between the object and the sensor.
Figure 30. Standard deviation at fixed distances with respect to the angle at which we measured.
Figure 31. RGB and IR images of the first experiment scenario.
Figure 32. RGB and IR images of the second experiment scenario.
Figure 33. Standard deviation of binned NFOV mode, limited to 200 mm (experiment 1).
Figure 34. Standard deviation of binned WFOV mode, limited to 200 mm (experiment 1).
Figure 35. Standard deviation of binned NFOV mode, limited to 200 mm (experiment 2).
Figure 36. Standard deviation of binned WFOV mode, limited to 200 mm (experiment 2).
Figure 37. Demonstration of the flying-pixel phenomenon: fluctuating depth data at the edge of the plate (front view, all values in mm).
Figure 38. Demonstration of the flying-pixel phenomenon: fluctuating depth data at the edge of the plate (side view, all values in mm).
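Figures 9 and 25 rest on the distinction between the plane ("z") depth a depth camera reports per pixel and the Euclidean ray length from the lens, which grows toward the image corners. Under a standard pinhole model the conversion is euclidean = z * sqrt(1 + ((u − cx)/fx)² + ((v − cy)/fy)²). The sketch below illustrates this with placeholder intrinsics (fx, fy, cx, cy are invented values, not the Azure Kinect's actual calibration).

```python
import numpy as np

def euclidean_distance(z, fx, fy, cx, cy):
    """Convert a plane-depth map z (H x W, mm) into per-pixel Euclidean
    distance from the optical centre, using pinhole intrinsics."""
    h, w = z.shape
    u = np.arange(w)[None, :]   # column index per pixel
    v = np.arange(h)[:, None]   # row index per pixel
    scale = np.sqrt(1.0 + ((u - cx) / fx) ** 2 + ((v - cy) / fy) ** 2)
    return z * scale

# Flat wall at a constant 2000 mm plane depth; intrinsics are placeholders.
z = np.full((480, 640), 2000.0)
d = euclidean_distance(z, fx=500.0, fy=500.0, cx=319.5, cy=239.5)
print(float(d[240, 320]))       # centre pixel: essentially the plane depth
print(float(d[0, 0]))           # corner pixel: a noticeably longer ray
```

This is why a flat wall appears to "bow away" in raw depth toward the corners, and why Figure 25 compensates the noise curve with the real Euclidean distance before comparing it across the field of view.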
