Sensors (Basel). 2020 Jul 13;20(14):3884. doi: 10.3390/s20143884.

Towards Breathing as a Sensing Modality in Depth-Based Activity Recognition


Jochen Kempfle et al.

Abstract

Depth imaging has, through recent technological advances, become ubiquitous as products become smaller, more affordable, and more precise. Depth cameras have also emerged as a promising modality for activity recognition, as they allow detection of users' body joints and postures. Increased resolutions have now enabled a novel use of depth cameras that facilitates more fine-grained activity descriptors: the remote detection of a person's breathing, by picking up the small distance changes from the user's chest over time. We propose in this work a novel method to model chest elevation to robustly monitor a user's respiration whenever users are sitting or standing and facing the camera. The method is robust to users occasionally blocking their torso region and is able to provide meaningful breathing features for classification in activity recognition tasks. We illustrate that with this method, for specific activities such as paced-breathing meditating, performing breathing exercises, or post-exercise recovery, our model delivers a breathing accuracy that matches that of a commercial respiration chest monitor belt. Results show that the breathing rate can be detected with our method at an accuracy of 92 to 97% from a distance of two metres, outperforming state-of-the-art depth imaging methods especially for non-sedentary persons, and allowing separation of activities in a respiration-derived feature space.

Keywords: activity recognition features; depth imaging; non-contact respiration estimation.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Left: Our approach estimates the breathing signal and rate of a user, using body posture data and the raw depth signals from a depth camera in the environment. The user is assumed to be indoors and facing the camera, for instance while performing activities guided by a display, but can be anywhere in the frame, upright or sitting, and may perform activities that self-occlude the torso. Right: An example application in which the user performs a relaxation exercise with guided respiration while her breathing is tracked and evaluated over time.
Figure 2
The core steps of our respiration monitoring method, from left to right: The process starts with the camera’s depth input frame and the estimated torso position of the user. Since joint position estimates contain jitter, multiple torso window candidates that frame the torso are selected. Combined with the torso prediction from the previous frame, each candidate is then assigned an occlusion mask. The candidate that best matches the torso prediction is selected as the torso window and forwarded to the occlusion recovery stage, which uses the occlusion mask as well as the torso prediction from the previous frame to yield the current torso model. The torso model then delivers the prediction for the next frame, where it will be used for occlusion recovery. The torso model is transformed to a single respiration state value; the history of these values yields a respiration signal that, after Fourier analysis and extraction of the dominant frequency, gives an estimate of the respiratory rate.
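The last step of this pipeline lends itself to a short illustration. The Python sketch below estimates the respiratory rate as the dominant FFT frequency of a respiration signal; the function name, sampling-rate argument, and the 6 to 60 breaths-per-minute search band are our illustrative assumptions (the paper's evaluations use, e.g., a 48 s analysis window).

```python
import numpy as np

def respiratory_rate_fft(respiration_signal, fs, min_bpm=6.0, max_bpm=60.0):
    """Estimate the respiratory rate (breaths per minute) as the dominant
    frequency of the respiration signal within a plausible breathing band."""
    x = np.asarray(respiration_signal, dtype=float)
    x = x - x.mean()                                   # remove the DC offset
    spectrum = np.abs(np.fft.rfft(x * np.hanning(x.size)))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)        # frequency axis in Hz
    band = (freqs >= min_bpm / 60.0) & (freqs <= max_bpm / 60.0)
    dominant_hz = freqs[band][np.argmax(spectrum[band])]
    return dominant_hz * 60.0                          # Hz -> breaths per minute
```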
Figure 3
Detailed example of our occlusion recovery process: The depth input frame, occlusion mask, and previous model state are used. The occluded area is removed from the depth frame and the model state, while the model patch from the previous model state is kept. The non-zero pixels from the cut depth input frame feed into the model update to yield a partly updated model state. The holes of both model states are in-painted, and the difference between the fitting model patch and the in-painted model cut is added to the in-painted area of the partly updated model state, yielding a new model state. First- and second-order derivatives are updated in the same way.
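To make the in-painting and patch-correction step concrete, here is a minimal Python sketch. The nearest-neighbour in-painting, the function names, and the exact arithmetic are our assumptions; the caption does not prescribe a particular in-painting algorithm or blending rule.

```python
import numpy as np
from scipy import ndimage

def inpaint_nearest(img, holes):
    """Fill hole pixels with the value of the nearest valid pixel (a simple
    stand-in; the caption does not name a specific in-painting scheme)."""
    idx = ndimage.distance_transform_edt(holes, return_distances=False,
                                         return_indices=True)
    return img[tuple(idx)]

def recover_occlusion(occluded, prev_state, partly_updated):
    """Add the difference between the kept model patch and the in-painted
    model cut back into the in-painted area of the partly updated state."""
    model_patch = np.where(occluded, prev_state, 0.0)      # patch that is kept
    inpainted_cut = inpaint_nearest(prev_state, occluded)  # model cut, filled
    inpainted_new = inpaint_nearest(partly_updated, occluded)
    correction = model_patch - np.where(occluded, inpainted_cut, 0.0)
    return np.where(occluded, inpainted_new + correction, partly_updated)
```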
Figure 4
Our approach builds an adaptive model of the user’s torso appearance. From the depth input frames, torso windows are selected over time that, after occlusion recovery, update the model, which consists of a torso state estimate, its first-order time derivative, and its second-order time derivative. From these, the model can build a torso prediction for the next input frame.
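A minimal sketch of such an adaptive model, assuming a simple per-pixel constant-acceleration update (the class name, blending weight, and update rule are illustrative; the paper's exact weights and filtering are not given in the caption):

```python
import numpy as np

class TorsoModel:
    """Per-pixel torso depth model with first- and second-order time
    derivatives; a constant-acceleration sketch, not the paper's exact rule."""

    def __init__(self, shape, alpha=0.5):
        self.state = np.zeros(shape)  # torso state estimate
        self.d1 = np.zeros(shape)     # first-order time derivative
        self.d2 = np.zeros(shape)     # second-order time derivative
        self.alpha = alpha            # blend weight for new observations

    def predict(self):
        # extrapolate one frame ahead from the state and its derivatives
        return self.state + self.d1 + 0.5 * self.d2

    def update(self, observed):
        # blend the prediction with the recovered torso window, then
        # refresh both derivatives from the resulting state change
        new_state = (1.0 - self.alpha) * self.predict() + self.alpha * observed
        new_d1 = new_state - self.state
        self.d2 = new_d1 - self.d1
        self.d1 = new_d1
        self.state = new_state
        return self.state
```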
Figure 5
Left: Torso surface (left images) and variance over a 12 s time window (right images) of two persons. Red: low variance; green: high variance. The throat area shows low variance, while the remaining area is highly influenced by clothing. Right: The estimation of the respiration signal uses defined areas in the torso window. For the chest region, taking the difference between the maximum depth measure of a small region around the throat and the mean depth measurement over the chest region yields a clean respiration signal (top right). In contrast, the mean depth measurements over the chest alone are highly influenced by motion artifacts (bottom right).
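The resulting per-frame respiration value can be written compactly; the slice arguments and the sign convention below are illustrative assumptions, not the paper's exact region definitions.

```python
import numpy as np

def respiration_state(torso_window, throat_slice, chest_slice):
    """One respiration value per frame: maximum depth of a small throat
    region minus mean depth over the chest region."""
    throat_max = np.max(torso_window[throat_slice])
    chest_mean = np.mean(torso_window[chest_slice])
    return throat_max - chest_mean

# e.g. value = respiration_state(win, np.s_[0:8, 20:28], np.s_[12:48, 4:44])
```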
Figure 6
In our evaluations, 24 study participants (7 of them female, aged between 22 and 57 years) were asked to sit or stand in front of a depth sensor and were recorded under different conditions. Top row: Exemplary depth data from some participants while sitting. Bottom row: Some examples of participants holding a cup (leading to regular occlusions of the torso). All of the above examples are taken from sessions at 2 metres distance.
Figure 7
Comparison plots between the output of the chest-worn respiration belt (in orange) and the output of our proposed model (in blue), for the post-exercise condition (going from a fast respiration rate in the beginning to a slower one over time). The top segment has a PCC of 0.95 and an accuracy of 100%; both signals match well in terms of frequency. The bottom segment has a PCC of 0.26 and an accuracy of 22%, with the larger peaks on the left of our output due to the user occasionally tilting the torso forward and occluding the throat region with the head.
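For reference, the per-segment PCC can be computed with SciPy; the function name and the assumption that both signals are resampled to equal length at a common rate are ours.

```python
import numpy as np
from scipy.stats import pearsonr

def segment_pcc(belt_signal, estimated_signal):
    """Pearson Correlation Coefficient between a segment of the belt
    reference and the model output (equal length, common sampling rate)."""
    r, _ = pearsonr(np.asarray(belt_signal), np.asarray(estimated_signal))
    return r
```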
Figure 8
Left: The Pearson Correlation Coefficient (PCC) between the data from the Vernier respiration belt and our proposed system, for all users individually. The bars depict the means, while the black bars indicate the 99% confidence intervals. Data from paced breathing (15 breaths per minute) tends to result in high correlation between our system’s prediction and the belt’s output; relaxed and post-exercise breathing tend to perform slightly worse. Middle: The accuracy of our proposed system compared to the respiration belt. Right: The error of the estimated respiratory rate compared to the respiratory rate from the respiration belt. The poor performance for users 9 and 22 stems mostly from larger movements during the recording.
Figure 9
Comparison of two depth-based respiration estimation methods to the respiration belt. Left: The mean accuracy and standard deviation of the Mean Raw method. Right: The mean accuracy and standard deviation of our proposed system.
Figure 10
Left: The respiration rate accuracies at the chest, using a Fast Fourier Transform window with a length of 48 s, for all participants performing three activities (sitting, standing upright, and drinking), for the following methods: Mean Raw, Median Raw, Mean Model, Median Model, and our proposed method. The colored bars show the averages, while overlaid box plots show the median (middle parts) and whiskers marking data within 1.5 IQR. Middle: The error in breaths per minute of the different methods, again using bars and overlaid box plots. Right: The signal-to-noise ratio (SNR) of the chest area shows a similar picture to the accuracy measures: For all methods, sitting results in a much clearer signal than standing upright, with standing under occlusion (holding a cup, rightmost measures) performing slightly worse than just standing upright.
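The caption does not define the SNR formula; one common formulation, shown below purely as an assumption, compares the spectral power in a small band around the reference breathing frequency with the power in the rest of a 0.1 to 1.0 Hz breathing band.

```python
import numpy as np

def spectral_snr_db(signal, fs, rate_hz, half_bandwidth=0.05):
    """Power near the reference breathing frequency versus power in the
    rest of the breathing band, in dB (an illustrative SNR definition)."""
    x = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    in_band = np.abs(freqs - rate_hz) <= half_bandwidth
    out_band = (freqs >= 0.1) & (freqs <= 1.0) & ~in_band
    return 10.0 * np.log10(power[in_band].sum() / power[out_band].sum())
```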
Figure 11
The respiration rate accuracies (left), errors (middle), and SNR (right) at the chest, using a Fast Fourier Transform window with a length of 48 s, for the users performing three activities (sitting, standing upright, and drinking), for the following methods: Mean Raw, Median Raw, Mean Model, Median Model, and our proposed method. The colored bars show the averages, while overlaid box plots show the median (middle parts) and whiskers marking data within 1.5 IQR. Higher rates show slightly better performance across all methods, especially when non-sedentary.
Figure 12
The different activities: paced-breathing meditating, post-exercise recovering, relaxing, and speaking, transformed to feature space. Left: The features Med-PP, ESP, STD-Spec, and Skew-Spec reduced by one dimension by a principal component analysis. Right: The inverse of the feature Med-PP (to express it in Hz) plotted against Skew-Spec.
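The dimensionality reduction for the left plot can be reproduced in a few lines; the feature matrix below is random placeholder data standing in for the real per-window rows of Med-PP, ESP, STD-Spec, and Skew-Spec.

```python
import numpy as np
from sklearn.decomposition import PCA

# One row of [Med-PP, ESP, STD-Spec, Skew-Spec] per signal window;
# random placeholder data stands in for the real feature matrix.
features = np.random.default_rng(0).normal(size=(200, 4))
reduced = PCA(n_components=3).fit_transform(features)  # 4 -> 3 dimensions
```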
Figure 13
Left: Comparison of the effect of occlusion vs. no occlusion on the prediction; note the difference in the highlighted area. Right: Example of the adaptiveness of the model: An initial occlusion fades away as soon as the occluded area becomes visible.

