Comput Biol Med. 2021 Nov:138:104873.
doi: 10.1016/j.compbiomed.2021.104873. Epub 2021 Sep 20.

RGB-D scene analysis in the NICU

Free article

Yasmina Souley Dosso et al. Comput Biol Med. 2021 Nov.

Abstract

Continuity of care is achieved in the neonatal intensive care unit (NICU) through careful documentation of all events of clinical significance, including clinical interventions and routine care events (e.g., feeding, diaper change, weighing). As a step towards automating this documentation process, we propose a scene recognition algorithm that can automatically identify key features in a single image of the patient environment, paired with a rule-based sentence generator to caption the scene. Color and depth video were obtained from 29 newborn patients from the Children's Hospital of Eastern Ontario (CHEO) using an Intel RealSense SR300 RGB-D camera and manual bedside event annotation. Image processing techniques are implemented to classify two lighting conditions: brightness level and phototherapy. A deep neural network is developed for three image classification tasks: ongoing intervention, bed occupancy, and patient coverage. Transfer learning is leveraged in the feature extraction layers, such that weights learned from a generic data-rich task are applied to the clinical domain where data collection is complex and costly. Different depth fusion techniques are implemented and compared among classification tasks, where the depth and color data are fused as an RGB-D image (image fusion) or separately at various layers in the network (network fusion). Promising results were obtained with >84% sensitivity and >73% F1 measure across all context variables despite the large class imbalance. RGB-D-based models are shown to outperform RGB models on most tasks. In general, a 4-channel image fusion and network fusion at the 11th layer of the VGG-16 architecture were preferred. Ultimately, achieving complete scene understanding through multimodal computer vision could form the basis for a semi-automated charting system to assist clinical staff.
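
The abstract reports sensitivity and F1 measure rather than accuracy, which is the usual choice when classes are heavily imbalanced. The following is a minimal sketch of how those two metrics can be computed from binary labels. The function names and the 100-frame example are hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive and pred != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Fraction of actual positives that were detected (recall)."""
    total = tp + fn
    return tp / total if total else 0.0


def f1_measure(tp, fp, fn):
    """Harmonic mean of precision and sensitivity: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical imbalanced example: 10 positive frames out of 100.
# The classifier finds 9 of the 10 positives and raises 6 false alarms.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 9 + [0] * 1 + [1] * 6 + [0] * 84

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"F1 measure  = {f1_measure(tp, fp, fn):.2f}")
print(f"accuracy    = {accuracy:.2f}")
```

In this made-up case accuracy is dominated by the 90 negative frames, while the F1 measure is pulled down by the false alarms, which is why the two numbers can differ substantially on imbalanced data.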

Keywords: Computer vision; Documentation; Image classification; Image processing; Knowledge transfer; Multimodal sensors; Neural networks; Patient monitoring; Scene analysis; Sensor fusion.
