Sci Data. 2026 Jan 28;13(1):145.
doi: 10.1038/s41597-025-06457-9.

Skin tone and clinical dataset from a prospective trial on acute care patients

Sicheng Hao et al. Sci Data.

Abstract

Although skin tone is hypothesized to be the root cause of pulse oximetry disparities, skin tone and its use for improving medical therapies have yet to be extensively studied. Previous studies that used self-reported race as a proxy for skin tone cannot account for skin tone variability within race groups. This study aimed to create a unique baseline dataset that includes skin tone and electronic health record (EHR) data to better evaluate health disparities associated with pulse oximetry. We collected skin tone data at 16 different body locations using multiple methods, including administered visual scales, colorimetry, spectrophotometry, and photography with mobile phone cameras. All patients' data were converted into a common data model and de-identified before publication in PhysioNet. We assessed 167 features per skin location on 128 patients, linked with their EHR data such as laboratory data, vital sign recordings, and demographic information. We also include 2,438 images from mobile phones to assist in developing artificial intelligence tools to combat health disparities.

Conflict of interest statement

Competing interests: AIW holds equity and management roles in Ataia Medical. AIW is supported by the National Heart, Lung, and Blood Institute under R01HL177003 and by REACH Equity under the National Institute on Minority Health and Health Disparities (NIMHD) of the National Institutes of Health under U54MD012530. Dr. Gichoya is a 2022 Robert Wood Johnson Foundation Harold Amos Medical Faculty Development Program scholar and declares support from the RSNA Health Disparities grant (#EIHD2204), Lacuna Fund (#67), the Gordon and Betty Moore Foundation, the NIH (NIBIB) MIDRC grant under contracts 75N92020C00008 and 75N92020C00021, and NHLBI Award Number R01HL167811.

Figures

Fig. 1
Data flow diagram and content. The left side of the figure shows how data flow through the collection process. First, EHR data are pulled from EPIC databases into REDCap, and patient skin data are collected at the bedside and stored in REDCap. Then, the data are de-identified before leaving Duke’s compute enclave, PACE, via an honest broker request. Lastly, the data are transformed into OMOP format. The right side of the figure provides a high-level view of the data content. Source content contains patients’ EHR tables and data from five different types of devices or collection methods. Output content contains tables and images in OMOP format.
Fig. 2
Image processing. This figure shows how raw images were processed. (a) The raw image taken with a smartphone camera. (b) The circle, calculated based on brightness, that represents the center of the image. (c) The image output to the dataset; only the information inside the circle from (b) is kept. (d–f) Representations of how image features such as average red, green, and blue are derived from the output image (c).
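The processing described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the caption does not say how the circle is computed, so the brightness-weighted centroid, the caller-supplied radius, and the function names `brightness_center` and `mean_rgb_in_circle` are all assumptions made for illustration. The image is represented as a plain nested list of (R, G, B) tuples so that the example needs no third-party library.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels, indexed as image[y][x]


def brightness_center(image: Image) -> Tuple[float, float]:
    """Return the brightness-weighted centroid (x, y) of the image.

    Assumption: brightness is the plain mean of the R, G, and B channels.
    """
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            weight = (r + g + b) / 3.0
            total += weight
            sum_x += weight * x
            sum_y += weight * y
    if total == 0:
        # Entirely black image: fall back to the geometric center.
        return ((len(image[0]) - 1) / 2.0, (len(image) - 1) / 2.0)
    return (sum_x / total, sum_y / total)


def mean_rgb_in_circle(
    image: Image, center: Tuple[float, float], radius: float
) -> Tuple[float, float, float]:
    """Average red, green, and blue over pixels inside the circle."""
    cx, cy = center
    sums = [0, 0, 0]
    count = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                for channel in range(3):
                    sums[channel] += pixel[channel]
                count += 1
    if count == 0:
        raise ValueError("circle contains no pixels")
    return (sums[0] / count, sums[1] / count, sums[2] / count)
```

Calling `brightness_center(image)` and passing the result to `mean_rgb_in_circle(image, center, radius)` gives one average per channel, analogous to the averages in panels (d–f). The radius is a free parameter in this sketch.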
Fig. 3
Sample data for a single patient. This is a timeline plot of a single patient’s data selected at random. The gold star represents the SaO2 - SpO2 pair we collected before skin data collection. The dashed black line represents the beginning of skin data collection. EHR data are available before and after skin collection.
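As a rough illustration of how such a timeline might be used, the sketch below picks the SaO2 - SpO2 pair recorded before the start of skin data collection and computes the difference between the two readings. The tuple layout, the function names, and the choice of the most recent earlier pair are assumptions for this example; they are not taken from the dataset's schema.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# Hypothetical record layout: (measurement time, SaO2 in %, SpO2 in %)
Pair = Tuple[datetime, float, float]


def latest_pair_before(pairs: List[Pair], skin_start: datetime) -> Optional[Pair]:
    """Return the most recent pair measured strictly before skin_start."""
    earlier = [p for p in pairs if p[0] < skin_start]
    return max(earlier, key=lambda p: p[0]) if earlier else None


def spo2_minus_sao2(pair: Pair) -> float:
    """Pulse oximeter reading minus arterial reading, in percentage points."""
    return pair[2] - pair[1]
```

A positive difference means the pulse oximeter read higher than the arterial blood gas measurement for that pair.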
Fig. 4
Samples of data explorations performed by ARES. In this figure, (a,b) are part of the output from explorations of the Observation table and Person table. (c) is a detailed view of a subset of all the quality checks mentioned in Table 3. (d) is a subset of all the items in the Measurement table ranked by lowest appearance rate among all patients. Detailed information on the whole dashboard can be found on the GitHub page: aiwonglab/ENCoDE_tutorial.

