J Sleep Res. 2025 Oct;34(5):e70036.
doi: 10.1111/jsr.70036. Epub 2025 Apr 1.

Performance Evaluation of the Verily Numetric Watch Sleep Suite for Digital Sleep Assessment Against In-Lab Polysomnography

Benjamin W Nelson et al. J Sleep Res. 2025 Oct.

Abstract

The goal of this study was to evaluate the performance of a suite of 12 sleep measures generated by a multi-sensor wrist-worn wearable device, the Verily Numetric Watch (VNW), in a diverse cohort. We used polysomnography (PSG) as the reference during one-night simultaneous recording in a sample of N = 41 (18 male, age range: 18-78 years). We performed epoch-by-epoch comparisons for all measures. The key analyses were: core accuracy metrics for sleep versus wake classification; bias for continuous measures (Bland-Altman); unweighted Cohen's kappa and accuracy for sleep stage classifications; and mean count difference and linearly weighted Cohen's kappa for the count metric. In addition, we performed exploratory subgroup analyses by sex, age, skin tone, body mass index and arm hair density. Sensitivity and specificity (95% CI) of sleep versus wake classification were 0.97 (0.96, 0.98) and 0.66 (0.61, 0.71), respectively. Mean bias (95% CI) was 14.55 min (1.61, 27.16) for total sleep time; -11.77 min (-23.89, 1.09) for wake after sleep onset; 3.15% (0.68, 5.57) for sleep efficiency; -3.24 min (-9.38, 3.57) for sleep onset latency; 3.78 min (-7.04, 15.06) for light-sleep duration; 3.91 min (-4.59, 12.60) for deep-sleep duration; and 6.94 min (0.57, 13.04) for rapid eye movement-sleep duration. The mean difference in the number of awakenings was 0.17 (95% CI: -0.32, 0.71), and the overall accuracy of sleep stage classification was 0.78 (0.51, 0.88). Most measures showed statistically significant proportional biases and/or heteroscedasticity. Exploratory subgroup results appeared largely consistent with the overall group, although small samples precluded strong conclusions. These results support the use of VNWs in classifying sleep versus wake, sleep stages and overnight sleep measures.

Keywords: PSG; digital measures; mHealth; polysomnography; sleep; verily numetric watch; wearable technology.
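
The agreement statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. The function names, the label strings ("sleep", "wake"), the device-minus-reference sign convention, and the 1.96 SD limits of agreement are all assumptions made for illustration. The abstract does not state how its 95% confidence intervals were obtained, so none are computed here.

```python
from statistics import mean, stdev


def sensitivity_specificity(reference, device, positive="sleep"):
    """Epoch-by-epoch sensitivity and specificity, with `positive` as the target class.

    `reference` and `device` are equal-length sequences of per-epoch labels
    (hypothetical format: one label per scored epoch).
    """
    pairs = list(zip(reference, device))
    tp = sum(r == positive and d == positive for r, d in pairs)
    fn = sum(r == positive and d != positive for r, d in pairs)
    tn = sum(r != positive and d != positive for r, d in pairs)
    fp = sum(r != positive and d == positive for r, d in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def cohens_kappa(reference, device):
    """Unweighted Cohen's kappa between two label sequences."""
    n = len(reference)
    labels = set(reference) | set(device)
    # Observed agreement: fraction of epochs with identical labels.
    p_observed = sum(r == d for r, d in zip(reference, device)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    p_chance = sum(
        (list(reference).count(lab) / n) * (list(device).count(lab) / n)
        for lab in labels
    )
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


def bland_altman(device_values, reference_values):
    """Mean bias and 95% limits of agreement for a continuous measure.

    Differences are taken as device minus reference (an assumed convention),
    so a positive bias means the device overestimates.
    """
    diffs = [d - r for d, r in zip(device_values, reference_values)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the study.
    psg = ["wake", "sleep", "sleep", "sleep", "wake", "sleep", "wake", "wake"]
    watch = ["sleep", "sleep", "sleep", "sleep", "wake", "sleep", "wake", "sleep"]
    print(sensitivity_specificity(psg, watch))
    print(cohens_kappa(psg, watch))
    # Total sleep time in minutes for three hypothetical participants.
    print(bland_altman([420, 400, 390], [410, 380, 390]))
```

In the toy data the device labels every PSG sleep epoch as sleep but misses half of the wake epochs. This is the same direction of imbalance the abstract reports: high sensitivity (0.97) alongside lower specificity (0.66).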
