2024 Dec 23:8:e63866.
doi: 10.2196/63866.

Building a Human Digital Twin (HDTwin) Using Large Language Models for Cognitive Diagnosis: Algorithm Development and Validation

Gina Sprint et al. JMIR Form Res.

Abstract

Background: Human digital twins have the potential to change the practice of personalizing cognitive health diagnosis because these systems can integrate multiple sources of health information and influence into a unified model. Cognitive health is multifaceted, yet researchers and clinical professionals struggle to align diverse sources of information into a single model.

Objective: This study aims to introduce HDTwin, a method for unifying heterogeneous data using large language models. HDTwin is designed to predict cognitive diagnoses and offer explanations for its inferences.

Methods: HDTwin integrates cognitive health data from multiple sources, including demographic, behavioral, ecological momentary assessment, n-back test, speech, and baseline experimenter testing session markers. Data are converted into text prompts for a large language model. The system then combines these inputs with relevant external knowledge from scientific literature to construct a predictive model. The model's performance is validated using data from 3 studies involving 124 participants, and its diagnostic accuracy is compared with that of baseline machine learning classifiers.
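The conversion of markers into a text prompt can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `markers_to_prompt`, the grouping of markers by source, the prompt wording, and every marker name and value below are hypothetical and chosen only for illustration.

```python
from typing import Mapping


def markers_to_prompt(participant_id: str,
                      markers: Mapping[str, Mapping[str, object]]) -> str:
    """Render markers, grouped by data source, as plain text for an LLM prompt.

    Hypothetical helper: the abstract states only that data are converted
    into text prompts, not how the prompts are laid out.
    """
    lines = [f"Participant {participant_id} markers:"]
    for source, values in markers.items():
        lines.append(f"{source}:")
        for name, value in values.items():
            lines.append(f"- {name}: {value}")
    lines.append(
        "Question: Based on these markers, is the participant more likely "
        "healthy or MCI? Explain your reasoning."
    )
    return "\n".join(lines)


# Made-up example values; not taken from the study data.
example_markers = {
    "demographic": {"age": 72, "years_of_education": 16},
    "n-back": {"mean_score": 0.81},
}
prompt = markers_to_prompt("P001", example_markers)
print(prompt)
```

Keeping each source under its own heading makes it straightforward to build prompts from a single source or from several fused sources, which is the comparison the study reports.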

Results: HDTwin achieves a peak accuracy of 0.81 based on the automated selection of markers, significantly outperforming baseline classifiers. On average, HDTwin yielded accuracy=0.77, precision=0.88, recall=0.63, and Matthews correlation coefficient=0.57. In comparison, the baseline classifiers yielded average accuracy=0.65, precision=0.86, recall=0.35, and Matthews correlation coefficient=0.36. The experiments also reveal that HDTwin yields higher predictive accuracy when information sources are fused than when a single source is used. HDTwin's chatbot interface provides interactive dialogues, aiding in diagnosis interpretation and allowing further exploration of patient data.
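For readers less familiar with the reported metrics, the following sketch shows how accuracy, precision, recall, and the Matthews correlation coefficient are computed from binary confusion-matrix counts using their standard definitions. The function name `binary_metrics` and the example counts are assumptions for illustration; the counts are not the study's confusion matrix.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts.

    Ratios with a zero denominator are reported as 0.0 by convention here.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mcc": mcc}


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=5, fp=1, tn=8, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because the Matthews correlation coefficient uses all four cells of the confusion matrix, it penalizes a classifier that keeps precision high by rarely predicting the positive class. That matches the pattern in the reported baselines, which have precision close to HDTwin's but much lower recall and a lower coefficient.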

Conclusions: HDTwin integrates diverse cognitive health data, enhancing the accuracy and explainability of cognitive diagnoses. This approach outperforms traditional models and provides an interface for navigating patient information. The approach shows promise for improving early detection and intervention strategies in cognitive health.

Keywords: artificial intelligence; chatbot; cognitive diagnosis; cognitive health; digital behavior marker; digital twin; health information; human digital twin; interview marker; large language models; machine learning; smartwatch.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. HDTwin information processing pipeline. A user interacts with the LLM interface to request summary information about a person or a suggested diagnosis. Based on the query, HDTwin retrieves personalized markers together with paper abstracts and data from a knowledge base that informs a response. The query response is presented to the user, supporting an ongoing conversation about the person or explanation of the query response. LLM: large language model.
Figure 2. In addition to collecting sensor data, the smartwatch app queries the user for their current state, includes an n-back shape test, and collects daily audio data.
Figure 3. Distribution of healthy participants and those with MCI based on HDTwin markers that include (from upper left): demographics, behavior, EMA response, and n-back scores. The bottom graph shows a t-SNE plot of all quantifiable features. Text input from journals and testing sessions is not included in the plots. EMA: ecological momentary assessment; MCI: mild cognitive impairment.
Figure 4. The HDTwin chatbot interface with an example prompt and response for a query regarding one of a person’s n-back score statistics. Users can see the agent’s message memory using the “Chat History” dropdown and the agent’s planning and execution steps using the “See Intermediate Steps” dropdown. A video demonstration of the chatbot is available on the web [28].
