LATTE: Label-efficient incident phenotyping from longitudinal electronic health records
- PMID: 38264714
- PMCID: PMC10801250
- DOI: 10.1016/j.patter.2023.100906
Abstract
Electronic health record (EHR) data are increasingly used to support real-world evidence studies but are limited by the lack of precise timings of clinical events. Here, we propose a label-efficient incident phenotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. Leveraging pre-trained semantic embeddings, LATTE selects predictive features and compresses their information into longitudinal visit embeddings through visit attention learning. LATTE then models the sequential dependency between the target event and the visit embeddings to derive event timings. To improve label efficiency, LATTE constructs longitudinal silver-standard labels from unlabeled patients for semi-supervised training. LATTE is evaluated on the onset of type 2 diabetes, the onset of heart failure, and relapses of multiple sclerosis. LATTE consistently achieves substantial improvements over benchmark methods while providing high prediction interpretability. The event timings are shown to help discover risk factors of heart failure among patients with rheumatoid arthritis.
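As a rough illustration of the idea of compressing a visit's features into a single embedding through attention, the sketch below pools concept embeddings with softmax weights based on their similarity to a target-event embedding. This is not the authors' implementation: the function names, the dot-product scoring rule, and the toy two-dimensional embeddings are all assumptions made for illustration, and the abstract does not specify how LATTE computes its attention weights.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def visit_embedding(visit_codes, concept_emb, target_emb):
    """Attention-weighted pooling of concept embeddings for one visit.

    Hypothetical scoring rule (an assumption, not from the paper):
    each code's attention score is the dot product between its
    embedding and the target-event embedding.
    Returns (pooled embedding, {code: attention weight}).
    """
    # Keep each known code once, in a deterministic order.
    codes = sorted(c for c in set(visit_codes) if c in concept_emb)
    dim = len(target_emb)
    if not codes:
        # An empty visit contributes a zero embedding.
        return [0.0] * dim, {}
    weights = softmax([dot(concept_emb[c], target_emb) for c in codes])
    pooled = [
        sum(w * concept_emb[c][k] for w, c in zip(weights, codes))
        for k in range(dim)
    ]
    return pooled, dict(zip(codes, weights))

# Toy embeddings, invented for illustration only.
concept_emb = {
    "T2D_dx": [1.0, 0.0],
    "metformin": [0.8, 0.2],
    "flu_shot": [0.0, 1.0],
}
target_emb = concept_emb["T2D_dx"]

pooled, attn = visit_embedding(["metformin", "flu_shot"], concept_emb, target_emb)
```

In this toy visit, the code more similar to the target receives the larger weight, and the per-code weights can be inspected directly, which hints at why attention-based pooling lends itself to interpretable predictions. A sequence model over such visit embeddings would be a separate step and is not shown here.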
© 2023 The Authors.
Conflict of interest statement
The authors declare no competing interests.