A high throughput semantic concept frequency based approach for patient identification: a case study using type 2 diabetes mellitus clinical notes
- PMID: 21347100
- PMCID: PMC3041302
Abstract
Current research on high throughput identification of patients with a specific phenotype is in its infancy. There is an urgent need to develop a general automatic approach for patient identification.
Objective: We took advantage of Mayo Clinic electronic clinical notes and proposed a novel method of combining NLP, machine learning, and ontology for automatic patient identification. We also investigated the benefits of involving existing SNOMED semantic knowledge in a patient identification task.
Methods: The support vector machine (SVM) algorithm was applied to SNOMED concept units extracted from the clinical notes of type 2 diabetes mellitus (T2DM) cases and controls. Precision, recall, and F-score were calculated to evaluate performance.
Results: This approach achieved an F-score above 0.950 for both groups when all identified concept units were used as features. Concept units of the semantic type Disease or Syndrome contain the most important information for patient identification. Our results also imply that coarse-level concepts contain enough information to classify T2DM cases and controls.
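The evaluation measures named in the Methods can be illustrated with a minimal sketch. It assumes the standard definitions of precision, recall, and F-score (harmonic mean of the two) computed separately for each group; the function name `precision_recall_f1` and the toy labels are hypothetical and are not taken from the study, and the SVM classification step itself is not shown.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F-score) treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical gold labels and classifier outputs (illustrative only).
y_true = ["case", "case", "case", "case", "control", "control"]
y_pred = ["case", "case", "case", "control", "control", "case"]

# Score each group separately, since the abstract reports an F-score per group.
case_scores = precision_recall_f1(y_true, y_pred, positive="case")
control_scores = precision_recall_f1(y_true, y_pred, positive="control")
print("case:", case_scores)
print("control:", control_scores)
```

Scoring cases and controls separately matters because a classifier can do well on the larger group while performing poorly on the smaller one.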