Case Reports
Genet Med. 2021 May;23(5):968-971.
doi: 10.1038/s41436-020-01039-z. Epub 2021 Jan 26.

Deep phenotyping unstructured data mining in an extensive pediatric database to unravel a common KCNA2 variant in neurodevelopmental syndromes

Marie Hully et al. Genet Med. 2021 May.

Abstract

Purpose: Electronic health records are increasingly used to identify patients with similar medical histories, diagnoses, and outcomes and to propose interdisciplinary treatments for them. These records are compiled by a mix of nonexperts and expert clinicians. Mining these unstructured data is a transposable and sustainable way to search for patients whose clinical features are highly similar.

Methods: Exome and targeted next-generation sequencing bioinformatics analyses were performed at the Imagine Institute. Similarity Index (SI), an algorithm based on a vector space model (VSM) that exploits concepts extracted from clinical narrative reports, was used to identify patients with highly similar clinical features.

Results: Here we describe a case of "automated diagnosis" indicated by Dr. Warehouse, a biomedical data warehouse oriented toward clinical narrative reports, developed at Necker Children's Hospital using around 500,000 patients' records. Using this warehouse, we were able to match and identify two patients who shared very specific neonatal and childhood clinical features and harbored the same de novo variant in KCNA2.

Conclusion: This innovative application of clustering clinical features in a database could advance the identification of patients with rare and common genetic conditions and detect with high accuracy the natural history of patients harboring similar pathogenic genetic variants.
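
The Methods describe the Similarity Index only as an algorithm based on a vector space model over concepts extracted from narrative reports. As a rough illustration of that general idea, the sketch below ranks patients by cosine similarity between TF-IDF-weighted concept vectors. The weighting scheme, the cosine measure, the function names, and the toy concept lists are all assumptions made for illustration; none of them is taken from the paper or from Dr. Warehouse.

```python
import math
from collections import Counter


def tfidf_vectors(patient_concepts):
    """Map each patient id to a sparse TF-IDF vector over its concepts.

    TF-IDF weighting is an assumption; the abstract does not state
    how the Similarity Index weights concepts.
    """
    n = len(patient_concepts)
    # Document frequency: number of patients whose records mention each concept.
    df = Counter()
    for concepts in patient_concepts.values():
        df.update(set(concepts))
    vectors = {}
    for pid, concepts in patient_concepts.items():
        tf = Counter(concepts)
        # Smoothed IDF, so a concept present in every record keeps a positive weight.
        vectors[pid] = {
            c: tf[c] * (math.log((1 + n) / (1 + df[c])) + 1.0) for c in tf
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(index_id, patient_concepts, top_k=5):
    """Return up to top_k (patient id, score) pairs most similar to index_id."""
    vectors = tfidf_vectors(patient_concepts)
    query = vectors[index_id]
    scores = [
        (pid, cosine(query, vec))
        for pid, vec in vectors.items()
        if pid != index_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical toy data: invented concept lists, not patient data from the study.
toy = {
    "index": ["seizure", "hypotonia", "reflux"],
    "cand_1": ["seizure", "hypotonia", "reflux", "fever"],
    "cand_2": ["fever", "cough"],
    "cand_3": ["seizure", "cough"],
}

ranked = rank_similar("index", toy)
for pid, score in ranked:
    print(pid, round(score, 3))
```

In this toy setting the candidate sharing the most concepts with the index patient ranks first, and a candidate with no shared concepts scores zero. A real pipeline would also need the concept extraction step (including negation and family-history handling), which is outside this sketch.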

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Display of the two patients (patient 1 from our institution and patient 2 from another institution in our reference center network) sharing the same phenotype and the same KCNA2 variant.
Similarity analysis with all data warehouse narrative reports was performed, yielding a high similarity index (SI) in five patients (patients A–E). Exome sequencing validated that patient A, who had the highest SI, harbored the same KCNA2 variant. NGS, next-generation sequencing.
Fig. 2. Clinical heat map describing the detailed characteristics of the patients in this study.
Heat map for patients 1 and 2, who carry the p.T374A KCNA2 variant, and for the five other patients (patients A–E) identified by the Dr. Warehouse database as having the highest Similarity Index (SI). EEG, electroencephalogram; MRI, magnetic resonance imaging; GOR, gastro-oesophageal reflux.
