Genet Med. 2021 Oct;23(10):1993-1997.
doi: 10.1038/s41436-021-01213-x. Epub 2021 Jun 10.

A framework for automated gene selection in genomic applications

L Lazo de la Vega et al. Genet Med. 2021 Oct.

Abstract

Purpose: An efficient framework to identify disease-associated genes is needed to evaluate genomic data for both individuals with an unknown disease etiology and those undergoing genomic screening. Here, we propose a framework for gene selection used in genomic analyses, including applications limited to genes with strong or established evidence levels and applications including genes with less or emerging evidence of disease association.

Methods: We extracted genes with evidence for gene-disease association from the Human Gene Mutation Database, OMIM, and ClinVar to build a comprehensive gene list of 6,145 genes. Next, we applied stringent filters in conjunction with computationally curated evidence (DisGeNET) to create a restrictive list limited to 3,929 genes with stronger disease associations.
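
The two-tier selection described in the Methods can be pictured with a minimal sketch. The abstract does not state the actual filter criteria, so everything specific here is an assumption made for illustration: the `GeneEvidence` record, the per-source boolean flags, the `min_sources` and `min_score` thresholds, and the toy gene names are all hypothetical and are not the authors' parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneEvidence:
    """Hypothetical per-gene summary; field names are illustrative only."""
    symbol: str
    hgmd_ok: bool          # gene passes an (assumed) HGMD evidence filter
    omim_ok: bool          # gene passes an (assumed) OMIM evidence filter
    clinvar_ok: bool       # gene passes an (assumed) ClinVar evidence filter
    disgenet_score: float  # assumed 0..1 computational evidence score


def comprehensive_list(records):
    """Keep genes supported by at least one source (assumed rule)."""
    return {r.symbol for r in records
            if r.hgmd_ok or r.omim_ok or r.clinvar_ok}


def restrictive_list(records, min_sources=2, min_score=0.3):
    """Keep genes supported by several sources AND a minimum score.

    Both thresholds are placeholders, not values from the paper.
    """
    keep = set()
    for r in records:
        n_sources = sum((r.hgmd_ok, r.omim_ok, r.clinvar_ok))
        if n_sources >= min_sources and r.disgenet_score >= min_score:
            keep.add(r.symbol)
    return keep


# Toy records with made-up gene names and values.
records = [
    GeneEvidence("GENE_A", True, True, True, 0.8),
    GeneEvidence("GENE_B", True, False, False, 0.1),
    GeneEvidence("GENE_C", False, False, False, 0.9),
]
comprehensive = comprehensive_list(records)
restrictive = restrictive_list(records)
```

With any `min_sources` of at least 1, the restrictive set is a subset of the comprehensive set, which mirrors the relationship between the two lists in the abstract.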

Results: When compared to manual gene curation efforts, including the Clinical Genome Resource, genes with strong or definitive disease associations are included in both gene lists at high percentages, while genes with limited evidence are largely removed. We further confirmed the utility of this approach in identifying pathogenic and likely pathogenic variants in 45 genomes.
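
A comparison of this kind amounts to counting, for each curated evidence class, how many genes a list includes and how many it excludes. The sketch below shows one way to do that tally; the function name `inclusion_by_class`, the class labels, and the toy genes are assumptions for illustration and do not reproduce the study's data.

```python
from collections import Counter


def inclusion_by_class(curated, gene_list):
    """Tally included/excluded genes per curated evidence class.

    curated:   dict mapping gene symbol -> classification label
    gene_list: set of gene symbols in the list being evaluated
    Returns {label: (n_included, n_excluded, fraction_included)}.
    """
    included = Counter()
    total = Counter()
    for gene, label in curated.items():
        total[label] += 1
        if gene in gene_list:
            included[label] += 1
    return {
        label: (included[label],
                total[label] - included[label],
                included[label] / total[label])
        for label in total
    }


# Toy input with made-up genes and labels.
curated_toy = {
    "GENE_A": "Definitive",
    "GENE_B": "Definitive",
    "GENE_C": "Limited",
}
summary = inclusion_by_class(curated_toy, {"GENE_A", "GENE_B"})
```

Running the same tally once per gene list and once per curation resource gives the included and excluded counts of the kind reported in Figure 2.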

Conclusion: Our approach efficiently creates highly sensitive gene lists for genomic applications, while remaining dynamic and updatable, enabling time savings in genomic applications.

Figures

Figure 1:
(A) Schematic of the criteria fulfilled at each stage of the gene filtration process. Genes with entries in ClinVar (11,234 genes), OMIM Morbid Map (8,087 genes), and HGMD (12,080 genes) were integrated to generate the comprehensive and restrictive gene lists. Filtration parameters for each stage are presented in the right panel. (B) Venn diagram of the comprehensive (left) and restrictive (right) gene lists, including the number of genes meeting criteria in the initial databases.
Figure 2:
Comprehensive and restrictive gene lists were compared to the GDA classifications assigned by six resources: (A) ClinGen, (B) MedSeq, (C) BabySeq, (D) consensus of Australian PanelApp (Incidentalome and Mendeliome panel) and GenomicsEngland PanelApp (Paediatric Panel), and (E) consensus from GenCC. Numbers below the bar represent the number of genes included and numbers above the bar are the number of genes excluded in the respective list. Other: conflicting, refuted, disputed, no reported evidence, trait, pharmacogenomic association, only claim is from GWAS, and does not meet criteria; C: Comprehensive Gene List; R: Restrictive Gene List; a: CD79B; b: RPS15; c: SMOC2
