Pathogenic variation types in human genes relate to diseases through Pfam and InterPro mapping
- PMID: 36188216
- PMCID: PMC9523224
- DOI: 10.3389/fmolb.2022.966927
Abstract
Grouping residue variations in a protein according to their physicochemical properties reduces the dimensionality of all the possible substitutions in a variant with respect to the wild type. Here, using a large dataset of proteins with disease-related and benign variations, derived by merging Humsavar and ClinVar data, we investigate to what extent our physicochemical grouping procedure can help determine whether patterns of variation types are related to specific groups of diseases and whether they occur in Pfam and/or InterPro gene domains. We download 75,145 germline disease-related and benign variations of 3,605 genes, group them according to physicochemical categories, and map them into Pfam and InterPro gene domains. Statistically validated analysis indicates that each cluster of genes associated with Mondo anatomical system categorizations is characterized by a specific variation pattern. Patterns identify specific Pfam and InterPro domain-Mondo category associations. Our data suggest that the association of variation patterns with Mondo categories is unique and may help in associating gene variants with genetic diseases. This work corroborates, in a much larger data set, previous observations from our group.
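The grouping step can be illustrated with a minimal sketch. The abstract does not list the physicochemical classes, so the four-class partition below (apolar, aromatic, polar, charged) is an assumption chosen for illustration, and the names `GROUPS`, `variation_type`, and `variation_pattern` are hypothetical. Under this assumption, the 380 possible residue substitutions collapse into 16 class-to-class variation types, and a gene set can be summarized by the relative frequency of each type.

```python
from collections import Counter

# Assumed four-class grouping of the 20 standard residues; the abstract
# does not spell out the classes, so treat this partition as illustrative.
GROUPS = {
    "apolar": "GAVPLIM",
    "aromatic": "FWY",
    "polar": "STCNQH",
    "charged": "DEKR",
}

# Reverse lookup: one-letter residue code -> class name.
RESIDUE_CLASS = {aa: name for name, members in GROUPS.items() for aa in members}


def variation_type(wild_type, variant):
    """Map a residue substitution to its (wild-type class, variant class) pair."""
    return (RESIDUE_CLASS[wild_type], RESIDUE_CLASS[variant])


def variation_pattern(substitutions):
    """Return the relative frequency of each variation type in a list of
    (wild_type, variant) residue pairs; an empty list gives an empty dict."""
    counts = Counter(variation_type(wt, var) for wt, var in substitutions)
    total = sum(counts.values())
    return {vtype: n / total for vtype, n in counts.items()}


# Made-up substitutions, not taken from Humsavar or ClinVar.
example = [("R", "W"), ("R", "Q"), ("G", "D"), ("K", "M")]
pattern = variation_pattern(example)
```

Comparing such frequency vectors between gene clusters, for example one vector per Mondo category or per Pfam domain, is one way to look for the kind of category-specific patterns the abstract reports; the statistical validation used by the authors is not reproduced here.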
Keywords: InterPro domain; Pfam domain; disease associated variant; Mondo anatomical system categories; variation physicochemical type.
Copyright © 2022 Babbi, Savojardo, Baldazzi, Martelli and Casadio.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.