BMC Bioinformatics. 2005 Dec 1;6 Suppl 4(Suppl 4):S18.
doi: 10.1186/1471-2105-6-S4-S18.

Inherited disorder phenotypes: controlled annotation and statistical analysis for knowledge mining from gene lists

Marco Masseroli et al. BMC Bioinformatics.

Abstract

Background: Analysis of inherited diseases and their associated phenotypes is important for understanding the underlying genetic interactions and could ultimately give clinically useful insights into disease processes, including complex diseases influenced by multiple genetic loci. Nevertheless, few computational contributions have been proposed to date for this purpose, mainly because controlled clinical information that is easily accessible and structured for computational genome-wide analyses is lacking. To enable phenotype analyses of genes related to inherited disorders, we implemented new modules within GFINDer (http://www.bioinformatics.polimi.it/GFINDer/), a Web system we previously developed that dynamically aggregates functional annotations of user-uploaded gene lists and supports their statistical analysis and mining.

Results: The new GFINDer modules annotate large numbers of user-classified biomolecular sequence identifiers with morbidity and clinical information, classify them according to genetic disease phenotypes and the locations where those phenotypes occur, and statistically analyze the resulting classifications. To achieve this, we extracted, normalized and structured the information that is present in textual form in the Clinical Synopsis sections of the Online Mendelian Inheritance in Man (OMIM) databank. This information describes numerous signs and symptoms accompanying many genetic diseases and is divided into phenotype location categories, defined either by organ system or by type of finding.
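As a rough illustration of what structuring Clinical Synopsis text by phenotype location could look like, the following Python sketch groups phenotype strings under location categories. It is not GFINDer's implementation: the one-line-per-location input format (`Location: finding; finding`), the function name `parse_synopsis`, and the placeholder findings are assumptions made for illustration. Only the four abbreviation expansions and the "Mental retardation" phenotype come from the Figure 1 caption.

```python
from collections import defaultdict

# Abbreviation expansions taken from the Figure 1 caption; a real mapping
# would be larger and is not specified in this record.
LOCATION_NAMES = {
    "Neuro": "Neurologic",
    "GI": "Gastrointestinal",
    "Misc": "Miscellaneous",
    "Lab": "Laboratory",
}


def parse_synopsis(text):
    """Group phenotype strings by normalized phenotype location.

    Assumes a hypothetical input format with one location per line:
    'Location: finding; finding; ...'.
    """
    by_location = defaultdict(list)
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip lines that do not follow the assumed format
        location, _, findings = line.partition(":")
        location = location.strip()
        location = LOCATION_NAMES.get(location, location)
        for finding in findings.split(";"):
            finding = " ".join(finding.split())  # collapse stray whitespace
            if finding and finding not in by_location[location]:
                by_location[location].append(finding)
    return dict(by_location)


# Hypothetical input: the second line uses placeholder findings.
example = "Neuro: Mental retardation\nGI: finding a;  finding  b"
print(parse_synopsis(example))
```

Once phenotypes are keyed by location in this way, counting how many genes of a user-defined class fall under each location is a simple aggregation over the resulting dictionaries.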

Conclusion: Supporting phenotype analyses of inherited diseases and biomolecular functional evaluations, GFINDer facilitates a genomic approach to the understanding of fundamental biological processes and complex cellular mechanisms underlying patho-physiological phenotypes.

Figures

Figure 1
OMIM Clinical Synopsis section for the Phenylketonuria disease, associated with the human Phenylalanine Hydroxylase (PAH) gene and with the Mental retardation Neurologic phenotype. 261600: MIM (Mendelian Inheritance in Man) ID of the Phenylketonuria disease; +: in OMIM a plus sign before a MIM number entry indicates that the entry contains the description of a gene of known sequence and a phenotype; Neuro: Neurologic, GI: Gastrointestinal, Misc: Miscellaneous, Lab: Laboratory phenotype locations.
Figure 2
GFINDer Exploration Genetic Disorders module: phenotype location categories related to the considered cardiovascular system and neurobiology genes, respectively. Phenotype view: link to the list of considered genes associated with phenotypes in the specific phenotype location; Level: level in the defined phenotype location hierarchy (higher levels correspond to more specific locations); Num. (%): absolute number and percentage of considered genes associated with phenotypes in the specific phenotype location.
Figure 3
GFINDer Statistics Genetic Disorders module: phenotypes most significantly over- and under-represented in the considered cardiovascular system versus neurobiology gene classes. Phenotype level: level in the defined phenotype hierarchy (higher levels correspond to more detailed and specific phenotype descriptions); P-value (with test-type superscript): P value defining the association between a given phenotype and a considered class of genes, followed by the initial of the name of the statistical test used (h: hypergeometric distribution test).
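The Figure 3 caption names a hypergeometric distribution test for over- and under-representation. A minimal Python sketch of such a test is given below, using only the standard library. The way the counts are set up (all considered genes as the population, genes annotated with the phenotype as "successes", one gene class as the sample) is an assumption about how such a test is commonly framed, not a description of GFINDer's code, and the function name and parameters are hypothetical.

```python
from math import comb


def hypergeom_pvalues(total_genes, annotated_genes, class_size, class_annotated):
    """Return (p_over, p_under) for one phenotype and one gene class.

    total_genes      -- N: all considered genes
    annotated_genes  -- K: genes among them annotated with the phenotype
    class_size       -- n: genes in the class of interest
    class_annotated  -- k: genes in the class annotated with the phenotype

    p_over  = P(X >= k), small when the phenotype is over-represented.
    p_under = P(X <= k), small when the phenotype is under-represented.
    """
    N, K, n, k = total_genes, annotated_genes, class_size, class_annotated
    lowest = max(0, n - (N - K))   # fewest annotated genes a class can hold
    highest = min(K, n)            # most annotated genes a class can hold
    if not lowest <= k <= highest:
        raise ValueError("class_annotated is impossible for these counts")

    denominator = comb(N, n)

    def pmf(i):
        # Probability of drawing exactly i annotated genes in a class of n.
        return comb(K, i) * comb(N - K, n - i) / denominator

    p_over = sum(pmf(i) for i in range(k, highest + 1))
    p_under = sum(pmf(i) for i in range(lowest, k + 1))
    return p_over, p_under


# Toy counts chosen only to exercise the function; they are not from the paper.
p_over, p_under = hypergeom_pvalues(10, 5, 5, 5)
print(p_over, p_under)
```

In practice one would run such a test once per phenotype and then account for multiple testing; this record does not say whether or how GFINDer does so.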

