Comparative Study
Proc Natl Acad Sci U S A. 1997 May 27;94(11):5831-6.
doi: 10.1073/pnas.94.11.5831.

Positionally cloned human disease genes: patterns of evolutionary conservation and functional motifs

A R Mushegian et al. Proc Natl Acad Sci U S A. 1997.

Abstract

Positional cloning has already produced the sequences of more than 70 human genes associated with specific diseases. In addition to their medical importance, these genes are of interest as a set of human genes isolated solely on the basis of the phenotypic effect of the respective mutations. We analyzed the protein sequences encoded by the positionally cloned disease genes using an iterative strategy combining several sensitive computer methods. Comparisons to complete sequence databases and to separate databases of nematode, yeast, and bacterial proteins showed that for most of the disease gene products, statistically significant sequence similarities are detectable in each of the model organisms. Only the nematode genome encodes apparent orthologs with conserved domain architecture for the majority of the disease genes. In yeast and bacterial homologs, domain organization is typically not conserved, and sequence similarity is limited to individual domains. Generally, human genes complement mutations only in orthologous yeast genes. Most of the positionally cloned genes encode large proteins with several globular and nonglobular domains, the functions of some or all of which are not known. We detected conserved domains and motifs not described previously in a number of proteins encoded by disease genes and predicted functions for some of them. These predictions include an ATP-binding domain in the product of hereditary nonpolyposis colon cancer gene (a MutL homolog), which is conserved in the HS90 family of chaperone proteins, type II DNA topoisomerases, and histidine kinases, and a nuclease domain homologous to bacterial RNase D and the 3'-5' exonuclease domain of DNA polymerase I in the Werner syndrome gene product.

Figures

Figure 1
Features of human positionally cloned disease gene products. The vertical axis in each panel indicates the percentage of the analyzed protein set. (A) Protein size classes. 1, Positionally cloned disease gene products; 2, human proteins from SWISS-PROT. (B) Domain organization. 1, Positionally cloned disease gene products; 2, human proteins from SWISS-PROT. The horizontal axis indicates the predicted globular domains, with the lower size limit of 20 amino acid residues. (C) Sequence conservation. 1, All sequence similarities; 2, orthologs.
Figure 2
Previously undetected conserved domains and motifs in positionally cloned disease gene products. Alignments were constructed using the MACAW program. Unique identifiers for each sequence are shown. Distances to the ends of the proteins and distances between the aligned, conserved blocks are shown by numbers. Conserved bulky hydrophobic residues (I, L, M, V, F, Y, W) are indicated by yellow shading and by U in the consensus. Other conserved residues are shown in magenta. Other designations in the consensus: O, small residues (A, G, or S); +, basic residues (K and R); −, acidic residues (D and E).

(A) A putative ATP-binding domain in hereditary nonpolyposis colon cancer gene product (HML1), MutL mismatch repair proteins, HSP90 chaperones, type II DNA topoisomerases, and bacterial histidine kinases. The sequences were from the SWISS-PROT database: MLH1_HUMAN, human MutL homolog, colon cancer susceptibility gene product; PMS1_YEAST and MLH1_YEAST, yeast mismatch repair gene products homologous to MutL; HEXB_STRPN, mismatch repair gene product from Streptococcus pneumoniae; MUTL_ECOLI, E. coli mismatch repair gene mutL product; HTPG_HAEIN, HTPG_BACSU, HS90_THEPA, HS90_CANAL, and HS9A_HUMAN, molecular chaperones of the HSP90 family from Haemophilus influenzae, Bacillus subtilis, Theileria parva, Candida albicans, and human; TOPB_HUMAN, TOP2_DROME, GYRB_HALSQ, PARE_ECOLI, and GYRB_ECOLI, type II topoisomerases from human, Drosophila melanogaster, Haloferax sp., and E. coli; SPHS_SYNP7 and PHOR_BACSU, histidine kinases involved in inducible alkaline phosphatase production from Synechococcus sp. and Bacillus subtilis; PILS_PSEAE, histidine kinase involved in fimbriae biogenesis from Pseudomonas aeruginosa; PHY1_TOBAC, tobacco phytochrome A1 (histidine kinase homolog); ENVZ_ECOLI, osmolarity sensor histidine kinase. Three motifs described in histidine kinases and phenotypes of E. coli envZ mutants (36) are shown below the alignment. Dominant-negative mutations in MutL protein (37) are indicated by gray shading. Asterisks indicate amino acid residues that in E. coli GyrB are in direct contact with ATP; two of such residues are in the spacer between motifs G1 and G2 (38). The secondary structure assignments are from the crystal structure of the N-terminal fragment of E. coli GyrB (38); h, α-helix, e, extended conformation (β-sheet), and l, loop.

(B) A putative nuclease domain conserved in Werner syndrome gene product (WRNp), bacterial RNase D, and DNA polymerase I. The sequences were from SWISS-PROT (names with underlines) or from GenBank (National Center for Biotechnology Information accession numbers indicated below). PMSC_HUMAN, human polymyositis and scleroderma autoantigen; RND_HAEIN, RND_ECOLI, and RND/SYNSP (gi 1001530), RNase D from H. influenzae, E. coli, and Synechocystis sp.; ORF/SCHPO (gi 1256512), uncharacterized ORF product from Schizosaccharomyces pombe; DPO1_ECOLI, DPO1_HAEIN, DPO1/SYNSP, DPO1/TREPA, and DPO1/BPT5, DNA polymerase I from E. coli, H. influenzae, Synechocystis sp., Treponema pallidum, and bacteriophage T5. The structural assignments are from the Klenow fragment structure (39); the designations are as in A. The three aspartates that coordinate the two cations required for the 3′-5′ exonuclease reaction by the Klenow fragment are indicated by asterisks; the two residues directly involved in catalysis (40, 41) are indicated by exclamation marks. ExoI, ExoII, and ExoIII indicate the three motifs that are conserved throughout the nuclease superfamily (42, 43).
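The consensus notation used in the Figure 2 caption (U, O, +, −) can be illustrated with a small sketch. This is not the procedure used by MACAW or by the authors. It assumes that a column counts as conserved only when every aligned sequence falls in the same residue class, a threshold the caption does not state. The names `consensus_symbol` and `consensus` are hypothetical, the toy sequences are invented for illustration, and an ASCII hyphen stands in for the minus sign that marks acidic residues.

```python
# Residue classes exactly as listed in the Figure 2 caption.
RESIDUE_CLASSES = {
    "U": set("ILMVFYW"),  # bulky hydrophobic
    "O": set("AGS"),      # small
    "+": set("KR"),       # basic
    "-": set("DE"),       # acidic (ASCII hyphen in place of the minus sign)
}
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def consensus_symbol(column):
    """Return the consensus symbol for one alignment column.

    Assumption: a column is conserved only if ALL residues share a class.
    Columns with gaps or unknown characters are reported as '.'.
    """
    residues = set(column.upper())
    if not residues or not residues <= AMINO_ACIDS:
        return "."
    # Class symbols take precedence, following the caption's wording.
    for symbol, members in RESIDUE_CLASSES.items():
        if residues <= members:
            return symbol
    # Any other strictly identical residue is shown as itself.
    if len(residues) == 1:
        return next(iter(residues))
    return "."


def consensus(aligned_sequences):
    """Build a consensus line for equal-length aligned sequences."""
    return "".join(
        consensus_symbol("".join(col)) for col in zip(*aligned_sequences)
    )


if __name__ == "__main__":
    # Invented toy block, not taken from the paper's alignments.
    toy_block = ["LKDGH", "IRESH", "VKDAQ"]
    print(consensus(toy_block))
```

A looser rule, such as requiring conservation in most rather than all sequences, would only change the subset test inside `consensus_symbol`.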
