Nucleic Acids Res. 2006 Jan 1;34(Database issue):D572-80.
doi: 10.1093/nar/gkj118.

TreeFam: a curated database of phylogenetic trees of animal gene families

Heng Li et al. Nucleic Acids Res. 2006.

Abstract

TreeFam is a database of phylogenetic trees of gene families found in animals. It aims to develop a curated resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments. Curated families are being added progressively, based on seed alignments and trees in a similar fashion to Pfam. Release 1.1 of TreeFam contains curated trees for 690 families and automatically generated trees for another 11 646 families. These represent over 128 000 genes from nine fully sequenced animal genomes and over 45 000 other animal proteins from UniProt; approximately 40-85% of proteins encoded in the fully sequenced animal genomes are included in TreeFam. TreeFam is freely available at http://www.treefam.org and http://treefam.genomics.org.cn.

Figures

Figure 1
Flowcharts of TreeFam pipelines. (A) Overall strategy. The seed families for TreeFam-B are taken from PhIGs clusters and expanded into full families by a seed-to-full procedure. Manual curation turns TreeFam-B families into TreeFam-A families, which can be curated further at a later date. (B) The seed-to-full procedure, which expands seed families to full families. Note that the complete seed-to-full pipeline is applied only when the sequence sets are updated or a whole new genome is added to TreeFam. That is, for a TreeFam-A family created by curation of a TreeFam-B family, the TreeFam-A seed is generated by manual curation, and the full sequences are taken directly from the TreeFam-B family that was curated. (C) Manual curation. Various published resources and in-house tools are used in this process.
Figure 2
An example TreeFam webpage, for the Cyclin-E family. In the alignment, the positions of introns are indicated by highlighting in red the amino acid to the right of each intron–exon boundary.
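
The family lifecycle described in the Figure 1 caption can be illustrated with a small sketch. This is not TreeFam's actual code: the `Family` record, the `seed_to_full` and `curate` functions, and the `expand` callback are hypothetical names chosen for illustration, and the real expansion step (sequence search, alignment, tree building) is left to the caller. The sketch only captures the bookkeeping stated in the caption: a TreeFam-B seed is expanded into a full family, and curation yields a TreeFam-A family with a new seed and the full sequence set taken directly from the curated TreeFam-B family.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class Family:
    """Hypothetical record for one gene family (not TreeFam's schema)."""
    name: str
    kind: str               # "B" = automatically generated, "A" = curated
    seed: FrozenSet[str]    # sequence identifiers in the seed family
    full: FrozenSet[str]    # sequence identifiers in the full family


def seed_to_full(
    name: str,
    kind: str,
    seed: Iterable[str],
    expand: Callable[[FrozenSet[str]], Iterable[str]],
) -> Family:
    """Expand a seed into a full family.

    `expand` is an assumed, caller-supplied stand-in for the real
    seed-to-full pipeline; it returns extra members recruited by the seed.
    """
    seed_set = frozenset(seed)
    full_set = seed_set | frozenset(expand(seed_set))
    return Family(name, kind, seed_set, full_set)


def curate(family_b: Family, curated_seed: Iterable[str]) -> Family:
    """Turn a TreeFam-B family into a TreeFam-A family.

    The seed comes from manual curation; the full sequence set is copied
    unchanged from the TreeFam-B family, as the caption describes.
    """
    if family_b.kind != "B":
        raise ValueError("curate() expects a TreeFam-B family")
    return Family(family_b.name, "A", frozenset(curated_seed), family_b.full)
```

For example, `curate(seed_to_full("demo", "B", {"g1", "g2"}, lambda s: {"g3"}), {"g1", "g3"})` gives a family of kind `"A"` whose full set is still `{"g1", "g2", "g3"}`. Re-running `seed_to_full` on the curated seed would correspond to the complete pipeline that the caption says is applied only when sequence sets are updated or a new genome is added.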
