Pandit: a database of protein and associated nucleotide domains with inferred trees

Simon Whelan¹, Paul I W de Bakker, Nick Goldman

PMID: 12912837
DOI: 10.1093/bioinformatics/btg188

Comparative Study

Bioinformatics. 2003 Aug 12;19(12):1556-63.

Affiliation

¹ Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK. simon@ebi.ac.uk

Abstract

Motivation: A large, high-quality database of homologous sequence alignments with good estimates of their corresponding phylogenetic trees will be a valuable resource to those studying phylogenetics. It will allow researchers to compare current and new models of sequence evolution across a large variety of sequences. The large quantity of data may provide inspiration for new models and methodology to study sequence evolution and may allow general statements about the relative effect of different molecular processes on evolution.

Results: The Pandit 7.6 database contains 4341 families of sequences derived from the seed alignments of Pfam, a database of amino acid alignments of families of homologous protein domains (Bateman et al., 2002). Each family in Pandit includes three alignments: an alignment of amino acid sequences that matches the corresponding Pfam family seed alignment; an alignment of the DNA sequences that contain the coding sequence of the Pfam alignment, where these could be recovered (overall, 82.9% of the sequences taken from Pfam); and an alignment of amino acid sequences restricted to those sequences for which a DNA sequence could be recovered. Each alignment has an associated estimate of its phylogenetic tree. The tree topologies were obtained using the neighbor-joining method applied to maximum likelihood estimates of the evolutionary distances, and branch lengths were then calculated using a standard maximum likelihood approach.
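
The topology step described above can be illustrated with a minimal sketch of neighbor joining on a precomputed distance matrix. This is not the Pandit pipeline: the function name `neighbor_joining`, the `frozenset`-keyed distance dictionary, the `node0`, `node1` labels and the four-taxon example are all assumptions made for illustration. The pairwise distances are taken as given (in Pandit they are maximum likelihood estimates), and the branch lengths returned here are the standard neighbor-joining lengths, whereas the abstract states that Pandit recalculates branch lengths by maximum likelihood.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor joining.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Returns a list of (node, joined_to, branch_length) edges.
    Illustrative sketch only; internal nodes are labelled node0, node1, ...
    """
    d = dict(dist)          # working copy, extended with new internal nodes
    nodes = list(names)
    edges = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence: sum of distances from each node to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
             for i in nodes}
        # Join the pair that minimises Q(i, j) = (n - 2) d_ij - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        u = f"node{next_id}"
        next_id += 1

        # Branch lengths from i and j to the new internal node u.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        edges.append((i, u, li))
        edges.append((j, u, lj))

        # Distances from u to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    # Connect the last two nodes directly.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical additive distances for four taxa (not taken from Pandit).
example = {
    frozenset(("A", "B")): 3.0, frozenset(("A", "C")): 9.0,
    frozenset(("A", "D")): 10.0, frozenset(("B", "C")): 10.0,
    frozenset(("B", "D")): 11.0, frozenset(("C", "D")): 7.0,
}
for edge in neighbor_joining(["A", "B", "C", "D"], example):
    print(edge)
```

Because the example distances are additive (they fit a tree exactly), neighbor joining groups A with B and C with D. On real alignments the distances are estimates, so the resulting topology is itself an estimate.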
