BMC Bioinformatics. 2005 Mar 31;6:81.

doi: 10.1186/1471-2105-6-81.

Columba: an integrated database of proteins, structures, and annotations

Silke Trissl¹, Kristian Rother, Heiko Müller, Thomas Steinke, Ina Koch, Robert Preissner, Cornelius Frömmel, Ulf Leser

PMID: 15801979
PMCID: PMC1087474
DOI: 10.1186/1471-2105-6-81

Silke Trissl et al. BMC Bioinformatics. 2005.

Affiliation

¹ Institute of Informatics, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany. silke.trissl@informatik.hu-berlin.de

Abstract

Background: Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures.

Description: COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either a keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and have structures with a resolution under a defined threshold. Query results are provided both in machine-readable Extensible Markup Language (XML) and in a human-readable format. The structures themselves can be viewed interactively on the web.
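The combined selection described above can be pictured as a conjunction of per-source conditions on each PDB entry. The following is a minimal sketch only, assuming a hypothetical in-memory record type `Entry` and a hypothetical helper `select_entries`; it is not the COLUMBA schema or interface, and the two records are invented toy data rather than real PDB entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical record for one PDB entry with its annotations."""
    pdb_id: str
    resolution: float               # resolution in Angstrom
    kegg_pathways: frozenset        # pathway names (KEGG)
    cath_architectures: frozenset   # architecture names (CATH)
    go_functions: frozenset         # molecular function terms (Gene Ontology)


def select_entries(entries, pathway, architecture, function, max_resolution):
    """Return IDs of entries that satisfy all four conditions at once."""
    return [
        e.pdb_id
        for e in entries
        if pathway in e.kegg_pathways
        and architecture in e.cath_architectures
        and function in e.go_functions
        and e.resolution < max_resolution
    ]


# Invented toy records, for illustration only.
ENTRIES = [
    Entry("TOY1", 1.8,
          frozenset({"Pyrimidine metabolism"}),
          frozenset({"3-layer(aba) sandwich"}),
          frozenset({"oxidoreductase activity"})),
    Entry("TOY2", 3.2,
          frozenset({"Pyrimidine metabolism"}),
          frozenset({"3-layer(aba) sandwich"}),
          frozenset({"oxidoreductase activity"})),
]

selected = select_entries(
    ENTRIES,
    pathway="Pyrimidine metabolism",
    architecture="3-layer(aba) sandwich",
    function="oxidoreductase activity",
    max_resolution=2.5,
)
print(selected)
```

Here `TOY2` matches every annotation but is dropped because its resolution value is not under the 2.5 threshold, which mirrors how each additional condition narrows the result set.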

Conclusion: The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows users to combine queries on a number of structure-related databases that are not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.

Figures

**Figure 1**
**Schematic entity-relationship model of COLUMBA**. The dark gray part in the middle is the subschema that originates from the Protein Data Bank (PDB). The other subschemas are represented by a single box indicating the name of the data source and are grouped according to a broad classification of their content.

**Figure 2**
**Screen shots of COLUMBA web-forms.** (A) Interface for the full text search. (B) Query form for the metabolism information, where the result set can be restricted by information from ENZYME and KEGG.

**Figure 3**
**Screen shots of COLUMBA query results.** (A) Result set for a query requesting structures from the ENZYME class '1.-.-.-' combined with a full text condition on 'TIM barrel'. (B) COLUMBA Explorer detailed view of the PDB structure 1d3h.

**Figure 4**
**The CATH wheel for KEGG pathways**. The colors of the CATH wheel represent the CATH classes: yellow stands for alpha/beta, red for mainly alpha, blue for mainly beta, and green for Few Secondary Structures. The inner circle represents the CATH architectures (C.A.), where the width of each segment represents the number of enzymes found to exhibit that architecture. The outer circle stands for the Topology (C.A.T.). (A) shows the distribution of all enzymes participating in KEGG pathways, with the '3-layer(aba) sandwich' representing the largest architecture. (B) shows the CATH wheel for the pathway 'Pyrimidine metabolism', and (C) shows it for 'Glycolysis/Gluconeogenesis'.

Cited by

SuperMimic--fitting peptide mimetics into protein structures.
Goede A, Michalsky E, Schmidt U, Preissner R. BMC Bioinformatics. 2006 Jan 10;7:11. doi: 10.1186/1471-2105-7-11. PMID: 16403211. Free PMC article.
BIOZON: a system for unification, management and analysis of heterogeneous biological data.
Birkland A, Yona G. BMC Bioinformatics. 2006 Feb 15;7:70. doi: 10.1186/1471-2105-7-70. PMID: 16480510. Free PMC article.
TAGOPSIN: collating taxa-specific gene and protein functional and structural information.
Bundhoo E, Ghoorah AW, Jaufeerally-Fakim Y. BMC Bioinformatics. 2021 Oct 23;22(1):517. doi: 10.1186/s12859-021-04429-5. PMID: 34688246. Free PMC article.
GenoQuery: a new querying module for functional annotation in a genomic warehouse.
Lemoine F, Labedan B, Froidevaux C. Bioinformatics. 2008 Jul 1;24(13):i322-9. doi: 10.1093/bioinformatics/btn159. PMID: 18586731. Free PMC article.
Variant information systems for precision oncology.
Starlinger J, Pallarz S, Ševa J, Rieke D, Sers C, Keilholz U, Leser U. BMC Med Inform Decis Mak. 2018 Nov 21;18(1):107. doi: 10.1186/s12911-018-0665-z. PMID: 30463544. Free PMC article.
