Mapping PDB chains to UniProtKB entries
- PMID: 16188924
- DOI: 10.1093/bioinformatics/bti694
Abstract
Motivation: UniProtKB/SwissProt is the main resource for detailed annotations of protein sequences. Its cross-references make it a jumping-off point to many other resources, including other primary databases, secondary databases, the Gene Ontology and OMIM. While it provides a large number of links to Protein Data Bank (PDB) files, obtaining a regularly updated mapping between UniProtKB entries and PDB entries at the chain or residue level is not straightforward. In particular, there is no regularly updated resource that allows a UniProtKB/SwissProt entry to be identified for a given residue of a PDB file.
Results: We have created a fully automatically maintained database that maps PDB residues to residues in UniProtKB/SwissProt and UniProtKB/TrEMBL entries. The protocol uses links from PDB to UniProtKB, links from UniProtKB to PDB, and a brute-force sequence scan to resolve PDB chains for which no annotated link is available. Finally, the sequences from PDB and UniProtKB are aligned to obtain a residue-level mapping.
Availability: The resource may be queried interactively or downloaded from http://www.bioinf.org.uk/pdbsws/.
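
The protocol summarized under Results can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`resolve_chain`, `global_align`, `residue_mapping`), the dictionary-based link tables, the exact-substring stand-in for the brute-force scan, and the alignment scores are all assumptions chosen for illustration. The sketch first tries annotated links in either direction, falls back to scanning all sequences, and then derives a residue-level mapping from a simple Needleman-Wunsch global alignment.

```python
from typing import Dict, List, Optional, Tuple


def resolve_chain(chain_id: str, chain_seq: str,
                  pdb_links: Dict[str, str],
                  uniprot_links: Dict[str, str],
                  uniprot_seqs: Dict[str, str]) -> Optional[str]:
    """Pick a UniProtKB accession for a PDB chain (hypothetical helper).

    Tries PDB-to-UniProtKB links, then UniProtKB-to-PDB links, then a
    brute-force scan. The scan here is simplified to an exact substring
    test; a real scan would tolerate mismatches.
    """
    acc = pdb_links.get(chain_id) or uniprot_links.get(chain_id)
    if acc is None:
        acc = next((a for a, s in uniprot_seqs.items() if chain_seq in s), None)
    return acc


def global_align(a: str, b: str, match: int = 2, mismatch: int = -1,
                 gap: int = -2) -> List[Tuple[Optional[int], Optional[int]]]:
    """Needleman-Wunsch alignment; returns 0-based index pairs (None = gap)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal path.
    pairs: List[Tuple[Optional[int], Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


def residue_mapping(pdb_seq: str, uniprot_seq: str) -> Dict[int, int]:
    """Map 1-based PDB sequence positions to 1-based UniProtKB positions."""
    return {i + 1: j + 1
            for i, j in global_align(pdb_seq, uniprot_seq)
            if i is not None and j is not None}


# Toy data: the chain has no annotated link, so the scan finds it.
seqs = {"P00001": "ABCDEFGH", "P00002": "MNPQRSTV"}
accession = resolve_chain("1abcA", "CDEF", {}, {}, seqs)
mapping = residue_mapping("CDEF", seqs[accession])
```

In this toy case the chain sequence covers positions 3 to 6 of the matched entry, so each chain residue maps to the corresponding position in the full-length sequence. Note that the positions are indices into the chain sequence, not the residue numbers recorded in the PDB file, which would need a further lookup.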