Database (Oxford). 2013 Nov 29:2013:bat079.
doi: 10.1093/database/bat079. Print 2013.

Chemical annotation of small and peptide-like molecules at the Protein Data Bank

Jasmine Y Young et al. Database (Oxford). 2013.

Abstract

Over the past decade, the number of polymers and their complexes with small molecules in the Protein Data Bank archive (PDB) has continued to increase significantly. To support scientific advancements and ensure the best quality and completeness of the data files over the next 10 years and beyond, the Worldwide PDB partnership that manages the PDB archive is developing a new deposition and annotation system. This system focuses on efficient data capture across all supported experimental methods. The new deposition and annotation system is composed of four major modules that together support all of the processing requirements for a PDB entry. In this article, we describe one such module called the Chemical Component Annotation Tool. This tool uses information from both the Chemical Component Dictionary and Biologically Interesting molecule Reference Dictionary to aid in annotation. Benchmark studies have shown that the Chemical Component Annotation Tool provides significant improvements in processing efficiency and data quality. Database URL: http://wwpdb.org.

Figures

Figure 1.
Chemical component annotation. Processing steps are labeled as described in this article. Compared with the previous method of chemical component processing, the CCA Tool automates and integrates most of the steps, including ‘batch’ functionalities that process multiple components at the same time. The CCA Tool is also fully integrated with the D&A system, whereas the previous pipeline was completely separate from the other annotation processes and tools.
Figure 2.
Batch Search Results Report for an entry that contains multiple chemical components (Step 2.1 of Figure 1). (A) The CCA Tool identifies and compares deposited ligands with the CCD in a batch mode, and reports the status (passed, close match or no match) of the comparison, which results in corresponding annotator action. (B) An example search results report that provides immediate information to the annotator about each chemical component instance found in the entry, and the closest match found in the CCD, as named in the ‘Top Hit’ column. In this example, the deposited entry has 14 chemical components, including two instances of alpha-d-mannose (MAN), eight instances of N-acetyl-beta-d-glucosamine (NAG), two instances of 2-(N-morpholino)-ethanesulfonic acid (MES) and two instances of zinc ions (ZN) as listed in the first column. The second column displays closest component matches found in the CCD. The report shown indicates that only the first three instances require further inspection, as the other instances in the deposited entry have corresponding definitions in the CCD. Matches that are similar but not identical can be initially evaluated with the use of the Composite Score column that represents the comparison of the instance with its Top Hit match in five categories: number of heavy atoms, number of chiral centers (independent of handedness), handedness of the chiral centers, number of aromatic atoms and bond order. In each category, each atom in the instance is compared against the Top Hit match in a binary way (match or no match). Then the number of matching atoms is divided by the number of total possible matches (i.e. the number of eligible atoms in the category) and is expressed as a percentage. Mousing over the Composite Score displays additional information in a pop-up window. 
In this example, the Composite Score does not reveal any chemical differences for the first two instances listed, which means that the only difference between the deposited instance and the top hit match is the CCD ID used. The annotator will then update the CCD ID used in the deposited entry. The third instance listed, 1_C_NAG_1076, has a score of 80% for the chiral center comparison, as one of the five chiral centers in NAG (chiral center C1) has sp2 hybridization rather than sp3 in the experimental coordinates.
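The per-category percentage described in this caption can be sketched as follows. This is a minimal illustration, not the CCA Tool's actual code: the function names, the category keys and the handling of a category with no eligible atoms are assumptions, and every count except the chiral center count for 1_C_NAG_1076 (four of five matching, per the caption) is a placeholder.

```python
# Hypothetical category keys mirroring the five categories named in the caption.
CATEGORIES = (
    "heavy_atoms",
    "chiral_centers",
    "chiral_handedness",
    "aromatic_atoms",
    "bond_order",
)


def category_score(matches: int, eligible: int) -> float:
    """Percentage of eligible atoms that match the Top Hit in one category."""
    if eligible == 0:
        # Assumption: a category with nothing to compare counts as a full match.
        return 100.0
    return 100.0 * matches / eligible


def composite_score(counts: dict) -> dict:
    """Map each category to its percentage, given (matches, eligible) pairs."""
    return {name: category_score(*counts[name]) for name in CATEGORIES}


# Only the chiral_centers pair comes from the caption; the rest are placeholders.
nag_counts = {
    "heavy_atoms": (15, 15),
    "chiral_centers": (4, 5),
    "chiral_handedness": (4, 4),
    "aromatic_atoms": (0, 0),
    "bond_order": (15, 15),
}
nag_scores = composite_score(nag_counts)
print(nag_scores["chiral_centers"])  # 80.0
```

Keeping the five percentages separate, rather than averaging them, matches the caption's description of a score reported per category.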
Figure 3.
The Instance Search View (Step 2.2 in Figure 1). This example shows two instances in the deposited entry, both labeled NAG (N-acetyl-beta-d-glucosamine), that are analogs of the matches found in the CCD. For these instances, the annotator can launch 2D and 3D comparisons by selecting the arrow next to the instance of interest. Visual comparisons of deposited instances (green column) and CCD definitions (blue columns) are available. To suggest CCD matches, the CCA Tool uses the deposited chemical environment for the prediction of the complete chemical description. This environment is displayed as sticks in the visual displays. In this example, the tool has recognized the adjacent atoms, and has added the leaving group as a black stick (labeled A) to provide the absolute stereochemistry. The 3D view reveals the glycosylation interaction of the deposited instance of NAG with asparagine (ASN) (in stick representation, labeled B). The annotator can use this environmental information to correctly assign components.
Figure 4.
The Chemical Component Editor (Step 2.3 in Figure 1) used to create new CCD definitions. The interface provides a variety of operations (top buttons) to support the creation of a new chemical definition with a unique code (labeled A), update the PDB entry file (labeled B) and add definitions to the CCD (labeled C). Molecular viewers interactively display the chemical component in 2D and 3D. In this example showing the creation of the definition for CCD ID R12, two steps performed are shown in the 2D sketch tool panels: changing bonds from single to double, and from double to single (CCE functions labeled D and E) and then adding missing atoms/elements (labeled F). Hydrogen atoms are added implicitly and the chemical descriptions are updated automatically. Changes are updated instantly in the 2D and 3D viewers.
Figure 5.
The Chopper Tool is used to break peptide-like inhibitors and antibiotics into individual polymeric residues or subcomponents following BIRD definitions (Step 2.4 in Figure 1). Edits are made in the 2D view (lower left); selected bonds (highlighted in yellow) can be marked to be ‘chopped’. The chopped residue or subcomponent is then searched against the CCD; results are color-coded and listed in the top bar. For example, VAL is color-coded in orange. The order of residues/subcomponents listed at the top of the page can be changed (by dragging the name) as needed to provide the appropriate sequence. The 3D view (right) displays the same components shown in the 2D view. When the subcomponents have been created and the sequence is in the correct order, the ‘Chop Coordinates’ button will change the molecule to a polymeric representation in the PDB entry according to the selected decomposition.
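One way to picture the 'chop' operation is as removing the marked bonds from the molecular graph and collecting the connected fragments that remain. The sketch below assumes that model; it is not taken from the Chopper Tool itself, and the function name `chop`, the atom labels and the toy two-residue backbone are hypothetical.

```python
from collections import defaultdict


def chop(atoms, bonds, chopped_bonds):
    """Split a molecule into fragments by removing the bonds marked to be chopped.

    atoms: iterable of atom labels
    bonds: iterable of (atom, atom) pairs
    chopped_bonds: the subset of bonds to cut, in either orientation
    Returns fragments as sorted lists of atom labels, in order of first appearance.
    """
    cut = {frozenset(bond) for bond in chopped_bonds}
    adjacency = defaultdict(set)
    for a, b in bonds:
        if frozenset((a, b)) not in cut:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    fragments = []
    for start in atoms:
        if start in seen:
            continue
        # Depth-first walk over the bonds that were not cut.
        seen.add(start)
        stack = [start]
        fragment = []
        while stack:
            atom = stack.pop()
            fragment.append(atom)
            for neighbor in adjacency[atom]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        fragments.append(sorted(fragment))
    return fragments


# Toy backbone of two linked residues; cutting the C1-N2 bond separates them.
toy_atoms = ["N1", "CA1", "C1", "N2", "CA2", "C2"]
toy_bonds = [("N1", "CA1"), ("CA1", "C1"), ("C1", "N2"), ("N2", "CA2"), ("CA2", "C2")]
toy_fragments = chop(toy_atoms, toy_bonds, [("C1", "N2")])
print(toy_fragments)  # [['C1', 'CA1', 'N1'], ['C2', 'CA2', 'N2']]
```

In the workflow described in the caption, each resulting fragment would then be searched against the CCD and the fragments ordered into the appropriate sequence.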
