Nucleic Acids Res. 2019 Jan 8;47(D1):D367-D375.
doi: 10.1093/nar/gky1140.

KnotProt 2.0: a database of proteins with knots and other entangled structures

Pawel Dabrowski-Tumanski et al. Nucleic Acids Res.

Abstract

The KnotProt 2.0 database (the updated version of the KnotProt database) collects information about proteins which form knots and other entangled structures. New features in KnotProt 2.0 include the characterization of both probabilistic and deterministic entanglements which can be formed by disulfide bonds and interactions via ions, a refined characterization of entanglement in terms of knotoids, the identification of the so-called cysteine knots, the possibility to analyze all or a non-redundant set of proteins, and various technical updates. The KnotProt 2.0 database classifies all entangled proteins, represents their complexity in the form of a knotting fingerprint, and presents many biological and geometrical statistics based on these results. Currently the database contains >2000 entangled structures, and it regularly self-updates based on proteins deposited in the Protein Data Bank (PDB).

Figures

Figure 1.
Proteins that represent various classes of entangled structures identified by KnotProt 2.0 (top row, structures with PDB codes, from left to right: 2EFV, 2RH3, 1AOC, 2P4Z, 2ML7) and their schematic representation (bottom row). Probabilistic knots are identified based on the protein backbone chain; the dashed line denotes a possible chain closure. All the structures with probabilistic knots are subjected to the knotoid analysis. Knotoids provide a refined characterization and classification of open chains, which depends, however, on the choice of the projection plane. Deterministic knots are detected based on all possible combinations of covalent bonds or interactions via ions with the protein’s backbone. In the case of probabilistic covalent and ion-based knots, the chain is closed based on the methods established previously for knotted proteins. The green bead in the schematic depiction denotes the ion, while the orange stripes denote the covalent connections (e.g. disulfide bonds). For structures with covalent bridges, consecutive parts of the chain constituting the knot are colored red to blue. The parts of the chain not building the knot are marked with light grey. The rightmost structure is a cysteine knot, which in fact is not a mathematical knot (for more details see Figure 11); in this example, the loop (blue) is closed by two disulfide bridges (orange) and pierced by the third disulfide bond (red).
Figure 2.
A table of knotoids with up to three crossings. The letter k in the notation refers to a ‘knotoid’, and the first number indicates the minimal number of crossings of the given knotoid. The second number (written in subscript) distinguishes different knotoids with the same minimal number of crossings. Here, the second number follows the notation used for these knotoids in Knoto-ID (22).
Figure 3.
An example of the knotoid k2.1 and its mirror reflection, denoted k2.1m.
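As an illustration of the notation described in Figures 2 and 3, the following minimal sketch parses a flat-text label such as `k2.1` or `k2.1m` into its parts. The flat spelling (a dot in place of the subscript, a trailing `m` for the mirror reflection) is taken from the caption above; the `KnotoidLabel` class and `parse_knotoid_label` function are hypothetical names introduced here and are not part of KnotProt or Knoto-ID.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class KnotoidLabel:
    """Hypothetical container for a parsed knotoid label."""
    crossings: int   # minimal number of crossings
    index: int       # distinguishes knotoids with the same crossing number
    mirrored: bool   # True if the label ends with 'm' (mirror reflection)


# Assumed flat-text form: 'k', crossings, '.', index, optional 'm'.
_LABEL_PATTERN = re.compile(r"^k(\d+)\.(\d+)(m?)$")


def parse_knotoid_label(label: str) -> KnotoidLabel:
    """Split a label such as 'k2.1m' into crossings, index and mirror flag."""
    match = _LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a knotoid label of the assumed form: {label!r}")
    crossings, index, mirror = match.groups()
    return KnotoidLabel(int(crossings), int(index), mirror == "m")
```

For example, `parse_knotoid_label("k2.1m")` yields a label with two crossings, index 1, and the mirror flag set.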
Figure 4.
A knotoid is defined via a projection of an open 3D curve onto a plane. Its knotoid type depends on the choice of the plane.
Figure 5.
Projection globe/map. Each point in this map represents a projection direction that defines a knotoid, and its color denotes the resulting knotoid type.
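The idea behind Figures 4 and 5 can be sketched as follows: pick many projection directions on the unit sphere and project the open 3D chain onto the plane perpendicular to each direction. This is only a sketch under stated assumptions. The Fibonacci lattice used to spread directions over the sphere is a stand-in chosen here, since the captions do not say how KnotProt 2.0 samples directions, and the function names are hypothetical. Classifying the knotoid type of each planar diagram is not shown.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly evenly spread unit vectors (assumed sampling scheme)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n          # z runs from near +1 to near -1
        r = math.sqrt(max(0.0, 1.0 - z * z))   # radius of the circle at height z
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def project_onto_plane(curve, direction):
    """Project 3D points onto the plane perpendicular to a unit direction.

    Returns 2D coordinates in an orthonormal basis (u, v) of that plane.
    """
    # Pick a helper axis that is not parallel to the direction.
    helper = (1.0, 0.0, 0.0) if abs(direction[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _cross(direction, helper)
    norm = math.sqrt(_dot(u, u))
    u = (u[0] / norm, u[1] / norm, u[2] / norm)
    v = _cross(direction, u)
    return [(_dot(p, u), _dot(p, v)) for p in curve]
```

Each direction returned by `fibonacci_sphere` corresponds to one point on the projection globe; coloring that point by the knotoid type of `project_onto_plane(curve, direction)` would give a map of the kind shown in Figure 5.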
Figure 6.
An example of a fingerprint matrix. Each point in the matrix represents a subchain of the protein chain (whose beginning and end correspond to the x- and y-coordinates of this point), and its color denotes the dominant knotoid type of the corresponding subchain. The assignment of colors to the knotoid types is provided in the key. Pointing the cursor at a point corresponding to some knotoid displays its x- and y-coordinates, knotoid type, and frequency in a blue box.
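The structure of such a fingerprint matrix can be sketched in a few lines: for every subchain, collect the knotoid types observed over many projections and keep the most frequent one together with its frequency. This is a minimal sketch, not the KnotProt 2.0 implementation. The `classify_subchain` callback is a hypothetical placeholder for whatever tool returns one knotoid type per projection, and `min_length` is an assumed parameter.

```python
from collections import Counter


def dominant_type(types):
    """Return the most frequent label and its relative frequency."""
    counts = Counter(types)
    label, count = counts.most_common(1)[0]
    return label, count / len(types)


def fingerprint(chain_length, classify_subchain, min_length=2):
    """Build a fingerprint as a dict keyed by (start, end) residue indices.

    classify_subchain(start, end) is a hypothetical callback that returns a
    list of knotoid-type labels, one per sampled projection direction.
    """
    matrix = {}
    for start in range(chain_length):
        for end in range(start + min_length - 1, chain_length):
            types = classify_subchain(start, end)
            matrix[(start, end)] = dominant_type(types)
    return matrix
```

Each entry of the returned dictionary corresponds to one point of the matrix: the key gives the subchain's beginning and end, and the value gives the dominant type and its frequency, matching the information shown when pointing at a point in the plot.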
Figure 7.
Schematic presentation of the three types of interactions included in the KnotProt 2.0 database.
Figure 8.
Examples of protein structures with various covalent bonds between non-sequential residues. A disulfide bond (within hydrolase inhibitor, PDB code 2LFK), a post-translational amide bond closing the loop (within replication inhibitor, PDB code 1RPB), and a concatenation of aromatic side chains (within oxidoreductase, PDB code 3C75). The atoms of joined residues are shown and colored.
Figure 9.
A schematic presentation of an unknotted protein with a knotted, covalent loop. The protein backbone is presented in black, and the covalent bridges in orange. The knotted loop is highlighted in red.
Figure 10.
Cartoon representation of the clotting protein from the horseshoe crab (left). The structures on the right show two different trefoil knots identified based on disulfide bonds and the protein backbone. The magenta spheres indicate the positions of the cysteines that form the disulfide bonds. The red color indicates the location of the cysteine knot.
Figure 11.
Three types of cysteine knots identified in proteins. The loop-forming bridges are shown in orange, and the piercing bridge in red. The ‘N’ and ‘C’ letters denote the chain termini. The numbers in the beads representing the cysteines illustrate the sequential order of the cysteine residues.
Figure 12.
The continuous transformation of a growth factor cysteine knot into a simpler form in which all the closed, covalent loops (circular paths) are clearly unknotted. The orange stripes denote the bridges and the blue dashed line denotes the chain closure. The gray arrows indicate the parts of the chain being pushed in subsequent steps. The ‘N’ and ‘C’ letters denote the chain termini. The protein with the termini connected is equivalent to the bottom-middle scheme. A similar operation may also be performed in the case of inhibitor and cyclic cysteine knots (see the ‘Cysteine knots’ description on the KnotProt 2.0 website).
Figure 13.
The result of the ‘View details’ button on the structure visualization. The loop-forming subchains are in blue, the loop-forming bridges are in orange, and the loop piercing bridge is in red.

References

    1. Mansfield M.L. Are there knots in proteins? Nat. Struct. Biol. 1994; 1:213–214.
    2. Taylor W.R. A deeply knotted protein and how it might fold. Nature. 2000; 406:916–919.
    3. Virnau P., Mirny L.A., Kardar M. Intricate knots in proteins: Function and evolution. PLoS Comput. Biol. 2006; 2:1074–1079.
    4. Bölinger D., Sulkowska J.I., Hsu H.-P., Mirny L.A., Kardar M., Onuchic J.N., Virnau P. A Stevedore’s protein knot. PLoS Comput. Biol. 2010; 6:e1000731.
    5. Sulkowska J.I., Sulkowski P. Entangled proteins: knots, slipknots, links, and lassos. The Role of Topology in Materials. 2018; Springer; 201–226.
