Comparative Study
Nucleic Acids Res. 2003 Feb 1;31(3):944-52.
doi: 10.1093/nar/gkg189.

Improving the performance of DomainParser for structural domain partition using neural network

Jun-tao Guo et al. Nucleic Acids Res.

Abstract

Structural domains are considered the basic units of protein folding, evolution, function and design. Automatic decomposition of protein structures into structural domains remains a challenging and unsolved problem despite many years of investigation, and manual inspection still plays a key role in domain decomposition. We previously developed a computer program, DomainParser, that uses network flow algorithms. The algorithm partitions a protein structure into domains accurately when the number of domains is known. However, its performance drops when this number is not known in advance (the overall performance is 74.5% on a set of 1317 protein chains). Using various types of structural information, including the hydrophobic moment profile, we have developed an effective method for assessing the most probable number of domains in a structure. The core of this method is a neural network trained to discriminate correctly partitioned domains from incorrectly partitioned ones. Measured against the manual decomposition results given in the SCOP database, our new algorithm achieves a higher decomposition accuracy (81.9%) on the same data set.
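
The scoring step described in the abstract can be illustrated with a minimal sketch. Only the layer sizes (nine inputs, six hidden nodes, one output) come from the Figure 5 caption. The sigmoid activation, the random placeholder weights, the function names `init_network`, `score_domain` and `pick_partition`, and the rule of choosing the partition whose worst-scoring domain is highest are all assumptions made for illustration; they are not taken from the paper and do not reproduce the trained DomainParser network.

```python
import math
import random

N_INPUT, N_HIDDEN = 9, 6  # layer sizes stated in the Figure 5 caption


def sigmoid(x: float) -> float:
    """Logistic activation (an assumption; the abstract does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(seed: int = 0):
    """Return placeholder, untrained weights for a 9-6-1 network.

    Each weight row stores its bias as the last element.
    """
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1.0, 1.0) for _ in range(N_INPUT + 1)]
              for _ in range(N_HIDDEN)]
    output = [rng.uniform(-1.0, 1.0) for _ in range(N_HIDDEN + 1)]
    return hidden, output


def score_domain(features, network) -> float:
    """Forward pass: map nine domain features to a score in (0, 1)."""
    if len(features) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} features, got {len(features)}")
    hidden_w, output_w = network
    # zip stops after the nine inputs, so row[-1] is used only as the bias.
    hidden_act = [
        sigmoid(row[-1] + sum(w * x for w, x in zip(row, features)))
        for row in hidden_w
    ]
    return sigmoid(output_w[-1] +
                   sum(w * h for w, h in zip(output_w, hidden_act)))


def pick_partition(candidates, network):
    """Hypothetical selection rule: keep the candidate domain count whose
    lowest-scoring domain is highest.

    candidates maps a number of domains to a list of per-domain
    feature vectors for that partition.
    """
    return max(
        candidates,
        key=lambda k: min(score_domain(f, network) for f in candidates[k]),
    )
```

In use, each candidate partition produced for a chain would be converted to per-domain feature vectors, scored with `score_domain`, and compared with `pick_partition`. With trained weights in place of the placeholders, a high output would indicate a domain that resembles the correctly partitioned examples.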

Figures

Figure 1. Protein structure of 2bb2 (β B2-crystallin) and its schematic representation as a flow network. (A) The ball–stick representation of the protein structure. (B) Schematic representation of the flow network of 2bb2.
Figure 2. The zero- and second-order spherical moment profiles of the protein 2i1b and the second-order spherical moment profile of the decoy structure of 2i1b. The zero-order moment shown has been multiplied by 30.
Figure 3. The distributions of the values of parameters Cd, Id and Td in well-partitioned domains and overcut domains. (Top) Cd, compactness of a domain. (Middle) Id, interface size relative to the domain volume. (Bottom) Td, relative motion between domains of each partition.
Figure 4. Domain size and the number of segments of each domain in well-partitioned and overcut domains. (Top) Distributions of the number of segments. (Bottom) Distributions of domain sizes.
Figure 5. Neural network architecture for evaluation of decomposed individual domains. This network has nine input nodes, six hidden nodes and one output node. The nine input parameters are shown on the left.
Figure 6. The frequency of true positive assignments plotted against the output of the neural network.
Figure 7. CPU time of running new DomainParser as a function of the number of residues for 278 two-domain chains in FSSP.
Figure 8. Domain decompositions of 2adma and 1hzda by DomainParser. SCOP assigns both 2adma and 1hzda as single-domain proteins. The thick ribbons and thin strands show different domains. (A) 2adma (21–243/244–413). (B) 1hzda (74–279/280–339).
Figure 9. Single α-helix and simple structures are assigned as separate domains by SCOP. (A) 1aaya (103–131/132–159/160–187). (B) 6prch (1–36/37–258). (C) 1d0ab (150–501/334–349). (D) 1zmec (31–66/67–100). DomainParser defines them as single-domain proteins.
Figure 10. Domain assignments of 1plq, 1ig3a and 1b8sa by SCOP. All three are defined as two-domain proteins. Thick ribbons and thin strands show different domains assigned by SCOP. (A) 1plq (1–126/127–258). (B) 1ig3a (179–263/10–178). (C) 1b8sa (319–450/9–318;451–506).
