Comparative Study
Nucleic Acids Res. 2003 Jul 1;31(13):3701-8.
doi: 10.1093/nar/gkg519.

GlobPlot: Exploring protein sequences for globularity and disorder

Rune Linding et al. Nucleic Acids Res.

Abstract

A major challenge in the proteomics and structural genomics era is to predict protein structure and function, including identification of those proteins that are partially or wholly unstructured. Non-globular sequence segments often contain short linear peptide motifs (e.g. SH3-binding sites) which are important for protein function. We present here a new tool for discovery of such unstructured, or disordered, regions within proteins. GlobPlot (http://globplot.embl.de) is a web service that allows the user to plot the tendency within the query protein for order/globularity and disorder. We show examples with known proteins where it successfully identifies inter-domain segments containing linear motifs, and also apparently ordered regions that do not contain any recognised domain. GlobPlot may be useful in domain hunting efforts. The plots indicate that instances of known domains may often contain additional N- or C-terminal segments that appear ordered. Thus GlobPlot may be of use in the design of constructs corresponding to globular proteins, as needed for many biochemical studies, particularly structural biology. GlobPlot has a pipeline interface, GlobPipe, for advanced users to run whole-proteome analyses. GlobPlot can also be used as a generic infrastructure package for graphically displaying any propensity.
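
The abstract does not spell out how the plotted tendency is computed. A minimal sketch of one way to turn per-residue propensities into a plottable curve is a running sum along the sequence, shown below. The function name `propensity_curve` and the table `TOY_PROPENSITY` are hypothetical, and the numbers are made-up placeholders rather than the Russell/Linding set or any published scale.

```python
from itertools import accumulate

# Hypothetical placeholder values, NOT a published propensity scale.
# Assumption: positive = disorder-promoting, negative = order-promoting.
TOY_PROPENSITY = {"P": 1.0, "S": 0.5, "G": 0.5, "L": -1.0, "V": -1.0, "I": -1.0}


def propensity_curve(sequence, propensity, default=0.0):
    """Return the running sum of per-residue propensities along the sequence.

    Residues missing from the table contribute `default` (0.0 here).
    """
    return list(accumulate(propensity.get(aa, default) for aa in sequence))


if __name__ == "__main__":
    # Under the assumed sign convention the curve rises over P/S/G
    # and falls over L/V/I.
    print(propensity_curve("PSGLVI", TOY_PROPENSITY))
```

With a sum of this kind, the slope of the curve rather than its absolute height carries the signal, which is why the figure captions discuss 'down-hill' areas. Any smoothing step, such as the Savitzky-Golay filter mentioned for Figure 1, is left out of this sketch.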

Figures

Figure 1
GlobPlot predictions for human Bcl-2. The predicted disordered segments are mapped on the structure in red. The yellow helix kink is falsely predicted as a disordered segment. The green segment is not predicted by GlobPlot, probably because the algorithm has lower sensitivity in the termini due to the Savitzky-Golay filter. Blue corresponds to the globular domain of Bcl-2.
Figure 2
Propensities for disorder/globularity detection. The Russell/Linding set is the default used by the GlobPlot algorithm.
Figure 3
GlobPlot of human CREB binding protein (CBP_HUMAN). About half of the sequence appears to be in a disordered state, with long flexible regions observed at the N- and C-termini. The flexible region just after the KIX domain might be important for induced binding of the pKID domain of CREB to CBP (33,34). For further discussion of disorder in CBP/CREB see Wright et al. (2).
Figure 4
GlobPlot of bovine prion protein (P10279). The flexible N-terminal segment is easily spotted (13). The SMART ‘domain’ is, in this case, a protein signature, not a descriptor of a globular fold. The plot was created by using the ‘Create PostScript’ option.
Figure 5
Benchmarking of GlobDoms versus SMART domains. We used the GlobPipe PeakFinder to search for ‘down-hill’ areas (negative first order derivative) in the GlobPlot graph. Assuming that such regions (GlobDoms) can be patched together (and thereby define a single domain), if they overlap with or are completely embedded in a SMART domain on the same sequence, we establish a recovery of the SMART domains. Patched GlobDoms are predicted domains co-located with a known SMART domain on the same sequence. The green ‘Discovery’ bar shows how many GlobDoms are found entirely outside SMART predictions. From fragments of length ≥100, we observe that the fraction originating from the overlapped segments results in overprediction.
Figure 6
Structural analysis of disordered segments. For all segment lengths, more segments are found in sequence not predicted to be globular by SMART. However, we observe a significant amount of internal disorder, which in many cases can correspond to a loop, hinge or another type of flexible insertion in the protein. The amount of overlap is difficult to interpret; however, we have observed that GlobPlot often predicts the domain boundaries more precisely than SMART does. This is because SMART typically uses only the ‘core’ sequences to define the Hidden Markov Model for a given domain.
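
The ‘down-hill’ search described in the Figure 5 caption (regions with a negative first-order derivative) can be illustrated with a short sketch. This is not the GlobPipe PeakFinder itself. The function name `downhill_segments` and the `min_len` threshold are assumptions made for illustration, and the curve is treated as a plain list of numbers with the derivative approximated by the difference between neighbouring points.

```python
def downhill_segments(curve, min_len=2):
    """Return inclusive (start, end) index pairs of runs where the curve falls.

    A step i-1 -> i counts as down-hill when curve[i] - curve[i-1] < 0.
    Only runs of at least `min_len` consecutive falling steps are kept.
    `min_len` is a hypothetical parameter, not a documented GlobPipe option.
    """
    segments = []
    start = None  # index where the current falling run began, if any
    for i in range(1, len(curve)):
        falling = curve[i] - curve[i - 1] < 0
        if falling and start is None:
            start = i - 1
        elif not falling and start is not None:
            if (i - 1) - start >= min_len:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the curve.
    if start is not None and (len(curve) - 1) - start >= min_len:
        segments.append((start, len(curve) - 1))
    return segments


if __name__ == "__main__":
    toy_curve = [0, 1, 2, 1, 0, -1, 0, 1, 0, 1]  # made-up values
    print(downhill_segments(toy_curve))
```

On the toy curve, the three-step fall from index 2 to index 5 is reported, while the single-step dip between indices 7 and 8 falls below `min_len` and is dropped. Merging neighbouring segments and comparing them with SMART domains, as in the benchmark, would be separate steps.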

References

    1. Brenner,S. (2000) Target selection for structural genomics. Nat. Struct. Biol., 7 (suppl), 967–969.
    2. Wright,P. and Dyson,H. (1999) Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol., 293, 321–331.
    3. Letunic,I., Goodstadt,L., Dickens,N., Doerks,T., Schultz,J., Mott,R., Ciccarelli,F., Copley,R., Ponting,C. and Bork,P. (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res., 30, 242–244.
    4. Servant,F., Bru,C., Carrere,S., Courcelle,E., Gouzy,J., Peyruc,D. and Kahn,D. (2002) ProDom: automated clustering of homologous domains. Brief Bioinform., 3, 246–251.
    5. Mulder,N., Apweiler,R., Attwood,T., Bairoch,A., Barrell,D., Bateman,A., Binns,D., Biswas,M., Bradley,P., Bork,P. et al. (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res., 31, 315–318.
