BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S8.
doi: 10.1186/1471-2164-9-S2-S8.

Predicting protein disorder by analyzing amino acid sequence

Jack Y Yang et al. BMC Genomics. 2008.

Abstract

Background: Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation.

Results: Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for this task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), DisEMBL (also based on neural networks) and GlobPlot (based on disorder propensity).
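
The abstract does not list the exact features, so the following is only a minimal sketch of how per-residue features might be extracted from a sequence with a sliding window. The Kyte-Doolittle hydropathy scale, Shannon entropy as the complexity measure, the default window of 21 residues, and the function names are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale. Choosing this scale is an assumption made
# for illustration; the abstract does not say which scale was used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def window_features(sequence, window=21):
    """Return one (mean hydropathy, entropy) pair per residue.

    The window is centered on each residue and truncated at the sequence
    ends. Residues missing from the scale contribute 0.0 to the mean.
    """
    half = window // 2
    features = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        hydropathy = sum(KD.get(aa, 0.0) for aa in segment) / len(segment)
        features.append((hydropathy, shannon_entropy(segment)))
    return features
```

Each residue gets one feature vector, so the output can be fed to any per-residue classifier. Varying `window` is how one would study the effect of window length on performance.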

Conclusion: We find that adding features derived from physicochemical properties of amino acids (such as hydrophobicity and sequence complexity) and using an ensemble method both proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP regions and proteins.
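
The abstract does not say which ensemble method was used. As one common possibility, the sketch below combines per-residue labels from several predictors by majority vote; the function name, the 0/1 label encoding (1 for disordered, 0 for ordered), and the tie-breaking rule are assumptions for illustration only.

```python
def majority_vote(predictions):
    """Combine per-residue 0/1 labels from several predictors by majority vote.

    `predictions` is a list of equal-length label lists, one per predictor.
    A residue is labeled 1 (disordered) only if strictly more than half of
    the predictors vote 1; ties therefore resolve to 0 (ordered).
    """
    if not predictions:
        raise ValueError("at least one predictor output is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictor outputs must have the same length")
    combined = []
    for votes in zip(*predictions):
        combined.append(1 if 2 * sum(votes) > len(votes) else 0)
    return combined
```

Using an odd number of base predictors avoids ties altogether. Weighted voting or averaging predicted probabilities are alternative schemes that would fit the same interface.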

Figures

Figure 1. Example of IUP.
Figure 2. The effect of window length for feature extraction on the performance of IUP.
Figure 3. Comparison of our predictor (IUP) to DisEMBL, GlobPlot and PONDR VLXT.
