Biochemistry. 2006 Jun 6;45(22):6873-88.
doi: 10.1021/bi0602718.

Intrinsic disorder in transcription factors

Jiangang Liu et al. Biochemistry.

Abstract

Intrinsic disorder (ID) is highly abundant in eukaryotes, which reflects the greater need for disorder-associated signaling and transcriptional regulation in nucleated cells. Although several well-characterized examples of intrinsically disordered proteins in transcriptional regulation have been reported, no systematic analysis has been carried out so far. To test for the general prevalence of intrinsic disorder in transcriptional regulation, we used the predictor of natural disordered regions (PONDR) to analyze the abundance of intrinsic disorder in three transcription factor datasets and two control sets. This analysis revealed that from 82.63 to 94.13% of transcription factors possess extended regions of intrinsic disorder, relative to 54.51 and 18.64% of the proteins in the two control datasets, which indicates a significant prevalence of intrinsic disorder in transcription factors. This propensity of transcription factors for intrinsic disorder was confirmed by cumulative distribution function analysis and charge-hydropathy plots. The amino acid composition analysis showed that all three transcription factor datasets were substantially depleted in order-promoting residues and significantly enriched in disorder-promoting residues. Our analysis of the distribution of disorder within the transcription factor datasets revealed that (a) the AT-hooks and basic regions of transcription factor DNA-binding domains are highly disordered; (b) the degree of disorder in transcription factor activation regions is much higher than that in DNA-binding domains; (c) the degree of disorder is significantly higher in eukaryotic transcription factors than in prokaryotic transcription factors; and (d) the level of alpha-MoRF (molecular recognition feature) prediction is much higher in transcription factors than in the control sets. Overall, our data reflect the fact that eukaryotes, with their well-developed gene transcription machinery, require transcription factor flexibility to be more efficient.

Figures

Figure 1
Workflow for dataset construction and non-redundancy preparation
Figure 2
Composition profiling of TFs compared to PDBs25 (A) or RandomAC_NR25 (B). The bar for a given amino acid represents the fractional difference in composition between a set of TFs and a control set. The fractional difference is calculated as (CX-Cc)/Cc, where CX is the composition of a given amino acid in a given TF dataset, and Cc is the corresponding composition in a control set of proteins (Cc1 for PDBs25 and Cc2 for RandomAC_NR25). The residues are ordered by Vihinen’s flexibility scale (43). Negative values indicate residues that are less abundant in the given TF set than in the control set; positive values indicate residues that are more abundant. Error bars show one standard deviation.
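The fractional difference described in this caption can be illustrated with a minimal sketch. The function names and the toy sequences are hypothetical and are not taken from the paper; the sketch pools residues over all sequences in a set, which is an assumption about how composition is computed, and it omits the residue ordering and the error bars.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequences):
    """Fraction of each standard amino acid, pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

def fractional_difference(tf_seqs, control_seqs):
    """(CX - Cc) / Cc per residue; residues absent from the control are skipped."""
    c_x = composition(tf_seqs)
    c_c = composition(control_seqs)
    return {
        aa: (c_x[aa] - c_c[aa]) / c_c[aa]
        for aa in AMINO_ACIDS
        if c_c[aa] > 0
    }

# Toy input (illustrative only, not real data).
diff = fractional_difference(["AAAK"], ["AAKK"])
```

Here `diff["A"]` is positive (alanine is more abundant in the toy TF set) and `diff["K"]` is negative.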
Figure 3
Summary of disorder predictions for the three TF datasets - TF_SP&TRE_NR25 (red bars), TF_SP_NR25 (pink bars), TF_NR25 (yellow bars) - and two control sets - PDBs25 (blue bars) and RandomAC_NR25 (cyan bars) - showing the fraction of proteins in each set with a predicted disordered region longer than the length indicated on the horizontal axis. Predictions are summarized for (A) PONDR VL-XT and (B) PONDR VSL1. Error bars show the 99% bootstrap confidence interval.
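The quantity plotted here, the fraction of proteins whose longest predicted disordered region exceeds a given length, together with a bootstrap confidence interval, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the per-residue scores are hypothetical stand-ins for predictor output, the 0.5 score cutoff is an assumed convention, and the percentile bootstrap with 1000 resamples is one common choice among several.

```python
import random

def longest_disordered_run(scores, cutoff=0.5):
    """Length of the longest run of consecutive residues scoring >= cutoff."""
    longest = current = 0
    for s in scores:
        current = current + 1 if s >= cutoff else 0
        longest = max(longest, current)
    return longest

def fraction_with_long_disorder(per_protein_scores, min_length, cutoff=0.5):
    """Fraction of proteins whose longest disordered run exceeds min_length."""
    hits = sum(
        longest_disordered_run(s, cutoff) > min_length
        for s in per_protein_scores
    )
    return hits / len(per_protein_scores)

def bootstrap_ci(per_protein_scores, min_length, n_resamples=1000,
                 level=0.99, seed=0):
    """Percentile bootstrap interval, resampling proteins with replacement."""
    rng = random.Random(seed)
    n = len(per_protein_scores)
    estimates = sorted(
        fraction_with_long_disorder(
            [rng.choice(per_protein_scores) for _ in range(n)], min_length
        )
        for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 + level) / 2 * n_resamples))
    return estimates[lo_idx], estimates[hi_idx]

# Hypothetical per-residue scores for three toy proteins.
toy = [[0.9] * 40, [0.1] * 40, [0.9] * 10 + [0.1] * 30]
fraction = fraction_with_long_disorder(toy, min_length=30)
low, high = bootstrap_ci(toy, min_length=30)
```

Resampling whole proteins rather than residues keeps each protein's disordered regions intact, which matches a statistic defined per protein.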
Figure 4
PONDR CDF analysis of whole protein order-disorder. (A) Example plot of the CDF curves for randomly selected proteins from the TF_SP_NR25 set, where proteins are shown by the colored lines and the order-disorder boundary is shown by the black line. (B) Fraction of each of the five sets predicted to be wholly disordered by CDF analysis. Error bars show the 99% bootstrap confidence interval.
Figure 5
CH-plot analysis of whole protein order-disorder. (A) Example plot of randomly selected proteins from the TF_SP_NR25 set in CH-space, where proteins are shown by the circles and the order-disorder boundary is shown by the black line. (B) Fraction of each of the five sets predicted to be wholly disordered by CH-plots. Error bars show the 99% bootstrap confidence interval.
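A charge-hydropathy classification of this kind can be sketched as below. Everything specific in the sketch is an assumption that does not come from this page: it uses the Kyte-Doolittle hydropathy scale rescaled to 0-1, counts only K/R and D/E toward net charge, and applies the linear boundary ⟨R⟩ = 2.785⟨H⟩ − 1.151 published by Uversky and co-workers (2000). The boundary actually drawn in the figure may differ.

```python
# Kyte-Doolittle hydropathy values (assumed scale, not stated on this page).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def ch_coordinates(seq):
    """Return (mean hydropathy scaled to 0-1, absolute mean net charge)."""
    residues = [aa for aa in seq.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids found")
    n = len(residues)
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in residues) / n
    net = sum(aa in "KR" for aa in residues) - sum(aa in "DE" for aa in residues)
    return mean_h, abs(net) / n

def predicted_disordered(seq, slope=2.785, intercept=-1.151):
    """True if the protein lies on the high-charge, low-hydropathy side
    of the assumed linear boundary."""
    mean_h, mean_r = ch_coordinates(seq)
    return mean_r > slope * mean_h + intercept

# Toy sequences (illustrative only): highly charged vs. highly hydrophobic.
charged = predicted_disordered("KKKKKKKKKK")
hydrophobic = predicted_disordered("LLLLLLLLLL")
```

With these toy inputs the poly-lysine sequence falls on the disordered side of the boundary and the poly-leucine sequence on the ordered side.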
Figure 6
α-MoRF predictions for the TF_SP_NR25 set and the two control sets. Both the fraction of proteins containing predicted α-MoRFs (white bars) and the number of predicted α-MoRFs per residue in each set (grey bars) are shown. Error bars show the 99% bootstrap confidence interval.
Figure 7
Composition profiling of C2H2 zinc-finger linkers compared to RandomAC_NR25. The bar for a given amino acid represents the fractional difference in composition between a set of linkers and a control set. The fractional difference is calculated as (CLinker-Ccontrol)/Ccontrol, where CLinker is the composition of a given amino acid in a dataset of linkers, and Ccontrol is the corresponding composition in the RandomAC_NR25 dataset. The residues are ordered by Vihinen’s flexibility scale (Vihinen, M. 1987). Negative values indicate residues that are underrepresented in the linker dataset, whereas positive values indicate residues that are overrepresented.
Figure 8
PONDRing TFIIA, TFIIB, and TATA box binding protein. The correspondence between PONDR predictions and regions of known structure is shown. Three PDB structures are shown – 1NVP (top), 1C9B (bottom right), and 1DL6 (bottom left) – where each chain in the ribbon and molecular surface representations is color coded – TATA box binding protein (yellow), TFIIA-α (blue), TFIIA-β (green), TFIIA-γ (red, PONDR prediction not shown), TFIIB core domain (maroon), and TFIIB zinc ribbon domain (purple). These color codes are also used for the bars in the three PONDR plots – (top) TFIIA, (middle) TATA box binding protein, and (bottom) TFIIB – to indicate the positions of the regions of known structure in the context of the PONDR predictions. Hash marks drawn over these bars show the residues in contact with other chains, where the color of each hash mark corresponds to the color code of the partner chain.
Figure 9
Analysis of the molecular interactions of HMGA1 using PathwayAssist™. A simplified network associated with HMGA1 is shown. HMGA1 protein acts as a ‘hub’ of nuclear function and interacts with at least 18 TFs in the nucleus.
