PLoS Genet. 2009 Mar;5(3):e1000397.
doi: 10.1371/journal.pgen.1000397. Epub 2009 Mar 6.

Genome-wide analysis of histidine repeats reveals their role in the localization of human proteins to the nuclear speckles compartment

Eulàlia Salichs et al. PLoS Genet. 2009 Mar.

Abstract

Single amino acid repeats are prevalent in eukaryote organisms, although the role of many such sequences is still poorly understood. We have performed a comprehensive analysis of the proteins containing homopolymeric histidine tracts in the human genome and identified 86 human proteins that contain stretches of five or more histidines. Most of them are endowed with DNA- and RNA-related functions, and, in addition, there is an overrepresentation of proteins expressed in the brain and/or nervous system development. An analysis of their subcellular localization shows that 15 of the 22 nuclear proteins identified accumulate in the nuclear subcompartment known as nuclear speckles. This localization is lost when the histidine repeat is deleted, and significantly, closely related paralogous proteins without histidine repeats also fail to localize to nuclear speckles. Hence, the histidine tract appears to be directly involved in targeting proteins to this compartment. The removal of DNA-binding domains or treatment with RNA polymerase II inhibitors induces the re-localization of several polyhistidine-containing proteins from the nucleoplasm to nuclear speckles. These findings highlight the dynamic relationship between sites of transcription and nuclear speckles. Therefore, we define the histidine repeats as a novel targeting signal for nuclear speckles, and we suggest that these repeats are a way of generating evolutionary diversification in gene duplicates. These data contribute to our better understanding of the physiological role of single amino acid repeats in proteins.
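
The selection criterion named in the abstract (a stretch of five or more consecutive histidines) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_his_repeats`, the `min_len` parameter, and the toy sequences are assumptions made up for illustration, and a real screen would read sequences from a proteome file instead.

```python
import re


def find_his_repeats(sequence, min_len=5):
    """Return (start, length) for each run of at least min_len histidines.

    `sequence` is a protein in one-letter amino acid code; start is 0-based.
    """
    pattern = re.compile(r"H{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(sequence.upper())]


# Toy sequences invented for illustration only (not real proteins).
toy_proteins = {
    "toy_A": "MKTA" + "H" * 7 + "SLLQ",                # one run of 7 His
    "toy_B": "MHHAG" + "H" * 4 + "LLK",                # runs of 2 and 4: below threshold
    "toy_C": "MS" + "H" * 5 + "AA" + "H" * 9 + "Q",    # runs of 5 and 9 His
}

# Keep only the proteins that contain at least one qualifying His tract.
hits = {}
for name, seq in toy_proteins.items():
    runs = find_his_repeats(seq)
    if runs:
        hits[name] = runs

for name, runs in hits.items():
    print(name, runs)
```

The regular expression is greedy, so each uninterrupted tract is reported once at its full length and never split into overlapping shorter runs.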

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. The ability of a His-repeat to direct a heterologous protein to the nuclear speckles depends on the number of His residues in the tract.
A) HeLa cells were transfected with expression plasmids encoding fusion proteins of GFP with 6 or 9 His residues. B) Cells were transfected with expression plasmids encoding fusion proteins of GFP with 9 Pro or Gln residues, as indicated. At 48 h post-transfection, the localization of the fusion proteins was analyzed by direct fluorescence (left column, green) and by indirect immunofluorescence for SC35 (middle column, red). The merged images are also shown (right column), and the unfused GFP protein was used as a control. In all cases, co-localization with the endogenous marker was determined by confocal imaging.
Figure 2. Distribution of CAC/CAT repeat sizes in coding (A) and non-coding (B) regions.
Figure 3. Gene Ontology distribution of polyHis-containing proteins.
A) Distribution of genes annotated as ‘nucleus’, ‘cytoplasm’ (excluding ‘nucleus’) and ‘membrane’ (excluding ‘nucleus’ and ‘cytoplasm’). B) Distribution of the main functional groups in nuclear His-repeat containing proteins and a comparison with the same groups in the complete gene dataset (see Materials and Methods for more details).
Figure 4. The His-repeat is a novel nuclear speckle targeting signal.
A) HeLa cells were transfected with the expression plasmids for the fusion proteins GFP-DYRK1A, GFP-POU4F2, GFP-YY1 and GFP-NLK. Cells were immunostained for SC35 to visualize the nuclear speckles (middle column, red) and GFP fusion proteins were visualized directly by fluorescence microscopy (left column, green). Merged images are shown (right column). B) HeLa cells were transfected with the expression plasmids for HA-DYRK1AΔHis and Flag-POU4F2ΔHis, and the cells were immunostained for DYRK1A or POU4F2 (left column) and for SC35 to detect nuclear speckles (middle column). C) Soluble extracts from cells expressing HA-DYRK1A or HA-DYRK1AΔHis were subjected to immunoprecipitation with anti-HA and then in vitro kinase activity on the DYRKtide peptide was assayed. Samples were analyzed in Western blots probed with anti-HA. D) Cells were co-transfected with pGL2-3xBrn3a and pCMV-βgal together with pFlag-POU4F2 wild type (wt) or pFlag-POU4F2ΔHis (ΔHis). Transcriptional activity is presented as the ratio of luciferase and β-galactosidase; values are the means±S.D. of triplicate determinations for each condition in one representative experiment of three performed. The panel shows a Western blot of transfected extracts probed with an anti-Flag antibody.
Figure 5. The presence of a His-repeat dictates the different subcellular localization of paralogous proteins.
A) Alignment of the primary sequences of the paralogues, FAM76B (NP_653265; hypothetical protein LOC143684) and FAM76A (NP_689873; hypothetical protein LOC199870), obtained with the multiple sequence alignment program “Blast 2 Sequences” (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi). His residues in FAM76B are highlighted in red. B) HeLa cells were transfected with an expression plasmid encoding FAM76B (upper panel) or FAM76A (lower panel) fused to GFP at their N-terminus. The subcellular localization of the fusion proteins was analyzed by direct fluorescence and their accumulation in nuclear speckles was followed by immunostaining for SC35. C) Using the lines on the merged image, fluorescence intensity profiles were obtained for GFP (green) and SC35 (red).
Figure 6. The accumulation in nuclear speckles of some polyHis-containing proteins depends on the presence of other interacting domains.
A) HeLa cells were transfected with the expression plasmids for wild type GFP-MEOX2 or the mutant GFP-MEOX2ΔHB as indicated (see scheme; His: His-repeat; NLS: nuclear localization signal; HoBox: homeobox domain). B) HeLa cells were transfected with the expression plasmids for GFP-CBX4 wild type or GFP-CBX4ΔPB as indicated (see scheme: CHROMO, chromatin organization modifier domain; His, His-repeat; NLS, nuclear localization signal; CtBP2, CtBP binding domain; and RING1, RING1-interacting domain). In A) and B), the subcellular localization of the GFP-fusion proteins was analyzed by direct fluorescence (left column, green) and their accumulation in nuclear speckles by immunofluorescence for SC35 (middle column, red).
Figure 7. The transcriptional state of the cell determines whether some polyHis transcription factors accumulate in nuclear speckles.
A, B) HeLa cells were transfected with the expression plasmids encoding the GFP-FOXG1B (A) and GFP-HOXA1 (B) fusion proteins. At 36 h post-transfection, cells were treated with α-amanitin for 5 h to inhibit transcription and then processed for SC35 immunofluorescence. Fluorescence intensity profiles are shown for GFP (green) and SC35 (red), obtained from the lines on the merged images. C) The panels show the results for the same type of experiment performed on mutant HOXA1ΔHis in which the His-tract has been eliminated (see scheme: His, His-repeat; NLS, nuclear localization signal; HoBox, homeobox). D) Cells were co-transfected with pE1bG4-luc and pCMV-RNL together with pG4-DBD (-), pG4-HOXA1 wild type (wt) or pG4-HOXA1ΔHis (ΔHis), and luciferase activity was measured in triplicate plates. Values were corrected for transfection efficiency as measured by Renilla activity. Data are presented as the induction of luciferase activity above the G4-DBD transfection and the values are the means±S.D. of triplicate determinations for each condition in a representative experiment of a minimum of two performed. The panel shows a Western blot analysis of transfected extracts with an anti-Gal4-DBD antibody.
Figure 8. The His-tract participates in the dynamic properties of polyHis-containing proteins.
A) HeLa cells were transfected with the expression plasmids encoding GFP-PRICKLE3. Cells were treated with α-amanitin for 5 h to inhibit transcription and then processed for SC35 immunofluorescence. B) HeLa cells expressing the GFP-PRICKLE3 fusion protein were mock-treated or exposed to leptomycin B for 5 h, 24 h after transfection. The subcellular localization of the fusion protein was analyzed by direct fluorescence. Note that PRICKLE3 is detected in the cytosol in untreated cells but it accumulates in the nucleus, nucleoplasm and nuclear speckles in response to the inhibitor of nuclear export.
