Genome Res. 2001 Feb;11(2):218-29.
doi: 10.1101/gr.gr-1522r.

Low-complexity regions in Plasmodium falciparum proteins

E Pizzi et al. Genome Res. 2001 Feb.

Abstract

Full-sequence data available for Plasmodium falciparum chromosomes 2 and 3 are exploited to perform a statistical analysis of the long tracts of biased amino acid composition that characterize the vast majority of P. falciparum proteins and to make a comparison with similarly defined tracts from other simple eukaryotes. When the relatively minor subset of prevalently hydrophobic segments is discarded from the set of low-complexity segments identified by current segmentation methods in P. falciparum proteins, a good correspondence is found between prevalently hydrophilic low-complexity segments and the species-specific, rapidly diverging insertions detected by multiple-alignment procedures when sequences of bona fide homologs are available. Amino acid preferences are fairly uniform in the set of hydrophilic low-complexity segments identified in the two P. falciparum chromosomes sequenced, as well as in sequenced genes from Plasmodium berghei, but differ from those observed in Saccharomyces cerevisiae and Dictyostelium discoideum. In the two plasmodial species, amino acid frequencies do not correlate with properties such as hydrophilicity, small volume, or flexibility, which might be expected to characterize residues involved in nonglobular domains but do correlate with A-richness in codons. An effect of phenotypic selection versus neutral drift, however, is suggested by the predominance of asparagine over lysine.

Figures

Figure 1
(a) Length distribution of the 415 low-complexity segments identified by the SEG program (window: 45; trigger: 3.4; extension: 3.75) in the complete set of 205 proteins predicted for Plasmodium falciparum chromosome 2. Superimposed in grey is the length distribution of the 77 internally repetitive low-complexity segments, always expressed as a percentage of the total number of SEG-identified segments. (Inset) Length distribution of the low-complexity segments identified by the SEG program (parameter setting as above) in the complete set of ORFs present on Saccharomyces cerevisiae chromosome II (Feldman et al. 1994). (b) Fractional distribution of the predicted proteins of P. falciparum chromosome 2 according to the percentage of the protein length occupied by low-complexity (l-c) segments, identified as above. (Inset) The same distribution for S. cerevisiae chromosome II.
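The window and trigger settings quoted in the Figure 1 caption can be illustrated with a simplified compositional-complexity scan. The sketch below is not the SEG program: it only flags fixed-length windows whose Shannon entropy falls at or below a threshold, and it omits SEG's extension and segment-refinement stages. The function names `shannon_entropy` and `low_entropy_windows` are hypothetical, and treating the caption's trigger value of 3.4 as an entropy threshold in bits per residue is an assumption made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy, in bits per residue, of the residue composition of `window`."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_entropy_windows(seq: str, window: int = 45, threshold: float = 3.4) -> list[int]:
    """Start indices of all windows whose entropy is at or below `threshold`.

    Simplified stand-in for a low-complexity trigger step; the defaults mirror
    the window and trigger values in the caption, but this is NOT SEG itself.
    """
    return [
        i
        for i in range(len(seq) - window + 1)
        if shannon_entropy(seq[i:i + window]) <= threshold
    ]
```

A homopolymeric asparagine tract, for instance, has zero entropy in every window and would be flagged throughout, whereas a window drawing evenly on all 20 amino acids approaches log2(20), about 4.32 bits, and would not.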
Figure 2
Number of low-complexity (l-c) segments identified in individual proteins vs. protein length for Plasmodium falciparum chromosomes 2 (open circles) and 3 (solid circles). Only a relatively small number of proteins (24 out of 205 on chromosome 2 and 13 out of 215 on chromosome 3), all shorter than ∼500 amino acids, appear to be entirely complex.
Figure 3
For each of the indicated proteins (SWALL accession nos. in Methods) a diagram is presented in which insertions (grey boxes) resulting from multialignment procedures are compared with SEG-identified low-complexity regions (segment) and with the hydrophobicity profile of the protein (Kyte and Doolittle 1982).
Figure 4
For Plasmodium falciparum chromosome-2–predicted proteins, ordinate values give the frequencies with which individual residues appear in complex regions (squares), nonrepetitive (triangles), and repetitive (circles) hydrophilic low-complexity segments, normalized with respect to frequencies averaged over the entire SWISS-PROT database (notes to release 38.0). Residues are grouped as follows: C+, positively charged; C−, negatively charged; P, uncharged polar; I, hydrophobic; A, aromatic nonpolar. Within groups, residues are ordered according to increasing hydrophobicity.
Figure 5
Skewness (A − T)/(A + T) profile of the 40-kb region of Plasmodium falciparum chromosome 2 starting at nucleotide 640,000 (Gardner et al. 1998). Arrows indicate rightward- and leftward-transcribed ORFs. Introns are shown as black rectangles. Segments (continuous or dashed) indicate SEG-identified, low-complexity segments (nonrepetitive or repetitive, respectively).
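The skewness quantity in the Figure 5 caption, (A − T)/(A + T), can be computed along a nucleotide sequence with a sliding window. The following is a minimal sketch under stated assumptions: the caption does not give the window length or step used for the profile, so the defaults here are placeholders, and the function names `at_skew` and `skew_profile` are hypothetical.

```python
def at_skew(window: str) -> float:
    """(A - T) / (A + T) for one window; returns 0.0 if the window has no A or T."""
    a = window.count("A")
    t = window.count("T")
    return (a - t) / (a + t) if (a + t) else 0.0


def skew_profile(seq: str, window: int = 1000, step: int = 100) -> list[tuple[int, float]]:
    """List of (window start, skew) pairs along `seq`.

    `window` and `step` are assumed values, not taken from the source.
    """
    seq = seq.upper()
    return [
        (i, at_skew(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]
```

Positive values mark stretches where A outnumbers T on the strand being read and negative values the reverse, which is what allows a profile of this kind to be set against the direction of transcription of the annotated ORFs.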
