BMC Genomics. 2009 May 28;10:249.
doi: 10.1186/1471-2164-10-249.

Biases in Drosophila melanogaster protein trap screens

Jelena Aleksic et al. BMC Genomics. 2009.

Abstract

Background: The ability to localise or follow endogenous proteins in real time in vivo is of tremendous utility for cell biology or systems biology studies. Protein trap screens utilise the random genomic insertion of a transposon-borne artificial reporter exon (e.g. encoding the green fluorescent protein, GFP) into an intron of an endogenous gene to generate a fluorescent fusion protein. Despite recent efforts aimed at achieving comprehensive coverage of the genes encoded in the Drosophila genome, the repertoire of genes that yield protein traps is still small.

Results: We analysed the collection of available protein trap lines in Drosophila melanogaster and identified potential biases that are likely to restrict genome coverage in protein trap screens. The protein trap screens investigated here primarily used P-element vectors and thus exhibit some of the same positional biases associated with this transposon that are evident from the comprehensive Drosophila Gene Disruption Project. We further found that protein trap target genes usually exhibit broad and persistent expression during embryonic development, which is likely to facilitate better detection. In addition, we investigated the likely influence of the GFP exon on host protein structure and found that protein trap insertions have a significant bias for exon-exon boundaries that encode disordered protein regions. Specifically, 38.8% of GFP insertions land in disordered protein regions compared with only 23.4% in the case of non-trapping P-element insertions landing in coding sequence introns (p < 10^-4). Interestingly, even in cases where protein domains are predicted, protein trap insertions frequently occur in regions encoding surface-exposed areas that are likely to be functionally neutral. Considering the various biases observed, we predict that fewer than one third of intron-containing genes are likely to be amenable to trapping by the existing methods.

Conclusion: Our analyses suggest that the utility of P-element vectors for protein trap screens has largely been exhausted, and that approximately 2,800 genes may still be amenable using piggyBac vectors. Thus protein trap strategies based on current approaches are unlikely to offer true genome-wide coverage. We suggest that either transposons with reduced insertion bias or recombineering-based targeting techniques will be required for comprehensive genome coverage in Drosophila.
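The comparison of insertion frequencies reported in the Results (38.8% versus 23.4%) is a comparison of two proportions. The abstract does not state which statistical test was used or the underlying counts, so the following is only a minimal sketch of one common approach, a pooled two-proportion z-test. The function name and the toy counts are hypothetical and are not taken from the paper.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test.

    Returns (z, p_value), where p_value is two-sided and based on the
    normal approximation. This is an illustrative choice of test, not
    necessarily the one used by the authors.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        raise ValueError("standard error is zero; test is undefined")
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Toy counts (hypothetical, NOT from the paper): 40 of 100 insertions in
# disordered regions for set A versus 25 of 100 for set B.
z, p = two_proportion_z_test(40, 100, 25, 100)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

With small counts, an exact test such as Fisher's exact test would usually be preferred over the normal approximation.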

Figures

Figure 1
GFP protein-trap insertions. A) Mapping insertions to gene features: the 1,522 insertions reported in the FlyTrap database were mapped to Release 5.3 of the D. melanogaster gene annotation, with the results relative to gene features shown. B) Overlap in protein trap targets: the 226 protein trap targets with matching frame information reported in FlyTrap originate from three large-scale screens and there is considerable overlap between the gene sets obtained.
Figure 2
Insertional biases. A) Number of transposable element insertions in introns: introns of canonical GFP-tagged genes were divided into 'trap' if they carried the GFP-trap insertion and 'non-trap' if not. Targets were independently analysed for P-element and piggyBac insertions. While the 'trapped' introns generally show large numbers of previously reported P-element or piggyBac insertions, the 'non-trapped' introns show significantly lower values. B) Intron length: genes with previous P-element or piggyBac insertions were divided into 'trapped' or 'non-trapped' depending on the existence of a GFP-trap insertion. The average length of the introns hit by P-element traps is 9.2 kb and that of piggyBac targets is 18.2 kb, both significantly higher than the 1.9 kb and 1.8 kb averages, respectively, for the non-trapped introns in the same genes. C) Expression duration of genes: embryonic gene expression was divided into three roughly equivalent time frames. GFP-trap targets show prolonged gene expression compared to other genes susceptible to P-element insertion (data for piggyBac not shown). The small number of genes with no embryonic expression originated from the pilot screen performed on L1 larvae.
Figure 3
Structural constraints on GFP polypeptide insertion. A) GFP trap hotspots: the left chart plots the relative frequency of GFP insertions in introns between exon-exon boundaries comprising predicted structural domains, regions of intrinsic disorder or unclassified regions. The right chart plots un-trapped introns from the same genes and shows a reduction in the intrinsically disordered category. B) Consequences of GFP insertions: in cases where GFP insertions fall into predicted structural domains, mapping of domain sequence to known structures of proteins of the same fold shows that it is mostly surface exposed areas that are affected. In both examples, the overall fold of the GFP target domain is unlikely to be affected by the insertion. Note that the linker residues and the C-terminus of GFP (blue dotted lines) are predicted to be highly flexible. The displayed examples show only one possibility for how the GFP domain is structured relative to the host protein.
Figure 4
Prediction of protein-trap targets. A) Estimated number of protein-tagging targets for genes with known expression: genes were ranked according to their highest-scoring introns. The sets of predicted genes were then compared to the reported GFP-trap target genes to determine the overlap. The graph shows the number of top-ranked genes that must be considered in order to recover 10%, 20%, etc. of previously known GFP-trap target genes. B) The predicted numbers of possible targets for P-element- and piggyBac-based GFP trap screens. The numbers were first derived separately for genes with known and genes with unknown expression, and the total prediction is the sum of the two. The predictions show the number of top-ranked genes predicted by the model that are required for 80% coverage of previously known protein trap insertion targets. C) Overlap between P-element and piggyBac targets: the inner circles of the Venn diagrams (solid lines) represent the numbers of reported P-element and piggyBac gene hits, and the outer circles (dotted lines) represent the numbers of estimated gene targets derived from our model. There is an overlap between both the reported and the predicted targets. Only reported genes that have been successfully predicted by the theoretical model are shown in this diagram.
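The quantity plotted in Figure 4A, the number of top-ranked genes needed to recover a given fraction of known GFP-trap targets, can be illustrated with a short sketch. It assumes that genes have already been ranked by some score (the scoring model itself is not described in the abstract and is not reproduced here). The function name and the gene identifiers are hypothetical placeholders.

```python
import math


def genes_needed_for_recall(ranked_genes, known_targets, recall):
    """Return the smallest k such that the top-k ranked genes contain at
    least `recall` (a fraction in (0, 1]) of `known_targets`.

    Returns None if that fraction is never reached, for example when too
    many known targets are missing from the ranking.
    """
    if not 0.0 < recall <= 1.0:
        raise ValueError("recall must be in (0, 1]")
    known = set(known_targets)
    if not known:
        raise ValueError("known_targets must not be empty")
    # Number of known targets that must be recovered; the small tolerance
    # guards against floating-point round-up in recall * len(known).
    required = math.ceil(recall * len(known) - 1e-9)
    found = 0
    for k, gene in enumerate(ranked_genes, start=1):
        if gene in known:
            found += 1
        if found >= required:
            return k
    return None


# Hypothetical ranking (best score first) and hypothetical known targets.
ranking = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF"]
targets = {"geneB", "geneC", "geneF"}

print(genes_needed_for_recall(ranking, targets, 0.5))  # → 3
print(genes_needed_for_recall(ranking, targets, 1.0))  # → 6
```

Under this reading, the 80% coverage figure in panel B corresponds to calling the function with `recall=0.8` on the model's ranking.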
