BMC Bioinformatics. 2009 Feb 11;10:59.
doi: 10.1186/1471-2105-10-59.

The Drosophila melanogaster PeptideAtlas facilitates the use of peptide data for improved fly proteomics and genome annotation

Sandra N Loevenich et al. BMC Bioinformatics. 2009.

Abstract

Background: Crucial foundations of any quantitative systems biology experiment are correct genome and proteome annotations. Protein databases compiled from high quality empirical protein identifications that are in turn based on correct gene models increase the correctness, sensitivity, and quantitative accuracy of systems biology genome-scale experiments.

Results: In this manuscript, we present the Drosophila melanogaster PeptideAtlas, a fly proteomics and genomics resource of unsurpassed depth. Based on peptide mass spectrometry data collected in our laboratory, the portal http://www.drosophila-peptideatlas.org allows users to query observed fly protein data for gene model confirmation and splice site verification, and to identify proteotypic peptides suited for targeted proteomics studies. Additionally, the database provides consensus mass spectra for observed peptides along with qualitative and quantitative information about the number of observations of a particular peptide and the sample(s) in which it was observed.

Conclusion: PeptideAtlas is an open access database for the Drosophila community with several features and applications that support (1) reducing the complexity inherently associated with performing targeted proteomic studies, (2) designing and accelerating shotgun proteomics experiments, (3) confirming or questioning gene models, and (4) adjusting gene models so that they are in line with observed Drosophila peptides. While the database consists of proteomic data, users need not be proteomics experts.

Figures

Figure 1
The placement of PeptideAtlas. PeptideAtlas bridges global and targeted proteomics experiments. Protein extracts were prepared from diverse in vivo and in vitro samples. The proteins were digested and, after additional fractionation steps, analyzed via LC-MS/MS. The spectra were mapped to the peptides that likely gave rise to them using sequence database searching, and the mappings were then statistically validated. Spectrum-centric data mining was facilitated by a relational database (SBEAMS). PeptideAtlas construction then focused on reorganizing the data in a peptide-centric manner: data accumulated over many experiments, generated over the course of several months or years, were reduced in complexity and condensed around the actual peptide entities. This now allows retrieval of high-quality proteotypic peptides as well as gene model validation based on expressed peptide sequences.
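The peptide-centric reorganization described in this caption can be pictured as a simple grouping step. The following is only a minimal sketch under stated assumptions, not the PeptideAtlas or SBEAMS implementation: the function name `build_peptide_index`, the tuple layout, and the sample names are hypothetical, and the input is assumed to contain only identifications that already passed statistical validation.

```python
from collections import defaultdict


def build_peptide_index(identifications):
    """Condense spectrum-level identifications into one record per peptide.

    `identifications` is an iterable of (peptide_sequence, sample_name)
    pairs, one pair per validated spectrum (a hypothetical layout).
    """
    index = defaultdict(lambda: {"n_obs": 0, "samples": set()})
    for peptide, sample in identifications:
        entry = index[peptide]
        entry["n_obs"] += 1           # how often the peptide was observed
        entry["samples"].add(sample)  # in which samples it was observed
    return dict(index)


# Hypothetical input: the sequences appear in the figure captions,
# the sample names and counts are invented for illustration.
identifications = [
    ("NPEIDNLVNER", "sample_A"),
    ("NPEIDNLVNER", "sample_B"),
    ("PSIASITAPGSASAPAPVPSAAPTK", "sample_A"),
]
peptide_index = build_peptide_index(identifications)
```

Keeping the set of samples next to the observation count mirrors the two kinds of information the abstract says the database reports for each peptide.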
Figure 2
GBrowse on the FlyBase website. The figure shows a screenshot of the genome browser GBrowse on the FlyBase website. The gene model of shibire (shi) on the forward strand of the X chromosome is currently known to have 9 different mRNAs encoding 2 different proteins. 29 peptides are displayed that map to only one location in the genome (shi) and have been observed at least twice. 23 of these peptides lie within exons. In addition, 6 peptides lie across 2 exons and cover 5 splice sites. The peptides are hyperlinked to the PeptideAtlas website where more information about each peptide is available.
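The display criteria in this caption (a single genomic location, at least two observations) and the split between peptides lying within one exon and peptides lying across two exons can be sketched as follows. This is an illustrative sketch only: the record fields, coordinates, and function names are hypothetical and do not reflect how FlyBase or PeptideAtlas store their data.

```python
def select_displayed(peptides, min_obs=2):
    """Keep peptides that map to exactly one genomic location
    and were observed at least `min_obs` times."""
    return [p for p in peptides
            if p["n_locations"] == 1 and p["n_obs"] >= min_obs]


def classify(blocks, exons):
    """Classify a peptide by the exons its aligned blocks fall into.

    `blocks` and `exons` are lists of inclusive (start, end) genomic
    intervals. A peptide aligned across a splice site has two blocks.
    """
    def exon_of(block):
        for i, (start, end) in enumerate(exons):
            if start <= block[0] and block[1] <= end:
                return i
        return None

    hits = [exon_of(b) for b in blocks]
    if None in hits:
        return "unexplained"       # some block lies outside every exon
    if len(set(hits)) == 1:
        return "within_exon"
    return "splice_spanning"


# Toy gene model and peptides; all coordinates are invented.
exons = [(100, 199), (300, 399)]
peptides = [
    {"id": "pep1", "n_locations": 1, "n_obs": 5, "blocks": [(120, 152)]},
    {"id": "pep2", "n_locations": 1, "n_obs": 2,
     "blocks": [(190, 199), (300, 322)]},
    {"id": "pep3", "n_locations": 1, "n_obs": 1, "blocks": [(310, 342)]},
    {"id": "pep4", "n_locations": 2, "n_obs": 9, "blocks": [(130, 162)]},
]
shown = select_displayed(peptides)
labels = {p["id"]: classify(p["blocks"], exons) for p in shown}
```

In this toy model a splice-spanning peptide confirms one exon junction, which is the sense in which the 6 peptides in the caption cover splice sites.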
Figure 3
A 'lost' peptide in CG30084. Shown here is part of the gene CG30084 (it was called tun in release 4) of the FlyBase annotation. Within the FlyBase release 3.2 it was annotated to encode the 4 protein isoforms CG30084-PA, CG30084-PB, CG30084-PC, and CG30084-PD (shown in dark grey). The PeptideAtlas peptide PAp00061581 (PSIASITAPGSASAPAPVPSAAPTK) was part of the splice variant CG30084-PB (red frame, PAp00061581 highlighted in yellow). The annotations of the subsequent release 4.3 are shown in beige color. As one can see, none of the 4 isoforms in this newer release (CG30084-PA,-PC,-PE, and -PF) can account for the observed peptide.
Figure 4
A peptide highlights a missing splice form. Part of the gene model of the Na pump alpha subunit (Atpα, CG5670) is depicted. Against the black background, different types of sequence data are displayed: several predictions (light pink, purple, and different shades of turquoise), conserved coding regions (bright yellow), cDNA alignments (greens), and peptides from the PeptideAtlas (bright pink). Against the light blue background, alternative splice forms annotated in release 5.12 are shown in dark blue. The PeptideAtlas peptide PAp00073066 was identified in a 6-frame search and maps within the Atpα gene region. Note that while prediction algorithms postulate an alternative exon in this region, there are no supporting cDNAs (nor ESTs; not shown). The splice variant Atpα-PI, added in FlyBase annotation release 5.11, now accounts for the identified peptide sequence, NPEIDNLVNER. The codon for the last residue of the peptide spans the adjacent intron, thus supporting the annotated splice sites.
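A 6-frame search, as mentioned in this caption, compares peptides against the translations of a DNA sequence in all three reading frames on both strands. The sketch below shows the general idea only, using the standard genetic code. It is not the search engine used by the authors, the function names are hypothetical, and the DNA string is a synthetic sequence constructed for illustration, not the Atpα genomic sequence.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame(dna):
    """Return translations keyed by (strand, offset)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def find_peptide(dna, peptide):
    """List the frames whose translation contains the peptide."""
    return [frame for frame, protein in six_frame(dna).items()
            if peptide in protein]


# Synthetic DNA (an assumption, not real genomic sequence): two padding
# bases followed by one possible encoding of NPEIDNLVNER.
synthetic_dna = "GG" + "AATCCTGAAATTGATAATCTTGTTAATGAACGT"
frames_found = find_peptide(synthetic_dna, "NPEIDNLVNER")
```

A translation of contiguous genomic DNA contains only peptides encoded without interruption, so a residue whose codon is split by an intron, as described for the last residue here, needs splice-aware handling that this sketch does not attempt.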
