Proteomics. 2019 Jul;19(14):e1800367.
doi: 10.1002/pmic.201800367.

RAId: Knowledge-Integrated Proteomics Web Service with Accurate Statistical Significance Assignment

Aleksey Y Ogurtsov et al. Proteomics. 2019 Jul.

Abstract

Mass spectrometry-based proteomics starts with the identification of peptides and proteins, which provides the basis for forming next-level hypotheses whose "validations" are often used to form even higher-level hypotheses, and so forth. Scientifically meaningful conclusions are thus attainable only if the number of falsely identified peptides/proteins is accurately controlled. For this reason, RAId has continued to be developed over the past decade. RAId employs rigorous statistics for peptide/protein identification, and hence assigns accurate P-values/E-values that can be used confidently to control the number of falsely identified peptides and proteins. The RAId web service is a versatile tool built to identify peptides and proteins from tandem mass spectrometry data. In addition to recognizing various spectrum file formats, the web service offers four peptide scoring functions and a choice of three statistical methods for assigning P-values/E-values to identified peptides. Users may upload their own protein database or use one of the available knowledge-integrated organismal databases that contain annotated information such as single amino acid polymorphisms, post-translational modifications, and their disease associations. The web service also provides a friendly interface to display, sort by different criteria, and download the identified peptides and proteins. The RAId web service is freely available at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid.

Keywords: MS/MS data analyses; knowledge-integrated database; peptide/protein identifications; proportion of false discoveries; statistical significance assignment.

Conflict of interest statement

The authors have declared no conflict of interest.

Figures

Figure 1:
Panel (A) shows the main dialog window of RAId when the Generate histogram tab is chosen. To the right of the scoring function heading, users may select a scoring function. Note that the symbol R(•) represents RAId’s implementation of scoring function • minus the undescribed/unjustified heuristics. The user may select desired PTMs alongside the 20 regular amino acids to generate the score distribution (normalized histogram) for a given MS/MS spectrum. Under the “Amino acids and PTMs” expansion tab, one can click on “Change” next to PTMs; a pop-up window, part of which is shown in panel (B), then appears and allows the user to select the desired PTMs. Panel (C) displays the score distribution obtained by scoring all possible peptides made of only the 20 regular amino acids. Panel (D) displays the score distribution obtained by scoring all possible peptides made of the 20 regular amino acids together with 31 PTMs, some of which are shown in panel (B).
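A normalized score histogram such as the ones in panels (C) and (D) can be read as an empirical score distribution. The following is a minimal sketch, not RAId's actual algorithm or output format, of how a tail probability might be read off such a histogram. The function name `tail_probability` and the bin scores and weights are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def tail_probability(bin_scores: Sequence[float],
                     bin_weights: Sequence[float],
                     observed: float) -> float:
    """Return the fraction of histogram mass at scores >= observed.

    bin_scores and bin_weights are parallel sequences (hypothetical format);
    the weights do not need to be pre-normalized.
    """
    total = sum(bin_weights)
    if total <= 0:
        raise ValueError("histogram has no mass")
    tail = sum(w for s, w in zip(bin_scores, bin_weights) if s >= observed)
    return tail / total


# Hypothetical histogram, for illustration only.
scores = [0.0, 1.0, 2.0, 3.0, 4.0]
weights = [0.5, 0.25, 0.125, 0.0625, 0.0625]

print(tail_probability(scores, weights, 3.0))
```

With these made-up weights, the two highest bins carry one eighth of the total mass, so a score of 3.0 would be matched or exceeded by that fraction of the scored peptides.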
Figure 2:
Panel (A) shows the main dialog window of RAId when the Database search tab is chosen. Under the “Amino acids and PTMs” expansion tab, one can enable the search program to consider annotated SAPs, novel PTMs, and annotated PTMs. All of these can be accessed via pop-up windows by clicking on the corresponding “Change” buttons on the right. In the example shown in the lower right corner of panel (A), two annotated SAPs are selected. This means that when the search program encounters peptides in which residue A (or S) has documented SAPs, it automatically considers all such variant peptides during the search. Panel (B) shows the pop-up window associated with annotated PTMs. In this example, 31 annotated PTMs were checked. Once all search parameters are entered and the “Submit job” button is clicked, the main dialog window is replaced by the result window. Panel (C) displays an example of the result window for a Homo sapiens dataset. The four job statuses (pending, running, retrieving, and complete) are self-explanatory. Once the job is complete, one may sort the results by a different criterion by clicking on any one of the remaining five open circles.
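To illustrate what considering variant peptides can mean in practice, here is a minimal sketch that enumerates variants of a peptide from a table of annotated substitutions. It is not RAId's implementation: the function `sap_variants`, the position-to-residues table format, the example peptide, and the restriction to one substitution per variant are all assumptions made for illustration.

```python
from typing import Dict, List


def sap_variants(peptide: str, annotated_saps: Dict[int, str]) -> List[str]:
    """Return the peptide followed by its single-substitution variants.

    annotated_saps maps a 0-based position to a string of alternative
    residues documented at that position (hypothetical format).
    Assumption: each variant carries at most one substitution.
    """
    variants = [peptide]
    for pos in sorted(annotated_saps):
        for alt in annotated_saps[pos]:
            if alt != peptide[pos]:
                # Replace the residue at pos with the alternative residue.
                variants.append(peptide[:pos] + alt + peptide[pos + 1:])
    return variants


# Hypothetical peptide and substitutions, for illustration only.
example = sap_variants("GASK", {1: "T", 2: "NG"})
print(example)
```

A search that allowed several substitutions in the same peptide would need to enumerate combinations instead, which grows the candidate list much faster.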
Figure 3:
Panel (A) shows the results window when the user sorts the search results according to “protein E-value”. With this choice, proteins are sorted in ascending order of their E-values. That is, the protein with the highest identification significance is shown first, and so on. If one clicks on the “plus” sign in front of a protein row, that row expands and all peptides mappable to that protein are displayed in ascending order of their E-values. Panel (B) shows such an expansion when the fifth protein in panel (A) is clicked. The user may also click on one of the peptides in the expanded list; this opens a pop-up window displaying the peptide-spectrum match. Panel (C) shows such an example, obtained by clicking on the peptide AVFQANQENLPILKR belonging to the fifth protein. Finally, the user may access a protein’s corresponding RefSeq page by clicking on that protein’s GI. Panel (D) shows part of the RefSeq page corresponding to the fifth protein when its accession number NP_000692 is clicked.
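The ordering described for panels (A) and (B) can be sketched as a two-level sort: proteins by ascending E-value, and the peptides within each protein by ascending E-value. The sketch below assumes downloaded results have been loaded into simple records; the `Peptide` and `Protein` classes, the field names, and the example values are hypothetical and do not reflect RAId's actual download format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Peptide:
    sequence: str
    e_value: float


@dataclass
class Protein:
    accession: str
    e_value: float
    peptides: List[Peptide] = field(default_factory=list)


def sort_by_protein_e_value(proteins: List[Protein]) -> List[Protein]:
    """Return new records with proteins and their peptides in ascending E-value order."""
    ordered = sorted(proteins, key=lambda p: p.e_value)
    return [
        Protein(p.accession, p.e_value,
                sorted(p.peptides, key=lambda x: x.e_value))
        for p in ordered
    ]


# Hypothetical results, for illustration only.
results = [
    Protein("PROT_B", 1e-3, [Peptide("PEPB", 0.5), Peptide("PEPA", 0.01)]),
    Protein("PROT_A", 1e-8, [Peptide("PEPC", 1e-4)]),
]

ranked = sort_by_protein_e_value(results)
for protein in ranked:
    print(protein.accession, [pep.sequence for pep in protein.peptides])
```

The function builds new records rather than sorting in place, so the original load order remains available if another sort criterion is applied later.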

