Proteomics. 2022 Jun;22(11-12):e2100209.
doi: 10.1002/pmic.202100209. Epub 2022 Mar 23.

ProSight Annotator: Complete control and customization of protein entries in UniProt XML files

Joseph B Greer et al. Proteomics. 2022 Jun.

Abstract

The effectiveness of any proteomics database search depends on the theoretical candidate information contained in the protein database. Unfortunately, candidate entries from protein databases such as UniProt rarely contain all the post-translational modifications (PTMs), disulfide bonds, or endogenous cleavages of interest to researchers. These omissions can limit discovery of novel and biologically important proteoforms. Conversely, searching for a specific proteoform becomes a computationally difficult task for heavily modified proteins. Both situations require updates to the database through user-annotated entries. Unfortunately, manually creating properly formatted UniProt Extensible Markup Language (XML) files is tedious and prone to errors. ProSight Annotator solves these issues by providing a graphical interface for adding user-defined features to UniProt-formatted XML files for better informed proteoform searches. It can be downloaded from http://prosightannotator.northwestern.edu.
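To illustrate the kind of user-annotated entry described above, here is a minimal sketch of adding a point feature to a UniProt-style XML entry by hand with Python's standard library. It is not ProSight Annotator's implementation. The accession `P00000`, the entry name, the 12-residue sequence, and the helper `add_modified_residue` are hypothetical placeholders; the namespace and the `feature`/`location`/`position` element layout follow the public UniProt XML format as understood here.

```python
import xml.etree.ElementTree as ET

NS = "http://uniprot.org/uniprot"
ET.register_namespace("", NS)  # serialize with UniProt as the default namespace

# Toy entry: accession, name, and sequence are made-up placeholders.
ENTRY_XML = f"""<uniprot xmlns="{NS}">
  <entry dataset="Swiss-Prot">
    <accession>P00000</accession>
    <name>EXAMPLE_HUMAN</name>
    <sequence length="12">MSTAGKVIKCKA</sequence>
  </entry>
</uniprot>"""


def add_modified_residue(entry, position, description):
    """Add a 'modified residue' point feature at a 1-based sequence position.

    The feature is inserted before the first <evidence> or <sequence> child,
    on the assumption that features must precede those elements in an entry.
    """
    feature = ET.Element(
        f"{{{NS}}}feature",
        {"type": "modified residue", "description": description},
    )
    location = ET.SubElement(feature, f"{{{NS}}}location")
    ET.SubElement(location, f"{{{NS}}}position", {"position": str(position)})

    trailing = {f"{{{NS}}}evidence", f"{{{NS}}}sequence"}
    children = list(entry)
    index = next(
        (i for i, child in enumerate(children) if child.tag in trailing),
        len(children),
    )
    entry.insert(index, feature)
    return feature


root = ET.fromstring(ENTRY_XML)
entry = root.find(f"{{{NS}}}entry")
add_modified_residue(entry, 2, "Phosphoserine")  # residue 2 of the toy sequence is S
annotated_xml = ET.tostring(root, encoding="unicode")
print(annotated_xml)
```

Even this small case has to get the namespace, element order, and position indexing right, which is the sort of detail that makes manual editing error-prone.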

Keywords: UniProt; bottom-up proteomics; post-translational modification; proteoforms; proteomics software; top-down proteomics.

Conflict of interest statement

The authors have declared a conflict of interest. Some authors are involved in the production of commercial software which benefits from this tool.

Figures

FIGURE 1
ProSight Annotator allows an iterative process of adding, removing, or editing isoform and feature data to create a UniProt-formatted Extensible Markup Language (XML) file suitable for building a database for top-down or bottom-up searching.
FIGURE 2
The ProSight Annotator main window contains five sections: (1) the menu, (2) isoforms table, (3) graphical annotated isoform, (4) point features, and (5) range features. It also contains information regarding currently annotated point features and proteoform counts: total proteoform count (6), selected isoform proteoform count (7), single PTM applied to a residue (8), multiple PTMs applied to a residue (9), cSNP applied to a residue (10), and custom mass modification applied to a residue (11). The main window also contains a button that launches the global feature set editor (12). A locked point feature is designated with a closed lock image (13).
FIGURE 3
The global feature set (GFS) editor allows users to edit the point features included in the GFS and perform global add and remove operations. The point features currently in the GFS are listed in the Included Features table (1), organized by feature type. Each type can be expanded to display individual PTMs (2). A column shows the number of currently applied features (3). The Potential column shows the number of times a feature would be applied if it were added to all isoforms (4). Users can globally add or remove features with the buttons in the global update feature set column (5). Features not currently included in the GFS are listed in the Available Features table (6). Features can be added to the Included Features table by clicking the button in the Include column (7). Custom mass modifications can be added to the Included Features table by clicking the “Add Custom Modification” button (8). Features are only applied to isoforms with masses below the maximum isoform mass threshold. This threshold can be updated by clicking the Advanced Parameters button (9).
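
One way to read the potential column and the mass threshold in the caption above is sketched below. This is an assumption about the counting logic, not the tool's actual code: the `Isoform` class, the `potential_count` helper, the idea that a feature targets a set of residue letters, and all sequences and masses are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Isoform:
    name: str
    sequence: str
    mass_da: float  # isoform mass in daltons (made-up values below)


def potential_count(isoforms, target_residues, max_isoform_mass_da):
    """Count the sites where a feature could be applied across all isoforms.

    Isoforms at or above the mass threshold are skipped, mirroring the
    caption's rule that features apply only to isoforms below the threshold.
    """
    total = 0
    for isoform in isoforms:
        if isoform.mass_da >= max_isoform_mass_da:
            continue
        total += sum(1 for residue in isoform.sequence if residue in target_residues)
    return total


isoforms = [
    Isoform("iso-1", "MSTSAK", 650.0),    # two serines, below the threshold
    Isoform("iso-2", "MSSSSK", 70000.0),  # four serines, but above the threshold
]
count = potential_count(isoforms, {"S"}, max_isoform_mass_da=50000.0)
print(count)  # 2
```

Under these assumptions only the two serines in `iso-1` are counted, because `iso-2` exceeds the threshold.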
