BMC Bioinformatics. 2008 Jan 7;9:4.
doi: 10.1186/1471-2105-9-4.

An open-source representation for 2-DE-centric proteomics and support infrastructure for data storage and analysis

Romesh Stanislaus et al. BMC Bioinformatics.

Abstract

Background: Although two-dimensional gel electrophoresis (2-DE) is an effective and widely used method to screen the proteome, its data standardization has not yet matured to the level of microarray genomics data or mass spectrometry approaches. The trend toward identifying encompassing data standards has been expanding from genomics to transcriptomics, and more recently to proteomics. The relative success of genomic and transcriptomic data standardization has enabled the development of central repositories such as GenBank and Gene Expression Omnibus. An equivalent 2-DE-centric data structure would similarly have to balance raw data, basic feature detection results, a sufficient description of the experimental context and methods, and an overall structure that supports a diversity of uses, from central repositories to local data representation in laboratory information management systems (LIMS).

Results & conclusion: Such a balance can only be achieved through several iterations involving bioinformaticians, bench molecular biologists, and the manufacturers of the equipment and commercial software from which the data is primarily generated. Such an encompassing data structure is described here, developed as the mature successor to the well-established and broadly used earlier version. A public repository, AGML Central, is configured with a suite of tools for conversion from a variety of popular formats, web-based visualization, and interoperation with other tools and repositories, and is particularly mass-spectrometry oriented, with I/O for annotation and data analysis.

Figures

Figure 1
Top-level view of the AGML XML data structure. A UML class diagram of AGML is available at the project website [25]. The element 'detection_parameters' (a child of 'gel_image') enables AGML 2.0 to handle DIGE gel images. Because AGML also stores the raw gel image, the image can be reanalyzed by other means. For clarity, not all elements are shown in the figure.
Figure 2
View of the <mass_spec> sub-element structure. The optional 'mass_spec' element provides the option to a) store the mass spec data in a native format provided by AGML, b) store it in another format such as mzXML or mzData, or c) add a link to a location where the data is stored, such as PRIDE. Additionally, the elements 'location' and 'pooledwith' (children of 'mass_spec') can capture the plate well where the spot was deposited and the plate well where it was pooled, respectively.
Figure 3
AGML Central web infrastructure. The AGML XML format describing a 2-DE experiment is central to the AGML Central architecture. The web infrastructure is written in PHP, a widely used general-purpose scripting language, and the XML instance documents are stored in PostgreSQL, an open source object-relational database management system. The AGML document is stored as a logical unit within the database. This eliminates the need to store the document as blobs and also provides for fast retrieval of the data. Analysis software is written in MATLAB® (The MathWorks, Inc.), a technical computing environment well suited to handling high-dimensional data.
Figure 4
AGML Central web site displaying the AGML Document Main page. This page gives access to all of the information relating to a 2-DE experiment. The owner of the page can also give permissions to others to view the experiment, check progress, and delete the submitted files (1). Collaborators of the project can also submit files to the project (2) or view the experiment using AGML Visualizer (3). They can also view the 2-DE protocol used for the experiment by clicking on the view protocol link (4). Raw images can be viewed or downloaded by going to the images link (5). All of the project files are described on this page under analysis result information (6) and can be viewed or downloaded. Additionally, the AGML XML file and MATLAB® mat files are available for download from this page for the experiment (7). Thus the MATLAB® code written for this project can be used to analyze the 2-DE data directly without further manipulation.
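
To make the 'mass_spec' options in the Figure 2 caption more concrete, here is a minimal Python sketch that builds a small AGML-like fragment with the standard library. Only the element names 'gel_image', 'detection_parameters', 'mass_spec', 'location' and 'pooledwith' come from the captions above. The root element 'agml', the 'spot' and 'data' elements, the 'mode' and 'channel' attributes, and the example URL are assumptions made for illustration, and the real AGML schema may organize these differently.

```python
import xml.etree.ElementTree as ET

# The three storage options named in the Figure 2 caption. The mode labels
# themselves are hypothetical, not taken from the AGML schema.
MODES = {"native", "embedded", "link"}


def build_mass_spec(mode, payload, well=None, pooled_with=None):
    """Build an illustrative <mass_spec> element.

    mode:        "native" (AGML's own format), "embedded" (e.g. mzXML or
                 mzData content), or "link" (e.g. a pointer to PRIDE).
    payload:     the data itself, or a URL when mode is "link".
    well:        plate well where the spot was deposited ('location').
    pooled_with: plate well where the spot was pooled ('pooledwith').
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    ms = ET.Element("mass_spec")
    # 'data' and its 'mode' attribute are hypothetical placeholders.
    data = ET.SubElement(ms, "data", {"mode": mode})
    data.text = payload
    if well is not None:
        ET.SubElement(ms, "location").text = well
    if pooled_with is not None:
        ET.SubElement(ms, "pooledwith").text = pooled_with
    return ms


def build_document():
    """Assemble a tiny document with one gel image and one spot."""
    root = ET.Element("agml")  # hypothetical root element name
    gel = ET.SubElement(root, "gel_image")
    # 'channel' is a made-up attribute standing in for DIGE settings.
    ET.SubElement(gel, "detection_parameters", {"channel": "Cy3"})
    spot = ET.SubElement(root, "spot", {"id": "s1"})  # hypothetical element
    spot.append(
        build_mass_spec(
            "link",
            "https://example.org/placeholder-record",  # placeholder URL
            well="A1",
            pooled_with="B1",
        )
    )
    return root


doc = build_document()
print(ET.tostring(doc, encoding="unicode"))
```

Because 'mass_spec' is optional, a spot without mass spectrometry data would simply omit the element. Keeping the three storage options behind one element is what lets a document point at an external repository rather than duplicate the data.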
