Nucleic Acids Res. 2004 Feb 3;32(2):742-8.
doi: 10.1093/nar/gkh257. Print 2004.

High-throughput protein analysis integrating bioinformatics and experimental assays

Coral del Val et al. Nucleic Acids Res.

Abstract

The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis, including similarity searches, determination of protein domain architecture, and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administering relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins.
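As an illustration of what a "physicochemical characteristic" computed per ORF might look like, the sketch below calculates the grand average of hydropathy (GRAVY) of a protein sequence using the Kyte-Doolittle scale. This is only an assumed example of such a property; the abstract does not say which properties or tools the pipeline computes, and the function name `gravy` is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence.

    Hypothetical helper for illustration; it is not part of the
    pipeline described in the paper.
    """
    residues = protein.upper()
    if not residues:
        raise ValueError("empty sequence")
    unknown = set(residues) - KYTE_DOOLITTLE.keys()
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)
```

A positive value suggests an overall hydrophobic protein and a negative value a hydrophilic one; in a pipeline of this kind such a number would simply be stored alongside the other per-ORF annotations.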

Figures

Figure 1
Pipeline description. A new integrated and automated strategy for obtaining and administering data from high-throughput investigations of cDNAs. The inputs to the pipeline are full-length cDNAs (FL-cDNAs) extracted from cDNAs generated at large scale. Verified ORFs are cloned into expression vectors to be used in high-throughput experimental assays. Additionally, all identified ORFs undergo an exhaustive bioinformatics analysis by running automated tasks (DomainSweep, 2DSweep and ProtSweep). These tasks are part of the W3H task framework, which allows the integration of heterogeneous applications to create tailor-made analysis flows. Both computational and experimental results are integrated into a relational database (‘Core Database’). The core database consists of several individual databases, which allow the researcher to cross-check different protein features in silico by simply executing appropriate SQL queries via web browsers or clients.
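The kind of cross-check described in the caption can be sketched as a join between computational and experimental results. The example below is a minimal illustration only: it uses Python's built-in `sqlite3` with an in-memory database instead of MS SQL-Server so that it runs anywhere, and the table names (`orf`, `domain_hit`, `localization`), column names, and sample rows are all hypothetical, since the source does not describe the actual schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with made-up ORF annotations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orf (orf_id TEXT PRIMARY KEY, length_aa INTEGER);
        -- Bioinformatic result: predicted domains per ORF (hypothetical).
        CREATE TABLE domain_hit (orf_id TEXT, source_db TEXT, domain TEXT);
        -- Experimental result: observed subcellular localization (hypothetical).
        CREATE TABLE localization (orf_id TEXT, compartment TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO orf VALUES (?, ?)",
        [("orf_a", 412), ("orf_b", 230), ("orf_c", 655)],
    )
    conn.executemany(
        "INSERT INTO domain_hit VALUES (?, ?, ?)",
        [
            ("orf_a", "PROSITE", "protein kinase"),
            ("orf_b", "INTERPRO", "zinc finger"),
            ("orf_c", "PRINTS", "protein kinase"),
        ],
    )
    conn.executemany(
        "INSERT INTO localization VALUES (?, ?)",
        [("orf_a", "nucleus"), ("orf_b", "nucleus"), ("orf_c", "cytoplasm")],
    )
    conn.commit()
    return conn


def orfs_with_domain_in_compartment(conn, domain, compartment):
    """Return ORF ids that have the predicted domain AND the observed localization."""
    rows = conn.execute(
        """
        SELECT DISTINCT o.orf_id
        FROM orf AS o
        JOIN domain_hit AS d ON d.orf_id = o.orf_id
        JOIN localization AS l ON l.orf_id = o.orf_id
        WHERE d.domain = ? AND l.compartment = ?
        ORDER BY o.orf_id
        """,
        (domain, compartment),
    ).fetchall()
    return [row[0] for row in rows]
```

With the sample data, calling `orfs_with_domain_in_compartment(build_demo_db(), "protein kinase", "nucleus")` selects only the ORF that carries a predicted kinase domain and was also observed in the nucleus, which is the sort of target-selection question the caption has in mind.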
