BMC Bioinformatics. 2006 Jan 17:7:22.
doi: 10.1186/1471-2105-7-22.

preAssemble: a tool for automatic sequencer trace data processing


Alexei A Adzhubei et al. BMC Bioinformatics.

Abstract

Background: Trace or chromatogram files (raw data) are produced by automatic nucleic acid sequencing equipment (sequencers). Each file contains information that specialised software can interpret to reveal the sequence (base calling). This is done either by the sequencer's proprietary software or by publicly available programs. Depending on the size of a sequencing project, the number of trace files can vary from just a few to thousands. Assessing sequencing quality against various criteria is important at the stage preceding clustering and contig assembly. preAssemble uses two major publicly available packages, Phred and Staden, to perform sequence quality processing.
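To make the idea of quality assessment concrete, the following minimal sketch shows how Phred quality values relate to base-call error probabilities (Q = -10 log10 P) and how a single read might be summarised from them. It is an illustration only, not preAssemble's own code or criteria: the function names, the summary fields, and the threshold of Q >= 20 are assumptions chosen for the example.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality value Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P to a Phred quality value, Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def summarize_read(quals, min_q=20):
    """Summarise per-base quality values for one read.

    `min_q` is a hypothetical threshold for a "high-quality" base;
    it is not taken from preAssemble.
    """
    n = len(quals)
    if n == 0:
        return {"length": 0, "mean_q": 0.0, "frac_high_q": 0.0, "expected_errors": 0.0}
    high = sum(1 for q in quals if q >= min_q)
    return {
        "length": n,
        "mean_q": sum(quals) / n,
        "frac_high_q": high / n,
        # Sum of per-base error probabilities = expected number of miscalled bases.
        "expected_errors": sum(phred_to_error_prob(q) for q in quals),
    }


if __name__ == "__main__":
    # Made-up quality values for a short read, for illustration only.
    print(summarize_read([10, 20, 30, 40]))
```

A read-level summary of this kind is one simple way to decide whether a sequence is good enough to pass on to clustering and assembly; real pipelines typically combine it with trimming and vector screening.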

Results: The preAssemble pre-assembly sequence processing pipeline has been developed for small- to large-scale automatic processing of DNA sequencer chromatogram (trace) data. The pipeline uses the Staden Package Pregap4 module and the base-calling program Phred, and produces detailed, self-explanatory output that can be displayed in a web browser. preAssemble can be used successfully with very little previous experience; however, options for parameter tuning are provided for advanced users. preAssemble runs under UNIX and Linux operating systems. It is available for download and runs as stand-alone software. It can also be accessed on the Norwegian Salmon Genome Project web site, where preAssemble jobs can be run on the project server.

Conclusion: preAssemble is a tool for quality assessment of sequences generated by automatic sequencing equipment. It is flexible, since both interactive jobs on the preAssemble server and a stand-alone downloadable version are available. Virtually no previous experience is necessary to run a default preAssemble job, while options for parameter tuning are provided for users who need them. Consequently, preAssemble can be used as efficiently for a few trace files as for large-scale sequence processing.


Figures

Figure 1
preAssemble output. Main HTML window of the preAssemble output and supplementary windows showing processed sequences in the Fasta format and detailed colour-coded processing results with Phred quality values, available for each sequence.

