PLoS One. 2015 Sep 14;10(9):e0137436.
doi: 10.1371/journal.pone.0137436. eCollection 2015.

SATRAP: SOLiD Assembler TRAnslation Program

Davide Campagna et al. PLoS One. 2015.

Abstract

SOLiD DNA sequences are typically analyzed using a reference genome, while they are not recommended for de novo assembly of genomes or transcriptomes. This is mainly due to the difficulty in translating the SOLiD color-space data into normal base-space sequences. In fact, the nature of color-space is such that any misinterpreted color leads to a chain of further translation errors, producing totally wrong results. Here we describe SATRAP, a computer program designed to efficiently translate de novo assembled color-space sequences into a base-space format. The program was tested and validated using simulated and real transcriptomic data; its modularity allows an easy integration into more complex pipelines, such as Oases for RNA-seq de novo assembly. SATRAP is available at http://satrap.cribi.unipd.it, either as a multi-step pipeline incorporating several tools for RNA-seq assembly or as an individual module for use with the Oases package.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Flowchart of the color-translation process.
Step 1: the first base (FTB) of each read can be translated from color-space with high accuracy; for each read, the FTB is mapped onto the contig. Step 2: check the color coherence of each FTB with its neighboring FTBs. Three conditions can be detected: a) FTBs coherent with their neighboring FTBs on both sides (such as the 'A' at the centre of the figure); b) FTBs coherent on only one side (such as the 'G' that is coherent with the 'A', but not with the 'C'); c) FTBs coherent on neither side (such as the 'A' circled in red). The latter are removed from the assembly. Steps 3 and 4: find regions delimited by two reliable start sites and translate them from color-space into base-space. Any remaining regions are incoherent in terms of color compatibility. To resolve these regions, a threshold for color reliability is calculated (Step 5), and the resulting value is used to establish the critical regions of the contig (Step 6).
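
The coherence check of Step 2 and the region translation of Steps 3 and 4 can be sketched as follows. This is a simplified illustration, not the SATRAP implementation: the function names, the anchor representation as `(position, base)` pairs, and the convention that `contig_colors[k]` encodes the transition between contig bases `k` and `k+1` are all assumptions. The reliability threshold of Steps 5 and 6 is not modeled.

```python
BASES = "ACGT"
CODE = {b: i for i, b in enumerate(BASES)}


def coherent(contig_colors, pos_a, base_a, pos_b, base_b):
    """True if walking the contig colors from base_a at pos_a ends on base_b at pos_b."""
    acc = CODE[base_a]
    for c in contig_colors[pos_a:pos_b]:
        acc ^= int(c)  # apply each color transition in turn
    return acc == CODE[base_b]


def classify(contig_colors, anchors):
    """Count, for each anchor, how many adjacent anchors it is coherent with.

    Anchors at the contig ends have only one neighbor, so their count is at most 1.
    """
    anchors = sorted(anchors)
    result = []
    for i, (pos, base) in enumerate(anchors):
        n = 0
        if i > 0:
            p_pos, p_base = anchors[i - 1]
            n += coherent(contig_colors, p_pos, p_base, pos, base)
        if i < len(anchors) - 1:
            n_pos, n_base = anchors[i + 1]
            n += coherent(contig_colors, pos, base, n_pos, n_base)
        result.append((pos, base, n))
    return result


def translate_region(contig_colors, pos_a, base_a, pos_b):
    """Translate the colors between two anchor positions, starting from base_a."""
    out = [base_a]
    for c in contig_colors[pos_a:pos_b]:
        out.append(BASES[CODE[out[-1]] ^ int(c)])
    return "".join(out)


# Illustrative contig (color-space form of ATGGCATCGA) with one wrong anchor at 5.
contig_colors = "310313232"
anchors = [(0, "A"), (3, "G"), (5, "T"), (7, "C"), (9, "A")]
labels = classify(contig_colors, anchors)
region = translate_region(contig_colors, 0, "A", 3)
print(labels, region)
```

In this toy example the anchor at position 5 is coherent with neither neighbor and would be discarded, while the region between the anchors at positions 0 and 3 can be translated directly.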
Fig 2. Effect of sequence coverage on color translation.
ASID, SATRAP and SOPRA were used to translate the color-space assemblies produced at different sequence coverage into base-space. The same set of reads was also assembled in base-space as a control.

