Bioinformatics. 2014 Jul 1;30(13):1930-2.
doi: 10.1093/bioinformatics/btu138. Epub 2014 Mar 10.

pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires

Jason A Vander Heiden et al. Bioinformatics. 2014.

Abstract

Driven by dramatic technological improvements, large-scale characterization of lymphocyte receptor repertoires via high-throughput sequencing is now feasible. Although promising, the high germline and somatic diversity, especially of B-cell immunoglobulin repertoires, presents challenges for analysis requiring the development of specialized computational pipelines. We developed the REpertoire Sequencing TOolkit (pRESTO) for processing reads from high-throughput lymphocyte receptor studies. pRESTO processes raw sequences to produce error-corrected, sorted and annotated sequence sets, along with a wealth of metrics at each step. The toolkit supports multiplexed primer pools, single- or paired-end reads and emerging technologies that use single-molecule identifiers. pRESTO has been tested on data generated from Roche and Illumina platforms. It has a built-in capacity to parallelize the work between available processors and is able to efficiently process millions of sequences generated by typical high-throughput projects.

Availability and implementation: pRESTO is freely available for academic use. The software package and detailed tutorials may be downloaded from http://clip.med.yale.edu/presto.
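
The abstract mentions error correction for protocols that use single-molecule identifiers (UIDs). As a rough illustration of the general idea only, and not pRESTO's actual implementation or command-line interface, the Python sketch below groups reads by a leading UID and builds a per-position majority-vote consensus for each group. The function names, the assumption that the UID occupies the first `uid_length` bases of each read, and the tie-to-`N` rule are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def group_by_uid(reads, uid_length):
    """Group reads by their leading UID (assumed to be the first uid_length bases)."""
    groups = defaultdict(list)
    for read in reads:
        uid, insert = read[:uid_length], read[uid_length:]
        groups[uid].append(insert)
    return dict(groups)


def consensus(sequences):
    """Majority-vote consensus per position; ties become 'N'.

    Assumes the sequences are already aligned and of equal length;
    zip() silently truncates to the shortest sequence otherwise.
    """
    result = []
    for column in zip(*sequences):
        counts = Counter(column).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            result.append("N")  # no clear majority at this position
        else:
            result.append(counts[0][0])
    return "".join(result)


def uid_consensus(reads, uid_length):
    """Return one consensus sequence per UID group."""
    groups = group_by_uid(reads, uid_length)
    return {uid: consensus(seqs) for uid, seqs in groups.items()}


if __name__ == "__main__":
    # Toy reads: a 3-base UID followed by the insert; one read carries an error.
    reads = ["AAACGTACGT", "AAACGTACGA", "AAACGTACGT", "CCCTTTTGGG"]
    for uid, seq in uid_consensus(reads, uid_length=3).items():
        print(uid, seq)
```

In this toy input, the single mismatching base in the second read is outvoted by the other two reads that share its UID. A production pipeline would also need to handle quality scores, unequal read lengths and errors within the UID itself, none of which are modeled here.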

Figures

Fig. 1.
Example workflow diagram. Example workflows for single-end read sequencing protocols with sample barcoding (left) and paired-end read protocols with or without UID barcoding (right). Single sequence file inputs are shown with single arrowheads, and parallel processing of two paired-end read files is shown with a double arrowhead.
