Methods Mol Biol. 2022:2453:439-446.
doi: 10.1007/978-1-0716-2115-8_22.

Immune Repertoire Analysis on High-Performance Computing Using VDJServer V1: A Method by the AIRR Community

Scott Christley et al. Methods Mol Biol. 2022.

Abstract

AIRR-seq data sets are usually large and require specialized analysis methods and software tools. A typical Illumina MiSeq sequencing run generates 20-30 million 2 × 300 bp paired-end sequence reads, which roughly corresponds to 15 GB of sequence data to be processed. Other platforms like NextSeq, which is useful in projects where the full V gene is not needed, create about 400 million 2 × 150 bp paired-end reads. Because of the size of the data sets, the analysis can be computationally expensive, particularly the early analysis steps like preprocessing and gene annotation that process the majority of the sequence data. A standard desktop PC may take 3-5 days of constant processing for a single MiSeq run, so dedicated high-performance computational resources may be required.

VDJServer provides free access to high-performance computing (HPC) at the Texas Advanced Computing Center (TACC) through a graphical user interface (Christley et al. Front Immunol 9:976, 2018). VDJServer is a cloud-based analysis portal for immune repertoire sequence data that provides access to a suite of tools for a complete analysis workflow, including modules for preprocessing and quality control of sequence reads, V(D)J gene assignment, repertoire characterization, and repertoire comparison. Furthermore, VDJServer has parallelized execution for tools such as IgBLAST, so more compute resources are utilized as the size of the input data grows. Analysis that takes days on a desktop PC might take only a few hours on VDJServer. VDJServer is a free, publicly available, and open-source licensed resource. Here, we describe the workflow for performing immune repertoire analysis on VDJServer's high-performance computing.
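
The read counts quoted in the abstract can be turned into rough base totals with simple arithmetic. The sketch below is only a back-of-the-envelope check, not part of the VDJServer workflow: the helper name `total_bases` is hypothetical, and it assumes that "GB of sequence data" means roughly one byte per base, without quality scores or compression.

```python
def total_bases(read_pairs: int, read_length_bp: int) -> int:
    """Bases in a paired-end run: two reads per pair, each read_length_bp long."""
    return read_pairs * 2 * read_length_bp


# Read counts and read lengths are the figures quoted in the abstract.
miseq_low = total_bases(20_000_000, 300)   # lower end of a MiSeq run
miseq_high = total_bases(30_000_000, 300)  # upper end of a MiSeq run
nextseq = total_bases(400_000_000, 150)    # approximate NextSeq run

# Assumption: about one byte per base, so gigabases approximate gigabytes.
print(f"MiSeq: {miseq_low / 1e9:.0f}-{miseq_high / 1e9:.0f} gigabases")
print(f"NextSeq: {nextseq / 1e9:.0f} gigabases")
```

Under that assumption, the MiSeq range of 12-18 gigabases brackets the 15 GB figure in the abstract, and a NextSeq run comes out near 120 gigabases.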

Keywords: AIRR-Seq; B-cell receptor; Cloud computing; High-performance computing; T-cell receptor.

References

    1. Christley S, Scarborough W, Salinas E, Rounds WH, Toby IT, Fonner JM et al (2018) VDJServer: a cloud-based analysis portal and data commons for immune repertoire sequences and rearrangements. Front Immunol 9:976
    2. Gupta NT, Vander Heiden JA, Uduman M, Gadala-Maria D, Yaari G, Kleinstein SH (2015) Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics 31:3356–3358
    3. Vander Heiden JA, Yaari G, Uduman M, Stern JN, O'Connor KC, Hafler DA et al (2014) pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics 30:1930–1932
    4. Christley S, Levin MK, Toby IT, Fonner JM, Monson NL, Rounds WH et al (2017) VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data. BMC Bioinformatics 18:448
    5. Toby IT, Levin MK, Salinas EA, Christley S, Bhattacharya S, Breden F et al (2016) VDJML: a file format with tools for capturing the results of inferring immune receptor rearrangements. BMC Bioinformatics 17:333