Bioinformatics. 2020 Jun 1;36(11):3594-3596.
doi: 10.1093/bioinformatics/btaa158.

immuneSIM: tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking

Cédric R Weber et al. Bioinformatics. 2020.

Abstract

Summary: B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences by tuning the following immune receptor features: (i) species and chain type (BCR, TCR, single and paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis, such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis and machine learning methods for motif detection.

Availability and implementation: The package is available via https://github.com/GreiffLab/immuneSIM and on CRAN at https://cran.r-project.org/web/packages/immuneSIM. The documentation is hosted at https://immuneSIM.readthedocs.io.

Contact: sai.reddy@ethz.ch or victor.greiff@medisin.uio.no.

Supplementary information: Supplementary data are available at Bioinformatics online.
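
To make the idea of a simulated repertoire with a fully annotated ground truth concrete, here is a minimal toy sketch in Python. It is not immuneSIM's implementation or API (immuneSIM is an R package). The germline segments, function names, parameters and the power-law abundance model are all assumptions made up for illustration. The sketch covers three of the tunable features named in the Summary: germline gene usage, insertions and deletions, and clonal abundance.

```python
import random

# Hypothetical toy germline segments (invented for illustration, not real genes).
V_GENES = {"V1": "TGTGCCAGCAGT", "V2": "TGTGCGAGAGAT"}
D_GENES = {"D1": "GGGACAGGG", "D2": "TACTATGGT"}
J_GENES = {"J1": "AACTGGTTCGAC", "J2": "TACTTTGACTAC"}


def random_insertion(rng, max_len):
    """Return a random nucleotide string of length 0..max_len."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))


def simulate_receptor(rng, v_weights=None, max_del=3, max_ins=4):
    """Recombine one toy sequence and record every simulation event."""
    v_names = sorted(V_GENES)
    # v_weights tunes V gene usage; None means uniform usage.
    v_name = rng.choices(v_names, weights=v_weights)[0]
    d_name = rng.choice(sorted(D_GENES))
    j_name = rng.choice(sorted(J_GENES))
    v_del = rng.randint(0, max_del)  # nucleotides trimmed from the V 3' end
    j_del = rng.randint(0, max_del)  # nucleotides trimmed from the J 5' end
    vd_ins = random_insertion(rng, max_ins)
    dj_ins = random_insertion(rng, max_ins)
    v_part = V_GENES[v_name][: len(V_GENES[v_name]) - v_del]
    j_part = J_GENES[j_name][j_del:]
    sequence = v_part + vd_ins + D_GENES[d_name] + dj_ins + j_part
    # The annotation is the ground truth a benchmark can score tools against.
    return {
        "sequence": sequence,
        "v_gene": v_name, "d_gene": d_name, "j_gene": j_name,
        "v_del": v_del, "j_del": j_del,
        "vd_ins": vd_ins, "dj_ins": dj_ins,
    }


def simulate_repertoire(n_clones, seed=0, alpha=2.0, total_reads=1000):
    """Simulate n_clones annotated clones with power-law clonal abundance."""
    rng = random.Random(seed)
    clones = [simulate_receptor(rng) for _ in range(n_clones)]
    # Assumed abundance model: weight of the clone at rank r is r ** -alpha.
    weights = [(rank + 1) ** -alpha for rank in range(n_clones)]
    norm = sum(weights)
    for clone, weight in zip(clones, weights):
        clone["count"] = max(1, round(total_reads * weight / norm))
    return clones


if __name__ == "__main__":
    for clone in simulate_repertoire(3, seed=1):
        print(clone)
```

Because each record stores the chosen segments, trimming lengths and inserted nucleotides, the sequence can be rebuilt exactly from its annotation. That property is what lets a simulated repertoire act as a ground truth when, for example, a germline gene annotation tool is scored against it.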

Figures

Fig. 1.
immuneSIM simulates fully (single, paired) annotated tunable immune repertoires that are either highly similar (native-like) or deviating (aberrant, see main text for definition) from experimental immune repertoires. All major immune repertoire features, such as clonal abundance, germline genes, deletions and insertions and somatic hypermutation, are tunable. Post in silico recombination, the immuneSIM-generated immune receptor repertoires may be further modified by (i) implantation of motifs, (ii) codon replacement and (iii) change of sequence similarity architecture.
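
The post-recombination motif implantation step mentioned in the caption can be illustrated with a short toy sketch in Python. It is not immuneSIM's own routine. The function names, the overwrite-in-place strategy and the `fraction` parameter are assumptions chosen for illustration. It plants a motif in a chosen fraction of sequences and records where, so that a motif detection method can later be scored against known labels.

```python
import random


def implant_motif(sequence, motif, rng):
    """Overwrite a random window of `sequence` with `motif`.

    Returns the modified sequence and the 0-based start position.
    """
    if len(motif) > len(sequence):
        raise ValueError("motif is longer than the sequence")
    pos = rng.randint(0, len(sequence) - len(motif))
    return sequence[:pos] + motif + sequence[pos + len(motif):], pos


def implant_in_repertoire(sequences, motif, fraction, seed=0):
    """Implant `motif` into a random `fraction` of sequences, with labels."""
    rng = random.Random(seed)
    n_hits = round(fraction * len(sequences))
    chosen = set(rng.sample(range(len(sequences)), n_hits))
    labeled = []
    for index, seq in enumerate(sequences):
        if index in chosen:
            new_seq, pos = implant_motif(seq, motif, rng)
            labeled.append({"sequence": new_seq, "motif_pos": pos})
        else:
            # Untouched sequences keep motif_pos=None as a negative label.
            labeled.append({"sequence": seq, "motif_pos": None})
    return labeled
```

In this sketch, unmodified sequences may still contain the motif by chance. A real benchmark would need to account for that when defining negative examples.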
