Bioinformatics. 2023 Jun 1;39(6):btad365.
doi: 10.1093/bioinformatics/btad365.

AIRRSHIP: simulating human B cell receptor repertoire sequences

Catherine Sutherland et al. Bioinformatics.

Abstract

Summary: Adaptive Immune Receptor Repertoire Sequencing is a rapidly developing field that has advanced understanding of the role of the adaptive immune system in health and disease. Numerous tools have been developed to analyse the complex data produced by this technique, but work to compare their accuracy and reliability has been limited. Thorough, systematic assessment of their performance depends on the ability to produce high-quality simulated datasets with known ground truth. We have developed AIRRSHIP, a flexible and fast Python package that produces synthetic human B cell receptor sequences. AIRRSHIP uses a comprehensive set of reference data to replicate key mechanisms in the immunoglobulin recombination process, with a particular focus on junctional complexity. Repertoires generated by AIRRSHIP are highly similar to published data, and all steps in the sequence generation process are recorded. These data can be used not only to determine the accuracy of repertoire analysis tools but also, by tuning the large number of user-controllable parameters, to give insight into factors that contribute to inaccuracies in results.
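The general idea of simulating recombination while recording every step can be illustrated with a toy sketch. This is not AIRRSHIP's code or API: the function names, the short made-up gene segments and the uniform trimming and insertion lengths are all assumptions chosen for illustration, whereas AIRRSHIP draws on experimental reference data.

```python
import random

def random_nt(rng, n):
    """Return n random nucleotides (stand-in for junctional insertions)."""
    return "".join(rng.choice("ACGT") for _ in range(n))

def toy_recombine(v_seg, d_seg, j_seg, rng, max_trim=3, max_insert=4):
    """Toy V(D)J recombination: trim segment ends, add junctional insertions.

    Trim and insertion lengths are drawn uniformly here, which is an
    assumption for illustration and not a biologically informed model.
    """
    v_trim = rng.randint(0, max_trim)    # bases removed from the 3' end of V
    d5_trim = rng.randint(0, max_trim)   # bases removed from the 5' end of D
    d3_trim = rng.randint(0, max_trim)   # bases removed from the 3' end of D
    j_trim = rng.randint(0, max_trim)    # bases removed from the 5' end of J

    v_part = v_seg[:len(v_seg) - v_trim]
    d_part = d_seg[d5_trim:len(d_seg) - d3_trim]
    j_part = j_seg[j_trim:]
    vd_insert = random_nt(rng, rng.randint(0, max_insert))
    dj_insert = random_nt(rng, rng.randint(0, max_insert))

    sequence = v_part + vd_insert + d_part + dj_insert + j_part
    # The record is the ground truth: every choice made for this sequence.
    record = {
        "v_trim": v_trim, "d5_trim": d5_trim, "d3_trim": d3_trim,
        "j_trim": j_trim, "v_part": v_part, "vd_insert": vd_insert,
        "d_part": d_part, "dj_insert": dj_insert, "j_part": j_part,
    }
    return sequence, record

# Made-up segments, far shorter than real germline genes.
V_SEG = "CAGGTGCAGCTGGTGCAGTCTGG"
D_SEG = "GGTATAGCAG"
J_SEG = "ACTACTGGGGCCAGGGA"

rng = random.Random(0)  # fixed seed so the run is reproducible
sequence, record = toy_recombine(V_SEG, D_SEG, J_SEG, rng)
print(sequence)
print(record)
```

Because the record stores each trimmed segment and insertion, the final sequence can always be rebuilt from it, which is what makes the ground truth usable for benchmarking.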

Availability and implementation: AIRRSHIP is implemented in Python. It is available via https://github.com/Cowanlab/airrship and on PyPI at https://pypi.org/project/airrship/. Documentation can be found at https://airrship.readthedocs.io/.

Conflict of interest statement

None declared.

Figures

Figure 1
AIRRSHIP simulates human heavy chain BCR sequences, informed by experimental data at each step of the synthetic recombination process. Key parameters such as VDJ usage, gene trimming, junctional insertions and somatic hypermutation can all be modified by the user. Sequences in the FASTA file output can then be used as input for tools of interest, and the results easily compared to the tab-separated values (TSV) file, which acts as a record of the recombination process for each sequence.
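The comparison described in the caption might look like the following minimal sketch, which scores a tool's V gene calls against a ground-truth record. The column names `sequence_id` and `v_call`, the helper functions and the example rows are assumptions made for illustration; the actual columns in AIRRSHIP's TSV output should be taken from its documentation.

```python
import csv
import io

def load_calls(tsv_text, id_col="sequence_id", call_col="v_call"):
    """Map each sequence ID to its gene call from tab-separated text.

    The default column names are hypothetical, not AIRRSHIP's documented ones.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[id_col]: row[call_col] for row in reader}

def call_accuracy(truth, predicted):
    """Fraction of ground-truth sequences whose predicted call matches exactly."""
    if not truth:
        return 0.0
    correct = sum(
        1 for seq_id, call in truth.items() if predicted.get(seq_id) == call
    )
    return correct / len(truth)

# Made-up rows standing in for the simulator's record and a tool's output.
truth_tsv = (
    "sequence_id\tv_call\n"
    "seq1\tIGHV1-69*01\n"
    "seq2\tIGHV3-23*01\n"
    "seq3\tIGHV4-34*01\n"
)
tool_tsv = (
    "sequence_id\tv_call\n"
    "seq1\tIGHV1-69*01\n"
    "seq2\tIGHV3-23*04\n"
    "seq3\tIGHV4-34*01\n"
)

accuracy = call_accuracy(load_calls(truth_tsv), load_calls(tool_tsv))
print(f"V gene call accuracy: {accuracy:.2f}")
```

Here exact string matching counts an allele-level mismatch (`*01` versus `*04`) as an error; a gene-level comparison would strip the allele suffix before comparing.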
