Front Big Data. 2020 Jun 17:3:22.
doi: 10.3389/fdata.2020.00022. eCollection 2020.

The ADC API: A Web API for the Programmatic Query of the AIRR Data Commons

Scott Christley et al. Front Big Data. 2020.

Abstract

The Adaptive Immune Receptor Repertoire (AIRR) Community is a research-driven group that is establishing a clear set of community-accepted data and metadata standards; standards-based reference implementation tools; and policies and practices for infrastructure to support the deposit, curation, storage, and use of high-throughput sequencing data from B-cell and T-cell receptor repertoires (AIRR-seq data). The AIRR Data Commons is a distributed system of data repositories that utilizes a common data model, a common query language, and common interoperability formats for storage, query, and downloading of AIRR-seq data. Here we describe the principal technical standards for the AIRR Data Commons: the AIRR Data Model for repertoires and rearrangements, the AIRR Data Commons (ADC) API for programmatic query of data repositories, a reference implementation for ADC API services, and tools for querying and validating data repositories that support the ADC API. AIRR-seq data repositories can become part of the AIRR Data Commons by implementing the data model and API. The AIRR Data Commons allows AIRR-seq data to be reused for novel analyses and empowers researchers to discover new biological insights about the adaptive immune system.
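As a rough illustration of what a programmatic query to an ADC API repository might look like, the Python sketch below only builds a JSON request body; it sends nothing over the network. The filter layout (`op`, `content`, `field`, `value`), the `size` parameter, the `/repertoire` path, and the field names are assumptions based on the public ADC API documentation, not on this abstract. The base URL and the helper names `make_filter` and `make_query` are hypothetical.

```python
import json


def make_filter(field, op, value):
    """Build one filter clause in the assumed ADC API shape."""
    return {"op": op, "content": {"field": field, "value": value}}


def make_query(filters, size=None):
    """Combine filter clauses with a logical AND into a request body."""
    if not filters:
        raise ValueError("at least one filter is required")
    if len(filters) == 1:
        combined = filters[0]
    else:
        combined = {"op": "and", "content": list(filters)}
    body = {"filters": combined}
    if size is not None:
        body["size"] = size  # assumed parameter limiting the number of results
    return body


# Hypothetical repository base URL, for illustration only.
BASE_URL = "https://repository.example.org/airr/v1"

query = make_query(
    [
        make_filter("subject.species.id", "=", "NCBITAXON:9606"),
        make_filter("sample.pcr_target.pcr_target_locus", "=", "TRB"),
    ],
    size=10,
)
payload = json.dumps(query, indent=2)

# A client would POST this payload to the repertoire endpoint.
print(f"POST {BASE_URL}/repertoire")
print(payload)
```

Keeping query construction separate from transport means the same request body could, in principle, be sent unchanged to any repository that implements the API.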

Keywords: Rep-Seq; antibody; community standards; data sharing; immunoglobulin; immunology; repertoire analysis.

Figures

Figure 1
AIRR Standards Ecosystem. The AIRR Standards, consisting of MiAIRR, the AIRR Data Model with the repertoire metadata schema and the AIRR TSV for rearrangements, and the ADC API, creates a cohesive ecosystem that makes AIRR-seq data findable, accessible, interoperable, and reusable. A typical entry point (Diamond #1) is when a researcher generates their own AIRR-seq data and analyzes it. The AIRR Standards allows the researcher to utilize a diverse set of analysis tools by providing interoperability of the data between the tools. Furthermore, when the researcher publishes (Diamond #2) their findings, the AIRR Standards facilitates conformance to the data reporting and sharing policies of international funders and journals, archival of the raw sequencing data in databases of the International Nucleotide Sequence Database Collaboration, and storage of metadata and annotations in the AIRR Data Commons. Another starting point (Diamond #3) is when a researcher explores the AIRR Data Commons using a variety of web interfaces and tools that can communicate through the ADC API, and these ADC API clients query (Diamond #4) the AIRR Data Commons on behalf of the researcher. The AIRR Standards ensures reusability of the data downloaded (Diamond #5) from the AIRR Data Commons with the same, diverse set of analysis tools. As the AIRR Data Commons grows, researchers can compare their own data (Diamond #1) with data from the AIRR Data Commons (Diamond #5) to gain novel biological insights into the adaptive immune system.
Figure 2
Repertoire and Rearrangement Relationships. A Repertoire object is linked to a single study, a single subject, any number of samples, where each sample combines the cell processing, nucleic acid processing, and sequencing run information, and any number of data processing analyses. A Rearrangement object is linked to a single Repertoire object and to a single data processing analysis object.
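The relationships in this caption can be sketched as plain Python data classes. This is a minimal illustration of the cardinalities only, not the AIRR schema itself: every class and field name below is an assumption, and the helper `rearrangements_for` is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    # Each sample bundles its processing and sequencing information.
    sample_id: str
    cell_processing: str
    nucleic_acid_processing: str
    sequencing_run: str


@dataclass
class DataProcessing:
    data_processing_id: str


@dataclass
class Repertoire:
    repertoire_id: str
    study_id: str      # exactly one study
    subject_id: str    # exactly one subject
    samples: List[Sample] = field(default_factory=list)                  # any number
    data_processing: List[DataProcessing] = field(default_factory=list)  # any number


@dataclass
class Rearrangement:
    sequence_id: str
    repertoire_id: str       # exactly one Repertoire
    data_processing_id: str  # exactly one data processing analysis


def rearrangements_for(repertoire, rearrangements):
    """Return the rearrangements linked to the given repertoire."""
    return [r for r in rearrangements if r.repertoire_id == repertoire.repertoire_id]


rep = Repertoire(
    repertoire_id="rep-1",
    study_id="study-1",
    subject_id="subject-1",
    samples=[Sample("s1", "sorted cells", "RNA", "run-1")],
    data_processing=[DataProcessing("dp-1")],
)
rows = [
    Rearrangement("seq-1", "rep-1", "dp-1"),
    Rearrangement("seq-2", "rep-2", "dp-9"),
]
linked = rearrangements_for(rep, rows)
print([r.sequence_id for r in linked])
```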
Figure 3
CDR3 AA Length Histogram from Example Use Case Walkthrough. Normalized count for CDR3 lengths from 10 to 19 amino acids is shown for four T cell subsets. As only partial data was downloaded in the example for illustrative purposes, this figure should not be construed as representing the true length distribution of the repertoires.
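A normalized CDR3 length histogram of this kind could be computed roughly as follows. The caption does not say how the counts were normalized, so this sketch assumes each count is divided by the number of sequences whose length falls in the plotted range. The function name and the toy sequences are made up for illustration and are not data from the paper.

```python
from collections import Counter


def normalized_length_histogram(cdr3_aa_sequences, min_len=10, max_len=19):
    """Map each length in [min_len, max_len] to its fraction of in-range sequences."""
    lengths = [len(seq) for seq in cdr3_aa_sequences if seq]
    in_range = [n for n in lengths if min_len <= n <= max_len]
    counts = Counter(in_range)
    total = len(in_range)
    if total == 0:
        # No usable sequences: report an all-zero histogram.
        return {n: 0.0 for n in range(min_len, max_len + 1)}
    return {n: counts.get(n, 0) / total for n in range(min_len, max_len + 1)}


# Toy amino acid strings, invented purely to exercise the function.
toy_sequences = ["A" * 10, "A" * 10, "A" * 12, "A" * 25, ""]
hist = normalized_length_histogram(toy_sequences)
for length, fraction in hist.items():
    print(length, round(fraction, 3))
```

Computing one such histogram per T cell subset and plotting them side by side would give a figure of the same general form.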
