[Preprint]. 2023 Jul 18:2023.03.17.533215.
doi: 10.1101/2023.03.17.533215.

A machine-readable specification for genomics assays

A Sina Booeshaghi et al. bioRxiv.

Abstract

Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries. We present seqspec, a machine-readable specification for libraries produced by genomics assays that facilitates standardization of preprocessing and enables tracking and comparison of genomics assays. The specification and the associated seqspec command-line tool are available at https://github.com/IGVF/seqspec.

Figures

Figure 1: The structure of reads sequenced from genomics libraries.
Sequencing libraries are constructed by combining Atomic Regions to form an adapter-insert-adapter construct. The seqspec for the assay annotates the construct with Regions and Meta Regions.
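The nesting described in the Figure 1 caption can be pictured as a small tree. The Python sketch below is illustrative only: the `Region` class, its field names, the region names, and all lengths are hypothetical and are not taken from the seqspec schema. It assumes that a region with no children plays the role of an Atomic Region and that a region with children plays the role of a Meta Region.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """Hypothetical annotation node; NOT the actual seqspec schema."""
    name: str
    length: int = 0  # length in bases; only meaningful for atomic regions
    children: List["Region"] = field(default_factory=list)

    def is_atomic(self) -> bool:
        # Assumption: a region without children is an atomic region.
        return not self.children

    def leaves(self) -> List["Region"]:
        # Collect atomic regions in 5' to 3' order.
        if self.is_atomic():
            return [self]
        out: List["Region"] = []
        for child in self.children:
            out.extend(child.leaves())
        return out

    def total_length(self) -> int:
        # Length of the full construct is the sum of its atomic regions.
        return sum(leaf.length for leaf in self.leaves())


# Illustrative adapter-insert-adapter construct; names and lengths are made up.
library = Region("library", children=[
    Region("adapter_5p", children=[Region("primer_5p", 29)]),
    Region("insert", children=[
        Region("barcode", 16),
        Region("umi", 12),
        Region("cdna", 90),
    ]),
    Region("adapter_3p", children=[Region("primer_3p", 34)]),
])

atomic_names = [r.name for r in library.leaves()]
construct_length = library.total_length()
```

Walking the leaves recovers the linear order of sequence elements, which is the information a preprocessing tool needs in order to locate each element in a read.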
Figure 2: Uniform processing enabled with seqspec.
The seqspec index command produces a technology string that identifies appropriate sequence elements and can be passed into processing tools.
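As a rough illustration of what a technology string might encode, the Python sketch below turns element positions into a single string. The exact output of seqspec index is not given in this record, so the format here is an assumption: it is modelled on the barcode:UMI:sequence triplets (file index, start, stop) that kallisto bus accepts as a custom technology. The function name, dictionary keys, and coordinates are hypothetical.

```python
from typing import Dict, Tuple

# (read file index, start, stop); 0-based, stop-exclusive. Hypothetical layout.
Element = Tuple[int, int, int]


def technology_string(elements: Dict[str, Element]) -> str:
    """Join barcode, UMI and cDNA positions into a 'bc:umi:seq' style string.

    Assumed format, not confirmed as the output of `seqspec index`.
    """
    order = ("barcode", "umi", "cdna")
    return ":".join(
        ",".join(str(value) for value in elements[key]) for key in order
    )


# Made-up example: barcode and UMI on the first read, cDNA on the second.
layout = {
    "barcode": (0, 0, 16),
    "umi": (0, 16, 28),
    "cdna": (1, 0, 90),
}

tech = technology_string(layout)
print(tech)  # prints 0,0,16:0,16,28:1,0,90
```

Because the string is derived from the annotated positions rather than hard-coded per assay, the same downstream command can in principle be reused across assays by swapping in a different specification.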
