HTSeq--a Python framework to work with high-throughput sequencing data
- PMID: 25260700
- PMCID: PMC4287950
- DOI: 10.1093/bioinformatics/btu638
Abstract
Motivation: A wide range of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.
Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
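The counting step described above, tallying reads by the gene they overlap, can be illustrated with a minimal pure-Python sketch. This is not HTSeq's implementation and does not use its API. The `Interval` type, the `count_reads` helper, the toy coordinates and the `__no_feature` / `__ambiguous` bucket names are all assumptions made for illustration. The sketch scans every gene for every read, which is exactly the cost that coordinate-indexed data structures are meant to avoid.

```python
from collections import Counter
from typing import NamedTuple


class Interval(NamedTuple):
    """Half-open genomic interval [start, end) on one strand (hypothetical type)."""
    chrom: str
    start: int
    end: int
    strand: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the intervals share at least one base on the same chromosome and strand."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def count_reads(reads, genes):
    """Count a read for a gene only if it overlaps exactly one gene.

    Reads overlapping no gene or more than one gene go into separate buckets
    (bucket names are illustrative assumptions).
    """
    counts = Counter({name: 0 for name in genes})
    for read in reads:
        # Naive linear scan over all genes and their exons.
        hits = {name for name, exons in genes.items()
                if any(overlaps(read, exon) for exon in exons)}
        if not hits:
            counts["__no_feature"] += 1
        elif len(hits) > 1:
            counts["__ambiguous"] += 1
        else:
            counts[hits.pop()] += 1
    return counts


# Toy annotation (made-up coordinates): gene name -> list of exon intervals.
genes = {
    "geneA": [Interval("chr1", 100, 200, "+"), Interval("chr1", 300, 400, "+")],
    "geneB": [Interval("chr1", 350, 500, "+")],
}

# Toy aligned reads (made-up coordinates).
reads = [
    Interval("chr1", 120, 170, "+"),  # overlaps geneA only
    Interval("chr1", 360, 410, "+"),  # overlaps geneA and geneB
    Interval("chr1", 600, 650, "+"),  # overlaps no gene
    Interval("chr1", 450, 500, "+"),  # overlaps geneB only
]

counts = count_reads(reads, genes)
for name, n in sorted(counts.items()):
    print(name, n)
```

In this toy input, each gene receives one read, one read is set aside as ambiguous because it overlaps both genes, and one read overlaps nothing. How ambiguous reads should be treated is a design decision. The sketch simply refuses to assign them.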
Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.
© The Author 2014. Published by Oxford University Press.