The Scramble conversion tool
- PMID: 24930138
- PMCID: PMC4173023
- DOI: 10.1093/bioinformatics/btu390
Abstract
Motivation: The reference CRAM file format implementation is in Java. We present 'Scramble': a new C implementation of SAM, BAM and CRAM file I/O.
Results: The C implementation of CRAM is 1.5-1.7× slower than BAM at decoding but 1.8-2.6× faster at encoding. We see file size savings of 34-55%.
Availability and implementation: Source code is available at http://sourceforge.net/projects/staden/files/io_lib/ under the BSD software licence.
© The Author 2014. Published by Oxford University Press.
Similar articles
- Software support for SBGN maps: SBGN-ML and LibSBGN. Bioinformatics. 2012 Aug 1;28(15):2016-21. doi: 10.1093/bioinformatics/bts270. Epub 2012 May 10. PMID: 22581176. Free PMC article.
- Whiteboard: a framework for the programmatic visualization of complex biological analyses. Bioinformatics. 2015 Jun 15;31(12):2054-5. doi: 10.1093/bioinformatics/btv078. Epub 2015 Feb 5. PMID: 25661541.
- kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome. Bioinformatics. 2015 Sep 1;31(17):2877-8. doi: 10.1093/bioinformatics/btv271. Epub 2015 Apr 25. PMID: 25913206.
- A library of efficient bioinformatics algorithms. Appl Bioinformatics. 2003;2(2):117-21. PMID: 15130828. Review.
- Interoperability with Moby 1.0--it's better than sharing your toothbrush! Brief Bioinform. 2008 May;9(3):220-31. doi: 10.1093/bib/bbn003. Epub 2008 Jan 31. PMID: 18238804. Review.
Cited by
- PQSDC: a parallel lossless compressor for quality scores data via sequences partition and run-length prediction mapping. Bioinformatics. 2024 May 2;40(5):btae323. doi: 10.1093/bioinformatics/btae323. PMID: 38759114. Free PMC article.
- Mind the gap: resources required to receive, process and interpret research-returned whole genome data. Hum Genet. 2019 Jul;138(7):691-701. doi: 10.1007/s00439-019-02033-5. Epub 2019 Jun 3. PMID: 31161416. Free PMC article. Review.
- CALQ: compression of quality values of aligned sequencing data. Bioinformatics. 2018 May 15;34(10):1650-1658. doi: 10.1093/bioinformatics/btx737. PMID: 29186284. Free PMC article.
- CARGO: effective format-free compressed storage of genomic information. Nucleic Acids Res. 2016 Jul 8;44(12):e114. doi: 10.1093/nar/gkw318. Epub 2016 Apr 29. PMID: 27131376. Free PMC article.
- Comparison of high-throughput sequencing data compression tools. Nat Methods. 2016 Dec;13(12):1005-1008. doi: 10.1038/nmeth.4037. Epub 2016 Oct 24. PMID: 27776113.