GReEn: a tool for efficient compression of genome resequencing data
- PMID: 22139935
- PMCID: PMC3287168
- DOI: 10.1093/nar/gkr1124
Abstract
Research in the genomic sciences is confronted with a volume of sequencing and resequencing data that is growing faster than data storage and communication resources, shifting a significant part of research budgets from the sequencing component of a project to the computational one. Hence, being able to efficiently store sequencing and resequencing data is a problem of paramount importance. In this article, we describe GReEn (Genome Resequencing Encoding), a tool for compressing genome resequencing data using a reference genome sequence. It overcomes some drawbacks of the recently proposed tool GRS: it can compress sequences that GRS cannot handle, it runs faster, and it achieves compression gains of over 100-fold for some sequences. This tool is freely available for non-commercial use at ftp://ftp.ieeta.pt/~ap/codecs/GReEn1.tar.gz.
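The general idea of compressing a resequenced genome against a reference can be illustrated with a deliberately simplified sketch. This is not GReEn's algorithm, which the abstract does not detail. It only shows why a reference helps: when the target sequence is nearly identical to the reference, storing the differences takes far less space than storing the sequence itself. The function names and the restriction to equal-length sequences with substitutions only are assumptions made for illustration.

```python
# Illustrative sketch only: substitution-only diff against a reference.
# The names and the equal-length restriction are hypothetical and are
# not taken from GReEn or GRS.

def encode_against_reference(reference: str, target: str) -> list[tuple[int, str]]:
    """Return (position, base) pairs where target differs from reference."""
    if len(reference) != len(target):
        raise ValueError("this sketch only handles equal-length sequences")
    return [
        (pos, tgt_base)
        for pos, (ref_base, tgt_base) in enumerate(zip(reference, target))
        if ref_base != tgt_base
    ]


def decode_with_reference(reference: str, diffs: list[tuple[int, str]]) -> str:
    """Rebuild the target by applying the recorded substitutions."""
    bases = list(reference)
    for pos, base in diffs:
        bases[pos] = base
    return "".join(bases)


reference = "ACGTACGTAC"
target = "ACGAACGTCC"
diffs = encode_against_reference(reference, target)
restored = decode_with_reference(reference, diffs)
```

Here `diffs` holds two substitutions for a ten-base sequence, and `restored` equals `target`. A practical tool must also handle insertions, deletions, and sequences of different lengths, and would entropy-code its output rather than keep a raw list.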