GTShark: genotype compression in large projects
- PMID: 31225861
- DOI: 10.1093/bioinformatics/btz508
Abstract
Summary: Large sequencing projects now handle tens of thousands of individuals, and the huge files summarizing their findings require compression. We propose a tool that compresses large collections of genotypes almost 30% better than the best tool to date, i.e. squeezing a human genotype to less than 62 KB. It can also compress single samples in reference to an existing database, achieving comparable results.
Availability and implementation: https://github.com/refresh-bio/GTShark.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.