Bioinformatics. 2019 Nov 1;35(22):4624-4631.
doi: 10.1093/bioinformatics/btz280.

BHap: a novel approach for bacterial haplotype reconstruction

Xin Li et al. Bioinformatics.

Abstract

Motivation: Bacterial haplotype reconstruction is critical for selecting proper treatments for diseases caused by unknown haplotypes. Existing methods and tools do not work well on this task because they were usually developed for viral rather than bacterial populations.

Results: In this study, we developed BHap, a novel algorithm based on fuzzy flow networks, for reconstructing bacterial haplotypes from next-generation sequencing data. In tests on simulated and experimental datasets, BHap reconstructed haplotypes of bacterial populations with an average F1 score of 0.87, an average precision of 0.87 and an average recall of 0.88. We also demonstrated that BHap had a low susceptibility to sequencing errors, could reconstruct haplotypes from low-coverage data and could handle a wide range of mutation rates. Compared with existing approaches, BHap achieved higher F1 scores, better precision, better recall and more accurate estimates of the number of haplotypes.
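
For readers unfamiliar with the reported metrics, the following is a minimal Python sketch of how precision, recall and F1 could be computed for a set of reconstructed haplotypes. It assumes exact sequence matching against the known haplotypes, which may differ from the matching criterion used in the paper; the function name and the toy sequences are hypothetical and are not taken from BHap.

```python
def precision_recall_f1(predicted, truth):
    """Score reconstructed haplotypes against known ones.

    Assumption: a predicted haplotype counts as correct only if it is
    identical to a known haplotype (exact string match).
    """
    predicted_set = set(predicted)
    truth_set = set(truth)
    true_positives = len(predicted_set & truth_set)

    # Precision: fraction of predicted haplotypes that are real.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    # Recall: fraction of real haplotypes that were recovered.
    recall = true_positives / len(truth_set) if truth_set else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (hypothetical): three known haplotypes, four predictions.
truth = ["ACGT", "ACGA", "TCGA"]
predicted = ["ACGT", "ACGA", "GGGA", "TTTT"]
p, r, f1 = precision_recall_f1(predicted, truth)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In this toy case two of the four predictions are correct and two of the three known haplotypes are recovered. Averages reported over many datasets, as in the abstract, are computed per dataset first, so the average F1 need not equal the harmonic mean of the average precision and average recall.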

Availability and implementation: The BHap tool is available at http://www.cs.ucf.edu/~xiaoman/BHap/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Flowchart of the BHap algorithm. Polymorphic nodes from different haplotypes and the haplotypes themselves are drawn with different patterns.
Fig. 2. BHap performance under different parameters. (A) BHap performance under the default parameters. The three bars are for three different reference genomes; (B) BHap performance under different read lengths; (C) BHap performance under different error rates; (D) BHap performance under different haplotype proportions; (E) BHap performance under different mutation rates; (F) Predicted polymorphic sites by BHap compared with known polymorphic sites.
Fig. 3. Reliability comparison on experimental datasets. The box plot for BHap is in front of that for EVORhA in the four comparisons.
