BMC Bioinformatics. 2018 Aug 30;19(1):309.
doi: 10.1186/s12859-018-2320-1.

Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools

Adeola M Rotimi et al. BMC Bioinformatics. 2018.

Abstract

Background: Metagenomic approaches have revealed the complexity of environmental microbiomes, and advances in whole genome sequencing have shown a significant level of genetic heterogeneity at the species level. It has become apparent that identifying patterns of superior bioactivity in bacteria applicable in biotechnology, as well as enhanced virulence in pathogens, often requires distinguishing between closely related species or subspecies. Current methods for binning metagenomic reads usually do not allow identification below the genus level and generally stop at the family level.

Results: In this work, an attempt was made to improve metagenomic binning resolution by creating genome-specific barcodes based on the core and accessory genomes. This protocol was implemented in novel software tools available for use and download from http://bargene.bi.up.ac.za/. The most abundant barcode genes from the core genomes were found to encode ribosomal proteins, certain proteins of central metabolism and ABC transporters. The performance of metabarcode sequences created by this package was evaluated using artificially generated and publicly available metagenomic datasets. Furthermore, a program (Barcoding 2.0) was developed to align reads against barcode sequences and then calculate various parameters to score the alignments and the individual barcodes. Taxonomic units were identified in metagenomic samples by comparing the calculated barcode scores to set cut-off values. In this study, it was found that varying the sample size (i.e. the number of reads in a metagenome) and the metabarcode length had no significant effect on the sensitivity and specificity of the algorithm. Receiver operating characteristic (ROC) curves were calculated for different taxonomic groups based on the results of identifying the corresponding genomes in artificial metagenomic datasets. The reliability of the program in distinguishing between species of the same genus or family was nearly perfect.
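The cut-off comparison and the sensitivity, specificity and ROC evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the barcode scores are computed or combined, so the sketch assumes that each genome has two scores, that higher values indicate presence, and that a genome is called only when both scores reach their cut-offs. The function names and all score values are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Scores = Dict[str, Tuple[float, float]]  # genome -> (score 1, score 2), hypothetical


def call_taxa(scores: Scores, cut1: float, cut2: float) -> Set[str]:
    """Call a genome present when both scores reach their cut-offs (assumed rule)."""
    return {g for g, (s1, s2) in scores.items() if s1 >= cut1 and s2 >= cut2}


def sensitivity_specificity(
    called: Set[str], truth: Set[str], candidates: Set[str]
) -> Tuple[float, float]:
    """Compute sensitivity and specificity over the set of candidate genomes."""
    tp = len(called & truth)
    fn = len(truth - called)
    fp = len(called - truth)
    tn = len(candidates - called - truth)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def roc_points(
    scores: Scores, truth: Set[str], cutoffs: List[float]
) -> List[Tuple[float, float]]:
    """Sweep one cut-off applied to both scores (a simplifying assumption) and
    return (false positive rate, sensitivity) pairs."""
    points = []
    for c in cutoffs:
        called = call_taxa(scores, c, c)
        sens, spec = sensitivity_specificity(called, truth, set(scores))
        points.append((1.0 - spec, sens))
    return points


# Made-up scores for four genomes; A and D are truly present in the sample.
scores: Scores = {
    "genome_A": (0.9, 0.8),
    "genome_B": (0.7, 0.2),
    "genome_C": (0.1, 0.1),
    "genome_D": (0.6, 0.7),
}
truth = {"genome_A", "genome_D"}

called = call_taxa(scores, cut1=0.5, cut2=0.5)
sens, spec = sensitivity_specificity(called, truth, set(scores))
print(sorted(called), sens, spec)
print(roc_points(scores, truth, [0.0, 0.5, 1.0]))
```

Plotting the pairs returned by `roc_points` for a fine grid of cut-offs would give a ROC curve of the kind referred to in the abstract; real data would of course require the scores produced by the program itself.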

Conclusions: The results showed that the novel online tool BarcodeGenerator (http://bargene.bi.up.ac.za/) provides an efficient approach for generating barcode sequences from a set of complete genomes provided by users. Another program, Barcoder 2.0, is available from the same resource to enable efficient and practical use of metabarcodes for visualizing the distribution of organisms of interest in environmental and clinical samples.

Keywords: Bacterial genome; Metabarcoding; Metagenome; NGS; Software tool.

Conflict of interest statement

Ethics approval and consent to participate

No ethics committee approval is required.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Workflow diagram of selection of diagnostic barcode sequences
Fig. 2
Graphical output of the program BarcodeGenerator, presenting a distribution of COGs depicted by dots in the 3D plot. X-axis: percentage of sense mutations; Y-axis: 1 – percentage of identities; Z (vertical) axis: (positives – identities)/identities. Conserved, positively selected and highly variable groups of COGs are labelled. COGs suitable for barcoding are shown in brown
Fig. 3
Distribution of 15 accessory genes (depicted by black and grey bars) selected to represent the genetic variability of 9 sampled genomes of Shewanella
Fig. 4
Workflow diagram of the program Barcoding 2.0
Fig. 5
Distribution of values of (a) BarcodeScore1 and (b) BarcodeScore2 calculated based on the percentage of genome-specific reads in artificial metagenomes. Whisker lines depict the minimal, maximal and median values; grey bars show the middle quartiles and the open circles indicate the average values
Fig. 6
Surface plot of the distribution of values of TP / (FP + FN) calculated for different pairs of cut-off values of BarcodeScore 1 and 2
Fig. 7
Influence of (a) the metagenome sample size and (b) the length of the barcode sequence on the program performance
Fig. 8
ROC diagrams of identification of (a) genomes on different taxonomic levels; (b) genomes of the Escherichia / Shigella group by barcodes with different contributions of accessory genes
Fig. 9
An example of identification of Lactobacillus species in a phyllosphere metagenome
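
The quantity plotted in Fig. 6, TP / (FP + FN) evaluated over pairs of cut-off values, can be sketched in Python as a small grid search. This is an illustrative sketch only: the score values are invented, the rule that a genome is called when both scores reach their cut-offs is an assumption, and the handling of a zero denominator (returning infinity when there are no errors) is a choice made here, not one stated in the source.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Scores = Dict[str, Tuple[float, float]]  # genome -> (score 1, score 2), hypothetical


def tp_over_errors(called: Set[str], truth: Set[str]) -> float:
    """Return TP / (FP + FN); infinity if there are hits and no errors."""
    tp = len(called & truth)
    errors = len(called - truth) + len(truth - called)
    if errors == 0:
        return float("inf") if tp else 0.0
    return tp / errors


def cutoff_surface(
    scores: Scores, truth: Set[str], cuts1: List[float], cuts2: List[float]
) -> Dict[Tuple[float, float], float]:
    """Evaluate TP / (FP + FN) for every pair of cut-offs on the grid."""
    surface = {}
    for c1, c2 in product(cuts1, cuts2):
        # Assumed calling rule: both scores must reach their cut-offs.
        called = {g for g, (s1, s2) in scores.items() if s1 >= c1 and s2 >= c2}
        surface[(c1, c2)] = tp_over_errors(called, truth)
    return surface


# Made-up scores for four genomes; A and D are truly present in the sample.
scores: Scores = {
    "genome_A": (0.9, 0.8),
    "genome_B": (0.7, 0.2),
    "genome_C": (0.1, 0.1),
    "genome_D": (0.6, 0.7),
}
truth = {"genome_A", "genome_D"}
grid = [0.0, 0.5, 1.0]

surface = cutoff_surface(scores, truth, grid, grid)
for pair in sorted(surface):
    print(pair, surface[pair])
```

Cut-off pairs with the highest values on such a surface are the ones that best separate true identifications from false positives and false negatives on the test data.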
