BMC Bioinformatics. 2011 Oct 14:12:400.
doi: 10.1186/1471-2105-12-400.

MetaboHunter: an automatic approach for identification of metabolites from 1H-NMR spectra of complex mixtures

Dan Tulpan et al. BMC Bioinformatics. 2011.

Abstract

Background: One-dimensional 1H-NMR spectroscopy is widely used for high-throughput characterization of metabolites in complex biological mixtures. However, accurate identification of individual compounds remains challenging, particularly in spectral regions with high peak density. Automatic tools that facilitate these tasks and improve their accuracy, while using increasingly large reference spectral libraries, have become a priority of current metabolomics research.

Results: We introduce a web server application, MetaboHunter, for automatic assignment of 1H-NMR spectra of metabolites. MetaboHunter identifies metabolites automatically from spectra or peak lists using three different search methods, and it allows for peak drift within a user-defined spectral range. The reference libraries used for assignment are manually curated data from two major publicly available databases of NMR metabolite standard measurements (HMDB and MMCD). Tests using a variety of synthetic and experimental spectra of single- and multi-metabolite mixtures show that MetaboHunter identifies, on average, more than 80% of detectable metabolites from spectra of synthetic mixtures and more than 50% from spectra of experimental mixtures. This work also suggests that better scoring functions improve the performance of MetaboHunter's metabolite identification methods by more than 30%.
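As an illustration of peak-list matching with a shift tolerance, the following is a minimal sketch and not MetaboHunter's actual MH1-MH3 methods, which the abstract does not specify. It assumes that a reference peak counts as matched when any sample peak lies within a tolerance (in ppm), and that each metabolite is scored by the percentage-style fraction of its reference peaks that are matched. The function names, the default tolerance and the peak positions are hypothetical.

```python
from bisect import bisect_left


def match_fraction(sample_peaks, reference_peaks, tolerance=0.03):
    """Return the fraction of reference peaks that have at least one
    sample peak within +/- tolerance (ppm). Assumed scoring, for illustration."""
    if not reference_peaks:
        return 0.0
    sample = sorted(sample_peaks)
    matched = 0
    for ref in reference_peaks:
        # First sample peak at or above the lower edge of the tolerance window.
        i = bisect_left(sample, ref - tolerance)
        if i < len(sample) and sample[i] <= ref + tolerance:
            matched += 1
    return matched / len(reference_peaks)


def rank_metabolites(sample_peaks, library, tolerance=0.03, min_score=0.5):
    """Score every library entry and return (name, score) pairs with
    score >= min_score, best score first (ties broken by name)."""
    scored = [
        (name, match_fraction(sample_peaks, peaks, tolerance))
        for name, peaks in library.items()
    ]
    hits = [(name, score) for name, score in scored if score >= min_score]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical peak positions (ppm), not taken from HMDB or MMCD.
library = {
    "metabolite_A": [1.32, 4.10],
    "metabolite_B": [1.47, 3.77],
    "metabolite_C": [3.24, 5.22],
}
sample = [1.33, 4.11, 1.47]
ranking = rank_metabolites(sample, library)
```

Here `min_score` plays a role loosely analogous to a confidence threshold: entries whose matched fraction falls below it are dropped from the ranking.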

Conclusions: MetaboHunter is a freely accessible, easy-to-use 1H-NMR-based web server application that provides efficient data input and pre-processing, flexible parameter settings, fast and automatic metabolite fingerprinting, and results visualization via intuitive plotting and compound peak hit maps. Compared to other published and freely accessible metabolomics tools, MetaboHunter implements three efficient methods to search for metabolites in manually curated data from two reference libraries.
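The pre-processing step can be pictured with a small sketch of threshold-based peak picking, in the spirit of the de-noising described in the Figure 12 caption (peaks are locations whose heights exceed a noise threshold). This is an assumption-laden illustration, not the server's implementation: the local-maximum rule, the function name `pick_peaks` and the data points are hypothetical.

```python
def pick_peaks(ppm, intensity, noise_threshold):
    """Return ppm positions of local maxima whose height exceeds the
    noise threshold. Endpoints are ignored; a flat-topped peak is
    reported at its first point. Assumed rule, for illustration."""
    if len(ppm) != len(intensity):
        raise ValueError("ppm and intensity must have the same length")
    peaks = []
    for i in range(1, len(intensity) - 1):
        height = intensity[i]
        is_local_max = height > intensity[i - 1] and height >= intensity[i + 1]
        if is_local_max and height > noise_threshold:
            peaks.append(ppm[i])
    return peaks


# Hypothetical spectrum points, not measured data.
ppm_axis = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
heights = [0.0, 5.0, 1.0, 0.5, 0.2, 3.0, 0.0]
peak_list = pick_peaks(ppm_axis, heights, noise_threshold=1.0)
```

Raising `noise_threshold` shortens the resulting peak list, so the choice trades missed low-intensity peaks against spurious noise peaks.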


Figures

Figure 1
Results of metabolite searches for mixtures of HMDB peaks corresponding to n metabolites, where n = 1:10. The curves labelled "MHX_DB ALL", where X = {1,3} and DB = {HMDB, MMCD}, represent average percentages of correctly matched metabolites over 100 runs using MH1 and MH3, when all matches regardless of their position are considered, whereas the curves labelled "MHX_DB TOPN" represent average percentages of correctly matched metabolites over 100 runs using MH1 and MH3, when only top n matches were selected, where n = 1:10. The "HMDB NMR Search ALL" and "HMDB NMR Search TOPN" curves represent average percentages of correctly matched metabolites over 100 runs using the HMDB NMR Search option available online. These curves are not present in the MMCD plot since metabolites have different names and the database does not have a 100% overlap with HMDB.
Figure 2
Results of metabolite searches for mixtures of MMCD peaks corresponding to n metabolites, where n = 1:10. The curves labelled "MHX_DB ALL", where X = {1,3} and DB = {HMDB, MMCD}, represent average percentages of correctly matched metabolites over 100 runs using MH1 and MH3, when all matches regardless of their position are considered, whereas the curves labelled "MHX_DB TOPN" represent average percentages of correctly matched metabolites over 100 runs using MH1 and MH3, when only top n matches were selected, where n = 1:10.
Figure 3
Comparative performance of metabolite matching strategies applied on individual metabolite spectra with removed peaks. Each experimental metabolite spectrum from both reference libraries was altered by removing from 0% to 50% of the peaks and then searched against both libraries. The percentage of matched metabolites was averaged over 5 iterations.
Figure 4
Comparative performance of metabolite matching strategies applied on individual metabolite spectra with chemical shift variations. Each experimental metabolite spectrum from both reference libraries was altered by adding/subtracting a chemical shift variation from 0.00 ppm to ± 0.05 ppm in equal increments of 0.01 ppm and then searched against both libraries. The percentage of matched metabolites was averaged over 5 iterations.
Figure 5
Comparative performance of metabolite matching strategies applied on synthetic mixtures with removed peaks. Synthetic mixture spectra were obtained by pooling peaks from 50 randomly selected metabolites from the two reference libraries (HMDB and MMCD). Spectral noise was introduced by removing from 0% to 50% of the peaks. The percentage of correctly identified metabolites was averaged over 50 iterations.
Figure 6
Comparative performance of metabolite matching strategies applied on synthetic mixtures with chemical shift variations. Synthetic mixture spectra were obtained by pooling peaks from 50 randomly selected metabolites from the two reference libraries (HMDB and MMCD). Spectral noise was introduced by adding/subtracting a chemical shift variation from 0.00 ppm to ± 0.05 ppm in equal increments of 0.01 ppm. The percentage of correctly identified metabolites was averaged over 50 iterations.
Figure 7
Evaluation of MetaboHunter on individual metabolite spectra with methods that use different scoring functions. Each individual metabolite spectrum was queried against the original reference libraries (HMDB and MMCD) using methods MH1 and MH3, which applied the scoring functions f1 (simple percentage calculations) and f2 (Equation 1) for ranking the metabolites. The top metabolite hit in MetaboHunter's output was reported as a match if it was identical to the query metabolite. The process was repeated for all metabolites in the reference libraries. MM = number of matched metabolites; TM = total number of metabolites.
Figure 8
Frequency of metabolites that share the same peak coordinate in the HMDB reference library. The x-axis contains the locations of the peaks (ppm) and the y-axis marks the number of metabolites that have peaks at given locations.
Figure 9
Distribution of the number of peak coordinates per metabolite in the HMDB reference library. The number of peak coordinates for HMDB metabolites varies between 1 and 181.
Figure 10
Frequency of metabolites that share the same peak coordinate in the MMCD reference library. The x-axis contains the locations of the peaks (ppm) and the y-axis marks the number of metabolites that have peaks at given locations.
Figure 11
Distribution of the number of peak coordinates per metabolite in the MMCD reference library. The number of peak coordinates for MMCD metabolites varies between 1 and 66.
Figure 12
De-noising and peak identification in an NMR spectrum. The spectrum data points lie on the red curve, while the vertical dotted blue segments mark the locations of peaks whose heights are above the noise threshold (horizontal black line).
Figure 13
Confusion matrix. The number of true negatives (TN) is calculated as the number of metabolites in the reference library minus the total number of true positives (TP), false positives (FP) and false negatives (FN).
Figure 14
MetaboHunter screenshot for the Processing View. The figure shows the Processing View for MetaboHunter, which includes drop-down selection lists for input type, reference library (database), metabolite type, pH, solvent, NMR frequency, matching method, noise threshold, confidence threshold and shift tolerance.
Figure 15
MetaboHunter screenshot for the Search Results View. The figure depicts MetaboHunter's Search Results View, which consists of metabolite ranking, plot selection field, metabolite ID, metabolite name, matching score and ratio of identified versus total number of peaks, origin of reference metabolite, pH, solvent and the experimental NMR frequency. Three action buttons are placed at the bottom of the list of results, which allow users to further select, download and explore the identified metabolites using graphical means.
Figure 16
MetaboHunter screenshot for the Plot View. The figure shows MetaboHunter's Plot View, which lists on the left the selected metabolites, while the plot on the right shows the location of the selected metabolite peaks with respect to the sample peaks. The "Export chart" button at the bottom of the plot allows users to save the plot as a PDF file.
Figure 17
MetaboHunter screenshot for the Peaks Hit Map View. The figure shows MetaboHunter's Peaks Hit Map View, which displays in a tabular fashion the identity of the identified metabolite peaks relative to the location (ppm) of all the peaks in the sample.
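
The confusion-matrix bookkeeping described in the Figure 13 caption can be sketched as follows. The true-negative rule (library size minus TP, FP and FN) comes from the caption; everything else is an assumption for illustration, including the function name `confusion_counts`, the representation of metabolites as sets of identifiers and the example identifiers.

```python
def confusion_counts(library_ids, true_ids, predicted_ids):
    """Count TP, FP, FN and TN for one search result.

    TN follows the Figure 13 caption: number of metabolites in the
    reference library minus (TP + FP + FN). Identifiers are assumed
    to be unique within the library.
    """
    library = set(library_ids)
    truth = set(true_ids)
    predicted = set(predicted_ids)

    tp = len(truth & predicted)   # present in the mixture and reported
    fp = len(predicted - truth)   # reported but not in the mixture
    fn = len(truth - predicted)   # in the mixture but not reported
    tn = len(library) - (tp + fp + fn)
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical identifiers, not real database accession numbers.
library_ids = [f"m{i}" for i in range(10)]
counts = confusion_counts(
    library_ids,
    true_ids={"m0", "m1", "m2"},
    predicted_ids={"m1", "m2", "m3", "m4"},
)
```

The four counts always sum to the library size, which is a quick consistency check when aggregating results over many runs.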

