IEEE/ACM Trans Comput Biol Bioinform. 2017 Nov-Dec;14(6):1434-1445.
doi: 10.1109/TCBB.2016.2586065. Epub 2016 Jul 7.

Strategies for Comparing Metabolic Profiles: Implications for the Inference of Biochemical Mechanisms from Metabolomics Data


Zhen Qi et al. IEEE/ACM Trans Comput Biol Bioinform. 2017 Nov-Dec.

Abstract

Background: Large amounts of metabolomics data have been accumulated in recent years and await analysis. Previously, we had developed a systems biology approach to infer biochemical mechanisms underlying metabolic alterations observed in cancers and other diseases. The method utilized the typical Euclidean distance for comparing metabolic profiles. Here, we ask whether any of the numerous alternative metrics might serve this purpose better.

Methods and findings: We used enzymatic alterations in purine metabolism that were measured in human renal cell carcinoma to test various metrics with the goal of identifying the best metrics for discerning metabolic profiles of healthy and diseased individuals. The results showed that several metrics have similarly good performance, but that some are unsuited for comparisons of metabolic profiles. Furthermore, the results suggest that relative changes in metabolite levels, which reduce bias toward large metabolite concentrations, are better suited for comparisons of metabolic profiles than absolute changes. Finally, we demonstrate that a sequential search for enzymatic alterations, ranked by importance, is not always valid.
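The bias that relative changes remove can be illustrated with a toy calculation. The sketch below is not the authors' code: the two-metabolite concentrations are hypothetical, and "relative change" is assumed here to mean the change divided by the baseline level.

```python
def minkowski(u, v, m=2):
    """Minkowski distance of order m (m=2 is the Euclidean distance)."""
    return sum(abs(a - b) ** m for a, b in zip(u, v)) ** (1.0 / m)

def absolute_change(profile, baseline):
    return [p - b for p, b in zip(profile, baseline)]

def relative_change(profile, baseline):
    # Assumed definition: change divided by the baseline concentration.
    return [(p - b) / b for p, b in zip(profile, baseline)]

# Hypothetical concentrations: one abundant and one scarce metabolite.
baseline = [1000.0, 1.0]
target = [1100.0, 2.0]       # disease profile: +10% and +100%
candidate_a = [1100.0, 1.0]  # reproduces only the abundant metabolite
candidate_b = [1000.0, 2.0]  # reproduces only the scarce metabolite

abs_a = minkowski(absolute_change(candidate_a, baseline),
                  absolute_change(target, baseline))
abs_b = minkowski(absolute_change(candidate_b, baseline),
                  absolute_change(target, baseline))
rel_a = minkowski(relative_change(candidate_a, baseline),
                  relative_change(target, baseline))
rel_b = minkowski(relative_change(candidate_b, baseline),
                  relative_change(target, baseline))

print(f"absolute changes: a={abs_a:.3f}, b={abs_b:.3f}")
print(f"relative changes: a={rel_a:.3f}, b={rel_b:.3f}")
```

With absolute changes, the candidate that matches the abundant metabolite looks far closer to the target even though it misses the doubling of the scarce one; with relative changes, the ranking reverses.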

Conclusions: We identified metrics that are appropriate for comparisons of metabolic profiles. In addition, we constructed strategic guidelines for the algorithmic identification of biochemical mechanisms from metabolomics data.

Conflict of interest statement

The authors have no conflict of interest to declare.

Figures

Figure 1. Simplified diagram of human purine metabolism
Purine metabolism consists of a de novo synthesis pathway (red arrows) and a salvage pathway (green arrows) for purine bases. Reactions are represented with arrows. Metabolites are shown in dashed boxes and enzymes are indicated by italics. Table III lists enzyme names and their abbreviations. Chevron arrows point to altered enzymes in human renal cell carcinoma (magenta: activation; blue: inhibition). Regulatory signals are omitted for clarity. Metabolites and their abbreviations are: phosphoribosylpyrophosphate (PRPP), inosine monophosphate (IMP), adenylosuccinate (S-AMP), adenosine + adenosine monophosphate + adenosine diphosphate + adenosine triphosphate (Ado_AMP_ADP_ATP), S-adenosyl-L-methionine (SAM), adenine (Ade), xanthosine monophosphate (XMP), guanosine monophosphate + guanosine diphosphate + guanosine triphosphate (GMP_GDP_GTP), deoxyadenosine + deoxyadenosine monophosphate + deoxyadenosine diphosphate + deoxyadenosine triphosphate (dAdo_dAMP_dADP_dATP), deoxyguanosine monophosphate + deoxyguanosine diphosphate + deoxyguanosine triphosphate (dGMP_dGDP_dGTP), ribonucleic acid (RNA), deoxyribonucleic acid (DNA), hypoxanthine + inosine + deoxyinosine (HX_Ino_dIno), xanthine (Xa), guanine + guanosine + deoxyguanosine (Gua_Guo_dGuo), uric acid (UA), ribose-5-phosphate (R5P).
Figure 2. Performance of metrics in comparing metabolic profiles resulting from experimentally measured enzymatic changes
Out of six experimentally measured enzymatic changes, all possible combinations are implemented. When an exact enzymatic change is implemented, the result in each subpanel is shown in the column next to the control, which corresponds to no perturbation at all. Subsequent columns show the results of two (15 different combinations of exact perturbations), three (20 different combinations), four (15 different combinations), and five combinatory alterations of enzymatic activities (6 different combinations). The y-axis represents the distance or dissimilarity which is normalized. Results are based on relative changes. Each red horizontal line shows the smallest distance or dissimilarity in each column. A: Minkowski distance (m = 3); B: Euclidean distance; C: Manhattan distance; D: Jeffreys & Matusita distance; E: Canberra distance; F: relative distance; G: cosine of angle; H: Dice’s coefficient; I: Jaccard similarity coefficient. The corresponding plot for absolute changes is shown in Fig. S1. Note differences in magnitudes along the y-axis.
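Several of the metrics named in this caption have standard textbook forms, and the combination counts follow from choosing k of six enzymes. The sketch below uses those standard definitions as an assumption; the paper's exact formulas (in particular for the Jeffreys & Matusita, relative, Dice, and Jaccard measures) are not given here and are therefore omitted. The enzyme labels `E1` to `E6` are placeholders.

```python
import math
from itertools import combinations

def minkowski(u, v, m=3):
    """Minkowski distance of order m (panel A uses m = 3)."""
    return sum(abs(a - b) ** m for a, b in zip(u, v)) ** (1.0 / m)

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def canberra(u, v):
    # Terms where both entries are zero are skipped (a common convention).
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(u, v) if a != 0 or b != 0)

def cosine_of_angle(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Number of ways to combine k of the six measured enzymatic changes.
enzymes = [f"E{i}" for i in range(1, 7)]  # placeholder labels
combination_counts = {k: sum(1 for _ in combinations(enzymes, k))
                      for k in range(1, 6)}
print(combination_counts)
```

The counts for two, three, four, and five simultaneous changes come out as 15, 20, 15, and 6, matching the column sizes described in the caption.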
Figure 3. Distance topology for the neighborhood surrounding the targeted enzymatic changes using absolute changes
For human renal cell carcinoma, the observed enzymatic changes include the activation of IMPD (2.53 in fold change) and ATASE (1.58 in fold change). This targeted set of enzymatic changes is disturbed by uncertainty (10% relative noise sampled from a normal distribution). In addition, all other enzymes are also affected by 10% relative noise over normal activities. The x- and y-axes represent relative enzymatic activities of IMPD and ATASE in regard to their normal values, respectively. The z-axis shows the Minkowski distances between the simulated metabolic profiles and the targeted disease profile, using absolute changes. Subplot A schematically illustrates the uncertainty surrounding the targeted enzymatic changes (red cross). The same distance topology is viewed from different angles. B: horizontal rotation (−105) and vertical elevation (20); C: horizontal rotation (−15) and vertical elevation (20); D: horizontal rotation (0) and vertical elevation (20). Distances are similarly distributed within the neighborhood surrounding the targeted enzymatic changes.
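One way to generate the perturbed enzyme activities described in this caption is sketched below. It assumes that "10% relative noise" means a normal deviate with a standard deviation equal to 10% of the nominal activity; the caption does not state this explicitly. The function name and the labels for the other enzymes are hypothetical.

```python
import random

# Fold changes reported in the caption for the two targeted enzymes.
TARGET_FOLD = {"IMPD": 2.53, "ATASE": 1.58}

def sample_activities(other_enzymes, rel_sd=0.10, seed=None):
    """Draw one set of relative enzyme activities with relative noise.

    Assumption: noise is multiplicative, 1 + N(0, rel_sd), applied to
    the targeted fold changes and to the normal activity (1.0) of all
    other enzymes.
    """
    rng = random.Random(seed)
    activities = {}
    for name, fold in TARGET_FOLD.items():
        activities[name] = fold * (1.0 + rng.gauss(0.0, rel_sd))
    for name in other_enzymes:
        activities[name] = 1.0 * (1.0 + rng.gauss(0.0, rel_sd))
    return activities

others = ["enzyme_a", "enzyme_b"]  # placeholder names
draws = [sample_activities(others, seed=i) for i in range(2000)]
mean_impd = sum(d["IMPD"] for d in draws) / len(draws)
print(f"mean sampled IMPD activity: {mean_impd:.3f}")
```

Each draw would then be passed to the metabolic model to produce a simulated profile, whose distance to the disease profile gives one point on the surface.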
Figure 4. Distance topology for the neighborhood surrounding the targeted enzymatic changes using relative changes
As described in Figure 3, the enzymatic changes consist of the activation of IMPD (2.53 in fold change) and ATASE (1.58 in fold change). Uncertainty is implemented as in Figure 3. The x- and y-axes represent relative enzymatic activities of IMPD and ATASE in regard to their normal values, respectively. The z-axis shows the Minkowski distances between simulated metabolic profiles and the targeted disease profile, using relative changes instead of absolute changes. Subplot A schematically illustrates the uncertainty surrounding the targeted enzymatic changes (red cross). The same distance topology is viewed from different angles. B: horizontal rotation (−105) and vertical elevation (20); C: horizontal rotation (−15) and vertical elevation (20); D: horizontal rotation (0) and vertical elevation (20). In contrast to Figure 3, distances are unevenly distributed within the neighborhood surrounding the targeted enzymatic changes. The surface looks like a trough, which identifies IMPD as the most significant factor.
Figure 5. Mean values of distances for each grid area within the neighborhood surrounding the targeted enzymatic changes and significance of the differences in mean values
Targeted enzymatic changes (IMPD and ATASE) with up to 10% relative variations are gridded, and the mean value is calculated for each grid box. All other enzymes have similar variations around their normal activities. The x- and y-axes represent relative enzymatic activities of IMPD and ATASE in regard to their normal values, while the z-axis exhibits the Minkowski distances using relative changes. The same distances are viewed from different angles. A: horizontal rotation (−125) and vertical elevation (20); B: horizontal rotation (−25) and vertical elevation (20). C: Significance of the differences in mean values between the distribution in the grid box with the minimal mean value and all other distributions.
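The gridding step can be sketched as follows: each sample is a pair of relative activities plus a distance, samples are binned into equal-width boxes, and the mean distance is computed per box. The function name, the grid bounds, and the three sample points are illustrative assumptions; the significance test in panel C is not reproduced because the caption does not name the test that was used.

```python
from collections import defaultdict
from statistics import mean

def grid_means(samples, lo, hi, n_bins):
    """Mean distance per grid box.

    samples: iterable of (x, y, distance), where x and y are relative
    activities of the two targeted enzymes. Values outside [lo, hi] are
    clamped into the nearest edge box.
    """
    width = (hi - lo) / n_bins
    boxes = defaultdict(list)
    for x, y, dist in samples:
        ix = max(0, min(int((x - lo) / width), n_bins - 1))
        iy = max(0, min(int((y - lo) / width), n_bins - 1))
        boxes[(ix, iy)].append(dist)
    return {box: mean(values) for box, values in boxes.items()}

# Hypothetical samples within +/-10% of the targeted activities.
samples = [(0.91, 0.91, 2.0), (0.92, 0.93, 4.0), (1.09, 1.09, 1.0)]
means = grid_means(samples, lo=0.9, hi=1.1, n_bins=2)
best_box = min(means, key=means.get)
print(means, best_box)
```

The box with the smallest mean is the natural reference against which the distributions in all other boxes would be compared.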
Figure 6. Feasibility of the sequential identification strategy
Minkowski distances for all possible sequential enzymatic changes with one additional change per step are connected through lines. The x-axis shows the number of enzymatic changes in each sequential scenario, while the y-axis represents the Minkowski distances using absolute changes. With each increase in the number of enzymatic changes, the simulated vector might be expected to become closer to the target vector. A: Schematic illustration of positions of simulated vectors and the target vector for a demonstration system with only two metabolites. B: Changes in Minkowski distances with increasing numbers of sequential enzymatic changes (720 different combinations in the case of 6 enzymes). C: Out of all 720 possible sequential enzymatic changes, only those with decreasing Minkowski distances for subsequent steps are shown. D: Out of all 720 possible sequential enzymatic changes, only those are shown that start with the minimal Minkowski distance at 1st step and end with a minimum at the 5th step.
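The count of 720 is the number of orderings of six enzymatic changes (6! = 720). The toy sketch below enumerates them and keeps only the orderings whose distance to the target decreases at every step, in the spirit of panel C. It is a deliberately simplified stand-in, not the purine model: each enzyme is assumed to shift a two-metabolite profile by a fixed, additive, hypothetical vector, and the Euclidean distance (Minkowski with m = 2) is used.

```python
import math
from itertools import permutations

# Hypothetical additive effect of each enzymatic change on two metabolites.
effects = {
    "E1": (2.0, 0.0), "E2": (0.0, 2.0), "E3": (1.0, 1.0),
    "E4": (-1.0, 0.5), "E5": (0.5, -1.0), "E6": (0.5, 0.5),
}
# Target profile: all six changes applied together.
target = (sum(e[0] for e in effects.values()),
          sum(e[1] for e in effects.values()))

def distance_trajectory(order):
    """Distances to the target after 0, 1, ..., len(order) changes."""
    x, y = 0.0, 0.0
    trajectory = [math.hypot(target[0] - x, target[1] - y)]
    for name in order:
        dx, dy = effects[name]
        x, y = x + dx, y + dy
        trajectory.append(math.hypot(target[0] - x, target[1] - y))
    return trajectory

def strictly_decreasing(values):
    return all(later < earlier for earlier, later in zip(values, values[1:]))

orderings = list(permutations(effects))
monotone = [order for order in orderings
            if strictly_decreasing(distance_trajectory(order))]
print(len(orderings), len(monotone))
```

Even in this linear toy system, only some orderings approach the target monotonically, which illustrates why a step-by-step search ranked by the best single change can miss valid combinations.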
Figure 7. Strategic guidelines for the algorithmic inference of biochemical mechanisms from metabolomics data
The flow chart shows recommendations for designing an algorithm for the inference of biochemical mechanisms underlying a disease from metabolomics data. Preferred metrics are Minkowski distance, Euclidean distance, Manhattan distance, Jeffreys & Matusita distance, Dice’s coefficient, and Jaccard similarity coefficient. These metrics have similar performance. The outputs from the sequential strategy and multi-phase strategy can be compared and provide further targets for experimental investigations.
