Cellwise outlier detection and biomarker identification in metabolomics based on pairwise log ratios
- PMID: 32189829
- PMCID: PMC7063692
- DOI: 10.1002/cem.3182
Abstract
Outliers can carry valuable information and may be the most informative observations for interpretation, yet they are often neglected. An algorithm called cellwise outlier diagnostics using robust pairwise log ratios (cell-rPLR) is proposed for identifying outliers in single cells of a data matrix. The algorithm is designed for metabolomic data, where, because of the size effect, the measured values are not directly comparable between samples. Pairwise log ratios between the variable values form the elemental information for the algorithm, and aggregating appropriate outlyingness values derived from them yields cellwise outlyingness information. A further feature of cell-rPLR is that it is useful for biomarker identification, particularly in the presence of cellwise outliers. Real data examples and simulation studies underline the good performance of this algorithm in comparison with alternative methods.
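The idea of scoring single cells from pairwise log ratios can be illustrated with a minimal sketch. This is not the authors' cell-rPLR algorithm: the abstract does not specify the outlyingness measure or the aggregation rule, so the median/MAD standardization, the median aggregation, and the function name `pairwise_logratio_outlyingness` are assumptions made only for illustration.

```python
import numpy as np

def pairwise_logratio_outlyingness(X):
    """Illustrative cellwise scores from pairwise log ratios.

    X: array of shape (n_samples, n_vars) with strictly positive values.
    Returns an (n_samples, n_vars) array; large absolute values suggest
    an unusual cell. Assumes no log ratio is constant across samples
    (otherwise its MAD is zero).
    """
    logX = np.log(X)
    n, p = logX.shape
    # ratios[i, j, k] = log(x_ij / x_ik); a per-sample scaling factor cancels.
    ratios = logX[:, :, None] - logX[:, None, :]
    # Assumed robust standardization of each ratio across samples: median and MAD.
    med = np.median(ratios, axis=0)
    mad = 1.4826 * np.median(np.abs(ratios - med), axis=0)
    mad[np.diag_indices(p)] = 1.0  # diagonal ratios are identically zero
    z = (ratios - med) / mad
    # Assumed aggregation: for each cell (i, j), the median over all k != j.
    off_diag = ~np.eye(p, dtype=bool)
    z_off = z[:, off_diag].reshape(n, p, p - 1)
    return np.median(z_off, axis=2)

# Synthetic example: one planted cellwise outlier at row 3, column 2.
rng = np.random.default_rng(0)
X = np.exp(0.1 * rng.normal(size=(50, 6)))
X[3, 2] *= 20.0
scores = pairwise_logratio_outlyingness(X)
flagged = np.unravel_index(np.argmax(np.abs(scores)), scores.shape)
```

Because only ratios within a sample enter the computation, multiplying a whole row by a constant (a size effect) leaves the scores unchanged, which is the motivation the abstract gives for working with log ratios.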
Keywords: biomarker; cellwise outliers; cell‐rPLR; log ratio; metabolomics; robust method.
© 2019 The Authors. Journal of Chemometrics published by John Wiley & Sons Ltd.