Statistical approaches applicable in managing OMICS data: Urinary proteomics as exemplary case

De-Wei An¹,², Yu-Ling Yu¹,², Dries S Martens³, Agnieszka Latosinska⁴, Zhen-Yu Zhang⁵, Harald Mischak⁴, Tim S Nawrot²,³, Jan A Staessen¹,⁶

Affiliations

¹ Non-Profit Research Association Alliance for the Promotion of Preventive Medicine, Mechelen, Belgium.
² Research Unit Environment and Health, KU Leuven Department of Public Health and Primary Care, University of Leuven, Leuven, Belgium.
³ Centre for Environmental Sciences, Hasselt University, Hasselt, Belgium.
⁴ Mosaiques Diagnostics GmbH, Hannover, Germany.
⁵ Research Unit Hypertension and Cardiovascular Epidemiology, KU Leuven Department of Cardiovascular Sciences, University of Leuven, Leuven, Belgium.
⁶ Biomedical Research Group, Faculty of Medicine, University of Leuven, Leuven, Belgium.

PMID: 37143314
DOI: 10.1002/mas.21849

Review

De-Wei An et al. Mass Spectrom Rev. 2024 Nov-Dec;43(6):1237-1254. doi: 10.1002/mas.21849. Epub 2023 May 4.

Abstract

With urinary proteomic profiling (UPP) as an exemplary omics technology, this review describes a workflow for the analysis of omics data in large study populations. The proposed workflow includes: (i) planning omics studies and sample size considerations; (ii) preparing the data for analysis; (iii) preprocessing the UPP data; (iv) the basic statistical steps required for data curation; (v) the selection of covariables; (vi) relating continuously distributed or categorical outcomes to a series of single markers (e.g., sequenced urinary peptide fragments identifying the parental proteins); (vii) showing the added diagnostic or prognostic value of the UPP markers over and beyond classical risk factors; and (viii) pathway analysis to identify targets for personalized intervention in disease prevention or treatment. Additionally, two short sections address multiomics studies and machine learning, respectively. In conclusion, the analysis of adverse health outcomes in relation to omics biomarkers rests on the same statistical principles as the analysis of any other data collected in large population or patient cohorts. The large number of biomarkers that have to be considered simultaneously requires planning ahead how the study database will be structured, curated, and imported into statistical software packages, and how the analysis results will be triaged for clinical relevance and presented.

Keywords: multidimensional classifiers; proteomics; statistical methods; urinary proteomics.
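
As a rough illustration of why step (vi) needs planning when many markers are tested simultaneously, the sketch below applies a false discovery rate correction to a set of per-marker p-values. The abstract does not name a specific correction; the Benjamini-Hochberg step-up procedure is assumed here as one widely used option, and the peptide names and p-values are hypothetical, not data from the review.

```python
from typing import Dict, List


def benjamini_hochberg(p_values: Dict[str, float], fdr: float = 0.05) -> List[str]:
    """Return the marker names retained by the Benjamini-Hochberg step-up rule.

    p_values maps a (hypothetical) marker name to its unadjusted p-value.
    """
    m = len(p_values)
    # Rank markers from the smallest to the largest p-value.
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank / m * fdr:
            cutoff = rank
    # All markers ranked at or below the cutoff are retained.
    return [name for name, _ in ranked[:cutoff]]


# Hypothetical p-values from single-marker association tests.
example = {
    "peptide_A": 0.001,
    "peptide_B": 0.012,
    "peptide_C": 0.04,
    "peptide_D": 0.2,
    "peptide_E": 0.8,
}
selected = benjamini_hochberg(example, fdr=0.05)
print(selected)
```

In this toy input, peptide_C has an unadjusted p-value below 0.05 but is not retained, because its rank-specific threshold (3/5 × 0.05) is stricter than the nominal level.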
