Normalization and missing value imputation for label-free LC-MS analysis

Yuliya V Karpievitch¹, Alan R Dabney, Richard D Smith

PMID: 23176322
PMCID: PMC3489534
DOI: 10.1186/1471-2105-13-S16-S5

Yuliya V Karpievitch et al. BMC Bioinformatics. 2012;13 Suppl 16(Suppl 16):S5.

doi: 10.1186/1471-2105-13-S16-S5. Epub 2012 Nov 5.

Affiliation

¹ School of Mathematics and Physics, University of Tasmania, Hobart, Tasmania, Australia. yuliya.karpievitch@utas.edu.au

Abstract

Shotgun proteomic data are affected by a variety of known and unknown systematic biases as well as high proportions of missing values. Typically, normalization is performed in an attempt to remove systematic biases from the data before statistical inference, sometimes followed by missing value imputation to obtain a complete matrix of intensities. Here we discuss several approaches to normalization and dealing with missing values, some initially developed for microarray data and some developed specifically for mass spectrometry-based data.
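The two steps named in the abstract can be illustrated with a minimal sketch. It assumes a peptides-by-samples matrix of log2 intensities with `NaN` marking missing values, uses per-sample median centering as a stand-in for normalization, and fills each gap with the peptide's minimum observed value (the simple scheme shown in Figure 1, panel D). The function names and the toy matrix are hypothetical, and this is one basic baseline, not necessarily the approach the authors recommend.

```python
import numpy as np


def median_normalize(log_intensities):
    """Shift each sample (column) so its median equals the global median.

    NaNs are ignored when computing medians and stay NaN in the output.
    """
    sample_medians = np.nanmedian(log_intensities, axis=0)
    global_median = np.nanmedian(log_intensities)
    return log_intensities - sample_medians + global_median


def impute_row_minimum(log_intensities):
    """Replace each NaN with the minimum observed value of its peptide (row).

    Assumes every row has at least one observed value.
    """
    row_min = np.nanmin(log_intensities, axis=1, keepdims=True)
    return np.where(np.isnan(log_intensities), row_min, log_intensities)


# Hypothetical toy data: 3 peptides (rows) x 4 samples (columns), log2 scale.
toy = np.array([
    [20.0, 21.0, np.nan, 22.0],
    [18.0, np.nan, 19.0, 20.0],
    [25.0, 26.0, 25.5, 27.0],
])

normalized = median_normalize(toy)          # normalize first ...
completed = impute_row_minimum(normalized)  # ... then impute
```

The order of the two calls is itself a choice: Figures 4 and 5 contrast imputing before normalizing with normalizing before imputing, and swapping the two lines above gives the other ordering.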

Figures

**Figure 1**
**Examples of missing data**. Intensities for a peptide with two treatment groups with (A) no missing values, (B) MCAR missing values, (C) censored missing values, and (D) censored missing values imputed as a minimum observed value.

**Figure 2**
**Percent coverage for nominal 95% confidence intervals of protein-level differences**.

**Figure 3**
**Histograms of the null p-values for normalized (left) and raw (right) peptide abundances**.

**Figure 4**
**Top three eigentrends identified in raw (left), imputed (middle), and normalized-after-imputation (right) data**. The x-axis is the sample index; the y-axis shows the eigentrend values.

**Figure 5**
**Top three eigentrends identified in raw (left), normalized (middle), and imputed-after-normalization (right) data**. The x-axis is the sample index; the y-axis shows the eigentrend values.
