BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S1.
doi: 10.1186/1471-2105-11-S1-S1.

Nonnegative principal component analysis for mass spectral serum profiles and biomarker discovery

Henry Han

Abstract

Background: As a novel cancer diagnostic paradigm, mass spectroscopic serum proteomic pattern diagnostics has been reported to be superior to conventional serologic cancer biomarkers. However, its clinical use has not yet been fully validated. An important factor preventing this young technology from becoming a mainstream cancer diagnostic paradigm is that robustly identifying cancer molecular patterns from high-dimensional protein expression data remains a challenge in machine learning and oncology research. As a well-established dimension reduction technique, principal component analysis (PCA) is widely integrated into pattern recognition analysis to discover cancer molecular patterns. However, its global feature selection mechanism prevents it from capturing local features. This may make high-performance proteomic pattern discovery difficult, because only features that interpret global data behavior are used to train a learning machine.

Methods: In this study, we develop a nonnegative principal component analysis algorithm and present a nonnegative principal component analysis based support vector machine algorithm with sparse coding to perform high-performance proteomic pattern classification. We also propose a nonnegative principal component analysis based filter-wrapper biomarker capturing algorithm for mass spectral serum profiles.
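
To illustrate what a nonnegativity constraint on principal component loadings can look like, the following is a minimal sketch and not the algorithm of the paper, whose optimization procedure is not given in this abstract. It assumes a simple projected power iteration: multiply by the sample covariance matrix, clip negative loadings to zero, and renormalize. The function name `nonnegative_first_pc` and the toy data are hypothetical. The SVM classification and sparse coding steps are not shown.

```python
import numpy as np


def nonnegative_first_pc(X, n_iter=200, tol=1e-10):
    """Heuristic first principal direction with nonnegative, unit-norm loadings.

    Assumption: projected power iteration on the sample covariance matrix.
    This is an illustrative stand-in, not the method described in the paper.
    """
    Xc = X - X.mean(axis=0)                 # center each feature
    C = Xc.T @ Xc / (X.shape[0] - 1)        # sample covariance matrix
    d = C.shape[0]
    w = np.full(d, 1.0 / np.sqrt(d))        # nonnegative unit-norm start
    for _ in range(n_iter):
        w_new = np.maximum(C @ w, 0.0)      # power step, then clip negatives
        norm = np.linalg.norm(w_new)
        if norm == 0.0:                     # degenerate case: keep last iterate
            break
        w_new /= norm
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    return w


# Toy data (hypothetical): features 0 and 1 move together, feature 2 moves
# in the opposite direction.
t = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([t, t + 0.1 * np.sin(7 * t), -t + 0.1 * np.cos(5 * t)])

# Ordinary PCA: leading eigenvector of the covariance matrix (mixed signs).
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / (X.shape[0] - 1))
w_pca = eigvecs[:, -1]

w_nn = nonnegative_first_pc(X)
print("ordinary PCA loadings:   ", w_pca)
print("nonnegative PCA loadings:", w_nn)
```

On this toy data the ordinary leading component mixes positive and negative loadings across all three features, whereas the constrained direction places weight only on the two features that vary together and sets the loading of the opposing feature to zero. This is one way a nonnegativity constraint can yield sparser, more local loadings than standard PCA.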

Results: We demonstrate the superiority of the proposed algorithm by comparison with six peer algorithms on four benchmark datasets. Moreover, we illustrate that nonnegative principal component analysis can be effectively used to capture meaningful biomarkers.

Conclusion: Our analysis suggests that nonnegative principal component analysis effectively conducts local feature selection for mass spectral profiles and contributes both to improving sensitivity and specificity in the subsequent classification and to meaningful biomarker discovery.


Figures

Figure 1
Comparison of the performance of the five algorithms on four datasets: 'O1' (ovarian), 'O2' (ovarian-qaqc), 'L' (liver), and 'C' (colorectal). The NPCA-SVM algorithm demonstrated leading performance over the other four algorithms.
Figure 2
Visualization of the colorectal samples using three biomarkers. The 48 control and 64 cancer samples are visualized using the three biomarkers. The two types of samples demonstrated significantly different means and variations.
Figure 3
Visualization of the ovarian samples using three biomarkers. The 253 ovarian samples are visualized using the three biomarkers. The 91 control and 162 cancer samples are separated into two disjoint clusters.
