Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2019 Mar 13;14(3):e0211463.
doi: 10.1371/journal.pone.0211463. eCollection 2019.

A constrained singular value decomposition method that integrates sparsity and orthogonality

Affiliations

A constrained singular value decomposition method that integrates sparsity and orthogonality

Vincent Guillemot et al. PLoS One. .

Abstract

We propose a new sparsification method for the singular value decomposition-called the constrained singular value decomposition (CSVD)-that can incorporate multiple constraints such as sparsification and orthogonality for the left and right singular vectors. The CSVD can combine different constraints because it implements each constraint as a projection onto a convex set, and because it integrates these constraints as projections onto the intersection of multiple convex sets. We show that, with appropriate sparsification constants, the algorithm is guaranteed to converge to a stable point. We also propose and analyze the convergence of an efficient algorithm for the specific case of the projection onto the balls defined by the norms L1 and L2. We illustrate the CSVD and compare it to the standard singular value decomposition and to a non-orthogonal related sparsification method with: 1) a simulated example, 2) a small set of face images (corresponding to a configuration with a number of variables much larger than the number of observations), and 3) a psychometric application with a large number of observations and a small number of variables. The companion R-package, csvd, that implements the algorithms described in this paper, along with reproducible examples, are available for download from https://github.com/vguillemot/csvd.

PubMed Disclaimer

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1
Fig 1. Simulated data.
Heatmap of PM: the true left singular vectors (in an horizontal representation: one line equals one singular vector).
Fig 2
Fig 2. Simulated data.
Heatmap of QM: the true right singular vectors (in an horizontal representation: one line equals one singular vector).
Fig 3
Fig 3. Simulated data.
Boxplots of the squared difference between the estimated singular vectors and the ground truth.
Fig 4
Fig 4. Simulated data.
Heatmap of the correlations between the estimated left (P) and right (Q) singular vectors with the ground truth for the CSVD and PMD. Each cell of the heatmap represents the correlation between one of the 7 estimated (left or right) vectors with the 5 true vectors. Each heatmap contains 5 rows (the ground truth) and 7 columns (the estimated vectors). On the left, are the results obtained with the CSVD and on the right, the results obtained with PMD.
Fig 5
Fig 5. Simulated data.
Computational time of PMD and the CSVD when estimating one sparse singular triplet (left) and two sparse singular triplets (right).
Fig 6
Fig 6. Face data.
The data matrix of the face dataset: 6 faces by 55,200 voxels. The female faces are denoted F1, F2, and F3, the male faces M1, M2, and M3.
Fig 7
Fig 7. Face data.
Eigenvalues and pseudo-eigenvalues per dimension. Left column: eigenvalues obtained from regular SVD; middle column: pseudo-eigenvalues (i.e., variance of the factor scores) for the CSVD; right column: pseudo-eigenvalues (i.e., variance of the factor scores) for the PMD. For PMD and the CSVD, each line represents a level of sparsity: none (No), low (L), medium (M), and high (H).
Fig 8
Fig 8. Face data.
Cosine matrix for the 6 faces of the face dataset. The female faces are denoted F1, F2, and F3, the male faces M1, M2, and M3.
Fig 9
Fig 9. Face data.
First two sparse left singular vectors (P) for the CSVD (left) and PMD (right) for three levels of sparsity: low (L), medium (M), and high (H). The results for the CSVD are reported on the left, and the results for PMD are reported on the right.
Fig 10
Fig 10. Face data.
The pseudo-eigenfaces for Dimension 1 on the left column and Dimension 2 on the right column. For this graph, only the CSVD was applied, with three different levels of sparsity: low on the top row (L), medium on the middle row (M), and high on the bottom row (H).
Fig 11
Fig 11. Face data.
The pseudo-eigenfaces for Dimension 1 on the left column and Dimension 2 on the right column. For this graph, only PMD was applied, with three different levels of sparsity: low on the top row (L), medium on the middle row (M), and high on the bottom row (H).
Fig 12
Fig 12. Face data.
The six eigenfaces obtained from the plain SVD.
Fig 13
Fig 13. Face data.
The two first left singular vectors of the plain SVD of the non-centered data.
Fig 14
Fig 14. Face data.
Cross-product matrix of the 6 right pseudo-singular vectors for three levels of sparsity: low (L), medium (M), and high (H). The results for the CSVD are reported on the left, and the results for PMD are reported on the right.
Fig 15
Fig 15. OSIQ data.
Loadings of the first two principal components.
Fig 16
Fig 16. OSIQ data.
Scree plots for SVD, the CSVD and PMD for different values of the sparsity parameter.
Fig 17
Fig 17. OSIQ data.
Loadings of Dimensions 1 and 2 with an increasing degree of sparsity for both the CSVD (left column) and PMD (right column).
Fig 18
Fig 18. OSIQ data.
Plot of the cross-product between the loadings obtained for up to 30 dimensions. Left: CSVD. Right: PMD. Top to bottom: the three different levels of sparsity.
Fig 19
Fig 19. OSIQ data.
Loadings for Dimensions 1, 2, and 3 with sparsity parameters set to c1 ≈ 15.11 and c2 ≈ 2.50. The sparsity parameters were empirically determined visually to create “pure” dimensions.
Fig 20
Fig 20. OSIQ data.
PCA with the reduced set of 14 items from the OSIQ. Left: Dimensions 1 and 2 for plain PCA; Right: Dimensions 1 and 2 after a two-dimensional Varimax rotation.

References

    1. Abdi H. Singular Value Decomposition (SVD) and Generalized Singular Value Decomposition (GSVD) In: Salkind NJ, editor. Encyclopedia of Measurement and Statistic. Thousand Oaks (CA): Sage; 2007. p. 907–912.
    1. Greenacre M. Correspondence analysis. New-York: Academic Press; 1984.
    1. Lebart L, Morineau A, Warwick KM. Multivariate Descriptive Statistical Analysis: Correspondence Analysis and Related Techniques for Large Matrices. New-York: Wiley; 1984.
    1. Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences. 2016;374 (2065). 10.1098/rsta.2015.0202 - DOI - PMC - PubMed
    1. Hotelling H. Relations between two sets of variates. Biometrika. 1936. December;28(3-4):321–377. 10.1093/biomet/28.3-4.321 - DOI

Publication types