S3CMTF: Fast, accurate, and scalable method for incomplete coupled matrix-tensor factorization
- PMID: 31251750
- PMCID: PMC6599158
- DOI: 10.1371/journal.pone.0217316
Abstract
How can we extract hidden relations from tensor and matrix data simultaneously in a fast, accurate, and scalable way? Coupled matrix-tensor factorization (CMTF) is an important tool for this purpose. Designing an accurate and efficient CMTF method has become more crucial as the size and dimensionality of real-world data grow explosively. However, existing CMTF methods suffer from low accuracy, slow running time, and limited scalability. In this paper, we propose S3CMTF, a fast, accurate, and scalable CMTF method. In contrast to previous methods, which do not handle large sparse tensors and are not parallelizable, S3CMTF provides parallel sparse CMTF by carefully deriving gradient update rules. S3CMTF asynchronously updates partial gradients without expensive locking. We show, both theoretically and empirically, that our method is guaranteed to converge to a quality solution. S3CMTF further boosts performance by carefully storing intermediate computations and reusing them. We also show, theoretically and empirically, that S3CMTF is the fastest, outperforming existing methods. Experimental results show that S3CMTF is up to 930× faster than existing methods while providing the best accuracy. S3CMTF scales linearly with the number of data entries and the number of cores. In addition, we apply S3CMTF to Yelp rating tensor data coupled with 3 additional matrices to discover interesting patterns.
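To make the idea of coupled factorization concrete, the following is a minimal sketch, not the S3CMTF algorithm itself. It assumes a CP-style (sum of rank-one terms) model, which the abstract does not specify, and it runs sequentially, whereas the abstract describes asynchronous, lock-free parallel updates. All function names, parameters, and the synthetic data are hypothetical. The sketch shows the one property that defines CMTF: a sparse 3-way tensor and a sparse matrix share the factor of their common mode (here `A`), so gradient steps on observed entries of either dataset update the same factor.

```python
import numpy as np


def make_data(seed=0, shape=(20, 15, 10), n_cols=8, rank=3,
              n_tensor=600, n_matrix=100):
    """Synthetic low-rank data (hypothetical; not the paper's datasets)."""
    rng = np.random.default_rng(seed)
    I, J, K = shape
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))
    D = rng.random((n_cols, rank))
    # Sampled (i, j, k) tensor indices and (i, col) matrix indices.
    t_idx = np.column_stack([rng.integers(0, n, n_tensor) for n in (I, J, K)])
    m_idx = np.column_stack([rng.integers(0, n, n_matrix) for n in (I, n_cols)])
    t_val = np.sum(A[t_idx[:, 0]] * B[t_idx[:, 1]] * C[t_idx[:, 2]], axis=1)
    m_val = np.sum(A[m_idx[:, 0]] * D[m_idx[:, 1]], axis=1)
    return t_idx, t_val, m_idx, m_val


def loss(t_idx, t_val, m_idx, m_val, A, B, C, D):
    """Sum of squared errors over observed tensor and matrix entries."""
    t_pred = np.sum(A[t_idx[:, 0]] * B[t_idx[:, 1]] * C[t_idx[:, 2]], axis=1)
    m_pred = np.sum(A[m_idx[:, 0]] * D[m_idx[:, 1]], axis=1)
    return float(np.sum((t_val - t_pred) ** 2) + np.sum((m_val - m_pred) ** 2))


def cmtf_sgd(t_idx, t_val, m_idx, m_val, shape, n_cols, rank=3,
             lr=0.02, reg=1e-3, epochs=50, seed=1):
    """Sequential SGD for a CP-style coupled factorization (an assumption)."""
    rng = np.random.default_rng(seed)
    I, J, K = shape
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))
    D = rng.random((n_cols, rank))
    history = [loss(t_idx, t_val, m_idx, m_val, A, B, C, D)]
    for _ in range(epochs):
        # Tensor entries: update the three factor rows touched by (i, j, k).
        for n in rng.permutation(len(t_val)):
            i, j, k = t_idx[n]
            a, b, c = A[i].copy(), B[j].copy(), C[k].copy()
            err = t_val[n] - np.sum(a * b * c)
            A[i] += lr * (err * b * c - reg * a)
            B[j] += lr * (err * a * c - reg * b)
            C[k] += lr * (err * a * b - reg * c)
        # Matrix entries: the shared factor A is updated here as well.
        for n in rng.permutation(len(m_val)):
            i, col = m_idx[n]
            a, d = A[i].copy(), D[col].copy()
            err = m_val[n] - a @ d
            A[i] += lr * (err * d - reg * a)
            D[col] += lr * (err * a - reg * d)
        history.append(loss(t_idx, t_val, m_idx, m_val, A, B, C, D))
    return A, B, C, D, history


shape, n_cols = (20, 15, 10), 8
t_idx, t_val, m_idx, m_val = make_data(shape=shape, n_cols=n_cols)
A, B, C, D, history = cmtf_sgd(t_idx, t_val, m_idx, m_val, shape, n_cols)
print(f"squared error: {history[0]:.3f} -> {history[-1]:.3f}")
```

Each step touches only the few factor rows indexed by one observed entry, which is the property that makes per-entry gradient methods attractive for sparse data and amenable to parallel execution. How S3CMTF actually derives, parallelizes, and caches its updates is described in the full paper, not in this sketch.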
Conflict of interest statement
The authors have declared that no competing interests exist.
Similar articles
- Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications. Stat Anal Data Min. 2016 Aug;9(4):269-290. doi: 10.1002/sam.11315. Epub 2016 Jun 30. PMID: 27672406 Free PMC article.
- Unraveling Diagnostic Biomarkers of Schizophrenia Through Structure-Revealing Fusion of Multi-Modal Neuroimaging Data. Front Neurosci. 2019 May 3;13:416. doi: 10.3389/fnins.2019.00416. eCollection 2019. PMID: 31130835 Free PMC article.
- Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200×. Proc SIAM Int Conf Data Min. 2014;2014:118-126. doi: 10.1137/1.9781611973440.14. PMID: 26473087 Free PMC article.
- Distributed Tensor Decomposition for Large Scale Health Analytics. Proc Int World Wide Web Conf. 2019 May;2019:659-669. doi: 10.1145/3308558.3313548. PMID: 31198910 Free PMC article.
- The ELPA library: scalable parallel eigenvalue solutions for electronic structure theory and computational science. J Phys Condens Matter. 2014 May 28;26(21):213201. doi: 10.1088/0953-8984/26/21/213201. Epub 2014 May 2. PMID: 24786764 Review.
Cited by
- Spectroscopic technologies and data fusion: Applications for the dairy industry. Front Nutr. 2023 Jan 11;9:1074688. doi: 10.3389/fnut.2022.1074688. eCollection 2022. PMID: 36712542 Free PMC article.
- GIFT: Guided and Interpretable Factorization for Tensors with an application to large-scale multi-platform cancer analysis. Bioinformatics. 2018 Dec 15;34(24):4151-4158. doi: 10.1093/bioinformatics/bty490. PMID: 29931238 Free PMC article.
- TASTE: Temporal and Static Tensor Factorization for Phenotyping Electronic Health Records. Proc ACM Conf Health Inference Learn (2020). 2020 Apr;2020:193-203. doi: 10.1145/3368555.3384464. PMID: 33659966 Free PMC article.
- Tensor-structured decomposition improves systems serology analysis. Mol Syst Biol. 2021 Sep;17(9):e10243. doi: 10.15252/msb.202110243. PMID: 34487431 Free PMC article.