Comput Graph Forum. 2023 Jun;42(3):337-348. doi: 10.1111/cgf.14834. Epub 2023 Jun 27.

ParaDime: A Framework for Parametric Dimensionality Reduction


Andreas Hinterreiter et al. Comput Graph Forum. 2023 Jun.

Abstract

ParaDime is a framework for parametric dimensionality reduction (DR). In parametric DR, neural networks are trained to embed high-dimensional data items in a low-dimensional space while minimizing an objective function. ParaDime builds on the idea that the objective functions of several modern DR techniques result from transformed inter-item relationships. It provides a common interface for specifying these relations and transformations and for defining how they are used within the losses that govern the training process. Through this interface, ParaDime unifies parametric versions of DR techniques such as metric MDS, t-SNE, and UMAP. It allows users to fully customize all aspects of the DR process. We show how this ease of customization makes ParaDime suitable for experimenting with interesting techniques such as hybrid classification/embedding models and supervised DR. This way, ParaDime opens up new possibilities for visualizing high-dimensional data.
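The core pattern described above — a neural network trained so that the (transformed) inter-item relations of the low-dimensional embedding match the (transformed) relations of the original data — can be illustrated with a short PyTorch sketch. The sketch below is not the ParaDime API; all class and function names are illustrative, and the plain squared-distance loss makes it a metric-MDS-like instance of the pattern.

    # Illustrative sketch only (not the ParaDime API): a small network is trained
    # so that pairwise relations in the 2D embedding match those of the data.
    import torch

    class Embedder(torch.nn.Module):
        def __init__(self, in_dim, out_dim=2):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(in_dim, 100), torch.nn.ReLU(),
                torch.nn.Linear(100, out_dim))

        def forward(self, x):
            return self.net(x)

    def train_embedder(data, epochs=50, batch_size=256, lr=1e-3):
        model = Embedder(data.shape[1])
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loader = torch.utils.data.DataLoader(
            torch.utils.data.TensorDataset(data),
            batch_size=batch_size, shuffle=True)
        for _ in range(epochs):
            for (batch,) in loader:
                emb = model(batch)
                hd_rel = torch.cdist(batch, batch)  # inter-item relations in data space
                ld_rel = torch.cdist(emb, emb)      # inter-item relations in embedding space
                loss = ((hd_rel - ld_rel) ** 2).mean()  # metric-MDS-style loss
                opt.zero_grad()
                loss.backward()
                opt.step()
        return model

Because the mapping is a trained network rather than a per-item optimization, unseen items can afterwards be embedded with a plain forward pass (model(new_data) under torch.no_grad()), which is the property illustrated in Figure 1.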

Keywords: CCS Concepts; Information visualization; Learning latent representations; • Computing methodologies → Neural networks; • Human‐centered computing → Visualization systems and tools.


Figures

Figure 1
ParaDime is a framework for parametric dimensionality reduction. Left: Data flow in a single training phase of a ParaDime routine. Right: Parametric t‐SNE trained on a subset of 5000 images from the MNIST dataset [LeC05] and applied to 15,000 unseen images.
Figure 2
Normalized stress [EMK*19] for parametric versions of metric MDS compared with the non‐parametric SMACOF implementation of scikit‐learn [PVG*11]. The non‐linear models were fully connected neural networks with hidden layer dimensions as indicated. The routine labeled “Direct” is a non‐parametric routine using a batch‐wise optimization which mimics that of the parametric ones. All models were trained on a 10‐dimensional diabetes dataset with 442 items [EHJT04].
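Normalized stress is commonly defined (and presumably also here) as the sum of squared differences between original and embedded pairwise distances, divided by the sum of squared original distances. A minimal sketch of that definition:

    # Minimal sketch of normalized stress under its common definition:
    # squared distance differences, normalized by the squared original distances.
    import torch

    def normalized_stress(x_high, x_low):
        d_high = torch.cdist(x_high, x_high)
        d_low = torch.cdist(x_low, x_low)
        return ((d_high - d_low) ** 2).sum() / (d_high ** 2).sum()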
Figure 3
Embeddings of hybrid embedding/classification routines for the MNIST dataset [LeC05] created with ParaDime. The relative weight of the embedding loss component is indicated by w_r,emb, and the weight of the classification component was 1 − w_r,emb. All embedding‐related specifications were the same as those of the ParaDime parametric UMAP routine. The routines were trained on a subset of 5000 randomly sampled MNIST images. Test accuracy was calculated on a different subset of 5000 images. Trustworthiness [VK01; EMK*19] was calculated based on ten nearest neighbors.
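The hybrid routines in this figure weight an embedding loss against a classification loss. A hedged sketch of such a weighted combination is shown below; the two-headed architecture and all names are illustrative assumptions, not necessarily how ParaDime implements it.

    # Illustrative sketch of a hybrid embedding/classification objective:
    # total loss = w_emb * embedding loss + (1 - w_emb) * cross-entropy loss.
    # The shared trunk with two heads is an assumption for illustration.
    import torch

    class HybridModel(torch.nn.Module):
        def __init__(self, in_dim, n_classes, emb_dim=2):
            super().__init__()
            self.trunk = torch.nn.Sequential(
                torch.nn.Linear(in_dim, 100), torch.nn.ReLU())
            self.emb_head = torch.nn.Linear(100, emb_dim)
            self.cls_head = torch.nn.Linear(100, n_classes)

        def forward(self, x):
            h = self.trunk(x)
            return self.emb_head(h), self.cls_head(h)

    def hybrid_loss(embedding_loss, logits, labels, w_emb):
        cls_loss = torch.nn.functional.cross_entropy(logits, labels)
        return w_emb * embedding_loss + (1.0 - w_emb) * cls_loss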
Figure 4
Supervised embeddings of a subset of the forest covertype dataset [CSSB10]. All embeddings labeled with R are supervised versions of parametric t‐SNE, where supervision was included by means of a triplet loss based on the ground truth labels. R is the ratio of the weights of the t‐SNE loss and the triplet loss. For comparison, embeddings created with scikit‐learn's non‐parametric t‐SNE implementation and with a plain ParaDime t‐SNE version (using item‐based sampling and no triplet loss) are shown. The perplexity was 200 in all cases, and a class‐balanced subset of 7000 items was used.
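The caption describes supervision added via a triplet loss on the ground-truth labels, weighted against the t-SNE loss. A hedged sketch of that combination follows; the per-anchor random sampling of same-label and different-label partners is an illustrative assumption, not ParaDime's actual sampling scheme.

    # Illustrative sketch of label-based triplet supervision added to a t-SNE loss.
    # The positive/negative sampling is a simple assumption for demonstration.
    import torch

    triplet = torch.nn.TripletMarginLoss(margin=1.0)

    def supervised_loss(tsne_loss, emb, labels, w_tsne, w_triplet):
        pos_idx, neg_idx = [], []
        for y in labels:
            same = (labels == y).nonzero().flatten()   # items sharing the label
            diff = (labels != y).nonzero().flatten()   # items with another label
            pos_idx.append(same[torch.randint(len(same), (1,))])
            neg_idx.append(diff[torch.randint(len(diff), (1,))])
        positives = emb[torch.cat(pos_idx)]
        negatives = emb[torch.cat(neg_idx)]
        return w_tsne * tsne_loss + w_triplet * triplet(emb, positives, negatives)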
Figure 5
Attribute‐guided embeddings of a subset of the forest covertype dataset [CSSB10]. Attribute guiding was implemented by combining t‐SNE with a correlation loss which orders the data points along the x‐axis by the value of the eighth feature (hillshade at noon). The weights for the embeddings shown are (w_t-SNE, w_corr) = (1, 0), (5000, 1), (1000, 1), and (100, 1), respectively. The bar chart on the right shows the feature importance scores for the learned embeddings, computed with integrated gradients.
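A correlation loss of the kind described here can be sketched as the negative Pearson correlation between the embedding's x-coordinate and the guiding attribute; the exact form used for the figure may differ.

    # Illustrative sketch of an attribute-guiding correlation loss: it rewards a
    # high Pearson correlation between the x-coordinate and one chosen feature.
    import torch

    def correlation_loss(emb, attribute):
        x = emb[:, 0] - emb[:, 0].mean()                   # centered x-coordinates
        a = attribute.float() - attribute.float().mean()   # centered attribute values
        corr = (x * a).sum() / (x.norm() * a.norm() + 1e-12)
        return -corr            # minimizing this maximizes the correlation

    def guided_loss(tsne_loss, emb, attribute, w_tsne, w_corr):
        return w_tsne * tsne_loss + w_corr * correlation_loss(emb, attribute)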
