Review

Beyond Euclid: an illustrated guide to modern machine learning with geometric, topological, and algebraic structures

Mathilde Papillon et al. Mach Learn Sci Technol. 2025 Sep 30;6(3):031002. doi: 10.1088/2632-2153/adf375. Epub 2025 Aug 1.

Abstract

The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been primarily developed for data lying in Euclidean space. Yet, modern machine learning increasingly encounters richly structured data that is inherently non-Euclidean. This data can exhibit intricate geometric, topological, and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We subsequently extract insights into current challenges and highlight exciting opportunities for future development in this field.

Keywords: algebra; geometric deep learning; geometry; machine learning; topology.


Figures

Figure 1.
Beyond Euclid: discrete topological structures. Left: Euclidean space discretized into a regular grid. Right: discrete topological spaces that go beyond classical discretized Euclidean space. Graphs, cellular complexes, and hypergraphs relax the assumption of the regular grid and allow points to be connected with more complex relationships. The arrow +topology indicates the addition of a non-Euclidean, discrete topological structure. Reproduced from Papillon et al (2024). CC BY 4.0.
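To make the relaxation concrete, here is a minimal Python sketch (the nodes, edges, and incidence-matrix encoding are illustrative assumptions, not taken from the figure): a graph's incidence matrix has exactly two nonzero entries per column, whereas a hypergraph column may connect any number of nodes.

```python
import numpy as np

# Hypothetical toy example: the same four nodes carrying pairwise edges
# (a graph) versus higher-order relations (a hypergraph), each encoded
# as a node-by-edge incidence matrix.
nodes = ["a", "b", "c", "d"]

edges = [("a", "b"), ("b", "c"), ("c", "d")]  # graph edges: pairs only
graph_inc = np.zeros((len(nodes), len(edges)), dtype=int)
for j, (u, v) in enumerate(edges):
    graph_inc[nodes.index(u), j] = 1
    graph_inc[nodes.index(v), j] = 1

hyperedges = [("a", "b", "c"), ("c", "d")]    # hyperedges: any arity
hyper_inc = np.zeros((len(nodes), len(hyperedges)), dtype=int)
for j, members in enumerate(hyperedges):
    for u in members:
        hyper_inc[nodes.index(u), j] = 1

print(graph_inc.sum(axis=0))  # [2 2 2]: every edge joins exactly two nodes
print(hyper_inc.sum(axis=0))  # [3 2]: hyperedges relax that constraint
```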
Figure 2.
Beyond Euclid: continuous geometric structures. Left: Euclidean space. Right: Riemannian manifolds that go beyond classical Euclidean space. Spheres, hyperbolic spaces, and tori relax the flatness assumption of Euclidean space and can exhibit positive or negative curvature. The arrow +geometry indicates the addition of a non-Euclidean, continuous geometric structure.
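A small sketch of what adding geometry means operationally (assumptions: numpy, the unit sphere as the manifold, and its standard closed-form exponential map; this is not code from the paper): moving "straight" on a curved space follows a geodesic rather than a Euclidean line, and the result stays on the manifold.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: walk from base point p along
    the geodesic with initial tangent velocity v (v must satisfy v . p = 0)."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return p
    return np.cos(n) * p + np.sin(n) * v / n

p = np.array([0.0, 0.0, 1.0])        # north pole
v = np.array([np.pi / 2, 0.0, 0.0])  # tangent vector at p
q = sphere_exp(p, v)

print(q)                   # ~[1, 0, 0]: a quarter turn along a great circle
print(np.linalg.norm(q))   # ~1.0: unlike p + v, the result stays on the sphere
```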
Figure 3.
Beyond Euclid: algebraic transformations. Left: Euclidean space. Right: group transformations that act on the elements of a Euclidean space: 2D translation from the group R2, 2D rotation from the group SO(2), 2D reflection from the group {−1, 1}, and a combination of translation and rotation from the special Euclidean group SE(2). The arrow +algebra indicates the addition of the non-Euclidean algebraic structure defining a group action.
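The group actions in this figure are easy to state in code. A minimal numpy sketch (the angle, translation, and point are made-up values):

```python
import numpy as np

x = np.array([1.0, 0.0])  # a point of the Euclidean plane
theta = np.pi / 4

R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotation, element of SO(2)
t = np.array([2.0, -1.0])                        # translation, element of R2
S = np.diag([1.0, -1.0])                         # reflection across the x-axis

print(x + t)      # 2D translation
print(R @ x)      # 2D rotation
print(S @ x)      # 2D reflection
print(R @ x + t)  # SE(2): rotation followed by translation
```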
Figure 4.
Geometric, topological, and algebraic structures in data as coordinates. Each card illustrates and exemplifies a type of coordinate. The arrows + topology, + geometry, and + algebra between cards indicate the addition of non-Euclidean topological, geometric, and algebraic structures, respectively. Notations: Rm: Euclidean space of dimension m; M: manifold; Ω: topological space; x: data as point or as signal. Reproduced from Yunakov (2011). CC BY SA 4.0.
Figure 5.
Geometric, topological, and algebraic structures in data as signals. Each card illustrates the structure of a signal and presents a real-world example. The arrows + topology, + geometry, and + algebra between cards indicate the addition of non-Euclidean topological, geometric, and algebraic structures, respectively. Notations: Rm: Euclidean space of dimension m; M: manifold; Ω: topological space; x: data as point or as signal. Reproduced from Asnaebsa (2022). CC BY SA 4.0. © Atmo, Inc. - used with permission. Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 6.
The Fréchet mean lies on the manifold, unlike the Euclidean mean.
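A minimal sketch of this contrast on the unit sphere (assumptions: numpy, closed-form exp/log maps for the sphere, and the standard fixed-point iteration for the Fréchet mean; a simplified stand-in, not the paper's implementation):

```python
import numpy as np

def sphere_exp(p, v):
    """Walk from p along tangent vector v (geodesic on the unit sphere)."""
    n = np.linalg.norm(v)
    return p if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def sphere_log(p, q):
    """Tangent vector at p pointing toward q, with geodesic length."""
    c = np.clip(p @ q, -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros_like(p) if n < 1e-12 else np.arccos(c) * w / n

def frechet_mean(points, iters=50):
    """Riemannian gradient descent: repeatedly average in the tangent space."""
    m = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        m = sphere_exp(m, np.mean([sphere_log(m, x) for x in points], axis=0))
    return m

X = np.eye(3)  # three points on the sphere: the coordinate axes
m = frechet_mean(X)
print(m, np.linalg.norm(m))                            # norm 1: on the sphere
print(X.mean(axis=0), np.linalg.norm(X.mean(axis=0)))  # norm < 1: off it
```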
Figure 7.
Geometric structures in regression categorized according to the geometry of the input and output data spaces (first two columns) and the geometry of the regression model (last column). Yellow boxes correspond to Euclidean space, while orange corresponds to non-Euclidean space. Partially Euclidean cases are light orange. Each pictogram represents the kind of parametrization used: linear (geodesic), nonlinear (nongeodesic) parametric, or nonparametric, with or without Bayesian priors.
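As one concrete instance of the taxonomy, here is a linearized sketch of regression with manifold-valued outputs (assumptions: numpy, the unit sphere, log-mapping outputs to a single tangent space, ordinary least squares there, and exp-mapping predictions back; a crude stand-in for true geodesic regression, not any specific method from the figure):

```python
import numpy as np

def sphere_exp(p, v):
    n = np.linalg.norm(v)
    return p if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def sphere_log(p, q):
    c = np.clip(p @ q, -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros_like(p) if n < 1e-12 else np.arccos(c) * w / n

# Synthetic data: sphere-valued outputs drifting along a geodesic as t grows.
base = np.array([0.0, 0.0, 1.0])
direction = np.array([1.0, 0.0, 0.0])  # tangent vector at base
ts = np.linspace(-0.5, 0.5, 11)
Y = np.stack([sphere_exp(base, t * direction) for t in ts])

# Linearize: log-map outputs at the base point, fit least squares, map back.
V = np.stack([sphere_log(base, y) for y in Y])  # tangent-space targets
A = np.stack([ts, np.ones_like(ts)], axis=1)    # design matrix [t, 1]
coef, *_ = np.linalg.lstsq(A, V, rcond=None)    # slope and intercept in R3

y_hat = sphere_exp(base, np.array([0.25, 1.0]) @ coef)  # predict at t = 0.25
print(y_hat, np.linalg.norm(y_hat))  # the prediction lies on the sphere
```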
Figure 8.
Geometric structures in latent embeddings categorized according to the geometry of the data and latent spaces (first two columns) and of the latent embedding model (last column). Yellow boxes correspond to Euclidean space, while orange corresponds to non-Euclidean space. Partially Euclidean cases are light orange. Each model is further classified by use and type of encoder/decoder, as well as whether it computes a posterior on the latents and whether it is Bayesian. (P)PC: (probabilistic) principal curves; GP LVM: Gaussian process latent variable model; PGA: principal geodesic analysis; GPCA: geodesic PCA.
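In the same spirit, here is a sketch of tangent PCA, a common linearized stand-in for the PGA listed in the figure (assumptions: numpy, the unit sphere, the same closed-form exp/log helpers, and synthetic data; illustrative only, not the paper's formulation):

```python
import numpy as np

def sphere_exp(p, v):
    n = np.linalg.norm(v)
    return p if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def sphere_log(p, q):
    c = np.clip(p @ q, -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros_like(p) if n < 1e-12 else np.arccos(c) * w / n

# Synthetic data: points hugging an arc of the equator, with small noise.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 50)
X = np.stack([np.cos(t), np.sin(t), 0.05 * rng.standard_normal(50)], axis=1)
X /= np.linalg.norm(X, axis=1, keepdims=True)

m = X[0].copy()  # Fréchet mean by fixed-point iteration
for _ in range(50):
    m = sphere_exp(m, np.mean([sphere_log(m, x) for x in X], axis=0))

V = np.stack([sphere_log(m, x) for x in X])  # data in the tangent space at m
s = np.linalg.svd(V - V.mean(axis=0), compute_uv=False)
print(s)  # one dominant singular value: the variation is essentially 1D
```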
Figure 9.
Topological structures in regression categorized according to the topology of the input and output data spaces (first two columns) and the kind of parametrization of the model (last two columns). Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 10.
Topological structures in latent embeddings categorized according to the topology of the input and output data spaces (first two columns), the kind of parametrization of the model (last two columns), and the nature of labels: complex-level or node-level (two rows per box). Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 11.
Geometry in Neural Network Layers organized according to the mathematical properties of the layer inputs x and outputs y. Inputs and outputs are data represented as coordinates in a space, typically a Euclidean space Rn or a manifold M. Notations: Rn: Euclidean space; M: manifold.
Figure 12.
Algebra in Neural Network Layers organized according to the mathematical properties of the layer inputs x and outputs y, which are both signals, i.e. functions from a domain to a codomain. The black curved arrows represent a group action on a space, such as a signal's domain. Notations: Rn: Euclidean space; M: manifold; Gp: group; Gp/H: homogeneous manifold for the group Gp.
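The defining property behind such layers is equivariance: applying the group action before the layer gives the same result as applying it after. A minimal check (assumptions: numpy, a 1D signal on the cyclic domain Z/16Z, circular convolution via the FFT, and cyclic shifts as the group action; illustrative, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)  # signal on the cyclic domain Z/16Z
k = rng.standard_normal(16)  # filter on the same domain

def circ_conv(x, k):
    """Circular convolution, computed via the convolution theorem."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

shift = 5
lhs = circ_conv(np.roll(x, shift), k)  # translate the input, then convolve
rhs = np.roll(circ_conv(x, k), shift)  # convolve, then translate the output

print(np.allclose(lhs, rhs))  # True: the layer commutes with the group action
```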
Figure 13.
Topology in Neural Network Layers organized according to the mathematical properties of the layer inputs x and outputs y, which are both signals, i.e. functions from a domain to a codomain. The black curved arrows represent a group action on a space. Notations: Rn: Euclidean space; P: point set; G: graph; Ω: topological space. Reproduced from Papillon et al (2024). CC BY 4.0.
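A representative building block from this figure is the graph layer. Here is a minimal message-passing sketch (assumptions: numpy, a 4-cycle graph, mean aggregation with self-loops, arbitrary random weights, and a ReLU; an illustrative skeleton in the style of common graph convolutions, not the paper's code):

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)  # adjacency matrix of a 4-cycle
X = np.eye(4)                              # one-hot input features per node

A_hat = A + np.eye(4)                      # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # row-normalize by node degree

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))            # shared weight matrix (arbitrary)

# One layer: average each node's neighborhood, transform, apply ReLU.
H = np.maximum(D_inv @ A_hat @ X @ W, 0.0)
print(H.shape)  # (4, 2): updated features, one row per node
```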
Figure 14.
Classical transformers illustrated by the mathematical properties of the attention coefficients and of the attention layer. The key k and the query q are the inputs to the attention coefficient α; the value v is the input to the attention layer, and the output value v′ is the weighted result of that layer. Inputs and outputs are represented as signals, i.e. as functions from a domain to a codomain. Notation: Rn: Euclidean space.
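In code, the classical layer reduces to a few lines of scaled dot-product attention (assumptions: numpy, made-up shapes and values, a single head, and no masking or learned projections):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 8                      # sequence length, feature dimension
q = rng.standard_normal((n, d))  # queries
k = rng.standard_normal((n, d))  # keys
v = rng.standard_normal((n, d))  # values

scores = q @ k.T / np.sqrt(d)                               # pairwise similarity
alpha = np.exp(scores - scores.max(axis=1, keepdims=True))  # stable softmax...
alpha /= alpha.sum(axis=1, keepdims=True)                   # ...rows sum to 1

v_prime = alpha @ v  # output values: attention-weighted mixture of v
print(v_prime.shape)  # (4, 8)
```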
Figure 15.
Geometry in attention mechanisms categorized according to the mathematical properties of the attention coefficients (first subrow of each row) and of the attention layer (second subrow of each row). The key k and the query q are the inputs to the attention coefficient α; the value v is the input to the attention layer, and the output value v′ is the weighted result of that layer. Inputs and outputs are represented as signals, i.e. as functions from a domain to a codomain. Notations: Rn: Euclidean space; M: manifold.
Figure 16.
Algebra in attention mechanisms categorized according to the mathematical properties of the attention coefficients (first subrow of each row) and of the attention layer (second subrow of each row). The black curved arrows represent the action of a group on a signal’s domain or codomain. Notations: Rn: Euclidean space; M: manifold.
Figure 17.
Topology in attention mechanisms categorized according to the mathematical properties of the attention coefficients and of the attention layer. Notations: Rn: Euclidean space; M: manifold; P: point set; G: graph; Ω: topological space. Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 18.
Dynamics as algebra on state and latent variables. In many applications, observed quantities evolve in time according to a low-dimensional latent dynamical system. The arrows indicate the addition of algebraic structure (dynamics) or geometric structure (representing real-world constraints). Notation: Rn: Euclidean space of dimension n; M: manifold; x: state variables as data points; κ: latent variables as data points; F()/G(): dynamical flow maps defined as algebras on Euclidean spaces and/or manifolds.
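A minimal sketch of this setup (assumptions: numpy, a 2D latent rotating under a linear flow map F, a fixed linear readout G into R10, and no noise; illustrative, not any specific model from the figure):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.1
F = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # latent flow map: rotation
G = rng.standard_normal((10, 2))                 # readout from latent to R10

kappa = np.array([1.0, 0.0])                     # initial latent state
observations = []
for _ in range(100):
    kappa = F @ kappa                            # evolve the latent dynamics
    observations.append(G @ kappa)               # observe the state variables

X = np.stack(observations)                       # (100, 10) observed data
print(np.linalg.matrix_rank(X))  # 2: high-dimensional data, low-dim dynamics
```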

References

    1. Abbaspourazad H, Erturk E, Pesaran B, Shanechi MM. Dynamical flexible inference of nonlinear latent factors and structures in neural population activity. Nat. Biomed. Eng. 2024;8:85–108. doi: 10.1038/s41551-023-01106-1.
    2. Abramson J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630:493–500. doi: 10.1038/s41586-024-07487-w.
    3. Ahmed NK, Rossi RA, Lee JB, Willke TL, Zhou R, Kong X, Eldardiry H. Role-based graph embeddings. IEEE Trans. Knowl. Data Eng. 2022;34:2401–15. doi: 10.1109/TKDE.2020.3006475.
    4. Akhøj M, Benn J, Grong E, Sommer S, Pennec X. Principal subbundles for dimension reduction. 2023. arXiv:2307.03128.
    5. Alain M, Takao S, Paige B, Deisenroth MP. Gaussian processes on cellular complexes. Proc. 41st Int. Conf. on Machine Learning (ICML'24). JMLR.org; 2024.
