Review

Beyond Euclid: an illustrated guide to modern machine learning with geometric, topological, and algebraic structures

Mathilde Papillon et al. Mach Learn Sci Technol. 2025 Sep 30;6(3):031002. doi: 10.1088/2632-2153/adf375. Epub 2025 Aug 1.

Abstract

The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been primarily developed for data lying in Euclidean space. Yet, modern machine learning increasingly encounters richly structured data that is inherently non-Euclidean. This data can exhibit intricate geometric, topological, and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We subsequently extract insights into current challenges and highlight exciting opportunities for future development in this field.

Keywords: algebra; geometric deep learning; geometry; machine learning; topology.


Figures

Figure 1.
Beyond Euclid: discrete topological structures. Left: Euclidean space discretized into a regular grid. Right: discrete topological spaces that go beyond classical discretized Euclidean space. Graphs, cellular complexes, and hypergraphs relax the assumption of the regular grid and allow points to be connected with more complex relationships. The arrow +topology indicates the addition of a non-Euclidean, discrete topological structure. Reproduced from Papillon et al (2024). CC BY 4.0.
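To make the relaxation concrete, here is a minimal Python sketch (the nodes, edges, and incidence-matrix encoding are illustrative assumptions, not taken from the figure): a graph's incidence matrix has exactly two nonzero entries per column, whereas a hypergraph column may connect any number of nodes.

```python
import numpy as np

# Hypothetical toy example: the same four nodes carrying pairwise edges
# (a graph) versus higher-order relations (a hypergraph), each encoded
# as a node-by-edge incidence matrix.
nodes = ["a", "b", "c", "d"]

edges = [("a", "b"), ("b", "c"), ("c", "d")]  # graph edges: pairs only
graph_inc = np.zeros((len(nodes), len(edges)), dtype=int)
for j, (u, v) in enumerate(edges):
    graph_inc[nodes.index(u), j] = 1
    graph_inc[nodes.index(v), j] = 1

hyperedges = [("a", "b", "c"), ("c", "d")]    # hyperedges: any arity
hyper_inc = np.zeros((len(nodes), len(hyperedges)), dtype=int)
for j, members in enumerate(hyperedges):
    for u in members:
        hyper_inc[nodes.index(u), j] = 1

print(graph_inc.sum(axis=0))  # [2 2 2]: every edge joins exactly two nodes
print(hyper_inc.sum(axis=0))  # [3 2]: hyperedges relax that constraint
```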
Figure 2.
Beyond Euclid: continuous geometric structures. Left: Euclidean space. Right: Riemannian manifolds that go beyond classical Euclidean space. Spheres, hyperbolic spaces, and tori relax the flatness assumption of Euclidean space and can exhibit positive or negative curvature. The arrow +geometry indicates the addition of a non-Euclidean, continuous geometric structure.
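A small sketch of what adding geometry means operationally (assumptions: numpy, the unit sphere as the manifold, and its standard closed-form exponential map; this is not code from the paper): moving "straight" on a curved space follows a geodesic rather than a Euclidean line, and the result stays on the manifold.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: walk from base point p along
    the geodesic with initial tangent velocity v (v must satisfy v . p = 0)."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return p
    return np.cos(n) * p + np.sin(n) * v / n

p = np.array([0.0, 0.0, 1.0])        # north pole
v = np.array([np.pi / 2, 0.0, 0.0])  # tangent vector at p
q = sphere_exp(p, v)

print(q)                   # ~[1, 0, 0]: a quarter turn along a great circle
print(np.linalg.norm(q))   # ~1.0: unlike p + v, the result stays on the sphere
```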
Figure 3.
Beyond Euclid: algebraic transformations. Left: Euclidean space. Right: group transformations that act on the elements of a Euclidean space: 2D translation from the group R2, 2D rotation from the group SO(2), 2D reflection from the group {−1, 1}, and a combination of translation and rotation from the special Euclidean group SE(2). The arrow +algebra indicates the addition of the non-Euclidean algebraic structure defining a group action.
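The group actions in this figure are easy to state in code. A minimal numpy sketch (the angle, translation, and point are made-up values):

```python
import numpy as np

x = np.array([1.0, 0.0])  # a point of the Euclidean plane
theta = np.pi / 4

R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotation, element of SO(2)
t = np.array([2.0, -1.0])                        # translation, element of R2
S = np.diag([1.0, -1.0])                         # reflection across the x-axis

print(x + t)      # 2D translation
print(R @ x)      # 2D rotation
print(S @ x)      # 2D reflection
print(R @ x + t)  # SE(2): rotation followed by translation
```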
Figure 4.
Geometric, topological, and algebraic structures in data as coordinates. Each card illustrates and exemplifies a type of coordinate. The arrows + topology, + geometry, and + algebra between cards indicate the addition of non-Euclidean topological, geometric, and algebraic structures, respectively. Notations: Rm: Euclidean space of dimension m; M: manifold; Ω: topological space; x: data as point or as signal. Reproduced from Yunakov (2011). CC BY SA 4.0.
Figure 5.
Geometric, topological, and algebraic structures in data as signals. Each card illustrates the structure of a signal and presents a real-world example. The arrows + topology, + geometry, and + algebra between cards indicate the addition of non-Euclidean topological, geometric, and algebraic structures, respectively. Notations: Rm: Euclidean space of dimension m; M: manifold; Ω: topological space; x: data as point or as signal. Reproduced from Asnaebsa (2022). CC BY SA 4.0. © Atmo, Inc. - used with permission. Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 6.
The Fréchet mean lies on the manifold, unlike the Euclidean mean.
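A minimal sketch of this contrast on the unit sphere (assumptions: numpy, closed-form exp/log maps for the sphere, and the standard fixed-point iteration for the Fréchet mean; a simplified stand-in, not the paper's implementation):

```python
import numpy as np

def sphere_exp(p, v):
    """Walk from p along tangent vector v (geodesic on the unit sphere)."""
    n = np.linalg.norm(v)
    return p if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def sphere_log(p, q):
    """Tangent vector at p pointing toward q, with geodesic length."""
    c = np.clip(p @ q, -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros_like(p) if n < 1e-12 else np.arccos(c) * w / n

def frechet_mean(points, iters=50):
    """Riemannian gradient descent: repeatedly average in the tangent space."""
    m = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        m = sphere_exp(m, np.mean([sphere_log(m, x) for x in points], axis=0))
    return m

X = np.eye(3)  # three points on the sphere: the coordinate axes
m = frechet_mean(X)
print(m, np.linalg.norm(m))                            # norm 1: on the sphere
print(X.mean(axis=0), np.linalg.norm(X.mean(axis=0)))  # norm < 1: off it
```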
Figure 7.
Geometric structures in regression categorized according to the geometry of the input and output data spaces (first two columns) and the geometry of the regression model (last column). Yellow boxes correspond to Euclidean space, while orange corresponds to non-Euclidean space. Partially Euclidean cases are light orange. Each pictogram represents the kind of parametrization used: linear (geodesic), nonlinear (nongeodesic) parametric, or nonparametric, with or without Bayesian priors.
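As one concrete instance of the taxonomy, here is a linearized sketch of regression with manifold-valued outputs (assumptions: numpy, the unit sphere, log-mapping outputs to a single tangent space, ordinary least squares there, and exp-mapping predictions back; a crude stand-in for true geodesic regression, not any specific method from the figure):

```python
import numpy as np

def sphere_exp(p, v):
    n = np.linalg.norm(v)
    return p if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def sphere_log(p, q):
    c = np.clip(p @ q, -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros_like(p) if n < 1e-12 else np.arccos(c) * w / n

# Synthetic data: sphere-valued outputs drifting along a geodesic as t grows.
base = np.array([0.0, 0.0, 1.0])
direction = np.array([1.0, 0.0, 0.0])  # tangent vector at base
ts = np.linspace(-0.5, 0.5, 11)
Y = np.stack([sphere_exp(base, t * direction) for t in ts])

# Linearize: log-map outputs at the base point, fit least squares, map back.
V = np.stack([sphere_log(base, y) for y in Y])  # tangent-space targets
A = np.stack([ts, np.ones_like(ts)], axis=1)    # design matrix [t, 1]
coef, *_ = np.linalg.lstsq(A, V, rcond=None)    # slope and intercept in R3

y_hat = sphere_exp(base, np.array([0.25, 1.0]) @ coef)  # predict at t = 0.25
print(y_hat, np.linalg.norm(y_hat))  # the prediction lies on the sphere
```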
Figure 8.
Geometric structures in latent embeddings categorized according to the geometry of the data and latent spaces (first two columns) and of the latent embedding model (last column). Yellow boxes correspond to Euclidean space, while orange corresponds to non-Euclidean space. Partially Euclidean cases are light orange. Each model is further classified by use and type of encoder/decoder, as well as whether it computes a posterior on the latents and whether it is Bayesian. (P)PC: (probabilistic) principal curves; GP LVM: Gaussian process latent variable model; PGA: principal geodesic analysis; GPCA: geodesic PCA.
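In the same spirit, here is a sketch of tangent PCA, a common linearized stand-in for the PGA listed in the figure (assumptions: numpy, the unit sphere, the same closed-form exp/log helpers, and synthetic data; illustrative only, not the paper's formulation):

```python
import numpy as np

def sphere_exp(p, v):
    n = np.linalg.norm(v)
    return p if n < 1e-12 else np.cos(n) * p + np.sin(n) * v / n

def sphere_log(p, q):
    c = np.clip(p @ q, -1.0, 1.0)
    w = q - c * p
    n = np.linalg.norm(w)
    return np.zeros_like(p) if n < 1e-12 else np.arccos(c) * w / n

# Synthetic data: points hugging an arc of the equator, with small noise.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 50)
X = np.stack([np.cos(t), np.sin(t), 0.05 * rng.standard_normal(50)], axis=1)
X /= np.linalg.norm(X, axis=1, keepdims=True)

m = X[0].copy()  # Fréchet mean by fixed-point iteration
for _ in range(50):
    m = sphere_exp(m, np.mean([sphere_log(m, x) for x in X], axis=0))

V = np.stack([sphere_log(m, x) for x in X])  # data in the tangent space at m
s = np.linalg.svd(V - V.mean(axis=0), compute_uv=False)
print(s)  # one dominant singular value: the variation is essentially 1D
```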
Figure 9.
Topological structures in regression categorized according to the topology of the input and output data spaces (first two columns) and the kind of parametrization of the model (last two columns). Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 10.
Topological structures in latent embeddings categorized according to the topology of the input and output data spaces (first two columns), the kind of parametrization of the model (last two columns), and the nature of labels: complex-level or node-level (two rows per box). Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 11.
Geometry in Neural Network Layers organized according to the mathematical properties of the layer inputs x and outputs y. Inputs and outputs are data represented as coordinates in a space, typically a Euclidean space Rn or a manifold M. Notations: Rn: Euclidean space; M: manifold.
Figure 12.
Algebra in Neural Network Layers organized according to the mathematical properties of the layer inputs x and outputs y, which are both signals, i.e. functions from a domain to a codomain. The black curved arrows represent a group action on a space, such as a signal's domain. Notations: Rn: Euclidean space; M: manifold; Gp: group; Gp/H: homogeneous manifold for the group Gp.
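The defining property behind such layers is equivariance: applying the group action before the layer gives the same result as applying it after. A minimal check (assumptions: numpy, a 1D signal on the cyclic domain Z/16Z, circular convolution via the FFT, and cyclic shifts as the group action; illustrative, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)  # signal on the cyclic domain Z/16Z
k = rng.standard_normal(16)  # filter on the same domain

def circ_conv(x, k):
    """Circular convolution, computed via the convolution theorem."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

shift = 5
lhs = circ_conv(np.roll(x, shift), k)  # translate the input, then convolve
rhs = np.roll(circ_conv(x, k), shift)  # convolve, then translate the output

print(np.allclose(lhs, rhs))  # True: the layer commutes with the group action
```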
Figure 13.
Topology in Neural Network Layers organized according to the mathematical properties of the layer inputs x and outputs y, which are both signals, i.e. functions from a domain to a codomain. The black curved arrows represent a group action on a space. Notations: Rn: Euclidean space; P: point set; G: graph; Ω: topological space. Reproduced from Papillon et al (2024). CC BY 4.0.
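A representative building block from this figure is the graph layer. Here is a minimal message-passing sketch (assumptions: numpy, a 4-cycle graph, mean aggregation with self-loops, arbitrary random weights, and a ReLU; an illustrative skeleton in the style of common graph convolutions, not the paper's code):

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)  # adjacency matrix of a 4-cycle
X = np.eye(4)                              # one-hot input features per node

A_hat = A + np.eye(4)                      # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # row-normalize by node degree

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))            # shared weight matrix (arbitrary)

# One layer: average each node's neighborhood, transform, apply ReLU.
H = np.maximum(D_inv @ A_hat @ X @ W, 0.0)
print(H.shape)  # (4, 2): updated features, one row per node
```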
Figure 14.
Classical transformers illustrated by the mathematical properties of the attention coefficients and of the attention layer. The key k and the query q are the inputs to the attention coefficient α; the value v is the input to the attention layer, and the output value v′ is the weighted result of that layer. Inputs and outputs are represented as signals, i.e. as functions from a domain to a codomain. Notation: Rn: Euclidean space.
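In code, the classical layer reduces to a few lines of scaled dot-product attention (assumptions: numpy, made-up shapes and values, a single head, and no masking or learned projections):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 8                      # sequence length, feature dimension
q = rng.standard_normal((n, d))  # queries
k = rng.standard_normal((n, d))  # keys
v = rng.standard_normal((n, d))  # values

scores = q @ k.T / np.sqrt(d)                               # pairwise similarity
alpha = np.exp(scores - scores.max(axis=1, keepdims=True))  # stable softmax...
alpha /= alpha.sum(axis=1, keepdims=True)                   # ...rows sum to 1

v_prime = alpha @ v  # output values: attention-weighted mixture of v
print(v_prime.shape)  # (4, 8)
```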
Figure 15.
Geometry in attention mechanisms categorized according to the mathematical properties of the attention coefficients (first subrow of each row) and of the attention layer (second subrow of each row). The key k and the query q are the inputs to the attention coefficient α; the value v is the input to the attention layer, and the output value v′ is the weighted result of that layer. Inputs and outputs are represented as signals, i.e. as functions from a domain to a codomain. Notations: Rn: Euclidean space; M: manifold.
Figure 16.
Algebra in attention mechanisms categorized according to the mathematical properties of the attention coefficients (first subrow of each row) and of the attention layer (second subrow of each row). The black curved arrows represent the action of a group on a signal’s domain or codomain. Notations: Rn: Euclidean space; M: manifold.
Figure 17.
Topology in attention mechanisms categorized according to the mathematical properties of the attention coefficients and of the attention layer. Notations: Rn: Euclidean space; M: manifold; P: point set; G: graph; Ω: topological space. Reproduced from Papillon et al (2024). CC BY 4.0.
Figure 18.
Dynamics as algebra on state and latent variables. In many applications, observed quantities evolve in time according to a low-dimensional latent dynamical system. The arrows indicate the addition of algebraic structure (dynamics) or geometric structure (representing real-world constraints). Notation: Rn: Euclidean space of dimension n; M: manifold; x: state variables as data points; κ: latent variables as data points; F()/G(): dynamical flow maps defined as algebras on Euclidean spaces and/or manifolds.
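A minimal sketch of this setup (assumptions: numpy, a 2D latent rotating under a linear flow map F, a fixed linear readout G into R10, and no noise; illustrative, not any specific model from the figure):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.1
F = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # latent flow map: rotation
G = rng.standard_normal((10, 2))                 # readout from latent to R10

kappa = np.array([1.0, 0.0])                     # initial latent state
observations = []
for _ in range(100):
    kappa = F @ kappa                            # evolve the latent dynamics
    observations.append(G @ kappa)               # observe the state variables

X = np.stack(observations)                       # (100, 10) observed data
print(np.linalg.matrix_rank(X))  # 2: high-dimensional data, low-dim dynamics
```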

References

    1. Abbaspourazad H, Erturk E, Pesaran B, Shanechi MM. Dynamical flexible inference of nonlinear latent factors and structures in neural population activity. Nat. Biomed. Eng. 2024;8:85–108. doi: 10.1038/s41551-023-01106-1.
    2. Abramson J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630:493–500. doi: 10.1038/s41586-024-07487-w.
    3. Ahmed NK, Rossi RA, Lee JB, Willke TL, Zhou R, Kong X, Eldardiry H. Role-based graph embeddings. IEEE Trans. Knowl. Data Eng. 2022;34:2401–15. doi: 10.1109/TKDE.2020.3006475.
    4. Akhøj M, Benn J, Grong E, Sommer S, Pennec X. Principal subbundles for dimension reduction. 2023. arXiv:2307.03128.
    5. Alain M, Takao S, Paige B, Deisenroth MP. Gaussian processes on cellular complexes. Proc. 41st Int. Conf. on Machine Learning (ICML'24). JMLR.org; 2024.
