Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2005 May 24;102(21):7426-31.
doi: 10.1073/pnas.0500334102. Epub 2005 May 17.

Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps

Affiliations

Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps

R R Coifman et al. Proc Natl Acad Sci U S A. .

Abstract

We provide a framework for structural multiscale geometric organization of graphs and subsets of R(n). We use diffusion semigroups to generate multiscale geometries in order to organize and represent complex structures. We show that appropriately selected eigenfunctions or scaling functions of Markov matrices, which describe local transitions, lead to macroscopic descriptions at different scales. The process of iterating or diffusing the Markov matrix is seen as a generalization of some aspects of the Newtonian paradigm, in which local infinitesimal transitions of a system lead to global macroscopic descriptions by integration. We provide a unified view of ideas from data analysis, machine learning, and numerical analysis.

PubMed Disclaimer

Figures

Fig. 1.
Fig. 1.
The spectra of powers of A (a), and the diffusion embedding of a mixture of two materials with different heat conductivity (b and c). The original geometry (b) is mapped as a “butterfly” set, in which the red (higher conductivity) and blue phases are organized according to the diffusion they generate: the cord length between two points in the diffusion space measures the quantity of heat that can travel between these points.
Fig. 2.
Fig. 2.
A dumbbell (a) is embedded by using the first three eigenfunctions (b). Because of the bottleneck, the two lobes are pushed away from each other. Observe also that in the embedding space, point A is closer to the handle (point B) than any point on the edge (like point C), because there are many more short paths joining A and B than A and C.
Fig. 3.
Fig. 3.
Original spiral curve (a) and the density of points on it (b), embedding obtained from the normalized graph Laplacian (c), and embedding from the Laplace–Beltrami approximation (d).
Fig. 4.
Fig. 4.
The original function f on the unit square (a), and the first nontrivial eigenfunction (b). On this plot, the colors corresponds to the values of f.
Fig. 5.
Fig. 5.
Pathology slice with partially labeled data (a), tissue classification from spectra by using 1-nearest neighbors (b), and tissue classification from spectra by using geometric diffusion (c). The three tissue classes are marked with blue, green, and pink.

References

    1. Hastie, T., Tibshirani, R. & Friedman, J. H. (2001) The Elements of Statistical Learning (Springer, Berlin), pp. 144–155.
    1. Coifman, R. R. & Saito, N. (1994) C. R. Acad. Sci. 319, 191–196.
    1. Ham, J., Lee, D. D., Mika, S. & Schölkopf, B. (2003) A Kernel View of the Dimensionality Reduction of Manifolds (Max-Planck-Institut für Biologische Kybernetik, Tübingen, Germany), Tech. Rep. TR-110, pp. 1–9.
    1. Roweis, S. T. & Saul, L. K. (2000) Science 290, 2323–2326. - PubMed
    1. Belkin, M. & Niyogi, P. (2003) Neural Comput. 15, 1373–1396.

LinkOut - more resources