Parametric matrix models

Patrick Cook et al.

Nat Commun. 2025 Jul 1;16(1):5929. doi: 10.1038/s41467-025-61362-4.
Abstract

We present a general class of machine learning algorithms called parametric matrix models. In contrast with most existing machine learning models that imitate the biology of neurons, parametric matrix models use matrix equations that emulate physical systems. Similar to how physics problems are usually solved, parametric matrix models learn the governing equations that lead to the desired outputs. Parametric matrix models can be efficiently trained from empirical data, and the equations may use algebraic, differential, or integral relations. While originally designed for scientific computing, we prove that parametric matrix models are universal function approximators that can be applied to general machine learning problems. After introducing the underlying theory, we apply parametric matrix models to a series of different challenges that show their performance for a wide range of problems. For all the challenges tested here, parametric matrix models produce accurate results within an efficient and interpretable computational framework that allows for input feature extrapolation.
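To make the idea concrete, here is a minimal sketch of a PMM under our own simplifying assumptions (a real symmetric matrix linear in one input feature, with the lowest eigenvalue as the output); this is an illustration in the spirit of the abstract, not the authors' implementation:

```python
# Minimal PMM sketch (our own assumptions, not the authors' code):
# parametrize M(c) = A + c * B with trainable symmetric matrices A, B
# and fit the lowest eigenvalue of M(c) to scalar training targets.
import numpy as np
from scipy.optimize import minimize

n = 4                      # matrix dimension, a model hyperparameter
k = n * (n + 1) // 2       # free parameters per symmetric matrix
rng = np.random.default_rng(0)

def symmetric(v):
    """Assemble an n x n real symmetric matrix from k free parameters."""
    m = np.zeros((n, n))
    m[np.triu_indices(n)] = v
    return m + np.triu(m, 1).T

def model(params, c):
    """Lowest eigenvalue of M(c) = A + c * B for each input c."""
    A, B = symmetric(params[:k]), symmetric(params[k:])
    return np.array([np.linalg.eigvalsh(A + ci * B)[0] for ci in c])

# Toy data: a smooth scalar function of one input feature.
c_train = np.linspace(-1.0, 1.0, 25)
y_train = np.sin(2.0 * c_train) - 0.5 * c_train

def loss(params):
    return np.mean((model(params, c_train) - y_train) ** 2)

result = minimize(loss, rng.normal(size=2 * k), method="L-BFGS-B")
print("training MSE:", loss(result.x))
```

Because the output is an eigenvalue of a matrix depending smoothly on the input, the fitted model carries the analytic structure of an eigenvalue problem rather than that of a generic activation function; the paper's PMMs generalize this setup to Hermitian matrices, multiple input features, and other matrix equations.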


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Fig. 1. PMM results for regression and Trotter extrapolation.
a Performance on regression problems. Normalized mean absolute error on withheld test data for the PMM (blue) compared against several standard techniques: Kernel Ridge Regression (KRR, orange), Multilayer Perceptron (MLP, green), k-Nearest Neighbors (KNN, red), Extreme Gradient Boosting (XGB, purple), Support Vector Regression (SVR, brown), and Random Forest Regression (RFR, pink). Normalized mean absolute error on the provided training and validation data is shown for selected datasets as contrasting squares. Mean performance on withheld test data across all problems is shown as horizontal lines. b Extrapolated Trotter approximation for quantum computing simulations. We plot the lowest three energies of the effective Hamiltonian for the one-dimensional Heisenberg model with Dzyaloshinskii–Moriya (DM) interactions versus time step dt. We compare results obtained using a PMM (dashed), Multilayer Perceptron (MLP, dotted), and polynomial interpolation (Poly, dash-dotted). All training (diamonds) and validation (circles) samples are located away from dt = 0, where data acquisition on a quantum computer would be practical. The inset shows the relative error in the predicted energies at dt = 0 for the three models.
Fig. 2
Fig. 2. ALMG model results.
Left Panel: lowest two energy levels of the ALMG model versus ξ. Center Panel: average particle density for the ground state of the ALMG model versus ξ. Right Panel: derivative of the lowest two energy levels with respect to the control parameter ξ as a function of ξ; no data on the derivatives was provided to either model. In each panel we show PMM results compared with Multilayer Perceptron (MLP) results: the upper plots show the quantity itself, the lower plots show the absolute error, the main plots show the region around the phase transition, and the insets show the full domain where data was provided.
Fig. 3
Fig. 3. Complex-valued ground state energy of the ALMG model for complex ξ.
We show PMM predictions for the complex-valued ground state energy for complex values of ξ, using training data at only real values of ξ. The left plot shows the exact results, the middle plot shows the PMM predictions, and the right plot shows the absolute error.
Fig. 4
Fig. 4. Concatenated line segments for one input feature.
The thick line shows a particular eigenvalue λ(c_1) that traces out a function composed of several concatenated line segments. The dashed lines show the affine functions f_j(c_1) = a_j c_1 + b_j that describe the line segments.
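The mechanism can be made concrete with a 2 × 2 example (our own minimal illustration, not the paper's construction): two affine functions coupled by a small constant ε.

```latex
% Minimal 2x2 illustration: two affine functions f_1(c_1), f_2(c_1)
% coupled by a small off-diagonal constant epsilon.
\[
M(c_1) =
\begin{pmatrix}
f_1(c_1) & \varepsilon \\
\varepsilon & f_2(c_1)
\end{pmatrix},
\qquad
f_j(c_1) = a_j c_1 + b_j,
\]
\[
\lambda_{\pm}(c_1) = \frac{f_1 + f_2}{2}
\pm \sqrt{\left(\frac{f_1 - f_2}{2}\right)^2 + \varepsilon^2}.
\]
```

As ε → 0, λ_−(c_1) approaches min{f_1(c_1), f_2(c_1)}: the lower eigenvalue traces the concatenated line segments, with the corner at the crossing smoothed on a scale set by ε.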
Fig. 5
Fig. 5. Comparison of PMM and EC results for ground state energy extrapolation.
We show results for a 2 × 2 PMM (dashed blue) and eigenvector continuation (EC, dotted red) with 5 training samples on the task of extrapolating the ground state energy of a system of N non-interacting spins. The exact ground state energy is shown in solid black.
Fig. 6
Fig. 6. Diagram of the image classification PMM algorithm.
We conceptually illustrate the inference process of the image PMM in the context of classifying images of dogs and cats. We start with the original image divided into four rectangular windows, W_1, …, W_4, with trainable quasi-congruence transformation matrices K_1, L_1, …, K_4, L_4. From each window the normalized row- and column-wise Gram matrices, R_1, …, R_4 and C_1, …, C_4, are calculated and summed to form the latent-space feature encoding matrix M. Additional trainable quasi-congruence transformation matrices, D_1 and D_2, are applied and added to trainable Hermitian bias matrices, B_1 and B_2, to form the class-specific latent-space feature matrices, H_1 and H_2, which are the primary matrices of the PMM. Finally, the eigensystems of these primary matrices are used to form bilinears with the secondary matrices of the PMM before a softmax is applied to convert the predictions to probabilities.
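A schematic numpy sketch of this forward pass follows. Several details are our own assumptions rather than the paper's choices: quadrant windows, trace normalization of the Gram matrices, a bilinear of the lowest eigenvector only, and random stand-ins for all trainable matrices.

```python
# Schematic forward pass of the image PMM (simplified assumptions).
import numpy as np

rng = np.random.default_rng(1)
d = 8                                 # latent matrix dimension (hyperparameter)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

image = rng.random((16, 16))
# Four rectangular windows W_1..W_4 (quadrants here, an assumption).
windows = [image[:8, :8], image[:8, 8:], image[8:, :8], image[8:, 8:]]

# Trainable quasi-congruence transforms K_i, L_i (random stand-ins).
Ks = [rng.normal(size=(d, 8)) for _ in windows]
Ls = [rng.normal(size=(d, 8)) for _ in windows]

# Latent feature matrix M: sum of normalized row-/column-wise Gram matrices.
M = np.zeros((d, d))
for W, K, L in zip(windows, Ks, Ls):
    R = K @ (W @ W.T) @ K.T           # row-wise Gram, transformed
    C = L @ (W.T @ W) @ L.T           # column-wise Gram, transformed
    M += R / np.trace(R) + C / np.trace(C)

# Class-specific primary matrices H_k = D_k M D_k^T + B_k, B_k symmetric.
logits = []
for _ in range(2):                    # two classes: dog, cat
    D = rng.normal(size=(d, d))
    G = rng.normal(size=(d, d))
    B = (G + G.T) / 2                 # Hermitian (here real symmetric) bias
    H = D @ M @ D.T + B
    w, V = np.linalg.eigh(H)
    S = rng.normal(size=(d, d)); S = (S + S.T) / 2   # secondary matrix
    v = V[:, 0]                       # lowest eigenvector (an assumption)
    logits.append(v @ S @ v)          # bilinear with the secondary matrix

print("class probabilities:", softmax(np.array(logits)))
```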
Fig. 7
Fig. 7. Diagrams showing architectures.
a Diagram of the convolutional layer architecture used in the ConvPMM for the MNIST-Digits dataset. The architecture consists of four layers of 64, 32, 16, and 8 trainable complex-valued filters of size 3 × 3 with a stride of 1 and a ReLU activation function. The first two layers use “valid” padding while the last two layers use “same” padding. b Model diagram for the frozen, pre-trained ResNet50 model used as a feature extractor in the hybrid transfer learning experiments. The input shape is determined by the dataset, which is 32 × 32 × 3 for the CIFAR-10 dataset used in this figure as an example. The output shape is 2048 for the extracted feature vector. c Model diagram for the trainable feedforward neural network (FNN) used in the hybrid transfer learning experiments. The output shape of the final layer is determined by the number of classes in the dataset, which is 10 in this figure as an example.
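As a rough sketch of the layer stack in panel a, the following uses real-valued Keras layers as a stand-in (the paper's filters are complex-valued, and the MNIST-Digits input shape 28 × 28 × 1 is assumed):

```python
# Sketch of the ConvPMM convolutional stack from panel (a), with
# real-valued Keras layers standing in for the complex-valued filters.
from tensorflow import keras
from tensorflow.keras import layers

conv_stack = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),   # MNIST-Digits (assumed input shape)
    layers.Conv2D(64, 3, strides=1, padding="valid", activation="relu"),
    layers.Conv2D(32, 3, strides=1, padding="valid", activation="relu"),
    layers.Conv2D(16, 3, strides=1, padding="same", activation="relu"),
    layers.Conv2D(8, 3, strides=1, padding="same", activation="relu"),
])
conv_stack.summary()                  # output feature map: 24 x 24 x 8
```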

