Comparative Study
Proc Natl Acad Sci U S A. 2006 Dec 5;103(49):18521-7.
doi: 10.1073/pnas.0508445103. Epub 2006 Nov 27.

Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem

Douglas L Theobald et al. Proc Natl Acad Sci U S A.

Abstract

Procrustes analysis involves finding the optimal superposition of two or more "forms" via rotations, translations, and scalings. Procrustes problems arise in a wide range of scientific disciplines, especially when the geometrical shapes of objects are compared, contrasted, and analyzed. Classically, the optimal transformations are found by minimizing the sum of the squared distances between corresponding points in the forms. Despite its widespread use, the ordinary unweighted least-squares (LS) criterion can give erroneous solutions when the errors have heterogeneous variances (heteroscedasticity) or the errors are correlated, both common occurrences with real data. In contrast, maximum likelihood (ML) estimation can provide accurate and consistent statistical estimates in the presence of both heteroscedasticity and correlation. Here we provide a complete solution to the nonisotropic ML Procrustes problem assuming a matrix Gaussian distribution with factored covariances. Our analysis generalizes, simplifies, and extends results from previous discussions of the ML Procrustes problem. An iterative algorithm is presented for the simultaneous, numerical determination of the ML solutions.
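To make the classical criterion concrete, the following is a minimal sketch of a pairwise superposition in two dimensions. It is not the paper's algorithm: it covers only rotation and translation (no scaling or reflection), and the function name `superpose_2d` and its optional `weights` argument are hypothetical names introduced for illustration. With no weights it minimizes the ordinary sum of squared distances; passing weights such as inverse variances gives a weighted LS fit, which is one simple way to account for heteroscedastic, uncorrelated landmarks when the variances are assumed known.

```python
import math

def superpose_2d(mobile, target, weights=None):
    """Rotate and translate `mobile` onto `target` (lists of (x, y) pairs)
    by minimizing the weighted sum of squared distances between
    corresponding points. Illustrative sketch, not the paper's method."""
    n = len(mobile)
    if weights is None:
        weights = [1.0] * n  # ordinary (unweighted) least squares
    wsum = sum(weights)

    # Weighted centroids of both forms.
    mcx = sum(w * p[0] for w, p in zip(weights, mobile)) / wsum
    mcy = sum(w * p[1] for w, p in zip(weights, mobile)) / wsum
    tcx = sum(w * p[0] for w, p in zip(weights, target)) / wsum
    tcy = sum(w * p[1] for w, p in zip(weights, target)) / wsum

    # Accumulate the weighted dot-product and cross-product terms
    # of the centered coordinates.
    a = 0.0
    b = 0.0
    for w, (mx, my), (tx, ty) in zip(weights, mobile, target):
        mx, my, tx, ty = mx - mcx, my - mcy, tx - tcx, ty - tcy
        a += w * (mx * tx + my * ty)
        b += w * (mx * ty - my * tx)

    # The optimal rotation angle maximizes a*cos(theta) + b*sin(theta).
    theta = math.atan2(b, a)
    c, s = math.cos(theta), math.sin(theta)

    # Apply the rotation about the mobile centroid, then move the
    # result onto the target centroid.
    fitted = [
        (c * (x - mcx) - s * (y - mcy) + tcx,
         s * (x - mcx) + c * (y - mcy) + tcy)
        for x, y in mobile
    ]
    return fitted, theta
```

For example, if `target` is an exact rotated and translated copy of `mobile`, the function recovers the rotation angle and `fitted` coincides with `target`. In three dimensions the rotation step is usually solved with a singular value decomposition or quaternion method rather than a single angle.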

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
ML and LS superposition of simulated protein structure data. (A) The “true” superposition, generated from a known mean structure with known covariance matrices, before arbitrary translations and rotations have been applied. In generating the simulated data, the set of alpha carbon atoms from model 1 of Protein Data Bank entry 2SDF (www.pdb.org) was used as the mean form (67 atoms/landmarks, squared radius of gyration = 152 Å²). The nondiagonal 67 × 67 landmark covariance matrix was based on values calculated from the superposition given in 2SDF, with variances ranging from 0.01 to 80 Å² and correlations ranging from 0 to 0.99 (see Data Sets 1 and 2). The known dimensional covariance matrix had eigenvalues of 0.16667, 0.33333, and 0.5 corresponding to the x, y, and z axes of 2SDF model 1, respectively. (B) An ordinary LS superposition of the simulated data. (C) An ML superposition of the simulated data, assuming a diagonal landmark covariance matrix Σ (i.e., no correlations), Ξ = 1, and inverse gamma distributed variances.
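As a rough illustration of how forms with factored covariances could be simulated, the sketch below draws perturbed copies of a mean form under a simplifying assumption that both the landmark covariance and the dimensional covariance are diagonal, so that coordinate j of landmark i has variance equal to the product of landmark variance i and dimensional variance j. The simulation described in the caption used a nondiagonal landmark covariance matrix, which this sketch does not reproduce. The function name `simulate_forms`, its parameters, and the three-landmark mean form in the usage note are hypothetical.

```python
import math
import random

def simulate_forms(mean_form, landmark_vars, dim_vars, n_forms, seed=0):
    """Draw `n_forms` noisy copies of `mean_form` (a list of coordinate
    tuples). Assumes diagonal landmark and dimensional covariances, so
    Var(X[i][j]) = landmark_vars[i] * dim_vars[j]. Illustrative only."""
    rng = random.Random(seed)  # seeded for reproducibility
    forms = []
    for _ in range(n_forms):
        form = [
            tuple(
                m + rng.gauss(0.0, math.sqrt(lv * dv))
                for m, dv in zip(point, dim_vars)
            )
            for point, lv in zip(mean_form, landmark_vars)
        ]
        forms.append(form)
    return forms
```

A call such as `simulate_forms([(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0)], [0.01, 1.0, 80.0], [0.16667, 0.33333, 0.5], n_forms=10)` would use the dimensional variances quoted in the caption along the x, y, and z axes, with landmark variances spanning the quoted range. Correlated landmarks would require a full covariance matrix and a factorization such as Cholesky.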
