J Chem Theory Comput. 2025 May 13;21(9):4855-4866.
doi: 10.1021/acs.jctc.5c00076. Epub 2025 Apr 20.

AMUSET-TICA: A Tensor-Based Approach for Identifying Slow Collective Variables in Biomolecular Dynamics

Siqin Cao et al. J Chem Theory Comput. 2025.

Abstract

Elucidating collective variables (CVs) for biomolecular dynamics is crucial for understanding numerous biological processes. By leveraging the tensor-train data structure, a multilinear version of the AMUSE (Algorithm for Multiple Unknown Signals) algorithm for Koopman approximation (AMUSEt) was recently developed to identify CVs for biomolecular dynamics. To find slow CVs, AMUSEt transforms input features (e.g., pairwise atomic distances) into nonlinear basis functions (e.g., Gaussian functions) and encodes these nonlinear basis functions within a tensor-train structure via time-lagged correlation functions. Due to the need to fit these tensor-train data structures into computer memory, AMUSEt can handle only a limited number of input features. Consequently, AMUSEt relies on manually selecting and ranking features based on physical intuition to fully capture the slow dynamics. However, when applied to complex biological systems with numerous features, this selection and ranking process becomes increasingly challenging. To address this challenge, here we present AMUSET-TICA (AMUSEt-based Time-lagged Independent Component Analysis), a CV-identification method using time-structure-independent components (tICs) as the input features for AMUSEt. The key insight of AMUSET-TICA lies in its highly effective embedding of high-dimensional atomistic protein conformations, achieved by expanding orthogonal tICs into overlapping Gaussian basis functions through a tensor-product data structure. This eliminates the need for manually selecting and ranking input features for a wide range of biomolecular systems. We demonstrate that AMUSET-TICA consistently and significantly outperforms AMUSEt and tICA in identifying slow CVs for three different biomolecular systems: alanine dipeptide, the N-terminal domain of L9 (NTL9), and the FIP35 WW domain. 
For all three systems, the CVs generated by AMUSET-TICA accurately describe the slowest dynamical modes underlying the conformational changes. Furthermore, we show that AMUSET-TICA achieves performance comparable to deep-learning approaches such as VAMPnets in identifying the slowest dynamical modes, while requiring significantly less CPU time. In addition, the CVs yielded by AMUSET-TICA provide insights into the folding mechanisms of NTL9 and the FIP35 WW domain, including CV3 and CV4 of the WW domain, which capture its two parallel folding pathways. We expect that AMUSET-TICA can be widely applied to facilitate the investigation of biomolecular dynamics.
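
Two ideas from the abstract can be illustrated with a small sketch: slow coordinates show high time-lagged autocorrelation, and expanding several coordinates into Gaussian basis functions through a tensor product gives a feature set that grows exponentially with the number of coordinates. The code below is not the authors' implementation and does not build a tensor train. The toy trajectory, the function names, the Gaussian centers, and the width are all hypothetical choices made for illustration.

```python
import math
import random


def lagged_autocorrelation(x, lag):
    """Normalized time-lagged autocorrelation of a 1-D time series."""
    n = len(x) - lag
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n)) / n
    return cov / var


def gaussian_basis(value, centers, width):
    """Evaluate overlapping Gaussian basis functions at one coordinate value."""
    return [math.exp(-((value - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def product_basis(coords, centers, width):
    """Full tensor-product basis for one frame, flattened to a list.

    With d coordinates and n centers each, the result has n**d entries.
    This is the growth that a tensor-train format avoids storing explicitly.
    """
    result = [1.0]
    for value in coords:
        phi = gaussian_basis(value, centers, width)
        result = [r * p for r in result for p in phi]
    return result


# Toy trajectory (hypothetical data): one slowly switching coordinate and one
# fast, uncorrelated coordinate, standing in for two input features.
rng = random.Random(0)
state, slow, fast = 1.0, [], []
for _ in range(5000):
    if rng.random() < 0.01:  # rare jumps between two wells
        state = -state
    slow.append(state + 0.1 * rng.gauss(0.0, 1.0))
    fast.append(rng.gauss(0.0, 1.0))

lag = 10
ac_slow = lagged_autocorrelation(slow, lag)
ac_fast = lagged_autocorrelation(fast, lag)
print("lagged autocorrelation (slow, fast):", ac_slow, ac_fast)

# Expand the first frame of both coordinates into a 3 x 3 product basis.
centers = [-1.0, 0.0, 1.0]
features = product_basis([slow[0], fast[0]], centers, width=0.5)
print("number of product-basis features:", len(features))
```

With three centers per coordinate, two coordinates give 9 product features and ten coordinates would give 3**10 = 59049. This illustrates why the number of input features matters for memory, and why a small set of slow, decorrelated coordinates is an attractive input.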
