MAHALANOBIS DIFFUSION MAPS FOR QUANTIFYING RARE EVENTS: THEORY AND APPLICATION TO MOLECULAR DYNAMICS
The study of rare events in molecular and atomic systems, such as conformational changes and cluster rearrangements, has been one of the most important research themes in chemical physics. Key challenges are associated with long waiting times rendering molecular simulations inefficient, high dimensionality impeding the use of PDE-based approaches, and the complexity or breadth of transition processes limiting the predictive power of asymptotic methods. Diffusion maps are promising algorithms for mitigating these issues. We adapt the diffusion map with Mahalanobis kernel proposed by Singer and Coifman (2008) to the SDE describing molecular dynamics in collective variables, in which the diffusion matrix is position-dependent and, unlike the case considered by Singer and Coifman, is not associated with a diffeomorphism. We offer an elementary proof showing that one can approximate the generator for this SDE, discretized to a point cloud, via the Mahalanobis diffusion map. We then extend the method to incorporate standard enhanced sampling techniques such as metadynamics. The resulting algorithm, which we call the target measure Mahalanobis diffusion map (tm-mmap), is suitable for a moderate number of collective variables for which one can approximate the diffusion tensor and free energy. The tm-mmap algorithm allows us to approximate the backward Kolmogorov operator and compute the committor function, the key function for describing transition events in the framework of transition path theory. Simple post-processing steps delineate the transition channels and estimate the transition rates. We apply this methodology to a number of test problems, including benchmark systems in chemical physics such as alanine dipeptide with four dihedral angles and Lennard-Jones 7, validate the results, and demonstrate the efficacy of the proposed approach.
In particular, we show that use of (i) the Mahalanobis kernel, (ii) enhanced sampling data, and (iii) phase space dimensions beyond the scope of standard PDE solvers (such as finite difference and finite element methods) is essential for capturing the underlying dynamics and accurately estimating transition rates.
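As a rough illustration of the Mahalanobis kernel idea referred to above, the following Python sketch builds a kernel matrix on a point cloud from position-dependent inverse diffusion matrices and row-normalizes it into a Markov matrix. It is a minimal sketch under stated assumptions, not the tm-mmap algorithm itself: the function names, the `1/(4*eps)` scaling of the exponent, and the plain row normalization are assumptions made for illustration, and the density correction and target-measure reweighting that the abstract describes are omitted.

```python
import numpy as np

def mahalanobis_kernel(X, M_inv, eps):
    """Kernel matrix with a symmetrized Mahalanobis distance (illustrative).

    X     : (n, d) array of points in collective-variable space
    M_inv : (n, d, d) array, assumed inverse diffusion matrix at each point
    eps   : positive bandwidth parameter
    """
    # diff[i, j] = x_i - x_j, shape (n, n, d)
    diff = X[:, None, :] - X[None, :, :]
    # Quadratic forms using the inverse diffusion matrix at x_i and at x_j
    q_i = np.einsum("ijk,ikl,ijl->ij", diff, M_inv, diff)
    q_j = np.einsum("ijk,jkl,ijl->ij", diff, M_inv, diff)
    # Assumed scaling: exp(-(q_i + q_j) / (4 * eps))
    return np.exp(-(q_i + q_j) / (4.0 * eps))

def generator_like_matrix(K, eps):
    """Row-normalize K into a Markov matrix P and return (P - I) / eps.

    Simplification: no density or target-measure correction is applied,
    so this is only a generator-like matrix whose rows sum to zero.
    """
    P = K / K.sum(axis=1, keepdims=True)
    return (P - np.eye(K.shape[0])) / eps
```

With identity diffusion matrices the kernel above reduces to the isotropic Gaussian kernel `exp(-|x - y|^2 / (2 * eps))`, which gives a quick sanity check for an implementation of this kind.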