**We develop a new data-driven method for extracting complex chemical reaction networks from Molecular Dynamics outputs and reproducing Molecular Dynamics trajectories using Kinetic Monte Carlo. We use this method to study liquid methane under the high temperature and pressure conditions that occur during shock compression. Using only 11% of reactions chosen with L_{1} regularization, we find that CH_{4} decomposition can be modeled with less than 9% relative error. We also find that this method is transferable to predict the trajectories of other related systems such as the decomposition of C_{4}H_{10} in similar conditions. This work demonstrates how data-driven approaches can be leveraged to extract accurate chemical models, enabling efficient discovery of reaction networks involving as many as tens of thousands of reactions with minimal human intervention.**

One of the overarching goals in computational materials science is the development of fast, accurate and scalable models for simulating complex chemical or materials processes. These processes can often involve thousands of molecules and tens of thousands of reactions. Thus, it is necessary to seek highly reduced models that can sufficiently describe important features of the system. We show that model reduction can be formulated as a convex optimization problem, which is computationally efficient to solve and has a single minimum, distinguishing it from many challenging high-dimensional optimization problems faced in materials science. Furthermore, L_{1} regularization allows us to control the tradeoff between model reduction and error via a single regularization parameter.
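To make the role of the regularization parameter concrete, here is a minimal sketch of L_{1}-regularized fitting with non-negative coefficients. It is an illustration under stated assumptions, not the objective used in the publications listed on this page: the least-squares data term, the function names `l1_reduce` and `objective`, the synthetic matrix `A`, and the values of `lam` are all chosen here for demonstration only. The problem is convex, so a simple proximal-gradient iteration is enough.

```python
import numpy as np

def l1_reduce(A, b, lam, n_iter=2000):
    """Approximately minimize 0.5*||A k - b||^2 + lam*sum(k) subject to k >= 0.

    For non-negative k the L1 norm equals sum(k), so each proximal-gradient
    step is a gradient step on the smooth part followed by a shifted clip at zero.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # inverse Lipschitz constant of the gradient
    k = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ k - b)
        k = np.maximum(k - step * (grad + lam), 0.0)
    return k

def objective(A, b, k, lam):
    """Value of the regularized objective for a non-negative coefficient vector k."""
    return 0.5 * np.sum((A @ k - b) ** 2) + lam * np.sum(k)

# Synthetic stand-in data (NOT from the study): 40 candidate reactions, 5 truly active.
rng = np.random.default_rng(0)
A = rng.random((200, 40))
k_true = np.zeros(40)
k_true[:5] = [1.0, 0.5, 2.0, 0.8, 1.5]
b = A @ k_true

# Sweep the single regularization parameter to see the reduction/error tradeoff.
for lam in (0.01, 1.0, 100.0):
    k = l1_reduce(A, b, lam)
    kept = np.count_nonzero(k)
    rel_err = np.linalg.norm(A @ k - b) / np.linalg.norm(b)
    print(f"lambda={lam:g}: kept {kept}/40 reactions, relative error {rel_err:.3f}")
```

Coefficients that are driven exactly to zero correspond to reactions dropped from the reduced model, so sweeping `lam` traces out how many reactions are kept against the fitting error.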

We apply our algorithm to a molecular dynamics simulation of high temperature and high pressure methane, under thermodynamic conditions similar to those of liquid methane undergoing shock compression. We first use bond length and duration criteria developed in previous work to derive a stochastic model of the chemical reaction network consisting of more than 2,600 elementary reactions and their corresponding reaction rate coefficients. We are then able to compute the tradeoff between reduction of the network and error in the resulting stochastic model by tuning the L_{1} regularization parameter.
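A stochastic model of this kind, a list of elementary reactions with rate coefficients, can be simulated with the standard Gillespie (Kinetic Monte Carlo) algorithm. The sketch below is illustrative only: the three toy reactions, their rate values, the initial molecule count, and the names `propensity` and `gillespie` are assumptions made here, not reactions or rates extracted from the molecular dynamics data.

```python
import math
import numpy as np

def propensity(counts, reaction):
    """Rate coefficient times the number of distinct reactant combinations."""
    a = reaction["rate"]
    for species, n in reaction["reactants"].items():
        a *= math.comb(counts.get(species, 0), n)  # zero if too few molecules
    return a

def gillespie(counts, reactions, t_end, rng):
    """Simulate one Kinetic Monte Carlo trajectory up to time t_end."""
    counts = dict(counts)
    t = 0.0
    trajectory = [(t, dict(counts))]
    while True:
        a = np.array([propensity(counts, r) for r in reactions])
        a_total = a.sum()
        if a_total <= 0.0:
            break  # no reaction can fire any more
        t += rng.exponential(1.0 / a_total)  # waiting time to the next event
        if t > t_end:
            break
        j = rng.choice(len(reactions), p=a / a_total)  # which reaction fires
        for s, n in reactions[j]["reactants"].items():
            counts[s] -= n
        for s, n in reactions[j]["products"].items():
            counts[s] = counts.get(s, 0) + n
        trajectory.append((t, dict(counts)))
    return trajectory

# Hypothetical toy network with made-up rates in arbitrary units.
toy_reactions = [
    {"reactants": {"CH4": 1}, "products": {"CH3": 1, "H": 1}, "rate": 0.5},
    {"reactants": {"H": 2}, "products": {"H2": 1}, "rate": 0.1},
    {"reactants": {"CH3": 1, "H": 1}, "products": {"CH4": 1}, "rate": 0.05},
]

trajectory = gillespie({"CH4": 50}, toy_reactions, t_end=5.0,
                       rng=np.random.default_rng(0))
final_time, final_counts = trajectory[-1]
print(f"{len(trajectory) - 1} events, final counts: {final_counts}")
```

Each step draws an exponentially distributed waiting time from the total propensity and then picks one reaction with probability proportional to its propensity, which is why removing low-impact reactions directly reduces the cost of each step.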

*CH_{4} decomposition can be modeled with less than 9% relative error using only 11% of reactions. We can trade off reduction against error by tuning the L_{1} regularization parameter λ.*

We use our framework to transfer the model trained on CH_{4} decomposition to accurately predict C_{4}H_{10} decomposition, and vice versa. Using only the 324 reactions shared by the two systems, we are able to obtain results similar to those of the full models, which contain around 2,600 elementary reactions in the case of CH_{4} decomposition and 10,000 reactions in the case of C_{4}H_{10} decomposition.
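As a rough sketch of the bookkeeping behind such a transfer, one can restrict a trained set of rate coefficients to the reactions that also occur in the target system. The reaction labels, rate values, and the helper name `restrict_to_shared` below are hypothetical and serve only to illustrate the idea.

```python
def restrict_to_shared(trained_rates, target_reactions):
    """Keep only the trained reactions that also occur in the target system."""
    return {rxn: rate for rxn, rate in trained_rates.items()
            if rxn in target_reactions}

# Hypothetical rates learned on one system (made-up values, arbitrary units).
trained_rates = {
    "CH4 -> CH3 + H": 0.5,
    "H + H -> H2": 0.1,
    "CH3 + CH3 -> C2H6": 0.02,
}

# Hypothetical reactions observed in the other system.
target_reactions = {
    "H + H -> H2",
    "CH3 + CH3 -> C2H6",
    "C4H10 -> C2H5 + C2H5",
}

shared_model = restrict_to_shared(trained_rates, target_reactions)
print(sorted(shared_model))
```

This assumes reactions are written in a canonical form so that the same elementary reaction gets the same label in both systems.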

**Current Research:** We are currently trying to apply this method to different chemical systems and explore different ways of describing chemical reactions.

**Contact:** Vincent Dufour-Decieux, vdufourd_at_stanford.edu

**Publications:**

Yang, Q., Sing-Long, C. A., Reed, E. J., Rapid data-driven model reduction of nonlinear dynamical systems including chemical reaction networks using ℓ1-regularization. *Chaos* **30**, 053122, doi:10.1063/1.5139463 (2020).

Chen, E., Yang, Q., Dufour-Decieux, V., Sing-Long, C. A., Freitas, R., Reed, E. J., Transferable Kinetic Monte Carlo Methods with Thousands of Reactions Learned from Molecular Dynamics Simulations. doi:10.1021/acs.jpca.8b09947 (2019).

Yang, Q. and Reed, E. J. Computational Approaches for Chemistry Under Extreme Conditions. Springer, 28 (2019).

Yang, Q., Sing-Long, C. A., Reed, E. J., Learning Reduced Kinetic Monte Carlo Models of Complex Chemistry from Molecular Dynamics. *Chemical Science*, doi:10.1039/C7SC01052D (2017).

Yang, Q., Sing-Long, C. A., Reed, E. J., L1 Regularization-Based Model Reduction of Complex Chemistry Molecular Dynamics for Statistical Learning of Kinetic Monte Carlo Models. *MRS Advances*, doi:10.1557/adv.2016.124 (2016).