Intramolecular proton transfer reaction dynamics using machine-learned ab initio potential energy surfaces

Hydrogen bonding interactions, which are central to various physicochemical processes, are investigated in the present study using ab initio-based machine learning potential energy surfaces. Abnormally strong intramolecular O–H⋯O hydrogen bonds, occurring in β-diketone enols of malonaldehyde and its derivatives, with substituents ranging from various electron-withdrawing to electron-donating functional groups, are studied. Machine learning force fields were constructed using a kernel-based force learning model employing ab initio molecular dynamics reference data. These models were used for molecular dynamics simulations at finite temperature, and dynamical properties were determined by computing proton transfer free-energy surfaces. The chemical systems studied here show progression toward barrier-less proton transfer events at an accuracy of correlated electronic structure methods. Markov state models of the conformational states indicate shorter intramolecular hydrogen bonds exhibiting higher proton transfer rates. We demonstrate how functional group substitution can modulate the strength of intramolecular hydrogen bonds by studying the thermodynamic and kinetic properties.

MA and its derivatives have been used as model systems in many studies where PT was central to the investigation [33]. In fact, functional group substitution-induced changes in the energetics and dynamics of small molecules (can be extended to larger biomolecules) are interesting, particularly for understanding hydrogen bonding interactions [38]. Specifically, for MA, tunneling splittings were characterized experimentally using infrared spectroscopy [18,19]. In fact, MA was studied extensively at various theoretical method levels, starting from simple Hartree-Fock to high-level coupled cluster singles doubles with perturbative triples (CCSD(T)), and the calculated O-H bond distance therein lies in the range of 1.69-1.88 Å [23,40] compared to the experimentally determined value of 1.68 Å [41]. So far, the PT barrier height is estimated to be in the range ∼2. 9-5.4 kcal mol −1 [23,42,43]. The most reliable barrier height known to date was computed by Bowman and co-workers using the CCSD(T) level of theory at the basis set limit [23].
However, the computationally expensive nature of the high-level quantum mechanics (QM) methods poses certain limitations when calculating an accurate PES. Only by fully exploring the phase space using a less expensive force field (FF)-based approach, may one reach the 3N − 6 full-dimensionality of the PES. Unfortunately, general FFs are not suited for modeling such chemical processes where covalent interactions, such as bond formation and bond breaking, take place. However, there are few reactive FFs which were designed to study such PT in MA. One such FF, developed by Meuwly and co-workers, estimated a PT barrier height of 4.3 kcal mol −1 [26].
In the present work we have chosen the symmetric gradient domain ML (sGDML) approach of Chmiela et al [65]. The sGDML model was successfully employed to predict the PT in MA at the CCSD(T) level of accuracy with only a few hundred reference conformations. Considering the accuracy and reasonably low computational cost, we have chosen the sGDML architecture to construct molecular FFs from machine-learned ab initio PES for the current study.
A systematic study on various derivatives of MA has been carried out to find suitable descriptors to characterize the strength of intramolecular HBs. In the present work, derivatives of MA consist of various electron-withdrawing groups, such as CN, NO 2 , and BH 2 , and electron-donating groups, such as CH 3 , NH 2 , and OCH 3 . Density functional theory (DFT)-based AIMD energies and forces were computed to generate reference datasets. We have constructed MLFFs for all the systems using the sGDML model. The MLFF was then employed to explore the full-dimensional reactive PES, which was not feasible using a conventional molecular FF. Now, subtle quantum effects described by the sGDML's reconstructed PES enabled characterization of the HB strength through geometrical properties, π-delocalization indices, vibrational spectra and rates of PT processes of MA-systems. We have investigated how various functional groups modulate the PT barrier, and to what extent one can achieve a barrierless PT process [67,68].

Methodology
The working pipeline includes the following steps: (1) generation of reference datasets using AIMD simulations, (2) training sGDML models, and predicting force and energy labels and (3) performing long-timescale MD simulations using trained models. The entire workflow is depicted in figure 2, which is discussed in detail in the following sections.

Reference datasets
Terminology used throughout: molecular systems (a total of 16 molecules) that are studied currently are shown in scheme 1. Here, the parent MA structure is shown with functional-group substitutions at the symmetrical carbons C1 and C3 by R1, and at the central carbon C2 by R2. Structure-types I-IV, each consisting of a particular R1-substituent, defined in scheme 2, are as follows: (1) structure-type I has a H atom, (2) structure-type II has CH 3 , (3) structure-type III has NH 2 , and (4) structure-type IV has OCH 3 . Therefore, the 'R1' term will be omitted hereafter in the use of structure types and/or structures to allow compact notations. Each structure type has four molecules with different R2-substituents. Subsequently, functional groups at the R2 position will be mentioned explicitly without the term 'R2' . For example, NO 2 -structure I refers to a molecule where R2 = NO 2 , and R1 = H; likewise, BH 2 -structure III refers to a molecule where R2 = BH 2 , and R1 = NH 2 .
In this study, the smallest system has nine atoms, and the largest has 19 atoms. Born-Oppenheimer molecular dynamics simulations were performed to generate reference data using the Car-Parrinello MD [69] package, which integrates DFT and MD methods. The starting structures were optimized, and then AIMD trajectories were generated. Goedecker, Teter, and Hutter (GTH) pseudopotentials [70][71][72] have been used for all atoms, and the valence electronic orbitals were expanded in plane waves with the maximum kinetic energy cutoff of 80 Rydberg. The Perdew, Burke, and Ernzerhof (PBE) [73] gradient-corrected exchange-correlation functional was used. Additionally, high-level MP2/6-31++G * * single-point calculations were performed using AIMD trajectories of DFT quality by employing the PSI4 [74] program for a few of the systems. The total energy was converged to 1 × 10 −7 Hartree for a system in a cubic box of dimension 12 Å. A time step of 21 a.u. (∼0.5 fs) was used. The total simulation length was 5 ps for each system. This resulted in 9865 reference data points for each molecular system. The equilibrium temperature was kept at 300 K, controlled via Nóse-Hoover chain thermostats.

The sGDML model
The implementation details of the sGDML model can be found in the original work by Chmiela and co-workers [61,65]. The sGDML model uses the kernel ridge regression technique, trained on forces F. Here, K Hess(κ) is the kernel matrix. The regularization is carried out by the hyper-parameter λ. Here, I and ⃗ α are the identity matrix and the parameter vectors, respectively. The Hessian matrix of the kernel, Hess(κ), often termed as the FF kernel, is obtained as the covariance between in the reproducing kernel Hilbert space H . Here, ϕ : χ → H is the mapping from the input space x ∈ χ to the feature space, which is defined implicitly through the scalar-valued kernel function via the so-called kernel trick. Parametric Matérn family kernel functions were used, where d = ||⃗ x − ⃗ x ′ || is the Euclidean distance between two inputs, and the length scale σ is a hyper-parameter. With ⃗ x = D(⃗ r), the kernel function is associated with the descriptor D; where⃗ r is the Cartesian molecular geometry. Each element of the descriptor matrix is defined as the reciprocal of the Euclidean distance for a pair of atoms: The prediction of forces is carried out using the force estimator which collects all the contributions (3 N coordinates) from all the M training samples, and S symmetry transformations realized by the permutation matrix P. Further, by integratingf F w.r.t. the Cartesian geometry, the corresponding energy predictor is obtained.

MD simulations
To generate MD trajectories using the MLFF, we have used the Atomic Simulation Environment [75] with a 0.2 fs time step. A temperature of 300 K was maintained via a Langevin thermostat. A thermostat friction value of 0.002 a.u. was used for all systems unless stated otherwise. Each trajectory of 500 ps (2500000 steps) long was used to evaluate the performance of the sGDML model. MD simulations were performed for all the systems.

Kinetic analysis: PT rates
We have used the transition path theory (TPT) [76][77][78] to calculate the rates of the PT processes, as implemented in the PyEMMA [79] package. To model dynamical events involved in transitions among various metastable states, a Markov state model (MSM) was constructed using the MD simulations data at a finite time interval (1 fs). The MD conformational space was discretized into a three-state model using the O· · · H distance feature. State 2 was defined as the transition state with a O· · · H separation of 1.2 Å, and the rest of the conformations are state 1 and 3 types-corresponding to a pair of minima. To estimate the MSM model, a suitable lag time of 1 fs was chosen. The Chapman-Kolmogorov test was also performed to validate (Markovianity) the MSM models at higher lag times.

Results and discussion
Each of the sGDML models is trained on a set of reference datasets of consistent size, with about 9.8k samples with a resolution of 0.5 fs. Energy values (relative to the structure at the first time step) across the dataset span over a range of about 0-25 kcal mol −1 , and the corresponding force magnitudes lie in the range ∼50-250 kcal mol −1 Å −1 . Using a 5-fold cross-validation, the best model with the lowest validation error was applied for the test dataset, which is not part of the training and validation data. Throughout this paper, the reported prediction errors correspond to those of the test set. The training set sizes were varied as 50, 100, 150, 200, 400, 600, 800, and 1000 by fixing the size of the validation set to 5000. For each training set size, the test set contains all the remaining points from the total dataset with 9865 entries. The performances of the ML models were also tested for a dataset with 2000 entries, calculated using a high-level theoretical method, coupled cluster singles doubles (CCSD), for which we fixed the size of the validation set to 500.  accuracy. We further investigated whether any notable different nature of learning exists among the four systems of structure-type IV. Accordingly, we chose the NO 2 -structure IV and BH 2 -structure IV, and six independent sGDML models were trained for each molecule. MAE values from these models are presented in figures S1 and S2 of the supplementary material. From the kernel density estimation or 'violin' plots of MAE values for each training set size we can see that lower MAE values have higher probability densities as the training set size increases. In particular, a few of the six models even achieved a force accuracy of 1 kcal mol −1 Å −1 , and energy accuracy of 0.3 kcal mol −1 with a training set size of 200. Nevertheless, a training set size higher than 400/600 can be a good choice to get rid of any kind of ambiguity due to sampling bias, by considering the bulkiness and the chemical nature of the substituent OCH 3 . Energy-based ML models typically need 2-3 orders of magnitude more data to gain comparable accuracy [60]. Similarly, energy prediction accuracy is achieved below 0.3 kcal mol −1 for all the models using just 200 training examples (see figure 3(b)). Figure S3 of the supplementary material shows the shapes of the MAE values from six independent runs per training set size for the molecule BH 2 -structure III. As the training set size increases, the densities of lower MAE values are increased as well. This indicates that the model accuracies increase as the training set size increases, and the models are not influenced by sampling biases. Additionally, energy and force predictions for all the structures for a consecutive 500 steps are given in figures S4-S7 in the supplementary material. The model prediction accuracy was checked using the reference data generated at the MP2/6-31++G * * level of theory. Here, four structure types: H-structure I (the parent molecule-MA), NO 2 -structure I, H-structure III, and NO 2 -structure III were considered. The energy and force prediction accuracies of sGDML models of these structure types are given in figures S8(a) and (b), respectively, in the supplementary material. They show that both the energy and force prediction accuracies obtained here are of the order of DFT level. Therefore, hereafter, for further calculations, MLFFs trained on DFT reference data are considered.

Focal point (FP) analysis
To quantify the overall prediction error in our computed data and ML modeling, we also generated data at the CCSD level with the large basis set, 6-31++G * * . Sauceda et al showed, in their earlier studies of the aspirin molecule, that intramolecular interactions are well captured by an sGDML model built using the CCSD/cc-pVDZ level of theory [80]. However, for the chemical process studied here, involving PT, we have selected a larger diffuse function augmented basis set, 6-31++G * * . For example, a 15 atomic molecule NO 2 -structure III has 220 and 165 basis functions for the 6-31++G * * and cc-pVDZ basis sets, respectively. Since calculations at the CCSD/6-31++G * * level are computationally intensive, we have estimated the total energies and forces using FP analysis [33,81]. NO 2 -structure III is considered for this exercise as it was found to be the most chemically relevant substituted MA [33]. FP analysis was performed using MP2 and CCSD methods using a smaller basis set, 6-31G * ) and a larger one, 6-31++G * * . The FP correction for the post-MP2 correlation energy, ∆, is expressed as The total CCSD energies at the large basis set can now be obtained as Similarly, FP estimated CCSD forces (F CCSD(FP) ) were also obtained. CCSD (FP)-level data were obtained for 2000 random entries drawn from the 9865 set. For FP analysis, all reference data were computed using a Gaussian package [82]. To quantify the error due to FP analysis, we selected the first 10 of the 2000 structures and calculated reference data at the CCSD/6-31++G * * level. In figure S9 of the supplementary material, CCSD(FP) energies and forces obtained using the FP method are compared with the reference CCSD/6-31++G * * energies and forces. The Pearson correlation coefficient between the data from CCSD/6-31++G * * and CCSD(FP) levels was found to be 0.999, hinting at a higher degree of agreement. Subsequently, E CCSD(FP) and F CCSD(FP) data for 2000 structures were used to train sGDML models.
From figures 5(a) and (b) it can be seen that the prediction accuracies for the MP2-level data (with 9865 entries) are similar to that of the FP estimated CCSD dataset with 2000 entries. This also indicates the 2000 entries selected to be representative of the total dataset. In figure S10 of the supplementary material the means and standard deviations (σ) of MAE values of six independent runs per training set size are shown. An almost negligible σ indicates that the models are not biased due to random sampling of the Table 1. The O· · · O distances (Å) between two oxygen atoms (O4 and O5; for numbering, see scheme 1) in MA and its derivatives, optimized using MLFFs constructed in this work. Here, π-delocalization indices |Q| are given in parentheses. The O· · · O distances in lower rows were calculated at the DFT-B3LYP/DZP++ level of theory, taken from [33] training/validation/test sets. Cross-validation errors are provided for a model trained on the CCSD(FP) dataset in table S1 of the supplementary material. All these analyses suggest that our MLFF models are accurately trained to follow a general ML protocol by considering different levels of ML errors, such as errors in the training data (either coming from the ab initio method or basis sets), sampling bias, validation errors, and test errors. We have noted consistent performance of sGDML modeling of energies and forces across various ab initio methods, ranging from DFT, MP2, and CCSD. This trend suggests that accurate FF models can be developed for chemically relevant molecules in chemically interesting regions of the potential energy surfaces using more accurate data at the CCSD(T) level in the future, albeit at a very high computational cost.

Ground state properties of ab initio accuracy
Firstly, MLFFs are used to optimize the minimum energy structure of the enol forms of all systems, and vibrational frequencies were computed numerically (see table S2 in the supplementary material for the OH stretch, which is responsible for the hydrogen transfer). The O· · · O distances of all the optimized geometries are given in table 1, where each row is augmented with a second row containing the values from the work of Schaefer and co-workers [33]. It is interesting to see that among the MA derivatives along the R1-series, i.e. substituents at the symmetrical carbon, the shortest O· · · O distance was found for the NH 2 substituent, irrespective of the substituent at the R2-carbon (central carbon). This trend is similar to that reported earlier by Schaefer and co-workers [33]. Upon considering a fixed R2-substituent, along the R1-series, the O· · · O distance decreases as we move toward the electron-rich substituents; however, it increases in the OCH 3 . Structure-type IV is showing almost always a similar O· · · O distance, as in the case of structure-type II. Now, for a fixed R1-substituent, along the R2-series there is a decrease in the O· · · O distance as one goes from the H atom to the BH 2 functional group, except for the NO 2 -structure III that has the shortest O· · · O distance among all the systems studied here. All these findings are very similar to previous results [33]. This is quite convincing that the sGDML models are able to predict the ground state geometries quite well. Gilli and co-workers characterized the HB strength by the π-delocalization index, of the short conjugated chain connecting the HB donor and acceptor atoms [37]. Following the same method, we have calculated |Q| values and tabulated them in table 1. The lower the |Q| value is, the stronger the π-delocalization a system shows. From table 1, we can see that as the electron-withdrawing inductive effect is increasing along the R2-series, the |Q| value is decreasing. Hence, the stronger the delocalization of the π-conjugated system is, the shorter the O· · · O distance becomes. Accordingly, the π-delocalization index can be a reasonable descriptor for predicting the strength of a symmetrical intramolecular HB.

MD results of ab initio quality
As the performance of the models reached chemical accuracy, we further used them to perform MD simulations for longer timescales to obtain sufficient sampling of the conformation space. This was so that we can perform insightful analyses of the thermodynamic and kinetic properties of molecular systems. In fact, a well-known sampling problem often limits our ability to compute rare events using MD simulation data. Followed by the seminal work on the sGDML model of MA, it was shown that the PT barrier in the symmetric double-well potential of the MA molecule was about 4.0 kcal mol −1 , and the O· · · O bond distance was about 2.38 Å [62]. This is overwhelming as conventional FFs are unable to do it. However, Cooper et al used a different approach where a local part of the PES (instead of global) associated with a transition state was used to train a neural network model by taking into account energies, and their first and second derivatives for calculation of accurate rate constants using instanton theory [83]. We have analyzed the free-energy surface from the MD trajectories obtained for each of the sGDML models. Figure 6 shows the two-dimensional free-energy surfaces as a function of a pair of O· · · H distances. It is evident that all the sGDML models are able to qualitatively describe the symmetric double-well potential (two minima) connected via a transition state. We found that as we go along either the R1-series or the R2-series, the energy barrier is decreasing. In structure-type I, the PT barrier is lowered from ∼4 to ∼3 kcal mol −1 . In structure-type II, the PT barrier is lowered even more, from ∼4 to ∼2 kcal mol −1 . In a very recent study, Qu et al calculated a barrier of 3.5 kcal mol −1 for the H-structure II using a ∆-ML approach [84]. Clearly, major substituent effects are seen in structure-type III. Here, the PT barrier is seen to go down from ∼2 to <1 kcal mol −1 . We are realizing almost barrierless transitions. In particular, NO 2 -structure III is exhibiting the lowest PT barrier; earlier, this value was calculated at the MP2 level as 0.6 kcal mol −1 [31]. Interestingly, BH 2 -structure III has an equivalent PT energy barrier to the former. Besides, the area of the PES is decreasing, i.e. the O· · · O distance is becoming shorter. Structure-type IV seems to behave more like structure-type II, even though OCH 3 has stronger electron-donating capability than the CH 3 group.
Presumably, the larger OCH 3 group in structure-type IV reduces the efficacy of the R2-substituents despite their increasing electron-withdrawing power along the R2-series. Earlier too, it was found that a stronger electron-withdrawing group at R2 and an electron-donating group at R1 tend to produce stronger intramolecular HBs. However, bulky groups at R1 may have an altered influence on the hydrogen transfer barrier [33]. They have calculated high-level coupled cluster PT energy barriers at the complete basis set (CBS) limit. Presently, we are able to achieve energetics of chemical processes using sGDML models efficiently at the cost of MD with coupled cluster accuracy. Substituent effects were often used as the basis for qualitative interpretation of HB strength [85]. Here too, we show the same from the free-energy landscape computed using the MLFFs. Fundamentally, intramolecular hydrogen transfer between two oxygen atoms of MA is implied to occur via tunneling due to the lightweight H-atom. The high-level computations of tunneling splitting with MP2/6-31G * * [21], MP2/6-31++G * * [26], and CCSD(T)/aug-cc-pV5Z [23] gave results close to the far-infrared spectroscopic value of 21 cm −1 [86]. They are arguably superior compared to our MLFFs trained on DFT reference data. Nonetheless, our results too gave an agreeable PT energy barrier when considering the variety of our substituted MA systems. Besides, MLFFs trained on MP2/6-31++G * * for four of our sixteen systems produced similar energy and force prediction accuracies (see figures S8(a) and (b) in the supplementary material). Therefore, our sGDML models presented here perform reasonably well, and future studies can aim to repeat the ML exercise presented here with more accurate data that at present are not feasible due to their computational cost.

Kinetic analyses
Feature selection for constructing kinetic models: we have computed the PT rates using the MD trajectories, which were simulated using the sGDML FFs for all the studied MA systems. One of the important aspects of analyzing kinetics of a chemical process is the selection of the feature which is able to describe its nature well. Despite the large number of dedicated experimental and theoretical studies, our understanding of the determinants of HB strength is surprisingly fragmentary [87]. Nevertheless, O· · · H distances are often used [85,88]. The population of the two O· · · H distances is shown in figure 7. We have chosen four structures (H-structure I, H-structure III, NO 2 -structure I, and NO 2 -structure III) to show the extent of substituent effects on the O· · · H population. We can clearly see that the NO 2 substituent has a significant effect on the O· · · H population in structure-types I and III (shown in figures 7(c) and (d), respectively) compared to the H atom (figures 7(a) and (b)); this is expected as the former is known to have strong electron-withdrawing power. Besides, the NO 2 -structure III has an electron-rich NH 2 group at the R2 center. As a result, O· · · H populations are affected here the most. From this, we can easily infer that two O· · · H distances are good choices as features for kinetic modeling.
Discrete-state kinetic models, such as MSMs, are shown to be useful for understanding conformational states involved in bio(molecular) transitions [89,90]. The pipeline of an MSM model is given in figure S11 in the supplementary material. The MSM analyses results are shown in figure 8, which shows the rate matrix obtained from the three-state MSM, represented by a PyEMMA network plot [79]. After analyzing the fluxes between the two metastable sets, it was found that a major transfer of fluxes happens between states 1 and 3 via state 2 (figure S11(d) of the supplementary material). The individual transition rate is given in the unit of fs −1 . We can see that the two minima (states 1 and 3) have larger populations than state 2 (TS) in all systems, except NO 2 -structure III and BH 2 -structure III molecules; here, state 2 has populations as high as states 1 and 3. When comparing the rates along the R1-series, we can see that the PT rate is becoming higher from structure-type I to II to III (row-wise). Also, along the R2-series, the PT rate is elevated, for example, from H-structure I to CN-structure I to NO 2 -structure I to BH 2 -structure I (column-wise). A similar trend was observed in the free-energy barrier also. Using transition state theory, the PT rate was calculated to be 0.0076 ps −1 (applying a frequency factor of 10 12 s −1 ) for the parent MA [91] with an activation barrier of 2.91 kcal mol −1 . In structure-type I, we can see that the rate is increased from 0.002 to 0.004 fs −1 , with R2-substituents varying from the H atom to the functional group BH 2 . Whereas structure-type II changes its rate from 0.004 to 0.01 fs −1 as the electron-withdrawing effect of the substituents increases. Structure-type III shows rates from 0.006 to 0.014 fs −1 among various electron-withdrawing substituents. Structure-type IV showed similar PT rates to structure-type II: likewise, the trend we saw in the free-energy barriers. The rates of the PT processes are found to be very consistent with free-energy barriers, as expected. In practice, Figure 9. Log(k/k 0 ) as a function of the mesomeric constant σ 0 R of: (a) R2 substituents; the shaded rectangle shows the highest rate when R1 is NH2, and (b) R1 substituents; the shaded rectangle shows the highest rate when R2 is BH2. The dashed curves are the smooth Bézier interpolation of the solid lines, indicating how log(k/k 0 ) drops for OCH3 in (a) and for NO2 in (b), despite being the best electron-donating and electron-withdrawing groups, respectively.
we know that as the barrier height decreases, the rate of a chemical process should become faster. The PT rate in structure-type IV is only moderately greater than the parent MA. This can be attributed to the larger size of the OCH 3 group. A moderate increase in reaction rates in bulky functional groups was suggested earlier as well [92]. Our ML models are able to predict PT rates where the fundamental physical phenomena of molecular kinetics are reflected. Therefore, the PT rate could be a reliable descriptor of intramolecular HB strength.
One of the main objectives of the current work was to study the kinetics of PT processes for which MSM analyses, by discretizing the MD trajectory into 'states' , gave insights into the PT rates that were consistent with various functional group substitutions on MA. This shows that MSM that is usually applied to higher-dimensional systems can also be applied to small molecules. Primarily, the reaction coordinates or selected features for discretizing MSM states, and the expected substituent effects on PT rates indicate the accuracy of the model. Even though MSM analysis misses the key quantum effects that are relevant to biomolecular PT processes, such as enzyme catalysis, the computation of kinetic properties to find tailor-made catalysts might be a useful application.
Finally, the determined rates can be used in the Hammett equation, logk = logk 0 + ρσ, where k and k 0 are the rates of the substituted and unsubstituted compounds, respectively. The substituent constant σ measures inductive effects relative to hydrogen of a substituent in the meta-or para-center. Here, we use the mesomeric constant (σ 0 R ) [93], which is more relevant when considering the resonance-assisted nature of the HBs presented in the current study. Here, ρ can be thought of as the susceptibility of the reaction to the inductive effects. Figures 9(a) and (b) show log(k/k 0 ) for a given substituent at R2 (while R1 varies), and at R1 (while R2 varies), respectively. Figure 9(a) shows that the PT rate increases as we move to the electron-rich substituents; however, it drops in the OCH 3 . Figure 9(b) shows that the PT rate increases until the electron-poor substituent NO 2 . Further, the rates are elevated for BH 2 , even though it functions as a weak electron-withdrawing group. All these findings suggest that both electronic and steric effects are important factors for seemingly barrier-less intramolecular PT in substituted MAs.

Conclusions
The occurrence of intramolecular hydrogen bonding interactions is widespread in chemical and biological systems. Characterization of such HBs helps to determine their functional roles in enzymatic reactions, mechanisms of the actions of drugs, and base-pairing interactions in DNA, RNA, etc. MA exhibiting intramolecular HBs has long served as a prototype system for testing and validating computational approaches, such as the development of reactive models for studying chemical reaction dynamics. Here, we have extended the pool of model systems by including various substituted MAs. Presently, the availability of ML approaches enables simulations of full-dimensional long-timescale trajectories. In this study, we showed that MLFFs are able to describe intramolecular low-energy barrier HBs at a high level of QM accuracy. Here, the free-energy barrier of the PT process in symmetric OHO β-diketone tautomers (O-H· · · O=C) was shown to modulate due to the nature of the substituents, stemming from their electronic and steric factors. MLFFs are indeed able to capture the characteristic dynamical behavior of the intramolecular HBs responsible for exhibiting the correct trends of the ground-state structures, as well as the PT kinetics studied by Markov state modeling. We have shown that functional-group inductive effects can serve as a factual basis for a smooth and consistent interpretation of HB strength. It will be interesting to see how asymmetric NHO resonance-assisted hydrogen bonding systems, such as arylazophenol (N=N· · · H-O) and its arylhydrazo-quinone tautomer (N-NH· · · O=C), are influenced by the nature of the functional groups, apart from those considered in the present study, such as various halogen functionalities.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).