Artificial neural network potentials for mechanics and fracture dynamics of two-dimensional crystals

Understanding the mechanics and failure of materials at the nanoscale is critical for their engineering and applications. The accurate atomistic modeling of brittle failure with crack propagation in covalent crystals requires a quantum mechanics-based description of individual bond-breaking events. Artificial neural network potentials (NNPs) have emerged to overcome the traditional, physics-based modeling tradeoff between accuracy and accessible time and length scales. Previous studies have shown successful applications of NNPs for describing the structure and dynamics of molecular systems and amorphous or liquid phases of materials. However, their application to deformation and failure processes in materials is still uncommon. In this study, we discuss the apparent limitations of NNPs for the description of deformation and fracture under loadings and propose a way to generate and select training data for their employment in simulations of deformation and fracture simulations of crystals. We applied the proposed approach to 2D crystalline graphene, utilizing the density-functional tight-binding method for more efficient and extensive data generation in place of density functional theory. Then, we explored how the data selection affects the accuracy of the developed artificial NNPs. It revealed that NNP’s reliability should not only be measured based on the total energy and atomic force comparisons for reference structures but also utilize comparisons for physical properties, e.g. stress–strain curves and geometric deformation. In sharp contrast to popular reactive bond order potentials, our optimized NNP predicts straight crack propagation in graphene along both armchair and zigzag (ZZ) lattice directions, as well as higher fracture toughness of ZZ edge direction. Our study provides significant insight into crack propagation mechanisms on atomic scales and highlights strategies for NNP developments of broader materials.


Introduction
Understanding fracture mechanics and crack propagation is key to predicting and controlling mechanical behaviors for materials processing and subsequent materials applications. In many materials, crack propagation under loading is an overriding failure and, therefore, one of the critical problems in materials science. Accurate computational modeling of crack propagation, thus, becomes an essential tool for The size was selected to avoid self-interaction when we consider the radius cutoff of local descriptors of NNP (c) Energy comparison from the training and validation sets. direction does not occur in a straight line, even though a previous high-resolution transmission electron microscope (TEM) study reported both straight AC-and ZZ-torn graphene [32].
Two-dimensional materials are ideal for validating atomistic models of materials failure under the effects of vacancies, bilayer, crack directions, and folding by comparing the fracture patterns observed through advanced TEM techniques [33][34][35][36]. A recent in-situ TEM with reactive molecular simulation has shown that lattice distortion at the crack tip in 2D materials, WS 2 with a hexagonal lattice, can drastically change the entire path of crack propagation [35]. Also, theoretical comparisons between DFT and AIREBO show that the lattice distortion under tensile loading is significantly different [37]. Therefore, we hypothesized that developing an NNP of graphene for lattice deformation and fracture under various loadings could provide a more faithful prediction of crack propagation in graphene, especially along the AC direction. Furthermore, establishing a data preparation process can provide a foundational workflow to develop NNPs for mechanics and fractures of other materials.
A schematic of the current study is shown in figure 1(a). In a previous study [37], we showed that DFTB could correctly predict both mechanical behaviors and lattice deformation of graphene under various loading conditions, in good agreement with DFT calculations. Therefore, we here utilized DFTB to generate extensive data for possible fracture scenarios of graphene under various loadings by mixing two tensile and one shear deformation. The data are reduced and selected based on deformation and energy differences to improve the generalization during the training. Then, we explored how the data selection affects the accuracy of the trained NNPs and the reliability evaluated based on physical properties such as deformation and stress-strain curves. In the end, we performed simulations for the crack propagation of graphene with a sharp crack using the trained NNP. We compared the results with an approach based on a popular empirical bond order potential, AIREBO. The results show better agreement with previous experiments regarding the resulting edge structures and the frequency of the type of torn edges.

Generation of data
The MDs simulations with DFTB for data generation were performed via the large-scale atomic/molecular massively parallel simulator (LAMMPS) package [38]. DFTB calculations were performed at each time step through the DFTB+ package [39], utilizing a previously developed interface for LAMMPS [40]. We employed the 3OB [41] C-C parameters with the DFTB3 scheme because the stress-strain behaviors are well-matched with those from DFT calculations with the PBE functional [40]. We generated 10 000 data points through the NVT ensemble at 400 K with the time step of 0.5 fs to obtain reference RMSE of the relative energy from the canonical data generation.
Then, we obtained data points for the deformation and fracture under various loading. Static loading or quasi-static loading with full energy minimization has a limitation for the training data set because all atoms are located in the energy minimized positions. We need slightly perturbated coordinates to train the NNP to distinguish the contribution of a single atomic energy. Therefore, we utilized dynamics loading for the data generation. We consider different loading directions by mixing the loadings along x, y (pure tensile), and xy (pure shearing) directions as (v x , v y , v xy ). We prepared 361 directions with constant velocity 400 m s −1 with 0.5 fs time step. Each direction has 2000 steps, so the total data number is 722 000. Here, the loading speed is much faster than what is desired to provide reliable behaviors, which is under 20 m s −1 [40]. The effect of the loading speed becomes critical when the speed is too fast for the system to have enough time to relax the structures under the deformation. So, after every 20 steps, we included a small step of energy minimization to overcome the delay. The number of steps was tuned to match the stress-strain curve under shear loading to the results from the quasi-static loading. At each step, we deformed the simulation box by about 0.002 Å, resulting in a total deformation for each loading of about 4 Å.

Selection of data
From the 722 000 data points, we built neighbor lists between data points based on the deformation of the simulation box (dx, dy, dxy). We consider the distance (δr) between data as an indicator of the deformation similarity. Then, we sequentially deleted the data but saved it if the energy difference in the list is larger than δE. We utilized vales of δr (0.01 Å-0.2 Å) and δE (1 kcal mol −1 -50 kcal mol −1 ) from the original 722 000 data points, and the number of reduced data from (δr, δE) is listed in table 1.

Training
We utilized TorchANI library and its setting for training the NNs. For training, 80% of data was used, while 20% of data was utilized for validation with a small mini-batch size of 64. We note that we did not explicitly prepare a specific test set in this study because we utilized most hyperparameters suggested in the previous study of TorchANI [42]. Instead, we evaluated the performance of each model by the same reference data (δr = 0.01 Å, δE = 1 kcal mol −1 , ∼450 k). Therefore, our selected model from the data set (e.g. δr = 0.1 Å, δE = 1 kcal mol −1 , ∼250 k) was trained without 200 k data in both training and validation for their final performance.
The loss function is defined as where α is a parameter to determine the contribution of forces, and we used 0.1. The Adam optimizer was utilized with weight decay for the weights [43,44] (weight decays for two hidden layers are set 1 × 10 −5 and 1 × 10 −6 , respectively, others are default values) and stochastic gradient descent (SGD) [45] for the biases (learning rate = 0.001, others are default values), as suggested in the previous study. The weights were initialized by Kaiming initialization [46] with the normal distribution, and the initial values of biases were zeros. We utilized a learning rate scheduler (a function 'reduced lr on plateau' in Pytorch) for both Adam and SGD with factor = 0.5, patience = 100, and threshold = 0. Others are default values). We took the best model for the validation set during the entire epochs.

MDs simulations for crack propagation
We prepared the pristine graphene of 10 nm by 20 nm with a 2 nm sharp crack along the y direction with periodic boundary conditions. To avoid the interaction between image cells, 20 nm space along the y direction and 3.35 nm space along the z direction were inserted. Instead of full dynamic loading, we applied a 0.01 strain at each iteration to stretch to a 0.04 strain along the x direction with structural relaxations. Then, we applied dynamic tensile loading along the x direction at low temperature, 10 Kelvin. The loading speed was 2.0 m s −1 , and the time step was set to 1 fs.

PyTorch interface with LAMMPS
We utilize the python functions in LAMMPS (v. 29OCT20) to utilize the python code in the python environment. Through the python environment, we can easily call python library. Utilizing TorchANI, we calculate the atomic environmental vectors (AEV) for NNs. Forces and stress are calculated with the given coordinates and simulation box through the autograd engine in PyTorch [47] and updated at each time step.

Results and discussions
One of the most critical parts of NNP development is the training data set. The data for training should cover the essential features of the problem-specific configurations. Previous NNPs have been trained from the data usually generated from first principles DFT-based MDs simulations and conformal searches based on the normal mode analysis [15,16,48]. The initial training set is generally not sufficient for the desired accuracy. Therefore, adaptive and active learning approaches have been proposed and applied [49,50]. The basic principle of such methods is to detect new data not contained in the initial training data set by analyzing the data using configuration fingerprints or comparing values from multiple models of NNPs, an ensemble, or a so-called committee. Iteratively searching the new data, training, and sampling provide better data sets in the end. However, these approaches are not sufficiently good for fracture dynamics if the initial data set is not accurate enough to describe the dynamics during the failure. We first tested previously trained models of ANI-1x [16], ANI-1cxx [51], and ANI-2x [52] as provided in the TorchANI library [42], where the data sets include the deformed geometries of small organic molecules from normal mode sampling. We examined the reliability with a single-layer graphene system consisting of 24 atoms under three different loadings (shown in figure 1(b)) by evaluating stress-strain curves and the deformation of bond length (l 1 , l 2 , and l 3 ) and angles (θ 1 , θ 2 , and θ 3 ). Since we utilized local descriptors with a radius cutoff ∼5 Å in TorchANI, the unit cell of 24 atoms with system size (∼7.4 Å × 8.5 Å) was selected as a compromise to avoid self-interaction effects on one hand and allow sufficiently large data creation with DFTB for training the NNP. A newly developed PyTorch interface in LAMMPS was utilized for communicating energy, forces, and stress with the TorchANI python library. Figures S1-S3 show the results of stress-strain curves and the deformation through ANI models.
We note that the data sets do not explicitly include graphene information, but interestingly, the NNPs from ANIs can well describe graphene's behaviors under small deformation. As expected, however, they clearly fail for fracture behaviors and large deformation. Therefore, we designed data generation by mixing three loading directions along the x, y, and xy directions and generated more than 700 000 data points. Then, we trained new NNPs from scratch using the TorchANI [42] tool. The structure of the utilized NNP in the current study is shown in figure 2 and the comparison between other ANI models is in table S1. From the actual coordinates (q), AEV (also known as a symmetry function, G) are utilized as input to be invariant under translation and rotational transformation and the permutation of the same atom types [15]. There are two parts in the AEV: radial and angular terms from two atoms (i and j with distance R ij ) and three atoms (i, j, and k with two distances R ij , R ik , and one θ ijk ), respectively [42]: where η controls the width of Gaussian function with multiple R s for probing specified radial environments (m is an index for R s ); ζ controls the width of probing as η; θ s decides the specific region in the angular environments as R s . f C is a cutoff function to change values to zero at R C smoothly, defined as for R ⩽ R C and 0 for R > R C . AEV from each atom is an input for the NN to estimate the single atomic energy. We followed the parameters of AEV and the structures of nodes and layers from ANI-2x: the radial term has 16 different radii, and the angular term has 4 angles and 8 radii [52] shown in figure 2(b). The NN has three hidden layers with Gaussian error linear unit activation function [53] to add non-linearity.
Next, we checked the performance of the NN structures and training processes from the data generated from graphene's MD simulation at equilibrium states at 400 Kelvin. We utilized a root mean square error (RMSE) for the evaluation of the accuracy for both energy and force components as a kcal mol −1 energy unit. Because some previous works utilized a mean absolute error (MAE) with meV atom −1 energy unit as metrics, we also added the MAE values with meV atom −1 . A RMSE in the relative energy lower than 0.8 kcal mol −1 (RMSE = 1.5 meV atom −1 ∼ 0.034 kcal mol −1 atom −1 , MAE = 0.6 meV atom −1 ∼ 0.014 kcal mol −1 atom −1 ) is achieved around 200 epochs, which we consider a very high accuracy from the perspective of computational chemistry, where a threshold 1 kcal mol −1 (for small molecules with ∼20 atoms) in accuracy is commonly considered a 'gold standard' . However, our result clearly shows that an agreement between NNP and ground truth data in terms of relative energy does not guarantee correct physical properties. As shown in figure S4, the failure behaviors are not correctly described. Figure 3(a) shows a naïve way to save data under loading, recording data based on constant deformation (δr) or at a constant time interval. There are two apparent problems with this approach. First, it is likely to miss essential data during the failure process, where the configurations drastically change in a very short time. Second, the many similar data are close to each other near the non-deformed structures, which probably hinders the training due to the data imbalance [54]. Therefore, we utilized a constant energy difference (δE) to select data, as shown in figure 3(b). It would have better chances to capture the key data during the failure process even with the same number of data points. Figure 3(c) shows the schematic to represent the data distributions with the two main directions of the loadings: x and xy. From the total data, we built neighbor lists of each data point with a defined cutoff in the deformation space, δr. Then, the data in the neighbor list is sequentially removed if the energy difference is not bigger than the predefined criterion, δE. Table 1 lists the parameters: δr and δE with the screened data numbers.
We checked the stress-strain curves of the trained model from all data without any reduction, as shown in figure S5. As expected, it shows much better behaviors than ANIs or the trained model from NVT ensemble trajectories because the current data set explicitly includes various fracture scenarios. However, it fails to describe the fracture patterns and stress-strain after fracture along the x direction loading in figure  S5(a). The energy minimization during the quasi-static loading should result in complete bond breaking between broken edges. Figures 4(a) and (b) show the RMSE of energy and force components from training, validation, and total data from each data set selected from the above-mentioned approach as varying δr with fixed δE = 1 kcal mol −1 . We note that the data from larger δr is selected from the data set of smaller δr, which means the smaller data sets always belong to the larger data sets. Therefore, the difference of RMSE, e.g. better accuracy of δr = 0.1 Å than that of δr = 0.05 Å, does not come from the data difference but from better generalization with reducing the overfitting (See SI discussion). We assume the original data set has too many similar data to prevent generalization. So, we evaluate the accuracy of all trained models based on the first selected data set from δr = 0.01 Å and δE = 1 kcal mol −1 as shown in figures 4(c) and (d). . RMSE of relative energy per atom (a) and force components (b) from training, validation, and total data from each data set selected from δr with fixed δE = 1 kcal mol −1 . δr = 0 indicates the entire data points without reduction. Reevaluation RMSE of relative energy (c) and force components (d) with the same data set (δr = 0.01 Å, δE = 1 kcal mol −1 ). RMSEs depend on data set, and even smaller numbers of data can have higher accuracy. Also, RMSE does not guarantee the physical properties of the models.
The RMSE of relative energy from the model (δr = 0.01 Å, δE = 1 kcal mol −1 ) shows the lowest value, but RMSE of force components from the model (δr = 0.1 Å, δE = 1 kcal mol −1 ) show the lowest value. Also, we investigated the stress-strain curves and deformation for the reliability of the trained NNPs. The models trained from δr = 0.01 Å to δr = 0.05 Å data have problems near the fracture point, as indicated with arrows in figures S6-S8 (See figures S9-S11 for other conditions). In terms of the stress-strain curves and deformation, the trained model from δr = 0.2 Å looks better than that from δr = 0.1 Å, but the model from δr = 0.1 Å was selected for the next stage because the RMSE of force components exhibits the lowest value. This is a reasonable choice because the RMSE of atomic force is a good indicator for overfitting (see SI discussion). We also tested how the choice of δE affects the accuracy of models with δr lower than 0.1, as shown in figure S12. Since the number of data decreases as the value of δE increases, it is expected to lose accuracy. However, this also does not monotonically decrease, and data with δr = 0.1 Å and δE = 5 kcal mol −1 is reasonably optimal with the number of data points, 78 000 (= 78 k). Considering the fact that the reference data (∼450 k) is six times larger, the loss of accuracy of energy and force components are only 0.02 kcal mol −1 atom −1 (RMSE = 0.9 meV atom −1 ) and 0.12 kcal mol −1 Å −1 (5 meV Å −1 ), respectively. Also, the NNP describes the stress-strain and deformation under three loadings very well, as shown in figure S13. This kind of data reduction without losing the essential data is important for active learning. However, such data augmentation is out of the current scope, and we selected the model from the data (δr = 0.1 Å, δE = 1 kcal mol −1 ) for the next simulations. Figure 5 shows the stress-strain curves and deformation of the selected model under the loading along the x direction.
A previous microscopy study reported both straight AC and ZZ torn graphene edges through the high-resolution TEM [32]. Also, torn lines along the AC edge are twice more frequently observed than the torn lines along the ZZ edge [55]. In previous theoretical studies, ReaxFF and AIREBO have been utilized to describe the mechanics and crack propagation from atomistic modeling. However, ReaxFF has some limitations in describing mechanics and stress-strain curves near failure compared to DFT calculations [56]. Especially, brittle crack propagation is hindered by the stiffening effect near the point of failure. Instead, AIREBO is preferred because the stress-strain curves of pristine graphene are well-matched with DFT calculations once its bond order switching function is controlled [31]. MD simulations of pristine graphene with AIREBO consistently show that the fracture toughness along the ZZ edge is lower than the AC edge [57][58][59]. Also, once the crack propagates along the AC direction, the fracture pattern from the AIREBO shows the ZZ torn edge preference. Although nanopore formation through the electron beam prefer to form the ZZ edge [60], the configuration does not come from the mechanics or crack preference but from the kinetic stability during the reconstructions [61]. This shows that empirical forcefields are limited for predicting pristine graphene's torn edge configuration and the dynamics of crack propagation. Finally, we performed MD simulations using both selected NNP and AIREBO to test our hypothesis for the crack propagation along the AC direction with a rectangular system with 10 nm × 20 nm with a crack of 2 nm, as shown in figure 6. We performed the crack propagation simulations by combining quasi-static loading and dynamic loading to reduce the computational cost. The NNP results in a straight and clean torn edge in both AC and ZZ crack direction in figure 6 (see movies 1 and 2). As described above, AIREBO predicts that the straight propagation along the AC direction is less likely to occur. Instead, it shows meandering crack paths in figure 6 (see movies 3 and 4) with ZZ crack edges. The obtained stress-strain curves from NNP and AIREBO are shown in figure 6. The notable difference here is the fracture toughness. We estimated the critical energy release rate and fracture toughness in both AC and ZZ crack directions. Considering the brittle behaviors of the stress-strain curve, we can directly estimate the external work as the critical energy release rate (G c ) from the critical stress (σ c , GPa) and the critical elongation (∆l, nm) as where l y (= 20 nm) is the length of the system in the y direction, and ∆a (nm) is the total length of the crack propagation. Fracture toughness is then can be obtained by where E is Young's modulus, and it is assumed as 1 TPa. The obtained values are listed in table 2. The obtained fracture toughness values from AIREBO for AC-edge (zz loading direction, 3.54 MPa m 1/2 ) and ZZ-edge (ac loading direction, 3.29 MPa m 1/2 ) agree with those from the previous studies with AIREBO [59]. The NNP predicts the fracture toughness along the AC-edge (zz loading direction, 3.10 MPa m 1/2 ) is lower than the ZZ-edge (ac loading direction, 3.41 MPa m 1/2 ), which shows an opposite trend to the results of the AIREBO. We note that our focus is on the difference in fracture toughness between ZZ and AC directions. The main factor of our estimations comes from the stored elastic energy before the crack propagation. Therefore, a correction of Young's modulus (calculated elastic constants are listed in table 3 and  table 4) estimation without assuming 1 TPa does not affect the relative fracture toughness, e.g. ratio.
The frequency of torn edges in the suspended polycrystalline graphene monolayer depends on the fracture toughness of pristine graphene. The prediction from the NNP agrees well with those previous observations, while AIREBO predicts opposite behaviors in terms of the frequency of torn edge observation and torn AC edge configuration. The limitation of AIREBO comes from the softened angle stiffness under tensile loading, which also has been compared with DFT and DFTB calculations in the previous study [40]. Figure 7 shows graphene's bond lengths and angles at the crack tip just after the first bond breaking during the crack propagation. The angular deformation of NNP shows a lower angle (∼124 • ) than that of AIREBO (133 • ), which results in the elongation of the inner side bond length, l 2 (∼1.7 Å) than the outer side bond length, l 3 (∼1.6 Å). The relative bond lengths between l 2 and l 3 determine the crack path, and AIREBO prefers ZZ crack paths because l 3 (1.72 Å) is longer than l 2 (1.67 Å). In another 2D material, WS 2 , these lattice distortions can result in anisotropic crack dynamics even with the same surface energy [35]. Bond and angle at the crack tip after the first bond breaking along the AC edge direction from NNP (a) and AIREBO (b). Dotted red arrows represent the crack propagation. The angular deformation of trained NNP shows a lower angle than that of AIREBO, which results in the elongation of the inner side bond length, l2, than the outer side bond length, l3, from NNP. AIREBO shows propagation along the zigzag edge because l3 is longer than l2.

Conclusion
In summary, we proposed generating and selecting the training data for the deformation and fracture of crystals and applied it to 2D crystal graphene through DFTB calculations. We utilized the previously developed PyTorch library, TorchANI, to train the models of the Behler-Parrinello type's NNP. For MD simulation, we developed a PyTorch interface with LAMMPS, which can be expanded to other ML potential libraries through PyTorch. The proposed data reduction process improves the generalization of NNP training by mitigating overfitting. We show that the low RMSEs of energy and force do not automatically guarantee the reliable behaviors of the trained models 3 . The selected model considering physical properties can describe the torn edge configuration observed in the previous studies and explains well the higher frequency of torn AC edge occurrence, which is not possible with other reactive potentials. The proposed work frame can be applied to understand fracture dynamics of 2D and bulk crystals using NNPs.
We wish to emphasize that the current NNPs are limited as they should only be used for the simulation of stress-induced fracture and failure of pristine graphene. The training data set does not explicitly have failure dynamics of bilayer graphene, diamond, amorphous carbon network, carbon nanotube, graphyne, grain boundary, vacancies, folding, etc. Therefore, new data should be generated, or transfer learning/active learning is required to provide a quick path for developing NNPs of other applications. However, NNP is more useful for a large system in terms of computational speed than first principles-based electronic structure approaches. Also, the NNP is very flexible in capturing non-linear deformation-stress behaviors well (figure 5), which is challenging with the fixed functional form, such as harmonic equations in classical forcefields. In the previous study, we observed that it is challenging to match both deformation and non-linearity of stress-strain behaviors of graphene simultaneously with those from DFT even through reactive forcefields [37]. Crack dynamics is one of the exciting applications for NNPs due to its intrinsic multiscale feature. In this study, we only focus on the data generation and selection from the mixed loading and data reduction using DFTB as the reference method. However, the selected data can be utilized for high throughput calculation with more accurate methods. Also, active learning and transfer learning from the selected data would be interesting topics in the future.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https:// github.com/gsjung0419/LammpsANI.