Structural mode coupling in perovskite oxides using hypothesis-driven active learning

Finding the minimum-energy ground-state structure is paramount to designing any material. In ABO$_3$-type perovskite oxides with Pnma symmetry, the lowest-energy phase is driven by an inherent trilinear coupling between the two primary order parameters, rotation and tilt, and the antiferroelectric displacement of the A-site cations, as established via the hybrid improper ferroelectric mechanism. Conventionally, finding the relevant mode coupling driving a phase transition requires first-principles calculations, which are computationally time-consuming and expensive. It involves an intuitive, iterative trial-and-error procedure of (a) adding two or more mode vectors, followed by (b) evaluating which combination leads to the ground-state energy. In this study, we show how a hypothesis-driven active learning framework can identify suitable mode couplings within the Landau free energy expansion, with minimal information on mode amplitudes, for a series of double perovskite oxides with A-site layered, columnar and rocksalt ordering. This scheme is expected to be applicable universally for understanding atomistic mechanisms derived from various structural mode couplings behind functionalities, e.g. polarization, magnetization and metal–insulator transitions.


Introduction
Over the years, research on multiferroics of the ABO$_3$ type [1][2][3] has only grown within the physics and materials science communities, exploring exotic, controllable functionalities with potential in device applications [4][5][6]. There exists a strong coupling between electric polarization ($\vec{P}$) and magnetization ($\vec{M}$) in such systems. Particularly in oxides with Pnma symmetry, properties such as weak ferromagnetism and magnetoelectricity are driven by rotation and tilt of the BO$_6$ octahedra. For oxides in the form of layered perovskite supercells or Ruddlesden-Popper phases, the microscopic polarization ($\vec{P}$) and magnetization ($\vec{M}$) are minimal, which significantly limits their practical applications.
The double perovskite oxides (DPOs) with chemical formula AA′BB′O$_6$ are possible alternatives, providing much flexibility in both structural and compositional space to achieve targeted functionalities. An alkaline-earth or rare-earth ion sits at the AA′ sites, whereas transition-metal ions occupy the BB′ sites. Different orderings are possible for both cation sites. The BB′ sublattice typically orders in rock-salt form, while the AA′ sublattice can order in layered [L], rock-salt [R] or columnar [C] form. The phase with layered AA′ and rock-salt ordered BB′ (A[L]B[R]) exhibits a non-centrosymmetric polar structure with $P2_1$ symmetry once the structure is subjected to distortions along $(a^-a^-c^+)$. The low-symmetry $P2_1$ phase is both polar and magnetic. The microscopic polarization arises from the non-cancellation of the layer polarization in two successive layers, whereas magnetic ordering originates from the BB′ sites. The lowest-symmetry phase is established by the inherent trilinear coupling between in-phase rotation ($Q_{R+}$), tilt ($Q_T$) and antiferroelectric displacement ($Q_{AFEA}$) via the hybrid improper ferroelectric (HIF) mechanism. The metal-insulator transition is controlled by the charge disproportionation modes. In the realm of HIFs [7][8][9][10][11], a trilinear coupling links two unstable zone-boundary distortions, labeled $Q_1$ and $Q_2$, to a polar zone-center mode, $Q_P$. The free energy can be expressed as $F \sim Q_1 Q_2 Q_P$, indicating the interdependence among these modes. The zone-boundary modes $Q_1$ and $Q_2$ act in concert to drive the phase transformation and establish the space group symmetry observed in the polar ground state. Consequently, both $Q_1$ and $Q_2$ serve as the primary order parameters in this context.
A recent study led by Ghosh et al [12] explored how the trilinear coupling even contributes to the formation of A-site clear layered ordering in DPOs. Traditional machine learning (ML) combined with causal models elucidated that the second-order Jahn-Teller distortion is optional for stabilizing A-site clear layered ordering in DPOs if a trilinear coupling is established. A related recent investigation [13] also proposed that the out-of-phase rotation ($Q_{R-}$) mode drives a phase transition in DPOs such that $\vec{P}$ changes from (+) to (−) at high temperature. Overall, phonon or structural modes, individually or together, drive the primary functionalities in perovskite oxides, making them essential for establishing the relevant fundamental understanding along with the corresponding structure-property relations.
However, several challenges remain in directly utilizing information on structural modes across a wide compositional space [14][15][16][17][18][19][20]. The first-principles computations needed to derive information on modes are time-consuming and expensive. It is also difficult to establish a direct connection between the structural modes and experimental observables [17,18]. The conventional way to correctly identify the lowest-energy phase driven by these modes follows an intuitive, iterative trial-and-error procedure. First, two or more mode vectors are added for a particular structure. Next, additional computations are performed to determine whether the combination of modes has led to the ground-state energy. Even for a 20-atom cell, there are a total of 57 optical modes, for which many such combinations are possible, making this procedure inefficient for identifying existing or new mode couplings. Therefore, a smart active-learning strategy, one that is also capable of encoding the right set of physical laws governing the material functionalities of interest, needs to be designed or applied to solve such problems.
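To give a sense of scale for this search space, the short sketch below counts the optical modes of an N-atom cell (3N − 3) and the number of distinct two- and three-mode combinations among them. The function name `count_mode_combinations` is illustrative, and treating every pair or triple as a candidate is a simplifying assumption; symmetry would rule out many of them in practice.

```python
from math import comb


def count_mode_combinations(n_atoms: int, order: int) -> int:
    """Number of distinct `order`-mode combinations among the optical modes."""
    n_optical = 3 * n_atoms - 3  # 3N vibrational modes minus 3 acoustic modes
    return comb(n_optical, order)


if __name__ == "__main__":
    for order in (1, 2, 3):
        print(order, count_mode_combinations(20, order))
```

For the 20-atom cell mentioned above, this gives 57 single modes, 1596 pairs and 29260 triples, each of which would require its own set of first-principles calculations in a brute-force search.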
Historically, Gaussian processes (GPs) have been used within active learning and Bayesian optimization to reconstruct behavior over a wide chemical space. In a regular GP, a kernel function (such as the radial basis function kernel) defines the degree of correlation across the parameter space. The parameters of the kernel are inferred from the data obtained during the exploration-exploitation steps. As a result, standard GPs are purely data-driven and non-parametric, and they fail to incorporate any prior information on the physical or chemical behavior of the system; the learning is derived from the data itself through the parameters of the kernel function. In recent works, Ziatdinov et al [21][22][23][24] have shown how a physics-augmented GP (sGP), informed with physical priors, is suitable for gaining insights into domain growth laws and other physical properties in functional materials using observational data from scanning probe microscopy. This active learning scheme allows multiple hypotheses to be included, which are compared using a reward-driven acquisition function to select the next set of evaluation points. This work has also been extended to the field of molecular discovery by Ghosh et al [25]. There, a systematic physics-constrained featurization method is combined with the existing active-learning scheme to derive competing hypotheses from existing data or prior knowledge, followed by identifying the most suitable one to actively learn the structure-property relations.
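As a minimal illustration of how a kernel encodes correlation in a standard GP, the sketch below evaluates a radial basis function kernel in plain Python. The function names and the default `length_scale` and `variance` values are assumptions made for illustration; in practice these kernel parameters are inferred from data, and this is not the implementation used in the cited works.

```python
import math


def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """RBF kernel between two points given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)


def kernel_matrix(points, length_scale=1.0, variance=1.0):
    """Covariance matrix of a zero-mean GP prior over the given points."""
    return [
        [rbf_kernel(p, q, length_scale, variance) for q in points]
        for p in points
    ]
```

Nearby points in the parameter space receive a covariance close to `variance`, while distant points become nearly uncorrelated. Nothing in this construction carries physical knowledge, which is the limitation that the structured GP addresses by replacing the zero prior mean with a parametric physical model.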
In the present study, we show how such a hypothesis-driven active learning framework can be used to accurately determine mode couplings within the Landau free energy expansion using four key structural modes, $Q_{R+}$, $Q_T$, $Q_{AFEA}$ and $(Q_{CD})_{2D}$, as shown in figure 1. We compare four different expressions combining the mode amplitudes of $Q_{R+}$, $Q_T$ and $Q_{AFEA}$ to revisit the inherent trilinear coupling present in A-site clear layered polar DPOs. In addition, we determine that such trilinear coupling is absent for A-site rocksalt and columnar ordered DPOs, as validated by symmetry arguments.
The overall workflow adopted in the study is shown in figure 2. It integrates the steps of (a) formulating possible hypotheses (e.g. Landau free energy expansions leading to the ground-state energy), (b) wrapping the hypotheses within sGP models with probabilistic priors, (c) evaluating each hypothesis within an active learning loop and, finally, (d) finding a model that accurately reconstructs the functional behavior for all the materials present within the dataset.

First-principles computations:
We have considered the existing datasets from the publicly available repository on A-site cation ordered DPOs with varied tolerance factors (TF) and charge states, with the generalized chemical formula AA′BB′O$_6$. The datasets include a total of 145 compounds with three different types of A-site cation ordering, layered [L], columnar [C] and rocksalt [R], with the B-site fixed in [R] ordering and a G-type antiferromagnetic (AFM) configuration. First-principles calculations were performed using density functional theory (DFT) [26] with projector augmented wave (PAW) potentials [27] within the generalized gradient approximation (GGA) with $U_E$ [28], using the Vienna ab initio simulation package (VASP) [29]. Additional details on the computations can be found in the reference article [12].

Landau's free energy expansion:
Different structural modes drive the phase transition in bulk DPOs. Once the mode vectors for $Q_{R+}$ and $Q_T$ are coupled together, followed by geometry relaxation, the A-site clear layered ordered DPOs go from the high-symmetry $P4/nmm$ phase to the low-symmetry $P2_1$ phase, driven by these two primary order parameters, which are inherently coupled with $Q_{AFEA}$. Equations (1) and (2) represent the Landau free energy expansions (up to second order) for the mode coupling without and with the geometry relaxation, respectively. Here, coefficients such as $a_1$ and $a_2$ are determined using DFT computations for a varied set of structural distortions, and $\alpha$ is attributed to the trilinear coupling. The presence or absence of such coupling is governed by symmetry relations. The symmetry elements of both phases must be included within the group-subgroup relations at the phase transition to model the process correctly. The A-site columnar phase is centrosymmetric in nature. Therefore, $Q_T$ is absent. For A-site rocksalt ordering, all three modes, $Q_{R+}$, $Q_{AFEA}$ and $Q_T$, are present with magnitudes comparable to those for A-site clear layered systems.
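As an illustrative sketch of the kind of expansion described in this section, a second-order Landau energy with and without a trilinear term could be written in the generic form below. The symbols $F_0$ and $a_3$, and the exact grouping of terms, are assumptions introduced here for illustration and need not match the numbered equations of this work.

```latex
\begin{align*}
F_{\text{uncoupled}} &= F_0 + a_1 Q_{R+}^2 + a_2 Q_T^2 + a_3 Q_{AFEA}^2, \\
F_{\text{trilinear}} &= F_0 + a_1 Q_{R+}^2 + a_2 Q_T^2 + a_3 Q_{AFEA}^2
                        + \alpha\, Q_{R+} Q_T Q_{AFEA}, \\
\frac{\partial F_{\text{trilinear}}}{\partial Q_{AFEA}} = 0
  \;&\Rightarrow\; Q_{AFEA}^{*} = -\frac{\alpha\, Q_{R+} Q_T}{2 a_3}
  \qquad (a_3 > 0).
\end{align*}
```

Under these assumptions, the last line shows why the coupling matters: because the trilinear term is linear in $Q_{AFEA}$, a non-zero antiferroelectric displacement is induced whenever both $Q_{R+}$ and $Q_T$ are non-zero, even if $Q_{AFEA}$ is stable on its own.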

Hypothesis-driven active learning:
Hypothesis learning employs a structured, or physics-augmented, GP (sGP) in place of the standard zero-mean GP in the GP/BO framework, as detailed in the literature [21,25]. The sGP allows physics-informed priors to be included by substituting the prior mean function with a probabilistic model whose parameters are inferred jointly with the kernel parameters. Hypothesis-driven learning uses an ensemble of sGP models that may compete with each other to accurately capture the evolution of the property of interest over a large parameter space. In this work, our goal is to identify and reproduce any trilinear coupling between key modes present in A-site layered, columnar and rocksalt DPOs. We use the open-source GPax library for our implementation of hypothesis learning, combined with the physics-informed featurization scheme available via a GitHub repository.
We consider additional formulations of the Landau free energy expansion, equations (3) and (4), as hypotheses along with equations (1) and (2), to evaluate whether the active learning scheme can correctly identify the expression that includes the trilinear coupling term for A-site clear layered and rocksalt ordered DPOs. We have restricted our investigation to terms up to second order and the trilinear coupling for simplicity. For A-site columnar DPOs, we evaluate whether a trilinear coupling between $Q_{R+}$, $Q_{AFEA}$ and $(Q_{CD})_{2D}$ is possible using a set of four formulations, equations (5)-(8). These equations are treated as individual probabilistic models, with priors as listed in table 1, wrapped into sGPs within the active learning.
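A minimal sketch of how such competing hypotheses can be encoded as candidate mean functions is given below. The term content assigned to each model (which models include a $Q_{AFEA}^2$ term or a trilinear term) is an assumption inferred from the qualitative descriptions in the text, not a transcription of the numbered equations, and all names (`landau_energy`, `HYPOTHESES`, `evaluate_hypothesis`) are hypothetical.

```python
def landau_energy(q_r, q_t, q_a, e0, a1, a2, a3=0.0, alpha=0.0):
    """Landau-type energy: quadratic terms plus an optional trilinear term."""
    quadratic = a1 * q_r ** 2 + a2 * q_t ** 2 + a3 * q_a ** 2
    trilinear = alpha * q_r * q_t * q_a
    return e0 + quadratic + trilinear


# Assumed term content of the four competing hypotheses:
# (include the Q_AFEA^2 term?, include the trilinear term?)
HYPOTHESES = {
    "model_1": (False, False),
    "model_2": (False, True),
    "model_3": (True, True),
    "model_4": (True, False),
}


def evaluate_hypothesis(name, modes, params):
    """Evaluate one hypothesis for mode amplitudes (Q_R+, Q_T, Q_AFEA)."""
    use_a3, use_alpha = HYPOTHESES[name]
    q_r, q_t, q_a = modes
    return landau_energy(
        q_r, q_t, q_a,
        params["e0"], params["a1"], params["a2"],
        a3=params["a3"] if use_a3 else 0.0,
        alpha=params["alpha"] if use_alpha else 0.0,
    )
```

In a structured GP, each such function would serve as the prior mean, with the coefficients treated as random variables with priors rather than as the fixed numbers used here.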

Results and discussion
In standard ML practice in the physical sciences, much emphasis has been placed on deriving design rules from patterns identified in existing data, with extensive use of correlative methods. While these are well suited to independent and identically distributed data, their capabilities are often limited when it comes to discovering physical laws. One focus of this work, therefore, is to extract such rich physics from a small dataset in a way that can be extended to other materials within the same family of oxides.
A-site clear layered and rocksalt ordered DPOs: It is already known [12] that for A-site clear layered ordered DPOs, the trilinear coupling between $Q_{R+}$, $Q_{AFEA}$ and $Q_T$ drives the phase transition. The trilinear coupling also plays a key role in stabilizing the cation ordering at the A-site. We have used equations (1)-(4) as individual probabilistic models, wrapped into sGPs with priors within the hypothesis-driven learning, to reproduce the Kohn-Sham energy. Within the hypothesis learning setup, we use several models that are expected to approximate the physical behavior. A basic reinforcement learning policy, epsilon-greedy in this case, is used to select the correct one based on the collected 'rewards'. Uncertainty-based exploration (UE), as implemented in the GPax library, is used as the acquisition strategy. Within each step, all the sGP models are evaluated, and the model best suited for the exploration is selected based on uncertainty minimization. The model that wins an exploration step scores the highest reward, and rewards are accumulated over all the steps within the active learning loop.
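The reward bookkeeping described above can be sketched with a simple epsilon-greedy loop. This is a simplified stand-in rather than the GPax implementation: `uncertainty_fn` is a hypothetical callable returning the predictive uncertainty of a given model at a given step, only one model is evaluated per step, and the +1/−1 reward rule (reward a model when its step lowers the uncertainty relative to the previous step) is an assumption.

```python
import random


def epsilon_greedy_select(rewards, epsilon, rng):
    """Explore a random model with probability epsilon, else pick the best."""
    if rng.random() < epsilon:
        return rng.randrange(len(rewards))
    return rewards.index(max(rewards))  # ties resolve to the lowest index


def run_hypothesis_loop(uncertainty_fn, n_models, n_steps, epsilon=0.4, seed=0):
    """Accumulate per-model rewards over an active learning loop."""
    rng = random.Random(seed)
    rewards = [0.0] * n_models
    counts = [0] * n_models
    previous = float("inf")
    for step in range(n_steps):
        idx = epsilon_greedy_select(rewards, epsilon, rng)
        current = uncertainty_fn(idx, step)
        # Assumed rule: +1 if uncertainty dropped since the last step, else -1.
        rewards[idx] += 1.0 if current < previous else -1.0
        counts[idx] += 1
        previous = current
    return rewards, counts
```

Dividing each accumulated reward by the number of times the model was selected gives an average reward per model, the quantity used to compare hypotheses at the end of the loop.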
From our existing knowledge, it is evident that equations (1), (2) and (4) do not correctly represent the Landau free energy expansion if a trilinear coupling is attained by the system. However, these equations correctly represent scenarios where (a) the trilinear coupling is absent ((1) and (4)) or (b) the trilinear coupling is inherently present ((2)), i.e. the mode vectors of $Q_{R+}$ and $Q_T$ are already coupled such that it is not required to explicitly include the $Q_{AFEA}$ square term.

Figure 4. [...] Models 1-4 are approximated using equations (1)-(4), respectively. For A-site columnar ordered compounds (c), (d), Models 1-4 are approximated using equations (5)-(8), respectively.
The idea behind hypothesis-driven learning is to identify the correct expression even for a system where the physics is not known. In this case, for a DPO with a different ordering such as rocksalt, it is not known whether the trilinear coupling would take place. Therefore, we let the active learning workflow identify the correct expression from several competing models. The rewards collected during exploration, as well as the average reward for all four models, for A-site layered (a), (b) and rocksalt (e), (f) ordered phases are plotted in figure 4. From figures 4(a) and (b), it is evident that Model 3 (corresponding to equation (3)) collects the highest reward. The next competing model in this example is Model 1 (corresponding to equation (1)). We speculate that this is because, for some systems in the dataset, the trilinear coupling is weak (<5 meV/f.u. Å$^{-3}$) and contributes only nominally to the energy. Overall, however, the workflow correctly identifies the Landau free energy expansion for a total of 158 A-site clear layered ordered systems with trilinear coupling.
It is important to stress that our active learning framework is not solely focused on minimizing errors within test sets; its primary aim is to unveil rules or principles that enable predictions beyond the initial data scope. Our findings show how this approach helps pinpoint accurate Landau free energy expansions for materials exhibiting three distinct types of A-site ordering. Moreover, the derived formulation can be used to predict the target property. The comparison between energies predicted using the sGP models and those computed via DFT simulations is illustrated in the supporting information. We acknowledge the significance of providing accurate error measurements linked to predictions; these are depicted in the supporting information through histograms and annotated color maps. The mean relative absolute errors (MRAEs) in Kohn-Sham energy predictions were assessed using, for each type of ordering, the model that achieved the highest average reward.
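For concreteness, a mean relative absolute error can be computed as sketched below. The definition used here, the mean of |predicted − reference| / |reference|, is the common one and is assumed rather than taken from the text; the function name is illustrative.

```python
def mean_relative_absolute_error(predicted, reference):
    """Mean of |predicted - reference| / |reference| over paired values."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return total / len(reference)
```

With this definition, a value of 0.05 corresponds to predictions that deviate from the DFT reference by 5% on average.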
The MRAEs for the clear layered ordered and rocksalt ordered systems are 0.067 (∼6.7%) and 0.038 (∼3.8%), respectively. Interestingly, for A-site rocksalt ordered systems, Model 4 (corresponding to equation (4)) wins during the exploration steps, meaning that the trilinear coupling is not established. This agrees with our physics-based understanding of the structural distortions. The mode vectors for $Q_T$ move along the pseudocubic [001] and [00$\bar{1}$] directions. The apical O atoms located in the A′O plane displace towards the A′-site to establish a bonding state, which results in movement of the A′ atoms towards the O-sites. Although all three modes are present for rocksalt ordered compounds, the A′ cations are not located in layered positions, hindering the coupling between $Q_{R+}$, $Q_T$ and $Q_{AFEA}$. Therefore, the model representing the presence of these three modes earns the highest reward during exploration.
A-site columnar ordered DPOs: Modes such as $Q_{R+}$, $Q_{AFEA}$ and $(Q_{CD})_{2D}$ are present, while $Q_T$ is absent, for A-site columnar ordered DPOs. It is not yet known whether a trilinear coupling can be established between these modes. We use this as motivation to construct four equations, (5)-(8), in the same manner as equations (1)-(4), which were tailored for A-site clear layered and rocksalt ordered DPOs. In these formulations, combinations of $Q_{R+}$, $Q_{AFEA}$ and $(Q_{CD})_{2D}$ are considered instead of $Q_{R+}$, $Q_{AFEA}$ and $Q_T$. Model 3 (corresponding to equation (7)) would be the best model to estimate the Kohn-Sham energy if there were a successful trilinear coupling between $Q_{R+}$, $Q_{AFEA}$ and $(Q_{CD})_{2D}$ responsible for energy lowering. As shown in figures 4(c) and (d), Model 1 (corresponding to equation (5)) accumulates the highest average reward and leaves the other models far behind in approximating the energy. This not only signifies that the trilinear coupling is not present for A-site columnar ordered phases but also shows the nominal contribution of $Q_{AFEA}$ towards stabilizing such ordering. We can further justify our findings with crystal symmetry arguments. For $Q_{R+}$, the mode vectors are displaced along the [110] and [1$\bar{1}$0] directions, which in turn can affect the distortions of the planar O atoms. However, in the absence of $Q_T$, the coupling among $Q_{R+}$, $Q_{AFEA}$ and $Q_T$ cannot form. Consequently, the displacements of the planar O atoms may not be affected by the $Q_{R+}$ mode vectors. The A-site cations are located at centrosymmetric positions and therefore do not have much effect on the charge disproportionation modes. Hence, a trilinear coupling between $Q_{R+}$, $Q_{AFEA}$ and $(Q_{CD})_{2D}$ is not feasible, which is correctly captured by the hypothesis-driven workflow. The MRAE for prediction using the model for columnar ordered compounds is 0.071 (∼7.1%).

Conclusions
In summary, we have used a hypothesis-driven active learning workflow to reproduce and identify mode couplings for A-site cation ordered DPOs. The learning process correctly identifies the Landau free energy expansions, with different order parameters, that govern the phase transition and functionalities of these structures. For A-site clear layered ordered systems, the trilinear coupling between $Q_{R+}$, $Q_{AFEA}$ and $Q_T$ is reproduced. Such trilinear coupling does not exist for rocksalt ordered structures, while no new mode coupling between $Q_{R+}$, $Q_{AFEA}$ and $(Q_{CD})_{2D}$ is identified for columnar structures, as explained by crystal symmetry. Within the scope of this work, we limit ourselves to investigating couplings between a few key structural modes that have previously been reported to govern the phase transition. Future studies can extend this active learning process to identify novel couplings that are yet to be explored, opening a route from extracting design principles to discovering physical laws.

Figure 1. Schematic representation of structural modes. Key structural modes for A-site cation ordered double perovskite oxides: (a) in-phase rotation ($Q_{R+}$), (b) tilt ($Q_T$), (c) antiferroelectric A-site displacement ($Q_{AFEA}$) and (d) two-dimensional charge disproportionation $(Q_{CD})_{2D}$. The A-site cations are displayed in blue and grey, the B-site cations in green and brown, and the anion sites in red.
The Kohn-Sham energies for each ordered phase, [L], [C] and [R], of the DPOs, along with their structural modes, are used for our active-learning-based investigations of mode couplings.

Description of key structural modes:
The key structural modes for the A-site layered DPOs include $Q_{R+}$, $Q_{R-}$, $Q_T$, $Q_{AFEA}$, $Q_{AFEO}$, $(Q_{CD})_{2D}$ and $(Q_{CD})_{3D}$. The clockwise/counter-clockwise rotations of the in-plane O atoms located at the top and bottom layers of the BO$_6$ (or B′O$_6$) octahedra give rise to $Q_{R+}$/$Q_{R-}$, respectively. The $Q_T$ mode predominantly characterizes the movement of the apical O atoms present in both layers. The antiferroelectric displacement of the A-sites, which governs the layer polarization, is quantified by $Q_{AFEA}$. Here, the A and A′ sublattices move in opposite directions, owing to their different charge states and cationic radii. The displacement of planar O atoms toward the cation with the higher charge state, due to the electrostatic Coulomb interaction in DPOs, is represented by $Q_{AFEO}$. The $(Q_{CD})_{2D}$ mode involves a proportional change in the B-O (or B′-O) bond length, while the $(Q_{CD})_{3D}$ mode pertains to the volume change of the BO$_6$ (or B′O$_6$) octahedra. For the A-site clear layered ordered DPOs, the high-symmetry $P4/nmm$ $(a^0a^0c^0)$ phase transforms through four different intermediate phases, $P42_12$, $Pmn2_1$, $P2_1/m$ and $P\bar{4}2_1m$, associated with the key modes $Q_{R+}$, $Q_{AFEA}$, $Q_T$ and $(Q_{CD})_{2D}$, respectively. Figure 3 displays both the group-subgroup relations and the contour of the energy surface for a DPO (KYFeOsO$_6$).

Figure 2. Workflow. Primary steps of the workflow, including consideration of hypotheses, evaluation via active learning and prediction of the Kohn-Sham energy, are shown.

Figure 3. Symmetry relations. (a) Group-subgroup relationships between the high-symmetry $P4/nmm$ $(a^0a^0c^0)$ and the low-symmetry $P2_1$ $(a^-a^-c^+)$ phases. The intermediate phases belong to the space groups $P42_12$, $Pmn2_1$, $P2_1/m$ and $P\bar{4}2_1m$. (b) Energy surface contour of KYFeOsO$_6$ with respect to the primary order parameters $Q_{R+}$ $(a^0a^0c^+)$ and $Q_T$ $(a^-a^-c^0)$.

Table 1. List of priors for all equations to learn Kohn-Sham energies of DPOs within the Landau free energy expressions.