Decoding short-range order in cation-disordered rocksalt materials using Metropolis non-negative matrix factorisation

We present a methodology for accessing the cation short-range ordering (SRO) in disordered rocksalt (DRX) materials by decomposing their pair distribution function (PDF) pattern in terms of a set of basis PDFs, pre-determined via Metropolis non-negative matrix factorisation analysis. These basis PDFs, underpinned by structure modelling, point to specific SRO types that subsequently enable identification and quantification of their presence in the DRX structure. Rapid identification of the evolution of SRO addresses a key bottleneck in the analysis and understanding of high energy density characteristics of DRX materials.


Introduction
Cation disordered rocksalt (DRX) materials represent a series of (typically) metal oxides that share the rocksalt (NaCl) structure and contain more than one metal species distributed across the cation sublattice.DRX materials have become increasingly important in the field of lithium-ion batteries, ever since a growing number of Li-based transition metal (TM) DRX oxides (Li 1+x M 1−x O 2 ), and their fluorinated derivatives (Li 1+x M 1−x O 2−y F y ), have been reported to show remarkable performance in both cathode and anode applications [1][2][3][4].In many DRX materials, cation disorder is induced by mechanical forces via ball-milling, or through rapid quenching [5].Both approaches lead to metastable phases, distinct from the thermodynamic phases, in which Li and TM species usually form ordered arrangements (e.g.layered ordering in LiMO 2 ).
While the distribution of cations in DRXs has conventionally been assumed random, a number of recent experimental studies have now reported observations of cation short-range ordering (SRO) in DRX oxides [6,7].The form of SRO is important from a functional perspective, since the (local) structure type, as well as the degree of (dis)order, is now known to affect both Li + -ion mobility and diffusion percolation [8].A clear understanding of these materials' short-range cation arrangement is therefore critical to rationalise their electrochemical properties.However, the complex nature of disorder, reduced particle sizes and increased lattice strain, collectively lead to tremendous challenges in the investigation of SRO.
Addressing these challenges not only requires experimental techniques that allow access to atomic structures but now demands unconventional analytical approaches to determine realistic atomic-scale models of the SRO consistent with experiment.One important established methodology is that of cluster-expansion Monte Carlo (MC), which couples first-principles energy calculations of candidate SRO models with large-scale atomistic simulations [7,9,10].Cluster-expansion MC modelling does not extract structural information directly from experiment, but can be used to test whether experimental signatures of the SRO are consistent with calculations based on Boltzmann-weighted occupation of different SRO motifs.The approach is not directly applicable to metastable phases.Likewise, the application of pair distribution function (PDF) modelling approaches (real-space Rietveld, reverse Monte Carlo) to 'solve' SRO from experiment is time-consuming and often relies on access to a suitable starting model [11][12][13].Consequently, we were interested in the prospect of developing new methods [14] for estimating SRO in DRXs from their PDFs, in a way that would allow rapid interpretation of data measured e.g. during electrochemical cycling.
The core of our methodology combines MC simulations and PDF calculations: both methods assess material structure in terms of their interatomic (two-body) interactions.Using an MC approach, we simulated a phase diagram of rocksalt-type A 1+x B 1−x O 2 structures across a range of stoichiometries that are of potential interest to the battery community.Each structure, containing different degrees and types of SROs, can be used to calculate a corresponding PDF.These patterns jointly constitute a large collection of PDFs that can be described in terms of a dramatically smaller set of basis PDFs via the numerical procedure of Metropolis non-negative matrix factorisation (MMF) [15].We show that these basis PDFs point to specific SRO types and can be used to identify SRO in newly-measured PDF data for different DRX samples.

Results and discussion
For the purposes of this proof-of-principle study, we use an intentionally simplistic microscopic model governing cation arrangements in DRXs: namely, the so-called J 1 -J 2 model e i e j + J 2 ⟨⟨i,j⟩⟩ e i e j . (1) Here, J 1 > 0 captures the energy penalty associated with placing cations of the same type on neighbouring sites on the fcc cation sublattice, and J 2 (which may be positive or negative) captures the energy loss/gain associated with next-nearest-neighbour cation pairs.The e i are Ising states denoting cation occupancies (e.g.= +1 for Li, −1 for transition-metal), with the average ⟨e⟩ fixed by stoichiometry.The Hamiltonian (equation ( 1)) is often frustrated, which gives rise to structural complexity across the J 1 /J 2 / ⟨e⟩ phase diagram; the model has been applied to many different (pseudo)spin systems, from rocksalt antiferro-magnets [16] to half-Heusler thermoelectrics [17].
Using an MC implementation of this model, we simulated many different A 1+x B 1-x O 2 structures (see Methods for details) corresponding to different relative (next-)nearest-neighbour interaction strengths J ′ (=J 2 /J 1 ), compositions ⟨e⟩ (=x), and effective MC temperatures T ′ (=T/J 1 ).The library of structures composes a coherent phase diagram (figure 1(a)) where each voxel at (J ′ , x, T ′ ) is correlated with an MC energy given by equation (1).At the ideal ABO 2 stoichiometry (x = 0), the phase diagram includes three well-defined phase fields at low T ′ and for various J ′ .As T ′ decreases, a discontinuity in energies is evident, denoting a disorder-to-order transition of the A-B arrangement.To isolate distinct A-B ordering types, we incorporate the order parameters (Ψ ) calculated against the propagation vectors of the different magnetic structures previously identified in the fcc AB lattice (see Methods for details) [4].For consistency with the literature, we adopted the nomenclature of these antiferromagnetic (AF) structures to describe the cation orderings in ABO 2 -hence denoted as AF-I, AF-II and AF-III (figures 1(b)-(d)), with their ground-state structures respectively found in γ-LiTiO 2 (I4 1 /amd) [18], LiCoO 2 (R-3m) [19] and γ-LiFeO 2 (I4 1 /amd) [20].According to the derived Ψ values, these ordering types distinguish themselves from the phase diagram where AF-I, AF-II and AF-III structures (figures 1(f)-(h)) respectively locate within the J ′ ranges of J ′ < 0, J ′ > 0.5, and 0 < J ′ < 0.5.
An alternative spinel-type A-B ordering, known as AF-IIb (figure 1(e)), also emerges at J ′ > 0.5 and has a Ψ value twice that of AF-II.Its ground-state structure can be found in the low-temperature (LT) LiCoO 2 (Fd-3m) [19] phase, where both A and B cations show a spinel-like arrangement, i.e. analogous to the B-ordering in the spinel AB 2 O 4 phase.We should note that whilst AF-II and AF-IIb orderings are associated with distinct Ψ values, they correspond to identical (next-)nearest-neighbour pair correlations [16], and so cannot be distinguished either in terms of the energies (equation ( 1)) nor in terms of the corresponding (idealised) PDFs.This is a limitation of the simplistic J 1 -J 2 model.
For the non-stoichiometric A 1+x B 1−x O 2 phases, their structures can also be modelled via MC simulation to study both Li-rich and -deficient metal oxides.Given the current interest in Li-excess materials for electrochemical energy storage [8], we focus on the range 0 < x ⩽ 2/3.The ground states of these nonstoichiometric structures are related to the identified AF orderings in stoichiometric ABO 2 , especially at low x.As x increases, and hence a higher A:B ratio, the MC energies of these structures rise progressively due to a growing number of A-A neighbours.Collectively, this phase space represents a family of many different DRX cation arrangements with varying types and degrees of SRO.The MC structures at every voxel (J ′ , x, T ′ ) can be used to generate a library of calculated PDF patterns G. Given that there are not so many different SRO types-certainly many fewer than there are voxels-the next step in our approach is to decompose the many PDFs in G into a much smaller set of fundamental (basis) PDFs.This set of basis PDFs, g, might then be used to identify and quantify the SRO encrypted in the experimental PDF of any target DRX material.
The deconvolution procedure follows the MMF [15] approach through which the entire library of calculated PDFs G is decomposed into a set of fundamental components g (figure 2(a)) and their corresponding weightings w via G i ≈ ij w ij • g j .A total of five basis g j were employed.Three of these were fixed to the PDFs of the known ordered phases (AF-I, AF-II, AF-III), and the remaining two basis PDFs were allowed to vary as required to best account for the variation observed in G.We use Roman (I-III) and Arabic (4,5) numerals to denote fixed and free basis functions, respectively.
With respect to the weightings, whilst every row w i consists of five elements, the implicit MMF constraint ( j w ij = 1) means there are in practice only four independent variables, which we represent using the CMYK coordinate vector K i .So whereas the phase map shown in figure 1(a) is coloured according to MC energy, that in figure 2(a) is coloured according to the weights of each basis function contained within the corresponding PDF.We see easily that the AF-I, AF-II, and AF-III phase domains light up with the colours of the corresponding basis PDFs (cyan, magenta, and yellow).In the case of the g 4 and g 5 components, a broader distribution is observed without an obvious J ′ dependence.Judging by an increasingly darker hue (larger K component) as the composition progressively approaches A 2 O 2 (i.e. as x → 1), the presence of g 4 becomes more prominent and less dependent on T ′ (figure 2(c)).We interpret this observation as implying that g 4 corresponds to the PDF of an ideal (single cation) rocksalt structure.In contrast, g 5 is most prevalent at high T ′ and low x regions, implying its correlation with random A-B ordering in high-entropy structures, e.g.α-LiFeO 2 .This high-entropy pattern shows relatively broadened peak widths with less distinct peaks, particularly in the high r, indicating a lack of long-range order (or a higher degree of disorder).Additional analysis incorporating five free g i components gave consistent results and is discussed further in the SI.
With every A 1+x B 1−x O 2 structure (G i ) decrypted into a weighted set of SRO models (g j ), we can use the latter as structure identifiers to access the SRO in DRX materials by reverse mapping their structures onto the phase space.To demonstrate this mapping, we use synthetic PDFs G * (r) as experimental input.These patterns were selected from the phase space, G * = G (J ′ , x, T ′ ) since the corresponding structures and SRO contained within are already known (figure 3(a)).Taking a representative combination of J ′ , x, T ′ in phase space, the corresponding PDF was projected onto the MMF basis PDFs g j , to give five weights which we assemble into a weighting vector w * (G * = j w * j • g i ).We then attempt to locate the PDF in J ′ , x, T ′ -space by evaluating the agreement between G * and each voxel G i in terms of the effective distance ϵ i between their weighting vectors Voxels for which ϵ i is less than some critical threshold ϵ 0 then correspond to candidate values of J ′ , x, T ′ relevant to G * .Using a threshold ϵ 0 = 0.006 (figures 3(b)-(d)), we found that nearly any synthetic PDF could be located accurately in J ′ , x, T ′ -space, albeit that PDFs corresponding to low-T ′ (strongly ordered) voxels were less precisely placed (see e.g. the dark brown data in figure 3).The difficulty at low T ′ arises because ordered domains contain structures with essentially identical structures.Nevertheless, comparing between the input data and the analysis output, the high consistency is remarkable, given that the mapping between J ′ , x, T ′ and basis PDF weights is highly nonlinear.
How do we envisage the method working in practice with experimental data?A measured PDF is projected onto the MMF components to give the corresponding weight vector w * (figure 4).The most relevant values of J ′ , x, T ′ are then identified using the thresholding described above.These parameters identify immediately the stoichiometry (x) and dominant SRO characteristics present in the sample; they can also be used to generate representative MC configurations if so desired.In contrast to cluster expansion MC, no knowledge of the underlying energetics is required, and the method can be used even when a system is far from equilibrium.We anticipate this approach being particularly useful for the analysis of large datsets of  parameteric PDF measurements-as collected during electrochemical cycling, for example.In this way, one might hope to track the trajectory of compositional and SRO changes through J ′ , x, T ′ space as Li + is inserted/removed.
Using experimental data introduces some additional considerations.First, the generation of G-and subsequently the MMF basis functions g-requires a set of pre-defined physical parameters, e.g.lattice constants.Whilst these parameters have little influence on our interpretation of the MMF components, the weightings derived from reverse mapping and consequently the values obtained for ϵ I will be affected.Therefore, a good estimation of these structure parameters via preliminary data handling (such as refinement against an average DRX structure) is necessary to ensure the reliability of the protocol.Secondly, the extent to which the weighting vector extracted from experimental data is represented in J ′ , x, T ′ -space is not always clear.Hence, the threshold value ϵ 0 may need to be adjusted for different systems.Our experience is that whenever a slight adjustment of ϵ 0 does not lead to a significant change of the number of voxels mapped, the corresponding solution is probably credible.
A final point we make concerns the J 1 -J 2 model we have exploited in this proof-of-principle study.We make no claim that this model provides an accurate description of DRX structure; indeed we have discussed already that it cannot distinguish AF-II and AF-IIb orderings.We are aware of ongoing studies that aim to develop a more sophisticated understanding of microscopic interactions governing cation distributions in X Hua et al DRXs [10,20,21].Whatever interaction model emerges, the methodology we set out here should nevertheless provide an approach for identifying the model parameters most relevant to a given DRX sample for which the corresponding PDF has been measured.A strength of interpreting PDFs in terms of underlying interactions (we would argue) is that the solution suffers less from uniqueness problems, and yet can still be used to develop atomistic models for comparison against additional experimental data, such as XRD, ED, and NMR [22].Moreover, although our methodology is established for A 1+x B 1−x O 2 systems as a proof of concept, it is straightforward to expand the method's scope to address even more complex compositions, such as those with both cation or anion disorder.Likewise, the approach might equally well be applied to entirely different materials families, wherever the microscopic interactions governing structural complexity can be couched in terms of a relatively simple Hamiltonian.

Conclusions
In summary, we have presented a proof-of-concept study based on a methodology that combines MC simulations and PDF calculations to evaluate the cation SRO in DRX materials.The MC simulations allow for the construction of an A 1+x B 1−x O 2 phase space across a range of stoichiometries that are of interests to the battery community.The structures in the phase space then constitute a library of calculated PDFs, against which MMF method can be applied to decompose them into a set of basis PDFs, each pointing to a specific cation SRO motif.Using these motifs as structure identifiers, we can access the SRO in DRX materials by reverse mapping their structures via these basis PDFs onto the phase space.In contrast to the established method via cluster expansion MC, our method does not require underlying energetics, and can be used in systems that are far from equilibrium, which is often true within a battery.In addition, this method has a potential to extend its application to more complex compositions that also involve anion SRO.

Monte Carlo simulation of A 1+x B 1-x O 2 phase space
MC simulations [23] were performed on a supercell constructed using 6 × 6 × 6 A 1+x B 1−x O 2 unit cell with the periodic boundary conditions applied.The a-lattice parameter (4.099 Å) of the unit cell was estimated from the least-square refinement using an α-LiFeO 2 structure against the XPDF data of a DRX-Li 2 MnO 2 F sample prepared by high-energy ball milling.The A:B ratio was defined by x, and their distribution in the cation sublattice is randomly initialised.A series of MC simulations were performed across a defined range of compositions (0 ⩽ x ⩽ 2/3), relative interaction strengths (−2 ⩽ J ′ ⩽ 2), and effective MC temperatures (0.01 ⩽ T ′ ⩽ 100).The MC energy was calculated on the basis of equation ( 1) whereas e A = +1 and e B = −1.At each MC step, a randomly selected A (or B) atom was swapped with another randomly selected B (or A) atom.The energy change (∆E) due to the move was calculated.The move was automatically accepted if ∆E ⩽ 0; for a positive energy change, the acceptance was subject to the Metropolis algorithm [23] given by P = e (-∆E/kT ′ ) , where k is the Boltzmann constant and T ′ is the effective MC temperature.Moves were continuously proposed, and accepted or rejected until either the number of accepted moves is greater than 5 times total number of atoms or the total number of attempted moves is greater than 25 times of total number of atoms.For a given composition and effective interaction strength, MC simulations were started at an effective temperature T ′ = 100 and, after equilibration, cooled at a constant rate of 0.8.The same cooling rate was used for all subsequent cycles such that the relevant T ′ values are regularly spaced in log(T ′ ).

Isolation of the ordered phases and order parameter calculation
The ordered ground-state phases of the ABO 2 system have been studied previously [16,24] with their corresponding fcc representations shown in figure 5.Each ordering type has a set of corresponding symmetry-equivalent compositional variation propagation vectors, q, which are in reciprocal space.They however reflect a real space conjugational periodicity, analogous to the magnetic propagation vectors, k.Similar to the single and multi k structures described in magnetic structures [4,25], these ordered systems can be described by either individual or multiple q.For the ground state where J ′ < 0, referred to as the AF-1 (type-I antiferromagnetic/AF ordering), the ground state MC energy (in units of J 1 ) per unit site is E 0 = 2 + 3J 2 .It consists of three magnetic lattices whose vectors belong to q ∈ 2π⟨100⟩ * with ferromagnetic (100) planes coupled antiferromagnetically (figure 5(a)).For 0 < J ′ < 0.5, type-III AF ordering dominates (figure 5(b)).This ordering is characterised by two q ∈ 2π 0 1  2 1 * , with E 0 = −2 + J 2 .For 0.5 < J ′ , the both have a degenerate ground state of E 0 = −3J 2 .Even though the magnetic ordering lies along the [111] direction, a factor of 1 /2 is in the propagation vectors as the periodicity takes two fcc cells to repeat.The degeneracy of the AF-II and AF-IIb types has been found to be extremely robust and hence both states are expected to be observed in the MC simulation, explaining why the plot in figure 1(h) is streaked.
For any given J ′ , T ′ , an MC structure can be generated and its order parameter ψ (q k ) for each q-type can be calculated using the following equation where N is the total number of lattice sites, e j represents the value of the spin at site j and r j is the position of spin j within the lattice.A summation of these ψ (q k ) will derive the order parameter of the system:

MMF
The MMF approach followed closely the method reported earlier [15], which interfaces a MC algorithm with conventional NMF [26].The MMF analysis was performed on renormalised PDFs to satisfy the non-negative criterion of NMF.The renormalised G(r) were derived from the calculated G calc (r) using equation [27] G calc (r) = 4π rρ 0 (G(r) − 1), in which ρ 0 refers to the number density of the structure model.The structure parameters used in the PDF calculation (e.g.lattice constant and U iso ) took reference from the XPDF refinement of the DRX-Li 2 MnO 2 F sample.Five fundamental components g i (r) (i = 5) were employed in the analysis.The goal of the analysis was to identify these g i (r) and associated weights w ij (j corresponds to the number of renormalised calculated G j (r)) to minimise |G * j (r) − G j (r)| 2 , where G * j (r) = 5 i = 1 w ij g i (r).Additional constraints were applied to ensure non-negative g i (r) for all i and r, and that 5 i = 1 w ij = 1 for all j.In our first analysis, the initial g 1 (r), g 2 (r) and g 3 (r) were calculated using the AF-I, -II and -III type ground-state structures, respectively, representing the known components, whereas the unknown g 4 (r) and g 5 (r) and all w ij were assigned randomly subject to the various constraints listed above.Each iteration involved random variation of these parameters, followed by the calculation of the change in |G * j (r) − G j (r)| 2 .The acceptance or rejection of the variation follows MC algorithm.The variation was repeated under increasingly stringent acceptance criteria using simulated annealing until convergence was achieved.To verify if the result, a second analysis employing all free g i (r) was also attempted.The result shows well defined phase domains that are highly consistent with the first attempt (figure 6).

Figure 2 .
Figure 2. (a) The basis PDFs: gI (cyan), gII (magenta), gIII (yellow), g4 (black) and g5 (grey) for the MMF analysis.(b) Phase space coloured using the CMYK code (K i ) converted from the MMF-derived weighting row vectors (w i ) where the voxels dominated by gI (AF-I), gII (AF-II/IIb) and gIII (AF-III) are highlighted by their respective colours.Domains with a weighting of ⩾0.35 of (c) g4 and (d) g5 are additionally extracted for a better view.

Figure 4 .
Figure 4. Methodology pipeline.The experimental PDF is projected onto the MMF components to obtain w * .This allows for identification of the closest voxel in J ′ , x, T ′ space, from which a corresponding MC run would give an atomistic representation.

Figure 6 .
Figure 6.CMYK-coded J ′ , x, T ′ -phase space of A 1+x B 1−x O2 (0 ⩽ x ⩽ 2/3) produced using (a) three fixed (known) and two free g i (r) and (b) five free g i (r).Both plots use the same CMYK-codes.(c) Comparison of the MMF-derived components between 3 fixed + 2 free g i (r) (coloured solid lines) and 5 free g i (r) (dashed red lines).