
Quantitative Parameter Estimation, Model Selection, and Variable Selection in Battery Science


Published 16 August 2019 © The Author(s) 2019. Published by ECS.
Citation: Nicholas W. Brady et al 2020 J. Electrochem. Soc. 167 013501. DOI: 10.1149/2.0012001JES


Abstract

Numerical physics-based models are useful in understanding battery performance and developing optimal battery design architectures. Data science developments have enabled software algorithms to perform data analysis and decision making that traditionally could only be performed by technical experts. Traditional workflows of model development (manual parameter estimation through visual comparison of simulations to experimental observations, and model discrimination through expert intuition) are time-consuming and difficult to justify quantitatively. This paper compares the conclusions derived from computationally scalable algorithms to conclusions developed by expert researchers. It illustrates how data science techniques, such as cross-validation and lasso regression, can be used to augment physics-based simulations in performing parameter estimation, model selection, variable selection, and model-guided design of experiment. The algorithms are validated by showing that they produce results similar to those of the expert modeler. The algorithms outlined are well-established and the approaches are general, so they can be applied to a variety of battery chemistries and architectures. The conclusions reached using these approaches agree with expert analysis (literature results), were reached with minimal human intervention, and are supported by quantitative justification. By minimizing the amount of expert time required and providing rigorous quantitative justification, these methods may accelerate battery development.


This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 License (CC BY, http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse of the work in any medium, provided the original work is properly cited.

Since the 1960s, the costs associated with electronic data storage, computational operations, and data transfer have decreased exponentially.1,2 These improvements in computational performance have made simulations easier and cheaper to execute and have led to a proliferation of data generated from numerical physics-based simulations. The challenge of the future (if not the present) is that there is too much data to analyze manually, and computationally scalable algorithms need to be developed and adopted to maintain the current pace of research; Venkatasubramanian made this observation in 2008 for the broader field of chemical engineering.3 Data science techniques have enabled software programs to perform complex tasks with remarkable results.4–6 However, how to apply these techniques to the fundamental study of physical processes and complex systems is not always obvious.

Numerical simulations are particularly useful when applied to complex systems such as those that arise in battery science. Numerical physical models allow for the rapid testing of hypotheses and determination of optimal design and control parameters, especially when compared to a trial-and-error process. Because of the inherent structure of batteries, models provide insights that are difficult or impossible to obtain through experimentation alone.7–12 In addition to providing physical inference, models are often essential in the design of optimal control schemes. However, increasing the accessibility of fundamental physics-based battery models remains a challenge. Though commercial software tools are ubiquitous, they can be difficult to tailor to a particular system; development of a suitable model often needs to be done by a modeling expert. Most battery researchers are not modeling experts and often choose to use less rigorous, but more accessible, battery models. However, the assumptions inherent in these models may only be applicable to ideal cases, so their link to the observed physics can be ambiguous. If, by leveraging data science techniques, a general, computationally scalable algorithm can be defined for the explicit purpose of developing numerical physics-based battery models, it would increase the accessibility of porous electrode models to all battery researchers, accelerating battery development.

Data science techniques have been applied to battery research, especially in battery state-of-charge and state-of-health estimation and battery management;13–15 for example, Dawson-Elli et al. used machine learning to build a surrogate model from a particular physics-based model in an effort to decrease execution time while retaining electrochemical accuracy.15 While these applications of data science were not designed to elucidate physical processes, other papers have explored using data science techniques to elucidate physical understanding, specifically to perform physical parameter estimation by comparing well-developed models to the performance of commercial battery cells.16–18 This paper describes how data science can be applied to four aspects of physics-based model development: parameter estimation, model selection, variable selection, and model-guided design of experiment.

Parameter estimation is the first step of model development after the numerical equations have been established. Modeling experts can produce compelling results by manually estimating physical parameters and systematically comparing simulated results to experimental observations.19 However, because it is a trial-and-error process, manual parameter estimation is time-consuming and tedious, and often lacks a clear quantitative justification. Quantitative approaches to parameter estimation, such as parameter sampling,12 least-squares,20 and gradient descent, have been used in the literature. However, uncertainty quantification is another important element of parameter estimation; it is not only necessary to assess model accuracy,21,22 but it is also necessary for optimal control.23,24 Sensitivity analysis has been deployed in the literature,20,25 but sensitivity analysis based on linearization can be inaccurate in determining uncertainty if the model is highly nonlinear.26 The Markov-Chain Monte-Carlo method provides an alternative to sensitivity analysis that can be applied more generally.27

Sometimes hypothesis testing leads to the development of multiple physical models, so there is a need to develop a general and quantitative method to assess model performance and to discriminate between models. Statistical t-tests25 have been used to assess model suitability, and statistical f-tests have been used to discriminate between models.20 These two tests are good inference tools; however, they rely on the assumption of independent and identically distributed samples and do not give a quantification of the model's predictive power. Cross-validation can be applied without making such strong statistical assumptions and is therefore a more general way to conduct model discrimination. By dividing the data into a training set and a validation set, cross-validation provides an empirical measure of a model's predictive power.28 This manuscript employs k-fold cross-validation, in which each of the k data sets is held out in turn as the validation set while the remaining k − 1 data sets are used for training. k-fold cross-validation is widely recommended when applying machine learning to real-world data.29,30 While the statistical t-test and f-test may be computationally cheaper to perform, using k-fold cross-validation to compare the predictive power of several models is a computationally scalable (or at least computationally feasible) way to perform model selection.
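
For concreteness, the following is a minimal sketch of such a k-fold loop in which each data set is held out once; the fit_model and test_error callables are placeholders for a physics-based fitting routine and its error metric, not functions from this work.

```python
import numpy as np

def k_fold_error(datasets, fit_model, test_error):
    """Hold each of the k data sets out once as the validation set.

    datasets   : list of k experimental data sets
    fit_model  : callable(training_sets) -> fitted parameters (placeholder)
    test_error : callable(params, validation_set) -> scalar error (placeholder)
    """
    errors = []
    for i, validation_set in enumerate(datasets):
        training_sets = [d for j, d in enumerate(datasets) if j != i]
        params = fit_model(training_sets)                    # train on the k-1 remaining sets
        errors.append(test_error(params, validation_set))    # score the held-out set
    return float(np.sum(errors))                             # total cross-validation error
```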

When it is suspected that parameter values change as a process proceeds (e.g. as a battery cycles the apparent solid-state diffusion coefficient of lithium decreases), the model can be fit to subsets of the experimental data and the parameter estimates can be tracked during the process.31 This is reasonable when there are only a few parameters in the model, i.e. the model already provides good inference. However, if the model accounts for many different physical processes, there may be tens of parameters that can vary, and searching for an optimal fit in these instances becomes intractable especially if no prior knowledge is provided. One method to provide prior knowledge is to use Bayesian estimation and Bayesian priors.32 Another method, covered in this paper, is to use lasso regression. Lasso regression allows for the inclusion of prior knowledge about the parameter values and provides a structured method to select the parameters that are most likely changing during the process.33

After a model has been developed from the original experiments, it may be necessary to conduct additional experiments to refine parameter estimates21,34 or test a new hypothesis. Model-guided design of experiment can be used to determine which experiments and what operating conditions should be used for optimal information extraction.25,26

Brady et al.35 and Knehr et al.36,37 modeled the LiV3O8 and Fe3O4 cathodes, respectively. In these studies, experts used ad hoc methods for parameter estimation, model selection, and variable selection, and the results were validated against experimental data. Herein, algorithms are developed and validated by comparing their conclusions to those obtained by "expert" battery researchers. The sections Parameter Estimation, Model-Guided Design of Experiment, and Variable Selection compare the conclusions of the modeler and the algorithm with respect to the LiV3O8 electrode; Model Selection compares a conclusion derived from the Fe3O4 electrode. The conclusions developed in this paper are consistent with those developed by Knehr et al.37 and Brady et al.35 The advantage of the algorithmic workflow is that it allows for streamlined data analysis and decision-making and strengthens conclusions by providing quantitative justification.

Methods

To perform parameter estimation, a designated parameter space was sampled. To obtain confidence intervals for the estimated parameter values, a bootstrapped Markov-Chain Monte-Carlo (MCMC) algorithm was used. The advantage of using sampling instead of an iterative optimization or a sequential stochastic optimization is that the simulations can be executed in parallel. Performing MCMC with a bootstrap allows the number of accepted points, m, and the number of simulations, q, to be independent of each other because the MCMC analysis samples the parameter space with replacement. This allows m > q: 10,000 accepted points can be obtained from 1,000 simulations and generate statistics similar to those of 10,000 uniquely accepted simulations.38–40 (Bootstrapping means that the random sampling is done with replacement: once a simulation result has been chosen and either accepted or rejected, it remains available for subsequent selection; a simulation can be selected and accepted multiple times.)
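
As an illustration of why the bootstrap decouples m from q, the short sketch below draws 10,000 proposals with replacement from a table of only 1,000 simulation results; the RSS values here are random placeholders rather than model output.

```python
import numpy as np

rng = np.random.default_rng(0)

q = 1_000    # simulations actually executed
m = 10_000   # proposals to be drawn for the MCMC analysis (m > q)

# Placeholder RSS values standing in for the precomputed simulation table.
rss_table = rng.uniform(0.5, 5.0, size=q)

# Sampling with replacement: any simulation can be proposed (and accepted)
# more than once, so the number of proposals is independent of q.
proposals = rss_table[rng.integers(0, q, size=m)]
```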

The simulated voltage data were generated using the methodologies outlined by Brady et al.35 and Knehr et al.36,37 for the lithium trivanadate (LiV3O8) and magnetite (Fe3O4) systems, respectively. Markov-Chain Monte-Carlo analysis, cross-validation, and lasso regression were performed using code written in Python and contained in a Jupyter notebook.

Sampling

Sobol Sampling: To sample the parameter space efficiently, Sobol sequences of up to four dimensions were generated using the downloadable Python module sobol_seq.41 sobol_seq takes as inputs the number of dimensions to sample as well as the number of points to generate. The points generated lie in the unit hypercube [0, 1], so they are linearly scaled to fit the range needed for the specific parameters. A Sobol sequence is a quasi-random, low-discrepancy sequence. Such sequences are efficient for sampling hypercubes because they fill in gaps in the hypercube, and when the points are projected onto lower dimensions, the gaps remain small.42
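
The snippet below is a minimal sketch of this scaling step; it assumes sobol_seq's i4_sobol_generate(dim, n) interface, and the parameter bounds shown are arbitrary placeholders.

```python
import numpy as np
import sobol_seq  # assumed interface: i4_sobol_generate(dim, n) -> points in [0, 1]^dim

def scaled_sobol(lower, upper, n_points):
    """Generate quasi-random samples, linearly scaled from the unit hypercube
    to the box defined by each parameter's lower and upper bound."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    unit_points = sobol_seq.i4_sobol_generate(len(lower), n_points)
    return lower + unit_points * (upper - lower)

# Example with two placeholder parameter ranges.
samples = scaled_sobol(lower=[0.1, 0.0], upper=[10.0, 50.0], n_points=256)
```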

Lasso sampling: Because Sobol sequences produce very few points with parameter values set identically to zero, a grid mesh was used instead of Sobol sequences for the implementation of the lasso method (variable selection). The lasso method regularizes the optimization problem by pushing parameter values toward zero, so it is necessary to have a fair number of points with parameter values set identically to zero.

A metric for how well the simulations emulate the experimental data is the residual sum of squares, RSS:

Equation ([1]): $\mathrm{RSS} = \sum_{j=1}^{n} \left( \hat{y}_j - y_j \right)^2$

where n is the number of experimental observations (the total number of voltage versus time measurements for a constant current discharge experiment), $\hat{y}_j$ are the simulated observations, and $y_j$ are the experimental observations. Each parameter set yields an RSS, and this information is stored in a table, an example of which is shown below (Table I), and used for the MCMC analysis.

Table I. Correlation between the simulation parameters and agreement with experimental observations (RSS) for a given applied current.

Simulation # | Dα (× 10−13 cm2 s−1) | krxn (× 10−8 cm5/2 mol−1/2 s−1) | xα,sat (LixV3O8) | kβ (× 10−3 s−1) | RSS
0 | 5.05 | 8.0 | 2.0 | 25.0 | 2.29791
1 | 7.525 | 7.5 | 2.5 | 12.5 | 2.87751
2 | 2.575 | 8.5 | 1.5 | 37.5 | 0.716081
… | … | … | … | … | …
q | 0.103625 | 8.04468 | 1.99487 | 23.5291 | 4.25133

Markov-Chain Monte-Carlo

While sampling by itself gives an apparent "optimal" parameter set, it does not directly lead to statistics on the parameters,12 and for this reason it is desirable to pair sampling with a Markov-Chain Monte-Carlo (MCMC) method. The MCMC method used in this paper is the Metropolis-Hastings algorithm.43 The method uses an accept-reject criterion to find the simulations that most likely emulate the experimental observations. The accept-reject criterion approximates the experimental variation and inherent deviation between the model and the observations.

  • (1)  
    The algorithm is initialized by randomly picking a simulation result; this simulation's corresponding RSS value is labeled RSSt.
  • (2)  
    For each subsequent iteration, t:
    • (a)  
      Randomly choose a candidate simulation and designate its RSS value as RSS'.
    • (b)  
      Calculate the acceptance ratio, $\alpha = f(\mathrm{RSS}')/f(\mathrm{RSS}_t)$, where f(RSS) is the likelihood that a particular simulation is representative of the observed experimental data.
    • (c)  
      Accept or reject the candidate simulation based on the criteria:
      • Generate a random number, u, in the range [0,1].
      • If α ≥ u then the candidate simulation is accepted and its parameter set is tabulated; RSSt + 1 = RSS'.
      • If α < u then the candidate simulation is rejected; RSSt + 1 = RSSt.

The likelihood, f, that a particular simulation represents the experimental observations is quantified by

Equation ([2]): $f \propto \exp\!\left( -\sum_{j=1}^{n} \frac{\left( \hat{y}_j - y_j \right)^2}{2\, s_{\mathrm{exp},j}^2} \right)$

When the experimental standard deviation is constant, sexp, j = sexp for all j, Equation 2 can be simplified to

Equation ([3]): $f \propto \exp\!\left( -\frac{\mathrm{RSS}}{2\, s_{\mathrm{exp}}^2} \right)$

Results continue to be tabulated until a threshold number of accepted simulations is reached. Because the initial simulation result is randomly chosen, the initially accepted simulations may not yield an accurate distribution of likely parameters. Therefore, after the threshold acceptance number is reached, the first 10% of the selections are discarded and the remaining 90% are used to calculate the pertinent statistics.

From the accepted parameter values, the mean and variance of the parameter values are calculated from

Equation ([4]): $\mu_P = \frac{1}{m} \sum_{j=1}^{m} P_j$

Equation ([5]): $\sigma_P^2 = \frac{1}{m-1} \sum_{j=1}^{m} \left( P_j - \mu_P \right)^2$

where the Pj are the accepted parameter values and m is the total number of "undiscarded" accepted simulations. The accepted parameter values are assumed to follow a normal distribution. The mean, μP, is the most likely parameter value, and the standard deviation, σP, is taken as the uncertainty in the parameter; this uncertainty depends on the experimental variation as well as the uncertainties in the other parameter values.
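
The following is a minimal sketch of the bootstrapped Metropolis-Hastings loop described above, operating on a precomputed table of parameter sets and their RSS values and using the likelihood of Equation 3; it is a simplified illustration, not the code used in this work.

```python
import numpy as np

def bootstrapped_mcmc(params, rss, s_exp, n_accept=10_000, burn_in=0.1, seed=0):
    """Metropolis-Hastings over a precomputed (parameter set, RSS) table.

    params : (q, p) array of sampled parameter sets
    rss    : (q,) array of RSS values for those parameter sets
    s_exp  : assumed experimental standard deviation (same units as the residuals)
    Returns the mean and standard deviation of the accepted parameter values.
    """
    rng = np.random.default_rng(seed)
    log_f = -rss / (2.0 * s_exp**2)              # log-likelihood, Equation 3 (up to a constant)

    current = rng.integers(len(rss))             # (1) random initial simulation
    accepted = []
    while len(accepted) < n_accept:
        candidate = rng.integers(len(rss))       # (2a) bootstrap: drawn with replacement
        alpha = np.exp(log_f[candidate] - log_f[current])   # (2b) acceptance ratio
        if alpha >= rng.uniform():               # (2c) accept: tabulate and move
            current = candidate
            accepted.append(params[candidate])
        # otherwise reject: keep the current simulation and propose again

    kept = np.asarray(accepted)[int(burn_in * n_accept):]   # discard the first 10%
    return kept.mean(axis=0), kept.std(axis=0, ddof=1)      # Equations 4 and 5
```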

Physical model of LixV3O8

The details of the physics-based model for the Fe3O4 cathode are given by Knehr et al.37 and the details of the LiV3O8 chemistry are given by Brady et al.35 The most pertinent details of the LiV3O8 system are summarized here for the reader. Lithium ions combine with an electron at an insertion site to enter the host material:

Equation ([6]): $\mathrm{Li^{+} + e^{-} + \Gamma \rightleftharpoons Li\text{-}\Gamma}$

where Γ is an insertion site in α-phase LiV3O8.

Phase change occurs in the material when the local lithium equivalence exceeds a threshold equivalence (or concentration), xα, sat. The equilibrium fractions of the α and β-phases are given by a mass balance on the total lithium within a crystal.

Equation ([7])

Equation ([8])

The physical model used for lithium trivanadate, LiV3O8, is taken from Brady et al. and summarized in Table II for the reader.35

Table II. Physical equations used to model the LixV3O8 cathode.

Crystal-Scale Equations
(1) Lithium concentration in the α-phase (cα), with boundary conditions at the crystal center and the crystal edge (cα = 0).
    Solid-state volume fractions: θα + θβ + θgb = 1; θgb = ζθβ; Dx,eff = Dαθα + Dgbθgb; Dgb = 100Dα.
(2) Balance on β-phase formation (θβ): lithiation, w = 0, v = 1; delithiation, w = 1, v = 0.
Electrochemical Reaction Rate
(3) Current density (iin); exchange current density (i0); overpotential η = V − U.
(4) Reversible potential (U).

Algorithmic model development and analysis

Figure 1 depicts a flowsheet showing the connections between parameter sampling, the numerical physics-based model, and experimental measurements. Sets of tunable parameter values are fed to the physics-based model; for each set of parameter values, simulated data are produced and compared to the experimental data, and a table is constructed correlating the parameter values with a metric of the goodness of fit, such as the RSS, as shown in Table I. This table of information can be used to perform parameter estimation, model selection, variable selection, and model-guided design of experiment.
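
A sketch of how such a table might be assembled is shown below; simulate_voltage is a placeholder for the physics-based battery model, and the column names are generic.

```python
import numpy as np
import pandas as pd

def build_rss_table(param_sets, t_exp, v_exp, simulate_voltage):
    """Correlate each sampled parameter set with its goodness of fit (cf. Table I).

    param_sets       : (q, p) array of sampled parameter values
    t_exp, v_exp     : experimental time and voltage arrays
    simulate_voltage : callable(params, t_exp) -> simulated voltage at the
                       experimental times (placeholder for the physics model)
    """
    rows = []
    for params in param_sets:
        v_sim = simulate_voltage(params, t_exp)     # run the physics-based model
        rss = float(np.sum((v_sim - v_exp) ** 2))   # Equation 1
        rows.append(list(params) + [rss])
    columns = [f"p{i}" for i in range(param_sets.shape[1])] + ["RSS"]
    return pd.DataFrame(rows, columns=columns)
```

Because each row is independent of the others, this loop can be distributed across processes or compute nodes, which is the computational-scalability advantage noted above.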

Figure 1. Flow diagram illustrating a computationally scalable framework to perform model development and subsequent analysis of the model output.

Results and Discussion

Parameter estimation

In 2016, Brady et al.35 used current interrupt experiments (lithiation and voltage recovery) to estimate the physical parameters of the LiV3O8 electrode. The authors estimated the solid-state diffusion coefficient, Dα, the exchange current reaction constant, krxn, the phase change saturation equivalence, xα,sat, and the phase change reaction rate constant, kβ. Using the values Dα = 1 × 10−13 cm2 s−1, krxn = 3 × 10−8 cm5/2 mol−1/2 s−1, xα,sat = Li2.5V3O8, and kβ = 5 × 10−3 s−1, the numerical simulations produced compelling agreement with the experimental observations.35 Figure 2 shows the parameter estimates derived from all the rate data during lithiation and voltage recovery, using Sobol-sequence sampling in the ranges [0.1, 10] × 10−13 cm2 s−1 for Dα, [−9, −7] for the exponent x in krxn = 1 × 10^x cm5/2 mol−1/2 s−1, [2.1, 4.0] for xα,sat in LixV3O8, and [0, 50] × 10−3 s−1 for kβ. 4096 Sobol points were generated, followed by MCMC analysis assuming a uniform experimental standard deviation of sexp = 100 mV and taking 10,000 acceptances. It can be seen that the estimates produced by the algorithm are in good agreement with the estimates derived by Brady et al.35 Figure 2 and Table III show that some parameters can be estimated with high confidence even if other parameters have high uncertainty. This is important because during manual parameter estimation, the "expert" sometimes has to use their judgement to determine which parameters are known confidently and which are not.
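
For reference, the sketch below shows one way the four-dimensional sample described here could be generated with the scaling approach from the Sampling section; treating the krxn range as a range of exponents sampled linearly and converted afterwards is an assumption for illustration.

```python
import numpy as np
import sobol_seq  # assumed i4_sobol_generate(dim, n) interface, as in the Sampling section

# Ranges quoted above: D_alpha in [0.1, 10] x 1e-13 cm2/s, log10(k_rxn) in [-9, -7],
# x_alpha,sat in [2.1, 4.0], and k_beta in [0, 50] x 1e-3 1/s.
lower = np.array([0.1e-13, -9.0, 2.1, 0.0])
upper = np.array([10.0e-13, -7.0, 4.0, 50.0e-3])

unit = sobol_seq.i4_sobol_generate(4, 4096)        # 4096 quasi-random points in [0, 1]^4
points = lower + unit * (upper - lower)

D_alpha, log10_krxn, x_sat, k_beta = points.T
k_rxn = 10.0 ** log10_krxn                         # cm^(5/2) mol^(-1/2) s^(-1)
```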

Figure 2. Physical model parameter estimates obtained from Sobol sampling followed by Markov-Chain Monte-Carlo analysis using the lithiation and voltage recovery data from all four current rates.

Table III. The estimated mean and standard deviation for the physical parameters as a function of applied current, as well as the estimates and standard deviations derived using all rates.

Experiment | Dα (× 10−13 cm2 s−1) | krxn exponent x (krxn = 1 × 10^x cm5/2 mol−1/2 s−1) | kβ (× 10−3 s−1) | xα,sat (LixV3O8)
C/10 | 0.5 ± 0.1 | −7.7 ± 0.2 | 6 ± 10 | 2.48 ± 0.02
C/5 | 1.1 ± 0.1 | −8.5 ± 0.1 | 40 ± 13 | 2.48 ± 0.02
C/2 | 1.2 ± 0.2 | −8.5 ± 0.1 | 31 ± 12 | 2.45 ± 0.02
1C | 1.8 ± 0.2 | −8.7 ± 0.2 | 25 ± 15 | 3.1 ± 0.6
Overall | 1.2 ± 0.1 | −8.5 ± 0.1 | 31 ± 13 | 2.47 ± 0.03

The experimental standard deviation, sexp, has an effect on the estimated parameter distribution. This is illustrated in Figure 3. Examining only the C/10 lithiation data and assuming a uniform estimate for the experimental standard deviation of 100 mV, sexp = 100 mV, the estimated distribution of the solid-state diffusion coefficient, Dα, is cm2 s−1, with a standard deviation of cm2 s−1. From Figure 3A, it is clear that using a uniform experimental standard deviation of 100, 50, or 20 mV gives nearly identical estimates of the parameter distribution. If the empirically determined standard deviation, which varies as a function of x in LixV3O8, is used (given as the inset in Figure 3A), a different distribution is achieved with cm2 s−1, and cm2 s−1. The derived mean and standard deviation in diffusion coefficient show significant discrepancies, highlighting the importance of quantifying experimental variance as a function of state of charge. Furthermore, the inferred diffusion coefficients in both cases differ from the estimates in Table III, showing the importance of the choice of experimental conditions in obtaining parameter estimates. Figure 3B shows comparisons of simulated data using parameter estimates derived by assuming sexp = 50 mV and using an empirical estimate of sexp. While both simulations provide good fits to the experimental data, the comparison illustrates the important information obtained by using the empirical sexp.
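
A small sketch of the likelihood weighting in Equation 2 is given below; it simply accepts either a uniform sexp or a point-wise empirical array, and is an illustration rather than the analysis code used in this work.

```python
import numpy as np

def log_likelihood(v_sim, v_exp, s_exp):
    """Log of Equation 2 (up to an additive constant).

    s_exp may be a scalar (uniform standard deviation) or an array giving the
    empirically determined deviation at each point, e.g. as a function of x in LixV3O8.
    """
    v_sim = np.asarray(v_sim, dtype=float)
    v_exp = np.asarray(v_exp, dtype=float)
    s = np.broadcast_to(np.asarray(s_exp, dtype=float), v_exp.shape)
    return -np.sum((v_sim - v_exp) ** 2 / (2.0 * s**2))
```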

Figure 3. A) Parameter distribution estimates for the solid-state diffusion coefficient, Dα, obtained from the lithiation data at C/10 using two different estimates of the experimental standard deviation: uniform estimates of sexp = 100, 50, and 20 mV, and an empirically determined estimate, which is shown as the inset. B) Comparison of the simulated data with parameter estimates from assuming sexp = 50 mV (green) and parameter estimates from using the empirically calculated deviation (blue). The C/10 lithiation data are shown in black; the error bars show the experimental deviation.

It should be noted that this current rate, C/10, is low and it is typically not advisable to extract kinetic information from experiments that do not stress the kinetics. However, lithiation data (not voltage recovery data) at C/10 were the only experimental condition for which replicate experimental data were available and therefore the only condition for which an empirical sexp could be calculated. The conclusion of this observation is that experiments should be done at least in duplicate to allow for the determination of the experimental variance and, when possible, the empirically observed variance should be used to inform the parameter estimates. Figure 3 also suggests that if data from duplicate experiments are not available, or if replicate data are infeasible to collect, an estimated constant value of the experimental deviation may yield realistic parameter estimates.

The parameter distributions at different current rates are also informative. For instance, as shown in Table III, the experimental measurements at 1C do not inform the values of the parameters that govern phase change, xα, sat and kβ. This makes sense because the voltage plateau indicative of a phase change is not observed during these experiments; i.e. these experiments do not inform phase change because they do not probe phase change. For the lower discharge rates, the parameter distributions for xα, sat seem to be independent of rate.

The other three parameters, Dα, krxn, and kβ, are kinetic parameters. Intuition indicates that, all things being equal, higher current rates are better for discerning kinetic processes; however, there are practical limits to the maximum current. Though higher current rates might reveal more information about kβ, currents that are too high are unable to probe phase change before the experimental cutoff conditions are reached. The need to find the maximum current rate that still gives sufficient information implies that there is an optimum condition. Table III shows that some experimental conditions provide more precise insights into parameter values than other conditions. Finding the optimal experimental conditions is explored more thoroughly in the section Model-Guided Design of Experiment.

Model selection

It is common in battery studies to have multiple competing hypotheses about the physics that are dictating battery performance; different assumptions lead to different models. This section illustrates how data science approaches can be used to algorithmically perform model selection by quantitatively identifying which model is statistically most consistent with experimental data. Such approaches allow for an unbiased evaluation of the efficacy of alternative modeling hypotheses. The results of these approaches are not purely numerical, but contain physical insights.

For the magnetite electrode, Fe3O4, Knehr et al.36,37 showed through voltage recovery data that for small crystal sizes of 6 and 8 nm, it was not solid-state transport but transport through the crystal aggregate structures, or agglomerates, that dominated performance. The authors reached that conclusion by observing that the solid-state diffusion coefficient necessary to replicate the observed voltage recovery times was very small, but when that same diffusion coefficient was used to simulate the discharge experiments, the simulations deviated significantly from the observed electrochemical measurements. However, using an agglomerate model, the mass transfer coefficient necessary to emulate the voltage recovery data also produced reasonable agreement with the discharge data. While this paper does not contest the conclusions of the previous work, it does seek to understand how these conclusions can be developed algorithmically.

One method to perform model selection quantitatively is cross-validation.44 Cross-validation requires the experimental data to be divided into a training set, with which the parameters are tuned, and a testing set, with which the tuned parameters are validated. Because the data come from current interrupt experiments, they can intuitively be divided into discharge data, when the current is on, and voltage recovery data, when the current is off.

Fe3O4 electrodes composed of 8-nm crystals were lithiated at a rate of C/200 (4 mA/g) to three different equivalences: Li0.5Fe3O4, Li1.0Fe3O4, and Li1.5Fe3O4; upon reaching the threshold equivalence, the cell current was turned off and voltage measurements continued to be made up to 100 hours of total test time. The data are divided into two sets: lithiation (current on) and voltage recovery (current off). One set is labeled the training set, with which the mass-transfer diffusion coefficient is fit; the other set is labeled the validation set, with which the trained model is tested and the testing error is recorded. Then the training and testing data are switched and the testing errors are summed; this is done for both models. For the agglomerate model, 256 Sobol points were generated in the range [−14, −11] for the exponent x in Dagg = 1 × 10^x cm2 s−1, and for the crystal-scale model, 256 Sobol points were generated in the range [−20, −17] for the exponent x in Dx = 1 × 10^x cm2 s−1. It was found that the agglomerate model tested significantly better than the crystal model in cross-validation, cf. Figure 4.
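
The two-fold loop described here can be written compactly as below; fit_params and test_rss are placeholders wrapping the sampling/MCMC fit and the resulting test error for a given model.

```python
def cross_validate(model, lithiation_data, recovery_data, fit_params, test_rss):
    """Two-fold cross-validation over the lithiation/relaxation partitions.

    model      : identifier selecting the agglomerate- or crystal-scale simulation
    fit_params : callable(model, training_data) -> fitted parameters (placeholder)
    test_rss   : callable(model, params, testing_data) -> test error (placeholder)
    """
    total_test_error = 0.0
    for train, test in [(lithiation_data, recovery_data),
                        (recovery_data, lithiation_data)]:
        fitted = fit_params(model, train)                  # tune on one partition
        total_test_error += test_rss(model, fitted, test)  # score on the other
    return total_test_error

# The model with the smaller summed test error is selected, e.g.
# min(("agglomerate", "crystal"), key=lambda m: cross_validate(m, lith, recov, fit, err))
```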

Figure 4. A) The MCMC-determined distributions of the diffusive mass-transfer coefficient, Dagg, for the agglomerate-scale model and B) the MCMC-determined distributions of the solid-state mass-transfer coefficient, Dx, for the crystal-scale model. The parameters were determined using discharge data (blue) as well as voltage recovery data (red). C) Comparisons of the simulated data using the mass-transfer coefficients determined by fitting each model to the relaxation data. The inset shows the average test error per experimental lithiation data point.

Figure 4 shows the MCMC-determined distributions (using 10,000 accepted points for both models) of the mass-transfer coefficient for the agglomerate and crystal models when fitted to the discharge data and the voltage recovery data; the experimental deviation is assumed to be 50 mV, sexp = 50 mV. (The experimental data from Knehr et al.36 were examined in the range Li0.0Fe3O4 to Li1.5Fe3O4, and it was observed that the experimental standard deviation in that range during lithiation was almost entirely below 40 mV; the experimental standard deviation during voltage recovery could not be calculated empirically because replicate data were not available.) It can be seen that for the agglomerate model, the MCMC results for each partition are quite comparable, 4.0 and 3.3 × 10−13 cm2 s−1, while for the crystal-scale model, the mass transfer coefficients differ by an order of magnitude, and cm2 s−1. The improved precision provided by the agglomerate model as well as the improved testing error both indicate that the agglomerate model is more consistent with the observed electrochemical measurements.

Cross-validation provides a clear methodology to rigorously perform model discrimination in a general way and provides the expert researcher with a quantitative justification; these measures allow for the model development process, specifically the evaluation of competing hypotheses, to be streamlined.

Variable selection

A situation can arise in which a physical model explains existing experimental data but fails to adequately describe new observations. In such cases, it may be necessary to select new parameter values or to develop a new model. It may be desirable to initially focus on selecting new parameter values. This section illustrates how data science approaches can be used to systematically identify statistically significant parameter variation.

For the LiV3O8 electrode, using the physical parameters, Dα = 1.0 × 10− 13 cm2 s−1, krxn = 3 × 10− 8 cm5/2 mol-1/2 s−1, kβ = 5.0 × 10− 3 s−1, xα, sat = Li2.5V3O8, Brady et al. achieved excellent agreement between simulations and experiments during lithiation and voltage recovery.35 However, during delithiation, simulations using the aforementioned physical parameters significantly differed from the experimental observations. Brady et al. concluded that during delithiation, mass transport of lithium in the solid-state was more facile than during lithiation.35 The authors came to this conclusion because accurate agreement between simulation and experimental observation was achieved by increasing the value of the diffusion coefficient by a factor of 5, Dα = 5.0 × 10− 13 cm2 s−1. The excellent agreement could only be achieved by changing the parameter associated with mass transport. Changing the exchange current density or changing the reaction rate constant for phase change did not improve agreement.

Again, this paper does not question the conclusions of the previous work, but seeks to explore if an algorithmic approach can come to the same or a similar conclusion. Because the values of the parameters during lithiation are already validated, and physical intuition and experience indicate that most of these parameters likely do not change during delithiation, the implied question is: what is the minimum number of parameters that need to be adjusted to improve agreement?

Regularization is a method of optimizing a problem (achieving the best fit) while also imposing additional constraints. Lasso regression regularizes the optimization problem and also performs variable selection.33,45 The method performs variable selection by only allowing the parameters that most significantly improve agreement to vary; parameters that do not significantly contribute toward improved agreement are not allowed to vary, i.e. they assume their lithiation values. Lasso performs variable selection mathematically by placing a penalty on non-zero parameters. If the parameter values during delithiation are defined in terms of their values during lithiation:

Equation ([9])

it is observed that βj = 0 produces no difference between the lithiation and delithiation parameter values. The lasso objective function introduces a bias to minimize changes in the parameters:

Equation ([10]): $\text{Objective} = \mathrm{RSS}(\beta) + \lambda \sum_{j} \lvert \beta_j \rvert$

where the RSS is calculated according to Equation 1. The additional parameter, λ, weights how strongly the model is constrained; high values of λ force all βj to 0, which returns the lithiation model, while small values of λ give the ordinary RSS objective function, in which all parameters are allowed to vary without constraint. Figure 5 shows how the parameter values (for mass transfer, Dα; charge transfer, krxn; and the thermodynamic potential, Uref) and the agreement between simulations and electrochemical observations vary as the penalty, λ, varies. The vertical dashed lines denote the regions where (from left to right) no parameters, one parameter, and then two parameters vary. Figure 5 shows that almost all of the reducible error during delithiation is removed by increasing the diffusion coefficient, and only a small amount of the error is reduced by varying the exchange current density. In addition, immediately before two parameters are allowed to vary, the optimal value of the solid-state diffusion coefficient is Dα = 3.2 ± 0.4 × 10−13 cm2 s−1, which is in good agreement with the previous study.35 Figure 5 suggests that the lasso method is a useful tool for identifying physical changes in battery systems; the lasso method may also be a general tool that can be applied to investigate changes that occur between charge and discharge, during cycling, through temperature changes, etc.
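
A sketch of the grid-based search is shown below: for each penalty λ, the β combination that minimizes RSS(β) + λ Σ|βj| is selected from the precomputed grid. The rss_lookup callable is a placeholder for the table of simulation results built from the grid mesh.

```python
import numpy as np
from itertools import product

def lasso_path(beta_grid, rss_lookup, lambdas, n_params=3):
    """Trace the constrained optimum over a range of lasso penalties (Equation 10).

    beta_grid  : 1-D array of allowed values for each beta_j (e.g. 13 points in [0, 1])
    rss_lookup : callable(beta_vector) -> RSS from the precomputed simulations (placeholder)
    lambdas    : sequence of penalty weights to scan
    """
    candidates = np.array(list(product(beta_grid, repeat=n_params)))
    rss_values = np.array([rss_lookup(b) for b in candidates])
    penalties = np.abs(candidates).sum(axis=1)
    path = []
    for lam in lambdas:
        objective = rss_values + lam * penalties            # Equation 10
        path.append(candidates[int(np.argmin(objective))])  # optimal beta at this penalty
    return np.array(path)   # rows give the optimal (beta_1, ..., beta_n) versus lambda
```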

Figure 5. A) The average deviation per observation between the simulations and experiments versus the value of λ. B) The optimum values of βj versus the value of λ. The vertical dashed lines indicate the value of λ where an additional parameter is allowed to vary (an additional βj ≠ 0). A grid mesh of 13 × 13 × 13 was constructed and the βj's were allowed to vary from [0, 1] for each parameter.

As in the Model Selection section, the utility of a variable selection framework is to provide researchers with rigorous quantitative justification for conclusions. In addition, utilizing the lasso method allows researchers to simultaneously interrogate many parameters in various combinations, instead of having to change each parameter individually, or in a sequential combinatorial fashion. These aspects of lasso regression allow for another process of model development, variable selection, to be streamlined and performed systematically.

The previous sections have shown how to use quantitative and algorithmic approaches to perform parameter estimation, model selection, and variable selection. After gathering insights from these processes it may be desirable or even necessary to validate the conclusions with additional experiments; sometimes it is not clear which experiments to perform. The next section illustrates how these quantitative approaches can be used to guide experimental design.

Model-guided design of experiment

In the Parameter Estimation section, the value of the phase change reaction rate constant, kβ, could not be determined precisely. So the question is: what experimental conditions, specifically what constant current lithiation rate, would allow a precise determination of kβ? To answer this question, simulations are used to generate mock experimental data, and the sampling and MCMC analysis are used to show which experimental conditions minimize parameter uncertainty.

In the previous sections, simulated data are compared against experimental data to generate an RSS (Equation 1). In this section, sampled simulated data are compared to mock experimental data; the $y_{j,\mathrm{mock}}$ are the mock experimental observations and the $\hat{y}_j$ are the simulated observations; for clarity this is given in the following equation:

Equation ([11]): $\mathrm{RSS} = \sum_{j=1}^{n} \left( \hat{y}_j - y_{j,\mathrm{mock}} \right)^2$

It is important to note that the mock experimental observations, yj, mock, are a function of kβ, TRUE, i.e. yj, mock = f(kβ, TRUE).

For simplicity it is assumed that Dα, krxn, and xα,sat are known with exact precision: Dα = 1 × 10−13 cm2 s−1, krxn = 3 × 10−8 cm5/2 mol−1/2 s−1, xα,sat = Li2.5V3O8, and only lithiation data are used for parameter estimation (voltage recovery data are not used); this is a simplified case of the parameter estimation reported in Table III and Figure 2. A mock experiment is generated using a specific applied current and a value of kβ = kβ,TRUE. Sampling simulations are conducted in the range inferred from existing experiments, kβ = [0, 30] × 10−3 s−1, using 50 Sobol points, and a table is constructed correlating kβ (sampled) with RSS, Equation 11. Using MCMC analysis, $\mu_{k_\beta}$ and $\sigma_{k_\beta}$ are determined for a specific value of kβ,TRUE at a specific applied current.

Using five different values of kβ,TRUE = 3, 10, 15, 20, and 27 × 10−3 s−1, constant current lithiation mock experiments were run at varying current rates (5, 10, 15, ..., 360 mA/g), with a cutoff potential of 2.4 V or a cutoff equivalence of Li3.0V3O8. Then the parameter estimation framework (sampling followed by MCMC analysis) was applied at every current rate for each value of kβ,TRUE; the results are shown in Figure 6. The left two plots of Figure 6 show the inferred values of kβ, namely the mean, $\mu_{k_\beta}$, and standard deviation, $\sigma_{k_\beta}$, derived from sampling combined with MCMC analysis, assuming sexp = 50 mV. These inferred values are plotted as a function of the applied current rate. Assuming the optimum maximizes precision, i.e. minimizes $\sigma_{k_\beta}$, an optimal current range is found for each value of kβ,TRUE.
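
The search described in this section can be summarized by the sketch below; simulate and estimate are placeholders for the constant-current simulation and the sampling-plus-MCMC estimator (e.g. the routine sketched in the Methods section).

```python
import numpy as np

def optimal_current(currents, k_beta_true, k_beta_samples, simulate, estimate):
    """Return the applied current that minimizes the uncertainty in k_beta.

    currents       : candidate applied currents (e.g. 5, 10, ..., 360 mA/g)
    k_beta_true    : value of k_beta used to generate the mock experiment
    k_beta_samples : sampled k_beta values (e.g. 50 Sobol points in [0, 30e-3] 1/s)
    simulate       : callable(k_beta, current) -> simulated voltage curve (placeholder)
    estimate       : callable(k_beta_samples, rss) -> (mu, sigma), e.g. sampling + MCMC
    """
    sigmas = []
    for current in currents:
        v_mock = simulate(k_beta_true, current)                        # mock experiment
        rss = np.array([np.sum((simulate(k, current) - v_mock) ** 2)   # Equation 11
                        for k in k_beta_samples])
        _, sigma = estimate(k_beta_samples, rss)
        sigmas.append(sigma)
    return currents[int(np.argmin(sigmas))]   # current that maximizes precision
```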

Figure 6. The left two plots show, for given true values of kβ, how the estimated value of kβ, $\mu_{k_\beta}$, and the uncertainty in the estimate, $\sigma_{k_\beta}$, vary for different applied currents. The right two plots show, for kβ,TRUE = 5 × 10−3 s−1 (red), how the apparent value, $\mu_{k_\beta}$, and uncertainty, $\sigma_{k_\beta}$, vary for different applied currents. These data are overlaid with the estimated value and uncertainty of kβ derived from four current rates: C/10, C/5, C/2, and 1C.

This method can be applied to a hypothetical example where the phase change reaction rate constant has an estimated value and confidence interval of kβ = 15 ± 12 × 10−3 s−1. Using the mean of this range, the information on the left of Figure 6 indicates the optimal current rate would be 125 mA/g. If kβ,TRUE = 5 × 10−3 s−1, algorithmic analysis of the experimental data would reveal a new estimated value and confidence interval, kβ = 7.5 ± 3 × 10−3 s−1. If more precision is desired, a subsequent experiment can be performed at a new optimum current of 100 mA/g; this current would likely reduce the uncertainty to about 1 × 10−3 s−1.

It can be seen from Figure 6 that there is an optimal applied current and that the optimal discharge current varies depending on the value of kβ,TRUE; as the value of kβ,TRUE increases, the optimum discharge current also increases. This is a reasonable conclusion because the faster the system kinetics, the higher the experimental rate needed to accurately investigate them.

It should be noted that the minimum $\sigma_{k_\beta}$ does not necessarily correlate with the most accurate value of $\mu_{k_\beta}$ (i.e. the minimum $\sigma_{k_\beta}$ does not imply $\mu_{k_\beta} = k_{\beta,\mathrm{TRUE}}$). The methodology outlined here can also be used to test different types of experiments (e.g. constant voltage experiments, constant power, galvanostatic interrupt titrations) in addition to testing different operating conditions.

The left side of Figure 6 shows theoretically that this process can be used, but is this process actually informative? The right side of Figure 6 shows how the apparent value of kβ and its uncertainty vary with applied current when kβ,TRUE = 5 × 10−3 s−1; this was the value of kβ proposed by Brady et al.35 The simulated results are overlaid with experimental data at four different current rates: C/10, C/5, C/2, and 1C (36, 72, 180, and 360 mA/g, respectively). It is observed that the experimentally determined values of $\mu_{k_\beta}$ and $\sigma_{k_\beta}$ agree with the values determined from the mock experiments, which validates this methodology and confirms that simulations, in combination with the framework outlined in this paper, can be leveraged to help design experiments for maximum utility.

In the previous sections, simulations have been used to deduce parameter values, but this framework can also be used to maximize other metrics; one obvious metric is performance, but here we examine how this framework can be used to optimize characterization conditions. Coin cells are relatively cheap to fabricate and test, so it is reasonable that an experimental group could quickly make and test multiple coin cells, bypass the simulations entirely, and do the optimization empirically. However, some experiments are more expensive to conduct; for instance, operando EDXRD experiments on a synchrotron beam line may be expensive, with limited opportunities to use such state-of-the-art facilities. In these instances, using simulations to guide experimental endeavors is useful to maximize the utility of these opportunities.

Figure 7 shows a hypothetical optimization of design parameters for an operando EDXRD experiment. Assuming the physical parameters of the electrode are known, how can the applied current be tailored to achieve an optimal profile of the β-phase volume fraction, θβ, during the experiment? The ideal profile needs to be informed by experimental experience, but here it was assumed that a profile spanning the full range of possible θβ values over the full length of the electrode was best; i.e. it is neither ideal to have a profile that is flat (does not vary) across the electrode, nor is it ideal to have a sharp step-change in the profile. Using the physical parameters outlined by Brady et al.46 and assuming an electrode length of 500 μm, simulations were run to determine the optimal applied current. Figure 7A shows how the profiles generated at different current rates deviate from the ideal profile; this deviation was calculated using Equation 12, where $\theta_{\beta,j}$ are the ideal volume fractions of the β-phase at the electrode positions and $\hat{\theta}_{\beta,j}$ are the simulated volume fractions of the β-phase at the electrode positions. It was also assumed that the EDXRD gauge thickness is 20 μm, so the total number of observations is 25. It was found that the current rate that minimized this deviation was 18 mA/g.

Equation ([12]): $\text{Deviation} = \sum_{j=1}^{25} \left( \hat{\theta}_{\beta,j} - \theta_{\beta,j} \right)^2$
Figure 7. The plot on the left shows how the best simulated θβ profile compares to the ideal profile as a function of applied current. The plot on the right shows representative θβ plots for three current rates: 2, 18, and 30 mA/g as well as the defined ideal profile.

Figure 7B shows representative profiles for 2, 18, and 30 mA/g. From Figure 7B it is seen that at 30 mA/g, from 300–500 μm the volume fraction of the β-phase is 0. This is not ideal because scan time is wasted over that range and would give no additional information. Another constraint may be the total experimental time. The β-phase profile produced at 18 mA/g is optimal, but 18 mA/g also corresponds to C/20; 20 hours for one synchrotron experiment may be too expensive. In contrast, the 30 mA/g profile is not ideal, but that experiment corresponds to a rate of C/12 (it would take 12 hours), which may be more feasible. Researchers must weigh the improvement in information obtained versus the cost of beam time.
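
A compact sketch of the profile-matching search underlying Figure 7 is given below; simulate_theta_profile is a placeholder for the electrode-scale model evaluated at the 25 EDXRD gauge positions.

```python
import numpy as np

def best_edxrd_current(currents, theta_ideal, simulate_theta_profile):
    """Pick the applied current whose simulated beta-phase profile best matches
    the ideal profile across the EDXRD gauge positions (Equation 12).

    currents               : candidate applied currents (mA/g)
    theta_ideal            : ideal beta-phase volume fractions at the gauge positions
    simulate_theta_profile : callable(current) -> simulated profile at the same
                             positions (placeholder for the electrode-scale model)
    """
    deviations = [float(np.sum((simulate_theta_profile(i) - theta_ideal) ** 2))
                  for i in currents]
    best = int(np.argmin(deviations))
    return currents[best], deviations[best]
```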

Conclusions

The authors do not anticipate that the expertise and intuition of expert physical modelers can be easily replaced by software. However, the recent advances in computational capabilities as well as data science algorithms are significant. This paper has outlined how some of these algorithms can be effectively applied to battery science, but further work is necessary for mathematical modelers to stay abreast of developments in data science and remain informed of how they can be applied to battery studies. Implementation of computationally scalable techniques has the potential to improve the productivity of modelers as well as strengthen the conclusions of modeling studies. The physical insights provided by these techniques in regard to the LiV3O8 and Fe3O4 electrodes are consistent with previous studies. The novelty of this paper is that the same conclusions can be reached using a methodology that is more quantitative, provides more information about the fitted physical parameters, minimizes human time, and is computationally scalable.

Acknowledgments

This research was supported by the Center for Mesoscale Transport Properties, an Energy Frontier Research Center supported by the U. S. Department of Energy (DOE), Office of Science, Basic Energy Sciences (BES), under award #DE-SC0012673. We acknowledge computing resources from Columbia University's Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010. The authors thank Timothy Jones whose expertise in statistics was influential in the understanding and application of the Markov-Chain Monte-Carlo analysis.

ORCID

Alan C. West 0000-0002-3524-3040
