Computational workflow for steric assessment using the electric field-derived size

Molecular structure plays an important role in the selectivity and performance of catalysts. Understanding how structural differences affect catalyst performance through quantitative structure-selectivity relationships is key to developing high-performing catalytic systems. Several methods have been introduced to quantify steric contributions, including Tolman cone angles, Charton parameters, and A-values. While these have shown promise in predicting selectivity, they access similar, general steric contributions and are largely empirically derived. Alternatively, Sterimol parameters offer a specific multi-directional measure of steric bulk in the form of three vectors in units of distance. Recently, these parameters revealed strong correlations between structure and selectivity in asymmetric catalysis. Yet, despite their demonstrated performance, Sterimol parameters are commonly derived using van der Waals radii, which approximate molecular size using hard spheres. This approach may not accurately describe highly polarized systems. Recently, a new chemical system size metric based on the electric field of a molecule was developed, which accesses the occupied space of a molecule. Here, we demonstrate that electric field-derived Sterimol parameters reveal structure-selectivity relationships in asymmetric catalysis similar to those revealed by conventional Sterimol parameters. Specifically, we present a computational workflow for calculating Sterimol parameters based on the size of a molecule’s electric field, and validate our method using several asymmetric catalysis reactions.


Introduction
Steric effects play an undeniable role across chemistry, including in catalytic [1][2][3][4][5], biochemical, and adsorption processes, among others [6]. Indeed, these fundamental, non-bonding interactions are often cited to explain a variety of experimental observations [7,8]. The origins of these interatomic interaction effects lie in Pauli repulsion [9]: atoms and molecules occupy a certain amount of space, defined by their electron cloud; when electron clouds are brought closer together, the quantum mechanical exchange energy increases, creating a repulsive effect. This is often referred to as steric hindrance. Yet, there is still debate surrounding the variety of methods that have been introduced to quantify the effects of steric contributions [9]. These methods range from empirically derived to ab initio parameters [10,11], and include A-values [12], Tolman cone angles [7], Charton parameters [13], and Taft parameters [14].
Among the more promising steric parameters are the Verloop Sterimol parameters [15], which were introduced in the 1970s to elucidate quantitative structure-selectivity relationships in medicinal chemistry and were recently demonstrated to outperform other steric parameters in asymmetric catalysis applications [16]. Sterimol parameters offer a multi-dimensional description of substituent size within molecules, figure 1(a). Here, substituent size is described along three principal axes, L, B1, and B5. The L parameter describes the distance that the substituent extends from the molecule. The B1 and B5 parameters describe the shortest and longest substituent distances orthogonal to the L parameter, respectively.
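To make the geometric definitions concrete, the following is a minimal sketch of how L, B1, and B5 could be evaluated for a set of hard spheres. It is not the SMORES or MORFEUS implementation: the function name, its arguments, and the angular scan used for B1 are assumptions made for illustration, and L is measured from the core-side (dummy) atom with no additional offset.

```python
import numpy as np

def sterimol_hard_sphere(coords, radii, dummy_idx, attached_idx, n_angles=360):
    """Approximate (L, B1, B5) for hard spheres; illustrative sketch only.

    coords       : (n, 3) atomic positions
    radii        : (n,) hard-sphere radii in the same length unit
    dummy_idx    : index of the core-side atom that defines the origin
    attached_idx : index of the substituent atom bonded to the core
    """
    coords = np.asarray(coords, dtype=float)
    radii = np.asarray(radii, dtype=float)

    origin = coords[dummy_idx]
    axis = coords[attached_idx] - origin
    axis /= np.linalg.norm(axis)  # unit vector along the primary (L) axis

    # Consider every sphere except the one centred on the origin atom.
    mask = np.arange(len(coords)) != dummy_idx
    rel = coords[mask] - origin
    r = radii[mask]

    # L: furthest extent of any sphere along the primary axis.
    along = rel @ axis
    L = np.max(along + r)

    # Components of each centre perpendicular to the primary axis.
    perp = rel - np.outer(along, axis)

    # B5: largest perpendicular extent of any sphere.
    B5 = np.max(np.linalg.norm(perp, axis=1) + r)

    # Two orthonormal vectors spanning the plane perpendicular to the axis.
    helper = np.array([1.0, 0.0, 0.0])
    if abs(axis @ helper) > 0.9:
        helper = np.array([0.0, 1.0, 0.0])
    u = np.cross(axis, helper)
    u /= np.linalg.norm(u)
    v = np.cross(axis, u)

    # B1: for each in-plane direction find the furthest extent of any sphere,
    # then take the smallest of these extents over all scanned directions.
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    directions = np.outer(np.cos(angles), u) + np.outer(np.sin(angles), v)
    extents = np.max(perp @ directions.T + r[:, None], axis=0)
    B1 = np.min(extents)
    return L, B1, B5
```

Passing only the substituent atoms (plus the origin atom) corresponds to a core-excluded calculation, whereas passing the full molecule corresponds to a core-included one. The B1 value is approximate because the scan uses a finite number of directions.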
Importantly, Sterimol parameters are a generalized format for approximating the geometry of a substituent, and several choices must be made when calculating them.
Quantitative Sterimol parameters depend on which atomic radii and which definition of chemical size are used. Often, Sterimol parameters are defined using hard-sphere atomic radii, such as van der Waals (or Bondi) radii. Alternatively, Sterimol parameters may be calculated with a range of atomic size definitions, including sizes derived from electronic structure information. Examples include Pyykkö radii [17,18] for bonded atoms and the electronic structure-based approaches of Alvarez [19,20], Boyd [21] and Rahm [22], which are alternative descriptions of steric contributions that capture the important role that electrostatic interactions play. Boyd and Rahm define the surface of a chemical system by an electron density cutoff (0.001 e bohr⁻³), which does not rely on a hard-sphere approximation and is tailored to the molecular environment of the chemical species in question. Indeed, the rigid atomic size description neglects important charge and electrostatic considerations of atoms bonded in molecules that may contribute to the nonbonding interactions of steric effects.
Recently, a size metric defined by the electric field, STREUSEL, was introduced [23]. STREUSEL implicitly assesses the volume that a chemical system affects, as opposed to conventional approaches, which assess the volume that a chemical system occupies. This size definition has implications for ionic systems and systems for which dispersion interactions play a dominant role. For example, STREUSEL-derived Sterimol parameters may provide insight into the steric-dependent properties of ionic liquids [24].
In light of these advances, we present a workflow to calculate STREUSEL-derived Sterimol parameters, S'more Metrics fOR the Elucidation of Sterics (SMORES); this software package is available on GitHub [25]. Owing to the recent, successful application of Sterimol parameters in asymmetric catalyst enantioselectivity prediction, we demonstrate the SMORES workflow by fitting STREUSEL-derived Sterimol parameters to two asymmetric catalysis reactions: (i) NHK allylation of acetophenone, and (ii) NHK allylation of benzaldehyde.

Methods
The complete Python-based workflow comprises several steps, figure 1(b): (i) structure generation, (ii) structure optimization, (iii) electrostatic potential (ESP) calculation, (iv) electric field calculation using STREUSEL, (v) Sterimol parameter calculation, and (vi) Sterimol parameter fitting. SMORES assumes that each molecule is composed of a 'core' and one or more 'substituent(s)' that are connected via 'attached atom(s)', figure 1(c). Thus, users need only define the core structure and desired substituents in SMILES format, which keeps the workflow efficient and minimizes user intervention. A full description and demonstration of the workflow is provided in the documentation (smores.readthedocs.io). Here, we further describe Sterimol parameters (section 2.1) and the SMORES workflow (section 2.2).

Sterimol parameters
At the core, Sterimol parameters are a series of vectors by which substituent sizes may be assessed. Within this definition, there are several considerations when calculating Sterimol parameters.
First, a size definition must be selected; here, either molecular size is defined by hard-sphere radii (e.g. van der Waals, Pyykkö, etc.), or size is derived directly from the electronic structure of the molecules. Molecular size derived from the electronic structure is obtained by employing a cut-off; for example, the electric field surface size metric, STREUSEL, defines the chemical surface as the point in space where there is near-zero variance in the electric field. Electronic structure-derived molecular size is obtained by sampling volumetric pixel (voxel) representations of the molecules, which are tensors containing the relevant electronic structure property. In the case of STREUSEL, molecules are described as electric field tensors (voxels), which are obtained from the electrostatic potential calculated using density functional theory.
Second, the core may be included or excluded from the Sterimol parameter calculation, figure 1(d). This directly impacts the B1 and B5 Sterimol parameters, which may be larger if the core is included in the Sterimol parameter assessment. Including or excluding the core has implications for the final fit.
Thus, there are three ways Sterimol parameters may be calculated: (i) hard-sphere, core-included, (ii) hard-sphere, core-excluded, and (iii) voxel-based, core-included. We do not consider a voxel-based, core-excluded case owing to the inherent challenges and assumptions involved in partitioning a continuous surface obtained using electronic structure theory. The SMORES workflow provides functions to calculate Sterimol parameters for each of these cases using a range of atomic and molecular size definitions; this is further detailed in section 2.2.4.
Beyond the choice of chemical size definition, Sterimol parameters have an undeniable geometry dependence. This is difficult to account for using Sterimol parameters calculated from static, DFT-optimized structures. Recently, this was addressed by Paton et al by calculating weighted Sterimol parameters [26].
Here, a series of Sterimol parameters are calculated for several conformations of a single species and final Sterimol parameters are weighted by the energetic contribution of the conformers. In this way, Sterimol parameters calculated from lower energy conformers contribute more to the final reported Sterimol parameters. This functionality has been incorporated into the SMORES package for hard-sphere Sterimol parameter calculation. Owing to the large compute resources necessary for electronic structure theory calculations, we have not implemented this feature in the voxel-based STREUSEL Sterimol parameter calculation.
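The conformer weighting described above can be sketched as follows, assuming that the energetic contribution of each conformer is expressed as a Boltzmann weight at a chosen temperature. The function name, the energy unit (kcal/mol), and the default temperature are illustrative assumptions and do not describe the SMORES interface.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def boltzmann_weighted(values, energies_kcal, temperature=298.15):
    """Return Boltzmann-weighted parameters and the conformer weights.

    values        : (n_conformers, n_params) array, e.g. one (L, B1, B5) row per conformer
    energies_kcal : (n_conformers,) conformer energies in kcal/mol
    """
    values = np.asarray(values, dtype=float)
    energies = np.asarray(energies_kcal, dtype=float)

    # Only energy differences matter; shifting by the minimum avoids overflow.
    relative = energies - energies.min()
    weights = np.exp(-relative / (R_KCAL * temperature))
    weights /= weights.sum()

    # Weighted average over conformers for each parameter.
    return weights @ values, weights
```

With this weighting, conformers of equal energy contribute equally, and a conformer lying a few kcal/mol above the minimum contributes very little at room temperature.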

SMORES workflow description
Here, we detail each step in the SMORES workflow. To facilitate information passing and improve efficiency, we implement an SQLite database that is updated at each step with molecular properties and results directories.
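As an illustration of this kind of bookkeeping, the sketch below uses Python's built-in sqlite3 module to record one row per molecule and update it as each step completes. The schema, table name, column names, and helper functions are hypothetical and are not taken from the SMORES source.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS molecules (
    name     TEXT PRIMARY KEY,
    smiles   TEXT NOT NULL,
    xyz_path TEXT,
    esp_path TEXT,
    L REAL, B1 REAL, B5 REAL
)
"""
# Columns that later workflow steps are allowed to fill in.
UPDATABLE = {"xyz_path", "esp_path", "L", "B1", "B5"}

def open_database(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def add_molecule(conn, name, smiles):
    """Register a molecule at the structure generation step."""
    conn.execute("INSERT INTO molecules (name, smiles) VALUES (?, ?)", (name, smiles))
    conn.commit()

def update_molecule(conn, name, **fields):
    """Store results of a later step, e.g. a file path or a parameter."""
    unknown = set(fields) - UPDATABLE
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    if not fields:
        return
    # Column names come from the whitelist above; values are bound as parameters.
    assignments = ", ".join(f"{column} = ?" for column in fields)
    conn.execute(
        f"UPDATE molecules SET {assignments} WHERE name = ?",
        (*fields.values(), name),
    )
    conn.commit()
```

Keeping one row per molecule means each step only needs the molecule name to find the outputs of the previous step.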

Structure generation
SMORES assumes that each molecule is composed of a 'core' and one or more 'substituent(s)', figure 1(c). The primary axis along which the Sterimol parameters are calculated is determined from specified attached atom(s) and the core.

Structure optimization
Sterimol parameters are highly geometry dependent. SMORES provides the user the option to optimize geometries using density functional theory or semiempirical methods. Density functional theory geometry optimization is provided by Psi4 [27], an open-source software package providing a variety of quantum chemical methods. Geometry optimization using a semiempirical tight binding method [28] may also be used; here, geometry optimization is facilitated by the xtb software package [28]. Alternatively, users may choose to calculate Sterimol parameters for the lowest energy conformer from a conformer search using RDKit [29]. While the level of theory directly impacts the final geometry, the presented workflow is focused on elucidating macroscopic trends, which should be preserved by the selected level of theory; this choice is left to the researcher.

Electrostatic potential and electric field calculation
An electrostatic potential map is necessary to calculate the STREUSEL (electric field) size. This functionality is provided by the Psi4 software package. The default electrostatic potential calculation parameters within SMORES are in line with the guidelines set out by STREUSEL.
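For orientation, the relationship between an electrostatic potential grid and the electric field, E = -∇V, can be sketched with finite differences on a regular grid. This is a generic NumPy illustration under the assumption of an orthogonal, evenly spaced grid; it does not reproduce the STREUSEL surface criterion, and the function name is hypothetical.

```python
import numpy as np

def electric_field(esp, spacing):
    """Return the field components and magnitude from a potential grid.

    esp     : (nx, ny, nz) array of electrostatic potential values
    spacing : (dx, dy, dz) grid spacings along each axis
    """
    esp = np.asarray(esp, dtype=float)
    # np.gradient returns one array of partial derivatives per grid axis.
    dV_dx, dV_dy, dV_dz = np.gradient(esp, *spacing)
    # The electric field is the negative gradient of the potential.
    field = -np.stack([dV_dx, dV_dy, dV_dz], axis=-1)
    magnitude = np.linalg.norm(field, axis=-1)
    return field, magnitude
```

The units of the result follow from the inputs: a potential in atomic units on a grid in bohr gives a field in atomic units.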

Steric parameter calculation and fitting
SMORES calculates Sterimol parameters using three definitions (section 2.1): (i) hard-sphere, core-included, (ii) hard-sphere, core-excluded, and (iii) voxel-based, core-included parameters. Hard-sphere core-included and core-excluded Sterimol parameters are obtained using the MORFEUS package [30], which facilitates the calculation of molecular features for machine learning, with a specific emphasis on steric descriptors. Here, we use the Sterimol parameter implementation of MORFEUS, which calculates these parameters using a variety of accepted radii: Alvarez [19,20], Bondi [31][32][33], Rahm [22], Pyykkö [17], and Mantina et al [34]. Voxel-based STREUSEL Sterimol parameters are calculated directly from electrostatic potential maps using STREUSEL and SMORES (see section S1 of the supporting information for a detailed description).
Calculated Sterimol parameters are then fit using linear regression to sterically dependent experimental parameters, such as enantioselectivity.
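A minimal sketch of such a fit is shown below, assuming an ordinary least-squares model with an intercept and the three Sterimol parameters as descriptors. The function name is hypothetical, and the fitting procedure actually used in SMORES may differ.

```python
import numpy as np

def fit_linear(descriptors, target):
    """Fit target = c0 + c1*L + c2*B1 + c3*B5 and return (coefficients, R^2).

    descriptors : (n_samples, n_params) array, e.g. columns L, B1, B5
    target      : (n_samples,) experimental values, e.g. measured ΔΔG‡
    """
    X = np.asarray(descriptors, dtype=float)
    y = np.asarray(target, dtype=float)

    # Prepend a column of ones so the first coefficient is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    coefficients, *_ = np.linalg.lstsq(A, y, rcond=None)

    # Coefficient of determination of the fitted model.
    predicted = A @ coefficients
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return coefficients, r_squared
```

The returned R² is the quantity typically compared between size definitions, provided each fit uses the same set of substituents.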

Results and discussion
To demonstrate the functionality of the SMORES workflow and the validity of the STREUSEL Sterimol parameters, we compared Sterimol parameters calculated using a range of accepted atomic radii for several common carbon substituents (section 3.1). We then examined the utility of STREUSEL-based Sterimol parameters calculated using the SMORES workflow in revealing the substituent-dependence of catalytic activity for NHK allylation of acetophenone and benzaldehyde (section 3.2).

Common carbon substituents
Hard-sphere, core-included Sterimol parameters (L, B1, B5) calculated using the SMORES workflow are presented for a series of common carbon substituents bonded to two core identities (benzyl and methyl) and calculated using four size definitions (hard-sphere STREUSEL, Bondi and Pyykkö, as well as voxel-based STREUSEL), figure 2. Unsurprisingly, Sterimol parameters are dependent on the definition of size that is used. Figure 2 reveals that core identity has a larger impact on the calculated B1 and B5 hard-sphere, core-included Sterimol parameters than on the L hard-sphere, core-included Sterimol parameters. This is expected considering L parameters describe the vector directed away from the core, figure 2(d). As the voxel-based STREUSEL size metric is calculated from the electric field and the field is generated by the bonded environment of the molecule, there is a larger core identity dependence for these parameters. Sterimol parameters calculated using a wider variety of accepted atomic radii are presented in figure S5 of the supporting information. From these results, it is evident that the hard-sphere STREUSEL Sterimol parameters yield similar metrics to those calculated using other accepted atomic radii. Further, voxel-based STREUSEL Sterimol parameters are systematically larger than those calculated using hard-sphere radii definitions and present a larger variance when compared with those calculated using Bondi radii, figures S6 and S7. This is expected considering voxel-based STREUSEL Sterimol parameters are accessing a different definition of chemical surface.

Asymmetric catalytic reactions
Recently, Sterimol parameters were demonstrated to be useful in predicting ∆∆G ‡ values in asymmetric catalytic reactions [16].We validate the SMORES workflow and the utility of the hard-sphere and voxel-based STREUSEL Sterimol parameters by fitting these metrics to two asymmetric catalytic reactions, NHK allylation of (i) acetophenone, and (ii) benzaldehyde, figure 3. See section S3 of the supporting information for details of the fitting procedure employed in SMORES.
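For readers less familiar with the target quantity, the sketch below converts an enantiomeric excess into ∆∆G‡ using the standard relation ∆∆G‡ = RT ln(er), where er is the enantiomeric ratio. The function name, the sign convention (positive when the major enantiomer is the one counted as positive ee), and the default temperature are assumptions for illustration; the temperature should be set to that of the experiment.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

def ddg_from_ee(ee_percent, temperature=298.15):
    """Convert enantiomeric excess (in %) to ΔΔG‡ in kJ/mol."""
    if not -100.0 < ee_percent < 100.0:
        raise ValueError("ee must lie strictly between -100 and 100 percent")
    # Enantiomeric ratio of major to minor enantiomer.
    er = (100.0 + ee_percent) / (100.0 - ee_percent)
    return R_KJ * temperature * math.log(er)
```

A racemic outcome (0% ee) corresponds to ∆∆G‡ = 0, and the value grows logarithmically as the selectivity approaches 100% ee.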
For comparison, we examine the correlation between Sterimol parameters calculated using the three Sterimol parameter calculation methods employed in SMORES (hard-sphere core-excluded, hard-sphere core-included, and voxel-based parameters). Table 1 summarizes the relationship between Sterimol parameters calculated using a range of size definitions and measured enantioselectivity for the NHK allylation reactions explored in this work. Importantly, the SMORES workflow yields similar correlations for the NHK allylation of acetophenone and benzaldehyde as those presented by Harper et al [16]. Overall, STREUSEL core-excluded and voxel-based STREUSEL Sterimol parameters perform similarly to the other size metrics studied in this work. This indicates that the electric field-based size metric is also useful for determining the steric effect of substituents in asymmetric catalysis. In general, hard-sphere, core-excluded Sterimol parameters outperformed the hard-sphere, core-included Sterimol parameters, tables S3 and S4 of

Conclusions
Sterimol parameters are an effective and powerful way of describing sterics; this is demonstrated by their predictive success across a range of fields, including asymmetric catalysis [4,16,26]. Importantly, Sterimol parameters describe substituent sterics using a series of vectors; this is independent of chemical size definition. Varying descriptions of chemical size assess different functional properties; for example, Pyykkö radii enable bond lengths to be approximated as the sum of atomic radii and may be more applicable to systems that do not possess partial multiple bonds [17,18], while STREUSEL assesses the space a chemical system affects and may perform better for highly polar systems [23]. Ultimately, chemical size definition selection for Sterimol parameter calculation should be dictated by the chemical systems and problem at hand. Here, we provide a computational workflow, SMORES, to facilitate Sterimol parameter calculation. Further, we present an algorithm and method to easily calculate Sterimol parameters for voxel-based size metrics derived from electronic structure theory calculations, such as STREUSEL and Boyd's electron density cutoff. The utility of STREUSEL-based Sterimol parameters is demonstrated by the high observed correlations for two asymmetric catalytic reactions: (i) NHK allylation of acetophenone, and (ii) NHK allylation of benzaldehyde. The high performance of STREUSEL-based Sterimol parameters is promising for steric-dependent problems featuring highly polar systems, such as enantioselectivity of inorganic catalysts.

Figure 1. (a) Visualization of the multi-directional Sterimol parameters (L, B1, B5) for an example system. (b) The SMORES workflow comprises three main steps: (i) system setup, (ii) electronic structure calculations, and (iii) Sterimol assessment. (c) SMORES structures are defined as a core connected to a substituent via an attached atom. (d) Visualization of the Sterimol B1 parameter for an example system with the core structure included or excluded.

Figure 2. Sterimol parameters, (a) L, (b) B1, and (c) B5, at four radii definitions (STREUSEL, Bondi, Pyykkö, and STREUSEL cube) are presented for several common carbon substituents bonded to two core identities, methyl (circle) and benzyl (triangle). (d) Sterimol parameters are visualized for one of the examined systems, which features a methyl core (blue) and a CHEt2 substituent (green).

Figure 3. SMORES Sterimol parameters are validated for NHK allylation of acetophenone and benzaldehyde. The examined substituents are detailed in blue, and the core is shown in green.

Table 1. The correlation (R²) of measured ∆∆G‡ values and predicted ∆∆G‡ values for NHK allylation of acetophenone and benzaldehyde using Sterimol parameters calculated with a variety of radii types. Radii are listed in order of decreasing R² value. New methods implemented by SMORES are highlighted in bold italics.

This reveals the true substituent-dependence of enantioselectivity for the presented asymmetric catalytic reactions. Importantly, hard-sphere and voxel-based STREUSEL Sterimol parameters are among the top performing candidates; this demonstrates the utility of the STREUSEL size metric in enantioselectivity prediction via Sterimol parameters, as well as the utility of the SMORES workflow for asymmetric catalysis analysis.