Decomposing Hofmeister effects on amino acid residues with symmetry adapted perturbation theory

Hofmeister effects, and more generally specific ion effects, are observed broadly in biological systems. However, there are many cases where the Hofmeister series might not be followed in complex biological systems, such as ion channels which can be highly specific to a particular ion. An understanding of how ions from the Hofmeister series interact with the proteinogenic amino acids will assist elucidation of why some binding interactions may be favoured over others. Using symmetry adapted perturbation theory (SAPT2+3), the interaction energies between a selection of anions and each amino acid have been investigated. The interaction strengths become more favourable in accordance with the Hofmeister series, and also with increasing polarity of the amino acids (with the exception of the negatively charged amino acid side chains). Furthermore, the interactions are generally most favourable when they simultaneously involve the side chain and both protic moieties of the backbone. The total interaction energy in these anion–amino acid complexes is also primarily determined by its electrostatic component, in a manner proportional to the þ (‘sho’) value of the anion.


Introduction
Proteins are the most versatile macromolecules in living systems [1]. Composed of a well-defined sequence of amino acid residues, proteins serve crucial biochemical functions including catalysis, molecular storage and transport, mechanical support, immune protection, nerve impulse transmission and growth and differentiation control. Each of these functions is directly dependent on the three-dimensional structure of the corresponding protein(s) [1]. Ultimately this structure is determined by the amino acid sequence that defines the protein itself; however, the overall conformation of the protein is influenced by the interaction between individual amino acid residues, adjacent substrates and the solvent environment. This interaction can be influenced directly via the presence and identity of dissolved ions [2,3]. Electrolytes are therefore capable of manipulating protein conformation and hence protein function. More generally, electrolytes play essential roles in controlling many processes and environments within biological organisms: they buffer pH (and more [4]) in the cells, enzymes and blood of organisms [5], act as (amino-acid) salt bridges [6] and cofactors [1] for enzymes, and can 'salt-in' or 'salt-out' proteins (i.e. stabilise or destabilise them, respectively) [3]. Ions are a key component of primary processes in living organisms, and indeed biological systems can undergo stress when specific salt concentrations are out of balance [7].
The manner in which ions influence protein structure often depends strongly on their identity, and therefore constitutes one example of a specific ion effect (SIE). SIEs are observed throughout the physical, chemical and biological sciences, and have been shown to influence reaction rates, polymer morphology,

Computational methods
The investigation of the interactions between amino acids and Hofmeister anions follows the protocols established previously [18]. We consider here the interaction between archetypal Hofmeister anions (F⁻, Cl⁻, SCN⁻ and PF₆⁻) and the 20 amino acids in their neutral backbone state in the gas phase (figure 1). Previous work has shown a consistency between the trends in computationally efficient gas-phase calculations and more complex solvated calculations, allowing analysis of a broader range of interactions with a higher level of theory [11]. A similar analysis in the presence of water would also be of interest but is computationally beyond the scope of the current investigation. The structure of each anion-amino acid complex was first obtained using the stochastic Kick³ algorithm [20] (1000 starting configurations) in conjunction with third-order density functional tight binding (DFTB3) [21]/3ob-1-1 [22,23], for the sake of computational efficiency. Each unique DFTB3 pre-optimised complex was then re-optimised using M06-2X [24]/aug-cc-pVDZ [25]. SAPT2+3 was then used to calculate the basis set superposition error-corrected total interaction energy (∆E), and the component electrostatic (∆E_Elec), exchange (∆E_Exch), induction (∆E_Ind) and dispersion (∆E_Disp) energies between the anion and each amino acid. The influence of energy decomposition scheme and diffuse basis set functions is considered in the supporting information.
Figure 1. M06-2X/aug-cc-pVDZ optimised structures of gas-phase anion-amino acid complexes for F⁻, Cl⁻, SCN⁻ and PF₆⁻, and the 20 proteinogenic amino acids: glycine, alanine, valine, leucine, isoleucine, methionine, phenylalanine, tryptophan, proline, serine, threonine, cysteine, tyrosine, asparagine, glutamine, the negatively charged aspartate and glutamate, and the positively charged lysine, arginine and histidine. Amino acids are grouped according to the structure of their sidechain.
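The stochastic sampling step can be illustrated with a minimal sketch. This is not the Kick³ implementation of [20]; it only shows the general idea of generating many random starting positions for a single-site anion around a fixed molecule, each of which would then be passed to a geometry optimiser. The function names, the distance limits (`r_min`, `r_max`, `min_contact`, in Å) and the centroid-based placement are all assumptions made for illustration.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a direction uniformly distributed on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def kick_anion_positions(molecule_xyz, n_configs=1000, r_min=3.0, r_max=6.0,
                         min_contact=1.5, seed=0, max_attempts=100000):
    """Return n_configs random anion positions around a molecule.

    molecule_xyz: list of (x, y, z) atom coordinates in angstrom.
    r_min, r_max, min_contact: hypothetical distance limits, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    n_atoms = len(molecule_xyz)
    centre = tuple(sum(atom[i] for atom in molecule_xyz) / n_atoms
                   for i in range(3))
    positions = []
    attempts = 0
    while len(positions) < n_configs:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not place the anion; relax the limits")
        direction = random_unit_vector(rng)
        radius = rng.uniform(r_min, r_max)
        trial = tuple(centre[i] + radius * direction[i] for i in range(3))
        # Reject trial positions that overlap with any atom of the molecule.
        if all(math.dist(trial, atom) >= min_contact for atom in molecule_xyz):
            positions.append(trial)
    return positions
```

A polyatomic anion such as SCN⁻ or PF₆⁻ would additionally need a random orientation, which this sketch omits. Fixing `seed` makes the set of starting configurations reproducible.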
Anion-amino acid complexes can form through interactions at several different binding sites, but are found to occur predominantly via the following classifications (with some subtle variations), as shown in figures 3-5:
- backbone OH,
- backbone NH,
- backbone OH and NH simultaneously,
- sidechain,
- sidechain and backbone OH simultaneously,
- sidechain and backbone NH simultaneously,
- sidechain, backbone OH and backbone NH simultaneously.
For the following discussion, each of these binding mechanisms has been considered where they formed naturally as a result of geometry optimisation (i.e. no interactions were 'forced' to occur). All geometry files are supplied in the supporting information (xyz format).
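The seven binding classifications listed above are the non-empty combinations of three contact groups (sidechain, backbone OH, backbone NH). The sketch below shows one way an optimised geometry could be assigned to a class from anion-hydrogen distances. The 2.5 Å cutoff, the function names and the input layout are hypothetical; the criterion actually used to assign binding sites is not stated in the text.

```python
import math

# Fixed reporting order for the three contact groups.
GROUP_ORDER = ("sidechain", "backbone OH", "backbone NH")


def contacted_groups(anion_xyz, group_hydrogens, cutoff=2.5):
    """Return the set of groups with at least one H within `cutoff` of the anion.

    group_hydrogens maps a group name to a list of (x, y, z) hydrogen
    coordinates in angstrom. The cutoff is an illustrative assumption.
    """
    contacts = set()
    for group, hydrogens in group_hydrogens.items():
        if any(math.dist(anion_xyz, h) <= cutoff for h in hydrogens):
            contacts.add(group)
    return contacts


def classify_binding_site(contacts):
    """Turn a set of contacted groups into one of the seven class labels."""
    present = [group for group in GROUP_ORDER if group in contacts]
    if not present:
        return "no contact"
    return " + ".join(present)


# Usage with made-up coordinates: the anion sits near both backbone groups.
example_hydrogens = {
    "backbone OH": [(2.0, 0.0, 0.0)],
    "backbone NH": [(0.0, 2.2, 0.0)],
    "sidechain": [(5.0, 0.0, 0.0)],
}
label = classify_binding_site(contacted_groups((0.0, 0.0, 0.0), example_hydrogens))
```

For a polyatomic anion the distance would have to be measured from the binding atom (for example S or N in SCN⁻) rather than from a single point.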

Results and discussion
For the purpose of discussion, the 20 amino acids in figure 1 are categorised according to the nature of their sidechains: amino acids with neutral (nonpolar: glycine, alanine, valine, leucine, isoleucine, methionine, phenylalanine, tryptophan and proline, and polar: serine, threonine, cysteine, tyrosine, asparagine and glutamine) and charged sidechains (aspartate, glutamate, lysine, arginine and histidine) are considered individually.
A summary of the SAPT2+3 energy contribution analysis is presented in figure 2 (see table S1, figures S1-2 and accompanying discussion in supporting information for further detail). Figure 2(a) reveals that, on average over the entire dataset, electrostatics makes up the greatest portion of the total interaction energy and in fact exceeds it (%Elec = 111.5 ± 7.0%). The comparatively low standard deviation here indicates that electrostatics plays a dominant role irrespective of the amino acid and ion in question. Exchange repulsion is in general the second-largest contributor to the total interaction energy (%Exch = −82.2 ± 53.8%). As it is a repulsive interaction, exchange counteracts the (mostly) favourable electrostatic interactions. Induction (%Ind = 44.8 ± 30.8%) and dispersion (%Disp = 26.0 ± 16.9%) make much smaller contributions to the total interaction energy. For exchange, induction and dispersion, the standard deviation is large, indicating that these energy contributions are more strongly affected by the amino acid identity. Figure 2(b) shows the correlation coefficient between each energy contribution and the total interaction energy, averaged over the amino acids in the dataset. This figure shows ∆E_Elec to be the most strongly correlated, with a very strong correlation coefficient of ρ_Elec = 0.99 ± 0.01, followed by moderately strong correlations for ∆E_Exch and ∆E_Ind, with ρ_Exch = −0.89 ± 0.09 and ρ_Ind = 0.85 ± 0.13. ∆E_Disp is the most weakly correlated, with ρ_Disp = 0.72 ± 0.18 across the whole dataset.
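The percentage contributions and correlation coefficients discussed above can be reproduced from a table of component energies with a few lines of code. The sketch below assumes that each percentage is the component energy divided by the total interaction energy (so a repulsive component gives a negative percentage when the total is attractive) and that ρ is the Pearson correlation coefficient; neither definition is spelled out in the text. The energies in the example are made-up numbers, not results from this work.

```python
import math


def percent_contributions(components):
    """Express each energy component as a percentage of the total energy.

    components: dict of component name -> energy (any consistent unit).
    The total is taken as the plain sum of the components.
    """
    total = sum(components.values())
    return {name: 100.0 * value / total for name, value in components.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical component energies for one complex (kJ/mol).
example = {"Elec": -110.0, "Exch": 80.0, "Ind": -45.0, "Disp": -25.0}
shares = percent_contributions(example)
# shares["Elec"] is 110.0 and shares["Exch"] is -80.0 for these inputs.
```

By construction the four percentages sum to 100%. The mean values quoted in the text (111.5, −82.2, 44.8 and 26.0%) sum to 100.1%, which is consistent with this definition to within rounding.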
The strong correlation between the electrostatic contribution and the total interaction energy, together with the dominant electrostatic contribution to the total interaction energy, provides the basis to investigate these anion-amino acid complexes in terms of the recently developed þ parameter, the values of which are (×10⁻¹⁰ C m⁻¹) [11], . The discussion below indeed shows a strong correlation between þ and the interaction energies in these complexes. The exception here is PF₆⁻; due to its larger size and octahedral structure, interactions between this anion and these amino acids are not necessarily dominated by pairwise electrostatic interactions. We note also that individual components of SAPT2+3 total interaction energies generally show strong correlations with the anion þ parameter, with the exception of the dispersion interaction. Only small deviations arise from using less computationally intense levels of SAPT, and furthermore, these conclusions are consistent even when using LMOEDA/M06-2X/cc-pVDZ. We provide a complete discussion on these points in the supporting information.

Neutral nonpolar sidechain interactions
M06-2X/aug-cc-pVDZ//SAPT2+3/aug-cc-pVDZ interaction energies for anion-amino acid complexes for the amino acids with nonpolar sidechains are presented in figure 3. It is anticipated that the nonpolar sidechains of these nine amino acids will exhibit interaction strengths and trends similar to those established in previously published work concerning N-isopropylacrylamide-like monomer residues [18], since the structural components of these acids (i.e. 1°/2° amines, aliphatic chains and carbonyl groups) are similar. However, some structural differences exist, notably due to sulfide groups, carboxylic acids and aromatic rings. Sulfide groups (present in methionine) are not expected to cause an exception in this respect, since the electronegativity of the sulfur atom indicates that its interaction with anions will be weak.
Similarly, the aromatic rings in phenylalanine and tryptophan are not expected to exhibit strong interactions with anions, although they may do so with cations (not considered here) due to cation-π effects [31]. Carboxylic acids, on the other hand, will potentially interact strongly via hydrogen bonding with the anions considered here.
For all observed anion-nonpolar amino acid binding interactions, figure 3 shows that the total gas-phase interaction energy forms a reverse Hofmeister series determined by the anion þ value (i.e. the most negative þ value correlates with the largest interaction strength, which is the opposite to the traditional Hofmeister series [2]). We note though that PF₆⁻ values do not follow the trend of the remaining ions, as mentioned previously (figure S6). Further, for SCN⁻ values to follow this trend the binding orientation must be correctly accounted for (i.e. it can bind via either its S or N atom). The absolute binding strengths at each binding site are also consistent for these amino acids. F⁻, Cl⁻, SCN⁻ and PF₆⁻ anions all interact more strongly with a combination of the amide and carboxylic acid moieties on the amino acid backbone than with the backbone OH group and NH group alone. The anions interact most weakly with the aliphatic sidechain, as expected. The order of these interaction strengths aligns generally with the acidity of the respective hydrogens (i.e. carboxylic acid OH > amine NH > CH). For glycine (figure 3(a)) and proline (figure 3(i)), interaction with F⁻ leads to hydrogen abstraction, irrespective of the interaction site. For the F⁻-alanine complex (figure 3(b)), hydrogen abstraction occurs at all binding sites except for the sidechain NH moiety. For the remaining amino acids in figure 3, dehydrogenation due to F⁻ is less frequent. This is due to the possibility of direct interaction between F⁻ and the aliphatic hydrogens of the sidechain, as well as the stabilisation of the amine hydrogens by the larger sidechain moieties in these acids, and fewer uni-directional interactions with the F⁻ anion. Methionine illustrates further the influence of the sidechain on the calculated interaction energies.
The presence of sulfur in methionine (figure 3(f)) increases the interaction strength of its sidechain interactions, most notably via the interaction between F⁻ and the methionine NH and backbone CH groups. While it could be reasonably assumed that this interaction energy (−219 kJ·mol⁻¹) is due to a simultaneous interaction with two aliphatic hydrogens on the methionine sidechain, this is not the case. Indeed, an almost identical interaction occurs between F⁻ and isoleucine (figure 3(e)), which is weaker by 78 kJ·mol⁻¹. Further, aliphatic chain length does not appear to cause significant differences between alanine, valine, leucine and isoleucine.
For these nine nonpolar amino acids, there were no complexes in which the anion interacted exclusively with the backbone NH moiety (figure 3); in all cases such interactions occurred in conjunction with interactions between the anion and the nonpolar sidechain moieties. This is attributed to the close proximity of the amine to the sidechain; simply rotating the C-N bond enables the anion to simultaneously interact with the sidechain, thereby stabilising the complex and increasing ∆E. On the other hand, isolated backbone OH-anion interactions can occur due to the greater conformational change required to achieve this stabilisation. The naturally stronger interaction between the anion and the carboxylic OH group is also a factor here.
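The statement that the interaction energies follow the ordering of the anion þ values can be expressed as a simple monotonicity test: when the anions at a given binding site are sorted from the most negative þ upwards, the interaction energies should run from the most negative (strongest) upwards. The sketch below is one possible way to write that check. The function name is hypothetical, and the anion labels and numbers in the usage lines are placeholders rather than values from this work or from [11].

```python
def follows_sho_ordering(sho, delta_e):
    """Check whether a more negative sho value always gives a stronger interaction.

    sho:     dict of anion label -> sho value.
    delta_e: dict of anion label -> interaction energy at one binding site
             (negative means attractive).
    Only anions present in both dicts are compared.
    """
    anions = sorted((a for a in sho if a in delta_e), key=lambda a: sho[a])
    energies = [delta_e[a] for a in anions]
    # Energies must be non-decreasing as sho becomes less negative.
    return all(e1 <= e2 for e1, e2 in zip(energies, energies[1:]))


# Placeholder data for three generic anions A, B and C.
placeholder_sho = {"A": -3.0, "B": -2.0, "C": -1.0}
ordered = follows_sho_ordering(placeholder_sho, {"A": -300.0, "B": -200.0, "C": -100.0})
broken = follows_sho_ordering(placeholder_sho, {"A": -300.0, "B": -100.0, "C": -200.0})
```

Applied per binding site, a check of this kind would need the binding atom of SCN⁻ (S or N) to be treated consistently, as noted above, and would be expected to fail for PF₆⁻.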

Neutral polar sidechain interactions
Interaction energies for anion-amino acid complexes for the amino acids with polar sidechains are presented in figure 4. Again, these total gas-phase interaction energies constitute a reverse Hofmeister series, determined by the anion þ value. In this series, there are three distinct functional groups on the sidechain: an OH alcohol, an SH thiol and an NH₂ amide. The consistency among these amino acid residues is marked. Once again, these interaction energies constitute a reverse Hofmeister series, irrespective of the anion, the amino acid, and the anion binding site on the amino acid.   a much greater range in the interaction energies between F⁻ and SCN⁻ than observed for the nonpolar amino acids.

Charged sidechain interactions
SAPT2+3/aug-cc-pVDZ interaction energies for complexes between Hofmeister anions and amino acids with charged sidechains are presented in figure 5. To this point, interaction energies have only been considered for neutral molecular substrates. Here, by examining complexes with charged amino acids, the interaction energy Hofmeister series will be shown to be a more fundamental property, independent of the formal charge of the molecular substrate. Figure 5 shows that, irrespective of the formal charge on the amino acid substrate and the structure of the acid, the interaction energies between these amino acids and the Hofmeister anions constitute a reverse Hofmeister series, again defined by the anion þ value. This emphasises further the utility of the þ parameter as the fundamental ion property capable of predicting the fundamental specific ion series, provided that the interactions are dominated by pairwise Coulombic effects. It is noted, however, that the magnitude of the anionic-cationic interactions is substantially greater (∼3×) than that observed for equivalent interactions with the amino acids with nonpolar and polar sidechains (i.e. figures 3 and 4).
Lysine⁺, arginine⁺ and histidine⁺ (figures 5(a)-(c)) all carry a formal positive charge on their ammonium⁺, guanidinium⁺ and imidazole⁺ groups, respectively. For these amino acids, backbone interactions with Hofmeister anions no longer occur, unless they are coupled with sidechain interactions. For example, there are no longer any observed backbone OH, NH or simultaneous NH and OH backbone interactions isolated from the charged moiety of the sidechain. This is unsurprising, due to the strength of the electrostatic interaction with the adjacent positively charged sidechain. The more charge dense Cl⁻ ion has a much stronger (by ∼56-106 kJ·mol⁻¹) interaction with lysine⁺ than arginine⁺ for equivalent binding sites. In contrast, the interactions of these amino acids with the charge diffuse PF₆⁻ (figure S5(a) and (b)) are more comparable (−469.7 and −407.7 kJ·mol⁻¹, respectively, for the backbone OH/sidechain interaction, and −428.6 and −389.9 kJ·mol⁻¹ for sidechain interactions). Histidine⁺ lacks sufficient Cl⁻ data to make conclusive remarks here, but looks comparable to both lysine⁺ and arginine⁺ in terms of the interaction strengths with NCS⁻ and PF₆⁻ (figure S5(c)).
Aspartate⁻ and glutamate⁻ (figures 5(d) and (e)) both carry a negative charge at biologically relevant pH ranges. Unsurprisingly then, the calculated interaction energies between these amino acids and all anions considered here are positive, indicating a repulsive interaction. This repulsion is due to the presence of the charged sidechain, which means that the Hofmeister anions only interact with aspartate⁻ and glutamate⁻ via backbone NH₂ or OH moieties, but at dissociative ranges due to the electrostatic repulsion between the anion and the (charged) sidechain functional group. This interaction is nevertheless facilitated by a conformational change in the amino acid backbone.
For example, isolated backbone amine interactions are now observed, which are not seen in the other amino acids due to the usual favourability of sidechain stabilisation. Figure 5 also shows an apparent relationship with the anion size in this regard. For instance, while F⁻ and Cl⁻ both interact with the glutamate⁻ NH group, the interaction energy for Cl⁻ is notably larger (more positive) than that for F⁻, and consistent with that for SCN⁻. The weaker binding SCN⁻ also only interacts through its nitrogen for glutamate⁻, while PF₆⁻ does not appear to have any resultant interaction at this site (figure S5(e)). Moreover, the less localised electron density of these ions might mean they are influenced from a longer range and hence interact with the OH group instead or are entirely repelled by the amino acid.

Conclusions
Using the þ parameter of a series of Hofmeister anions, an analysis of all 20 proteinogenic amino acids reveals that gas-phase interactions between amino acids and anions consistently constitute a reverse Hofmeister series, irrespective of the binding site, amino acid charge and anion identity. Symmetry adapted perturbation theory (SAPT2+3) calculations revealed the electrostatics to be most correlated with the total interaction energy (and in turn the Hofmeister series), and dispersion was consistently the least correlated component of the total ion-amino acid interaction energy. Thus, there are fundamental similarities in the origins of these reverse Hofmeister series and those previously reported in the context of other molecular substrates [18] and bulk electrolyte properties [11]. Results presented here provide new evidence that highlights the utility of the þ parameter as an intrinsic ion property responsible for the fundamental series of specific anion effects.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).