Optical spectra of EGFR inhibitor AG-1478 for benchmarking DFT functionals

Optical spectroscopy (UV–vis and fluorescence spectroscopy) is sensitive to the chemical environment and conformation of fluorophores and therefore, serves as an ideal probe for the conformation and solvent responses. Tyrosine kinase inhibitors (TKI) such as AG-1478 of epidermal growth factor receptor when containing a quinazolinamine scaffold are fluorophores. It is, however, very important to benchmark density functional theory (DFT) method against optical spectral measurements, when time-dependent DFT is applied. In this study, the performance of up to 22 DFT functionals is benchmarked with respect to the measured optical spectra of AG-1478 in dimethyl sulfoxide (DMSO) solvent. It is discovered when combined with the 6–311++G(d, p) basis set, there are top seven functionals; B3PW91, B3LYP, B3P86, PBE1PBE, APFD, HSEH1PBE, and N12SX DFT-VXC functionals are identified as the top performers. Becke’s three-parameter exchange functional (B3) tends to generate accurate optical spectra to form the best three functionals, B3LYP, B3PW91 and B3P86. Specifically, B3PW91 was recommended for studying the optical properties of 4-quinazolinamine TKIs, B3LYP was found to be excellent for absorption spectrum, while B3P86 was identified as the best for emission spectrum. Any further corrections to B3LYP, such as CAM-B3LYP, LC-B3LYP, and B3LYP-D3 result in larger errors in the optical spectra of AG-1478 in DMSO solvent. These best three (B3Vc) functionals are reliable tools for optical properties of the TKIs and therefore the design of new agents with larger Stokes shift for medical image applications. To obtain reliable optical spectra for this class of 4-quinazolinamine based TKIs, it is important to include the electron correlation energy.


Introduction
The epidermal growth factor receptor (EGFR), a membrane-spanning cell surface protein, is a major target for drugs used to treat lung carcinoma [1]. In human cancers, mutations in the tyrosine kinase domain of EGFR often arise, which can cause drug sensitivity or resistance by influencing the relative strengths of drug and adenosine triphosphate binding. To understand how these mutations dysregulate EGFR and to modulate its sensitivity and resistance to tyrosine kinase inhibitors (TKIs), the most effective approach is to combine experimental and theoretical methods at the molecular level [2]. TKIs with a 4-quinazolinamine scaffold can be useful optical probes of the interaction between these inhibitors and the environment. The optical spectral properties (UV-vis and fluorescence) of quinazolinamine fluorophores are sensitive to the chemical environment and conformation of the fluorophores. Therefore, they are ideal tools for optical reporting of inhibitors of protein tyrosine kinases, such as EGFR.
N-(3-chlorophenyl)-6,7-dimethoxy-4-quinazolinamine (AG-1478) is a highly potent TKI that has been extensively studied for its conformation in solvents and its optical spectra using a combination of

Methods and computational details
All ground state quantum mechanical optimization and TD-DFT calculations use the same DFT method. All calculations use the 6-311++G(d,p) BS with the implicit polarizable continuum model (PCM) [18]. That is, the AG-1478 geometries at the ground electronic state (S 0 ) and first singlet excited-state (S 1 ) in figure 1 were optimized, using the LR-PCM/TD-DFT methods in dimethyl sulfoxide (DMSO) solvent, followed by TD-DFT calculations using the same DFT functionals (Vxc) in DMSO. The 22 DFT functionals (21 DFT functional plus HF) are available from the Gaussian 16 computational chemistry software packages [19], and table S1 summarizes selected geometric parameters and dipole moment results calculated using different DFT Vxc functionals.
For the investigation of absorption and emission processes, two approaches are usually employed: the conventional linear-response (LR) approaches [20] and the state-specific ones [21]. In the LR method, the absorption and emission energies including a PCM correction are determined by the electron density variation associated with the transition. All calculations were performed using Gaussian 16 computational chemistry packages [19] at the supercomputing facilities from National Computational Infrastructure and the OzSTAR Supercomputer at Swinburne University. The AG-1478 chemical structure in figure 1(b) is the global minimum structure obtained based on the B3LYP/6-311++G(d,p) model [4]. Analysis of the LR of the GS density was calculated either with the HF or with DFT to an external time-dependent perturbation leads to the TD-HF or TD-DFT schemes, respectively [12], where the wavefunction Hartree-Fock (HF) calculations are employed as a reference.

Performance of DFT-V XC functionals-the top seven
Most common DFT Vxc functionals used to investigate ground and ESs of compounds can be benchmarked to different rung of 'Jacob's ladder' . Although several general conclusions are available, there is no universal solution for choosing the 'best' V XC for the TD-DFT calculations of all systems of interest [13]. As a result, it becomes a common practice of benchmarking studies to assess the V XC for calculating particular properties, (e.g. absorption and/or emission spectra) of a particular class of compounds (e.g. quinazoline scaffold TKIs in our case). A comprehensive review and extensive assessment of about 200 DFT-V XC functionals by Goerigk and Grimme [14] divided the DFT-V XC functionals approximately into three major corrections of wave function theory, exchange-correlation, and dispersion corrections. The top DFT-V XC functional performers include B3LYP; PBE-D3(BJ), revPBE-D3(BJ), BLYP-D3(BJ), PBE0-D3(BJ), B3LYP-D3(BJ), and M06-2x [14]. In this assessment, Perdew's metaphorical Jacob's Ladder [22] with five rungs of accuracy order of local spin-density approximation, generalized gradient approximation (GGA), meta-GGA, hybrid GGA/meta-GGA, and double hybrid functionals was discussed. However, all top DFT-V XC functionals recommended [14] were for ground electronic states and their energetic properties rather than spectroscopic properties in ESs. In addition, the training database consisted of relatively small molecules. Figure 1 shows the chemical structure and nomenclature of AG-1478, as well as the optimized geometries of its ground and first ESs using B3LYP/6-311++G(d,p) (figures 1(a)-(c)). Selected geometric properties and dipole moment of AG-1478 calculated using 21 DFT-V XC functionals and the HF scheme are summarized in table S1. All calculations were performed using DFT-V XC /6-311++G(d,p) in the same manner. The TD-HF results were used as a reference for the benchmarking study.
At present, TD-DFT is one of the most prominent approaches for optical studies, especially for ESs of medium-sized and large molecular systems such as AG-1478 [12]. Figure 2 compares the individual performance (i.e. the discrepancies in eV) between the calculated and measured [3]  , and overall error (λ overall = |∆λ ab | + |∆λ em | + |∆λ ss |) of AG-1478 were calculated and presented in table S2. Note the experimental maximum absorption used is the λmax (i.e. λ 2 ) not λ 1 . As seen in figure 2, the performance of a DFT V XC functional depends on the accuracy of the calculated absorption and emission, as well as the Stokes shift. In the present study for practical reasons [3], the Stoke shifts are obtained using the maximum wavelengths (λ max ) of the calculated absorption and emission spectra [23]. Note that at room temperature, AG-1478 can exist in the form of multiple conformers [5].
We also calculated an overall error ∆λ which is the sum of absolute values of all three properties. It is very clear in figure 2 that the accuracy of the DFT-V XC functionals can be measured using the overall error ∆λ because just two inaccurate wavelengths λ em and λ ab may produce an 'accurate' Stokes shift due to the error cancellation. For example, In figure 2 shows that the HF method and LC-B3LYP produce larger errors in both λ em and λ ab but an error in Stokes shift is small due to the cancellation.
To compare the DFT methods, the results obtained using the TD-HF method demonstrate large errors for both absorption and emission wavelengths calculated, as shown in figure 2 and table S2. As indicated by Adamo and Jacquemin that a good agreement of the predicted wavelength in comparison to the experimental measured absorption and emission wavelength is regarded to be an error within 20-25 nm [24]. Figure 2 shows that DFT functionals such as BVP86 and B3PW91 are two extremes of the performance of poor and excellent, respectively. Most of the DFT V XC functionals result in errors in ∆λ ab and ∆λ em in the same direction, i.e. overestimates or underestimates both absorption and emission energies (wavelengths). We note that three DFT-V XC functionals, B3LYP, B3P86, and B3PW91 are among the top performers in terms of calculation of the optical spectra of AG-1478. Interestingly, Becke's three-parameter (B3) with correlation V xc DFT functionals, B3Vc, also produce accurate ES energies (wavelengths) using TD-DFT. For example, the errors in absorption wavelength ∆λ ab using B3LYP, B3PW91 and B3P86 are −0.63 nm, 1.39 nm, and −3.01 nm, respectively, whereas the errors in emission wavelength ∆λ em using the same B3Vc functionals are 7.43 nm, 2.28 nm, and 1.1 nm, accordingly. These B3Vc functionals are reliable tools for the study of the optical properties of the TKIs and therefore the design of new agents with larger Stokes shift for medical image applications.
The success of the Becke three-parameter DFT functionals (B3LYP, B3PW91, and B3P86) has been recognized by earlier studies for small molecules. They are, however, not exactly as the notation stands for, if these functionals are employed in the Gaussian 16 computational chemistry packages [19] for the TD-DFT calculations like the present study. In 1993, Becke proposed three-parameter hybrid functional B3WP91 [25], where a = 0.20; b = 0.72 and c = 0.81. The three parameters a, b, and c were fit to the atomization-energy data, which were determined by Becke via fitting to the G1 molecule set [26]. As a result, the B3PW91 correlation functional has contributions from the VWN functional [27] and c * PW91 functional [28] where c is fit by Becke. In 1994, Frisch and co-workers [29] reworked the B3PW91 in equation (1) using the Lee, Yang, and Parr (LYP) functional [30] for correlation density-functional approximation instead of the PW91 functional. The reworked functional employs the same three parameters as in equation (1), which is known as B3LYP. Similarly, the B3P86 functional was reworked. The Gaussian programs [19] employ the same a, b, and c values for the LYP, P86, and PW91, and introduced the VWN functional III for (VMN III) [27] for the local contribution. As a result, such reworked B3Vc functionals are excellent performers for AG-1478 optical spectra. Any modifications to the 'original' B3LYP functional, such as CAM-B3LYP and B3LYP-D3, largely reduce the accuracy in the emission wavelength (ESs) calculations. The B3LYP functional performs very well for both absorption and emission wavelengths of AG-1478 in DMSO. However, the CAM-B3LYP functional reduces the accuracy in the absorption wavelength (∆λ ab ) calculation from −0.63 to −35.7 nm while the B3LYP-D3 slightly improves the accuracy of absorption wavelength (∆λ ab ) calculation from −0.63 to 8.45 nm. The B3LYP functional is known to provide reasonably accurate results for a wide range of chemical systems but can suffer from some limitations in predicting electronic excitations. As an empirical correction to the B3LYP functional, the CAM correction aims to improve the long-range correction effects. It does not, however, necessarily lead to a better estimation of the excitation energy than B3LYP alone, since the CAM correction does not modify the underlying exchange-correlation functional of B3LYP, which is important to determine the accuracy of the excitation energy. In this case, the use of the CAM correction leads to less accurate results than B3LYP alone.
In the same vein, the Grimme's D3 correction [31] is an empirical dispersion correction to improve the accuracy of intermolecular interactions in molecules. The D3 correction does not directly enter into the estimation of the vertical excitation energy in DFT calculations. The D3 correction can indirectly improve the accuracy of the vertical excitation energy through improvement of GS, since the vertical excitation energy is mainly determined by the choice of exchange-correlation functional, the BS, and other factors that influence the electronic structure of the molecule. The choice of the appropriate method for a particular system will depend on various factors, including the nature of the system, the properties to be studied and the accuracy requirements. The B3LYP DFT functional is the preferred over its derivatives such as CAM-B3LYP and B3LYP-D3 for the calculations of electronic states which produce accurate optical properties, though Grimme's dispersion correction DFT-D3 approach [31] is satisfactorily accurate at practically the same computational cost as pure DFT for small systems [32].
The performance of the DFT-V XC functionals, along with HF, can be grouped into poor (overall error ∆λ > 150 nm), reasonable (50 nm < ∆λ < 150 nm), and excellent (∆λ < 50 nm). The electron correlation energy seems to be important for optical properties and all DFT-V XC functionals include the electron correlation energy in varying degrees. When paired with the same 6-311++G (d, p) base set, the TD-HF shows a very large error of nearly ∆λ ∼ 300 nm. The DFT-Vxc functionals with poor performance include CAM-B3LYP, B97D3, M11, NM11, ωB97XD, B3LYP-D3, B97D, BMK, LC-B3LYP, SOGGA11X, and BVP86 with an overall error of ∆λ > 150 nm. For example, the LC-BLYP is not suitable for studying the optical properties of the TKI because it gives a very large overall error of over, ∆λ > 286 nm. A group of DFT-V XC functionals yielding moderate accuracy includes PBE1PBE, O3LYP, and M06. Many of them produce small absorption (∆λ ab ) errors but poor emission (∆λ em ) ones. As shown in figure 2, almost all orange columns (error in an ES prediction) are larger than the corresponding blue ones (errors in absorption prediction). As shown in figure 2, the top seven (the top seven) DFT-V XC functional performers which accurately produce the optical properties of AG-1478 are B3LYP, PBE1PBE, B3P86, APFD, B3WP91, HSEH1PBE, and N12SX. They can produce the optical properties within a range of 0.1-0.5 eV, which is comparable with the accuracy of high-level correlated approaches such as EOM-CCSD or CASPT2 [12], with the (EOM-CCSD and CASPT2) having significant computational costs.
The top-performing DFT-V XC functionals (top seven) demonstrate small errors of ∆λ < 50 nm and they are shown in figure 3. Our further discussion will concentrate on the best performing DFT-V XC functionals together with HF calculations as a reference. Performance of the DFT-V XC functionals in term of overall error (λ overall = |∆λ ab | + |∆λ em | + |∆λ ss |) for AG-1478 in DMSO (the smaller the better) is given as B3PW91 (4.56 nm) > B3P86 (8.22 nm) > B3LYP (16.12 nm) > HSEH1PBE (16.84 nm) > N12SX (24.04 nm) > AFPD (40.58 nm) > PBE1PBE (52.9 nm). All top seven in figure 3 exhibits that the ∆λ in absorption or ∆λ in emission are below the 20-25 nm criteria of Adamo and Jacquemin [24] with the 'best three' being B3PW91, B3P86 and B3LYP-all with Beck's three parameter for Vxc. The results are in agreement with the recent benchmarking study of DFT-V XC functionals using TD-DFT calculations for the UV-Vis spectral simulation [16] who obtained the following order in terms of accuracy:

The top seven DFT functionals in ground and first ESs
For studying the ground state (GS, S 0 ) and ES properties it is very important to choose the proper calculation method. Within the framework of the TD-DFT, for calculations of the absorption wavelengths, one needs to estimate the ES energies (and related properties) using the optimal GS geometry without exploring the ES potential energy surface. This approach is usually a good approximation for absorption (e.g. UV-vis) calculations [4,16,[33][34][35]. The GS properties include geometry, dipole moment, and electronic transitions (absorption) [36]. If not using the high-level CASSCF/CASPT2 approaches, proper selection of the most suitable DFT-V XC functional(s) depends on the molecular scaffold [37][38][39]. In the case of accurate but time-consuming CASSCF/CASPT2 methods, some studies are limited by the size of the molecular system. For example, for studying the electronic structure properties of the GS and ES of large drugs such as lapatinib, Vayá et al [40] restricted their study to the core scaffold containing only the furan and quinazoline  chromophores because of the high CPU cost of the CASSCF/CASPT2 methods. However, under this approach, the important effects of side groups are not considered.
Optimizations of the GS and the first ES using the DFT-V XC functionals converges to the planar and twisted structures of AG-1478 presented in figures 1(b) and (c), respectively. Table S1 compares the deviations of the calculated geometric perimeters of the phenyl (R 1 ) and the quinazoline rings R 2 and R 3 , together with the C-Cl bond length of AG-1478 in DMSO using all DFT-V XC functionals for the GS and the first ES. The perimeters of the all-carbon rings (R 1 and R 2 ) share some similarities with close perimeters of 8.459 Å and 8.480 Å, in agreement with the earlier study of AG-1478 [4]. The perimeter of the pyrimidine ring (R 3 ) of the quinazoline moiety is smaller in AG-1478 than the all-carbon ring perimeters. ESs often exhibit different geometries from the GS's, which can be attributed to the insurgence of the solvatochromic behavior to the presence of a twisted intermolecular charge transfer (TICT) state that may be accessible upon excitation [41]. In addition, the optimized geometric parameters such as ring perimeters in the first ES of AG-1478 are not the same as their corresponding perimeters in the ground state using the same method. For example, ring perimeters increase by ∼0.034 between GS and ES for B3LYP/6-311++G(d, p) and ∼0.032 for APFD/6-311++G (d, p) (see values in table S1). The C-Cl bond length is 1.769 Å in the GS but shorter (1.747 Å) in the first ES using the same B3LYP/6-311++G (d, p) method. This is in agreement with the B3LYP/6-311G(d, p) and the experimental study of flufenpyr and amipizone [42]. Table 1 compares the results obtained from the optimized GS (S 0 ) and the first ES (S 1 ) using the top seven DFT-V XC functionals. The results from the DFT and HF calculations indicate that the GS of AG-1478 in DMSO is a planar configuration, in agreement with earlier studies [4]. The dihedral angle, δ = ∠C (10) -C (4) -NH-C (3 ′ ) (refer to figure 1) which determines the relative orientation of the quinazoline and chloro-phenyl planes is nearly planar and equal to 180 • , with the dipole moment being about 8.0 D, except for HF (7.31 D). The calculated maximum absorption wavelength λ ab is close to the experimental absorption wavelength of 332 nm except for the HF of (244.31 nm). Note that the calculations are only based on the global minimum structure of AG-1478, while it is known that other conformers can be populated under room temperature [4,5]. Table 1 shows data for the first singlet ES (S 1 ) of AG-1478 using the top seven DFT-Vxc functional. The optimized structure of the ES is a twisted configuration (see figure 1(c)). Dihedral angle δ = ∠C (10) -C (4) -NH-C (3 ′ ) in the ES is approximately −96 • . The same angle from the HF calculations is larger (at δ = −153.5 • ) but the APFD yields a smaller value (δ = −79.01 • ). The twisted configuration of the ES of AG-1478 leads to the decrease of the dipole moment, from 7-8 D in the GS to 6-7 D in the first ES. Figure 4 illustrates alignments of all optimized ES structures of AG-1478 using the top-performing DFT-V XC functionals. The states relevant for the absorption and emission are the two lowest singlet states based on the ground state (S0) and first ES (S 1 ) geometries of AG-1478 in figures 1(b) and (c). The major configurations of absorption and emission are dominated by transitions between the highest occupied molecular orbital (HOMO, MO82a) and the lowest unoccupied molecular orbital (LUMO, 83a), i.e. the HOMO → LUMO transitions. Figure 5 reports the absorption wavelength λ ab and emission wavelength λ em of AG-1478, respectively, between the orbitals, HOMO (82a) and (LUMO,83a) of AG-1478 in DMSO. The absorption transition of HOMO-LUMO represents a local excitation (LE) on the chromone rings, with minor charge transfer (CT) on the Cl atom. However, as shown in figure 5(b), the emission can be characterized as a TICT in which CTs from the phenyl to the quinazoline chromone, in agreement with a study of fluorescence spectra of isoflavones by Beyhan et al [8] and a study of ESs of 4-(N, N-Dimethylamino)-4 ′ -nitrostilbene [43]. Table S3 collects frontier molecular orbitals which are responsible to major transitions calculated using other DFT methods.
The lowest ES is often the most photobiological relevant state for biomolecules. Low ESs are less likely with charge-transfer problems and without significant Rydberg character so that range-separated functionals are not generally necessary [7]. To understand the experimentally observed maximum emission wavelength at 475.5 nm (2.607 eV) [4] of AG-1478 in more detail, we consider the first singlet ES. The corresponding adiabatic excitation energies and ES dipole moments were calculated using the top-performing DFT-V XC functionals and are given in table 2, together with a dihedral angle δ between the rings. Relaxation of the lowest excited singlet states is always accompanied by a significant rotation toward planarity around the chromone-phenyl bond relative to the ground state structure. The largest change thereby occurs in the TICT state, with the dihedral angle of only δ being only ≈−96 • at its equilibrium geometries. According to the measured spectra of AG-1478 [3], the absorptions (λ 1 and λ 2 ) occur at 346 nm and 332 nm in DMSO, respectively, and the emission occurs at 475.5 nm. Table 2 shows the maximum absorption and emission wavelengths, the Stokes shifts, and the overall error of the top DFT functionals. Some DFT functionals produce accurate absorption wavelengths, some are good for the emission wavelengths, some are accurate for Stokes shifts, and some achieve overall small errors. For example, the top three DFT functional for absorption wavelengths are B3LYP (−0.63 nm), N12SX (0.81 nm), and B3PW91 (1.39 nm). The top three DFT functional for prediction of the emission wavelengths are not necessarily the same as absorption but B3P86 (1.10 nm), B3PW91 (2.28 nm), and HSEH1PBE (−6.10 nm). As for the Stokes shifts, the top three performers are B3PW91 (0.89 nm), B3P86 (4.11 nm), and B3LYP (8.06 nm). The top three DFT functionals in terms of accuracy (the smaller the better) are B3PW91 (4.56 nm), B3P86 (8.22 nm), and B3LYP (16.12 nm). Thus, B3PW91 functional is the top overall performer for prediction of the optical properties including absorption, emission, and Stokes shifts. The B3LYP is excellent for absorption calculations. B3P86 is the best for emission calculations. Finally, the significantly  The Stokes shift was calculated as the difference between the maximum emission (λ em(max) ) and maximum absorption (λ ab(max) ). This experimental value is 475.5-332 = 143.5 nm. Note that the experiment Stokes shift is calculated using the first absorption (λ1) and maximum emission (λem) [3].
large overall error of TD-HF in table 2 indicate that electron correlation energy may play a more important role in optical spectral simulation than electron exchange energy for this EGFR inhibitor system and possibly this class of 4-quinazolinamine TKIs.

Conclusion
The TD-DFT method was used to benchmark the performance of 21 DFT functionals and the HF method with respect to the measured optical properties of EGFR-TKI AG-1478 in DMSO. When combined with the 6-311++G(d, p) BS, the top seven DFT-Vxc functionals, including B3PW91, B3P86, B3LYP, HSEH1PBE, N12SX, APFD, and PBE1PBE, were determined to be the most reliable for reproducing the experimental optical properties of AG-1478. Among the Becke's three-parameter functionals, B3PW91, B3LYP, and B3P86 were identified as the best three for this purpose. Although B3LYP was part of the best three, any corrections to it, such as CAM-B3LYP, LC-B3LYP, and B3LYP-D3, resulted in larger errors in the optical properties of AG-1478 in DMSO. Specifically, B3PW91 was recommended for studying the optical properties of 4-quinazolinamine TKIs, B3LYP was found to be excellent for absorption spectrum, while B3P86 was identified as the best for emission spectrum. These DFT-Vxc functionals are reliable tools for studying the optical properties of TKIs and can aid in the design of new agents with larger Stokes shift for medical imaging applications. The electron correlation energy may play a more important role in optical spectral simulation than electron exchange energy for this EGFR inhibitor system and possibly this class of 4-quinazolinamine TKIs. Considerable efforts have been directed towards enhancing and devising new V XC functionals to overcome the known limitations of TD-DFT [12]. However, a significant question that remains is whether an approximate V XC functional can accurately describe both the ground state and ESs. As different ESs can possess vastly different electronic structures, it is unlikely that a single approximate V XC functional can capture all of them for all classes of molecules. The current benchmarking study illustrates that some DFT functionals may be precise for the ground electronic state but not for the lowest ES or vice versa [12]. Hence, it is crucial to benchmark DFT functionals for a particular class of molecules and the desired properties before undertaking production calculations.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).