Quantitative modeling of EGF receptor ligand discrimination via internalization proofreading

The epidermal growth factor receptor (EGFR) is a central regulator of cell physiology that is stimulated by multiple distinct ligands. Although ligands bind to EGFR while the receptor is exposed on the plasma membrane, EGFR incorporation into endosomes following receptor internalization is an important aspect of EGFR signaling, with EGFR internalization behavior dependent upon the type of ligand bound. We develop quantitative modeling for EGFR recruitment to and internalization from clathrin domains, focusing on how internalization competes with ligand unbinding from EGFR. We develop two model versions: a kinetic model with EGFR behavior described as transitions between discrete states and a spatial model with EGFR diffusion to circular clathrin domains. We find that a combination of spatial and kinetic proofreading leads to enhanced EGFR internalization ratios in comparison to unbinding differences between ligand types. Various stages of the EGFR internalization process, including recruitment to and internalization from clathrin domains, modulate the internalization differences between receptors bound to different ligands. Our results indicate that following ligand binding, EGFR may encounter multiple clathrin domains before successful recruitment and internalization. The quantitative modeling we have developed describes competition between EGFR internalization and ligand unbinding and the resulting proofreading.


Introduction
Receptor proteins on the plasma membrane enable signals, carried by extracellular molecules (ligands) that bind to the receptor, to be transduced to the cell interior [1]. The epidermal growth factor receptor (EGFR) is a receptor tyrosine kinase that mediates activation of a range of signaling pathways that regulate many aspects of cell physiology [2][3][4]. EGFR also plays a key role for tumor progression, representing a drug target and a drug resistance challenge for many cancers [5][6][7].
Ligands bind to EGFR on the plasma membrane, with recent work suggesting that EGFR ligand binding is favored for receptors localized to tetraspanin nanodomains [8]. While unliganded EGFR can form dimers, ligand-bound EGFR form distinct dimers and oligomers, which play an important role in EGFR signaling [9,10]. Phosphorylation of intracellular EGFR domains, while EGFR is on the plasma membrane, is a key step for signal transduction [11][12][13].
Ligand-bound EGFR can be recruited to a clathrin domain or stabilize a nascent clathrin domain [8,[14][15][16][17]. EGFR can be endocytosed upon generation of membrane curvature to form clathrin-coated pits, a process which can lead to formation of intracellular vesicles [18,19]. While EGFR may be internalized via other pathways, under a range of conditions clathrin-mediated endocytosis is the primary internalization pathway for epidermal growth factor (EGF) receptors at lower physiological EGF concentrations [18,19]. EGFR signaling occurs both from EGFR on the plasma membrane and in endosomes following endocytosis [19][20][21][22][23][24]. From endosomes, receptors are either transported to lysosomes for degradation or recycled back to the plasma membrane [25].
EGFR internalization from the plasma membrane into endosomes leads to EGFR signaling modes that are distinct from EGFR signaling from the plasma membrane [18][19][20][22][23][24]. As signaling activity of EGFR in endosomes is lost when the ligand dissociates [23], ligand binding is important for endosome-based EGFR signaling.
There has been substantial effort to understand the mechanistic underpinnings of how different ligands lead to different signaling pathway activation and cellular responses. For example, different ligands lead to distinct EGFR dimers with varying interaction strength and lifetime [26] and different degrees of EGFR trafficking to the nucleus [34]. We will focus on ligand-specific differences in receptor internalization into endosomes. HB-EGF and BTC lead to high internalization, EGF and TGF-α to somewhat lower internalization, and EPGN and AREG to the lowest internalization [25]. Internalized EGFR can be recycled to the plasma membrane or lysosomally degraded, with different ligands more likely to induce EGFR degradation: EGF, HB-EGF, and BTC cause substantial degradation and TGF-α, EREG, and AREG largely induce recycling to the plasma membrane [25,29]. Degradation and intracellular retention of EGFR align with signaling persistence: TGF-α, EPGN, and AREG cause persistent signaling as the receptors to which they bind are typically recycled to the plasma membrane, while EGF, HB-EGF, and BTC cause more transient signaling as they lead to EGFR degradation, lowering plasma membrane EGFR levels until replenished by protein synthesis [25]. The likelihood of receptor degradation with different ligands also largely aligns with internalization differences [19], ligand dissociation rates at endosomal pH (slower dissociation leads to more degradation) [25,35], and ubiquitination and phosphorylation persistence on EGFR in endosomes (persistent modification leads to more degradation) [25].
As EGFR internalization into endosomes facilitates important signaling modes distinct from EGFR signaling modes from the plasma membrane and different EGFR ligands induce varying degrees of EGFR internalization, we explore a possible mechanism behind the variation in EGFR internalization due to binding of different ligands.
'Kinetic proofreading' describes a class of mechanisms that use nonequilibrium conditions to reduce the error rates or enhance discrimination between stimuli for various cellular and biochemical processes. While initially envisioned to increase accuracy of 'reading' or copying processes such as DNA replication or protein synthesis [36,37], kinetic proofreading principles are understood to apply to a broader class of phenomena, particularly signaling and recognition processes [38][39][40][41] such as those of receptors on the plasma membrane. To improve copying or signaling fidelity, these kinetic proofreading processes utilize multiple stages and repeated cycles to enhance the difference between 'right' and 'wrong' substrates. With multiple stages each discriminating between the substrate types, the combined effect is to enhance the overall difference in outcome between substrate types beyond the performance of an equilibrium process.
Recently, a distinct spatial proofreading mechanism was proposed [42], with the diffusive search time between a binding event at one location and a reaction event at another location allowing discrimination between ligands with different unbinding rates. For sufficiently distinct unbinding rates and an appropriate timescale for diffusive search, a receptor stimulated by a slower-unbinding ligand can be far more likely than a receptor stimulated by a faster-unbinding ligand to reach the final reaction location before ligand unbinding.
We propose that the EGFR internalization process following ligand binding, including distinct signaling modes due to interaction with clathrin-coated pits and endosomes, applies the principles of kinetic and spatial proofreading to improve ligand discrimination. We develop two related quantitative models for EGFR dynamics, one solely kinetic and another involving receptor diffusion on the plasma membrane. We find that the internalization probability ratio for EGFR stimulated by different EGFR ligands exceeds the unbinding rate ratio between the ligands for physiological parameters. This increased differentiation between EGFR ligands enables greater ligand specificity in EGFR signaling activity.

Kinetic model
The kinetic EGF receptor model considers receptors immediately following ligand binding until the receptor is internalized into the cell during  endocytosis of a clathrin domain or the ligand unbinds from the receptor. A kinetic version of this model is shown in figure 1(A), with k c the rate of a ligand-bound receptor R L entering a clathrin domain, k −c the rate of a clathrin-localized receptor R C exiting a clathrin domain, k i the rate of a clathrin-localized receptor R C internalizing via endocytosis, and k u the rate of a ligand unbinding from a receptor (inside R C or outside R L a clathrin domain). Receptors are initiated in the ligand-bound state outside of clathrin domains following recent work suggesting ligands primarily bind to EGFR within tetraspanin domains [8].
For a receptor beginning in state R L , the probability P i that a receptor is internalized is (see appendix for derivation) We estimate the clathrin entry rate k c ≈ 0.1 s −1 , which is substantially slower than the estimated diffusion-limited entry rate [43] of ≈1 s −1 (for EGFR with diffusivity 0.2 µm 2 s −1 [8,9,[44][45][46][47] to clathrin domains of 50 nm radius representing 1% of the plasma membrane surface [48]). k −c ≈ 0.05 s −1 and k i ≈ 0.05 s −1 are estimated from clathrin domain dynamics [48]. See the appendix for the details of these parameter estimates from experimental observations.

Proofreading
In figure 2(A) the internalization probability P i is shown as the ligand unbinding rate k u , which will be distinct for different ligands, is varied. Figure 2 shows P i for the estimated EGFR parameters and with each parameter k c , k −c , or k i increased or decreased relative to the estimated parameters by an order of magnitude. At low k u the internalization probability is flat at P i ≈ 1, as the unbinding rate is too small to compete with internalization. As the unbinding rate increases, the internalization probability decreases, eventually reaching a limiting power law for high k u of P i ≃ k i k c /k 2 u . This power law, reached only for sufficiently high unbinding rates, corresponds to receptors that effectively make a single attempt to enter a clathrin domain and then internalize or have the ligand unbind, without sufficient time for another internalization opportunity. EGFR internalization switches from a nearly certain event for low k u to a rare event for k u that are two to three orders of magnitude larger.
EGFR binding affinity varies for other ligands. EREG has a weak EGFR affinity compared to other ligands [27], with at least 10× weaker affinity for EGFR compared to EGF [26,27]. As EGFR ligands have the same binding motif [55], use similar binding modes [56], and ligand affinities can be accurately ranked by ligand-receptor interaction energies [56], we assume that binding rate constants are similar for different EGFR ligands and that unbinding rate differences are proportional to the affinity differences. Thus we take the EREG unbinding rate to be 10× faster than that of EGF, k u,EREG = 0.5 s −1 . This corresponds to a receptor internalization probability of approximately 1.4% (orange circle in figure 2(A)), which is approximately 20× less internalization than receptors bound to EGF.
Proofreading often describes the ratio between a 'right' and a 'wrong' product or signal. However, distinctions in EGFR response to stimulation by different ligands such as EGF and EREG do not clearly fit into categories of 'right' and 'wrong' as these EGFR ligands each send meaningful signals to a cell via EGFR. We will instead treat EGFR internalization as a signaling aspect with a propensity that differs between ligand types. Variation of internalization propensity between ligands, combined with distinct EGFR signaling from endosomes following receptor internalization compared to EGFR signaling from the plasma membrane [18][19][20][22][23][24], could serve as one mode to assist cells to distinguish between the signals mediated by different EGFR ligands and initiate the appropriate response. In this way EGFR internalization proofreading is similar to proofreading during chromatin remodeling, which involves targeting of nucleosomes with particular characteristics by different remodeler families and lacks clear categories of 'right' and 'wrong' targets [57].
We calculate the fidelity [42], or ratio of receptor internalization probabilities for different ligand types, as a measure of the degree to which EGFR internalization can distinguish between stimulation by different ligand types. The EGFR internalization fidelity between a ligand with unbinding rate k u and a ligand with unbinding rate k ′ u is then If the ratio between the unbinding rates is k ′ u /k u = A, then In figure 2(B) the fidelity is shown as the unbinding rate k u is varied for unbinding rate ratio A = 10, corresponding to the estimated ratio between unbinding rates of EGF and EREG ligands. For low k u , the fidelity approaches unity, as binding of both ligand types leads to near-certain receptor internalization. As the unbinding rate k u increases, the fidelity also increases, as the receptor with lower unbinding rate k u becomes internalized more frequently than the receptor with higher unbinding rate k ′ u . The fidelity in equation (4) saturates at A 2 = 10 2 as the unbinding rate becomes high, corresponding to the internalization probability for both ligands described by the limiting power law in figure 2(A). The discrimination between different ligands, as quantified by the fidelity, is bounded from above by (k ′ u /k u ) 2 . With k u,EGF = 0.05 s −1 and k u,EREG = 0.5 s −1 , the fidelity of EGF binding and internalization relative to EREG is approximately 20 (black circle in figure 2(B)), well below the saturated fidelity of 100. Receptors bound to EGF are approximately 20× more likely to be internalized than receptors bound to EREG. This fidelity range leaves ligand internalization sensitive to changes in EGFR dynamics, as illustrated in figure 2(B). Adjusting the rate of clathrin recruitment k c or internalization k i will modestly change fidelity, with an order of magnitude decrease in either of these parameters nearly doubling the fidelity to approximately 35 and an order of magnitude increase in either of these parameters reducing the fidelity by almost two-thirds to approximately 8. The fidelity is less sensitive to the clathrin domain exit rate of receptors, k −c , with an order of magnitude increase in k −c reducing the fidelity by approximately one-quarter, and an order of magnitude decrease in k −c leaving the fidelity essentially unchanged. Fidelity = A, the level at which fidelity matches the unbinding rate ratio, for k u = √ k c k i /A. Thus the clathrin entry rate k c , internalization rate k i , and ratio of unbinding rates A control the unbinding rate at which the fidelity exceeds the unbinding rate ratio.
Figures 2(C) and (D) show the tradeoff between EGFR ligand fidelity and internalization probability, similar to previously noted tradeoffs [38][39][40][41][42]. The low fidelity limit for a given internalization probability occurs for a strongly two-stage and single-clathrinentry process (k i , k c ≫ k −c ; k c = k i ), For the low fidelity limit, the internalization probability P i and fidelity are controlled by a single parameter combination k u /k c once the unbinding rate ratio A is selected. The high fidelity limit for a given internalization probability occurs for multiple cycles into and out of clathrin and high internalization (k c , For the high fidelity limit, the internalization probability P i and fidelity are controlled by a single parameter combination k −c k u /(k c k i ) once the unbinding rate ratio A is selected. The high fidelity limit indicates that for fidelity to increase, the receptor internalization probability must decrease, as some EGFR must fail to internalize for stimulation by different ligands to become substantially distinct. A high fidelity for a given internalization probability may be desirable, as it provides a large distinction between the signaling activity due to the two ligands for a given number of receptors internalized. Figure 2(C) shows that the estimated EGFR parameters for EGF vs EREG binding are near but meaningfully below the high fidelity limit, and that reducing the clathrin exit rate k −c appears to push the fidelity quite close to the high fidelity limit.

Clathrin domain entry
The internalization probability and the fidelity arise from considering receptor visits to clathrin domains. The probability of zero clathrin visits is P v (0) = k u /(k u + k c ) while the probability of n ⩾ 1 clathrin visits is EGF-bound receptors average 0.73 visits to a clathrin domain and EREG-bound receptors average 0.16 visits. While the number of visits to clathrin domains before ligand unbinding or internalization are substantially affected by variation in EGFR dynamics for low unbinding rate k u (figure 3(A)), there is more limited parameter sensitivity in the neighborhood of the EGF unbinding rate k u,EGF = 0.05 s −1 (black circle in figure 3(A)) and the EREG unbinding rate k u,EGF = 0.5 s −1 (orange circle in figure 3(A)). In this neighborhood, the clathrin recruitment rate k c meaningfully impacts the number of visits to a clathrin domain, with an order of magnitude increase in k c increasing the number of clathrin domain visits by 1.85 × (EGF) and 4.4 × (EREG) and an order of magnitude decrease in k c decreasing the number of clathrin domain visits by 5.9 × (EGF) and 8.7 × (EREG).
The inset of figure 3(A) shows the fraction of receptors undergoing zero, one, two, and three clathrin domain visits, for EGF and EREG unbinding rates. For both EGF and EREG, it is rare for a receptor to undergo more than three distinct visits to a clathrin domain before receptor internalization or ligand unbinding. The inset of figure 3(A) also shows that EGF-bound receptors mostly visit clathrin domains zero times, once, or twice, with almost half visiting once; while EREG-bound receptors mostly visit clathrin domains zero times or once, with more than 80% not visiting clathrin domains.
Changes to the unbinding rate also impact the time until either receptor internalization or ligand unbinding, shown as the mean survival time in figure 3(B). Similar to the internalization probability (figure 2(A)), the mean survival time is flatter for low unbinding rate k u and then decreases before reaching a limiting power law for large k u . EGF-bound receptors have a mean survival time of approximately 14 s (figure 3(B), black circle) and EREG-bound receptors have a mean survival time of approximately 2 s (orange circle). While at high k u the internalization probability (figure 2(A)) depends on the parameters describing EGFR dynamics and at low k u the internalization probability converges independent of parameters (converging because P i cannot exceed one), the mean survival time (figure 3(B)) conversely depends on parameters at low k u and converges independent of parameters for high k u . This convergence at high k u occurs because unbinding becomes dominant over internalization, and the mean survival time becomes approximately 1/k u . Figure 3(C) shows the time evolution following ligand binding of the occupation probability of ligand-bound (R L ) and clathrin-localized (R C ) receptor states (see appendix for equations), for both EGF and EREG binding. Consistent with the faster unbinding and lower survival time for EREG-bound receptors compared to EGF-bound receptors, the probability of occupying either state decreases rapidly with time for EREG-bound receptors. The peak in clathrin-localized EREG-bound receptors reaches approximately 6% of receptors at 1-2 s, and decays to below 1% after 8 s. In contrast, many more EGFbound receptors become and remain localized to clathrin domains, with a peak of approximately 26% of receptors occurring at 7-8 s, and decaying to below 1% after more than 50 s.

Spatial model
In addition to behavior of the kinetic model of EGFR clathrin association, internalization, and ligand unbinding described above, we explore a twodimensional (2D) spatial model ( figure 1(B)). In this model, receptors are initiated in a ligand-bound state, diffusing with diffusivity D = 0.2 µm 2 s −1 to encounter randomly distributed (but not overlapping) circular clathrin domains. Energy barriers limit entry into clathrin domains, from a receptor attempting to diffuse onto a clathrin domain site from outside; and exit from clathrin domains, from a receptor attempting to diffuse onto a site outside a clathrin domain from a site inside. Receptors diffuse and enter and exit clathrin domains until the receptor is internalized within a clathrin domain or the ligand unbinds.
Experiments suggest that EGFR can become localized to a clathrin domain either through induction or stabilization of a nascent clathrin domain, or by recruitment to an existing clathrin domain [14][15][16][17]. We treat both of these cases as a diffusive encounter with a circular clathrin domain, as EGFR induction or stabilization of a nascent clathrin domain will involve diffusive search by the receptor for clathrin domain components. For diffusive search by EGFR for an existing domain, which most closely matches this spatial model, the domain entry energy barrier is adjusted to match the clathrin entry rate estimated above (see next section). EGFR exit from a clathrin domain is often due to clathrin domain disassembly [48,58,59]. Accordingly, when an EGFR exits a clathrin domain in the model (crossing an energy barrier with height adjusted to match the clathrin exit rate estimated above; see next section) the clathrin domain is moved to another location. See appendix for further simulation details.

Spatial model parameters
While the internalization rate k i and ligand unbinding rate k u in the spatial model and simulation directly correspond to these same parameters in the kinetic model, the clathrin domain entry rate k c and exit rate k −c in the kinetic model do not have direct correspondence in the spatial model. With the spatial model, k c is controlled by the clathrin domain entry energy barrier ∆E enter , clathrin domain radius r c , clathrin domain concentration c c , and receptor diffusivity D. k −c is controlled by the clathrin domain exit energy barrier ∆E exit , clathrin domain radius r c , and receptor diffusivity D.
The effective entry rate into the clathrin domain k c from the kinetic model is represented in the spatial model by diffusive search for the clathrin domains followed by crossing the entry energy barrier ∆E enter . Diffusive search is determined by the clathrin domain radius r c , clathrin domain concentration c c , and receptor diffusivity D. By assuming that the time to enter the clathrin domain is the sum of an equilibration time period for even distribution on the plasma membrane and a time period for equilibrated receptors that are adjacent to the clathrin domain to overcome the entry energy barrier, we estimate (see appendix for derivation details) the mean time to enter as with d EGFR = 10 nm the EGFR diameter. k B is Boltzmann's constant and T is absolute temperature, together representing a thermal energy scale.
In the simulation, changing ∆E enter in turn changes the mean time to enter a clathrin domain, shown in figure 4(A). While not an exact match, the ⟨t enter ⟩ estimate (equation (8)) has a similar form and magnitude to the simulation domain entry times. ∆E enter = 0 corresponds to diffusion-limited entry of receptors into clathrin domains, where any receptor that reaches a clathrin domain will enter. A mean clathrin domain entry time of 1.47 s (first entry of each trajectory) and 1.71 s (all entries of each trajectory) occurs for ∆E enter = 0, the diffusionlimited rate. This is larger than the diffusion-limited search time estimated above for the kinetic model of ≈1 s, showing the random distribution of clathrin domains in the spatial model leads to somewhat larger diffusion-limited search times. For low ∆E enter ≲ 3k B T the time to enter a clathrin domain is largely the diffusive encounter time (first term in equation (8)) as there is a high chance of a receptor entering an encountered clathrin domain. For high ∆E enter ≳ 4k B T, the time to enter a clathrin domain is largely controlled by the time spent near clathrin domains over time and the difficulty of entering a clathrin domain-for figure 4(A), this is indicated by the simulated mean time to enter a clathrin domain ∼e ∆Eenter/(kBT) (second term in equation (8)).
In figure 4(A), ∆E enter ≃ 4.4k B T leads to a clathrin domain entry time of 10 s (corresponding to k c = 0.1 s −1 from the kinetic model). We will use ∆E enter = 4.4k B T hereafter, unless otherwise stated. Figure 4(B) shows that the simulated mean time to enter a clathrin domain is inversely proportional to the domain radius, ∼1/r c , as expected from equation (8)  The effective exit rate from clathrin domains, k −c from the kinetic model, is represented in the spatial model by diffusive search for the clathrin domain boundary from within followed by crossing the exit energy barrier ∆E exit . Diffusive search is determined by the clathrin domain radius r c and receptor diffusivity D. The energy barrier ∆E exit , together with the diffusive search parameters, determine k −c for the spatial model.
The kinetic model, with parameters estimated from experimental measurements, has a rate of clathrin domain exit k −c = 0.05 s −1 . For free diffusion in two dimensions, ⟨r 2 ⟩ = 4Dt, so for a clathrin domain radius r c = 0.05 µm and EGF receptor diffusivity D = 0.2 µm 2 s −1 , the typical time to reach the clathrin boundary from within is ≲0.01 s, which is much shorter than the typical 20 s to exit a clathrin domain. By assuming that the receptor location has equilibrated within the domain and the exit time is for the receptors adjacent to the clathrin domain boundary to overcome the exit energy barrier, we estimate (see appendix for details) the mean time to exit as In the simulation of the spatial model, increasing ∆E exit (figure 4(D)) and increasing the clathrin domain radius r c (figure 4(E)) each increase the mean time to exit a clathrin domain. While the clathrin domain exit time estimate (equation (9)) does not exactly match the spatial model simulation (exceeding the simulation by up to 45%), the estimate is of similar order of magnitude and matches the simulation data trend of mean clathrin domain exit time proportionality to exit barrier Boltzmann factor e ∆E exit /(kBT) and clathrin domain radius r c . Figure 4(D) shows that for a clathrin domain radius r c = 0.05 µm, as estimated above, the kinetic model clathrin domain exit rate k −c = 0.05 s −1 is achieved for an exit barrier ∆E ≃ 10k B T. We will use ∆E exit = 10k B T hereafter unless otherwise stated.

Clathrin domain entry
We now examine in more detail the entry of receptors into clathrin domains. Figure 5(A) shows the probability density of times t to enter a clathrin domain, for receptors with ligands that do not unbind (k u → 0). The time distributions appear approximately exponential at most times, ∼ exp(−t/τ ), consistent with previous diffusive search and reaction time densities [60]. Adding the clathrin domain entry energy barrier of 4.4k B T increases the time constant τ by approximately a factor of five.
The time distribution to enter a clathrin domain on the first instance (after starting at a random position outside of clathrin domains) is very similar to the distribution to enter after exiting another clathrin domain. With clathrin domains moving after a receptor exits (see spatial model description in appendix), the random initial position and clathrin domain exit position appear to have sufficiently similar distributions to lead to the similar clathrin entry time distributions. As all three distributions are exponential, scaling the first entry distributions with and without an energy barrier, and the repeated entry distribution, by their mean times leads to collapse of the clathrin entry time distributions (figure 5(A) inset).  For unbinding rates corresponding to EGF (k u,EGF = 0.05 s −1 ) and EREG (k u,EREG = 0.5 s −1 ), figure 5(C) shows the number of clathrin domains that a receptor enters before ligand unbinding or internalization. The number of clathrin entries is very similar to those in the inset of figure 3(A) for the kinetic model, with EGF-bound receptors mostly visiting a clathrin domain zero times, once, or twice; and EREG-bound receptors mostly visiting clathrin domains zero times or once.
We also examined the number of distinct clathrin domains that a receptor attempts to enter, as a representation of how many clathrin domains a receptor will encounter, which will be related to the length of the diffusive search before a receptor successfully enters a clathrin domain. An entry attempt is considered distinct if for a different clathrin domain than the previous attempt, or if the receptor successfully entered a clathrin domain between attempts. EGFbound receptors undergo a mean of approximately 5.1 clathrin entry attempts before ligand unbinding or internalization, while EREG-bound receptors undergo approximately 1.2 clathrin entry attempts. Figure 5(D) shows the distribution of the number of attempts. EGF-bound receptors rarely (<10%) have EGF unbind prior to attempting clathrin domain entry, with approximately 30% of EGF-bound receptors attempting entry to 1-2 clathrin domains, a further 30% attempting entry to 3-6 clathrin domains, and nearly 30% attempting entry to ⩾7 clathrin domains. In contrast, approximately 40% of EREGbound receptors have EREG unbind prior to attempting to enter a clathrin domain, with nearly half of EREG-bound receptors attempting entry to 1-2 clathrin domains, approximately 10% attempting entry to 3-4 clathrin domains, and <5% attempting entry to ⩾5 clathrin domains.

Proofreading
We now explore proofreading with the spatial model. Specifically, we compare the internalization probability P i vs ligand unbinding rate k u relationship (as in figure 2(A)) for the spatial model to the kinetic model with the same parameter values. If this relationship is similar for the spatial and kinetic models, then the fidelity-unbinding rate relationship will be similar and there will be similar proofreading behavior. Overall, we find that the internalization probabilities for the kinetic and spatial models are nearly identical (figure 6), indicating that the kinetic proofreading behavior discussed above for the kinetic model is also reflected in spatial model behavior.
In figure 6(A) the clathrin entry energy barrier ∆E enter is adjusted to vary the clathrin domain entry rate k c . For the spatial model, k c cannot exceed approximately 0.58 s −1 , because this is the diffusionlimited limit for clathrin domain encounter, as the shortest mean time to enter a clathrin domain for ∆E enter = 0 is approximately 1/(0.58 s −1 ) ≃ 1.7 s. In figure 6(B), k c is varied by adjusting the clathrin domain radius r c ; and in figure 6(C), k c is varied by adjusting the clathrin domain concentration c c . The adjustment of k c via ∆E enter , r c , and c c applies calibration from figures 4(A)-(C).
In figure 6(D) the clathrin exit barrier ∆E exit is adjusted to vary the clathrin domain exit rate k −c . The adjustment of k −c via ∆E exit applies calibration from figure 4(D). In figure 6(E) receptor internalization rate in clathrin domains k i is directly adjusted, without any calibration needed.

Discussion
We have investigated, using quantitative physical modeling, EGFR internalization differences when stimulated by ligands with different receptor binding lifetimes. Discrimination of EGFR internalization modestly enhances underlying differences in ligand binding lifetimes. The results of the two models we develop, which we term kinetic and spatial models, are in agreement. Discrimination in EGFR internalization following ligand binding is a form of proofreading, combining the well-established kinetic proofreading [38][39][40][41] with the more recently proposed spatial proofreading [42]. For this model of proofreading, EGFR internalization is the irreversible step of the mechanism. Different EGFR ligands lead to distinct cellular responses [28][29][30][31] and EGFR signaling from endosomes following receptor internalization is distinct from EGFR signaling from the plasma membrane [18][19][20][22][23][24]. This work shows that the internalization pathway of EGFR favors internalization and thus endosome-based EGFR signaling for ligands with longer binding lifetimes beyond what is expected from ligand unbinding rate ratios alone, demonstrating that internalization and associated signaling differences between EGFR ligands can be part of how a cell distinguishes between stimulation by different EGFR ligands.
EGFR internalization following ligand binding combines kinetic proofreading and spatial proofreading. The first stage of EGFR internalization proofreading is spatial, as EGFR diffusively searches for a clathrin domain while ligand unbinding can occur. The second stage is kinetic, as the EGFR within a clathrin domain can be internalized Table 1. Correspondence of parameter values between kinetic and spatial models in figure 6.

Parameter variation in figure 6
Kinetic model parameter Spatial model parameter In figure 6 Although the first stage involves a diffusive search for a clathrin domain, the first stage is not diffusion limited-our spatial modeling suggests that EGFR typically encounters several distinct clathrin domains before successful recruitment into a clathrin domain, therefore often requiring a diffusive search comparable to or longer than the mean ligand binding lifetime. The kinetic description of EGFR internalization proofreading ( figure 1(A)) involves two stages, and is similar to the Hopfield description of two-stage kinetic proofreading between two substrates with an energy penalty for one substrate relative to the other [36,40]. Initial substrate binding to form the first intermediate complex corresponds to ligand binding, the transition to the second intermediate complex from the first corresponds to recruitment to a clathrin domain, and the irreversible formation of the final product corresponds to internalization. Substrate unbinding from either intermediate state (generally assumed in the Hopfield description to have different rates from each complex) corresponds to ligand unbinding (which is described here with a single rate across states for EGFR), and returning to the first intermediate complex from the second corresponds to EGFR exiting a clathrin domain to diffuse on the plasma membrane. The Hopfield description energy penalty ∆E corresponds to the ratio of ligand unbinding rates A = k ′ u /k u (k ′ u > k u ), with A equivalent to the Boltzmann factor of the energy penalty e ∆E/(kBT) . For Hopfield two-stage kinetic proofreading, the ratio of the final products between the two substrates cannot exceed e 2∆E/(kBT) [40], and we equivalently find that EGFR internalization of long-relative to short-lifetime ligands cannot exceed A 2 .
There is a tradeoff between fidelity of internalization between two ligands and internalization probability, with a maximum fidelity for a given internalization probability, subject to the ratio of unbinding rates between the two ligands. This is similar to other performance tradeoffs in proofreading and other nonequilibrium processes, such as between speed and accuracy [61,62]. We find that the fidelity of internalization between EGF and EREG binding approaches but is meaningfully below the maximum fidelity for a given internalization probability. While decreasing the rate at which EGFR exits clathrin domains can push the fidelity quite close to the upper fidelity limit, this may be a factor that is difficult for the cell to adjust, or constrained by how changes to clathrin domain dynamics may impact other processes.
Recent spatial proofreading work [42] considered a scenario where a complex with a bound substrate would undergo a diffusive search for a location facilitating catalysis, with an associated distribution of arrival times following substrate binding. Our spatial modeling suggests that EGFR typically undergoes multiple attempts to enter a clathrin domain before succeeding, instead following the exponential distribution of search times of a constant-rate or Poisson process.
EGFR internalization into endosomes facilitates important modes of signaling distinct from EGFR on the plasma membrane [18][19][20][22][23][24], and ligand binding is important for this endosome-based EGFR signaling [25,35]. For physiological EGFR parameters, and comparing EGF to EREG (approximately 10× faster unbinding than EGF) stimulation we find EGF binding is approximately 20× more likely to lead to EGFR internalization compared to EREG, modestly enhancing the unbinding rate difference.
However, the exact ratio between EGF-and EREG-stimulated internalization depends on the rates of recruitment to a clathrin domain, exiting a clathrin domain, and internalization. The receptor internalization fidelity is most sensitive to the clathrin recruitment and internalization rates, although a given fold-change in these rates is larger than the resulting internalization fidelity fold-change. Cells could adjust the factors behind these clathrin-domain rates to adjust discrimination in internalization of EGFR when stimulated by different ligands. Proteins involved in clathrin-mediated endocytosis can be perturbed or have their levels changed in cancer cells, altering clathrin-mediated endocytosis [63][64][65], suggesting that these changes in cancer cells may adjust the internalization and downstream signaling differences between EGFR ligands.
While we have emphasized signaling distinctions that may arise from differential internalization and endosomal localization of EGFR following binding of different ligand types, these internalization distinctions between ligand types also lead to differential depletion of EGFR from the plasma membrane, which is expected to have signaling consequences. With EGF binding inducing internalization with a substantially higher probability than EREG, EGF is expected to cause substantially more depletion of EGFR from the plasma membrane than EREG. Indeed, experiments show that EGF stimulation more rapidly depletes EGFR from the plasma membrane and depletes EGFR on the plasma membrane to a lower level, in comparison to EREG stimulation [25]. Although we focus on EGF and EREG stimulation, EGFR depletion from the plasma membrane following ligand stimulation varies across other ligands as well [25]. These depletion differences between ligand types, partially caused by different probabilities of internalization between ligand types, can then impact signaling persistence as EGFR must be located on the plasma membrane to respond to external ligand concentrations. Experiments show that EGF stimulation rapidly leads to EGFR signaling activity that decreases to less than half of the peak value after 20 min, while EREG and EPGN (both lower affinity ligands) stimulation lead to longerlived EGFR signaling activity persisting for an hour or longer [26]. The correspondence of stimulation by ligands with low receptor internalization probability to persistent signaling provides, in addition to endosomal localization, an additional mechanism to distinguish between EGFR stimulation by different ligands.
We examined two quantitative models of EGFR internalization, one entirely kinetic and another with spatial diffusion of EGFR to clathrin domains, and find that these two models are in close agreement. This agreement indicates that the time to enter a clathrin domain is distributed approximately exponentially, which is not a typical distribution for diffusion-controlled search times [66,67]. However, our results suggest that EGFR recruitment to a clathrin domain is not a diffusion-limited process, and instead typically requires that EGFR visit several distinct clathrin domains. EGFR that are internalized after recruitment to a clathrin domain typically enter only one or two clathrin domains. This suggests that although EGFR can exit a clathrin domain (by exiting a persisting domain [68] or due to domain disassembly [48,59]), EGFR does not typically exit clathrin domains repeatedly.
While recruitment of EGFR to a clathrin domain may involve receptor entry into an already-existing clathrin domain [16,17], EGFR may also induce or stabilize the formation of clathrin domains [14,15,69]. Our model describes recruitment to a clathrin domain either as a rate (kinetic model) or diffusive encounter with a clathrin domain followed by crossing an energy barrier (spatial model). Thus our spatial model does not explicitly account for EGFR recruitment to a clathrin domain via induction or stabilization of clathrin domain formation. However, as EGFR typically approaches several clathrin domains in the spatial model before entering a clathrin domain, this entry is well-approximated by a constant rate process, which is a closer match to recruitment via domain induction compared to a diffusion-limited search for the clathrin domain. Our spatial model also included escape from a clathrin domain as a diffusive search for the domain boundary followed by crossing an energy barrier. EGFR diffusivity inside a clathrin domain is likely limited due to interaction with domain components [8], and the spatial model description is aimed at producing relevant rates of clathrin escape. The energy barriers limiting exit from clathrin domains (approximately 10 k B T) are smaller than those limiting entry to clathrin domains (approximately 4 k B T). This may be attributed to the binding of adaptor proteins in clathrin domains to EGFR in clathrin domains [68,70].
Through quantitative modeling, we have shown that the internalization of EGFR enhances unbinding rate differences between different EGFR ligands. This adds to other factors that allow EGFR to distinguish between ligands, such as differential stabilization of EGF receptor dimers [26] and differences in internalized receptor degradation and recycling to the plasma membrane [29]. This work also proposes that EGFR internalization discrimination between different stimulating ligands is an example of combined kinetic and spatial proofreading.

Data availability statement
Source code (in Matlab) for spatial model simulation, and data and Matlab files to create plots in figures, are available at: https://github.com/aidanbrowntmu/ EGFRproofreading.

Time-dependence of R L and R C occupation
The time evolution of the probabilities P L and P C , of respectively occupying the states R L and R C , in the kinetic model shown in figure 1(A) is We write P ′ L = dPL dt and P ′ C = dPC dt . Taking the derivative of equation (14) with respect to time provides P ′ ′ L = −(k c + k u )P ′ L + k −c P ′ c and then substituting equation (15) gives Equation (16) indicates a P L (t) of the form Equation (17) is substituted into equation (14) to find ] . (18) Following ligand binding EGFR is in state R L , so P L (t = 0) = 1 and P C (t = 0) = 0, leading from equation (18) to The condition that P L (t = 0) = 1 then leads from equation (17) to c 1 = (k c + k u + λ 2 )/(λ 1 − λ 2 ) and c 2 = −(k c + k u + λ 1 )/(λ 2 − λ 1 ). Substituting equation (17) into equation (16) gives

Spatial model
Simulation details EGFR trajectories begin at a random position outside of clathrin domains with a ligand bound. Receptors diffuse on a 2D square lattice with lattice site spacing ∆x = ∆y = 0.01 µm. Experimental measurements suggest an individual EGFR diameter of approximately 7.2 nm [71], 10.6 nm [72], and 11.8 nm [73], and 11 nm between EGF receptor dimers [74], so we choose this lattice spacing equal to an estimated EGFR diameter d EGFR = 10 nm. Receptors attempt a step to one of the four nearestneighbor lattice sites every ∆t = 1.25 × 10 −4 s, corresponding to a receptor diffusivity D = 0.2 µm 2 s −1 . The simulated clathrin domains are circular with radius of r c = 0.05 µm (unless otherwise stated), corresponding to experimental measurements of clathrin domains approximately 100 nm in diameter [48]. Ten clathrin domains (unless otherwise stated), are randomly placed in a square domain with periodic boundaries of side length 2.8 µm without overlapping, corresponding to a concentration of 1.28 µm −2 and an area fraction of approximately 1%. This area fraction is calculated from measurements that the average lifetime of a clathrin domain is 46 s and the appearance per second of approximately three new clathrin clusters lasting at least 20 s per 10 8 nm 2 of membrane [48], which for 100 nm-diameter circular domains is (46 s)(2.4 × 10 4 nm 2 s −1 )/10 8 nm 2 = 0.011 or approximately 1% of the membrane area occupied by a clathrin domain.
Attempted entries of receptors into a clathrin domain (i.e. attempted steps from a lattice site greater than distance r c from the nearest clathrin domain center to a lattice site less than r c from the domain center) are successful with probability e −∆Eenter/(kBT) , such that ∆E enter represents an energy barrier [75] to entering clathrin domains. Attempted exit of receptors from a clathrin domain similarly encounters an energy barrier ∆E exit while steps entirely outside or within a clathrin domain always succeed (no energy barrier). Exit of a receptor from a clathrin domain may correspond to disassembly of a clathrin domain [48,59] or the exit from a persisting clathrin domain. We assume that both of these processes effectively represent a scenario where the clathrin domain has become unable to confine an EGFR, and thus move the clathrin domain to a new location upon receptor exit. The clathrin domain exit energy barrier ∆E exit thus enforces a timescale before receptor exit from the domain.
Receptors inside a clathrin domain are internalized at a rate k i (occurring with probability k i ∆t each timestep). Ligands unbind from any receptor with rate k u (probability k u ∆t). Either receptor internalization or ligand unbinding ends the simulated trajectory of a receptor-if both internalization and unbinding are selected in the timestep, the process to occur is selected with probability proportional to the rate of each process.
Simulation source code, along with figure data, is available on github [76].

Estimating rate of entering clathrin domains
To estimate the mean time for a receptor to enter a clathrin domain, we sum the time period for a receptor to first diffusively encounter a clathrin domain and the time period to enter a clathrin domain after the first encounter, similar to splitting into diffusion and reaction timescales [60]. The time to first encounter a clathrin domain is estimated using t search in equation (13). After this encounter it is assumed that the spatial distribution of the receptors relative to clathrin domains has equilibrated, and the fraction of time the receptor is adjacent to clathrin domains is equal to the area fraction of the membrane adjacent to clathrin domains, where the area within one EGFR diameter of a clathrin domain as a fraction of the total area is f adjacent = c c (2πr c )d EGFR , with c c the clathrin domain concentration. The rate of clathrin domain entry attempts from this adjacent region is k attempt = 1/(4∆t), where ∆t is the simulation timestep. Each attempt will be successful with a probability e −∆Eenter/(kBT) . The estimated mean time to enter a clathrin domain is ⟨t enter ⟩ = t search + 1 f adjacent k attempt e −∆Eenter/(kBT) (19) = t search + 4∆t c c (2πr c )d EGFR e ∆Eenter/(kBT) . (20) Diffusivity D = (∆x) 2 /(4∆t), with ∆x equal to the EGFR diameter d EGFR , ⟨t enter ⟩ = t search + d EGFR 2πc c r c D e ∆Eenter/(kBT) .
Estimating rate of exiting clathrin domains To estimate the mean time for a receptor to exit a clathrin domain, we assume that the receptor location has equilibrated within the domain and the exit time is for the receptors adjacent to the clathrin domain boundary to overcome the exit energy barrier, and the fraction of receptors near the domain boundary is equal to the fraction of the domain area adjacent to the domain boundary, or 2π r c ∆x/(π r 2 c ). The rate of attempts to leave the domain when in this boundary-adjacent region is 1/(4∆t), such that the average rate of attempting to leave the domain for clathrin-localized receptors is 2∆x/(4∆tr c ). In two dimensions, diffusivity D = (∆x) 2 /(4∆t), so for ∆x = d EGFR , the attempt rate for leaving clathrin domains is 2D/(r c d EGFR ). With clathrin domain exit energy barrier ∆E exit , the rate of leaving clathrin domains is 2De −∆E exit /(kBT) /(r c d EGFR ) and the estimated mean time to exit a clathrin domain is r c e ∆E exit /(kBT) .