Multicanonical simulation of coupled folding and binding of intrinsically disordered protein using an Ising-like protein model

An intrinsically disordered protein is one that does not spontaneously fold in physiological conditions but only folds when it binds to a target protein. Computer simulation of this coupled folding and binding is one of the central subjects of computational biophysics. Computing the free energy landscape is helpful in understanding coupled folding and binding. For this reason, we developed an Ising-like protein model and a multicanonical simulation in an energy-entropy space. The calculated free energy landscape indicates that coupled folding and binding induces rapid structural switching of the bound target protein.


Introduction
The multicanonical ensemble method [1] for model proteins has shed light on protein folding and binding. In particular, multicanonical simulations [2,3,4,5,6,7,8,9,10,11,12] provide a tool for investigating the free energy landscape, which is key to understanding protein folding and binding. Recently, Higo et al. [13] applied a multicanonical simulation to the coupled folding and binding [14,15,16,17,18,19,20,21] of intrinsically disordered (ID) proteins [22,23,24,25]. An ID protein is one that is partially or wholly unstructured under physiological conditions, and coupled folding and binding is a folding transition driven by binding between the ID protein and a target protein. The multicanonical simulations succeeded in sampling various low-energy binding structures of an ID protein. However, information from low-energy structures alone is not sufficient for a deeper understanding of coupled folding and binding, because a protein is a finite-size system and the folding and binding occur at physiological temperatures. We therefore have to clarify the structure of the free energy landscape. Multicanonical simulations to calculate the free energy landscape are difficult not only for elaborate atomic-scale models [26,27] but also for coarse-grained (Gō) models [28,29], because such simulations require a very large amount of computational power to reach thermal equilibrium. Therefore, a simple model expressing the fundamental characteristics of the coupled folding and binding is necessary.
In the present work, we introduce a simple Ising-like protein model, which is a variant of the Wako-Saitō-Muñoz-Eaton (WSME) model [30,31,32,33]. The model is expressed in terms of binary variables and therefore enables us to easily carry out simulations in thermal equilibrium. We apply a multicanonical simulation method to the model. With this model and method, we investigate the coupled folding and binding of the neuron-restrictive silencer factor (NRSF) [34,35] and the corepressor Sin3. NRSF is an ID protein, and Sin3 is an NRSF-binding target protein. We obtain the free energy landscape representing the coupled folding and binding of NRSF. By comparing it with the free energy landscape of a fictitious binding part that folds under physiological conditions, we show that the coupled folding and binding prevents the helical domains of Sin3 from forming non-cooperatively. As a result, cooperative formation of the helical domains of Sin3 is promoted. This result implies that the ID protein controls the interprotein interaction with the target by promoting cooperative folding of the target protein.
Figure 1. The binding structure of the binding part of NRSF on that of Sin3 (PDB:2czy).

Model and Method
For simplicity, we consider only the binding part of NRSF and that of Sin3, instead of the full-length proteins. The binding structure of the binding parts was obtained through an NMR experiment, as shown in figure 1 [36]. In the structure, NRSF forms a helix and bridges 4 helices of Sin3. In contrast, the binding part of NRSF without Sin3 is intrinsically disordered; namely, it has no specific native structure under physiological conditions. Since the structure of the binding part of Sin3 without NRSF has not been obtained by NMR, it is natural to assume that it is partially structured: the binding part consists of 4 α-helical domains that form independently.
To gain further information for constructing our model, we review a previous numerical study of low-energy structures in the ID state of NRSF. The binding part of NRSF consists of 15 residues corresponding to the 43rd-57th residues from the N-terminal. This size is tractable for an atomic-scale simulation, and low-energy structures were in fact examined by a multicanonical simulation [13]. In that simulation, the binding part of NRSF without Sin3 predominantly adopts hairpin-like structures and adopts an α-helix structure with a low probability. Furthermore, a simulation of binding with Sin3 was also carried out and showed that the binding part of NRSF adopts various binding structures. We note that the structure of the binding part of Sin3 in the simulation was artificially restrained, because the number of residues of the binding part, 77, which corresponds to the 31st-107th residues from the N-terminal, is too large.
On the basis of the above observations of low-energy structures, we consider the following simple model in order to illustrate the free energy landscape of the coupled folding and binding. We express a state of the binding parts using a set of binary variables, $m_i$ and $c_{i,j}$. $m_i$ represents the structure of the $i$-th residue and takes unity in an α-helix and zero otherwise. $c_{i,j}$ represents a contact between the $i$-th residue and the $j$-th one and takes unity in hairpin structures and zero otherwise. The energy of a state expressed by $\{m_i\}$ and $\{c_{i,j}\}$ is defined by equation (1) as the sum of a helix term, $\varepsilon H_{\mathrm{helix}}$, and a hairpin term, $\eta H_{\mathrm{hairpin}}$. In $H_{\mathrm{helix}}$, the summation is taken over all natively contacting residue pairs. We adopt the residue pairs from the helix domains of the binding structure obtained by NMR [36]. As the criterion for an inter-residue contact, we employ $d_{i,j} < 0.65$ nm and $|i - j| > 3$, where $d_{i,j}$ is the distance between the $i$-th residue and the $j$-th one. The second term on the right-hand side of (1) is introduced to energetically stabilize the hairpin structures of NRSF, which are observed as low-energy structures in the full atomic simulation [13]. In $H_{\mathrm{hairpin}}$, the summation is taken over the 2nd-14th, 5th-11th and 7th-9th residue pairs in NRSF.
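The contact criterion quoted above ($d_{i,j} < 0.65$ nm and $|i - j| > 3$) can be illustrated with a minimal sketch. The function names, the 0-based residue indexing, the use of one representative position per residue, and the ideal-helix geometry used for the demonstration are all assumptions made for illustration; the text does not state which atoms define $d_{i,j}$.

```python
import numpy as np

def native_contacts(coords_nm, cutoff_nm=0.65, min_separation=3):
    """Return pairs (i, j), i < j, with d_ij < cutoff_nm and |i - j| > min_separation.

    coords_nm: (N, 3) array with one representative position per residue, in nm.
    Which atom represents a residue is an assumption of this sketch.
    """
    coords = np.asarray(coords_nm, dtype=float)
    # Pairwise distance matrix d_ij
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    n = len(coords)
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if j - i > min_separation and dist[i, j] < cutoff_nm]

def ideal_helix(n_residues, radius_nm=0.23, rise_nm=0.15, turn_deg=100.0):
    """Hypothetical test geometry: positions on an idealized alpha-helix."""
    k = np.arange(n_residues)
    angle = np.deg2rad(turn_deg) * k
    return np.column_stack([radius_nm * np.cos(angle),
                            radius_nm * np.sin(angle),
                            rise_nm * k])

contacts = native_contacts(ideal_helix(8))
```

For this idealized 8-residue helix, only the (i, i+4) pairs fall below the cutoff, since every pair separated by five or more residues is already more than 0.65 nm apart along the helix axis.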
Here the residues are renumbered from the N-terminal of the NRSF binding part. $\varepsilon$ and $\eta$ denote the coupling constants for inter-residue contacts in the helix structure and in the hairpin-like structures, respectively. $\eta$ is an important parameter for controlling disorder in the binding part of NRSF: $\eta = 0$ corresponds to a fictitious binding part that folds, and $\eta = 2\varepsilon$ corresponds to the ID binding part in our simulation. In this model, the state of the helix structure, namely $m_i = 1$, is unique and hence has no entropy. In contrast, the state of the hairpin structures or that of the others, namely $m_i = 0$, consists of many structures. Therefore we account for their entropy with a term composed of $S_{\mathrm{hairpin}}$ and $S_{\mathrm{otherwise}}$, where $S_{\mathrm{hairpin}}$ represents the entropy for hairpin structures of NRSF, and $S_{\mathrm{otherwise}}$ represents that for the other structures of NRSF and Sin3. $b_{i,j}$ represents a binding state between the $i$-th residue in NRSF and the $j$-th residue in Sin3 and takes unity in binding and zero otherwise. The explicit forms of the entropy terms, (5) and (6), involve $\tau_i(\{c_{i,j}\}, \{b_{i,j}\}) = \max_{j,k;\, p,q;r,s} \{\{c_{j,k} \mid j < i < k\}, \{b_{p,q} b_{r,s} \mid p < i < r\}\}$ for the $i$-th residue of NRSF and $\tau_j(\{c_{i,j}\}, \{b_{i,j}\}) = \max_{p,q;r,s} \{b_{p,q} b_{r,s} \mid r < j < s\}$ for the $j$-th residue of Sin3. In the maximization in $\tau_i$, $p,q;r,s$ is taken over all neighboring binding pairs. A neighboring binding pair $p,q;r,s$ is one for which there is no other binding site between the $p$-th residue and the $r$-th one or between the $q$-th residue and the $s$-th one. To reproduce the intrinsically disordered state of NRSF, we take $s = 1$ and $\mu = 0.25$. The value of $\mu$ indicates that the binding part of NRSF loses 25% of its entropy as the cost of forming the hairpin structures.
In (5) and (6), we introduce the dependence of $\tau_i$ on $b_{i,j}$ in order to account for an entropy effect arising from protein binding. For simplicity, we treat the entropy effect of binding in the same way as the entropy loss for the state forming hairpin structures. We note that $\varepsilon H_{\mathrm{helix}}$, $\eta H_{\mathrm{hairpin}}$, $S_{\mathrm{hairpin}}$, and $S_{\mathrm{otherwise}}$ described so far are adopted from the WSME model [30,31,32,33,37,38] with some modifications for expressing the intrinsic disorder.
We also take into account the binding chemical potential defined on binding sites in NRSF and those in Sin3, $hG(\{m_i\}, \{b_{i,j}\})$, to investigate binding-driven folding. In $G$, the summation is taken over all binding site pairs between NRSF and Sin3, and $h$ denotes the chemical potential gain for each binding $b_{i,j}$. Here $\gamma_{i,j}(\{b_{k,l}\}, \{m_p\}) = \prod_{k,l} \left\{ b_{k,l} \left( \prod_{p=i}^{k} m_p \prod_{q=j}^{l} m_q - 1 \right) + 1 \right\}$, and the product over $k,l$ is taken over the set of bindings neighboring $i,j$. The binding sites are adopted from the NMR structure [36] and satisfy the same criterion as the native contacts in $H_{\mathrm{helix}}$. The pairs of binding sites are the 29th residue from the N-terminal of the Sin3 binding part and the 11th one from the N-terminal of the NRSF binding part, and likewise 32nd-8th, 33rd-8th, 36th-5th and 66th-3rd. We introduce $\gamma_{i,j}$ in order to express cooperativity between neighboring bindings. In $G$, $b_{i,j}$ is coupled with a set of $m_i$ and thereby gives rise to a chemical potential gain only in the native binding structure ($m_i = 1$). Therefore the coupling leads to binding-driven folding.
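The cooperativity factor $\gamma_{i,j}$ can be evaluated directly from its product form. The following is a minimal sketch under stated assumptions: residues are 0-indexed, the binding variables are stored in a dictionary keyed by (NRSF index, Sin3 index), the list of neighboring bindings is supplied by the caller, and the inner products run over the inclusive index range between the two residues regardless of their order. None of these conventions is specified in the text.

```python
def segment_all_helical(m, start, end):
    """Product of m_r over the inclusive range between start and end (1 or 0)."""
    lo, hi = min(start, end), max(start, end)
    return int(all(m[r] == 1 for r in range(lo, hi + 1)))

def gamma(i, j, b, m_nrsf, m_sin3, neighbors):
    """Evaluate gamma_ij = prod_{(k,l)} { b_kl * (prod m_p * prod m_q - 1) + 1 }.

    i, j      : binding site pair (NRSF residue i, Sin3 residue j)
    b         : dict mapping (k, l) -> 0 or 1 (hypothetical storage choice)
    m_nrsf    : list of helix variables m_p for NRSF
    m_sin3    : list of helix variables m_q for Sin3
    neighbors : list of neighboring binding pairs (k, l), supplied by the caller
    """
    value = 1
    for (k, l) in neighbors:
        helical = (segment_all_helical(m_nrsf, i, k)
                   * segment_all_helical(m_sin3, j, l))
        # Factor is 1 if (k, l) is unbound; otherwise 1 only when both
        # intervening segments are fully helical.
        value *= b.get((k, l), 0) * (helical - 1) + 1
    return value
```

With this form, an unbound neighbor contributes a factor of 1, whereas a bound neighbor contributes 1 only if every residue between the two binding sites is helical in both chains, which is how the coupling between binding and helix formation enters.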
The probability density of the state $(\{m_i\}, \{c_{i,j}\}, \{b_{i,j}\})$ is given in terms of the inverse temperature $\beta = 1/k_B T$, where $T$ and $k_B$ denote the temperature and the Boltzmann constant, respectively. In order to obtain the free energy landscape, we calculate the density of states $n(C_{\mathrm{NRSF}}, C_{\mathrm{Sin3}}, E, S)$, where $C_{\mathrm{NRSF}}(\{m_i\})$ denotes an order parameter for the α-helix structure of NRSF, and $C_{\mathrm{Sin3}}(\{m_i\})$ denotes that of Sin3. Their definitions involve $\Omega_p$ and $N_p$, the set and the number of all residues of the binding part of protein $p$, respectively. From the density of states, we obtain the free energy landscape with the weight $P(E, S) = \exp[-\beta E + S]$. We calculate $n(C_{\mathrm{NRSF}}, C_{\mathrm{Sin3}}, E, S)$ by using the Wang-Landau method [39] in the two-dimensional space of $E$ and $S$.
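The procedure described here, estimating a density of states by the Wang-Landau method and then reweighting it with $P(E, S) = \exp[-\beta E + S]$, can be sketched on a toy system. Everything model-specific below is an assumption for illustration only: the 6-residue chain, its energy and entropy functions, the use of the helical residue count as the order parameter, and the fixed number of steps per modification-factor stage (used in place of a histogram flatness check). It is not the model or the schedule of the present work.

```python
import math
import random
from itertools import product

N = 6  # toy chain length (hypothetical)

def macrostate(m):
    """Map a binary state to (n_helical, E, S) for the toy chain."""
    helical = sum(m)
    energy = -sum(m[i] * m[i + 1] for i in range(len(m) - 1))
    entropy = len(m) - helical  # one entropy unit per non-helical residue
    return (helical, energy, entropy)

def wang_landau(n_stages=12, steps_per_stage=5000, seed=0):
    """Estimate ln n(C, E, S) with single-flip Wang-Landau sampling."""
    rng = random.Random(seed)
    m = [rng.randint(0, 1) for _ in range(N)]
    ln_n, ln_f = {}, 1.0
    key = macrostate(m)
    for _ in range(n_stages):
        for _ in range(steps_per_stage):
            i = rng.randrange(N)
            m[i] ^= 1  # propose a single flip
            new_key = macrostate(m)
            # Accept with probability min(1, n(old) / n(new))
            delta = ln_n.get(key, 0.0) - ln_n.get(new_key, 0.0)
            if delta >= 0 or rng.random() < math.exp(delta):
                key = new_key
            else:
                m[i] ^= 1  # reject: undo the flip
            ln_n[key] = ln_n.get(key, 0.0) + ln_f
        ln_f *= 0.5  # reduce the modification factor
    return ln_n

def exact_ln_n():
    """Exact ln n(C, E, S) by enumerating all 2**N states (toy check)."""
    counts = {}
    for m in product((0, 1), repeat=N):
        key = macrostate(m)
        counts[key] = counts.get(key, 0) + 1
    return {k: math.log(v) for k, v in counts.items()}

def free_energy(ln_n, beta):
    """Return beta*F(C) = -ln sum_{E,S} n(C, E, S) exp(-beta*E + S)."""
    terms = {}
    for (c, e, s), ln_val in ln_n.items():
        terms.setdefault(c, []).append(ln_val - beta * e + s)
    out = {}
    for c, vals in terms.items():
        top = max(vals)  # log-sum-exp for numerical stability
        out[c] = -(top + math.log(sum(math.exp(v - top) for v in vals)))
    return out
```

Because Wang-Landau determines the density of states only up to a multiplicative constant, the resulting free energies are defined up to an additive constant; differences between values of the order parameter are the meaningful quantities. The same reweighting can be repeated for any $\beta$ without resampling, which is the practical advantage of working with the density of states in $(E, S)$.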

Results and Discussions
Let us compare the free energy landscapes for a fictitious binding part ($\eta = 0$) and for the ID binding part ($\eta = 2\varepsilon$) in order to examine the effect of the coupled folding and binding of the ID binding part. In figures 2(a)-2(f), we show the free energy landscapes for $h = \varepsilon$ at temperatures around $k_B T = 0.52\varepsilon$, which corresponds to the folding temperature of NRSF for $\eta = 0$. At this temperature NRSF is intrinsically disordered and exhibits binding-driven folding for $h = 2\varepsilon$.
In the case of $\eta = 0$, as shown in figure 2(a), there are two free energy minima at high temperatures, corresponding to the unfolded state ($C_{\mathrm{NRSF}} = 0.25$) and the folded one ($C_{\mathrm{NRSF}} = 0.75$) of NRSF. In this case, the helical domains of Sin3 fold independently and therefore $C_{\mathrm{Sin3}}$ takes values around 0.5. At $k_B T = 0.52\varepsilon$, the unfolded NRSF becomes unstable, as shown in figure 2(b). Hence folding of NRSF is promoted with decreasing temperature. In this case, $C_{\mathrm{Sin3}}$ of the free energy minimum at $C_{\mathrm{NRSF}} = 0.75$ increases. Therefore the folding of NRSF promotes that of Sin3. However, Sin3 does not fold perfectly, as indicated by $C_{\mathrm{Sin3}}$ being widely distributed from 0.5 to 1.0. At low temperatures, Sin3 folds almost perfectly, as shown in figure 2(c). These free energy landscapes indicate that non-cooperative helical domain formation of Sin3 is promoted by the binding of NRSF for $\eta = 0$.
In the case of the ID binding part, namely $\eta = 2\varepsilon$, the free energy landscape in figure 2(d) has two minima similar to those observed for $\eta = 0$. At $k_B T = 0.52\varepsilon$, the intrinsically disordered state of NRSF, corresponding to the minimum for $C_{\mathrm{NRSF}} \le 0.25$, predominantly remains, as shown in figure 2(e), in contrast to the case of $\eta = 0$. As a result, the partial folding of Sin3, $(C_{\mathrm{NRSF}}, C_{\mathrm{Sin3}}) = (0.75, 0.75)$, is not promoted by the binding of NRSF. Namely, the partially folded state of Sin3 is not predominant even at low temperatures. At $k_B T = 0.48\varepsilon$, only the completely folded state of Sin3 is realized, as shown in figure 2(f). Thus, at the binding-driven folding transition, the ID binding part does not promote the non-cooperative formation of the helical domains of Sin3, and thereby only helps their cooperative formation.
To see the effects of cooperative helical domain formation on the structural switching of Sin3 in binding with NRSF, we examine the structural order parameter of Sin3, $C_{\mathrm{Sin3}}$, as a function of $h$. We first identify the characteristic chemical potential, $h_{2\varepsilon}$, at which binding-driven folding of NRSF occurs for $\eta = 2\varepsilon$. At $h_{2\varepsilon}$, the binding susceptibility of $C_b$ shows a peak due to the binding. Here, $C_b$ is a binding order parameter. We also denote the value of $h$ at the susceptibility peak for $\eta = 0$ by $h_0$. We show $\langle C_b \rangle$ and the susceptibilities as a function of $h$ for $\eta = 0$ and $2\varepsilon$ in figure 3(a), where $\langle \cdots \rangle$ denotes an ensemble average. $h_0$ and $h_{2\varepsilon}$ are indicated by the dashed double-dotted lines. $C_{\mathrm{NRSF}}$ for $\eta = 2\varepsilon$ exhibits rapid switching around $h_{2\varepsilon}$, as shown in figure 3(b). Namely, a binding-driven folding transition actually occurs at $h_{2\varepsilon}$. Lastly, we show $C_{\mathrm{Sin3}}$ as a function of $h$ in figure 3(c). $C_{\mathrm{Sin3}}$ for $\eta = 2\varepsilon$ around $h_{2\varepsilon}$ increases rapidly as compared to that for $\eta = 0$ at $h_0$. Thus the coupled folding and binding of the ID binding part enhances the sensitivity of the folding response of Sin3 to a change of chemical potential; namely, a small change in $h$ induces a large change in $C_{\mathrm{Sin3}}$. Finally, we consider the implication of this sensitivity. The chemical potential correlates with the concentration of NRSF and Sin3, so a change in chemical potential corresponds to a change in their concentration. Therefore, the high sensitivity is efficient in sensing a concentration change. In addition, the folding of Sin3 controls the interaction between NRSF and Sin3 [40]. Thus intrinsic disorder, which induces high sensitivity, is important in the control of the interaction between NRSF and Sin3.
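Locating a characteristic chemical potential from a susceptibility peak can be sketched as follows. The text does not give the definition of the binding susceptibility, so this sketch assumes it is the derivative of $\langle C_b \rangle$ with respect to $h$, evaluated numerically on a grid; the function name and the synthetic sigmoidal curve used for the demonstration are hypothetical and are not data from the present work.

```python
import numpy as np

def susceptibility_peak(h, cb_mean):
    """Return (h at the susceptibility peak, susceptibility array).

    h       : 1-D grid of chemical potential values
    cb_mean : ensemble-averaged binding order parameter <C_b> on that grid
    Assumes susceptibility = d<C_b>/dh, which is an assumption of this sketch.
    """
    h = np.asarray(h, dtype=float)
    cb_mean = np.asarray(cb_mean, dtype=float)
    chi = np.gradient(cb_mean, h)  # finite-difference derivative
    return h[np.argmax(chi)], chi

# Synthetic demonstration curve: a sigmoid switching around h = 1.3
h_grid = np.linspace(0.0, 3.0, 301)
cb_demo = 1.0 / (1.0 + np.exp(-(h_grid - 1.3) / 0.05))
h_star, chi_demo = susceptibility_peak(h_grid, cb_demo)
```

A narrower transition gives a taller and sharper peak, which is the sense in which a rapidly switching order parameter corresponds to high sensitivity to $h$.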