Proteins aggregation and human diseases

Many human diseases and the death of most supercentenarians are related to protein aggregation. Neurodegenerative diseases include Alzheimer's disease (AD), Huntington's disease (HD), Parkinson's disease (PD), frontotemporallobar degeneration, etc. Such diseases are due to progressive loss of structure or function of neurons caused by protein aggregation. For example, AD is considered to be related to aggregation of Aβ40 (peptide with 40 amino acids) and Aβ42 (peptide with 42 amino acids) and HD is considered to be related to aggregation of polyQ (polyglutamine) peptides. In this paper, we briefly review our recent discovery of key factors for protein aggregation. We used a lattice model to study the aggregation rates of proteins and found that the probability for a protein sequence to appear in the conformation of the aggregated state can be used to determine the temperature at which proteins can aggregate most quickly. We used molecular dynamics and simple models of polymer chains to study relaxation and aggregation of proteins under various conditions and found that when the bending-angle dependent and torsion-angle dependent interactions are zero or very small, then protein chains tend to aggregate at lower temperatures. All atom models were used to identify a key peptide chain for the aggregation of insulin chains and to find that two polyQ chains prefer anti-parallel conformation. It is pointed out that in many cases, protein aggregation does not result from protein mis-folding. A potential drug from Chinese medicine was found for Alzheimer's disease.


Introduction
was an academician of Academia Sinica in Taipei and had AD. Thus among 8 Nobel Prize winners in sciences with Chinese background, two families had been identified with AD. It is quite a high percentage. It has been estimated that 45% of USA citizens with age higher than 85 have AD. AD and other human diseases become more serious in longevity societies. century by Ludwig Eduard Boltzmann (20 February 1844-5 September, 1906 [20] and Josiah Willard Gibbs (11 February 1839-28 April 1903 [21]. The objective of statistical physics is to understand the properties of a macroscopic system from interactions of the molecules or atoms of the macroscopic system. One important research subfield of statistical physics is the study of phase transitions and critical phenomena [22], for a recent review please read [23]. Universality and scaling are two key concepts developed from the study of phase transitions and critical phenomena [22,23]. In 1971, Ken G. Wilson developed renormalization group (RG) theory [24,25,26] for critical phenomena, which provides a framework for understanding universality and scaling in critical systems, and can be used to calculate critical quantities of critical systems. For this contribution, Ken G. Wilson won the Nobel Prize in physics in 1982 and his Nobel Prize Lecture "The renormalization group and critical phenomena" was published in [27].
In the development of statistical physics, it has often been found that a simple model system can be useful for understanding collective or critical behavior [22,23] of a complicated system consisting of a large number of atoms or molecules. For example, in 1945 Guggenheim [28] reported that in the T /T c versus ρ/ ρ c plane (T , T c , ρ and ρ c are temperature, critical temperature, density, and critical density of the substance, respectively) the coexistence curves of N e , A r , K r , X e , N 2 , O 2 , CO, and CH 4 are very consistent with each other, and near the critical point the order parameter [22] |ρ − ρ c | ∼ |1 − (T /T c )| β , with the critical exponent β = 1/3. In 1952, Yang and Lee proposed that the critical behavior of the gas-liquid system can be represented by a lattice-gas model [29], which is equivalent to the Ising model [30], in which each atom or molecule on a lattice site is assigned a variable which can be either +1 or -1. In 1995-1996, Blöte and collaborators [31,32] used Monte Carlo simulations to find that the critical exponent β of the spontaneous magnetization and ν of the correlation length [22] of a three-dimensional (3D) Ising model are 0.3269 (6) [32] and 0.6301 (8) [31], respectively. In 2009, Sengers and Shanks [33] reviewed the experimental data for liquid-gas critical systems and reported that the order parameter and the correlation length have critical exponents β = 0.3245 and ν = 0.629±0.003, respectively. In 2012, Watanabe, Ito and Hu [34] used molecular dynamics simulations to find that β and ν of a 3D Lennard-Jones (L-J) model system [35,36] are 0.3285 (7) and 0.63(4), respectively. The values of β and ν obtained for simple model systems reported in Refs. [31,32,34] are highly consistent with experimental data reported in [28,33]. Using Monte Carlo methods [37,38,39,40] and analytic methods [41,42] it has been found that many critical systems can have very nice universal and scaling behaviors [43,44,45,46,47,48,49,50].
On the basis of molecular biology, it is of interest to know whether we can apply ideas and methods of statistical physics to understand protein aggregation related to human diseases. The purpose of this paper is to address this problem. In the next section, we first review the usage of lattice models to understand some factors which influence protein aggregation. In section 3, we review the application of an off-model of polymer chains to understand the influence of bendingangle dependent and torsion-angle dependent interactions on the aggregation of proteins. In section 4, we review some developments in the application of all-atom models to understand some problems related to protein aggregation. In section 5, we report the discovery of a potential drug for AD. Finally, in section 6, we summarize and discuss our result.

Lattice models for aggregation of proteins
Lattice Ising model has been used to describe critical phenomena of gas-liquid systems very successfully [23,31,32,33,51]. Thus one would like to use simple lattice models to understand folding and aggregation of proteins, which are polymer. As the temperature of the dilute polymer solution decreases, the polymer will undergo a collapse transition [52,53]. The interacting selfavoiding walk (ISAW) [54,55,56,57,58,59] on the simple cubic (SC) lattice is the simplest lattice model for homogenous polymers in dilute solution. The model has a collapse transition at a higher temperature and a freezing transition at a low temperature.  Chen, Hsieh, and Hu (CHH) [59] developed a novel method to analyze phase behavior of lattice models based on partition function zeros by considering both loci of partition function zeros and thermodynamical functions associated with them. With this method, the first pair of complex conjugate zeros (first zeros) can be defined without ambiguity and the critical point of a small system can be defined as the peak position of the heat capacity component associated with the first zeros. For the system with two phase transitions, two pairs of first zeros corresponding to two phase transitions can be identified and two overlapping phase transitions can be well separated. This method was applied to the interacting self-avoiding walk (ISAW) of homopolymer with N monomers on the simple cubic lattice. The exact partition functions  Z N with N up to 27 were calculated and CHH approach gives a clear scenario for the collapse and the freezing transitions [59]. CHH's conformation number for the ISAW on the 3 × 3 × 3 SC lattice is consistent with that of [54], which only studied the ISAW on the 3 × 3 × 3 SC lattice.
In 2010, Li et al. studied a lattice model for protein aggregation [60]. In this model, an amino acid is represented by a bead on a site of a simple cubic lattice. The bead can be H Monte Carlo (MC) method was used to calculate the temperature dependence of the fibril formation time τ fib for E +− = −1.4 (circles) and E +− = −0.6 (triangles) with N = 6 and E HH = −1, and calculated results were shown in Figure 2(e), in which arrows show the temperatures at which the fibril formation is fastest. Figure 2(b) shows that for E +− =-1.4 and E HH = −1, P N * has a maximum at T = 0.55, at the same temperature the fibril formation is fastest with the same parameters E +− =-1.4 and E HH = −1 in Figure 2(e). Thus Li et al. [60] found that the probability of a protein sequence appearing in an aggregated conformation can be used to determine the temperature at which the protein can aggregate most quickly. They also found a correlation between the time taken for the protein to aggregate and the strength of interactions between charged amino acids, which is consistent with previous experimental observations [61].
Co, Hu, and Li used the model of [60] to develop a lattice model for studying the aggregation of polypeptide chains in the presence of crowders [62]. Figure 3(a) shows an example of random conformations of peptide chains in a confined box. Figure 3(b) sows a snapshot of crowders (grey cubes) and polypeptide chains (color). The peptide chains can not enter the lattice sites occupied by crowders and there is no other interactions between peptide chains and crowders. Monte Carlo method was used to study the influence of various factors on the fibril formation time τ fib . One MC step consists of 9 local moves of peptide chains and one global move of a chain; 5 MC moves follow by one global move of all crowders. Figure 3(c) shows the dependence of τ fib on the crowder concentration φ c for various sizes of crowders. Orange and red refer to non-cubic (rectangular parallelepiped) crowders with dimensions 2 × 1 × 1 and 2 × 2 × 1. Other crowder sizes are also shown. The numbers of polypeptide chains N are 6, 10, and 24 for top, middle, and bottom panels, respectively. τ fib of Fig. 3(c) decreases with the concentration of crowders if crowder sizes are large enough, but the growth is observed for crowders of small sizes.  [63]. It becomes slow at either small or large coverage of cosolutes. Figure 3(e) shows the U-shape dependence of τ fib on the sizes of confining cubic box. The results obtained from 50 to 200 MC trajectories for each box size. Error bars come from averaging over MC runs. Due to competition between the energetics and entropic effects, the dependence of τ fib on the size of confined space is described by a parabolic function [62].

Relaxation and aggregation of polymer chains
Besides protein aggregation mentioned in Section 1, there is another interesting biological problem: How a biological system can maintain in non-equilibrium state? In particular, why ancient seeds can maintain in a viable non-equilibrium state for about 2000 years [64] or even about 30,000 years [65]? This section will review some clues for answering these questions [51].
Many important biological macromolecules, such as DNA, RNA, proteins, etc, are polymers. To understand why a biological system can maintain in a non-equilibrium state for a very long time and why proteins aggregate, one can try to get relevant information from the relaxation and aggregation of a simple polymer model. In this section, we briefly review the work by Ma and Hu [66,67,68,69,70] about relaxation and aggregation of polymer chains.

Model System with monomers connected by rigid bonds
In this subsection, we consider that polymer molecules are chains of rigidly connected monomers, interacting via L-J potentials between non-bonded monomer pairs and bending and torsion potentials among consecutive sites along the chains. The polymers were modified from the united atom model [71] where the bond length and the interaction potentials describe polyethylene. The L-J potential [35,36] between non-bonded monomer pairs at sites i and j with the distance r ij where σ and ǫ were used as units for length and energy strength, respectively. The bond length between two neighboring monomers in a polymer chain was fixed at b 0 = 0.357σ. For consecutive sites i, j = i + 1, k = i + 2, and l = i + 3 along a polymer chain, the unit bond vector b i defines the direction for two consecutive sites indexed by i and j = i + 1 along We also plot the MB distribution with (q, β) = (1.0, 0.554) as the dashed line, which has the same second moment <ṽ 2 > (orT * P ) as that of solid line. The inset displays the simulation data in log scales. Note that the deviations from the main curves at time t = 23.9, 85.9 and 172.9 are contributed by the locally agitated events over the transient periods G 1 , G 2 and G 3 , respectively. Reproduced from Fig. 2 of [66]. the chain. The unit local helicity vector u i is defined by The bending and torsion angles are determined by respectively. 180 o − θ ijk and φ ijkl are, respectively, the valence angle and torsional angle considered in [72]. We set the bending potential and the torsional potential with a 0 = 18.635ǫ, a 1 = 38.16ǫ, a 2 = 10.3ǫ and a 3 = −67.075ǫ, as in [71], θ 0 is either θ The length σ F and strength ǫ F of the L-J interactions between fluid molecules were expressed in units of σ and ǫ, respectively. All fluid molecules and monomers were assumed to have the same mass m. The composite quantity τ = mσ 2 /ǫ was used as time unit. The purely repulsive interaction potential between a monomer and a fluid molecule takes only the -12 power term in Eq. (1). In such homopolymer system, the thermodynamic conditions were changed by varying the volume (concentration) and the temperature. Heterogeneous conditions were introduced into the system by selecting a small portion of monomer sites, imposing different inter-site interactions. By mixing the polymers with spherical Lennard-Jones fluid molecules, we let the interaction potentials of fluid particles with the ordinary polymer sites be purely repulsive and with that small portions of specially selected sites ('linker sites') on polymers be the same as fluid-fluid potentials. Such setup mimics heterogeneities in the hydrophobic-hydrophilic effects in systems of biopolymers and help to stabilize the system. The shape and the strengths of all those interaction potentials are tunable as the means to alter the physical properties of the simulated system.
It is well known that the dynamic properties are related to the detailed numerical implements in solving the equations of motion in MD simulations [73]. In the simplified model, it should be emphasized that the well-tracked formulation of the discretized equations of motion that are treated as iterative dynamic systems. This aspect is appropriate in searching for microscopic origins of q-statistics [74]. The equations of motions in Lagrangian formulations were numerically solved using a velocity Verlet algorithm; the bonding between neighboring monomers was realized via the RATTLE scheme [75] to include the holonomic constraints for fixed bond lengths.

Relaxation of polymers with monomers in a chain connected by rigid bonds
We had chosen K b = K t = 0.1, corresponding to one tenth of those for polyethylene, so that the simulations can cover longer range of time scales. To start, we filled N P = 40 polymer chains (each chain has n = 100 monomers) and N F /4=1500 fluid molecules in a large enough cube of volume V = L 3 with periodic boundary conditions so that polymer chains can take some arbitrary random initial configurations. We took θ 0 in Eq. (2) to be θ We controlled the pressure and the kinetic energy [76] to adjust the volume V until it reached some value V ≈ (32.5σ) 3 . Then we replaced each fluid molecule with four closely packed molecules of zero kinetic energies and allowed the new system (of N P polymers and N F fluid molecules) to relax and reached finally a volume V = L 3 with L = 33.3875σ.
The NPT processes designed for equilibrium simulations [76] had been used in this preparation stage, where we controlled externally the pressure (P ) and the temperature (T ) to drive the system of fixed number of particles (N ) to the target density. The process also fine-tuned the conformations of the polymer chains that helped the effective performance of RATTLE-algorithm in the subsequent NVE simulations. We then started the production simulations with the integration time step (δt) 1 ≡ 5.0×10 −6 τ and focus on the NVE relaxation processes with constant volume V and total energy E without any external controlling. To speed up the simulation process, we increased the integration time step δt from (δt) 1 ≡ 5.0 × 10 −6 τ by five times to (δt) 2 ≡ 2.5 × 10 −5 τ at time t = 13.4τ .
The production simulations from t = 0 continued for about 180 τ (at the final few τ the integration time step was reduced, see below). The results were shown in Figs. 1 and 2 of [66], reproduced as Figs. 4 and 5 in the present paper. Figure 4(a) shows that the time evolution of the temperature-like quantities, T * P , T * F and T * 0 , consist of several stages, where T * P , T * F , and T * 0 are twice of the mean kinetic energies per degrees of freedom for polymer chains, fluid molecules, and the whole system, respectively. For polymer chains, is total number of degrees of freedom for polymer chains when (n − 1) holonomic constraints for each polymer chain were taken into account.
In the first stage R 1 , the local conformation relaxes with decreasing correlations in bond-pairs ( Fig. 4(d)) and local helicity pairs (Fig. 4(e)). We used inner product b i · b i+k and u i · u i+k for indices i and i + k along a chain to define the correlations of index distance k.
The increase of the integration time step at t = 13.4τ would certainly affect the route of time evolution at microscopic level. Our focus here is whether the value of q changes. The subsequent relaxation stage R 2 continues before the system becomes quasi-static with T * P ≈ T * F . The system is found subject to intermediate transients before and after reaching a quasi-static stage; there are three quasi-static stages, S 1 , S 2 and S 3 , following the transient stages G 1 , G 2 and G 3 , respectively. Entering the static stage S 3 , we reduced the integration time step by a factor fifty to (δt) 3 ≡ 5.0 × 10 −7 τ , to understand its effect on q. From simulation data, we estimated the diffusion constant for monomers as D ≈ 3.3σ 2 /τ . The time spans of S 1 and S 2 are large enough for a monomer to cross a distance ranging from L/2 to L.
The vertical dotted lines in Fig. 4 mark the change in integration time step δt from (δt) 1 ≡ 5.0 × 10 −6 τ to (δt) 2 ≡ 2.5 × 10 −5 τ at time t = 13.4τ , dividing the relaxation stage into R 1 and R 2 ; the intermediate transients, G 1 (from t = 18.4750 to t = 29.2925), G 2 (t = 81.6725 to t = 89.9700) and G 3 (t = 169.315 to t = 176.650) that are followed by the quasi-static stages S 1 , S 2 and S 3 , respectively. The velocity distributions at those steps marked by inverted triangles on top of Fig. 4(a) were analyzed in Fig. 5. Figure 5(a) shows that the velocity distributions of monomers at several typical time steps (marked by inverted triangles at the top line of Fig. 4) can be well described by Tsallis q-statistics [77] with a generalized Maxwell-Boltzmann distribution (GMBD) [78,79,80] where dv] −1 , and 1 ≤ q < 7/5 [66]. In the limit q → 1, Eq. (4) becomes the Maxwell-Boltzmann (MB) distribution The parameter β in Eq. (4) can be absorbed to scale the velocity variable v, whereṽ = β 1 2 v and the right-hand side (RHS) of Eq. (6) is free of β, i.e., a family of distributions with different β but same q can collapse into one master curve. Figure 5(b) shows that the velocity distributions in Fig. 5(a) collapse into a single curve, after being scaled by using Eq. (6). The parameter β −1 increases linearly with the instantaneous  Figure 6. Snapshots of the configurations before (a) and after (b) the formation of polymer aggregation, during the phase separation process for a system with N P =40 chains, each with n =100 rigidly bonded monomers, mixed with a fluid of N F =6000 spherical atoms (shown as grey dots). The monomers along a chain are also subject to bending and torsion (intra-chain) potentials in very weak strengths. Periodic boundary condition is applied to each of the three directions for the 33.3876σ × 33.3876σ × 33.3876σ cell. The instantaneous temperatures for polymer and fluid are T * P = 18.21 and T * F = 19.95 for (a); and T * P = 6.335 and T * F = 8.24 for (b), respectively. Reproduced from Fig. 1 of [68]. temperature of monomers T * P [66]. The result in Fig. 5(b) suggests that Eq. (4) should not be considered merely as a class of fitting functions, it relates the microscopic states over a wide time span as a single non equilibrium state characterized by the family of distributions of virtually the same q. Indeed, the reducing of integration time step δt can change the heights and locations of regions G 1 , G 2 and G 3 in Fig. 4(a), but does not change q, see e.g. S 3 in Fig. 4(a) and 4(b). As for the physical origin of such GMBD, it is useful to consider the thought of Beck and Cohen [81].
Note that each distribution in Fig. 5 is over the monomers in the system for one single step. If it is also a probability distribution for an individual monomer, then the x-component of the velocity of particle i satisfies where K is the total kinetic energy of the polymers. Using (e x q ) (1−q)x and integration by parts, we derived that the mean kinetic energy of a monomer is Equation (7) states a modified version of equipartition theorem; it becomes the conventional equipartition theorem only when q = 1. The conditions for the validity of Eq. (7), especially the role of heterogeneity [82], deserves further analysis.

Aggregation of polymer chains connected by rigid-bonds
We take a system at the end of simulation of Fig. 4 (t = 180τ ) as the initial system and reduce K b and K t to 1.0 × 10 −4 . With quenching and heating alternatively, each for about 50 τ , we let  Figure 7. Snapshots before (a) and after (b) the formation of aggregation, for a system similar to that in Fig. 6, except the fluid atoms are absent and the polymer sites are identical and subject to neither bending nor torsion potentials. The instantaneous temperature is T * P = 26.78 for (a) and T * P = 6.78 for (b), respectively. Reproduced from Fig. 2 of [68]. Figure 8. Snapshots before (a) and after (b) the formation of aggregation, for a system similar to that in Fig. 7, but the polymer chains are subject to the same bending and torsion angle potentials as those in the mixed system plotted in Fig. 6. The polymer temperature for (a) is T * P = 21.7 and for (b) is T * P = 6.42. Reproduced from Fig. 3 of [68].
the system free from possible residual local structures favored by the original angle potentials. We then let the system ready to relax. We also consider the case of pure polymers with zero (K b = K t =0) or very small (K b = K t = 10 −4 ) angle potentials in absence of fluid. It is found that, in all cases, very small or zero angle potentials render the systems to undergo slow quenching processes, as a dynamical system driven by a low temperature attractor caused by the numerical sink. Such scenario is also true for other cases considered in this subsection, as soon as the strengths of angle potentials are tiny. Since integration of time step δt is relevant in determining the time evolution of the aggregation process, we will consistently use two major integration time steps (δt) 2 = 2.5 × 10 −5 and (δt) 4 = 1×10 −4 . Over the beginning a change (on the strengths of angle potential, or on the lengths of polymer chains, etc...) being made, we use the smaller one (δt) 2 and turn to the large We examine the snapshots of the three cases: the mixed system of polymers and fluids (Fig. 6), the pure system where the monomers at linker sites are switched into ordinary monomers and K b and K t are set to 0 (Fig. 7), and the pure system where the monomers at linker sites are switched into ordinary monomers and K b and K t are kept at 10 −4 (Fig. 8) [68]. In all of three cases, polymer chains aggregate as K b and K t become 0 or very small.

Polymer chains with monomers connected by springs
To check whether the obtain q-statistics with q > 1 is an artifact due to the RATTLE scheme [75] or rigid-bond constraints between neighboring monomers, we used the method of ordinary MD simulations to study the relaxation process in a system of polymer chains and Lennard-Jones molecules, in which each polymer chain consists of n monomers labelled by integer i = 1, . . . , n from one end to the other; the monomers at the sites i and j = i + 1 (1 ≤ i < n) are connected by harmonic potentials The coupling constant k spring is k s = 10 s × k 0 , with s = 0, 1, 2, 3, 4 for the five systems, respectively. Please note that the interactions between neighboring monomers of the form of Eq. (8) has been used in the Go-like model for proteins [83] and also in a RNA model [84]. L-J interactions, and bending-angle and torsional-angle dependent interactions are the same as those described in previous subsections. The instantaneous temperatures of the monomers in polymer chains, T * P , and the fluid, T * F , were initially different, but they vary in time and their ratio Γ * = T * P /T * F approached to a constant during the relaxation process. The velocity distributions of monomers in the relaxation process follow q-statistics [77] with q ≥ 1. The value of q and the limiting ratio of Γ * depend on k spring ; in the strong strength limit, they approach those for the system, in which each polymer chain consists of monomers connected by rigid bonds; in the weak strength limit, Γ * and q approach 1 corresponding to MB distribution for velocities of monomers. Typical simulation results are shown in Figs. 5, 6 and 7 of [67]. Such results indicate that there is a continuous increase of q as one increases the strength of springs (between monomers) from the smaller to the larger values, and the result q > 1 in previous subsections is not an artifact of the RATTLE scheme [75] or the rigid-bond constraints.
It was found that when neighboring monomers of the spring-connected polymer chains have small or zero bending-angle and torsion-angle dependent potentials, then the polymer chains also aggregate [69]. When K b = K t = 0.1, polymer chains do not aggregate even at low temperatures. Thus K b and K t play important role for the aggregation of polymer chains.
In summary, the molecular dynamics was used to simulate various systems of polymer chains and L-J molecules; the neighboring monomers along a polymer chain are connected by rigid bonds [66] or spring of strength k spring [67]. It was found that during the relaxation process the velocity distributions of monomers in a wide range of simulation time can be well described by Tsallis q-statistics [77] with a generalized Maxwell-Boltzmann distribution (GMBD) [78,79,80] and a single scaling function. The value of q is related to the conformation constraining potential, the interactions with background fluid, the destruction of chain homogeneity or the value of k spring . Such study is useful for understanding microscopic origin of q-statistics [81,74,85]. The results suggest that complex non-equilibrium systems can still have simple scaling behavior and inspire many interesting problems for further studies.
The system studied in [66] has slow relaxation, similar to that of a spin-glass model at low temperatures [86]. The glassy behavior has also been found in the collagen fibril [87,88]. It seems Conformations and the transformation pathways of the global-and localminimum free-energy states A to N of the system with two polyQ-peptides and 3129 water molecules. The peptide conformations were created using RasMol [91]. (a) and (b) were reproduced from Fig. 2 and Fig. 4 of [15].
that the glassy behavior of bio-polymers is the key factor for a biological system to maintain in a non-equilibrium state [23,51]. It was also found that when neighboring monomers of a polymer chain have small or zero bending-angle and torsion-angle dependent potentials, then the polymer chains tend to aggregate [68,69,70]. The aggregation of proteins in human brains can result in neurodegenerative diseases, such as Alzheimer's disease [89,6,7], Huntington's disease [8], Parkinson's disease [16,17], etc. Thus it is of interest to know relaxation and aggregation behaviors of polymer chains in non-equilibrium situation.

All atom models for protein aggregation
In this section, we briefly review recent developments for simulations of all atom models for protein aggregation.  Table 1. Probability and free energy difference of each secondary structure in the one polyQpeptide system.

Oligomerization of peptides LVEALYL and RGFFYT and their binding affinity to insulin.
Recently it has been proposed a model for fibrils of human insulin in which the fibril growth proceeds via stacking LVEALYL (fragment 11-17 from chain B of insulin) into pairs of tightly interdigitated b-sheets. The experiments have also shown that LVEALYL has high propensity to self-assembly and binding to insulin. This necessitates study of oligomerization of LVEALYL and its binding affinity to full-length insulin. Using the all-atom simulations with Gromos96 43a1 force field and explicit water it is shown that LVEALYL can aggregate. Theoretical estimation of the binding free energy of LVEALYL to insulin by the molecular mechanic Poisson-Boltzmann surface area method reveals its strong binding affinity to chain B, implying that, in agreement with the experiments, LVEALYL can affect insulin aggregation via binding mechanism. We predict that, similar to LVEALYL, peptide RGFFYT (fragment B22-27) can self-assemble and bind to insulin modulating its fibril growth process. The binding affinity of RGFFYT is shown to be comparable with that of LVEALYL [90].

MD simulations of one and two polyglutamine peptides in explicit water molecules
Aggregation of polyglutamine peptides with β-sheet structures is related to some important neurodegenerative diseases such as Huntingtons disease. However, it is not clear how polyglutamine peptides form the β-sheets and aggregate. To understand this problem, we performed all-atom replica-exchange molecular dynamics simulations of one and two polyglutamine peptides with 10 glutamine residues in explicit water molecules. Conformations and the transformation pathways of the global-and local-minimum free-energy states A to I of the the system with one polyQ-peptide and 1611 water molecules was shown in Fig. 9(a); Conformations and the transformation pathways of the global-and local-minimum free-energy states A to N of the system with two polyQ-peptides and 3129 water molecules was shown in Fig. 9(b).
Our results show that two polyglutamine peptides mainly formed helix or coil structures when they are separated, as in the system with one-polyglutamine peptide. As the interpeptide distance decreases, the intrapeptide β-sheet structure sometimes appear as an intermediate state, and finally the interpeptide β-sheets are formed. We also find that the polyglutamine dimer tends to form the antiparallel β-sheet conformations rather than the parallel β-sheet, which is consistent with previous experiments and a coarse-grained molecular dynamics simulation [15]. Table 1 shows that in the one polyQ-peptide system, the random-coil has the highest probability of 64.9±8.9, the helix has the second probability of 29.4±4.6, and intra-β-sheet has probability of only 6.3±5.1 Table 2 shows that in the two polyQ-peptide system, inter-βsheet has the highest probability of 50.5±10.2, random-coil has the probability of 25.3±10.7, the helix has the probability of 23.1±3.2, and intra-β-sheet has the probability of 11.8±5.7. Thus the most stable conformation of two polyQ-peptide system, i.e. inter-β-sheet, is different from the most stable conformation of one polyQ-peptide system, i.e. the random-coil.  Table 2. Probability and free energy difference of each secondary structure in the two polyQpeptide system.

MD approach to effect of Taiwan mutation (D7H) on structures of amyloid peptides
Recent experiments have shown that the Taiwan mutation (D7H) slows the fibril formation of amyloid peptides Aβ40 and Aβ42 [92]. Motivated by this finding, we have studied the influence of D7H mutation on structures of Aβ peptide monomers using the replica exchange molecular dynamics simulations with OPLS force field and implicit water model. Our study reveals that the mechanism behind modulation of aggregation rates is associated with decrease of β-content and dynamics of the salt bridge D23-K28. Estimating the bending free energy of this salt bridge, we have found that, in agreement with the experiments, the fibril formation rate of both peptides Aβ40 and Aβ42 is reduced about two times by mutation [93].

Searching potential drugs for AD
From database of about 32364 natural compounds [94], the Lipinskis rule of five [95] was applied to reduce the number of compounds from 32364 to 3699 which was further studied by the docking method.
A ranking of ligands by their binding energies to two receptors [96,97] was made and the 10 top leads were listed on Table S1 of the Supporting Information (SI) in [98]. Among such 10 top leads only Dihydrochalcone could be purchased in the market.
. The binding affinity of this ligand (Dihydrochalcone) to amyloid beta (Aβ) fibril has been thoroughly studied by computer simulation and experiment. Using the Thioflavin T (ThT) assay we had obtained the inhibition constant IC50≈ 2.46µM . This result is in good agreement with the estimation of the binding free energy obtained by the molecular mechanic-Poisson Boltzmann surface area method and all-atom simulation with the force field CHARMM 27 and water model TIP3P. Cell viability assays indicated that Dihydrochalcone could effectively reduce the cytotoxicity induced by Aβ. Thus, both in silico and in vitro studies show that Dihydrochalcone is a potential drug for the Alzheimers disease [98].
Dihydrochalcone is a compound from Chinese medicine Xuejie (English Dragon's Blood; Latin Draconis Sanguis). According to Chinese historical record, Xuejie was already used as a medicine in Ming Dynasty. Dihydrochalcone is also the active ingredient of a traditional herbal tea in Africa.

Summary and discussion
It is pointed out in Section 1 that universality and scaling are key concepts developed from the study of phase transitions and critical phenomena in physical critical systems. Using computing packages developed by us [99], we have found a universal volume to surface area for proteins in Protein Data Bank [100]. We have used simple lattice models and polymer chain model to understand some key factors for protein aggregation. In the next step, it is of interest to develop universal theory for protein aggregation.  Figures 4 and 5 were obtained by molecular dynamics simulation and the simulated systems still do not reach equilibrium. It is of interest to use other simulation methods, such as parallel tempering which can allow the simulated systems to reach equilibrium more quickly, to study the same polymer systems to check whether in such cases the velocity distributions of monomers in polymer chains still deviate from Maxwell-Boltzmann distribution.
It had been proposed that protein aggregation is due to protein misfolding [101,102]. However, Figures 1(a) and 2(a) show that peptide chains aggregate with the conformation of the first excited state of the one chain system rather than the ground state conformation. Figure 9, Table 1, and Table 2 show that most stable conformation of one chain system is different from the most stable conformation of two-chains system. In such cases, protein aggregation is not due to protein mis-folding.
We have found Dihydrochalcone as a potential drug for AD [98] from in silico and in vitro study. It is of interest to use animal model to study the effect of Dihydrochalcone to prevent AD. Table S1 in SI of [98] list other 9 compounds besides Dihydrochalcone. These compounds are also potential drugs for AD. It is of interest to study the effect of such compounds to reduce protein aggregation via in vitro and in vivo study.