Spike-based local synaptic plasticity: A survey of computational models and neuromorphic circuits

Understanding how biological neural networks carry out learning using spike-based local plasticity mechanisms can lead to the development of powerful, energy-efficient, and adaptive neuromorphic processing systems. A large number of spike-based learning models have recently been proposed following different approaches. However, it is difficult to assess if and how they could be mapped onto neuromorphic hardware, and to compare their features and ease of implementation. To this end, in this survey, we provide a comprehensive overview of representative brain-inspired synaptic plasticity models and mixed-signal CMOS neuromorphic circuits within a unified framework. We review historical, bottom-up, and top-down approaches to modeling synaptic plasticity, and we identify computational primitives that can support low-latency and low-power hardware implementations of spike-based learning rules. We provide a common definition of a locality principle based on pre- and post-synaptic neuron information, which we propose as a fundamental requirement for physical implementations of synaptic plasticity. Based on this principle, we compare the properties of these models within the same framework, and describe the mixed-signal electronic circuits that implement their computing primitives, pointing out how these building blocks enable efficient on-chip and online learning in neuromorphic processing systems.


Introduction
The ability of biological systems to learn and adapt to their environment is key for survival. This learning ability is expressed mainly as the change in strength of the synapses that connect neurons, to adapt the structure and function of the underlying network. The neural substrate of this ability has been studied and modeled intensively, and many brain-inspired learning rules have been proposed (McNaughton et al. 1978, Gerstner et al. 1993, Stuart & Sakmann 1994, Markram et al. 1995). The vast majority, if not all, of these biologically plausible learning models rely on local plasticity mechanisms, where locality is a fundamental computational principle, naturally emerging from the physical constraints of the system. The principle of locality in synaptic plasticity presupposes that all the information a synapse needs to update its state (e.g., its synaptic weight) is directly accessible in space and immediately accessible in time. This information is based on the activity of the pre- and post-synaptic neurons to which the synapse is connected, but not on the activity of other neurons to which the synapse is not physically connected (Zenke & Neftci 2021).
From a biological perspective, locality is a key paradigm of cortical plasticity that supports self-organization, which in turn enables the emergence of consistent representations of the world (Varela et al. 1991). From the hardware development perspective, the principle of locality is a key paradigm for the design of spike-based plasticity circuits integrated in embedded systems, in order to enable them to learn online, efficiently and without supervision. This is particularly important in recent times, as the rapid growth of wearable and specialized autonomous sensory-processing devices brings new challenges in analysis and classification of sensory signals and streamed data at the edge. Consequently, there is an increasing need for online learning circuits that have low latency, are low power, and do not need to be trained in a supervised way with large labeled data-sets. As standard von Neumann computing architectures have separated processing and memory elements, they are not well suited for simulating parallel neural networks, they are incompatible with the locality principle, and they require a large amount of power compared to in-memory computing architectures. In contrast, neuromorphic architectures typically comprise parallel and distributed arrays of synapses and neurons that can perform computation using only local variables, and can achieve extremely low energy consumption. In particular, analog neuromorphic circuits operate the transistors in the weak inversion regime using extremely low currents (ranging from pico-Amperes to micro-Amperes), small voltages (in the range of a few hundreds of milli-Volts), and use the physics of their devices to directly emulate neural dynamics (Mead 1990). The spike-based learning circuits implemented in these architectures can exploit the precise timing of spikes and consequently take advantage of the high temporal resolutions of event-based sensors. Furthermore, the sparse nature of the spike patterns produced by neuromorphic sensors and processors can give these devices even higher gains in terms of energy efficiency.
Given the requirements to implement learning mechanisms using limited resources and local signals, animal brains still remain one of our best sources of inspiration, as they have evolved to solve similar problems under similar constraints, adapting to changes in the environment and improving their survival chances (Hofman 2015). Bottom-up, brain-inspired approaches to implement learning with local plasticity can be very challenging for solving real-world problems, because of the lack of a clear methodology for choosing specific plasticity rules, and the inability to perform global function optimization (as in gradient back-propagation) (Eshraghian et al. 2021). However, these approaches have the potential to support massively parallel and distributed computations and can be used for adaptive online systems at a minimum energy cost (Neftci et al. 2019). Recent work has explored the potential of brain-inspired self-organizing neural networks with local plasticity mechanisms for spatio-temporal feature extraction (Bichler et al. 2012), unsupervised learning (Diehl & Cook 2015, Iyer & Basu 2017, Hazan et al. 2018, Kheradpisheh et al. 2018, Khacef et al. 2020b), multi-modal association (Khacef et al. 2020a, Rathi & Roy 2021), adaptive control (DeWolf et al. 2020), and sensory-motor interaction (Lallee & Dominey 2013, Zahra & Navarro-Alarcon 2019).
Some of the recently proposed models of plasticity have introduced the notion of a "third factor", in addition to the two factors used in learning rules, derived from local information present at the pre- and post-synaptic site. In these three-factor learning rules, the local variables are used to determine the potential change in the weight (e.g., by using a local eligibility trace), but the change in the weight is applied only when the additional third factor is presented. This third factor represents a feedback signal (e.g., reward, punishment, or novelty) which could be implemented in the brain for example by diffusion of neuromodulators, such as dopamine (Kuśmierz et al. 2017, Gerstner et al. 2018). While this feedback signal is locally accessible to the synapse, it is not produced directly at the pre- or post-synaptic site. Therefore, these three-factor learning rules violate the principle of locality that we consider in this review.
In the next section, we provide an overview of synaptic plasticity from a historical, experimental, and theoretical perspective, with a focus on compatibility with physical emulation on Complementary Metal-Oxide-Semiconductor (CMOS) systems. We then present a selection of representative spike-based synaptic plasticity models that adhere to the principle of locality and that can therefore be implemented in neuromorphic hardware. We then present analog CMOS circuits that implement the basic mechanisms present in the rules discussed. As different implementations have different characteristics that impact the type and number of elements that use local signals, for each target implementation, we assess the principle of locality taking into account the circuits' physical constraints. We conclude by proposing steps to reach a unified plasticity framework and presenting the challenges that still remain open in the field.

A brief history of plasticity
The quest for understanding learning in human beings is a very old one, as the process of acquiring new skills and knowledge was already a subject of debate among philosophers back in Ancient Greece, where Aristotle introduced the notion of the brain as a blank slate (or tabula rasa) at birth that was then developed through education (Markram et al. 2011). This was in contrast to the idea of Plato, his teacher, who believed the brain was pre-formed in the "heavens" and then sent to earth to join the body. In modern times, the question of nature versus nurture is still being debated. On one side is the view, proposed by modern philosophers such as Locke (1689), that we are born without preconceptions and our brain is molded by experience. On the other side are the studies that emphasize the importance of pre-defined structure in the nervous system and in neural networks to guide and facilitate the learning process (Binas et al. 2015, Hawkins et al. 2017, Suárez et al. 2021).
In the latter half of the nineteenth century, learning and memory were linked for the first time to "junctions between cells" by Bain (1873), even before the discovery of the synapse. In 1890, the psychologist William James postulated a mechanism for associative learning in the brain: "When two elementary brain-processes have been active together or in immediate succession, one of them, on reoccurring, tends to propagate its excitement into the other" (James 1890). In the same period, neuroanatomists discovered the two main components of the brain: neurons and synapses. They postulated that the brain is composed of separate neurons (Waldeyer 1891), and that long-term memory requires the growth of new connections between existing neurons (Ramón y Cajal 1894). These connections then became known as "synapses" (Sherrington 1897). At the end of the nineteenth century, synapses were already thought to control and change the flow of information in the brain, thus being the substrate of learning and memory (Markram et al. 2011).
The first half of the twentieth century confirmed this hypothesis through various studies on the chemical synapses and the direction of information flow among neurons, going from the pre-synaptic axons to the post-synaptic dendrites. Neural processing was associated with the integration of synaptic inputs in the soma, and the emission of an output spike once a certain threshold was reached, propagating along the axon. Donald Hebb combined earlier ideas and recent discoveries on learning and memory in his book "The Organization of Behavior". Similarly to the ideas of James 60 years earlier, Hebb published, in 1949, his formal postulates for the neural mechanisms of learning and memory: "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased" (Hebb 1949). Although Hebb stated that this idea is old, strengthening synapses (that is, increasing synaptic efficacy or weight) connecting coactive neurons has since been called "Hebbian plasticity". It is also called Long-Term Potentiation (LTP).
Even though Hebb wrote that "less strongly established memories would gradually disappear unless reinforced" through a slow "synaptic decay" (Hebb 1949), he did not provide an active mechanism for weakening synapses. Hence, the synaptic strengths or "weights" are unbounded and it is not possible to forget previously learned patterns to learn new ones. The first solution, proposed a few years later, was to keep the sum of synaptic weights in a neuron constant (Rochester et al. 1956). In 1982, Oja proposed a Hebbian-like rule (Oja 1982) that adds a "forgetting" parameter and solves the stability problem with a form of local multiplicative normalization for synaptic weights. In the same year, Bienenstock et al. (1982) proposed the Bienenstock-Cooper-Munro (BCM) learning rule where, during pre-synaptic stimulation, low-frequency activity of the post-synaptic neuron leads to Long-Term Depression (LTD) while high-frequency activity leads to LTP. This model was an important shift as it introduced the so-called homo-synaptic LTD, where the plasticity is determined by the post-synaptic spike rate with no requirement on the temporal order of spikes. The importance of the post-synaptic neuron in synaptic plasticity was further demonstrated by showing how post-synaptic sub-threshold depolarization can determine whether LTP or LTD is applied (Artola et al. 1990, Sjöström et al. 2001).
Time is inherently present in any associative learning since it only relies on co-occurring events. McNaughton et al. (1978) were the first to experimentally explore the importance of the pre- and post-synaptic spike timing in plasticity. Fifteen years later, Gerstner et al. (1993) hypothesized that these pre/post spike times contain more information for plasticity compared to spike rates. Their hypothesis would be confirmed by experiments conducted by Stuart & Sakmann (1994), who discovered that the post-synaptic spike back-propagates into the dendrites, as well as by Markram et al. (1995), who showed that a single spike leaves behind a Calcium trace of about 100 ms which is propagated back into the dendrites. These findings were highly influential in the field because they provided evidence that synapses have local access to the spike timings of the pre- and post-synaptic neurons. In their subsequent experiments, Markram et al. (1997) provided additional evidence that precise timing is important in neocortical neurons: they showed that a pre/post pairing with a time difference of 10 ms led to LTP, while the same time difference of 10 ms in an inverted post/pre pairing led to LTD. Larger time differences of 100 ms did not lead to any change in the synaptic weights. Almost concurrently, Bi & Poo (1998) performed similar experiments and found a 40 ms coincidence time window using paired recordings. These experiments showed that spike timing matters in addition to mean rates. This phenomenon was later formulated in a learning rule named Spike-Timing Dependent Plasticity (STDP) (Song et al. 2000).
In this respect, the Hebbian learning formula proposed by Shatz (1992) that "cells that fire together wire together" could be misleading, as Hebb's (1949) postulate is directional: "axon of cell A is near enough to excite a cell B", which may be interpreted as implicitly time-dependent since cell A has to fire before cell B. On the other hand, STDP was later found to only partially explain more elaborate learning protocols, which showed that while both LTP and LTD are compatible with STDP at low frequencies, only LTP occurs at high frequencies regardless of the temporal order of spikes (Sjöström et al. 2001). As pair-based STDP models do not reproduce the frequency dependence of synaptic plasticity, Pfister & Gerstner (2006) proposed the Triplet-based STDP (T-STDP) rule, where LTP and LTD depend on a combination of three pre- and post-synaptic spikes (either two pre- and one post-synaptic spike, or one pre- and two post-synaptic spikes). Both pair-based and triplet-based STDP were then shown to be able to reproduce BCM-like behavior (Gjorgjieva et al. 2011). Furthermore, the same frequency-dependent experiments (Sjöström et al. 2001) showed that the state of the post-synaptic membrane voltage is important for driving LTP or LTD under the same pre/post timing conditions, confirming previous studies on the role of the neuron membrane voltage in plasticity (Artola et al. 1990). Therefore, these findings supported the computational plasticity models that depend on the arrival of the pre-synaptic spike and the voltage of the post-synaptic membrane (Fusi et al. 2000, Brader et al. 2007, Clopath et al. 2010), and which were also compatible with the STDP model. The more recent three-factor learning rules aim at bridging the gap between the different time scales of learning, specifically from pre-post spike timings (milliseconds) to behavioral time scales (seconds) (Gerstner et al. 2018).
Today, after more than two millennia of questioning, experimenting and more recently modeling, synaptic plasticity is still not fully understood and many questions remain unanswered. Nevertheless, it is clear that multiple forms of plasticity and time-scales co-exist in the synapse and in the whole brain (Nelson et al. 2002). They link to each other by sharing locality as a fundamental computational principle.

Experimental perspective
Synaptic weights are correlated with various elements in biological synapses (Bartol et al. 2015) such as the number of docked vesicles in the pre-synaptic terminal (Harris & Sultan 1995), the area of the pre-synaptic active zone (Schikorski & Stevens 1997), the dendritic spine head size (Harris & Stevens 1989, Hering & Sheng 2001), the amount of released transmitters (Murthy et al. 2001, Branco et al. 2008, Ho et al. 2011), the area of the post-synaptic density (Lisman & Harris 1994), and the number of AMPA receptors (Bourne et al. 2013, Biology of Synaptic Plasticity 2020). Synaptic plasticity is known to be heterogeneous across different types of synapses (Abbott & Nelson 2000, Bi & Poo 2001), and there is no unified experimental protocol to confront the different observations. Here we present the experimental results that led to the bottom-up definition of multiple plasticity rules.
Spike-timing dependence. Multiple experiments have been performed to demonstrate the dependence of plasticity on the exact spike times of the pre- and post-synaptic neurons (Markram et al. 1997, Bi & Poo 1998, Sjöström et al. 2001). From a computational point of view, these experiments led to the proposal of the STDP learning rule (Abbott & Nelson 2000, Markram et al. 2011), and its variants, such as T-STDP (Pfister & Gerstner 2006). Typically in these experiments, a pre-synaptic neuron is driven to fire shortly before or shortly after a post-synaptic one, by injecting a current pulse into the specific soma at the desired time. Specifically, these pre-post and post-pre pairings are repeated 50 to 100 times at a relatively low frequency of about 1 Hz to 10 Hz (Sjöström & Gerstner 2010). Experimental results reveal synaptic plasticity mechanisms that are sensitive to the difference in spike times at the time scale of milliseconds (Gerstner et al. 1993). LTP is observed when the pre-synaptic spike occurs within 10 ms before the post-synaptic spike is produced, while LTD is observed when the order is reversed (Markram et al. 1997, Bi & Poo 1998). In biology, this precise spike timing dependence could be supported by local processes in the synapses that have access to both the timing information of pre-synaptic spikes and to the post-synaptic spike times, either by sensing their local membrane voltage changes or by receiving large depolarizations caused by output spikes that are back-propagated into the dendrite (Stuart & Sakmann 1994).
Post-synaptic membrane voltage dependence. Another feature of synaptic plasticity is its dependence on the post-synaptic neuron membrane voltage (Artola et al. 1990). To study this dependence, the pre-synaptic neuron is driven to fire while the post-synaptic neuron is clamped to a fixed voltage. The clamped voltage level determines the outcome of the synaptic changes: if the voltage is only slightly above the resting potential of the neuron, then LTD is observed, while if it is higher, then LTP is observed (Artola et al. 1990, Ngezahayo et al. 2000). These experiments show that post-synaptic spikes are not strictly necessary to induce long-term plasticity (Lisman & Spruston 2005, Lisman & Spruston 2010). Moreover, even in the presence of a constant pre/post timing (10 ms) at low frequencies (0.1 Hz), the post-synaptic membrane voltage determines whether LTP or LTD can be induced (Sjöström et al. 2001, Sjöström & Gerstner 2010). These findings suggest that the post-synaptic membrane voltage might be more important than the pre/post spike timing for synaptic plasticity.
Frequency dependence. While both spike-timing and post-synaptic membrane voltage dependence are observed in experimental protocols when relatively low spike frequencies are used, at high frequencies LTP tends to dominate over LTD regardless of precise spike timing (Sjöström et al. 2001). This spike-rate dependence, which is correlated with the Calcium concentration of the post-synaptic neuron (Sjöström et al. 2001), is captured by multiple learning rules such as the BCM (Bienenstock et al. 1982) or the T-STDP (Pfister & Gerstner 2006) rule. In these rules, high spike rates produce a strong and rapid increase in Calcium concentration that leads to LTP, while low spike rates produce a modest and slow increase in Calcium concentration that decays over time and leads to LTD (Bliss & Collingridge 1993).

Theoretical perspective
Theoretical investigations of plasticity have yielded crucial insights in computational neuroscience.
Here, we summarize the fundamental theoretical and practical requirements for long-term synaptic plasticity.
Sensitivity to pre-post spike correlations. Synaptic plasticity has to adjust the synaptic weights depending on the correlation between the pre- and post-synaptic neurons (Hebb 1949). Depending on how information is encoded, this can be achieved using spike times, spike rates or both (Brette 2015). It is important to note that the objective behind the detection of correlation is to detect causality, which would ensure a better prediction (Vigneron & Martinet 2020). Even if correlation does not imply causality (Brette 2015), correlation can be considered as a tangible trace for causality in learning.
Selectivity to different patterns. In supervised, semi-supervised and reinforcement learning, post-synaptic neurons are driven by a specific teacher signal that forces target neurons to spike and other neurons to remain silent, allowing them to become selective to the pattern applied in input (Brader et al. 2007). In unsupervised learning, the selectivity emerges from competition among neurons (Kohonen 1990, Olshausen & Field 1996), as in Winner-Take-All (WTA) networks (Chen 2017). By associating local plasticity with a WTA network, it is possible to create internal models of the probability distributions of the input patterns. This can be interpreted as an approximate Expectation-Maximization algorithm for modeling the input data (Nessler et al. 2009). Recently, the combination of STDP with WTA networks has been successfully used for solving a variety of pattern recognition problems in both supervised (Chang et al. 2018) and unsupervised scenarios (Bichler et al. 2012, Diehl & Cook 2015, Iyer & Basu 2017, Rathi & Roy 2021).
Stability of synaptic memory. Long-term plasticity requires continuous adaptation to new patterns but it also requires the retention of previously learned patterns. As any physical system has a limited storage capacity, the presentation of new experiences will continuously generate new memories that would eventually lead to saturation of the capacity. When presenting new experiences, the stability (and retrieval) of old memories is a major problem in Artificial Neural Networks (ANNs). When learning of new patterns leads to the complete corruption or destruction of previously learned ones, the network undergoes catastrophic forgetting (Nadal et al. 1986, French 1999). Both catastrophic forgetting and continual learning are critical problems that need to be addressed for always-on neural processing systems, including artificial embedded processors applied to solving edge-computing tasks. The main challenge in always-on learning is not its resilience against time, but its resilience against ongoing activity (Fusi et al. 2005).
Different strategies can be used to find a good balance between plasticity and stability. A first solution is to introduce stochasticity in the learning process, for example by using Poisson distributed spike trains to represent input signals to promote plasticity, while promoting stability using a bi-stable internal variable that slowly drives the weight toward one of two possible stable states (Brader et al. 2007). As a result, only a few synapses will undergo an LTP or LTD transition for a given input, to progressively learn new patterns without forgetting previously learned patterns. A second solution is to have an intrinsic stop-learning mechanism to modulate learning and not change synaptic weights if there is enough evidence that the current input pattern has already been learned.
Depending on the particular pattern recognition problem to be solved and the learning paradigm (offline/online), specific properties can be more or less important.

Computational primitives of synaptic plasticity
In this work, we refer to "computational primitives of synaptic plasticity" as those basic plasticity mechanisms that make use of local variables. The following are the local variables that we consider:

Local variables
Pre- and post-synaptic spike traces: These are the traces generated at the pre- and post-synaptic site, triggered by the spikes of the corresponding pre- or post-synaptic neurons. They can be computed by either integrating the spikes using a linear operator in models and a low-pass filter in circuits, or by using non-linear operators/circuits. Figure 1 shows examples of both linear (denoted as "integrative") and non-linear (denoted as "capped") spike traces. In general, these traces represent the recent level of activation of the pre- and post-synaptic neurons. Depending on the learning rule, there might be one or more spike traces per neuron with different decay rates. The biophysical substrates of these traces can be diverse (Pfister & Gerstner 2006, Graupner & Brunel 2010), for example reflecting the amount of bound glutamate (Karmarkar & Buonomano 2002) or the number of N-Methyl-D-Aspartate (NMDA) receptors in an activated state (Senn et al. 2001). The post-synaptic spike traces could reflect the Calcium concentration mediated through voltage-gated Calcium channels and NMDA channels (Karmarkar & Buonomano 2002), the number of secondary messengers in a deactivated state of the NMDA receptor (Senn et al. 2001) or the voltage trace of a back-propagating action potential (Shouval et al. 2002).
Post-synaptic membrane voltage: The post-synaptic neuron's membrane potential is also a local variable, as it is accessible to all of the neuron's synapses.
These local variables are the basic elements that can be used to induce a change in the synaptic weight, which is reflected in the change of the post-synaptic membrane voltage that a pre-synaptic spike induces.

Spikes interaction
We refer to spike interactions as the number of spikes from the past activity of the neurons that are taken into account for the weight update. In particular, we distinguish two spike interaction schemes:

All-to-all: In this scheme, the spike trace is "integrative" and influenced, asymptotically, by the whole previous spiking history of the pre-synaptic neuron. The contribution of each spike is expressed in the form of a Dirac delta which should be integrated. Nevertheless, if the spikes are considered to be point processes for which their spike width is zero in the limit, the contribution of all spikes in Eq. (1) can be approximated as follows:

$$\frac{dX(t)}{dt} = -\frac{X(t)}{\tau} + \sum_i A\,\delta(t - t_i) \qquad (1)$$

where $\delta(t - t_i)$ is a spike occurring at time $t_i$, $\tau$ is the exponential decay time constant and $A$ is the jump value such that at the moment of a spike event, the spike trace jumps by $A$. In addition to being a good first-order model of synaptic transmission, this transfer function can be easily implemented in electronic hardware using low-pass filters. Indeed, the trace $X(t)$ represents the online estimate of the neuron's mean firing rate (Dayan & Abbott 2001).

Nearest spike: This is a non-linear mode in which the spike trace is only influenced by the most recent pre-synaptic spike. It is implemented by means of a hard bound that limits the maximum value of the trace, such that if the jumps reach it, the trace is "capped" at that bound value. It is expressed in Eq. (2):

$$\frac{dX(t)}{dt} = -\frac{X(t)}{\tau} + \sum_i \left(A - X(t)\right)\delta(t - t_i) \qquad (2)$$

where $A$ is both the jump value and the hard bound, such that at the moment of a spike event, the spike trace jumps to $A$. This means that the spike trace gives an online estimate of the time since the last spike.
Therefore, the jump and bound parameters control the sensitivity of the learning rule to the spike timing and rate combined (all-to-all) or to the spike timing alone (nearest spike), while the decay time constant controls how fast the synapse forgets about these activities. Further spike interaction schemes are possible, for example by adapting the nearest spike interaction so that spike interactions producing LTP would dominate over those producing LTD.
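The difference between the two spike interaction schemes can be illustrated with a minimal discrete-time sketch. It is not taken from any of the surveyed models: the function name `spike_trace`, the fixed time step `dt`, and all parameter values are assumptions chosen for illustration only. The trace decays exponentially with time constant `tau`; on a spike it either jumps by `jump` (all-to-all) or is set to `jump` (nearest spike).

```python
import math


def spike_trace(spike_times, t_end, tau=0.02, jump=1.0,
                mode="all_to_all", dt=0.001):
    """Simulate a spike trace on a fixed time grid (times in seconds).

    mode="all_to_all": integrative trace, each spike adds `jump`.
    mode="nearest":    capped trace, each spike resets the trace to `jump`.
    All names and default values are illustrative assumptions.
    """
    if mode not in ("all_to_all", "nearest"):
        raise ValueError("mode must be 'all_to_all' or 'nearest'")
    n_steps = int(round(t_end / dt))
    # Map spike times onto grid indices.
    spike_steps = {int(round(t / dt)) for t in spike_times}
    decay = math.exp(-dt / tau)  # exact exponential decay over one step
    x = 0.0
    trace = []
    for step in range(n_steps):
        x *= decay
        if step in spike_steps:
            if mode == "all_to_all":
                x += jump   # trace jumps BY the jump value
            else:
                x = jump    # trace jumps TO the jump value (hard bound)
        trace.append(x)
    return trace


# Three spikes in quick succession (10 ms, 15 ms, 20 ms).
spikes = [0.010, 0.015, 0.020]
integrative = spike_trace(spikes, t_end=0.05, mode="all_to_all")
capped = spike_trace(spikes, t_end=0.05, mode="nearest")
```

With closely spaced spikes the integrative trace accumulates above the jump value and so carries rate information, whereas the capped trace never exceeds the jump value and only encodes the time elapsed since the last spike.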

Update trigger
In most synaptic plasticity rules, the weight update is event-based and happens at the moment of a pre-synaptic spike (e.g., Brader et al. 2007), a post-synaptic spike (e.g., Diehl & Cook 2015) or both pre- and post-synaptic spikes (e.g., Song et al. 2000). This event-based paradigm is particularly interesting for hardware implementations, as it exploits the spatio-temporal sparsity of the spiking activity to reduce the energy consumption with fewer updates. On the other hand, some rules use a continuous update (e.g., Graupner & Brunel 2012), arguing for more biological plausibility, or a mixture of both, with for example depression at the moment of a pre-synaptic spike and continuous potentiation (e.g., Clopath et al. 2010).

Synaptic weights
The synaptic weight represents the strength of a connection between two neurons. Synaptic weights have three main characteristics:
(i) Type: Synaptic weights can be continuous, with full floating-point resolution in software, or with fixed/limited resolution (binary in the extreme case). Both cases can be combined by using fixed resolution synapses (e.g., binary synapses), which however have a continuous internal variable that determines if and when the synapse undergoes a low-to-high (LTP) or high-to-low (LTD) transition, depending on the learning rule.
(ii) Bistability: In parallel to the plastic changes that update the weights, on their weight update trigger conditions, synaptic weights can be continuously driven to one of two stable states, depending on additional conditions on the weight itself and on its recent history.These bistability mechanisms have been shown to protect memories against unwanted modifications induced by ongoing spontaneous activity (Brader et al. 2007) and provide a way to implement stochastic selection mechanisms.
(iii) Bounds: In any physical neural processing system, whether biological or artificial, synaptic weights have bounds: they cannot grow to infinity.Two types of bounds can be imposed on the weights: (1) hard bounds, in rules with additive updates independent of weight, or (2) soft bounds, in weight-dependent updates (for example, multiplicative) rules that drive the weights toward the bounds asymptotically (Morrison et al. 2008).
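The distinction between hard and soft bounds can be made concrete with a small sketch. The helper names `update_hard` and `update_soft`, the learning rate `eta`, and the weight range are assumptions introduced here for illustration; they are not drawn from a specific model in this survey. The hard-bound variant applies a weight-independent additive step and clips the result, while the soft-bound variant scales each step by the remaining distance to the bound.

```python
def update_hard(w, dw, w_min=0.0, w_max=1.0):
    """Additive, weight-independent update followed by clipping (hard bounds)."""
    return min(w_max, max(w_min, w + dw))


def update_soft(w, eta, potentiate, w_min=0.0, w_max=1.0):
    """Weight-dependent (multiplicative) update (soft bounds).

    The step shrinks as the weight approaches the bound, so for
    0 < eta < 1 the bound is approached asymptotically.
    """
    if potentiate:
        return w + eta * (w_max - w)
    return w - eta * (w - w_min)


# Repeated potentiation from the same starting weight.
w_hard, w_soft = 0.5, 0.5
for _ in range(20):
    w_hard = update_hard(w_hard, dw=0.1)
    w_soft = update_soft(w_soft, eta=0.1, potentiate=True)
```

Under repeated potentiation the hard-bounded weight saturates at the upper bound after a few steps, while the soft-bounded weight keeps moving toward it in ever smaller steps.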

Stop-learning
An intrinsic mechanism to modulate learning and automatically switch from the training mode to the inference mode is important, especially in an online learning context. This "stop-learning" mechanism can be either implemented with a global signal related to the performance of the system, as in reinforcement learning, or with a local signal produced in the synapses or in the soma. For example, a local variable that can be used to implement stop-learning could be derived from the post-synaptic neuron's membrane voltage (Clopath et al. 2010, Albers et al. 2016) or spiking activity (Brader et al. 2007, Graupner & Brunel 2012).

Models of synaptic plasticity
We present a representative set of spike-based synaptic plasticity models, summarize their main features, and explain their working principles. Table 1 shows a direct comparison of the computational principles used by the relevant models, and Tables 14 and 15 show the main variables common to the different models.

Spike-Timing Dependent Plasticity (STDP) (Song et al. 2000) was proposed to model how pairs of pre-post spikes interact based solely on their timing. It is one of the most widely used synaptic plasticity algorithms in the literature.

$$\Delta w = \begin{cases} A_+ \exp\left(\Delta t / \tau_+\right) & \text{if } \Delta t < 0 \\ -A_- \exp\left(-\Delta t / \tau_-\right) & \text{if } \Delta t \geq 0 \end{cases} \qquad (3)$$

The synaptic weight is updated according to Eq. (3), whose variables are described in Tab. 2, with $\Delta t = t_{pre} - t_{post}$. If a post-synaptic spike occurs after a pre-synaptic one ($\Delta t < 0$), potentiation is induced (triggered by the post-synaptic spike). In contrast, if a pre-synaptic spike occurs after a post-synaptic spike ($\Delta t \geq 0$), depression occurs (triggered by the pre-synaptic spike). The time constants $\tau_+$ and $\tau_-$ determine the time window in which the spike interaction leads to changes in synaptic weight. As shown in Tab. 1, STDP is based on local pre- and post-spike traces with nearest spike interaction, meaning that the spike traces are capped. Fig. 2 illustrates how STDP is implemented using these spike traces for online learning.

Triplet-based STDP (T-STDP) (Pfister & Gerstner 2006) extends pair-based STDP with interactions among three spikes. Specifically, the authors introduce a triplet depression term (i.e., 2-pre and 1-post) and a triplet potentiation term (i.e., 1-pre and 2-post). They do this by adding four additional variables that they call detectors: $r$ and $o$. The $r_1$ and $r_2$ detectors are pre-synaptic spike traces which increase whenever there is a pre-synaptic spike and decrease back to zero with their individual intrinsic time constants. Similarly, the $o_1$ and $o_2$ detectors increase on post-synaptic spikes and decrease back to zero with their individual intrinsic time constants. The weight changes are defined in Eqs. (4), whose variables are described in Tab. 3.

$$\begin{aligned} w(t) &\rightarrow w(t) - o_1(t)\left[A_2^- + A_3^-\, r_2(t - \epsilon)\right] && \text{if } t = t_{pre} \\ w(t) &\rightarrow w(t) + r_1(t)\left[A_2^+ + A_3^+\, o_2(t - \epsilon)\right] && \text{if } t = t_{post} \end{aligned} \qquad (4)$$
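As an illustration of how a triplet rule of this kind can be evaluated online with four detector traces, the following is a minimal event-driven sketch. It follows the structure described by Pfister & Gerstner (2006), but the class name `TripletSTDP`, the detector names (`r1`, `r2` for pre-synaptic and `o1`, `o2` for post-synaptic traces), and every amplitude and time constant are placeholder assumptions, not fitted values. Weight bounds are deliberately omitted. Setting both triplet amplitudes to zero reduces the sketch to a pair-based rule with all-to-all interaction.

```python
import math


class TripletSTDP:
    """Event-driven triplet STDP sketch with all-to-all detector traces.

    Parameter values are illustrative placeholders (times in seconds).
    """

    def __init__(self, w=0.5,
                 a2_plus=5e-3, a3_plus=5e-3, a2_minus=5e-3, a3_minus=5e-3,
                 tau_r1=0.02, tau_r2=0.1, tau_o1=0.03, tau_o2=0.1):
        self.w = w
        self.a2_plus, self.a3_plus = a2_plus, a3_plus
        self.a2_minus, self.a3_minus = a2_minus, a3_minus
        self.tau_r1, self.tau_r2 = tau_r1, tau_r2
        self.tau_o1, self.tau_o2 = tau_o1, tau_o2
        self.r1 = self.r2 = self.o1 = self.o2 = 0.0
        self.t_last = 0.0

    def _decay_to(self, t):
        """Decay all four detectors from the last event time to time t."""
        dt = t - self.t_last
        self.r1 *= math.exp(-dt / self.tau_r1)
        self.r2 *= math.exp(-dt / self.tau_r2)
        self.o1 *= math.exp(-dt / self.tau_o1)
        self.o2 *= math.exp(-dt / self.tau_o2)
        self.t_last = t

    def pre_spike(self, t):
        """Depression: gated by o1, modulated by r2 read BEFORE its own jump."""
        self._decay_to(t)
        self.w -= self.o1 * (self.a2_minus + self.a3_minus * self.r2)
        self.r1 += 1.0
        self.r2 += 1.0

    def post_spike(self, t):
        """Potentiation: gated by r1, modulated by o2 read BEFORE its own jump."""
        self._decay_to(t)
        self.w += self.r1 * (self.a2_plus + self.a3_plus * self.o2)
        self.o1 += 1.0
        self.o2 += 1.0


# A pre-post pairing followed by a second post-synaptic spike (1-pre, 2-post).
syn = TripletSTDP()
syn.pre_spike(0.000)
syn.post_spike(0.010)   # pair term only, since o2 is still zero
syn.post_spike(0.020)   # pair term plus triplet term, since o2 > 0
```

Reading `r2` and `o2` before they are incremented is what makes the triplet terms depend on earlier spikes of the same neuron rather than on the spike that triggers the update.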
While in classical STDP potentiation takes place shortly after a pre-synaptic spike and upon occurrence of a post-synaptic spike, in the current framework several conditions need to be considered. Potentiation is triggered at every post-synaptic spike, where the weight change is gated by the $r_1$ detector and modulated by the $o_2$ detector. If there are no post-synaptic spikes shortly before the current one ($o_2$ is zero), the degree of potentiation is determined by $A_2^+$ only, just like in pair-based STDP. If, however, a triplet of spikes occurs (in this case 1-pre and 2-post), $o_2$ is non-zero and an additional potentiation term $A_3^+ o_2(t - \epsilon)$ contributes to the weight change. Analogously, $r_2$, $o_1$, $A_2^-$ and $A_3^-$ operate for the case of synaptic depression, which is triggered at every pre-synaptic spike.

The Spike-Driven Synaptic Plasticity (SDSP) learning rule addresses in particular the problem of memory maintenance and catastrophic forgetting: the presentation of new experiences continuously generates new memories that will eventually lead to saturation of the limited storage capacity, and hence to forgetting. As discussed in Sec. 2.3, this problem concerns all learning rules in an online context. SDSP attempts to solve it by slowing the learning process in an unbiased way. The model randomly selects the synaptic changes that will be consolidated among those triggered by the input, therefore learning to represent the statistics of the incoming stimuli.
The SDSP model proposed by Brader et al. (2007) is demonstrated in a feedforward neural network used for supervised learning in the context of pattern classification. Nevertheless, the model is also well suited for unsupervised learning of patterns of activation in attractor neural networks (Del Giudice et al. 2003, Brader et al. 2007). It does not rely on the precise timing difference between pre- and post-synaptic spikes; instead, the weight update is triggered by single pre-synaptic spikes. The sign of the weight update is determined by the post-synaptic neuron's membrane voltage. The post-synaptic neuron's Calcium variable represents a low-pass filtered trace of the recent post-synaptic activity and is used to determine whether synaptic updates should occur (stop-learning mechanism). The synaptic dynamics are described in Eq. (1).
The internal synaptic variable is updated according to Eq. (5), with the variables described in Tab. 4.
The weight update depends on the instantaneous values of the post-synaptic membrane voltage $V(t_{pre})$ and Calcium trace $C(t_{pre})$ at the arrival of a pre-synaptic spike. An increase of the synaptic weight is triggered by the pre-synaptic spike if $V(t_{pre})$ is above a voltage threshold, provided that the post-synaptic Calcium trace $C(t_{pre})$ lies between the lower and upper potentiation thresholds. An analogous but flipped mechanism induces a decrease in the weights.
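The pre-spike-triggered update described above can be sketched as a single function. This is a minimal sketch that assumes the jump form of Brader et al. (2007); the dictionary keys, the threshold values, and the jump sizes `a` and `b` are hypothetical and chosen only for illustration.

```python
# Hypothetical parameter values, for illustration only.
SDSP_PARAMS = {
    "theta_v": 0.8,        # membrane voltage threshold
    "theta_up_l": 3.0,     # lower Calcium bound for potentiation
    "theta_up_h": 13.0,    # upper Calcium bound for potentiation
    "theta_down_l": 3.0,   # lower Calcium bound for depression
    "theta_down_h": 4.0,   # upper Calcium bound for depression
    "a": 0.1,              # potentiation jump
    "b": 0.1,              # depression jump
    "x_max": 1.0,          # upper bound of the internal variable
}


def sdsp_on_pre_spike(x, v, ca, p=SDSP_PARAMS):
    """Update the internal synaptic variable x at a pre-synaptic spike.

    v and ca are the post-synaptic membrane voltage and Calcium trace
    sampled at the time of the pre-synaptic spike.
    """
    if v > p["theta_v"] and p["theta_up_l"] < ca < p["theta_up_h"]:
        x += p["a"]   # potentiation
    elif v <= p["theta_v"] and p["theta_down_l"] < ca < p["theta_down_h"]:
        x -= p["b"]   # depression
    # Otherwise the Calcium trace is outside both windows: stop-learning.
    return min(max(x, 0.0), p["x_max"])
```

When the Calcium trace lies outside both windows the function returns `x` unchanged, which is how the stop-learning condition appears in this sketch.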
The synaptic weight is restricted to a bounded interval between 0 and a maximum value. The bistability of the synaptic weight implies that the internal variable drifts toward (and is bounded by) either a low state or a high state, depending on whether it is below or above a bistability threshold, respectively. This is shown in Eqs. (6).

The Voltage-based STDP (V-STDP) rule has been introduced to unify several experimental observations such as post-synaptic membrane voltage dependence, pre-post spike timing dependence and post-synaptic rate dependence (Clopath & Gerstner 2010), but also to explain the emergence of some connectivity patterns in the cerebral cortex (Clopath et al. 2010). In this model, depression and potentiation are two independent mechanisms whose sum produces the total synaptic change. Variables of the equations are described in Tab. 5. Depression is triggered by the arrival of a pre-synaptic spike and is induced if the voltage trace of the post-synaptic membrane voltage is above the threshold $\theta_-$ (see Eq. (7)).
On the other hand, potentiation is continuous and occurs following Eq. (8) if the following conditions are met at the same time:
• The instantaneous post-synaptic membrane voltage is above the threshold $\theta_+$, with $\theta_+ > \theta_-$;
• The low-pass filtered post-synaptic membrane voltage is above $\theta_-$;
• A pre-synaptic spike occurred a few milliseconds earlier and has left a trace.
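The depression and potentiation conditions above can be combined into one discrete-time update step. The sketch below assumes the rectified-linear form of Clopath et al. (2010); the function name, the amplitudes `a_ltd` and `a_ltp`, the threshold values, and the hard bound `w_max` are illustrative assumptions rather than values from the survey.

```python
def relu(z):
    """Rectification: only supra-threshold values contribute."""
    return max(z, 0.0)


def vstdp_step(w, u, u_minus, u_plus, x_bar, pre_spike, dt=1.0,
               a_ltd=1e-4, a_ltp=1e-4, theta_minus=-70.0, theta_plus=-45.0,
               w_max=1.0):
    """One Euler step of a voltage-based STDP rule (hypothetical parameters).

    u        instantaneous post-synaptic membrane voltage
    u_minus  slow voltage trace used for depression
    u_plus   low-pass filtered voltage used for potentiation
    x_bar    trace left by recent pre-synaptic spikes
    """
    dw = 0.0
    if pre_spike:
        # Depression: triggered by a pre-synaptic spike, induced only if
        # the voltage trace exceeds the lower threshold.
        dw -= a_ltd * relu(u_minus - theta_minus)
    # Potentiation: continuous, requires all three conditions at once
    # (any factor equal to zero cancels the product).
    dw += a_ltp * x_bar * relu(u - theta_plus) * relu(u_plus - theta_minus) * dt
    # Hard bounds on the weight.
    return min(max(w + dw, 0.0), w_max)
```

Expressing the three potentiation conditions as a product of rectified terms means that no explicit `if` is needed: the update vanishes as soon as one condition fails.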
The total synaptic change is the sum of depression and potentiation expressed in Eqs. (7) and (8), respectively, within the weights' hard bounds 0 and $w_{max}$.

In the C-STDP model proposed by Graupner & Brunel (2012), the synaptic strength is described by the synaptic efficacy $\rho \in [0, 1]$, which is constantly updated according to Eq. (9), whose variables are described in Tab. 6. Changes in synaptic efficacy are continuous and depend on the relative times in which the Calcium trace $c(t)$ is above the potentiation ($\theta_p$) and depression ($\theta_d$) thresholds (Graupner & Brunel 2012).
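A minimal sketch of such a Calcium-threshold efficacy update is given below. It assumes the deterministic part of the Graupner & Brunel (2012) dynamics (the noise term is omitted), including a cubic bistability term; the function name, the parameter names, and the numerical values are hypothetical.

```python
def cstdp_step(rho, c, dt=0.001, tau=150.0, rho_star=0.5,
               gamma_p=320.0, gamma_d=200.0, theta_p=1.3, theta_d=1.0):
    """One Euler step of a Calcium-based efficacy update (noise omitted).

    rho is the synaptic efficacy in [0, 1]; c is the Calcium trace.
    All default values are illustrative assumptions.
    """
    # Potentiation acts while the Calcium trace is above theta_p.
    pot = gamma_p * (1.0 - rho) if c > theta_p else 0.0
    # Depression acts while the Calcium trace is above theta_d.
    dep = gamma_d * rho if c > theta_d else 0.0
    # Bistability: pushes rho toward 0 below rho_star and toward 1 above it.
    bistable = -rho * (1.0 - rho) * (rho_star - rho)
    rho += dt / tau * (bistable + pot - dep)
    return min(max(rho, 0.0), 1.0)
```

With the Calcium trace below both thresholds, only the bistability term is active, so an efficacy of 0.4 decays toward 0 and an efficacy of 0.6 grows toward 1.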
If the Calcium variable is above the threshold for potentiation ($\Theta[c(t) - \theta_p] = 1$), the synaptic efficacy is continuously increased by $\gamma_p (1 - \rho)$, and as long as the Calcium variable is above the threshold for depression ($\Theta[c(t) - \theta_d] = 1$), the synaptic efficacy is continuously decreased by $\gamma_d \rho$. The efficacy updates induced by the Calcium concentration are therefore in direct competition with each other as long as $c(t)$ is above both thresholds (Graupner & Brunel 2012). In addition to constant potentiation or depression updates, the bistability mechanism $-\rho (1 - \rho)(\rho_\star - \rho)$ drives the synaptic strength toward 0 or 1, depending on whether the instantaneous value of $\rho$ is below or above the bistability threshold $\rho_\star$. Graupner & Brunel (2012) show that their rule replicates a plethora of dynamics found in numerous experiments, including pair-based STDP behavior with different STDP curves, synaptic dynamics found in CA3-CA1 slices for post-synaptic neuron spikes, and dynamics based on spike triplets or quadruplets. However, the rule contains only a single Calcium trace variable $c(t)$ per synapse, which is updated by both pre- and post-synaptic spikes. Since the synaptic efficacy update depends only on this variable and not on the individual or paired spike events of the pre- and post-synaptic neurons, the system can get into a state in which isolated pre-synaptic or isolated post-synaptic activity leads to synaptic efficacy changes. In extreme cases, isolated pre(post)-synaptic spikes could drive a highly depressed ($\rho = 0$) synapse into the potentiated state ($\rho = 1$), without the occurrence of any post(pre)-synaptic action potential. In a recent work, Chindemi et al. (2022) use a modified version of the C-STDP rule based on data-constrained post-synaptic Calcium dynamics according to experimental data. They show that the rule is able to replicate the connectivity of pyramidal cells in the neocortex, by adapting the probabilistic and limited release of $Ca^{2+}$ during pre- and post-synaptic activity.

The Spiking BCM (SBCM) learning rule (Bekolay et al. 2013) has been proposed as another spike-based formulation of the abstract learning rule BCM, after the T-STDP rule. The weight update of the SBCM learning rule is continuous and is expressed in Eq. (10), whose variables are described in Tab. 7.
The mechanistic properties of SBCM are closer to the formal BCM rule, with the activities of the neurons expressed as spike activity traces and a filtered modification threshold. Nevertheless, the SBCM exhibits both the timing dependence of STDP and the frequency dependence of the T-STDP rule.

The Membrane Potential Dependent Plasticity (MPDP) rule, also called the "Convallis" rule (Yger & Harris 2013), aims to approximate a fundamental computational principle of the neocortex and is derived from principles of unsupervised learning algorithms. The main assumption of the rule is that projections with non-Gaussian distributions are more likely to extract useful information from real-world patterns (Hyvärinen & Oja 2000). Therefore, synaptic changes should tend to increase the skewness of a neuron's sub-threshold membrane potential distribution. The rule is therefore derived from an objective function that measures how non-Gaussian the membrane potential distribution is, such that the post-synaptic neuron is often close to either its resting potential or spiking threshold (and not in between).
The resulting plasticity rule reinforces synapses that are active during post-synaptic depolarization and weakens those active during hyper-polarization. It is expressed in Eq. (11), where changes are continuously made on an internal update trace Ψ, and are then applied to the synaptic weight as expressed in Eq. (12). The variables of the equations are explained in Tab. 8. The rule was used for unsupervised learning of speech data, where an additional mechanism was implemented to maintain a constant average firing rate.

The DPSS rule proposed by Urbanczik & Senn (2014) aims to implement a biologically plausible non-Hebbian learning rule. The rule relies on the pre-synaptic spike trace, the post-synaptic spike event and the post-synaptic dendritic voltage of a multi-compartment neuron model. Plasticity in dendritic synapses realizes a predictive coding scheme that matches the dendritic potential to the somatic potential.
This minimizes the error of the dendritic prediction of the somatic spiking activity of a conductance-based neuron model that exhibits probabilistic spiking (Urbanczik & Senn 2014). The neuron membrane potential is influenced by both a scaled version of the dendritic compartment potential and the teaching inputs from excitatory or inhibitory proximal synapses. In their proposed learning rule (see Eq. (13)), the aim is to minimize the error between the predicted somatic spiking activity, based on the dendritic potential, and the real somatic spiking activity, represented by back-propagated spikes. The equation's variables are described in Tab. 9. This error is assigned to individual dendritic synapses based on their recent activation, similar to Yger & Harris (2013) and Albers et al. (2016).
Since the back-propagated spikes are only 0 or 1, but the predicted rate, which is based on a sigmoidal function, is never exactly 0 or 1, the error will never be 0. Consequently, there is never a zero weight change (Urbanczik & Senn 2014). The plasticity induction variable is continuously updated and used as an intermediate variable before it is applied to induce a scaled persistent synaptic change, as expressed in Eq. (14).
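The two-stage update described above (an error-driven plasticity induction that is low-pass filtered before it changes the weight) can be sketched as follows. This is a simplified sketch and not the exact rule of Urbanczik & Senn (2014): it assumes a plain logistic function for the predicted rate and omits the gain factor of the original formulation. All names and parameter values are hypothetical.

```python
import math


def predicted_rate(v_dend):
    """Sigmoidal prediction of somatic spiking from the dendritic potential.

    A plain logistic function is assumed here for illustration.
    """
    return 1.0 / (1.0 + math.exp(-v_dend))


def dpss_step(w, delta, s, v_dend, psp, eta=0.1, tau_delta=100.0, dt=1.0):
    """One Euler step of a simplified dendritic-prediction rule.

    s       back-propagated somatic spike (0 or 1)
    v_dend  (scaled) dendritic potential
    psp     recent activation of this synapse (pre-synaptic trace)
    delta   low-pass filtered plasticity induction (intermediate variable)
    """
    # Plasticity induction: prediction error assigned to the synapse
    # in proportion to its recent activation.
    induction = (s - predicted_rate(v_dend)) * psp
    # The induction is low-pass filtered ...
    delta += dt / tau_delta * (induction - delta)
    # ... and then applied as a scaled persistent weight change.
    w += eta * delta * dt
    return w, delta
```

Because the logistic output is strictly between 0 and 1, the error term `s - predicted_rate(v_dend)` is never exactly zero for finite dendritic potentials, which matches the observation above that the weight change never vanishes for an active synapse.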
Sacramento et al. (2018) later showed analytically that the Dendritic Prediction of Somatic Spiking (DPSS) learning rule, combined with similar dendritic predictive plasticity mechanisms, approximates the error back-propagation algorithm, and demonstrated the capabilities of such a learning framework to solve regression and classification tasks.

The idea of the RDSP rule (Diehl & Cook 2015) is to potentiate or depress the synapses for which the pre-synaptic neuron activity was high or low at the moment of a post-synaptic spike, respectively. The RDSP learning rule relies solely on the pre-synaptic information and is triggered when a post-synaptic spike arrives. The weight update is shown in Eq. (15), whose variables are described in Tab. 10.
The exponent $\mu$ determines the weight dependence of the update for implementing a soft bound, while the target value of the pre-synaptic spike trace, $x_{tar}$, is crucial in this learning rule because it acts as a threshold between depression and potentiation. If it is set to 0, then only potentiation is observed. It is hence important to set it to a non-zero value to ensure that pre-synaptic neurons that rarely lead to the firing of the post-synaptic neuron become more and more disconnected. More generally, the higher the value of $x_{tar}$, the more depression occurs and the lower the synaptic weights will be (Diehl & Cook 2015).
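The role of the target value as a threshold between depression and potentiation can be illustrated with a short sketch. It assumes the soft-bound form used by Diehl & Cook (2015); the function name, the learning rate `eta`, and the default values are hypothetical.

```python
def rdsp_on_post_spike(w, x_pre, eta=0.01, x_tar=0.4, w_max=1.0, mu=1.0):
    """Weight update applied when a post-synaptic spike arrives.

    x_pre  pre-synaptic spike trace at the time of the post-synaptic spike
    x_tar  target trace value separating depression from potentiation
    mu     exponent controlling the soft-bound weight dependence
    All default values are illustrative assumptions.
    """
    # Distance to the upper bound, clipped so the power stays real-valued.
    headroom = max(w_max - w, 0.0)
    return w + eta * (x_pre - x_tar) * headroom ** mu
```

A recently active input (`x_pre` above `x_tar`) is potentiated, a silent one is depressed, and with `x_tar=0.0` the update can never be negative for a non-negative trace.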
This rule was first proposed as a more biologically plausible version of a previously proposed rule for memristive implementations by Querlioz et al. (2013). The main difference between the two models is that the RDSP rule uses an exponential time dependence for the weight change, which is more biologically plausible (Abbott & Song 1999) than a time-independent weight change. This can also be more useful for pattern recognition, depending on the temporal dynamics of the learning task.

The Homeostatic MPDP (H-MPDP) learning rule proposed by Albers et al. (2016) is derived from an objective function similar to that of the MPDP rule but with opposite sign, as it aims to balance the membrane potential of the post-synaptic neuron between two fixed thresholds: the resting potential and the spiking threshold of the neuron. Hence, the MPDP and the H-MPDP implement a Hebbian and a homeostatic mechanism, respectively. In addition, the H-MPDP differs from the other described models by inducing plasticity only in inhibitory synapses. Albers et al. (2016) use a conductance-based neuron and synapse model, similar to the C-MPDP and the DPSS rules. The continuous weight updates of the H-MPDP rule depend on the instantaneous membrane potential and the pre-synaptic spike trace, as expressed in Eq. (16), whose variables are described in Tab. 11.
The authors claim that their model is able to learn precise spike times by keeping a homeostatic membrane potential between two thresholds. This definition differs from the homeostatic spike rate definition of the C-MPDP rule by Sheik et al. (2016).

The Calcium-based MPDP (C-MPDP) learning rule (Sheik et al. 2016) was proposed with the explicit intention to have a local spike-timing based rule that would be sensitive to the order of spikes arriving at different synapses and that could be ported onto neuromorphic hardware.
Similarly to the DPSS rule, the C-MPDP rule uses a conductance-based neuron model. However, instead of relying on mean rates, it relies on the exact timing of the spikes. Furthermore, as for the H-MPDP rule, Sheik et al. (2016) propose to add a homeostatic element to the rule that targets a desired output firing rate. This learning rule is very hardware efficient because it depends only on the pre-synaptic spike time and not on the post-synaptic one. The equation that governs its behavior is Eq. (17). The weight update, triggered by the pre-synaptic spike, depends on a membrane voltage component (see Eq. (18)) and on a homeostatic one (see Eq. (19)). All equation variables are described in Tab. 12.
The post-synaptic membrane voltage dependent weight update shown in Eq. (18) depends on the value of the membrane voltage and on an externally set voltage threshold, which determines the switch between LTP and LTD. The homeostatic weight update in Eq. (19) is proportional to the difference between the post-synaptic activity, represented by the post-synaptic spike trace, and an externally set threshold.
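The two components of this pre-spike-triggered update can be sketched as below. The sketch assumes fixed-size voltage-dependent steps and a linear homeostatic term; the exact functional form in Sheik et al. (2016) may differ, and all names and parameter values are hypothetical.

```python
def cmpdp_on_pre_spike(w, v_m, ca, theta_lth=-55.0, eta_plus=0.01,
                       eta_minus=0.01, eta_h=0.001, ca_target=1.0):
    """Weight update applied at the arrival of a pre-synaptic spike.

    v_m  post-synaptic membrane voltage at the pre-synaptic spike time
    ca   post-synaptic spike trace (recent post-synaptic activity)
    All default values are illustrative assumptions.
    """
    # Voltage component: the threshold decides between LTP and LTD.
    dw_v = eta_plus if v_m > theta_lth else -eta_minus
    # Homeostatic component: proportional to the distance between the
    # post-synaptic activity trace and its externally set target.
    dw_h = eta_h * (ca_target - ca)
    return w + dw_v + dw_h
```

Only quantities available at the pre-synaptic spike time are read, so no post-synaptic spike time needs to be stored at the synapse.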
The authors show that this learning rule, using the spike timing together with conductance-based neurons, is able to learn spatio-temporal patterns in noisy data and differentiate between inputs that have the same first-moment statistics but different higher-moment ones. Although they gear the rule toward neuromorphic hardware implementations, they do not propose circuits for the learning rule.

The weight update of the BDSP rule is expressed in Eq. (20), whose variables are described in Tab. 13.
In this rule, an event is said to occur either at the time of an isolated spike or at the time of the first spike in a burst, whereas a burst is defined as any occurrence of at least two spikes (registered at the second spike) with an inter-spike interval less than a predefined threshold. Any additional spike within the time threshold belongs to the same burst. Hence, LTP and LTD are triggered by a burst and an event, respectively. Since a burst is always preceded by an event, every potentiation is preceded by a depression. However, the potentiation through the burst is larger than the previous depression, which results in an overall potentiation.
The moving average of the burst probability regulates the relative strength of burst-triggered potentiation and event-triggered depression. It has been established that such a mechanism exists in biological neurons (Mäki-Marttunen et al. 2020). It is formulated as a ratio between averaged post-synaptic burst and event traces. The authors show that manipulating this moving average (i.e., the probability that an event becomes a burst) controls the occurrence of LTP and LTD, while changing the pre- and post-synaptic event rates simply modifies the rate of change of the weight while keeping the same transition point between LTP and LTD. Hence, the BDSP rule paired with the control of bursting provided by apical dendrites enables a form of top-down steering of synaptic plasticity in an online, local and spike-based manner.
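The event and burst definitions, and the way the moving average balances potentiation against depression, can be sketched as follows. The sketch assumes that the inter-spike interval is measured with respect to the previous spike; the function names, the parameter names, and the example values are hypothetical.

```python
def classify_spikes(spike_times, burst_isi):
    """Split a post-synaptic spike train into events and bursts.

    An event is an isolated spike or the first spike of a burst; a burst
    is registered at its second spike. Further spikes within burst_isi of
    the previous spike belong to the same burst.
    """
    events, bursts = [], []
    last = None
    in_burst = False
    for t in sorted(spike_times):
        if last is not None and t - last < burst_isi:
            if not in_burst:
                bursts.append(t)  # second spike marks the burst
                in_burst = True
        else:
            events.append(t)
            in_burst = False
        last = t
    return events, bursts


def bdsp_update(w, pre_trace, is_event, is_burst, p_bar, eta=0.01):
    """Apply event-triggered depression and burst-triggered potentiation.

    p_bar is the moving average of the burst probability, i.e. the ratio
    of averaged post-synaptic burst and event traces.
    """
    if is_event:
        w -= eta * p_bar * pre_trace   # LTD at every post-synaptic event
    if is_burst:
        w += eta * pre_trace           # LTP at every post-synaptic burst
    return w
```

For a moving average below 1, an event followed by a burst yields a net change of `eta * pre_trace * (1 - p_bar)`, which is positive and therefore reproduces the overall potentiation described above.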
Moreover, the authors show that this dendrite-dependent bursting combined with short-term plasticity supports multiplexing of feed-forward and feedback signals, which means that the feedback signals can steer plasticity without affecting the communication of bottom-up signals. Taken together, these observations show that combining the BDSP rule with short-term plasticity and apical dendrites can provide a local approximation of the credit assignment problem. In fact, the learning rule has been shown to implement an approximation of gradient descent for hierarchical circuits and achieve good performance on standard machine learning benchmarks.

Models common variables
Tables 14 and 15 show the major common variables between the different models. This allows an easy comparison of the formalism of each rule.
Table 14: Variables in common between rules Part I

CMOS implementations of synaptic plasticity
Our comparison of plasticity models has highlighted many common functional primitives that are shared among the rules. These primitives can be grouped according to their function into the following blocks: low-pass filters, eligibility traces, and weight updates. These blocks can be readily implemented in CMOS technology, and they can be combined to implement different learning circuits. An overview of the proposed CMOS learning circuits that implement some of the models discussed is shown in Table 16. To better link the CMOS implementations with the models presented, we named all the current and voltage variables of our circuits to match those in the model equations.

CMOS building blocks
The basic building blocks required for building neuromorphic learning circuits can be grouped into four different families.
Eligibility trace blocks. These are implemented using either a current-mode integrator circuit, such as the Differential Pair Integrator (DPI), or other nonlinear circuits that produce slowly decaying signals. Input spikes can either increase the trace amplitude, decrease it, or completely reset it. The rate at which the trace decays back to its resting state can typically be modulated with externally controllable parameters. Circuit blocks implementing eligibility traces are highlighted in green in the schematics.

Comparator blocks. These are typically implemented using Winner-Take-All (WTA) current-mode circuits, or voltage-mode transconductance or Operational Amplifiers (OpAmps). The comparator block changes its output based on which input is greater. Circuit blocks implementing comparators are highlighted in yellow in the schematics.

Weight update blocks. These typically comprise a capacitor that stores a voltage related to the amplitude of the weight. Charging and discharging pathways connected to the capacitor enable potentiation and depression of the weight depending on the status of other signals. These blocks are similar to the eligibility trace ones, except that they can produce both positive and negative changes. Circuit blocks implementing weight updates are highlighted in purple in the schematics.

Bistability blocks. These are typically implemented using a Transconductance Amplifier (TA) connected in feedback operation, which compares the weight voltage to a reference voltage. Depending on the value of the weight voltage, the bistability circuit pushes the weight to the closest stable state. In their simplest form, these blocks have a single reference voltage, but they could be expanded to produce multiple stable states. Circuit blocks implementing bistability are highlighted in red in the schematics.

Following the formalization of the STDP model in 2000 (see Eq. (3)), many CMOS implementations have been proposed.
Most implement the model as explained in Section above (Bofill-i-Petit et al. 2001, Indiveri 2003, Bofill-i-Petit & Murray 2004, Arthur & Boahen 2006, Bamford et al. 2012) however, some exploit the physics of single transistors to propose a floating gate implementation (Liu & Mockel 2008, Gopalakrishnan & Basu 2014).Indiveri et al. (2006) presented the implementation in Fig. 3.This circuit increases or decreases the analog voltage   across the capacitor   depending on the relative timing of the pulses  and .Upon arrival of a pre-synaptic pulse , a potentiating waveform   is generated within the pMOS-based trace block (see Fig. 3).  has a sharp onset and decays linearly with an adjustable slope set by  + .  serves to keep track of the most recent pre-synaptic spike.Analogously, when a post-synaptic spike () occurs,    and  − create a trace of post-synaptic activity.By ensuring that   and    remain below the threshold of the transistors they are connected to and the exponential current-voltage relation in the sub-threshold regime, the exponential relationship to the spike time difference Δ of the model is achieved.While  + and  − set the upper-bounds of the amount of current that can be injected or removed from   , the decaying traces   and    determine the value of  + or  − and ultimately the weight increase or decrease on the capacitor   within the weight update block (see Fig. 3).Similarly, as for the pair-based STDP, there are many implementations of the T-STDP rule.While some are successful in implementing the equations in the model (Mayr et al. 2010, Meng et al. 2011, Rachmuth et al. 2011, Azghadi et al. 2013), others exploit the properties of floating gates (Gopalakrishnan & Basu 2017).

Triplet-based STDP (T-STDP)
Specifically, Mayr et al. (2010), Rachmuth et al. (2011) and Meng et al. (2011) implement learning rules that model the conventional pair-based STDP together with the BCM rule. Azghadi et al. (2013) are the first, to our knowledge, to model not only the function but also the equations presented in Pfister et al. (2006) (see Eq. (4)). Figure 4 shows the circuit proposed by Azghadi et al. (2013) to model the T-STDP rule. It faithfully implements the equations by having independent circuits and biases for the model parameters $A_2^-$, $A_2^+$, $A_3^-$, and $A_3^+$. These parameters correspond to spike pairs or spike triplets: post-pre, pre-post, pre-post-pre, and post-pre-post, respectively.
In this implementation, the voltage across the capacitor   determines the weight of the specific synapse.Here, a high potential at the node  is caused by a highly discharged capacitor indicating a low synaptic weight, which results in a depressed synapse.In the same way, a low potential at this node is caused by a more strongly charged capacitor and resembles a strong synaptic weight and in turn a potentiated synapse.The capacitor is charged and discharged by the two currents   and    respectively.These two currents are gated by the most recent pre-and post-synaptic spikes through the transistors controlled by () and  () within the weight update block (see Fig. 4) comparators, blocks implementing the weight update and bistability mechanism.Here, we present the most complete design by Chicca et al. (2014), shown in Fig. 5, which replicates more closely the model equations (see Eq. ( 5)).
At each pre-synaptic spike , the weight update block (see Fig. 5) charges or discharges the capacitor   altering the voltage   , depending on the values of   and   .Here,   represents the synaptic weight.If   >   ,   increases, while in the opposite case   decreases.Moreover, over long time in the absence of pre-synaptic spikes,   is slowly driven toward the bistable states    or    depending on whether   is higher or lower than   respectively (see bistability block in Fig. 5).
The   and   signals are continuously computed in the learning block, which compares the membrane potential of the neuron () to the threshold   and evaluates in which region the Calcium concentration   lies.The neuron's membrane potential is compared to the threshold   by a transconductance amplifier.If  >   ,  ℎ is high and   is low, while if  <   ,  ℎ is low and   is high.At the same time, the post-synaptic neuron spikes () are integrated by a DPI to produce the Calcium concentration   (see trace -DPI block in Fig. 5), which is then compared with three Calcium thresholds by three WTA circuits (see comparator circuits in Fig. 5).In the lower comparator,   is compared to  1 and if   <  1 no learning conditions of the SDSP rule is satisfied and there is no weight update.Assuming that   >  1 , the two upper comparators set the signals   and   .If   is high and   <  2 ,   is increasing, setting the strength of the nMOS-based pull-down branch in the weight update block.If  ℎ is high and   <  3 ,   is decreasing, setting the strength of the pMOS-based pull-up branch of the weight update block.These two branches in the weight update block are activated by the  input spike.
The first CMOS implementation of a spike-based learning rule, by Häfliger et al. (1997), pre-dates the formalization of the RDSP model, which happened almost 20 years later (Diehl & Cook 2015). It is one of the clearest cases of how building electronic circuits that mimic biological behavior leads to the discovery of useful mechanisms for solving real-world problems.
The algorithmic definition of their learning rule is based on a correlation signal, local to each synapse, which keeps track of the pre-synaptic spike activity. The correlation signal is refreshed at each pre-synaptic event and decays over time. When a post-synaptic spike arrives, the weight is either increased or decreased depending on the value of the correlation, and the correlation signal is reset. Similarly, the RDSP rule relies on the pre-synaptic spike time information and is triggered when a post-synaptic spike arrives. The direction of the weight update depends on a target value $x_{tar}$, which determines the threshold between depression and potentiation.
The two main differences between the circuit by Häfliger et al. (1997) (see Fig. 7) and the RDSP rule (see Eq. ( 15)) is that the correlation signal in Häfliger et al. (1997) is binary and is compared to a fixed threshold voltage (the switching threshold of the first inverter), which resembles a fixed    .In the Häfliger et al. (1997) implementation, the voltage   across the capacitor   represents the synaptic weight and the voltage     at the capacitor     represents the correlation signal.At the arrival of a presynaptic input spike (), the voltage   determines the amplitude of the current towards the soma (  ) of the post-synaptic neuron.At the same time, the capacitor     is fully discharged and     is low.In the absence of pre-synaptic and postsynaptic spikes ( and  are low),     is slowly charged towards   by the pMOS branch in the trace block (see Fig. 7).
The voltage     is constantly compared to the threshold voltage (resembling    ) of the first inverter it is connected to.At the arrival of a post-synaptic spike ( is high) the weight capacitor   is either charged (depressed) or discharged (potentiated) depending on the momentary level of     .If     is above the inverter threshold voltage, the right branch of the weight update block (see Fig. 7) is inactive, while the left branch is active and the pMOS-based current mirror charges the capacitor   .In the opposite case, where     is below the inverter threshold voltage, the right branch is active while the output of the second inverter disables the left branch of the weight update block.This results in a discharge of the capacitor   controlled by the nMOS-based current mirror.The amplitude for potentiation and depression is set by the two biases   and    .At the end of a post-synaptic spike the correlation signal     is reset to  .A similar approach implementing a nearest-spike interaction scheme and a fixed    was implemented by Ramakrishnan et al. (2011) exploiting the properties of floating gates.

Other models implementations
To the best of our knowledge, there have been no dedicated CMOS-based implementations of the other models presented in Sec. 4. Although the V-STDP rule proposed by Clopath et al. (2010) and Clopath & Gerstner (2010) shares similarities with the T-STDP rule and can be related to the BCM rule (Gjorgjieva et al. 2011), it is complex to implement because it requires multiple transient signals on different timescales. To this end, emerging technologies such as memristors (Cantley et al. 2011, Li et al. 2013, Li et al. 2014, Ziegler et al. 2015, Diederich et al. 2018) and neuristors (John et al. 2018) offer promising solutions for implementing different timescales in a compact and efficient manner. Similarly, implementations of the DPSS rule (Urbanczik & Senn 2014) are difficult due to the increased complexity of the required multi-compartment neuron models. Recently, implementations based on hybrid memristor-CMOS systems (Nair et al. 2017, Payvand et al. 2020), or using existing neuromorphic processors to exploit neuron structures that replicate the multi-compartment model (Cartiglia et al. 2020), have been proposed. A detailed view of these implementations is beyond the scope of this review, and we refer the reader to the original publications.
However, presenting the CMOS-implemented models through the lens of functional building blocks allows us to quickly look for analogies and differences between the implemented models and the other ones. Throughout this Section, we have highlighted the similarities and differences of each of the implemented models. Focusing on functional building blocks also allows for a broader generalization to all the models that have not been implemented yet: using the basic building blocks we presented (e.g., traces, comparators, weight updates, and bistability), one could potentially construct all the learning models we have discussed in Sec. 4.

Toward a unified synaptic plasticity framework
In this survey, we highlighted the similarities and differences of representative synaptic plasticity models and provided examples of neuromorphic CMOS circuits that can be used to implement their principles of computation. We highlighted how the principle of locality in learning, and in neural computation in general, is fundamental and enables the development of fast, efficient and scalable neuromorphic processing systems. We highlighted how the different features of the plasticity models can be summarized in (1) synaptic weight properties, (2) plasticity update triggers and (3) local variables that can be exploited to modify the synaptic weight (see also Table 1). Although all local variables of these rules are similar in nature, the plasticity rules can be subdivided in the following way:
• Pre-synaptic spike trace: RDSP.
Many possibilities arise when exploring how the local variables used by these rules interact (e.g., comparison, addition, multiplication, etc.). This leads to a wide range of additional models that could be proposed and to a large number of biological experiments that could be carried out to verify the hypotheses and predictions made by the rules.
It is difficult to predict whether a unified rule of synaptic plasticity can be formulated, based on the observation that several plasticity mechanisms coexist in the brain (Abbott & Nelson 2000, Bi & Poo 2001), and that different problems may require different plasticity mechanisms. Nevertheless, we provided here a single unified framework that allowed us to carry out a systematic comparison of the features of many representative models of synaptic plasticity presented in the literature, developed following experiment-driven bottom-up approaches and/or application-driven top-down approaches (Frenkel et al. 2021). While the bottom-up approach can help in explaining the plasticity mechanisms found in the brain, top-down guidance can help to find the right level of abstraction from biology to get the best performance for solving problems in the context of efficient and adaptive artificial systems. In line with the neuromorphic engineering perspective, this work bridges the gap between both approaches.

Overcoming back-propagation limits for online learning
Local synaptic plasticity in neuromorphic circuits offers a promising solution for online learning in embedded systems. However, due to the very local nature of this approach, there is no direct way of implementing global learning rules in multilayer neural networks, such as the gradient-based back-propagation algorithm (LeCun et al. 1998, Schmidhuber et al. 2007). This algorithm has been the workhorse of ANN training in deep learning over the last decade. Gradient-based learning has recently been applied to the offline training of SNNs, where the Back-Propagation (BP) algorithm coupled with surrogate gradients is used to solve two critical problems. The first is the temporal credit assignment problem, which arises from the temporal inter-dependencies of the SNN activity. It is solved offline with Back-Propagation Through Time (BPTT) by unrolling the SNN like standard Recurrent Neural Networks (RNNs) (Neftci et al. 2019). The second is the spatial credit assignment problem, where the credit or "blame" with respect to the objective function is assigned to each neuron across the layers. However, BPTT is not biologically plausible (Bengio et al. 2015, Lillicrap et al. 2020) and is not practical for on-chip and online learning because of its non-local learning paradigm. On the one hand, BPTT is not local in time, as it requires keeping all the network activities for the duration of the trial. On the other hand, BPTT is not local in space, as it requires information to be transferred across multiple layers. Indeed, synaptic weights can only be updated after complete forward propagation, loss evaluation, and back-propagation of error signals, which leads to the so-called "locking effect" (Czarnecki et al. 2017).
Recently, intensive research in neuromorphic computing has been dedicated to bridging the gap between back-propagation and local synaptic plasticity rules by reducing the non-local information requirements, at a cost of accuracy in complex problems (Eshraghian et al. 2021). The temporal credit assignment can be handled by using eligibility traces (Zenke & Ganguli 2018, Bellec et al. 2020) that solve the distal reward problem by bridging the delay between the network output and the feedback signal that may arrive later in time (Izhikevich 2007). Similarly, inspired by recent progress in deep learning, several strategies have been explored to solve the spatial credit assignment problem using feedback alignment (Lillicrap et al. 2016), direct feedback alignment (Nøkland 2016), random error BP (Neftci et al. 2017), or by replacing the backward pass with an additional forward pass whose input is modulated with error information (Dellaferrera & Kreiman 2022). However, these approaches only partially solve the problem (Eshraghian et al. 2021), since they still suffer from the locking effect, which can nonetheless be tackled by replacing the global loss with a number of local loss functions (Mostafa et al. 2018, Neftci et al. 2019, Kaiser et al. 2020, Halvagal & Zenke 2022) or by using direct random target projection (Frenkel et al. 2021). Assigning credit locally, especially within recurrent SNNs, is still an open question and an active field of research (Christensen et al. 2021).
The local synaptic plasticity models and circuits presented in this survey do not require the presence of a teacher signal, in contrast with supervised learning using labeled data, which is neither biologically plausible (Halvagal & Zenke 2022) nor practical in most online scenarios (Muliukov et al. 2022). Nevertheless, the main limit of spike-based local learning is the diminished performance on complex pattern recognition problems. Different approaches have been explored to bridge this gap, such as the DPSS (Urbanczik & Senn 2014, Sacramento et al. 2018) and BDSP (Payeur et al. 2021) learning rules, which use multi-compartment neurons and show promising performance in approximating back-propagation with local mechanisms, or the use of multi-modal association to improve the self-organizing system's performance (Gilra & Gerstner 2017, Khacef et al. 2020a, Rathi & Roy 2021), since, in contrast to labeled data, multiple sensory modalities (e.g., sight, sound, touch) are freely available in the real-world environment.
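The role of eligibility traces in bridging the delay between synaptic activity and a later feedback signal can be illustrated with a minimal three-factor sketch. It is not the specific rule of any of the works cited above: the function name, the unit trace increment, the exponential decay, and all parameter values (`tau_e`, `lr`, `dt`) are assumptions chosen for illustration.

```python
import math

def three_factor_update(coincidence_times, feedback_times, t_end,
                        dt=1e-3, tau_e=0.5, lr=0.1, w0=0.0):
    """Toy three-factor rule: a local eligibility trace gated by delayed feedback.

    All names and values are hypothetical; times are in seconds.
    """
    coincidence_steps = {round(t / dt) for t in coincidence_times}
    feedback_steps = {round(t / dt) for t in feedback_times}
    decay = math.exp(-dt / tau_e)   # per-step exponential decay of the trace
    e, w = 0.0, w0
    for step in range(round(t_end / dt) + 1):
        e *= decay                  # the trace fades when nothing happens
        if step in coincidence_steps:
            e += 1.0                # local pre/post coincidence marks the synapse
        if step in feedback_steps:
            w += lr * e             # delayed feedback converts the trace into a weight change
    return w

# A coincidence at 0.1 s followed by feedback at 0.6 s still changes the weight,
# scaled by how much the trace has decayed in between.
w = three_factor_update([0.1], [0.6], t_end=1.0)
```

The weight update only uses quantities stored at the synapse (the trace) plus a single feedback signal, so no network activity has to be kept for the duration of the trial.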

Structural plasticity and network topology
Exploring local synaptic plasticity rules gives valuable insights into how plasticity and learning evolve in the brain. However, in bringing the plasticity of single synapses to the function of entire networks, many more factors come into play. Functionality at a network level is determined by the interplay between the synaptic learning rules, the spatial location of the synapse, and the neural network topology.
Furthermore, the network topology of the brain is itself plastic (Holtmaat & Svoboda 2009). Le Bé & Markram (2006) provided the first direct demonstration of induced rewiring (i.e., sprouting and pruning) of a functional circuit in the neocortex (Markram et al. 2011), which requires hours of general stimulation. Some studies suggest that glutamate release is a key determinant in synapse formation (Engert & Bonhoeffer 1999, Kwon & Sabatini 2011), but additional investigations are needed to better understand the computational foundations of structural plasticity and how it is linked to the synaptic plasticity models we reviewed in this survey. Together, structural and synaptic plasticity are the local mechanisms that lead to the emergence of the global structure and function of the brain. Understanding, modeling, and implementing the interplay between these two forms of plasticity is a key challenge for the design of self-organizing systems that can get closer to the unique efficiency and adaptation capabilities of the brain.

CMOS neuromorphic circuits
The computational primitives that are shared by the different plasticity models were grouped together in corresponding functional primitives and circuit blocks that can be combined to map multiple plasticity models into corresponding spike-based learning circuits. Many of the models considered rely on exponentially decaying traces. When the CMOS circuits are operated in the sub-threshold regime, this exponential dependency is provided directly by the physical substrate, as transistors exhibit an exponential relationship between current and voltage (Mead 1990).
The circuits presented make use of both analog computation (e.g., analog weight updates) and digital communication (e.g., pre- and post-synaptic spike events). This mixed-signal analog/digital approach aligns with the observation that biological neural systems can be considered as hybrid analog and digital processing systems (Sarpeshkar 1998). Due to the digital nature of spike transmission in these neuromorphic systems, plasticity circuits that require the use of pre-synaptic traces need extra overhead to generate this information directly at the post-synaptic side.
The emergence of novel nanoscale memristive devices has high potential for allowing the implementation of such circuits at a low overhead cost in terms of space and power (Demirag et al. 2021). In addition, these emerging memory technologies have the potential to allow long-term, non-volatile storage of the synaptic weights, which would allow these neuromorphic systems to operate continuously without having to upload the neural network parameters at boot time. This will be a significant advantage in large-scale systems, as the Input/Output operations required to load network parameters can take a significant amount of power and time. In addition, the properties of emerging memristive devices could be exploited to implement different features of the plasticity models proposed (Diederich et al. 2018).
Overall, the number of CMOS-based analog or mixed-signal neuromorphic circuits proposed over the past 25 years is relatively low, as this work was mainly driven by fundamental academic research. With the increasing need for low-power neural processing systems at the edge, the increasing maturity of novel technologies, and the rising interest in brain-inspired neural networks and learning for data processing, we can expect an increasing number of new mixed-signal analog/digital circuits implementing new plasticity rules, also for commercial exploitation. In this respect, this review can provide valuable information for making informed modeling and circuit design decisions in developing novel spike-based neuromorphic processing systems for online learning.
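As a worked sketch of how the exponential current-voltage relationship can yield an exponentially decaying trace, consider a textbook idealization that is assumed here and not taken from a specific circuit in this survey: a sub-threshold n-type transistor in saturation whose gate voltage $V_g$ is stored on a capacitor $C$ that is discharged by a constant leak current $I_\tau$. The symbols $I_0$ (scaling current), $\kappa$ (sub-threshold slope factor), and $U_T$ (thermal voltage) are the usual device parameters and are likewise assumptions of this sketch.

```latex
\begin{align}
I_{\mathrm{out}} &= I_0 \, e^{\kappa V_g / U_T}, \\
C \frac{dV_g}{dt} &= -I_\tau
  \;\;\Rightarrow\;\; V_g(t) = V_g(0) - \frac{I_\tau}{C}\, t, \\
I_{\mathrm{out}}(t) &= I_{\mathrm{out}}(0)\, e^{-t/\tau},
  \qquad \tau = \frac{C \, U_T}{\kappa \, I_\tau}.
\end{align}
```

Under these assumptions, a linear voltage decay on the capacitor translates into an exponentially decaying output current, with a time constant that can be tuned through the leak current $I_\tau$.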

Figure 1: The local variables involved in the local synaptic plasticity models we review in this survey: pre- and/or post-synaptic spike traces (capped or integrative) and post-synaptic membrane (dendritic or somatic) voltage.

Figure 2: Online implementation principle of STDP using local pre- and post-synaptic capped spike traces, which provide an online estimate of the time since the last spike. For example, at the moment of a post-synaptic spike, potentiation is induced with a weight change that is proportional to the value of the pre-synaptic spike trace, and the post-synaptic spike trace is updated with a jump to $A_{-}$.
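The online principle described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model equations of the survey: the function name, the discrete-time simulation, the exponential trace decay, and all parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`, `w0`) are hypothetical, and weight bounds are omitted.

```python
import math

def run_stdp(pre_spikes, post_spikes, t_end, dt=1e-4,
             tau_plus=20e-3, tau_minus=20e-3,
             a_plus=0.01, a_minus=0.012, w0=0.5):
    """Pair-based STDP computed online from two capped spike traces.

    Spike times are in seconds; all parameter values are illustrative only.
    """
    pre_steps = {round(t / dt) for t in pre_spikes}
    post_steps = {round(t / dt) for t in post_spikes}
    decay_pre = math.exp(-dt / tau_plus)
    decay_post = math.exp(-dt / tau_minus)
    x_pre, x_post, w = 0.0, 0.0, w0
    for step in range(round(t_end / dt) + 1):
        # Both traces decay exponentially between spikes.
        x_pre *= decay_pre
        x_post *= decay_post
        is_pre, is_post = step in pre_steps, step in post_steps
        # Read the traces first, so simultaneous spikes are treated symmetrically.
        if is_post:
            w += x_pre       # potentiation proportional to the pre-synaptic trace
        if is_pre:
            w -= x_post      # depression proportional to the post-synaptic trace
        # Capped traces jump to a fixed value instead of accumulating.
        if is_pre:
            x_pre = a_plus
        if is_post:
            x_post = a_minus
    return w

# Pre-before-post potentiates, post-before-pre depresses.
w_ltp = run_stdp([0.010], [0.020], t_end=0.05)
w_ltd = run_stdp([0.020], [0.010], t_end=0.05)
```

Because each trace is reset to a fixed value at a spike, its current value encodes only the time since the most recent spike, which is what makes the update computable with information local to the synapse.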

Figure 3: STDP circuit with the CMOS building blocks highlighted: eligibility traces (in green) and weight updates (in violet). The voltage and current variables reflect the model equation. Adapted from: Indiveri et al. (2006).

Figure 4: T-STDP circuit with the CMOS building blocks highlighted: eligibility traces with leaky integrators (in green) and weight updates (in violet). The voltage and current variables reflect the model equation. The  and  detectors of the model are also reported in this circuit figure. Adapted from: Azghadi et al. (2013).

Table 1: Spike-based local synaptic plasticity rules: comparative table.

Table 2: Variables of the STDP rule.

The main limitation of the original STDP model is that it is only time-based; thus, it cannot reproduce frequency effects or the results of triplet and quadruplet experiments. In this work, Pfister & Gerstner (2006) introduce additional terms in the learning rule to expand the classical pair-based STDP to a Triplet-based STDP (T-STDP).

Table 3: Variables of the T-STDP rule.

Table 4: Variables of the SDSP rule.

Table 5: Variables of the V-STDP rule.
max Weight max hard bound

4.5. Graupner and Brunel (2012): Calcium-based STDP (C-STDP)
Founded on molecular studies, Graupner & Brunel (2012) proposed a plasticity model (C-STDP) based on a transient Calcium signal. They model a single Calcium trace variable $c(t)$, which represents the linear sum of individual Calcium transients elicited by pre- and post-synaptic spikes at times $t_i$ and $t_j$, respectively. The amplitudes of the transients elicited by pre- and post-synaptic spikes are given by $C_{\mathrm{pre}}$ and $C_{\mathrm{post}}$, respectively, and $c(t)$ decays constantly towards 0.
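The Calcium trace described above can be sketched as a linear superposition of transients. This is a minimal illustration, not the full C-STDP rule: the exponential shape of each transient, the single time constant `tau_ca`, the absence of any pre-synaptic delay, the function name, and all numerical values are assumptions made for the example.

```python
import math

def calcium_trace(pre_spikes, post_spikes, times,
                  c_pre=1.0, c_post=2.0, tau_ca=20e-3):
    """Evaluate the Calcium trace as a linear sum of decaying transients.

    Spike times and evaluation times are in seconds; c_pre and c_post are the
    (illustrative) amplitudes of pre- and post-synaptic transients.
    """
    def transient_sum(spikes, amplitude, t):
        # Each past spike contributes one transient that decays towards 0.
        return sum(amplitude * math.exp(-(t - s) / tau_ca)
                   for s in spikes if s <= t)

    return [transient_sum(pre_spikes, c_pre, t) +
            transient_sum(post_spikes, c_post, t)
            for t in times]

# One pre-synaptic spike at 0 ms and one post-synaptic spike at 10 ms.
trace = calcium_trace([0.0], [0.010], times=[0.0, 0.010, 0.030])
```

Since the contributions add linearly, the same single variable carries information about both pre- and post-synaptic activity.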

Table 6: Variables of the C-STDP rule.

Table 7: Variables of the SBCM rule.

Table 8: Variables of the MPDP rule.

Table 9: Variables of the DPSS rule.

Table 10: Variables of the RDSP rule.

Table 11: Variables of the H-MPDP rule.

Table 12: Variables of the C-MPDP rule.
(Zenke & Neftci 2021)tration trace

4.12. Payeur et al. (2021): Burst-Dependent Synaptic Plasticity (BDSP)
The Burst-Dependent Synaptic Plasticity (BDSP) learning rule (Payeur et al. 2021) has been proposed to enable online, local, spike-based solutions to the credit assignment problem in hierarchical networks (Zenke & Neftci 2021), i.e., how neurons high up in a hierarchy can signal to other neurons, sometimes multiple synapses apart, whether to engage in LTP or LTD to improve behavior. The BDSP learning rule is formulated in Eq. (

Table 13: Variables of the BDSP rule.
() Exponential moving average of the proportion of post-synaptic bursts   ()

Table 16: Neuromorphic circuits for spike-based local synaptic plasticity models.
1 Potentiation and depression triggers done with digital logic gates.
2 Weight storage in digital SRAM.