System model of neuromorphic sequence learning on a memristive crossbar array

Machine learning models for sequence learning and processing often suffer from high energy consumption and require large amounts of training data. The brain presents more efficient solutions for solving these types of tasks. While this has inspired the conception of novel brain-inspired algorithms, their realizations remain constrained to conventional von-Neumann machines. Hence, the potential power efficiency of the algorithms cannot be exploited due to the inherent memory bottleneck of the computing architecture. In this paper, we therefore present a dedicated hardware implementation of a biologically plausible version of the Temporal Memory component of the Hierarchical Temporal Memory concept. Our implementation is built on a memristive crossbar array and is the result of a hardware-algorithm co-design process. Rather than using the memristive devices solely for data storage, our approach leverages their specific switching dynamics to propose a formulation of the peripheral circuitry, resulting in a more efficient design. By combining a brain-like algorithm with emerging non-volatile memristive device technology we strive for maximum energy efficiency. We present simulation results on the training of complex high-order sequences and discuss how the system is able to predict in a context-dependent manner. Finally, we investigate the energy consumption during training and conclude with a discussion of scaling prospects.


Introduction
Learning sequences of experiences and building a predictive model of the environment have been proposed as the core functions of a number of structures in the brain, especially the neocortex (Hawkins 2021). Implementing this type of computation in an electronic hardware system could serve numerous applications ranging from speech recognition (Battistoni et al 2019) and natural language processing to medical signal analysis (Ballinger et al 2018, Ay et al 2019, Esteva et al 2019) and industrial applications (Silvestrin et al 2019).
The human brain can solve these tasks in a highly energy-efficient way (Cauwenberghs 2013) with only moderate amounts of training data and few exposures. Transferring this ability to artificial intelligence systems is therefore an ongoing matter of research. There are already various machine learning networks and learning algorithms targeting sequential learning (Ismail Fawaz et al 2019). Most prominent among these are recurrent neural networks (RNNs) (Sharma et al 2018, Jang et al 2019), Transformers (Karita et al 2019), and long short-term memory networks (Ballinger et al 2018, Liu et al 2019), which solve even complex sequence learning tasks. However, their working principles are far from those of the biological brain. They require complex supervised training with error backpropagation and do not learn in a continuous fashion. In contrast, the human brain employs local learning rules with Hebbian-style potentiation and depression of the synaptic connectivity between neurons. Furthermore, the brain learns continuously without distinct inference and update phases. The Temporal Memory (TM) algorithm (as part of the Hierarchical TM (HTM) model by Hawkins and George (2006)) has been proposed as a possible model explaining how sequential memories are stored and recalled in the neocortex. Recently, a biologically more plausible implementation of the TM model referred to as the Spiking Temporal Memory (SpikingTM) has been proposed, which uses spiking neurons and local synaptic plasticity rules to achieve continuous sequence learning. This network consists of a number of excitatory neurons organized in subpopulations with one shared inhibitory neuron per subpopulation. Each neuron has a certain number of synaptic connections. If the synaptic input exceeds a certain threshold, the neuron generates a dendritic action potential making it predictive. Predictive neurons spike earlier than others in the same subpopulation. If a certain number of neurons in the same subpopulation are predictive, the inhibitory neuron of this subpopulation is activated and hinders non-predictive neurons from generating spikes. The synapses are updated in a spike-timing-dependent plasticity (STDP) fashion by increasing or decreasing an internal permanence value based on the relative spike timing of the pre- and postsynaptic neurons. If the permanence reaches a certain threshold, the conductivity of the synapse changes from zero to a predefined value, and returns to zero in case the permanence falls below the threshold again. In addition, the model uses a homeostatic mechanism that keeps the synaptic growth under control. This model was implemented in the neural simulator NEST (Gewaltig and Diesmann 2007), which runs on classical computer architectures.
Realizations of spiking neural networks on conventional von-Neumann style machines suffer from computational and memory bottlenecks (Wulf and McKee 1995). Neuromorphic hardware, on the other hand, mimics the structure and emulates the functionalities of biological networks (Rajendran et al 2019), and hence offers more efficiency by combining computation and memory in the same location. CMOS-based neuromorphic hardware also has limitations due to the complexity of the circuits necessary to emulate simple synapses and the need for additional memory devices. A possible approach is classical CMOS circuits co-integrated with memristive devices, as they offer the combination of computation and memory in the same nanoscale device. These (mostly) two-terminal devices can change their electrical resistance based on the history of applied voltages (Ielmini and Waser 2015) in a non-volatile fashion. There are different physical mechanisms enabling such memristive switching. In this work, we focus on devices where an insulating transition metal oxide is sandwiched between two metal electrodes. By applying electric fields across the oxide layer it is possible to change the local configuration of oxygen vacancies, which act as donors within the material and modulate an interfacial potential barrier via the so-called valence change mechanism (VCM), leading to a change of the electrical resistance (Dittmann et al 2021). Matrix-like array structures of these devices are regarded as a key building block for analog computing-in-memory and neural network accelerators (Hu et al 2018, Christensen et al 2022). Using memristive crossbar arrays offers energy efficiency in contrast to conventional memory technology in multiple respects. A single device can perform read and write operations at energy levels of biological synapses (Jeong et al 2016). Furthermore, due to the non-volatile nature of the information storage, memristive devices need no refresh operation to ensure data retention, like DRAM does. They can be arranged into large arrays to implement synaptic connectivity. Using Ohm's and Kirchhoff's laws, these arrays offer an implementation of the multiply & accumulate (MAC) function (Cai et al 2019). Since this computation is performed within the array, no data needs to be transferred, avoiding the von-Neumann bottleneck and saving energy. At the same time, this is done in one single operation, improving the latency significantly. Finally, memristive devices have a better scaling behavior than other technologies like SRAM and can thus save silicon area. Therefore, memristive arrays are a preferred way of integrating memristive devices into neuromorphic hardware (Xia and Yang 2019).
As an intermediate step towards a memristive hardware implementation, previous work investigated the suitability of the intrinsic memristive plasticity, with a focus on binary and gradual switching VCM ReRAM devices. To this end, the biological plasticity models were replaced by a model of the ReRAM device plasticity, and the influence of different device parameters was investigated. It was demonstrated that both the analog and the binary ReRAM switching dynamics can be successfully used as synaptic elements in the biologically inspired SpikingTM model, with resilience to different on-off ratios, conductance resolutions, device variability, noise, and synaptic failure.
This result motivated the next step, presented in this work: an implementation of the SpikingTM on a neuromorphic hardware circuit based on an array of VCM-ReRAM devices. To this end, we move completely away from the biological network elements and structure as simulated in the NEST environment, towards an electronic circuit simulation environment with a peripheral architecture centered around a memristive array and incorporating physics-based memristive device models. The result is a system that learns sequences continuously, without supervision, and with context-sensitivity.
To our knowledge, this work is the first implementation of the SpikingTM algorithm on a memristive crossbar array. Even though memristive hardware systems have already been proposed for the TM part of the original HTM model, the novelty of our approach is the strict algorithm-hardware co-design, which does not simply use memristive devices as storage for the synaptic weights. Instead, we adapt the SpikingTM algorithm to fit the specific properties of a memristive crossbar array using principles of in-memory-computing that fully exploit the capabilities of the 1-transistor-1-(mem)resistor (1T1R) devices in this crossbar. Our proposal goes beyond using memristive devices as pure storage for synaptic weights: we develop an effective way to implement synaptic plasticity in the peripheral circuitry of the array without the need for a complex memory controller for each device. There are approaches that implement STDP-like learning rules with complex overlapping pulse shapes (Serrano-Gotarredona et al 2013), which also achieve this. However, these pulses are complex, and so is the analog circuitry required to generate them. A key benefit of our approach is that it operates with simple rectangular pulses. This work is structured as follows: in the Method section, we describe the network structure and the functionalities of the separate components. The Results section begins with a walk-through of an exemplary task explaining the different steps of the training algorithm. After that, the successful training of complex high-order sequences is demonstrated and the energy consumption during training is analyzed. Finally, we discuss scaling prospects of the approach presented here, how the energy consumption compares to the use of conventional floating-gate memory, and how this work relates to the state of the art in memristive sequence learning.

System architecture

Memristive switching crossbar array
At the core of the neuromorphic architecture proposed here, depicted in figure 1(a), lies a memristive crossbar array, which emulates the synaptic connectivity. In this array, each synapse is represented by a memristive device (1R) in series with an integrated transistor (1T), shown in figure 1(b). The array is square with one row and one column of 1T1R circuits per neuron. One terminal of each 1T1R circuit is attached to a horizontal input line and the other terminal to a vertical output line. The transistor gates are connected vertically in each column. Each neuron feeds input signals to one horizontal input line and receives output signals from the array on one vertical output line. Furthermore, each neuron controls one vertical gate line.
The memristive crossbar array can perform two in-memory-compute operations. The first is a vector-matrix-multiplication (VMM) of a vector of input voltages v_in at the horizontal input lines and the matrix of memristive conductances G, yielding a vector of output currents i_out at the vertical output lines. During this VMM operation, all transistor gates must be opened to the maximum gate voltage v_gate,max so as not to create additional resistors in series with the memristive devices, i.e. the vector of voltages v_gate applied to the vertical transistor control lines is v_gate,max for all lines. This VMM operation is performed in a single step.
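As an illustration, the READ-phase VMM can be sketched in a few lines of Python; the function and variable names (vmm_read, v_in, i_out) are ours, and the linear model neglects the residual on-resistance of the fully opened transistors:

```python
import numpy as np

# Minimal sketch of the READ-phase VMM, assuming all transistor gates are fully
# open so the 1T1R series resistance is negligible. G[i, j] is the conductance
# connecting horizontal input line i to vertical output line j.
def vmm_read(G, v_in):
    """Returns the output currents i_out (A) via Ohm's and Kirchhoff's laws."""
    return G.T @ v_in  # i_out[j] = sum_i G[i, j] * v_in[i]

n = 6
rng = np.random.default_rng(0)
G = rng.uniform(6e-6, 45e-6, size=(n, n))    # initial low-conductance states (S)
v_in = 0.6 * np.array([1, 1, 0, 0, 0, 0.0])  # READ pulses from two spiking neurons (V)
i_out = vmm_read(G, v_in)                    # dendritic currents into all neurons
```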
Furthermore, the conductances can be updated by applying a voltage across the memristive devices. A sub-array of devices can be updated in parallel with the same voltage pulse. The binary vectors L_in and L_gate encode the input and output lines of the sub-array to be updated. An update to all devices in this sub-array can now be applied by choosing the input voltage vector v_in = v_update · L_in, where v_update is the update voltage, and v_gate = v_gate,update · L_gate with the update gate voltage v_gate,update. Due to the interplay of applied voltages on the horizontal inputs and opened transistor gates along the vertical output lines, the update pulse only reaches the sub-array given by the outer product of L_in and L_gate. The resulting conductance change Δg_i,j = (L_in L_gate^T)_i,j · f(g_i,j, v_update, v_gate,update, l_update) is then independent for each updated device and depends on the update and gate voltages, the update pulse length l_update, and the current device conductance g_i,j. The array structure could potentially allow an all-to-all connectivity between all neurons, but direct recurrent connections of a neuron to itself are not realized. To reflect the sparse connectivity in the brain, only 90% of all possible connections are realized, which is physically achieved by electro-forming only these randomly selected devices. Electro-forming is a necessary initialization procedure in many oxide-based resistive switching devices; devices that are not formed have a very high electrical resistance and act as open circuits. The SpikingTM model, however, uses a much sparser connectivity of only 10% to 20%. The reason for this difference is that the network size in the SpikingTM paper is much larger than in this work, so a lower percentage of all neurons must be connected to a neuron to generate a high activation. In a larger network, the architecture presented here could likewise employ a sparser connectivity.
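A hedged sketch of this parallel update is given below; the constant step dg_per_pulse is a placeholder for the actual non-linear device function f, and the sign convention (a negative update voltage potentiates) follows the device polarity described in the next paragraph:

```python
import numpy as np

# Sketch of the parallel sub-array update: the outer product of L_in and L_gate
# selects exactly the devices that see both the input pulse and an open gate.
def apply_update(G, L_in, L_gate, v_update, dg_per_pulse=5e-6):
    mask = np.outer(L_in, L_gate)                  # 1 only where pulse meets open gate
    dG = -np.sign(v_update) * dg_per_pulse * mask  # illustrative stand-in for f(...)
    return np.clip(G + dG, 1e-7, 5e-4)             # conductances bounded by device limits

# Example: potentiate the sub-array from neurons 0,1 to neurons 2,3
G = np.full((6, 6), 1e-5)
L_in = np.array([1, 1, 0, 0, 0, 0])
L_gate = np.array([0, 0, 1, 1, 0, 0])
G = apply_update(G, L_in, L_gate, v_update=-1.2)
```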
To simulate this system, the JART VCM v1b model (Bengel et al 2020) for a HfO2/TiOx bilayer ReRAM device (Hardtdegen et al 2018, Cüppers et al 2019) is used, which is a physics-based compact model capturing the switching and read-out dynamics of the memristive devices along with READ and update noise. The layer stack is depicted in figure 1(c). The compact model features two terminals, the bottom and top electrodes. A positive voltage from top to bottom electrode triggers a decreasing conductivity step (RESET), and a negative voltage from top to bottom an increase in conductivity (SET). Each memristive device is in series with a transistor of a commercially available 130 nm technology. The top electrode of the memristive device connects to a vertical output line of the array. The bottom electrode is attached to the drain contact of the transistor, while the source of the transistor is attached to an input line of the array. In such a structure, if the transistor gate is open, a positive voltage at the input causes a RESET of the memristive device, while a negative voltage leads to a SET.
As demonstrated in Cüppers et al (2019), the employed devices can, depending on the resistance range and pulse parameters, be operated either in a gradual (analog) or abrupt (binary) switching fashion. Gradual switching means that the conductance increases or decreases in small steps upon each stimulus. ReRAM devices usually show a non-linear increase and decrease of the conductance when a train of identical pulses is applied, and a saturation at a certain low or high conductance value. In the abrupt switching mode, the conductance changes suddenly from a low to a high conductance in the potentiation direction, or vice versa in the depression direction. Exemplary switching curves for the simulated memristive devices with the pulse parameters employed in section 3.2 can be found in the Supplementary Material. For a detailed discussion of the switching mechanisms we direct the reader to Cüppers et al (2019). For the algorithm presented here, the concrete mode of operation does not matter as long as the device integrates the applied voltage pulses. This can either be achieved by a gradual change of resistance, or by the change of an internal state variable, e.g. the oxygen vacancy concentration in a VCM-type memristive device, with abrupt switching upon reaching a threshold value. In the simulations below, an intermediate switching mode is present, with initially gradual switching followed by more abrupt switching.
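To make the integrate-and-threshold view concrete, the following toy model sketches a binary device whose hidden state integrates pulses and switches abruptly at a threshold; all names and parameters are illustrative and not fitted to the JART VCM v1b model:

```python
# Toy model of an abruptly switching device with an internal integrating state,
# loosely analogous to the oxygen vacancy concentration in a VCM cell.
class BinaryIntegratingDevice:
    def __init__(self, g_off=1e-5, g_on=4.7e-4, threshold=10.0):
        self.state = 0.0  # hidden integration variable
        self.g_off, self.g_on, self.threshold = g_off, g_on, threshold

    def pulse(self, weight):
        """weight > 0 for a potentiation pulse, weight < 0 for a depression pulse."""
        self.state = max(0.0, self.state + weight)

    @property
    def conductance(self):
        """Abrupt transition once the integrated state crosses the threshold."""
        return self.g_on if self.state >= self.threshold else self.g_off
```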

Peripheral circuit architecture
Similar to the SpikingTM, in the system presented here, neuron circuits are organized in subpopulations, where each group of excitatory neurons shares an inhibitory neuron and the same external input. Each subpopulation represents one sequence element. An excitatory neuron is a conventional mixed-signal CMOS circuit built around an integrator that yields a membrane potential by integrating the input current from the array on a membrane capacitance. The other functionalities of the neuron circuit are implemented digitally.
In contrast to the SpikingTM algorithm, a global clock, which is a signal periodically alternating between high and low voltages, is necessary for the architecture proposed here. Each clock cycle represents one time step, where the initial high voltage indicates the READ phase and the low voltage the Update phase, as depicted in figures 2(d) and (e).
While the inhibitory neurons are essentially counters of the prediction signals in their respective subpopulation, the functionality of the peripheral excitatory neurons is more complex. It can be described as a state machine with the three independent states 'Predictive', 'SpikeNow' (spiking in the current time step), and 'SpikeLast' (spiked in the last time step). As depicted in figure 1(d), an excitatory neuron receives and emits various signals. To the inhibitory neuron of its subpopulation, it emits its prediction state ('Prediction') and potentially receives an inhibition signal ('Inhibition'). Furthermore, each neuron receives the global signals 'Epoch' and 'Clock'. Connections to the array consist of the contact to a horizontal input line ('Spike Input'), one vertical output line from which the neuron receives the dendritic current input ('Dendritic Current'), and the respective transistor control line ('Gate Control'). The array connections are exclusive to each neuron. How the states in the neuron are triggered is described in the following.

READ Phase
The function implemented by the excitatory neuron during the READ phase is depicted in figures 2(a) and (b). If triggered by an external stimulus to its subpopulation and not inhibited by the respective inhibitory neuron, an excitatory neuron emits a spike in the READ phase, meaning its 'SpikeNow' state (indicated in green in figure 2(d)) is activated. This event is saved in the neuron until the next time step. The spike is applied as a rectangular voltage pulse to the respective input line of this neuron (gray). During the READ phase, all transistor gates are open, i.e. the gate voltage vector v_gate = v_gate,max for all transistor control lines. Therefore, all memristive devices connected to this input line receive this voltage pulse and generate currents according to Ohm's law on the vertical output lines connected to their 1T1R circuits. If multiple neurons spike at the same time, the generated currents sum up on the output lines according to Kirchhoff's law and feed into the neurons connected to the respective output lines. In other words, the input voltage vector to the horizontal input lines is v_in = v_read · L_in, where v_read is the READ voltage and L_in is 1 for all spiking neurons and 0 otherwise. The memristive crossbar array performs a VMM of this vector with the conductance matrix G, yielding the output currents i_out = G · v_in. The dendritic current is integrated over time by each neuron, yielding a membrane potential v_mem,j(t) given by the integral of the dendritic current from the beginning of the READ phase t_start,READ up to t, together with a term Θ(t) that resets the membrane voltage to zero at the end of each READ phase. If the membrane potential v_mem,j exceeds a threshold, the excitatory neuron becomes predictive, as sketched in figure 2(b). This predictive state is saved in the neuron until the end of the current time step; this event is indicated in figure 2(e). The inhibitory neuron of each subpopulation receives the prediction signals of the excitatory neurons in its subpopulation and counts the number of predictions. Once a certain count is reached, the inhibitory neuron emits an inhibition signal to all excitatory neurons in its subpopulation, hindering the non-predictive neurons from spiking.
This way, if a certain number of predictions is present in a subpopulation, only the predicted excitatory neurons can spike upon receiving an external stimulus. Additionally, each excitatory neuron can become predictive only once per epoch (similar to a refractory period in biological neurons), which is implemented by disabling a neuron that spiked until a global epoch signal is triggered. This implements a homeostasis function, effectively decreasing the synaptic weights incoming to neurons that receive high input too often. Only once per epoch (the first time) can the neuron become predictive and spike while inhibition is triggered in its subpopulation; only then are incoming connections to this neuron potentiated. If the neuron only receives input but does not spike subsequently, its incoming synaptic connections are depressed by the READ pulses.
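The READ-phase behavior can be condensed into the following sketch; the names and the idealized integration (a simple product of current and READ time on a membrane capacitance) are our assumptions, not the paper's analog circuit:

```python
import numpy as np

# Sketch of one READ phase: spiking neurons drive v_in = v_read * L_in, the array
# performs the VMM, and each neuron integrates its dendritic current; a threshold
# crossing sets the 'Predictive' state (parameters are illustrative).
def read_phase(G, spiking, v_read=0.6, t_read=1e-6, c_mem=1e-12, v_th=50.0):
    v_in = v_read * spiking.astype(float)  # READ pulses on the input lines
    i_out = G.T @ v_in                     # dendritic currents (VMM)
    v_mem = i_out * t_read / c_mem         # idealized integration, reset afterwards
    return v_mem > v_th                    # 'Predictive' state per neuron

def inhibition_active(predictive, subpop, threshold=2):
    # the inhibitory neuron counts the predictions within its subpopulation
    return predictive[subpop].sum() >= threshold
```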
The local Hebbian learning rule as implemented in the original SpikingTM model is the following: at each presynaptic neuron firing, the synapses undergo a (small) depression. Secondly, if (and only if) a postsynaptic neuron fires immediately after that, the synapses undergo a (larger) potentiation. Similarly, in the circuit presented here, the update of the memristive devices consists of two parts. The first part is a small depression, which is caused by the voltage pulses in the READ phase. As described by equation (3), the outer product of spiking neurons and open transistor gates, which are all open during the READ phase, is updated: in this case, the vector L_in is 1 for all spiking neurons and L_gate is 1 everywhere. The polarity of the memristive device is chosen such that, if a memristive device receives a READ pulse, it undergoes a small depression step. This corresponds to a presynaptic spike in the biological SpikingTM.

Update Phase
During the actual Update phase (depicted in figure 2(d)), only potentiation pulses are applied. To do that, all neurons that emitted a spike in the current READ phase open their vertically connected transistor gates. The entry of the vector L_gate becomes 1 if the corresponding neuron spiked in this time step, and 0 otherwise.
Then, all neurons that spiked in the previous READ phase emit a positive potentiation pulse to their horizontally connected input lines. The entry of the vector L_in is now 1 if the corresponding neuron spiked in the previous time step, and 0 otherwise. Only where this pulse intersects with open transistor gates does the potentiation pulse reach the memristive device and cause an increase in conductivity. Following equation (3), this update therefore only applies to synapses connecting neurons that spiked in the last time step to neurons firing in the current time step. This corresponds to synapses with a postsynaptic spike immediately following a presynaptic spike. The potentiation pulse is significantly stronger than the depression caused by READ pulses in the READ phase. As a result, devices that are read (presynaptic spike) experience a small depression, and only if they connect neurons that spike in consecutive time steps (postsynaptic spike) do they receive a strong potentiation. This implements a Hebbian-type learning rule.
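Putting both phases together, one clock cycle of the plasticity rule amounts to the following sketch (helper names and step sizes are ours; the real devices update stochastically and non-linearly):

```python
import numpy as np

# One clock cycle of the learning rule: every synapse on a spiking neuron's input
# line is slightly depressed in the READ phase (all gates open), and synapses from
# last step's spikers to current spikers are strongly potentiated in the Update phase.
def time_step_update(G, spiked_last, spike_now, dg_dep=1e-7, dg_pot=5e-6):
    read_mask = np.outer(spike_now, np.ones(G.shape[1]))  # rows of spiking neurons
    G = G - dg_dep * read_mask                            # small depression (READ)
    pot_mask = np.outer(spiked_last, spike_now)           # pre-then-post pairs
    G = G + dg_pot * pot_mask                             # strong potentiation (Update)
    return np.clip(G, 1e-7, 5e-4)
```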

Inspiration by the SpikingTM algorithm
As described above, the network presented here is inspired by the SpikingTM concept. When transferring the SpikingTM to the memristive neuromorphic hardware proposed here, the overall structure of subpopulations of excitatory neurons with a recurrently connected inhibitory neuron per subpopulation and sparse synaptic connectivity between the neurons is maintained. However, the complex biological dynamics of the original neurons and synapses are simplified such that they can be matched by mixed-signal electronic circuits, with the aim of minimizing the energy and area consumption.
In this sense, the leaky integrate-and-fire (LIF) dynamics of the biological excitatory neuron's membrane potential are represented by an operational amplifier integrator which is reset after each time step. The inhibitory neuron, which originally also exhibits LIF dynamics, is realized by a digital counter of the prediction signals in its subpopulation.
Synaptic currents in the memristive crossbar array are stimulated by voltage pulses from the neuron circuits and depend on the conductance of the memristive devices according to Ohm's law. They do not show the biologically plausible exponentially decaying behavior of the SpikingTM network. Furthermore, the update dynamics of the synapses are dictated by the non-linear switching dynamics of the memristive devices, while the stimuli responsible for this update are generated by the peripheral neuron circuitry. This is in contrast to the inherent STDP-like plasticity of the synapses in the original SpikingTM work.
An overview of the comparison between the original SpikingTM algorithm and the hardware implementation here is given in table 1.
In section 4.2, these differences between the original SpikingTM algorithm and the hardware implementation proposed here are discussed in greater detail.

Performance metrics
Learning sequences in the frame of this work means being able to correctly predict the next sequence element based on previously seen elements. A sequence element is predicted if a certain number of neurons in the subpopulation corresponding to this element become predictive, i.e. receive a high current input during a READ phase.
In this work, a sequence is correctly predicted if all sequence elements are predicted at the right time and no other element is, while in the original SpikingTM works, only the last sequence element must be predicted correctly. During training, two kinds of prediction errors can occur: false negative and false positive predictions. A false negative prediction is a missing prediction where one is necessary. For example, if the character B follows the character A and two neurons in subpopulation B must be predictive to call the character B predicted, a false negative prediction occurs if only one neuron in subpopulation B is predicted; if none is predicted, this counts as two false negatives. If, on the other hand, a neuron in C is predicted after the spiking of A, this is a false positive, because it is a prediction where none should occur. The sum of false negative and false positive predictions yields the total prediction error. The error is normalized to the initial number of false predictions.
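A sketch of this error count, with variable names of our choosing, could look as follows:

```python
# Counts false negatives (missing predictions in the target subpopulation) and
# false positives (predictions in any other subpopulation) for one transition;
# the totals are later normalized to the initial number of false predictions.
def prediction_error(predicted, target_subpop, other_subpops, required=2):
    false_neg = max(0, required - len(predicted & target_subpop))
    false_pos = sum(len(predicted & s) for s in other_subpops)
    return false_neg + false_pos

# Example from the text: after A, two neurons in B must be predicted
B, C = {6, 7}, {8, 9}
prediction_error({6}, B, [C])        # -> 1 (one false negative)
prediction_error(set(), B, [C])      # -> 2 (two false negatives)
prediction_error({6, 7, 8}, B, [C])  # -> 1 (one false positive)
```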
The overall goal of using dedicated neuromorphic hardware with a memristive crossbar array is energy efficiency. Therefore, the energy consumption of the system during sequence learning is investigated. Since we do not propose a concrete implementation for the peripheral circuitry, which together with the technology node would strongly determine the energy consumption of this part of the system, only the energy dissipated in the memristive array is considered. To this end, the voltages applied to the input lines of the array are multiplied with the input currents, and the resulting power is integrated over one epoch.
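In code, this bookkeeping reduces to a few lines (a sketch with assumed waveform arrays, not the actual simulator output format):

```python
import numpy as np

# Energy dissipated in the array over one epoch: sum of input-line voltage times
# input-line current, integrated over the sampled waveforms with time step dt.
def array_energy(v_in_trace, i_in_trace, dt):
    """v_in_trace, i_in_trace: (n_samples, n_lines) arrays of V and A."""
    power = np.sum(v_in_trace * i_in_trace, axis=1)  # instantaneous power (W)
    return float(np.sum(power) * dt)                 # energy per epoch (J)
```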

Results
In this section, we start by demonstrating the training algorithm with a walk-through of a simple task on a small toy network. After that, the network's ability to learn complex high-order sequences is shown and the origin of the context-specific prediction is explained.
Context-sensitive prediction in this case means that the network must be able to predict the next character in a sequence before this character is externally stimulated, by integrating the history of spiking characters. This is especially challenging in the case of high-order sequences, which is a proposed benchmark for the HTM (Cui et al 2016). These are sets of sequences where different predictions must be made not based upon the immediate history of excitations, but based on information that lies further in the past. An example are the two sequences {A, B, C} and {D, B, E}. Both have the element B as the central element. Therefore, the information that B is the last element is not sufficient to decide which element comes next and should be predicted. Instead, which element comes last, either C or E, depends on the first element of the sequence, A or D. A context-sensitive sequence learning system must be able to correctly predict the last element after the common element, depending on the first element of the sequence. This task can be made more complicated by having multiple shared elements in the center of the sequences, e.g. {A, B, C, D} and {E, B, C, F}, where B and C are the common central sequence elements. After showing that the proposed network can also successfully learn and predict these more complex high-order sequences, we conclude the Results section with an investigation of the energy consumption during training.

Walk-through of the training algorithm
In the following, we explain the proposed training algorithm in detail with the exemplary training of the sequence {A, B, C} in a small six-by-six 1T1R device array. The network is shown in figure 3. In this exemplary network, all synaptic connections are realized, while in the later simulated networks only 90% of the connections are present. The six neurons are organized in three subpopulations (A, B, and C) with two neurons each. Each subpopulation represents one character, A, B, or C. In contrast to the later simulations, we here assume an inhibition threshold of one, meaning that as soon as one neuron in a subpopulation becomes predictive, the other neuron is inhibited. In the later simulations, a value of two is chosen. In the beginning, all memristive synapses are initialized in a low conductive state. Due to device variability, these states are not identical but normally distributed.
Each training epoch consists of the consecutive external stimulation of the subpopulations A, B, and C. For the sake of clarity, only the conductance array of the system displayed in figure 1(a) is drawn here, with the peripheral circuitry omitted and the excitatory neurons depicted only by letters. Each subpopulation of two excitatory neurons still shares one inhibitory neuron with an inhibition threshold of one.

Stimulus A on pristine network
If the subpopulation A is externally stimulated in the first READ phase, all neurons in A (A1 and A2) emit a READ pulse, because none of them is inhibited, as A is the first character of the sequence (see figure 3(a), upper left). This corresponds to L_in = (1 1 0 0 0 0)^T and L_gate = (1 1 1 1 1 1)^T. All synaptic connections colored in red receive this READ pulse and are slightly depressed due to the polarity of the READ pulse. Furthermore, each synapse generates a current according to Ohm's law, which is small due to the initially low conductances. The currents are summed up on the vertical lines and feed into the neurons connected to the respective output lines (see also figure 1(a)). Since all currents are small, no neuron becomes predictive. In the following Update phase, A1 and A2 open their vertically connected transistor gates (see figure 3(a), lower left) and thereby enable these 1T1R devices to be updated. As this is the first character of the sequence, there was no spiking in the previous time step, and no potentiation pulse is emitted (see figure 2(b)).

Stimuli B and C on pristine network
In the second time step, the subpopulation B is externally stimulated and again, all neurons in this subpopulation (B1 and B2) spike, because no inhibition was previously caused. Again, all synapses horizontally connected to B1 and B2 are slightly depressed, as depicted in figure 3(a) (upper mid). The generated currents are again too small to cause any prediction. In the Update phase, neurons B1 and B2 open their vertically connected transistor gates (L_gate = (0 0 1 1 0 0)^T) and now the neurons in A (A1 and A2) emit a potentiation pulse to their horizontally connected lines (L_in = (1 1 0 0 0 0)^T), because they spiked in the previous time step. At the intersections of the horizontal lines, where A1 and A2 emit the potentiation pulse, and the vertical lines, where B1 and B2 have opened the transistor gates enabling an update, a strong potentiation of the conductance is caused, corresponding to the outer product of L_in and L_gate. These are exactly the synapses connecting subpopulation A to B, as depicted with green circles in figure 3(a) (lower mid).
A similar situation can be found in the next time step upon the stimulus of character C, where the whole subpopulation spikes in the READ phase as shown in figure 3(a) (upper right) and all synapses leading from B to C are potentiated in the Update phase as depicted in figure 3(a) (lower right). Summing up all depressions in the READ phases and potentiations in the Update phases of this training epoch, and considering that a potentiation is much stronger than a depression, the net effect is that all synapses are slightly depressed, except for those leading from subpopulation A to B and from B to C, which are potentiated.
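Using the time_step_update() sketch from above, one training epoch of this toy example can be reproduced schematically (deterministic steps here, whereas the real devices switch stochastically):

```python
import numpy as np

# Toy reproduction of one epoch of the walk-through: six neurons in three
# subpopulations; on the pristine network the whole subpopulation spikes.
subpops = {"A": [0, 1], "B": [2, 3], "C": [4, 5]}
G = np.random.default_rng(1).uniform(6e-6, 45e-6, (6, 6))

def spikes_of(char, n=6):
    v = np.zeros(n)
    v[subpops[char]] = 1.0
    return v

spiked_last = np.zeros(6)
for char in "ABC":                    # external stimuli of one epoch
    spike_now = spikes_of(char)
    G = time_step_update(G, spiked_last, spike_now)
    spiked_last = spike_now
# net effect per epoch: slight depression everywhere except the A->B and B->C
# sub-arrays, which receive a strong net potentiation
```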
This process is repeated in every epoch. Due to the inherent stochasticity of the update dynamics of the memristive devices, and enhanced by their self-accelerating potentiation behavior, after some epochs there will be some synapses which have sufficiently increased their conductivity to cause a larger current upon the application of a READ pulse and cause predictions in other neurons. These mature synapses can only be found in the sub-arrays of the memristive array leading from A to B and from B to C, because only these synapses are potentiated. We exemplarily choose three synapses to be mature. These are depicted with filled circles in figure 3(b).

Stimulus A on trained network
If A is now externally stimulated, again, neurons A1 and A2 both emit a spike, because A is the first sequence element and no previous prediction or inhibition exists. As before, all synapses horizontally connected to the neurons A1 and A2 receive a READ pulse and are slightly depressed. However, the mature synapse from neuron A1 produces a large output current upon the READ pulse and thereby causes neuron B1 in subpopulation B (depicted with a yellow arrow in figure 3(b), upper left) to become predictive. Neuron B2 is inhibited via the inhibitory neuron of subpopulation B, because we chose an inhibition threshold of one for this example. In the Update phase, depicted in figure 3(b) (lower left), we find the same situation as for the pristine network, because there was no previous spiking activity.

Stimuli B and C on trained network
Now, when the subpopulation B is externally stimulated, only the predicted neuron B1 in subpopulation B spikes, while neuron B2 is inhibited. Therefore, only synapses horizontally connected to neuron B1 receive a READ pulse and are depressed (see figure 3(b), upper mid). In the following Update phase, depicted in figure 3(b) (lower mid), only neuron B1 enables its vertically connected synapses for update. All neurons in A send a potentiation pulse because the whole subpopulation previously spiked. In contrast to the beginning of the training, this potentiation can now only arrive at those synapses connecting subpopulation A to neuron B1, which was predicted by the spiking of A. The connections from A to neuron B1 are therefore further enhanced, while synapses leading from A to the not predicted neuron B2 are weakened. When C is stimulated in the next time step, the situation is similar, as depicted in figure 3(b) (upper right) for the READ phase and figure 3(b) (lower right) for the Update phase.
Looking again at the sum of all potentiations and depressions, one can see that as soon as predictions are caused in the network, the synaptic connections that lead to these predictions (A1 and A2 to B1, and B1 to C2) are further enhanced, while connections leading to other neurons in the same subpopulation are weakened. This leads to sparse conductive pathways through the synaptic network representing the trained sequences, which are dependent on the specific context.

High-order sequence learning
The ability to achieve context-specific predictions enables the network to successfully learn even more complex sequence learning tasks, such as the high-order sequences {A, B, C} and {D, B, E}. We simulate this training in a 36-by-36 synapse 1T1R array with 36 excitatory neurons. Each 1T1R device consists of a memristive device modeled with the JART VCM v1b model in series with a CMOS transistor in a commercially available 130 nm technology. The aforementioned learning rule is implemented in Verilog-A for each neuron. The neurons are organized in six subpopulations, each with one inhibitory neuron. Each subpopulation represents one of the sequence elements, in this case the characters A to F. In the array, a random 10% of the synapses are not realized. Synapses that would form direct recurrent connections from a neuron to itself are also not realized. In a real array this would be achieved by not forming these devices, which leaves them in a very high ohmic state.
The parameters for the MemSpikingTM for these simulations are listed in table 2. The algorithm can also converge with other pulse parameters. By increasing the length or magnitude of the pulses, especially of the Update pulse, a faster convergence can be achieved, but this can also lead to oscillations in the training and prevent convergence altogether. All memristive devices are initialized with a concentration of oxygen vacancies drawn from a normal distribution with a mean of N_init,mean = 5 × 10^24 m^-3 and a standard deviation of N_init,range = 5 × 10^23 m^-3, which leads to an initial conductance of the memristive devices between 6 µS and 45 µS at the READ voltage of 0.6 V. The network consists of n_neurons = 36 neurons which are connected by a crossbar array with 90% of the synaptic connections realized. Self-to-self connections of neurons are furthermore not realized. This leads to a total of n_synapses = 1134 synaptic 1T1R devices.
As shown in figure 4(a), in the very first epoch of the training, when a character is externally stimulated, the whole subpopulation representing this character spikes. There are no initial predictions. Figure 4(c) shows the conductance matrix after training. One can clearly see how some synapses have become mature in the sub-matrices leading from A to B, D to B, B to C, and B to E. Mature synapses have memristive devices with an oxygen vacancy concentration of about 2 × 10^26 m^-3, which corresponds to a conductance of 470 µS at the READ voltage of 0.6 V. Immature memristive devices are at or near their initial conductance of at most about 45 µS. A detailed walk-through of the different intermediate steps that lead to such a conductance matrix can be found in the Supplementary Material. With this trained conductance matrix, a spiking of the subpopulation A, triggered by an external stimulus of the respective subpopulation, causes a prediction in the subpopulation B, as shown in figure 4(b). If now only the predicted neurons in B spike after B is externally stimulated, this causes a prediction in C. On the other hand, if D is stimulated before B, the spiking of neurons in B causes a prediction in E. The network is therefore able to make a context-specific prediction after the external stimulus of B: if A is stimulated before B, C is predicted, while if D is stimulated before B, E is predicted. This experiment was repeated ten times with different initializations of the memristive array. Figure 4(d) shows the evolution of the prediction errors over the training epochs, with the dark line representing the average error and the light area the maximum and minimum number of errors at that epoch. As one can see, most prediction errors are false negatives, because initially there are no predictions at all. It then takes between six and 25 epochs for the system to successfully learn both sequences {A, B, C} and {D, B, E}.
The ability of the network to make context-specific predictions arises from the formation of sparse conductive sub-pathways through the memristive array. This will now be explained using the memristive array of one training instance of the high-order sequence training of the sequences {A, B, C} and {D, B, E} described above. Figures 5(a) and (b) both show the final conductance matrix of this training. If A is externally stimulated, all neurons in A emit a READ pulse which is then received by all synapses colored in red in figure 5(a). Most of these synapses have a low conductivity and generate only a very small current. But two columns of synapses in the sub-matrix leading from A to B, specifically columns 8 and 11, are highly conductive. These synapses generate a large current when receiving a READ pulse and therefore cause predictions in the neurons they lead to, neurons 8 and 11.
On the other hand, if the character D is externally stimulated and all neurons in the respective subpopulation emit a READ pulse (all synapses receiving this pulse are colored in blue), there are two other columns, 6 and 9, which are highly conductive in the sub-matrix leading from D to B. Therefore, a spiking of the subpopulation D will cause a prediction in these neurons, 6 and 9. It is important to note that the sets of neurons within subpopulation B which are predicted by the spiking of either A or D are disjoint. We call the subset which is predicted by A 'B in the context of A' or B|A, and the subset which is predicted by D 'B in the context of D' or B|D. Figure 5(b) now shows what this prediction leads to when the character B is externally stimulated. If previously the subset B|A was predicted, only the neurons 8 and 11 will spike, because the inhibition threshold is reached with two predicted neurons and the inhibition hinders all non-predicted neurons from spiking. In the rows horizontally connected to the neurons B|A (8 and 11), there are mature synapses leading to neurons in the subpopulation C. These neurons are predicted when the neurons in B|A emit a READ pulse. There are, however, no strong connections leading to other neurons, especially not to neurons in the subpopulation E. On the other hand, if D was initially stimulated and the neurons in B|D are predicted, these will spike and emit a READ pulse to their respective horizontal lines, 6 and 9. From these lines there are only strong synaptic connections to neurons in E, but none to C.
By forming conductive pathways leading to disjoint predicted subsets of neurons in the subpopulation B, which depend on the context of either A or D, the network can successfully predict either C or E, depending on the first character of the sequence.
The homeostasis function, which allows each neuron to become predictive only once per epoch, plays an important role in ensuring the disjoint character of these subsets. Without this condition, it would be possible that highly conductive columns form from both A and D to the same neuron in B. Since this happens by chance, it is less likely in a larger network and for a less complicated task. The homeostatic condition, however, eliminates this risk for all network sizes and tasks.
This relatively small 36-by-36 array using the proposed learning rule can even be used to train more complex high-order sequences with two overlapping elements. Figure 6 shows the training of the sequences {A, B, C, D} and {E, B, C, F}. Figure 6(a) shows how initially, when the characters are stimulated, no predictions are made and all neurons in the respective subpopulations spike. After training, one can again find columns of mature synapses in sub-matrices leading from one subpopulation to another, forming context-specific conductive pathways, shown in figure 6(c). This leads to successful context-sensitive prediction, depicted in figure 6(b). An external stimulus of {B, C} causes a prediction in D if previously A was stimulated, while a prediction in F is caused if E was the first character. If one looks carefully, in both cases different subsets of neurons in B and C are predicted. The experiment was again repeated ten times and, as visible in figure 6(d), the training of these more complicated sequences takes between 20 and 50 epochs. Again, the majority of prediction errors are false negatives, but here also a number of false positives occur. This especially happens when the conductive pathways through the sub-sequence {B, C} diverge.

Energy consumption in the memristive crossbar array
The main goal of using dedicated hardware, and especially memristive solutions, for the implementation of neuromorphic algorithms is energy efficiency. Figure 7 shows the energy consumption in the memristive array during the training of the high-order sequences described above. This includes only the electrical energy dissipated in the memristive array to read devices in the READ phase and to apply potentiation pulses in the Update phase. The energy consumption of the peripheral circuitry strongly depends on the used CMOS technology node and the actual implementation. Since this makes its contribution hard to estimate, the peripheral energy consumption is not taken into account here. A more detailed description of how the energy consumption in the array is calculated can be found in the Supplementary Material.
In section 3.2, we describe the training of the high-order sequences {A, B, C} and {D, B, E} and show that for some instances of the training it took no more than ten epochs to reach correct prediction. The blue curve in figure 7(a) exemplarily shows the energy consumption in the memristive array per epoch (determined as described in section 2.2) during such a training. After an initial increase during the first two epochs, the energy consumption drops suddenly from almost 400 nJ per epoch to below 300 nJ per epoch. This coincides with the decreasing prediction error during training. The gray curve in figure 7(a) shows the accumulated energy consumed during training. Considering the training successful after 10 epochs, the total energy consumption up to this epoch was about 4 µJ.
A similar decrease in energy consumption over the training can be observed for the training of the more complex sequences {A, B, C, D} and {E, B, C, F}, which is depicted in figure 7(b). The blue curve showing the energy consumption per epoch exhibits distinct plateaus while decreasing from initially 500 nJ per epoch to a minimum of below 400 nJ per epoch at about epoch 45. These plateaus also stem from the increasing sparsity of the spiking when inhibition is triggered. The total energy consumption to complete the training is here about 20 µJ at epoch 45.
For comparison with common floating-gate memory technology, the read and update activity during training was extracted. Assuming a read-out energy consumption of 100 pJ bit^-1 and a typical average write energy per bit, the estimates depicted in grey in figure 7 assume that each synaptic weight is represented by 1 bit of floating-gate memory (while the resolution of the memristive devices is actually higher), which can be regarded as a lower bound for the energy consumption.

Discussion
In this work, we describe how the SpikingTM algorithm is adapted for dedicated neuromorphic hardware based on a memristive crossbar array. The proposed algorithm is able to learn even complex high-order sequences, and we show how this is made possible by context-specific conductive pathways in the memristive array. Furthermore, the energy consumption in the memristive array decreases as training progresses, which makes the system interesting for energy-sensitive applications at the edge.
After discussing how the network and algorithm presented here relate to other sequence learning approaches, we discuss the most important choices made during the adaptation of the original SpikingTM algorithm and why they benefit its application in the proposed memristive neuromorphic hardware. Before explaining important scaling aspects, we show how the specific combination of a biologically inspired spiking algorithm and a memristive crossbar architecture is suitable for an energy-efficient implementation of a sequence learning algorithm.

Relation to previous work
Different hardware implementations of the HTM algorithm have been proposed using memristive devices (Krestinskaya et al 2018). In most cases, only the Spatial Pooler segment of the HTM, which is not investigated here, has been implemented using memristive devices. In other cases, the TM part uses memristive crossbar arrays as dot product engines without specific adaptation of the learning algorithm (Fan et al 2015, Krestinskaya et al 2017), while this is the goal of this work. However, there are two previous works strongly related to the network proposed here. The first one is a sequence learning approach by Doevenspeck et al (2018) implemented on a 1 Mbit TaOx ReRAM array. The system topology, consisting of a 1T1R array which forms recurrent connections for the neurons, is similar to the network structure proposed here. Each neuron is connected to exactly one horizontal and one vertical line of the array. Furthermore, the algorithm proposed by Doevenspeck et al yields a similar total update matrix as this work: synapses connected to a spiking PRE neuron on the horizontal input line are depressed unless they are also connected to a spiking POST neuron on a vertical output line. The system ensures spatial locality of information, as all storage and processing of information is conducted within the network and no external trainer or loss signal is necessary. While temporal locality of information is given in our approach, the work by Doevenspeck et al employs a complex temporal look-ahead algorithm to determine which neurons of a subpopulation should participate in a context-specific conductive pathway (Doevenspeck et al 2019), which is a major difference to the algorithm proposed here. This not only adds computational complexity, but also makes knowledge of the complete sequence necessary before it is applied to the network. The look-ahead algorithm thus breaks the temporal locality of information and prohibits real-time online learning.
A second strongly related work is the MEMSORN network by Payvand et al (2022). Here, a memristive crossbar array recurrently connects LIF neurons to perform sequence learning tasks in a spiking RNN (SRNN). External inputs connect directly to the network via additional array columns. The authors show that the network can successfully learn sequences with repeating elements and correctly predict a terminal element after a certain number of repeated elements. They report a significant gain in accuracy compared to a static, randomly recurrently connected network. However, an additional trained linear read-out layer is necessary in MEMSORN, increasing the computational complexity and necessitating a two-phase global learning scheme, in which first the parameters of the SRNN and then the linear read-out are trained.
We therefore see the benefits of the network presented in this paper especially in the simplicity of the algorithm, which is conducted completely within the memristive crossbar array and the neuronal peripheral circuitry.

Design decisions for a hardware-friendly implementation
While a common approach of neuromorphic computing is to mimic biological information processing as closely as possible to achieve the high efficiency of the brain, it is sometimes necessary to relinquish some of the biological principles to meet certain restrictions of a non-biological compute substrate. When adapting the biologically plausible SpikingTM algorithm for execution on a memristive array, some fundamental changes had to be made, which are discussed in the following:

Discrete vs. continuous time
While the original SpikingTM operates continuously in time, the framework proposed here employs a global clock signal and has one distinct READ (inference) and one Update phase per time step. This is necessary as the memristive device requires different applied signals for read-out and for its plasticity update. Global clock here means a signal that alternates between a HIGH and a LOW value (or '1' and '0') with a fixed frequency and duty cycle. This signal is the same for all circuits and determines the READ (HIGH) or Update phase (LOW).
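The clock convention can be stated compactly; the period and duty cycle below are placeholders (see the discussion of time scales further down):

```python
# Global clock sketch: HIGH marks the READ phase, LOW the Update phase; a large
# duty cycle makes the Update phase much shorter than the READ phase.
def clock_phase(t, period=1e-3, duty=0.9):
    """Returns 'READ' during the first duty fraction of each cycle, else 'UPDATE'."""
    return "READ" if (t % period) < duty * period else "UPDATE"
```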
The reason for this change in the operation time lies in the bipolar operation mechanism of the memristive devices requiring different voltage polarities for potentiation and depression of the conductance. These have to be applied sequentially.
In principle, the algorithm proposed here could be run in a quasi-continuous fashion similar to the original SpikingTM on a conventional computer, by extending the time relation of signals from just one time step to multiple time steps. However, this is outside the scope of this work.
Due to the two distinct phases of the clocked operation, external stimuli must be applied during the READ phase of a time step. An external stimulus applied during the Update phase would have no effect and the information would be lost, which is a potential shortcoming of the clocked time approach. This issue can be mitigated by making the Update phase significantly shorter than the READ phase, i.e. by changing the duty cycle of the global clock signal. The update of a memristive device can theoretically happen in the picosecond regime (Böttger et al 2020, von Witzleben et al 2021), while biologically plausible time scales are on the millisecond scale. The READ phase can therefore potentially be multiple orders of magnitude longer than the Update phase, tailoring the time scale of the system to the frequency of the input signal.
This absence of absolute time constants makes it necessary to signal the end of an epoch in order to correctly apply the homeostasis condition. In the given implementation, this is done by an additional signal at the beginning of every epoch. If the period of the input signal is unknown, this signal could, for example, be generated once a certain number of clock periods passes without an input signal, indicating the end of an epoch.
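A hypothetical detector for this epoch signal could simply count silent clock periods, as in the following sketch; the idle limit and the stream format are assumptions.

```python
IDLE_LIMIT = 5  # assumed number of silent clock periods that ends an epoch

def tag_epochs(stimuli):
    """Yield (stimulus, epoch_start) per clock period; stimulus may be None."""
    idle, pending = 0, True            # the very first stimulus opens an epoch
    for stimulus in stimuli:
        if stimulus is None:
            idle += 1
            if idle == IDLE_LIMIT:     # long silence: next input starts an epoch
                pending = True
            yield None, False
        else:
            yield stimulus, pending
            pending, idle = False, 0

stream = ["A", "B", "C"] + [None] * 6 + ["A", "D", "E"]
for stim, starts_epoch in tag_epochs(stream):
    if starts_epoch:
        print("epoch signal before element", stim)
```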

Local vs. periphery-mediated plasticity
Synaptic plasticity in the brain happens within the synapses themselves: by comparing pre- and post-synaptic spikes, the conductivity of a synapse is adjusted without the need for an external mechanism. Synapses in the SpikingTM network follow the same principle and implement synaptic weight and plasticity at the same time. A memristive device, however, offers no such plasticity mechanism by itself. Additional circuitry would therefore be necessary to detect incoming spikes, compute the update and execute it. Since each memristive device in the array would need such circuitry, this approach would significantly increase the silicon footprint of the network and consume additional energy. There are approaches that solve this by engineering the shape of the pre- and post-synaptic pulses (Panwar et al 2017, Guo et al 2019), but they require high timing and voltage precision of complex pulse shapes, which would demand extensive analog pulse generators consuming energy and silicon area.
The algorithm proposed here takes a hybrid approach. The signals relevant for the update are generated in the peripheral neuron circuitry; through an interplay of opening transistor gates along the vertical columns and applying update voltages along the horizontal rows of the array, a simple STDP-like update rule is realized without complex analog waveform generators or additional per-synapse plasticity circuitry.
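The sketch below gives one plausible reading of the resulting total update on the permanence values, with rows driven by spiking PRE neurons and column gates opened by spiking POST neurons; the signs, magnitudes and names are chosen for illustration only.

```python
import numpy as np

def update_permanences(P, pre_spikes, post_spikes, dp=0.10, dm=0.05):
    """P: permanence matrix; pre/post_spikes: boolean spike vectors (assumed)."""
    pre = pre_spikes[:, None].astype(float)    # update voltage on rows
    post = post_spikes[None, :].astype(float)  # transistor gates on columns
    # potentiate where PRE and POST both spiked, depress where only PRE did
    P = P + dp * pre * post - dm * pre * (1.0 - post)
    return np.clip(P, 0.0, 1.0)

rng = np.random.default_rng(0)
P = rng.uniform(0.0, 0.5, size=(4, 4))
P = update_permanences(P,
                       pre_spikes=np.array([1, 0, 1, 0], bool),
                       post_spikes=np.array([0, 1, 1, 0], bool))
```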

Homeostasis
A small difference can be found in the implementation of the homeostasis function. While in the original SpikingTM algorithm synapses are depressed if they cause predictions in the same neuron too often, in the implementation proposed here each neuron can only become predictive once per epoch. Both mechanisms have a similar overall effect but act at different stages of the network.
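A minimal sketch of this once-per-epoch rule, with names chosen for illustration:

```python
class PredictiveLatch:
    """Lets each neuron become predictive at most once per epoch."""

    def __init__(self, n_neurons):
        self.used = [False] * n_neurons

    def allow_prediction(self, neuron):
        if self.used[neuron]:       # already predictive this epoch
            return False
        self.used[neuron] = True
        return True

    def new_epoch(self):            # driven by the epoch signal described above
        self.used = [False] * len(self.used)
```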

Neurons
In the original SpikingTM network, both excitatory and inhibitory neurons are point neurons with LIF dynamics. In this work, by contrast, the neurons have separate circuits for sensing the somatic input (external stimulus) and the dendritic input (from the memristive array). These are implemented as integrators without leakage and share a common membrane potential.
The inhibitory neuron in this implementation is realized as a digital counter of the predictive neurons in a subpopulation. Due to the clocked timing of the algorithm proposed here, it is not necessary to implement leaky integration dynamics.
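The following sketch summarizes these neuron dynamics; the thresholds and names are illustrative assumptions, not the actual circuit values.

```python
V_THRESH = 1.0  # spike threshold of the shared membrane potential (assumed)
D_THRESH = 0.6  # dendritic threshold for becoming predictive (assumed)

class Neuron:
    def __init__(self):
        self.v = 0.0                # shared membrane potential, no leak term
        self.predictive = False

    def integrate(self, somatic_in, dendritic_in):
        if dendritic_in >= D_THRESH:       # dendritic action potential
            self.predictive = True
        self.v += somatic_in + dendritic_in
        if self.v >= V_THRESH:
            self.v = 0.0                   # reset after spiking
            return True
        return False

def inhibit(subpopulation, count_threshold=2):
    """Digital counter: inhibition fires once enough neurons are predictive."""
    return sum(n.predictive for n in subpopulation) >= count_threshold
```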

Algorithm-inherent decrease of energy consumption
In section 3.3 we demonstrated how the energy consumption of the system decreases during training. This coincides with a decrease of the prediction error, more specifically with the onset of predictions and hence a growing connection sparsity in the network. If neurons become predictive and inhibition is triggered, only a subset of neurons in a subpopulation spikes when the subpopulation is externally stimulated. Fewer READ pulses are generated and the number of read synapses decreases. In this way, connection sparsity leads to an algorithm-inherent gain in energy efficiency during training.
Such sparse connectivity can also be observed in the original TM and SpikingTM algorithms, but in an implementation on a classical computer, even small conductances have to be taken into account and small output currents have to be calculated, even if they play no role for the result. In a digital computing system, the magnitude of a value has no influence on the energy it takes to communicate that value. In a memristive crossbar array, on the other hand, if no voltage is applied on the input side by a non-spiking neuron, no current flows and no energy is consumed. Even if a neuron emits a READ pulse, only a few mature synapses will produce a high current, while all others with a low conductance produce a low current, which again consumes less energy. Through the combination of the biologically inspired learning algorithm and the memristive crossbar architecture, significant energy is therefore only spent to communicate important information. This is a principal gain in energy efficiency of memristive neuromorphic hardware over implementations of neuromorphic algorithms on conventional computers.
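A back-of-the-envelope estimate illustrates both effects: only rows driven by spiking neurons dissipate power, and low-conductance devices on driven rows contribute little. All voltages, durations and conductances below are assumed for illustration.

```python
import numpy as np

V_READ, T_READ = 0.2, 1e-3     # read voltage [V] and pulse width [s], assumed

def read_energy(G, spiking_rows):
    """G: conductance matrix [S]; spiking_rows: boolean vector of READ pulses."""
    # E = V^2 * G * t, summed over the devices on driven rows only
    return (V_READ ** 2) * G[spiking_rows].sum() * T_READ

rng = np.random.default_rng(1)
G = rng.uniform(1e-6, 1e-5, size=(8, 8))   # mostly low-conductance devices
all_spiking = np.ones(8, bool)             # early training: dense activity
sparse = np.zeros(8, bool)
sparse[:2] = True                          # after learning: sparse activity
print(read_energy(G, all_spiking), read_energy(G, sparse))
```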
The gain in energy efficiency can be quantified by comparing the energy consumption of the memristive implementation presented here to a network in which the synaptic weights are stored in floating-gate memory. Like memristive devices, floating-gate memory stores information in a non-volatile fashion, whereas other data storage technologies like DRAM and SRAM are volatile and require loading the stored data into the system for computation. We therefore compare the approach presented here with a floating-gate-based implementation, because floating-gate transistors resemble the operation of memristive devices most closely. Even when assuming floating-gate memory with only single-bit precision, the memristive implementation outperforms it with a factor of 3 lower energy consumption. Since one-bit precision is a lower bound, this comparison is a conservative estimate: an implementation with multi-level floating-gate memory exhibiting the same multi-level characteristics as the memristive devices would consume even more energy.

Scaling properties
All studies presented here are based on circuit-level simulations. The use of the accurate and physics-based, yet computationally complex, JART v1b model prohibits large-scale simulations; convergence issues limit the simulated array to the size given here. The question therefore arises whether the proposed network can be scaled and how this would impact the learning performance.
The scaling complexity differs between the neural peripheral circuitry and the memristive array. Because every neuron is represented by a distinct circuit, the peripheral circuitry scales linearly with the number of neurons. This includes the mixed-signal circuits applying signals to the horizontal input lines and the current-sensing and thresholding circuits at the vertical outputs. The memristive array, however, scales quadratically with the number of neurons, because it mediates all-to-all connectivity between them. The quadratic scaling is acceptable in this case, because each element of the memristive array consists of only one transistor, integrated in the front end of line, and one memristive device, added in the back end of line. Not only are these cells themselves smaller than the neuron circuitry, but because they are placed at different levels of the integration stack, they can also be collocated.
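The different growth rates can be illustrated with a simple count (circuit and cell counts only; area factors are left out):

```python
# Peripheral circuits grow linearly, the 1T1R array quadratically.
for n_neurons in (100, 1_000, 10_000):
    periphery = n_neurons        # one mixed-signal circuit per neuron
    cells = n_neurons ** 2       # all-to-all 1T1R cells, stacked in the BEOL
    print(f"N={n_neurons:6d}: {periphery:6d} neuron circuits, {cells:9d} cells")
```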
In the original SpikingTM paper, only 20% of the potential synapses are realized; this connection sparsity strongly benefits the learning process. In the system presented here, however, 80% to 90% of the synapses are present. The reason lies in the smaller size of the network investigated here. A certain number of connections from one subpopulation to another must exist to cause a prediction in the second subpopulation. If the number of existing synapses is too low, only a small number of conductive pathways can be formed between these subpopulations and the capacity of the network remains low. But for learning high-order sequences, multiple parallel pathways are necessary to represent different contexts. Therefore, a higher percentage of existing synapses has been chosen in this work. Scaling up the network would relax this constraint and allow for higher degrees of connection sparsity.
Most neurons, except for those that are triggered at the beginning or end of sequences, represent only one single context. In this way, the inhibition threshold hinders the formation of an arbitrary number of mature synaptic connections from a neuron which is activated within a specific context. The number of neurons per subpopulation divided by the inhibition threshold therefore gives the maximum number of sequences with the respective character as a central element that can be learned. The size of a subpopulation and the inhibition threshold must thus be matched to the task. Furthermore, the maximum number of synaptic connections which can be formed from such a neuron to others equals the inhibition threshold. This means that in a large network with a simultaneously low inhibition threshold, many synaptic connections remain low-conductive and do not take part in any conductive path, so the utilization of the network decreases. This is acceptable, however, if the crossbar array is fabricated with nanometer-sized devices in an integrated circuit. The operation time scales beneficially: in the READ phase, all READ pulses are emitted by spiking neurons at the same time, so the computational complexity of this phase does not scale with the number of neurons. In the Update phase, all potentiation pulses could likewise be applied at the same time; however, to reduce the load current it is advisable to apply them serially. The time to apply potentiation pulses would then scale linearly with the number of neurons per subpopulation, since at most one subpopulation can spike per time step.
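As a concrete illustration of the capacity bound stated above (all numbers assumed):

```python
n_per_subpop = 12   # neurons representing one character (assumed)
theta_inh = 3       # inhibition threshold (assumed)

# maximum number of learnable sequences sharing this character as a
# central element, per the bound above
max_contexts = n_per_subpop // theta_inh
print(max_contexts)  # -> 4
```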
The number of neurons per subpopulation depends on the number of different sequence elements ('characters') that are to be represented in the system and on the total number of available neurons. Increasing the number of represented elements obviously increases the dynamic range of the network, while the number of neurons representing one element influences how many sequences containing the same character the network can learn. When scaling the number of neurons, it can therefore be decided, depending on the use case, which capacity of the network is to be increased.
These scaling properties allow the MemSpikingTM algorithm to also solve more complex sequence learning tasks. Through simulations of a simplified version of the algorithm in MATLAB, we could show that the counting and occluder benchmark tasks proposed by Lazar et al (2010) can be solved by the algorithm, requiring smaller networks and fewer time steps than comparable sequence learning approaches. Details on these benchmarks can be found in the Supplementary Material.

Conclusion
For the application of neuromorphic computing algorithms in remote environments and under energy constraints, it is necessary to develop dedicated hardware systems which comprise all memory and computational functionality necessary for their training and inference operation. This removes the need for an additional conventional co-processor and energy-costly external communication.
In this work we present an architecture and an algorithm for a dedicated mixed-signal neuromorphic hardware implementation of a simple yet effective sequence learning scheme, built on a memristive crossbar array and leveraging the non-volatile and energy-efficient character of these emerging electronic devices. We develop a learning rule inspired by the HTM and its biologically plausible form, SpikingTM, which takes advantage of the 1T1R crossbar architecture and the bipolar update dynamics of the memristive devices, going beyond their use as a mere dot product engine. The algorithm-hardware co-design is strictly oriented on the principles of compute-in-memory and realizes inference and plasticity update steps without a dedicated memory controller, using only the peripheral neuron circuitry. Recently, the functionality of the MemSpikingTM algorithm was demonstrated for a small task size on a real memristive crossbar array (Siegel et al 2023). With this learning rule we are able to train even complex high-order sequences into the network, and we show how connection sparsity evolves during training and benefits the energy efficiency.

Data availability statement
The data cannot be made publicly available upon publication because they are not available in a format that is sufficiently accessible or reusable by other researchers. The data that support the findings of this study are available upon reasonable request from the authors.