Ferroelectric-based synapses and neurons for neuromorphic computing

The shift towards a distributed computing paradigm, where multiple systems acquire and elaborate data in real-time, leads to challenges that must be met. In particular, it is becoming increasingly essential to compute on the edge of the network, close to the sensor collecting data. The requirements of a system operating on the edge are very tight: power efficiency, low area occupation, fast response times, and on-line learning. Brain-inspired architectures such as spiking neural networks (SNNs) use artificial neurons and synapses that simultaneously perform low-latency computation and internal-state storage with very low power consumption. Still, they mainly rely on standard complementary metal-oxide-semiconductor (CMOS) technologies, making SNNs unfit to meet the aforementioned constraints. Recently, emerging technologies such as memristive devices have been investigated to flank CMOS technology and overcome edge computing systems’ power and memory constraints. In this review, we will focus on ferroelectric technology. Thanks to their CMOS-compatible fabrication process and extreme energy efficiency, ferroelectric devices are rapidly establishing themselves as one of the most promising technologies for neuromorphic computing. Therefore, we will discuss their role in emulating neural and synaptic behaviors in an area- and power-efficient way.


Introduction
The role that artificial intelligence plays in our daily life is constantly increasing. Intelligent systems (e.g., smartphones, smart wristbands, medical devices) continuously collect data to monitor our activities. Data elaboration and interpretation are usually carried out using artificial neural networks (ANNs), which have demonstrated remarkable performance in inference and classification tasks [1,2]. However, state-of-the-art ANNs require large memory and power resources. Typical approaches rely on computation in the cloud: the devices at the edge send the data to remote servers, where an ANN-based simulation is run, and the result of the elaboration is then sent back to the edge. This method requires an almost always active data connection, raises security and privacy issues [3], and requires a massive amount of power to transfer and process the data.
A possible solution to optimize energy efficiency is to enable computing right next to where the data is collected, e.g., the sensor [4,5]. In this respect, one of the most promising approaches is the neuromorphic approach, in particular the brain-inspired spiking neural network (SNN) [6]. SNNs owe their power efficiency to two factors: their hardware architecture, which uses artificial neurons and synapses to overcome the physical separation between memory and central processing unit typical of standard von Neumann architectures, and the adoption of an asynchronous event-based approach that elaborates the information, in the form of spikes, only when available. Both industry and academia have already demonstrated interest in SNNs, defined as the third generation of neural networks, which resulted in remarkable neuromorphic processors such as IBM's TrueNorth [7], SpiNNaker [8], Intel's Loihi [9], DYNAP-SE [10], ODIN [11], and MorphIC [12]. However, in pure complementary metal-oxide-semiconductor (CMOS) solutions, the basic building elements of an SNN, i.e., neurons and synapses, present some non-negligible disadvantages. Indeed, neurons are rather complicated to realize. The simplest versions reproducing integrate-and-fire (I&F) behavior need at least a comparator and an integrator; the latter implies a capacitor, which can occupy a rather large area depending on the required time constants. More complicated versions may need current sensors, capacitors, analogue-to-digital converters (ADCs), and digital-to-analogue converters, with consequent area and power consumption requirements [13]. Synapses, instead, store the synaptic weights in volatile elements such as static random access memory (SRAM) cells. This implies that the power supply cannot be switched off during normal system operation unless the relevant information is stored somewhere else. Moreover, at every start-up of the system, the information on the SNN must be uploaded, which may take tens of minutes since the data stored in an external memory have to be uploaded in an architecture with distributed memory and unconventional addressing schemes [10].
In the last 10 years, new technologies have therefore been investigated to flank standard CMOS and realize a new generation of neuromorphic hardware. In this respect, a promising technology is one based on memristive devices. Memristive devices show a broad range of excellent properties, including non-volatile memory, analogue behaviour, high scalability, high read/program speed, high energy efficiency, and programming voltages comparable with the power supply of typical neuromorphic chips [14,15]. The most widely used memristive technologies for neuromorphic computing are phase-change [16] and ion-migration-based resistive switching devices [17]. However, thanks to their extreme energy efficiency and compatibility with the CMOS fabrication process [18], ferroelectric devices are rapidly attracting the interest of the neuromorphic community.
In this review, we will present the current state of two ferroelectric technologies, ferroelectric tunnel junctions (FTJs) and ferroelectric field-effect transistors (FeFETs), with a specific focus on the device-related aspects that enable the design of synaptic and neural fundamental blocks for neuromorphic computing. After illustrating their basic operating principle, we will discuss the role of FTJ and FeFET devices as synapses and neurons in neuromorphic systems.

Basic device operating principle
FTJs are two-terminal devices with a ferroelectric material sandwiched between two electrodes. The resistance of an FTJ can be reversibly changed by applying a voltage that generates an electric field exceeding the coercive field $E_c$ of the material, thus inducing polarization switching [19]. An example of the program-and-read scheme to characterize FTJs is represented in figure 1(a), whereas the resulting current state is depicted in figures 1(b) and (c). The ratio between the resistance of the device in the high-resistive state (HRS) and in the low-resistive state (LRS) is called tunneling electro-resistance ratio (TER), since the tunneling resistance is affected by the polarization. Two types of FTJs are commonly used. In the traditional approach, a thin ferroelectric in the range of 1-5 nm thickness is used with two electrodes with different screening lengths. As a result, the barrier on the electron injecting electrode will depend on the polarization direction [20]. In the double-layer FTJ [21,22], a thin tunneling layer is added to the ferroelectric and, depending on the polarization, the tunneling will be through the thin tunneling layer only or through the tunneling layer and a portion of the ferroelectric itself [22]. The latter approach gives more flexibility to the thickness optimization of the ferroelectric at the cost of even lower device current. Typical values for the TER are in a range up to 100 [23].
The change in the polarization state, i.e., the change in the device resistance, is used to store information. It is possible to obtain multilevel storage in reasonably large devices hosting several domains by properly engineering the applied programming voltage [24,25]. Of particular interest is that FTJs allow for a non-destructive reading operation by applying an electric field smaller than the $E_c$ of the ferroelectric layer.
The switching due to the field altering the polarization state of the ferroelectric material is highly energy-efficient, much more so than other resistive switching mechanisms like valence change, phase change, or spin-transfer torque. Indeed, switching currents in the sub-nA range have been reported for FTJ devices [26] together with write energy per bit projected down to 500 aJ in devices with a feature size of 50 nm [27]. When compared with the write energy required by other technologies, e.g., 20 fJ for resistive random access memory (ReRAM) [28], ∼100 fJ for phase change memory (PCM) [29], and 90 fJ for magnetic random access memory [30], the competitiveness of FTJs becomes evident. Together with the very low read currents down to the nA range, this feature makes the FTJ the intrinsically most power-efficient resistance-switching device [18]. However, the very low read current, with typical densities in the range of just 1-10 pA μm⁻² [31], may raise a concern when it is of the same order of magnitude as the junction leakage currents of the CMOS transistors connected to it. In this scenario, the state of the device could not be read reliably. The use of low-leakage technologies, such as fully depleted silicon on insulator, mitigates this issue, which nevertheless needs further investigation. The effects of low reading currents at system level, such as comparatively long reading times, as well as possible solutions that are technology independent and rely on either circuits or encoding schemes and algorithms, will be discussed later in section 4.
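To put the write-energy figures quoted above side by side, the following minimal sketch computes how many times larger each cited value is than the projected FTJ value. The dictionary keys and function name are illustrative only, and the PCM entry is taken as exactly 100 fJ although the text gives it as approximate.

```python
# Write energy per bit in joules, as quoted in the text.
# Assumption: the PCM value (~100 fJ in the text) is treated as exactly 100 fJ.
WRITE_ENERGY_J = {
    "FTJ": 500e-18,   # projected, 50 nm feature size
    "ReRAM": 20e-15,
    "PCM": 100e-15,
    "MRAM": 90e-15,
}


def ratio_to_ftj(energies):
    """Return each technology's write energy divided by the FTJ value."""
    ftj = energies["FTJ"]
    return {name: energy / ftj for name, energy in energies.items()}


ratios = ratio_to_ftj(WRITE_ENERGY_J)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f}x the projected FTJ write energy")
```

With these inputs the cited ReRAM, PCM, and MRAM write energies are roughly 40, 200, and 180 times the projected FTJ figure, respectively.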

FTJ devices for neuromorphic computing
The main characteristics of the FTJ devices, i.e., non-volatility, the possibility of analog switching [25,32], endurance >10⁵ cycles [33], energy efficiency down to 1 pJ per spike for a device of area 2.5 μm² [34], and scalability, are desirable in applications such as neuromorphic computing. In particular, FTJs have been recently considered to emulate synaptic properties and mechanisms underlying learning. From the material point of view, synaptic properties have been reported not only in classic devices, including 3D architectures [34] and multiferroic layers [35], but also in organic [36] and flexible [37] ferroelectric devices.
Analogue and accumulative switching behaviour allows for the storage of multiple weights in the same device, thus increasing the density of the network. It is obtained by controlling the polarisation of the domains in the FTJ with voltage pulses. A change in the voltage amplitude or in the time width of the pulses results in a different percentage of switched domains, hence a different resistive state [25,34], as shown in figure 1(d). However, the use of trains of identical pulses to gradually change the state of the device is preferable for ease of implementation in CMOS technology. Indeed, from a circuit design perspective, a pulse generator that provides trains of identical pulses is much simpler and more area-efficient than one designed to apply pulses with different voltage amplitudes or time widths. In addition, in order to apply the correct pulse in case of non-identical trains, the device state has to be evaluated before every programming operation. This has the twofold disadvantage of a more complex circuitry and of an increased latency before the device is ready to receive another programming operation. Therefore, a possible solution is the exploitation of the accumulative switching behaviour, which integrates the effect of multiple pulses whose amplitude generates a field lower than $E_c$ to reach a gradual change in the device state [38]. This method does not require any pre-assessment of the device state, resulting in a more area- and time-efficient approach.
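The gradual state change under a train of identical pulses can be illustrated with a deliberately simple toy model. It assumes that each pulse switches a fixed fraction of the still-unswitched domains and that the conductance interpolates linearly between the HRS and LRS values; neither assumption comes from the cited works, and `p_switch`, `g_hrs`, and `g_lrs` are hypothetical parameters (the 1:100 conductance ratio merely echoes the upper TER range mentioned earlier).

```python
def apply_identical_pulses(n_pulses, p_switch=0.1, switched=0.0):
    """Toy model: every pulse switches a fixed fraction of the unswitched domains.

    p_switch is a hypothetical per-pulse switching fraction, not a measured value.
    Returns the switched-domain fraction after each pulse.
    """
    history = []
    for _ in range(n_pulses):
        switched += p_switch * (1.0 - switched)
        history.append(switched)
    return history


def conductance(switched, g_hrs=1.0, g_lrs=100.0):
    """Assumed linear interpolation between HRS and LRS conductance (arbitrary units)."""
    return g_hrs + switched * (g_lrs - g_hrs)


states = apply_identical_pulses(20)
levels = [conductance(s) for s in states]
for pulse_index, level in enumerate(levels, start=1):
    print(f"pulse {pulse_index:2d}: conductance = {level:6.2f} a.u.")
```

No read operation is needed between pulses in this scheme, which is the property that makes identical-pulse programming attractive from the circuit point of view.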
Ultra-low power synaptic operation has been widely reported [34,36,39]. It is worth pointing out, however, an asymmetry of one order of magnitude in the energy efficiency between the transition from LRS to HRS (depression) and the opposite one (HRS to LRS, potentiation) [36,39], originating mainly from the asymmetric IV characteristic of these devices and the difference in current flow in LRS and HRS, respectively. Moreover, these estimations of energy consumption should be treated with care, since the impact of transient charging/discharging effects within real circuits and the self-capacitance of the FTJ itself are typically neglected. Recently it has been pointed out that the current flow due to polarization reversal in FTJ devices during the write operation can be larger by several orders of magnitude than the tunneling currents measured during the read operation [31]. Hence, a thorough assessment of FTJ energy consumption has to consider the whole system architecture rather than single devices, which is still a topic of further research.
The learning algorithms so far demonstrated in FTJ-based synapses are mainly Hebbian, with spike timing dependent plasticity (STDP) dominating the landscape [24,25,34-38]. STDP is a well-known biological learning rule which determines the magnitude and direction of the weight update depending on the timing difference between two spikes fired by two neurons, namely the pre-neuron and the post-neuron, connected by a synapse [40]. If the pre-neuron fires before the post-neuron, implying a correlation between the two neurons' firing activity, their communication is facilitated, and the synapse is potentiated. In the opposite case, the communication between the two neurons is inhibited, and the synapse is depressed. The closer in time the two spikes are, the higher the weight change. Simulations using FTJ-based synapses and STDP-based learning showed network accuracy for pattern classification >96% for a custom set of letters [34], whereas when the standard Modified National Institute of Standards and Technology (MNIST) database is used, accuracies between 92.8% [37] and 97.3% [35] are reported. A hardware demonstration of a 3 × 3 pattern on a crossbar was shown in [38].
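The STDP rule described above is often written with an exponential timing window. The sketch below uses that common textbook form as an assumption; the amplitudes `a_plus`, `a_minus` and the time constants `tau_plus`, `tau_minus` are illustrative values, not parameters of any of the cited FTJ devices.

```python
import math


def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012,
                 tau_plus=20e-3, tau_minus=20e-3):
    """Weight change for one pre/post spike pair (times in seconds).

    Assumed exponential window: potentiation when the pre-neuron fires first,
    depression otherwise, with a magnitude that decays with the time difference.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


for dt_ms in (-40, -10, -1, 1, 10, 40):
    dw = stdp_delta_w(0.0, dt_ms * 1e-3)
    print(f"t_post - t_pre = {dt_ms:+4d} ms -> delta_w = {dw:+.5f}")
```

In a device implementation the sign and size of `delta_w` would be set by the overlap of the pre- and post-synaptic pulse shapes across the FTJ rather than computed explicitly.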
It is worth noticing that the switching time of the FTJ is in the nanosecond range [34,36]. In a massively parallel event-driven system, this property allows the processing of events that occur almost simultaneously in time, a situation that can occur even when using biological time windows, i.e., in the millisecond range, for the STDP.
By spike-shape engineering, the plastic behavior of the synapse can be tuned to modulate the shape of the STDP curve, as shown in figure 1(e), or even to reproduce a broader variety of learning algorithms [41]. In addition to this solution, more complex synaptic architectures enable radically different learning rules. An example is the differential synapse [42], which uses excitatory and inhibitory synapses for learning. An FTJ-based version has been proposed [31], whose basic operating principle relies on the 2T-1C cell [23,43]. This approach is used to amplify the current of the FTJ to values detectable by CMOS sense amplifiers and consequently to decrease the device's read time.
The presence of volatile behavior has also been reported [21,22] and investigated for different thicknesses and bias conditions of HZO/Al₂O₃ bi-layer stacks [44], as shown in figure 1(f). In these devices, a depolarization field was deliberately introduced. At first order, the depolarization field can be approximated as:

$$E_{dep} = -\frac{P}{\varepsilon_0 \varepsilon_{FE}} \left(1 + \frac{\varepsilon_{int}\, d_{FE}}{\varepsilon_{FE}\, d_{int}}\right)^{-1}$$

where $E_{dep}$ is the depolarization field, $P$ is the polarisation, $d_{int}$ and $d_{FE}$ are the thicknesses of the interface (Al₂O₃) and ferroelectric (HZO) layers, respectively, and $\varepsilon_0$, $\varepsilon_{FE}$, and $\varepsilon_{int}$ are the vacuum permittivity and the relative permittivities of the ferroelectric and interface layers, respectively. When the thickness of the interface layer increases, the depolarisation field also increases, thus leading to a volatile behavior [19].
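The dependence of the depolarization field on the interface thickness can be checked numerically. The sketch below assumes the standard first-order expression for a short-circuited ferroelectric/dielectric bilayer, E_dep = -P/(ε_0 ε_FE) · (1 + ε_int d_FE/(ε_FE d_int))^-1, and uses illustrative material values (polarisation, permittivities, and thicknesses) that are assumptions rather than data from the cited works.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def depolarization_field(p, d_fe, d_int, eps_fe, eps_int):
    """First-order depolarization field (V/m) of a short-circuited FE/dielectric bilayer.

    p: polarisation in C/m^2; d_fe, d_int: layer thicknesses in m;
    eps_fe, eps_int: relative permittivities of the two layers.
    """
    return -p / (EPS0 * eps_fe) / (1.0 + (eps_int * d_fe) / (eps_fe * d_int))


# Assumed illustrative values, not taken from the cited references.
P = 0.10        # 10 uC/cm^2 expressed in C/m^2
D_FE = 10e-9    # ferroelectric thickness in m
EPS_FE = 30.0   # relative permittivity of the ferroelectric layer
EPS_INT = 9.0   # relative permittivity of the interface layer

fields = {
    d_nm: depolarization_field(P, D_FE, d_nm * 1e-9, EPS_FE, EPS_INT)
    for d_nm in (1, 2, 3)
}
for d_nm, e_dep in fields.items():
    # 1 MV/cm corresponds to 1e8 V/m.
    print(f"d_int = {d_nm} nm -> |E_dep| = {abs(e_dep) / 1e8:.2f} MV/cm")
```

The magnitude of the field grows monotonically with the interface thickness, which is the trend invoked in the text to explain the volatile behavior of thicker bi-layer stacks.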
This feature offers the opportunity to emulate both short term plasticity and long term plasticity using similar devices, as demonstrated also using organic FTJs [36] or exploiting the ion migration of a silver metal electrode into the ferroelectric layer [39]. The latter approach also results in a very high on/off ratio of 10⁷ [39], which is uncommon for ferroelectric devices. The exploitation of the migration of silver ions is also an element of novelty for ferroelectric devices, whereas it was already reported in some filamentary oxide-based resistive switching devices [45,46] used, as an example, for a retina-inspired artificial vision system [47].
The possibility of developing a 3D technology is another non-negligible aspect in view of a massively parallel and dense neuromorphic system. Similar to PCM [48] and ReRAM [49,50] technologies, FTJs have also shown the potential of 3D fabrication (figures 2(a) and (b)) [34]. Moreover, the possibility of gradual, repeatable, and sufficiently linear weight update, shown in figure 2(c), also allows for the reproduction of the STDP shown in figure 2(d). The asymmetric FTJ structures with an inserted dielectric layer, such as TiN/HZO/Al₂O₃/TiN, are also beneficial to crossbar array implementations, as they intrinsically behave in a diode-like (i.e., unipolar) manner, thus avoiding the need for a selector device.

Basic device operating principle
FeFETs based on ferroelectric HfO₂ are currently receiving increased attention for non-volatile memory applications because of the low-voltage and fast switching, good data retention and compatibility with CMOS fabrication [51]. The non-volatile memory operation relies on the presence of the two stable polarization $P$ configurations in HfO₂ ($P$ 'up' and 'down'), which can be reversibly switched by applying an external electric field [52]. The two configurations correspond to two distinct conduction modes of the FeFET, namely high and low conduction states [53]. In other words, the transistor displays two different threshold voltage ($V_T$) values, which are employed to store binary information [54]. The two states are usually called low-$V_T$ (LVT) and high-$V_T$ (HVT) and are set by applying sufficiently large positive and negative voltages at the gate, respectively. The respective write operations are called program (PRG) and erase (ERS). Figures 3(a) and (b) show the typical FeFET structure and the drain current-gate voltage ($I_D$-$V_G$) curves for the two logical states, respectively. The separation between the HVT and LVT is called memory window (MW).

FeFET as a synapse
In analogy to other non-volatile memory devices [55], FeFETs can also be employed for neuromorphic hardware. In particular, the gradual switching from one saturated state to the other, which is achieved by partially switching the polarization in the gate stack, enables a large number of intermediate conduction levels. Figures 3(c) and (d) show the PRG and ERS switching transitions as a function of the applied gate voltage amplitude and reveal the presence of more than 64 intermediate $V_T$ states (>6 bits) between LVT and HVT. Moreover, instead of varying the pulse amplitude, it is also possible to achieve these states with repeated pulses of the same amplitude, or with time modulated pulses [51]. It should be noted that the 6 bits are not intended in the classical memory sense of multi-level storage, which requires sufficient separation between the $V_T$ states. Instead, the figure merely denotes the large number of states available for analog channel conductance tuning. This was recognized as an attractive feature to emulate analog synaptic weights [56].
In fact, the first demonstration of an artificial synapse based on a FeFET appeared in 2017, using a device with a 10 nm-thick Si-doped HfO₂ FE layer fabricated in the 28 nm bulk high-k metal gate (HKMG) technology [56]. The device displayed gradual switching and reproducible addressing of intermediate $V_T$ states, which also allowed the implementation of STDP as well as the weighted spike transmission. Following this demonstration, several other proposals have been made for FeFETs as analog weights in deep neural networks, including devices realized in the planar gate-last [57], junctionless [58], nanowire [59] and back-end-of-line (BEOL) [60] technologies. All of them exploited the gradual switching shown in figures 3(c) and (d), which indeed is a great advantage of the FeFET compared to other non-volatile memory devices like resistive and phase change RAM. While the latter usually show one gradual transition (the other is abrupt without intermediate conduction levels) [55], the FeFET displays continuous switching during both ERS and PRG (figures 3(c) and (d)). However, this is true only when the device is in a multi-domain configuration, i.e., the FE layer contains many switchable domains [61]. This is usually the case in large-area devices with the transistor channel length exceeding 100 nm. As the channel length (and in general the lateral size) scales down, FeFETs tend to exhibit abrupt switching transitions [61], which results in a rapid loss of the capability to store intermediate $V_T$ states. This situation is shown in figures 3(e) and (f) for a device having a channel width of W = 80 nm and length L = 30 nm. The behavior is explained by the presence of only a few FE domains in the gate stack, so that even the switching of single domains induces a significant $V_T$ shift of the transistor. While this represents a scaling limitation for implementing artificial synapses with FeFETs, it can be overcome by increasing the number of FE domains in the gate stack that control the channel, i.e., increasing the gate size/domain size ratio. This may require adopting other device geometries/modules, such as BEOL FeFETs and 3D FeFETs, where the device dimension requirements can be relaxed, providing a sufficient number of switchable domains in the gate stack. Alternatively, multiple synaptic weights could be represented by a number of binary switching FeFETs connected in parallel and programmed in a stochastic way [62].
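The last option, a weight built from several binary FeFETs in parallel, can be sketched as follows. This is a toy illustration and not the scheme of [62]: it assumes that every device switches independently to the LVT state with the same probability `p_switch` per programming attempt and that the synaptic weight is the fraction of devices in LVT. The function names and the software random generator are hypothetical stand-ins for the physical switching stochasticity.

```python
import random


def program_parallel_fefets(n_devices, p_switch, rng):
    """Return one boolean per device: True if it ended up in the LVT state.

    Assumption: each binary FeFET switches independently with probability p_switch.
    """
    return [rng.random() < p_switch for _ in range(n_devices)]


def effective_weight(device_states):
    """Fraction of devices in LVT, used here as the normalised synaptic weight."""
    return sum(device_states) / len(device_states)


rng = random.Random(0)
device_states = program_parallel_fefets(16, 0.5, rng)
weight = effective_weight(device_states)
print(f"{sum(device_states)} of {len(device_states)} devices in LVT, weight = {weight:.3f}")
```

With N binary devices the composite synapse has N + 1 distinct levels, so the weight resolution is traded directly against area.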
Recently, another way to influence the number of intermediate levels has been experimentally verified, which builds upon changing the doping content in the FE layer [63]. Figures 4(a)-(c) show that by increasing the Si doping in HfO₂, the switching transition becomes less steep and the number of intermediate $V_T$ states increases, although the same write pulsing scheme is used. This has been explained by an enhanced switching dispersion in the FE layer that originates from the increased amount of non-FE phase in the HfO₂ film [64] as the dopant content is increased.
A further strategy to accommodate more intermediate states between HVT and LVT is to enlarge the MW. The typical approach is to increase the thickness $t_F$ of the FE material [51], as the MW is linearly proportional to $t_F$ [65]. Nevertheless, this approach has its limits, because the FE film fabricated with the most common processes rapidly loses its ferroelectric properties for $t_F$ > 20 nm. Moreover, it also leads to higher write voltages as an undesired side effect. Recently, these limitations have been circumvented by adopting a FeFET with an asymmetric double gate [66] as shown in figure 4(d), which displayed an MW exceeding 12 V (figure 4(e)) and a multilevel capability of 4 bit/cell with very stable retention (figure 4(f)). In this approach, the gates for the write and the read paths are decoupled. Also, the possibility of achieving more than 5 bit/cell for neuromorphic applications has been shown. The decoupling of write and read paths further results in a drastic reduction of read disturbs, which is particularly beneficial in FeFET array environments, as required for large-scale neural networks.
The stability of the stored states over time, i.e., data retention, is critical to synapse applications. Nevertheless, the most common FeFET structure of metal-ferroelectric-insulator-semiconductor (MFIS), as depicted in figure 3(a), suffers from depolarization fields. These arise from an incomplete compensation of the polarization charge in the FE layer by the confining non-FE layers, which can lead to a severe retention loss. However, the relatively high coercive field of HfO₂ films and the proper gate-stack design have helped to significantly mitigate this effect, yielding a very stable data retention in MFIS FeFETs as reported by different research groups [51]. Another approach to increase the retention stability is the adoption of metal-ferroelectric-metal-insulator-semiconductor (MFMIS) structures for the gate-stack. In fact, Yoon et al. [67] reported an MFMIS FeFET having the Pt/Al:HfO₂/TiN/SiO₂/Si stack (figures 5(a) and (b)) and were able not only to demonstrate very robust retention, but also to reproduce some of the important synaptic functionalities by partially switching the polarization [68], such as excitatory post-synaptic current (EPSC), paired-pulse facilitation, and STDP (figures 5(c)-(g)).
Finally, it should be mentioned that FeFETs are intrinsically three-terminal devices, when considering the gate, source, and drain terminals. At first sight, this might be a drawback in terms of the integration density with respect to some two-terminal synaptic devices, such as resistive RAM or PCM. Nevertheless, the latter usually avoid adopting the ideal crossbar array structures (which promise the highest integration density) due to the severe cross-talk, undesired voltage drops and power consumption owing to half-selected cells. Instead, pseudo-crossbar arrays with a 1-transistor-1-resistor (1T-1R) unit element are preferred to minimize such effects [69]. Thus, the three-terminal FeFET structure in common memory array configurations, such as NOR or NAND, or pseudo-crossbar configurations might not be less advantageous with respect to two-terminal devices. Moreover, the third terminal brings an additional advantage in terms of decoupling of the write and read paths. The write is performed at the gate, whereas the read consists of sensing the drain-source current, which might be beneficial in reducing the write/read disturbs.

FeFET as a neuron
Apart from synapses, artificial neurons are another fundamental building block for implementing neuromorphic systems. Different types of artificial neurons have been proposed, which emulate real neurons at many different abstraction levels: from complex biophysical models that emulate ion channel dynamics and detailed dendritic or axonal morphologies to basic I&F circuits [13]. Generally, I&F neuron models are widely adopted due to their relatively simple mathematical description, yet sufficient accuracy in capturing essential biological characteristics. They are characterized by two prominent dynamics: (1) the integration of inputs arriving from other neurons, which get weighted through the respective synapses. These weighted excitations increase the potential of the neuron membrane and lead to (2) the generation of an action potential (the neuron 'fires') after a certain threshold of the membrane potential is crossed. After firing, the membrane potential is reset and the neuron remains apparently inactive for a certain period of time. This interval is called the refractory period and is generally in the millisecond range, after which the neuron is ready to integrate new incoming spikes and, thus, to start a new I&F cycle. In the past, this functionality has been emulated with large circuits having an input capacitor, called membrane capacitor, and at least two inverter stages. The most area-consuming element was the membrane capacitor, which is used to integrate the input current spikes, i.e., the neuronal integration stage. It is therefore desirable to seek more compact solutions for artificial neurons, possibly without employing such a bulky capacitor.
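The two dynamics described above, integration followed by threshold firing, reset, and a refractory period, can be captured in a few lines. The sketch below is a generic discrete-time leaky I&F model in normalised units; all parameter values (`tau`, `v_th`, `t_ref`, the input levels) are illustrative assumptions and do not describe any specific circuit from the cited literature.

```python
def simulate_lif(input_current, dt=1e-4, tau=20e-3, v_th=1.0,
                 v_reset=0.0, t_ref=2e-3, r=1.0):
    """Leaky integrate-and-fire neuron; returns the spike times in seconds.

    input_current: sequence of input values, one per time step of length dt.
    """
    v = v_reset
    refractory_left = 0.0
    spike_times = []
    for step, i_in in enumerate(input_current):
        if refractory_left > 0.0:
            # Inputs are ignored while the neuron is refractory.
            refractory_left -= dt
            continue
        # Integration stage: leaky accumulation of the weighted input.
        v += dt / tau * (-(v - v_reset) + r * i_in)
        if v >= v_th:
            # Firing stage: emit a spike, reset, and start the refractory period.
            spike_times.append(step * dt)
            v = v_reset
            refractory_left = t_ref
    return spike_times


n_steps = 1000  # 100 ms of simulated time with dt = 0.1 ms
weak = simulate_lif([0.5] * n_steps)
medium = simulate_lif([2.0] * n_steps)
strong = simulate_lif([4.0] * n_steps)
print(len(weak), len(medium), len(strong))
```

A sub-threshold input never produces a spike, and a stronger input yields a higher firing rate. In the CMOS realisations mentioned in the text, the state variable `v` is the voltage on the membrane capacitor.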
FeFETs can undergo the full switching transition under a train of sub-critical voltage pulses, even if each of these pulses is insufficient to individually induce any appreciable switching effect [70,71]. This phenomenon has been termed accumulative switching and was found to obey a switching voltage-time trade-off originating from the ferroelectric nucleation process [72]. Based on accumulative switching, it has been proposed to use a single ultra-scaled FeFET to emulate the key neuronal behaviors [73]. In fact, the 'accumulation' of voltage pulses and the subsequent abrupt switching transition, typical for scaled devices, was exploited to implement the I&F functionality (figures 6(a) and (b)), which resulted in a biologically plausible all-or-nothing spiking. Thanks to the time-voltage dependence of accumulative switching, the spiking frequency tuning (the stronger the neuronal excitation, the higher the spiking frequency) was also realized (figure 6(c)), as well as the possibility of achieving 'leaky' I&F behavior and an arbitrary refractory period. The advantage of the FeFET-based neuron is that the incoming action potentials are integrated directly within the FE layer in the gate stack. As a result, the use of an external integration capacitor is avoided, which is usually area-consuming and necessary in classical CMOS neuron realizations [13], thus making the FeFET-neuron a compact, capacitor-less solution. Moreover, low static power consumption is expected during the spike integration stage because of the all-or-nothing spiking. In addition, the intrinsic switching stochasticity in FeFETs [61] can be readily exploited to emulate the firing randomness of the biological neurons [74]. A more refined implementation of neuronal behaviors can be further sought by embedding the FeFET in CMOS circuits, as recently demonstrated in simulated two-transistor and seven-transistor I&F neuron circuits [75] or in other circuit implementations [76,77].

Discussion
In this manuscript, we reviewed the current state of ferroelectric devices and their role as synaptic or neural elements for neuromorphic computing. While PCM and ReRAM technologies have been widely used in neuromorphic systems in the past years [78], ferroelectric technology has only recently been investigated for these applications [15]. The systems reported are mainly at simulation level, for both FTJs [34,35,37] and FeFETs [57,79-82].
So far, no ideal device has been found, and every technology has its advantages and drawbacks. As an example, ferroelectric devices show a better energy efficiency than ReRAMs (read currents in nA range [22] versus μA range [83]), but at the cost of a lower number of available states (5 bits [27] versus 7 bits [84]). PCMs demonstrate an excellent endurance compared to FTJs (10¹¹ cycles [85] versus 10⁵ cycles [33]), but the SET time is necessarily longer to allow for a recrystallisation of the chalcogenide material (>100 ns [86] versus 600 ps [27]).
The same feature that makes FTJs so appealing for massively parallel neuromorphic applications, i.e., energy efficiency, has a strong impact on the CMOS circuits. Indeed, when operated in an analogue way, ferroelectric devices need an ADC to read their state and convert it to a digital value. However, currents in the nA range require very high resolution ADCs, which might become area-expensive. To relax the specifications on the ADCs, three strategies can be considered. At circuit level, a 2T-1C structure as described in [23,43] can be adopted. In the 2T-1C cell, one of the two transistors acts as a select transistor to carry out programming operations, whereas the other one is used as a read transistor to amplify the ferroelectric device's current. This solution allows for flexibility in the selection of a suitable current range without affecting the normal operation of the FTJ. At array level, multiple FTJs could be read in parallel, thus resulting in an overall larger current, and the accumulated current could then be read directly with a sense amplifier that might, for example, adopt proper thresholding functions that make a high precision ADC not mandatory. Another possibility is to work at system level and conceive encoding schemes that allow for a reduced resolution of the ADCs, as already proposed in the ReRAM-based ISAAC processor [87].
Memristive ferroelectric devices have a great potential to be used for weight storage thanks to their non-volatile properties, but they suffer from limited bit precision and non-linear state change. Nevertheless, some applications, for example edge computing, already need to cope with tight memory and power constraints, and therefore use lower bit precision [5]. In these cases, adequately high accuracy is ensured using techniques such as synaptic pruning [88], sparse coding [89,90], and stochastic rounding [91]. In particular, stochastic rounding benefits from the intrinsic stochasticity of the ferroelectric devices, which indeed offers a compact alternative to stochastic CMOS circuits. Stochasticity is also successfully exploited to implement low-bit precision architectures such as binary SNNs [92]. Moreover, it has already been shown that our brain exploits noise to increase the probability of detecting subthreshold events (i.e., with a weak synaptic drive) [93]. The same principle can also be exploited in memristive neuromorphic networks [94,95]. In this case, however, we need to specify that cycle-to-cycle variability is the one that is beneficial for a neuromorphic system, to a certain extent, while device-to-device variability is detrimental for the correct operation of the system and must be minimized.
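Stochastic rounding, mentioned above as a way to preserve accuracy at low bit precision, rounds a value up or down at random so that the result is unbiased on average. The following minimal sketch uses Python's software random generator as a stand-in for the device's cycle-to-cycle stochasticity, which is an assumption for illustration; the step size of 0.25 is likewise arbitrary.

```python
import math
import random


def stochastic_round(x, step, rng):
    """Round x to a multiple of step.

    The value is rounded up with a probability equal to its fractional
    remainder, so the expected value of the result equals x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    fraction = scaled - lower
    round_up = rng.random() < fraction
    return (lower + (1 if round_up else 0)) * step


rng = random.Random(0)
samples = [stochastic_round(0.30, 0.25, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
print(f"levels used: {sorted(set(samples))}, mean = {mean:.3f}")
```

Each individual result is one of the two neighbouring levels (0.25 or 0.5 here), yet the average over many updates stays close to the original 0.30. This is why small weight updates are not systematically lost, as they would be with round-to-nearest.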
The process that leads to a memristive-based neuromorphic system undoubtedly includes many challenges at device level, e.g., optimisation of the materials, scaling of the device, reduction of the device variability, read and write speed, and energy consumption, but also at circuit and system level, e.g., the circuits to address, program, and read the devices and to interface them with the other parts of the chip and with the external world, and the routing of the events [96]. However, to design hybrid CMOS-memristive hardware, these challenges cannot be tackled separately: they need to be framed in a holistic approach where everything is developed together [5,97]. Moreover, the learning algorithms should not only be selected according to the intended application, but also be adapted to best exploit the features of the memristive devices [5,98].
The first choice is between offline and online learning. Offline learning can be carried out and the weights can be uploaded using a program-and-verify algorithm to cope with device variability, provided that the learning algorithm already accounts for a limited weight precision. Online learning would require low-power operations, especially in edge scenarios. FeFET-based neurons and synapses already show a consumption in the pJ range [82] and, at system level, estimated energy requirements for online learning of 98.01 mJ, two orders of magnitude larger than the corresponding 6-bit SRAM implementation, as shown in [57]. Online learning also calls for high device endurance, ideally higher than the one demonstrated for ferroelectric devices. However, the analog operation of the ferroelectric devices used in neuromorphic applications requires a lower voltage than in memory-like applications, which helps to improve endurance. Moreover, the co-development of learning algorithm and system can further improve the endurance. Indeed, training algorithms such as backpropagation through time and e-prop [99] or, in general, the class of three-factor learning algorithms [100,101], which combine the strengths of backpropagation and biologically plausible learning, minimise the number of switching events and, as a consequence, increase the device endurance. Moreover, these algorithms have the potential to enhance biologically plausible STDP. Indeed, when dealing with deep networks with one or more hidden layers, STDP is outperformed by backpropagation [102,103]. On the other hand, when online learning and real-time interaction with the external environment are required, Hebbian learning allows for superior flexibility. The use of three-factor learning algorithms is therefore a promising choice for memristive-based systems [104].
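The program-and-verify upload mentioned above for offline learning can be sketched as a simple closed loop: read the device, compare with the target, apply one pulse in the appropriate direction, and repeat until the state falls within a tolerance. The sketch below is an assumption-laden illustration, not the procedure of any cited work: `ToyDevice` is a hypothetical model with a normalized state and a noisy step per pulse, and the step size, noise level, and tolerance are arbitrary.

```python
import random


def program_and_verify(target, read, pulse, tol=0.02, max_pulses=50):
    """Pulse a device until its read-back state is within tol of target.

    read():   returns the current (normalized) device state
    pulse(d): applies one potentiating (d=+1) or depressing (d=-1) pulse
    Returns the number of pulses used, or None if max_pulses is reached
    before the state is within tolerance.
    """
    n = 0
    while abs(target - read()) > tol:
        if n >= max_pulses:
            return None
        pulse(+1 if target > read() else -1)
        n += 1
    return n


class ToyDevice:
    """Hypothetical analog device: each pulse moves the state by a noisy step."""

    def __init__(self, step=0.03, sigma=0.01, seed=0):
        self.state = 0.0
        self.step = step
        self.sigma = sigma
        self.rng = random.Random(seed)

    def read(self):
        return self.state

    def pulse(self, direction):
        # Cycle-to-cycle variability is modeled as Gaussian noise on the step.
        delta = self.step + self.rng.gauss(0.0, self.sigma)
        self.state = min(1.0, max(0.0, self.state + direction * delta))


device = ToyDevice(seed=1)
pulses_used = program_and_verify(0.6, device.read, device.pulse, max_pulses=200)
```

Because the loop only stops once the read-back value is inside the tolerance window, the final state does not depend on the exact step produced by each individual pulse. This is how such a scheme copes with variability, at the cost of extra read and write operations.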

Conclusion
In this work, we provided an overview of novel ferroelectric devices, specifically FTJs and FeFETs. At first, we clarified their basic operating principles; then, we discussed how these devices can be employed as artificial neurons and synapses in neuromorphic hardware. Thanks to their non-volatility, CMOS compatibility, and extreme energy efficiency in a small footprint, ferroelectric devices are a promising emerging technology for edge computing neuromorphic systems.

Figure 1. (a) Waveforms for the electrical characterization of an HZO-Al2O3 FTJ. (b) and (c) Color-coded read current depending on reset (b) and set (c) pulse amplitude and pulse width. (a)-(c) Adapted from [19] and reproduced under CC BY license. (d) Normalized fraction of switched area as a function of the pulse time for different pulse amplitudes in an HZO-Al2O3 FTJ. (e) Measured spike timing dependent plasticity (STDP) curves. The resulting resistance of the FTJ is a function of the time delay Δt between pre- and post-neuron action potentials, leading to an increased or decreased resistance of the FTJ that represents a weakening or strengthening of the synapse. (d) and (e) Reproduced from [25] with permission from ACS Appl. Electron. Mater. (f) Retention characteristics at room temperature of a TiN/Hf0.5Zr0.5O2/Al2O3/TiN ferroelectric tunnel junction (FTJ) under different bias conditions. Reproduced from [44]. Copyright IEEE.

Figure 2. (a) Schematic diagram of a high-density 3D vertical HZO-based FTJ array. (b) Zoom of the schematic of the 3D HZO-based FTJ device. (c) Long-term potentiation and depression characteristics. (d) Pre- and post-spikes (top) and Hebbian STDP (bottom). Blue squares: experimental data; red lines: exponential fits. Reproduced from [34] with permission from Royal Society of Chemistry.

Figure 3. (a) Schematic structure of the FeFET, indicating the composition of the gate stack and the adopted gate (V_G) and drain (V_D) voltages. M, FE, IL and S stand for metal, ferroelectric, interfacial layer and semiconductor, respectively; (b) the experimental I_D-V_G curves for the LVT and HVT states; gradual PRG (c) and ERS (d) transitions as a function of pulse amplitude for a large-area FeFET (W = L = 1 μm); abrupt PRG (e) and ERS (f) transitions as a function of pulse amplitude for a small-area FeFET (W = 80 nm, L = 30 nm). (b) Reproduced from reference [73] with permission from the Royal Society of Chemistry. (e) and (f) Reproduced from reference [61] with permission from ACS Appl. Mater. Interfaces.

Figure 4. Intermediate states of a FeFET: (a)-(c) switching steepness decreases as the Si doping content in the HfO2 increases, giving rise to more intermediate V_T levels. Copyright IEEE. Reproduced with permission from reference [63]; (d) FeFET with an asymmetric double gate and (e) the resulting memory window (MW) around 12 V; (f) stable retention of 16 intermediate V_T states (4 bit/cell). The separation between adjacent V_T levels is targeted to 600 mV, allowing for enough sensing margin. (d)-(f) Reproduced from reference [66] with permission from the Royal Society of Chemistry.

Figure 5. (a) Schematic cross-sectional view of the fabricated MFMIS-FETs; (b) the TEM image of the Pt/Al:HfO2/TiN/SiO2/Si gate stack structure; (c) schematic diagram of the artificial synaptic device with three terminals, demonstrating the presynaptic spikes and induced postsynaptic current responses. (d) Collection of excitatory post-synaptic current (EPSC) peak values as a function of the amplitude of each pulse composing the presynaptic pulse trains; (e) schematic illustration for evaluating the STDP behaviors of the three-terminal MFMIS-FETs. (f) Pulse schemes of the presynaptic and postsynaptic spikes applied to the gate and drain terminals. (g) Implementation of the symmetric STDP operations of the ferroelectric synapse FET. Changes in the synaptic current are plotted as a function of the relative timing between the presynaptic and postsynaptic spikes. Reproduced from reference [68] with permission from the Royal Society of Chemistry.

Figure 6. Integrate-and-fire (I&F) FeFET neuron: (a) gate voltage waveform with identical incoming spikes; (b) I&F behavior of a scaled FeFET with W/L = 80 nm/30 nm accomplished with accumulative switching; (c) spiking frequency tuning achieved with the time-voltage trade-off for accumulative switching. Reproduced from reference [73] with permission from the Royal Society of Chemistry.