Magnetic skyrmions and domain walls for logical and neuromorphic computing

Topological solitons are exciting candidates for the physical implementation of next-generation computing systems. As these solitons are nanoscale and can be controlled with minimal energy consumption, they are ideal to fulfill emerging needs for computing in the era of big data processing and storage. Magnetic domain walls (DWs) and magnetic skyrmions are two types of topological solitons that are particularly exciting for next-generation computing systems in light of their non-volatility, scalability, rich physical interactions, and ability to exhibit non-linear behaviors. Here we summarize the development of computing systems based on magnetic topological solitons, highlighting logical and neuromorphic computing with magnetic DWs and skyrmions.


Introduction
Topological solitons are particle-like solutions in which a mathematical or physical field reaches a stable minimal energy state [1]. Depending on the dimension of the solution, there are several types of topological solitons, such as kinks, lumps, vortices, monopoles, and skyrmions. The skyrmion is one example of a topological soliton model [2], and magnetic skyrmions have been experimentally stabilized and moved at room temperature [3]. A magnetic domain wall (DW), on the other hand, is a kink-like soliton [1,4] that has also been experimentally nucleated and moved in numerous hardware configurations [5,6]. Moreover, both skyrmions and DWs in ferromagnetic films can be electrically detected through magnetic tunnel junctions (MTJs) [7]. Taken together, skyrmions and DWs satisfy the requirements for a computing system and can thus be considered a novel approach to implementing next-generation spintronic computing systems [8].
This survey is structured as follows: section 2 presents an overview of the fundamental spintronic physics related to magnetic skyrmion and DW devices; section 3 covers skyrmion logic, including skyrmion reversible computing and skyrmion clocking schemes; section 4 covers unconventional skyrmion computing systems such as skyrmion neuromorphic and probabilistic computing systems; sections 5 and 6 cover DW-MTJ logical and neuromorphic computing systems, respectively; and section 7 concludes this survey.

Background
Topological solitons, also known as topological defects, have attracted much attention from the scientific community. Specifically, the study of topological defects and phase transitions was recognized with the Nobel Prize in Physics in 2016 [9,10]. Among the variety of topological solitons, magnetic skyrmions and magnetic DWs are particularly intriguing as information carriers for next-generation computing systems. As background to this review, this section covers the physics of magnetic skyrmions and DWs that underlies the topological soliton computing in later sections.

Magnetic skyrmions
Magnetic skyrmions are magnetic textures that have topological characteristics that enable them to be identified as quasiparticles. The magnetization of a skyrmion contains all possible orientations within a sphere. At the same time, they are chiral, which means that rotation of the magnetization can happen only in one direction.
Skyrmions were first proposed in particle physics as a model in field theory for hadrons by Skyrme [11]. In magnetic systems, they were predicted by Bogdanov and Rößler [12], and they were finally observed in 2009 in MnSi [13]. Later, they were also observed in thin magnetic metallic film layers with perpendicular magnetocrystalline anisotropy (PMA), which are more suitable for applications because the skyrmions also exist at room temperature [14][15][16].
As topological structures, magnetic skyrmions possess a topological charge, known as the skyrmion number:
$$N = \frac{1}{4\pi} \int \vec{m} \cdot \left( \frac{\partial \vec{m}}{\partial x} \times \frac{\partial \vec{m}}{\partial y} \right) \mathrm{d}x \, \mathrm{d}y, \tag{1}$$
where $\vec{m}$ is the normalized magnetization. The skyrmion number is an integer that is conserved under continuous transformations, which means that the skyrmion cannot be destroyed without an energy discontinuity or without leaving the system through a boundary. This conservation is commonly known as topological protection. However, the continuous model does not apply to discrete models such as those based on lattices. In addition, the size of a skyrmion may be finite, and the presence of defects can result in a finite energy barrier for their creation or destruction, which affects their stability [17].
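The skyrmion number $N = \frac{1}{4\pi}\int \vec{m} \cdot (\partial_x \vec{m} \times \partial_y \vec{m})\,\mathrm{d}x\,\mathrm{d}y$ can be evaluated numerically on a discretized magnetization texture. The following is a minimal sketch, not taken from the cited works: the grid size, the Néel-type test profile, and the function names `skyrmion_number` and `neel_skyrmion` are illustrative assumptions. With the core pointing along −z and the background along +z, the integral should come out close to −1, whereas a uniform state gives 0.

```python
import numpy as np

def skyrmion_number(m, dx, dy):
    """Discretized N = (1/4pi) * integral of m . (dm/dx x dm/dy) dx dy.

    m has shape (3, nx, ny); axis 1 is x and axis 2 is y.
    """
    dm_dx = np.gradient(m, dx, axis=1)
    dm_dy = np.gradient(m, dy, axis=2)
    density = np.sum(m * np.cross(dm_dx, dm_dy, axis=0), axis=0)
    return density.sum() * dx * dy / (4.0 * np.pi)

def neel_skyrmion(n=400, half_size=5.0, radius=1.0, wall_width=0.3):
    """Assumed test texture: core pointing down (-z), background up (+z)."""
    # An even n keeps the grid from sampling the singular point r = 0.
    x = np.linspace(-half_size, half_size, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    r = np.hypot(X, Y)
    phi = np.arctan2(Y, X)
    # Polar angle: theta = pi at the core, theta -> 0 far from the core.
    theta = 2.0 * np.arctan2(np.sinh(radius / wall_width),
                             np.sinh(r / wall_width))
    m = np.array([np.sin(theta) * np.cos(phi),   # radial in-plane rotation
                  np.sin(theta) * np.sin(phi),   # (Neel type)
                  np.cos(theta)])
    return m, x[1] - x[0]

m, dx = neel_skyrmion()
N_skyrmion = skyrmion_number(m, dx, dx)

# A uniform out-of-plane state carries no topological charge.
uniform = np.zeros_like(m)
uniform[2] = 1.0
N_uniform = skyrmion_number(uniform, dx, dx)
```

Because the finite-difference integral is only approximately an integer, it also illustrates why the strict conservation of the continuum model does not carry over exactly to discrete lattices.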
To host skyrmions, a system needs a special type of exchange interaction, originally proposed by Dzyaloshinskii and Moriya as an explanation for weak ferromagnetism and now known as the Dzyaloshinskii-Moriya interaction (DMI) [18,19]. Because this exchange originates from spin-orbit coupling, one requirement for skyrmions is strong spin-orbit coupling. The second requirement is reduced symmetry in the system. The interaction is chiral, so the local moments have a fixed relative orientation that depends on the sign of the DMI constant. At the atomic level, the DMI energy per spin pair can be represented as:
$$E_{\mathrm{DMI}} = \vec{D}_{ij} \cdot \left( \vec{S}_i \times \vec{S}_j \right), \tag{2}$$
where $\vec{D}_{ij}$ is the DMI vector, and $\vec{S}_i$ and $\vec{S}_j$ are the spins at sites i and j. This interaction favors an orthogonal alignment of neighboring spins, and the chirality of the spin arrangement is given by the direction of $\vec{D}_{ij}$. This is in contrast to the usual ferromagnetic Heisenberg exchange, $E_{\mathrm{FM}} = -J(\vec{S}_i \cdot \vec{S}_j)$, which favors a parallel alignment of the spins.
Materials that exhibit DMI can be distinguished by how the broken symmetry originates. DMI can originate in the lattice, which is non-centrosymmetric in compounds with a B20 lattice, in which case it is referred to as bulk DMI. In other materials, the DMI comes from the interface with a heavy metal, typically Pt, which is the source of spin-orbit coupling; this is known as interfacial DMI. Being a surface effect, the interfacial DMI strength is inversely proportional to the thickness of the ferromagnetic layer. Usually, in these systems, the ferromagnetic layers also have PMA, with the preferred magnetization direction perpendicular to the film plane, which is usually labeled as the z axis. In such systems, the continuous version of the DMI energy density has the following form [12]:
$$\varepsilon_{\mathrm{DMI}} = D \left[ m_z \left( \nabla \cdot \vec{m} \right) - \left( \vec{m} \cdot \nabla \right) m_z \right], \tag{3}$$
where D is usually known as the DMI constant and has units of J m$^{-2}$. These two types of DMI generate two types of skyrmions: Néel skyrmions in interfacial DMI systems (see figure 1(a)) and Bloch skyrmions for bulk DMI in confined geometries (see figure 1(b)). In Néel skyrmions, the magnetization is contained in the cross section of the skyrmion, as in figure 1(a). In Bloch skyrmions, on the other hand, the magnetization is perpendicular to the cross section. In both cases, the DMI is chiral, and the direction of rotation (clockwise or anticlockwise in figure 1(a)) is fixed for a given material sample and is determined by the DMI sign. In interfacial systems with PMA, the magnetization outside the skyrmion points in the direction favored by the PMA, which coincides with the outer ring of figure 1(a). The measured skyrmion size in such systems is around 100 nm [16], although it can be reduced to around 50 nm in confined geometries [20]. In multilayer systems at low temperatures, the skyrmion size can be below 10 nm [21].
After their first experimental observation, skyrmions quickly received interest as a possible means of magnetic storage [22]. In such applications, skyrmions are information carriers with binary values represented through the presence or absence of skyrmions. Several challenges must be overcome for efficient skyrmion memory. First, one needs to create and destroy skyrmions at will. However, as mentioned above, this is a process that requires energy. Nevertheless, several processes allow the creation of skyrmions by means of in-plane or out-of-plane electrical currents [23,24]. Another requirement is the detection of skyrmions by electrical means. After the initial observation of a magnetic skyrmion, it was soon reported that a bulk DMI system presents a contribution to the resistance if a lattice of skyrmions is present [25]. More recently, it was reported to be possible to identify the existence of skyrmions by their electrical signature in thin films containing skyrmions [26]. Finally, to propagate the information, the skyrmions should be moved along the wire.
Skyrmions, as well as DWs, can be moved by electrical currents applied through the magnetic system, which produce a torque on the magnetic texture. One can distinguish between spin-transfer torque (STT) and spin-orbit torque (SOT). In STT, the source of the torque is the spin polarization of the current by the ferromagnetic domain adjacent to the magnetic texture, and the torque is proportional to the gradient of the local magnetization [27]. In the SOT case, the torque arises from spin polarization created by an adjacent layer of a heavy metal with large spin-orbit coupling. Different mechanisms have been proposed, such as the spin Hall effect (SHE) [15] or the inverse spin galvanic effect [28]; they all share a spin-orbit origin and are consequently grouped under the denomination SOT. In experiments, all contributions can be present, and the final strength of each mechanism depends on the actual material parameters, such as the resistances of all the layers involved.
The standard tool for analyzing the motion of a skyrmion is the Thiele model [29]. In this model, the skyrmion is a rigid object with in-plane coordinates $\vec{X} = \{X_1, X_2\}$, whose dynamics can be described by:
$$\vec{G} \times \dot{\vec{X}} - \alpha \mathcal{D} \dot{\vec{X}} + \vec{F} = 0, \tag{4}$$
where $\vec{G}$ is the gyrotropic vector, which is proportional to the skyrmion number N, $\alpha$ is the damping constant, $\mathcal{D}$ is the dissipation dyadic, and $\vec{F}$ is a driving force applied to the skyrmion. In both STT and SOT, the force direction is parallel to the current direction [30]. Due to the gyrotropic character of the skyrmion motion, when a driving force $F_x$ is applied to the skyrmion in the x direction, there will be a transverse velocity component $\dot{X}_2$ perpendicular to the force, which is larger than $\dot{X}_1$, the velocity component parallel to the force. Using this model, the deflection angle can be calculated from $\tan\theta = \dot{X}_2 / \dot{X}_1 = G_z / (\alpha \mathcal{D}_{xx})$, which does not depend explicitly on the force strength. As a result, the skyrmion does not move parallel to the applied electrical current; this has been verified experimentally [31]. This effect is known as the skyrmion Hall effect, and the deflection angle as the skyrmion Hall angle. This is reminiscent of the Magnus force acting on a spinning ball in a fluid, as the skyrmion linear movement is linked to a rotation around its center. In the absence of a driving force or for low currents, the skyrmion is confined in the sample by the potential created by the boundary conditions associated with DMI [32], which is derived from equation (3). Eventually, if the electrical current exceeds some threshold, the skyrmion collides with the sample border and disappears [30]. Finally, one characteristic of skyrmions in DMI systems with PMA is that the skyrmion-skyrmion interaction is repulsive [33]. This fact is exploited in some of the potential applications explained below.
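The force independence of the skyrmion Hall angle can be checked with a short numerical sketch. It assumes the rigid-skyrmion Thiele equation in the form $\vec{G} \times \dot{\vec{X}} - \alpha \mathcal{D} \dot{\vec{X}} + \vec{F} = 0$ with $\vec{G} = G_z \hat{z}$ and an isotropic dissipation term; the parameter values and the function name `thiele_velocity` are illustrative and in arbitrary units, not taken from the cited works.

```python
import numpy as np

def thiele_velocity(force, G_z, alpha, D):
    """Solve G x Xdot - alpha*D*Xdot + F = 0 for the in-plane velocity."""
    # With G = G_z * z_hat, G x Xdot = G_z * (-Xdot_2, Xdot_1), so the
    # equation becomes the 2x2 linear system A @ Xdot = -F.
    A = np.array([[-alpha * D, -G_z],
                  [G_z, -alpha * D]])
    return np.linalg.solve(A, -np.asarray(force, dtype=float))

# Assumed, dimensionless parameters for illustration only.
G_z, alpha, D = 1.0, 0.1, 1.0

angles = []
for F_x in (1.0, 5.0):
    v1, v2 = thiele_velocity([F_x, 0.0], G_z, alpha, D)
    angles.append(np.arctan2(v2, v1))   # deflection from the force direction

expected_angle = np.arctan(G_z / (alpha * D))
```

Both entries of `angles` agree with `expected_angle`: increasing the force scales the speed but leaves the direction of motion unchanged, and for small damping the transverse component dominates.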

DW-MTJ
As mentioned before, DWs are kink solitons. They separate magnetic domains of opposite orientations. They are topological magnetic textures, which means that they can only be removed through continuous transformations if they reach the sample border. In that case, the DW is annihilated and the sample contains only one of the two original domains. In a system with PMA, the domains are oriented perpendicular to the plane, in the z direction, and the two domains correspond to the regions where the magnetization is $m_z = \pm 1$. The expression for the domain profile is the standard Bloch solution $m_z = \pm\tanh(x/\Delta)$, where $\Delta$ is the DW width. As in the skyrmion case, DWs in PMA systems can be either of Néel type (as the cross section of the skyrmion in figure 1(a)) or Bloch type (as the cross section of the skyrmion in figure 1(b)). In PMA systems, the presence of DMI favors Néel type over Bloch type. DWs can be moved by STT or SOT, as explained in the previous section. The main difference between the SHE and other mechanisms is that, when moved by the SHE, the DWs need to be of Néel type, because Bloch walls do not move efficiently under the SHE [34]. In principle, this is not a limiting requirement, because the same heavy metal used to induce SOT can also provide the DMI that favors Néel walls.
The DW-MTJ device is a novel nonvolatile logic device comprising a DW racetrack and an MTJ for magnetoresistance readout. Unlike a standard MTJ, the DW-MTJ device has an extended free layer in the shape of a racetrack that allows for current-driven DW motion. The DW-MTJ device finds applications in in-memory computing and has been shown to be stable against radiation [35], which makes it promising for space applications as well.
A cartoon of the DW-MTJ device is shown in figure 2. The DW racetrack is typically a heavy metal/magnet/oxide (e.g. Ta/CoFeB/MgO) thin film trilayer. An MTJ hard reference layer is patterned atop the track, with additional pinning layers (typically a synthetic antiferromagnet layer) to ensure a higher switching field of the hard layer compared to the free layer racetrack. To prevent the DW from escaping the racetrack, antiferromagnets can be used to exchange-bias the ends of the ferromagnetic track in opposing directions. Several methods and device components can be used to inject/maintain a DW inside the racetrack: an electrode (i.e. Oersted field line) can be placed across the track to inject a single DW in the track; an additional MTJ can be placed to nucleate the DW electrically; and pinning notches can be fabricated along the track to keep the DW in the track.
The basic DW-MTJ device has three terminals, although a four-terminal version has been proposed as well [37,38]. Here, we focus on the operation of the three-terminal device: the three terminals are referred to as input (IN), clock (CLK), and output (OUT), with IN and CLK being the two ends of the DW racetrack and OUT the top of the MTJ. The device operates as follows: (a) write operation: a voltage applied between IN and CLK induces DW motion through STT and/or SOT; (b) read operation: a voltage applied between CLK and OUT measures the magnetoresistance state of the MTJ, which reflects the DW position relative to the MTJ, and this output current can drive subsequent devices. One logic application of the device is an analog universal NAND gate: the IN of a device is connected to the OUT of two previous devices, such that only if both are in a low-resistance state (logical '1') will there be sufficient current to depin the DW and drive it past the MTJ, switching it from a low-resistance state (logical '1') to a high-resistance state (logical '0'); other combinations of the resistance states of the previous devices cannot provide enough current to depin the DW, leaving the output resistance state low (logical '1').
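The NAND operation described above can be captured in a simple behavioral sketch. The conductance values, read voltage, and depinning threshold below are hypothetical numbers in arbitrary units, chosen only so that the summed current of two low-resistance inputs exceeds the threshold; the function name `dw_mtj_nand` is likewise an illustrative assumption and not a model from the cited works.

```python
def dw_mtj_nand(bit_a, bit_b,
                g_low_res=1.0,      # assumed conductance of a low-resistance MTJ
                g_high_res=0.4,     # assumed conductance of a high-resistance MTJ
                read_voltage=1.0,   # assumed read voltage
                depin_current=1.8): # assumed DW depinning threshold
    """Return the output bit of a DW-MTJ stage fed by two upstream devices."""

    def conductance(bit):
        # Logical '1' is the low-resistance (high-conductance) state.
        return g_low_res if bit else g_high_res

    # Currents from the two upstream OUT terminals add at this device's IN.
    total_current = read_voltage * (conductance(bit_a) + conductance(bit_b))
    # The DW depins and passes the MTJ only if the summed current suffices.
    dw_passes_mtj = total_current >= depin_current
    # Passing the MTJ switches it from low resistance ('1') to high ('0').
    return 0 if dw_passes_mtj else 1

truth_table = {(a, b): dw_mtj_nand(a, b) for a in (0, 1) for b in (0, 1)}
```

In this sketch the logic function is set entirely by where the depinning threshold sits relative to the possible summed currents, which is why device-to-device variation in resistance and pinning strength matters for correct operation.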
Experimental device prototypes have demonstrated single-device operation and three-device NAND gate operation. PMA along with SOT-driven DW motion brings the switching current density down to the order of $10^{11}$ A m$^{-2}$ [39]. A PMA device integrating MTJ read/write of the DW and SOT driving has been applied to majority logic [40]. In simulation, other functions, such as a shift register and full adder, have been modeled in micromagnetics and benchmarked [41][42][43], and a simulation program with integrated circuit emphasis (SPICE) model has been developed for large-scale simulations [44]. System-level analysis revealed the power-performance trade-offs. For example, one study found undesirable DW pinning and thermal noise caused by ultra-low-voltage operation [45], but thermal noise limitations can be mitigated by increasing tunnel magnetoresistance (TMR).
We now briefly discuss the key metrics of DW-MTJ logic devices, including area, switching current density, DW speed, TMR, resistance-area product, cycle-to-cycle variation, and device-to-device variation. It has been calculated that a NAND device has a cell size of $18F^2$, where F is the pitch size [46]. For the following calculations, we assume a scaled pitch size of F = 15 nm, which corresponds to an area density of $2.5 \times 10^{10}$ devices cm$^{-2}$. Based on a previously reported switching current density of $1 \times 10^{10}$ A m$^{-2}$ for pitch size F of 150-450 nm, a scaled device of F = 15 nm requires switching voltage U = 100 mV and switching time t = 5 ns, corresponding to a switching energy of $7.5 \times 10^{-15}$ J. Recent reports on new DW racetrack materials find ultra-high DW speeds of up to 5700 m s$^{-1}$ in ferrimagnetic CoGd alloys [47], which will result in switching speeds of 8 GHz when eventually integrated into DW-MTJ logic. A major challenge in large-scale system integration is device-to-device variation, which has been addressed in [48]. Efforts in thin film stack growth and device fabrication techniques are still required to reduce TMR variation and related limitations and errors.
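The area density quoted above follows directly from the 18F² cell size, as the short calculation below shows. The last line additionally backs out the drive current implied by the quoted switching energy under the assumption that the energy is simply E = U·I·t for a constant current; that relation is an assumption of this sketch and is not stated in the cited works.

```python
F = 15e-9                                   # scaled pitch size (m)
cell_area = 18 * F**2                       # NAND cell size of 18 F^2 (m^2)
density_per_cm2 = 1.0 / (cell_area * 1e4)   # 1 m^2 = 1e4 cm^2

U = 100e-3                                  # switching voltage (V)
t = 5e-9                                    # switching time (s)
E = 7.5e-15                                 # quoted switching energy (J)

# Assumption: E = U * I * t with a constant current during the pulse.
implied_current = E / (U * t)               # in amperes
```

The density evaluates to roughly 2.5 × 10¹⁰ devices per cm², and the implied current under this assumption is 15 µA.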

Reversible computing overview
Landauer [66] was the first to point out that the loss of (known or correlated) information from a digital system implies a corresponding irreducible entropy increase (and thus energy expenditure); this can be understood to follow as a direct consequence of basic statistical physics and information theory [67]. This fact motivated the early conceptual development of the reversible computing paradigm [68][69][70], which avoids information loss by composing computations out of information-preserving primitive transformations. In principle, one can construct computing mechanisms on this basis that allow computations to be carried out with arbitrarily little energy expenditure. The development of engineering concepts working toward the realization of this principle in concrete electronic systems began in the late 1970s [71,72]. Such research has continued, albeit at a low level of intensity, to this day, yielding concrete examples of low-power digital design techniques utilizing reversible computing principles for both semiconducting [73][74][75][76] and superconducting [77] technology platforms. Estimates of the minimum energy dissipation per reversible logic operation in leading-edge complementary metal-oxide-semiconductor (CMOS) technologies extend down to the sub-attojoule (order ∼100 kT, with T ≈ 300 K) range [78], whereas simulations of reversible superconducting circuits suggest that dissipation levels even below kT (with T = 4 K) are possible [77]. Might it be possible to approach or even breach kT dissipation levels at room temperature by using skyrmion interactions to do logic? We explore this question in subsequent sections.
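To put these energy scales in perspective, the thermal energy kT and the Landauer bound kT ln 2 for erasing one bit can be evaluated directly. The sketch below uses the exact SI value of the Boltzmann constant; the function and variable names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K), exact in the SI

def thermal_energy(temperature):
    """Thermal energy kT in joules."""
    return K_B * temperature

def landauer_bound(temperature):
    """Minimum dissipation for erasing one bit, kT ln 2, in joules."""
    return K_B * temperature * math.log(2)

room, cryo = 300.0, 4.0   # temperatures in kelvin

hundred_kT_room_aJ = 100 * thermal_energy(room) / 1e-18   # in attojoules
landauer_room = landauer_bound(room)                      # about 2.9e-21 J
kT_cryo = thermal_energy(cryo)                            # about 5.5e-23 J
```

At 300 K, 100 kT is about 0.41 aJ, consistent with the sub-attojoule range quoted for CMOS, while kT at 4 K is roughly two orders of magnitude below kT at room temperature.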

Reversible skyrmion logic gates
The concept of reversible computing inspires the design of energy-conserving logical computing systems. Compared to microfluidic logic [79] or magnetic bubble logic [80,81], magnetic skyrmions demonstrate the potential of implementing such reversible computing systems in a scalable manner for compact and energy-efficient design. Researchers have proposed many designs and systems for skyrmion logic: [53,56,82] utilize skyrmion-DW conversions or interactions for logical operations; [54] nucleates skyrmions through MTJs then drives the skyrmions with the SHE; [57] switches logic gate functions through manipulating the driving current density, and [83] reconfigures the gate through voltage-controlled magnetic anisotropy (VCMA).
While these proposals of skyrmion logic systems are intriguing, it is important that a computing system be scalable and compute with minimal energy. Hu et al [55,84] proposed a scalable reversible skyrmion logic system that computes with nanoscale topological solitons in an energy-efficient manner. The reversible AND/OR gate shown in figure 3, one of the basic skyrmion logic gates, utilizes a symmetric structure in which skyrmions are propagated by an applied electrical current. This AND/OR gate conserves all skyrmions provided as input without destroying any during the logical operation, and exhibits conditional logical reversibility [85].
Beyond conditional logical reversibility, the Ressler-Feynman gate of figure 4 provides logical reversibility and a form of physical reversibility for all input combinations [84]. As this gate is functionally complete, it allows for the logically reversible computation of any Boolean expression. Its physical reversibility is limited, however, by the existence of hysteresis related to differing skyrmion trajectories under forward- and reverse-current conditions; to achieve dissipation below that predicted by Landauer for irreversible computing schemes, this physical reversibility needs to be improved. This technology is still early in its development, and future work is expected to increase its reversibility, which would drastically improve energy efficiency.
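The skyrmion-conserving property of the AND/OR gate can be illustrated at the purely logical level. The sketch below assumes the gate maps two input tracks (A, B) to the output pair (A AND B, A OR B), with a '1' denoting the presence of a skyrmion; this is a truth-table abstraction only and says nothing about the underlying skyrmion trajectories.

```python
from itertools import product

def and_or_gate(a, b):
    """Map two input skyrmion tracks to (AND output, OR output)."""
    return (a & b, a | b)

table = {inputs: and_or_gate(*inputs) for inputs in product((0, 1), repeat=2)}

# Every skyrmion that enters also leaves: the bit counts match row by row.
conserves_skyrmions = all(sum(i) == sum(o) for i, o in table.items())

# The map is not one-to-one: (0, 1) and (1, 0) both produce (0, 1).
is_one_to_one = len(set(table.values())) == len(table)
```

The number of skyrmions is conserved for every input combination, yet the mapping is not one-to-one over all inputs, which is consistent with the gate being described as conditionally, rather than unconditionally, logically reversible.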

Skyrmion clocking
In cascaded skyrmion logic circuits, the path length varies significantly between skyrmion trajectories. As skyrmions need to be synchronized upon gate entry for proper logical operation, active control is required to eliminate the timing differences caused by varying path lengths. Skyrmion clocking elements are thus included to impede the propagation of skyrmions until a global synchronizing pulse allows for their progression. By placing these synchronizers prior to logical gates, skyrmions can be effectively clocked, allowing for proper logical computation.
The first proposed skyrmion clocking mechanism (figures 5(a) and (b)) harnessed the relationship between skyrmion radii and current density for skyrmion synchronization [55]. These current-based synchronizers simply consisted of a constriction in the skyrmion racetrack. At normal current densities, the skyrmion radii are too large to pass the constriction, resulting in skyrmion pinning. However, a large spike in current density is able to reduce the skyrmion radii, allowing global skyrmion depinning. The simplicity of this scheme is its key advantage, with its primary drawback being the large current density spike: the current spike is energetically expensive and the resulting large skyrmion acceleration can cause skyrmion annihilation, thus preventing reversible computation.
Voltage-controlled synchronizers aim to alleviate the challenges associated with current-based synchronization. Via the VCMA effect, the PMA of the ferromagnetic track can be modulated with an electric field applied perpendicular to the track surface [87]. As skyrmions can be pinned by gradients in PMA, the VCMA effect allows for voltage-controlled synchronization [53] (figures 5(c) and (d)). The VCMA-based synchronizers of Walker et al introduce a preexisting PMA barrier at the synchronization region [86]. Through this mechanism, skyrmions are pinned without an applied electric field. A negative voltage applied to the region's electrode can then eliminate the PMA barrier, depinning skyrmions globally. This voltage-based skyrmion clocking methodology reduces the total power associated with reversible skyrmion logic, allowing for a ∼2× reduction in energy dissipation.

Skyrmions for unconventional computing
In logic devices, skyrmions are carriers of logical bits that must be deterministically created, synchronized, and read out for the devices to properly function. Such applications can be considered the more compact and energy-efficient alternatives of CMOS logic devices. In this section, we go beyond the classical von Neumann architecture and survey the advances in skyrmion devices for unconventional computing paradigms.

Skyrmion neurons and synapses
The manipulation of skyrmions with magnetic fields or electrical currents has given rise to a large number of proposals for biomimetic skyrmion devices. The first category of these devices includes those emulating various neuron models. In the skyrmion racetrack-type device, device tunability can be achieved by applying VCMA gating to vary the threshold current density of skyrmion depinning [64]. A more realistic neuron model, namely the leaky integrate-and-fire (LIF) spiking neuron, can be implemented with a PMA gradient along the racetrack [49] such that the motion of skyrmions is governed by a competition between the PMA gradient-facilitated leaking and current-induced integration (figure 6(a)). To mitigate the skyrmion Hall effect, the authors further proposed using an antiferromagnetically exchange-coupled bilayer for the skyrmion racetrack [50]. Besides racetrack-type devices based on translational skyrmion motion, oscillatory skyrmion dynamics in a confined structure has been exploited to emulate the resonate-and-fire model, which describes a neuron's periodic firing when the frequency of stimuli matches its sub-threshold oscillation [89]. It should be noted that, due to the narrow parameter window for skyrmion-hosting materials and the difficulties in deterministic writing, reading, and control of single skyrmions, the aforementioned works have been carried out in simulation only.
The second category of biomimetic skyrmion devices is artificial skyrmion synapses. In contrast to skyrmion neuron devices, which rely heavily on the control of single skyrmions, skyrmion synapse devices often use a skyrmion ensemble to emulate the potentiation and depression of synapses. In one proposed implementation [61], synaptic plasticity is facilitated by the collective migration of skyrmions from one device region to another, and the synaptic weight is proportional to the number of skyrmions within the device region of interest. Because the functioning of such devices is more resilient to the uncertainty of individual skyrmion behaviors, these devices have been more successfully implemented experimentally. The exemplary experimental work in this area by Song et al [62] uses a Hall bar structure to create skyrmions by current pulses, and the synaptic weight is represented by the measured topological Hall resistance, which is directly proportional to the number of skyrmions (figure 6(b)). The measured skyrmion synapse characteristics are used in an artificial neural network (ANN) to learn the Modified National Institute of Standards and Technology (MNIST) data set; while the classification accuracy does not reach that of a software ANN, it can be further improved with smaller skyrmions (larger number of resistance levels) and MTJ readout (larger on/off ratio).
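The competition between current-induced integration and gradient-driven leaking in a racetrack LIF neuron can be sketched with a one-dimensional toy model of the skyrmion position. This is a deliberately simplified illustration and not the micromagnetic model of the cited works: it assumes a drive velocity proportional to the input current, a constant leak velocity from a uniform PMA gradient, and arbitrary units throughout.

```python
def simulate_lif_skyrmion(currents, dt=1.0, track_length=100.0,
                          drive_gain=2.0, leak_velocity=0.5):
    """Return the time steps at which the toy skyrmion neuron fires.

    All parameters are illustrative and in arbitrary units.
    """
    position = 0.0
    spikes = []
    for step, current in enumerate(currents):
        # Integration: the current pushes the skyrmion toward the detector.
        # Leak: the PMA gradient pulls it back toward the start of the track.
        position += (drive_gain * current - leak_velocity) * dt
        position = max(position, 0.0)     # cannot leak past the track start
        if position >= track_length:      # skyrmion reaches the readout
            spikes.append(step)
            position = 0.0                # reset for the next cycle
    return spikes

strong_input = simulate_lif_skyrmion([1.0] * 200)   # drive exceeds the leak
weak_input = simulate_lif_skyrmion([0.2] * 200)     # leak dominates
```

With the strong input the neuron fires periodically, whereas with the weak input the leak dominates and the skyrmion never reaches the readout, which is the threshold-like behavior expected of an LIF neuron.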

Skyrmion devices for neural networks
While most proposed spintronic neural networks employ DW-based synapse and neuron devices (as discussed in section 6.5), skyrmion devices have attracted much interest in deep neural network applications, with various device- and system-level simulations reported. He and Fan [64] demonstrated the application of skyrmion devices in convolutional neural networks (CNNs). Here, the depinning current threshold of the skyrmion is fine-tuned by the VCMA effect, and the sigmoid activation function can be approximated by connecting N skyrmion devices as a skyrmion neural cluster. The resulting CNN shows a promising 98.74% classification accuracy on the MNIST data set, with energy consumption as low as 3.1 fJ step$^{-1}$. Furthermore, mixed synaptic plasticity (long- and short-term plasticity) could be implemented by combining different control mechanisms, as shown by [90] using SOT and VCMA controls. The deep CNN utilizing these skyrmionic synapses approaches software accuracy levels when classifying the static Canadian Institute for Advanced Research (CIFAR)-10 data set, and more interestingly, the short-term plasticity (STP) feature enables dynamic learning. Skyrmion devices have also been built into a binary neural network accelerator [91], as well as a synaptic architecture of a ternary neural network [92], both exhibiting excellent energy and accuracy benefits.
Skyrmion neurons/synapses have also been applied to the more bio-realistic spiking neural network (SNN) architectures. Room temperature simulations show that the skyrmionic synapse device can perform spike timing-dependent plasticity learning in an SNN [93]. Although a fully connected two-layer SNN only achieves 78% classification accuracy on MNIST images due to the limited synaptic weight resolution, this limitation can be mitigated by using a deep SNN with two additional hidden layers. With only 13 synaptic states, the proposed network achieves almost the same level of accuracy as an SNN with full-precision weights. Chen et al [94] proposed an all-spin skyrmionic SNN that employs skyrmion devices as both synapses and neurons, achieving a competitive 96% accuracy with ∼117× energy improvement compared to a baseline CMOS implementation in 45 nm technology. Furthermore, skyrmion LIF neurons can be connected to implement smaller building blocks for SNNs, such as winner-take-all (WTA) modules [95].
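The idea of approximating a sigmoid with a cluster of N threshold devices can be illustrated numerically. In the sketch below, each device is idealized as a step function that switches when the input exceeds its threshold, and the thresholds are placed at the quantiles of the logistic distribution. This placement, the idealized step response, and the function name `cluster_activation` are assumptions made for illustration; they are not the tuning scheme of the cited work.

```python
import numpy as np

def cluster_activation(x, n_devices=16):
    """Fraction of devices in the cluster whose threshold is exceeded by x."""
    # Assumed threshold placement: logistic quantiles, so that the averaged
    # step responses track the sigmoid 1 / (1 + exp(-x)).
    quantiles = (np.arange(n_devices) + 0.5) / n_devices
    thresholds = np.log(quantiles / (1.0 - quantiles))
    x = np.asarray(x, dtype=float)
    return (x[..., None] >= thresholds).mean(axis=-1)

xs = np.linspace(-6.0, 6.0, 1001)
sigmoid = 1.0 / (1.0 + np.exp(-xs))
max_error = np.max(np.abs(cluster_activation(xs, n_devices=16) - sigmoid))
```

With this threshold placement the staircase output stays within 1/(2N) of the sigmoid, so the approximation improves as more devices are added to the cluster.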

Skyrmion reservoir computing
A recurrent neural network (RNN) contains cycles or feedback loops in neuron connections and is a more realistic representation of the biological brain than simpler feed-forward neural networks. One special type of RNN, referred to as a reservoir, consists of neurons connected by random weights that are not updated during training; the corresponding computing paradigm, reservoir computing (RC), generates output through a linear combination of the reservoir outputs [96]. There are two primary requirements for a reservoir: non-linearity and short-term memory. In particular, physical RC implementations have attracted significant attention due to the fact that many physical processes inherently provide these two features [97].
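The division of labor in RC, a fixed random non-linear reservoir plus a trained linear readout, can be seen in a minimal software echo state network. The sketch below is a generic illustration rather than a model of any skyrmion system: the reservoir size, spectral radius, input scaling, ridge parameter, and the delayed-recall task are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_steps, washout, delay = 50, 1000, 100, 1

# Fixed random reservoir: these weights are never trained.
W = rng.normal(size=(n_nodes, n_nodes))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius to 0.9
w_in = rng.uniform(-0.5, 0.5, size=n_nodes)

u = rng.uniform(-1.0, 1.0, size=n_steps)          # random input sequence
states = np.zeros((n_steps, n_nodes))
x = np.zeros(n_nodes)
for t in range(n_steps):
    # Non-linear update in which past inputs fade gradually (short-term memory).
    x = np.tanh(W @ x + w_in * u[t])
    states[t] = x

# Task: recall the input from `delay` steps earlier.
X = states[washout:]
y = u[washout - delay:n_steps - delay]

# Only the linear readout is trained, here by ridge regression.
ridge = 1e-6
w_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_nodes), X.T @ y)
nrmse = np.sqrt(np.mean((X @ w_out - y) ** 2)) / np.std(y)
```

In a physical implementation, the role of `W` and the `tanh` update would be played by the intrinsic dynamics of the physical system, and only the readout weights `w_out` would need to be learned.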
Skyrmion-hosting materials are highly promising for RC because of their flexibility in supporting various configurations of skyrmion ensembles. Prychynenko et al leveraged the nonlinear anisotropic magnetoresistance (AMR) arising from the motion of a single skyrmion [98]. Jiang et al demonstrated the potential of temporal signal processing using the nonlinear time dependence of skyrmion motion [99]. Each training sample is encoded in temporal pulse sequences that are fed into the skyrmion racetrack, and the resulting skyrmion positions at each time interval are recorded. Another approach to skyrmion RC is to use a skyrmion fabric instead of individual skyrmions. Bourianoff et al [88] proposed the use of a skyrmion fabric as an echo state machine, and [100,101] proposed the use of a skyrmion reservoir for more advanced applications, such as audio or spatial analysis. Compared to systems containing only isolated skyrmions, the skyrmion fabric results in a more random and complicated current flow and can be tuned by the external magnetic field (figure 6(c)). In addition, inherent material inhomogeneities can help form a random skyrmion texture that serves as a reservoir, as shown in [102]. Here, the strongly nonlinear response of AMR to different input waveforms is used to perform pattern recognition.

Skyrmion probabilistic computing
Skyrmion dynamics, like all spin dynamics, is intrinsically stochastic due to forces such as thermal activation, material inhomogeneity, and skyrmion-skyrmion repulsion. These factors modulate the size, location, and pinning strength of skyrmions and are often difficult to control during device fabrication and operation. While randomness poses a significant challenge to deterministic skyrmion devices, unconventional computing paradigms can benefit from such stochasticity. One such application is a true random number generator based on local stochastic skyrmion dynamics [103]. Here, due to the moderate pinning sites in the skyrmion stack material, the size of the skyrmion fluctuates randomly as it moves around the Hall cross structure, modifying the Hall resistance.
Random dynamics of the skyrmion ensemble have also found applications in probabilistic computing. In probabilistic computing, a numerical value is represented by the probability of reading a '1' or '0' bit in a bitstream. One crucial requirement of probabilistic bit logic operation is that the input bitstreams must be uncorrelated. In [3], the bitstreams consisting of skyrmions are first deterministically created by MTJs. Following their creation, the skyrmions pass through the 'skyrmion reshuffler', a chamber that allows sufficient skyrmion-skyrmion interactions to randomize their motions (figure 6(d)). Therefore, the time at which and the order in which the skyrmions leave the reshuffler to be read out are also randomized, and two such devices create two uncorrelated bitstreams that can be used for logic operations. Experimental implementation of this device concept shows the great promise of skyrmions for probabilistic computing [3].
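The need for uncorrelated bitstreams can be made concrete with a small software experiment. The sketch below assumes that the logic operation is a bitwise AND, which multiplies the encoded probabilities when the two streams are independent; the stream length, probabilities, and pseudo-random generator are illustrative stand-ins for the physical reshuffler.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bits = 100_000
p_a, p_b = 0.6, 0.3

# Each value is encoded as the probability of reading a '1' in the stream.
stream_a = rng.random(n_bits) < p_a
stream_b = rng.random(n_bits) < p_b   # generated independently of stream_a

# With uncorrelated inputs, an AND gate estimates the product p_a * p_b.
product_estimate = np.mean(stream_a & stream_b)

# With fully correlated inputs (the same stream twice), the AND gate
# returns p_a instead of p_a * p_a.
correlated_estimate = np.mean(stream_a & stream_a)
```

The first estimate is close to 0.18, the intended product, while the correlated case returns about 0.6 instead of 0.36, which illustrates why a decorrelating element is required before the logic operation.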

Computing with a coupled skyrmion array
Though the previous discussion of skyrmion devices is primarily based on the translational motion of individual skyrmions, they further exhibit a wide range of dynamics that shows great promise for computational applications. For example, when excited by alternating electrical current or magnetic field, skyrmions are shown to have resonant oscillatory states known as gyration and breathing, whose frequencies are dependent on the magnetic parameters of the skyrmion-hosting material. Moreover, coupled skyrmions exhibit a multi-frequency resonance spectrum that can be used to carry information [104].
Jadaun et al exploited the flexibility of coupled skyrmions to create a reconfigurable, oscillatory neuron reminiscent of the adaptiveness of the brain [105]. The proposed neuron, named the Tunable SKyrmion Oscillatory Neuron (T-SKONE), consists of a two-dimensional skyrmion array sitting on artificial soft pinning sites; current-driven skyrmion motion allows the skyrmion lattice to be transformed between two different configurations (figure 7). Since the change of lattice configurations alters their relative positions and, therefore, the interactions of the skyrmions, the frequencies and amplitudes of the resonant oscillations change accordingly. To implement the adaptable, or 'context-aware' input-output characteristics, the T-SKONE takes two independent inputs: a modulatory input of electrical current that reconfigures the skyrmion lattice, and a direct input of magnetic field that excites skyrmion oscillation; the former mimics rewards, punishments, or other task-related factors that change the internal neuron state, and the latter represents the normal sensory input. Using full micromagnetic simulations, they demonstrated the oscillation frequency and amplitude modulations of T-SKONEs that can be utilized for advanced cognitive tasks.
The computational power of the adaptive T-SKONE is demonstrated by application to context-aware medical diagnosis. Context awareness is quite significant for decision-making in real life. In medical diagnosis, not only are biopsy data considered but also the 'context' of the patient; in this case, habits ('risk factors') or medical history. To demonstrate their computational power, a two-layer feed-forward ANN consisting of T-SKONEs was designed in [105]. T-SKONEs in the input layer take input x that encodes input features and control or contextual input y that reconfigures the neuron input-output characteristics. Due to the limited availability of a data set including both biopsy information and personal medical data, a composite data set is constructed by appending personal medical information based on population statistics to the Breast Cancer Wisconsin Dataset (Diagnostics). The biopsy features are always fed to the direct input of the T-SKONE, and the personal medical data is fed either to the contextual input or direct input for benchmarking purposes. The results show that the T-SKONE ANN has improved classification accuracy when personal medical data is fed to the context input instead of the direct input, thereby demonstrating the superiority of context-aware classification enabled by the adaptiveness of a T-SKONE. Two other advanced neural features exhibited by T-SKONEs are cross-frequency coupling and feature binding.
The design of the T-SKONE shows the potential of developing 'smart materials' using the coupled dynamics of skyrmions. The compactness, low power consumption, and rich dynamics of these devices are promising for the development of next-generation AI. Advances in skyrmion-supporting material design, pinning sites engineering, and high-sensitivity skyrmion oscillation detection methods will significantly contribute to the experimental realization of these devices.

DW-MTJ logic
The DW-MTJ, also referred to as a spintronic memristor, is a type of topological soliton device that has been fabricated in many variations [6,39,40,106]. DW-MTJs are particularly intriguing for logical and neuromorphic computing systems in light of the inherent summing behavior of a current-driven DW, making it ideal for versatile and programmable logic gate design for high computing efficiency [6,44,46]. Additionally, the cross-coupling among DWs enables neuromorphic computing system design through DW-MTJ's mimicry of the behavior of biological neurons and synapses [106][107][108][109][110][111].

DW-MTJ logic concept
The DW-MTJs introduced in figure 2 and shown experimentally in figure 8 can benefit logic in-memory applications due to their non-volatility, high speed, and low energy dissipation. Additionally, as DW-MTJ devices are radiation-hard [35], these devices are well suited for unconventional computing at the edge.
DW-MTJ logic devices have been shown experimentally to implement inverters and buffers depending on the direction of propagation [6,46]. Using characteristics of the devices, simulations also show that these devices can be concatenated to form a clocking circuit. Due to the physics of the critical current density necessary for DW movement, the device can also be used as a NAND gate. Alamdar et al [39] demonstrated scaled versions of these devices with a 350 nm track width with up to 200% TMR, showing that DW-MTJ logic devices also have promising scaling characteristics. Reasonably reliable two-device concatenation was also shown experimentally in [39].
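The threshold behavior behind the NAND operation can be sketched with a toy Python model. This is a simplification for illustration, not the device model of [6,39,46]: it assumes that the two input currents sum in the track, that the DW moves only when the sum exceeds a critical value, and that a displaced DW places the MTJ in its high-resistance state so that the output is low. The parameter names and values (`i_unit`, `i_critical`) are hypothetical.

```python
def dw_mtj_nand(in_a, in_b, i_unit=1.0, i_critical=1.5):
    """Toy NAND: the DW moves only if the summed input current exceeds i_critical."""
    # Inputs are logic levels (0 or 1); each '1' contributes i_unit of current.
    total_current = (in_a + in_b) * i_unit
    # With i_unit < i_critical < 2 * i_unit, both inputs are needed to move the DW.
    dw_switched = total_current > i_critical
    # Assumption: a displaced DW gives a high-resistance MTJ, hence a low output.
    return 0 if dw_switched else 1

truth_table = {(a, b): dw_mtj_nand(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

Choosing the critical current between one and two units of input current is what turns the summing behavior of the DW into a two-input logic function.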

DW-MTJ logic circuits
Larger circuits, such as full adders consisting of DW-MTJ devices, have been designed, simulated, and benchmarked [36]. The DW-MTJ adder architecture can be integrated with existing CMOS systems before and after the adder block. Figure 9 shows the schematic of a DW-MTJ-based one-bit full adder with the associated simulation results, illustrating the logical computing operations driven by the multi-phase clocking scheme [44]. Though non-volatility provides a unique advantage compared to CMOS, the slower speed and higher currents highlight materials innovation requirements for viable integration. First, TMRs greater than 300% would greatly relax the requirements for device-to-device variation and increase the reliability of the circuit. Additionally, though SOT devices approach the energy efficiency of CMOS circuits, a low switching current density for the DW-MTJ devices of 6.4 × 10⁹ A m⁻² is necessary to outperform CMOS. This co-design-based optimization of architecture, circuit, device, and material parameters can eventually lead to integration of DW-MTJ devices for in-memory computing.
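Because a NAND gate is functionally complete, a one-bit full adder can in principle be composed from NAND operations alone. The Python sketch below uses the textbook nine-NAND arrangement as an illustration; it is an assumption for exposition and does not reproduce the gate layout or clocking of the adder in [36,44].

```python
def nand(a, b):
    """Two-input NAND on logic levels 0 and 1."""
    return 1 - (a & b)

def full_adder(a, b, cin):
    """One-bit full adder built from nine NAND gates; returns (sum, carry_out)."""
    n1 = nand(a, b)
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    x = nand(n2, n3)        # x = a XOR b
    n4 = nand(x, cin)
    n5 = nand(x, n4)
    n6 = nand(cin, n4)
    s = nand(n5, n6)        # s = x XOR cin
    cout = nand(n1, n4)     # cout = (a AND b) OR (x AND cin)
    return s, cout

# Exhaustive check against integer addition.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(a, b, cin)
            assert s + 2 * cout == a + b + cin
```

In a clocked DW-MTJ implementation each gate output would additionally be held in a non-volatile state between clock phases, which this purely combinational sketch does not capture.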

DW-MTJ synapses
A critical requirement for emulating the behavior of a biological synapse is the ability of a device to contain multiple conductance states. In a DW-MTJ device, the DW mediates the ratio of parallel and anti-parallel regions in the MTJ, which allows for analog representations of states between the maximum and minimum conductance. However, experimental realization often results in defects that the DW is attracted to or repelled from. As a result, there is a need to introduce controlled pinning sites in the device to produce cycle-to-cycle and device-to-device reliability.
While analog switching using DWs has been demonstrated in large devices [113], controlled pinning sites are necessary to realize nm-scaled devices. One way to do this is to lithographically pattern notches along the track. In Liu et al [109], the notches provide local energy minima for the DW to relax to, allowing predictable conductance states and preventing DW drift. The structure of the device is shown in figure 10. Additionally, the conductance states are very stable, with only small write noise. In experiments, the average write noise for each state was found to be 0.25% [106], indicating that a large number of states can be engineered into the artificial synapse. This noise also mediates the inference accuracy of a network; weights that can be programmed reliably to the same levels are most important for accuracy. For classifying the CIFAR-100 image data set [114] using ResNet56 [115], the low noise level results in near ideal accuracy at every quantization level presented.
Notches also enable a characteristic that is critical for online learning: linearity [116]. By spacing the notches evenly, the allowed conductance states of the DW-MTJ synapse are constrained to be evenly distributed through the conductance range. In a straight wire, since each current pulse will drive the DW an equal distance and a negative current pulse will drive the DW in the opposite direction to a positive current, linear and symmetric behavior is obtained. However, the notches come at a cost: weights are constrained to a discrete number of levels. When online learning is performed with quantization-aware training, performance is far lower than that obtained with ideal weights of floating-point precision. However, this is greatly alleviated by the stochasticity of DW movement at finite temperature [109]. This means that each current pulse provides a probability of moving the DW to the next notch. When averaged over hundreds of thousands of updates for all of the synapses, the network is able to achieve a precision that is higher than the isolated precision of individual notches.
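The benefit of stochastic notch-to-notch motion can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the simulation of [109]: weights are integer notch indices, each requested update is a fraction of one notch spacing, and a pulse moves the DW by one notch with a probability equal to that fraction. All names and parameter values are illustrative.

```python
import random

def deterministic_update(level, delta, n_levels):
    """Round the requested change (in notch units) to a whole number of notches."""
    return min(max(level + round(delta), 0), n_levels - 1)

def stochastic_update(level, delta, n_levels, rng):
    """Move one notch with probability |delta| (requires |delta| <= 1)."""
    if rng.random() < abs(delta):
        level += 1 if delta > 0 else -1
    return min(max(level, 0), n_levels - 1)

rng = random.Random(1)
n_levels, n_synapses, n_pulses, delta = 16, 2000, 50, 0.1

det_levels = [0] * n_synapses
sto_levels = [0] * n_synapses
for _ in range(n_pulses):
    det_levels = [deterministic_update(l, delta, n_levels) for l in det_levels]
    sto_levels = [stochastic_update(l, delta, n_levels, rng) for l in sto_levels]

# Deterministic rounding discards every sub-notch update, so nothing is learned.
mean_det = sum(det_levels) / n_synapses
# Stochastic updates keep the sub-notch information on average:
# the mean approaches n_pulses * delta = 5 notches.
mean_sto = sum(sto_levels) / n_synapses
print(mean_det, mean_sto)
```

Each individual synapse still sits on a discrete notch, but the ensemble average tracks the accumulated fractional updates, which is the sense in which the network precision can exceed the precision of a single device.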
Beyond notches, the high degree of controllability regarding shape effects of the DW-MTJ devices can also be used to design synapses that have variable plasticity. Section 6.5 explores the possibilities of co-designing synaptic plasticity and network architecture to achieve specialized networks for computing at the edge.

DW-MTJ neurons
To implement a purely spintronic SNN in concert with a DW-MTJ, it is also necessary to implement spintronic neurons. Various DW-MTJ LIF neurons have been proposed using metal oxide semiconductor field-effect transistor (MOSFET) devices to implement the leaking functionality; however, this results in both increased area overhead and also increased static power dissipation. Because of this, three alternate DW-MTJ LIF neurons have been proposed that do not require MOSFETs for leaking.
One DW-MTJ LIF neuron uses a dipolar coupling field, which was implemented using an electrically isolated fixed ferromagnet fabricated underneath the DW track. The magnetic field generated by this ferromagnet is parallel to one domain in the track and anti-parallel to the other domain. This causes the parallel domain to expand and the anti-parallel domain to shrink, thereby causing the DW to leak in the absence of external control circuitry [110].
The second neuron takes advantage of the fact that DWs exist in lower energy states in lower anisotropy tracks than in higher anisotropy tracks. By introducing a linear anisotropy gradient, it is possible for the DW to thereby intrinsically shift from the region of higher anisotropy to the region of lower anisotropy. This anisotropy gradient can be produced using either a TaOₓ wedge fabricated on top of the DW track or using Ga⁺ ion irradiation of the track [111].
The final neuron uses a shape variation to implement intrinsic leaking. DWs exist in lower energy states in narrower DW tracks than in wide DW tracks. Therefore, linearly varying the DW track width will cause the DW to intrinsically shift from the wide end of the track to the narrow end of the track [117].
Finally, in order to implement the firing mechanism as well as input/output isolation, a secondary electrically isolated MTJ can be placed above the DW track. As the DW shifts underneath this MTJ, the MTJ's free layer changes states, thereby changing the resistance state of the MTJ. This altered resistance can be measured by applying a voltage across the MTJ, effectively adding a fourth terminal to the device. The resulting current can then be used as an input to the next stage and will not affect the state of the neuron [112].
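The leaky integrate-and-fire behavior described above can be summarized in a discrete-time Python sketch. This is a phenomenological toy model, not a micromagnetic description of [110,111,117]: it assumes that the DW position advances in proportion to the input current, drifts back at a constant leak rate, and triggers an output spike followed by a reset when it reaches the output MTJ. The function name, the reset rule, and all parameter values are assumptions made for illustration.

```python
def simulate_lif(inputs, track_length=1.0, gain=0.35, leak=0.05):
    """Return (spike times, DW position trace) for a sequence of input currents."""
    position = 0.0
    spikes, trace = [], []
    for t, current in enumerate(inputs):
        position += gain * current      # integrate: current pushes the DW forward
        position -= leak                # leak: intrinsic drift back toward the start
        position = max(position, 0.0)   # the DW cannot leave the track
        if position >= track_length:    # fire: the DW reaches the output MTJ
            spikes.append(t)
            position = 0.0              # reset after firing (assumed)
        trace.append(position)
    return spikes, trace

# Sustained input accumulates faster than it leaks, so the neuron fires.
strong_spikes, _ = simulate_lif([1.0] * 10)
# Sparse input leaks away between pulses, so the neuron stays silent.
weak_spikes, _ = simulate_lif([1.0, 0.0, 0.0, 0.0, 0.0, 0.0] * 3)
print(strong_spikes, weak_spikes)
```

In the three MOSFET-free designs, the `leak` term would originate from the dipolar field, the anisotropy gradient, or the width gradient, respectively, rather than from external circuitry.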

Lateral inhibition
The LIF DW-MTJ neuron of section 6.2 further exhibits the advanced neural function of lateral inhibition (LI). LI is a well-known mechanism of WTA neural networks and has been implemented in CMOS neurons [118][119][120], but these designs often involve complicated external circuitry that results in significant power consumption. In contrast, LI in DW-MTJ LIF neurons is intrinsic to the device due to magnetostatic interactions [110]. Systematic device simulations have been carried out to understand the physical mechanism of DW-MTJ LI and maximize it [121]. The micromagnetic simulator mumax3 was used to model the DW propagation in a pair of closely spaced magnetic racetracks. Due to magnetostatic interaction, the racetrack with faster DW motion ('winner') exerts a magnetic stray field on its neighbor ('loser'), further inhibiting its DW motion. The simulation results reveal an optimal racetrack lateral distance that leads to the most efficient LI. This optimal distance is associated with the magnetostatic interaction strength corresponding to the Walker breakdown field of DW motion [122]. The dependence of LI efficiency on material parameters, such as saturation magnetization and PMA, has also been studied. These results indicate that the experimental implementation of DW-MTJ neurons with maximized LI will be made feasible by careful material engineering.
Cui et al [123] further explored the computational potential of DW-MTJ neuron LI by investigating the LI-induced WTA behavior in an array of DW-MTJ neurons. The WTA neuronal competition rule states that in a neuron array only the most active neuron(s) are allowed to win the competition and fire. In a DW-MTJ neuron array, this is equivalent to a change in DW velocity distribution due to the LI between the neurons. Since it is not feasible to perform full micromagnetic simulations on a large number of DW-MTJ neurons, an LI model is extracted from micromagnetic simulations on a DW-MTJ neuron pair, and the DW velocity distributions due to LI in a one-dimensional neuron array are numerically calculated. By tuning the inhibition strength and array layouts, three different types of WTA are achieved: hard-WTA, soft-WTA, and k-WTA, each of which is suited to different computational tasks [124]. This further demonstrates the potential computational power of DW-MTJ neuron arrays, and further work will be required to build a standard WTA building block to be integrated into realistic neuromorphic architectures.
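The three WTA variants can be distinguished with an abstract Python sketch that operates on a list of neuron activities, which could stand for DW velocities. These are generic definitions, not the LI model extracted in [123]; in particular, representing soft-WTA by a softmax weighting with a `sharpness` parameter is an assumption made for illustration.

```python
import math

def hard_wta(activity):
    """Only the single most active neuron keeps its output."""
    winner = max(range(len(activity)), key=lambda i: activity[i])
    return [a if i == winner else 0.0 for i, a in enumerate(activity)]

def k_wta(activity, k):
    """The k most active neurons keep their output; the rest are silenced."""
    ranked = sorted(range(len(activity)), key=lambda i: activity[i], reverse=True)
    winners = set(ranked[:k])
    return [a if i in winners else 0.0 for i, a in enumerate(activity)]

def soft_wta(activity, sharpness=5.0):
    """All neurons stay active, but weaker ones are suppressed (softmax weighting)."""
    peak = max(activity)
    weights = [math.exp(sharpness * (a - peak)) for a in activity]
    total = sum(weights)
    return [w / total for w in weights]

velocities = [0.2, 0.9, 0.5, 0.7]      # hypothetical normalized DW velocities
print(hard_wta(velocities))            # [0.0, 0.9, 0.0, 0.0]
print(k_wta(velocities, 2))            # [0.0, 0.9, 0.0, 0.7]
print(soft_wta(velocities))
```

In the device picture, the inhibition strength and array layout would select among these regimes; here that choice is made simply by calling a different function.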

Configurable activation functions
In order to improve the learning characteristics of spintronic neural networks, it is beneficial for DW-MTJ neurons to implement various activation functions. It has been demonstrated that, by altering the shape of the neurons implemented in [125], it is possible to implement several activation functions. Through these simple shape alterations, [125] has implemented both the linear and sigmoidal activation functions. More generally, the corollary to the results of [125] is that any activation function could be implemented using this method.
To implement the linear activation function, [125] used a slight exponential variation in the width of the DW track rather than a linear variation. With a particular rate of exponential decay, it was demonstrated that the DW exhibits linear motion, as the force applied to it decays over time by virtue of the decreasing slope of the track's width gradient.
In a similar fashion, [125] introduced a constriction in the middle of the DW track in order to implement the sigmoidal activation function. Much like the reduced slope of the track's width gradient in the previous case, the reduced slope introduced at both the narrow and wide ends of the track reduced the effective leaking force experienced by a DW in these regions while increasing the force experienced by a DW in the middle of the track. This caused the device to display sigmoidal leaking behavior.

DW-MTJ neural network design
Due to the tunability of DW-MTJ neurons and synapses, there are opportunities to design neural network accelerators that can benefit from the characteristics outlined in the previous sections. Due to the linear and symmetric updates that have been demonstrated for DW-MTJ devices [126,127], the devices are effective artificial synapses for deep neural networks. On top of this, the tunability of the magnetic dynamics of the DW-MTJ synapses can be leveraged for specific types of deep neural networks. For example, the shape of the DW-MTJ track can be adjusted to tune the plasticity of the artificial synapse. In Leonard et al [106], trapezoidal DW-MTJ synapses were experimentally shown to have a DW depinning voltage that increased linearly as the width of the track increased, leading to a metaplasticity effect, where plasticity is determined by the state of the synapse. When this synapse was used to construct a multilayer perceptron (MLP) and applied to the Fashion-MNIST clothing article classification task [128], it was shown that the trapezoidal synapse does not provide any benefits, and performs slightly worse than a network composed of linear synapses. However, following Laborieux et al [129], this type of metaplasticity has been shown to be useful in preventing catastrophic forgetting in binarized neural networks. This task is well matched to edge computing, for which DW-MTJ devices are strong candidates, because data may be unreliable and hardware will be resource-limited. With this in mind, trapezoidal DW-MTJ synapses with binary readout were used to construct the MLP, and a streamed Fashion-MNIST task was constructed in which the network only had access to a subset of 1000 images (out of 60 000) at a time. The trapezoidal DW-MTJ network showed significantly improved test accuracy at 86% compared to a linear network at 83% and even reached the accuracy obtained when the network is shown all of the data at once, as shown in figure 11. This is because the critical weights learned in previous subsets are remembered due to the larger update necessary to reduce the critical weights in the trapezoidal synapses. This demonstrates that DW-MTJ synapses can be tuned to leverage magnetic dynamics for deep neural networks beyond linear and symmetric weights.
The DW-MTJ LIF neurons have been shown to effectively accomplish inference in artificial neural networks [110,130], but their application as LIF building blocks for SNNs is less explored due to the difficulty in simulating the spatio-temporal dynamics accurately. However, because the activation functions of DW-MTJ LIF neurons can be tuned by shaping the ferromagnetic track [125], the device has promising uses for dynamically rich neurons for SNNs. In inference, multi-domain LIF neurons have been shown to be noise-resilient due to the filtering effect of the leaking behavior [131]. Due to the noise introduced by thermal fluctuations, there is intrinsic uncertainty in the devices. These noisy activation functions are able to tune the network to be tolerant of noise at training time, which results in SNNs that can maintain effectiveness even when the data set becomes noisy. This is important for edge applications, where data is often sourced from sensors.

Conclusions
Topological solitons have been explored in order to leverage their physics for next-generation computing systems. Magnetic skyrmions and DWs are particularly appropriate for these emerging computing systems, especially for logical and neuromorphic computing. Compared to conventional two-terminal memristors, skyrmions have a variety of advantages in implementing neuromorphic computing systems. In general, memristors suffer from endurance, retention, and uniformity issues [132] that prevent them from enabling high-throughput, high-accuracy neuromorphic computing systems.
In contrast, the compact size of skyrmions and the rich physics of skyrmion interactions make them ideal for large-scale biomimetic neural networks without complex interconnection overheads. Additionally, the temperature stability of skyrmions permits them to be operated with a wider temperature range for more versatile applications. On the other hand, skyrmions are more challenging than memristors to fabricate, though they can be more easily integrated with other spintronic components for purely spintronic computing systems for low-energy computing.
DW-MTJs, meanwhile, are multi-terminal devices that, relative to memristors, provide better isolation between the reading and writing path, which results in less disturbance between reading and writing signals. Moreover, the inherent LI and the potential energy-free 'leaky' function make DW-MTJs ideal for implementing energy-efficient neurons [110,111,117].
This survey has summarized several applications of topological solitons that fully leverage the interactions among these solitons. We believe that the interactions and synergy among these topological solitons within a system differentiate solitons from other emerging computing devices and information carriers, leading to the potential for topological solitons to eventually replace conventional CMOS transistors in computing systems.

Data availability statement
No new data were created or analysed in this study. The data that support the findings of this study are available upon reasonable request from the authors.