High-density superconductive logic circuits utilizing 0 and π Josephson junctions

Superconductor Electronics (SCE) is a fast and power-efficient technology with great potential for overcoming the scaling limits of conventional CMOS electronics. The primary challenge confronting SCE today, however, is its integration level, which lags several orders of magnitude behind that of CMOS circuits. In this study, we develop and simulate a novel logic family based on the phase shifts occurring in 0 and π Josephson junctions. The fast phase logic (FPL) eliminates the need for large inductor loops and shunt resistances by combining half-flux and phase logic. As a result, only the Josephson junction (JJ) area limits the integration density. The cells designed with this paradigm are fast: the clock-to-Q delay for logic cells is about 4 ps, while the wiring cells maintain parameter margins above 50%. This logic is power efficient and can increase the integration density of SCE chips by at least 100×.

1. Introduction
Single flux quantum (SFQ) technology [1] holds great promise for the next generation of very-large-scale integration (VLSI) circuits. Among SFQ logic circuits, rapid single flux quantum (RSFQ) stands out for its focus on high operation rates. RSFQ uses Josephson junctions (JJs), which switch in just a few picoseconds (ps). RSFQ logic cells respond in about 10 ps from clock to output, enabling RSFQ systems to work well at speeds between 40 GHz and 60 GHz [2,3]. The energy needed for a JJ to switch is much lower than in CMOS technology, as low as 10⁻¹⁹ J/bit. This makes RSFQ systems more power-efficient than current technologies. Many studies focus on RSFQ technology, covering circuit and system designs [4,5,6], layout designs [7,8], as well as electronic design automation (EDA) tools and algorithms [9,10].
Despite the numerous merits of RSFQ circuits, they are not free of substantial challenges. Notably, the integration density of RSFQ circuits remains relatively modest, with approximately 10,000 logic gates on a chip area of 1 cm². This scale of integration is inadequate to meet the computational requirements of today's demanding applications. RSFQ circuits face other limitations: the lack of compact on-chip memory solutions and the need for a substantial bias current for effective operation. These impediments motivate the investigation of alternative circuit families that can overcome these hurdles and advance SFQ technology further.
The switching element in superconductor circuits is the Josephson junction (JJ). The JJ is the active component of an SFQ circuit and commonly has a Superconductor-Insulator-Superconductor (SIS) structure. The behavior of a JJ may be expressed by the current-phase relationship (CPR):
$$J_s = J_c \sin(\phi) \tag{1}$$
where J_s is the current density of the JJ, J_c is the critical current density of the JJ above which the JJ exits the superconducting state, and ϕ is the phase difference between the two superconducting layers. This simplified CPR equation, which assumes that supercurrent always tunnels through the JJ's barrier and that the temperature is below the critical temperature, approximates the behavior of SIS JJs well and is used in most SPICE-based simulator engines.
The MITLL SFQ5ee process [11,12] is one example of a technology implementing a Nb/Al-AlOx/Nb type junction, where the superconducting layers are Nb and the insulator is AlOx. By replacing the barrier insulator layer with a magnetic material with a built-in magnetic field, the SIS JJ becomes a magnetic junction (MJJ) [13,14]. The MJJ has been studied extensively for its unique characteristics, and some new devices based on the MJJ structure have been proposed, for example, the π-junction (π-JJ), ϕ-junction (ϕ-JJ), and 2ϕ-junction (2ϕ-JJ). The π-JJ [13,15] has an intrinsic phase shift of π, as shown in Eq. 2, which some studies have exploited to realize current-saving designs [16,17]:
$$J_s = J_c \sin(\phi + \pi) = -J_c \sin(\phi) \tag{2}$$
Similarly, if the phase shift is not π but an arbitrary value ϕ_0, the device forms a ϕ-junction (Eq. 3):
$$J_s = J_c \sin(\phi + \phi_0) \tag{3}$$
Related work can be found in [18,19].
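The three junction types described above differ only in their intrinsic phase shift. The following Python sketch illustrates this under the assumption of a purely sinusoidal CPR of the form J_s = J_c sin(ϕ + ϕ_0); the function names and the sign convention for ϕ_0 are illustrative choices, not taken from the cited works.

```python
import math

def cpr(phi, jc=1.0, phi0=0.0):
    """Supercurrent density for a sinusoidal CPR: J_s = J_c * sin(phi + phi0).

    phi0 = 0 models a conventional (0) junction, phi0 = pi a pi-junction,
    and any other value a phi-junction (assumed sign convention).
    """
    return jc * math.sin(phi + phi0)

def zero_jj(phi, jc=1.0):
    """Conventional SIS junction: no intrinsic phase shift."""
    return cpr(phi, jc, 0.0)

def pi_jj(phi, jc=1.0):
    """pi-junction: intrinsic phase shift of pi, which flips the current sign."""
    return cpr(phi, jc, math.pi)

if __name__ == "__main__":
    for phi in (0.0, math.pi / 4, math.pi / 2):
        print(f"phi={phi:.3f}  0-JJ: {zero_jj(phi):+.3f}  pi-JJ: {pi_jj(phi):+.3f}")
```

At any phase difference, the π-JJ current in this model is the negative of the 0-JJ current, which is the property exploited when the two junction types are combined.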
Increasing the on-chip density of SCE circuits is essential to their wider applicability. However, growing mutual inductance and crosstalk limit how far the dimensions of metal lines can be reduced, although kinetic inductors for passive transmission line (PTL) design may offer a potential solution. In an attempt to overcome the density challenge, Soloviev et al. [20] developed logic cells (including NDRO, DRO, and half adder) utilizing 2ϕ-junctions, aiming to eliminate the need for inductors and thus enhance scalability. Salameh et al. [21] introduced three cells employing 2ϕ-junctions: a Josephson transmission line (JTL), an inverter, and an OR gate. Compared to conventional RSFQ cells, these cells employ half flux quantum (HFQ) pulses, reducing latency and switching power. Additionally, Hasegawa et al. [22] demonstrated an SFQ/HFQ interface circuit by combining 0- and π-Josephson junctions, although this implementation did not encompass the entirety of phase logic functionalities.
To shrink the circuit sizes even further, we design circuits using JJs with higher J_C while eliminating the JJs' shunt resistances.
This paper introduces a standard cell library that leverages π-junctions to implement FPL cells, aiming to reduce the footprint of superconductive logic. The logic cells presented are at least 100× smaller than standard SFQ cells. The PTL width is reduced to ∼1.8 µm owing to the increased circuit impedance. The range of cells provided covers the fundamental needs of various computing systems. The functionality of these cells is verified using the JoSIM [26] simulator, while optimization using the qCS tool [27] yields satisfactory margins. The paper presents critical circuit parameters for FPL cells. Note that the projected layout area for cells is computed assuming that the SIS J_C is 600 µA/µm² and the SFS J_C is 1000 µA/µm².

2ϕ-Junction
Recent work has shown that at the 0-π transition, the fundamental term of the CPR vanishes, making the higher-order harmonic terms non-negligible [18]. Moreover, in [28], a single SFS junction with a Cu47Ni53 alloy barrier was implemented with two parallel superconducting inductors: a readout inductor and a small shunt inductor. The readout inductor is coupled to a commercial DC superconducting quantum interference device (SQUID) sensor, which detects the flux Φ in the readout loop. By measuring the CPR for different barrier thicknesses at different temperatures, reference [28] demonstrated a π-periodic behavior, ruling out explanations other than a second-order CPR. Thus, the overall CPR may be rewritten as:
$$J_s = J_{c1} \sin(\phi) + J_{c2} \sin(2\phi) \tag{4}$$
where J_{c1} and J_{c2} denote the amplitudes of the first and second harmonics. When the fundamental term vanishes, a new device, the 2ϕ-JJ, emerges with the CPR:
$$J_s = J_{c2} \sin(2\phi) \tag{5}$$
Several intriguing observations can be made regarding the 2ϕ-JJ. Firstly, its CPR has a period of π instead of the more typical period of 2π. Secondly, the 2ϕ-JJ switches when a π phase jump occurs, generating a half flux quantum (Φ_0/2 = 1.03 × 10⁻¹⁵ Wb). Consequently, each switching event of a 2ϕ-JJ corresponds to a phase shift of π. A few studies utilizing 2ϕ junctions have been reported, as detailed below.
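Both observations can be checked numerically. The sketch below assumes a pure second-harmonic CPR, J_s = J_c sin(2ϕ), and computes the flux quantum from the exact SI values of the Planck constant and the elementary charge; the helper names are illustrative.

```python
import math

PLANCK_H = 6.62607015e-34            # J*s, exact SI value
ELEMENTARY_CHARGE = 1.602176634e-19  # C, exact SI value

FLUX_QUANTUM = PLANCK_H / (2 * ELEMENTARY_CHARGE)  # Phi_0 = h / (2e), in Wb
HALF_FLUX_QUANTUM = FLUX_QUANTUM / 2               # flux carried by one HFQ pulse

def cpr_2phi(phi, jc=1.0):
    """Assumed pure second-harmonic CPR of a 2phi-junction."""
    return jc * math.sin(2 * phi)

def is_periodic(func, period, samples=100, tol=1e-9):
    """Check func(x) == func(x + period) on a grid covering [0, 2*pi)."""
    grid = (2 * math.pi * k / samples for k in range(samples))
    return all(abs(func(x) - func(x + period)) < tol for x in grid)

if __name__ == "__main__":
    print(f"Phi_0/2 = {HALF_FLUX_QUANTUM:.3e} Wb")
    print("2phi-JJ CPR is pi-periodic:", is_periodic(cpr_2phi, math.pi))
    print("sin(phi) CPR is pi-periodic:", is_periodic(math.sin, math.pi))
```

The conventional sinusoidal CPR fails the π-periodicity check, while the second-harmonic CPR passes it, consistent with the first observation above.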
2.2. Replacing 2ϕ with 0 and π JJs
Some logic cells designed with 2ϕ JJs were presented in [21]. Unfortunately, a dependable fabrication process for these junctions is still lacking. A new approach presented by Soloviev et al. [29] demonstrated the feasibility of implementing 2ϕ JJs using only 0 and π JJs. This procedure combines 0 and π JJs to create a bistable structure that functions analogously to a 2ϕ JJ. However, using both 0 and π JJs as switching elements within this structure raises a potential concern for circuit reliability: if switching happens in both the SIS and SFS layers, different parameter variations between the two layers can cause unreliable switching and reduce the margins. Therefore, in this work, we design the cells such that all the switching happens only in π JJs. If a design needs switching in both layers, it can be refined by substituting the switching π JJ with a 0 JJ in series with a π JJ of higher critical current (I_Cπ > 2 × I_C0), as illustrated in Fig. 1. This modification ensures that all switching occurs within the SIS (0 JJ) layer, thus enhancing overall reliability.

Fast Phase Logic (FPL)
This work introduces a collection of novel superconductive logic cells designed to admit compact layouts. This logic family leverages high critical current density (J_C) 0 and π Josephson junctions (JJs) to establish an all-JJ-based superconductive logic cell family, named fast phase logic (FPL). Notably, the JJs within the FPL family operate without shunt resistance. Moreover, no explicit inductances are present in the design of logic cells. This results in very high layout density.
For example, the proposed JTL cell within this paradigm occupies a mere 0.8 µm² and incorporates four JJs, with only the π JJ serving as a switching element. The power consumed during switching these JJs is approximately 3 × 10⁻²⁰ W. This indicates that a chip spanning 1 cm² and housing approximately 2.5 × 10⁷ JJs would consume around 97 mW in the worst-case scenario, making it amenable to cooling using liquid helium.
A layout of four JTL cells employing the FPL paradigm is shown in Fig. 2. As is evident from the figure, the four JTL cells, together with the bias line, would occupy 2.6 µm² of chip area. Compared to the RSFQ library, where each JTL cell occupies a minimum of 20 × 20 µm², the FPL cells are approximately 500× smaller. Even though this design removes the need for explicit inductors, minor parasitic inductances are inherent in the circuit due to the interconnections and vias between the SIS, SFS, and ground (GND) layers. A conservative assumption has been made throughout the design process, attributing a 0.1 pH inductance to each connection within the same layer and a 0.3 pH inductance to each via. These inductances are omitted from the circuit schematics for simplicity. Margin calculations indicate that these parasitic elements have negligible influence on the circuits until they exceed 1 pH.
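The quoted size reduction follows directly from the stated areas. The short Python sketch below reproduces the arithmetic using only the figures given in the text (0.8 µm² per FPL JTL cell, 2.6 µm² for four cells with the bias line, and 20 × 20 µm² per RSFQ JTL cell); the variable names are illustrative.

```python
# Areas in square micrometers, taken from the text.
RSFQ_JTL_AREA_UM2 = 20 * 20          # minimum RSFQ JTL cell footprint
FPL_JTL_AREA_UM2 = 0.8               # single FPL JTL cell
FOUR_FPL_JTL_WITH_BIAS_UM2 = 2.6     # four FPL JTL cells plus the bias line

# Reduction factor based on the single-cell area.
ratio_single_cell = RSFQ_JTL_AREA_UM2 / FPL_JTL_AREA_UM2

# Reduction factor based on the per-cell share of the four-cell layout.
area_per_cell_in_layout = FOUR_FPL_JTL_WITH_BIAS_UM2 / 4
ratio_from_layout = RSFQ_JTL_AREA_UM2 / area_per_cell_in_layout

if __name__ == "__main__":
    print(f"single-cell ratio: {ratio_single_cell:.0f}x")
    print(f"per-cell area in layout: {area_per_cell_in_layout:.2f} um^2")
    print(f"layout-based ratio: {ratio_from_layout:.0f}x")
```

The single-cell figure gives a factor of about 500, while the four-cell layout gives a slightly larger factor of about 615, so the approximately 500× claim is the more conservative of the two.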
Numerous logic cells were designed, and their corresponding circuits were simulated using the JoSIM software. The simulations incorporated thermal noise, although its impact remained negligible due to the absence of shunts in the JJs. Remarkably, these circuits exhibit enhanced resistance to flux trapping, as the absence of inductive loops prevents flux coupling with the circuits. This paper showcases a selection of exemplary cells tailored for elevated margins. These basic cells illustrate the efficient implementation of fast and dependable logic through the FPL approach.

3.2.1. Josephson transmission line (JTL)
Fig. 3 illustrates the schematic of a JTL cell. Within this schematic, J1 denotes the switching JJ, while three junctions labeled J2 function as inductance components for the JTL cell. The JTL cell is a foundational component in the design, responsible for interconnecting various other blocks and ensuring impedance matching.
JTLs can be cascaded: linking the OUT port of one JTL with the IN port of a successive JTL establishes a two-element JTL chain. J1-J2 and the subsequent JTL together form a closed loop, which imposes a phase condition: the phase differences summed around the loop equal an integer multiple of 2π. When a half-flux-quantum (HFQ) pulse enters the IN port, J1 switches, generating another HFQ pulse that propagates to the subsequent device. This sequence transports HFQ pulses along the JTL.
The simulated waveform is shown in Fig. 4. For a chain of 15 interconnected JTLs, it indicates that each JTL cell introduces a delay of approximately 0.3 ps. The output of this setup interfaces with a load that converts the HFQ pulses into SFQ pulses.
Fig. 6 presents the schematic and component values of a DC/FPL converter. Within this configuration, R_IN is a series 50 Ω input resistor that transforms the input voltage into a current. This resistor can be implemented either on-chip (as in this design) or off-chip. An inductor L_IN is introduced following the input resistor. During the rising edge of the input signal, L_IN exhibits high impedance, so the majority of the current flows through J2 and an FPL pulse is triggered at the output port. Once the input voltage stabilizes, L_IN acts as a short, diverting the input current through L_1 to ground and sparing J2 from activation. The simulated waveform in Fig. 7 shows the dynamic behavior of the converter.

Splitter
Like SFQ logic cells, the FPL cells have a fan-out of one. To address this limitation, a splitter is employed to replicate the pulse. Fig. 8 illustrates the schematic of the splitter cell with its component values, together with the associated test circuit. In this configuration, J1 receives the pulse through the IN port. The loop current divides, triggering J5 and J4 independently. This produces a distinct FPL pulse at each output port.
The simulation waveform in Fig. 9 shows the dynamic behavior. The input signal comes from the DC/FPL converter cell and, after passing through the JTL cells, is split into two pulses. The readout pulses on the respective loads are also shown in this figure.

Merger (asynchronous OR)
Fig. 10 shows the schematic of the merger cell, also referred to as the confluence buffer or asynchronous OR gate. When an FPL pulse arrives at either input port (e.g., IN1), it triggers the receiving junction (J1). This increases the current within the corresponding branch (J1-3-J4-J7). Consequently, J7 switches, generating an output pulse. Simultaneously, on the other branch, the buffering junction (J6 in the case of input from IN1) is activated to neutralize the flux flowing backward toward the other input port (IN2). Specifically, J4 and J6 collectively prevent the propagation of pulses in the reverse direction.
In instances where two input pulses coincide or fall within a narrow temporal window (a few picoseconds), only one output pulse is emitted at the OUT port.Fig. 11 captures the simulation waveform, illustrating the input and output waveforms and the intricate dynamics of this configuration.
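The merging behavior described above can be captured by a purely behavioral model. The Python sketch below is an illustration, not a circuit simulation: it ignores cell latency, treats pulses as time stamps in picoseconds, and uses a hypothetical coincidence window `window_ps` of 2 ps, since the text only states "a few picoseconds".

```python
def merge_pulses(in1_ps, in2_ps, window_ps=2.0):
    """Behavioral model of an asynchronous OR (merger).

    in1_ps, in2_ps: pulse arrival times at IN1 and IN2, in picoseconds.
    window_ps: hypothetical coincidence window; an input pulse arriving
               less than window_ps after the last emitted pulse is absorbed.
    Returns the list of output pulse times (cell latency is ignored).
    """
    outputs = []
    for t in sorted(list(in1_ps) + list(in2_ps)):
        if outputs and t - outputs[-1] < window_ps:
            continue  # coincident with the previous pulse: no second output
        outputs.append(t)
    return outputs

if __name__ == "__main__":
    # Pulses at 10 ps (IN1) and 11 ps (IN2) fall inside the window and merge.
    print(merge_pulses([10.0, 50.0], [11.0, 80.0]))
```

In this example, the near-coincident pulses at 10 ps and 11 ps yield a single output, while the isolated pulses at 50 ps and 80 ps each pass through.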

D flip-flop
Fig. 12 shows the schematic and corresponding test bench of the D flip-flop (DFF) cell. When an FPL pulse arrives through the IN port, the pulse is stored within the J1-2-J3-J4 loop as a clockwise screening current. This concurrently raises the bias current of J4. Consequently, when an FPL pulse arrives from the CLK port, J4 is triggered, generating an FPL pulse at the output port.
In contrast, in the absence of a stored pulse, the pulse arriving from CLK activates J5, while J4 remains untouched. The pulse energy is then dissipated into the ground. The simulation waveform in Fig. 13 depicts this process, showing the input and output waveforms.
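The storage and readout behavior can be summarized by a small state machine. The Python sketch below is a behavioral illustration only; it assumes that a clock pulse clears the stored state after readout, as in a conventional destructive-readout SFQ DFF, which the text does not state explicitly. The class and method names are hypothetical.

```python
class BehavioralDFF:
    """Behavioral model of the DFF cell (no circuit-level dynamics)."""

    def __init__(self):
        # True while a pulse is held in the storage loop as a screening current.
        self.stored = False

    def data(self):
        """An FPL pulse arrives at IN and is stored in the loop."""
        self.stored = True

    def clock(self):
        """An FPL pulse arrives at CLK.

        Returns True if an output pulse is produced (stored pulse present,
        J4 switches) and False otherwise (J5 switches, no output).
        Assumption: readout resets the stored state.
        """
        output = self.stored
        self.stored = False
        return output

if __name__ == "__main__":
    dff = BehavioralDFF()
    print(dff.clock())  # no stored pulse
    dff.data()
    print(dff.clock())  # stored pulse is read out
    print(dff.clock())  # state was cleared by the previous clock
```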

Asynchronous AND gate
The AND gate employs a structure akin to the OR (merger) gate, with adjustments made to the key components (the merging part). As a result, the output junction J7 requires a minimum of two FPL pulses to generate an output signal. Fig. 14 presents the AND gate configuration with its associated test circuit; the component values are listed in the caption. The simulation waveform in Fig. 15 shows the input and output waveforms of this process.

Clocked AND/OR gates
As is well known, clocked gates can be derived from asynchronous gates by incorporating DFFs at their inputs. For instance, the clocked OR gate is constructed from two DFFs, one at each input of a merger cell. These DFFs receive synchronized clock signals from a splitter circuit. The same configuration can be adapted to transform the asynchronous AND gate into a clocked variant. The schematics of these two types of gates are depicted in Fig. 16.
These structures are amenable to optimization, allowing for the enhancement of margins, reduction in cell sizes, and minimization of clock-to-Q propagation times. The presented cells serve as illustrative examples of how FPL technology can yield compact, high-speed, and reliable logic circuits. The relevant details are summarized in Table 1, which lists the designed cell delays, the maximum simulated frequency, and the projected size.
In estimating the size, we consider JJ technology with demonstrated values of J_C = 600 µA/µm² for the SIS layer and J_C = 1000 µA/µm² for the SFS layer. This estimation suggests that FPL cells can be integrated at densities of up to 50 MJJ/cm² and can operate efficiently at clock frequencies of up to 50 GHz, indicating that FPL technology can offer both scalability and high performance.
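To put these figures in perspective, the Python sketch below converts the stated density and clock frequency into an area budget per junction and a clock period, and shows how junction area scales with critical current density. The 60 µA critical current used in the example is a hypothetical value chosen for illustration; it does not come from the paper.

```python
UM2_PER_CM2 = 1e8  # 1 cm^2 expressed in square micrometers

# Figures stated in the text.
JJ_DENSITY_PER_CM2 = 50e6      # 50 MJJ/cm^2
CLOCK_FREQUENCY_HZ = 50e9      # 50 GHz
JC_SIS_UA_PER_UM2 = 600.0      # SIS critical current density
JC_SFS_UA_PER_UM2 = 1000.0     # SFS critical current density

# Average chip area available per junction, including wiring and spacing.
area_budget_per_jj_um2 = UM2_PER_CM2 / JJ_DENSITY_PER_CM2

# Clock period in picoseconds.
clock_period_ps = 1e12 / CLOCK_FREQUENCY_HZ

def jj_area_um2(ic_ua, jc_ua_per_um2):
    """Junction area needed for critical current ic_ua at density jc."""
    return ic_ua / jc_ua_per_um2

if __name__ == "__main__":
    hypothetical_ic_ua = 60.0  # illustrative value, not from the paper
    print(f"area budget per JJ: {area_budget_per_jj_um2:.1f} um^2")
    print(f"clock period: {clock_period_ps:.0f} ps")
    print(f"SIS JJ area: {jj_area_um2(hypothetical_ic_ua, JC_SIS_UA_PER_UM2):.3f} um^2")
    print(f"SFS JJ area: {jj_area_um2(hypothetical_ic_ua, JC_SFS_UA_PER_UM2):.3f} um^2")
```

Under these assumptions, the junction itself occupies only a fraction of the roughly 2 µm² budget per junction, leaving the remainder for interconnects, vias, and bias lines.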

Conclusion
The FPL cells were demonstrated using 0 and π JJs. Operating without inductors (akin to 2ϕ logic) and without shunts (like HFQ logic), these cells represent a significant development. This novel logic family opens up avenues for unprecedented miniaturization within SCE logic. Leveraging an inventive design, these cells demand bias currents that are a mere 20× … Remarkably, the absence of inductive loops renders FPL cells considerably less susceptible to trapped flux and crosstalk. This sets the FPL family apart from other SFQ technologies. With its potential for dependable dense integration and high operating frequencies, the FPL family is a promising candidate for the next generation of very-large-scale integration (VLSI) circuits.

Figure 1. Replacing a switching π JJ with a switching 0 JJ and a series non-switching π JJ that only provides phase shift.

Figure 2. A layout sample of the FPL cells assuming the high-J_C SFS and SIS technology. Here we assume that SIS JJs have J_C = 600 µA/µm² and SFS JJs have J_C = 1000 µA/µm².

Figure 6. Schematic and component values of the DC/FPL converter. R_IN is a series 50 Ω input resistor that transforms the input voltage into a current.

Figure 5. Parameter margin of the JTL cell after optimization. After a few iteration cycles, about 60% margin was achieved.

Figure 16. Schematic of the OR gate.

Table 1. Delay (input-to-output or clock-to-Q) and estimated size of each designed cell. The delay is calculated from threshold to pulse and depends on the input value. Size is without input resistor.