NetQASM - A low-level instruction set architecture for hybrid quantum-classical programs in a quantum internet

We introduce NetQASM, a low-level instruction set architecture for quantum internet applications. NetQASM is a universal, platform-independent and extendable instruction set with support for local quantum gates, powerful classical logic and quantum networking operations for remote entanglement generation. Furthermore, NetQASM allows for close integration of classical logic and communication at the application layer with quantum operations at the physical layer. This enables quantum network applications to be programmed in high-level platform-independent software, which is not possible using any other QASM variants. We implement NetQASM in a series of tools to write, parse, encode and run NetQASM code, which are available online. Our tools include a higher-level SDK in Python, which allows an easy way of programming applications for a quantum internet. Our SDK can be used at home by making use of our existing quantum simulators, NetSquid and SimulaQron, and will also provide a public interface to hardware released on a future iteration of Quantum Network Explorer.

FIG. 1: A quantum network application consists of a program for each of the nodes involved in the application. Each program is locally executed by the node. Program execution on each node is split into execution in an application layer, which can send and receive classical messages, and a quantum processor, which can create entanglement with another node. The communication between nodes can hence be both classical and quantum. Communication instructions need to be matched by corresponding instructions in the other program. There is no global actor overseeing execution of each of the programs, and the nodes may be physically far apart.
FIG. 2: A program on a single node consists of different blocks of code, which can be quantum (pure quantum instructions with classical control in between), or classical (no quantum operations at all). These blocks may depend on each other in various ways. For example, the outcome of a measurement happening in one of the quantum blocks may be used in a calculation performed in one of the classical blocks. Blocks may also depend on other nodes. For instance, the value of a message coming from another node can influence the branch taken in one of the classical blocks.
Of course, classical control communication may be used by the QNPU to realize the services of the quantum network stack accessed through NetQASM.
With NetQASM we solve various problems that are unique to quantum internet programming: (1) for remote entanglement generation, we introduce new instruction types for making use of an underlying quantum network stack [21,22], (2) for the close interaction between classical and quantum operations, we use a shared-memory model for sharing classical data between the application layer and the QNPU, (3) in order to run multiple applications on the same quantum node, which may be beneficial for overall resource usage (see section IV), we make use of virtualized quantum memory, similar to virtual memory in classical computing [23], (4) since on some platforms not all qubits may be used to generate remote entanglement, we introduce the concept of unit modules, which describe qubit topologies with additional information per (virtual) qubit about which operations are possible.
Since NetQASM is meant to be low-level, similar in nature to classical assembly languages, we have also developed a higher-level software development kit (SDK), in Python, to make it easier to write applications. This SDK and related tools are open-source and freely available at [24], as part of our Quantum Network Explorer [25]. Through the SDK we have also enabled the quantum network simulators NetSquid [26] and SimulaQron [27] to run any application programmed in NetQASM. We have evaluated NetQASM by simulating the execution of a teleportation application and a blind quantum computation application. We have thereby shown that interesting quantum internet applications can indeed be programmed using NetQASM. Furthermore, the evaluations support certain design choices of NetQASM, namely the use of so-called unit modules, as well as platform-specific flavors.
We remark that NetQASM has already been used on a real hardware setup in the lab, in a highly simplified test case that only produces entanglement [28].
None of these instruction sets or languages contain elements for remote entanglement generation (i.e. between different nodes), which NetQASM does provide. A NetQASM program that uses the vanilla flavor and only contains local operations would look similar to an OpenQASM program. However, the instruction set is not quite the same, since NetQASM uses a different memory model than OpenQASM. This is due to the hybrid nature of quantum network programs, which has more interaction between classical data and quantum data than non-networking programs (for which OpenQASM might be used). So, NetQASM is not just a superset of the OpenQASM instruction set (in the sense of adding entanglement instructions).
In [27], we introduced the CQC interface, which was a first step towards a universal instruction set. However, CQC had a number of drawbacks, in particular: (1) CQC does not have a notion of virtualized memory (see section IV), which meant that applications needed to use qubit IDs that were explicitly provided by the underlying hardware. This introduced more communication overhead and fewer optimization opportunities for the compiler. (2) CQC does not provide as much information about hardware details. Therefore, platform-specific compilation and optimization is not possible. (3) Furthermore, CQC does not match entirely with the later definition of our quantum network stack [21,22]. For example, it was not clearly defined how CQC relates to the definition of a network layer.
Many of the ideas from e.g. QASM for how to handle and compile local gates can be reused also for quantum network applications. For example, version 3 of OpenQASM [17] which is under development, proposes close integration between local classical logic and quantum operations, which is something we also propose in this work. However, there are two key differences that we need to address: 1. Instructions for generating entanglement between remote nodes in the network need to be handled and integrated with the rest of the application, see section II B below.

C. Outline
In section II we define relevant concepts and introduce the model of end-nodes that we use, including the QNPU. In section III we discuss use-cases of a quantum network which NetQASM should support. In section IV we consider requirements and considerations that any instruction set architecture for quantum networks should fulfill; these lay the basis for the decisions that went into developing NetQASM, see section V. In section VI and section VII we describe details about the NetQASM language and associated SDK. In section VIII we quantitatively evaluate some of the design decisions of NetQASM by benchmarking quality of execution using the quantum network simulator NetSquid [26,60]. We conclude in section IX.

II. PRELIMINARIES AND DEFINITIONS
A. Quantum networks
A schematic overview of quantum networks is given in fig. 3. A quantum network consists of end-nodes (hereafter also: nodes), which contain quantum network processors as well as classical processors. Nodes are connected by quantum channels or links that can be used to generate entangled quantum states across nodes. End-nodes possess quantum memory in the form of qubits, which can be manipulated by performing operations such as initialization, readout, and single- or multi-qubit gates. Each quantum memory has a certain topology that describes which operations can be applied on which (pair of) qubits. Some of the qubits in a quantum memory may be used to generate an entangled state with another node. These qubits are called communication qubits [21], in contrast to storage qubits which can only directly interact with other qubits that are part of the same local node.
Some platforms only have a single communication qubit and multiple storage qubits [61], whereas others can have multiple communication qubits [11]. Qubits are sensitive to decoherence and have limited lifetimes. Therefore, the timing and duration of operations (such as local gates or entanglement generation with another node) have an impact on the quality of quantum memory. Classical processors control the quantum hardware, and also perform classical computation. Finally, classical links exist between nodes for sending classical messages.
Since end-nodes can control their memory and entanglement generation, they can run arbitrary user programs. End-nodes can both communicate classically and generate entanglement between each other, either directly or through repeaters and routers (fig. 3). Nodes in the network other than end-nodes, such as repeaters and routers, do not execute user programs; rather these run protocols that are part of some level in the network stack [21,22]. These internal nodes in the network perform elementary link generation and entanglement swapping in order to generate long-distance remote entanglement between end-nodes [21].
There are various quantum hardware implementations for quantum network processors, such as nitrogen-vacancy centers in diamond [61], ion traps [8], and neutral atoms [9,62], which all have different capabilities and gates that can be performed.
In contrast to classical networks, we consider the end-nodes in a quantum network to not have a network interface component that is separated from the main processing unit. Having local and networking operations combined in a single interface reflects the physical constraint on current and near-term hardware. Current state-of-the-art hardware for quantum networking devices can make use of up to the order of 10 qubits [63]. Furthermore, certain hardware implementations, such as nitrogen-vacancy centers in diamond [61], only have a single communication qubit, which also acts as a mediator for any local gate on the storage qubits. This prevents dedicating some qubits for purely local operations and some for purely networking operations. Rather, to make maximal use of near-term quantum hardware, a multi-purpose approach needs to be supported.

B. Application layer and QNPU
In this work we will assume an abstract model of the hardware and software architecture of end-nodes in a quantum network. Specifically, we assume each end-node to consist of an application layer and a Quantum Network Processing Unit (QNPU). The application layer can also be seen as the user space of a classical computer, and the QNPU as a coprocessor.  1) is abstracted away by a network stack. That is, it is not visible at the application layer how entanglement generation or classical message passing is realized. This may be via direct physical connections, or intermediary repeaters and/or routers. End-nodes hold two types of qubits: (1) communication qubits which can be used to generate entanglement with remote nodes and (2) storage qubits which can be used to store quantum states and apply operations. A communication qubit may also be used as a storage qubit. The qubits within an end-node can interact through quantum gates and their state can be measured.
This model takes into account both physical- and application-level constraints found in quantum network programming. The QNPU can be accessed by the application layer, at the same node, to execute quantum and classical instructions. We define the capabilities of the QNPU, and roughly its internal components, but do not assume how exactly this is implemented. In the rest of this work, we simply use the QNPU as a black box.
The QNPU can do both classical and quantum operations, including (1) local operations such as classical arithmetic and quantum gates and (2) networking operations, i.e. remote entanglement generation. The application layer cannot do any quantum operations. It can only do local computation and classical communication with other nodes. In terms of classical processing power, the difference between the application layer and the QNPU is that the application layer can do heavy and elaborate computation, while we assume the QNPU to be limited in processing power.
The application layer can interact with the QNPU by for example sending instructions to do certain operations. The application layer and the QNPU are logical components and may or may not be the same physical device. It is assumed that there is low latency in the communication between these components, and in particular that they are physically part of the same node in the network.
One crucial difference between the application layer and the QNPU is that the application layer can do application-level classical communication with other end-nodes, while the QNPU cannot. The QNPU can communicate classically to synchronize remote entanglement generation, but it does not allow arbitrary user-code classical communication. We use this restriction in order for the QNPU to have relatively few resource requirements.
The QNPU consists of the following components, see fig. 4:
• Processor: The processor controls the other components of the QNPU and understands how to execute the operations specified by the application layer. It can read and write data to the classical memory and use this data to make decisions on what operations to do next. It can apply quantum gates to the qubits in the quantum memory and measure them as well. Measurement outcomes can be stored in the classical memory.
• Classical memory: Random-access memory storing data produced during the execution of operations, such as counters, qubit measurement outcomes, information about generated entangled pairs, etc.
• Quantum memory: Consists of communication and storage qubits, see section II A, on which quantum gates can be applied. The qubits can be measured and the resulting outcome stored in the classical memory by the processor. The communication qubits are connected through a quantum channel to adjacent nodes in the quantum network, through which they can be entangled. This quantum channel may also include classical communication needed for synchronization, phase stabilization or other mechanisms needed in the specific realization.
• Quantum network stack: Communicates classically with other nodes and quantum repeaters in the network to synchronize the generation of remote entanglement, and issues low-level instructions to execute the entanglement generation procedures, see [21,22].
We stress that the internals of the QNPU are not relevant to the design of NetQASM. We do assume that the QNPU only has limited classical processing power, and can therefore be implemented on for example a simple hardware board.

C. Applications and programs
As mentioned in section I, quantum network applications (or protocols) are multi-partite and distributed over multiple end-nodes. The unit of code that is executed on each of the end-nodes that are part of the application, is called a program. We will use this terminology throughout the rest of the paper.
As mentioned in the previous section, the end-nodes are modeled such that there is an application layer and a QNPU. We assume that execution of quantum network programs is handled by the application layer. How exactly the program is executed, and how the QNPU is involved herein, is part of the NetQASM proposal.

III. USE-CASES
In the next section we will discuss the design considerations taken when developing NetQASM. These design considerations are based on a set of use-cases listed in this section which we intend for NetQASM to support. Applications intended to run on a quantum network will often depend on a combination of these use-cases.
• Local quantum operations. Applications running on a network node need to perform quantum operations on local qubits, including initialization, measurement, and single- or multi-qubit gates. Such local qubit manipulation is well known in the field of quantum computing. For example, OpenQASM [37] describes quantum operations. Quantum network applications should be able to do these local operations as well.
• Local quantum operations depending on local events or data. The next use-case stems from applications consisting of programs in which limited classical computation or decision making is needed in-between performing quantum operations. Here we consider only dependencies in a program between quantum operations and information that is produced locally, that is, on the node on which this program is being executed. For instance, a program might only apply a quantum gate on a qubit depending on the measurement outcome of another qubit, or choose between execution branches based on evaluation of a classical function of earlier measurement outcomes. An example is the server side of blind quantum computation, which performs a form of Measurement-Based Quantum Computation (MBQC). In each step of the MBQC, the server performs certain gates on a qubit, depending on results of measuring previous qubits [64]. These applications need classical operations to not take too much time, so that qubit states stay coherent during these operations. This implies that switching between classical and quantum operations should have little overhead.
• Entanglement generation. Crucial to quantum networks is the ability to generate remote entanglement.
Applications should be able to specify requests for entanglement generation between remote nodes. In some cases, a Measure-Directly (MD) [21] type generation is required, where the entangled state is measured directly, without storing it in memory, to obtain correlated classical bits, such as in Quantum Key Distribution (QKD). However, in many cases a Create-Keep (CK) [21] type is needed, where the entanglement needs to be stored in memory and further operations applied involving other qubits. We want applications to be able to initiate or receive (await) entanglement of both forms with nodes in the network.
• Local quantum operations depending on remote events or data. We already mentioned the use-case of having conditionals based on local information. We also envision applications that need to store qubits and subsequently perform local quantum operations on them and other local qubits, based on classical information coming from another node. An example is teleportation, in which the receiver, after successful entanglement generation, needs to apply local quantum corrections based on the measurement outcomes of the sender. Another application is blind quantum computation, where the server waits for classical instructions from the client about which quantum operations to perform. Hence, there needs to be integration of classical communication (sending the measurement results or further instructions) and the local quantum operations. Furthermore, since classical communication has a non-zero latency (and is in general even non-deterministic), it should be possible to suspend doing quantum operations while waiting for communication or performing classical processing, while quantum states stay coherent.
• Waiting time. We consider the scenario where an application requires two nodes to communicate with each other, and where communication takes a long time, for example since they are physically far apart. It should be possible for a program to suspend doing quantum operations while waiting for communication or performing classical processing, while quantum states stay coherent. Furthermore, in order to maximize the usage of the QNPU we want to have a way to fill this waiting time in a useful way.
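As an illustration of the teleportation use-case above, the following minimal sketch simulates a receiver applying corrections that depend on the two measurement outcomes of the sender. It is a toy three-qubit state-vector simulation in plain Python, not NetQASM code. The helper names (apply_gate, apply_cnot, measure, teleport) are hypothetical, and the circuit used is the textbook teleportation circuit (CNOT and Hadamard at the sender, conditional X and Z at the receiver), which is an assumption of this example rather than something specified in this section.

```python
import math
import random

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
N = 3  # qubit 0: state to send, qubit 1: sender half, qubit 2: receiver half


def apply_gate(state, gate, target):
    """Apply a 2x2 gate to one qubit (qubit 0 is the most significant bit)."""
    shift = N - 1 - target
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        bit = (i >> shift) & 1
        for out in (0, 1):
            j = (i & ~(1 << shift)) | (out << shift)
            new[j] += gate[out][bit] * amp
    return new


def apply_cnot(state, control, target):
    """Flip `target` on every basis state in which `control` is 1."""
    cs, ts = N - 1 - control, N - 1 - target
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        new[i ^ (1 << ts) if (i >> cs) & 1 else i] = amp
    return new


def measure(state, target, rng):
    """Z-basis measurement; returns (outcome, collapsed and renormalized state)."""
    shift = N - 1 - target
    p1 = sum(abs(a) ** 2 for i, a in enumerate(state) if (i >> shift) & 1)
    outcome = 1 if rng.random() < p1 else 0
    norm = math.sqrt(p1 if outcome else 1 - p1)
    return outcome, [a / norm if ((i >> shift) & 1) == outcome else 0j
                     for i, a in enumerate(state)]


def teleport(a, b, rng):
    """Teleport a|0> + b|1> from qubit 0 to qubit 2; returns (m0, m1, received)."""
    # Qubit 0 holds the input; qubits 1 and 2 share (|00> + |11>)/sqrt(2).
    state = [0j] * 8
    state[0b000] = state[0b011] = a * S
    state[0b100] = state[0b111] = b * S
    # Sender: local gates, then measurement of both of its qubits.
    state = apply_cnot(state, 0, 1)
    state = apply_gate(state, H, 0)
    m0, state = measure(state, 0, rng)
    m1, state = measure(state, 1, rng)
    # (m0, m1) is the classical message sent to the receiver.
    # Receiver: corrections conditioned on remote classical data.
    if m1:
        state = apply_gate(state, X, 2)
    if m0:
        state = apply_gate(state, Z, 2)
    base = (m0 << 2) | (m1 << 1)
    return m0, m1, (state[base], state[base | 1])


m0, m1, received = teleport(0.6, 0.8j, random.Random(1))
print(m0, m1, received)
```

Whatever the outcomes (m0, m1) are, the receiver ends up with the amplitudes of the input state. The receiver's qubit has to stay coherent until the message (m0, m1) arrives, which is the timing concern raised in the use-cases above.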

IV. DESIGN CONSIDERATIONS
In this section we review the most important design considerations and requirements that were applied when developing NetQASM. Our proposed solutions to these design considerations are presented in the next section, with more details about NetQASM as a language in the subsequent sections.
• Remote entanglement generation: One of the main differences compared to the design considerations of a quantum computing architecture is that of remote entanglement generation (see the use-case in section III). Nodes need to be able to generate entanglement with a remote node, which requires the collaboration and synchronization of both nodes, and possibly intermediate nodes, which is handled by the network stack (section II).
Further requirements arise in platforms with a limited number of communication qubits. The extreme case is nitrogen-vacancy centers in diamond which have a single communication qubit that additionally is required for performing local operations. For this reason it is not possible to decouple local gates on qubits from entanglement generation. We note the contrast with classical processors, where networking operations are typically intrinsically separate kinds of operations. For example, operations such as sending a message may simply involve moving data to a certain memory (e.g. that of a physically separate network interface), which is often abstracted as a system call.
A quantum network stack has already been proposed in [21,22], and we expect the QNPU of the end-node to implement such a stack, including a network layer that exposes an interface for establishing entanglement with remote nodes. The way in which a program creates remote entanglement should therefore be compatible with this network layer.
• Conditionals: In section III we mentioned the need to do local quantum operations conditioned on classical data that may be generated locally or by remote nodes. Such classical data include for example measurement results or information communicated to or from other nodes in the network. We distinguish between real-time and near-time conditionals [17]. Real-time conditionals are time-sensitive, such as applying a certain quantum operation on a qubit depending on a measurement outcome. For such conditionals, we would like to have fast feedback, so that qubits do not have to wait too long (which would decrease their quality). Near-time conditionals are not as sensitive to timing. For example, a program may have to wait for a classical message from a remote node, while no quantum memory is currently being used. Although it is preferably minimized, the actual waiting time does not affect the overall execution quality.
• Shared memory: As described in section II, we expect end-nodes to consist of an application layer and a QNPU. These two components have different capabilities. For example, only the application layer has the ability to do arbitrary classical communication with other nodes. Only the QNPU can do quantum operations. These restrictions lead the design in a certain way. The two components hence need to work together somehow. There needs to be a model for interaction between the two, and also for shared memory.
Executing programs on an end-node is shared by the application layer and the QNPU (see section II B). Indeed, only the QNPU can do quantum-related operations, whereas the application layer needs to do classical communication.
In order to make these work together, the two components have to share data somehow. This includes the application layer requesting operations on the QNPU, and sending the following from the QNPU to the application layer: (1) measurement outcomes of qubits, (2) information about entanglement generation, in particular a way to identify entangled pairs. This communication between application layer and QNPU needs to be done during runtime of the program. This is in contrast to local quantum computation, where one might wait until execution on the QNPU is finished before returning all data. The challenge for quantum network programs is to have a way to return data while qubits stay in memory.
• Processing delay: Since we assume that the application layer and the QNPU have to share execution of a single program, the interaction between the two layers should be efficient. Unnecessary delays lead to reduced quality (see section I). The challenge is therefore to come up with an architecture for the interaction between the application layer and the QNPU, as well as a way to let QNPU execution not take too long.
• Platform-independence: As explained in section I, hardware can have many different capabilities and gates that can be performed. However, application programmers should not need to know the details of the underlying hardware. For this reason, there needs to be a framework through which a programmer can develop an application in a platform-independent way which compiles to operations the QNPU can execute.
• Potential for optimization: Since near-term quantum hardware has a limited number of qubits and qubits have a relatively short lifetime, the hardware should be utilized in an effective way. There is therefore a need to optimize the quantum gates to be applied to the qubits. This includes for example choosing how to decompose a generic gate into native gates, rearranging the order of gates and measurements and choosing what gates to run in parallel. Since different platforms have vastly different topologies and gates that they can perform, this optimization needs to take the underlying platform into account. The challenge is to have a uniform way to express both platform-independent and platform-specific instructions.
• Multitasking: The 'Waiting time' use-case in section III describes that a node's QNPU may have to wait a long time. We consider the solution that the QNPU may do multitasking, that is, run multiple (unrelated) programs at the same time. Then, when one program is waiting, another program can execute (partly) and fill the gap. To make our design compatible with such multitasking, we need to provide a way such that programs can run at the same time as other programs, but without having to know about them.
• Ease of programming: Even though NetQASM provides an abstraction over the interaction with the QNPU, it is still low-level and hence not intended to be used directly by application developers. Furthermore, applications also contain classical code that is not intended to run on the QNPU. Therefore it should be possible to write programs consisting of both classical and quantum (network) operations in a high-level language like Python, and compile them to a hybrid quantum-classical program that uses NetQASM.
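The multitasking consideration above can be illustrated with a small timing model. The sketch below is a toy scheduler, not part of NetQASM: the Program class, the step encoding and the "first ready program runs" policy are all assumptions made for this example. Each program is a list of steps that either occupy the QNPU ("quantum") or only wait, for instance for a classical message ("wait"); durations are in arbitrary time units.

```python
from dataclasses import dataclass


@dataclass
class Program:
    name: str
    steps: list        # each step: ("quantum", duration) or ("wait", duration)
    pc: int = 0        # index of the next step to run
    ready_at: int = 0  # earliest time at which the next step may start

    def done(self):
        return self.pc >= len(self.steps)


def run_interleaved(programs):
    """Return (finish_time, trace); trace holds (start_time, name) per quantum step."""
    t, trace = 0, []
    while not all(p.done() for p in programs):
        # Waiting does not occupy the QNPU, so it starts as soon as possible.
        for p in programs:
            while not p.done() and p.ready_at <= t and p.steps[p.pc][0] == "wait":
                p.ready_at += p.steps[p.pc][1]
                p.pc += 1
        runnable = [p for p in programs if not p.done() and p.ready_at <= t]
        if runnable:
            p = runnable[0]  # assumed policy: first ready program gets the QNPU
            trace.append((t, p.name))
            t += p.steps[p.pc][1]
            p.pc += 1
            p.ready_at = t
        else:
            pending = [p.ready_at for p in programs if not p.done()]
            if not pending:
                break
            t = min(pending)  # QNPU idles until some program is ready again
    return max([t] + [p.ready_at for p in programs]), trace


programs = [
    Program("A", [("quantum", 2), ("wait", 10), ("quantum", 2)]),
    Program("B", [("quantum", 3), ("quantum", 3)]),
]
sequential = sum(duration for p in programs for _, duration in p.steps)
finish, trace = run_interleaved(programs)
print(sequential, finish, trace)
```

In this example, program B runs entirely inside the waiting time of program A, so the interleaved run finishes earlier than running the two programs back to back. Neither program needs to know about the other, which is the property that virtualized quantum memory is meant to provide.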

V. DESIGN DECISIONS
Based on the use-cases, design considerations and requirements, we have designed the low-level language NetQASM as an API to the QNPU. In this section we present concepts and design decisions we have taken. Details on the mode of execution and the NetQASM-language are presented in section VI.
A. Interface between application layer and QNPU

Execution model
As described in section II, and also in section IV, program execution is split across the application layer and the QNPU. Since the QNPU is assumed to have limited processing power (section II), our design lets the application layer do most of the classical processing. The program blocks (fig. 2) are hence spread over two separate systems: blocks of purely classical code are executed by the application layer, and blocks of quantum code (containing both quantum operations and limited classical control) are executed by the QNPU.
The quantum code (including limited classical control) is expressed using the NetQASM language. The classical code is handled completely by the application layer, and we do not impose a restriction to its format. In our implementation (section VII), we use Python. This classical code on the application layer also handles all application-level classical communication between nodes, since it cannot be done on the QNPU.
We let the application layer initiate a program. Whenever quantum code needs to be executed, the application layer delegates this to the QNPU. Since processing delay should be minimized (section IV), the communication between application layer and QNPU should be minimized. Therefore, NetQASM bundles the quantum operations together into blocks of instructions, called subroutines, to be executed on the QNPU. A program, then, consists of both classical code and quantum code, and the quantum code is represented as one or more subroutines. These subroutines can be seen as the quantum code blocks of fig. 2.
For most programs, we consider subroutines to be sent consecutively in time. However, if the QNPU supports it, NetQASM also allows multiple subroutines to be sent for execution on the QNPU at the same time, although this requires some extra care when dealing with shared memory. From the perspective of the QNPU, a program consists of a series of subroutines sent from the application layer. Before sending any subroutines, the application layer first registers a program at the QNPU. The QNPU then sets up the classical and quantum memories (see below) for this program. Then, the application layer may send subroutines to the QNPU for execution.
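The register-then-send pattern described above, and the reason for bundling instructions into subroutines, can be sketched as follows. The RecordingQNPU class and its methods are hypothetical stand-ins that only count messages; they are not the interface of any NetQASM implementation, and the instruction strings are opaque placeholders rather than NetQASM syntax.

```python
class RecordingQNPU:
    """Toy QNPU that records instructions and counts messages it receives."""

    def __init__(self):
        self._programs = {}         # program id -> list of executed instructions
        self.messages_received = 0

    def register_program(self):
        """Set up per-program state and return a program id."""
        self.messages_received += 1
        pid = len(self._programs)
        self._programs[pid] = []
        return pid

    def execute_subroutine(self, pid, instructions):
        """Execute one subroutine; the program must have been registered first."""
        self.messages_received += 1
        if pid not in self._programs:
            raise KeyError(f"program {pid} is not registered")
        self._programs[pid].extend(instructions)

    def executed(self, pid):
        return list(self._programs[pid])


ops = ["op-1", "op-2", "op-3"]  # placeholders for quantum instructions

# One subroutine carrying all three instructions.
bundled = RecordingQNPU()
pid = bundled.register_program()
bundled.execute_subroutine(pid, ops)

# The same instructions sent one message at a time.
unbundled = RecordingQNPU()
pid2 = unbundled.register_program()
for op in ops:
    unbundled.execute_subroutine(pid2, [op])

print(bundled.messages_received, unbundled.messages_received)
```

Both toy QNPUs execute the same instructions, but the bundled variant needs fewer exchanges between the application layer and the QNPU, which is the processing-delay argument made in the text.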

Shared classical memory
Since classical and quantum blocks in the code (as per fig. 2) can depend on each other, the application layer and the QNPU need to have a way to communicate information to each other. For example, a subroutine may include a measurement instruction; the outcome of this measurement may be used by the application layer upon completion of the subroutine. Therefore, NetQASM uses a shared memory model such that conceptually both layers can access and manipulate the same data. This solves the need to return data, and to do conditionals (section IV).
Each program has a classical memory space consisting of registers and arrays. Registers are the default way of storing classical values, like a measurement outcome. In the example of the application layer needing a measurement outcome, there would be an instruction in the subroutine saying that a measurement outcome needs to be placed in a certain register. The application layer can then access this same register (since they share the memory space) and use it in further processing. The number of registers is small, and constant for each program. Arrays are collections of memory slots (typically the slots are contiguous), which can be allocated by the program at runtime. Arrays are used to store larger chunks of data, such as parameters for entanglement requests, entanglement generation results, or multiple measurement outcomes when doing multiple entangle-and-measure operations. The application layer may only read from the shared memory; writing to it can only be done by issuing NetQASM instructions such as set (for registers) and store (for arrays). The QNPU may directly write to the shared memory, for example when entanglement generation has finished and it writes the results to the array specified by the program.
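The access rules of the shared classical memory can be made concrete with a minimal sketch. The classes below are hypothetical and only model the rules stated above: a fixed set of registers, arrays allocated at runtime, read-only access for the application layer, and writes that happen only when the QNPU executes instructions. The instruction names set and store follow the text; the tuple encoding, the "array" allocation instruction and the register count of 16 are assumptions of this example.

```python
class SharedMemory:
    NUM_REGISTERS = 16  # assumed; the text only says "small, and constant"

    def __init__(self):
        self._registers = [0] * self.NUM_REGISTERS
        self._arrays = {}  # address -> list of slots, allocated at runtime

    def read_register(self, reg):
        return self._registers[reg]

    def read_array(self, address):
        return tuple(self._arrays[address])

    def execute(self, instruction):
        """QNPU side: the only place where memory is written."""
        op, *args = instruction
        if op == "set":        # ("set", register, value)
            reg, value = args
            self._registers[reg] = value
        elif op == "array":    # ("array", address, length)
            address, length = args
            self._arrays[address] = [None] * length
        elif op == "store":    # ("store", register, address, index)
            reg, address, index = args
            self._arrays[address][index] = self._registers[reg]
        else:
            raise ValueError(f"unknown instruction: {op}")


class AppLayerView:
    """Application-layer handle: exposes reads only."""

    def __init__(self, memory):
        self._memory = memory

    def register(self, reg):
        return self._memory.read_register(reg)

    def array(self, address):
        return self._memory.read_array(address)


def qnpu_run_subroutine(memory, instructions):
    for instruction in instructions:
        memory.execute(instruction)


memory = SharedMemory()
app = AppLayerView(memory)
qnpu_run_subroutine(memory, [("array", 0, 2), ("set", 0, 1), ("store", 0, 0, 1)])
print(app.register(0), app.array(0))
```

The application layer changes memory only indirectly, by placing set or store instructions in a subroutine, while it can read the results directly once the subroutine has run.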

Unit modules
In order to support systems with multitasking (section IV), NetQASM provides a virtualized model of the quantum memory to the program. This allows the QNPU to do mapping between the virtualized memory and the physical memory and perform scheduling between programs. The quantum memory for a program is represented by a unit module (fig. 6). A unit module defines the topology of the available qubits (which qubits are connected, i.e. on which qubit pairs a two-qubit gate can be executed), plus additional information on each qubit. This additional information consists of which gates are possible on which qubit or qubit pair. It also specifies if a qubit can be used for remote entanglement generation or not. The extra information is needed since on some platforms, not all qubits can be used for entanglement generation and different qubits may support different local gates. For example, in a single NV-centre, there is only one communication qubit and any additional qubits are storage qubits. Also, the communication qubit can do different local gates than the storage qubits.
FIG. 5: ... In the case of hybrid-quantum computing, qubits are reset in between circuits (in e.g. QASM). For quantum internet programs the qubits should on the other hand be kept in memory, since they might be entangled with another node and intended to be used further.
A single program has a single quantum memory space which, in contrast with quantum computing, is not reset at the end of a subroutine. This allows the application layer to do processing while qubits are in memory. The following sequence of operations provides an example. (1) The application layer first sends a subroutine containing instructions for entanglement generation with a remote node R. (2) The QNPU has finished executing the subroutine, and informs the application layer about it. There is now a qubit in the program's memory that is entangled with some qubit in R. (3) The application layer does some classical processing and waits for a classical message from (the application layer of) R. (4) Based on the contents of the message, the application layer sends a new subroutine to the QNPU containing instructions to do certain operations on the entangled qubit. The subroutine can indeed access this qubit by using the same identifier as the first subroutine, since the quantum memory is still the same. We note the contrast with (non-network) quantum computing, where quantum memory is reset at the end of each block of instructions (fig. 5).
Unit modules contain virtual qubits. This is because of the requirement that it should be possible to run multiple programs at the same time on a single QNPU (section IV). Qubits in the unit module are identified by a virtual ID. The QNPU maps virtual IDs to physical qubits. A program hence uses a virtual memory space (the unit module), and does not have to know about the physical memory.
FIG. 6: Example of a unit-module topology on a platform using nitrogen-vacancy centers in diamond. A unit-module is a hypergraph [65], with associated information on both nodes and edges. Each node represents a virtual qubit, containing information about (1) its qubit type (communication or storage), (2) physical properties of the qubit, such as decoherence times and (3) which single-qubit gates are supported on the qubit, together with their duration and noise. Each edge represents the possibility of performing joint operations on those qubits, such as two-qubit gates, and also contains information about gate durations and noise.
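As an illustration of the information a unit module carries, the following Python sketch models virtual qubits as nodes and allowed joint operations as edges. It is a simplification under stated assumptions: edges are restricted to qubit pairs rather than general hyperedges, durations and noise are omitted, and all class names, field names and gate names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualQubit:
    virtual_id: int
    is_communication: bool        # communication or storage qubit
    gates: frozenset              # supported single-qubit gates (names are placeholders)


@dataclass
class UnitModule:
    qubits: dict = field(default_factory=dict)   # virtual ID -> VirtualQubit
    edges: dict = field(default_factory=dict)    # frozenset({a, b}) -> two-qubit gates

    def add_qubit(self, qubit):
        self.qubits[qubit.virtual_id] = qubit

    def connect(self, a, b, gates):
        self.edges[frozenset((a, b))] = frozenset(gates)

    def can_entangle(self, virtual_id):
        return self.qubits[virtual_id].is_communication

    def supports(self, gate, *virtual_ids):
        if len(virtual_ids) == 1:
            return gate in self.qubits[virtual_ids[0]].gates
        return gate in self.edges.get(frozenset(virtual_ids), frozenset())


# NV-like example: one communication qubit (ID 0) and two storage qubits.
um = UnitModule()
um.add_qubit(VirtualQubit(0, True, frozenset({"rot_x", "rot_y"})))
um.add_qubit(VirtualQubit(1, False, frozenset({"rot_z"})))
um.add_qubit(VirtualQubit(2, False, frozenset({"rot_z"})))
um.connect(0, 1, {"two_qubit_gate"})
um.connect(0, 2, {"two_qubit_gate"})
```

A compiler at the application layer could query such a structure to decide, for instance, that a two-qubit gate between the two storage qubits is not directly available.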

Instructions
As explained in section V A, the application layer delegates quantum code (including limited classical control) of the program to the QNPU by creating blocks of instructions and sending these to the QNPU for execution. These blocks are called subroutines and contain NetQASM instructions. Since the QNPU is meant to be limited in processing power, the instruction set that it interprets should also be simple and low-level. The NetQASM instruction set contains instructions for simple arithmetic, classical data manipulation, and simple control flow in the form of (un)conditional branch instructions. Although conditional control-flow can be done at the application layer as well, NetQASM branching instructions allow for much faster feedback since they are executed by the QNPU, and hence cover the design consideration of real-time conditionals (section IV). There are no higher-level concepts such as functions or for-loops, which would require more complicated and resource-demanding parsing for the QNPU, such as constructing an abstract syntax tree.
A single instruction specifies an operation, possibly acting on classical or quantum data. For example, a single-qubit rotation gate is represented as an instruction containing the type of gate, the classical register containing the rotation angle, and the classical register containing the virtual ID of the qubit (as specified in the unit module) to act on. NetQASM specifies a set of core instructions that are expected to be implemented by any QNPU. These include classical instructions like storing and loading classical data, branching, and simple arithmetic. Different hardware platforms support different quantum operations. NetQASM should also support platform-specific optimization (section IV). Therefore, NetQASM uses flavors of quantum instructions (section V B 3). The vanilla flavor consists of a universal set of platform-independent quantum gates. Particular hardware platforms, such as the NV-centre, may use a special NV flavor, containing NV-specific instructions. A QNPU implementation may use a custom mapping from vanilla instructions to platform-specific ones. The instructions in a flavor are also called a software-visible gate set [31]. See appendix F for more details on NetQASM instructions.

Remote entanglement generation
Generating entanglement with a remote node is also specified by instructions. These are however somewhat special compared to other instructions. First, entanglement generation has a non-deterministic duration. Therefore, when an entanglement instruction is executed, the request is forwarded to the part of the system responsible for creating entanglement, but the instruction itself immediately returns. A separate wait instruction can be used to block until entanglement generation has actually completed. Second, entanglement generation requests should be compatible with the network stack proposed in [21], including the network layer from [22]. These requests need to be accompanied by information such as the number of EPR pairs to generate or the minimum required fidelity. Third, this information should be able to depend on runtime information. For example, the required fidelity may depend on an earlier measurement outcome. Therefore, entanglement generation parameters cannot be static data, and must be stored in arrays. Furthermore, the result of entanglement generation with the remote node consists of a lot of information, such as which Bell state was produced, the time it took, and the measurement results in case of measuring directly. This information is written by the QNPU to an array which is specified by the entanglement instruction. Finally, since writing the information to the array indicates that entanglement generation succeeded, the wait instruction can be used to wait until a certain array is filled in, such as the one provided by the entanglement instruction. Since the entanglement instruction is non-blocking, it is possible to continue doing local operations while waiting for entanglement generation to complete. We assume that the QNPU implements a network stack where connections need to be set up between remote nodes before entanglement generation can happen [21,22].
NetQASM provides a way for programs to open such connections in the form of EPR sockets. The application layer can ask the QNPU to open an EPR socket with a particular remote node. The QNPU is expected to set up the required connections in the network stack, and associates this program socket with the connection. When the program issues an instruction for generating entanglement, it refers to the EPR socket it wants to use. Based on this, the QNPU can use the corresponding connection in the network.
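The interplay between EPR sockets, the non-blocking entanglement instruction and the wait instruction can be sketched as follows. This is a toy model only: the class ToyQNPU, its method names, the array layout (number of pairs at index 0, minimum fidelity at index 1) and the placeholder result value are assumptions made for illustration and do not describe the actual QNPU or network stack.

```python
class ToyQNPU:
    """Toy stand-in for a QNPU handling entanglement requests."""

    def __init__(self):
        self.arrays = {}    # address -> list of values (None means "not filled in")
        self.sockets = {}   # EPR socket ID -> remote node name
        self.pending = []   # queued entanglement requests

    def open_epr_socket(self, socket_id, remote_node):
        self.sockets[socket_id] = remote_node

    def allocate_array(self, address, length):
        self.arrays[address] = [None] * length

    def create_epr(self, socket_id, params_addr, results_addr):
        """Non-blocking: queue the request and return immediately."""
        if socket_id not in self.sockets:
            raise KeyError("EPR socket is not open")
        self.pending.append((socket_id, params_addr, results_addr))

    def is_filled(self, address):
        """Condition a wait instruction would block on."""
        return all(value is not None for value in self.arrays[address])

    def network_stack_step(self):
        """Stand-in for the network stack completing one queued request."""
        _socket_id, params_addr, results_addr = self.pending.pop(0)
        num_pairs = self.arrays[params_addr][0]
        for i in range(num_pairs):
            self.arrays[results_addr][i] = "result_placeholder"


qnpu = ToyQNPU()
qnpu.open_epr_socket(0, "R")
qnpu.allocate_array(0, 2)
qnpu.arrays[0][:] = [1, 0.8]      # runtime parameters: 1 pair, minimum fidelity 0.8
qnpu.allocate_array(1, 1)         # results array, one slot per pair
qnpu.create_epr(0, params_addr=0, results_addr=1)
ready_before = qnpu.is_filled(1)  # request issued, but nothing generated yet
qnpu.network_stack_step()
ready_after = qnpu.is_filled(1)   # results written, a wait would now return
```

Because create_epr returns at once, a program can issue local operations between the request and the point where it checks (or waits on) the results array.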

Flavors
We want to keep NetQASM platform-independent. However, we also want the potential for platform-specific optimization (section IV). Therefore we introduce the concept of flavors. Flavors only affect the quantum instruction set of the language, and not the memory model or the interaction with the QNPU. We use the vanilla or generic flavor for a general, universal gate set. Subroutines may be written or generated in this vanilla flavor. Platform-independent optimization may be done on this level. A QNPU may directly support executing vanilla-flavored NetQASM. Platform-specific translations may then be done by the QNPU itself. It can also be that a QNPU only supports a specific flavor of NetQASM. A reason for this could be that the QNPU does not want to spend time translating the instructions at runtime. In this case, the application layer should perform a translation step from the vanilla flavor to the platform-specific flavor. In such a case, the vanilla flavor can be seen as an intermediate representation, and the translation to a specific flavor as a back-end compilation step.
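A translation step from a vanilla-style gate to a platform-specific gate set can be sketched as a table lookup. The rule table below is hypothetical and is not the NV flavor of NetQASM: it assumes a toy platform that only offers rotations around X and Y, and it uses the standard identity that a Hadamard equals a Y-rotation by π/2 followed by an X-rotation by π, up to a global phase. The small matrix check at the end verifies that identity numerically.

```python
import math


def rot_x(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def rot_y(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical translation rules: vanilla gate -> list of (toy gate, angle).
TOY_RULES = {
    "h": [("rot_y", math.pi / 2), ("rot_x", math.pi)],
    "x": [("rot_x", math.pi)],
}


def translate(vanilla_subroutine):
    """Translate (gate, qubit) pairs into (toy gate, qubit, angle) triples."""
    out = []
    for gate, qubit in vanilla_subroutine:
        for name, angle in TOY_RULES[gate]:
            out.append((name, qubit, angle))
    return out


translated = translate([("h", 0)])

# Gates applied later multiply from the left: U = Rx(pi) * Ry(pi/2).
u = matmul(rot_x(math.pi), rot_y(math.pi / 2))
r = 1 / math.sqrt(2)
hadamard = [[r, r], [r, -r]]
# U equals the Hadamard up to the global phase -i.
matches = all(abs(u[i][j] - (-1j) * hadamard[i][j]) < 1e-12
              for i in range(2) for j in range(2))
```

Whether such a table is applied by the application layer ahead of time or by the QNPU at runtime is exactly the choice discussed above.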

Programmability
Since the NetQASM instructions are relatively low-level, we would like to have a higher-level programming language for writing programs that is automatically compiled to NetQASM. We introduce a higher-level SDK in section VII. However, we do not see this as part of the NetQASM specification itself. This decoupling allows the development of SDKs to be independent such that these can be provided in various languages and frameworks.
We still want NetQASM instructions to be suitable for manual writing and inspection. Therefore, instructions (and subroutines) have two formats: a binary one that is used when sending to the QNPU, and a text format that is human-readable. The text format resembles assembly languages including OpenQASM. Examples are given in section VII A and the Appendix.

A. Interface between application layer and QNPU
Here we explain the flow of messages between the application layer and the QNPU. The application layer starts by declaring the registration of an application, including resource requirements for the application. After this, the application layer sends some number of subroutines for the QNPU to execute before declaring the application is finished. See fig. 8 for a sequence diagram and below for a definition of the messages. In section VI B we will describe in more detail the content of the subroutines and the format of instructions. The QNPU returns to the application layer an assigned application ID for the registered application and returns data based on the subroutines executed.
The application layer and the QNPU are assumed to run independently and in parallel. For example, while a subroutine is being executed by the QNPU, the application layer could in principle do other operations, such as heavy processing or communication with another node. Figure 8 shows an example of a message exchange between the application layer and the QNPU. The content of these messages is further detailed in appendix A.

B. The language
The syntax and structure of NetQASM resemble that of classical assembly languages, which in turn inspired the various QASM-variants for quantum computing [37][38][39][40].
A NetQASM instruction is formed by an instruction name followed by some number of operands:
instr operands
where instr specifies the instruction, for example add to add numbers or h to perform a Hadamard. The operands part consists of zero or more values that specify additional information about the instruction, such as which qubit to act on in the case of a gate instruction. Instructions and operands are further specified in appendix B.

C. Instructions
There are eight groups of instructions in the core of NetQASM. Also summarized in fig. 9, these are:
• Classical: Classical arithmetic on integers.
• Branch: Branching operations for performing conditional logic.
• Memory: Read and write operations to classical memory (register and arrays).
• Allocate: Allocation of qubits and arrays.
• Wait: Waiting for certain events. This can for example be the event that entanglement has been generated by the network stack.
• Return: Returning classical values from the QNPU to the application layer. In our implementation we implement this by having the QNPU write to the shared memory so that the application layer can access it.
• Entanglement: Creating entanglement with a remote node using the quantum network stack.
Quantum gates are specific to a NetQASM flavor and given as a set of software-visible gates of a given platform, see section IV. There is a single platform-independent NetQASM flavor which we call the vanilla flavor, see fig. 9. The vanilla flavor can be used as an intermediate representation for a compiler.

D. Compilation
Although application programmers could write NetQASM subroutines manually, and let their (classical) application code send these subroutines to the QNPU, it is useful and more user-friendly to be able to write quantum internet applications in a higher-level language, and have the quantum parts compiled to NetQASM subroutines automatically. For this, we use the compilation steps depicted in fig. 10. The format and compilation of the higher-level programming language is not part of the NetQASM specification. However, we do provide an implementation in the form of an SDK, see section VII.
FIG. 10 (caption fragment): ... from higher-level programming language, to the NetQASM flavor exposed by the specific platform. What is contained at each level is further specified to the right of the diagram.

VII. PYTHON SDK
We implemented NetQASM by developing a Software Development Kit (SDK) in Python. This SDK allows a programmer to write quantum network programs as Python code, including the quantum parts. These parts are automatically translated to NetQASM subroutines. The SDK contains a simulator that simulates a quantum network containing end-nodes, each with a QNPU. The SDK can execute programs by executing their classical parts directly and executing the quantum parts as NetQASM subroutines on the simulated QNPU. By executing multiple programs at the same time, on the same simulated network, a whole multi-partite application can be simulated. In section VIII we use this SDK to evaluate some of the design decisions of NetQASM.
We refer to the docs at [24] for the latest version of the SDK. Below, we give an example of an application written in the SDK to give an idea of what development in the SDK looks like. In appendix H 2 we provide a few more examples of applications in the SDK and their corresponding NetQASM subroutines.

A. SDK
The SDK of NetQASM uses a similar framework to the SDK used by the predecessor CQC [74]. Any program on a node starts by setting up a NetQASMConnection to the QNPU-implementation in the backend. The NetQASMConnection encapsulates all communication that the application layer does with the QNPU. More information about supported backends can be found below in section VII A 1. Using the NetQASMConnection one can for example construct a Qubit object. The Qubit object has methods for performing quantum gates and measurements. When these methods are called, corresponding NetQASM instructions are included in the current subroutine being constructed. One marks the end of a subroutine, and the start of another, either by explicitly calling flush on the NetQASMConnection or by ending the scope of the with NetQASMConnection ... context.
The following Python code shows a basic application written in the NetQASM SDK. The application will be compiled into a single subroutine executed on the QNPU, which creates a qubit, performs a Hadamard operation, measures the qubit and returns the result to the application layer. The following NetQASM subroutine is the result of translating the above Python code to NetQASM of the vanilla (platform-independent) flavor.
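The mechanism described in this section, where method calls on a qubit object add instructions to the subroutine under construction and the end of the with-scope flushes it, can be mimicked with a small stand-in. The classes ToyConnection and ToyQubit below are hypothetical and are not the NetQASM SDK; apart from h, the instruction mnemonics are placeholders chosen for this sketch.

```python
class ToyConnection:
    """Collects instructions into subroutines and 'sends' them on flush."""

    def __init__(self, name):
        self.name = name
        self._current = []      # subroutine currently being constructed
        self.sent = []          # subroutines handed to the (imaginary) QNPU
        self._next_qubit = 0

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.flush()            # leaving the scope ends the current subroutine
        return False

    def new_qubit_id(self):
        qubit_id = self._next_qubit
        self._next_qubit += 1
        return qubit_id

    def add(self, instruction):
        self._current.append(instruction)

    def flush(self):
        if self._current:
            self.sent.append(self._current)
            self._current = []


class ToyQubit:
    def __init__(self, conn):
        self._conn = conn
        self.qubit_id = conn.new_qubit_id()
        conn.add(f"create_qubit Q{self.qubit_id}")      # placeholder mnemonic

    def H(self):
        self._conn.add(f"h Q{self.qubit_id}")

    def measure(self):
        self._conn.add(f"measure Q{self.qubit_id} M{self.qubit_id}")  # placeholder


with ToyConnection("alice") as conn:
    q = ToyQubit(conn)
    q.H()
    q.measure()
```

After the with-block, conn.sent holds a single subroutine with three instructions, matching the description of one subroutine that creates a qubit, applies a Hadamard and measures.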

Backends
As mentioned above, the NetQASMConnection in the SDK is responsible for communicating with the implemented QNPU in the backend. The backend can either be a simulator or an actual QNPU using real quantum hardware. Currently supported backends are the simulators SquidASM [66] (using NetSquid [26,60]) and SimulaQron [27]. A physical implementation of QNPU running on quantum hardware is being worked on at the time of writing. Using the SDK provided at [24], one can for example simulate a set of program files for the nodes of a quantum network on NetSquid using a density matrix formalism with the command:
netqasm simulate --simulator=netsquid --formalism=dm
For more details see the docs at [24].

VIII. EVALUATION
We evaluate two of the design choices that we made for NetQASM: (1) exposing unit-modules to the application layer and (2) adding the possibility to use platform-specific flavors of instructions. For both elements we study the difference in including them in NetQASM versus not including them. We do this by simulating a teleportation application and a blind quantum computation application. These examples also showcase the ability of NetQASM to express general quantum internet applications.
We have implemented a simulator, called SquidASM [66], that simulates a network in which end-nodes have the internal architecture as described in section II, that is, with an application layer and a QNPU. The simulator internally uses NetSquid [60], which was made specifically for the simulation of quantum networks. SquidASM executes programs written using the SDK (section VII), including sending NetQASM subroutines to the (simulated) QNPU.
We evaluate the performance of NetQASM by looking at the runtime quality of two applications, both consisting of two programs (one per node). The first is a teleportation of a single qubit from a sender node to a receiver node. We define the quality as the fidelity between the original qubit state at the sender and the final qubit state at the receiver. The second application is a blind computation protocol which involves a client and a server. The server effectively performs, blindly, a single-qubit computation on behalf of the client. The protocol is a so-called verifiable blind quantum computation [69]. This means that some of the rounds of the protocol are trap rounds. We define the quality that we evaluate as the error rate of these trap rounds, since this indicates the blindness of the server.
We run these applications on SquidASM, where we simulate realistic quantum hardware. Specifically, we simulate nodes based on nitrogen-vacancies (NV) in diamond, that can do heralded entanglement generation between each other. The simulated hardware uses noise models that are also used in [26]. For more details, see appendix I.

A. Unit modules
We ask ourselves the question whether it pays off to expose unit modules, that is, a qubit topology with gate-and entanglement information. Specifically, we want to know if there are situations where knowing the unit module gives the application layer an opportunity to optimize the application in a way that is not possible when not knowing the unit module. If so, we are interested in how much advantage this gives (in terms of the runtime quality defined above).
In the next section we show that there are indeed situations where knowledge of the unit module is advantageous. It can be that the order in which NetQASM instructions are issued in a subroutine is sub-optimal, since virtual qubit IDs may be mapped in such a way that the QNPU has to move virtual qubits to different physical qubits in order to execute the instructions. If the application layer does not know this mapping, it cannot know that the instructions are ordered sub-optimally. With knowledge of the unit module, on the other hand, the application layer can optimize the order and the overall application performance is improved.
We consider a teleportation application where a sender program teleports a single qubit to another receiver program. It is assumed that the underlying platform is based on nitrogen-vacancy centers in diamond (NV), and we use well-established models for both the noise and operations supported on such platforms, see appendix I. The sender program uses two qubits: one to create entanglement with the receiver (qubit E), and one to send (teleport) to the receiver (qubit T). At some point, the sender measures both qubits, after which it sends the outcomes to the receiver so that it can do the relevant corrections on its received qubit. We assume that the sender program is written in a higher-level language, like in our SDK (section VII A), and in such a way that it first issues a measurement operation on qubit T, and then on E. However, due to the differences in characteristics of the physical qubits, as will be explained below, it is more efficient to first do the measurement on E, and then on T. Now we consider two scenarios, namely
• Unit-modules (UM). We assume that the sender program is written and executed on a software stack implementing NetQASM, which means that the application's view of its quantum working memory is in the form of a unit module. This unit module contains information about the above-mentioned hardware restrictions, and therefore a compiler can take advantage of it by re-ordering the measurement operations while generating the NetQASM subroutines to be sent to the QNPU.
• No unit-modules (NUM). In this case the software stack also implements NetQASM, but without unit modules. Specifically, the application sees its quantum memory as just a number of uniform qubits. Therefore, a compiler for this application does not know about the hardware restrictions, and will construct the NetQASM subroutines sent to the QNPU without doing any optimization, leaving the order of the operations as specified in the high-level SDK.
Let's first go through the steps of the teleportation application:
sender:
1. Initialize qubit q_t to be teleported in a Pauli state.
2. Create entanglement with receiver using qubit q_s.
3. Perform CNOT gate with q_t as control and q_s as target.
4. Perform Hadamard gate on q_t.
5. Measure qubit q_t and store outcome as m_1.
6. Measure qubit q_s and store outcome as m_2.
7. Send m_1 and m_2 to receiver.
receiver:
1. Receive entanglement with sender using qubit q_r.
2. Receive measurement outcomes from sender.
3. Apply correction operations on q_r based on measurement outcomes.
We will now consider the order of the steps of the sender. Firstly, we assume that the qubit to be teleported, q_t, is always created before the entanglement. We motivate this assumption below. For this reason, steps 1-3 and 7 are fixed and cannot change. However, we are free to do step 6 before steps 4 and 5, since these single-qubit operations and measurements commute, as long as we are consistent with the outcomes m_1 and m_2. Let's now consider what impact the decision of measuring q_s before q_t or not has on the quality of execution for an NV-platform.
One of the biggest restrictions on an NV-platform is the topology of the qubits. In particular, the NV-platform has a single communication qubit (electron) surrounded by some number of storage qubits (carbon spins), see for example fig. 6. The single communication qubit is not only responsible for any remote entanglement generation but also for any two-qubit gate, and is the only qubit that can be directly measured. These restrictions require qubit states to be moved back and forth between the communication qubit and the storage qubits in order to free up the communication qubit, to create new entanglement or to measure another qubit. Since the operation of moving a qubit state is relatively slow on this platform (up to a millisecond [7]) and adds noise to the qubits, it is important to try to minimize the number of moves needed. For more details on the NV-platform, see for example [61] or [21].
In the steps of the sender above, the communication qubit is first initialized to a Pauli state. This state is then moved to a storage qubit to free up the communication qubit in order to create entanglement with the receiver. Then in step 5, q_t should be measured, which is currently in the storage qubit. This requires the qubit state to first be moved to the communication qubit. However, at this point the communication qubit is occupied by the entangled qubit q_s, which therefore first needs to be moved to a second storage qubit. Qubit q_t can then be moved to the communication qubit to be measured, and then the same is done for q_s, requiring in total four move operations and three physical qubits.
We can now see that performing step 6 before steps 4 and 5 has the advantage that q_s is already in the communication qubit and can be measured directly without moving it first. Afterwards, q_t can be moved to the communication qubit, which is cleared after the measurement, requiring in total only two move operations and only two physical qubits. The decision of performing step 6 before steps 4 and 5 is highly dependent on the NV-platform and can only be made by a compiler that is aware of these restrictions. The inclusion of unit-modules and qubit types in the NetQASM-framework, which are exposed to the compiler at the application layer, allows for these optimization decisions and can therefore improve the quality of execution.
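The move counts for the two measurement orders can be reproduced with a small bookkeeping model. The sketch below is not the simulator used for the evaluation; ToyNVNode and count_moves are hypothetical names, and the model assumes that only initialization, entanglement and measurement force a qubit onto the communication qubit, so the CNOT and Hadamard steps are left out because they cause no moves in this model.

```python
class ToyNVNode:
    """Toy NV node: one communication qubit plus storage qubits."""

    def __init__(self):
        self.comm = None     # logical qubit currently on the communication qubit
        self.storage = {}    # storage slot index -> logical qubit
        self.moves = 0       # number of move operations performed
        self.used = set()    # physical qubits touched so far

    def _free_comm(self):
        """Move whatever sits on the communication qubit to a free storage slot."""
        if self.comm is None:
            return
        slot = 0
        while slot in self.storage:
            slot += 1
        self.storage[slot] = self.comm
        self.used.add(f"storage{slot}")
        self.comm = None
        self.moves += 1

    def create(self, qubit):
        """Initialize or entangle a qubit; both need the communication qubit."""
        self._free_comm()
        self.comm = qubit
        self.used.add("comm")

    def measure(self, qubit):
        """Measure a qubit; only the communication qubit can be measured."""
        if self.comm != qubit:
            self._free_comm()
            slot = next(s for s, q in self.storage.items() if q == qubit)
            del self.storage[slot]
            self.comm = qubit
            self.moves += 1
        self.comm = None     # the communication qubit is free after measuring


def count_moves(measure_order):
    node = ToyNVNode()
    node.create("q_t")       # step 1
    node.create("q_s")       # step 2, which first moves q_t to storage
    for qubit in measure_order:
        node.measure(qubit)
    return node.moves, len(node.used)


num = count_moves(["q_t", "q_s"])   # NUM: step 5, then step 6
um = count_moves(["q_s", "q_t"])    # UM: step 6 before steps 4 and 5
```

In this model the NUM order gives four moves on three physical qubits and the UM order gives two moves on two physical qubits, in line with the counts derived in the text.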
For the two scenarios we consider, i.e. performing step 6 before steps 4 and 5 (Unit modules (UM)) or not (No unit-modules (NUM)), we check the average fidelity of the teleported state as a function of the gate noise (fig. 11a), as well as the average fidelity and execution time as a function of gate duration (fig. 11a), of the native two-qubit gate of the NV-platform. We see that performing step 6 before steps 4 and 5 improves both total execution time and average fidelity. This can be explained by the fact that using unit modules allowed a compiler to produce NetQASM code containing fewer two-qubit gates. Therefore, an increase in two-qubit gate noise leads to a lower fidelity. Also, an increase in two-qubit gate duration leads to a higher execution time difference between the two scenarios. Finally, fig. 11a shows that the two-qubit gate duration does not affect the final fidelity in this situation, but the difference between using unit modules versus not using them remains.

B. Flavors
While aiming to let NetQASM be mostly platform-independent, we did also choose to allow platform-specific instructions, bundled in flavors. The idea is that this allows for platform-specific optimization leading to better application performance. Here we evaluate if flavors really impact potential performance, and if so how much.
FIG. 12: Circuit representation of the simulated BQC application. The client remotely prepares two qubits on the server, by twice creating an entangled pair with the server followed by a local measurement. The server locally entangles its two qubits (cphase gate). Then, the client and server use classical communication to further guide the server's quantum operations. The client computes δ_1 = α − θ_1 + p_1 · π and sends this to the server. The server uses the received value to do a local rotation and later sends measurement outcome m_1 back to the client. The client then sends δ_2 = (−1)^{m_1} · (β − θ_2 + p_2 · π) to the server. The qubit state q is the result of this application.
We show that platform-specific optimization can indeed improve application performance, and that there are such optimizations that are not possible without flavors. We see that it has impact mostly on the execution time, but not necessarily on outcome quality.
We consider the blind computation application depicted in fig. 12, where both the client and server node implement the NV hardware. Again we compare two scenarios, in this case:
• Vanilla. We compile both the client's and server's application code to NetQASM subroutines with the vanilla flavor. The QNPU, controlling NV hardware which does not implement all vanilla gates natively, needs to translate the vanilla instructions on the go. We assume this translation is ad-hoc and does not do any optimizations like removing redundant gates.
• NV. The code is compiled to NetQASM subroutines containing instructions in the NV flavor, and redundant gates are optimized away. The QNPU can directly execute the instructions on the hardware.
We implemented this by writing two separate programs in the SDK, one for the client and one for the server. The SDK automatically compiles the relevant parts of these programs into NetQASM subroutines. Classical communication (values δ_1, m_1 and δ_2) is done purely between the two simulated application layers, so these operations are not compiled to NetQASM subroutines. More details about the simulation can be found in appendix I.
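The classical part of the client in fig. 12 amounts to evaluating the two angle formulas from the caption. A minimal Python sketch follows; the function names are hypothetical, the symbols are those of fig. 12, and the numerical inputs are arbitrary example values rather than values used in the evaluation.

```python
import math


def first_angle(alpha, theta1, p1):
    """delta_1 = alpha - theta_1 + p_1 * pi, sent to the server first."""
    return alpha - theta1 + p1 * math.pi


def second_angle(beta, theta2, p2, m1):
    """delta_2 = (-1)^{m_1} * (beta - theta_2 + p_2 * pi).

    This can only be computed after the server has returned the outcome m_1.
    """
    return (-1) ** m1 * (beta - theta2 + p2 * math.pi)


delta1 = first_angle(alpha=math.pi / 2, theta1=math.pi / 4, p1=1)
delta2 = second_angle(beta=math.pi, theta2=math.pi / 2, p2=0, m1=1)
```

Splitting the computation into two functions reflects the message order: δ_2 depends on m_1, so the client has to wait for the server's reply between the two steps.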
The protocol is a verifiable blind quantum protocol [69], which means that the circuit in fig. 12 is run multiple times, namely once per round. Some of these rounds are trap rounds in which the client chooses a special set of input values. Such a trap round can either succeed or fail, depending on the values returned by the server. The fraction of trap rounds that fail is called the error rate. The error rate should stay low in order for the computation to be blind.
We simulate the BQC application by running the client's and server's programs in SquidASM. We look at the error rate of the trap round as a function of the two-qubit gate noise. The result can be seen in fig. 13. It can be seen that using the NV flavor provides a better (lower) error rate than using the vanilla flavor. This can be explained by noting that NetQASM instructions in the vanilla flavor are mapped ad-hoc to native NV gates by the QNPU at runtime, which leads to more two-qubit gates in total.

C. Relation to other results
We note that a similar question of how many physical details to expose from lower-level layers (in our case the QNPU) to higher-level layers (in our case the application layer) has also been evaluated in [31]. Their conclusion is that exposing and leveraging some of these details can indeed improve certain program success metrics. That result agrees with ours, which shows that program execution quality can improve by exposing and leveraging unit modules and platform-specific NetQASM flavors.

IX. CONCLUSION
NetQASM enables the development of quantum internet applications in a platform-independent manner. It solves the question of dealing with the complexity of having both classical and quantum operations in a single program, while at the same time providing a relatively simple format for QNPU-like layers to handle. Multiple applications, such as remote teleportation and blind quantum computation, have already been implemented. A simple compiler has been implemented that can translate code written in the higher-level SDK into NetQASM.
In addition to the work in this paper, we are also developing a physical implementation of the QNPU. One key component in this implementation is the Quantum Node Operating System (QNodeOS), which acts as the bridge between the applications and the physical layer. QNodeOS will be presented in a dedicated paper including results of a first integration test between NetQASM, QNodeOS and underlying physical quantum hardware. This will mark the first time a quantum network node has been programmed using platform-independent code.
• RegisterAppOK: Returned from the QNPU when application is registered, containing an application ID to be used for future messages.
• RegisterAppErr: Returned from the QNPU when registration of application failed. For example if required resources could not be met.
-error_code: Error code specifying what went wrong.
• Subroutine: Message from the application layer to the QNPU, containing a subroutine to be executed. Details on the content are presented in later sections.
-subroutine: The subroutine to be executed.
• Done: Message from the QNPU to the application layer, indicating that a subroutine has finished. Which subroutine is indicated by the message ID.
-message_id: Message ID used for the Subroutine-message.
• Update memory: The application layer will have access to a copy of the memory allocated by the QNPU for certain registers and arrays, see section VI B. This memory is read-only by the application layer. Updates to the copy of the memory are performed by the end of a subroutine or if the subroutine is waiting. Furthermore, updates need to be explicitly specified in the subroutine by using one of the return-commands. How the actual update is implemented depends on the platform and can either be done by message-passing or with an actual shared memory. However, the subroutine is independent from this implementation. The application layer will be notified by an explicit message whenever the memory is updated.
• StopApp: Sent from the application layer to the QNPU indicating that an application is finished.
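The message exchange listed above can be sketched with plain Python dataclasses and a toy QNPU. Everything beyond the message names and the fields error_code, subroutine and message_id is an assumption of this sketch: the app_id fields, the required_qubits argument, the error code value and the class ToyQNPU are hypothetical, and the toy does not execute the subroutine.

```python
from dataclasses import dataclass


@dataclass
class RegisterAppOK:
    app_id: int


@dataclass
class RegisterAppErr:
    error_code: int


@dataclass
class Subroutine:
    message_id: int      # assumed to be carried by the message itself
    app_id: int          # hypothetical field tying the subroutine to an application
    subroutine: list


@dataclass
class Done:
    message_id: int


@dataclass
class StopApp:
    app_id: int


class ToyQNPU:
    def __init__(self, max_qubits):
        self.max_qubits = max_qubits
        self.apps = set()
        self._next_app_id = 0

    def register_app(self, required_qubits):
        """Registration succeeds only if the resource requirement can be met."""
        if required_qubits > self.max_qubits:
            return RegisterAppErr(error_code=1)   # arbitrary example code
        app_id = self._next_app_id
        self._next_app_id += 1
        self.apps.add(app_id)
        return RegisterAppOK(app_id=app_id)

    def handle(self, message):
        if isinstance(message, Subroutine):
            # A real QNPU would execute message.subroutine here.
            return Done(message_id=message.message_id)
        if isinstance(message, StopApp):
            self.apps.discard(message.app_id)
            return None
        raise TypeError("unexpected message type")


qnpu = ToyQNPU(max_qubits=3)
ok = qnpu.register_app(required_qubits=2)
err = qnpu.register_app(required_qubits=10)
done = qnpu.handle(Subroutine(message_id=7, app_id=ok.app_id, subroutine=["h Q0"]))
qnpu.handle(StopApp(app_id=ok.app_id))
```

The Done reply echoes the message ID of the Subroutine message, which is how the application layer can tell which subroutine has finished.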

Appendix B: Operands
In this section we give the exact definition of the types of operands used in the NetQASM language. Each instruction of NetQASM takes one or more operands. There are five types of operands, which are listed and described below. Each instruction has a fixed type of operand at each position. The exact operands for each instruction are listed in appendix F. We note also that in the human-readable text-form of NetQASM, there are also branch variables. However, these are always replaced by IMMEDIATEs (constants), corresponding to the instruction number of the subroutine, before serializing, see appendix C.
The operand types of NetQASM are:
• IMMEDIATE (constant): An integer, interpreted as its value. The following instruction, beq (branch-if-equal), branches to instruction index 12 since the number 0 equals the number 0.
    beq 0 0 12
In the binary encoding used at [24], IMMEDIATEs are int32.
• REGISTER: A register, specified by a register name and an index. The following instruction sets index 0 of the register name R to be 0.
In the current version of NetQASM there are four register names and the indices are relative to the names. They are all functionally the same but are meant to be used for different purposes and increase readability:
-C: Constants, meant to only be set once throughout a subroutine.
-R: Normal register, used for looping etc.
In the binary encoding used at [24], REGISTERs are specified by one byte and hold one int32.
• ADDRESS: Specifies the address of an array. Starts with @. The following instruction declares an array of length 10 at address 0. For more information about arrays, see below. The address here is just an identifier of the array and does not refer to an actual memory address. For this reason @1 above does not mean the second entry of the declared array but simply a different array. Addresses are relative to the application ID and are valid across subroutines.
• ARRAY_ENTRY: Specifies an entry in an array. Takes the form @a[i], where a specifies the address and i the index. The following instruction stores the value of R0 to the second entry of the array with address 0.
    store R0 @0[1]
In the text-form i can be either an IMMEDIATE or a REGISTER; however, in the binary encoding used at [24], i is always a REGISTER. The compiler handles this by inserting a set-command beforehand.
• ARRAY_SLICE: Specifies a slice of an array. Takes the form @a[s:e], where a specifies the address, s the start-index (inclusive) and e the end-index (exclusive). The following instruction waits for the second to the fourth entry of the array with address 0 to become not null, see appendix F 7.
    wait_all @0[1:4]
In the text-form s and e can each be either an IMMEDIATE or a REGISTER; however, in the binary encoding used at [24], s and e are always REGISTERs. The compiler handles this by inserting set-commands beforehand.
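The five operand types and the index lowering described above can be illustrated with a small classifier. This is a sketch under assumptions: the regular expressions only approximate the text-form, `REGISTER_NAMES` contains only the C and R names listed above (the remaining names would need to be added), and `classify_operand` and `lower_immediate_index` are hypothetical helpers rather than part of the NetQASM tools.

```python
import re

# Only the two register names listed in the text; extend as needed.
REGISTER_NAMES = "CR"

# An index inside brackets is either an IMMEDIATE or a REGISTER.
_IDX = rf"(?:\d+|[{REGISTER_NAMES}]\d+)"

# Order matters: the most specific forms are tried first.
_PATTERNS = [
    ("ARRAY_SLICE", re.compile(rf"@\d+\[{_IDX}:{_IDX}\]")),
    ("ARRAY_ENTRY", re.compile(rf"@\d+\[{_IDX}\]")),
    ("ADDRESS", re.compile(r"@\d+")),
    ("REGISTER", re.compile(rf"[{REGISTER_NAMES}]\d+")),
    ("IMMEDIATE", re.compile(r"-?\d+")),
]


def classify_operand(text):
    """Return the operand type name for a text-form operand."""
    for kind, pattern in _PATTERNS:
        if pattern.fullmatch(text):
            return kind
    raise ValueError(f"not a recognised operand: {text!r}")


def lower_immediate_index(instruction, scratch="R15"):
    """Rewrite '@a[i]' with an IMMEDIATE i into a set-command plus '@a[reg]'.

    The scratch register is an assumption; a real compiler would have to
    pick a register that is not otherwise in use.
    """
    match = re.search(r"@(\d+)\[(\d+)\]", instruction)
    if match is None:
        return [instruction]  # index is already a REGISTER, or no array entry
    address, index = match.groups()
    rewritten = (
        instruction[: match.start()]
        + f"@{address}[{scratch}]"
        + instruction[match.end():]
    )
    return [f"set {scratch} {index}", rewritten]
```

For example, `lower_immediate_index("store R0 @0[1]", scratch="R1")` yields a `set R1 1` instruction followed by `store R0 @0[R1]`, which is the shape the binary encoding requires.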

Quantum gates
Single-qubit gates
There are a number of single-qubit gates, which all have the following structure:
• instr: Perform a single-qubit gate.
1. REGISTER: The virtual address of the qubit.
Single-qubit gates without additional arguments are the following.
Single-qubit rotations
Additionally, one can perform single-qubit rotations with a given angle. The angles a are specified by two integers n and d as:
These instructions have the following structure

b. Classical logic (for-loop)
A subroutine which performs a for-loop whose body creates a qubit, puts it in the |+⟩ state and measures it. The outcomes are stored in an array. In a higher-level language (using python syntax) the below subroutine might be written as follows:
The equivalent NetQASM subroutine is:
In the above subroutine DEFINE statements have been used to clarify what registers/arrays correspond to the variables in the higher-level language example above.
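The higher-level loop described above can be sketched in plain Python. This is a stand-in for illustration only: it does not use the NetQASM SDK, the qubit is modelled as a pair of real amplitudes, and the function name `measure_plus_states` and the iteration count are assumptions.

```python
import math
import random


def measure_plus_states(n, rng=None):
    """Create n qubits in |+>, measure each, and collect the outcomes."""
    if rng is None:
        rng = random.Random()
    outcomes = []
    for _ in range(n):
        # "Create" a qubit in |0>, then apply a Hadamard to obtain |+>.
        amp0, amp1 = 1.0, 0.0
        amp0, amp1 = (amp0 + amp1) / math.sqrt(2), (amp0 - amp1) / math.sqrt(2)
        # Measure in the computational basis: outcome 1 has probability |amp1|^2.
        outcome = 1 if rng.random() < abs(amp1) ** 2 else 0
        # Store the outcome, playing the role of the array in the subroutine.
        outcomes.append(outcome)
    return outcomes


example_outcomes = measure_plus_states(10, random.Random(0))
```

Each iteration corresponds to one pass through the loop body of the subroutine: create, prepare, measure, store. In the NetQASM version the list of outcomes would be an array that is returned to the application layer.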

c. Create and recv EPR
This code is for the side initializing the entanglement request.

In this section we detail how simulations in section VIII were performed and what models and parameters were used. All simulations used the NetQASM SDK [24], using NetSquid [26,60] as the underlying simulator. All code used in these simulations can also be found at [66].

fig. 11b. All values are from [21].

Noise model
In both the teleportation and the blind quantum computing scenario we used the same model for nitrogen-vacancy centres in diamond as was used in [21] and [26]. All gates specified by the application in the SDK were translated to NV-specific gates, see table I, using a simple compiler without any optimization. The parameters used in the model from [21] are listed in tables I and II, together with an explanation and a reference. ec_controlled_dir_xy are the native two-qubit gates of the NV-platform, ideally performing one of the unitary operations where $R_x(\alpha)$ and $R_y(\alpha)$ are the rotation matrices around X and Y, respectively. When sweeping the duration and noise of this two-qubit gate, the same value is also used for carbon_xy_rot (X- and Y-rotations on the carbon) on the storage qubits, since these are effectively done with a similar operation that also involves the communication qubit (electron). All noise indicated by a fidelity in table II is applied as depolarising noise by applying the perfect operation, producing the state $\rho_{\mathrm{ideal}}$, and mapping this to where X, Y and Z are the Pauli operators in eqs. (F1) to (F3), $p = \frac{4}{3}(1 - F)$, with F being the value specified in table II. Decoherence noise is specified as $T_1$ (energy/thermal relaxation time) and $T_2$ (dephasing time) [75].
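The depolarising step can be sketched numerically for a single qubit. The map used below, $\rho \mapsto (1 - \tfrac{3p}{4})\rho + \tfrac{p}{4}(X\rho X + Y\rho Y + Z\rho Z)$, is an assumption: it is the standard single-qubit depolarising form and is consistent with the relation $p = \frac{4}{3}(1 - F)$ stated above, but it is not copied from the paper's equation. The helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


# Pauli operators.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def depolarise(rho_ideal, fidelity):
    """Apply the assumed depolarising map with p = 4/3 * (1 - F)."""
    p = 4.0 / 3.0 * (1.0 - fidelity)
    # Weight of the unchanged state: 1 - 3p/4, which equals F.
    out = [[(1 - 3 * p / 4) * rho_ideal[i][j] for j in range(2)] for i in range(2)]
    for pauli in (X, Y, Z):
        # Paulis are Hermitian, so conjugation is P * rho * P.
        term = matmul(matmul(pauli, rho_ideal), pauli)
        for i in range(2):
            for j in range(2):
                out[i][j] += (p / 4) * term[i][j]
    return out


# Ideal output state |0><0| and an illustrative fidelity parameter.
rho_ideal = [[1, 0], [0, 0]]
rho_noisy = depolarise(rho_ideal, fidelity=0.97)
```

With this form the coefficient of the unchanged state equals F, and the output stays a valid density matrix with unit trace. Two-qubit gates would need the corresponding two-qubit map, which is not covered by this sketch.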

BQC application and flavors
In section VIII B we simulated the blind quantum computation (BQC) application from fig. 12. The code for this is available at [66].
In the scenario when the application code was compiled to subroutines with the vanilla flavour, the QNPU had to map the vanilla instructions to NV-native operations on the fly. We used the gate mappings listed below. For convenience we use PI and PI_OVER_2 for π and π/2 respectively. A h (Hadamard) vanilla instruction was mapped to the following NV instruction sequence:
    rot_y PI_OVER_2
    rot_x PI
A cnot C S vanilla instruction between a communication qubit (C) and a storage qubit (S) (as specified in the unit module) was mapped to the following NV instruction sequence:

fig. 11a and fig. 13. All fidelities are realized by applying depolarising noise as in eq. (I4). All values are from [26], except link_fidelity, which is set to a relatively high value to avoid this being the major noise contribution and preventing any conclusions from being drawn.

A cnot S C vanilla instruction between a storage qubit (S) and a communication qubit (C) (as specified in the unit module) was mapped to the following NV instruction sequence: