
Quantum device fine-tuning using unsupervised embedding learning


Published 22 September 2020 © 2020 The Author(s). Published by IOP Publishing Ltd on behalf of the Institute of Physics and Deutsche Physikalische Gesellschaft
Citation: N M van Esbroeck et al 2020 New J. Phys. 22 095003. DOI: 10.1088/1367-2630/abb64c



Abstract

Quantum devices with a large number of gate electrodes allow for precise control of device parameters. This capability is hard to fully exploit due to the complex dependence of these parameters on applied gate voltages. We experimentally demonstrate an algorithm capable of fine-tuning several device parameters at once. The algorithm acquires a measurement and assigns it a score using a variational auto-encoder. Gate voltages are adjusted in real time to optimize this score in an unsupervised fashion. We report fine-tuning of a double quantum dot device within approximately 40 min.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Electrostatically defined semiconductor quantum dots are intensively studied for solid-state quantum computation [1–4]. Gate electrodes in these device architectures are designed to separately control electrochemical potentials and tunnel barriers [5, 6]. However, these device parameters vary non-monotonically and not always predictably with applied gate voltages, making device tuning a complex and time-consuming task. Fully automated device tuning will be essential for the scalability of semiconductor qubit circuits.

Tuning of electrostatically defined quantum dot devices can be divided into three stages. The first stage is ultra coarse tuning, which consists of setting gate voltages to create the confinement potential for electrons or holes. The second stage, known as coarse tuning, focuses on identifying and navigating different operating regimes of a quantum dot device. The third stage, referred to as fine-tuning, involves optimizing a particular set of charge transitions. Full automation of the first tuning stage has recently been achieved [7]. Automated coarse tuning has been demonstrated using convolutional neural networks to identify the double quantum dot regime [8] and reach arbitrary charge states [9]. Template matching was also used to navigate to the single-electron regime [10]. During this stage, virtual gate electrodes can be used to independently control the electrochemical potential of each quantum dot [11, 12]. Previous work on automated fine-tuning focussed on achieving a target value for the tunnel coupling between two quantum dots by systematically modifying gate voltages [13, 14]. However, these approaches only allow for the optimization of a device parameter easily estimated from the performed measurements, and they rely on calibration.

Here, we demonstrate an automated approach for simultaneous fine-tuning of multiple device parameters, such as tunnel rates and inter-dot tunnel coupling, without requiring parametrization of the desired measurement features. Our approach is based on a variational auto-encoder (VAE). The VAE compresses training data sets to a lower-dimensional space, called the latent or embedding space. In this latent space, locations corresponding to measurements with desirable characteristics (according to human experts) are identified. After training, the algorithm takes a measurement and assigns it a location in latent space. The distance between this location and the target locations is used by the algorithm as a basis to score the measurement. By optimizing this score, device parameters are fine-tuned in real time. We have previously shown that VAEs significantly improve the efficiency of quantum dot measurements [15]. VAEs are able to extract very high-level contextual information from data. For example, from pictures of human faces, a VAE model can detect a hairstyle, facial expressions, etc [16]. That high-level contextual information is encoded in the non-linear relationship between many pixels, and it is thus extremely difficult to extract using conventional methods such as pixel-wise differences, linear embeddings (e.g. principal component analysis), or other hand-crafted functions. This is also the case for the transport features we aim to optimize during fine-tuning. We have now, for the first time, used a VAE to fine-tune a double quantum dot device by locally optimizing transport features in a completely automated manner. Without requiring any prior knowledge of the device architecture, we are able to fine-tune several device parameters at once.

2. Fine-tuning double quantum dot devices

The device we focus on is a double quantum dot defined in a two-dimensional electron gas (2DEG) [5, 6] at the interface of a GaAs/AlGaAs heterostructure. Ti/Au gate electrodes are patterned on top of the heterostructure (figure 1(a)). DC voltages V1 to V8 are applied to these gate electrodes to define and control the double dot confinement potential. A bias voltage Vbias determines the flow of current I through the device. Stability diagrams, displaying I as a function of two gate voltages, allow us to characterize charge transport in the device. A charge stability diagram for our double quantum dot device after coarse tuning is shown in figure 1(b). All measurements were performed at approximately 20 mK.


Figure 1. Overview of the quantum dot device and algorithm. (a) Scanning electron microscopy image of a device lithographically identical to the one measured. A bias voltage Vbias is applied between two ohmic contacts to drive a current I through the device. Gate voltages V1 to V8 define and control the double quantum dot. (b) Current as a function of gate voltages V3 and V7, with Vbias = 0.2 mV. In this stability diagram, bias triangles are observed at the cross-points of the hexagonal lattice representative of the double quantum dot regime. The zoom-in shows a pair of bias triangles before (bottom) and after (top) running our fine-tuning algorithm. Bias triangles are much better defined after fine-tuning. (c) Schematic overview of the fine-tuning algorithm. A set of gate voltages is fixed by the algorithm. The algorithm then takes a low resolution stability diagram (LRSD), defines a measurement window centred on the observed bias triangles, and takes a high resolution stability diagram (HRSD). This diagram is fed to the VAE, which assigns it a score based on target locations in the latent space identified during training. A new set of gate voltages is fixed according to a decision model and the algorithm restarts. The VAE is trained only once (indicated with a dotted red arrow), a stage at which target stability diagrams are identified; in every iteration, their location in latent space is used by the VAE.


In the double quantum dot regime, charge stability diagrams display pairs of bias triangles, which reveal device parameters such as charging energies, tunnel coupling to the electron reservoirs and interdot tunnel coupling [17]. The shape, sharpness and brightness of these bias triangles are the features guiding humans when fine-tuning device parameters. Humans modify gate voltages until features they identify as favourable, according to their experience, are observed. Fine-tuning a double quantum dot is a very time-consuming task and thus automation is required to scale this technology. Our algorithm modifies gate voltages to achieve various bias triangle characteristics commonly associated with favourable device parameters, as done by humans when tuning these devices.

3. Overview of the algorithm

A schematic of our algorithm is shown in figure 1(c). The device is automatically coarse tuned to the double quantum dot regime [7] and a pair of bias triangles is identified. In each iteration, a set of gate voltages Nj, $\left\{{V}_{1}^{j},{V}_{2}^{j},\dots ,{V}_{8}^{j}\right\}$, is evaluated. After acquiring an initial low resolution stability diagram (LRSD, 30 × 30 pixels, 18.4 × 18.4 mV), the bias triangles are centred using Laplacian of Gaussian blob detection. In computer vision, blob detection techniques aim to detect bright regions on dark backgrounds or vice versa [18]. In figure 1(b), the two pairs of bias triangles displayed were centred with this approach. Next, a high resolution stability diagram (HRSD) of the centred bias triangles is measured (32 × 32 pixels, 17 × 17 mV). The HRSD is then assigned a location in latent space by the VAE. We use a distance metric to score the measurement by comparing this latent space location to the latent space locations of a set of targets chosen by a human expert during VAE training. Based on the score value Si, the algorithm sets the gate voltage configuration for the next stability diagram measurement Nj+1. This process is iterated until the bias triangles measured are as close as possible in latent space to the target measurements.
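The centring step can be illustrated with a minimal Laplacian-of-Gaussian peak finder. This is a numpy-only sketch, not the paper's implementation: the blur kernel, sigma and the synthetic test image are assumptions for illustration.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur via direct 1D convolutions (truncated kernel)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    blurred = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, blurred)

def centre_window(lrsd, sigma=2.0):
    """Locate the brightest blob in a low resolution stability diagram so that
    the high resolution scan can be centred on the bias triangles."""
    smoothed = gaussian_blur(lrsd.astype(float), sigma)
    # 5-point discrete Laplacian: strongly negative at bright blob centres.
    lap = (np.roll(smoothed, 1, 0) + np.roll(smoothed, -1, 0)
           + np.roll(smoothed, 1, 1) + np.roll(smoothed, -1, 1) - 4.0 * smoothed)
    return np.unravel_index(np.argmin(lap), lap.shape)

# Synthetic 30 x 30 LRSD with one Gaussian "bias triangle" blob at (12, 20).
yy, xx = np.mgrid[0:30, 0:30]
lrsd = np.exp(-((yy - 12.0)**2 + (xx - 20.0)**2) / (2 * 2.0**2))
print(centre_window(lrsd))  # centre of the synthetic blob
```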

4. VAE implementation

The VAE is a convenient approach for evaluating a combination of features observed in an image, in this case a stability diagram that is costly to measure (approximately 1 min), with the aim of creating a score metric to guide the choice of gate voltages. The VAE ensures that only the most important features in the stability diagram are encoded in the latent space.

The VAE consists of an encoder and decoder, which are probability distributions whose distribution parameters are computed by neural networks [19]. The encoder ${q}_{\phi }\left(\boldsymbol{z}\vert \boldsymbol{x}\right)$ maps input data x to a real-valued, low-dimensional latent vector z. The decoder ${p}_{\theta }\left(\boldsymbol{x}\vert \boldsymbol{z}\right)$ maps a latent vector to a reconstruction $\hat{\boldsymbol{x}}$. The parameters of the encoder and the decoder neural networks are ϕ and θ, respectively. The VAE is a generative model; it seeks to preserve the maximum amount of information during the encoding process so that input data can be reconstructed with minimal error during the decoding process. During a training phase, ϕ and θ are iteratively updated to minimize a loss function. The loss function is given by the sum of a reconstruction error ${\mathcal{L}}_{\text{rec}}$, which penalizes the networks for producing reconstructions that are dissimilar from the input data, and a regularization term ${\mathcal{L}}_{\text{reg}}$, which enforces input data with similar characteristics to be encoded in close proximity in latent space. The reconstruction error and the regularization term have weights α and β, respectively. Mathematical definitions of ${\mathcal{L}}_{\text{rec}}$ and ${\mathcal{L}}_{\text{reg}}$ are provided in the supplementary material (https://stacks.iop.org/NJP/00/000000/mmedia).

We implement Factor-VAE [20], an adaptation of the VAE that seeks to generate a latent space in which each dimension corresponds to a unique characteristic of the input data. The Factor-VAE framework assumes that there are underlying independent factors associated with the data. If fully disentangled, each of those factors can be identified with a dimension in latent space. By using a Factor-VAE, we aim to generate a latent space in which each dimension is associated with a single bias triangle characteristic, such as size or brightness. In this way, the distance in latent space to a target location results in a good metric to score acquired measurements.

The loss function of Factor-VAE includes a total correlation term which encourages the distribution of embeddings ${q}_{\phi }\left(\boldsymbol{z}\right)$ to be disentangled. It is given by:

${\mathcal{L}}=\alpha {\mathcal{L}}_{\text{rec}}+\beta {\mathcal{L}}_{\text{reg}}+\gamma {D}_{\text{KL}}\left({q}_{\phi }\left(\boldsymbol{z}\right)\vert \vert {\prod }_{j}{q}_{\phi }\left({z}_{j}\right)\right)$ (1)

where the total correlation term is given by ${D}_{\text{KL}}\left({q}_{\phi }\left(\boldsymbol{z}\right)\vert \vert {\prod }_{j}{q}_{\phi }\left({z}_{j}\right)\right)$, i.e. the Kullback–Leibler divergence between the distribution of embeddings ${q}_{\phi }\left(\boldsymbol{z}\right)$ and the product of the distributions of the embedding components ${\prod }_{j}{q}_{\phi }\left({z}_{j}\right)$, with the index j corresponding to the jth latent space dimension. Note that the distribution of embeddings is given by ${q}_{\phi }\left(\boldsymbol{z}\right)$, whereas ${q}_{\phi }\left({z}_{j}\right)$ describes the distribution of a single embedding dimension zj. The total correlation loss term has weight γ. Since this term is intractable, it is estimated using a discriminator $D\left(\boldsymbol{z}\right)$. The discriminator is trained to classify between non-factorial and factorial samples, i.e. to estimate whether its input is a sample from ${q}_{\phi }\left(\boldsymbol{z}\right)$ or from ${\prod }_{j}{q}_{\phi }\left({z}_{j}\right)$. The loss function of the discriminator can be found in the supplementary material.
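Two ingredients of this estimate can be sketched in a few lines: sampling from the product of marginals ${\prod }_{j}{q}_{\phi }\left({z}_{j}\right)$ by independently permuting each latent dimension across a batch, and the density-ratio estimate of the total correlation from the discriminator's logits. This is a numpy sketch under the assumption of a sigmoid-output discriminator; the function names are illustrative, not from the paper.

```python
import numpy as np

def permute_dims(z, rng):
    """Approximate samples from the product of marginals prod_j q(z_j):
    shuffle each latent dimension independently across the batch
    (the Factor-VAE permutation trick)."""
    out = z.copy()
    for j in range(out.shape[1]):
        out[:, j] = rng.permutation(out[:, j])
    return out

def total_correlation_estimate(logits):
    """Density-ratio estimate of the total correlation. For a sigmoid
    discriminator, log(D(z) / (1 - D(z))) is the logit, so the estimate
    E[log(D(z) / (1 - D(z)))] over z ~ q(z) is the mean logit."""
    return float(np.mean(logits))
```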

The training set for the Factor-VAE was collected from a device which differs considerably in material, architecture and transport regime from the device used to demonstrate the performance of the algorithm, evidencing its generality. The VAE was trained using 2253 sets of bias triangles, measured on a double quantum dot defined in a Ge/Si core–shell nanowire [21, 22]. In order to increase the robustness of the VAE, simple data augmentation techniques were applied. Data augmentation included rotation, mirroring, Gaussian noise and random contrast, resulting in a total training set of 8732 stability diagrams of pixel resolution 32 × 32. The dimension of the latent space was set to 10, as in reference [20], given the similar structure of input data. We tried multiple combinations of weights α, β and γ to achieve the optimal VAE performance, which was found empirically for α = 34, β = 1 and γ = 1. The full architectures of the VAE and discriminator neural networks are presented in the supplementary material.
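The augmentation step can be sketched as follows. The noise level and contrast range below are illustrative assumptions, not the values used to build the 8732-diagram training set.

```python
import numpy as np

def augment(diagram, rng):
    """Simple augmentations of a 32 x 32 stability diagram: mirroring,
    rotation, additive Gaussian noise and random contrast."""
    variants = [
        np.fliplr(diagram),                                   # mirroring
        np.rot90(diagram),                                    # rotation
        diagram + rng.normal(0.0, 0.02, diagram.shape),       # Gaussian noise
        np.clip(diagram * rng.uniform(0.7, 1.3), 0.0, 1.0),   # random contrast
    ]
    return [diagram] + variants

d = np.random.default_rng(1).random((32, 32))
print(len(augment(d, np.random.default_rng(2))))  # original plus four variants
```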

5. Score metric

The score metric used by the algorithm is given by the distance between the latent space representation z of an input stability diagram and the latent space representation $\left\{\tilde {\boldsymbol{z}}\right\}$ of a set of target inputs. Note that the loss function is used during training, while the score metric is used to evaluate each new measurement. A measurement acquired by the algorithm is assigned a low (high) VAE score if its representation in latent space is near to (far from) the targets in latent space. Embeddings that are close together in latent space have similar z, implying that the original inputs can be generated using similar underlying variables. As a result, bias triangles that are assigned a low score possess similar characteristics to the target bias triangles. The target bias triangles are chosen from the unaugmented training set by a human expert who recognizes in these triangles the characteristics indicative of favourable quantum dot parameters. The targets are augmented using the same augmentation techniques as described in section 4. Augmentation of 30 selected targets resulted in a total target set of size 360.

In figure 2 the latent space of the test set of the trained VAE is shown, and the embedding locations of example target and training inputs are indicated. A full plot of the latent space of the trained VAE with original input stability diagrams is shown in the supplementary material.


Figure 2. Schematic overview of the VAE. The VAE consists of an encoder and decoder. The encoder ${q}_{\phi }\left(\boldsymbol{z}\vert \boldsymbol{x}\right)$ compresses input stability diagrams to a lower-dimensional latent space. The decoder is denoted by ${p}_{\theta }\left(\boldsymbol{x}\vert \boldsymbol{z}\right)$ and maps vectors in latent space z to the distribution of input data. In this way, the input vector x, for which each element is the brightness of one pixel in the stability diagram, is transformed into a reconstruction vector $\hat{\boldsymbol{x}}$. In order to visualise the ten-dimensional latent space, t-distributed Stochastic Neighbour Embedding is applied for dimensionality reduction [23]. The resultant two-dimensional latent space is described by a vector w. Each dot represents the embedding of an input stability diagram. The embedding location of one of the target inputs is highlighted in red. It is expected that embeddings which are close to each other in latent space are generated by input data with similar characteristics. Test example 1, with similar characteristics to the target, can be found in close proximity to the target, whereas test example 2 is further away in latent space. The target and example stability diagrams are plotted as a function of V3 and V7 and use a colour scale running from red, the highest current measured, to blue, the lowest current.


To write the expression for the score Si, where i denotes the ith input measurement, we use the latent vector zi produced by the encoder for this measurement. The output of the encoder is assumed to follow a multivariate Gaussian with diagonal covariance structure: ${q}_{\phi }\left(\boldsymbol{z}\vert \boldsymbol{x}\right)=\mathcal{N}\left(\boldsymbol{z};\boldsymbol{\mu },\enspace \mathrm{diag}\left({\boldsymbol{\sigma }}^{2}\right)\right)$, where the mean μ and variance σ2 are outputs of the encoding network. Considering two independent normal distributions in latent space, the expectation value of the squared distance between the distributions is given by:

${\mathbb{E}}\left[{\left\Vert {\boldsymbol{z}}^{i}-{\tilde {\boldsymbol{z}}}^{j}\right\Vert }^{2}\right]={\left\Vert {\boldsymbol{\mu }}^{i}-{\tilde {\boldsymbol{\mu }}}^{j}\right\Vert }^{2}+{\sum }_{l}\left[{\left({\sigma }_{l}^{i}\right)}^{2}+{\left({\tilde {\sigma }}_{l}^{j}\right)}^{2}\right]\equiv {d}^{2}\left({\boldsymbol{z}}^{i},{\tilde {\boldsymbol{z}}}^{j}\right)$ (2)

For each input measurement embedded in latent space as zi, equation (2) is used to determine the distance in latent space to target input ${\tilde {\boldsymbol{z}}}^{j}$. The final score Si is the average of the distances to its k nearest targets:

${S}_{i}=\frac{1}{k}{\sum }_{{\tilde {\boldsymbol{z}}}^{j}\in {A}_{k}}{d}^{2}\left({\boldsymbol{z}}^{i},{\tilde {\boldsymbol{z}}}^{j}\right)$ (3)

where Ak is the set of k targets closest to zi in latent space. In this way, optimal tuning corresponds to a low score. We found that for k = 3 the score metric produced a robust ranking of the training inputs in terms of their similarity to the targets.
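Equations (2) and (3) translate directly into a short scoring routine. This is a minimal sketch with illustrative function names, assuming each measurement and target is summarized by the encoder's mean and diagonal variance:

```python
import numpy as np

def expected_sq_distance(mu, var, mu_t, var_t):
    """Expected squared distance between two independent diagonal Gaussians,
    as in equation (2): |mu - mu_t|^2 + sum(var + var_t)."""
    mu, var, mu_t, var_t = map(np.asarray, (mu, var, mu_t, var_t))
    return float(np.sum((mu - mu_t) ** 2) + np.sum(var + var_t))

def score(mu, var, targets, k=3):
    """Score S_i of equation (3): average expected squared distance to the
    k nearest targets, each target given as a (mean, variance) pair."""
    d = sorted(expected_sq_distance(mu, var, mt, vt) for mt, vt in targets)
    return float(np.mean(d[:k]))
```

Lower scores correspond to bias triangles more similar to the targets, so the decision model minimizes this quantity.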

6. Decision model

The decision model adjusts the gate voltage settings so that the new gate voltage configuration can be measured and scored. The decision model is illustrated in figure 3. In each iteration, one gate electrode is selected at random, and the voltage applied to this electrode is modified by a fixed amount ±ΔV. The algorithm therefore chooses between 2n − 1 gate voltage branches, where n is the number of gate electrodes to be tuned, since the branch that would reverse the latest accepted gate voltage change is excluded. We chose ΔV = 2 mV based on human experience in tuning similar devices. After centring and acquiring a high resolution measurement of the resulting bias triangles, the value of Si determines the algorithm's decision. If Si is lower than that of the previously best (lowest) scored bias triangles, the gate voltage change is accepted, leading to a new gate voltage configuration Nj+1. Conversely, if Si is higher, the gate voltage change is rejected and the gate voltage setting returns to its previous configuration. In this case, the rejected gate voltage change is excluded from the random selection in the next iteration. It is possible that all gate voltage branches become depleted, in which case the decision model returns to the previously accepted gate voltage configuration with unexplored branches.
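The branch selection can be sketched as follows. The function name and the representation of branches as (gate index, sign) pairs are illustrative assumptions, not the paper's code.

```python
import random

def propose_change(voltages, excluded, delta_v=2.0, rng=None):
    """Pick one of the 2n branches (gate index, sign) uniformly at random,
    skipping excluded branches such as the reversal of the last accepted
    change or previously rejected moves, and apply +/- delta_v (in mV)."""
    rng = rng or random
    branches = [(g, s) for g in range(len(voltages)) for s in (+1, -1)
                if (g, s) not in excluded]
    g, s = rng.choice(branches)
    new = list(voltages)
    new[g] += s * delta_v
    return new, (g, s)
```

A caller would accept the proposal when the new score is lower than the best score so far, and otherwise add the branch to `excluded` and retry from the previous configuration.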


Figure 3. Overview of the decision model. In this figure, the node Nj represents the best scored gate voltage configuration obtained after a number of algorithm iterations. The branched arrows represent the different gate voltage adjustment options, which are changes of ±ΔV in every gate electrode to be tuned. In this case, we consider 5 gate voltage branches since we focus on 3 gate electrodes for fine-tuning. (a) Score Si corresponding to a new configuration Nj+1 is lower than at Nj, so the gate voltage change is accepted. For Nj+1, a new random gate voltage branch is selected and explored. (b) Score Si corresponding to a new configuration Nj+1 is higher than at Nj, so the gate voltage change is rejected. For Nj, one of the remaining gate voltage branches is randomly selected and explored. (c) If all possible gate voltage configurations are rejected the algorithm returns to the closest previously accepted gate voltage node that has unexplored branches. At this configuration, a gate voltage branch is randomly selected and explored.


7. Experimental demonstration

We test the algorithm for different bias triangles measured on our device. Stability diagrams are measured as a function of barrier gate voltages V3 and V7, which are adjusted during centring of the bias triangles. The gate voltages tuned by the algorithm are V1, V2 and V8. For simplicity, we chose to keep gate voltages V4, V5 and V6 fixed; we verified that their effect on the bias triangles was weak. All measured stability diagrams are min–max normalized with respect to the minimum and maximum current values measured during coarse tuning to the double quantum dot regime, in this case achieved with the algorithm in reference [7]. In figure 4, the fine-tuning of four different pairs of bias triangles is shown.


Figure 4. Experimental demonstration of the algorithm. The first column shows four different pairs of bias triangles before the algorithm was run. Each row displays each of these bias triangles at selected iterations of the algorithm. For all these iterations, the applied gate voltage change led to a decrease in score Si. All measurements were performed with Vbias = 0.2 mV. The stability diagrams were measured as a function of barrier gates V3 and V7, while gate voltages V1, V2 and V8 were tuned by the algorithm.


In cases 1 to 3, the initial bias triangles lack a well-defined shape, indicative of small inter-dot tunnel coupling [17]. Furthermore, pronounced co-tunnelling lines, a signature of second-order transport processes, are observed. As the fine-tuning progresses, the bias triangles separate from each other and acquire a sharper triangular shape, and co-tunnelling currents are reduced. In the fourth case, the initial stability diagram shows very faint bias triangles. The algorithm proves capable of increasing the current flowing through the double quantum dot while preserving most of the other bias triangle characteristics. More bias triangle fine-tuning examples achieved by our algorithm can be found in the supplementary material.

Figure 5(a) shows Si as a function of the number of iterations of our algorithm for cases 1 to 4. Most of the fine-tuning takes place during the first ten iterations, after which the score does not change significantly. In all cases, the algorithm completes the fine-tuning within 26 iterations, corresponding to a total tuning time of 36 min. This time is limited by the measurement time, which could be drastically reduced by radio-frequency reflectometry techniques [24–31].


Figure 5. Score Si and gate voltage space trajectories during fine-tuning. (a) Score Si as a function of the number of iterations of our algorithm. Solid lines 1 to 4 correspond to the fine-tuning runs presented in figure 4. The dashed line represents a different run of the algorithm for case 2. (b) Gate voltage space trajectories for the fine-tuning runs in figure 4. The starting (final) gate voltage configuration is denoted by a circle (cross).


In figure 5(b) we plot the trajectories in gate voltage space corresponding to each fine-tuning run of the algorithm. The average distance in gate voltage space between the initial gate voltage configurations is greater than for the final gate voltage configurations. This suggests that there exists a region in gate voltage space for which the bias triangles exhibit the most favourable transport characteristics, regardless of their values of V3 and V7. Additional data can be found in the supplementary material.

8. Conclusion

We experimentally demonstrate an algorithm for the fine-tuning of bias triangles in gate-defined quantum dots. The algorithm scores real-time measurements by computing distances in the embedding space of a VAE. We show that this score can be used to locally optimize double quantum dot parameters in a completely automated manner. Our results demonstrate that, even with a simple decision model implementation, the score metric proves capable of tuning multiple device parameters at once with no prior knowledge of the device architecture.

The robustness and efficiency of the decision model for proposing new voltage configurations and minimizing the score could potentially be improved by using reinforcement learning or Bayesian optimization, appropriate for cases such as this fine-tuning problem for which data acquisition is costly. Also, while we utilized the Euclidean distance between two Gaussian distributions for computing scores, recent work argues that the decoder induces a Riemannian metric in the latent space [32]. This would imply that shortest paths in latent space do not correspond to straight lines. Therefore, it might prove insightful to implement a Riemannian metric to measure latent space distances. Finally, the influence of selecting targets with different characteristics, such as different excited state energies, could be investigated in the future.

While all measurements presented are performed on a gate-defined GaAs double quantum dot, the VAE was trained on data obtained from a Ge/Si core–shell nanowire device, showing the algorithm is readily applicable to different types of devices. Moreover, our algorithm can be adapted to include any number of additional gate electrodes, paving the way for the tuning of quantum dot arrays.

Acknowledgments

We acknowledge discussions with E A Laird. This work was supported by the Royal Society, the EPSRC National Quantum Technology Hub in Networked Quantum Information Technology (EP/M013243/1), the EPSRC Platform Grant (EP/R029229/1), the Quantum Technology Capital Grant (EP/N014995/1), Nokia, Lockheed Martin, the Swiss NSF Project 179024, the Swiss Nanoscience Institute and the EU H2020 European Microkelvin Platform EMP Grant No. 824109. This publication was also made possible through support from Templeton World Charity Foundation and John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the Templeton Foundations. We acknowledge J Zimmerman and A C Gossard for the growth of the GaAs/AlGaAs heterostructure. Lastly, we acknowledge F N M Froning and F R Braakman for providing the Ge/Si training data.
