Learning pairing symmetries in disordered superconductors using spin-polarized local density of states

We construct an artificial neural network to study the pairing symmetries in disordered superconductors. For Hamiltonians on a square lattice with s-wave, d-wave, and nematic pairing potentials, we use the spin-polarized local density of states (SP LDOS) near a magnetic impurity in the clean system to train the neural network. We find that, when the depth of the artificial neural network is sufficiently large, it can predict the pairing symmetries in disordered superconductors. In a large parameter regime of the potential disorder, the artificial neural network predicts the correct pairing symmetries with relatively high confidence.

Machine learning algorithms [36] such as deep learning are powerful tools and have been widely used in domains ranging from academic research to industrial applications. In recent years, these algorithms have been applied in multiple research fields in physics, such as gravitational wave detection [37,38], particle physics and high-energy physics [39][40][41][42], quantum entanglement and quantum information [43,44], disorder and phase transitions [45,46], material design [47,48], the quantum many-body problem [49,50], etc. Deep learning algorithms based on convolutional neural networks are powerful tools for image recognition, feature extraction, and phase classification in physics. In this work, we show that deep learning algorithms built on multilayer convolutional neural networks can be used to extract features from SP LDOS patterns for different pairing symmetries, and to make predictions with quantitative confidences on the pairing symmetries of disordered superconductors. The detailed implementation is as follows. Theoretically, one can construct different model Hamiltonians with different pairing symmetries. The model Hamiltonians may come from different superconductivity mechanisms, or from a systematic classification according to symmetry analysis and group theory. The parameters in the model Hamiltonians can be obtained from first-principles calculations or fitted from experimental data. One can then generate experimentally measurable quantities (e.g., the SP LDOS used in this work) from the Hamiltonians in the clean limit.
Using the data sets in the clean limit, one can train an artificial neural network. A well-trained, robust, and generic artificial neural network can then make predictions for disordered systems.
The paper is organized as follows. In section 2, we build the model Hamiltonians for a superconductor on a square lattice with different pairing symmetries, i.e., s-wave, d-wave, and nematic pairings, and show the theoretical calculation of the SP LDOS using two different methods. The patterns for model training are generated here from the Hamiltonians without disorder. In section 3, we give a brief introduction to the artificial neural network and show the detailed construction of the network model used in this work. We present the predictions of the artificial neural network for systems with different pairing symmetries and different disorder strengths in section 4. A conclusion is given in section 5.

Model Hamiltonian and SP LDOS
We consider the following Hamiltonians on a square lattice with three types of pairing symmetries and on-site potential disorder, where ψ†_{R,σ} and ψ_{R,σ} are the creation and annihilation operators of electron states at site R, and σ = ↑, ↓ refers to the spin-up and spin-down states, respectively. δ_1 = (1, 0), δ_2 = (0, 1), δ_3 = (−1, 0), and δ_4 = (0, −1) are the four nearest-neighbor vectors. t is the hopping amplitude and μ is the chemical potential. In this work, we consider three different pairing symmetries: (i) the s-wave pairing shown in equation (3), where Δ_S refers to the pairing strength and h.c. means the Hermitian conjugate; (ii) the d-wave pairing between nearest neighbors, where the d-wave symmetry is characterized by the phase factors φ_1 = φ_3 = 0, φ_2 = φ_4 = π; (iii) the nematic pairing, with a pairing potential of the form of equation (3) and the nematicity characterized by the anisotropic hopping term in equation (2), δt_1 = δt_3 = δt, δt_2 = δt_4 = −δt [51]. H_# in equation (1) has three different choices, H_S, H_D, and H_N, for the s-wave, d-wave, and nematic pairings, respectively. The impurity is located at the center (origin) of the lattice, R = 0, and V_0 in equation (5) represents the strength of the impurity potential. In this work, we consider a classical magnetic impurity, S_z = (1/2)σ_z, where σ_z is the third Pauli matrix in spin space. As shown in equation (6), disorder is introduced as a random on-site potential with amplitudes distributed uniformly within a width W.
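As a concrete illustration of how such a Hamiltonian is assembled, the following numpy sketch builds the s-wave member of this family as a Bogoliubov-de Gennes matrix on a small periodic lattice. It is a minimal sketch, not the production setup of the text: the parameter values and the function name are our own, the spin structure is reduced to the (c_{R↑}, c†_{R↓}) Nambu blocks coupled by the on-site s-wave pairing, and in this basis the magnetic impurity enters both diagonal blocks with the same sign while the potential disorder of equation (6) enters with opposite signs.

```python
import numpy as np

def bdg_s_wave(L=15, t=1.0, mu=-0.8, delta_s=0.4, V0=2.0, W=0.0, rng=None):
    """Toy BdG Hamiltonian in the Nambu basis (c_{R,up}, c^dag_{R,down}).

    Illustrative only; parameter values are hypothetical placeholders.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    N = L * L

    def idx(x, y):
        return x * L + y

    # Nearest-neighbor hopping with periodic boundary conditions in both directions.
    h = np.zeros((N, N))
    for x in range(L):
        for y in range(L):
            h[idx(x, y), idx((x + 1) % L, y)] = t
            h[idx(x, y), idx(x, (y + 1) % L)] = t
    h = h + h.T

    # Random on-site potential, uniform in [-W/2, W/2] (equation (6)).
    w = rng.uniform(-W / 2, W / 2, size=N)

    # Classical magnetic impurity V0 * S_z * sigma_z with S_z = 1/2 at the center.
    m = np.zeros(N)
    m[idx(L // 2, L // 2)] = V0 / 2

    xi = h - mu * np.eye(N)
    H = np.zeros((2 * N, 2 * N))
    H[:N, :N] = xi + np.diag(w + m)               # spin-up particle block
    H[N:, N:] = -(xi + np.diag(w)) + np.diag(m)   # spin-down hole block
    H[:N, N:] = delta_s * np.eye(N)               # on-site s-wave pairing
    H[N:, :N] = delta_s * np.eye(N)
    return H
```

The d-wave and nematic cases would modify the pairing block (phase factors φ_i on nearest-neighbor bonds) or the hopping amplitudes (δt_i), respectively, in the same framework.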
In order to obtain good numerical performance for a large number of samples, we use two different methods to calculate the SP LDOS. For the clean system without potential disorder, we choose periodic boundary conditions in both directions of the square lattice. In this case, the wavevector is a good quantum number and the Green's function of the superconductor (corresponding to the Hamiltonian H_0 + H_#^Δ) can be evaluated directly in wavevector space using the standard T-matrix method. The system size is chosen to be as large as 1024 × 1024, and the fast Fourier transform is applied to speed up the numerical calculation. The SP LDOS as a function of energy at the impurity site is efficiently evaluated with this method. Here we show the detailed derivation of the SP LDOS. Using the Nambu spinor representation in wavevector space and the Bogoliubov-de Gennes formalism of the Hamiltonian without disorder (see equations (12)-(14) for the detailed expressions), one finds after some straightforward derivation [52][53][54] that the SP LDOS can be written in terms of ΔG(E, R) = G(E, R) − G_0(E, R), the difference between the full Green's function with the impurity included and the free Green's function without the impurity, where ζ_z is the third Pauli matrix in Nambu spinor space.
ΔG(E, R) can be expressed as a product of the free Green's function, G_0(E, R), and the T-matrix [20], where V = V_0 ζ_z/2 is the magnetic impurity potential expressed in the Bogoliubov-de Gennes formalism. The free Green's function in real space can be calculated by Fourier transformation, where h_BdG(k) is the Hamiltonian of the clean superconductor expressed in the Bogoliubov-de Gennes formalism in wavevector space; it takes different forms for the different pairing symmetries. Here ε(k) = 2t(cos k_x + cos k_y) is the normal-state dispersion on the square lattice, δε(k) = 2δt(cos k_x − cos k_y) is the distortion of the dispersion induced by nematicity, d(k) = cos k_x − cos k_y is the form factor of the nearest-neighbor d-wave pairing, and σ_# and ζ_# (# = x, y, z) are the Pauli matrices in spin and Nambu spinor space, respectively. When the random potential is applied, the wavevector is no longer a good quantum number, and we have to diagonalize the total Hamiltonian to find the SP LDOS. In this case the spectral formula of equation (15) is more suitable [18], where ψ_n(R) is the nth eigenvector of the total Hamiltonian (1) and E_n is the corresponding energy eigenvalue. In the practical calculation, the system size is chosen to be 151 × 151 with open boundary conditions, and the δ-function in equation (15) is approximated by a Lorentzian distribution whose width δ, the infinitesimal imaginary part of the retarded Green's function given in equation (11), is chosen to be 0.002. The summation in equation (15) is restricted to a small subset with E_n ∈ [E − 10δ, E + 10δ]; the FEAST eigenvalue solver package [55,56] is used to find the eigenvalues in this interval and the corresponding eigenvectors.
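The Lorentzian-broadened spectral sum of equation (15) is straightforward to implement once the total Hamiltonian is diagonalized. The following numpy sketch (the function name `sp_ldos` is our own) sums over all eigenstates, which is adequate for toy sizes; the text's restriction to E_n ∈ [E − 10δ, E + 10δ] via the FEAST solver, important at 151 × 151, is omitted here:

```python
import numpy as np

def sp_ldos(H, energies, delta=0.002):
    """LDOS from full diagonalization with Lorentzian broadening.

    Approximates rho(E, R) = sum_n |psi_n(R)|^2 (delta/pi) / ((E - E_n)^2 + delta^2),
    i.e., the delta function replaced by a Lorentzian of width delta.
    H: Hermitian (BdG) matrix; energies: 1D array of sample energies.
    Returns an array of shape (n_energies, n_sites).
    """
    E_n, psi = np.linalg.eigh(H)                  # eigenvalues and orthonormal eigenvectors
    weight = np.abs(psi) ** 2                     # |psi_n(R)|^2, rows = sites, cols = states
    lorentz = (delta / np.pi) / ((energies[:, None] - E_n[None, :]) ** 2 + delta ** 2)
    return lorentz @ weight.T
```

Since the eigenvector matrix has orthonormal rows, the energy integral of the LDOS at any site is one, which is a convenient sanity check for an implementation like this.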
For both situations (W = 0, the clean system, and W ≠ 0, the disordered system) and for each SP LDOS pattern, only a small subset of 41 × 41 sites, with the magnetic impurity located at the center, is used for the pairing symmetry classification with the artificial neural network.

Artificial neural network classification
An artificial neural network is a nonlinear function constructed from layers of 'neurons'. Using hidden 'neuron' layers z^(1), z^(2), ..., z^(n) (here n is the number of hidden layers), the input data x are mapped to the output results y. For a sequential feedforward artificial neural network, each mapping between consecutive layers, z^(i) → z^(i+1), is generally composed of two elementary functions: (i) the weighted linear function, W^(i+1) z^(i), and (ii) the activation function, A^(i+1). A deep neural network with 16 hidden layers is chosen for the practical training. Figure 1 shows the structure of the neural network model (see the caption for a detailed description of the network). For the output layer, the activation function is chosen to be the sigmoid function, where z^(n)_k refers to the kth value of the last hidden layer z^(n). For all the hidden layers, we choose the rectified linear unit (ReLU) function as the activation function. One can see that once the model is established, i.e., once the number of layers, the type and activation function of each layer, and the size and number of convolutional kernels in each convolutional layer are chosen and fixed, the full mapping y(x) is completely determined by the weight matrices W^(i), i = 1, 2, ..., n + 1. The training process is designed to optimize these weight matrices. In classification problems, the cross-entropy function is widely used as the cost function for the training. Here, N_s is the number of samples in the training and N_c is the number of classes. ŷ^s_l is a binary indicator: when x^s belongs to the lth class, ŷ^s_l = 1, otherwise ŷ^s_l = 0. y^s_l is the output for the lth label given the input sample x^s. For a well-trained artificial neural network model, defined as y^s_l → ŷ^s_l for every s from 1 to N_s, the cross-entropy function tends to its global minimum. The last term is the regularization term and λ is the regularization parameter.
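The building blocks named above can be written down compactly. A minimal numpy sketch, assuming a per-class binary cross-entropy form for the sigmoid outputs (the exact form of the cost function used in the original is not reproduced here, and the function names are our own):

```python
import numpy as np

def relu(z):
    """ReLU activation, used here for all hidden layers."""
    return np.maximum(0.0, z)

def sigmoid(z):
    """Sigmoid activation, used here for the output layer."""
    return 1.0 / (1.0 + np.exp(-z))

def cost(y_pred, y_true, weights, lam):
    """Cross-entropy cost with an L2 regularization term.

    y_pred: (N_s, N_c) network outputs in (0, 1); y_true: (N_s, N_c) one-hot labels.
    The last term sums the squared weight-matrix entries, scaled by lam.
    """
    eps = 1e-12   # guard against log(0)
    ce = -np.mean(np.sum(y_true * np.log(y_pred + eps)
                         + (1 - y_true) * np.log(1 - y_pred + eps), axis=1))
    reg = lam * sum(np.sum(W ** 2) for W in weights)
    return ce + reg
```

The cost is minimal when y_pred matches the one-hot labels, and the λ term penalizes large weights to reduce overfitting, as discussed below.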
This term is applied to reduce overfitting. The weight matrices are learned using a set of clean samples without disorder; they are updated in the training process by minibatch gradient descent with batch size 40, where η is the learning rate, chosen to be the small value 0.0015. After the training process, the artificial neural network model is determined: y_l(x) returns a value in the interval (0, 1) for an input pattern, and this value gives the predicted confidence that the input data x belong to the lth class. In this work, three classes of symmetries are investigated; l = 0, 1, and 2 stand for the s-wave, d-wave, and nematic pairings, respectively. In order to capture the features of strong disorder, which destroys all the pairing symmetries significantly, a reference class referring to randomness and labeled l = 3 is added. For this class, the training patterns are generated using random numbers.
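The minibatch update described above has a simple generic form, W ← W − η ∂C/∂W over batches of 40 samples. A sketch in numpy; the `grad_fn` callback is a hypothetical stand-in for backpropagation through the actual convolutional network, and per-epoch shuffling is one common convention that the original may or may not use:

```python
import numpy as np

def minibatch_sgd(weights, grad_fn, data, batch_size=40, eta=0.0015, epochs=1, rng=None):
    """Plain minibatch gradient descent with the batch size and learning rate of the text.

    weights: list of weight arrays; grad_fn(weights, batch) must return a
    matching list of gradients of the cost with respect to each weight array.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(data)
    for _ in range(epochs):
        order = rng.permutation(n)                     # reshuffle samples each epoch
        for start in range(0, n, batch_size):
            batch = [data[i] for i in order[start:start + batch_size]]
            grads = grad_fn(weights, batch)
            weights = [W - eta * g for W, g in zip(weights, grads)]
    return weights
```

With a quadratic toy cost C(W) = ||W||^2, whose gradient is 2W, repeated updates shrink the weights geometrically toward the minimum at zero, which makes the loop easy to verify.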

Numerical results
Firstly, we consider the SP LDOSs without disorder, which give an intuitive understanding of the training patterns. Figure 2 shows typical patterns for the different pairing symmetries. These results can be calculated from both equations (7) and (15). We carry out both calculations and find that the resulting SP LDOS patterns coincide very well, which can be considered evidence that our numerical calculation of the SP LDOS is correct. The resonance energy E = E_r is determined by the singularities of the T-matrix, i.e., det[V − G_0(E_r, 0)] = 0. Generally, this equation has two solutions located symmetrically on the two sides of the Fermi surface, E = ±E_r. In the practical calculation, we randomly choose one of these two resonance energies to evaluate the training pattern. One can see that the patterns in figure 2 reveal significantly different features for the three pairing symmetries.

The first row of figure 3 shows the predictions of pairing symmetries from the artificial neural network for the three kinds of superconductors with disorder potentials. Single-shot SP LDOSs are shown in figures 3(d)-(l) for different pairing symmetries and different disorder strengths. For all three kinds of pairing potentials, when strong potential disorder is added, e.g., W > 2, the dominant characteristic of the SP LDOSs is randomness, i.e., the l = 3 class labeled with black ×. In the weak disorder regime, i.e., 0 < W < 0.5, all three kinds of pairing symmetries can be discerned by the artificial neural network. This is also manifested in the single-shot patterns. Figures 3(d), (g) and (j) have weak disorder, and the symmetry of the patterns is clearly distinguishable. The confidences for these patterns calculated from the trained neural network are (0.966, 0.006, 7 × 10^−5, 0.002), (3 × 10^−5, 0.953, 2 × 10^−6, 1 × 10^−4), and (1 × 10^−5, 1 × 10^−4, 0.704, 6 × 10^−5), respectively.
Here, the four components in each list give the confidences of s-wave, d-wave, nematic pairing, and randomness. In the last row of figure 3, we find that the dominant characteristic is randomness (the l = 3 class); the confidences of randomness are 0.312, 0.044, and 0.034 for these patterns, and the confidences of the other features are negligible (a few orders of magnitude smaller). Figure 4 shows the averaged confidences with error bars for each pairing symmetry. One can see that the identification of the correct pairing symmetry ends at around W = 1.5, 1.0, and 0.5 for the s-wave, d-wave, and nematic pairing symmetries, respectively. Near the critical disorder strengths of the s-wave (figure 4(a)) and nematic (figure 4(c)) pairing classes, an averaged confidence of d-wave symmetry emerges, i.e., there is a peak of the blue line at W = 1.0 and 0.5 in figures 4(a) and (c), respectively. When we inspect the detailed patterns for which the neural network reports a high confidence of d-wave symmetry in these regimes (e.g., the typical patterns shown in figures 3(e) and (k)), we find that these patterns do not display fourfold rotational symmetry. This inconsistency may be induced by overfitting, and might be resolved by using a more generic neural network model.
Two things need to be emphasized here. Firstly, in order to obtain a model which is applicable to disordered systems while trained only on clean samples (i.e., the training data and evaluation data are completely disjoint), a deep neural network with multiple convolutional layers is essential. We have tested some simple models with only a single hidden convolutional layer. We find that these simple models perform well on the original training data but break down when applied to the disordered systems. They perform badly even on weakly disordered patterns which are very close to the SP LDOSs of the clean systems. This demonstrates that pairing symmetry is not a superficial feature of the patterns. Another factor which makes the symmetry identification difficult is that the SP LDOSs are smooth functions. These patterns do not have sharp edges serving as characteristic features of the different pairing symmetries, which makes machine learning with few hidden layers perform badly on this problem. The second thing we need to emphasize is that we use the sigmoid function as the activation of the output layer. The confidences shown in figure 3 range from 0 to 1 and look almost evenly distributed. This does not mean that the symmetry is not well captured by the model: the confidences for the training patterns also range from 0 to 1 and look almost evenly distributed. This makes the neural network generic and applicable to disordered systems. We have also tried the widely used softmax function as the activation of the output layer. We find that, when the trained models are applied to identify the original data, the confidences for the correct pairing symmetry are close to one. However, they perform badly in identifying the symmetry of disordered systems.
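The contrast between the two output activations discussed above is easy to see numerically: softmax normalizes the class scores so they compete and sum to one, whereas independent sigmoids score each class on its own. A small numpy illustration with made-up logits:

```python
import numpy as np

def softmax(z):
    """Softmax: exponentiate and normalize, so outputs sum to one."""
    e = np.exp(z - np.max(z))   # shift for numerical stability
    return e / e.sum()

def sigmoid(z):
    """Sigmoid: each output lies in (0, 1) independently of the others."""
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([2.0, 1.5, -1.0, -2.0])   # hypothetical pre-activation scores
print(softmax(logits))   # normalized: one class is forced to dominate
print(sigmoid(logits))   # independent per-class confidences
```

Because softmax forces a winner even among ambiguous scores, a softmax-trained model can look confident on clean data yet generalize poorly to disordered patterns, consistent with the observation above.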
Due to the limitations of the SP LDOS, in some special situations the identification of the pairing symmetry should be complemented by other methods. For example, if the neural network predicts a high confidence of d-wave pairing, extended s-wave pairing cannot be excluded, and further investigations are recommended.

Conclusion
In this work, we establish an artificial neural network using the SP LDOS patterns near a magnetic impurity in clean superconductors as the training data. Three kinds of pairing symmetries, i.e., s-wave, d-wave, and nematic pairings, are investigated. We find that this deep neural network model trained on clean systems can be applied to predict pairing symmetries in disordered superconductors. Our work paves the way for future investigations of pairing symmetries in disordered superconductors using deep learning methods.