Adversarial network embedding on heterogeneous information networks

Network embedding has been proven helpful for solving real-world problems, and real-world networks are often heterogeneous information networks (HINs). In this paper, we propose a new adversarial framework for heterogeneous network embedding, namely AGNE-HIN. AGNE-HIN learns the latent code distribution of the network in a generative adversarial way. Moreover, to reduce the global smoothness of the embedded vectors caused by the GAN, we apply perturbations to the input to form adversarial data. Experimental results verify our design and demonstrate the effectiveness of the proposed method on node clustering, link prediction and similarity ranking tasks.


Model
In this paper, we employ a GAN [15] to learn node representations in HINs. As shown in Fig. 1, our model consists of two main parts: a generator and a discriminator. The two components compete with each other to achieve their best performance.

Fig. 1 The architecture of our model

Model details
Our generator samples nodes from a Gaussian distribution, and the generated fake nodes should be close to the nodes of the real data so as to confuse the discriminator. Specifically, given a node $u$ and a node $v$ connected to $u$ in the HIN, the generator tries to generate a fake node $v'$ that appears to be connected to the node $u$. In other words, the representation of the fake node $v'$ should be as similar as possible to that of the real node $v$.
The goal of the discriminator is to determine whether a sample node is actually connected to a given input node. Our discriminator can be expressed as follows:

$$D(w, u) = \sigma\!\left(e_w^{\top} e_u\right) = \frac{1}{1 + \exp\!\left(-e_w^{\top} e_u\right)},$$

where $w$ is a sample node (either a fake or a true node), and $e_w$ and $e_u$ are the embeddings of $w$ and $u$.

We define the adversarial perturbation $r_{adv}$ in the embedding space and set an $L_2$ norm constraint $\|r_{adv}\|_2 \le \epsilon$ according to the connectivity pattern of the node pair. The adversarial perturbation is then added to the input to form an adversarial example, which augments the original clean data, and the resulting mixed data are used to train our model. Before each training step, we first solve the optimization problem

$$r_{adv} = \arg\max_{\|r\|_2 \le \epsilon} L(x + r)$$

to obtain the perturbation corresponding to the input $x$ that is most disruptive to the current model. Because solving this problem exactly is computationally expensive, we approximate it with the fast gradient method, $r_{adv} \approx \epsilon \, g / \|g\|_2$ with $g = \nabla_x L(x)$. During training, the embedding vectors are optimized with stochastic gradient descent [14] against the designed perturbation.
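The following minimal PyTorch sketch illustrates this fast-gradient step. The function name, the default $\epsilon$, and the assumption that the input embedding `x` participates in a differentiable loss are ours, not details fixed by the model above.

```python
import torch

def fgm_perturbation(x: torch.Tensor, loss: torch.Tensor,
                     epsilon: float = 0.5) -> torch.Tensor:
    """Fast gradient method: r_adv ~= eps * g / ||g||_2, g = grad_x L(x).

    `x` must require gradients and appear in the graph of `loss`.
    """
    grad, = torch.autograd.grad(loss, x, retain_graph=True)
    # Project the gradient onto the L2 ball of radius epsilon.
    norm = grad.norm(p=2, dim=-1, keepdim=True).clamp_min(1e-12)
    return (epsilon * grad / norm).detach()

# Typical usage: augment the clean input with its adversarial example.
# r_adv = fgm_perturbation(x, clean_loss)
# adv_loss = compute_loss(x + r_adv)   # compute_loss is a hypothetical name
```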

Loss
The generator hopes that, even after the perturbation is added to the input, the embeddings it generates can still confuse the discriminator. In addition, we add a smoothness constraint so that the fake node's embedding $e_{v'}$ and the embedding $e_u$ of the given node $u$ have similar representations. The loss of our generator is therefore

$$L_G = -\,\mathbb{E}_{v' \sim G(\cdot \mid u)}\big[\log D(v', u)\big] + \lambda \,\|e_{v'} - e_u\|_2^2 .$$

For the discriminator, the loss consists of two parts. When the node $u$ and the node $v$ are indeed connected in the HIN, the discriminator hopes to assign a high score to the node $v$:

$$L_D^{real} = -\,\mathbb{E}_{(u, v) \in \mathcal{E}}\big[\log D(v, u)\big].$$

When a node $v'$ is presented as connected to $u$ but $v'$ is not in the HIN, then $v'$ is a fake node produced by the generator, and the discriminator hopes to assign a low score to the node $v'$:

$$L_D^{fake} = -\,\mathbb{E}_{v' \sim G(\cdot \mid u)}\big[\log\big(1 - D(v', u)\big)\big],$$

where $e_{v'}$ is the embedding of the fake node $v'$ used inside $D(v', u) = \sigma(e_{v'}^{\top} e_u)$, and the total discriminator loss is $L_D = L_D^{real} + L_D^{fake}$.
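As a sketch of these objectives, assuming the discriminator returns raw logits and using hypothetical names (`lam` for the smoothness weight $\lambda$), the losses could be implemented as:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(logit_real: torch.Tensor,
                       logit_fake: torch.Tensor) -> torch.Tensor:
    # Real pairs (u, v) should score high, generated pairs (u, v') low.
    real = F.binary_cross_entropy_with_logits(
        logit_real, torch.ones_like(logit_real))
    fake = F.binary_cross_entropy_with_logits(
        logit_fake, torch.zeros_like(logit_fake))
    return real + fake

def generator_loss(logit_fake: torch.Tensor, e_fake: torch.Tensor,
                   e_u: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    # Fool the discriminator, plus the smoothness term ||e_v' - e_u||^2.
    adv = F.binary_cross_entropy_with_logits(
        logit_fake, torch.ones_like(logit_fake))
    smooth = (e_fake - e_u).pow(2).sum(dim=-1).mean()
    return adv + lam * smooth
```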

Experimental Setups
In our experiments, we extract a subset of the DBLP dataset to construct the HIN. The criteria are that the subset must not be too small or too sparse; in other words, it should contain at least one edge per node per relation type. The HIN used in our experiments therefore contains 31975 nodes of three node types: authors, papers and conferences. The complete DBLP dataset is available online. Besides, the HIN has 627974 edges, including 473480 paper-author edges and 154494 paper-conference edges.
For AGNE-HIN's training, in each iteration we first fix the parameters of the generator G, generate fake nodes from G, and use them to update the parameters of the discriminator, so as to improve the discriminator. After updating the discriminator, we fix its parameters and update the generator. We repeat these steps until the model converges, as sketched below.
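A compact sketch of this alternating schedule, reusing the loss helpers sketched in the previous section. `G`, `D`, the optimizers, the edge sampler, and the inner step counts `n_d`/`n_g` are hypothetical placeholders: we assume `G.sample(u)` returns fake-node embeddings, `G.embed(u)`/`D.embed(v)` return node embeddings, and `D(u, e)` returns a logit.

```python
def train(G, D, opt_g, opt_d, sample_real_edges,
          num_epochs=30, n_d=5, n_g=1, batch_size=64):
    for _ in range(num_epochs):
        # 1) Improve the discriminator with the generator fixed.
        for _ in range(n_d):
            u, v = sample_real_edges(batch_size)   # real connected pairs
            e_fake = G.sample(u).detach()          # .detach() keeps G fixed
            loss_d = discriminator_loss(D(u, D.embed(v)), D(u, e_fake))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        # 2) Improve the generator with the discriminator fixed
        #    (gradients reach D, but only opt_g updates parameters).
        for _ in range(n_g):
            u, _ = sample_real_edges(batch_size)
            e_fake = G.sample(u)                   # gradients flow into G
            loss_g = generator_loss(D(u, e_fake), e_fake, G.embed(u))
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```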

Node clustering
The node clustering experiment aggregates all nodes into several classes according to the characteristics of the different node types, so we use it to evaluate our model's ability to learn node characteristics. In this task, we cluster the paper nodes and use the research field associated with each node as its label; the paper nodes fall into seven categories. We use the K-means algorithm to cluster the data and evaluate the clustering results with normalized mutual information (NMI) [15], where a higher value indicates better node representations. Each clustering experiment is run 10 times, and the average performance is reported in Table 1. From Table 1 we can conclude that, with respect to this metric, the proposed AGNE-HIN model achieves the best performance among all methods, demonstrating its effectiveness in learning node representations of HINs. The gap between metapath2vec and the GAN-based embedding methods (HeGAN and AGNE-HIN) demonstrates the superiority of the adversarial principle, and AGNE-HIN's advantage over HeGAN shows the effectiveness and importance of perturbations in learning robust node representations.
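A minimal sketch of this evaluation protocol with scikit-learn; the function name and its defaults (seven clusters, ten runs) mirror the setup above, while the embedding and label arrays are assumed inputs:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

def clustering_nmi(embeddings: np.ndarray, labels: np.ndarray,
                   n_clusters: int = 7, runs: int = 10) -> float:
    """Cluster paper embeddings with K-means and report the mean NMI
    over several runs, following the protocol described above."""
    scores = []
    for seed in range(runs):
        pred = KMeans(n_clusters=n_clusters, random_state=seed,
                      n_init=10).fit_predict(embeddings)
        scores.append(normalized_mutual_info_score(labels, pred))
    return float(np.mean(scores))
```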

Link Prediction
Link prediction [16] is used to predict possible links between two types of nodes, so we use it to evaluate our model's ability to learn the relationships between two node types. In this task, we predict the paper-author links in DBLP. We randomly hide 20% of such links from the original network and use them as a test set to verify performance; the remaining network is used to learn the node representations.
Following the evaluation criteria commonly used in similar tasks, we use accuracy as the evaluation criterion in our experiment; a higher value indicates better performance. We performed ten experiments and report the average as the final score. The details of our experimental results on link prediction are shown in Table 2. The results show that AGNE-HIN achieves the best performance among all methods; for example, it improves accuracy by 2.9% over HeGAN and by 7.5% over metapath2vec. By integrating effective perturbations into the GAN, our model learns the potential relationships between different types of nodes and can thus predict unseen links.
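A small sketch of how held-out links could be scored, assuming learned embeddings stored in a dict, a sigmoid inner-product score, a set of sampled non-links as negatives, and a 0.5 decision threshold; all of these choices are our assumptions rather than details fixed above:

```python
import numpy as np

def link_accuracy(emb: dict, test_pos: list, test_neg: list,
                  threshold: float = 0.5) -> float:
    """Accuracy over held-out paper-author pairs: a pair counts as a
    predicted link if the sigmoid of the embedding inner product
    exceeds the threshold."""
    def score(p, a):
        return 1.0 / (1.0 + np.exp(-np.dot(emb[p], emb[a])))
    correct = sum(score(p, a) >= threshold for p, a in test_pos)
    correct += sum(score(p, a) < threshold for p, a in test_neg)
    return correct / (len(test_pos) + len(test_neg))
```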

Similarity Ranking
We perform the similarity ranking experiment to verify the effectiveness of the proposed AGNE-HIN. This experiment can illustrate the similarity between nodes of the same type.
First of all, we select three conferences, namely "FOCS'', "TREC'' and "ER'', and for each of them we take the 10 most relevant conferences as our benchmark for this task; these data are shown in Table 3. It should be noted that the correlation between two conferences is determined by the number of co-authors they share: the more co-authors two conferences share, the stronger their correlation. Then, we use "FOCS'', "TREC'' and "ER'' as query nodes, and we use cosine similarity to compute the correlations between each query node and its corresponding candidate nodes in Table 3, as sketched after this paragraph. The resulting rankings for the query nodes are listed in Table 4. From Table 4, we can see that for the node "ER'', metapath2vec places up to four conferences ("CAISE'', "DEXA'', "CIKM'' and "ICEIS'') in the same relative order as the benchmark. HeGAN places up to five conferences ("CAISE'', "DEXA'', "SIGMOD'', "DASFAA'' and "ICEIS'') in the same relative order as the benchmark, and the proposed AGNE-HIN likewise places up to five conferences ("CAISE'', "DEXA'', "VLDB'', "DASFAA'' and "ICEIS'') in the same relative order as the benchmark. The same analysis applies to the other query nodes; the matching relative orders are shown in bold in Table 4. Moreover, for each query node, the top three conferences returned by the proposed AGNE-HIN coincide with the top three of the corresponding benchmark, though possibly in a different order.
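For concreteness, the similarity computation behind these rankings can be sketched as follows; the names are hypothetical, with `emb` assumed to map a conference name to its learned embedding vector:

```python
import numpy as np

def rank_by_cosine(query: str, candidates: list, emb: dict, k: int = 10):
    """Return the k candidates most similar to the query node under
    cosine similarity of their learned embeddings."""
    q = emb[query] / np.linalg.norm(emb[query])
    sims = [(c, float(np.dot(q, emb[c] / np.linalg.norm(emb[c]))))
            for c in candidates if c != query]
    return sorted(sims, key=lambda t: t[1], reverse=True)[:k]

# Usage: rank_by_cosine("FOCS", all_conferences, emb)
```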
The proposed AGNE-HIN can learn the features between different node types, and for nodes of the same type it is able to capture the commonalities between them, thus obtaining high-quality node representations.

Conclusion
In this paper, we propose a novel adversarially regularized framework, namely AGNE-HIN, to generate high-quality node representations. AGNE-HIN learns the potential data distribution in a generative adversarial way. Moreover, we integrate perturbations into our model to form adversarial data, which reduces the global smoothness of the embedded vectors caused by the GAN. Through these operations, we achieve model robustness and better generalization performance. We test our model on three tasks: node clustering, link prediction and similarity ranking. Experimental results show that our model outperforms the baselines.