Optimization of routing strategies for data transfer in peer-to-peer networks

As peer-to-peer file-sharing systems have become widespread, the information traffic in these networks has been increasing, which causes various traffic problems. In this paper, we model some features of peer-to-peer networks and investigate these traffic problems. Peer-to-peer networks have two notable characteristics. One is that each peer frequently searches for a file and downloads it from a peer that has the requested file. To decide whether a peer has the requested file when modelling the search and download process, we introduce a file-parameter Pj, which expresses the amount of files stored in peer j. It is assumed that if Pj is large, peer j has many files and can meet other peers' requests with high probability. The other characteristic is that peers leave and join the network repeatedly. Many researchers have addressed traffic problems of data transfer in computer communication networks. To our knowledge, however, no reports focus on such problems in peer-to-peer networks whose topology changes with time. In ordinary computer networks, data are generally transferred along the shortest paths. In this paper, we introduce a new optimal routing strategy that uses weights assigned to peers to avoid traffic congestion. We find that the new routing strategy is superior to the shortest path strategy in terms of congestion frequency in data transfer.


Introduction
Recently, studies on networks have been extended to many fields, such as human social networks, computer networks, and neural networks. In particular, complex networks, which have remarkable properties such as the small-world property and a scale-free degree distribution, are attracting a great deal of attention [1,2,3]. Many studies have addressed information traffic congestion in networks. Many users send data to other users through computer networks, and traffic congestion occurs. Efficient strategies for reducing traffic congestion have been considered in recent years [4].
In this paper, we focus on peer-to-peer file-sharing networks [5,6,7] built on computer networks. Since applications for peer-to-peer networks, such as Napster, Gnutella, and Winny, have become popular, the amount of traffic in peer-to-peer networks is increasing yearly and has become a major part of the total traffic on the Internet.
Therefore, avoiding traffic congestion in peer-to-peer networks is an important problem. Although many papers discuss ways to reduce traffic congestion on the Internet, to our knowledge none focuses on peer-to-peer networks. Thus, focusing on the characteristic aspects of peer-to-peer networks, we address traffic congestion on these networks. The structure of peer-to-peer networks is quite different from that of ordinary computer networks. Ordinary computer networks have main computers called "central servers", whereas in peer-to-peer networks all computers can be servers or clients.

Search for files and download of them
In peer-to-peer networks, each computer is called a "peer". A peer forwards a query for a file to its neighbors, and the neighbors forward the query to their neighbors. This process is repeated until a specified threshold TTL (Time-To-Live) is reached. Here, TTL is the maximum number of links that a search query may traverse. Queries are discarded once the number of hops exceeds TTL. We assume that a peer can search for files in its searchable domain, which consists of all peers within a distance of TTL from the peer. A peer can search for a file and get the data of the file from a "host peer", which has the requested file and lies in the searchable domain. Conversely, the peer can be a host peer for other peers if it lies in their searchable domains. When a host peer sends the data of a file, it is desirable to control the routing path so as to reduce traffic congestion.
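The TTL-limited search described above can be sketched as a breadth-first search that stops expanding once a query has travelled TTL links. This is only a minimal illustration: the function names `searchable_domain` and `find_host_peers`, the adjacency-dictionary representation, and the set `holders` of peers that store the requested file are assumptions made for the example and are not taken from the model itself.

```python
from collections import deque


def searchable_domain(adjacency, source, ttl):
    """Return all peers within `ttl` links of `source` (excluding `source`)."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        peer = queue.popleft()
        if distance[peer] == ttl:
            continue  # the query is not forwarded beyond TTL links
        for neighbor in adjacency[peer]:
            if neighbor not in distance:
                distance[neighbor] = distance[peer] + 1
                queue.append(neighbor)
    return set(distance) - {source}


def find_host_peers(adjacency, holders, source, ttl):
    """Return the peers in the searchable domain that hold the file.

    `holders` is a hypothetical set of peers that store the requested file.
    """
    return sorted(p for p in searchable_domain(adjacency, source, ttl) if p in holders)


if __name__ == "__main__":
    # A chain of five peers: 0 - 1 - 2 - 3 - 4
    chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(find_host_peers(chain, {2, 4}, source=0, ttl=2))
```

With TTL = 2, peer 0 reaches only peers 1 and 2, so peer 4 cannot act as a host peer for it even though it holds the file.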

Leave and join in the peer-to-peer networks
The other characteristic of peer-to-peer networks is that peers leave and join the network by switching the peer-to-peer application off or on. Peers leave a peer-to-peer network because they have obtained the files they wanted and no longer need to stay connected, or because their PCs have gone down. On the other hand, peers join the network in order to get further files.

Model
We model a peer-to-peer file-sharing network in which peers leave and join the network. Peer j has a file-parameter Pj, which expresses the amount of stored files. The distribution of Pj is given following [3], and Pj is normalized so that its maximum value is one. In peer-to-peer networks, some peers principally supply files to other peers. Empirically, these peers rarely leave the network. We call such peers "file suppliers", whereas peers that hardly supply files and quite often leave the network soon after joining are called "free riders". Therefore, we assume that the fewer files a peer has, the more often it leaves the network.
On the other hand, peers newly joining the network need information about peers that are already in the network in order to connect to them. Since it is difficult for new peers to find this information by themselves, the administrator of the peer-to-peer application provides information about peers. The administrator tends to select nodes that have been in the network for a long time, because he or she hopes to keep the network stable. Thus, when peers join the network, they often connect to the selected peers, which are file suppliers, that is, peers with many files. To sum up, peers join and leave the network according to the following procedure.
1. Some peers leave the network. The smaller Pj is, the more often peer j leaves.
2. Because of the leaves, peers whose degree is 0 often appear in the network. Each such peer therefore adds a link to a randomly selected peer whose degree is nonzero.
3. The peers that left join the network again. The larger Pj is, the more often a rejoining peer connects a link to peer j. The number of neighbors of a rejoining peer is chosen at random from 3 to 7.
We obtain good agreement between theoretical and simulation results.
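The leave-and-rejoin procedure of this section can be sketched as follows. The text only states the direction of the dependence on Pj, so the concrete forms used here are assumptions for illustration: a leaving weight of 1 - Pj, an attachment weight of Pj, and a uniform choice of 3 to 7 neighbors. The function names `weighted_distinct` and `leave_and_rejoin` and the small constant `eps` are likewise hypothetical.

```python
import random


def weighted_distinct(rng, items, weights, k):
    """Draw up to k distinct items; each draw is proportional to its weight."""
    chosen = set()
    k = min(k, len(items))
    while len(chosen) < k:
        chosen.add(rng.choices(items, weights=weights)[0])
    return chosen


def leave_and_rejoin(adjacency, P, n_leave, rng, eps=1e-9):
    """Apply one leave/rejoin cycle in place to {peer: set(neighbors)}.

    Returns the set of peers that left and rejoined.
    """
    peers = sorted(adjacency)
    # Step 1: peers with small P[j] leave more often (assumed weight 1 - P[j]).
    leaving = weighted_distinct(rng, peers, [1.0 - P[j] + eps for j in peers], n_leave)
    for j in leaving:
        for k in adjacency[j]:
            adjacency[k].discard(j)
        adjacency[j] = set()
    remaining = [j for j in peers if j not in leaving]
    # Step 2: each isolated remaining peer links to a random peer of nonzero degree.
    connected = [j for j in remaining if adjacency[j]]
    for j in remaining:
        if not adjacency[j] and connected:
            k = rng.choice(connected)
            adjacency[j].add(k)
            adjacency[k].add(j)
    # Step 3: leavers rejoin with 3 to 7 links, preferring peers with large P[k].
    in_network = list(remaining)
    for j in sorted(leaving):
        m = rng.randint(3, 7)
        weights = [P[k] + eps for k in in_network]
        for k in weighted_distinct(rng, in_network, weights, m):
            adjacency[j].add(k)
            adjacency[k].add(j)
        in_network.append(j)
    return leaving


if __name__ == "__main__":
    rng = random.Random(0)
    n = 30
    ring = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    P = {i: (i + 1) / n for i in range(n)}  # normalized so that max(P) is one
    left = leave_and_rejoin(ring, P, n_leave=5, rng=rng)
    print(sorted(left), [len(ring[j]) for j in sorted(left)])
```

Repeated draws are a simple way to sample distinct peers, but they become slow when nearly all candidates must be selected; a production simulation would use a proper weighted sampling scheme without replacement.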

Betweenness and routing strategy
First, a peer A searches for the files it wants by forwarding a query to its neighbor peers. This search method is called "flooding". After the search, if the file that peer A wants exists in its searchable domain, peer A downloads the file from a peer B in the domain that has it. If a lot of data concentrates on a node, the node cannot transfer the data to the next node immediately, and sending files takes a long time. Therefore, we need to disperse the routing paths used for transfer.

Betweenness
We assume that a path is defined for every pair of nodes in the network according to some rule, for example, choosing the shortest path. The betweenness of a peer is defined as a measure of how strongly these optimal paths concentrate at the peer.
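As a rough illustration of this definition, the sketch below picks one path per pair of peers and counts, for each peer, how many of these paths pass through it. Several choices are assumptions rather than part of the definition in the text: a single breadth-first shortest path is used per pair (ties broken by peer label), endpoints are not counted, and the names `bfs_path` and `betweenness` are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_path(adjacency, source, target):
    """Return one shortest path from source to target, or None if unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    if target not in parent:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def betweenness(adjacency, path_rule=bfs_path):
    """Count, for every peer, the chosen paths that pass through it."""
    counts = {node: 0 for node in adjacency}
    for source, target in combinations(sorted(adjacency), 2):
        path = path_rule(adjacency, source, target)
        if path is None:
            continue
        for node in path[1:-1]:  # intermediate peers only (assumption)
            counts[node] += 1
    return counts


if __name__ == "__main__":
    # A star: peer 0 is the hub, peers 1..4 are leaves.
    star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
    counts = betweenness(star)
    print(counts, max(counts.values()))
```

In the star example every leaf-to-leaf path crosses the hub, so the hub carries all six paths and the leaves carry none. Passing a different `path_rule` lets the same counting be reused for other routing rules.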

Routing Strategy
First, we introduce the shortest path strategy. Routing strategies used on computer networks have been determined solely on the basis of shortest paths. However, under the shortest path strategy, traffic concentrates on peers with very large degrees, so the largest betweenness among all peers, the maximum betweenness B_max, becomes large. To find the optimal (smallest) B_max, we need to disperse the routing paths. For this purpose, we employ the smallest weight path algorithm, which chooses the path with the smallest sum of the weights assigned to the peers along the path. We show B_max for both the shortest path strategy and the new strategy, which uses the smallest weight path algorithm.
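A smallest weight path can be found with Dijkstra's algorithm applied to peer weights instead of link lengths. The sketch below is illustrative only: the text does not specify how the weights are assigned, so the example simply uses the degree of each peer as a hypothetical weight, and the function name `smallest_weight_path` is an assumption. Weights must be nonnegative for Dijkstra's algorithm to be valid.

```python
import heapq


def smallest_weight_path(adjacency, weight, source, target):
    """Return (path, cost) minimizing the sum of peer weights along the path.

    The cost includes the weights of both endpoints.
    """
    best = {source: weight[source]}
    parent = {source: None}
    heap = [(weight[source], source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best[node]:
            continue  # stale heap entry
        for neighbor in adjacency[node]:
            new_cost = cost + weight[neighbor]
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if target not in parent:
        return None, float("inf")
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1], best[target]


if __name__ == "__main__":
    # Peer 1 is a hub of degree 6; peers 2 and 3 form a longer detour.
    graph = {
        0: {1, 2}, 1: {0, 4, 5, 6, 7, 8}, 2: {0, 3}, 3: {2, 4},
        4: {1, 3}, 5: {1}, 6: {1}, 7: {1}, 8: {1},
    }
    degree = {node: len(neighbors) for node, neighbors in graph.items()}
    uniform = {node: 1 for node in graph}
    print(smallest_weight_path(graph, uniform, 0, 4))  # shortest path strategy
    print(smallest_weight_path(graph, degree, 0, 4))   # degree-weighted strategy
```

With uniform weights the algorithm reduces to the shortest path strategy and routes through the hub; with degree weights it takes the detour 0, 2, 3, 4, which is one link longer but avoids the heavily connected peer.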

Calculation of B_max(t)
In Fig. 1, we show the maximum betweenness B_max(t) for both the shortest path algorithm and the smallest weight path algorithm. We find that the smallest weight path algorithm is superior to the shortest path algorithm because its B_max is consistently lower than that of the shortest path algorithm. We perform these four procedures with several peers simultaneously. Performing data transfer with many peers leads to traffic congestion. In the next section, we define congestion in data transfer.

Congestion
During data transfer, we assume that every peer can send two different data items to its neighbors at once. If a peer receives three different data items at the same step, it can send only two of them at the next step, and consequently one data item is held at the peer. We define this phenomenon as congestion.
Table 1. The number of leaving peers is 50. Each value is the number of times traffic congestion occurs.
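The congestion rule defined in this section can be sketched as a step-by-step simulation in which every peer forwards at most two data items per step. The details beyond that rule are assumptions for illustration: each transfer follows a fixed path, items are served in the order of their index, and one congestion event is counted for each peer and step at which at least one item has to wait. The function name `count_congestion` is hypothetical.

```python
from collections import defaultdict


def count_congestion(paths, capacity=2):
    """Count congestion events when data items follow the given paths.

    Each path is a list of peers from the host peer to the requesting peer.
    """
    position = [0] * len(paths)  # index of the peer currently holding each item
    congestion = 0
    while True:
        # Group the items still in transit by the peer that holds them.
        holding = defaultdict(list)
        for item, path in enumerate(paths):
            if position[item] < len(path) - 1:
                holding[path[position[item]]].append(item)
        if not holding:
            return congestion
        for peer, items in holding.items():
            if len(items) > capacity:
                congestion += 1  # at least one item is held at this peer
            for item in items[:capacity]:
                position[item] += 1  # forwarded to the next peer on its path


if __name__ == "__main__":
    # Three transfers that all pass through peer "c" at the same step.
    paths = [["a", "c", "d"], ["b", "c", "d"], ["e", "c", "d"]]
    print(count_congestion(paths))
```

In the example, peer c receives three data items at the same step and can forward only two of them, so exactly one congestion event is counted.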

Simulation
When a leaving peer lies on the path of a data transfer, the transfer path is changed so that the data can still reach the peer that requested the file whenever possible.
From Table 1, we find that traffic congestion occurs fewer times with the smallest weight path than with the shortest path.