Routing in Networks on Chip with Multiplicative Circulant Topology

The development of multi-core processor systems is an in-demand branch of science and technology. The appearance of processors with dozens and hundreds of cores confronts developers with the question of choosing an optimal topology capable of providing efficient routing in a network with a large number of nodes. In this paper, we consider the possibility of using multiplicative circulants as a topology for networks-on-chip. A specialized routing algorithm for networks with multiplicative circulant topology, which takes the topology features into account and is highly scalable, has been developed.


Introduction
Currently, one of the most important areas of research in the field of computer science and computing systems is the construction of multi-core processors. The transition to multi-core processors makes it possible to overcome the performance degradation encountered in the design of increasingly complex single-core systems [1]. With growing interest in Systems on Chip (SoCs) and Multi-Processor Systems on Chip (MPSoCs) [2], Networks-on-Chip (NoCs) [3] are becoming widespread. In a multi-core processor with a small number of cores (2-8 cores), communication between IP-cores occurs via a common bus, which is not capable of ensuring communication between a large number of cores [1]. The scalability problem can be solved by NoC technology, which replaces bus architectures.
One of the most urgent problems in the study of NoCs is the search for optimal topologies, since standard regular topologies (mesh [4], torus [5], hypercube [6], spidergon [7]) do not meet modern requirements for NoCs, especially with an increase in the number of nodes [8].
Circulant topologies have better characteristics than standard ones: they have better indicators of structural survivability, reliability, and connectivity, and also require fewer inter-processor exchanges in solving computing tasks and system management tasks [9]. This allows them to be used in networks with a large number of nodes [10,11]. To use circulants as a topology for NoCs, it is necessary to develop routing algorithms for them that take into account the features of these families of graphs and the organization of NoCs; this determines the relevance of this work.

Multiplicative circulants
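Let us give the definition of a circulant graph. Let $s_1, s_2, \ldots, s_k, n$ be integers such that $1 \le s_1 < s_2 < \cdots < s_k < n$. The graph $C(n; s_1, s_2, \ldots, s_k)$ with the set of vertices $V = \{0, 1, \ldots, n-1\}$ and the set of edges $E = \{(i, (i \pm s_l) \bmod n) : i \in V,\ l = 1, \ldots, k\}$ is called a circulant [12,13]; the numbers $S = (s_1, s_2, \ldots, s_k)$ are its generatrices; the numbers $k$ and $n$ are the dimension and the order of the graph [14].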
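As an illustration of the definition, the neighbours of a vertex can be enumerated directly from the generatrices. The following Python sketch is not taken from the original work; the function name `circulant_neighbors` and the example graph $C(64; 1, 4, 16)$ are assumptions chosen for illustration.

```python
def circulant_neighbors(n, generators, v):
    """Return the sorted neighbours of vertex v in the circulant C(n; generators).

    Vertex v is joined to (v + s) mod n and (v - s) mod n for every
    generatrix s. A set is used because +s and -s coincide when 2 * s == n.
    """
    neighbors = set()
    for s in generators:
        neighbors.add((v + s) % n)
        neighbors.add((v - s) % n)
    return sorted(neighbors)


# Hypothetical example: vertex 5 of C(64; 1, 4, 16) has six neighbours.
print(circulant_neighbors(64, [1, 4, 16], 5))
```

When the longest generatrix equals half the order (for example $C(8; 1, 2, 4)$), the two directions lead to the same vertex, so the vertex degree drops by one.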
Circulant graphs of the form $C(s^k; 1, s, s^2, \ldots, s^{k-1})$ with integer base $s$ and dimension $k$ (Figure 1) are considered to be a separate class, the multiplicative circulants [15]. The most important characteristics of a graph are its diameter and average distance. The diameter of a graph is the maximum possible distance (the length of the shortest path) between two vertices.

$$D(C) = \max_{u, v \in V} d(u, v),$$
where $d(u, v)$ is the length of the shortest path from vertex $u$ to vertex $v$. In a number of works [15,16], attempts have been made to find the most accurate estimates of the diameter and the average distance for multiplicative circulants.
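For small graphs, the diameter and the average distance can be obtained numerically by breadth-first search instead of closed-form estimates. The sketch below is an illustration only: the function names are hypothetical, and it assumes that the average distance is taken over the $n - 1$ vertices other than the source, which may differ from the normalization used in the cited works.

```python
from collections import deque


def distances_from(n, generators, source=0):
    """Breadth-first search distances from `source` in C(n; generators)."""
    dist = [-1] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for s in generators:
            for w in ((v + s) % n, (v - s) % n):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
    return dist


def diameter_and_average(n, generators):
    """Diameter and average distance of a connected circulant.

    A circulant is vertex-transitive, so distances from vertex 0 suffice.
    Assumption: the average is taken over the n - 1 other vertices.
    """
    dist = distances_from(n, generators)
    return max(dist), sum(dist) / (n - 1)


# Hypothetical example: the multiplicative circulant C(9; 1, 3).
print(diameter_and_average(9, [1, 3]))
```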
More accurate estimates of the diameter (1) and mean distance (2) for the class of multiplicative circulants $C(s^k; 1, s, \ldots, s^{k-1})$ are given in [17]. A comparison of the main characteristics of topologies based on multiplicative circulants (1-2) and a mesh topology with the same number of nodes is presented in Table 1. To calculate the diameter and average distance of the mesh topology, formulas (3-4) [8,18] are used.
From Table 1 it follows that even with a small network size, the circulant topology performs better than the mesh topology in all characteristics. Thus, multiplicative circulants are well suited to designing NoCs with a large number of cores. However, it should be noted that this topology offers a limited number of options: the number of nodes must be exactly an integer power, $n = s^k$. Otherwise, the circulant becomes a recursive one with other properties and characteristics [19].

Development of the packet structure for static routing in networks with topology based on multiplicative circulants
Circulant networks use pairwise routing [14], in which a packet is sent from a source node to a receiver node. For pairwise routing in NoCs with a multiplicative circulant topology, it is possible to use standard shortest path search algorithms, for example, breadth-first search (BFS) [20,21]. It is suggested to choose a static type of routing in which each router has an adjacency list for each node. Each router knows its own serial number, i.e. the number of the node whose incoming packets it distributes.
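A shortest path search of this kind can be sketched in Python as follows. This is a generic breadth-first search over the circulant, not the authors' router implementation; the function name `bfs_path` and the example graph $C(64; 1, 4, 16)$ are assumptions. When several shortest paths exist, the one returned depends on the order in which neighbours are visited.

```python
from collections import deque


def bfs_path(n, generators, source, target):
    """Return one shortest path (list of node numbers) in C(n; generators).

    Assumes the graph is connected, which holds when 1 is a generatrix.
    """
    parent = {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            break
        for s in generators:
            for w in ((v + s) % n, (v - s) % n):
                if w not in parent:
                    parent[w] = v
                    queue.append(w)
    # Walk back from the target to the source, then reverse.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Hypothetical example: a two-hop route from node 5 to node 17.
print(bfs_path(64, [1, 4, 16], 5, 17))
```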
The destination node number (receiver node) is input to the router. The router calculates the shortest path using breadth-first search. Then the router rewrites the path in reverse order and replaces the node numbers with the numbers of the ports through which the packet is to be sent. A port is a connection between the current node and another node. Each node has $2k$ ports: 2 ports for each generatrix length from $S$ (except for circulants $C(2^k; 1, 2, \ldots, 2^{k-1})$, where there is one port fewer). Thus, for the circulant $C(64; 1, 4, 16)$ (Figure 1b) the path 5 → 21 → 17 is converted to 17 ← 21 ← 5, then to the sequence of actions -(4), +(16), and then to 2|1, or 010|001 in binary code, if the rule for determining the port numbers is set as follows: -(16 … Replacing node numbers with port numbers reduces the size of the part of the packet allocated for storing the path. Each node reads the last k bits, shifts the path by the same number of bits to the right, and passes the packet on. The packet reaches the destination when the entire path is filled with zeros.
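The packing and hop-by-hop consumption of the port sequence can be sketched as follows. The port numbering used here is hypothetical: only the assignments +16 → 1 and -4 → 2 follow from the example 2|1 above, while the other four values, the names `PORT_OF_STEP`, `encode_path` and `forward`, and the choice of $C(64; 1, 4, 16)$ with 3 bits per hop are assumptions made for illustration.

```python
PORT_BITS = 3  # bits per hop: six ports plus the value 0 for "no more hops"

# Hypothetical port numbering. Only +16 -> 1 and -4 -> 2 follow from the
# example in the text; the remaining assignments are assumptions.
PORT_OF_STEP = {+16: 1, -4: 2, +4: 3, -16: 4, +1: 5, -1: 6}
STEP_OF_PORT = {port: step for step, port in PORT_OF_STEP.items()}


def encode_path(path, n):
    """Pack a node path into an integer, with the first hop in the lowest bits."""
    word = 0
    # Iterate over the hops in reverse so that the first hop ends up lowest.
    for a, b in reversed(list(zip(path, path[1:]))):
        step = (b - a) % n
        if step > n // 2:
            step -= n  # map to a signed step such as -4
        word = (word << PORT_BITS) | PORT_OF_STEP[step]
    return word


def forward(node, word, n):
    """Simulate the routers: read the low bits, shift right, move on.

    The packet has arrived when the remaining path word is zero.
    """
    visited = [node]
    mask = (1 << PORT_BITS) - 1
    while word:
        port = word & mask
        word >>= PORT_BITS
        node = (node + STEP_OF_PORT[port]) % n
        visited.append(node)
    return visited


word = encode_path([5, 21, 17], 64)
print(format(word, "06b"), forward(5, word, 64))
```

Storing the first hop in the lowest bits is what allows every router to use the same read-and-shift logic regardless of its position on the path.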
This algorithm is universal and is suitable for networks built on the basis of any multiplicative circulants with different $s$ and $k$. The problem with this algorithm is that, as the number of nodes and connections increases, its running time increases significantly as well. The size of the address part of the packet also increases, so large NoCs require a specialized algorithm optimized for this type of topology.

Development of a specialized routing algorithm for NoCs based on the multiplicative circulant topology
A review of routing algorithms used in NoCs [22] shows that the XY algorithm [23] is the most common in networks with a mesh-like topology. Multiplicative circulants also have a strict, previously known geometric form. Therefore, taking into account the distinctive feature of this topology, namely that the lengths of the generatrices are powers of one base, it is possible to propose a specialized routing algorithm that simplifies the structure of the address part of the packet and reduces its size.
The address field stores the number of the destination node; the field size in bits can be calculated by the formula $\lceil \log_2 N \rceil$, where $N$ is the number of nodes in the network. On receiving the number of the node to which the packet is to be delivered, the router of the current node does not calculate the whole path, but only the next step. To calculate the next step, it is enough for the router to know its own number, the destination node number, and the circulant characteristics $s$ and $k$.
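The size of such an address field can be computed with integer arithmetic, as in the short sketch below. The function name `address_field_bits` is hypothetical; the only assumption is that node numbers run from 0 to the number of nodes minus one.

```python
def address_field_bits(num_nodes):
    """Bits needed to store a destination number in 0 .. num_nodes - 1.

    Equals ceil(log2(num_nodes)) for num_nodes >= 2, computed with integer
    arithmetic to avoid floating-point rounding.
    """
    return (num_nodes - 1).bit_length()


# Hypothetical example sizes: 64 = 4**3 nodes and 125 = 5**3 nodes.
print(address_field_bits(64), address_field_bits(125))
```

For a 64-node network this gives 6 bits regardless of the path length, whereas a stored port sequence with 3 bits per hop already occupies 6 bits for a two-hop path.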
Thus, the total size of the data stored by the network routers, in bits, can be calculated by a formula with the following terms, where $N$ is the number of nodes in the network: the amount of memory (bits) required to store the number of routers in the network; the amount of memory (bits) required to store the network router number; the amount of memory (bits) required to store the array of generatrices; and the amount of memory (bits) required to store the indices of the generatrices, the primary port of the router, and the flag of the primary port.
Because the circulant topology is cyclic, the algorithm is formulated relative to node zero. To do this, before starting work, the destination node number is recalculated relative to the current node number: if the destination node number is greater than the current node number, the recalculated destination node number is equal to the difference of the two numbers. If the destination node number is less than the current node number, the resulting difference is subtracted from the number of nodes in the network to allow for the transition through node zero. Then the algorithm determines in which direction it is better to start moving: to the left or to the right. Since the circulant is symmetrical, it is sufficient to consider one half of the circulant once the direction of motion is fixed.
The step is determined by the length of the generatrix closest to the destination node number. This makes it possible to take into account the cases in which it is more advantageous to overstep the destination node and then go back. To do this, the generatrix with the maximum length not exceeding the destination node number and the next longer generatrix are chosen. Of the two generatrices, the one with the smallest absolute difference between its length and the destination node number is selected. The packet reaches the destination node when the current node number is equal to the node number specified in the address part of the packet. Thus, for routing, the address part of the packet must provide $\lceil \log_2 N \rceil$ bits for storing the destination node number.
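The next-step rule described above can be sketched in Python as follows. This is a reading of the textual description, not the authors' Verilog implementation: the names `next_hop` and `route` are hypothetical, ties between the two candidate generatrices are resolved in favour of the shorter one by assumption, and no claim is made that the sketch always yields a shortest path.

```python
def next_hop(current, destination, s, k):
    """Next node towards `destination` in C(s**k; 1, s, ..., s**(k-1)).

    Returns `current` itself when the packet has already arrived.
    """
    n = s ** k
    offset = (destination - current) % n  # destination as seen from node zero
    if offset == 0:
        return current
    # Choose the shorter way round: +1 is "right", -1 is "left".
    if offset <= n // 2:
        direction, distance = 1, offset
    else:
        direction, distance = -1, n - offset
    # Longest generatrix not exceeding the distance, and the next longer one.
    low = 1
    while low * s <= distance and low * s < n:
        low *= s
    high = low * s
    # Assumption: on a tie the shorter generatrix is preferred.
    if high < n and abs(high - distance) < abs(low - distance):
        step = high
    else:
        step = low
    return (current + direction * step) % n


def route(source, destination, s, k):
    """Apply next_hop repeatedly and return the list of visited nodes."""
    path = [source]
    while path[-1] != destination:
        path.append(next_hop(path[-1], destination, s, k))
    return path


# Hypothetical example in C(64; 1, 4, 16): overstep with +16, then go back by 4.
print(route(5, 17, 4, 3))
```

Each router evaluates only `next_hop`, so the packet carries just the destination number; the remaining distance shrinks at every step, which guarantees that the loop in `route` terminates.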
The algorithm developed specifically for multiplicative circulants requires significantly less time for calculations (Table 2) and is easily scaled to larger networks. This algorithm has linear complexity, in contrast to the breadth-first search algorithm, whose path search time depends exponentially on the graph dimension. The developed algorithm follows an approach similar to the routing algorithm proposed independently for recursive circulants in [19]. Because the algorithm obtained in this work was developed on the basis of the structural features of multiplicative circulants, which are a particular case of recursive circulants, it has a lower computational complexity, but it can only be used for this class of circulants.

Approbation of algorithm operation
Testing of the developed algorithms was performed on a Cyclone V 5CGXFC9A6U19I7 FPGA from Intel FPGA (Altera). The routers were described in Verilog [24]. Testing [25] was conducted in two stages. At the first stage, data on the occupied chip resources were obtained for one router and for the network as a whole, for the networks considered earlier (Figure 3a, Figure 3b).
Based on the data obtained, it can be concluded that the major cause of the increase in the chip resources occupied by a router is the number of generatrices of the multiplicative network. As the number of generatrices in the structure of the circulant increases, so do the number of mathematical operations needed to check them and the memory needed to store their values.
At the second stage, for comparison with other families of circulants, circulants that can be described in several representations were chosen. The comparison was made between multiplicative circulants with the number of generatrices equal to 2 and ring circulants. The number of network nodes was chosen as the square of a natural number. As a result of RTL synthesis, the data on the number of ALM blocks and registers (Table 3, Figure 4a … The columns "ALM (Table alg.)" and "REG (Table alg.)" represent the results of tabular routing, the columns "ALM (Ring)" and "REG (Ring)" show the results of the algorithm that describes the graph as a ring circulant [26], and the columns "ALM (MC)" and "REG (MC)" show the results for the multiplicative circulant. The graphs show that the rate of increase in the use of ALM blocks for the algorithm that describes the network as a multiplicative circulant is much lower than for the algorithm that describes the network as a ring circulant. At the same time, the rate of increase in the use of registers for the algorithm proposed for the multiplicative circulant is slightly higher than for the ring circulant algorithm. Given that the logic resources of the chip are much scarcer than registers, it can be concluded that multiplicative circulants are more effective than ring circulants with an equal number of generatrices.

Conclusion
Since the characteristics of common topologies do not meet the requirements of modern networks, and alternative options for building networks are needed, it is suggested to consider a special kind of graph, the multiplicative circulant, as a topology. Strict rules for the formation of the circulant structure restrict the number of nodes in the network, but they make the topology easily scalable to larger networks and improve such important characteristics as diameter and average distance in comparison with classical topologies.
For multiplicative circulants, standard shortest path search algorithms are applicable, and in this paper we propose a structure for the address part of the packet that reduces its size under static routing. At the same time, most standard shortest path search algorithms traverse the graph; therefore, as the complexity of the circulant increases, the running time of the algorithm increases dramatically. Taking into account the peculiarities of the structure of multiplicative circulants, a specialized algorithm for this class of circulants was developed, which makes it possible to considerably simplify the structure of the address part of the packet and to reduce the time for finding the optimal path in the graph.

Figure 3. Number of resources occupied by the router (a) and by the network (b) on a chip.

Figure 4. Dependence of used ALM chip blocks (a) and REG (b) on the routing algorithm.

Table 1. Comparison of characteristics of the circulant and mesh topologies.

Table 2. The time spent on searching for the shortest path in multiplicative circulants.

Table 3. Chip resources occupied by the whole network.