The software structure for agent-oriented simulation with distributed dispatching

This paper addresses the implementation of agent-oriented simulation for high-dimensional models in the social science domain. The high dimensionality of the models and the complexity of developing them with existing tools are shown to be the main constraining factors. Existing agent-oriented simulation systems are analyzed, and specific features of the platforms on which such systems run are considered. The influence of clustering on the simulation rate is shown. A software structure for agent-oriented simulation is proposed that builds on existing agent-oriented simulation systems and on a distributed scheduler, which optimizes the clustering of the model.


Introduction
Research in the social science domain is currently growing in volume owing to the rise of adverse social phenomena [1,2]. The most advanced analysis tool currently applied in this domain is simulation based on an agent-oriented approach [3]. Works [4][5][6] report results of applying this method to the analysis of large social groups.
Agent-oriented simulation is applied successfully in the social science domain because the method closely corresponds to the subject domain.
Agent-based simulation in the social science domain has several distinctive features: a large agent population (hundreds of thousands of agents), complex agent behavior, division of agents into groups or types, and the use of high-performance cluster solutions.

The analysis of agent-oriented simulation software
Applying agent-oriented simulation to real tasks faces two key problems: inadequate agent models, caused by the lack of domain data; and inadequate tools [8][9][10], which require specialized knowledge from the user that is unrelated to the domain.
In [9] the latest agent-oriented simulation systems (AOSS) were presented. We note that the majority of AOSSs designed to run large-scale models operate on clusters and have optimized implementations (for example, FLAME-HPC, Repast-HPC) in order to achieve the required efficiency.
Because the performance of graphics processors has grown faster than that of general-purpose processors [11], the implementation of AOSSs on GPUs has become an important problem.
An AOSS based on FLAME-GPU, which targets NVIDIA graphics accelerators, was developed in [12]. This AOSS can model large-scale systems without access to a cluster.
However, GPU-based modeling has a limitation: beyond a certain population size, GPU efficiency declines with further population growth much faster than it does on clusters. Nevertheless, FLAME-GPU can be considered a necessary alternative to cluster solutions as an out-of-the-box option when the population size is not extreme and the GPU is sufficiently powerful.
To sum up, FLAME, FLAME-HPC, Repast-HPC and FLAME-GPU are the most suitable of the open AOSSs for the social science domain, which is characterized by large models and complex agent behavior.

The analysis of implementation platforms for AOSSs
The analysis of AOSSs shows that AOSS efficiency depends on the agent model used. As a result, choosing between cluster and GPU solutions is difficult, particularly at the initial stages.
A comparison of AOSSs by implementation platform is presented in Table 2. Table 2 shows that GPU-based solutions are much better in terms of the cost/performance ratio. At the same time, model size restrictions prevent the use of GPU-based solutions for extremely large models. To some extent this can be offset by installing several GPUs in one workstation; however, this requires purchasing specialized GPUs and a suitable motherboard. Moreover, this approach does not solve the scaling problem: it only raises the limit at which AOSS simulation performance begins to degrade. It follows that a combined solution, a cluster of workstations equipped with GPUs, can be considered optimal, since the number of workstations can be increased according to scaling demands. In this case, the problem of communication between GPUs, both within a workstation and across the cluster, has to be solved. To deal with this problem NVIDIA developed NCCL (NVIDIA Collective Communications Library), which provides communication primitives for parallel computation in multi-GPU and multi-node environments [13].
Cluster solutions rely on high-speed data transmission interfaces between cluster nodes and over cluster interconnects. For example, NCCL supports data transmission between nodes over InfiniBand verbs, libfabric, RoCE and IP sockets.
Another way to solve the scaling problem is to use middleware that cross-compiles the model code. This approach makes it possible to use AOSSs of various types on various implementation platforms. A similar approach based on OpenABL is presented in [14].
It should be noted that this approach also helps with the high complexity of model development. As shown in [9], highly efficient AOSSs are characterized by complex model development (Table 1), for which a general-purpose programming language is used. If the model changes often during the modeling process, programmers have to be involved to make each change. As a result, the bottleneck of modeling is not the execution speed of the model but the rate at which programmers can change the model to meet the needs of domain experts. Data on the cost of model development, measured in eLOC, are given for various AOSSs in [14]. FLAME GPU shows a development cost comparable to that of ABL, because a template system is used for model development [11,15].
Another way to simplify model development is middleware that provides a graphical interface for model development. In this case, the original model code is generated automatically from the input data. This approach is easy to implement for AOSSs that describe models through configuration files (for instance, FLAME, FLAME GPU). Code generation should be performed by a separate module, as in [14], so that its functionality can be extended (for example, to support various AOSSs) independently of the middleware.
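The generation of configuration files from a model description can be illustrated with a minimal sketch. It assumes that the graphical editor hands over the model as a nested dictionary; the function name `generate_model_config` and the XML element names (`model`, `agent`, `variable`) are hypothetical and do not follow the schema of FLAME, FLAME GPU or any other AOSS.

```python
import xml.etree.ElementTree as ET


def generate_model_config(model: dict) -> str:
    """Render a model description as an XML configuration string.

    The element names used here (model, agent, variable) are hypothetical
    and do not follow the schema of any particular AOSS.
    """
    root = ET.Element("model", name=model["name"])
    for agent in model["agents"]:
        agent_el = ET.SubElement(
            root, "agent",
            name=agent["name"],
            population=str(agent["population"]),
        )
        # One <variable> element per agent state variable.
        for var_name, var_type in agent["variables"].items():
            ET.SubElement(agent_el, "variable", name=var_name, type=var_type)
    return ET.tostring(root, encoding="unicode")


# Example input, as it might arrive from a graphical model editor.
example_model = {
    "name": "opinion_dynamics",
    "agents": [
        {
            "name": "citizen",
            "population": 100000,
            "variables": {"opinion": "float", "group_id": "int"},
        },
    ],
}

config_xml = generate_model_config(example_model)
```

Keeping the generator as a standalone function reflects the point made above: a separate generation step could be given additional back ends for other AOSSs without touching the editor.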
Factors that affect model execution speed should also be considered. A scheduler that relocates agents should be deployed in the cluster system. The scheduler monitors the state of the cluster and relocates agents when the workload of a computational node exceeds a threshold or when the network is heavily loaded. To provide performance and scalability of the system, the scheduler should be distributed. The use of a distributed scheduler for failover in a distributed computing environment is described in [16,17].
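The threshold rule that triggers agent relocation can be sketched as follows. This is only an illustration under stated assumptions: the `NodeState` fields, the threshold values 0.9 and 0.8, and the "least loaded node" target policy are all hypothetical and are not taken from the scheduler described in this paper.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    node_id: str
    cpu_load: float      # processor utilization as a fraction, 0.0 to 1.0
    network_load: float  # share of channel capacity in use, 0.0 to 1.0


def nodes_needing_relocation(states, cpu_threshold=0.9, net_threshold=0.8):
    """Return ids of nodes whose load exceeds either (assumed) threshold."""
    return [
        s.node_id
        for s in states
        if s.cpu_load > cpu_threshold or s.network_load > net_threshold
    ]


def pick_target(states, overloaded_ids):
    """Pick the least CPU-loaded node that is not overloaded, or None."""
    candidates = [s for s in states if s.node_id not in overloaded_ids]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.cpu_load).node_id
```

In a distributed scheduler each node would evaluate such a rule on the state it has gathered from its peers, so that no single node has to hold the whole decision logic.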

The assessment of factors having an impact on the simulation rate
The simulation rate is affected by the following factors: cluster node performance, the speed of the data transmission interface between cluster nodes, the intensity of communications between cluster nodes, and the complexity of agent behavior. The simulation can be sped up by enhancing cluster node performance, by using a high-speed network interface, and by distributing agents effectively. The first two methods are available only if the cluster can be chosen or its configuration changed; for a given cluster they cannot be used.
In this work we focus on the last method. To this end, we assess how an ineffective distribution of agent groups influences the simulation rate.
Let the intensity of communications be from agents, which are split into $M$ groups: (2) Let us assess the volume of data transmitted by a single node under the following assumptions: each agent belongs to exactly one group, and each group is located entirely on one node. Let the agents form groups of equal size $N_i$ that are evenly distributed over the computational nodes ($k$ groups per node).
Let $l$ be the number of nodes; then the total number of agents is

$$N = l\,k\,N_i.$$

We assume that agents communicate with intensity $S_{in}$ within a cluster (agent group) and $S_{out}$ between clusters. Since each cluster is located on one computational node, information exchange within a cluster generates no network traffic. Of the $S_{out}$ communications, a share proportional to $(k-1)N_i$ is addressed to agents of other clusters located on the same node as the sending agent; this share also generates no network traffic. The number of communications sent by each agent of a node to agents of other nodes is

$$S_{out}\,\frac{N - kN_i}{N},$$

and for all agents of the node it is

$$kN_i\,S_{out}\,\frac{N - kN_i}{N}.$$

In turn, the agents of the node receive communications from the agents of each of the other $l-1$ nodes:

$$(l-1)\,kN_i\,S_{out}\,\frac{kN_i}{N}.$$

Consequently, the total traffic of a node is

$$kN_i\,S_{out}\,\frac{N - kN_i}{N} + (l-1)\,kN_i\,S_{out}\,\frac{kN_i}{N} = 2\,kN_i\,S_{out}\,\frac{l-1}{l}.$$

This expression shows that the volume of data transmission grows linearly with the group size, with the number of clusters on the node, and with the number of communications sent. It should also be noted that:
- increasing the number of nodes increases the traffic only up to a limit (the factor $(l-1)/l$ tends to 1);
- synchronization of agents within a simulation cycle, when computation and data transmission times differ, reduces the efficiency of the computing system and increases the simulation time accordingly.
The obtained expressions make it possible to evaluate the feasibility of simulation depending on the model parameters and the communication interface.
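Such an evaluation can be sketched in a few lines. The sketch assumes that the node traffic takes the form $2kN_iS_{out}(N-kN_i)/N$ with $N = lkN_i$, which is a reading of the derivation in this section rather than a formula quoted from it. The message size, the time budget per simulation step and the interface bandwidth are hypothetical parameters introduced only for the feasibility check.

```python
def node_traffic(group_size, groups_per_node, num_nodes, s_out):
    """Estimate messages per step crossing one node's network interface.

    Assumes traffic = 2 * k * N_i * S_out * (N - k * N_i) / N with
    N = l * k * N_i (our reading of the derivation, not a quoted formula).
    """
    agents_per_node = groups_per_node * group_size      # k * N_i
    total_agents = num_nodes * agents_per_node          # N = l * k * N_i
    sent = agents_per_node * s_out * (total_agents - agents_per_node) / total_agents
    received = sent  # by symmetry: all nodes are assumed identical
    return sent + received


def fits_interface(traffic_msgs, msg_bytes, step_seconds, bandwidth_bytes_per_s):
    """Check whether one step's traffic fits into the step time budget."""
    return traffic_msgs * msg_bytes <= bandwidth_bytes_per_s * step_seconds


# Hypothetical configuration: 4 nodes, 10 groups of 100 agents per node,
# 2 inter-group messages per agent per step.
traffic = node_traffic(group_size=100, groups_per_node=10, num_nodes=4, s_out=2)
```

With a single node the estimate is zero, which matches the observation that exchanges inside one node generate no network traffic.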
Figure 1 presents a diagram of node traffic versus the number of agents per node. It makes it possible to evaluate the feasibility of simulation depending on the model size and the equipment configuration: the number of nodes and the communication interface parameters.
It is worth noting that when an agent can belong to more than one group simultaneously, these groups should be located on one computational node. If such a distribution cannot be achieved, the number of agents belonging to groups located on different computational nodes should be reduced.
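The requirement that groups sharing an agent be placed on one node can be checked by merging overlapping groups into connected components. The sketch below uses a union-find structure for this; it is one possible technique, not the method of this paper, and the input format (a mapping from agent id to a list of group ids) is an assumption.

```python
def colocated_group_sets(memberships):
    """Merge groups that share at least one agent.

    memberships maps an agent id to the list of group ids it belongs to
    (assumed input format). Returns a list of sets of group ids; every
    set should be placed on a single computational node.
    """
    parent = {}

    def find(g):
        parent.setdefault(g, g)
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for groups in memberships.values():
        for g in groups:
            find(g)  # register every group, including unshared ones
        for other in groups[1:]:
            union(groups[0], other)

    components = {}
    for g in list(parent):
        components.setdefault(find(g), set()).add(g)
    return list(components.values())
```

If a resulting set contains more agents than one node can host, the distribution described above is impossible and the overlap between groups has to be reduced.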

The software structure for agent-oriented simulation
To control clustering during the simulation, we propose the software structure for agent-oriented simulation presented in figure 3. The system comprises a control node and execution nodes. The control node coordinates the whole system; the AOSS does not run on it. The scheduler on the control node only communicates with the schedulers of the other nodes.
The following components are deployed on the control node:
- model development module, which provides domain-specific tools for developing and configuring the model;
- visualization and analysis module, which provides tools for presenting and analyzing simulation results;
- control module, which provides simulation control (launch, stop and so forth);
- model generator, which creates the configuration files required to run the AOSS from the data supplied by the model development module;
- scheduler, which monitors the state of the model implementation platform by exchanging data with the other schedulers and transmits the AOSS configuration data to the other nodes.
The AOSS is launched on the execution nodes. The scheduler, which runs on the same node, configures and controls the AOSS.
The system operates as follows (we assume that a model has been developed and the initial distribution has been prepared):
1. Schedulers are launched on each node.
2. The schedulers exchange data about the condition of their own nodes and agree on a common view of the system.
3. When synchronization is reached, the scheduler of the control node transfers the data needed for simulation (configuration files, initial distribution files, etc.).
4. Having received the needed data, the schedulers resynchronize. When resynchronization completes successfully, the schedulers start the simulation.
5. During the simulation, the schedulers monitor the state of the environment (node operability, node processor utilization, data transmission channel load, etc.).
6. In abnormal situations (node failure, a computing resource exceeding its load threshold), the schedulers interrupt the simulation.
7. On the basis of the last completed iteration, the schedulers create a new initial distribution and synchronize the new system state.
8. After the system is synchronized, the simulation is run.
9. If necessary, steps 7-8 are repeated until the simulation is completed.
If there is no need to use a cluster, the system can be deployed on a single node. In this case the scheduler also provides control functions.
It should be noted that in a single-node configuration the scheduler is not strictly necessary, and the software structure can be simplified. In this case the AOSS must be given access to the configuration data, and the control functions should be placed in the control module.
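The interrupt, redistribute and rerun cycle of the operating procedure can be sketched as a simple driver loop. This is a single-process illustration only: the callables `run_until_event` and `redistribute` are hypothetical stand-ins for the AOSS and for the distributed scheduler logic, and synchronization between schedulers is not modeled.

```python
def run_with_rescheduling(total_iterations, run_until_event, redistribute, distribution):
    """Drive a simulation to completion, redistributing agents after interruptions.

    run_until_event(distribution, start, total) runs iterations from `start`
    and returns the index of the last completed iteration; a value below
    `total` means the run was interrupted by an abnormal situation.
    redistribute(distribution, last_completed) returns a new distribution.
    Both callables are hypothetical stand-ins, not part of any AOSS API.
    """
    completed = 0
    restarts = 0
    while completed < total_iterations:
        completed = run_until_event(distribution, completed, total_iterations)
        if completed < total_iterations:
            # Interrupted: rebuild the distribution from the last completed
            # iteration, then continue the simulation from that point.
            distribution = redistribute(distribution, completed)
            restarts += 1
    return distribution, restarts


def fake_run(distribution, start, total):
    # Pretend node "n1" fails at iteration 40 while it still hosts agents.
    if "n1" in distribution and start < 40:
        return 40
    return total


def drop_failed_node(distribution, last_completed):
    # Move the agents of the failed node "n1" to node "n2".
    new = {n: list(a) for n, a in distribution.items() if n != "n1"}
    new["n2"] = new.get("n2", []) + distribution["n1"]
    return new


final_distribution, restarts = run_with_rescheduling(
    100, fake_run, drop_failed_node, {"n1": ["a1"], "n2": ["a2"]}
)
```

In the example the run is interrupted once, the agents of the failed node are moved, and the simulation then continues to the last iteration.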

Conclusion
This work examined the problem of agent-oriented simulation in the social science domain. It was shown that the main difficulties in this field are the high dimensionality of models and the high complexity of model development caused by shortcomings of existing modeling tools.
Existing AOSSs were analyzed, and specific features of the platforms on which an AOSS is launched were examined. Several approaches to model scaling and to the simplification of model development were considered.