Elastic computing self-organizing for artificial intelligence space exploration

The aim of this article is to provide a methodology for designing and controlling elastic computing self-organizing for artificial intelligence space exploration. The artificial intelligence application itself should be elastic and distributed in the context of the limited information technology resources available in space. The most important use of elastic computing is artificial intelligence's ability to continually learn and adapt to evolving environments and goals. The conceptual framework uses an elastic infrastructure model and the terminology of graph dynamical systems in order to capture a broad variety of processes taking place on self-organizing networks. The methodology's uniqueness lies in the use of the theory of graph dynamical systems to explain the life cycle of self-organizing processes.


Introduction
Nowadays, information technology (IT) plays an important role in space exploration by spacecraft (SC). Cloud computing is crucial because the internet provides fast access to and analysis of data through shared servers, allowing problems to be solved in a cost-effective manner. This is achieved through the elasticity of the system, which has become a main characteristic of contemporary information technologies. Elasticity is increasingly determined by run-time factors on demand. One of the fundamental properties of elastic computing is autoscaling: the process of automatically increasing or decreasing the IT resources delivered to a cloud on demand. IT resources are categorized into hardware, software, and human resources. Hardware resources are processors, memory drives, and various technical devices. Software resources are software packages, applications, databases, and knowledge. Human resources include experience in mission management, the activities of astronauts, and the skills they have acquired in training.
Advancement in general artificial intelligence (AI) [1] techniques such as machine learning and adaptation methods has given AI a growing role in space exploration. Artificial intelligence systems are needed to automate the activities of astronauts, depending on emerging needs, and to improve the autonomous functions of spacecraft (SC), space robots, and equipment. AI is an increasingly popular approach for implementing SC autonomy. Autonomy is the ability of a SC to achieve goals while operating independently, without external control. Operating independently requires system self-sufficiency; achieving goals requires system self-directedness. High-level autonomy for a system and its subsystems improves the overall performance of human and robotic missions. The autonomy of the SC [2] equipped with AI [3,4], practically without human intervention, is a shift towards the design of self-organizing systems [5] for space exploration that would require the modernization of industrial robotics [6]. Self-organizing systems have the following advantages: they increase efficiency and speed up missions by making decisions on their own, without waiting for instructions from Earth; learn on the fly; adapt to situations and solve problems that were not initially defined for them [7]; explore space and planets using special robotic systems [8,9]; conduct traditional scientific observations and experiments [10]; increase the efficiency of data processing [11]; commercialize space [12]; and automate routine procedures and improve the technical systems servicing the activities of astronauts. Work in this direction is being carried out intensively [13]: autonomous groups and autonomous systems equipped with AI [14], a robot astronaut [15,16], the Curiosity rover [17], and the commercialization of space [18,19].
The aim of this article is to provide a methodology for designing and controlling elastic computing self-organizing (ECSO) for AI space exploration, where the AI application should be elastic and distributed in the context of limited IT resources in space. The paper presents a conceptual framework that uses an elastic infrastructure model and the terminology of graph dynamical systems, and a methodology whose uniqueness lies in the use of the theory of graph dynamical systems to explain the life cycle of self-organizing processes.
Conceptual framework for elastic computing self-organizing
A conceptual framework is a set of concepts useful for describing elastic computing self-organizing in the context of AI space exploration. Types of infrastructures, distributed AI, the elasticity model, and graph dynamics are considered in the conceptual framework.

Types of infrastructures
Cloud computing services form the AI infrastructure. There are different types of infrastructures: cloud infrastructure (a data center), fog infrastructure (a backbone IP network, content delivery networks, and a ground station), edge infrastructure (cellular core networks, "instant data", that is, real-time data generated by sensors, and SC), and hybrid (figure 1). Cloud computing is a collection of servers that make up a distributed network; fog and edge computing are its extensions. The advantage of cloud computing is that it allows data to be collected from multiple sites and devices and made available anywhere in the world. Data from devices is collected by embedded hardware (edge infrastructure) and sent to fog computing. The relevant data is then sent to cloud computing, which is usually located in a different geographic area. The use of edge computing offers advantages in optimizing the use of IT resources for cloud computing. The probability of a data bottleneck is reduced by reducing network traffic. Security is improved by encrypting data near the core of the network. Edge computing, on the other hand, requires a bidirectional method for handling data. Figure 1 shows the cloud infrastructure of a data center processing data from a SC with centralized elastic computing. A cloud infrastructure has only one autoscaling subsystem and one strategic planning subsystem for provisioning IT resources; furthermore, its IT resources are centralized.
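The edge-to-fog-to-cloud data path described above can be sketched as a minimal pipeline. The tier functions and the "relevance" threshold below are illustrative assumptions for the example, not part of any specific platform.

```python
# Minimal sketch of the edge -> fog -> cloud data path.
# Tier functions and the relevance threshold are illustrative assumptions.

def edge_collect(sensor_readings):
    """Edge tier: embedded hardware packages raw 'instant data'."""
    return [{"value": v, "raw": True} for v in sensor_readings]

def fog_filter(records, threshold):
    """Fog tier: forward only the relevant records, reducing network traffic."""
    return [r for r in records if r["value"] >= threshold]

def cloud_store(records, archive):
    """Cloud tier: aggregate data from many sites into one shared store."""
    archive.extend(records)
    return archive

archive = []
readings = [0.2, 0.9, 0.5, 1.3]       # instant data from SC sensors
records = edge_collect(readings)      # collected at the edge
relevant = fog_filter(records, 0.8)   # filtered in the fog tier
cloud_store(relevant, archive)        # archived in the cloud
print(len(archive))                   # -> 2 records reach the cloud
```

The point of the sketch is that each tier reduces the volume of data forwarded upstream, which is the traffic-reduction advantage noted above.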
Due to GPU-enabled chipsets, mobile devices have become extremely powerful. Instead of being centralized, AI can now be moved from cloud data centers to edge devices such as cell phones. Edge computing allows processing to be performed locally at multiple decision points in order to reduce network traffic. For example, Microsoft's Azure Space combines a set of offerings and products, such as cloud computing, artificial intelligence (AI), edge computing, and machine learning. Figure 2 shows the shift of elastic computing to the distributed edge infrastructure. Integrating a cloud infrastructure with fog and edge yields superior performance. This makes it possible to effectively solve the following problems: managing interaction and organizing stable communication between SCs; managing dynamically scalable systems, consisting of a few SCs as well as of hundreds and thousands of SCs; SC control in conditions of large amounts of dynamically moving space debris (traffic in near-Earth space is becoming increasingly dense); processing data received from a growing number of SCs; and commercialization of space missions (to exploit natural resources originating outside Earth and to organize space tourism).

Distributed artificial intelligence
The traditional AI approach trains machine learning models locally on a large dataset. The most important use of elastic computing is AI's ability to continually learn and adapt to evolving environments and goals. Continual and adaptive learning focuses on handling increasing amounts of data while maintaining previously acquired skills and updating predictive models. As data keeps coming in, it is possible to scale a distributed cognitive network.
Autonomous SC equipped with AI require new approaches to the use of elastic computing. Regardless of the chosen AI methods [3], the stages of training, data processing, and decision-making place different loads on elastic computing. For the efficient operation of SC with AI, both new methods for processing distributed data and approaches to organizing computational processes based on elastic computing will play an important role.

Elasticity model
The distributed elastic computing infrastructure makes it possible to optimize AI systems under conditions of limited resources. It is based on the implementation of a distributed infrastructure with scalable autonomy for multiple SCs, robots, and equipment. Scalable autonomy impacts distributed planning, scheduling, fault management, data distribution, and task execution. In this context, AI is regarded as distributed with a high level of scalability.
Elasticity is the ability of the system to transform its behavior: to increase or decrease capacity on demand. More precisely, elasticity is the degree of autonomous adaptation, over a certain time interval, of the service capacity to the workload.
Let us consider how workloads change and how the system's elasticity evolves over time. External intervention in such adaptation must be minimized, since the adaptation process is autonomous. Workload characterizes the data that must be processed by the service's operations when demand fluctuates, for example, throughout the day.
The increase or decrease of the system capacity can be repeated several times over a period of time until the workload returns to its original value. Server workload and capacity are at the heart of elastic computing. Figure 3 shows a pattern widely used for the analysis of elastic systems, named Once-in-a-lifetime Workload [20]. In figure 3, the transformation of the elastic system at runtime is shown by the red curve: workload increases on the left side of the peak load (vertical green line) and decreases on the right. Let us look at an example of internet load during the day, where the horizontal axis shows time in hours and the vertical axis shows user requests per second. Let the service have enough capacity for a load of 2000 requests/sec (figure 3). Service capacity determines the maximum workload that the system can handle in terms of the number of requests. If the workload increases to 3000 requests/sec, the capacity of the elastic computing system must be dynamically transformed.
If the workload increases, the system must be dynamically transformed by adding IT resources. If the workload decreases, for example after 9000 requests/sec, the system capacity must be reduced by releasing IT resources.
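A minimal sketch of this capacity adjustment, assuming a fixed per-server throughput (the 1000 requests/sec figure is an assumption chosen to match the 2000 and 3000 requests/sec example above):

```python
# Hedged sketch of threshold-based autoscaling: capacity (the maximum
# requests/sec the service can handle) grows when the workload exceeds it
# and shrinks when the workload falls well below it.

PER_SERVER = 1000  # requests/sec one IT resource can serve (assumed)

def scale(servers, workload):
    """Return the number of servers after one autoscaling decision."""
    capacity = servers * PER_SERVER
    if workload > capacity:                   # add IT resources
        servers = -(-workload // PER_SERVER)  # ceiling division
    elif workload <= capacity - PER_SERVER:   # release IT resources
        servers = max(1, -(-workload // PER_SERVER))
    return servers

servers = 2                      # enough for 2000 requests/sec
servers = scale(servers, 3000)   # workload rises -> 3 servers
print(servers)
servers = scale(servers, 1500)   # workload falls -> 2 servers
print(servers)
```

Real autoscalers add hysteresis and cooldown periods so that short workload spikes do not trigger repeated resource churn; the sketch omits these for brevity.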
Adding and releasing IT resources requires support for late binding of IT resources. This method is successfully used in programming language environments and in operating systems, where modules can be loaded and unloaded at will. Hard binding and replication of hardware and software are eliminated by virtualizing the underlying infrastructure. The following binding methods exist: resource-node, client-server, process-resource, and geolocation. To design elastic systems, it is necessary to conduct an analysis in terms of system autoscaling depending on the workload, the planning strategy, provisioning, the binding of IT resources, and system synthesis. To summarize, the following formula has been created:

Elasticity = Autoscaling + Strategic Planning + Binding + Synthesis. (1)

Let us define each term used in formula (1). Automatic scaling (autoscaling) is a prerequisite for elastic computing. The autoscaling method is a series of procedures to adapt the system to fluctuating workloads: either without adding IT resources, or by adding (deploying) IT resources to the system. The system's scalability should be considered during the design stage.
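Late binding can be illustrated with a small registry sketch: resources are resolved by name only at call time, so they can be attached and released at run time, much like modules loaded and unloaded in an operating system. The registry API below is an assumption made for the example, not a real library.

```python
# Illustrative sketch of late binding of IT resources: a resource is looked
# up in a registry at call time, so it can be attached (bound) and released
# dynamically without rebuilding the system.

registry = {}

def bind(name, resource):
    """Attach an IT resource under a logical name (late binding)."""
    registry[name] = resource

def release(name):
    """Release a resource so the infrastructure can reclaim it."""
    registry.pop(name, None)

def call(name, *args):
    """Resolve the resource only at call time, not at design time."""
    return registry[name](*args)

bind("compress", lambda data: data[:2])   # e.g. a resource-node binding
print(call("compress", [1, 2, 3, 4]))     # -> [1, 2]
release("compress")
print("compress" in registry)             # -> False
```

The same indirection underlies the binding methods listed above (resource-node, client-server, process-resource, geolocation): the caller never holds a hard reference to the physical resource.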
Strategic Planning is an integration of two methods: IT resource allocation and scheduling. IT resource allocation defines the type and number of functional units, storage components, buses, and other connectivity elements required to generate the current hardware infrastructure of the system. There are different models for the resource allocation (RA) problem [21]. These RA models may be categorized by infrastructure: cloud, fog, and edge. Further, RA models can be categorized by how servers are organized, with Virtual Machines (VMs) or Containers.
For cloud infrastructure the following RA models have been created: a model for deploying Software as a Service (SaaS) applications over cloud computing platforms that takes their multi-tenancy into account [22]; a mathematical model based on group technology (GT) that maps VMs to workflows in order to control costs [23]; an autonomous, multi-objective, request-type-based resource sharing model for federated geo-distributed clouds [24]; a VM allocation algorithm that optimizes task execution in a secure cloud environment [25]; an optimized container RA solution that considers objectives such as Threshold Distance, Balanced Cluster Use, System Failure, and Total Network Distance, and whose performance is shown to be superior to conventional models [26]; an online multi-resource neural network used to create a proactive autoscaling and energy-efficient VM allocation system for cloud data centers [27]; modeling of cloud outage events [28]; a quality of service model based on comparing performance metrics (e.g., response time, throughput, and resource utilization) [29]; a fair multi-attribute combinatorial double auction model [30]; a hybrid fog/cloud computing RA model based on joint consideration of limited communication resources and user credibility [31]; an adaptive RA model based on billing granularity in edge-cloud infrastructure [32]; a combinatorial double auction RA model [33]; a feedback-based combinatorial fair economical double auction RA model [34]; dynamic deadline-based RA and provisioning algorithms in a fog-cloud environment [35]; a self-adaptive RA model based on iterative quality of service (QoS) prediction [36]; a game theory-based dynamic RA strategy in geo-distributed data centers [37]; and a tenant-based RA model for scaling Software-as-a-Service applications over the cloud [38].
Consider Strategic Planning for a constellation of autonomous SC (edge infrastructure). It is proposed to use the migrated master node model for requesting IT resources. Let SC0 request an IT resource (figure 4, the SC0 node is highlighted in red). This means that a master node will be generated for it. The master node is the main node of the cluster that will be formed from the IT resources of the SC constellation. IT resources are allocated in the SCs themselves and redistributed among them. An autonomous SC tries to compute all its tasks using its own IT resources and, partially, the IT resources of others. Let SC1, SC2, ..., SCn be the free resources most suitable for increasing the capacity of SC0. The scheduling method forms the software architecture and maps it onto the hardware architecture. In this step, works (for example, W1, W2, ..., Wn) are assigned to resources. The works may be threads, processes, or data flows. The computational operations and the order in which they will occur on each resource are identified. Works are in turn scheduled onto hardware resources. Binding allows services to be used in a variety of ways to access resources. In this step, decisions are made about the sharing of the functional and storage units of the allocated resources. These decisions may affect resource connectivity elements, which may be revised.
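The selection and scheduling steps above can be sketched as follows. The suitability score (spare capacity), the round-robin scheduling rule, and the data structures are illustrative assumptions; the source does not fix a particular selection or mapping algorithm.

```python
# Sketch of the migrated master node model: SC0 requests IT resources,
# becomes the master node, selects the most suitable free SCs into a
# cluster, and schedules works W1..Wn onto the cluster's resources.

def form_cluster(requester, constellation, needed):
    """Pick the `needed` free SCs with the most spare capacity."""
    free = [sc for sc in constellation if sc["free"] and sc["id"] != requester]
    free.sort(key=lambda sc: sc["spare"], reverse=True)
    return [sc["id"] for sc in free[:needed]]

def schedule(works, cluster):
    """Map works onto cluster nodes round-robin (a simple scheduling rule)."""
    return {w: cluster[i % len(cluster)] for i, w in enumerate(works)}

constellation = [
    {"id": "SC0", "free": False, "spare": 0},  # the requesting master node
    {"id": "SC1", "free": True,  "spare": 5},
    {"id": "SC2", "free": True,  "spare": 9},
    {"id": "SC3", "free": True,  "spare": 7},
]
cluster = form_cluster("SC0", constellation, needed=2)
print(cluster)                        # -> ['SC2', 'SC3']
plan = schedule(["W1", "W2", "W3"], cluster)
print(plan)                           # -> {'W1': 'SC2', 'W2': 'SC3', 'W3': 'SC2'}
```

Note that SC0 keeps computing with its own resources; only the overflow works are mapped onto the borrowed cluster nodes.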
A synthesis configuration [39] (or, simply, configuration) defines the transformations of the software and hardware infrastructure. Restriction and optimization instructions are used to manage these transformations.
The autoscaling, strategic planning, binding, and synthesis are interdependent and influenced by one another's decisions. Computational cost-effectiveness is closely linked to optimal resource allocation, balancing over-provisioning and under-provisioning. The transformation of the elastic system should be taken into consideration when developing planning strategies.

Graph dynamics
Swarm intelligence, multi-agent systems, actors, Petri nets, and graph dynamical systems are conceptual frameworks for distributed artificial intelligence. In this article graph-based systems are used for distributed problem solving by scaling the processing of autonomous nodes. The demand for IT resources in AI systems is growing, particularly when they process very large datasets. Distributing the processing over a cluster of autonomous computing nodes speeds up computation. The nodes of ECSO are SCs. Interactions or communication between these nodes are used to make decisions.
Let us consider the migrated master node model M and the graph dynamics of cluster formation for modeling the IT resources available for elastic computing. Figure 5a shows the initially marked mesh M. Vertices marked "free" are depicted in green; the vertex marked "busy" is depicted in red. Figure 5b shows the model after one step, with the marker m at vertex SC3, which has become the master node that forms a cluster.
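The marked-graph step of figure 5 can be sketched as a tiny graph dynamical system. The mesh topology, the update rule, and the cluster rule (free neighbours join the master) are illustrative assumptions made to show the mechanics, not the exact model of the figure.

```python
# Minimal sketch of the graph dynamics of cluster formation: a mesh of SC
# vertices marked "free" or "busy"; one update step places the marker m on
# the requesting vertex, which becomes the master node of a new cluster.

marking = {"SC1": "free", "SC2": "free", "SC3": "busy", "SC4": "free"}
edges = {("SC1", "SC3"), ("SC2", "SC3"), ("SC3", "SC4")}

def step(marking, requester):
    """One step of the graph dynamical system: requester gets marker m."""
    new = dict(marking)
    new[requester] = "m"   # the requester becomes the master node
    return new

def cluster_of(marking, edges, master):
    """Free neighbours of the master node join its cluster."""
    neigh = {b if a == master else a for a, b in edges if master in (a, b)}
    return sorted(v for v in neigh if marking[v] == "free")

after = step(marking, "SC3")
print(after["SC3"])                      # -> 'm'
print(cluster_of(after, edges, "SC3"))   # -> ['SC1', 'SC2', 'SC4']
```

The marking plays the role of the vertex colours in figure 5: the dynamics is entirely a sequence of local re-markings of the graph.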

Methodology
It should be said that the design of the ECSO system resembles supervised learning and simulation in which the required software subsystems have already been implemented. Because of this, the first stage of the methodology is associated with setting the initial values. The life cycle of ECSO consists of two stages, Stable development and Transition (figure 6). These stages follow one another and merge into each other. The processes of both stages are considered on a certain time horizon and are characterized by the dynamics of their behavior: ordered at the Stable development stage, chaotic at the Transition stage. Within the life cycle, the sequence of stages is repeated.
Backtracking occurs when the ECSO has to go back to a previous stage for reconsideration before completing the j-th iteration. Backtracking within each cycle iteration helps in designing ECSO systems that are able to deal with uncertainty in dynamic problem domains.
The ECSO system is involved both in processing data received from the outside world and in SC management. External influences F affect the SC; they are converted into datasets, which are then processed by the ECSO system. Data processing at the Stable development stage imposes constraints on the use of IT resources in the range [a,b]. This is especially important for an autonomous constellation of SC. During the Transition stage, a new restriction range [h,d] of IT resources is set. Setting the ranges is necessary to find a suitable alternative with the least amount of available IT resources. In the SC management case, the own state of the ECSO system is taken into account by the fitness function. The Stable development stage includes the processes of adaptation (figure 6, the process runs from point A to point B) and autoscaling (from point B to point C). The adaptation process is a parametric change of the system in order to complete the task. The Transition stage includes the processes of bifurcation (from point C to point D) and decision-making (from point D to point F). The bifurcation process is associated with generating possible alternatives for decision-making, and decision-making is responsible for the further development of the ECSO system.
The change in configuration of the IT resources depends on the criteria of the fitness function, which characterizes the quality of the configuration, for example, on the complexity factor f(Ω), which either remains constant, increases, or decreases. The amount of change depends on the current configuration of the system and on the external influences. The sequence of cycle iterations transfers the system from the current configuration to the next and thereby sets the trajectory of the system's movement.
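One life-cycle iteration can be sketched as follows. The fitness rule (load divided by resources as a stand-in for f(Ω)), the feasibility threshold, and the numeric ranges are assumptions made purely to illustrate how the stages and ranges [a,b] and [h,d] interact.

```python
# Sketch of one ECSO life-cycle iteration: at the Stable development stage
# resource use stays within [a, b]; when the fitness degrades, the system
# enters the Transition stage with a new range [h, d], generates
# alternatives (bifurcation), and picks the cheapest feasible one
# (decision-making). All thresholds and the fitness rule are assumptions.

def fitness(config):
    """Stand-in complexity factor f(omega) (lower is better)."""
    return config["load"] / max(config["resources"], 1)

def iterate(config, a, b, h, d):
    f = fitness(config)
    if a <= config["resources"] <= b and f <= 1.0:
        return config, "stable"                 # adaptation + autoscaling
    # Transition: bifurcation generates alternatives within [h, d] ...
    alternatives = [dict(config, resources=r) for r in range(h, d + 1)]
    feasible = [c for c in alternatives if fitness(c) <= 1.0]
    # ... and decision-making picks the one using the fewest IT resources.
    best = min(feasible, key=lambda c: c["resources"])
    return best, "transition"

config = {"load": 12, "resources": 8}
config, stage = iterate(config, a=4, b=8, h=10, d=20)
print(stage, config["resources"])   # -> transition 12
```

Repeating `iterate` traces exactly the trajectory described above: each iteration maps the current configuration to the next one.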

Conclusion
Strategic Planning for a constellation of autonomous SC with edge infrastructure has been considered. For simulating the formation of an IT resources cluster in ECSO, the migrated master node model is introduced. The model is based on the graph dynamics approach. The ECSO methodology provides a guideline to direct the search for a solution. The proposed steps are new, but they remain iterative and incremental, as in already known methodologies. The methodology's uniqueness is that it uses the stages of self-organizing processes.
Further research entails developing the conceptual framework by incorporating the terms of graph-based dynamic analysis [40]. The migrated master node model will be the subject of comprehensive research taking into account cloud, fog, and edge infrastructures. Identification of its properties and the development of methods for its analysis are intended.