Multi-criteria choice of the software package architecture for automating the analysis of the forest vegetation state

The article describes a method for choosing the architecture of a software package (SP) of an information system for analyzing the state of forest vegetation. The analysis relies on distributed information resources and calculation modules that usually belong to different organizations. The SP quality indicators can be assessed on both quantitative and qualitative scales. The proposed method is based on the joint application of fuzzy set theory and the theory of experimental design. It is shown that the preferred type of architecture for this class of tasks is service-oriented architecture, which provides the best value of the integral indicator of SP quality.


Introduction
Solving the overwhelming majority of environmental and forest management problems, including assessment of the state of vegetation, depends on the development and application of information technology (IT). One of the main advantages of IT is the ability to automate labor-intensive procedures for obtaining and processing the large volumes of initial data required for analysis, including spatial data, as well as for interpreting the results and presenting them to users in a simple visual form. The main trend in the development of IT is the use of unified data processing units and their combination, including the concept of "zero coding", on platforms capable of serving a large number of users. Environmental science, however, places somewhat different requirements on software, which is mostly used to build and run complex mathematical models that require data from heterogeneous distributed sources.
A typical example is the InnoForestView project [1], which analyzes the negative consequences of anthropogenic and natural factors for the forests of border regions. To achieve the project goals, mathematical models and algorithms are being developed for processing heterogeneous data and information about the state of forest vegetation in the affected areas. It has therefore become necessary to combine the results of calculations performed with different algorithms and in the different organizations involved in the project. Under these conditions, when a unified distributed information system is created, a reasoned choice of the architecture and technology of the software package (SP) under development comes to the fore. This task requires a multi-criteria assessment, analysis and selection of the basic architecture of the information platform on which the special model-algorithmic, software and information support will be implemented.
Among the most common approaches to the architecture of programs and software systems [2] suitable for solving the formulated problems, the following can be singled out:
− monolithic architecture;
− modular architecture;
− component architecture;
− client-server architecture;
− service-oriented architecture.
Applications with monolithic architecture [3,4] are easy to implement, manage and deploy. They provide high consistency of program code, explicit control of the computational process, and unified error control. This type of architecture is advisable for small information systems localized on one computing resource. Typically, these are ecological models used by a limited number of researchers to perform calculations on data prepared in advance.
For more complex software systems, modular architecture is used [5]. Its underlying decomposition of the application simplifies developing, maintaining, and later upgrading the software. The development of modular architecture has led to the concept of the "plug-in", i.e. an independently compiled software module that is dynamically connected to the main program and designed to expand its functionality. In solving environmental problems, plug-ins are often used to pre-process certain data types.
A similar approach is used in component architecture [6], which makes it possible to build more complex information systems from debugged "building blocks" of program code. The main idea behind the component approach is code reuse. This architecture places greater demands on plug-in versatility. The development of component architecture resulted in the emergence of software libraries with a wide range of universal functions. Component architecture, like monolithic architecture, is intended for building localized modeling software systems.
The foundations of network interaction of software modules were laid in client-server architecture [7]. In a narrow sense, the components of this architecture are considered to be a user client that implements the main logic of the task, and a server part, represented by a DBMS or a file server. In a broad sense, client-server architecture describes any interaction of two or more distributed software modules of the information systems being created. However, this interaction is most fully embodied in the ideology of service-oriented architecture.
Service-oriented architecture (SOA) implements a modular approach to software development based on distributed, loosely coupled, replaceable components equipped with standardized interfaces that interoperate over standardized protocols [8]. This approach is used to build complex distributed information systems based on the integration of web services. In this case, integration is carried out at the protocol level, and the interacting parties need no knowledge of each other's internal structure. This provides the so-called loose "coupling" of modules (web services). According to ISO/IEC/IEEE 24765, the term "coupling" describes the degree of interdependence between software modules, the strength of the relationships between modules, and the degree to which different subroutines or modules are interdependent. The use of services makes it possible to combine heterogeneous computing modules into a single complex application that operates on a variety of hardware (servers). This situation is typical for information systems created in interdisciplinary projects, including international projects in which the state of forest vegetation is assessed on the basis of heterogeneous data by several geographically separated organizations.
Service-oriented architecture can take either the "classic" form or the micro-service form. The "classic" service-oriented approach has already taken root in a certain segment of scientific information systems, while micro-service architecture has recently been rapidly gaining popularity in commercial applications [9].
Improving the usability of micro-services development and operation has led to the abandonment of some of the basic capabilities that were originally present in the "classic" service-oriented architecture. Specifically, SOA's protocol-level integration, or so-called "contract decoupling", allows a service and its consumers to evolve separately while maintaining a consistent description of the interaction interface. Micro-services architecture does not "support contract decoupling", although this is one of the main features of a service-oriented architecture. This architecture is suitable for building research information systems operating in a continuous mode, as well as in real time.

Methods and materials
A rational choice of an SP architecture for the distributed assessment of the forest vegetation state, including assessment carried out within international projects [1], involves a multicriteria comparative analysis of the possible SP configurations. The criteria used can be assessed on both quantitative and qualitative scales. The analysis showed that for such problems it is advisable to apply the approach developed for weakly structured multicriteria problems [10]. The essence of this approach is the combined use of methods of verbal decision analysis (simple and complex reference situations of the survey) and methods of converting qualitative indicators into quantitative ones, based on the joint application of the theory of fuzzy sets, relations and measures and the theory of experimental design.
The procedure that implements this approach assumes, as the first step, the formation of a set of indicators by which a specific architecture of the software package is assessed and selected. These indicators include the following.
Indicator of modularity. This indicator is particularly relevant when an information system is created within projects for assessing the forest vegetation state that are carried out by several organizations, where it is necessary to combine the results generated by separate distributed calculation modules. Modularity is assessed on the basis of the concepts of "cohesion" and "coupling" used in the ISO/IEC/IEEE 24765 standard [11]. Cohesion can be assessed on a scale from 1 (software libraries, in which the functions in modules are grouped thematically) to 5 (web services, where a task is solved entirely by internal functions). Likewise, for coupling, the highest score is given if the modules can function independently of each other.
Indicator of permissible heterogeneity. For each component, the most appropriate means of implementation are selected. In addition, if the software package is required to be open, third-party software solutions may be embedded into it. These properties imply that the components of the package operate in different, sometimes incompatible software environments. The indicator is formed as the sum of particular indicators characterizing the acceptable options for implementing an application in a number of popular programming languages and operating systems.
Performance indicator. The presence of time-consuming tasks and the need to solve them promptly require high computational efficiency. Numerically, this indicator is estimated by the time it takes to solve a test mathematical problem with the same code organized according to the different architectures.
Multi-user indicator. The inherent interdisciplinarity of environmental modeling may require a range of experts and researchers to work together. In this case, the software package should provide convenient means of multi-user work. The indicator has the lowest value for single-user mode and the highest for an architecturally unlimited number of users.
Scalability metric. It evaluates the ability of the architecture to quickly increase the overall performance of a software package when the load increases. At the same time, in the absence of the load, the software package should not consume a large amount of computing power. To obtain a numerical estimate of the indicator, the ratio of the time of solving a thousand test mathematical problems on one server to the time of solving them by a hardware-software complex of three identical servers is estimated.
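The scalability estimate described above can be illustrated with a minimal sketch. The function names and the timings are hypothetical and are not taken from the project; the second function, which relates the ratio to the ideal linear speedup on three servers, is an added assumption for interpretation only.

```python
def scalability_ratio(t_single: float, t_cluster: float) -> float:
    """Ratio of the run time on one server to the run time on the cluster."""
    if t_single <= 0 or t_cluster <= 0:
        raise ValueError("run times must be positive")
    return t_single / t_cluster


def scaling_efficiency(ratio: float, n_servers: int = 3) -> float:
    """Share of the ideal linear speedup that was achieved (1.0 = ideal)."""
    if n_servers <= 0:
        raise ValueError("n_servers must be positive")
    return ratio / n_servers


# Hypothetical timings (seconds) for a thousand test problems.
ratio = scalability_ratio(t_single=300.0, t_cluster=120.0)
print(ratio, scaling_efficiency(ratio))
```

A ratio close to the number of servers would indicate nearly linear scaling, while a ratio close to 1 would indicate that the additional servers bring little benefit.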
The algorithm for multicriteria selection based on the described set of indicators is as follows. Let the set of architectures for building software systems $A = \{A_1, \dots, A_m\}$ be evaluated by the set of quality indicators $F = \{F_1, \dots, F_n\}$ described above, each of which is a linguistic variable. In this case, each linguistic variable $F_i$ takes values from a term set of the form {"low", …, "average", "high"}. For a qualitative interpretation of the resulting indicator, which evaluates the generalized quality of a particular architecture, the linguistic variable "Efficiency" is used, which can take on the values $T(F_{res})$ = {"bad", "below average", "average", "above average", "good"} [12,13]. In the most general form, the decision maker's knowledge about the relationship between the individual quality indicators $F_i$ and the resulting indicator $F_{res}$ can be represented by production models of the general form: IF $F_1$ is $t_1$ AND … AND $F_n$ is $t_n$ THEN $F_{res}$ is $t_{res}$, where $t_1, \dots, t_n$ and $t_{res}$ are values of the corresponding linguistic variables. The extreme ("minimum" and "maximum") values of each linguistic variable are coded as "-1" and "+1". To build the resulting indicator $F_{res}$, in accordance with the theory of experimental design, an orthogonal plan of an expert survey is constructed, whose elements are the extreme values of the individual indicators $F_i$. To solve the problem of a reasoned choice of SP architecture, the following steps are then performed:
Step 1. Formation of a set of linguistic scales for each indicator and for the resulting quality indicator of the compared architectures of software systems. Conversion of all indicators to the scale [-1, +1].
Step 2. Creation of an orthogonal survey plan for experts and conducting an expert survey (answers to questions of the production rules).
Step 3. Construction of the resulting quality indicator for comparable architectures.
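The three steps can be sketched in code under explicit assumptions. The source does not state how values are converted to [-1, +1], which orthogonal plan is used, or what form the resulting indicator takes. The sketch below assumes a linear conversion, a full two-level factorial plan (the simplest orthogonal plan), and a first-order model whose coefficients are estimated from the expert answers. All function names are hypothetical.

```python
from itertools import product

INDICATORS = ["modularity", "heterogeneity", "performance",
              "multi_user", "scalability"]


def to_unit_scale(value, lo, hi):
    """Step 1 (assumed linear): map value from [lo, hi] to [-1, +1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def full_factorial_plan(n_factors):
    """Step 2: every combination of the extreme levels -1 and +1."""
    return [list(levels) for levels in product((-1, 1), repeat=n_factors)]


def is_orthogonal(plan):
    """Check that columns are balanced and pairwise uncorrelated."""
    columns = list(zip(*plan))
    for a in range(len(columns)):
        if sum(columns[a]) != 0:
            return False
        for b in range(a + 1, len(columns)):
            if sum(x * y for x, y in zip(columns[a], columns[b])) != 0:
                return False
    return True


def first_order_coefficients(plan, answers):
    """Step 3 (assumed first-order model): for an orthogonal plan the
    coefficients are averages of the answers weighted by the plan levels."""
    n_runs = len(plan)
    b0 = sum(answers) / n_runs
    slopes = [sum(row[j] * y for row, y in zip(plan, answers)) / n_runs
              for j in range(len(plan[0]))]
    return b0, slopes


plan = full_factorial_plan(len(INDICATORS))
print(len(plan), is_orthogonal(plan))  # 32 True
```

With five indicators the full plan contains 32 rows, which matches the number of expert questions that the abbreviated survey is meant to reduce.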
One of the advantages of the proposed methodology is a modified procedure for obtaining expert answers to the questions contained in the production rules. In the standard approach, a full expert survey over $n$ indicators with two extreme levels each requires $2^n$ questions, so with more than 5 quality indicators the number of questions exceeds 32, which, as a rule, leads to inconsistency in expert statements due to peculiarities of human thinking. The methodology discussed therefore proposes an abbreviated expert survey. Production rules in which all quality indicators, except one, take "low" or "high" values are called simple rules for interviewing an expert, or simple reference situations. The number of such situations corresponds to the number of particular indicators of SP architecture quality. Composite rules (complex reference situations) are expressed through simple reference situations by production rules of a special kind. The resulting indicator in complex reference situations is estimated by constructing a parametric $\lambda$-fuzzy Sugeno measure [14] on the finite set of simple reference situations. The Sugeno measure gives the estimate of the resulting indicator in a complex rule. These estimates are used to check the decision maker's statements for consistency. For example, if, for a complex rule, the relative deviation of the expert's answer from the estimate based on the Sugeno measure exceeds the specified error value, it is concluded that the expert gave an inconsistent answer. The contradictions revealed are presented to the decision makers for analysis and elimination.
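A minimal sketch of a Sugeno $\lambda$-measure may help here. It uses the standard rule $g(A \cup B) = g(A) + g(B) + \lambda g(A) g(B)$ for disjoint sets and finds $\lambda$ from the normalization condition $\prod_i (1 + \lambda g_i) = 1 + \lambda$. The densities, the function names and the form of the consistency check (relative deviation against a threshold) are assumptions for illustration. The source does not give the expert estimates or the exact check.

```python
def solve_lambda(densities):
    """Find lambda > -1 such that prod(1 + lambda*g_i) == 1 + lambda."""
    if len(densities) < 2 or not all(0.0 < g < 1.0 for g in densities):
        raise ValueError("need at least two densities strictly inside (0, 1)")
    total = sum(densities)
    if abs(total - 1.0) < 1e-12:
        return 0.0  # the measure is additive

    def f(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0 + 1e-9, -1e-9   # root lies in (-1, 0)
    else:
        lo, hi = 1e-9, 1.0            # root lies in (0, +inf)
        while f(hi) < 0.0:
            hi *= 2.0
    for _ in range(200):              # plain bisection
        mid = 0.5 * (lo + hi)
        if (f(lo) > 0.0) == (f(mid) > 0.0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sugeno_measure(densities, lam):
    """Measure of the union of the elements with the given densities."""
    if abs(lam) < 1e-12:
        return sum(densities)
    prod = 1.0
    for g in densities:
        prod *= 1.0 + lam * g
    return (prod - 1.0) / lam


def is_consistent(expert_answer, estimate, max_rel_error):
    """Assumed check: relative deviation must not exceed the threshold."""
    return abs(expert_answer - estimate) / abs(estimate) <= max_rel_error


# Hypothetical densities for two simple reference situations.
lam = solve_lambda([0.6, 0.6])
print(lam, sugeno_measure([0.6, 0.6], lam))
```

For densities whose sum exceeds 1 the parameter is negative, which agrees in sign with the value reported in the results.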

Results and Discussion
As a result of applying the described methodology to the choice of an SP architecture for forest vegetation state assessment, a final table with normalized values of the individual quality indicators was compiled (table 1). To estimate the resulting indicator in complex reference situations, taking into account the expert's opinions in simple reference situations, the parameter $\lambda$ of the Sugeno measure was determined: $\lambda = -0.9596$. Using this value, the expert estimates of the resulting indicator in complex reference situations and the resulting indicator of SP architecture quality were calculated according to the proposed methodology. The calculated values of the resulting efficiency indicator for the five compared architectures are $F_{res}(A_1) = 0.711$, $F_{res}(A_2) = 0.754$, $F_{res}(A_3) = 0.813$, $F_{res}(A_4) = 0.891$, $F_{res}(A_5) = 0.931$. Thus, the proposed multi-criteria assessment procedure leads to a reasoned choice in favour of service-oriented architecture, which has the highest value of the resulting indicator. Since multi-user mode and scalability are important for an SP that analyzes the forest vegetation state, the result looks reasonable. In addition to justifying the choice of architecture, the proximity of the numerical values of the resulting indicator is of great importance. Client-server and service-oriented architectures differ insignificantly in the resulting indicator, which means that they can be regarded as interchangeable.
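The comparison of the resulting values can be reproduced with a short sketch. The labels A1 to A5 are placeholders for the five compared architectures; the source states only that the highest value belongs to service-oriented architecture. The relative gap is a hypothetical measure of proximity and is not defined in the source.

```python
# Resulting indicator values as reported in the text; labels are placeholders.
scores = {"A1": 0.711, "A2": 0.754, "A3": 0.813, "A4": 0.891, "A5": 0.931}


def rank(scores):
    """Sort alternatives from the best to the worst resulting indicator."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def relative_gap(best, other):
    """Hypothetical proximity measure: gap relative to the best value."""
    return (best - other) / best


ranked = rank(scores)
(best_name, best_value), (second_name, second_value) = ranked[0], ranked[1]
gap = relative_gap(best_value, second_value)
print(best_name, second_name, round(gap, 3))
```

A small gap between the two best alternatives supports treating them as close substitutes, although the threshold for "insignificant" would have to be set by the decision maker.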