Adaptive Positive-Negative Selection Approach

Positive selection and negative selection are the main artificial immune approaches. They can be applied to any task that requires automatic identification of incoming cases as self or non-self. Both approaches build so-called detectors to protect self cells, i.e. positive-class objects. In the positive selection approach, detectors are able to recognize self objects, whereas the negative selection approach produces detectors that eliminate non-self objects. Because the detectors are constructed differently, the choice of approach may depend on the characteristics of the data under consideration. The goal of this paper is to provide a hybrid approach that adapts to a given data set, thereby producing the classifier with the best performance. The adaptation is twofold: (a) in the training phase, it is automatically determined for each candidate detector whether it will become a self or a non-self detector; (b) in the inner test phase, the best type of classifier (i.e. pure positive selection, pure negative selection, positive-negative selection, or negative-positive selection) is chosen. Results of experiments conducted on databases from the UCI Repository show that combining the positive and negative selection approaches may yield higher classification accuracy.


Introduction
The ability of a natural immune system to distinguish between pathogens and non-pathogens has inspired researchers to adapt this mechanism in artificial intelligence [1]. Such a system includes immune cells (detectors), called T cells, which have a component for recognizing self cells and an antigen receptor for locating and eliminating infected pathogens. The system generates a large number of T cells, so any pathogen that infects the organism is very likely to be detected by at least one of them. Before becoming a detector of the system, a cell is verified to ensure that it is able to distinguish between self and non-self cells. The verification can be done using positive or negative selection. In positive selection, a T cell that recognizes any of the organism's own cells is saved; otherwise it is removed. In negative selection, in turn, only cells that do not detect self cells are stored [2].
Although the two selection approaches are mutually dual, and can therefore complement each other, they have usually been used separately. Only a few studies on combining positive and negative selection approaches can be found.
A formal framework for analyzing different positive and negative detection schemes for data represented by binary strings was given in [12]. The framework makes it possible to find the tradeoff between positive and negative detection schemes in terms of the number of detectors needed to correctly distinguish self objects from non-self ones.
An approach proposed in [13] is devoted to binary data representations and uses the r-chunk matching rule. The approach starts by building binary trees corresponding to positive r-chunk detectors. To obtain a compact representation of the positive detector set, all complete subtrees of these trees are removed. Finally, a positive tree is replaced with its negative counterpart if the latter is more compact.
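The r-chunk rule mentioned above can be illustrated with a short sketch. It assumes the standard formulation, in which a detector is a pair of a start position and a bit string of length r; the function names here are illustrative and not taken from [13]:

```python
def r_chunk_matches(detector, sample):
    """A detector (position, chunk) matches a sample string iff the sample
    contains exactly that chunk starting at that position."""
    position, chunk = detector
    return sample[position:position + len(chunk)] == chunk

def is_non_self(detectors, sample):
    """A negative r-chunk detector set flags a sample as non-self
    if any detector matches it."""
    return any(r_chunk_matches(d, sample) for d in detectors)
```

For example, the detector `(1, "01")` matches the binary string `"0010"`, because the window of length 2 starting at position 1 is `"01"`.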
The goal of this paper is to propose a combination of positive and negative selection approaches that adapts to the data to be classified. In the training phase, points from a real-valued space (i.e. candidates for detectors) are selected randomly. Based on the distance of a point to the nearest self object, it is determined whether the point will become a self or a non-self detector. In the inner test phase, objects are classified using four combinations: pure positive selection, pure negative selection, positive-negative selection, and negative-positive selection. In the first two cases, only self or only non-self detectors are used. In the third (fourth) case, if an object is not recognized by any self (non-self) detector, it is classified by its nearest detector. The combination that gives the highest accuracy is chosen as the classifier for the data.
Experimental research shows that the positive-negative approach may increase the classification accuracy and that a better performance is obtained when one of the four combinations is defined individually for each dataset.
The rest of the paper is organized as follows. Section 2 proposes a combination of positive and negative selection approaches. Section 3 describes experimental research conducted using the proposed approach and six datasets taken from the UCI Repository. Concluding remarks are provided in Section 4.

A Combination of Positive and Negative Selection Approaches
This section proposes an algorithm (Algorithm 1) that generates self and non-self detectors at the same time. An algorithm (Algorithm 2) that classifies new objects according to one of four versions (positive selection, negative selection, and their two combinations) is also introduced.
Each point (i.e. an m-ary vector of real numbers) used to build a detector in Algorithm 1 is selected randomly from the data space, which is assumed to be normalized to the unit hypercube. The selection is done either from the whole space, i.e. R_SN, or from the space of self objects, i.e. R_S. The choice of space is made at random. This procedure avoids an undesirable disproportion between the cardinalities of the self and non-self detector sets. Namely, selecting points only from the whole space makes the construction of a self detector very unlikely (a candidate detector is usually too far from the nearest self object). On the other hand, limiting the domain to the self-object space might yield very few non-self detectors (a candidate detector is usually too close to the nearest self object).
The radius of a self detector is constant and defined by the user. The radius of a non-self detector is set individually and equals the distance from the detector to the nearest self object.
The algorithm stops searching for new detectors if each of k points selected in a row is within the scope of any detector already created.
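The detector-generation procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the way the self-object space is sampled (perturbing a random self object) are our assumptions.

```python
import math
import random

def generate_detectors(self_objects, self_radius, k, dim, seed=None):
    """Sketch of Algorithm 1. Candidates are drawn either from the whole
    unit hypercube or from near a random self object (a proxy for sampling
    the self-object space). A candidate within self_radius of the nearest
    self object becomes a self detector; otherwise it becomes a non-self
    detector whose radius equals that nearest-self distance. Generation
    stops after k consecutive candidates fall inside existing detectors."""
    rng = random.Random(seed)
    self_detectors, non_self_detectors = [], []
    misses = 0
    while misses < k:
        if rng.random() < 0.5:  # sample the whole space
            p = [rng.random() for _ in range(dim)]
        else:                   # sample near a self object, clamped to [0, 1]
            s = rng.choice(self_objects)
            p = [min(1.0, max(0.0, x + rng.uniform(-self_radius, self_radius)))
                 for x in s]
        # a candidate covered by an existing detector counts as a miss
        if any(math.dist(p, c) <= r
               for c, r in self_detectors + non_self_detectors):
            misses += 1
            continue
        misses = 0
        d = min(math.dist(p, s) for s in self_objects)  # nearest self distance
        if d <= self_radius:
            self_detectors.append((p, self_radius))     # constant radius
        else:
            non_self_detectors.append((p, d))           # individual radius
    return self_detectors, non_self_detectors
```

Because every uncovered candidate becomes a detector of positive radius, coverage of the hypercube grows monotonically, so the k-consecutive-misses stopping rule is eventually triggered.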
Besides pure positive or negative selection, Algorithm 2 is able to use their combination for objects that are not included in the sphere of any detector. The class of such an object is determined by the type (self or non-self) of the nearest detector, where the distance is measured between the object and the detector center.

Experimental Research
Six datasets (Table 1) taken from the UCI Repository (http://archive.ics.uci.edu/ml) were used in the experimental research.
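The four classification variants described above can be sketched as follows. The mode names and the (center, radius) detector representation are our assumptions for illustration, not the authors' code:

```python
import math

def classify(obj, self_detectors, non_self_detectors, mode):
    """Sketch of Algorithm 2. mode is one of:
    "PS"  - pure positive selection, "NS"  - pure negative selection,
    "PNS" - positive selection with nearest-detector fallback,
    "NPS" - negative selection with nearest-detector fallback.
    Detectors are (center, radius) pairs."""
    in_self = any(math.dist(obj, c) <= r for c, r in self_detectors)
    in_non_self = any(math.dist(obj, c) <= r for c, r in non_self_detectors)
    if mode == "PS":
        return "self" if in_self else "non-self"
    if mode == "NS":
        return "non-self" if in_non_self else "self"
    primary_hit = in_self if mode == "PNS" else in_non_self
    if primary_hit:
        return "self" if mode == "PNS" else "non-self"
    # fallback: class given by the nearest detector center of either type
    nearest = min(
        [(math.dist(obj, c), "self") for c, _ in self_detectors] +
        [(math.dist(obj, c), "non-self") for c, _ in non_self_detectors]
    )
    return nearest[1]
```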
The following testing procedure was applied. First of all, to avoid the possible influence of different ranges of attribute domains, all databases were normalized into the unitary hypercube [0, 1]^n, where n is the cardinality of the attribute set. One class of each database (the one whose cardinality is underlined in Table 1) was taken as self data, the remaining ones as non-self data. Each data set was divided into five equinumerous sets. Four of them were used in 4-fold cross-validation (inner test), and the remaining one as an additional testing set (outer test). The non-self objects were used only in the classification phase. To eliminate the influence of randomness, each data set was tested five times and the average results were taken. The radius of self detectors (0.15 of the maximal possible distance) was determined in preliminary experimentation. The distance between an object and a detector was measured using the Euclidean metric. The quality measure used in the experiments was accuracy.

Table 2 shows the classification accuracy for the training data (TR), the test data from 4-fold cross-validation (TS1), and the additional test data (TS2). The negative-positive selection combination (NPS) had the best performance for five (TR), three (TS1), and three (TS2) out of six datasets. However, its results for the other datasets are not always close to the best ones, see e.g. Wine (TS1, TS2) or Record (TR, TS1, TS2). This leads to the conclusion that the choice of combination should be made for each database individually, based on the training data (TR) result or the first test data (TS1) result. One can see that the best TR result translates into the best TS1 result for five data sets, whereas the highest TS1 result always implies the highest TS2 result. To investigate this issue more deeply, all the results, i.e. those used to calculate the averages, were taken into account.
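The normalization step described above (scaling every attribute into [0, 1]) can be sketched as a simple min-max transform; the function name is ours:

```python
def minmax_normalize(rows):
    """Min-max scale each attribute (column) of a data set into [0, 1].
    Constant attributes are mapped to 0.0 to avoid division by zero."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(x - l) / (h - l) if h > l else 0.0
             for x, l, h in zip(row, lo, hi)]
            for row in rows]
```

For instance, `minmax_normalize([[0, 10], [5, 20], [10, 30]])` yields `[[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]`.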
Table 3 shows, as a percentage, how often the best result for TR (TS1) is also the best one for TS1 (TS2). The experimentation reveals that the choice of combination is more accurate when it is made based on the test data (TS1). Namely, for five databases the result is better or the same, and the average performance is clearly higher.

Conclusion
This paper has proposed a combination of positive and negative selection approaches for real-valued data. The approach controls the process of detector generation by determining whether a new detector will be self or non-self. It is also able to find the best combination (i.e. pure positive selection, pure negative selection, positive-negative selection, or negative-positive selection) for a given dataset. Based on the experimental research, one can conclude that: (i) a combination of the two approaches may give higher classification accuracy than either used separately; (ii) finding the best combination for a given dataset is more likely when it is based on the test data rather than the training data.
The proposed approach is planned to be applied in future work as a tool for automatic detection of network anomalies.