Gain ratio in weighting attributes on simple additive weighting

Simple Additive Weighting (SAW), also known as simple weighted linear combination, is one of the most widely used Multi Attribute Decision Making (MADM) methods. However, several studies report that it produces lower accuracy than other MADM methods. The weighting of each attribute is not validated, which affects the decision-making process, and incompatible attributes can cause errors in decision making and in determining the best alternative. In this study, gain ratio is used as the basis of attribute weighting in SAW. The datasets are taken from the UCI machine learning repository: cryotherapy, immunotherapy, ILPD and user knowledge modelling. The accuracy obtained with gain ratio weights is compared, using relative standard deviation, with the accuracy of SAW based on the weights of the dataset. The average accuracy is 28.1825% with the weights of the dataset and 31.6975% with gain ratio weights, so attribute weighting based on gain ratio gives better accuracy. However, for the cryotherapy dataset the accuracy with gain ratio weights is more than 0.5% lower than with the weights of the dataset, because the values in that dataset are widely spread.


Introduction
Simple Additive Weighting (SAW) was developed by MacCrimmon in 1968 and is also known as weighted linear combination, the scoring method, or weighted summation [14]. SAW uses the principle of the weighted average. It is also called the weighted addition method and is the simplest and most widely used method [7]. Research [14] found that SAW cannot provide controlled consistency because it has no comparative index to serve as an indicator. Research [15] analyzed several MCDM methods and determined their applicability in particular situations by evaluating the relative advantages and disadvantages of each method. According to that study, the disadvantage of SAW is that the resulting estimate does not necessarily reflect the true situation: the result may not be logical when the value of one attribute differs greatly from the others. Research [1] compared and analyzed two MCDM methods. It found that SAW is a good alternative to the Analytic Hierarchy Process (AHP), but that SAW is not suitable for aggregating rank values when there are many hierarchical levels with sub-attributes. Research [12] analyzed the results using the relative standard deviation of the rank values; SAW reached 12.64%, while WPM reached 35.75%. Research [3] showed that the product rankings produced by TOPSIS, SAW and WPM are more accurate when entropy is used to evaluate the weight of each attribute. Research [9] showed that TOPSIS reached 50% accuracy for at least three matching rank positions among twenty participants, followed by PROMETHEE II (40%), ELECTRE II (30%), TODIM (25%) and SAW (25%). TOPSIS correctly predicted the first-ranked alternative 15 times, which corresponds to 79% accuracy, compared with 74% for SAW.
Research [2] showed that AHP was the most widely used method (42.42%) because its structure is simple, flexible and easy to understand, followed by TOPSIS (13.33%) and SAW (12.73%). According to several studies, SAW is grounded in policy analysis and is easier and less time-consuming to calculate than other multi-attribute methods. However, SAW has no clear validation in its weighting system and can therefore produce less accurate results [7]. SAW combined with another method in study [6] produced higher accuracy, and research [8] modified SAW to obtain a better alternative, which suggests that SAW can work more effectively with the help of other methods. According to research [11], gain ratio is a modification of information gain that reduces its bias. Therefore, this research uses gain ratio to assign attribute weights before the decision-making process, in the expectation that this will improve the accuracy of SAW.

Multi Attribute Decision Making (MADM)
MADM is a method used to find the optimal alternative among a number of existing alternatives based on certain attributes. MADM assigns a weight to each attribute and then ranks the alternatives in order to select among them. The value of an attribute describes the characteristics, quality and performance of an alternative. The attribute weight measures the importance of each attribute. There are three categories of approaches for finding the weight of each attribute: the subjective approach, the objective approach and the hybrid approach [4] [13].

Simple Additive Weighting (SAW)
Simple Additive Weighting (SAW) is based on a weighted average. The preference value of each alternative is calculated by multiplying the scaled value of the alternative on each attribute by the relative importance weight provided by the decision maker, and then summing the products over all criteria [12]. SAW is able to balance the criteria or attributes, is intuitive for decision making, and has a simple calculation process that does not require complicated computer programs [15]. SAW distinguishes two types of attributes: benefit attributes and cost attributes. The two types are treated differently when attributes are evaluated in decision making [4]. The preference value of each alternative is calculated as follows [13]:
$$V_i = \sum_{j=1}^{n} w_j \, r_{ij} \qquad (1)$$
Where $V_i$ is the preference value of alternative $i$, $w_j$ is the weight of attribute $j$ and $r_{ij}$ is the normalized performance rating of alternative $i$ on attribute $j$.
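
As a purely illustrative calculation (the numbers are hypothetical and are not taken from the datasets used in this study), consider an alternative with three attributes, weights $w = (0.5, 0.3, 0.2)$ and normalized ratings $r_i = (1.0, 0.5, 0.8)$. Its preference value would be
$$V_i = 0.5 \times 1.0 + 0.3 \times 0.5 + 0.2 \times 0.8 = 0.5 + 0.15 + 0.16 = 0.81$$
The alternative with the largest preference value is taken as the best alternative.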

Gain Ratio
The C4.5 algorithm is a decision tree method that selects attributes based on the gain ratio. Gain ratio is a modification of information gain that reduces its bias. By taking the intrinsic information of each attribute into account, gain ratio improves on information gain [11].

The entropy of a dataset measures the amount of information needed to describe an attribute. The larger the information gain, the stronger the attribute, so the partition is made on the attribute with the highest information gain [5]. The gain ratio in the C4.5 algorithm is used to calculate the influence of an attribute on the target of the data [11]. Gain ratio is a development of information gain.

Relative Standard Deviation (RSD)
Relative Standard Deviation (RSD) is used to compare how good an accuracy value is. It is expressed in percent and obtained by multiplying the standard deviation by 100 and dividing by the mean of the dataset [12]. Standard deviation measures how precise the average is, that is, how closely the individual values agree with each other. It is also an index of the spread of a set of values: if the values are grouped close to the average, the standard deviation is small; if they are widely spread, it is large [12].

Methodology
The method used in this study consists of several stages: pre-processing the data by determining the attributes, calculating the attribute weights using the gain ratio, making decisions using the Simple Additive Weighting (SAW) method, and then calculating the relative standard deviation for each dataset under both the gain ratio weights and the weights of the dataset.

Pre-processing Data
The pre-processing stage selects the data to be used. This study uses 4 datasets from the UCI machine learning repository: 90 records of cryotherapy patient data, 90 records of immunotherapy patient data, 583 records of the Indian liver patient dataset (ILPD) and 403 records of user knowledge modelling data.

Types of attributes.
The attributes used in the weighting and decision-making process are as follows. The cryotherapy dataset has 6 attributes (sex, age, time, number of warts, type and area) and the result of treatment, with ratings 0 and 1. The immunotherapy dataset has 7 attributes (sex, age, time, number of warts, type, area and induration diameter) and the result of treatment, with ratings 0 and 1. The Indian liver patient dataset (ILPD) has 10 attributes (age, gender, TB, DB, alkphos, sgpt, sgot, TP, ALB and A/G) and the result is the selector, with ratings 1 and 2. Finally, the user knowledge modelling dataset has 5 attributes (STG, SCG, STR, LPR and PEG) and the result UNS, with the ratings very low, low, middle and high.

Data Processing
Each dataset is processed in several stages: weighting each attribute with the weights of the dataset and with the gain ratio weights, calculating the preference value of each alternative with the Simple Additive Weighting (SAW) method, and then calculating the relative standard deviation of the dataset from the preference values of the alternatives.

Weighted Attributes.
The attribute weights are determined in two ways: the weights of the dataset and the gain ratio weights. The weights of the dataset are derived from the predefined dataset source. The gain ratio weights are determined in the following steps [10]:
Step 1: Calculate the entropy of the whole dataset from the proportions of its result classes.
$$\mathrm{Entropy}(S) = \sum_{i=1}^{n} -p_i \log_2 p_i$$
Where $S$ is the set of cases, $n$ is the number of partitions of $S$ and $p_i$ is the proportion of $S_i$ to $S$.
Step 2: Calculate the Information Gain of each attribute, which serves as the attribute selection measure. The Information Gain of an attribute is the overall entropy of the dataset minus the sum, over the partitions induced by the attribute, of each partition's share of the data multiplied by the entropy of that partition.
$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{i=1}^{n} \frac{|S_i|}{|S|} \, \mathrm{Entropy}(S_i)$$
Where $S$ is the whole dataset, $A$ is the attribute, $n$ is the number of partitions of attribute $A$, $|S_i|$ is the size of the subset of the dataset that belongs to the $i$-th partition of $A$ and $|S|$ is the number of cases in the dataset.
Step 3: Calculate the Split Info of each attribute from the number of cases that take each value of the attribute.
$$\mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2 \frac{|D_j|}{|D|}$$
Where $D$ is the whole dataset, $A$ is the attribute, $v$ is the number of partitions of attribute $A$, $|D_j|$ is the size of the subset of the dataset that belongs to the $j$-th partition of $A$ and $|D|$ is the number of cases in the dataset.
Step 4: Calculate the Gain Ratio of each attribute by dividing its Information Gain by its Split Info.
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}(A)}$$
Both sets of attribute weights, the weights of the dataset and the gain ratio weights, are then used in the Simple Additive Weighting (SAW) calculation to obtain the preference values under the two different weightings.
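
The four steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names and the toy data are hypothetical, and the sketch assumes that every attribute is categorical. The paper does not state how numeric attributes such as age or time were partitioned, so any discretization would be an additional assumption.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels (Step 1)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    # Partition the class labels by the value the attribute takes.
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    # Step 2: information gain = entropy before the split minus weighted entropy after it.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Step 3: split info = entropy of the partition sizes themselves.
    split_info = -sum(
        len(p) / total * math.log2(len(p) / total) for p in partitions.values()
    )
    # Step 4: gain ratio; an attribute with a single value carries no split info.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: "sex" separates the result perfectly, "noise" not at all.
sex = ["m", "m", "f", "f"]
noise = ["a", "b", "a", "b"]
result = [1, 1, 0, 0]

weight_sex = gain_ratio(sex, result)
weight_noise = gain_ratio(noise, result)
```

In this toy case the attribute that determines the result receives the larger weight, which is the behaviour the gain ratio weighting relies on.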

Calculation of Simple Additive Weighting (SAW).
The Simple Additive Weighting (SAW) method is applied twice, once with the weights of the dataset and once with the gain ratio weights, to generate the preference value of each alternative in the dataset. The calculation with either set of attribute weights proceeds as follows [13].
Step 1: Determine the decision matrix Z of size m x n, where m is the number of alternatives (Ai) and n is the number of criteria or attributes (Cj).
Step 2: Assign the value x of each alternative (i) on each criterion or attribute (j), where i = 1, 2, ..., m and j = 1, 2, ..., n, in the decision matrix Z.
Step 3: Input the preference weight (W) of each attribute, based on either the weights of the dataset or the gain ratio weights.
Step 4: Normalize the decision matrix Z according to the type of criterion or attribute (benefit or cost) to obtain the normalized matrix (R). If (j) is a benefit attribute:
$$r_{ij} = \frac{x_{ij}}{\max_i x_{ij}}$$
If (j) is a cost attribute:
$$r_{ij} = \frac{\min_i x_{ij}}{x_{ij}}$$
Step 5: Multiply the normalized matrix (R) by the preference weights (W), based on either the weights of the dataset or the gain ratio weights, to generate the preference value of each alternative.
Step 6: Use the preference value of each alternative (Vi), obtained with equation (1), to determine the best alternative and the accuracy of the Simple Additive Weighting (SAW) method.
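
The calculation steps above can be sketched in Python as shown below. This is only an illustration under stated assumptions: the function name `saw_scores`, the decision matrix, the weights and the benefit/cost flags are all hypothetical, and the sketch assumes strictly positive attribute values so that the normalization never divides by zero.

```python
def saw_scores(matrix, weights, is_benefit):
    """Return the SAW preference value V_i for each row (alternative) of matrix."""
    n_attrs = len(weights)
    # Collect each attribute column once so max/min are taken over all alternatives.
    columns = [[row[j] for row in matrix] for j in range(n_attrs)]
    scores = []
    for row in matrix:
        v = 0.0
        for j in range(n_attrs):
            if is_benefit[j]:
                r = row[j] / max(columns[j])   # benefit: x_ij / max_i x_ij
            else:
                r = min(columns[j]) / row[j]   # cost: min_i x_ij / x_ij
            v += weights[j] * r                # weighted sum, equation (1)
        scores.append(v)
    return scores


# Hypothetical decision matrix: 3 alternatives, 2 attributes (benefit, cost).
Z = [[4, 200], [2, 100], [3, 400]]
W = [0.6, 0.4]
benefit = [True, False]

V = saw_scores(Z, W, benefit)
best = max(range(len(V)), key=V.__getitem__)  # index of the best alternative
```

The same function can be called once with the weights of the dataset and once with the gain ratio weights, which yields the two sets of preference values that are compared in this study.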

Calculate Relative Standard Deviation.
The preference values of all alternatives, obtained with the weights of the dataset and with the gain ratio weights, determine the accuracy of the SAW method. The standard deviation is obtained by taking the sum of squares of the preference values of all alternatives, subtracting the square of the sum of the preference values divided by the number of alternatives, dividing the result by the number of alternatives minus 1, and taking the square root [12].
$$S = \sqrt{\frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}} \qquad (8)$$
Where $S$ is the standard deviation, $\sum x_i^2$ is the sum of the squares of the individual measurements, $\sum x_i$ is the sum of the individual measurements and $n$ is the number of samples analyzed.
The relative standard deviation is the standard deviation multiplied by 100 and divided by the mean of the preference values of all alternatives.
$$\mathrm{RSD} = \frac{S}{\bar{x}} \times 100 \qquad (9)$$
Where RSD is the relative standard deviation, $S$ is the standard deviation and $\bar{x}$ is the mean of the data.
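
A minimal Python sketch of the two calculations above is given below, assuming the sample standard deviation with n - 1 in the denominator as described in the text. The function name and the input values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def relative_standard_deviation(values):
    """RSD in percent: sample standard deviation times 100, divided by the mean."""
    n = len(values)
    total = sum(values)
    sum_sq = sum(x * x for x in values)
    # Sum of squares minus (squared sum / n), guarded against tiny negative
    # values caused by floating-point rounding.
    variance = max(0.0, sum_sq - total ** 2 / n) / (n - 1)
    s = math.sqrt(variance)
    mean = total / n
    return s / mean * 100


# Hypothetical preference values of three alternatives.
prefs = [2.0, 4.0, 6.0]
rsd = relative_standard_deviation(prefs)
```

For these values the standard deviation is 2 and the mean is 4, so the relative standard deviation is 50%.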

Result and Discussion
This section discusses the results of testing the Simple Additive Weighting (SAW) method: the attribute weights, the preference value of each alternative and the performance accuracy.

Attribute Weighting
Weights based on the weight of the dataset are determined using one of three categories of approaches: subjective, objective and hybrid. The weights of the 4 tested datasets are taken from the dataset source. The gain ratio weights, in contrast, are determined from the amount of data and the correspondence between data groups. The data values of each attribute across all alternatives therefore have a strong influence on the weights, because the gain ratio determines the weight by reducing the bias of the data. The weights of the 4 tested datasets, according to their number of attributes, can be seen in

Alternative Preference Value
Testing the Simple Additive Weighting (SAW) method with the weights of the dataset and with the gain ratio weights yields the preference value of each alternative in the dataset. The preference value determines the best alternative of the dataset. The best alternative of each of the 4 datasets can be seen in

Value of method performance accuracy
The performance accuracy of the Simple Additive Weighting (SAW) method under the weights of the dataset and under the gain ratio weights is derived from the preference values of all alternatives, calculated using equations (8) and (9), in decision making and in determining the best alternative. The result of performance accuracy can be seen in table 3. Based on the table 2 test results for the 4 datasets, the average accuracy is 28.1825% for attribute weighting based on the weights of the dataset and 31.6975% for attribute weighting based on the gain ratio. Attribute weighting based on the gain ratio therefore gives better accuracy. However, for the cryotherapy dataset the accuracy obtained with the gain ratio weights is 0.5% lower than with the weights of the dataset, because the values in that dataset are scattered, which affects the preference values of the alternatives.

Conclusion
Based on the results of testing the 4 datasets, the average accuracy over the four datasets increased by 3.5150%. Attribute weighting based on the gain ratio therefore gives better accuracy. The higher the weight of an attribute, the higher its level of importance and the greater its influence on decision making and on determining the best alternative. However, if the weights of the attributes differ greatly from one another, this affects the preference value of each alternative and the performance of the Simple Additive Weighting (SAW) method.