Selection of representative embankments based on the rough set-fuzzy clustering method

The selection of representative unit embankments is the precondition for a comprehensive evaluation of embankment safety. After the embankment is divided into unit embankments, their influencing factors and classification indexes are drawn up. Based on the rough set-fuzzy clustering method, the influencing factors of the unit embankments are measured by quantitative and qualitative indexes. A fuzzy similarity matrix of the unit embankments is constructed, and its fuzzy equivalent matrix is calculated by the squaring method. By setting a threshold on the fuzzy equivalent matrix, the unit embankments are clustered, and representative unit embankments are selected from the resulting classes.


Introduction
Because embankments are generally long, their safety is usually evaluated on a few representative dike sections. At present there is little research on the selection of representative embankments; the main approaches are the random sampling method [1] and the geological prospecting method. The random sampling method selects embankment sections by random grade or at random. For example, Zhang Qing-bin divides dike segments with the same features into 1 km evaluation units, which can be shortened appropriately where hidden dangers exist. The accuracy of the random sampling method depends on the number of units selected. When the embankment is long the workload becomes large, so the method is suitable only for shorter embankments [2]. The geological prospecting method mainly uses geological prospecting data to select sections with typical geological conditions or with a history of danger, and evaluates the overall safety state of the embankment from them [3]. This method requires little work, but it easily overlooks parts of the embankment, which makes the comprehensive safety evaluation inaccurate. Each of the above methods has its advantages and disadvantages. In this paper the embankment is first divided into unit embankments, and the rough set-fuzzy clustering method is then used to measure the similarity of their indexes, so that representative embankments can be selected more comprehensively and rationally.

Unit embankment classification based on rough set-fuzzy cluster analysis
In rough set theory an information system is commonly represented by a 4-tuple $S = (U, A, V, f)$ [4,5], where $U$ is the set of all objects in the data set; $A$ is the full attribute set of the data set, $A = C \cup D$; $C$ is the set of condition attributes; $D$ is the set of decision attributes; $V$ is the range of the attributes; and $f: U \times A \rightarrow V$ is the information function, which specifies the attribute value of each object in $U$.
When the theory is applied to the clustering and selection of unit embankments, $U$ is the set of all unit embankments, $C$ is the set of core index attributes, which contains both quantitative and qualitative indexes, and $V$ holds the values assigned to all attributes. The form of the rough set decision table for unit embankments is shown in Table 2.
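As an illustration of how such an information system might be held in code, the following minimal sketch stores a decision table as a nested dictionary. The unit names, attribute names and values are hypothetical and are not taken from the project data.

```python
# Hypothetical unit embankments and attributes, for illustration only.
U = ["X1", "X2", "X3"]                          # objects: unit embankments
C = ["clay_content", "foundation_structure"]    # condition attributes
D = ["unit_class"]                              # decision attribute
A = C + D                                       # full attribute set, A = C ∪ D

# The decision table: one row per unit embankment, one entry per attribute.
table = {
    "X1": {"clay_content": 22.0, "foundation_structure": "double-layer", "unit_class": 1},
    "X2": {"clay_content": 18.5, "foundation_structure": "single-layer", "unit_class": 2},
    "X3": {"clay_content": 21.0, "foundation_structure": "double-layer", "unit_class": 1},
}


def f(x, a):
    """Information function f: U x A -> V, returning the value of attribute a for object x."""
    return table[x][a]


# V: the range of values observed for each attribute.
V = {a: {f(x, a) for x in U} for a in A}
```

Here `clay_content` plays the role of a quantitative index and `foundation_structure` that of a qualitative one.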

Table 2. Rough set decision table of unit embankments
In Table 2, $C_1$ to $C_p$ represent the set of all indexes, which describe the characteristics of the unit embankments from different angles. However, since their dimensions and magnitudes are not consistent, the original index values need to be processed. In this paper the quantitative indexes are normalized [6].
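The text does not state which normalization is used. As one possible reading, the sketch below applies min-max scaling so that every quantitative index falls in [0, 1]; the choice of min-max scaling and the sample values are assumptions made for illustration.

```python
def min_max_normalize(rows):
    """Scale each column of a row-major table to the range [0, 1].

    Assumption: min-max scaling. The source only says that the
    quantitative indexes are normalized, not how.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column carries no information; map it to 0.
            0.0 if high == low else (value - low) / (high - low)
            for value, low, high in zip(row, lows, highs)
        ])
    return normalized


# Hypothetical values: clay content (%) and dry density (g/cm^3).
raw = [[20.0, 1.50], [30.0, 1.60], [25.0, 1.55]]
scaled = min_max_normalize(raw)
```

After scaling, indexes with different dimensions and magnitudes contribute on a comparable scale to any distance computed from them.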
For the quantitative and qualitative indexes in the rough set decision table, a suitable similarity measure is needed to construct the fuzzy similarity matrix. The similarity of unit embankments can be measured by the distance between them. For quantitative indexes, commonly used distances include the Minkowski, Manhattan, Euclidean, Mahalanobis and Chebyshev distances. In this paper the Euclidean distance is adopted:
$$d_E(X_i, X_j) = \sqrt{\sum_{k=1}^{w} (c_{ik} - c_{jk})^2} \qquad (1)$$
where $d_E$ is the Euclidean distance of the quantitative indexes between unit embankments; $c_{ik}$ and $c_{jk}$ are the $k$-th quantitative index values of units $X_i$ and $X_j$ respectively; and $w$ is the number of quantitative indexes.
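The Euclidean distance between the quantitative index vectors of two unit embankments can be sketched as follows. The function name is illustrative, and the inputs are assumed to be already normalized.

```python
import math


def euclidean_distance(ci, cj):
    """Euclidean distance between two equally long quantitative index vectors."""
    if len(ci) != len(cj):
        raise ValueError("both units must have the same number of indexes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ci, cj)))


# Two hypothetical units described by w = 2 quantitative indexes.
d = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```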
For qualitative indexes, a two-argument function is defined (formula (2)), and the distance between unit embankments $X_i$ and $X_j$ with respect to the qualitative indexes is given by formula (3), where $c'_{ik}$ and $c'_{jk}$ are the $k$-th qualitative index values of units $X_i$ and $X_j$ respectively, and $q$ is the number of qualitative indexes. From the above, the similarity measure of unit embankments $X_i$ and $X_j$ is given by formula (4), in which the weighting coefficient is the proportion of quantitative indexes among all indexes.
Formula (4) establishes the $n$-order fuzzy similarity matrix $R$, which satisfies the following two properties: ① reflexivity, $r_{ii} = 1$; ② symmetry, $r_{ij} = r_{ji}$. The matrix $R$ shows the degree of similarity between the unit embankments.
Fuzzy clustering analysis uses a fuzzy similarity relation or a fuzzy equivalence relation to divide the unit embankments into different classes according to a certain standard. Unit embankments in the same class are highly similar, while those in different classes are less similar. At present there are many fuzzy clustering methods based on the concept of fuzzy division, such as the transitive closure method, the maximal tree method and the net-making method [6].
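The construction of a fuzzy similarity matrix from mixed indexes can be sketched as below. This is not the paper's formula (4): the simple-matching distance for qualitative indexes, the scaling of both distances to [0, 1], and the conversion of distance to similarity as one minus the weighted distance are all assumptions chosen so that the result is reflexive and symmetric. Quantitative inputs are assumed to be normalized to [0, 1].

```python
import math


def mismatch_count(a, b):
    """Number of positions at which two qualitative index vectors differ (assumed measure)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def similarity_matrix(quant, qual):
    """Build an n x n fuzzy similarity matrix from mixed indexes.

    Assumed form: r_ij = 1 - (share * d_quant + (1 - share) * d_qual),
    where share is the proportion of quantitative indexes among all indexes
    and both distances are scaled to [0, 1].
    """
    n = len(quant)
    w = len(quant[0])           # number of quantitative indexes
    q = len(qual[0])            # number of qualitative indexes
    share = w / (w + q)
    r = [[1.0] * n for _ in range(n)]   # diagonal stays 1 (reflexivity)
    for i in range(n):
        for j in range(i + 1, n):
            squared = sum((a - b) ** 2 for a, b in zip(quant[i], quant[j]))
            d_quant = math.sqrt(squared) / math.sqrt(w)
            d_qual = mismatch_count(qual[i], qual[j]) / q
            # Fill both triangles at once so the matrix is symmetric.
            r[i][j] = r[j][i] = 1.0 - (share * d_quant + (1.0 - share) * d_qual)
    return r


# Hypothetical data: three units, two normalized quantitative and two qualitative indexes.
quant = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
qual = [["sand", "yes"], ["clay", "no"], ["sand", "yes"]]
R = similarity_matrix(quant, qual)
```

In this toy input, units 1 and 3 are identical and receive similarity 1, while units 1 and 2 differ in every index and receive similarity 0.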
The transitive closure is defined as follows. Let $R$ be a relation on a nonempty set $A$. If a relation $R'$ on $A$ satisfies the following conditions: $R'$ is reflexive (symmetric, transitive); $R \subseteq R'$; and $R' \subseteq R''$ for every reflexive (symmetric, transitive) relation $R''$ on $A$ that contains $R$, then $R'$ is called the reflexive (symmetric, transitive) closure of $R$. The transitive closure is denoted by $t(R)$.
The matrix $R$ established by formula (4) satisfies reflexivity and symmetry but, in general, does not satisfy transitivity, that is, $R \circ R \subseteq R$ does not hold. We therefore need to find the transitive closure $t(R)$ of $R$. Theorem 1 in [5] shows that if $R$ is a fuzzy similarity matrix, there exists a minimum natural number $k$ such that the transitive closure is $t(R) = R^k$, and $R^h = R^k$ for every natural number $h$ greater than $k$. Based on this, the squaring method is used to calculate the transitive closure of the fuzzy similarity matrix, namely the fuzzy equivalent matrix. A threshold $\lambda$ is then set on the fuzzy equivalent matrix, and the unit embankments are clustered according to it. During the clustering process, $\lambda$ can be reduced continuously until an appropriate number of classes is obtained.
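A minimal sketch of the squaring method follows, assuming the usual max-min composition of fuzzy matrices. The matrix is squared repeatedly until it stops changing, and the fixed point is the transitive closure. The function names and the 3 x 3 matrix are illustrative, not project data.

```python
def max_min_compose(a, b):
    """Max-min composition of two n x n fuzzy matrices."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(r):
    """Transitive closure of a reflexive, symmetric fuzzy matrix by repeated squaring.

    Computes R^2, R^4, R^8, ... and stops as soon as squaring changes nothing.
    For a reflexive n x n matrix this happens within n squarings.
    """
    current = r
    for _ in range(len(r)):
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared
    raise ValueError("no fixed point reached; is the input matrix reflexive?")


# Hypothetical fuzzy similarity matrix of three unit embankments.
R = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.6],
    [0.2, 0.6, 1.0],
]
T = transitive_closure(R)
```

In this example the entry between units 1 and 3 rises from 0.2 to 0.6, because the two units are linked through unit 2 with strength min(0.8, 0.6).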

Project profile
The embankment project is 12.92 km long and is an important flood protection project in the plain area of the drainage basin. The crest of the main embankment is 6 m wide and the average embankment height is 12.5 m. The upstream slope is 1:2.5; the downstream slope, restricted by the topography, is 1:2 or 1:2.5. The embankment body is loam. The embankment foundation contains boulders, gravelly sand and a small amount of sand and sandy loam. The upper part is sandy loam, silty sand, cohesive soil and silty soil. The surface layer is the most recently deposited silty soil, which has been washed and deposited by water. Because the water flow is slow and stable, the sedimentary regularity and stratification are obvious, and the main characteristics are interbedded layers, thin-layer distribution and well-developed horizontal bedding.

Impact index classification
According to the basic principles of unit dike division, the 12.92 km embankment was divided into 12 units. The classification indexes of the unit embankments, including the clay content of the embankment body soil, dry density, downstream slope ratio, embankment foundation structure and hidden dangers, are summarized in Table 3.

Unit embankment classification
The quantitative and qualitative indexes in Table 3 are processed by formulas (1) and (3) respectively, and formula (4) is then used to establish the fuzzy similarity matrix of the unit embankments. From its fuzzy equivalent matrix, the $\lambda$-cut matrix $R_\lambda$ is obtained for a chosen threshold $\lambda$.
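How a $\lambda$-cut of a fuzzy equivalent matrix yields classes can be sketched as follows. The 3 x 3 matrix is a hypothetical example rather than the matrix of the 12 project units, and the function name is illustrative.

```python
def lambda_cut_classes(equiv, threshold):
    """Group indices whose fuzzy equivalence degree is at least the threshold.

    Assumes equiv is a fuzzy equivalent matrix (reflexive, symmetric,
    transitive), so each row of the cut matrix marks one complete class.
    """
    classes = []
    assigned = set()
    for i in range(len(equiv)):
        if i in assigned:
            continue
        members = [j for j in range(len(equiv)) if equiv[i][j] >= threshold]
        classes.append(members)
        assigned.update(members)
    return classes


# Hypothetical fuzzy equivalent matrix of three unit embankments.
T = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.6],
    [0.6, 0.6, 1.0],
]
fine = lambda_cut_classes(T, 0.9)     # every unit on its own
medium = lambda_cut_classes(T, 0.8)   # units 0 and 1 merge
coarse = lambda_cut_classes(T, 0.6)   # all units in one class
```

Lowering the threshold merges classes step by step, which matches the procedure of reducing $\lambda$ until a suitable number of classes is reached.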