Increasing the Safety in Recycling of Construction and Demolition Waste by Using Supervised Machine Learning

This paper discusses the optical identification of recycled aggregates from construction and demolition waste (CDW) using image processing, spectral analysis and machine learning. The classification performance on colour images shows that additional spectral information is needed to solve the recognition task satisfactorily. In addition to investigations on a large colour image dataset, first investigations in the visible (VIS) and infrared (IR) spectrum were carried out to identify significant spectral characteristics that are useful for classifying the C&D aggregates.


Introduction
Construction and demolition waste (CDW) is, at around 80 million tons per year, the largest waste stream in Germany. The recyclable share depends on the composition and heterogeneity of the material. Recycled masonry aggregates and recycled mixed aggregates have the lowest recycling rates because of their high heterogeneity and their mineral admixtures, which makes the reuse of these materials very difficult.
The building materials recycling industry is dominated by simple technologies. For instance, single-stage crushing is used with preliminary sieving and separation of reinforcement steel by an overbelt magnetic separator. In the processing of building materials, sorting processes have so far been used only for the separation of light components. These technologies cannot separate the mixed aggregates that arise. They are in no way suitable for "new building materials", including composite building materials, which will be used more and more in the building industry.
The heterogeneity of C&D aggregates prevents profitable reuse, so it must be reduced. This is possible only with a multi-stage process comprising several classification and sorting steps.
The investigations focus on the optical differentiation of the CDW classes according to the standard DIN 4226-100, which specifies recycled aggregates for concrete and mortar [1].
First investigations, which showed the feasibility of automatically recognizing a selection of CDW classes on the basis of colour image information, were published in [2] and [3]. In the investigations reported here, the number of subclasses and instances was significantly extended in order to evaluate which wavelength information is necessary for solving such a complex recognition task. The observed wavelength range was therefore extended from colour image information through VIS spectrum information to IR spectrum information.
The 50 subclasses used, drawn from the 5 superordinate classes of DIN 4226-100, show in many cases a high phenotypic similarity, for example concrete / aerated concrete / lightweight concrete, and also porous and dense brick. The investigations were carried out on new, unused building materials, which were crushed to obtain homogeneous samples.
The aim is to develop an optical solution for determining building material classes using image processing, spectral analysis and machine learning, in order to achieve truly closed material cycles and a high standard of quality in recycling. Several optical and spectral attributes with the discriminatory power to classify the chosen materials were found. Several supervised machine learning classification algorithms were tested on different feature vectors, each a numerical representation of the objects in the given dataset.

State of the art
As in other recycling sectors, for example glass or plastic recycling, sensor-based sorting has become increasingly interesting for the recycling of building materials and the sorting of minerals in recent years. The methods mainly used rely on optical, magnetic, NIR or X-ray sensors.
The application of automatic sensor-based sorting in mining and recycling is successful in Europe and will increase in the coming years. The benefits are an increase in end-product value and a cost reduction in the downstream handling steps of the processing [4], [5].

Image analysis - object recognition in colour images
The first attempt to solve the recognition task used only phenotypic characteristics in the VIS range to classify C&D aggregates. The goal was to acquire colour images of all given CDW classes and to use visual contour, colour and texture features for training supervised classifiers.

Image acquisition
For this research on image-based recognition of C&D aggregates in the visible spectrum (VIS), a specialized system for feeding, separation and image acquisition was developed and used. The device can handle sample components of 8 to 16 mm in diameter. The sample is filled into the charging bin, is separated via a chute and travels on a belt conveyor. The aim is to obtain non-touching and non-overlapping objects in the field of view of the colour line scan camera, resulting in single-object images after segmentation.
A large dataset was generated from different sample objects using the described image acquisition device. In total, nearly 32,000 objects were available for building the training and test datasets by cross-validation. This dataset was used to compare several supervised classifiers and to tune classifier parameters, for example the SVM parameters, via cross-validation.

Feature extraction
After an accurate segmentation of the foreground region, the images were transformed from the RGB to the HSI colour space. A feature vector with 234 features (several contour, colour and texture features) was then calculated. The feature extraction algorithms used are part of the machine vision software HALCON and are described in the release notes for MVTec HALCON 8.0.3 [6]. These feature vectors provide the basis for classifier training, independent of the chosen classifier algorithm.
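The colour space transformation can be illustrated with a minimal sketch. It uses the common textbook definition of HSI (intensity as the channel mean, saturation from the minimum channel, hue from the arccosine formula). Whether the HALCON transform matches this definition exactly is an assumption, and the function name `rgb_to_hsi` is hypothetical.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (hue_deg, saturation, intensity).

    Textbook HSI definition; not necessarily identical to the HALCON transform.
    """
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined, use 0
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        return 0.0, 0.0, intensity  # grey pixel: hue is undefined, use 0
    # Clamp the ratio to [-1, 1] to guard against rounding errors before acos.
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity
```

Colour features such as mean hue or saturation histograms of a segmented object would then be computed over the converted foreground pixels.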

Comparison of classification algorithms
Several supervised machine learning classification algorithms were tested on the given colour image dataset to solve the recognition task with an automatic recognition routine. The tested algorithms are part of the machine learning library Weka [7]: support vector machines (LibSVM as C-SVM and nu-SVM), decision tree classifiers (J48 and Random Forest), distance-based classifiers (k-nearest neighbour) and statistical classifiers (Naive Bayes). All were evaluated using 10-fold cross-validation.
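The evaluation scheme can be sketched as follows. This is a minimal, standard-library illustration of k-fold cross-validation with a toy 1-nearest-neighbour classifier; it is not the Weka implementation used in the study, the folds are not stratified, and the data and all function names are hypothetical.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nearest_neighbour_predict(train_x, train_y, x):
    """Return the label of the training vector closest to x (squared Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    return train_y[best]

def cross_validated_trr(features, labels, k=10, seed=0):
    """Total recognition rate: correctly classified objects / all objects, pooled over folds."""
    correct = 0
    for fold in k_fold_indices(len(features), k, seed):
        held_out = set(fold)
        train_x = [f for i, f in enumerate(features) if i not in held_out]
        train_y = [l for i, l in enumerate(labels) if i not in held_out]
        correct += sum(
            nearest_neighbour_predict(train_x, train_y, features[i]) == labels[i]
            for i in fold
        )
    return correct / len(features)

# Hypothetical toy data: two well-separated classes in a 2-D feature space.
rng = random.Random(1)
features = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(20)] + \
           [(rng.gauss(5, 0.1), rng.gauss(5, 0.1)) for _ in range(20)]
labels = ["brick"] * 20 + ["concrete"] * 20
trr = cross_validated_trr(features, labels, k=10)
```

Each object is classified exactly once by a model that never saw it during training, so the pooled rate is an estimate of performance on unseen objects.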

Recognition rates of different classifiers.
The classifiers with the best performance on the given problem are Random Forest and the nu- and C-LibSVM, with total recognition rates (TRR) of 94.7% up to 97.2%. Statistical classifiers like Naive Bayes (TRR = 44.8%) fail, and simple distance-based classifiers like nearest neighbour (TRR = 84.8%) perform worse on the given dataset. Good recognition rates can thus be reached using colour image analysis alone. However, a considerable number of false classifications remain between classes that are critical and uncritical for recycling. The results of colour image analysis are therefore not sufficient for using the classified C&D aggregates as high-quality building materials.

Spectral analysis - enlargement of spectral wavelength range
The results of object recognition in colour images show that additional spectral information is needed to solve the recognition task satisfactorily. Investigations in the VIS and infrared (IR) spectrum were therefore carried out on the same datasets to identify significant spectral characteristics that are useful for classifying the C&D aggregates.

Used spectrometer devices
We used two different spectrometers for analysing the CDW classes: the VIS spectrometer Ocean Optics USB2000+, with a linear silicon CCD array detector and a detector range of 200 to 1100 nm, and the Polytec PSS 2120, with an InGaAs detector and a range of 1100 to 2100 nm.

Analysis of VIS Spectrum of several CDW classes.
First investigations of the specific VIS spectra are shown in Figure 1 (left side). The light source used provides stable emission up to 720 nm. In the spectral range of 470-720 nm some materials show significant features; for example, the brick classes show significant characteristics in the VIS range that distinguish them from the concrete classes. Other materials, such as concrete, gypsum and aerated concrete, show differences only in intensity. During the capture of the spectra it is difficult to reach the same intensity for all samples (subclasses) of one class, because the intensity depends on the different object contours and on the different light reflection behaviour. Classes with similar chemical composition, for example lightweight concrete and concrete, show very similar spectral characteristics without significant differences in intensity. Phenotypically very similar classes such as sand-lime brick and aerated concrete, which show very similar VIS spectra apart from differences in intensity, also require other wavelength ranges (such as IR spectrum information) for promising results. This makes it necessary to additionally use colour image information and other spectral information.

Analysis of IR Spectrum of several CDW classes.
Within a research project at the Bauhaus University Weimar, studies on the spectral characteristics of the pure minerals that form building materials were carried out in cooperation with LLA Instruments GmbH [8]. Many pure minerals have spectral features (absorption bands) in the near-infrared wavelength range between 1100 and 2500 nm. With regard to their active near-infrared absorptions, the minerals can be divided into the following groups: water-containing minerals, minerals with characteristic hydroxide groups, and carbonate minerals.
In particular, the hydroxide groups can be evaluated well in the wavelength range between 1350 and 1430 nm and are mineral-specific. Because of different conditions in the crystal lattice, the wavelength at which hydroxide absorbs varies. Between different groups of minerals this variation is only a few nanometres, but it is sufficient for their identification. The background spectra are characterized by broad absorption bands, especially in the wavelength ranges 1350-1450 nm and 1800-2100 nm, which result from the superposition of the absorptions of various minerals and water. The minerals occurring in construction materials are assigned mainly to the mineral classes carbonates, silicates (including layered silicates), sulphates (especially gypsum products) and oxides (e.g. hematite).
In further investigations, a larger number of samples (nearly 1100 samples from the 9 superordinate classes lightweight concrete, concrete, aerated concrete, sand-lime brick, dense brick, porous brick, gypsum, asphalt and granite) was used to capture and analyse their IR spectra. Figure 1 (right side) shows that the class-specific characteristics in the IR range are much better suited for distinguishing the classes than those in the VIS range (Figure 1, left side), and that some of the defined categories of building materials can be distinguished well in the IR spectrum. Concrete, lightweight concrete and aggregates are similar in their IR spectra. The mineral composition of normal concrete and of lightweight concrete varies depending on the aggregates used. However, the mineral calcite is detectable by near infrared at 1911 nm. Differentiating normal and lightweight concrete is not easy (only differences in the intensities at several wavelengths).
Aerated concrete can be identified very well in the infrared spectrum on the basis of the tobermorite phase formed. During steam curing of aerated concrete in the autoclave, tobermorite forms at between 30 and 40 percent by mass. The spectrum of aerated concrete shows absorption bands at 1430 and 1920 nm and additionally a less intense band at 1680 nm.
All sulphates, such as gypsum, plasters and plasterboard, show highly characteristic absorption bands at 1440, 1750 and 1930 nm, which are very easy to detect. There is a small shift of about 10 to 15 nm between the spectra of gypsum and plasterboard, which can be used to differentiate these materials. Sulphates are foreign material that must be separated completely by sorting if the secondary aggregates are to be used, for instance, for the production of new concrete.
The examined sand-lime bricks consist mainly of quartz, which is not detectable by IR. As a steam-cured product, sand-lime brick also contains the mineral tobermorite. The characteristic absorption bands are at 1410, 1680 and 1920 nm.
Bricks are very variable in their mineralogical composition, depending on the clay used for production. The peak in the spectrum is at 1900 nm and differs from the concrete peak at 1911 nm, which is the feature used for differentiation. To tell dense and porous bricks apart, the differences in the intensities at several near-infrared wavelengths should be used. From the recycling point of view, a brick with a bulk density of 2.0 g/cm³ or more can be used for the production of recycled concrete, so a separation makes sense.
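The band-position criterion for brick versus concrete can be sketched as follows. This is a simplified illustration, not the evaluation used in the study: it locates the reflectance minimum in a wavelength window and assigns the material whose quoted band position (1900 nm or 1911 nm) is closest. The synthetic Gaussian band, the 1 nm sampling, the window limits and all function names are assumptions made for the example.

```python
import math

def band_minimum(wavelengths, reflectance, lo, hi):
    """Return the wavelength (nm) of the lowest reflectance within [lo, hi]."""
    window = [(r, w) for w, r in zip(wavelengths, reflectance) if lo <= w <= hi]
    return min(window)[1]

def nearest_reference(position, references):
    """Return the label whose reference band position is closest to `position`."""
    return min(references, key=lambda label: abs(references[label] - position))

# Band positions quoted in the text (nm).
REFERENCES = {"brick": 1900, "concrete": 1911}

def synthetic_spectrum(centre, depth=0.3, width=15.0):
    """Hypothetical Gaussian absorption band on a flat background, 1 nm sampling."""
    wavelengths = list(range(1100, 2101))
    reflectance = [1.0 - depth * math.exp(-((w - centre) / width) ** 2)
                   for w in wavelengths]
    return wavelengths, reflectance

wl, refl = synthetic_spectrum(centre=1911)
position = band_minimum(wl, refl, 1850, 1950)
label = nearest_reference(position, REFERENCES)
```

Measured spectra are noisy, so in practice the minimum would likely be taken after smoothing or by fitting the band shape, since the two reference positions are only 11 nm apart.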
Previous studies showed that optical classifiers are well suited to the recognition of dense and porous brick [2], [3]. The recognition of the different concrete types, such as normal and lightweight concrete, can also be improved by adding colour image analysis in the visible spectrum and using classifiers.
After analysing the IR spectra, a principal component analysis was carried out. The first 5 and the first 10 principal components with the highest information content were used as input to a supervised classifier for differentiating the given classes. Figure 2 shows the first 3 principal components of the IR spectra. The classes lightweight concrete, concrete, aerated concrete and sand-lime brick form a large main cluster. There are small, well-defined clusters such as dense and porous brick, gypsum, asphalt and granite. The classes aerated concrete and granite form ellipsoidal clusters with good homogeneity. The class gypsum forms a flat ellipsoidal cluster, which lies far from the main cluster and is relatively homogeneous. The two brick classes form two closely spaced ellipsoidal clusters, which are separable from each other and from the main cluster. The class aerated concrete lies compactly on the border of the main cluster and overlaps marginally with the cluster of sand-lime brick. In contrast, the classes concrete, lightweight concrete and sand-lime brick (especially the first two) overlap significantly, and the class concrete forms an inhomogeneous cluster. This makes classification between these three classes difficult.
The first 3 principal components account for 99% of the total variance, so the spectral information of the dataset can be visualized with very little loss of information using only these three components. The visualization allows the conclusion that the classes aerated concrete, dense and porous brick, gypsum, asphalt and granite are well distinguishable. Using only the first 10 principal components instead of the 501-dimensional wavelength-specific information reduced the classification time significantly without a notable loss of information.
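The relation between the number of retained components and the cumulative variance can be sketched as follows. The example does not perform the principal component analysis itself; it assumes the eigenvalues of the covariance matrix are already available, and the eigenvalues shown are hypothetical and not taken from the study.

```python
def cumulative_variance_ratio(eigenvalues):
    """Cumulative share of total variance for eigenvalues sorted in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    ratios, running = [], 0.0
    for value in ordered:
        running += value
        ratios.append(running / total)
    return ratios

def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative ratio reaches `threshold`."""
    for count, ratio in enumerate(cumulative_variance_ratio(eigenvalues), start=1):
        if ratio >= threshold:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalue spectrum, for illustration only.
eigenvalues = [90.0, 6.0, 3.0, 0.6, 0.3, 0.1]
ratios = cumulative_variance_ratio(eigenvalues)
```

With these made-up values, `components_needed(eigenvalues, 0.98)` selects the first three components. A steeply decaying eigenvalue spectrum of this kind is what allows a 501-dimensional spectrum to be reduced to a handful of components.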
A total recognition rate of 98.4% was obtained using only the first 10 components (cumulative variance = 99.995%) and of 90.8% using only the first 5 components (cumulative variance = 99.95%), in each case with 10-fold cross-validation. The individual recognition rates are 100.0% for asphalt, 97.4% for concrete, 98.3% for gypsum, 100.0% for granite, 98.5% for sand-lime brick, 97.5% for lightweight concrete, 99.1% for aerated concrete, 100.0% for dense brick and 98.1% for porous brick. Most false classifications occur between the classes lightweight concrete and concrete. The false classifications between the classes dense and porous brick are also critical. If the goal is to recycle C&D aggregates into high-quality building materials, these false classifications have to be avoided. The conclusion is that the best way to solve the classification and sorting task is to combine the IR information with the information from colour image analysis (in VIS) and to use supervised classifiers.
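How total and individual recognition rates, and the most frequently confused class pair, follow from a confusion matrix can be sketched as below. The counts are hypothetical and chosen only for illustration; they are not the results of this study, and the function names are assumptions.

```python
def recognition_rates(confusion, classes):
    """confusion[i][j] = number of objects of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    trr = sum(confusion[i][i] for i in range(len(classes))) / total
    per_class = {c: confusion[i][i] / sum(confusion[i]) for i, c in enumerate(classes)}
    return trr, per_class

def most_confused_pair(confusion, classes):
    """Return the unordered class pair with the most mutual misclassifications."""
    best, best_count = None, -1
    n = len(classes)
    for i in range(n):
        for j in range(i + 1, n):
            count = confusion[i][j] + confusion[j][i]
            if count > best_count:
                best, best_count = (classes[i], classes[j]), count
    return best, best_count

# Hypothetical counts for illustration only; not the paper's results.
classes = ["concrete", "lightweight concrete", "gypsum"]
confusion = [
    [95, 5, 0],
    [4, 96, 0],
    [1, 0, 99],
]
trr, per_class = recognition_rates(confusion, classes)
pair, count = most_confused_pair(confusion, classes)
```

A high total recognition rate can hide a concentration of errors in one class pair, which is why the off-diagonal entries matter when some confusions are critical for the recycled product.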
The high recognition rates reached using only IR spectra are due to the relatively low intra-class variability of the nine C&D aggregate classes. This low variability results from the use of new building materials, which were crushed to obtain homogeneous recyclates. Real C&D aggregates show a higher intra-class variability because of contamination and dust on their surfaces and a higher variability in shape and size. Further investigations are planned to combine the feature vectors of colour image analysis and IR spectrum analysis and to use real, used C&D aggregates as the sample dataset.

Results
The further investigations on the extended colour image dataset using supervised machine learning algorithms showed that the total recognition rates (TRR) of the best classifiers (Random Forest, nu- and C-LibSVM) were 94.7% up to 97.2% for the given 5 superordinate classes of the DIN standard. The first investigations in the VIS spectral range showed that some materials, for example the brick classes, have significant spectral characteristics that allow identification. Other materials, such as concrete, gypsum and aerated concrete, show differences only in intensity. Classes with similar chemical composition, for example lightweight concrete and concrete, showed very similar spectral characteristics without significant differences in intensity. This makes it necessary to additionally use colour image information and other spectral information.
The first investigations in the IR spectral range showed that concrete and brick are, in principle, well distinguishable in the infrared spectrum. Gypsum as an impurity in C&D waste is very well detectable by IR. Aerated concrete and sand-lime brick can also be recognized very well by IR sensors. Lightweight and normal concrete, as well as dense and porous brick, show a small number of false classifications in the IR spectrum. A total recognition rate of 98.4% was obtained using only the first 10 principal components (cumulative variance = 99.995%) and of 90.8% using only the first 5 principal components (cumulative variance = 99.95%), in each case with 10-fold cross-validation. The best way to solve the analysis task is therefore to complement the IR information with information from colour image analysis (in VIS) and to use supervised classifiers.
Further investigations are planned to combine the feature vectors of colour image analysis and IR spectrum analysis and to use real, used C&D aggregates as the sample dataset.