Machine learning applications in structural engineering - a review

Machine learning (ML) is a major subfield of artificial intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. With its ability to capture the complex behaviour of structures and systems, ML has been proposed as a way to overcome the limitations of conventional methods in Structural Engineering. This paper reviews a few such applications, based on Neural Networks, Support Vector Machines and Nearest Neighbours, and reports their prediction accuracy.


Introduction
Machine Learning (ML) is a major subfield of artificial intelligence (AI) that deals with the study, design, and development of algorithms that learn from data and make predictions from what they have learned. ML can efficiently perform tasks such as clustering, regression, prediction and classification on a wide variety of data. The learning process is supervised or unsupervised depending on the presence or absence of a teacher that provides a cost or category label for each item in the training dataset. Reinforcement learning algorithms, which are driven towards optimum goals through trial and error, are also common [1]. The discipline of ML really began with the development of Artificial Neural Networks (ANN), inspired from its inception by the recognition that the human brain computes in an entirely different way from the conventional von Neumann computer. ANNs can handle nonlinear interactions and complex behaviour among the input and output variables of a system without any prior knowledge of the system. The advent of Deep Learning led to deeper networks such as Convolutional Neural Networks (CNN).
ML algorithms can make classifications and predictions from unstructured, incomplete or even contaminated datasets. For this reason they have been preferred over the otherwise versatile physical model-based methods for structural engineering problems affected by uncertainty in the influencing factors, lack of accuracy, dependence on environmental fluctuations, dependence on human expertise, etc. Support Vector Machines, Random Forests, k-Nearest Neighbours, etc. are other ML methods applied extensively in the structural engineering field. This paper introduces the basic concepts of the Machine Learning algorithms used extensively in Structural Engineering and surveys their applications, such as prediction of structural behaviour and damage detection.

Common Machine Learning Algorithms
This section introduces a few of the ML algorithms commonly used in Structural Engineering: Neural Networks, Support Vector Machines, Nearest Neighbours and Random Forests.

Neural Networks
The discipline of ML really began with the development of Artificial Neural Networks (ANN), inspired by the capabilities of the human nervous system. Owing to its massively interconnected structure, which achieves fast conduction with low energy consumption, the human brain effortlessly solves problems that are hard for a computer, such as face recognition and text manipulation, at remarkably high speed. The output y of a typical layer k in a NN, with x as the input vector, w as the weight vector, b as the bias term, and f(.) as the activation function (which makes the output non-linear), can be written as: $y = f(w^{T}x + b)$. Artificial Neural Networks (ANN) are a stack of such computational units, arranged to give a model with an acceptable level of generalization error and overfitting. Deep Neural Networks use a number of interior or hidden layers to represent the complex behaviour of structures and systems. Convolutional Neural Networks (CNN) are deeper networks capable of image processing, as used in modern computers. Convolutions are in fact a relaxation of the fully connected layers and varied input size of an ANN. CNNs can be applied to any real-world problem, provided the input is available in tensor form.
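As an illustration of the layer computation described above (an activation applied to a weighted sum of the inputs plus a bias), the following is a minimal sketch in plain Python. The function names, the layer sizes and all numerical values are hypothetical and chosen only for demonstration; a practical model would learn the weights from data.

```python
import math

def dense_layer(x, weights, biases, activation=math.tanh):
    """Return f(w_j . x + b_j) for every unit j of one layer."""
    return [
        activation(sum(w * xi for w, xi in zip(w_row, x)) + b)
        for w_row, b in zip(weights, biases)
    ]

def relu(z):
    """Rectified linear unit, one common choice for f(.)."""
    return max(0.0, z)

# Hypothetical network: 3 inputs -> 2 hidden units (ReLU) -> 1 linear output.
x = [0.5, -1.0, 2.0]
hidden = dense_layer(x, [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]], [0.0, 0.1], relu)
output = dense_layer(hidden, [[1.0, -1.0]], [0.05], lambda z: z)
```

Stacking calls to `dense_layer` in this way mirrors the stack of computational units that forms an ANN; adding more hidden layers gives a deeper network.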

Support Vector Machines
Support Vector Machines (SVM) are feed-forward binary learning machines devised for classification and regression problems. Given a training dataset, the support vector machine constructs a hyperplane as the decision surface in such a way that the margin of separation between positive and negative samples is maximized. To find the optimal parameters, the following constraints are to be satisfied by the training set {(x_i, d_i)}: $w^{T}x_i + b \geq +1$ for $d_i = +1$, and $w^{T}x_i + b \leq -1$ for $d_i = -1$. The particular data points that satisfy either of the above constraints with equality are the support vectors. The structure of an SVM, as shown in Figure 1, is similar to a two-layer ANN in which each node performs an inner product operation between the input x and a support vector. The activation here is provided by means of a kernel function, K(x_i, x). The output layer is merely a linear combiner with weights α.
IOP Publishing doi:10.1088/1757-899X/1114/1/012012
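The two-layer view of an SVM described above (kernel evaluations against the support vectors, followed by a linear combiner) can be sketched as follows. This is only an illustration under stated assumptions: the radial basis function kernel, the value of `gamma`, the support vectors, the weights `alphas` and the bias are all hand-picked hypothetical values rather than the result of training, and the class labels d_i are folded into the combiner weights as is usual in the dual formulation.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Radial basis function kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Linear combination of kernel evaluations between x and each support vector."""
    score = sum(a * d * kernel(sv, x)
                for sv, d, a in zip(support_vectors, labels, alphas))
    return score + bias

def svm_predict(x, support_vectors, labels, alphas, bias):
    """Binary class (+1 or -1) from the sign of the decision value."""
    return 1 if svm_decision(x, support_vectors, labels, alphas, bias) >= 0 else -1

# Hypothetical, hand-picked parameters (not obtained by training).
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

near_positive = svm_predict([1.8, 2.1], support_vectors, labels, alphas, bias)
near_negative = svm_predict([0.1, -0.2], support_vectors, labels, alphas, bias)
```

In practice the support vectors and weights come from solving the margin maximization problem, for which an established library would normally be used.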

k-Nearest Neighbours
The k-Nearest Neighbours (k-NN) method is an instance-based learning algorithm used for classification and regression problems. The basic assumption of k-NN is that similar data points lie close together in the feature space, so that an input is characterized by its k nearest neighbours. The k-NN classifier defines a non-linear boundary between the classes (Figure 2a). A higher value of k means that more neighbours are consulted for the input in hand, which makes the prediction less sensitive to noise, although too large a value blurs the class boundaries. In k-NN regression, the output is the property value for the object, taken as the average of the values of its k nearest neighbours. k-NN algorithms are sensitive to the local features of a dataset.
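The classification and regression rules described above can be sketched in a few lines of plain Python. The Euclidean distance, the function names, the failure mode labels and the numerical values below are assumptions made for illustration only.

```python
import math
from collections import Counter

def k_nearest(query, points, k):
    """Indices of the k training points closest to the query (Euclidean distance)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return order[:k]

def knn_classify(query, points, labels, k=3):
    """Majority vote among the k nearest neighbours."""
    votes = Counter(labels[i] for i in k_nearest(query, points, k))
    return votes.most_common(1)[0][0]

def knn_regress(query, points, values, k=3):
    """Average of the target values of the k nearest neighbours."""
    idx = k_nearest(query, points, k)
    return sum(values[i] for i in idx) / k

# Hypothetical two-feature dataset with two failure modes and a drift-like target.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
labels = ["flexure", "flexure", "flexure", "shear", "shear", "shear"]
values = [1.0, 1.2, 0.8, 3.0, 3.3, 2.7]

predicted_mode = knn_classify([0.2, 0.1], points, labels, k=3)
predicted_value = knn_regress([0.2, 0.1], points, values, k=3)
```

Because every prediction requires distances to all stored samples, the features are usually scaled to comparable ranges before the method is applied.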

Random Forests
The Random Forest (RF) algorithm is an ensemble learning method used in classification, prediction and regression problems, in which numerous trees drawn from the same distribution form a forest that is trained to predict the behaviour of the sample data. Each tree in a random forest makes a class prediction, and the class with the most votes becomes the prediction of the model (Figure 2b). The number of trees is optimized to reduce the generalization error. The basic principle of a Random Forest classifier is that a large number of relatively uncorrelated models or trees, operating as a committee, will outperform any of their individual constituent models. Since the trees protect each other from their individual errors, uncorrelated models can produce accurate predictions: some trees may be wrong while many others are right, so as an ensemble they move in the right direction. Decision trees are extremely sensitive to the data they are trained on, and therefore each individual tree is allowed to sample randomly from the dataset with replacement, resulting in different trees. This feature of Random Forests is called bagging. Feature randomness, in which each tree picks its features from a random subset of the features, is what makes the trees uncorrelated.
[...] percentage error of less than 5%, and the same result was achieved by employing 512 regression trees of depth 6. Though the method was proposed to overcome the complexity of the empirical approach to fracture toughness estimation, the accuracy of the prediction results only equalled that of numerical simulation, even with a huge collection of datasets.
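The bagging, feature randomness and majority voting described for Random Forests in this section can be illustrated with a deliberately simplified sketch. As a loudly stated assumption, each "tree" here is only a one-split decision stump with its threshold at the feature mean, which is far cruder than the trees of a real Random Forest; the dataset, class names and function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Bagging: draw len(rows) rows at random with replacement."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(rows, feature):
    """One-split 'tree' on a single feature, thresholded at the feature mean."""
    threshold = sum(x[feature] for x, _ in rows) / len(rows)
    left = Counter(label for x, label in rows if x[feature] <= threshold)
    right = Counter(label for x, label in rows if x[feature] > threshold)
    fallback = Counter(label for _, label in rows).most_common(1)[0][0]
    left_label = left.most_common(1)[0][0] if left else fallback
    right_label = right.most_common(1)[0][0] if right else fallback
    return feature, threshold, left_label, right_label

def predict_stump(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label

def fit_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)   # bagging
        feature = rng.randrange(n_features)    # feature randomness
        forest.append(fit_stump(sample, feature))
    return forest

def predict_forest(forest, x):
    """Each stump votes; the class with the most votes is the prediction."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical two-class dataset: (features, label).
rows = [([1.0, 1.2], "A"), ([1.1, 0.9], "A"), ([0.9, 1.0], "A"),
        ([3.0, 3.1], "B"), ([3.2, 2.9], "B"), ([2.9, 3.0], "B")]
forest = fit_forest(rows)
prediction_low = predict_forest(forest, [1.0, 1.0])
prediction_high = predict_forest(forest, [3.0, 3.0])
```

A stump trained on a bootstrap sample that happens to contain a single class will always predict that class, yet the vote of the remaining stumps overrides it, which is the sense in which the trees protect each other from their individual errors.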

ML Applications for Prediction of Structural Behaviour
Huang and Burton (2019) studied six machine learning algorithms (Logistic Regression, Decision Tree, Random Forests, Adaptive Boosting, Support Vector Machine and Multilayer Perceptron) for the classification of 114 masonry-infilled RCC frame specimens into four distinct failure modes, using nine structural parameters as inputs, of which masonry is a variable with binary values (0 or 1). Most of the models achieved more than 80% prediction accuracy, based on recall score, with the highest value of 85.7% achieved by the Adaptive Boosting and SVM algorithms. A 5-fold cross-validation revealed potential overfitting of the neural network-based model (Multilayer Perceptron), owing to the relatively small size of the dataset. The study is limited to a database obtained from prior studies, in which the effect of dynamic loading is not accounted for. Also, failures of infill frames that are part of a multi-storeyed building need not always belong to only four distinct modes.
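For readers unfamiliar with the evaluation terms used above, the following sketch shows one way a 5-fold split and a recall score can be computed. It is not the procedure of the cited study: the contiguous, unshuffled folds and the macro-averaged recall are assumptions made for illustration, and the label vectors are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def macro_recall(y_true, y_pred):
    """Average over classes of (correctly predicted members) / (true members)."""
    recalls = []
    for c in sorted(set(y_true)):
        predictions_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in predictions_for_c) / len(predictions_for_c))
    return sum(recalls) / len(recalls)

# 114 specimens split into 5 folds; each fold serves once as the validation set.
folds = k_fold_indices(114, k=5)

# Hypothetical true and predicted failure mode labels for one validation fold.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = macro_recall(y_true, y_pred)
```

A large gap between the training score and the score averaged over the validation folds is the usual sign of the overfitting mentioned above. For small datasets the samples would normally be shuffled, and preferably stratified by class, before splitting.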
Adeli and Park (1995) showed that a Counter Propagation Neural Network (CPN), combined with a simple formula for choosing an optimum model instead of running trials to update its learning coefficients, outperforms the conventionally used Back Propagation Networks (BPN), which suffer from slow learning rates and model complexity. Experiments revealed that CPNs are 3000 times faster than BPNs. Concrete beam design was accomplished using 31 samples with a percentage error of less than 0.5%, while the buckling problem was handled with a more complex CPN and 28 examples, with almost 100% accuracy. The equation for updating weights, developed to estimate the moment-gradient coefficient for singly and doubly symmetric steel beams based on AISC 1989 recommendations, brings many assumptions into the problem despite its simplicity. For instance, the non-dimensional elastic critical buckling moment is estimated using the Rayleigh-Ritz method with a truncated Fourier sine series as the buckled shape of the beam.
Mangalathu and Jeon (2019) studied the efficiency of various machine learning models, such as quadratic discriminant analysis, k-NN, decision trees, Random Forests, naive Bayes, and ANNs, in failure mode prediction of circular RC columns, assembling a database of 311 columns with circular or octagonal cross sections and spiral or circular hoops. The input variables were derived from the formulations for flexural and shear capacity of the columns, according to their influence on the failure modes, and are not the optimal input parameters. The simplest ANN model, consisting of 1 hidden layer of 10 neurons, achieved an accuracy of 91%, which indicates that ML algorithms are dependable for seismic assessment, for prioritizing retrofitting strategies, and for deciding the operational strategies of a bridge after an earthquake.
Mangalathu et al. (2020) used experimental databases to propose Random Forest models for failure mode prediction of 311 RC column specimens and 393 shear wall specimens. The SHapley Additive exPlanations (SHAP) approach, which ranks input variables by their importance in a prediction, addresses an essential requirement of every risk assessment problem. The models, which achieved accuracies of 84% and 86% for columns and shear walls respectively on the basis of geometric and reinforcement parameters, should be extended to include other influencing factors.
Brown et al. (2005) credited the Levenberg-Marquardt (LM) algorithm, used to train feed-forward networks with multiple performance variables, for the accuracy of a single feed-forward network in time-response comparison with multiple performance variables, such as roof displacements and control forces, of a three-storeyed lumped-mass shear-beam model subjected to the El Centro earthquake. The LM-trained network had a mean-squared error two orders of magnitude lower than that of two gradient descent-trained networks with single performance variables, using 0.1 s of lag time to predict ahead for a 50 ms lead time. However, the algorithm was prone to overfitting and hence required early stopping.
Siam et al. (2019) studied the earthquake responses of shear- and flexure-dominated reinforced masonry shear walls with different aspect ratios. They employed scatter plots to study the influence of the geometrical and mechanical properties of the structure on its responses, Principal Component Analysis (PCA) to cluster the walls based on their features, and a Projection to Latent Structures (PLS) algorithm to classify the walls and predict their lateral drifts according to their failure modes. A total of 97 samples, compiled from previous studies, had been tested under displacement-controlled quasi-static cyclic loading in the in-plane direction. Validation results indicate the possibility of both overfitting and underfitting, even with the large amount of work involved. For example, the predicted ultimate drifts of flexure-dominant walls were less than the true values, while those of shear-dominant walls were more than the true values.
To identify the key parameters governing the failure pattern and shear capacity of Ultra-High-Performance Concrete (UHPC) and Fibre-Reinforced Concrete (FRC) beams, Solhmirzaei et al. (2020) studied 360 UHPC beams with different geometric, fibre, loading and material characteristics, using ML algorithms including Support Vector Machine, ANN, k-Nearest Neighbour (k-NN), and genetic programming (GP). The ANN model outperformed the other models, with an accuracy of 89% in failure mode classification, and GP yielded an R² of 0.92 for shear capacity prediction. The failure modes and shear capacities of prestressed beams depend not only on prestressing forces and fibre properties, but also on the number of tendons used, their geometry and their eccentricity. Hence, a greater number of variables must be incorporated for better results.
Almustafa and Nehdi (2020) employed a hybrid Random Forest approach to predict the response of RC slabs subjected to blast loading, with 150 samples, simulated in FEM modelling software, taken from previous studies. Just as in [27], this study implemented a variable importance measure, termed Permutation Feature Importance (PFI), which is easy to implement. A Mean Absolute Error (MAE) of 4.38 ± 0.22, a variance explained by cross-validation (VEcv) of 94.4% ± 3.5%, and an R² of 96.2% ± 0.6% suggest that the method performs well while remaining computationally simple. Since actual experiments with blast loading scenarios are both expensive and risky, the ML approach is undoubtedly an alternative to conventional methods when economy is the highest concern.
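The idea behind Permutation Feature Importance can be sketched as follows: one input column is shuffled at a time, and the resulting increase in prediction error is taken as the importance of that feature. This is a generic illustration rather than the implementation of the cited study; `toy_model`, the dataset and the use of MAE as the error measure are assumptions.

```python
import random

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [model(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_absolute_error(y, [model(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical stand-in for a trained model: uses feature 0, ignores feature 1."""
    return 2.0 * row[0]

X = [[float(i), float((7 * i) % 5)] for i in range(20)]
y = [toy_model(row) for row in X]
scores = permutation_importance(toy_model, X, y)
```

Here the first feature receives a positive score and the second a score of zero, since shuffling a feature that the model ignores cannot change its predictions.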
Data-based Probabilistic Seismic Hazard Analysis (PSHA) and ground motion simulation are useful tools for predicting the likelihood of seismic events at a site over a period of time. The method proposed by Alimoradi and Beck (2014) selects appropriate pairs of magnitude and distance, given the probability of exceedance of an intensity measure at the site, to generate the acceleration spectra of the ground motion anticipated at the site, incorporating Gaussian Process (GP) Regression, Principal Component Analysis (PCA) and a Genetic Algorithm (GA). With 530 ground motion records, in the form of acceleration-time histories or power spectra, obtained from appropriate databases for locations within a radius of 50 km from a site in downtown Los Angeles, considerable resemblance can be seen between the truth and the prediction. However, no well-defined performance metric is adopted to quantify the prediction accuracy. Moreover, reliance on GA introduces errors in prediction due to convergence to local optima.
Regression tasks in structural engineering are achieved with less effort using ML algorithms, especially when the structure to be analysed is a massive one such as a dam. Segura et al. (2020) employed a Polynomial Response Surface (PRS) for seismic assessment of concrete gravity dams, using metamodels for fragility analysis. PSHA was performed to characterize the seismic hazard at the dam site and to select 250 ground motions, from which 5 × 10^5 dam samples were generated, yielding an R² of 0.991, RMSE of 0.013 and RMAE of 0.207, with the concrete-rock friction angle as the model parameter and Peak Ground Velocity as the intensity measure. Since the limit states required for training are derived from ASCE 2016, the boundary values of the model parameters are restricted accordingly.
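Two of the regression metrics quoted in this section, the coefficient of determination R² and the root-mean-squared error (RMSE), follow directly from their standard definitions, as sketched below. RMAE is omitted because the text does not define it, and the numerical values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted responses.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
error = rmse(y_true, y_pred)
fit = r_squared(y_true, y_pred)
```

RMSE carries the units of the response, whereas R² is dimensionless, so the two are usually reported together.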

ML Applications for Damage Detection of Structures
A Machine Learning-based damage detection strategy involves two levels of action, a Structural regime and a Machine Learning regime, which decide whether Neural Networks, Support Vector Machines, Nearest Neighbours, Random Forests, etc. are to be combined with vibration-based methods in the time domain, frequency domain or modal domain to yield better accuracy in damage detection. Nevertheless, it is still uncertain whether a chosen combination of features and classifiers will suit damage detection for every kind of structure and every type of damage.
The combination of Frequency Response Functions (FRF) with ANNs, as in [16], has been extensively investigated owing to its straightforward application. Even though the massive size of FRF data could be addressed through dimensionality reduction with Principal Component Analysis (PCA), the difficulty of obtaining FRF data limited the scope of such studies to numerical simulations alone. This uncertainty and inefficiency of FRF data has been addressed in [17].
Damage ratios based on Modal Strain Energy (MSE) were employed as inputs for training CNNs in [18], with damage scenarios induced as a reduction in elastic modulus or stiffness. However, other forms of damage, such as corrosion, cracks, and degradation, may also change the mechanical parameters of a structure. In this regard, crack detection and segmentation studies have been conducted on various infrastructures. These studies, however, required an assembly of many electronic devices for image acquisition and processing and, in most cases, 3D modelling software, which made them expensive. Moreover, the need for a very large number of labelled images for supervised learning, along with the massive size of deep CNNs, increased the processing time. This further limited the practicality of image-based damage detection.
Zajam et al. (2019) located damage caused by corrosion and fatigue in gas pipelines, using wavelet transforms to decompose real-time acceleration signals and training Support Vector Machines (SVM) to obtain 86.2% accuracy. However, the intensity of damage is not determined in the study. Also, damage detection at supports and bends, with a heterogeneous medium and varying load velocity, requires further study.
Real-time crack detection in massive structures such as bridges was accomplished in [20] by combining the CNN-based You Only Look Once (YOLO) algorithm with structured light, with 94% accuracy but using 330,000 images. However, practical implementation of this method might require Unmanned Aerial Vehicles (UAV) for capturing real-time images. The real-time monitoring method proposed in [21], which uses arrays of damaged and undamaged acceleration signals to train 1D CNNs, is a promising approach owing to its compactness and its ability to classify accelerometer signals directly, without any pre-processing. The method was validated using a laboratory grandstand simulator and implemented in practice on a girder.
Parametric methods for damage detection might be successful in the mechanical and aerospace domains, but in Civil Engineering it is difficult to excite a structure to respond in an anticipated range. This favours non-parametric methods such as that discussed in [21], where the emphasis is on detecting damage directly from accelerometer signals rather than from frequencies or mode shapes.
Physically attached sensors used in vibration response measurements, such as accelerometers, interfere with the detection results because of the weight of the sensor and its low spatial resolution. Installation and calibration of the sensors are also time-consuming. Yang et al. (2020) suggest the use of high-speed cameras to capture the natural frequencies of beam samples from a video of the excited system, training CNNs for feature extraction and Long Short-Term Memory (LSTM) networks for spatial and temporal correlation. The mean percentage error of 1.48% obtained for an aluminium beam was the best among the specimens. This work opens up avenues for further studies with the audio signal generated during vibration and with direct measurement of mode shapes instead of the less sensitive natural frequencies.
Dynamic testing of massive structures like bridges makes use of ambient excitations such as wind and traffic loads. But ambient excitations are corrupted by environmental and operational interferences, which necessitates approaches to eliminate, say, temperature effects, as discussed in [23]. The adaptive wavelet-like time-frequency spectrum provided the natural frequencies of the structure. Oh and Sohn (2009) employed data normalization techniques to eliminate such effects, characterizing the relationship between the environmental and operational parameters and the damage-sensitive features using non-linear PCA based on Support Vector Machines (SVM). A considerable reduction in baseline data is one of the notable advantages of this method.
Experimental validation of methods that involve model updating with the real healthy state and the real damaged state is prone to errors in measurement and FE modelling, in addition to errors due to varying loads and noise contamination. Most studies in this field have concentrated on simulating damage scenarios such as deep cuts in critical members, loosened bolted connections, cut notches, etc. As a departure from this, [25] employed raw frequencies obtained from real-time monitoring, along with real healthy-state data, to update the FE model and hence minimize the error.
Most conventional damage detection approaches are based on supervised learning, which requires labelled data for training. In other words, the training requires vibration data collected for both the undamaged and the damaged structure. In civil engineering, the data prior to damage is rarely available. A transition from supervised techniques to unsupervised or even semi-supervised approaches therefore has multiple merits. Bull et al. (2018) demonstrated this with cluster-adaptive active learning tools applied to aircraft experiments without defining the damage classes a priori. The proposed model achieved 95.5% accuracy while using only 3% of the labels.
Apart from the challenges faced in the Structural Engineering regime, Machine Learning also suffers from many issues when dealing with real-world problems. Deep Neural Networks tend to increase their non-linearity by going deeper, with more neurons and hidden layers, which complicates the optimization of their weights. The recently proposed pre-trained and fine-tuned autoencoder networks in [27], with their capacity for relationship learning coupled with dimensionality reduction, have solved this exploding gradient problem. The proposed model maps between the natural frequencies of the system and the percentage reduction in stiffness. Flexibilities, mode shapes, time-frequency spectra and FRFs are also possible indicators of damage, and their applicability in this framework demands further investigation. Jia et al. (2015) also employed autoencoders for fault diagnosis in rotating machinery, a task that demands high precision, using the frequency spectrum. In contrast to the hyperbolic tangent functions used in this work, the ReLU function and its variants have been shown to perform better in such applications.
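One commonly cited reason for preferring ReLU over the hyperbolic tangent in deep networks is that the tanh derivative saturates, so the gradient factor accumulated across many layers can shrink rapidly. The sketch below illustrates only this saturation effect; as a simplifying assumption it ignores the weight matrices, which also enter the chained gradient, and the pre-activation values are hypothetical.

```python
import math

def tanh_grad(z):
    """Derivative of tanh(z); approaches 0 when |z| is large."""
    return 1.0 - math.tanh(z) ** 2

def relu_grad(z):
    """Derivative of ReLU(z): 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def chained_gradient(grad_fn, pre_activations):
    """Product of activation derivatives along a chain of layers."""
    product = 1.0
    for z in pre_activations:
        product *= grad_fn(z)
    return product

# Hypothetical chain of 10 layers, each with a pre-activation of 2.0.
pre_activations = [2.0] * 10
tanh_factor = chained_gradient(tanh_grad, pre_activations)
relu_factor = chained_gradient(relu_grad, pre_activations)
```

With these values the tanh factor is many orders of magnitude smaller than the ReLU factor, which stays at one for positive pre-activations.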

Conclusions
Machine Learning algorithms have shown the potential to strengthen or even replace conventional methods, which are otherwise limited by uncertainties and complexities when dealing with the non-linear behaviour of structures and the heterogeneous nature of materials. ANN and SVM are both competitive, general-purpose methods that are applied extensively in Structural Engineering. Their fully connected layers and varied input sizes are capable of tracking complex behaviour with better performance. The uncorrelated trees in Random Forests and the immense feature space of k-NN performed well in both classification and regression applications. Convolutional Neural Networks are mainly used for image processing. Since the input to a CNN is necessarily a three-dimensional tensor representing pixel co-ordinates, it can be trained on any set of inputs, provided the data is represented in tensor form.
This paper surveyed some recent Machine Learning studies in Structural Engineering, focusing on prediction of structural behaviour and damage detection, employing Artificial Neural Networks, Convolutional Neural Networks, Random Forests, Support Vector Machines and k-Nearest Neighbours. However, further studies are needed to explore the possibilities of other disciplines of Machine Learning in the Structural Engineering field. Unlike the statistical approach of fitting a well-known distribution to a dataset to study its behaviour, Machine Learning captures the actual behaviour of a system from its data, even when the data is unstructured, incomplete or contaminated. The availability and quality of input data is, however, a challenge faced by this novel area of research.
Non-probabilistic method to consider uncertainties in frequency response function for vibration-