Anomaly Detection in the Automotive Stamping Process: An Unsupervised Machine Learning Approach

In metal forming, such as the stamping of automotive parts, unsupervised machine learning models offer a transformative approach to real-time quality control, especially when labelled data are scarce. Leveraging clustering algorithms and autoencoders, we develop a machine learning system capable of autonomously monitoring sensor data and identifying deviations suggestive of potential defects. The system offers multiple benefits, including rapid intervention, fewer defective parts, and fewer production stoppages for rectification. The use of unsupervised machine learning models also adds a layer of adaptability, allowing the system to continually refine its understanding of what constitutes 'normal' operation. Empirical evaluation demonstrates the potential of the developed system in detecting anomalies in production data collected from dynamic automotive manufacturing environments.


Introduction
In the automotive industry, the stamping process plays a crucial role in transforming metal sheets into specific parts, such as body panels, interior connectors, and Body-In-White structural components [1][2][3]. On the production floor, large-tonnage stamping presses are used to mould as-received blanked metal sheets into desired three-dimensional shapes based on predefined tooling. The stamping process often involves a series of operations such as blanking, drawing, piercing, and trimming, executed in a highly coordinated manner to ensure precision and consistency. Given the complexity and high-speed nature of these operations, quality control is of paramount importance. Even minor deviations in pressure, alignment, or material quality can result in defects, leading to material waste and potentially costly stoppages to rectify issues. Therefore, ensuring that each stamped part meets rigorous quality standards is a significant concern, demanding advanced technological solutions for effective monitoring and control.
Quality control in the automotive stamping process presents unique challenges owing to the intricate nature of the operations and the high part throughput. Ensuring consistency and precision in stamped parts is not just about controlling mechanical aspects like pressure and alignment; it may also involve monitoring material characteristics such as grain microstructure and yield strength/hardening, or process characteristics such as temperature and lubrication. Each of these factors can significantly influence the quality of the final product. Moreover, given the volume of parts produced, even a small percentage of defects can lead to substantial material waste, necessitating line stoppages that are both time-consuming and expensive. Traditional quality control methods, often based on post-production inspection, are inadequate for detecting defects early enough to avoid costly interruptions; such defects may only be discovered at the point of assembly. As a result, there is a rising demand for real-time, data-driven approaches to quality control that can swiftly identify and rectify issues. The incorporation of machine learning models, central to the Industry 4.0 era, offers a powerful approach to meeting this demand [4][5][6][7][8]. These models can analyse real-time data to predict potential defects and enable immediate corrective actions, thereby elevating the efficiency and accuracy of quality control processes [9].
The cost implications of quality control in stamping processes are significant and multi-faceted. Indeed, frequent quality issues can lead to longer downtimes, increased labour costs for rework, and potential reputational damage, which could impact future business. On the other hand, investing in advanced quality control systems, such as machine learning solutions for real-time monitoring, requires initial capital expenditure and ongoing maintenance costs. However, these investments are often justified by the subsequent reduction in defects and the associated savings in material and operational costs.

Related work and proposed framework
Process data harvested from real operational environments provide a goldmine for machine learning models to derive insights, offering advantages such as enhanced production efficiency, minimised downtime, and reduced waste [10]. However, the absence of labelled data poses a significant challenge for developing supervised learning-based solutions. Specifically, supervised learning models, which are commonly used for classification tasks like defect detection, require a substantial number of labelled samples for training [10][11][12][13]. Without labelled samples indicating what constitutes a 'defect' or 'non-defect', a supervised learning model lacks the guidance needed to generalise and make accurate predictions on new, unseen data. The lack of labelled data therefore necessitates the exploration of alternative methods, such as unsupervised learning, to effectively identify anomalies or defects in both historical and real-time data; this is the motivation for the present research.
Autoencoders have been effectively used to evaluate defect and machine degradation conditions by analysing process signals such as acoustic emission and force data in stamping [14,15]. However, these investigations mainly considered stroke forces, and their findings were not cross-validated using alternative unsupervised learning methods such as clustering algorithms. In our study, we focus on the effects of lubrication depth in stamping processes, conceptualising the varying depths as an image-like data set for anomaly detection. Additionally, we cross-reference the outcomes from autoencoders with those obtained from clustering algorithms, aiming to mitigate the issue of false positives. The proposed detection framework is shown in Figure 1.

Methodology
This study employs both clustering algorithms and autoencoders for anomaly detection in stamping processes in an automotive production plant. Data samples are gathered from three key areas: the parts, the machines, and the lubricants involved in the stamping process. We focus on the lubrication data for anomaly detection in this study. Anomalies or issues in automotive stamping processes can arise from various factors including material defects, tooling issues, process variations, and equipment malfunctions. Among these, lubrication and oiling play crucial roles in the stamping process, significantly influencing the quality of stamped parts. Proper distribution of lubrication and oiling reduces friction between the metal sheet and the die, promoting smooth material flow and preventing surface defects, ultimately enhancing the dimensional accuracy and surface finish of stamped parts while extending tool life. Proper lubrication also minimises the occurrence of common stamping defects such as wrinkles and tears, resulting in higher production yields and reduced scrap rates.
Lubrication distribution depends on factors such as part geometry and material properties. In some cases, uniform distribution across the part surface ensures consistent material flow, while selective application in specific areas may optimise lubrication effectiveness for complex geometries or high-friction zones. To identify the right distribution of lubrication, autoencoders are utilised to reconstruct lubrication sensor inputs and flag significant deviations as anomalies, while clustering algorithms are used to corroborate and visualise the results. The performance of our method is assessed through two primary means: comparative analysis and visual inspection. For comparative analysis, anomalies identified by both the clustering and autoencoder methods are compared for consistency.
Significant overlap in the anomalies detected by both methods would lend confidence to the outcome. For visual inspection, graphical representations of the normal and abnormal data samples are examined to qualitatively assess the outcome. These visual checks serve to corroborate the algorithmic findings and provide an additional layer of validation. Through this two-pronged evaluation approach, our study aims to provide a rigorous assessment of the effectiveness of the developed solution for real-time anomaly detection, leading to a reduction in defective parts and operational costs.

Clustering algorithm
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is an unsupervised learning model that identifies clusters in a data set based on the density of data samples. Unlike partitioning methods such as K-means, DBSCAN does not require the number of clusters to be specified in advance. The algorithm starts with an arbitrary sample and explores its neighbourhood based on a distance metric (usually Euclidean) and a density parameter, defined by the number of samples within a given radius. If the neighbourhood is sufficiently dense, a cluster is formed, and the algorithm continues to explore neighbouring samples, expanding the cluster. Samples that do not belong to any sufficiently dense region are labelled as outliers.
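The snippet below is a minimal sketch of this procedure using scikit-learn, assuming each part's 18-by-20 lubrication grid has been flattened into a 360-dimensional feature vector; the data array and the eps and min_samples parameters are illustrative placeholders rather than the values used in our experiments.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Placeholder data: 710 parts, each an 18-by-20 grid flattened to 360 values.
rng = np.random.default_rng(0)
X = rng.normal(size=(710, 18 * 20))

# Standardise features so the Euclidean distance metric is meaningful.
X_scaled = StandardScaler().fit_transform(X)

# eps is the neighbourhood radius; min_samples is the density threshold.
# Both are illustrative here and must be tuned to the actual data.
labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(X_scaled)

# DBSCAN labels samples outside every sufficiently dense region as -1 (noise).
outlier_idx = np.where(labels == -1)[0]
print(f"{len(outlier_idx)} parts flagged as outliers")
```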

Autoencoder
An autoencoder is a type of neural network used for unsupervised learning tasks, primarily data compression and reconstruction [16]. It consists of two main parts: an encoder, which compresses the input data into a lower-dimensional latent space, and a decoder, which reconstructs the data from the latent representation back to its original form. The network is trained to minimise the difference between the input and the reconstructed output, typically measured by a loss function such as the mean squared error (MSE). Autoencoders are useful for feature learning, anomaly detection, and data denoising, among other applications. They are particularly useful in scenarios where labelled data samples are scarce, as they can learn to identify patterns and anomalies solely from the structure of the input data.
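As an illustration, the sketch below builds a small dense autoencoder in TensorFlow/Keras for the flattened 18-by-20 lubrication grids; the layer widths and the 16-dimensional latent space are assumptions for illustration, not our exact architecture.

```python
import tensorflow as tf

n_features = 18 * 20  # flattened lubrication grid

# Encoder: compress the input into a low-dimensional latent representation.
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),  # latent space (illustrative size)
])

# Decoder: reconstruct the original input from the latent representation.
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(n_features, activation="linear"),
])

autoencoder = tf.keras.Sequential([encoder, decoder])

# Training minimises the MSE between the input and its reconstruction.
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.summary()
```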

Evaluation criteria
Comparative analysis between DBSCAN and the autoencoder forms a critical part of the performance evaluation in this study. By comparing the anomalies identified by both methods, we aim to assess the robustness and reliability of each method in real-world environments. A high degree of overlap in the anomalies flagged by both methods would indicate consistent and reliable identification of abnormal conditions, reflecting the effectiveness of our approach. On the other hand, discrepancies between the two methods could point to areas where one model may be more sensitive or specific than the other, especially in detecting anomalies related to lubrication. This comparative lens offers a more nuanced understanding of the strengths and weaknesses of each method, thereby providing valuable insights for refinement and implementation.
Visual inspection serves as a qualitative performance metric in our methodology, complementing the quantitative metrics obtained through comparative analysis. By graphically representing the original and reconstructed data samples, as well as the clustering outputs, we can identify abnormal data samples, facilitating the subsequent step of validation and verification. Visual inspection provides an intuitive understanding of how well a method segregates normal operational data from anomalies, particularly in the context of lubrication metrics. Plots and charts are used to highlight possible misclassifications, overlaps, or conspicuous gaps that may not be immediately evident through numerical evaluation alone. The visual representation can corroborate findings from the computational methods, offering an additional layer of assessment and making it an indispensable component of our evaluation strategy.

Experimental Studies
We use a Python-based environment for implementing the DBSCAN and autoencoder models, relying on libraries such as scikit-learn for clustering and TensorFlow for building and training the autoencoder. A set of lubrication data was collected in May and June 2022. Based on an 18-by-20 grid, each cell records a value for the lubrication depth applied to the blank. A histogram of the collected data samples is depicted in Figure 2. The autoencoder model is trained and evaluated using a data set split of 80% for training and the remaining 20% for testing. During training, the autoencoder minimises the reconstruction error, measured using a loss metric such as MSE. Several training epochs are used to iteratively adjust the model parameters to yield optimal performance. After training, the model is evaluated on the test set to assess its generalisation capability on unseen data samples. The test performance indicates the model's ability to accurately detect anomalies, providing insights into real-world stamping processes in an automotive plant.
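A hedged sketch of this setup is given below; the data array is a random placeholder standing in for the collected lubrication grids, the epoch count and batch size are illustrative, and the network reuses the illustrative architecture built in the Methodology sketch.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder for the collected lubrication data: one flattened 18-by-20
# grid per part. Real data would be loaded from the plant's sensor logs.
X = np.random.rand(710, 18 * 20).astype("float32")

# 80/20 split for training and testing, as described above.
X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

# Input and target are identical: the autoencoder (built in the earlier
# sketch) learns to reconstruct the lubrication grids from the latent space.
autoencoder.fit(
    X_train, X_train,
    epochs=50,            # illustrative epoch count
    batch_size=32,
    validation_data=(X_test, X_test),
    verbose=0,
)
print("Test MSE:", autoencoder.evaluate(X_test, X_test, verbose=0))
```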

Anomaly detection on test data
During training, the autoencoder learns to reconstruct the input from a latent space of reduced dimension. Similarly, in the test phase, the autoencoder reconstructs the test data from that latent space. The reconstruction error or loss on the test data set is then assessed. Three instances with the highest reconstruction losses are flagged as anomalies for further inspection. These high-reconstruction-loss instances are deemed deviations from normal operating conditions, suggesting potential issues in the stamping process, particularly related to lubrication. Problems with lubrication can lead to defects such as wrinkles and tears. The identification of these anomalies indicates the model's ability to promptly flag deviations that could lead to defective parts or necessitate a production stoppage, thereby facilitating real-time monitoring of stamping processes in the production line. There is a clear separation between larger clusters of parts with small reconstruction loss and isolated clusters of parts with relatively large reconstruction loss. Samples that lie above a nominal line are flagged as anomalies.
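Continuing from the training sketch above, the snippet below illustrates the flagging rule: compute a per-sample reconstruction MSE on the test set and flag the three samples with the largest loss. The top-three cut-off follows the description above; a nominal threshold line could be used instead.

```python
import numpy as np

# Reconstruct the test samples and compute a per-sample MSE.
reconstructions = autoencoder.predict(X_test, verbose=0)
per_sample_mse = np.mean((X_test - reconstructions) ** 2, axis=1)

# Flag the three samples with the highest reconstruction loss.
anomaly_idx = np.argsort(per_sample_mse)[-3:][::-1]
for i in anomaly_idx:
    print(f"Test sample {i}: reconstruction MSE = {per_sample_mse[i]:.4f}")
```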

Clustering results
When the DBSCAN algorithm is applied to the lubrication data, two distinct clusters appear when Principal Component Analysis (PCA) is used to project the data into a two-dimensional space, as shown in Figure 3. The first cluster, denoted the "normal cluster", consists of 706 parts that exhibit expected patterns pertaining to standard operating conditions. The second, smaller cluster, deemed the "anomaly cluster", contains only four parts. These four parts deviate significantly from the norm, based on the density-based metrics of DBSCAN, suggesting potential issues in the lubrication process that could lead to defects or even require a halt in production. The stark difference in the number of parts between the two clusters underscores the efficacy of the DBSCAN algorithm.
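The sketch below mirrors this analysis: project the lubrication vectors into two dimensions with PCA and run DBSCAN on the projection, as in Figure 3. The placeholder data and the eps/min_samples values are assumptions; on real data they would be tuned so that the dense normal cluster and the sparse anomaly cluster separate as described.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

# Placeholder lubrication vectors (one flattened grid per part).
X = np.random.rand(710, 18 * 20)

# Project to two dimensions for clustering and visualisation.
X_2d = PCA(n_components=2).fit_transform(X)

# Cluster in the projected space; label -1 marks noise/anomalous parts.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_2d)

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=labels, cmap="viridis", s=12)
plt.xlabel("Principal component 1")
plt.ylabel("Principal component 2")
plt.title("DBSCAN clusters in PCA space")
plt.show()
```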

Comparison of autoencoder and DBSCAN
A comparison between the autoencoder and DBSCAN methods reveals highly correlated results, reinforcing the usefulness of each model in anomaly detection. Specifically, the bottom-right three samples in the DBSCAN "anomaly cluster" correspond exactly to the top three samples with the highest reconstruction loss scores yielded by the autoencoder. Additionally, the data sample with the fourth-highest reconstruction loss score from the autoencoder matches the remaining top-left sample in the DBSCAN anomaly cluster. This strong alignment not only indicates the effectiveness of each method but also adds a layer of confidence in the findings. The correlation confirms that both methods are sensitive to lubrication-related anomalies in the stamping production line.
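This cross-check can be expressed programmatically, as in the sketch below, which compares the part indices flagged by DBSCAN (label -1) against the autoencoder's highest-loss samples; it assumes `labels` and `per_sample_mse` from the previous sketches were computed over the same, consistently indexed set of parts.

```python
import numpy as np

# Parts flagged by DBSCAN as noise (label -1).
dbscan_anomalies = set(np.where(labels == -1)[0])

# The four highest-reconstruction-loss parts from the autoencoder.
ae_anomalies = set(np.argsort(per_sample_mse)[-4:])

overlap = dbscan_anomalies & ae_anomalies
print(f"{len(overlap)} of {len(dbscan_anomalies)} DBSCAN anomalies "
      f"also flagged by the autoencoder")
```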

Visualisation of results
To enhance our assessment of anomaly detection, we present the lubrication depth data using heatmaps. Figure 4 and Figure 5 depict the distribution of lubrication data pertaining to normal and anomalous parts, respectively. Both heatmaps share similar patterns of thick lubrication in the left half. However, for the normal part, rows 4 to 6 of the right half exhibit consistently deep lubrication that transitions out gradually through rows 3, 2, and 1. In contrast, for the anomalous part, only rows 5 and 6 display deep lubrication. Not only is there a significant drop in row 4, but row 1 also shows a sudden jump in lubrication depth. This visual inspection reinforces the distinctions made by both the autoencoder and DBSCAN models.
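A minimal sketch of rendering one part's grid as such a heatmap is shown below; the random vector is a stand-in for an actual part's lubrication depth readings.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder for one part's lubrication readings (18-by-20 grid, flattened).
part_vector = np.random.rand(18 * 20)
grid = part_vector.reshape(18, 20)

plt.imshow(grid, cmap="viridis", aspect="auto")
plt.colorbar(label="Lubrication depth")
plt.xlabel("Column")
plt.ylabel("Row")
plt.title("Lubrication depth heatmap")
plt.show()
```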

Limitations and possible improvements
While the results from both the autoencoder and DBSCAN models demonstrate strong potential for anomaly detection, it is necessary for subject matter experts to validate the detected samples. In the absence of such expert validation, the model outcomes are yet to be fully corroborated against human-level quality inspection. Nevertheless, the consistency observed between the two models represents a significant step towards the development of a usable and useful recommender system for anomaly detection, facilitating screening by experts and leading to possible cost and time savings in stamping operations.

Conclusions and Future Direction
In summary, our study has demonstrated the applicability of unsupervised learning models, specifically DBSCAN and the autoencoder, for anomaly detection in stamping processes in an automotive production plant. Focusing on the lubrication data, the results indicate a high degree of correlation between the anomalies detected by DBSCAN and those detected by the autoencoder, reinforcing their reliability in anomaly detection. The detection outcomes provide cues for subject matter experts to focus their attention on the anomalous parts. The next step of our work is to link these anomalous parts with process data, such as material and machine status, for further investigation.
For further research, we plan to integrate the developed methods into real-time, context-aware monitoring systems, presenting the identified anomalies for validation by subject matter experts as well as against an ontology-based semantic knowledge base in the production line. A real-time context management platform is envisaged to support this integration.

Figure 1. Framework for anomaly detection in the stamping process.

Figure 4. Lubrication depth data for a normal part.

Figure 5. Lubrication depth data for an anomalous part.