Optimizing Neural Networks for Enhanced Material Property Predictions: Insights from Bulk Modulus Analysis

This research presents a deep learning model designed to accurately compute material properties, with a specific focus on the bulk modulus. This study places significant emphasis on hyperparameter optimization, involving adjustments to batch size, learning rate, hidden layer count, and neuron count. The dataset, comprising 7107 diverse materials, undergoes thorough preprocessing, which includes outlier removal and the extraction of elemental property descriptors using the matminer library and the Magpie dataset. The core model utilized in this research is an Artificial Neural Network (ANN), with the descriptors serving as crucial input features. Model performance is assessed using the Mean Absolute Error (MAE) as a quantitative metric, providing insights into predictive accuracy. This research also employs sensitivity analysis to scrutinize the significance of 132 features in predicting the bulk modulus, contributing to an understanding of material behavior dynamics and facilitating model optimization. The results highlight the impact of neuron count, layer depth, learning rate, and batch size on prediction accuracy. Furthermore, feature importance analysis underscores the critical role of specific material properties, with mean covalent radius emerging as the most influential factor in predicting the bulk modulus. These discoveries provide guidelines for optimizing neural network configurations and material property descriptors for predicting material elasticity.


Introduction
In the field of materials science and engineering, the quest for material property prediction stands as an ongoing challenge [1,2,3]. The fourth industrial revolution, characterized by the pervasive integration of digital technologies, has ushered in an era of unprecedented data accessibility and computational power. Within this landscape, the application of advanced machine learning techniques has emerged as a transformative approach to materials research [4,5,6,7]. As materials continue to play a pivotal role in various industries, the accurate prediction of their mechanical behavior, such as the bulk modulus, becomes important [8]. This research aims to contribute to this evolving paradigm by focusing on the prediction of the bulk modulus.
In this research field, a previous study predicted the mechanical properties of materials by employing supervised machine learning based on microstructural images [9]. In addition, universal descriptors combining atomic properties and crystal fingerprints for predicting elastic properties have been proposed [10]. Moreover, the use of machine learning to predict elastic constants and mechanical properties of multi-component alloys has been demonstrated [11]. While these studies have pushed the boundaries of materials prediction, a gap remains in the prediction of the bulk modulus using deep learning models. The present study, which emphasizes hyperparameter optimization and feature importance extraction, addresses this gap.

Model Development
The primary objective of this work is to define the architectural framework of a deep learning model for the precise computation of material properties, while simultaneously exploring the implications of various features through a systematic sensitivity analysis. In this work, several key hyperparameters are optimized. These include the batch size, which varied from 8 to 128; the learning rate, which ranged from 10⁻² to 10⁻⁶; and the number of neurons within the hidden layer, which varied from 16 to 512. The outcomes of this exploration inform the refinement of the deep learning model's architecture and training parameters, thereby ensuring its effectiveness in material property prediction.
The research utilizes materials data, with a focus on elastic properties, specifically the bulk modulus, owing to its relevance in the field of materials science and engineering applications. The selected dataset, gathered from the Materials Project, consists of 7107 distinct materials. This selection is driven by the fundamental requirement for a diverse and representative dataset, essential to the generation of a reliable model prediction. To prevent overfitting while enabling robust performance evaluation, the data is divided into training (80%), validation (10%), and test (10%) sets. The training data tunes model parameters, validation data guides hyperparameter configuration, and unseen testing data quantifies out-of-sample generalization.
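The 80/10/10 split described above can be sketched as follows; this is a minimal stdlib illustration (the paper does not specify the splitting tool, and the seed and index-based approach here are assumptions):

```python
import random

def split_80_10_10(indices, seed=0):
    """Shuffle and split indices into train/validation/test (80/10/10)."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    n = len(idx)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

# With the paper's 7107 materials this yields 5685/710/712 samples.
train, val, test = split_80_10_10(range(7107))
```

The validation set guides hyperparameter choices, so the test set is touched only once, after all tuning is complete.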
Prior to the deep learning analysis, the materials data undergoes an extraction of relevant information embedded within material formulas. This task is accomplished through the application of descriptor techniques. Specifically, the matminer library with the Magpie dataset is employed to derive elemental property descriptors. This selection is motivated by the intrinsic significance of elemental properties in shaping material behavior. The resultant descriptors contain a comprehensive array of formula properties, including Stoichiometric Attributes, Elemental Property Statistics, Electronic Structure Attributes, and Ionic Compound Attributes, each contributing to the bulk modulus properties.
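The Elemental Property Statistics in the Magpie preset are composition-weighted statistics of per-element quantities. A minimal stdlib sketch of the idea is shown below; the covalent radii in the lookup table are illustrative values only, and the real matminer `ElementProperty` featurizer covers many more properties and statistics:

```python
# Hypothetical lookup table of covalent radii in pm (illustrative values).
COVALENT_RADIUS = {"Fe": 132, "O": 66}

def weighted_stats(composition, prop_table):
    """Magpie-style statistics of one elemental property for a composition.

    composition: dict mapping element symbol -> atom count, e.g. Fe2O3.
    Returns the fraction-weighted mean plus the min and max over elements.
    """
    total = sum(composition.values())
    mean = sum(count * prop_table[el] for el, count in composition.items()) / total
    values = [prop_table[el] for el in composition]
    return {"mean": mean, "min": min(values), "max": max(values)}

fe2o3 = {"Fe": 2, "O": 3}
stats = weighted_stats(fe2o3, COVALENT_RADIUS)
# mean covalent radius = (2*132 + 3*66) / 5 = 92.4
```

Repeating this for each elemental property and each statistic (mean, min, max, range, and so on) produces the 132-dimensional feature vector used in this work.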
In the context of evaluating the deep learning model's performance, this work relies on the Mean Absolute Error (MAE) as the quantitative performance metric. The selection of MAE is rooted in its innate suitability for regression tasks, quantifying the average magnitude of errors between predicted and actual bulk modulus values. This metric, capturing the essence of model accuracy, aligns with the research objective of optimizing bulk modulus estimation. The mathematical formulation for MAE is shown in Equation 1:

MAE = (1/N) Σ_{i=1}^{N} |y_i − ŷ_i|     (1)

In this equation, y_i represents the measured value, whereas ŷ_i denotes the predicted value for the i-th data point within the dataset, and N denotes the total number of data points considered. This equation serves as a tool for evaluating the predictive performance of the model.
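The MAE formulation can be expressed directly in code; the bulk modulus values below are toy numbers for illustration only:

```python
def mean_absolute_error(y_true, y_pred):
    """MAE = (1/N) * sum(|y_i - yhat_i|) over all N data points."""
    assert len(y_true) == len(y_pred)
    return sum(abs(y - yh) for y, yh in zip(y_true, y_pred)) / len(y_true)

# Toy bulk-modulus values in GPa (illustrative numbers, not from the dataset).
actual = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 205.0]
mae = mean_absolute_error(actual, predicted)  # (10 + 10 + 5) / 3 ≈ 8.33 GPa
```

Because the errors are not squared, MAE is read in the same units as the target (GPa here), which makes the reported values such as 18.24 directly interpretable.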

Hyperparameters Optimization and Feature Importance
The collected material formulas undergo a systematic transformation process to convert them into composition objects and subsequently extract an array of formula properties. This feature set, consisting of a total of 132 distinct properties, was further organized into specific categories, including Stoichiometric Attributes, Elemental Property Statistics, Electronic Structure Attributes, and Ionic Compound Attributes; each of these categories contains essential information on material composition that is fundamental to the predictive modeling process. Within the domain of model optimization, this work focuses on hyperparameter optimization. This iterative and systematic exploration of hyperparameter combinations is undertaken with the objective of identifying the most effective configuration. Additionally, sensitivity analysis is carried out in this optimization process.
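The iterative exploration of hyperparameter combinations can be sketched as a grid search over the ranges stated earlier. The `train_and_score` function below is a hypothetical stand-in that fabricates a deterministic score so the loop is runnable; in the actual study it would train the ANN and return the validation MAE:

```python
import itertools

# Grids drawn from the ranges stated in the paper (intermediate values assumed).
batch_sizes = [8, 16, 32, 64, 128]
learning_rates = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
neuron_counts = [16, 64, 128, 256, 512]

def train_and_score(batch_size, lr, neurons):
    """Hypothetical stand-in: would train the ANN and return validation MAE.
    Here it returns a fabricated deterministic score for illustration."""
    return abs(neurons - 256) / 256 + abs(batch_size - 8) / 128 + lr * 10

# Exhaustively evaluate every combination and keep the best configuration.
best = min(
    itertools.product(batch_sizes, learning_rates, neuron_counts),
    key=lambda cfg: train_and_score(*cfg),
)
```

In practice each configuration is scored on the validation set, never the test set, so the final test MAE remains an unbiased estimate of generalization.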
Furthermore, the calculation of feature importance is performed through sensitivity analysis, wherein each of the 132 extracted features undergoes detailed evaluation to determine its influence on bulk modulus property predictions.

Results and Discussion
In this section, the designed deep learning model trained on materials data is presented. In order to optimize the model, several hyperparameters are iteratively studied. Furthermore, a feature importance calculation is performed on the trained model to identify the most influential feature.

Number of Layers
Next, we examine the influence of the number of layers using the previously determined number of neurons. Figure 2 shows the influence of the number of layers on MAE. The figure clearly illustrates the effect of altering the number of layers within the neural network on predictive accuracy. As the number of layers increases from 1 to 5, the MAE tends to decrease, with an anomaly at 4 layers. Beyond 5 layers, no systematic improvement of model performance is observed; performance even degrades markedly at 10 layers. This may indicate the well-known phenomenon of overfitting once the number of hidden layers exceeds a certain threshold.
A key outcome is the identification of the optimal neural network architecture. The figure points to the configuration employing 5 layers as the optimum, with the lowest MAE observed at 18.24. This finding underscores the importance of carefully selecting the number of layers.
Learning Rate
Figure 3 shows the effects of the learning rate on MAE. From the figure, one can notice a trend of decreasing MAE as the learning rate decreases from 10⁻² to 10⁻⁵. This observation signifies that lower learning rates are associated with improved predictive accuracy down to 10⁻⁵, which yields an MAE of 23.75. However, reducing the learning rate further results in much worse model performance. Thus, the optimum learning rate should be carefully tested, as monotonically increasing or decreasing its value does not always guarantee improved model performance.

Batch Size
In Figure 4, the effects of batch size on MAE are shown. It becomes evident that employing a smaller batch size leads to a lower MAE. This is consistent with the well-known behavior in which a smaller batch size is able to capture complex properties more accurately.
Batch size is a critical hyperparameter in machine learning, determining the number of data points utilized in each iteration of the training process. The choice of batch size impacts not only the efficiency of model training but also its overall performance. Smaller batch sizes are often associated with noisier gradient estimates but can lead to faster convergence and potentially better generalization. Conversely, larger batch sizes can provide more stable gradient estimates but may require longer training times and be prone to overfitting [12].
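The role of batch size can be made concrete with a minimal batching sketch (illustrative only; deep learning frameworks provide their own data loaders):

```python
def minibatches(data, batch_size):
    """Yield successive mini-batches; the final batch may be smaller."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

# 100 samples with batch size 32 -> one gradient update per batch,
# i.e. four updates per epoch with batches of 32, 32, 32, and 4.
batches = list(minibatches(list(range(100)), 32))
```

Each batch produces one gradient estimate, so halving the batch size doubles the number of (noisier) updates per epoch, which is the trade-off described above.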

Models Performance
The results demonstrate that both the ANN and XGBoost models achieve good overall performance for predicting the bulk modulus, as evidenced by the high R² scores of 0.81 and 0.80, respectively, in Figures 5 and 6. This indicates that over 80% of the variance in the actual bulk modulus values can be explained by each model's predictions.
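The R² score quantifying explained variance can be computed as below; the bulk modulus values are toy numbers for illustration, not taken from the dataset:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)               # total sum of squares
    return 1 - ss_res / ss_tot

# Toy bulk-modulus values in GPa (illustrative numbers only).
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [105.0, 145.0, 210.0, 240.0]
score = r_squared(actual, predicted)
```

An R² of 0.81 thus means the residual error is only 19% of the variance that a mean-value baseline would leave unexplained.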
From Figure 5, depicting the ANN model, we can observe a strong correlation between the actual and predicted bulk modulus values, with the data points largely concentrated along the diagonal. This shows the model's capability to closely match the real bulk modulus through its predictions across a wide range of materials. A few outliers exist, but the bulk of points lie near the parity line.
Turning to the XGBoost results in Figure 6, a similar trend is noticed, with most points aligned with the diagonal and a high R² score, underscoring equally robust modeling. However, more scattered points are discernible compared to the ANN plot, highlighting greater deviations between certain predicted and actual values. This explains why the mean absolute error of XGBoost is slightly higher than that of the ANN.
As a deep neural network, the ANN model comprises multiple hidden layers and nodes that can capture complex non-linear relationships between the input features and output variables. This allows it to build a more accurate mapping between the material descriptors and the bulk modulus. The additional layers enhance the model's representational power to discern intricate patterns within the data.
In contrast, XGBoost, with its decision tree ensemble foundation, has relatively simpler modeling capabilities. While still robust, XGBoost may fail to extract more subtle insights compared to deep networks like the ANN. Consequently, its predictive errors are marginally higher than those of the ANN model.
Both models prove capable of bulk modulus prediction, with the ANN model demonstrating a slight edge owing to its advanced architecture and ability to approximate intricate functions. Further hyperparameter tuning of XGBoost, such as increasing tree depth or ensemble size, could narrow its performance gap with the ANN. Nonetheless, the ANN model's accuracy highlights the merits of deep learning for decoding complex materials data.

Feature Importance
In this work, feature importance, a crucial metric for assessing the influence of inputs on the bulk modulus prediction, is investigated. Feature importance represents the degree to which each input feature contributes to the predictive accuracy of the model. It provides insights into which aspects of material composition are most significant in determining the material's elastic properties. The input utilized for the feature importance calculation is derived from matminer formula property descriptors using the Magpie dataset. These descriptors contain a comprehensive range of material properties, including Stoichiometric Attributes, Elemental Property Statistics, Electronic Structure Attributes, and Ionic Compound Attributes, which collectively capture the essential aspects of material composition.
The feature importance is calculated by sensitivity analysis. This approach involves measuring the rates of change of the bulk modulus with respect to each of the 132 input features. By systematically varying the input feature values and observing the resulting fluctuations in the bulk modulus predictions, the relative impact of each feature on the model's performance can be quantified. This method provides a detailed understanding of feature importance to identify the most influential material properties in the context of elastic property prediction.
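The perturbation procedure described above can be sketched as a finite-difference sensitivity estimate. Here a known linear function stands in for the trained ANN (an assumption made so the example is self-contained and its true sensitivities are known):

```python
def sensitivity(model, x, eps=1e-4):
    """Estimate |d model / d x_j| for each feature j by central differences."""
    scores = []
    for j in range(len(x)):
        hi = list(x); hi[j] += eps   # perturb feature j upward
        lo = list(x); lo[j] -= eps   # perturb feature j downward
        scores.append(abs(model(hi) - model(lo)) / (2 * eps))
    return scores

# Stand-in for the trained ANN: a linear model whose true sensitivities
# are simply its coefficients (3.0, 0.5, 1.2).
model = lambda x: 3.0 * x[0] + 0.5 * x[1] + 1.2 * x[2]
scores = sensitivity(model, [1.0, 1.0, 1.0])
most_influential = scores.index(max(scores))  # feature 0, coefficient 3.0
```

Applied to the real model, ranking the 132 scores in this way yields the importance ordering shown in Figure 7.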
Figure 7 illustrates the relative feature importance of the top 10 out of a total of 132 features employed to predict bulk modulus properties. Among the top five features with the highest importance, mean CovalentRadius and maximum CovalentRadius emerge among the top three most influential. The covalent radius is related to the bond length between the atoms in the material. This result suggests that the distance between atoms in a covalent bond plays a significant role in determining the material's elastic properties.
Additionally, Mean MeltingT, which describes the average melting temperature, also contributes significantly to the material's elastic properties. This can be understood as the melting temperature is closely related to the strength of the atomic bonds within the material; hence, materials with higher melting temperatures tend to have stronger atomic bonds.
Minimum Gsmagmom (smallest magnetic moment) also holds substantial importance in bulk modulus prediction. Together, these features provide information about the largest atom present in the material, the material's melting characteristics, and the presence of elements with low or no magnetic moments, respectively. Furthermore, Mean Gsvolume pa (average volume per atom) and Maximum MendeleevNumber (highest Mendeleev number) also contribute to feature importance. While the role of the magnetic moment in the bulk modulus deserves further study, we speculate that it may be related to the occupation of the crystalline orbitals; in this respect, smaller magnetic moments, which are related to double occupation of the orbitals, may enhance the bulk modulus. On the other hand, Mean Gsvolume pa offers insights into the material density and atomic packing, while Maximum MendeleevNumber indicates that the presence of elements with higher atomic numbers may also affect the bulk modulus properties.
Feature importance analysis helps to identify the most significant feature to determine the bulk modulus of the material.The findings collectively underscore the multifaceted nature of feature contributions to bulk modulus prediction, providing a deeper understanding of the interplay between material properties and elastic behavior.

Conclusion
In this study, a deep learning model has been developed to accurately compute material properties, focusing specifically on the bulk modulus. Through hyperparameter optimization, involving adjustments to batch size, learning rate, and hidden layer and neuron counts, the predictive accuracy of the model has been enhanced.
Comparative assessment reveals inherent tradeoffs between flexibility, precision, outlier handling, and overfitting resistance that guide optimal model selection. The ANN demonstrates superior numerical accuracy on more generalized data, while XGBoost better captures outlying data. Combining the two approaches could yield more universally effective materials forecasting.
The feature importance analysis has revealed that the covalent radius is the most influential property in predicting the bulk modulus.These findings provide essential guidance for optimizing neural network configurations and material property descriptors, enhancing the predictive accuracy of material elasticity.

Figure 1. Effect of the number of neurons on model performance.

Number of Neurons
Figure 1 shows the effect of the number of neurons on the Mean Absolute Error (MAE). The initial observations indicate that the MAE experiences a notable reduction as the number of neurons within the hidden layer increases. Based on the results, two distinct regions can be seen: the first region shows a rapid decrease of MAE as the number of neurons increases from 16 to 96, while the second region, starting from 112 neurons, shows a slower decrease of MAE. The rapid decrease of MAE indicates large changes in the weights and biases of the model, showing that the model is still in the learning phase. As the model learns the important weights and biases, it enters a saturated phase, which can be seen from the slower decreasing rate of MAE. The slower rate of MAE reduction suggests diminishing returns in predictive performance with further increases in the number of neurons.

Figure 2. Effect of the number of layers on model performance.

Figure 3. Effect of learning rate on model performance.

Figure 4. Effect of batch size on model performance.
Figure 5. Actual versus predicted bulk modulus for the ANN model.

Figure 7. The 10 input features with the highest importance.