Predictive model of mass flows of gaseous emissions from beehive ovens

One technique used in industry to control process variables is to act on their magnitudes, such as fuel flow, air volume, and mass of material, among others. The ceramic industry needs to measure and control the polluting gases from its fixed sources at lower cost, using tools that support agile decision making to mitigate adverse effects, not only to comply with legal standards but also out of environmental and management commitment. The objective of this research is to design a predictive model of the concentration of polluting gases in beehive ovens based on the results of the mass and energy balance of the ovens. An exploratory descriptive methodology was used, in which data on beehive ovens and fourteen (14) continuous quantitative variables were analyzed through multiple regression to study the predictive behavior of the pollutant concentration variables. The predictive capacity of the resulting model for SO2 was high, explaining 79% of the total variation of the variable. The multiple correlation coefficient of the complete model was 0.79. During the analysis of the model assumptions, the Durbin-Watson statistic reached a value of 1.971, indicating compliance with the assumption of independence of errors.


Introduction
Companies in the ceramic sector are focused almost exclusively on the construction market. The main products of industrial manufacturing are tablets, flooring, bricks, tiles, blocks, and pavers; other by-products and handicrafts are considered complementary productive activities, and all of them are part of the ceramic sector. The sector's deficiencies are concentrated in the quality and quantity of clay exploitation, the incipient technology used by the productive units in their processes, the low diversity of value-added products, the exclusive orientation toward the construction sector, the lack of investment in research and innovation activities, and the limited availability of specialized human talent in specific areas. Together these represent a major weakness that affects the sector's competitiveness. However, one aspect that has received little attention, and that in recent years has gained strength thanks to the regulatory framework, is the management of atmospheric pollutant emissions. Many of these companies have precarious processes for controlling the pollutants generated during firing. It is therefore relevant and urgent to develop emission control mechanisms that, rather than regulating, sanctioning, or closing manufacturing plants, provide them with tools to remedy this shortcoming [1][2][3].
Therefore, it is proposed to establish a mathematical model that estimates the concentration of pollutants leaving the chimney mouth of the identified beehive kilns, based on the variables that affect the firing process and on the mass conservation equations of the process. The model is also expected to be compared with historical evaluations of atmospheric emissions from stationary sources in factories, through a statistical analysis of the data obtained by a monitoring and surveillance entity. The proposed mathematical model will be a useful tool because it simplifies control operations and supports decision making. From the industrial point of view, it helps optimize the process by maximizing ceramic production while minimizing the concentration of pollutants produced; from the environmental point of view, unfavorable environmental impacts will be mitigated; and from the social point of view, it will give peace of mind to the community, because the factories are expected to stay within the limits required by the standard [4,5].
In this context, given the problems increasingly attributed to environmental impact, it is necessary to analyze the future behavior or projection of particles from the moment they leave the emission source, and their path in the atmosphere, mainly over large geographic areas where pollution directly or indirectly affects a population. For the development of this research, the following questions arose: a) What is the correlation between variables for the design of the predictive model? b) What method should be used for the design of the predictive model? c) What is the behavior of the variance for the significance of the fit in the predictive model? d) How can the dispersion of particulate matter (PM) and other pollutants be modeled mathematically in the context of the ceramic industry? This document is structured as follows: section 2 presents the methodology used for the development of the research; section 3 answers the questions above, addressing the multiple regression analysis of the different variables, the analysis of variance for the significance of the fit in the regression model, and the correlation of the variables; finally, section 4 presents conclusions and a final contribution.

Methodology
For the predictive model of beehive kilns in the ceramic sector of the municipality of El Zulia, Norte de Santander, Colombia, the dependent variable is the concentration of pollutants; the independent variables are chimney height, kiln volume, mass balance, amount of coal, and mass of material to be fired; and the intervening variable is the percentage of sulfur in the coal. This set of variables is shown in Figure 1. To develop the predictive model, a descriptive analysis of the data was first performed through exploratory and confirmatory factor analysis, and the model was then re-specified. Subsequently, to integrate the results and formulate the model, a multiple regression analysis was performed. Multiple regression was used to analyze the predictive behavior of the pollutant concentration variables PM, SO2, and NOx, treating these as observed rather than latent variables. The procedure, carried out in SPSS 24.0, was stepwise hierarchical regression, considering all possible models until a complete model was obtained. It generally follows these steps: 1) analysis of correlations between variables; 2) analysis of the basic application assumptions; 3) estimation of standardized and non-standardized coefficients; 4) evaluation of the fit and quality of the model [6][7][8].
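The coefficient estimation and fit evaluation steps listed above can be illustrated with a minimal sketch. The study itself used SPSS 24.0; the Python code below is only an illustration under stated assumptions. It fits an ordinary least squares model with an intercept by solving the normal equations and reports R2. The function names `solve` and `fit_ols` are hypothetical, and the data are synthetic, not taken from the study.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(xs, y):
    """Ordinary least squares with intercept; returns (coefficients, R^2)."""
    rows = [[1.0] + list(r) for r in xs]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic data (assumption): y = 3 - 2*x1 + 0.5*x2 + small Gaussian noise.
random.seed(0)
xs = [[random.uniform(0, 10), random.uniform(0, 5)] for _ in range(200)]
y = [3.0 - 2.0 * a + 0.5 * b + random.gauss(0, 0.1) for a, b in xs]
beta, r2 = fit_ols(xs, y)
print("coefficients:", [round(b, 3) for b in beta], "R^2:", round(r2, 4))
```

Solving the normal equations directly is adequate for a small, well-conditioned illustration like this one; a statistics package would normally be used for real data, as the study did.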

Results and discussion
According to the results obtained and expert opinion, PM is the main pollutant associated with the production capacity of the kiln and the quality of the fuel used. Experts also indicate that the type of material entering the kiln can affect the combustion process. When firing processes are carried out in the ceramic industry, the following can be considered possible pollutants: hot air, PM from products and fuels, and combustion gases such as carbon monoxide, carbon dioxide, and sulfur dioxide. On the other hand, the quality and characteristics of the fuel used play an important role, because a good fuel is fully consumed during the process. Depending on the amount of air that enters, the atmosphere can be oxidizing or reducing, which would considerably change the generation of gaseous emissions. The results indicate that rigorous control of the variables is necessary through measurement, that is, through correct handling of the physical magnitudes of the variables, for example the volume of fuel or air, so that they are dosed correctly. These factors are considered essential for controlling the gaseous emissions derived from this process.

Multiple regression analysis with the concentration of particulate matter pollutant
To design an approximation of the predictive model of the concentration of pollutants generated by firing in beehive kilns, the solution obtained in the latent variable approximation, shown in Figure 2, was evaluated first. A predictive model was then formulated through multiple regression analysis, taking as the dependent variable the measured concentration of each type of pollutant and as independent variables each of the indicators proposed in the structural solution [9][10][11][12]. According to Figure 2, and starting with PM, significant correlations exist between PM and the associated variables considered as predictors. The following were adopted as independent variables that predict PM: production during measurement (TON), capacity (Layer), actual fuel consumption (Consum), and percentage of moisture content (PorCon). Table 1 shows the correlation matrix between the variables considered for this analysis. In general, a high negative correlation is observed between PM and the predictor variables. A high positive correlation (approximately r > 0.80) is also reported among production, moisture percentage, and actual fuel consumption.
Once the strength and significance of the correlations between the variables involved had been verified, the regression model was estimated; the solution obtained is shown in Table 2. The predictive capacity of the resulting model was high, explaining 99% of the total variation of the variable (R2). The multiple correlation coefficient of the complete model was 0.99. During the analysis of the model assumptions, the Durbin-Watson statistic reached a value of 2.204 (within the range of 1 to 3), indicating independence of errors. The model yielded an F value of 1521.13 for 3 and 596 degrees of freedom, with a reported p-value of 0.000 (p < 0.01), which is statistically significant. This means that, when the three independent variables are considered, a significant solution is obtained for predicting PM concentration in this type of kiln. The t-scores of the regression coefficients (t = -32.53; t = 22.21, and t = -10.78; p < 0.01) indicate that all the predictor variables contribute significantly to the model, so the values obtained can be generalized to the population. In addition, the variance inflation factor (VIF) values indicate that the assumption of non-multicollinearity is met, since they range between 2.79 and 6.67, below the commonly suggested threshold of 10.
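The correlation screening described above can be sketched as follows. This is a minimal illustration, not the study's procedure: it computes Pearson correlations between named columns and lists the pairs whose absolute correlation exceeds 0.80. The helper names `pearson` and `correlation_matrix`, the column labels, and all data values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Return {(name_p, name_q): r} for every ordered pair of columns."""
    names = list(columns)
    return {(p, q): pearson(columns[p], columns[q]) for p in names for q in names}


# Hypothetical measurements (assumption), chosen only to show the mechanics.
data = {
    "PM": [10.0, 8.0, 6.5, 5.0, 3.0],
    "production": [1.0, 2.0, 3.0, 4.0, 5.0],
    "consumption": [2.1, 3.9, 6.2, 8.1, 9.8],
}
corr = correlation_matrix(data)

# Keep each unordered pair once and flag strong correlations of either sign.
high_pairs = sorted((p, q) for (p, q), r in corr.items() if p < q and abs(r) > 0.80)
for p, q in high_pairs:
    print(p, q, round(corr[(p, q)], 3))
```

Strong correlations among predictors, like the one between the two hypothetical predictors here, are the reason multicollinearity is checked before the coefficients are interpreted.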
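The two assumption checks reported above have simple closed forms, sketched here for illustration. The Durbin-Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals; values near 2 suggest uncorrelated errors. The variance inflation factor of a predictor is 1 / (1 - R2_j), where R2_j comes from regressing that predictor on the remaining ones. The function names and the toy residual series are assumptions, not outputs of the study.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def vif(r2_j):
    """Variance inflation factor, given R^2 of predictor j regressed on the others."""
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif_value):
    """Invert the VIF formula: share of a predictor's variance explained by the others."""
    return 1.0 - 1.0 / vif_value


# Toy residual series (assumption): alternating signs push the statistic toward 4,
# identical residuals push it to 0.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])

# The reported VIF range for the PM model, converted to shared variance.
shared_low = r2_from_vif(2.79)
shared_high = r2_from_vif(6.67)
print(dw_alternating, dw_constant, round(shared_low, 3), round(shared_high, 3))
```

Read this way, VIF values of 2.79 and 6.67 correspond to roughly 64% and 85% of a predictor's variance being shared with the other predictors, which is below the conventional cutoff of 10 (90% shared variance).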

Multiple regression analysis with SO2 pollutant concentration
Following the same procedure, the multiple regression analysis was performed with the SO2 pollutant concentration. The variable scheme comprises SO2 as the dependent variable, with absolute stack pressure (mmHg), stack gas velocity (m/s), and fuel calorific value (BTU), labeled pressure, velocity, and power, taken as the independent variables predicting SO2. Table 3 shows the correlations between the variables considered in this model. As with PM, a high negative correlation is generally observed between SO2 and the predictor variables. A high positive correlation (approximately r = 0.80) is also reported among pressure, velocity, and power. The results of the regression analysis are shown in Table 4. The predictive ability of the resulting model was high, explaining 79% of the total variation of the variable (R2). The multiple correlation coefficient of the full model was 0.79. During the analysis of the model assumptions, the Durbin-Watson statistic reached a value of 1.971, indicating that the assumption of independence of errors is met.
The model yielded an F value of 340.56 for 3 and 596 degrees of freedom, with a reported p-value of 0.000 (p < 0.01), which is statistically significant. Consequently, the variables pressure, velocity, and power are excellent predictors of SO2. The t-scores of the regression coefficients (t = -27.52; t = 11.24, and t = 8.71; p < 0.01) indicate that all the independent variables contribute significantly to the prediction model, so the values obtained can be generalized to the population. In addition, the variance inflation factor (VIF) values indicate that the assumption of non-multicollinearity is met, since they range between 3.33 and 4.72, below the suggested threshold of 10.
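The overall F value and the coefficient of determination of a linear regression with an intercept are linked by a standard identity, F = (R2 / k) / ((1 - R2) / df_resid), where k is the number of predictors. The sketch below applies that identity to the F value and degrees of freedom reported above; the function names are hypothetical, and the result is an arithmetic cross-check rather than a figure taken from Table 4.

```python
import math


def f_from_r2(r2, df_model, df_resid):
    """Overall F statistic implied by R^2 for a model with an intercept."""
    return (r2 / df_model) / ((1.0 - r2) / df_resid)


def r2_from_f(f_value, df_model, df_resid):
    """Inverse of f_from_r2: R^2 implied by the overall F statistic."""
    return (f_value * df_model) / (f_value * df_model + df_resid)


# Reported for the SO2 model: F = 340.56 with 3 and 596 degrees of freedom.
r2_so2 = r2_from_f(340.56, 3, 596)
r_so2 = math.sqrt(r2_so2)
print(round(r2_so2, 3), round(r_so2, 3))

# Round trip back to the F value as a sanity check on the algebra.
f_back = f_from_r2(r2_so2, 3, 596)
```

Under this identity the reported F value corresponds to R2 of about 0.63 and a multiple correlation R of about 0.79, which agrees with the reported coefficient of 0.79 when that coefficient is read as R rather than R2.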

Multiple regression analysis with NOx pollutant concentration
Replicating the procedure for the NOx pollutant concentration, the independent predictor variables considered were percentage of fuel sulfur, stack diameter (m), stack height (m), and stack temperature (ºC), labeled PorAzu, diameter, height, and temper. The correlation matrix for the NOx predictive model is shown in Table 5. In general, low to moderate inverse correlations are reported in most cases, the exception being the temper variable with the diameter and height of the chimney. Table 6 reports the results of the multiple regression model. The predictive ability of the resulting model, considering all predictor variables, is about 60%. The Durbin-Watson statistic reached a value of 1.875, confirming the independence of errors. The model yielded an F value of 213.08 for 4 and 595 degrees of freedom, with a reported p-value of 0.000 (p < 0.01), which confirms the significance of the model and the predictive power of the variables PorAzu, diameter, height, and temper for NOx. The t-scores of the standardized and unstandardized coefficients (t = -22.98; t = -7.37; t = 17.75, and t = -14.76; p < 0.01) indicate that the variables considered contribute significantly to the NOx prediction model, so the values obtained can be generalized to the population, confirming the predictive quality of the independent variables. Regarding the assumption of non-multicollinearity, the VIF values are below 10, ranging between 1.45 and 4.32, which confirms that this assumption is met.
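The significance statement for the t-scores can be checked with a short sketch. With 595 residual degrees of freedom the t distribution is close to the standard normal, so the example uses the normal distribution as an approximation; this is an assumption made for simplicity, and the exact t-based p-values would be very slightly larger. The function name `two_sided_p_normal` is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_normal(t_score):
    """Two-sided p-value for a test statistic under the standard normal approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_score)))


# t-scores reported for the NOx model, in the order given in the text.
t_scores = [-22.98, -7.37, 17.75, -14.76]
p_values = [two_sided_p_normal(t) for t in t_scores]

for t, p in zip(t_scores, p_values):
    print(t, p, p < 0.01)
```

For scores this large in absolute value the approximation makes no practical difference: every p-value is far below 0.01, and the largest ones underflow to zero in floating point.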

Approximation to an integral model
Considering the results found in the latent variable approximation and the multiple regression analysis, the following assumptions are suggested for an integral model of beehive kiln contamination. Considerations in the measurement model: a) PM is a dependent observed variable measured from the independent variables production, percentage of moisture content, and actual fuel consumption; b) SO2 is a dependent observed variable measured from the variables absolute pressure, gas velocity, and calorific value; c) NOx is a dependent observed variable measured from the predictor variables percentage of sulfur in the fuel, stack height, diameter, and temperature; d) production capacity is excluded from this model, since all kilns operate at maximum capacity and production and capacity therefore overlap. The parameters estimated from the general linear model of the multiple regression analysis are given in Equation (1) and Equation (2), and Table 7 shows the correlation matrix between the pollutant concentration variables.
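For orientation, the general linear form implied by considerations a) to c) can be written symbolically. This is only a sketch of the structure: the coefficient symbols a_i, b_i, and c_i and the abbreviated variable names are notation introduced here, and the estimated coefficient values are not reproduced.

```latex
\begin{align*}
\widehat{\mathrm{PM}}   &= a_0 + a_1\,\mathrm{Production} + a_2\,\mathrm{Moisture} + a_3\,\mathrm{Consumption} \\
\widehat{\mathrm{SO_2}} &= b_0 + b_1\,\mathrm{Pressure} + b_2\,\mathrm{Velocity} + b_3\,\mathrm{CalorificValue} \\
\widehat{\mathrm{NO_x}} &= c_0 + c_1\,\mathrm{Sulfur} + c_2\,\mathrm{Diameter} + c_3\,\mathrm{Height} + c_4\,\mathrm{Temperature}
\end{align*}
```

Each line has one intercept and one slope per predictor, which matches the model degrees of freedom reported in the regression analyses (3, 3, and 4).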

Conclusions
For legal and operational reasons, companies must periodically monitor the emissions generated by their beehive ovens, which are the source of pollutant gas emissions, in other words the system where the physical phenomenon occurs. This legal provision is difficult for the ceramic industry to comply with, since the available measurement equipment is located in other regions of the country and each isokinetic analysis is costly. Therefore, as a result of the present research, a predictive mathematical model of the polluting source was proposed. Using general multiple regression, the study developed three equations: the first for PM, whose physical magnitude should be measured as a mass flow and is thus expressed in parts per million; the second for sulfur dioxide, which can also be measured according to its mass flow, resulting from the mass balance of the system; and the third an approximate representation of nitrogen oxides, also evaluated from its mass flow. It is important to measure these variables in terms of mass flow or quantity so that they can be compared with the permitted standards regulated by the authorities. The solution obtained from the structural equation model (SEM) allowed the indicators associated with each type of pollutant to be validated. Although the indicators share common aspects and elements in the measurement framework, it was possible to define the adequate saturation of each one, which facilitated the formulation of a predictive model based on the analysis of the observed variables. This analysis suggests that the measured quantity of PM is an independent variable from which the behavior of SO2 and NOx can be explained.
After the analysis of the structural model, it can be concluded that PM is an exogenous independent observed variable, while SO2 and NOx are endogenous dependent observed variables. Correlations are also found between PM and both SO2 and NOx, and between SO2 and NOx. Applying the resulting model shows how the three pollutant concentrations are related, with the presence of PM and its influence on SO2 and NOx standing out clearly. These relationships and the variables found in the model show the quality and relevance of the proposed model for evaluating pollution from pollutant concentrations in beehive kilns.