A method of generative model design based on irregular data in application to heat transfer problems

The paper presents the results of applying the generative design method to reconstruct a model driven by irregular data in the form of a partial differential equation. The problem of non-stationary heating of a metal rod is considered as an example. The finite element method is used to construct templates of possible elements of the derived differential equation. The optimization algorithm for determining a model design is based on the procedure of best subset selection. The efficiency of the method for derivating a model of the thermal conduction process based on irregular data is shown. As irregular data, data obtained on spatial meshes with alternating steps of different size, or a step changing in arithmetic progression, are considered.


Introduction
The development of methods of the generative model design (GMD), in particular, data-driven models, is a promising direction for creating toolkit for monitoring and predicting complex multiphysical manmade or natural processes. Data-driven models have become widespread in various applications of neural networks and machine learning methods [1,2]. The idealized task is to reconstruct an accurate and well -interpreted model of a particular process at the same time. The generative design of models in the form of differential equations is a promising means of solving this problem [3,4,5,6,7].
The GMD methods have been intensively developed in the last few years. The methods were used for reconstruction of partial differential equations (PDE) that describe classical problems of mathematical physics: the wave equation, the Burgers equation, the Korteweg de Vries equation, with derivative term coefficients of the order of unity [7], as well as to solve real problems of ocean dynamics [8,9]. In these studies, the data presented on regular spatial meshes were considered. The studies demonstrated the high efficiency of the GMD methods.
The GMD algorithm consists of several procedures. The central place among which is (i) the construction of templates of elements (terms) of a potential model based on the available data, (ii) an optimization procedure for selecting the most significant elements.
In the works devoted to the problem of the GMD application, the finite difference method (FDM) is used to discretize partial differential equations (PDE), which is necessary for calculating the templates of elements of the derived model. In this paper, at the stage of template calculation, it is proposed to use the finite element method (FEM) [10,11] , which makes it relatively easy to take into account the irregularity of the data. As a test, the problem of heating a metal rod is considered. To obtain synthetic data, a heat transfer equation is used, describing a non-stationary one-dimensional heat conductivity process. The equation has an analytical solution. For reconstruction procedure, data arrays on irregular meshes are obtained. Several types of meshes are considered: with alternating step size along the spatial coordinate and with a step size increasing in arithmetic progression with a given progression difference.
To obtain a sparse vector of coefficients of the reconstructed dependence of the response on predictors, the LASSO technique has been widely used in regression analysis over the past two decades [12]. The advantages of the LASSO procedure include the ability to take into account a very large number of predictors. The disadvantages include the need for a reasonable choice of a penalty hyperparameter to filter out insignificant predictors, as well as, in some cases, the low accuracy of the obtained results. The differential equations underlying the mathematical model of most physical processes usually contain only a few terms. As a rule, it is enough to take into account up to 10-20 terms (elements) in the template of a potential model within the GMD method. In this case, the method of the selection of best subset of variables [13], which does not require the introduction and justification of auxiliary parameters, can be used as an optimization procedure. This approach is used in this work.
The present work has two objectives. First, to apply the GMD technique to derive the PDE based on the available data on the non-stationary process of thermal conduction in a metal rod. Secondly, to find out the effectiveness of the GDM method for solving a real problem in the case of irregular data. Irregular data refers to data obtained on an irregular spatial mesh.

Object of the research
As an object of the research, the problem of heating a thermally insulated rod is considered. The nonstationary process of heat conduction in a rod in a one-dimensional formulation is described by the equation [14] ∂ where t is the time, x is the coordinate, T is the temperature, α is the coefficient of thermal diffusivity of rod material. The initial and boundary conditions have the form Here T 0 is the initial rod temperature, T b is the fixed temperature of rod ends. The analytical solution of the problem for given initial and boundary conditions can be obtained by the method of separation of variables [14]: where K is the number of terms of the equation (in calculations K = 100). In this paper, heating of a copper rod is considered. The coefficient of thermal diffusivity is α = 10 −4 m 2 /s, T b = 700K and T 0 = 600K. The analytical solution for these parameters is shown in Figure 1 for three times t= 500s, 1000s, 1500s. The length of the rod is assumed to be equal to 1m.

The procedure of generative model design
In order to determine the design of equation for heat transfer problem, the following general form of an equation is considered in this paper where α 1 , α 2 , α 3 , α 4 are unknown coefficients. To obtain a discretized template, the present paper proposes to use the finite element method [10,11]. The approximate solution of the equation (4) is sought as a combination of the nodal values of the desired function where T j is the temperature in node j, ϕ j is the interpolation function, N is the number of nodes ((N − 1) is the number of elements). In order to determine the nodal values T j , the integral of weighted residual over the entire computational domain (in the case under consideration [0; 1]) is equated to zero: In the Galerkin formulation, the weight function has the same form as the interpolation function. In the present paper, first order polynomials (linear interpolation) are considered as interpolation functions. Substituting the functions in (6) leads to the following system of algebraic equations A three-layer approximation is used in (7) to discretize temporal derivatives in (6). The index j denotes the node number and varies from 2 to (N −1) taking into account the specified stationary boundary values of the unknown function T in nodes j = 1 and j = N. The index l denotes a time slice. One time slice consists of three neighbor time layers n − 1, n and n + 1. The number of time slices considered is L (in the present calculations, L = 1). The β im parameters regulate the degree of implicitness (the contribution to the computational templates of data from different time points). In the calculations, β im 1 = 0, β im 2 = 0. In vector form, the expression (7) can be represented as: Here vector V 1 consists of elements a jl (index j changes from 2 to (N − 1), index l from 1 to L) 21 a 31 a 41 . . . a (N−1) 1 a 22 a 32 a 42 . . . a (N−1) accordingly the vector V 2 consists of elements b jl , V 3 consists of elements c jl , V 4 consists of elements d jl , Z 0 is zero vector.
For the application of statistical learning methods [13], it is convenient to rewrite the expression (12) in the form: where Y = α 2 V 2 , α 0 = 0. Estimates of α coefficients can be made using least-squares (LS) method. The statistical analysis was performed using the R package [15]. The represented below the GMD algorithm consists of the following steps: 1) Using the generated data (see Section 2), the components of the vectors V i are calculated by the use of equations (8)(9)(10)(11).
2) The coefficient α 2 is assumed to be equal to unity. Other coefficients are calculated by the LS method. The value of the t-criterion and the p-value (the probability of observing a value equal to or greater than |t|) are determined. 3) The procedure of the selection of best subset of coefficients is applied [13,15], which allows to filter out insignificant coefficients. In the case of equation (13), the number of variables (predictors) is P = 3. The procedure involves iterating through 2 P possible arrangements with one, two or three terms (elements). For each fixed number of terms, the possible variants of the terms are sorted out and the optimal model is selected based on the calculation of the smallest sum of the square of the residuals. Next, a single optimal model is selected, for example, using the C P statistic estimation of the mean squared error on the control sample [13]: whereσ 2 is the estimation of the variance of residuals associated with each response value in (13), RSS is the residual sum of squares, d is the number of predictors in the model, n o = (N − 2) × L is the number of observations. The C P statistics adds penalty equals to 2dσ 2 to RSS, calculated on training sample. Thus it takes into account the property of the error on the training sample to underestimate the error on the control sample. The penalty increases as the number of predictors included in the model increases.

Application of the method for deriving the heat transfer equation from irregular data
Three cases corresponding to different types of data obtained using the expression (3) are considered. Case 1. The regular mesh with number of nodes N and space step Here L S = 1m is the rod length. Case 2. The irregular mesh with alternating steps ∆x 1c and ∆x 2c : where γ is the parameter of irregularity. The value γ = 0.475 , defining a difference in the size of neighbor cells of 10%, is considered. Case 3. The irregular mesh with number of nodes N and distance between nodes varied in arithmetic progression with progression step d a : Here ∆x 1 is the distance between nodes 0 and 1. The values ∆x 1 = 0.0049005 and d = 10 −6 are considered. The size of the first and last cells differs by 4%. It is assumed that N = 201 for all cases. The analysis showed the correlation of data on different time slices, which is typical for many nonstationary problems [13]. The paper considers one time slice corresponding to the time of 1500 s, the step between the neibor time layers is 1 s. Additional internal tests showed a more accurate recovery of the model in the case of the consideration only internal nodes, excluding nodes adjacent to the boundaries, the conditions at which are stationary according to (2). Therefore, the real number of observations n o was 163 (19 < i < 182). Table 1 shows the recovered by the LS-method coefficients as well as the t-criterion values and the corresponding p-values.
In the case of a uniform mesh (case 1), all estimates of the coefficients are reproduced. However, the coefficient α 3 has significantly lower p-value. One can expect that the term (element) with this coefficient  will have the significant impact on the response Y. This hypothesis is supported by the results shown in table 2. For case 1 of 2 3 = 8 possible combinations, three optimal sets among variants of one, two and three terms are marked by asterisks. Value of C P parameter is minimum for the first set with one term with coefficient α 3 . The results of a procedure of the best subset selection for case 1 are shown in table 3. The results fully correspond to the desired model. For case 2 with "alternating" steps in space (with step ratio 0.475/0.525 = 0.9), the situation is similar. The difference in steps does not affect the derived model quality (see table 3).
For case 3 with a step varying in arithmetic progression according to the results of statistical analysis (table 2), a model with two coefficients α 3 and α 4 can be preferred (table 3). However, first, the value of the coefficient α 4 is negligible compared to α 3 (the norms of vectors V 3 and V 4 are of the same order). Secondly, the data in table 1 indicate a significantly greater influence on the response of the term with α 3 . Therefore, we can recognize the results of the model recovery process as good.

Conclusions
The method of the generative model design is a promising tool for constructing a data-driven process model based on a partial differential equation. In this paper, the GMD is used to derive the PDE, which describes the non-stationary process of heat conduction. It has been shown that the GMD qualitatively reproduces the model on both regular and irregular data. As irregular data, spatial-temporal distributions of temperature on irregular meshes with alternating steps of different length and with step changing in arithmetic progression are considered. Within the framework of the GMD algorithm for obtaining discretized templates of model elements on an irregular grid, the use of the finite element method is proposed, and for optimizing the model structure -the method of the best subset selection. In the future, it is planned to take into account the presence of statistical noise in the source data used to reconstruct the heat transfer equation.