Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus

Computational modeling is a key resource for gathering insight into physical systems in modern scientific research and engineering. While access to large amounts of data has fueled the use of Machine Learning (ML) to recover physical models from experiments and increase the accuracy of physical simulations, purely data-driven models have limited generalization and interpretability. To overcome these limitations, we propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models starting from experimental data. Since these models consist of mathematical expressions, they are interpretable and amenable to analysis, and the use of a natural, general-purpose discrete mathematical language for physics favors generalization with limited input data. Importantly, DEC provides building blocks for the discrete analogues of field theories, which are beyond the state-of-the-art applications of SR to physical problems. Further, we show that DEC enables a strongly-typed SR procedure that guarantees the mathematical consistency of the recovered models and reduces the search space of symbolic expressions. Finally, we demonstrate the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data: the Poisson equation, Euler's Elastica and the equations of Linear Elasticity. Thanks to their general-purpose nature, the methods developed in this paper may be applied to diverse contexts of physical modeling.


Introduction
Science and engineering routinely use computer simulations to study physical systems and gather insight into their functioning. These simulations are based on established models made of sets of differential and algebraic equations that encode first principles (i.e. balances of forces, mass, energy, ...) and whose analytical solutions cannot generally be found. Given the recent advances in Machine Learning, methods that leverage experimental data to automatically discover models from observations offer valuable support to scientific research and engineering whenever the analysis of a system with complex or unknown governing laws is involved.
To address the question of interpretability, Symbolic Regression (SR) has been employed in several works to discover the mathematical expressions of the governing equations of single-DOF or dynamical systems [1,2,3]. In [4], a combination of neural networks and physics-inspired techniques enhances SR by cleverly reducing the search space, while in [5] SR was used to distill equations from a Graph Neural Network with an architecture suited to problems involving Newtonian and Hamiltonian mechanics. Recent works [6,7] proposed to combine Deep Learning with SR via Recurrent Neural Networks or Transformer Networks. In [8,9], a sparse regression approach was introduced to recover the governing equations of nonlinear dynamical systems and continuum systems governed by PDEs. However, this method produces coordinate-based descriptions of the PDEs that obscure their meaning, is sensitive to noisy data and is based on a special representation of the equations (a linear combination of terms containing the derivatives of the unknown). Apart from [9], all these papers apply SR to the (re)discovery of algebraic or ordinary differential equations of classical Physics or other standard benchmarks, and are hence inappropriate for field problems, which are a large class of problems in Continuum Physics (fluid/solid mechanics, heat transfer, mass transport, ...). Moreover, none of these approaches exploits the geometrical character of Physics, so that, for example, physical variables and mathematical operators may be combined arbitrarily without accounting for their spatial localization (i.e. whether they are defined on points, lines, surfaces or volumes). A high-level comparison between our framework and existing ones is reported in Table 1.

Table 1: Comparison among SR frameworks for physical model identification (Ours, PySINDy [9,8], EQL [18], Eureqa [1], DSR [6], AI Feynman [4], PySR [19]; the criteria include support for field problems). Some instances of ✗ are software and/or conceptual limitations. We denote by ✗* some limitations of our framework that we plan to overcome next.
Here, we devise a framework that combines SR and Discrete Exterior Calculus (DEC) [10,11] for the discovery of physical models starting from experimental data. Following the pioneering ideas of Tonti [12,13] and others [14,15,16], we adopt a discrete geometrical description of Physics that offers a general-purpose language for physical theories, including the discrete analogues of field theories. A similar idea was proposed in [17], although with a less complete mathematical framework than DEC; moreover, a different model search strategy was employed and no application to vector-valued problems was shown. In Section 2, we introduce the main concepts of DEC and extend the theory to deal with problems characterized by vector-valued unknowns, such as the equations of Linear Elasticity. Then, we describe our approach to SR, which exploits the synergy between DEC and SR for the effective identification of mathematically and physically consistent models. In Section 3, we demonstrate that our method can successfully recover the coordinate-independent representations of three models of Continuum Physics starting from a limited set of noisy, synthetic experimental data.

Automated Discovery of Physical Models
As anticipated in the Introduction, we combine SR and DEC to produce interpretable physical models starting from data. In particular, geometry is at the foundation of many physical theories of the macrocosm, e.g. general relativity, electromagnetism, solid and fluid mechanics [12], and DEC captures the geometrical character of these theories better than standard vector calculus [10,16,15,14]. The incorporation of such a mathematical language within SR has several key advantages:
• since DEC is a discrete theory, it bypasses the differential formulation, so that the governing equations can be readily implemented in computations, without the need for discretization schemes;
• it makes it possible to distinguish physical variables based on their spatial localization and thus guarantees that the generated symbolic expressions are mathematically and physically consistent;
• it helps to constrain the search space of the SR by suggesting a type system for the mathematical operators and physical variables;
• it encodes the discrete counterparts of several differential operators (such as divergence, gradient and curl) in a minimal set of operators, thus making the symbolic representations more compact and expressive.
We argue that our model discovery strategy exploits these features of DEC to produce compact and effective models that generalize well starting from small training datasets. Also, our approach aims at identifying the governing equations that generate the observed data, rather than directly fitting the observations through an end-to-end model that relates inputs (sources, parameters) to measured quantities. This is a substantial difference with respect to all the state-of-the-art alternatives reported in Table 1, and it is also expected to provide better generalization and physical insight associated with the recovered model.
A schematic of the proposed framework is depicted in Figure 1. Model discovery consists of several steps. In the problem setting step, the physical system whose model has to be identified is defined in terms of a discrete structure, i.e. a "mesh", resembling (a simplified/coarse-grained description of) its geometry. A preliminary dimensional analysis is carried out to define the appropriate scales for the physical quantities and the number of dimensionless groups, including those that will be used as fitting parameters as part of the SR. Physical quantities, including those that are measured experimentally, are defined in terms of their spatial localization and nondimensionalized. The set of DEC operators and other mathematical operations to be used for SR is chosen. The definition of boundary conditions and domain sources (such as external forces) is also carried out at this stage. Then, SR is performed based on the following loop:
1. candidate model generation: a population of trees representing mathematical models based on the chosen operator set is generated, initially in a random fashion;
2. each model is solved, i.e. interpreted as the expression of a potential energy function and minimized, and the solution (or a derived physical quantity) is compared with experimental data to evaluate an error metric that contributes to the fitness of each model;
3. the candidate models are evolved through genetic operations (crossover and mutation) based on their fitness values, producing a new generation.
The loop usually repeats until a given number of generations is reached, after which the top-performing model(s) can be extracted.
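As a toy illustration of this generate-evaluate-evolve loop, the following plain-Python sketch evolves expression trees to fit a target function by mutation with elitism. It is a deliberately simplified stand-in, not the actual STGP pipeline of this paper (which uses typed DEC primitives, crossover, and energy minimization); all names and the primitive set are hypothetical:

```python
import random

random.seed(0)

PRIMS = ["add", "mul"]
TERMS = ["x", "1.0"]

def rand_tree(depth):
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMS)
    return (random.choice(PRIMS), rand_tree(depth - 1), rand_tree(depth - 1))

def evaluate(t, x):
    if t == "x":
        return x
    if t == "1.0":
        return 1.0
    op, left, right = t
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "add" else a * b

XS = [i / 10 for i in range(-10, 11)]
TARGET = [x * x + x for x in XS]                  # "experimental" data to fit

def mse(t):
    return sum((evaluate(t, x) - y) ** 2 for x, y in zip(XS, TARGET)) / len(XS)

def mutate(t):
    if isinstance(t, str) or random.random() < 0.5:
        return rand_tree(3)                       # replace a random subtree
    if random.random() < 0.5:
        return (t[0], mutate(t[1]), t[2])
    return (t[0], t[1], mutate(t[2]))

pop = [rand_tree(3) for _ in range(60)]           # step 1: random candidates
best0 = min(mse(t) for t in pop)
for gen in range(30):
    pop.sort(key=mse)                             # step 2: evaluate the error metric
    pop = pop[:20] + [mutate(random.choice(pop[:20])) for _ in range(40)]  # step 3
best = min(mse(t) for t in pop)

assert best <= best0                              # elitism: the error never worsens
```

Keeping the best individuals across generations (elitism) guarantees that the best error is non-increasing, mirroring the monotone fitness curves reported later in the paper.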
In the following sections, we first briefly review some fundamental concepts of DEC, and then describe the SR approach based on it, with particular reference to physical systems that can be described by a potential energy (variational formulation).

Elements of Discrete Exterior Calculus (DEC)
Exterior Calculus (EC) deals with objects called differential forms and their manipulations (differentiation, integration) over smooth manifolds. Compared to classical tensor calculus using index notation, it provides a coordinate-independent description of physical models and unifies several differential operators, which results in more compact and readable equations that easily generalize to manifolds of different dimensions [10]. DEC offers an inherently discrete counterpart to EC, designed to preserve some fundamental differential properties. In particular, in the context of DEC, cell complexes and cochains play the roles of smooth manifolds and differential forms, respectively. Here, we provide a very brief introduction to DEC, geared towards the applications we will consider in Section 3. We report in Appendix A a more formal account of DEC definitions and notations. Without loss of generality, we limit the presentation to simplicial complexes. A more detailed treatment of DEC can be found in [10,11,20].
Henceforth, E^N is the embedding N-dimensional Euclidean point space, V is the space of translations associated with E^N, and Lin is the space of linear applications from V to V (tensors).
From the physical modeling viewpoint, a simplicial complex provides the topological and geometrical structure that supports the physical phenomenon. A generic simplex is indicated with the letter σ and we use a pair of subscripts to identify each of them, such that σ_{p,i} is the i-th p-simplex. Cochains are the discrete counterpart of differential forms, hence they are also known as discrete forms. Specifically, a cochain can describe a physical quantity (differential form) integrated on the corresponding simplex and is thus the proxy for a field. The discrete analogue of the exterior derivative in Exterior Calculus is represented by the coboundary d, which plays a key role in expressing balance statements [12] of physical models. A dual complex is needed to represent the behavior of a physical quantity in the neighborhood of a simplex (see Section 4.2 in [12]); a role similar to that of the dual complex is played by, e.g., stencils in the finite difference method.
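To make the coboundary concrete, here is a minimal plain-Python sketch on a hypothetical two-triangle mesh of the unit square. It builds d as signed incidence matrices and checks the defining property d∘d = 0 (the discrete analogue of curl(grad) = 0); the mesh and orientations are illustrative assumptions:

```python
# Two triangles covering the unit square: nodes 0..3, simplices sorted by index.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
tris = [(0, 1, 2), (0, 2, 3)]

# d0: coboundary from 0-cochains (on nodes) to 1-cochains (on edges).
d0 = [[0] * 4 for _ in edges]
for r, (i, j) in enumerate(edges):
    d0[r][i], d0[r][j] = -1, 1

# d1: coboundary from 1-cochains (on edges) to 2-cochains (on triangles).
eidx = {e: k for k, e in enumerate(edges)}
d1 = [[0] * len(edges) for _ in tris]
for r, (i, j, k) in enumerate(tris):
    d1[r][eidx[(i, j)]] += 1      # edge (i,j) traversed with its orientation
    d1[r][eidx[(j, k)]] += 1
    d1[r][eidx[(i, k)]] -= 1      # edge (k,i) traversed against its orientation

# Applying d0 to a 0-cochain gives edge-wise differences: a discrete gradient
# integrated along each edge.
u = [0.0, 1.0, 3.0, 2.0]
du = [sum(d0[r][c] * u[c] for c in range(4)) for r in range(len(edges))]
assert du[0] == u[1] - u[0]

# d1 @ d0 = 0 on every triangle.
prod = [[sum(d1[r][e] * d0[e][c] for e in range(len(edges))) for c in range(4)]
        for r in range(len(tris))]
assert all(v == 0 for row in prod for v in row)
```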
In addition to the topological operators introduced so far, metric-dependent ingredients are needed to complete the formulation of physical theories, especially as concerns constitutive equations. Indeed, these equations relate primal to dual variables [12,14], so it is convenient to introduce an isomorphism between primal and dual cochains, called the discrete Hodge star and denoted by ⋆. Finally, in continuum theories the L² inner product expresses power expenditures by pairing power-conjugated quantities and thus makes it possible to represent potential energy terms in variational models of physical systems, which will be employed in our numerical experiments. In particular, it acts on a pair of primal/dual cochains and in the following is indicated by ⟨·, ·⟩. This inner product allows us to define the discrete codifferential δ as the adjoint of the coboundary with respect to this pairing.
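The following plain-Python sketch illustrates these metric ingredients on a hypothetical uniform 1D complex: diagonal Hodge-type weights define inner products on 0- and 1-cochains, and the codifferential δ := W0⁻¹ dᵀ W1 is verified to be the adjoint of the coboundary. The weights are illustrative assumptions, not dctkit's actual implementation:

```python
h = 0.25
n_nodes = 5                                    # uniform 1D complex on [0, 1]
edges = [(i, i + 1) for i in range(n_nodes - 1)]
w0 = [h / 2] + [h] * (n_nodes - 2) + [h / 2]   # dual (nodal) volumes
w1 = [1.0 / h] * len(edges)                    # diagonal Hodge-type edge weights

def cob(a):                                    # coboundary d of a 0-cochain
    return [a[j] - a[i] for (i, j) in edges]

def codiff(beta):                              # delta := W0^{-1} d^T W1
    out = [0.0] * n_nodes
    for e, (i, j) in enumerate(edges):
        out[i] -= w1[e] * beta[e]
        out[j] += w1[e] * beta[e]
    return [out[k] / w0[k] for k in range(n_nodes)]

def ip0(a, b):                                 # inner product of 0-cochains
    return sum(x * y * w for x, y, w in zip(a, b, w0))

def ip1(a, b):                                 # inner product of 1-cochains
    return sum(x * y * w for x, y, w in zip(a, b, w1))

a = [0.0, 1.0, -2.0, 0.5, 3.0]
beta = [1.0, -1.0, 2.0, 0.25]
# Adjointness <d a, beta>_1 = <a, delta beta>_0 holds by construction.
assert abs(ip1(cob(a), beta) - ip0(a, codiff(beta))) < 1e-12
```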

Symbolic regression
We tackle SR through Genetic Programming (GP) [21], an evolutionary technique that explores the space of symbolic models starting from an initial population and using genetic operations. The candidate models (individuals) are represented by expression trees, whose nodes can be either primitives (i.e. functions and operators from a prescribed pool) or terminals (i.e. variables and constants). Specifically, for the primitives we use the standard operations on scalars and the DEC operations introduced in Section 2.1, see Table B.1.
The initial population is generated using the ramped half-and-half method (minimum height = 2, maximum height = 5), which produces trees with a variety of sizes [22]. We constrain the initial individuals to have length (the number of nodes of the tree) less than or equal to 100 and to contain the unknowns of the problem.
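The ramped half-and-half initialization can be sketched as follows, with a toy primitive set standing in for the DEC operators (the names and rejection constants are hypothetical; the actual pipeline uses DEAP's typed tree generators):

```python
import random

random.seed(1)

PRIMS = ["add", "mul"]          # toy primitive pool
TERMS = ["u", "f", "0.5"]       # toy terminals; "u" stands for the problem unknown

def gen(depth, full):
    """'full' expands every branch to max depth; 'grow' may stop early."""
    if depth == 0 or (not full and random.random() < 0.3):
        return random.choice(TERMS)
    return (random.choice(PRIMS), gen(depth - 1, full), gen(depth - 1, full))

def length(t):                  # number of nodes of the tree
    return 1 if isinstance(t, str) else 1 + length(t[1]) + length(t[2])

def contains(t, name):
    if isinstance(t, str):
        return t == name
    return contains(t[1], name) or contains(t[2], name)

pop = []
while len(pop) < 50:
    depth = random.randint(2, 5)                        # ramped depths 2..5
    t = gen(depth, full=bool(random.getrandbits(1)))    # half full / half grow
    if length(t) <= 100 and contains(t, "u"):           # constraints from the text
        pop.append(t)

assert all(length(t) <= 100 and contains(t, "u") for t in pop)
```

Rejection sampling enforces the two constraints stated above: bounded length and presence of the unknown.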

Model discovery strategy
The goal of the GP evolution is to optimize the following fitness measure,

f(I) := α MSE(I) + η R(I),   (1)

which is minimized, where α is a scaling factor set equal to {1, 10, 1000} for Poisson, Elastica and Linear Elasticity, respectively, MSE(I) is the mean-square error of the solution corresponding to the current individual I, R is a regularization term and η is the associated hyperparameter. The regularization term aims at preventing overfitting, i.e. low error on the training dataset and high error on the validation dataset. To favor individuals with low complexity among those that accurately fit the training dataset, we take the regularization term to be R(I) := length(I).
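Assuming the additive penalized form just described (a scaled MSE plus a length penalty, to be minimized), the fitness measure can be sketched as:

```python
def fitness(mse: float, length: int, alpha: float = 1.0, eta: float = 0.1) -> float:
    """Penalized error of an individual: alpha * MSE plus a parsimony term."""
    return alpha * mse + eta * length

# A perfectly fitting individual of length 9 under eta = 0.1:
assert abs(fitness(0.0, 9) - 0.9) < 1e-12
# Regularization disabled (eta = 0) ignores model complexity entirely:
assert fitness(0.25, 40, alpha=1.0, eta=0.0) == 0.25
```

This form is consistent with the Poisson results reported later, where exact recovered energies of length 9 and 11 attain fitness 0.9 and 1.1 under η = 0.1.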
Starting from the initial population, the program evolves the population through the operations of crossover and mutation, similarly to genetic algorithms. Individuals eligible for crossover and mutation are chosen through a binary stochastic tournament selection, where the competing individuals have given survival probabilities, so that even the weakest individual has a chance of winning the tournament. In all the experiments we used one-point crossover and a mixed mutation operation that is chosen among uniform mutation, node replacement and shrink (see the DEAP documentation for details on these operations). The two survival probabilities for the stochastic tournament and the three probabilities associated with the mixed mutation are hyperparameters tuned as part of the model selection.
As anticipated, the discrete mathematical framework introduced in the previous section naturally suggests a type system for the primitives of the GP procedure. For example, we cannot sum cochains of different dimensions. To enforce this typing, we use a variant of Genetic Programming called Strongly Typed Genetic Programming (STGP), which generates and manipulates trees that are type-consistent. Crucially, in our approach the physical consistency of the generated expressions is guaranteed by the use of dimensionless quantities. To evolve the population, we use a (µ + λ) Genetic Algorithm with µ = λ and overlapping generations, such that the µ individuals with the best fitness are selected (truncation selection) from the total population including parents and offspring [23].
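A minimal sketch of such a type system (hypothetical names and operator subset; the actual implementation relies on DEAP's strongly typed primitives) tags each cochain with a degree and a primal/dual flag, and rejects ill-formed trees such as sums of cochains of different dimension:

```python
from typing import Optional, Tuple

Ctype = Tuple[int, str]   # (degree, 'primal' or 'dual')
N = 2                     # dimension of the simplicial complex

def typecheck(tree) -> Optional[Ctype]:
    """Return the cochain type of an expression tree, or None if ill-typed."""
    if isinstance(tree, str):
        return {"u": (0, "primal"), "f": (0, "primal")}.get(tree)
    op, *args = tree
    ts = [typecheck(a) for a in args]
    if any(t is None for t in ts):
        return None
    if op == "add":               # cochains can only be summed type-to-type
        return ts[0] if ts[0] == ts[1] else None
    if op == "d":                 # coboundary raises the degree by one
        p, kind = ts[0]
        return (p + 1, kind) if p < N else None
    if op == "star":              # Hodge star swaps primal and dual
        p, kind = ts[0]
        return (N - p, "dual" if kind == "primal" else "primal")
    return None

assert typecheck(("add", ("d", "u"), ("d", "f"))) == (1, "primal")
assert typecheck(("add", ("d", "u"), "f")) is None   # mismatched degrees: rejected
```

In STGP, this check is enforced during tree generation and genetic operations rather than after the fact, which shrinks the search space before any model is evaluated.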
Datasets. For each physical system analyzed in our experiments (Section 3), we generate a dataset based either on the exact solution of the discrete model that we would like to recover or on the numerical solution of the corresponding continuous model.

Variational models of physical systems. We model physical systems adopting a variational approach, such that each individual of the population that is evolved via SR corresponds to a discrete potential energy functional. A typical individual tree is represented in Figure 3B. Boundary conditions are enforced by adding a penalty term to the discrete energy. For the fitness evaluation, we first compute the minimum-energy solution of each model numerically. We assign the arbitrary value 10^5 to the MSE of constant energies or of those for which the minimization procedure does not converge. The minimum-energy solutions are then compared with those in the datasets to compute the fitness of the individuals according to eq. (1).

Implementation
We implement our framework in two open-source Python libraries: dctkit (Discrete Calculus Toolkit) for the DEC concepts described in Section 2.1 and alpine for the symbolic regression. Specifically, dctkit leverages the JAX [24] library as a high-performance numerical backend for the manipulation of the arrays and matrices encoding cochains and DEC operators. For the minimization of the discrete energies found during the symbolic regression process, we use the L-BFGS algorithm implemented in the pygmo [25] library, with the gradients evaluated via the automatic differentiation features of JAX. The alpine library is based on the STGP algorithms provided by a custom version of the DEAP (Distributed Evolutionary Algorithms in Python) [26] library and on the distributed computing features of the ray library [27].

Results
In this section, we assess the effectiveness of our model discovery method on three problems: the Poisson equation, Euler's Elastica [29] and Linear Elasticity [30]. In the Poisson and Linear Elasticity problems, the goal is to recover the governing equations of the discrete model that is used to generate the dataset. In particular, by running the SR algorithm multiple times, these benchmarks allow us to evaluate the recovery rate, i.e. the percentage of runs in which the exact equations of the model used to generate the data were found. Instead, in the Elastica problem we emulate a real scenario where experimental data are employed for model discovery.
The primitives used for the problems are reported in Table B.1. Note that for Elastica we also use 1 as the identity dual 0-cochain and 1_int as a primal 0-cochain that is identically 1 on the interior of the simplicial complex and 0 on the boundary nodes. For all the problems, in addition to the primitives we employ the constant terminals {1/2, 2, −1}; for Linear Elasticity we also add the constants {10, 0.1}.

Poisson equation in 2D
In the continuous setting, the Poisson equation can be obtained as the stationarity condition of the Dirichlet functional [31]. Such a functional contains the L² norm of the gradient of the unknown field. In the discrete setting, we represent the dimensionless unknown as a primal scalar-valued 0-cochain u; the application of the coboundary operator d leads to a primal scalar-valued 1-cochain that is the discrete equivalent of the gradient of the continuous field integrated along the edges. Denoting by f the primal scalar-valued 0-cochain representing the (dimensionless) source, we can write the potential energy function as

E_p(u) = 1/2 ⟨du, du⟩ − ⟨u, f⟩.   (2)

It can be shown [32] that the stationarity conditions of E_p correspond to the discrete Poisson equation. The benchmark consists in recovering eq. (2) defined over a 2-simplicial complex. In particular, a Delaunay triangulation of the unit square with 230 nodes was constructed using gmsh [33].
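As an illustration of how a candidate energy of this kind is evaluated, the sketch below minimizes a Dirichlet-type discrete energy 1/2⟨du, du⟩ − ⟨u, f⟩ on a hypothetical uniform 1D complex by plain gradient descent (a stand-in for the L-BFGS/JAX machinery of the actual pipeline), and checks the minimizer against the known solution of −u″ = 1 with homogeneous Dirichlet conditions:

```python
n = 10                      # number of primal edges; n + 1 nodes on [0, 1]
h = 1.0 / n
f = [1.0] * (n + 1)         # constant unit source
u = [0.0] * (n + 1)         # initial guess: null 0-cochain (u = 0 on the boundary)

def energy(u):
    # 1/2 <du, du> - <u, f> with diagonal metric weights 1/h (edges) and h (nodes)
    quad = sum((u[i + 1] - u[i]) ** 2 for i in range(n)) / (2.0 * h)
    load = h * sum(u[i] * f[i] for i in range(1, n))
    return quad - load

def grad(u):
    g = [0.0] * (n + 1)
    for i in range(1, n):   # interior nodes only; Dirichlet values stay fixed
        g[i] = (2.0 * u[i] - u[i - 1] - u[i + 1]) / h - h * f[i]
    return g

e0 = energy(u)
for _ in range(5000):       # plain gradient descent stand-in for L-BFGS
    g = grad(u)
    u = [ui - 0.02 * gi for ui, gi in zip(u, g)]

# The discrete stationarity condition reproduces -u'' = 1, u(0) = u(1) = 0,
# whose solution x(1 - x)/2 is exact at the nodes for this scheme.
exact = [0.5 * (i * h) * (1.0 - i * h) for i in range(n + 1)]
assert energy(u) < e0
assert max(abs(a - b) for a, b in zip(u, exact)) < 1e-4
```

The stationarity of the quadratic energy yields the discrete Poisson equation, which for this uniform complex coincides with the standard three-point Laplacian.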
For the SR procedure, we generated 12 arrays representing the coefficients of the 0-cochain u by sampling a family of scalar fields u_{i,j} at the nodes of the mesh. The corresponding source cochains f were computed by taking the discrete Laplacian eq. (A.9) of these functions. The full dataset is made of the (u, f) pairs. The values of the functions u_{i,j} evaluated at the boundary of the square domain were used to enforce Dirichlet boundary conditions during the minimization of each candidate energy produced in the SR. As the initial guess for the minimization procedures, we took the null 0-cochain.
As reported in Table 2, the set of hyperparameter values includes regularization (η = 0.1). With such a set, we obtained a recovery rate of 66% in the model discovery runs. Notice that the model discovery process is generally effective even when the exact model is not found, as shown by the decrease of the mean fitness of the best individual of each generation as the evolution proceeds (Fig. 2A). To assess the effect of regularization on model discovery, we performed 50 additional runs with the best combination of hyperparameters without regularization (η = 0) found in the grid search (see Table C.1). In this case, the recovery rate dropped to 60%, while the length of the tree corresponding to the best individual found at the end of the SR procedure was 39.97 ± 15.53, which is significantly larger than that (9) of the shortest expression corresponding to the energy eq. (2).
To compensate for such a growth in model complexity, we set η = 0.1 while keeping all the other hyperparameters fixed and ran 50 additional evolutions, seeding the initial population with the 50 individuals found at the end of the model discovery runs without regularization. After 20 generations, all runs converged (recovery rate = 100%) to equivalent expressions of the energy eq. (2), with lengths between 9 and 11 (fitness equal to 0.9 and 1.1, respectively), see Fig. 3A.

Euler's Elastica
Unlike the previous problem, here we emulate a physical experiment involving an elastic cantilever rod subject to an end load by generating synthetic experimental data, instead of starting from an exact solution of a given discrete model. Specifically, we compute numerically the solution of a continuous model for different values of the load, perturb it with noise and sample it at the mesh nodes, to mimic experimental measurements. In the following, I = [0, L] ⊂ R is a one-dimensional interval that represents the coordinates of the points of the longitudinal axis of the rod in its undeformed, straight configuration, L is the length of the rod, and ũ(σ) is the angle formed by the tangent to the axis of the rod and the horizontal direction at the dimensionless arc-length σ := s/L. The axis of the rod is assumed to be inextensible, so its curvature can be readily computed as ũ′, where a prime denotes differentiation with respect to σ. For a constant cross-section rod subject to a vertical end load at the right end and clamped at the left end, the boundary-value problem for the (dimensionless) continuous Euler's Elastica equation reads [29]

ũ″ + f cos ũ = 0 in [0, 1],   ũ(0) = 0,   ũ′(1) = 0,   (6)

and the corresponding variational formulation is

E(ũ) = ∫₀¹ [1/2 (ũ′)² − f sin ũ] dσ.   (7)

In eqs. (6)-(7), f = PL²/B is the dimensionless load parameter, where P is the vertical component of the load applied at the right end and B is the bending stiffness of the rod.
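As a quick sanity check of this boundary-value problem, one can linearize it for small angles (cos ũ ≈ 1), which admits the closed-form solution ũ(σ) = f(σ − σ²/2); the sketch below (with an illustrative small load) verifies the boundary conditions and the residual of the linearized ODE:

```python
f = 0.2  # small dimensionless load, so that cos(u) ≈ 1 is a fair approximation

def u(s):         # closed form of u'' + f = 0, u(0) = 0, u'(1) = 0
    return f * (s - 0.5 * s * s)

def du(s):
    return f * (1.0 - s)

assert u(0.0) == 0.0           # clamped left end
assert du(1.0) == 0.0          # moment-free right end
# residual of the linearized ODE u'' + f = 0 at a few points
for s in [0.0, 0.3, 0.7, 1.0]:
    d2u = (u(s + 1e-4) - 2 * u(s) + u(s - 1e-4)) / 1e-8
    assert abs(d2u + f) < 1e-4
```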
In the discrete setting, we represent the unknown of the problem as a dual scalar-valued 0-cochain u, where the value at each dual node is the angle formed by the corresponding primal edge and the horizontal direction. The discrete curvature k may be evaluated by first taking the difference between the angles of two consecutive edges (d⋆u), then transferring this information to the primal nodes (⋆d⋆u) and finally retaining only the internal nodes, where k is meaningful. Hence, the discrete equivalent of eq. (7) reads

E(u, f) = 1/2 ⟨k, k⟩ − f ⟨sin u, 1⟩,   (8)

where k = 1_int ⊙ ⋆d⋆u. The 1D mesh for this problem was generated directly by code. Precisely, we generated a 1-simplicial complex in [0, 1] consisting of 11 uniformly distributed nodes.
To generate the datasets, we considered a rod with length L = 1 m and bending stiffness B ≈ 7.854 Nm². We computed numerically the solutions of the continuous problem eq. (6) for 10 different values of the force P ∈ {−5, −10, ..., −45, −50} N and sampled the (x, y) coordinates of the points of the deformed configurations at the nodes of the 1-simplicial complex. Finally, we perturbed them with uniformly distributed noise (amplitude equal to ≈ 5% of the order of magnitude of the clean data) and recovered the arrays of the coefficients of u. Linear approximations of these data were used as initial guesses for the minimization of the candidate energies found during the SR. The full dataset consists of the (u, PL²) pairs.
In a realistic experimental scenario, the load magnitude is prescribed, while the bending stiffness is an unknown material-dependent parameter that must be calibrated to reproduce the measurements. Hence, for each individual of the SR procedure the bending stiffness must be fitted on one solution ū belonging to the training set. For this purpose, for each candidate energy E(u, f) we solved the constrained optimization problem

min_B MSE(u*, ū)   subject to   u* = argmin_u E(u, PL²/B).

Both the average and the standard deviation of the fitness of the best individual of each generation over 50 model discovery runs decrease during the SR evolution (Fig. 4A) and attain the values 0.28 and 0.13, respectively, after 100 generations. Also, the selected models fit the data of the test set well (Fig. 4B-C). These observations confirm the effectiveness of the model discovery process, along with the accuracy and generalization capabilities of the recovered models.
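To see what this calibration amounts to, consider the small-angle linearization ũ(σ) ≈ (PL²/B)(σ − σ²/2): the inner minimization then has a closed form and the stiffness fit reduces to a one-parameter least-squares problem. The toy sketch below (with illustrative load and noise values, not the paper's data) mimics this step:

```python
import random

random.seed(3)

L, B_true, P = 1.0, 7.854, -20.0               # illustrative values
sigma = [i / 10 for i in range(11)]            # 11 uniformly spaced stations
g = [s - 0.5 * s * s for s in sigma]           # shape function of the linear model
c_true = P * L ** 2 / B_true

# Synthetic "measurements": the linearized solution with ±5% multiplicative noise.
data = [c_true * gi * (1.0 + random.uniform(-0.05, 0.05)) for gi in g]

# Least-squares fit of the single coefficient c = P L^2 / B, then invert for B.
c_fit = sum(d * gi for d, gi in zip(data, g)) / sum(gi * gi for gi in g)
B_fit = P * L ** 2 / c_fit

assert abs(B_fit - B_true) / B_true < 0.1      # stiffness recovered within 10%
```

In the actual pipeline the inner problem is a full energy minimization rather than a closed form, but the outer structure (fit a material parameter so that the minimizer matches one training solution) is the same.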
We notice that the first and the last models among the top 4 of Table 4 are such that B = B_noise/(2h), where h is the (primal) edge length. This relation exploits the uniform edge length of the simplicial complex representing the axis of the rod. Therefore, these models coincide with that of eq. (8) up to addition and/or multiplication by a constant.

Linear Elasticity in 2D
Differently from the previous benchmarks, this case deals with a vector-valued unknown and thus demonstrates the capability of our framework to handle another class of problems. We collect the dimensionless coordinates of the nodes of the 2-simplicial complex representing the reference configuration of an elastic body in a primal vector-valued 0-cochain. We assume plane strain conditions. Let e_i, i = 1, 2 be the covariant basis for V made of two edge vectors of the reference configuration of a given 2-simplex and let e′_i be the covariant basis for V made of the corresponding edge vectors of the current configuration of the 2-simplex. The contravariant bases are indicated with superscript indices. We represent the two-dimensional deformation gradient F as a dual (tensor-valued) 0-cochain whose coefficient on a given 2-simplex is

F(⋆σ_{2,i}) = e′_j ⊗ e^j.   (10)

Then, we define the two-dimensional, infinitesimal strain measure as a dual 0-cochain

E = sym(F) − I,   (11)

and by applying the constitutive equation for linearly elastic isotropic bodies we obtain the corresponding dimensionless (two-dimensional) stress as a dual 0-cochain

S = 2µ̂ E + λ̂ tr(E) I,   (12)

where I is the identity dual 0-cochain and (µ̂, λ̂) are the dimensionless Lamé moduli (here, we take λ̂/µ̂ = 10). Notice that the second term in the constitutive equation involves the coefficient-wise multiplication between a scalar-valued and a tensor-valued cochain.
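Coefficient-wise, the kinematic and constitutive maps just described are easy to sketch for a single 2-simplex; in this illustrative snippet the shear modulus is normalized to 1 so that the ratio of the Lamé moduli is 10, as in the text:

```python
def sym(F):
    return [[0.5 * (F[i][j] + F[j][i]) for j in range(2)] for i in range(2)]

def strain(F):                    # E = sym(F) - I
    S = sym(F)
    return [[S[i][j] - (1.0 if i == j else 0.0) for j in range(2)] for i in range(2)]

def stress(E, mu=1.0, lam=10.0):  # S = 2*mu*E + lam*tr(E)*I, with lam/mu = 10
    tr = E[0][0] + E[1][1]
    return [[2.0 * mu * E[i][j] + (lam * tr if i == j else 0.0) for j in range(2)]
            for i in range(2)]

F = [[1.1, 0.0], [0.0, 1.1]]      # pure dilation: 10% stretch in both directions
E = strain(F)
S = stress(E)
assert abs(E[0][0] - 0.1) < 1e-12 and abs(E[0][1]) < 1e-12
assert abs(S[0][0] - 2.2) < 1e-9  # 2*0.1 + 10*0.2
assert S[0][1] == S[1][0]         # the stress inherits the symmetry of the strain
```

In the actual framework these maps act simultaneously on all coefficients of the tensor-valued dual 0-cochains, and tr and sym are available as primitives of the SR.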
In the case of zero body forces and assuming Dirichlet boundary conditions (prescribed positions of the nodes), the potential energy function reads

E_LE(F) = 1/2 ⟨S, E⟩.   (13)

The goal of this benchmark is recovering eq. (13) as a function of F. The deformed configuration is given in terms of the current positions of the nodes that minimize eq. (13). In principle, an admissible candidate energy E(F) must be frame-indifferent, i.e. E(F + W) = E(F) for any skew-symmetric dual 0-cochain W. To enforce such a constraint, we added the penalty term −(E(F + W) − E(F))² to the fitness eq. (1) for a given skew-symmetric dual 0-cochain W.

Table 5: Discrete energies corresponding to the 4 best individuals obtained from the model discovery runs (n = 50) of the Linear Elasticity problem using the final values of the hyperparameters reported in Table 2. (Columns: #, Energy, Training MSE.)

This addition is clearly necessary but not sufficient. In practice, we observed that all the energies obtained at the end of the model discovery runs are frame-indifferent.
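Why energies built from sym(F) pass this penalty automatically can be checked numerically on a single coefficient: adding a skew-symmetric W leaves sym(F), and hence the energy, unchanged. A sketch with illustrative values:

```python
def energy(F, mu=1.0, lam=10.0):
    """Coefficient-wise linear elastic energy 1/2 S : E with E = sym(F) - I."""
    E = [[0.5 * (F[i][j] + F[j][i]) - (1.0 if i == j else 0.0)
          for j in range(2)] for i in range(2)]
    tr = E[0][0] + E[1][1]
    S = [[2.0 * mu * E[i][j] + (lam * tr if i == j else 0.0)
          for j in range(2)] for i in range(2)]
    return 0.5 * sum(S[i][j] * E[i][j] for i in range(2) for j in range(2))

F = [[1.05, 0.02], [0.01, 0.98]]
W = [[0.0, 0.3], [-0.3, 0.0]]            # skew-symmetric perturbation
FW = [[F[i][j] + W[i][j] for j in range(2)] for i in range(2)]

# sym(F + W) = sym(F), so the energy is unaffected and the penalty vanishes.
assert abs(energy(FW) - energy(F)) < 1e-9
```

A candidate energy that used F itself (rather than its symmetric part) would generally fail this check and be penalized accordingly.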
In both cases, the current positions of the nodes can be easily computed from the strain cochain and from eqs. (10)-(11).
As the initial guess for the energy minimization procedures associated with the evaluation of the fitness of the candidate energies, we took the reference positions of the nodes.
In the model discovery runs performed using the set of hyperparameters of Table 2, we obtained a recovery rate of 92%. Nevertheless, the SR process remains very effective even in the runs where the exact solution is not attained (Fig. 5A).
Notice that the second best individual of Table 5 equals 1/2 E_LE. For the remaining energies reported in Table 5, we can still prove their equivalence to E_LE under specific hypotheses. In the case of homogeneous deformations (F(⋆σ_{2,i}) = F for any i), the energies can be expressed in terms of the total area A of the 2-complex through eqs. (17)-(19). Moreover, the identities eqs. (20)-(22) hold in general. Additionally, when A = 1 as in our benchmark, eqs. (17)-(19) provide the identities ⟨I, Fᵀ⟩² = ⟨tr(F)I, sym(F)⟩ and ⟨I, I⟩ = 2. These facts, together with eqs. (20)-(22), allow us to prove the aforementioned equivalences. Of course, some of these equivalences do not hold in general (i.e. when the deformations are not homogeneous), so the recovered models correspond to different expressions of the potential energy while having the same accuracy (in terms of MSE) with respect to the datasets. To resolve this ambiguity, we compute the associated energy densities by dividing the energies by A, since the deformations are homogeneous. We find that energies #1, #3 and #4 of Table 5 are not admissible, because their energy densities depend linearly on the area A through the term in eq. (17). However, these expressions may be fixed by interpreting such a term as that in eq. (18), so that they all correspond to the exact energy density that was used to generate the dataset. We remark that this a posteriori discussion is possible because our method produces interpretable models that are amenable to analysis.
We notice that traditional SR can be used when the deformation gradient and the stress are homogeneous. However, its performance is much worse than that of our method, because the former lacks tensor-based operations, which are needed to achieve a compact representation of the energy density. The results of the comparison between the two methods are reported in Appendix D, along with an example of a genuine field problem in Linear Elasticity, which clearly cannot be solved via a traditional SR approach. We believe that these simple, yet effective examples demonstrate the advantage of the proposed approach in situations where conventional SR methods struggle to learn an accurate model.

Conclusions
In this work, we have introduced a new method to learn discrete energies of physical systems from experimental data, which exploits DEC as an expressive language for physical theories, together with the interpretability stemming from the SR approach.
We have demonstrated its effectiveness by studying three classical problems in Continuum Physics. Our approach suggests that the synergy between geometry and SR leads to physically-consistent models that generalize well. Further, it paves the way for the application of SR to the modeling of a broad class of physical systems.

Limitations and future work
We have developed our framework based on a special class of problems (variational and steady-state).This choice was motivated by the simpler implementation of the numerical routines compared to time-dependent, non-variational problems.To overcome this limitation, we will implement time-integration schemes and, regarding non-variational problems, we will use trees to represent the residual of the system of governing equations, instead of the energy functional.
B List of primitives for the SR used in the numerical experiments

C Details on the selection of the hyperparameters for the SR

We report in Table C.1 the results of the grid search carried out for each problem. For Poisson and Linear Elasticity, we used the recovery rate as the performance measure, while for Elastica we used the MSE on the validation set, since the data are noisy. For each combination of hyperparameters we performed 5 runs. In particular, for Linear Elasticity, we selected the set of hyperparameters among those (5) leading to a 100% recovery rate in the grid search, after carrying out 50 final model discovery runs. We report in Table 2 the set of hyperparameters corresponding to the best recovery rate.

D Comparison between our method and traditional GP on the Linear Elasticity benchmark
As discussed in Section 3.3, the dataset used for Linear Elasticity only consists of homogeneous deformations. Hence, a traditional Genetic Programming-based SR offers an alternative strategy to learn eq. (13), where the energy is a function of the scalar variables F_{11}, F_{12}, F_{21}, F_{22}, which represent the components of the homogeneous deformation gradient. We compared our method with such an approach, using addition, subtraction, multiplication, division, square and square root as primitives, without enforcing any strong typing. After choosing the best set of hyperparameters through a grid search, we performed 50 model discovery runs to assess the performance of the traditional approach (Figure D.1). Notably, the recovery rate dramatically drops to 4%, since in traditional SR tensorial operations (such as trace and symmetric part) must also be learned from component representations.
Also, the traditional SR approach cannot be applied without introducing additional operators (such as discrete integration/differentiation operations) whenever the deformations are non-homogeneous. Hence, our approach provides a distinct advantage over traditional SR in the case of field problems. To better elucidate this aspect, we studied a genuine field problem, where a body force induces non-homogeneous deformations. In this case, eq. (13) becomes

(1/2) ⟨S, E⟩ + ⟨u, f⟩,  (D.1)

where u and f are the primal vector-valued 0-cochains representing the (dimensionless) displacements of the primal nodes and the body force at each primal node, respectively. We generated 10 data samples where u is quadratic (constant body force) and given by:

u(x_{i,1}, x_{i,2}) = −(a/10) x_{i,1}² ê₁ + a (x_{i,2}² − 1) ê₂,  i ∈ {0, . . ., n₀},  (D.2)

where a ∈ {0.01, 0.02, . . ., 0.1} is a dimensionless constant, (x_{i,1}, x_{i,2}) are the (normalized) Cartesian coordinates of the i-th node and ê₁, ê₂ are the corresponding orthonormal basis vectors. The associated body forces f are computed by taking the negative of the (discrete) divergence of the stress tensor S, with the deformations and stresses related by eqs. (10)-(12). The full dataset (training, validation and test) consists of the 10 newly generated (X, f) pairs, where X is the matrix of the current positions of the nodes, and the 10 pure-tension samples of Section 3.3.
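The displacement samples of eq. (D.2) can be generated in a few lines; the following is a sketch assuming a uniform node grid on the unit square (the actual mesh used in the paper may differ).

```python
import numpy as np

def displacement_field(coords, a):
    """Quadratic displacement of eq. (D.2): at a node with (normalized)
    coordinates (x1, x2), u = -(a/10) x1**2 e1 + a (x2**2 - 1) e2."""
    x1, x2 = coords[:, 0], coords[:, 1]
    return np.stack([-(a / 10.0) * x1**2, a * (x2**2 - 1.0)], axis=1)

# Illustrative 5x5 node grid on the unit square.
xs = np.linspace(0.0, 1.0, 5)
coords = np.array([(x1, x2) for x1 in xs for x2 in xs])

# 10 samples for a in {0.01, ..., 0.1}, as in the appendix.
samples = [displacement_field(coords, a) for a in np.arange(0.01, 0.101, 0.01)]
```

The corresponding body forces would then be obtained by applying the discrete divergence to the stress computed from these displacements, as stated above.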
We performed 50 model discovery runs using the best set of hyperparameters found for Linear Elasticity; the results are reported in Figure D.2. The recovery rate dropped to 62%, likely because the hyperparameters, tuned on the homogeneous dataset, were not optimal for the problem based on the new dataset.

Figure 1: Schematic of the model discovery approach.

Figure 2: Results for the Poisson problem. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness (i.e. the fitness of the best individual of a generation evaluated on the training set) over the model discovery runs (n = 50); the first 34 generations have been removed from the plot, since the range of fitness values is very large. Comparison between the minimum-energy solutions (B, D, F) corresponding to the best individual at the end of the model discovery stage, evaluated for the source cochains associated to the test data {u_{2,3}, u_{1,3}, u_{1,0}} of eq. (3) (C, E, G).

Figure 3: (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the final runs (n = 50) with η = 0.1 and the initial population consisting of the best 50 individuals found in the runs with η = 0. Dashed horizontal lines mark the fitnesses of individuals having the shortest expressions that represent the energy in eq. (2). (B) Tree representation of one of the individuals with the lowest absolute value of the training fitness (see Table B.1 for the meaning of the primitives appearing in the tree).

Table 3: Discrete energies corresponding to the 4 best individuals obtained in the model discovery runs (n = 50) with η = 0.1 and the initial population consisting of the best 50 individuals found in the model discovery with η = 0.

#  Energy                     Training Fit.  Test Fit.  Test MSE
1  ⟨u, δ(d(u/2) − f)⟩         0.9            0.9        9.8 · 10⁻¹⁰
2  ⟨u, δ(du) − 2f⟩            0.9            0.9        9.8 · 10⁻¹⁰
3  ⟨⋆(⋆u), δ(du) − 2f⟩        1.1            1.1        9.8 · 10⁻¹⁰
4  ⟨du, du⟩ − ⟨f, 2u⟩         1.1            1.1        9.8 · 10⁻¹⁰

Figure 4: Results for the Elastica problem. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs (n = 50), along with the tree representation of the best individual (int_coch is 1_int). Comparison between the deformed configurations corresponding to the best individual at the end of the model discovery stage (blue curves), evaluated for P = −45 N (B) and P = −10 N (C), and the test data (red dots).

Figure 5: Results for the Linear Elasticity problem. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs (n = 50). Comparison between the deformed configurations (B, D) for ϵ ∈ {0.01, 0.02} (uniaxial tension) and (F, H) for γ ∈ {0.06, 0.08} (pure shear), corresponding to the best model at the end of the model discovery stage, and the associated test data (C, E) and (G, I), respectively. The magnitude of the displacements has been scaled by a factor of 5 for clarity. Orange meshes represent the reference configurations.

Figure D.1: Results for the Linear Elasticity problem using traditional genetic programming. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs (n = 50). Comparison between the deformed configurations (B, D) for ϵ ∈ {0.01, 0.02} (uniaxial tension) and (F, H) for γ ∈ {0.06, 0.08} (pure shear), corresponding to the best model at the end of the model discovery stage, and the associated test data (C, E) and (G, I), respectively. The magnitude of the displacements has been scaled by a factor of 5 for clarity. Orange meshes represent the reference configurations.

Figure D.2: Results for the Linear Elasticity problem using a dataset containing non-homogeneous deformations. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs (n = 50). Comparison between the deformed configurations (B, D) for ϵ ∈ {0.01, 0.02} (uniaxial tension) and (F, H) for a ∈ {0.03, 0.04} (non-homogeneous deformations, see eq. (D.2)), corresponding to the best model at the end of the model discovery stage, and the associated test data (C, E) and (G, I), respectively. The magnitude of the displacements has been scaled by a factor of 5 for clarity. Orange meshes represent the reference configurations.

Table 2: Values of the hyperparameters used in the numerical experiments of Section 3. For mixed mutation, the probabilities refer to uniform mutation, node replacement and shrink, respectively. For stochastic tournament, the probabilities refer to the survival chance of the strongest and the weakest individual, respectively.

…different values of a chosen load parameter. We split the dataset into three sets (double hold-out): training (50%), validation (25%), and test (25%). In the present context, training consists of model discovery and, possibly, the fitting of dimensionless parameters with unknown values. Through a screening phase we identify the hyperparameters that most affect the learning process, and we tune their values through a grid search to minimize the MSE on the validation set. More details on the grid search choices and results are reported in Appendix C. With the best values resulting from the grid search (Table 2), we perform 50 model discovery runs using as dataset the combination of the training and validation sets. Finally, we assess the generalization capability of the best model found in the model discovery runs by evaluating its MSE on the test set.
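The double hold-out split described above can be sketched as follows; this is a minimal illustration, and the paper's actual data handling may differ in details such as shuffling.

```python
import numpy as np

def double_holdout_split(samples, seed=0):
    """Split a list of samples into training (50%), validation (25%)
    and test (25%) sets, as in the double hold-out described above."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_train = len(samples) // 2
    n_val = len(samples) // 4
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    test = [samples[i] for i in idx[n_train + n_val:]]
    return train, val, test
```

Model discovery would then be run on `train + val` once the hyperparameters have been tuned on `val`, with `test` held out for the final assessment.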

Table 4: Discrete energies corresponding to the 4 best individuals obtained in the model discovery runs (n = 50) of the Elastica problem using the final values of the hyperparameters reported in Table 2. One such energy is ⟨⋆1_int, (d ⋆ u)²⟩ − ⟨sin u, f 1⟩.

Then, from the relation B = P L²/f we recovered the value of B and used it to evaluate the fitness of the individual over the whole training, validation and test sets, as described in Section 2.2. In particular, taking the ground-truth model, E ≡ E_el, and solving the optimization problem, we can identify a value B_noise ≈ 7.4062 N·m² for the bending stiffness that accounts for the noise.
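The parameter-fitting step above can be sketched as follows: the dimensionless parameter f is chosen to minimize the MSE on the data, and B is then recovered from B = P L²/f. The brute-force scan, the grid and the toy numerical values are illustrative assumptions, not the paper's actual procedure or values.

```python
import numpy as np

def fit_dimensionless_parameter(model, data, f_grid):
    """Brute-force scan: pick the value of the dimensionless parameter f
    that minimizes the MSE between the model predictions and the data."""
    mses = [np.mean((model(f) - data) ** 2) for f in f_grid]
    return f_grid[int(np.argmin(mses))]

def bending_stiffness(P, L, f):
    """Recover the bending stiffness from the relation B = P L^2 / f."""
    return P * L**2 / f
```

In practice a gradient-based optimizer would replace the scan, but the structure (fit f on the training data, then convert to B) is the same.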