Investigating the ability of PINNs to solve Burgers’ PDE near finite-time blowup

Physics-Informed Neural Networks (PINNs) have been achieving ever more impressive feats of solving complicated Partial Differential Equations (PDEs) numerically while offering an attractive trade-off between accuracy and speed of inference. A particularly challenging aspect of PDEs is that there exist simple PDEs which can evolve into singular solutions in finite time starting from smooth initial conditions. In recent times some striking experiments have suggested that PINNs might be good at even detecting such finite-time blow-ups. In this work, we embark on a program to investigate this stability of PINNs from a rigorous theoretical viewpoint. Firstly, we derive error bounds for PINNs for Burgers' PDE, in arbitrary dimensions, under conditions that allow for a finite-time blow-up. Our bounds give a theoretical justification for the functional regularization terms that have been reported to be useful for training PINNs near finite-time blow-up. Then we demonstrate via experiments that our bounds are significantly correlated with the ℓ2-distance of the neurally found surrogate from the true blow-up solution, when computed on sequences of PDEs that are getting increasingly close to a blow-up.


Introduction
Partial Differential Equations (PDEs) are used for modeling a large variety of physical processes, from fluid dynamics to bacterial growth to quantum behaviour at the atomic scale. But differential equations that can be solved in "closed form," that is, by means of a formula for the unknown function, are the exception rather than the rule. Hence, over the course of history, many techniques for solving PDEs numerically have been developed. However, even the biggest of industries still find it extremely expensive to deploy numerical PDE solvers: aircraft manufacturers aiming to understand how wind turbulence patterns change with changing aerofoil shapes [1] need to choose very fine discretizations, which can often increase run-times prohibitively.
In the recent past, deep learning has emerged as a competitive way to solve PDEs numerically. We note that the idea of using nets to solve PDEs dates back many decades [2, 3]. In recent times this idea has gained significant momentum and "AI for Science" [4] has emerged as a distinctive direction of research. Some of the methods at play for solving PDEs neurally [5] are the Physics-Informed Neural Networks (PINNs) paradigm [6, 7], the "Deep Ritz Method" (DRM) [8], the "Deep Galerkin Method" (DGM) [9], and many further variations of these ideas [10, 11, 12, 13, 14]. An overarching principle that many of these implement is to constrain the loss function using the residual of the PDE to be solved. These data-driven methods of solving PDEs can broadly be classified into two kinds: (1) ones which train a single neural net to solve a specific PDE, and (2) operator methods, which train multiple nets in tandem to be able to solve a family of PDEs in one shot [15, 16, 17]. The operator methods are particularly interesting when the underlying physics is not known, and state-of-the-art approaches of this type can be seen in works like [18].
For this work, we focus on the PINN formalism from [6]. Many studies have demonstrated the success of this setup in simulating complex dynamical systems like the Navier-Stokes PDE [19, 20, 21], the Euler PDE [22], descriptions of shallow water waves by the Korteweg-de Vries PDE [23], and many more.
The existing literature in classical numerical analysis for estimating the error of approximating PDE solutions falls short in analyzing the performance of PINNs. In Section 3.1, we review this inapplicability of classical results to the specific case of Burgers' PDE that we focus on in this work. The work in [24, 25] provided first-of-its-kind bounds on the generalization error of PINNs for approximating various standard PDEs, including the Navier-Stokes PDE. Such bounds strongly motivate why minimization of the PDE residual at collocation points can be a meaningful way to solve the corresponding PDEs. However, the findings and analysis in [26, 27] point out that the training dynamics of PINNs can be unstable and failure cases can be found among even simple PDE setups. It has also been observed that when trivial solutions exist for the PDE, PINN training can get stuck at those solutions [28, 29]. Work in [30] has shown that traditional ways of training PINNs can violate causality.
However, in all the test cases above the target solutions have always been nice functions. But an interesting possibility with various differential equations representing dynamical systems is that their solutions might have a finite-time blow-up. Blow-up is a phenomenon where the solution u becomes infinite at some points as time t approaches a certain time T < ∞, while the solution is well-defined for all 0 < t < T, i.e.,

lim_{t → T⁻} ∥u(⋅, t)∥_{L∞} = ∞.
One can see simple examples of this fascinating phenomenon: for the ODE du/dt = u², u(0) = u₀ > 0, it is easy to see that the solution u(t) = u₀/(1 − u₀ t) blows up at t = 1/u₀. Wintner's theorem [31] provided a sufficient condition, for a very generic class of ODEs, for the existence of a well-defined solution over the entire time domain, in other words, for the non-existence of a finite-time blow-up. More sophisticated versions of such sufficient conditions for global ODE solutions were subsequently developed in [32] and [33] (Theorem 3.3). Non-existence of finite-time blow-ups has also been studied in control theory [34] under the name of "forward completeness" of a system.
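The blow-up time of this model ODE can also be checked numerically. The following is a minimal sketch (the helper name and the forward-Euler scheme are our own illustrative choices, not from any cited work) that integrates du/dt = u² until the iterate exceeds a large cap and compares the hitting time against the exact blow-up time 1/u₀:

```python
def integrate_blowup(u0, dt=1e-4, cap=1e6):
    """Forward-Euler integration of du/dt = u^2 until u exceeds cap.

    Returns the time at which the numerical iterate first exceeds cap,
    which approximates the exact blow-up time 1/u0 from below/above
    depending on the scheme's lag near the singularity.
    """
    u, t = u0, 0.0
    while u < cap:
        u += dt * u * u
        t += dt
    return t

u0 = 2.0
t_blowup_exact = 1.0 / u0            # exact solution is u(t) = u0 / (1 - u0 * t)
t_blowup_numeric = integrate_blowup(u0)
assert abs(t_blowup_numeric - t_blowup_exact) < 1e-2
```

Because the exact solution grows like 1/(T − t), the time spent above any large threshold is tiny, so even this crude explicit scheme localizes the singularity to within a few multiples of the step size.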
The existence of a blow-up makes PDEs difficult to solve for classical approximation methods. There is a long-standing quest in numerical methods for PDE solving to determine the occurrence, location and nature of finite-time blow-ups [35]. A much-investigated case of blow-up in PDEs is the exponential reaction model u_t = ∆u + λe^u, λ > 0, which was motivated as a model of combustion under the name of the Frank-Kamenetskii equation. The nature of the blow-up here depends on the choice of λ, the initial data and the domain. Another classical example is u_t = ∆u + u^p, and both these semi-linear equations were studied in the seminal works [36, 37] which pioneered systematic research into finite-time blow-ups of PDEs.
To the best of our knowledge, the behaviour of PINNs in the proximity of finite-time blow-up has not received adequate attention in prior work. We note that there are multiple real-world phenomena whose PDE models have finite-time blow-ups, and these singularities are known to correspond to practically relevant processes, such as in chemotaxis models [38, 39, 40, 41] and thermal-runaway models [42, 43, 44, 45, 46].
In light of the recent rise of methods for PDE solving by neural nets, it is natural to ask whether these new methods, in particular PINNs, can be used to reliably solve PDEs near such blow-ups. While a general answer to this is outside the scope of this work, we derive theoretical risk bounds for PINNs which are amenable to being tested against certain analytically describable finite-time blow-ups. Additionally, we give experiments to demonstrate that our bounds retain non-trivial insights even when tested in the proximity of such singularities.
In [22], thought-provoking experimental evidence was given that PINNs could potentially discover PDE solutions with blow-up even when their explicit descriptions are not known. Since finite-time blow-ups are a phenomenon at a particular instant of time (and hence on a measure-zero set), it is surprising that PINN methods can find neural surrogates sensitive to them while having been trained on averaged losses. Thus inspired, here we embark on a program to understand this emerging interface from a rigorous viewpoint and show how well theoretical risk bounds correlate with experimentally observed test errors in certain blow-up situations. As our focus point, we will use reduced models of fluid dynamics, i.e., Burgers' PDE in one and two spatial dimensions. The choice of our test case is motivated by the fact that these PDE setups have analytic solutions with blow-up, as is necessary for a controlled study of PINNs facing such a situation. We note that it is otherwise very rare to know exact fluid-like solutions which blow up in finite time [47, 48].

Notation. In the subsequent sections we use d + 1 to represent dimensions, where d is the number of spatial dimensions and 1 is always the temporal dimension. Nabla (∇) represents the spatial differential operator, i.e., (∂/∂x₁, ..., ∂/∂x_d). For any real function u on a domain D, ∥u∥_{L∞(D)} will represent sup_{x∈D} |u(x)|.

Informal Summary of Our Results
Firstly, we give a brief review of the framework of Physics-Informed Neural Networks (PINNs), which is the focus of this work. Towards that, consider the following specification of a PDE describing a dynamical system, satisfied by an appropriately smooth function u(x, t), where x and t represent the space and time coordinates, subscripts denote the partial differentiation variables, N_x[u] is the nonlinear differential operator, and D is a subset of R^d with a well-defined boundary ∂D:

u_t + N_x[u] = 0 on D × [0, T], u(x, 0) = u_0(x) for x ∈ D, u(x, t) = g(x, t) on ∂D × [0, T].

Following [6], we try to approximate u(x, t) by a deep neural network u_θ(x, t), and then we can define the corresponding residuals as

R_pde(x, t) := ∂_t u_θ(x, t) + N_x[u_θ](x, t), R_t(x) := u_θ(x, 0) − u_0(x), R_b(x, t) := u_θ(x, t) − g(x, t).

Note that the partial derivatives of the neural network u_θ can be easily calculated using auto-differentiation [49]. The neural net is then trained on an empirical loss function L_θ = L_pde + L_t + L_b, where L_pde, L_t and L_b penalize R_pde, R_t and R_b respectively for being non-zero. Typically it takes the form

L_θ = (1/N_r) Σ_{i=1}^{N_r} |R_pde(x_i^r, t_i^r)|² + (1/N_t) Σ_{i=1}^{N_t} |R_t(x_i^t)|² + (1/N_b) Σ_{i=1}^{N_b} |R_b(x_i^b, t_i^b)|²,

where (x_i^r, t_i^r) denote the collocation points, (x_i^t) are the points sampled on the spatial domain for the initial loss, and (x_i^b, t_i^b) are the points sampled on the boundary for the boundary loss. The aim here is to train a neural net u_θ such that L_θ is as close to zero as possible.
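A minimal sketch of how such a loss might be assembled is given below for the 1D inviscid Burgers operator N_x[u] = u u_x, using a tiny randomly initialized tanh network and central finite differences in place of auto-differentiation. All function names are hypothetical, and the initial/boundary data correspond to the illustrative exact solution u(x, t) = −x/(1 − t), which blows up at t = 1:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), np.zeros(16)   # tiny MLP surrogate u_theta(x, t)
W2, b2 = rng.normal(size=16) / 4.0, 0.0

def u_theta(x, t):
    h = np.tanh(W1 @ np.array([x, t]) + b1)
    return W2 @ h + b2

def pde_residual(x, t, eps=1e-4):
    # R_pde = du/dt + u * du/dx (inviscid Burgers); derivatives by central differences
    u = u_theta(x, t)
    u_t = (u_theta(x, t + eps) - u_theta(x, t - eps)) / (2 * eps)
    u_x = (u_theta(x + eps, t) - u_theta(x - eps, t)) / (2 * eps)
    return u_t + u * u_x

def pinn_loss(colloc, init_pts, bdry_pts, u0_fn, g_fn):
    L_pde = np.mean([pde_residual(x, t) ** 2 for x, t in colloc])     # bulk residual
    L_t = np.mean([(u_theta(x, 0.0) - u0_fn(x)) ** 2 for x in init_pts])
    L_b = np.mean([(u_theta(x, t) - g_fn(x, t)) ** 2 for x, t in bdry_pts])
    return L_pde + L_t + L_b

colloc = [(x, t) for x in np.linspace(-1, 1, 5) for t in np.linspace(0, 0.5, 5)]
loss = pinn_loss(colloc, np.linspace(-1, 1, 5),
                 [(-1.0, 0.2), (1.0, 0.2)],
                 u0_fn=lambda x: -x, g_fn=lambda x, t: -x / (1 - t))
assert loss >= 0.0
```

In a real PINN implementation the derivatives would come from auto-differentiation and the loss would be minimized over θ by a gradient-based optimizer; the sketch only shows how the three penalty terms are sampled and combined.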
At the very outset, we note that to the best of our knowledge there are no available off-the-shelf generalization bounds for any setup of PDE solving by neural nets where the assumptions made include any known analytic solution with blow-up for the corresponding PDE. So, as a primary step, we derive new risk bounds for Burgers' PDE in Theorem 4.1 and Theorem 4.2, where viscosity is set to zero and the boundary conditions are consistent with the finite-time blow-up cases of Burgers' PDE that we eventually want to test on. We note that despite being designed to cater to blow-up situations, the bound in Theorem 4.2 is also "stable" in the sense of [50].
Our experiments reveal that for our test case with Burgers' PDE, the theoretical error bounds that we derive‡ do maintain a non-trivial amount of correlation with the L2-distance of the derived solution from the true solution. The plots in Figures 2 and 3 vividly exhibit the presence of this strong correlation between the derived bounds and the true risk, despite the experiments being progressively done on time domains such that the true solution gets arbitrarily close to becoming singular. We will also show that, for the one-dimensional blow-up case that we consider, the time to train the neural surrogate is almost independent of the proximity to the blow-up that we want to solve it to, and also that the derived bounds drop with the width of the net, hence reflecting the benefits of using overparameterized nets.

‡ It is to be noted that it is routine for analytic neural net generalization bounds to be vacuous, particularly for deep nets as considered in the experiments here.
A key feature of our approach to this investigation is that we do not tailor our theory (Theorem 4.1) to the experimental setups we test on later. We posit that this is a fair way to evaluate the reach of PINN theory, whereby the theory is built so that it caters to any neural net and any solution of the PDE, while these generically derived bounds get tested on the hard instances.§

Related Works
To the best of our knowledge, the most general population risk bound for PINNs has been proven in [51]; this result applies to all linear second-order PDEs and is a Rademacher-complexity-based bound. This bound cannot be applied to our study since Burgers' PDE is not linear. The authors in [24] derived generalization bounds for PINNs that, unlike [51], explicitly depend on the trained neural net. They performed the analysis for several PDEs, the "viscous scalar conservation law" being one of them, which includes the (1+1)-dimensional Burgers' PDE. However, for testing against analytic blow-up solutions, we need such bounds at zero viscosity, unlike what is considered therein; and most critically, unlike [24], we keep track of the prediction error at the spatial boundary of the computational domain with respect to non-trivial functional constraints.
The authors in [25] derived a generalization bound for the Navier-Stokes PDE, which too depends on the trained neural net. We note that, in contrast to the approach presented in [25], our method does not rely on the assumption of periodic boundary conditions or of divergence-freeness of the true solution. These flexibilities in our setup ensure that our bound applies to known analytic cases of finite-time blow-ups for the (d+1)-dimensional Burgers' PDE.
Notwithstanding the increasing examples of the success of PINNs, it is known that PINNs can at times fail to converge to the correct solution even for basic PDEs, as reflected in several recent studies characterizing the "failure modes" of PINNs. Studies reported in [27], and more recently in [52], have demonstrated that sometimes this failure can be attributed to problems associated with the loss function, specifically the uneven distribution of gradients across the various components of the PINN loss. The authors in [27] attempt to address this issue by assigning specific weights to certain parts of the loss function, while [53] developed a way to preferentially sample collocation points with high loss and subsequently use them for training. In [26] a similar issue with the structure of the loss function was observed; while not changing the PINN loss function, they introduced two techniques, "curriculum regularization" and "sequence-to-sequence learning", to enhance PINN performance. In [54] PINNs have been analyzed from a neural-tangent-kernel perspective to suggest that PINNs suffer from "spectral bias" [55], which makes them more susceptible to failure in the presence of "high frequency features" in the target function. They propose a method for improving training by assigning weights to individual components of the loss function, aiming to mitigate the uneven convergence rates among the various loss elements.

Inapplicability of Classical Numerical Analysis Error bounds to PINN Experiments
To put the ongoing attempts at theoretical analysis of PINNs in perspective, it is to be noted that, to the best of our knowledge, existing results in numerical analysis cannot be deployed to understand PINN training, as has been the target here. Specifically, for the zero-viscosity Burgers' PDE one can see that in works like [56] the theory does not give a bound on the distance of the solution found by the finite element method from the true solution.
More generally, in works such as Corollary 3.5 in [57], the authors consider a weak solution of the ε-viscosity-regularized Burgers' PDE and derive bounds on the local L^p-distance between the weak solution and the true solution at zero viscosity. There is no obvious way to apply these bounds to a PINN solution, since the trained net has no guarantee of satisfying the conditions required of the surrogate there.
With results like Theorem 2.1 in [58], we observe that these too don't have an obvious way of being applied to PINN experiments, because they need stringent conditions (like the conservativeness property) to be true for the approximant, and there is no natural way to know if the neural surrogate satisfies these conditions. Also, both of the above-cited classical bounds are not tailored to any compact domain, and hence there is no boundary-condition error being tracked there, as in our Theorem 4.2.

§ One can surmise that it might be possible to build better theory exploiting information about the blow-ups, for instance if the temporal location of the blow-up is known. However, it is to be noted that building theory while assuming knowledge of the location of the blow-up might be deemed unrealistic given the real-world motivations for such phenomena.

Results
In the next two subsections, we present the main generalization bounds that we prove for Burgers' PDE being solved by a neural surrogate. Then we experimentally demonstrate the high correlation of these bounds with the measured test error when neural nets are trained to solve for certain exact Burgers' PDE solutions which have a finite-time blow-up, in one and two spatial dimensions.

Generalization Bounds for the (d + 1)-Dimensional Burgers' PDE
The PDE that we consider is the following zero-viscosity Burgers' PDE,

u_t + (u ⋅ ∇)u = 0 on D × [t0, T], with u(x, t0) = u_{t0}(x) for x ∈ D. (2)

Here u : D × [t0, T] → R^d is the fluid velocity and u_{t0} : D → R^d is the initial velocity. Then, corresponding to a surrogate solution u_θ, we define the residuals as

R_pde(x, t) := ∂_t u_θ(x, t) + (u_θ ⋅ ∇)u_θ(x, t), R_t(x) := u_θ(x, t0) − u_{t0}(x).

Corresponding to the true solution u, we define the L2-risk of any surrogate solution u_θ as

R[u_θ] := ∫_Ω ∥u_θ(x, t) − u(x, t)∥² dx dt,

where Ω := D × [t0, T]. Theorem 4.1, which bounds this risk as in equation 5, has been proved in Appendix A.1. We note that the bound presented in equation 5 does not make any assumptions about the existence of a blow-up in the solution, and it is applicable to all solutions that have continuous first derivatives, however large, as is true for the situations very close to blow-up that we consider. Also, we note that the bound in [25] makes assumptions (as reviewed in Section 3) which (even if specialized to zero pressure) prevent it from being directly applicable to the setup above, which can capture analytic solutions arbitrarily close to finite-time blow-up.
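As a quick sanity check of the kind of solutions this setup admits, one can verify numerically that the profile u(x, t) = x/(t − 1) (the one-dimensional test solution used later, extended componentwise) satisfies the zero-viscosity Burgers' residual in any dimension d. A minimal sketch, assuming only NumPy and with derivatives taken by central differences:

```python
import numpy as np

def residual(x, t, eps=1e-6):
    """R_pde = u_t + (u . grad)u for u(x, t) = x / (t - 1), via central differences."""
    u = lambda x_, t_: x_ / (t_ - 1.0)
    d = len(x)
    u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
    conv = np.zeros(d)
    for j in range(d):
        dx = np.zeros(d); dx[j] = eps
        du_dxj = (u(x + dx, t) - u(x - dx, t)) / (2 * eps)   # vector d(u_i)/d(x_j) over i
        conv += u(x, t)[j] * du_dxj                          # accumulate sum_j u_j d(u_i)/d(x_j)
    return u_t + conv

x = np.array([0.3, -0.7, 0.5])       # a d = 3 sample point
r = residual(x, t=0.9)               # close to the blow-up time t = 1
assert np.max(np.abs(r)) < 1e-6
```

The cancellation is exact analytically: u_t = −x/(t − 1)² while (u ⋅ ∇)u = x/(t − 1)², so the residual vanishes even as both terms diverge when t → 1.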
Secondly, note that these bounds are not dependent on the details of the loss function that might eventually be used in the training to obtain u_θ. In that sense, such a bound is more universal than usual generalization bounds, which depend on the loss.
Lastly, note that the inequality proven in Theorem 4.1 bounds the distance of the true solution from a PINN solution in terms of (a) norms of the true solution and (b) various integrals of the found solution, like its norms and unsupervised risks on the computational domain. Hence this is unlike usual generalization bounds proven in the deep-learning theory literature, where the LHS is the population risk and the RHS upper-bounds it by a function that is entirely computable from knowing the training data and the trained net.
Being in the setup of solving PDEs via nets lets us construct such new kinds of bounds, which can exploit knowledge of the true PDE solution.
While Theorem 4.1 is applicable to Burgers' equations in any number of dimensions, it becomes computationally very expensive to compute the bound in higher dimensions. Therefore, in order to better our intuitive understanding, we separately analyze the case of d = 1 in the upcoming Section 4.2. Furthermore, the RHS of (5) only sees the errors at the initial time and in the space-time bulk. In general dimensions it is rather complicated to demonstrate that being able to measure the boundary risks of the surrogate solution can be leveraged to get stronger generalization bounds. But this can be transparently kept track of in the d = 1 case, as we will demonstrate now for a specific case with finite-time blow-up. Along the way, it will also be demonstrated that the bounds possible in one dimension are "stable" in a precise sense, as will be explained after the following theorem.

Generalization Bounds for a Finite-Time Blow-Up Scenario with (1+1)-Dimensional Burgers' PDE

For u : [−1, 1] × [t0, T] → R being at least once continuously differentiable in each of its variables, we consider a Burgers' PDE as follows, on the space domain [−1, 1] and with the two limits of the time domain specified as t0 = −1 + δ and T = δ for any δ > 0,

u_t + u u_x = 0 on [−1, 1] × [t0, T], (6)

with initial and boundary data chosen consistently with the solution named below. We note that in the setup for Burgers' PDE being solved by neural nets that was analyzed in the pioneering work [24], the same amount of information was assumed to be known, i.e., the PDE, an initial condition and boundary conditions at the spatial boundaries. However, here the values we choose for the above constraints are non-trivial and designed to cater to a known solution of this PDE, namely u = x/(t − 1), which blows up at t = 1. For any C1 surrogate solution to the above, say u_θ, its residuals can be written as

R_pde(x, t) := ∂_t u_θ + u_θ ∂_x u_θ, R_t(x) := u_θ(x, t0) − u(x, t0), R_b(±1, t) := u_θ(±1, t) − u(±1, t).

We define the L2-risk of u_θ with respect to the true solution u of equation 6 as

R[u_θ] := ∫_{t0}^{T} ∫_{−1}^{1} (u_θ(x, t) − u(x, t))² dx dt. (10)

Theorem 4.2. Let u ∈ C1([−1, 1] × [t0, T]) be the unique solution of the one-dimensional Burgers' PDE in equation 6. Then, for any surrogate solution to the same PDE, say u* := u_θ*, its risk as defined in equation 10 is bounded as in equation 11.

The theorem above has been proved in Appendix A.3. Note that the RHS of equation 11 is evaluable without exactly knowing the true solution u; the constants in equation 11 only require some knowledge of the supremum value of u at the spatial boundaries and of the behaviour of the first-order partial derivatives of u.
Note that the most natural PINN risk function we can minimize (and what will be used in the experiments) is

L_θ = E[|R_pde|²] + E[|R_t|²] + E[|R_b(−1, ⋅)|²] + E[|R_b(1, ⋅)|²], (13)

where the expectations above are understood to be over separate distributions, as appropriate for each of the terms. In light of this, most importantly, Theorem 4.2 shows that, despite the setting here being one of proximity to finite-time blow-up, the naturally motivated PINN risk as stated above is "(L2, L2, L2, L2)-stable" in the precise sense defined in [50]. This stability property implies that if the PINN risk of the obtained solution is measured to be O(ϵ), then the L2-risk with respect to the true solution (equation 10) is also O(ϵ). And this is determinable without having to know the true solution at test time.
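The four expectation terms above can be estimated by plain Monte-Carlo sampling. The sketch below (our own illustrative code; the function names are hypothetical) does this for the setup of equation 6, using the known true solution u = x/(t − 1) and a slightly perturbed surrogate; the estimated risk is essentially zero for the true solution and strictly larger for the perturbed one:

```python
import numpy as np

rng = np.random.default_rng(1)
delta = 0.5
t0, T = -1.0 + delta, delta          # time domain [t0, T] from the setup of equation 6

u_true = lambda x, t: x / (t - 1.0)
u_sur = lambda x, t: x / (t - 1.0) + 0.01 * np.sin(np.pi * x)   # perturbed surrogate

def mc_pinn_risk(u, n=20000, eps=1e-5):
    """Monte-Carlo estimate of the four-term PINN risk of equation 13."""
    x = rng.uniform(-1.0, 1.0, n)
    t = rng.uniform(t0, T, n)
    u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
    u_x = (u(x + eps, t) - u(x - eps, t)) / (2 * eps)
    r_pde = np.mean((u_t + u(x, t) * u_x) ** 2)                  # E |R_pde|^2
    r_init = np.mean((u(x, t0) - u_true(x, t0)) ** 2)            # E |R_t|^2
    tb = rng.uniform(t0, T, n)
    r_left = np.mean((u(-1.0, tb) - u_true(-1.0, tb)) ** 2)      # E |R_b(-1)|^2
    r_right = np.mean((u(1.0, tb) - u_true(1.0, tb)) ** 2)       # E |R_b(+1)|^2
    return r_pde + r_init + r_left + r_right

assert mc_pinn_risk(u_true) < 1e-8
assert mc_pinn_risk(u_sur) > mc_pinn_risk(u_true)
```

The stability property then says that driving this sampled quantity to O(ϵ) certifies an O(ϵ) L2-distance from the (unknown at test time) true solution.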
In Appendix A.4 we apply quadrature rules to (11) and show a version of the above bound which makes the sample-size dependency of the bound more explicit.

Experiments
Our experiments are designed to demonstrate the efficacy of the generalization error bounds presented above. The novelty of our experimental setup can be seen in light of the brief overview we give below of how demonstrations of deep-learning generalization bounds have been done in the recent past.
In the thought-provoking paper [59], the authors computed their bounds for 2-layer neural nets at various widths to show the non-vacuous nature of their bounds. However, these bounds do not apply to any single neural net but to an expected neural net sampled from a specified distribution. Inspired by these experiments, works like [60] and [61] perform a de-randomized PAC-Bayes analysis of the generalization error of neural nets, which can be evaluated on any given net.
In works such as [62] we see a bound based on a Rademacher analysis of the generalization error, and the experiments were performed on depth-2 nets at different widths to show the decreasing nature of the bound with increasing width, a very rare property for uniform-convergence-based bounds. It is important to point out that the training data is kept fixed while changing the width of the neural net in the setups of [59] and [62].
In [63] the authors instantiated a way to compress nets and computed their bounds on a compressed version of the original net. More recently, in [64] the authors incorporated the sparsity of a neural net alongside a PAC-Bayes analysis to get a better bound on the generalization error. In their experiments, they vary the data size while keeping the neural net fixed, and fortuitously the bound becomes non-vacuous for a certain width of the net.
In this work, we investigate whether theory can capture the performance of PINNs near a finite-time blow-up, and whether larger neural nets can better capture the nature of the generalization error close to the blow-up. To this end, in contrast to the previous literature cited above, we keep the neural net fixed and vary the domain of the PDE. More specifically, we progressively choose time domains arbitrarily close to the finite-time blow-up and test the theory at that difficult edge. In Figures 2a and 2b we see that the LHS and the RHS of equation 11, measured on the trained models, are such that the correlation is very high (∼ 1) over multiple values of the proximity parameter, up to being very close to the blow-up point. We also note that the correlation increases with the width of the neural net, a desirable phenomenon that our bound does capture, albeit implicitly.
In Figure B1 in the appendix, we illustrate that the upper bound derived in Theorem 4.2 does indeed fall over a reasonable range of widths at a fixed δ. The means and standard deviations plotted therein are obtained over six iterations of the experiment at different random seeds. In Figure C1 in the appendix, we show the time needed to train the PINN for the (1+1)-dimensional Burgers' PDE across a range of δ, representing proximity to the blow-up. It is evident from the figure that the training time remains approximately constant across all values of δ.

Testing Against a (2+1)-Dimensional Exact Burgers' Solution with Finite-Time Blow-Up

From [65] we know that there is an exact finite-time blow-up solution for Burgers' PDE in equation 2 for the case of d = 2, where u_i denotes the i-th component of the velocity being solved for. Note that at t = 0 both the velocity components of this solution are smooth, while they eventually develop singularities at t = 1/√2, as is the expected hallmark of non-trivial finite-time blow-up solutions of PDEs. Also note that this singularity is more difficult to solve for, since it blows up as O(1/t²), as compared to the O(1/t) blow-up in the previous section in one dimension. We set ourselves the task of solving for this on a sequence of computational domains with spatial part [0, 1]² and time domains parameterized by δ. Hence we have a sequence of PDEs to solve for, parameterized by δ, with larger δ getting closer to the blow-up. Let g_{x1,0}(x2, t) and g_{x1,1}(x2, t) be the boundary conditions for u1 at x1 = 0 and 1. Let g_{x2,0}(x1, t) and g_{x2,1}(x1, t) be the boundary conditions for u2 at x2 = 0 and 1, and let u_{1,t0} and u_{2,t0}, with t0 = −1/√2 + δ, be the initial conditions for the two components of the velocity field. Hence the PDE we seek to solve is equation 2 at d = 2 with these initial and boundary constraints.

Let N : R³ → R² be the neural net to be trained, with output coordinates labeled as (N_{u1}, N_{u2}). Using this net we define the neural surrogates u_θ = (u_{1,θ}, u_{2,θ}) for solving the above PDE. Correspondingly we define the PDE population risk R_pde, where ν1 is a measure on the whole space-time domain. On each spatial boundary surface crossed with the time interval (the first factor being space and the latter being time), we define R_{s,0} and R_{s,1} corresponding to violation of the boundary conditions. For a choice of measure ν3 on the spatial volume [0, 1]², we define R_t corresponding to violation of the initial conditions u_{t0} = (u_{1,t0}, u_{2,t0}). The population risk we are looking to minimize is then the sum of these terms. We note that for the exact solution given above, the constants in Theorem 4.1 can be evaluated explicitly.

In Figure 3 we see the true risk and the derived bound of Theorem 4.1 for depth-6 neural nets obtained by training on the above loss. The experiments show that the insight from the previous demonstration continues to hold, and even more vividly so. Here, for the experiments at low width (30), the correlation stays around 0.50, and only until δ = 0.307; beyond that it decreases rapidly. However, for experiments at width 100, the correlation remains close to 0.80 for δ much closer to the blow-up at t = 1/√2.

Conclusion
In this work we have taken some first-of-its-kind steps to initiate research into understanding the ability of neural nets to solve PDEs at the edge of finite-time blow-up. Our work suggests a number of exciting directions for future research. Firstly, more sophisticated modifications of the PINN formalism could be found to solve PDEs specifically near finite-time blow-ups.
Secondly, we note that it remains an open question to establish whether there is any PINN risk for the (d+1)-dimensional Burgers' PDE, for d > 1, that is stable in the sense of the condition stated in [50], as was shown to be true for our (1+1)-dimensional Burgers' setup in Theorem 4.2.
In [66] the authors gave numerical studies suggesting that the 3D incompressible Euler PDEs can develop finite-time singularities from smooth initial conditions for the fluid velocity. For their setup of axisymmetric fluid flow, they conjectured a simplified model for the resultant flow near the outer boundary of the cylinder. Self-similar finite-time blow-ups for this model's solutions were rigorously established in [40], and it was shown that an estimate of its blow-up exponent is very close to the measured values for the 3D Euler PDE.
In the seminal paper [67] it was shown that the unique local solution to the 3D incompressible Euler PDEs can develop finite-time singularities despite starting from a divergence-free and odd initial velocity in C^{1,α} with initial vorticity bounded as ∼ 1/(1 + ∥x∥^α). This breakthrough was built upon to prove the existence of a finite-time singularity in the 2D Boussinesq PDE in [68].
In [69] it was highlighted that there is an association between blow-ups in the 3D Euler and 2D Boussinesq PDEs. In [22], the authors investigated the ability of PINNs to detect the occurrence of self-similar blow-ups in the 2D Boussinesq PDE. A critical feature of this experiment was its use of an unconventional regularizer on the gradients of the neural surrogate with respect to its inputs. In light of this, we posit that a very interesting direction of research would be to investigate whether a theoretical analysis of the risk bound for such losses can be used as a method of detection of the blow-up.

In the following theorem we consider t0 = −δ and T = δ for some δ > 0. Here the spatial domain is represented by D ⊂ R^d and Ω represents the whole domain D × [t0, T]. Theorem 4.1. Let d ∈ N and u ∈ C1(D × [t0, T]) be the unique solution of the (d+1)-dimensional Burgers' equation given in equation 2. Then, for any C1 surrogate solution to equation 2, say u_θ, the L2-risk with respect to the true solution is bounded as in equation 5.

4.3.1. The Finite-Time Blow-Up Case of (1+1)-Dimensional Burgers' PDE from Section 4.2

The neural networks we use here have a depth of 6 layers, and we experiment at two distinct uniform widths of 30 and 300 neurons; the training loss is the empirical form of equation 13. For training, we use the full-batch Adam optimizer for 100,000 iterations with a learning rate of 10^{−4}. We subsequently select the model with the lowest training error for further analysis. In Figure 1, plots are shown of the predicted and actual solutions for neural nets solving equation 6 at different values of the δ parameter, and it is clear that the visual resemblance of the neurally derived solution persists even quite close to the blow-up at δ = 1.
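The training recipe just described (full-batch Adam at learning rate 10^{−4}, retaining the lowest-training-error iterate) can be sketched as follows. This is a toy stand-in only: the quadratic objective replaces the actual PINN loss, and everything except the optimizer hyperparameters is a hypothetical illustration:

```python
import numpy as np

def adam_step(theta, grad, m, v, k, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; lr matches the 1e-4 used in the experiments above."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** k)            # bias-corrected first moment
    v_hat = v / (1 - b2 ** k)            # bias-corrected second moment
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy full-batch loss L(theta) = ||theta||^2 standing in for the PINN loss.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
best_theta, best_loss = theta, float(np.sum(theta ** 2))
for k in range(1, 20001):
    grad = 2 * theta                      # full-batch gradient of the toy loss
    theta, m, v = adam_step(theta, grad, m, v, k)
    loss = float(np.sum(theta ** 2))
    if loss < best_loss:                  # keep the lowest-training-error model
        best_loss, best_theta = loss, theta
assert best_loss < np.sum(np.array([1.0, -2.0]) ** 2)
```

The best-iterate selection mirrors the model-selection step in the text; in the actual experiments the loop would run for 100,000 iterations over the empirical loss of equation 13.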

Figure 1: A demonstration of the visual resemblance between the neurally derived solution for equation 6 (left) and the true solution (right) at different values of the δ parameter, getting close to the PDE with blow-up at δ = 1. A PINN with a width of 300 and a depth of 6 was trained to generate the plots on the left.

Figure 2: Demonstration of the presence of high correlation between the LHS (the true risk) and the RHS (the derived bound) of equation (11) in Theorem 4.2, over PDE setups increasingly close to the singularity. Each experiment is labeled with the value of δ in the setup of equation 6 that it corresponds to.

Figure 3: These plots show the behaviour of the LHS (the true risk) and the RHS (the derived bound) of equation (5) in Theorem 4.1 for different values of the δ parameter that quantifies proximity to the blow-up point. In the left plot, each point is marked with the value of δ at which the experiment was done; in the right plot, for clarity, this is marked only for experiments at δ > 1/2.