Beyond dynamics: learning to discover conservation principles

The discovery of conservation principles is crucial for understanding the fundamental behavior of both classical and quantum physical systems across numerous domains. This paper introduces a method that merges representation learning and topological analysis to explore the topology of conservation law spaces. Notably, the robustness of our approach to noise makes it suitable for complex experimental setups, and it extends to the analysis of quantum systems, as we demonstrate. Through a variety of physical simulations, we exemplify our method's potential to unearth previously unknown conservation principles and to support interdisciplinary research. In conclusion, this work emphasizes the significance of data-driven techniques in deepening our comprehension of the principles governing classical and quantum physical systems.


Introduction
Conservation laws and principles lie at the core of our understanding of the physical world, providing insight into the fundamental rules governing the behavior of complex systems. The discovery and application of these principles span various disciplines, such as physics, engineering, biology, and computer science. In recent years, data-driven techniques have emerged as powerful tools for uncovering previously unknown principles and fostering innovative interdisciplinary research.
We present a novel method that tackles the challenge of revealing hidden conservation laws through the synergistic combination of representation learning and topological analysis. Our approach overcomes traditional limitations through its robustness to measurement noise, which is important in itself and also opens the possibility of working with quantum systems.
In this paper, we elaborate on the details of our technique, outline its advantages over existing methods, and demonstrate its potential for uncovering conservation laws in a wide range of settings. By applying our approach to various physical simulations, we showcase its efficacy, broad applicability, and opportunities for interdisciplinary research.
This work contributes to data-driven physics by developing a method for robust inference of conservation laws. In contrast to existing state-of-the-art approaches, this method is robust to statistical noise in measurements, can handle both classical and quantum systems, and can analyze non-Euclidean topological spaces, which leads to a more accurate count of inferred laws. The method has been validated on a set of classical and quantum model systems.
The paper is organized as follows. In section 2 (Related work), we review previous approaches to the discovery of conservation laws, setting the stage for our novel method. Section 3 (Method description) details our data-driven algorithm, which combines representation learning and topological analysis. We report the results of applying our approach to physical simulations in section 4 (Experiments) and analyze the implications, limitations, and future research in section 5 (Discussion). Finally, section 6 (Conclusion) highlights critical insights and emphasizes the potential of our data-driven technique in advancing our understanding of conservation laws in various disciplines.

Related work
Several existing approaches utilize data-driven methods for discovering conservation laws of dynamical systems (Kaiser et al 2018, Cranmer et al 2020, Iten et al 2020, Wetzel et al 2020, Ha and Jeong 2021, Liu and Tegmark 2021, Mototake 2021, Kasim and Lim 2022, Lejarza and Baldea 2022, Liu et al 2022, 2023, Lu et al 2022, Muller 2022, Arora et al 2023, Zhang et al 2023). Specifically, a wide variety of machine learning techniques has been applied in these papers, with examples including the use of conventional neural networks in works such as Ha and Jeong (2021), Mototake (2021), Liu et al (2022), Zhang et al (2023), the utilization of graph neural networks as demonstrated in Cranmer et al (2020), and the application of Siamese networks in Wetzel et al (2020).
Some of these methods focus on finding analytical formulas for conserved quantities, which is referred to as symbolic regression in the scientific literature. Discovered symbolic formulas can subsequently enhance the robustness of integration (Channell and Scovel 1990). It is worth noting that not all approaches consider the physical properties of the data; for instance, Lample and Charton (2019) do not take the data's physical characteristics into account. In contrast, others, such as Udrescu and Tegmark (2020), introduce symbolic regression techniques tailored to the specific physical domain they are dealing with, presenting algorithms designed to uncover symbolic formulas for conserved quantities or dynamic behaviors within the physical context.
Furthermore, it is important to note that certain recent works have limitations that may restrict their practical applications, particularly in regard to the systems they are capable of handling.
Limited number of trajectories or conservation laws: certain approaches, such as Liu and Tegmark (2021), exclusively focus on the manifold associated with a single trajectory. Consequently, these methods are ill-suited for systems characterized by a high number of dimensions, typically around 100. In contrast, approaches like those introduced by Ha and Jeong (2021) and Zhang et al (2023) can only learn a solitary conserved quantity, which poses challenges when dealing with systems featuring numerous conserved quantities.
Exclusivity to certain dynamical systems: some approaches, such as Greydanus et al (2019), Cranmer et al (2020), and Arora et al (2023), are focused on learning the Hamiltonian or the Lagrangian. These methods are therefore confined to a subset of general dynamical systems, albeit a large one. Additionally, these approaches usually require some prior knowledge about the physical system in question, which can be a major drawback when dealing with unknown or vastly complex systems. Moreover, most of the literature does not consider quantum systems, a critical omission given their significance in modern physics.
Accurate experimental data: certain contemporary techniques, such as those described in Champion et al (2019) and Gin et al (2021), can be significantly impacted by data noise and low time resolution. This necessitates the availability of highly accurate data, making the application of these methods more demanding in practical scenarios.
Most of the methods that deal with the description of conservation laws take the expected number of conserved quantities as a hyperparameter. Our method aims to address the aforementioned concerns and mitigate these limitations.

Physical system's dynamics and conserved quantities
The subject of our algorithm is dynamical systems. The states of a system are points in a d-dimensional phase space M. For each dynamical system, there is a (possibly empty) set of conserved quantities H_0(x), …, H_{n−1}(x), which are constant along each trajectory. Given a single trajectory, all conserved quantities restrict it to the isosurface M_h ⊆ M, where h is an n-dimensional vector representing the values of the conserved quantities along the trajectory. The set C formed by the isosurfaces M_h is called the shape space (Lu et al 2022). It is an n-dimensional manifold, since it is parameterized by the n-dimensional vector h. We will learn the number of conserved quantities in the system by learning the manifold C.
Since our algorithm only has the data from sampled trajectories and does not have any information about the conserved quantities, the manifold C will be learned from the perspective of the isosurfaces M_h, represented by the trajectories. To fully discover C, we should sample the trajectories appropriately; that is, all conserved quantities should vary comparably across this set of trajectories. Otherwise, the algorithm will not discriminate some directions of the shape space.
Moreover, each trajectory itself should fully represent the respective M_h, i.e. the following should hold:
1. Different trajectories with the same conserved quantities h should converge to a probability distribution µ_h unique to the isosurface M_h, in the sense that time averages converge to the ensemble average, lim_{T→∞} (1/T) ∫_0^T f(x(t)) dt = ∫_{M_h} f dµ_h for observables f, which means the system should be ergodic (Medio and Lines 2001).
2. Trajectories should be sampled over a sufficiently long time, such that each trajectory approximates µ_h well enough (the limit converges).

Shape space structure encoding
To give a Riemannian metric structure to the manifold C, we consider Wasserstein distances (Panaretos and Zemel 2019) between the distributions µ_h corresponding to all M_h:

W_β(µ_{h1}, µ_{h2}) = ( inf_π ∫ ‖x − y‖^β dπ(x, y) )^{1/β},

where π can be any valid transport plan between µ_{h1} and µ_{h2}. Here, β is a hyperparameter of the algorithm, which we set equal to 2 for all our experiments. We know the distributions µ_h only through the trajectories. Therefore, we adjust the formula for the Wasserstein distance to the discrete distributions ({x_{11}, …, x_{1m_1}} and {x_{21}, …, x_{2m_2}}) representing two trajectories x_1 and x_2 sampled at m_1 and m_2 points respectively:

W_β(x_1, x_2) = ( min_T Σ_{i,j} T_{ij} ‖x_{1i} − x_{2j}‖^β )^{1/β},

where T_{ij} meets the following constraints: T_{ij} ⩾ 0, Σ_j T_{ij} = 1/m_1, Σ_i T_{ij} = 1/m_2. Before computing the Wasserstein distances, we normalize the trajectories such that the mean along each coordinate over all trajectories is 0 and the standard deviation along each coordinate over all trajectories is 1.
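As a concrete illustration, the discrete β-Wasserstein distance between two sampled trajectories can be computed as a small linear program over the transport matrix T_ij. The sketch below uses NumPy and SciPy with uniform weights 1/m_1 and 1/m_2; it is a minimal reference implementation under these assumptions, not the optimized code used in our experiments.

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein(traj1, traj2, beta=2.0):
    """Exact beta-Wasserstein distance between two uniformly weighted
    point clouds (trajectories sampled at m1 and m2 points)."""
    m1, m2 = len(traj1), len(traj2)
    # Cost matrix: ||x_1i - x_2j||^beta for every pair of points.
    cost = np.linalg.norm(traj1[:, None, :] - traj2[None, :, :], axis=-1) ** beta
    # Marginal constraints: each row of T sums to 1/m1, each column to 1/m2.
    A_eq, b_eq = [], []
    for i in range(m1):
        row = np.zeros((m1, m2))
        row[i, :] = 1.0
        A_eq.append(row.ravel())
        b_eq.append(1.0 / m1)
    for j in range(m2):
        col = np.zeros((m1, m2))
        col[:, j] = 1.0
        A_eq.append(col.ravel())
        b_eq.append(1.0 / m2)
    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    # Clamp tiny negative round-off before taking the beta-th root.
    return max(res.fun, 0.0) ** (1.0 / beta)
```

For larger trajectories, a dedicated optimal-transport solver (or an entropic approximation such as Sinkhorn iterations) would be the practical choice; the linear program above is exact but scales poorly in m_1 m_2.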

Metric space approximation and projection (MSAP)
To determine the dimensionality of the shape space C, we look for a Riemannian manifold C′ isometric to C while restricting the dimensionality of C′. The resulting C′ will be the approximation and projection of the shape space into R^d. To do that, we first fix a number l, numbers d_1, …, d_l, d, and activation functions g_1, …, g_l. Then we try to find the appropriate C′ in the set

S_k = { f_θ^{(k)}(X) : X ⊆ R^k, θ ∈ R^{m(k)} },

where f_θ^{(k)} is a fully connected neural network with input dimensionality k, l intermediate layers with d_1, …, d_l neurons and activations g_1, …, g_l, and output dimensionality d; D_p is the Minkowski metric; and m(k) is the number of parameters of f^{(k)}. This way we force S_k to contain only manifolds with dimensionality ⩽ k.
If the neural network f_θ is large enough that the corresponding set of manifolds S_n contains a manifold isometric to C, then the dimensionality of C will equal the smallest k that allows us to identify a suitable C′. In numerous instances, the topology of C′ may play a pivotal role in the desired dimensionality. For instance, a circle resides within a 2-dimensional Euclidean space, yet its representation can be effectively condensed to a single 1D variable.
Aiding the MSAP in finding an optimal approximation for C with non-trivial topology can be achieved by seeking approximations across varied S_k. For instance, requiring X ⊂ S^1 ⊂ R^2 compels the MSAP to pinpoint a periodic 1-dimensional manifold. This consideration becomes particularly salient when the shape space exhibits non-trivial topological properties, a phenomenon we elucidate using oscillating Turing patterns as an illustrative example.
Note that ∀k, ∀X ⊆ R^k, the image f_θ^{(k)}(X) can also be produced by a network on R^{k+1}, with the extra input coordinate fixed at zero and the weights θ placed in the respective coordinates. In fact, f_θ could be chosen from a more general set than the fully connected neural networks. The important conditions are:
• f_θ should be continuous; otherwise, the resulting set could contain manifolds with dimensionality higher than k.
• ∀k : S_k ⊆ S_{k+1}. The role of this condition is shown in the next paragraph.
However, for simplicity, we will stick with the neural networks described above.
To find the appropriate C′ for a given k, we minimize the stress with respect to x_i and θ:

Stress(M, W) = ( Σ_{i<j} (M_{ij} − W_{ij})^2 / Σ_{i<j} W_{ij}^2 )^{1/2},

where W is the Wasserstein distance matrix and M is the Minkowski distance matrix for the target space: M_{ij} = D_p(f_θ^{(k)}(x_i), f_θ^{(k)}(x_j)). The stress is dimensionless, and therefore it is possible to compare it across different systems. This way we learn the latent manifold X and the transformation f_θ such that f_θ(X) approximates C. The overall scheme of the method up to this point is presented in figure 1.
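A simplified version of this optimization can be sketched in NumPy: here we minimize the stress by gradient descent directly over free latent points x_i ∈ R^k, omitting the neural network f_θ and assuming the Euclidean (p = 2) Minkowski metric, purely to keep the sketch short.

```python
import numpy as np

def stress(D_target, X):
    """Normalized (Kruskal) stress between a target distance matrix W and
    the pairwise Euclidean distances of the embedding X."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    iu = np.triu_indices(len(X), k=1)
    return np.sqrt(((D[iu] - D_target[iu]) ** 2).sum() / (D_target[iu] ** 2).sum())

def msap_embed(D_target, k, steps=3000, lr=0.05, seed=0):
    """Gradient descent on the (unnormalized) stress over free points in R^k.
    Simplified sketch: trains the latent points x_i directly and omits the
    neural network f_theta of the full MSAP."""
    rng = np.random.default_rng(seed)
    n = len(D_target)
    X = rng.normal(size=(n, k))
    for _ in range(steps):
        diff = X[:, None, :] - X[None, :, :]
        D = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(D, 1.0)            # avoid division by zero on the diagonal
        coef = (D - D_target) / D
        np.fill_diagonal(coef, 0.0)
        # Gradient of sum_{i,j} (D_ij - W_ij)^2 with respect to x_i.
        grad = 2.0 * (coef[:, :, None] * diff).sum(axis=1)
        X -= lr * grad / n
    return X
```

In the full MSAP, the gradient step would also update the network weights θ, and the final loss reported for each k is the normalized stress above.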
We train such a model for k = 1, …, K (K is a hyperparameter and should be higher than n) and denote by L_1, …, L_K the final losses of the trained models. In the ideal case, L_k > 0 for k < n and L_k = 0 for k ⩾ n. In reality that is improbable, and L_k > 0 for all k; nevertheless, for k ⩾ n, L_k will be small. Moreover, since we are trying to approximate the Riemannian manifold C of dimensionality n, the best approximation also has dimensionality n, so the losses plateau for k ⩾ n. On the other hand, for k < j, L_k ⩾ L_j because S_k ⊆ S_j, and we suppose that for k < j ⩽ n, L_k > L_j strictly. Therefore, after optimization, we should get the following:

L_1 > L_2 > ⋯ > L_n ≈ L_{n+1} ≈ ⋯ ≈ L_K.

From the standpoint of the unknown n, this gives an algorithm to find n: compute the sequence L_1, …, L_K, find the point where it stops decreasing, and the corresponding index will be equal to n. Indeed, K should be greater than n; for example, K equal to the number of coordinates in the phase space will be enough.
To reduce the probability of error, we train each model several times with different initializations and choose the best loss.
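The elbow-finding step can be sketched as a small helper that scans the loss sequence for the first k whose loss already sits at the plateau level; the factor-of-3 threshold is a heuristic borrowed from our noise experiments, not a canonical choice.

```python
import numpy as np

def count_conserved(losses, ratio=3.0):
    """Given the final losses L_1..L_K of the trained MSAP models, return
    the estimated number of conserved quantities: the first k whose loss
    is within `ratio` of the plateau (minimum) loss."""
    losses = np.asarray(losses, dtype=float)
    plateau = losses.min()
    for k, loss in enumerate(losses, start=1):
        if loss <= ratio * plateau:
            return k
    return len(losses)
```

For example, a sequence like [1.0, 0.04, 0.04, 0.035] stops decreasing at k = 2, matching the coupled-oscillator behaviour described in section 4.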

Method pipeline
The MSAP has several hyperparameters. The most important are the parameters controlling the structure of the neural network: the number of layers, the number of neurons in each layer, the activation functions, and the target dimensionality. Combining all steps together, we get the pipeline of the proposed method:
1. The input data consists of N trajectories starting at various initial conditions. Each point of a trajectory is a d-dimensional vector.
2. Normalize the trajectories, compute the pairwise Wasserstein distances between the trajectories, and store them in the matrix W.
3. Run the MSAP algorithm to get the sequence L_1, …, L_K from W.
4. Determine the number n of conservation laws by finding the point where the sequence {L_i} stops decreasing.
As a result, we get the determined number of conserved quantities, as well as the points {x_i} representing the embedding of the trajectories into the low-dimensional metric space R^n.

Experiments
We set up several numerical experiments with dynamical systems with known conserved quantities and tested the proposed method on them. The considered dynamical systems are the harmonic oscillator, the coupled oscillator, the quantum harmonic oscillator, oscillating Turing patterns, and the Korteweg-De Vries equation. We conducted experiments with both noiseless and noisy numerically sampled datasets. Sampling details can be found in the appendix. The code of the model and all experiments is available at the following link.

Coupled oscillator
The coupled oscillator consists of 2 massive blocks connected to each other and to the walls by springs. The coordinates in the coupled oscillator's phase space are the blocks' positions (x_1, x_2) and their momenta (p_1, p_2). Therefore, the phase space of the coupled oscillator is 4-dimensional. For equal masses m and identical spring constants k, the equations of motion of the coupled oscillator are

m ẍ_1 = −k x_1 + k (x_2 − x_1), m ẍ_2 = −k x_2 − k (x_2 − x_1).

The coupled oscillator has two conserved quantities, which are the energies of the two independent oscillation modes:

E_± = (p_1 ± p_2)^2 / (4m) + k_± (x_1 ± x_2)^2 / 4, with k_+ = k and k_− = 3k.

Figure 2 demonstrates the results of our algorithm applied to data collected from the coupled oscillator. The validation loss during training is shown in figure 2(a). We see that the models with k = 1 got a much worse loss than all other models, which all reached a loss of approximately 0.04. That means the resulting losses stop decreasing at k = 2, so that is the final prediction of our algorithm for the number of conserved quantities in the system, which is the correct prediction. The graphs in figures 2(c) and (d) show how the trained latent representation X ∈ R^2 correlates with the conserved quantities.
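The conservation of the two mode energies can be checked numerically. The sketch below integrates the equations of motion with a standard RK4 scheme, assuming unit masses and spring constants (a simplification for illustration; the actual simulation parameters are described in the appendix):

```python
import numpy as np

def rk4_step(f, y, dt):
    """One classical Runge-Kutta 4 step for y' = f(y)."""
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def deriv(y):
    # State y = (x1, x2, p1, p2); unit masses (m = 1) and springs (k = 1).
    x1, x2, p1, p2 = y
    return np.array([p1, p2, -x1 + (x2 - x1), -x2 - (x2 - x1)])

def mode_energies(y):
    """Energies of the symmetric (+) and antisymmetric (-) normal modes."""
    x1, x2, p1, p2 = y
    e_plus = (p1 + p2) ** 2 / 4 + (x1 + x2) ** 2 / 4        # k_+ = 1
    e_minus = (p1 - p2) ** 2 / 4 + 3 * (x1 - x2) ** 2 / 4   # k_- = 3
    return e_plus, e_minus

y = np.array([1.0, -0.3, 0.0, 0.5])
e0 = mode_energies(y)
for _ in range(10000):                 # integrate 10 time units at dt = 1e-3
    y = rk4_step(deriv, y, 1e-3)
e1 = mode_energies(y)
```

Along the integrated trajectory, both E_+ and E_- remain constant to within the integrator's truncation error.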

Coupled oscillator with noise
We tested our algorithm on the coupled oscillator by introducing Gaussian noise into the normalized trajectories. We considered various levels of noise with standard deviations up to 0.8.
We demonstrate the resulting validation losses in figure 2(b). The heuristic blue area on the graph corresponds to loss values that are at least three times lower than the loss for k = 1. When the loss for k = 2 is in this area, the prediction of the algorithm will be correct. We see on the graph that our algorithm is able to determine the 2 conserved quantities in the coupled oscillator for noise levels up to 0.5, which is already a very high level of noise.

Quantum harmonic oscillator
To test our approach on a more complicated system, we applied it to the quantum harmonic oscillator. This is a quantum system described by the Hamiltonian

Ĥ = p̂^2 / (2m) + m ω^2 x̂^2 / 2.

The wavefunction ψ(x, t) evolves according to the Schrödinger equation

iħ ∂ψ/∂t = Ĥ ψ.

The total energy ⟨Ĥ⟩ of the quantum harmonic oscillator is its only conserved quantity.
Figure 3 demonstrates the collected data for the quantum harmonic oscillator as well as the results of our algorithm. The validation loss during training is shown in figure 3(b). We see that for each k, the best models reached a loss of approximately 0.04–0.05. That means the resulting losses stop decreasing at k = 1, so that is the final prediction of our algorithm for the number of conserved quantities in the system, which is the correct prediction. This example with the quantum harmonic oscillator emphasizes the ability of our algorithm to work with quantum systems, which previous algorithms for discovering the number of conserved quantities did not achieve.

Oscillating Turing patterns
We consider oscillating Turing patterns as an example of a system with a large phase space (dim ∼ 100). In particular, the Barrio-Varea-Aragón-Maini model is chosen (Barrio 1999, Aragón et al 2012). Aragón et al (2012) showed that these equations result in oscillating chaotic patterns. There is only one conserved quantity in this system: the spatial phase η of the patterns, which corresponds to the position of the pattern in space. The critical thing about η is its periodic topology. To numerically express the phase space, we discretized it on a mesh of size 50, which resulted in a 100-dimensional phase space.
To help our algorithm learn the 1-dimensional periodic representation better, we created additional models for the k = 1 case. To force the model to learn a periodic embedding, we put the trainable points {x_i} on the circle in 2D: x_1, …, x_N ∈ S^1 ⊂ R^2, thereby training periodic 1-dimensional manifolds. Note that this step could be added to the pipeline, as could similar checks in other dimensions (e.g. for S^2).
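The circle constraint can be implemented by parameterizing each trainable point with an angle φ_i instead of free 2D coordinates, so that the embedding lies on S^1 by construction (a minimal sketch; in the actual model the trainable quantities are the angles, updated by the same stress-based optimization):

```python
import numpy as np

def circle_embedding(phi):
    """Map trainable angles phi_i to points on the unit circle S^1 in R^2,
    so the learned 1-dimensional manifold is periodic by construction."""
    return np.stack([np.cos(phi), np.sin(phi)], axis=-1)

# Distances between embedded points are ordinary Euclidean (chord)
# distances in R^2, which the stress objective compares against W.
phi = np.array([0.0, np.pi / 2, np.pi])
X = circle_embedding(phi)
```

Because the parameterization is 2π-periodic in φ, trajectories whose phases differ by a full turn are mapped to the same point, matching the topology of the spatial phase η.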
Figure 4 demonstrates the results of our algorithm. The validation loss during training is shown in figure 4(a). We see that the periodic 1-dimensional model learned approximately as well as the models for larger k. That means the resulting losses stop decreasing at k = 1, so that is the final prediction of our algorithm for the number of conserved quantities in the system, which is the correct prediction. This example demonstrates that our algorithm can deal with large phase spaces as well as topologically non-trivial shape spaces. Moreover, figure 4(b) shows the periodic 1-dimensional representation of the shape space.

Korteweg-De Vries equation
As discussed by Lu et al (2022), the number of conserved quantities in this system is infinite, but we can consider local features, which describe a finite number of local conserved quantities. We took the data for the KdV equation from Lu et al (2022), consisting of discretized u(x) and ∂u/∂x, which corresponds to three local conserved quantities. The results of training our model on this data are shown in figure 5. We see that our model predicts the correct answer of three conserved quantities in the system, since the loss sequence stops decreasing at k = 3.

Small number of trajectories
Our approach relies on the assumption that the sample of trajectories is sufficiently large to cover the shape space. In general, the number of samples required to represent a manifold grows with the dimensionality of the manifold.
To show this effect in practice, we trained our model on a small sample of 20 trajectories for both the quantum harmonic oscillator and the coupled oscillator. The resulting learning curves are shown in figure 6.
We see that the model did not distinguish k = 1 and k ⩾ 2 for the coupled oscillator (figure 6(a)). This means that the model did not determine the number of conserved quantities in the coupled oscillator. That happened because 20 trajectories is not a representative sample of the coupled oscillator's shape space.
On the other hand, the validation losses for the quantum harmonic oscillator are approximately the same (figure 6(b)), meaning that the model correctly determines that the quantum harmonic oscillator has one conserved quantity. Therefore, even a sample of 20 trajectories is representative of the shape space of the quantum harmonic oscillator.
This example underscores that the number of trajectories required to represent the shape space properly grows with the number of conserved quantities in the system. The heuristic phenomenon known as the curse of dimensionality suggests that this growth is roughly exponential, which means that our approach requires significantly more data for systems with a large number of conserved quantities.

Comparison
On the simple example of the coupled oscillator, our approach showed the ability to discover the underlying principles as well as previous methods, e.g. Iten et al (2020). However, we achieved new, previously unmet results for the more complicated systems. More specifically, in the experiment with the oscillating Turing patterns, we showed the ability of our algorithm to uncover a topologically non-trivial shape space, while the method proposed by Lu et al (2022) was only able to represent the same shape space in a higher dimension. Furthermore, our algorithm was able to discover the number of conserved quantities in a quantum system, which has not been done by previous algorithms. We could not compare our method quantitatively to the other approaches solving the same problem because Iten et al (2020) have not released the code of their approach and the method of Lu et al (2022) has hyperparameters that we could not tune.

Limitations
Our approach has a few limitations. One comes from the assumptions about the dynamical system and the data collected from it. As mentioned above (see section 3.1), the trajectories of the dynamical system should be ergodic. And even though most simple physical systems are ergodic, non-ergodic systems exist, which are interesting to study but not amenable to discovery with our approach. Another aspect of data collection is that all conserved quantities should vary. Moreover, the variations of different quantities should have approximately the same magnitude. Otherwise, the algorithm will not detect directions in the shape space C with a small variance, resulting in an underestimated number of conservation laws.
Moreover, systems without conservation laws (for example, a pendulum with friction) cannot be identified by our algorithm, since the minimal possible prediction of the number of conservation laws for our algorithm is one.
In addition, trajectories should be sampled long enough to represent the respective isosurface well. We still need a clearer criterion for determining whether the time over which we sampled the trajectories was sufficient. However, formulas for the convergence rate of the Wasserstein metric, described by Fournier and Guillin (2014), can help to find the right time to measure or sample the trajectories.
Another limitation comes from the number of trajectories, as discussed in section 4.6. As we have shown in our examples, if the number of conservation laws in a system is low, it is enough to sample 200 trajectories even for systems with a very complicated shape space (e.g. section 4.4). However, we expect the required number of trajectories to grow exponentially with the number of conserved quantities, which makes it computationally more expensive to evaluate our model on systems with a high number of conserved quantities.

Possible applications
The proposed method, combining representation learning and topological analysis, has the potential for broad applicability across various domains. Its ability to uncover underlying conservation principles makes it a valuable tool for numerous fields. In robotics and advanced physical system simulation, researchers can utilize the method to optimize energy efficiency, control, and motion planning, and to improve the accuracy and efficiency of simulations across various physical contexts. Furthermore, this method's applicability extends to control systems, astrophysics, biology, material science, and environmental science. Researchers from these disciplines can leverage this method to better understand the governing principles of the systems they are investigating and drive innovations. Ultimately, this data-driven approach can advance our comprehension of the fundamental tenets of physical systems and beyond.
Notably, the suggested method is computationally inexpensive. Our experiments primarily utilized a single GPU for training. Each of the 1000 individual MSAP models we prepared for various systems/embeddings took about 10 min to train on a single GPU. Including preliminary or failed experiments not reported in the paper, the estimated total computation time was approximately 3000 min (50 h). The method's efficiency reduces costs and increases accessibility for researchers implementing similar techniques.

Conclusion
This work introduces an innovative data-driven Metric Space Approximation and Projection approach for discovering unknown conservation laws and principles in complex physical systems.By combining representation learning techniques with topological analysis, our method is able to robustly extract the topology of conservation law spaces, even in the presence of noise.
We have demonstrated the efficacy and versatility of this approach through experiments on various classical and quantum systems. The results showcase the ability of our technique to correctly determine the number of conserved quantities in systems like the coupled oscillator and the quantum harmonic oscillator. Notably, our method was able to handle experimental noise and still uncover the underlying number of conservation laws.
Furthermore, the proposed approach is computationally efficient and does not rely on extensive hyperparameter tuning or expert knowledge.This makes it widely accessible for researchers across disciplines looking to gain insights into the fundamental governing principles of complex systems they are investigating.
Overall, this work emphasizes the potential of data-driven techniques to advance our understanding of conservation laws in classical and quantum systems. The robustness to noise makes our method particularly well-suited for experimental setups. We believe this approach can drive innovations and interdisciplinary collaborations by helping researchers elucidate the core principles that dictate system behavior across numerous domains. Further refinements of the technique could enhance its applicability to even more intricate systems.

Harmonic oscillator
The results for the harmonic oscillator are shown in figure 7. We see that for each k, the best models reached a loss of approximately 0.007–0.008. That means the resulting losses stop decreasing at k = 1, so that is the final prediction of our algorithm for the number of conserved quantities in the system, which is the correct prediction. The graphs in figures 7(c) and (d) show how the 1- and 2-dimensional latent representations {x_i} correlate with the energy. We see that the 2-dimensional latent space is squeezed towards a line. This graph gives an idea of how the model behaves for k > n.

Figure 1 .
Figure 1. We present in this figure the general idea behind the Metric Space Approximation and Projection. The pairwise Wasserstein distance matrix W is given and has to be approximated. {x_j} and θ are the trainable parameters. To train them, we construct the pairwise Minkowski distance matrix M for f_θ^{(k)}(x_j) and optimize L(θ, {x_j}, W) = Stress(M, W).

Figure 2 .
Figure 2. The results of the algorithm applied to the coupled oscillator. (a): Validation learning curves for all k and all random initializations. (b): Validation losses for experiments with noise. (c), (d): Graphs showing how the 2d latent coordinates represent the data.

Figure 3 .
Figure 3. The results of the algorithm applied to the quantum harmonic oscillator. (a): A few trajectories of the collected data. (b): Validation learning curves for all k and all random initializations.

Figure 4 .
Figure 4. Results of the experiment with oscillating Turing patterns. (a) Validation learning curves for all k and all random initializations. (b) Learned periodic embedding.

Figure 5 .
Figure 5. Loss curves for the experiment with Korteweg-De Vries equation.

Figure 6 .
Figure 6. Results of the experiment with a small number of trajectories. (a) Validation learning curves for the coupled oscillator. (b) Validation learning curves for the quantum harmonic oscillator.

Figure 7 .
Figure 7. The results of the algorithm applied to the harmonic oscillator. (a): A few trajectories of the collected data. (b): Validation loss curves for all k and all random initializations. (c): Graph showing how the 1d latent coordinates represent the data. (d): Graph showing how the 2d latent coordinates represent the data.