Why neural functionals suit statistical mechanics

We describe recent progress in the statistical mechanical description of many-body systems via machine learning combined with concepts from density functional theory and many-body simulations. We argue that the neural functional theory by Sammüller et al. (2023 Proc. Natl Acad. Sci. 120 e2312484120) gives a functional representation of direct correlations and of thermodynamics that allows for thorough quality control and consistency checking of the involved methods of artificial intelligence. Addressing a prototypical system, we here present a pedagogical application to hard core particles in one spatial dimension, where Percus' exact solution for the free energy functional provides an unambiguous reference. A corresponding standalone numerical tutorial that demonstrates the neural functional concepts together with the underlying fundamentals of Monte Carlo simulations, classical density functional theory, machine learning, and differentiable programming is available online at https://github.com/sfalmo/NeuralDFT-Tutorial.


I. INTRODUCTION
The discovery of the molecular structure of matter was still in its infancy when van der Waals predicted in 1893 on theoretical grounds that the gas-liquid interface has finite thickness. The theory is based on a square-gradient treatment of the density inhomogeneity between the coexisting phases [1,2] and it is consistent with van der Waals' earlier treatment of the gas-liquid phase separation in bulk. Both the bulk and the interfacial treatments are viewed as simple yet physically correct descriptions of fundamental phase coexistence phenomena by modern standards of statistical mechanics.
What was unknown then is that an underlying formally exact variational principle exists. This mathematical structure was recognized only much later, first quantum mechanically by Hohenberg and Kohn [3] for the ground state of a many-body system, subsequently by Mermin [4] for finite temperatures, and then classically by Evans [5]. The variational principle forms the core of density functional theory and the intervening history between the quantum [4] and classical milestones [5] is described by Evans et al. [6]; much background of the theory is given in Refs. [7][8][9]. Kohn and Sham [10,11] re-introduced orbitals via an effective single-particle description, which facilitates the efficient treatment of the many-electron quantum problem.
Practical applications of density functional theory require one to make concrete approximations for the central functional. (We recall that a functional maps an entire function to a number.) Quantum mechanically one needs to approximate the exchange-correlation energy functional E_xc[n], as depending on the electronic density profile n(r), and classically one needs to get to grips with the excess (over ideal gas) intrinsic Helmholtz free energy F_exc[ρ], as a functional of the local particle density ρ(r).
Owing to its rigorous formal foundation, density functional theory provides a microscopic, first-principles treatment of the many-body problem. The numerical efficiency of (in practice often approximate) implementations allows for exhaustive model parameter sweeps, for systematic investigation of bulk and interfacial phase transitions, and for the discovery and tracing of scaling laws. Exact statistical mechanical sum rules [20][21][22][23] integrate themselves very naturally into the scheme; they provide consistency checks and can form the basis for refined approximations. Nevertheless, at the core of such studies usually lies an approximate functional and hence resorting to explicit many-body simulations is common in a quest for validation of the predicted density functional results.
In line with topical developments in other branches of science, the use of machine learning is becoming increasingly popular in soft matter research. Recent applications of machine learning range from the characterization of soft matter [24], the reverse-engineering of colloidal self-assembly [25], and local structure detection in colloidal systems [26] to the investigation of many-body potentials for isotropic [27] and for anisotropic [28] colloids. Brief overviews of machine learning in physics [29] and in particular in liquid state theory [30] were given recently.
The quantum mechanical problem was addressed on the basis of machine learning the exchange-correlation potential [43][44][45], testing its out-of-training transferability [43], using a three-dimensional convolutional neural network construct [45], considering hidden messages from molecules [46], and using the Kohn-Sham equations already during training via a regularizer method [47]. The Hamiltonian itself was targeted via deep learning with the aim of efficient electronic-structure calculation [48]. A recent perspective on these and more developments was given by Burke and co-workers [50]. Huang et al. [51] argue prominently that quantum density functional theory plays a special role in the wider context of the use of artificial intelligence methods in chemistry and in materials science.
While the central problem of quantum density functional theory is to deal with the exchange and correlation effects between electrons that are exposed to the external field generated by the nuclei, classical statistical mechanics of soft matter relies on a much more varied range of underlying model Hamiltonians. The effective interparticle interactions in soft matter systems cover a wide gamut of different types of repulsive and attractive, short- and long-ranged, hard-, soft-, and penetrable-core behaviours.
In particular the hard core model plays a special role. For hard core particles the pair potential between two particles is infinite if the particle pair overlaps and it vanishes otherwise. Hard core particles are relatively simple, as temperature becomes an irrelevant variable while the essence of short-ranged repulsion and the resulting molecular packing remain captured correctly [52,53]. The statistical mechanics of the bulk of one-dimensional hard core particles was solved early by Tonks [54]. The free energy functional is known exactly due to Percus [55][56][57][58][59] and his solution provides the general structure and thermodynamics of the system when exposed to an external potential, see Fig. 1 for an illustration. The mathematical form of Percus' free energy functional was one of the sources of inspiration [60] for Rosenfeld's powerful fundamental measure density functional for three-dimensional hard spheres [61][62][63][64][65][66][67][68]. One-dimensional hard rods are also central for nonequilibrium physics [69][70][71][72][73] and the Percus functional forms a highly useful reference for developing and testing machine learning techniques in classical density functional theory [32,33,[36][37][38].
In recent work, de las Heras et al. [40] and Sammüller et al. [41] have put forward machine learning strategies that operate on the one-body level of correlation functions. Here we address in detail the neural functional theory [41] for inhomogeneous fluids in equilibrium. We argue that this approach constitutes a neural network-based theory, where multiple different and mutually intimately related neural functionals form a genuine theoretical structure that permits investigation, testing, and ultimately the gaining of profound insight into the nature of the coupled many-body physics. Thereby the training is only required for a single neural network, from which all further neural functionals are then created in straightforward ways. The method allows for multi-scale application [41], as is pertinent for many areas of soft matter [74][75][76]. It is furthermore applicable to general interactions, as exemplified by successfully addressing a supercritical Lennard-Jones fluid [41], thus complementing analytical efforts to construct density functional approximations. Such work was based, e.g., on hierarchical integral equations [77,78], on functional renormalization group methods [79][80][81], and on fundamental measure theory [82][83][84].

FIG. 1. Illustration of hard rods in one spatial dimension that are exposed to a position-dependent external potential V_ext(x). In response to the external influence a spatially inhomogeneous density profile ρ(x) emerges in equilibrium at temperature T and chemical potential µ. The particles with position coordinates x_i and particle index i = 1, . . ., N have radius R and diameter σ = 2R. A configuration is forbidden (bottom row) if any two particles overlap, i.e., if their mutual distance is smaller than the particle diameter σ.
Here we use the one-dimensional hard core model to illustrate the key concepts of the neural functional theory, as the required sampling can be performed easily and Percus' functional provides an analytical structure that we can relate to the neural theory. The Percus functional is one of the very few general classical free energy density functionals that is analytically known for a continuum model (see e.g. also Refs. [85,86]) and this fact provides further motivation for our study. A hands-on tutorial that demonstrates the key concepts of constructing a neural direct correlation functional, generating the required data from Monte Carlo simulations, testing against a numerical implementation of the Percus functional, and working with automatic differentiation is available online [42].
The paper is structured into individual subsections, as described in the following; each subsection is self-contained to a significant degree such that Readers are welcome to select the description of those topics that match their own interests and individual backgrounds. An overview of key concepts of the one-body neural functional approach is given in Sec. I A. This hybrid method draws on classical density functional concepts, as summarized in Sec. I B. Functional differentiation and integration methods are described in Sec. I C.
Readers who are primarily interested in the use of machine learning may want to skip the above material and rather start with Sec. II A, where we describe how to construct and train the neural correlation functional on the basis of many-body simulation data. We concentrate on the specific model of one-dimensional hard core particles and complement and contrast the neural functional with the known exact analytical results for this model, as described in Sec. II B. Model applications for predicting inhomogeneous systems based on neural density functional theory are described in Sec. II C.
Several methods of neural functional calculus are described in Sec. III. Manipulating the neural correlation functional by functional integration and automatic functional differentiation is described in Secs. III A and III B, respectively. The application of Noether sum rules as a standalone means for quality control of the neural network is presented in Sec. III C. Functional integral sum rules are shown in Sec. III D. A brief overview of key concepts of neural functional representations in nonequilibrium is presented in Sec. IV. We give conclusions in Sec. V.

A. Neural functional concepts
The neural functional framework [41] rests on a combination of simulation, density functional theory, and machine learning. Data that characterizes the underlying many-body system is generated via grand canonical Monte Carlo simulations of well-defined, but random external conditions. Based on these results the one-body direct correlation functional is constructed as a neural network that accepts as an input the relevant local section of the density profile. This method allows for very efficient data handling as only short-ranged correlations contribute; Fig. 2 depicts an illustration.
The neural one-body direct correlation functional c_1(r, [ρ]) forms the mother network for the subsequent functional calculus. Automatically differentiating the mother network with respect to its density input yields the two-body direct correlation functional c_2(r, r′, [ρ]) as a daughter functional. Two-body direct correlations are central in liquid state theory [8] and they are here represented by a standalone numerical object that is created via straightforward application of automatic differentiation. This workflow is very different from, and arguably much simpler in practice than, the standard technique of carrying out the functional differentiation analytically and then implementing the resulting expression(s) via numerical code.
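To make the mother-daughter relationship concrete, the following minimal sketch differentiates a stand-in for a trained c_1 network with respect to its discretized density input, which directly yields the matrix of two-body direct correlations on the grid. The function `c1_network` is a hypothetical smooth surrogate, not the actual trained network of Ref. [41].

```python
import jax
import jax.numpy as jnp

# Stand-in for a trained neural c1 functional: maps a discretized density
# profile rho (values on a grid) to c1 at every grid point. This toy
# surrogate is nonlocal (via the kernel w) and nonlinear (via rho**2).
def c1_network(rho):
    n = rho.shape[0]
    idx = jnp.arange(n)
    w = jnp.exp(-jnp.abs(idx[:, None] - idx[None, :]) / 3.0)
    return -0.1 * (w @ rho) - rho**2

# The daughter functional c2 is the Jacobian of c1 with respect to the
# density input: c2_ij = d c1_i / d rho_j, obtained by autodiff.
c2_of = jax.jacrev(c1_network)

rho = jnp.linspace(0.1, 0.5, 8)
c2 = c2_of(rho)
print(c2.shape)  # (8, 8): one derivative per pair of grid points
```

In a continuum setting the Jacobian entries would additionally carry a factor of the inverse grid spacing; the point here is only that the daughter object is created by a single autodiff call rather than by analytical differentiation.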

FIG. 2. Schematic: Monte Carlo (MC) simulation data for the density profile ρ(r), obtained at given V_ext(r), µ, and T, are used for machine learning (ML) the one-body direct correlation functional c_1(r).

Differentiating the daughter network yields a granddaughter network, which represents the three-body direct correlation functional c_3(r, r′, r′′, [ρ]). Again this is an independent and standalone numerical computing object. Very little is known about three-body direct correlations, with e.g. Rosenfeld's early investigation for hard spheres [61] and the freezing studies by Likos and Ashcroft [87,88] being notable exceptions. The neural functional method [41] offers arguably unprecedented detailed access.
Tracing the genealogy in the reverse direction requires functional integration, which is a general and standard technique in functional calculus. In the present case again a quasi-standalone numerical object can be built based on mere network evaluation and standard numerical integration, both of which are fast operations. In this way functionally integrating the mother one-body direct correlation functional creates, as the grandmother, the excess free energy functional F_exc[ρ]. This mathematical object is the ultimate generating functional in classical density functional theory for all n-body direct correlation functions [5,8,9]. We give more details about the interrelationships within the family of functionals below in Sec. I C.
When applied to the three-dimensional hard sphere fluid and restricted to planar geometry, such that the density distribution is inhomogeneous only along a single spatial direction, the neural functional theory outperforms the best available hard sphere density functional (the formidable White Bear Mk. II fundamental measure theory [65]) in generic inhomogeneous situations. For spatially homogeneous fluids the neural functional even surpasses the "very accurate equation of state" [8] by Carnahan and Starling [52], despite the fact that no explicit information about any bulk fluid properties was used during training.
Formulating reliable strategies for how to test machine-learning predictions constitutes in general a complex yet very important task, not least in the light of the ongoing and projected increased use of artificial intelligence in science [51]. The neural functional theory offers a wealth of concrete self-consistency checks besides the standard benchmarking techniques. Commonly, and following best practice in machine learning, benchmarking is performed by dividing the reference data, as here obtained from many-body simulations, into training, validation, and test data. The simulations in the test data set have not been used during training and hence can serve to assess the performance of the trained network. In our present model application, we can perform testing directly with respect to the exact Percus theory.
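The standard split described above can be sketched in a few lines; the fractions and the number of simulated profiles are illustrative choices, not those of Ref. [41].

```python
import numpy as np

# Benchmarking split of the simulation reference data into training,
# validation, and test sets (80/10/10 is an illustrative choice).
rng = np.random.default_rng(0)
n_profiles = 100                      # number of simulated systems
idx = rng.permutation(n_profiles)     # shuffle before splitting
train, val, test = np.split(idx, [int(0.8 * n_profiles), int(0.9 * n_profiles)])

# the test simulations never enter training and serve as the benchmark
print(len(train), len(val), len(test))  # 80 10 10
```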
Assessing extrapolation capabilities beyond the underlying data set requires the availability of further reference data. In Ref. [41] this is provided by comparing (favourably) against a highly accurate bulk equation of state [89], as well as against free energy reference results obtained from simulation-based thermodynamic integration of inhomogeneous systems.
However, due to its computational efficiency the neural approach allows one to make predictions for system sizes that significantly outscale the dimensions of the original simulation box. Ref. [41] describes systems of micron-sized colloids confined between parallel walls with macroscopic separation distance. The density profile is resolved over a system size of 1 mm with nanometric precision on a numerical grid with 10 nm spacing. Such "simulation beyond the box" is not only powerful in terms of the multiscale description of soft matter [74][75][76], but it also serves as a template for the more general situation of using artificial intelligence methods far outside their original training realm.
In order to provide quality control, the neural functional theory hence allows one to carry out a second type of test. This is less generic than the above benchmarking but it can nevertheless provide inspiration for machine learning in wider contexts. In the present case, the specific statistical mechanical nature of the underlying equilibrium many-body system implies far-reaching mathematical structure, as it lies at the very heart of statistical mechanics. Specifically, it is the significant body of equilibrium sum rules that provides formally exact interrelations between different types of correlation functions. These sum rules hold universally, i.e., independent of the specific inhomogeneous situation that is under consideration, and they hence constitute formally exact relationships between functionals.
As the neural functional theory expresses direct correlation functions using neural network methods, the sum rules directly translate to identities that connect the different neural functionals and their integrated and differentiated relatives with each other. Crucially, these connections have both a different mathematical form and a different physical meaning as compared to the bare genealogy provided by the automatic functional differentiation and functional integration. Without overstretching the analogy, one could view the sum rules as genetically testing the entire family for the absence of inheritable disease. While the body of statistical mechanical sum rules is both significant and diverse [20][21][22][23], here we rely on the recent Noether invariance theory [90][91][92][93][94][95][96][97][98] as a systematic means to create both known and new functional identities from the thermal invariance of the underlying statistical mechanics [90,91]. In particular, from invariance against local shifting one obtains sum rules that connect different generations of direct correlation functionals with each other in both locally-resolved and global form. We present exemplary cases below in Sec. III C. Generic sum rules that emerge from the mere inverse relationship of functional integration and functional differentiation are presented in Sec. III D.

B. Introduction to classical density functional theory
We give a compact account of some key concepts of classical density functional theory; for more details see Refs. [5][6][7][8][9]. Readers who are primarily interested in machine learning of neural functionals can skip this and the next subsection and directly proceed to Sec. II.
In a statistical mechanical description of a many-body system the local density acts as a generic order parameter that measures the probability of finding a particle at a specific location. The formal definition of the one-body density distribution as a statistical average is:

ρ(r) = ⟨ Σ_i δ(r − r_i) ⟩,    (1)

where the sum over i runs over all N particles, r_i is the position coordinate of particle i = 1, . . ., N, and δ(·) indicates the Dirac distribution, here in three dimensions.
The angles indicate a thermal average over microstates, which can, e.g., be efficiently carried out in Monte Carlo simulations.
For completeness, we give a formal description of the equilibrium average based on the grand ensemble, where it is defined as ⟨·⟩ = Tr · e^{−β(H−µN)}/Ξ. Here the inverse temperature is β = 1/(k_B T), with the Boltzmann constant k_B and absolute temperature T, the Hamiltonian H, chemical potential µ, and grand partition sum Ξ. The classical trace is defined as

Tr · = Σ_{N=0}^∞ [1/(h^{3N} N!)] ∫ dr^N dp^N · ,

where h denotes the Planck constant and dr^N dp^N is a shorthand for the high-dimensional phase space integral over all particle positions and momenta. Pedagogical introductions can be found in standard textbooks [8] and an introductory compact account together with a description of the force point of view is provided in Ref. [91].
The Hamiltonian has the following standard form:

H = Σ_i p_i²/(2m) + u(r^N) + Σ_i V_ext(r_i),    (2)

where p_i is the momentum of particle i, m is the particle mass, the interparticle interaction potential u(r^N) depends on all position coordinates r^N = r_1, . . ., r_N, and V_ext(r) is an external potential energy function that depends on position r. Hence the sum in Eq. (2) comprises kinetic, interparticle, and external energy contributions. For the common case of particles interacting via a pair potential ϕ(r) that only depends on the interparticle distance r, the interparticle energy reduces to u(r^N) = Σ_{ij(≠)} ϕ(|r_i − r_j|)/2, where the double sum runs only over distinct particle pairs ij with i ≠ j and the factor 1/2 corrects for double counting.
For the ideal gas the interparticle interactions vanish, u(r^N) ≡ 0, and the density profile is given by the generalized barometric law [8]:

ρ_id(r) = Λ^{−d} e^{β(µ − V_ext(r))},    (3)

where Λ denotes the thermal de Broglie wavelength, which in the present classical case can be set to Λ = σ, with σ denoting the particle size; for simplicity of notation we here use Λ = 1; we have indicated the spatial dimensionality by d.
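The barometric law is simple to evaluate numerically; a minimal sketch (with Λ = 1 and an illustrative linear external potential, both assumptions of this example):

```python
import numpy as np

# Generalized barometric law for the ideal gas (Lambda = 1):
# rho_id(x) = exp(beta * (mu - V_ext(x)))
beta = 1.0   # inverse temperature 1/(k_B T)
mu = 0.5     # chemical potential
x = np.linspace(0.0, 10.0, 101)
V_ext = 0.2 * x                       # illustrative gravity-like potential
rho_id = np.exp(beta * (mu - V_ext))  # density decays where V_ext grows

# sanity check: where V_ext = 0 the value is the bulk density exp(beta*mu)
assert np.isclose(rho_id[0], np.exp(beta * mu))
```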
Taking the logarithm of Eq. (3) and collecting all terms on the left hand side gives the following ideal gas chemical potential balance:

ln ρ_id(r) + βV_ext(r) − βµ = 0.    (4)

For a mutually interacting system, where u(r^N) ≠ 0, Eq. (4) will not be true when replacing the ideal density profile ρ_id(r) by the true density profile ρ(r) as formally given by Eq. (1). Rather the sum of the three terms on the left hand side of Eq. (4) will not vanish, but yield a nontrivial contribution:

ln ρ(r) + βV_ext(r) − βµ = c_1(r),    (5)

where the one-body direct correlation function c_1(r) is in general nonzero and arises due to the presence of interparticle interactions in the system. (For hard core systems c_1(r) typically features negative values.) The machine learning strategy described below in Sec. II A is based on this pragmatic access to data for c_1(r), as obtained by direct simulation of ρ(r) on the basis of explicitly carrying out the average in Eq. (1) for a given form of V_ext(r) and prescribed values of the thermodynamic parameters µ and T. As the one-body direct correlation function is central in the neural functional theory, we combine Eqs. (4) and (5), which yields the following equivalent form for the one-body direct correlation function:

c_1(r) = ln[ρ(r)/ρ_id(r)],    (6)

where ρ_id(r) is given by Eq. (4) with Λ = 1. Equation (6) has the direct interpretation of c_1(r) as the logarithm of the ratio of the actual density profile and the density profile of the ideal gas under identical conditions, as given by the external potential and thermodynamic statepoint. In alternative terminology [8] one defines the intrinsic chemical potential as µ_int(r) = µ − V_ext(r). The intrinsic chemical potential and the one-body direct correlation function are related trivially to each other via µ_int(r) = k_B T [ln ρ(r) − c_1(r)], as is obtained straightforwardly by re-arranging Eq. (5).
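The pragmatic access to c_1(x) can be sketched numerically: given a sampled density profile together with the known V_ext(x), µ, and T, one evaluates Eq. (5) pointwise. The sampled profile below is a synthetic stand-in for Monte Carlo data, chosen so that the result is known by construction.

```python
import numpy as np

# c1(x) = ln rho(x) + beta*V_ext(x) - beta*mu, evaluated pointwise from
# a "measured" density profile (here a synthetic stand-in for MC data).
beta, mu = 1.0, 0.5
x = np.linspace(0.0, 10.0, 101)
V_ext = 0.2 * x

# stand-in for sampled data: ideal profile suppressed by a constant factor
rho_sampled = 0.8 * np.exp(beta * (mu - V_ext))

c1 = np.log(rho_sampled) + beta * V_ext - beta * mu
# by construction rho differs from the ideal gas profile by the factor 0.8,
# so c1 equals the constant ln(0.8) < 0 everywhere
```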
The practical, computational, and conceptual advantage of density functional theory lies in avoiding the explicit occurrence of the high-dimensional phase space integral that underlies thermal averages; we recall the definition of the density profile (1) as such an expectation value. Instead, and without any principal loss of information, one works with functional dependencies. In contrast to mere point-wise dependencies, such as those between the functions ρ(r), V_ext(r), and c_1(r) that hold at each point r, see Eq. (5), a functional dependence is on the entirety of a function and it has in general a nonlocal and nonlinear structure.
Density functional theory is specifically based on the fact [3][4][5] that for a given type of fluid, as characterized by its interparticle interaction potential u(r^N), and known thermodynamic parameters µ and T, the form of the density profile ρ(r) is sufficient to determine the entirety of the external potential V_ext(r). Hence a unique functional map exists [3][4][5]:

ρ → V_ext.    (7)

Here we omit the position arguments on both sides to reflect in the notation that the functional map relates the entirety of the density profile to the entirety of the external potential. Applying Eq. (7) to the external potential, as it occurs in Eq. (5), implies that the left hand side is determined from knowledge of the density profile alone, in principle without any need for a priori knowledge of the form of V_ext(r). Via the identity Eq. (5) we can conclude the existence of the map:

ρ → c_1,    (8)

where the entirety of the density profile determines the entirety of the direct correlation function. As a consequence the one-body direct correlation function actually is a density functional, c_1(r, [ρ]), where the brackets indicate the functional dependence, i.e. on the entirety of the argument function, here ρ(r). We will discuss below more explicitly that the dependence is effectively short-ranged for the case of short-ranged interparticle interaction potentials and that this can be exploited to great effect in the neural network methodology.
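The short-rangedness of the functional dependence is what makes a local network input feasible: the network predicting c_1 at position x need only see the density values within a window around x. A minimal sketch of building such windowed inputs on a grid (the function name `density_windows` and the edge-padding boundary handling are illustrative assumptions, not the scheme of Ref. [41]):

```python
import numpy as np

# Build, for every grid point, the local density window that a neural
# c1 functional would receive as input.
def density_windows(rho, half_width_bins):
    """Return an (n, 2*half_width_bins+1) array of local density windows."""
    n = len(rho)
    padded = np.pad(rho, half_width_bins, mode="edge")  # crude boundary handling
    return np.stack([padded[i:i + 2 * half_width_bins + 1] for i in range(n)])

rho = np.linspace(0.1, 0.5, 9)
windows = density_windows(rho, half_width_bins=2)
print(windows.shape)  # (9, 5): one 5-bin window per grid point
```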

C. Density functional derivatives and integrals
While we have emphasized above the role of the one-body direct correlation functional c_1(r, [ρ]), primarily due to c_1(r) being directly measurable via Eq. (6), in standard accounts of classical density functional theory one typically rather starts with a parent functional, the excess free energy functional F_exc[ρ]. The relationship of F_exc[ρ] and c_1(r, [ρ]) is established via functional calculus. Functional differentiation, see Ref. [9] for a practitioner's account, yields additional position dependence and we use the notation δ/δρ(r) to denote the functional derivative with respect to the function ρ(r). Functional integration is the inverse operation. We give a brief description of the functional relationships in the following. An overview is illustrated in Fig. 3 and we will return for a broader account below in Sec. III.
The method of automatic differentiation [99] is an integral part of the new computing paradigm of differentiable programming [100]. Automatic differentiation is based on a powerful set of techniques and it differs from both symbolic differentiation, as facilitated by computer algebra systems, and from numerical differentiation via finite differences, as is computational bread and butter. As shown in the tutorial [42], only high-level code is required to invoke automatic differentiation, and both neural and analytical functionals can be differentiated with little effort. As the derivative (of the functional) is with respect to its entire input data, the method constitutes a representation of a genuine functional derivative.
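The distinction from finite differences can be made concrete on a toy discretized "functional"; a minimal sketch, assuming the simple quadratic form F[ρ] = Σ_i ρ_i² Δx chosen purely for illustration:

```python
import jax
import jax.numpy as jnp

dx = 0.1  # grid spacing

# Toy discretized functional F[rho] = sum_i rho_i^2 * dx
def F(rho):
    return jnp.sum(rho**2) * dx

rho = jnp.array([0.2, 0.5, 0.8])

# automatic differentiation: exact derivative dF/drho_i = 2*rho_i*dx
grad_auto = jax.grad(F)(rho)

# numerical differentiation via central finite differences, for comparison
eps = 1e-4
grad_fd = jnp.array([(F(rho.at[i].add(eps)) - F(rho.at[i].add(-eps))) / (2 * eps)
                     for i in range(3)])
```

The autodiff result is exact to machine precision, while the finite-difference estimate carries truncation and round-off error; both agree here to within the chosen tolerance.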
We give an overview. In the present context the functional calculus that relates the one-body direct correlations to the parent excess free energy functional is given by the following functional integration and functional differentiation relations:

βF_exc[ρ] = −∫dr ρ(r) ∫_0^1 da c_1(r, [ρ_a]),    (9)

c_1(r, [ρ]) = −δβF_exc[ρ]/δρ(r).    (10)

In Eq. (9) we have parameterized the general formal integral by using ρ_a(r) as a scaled version of the density profile, with the simple linear relationship ρ_a(r) = aρ(r). Hence the parameter value a = 0 corresponds to vanishing density and a = 1 reproduces the target density profile, as it occurs in the argument of βF_exc[ρ] on the left hand side of Eq. (9). We emphasize that the integral over a in Eq. (9) is a simple one-dimensional integral over the coupling parameter a. The consistency between Eqs. (9) and (10) is demonstrated below in Sec. III D.
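The coupling-parameter integration can be carried out with elementary quadrature. A minimal sketch, using the toy local closure c_1(x, [ρ]) = −ρ(x) as a stand-in for a neural functional, for which βF_exc = (1/2)∫dx ρ(x)² holds exactly and serves as a check:

```python
import numpy as np

# Stand-in for a (neural) one-body direct correlation functional
def c1_toy(rho):
    return -rho   # toy local closure; exact result: beta*F_exc = 0.5*∫ rho^2

x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
rho = 0.3 + 0.1 * np.sin(2 * np.pi * x)   # target density profile

# inner integral over the coupling parameter a (midpoint rule),
# evaluating c1 at the scaled densities rho_a = a*rho
da = 0.02
a_nodes = np.arange(0.5 * da, 1.0, da)
inner = sum(c1_toy(a * rho) for a in a_nodes) * da

# outer spatial integral of Eq. (9)
beta_F_exc = -np.sum(rho * inner) * dx

exact = 0.5 * np.sum(rho**2) * dx
print(abs(beta_F_exc - exact))  # quadrature error, essentially zero here
```

For a trained network the only change would be to replace `c1_toy` by network evaluations at the scaled density profiles, which is fast since no retraining is involved.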
The perhaps seemingly very formal functional calculus acquires new and pressing relevance in light of the neural functional concepts of Ref. [41], which allow one to work explicitly with both functional derivatives and functional integrals; these can be evaluated efficiently via the corresponding standalone neural functionals.
In light of these benefits it is fortunate that the functional differentiation-integration structure extends recursively to higher orders of correlation functions. The next level beyond Eqs. (9) and (10) involves the two-body direct correlation functional c_2(r, r′, [ρ]) and the integration and differentiation structure is as follows:

c_1(r, [ρ]) = ∫dr′ ρ(r′) ∫_0^1 da c_2(r, r′, [ρ_a]),    (11)

c_2(r, r′, [ρ]) = δc_1(r, [ρ])/δρ(r′),    (12)

and we refer to Refs. [8,9,101,102] for background.
We can chain the functional derivatives together by inserting c_1(r, [ρ]) as given by Eq. (10) into the definition (12) of c_2(r, r′, [ρ]). In parallel, we can also chain the functional integrals in Eqs. (9) and (11). These procedures yield the following second order functional integration and differentiation relationships:

βF_exc[ρ] = −∫dr ∫dr′ ρ(r) ρ(r′) ∫_0^1 da ∫_0^a da′ c_2(r, r′, [ρ_a′]),    (13)

c_2(r, r′, [ρ]) = −δ²βF_exc[ρ]/[δρ(r)δρ(r′)],    (14)

where the scaled density profile in Eq. (13) is ρ_a′(r) = a′ρ(r). The double parameter integral in Eq. (13) can be further simplified [7], as described at the end of Sec. III D. The generalization of Eq. (14) to the n-th functional derivative defines the n-body direct correlation functional, which remains functionally dependent on the density profile and which possesses spatial dependence on n position arguments. Although increasing n yields objects that very rapidly move out of any practical reach, the neural functional concept provides much fuel for making progress. While we do not cover c_3(r, r′, r′′, [ρ]) here, Sammüller et al. have demonstrated its general accessibility and physical validity for bulk fluids in Ref. [41].
We have so far focused on the properties of the intrinsic excess free energy functional F_exc[ρ] and its density functional derivatives. This is natural as classically F_exc[ρ] is the central object that contains the effects of the interparticle interactions and thus depends in a nontrivial way on its input density profile. The functional F_exc[ρ] is intrinsic in the sense that it is independent of external influence. We recall that we here work in the grand ensemble (see e.g. Refs. [103][104][105][106] for studies addressing the canonical ensemble of fixed particle number). Hence the appropriate thermodynamic potential is the grand canonical free energy or grand potential. This is required in order to determine ρ(r).
When expressed as a density functional the grand potential consists of the following sum of ideal, excess, external, and chemical potential contributions:

Ω[ρ] = F_id[ρ] + F_exc[ρ] + ∫dr ρ(r)[V_ext(r) − µ].    (15)

The form of the ideal gas free energy functional is explicitly known as

βF_id[ρ] = ∫dr ρ(r){ln[Λ^d ρ(r)] − 1},

and the third term in Eq. (15) contains the effects of the external potential V_ext(r) and of the particle bath at chemical potential µ.
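Evaluating the grand potential on a grid is straightforward once F_exc[ρ] is available. A minimal sketch, assuming Λ = 1 and, purely for illustration, a vanishing excess contribution (ideal gas), for which the minimizing profile is the constant ρ = e^{βµ}:

```python
import numpy as np

# beta*Omega[rho] = ∫dx rho*(ln rho - 1) + beta*F_exc[rho]
#                 + beta*∫dx rho*(V_ext - mu)        (Lambda = 1)
beta, mu = 1.0, 0.5
x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
V_ext = np.zeros_like(x)   # field-free case for illustration

def beta_F_exc(rho):
    return 0.0             # ideal gas: no excess contribution

def beta_Omega(rho):
    ideal = np.sum(rho * (np.log(rho) - 1.0)) * dx
    external = beta * np.sum(rho * (V_ext - mu)) * dx
    return ideal + beta_F_exc(rho) + external

# variational principle: the equilibrium profile minimizes Omega
rho_eq = np.full_like(x, np.exp(beta * mu))
assert beta_Omega(rho_eq) < beta_Omega(1.1 * rho_eq)
assert beta_Omega(rho_eq) < beta_Omega(0.9 * rho_eq)
```

Any nearby trial profile gives a larger grand potential than the equilibrium one, which is the content of the variational principle stated next.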
The variational principle of classical density functional theory [4,5,105] ascertains that

δΩ[ρ]/δρ(r)|_{ρ=ρ_0} = 0 (min),    (16)

Ω[ρ_0] = Ω_0 (min).    (17)

Equations (16) and (17) imply that the grand potential becomes minimal at ρ_0(r), which is the real, physically realized density profile, and Ω_0 is the equilibrium value of the grand potential. Recall that based on the many-body picture we have Ω_0 = −k_B T ln Ξ with the grand ensemble partition sum Ξ = Tr e^{−β(H−µN)}. We have used the subscript 0 to denote equilibrium but we drop this elsewhere in our presentation to simplify notation.
Inserting Eq. (15) into Eq. (16) and using the explicit form of the ideal free energy functional together with the definition Eq. (10) of c_1(r, [ρ]) leads to Eq. (5) with the one-body direct correlations expressed as a density functional, as anticipated in Sec. I B. Exponentiating and regrouping the terms then yields the following popular form of the Euler-Lagrange equation:

ρ(r) = exp[−βV_ext(r) + βµ + c_1(r, [ρ])].    (18)

Equation (18) is a self-consistency relation that can be solved efficiently for the equilibrium density profile ρ(r) via iterative methods, as detailed below in Sec. II C. A prerequisite is that c_1(r, [ρ]) is known, usually as an approximation that is obtained from an approximate excess free energy functional F_exc[ρ] via functional differentiation according to Eq. (10). Having obtained a numerical solution of Eq. (18) for the density profile, this can then be inserted into the grand potential functional (15) to obtain full thermodynamic information via Eq. (17), which by construction is consistent with the density profile.
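The iterative solution can be sketched as a Picard iteration with mixing for stability. As a stand-in for the neural (or Percus) functional we again use the toy local closure c_1(x, [ρ]) = −ρ(x); the mixing parameter and iteration count are illustrative choices:

```python
import numpy as np

# Picard iteration of the Euler-Lagrange equation:
# rho(x) = exp(-beta*V_ext(x) + beta*mu + c1(x, [rho])),
# with linear mixing between old and new profiles for stability.
beta, mu, alpha = 1.0, 0.5, 0.1   # alpha: mixing parameter
x = np.linspace(0.0, 1.0, 101)
V_ext = 0.2 * np.sin(2 * np.pi * x) ** 2

rho = np.full_like(x, 0.5)        # initial guess
for _ in range(500):
    rho_new = np.exp(-beta * V_ext + beta * mu - rho)  # toy c1 = -rho
    rho = (1 - alpha) * rho + alpha * rho_new

# at convergence the Euler-Lagrange equation is satisfied pointwise
residual = np.max(np.abs(rho - np.exp(-beta * V_ext + beta * mu - rho)))
print(residual)
```

With a trained network one would replace the toy closure by an evaluation of the neural c_1 functional at the current density profile in each iteration step.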
We demonstrate in the following how this classical functional background can be put to formidable use via hybridization with simulation-based machine learning. As our aim is pedagogical, we choose the one-dimensional hard core system as a concrete example to demonstrate the general methodology [41]. We complement the neural functional structure with a description of Percus' analytical solution, which then allows for mirroring of the neural theory.

II. NEURAL FUNCTIONAL THEORY
Jerry Percus famously wrote in the abstract of his 1976 statistical mechanics landmark paper [55]: "The external field required to produce a given density pattern is obtained explicitly for a classical fluid of hard rods. All direct correlation functions are shown to be of finite range in all pairs of variables." Here we relate his achievement to the neural functional theory, which allows one to reproduce numerically a variety of properties of the exact solution. We emphasize that the neural functional theory remains generic in its applicability to further model fluids; see the Supplementary Information of Ref. [41] for the successful treatment of the supercritical Lennard-Jones fluid in three dimensions. We refer the reader to the provided online resources [42] for a programming tutorial on the concrete application of the following concepts. Figure 4 shows a schematic of the workflow that is inherent in the neural functional concept, as described in the following. Many-body simulations under randomized conditions are used to sample statistically averaged and spatially resolved data that characterize the inhomogeneous response of the considered system. A neural network is then trained to represent the direct correlation functional, which is subsequently applied numerically and via neural functional methods to investigate the physics of the system in the desired target situations.

A. Training the neural correlation functional
The classical fluid of hard rods that Percus considers has one-dimensional position coordinates xi, with particle index i = 1, . . ., N, and a pairwise interparticle interaction potential ϕ(x), which is infinite if the distance x between two particles is smaller than their diameter, x < σ, and which vanishes otherwise. The system is exposed to an external potential Vext(x), which is a function of position x across the system, and this in general creates an inhomogeneous "density pattern" ρ(x).
We adjust the definition (1) of the density distribution to the present one-dimensional case:

ρ(x) = ⟨Σi δ(x − xi)⟩,  (19)

where δ(·) here indicates the Dirac distribution in one dimension and the brackets indicate a grand canonical thermal average. Due to the hard core nature of the model, the statistical weight of each "allowed" microstate is particularly simple and given by exp[−β Σi Vext(xi) + βµN]/Ξ, where Ξ is a normalizing factor. Allowed microstates are those for which all distinct particle pairs ij are spaced far enough apart, |xi − xj| ≥ σ. If even a single overlap occurs, then the microstate is "forbidden", as the interparticle potential becomes formally infinite, which creates vanishing statistical weight; we recall the illustration in Fig. 1. Despite the apparent simplicity of the many-body probability distribution, the statistical mechanics of the hard rod model is nontrivial. The particles interact nonlocally over the lengthscale σ and the external potential has no restrictions on its shape or on the lengthscale(s) of variation. Hence features such as jumps and positive infinities that represent hard walls are allowed. In bulk, Vext(x) = 0, and the solution is straightforward [8,54]. The general case is however highly nontrivial, which makes Percus' above quoted opening a very remarkable one. We present more details of his work further below, after first laying out the general machine learning strategy of Ref. [41]. This neural functional method is neither restricted to hard cores nor to one-dimensional systems, but addressing this case here is useful to highlight the salient features of the approach.
We aim for explicitly sampling the microstates of the system according to their probability distribution via particle-based simulations. This can be implemented efficiently, and for the present introductory purposes also in an intuitively accessible way, via grand canonical Monte Carlo (GCMC) sampling. Excellent accounts of this method are given in Refs. [8,107-109]. Briefly, a Markov chain of microstates is constructed where, based on a given configuration, a trial step is proposed, which is accepted with a probability given by the Metropolis function min[1, exp(−β∆E)], where ∆E is the energy change associated with the trial move.
Three trial moves are used in the simplest yet powerful scheme: i) Selecting one particle i randomly and displacing it uniformly within a given maximal cutoff distance. If the displacement creates overlap, then the trial move is discarded. If otherwise there is no overlap in the new configuration, the energy difference is due to only the external potential, ∆E = Vext(x′i) − Vext(xi), where the prime denotes the trial position of particle i. ii) A new particle j is inserted at a random position xj with an energy change that accounts for both the external potential and the chemical equilibrium with the particle bath, hence ∆E = Vext(xj) − µ. iii) Correspondingly, a randomly selected particle i is removed from the system. The acceptance of the removal happens again with a probability given by the Metropolis function, with energy difference ∆E = −Vext(xi) + µ.
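The three trial moves can be sketched in a few lines of Python. This is a minimal illustration of the scheme just described, not the tutorial code of Ref. [42]; the function names, the box setup with hard walls at 0 and L, and all parameter values are our own choices, and ideal-gas prefactors are taken to be absorbed into µ, following the simplified Metropolis criterion stated above.

```python
import numpy as np

# Minimal GCMC sketch for 1D hard rods (diameter sigma). Illustrative only:
# names and parameters are our own, not those of the tutorial of Ref. [42].
rng = np.random.default_rng(0)
sigma = 1.0   # hard rod diameter
L = 10.0      # box length, hard walls at 0 and L
beta = 1.0    # inverse temperature
mu = 2.0      # chemical potential (beta*mu = 2)
dx_max = 0.5  # maximal displacement

def V_ext(x):
    return 0.0  # free choice; the hard walls are enforced geometrically below

def overlaps(x, xs):
    """True if a rod centered at x overlaps any rod in xs."""
    return np.any(np.abs(xs - x) < sigma)

def gcmc_step(xs):
    """One trial move: i) displacement, ii) insertion or iii) removal."""
    move = rng.integers(3)
    if move == 0 and len(xs):            # i) displace a random particle
        i = rng.integers(len(xs))
        x_new = xs[i] + rng.uniform(-dx_max, dx_max)
        others = np.delete(xs, i)
        if sigma/2 <= x_new <= L - sigma/2 and not overlaps(x_new, others):
            dE = V_ext(x_new) - V_ext(xs[i])          # trial minus original
            if rng.random() < min(1.0, np.exp(-beta*dE)):
                xs[i] = x_new
    elif move == 1:                      # ii) insert at a random position
        x_new = rng.uniform(sigma/2, L - sigma/2)
        if not overlaps(x_new, xs):
            dE = V_ext(x_new) - mu
            if rng.random() < min(1.0, np.exp(-beta*dE)):
                xs = np.append(xs, x_new)
    elif move == 2 and len(xs):          # iii) remove a random particle
        i = rng.integers(len(xs))
        dE = -V_ext(xs[i]) + mu
        if rng.random() < min(1.0, np.exp(-beta*dE)):
            xs = np.delete(xs, i)
    return xs

xs = np.array([])                        # start from the empty box
for _ in range(20000):
    xs = gcmc_step(xs)
```

By construction the chain never visits a forbidden microstate: overlapping trial configurations are discarded outright, which implements the formally infinite interparticle potential.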
Despite its conceptual simplicity, GCMC is a very powerful method for the investigation of complex effects [107-109], and significant extensions exist both in the form of histogram techniques [108,109] and in the tailoring of more complex and collective trial moves. Investigating a typical physical problem, as specified by the interparticle interactions u(r^N) and the type of considered external influence, such as walls as represented by a model form of Vext(r), requires e.g. scanning of the thermodynamic parameters and acquiring good enough statistics at each statepoint. Our ultimate goal (Sec. II C) is to perform these tasks with a significant gain in efficiency via the neural theory; we re-iterate the availability via Ref. [42] of hands-on code examples for the present hard rod model.
We base the training on the following rewriting and adaptation of the chemical potential balance Eq. (5) to the one-dimensional system:

c1(x) = ln[ρ(x)Λ] − βµ + βVext(x).  (20)

All quantities on the right hand side are either prescribed a priori or are accessible via the GCMC simulations: Specifically, the density profile ρ(x) is obtained by filling a position-resolved histogram according to the encountered microstates as specified by the particle coordinates xi. We recall the formal definition (19) of ρ(x) via the Dirac distribution, which in practice is discretized such that a sufficient finite spatial resolution, say 0.01σ, is obtained. This "counting" method is arguably the most intuitive one to obtain data for the density profile. As an aside, there are a number of force-sampling techniques that can improve the statistical variance significantly [97,110-112] and that can also serve to gauge the quality of sampling of the equilibrium ensemble [97].
While the issues of Monte Carlo sampling efficiency and of quality assessment of thermal averages can be pertinent in higher dimensions and in physically more complex situations, the simplicity of the present one-dimensional hard core model makes counting according to Eq. (19) an appropriate choice to obtain data for ρ(x). Then adding up the three contributions on the right hand side of Eq. (20) yields results for c1(x). We proceed at this data-generation stage somewhat heretically and ignore at first the central role that c1(x) plays for the physics of inhomogeneous systems.
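The adding up of Eq. (20) amounts to only a few array operations. The following sketch is our own illustration: the grid, the choice Λ = σ, and the masking convention are assumptions. As a bookkeeping check it uses the ideal gas, for which c1(x) must vanish identically.

```python
import numpy as np

# Sketch of Eq. (20), c1(x) = ln[rho(x)*Lambda] - beta*mu + beta*V_ext(x),
# on a grid, masking bins where rho(x) = 0 or V_ext(x) = infinity (there the
# equation is not defined). Lambda is set to sigma = 1 (our convention).

def c1_from_profile(rho, beta_mu, beta_Vext, Lambda=1.0):
    rho = np.asarray(rho, dtype=float)
    beta_Vext = np.asarray(beta_Vext, dtype=float)
    valid = (rho > 0) & np.isfinite(beta_Vext)
    c1 = np.full_like(rho, np.nan)
    c1[valid] = np.log(rho[valid] * Lambda) - beta_mu + beta_Vext[valid]
    return c1, valid

# Consistency check on the ideal gas (no interactions): there c1(x) = 0,
# since rho(x) = Lambda^{-1} exp(beta*mu - beta*V_ext(x)).
x = np.arange(0.005, 10.0, 0.01)            # grid with bin size 0.01 sigma
beta_mu = 2.0
beta_Vext = 0.5 * np.sin(2 * np.pi * x / 10.0)
rho_ideal = np.exp(beta_mu - beta_Vext)     # Lambda = 1
c1, valid = c1_from_profile(rho_ideal, beta_mu, beta_Vext)
# -> c1 vanishes to machine precision on the entire valid grid
```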
In contrast to the typical deterministic setup for investigating a specific physical situation described above, training the neural network proceeds on the basis of randomized situations rather than with the ultimate application in mind; we recall the illustration of the neural functional workflow shown in Fig. 4. The motivation for using this strategy comes from the goal of capturing via the machine learning the intrinsic direct correlations of the many-body system, which then transcend the specific inhomogeneous situations that were under consideration during training. Figure 5 depicts an illustration of the topology of the trained central neural network c1(x, [ρ]) and its relation to the physical input and output quantities, i.e., to ρ(x) and c1(x).
We hence perform a sequence of simulation runs, where each run has an input value βµ(k) and an input functional shape βVext(k)(x), both of which are generated randomly. Specifically, we combine sinusoidal functions with periodicities that are commensurate with the box length L, linear discontinuous segments, and hard walls in the creation of Vext(k)(x); see Refs. [41,42] for further details. The superscript k enumerates the different GCMC simulation runs and in practice we perform 512 of these. The result is a set of corresponding density profiles ρ(k)(x). We then use Eq. (20) to obtain for each run the one-body direct correlation profile from simply adding up:

c1(k)(x) = ln[ρ(k)(x)Λ] − βµ(k) + βVext(k)(x).

As a result of the simulation protocol we have generated a bare data set {βµ(k), βVext(k)(x), ρ(k)(x), c1(k)(x)} for all positions x and for all different runs k. As a practical detail, this requires excluding regions where ρ(x) = 0 and Vext(x) = ∞.
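A generator of randomized external potentials in the spirit described above might look as follows. The amplitudes, the numbers of sinusoidal and linear terms, and the wall widths are our own illustrative choices, not the actual protocol of Refs. [41,42].

```python
import numpy as np

# Illustrative generator of randomized external potentials: sinusoids with
# periodicities commensurate with the box, discontinuous linear segments,
# and hard walls of random width. All ranges below are our own choices.
rng = np.random.default_rng(1)
L = 10.0
x = np.arange(0.005, L, 0.01)

def random_beta_Vext(x, n_sin=3, n_seg=2):
    V = np.zeros_like(x)
    for _ in range(n_sin):                  # commensurate sinusoids
        n = rng.integers(1, 5)
        V += rng.uniform(-1, 1) * np.sin(2*np.pi*n*x/L + rng.uniform(0, 2*np.pi))
    for _ in range(n_seg):                  # discontinuous linear pieces
        a, b = np.sort(rng.uniform(0, L, size=2))
        mask = (x > a) & (x < b)
        V[mask] += rng.uniform(-1, 1) + rng.uniform(-0.5, 0.5) * (x[mask] - a)
    w = rng.uniform(0.1, 1.0)               # hard walls of random width w
    V[(x < w) | (x > L - w)] = np.inf
    return V

beta_mu = rng.uniform(-1, 3)                # randomized chemical potential
beta_Vext = random_beta_Vext(x)
```

Each call yields one randomized pair (βµ(k), βVext(k)(x)) that would be handed to a GCMC run.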
In order to address our declared goal of learning a functional dependence of c1(x), we have to carve out a nontrivial dependence relationship and hence restrict the data input. Motivated by the physics, one might take the scaled chemical potential βµ(k) and the scaled external potential βVext(k)(x) to be the true mechanical origin of the shape of the direct correlation function c1(k)(x). However, the insights provided by density functional theory hint at the fact that this is not the best possible choice of functional relationship to consider.
We recap that the GCMC simulations yield data according to:

{βµ(k), βVext(k)(x′)} → ρ(k)(x),  (21)

where the curly brackets indicate all function values inside of the system box, with ranges 0 ≤ x′ ≤ L and 0 ≤ x ≤ L; the arrow indicates an input-output relationship. Applying Eq. (20) to the entire data set also allows one to take the direct correlation function as the output, according to:

{βµ(k), βVext(k)(x′)} → c1(k)(x).  (22)

If one were to mimic the simulations directly by the neural network, one would be tempted to base the training directly upon Eq. (22). In less clear-cut machine-learning situations than the one considered here, it can be a standard strategy to attempt to represent the causal relationship that governs the complex mathematical or real-world system under consideration by a surrogate artificial intelligence model. The present functional formulation of statistical mechanics hints at potential caveats, such as the necessity of dealing with the full input and output data sets (parameter ranges of x and x′) across the entire system. Furthermore, the specific physics of the mutually interacting rods appears to play no role. The density functional-inspired training (Sec. I B) proceeds very differently. We here take a pragmatic stance and attempt to create via training a neural representation of the dependence of c1(x) on ρ(x) alone. This leads to a surrogate model c1(x, [ρ]) based on the following mapping:

{ρ(k)(x′)} → c1(k)(x),  (23)

where the input on the left hand side consists of the function values ρ(k)(x′) that lie inside the density window centered at x, i.e., only the values at positions x′ within the narrow interval x − xc ≤ x′ ≤ x + xc. Here xc is a cutoff parameter that for short-ranged interparticle potentials is of the order of the particle size. For the present one-dimensional hard core system we set xc = σ. Instead of having to output an entire function, as would be the case when attempting to learn via Eq. (22), here the output is merely the single value of the direct correlation function at the center of the density window. We recall that this target value
is obtained from the simulation data via Eq. (20), such that c1(k)(x) is available for each run k. A simple GCMC code is provided online [42], along with a pre-generated simulation data set and a pre-trained neural functional.
We choose the loss function to be the mean squared error of the neural network output compared to the simulation reference value for c1(x), as obtained via Eq. (20). As a further metric to gauge the training progress, we make use of the mean absolute error of reference and output. Both choices are standard [100]. The quadratic loss is convenient as it is analytic, and hence the machine-learning gradient-based methods apply directly. The mean absolute error is nonanalytic due to the modulus involved, but it is a useful supporting quantity that has a very direct interpretation.
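Both metrics are one-liners; the following sketch, with made-up numbers standing in for network output and simulation reference, merely fixes the definitions used above.

```python
import numpy as np

# Mean squared error: the differentiable loss that is minimized during training.
# Mean absolute error: a monitoring metric with a direct interpretation as the
# typical deviation of the prediction from the reference.

def mse(pred, ref):
    return np.mean((pred - ref) ** 2)

def mae(pred, ref):
    return np.mean(np.abs(pred - ref))

pred = np.array([0.10, -0.20, 0.05])   # hypothetical network output for c1(x)
ref  = np.array([0.12, -0.18, 0.00])   # hypothetical simulation reference
loss = mse(pred, ref)
metric = mae(pred, ref)
```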
After training, the mean absolute error was of the order of ∼ 0.013, which implies that the neural network prediction deviates on average by this value from the simulation data. Although the simulation data carry some statistical noise, its effect is comparatively small when taking the numerical solution of the Percus theory (detailed below) as the reference.
Our training data consists of 512 simulation runs using a simulation box size of L = 10σ.Each of the simulation runs requires only about three minutes runtime on a single CPU core of a standard desktop machine.
We use a standard fully-connected artificial neural network with three hidden layers that respectively possess 128, 64 and 32 nodes. We use 201 input nodes to represent the density profile in a finite window of size 2σ (corresponding to the cutoff xc = σ on either side of the window center) at a spatial bin size of 0.01σ, where we recall that σ is the particle size. To accommodate the local functional mapping, we reshape the training data into input density windows and corresponding output values of c1(x), where we also apply twofold data augmentation by exploiting the mirror symmetry of the simulation results. Excluding regions where Vext(x) = ∞ and hence where Eq. (20) is not defined, this results in ∼ 10⁶ input-output pairs.
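The topology and the window reshaping can be illustrated in plain numpy. The real tutorial [42] uses a deep-learning framework, so the forward pass below (random untrained weights; softplus activations as one possible choice) is only a shape demonstration: windows of 201 density bins each map to a single c1 value.

```python
import numpy as np

# Sketch of the input pipeline and network topology described above; our own
# minimal illustration, not the tutorial implementation. Window: cutoff
# x_c = sigma on either side, bin size 0.01 sigma -> 201 input nodes.
bins_per_sigma = 100
half = bins_per_sigma              # window covers x - sigma ... x + sigma
n_in = 2 * half + 1                # 201 input nodes

def windows(rho):
    """Reshape a density profile into per-bin input windows (zero-padded edges)."""
    padded = np.pad(rho, half)
    return np.stack([padded[i:i + n_in] for i in range(len(rho))])

def mlp_forward(w_list, b_list, X):
    """Fully connected forward pass with softplus hidden activations."""
    for W, b in zip(w_list[:-1], b_list[:-1]):
        X = np.logaddexp(0.0, X @ W + b)          # softplus(z) = ln(1 + e^z)
    return (X @ w_list[-1] + b_list[-1]).ravel()  # linear output: one c1 value

rng = np.random.default_rng(2)
sizes = [n_in, 128, 64, 32, 1]                    # three hidden layers
w_list = [rng.normal(0, 1/np.sqrt(m), size=(m, n))
          for m, n in zip(sizes[:-1], sizes[1:])]
b_list = [np.zeros(n) for n in sizes[1:]]

rho = np.full(1000, 0.5)            # some density profile on a 10 sigma grid
X = windows(rho)                    # shape (1000, 201)
c1_pred = mlp_forward(w_list, b_list, X)   # one c1 value per grid point
# twofold data augmentation via mirror symmetry: also train on X[:, ::-1]
```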
From the above description, and without considering the background in density functional theory, it is not evident that the training will be successful and minimize the loss satisfactorily to yield a trained network c1(x, [ρ]). From a mathematical point of view, this raises the question of whether a corresponding object c1(x, [ρ]) indeed exists and whether it is unique. And if so, is its structure simple enough that it can be written down explicitly?

B. Percus' exact direct correlation functional
Due to Percus' singular achievement [55], the one-body direct correlation functional c1(x, [ρ]) for interacting hard rods in one spatial dimension is known analytically, and this has triggered much subsequent progress; see e.g. Refs. [56-58,60-65]. The functional dependence on the density profile is nonlocal, as one would expect from the fact that the rods interact over the finite distance σ, and it is also nonlinear, as is consistent with the behaviour of a nontrivially interacting many-body system. The spatial dependence is characterized by convolution operations which, despite performing the task of coarse-graining, retain the full character of the microscopic interactions. The Percus functional provided motivation for developing so-called weighted-density approximations (WDA) [8], where the density profile is convolved with one or several weight functions that are then further processed to give the ultimate value of the density functional.
We here give the Percus direct correlation functional in Rosenfeld's geometry-based fundamental measure representation; see Ref. [59] for a historical perspective. Instead of working with the particle diameter σ as the fundamental lengthscale, Rosenfeld rather bases his description on the particle radius R = σ/2, which allows one to find deep geometric meaning in Percus' expressions and also to generalize to higher dimensions [60,61,65].
The exact form [60] of the one-body direct correlation functional is analytically given as the following sum:

c1(x) = −[Φ0(x − R) + Φ0(x + R)]/2 − ∫_{x−R}^{x+R} dx′ Φ1(x′).  (24)

Here the two functions Φ0(x) and Φ1(x) each depend on two weighted densities n0(x) and n1(x) in the following form:

Φ0(x) = −ln(1 − n1(x)),  (25)
Φ1(x) = n0(x)/[1 − n1(x)].  (26)

The weighted densities n0(x) and n1(x) are obtained from the bare density profile via spatial averaging:

n0(x) = [ρ(x − R) + ρ(x + R)]/2,  (27)
n1(x) = ∫_{x−R}^{x+R} dx′ ρ(x′).  (28)

The discrete spatial averaging at positions x ± R in the weighted density (27) parallels that in the first term of Eq. (24). Similarly, the position integral over the interval [x − R, x + R] in Eq. (28) appears analogously in the second term of Eq. (24). These similarities are not by coincidence. The structure is rather inherited from the grandmother (excess free energy) functional, as is described in Sec. III A.
Having the analytical solution (24)-(28) for c1(x, [ρ]) allows for carrying out numerical evaluations and comparing against results from the neural functional c1(x, [ρ]). The range of nonlocality, i.e., the distance across which information of the density profile enters the determination of c1(x, [ρ]) via Eqs. (24)-(28), is strictly finite, as announced in Percus' abstract [55]. As two averaging operations, each with range ±R, are chained together, the composite procedure has a range of ±2R = ±σ, in line with our truncation of the density profiles in the training data sets according to Eq. (23). A numerical implementation of Percus' direct correlation functional is available online [42].
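As an illustration of such an implementation, the following sketch evaluates the Percus/Rosenfeld expressions on a uniform grid and checks the bulk limit, where c1 must reduce to ln(1 − η) − η/(1 − η) with packing fraction η = ρσ. The discretization choices (trapezoidal rule, zero padding at the box edges) are our own, and the stated forms of the weighted densities and of Φ0, Φ1 follow the conventions assumed in this section.

```python
import numpy as np

# Percus direct correlation functional on a grid (our own sketch, not the
# online implementation [42]). Assumed conventions:
# n0(x) = [rho(x-R) + rho(x+R)]/2, n1(x) = int_{x-R}^{x+R} dx' rho(x'),
# Phi0 = -ln(1-n1), Phi1 = n0/(1-n1),
# c1(x) = -[Phi0(x-R) + Phi0(x+R)]/2 - int_{x-R}^{x+R} dx' Phi1(x').
dx = 0.01            # grid spacing in units of sigma
R = 0.5              # particle radius; sigma = 2R = 1
sR = round(R / dx)   # shift by R in grid points

def shift_avg(f, sR):
    """[f(x - R) + f(x + R)]/2 with zero padding outside the grid."""
    out = np.zeros_like(f)
    out[sR:] += 0.5 * f[:-sR]
    out[:-sR] += 0.5 * f[sR:]
    return out

def box_int(f, sR, dx):
    """int_{x-R}^{x+R} dx' f(x') via the trapezoidal rule on the grid."""
    kernel = np.full(2*sR + 1, dx)
    kernel[0] = kernel[-1] = dx / 2
    return np.convolve(f, kernel, mode="same")

def c1_percus(rho):
    n0 = shift_avg(rho, sR)                 # Eq. (27)
    n1 = box_int(rho, sR, dx)               # Eq. (28)
    Phi0 = -np.log(1.0 - n1)                # Eq. (25)
    Phi1 = n0 / (1.0 - n1)                  # Eq. (26)
    return -shift_avg(Phi0, sR) - box_int(Phi1, sR, dx)   # Eq. (24)

rho_b = 0.4
rho = np.full(2000, rho_b)                  # constant profile: bulk limit
c1 = c1_percus(rho)
# deep in the bulk, c1 -> ln(1 - eta) - eta/(1 - eta) with eta = rho_b*sigma
```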

C. Application inside and beyond the box
Actually making predictions for the hard rod model is now straightforward, as we can resort to density functional theory and its standard use in application to physical problems. The arguably most common method for solving the Euler-Lagrange equation self-consistently is based on Eq. (18), which we re-express for the one-dimensional case considered here:

ρ(x) = Λ⁻¹ exp(βµ − βVext(x) + c1(x, [ρ])).  (29)

We recall that the range of nonlocality of c1(x, [ρ]) is limited to only the particle size σ and that we were able to extract the functional dependence from simulation data obtained by sampling in boxes of size L. Although subtle finite-size effects could in principle imprint the value of L on c1(x, [ρ]), the size L of the original simulation box has otherwise vanished, and the neural functional in Eq. (29) is fit for use to predict properties of much larger systems. As an example, Ref. [41] demonstrates scaling up by a factor of 100 from the original simulation box to the predicted system of three-dimensional hard spheres under gravity.
The numerical solution of Eq. (29) can be efficiently performed on the basis of Picard iteration, where an initial guess of the density profile is inserted on the right hand side and the resulting left hand side is used to nudge the guess toward the self-consistent solution. This is numerically fast and straightforward to implement; see the tutorial [42]. A common choice is to mix five percent of the new solution into the prior estimate.
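The iteration can be sketched as follows. To keep the example self-contained, the exact bulk hard rod expression is used as a local density approximation in place of c1(x, [ρ]); the neural or Percus functional would plug into the same loop. The mixing parameter, grid, external potential, and the convention Λ = 1 are our own choices.

```python
import numpy as np

# Picard iteration for Eq. (29): insert the current estimate on the right
# hand side and mix 5% of the output into the estimate. The local density
# stand-in for c1(x, [rho]) is the exact bulk hard rod result.
sigma = 1.0
beta_mu = 2.0
alpha = 0.05                                     # mixing parameter
x = np.arange(0.005, 10.0, 0.01)
beta_Vext = 0.5 * np.sin(2 * np.pi * x / 10.0)   # some smooth external potential

def c1_lda(rho):
    """Bulk hard rod c1 as a local density approximation (illustrative stand-in)."""
    eta = rho * sigma
    return np.log(1.0 - eta) - eta / (1.0 - eta)

rho = np.full_like(x, 0.3)                       # initial guess
for _ in range(2000):
    rho_new = np.exp(beta_mu - beta_Vext + c1_lda(rho))  # right hand side of Eq. (29)
    rho = (1.0 - alpha) * rho + alpha * rho_new          # Picard mixing step

# self-consistency residual of the converged profile
residual = np.max(np.abs(rho - np.exp(beta_mu - beta_Vext + c1_lda(rho))))
```

The small mixing fraction damps the update enough that the iteration converges monotonically even though the bare map is strongly nonlinear.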
As laid out above, we choose the one-dimensional hard core model due to both the availability of Percus' functional and the computational ease of evaluating the analytical expressions and of carrying out many-body simulations. On the downside, the model does not form a very credible platform for assessing the numerical efficiency gain of the neural theory, as in general one will be interested in more complex systems and in more complex physical situations than those addressed here. Nevertheless, to give a rough idea of the required computational workload, minimizing the neural density functional takes of the order of seconds on a GPU, while a GCMC simulation run takes of the order of several minutes. Minimizing the analytical Percus functional is faster than using the neural network, due to the simple structure of Eqs. (24)-(28), which facilitates the use of very high-performance fast Fourier transforms.
We show three representative examples of density profiles for narrow to wide confinement between impenetrable walls in Fig. 6. In all cases the results from using the neural functional are numerically identical to those from the Percus functional on the scale of the plot. The profiles in the narrow [Fig. 6(b)] and in the moderately wide [Fig. 6(a)] pores show very distinct features, ranging from the striking V-shape of the strongly confined system (b), which arises from having at most two particles in the system, to the more generic damped oscillatory behaviour in the moderately wide pore (a). The main panel Fig. 6(c) shows the influence of a weak gravitational field, which creates a continuously varying density inhomogeneity across the entire system. The decay of the local density occurs on a much larger length scale as compared to the particle packing effects that are localized near the lower wall.
The behaviour shown in Fig. 6(c) away from the walls is well-represented by a local density approximation [8] (see e.g. Ref. [113] for recent mathematical work). The local density approximation can be a useful tool when investigating e.g. macroscopic ordering under gravity, where the occurring stacking sequences of different thermodynamic phases can be traced back to the phase diagram [114,115]. In particular the effects in mixtures were rationalized by a range of techniques, from generalizations of Archimedes' principle [116,117] to analyzing stacking sequences [114,115]. We stress that the present model applications constitute very significant extrapolations from the training data, which we recall was obtained in a fixed box of size L = 10σ and under the influence of randomized external and chemical potentials. Both the very confined system [Fig. 6(b)] and the large system [Fig. 6(c)] lie well outside the conditions sampled during training.
FIG. 6. Representative density profiles that the inhomogeneous hard rod system exhibits under the influence of an external potential. The results are obtained from numerically solving Eq. (29) using either the neural direct correlation functional c1(x, [ρ]) or Percus' exact solution thereof. The three cases comprise (a) two hard walls with separation distance 9σ and chemical potential βµ = 2, (b) two hard walls with much smaller separation distance 2σ and identical chemical potential βµ = 2, and (c) sedimentation-diffusion equilibrium under gravity with a locally varying chemical potential, βµloc(x) = βµ − βVext(x) = 2 − 0.05x/σ; here the linearly varying contribution accounts for the influence of gravity on the system and confinement is provided by two widely spaced hard walls at x = 0.5σ and x = 99.5σ. Note the crossover in panel (c) from the strongly oscillatory behaviour near the lower wall to a very smooth density decay, effectively following a local density approximation [8], upon increasing the scaled height x/σ.

As a further potential application of the neural functional theory, the dynamical density functional theory [5,69,118] is similarly easy to implement numerically as Eq. (29), and it is a currently popular choice to study time-dependent problems [119,120]. We comment on the status of the approach [40] and on how machine learning can help to overcome its limitations in Sec. IV below.

III. NEURAL FUNCTIONAL CALCULUS
We have seen in Sec. II how a neural one-body direct correlation functional can be efficiently trained on the basis of a pool of pre-generated Monte Carlo simulation data obtained under randomized conditions. The specific way of organizing the simulation data into training sets mirrors the functional relationships given by classical density functional theory. We have then shown that the neural functional can efficiently be used to address physical problems, taking the one-dimensional hard rod system as a simple example of a mutually interacting many-body system.
We here proceed by exemplifying the depth of physical insight that can be explored by acknowledging the functional character of the trained neural correlation functional. Hence we lay out functional integration (Sec. III A) and functional differentiation (Sec. III B). We show sum rule construction via Noether invariance (Sec. III C), via exchange symmetry (also Sec. III C), and via functional integration (Sec. III D). The presentation in each subsection is self-contained to a considerable degree, and we illustrate the generality of the methods both by application to the neural functional and by revisiting the analytic Percus theory.

A. Functional integration of direct correlations
Having captured via the neural functional c1(x, [ρ]) the essence of molecular packing effects, as they arise from the short-ranged hard core repulsion between the particles, one may speculate whether additional and as yet hidden physical structure can be revealed. We give two plausibility arguments why one should expect to be able to postprocess c1(x, [ρ]) in a meaningful way to retrieve global information.
First, thermodynamics is based on the existence of very few, well-defined, unique and global quantities, such as the entropy, the internal energy, and the free energy. Carrying out parametric derivatives, with powerful interrelations given by the Maxwell relations, enables one to obtain equations of state, susceptibilities, and further measurable global quantities. Our neural direct correlation functional, in contrast, is a local object with a finite range of nonlocality. So how does this relate to the global information?
The second argument is more formal. Suppose we prescribe the form of the density profile and then evaluate the neural functional c1(x, [ρ]) at each position x. This procedure yields a numerical representation of the corresponding direct correlation function c1(x). In the practical numerical representation we have a set of discrete grid points that represent the function values at these spatial locations x. Hence the entire data set forms a numerical array, or numerical vector, indexed by x. One can then ask whether this vector could be the gradient of an overarching parent object.
The physical and the formal question can both be answered affirmatively due to the existence of the excess free energy density functional Fexc[ρ]. Its practical route of access, based on functional integration along a continuous sequence of states (a "line") in the space of density functions, is strikingly straightforward within the neural method. The core of the method is to evaluate c1(x, [ρa]) as described above, but for a range of scaled versions ρa(x) of the prescribed density profile, and then to integrate in position to obtain the excess free energy as a global value; see the functional integral given in Eq. (9).
Specifically, we define a scaled version of the density profile as ρa(x) = aρ(x), such that a = 0 generates the empty state that has vanishing density profile, ρa=0(x) = 0. At the other end, a = 1 yields the actual density profile of interest, ρa=1(x) = ρ(x). The excess free energy functional is then obtained via functional integration according to

βFexc[ρ] = −∫ dx ρ(x) ∫₀¹ da c1(x, [ρa]).  (30)

The numerical evaluation requires evaluating c1(x, [ρa]) at all positions x in the system and for a range of intermediate values 0 ≤ a ≤ 1, such that the parametric integral over a can be accurately discretized.
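The line integration of Eq. (30) can be sketched as follows. Again a local density stand-in replaces c1(x, [ρ]) to keep the example self-contained; the neural functional would plug into integrate_Fexc in exactly the same way. The result is checked against the known bulk excess free energy of hard rods, βFexc = −ρL ln(1 − ρσ).

```python
import numpy as np

# Functional line integration of Eq. (30):
# beta F_exc[rho] = -int dx rho(x) int_0^1 da c1(x, [a*rho]).
# Illustrative sketch with our own grid and discretization choices.
sigma = 1.0
x = np.arange(0.005, 10.0, 0.01)   # grid covering a box of length 10 sigma
dx = 0.01

def c1_lda(rho):
    """Bulk hard rod c1 as a local density stand-in for c1(x, [rho])."""
    eta = rho * sigma
    return np.log(1.0 - eta) - eta / (1.0 - eta)

def integrate_Fexc(rho, c1_func, n_a=101):
    """Discretize the parametric a-integral (trapezoidal rule), then sum over x."""
    a_vals = np.linspace(0.0, 1.0, n_a)
    w = np.full(n_a, a_vals[1] - a_vals[0])
    w[0] /= 2.0
    w[-1] /= 2.0
    inner = sum(wa * c1_func(a * rho) for a, wa in zip(a_vals, w))
    return -np.sum(rho * inner) * dx    # beta F_exc

rho = np.full_like(x, 0.4)              # constant test profile (bulk limit)
beta_Fexc = integrate_Fexc(rho, c1_lda)
# exact bulk reference: beta F_exc = -rho * L * ln(1 - rho * sigma)
```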
Analytically carrying out the functional integral (30) on the basis of the analytical direct correlation functional c1(x, [ρ]), as given by Eqs. (24)-(28), is feasible. The result [56], again expressed in the more illustrative Rosenfeld fundamental measure form, is given by:

Fexc[ρ] = kBT ∫ dx Φ(n0(x), n1(x)),  (31)
Φ(n0, n1) = −n0 ln(1 − n1).  (32)

Here the integrand Φ(n0(x), n1(x)) plays the role of a localized excess free energy density, which depends on the weighted densities n0(x) and n1(x) as given via the spatial averaging procedures in Eqs. (27) and (28), respectively. Inserting Eq. (32) into Eq. (31) yields the hard rod excess free energy functional in the following more explicit form:

βFexc[ρ] = −∫ dx [ρ(x − R) + ρ(x + R)]/2 × ln(1 − ∫_{x−R}^{x+R} dx′ ρ(x′)).  (33)

Equation (33) is strikingly compact, given that it describes the essence of a system of mutually interacting hard cores exposed to an arbitrary external potential.
Although the result of the functional integral (30) has lost all position dependence, the specific form of the density profile ρ(x) is deeply baked into the resulting output value of the functional, via both the prefactor ρ(x) in the integrand in Eq. (30) and the evaluation of the direct correlation functional at the specifically scaled form ρa(x). In parallel with this mathematical structure, the explicit form (33) of the Percus functional clearly demonstrates that the resulting value will depend nontrivially on the shape of the input density profile.
Having demonstrated that F exc [ρ] as a global quantity can be obtained from appropriate functional integration of a locally resolved correlation functional c 1 (x, [ρ]) naturally leads to the question whether a reverse path exists that would mirror the inverse structure provided by integration and differentiation known from ordinary calculus.
The availability of a corresponding derivative structure for functionals is quite significant, as this by construction generates spatial dependence, as indicated by δ/δρ(x); see e.g. Ref. [9] for details. We can hence retrieve, or generate, the direct correlation functional as the (negative) functional derivative of the intrinsic excess free energy functional with respect to the density:

c1(x, [ρ]) = −δβFexc[ρ]/δρ(x).  (34)

While we turn to more general functional differentiation below, we here address again the analytical case, which is useful as it reveals the origin of the double appearance of the two spatial weighting processes in Eqs. (24)-(28). Rosenfeld [61] introduced two weight functions w0(x) and w1(x), which respectively describe the end points of a particle and its interior one-dimensional "volume":

w0(x) = [δ(x − R) + δ(x + R)]/2,  (35)
w1(x) = Θ(R − |x|),  (36)

where Θ(x) indicates the Heaviside unit step function, i.e., Θ(x ≥ 0) = 1 and 0 otherwise. The weighted densities n0(x) and n1(x), as given respectively by Eqs. (27) and (28), can then be represented via convolution of the respective weight function of type α = 0, 1 with the density profile according to

nα(x) = ∫ dx′ wα(x − x′) ρ(x′).  (37)

In more compact notation we can express Eq. (37) as nα(x) = (wα ∗ ρ)(x), where the asterisk denotes spatial convolution. Then the direct correlation functional is given by

c1(x) = −Σα (wα ∗ Φα)(x),  (38)

which is an exact rewriting of the form given in Eq. (24). The coefficients Φα are obtained as partial derivatives of the scaled free energy density (32) via Φα = ∂Φ/∂nα. This derivative structure reveals the mechanism for the generation of the explicit forms Φ0(x) and Φ1(x), as respectively given by Eqs. (25) and (26).

B. Functional differentiation of direct correlations
While the above described use of functional differentiation in an analytical setting might appear to be very formal and perhaps limited in its applicability, we emphasize that the concept is indeed very general.Given a prescribed functional of a function ρ(x), the functional derivative δ/δρ(x) simply gives the gradient of the functional with respect to a change in the input function at a specific location x.
By applying the functional derivative in the present one-dimensional context to a given functional form of c1(x, [ρ]), one obtains the two-body direct correlation functional; we recall the generic expression (12):

c2(x, x′, [ρ]) = δc1(x, [ρ])/δρ(x′).  (39)

Using the Percus version (38) of the one-body direct correlation functional and carrying out the functional derivative on the right hand side of Eq. (39) gives, via an analytical calculation, the following nonlocal result:

c2(x, x′) = −Σ_{αα′} (wα ∗ Φαα′ ∗ wα′)(x, x′).  (40)

We make the double asterisk convolution structure more explicit below. The coefficient functions in Eq. (40) are obtained as second partial derivatives via Φαα′ = ∂²Φ/∂nα∂nα′. Explicitly, we have Φ00(x) = 0 and the symmetry Φ01(x) = Φ10(x). The remaining terms are given by

Φ01(x) = 1/[1 − n1(x)],  (41)
Φ11(x) = n0(x)/[1 − n1(x)]².  (42)

Inserting these results into Eq. (40) and making the convolutions explicit yields the following expression:

c2(x, x′) = −∫ dx″ [w0(x″ − x) Φ01(x″) w1(x″ − x′) + w1(x″ − x) Φ01(x″) w0(x″ − x′) + w1(x″ − x) Φ11(x″) w1(x″ − x′)].  (43)

We recall the definitions (35) and (36) of the weight functions w0(x) and w1(x). The convolution structure couples two weight functions together and each of them has a range of R. Hence indeed the two-body direct correlations are of finite range 2R = σ in the position difference x − x′ [55].
While the above results for the Percus theory have been derived by pen-and-paper symbolic calculations, the neural functional is not amenable to such conventional techniques. Fortunately, the framework of automatic differentiation [99] provides a powerful alternative to both symbolic and numerical differentiation methods, and it is a natural choice to consider in the context of machine learning [100]. Via the implementation of either modified algebra or of computational graphs, automatic differentiation makes it possible to obtain derivatives directly in the form of executable code, and crucially there is no need for any manual intervention. Automatic differentiation is thereby free of the numerical artifacts that are typical of finite difference schemes. The method is applicable in broad contexts, which we illustrate in the online tutorial [42] by computing the Percus result for c2(x, x′) via automatic differentiation of Eq. (38) rather than by manual implementation of Eq. (43).
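A minimal example of the "modified algebra" route is a forward-mode dual number class: arithmetic carries the value and the derivative along simultaneously, so derivatives come out as executable code without finite differences. Below we differentiate the free energy density Φ = −n0 ln(1 − n1) with respect to n1 and recover Φ1 = n0/(1 − n1) to machine precision. This toy class is our own illustration, not the tutorial's implementation.

```python
import math

class Dual:
    """Number carrying a value and a derivative; arithmetic propagates both."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def _lift(self, other):
        return other if isinstance(other, Dual) else Dual(other)
    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __neg__(self):
        return Dual(-self.val, -self.dot)
    def __sub__(self, other):
        return self + (-self._lift(other))
    def __rsub__(self, other):
        return self._lift(other) - self
    def __mul__(self, other):
        o = self._lift(other)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def log(z):
    """Natural logarithm with derivative propagation: d(ln z) = dz / z."""
    z = z if isinstance(z, Dual) else Dual(z)
    return Dual(math.log(z.val), z.dot / z.val)

def Phi(n0, n1):
    return -(n0 * log(1 - n1))      # Eq. (32): Phi = -n0 ln(1 - n1)

n0, n1 = 0.4, 0.3
d = Phi(Dual(n0), Dual(n1, 1.0))    # seed dot = 1 on n1
# d.val is Phi(n0, n1); d.dot is dPhi/dn1 = n0/(1 - n1), obtained automatically
```

Seeding the unit derivative on n0 instead would return Φ0 = −ln(1 − n1) in the same single pass.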
For completeness, the one-body direct correlation functional can be recovered by functional integration; we reproduce Eq. (11) for the present one-dimensional geometry. On the basis of the neural representations of the direct correlation functionals, this identity can be used to check both the consistency and the correctness of the automatic differentiation.

C. Noether invariance and exchange symmetry
In its standard applications, Noether's theorem relates symmetries of a dynamical physical system to associated conservation laws. Obtaining linear momentum conservation from a symmetry of the underlying action integral is a primary example; see e.g. Ref. [91] for an introductory presentation. Besides such deterministic applications, the Noether theorem is currently seeing increased use in a variety of statistical mechanical settings [121-128].
The central statistical Noether invariance concept [90,91] was demonstrated in a range of studies, addressing the strength of force fluctuations via their variance [92], the formulation of force-based classical density functional theory [93,94], and the force balance in quantum many-body systems [95]. The invariance theory has led to the discovery of force-force and force-gradient two-body correlation functions. These correlators were shown to deliver profound insight into the microscopic spatial liquid structure beyond the pair correlation function for a broad range of model fluids [96,98]. Noether invariance is relevant for any thermal observable, as associated sum rules couple the given observable to forces via very recently identified hyperforce correlations [97].
The statistical Noether sum rules are exact identities that can serve a variety of purposes, ranging from theory building via combination with approximate closure relations, to testing for sufficient sampling in simulation [97], to carrying out force sampling to improve statistical data quality and, last but not least, to testing neural functionals [40,41]. With the latter purpose in mind, we here describe a selection of these Noether identities.
As a fundamental property, the interparticle interaction potential depends only on the relative particle positions and not on the absolute particle coordinate values. Specifically, whether two particles overlap in the one-dimensional system is unaffected by displacing the entire microstate uniformly. This invariance against global translation leads to associated sum rules for direct correlation functions; we recall that the direct correlations arise solely from the interparticle interactions and hence do not depend directly on the external potential. We quote two members of an infinite hierarchy of identities, which is originally due to Lovett, Mou, Buff, and Wertheim [129,130], see Eqs. (45) and (46) below. We group these together with a recent curvature sum rule (47) [92]. Ultimately the identities (45) and (46) express the vanishing of the global interparticle force, as obtained by summing the interparticle forces on all particles. The three sum rules read as follows; in the one-dimensional system the gradient is a simple scalar position derivative, ∇ = d/dx. Briefly, Eq. (45) is obtained by noting that βFexc[ρϵ] = βFexc[ρ], where the displaced density profile is given by ρϵ(r) = ρ(r + ϵ) with displacement vector ϵ (in three-dimensional systems). Building the gradient with respect to ϵ yields the result 0 = ∂βFexc[ρϵ]/∂ϵ|ϵ=0 = ∫dr (δβFexc[ρ]/δρ(r))∇ρ(r), which gives Eq. (45) upon integration by parts, resorting to the one-dimensional geometry, and identifying the one-body direct correlation functional via Eq. (34); for more details of the derivation we refer the Reader to Refs. [90,91]. Equation (46) is then obtained as the density functional derivative of Eq. (45), re-using Eq. (45) to simplify the result. Equation (47) is a curvature sum rule that follows from spatial Noether invariance at second order in the global shifting parameter ϵ [90].
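A translational sum rule of this type is straightforward to check numerically. The sketch below (our own illustration) verifies the identity ∫dx ρ(x)∇c1(x, [ρ]) = 0 on a periodic grid for a toy mean-field functional; the Gaussian kernel and the density profile are illustrative assumptions that stand in for the neural or Percus c1:

```python
import numpy as np

# periodic grid and an arbitrary inhomogeneous density profile
M, dx = 128, 0.05
x = dx * np.arange(M)
L = M * dx
rho = 0.6 + 0.3 * np.sin(2 * np.pi * x / L) + 0.1 * np.cos(6 * np.pi * x / L)

# toy mean-field functional: c1(x,[rho]) = -int dx' K(x-x') rho(x')
d = np.abs(x[:, None] - x[None, :])
d = np.minimum(d, L - d)                # periodic minimum-image distance
K = np.exp(-(d / 0.3) ** 2)             # symmetric, translation-invariant kernel
c1 = -(K @ rho) * dx

# central-difference gradient on the periodic grid
grad = lambda f: (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)

# translational sum rule: int dx rho(x) grad c1(x) should vanish
residual = np.sum(rho * grad(c1)) * dx
```

Because the kernel depends only on the position difference, the discrete quadratic form vanishes to machine precision; an external-potential dependence in c1 would break the identity, mirroring the physical argument in the text.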
Using a locally resolved shifting operation, where the displacement ϵ(r) is local and depends on the spatial position r and hence constitutes a vector field (in the case of a three-dimensional system), yields in one dimension the position-resolved identity (48). Its left hand side has the direct interpretation of the mean interparticle force field, expressed in units of the thermal energy kBT. This force both acts in equilibrium and drives the adiabatic part of the time evolution in nonequilibrium [9]; we describe some details of the nonequilibrium theory for time evolution in Sec. IV. Inserting the relationship (34) of c1(x, [ρ]) to the excess free energy functional yields the one-dimensional version of the general relationship (14). As the order of the two functional derivatives is irrelevant, we obtain the following exact symmetry with respect to the exchange of the two position arguments: c2(x, x′, [ρ]) = c2(x′, x, [ρ]). When applied to the neural functional, the exchange symmetry relationship (50) is highly nontrivial, as the density windows that enter the functionals on the left and right hand sides differ markedly from each other, as do the corresponding evaluation positions. That the two displacement effects cancel each other and lead to the identity (50) can serve both for testing the quality of the neural direct correlation functional and for demonstrating the existence of an overarching grandmother free energy functional.
In order to illustrate the theoretical structure, we display numerical results in Fig. 7. We select a representative oscillatory density profile, as shown in Fig. 7(a), and take this as an input to evaluate the one-body direct correlation functional c1(x, [ρ]). This procedure yields a specific form of the direct correlation function c1(x), displayed in Fig. 7(b), which belongs to the prescribed density profile ρ(x). The spatial variations of ρ(x) and c1(x) are roughly out-of-phase with each other, and the nonlinear and nonlocal nature of the functional relationship ρ → c1 is very apparent in the plot. The results from choosing the neural functional or Percus' analytical one-body direct correlation functional agree with each other to excellent accuracy. The agreement is demonstrated in Fig. 7(b), where the two resulting direct correlation profiles are identical on the scale of the plot.
As laid out above, the exchange symmetry (50) constitutes a rigorous test for the two-body direct correlation functional c2(x, x′, [ρ]). Both the neural and the analytical functional pass with flying colours, see Figs. 7(c) and (d), respectively, where the symmetry of the respective "heatmap" graph against mirroring at the diagonal is strikingly visible.
As a representative case for the use of a Noether sum rule as a quantitative test of the accuracy of the neural functional methods, we show in Fig. 7(e) the numerical results of evaluating both sides of Eq. (48) for the same given density profile [shown in Fig. 7(a)]. We find that both sides of the equation agree with each other to high numerical precision.
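The structure of such a two-sided test can be sketched in a few lines. For a toy mean-field functional, for which c2(x, x′) = δc1(x)/δρ(x′) is known analytically, both sides of a force sum rule of the form ∇c1(x, [ρ]) = ∫dx′ c2(x, x′, [ρ])∇′ρ(x′) can be compared on a periodic grid; the kernel and the profile below are illustrative assumptions, not the hard rod functional:

```python
import numpy as np

M, dx = 128, 0.05
x = dx * np.arange(M)
L = M * dx
rho = 0.6 + 0.3 * np.sin(2 * np.pi * x / L)

# toy mean-field model: c1 = -(K * rho), hence c2(x,x') = -K(x-x') exactly
d = np.abs(x[:, None] - x[None, :])
d = np.minimum(d, L - d)
K = np.exp(-(d / 0.3) ** 2)
c1 = -(K @ rho) * dx                    # one-body direct correlation function
c2 = -K                                 # its exact functional derivative

grad = lambda f: (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)

lhs = grad(c1)                          # mean force side: grad c1(x,[rho])
rhs = (c2 @ grad(rho)) * dx             # int dx' c2(x,x') grad' rho(x')
```

On the periodic grid both sides agree to machine precision, since the discrete gradient commutes with the translation-invariant convolution; for a neural c2 obtained by automatic differentiation the same comparison quantifies the accuracy of the network.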

D. Functional integral sum rules
We next address general identities that emerge from exploiting the inverse nature of functional differentiation and integration. For this we recall the functional integral form (9) of Fexc[ρ] and the functional derivative form (10) of c1(r, [ρ]), both of which are central for the following derivations. That Eq. (9) is the inverse of Eq. (10) can be seen explicitly by functionally differentiating Eq. (9). We have interchanged the order of integration and functional differentiation on the right hand side of Eq. (51), as these operations are independent of each other. The functional density derivative now acts on the product ρ(r′)c1(r′, [ρa]) and we need to differentiate both factors according to the product rule. Differentiating the first factor gives the Dirac distribution, δρ(r′)/δρ(r) = δ(r − r′). Differentiating the second factor generates the two-body direct correlation functional according to Eq. (39), and hence δc1(r′, [ρa])/δρ(r) = a c2(r, r′, [ρa]), where the multiplication by the scaling factor a arises from the identity δ/δρ(r) = a δ/δ(aρ(r)) = a δ/δρa(r). We can hence reformulate Eq. (51) by rewriting the left hand side via Eq. (10) and expressing the right hand side by the two separate terms. Upon multiplication by −1 the result is the following functional integral identity: c1(r, [ρ]) = ∫0^1 da c1(r, [ρa]) + ∫0^1 da a ∫dr′ ρ(r′) c2(r, r′, [ρa]). In the first term on the right hand side of Eq. (52) the position integral has been cancelled by the Dirac distribution, which leaves the position dependence on r, as occurs in all other terms. In order to prove Eq. (52), and hence to establish that Eqs. (9) and (10) are indeed inverse to each other, we integrate by parts in a in the first integral on the right hand side of Eq. (52). This yields a sum of boundary terms and an integral: c1(r, [ρ]) − 0 − ∫0^1 da a ∂c1(r, [ρa])/∂a. The derivative with respect to the parameter a generates the second term in Eq. (52), up to the minus sign, upon carrying out the parametric derivative via ∂/∂a = ∫dr′ ρ(r′) δ/δρa(r′) and identifying c2(r, r′, [ρa]) = δc1(r, [ρa])/δρa(r′). Hence the two integrals cancel each other. Only the upper boundary term c1(r, [ρ]) remains, which is the left hand side of Eq. (52), and this completes the proof.
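The cancellation mechanism in the above proof can also be confirmed numerically. For a toy mean-field functional with density-independent c2 = −K, the scaled functionals obey c1(r, [ρa]) = a c1(r, [ρ]), and midpoint quadrature over the scaling parameter a reproduces the functional integral identity; all ingredients below are illustrative stand-ins for the neural functionals:

```python
import numpy as np

M, dx = 64, 0.1
x = dx * np.arange(M)
L = M * dx
rho = 0.5 + 0.2 * np.sin(2 * np.pi * x / L)

# toy mean-field model with density-independent two-body correlations
d = np.abs(x[:, None] - x[None, :])
d = np.minimum(d, L - d)
K = np.exp(-(d / 0.4) ** 2)
c1 = lambda r: -(K @ r) * dx            # c1(x,[rho]), linear in rho here
c2 = -K                                 # c2(x,x',[rho]), independent of rho

# midpoint quadrature over the scaling parameter a in [0,1]
n = 200
a = (np.arange(n) + 0.5) / n

term1 = sum(c1(ai * rho) for ai in a) / n            # int da c1(x,[rho_a])
term2 = sum(ai * (c2 @ rho) * dx for ai in a) / n    # int da a int dx' rho' c2
lhs = c1(rho)
rhs = term1 + term2
```

Each parametric integral contributes exactly half of c1(r, [ρ]) in this linear model, so the two terms assemble the left hand side; for the neural c1 and c2 the same quadrature provides a quantitative consistency check.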
Beyond its explicit derivation via functional calculus, the functional integral sum rule (52) offers yet another fresh possibility for carrying out consistency and accuracy checks, as both c1(r, [ρ]) and c2(r, r′, [ρ]) are directly available as neural functionals.
Going through the analogous chain of arguments one generation younger leads to the following functional integral relationship between the two- and three-body direct correlation functionals: c2(r, r′, [ρ]) = ∫0^1 da c2(r, r′, [ρa]) + ∫0^1 da a ∫dr″ ρ(r″) c3(r, r′, r″, [ρa]). The neural functional calculus allows one to obtain c3(r, r′, r″, [ρ]) via automatic generation of the Hessian of c1(r, [ρ]) [41], which elevates Eq. (53) beyond mere formal interest.
The structure of Eqs. (52) and (53) expresses a general functional relationship; when applied to the excess free energy functional itself, the result is Eq. (54). We furthermore demonstrate explicitly the relationship from daughter to grandmother via functional integration of the two-body correlation functional to obtain the excess free energy functional, Eq. (55), where again the scaled density profile is ρa′(r) = a′ρ(r). That Eq. (55) holds can be seen by chaining together the two levels of functional integrals (9) and (44) and then simplifying the two nested parameter integrals. The double parametric integral in Eq. (55) can alternatively be written with fixed parametric boundaries, where the twice-scaled density profile is defined as ρaa′(r) = aa′ρ(r).
Evans [7] goes further than Eq. (55) by using the identity ∫0^1 da ∫0^1 da′ a f(aa′) = ∫0^1 da (1 − a) f(a), which is valid for any function f(a), as can be shown either geometrically, by considering the triangle-shaped integration domain in the two-dimensional (a, a′)-plane, or more formally by integration by parts. The identity allows one to express Eq. (55) in a form that requires carrying out only a single parametric integral. Evans [7] also considers more general cases where the parameter a linearly interpolates between a nontrivial initial density profile ρi(r) ≠ 0 and the target profile ρ(r) via ρa(r) = ρi(r) + a[ρ(r) − ρi(r)]. In our present description we have restricted ourselves to empty initial states, ρi(r) = 0, but the functional integration methodology is more general, see Ref. [7].
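A parametric identity of this type is easily confirmed by direct quadrature. The sketch below uses the test function f(a) = exp(a), for which the single-integral side evaluates analytically to e − 2:

```python
import numpy as np

# midpoint quadrature nodes on [0,1]
n = 400
a = (np.arange(n) + 0.5) / n

# double parametric integral: int_0^1 da int_0^1 da' a f(a a'), with f = exp
A, Ap = np.meshgrid(a, a, indexing="ij")
lhs = np.sum(A * np.exp(A * Ap)) / n**2

# single parametric integral: int_0^1 da (1 - a) f(a)
rhs = np.sum((1.0 - a) * np.exp(a)) / n
```

The two quadratures agree to the accuracy of the midpoint rule, and rhs matches the analytical value e − 2, illustrating how the triangle-shaped domain collapses the double integral to a single one.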
Throughout we have notated the functional integrals with an outer position integral over r and an inner parametric integral over a. This structure allows one to take the common factor ρ(r) out of the inner integral. Standard presentations often reverse the order of integration; taking the functional integral over the one-body direct correlation functional as an example, both versions are identical, see Eq. (57). Our (mild) preference for the order on the left hand side of Eq. (57) has two reasons: i) in a numerical scheme, where one discretizes on a grid of positions r and of values of a, the multiplication by ρ(r) need only be carried out once per gridpoint r when using the left hand side, rather than also for every value of a as on the right hand side; ii) although the result of the inner integral, ∫0^1 da c1(r, [ρa]), depends on the specific chosen parameterization ρa(r), and is hence not unique from the viewpoint of the entire functional, it nevertheless constitutes a well-defined localized function of r.

IV. NONEQUILIBRIUM DYNAMICS
We have so far demonstrated that the equilibrium properties of correlated many-body systems can be investigated on a very deep level by using neural networks to represent the functional relationships that are inherent in the statistical physics. The required computational workload is thereby only moderate. The neural functionals that encapsulate the nontrivial information about correlations and about thermodynamics are lean and robust, and they can be manipulated efficiently by the neural functional calculus outlined above.
These features of the neural theory naturally lead one to wonder about the potential applicability beyond equilibrium, i.e., to situations where the considered system is driven by external forces such that flow is generated. The recent nonequilibrium machine-learning method by de las Heras et al. [40] is based on the dynamical one-body force balance relationship for overdamped Brownian motion. The required dynamical functional dependencies are those given by power functional theory [9]. The power functional approach is formally exact and goes beyond dynamical density functional theory [5,69,118,131-133] in that it also captures nonequilibrium interparticle force contributions that exceed those generated by the free energy functional; see Refs. [9,119,120,134] for recent reviews. Such genuine nonequilibrium effects include viscous and structural nonequilibrium force fields [9,135-137], which for uniaxial compressional flow of a three-dimensional Lennard-Jones fluid were shown to be well-represented by a trained neural network [40].
The neural nonequilibrium force fields were successfully compared against analytical power functional approximations, where a simple and physically motivated semi-local dependence on both the local density and the local velocity was shown to capture the essence of the forces that occur in nonequilibrium. Together with the exact force balance equation, this makes it possible to predict and to design nonequilibrium steady states [40]. The approach offers a systematic way to go beyond dynamical density functional theory and to address genuine nonequilibrium beyond a free energy description. We recall studies based on dynamical density functional theory that addressed nonequilibrium sedimentation of colloids [138], the self-diffusion of particles in complex fluids [139], and the behaviour of the van Hove two-body dynamics of colloidal Brownian hard disks [140] and of hard spheres [141,142].
We have briefly touched on the concept of forces when discussing the direct correlation sum rule (48). Locally resolved force fields are central to power functional theory [9,149-151] for the description of the nonequilibrium dynamics of the underlying many-body systems. The connection to the present framework is via the locally resolved interparticle force density Fint(r, t). When expressed in correlator form, this vector field is given as a nonequilibrium average, Eq. (58). The dependence on time t arises because the average on the right hand side of Eq. (58) is taken over the instantaneous nonequilibrium many-body probability distribution, as given by the temporal evolution of the Smoluchowski equation for the case of overdamped dynamics. The interparticle force density Fint(r, t) can be split into a sum of an equilibrium-like "adiabatic" force density Fad(r, t) and a genuine nonequilibrium "superadiabatic" contribution Fsup(r, t). Making the functional dependencies explicit, as they arise in power functional theory [9], gives Fint(r, t, [ρ, v]) = Fad(r, t, [ρ]) + Fsup(r, t, [ρ, v]). Here the functional arguments are the density profile ρ(r, t) and the one-body velocity field v(r, t) = J(r, t)/ρ(r, t), which are both microscopically resolved in space and in time. The numerator is the one-body current, which is given as an instantaneous nonequilibrium average via J(r, t) = ⟨Σi δ(r − ri)vi⟩, where vi(r^N, t) is the velocity of particle i in the underlying many-body overdamped Brownian dynamics.
Reference [40] presents a demonstration of the validity of the functional dependence on ρ(r, t) and v(r, t) via the successful machine learning of Fint(r, t, [ρ, v]) for inhomogeneous nonequilibrium steady states. The strategy for constructing the neural network is similar to that described here, but it is based on predicting the locally resolved nonequilibrium forces rather than the equilibrium one-body direct correlations. One important connection between equilibrium and nonequilibrium is given by the adiabatic construction [9], which relates Fad(r, t, [ρ]) in the nonequilibrium system to an instantaneous equilibrium system with identical density profile ρ(r, t). The adiabatic force field is then given as a density functional via the standard relationship Fad(r, t, [ρ]) = kBT ρ(r, t)∇c1(r, [ρ]), where the density argument of the one-body direct correlation functional c1(r, [ρ]) is the instantaneous density distribution ρ(r, t).
For overdamped Brownian dynamics with friction constant γ, the one-body current J(r, t) appears in the force density balance, γJ(r, t) = −kBT∇ρ(r, t) + Fint(r, t) + ρ(r, t)fext(r, t), where fext(r, t) is an external force field that acts on the system, in general in a time- and position-dependent fashion. The prescription for the current is complemented by the microscopically resolved continuity equation, ∂ρ(r, t)/∂t = −∇ · J(r, t). Neglecting the superadiabatic force density in Eq. (59) and hence taking only adiabatic forces into account, i.e. approximating Fint(r, t, [ρ, v]) ≈ Fad(r, t, [ρ]), one arrives at dynamical density functional theory [5,69,118]. Its central inherent approximation is hence to replace the nonequilibrium forces by effective equilibrium forces that are obtained from the free energy functional via the adiabatic construction [9].
Returning to the one-dimensional geometry of the hard rod model, this leads to the closed approximate equation of motion ∂ρ(x, t)/∂t = D0∇ · [∇ρ(x, t) − ρ(x, t)∇c1(x, [ρ]) + ρ(x, t)∇βVext(x, t)] for the time-dependent density profile. The derivative is simply ∇ = ∂/∂x in one dimension, and the diffusion constant D0 = kBT/γ is the ratio of thermal energy and friction constant. Equation (62) can be efficiently propagated in time with a simple forward Euler algorithm, and the neural representation of c1(x, [ρ]) can be used in lieu of an analytic approximation. However, superadiabatic forces, i.e. force contributions that go beyond the adiabatic approximation of working with a free energy functional, are neglected. These include viscous and structural nonequilibrium contributions; we refer the Reader to Ref. [9] for background and to Ref. [40] for a recent perspective on the description of microscopic nonequilibrium dynamics of fluids in the light of machine learning on the basis of power functional theory.
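A forward-Euler propagation of such a dynamical density functional equation takes only a few lines. In the sketch below, a toy mean-field kernel stands in for the neural (or Percus) c1(x, [ρ]); the grid, time step, and external potential are illustrative choices, and the discrete update conserves the total particle number by construction:

```python
import numpy as np

M, dx, D0, dt = 128, 0.05, 1.0, 1e-5
x = dx * np.arange(M)
L = M * dx

# toy mean-field kernel as a stand-in for the neural c1(x,[rho])
d = np.abs(x[:, None] - x[None, :])
d = np.minimum(d, L - d)                      # periodic minimum-image distance
K = np.exp(-(d / 0.3) ** 2)
c1 = lambda rho: -(K @ rho) * dx

betaVext = 0.5 * np.sin(2 * np.pi * x / L)    # illustrative external potential
grad = lambda f: (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)

rho = np.full(M, 0.5)                         # uniform initial profile
mass0 = rho.sum() * dx                        # total particle number
for _ in range(2000):
    # current: J = -D0 (grad rho - rho grad c1 + rho grad betaVext)
    J = -D0 * (grad(rho) - rho * grad(c1(rho)) + rho * grad(betaVext))
    rho = rho - dt * grad(J)                  # continuity eq., forward Euler
```

The density accumulates in the minima of the external potential while the total mass stays fixed; replacing the lambda `c1` by an evaluation of the trained network yields the neural-functional dynamics described in the text.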

V. CONCLUSIONS AND OUTLOOK
In conclusion, we have given a detailed account of the recent neural functional theory [41] for the structure and thermodynamics of spatially inhomogeneous classical many-body systems. The approach is based on input data obtained from Monte Carlo simulations that provide results for averaged density profiles, with the training systems exposed to the influence of randomized external potentials. Based on the functional relationships that are rigorously given by classical density functional theory, the training data is used to construct a neural network representation of the one-body direct correlation functional, which acts as the fundamental "mother" object in the neural functional theory.
From automatic functional differentiation of the one-body direct correlation functional follow daughter and granddaughter functionals that represent the two- and three-body direct correlation functionals. Conversely, functional integration yields the neural excess free energy functional, which acts as the ultimate grandmother functional in the genealogy. We have shown that chaining together the functional differentiation and integration operations yields exact functional sum rules. Further exact identities are given by the statistical mechanical Noether invariance theory [90-98], by a variety of fundamental liquid state techniques [8,20-23], and by functional calculus alone [5,7]. We have described a selection of these sum rules in detail and have shown how their validity can be used to carry out consistency and accuracy checks for the different levels of mutually related neural density functionals.
We have here focused in particular on the one-dimensional hard rod system, for reasons of the ease of data generation via simulations [42], the availability of Percus' exact functional [55], the possibility of carrying out analytical manipulations, and not least the fundamental character of this classical model [54]. A beginner-friendly interactive code tutorial is provided online [42], together with stand-alone documentation that describes the key strategies and the essence of the methods that constitute the neural functional theory [41]. We have discussed prototypical applications for "simulation beyond the box", where the neural functional is used for system sizes that outscale the dimensions of the original training box used to generate the underlying Monte Carlo data [41]. We have also given an overview of nonequilibrium methods and have emphasized the important role of the occurring force fields and their functional dependencies.
We recall that a detailed description of the setup of the paper is given before the start of Sec. I A; the modular structure of the paper invites selective reading. An overview of the relevant statistical mechanical concepts is given in Sec. I. The neural functional theory is described in detail in Sec. II, where we have emphasized the important concept of local learning, as illustrated in Fig. 2, which facilitates very efficient network construction and training. The neural functional approach allows one to carry out functional calculus explicitly, as presented in Sec. III, and it is relevant for nonequilibrium, as described in Sec. IV. We once more highlight the availability of the online tutorial [42], which covers all key aspects of our study and includes a practitioner's account of automatic differentiation and differentiable programming.
The neural functional theory is a genuine hybrid method that draws with comparable weight from computer simulations, machine learning, and density functional theory. The computational and conceptual complexities of the involved methods from each respective field are relatively low. Yet their combination offers a new and arguably unprecedented view of the statistical physics of many-body systems. We lay out in the following why the approach is interesting from the viewpoint of each of the three constituent fields.
From the machine-learning perspective it is unusual to have a large set of testable self-consistency conditions available. These conditions stem from statistical mechanical sum rules, as they follow e.g. from the Noether invariance theory, the functional integration-differentiation structure, and exchange symmetry. Taking the latter case as an example, that the automatic derivative of a neural network satisfies the exchange symmetry of its two (position) arguments is very remarkable; see the graphical demonstration of the diagonal symmetry in Fig. 7(c). This is a purely structural test for the quality of the network that does not require any independent reference data as a benchmark. All presented sum rules are of this type and hence provide intrinsic constraints, following either genuinely from the underlying statistical mechanics or from mere functional calculus alone, as is the case for the functional integration-differentiation formalism outlined in Sec. III D. Crucially, in our methodology the constraints are not enforced during training of the network, as is done in methods of physics-informed machine learning in the classical density functional [37] and wider [152,153] contexts.
From a computer simulation point of view the neural functional methods offer a new way of designing simulation work. Instead of direct simulation of the physical problem at hand, an intervening step of constructing the direct correlation functional is introduced. We have shown that the direct correlation functional can thereby be obtained explicitly and accurately. Rather than playing the role of a formal object, its availability as a trained neural network facilitates fast and precise predictions in nontrivial situations. This application stage of the neural functional theory requires very little effort, both in terms of the required numerical algorithmic structure and the computational workload; we recall the illustration of the neural functional workflow shown in Fig. 4.
From a density functional perspective the neural approach is arguably unprecedented in its degree of access to the excess free energy functional. We find it highly remarkable that so much of the seemingly very abstract functional relationships and formal concepts can be inspected and tested in computationally straightforward and highly efficient ways. The range of these methods includes automatic differentiation to generate direct correlation functions as well as performant functional integration routines. The neural functional framework offers the possibility to work numerically with exact functional identities with great ease. Hence the neural network technology relieves one of the task of constructing an approximate analytical functional and manipulating it on paper.
This leaves the question of the status of analytical density functionals in the light of the neural network capabilities. We have here deliberately chosen the exact Percus functional for one-dimensional hard rods to demonstrate how much insight can be gleaned from analytical manipulations; as a representative example see the nonlinear convolutional structure of Eqs. (24)-(28) along with the excellent numerical comparison against the neural functional shown in Fig. 7(b). As the neural functional method is not restricted to the hard core system, one can expect that having an accurate neural functional for a given system can be of very significant help when attempting the first-principles construction of analytical free energy functionals. After all, we should make use of the tools that van der Waals did not have at his disposal! In summary, in light of the progress reported in Refs. [40,41] and the present model investigation, we anticipate that a wealth of deep questions can be addressed from the viewpoint of the neural functional theory, including fundamental questions of phase coexistence [154] as well as the possible construction of fundamental measure functionals [155,156]. While we have here restricted ourselves to hard core systems, the principal applicability of the neural functional theory to soft potentials was demonstrated in Ref. [41] for (planar) inhomogeneities of the supercritical three-dimensional Lennard-Jones fluid. Going beyond planar geometry and addressing spatial inhomogeneity in two or three dimensions could benefit from the use of equivariant neural networks [157-163], which possess the fundamental symmetry properties of Euclidean space.
For complex Hamiltonians the amount of simulation work required to provide training data might seem a limitation. We are however optimistic that the subsequent efficient use of the direct correlation functionals in the form of neural networks can by far outweigh the training cost. Hence the application to complex models such as the monatomic Molinero-Moore water model [164,165] might not be out of reach. Furthermore, it is inspiring to think that progress could be made in the treatment of dielectric [166] and long-ranged forces [167].
As a final note, we re-emphasize the successful application of the neural method to nonequilibrium flow problems presented in Ref. [40]; it is certainly very inspiring to speculate whether this will facilitate progress on questions of slow dynamics in soft matter [153,168,169] and beyond [170].

FIG. 2. Illustration of the relevant functional maps of the neural functional theory. The external potential Vext(r) generates a one-body density profile ρ(r) that is associated with a one-body direct correlation function c1(r). At given temperature T, chemical potential µ, and for a specific form of the external potential Vext(r), Monte Carlo simulations provide data for the corresponding density profile ρ(r) and for the direct correlation function c1(r). Machine learning is used to represent the functional map ρ → c1 via a deep neural network. The functional dependence of c1(r) on the density profile is of much shorter spatial range than that of the map Vext → ρ underlying the training data.

FIG. 3. Illustration of four different generations of density functionals. Shown are the excess free energy functional Fexc[ρ] and the one-, two-, and three-body direct correlation functionals. Upward arrows indicate the relationship via functional integration ∫dr ρ(r) ∫0^1 da, with the integrand evaluated at the scaled density aρ(r). Downward arrows indicate functional differentiation δ/δρ(r). The neural functional theory is based on training c1(r, [ρ]) as the generating mother functional. Implementing the arrowed operations requires only high-level code. The resulting neural networks, as well as functionals derived from analytical expressions, are highly performant.

FIG. 4. Schematic of the workflow of the neural functional theory. Many-body simulations under randomized conditions are used to sample statistically averaged and spatially resolved data that characterize the inhomogeneous response of the considered system. A neural network is then trained to represent the direct correlation functional, which is subsequently applied numerically and via neural functional methods to investigate the physics of the system in the desired target situations.

FIG. 5. Illustration of the neural one-body direct correlation functional c1(x, [ρ]), represented by a fully connected neural network with three hidden layers. The topology maps a small finite window of the density profile ρ(x) to the local value of the direct correlation function c1(x).
FIG. 6. Representative density profiles of the inhomogeneous hard rod system under the influence of an external potential. The results are obtained from numerically solving Eq. (29) using either the neural direct correlation functional c1(x, [ρ]) or Percus' exact solution thereof. The three cases comprise (a) two hard walls with separation distance 9σ and chemical potential βµ = 2, (b) two hard walls with much smaller separation distance 2σ and identical chemical potential βµ = 2, and (c) sedimentation-diffusion equilibrium under gravity with a locally varying chemical potential, βµloc(x) = βµ − βVext(x) = 2 − 0.05x/σ; here the linearly varying contribution accounts for the influence of gravity on the system and confinement is provided by two widely spaced hard walls at x = 0.5σ and x = 99.5σ. Note the crossover in panel (c) from the strongly oscillatory behaviour near the lower wall to a very smooth density decay, effectively following a local density approximation [8], upon increasing the scaled height x/σ.
FIG. 7. Numerical results for functional calculus and Noether invariance. The results are shown for an exemplary oscillatory density profile displayed in panel (a). Results for the neural prediction of c1(x) are compared to a numerical evaluation of Percus' analytical direct correlation functional (24) in panel (b). The two-body direct correlation function c2(x, x′) as a function of x/σ and x′/σ, as obtained from automatic differentiation of the neural functional, is shown in panel (c) and compared to the result from using Percus' analytical expression (43) in panel (d). Using the neural functionals, the agreement of the left and right hand sides of the Noether force sum rule (48) is shown in panel (e). In all cases the neural functional and Percus theories give numerically identical results on the scale of the respective plot.