Thermodynamics of exponential Kolmogorov-Nagumo averages

This paper investigates generalized thermodynamic relationships in physical systems where relevant macroscopic variables are determined by the exponential Kolmogorov-Nagumo average. We show that while the thermodynamic entropy of such systems is naturally described by R\'{e}nyi's entropy with parameter $\gamma$, an ordinary Boltzmann distribution still describes their statistics under equilibrium thermodynamics. Our results show that systems described by exponential Kolmogorov-Nagumo averages can be interpreted as systems originally in thermal equilibrium with a heat reservoir with inverse temperature $\beta$ that are suddenly quenched to another heat reservoir with inverse temperature $\beta' = (1-\gamma)\beta$. Furthermore, we show the connection with multifractal thermodynamics. For the non-equilibrium case, we show that the dynamics of systems described by exponential Kolmogorov-Nagumo averages still observe a second law of thermodynamics and the H-theorem. We further discuss the applications of stochastic thermodynamics in those systems -- namely, the validity of fluctuation theorems -- and the connection with thermodynamic length.


I. INTRODUCTION
In ancient Greece, three classic types of averages were extensively studied: the arithmetic mean $\frac{1}{n}\sum_i x_i$, the geometric mean $(\prod_i x_i)^{1/n}$, and the harmonic mean $n(\sum_i \frac{1}{x_i})^{-1}$, which played various roles in physics, geometry, and music. These so-called Pythagorean means found a natural generalization via functional analysis and measure theory into the well-known one-parametric class of Hölder means $(\frac{1}{n}\sum_i x_i^p)^{1/p}$. A different generalization of the notion of average was independently proposed by Andrey Kolmogorov [1] and Mitio Nagumo [2] in 1930, which took the form of $f^{-1}(\frac{1}{n}\sum_i f(x_i))$ for any continuous and injective function $f$. These Kolmogorov-Nagumo means -- also known as quasi-arithmetic means or f-means -- have triggered numerous theoretical developments by several researchers, including de Finetti [3], Jessen [4], Kitagawa [5], Aczél [6], and Fodor and Roubens [7].
Kolmogorov-Nagumo averages have found applications in many fields, including machine learning [8], random fields [9], and fuzzy sets [10]. A particularly important application of Kolmogorov-Nagumo averages is the introduction of Rényi entropy [11], which in turn has found many applications in quantum systems [12][13][14], strongly coupled or entangled systems [15][16][17], phase transitions [18][19][20], multifractal thermodynamics [21,22], and time series analysis [23,24], among others. Another important application of generalized means has been the development of the thermodynamics of complex systems [25]. In addition to concepts such as deformed calculus [26,27], escort means [28][29][30], and non-linear dynamics [31,32], Kolmogorov-Nagumo means served as a natural framework for generalized entropies [33] -- with the Rényi entropy being a special case. Furthermore, connections between the maximum entropy principle and Rényi entropy have been established in Refs. [17,34].
In this paper we investigate the thermodynamics of systems whose relevant macroscopic variables are determined by the exponential Kolmogorov-Nagumo average. The pioneering work of Czachor and Naudts [33] on the thermodynamics of exponential Kolmogorov-Nagumo averages discusses aspects of equilibrium thermodynamics, such as a generalized internal energy and its relationship with Tsallis entropy [35]. Bagci and Tirnakli [36] studied a generalized maximum entropy principle for the exponential Kolmogorov-Nagumo averages, while leaving its thermodynamic interpretation unclear. A first thermodynamic interpretation of Rényi entropy was given by Baez [37], who noted that the Rényi entropy of a Boltzmann distribution is equal to the change of Helmholtz free energy divided by the change of temperature.
The first goal of this paper is to develop a framework that unifies and extends these previous results, enabling a unified interpretation of all thermodynamic quantities as well as the maximum entropy principle in such scenarios.
A second set of relevant results is related to the Legendre structure of thermodynamics [38]. Scarfone, Matsuzoe and Wada [39] showed that the Legendre structure of equilibrium thermodynamics remains valid for Kolmogorov-Nagumo averages on the level of macroscopic quantities. Additionally, Wong [40,41] introduced a generalized Legendre structure that corresponds to the Rényi entropy. A second goal of this paper is to clarify the thermodynamic interpretation of this generalized Legendre structure. Furthermore, Liangrong, Hong, and Liu [42] have recently discussed non-equilibrium thermodynamics with non-extensive quantities. Investigations on non-equilibrium thermodynamics and multifractals related to dielectric breakdown were carried out by Enciso and colleagues [43]. We will show that these results are closely related to non-equilibrium stochastic thermodynamics based on exponential Kolmogorov-Nagumo averages, while deriving a non-equilibrium version of the second law of thermodynamics, a generalized H-theorem, and proving the validity of detailed and integrated fluctuation theorems. Finally, several authors have studied generalized statistical mechanics from the point of view of information geometry [44][45][46][47]. In particular, Eguchi, Komori, and Ohara have investigated the geometry of generalized e-geodesics and m-geodesics in terms of Kolmogorov-Nagumo means [48]. We extend their results to the case of a generalized divergence, and calculate relevant quantities such as Fisher-Rao information and thermodynamic length.
The rest of the paper is organized as follows. Section II defines the exponential Kolmogorov-Nagumo mean and summarizes its main properties. Section III then establishes equilibrium thermodynamics based on Kolmogorov-Nagumo averages of both entropy and internal energy. Section IV describes the application of this framework to multifractal systems. Section V discusses non-equilibrium thermodynamics. Section VI focuses on thermodynamic length. Finally, Section VII summarizes our main conclusions.

II. EXPONENTIAL KOLMOGOROV-NAGUMO AVERAGES
Our line of inquiry focuses on systems governed by constraints that can be expressed in terms of non-linear averages. A natural extension of the linear (arithmetic) average is the Kolmogorov-Nagumo average [1, 2]
$$\langle X \rangle_f = f^{-1}\left(\sum_i p_i\, f(x_i)\right),$$
where $f$ is a continuous injective function. Without loss of generality, we focus on cases where $f$ is an increasing function. Note that the average is invariant to affine transformations of the function, i.e., $f \mapsto a f + b$ with $a \neq 0$. The linear average is recovered by setting $f(x) = x$, which can be shown to be the only choice that satisfies the following two properties: 1. Homogeneity: $\langle a X \rangle = a \langle X \rangle$; 2. Translation invariance: $\langle X + c \rangle = \langle X \rangle + c$.
In [49], it is shown that the first condition alone leads to functions of the form $f(x) = x^p$, which corresponds to the well-known class of Hölder averages. In contrast, the second property has been shown to lead to the class of so-called exponential Kolmogorov-Nagumo averages [50], which corresponds to $f(x) = \exp_\gamma(x) = (e^{\gamma x} - 1)/\gamma$ with inverse function $\ln_\gamma(x) = \frac{1}{\gamma} \ln(1 + \gamma x)$. The property of translation invariance is of particular interest for statistical mechanics, as it guarantees that thermodynamic relations are independent of the specific value of the ground-state energy. Also, the standard arithmetic mean is recovered for $\gamma = 0$. These considerations make us focus on the exponential Kolmogorov-Nagumo averages, which lead to the following type of averages:
$$\langle X \rangle_\gamma = \frac{1}{\gamma} \ln \sum_i p_i\, e^{\gamma x_i}. \qquad (2)$$
A key property of the exponential mean is a weaker version of the additivity of the expectation value, which reads
$$\langle X + Y \rangle_\gamma = \langle X \rangle_\gamma + \langle Y \rangle_\gamma \quad \text{for } X \perp\!\!\!\perp Y,$$
where $\perp\!\!\!\perp$ denotes statistical independence. In the general case, it is possible to express the expected value of sums as
$$\langle X + Y \rangle_\gamma = \big\langle X + \langle Y | X \rangle_\gamma \big\rangle_\gamma,$$
where $\langle Y | X \rangle_\gamma$ is the conditional average, i.e., $\langle Y | X = x_i \rangle_\gamma = \frac{1}{\gamma} \ln \sum_j p_{j|i}\, e^{\gamma y_j}$.
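The properties above can be checked numerically. The following sketch (our own illustration, with arbitrary example numbers; the function name `kn_average` is not from the literature) verifies Eq. (2) together with translation invariance, the $\gamma \to 0$ limit, and additivity under independence:

```python
import math

def kn_average(probs, values, gamma):
    """Exponential Kolmogorov-Nagumo average <X>_gamma = (1/gamma) ln sum_i p_i e^(gamma x_i).
    The gamma -> 0 limit recovers the ordinary arithmetic mean."""
    if gamma == 0.0:
        return sum(p * x for p, x in zip(probs, values))
    return math.log(sum(p * math.exp(gamma * x) for p, x in zip(probs, values))) / gamma

p = [0.2, 0.5, 0.3]
x = [1.0, 2.0, 4.0]
gamma = 0.7

# Translation invariance: <X + c>_gamma = <X>_gamma + c
c = 3.0
shifted = kn_average(p, [xi + c for xi in x], gamma)
assert abs(shifted - (kn_average(p, x, gamma) + c)) < 1e-12

# gamma -> 0 recovers the arithmetic mean
assert abs(kn_average(p, x, 1e-8) - kn_average(p, x, 0.0)) < 1e-6

# Additivity for independent X and Y: <X + Y>_gamma = <X>_gamma + <Y>_gamma
q = [0.6, 0.4]
y = [0.5, 1.5]
joint_p = [pi * qj for pi in p for qj in q]
joint_xy = [xi + yj for xi in x for yj in y]
assert abs(kn_average(joint_p, joint_xy, gamma)
           - (kn_average(p, x, gamma) + kn_average(q, y, gamma))) < 1e-12
```

Note that homogeneity fails for $\gamma \neq 0$: rescaling the values $x_i$ changes the effective deformation parameter, which is why the rescaled internal energy of Section III is introduced.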

Connection with cumulants
The exponential Kolmogorov-Nagumo averages are closely related to cumulant-generating functions, which are given by
$$K_X(t) = \ln \left\langle e^{t X} \right\rangle,$$
so that $\langle X \rangle_\gamma = K_X(\gamma)/\gamma$. By considering the well-known Taylor expansion of the cumulant-generating function, one can find that
$$\langle X \rangle_\gamma = \sum_{n=1}^{\infty} \kappa_n(X)\, \frac{\gamma^{n-1}}{n!},$$
where $\kappa_n(X)$ is the $n$-th cumulant. This shows that the exponential Kolmogorov-Nagumo averages combine all cumulants weighted by the factor $\gamma^{n-1}/n!$. The relationship between cumulants and Kolmogorov-Nagumo averages can be used to provide a complementary view of the properties of the latter. For example, the additivity property of the Kolmogorov-Nagumo average can be understood as a consequence of the additivity of the cumulant-generating function for independent random variables.
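The cumulant expansion can be verified directly. The following sketch (an illustration with a Bernoulli variable; the cumulant formulas in terms of raw moments are standard) compares the exact $\gamma$-average with the series truncated at the fourth cumulant:

```python
import math

p = [0.3, 0.7]
x = [0.0, 1.0]          # Bernoulli(0.7)
gamma = 0.05

def moment(n):
    return sum(pi * xi ** n for pi, xi in zip(p, x))

m1, m2, m3, m4 = (moment(n) for n in range(1, 5))
# First four cumulants from raw moments
k1 = m1
k2 = m2 - m1 ** 2
k3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
k4 = m4 - 4 * m1 * m3 - 3 * m2 ** 2 + 12 * m1 ** 2 * m2 - 6 * m1 ** 4

kn = math.log(sum(pi * math.exp(gamma * xi) for pi, xi in zip(p, x))) / gamma
series = k1 + k2 * gamma / 2 + k3 * gamma ** 2 / 6 + k4 * gamma ** 3 / 24

assert abs(kn - series) < 1e-5   # truncation error is O(gamma^4)
```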

Connections with large deviation theory
When considering the sum $S_n = \frac{1}{n}\sum_{i=1}^n X_i$ of $n$ i.i.d. random variables $X_1, \ldots, X_n$, large deviation theory [51] states that the probability of observing $S_n = s$ can be expressed as
$$P(S_n = s) \approx e^{-n I(s)},$$
with $I(s)$ corresponding to the so-called rate function. Cramér's theorem states that the rate function can be obtained from the cumulant-generating function [51], which -- via Eq. (6) -- can be expressed in terms of the $\gamma$-average using $\gamma = nk$ as follows:
$$I(s) = \sup_k\, k \left(s - \langle S_n \rangle_{nk}\right),$$
where the $nk$-average can be expressed as $\langle S_n \rangle_{nk} = \frac{1}{nk} \ln \int \mathrm{d}s\; e^{nks}\, P(S_n = s)$.
A connection between large deviation theory and statistical mechanics can then be drawn using Eq. (9) by considering $k = -\beta$ to be the inverse temperature and $X_i = h_i$ the energy of the $i$-th subsystem. Then, one can find that the rate function is determined by $\Psi = S - \beta U$, the free entropy (also known as the Massieu function), with $S$ being the thermodynamic entropy and $U$ the internal energy. For more details about this relationship, we refer the interested reader to Ref. [51].
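Cramér's theorem can be illustrated numerically. The sketch below (our own example; the grid-based supremum is a simplification) recovers the exact Bernoulli rate function, which equals the Kullback-Leibler divergence between the empirical and true bias:

```python
import math

# Rate function of the empirical mean of Bernoulli(p) variables:
# I(s) = sup_k [k s - lambda(k)], with lambda(k) = ln E[e^{kX}]
p = 0.3

def lam(k):                      # cumulant-generating function of one variable
    return math.log(1 - p + p * math.exp(k))

def rate_numeric(s, ks=None):
    ks = ks or [i * 0.001 for i in range(-8000, 8000)]
    return max(k * s - lam(k) for k in ks)

def rate_exact(s):               # KL divergence between Bernoulli(s) and Bernoulli(p)
    return s * math.log(s / p) + (1 - s) * math.log((1 - s) / (1 - p))

for s in (0.1, 0.3, 0.5, 0.8):
    assert abs(rate_numeric(s) - rate_exact(s)) < 1e-4
```

At $s = p$ the rate function vanishes, reflecting the law of large numbers; away from $p$ it controls the exponential decay of fluctuations.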
Connection with Rényi entropy
The Rényi entropy can be naturally formulated in terms of the exponential Kolmogorov-Nagumo average of the Hartley information $\ln 1/p_k$ [52], also known as the Shannon pointwise entropy [53]. To show this, let us consider $X$ to be a random variable with probability distribution $p_k$ and calculate the following:
$$\left\langle \ln \frac{1}{p_k} \right\rangle_\gamma = \frac{1}{\gamma} \ln \sum_k p_k\, e^{\gamma \ln(1/p_k)} = \frac{1}{\gamma} \ln \sum_k p_k^{1-\gamma} = (1-\gamma)\, R_\gamma(X),$$
where $R_\gamma(X) = \frac{1}{\gamma(1-\gamma)} \ln \sum_k p_k^{1-\gamma}$ denotes the Rényi entropy of order $1-\gamma$. This result should not be surprising, since the Rényi entropy appears in the Campbell coding theorem as the minimal price that one must pay to encode a message, where the price is an exponential function of the message length [54]. Note that the definition of the Rényi entropy that we are using includes a factor $1/(1-\gamma)$ that is often not considered. We include this factor in this paper as its addition greatly simplifies the calculations presented in the next sections, which in turn will endow it with a clear thermodynamic meaning. Additionally, by including this factor the limit $\gamma \to 1$ of $R_\gamma(X)$ leads to the well-known Burg entropy $R_1 = -\sum_i \ln p_i$ [55].
Let us finalize this section by noting that the joint Rényi entropy can be decomposed as $R_\gamma(X, Y) = R_\gamma(X) + R_\gamma(Y|X)$, where $R_\gamma(Y|X)$ is known as the conditional Rényi entropy.
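The identification of the Kolmogorov-Nagumo average of Hartley information with the Rényi form can be verified directly (here without the additional $1/(1-\gamma)$ factor, which is a constant rescaling at fixed $\gamma$):

```python
import math

def kn_average(probs, values, gamma):
    return math.log(sum(p * math.exp(gamma * v) for p, v in zip(probs, values))) / gamma

def renyi_sum(probs, gamma):     # (1/gamma) ln sum_k p_k^(1-gamma)
    return math.log(sum(p ** (1 - gamma) for p in probs)) / gamma

probs = [0.1, 0.2, 0.3, 0.4]
gamma = 0.4

# KN average of the Hartley information ln(1/p_k) reproduces the Renyi form
hartley = [-math.log(p) for p in probs]
assert abs(kn_average(probs, hartley, gamma) - renyi_sum(probs, gamma)) < 1e-12

# gamma -> 0 recovers the Shannon entropy
shannon = -sum(p * math.log(p) for p in probs)
assert abs(renyi_sum(probs, 1e-9) - shannon) < 1e-6
```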

III. EQUILIBRIUM THERMODYNAMICS
Let us now consider a system whose internal energy at state $i$ is given by $\epsilon_i$. Then, the Kolmogorov-Nagumo $\gamma$-average energy of the system is calculated as $U = \langle \epsilon \rangle_\gamma = \frac{1}{\gamma} \ln \sum_i p_i\, e^{\gamma \epsilon_i}$. However, as discussed in the previous section, this quantity is not invariant to rescaling by a factor (i.e., $\epsilon_i \to \lambda \epsilon_i$), and furthermore, it does not have properly defined units of energy. Therefore, it will be convenient to focus our analysis on the following rescaled internal energy:
$$U_\gamma^\beta := \frac{1}{\beta} \langle \beta \epsilon \rangle_\gamma = \frac{1}{\gamma \beta} \ln \sum_i p_i\, e^{\gamma \beta \epsilon_i},$$
where $\beta$ is the inverse temperature of the system. Note that the units of $\beta$ are 1/Joules, and hence $\beta \epsilon_i$ is dimensionless, making $U_\gamma^\beta$ a properly defined mean energy. This type of average energy function has already been considered in previous research, e.g., in Ref. [33]. A summary of the thermodynamic quantities studied in this work is presented in Table I. Similarly, our analysis will consider the thermodynamic entropy as defined by the Kolmogorov-Nagumo average of $\ln 1/p_k$, which gives us the Rényi entropy, as discussed in the previous section. Interestingly, for the case $\gamma = 0$ this formalism recovers the standard definitions of average energy and Boltzmann-Gibbs entropy, while for $\gamma \neq 0$ it accounts for different scenarios -- which we study in the following.

Maximum entropy principle
Let us now focus on distributions that correspond to a given value of the mean energy $U_\gamma^\beta$, according to the maximum entropy principle. The distribution $\pi_i$ that maximizes the Rényi entropy while satisfying a given $\gamma$-average level of energy can be found by using the method of Lagrange multipliers on the following Lagrange function:
$$\mathcal{L}(\pi) = R_\gamma(\pi) - \alpha_0 \left(\sum_i \pi_i - 1\right) - \alpha_1\, U_\gamma^\beta(\pi).$$
A direct calculation shows that $\pi_i$ is the solution of the stationarity condition $\partial \mathcal{L}/\partial \pi_i = 0$. By multiplying this equation by $\pi_i$ and summing over $i$, one obtains $\alpha_0 = \frac{1 - \alpha_1}{\gamma}$, which determines $\pi_i$ up to the Lagrange parameter $\alpha_1$. The parameter $\alpha_1$ can then be chosen such that one recovers standard thermodynamic relationships.
To this end, we identify $\alpha_1 = \beta$ (which is the standard relation between the Lagrange multiplier and the inverse temperature), which gives us
$$\pi_i = \frac{e^{-\beta \epsilon_i}}{Z}, \qquad Z = \sum_k e^{-\beta \epsilon_k}, \qquad (19)$$
which is just the Boltzmann distribution with inverse temperature $\beta$. Using the fact that $\sum_k \pi_k = 1$, one finds that
$$\Psi_\gamma = \ln Z + \gamma R_\gamma, \qquad (21)$$
where $\Psi_\gamma = R_\gamma - \beta U_\gamma^\beta$ is the free entropy (also called the Massieu function). Finally, one can derive the free energy by plugging the equilibrium distribution into the internal energy and the Rényi entropy, which yields
$$F_\gamma^\beta := U_\gamma^\beta - \beta^{-1} R_\gamma = F_{(1-\gamma)\beta}. \qquad (22)$$
This result shows that the Kolmogorov-Nagumo average is effectively a rescaling of the free energy from inverse temperature $\beta$ to $(1-\gamma)\beta$. Note that Eq. (22) recapitulates the standard relationship between the free energy and the partition function for $\gamma = 0$.
These results reveal that, perhaps surprisingly, the obtained equilibrium distribution in Eq. (19) (obtained via the maximum entropy principle) is Boltzmann, while the thermodynamic quantities at play are nonetheless different from the case of ordinary thermodynamics based on Shannon entropy and linear averages. In fact, Eq. (21) implies that the free entropy and the logarithm of the partition function generally are not equal but differ by the term −γR γ , which vanishes only for γ = 0.
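The equilibrium relations for the Boltzmann distribution can be checked numerically. The following sketch assumes the conventions used here (Rényi entropy including the $1/(1-\gamma)$ factor, and the rescaled internal energy $U_\gamma^\beta$); the energy levels and parameters are illustrative:

```python
import math

eps = [0.0, 1.0, 2.5]            # illustrative energy levels
beta, gamma = 1.2, 0.4
bp = (1 - gamma) * beta          # rescaled inverse temperature beta' = (1 - gamma) beta

def log_z(b):
    return math.log(sum(math.exp(-b * e) for e in eps))

pi = [math.exp(-beta * e - log_z(beta)) for e in eps]

# KN internal energy and Renyi entropy (with the 1/(1-gamma) factor) of the Boltzmann state
u = math.log(sum(p * math.exp(gamma * beta * e) for p, e in zip(pi, eps))) / (gamma * beta)
r = math.log(sum(p ** (1 - gamma) for p in pi)) / (gamma * (1 - gamma))

# Free energy rescaling: F_gamma^beta = F at inverse temperature beta'
f_gamma = u - r / beta
f_rescaled = -log_z(bp) / bp
assert abs(f_gamma - f_rescaled) < 1e-12

# Free entropy differs from ln Z by the term gamma * R_gamma
psi = r - beta * u
assert abs(psi - (log_z(beta) + gamma * r)) < 1e-12
```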

Thermodynamic interpretation
Let us now focus on the thermodynamic interpretation of equilibrium thermodynamics with the exponential Kolmogorov-Nagumo average. We denote the equilibrium versions of thermodynamic potentials by calligraphic symbols, i.e., $\mathcal{R}_\gamma$, $\mathcal{U}_\gamma^\beta$, and $\mathcal{F}_\gamma^\beta$. Let us start with Eq. (24). As already shown in Ref. [37], the equilibrium Rényi entropy can be expressed as a difference of free energies. By defining $\beta' = (1-\gamma)\beta$, we obtain
$$\mathcal{R}_\gamma = \beta\, \frac{F_\beta - F_{\beta'}}{\gamma},$$
which is the $\beta$-rescaling of the free energy difference. This can be interpreted as the maximum amount of work the system can perform when quenched from inverse temperature $\beta$ to inverse temperature $\beta'$. Note that $\gamma \to 0$ corresponds to $\beta' \to \beta$, and we recover the ordinary relation between entropy and free energy,
$$\mathcal{S}_\beta = \beta^2\, \frac{\partial F_\beta}{\partial \beta} = -\frac{\partial F_\beta}{\partial T},$$
where $\mathcal{S}_\beta$ is the ordinary thermodynamic entropy. Since $\mathcal{F}_\gamma^\beta = F_{(1-\gamma)\beta}$, the Kolmogorov-Nagumo energy can be expressed as
$$\mathcal{U}_\gamma^\beta = F_{\beta'} + \beta^{-1} \mathcal{R}_\gamma.$$
By denoting the free entropy as $\Psi_\beta = -\beta F_\beta = S_\beta - \beta U_\beta$, we find that
$$\Psi_\gamma = -\beta F_{\beta'} = \frac{1}{1-\gamma}\, \Psi_{\beta'}.$$
Again, in the limit $\gamma \to 0$ we recover the classic relationship $\Psi_\beta = S_\beta - \beta U_\beta$.

IV. CONNECTION WITH MULTIFRACTAL THERMODYNAMICS
Let us now focus on the connection of Kolmogorov-Nagumo averages with multifractals. Multifractal analysis provides a powerful tool for investigating the self-similarity of complex systems, including physical and chemical systems [56], weather forecasting [57], and financial systems [58]. It has been shown that the Rényi entropy plays a crucial role in the theory of multifractals [21]. We will show that the presented framework based on Kolmogorov-Nagumo averages establishes a connection between two distinct multifractal formalisms.

Fundamentals in multifractal theory
Following the standard approach in the theory of multifractals [59], let us consider a physical system whose state space is parcelled into distinct regions $k_i(s)$ indexed by $i \in I$, with the partition depending on a characteristic scale $s$. Such partitions can be studied with respect to positional, spatiotemporal, or energetic state space. Consider the probability of observing the system within region $k_i(s)$, which is denoted by $p_i(s)$. Let us assume that this probability obeys a scaling property of the form
$$p_i(s) = \frac{s^{\alpha_i}}{z(s)},$$
where $\alpha_i$ is a scaling exponent and $z(s) = \sum_i s^{\alpha_i}$ is a normalization constant. For small scales, i.e., $s \to 0$, let us assume that the frequency of the scaling exponent $\alpha_i$ is given by a continuous probability distribution, whose density $\rho$ has the form
$$\rho(\alpha, s) \simeq c(\alpha)\, s^{-f(\alpha)}.$$
Above, $c(\alpha)$ is a slowly varying function of $\alpha$ and $f(\alpha)$ is the so-called multifractal spectrum of the system, being the fractal dimension of the subset which scales with exponent $\alpha$. This means that the number of sets $k_i(s)$ that have the scaling exponent $\alpha_j$ scales as
$$\mathrm{card}\{k_i(s) : \alpha_i = \alpha_j\} \propto \frac{s^{-f(\alpha_j)}}{N(s)},$$
where card denotes cardinality (i.e., the number of sets) and $N(s)$ is the normalization constant. Systems that satisfy Eq. (34) are called multifractals, and the scaling exponent of the Rényi entropy, denoted by $D_\gamma$, is known as the generalized dimension [59], i.e., [60]
$$D_\gamma = \lim_{s \to 0} \frac{R_\gamma(s)}{\ln(1/s)}.$$

A. Multifractals and Rényi's entropy
A direct calculation shows that the Rényi entropy can be rewritten in terms of integrals over the density $\rho(\alpha, s)$. In the limit of $s \to 0$, both integrals can be approximated by the steepest descent approximation, i.e., it is possible to find a value $\alpha_\gamma$ (resp. $\alpha_1$) that maximizes the exponent in the corresponding integral. To this end, we define $\alpha_\gamma$ and $\alpha_1$ as the solutions of
$$f'(\alpha_\gamma) = 1 - \gamma, \qquad f'(\alpha_1) = 1,$$
and obtain that, for small $s$, the Rényi entropy can be approximated as
$$R_\gamma(s) \approx D_\gamma \ln(1/s),$$
where we omitted the constant term coming from the normalization function $c(\alpha)$.
By introducing the Legendre transform of $f(\alpha)$, $\tau(q) = \inf_\alpha \left[q\alpha - f(\alpha)\right]$ with $q = 1 - \gamma$, one can express the generalized dimension as
$$D_\gamma = \frac{(1-\gamma)\,\alpha_\gamma - f(\alpha_\gamma)}{-\gamma}.$$
As a result, we find that the connection between the multifractal spectrum and the generalized dimension can be established as
$$f(\alpha_\gamma) = (1-\gamma)\,\alpha_\gamma + \gamma D_\gamma.$$
Thus, by calculating the Rényi entropy, one can obtain the multifractal spectrum and vice versa.
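The Legendre-transform relation can be illustrated with a model parabolic spectrum (as arises, e.g., for log-normal cascades; the peak position and width below are assumed values chosen so that the measure is normalized, $\tau(1) = 0$):

```python
# Legendre-transform check for a model multifractal spectrum:
# tau(q) = inf_alpha [q*alpha - f(alpha)] and D_q = tau(q)/(q - 1), with q = 1 - gamma.
w = 0.08                          # spectrum width (assumed)
a0 = 1.0 + w / 2                  # peak position chosen so that tau(1) = 0

def f(alpha):
    return 1.0 - (alpha - a0) ** 2 / (2 * w)

def tau_numeric(q):
    grid = [a0 + (i - 5000) * 1e-4 for i in range(10001)]
    return min(q * a - f(a) for a in grid)

def tau_exact(q):                 # analytic Legendre transform of the parabola
    return q * a0 - w * q ** 2 / 2 - 1.0

for gamma in (-0.5, 0.3, 0.7):
    q = 1 - gamma
    assert abs(tau_numeric(q) - tau_exact(q)) < 1e-6
    d = tau_numeric(q) / (q - 1)  # generalized dimension D_gamma
    assert abs(d - (1 - w * q / 2)) < 1e-5
    assert d > 0
```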

B. Turbulence cascades
Let us now focus on an alternative approach to multifractals based on turbulence cascades and the scaling of the energy field [61]. The fundamental assumption at the base of this approach is that the average energy $\epsilon(s)$ scales as $\langle \epsilon(s) \rangle_\gamma \sim s^{M_\gamma(\epsilon)}$, which can be formally written as
$$M_\gamma(\epsilon) = \lim_{s \to 0} \frac{\ln \langle \epsilon(s) \rangle_\gamma}{\ln s}.$$
Interestingly, this approach can be connected with the previous one by noting that the distribution obtained from the maximization of the Rényi entropy with the constraint on the cumulant-generating function leads to $p(\epsilon_i) \equiv \pi_i$, i.e., the Boltzmann distribution (19). By comparing Eqs. (19) and (33), we find the correspondence
$$e^{-\beta \epsilon_i} \leftrightarrow s^{\alpha_i}.$$
Therefore, $\epsilon_i = \alpha_i$ relates the energy to the characteristic scaling exponent, and $\beta = -\ln s$ connects the inverse temperature with the characteristic scale. Furthermore, the parameter $\gamma$ plays the role of a rescaling of the characteristic scale, where $\beta' = (1-\gamma)\beta$ leads to
$$s' = e^{-\beta'} = s^{1-\gamma}.$$
Thus, using the $\gamma$-exponential Kolmogorov-Nagumo average amounts to a change of the characteristic scale $s \to s^{1-\gamma}$. We note that the relation between the scaling exponents $M_\gamma$ and the cumulant-generating function has been described in Ref. [61]. These results show that the connection between the two formalisms naturally leads to thermodynamics with exponential Kolmogorov-Nagumo averages. Our results imply that, for the case of multifractal systems, the maximization of the Rényi entropy under the constraint of a Kolmogorov-Nagumo average naturally gives us the Boltzmann distribution, where the energy can be translated into the scaling exponent of the equilibrium distribution.

V. NON-EQUILIBRIUM THERMODYNAMICS
In this section we investigate the thermodynamics of non-equilibrium systems subject to constraints in the form of Kolmogorov-Nagumo averages. We focus on their Legendre structure, H-theorem, entropy production, and detailed and integrated fluctuation theorems.

Legendre structure
The Legendre transform establishes several key relations in thermodynamics, connecting the internal energy, entropy, temperature, and Helmholtz free energy -- which is defined as $F = U - TS$. It also establishes a natural link between extensive variables (e.g., energy, entropy) and intensive variables (e.g., temperature). Interestingly, while the Legendre transform arises naturally in the classic Boltzmann-Gibbs framework, it has been shown that a more general Legendre structure can still be applied in more general scenarios: see [38] in the context of thermodynamics and [62] in information geometry.
Following this line of reasoning, we now derive thermodynamic relationships that lead directly to a generalized Legendre structure. The Legendre structure for the case of Kolmogorov-Nagumo averages has also been investigated by Scarfone, Matsuzoe, and Wada in [39]. Let us start by calculating the change of the Helmholtz free energy from the equilibrium distribution to an arbitrary state. Using Eq. (22), this can be expressed as
$$\Delta F_\gamma = F_\gamma(p) - F_\gamma(\pi) = \Delta U_\gamma^\beta - \beta^{-1} \Delta R_\gamma.$$
The difference between the non-equilibrium and equilibrium internal energy can be expressed in terms of $\nabla R_\gamma$, the vector of partial derivatives of the Rényi entropy evaluated in the equilibrium state. Thus, the free energy difference can be expressed as
$$\Delta F_\gamma = \beta^{-1} D_\gamma(p\|\pi),$$
where $D_\gamma(p\|\pi)$ is the Rényi-Bregman divergence [40]. Hence, the divergence between the non-equilibrium and equilibrium distributions can be re-written in terms of a generalized link function $C(x, y) = \frac{1}{\gamma} \ln(1 + \gamma\, x \cdot y)$, which becomes the ordinary dot product $x \cdot y$ for $\gamma \to 0$. The fact that $D_\gamma(p\|\pi) \geq 0$ [41] implies that $\Delta F_\gamma$ is always non-negative, and therefore the free energy is minimized by the equilibrium distribution -- generalizing the classical result to the case of non-linear averages. The implications of this result for thermodynamic scenarios are developed in the next section.
To conclude, let us note that while the standard free energy is obtained by a regular Legendre transform on the macroscopic level (i.e., as $F = U - TS$), the more general free energy $F_\gamma$ is obtained from a generalized Legendre transform on the mesoscopic (i.e., probability) level. This can be seen by comparing Eq. (22), where the free energy difference is obtained as $\Delta F_\gamma = \Delta U_\gamma^\beta - \beta^{-1} \Delta R_\gamma$, with Eq. (53), where the free energy difference is obtained as $\Delta F = \beta^{-1}\left[-\Delta R_\gamma + \ln_\gamma(\nabla R_\gamma \cdot \Delta p)\right]$.

Second law and H-theorem
Let us now focus on the case of non-equilibrium stochastic thermodynamics of systems that follow constraints given by Kolmogorov-Nagumo averages. Stochastic thermodynamics [63] has recently become an important topic in non-equilibrium statistical physics. While there have been several attempts to derive stochastic thermodynamics associated with generalized entropies (see, e.g., Refs. [31,32,42]), there have been no studies focused on the stochastic thermodynamics of Kolmogorov-Nagumo averages.
Before starting, let us consider a general formula for the time derivative of a $\gamma$-exponential average:
$$\frac{\mathrm{d}}{\mathrm{d}t} \langle X \rangle_\gamma = \sum_i \left(\frac{\dot p_i}{\gamma} + p_i \dot x_i\right) \frac{e^{\gamma x_i}}{\sum_j p_j\, e^{\gamma x_j}}. \qquad (55)$$
With this expression at our disposal, let us start investigating non-equilibrium thermodynamics. We begin by focusing on the second law of thermodynamics. For this, let us use Eq. (55) to re-state the first law of thermodynamics for the total energy measured as a Kolmogorov-Nagumo average, which reads $\dot U_\gamma^\beta = \dot W_\gamma + \dot Q_\gamma$, where the work and heat flows into the system of interest are given by
$$\dot W_\gamma = \sum_i p_i\, \dot\epsilon_i\, \Phi_i, \qquad \dot Q_\gamma = \frac{1}{\gamma\beta} \sum_i \dot p_i\, \Phi_i, \qquad (57)$$
with $\Phi_i := e^{\gamma\beta\epsilon_i}/\sum_j p_j\, e^{\gamma\beta\epsilon_j}$. Using these expressions, we can now derive the second law of thermodynamics for the case of the Rényi entropy and an arbitrary non-equilibrium process driven by linear Markov dynamics. To this end, let us consider a standard master equation given by
$$\dot p_i(t) = \sum_j \left[w_{ij}\, p_j(t) - w_{ji}\, p_i(t)\right], \qquad (58)$$
where $\dot p_i(t) := \frac{\mathrm{d}p_i(t)}{\mathrm{d}t}$ is the time derivative of $p_i(t)$ and $w_{ij}$ is the transition rate from state $j$ to state $i$. Let us consider a control protocol $\lambda(t)$ which controls the energy spectrum $\epsilon_i(t) \equiv \epsilon_i(\lambda(t))$. Let us also introduce the Boltzmann distribution $\pi_i(t) = \frac{1}{Z} \exp(-\beta \epsilon_i(t))$. We assume that the transition rates satisfy detailed balance, i.e.,
$$w_{ij}\, \pi_j = w_{ji}\, \pi_i. \qquad (59)$$
The time derivative of the entropy and the heat flow can then be expressed as
$$\dot R_\gamma = \frac{1}{\gamma} \sum_i \dot p_i\, \frac{p_i^{-\gamma}}{P^{(\gamma)}}, \qquad \dot Q_\gamma = \frac{1}{\gamma\beta} \sum_i \dot p_i\, \Phi_i,$$
where we use $P^{(\gamma)}(t) = \sum_j p_j(t)^{1-\gamma}$ as a shorthand notation. In the rest of the text, we will often omit the explicit dependence on time for the sake of clarity. Then, the entropy production rate can be calculated as
$$\dot\Sigma_\gamma = \dot R_\gamma - \beta \dot Q_\gamma = \frac{1}{\gamma} \sum_i \dot p_i \left(\frac{p_i^{-\gamma}}{P^{(\gamma)}} - \Phi_i\right).$$
Thus, it is possible to write the entropy production rate as
$$\dot\Sigma_\gamma = \frac{1}{2} \sum_{ij} J_{ij}\, F_{ij},$$
where $J_{ij} := w_{ij} p_j - w_{ji} p_i$ are the thermodynamic fluxes and $F_{ij} = \sigma^{\mathrm{bath}}_{ij} + \sigma^{\mathrm{sys}}_{ij}$ the thermodynamic forces, with
$$\sigma^{\mathrm{sys}}_{ij} = \frac{1}{\gamma}\, \frac{p_i^{-\gamma} - p_j^{-\gamma}}{P^{(\gamma)}}, \qquad \sigma^{\mathrm{bath}}_{ij} = \frac{\Phi_j - \Phi_i}{\gamma}.$$
For the case of $\gamma \to 0$, the thermodynamic force reduces to the ordinary thermodynamic force $F_{ij} = \ln \frac{w_{ij} p_j}{w_{ji} p_i}$. Using this derivation, one can show that the entropy production satisfies the second law of thermodynamics, which in this case reads
$$\dot\Sigma_\gamma \geq 0. \qquad (66)$$
Let us now define a generalized H-function as $H_\gamma(p) := D_\gamma(p\|\pi) = \beta\big(F_\gamma(p) - F_\gamma(\pi)\big)$, where the second equality is based on Eq. (53). Then, one can show that for the case of pure relaxation -- i.e., where $w_{ij}$ does not depend on time and satisfies detailed balance -- $H_\gamma$ fulfills a generalized H-theorem, given by the fact that $H_\gamma(p) \geq 0$ and
$$\dot H_\gamma \leq 0, \qquad (68)$$
with $\dot H_\gamma = 0$ if and only if $p = \pi$. This implies that $H_\gamma(p)$ is a Lyapunov function of the dynamics, i.e., a non-negative quantity that monotonically decreases with time until equilibrium is attained. The proof of both the second law and the H-theorem is provided in Appendix A. Interestingly, the H-function (i.e., the distance from the equilibrium distribution) and the entropy production (i.e., the maximum amount of reversible work) generally differ if $\gamma \neq 0$, contrary to the standard Boltzmann-Gibbs framework ($\gamma = 0$) where both are equal to the Kullback-Leibler divergence. The difference between $\dot\Sigma_\gamma$ (Eq. (66)) and $\dot H_\gamma$ (Eq. (68)) lies in the replacement of $\Phi_i$, i.e., a different averaging in the denominator. In fact, by combining the first and second laws of thermodynamics, one can find that
$$\dot\Sigma_\gamma = \beta\big(\dot W_\gamma - \dot F_\gamma\big) \geq 0.$$
To conclude, it is important to note that both Eqs.
(66) and (68) reveal a family of inequalities -- indexed by $\gamma$ -- that hold for any stochastic system whose dynamics are governed by a master equation (as in Eq. (58)) and satisfy detailed balance (as in Eq. (59)). Said differently, our results reveal that any process following a master equation and satisfying detailed balance will satisfy those inequalities for all values of $\gamma$. It is then the constraints of the system that determine which values of $\gamma$ are physically meaningful: systems obeying linear constraints lead to a Shannon-type second law (with $\gamma = 0$), while non-linear Kolmogorov-Nagumo constraints lead to a Rényi-type second law (with $\gamma \neq 0$).
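The relaxation scenario can be illustrated with a minimal simulation. The sketch below uses a three-state master equation with Metropolis-type rates (our illustrative choice) and monitors the Rényi-Csiszár divergence from equilibrium; note that the H-function of the text is based on the Bregman-type divergence, so this is a simplified stand-in that also decreases monotonically under detailed-balance relaxation:

```python
import math

eps = [0.0, 1.0, 2.0]
beta, gamma = 1.0, 0.5
z = sum(math.exp(-beta * e) for e in eps)
pi = [math.exp(-beta * e) / z for e in eps]

# Rates w[i][j] for the jump j -> i, chosen so that w_ij pi_j = w_ji pi_i
w = [[0.0] * 3 for _ in range(3)]
for i in range(3):
    for j in range(3):
        if i != j:
            w[i][j] = min(1.0, pi[i] / pi[j])     # Metropolis-type rates

def d_gamma(p, q):
    """Renyi-Csiszar divergence (1/gamma) ln sum p^(1+gamma) q^(-gamma)."""
    return math.log(sum(a ** (1 + gamma) * b ** (-gamma) for a, b in zip(p, q))) / gamma

p = [0.05, 0.05, 0.9]                             # start far from equilibrium
dt, divs = 0.01, []
for _ in range(2000):                             # Euler integration of the master equation
    divs.append(d_gamma(p, pi))
    flows = [sum(w[i][j] * p[j] - w[j][i] * p[i] for j in range(3)) for i in range(3)]
    p = [p[i] + dt * flows[i] for i in range(3)]

assert all(d2 <= d1 + 1e-12 for d1, d2 in zip(divs, divs[1:]))   # monotone decrease
assert divs[-1] < 1e-3                                           # relaxation towards pi
```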

Fluctuation theorems
We now show that trajectory thermodynamics remains in the same functional form as in ordinary stochastic thermodynamics. Throughout this section, we denote trajectories and their functionals by bold symbols. We denote the probability of observing a trajectory by an uppercase bold symbol $\boldsymbol{P}$, and the probability of observing a state $x$ at time $t$ by a lowercase symbol $p$.
Let us consider a trajectory $\boldsymbol{x} = (x_0, t_0; x_1, t_1; \ldots; x_f, t_f)$ starting at $t_0$ and finishing at $t_f$. We denote the trajectory state at time $t$ as $\boldsymbol{x}(t)$. The trajectory dwells in state $\boldsymbol{x}(t) = x_{i-1}$ for $t_{i-1} \leq t < t_i$ and then jumps to a state $x_i$ at time $t_i$. Let us define $j(\boldsymbol{x})$ as the number of trajectory jumps. The probability of observing a trajectory $\boldsymbol{x}$ can be obtained from the master equation (58) as
$$P[\boldsymbol{x}] = p_{x_0}(t_0) \left[\prod_{j=1}^{j(\boldsymbol{x})} e^{-\int_{t_{j-1}}^{t_j} w_{x_{j-1} x_{j-1}}(\tau)\, \mathrm{d}\tau}\; w_{x_j x_{j-1}}(t_j)\right] e^{-\int_{t_{j(\boldsymbol{x})}}^{t_f} w_{x_f x_f}(\tau)\, \mathrm{d}\tau},$$
where $w_{xx} := \sum_{y \neq x} w_{yx}$ denotes the escape rate from state $x$. The energy corresponding to the state $x$ is controlled by a time-dependent protocol $\lambda(t)$, so $\epsilon_x(t) \equiv \epsilon_x(\lambda(t))$. Similarly to the previous section, we assume that the transition rates satisfy the detailed balance condition (59).
Let us now introduce the operation of time reversal, $\tilde{\boldsymbol{x}}(t) := \boldsymbol{x}(t_0 + t_f - t)$. This concept is crucial for understanding the connection between the irreversibility of mesoscopic systems and entropy production. By considering the time-reversed dynamics with the time-reversed control protocol $\tilde\lambda(t) := \lambda(t_0 + t_f - t)$, the probability $\tilde P[\tilde{\boldsymbol{x}}]$ of observing the time-reversed trajectory $\tilde{\boldsymbol{x}}$ in the time-reversed dynamics can be expressed analogously to $P[\boldsymbol{x}]$, where $\tilde P$ denotes the fact that the probability is calculated in the time-reversed dynamics. By calculating the log ratio of the trajectory probabilities, the waiting probabilities $e^{-\int_{t_{j-1}}^{t_j} w_{x_{j-1} x_{j-1}}(\tau)\,\mathrm{d}\tau}$ cancel out and we end up with
$$\ln \frac{P[\boldsymbol{x}]}{\tilde P[\tilde{\boldsymbol{x}}]} = \ln \frac{p_{x_0}(t_0)}{p_{x_f}(t_f)} + \sum_{j \in j(\boldsymbol{x})} \ln \frac{w_{x_j x_{j-1}}(t_j)}{w_{x_{j-1} x_j}(t_j)}.$$
Let us now define the entropy of state $x$ at time $t$ as the Hartley information of $p_x(t)$, i.e., $s_x(t) = -\ln p_x(t)$. From this definition, it is possible to define a trajectory entropy $\boldsymbol{s}[\boldsymbol{x}](t) \equiv s_{\boldsymbol{x}(t)}(t) = -\ln p_{\boldsymbol{x}(t)}(t)$. By using the condition of detailed balance, we can express the log-ratio as
$$\ln \frac{P[\boldsymbol{x}]}{\tilde P[\tilde{\boldsymbol{x}}]} = \big[s_{x_f}(t_f) - s_{x_0}(t_0)\big] - \beta \sum_{j \in j(\boldsymbol{x})} \big[\epsilon_{x_j}(t_j) - \epsilon_{x_{j-1}}(t_j)\big]. \qquad (73)$$
Above, the first difference is equal to the change of the trajectory entropy from the beginning of the trajectory to its end, and is denoted by $\Delta s$. Please note that while $\Delta s$ formally depends on the trajectory, it actually only depends on its starting and ending points (and hence we do not use the bold symbol for it). The second term is equal to $-\beta$ times the heat exchanged with the reservoir during the trajectory, denoted by $\boldsymbol{q}[\boldsymbol{x}] := \sum_{j \in j(\boldsymbol{x})} \epsilon_{x_j}(t_j) - \epsilon_{x_{j-1}}(t_j)$. Thus, the log ratio of the forward and reversed probabilities is equal to
$$\boldsymbol{\sigma}[\boldsymbol{x}] := \ln \frac{P[\boldsymbol{x}]}{\tilde P[\tilde{\boldsymbol{x}}]} = \Delta s - \beta\, \boldsymbol{q}[\boldsymbol{x}], \qquad (74)$$
which is the trajectory entropy production.
To show the relation between trajectory quantities and ensemble quantities, we calculate the time derivative of the entropy production. The time derivative of the trajectory entropy can be expressed as a sum of two contributions: the first term is due to the change in the probability distribution, and the second term is due to trajectory jumps. The time derivative of the trajectory heat has an analogous structure. In both cases, the time derivative depends only on $x_- = \boldsymbol{x}(t_-)$ and $x_+ = \boldsymbol{x}(t_+)$. By introducing the trajectory entropy production rate, the ensemble entropy production rate (Eq. (63)) can be recovered by averaging over trajectories. In this case, the relation between trajectory quantities and ensemble quantities is not so straightforward due to the more complicated averaging.
Let us now focus on another aspect of the entropy production, i.e., as a measure of irreversibility. We define a Kolmogorov-Nagumo average over all trajectories of a functional $\boldsymbol{G}[\boldsymbol{x}]$ as
$$\langle \boldsymbol{G} \rangle_\gamma = \frac{1}{\gamma} \ln \int \mathcal{D}\boldsymbol{x}\; P[\boldsymbol{x}]\, e^{\gamma \boldsymbol{G}[\boldsymbol{x}]},$$
where $\mathcal{D}\boldsymbol{x}$ is the path-integral measure. Using this, one can find the ensemble entropy production as given by
$$\Sigma_\gamma = \langle \boldsymbol{\sigma} \rangle_\gamma = D_\gamma(P \| \tilde P),$$
where $D_\gamma(p\|q) = \frac{1}{\gamma} \ln \int \mathrm{d}x\; p(x)^{\gamma+1}\, q(x)^{-\gamma}$ is the Rényi-Csiszár divergence.
Furthermore, one can use Eq. (74) to show the validity of a detailed fluctuation theorem that holds in the common form [64]. Let us define the probability of observing a trajectory entropy production $\sigma$ as
$$P(\sigma) = \int \mathcal{D}\boldsymbol{x}\; P[\boldsymbol{x}]\, \delta\big(\sigma - \boldsymbol{\sigma}[\boldsymbol{x}]\big),$$
i.e., we sum over all trajectories that result in an entropy production equal to $\sigma$. By a simple manipulation, we obtain
$$P(\sigma) = \int \mathcal{D}\boldsymbol{x}\; \tilde P[\tilde{\boldsymbol{x}}]\, e^{\boldsymbol{\sigma}[\boldsymbol{x}]}\, \delta\big(\sigma - \boldsymbol{\sigma}[\boldsymbol{x}]\big) = e^{\sigma}\, \tilde P(-\sigma).$$
Here $\tilde P$ again denotes that the probability is calculated for the time-reversed dynamics. As a result, we obtain the detailed fluctuation theorem
$$\frac{P(\sigma)}{\tilde P(-\sigma)} = e^{\sigma}.$$
Therefore, on the trajectory level, the relations remain exactly the same as in the case of the ordinary Shannon-Boltzmann-Gibbs framework. Finally, the integrated fluctuation theorem can then be formulated as
$$\int \mathrm{d}\sigma\; P(\sigma)\, e^{-\sigma} = 1,$$
where the integral takes place over the values of the trajectory entropy production. An application of Jensen's inequality yields $\int \mathrm{d}\sigma\, P(\sigma)\, \sigma \geq 0$.
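The integrated fluctuation theorem can be checked by exhaustive enumeration. The sketch below uses a discrete-time analogue of the driven master equation (a simplification of the continuous-time setting of the text; the protocol, rates, and initial conditions are illustrative assumptions):

```python
import math
from itertools import product

def boltzmann(e, beta=1.0):
    z = sum(math.exp(-beta * ei) for ei in e)
    return [math.exp(-beta * ei) / z for ei in e]

def step_matrix(e, beta=1.0, r=0.3):
    """Detailed-balance (Metropolis-like) transition matrix T[i][j] = P(j -> i)."""
    pi = boltzmann(e, beta)
    T = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        i = 1 - j
        T[i][j] = r * min(1.0, pi[i] / pi[j])
        T[j][j] = 1.0 - T[i][j]
    return T

protocol = [[0.0, 0.5], [0.0, 1.0], [0.0, 1.5]]      # driven energy levels epsilon(t)
fwd = [step_matrix(e) for e in protocol]
rev = [step_matrix(e) for e in reversed(protocol)]   # time-reversed protocol

p0 = boltzmann(protocol[0])                          # forward initial state
p0_rev = boltzmann(protocol[-1])                     # reversed initial state

total = 0.0
for x in product((0, 1), repeat=len(protocol) + 1):  # all trajectories
    pf = p0[x[0]]
    for t, (a, b) in enumerate(zip(x, x[1:])):
        pf *= fwd[t][b][a]
    xr = x[::-1]                                     # time-reversed trajectory
    pb = p0_rev[xr[0]]
    for t, (a, b) in enumerate(zip(xr, xr[1:])):
        pb *= rev[t][b][a]
    sigma = math.log(pf / pb)                        # trajectory entropy production
    total += pf * math.exp(-sigma)                   # accumulates <e^{-sigma}>

assert abs(total - 1.0) < 1e-12                      # integrated fluctuation theorem
```

The identity holds exactly here because $e^{-\sigma}$ maps each forward trajectory weight onto the corresponding reversed one, which sums to unity.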

VI. THERMODYNAMIC LENGTH
Thermodynamic length is a well-known metric that characterizes the distance between thermodynamic states. More specifically, this metric is related to the dissipation in a thermodynamic system due to finite-time transformations [65,66], and has important connections with the Jensen-Shannon divergence [67], Fisher information, and Rao's entropy differential metric [68]. Therefore, thermodynamic length is of great interest for out-of-equilibrium analyses.
Let us consider a collection of thermodynamic states that can be parametrized by $(\theta_1, \ldots, \theta_n) \in \Theta$. Then, the thermodynamic length of a path $s(t) = (s_1(t), \ldots, s_n(t)) : [0, \tau] \to \Theta$ can be calculated as
$$\mathcal{L} = \int_0^\tau \mathrm{d}t\, \sqrt{\sum_{ij} g_{ij}\big(s(t)\big)\, \dot s_i(t)\, \dot s_j(t)},$$
where $\dot s_i(t)$ is the time derivative of $s_i(t)$ and $g_{ij}$ is a metric tensor corresponding to the well-known Fisher metric [44], which is given by
$$g_{ij}(\theta) = \frac{\partial^2}{\partial \theta'_i\, \partial \theta'_j}\, D\big(p_\theta \| p_{\theta'}\big) \Big|_{\theta' = \theta}. \qquad (87)$$
Above, $D(p\|q)$ corresponds to a divergence, a central quantity in information geometry from which all other geometrical properties -- including the metric tensor (87) and connections -- can be derived. It is worth noting that there are two fundamental types of divergences: (1) Csiszár divergences, of the form
$$D_f(p\|q) = \sum_i q_i\, f\!\left(\frac{p_i}{q_i}\right);$$
(2) Bregman divergences, of the form
$$D_\phi(p\|q) = \phi(p) - \phi(q) - \nabla\phi(q) \cdot (p - q),$$
for a convex function $\phi$. Both families of divergences are closely related to a trace-class generalized entropy $S_{f,g}(p) = g\left(\sum_i f(p_i)\right)$ [69], where $f$ is an increasing function and $g$ is a concave function. For the case of $g(x) = x$ and $f(x) = x \ln x$, both divergences reduce to the well-known Kullback-Leibler divergence. This shows that the Fisher-Rao metric corresponding to the Csiszár divergence is equivalent to the Fisher metric corresponding to the Kullback-Leibler divergence [46]. Furthermore, as elaborated in Eq. (53) (see also Ref. [45]), the Bregman divergence corresponds to the difference of free energies -- i.e., the amount of reversible work between two states. These facts motivate us to focus on the Rényi-Bregman divergence.
To recapitulate, Eq. (53) shows that the Rényi-Bregman divergence can be expressed as
$$ D_{\gamma}(p\|q) = -R_{\gamma}(p) + R_{\gamma}(p, q), $$
where $R_{\gamma}(p, q)$ is the Rényi-Bregman cross entropy. This generalizes the standard relationship between the Kullback-Leibler divergence, Shannon entropy, and cross-entropy, $D_0(p\|q) = -H(p) + H_{\mathrm{cross}}(p, q)$ with $H_{\mathrm{cross}}(p, q) = -\sum_i p_i \ln q_i$, which corresponds to the case of $\gamma = 0$. By leveraging the structure of the Rényi-Bregman divergence, one can then obtain from Eq. (87) an explicit expression for the metric tensor valid for arbitrary $\gamma$, and use it to evaluate the squared line element $\mathrm{d}s^2 = \sum_{ij} g_{ij}\, \mathrm{d}\theta_i\, \mathrm{d}\theta_j$ along a path.
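The $\gamma = 0$ reduction can be verified numerically. The sketch below is not from the paper: it checks the standard decomposition $D_0(p\|q) = H_{\mathrm{cross}}(p,q) - H(p)$, and additionally that the standard order-$(1+\gamma)$ Rényi divergence (which may differ from the Rényi-Bregman divergence of Eq. (53) for $\gamma \neq 0$) reduces to the Kullback-Leibler divergence as $\gamma \to 0$.

```python
import numpy as np

p = np.array([0.1, 0.6, 0.3])
q = np.array([0.3, 0.3, 0.4])

shannon = float(-np.sum(p * np.log(p)))       # H(p)
cross = float(-np.sum(p * np.log(q)))         # H_cross(p, q)
d_kl = float(np.sum(p * np.log(p / q)))       # D_0(p || q)

def renyi_div(p, q, g):
    # Standard Renyi divergence of order 1 + g in the gamma-parametrization,
    # (1/g) * log sum_i p_i^(1+g) q_i^(-g); recovers KL as g -> 0.
    return float(np.log(np.sum(p**(1.0 + g) * q**(-g))) / g)

print(d_kl, cross - shannon, renyi_div(p, q, 1e-6))
```

The decomposition holds exactly; the small-$\gamma$ evaluation agrees with the Kullback-Leibler value up to terms of order $\gamma$.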

VII. CONCLUSIONS
This paper presents a description of the thermodynamics of systems that follow non-linear constraints in the form of Kolmogorov-Nagumo averages. Our results provide a first step towards a deeper understanding of the thermodynamics of such systems, opening the door to the analysis of systems with long-range correlations and/or multifractal properties, which are naturally described by the Rényi entropy [17,21]. Furthermore, recent applications of non-equilibrium thermodynamics based on Rényi entropy (and consequently exponential Kolmogorov-Nagumo averages) to utility theory have been investigated [70], opening the possibility of applying the presented formalism in the context of game theory.
The thermodynamics of Kolmogorov-Nagumo averages was found to be naturally centered around the notion of Rényi entropy and a generalized Legendre transform, which leads to a novel form of free energy. Our results show that this free energy, in turn, is directly related to the entropy production of the system. The presented framework allows one to extend most thermodynamic relations to these non-linear systems -- including the second law of thermodynamics and fluctuation theorems -- provided their relationships are adequately recast in terms of the deformed Legendre transform, as summarized in Table I.
In the context of current attempts to apply generalized entropies, known mainly from information theory, to thermodynamics, it is worth mentioning that generalized entropies -- such as the Rényi entropy -- emerge not only in systems subject to non-arithmetic means as constraints, but also in the description of systems that obey a non-linear master equation with an ordinary arithmetic average [32]. In the context of equilibrium thermodynamics, it has been argued -- perhaps surprisingly -- that different entropies and constraints can lead to the same equilibrium distribution [71]. Our results can be seen as extending these ideas to non-equilibrium processes, as they imply that different combinations of particular dynamics (i.e., the precise form of the Fokker-Planck/master equation), detailed balance (identifying the stationary distribution with the equilibrium distribution), and energetic constraints can also lead to the same entropic functional. The precise choice of thermodynamic and dynamic relations depends on the particular physical system one intends to describe. How to better characterize the classes of dynamics and constraints that lead to similar phenomena is an important question that deserves further study in future work.
$\sum_i p_i(t)\, \pi_i^{-\gamma}(t)$, we can then find that $\dot{\Sigma}_\gamma$ takes the form given in Eq. (A5). Similarly, an analogous derivation shows how the time derivative of the H-function, $\dot{H}_\gamma$, can be expressed. To conclude the derivation, let us now instead consider a more general expression $\dot{Y}_\gamma$ and show that it is non-negative for $\Upsilon_i(t)$. By using the inequality $\gamma\,(x^{-1/\gamma} - 1) \geq \log(1/x)$, valid for $\gamma > 0$, one can bound $\dot{Y}_\gamma$ from below. In both cases one can use the fact that the process satisfies detailed balance, i.e., $w_{ij}\, \pi_j = w_{ji}\, \pi_i$, and express $Y$ accordingly. Then, by using $\log(1/x) \geq 1 - x$, one obtains a lower bound which is equal to $-\dot{Y}_\gamma$. Thus, one finally finds that $\dot{Y}_\gamma \geq -\dot{Y}_\gamma$, which in turn leads to $\dot{Y}_\gamma \geq 0$. This proves both the second law ($\Sigma_\gamma \geq 0$) and the H-theorem ($\dot{H}_\gamma \leq 0$).
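The monotone decay of a Rényi-type H-function under detailed-balanced dynamics can be illustrated numerically. The sketch below is not from the paper: it assumes a three-state Metropolis chain (the stationary distribution, $\gamma = 0.5$, and all names are illustrative choices), checks the elementary inequality $\gamma(x^{-1/\gamma} - 1) \geq \log(1/x)$ used in the derivation, and verifies that the order-$(1+\gamma)$ Rényi divergence from the stationary distribution never increases along the evolution.

```python
import numpy as np

pi = np.array([0.5, 0.3, 0.2])   # assumed stationary distribution
n = len(pi)

# Metropolis-type transition matrix, obeying detailed balance w.r.t. pi:
# pi[i] * T[i, j] == pi[j] * T[j, i]
T = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            T[i, j] = min(1.0, pi[j] / pi[i]) / (n - 1)
for i in range(n):
    T[i, i] = 1.0 - T[i].sum()

gamma = 0.5

# Elementary inequality from the appendix: gamma*(x**(-1/gamma) - 1) >= log(1/x)
xs = np.linspace(0.05, 5.0, 100)
gap = gamma * (xs ** (-1.0 / gamma) - 1.0) - np.log(1.0 / xs)

def H_gamma(p):
    # Renyi-type H-function of order 1 + gamma: divergence of p from pi
    return float(np.log(np.sum(p ** (1.0 + gamma) * pi ** (-gamma))) / gamma)

p = np.array([1.0, 0.0, 0.0])    # far-from-stationarity initial condition
hs = []
for _ in range(60):
    hs.append(H_gamma(p))
    p = p @ T                     # one step of the Markov dynamics

diffs = np.diff(hs)
print(hs[0], hs[-1], diffs.max())
```

Monotonicity here is an instance of the data-processing inequality for Rényi divergences under stochastic maps that leave $\pi$ invariant.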