Entropy production rates for different notions of partial information

Experimentally monitoring the dynamics of a physical system, one cannot possibly resolve all the microstates or all the transitions between them. Theoretically, these partially observed systems are modeled by considering only the observed states and transitions while the rest are hidden, by merging microstates into a single mesostate, or by decimating unobserved states. The deviation of a system from thermal equilibrium can be characterized by a non-zero value of the entropy production rate (EPR). Based on the partially observed information of the states or transitions, one can only infer a lower bound on the total EPR. Previous studies focused on several approaches to optimize the lower bounds on the EPR, fluctuation theorems associated with the apparent EPR, information regarding the network topology inferred from partial information, etc. Here, we calculate partial EPR values of Markov chains driven by external forces from different notions of partial information. We calculate partial EPR from state-based coarse-graining, namely decimation and two lumping protocols with different constraints, either preserving transition flux, or the occupancy number correlation function. Finally, we compare these partial EPR values with the EPR inferred from the observed cycle affinity. Our results can further be extended to other networks and various external driving forces.


Introduction
Entropy production (EP) is a thermodynamic quantity intimately tied to the irreversibility of a nonequilibrium process [1,2]. The EP sets fundamental bounds on the efficiency of physical systems like heat engines [2] and biological processes [3][4][5][6], and provides insight into the thermodynamics of nonequilibrium systems [7]. Estimating the total EP along a trajectory requires access to the entropy change of the system and the amount of heat dissipated to the surrounding reservoirs during the process [2,8]. A major challenge in EP inference is the large number of out-of-equilibrium degrees of freedom, many of which are not accessible to an external observer and cannot be directly resolved [9,10]. Nevertheless, a lower bound on the total EP can be obtained, for example, based on the fluctuations of an observed thermodynamic flux, like transition currents [11][12][13][14] or first passage times of the current [15][16][17], by the thermodynamic uncertainty relations. In essence, one can only estimate a lower bound on the total EP for a partially observed system. The observations can be, due to the finite spatiotemporal resolution, a subset of the 'states' [18][19][20][21][22][23] or a subset of the 'transitions' [24,25] between the different states.
Theoretically, the effective, coarse-grained dynamics of various chemical and biological processes which occur over a wide range of timescales can be written in terms of the slow timescale processes while taking into account the action of the fast ones [26,27]. A coarse-grained system, however, might appear to be time-reversible despite underlying nonequilibrium dynamics in hidden cycles [26,27]. Still, eliminating fast degrees of freedom does not necessarily alter the apparent EP if the cycle carrying the probability current is not removed in the decimation process [28]. Coarse-graining of this kind was first studied using a Markovian master equation, and the coarse-grained EP was shown to satisfy a fluctuation theorem [29]. In other scenarios, the entropy production rate (EPR) estimation for a coarse-grained system can exactly reproduce the total EPR. For example, by decimating states [30] while considering the inclusion of self-loops, the EPR obtained from a Markovianized master equation for the underlying non-Markov process estimates the actual EPR. Moreover, proper coarse-graining in the cycle space [31], or reducing bridge states [32], can also preserve the mean EPR. Finally, observing a single transition in a unicyclic network is sufficient to recover the total EPR [18,24,25].
EPR estimators for partially accessible systems have been studied widely, and several approaches have been proposed, including the passive partial entropy production (PPEP) and informed partial entropy production (IPEP) [18][19][20][21][22][23][33], which provide a lower bound on the total EPR. For the case of the PPEP estimation, the observer has access only to a few microstates and the transitions between them. In contrast, in the IPEP estimation, the observer relies on the observed dynamics and the affinity of the observed cycles. Both the PPEP and the IPEP fail at stalling conditions, e.g. in the absence of a net current along the accessible link. However, the Kullback-Leibler divergence (KLD) estimator based on the asymmetry of the waiting time distributions (WTDs) can provide a tighter lower bound in the absence of the net flux along the observed link for second-order Markov processes [19,24,34,35]. Recently, EPR estimators have been proposed based on observed transitions rather than observed states [24,25]. The lower bound on the total EPR was estimated based on observing two successive repeated transitions along the same direction and the waiting times between the transitions or the 'inter-transition time' [24,25]. The topology information of the network can also be deduced using this method from a single observed link for a unicyclic network or multiple observed links for a general Markov network [25]. The transition-based estimators equal the informed partial EPR for a unicyclic network [18,25] and a general Markov network without hidden cycles.
Several attempts have been made to find a tighter bound on the total EPR. One approach suggests fitting the observed dynamics of a discrete-time Markov chain to an underlying Markov model that produces the observed statistics of the coarse-grained system [36]. When the number of hidden states is known, this fitting method can provide a tight bound on the total EPR. Moreover, two recently developed estimators are based on an optimization process of searching over possible underlying Markov models that obey the observed statistics of mass transfer between the mesostates [37] and the WTDs for the time forward and time-reversed transitions [38]. A novel approach for utilizing waiting time information by reformulation of the observed statistics of intra-transitions within macrostates was demonstrated to yield an improved lower bound on the EPR [39]. Further, a hierarchy of EPR estimators has been recently proposed, by optimizing over systems with the same observed mass rates and the same moments of the WTDs between coarse-grained states [40].
In general, there are two notions of partial information. In the first notion, the observer can distinguish between the observed and the hidden part of the system, where no information about the network topology is available for the hidden subsystem. Here, the sum of the partial EP of the observed and the hidden subsystems equals the total EP [18]. The other notion involves coarse-graining [28,29,32][41][42][43][44][45][46][47][48], performed by lumping, i.e. merging several states into a single compound state [49,50], or decimating [30] states. In decimation, the probability densities of the decimated states are set to zero and redistributed among the probability densities of the surviving states [51]. In lumping [49,50], the steady-state probabilities of the merged states are summed, whereas the rest of the states are not affected. One lumping protocol preserves the transition fluxes between the states, except for the states undergoing lumping [49]. Another lumping procedure instead conserves the time-dependent occupation number correlation function after the coarse-graining [50]. Whether the internal dissipation of the states undergoing coarse-graining is accounted for is not guaranteed and depends on the coarse-graining procedure. Generally, coarse-graining is one method by which partially observed systems are often modeled. Further details on the different coarse-graining methods are discussed in the upcoming sections.
Here, we study how the above notions of partial information affect the partial EPR values obtained from partially observed systems. We start with the evolution equation for the EP of a driven system governed by a Master Equation, introduce lumping and decimation procedures on the system states, and obtain the partial EPR from an approximated Master (AM) equation. We further compare EPR values obtained from different coarse-graining approaches for different network topologies with the total mean EPR values.
The paper is organized as follows. First, we briefly review the EPR obtained from partially observed systems with observed and hidden substates in section 2. Next, we discuss the EPR inferred following the various coarse-graining methods. Section 3 shows the calculation of the total mean EPR from the evolution of the EP for a Markov chain and the evaluation of the partial mean EPR from the non-Markovian coarse-grained system following the AM equation. Section 4 discusses the decimation method and the two mean EPRs obtained from it, one from the scaled cumulant generating function (SCGF) and one from the AM equation. In section 5, we describe two different lumping coarse-graining methods and calculate the corresponding mean EPRs. We discuss our results in section 6 and, finally, conclude our findings in section 7.

Partial EPRs
We consider a continuous-time Markov chain over a finite number of discrete states. The generator of the Markov jump process, or the transition rate matrix, is denoted by $W$. The off-diagonal elements, $w_{ji} = [W]_{ji}$ with $j \neq i$, are the transition rates from state $i$ to state $j$, where $w_{ji}$ is non-zero only if $w_{ij}$ is non-zero. The diagonal elements, $w_{ii} = [W]_{ii}$, are set by the exit rates from state $i$, $\lambda_i = -w_{ii} = \sum_{j \neq i} w_{ji}$. The evolution of the state probabilities is governed by the Master Equation $d_t P(t) = W P(t)$, where $d_t$ denotes the time derivative, and $P(t) = (P_1(t), P_2(t), \ldots)^T$ denotes the vector of probabilities to find the system in the respective states at time $t$. The system reaches a steady state in the long-time limit, denoted by the state probability vector $\pi$, which can be obtained by solving $W\pi = 0$. The steady-state distribution can also be obtained directly from the transition rates using the matrix-tree theorem [23,52]. The stationary current between states $i$ and $j$ is defined by $J^{\pi}_{ji} = w_{ji}\pi_i - w_{ij}\pi_j$. We start with simple yet non-trivial networks of continuous-time Markov chains (figure 1) [18,53] to calculate the mean partial EPR for partially observed dynamics. In our model, we chose a system with 4 microstates, in which states 1 and 2 are observed, whereas states 3 and 4 cannot be distinguished and are masked into a coarse-grained hidden state, H, as shown in the shaded area of figure 1. In practical scenarios, one can only observe a few transitions among states, and only a subset of the states. In PPEP, the observer only has access to an observed link between two microstates and the transition rates $w_{12}$, $w_{21}$ between them. In the long-time limit, the PPEP reaches its steady-state rate [18][21][22][23], given by
$$\sigma_{\mathrm{PPEP}} = \left(w_{12}\pi_2 - w_{21}\pi_1\right)\log\frac{w_{12}\pi_2}{w_{21}\pi_1}, \qquad (1)$$
where log is the natural logarithm.
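As a minimal numerical sketch of the quantities defined above, the following Python code builds a generator for a hypothetical fully connected 4-state network, solves $W\pi = 0$, and evaluates the PPEP of the 1-2 link together with the total EPR summed over all links. The rate values and the helper names (`generator`, `steady_state`, `link_epr`, `total_epr`) are illustrative assumptions and are not the parameters of figure 1.

```python
import numpy as np

def generator(rates, n):
    """Rate matrix W with W[j, i] = w_ji (jump i -> j) and zero column sums."""
    W = np.zeros((n, n))
    for (j, i), w in rates.items():
        W[j, i] = w
    W -= np.diag(W.sum(axis=0))  # diagonal entries are minus the exit rates
    return W

def steady_state(W):
    """Solve W pi = 0 together with the normalization sum(pi) = 1."""
    n = W.shape[0]
    A = np.vstack([W, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def link_epr(W, pi, a, b):
    """Single-link term (w_ab pi_b - w_ba pi_a) log(w_ab pi_b / (w_ba pi_a))."""
    forward = W[a, b] * pi[b]   # stationary flux b -> a
    backward = W[b, a] * pi[a]  # stationary flux a -> b
    return (forward - backward) * np.log(forward / backward)

def total_epr(W, pi):
    """Sum of the single-link terms over all bidirectional links."""
    n = W.shape[0]
    return sum(link_epr(W, pi, i, j)
               for i in range(n) for j in range(i + 1, n)
               if W[i, j] > 0 and W[j, i] > 0)

# Hypothetical rates {(j, i): w_ji}; states 1..4 of the text are indices 0..3.
rates = {
    (1, 0): 1.0, (0, 1): 2.0,
    (2, 1): 3.0, (1, 2): 1.5,
    (3, 2): 2.0, (2, 3): 0.5,
    (0, 3): 4.0, (3, 0): 1.0,
    (2, 0): 1.0, (0, 2): 2.0,
    (3, 1): 1.0, (1, 3): 0.7,
}
W = generator(rates, 4)
pi = steady_state(W)
sigma_ppep = link_epr(W, pi, 0, 1)   # PPEP of the observed 1-2 link
sigma_total = total_epr(W, pi)
print(sigma_ppep, sigma_total)
```

In this sketch the PPEP is one non-negative term of the total sum, which makes the lower-bound property explicit.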
On the other hand, for the IPEP [18][21][22][23], in addition to the cycle affinity of the observed link, one has access to the steady-state probability densities of both of the observed states at stalling conditions. For calculating the IPEP, we additionally assume that the observer can tune the rates over the observed link and use the system's steady-state probabilities when the new rates are applied to infer information about the hidden states. The observer can tune the transition rates according to $w_{12}(F) = w_{12} e^{\beta \tilde{F} L} = w_{12} e^{F}$, together with the corresponding expression for $w_{21}(F)$, with driving force $\tilde{F}$ of dimension $\beta^{-1} L^{-1}$ and $F$ dimensionless. $L$ is the characteristic length scale, and $\beta$ is the inverse temperature. To calculate the IPEP, the system is tuned to stalling conditions, in which the current over the observed link vanishes at $F = F_{\mathrm{stall}}$, and the resulting steady-state probability densities of the two observed states are $\pi^{st}_1$ and $\pi^{st}_2$, respectively. In the long-time limit, the IPEP rate is given by [18,23]:
$$\sigma_{\mathrm{IPEP}} = \left(w_{12}\pi_2 - w_{21}\pi_1\right)\log\frac{w_{12}\pi^{st}_2}{w_{21}\pi^{st}_1}. \qquad (2)$$
The stalling distribution can be calculated by solving $W^{st}\pi^{st} = 0$, where the modified generator $W^{st}$ is obtained by setting the rates over the observed link (1-2 link) to zero in the original Markov rate matrix $W$, i.e. $w_{12} = w_{21} = 0$, and adjusting the exit rates accordingly (see [18][21][22][23] for the full derivation). The value of the stalling force can then be calculated according to $F_{\mathrm{stall}} = \frac{1}{2}\log\left(w_{12}\pi^{st}_2 / w_{21}\pi^{st}_1\right)$. Due to the added information about the hidden states through the stall force, the IPEP provides a better bound on the total EPR than the PPEP [18], and therefore we only compare to the IPEP in the following. In this manuscript, we assume access only to the state information, without knowledge of the fluctuations of the thermodynamic flux, the first passage time associated with the thermodynamic flux, or the waiting times.
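The IPEP recipe described above (remove the observed link, solve for the stalling distribution, then combine it with the actual stationary current) can be sketched numerically. The example below uses a hypothetical unicyclic 4-state ring, for which the IPEP is expected to coincide with the total EPR; the rates and helper names are assumptions made for illustration only.

```python
import numpy as np

def with_exit_rates(offdiag):
    """Turn a matrix of off-diagonal rates into a generator with zero column sums."""
    W = offdiag.copy()
    np.fill_diagonal(W, 0.0)
    W -= np.diag(W.sum(axis=0))
    return W

def steady_state(W):
    """Solve W pi = 0 together with the normalization sum(pi) = 1."""
    n = W.shape[0]
    A = np.vstack([W, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Hypothetical ring 1-2-3-4-1 (indices 0..3); k[j, i] is the rate for i -> j.
k = np.zeros((4, 4))
k[1, 0], k[0, 1] = 1.0, 0.5
k[2, 1], k[1, 2] = 2.0, 1.0
k[3, 2], k[2, 3] = 1.5, 0.5
k[0, 3], k[3, 0] = 1.0, 2.0

W = with_exit_rates(k)
pi = steady_state(W)

# Stalling distribution: remove the observed 1-2 link and re-solve.
k_stall = k.copy()
k_stall[0, 1] = k_stall[1, 0] = 0.0
pi_st = steady_state(with_exit_rates(k_stall))

# IPEP: actual stationary current times the affinity built from pi_st.
current = W[0, 1] * pi[1] - W[1, 0] * pi[0]
effective_affinity = np.log(W[0, 1] * pi_st[1] / (W[1, 0] * pi_st[0]))
sigma_ipep = current * effective_affinity

# PPEP of the same link and total EPR summed over the four ring links.
sigma_ppep = current * np.log(W[0, 1] * pi[1] / (W[1, 0] * pi[0]))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
sigma_total = sum(
    (W[a, b] * pi[b] - W[b, a] * pi[a]) * np.log(W[a, b] * pi[b] / (W[b, a] * pi[a]))
    for a, b in edges
)
print(sigma_ppep, sigma_ipep, sigma_total)
```

For networks with hidden cycles the same code applies once the list of links is extended, but the equality between the IPEP and the total EPR is then no longer expected.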

EPR estimation from master equation
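We consider a trajectory $\gamma$ with a sequence of states $\{i_0, \ldots, i_N\}$ and the corresponding waiting times $\{t_0, \ldots, t_N\}$ for a total observation time $T$, $\gamma = \{(i_0, t_0), \ldots, (i_N, t_N)\}$. The probability of observing such a trajectory is given by
$$P[\gamma] = P_{i_0}(0)\left[\prod_{k=0}^{N-1} e^{-\lambda_{i_k} t_k}\, w_{i_{k+1} i_k}\right] e^{-\lambda_{i_N} t_N}. \qquad (3)$$
Similarly, one can define the probability $\tilde{P}[\tilde{\gamma}]$ for the time-reversed trajectory $\tilde{\gamma} = \{(i_N, t_N), \ldots, (i_0, t_0)\}$, and express the total EP along the trajectory, $\Delta S$, using their ratio [18]:
$$\Delta S[\gamma] = \log\frac{P[\gamma]}{\tilde{P}[\tilde{\gamma}]}, \qquad (4)$$
whose transition-dependent part is $\sum_{i<j}\phi_{ij}\log\left(w_{ij}/w_{ji}\right)$, where $\phi_{ij}$ is the net number of transitions from state $j$ to state $i$. The EPR can be calculated from the evolution of the EP governed by the Master equation. Following Teza and Stella [30], we denote by $P_i(S, t)$ the probability that the system is at state $i$ at time $t$ and has produced entropy $S$ along all possible trajectories up to that time. Since each transition from state $i$ to state $j$ adds $\log(w_{ji}/w_{ij})$ to the total EP, we can write Master Equations for all the state probabilities $P_i(S, t)$ having EP $S$ at time $t$:
$$\partial_t P_i(S, t) = \sum_{j \neq i} w_{ij}\, P_j\!\left(S - \log\frac{w_{ij}}{w_{ji}}, t\right) - \lambda_i P_i(S, t), \qquad (5)$$
where $\lambda_i$ is the exit or escape rate from state $i$. We introduce the generating function $G_i(\Lambda, t) = \sum_S e^{\Lambda S} P_i(S, t)$ of $P_i(S, t)$ with respect to the entropy $S$, and reformulate equation (5) as:
$$\partial_t G_i(\Lambda, t) = \sum_{j \neq i} w_{ij}^{1+\Lambda}\, w_{ji}^{-\Lambda}\, G_j(\Lambda, t) - \lambda_i G_i(\Lambda, t). \qquad (6)$$
Equation (6) can be cast in matrix form as $d_t G = \tilde{W} G$, with $G$ the vector having $G_i(\Lambda, t)$ as its $i$th entry, and $\tilde{W}$ the tilted transition matrix [30,54]:
$$[\tilde{W}]_{ij} = \begin{cases} w_{ij}^{1+\Lambda}\, w_{ji}^{-\Lambda}, & i \neq j, \\ -\lambda_i, & i = j. \end{cases} \qquad (7)$$
The dominant eigenvalue $\Omega_{\mathrm{TM}}(\Lambda)$ of the tilted transition matrix $\tilde{W}$ is the SCGF of the EP. The mean EPR can be calculated from [30,54,55]:
$$\sigma_{\mathrm{TM}} = \left.\frac{\partial \Omega_{\mathrm{TM}}(\Lambda)}{\partial \Lambda}\right|_{\Lambda = 0}. \qquad (8)$$
TM refers to 'Total mean' (figure 2), as the dynamics of all the states have been considered in its calculation.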
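The tilted-matrix route to the mean EPR can be checked numerically against the direct steady-state sum over links. The sketch below assumes the generating-function convention $G_i = \sum_S e^{\Lambda S} P_i$, so that the off-diagonal tilted elements are $w_{ij}^{1+\Lambda} w_{ji}^{-\Lambda}$, and it takes the derivative of the dominant eigenvalue by a central finite difference. The random rates and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
K = rng.uniform(0.5, 3.0, size=(n, n))  # hypothetical rates, K[i, j] = w_ij (j -> i)
np.fill_diagonal(K, 0.0)
W = K - np.diag(K.sum(axis=0))          # generator with zero column sums

def tilted(W, lam):
    """Tilted matrix: off-diagonals w_ij^(1+lam) * w_ji^(-lam), diagonal unchanged."""
    Wt = np.array(W, dtype=float)
    m = W.shape[0]
    for i in range(m):
        for j in range(m):
            if i != j:
                if W[i, j] > 0 and W[j, i] > 0:
                    Wt[i, j] = W[i, j] ** (1 + lam) * W[j, i] ** (-lam)
                else:
                    Wt[i, j] = 0.0
    return Wt

def scgf(W, lam):
    """Dominant (largest real part) eigenvalue of the tilted matrix."""
    return np.max(np.linalg.eigvals(tilted(W, lam)).real)

def mean_epr_from_scgf(W, h=1e-5):
    """Central finite-difference estimate of d(Omega)/d(Lambda) at Lambda = 0."""
    return (scgf(W, h) - scgf(W, -h)) / (2 * h)

def mean_epr_direct(W):
    """Steady-state sum over links of J_ij * log(w_ij pi_j / (w_ji pi_i))."""
    vals, vecs = np.linalg.eig(W)
    pi = vecs[:, np.argmin(np.abs(vals))].real  # null vector of W
    pi = pi / pi.sum()
    m = W.shape[0]
    sigma = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            f, b = W[i, j] * pi[j], W[j, i] * pi[i]
            sigma += (f - b) * np.log(f / b)
    return sigma

sigma_scgf = mean_epr_from_scgf(W)
sigma_direct = mean_epr_direct(W)
print(sigma_scgf, sigma_direct)
```

The finite-difference step `h` is a tuning choice of this sketch; an analytic derivative based on the left and right eigenvectors would avoid it.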
In the next sections, we discuss the EP obtained from the decimation and the lumping coarse-graining methods. In each of the methods, we find a Markovianized rate matrix $U$ associated with the coarse-grained system dynamics, modelled by an AM equation of the form $d_t Q(t) = U Q(t)$, where $Q(t)$ is the column vector of probability densities of the coarse-grained states at time $t$.

Coarse graining via decimation
Decimation involves removing some states and redistributing their population density among the rest of the states. In this section, we consider decimating one of the states, say state d from equation (5), and evaluate the mean EPR from the decimated system. To do so, we perform the following four steps [30,51].
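We start with the following Master equation in terms of the probability of the EP:
$$\partial_t P_i(S, t) = \sum_{k \neq i} w_{ik}\, P_k\!\left(S - \log\frac{w_{ik}}{w_{ki}}, t\right) - \lambda_i P_i(S, t), \qquad (9)$$
where $\lambda_i$ is the exit rate from state $i$, $w_{ik}$ is the transition rate from state $k$ to state $i$, and $P_i(S, t)$ is the probability of being in state $i$ and having EP $S$ at time $t$.
The terms are the same as in section 3. First, we Fourier transform equation (9) using $P_i(S, \omega) = \int dt\, e^{-i\omega t} P_i(S, t)$, where $i$ runs over all the states. Second, we substitute $P_d(S, \omega)$ in the Fourier transformed equations, where $P_d(S, \omega)$ is given by
$$P_d(S, \omega) = \frac{1}{\lambda_d + i\omega}\sum_{k \neq d} w_{dk}\, P_k\!\left(S - \log\frac{w_{dk}}{w_{kd}}, \omega\right). \qquad (10)$$
Substituting $P_d(S, \omega)$ in the Fourier transformed equations for the remaining states $i \neq d$, we obtain the following:
$$i\omega P_i(S, \omega) = \sum_{k \neq i,d} w_{ik}\, P_k\!\left(S - \log\frac{w_{ik}}{w_{ki}}, \omega\right) + \frac{w_{id}}{\lambda_d + i\omega}\sum_{k \neq d} w_{dk}\, P_k\!\left(S - \log\frac{w_{id} w_{dk}}{w_{di} w_{kd}}, \omega\right) - \lambda_i P_i(S, \omega). \qquad (11)$$
Multiplying both sides by $(\lambda_d + i\omega)$, we get
$$(\lambda_d + i\omega)\, i\omega P_i(S, \omega) = (\lambda_d + i\omega)\left[\sum_{k \neq i,d} w_{ik}\, P_k\!\left(S - \log\frac{w_{ik}}{w_{ki}}, \omega\right) - \lambda_i P_i(S, \omega)\right] + w_{id}\sum_{k \neq d} w_{dk}\, P_k\!\left(S - \log\frac{w_{id} w_{dk}}{w_{di} w_{kd}}, \omega\right). \qquad (12)$$
Third, we perform the inverse Fourier transform and obtain differential equations that are second order in time. However, we retain only the terms with first-order time derivatives, which amounts to a Markovianized approximation [51]. In order to get the same mean EPR as the mean total EPR (TM) after coarse-graining at steady state, we use an approximation following [51] for the proportionality relation between states $i$ and $j$ at steady state, $\partial_t P_i \to (\pi_i/\pi_j)\, \partial_t P_j$.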
After the decimation of state $d$ (i.e. redistributing $P_d(S, t)$ among the other states), the remaining states have renormalized probabilities that sum up to 1. If state $d$ is decimated, then the probability of being in the decimated state $d$ in the coarse-grained system equals zero, $P^{CG}_d = 0$. However, the occupation probabilities of the remaining states $i \neq d$ are affected by the transition probability $\varpi_{id} = w_{id}/\sum_k w_{kd} = w_{id}/\lambda_d$ from the decimated state $d$ to state $i$, in order to preserve the population density. Thus, in the fourth and last step, the probabilities are renormalized according to
$$P^{CG}_i = P_i + \varpi_{id} P_d, \qquad (13)$$
where $P^{CG}_i$ is the probability of the coarse-grained state $i$, $P_i$ is the probability before the coarse-graining, and $\varpi_{id}$ is the transition probability from state $d$ to state $i$. The transition probability is given by the ratio of the transition rate from state $d$ to state $i$ ($w_{id}$) to the total exit rate from state $d$ ($\lambda_d$), i.e. $\varpi_{id} = w_{id}/\lambda_d$.
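The redistribution step can be illustrated with a short sketch: the population of the decimated state is split among the surviving states in proportion to the branching probabilities $\varpi_{id} = w_{id}/\lambda_d$, so the total probability stays normalized. The rate matrix, the probability vector, and the function name `decimate_probabilities` below are hypothetical choices for illustration.

```python
import numpy as np

def decimate_probabilities(P, W, d):
    """Redistribute P[d] over the other states using varpi_id = w_id / lambda_d.

    W[i, j] is the rate for the jump j -> i, and W[d, d] = -lambda_d.
    """
    lam_d = -W[d, d]
    varpi = W[:, d] / lam_d   # branching probabilities out of state d
    varpi[d] = 0.0            # no weight is returned to the decimated state
    P_cg = P + varpi * P[d]
    P_cg[d] = 0.0             # the decimated state is emptied
    return P_cg

# Hypothetical 4-state generator (columns sum to zero) and probability vector.
K = np.array([
    [0.0, 2.0, 2.0, 4.0],
    [1.0, 0.0, 1.5, 0.7],
    [1.0, 3.0, 0.0, 0.5],
    [1.0, 1.0, 2.0, 0.0],
])
W = K - np.diag(K.sum(axis=0))
P = np.array([0.1, 0.2, 0.3, 0.4])

P_cg = decimate_probabilities(P, W, d=3)
print(P_cg, P_cg.sum())
```

Because the branching probabilities out of state $d$ sum to one, the coarse-grained vector sums to the same total as the original one.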
Using $P^{CG}_i \sim P_i\left(1 + \pi_d w_{id}/\lambda_d \pi_i\right)$ as an approximation [51], and considering this relation to hold throughout the evolution, we find the relation between the probabilities of a state before and after the decimation. From $P^{CG}_i \sim P_i\left(1 + \pi_d w_{id}/\lambda_d \pi_i\right)$ we get $P_i \sim P^{CG}_i/\gamma_i$, where $\gamma_i = 1 + \frac{w_{id}\pi_d}{\lambda_d \pi_i}$ and $\lambda_d = \sum_{j \neq d} w_{jd}$. The resulting equation for the probability distribution function of the EP of the coarse-grained states, equation (14), involves the coefficients $N_i = \lambda_i + \lambda_d - \sum_{j \neq i,d} w_{ij} a_{ji}$ and $a_{ij} = \pi_i/\pi_j$. We then write the equations for the transformed variable associated with the coarse-grained states, $G^{CG}_i(\Lambda, t) = \sum_S e^{\Lambda S} P^{CG}_i(S, t)$, which constitute equation (15). We want to emphasize that $G^{CG}_i$ is also a function of the transition rates $w_{ji}$, but for the sake of simplicity, we show the explicit dependencies only on $\Lambda$ and $t$. In matrix notation, equation (15) can be written as $d_t G^{CG} = \Gamma_{\mathrm{SCGF}} G^{CG}$, with the matrix elements of $\Gamma_{\mathrm{SCGF}}$ given in equation (17). We calculate the mean EPR from the scaled cumulant generating function, $\Omega_{\mathrm{SCGF}}(\Lambda)$, which is the dominant eigenvalue of the $3 \times 3$ tilted transition matrix $\Gamma_{\mathrm{SCGF}}$. The mean EPR ($\sigma_{\mathrm{SCGF}}$) is calculated in the following way:
$$\sigma_{\mathrm{SCGF}} = \left.\frac{\partial \Omega_{\mathrm{SCGF}}(\Lambda)}{\partial \Lambda}\right|_{\Lambda = 0}. \qquad (18)$$
We refer to this mean EPR as SCGF (figure 2), as this quantity is calculated from the scaled cumulant generating function of the EP.
As mentioned at the end of section 3, we consider the coarse-grained system dynamics to follow an AM equation with effective transition rates, derived from equation (14). The approximated equation governing the probability density of the coarse-grained system to be in the coarse-grained state $i$ (with $i \neq d$) at time $t$ is equation (19).
In terms of the probability distribution of the EP, $P^{CG,\mathrm{approx}}_i(S, t)$, we obtain an equation similar to equation (14), which we rewrite in terms of the generating function as equation (20). In matrix notation, it reads $d_t G^{\mathrm{approx}} = \Gamma_{\mathrm{AM}} G^{\mathrm{approx}}$, where AM stands for the EP obtained from the AM equation. As previously, a tilted matrix, $\Gamma_{\mathrm{AM}}$ (equation (21)), is used to calculate the mean EPR. We obtain the mean EPR from the dominant eigenvalue of this matrix, $\Omega_{\mathrm{AM}}(\Lambda)$:
$$\sigma_{\mathrm{AM}} = \left.\frac{\partial \Omega_{\mathrm{AM}}(\Lambda)}{\partial \Lambda}\right|_{\Lambda = 0}. \qquad (22)$$
The mean EPR calculated using this value is shown as AM in figure 2.

Coarse graining via lumping
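A different strategy of coarse-graining is lumping several microstates into a single mesostate H. Our goal is to find a rate matrix by the lumping method [49] where the coarse-grained system dynamics follow an AM equation. The probabilities of the merged states, $i \in H$, are summed up, while the steady-state probabilities of the rest of the states remain unchanged,
$$\pi^{CG}_H = \sum_{i \in H} \pi_i, \qquad \pi^{CG}_k = \pi_k \ \text{ for } k \notin H. \qquad (23)$$
In this scenario, the coarse-grained transition rates are modified to conserve the steady-state probabilities and the transition flux among all the states except for the flux between the merged states, so the coarse-grained system is expected to have similar steady-state properties as the original system [49]. As the affinity is not preserved in this method, the EPR of the coarse-grained system is changed. The flux constraints fix the modified rates for the AM equation as follows [49]:
$$w^{CG}_{kH} = \frac{\sum_{i \in H} w_{ki}\pi_i}{\sum_{i \in H} \pi_i}, \qquad w^{CG}_{Hk} = \sum_{i \in H} w_{ik}, \qquad w^{CG}_{kl} = w_{kl} \ \text{ for } k, l \notin H. \qquad (24)$$
These transition rates are proven to minimize the KLD of the trajectories and can be used for lumping any two states even without a timescale separation [49]. The diagonal terms of the coarse-grained transition matrix are $w^{CG}_{ll} = -\sum_{k \neq l} w^{CG}_{kl}$. We calculate the mean EPR from the tilted transition matrix built from these rates (equation (25)), with off-diagonal elements $\left(w^{CG}_{kl}\right)^{1+\Lambda}\left(w^{CG}_{lk}\right)^{-\Lambda}$. The mean EPR is obtained from the dominant eigenvalue of the matrix, $\Omega_{\mathrm{L}}(\Lambda)$,
$$\sigma_{\mathrm{L}} = \left.\frac{\partial \Omega_{\mathrm{L}}(\Lambda)}{\partial \Lambda}\right|_{\Lambda = 0}. \qquad (26)$$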
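A flux-preserving lumping of this kind can be sketched as follows. The rates used here are one choice that satisfies the stated constraints (summed steady-state probabilities and unchanged fluxes into and out of the mesostate); they are an assumption of this sketch and should be checked against the expressions of [49]. The network, its random rates, and the helper names are hypothetical.

```python
import numpy as np

def steady_state(W):
    """Solve W pi = 0 together with the normalization sum(pi) = 1."""
    n = W.shape[0]
    A = np.vstack([W, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def epr(W, pi):
    """Steady-state EPR as a sum of single-link terms over bidirectional links."""
    n = W.shape[0]
    sigma = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if W[i, j] > 0 and W[j, i] > 0:
                f, b = W[i, j] * pi[j], W[j, i] * pi[i]
                sigma += (f - b) * np.log(f / b)
    return sigma

def lump(W, pi, H):
    """Merge the states in H into one mesostate (placed last), preserving fluxes."""
    n = W.shape[0]
    keep = [k for k in range(n) if k not in H]
    m = len(keep) + 1
    pi_H = pi[H].sum()
    K = np.zeros((m, m))
    for a, k in enumerate(keep):
        for b, l in enumerate(keep):
            if a != b:
                K[a, b] = W[k, l]                               # unchanged rates
        K[a, m - 1] = sum(W[k, i] * pi[i] for i in H) / pi_H    # H -> k
        K[m - 1, a] = sum(W[i, k] for i in H)                   # k -> H
    W_cg = K - np.diag(K.sum(axis=0))
    pi_cg = np.append(pi[keep], pi_H)
    return W_cg, pi_cg

rng = np.random.default_rng(1)
K0 = rng.uniform(0.5, 3.0, size=(4, 4))  # hypothetical rates, K0[i, j] = w_ij
np.fill_diagonal(K0, 0.0)
W = K0 - np.diag(K0.sum(axis=0))
pi = steady_state(W)

W_cg, pi_cg = lump(W, pi, H=[2, 3])      # merge states 3 and 4 of the text
sigma_full = epr(W, pi)
sigma_lumped = epr(W_cg, pi_cg)
print(sigma_full, sigma_lumped)
```

In this sketch the lumped EPR cannot exceed the full EPR, since merging parallel fluxes and dropping the internal link can only reduce the sum of the single-link terms.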
The mean EPR from this matrix is referred to in figure 2 as L (lumping with preserving steady-state properties). Hummer and Szabo [50] developed another procedure for defining a transition rate matrix for a coarse-grained system. Their method ensures that the time-dependent occupancy number correlation functions in the coarse-grained system are equal to the ones of the original system. The same reduced matrix (rate matrix for the coarse-grained system) can be obtained from the projection operator technique. In the approximated Markovian limit, the reduced transition rate matrix was calculated analytically [50]. This coarse-graining procedure of merging or lumping microstates into a single merged state was suggested to be used for optimal aggregation and applied to Markov chains, random walk models, models with discrete states and discrete time, and with continuous states and discrete time. This coarse-graining method works well at intermediate timescales. The correlation function between coarse-grained states $I$ and $J$ at time $t$ is defined by $\langle\theta_I(t)\theta_J(0)\rangle$, where $\theta_I(t)$ equals 1 when the system is at state $I$ and 0 otherwise. The correlation functions of the coarse-grained system and the full system are related by $\langle\theta_I(t)\theta_J(0)\rangle = \sum_{i \in I}\sum_{j \in J}\langle\theta_i(t)\theta_j(0)\rangle$. In the Markovian limit, the reduced matrix $R$ is given by equation (27) [50], where the initial rate matrix $K$ is of order $n$-by-$n$ ($n$ being the number of microstates), and the reduced matrix $R$ is of order $N$-by-$N$ ($N$ being the number of coarse-grained states after merging). The normalized equilibrium probability density for the merged state $I$ is given by $P_{eq}(I) = \sum_{i \in I} p_{eq}(i)$, and the matrix obeys $R P_{eq} = 0$, where $P_{eq}$ is a column vector with entries $P_{eq}(I)$.
The elements with the suffix $N$ ($n$) refer to the merged states (constituent microstates). $D_N$ and $D_n$ are the diagonal matrices with the equilibrium distributions as diagonal elements. $1_N$ is the unit matrix of dimension $N$-by-$N$. $A$ is the adjacency matrix of dimension $n$-by-$N$, defined as $A_{iI} = 1$ for $i \in I$ and zero otherwise. The adjacency matrix $A$ was introduced to map the coarse-grained system onto the full system via matrix notation. Note that the expression in equation (27) for the coarse-grained system is a reduced form of the full system transition rate matrix $K$. The reduced matrix can have negative off-diagonal elements [50], and therefore, it is not a 'true' transition rate matrix. We calculate the mean EPR from the tilted transition matrix constructed from $R$ in the same way as in equation (7), and call it HS.
Here, $[R]_{ji}$ is the $ji$th element of the transition matrix $R$. HS is calculated from the dominant eigenvalue of the tilted matrix, $\Omega_{\mathrm{HS}}(\Lambda)$, as $\sigma_{\mathrm{HS}} = \partial_\Lambda \Omega_{\mathrm{HS}}(\Lambda)|_{\Lambda = 0}$. We compare the mean EPR values obtained from the lumping and decimation coarse-graining methods with the informed partial EP for the networks shown in figure 1. The results are presented and discussed in the next section.

Results and discussion
In this section, we compare the EPR obtained from different methods, namely SCGF, AM (AM equation), L (coarse-graining method where the steady-state probabilities and the transition flux among all states except the merged states are preserved), IPEP (considering parameter-dependent transition rates between the observed states), and HS (considering the time-dependent occupation number correlation function to be preserved), with TM (true mean value of the EPR considering all the edges). The closer the mean EPR value predicted by a coarse-graining method is to the TM, the better the prediction. In all the methods mentioned above, we consider the coarse-grained system dynamics to follow AM equations. Therefore, the error of the approximation is expected to be smaller if there is a timescale separation (except for the lumping method L). Except for SCGF, all other state-based coarse-graining methods (AM, L, HS) are calculated from tilted matrices based on the corresponding modified rate matrices, given by equations (21), (25), and (27), respectively. We calculate the mean EPRs from different state-based coarse-graining methods like lumping and decimation and compare them with the mean EPR obtained from partial information of only the observed link, i.e. IPEP (equation (2)). For partially observed systems, we always get lower bounds on the total EPR. Therefore, the higher the bound, the better it is, as it provides a tighter estimation of the total EPR.
We plot TM, SCGF, AM, L, and IPEP as a function of the dimensionless driving force ($F = \beta\tilde{F}L$) for different network topologies, as shown in figure 1. All the parameter values are given in the caption of figure 1, and the subplots of figure 2 correspond to the same network topologies as in figure 1.
In figure 2, the curves TM, SCGF, AM, L, and IPEP correspond to $\sigma_{\mathrm{TM}}$, $\sigma_{\mathrm{SCGF}}$, $\sigma_{\mathrm{AM}}$, $\sigma_{\mathrm{L}}$, and $\sigma_{\mathrm{IPEP}}$, as calculated using equations (8), (18), (22), (26), and (2), respectively. TM provides the mean EPR for fully observed systems, so the closer the EPR bounds are to the TM, the better the estimators. The mean EPR obtained from the scaled cumulant generating function of the EP ($\sigma_{\mathrm{SCGF}}$ or SCGF) by decimating one of the states (applying all the constraints mentioned before equation (13)) always reproduces the total mean EPR ($\sigma_{\mathrm{TM}}$ or TM), as expected. In other words, the decimation procedure is constructed so that it reproduces the mean EPR at steady state. As the EPRs are calculated at the steady state, the EPR from the scaled cumulant generating function exactly reproduces the TM. In the following, we compare the AM, L, HS, and IPEP with the total mean EPR values (TM) for all the network topologies.
The EPR calculated from the AM Equation (σ AM or AM) provides a lower bound on the total mean EPR (σ TM or TM), since after decimation, the coarse-grained system is not Markovian, and therefore, the AM cannot capture the total EP. When the hidden substates are disconnected (network topology III and VII in figure 1), we do not lose any cycles carrying transition currents in the coarse-graining, so the AM provides values closer to the total mean EPR (σ TM ). We note that for topology VII (figure 1), the observed information is sufficient for precisely inferring the total EP, since the observed cycle is the only entropy-producing fundamental cycle in the network, and the AM equals the total EPR (TM). In that case, the IPEP also equals the total EPR, as expected [18].
We further compare the mean EPR from a state-based coarse-graining (L), i.e. lumping by conserving the steady-state probabilities and the transition fluxes between all states except the merged states, with the IPEP. Since it is difficult to compare based on the subplots in figure 2, we plot $|\sigma_{\mathrm{IPEP}} - \sigma_{\mathrm{L}}|/\sigma_{\mathrm{IPEP}}$ in figure 3, as a function of the driving force $F$ in semi-log plots. From figure 3(a), we find that the difference between L and IPEP is larger for network topology III compared to I, near their corresponding stall forces (labeled with dash-dotted lines). For network topology V (figure 3(b)), with connected hidden states, the difference between $\sigma_{\mathrm{L}}$ and $\sigma_{\mathrm{IPEP}}$ is less than 0.1% for the range of $F$ values tested around the stall force, whereas for network topology VII, with disconnected hidden states, the difference $|\sigma_{\mathrm{IPEP}} - \sigma_{\mathrm{L}}|/\sigma_{\mathrm{IPEP}}$ is approximately two orders of magnitude larger compared to topology V, near the corresponding stall force. IPEP accounts for the hidden states transition information via a single rate, which is missing if the hidden states are disconnected, and therefore, the difference between the L and IPEP increases.
IPEP is calculated from the observed cycle affinity, whereas L is based on preserving the steady state properties of the system, e.g. the probabilities and the transition fluxes. So, it is difficult to compare the trade-offs between the input information provided and its output (mean EPR estimation). We find that for connected hidden substates (network topology I and V), the difference between L and IPEP is smaller compared to disconnected hidden substates (network topology III and VII) at their respective stall forces.
In figure 4, we compare the EPR obtained from the two lumping methods described in section 5, namely, L, where the steady-state probabilities and the transition flux between the observed states are preserved, and HS, where the autocorrelation function of the population for the coarse-grained states is the same as in the full system. Both methods consider the coarse-grained system to follow approximated Markovian dynamics, and therefore, the error in the approximation varies depending on the timescale separation. The coarse-grained transition rates in equation (24) are based on minimizing the KLD between the trajectories, whereas the reduced matrix in equation (27) is constructed to maximize the relaxation time of the reduced matrix so that the coarse-grained system would have the same characteristic time as the full system. For both network topologies I (figure 4(a)) and V (figure 4(b)), L exceeds HS for certain values of the driving parameter $F$, whereas for other values, HS exceeds L. At the stalling force, HS results in larger EPR values compared to L. The off-diagonal elements of the reduced matrix of topology I (equation (27)) become negative above a certain driving force ($F > -0.25$) [50], and therefore we are unable to calculate HS beyond this value (figure 4(a)). As the accuracy of the HS coarse-graining approximation depends on the timescale separation between the states undergoing merging and between the merged states, the EPR prediction using this coarse-graining method also depends on the intrinsic timescale difference. However, the method predicting L can be used without a timescale separation. Therefore, the performance of the estimators for the two coarse-graining approaches depends on the system at hand. The optimal Markov matrix, $R$ (equation (27)), was calculated based on the assumption that there are no disconnected microstates, according to HS [50]. Therefore, we only compare the results of the two lumping methods for network topology I (figure 4(a)) and topology V (figure 4(b)) of figure 1.

Conclusion
This paper studies different notions of partial information in driven systems and the corresponding mean EPR values associated with them. The predicted EPR values are then compared with the true mean EPR (TM), which is calculated considering all the edges of the network topology, where tighter lower bounds on the TM EPR are considered better. The SCGF, which is calculated for coarse-graining by decimation and redistribution of the steady-state probabilities and transition rates, reproduces the total mean EPR (TM), as expected. For the other coarse-graining approaches, systems are considered to follow approximated Markovian dynamics, and the resulting modified/coarse-grained transition rate matrices are used to calculate the mean EPRs for different methods (AM, L, HS). For certain topologies, we find that as we decrease the hidden state connectivity, the AM approaches the total mean EPR (topologies I, II, and III), since the contribution to the EPR from the lost cycles carrying transition flux decreases. By comparing the mean EPR values calculated from the state-based coarse-graining methods, namely, lumping (L, HS) and coarse-graining by decimation (SCGF, AM), where one is assumed to know the network topology, with the mean EPR inferred from partial information of only the observed states and the transitions between them (IPEP), we find that the effect of the partial information reflected on the inferred EPR for state-based coarse-graining depends on the network topology. Further, IPEP provides similar entropic information as one would obtain by L, a state-based coarse-graining (lumping with preserving the steady-state properties), where the difference between them is larger for disconnected hidden substates (topologies III and VII) than for networks with connected hidden substates (topologies I and V) at their corresponding stall force. Hidden state information inferred by the IPEP is lost for disconnected hidden substates, and due to this missing information, the difference between L and IPEP increases with decreasing connectivity between the hidden substates. The difference between the HS and L varies as a function of the driving force, and the timescale difference plays a role in it. HS is sensitive to the timescale separation between the microstates being merged and between the merged states, whereas L does not depend on the timescale separation. Our study can be extended to larger systems with more than two hidden microstates, or to systems with a hidden cycle current.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).