Catalysis in action via elementary thermal operations

We investigate catalysis in the framework of elementary thermal operations (ETOs), leveraging the distinct features of such operations to illuminate catalytic dynamics. As groundwork, we establish new technical tools that enhance the computability of state transition rules for ETOs. Specifically, we provide a complete characterisation of state transitions for a qutrit system and special classes of initial states of arbitrary dimension. By employing these tools in conjunction with numerical methods, we find that by adopting a small catalyst, including just a qubit catalyst, one can significantly enlarge the set of state transitions for a qutrit system. This advancement notably narrows the gap of reachable states between ETOs and generic thermal operations. Furthermore, we decompose catalytic transitions into time-resolved evolution, which critically enables the tracking of nonequilibrium free energy exchanges between the system and bath. Our results provide evidence for the existence of simple and practicable catalytic advantage in thermodynamics while offering insight into analysing the mechanism of catalytic processes.


Introduction
Catalysts are auxiliary states that enter a process in a way such that they are recovered at the end. They are particularly useful when one is restricted to a limited set of operations (motivated by fundamental principles or practical considerations). The general understanding is that the participation of such states helps in mitigating dynamical constraints, by expanding the working Hilbert space, without consuming resources in the catalyst. Since catalysts ideally suffer no deterioration, they can be used repeatedly to activate state transitions.
Since the discovery of quantum catalysis, first reported in entanglement theory [1][2][3][4][5][6], it has been extended to various quantum resource theories [7,8], such as coherence [9][10][11][12], thermodynamics [13][14][15][16][17][18][19], randomness [20][21][22][23][24], and quantum teleportation [25]. Previous studies on catalysis focus predominantly on the initial and final states of state transformation, either via the mathematical structure of trumping [3,5], majorization [4], or resource monotones [14,[26][27][28]. These approaches are powerful because they guarantee the existence of a catalytic operation given initial and target states. Nevertheless, most of them do not provide insight into the required catalyst state, or the process necessary for achieving the transformation. This knowledge gap poses an obstacle towards practical demonstrations of catalytic processes. A handful of recent advancements have been made to address these concerns. For example, a significant finding is that given any catalytically possible state transition, nearly any quantum state can serve as a catalyst, as long as many copies are used [29]. This insight comes from the reversible convertibility in the i.i.d. limit, combined with the fact that catalyst states are not altered by the activation process. While this result offers incredible insight towards the catalytic power of i.i.d. states, the construction in [29] always prescribes the usage of a very high-dimensional catalyst, which is excessive for processes that might require only a small catalyst and simple operations ‡. These factors reduce the implementability of catalytic processes and highlight the need for a complementary study that emphasizes the extent to which small catalysts and straightforward operations can be effective.
Another challenge towards practical demonstrations of catalytic thermal processes comes from the genericity of thermal operations (TO) [31] in the resourcetheoretic setting [14,[32][33][34][35]. TOs potentially involve intricate control over the global ‡ See [30] for an alternative direction discussing catalysts that activate the transitions that are not possible even between multiple copies of initial and target states. system and bath, in particular its joint energy eigenstates. Interestingly, crude operations [36] acting on a system plus a qubit Gibbs ancilla were proven to be sufficient to generate all TO state transitions to incoherent final states. Achieving target states with higher resourcefulness, however, can still be demanding, e.g. requiring the extreme lowering/raising of system and bath composite energy level, or arbitrary energy preserving unitaries in the system subspace. Therefore, the full employment of resource theoretic models in real world remains challenging [37].
Elementary thermal operations (ETO) [38] alleviate the above issues by considering a subset of TO that can be decomposed into series of two-level swaps. This decomposition offers a natural way to prescribe a process to the experimenter, analogous to how a complex n-qubit computation is decomposed into a small gate set §. Moreover, physical models, such as the collision model [39] or the Jaynes-Cummings (JC) model [40] can emulate the two-level swap. The intensity-dependent Jaynes-Cummings model [41] can furthermore generate all energy-preserving twolevel swaps, making the setup more realistic. Another advantage of ETO is that it opens up the opportunity to analyse intermediate states, which are found by partially applying the swap sequence -providing a time-resolved description of the system dynamics, rarely possible in resource-theoretic thermodynamics. Nonetheless, ETO has its limitations: the only existing method to decide the feasibility of state transitions is to iteratively find all extreme points of the set of reachable states. The number of iterations grows extremely quickly with the system dimension.
Our aim is to leverage the operational simplicity of ETO to find examples of easy-to-realise catalytic evolutions. This goal is nontrivial since the conditions for ETO transitions, even without catalysts, are not well-characterised. Similar to the case of the resource theory of magic [42], our result offers a straightforward recipe for implementing catalytic protocols. This strengthens the connection between catalytic studies in quantum thermodynamics and physically motivated, relatively practical thermodynamic operations.
In Sec. 2, we provide an overview of TO and ETO to set the foundation for our study. We expound on the concept of tight-majorization and neighbouring βswaps, elucidating them with Figs. 2 and 3. Additionally, we develop a fundamental technical tool, i.e. Lemma 3 which plays a pivotal role in the development of our primary findings. In Sec. 3, we present our initial examples of catalytic advantages in ETO and display the complete set of reachable states facilitated by any qubit catalyst. The catalytic regime that we study possesses several desirable properties: it can be entirely achieved through two-level swaps, both system and catalyst are small in dimension, and the catalyst is fully recovered without any errors or remnant correlations with the system. Sec. 4 extends our investigation to a higher-dimensional regime, pushing the total system dimension from six, as explored in Sec. 3, up to ninety. In general, the case of d = 9 already exceeds our computational capability. Nevertheless, by imposing restrictions on our initial states, we develop a significantly more efficient algorithm to determine the feasibility of (catalytic) ETO transitions. We focus our attention on the problem of cooling one thermal state to another under catalytic ETO and compare its performance to non-catalytic ETO and TO. Our results underscore that even relatively small catalysts suffice to overcome the TO limit.
In Sec. 5, we present the analytical results that form the basis for our numerical methods in Secs. 3 and 4. This includes, for example, a full characterisation of the set of states reachable by ETO for systems of dimension d = 3 (Thm. 5), as well as a universal upper bound, tighter than previous bounds (Thm. 9). Notably, we identify a class of initial states where characterising the set of ETO-reachable states is computationally as efficient as checking thermo-majorization relations (Thm. 8). This result essentially resolves the characterisation challenge of ETO for specific sets of initial states, opening the door for systematic investigations into operationally important tasks, such as the cooling scenario in Sec. 4.

Background
In this section we define the terminology and notation used throughout the paper, while also providing an overview of the state-of-art knowledge and techniques which we applied to develop our results. Our approach in this work is grounded in the resource-theoretic framework of quantum thermodynamics, which explores the fundamental constraints on thermodynamic state transitions concerning a set of free operations and free states. This approach has been fruitful [43][44][45][46][47][48][49][50][51][52][53][54][55] in identifying the additional restrictions that arise due to non-negligible fluctuations and finite size effects for quantum systems that interact with their respective environments.

Thermal operations
The most well-known and studied version of the thermodynamic resource theory is the paradigm of thermal operations (TO). These are quantum channels that can be written in the form The complete state transition conditions of determining whether an initial ρ S can be transformed into some ρ ′ S via E have remained a long-standing open problem, in particular when both initial and final states are energy-coherent. However, the situation simplifies in the quasi-classical case, where either ρ S or ρ ′ S is energy incoherent. Given a system S, let us describe any given initial state ρ S = ij a ij |E i ⟩⟨E j |, with respect to the acsendingly-ordered energy eigenbasis, namely E i ≤ E j for all i ≤ j. In the quasi-classical regime, the state is fully characterised by the population vector for each energy eigenbasis, which we denote by p = {a ii } i . We therefore in the rest of the manuscript refer to the population vectors as states living in some probability space V d . The set of reachable final states from p is then denoted as S TO (p). If q ∈ S TO (p), then ρ S S is achievable from ρ S via thermal operations. For quasi-classical states, the possibility of state transition is fully characterised by thermomajorisation relations [33]. It is furthermore known that the action of TO on such quasi-classical states can equivalently be described by Gibbs preserving matrices G acting on the population vectors [31,33]. That is, there exists G that preserves the Gibbs state population vector τ β = {e −βE i /Z S } i , i.e. Gτ β = τ β , and q = Gp. Given two states ρ, ρ ′ represented by their population vectors p, q, and the system Hamiltonian (which in turn determines τ β ), thermomajorisation realtion, denoted by ≻ β , is a pre-order between the states, defined according to their respective thermomajorisation curves L p , L q (Def. 2.2). We say that p ≻ β q if L p (a) ≥ L q (a) for all a ∈ [0, 1]. Definition 2.1 (β-order). Given p ∈ V d and a Gibbs state τ β ∈ V d , let us denote the element-wise ratio of the two vectors as Then the β-order π(p, τ β ) is a particular ordering of the energy eigenbasis labels (1, · · · , d), such that the ratios according to this ordering is non-increasing, i.e.
with π k = π(p) k , where we omit τ β in the argument when it is obvious from context.
1} is a piecewise-linear function that interpolates between the coordinates {(0, 0)} and elbow points {( l k=1 τ β π k , l k=1 p π k )} d l=1 . Earlier works have shown that the set S TO (p) characterised by thermomajorisation can be constructed quite efficiently. This is summarised in the theorem below.
Theorem 1 ([38], Lemma 12 and [56], Thm. 2). For any given p, the set of reachable states S TO (p) is a convex combination of d! unique extreme points that correspond to distinct β-orders and are tightly thermomajorised by p.
With Thm. 1, determining S TO (p) is computationally inexpensive. This theorem follows from the fact that if a state is tightly-majorised by p, then it thermomajorises any other state in S ETO (p) that has the same β-order π (m) . The importance of tightly-majorised states can also be seen from another perspective: Lemma 2 (Thm. 12 of [36]). If two states p and q have the same β-order π(p) = π(q) and p ≻ β q, then q can be obtained from p by a sequence of partial level thermalizations.
Partial level thermalizations [36], where populations of a subset of levels are partially mixed with corresponding thermal populations, can always be decomposed into series of two-level partial swaps and thus are (E)TO. Figure 1. Thermomajorization curves of p, r, q, and τ β , where p ≻ β r ≻ β q ≻ β τ β . In particular, r is tightly thermomajorised by p, implying that r thermomajorises any q ∈ S ETO (p) with the same order. Here we plotted one such state q = (r + τ β )/2. Temperature and energy levels are set to be E 1 = 0, βE 2 = 0.2.
Another important aspect of state transition conditions comes from functions that behave monotonically whenever thermal operations are applied. The most commonly known set of such monotones is dubbed generalised free energies [14] F α (ρ, τ β ) : When α → 1, F α reduces to the nonequilibrium free energy where ⟨·⟩ ρ denotes average with respect to the state ρ, and S(ρ) is the von Neumann entropy -a recovery of the second law of thermodynamics. Monotones only provide necessary conditions for state transitions via TO and its subsets. Yet, the set of monotones in Eq. (6) becomes sufficient in the quasiclassical regime [14], when catalysts are involved. In other words, it is straightforward to determine the existence of a catalytic thermal operation (CTO) that achieves ρ CTO − −− → ρ ′ -one simply needs to check that Eq. (6) is monotonic for all α with respect to such a state transition.

Elementary thermal operations
Special subsets of TO, which take into account more realistic limitations on feasibility, have been developed in the past few years. Two prominent examples are Markovian thermal processes (MTP) [57], which we will not explicitly describe in this paper, and elementary thermal operations (ETO) [38], which is the focus of this paper. See Table 1 for a hierarchy between these operations.
ETO is a subset of TO with an additional restriction on the unitary transformation U SB in Eq. (1). Unlike TO, where the simultaneous manipulation of all energy levels of the system is allowed, ETO concatenates a series of operations where each individual step involves only two energy levels. Formally, where j and k are the system energy levels that the unitary aims to maneuver. As in TO, the state transition from p to q via a single ETO step acting on two levels can be written as q = M and M (j,k) λ This channel is also constructed to preserve the Gibbs state τ β . Extremal cases of λ = 1 are named β-swaps (throughout the manuscript we also refer to them simply as swaps), By definition, it is clear that both concatenations and convex combinations of TOs are themselves also TO. This is obviously not true for ETO, whenever they act non-trivially over different energy levels. Therefore, most of the interesting transitions would arise only when we include all sequences of ETOs, and also + Whenever the order of E j and E k is known, we denote ETO such as M (j,k) or β (j,k) , using the convention that E j ≤ E k . arbitrary convex combinations as allowed operations. Such processes are always TO, hence, ρ but the converse is not true [38,59]. Since no efficient way to determine the possibility of p to q via ETO is known, one must construct S ETO (p) starting from the initial state p, and then verify whether q ∈ S ETO (p). However, existing theoretical constructions of S ETO (p) (by its extreme points) grow rapidly in the system dimension, and become computationally infeasible for the simplest examples of catalysis. We partially address this issue with our results in Sec. 5.

Neighbouring swaps
To find the extreme points of S ETO (p), neighbouring swaps are employed -these are β-swaps of the form β (π j ,π j+1 ) and applied to a state p with π(p) k = π k , i.e. they swap two consecutive levels in the β-order of the input state. Neighbouring swaps are important because they typically minimise dissipation/change in athermality (as captured by thermomajorisation, see Lemma 3).
Lemma 3. For any state p, the state obtained from a neighbouring β-swap q (j) = β (π j ,π j+1 ) p is tightly thermomajorised by p. Furthermore, states obtained by a single but non-neighbouring β-swap are never tightly-majorised by p unless two swapped levels have the same energy.  Fig. 2 (b), (d) and (f). From Thm. 1, it follows that the neighbouring swapped states q (j) are identified as extreme points of S TO (p), and therefore thermomajorise all states r ∈ S TO (p) such that π(q) = π(r). This leads to the following corollary: Corollary 4. If a state q ∈ S ETO (p) is tightly thermomajorised by p 0 , then it is a unique extreme point of S ETO (p) among states of the same β-order.
Therefore, all states q (j) obtained from a single neighbouring swap are also unique extreme points of S ETO (p) having order π(q (j) ). Numerical examples such as Fig. 2 (c) show that when multiple neighbouring swaps are applied, the final state is no longer necessarily extremal. Nevertheless, most of the extreme points of S ETO (p) Figure 2. Thermo-majorization curves of a state undergoing β-swaps. The initial state population is set to p = (0.75, 0.2, 0.05) and the Hamiltonian is set to βH S = (0, 0.2, 0.5). Panel (a) shows a three-dimensional initial state p. Panels (b) and (d) each shows the state after a single neighbouring swap, while (f) shows a single non-neighbouring β-swap. One observes that both (b) and (d) thermomajorise (f). Furthermore, panels (c) and (e) show p after two consecutive neighbouring β-swaps. Observe that even (c) and (e) thermomajorise (f), since the non-neighbouring swap produced a much larger deviation from the initial thermomajorisation curve.
are of the form q ( ⃗ j) * , or contain only a small number of non-neighbouring swaps in their construction. Fig. 3 exemplifies this preference towards neighbouring swaps -there, the free energy of the state is still higher after three neighbouring swaps, compared to a single non-neighbouring swap. In (a), when a neighbouring β-swap (blue arrow) is applied, the state moves to the adjacent cell of β-order, whereas the non-neighbouring swap (red arrow) transfers a state through multiple boundaries of different orders (black dashed lines), resulting in a state less resourceful than q (2,1,2) = β (1,2) β (2,3) β (1,3) p 1 , that experienced three neighbouring swaps. Fast exhaustion of the athermality also occurs in different initial states, rendering non-neighbouring swaps typically suboptimal.
Moreover, it is useful to note that Fig. 3 displays the real time evolution of the state during ETO transitions. By construction, only two levels of the state undergo change in time, i.e. all the other populations are fixed during that period, imposing the system to follow the straight lines in barycentric representations as in (a). In Triangles label the extreme points of S ETO (p 1 ), achieved by β-swaps indicated by blue arrows. The red X marker is the state β (2,3) p 1 . Different β-order cells are separated by black dashed lines connecting pure states and the thermal state. Panel (b): the free energy differences from the equilibrium state τ β , in two different paths corresponding to (a). Dashed lines show the continuous values of ∆F from points on arrow paths in (a). Straight lines connect the discrete values obtained from endpoints denoted with triangle and X symbols. x-axis is the total length of the path taken from the initial state as plotted in (a), e.g. x-coordinate of the second triangle is the summation of the first and the second blue arrow lengths starting from the plus symbol.
(b), dashed lines correspond to the free energies during the real time evolution when the intensity-dependent Jaynes-Cummings model is assumed (See Appendix A for the real time reduced state dynamics). Suppose we apply β (j,k) to a state p with g(p) j > g(p) k . According to Eqs. (A.9) and (A.10), the state evolves as q(λ(t)), where q(λ = 1) = β (j,k) p with g(q) k > g(q) j . Since the evolution is continuous, there exists a state q(λ * ), such that g(q) j = g(q) k . This point is given by λ * = 1/(1 + ∆ jk ) (λ * = 1/(1 + ∆ kj )) when k ≥ j (k ≤ j). The state q(λ * ) is the closest to being thermal, and achieves minimal generalised free energies -minima of dashed lines in (b) -among states q(λ(t)). In other words, during a single β-swap, free energy decreases until the minimum point q λ * is reached, and then increases until the end of the β-swap. On the other hand, if only the endpoints values are considered (solid lines of (b)), free energy cannot increase after each ETO step. These intermediate increases within a single swap evince the non-Markovian effect of thermal reservoirs at each step, differentiating ETO with strictly Markovian thermal processes [57,58]. See Sec. 6 for more discussion.

Qubit catalysts for qutrit systems
In this section we demonstrate the following: (i) Existence of the catalytic advantage in the ETO framework. In particular, we report the discovery of simple qubit catalysts that are sufficient to produce a non-trivial advantage for state transitions in a qutrit system.
(ii) A time-step resolved tracking of the dynamical path for system and catalyst, that reveals snapshots of the inner workings during the catalytic process.
These results could be obtained only by a more systematic understanding of the set of reachable states via ETO, which are detailed in Sec. 5, where we establish various simplifications that enhanced the computability of S ETO . Throughout the rest of the manuscript, we use the vectors p, q, r to denote system states and c is reserved for catalyst states.

Catalytic advantages
For a qutrit system and qubit catalyst, the composite state p ⊗ c lives in a sixdimensional probability space V 6 . Our goal is to construct a set obtainable by catalytic elementary thermal operations with a qubit catalyst (CETO 2 ) By definition, S ETO (p) ⊂ S (2) CETO (p), but qubit catalytic advantage exists iff. S can be evaluated analytically (Appendix F). Nonetheless, S CETO (p; c) is in general constructed by numerically evaluating the extreme points of S ETO (p⊗c) and imposing exact catalyst recovery conditions. The set S CETO (black dotted lines), for the initial states (a) p 1 in Fig. 3 and (b) p 2 = (0.7, 0.2, 0.1), such that π(p 2 ) = (1, 2, 3). The initial state, thermal state τ β , and β-order regimes are denoted similarly as to Fig. 3. Pink shaded areas mark catalytic advantage within S TO , while yellow shaded areas are states that can be reached via ETO+qubit catalyst, but not by TO without catalysis.
process for different values of c 1 . In this first work, we focus on the case where S remains uncorrelated to C to affirm that catalytic advantages exist even in the most conservative catalytic setting. We expect the catalytic power to further increase when system-catalyst correlation persists as has been shown in different resource theories [11,15,18,19,21,24,60], but leave this case for future work.
In Fig. 4, we present two S CETO sets corresponding to two different initial states that have distinct β-orders. The sets are displayed in comparison with the noncatalytic set of reachable states S ETO and S TO . In particular, notice that in Fig. 4 (a), where the initial state is set to have π = (2, 1, 3), we observe that S TO ⊊ S (2) CETO . This feature persists for a number of randomly chosen initial states with π = (2, 1, 3) and (3, 1, 2). When this happens, we can on one hand reproduce every TO transition using ETO with a single qubit catalyst; and on the other hand, combat some of the finite-size effects and enable a larger set of transitions than previously allowed by an arbitrary TO. Likewise, consider Fig. 4 (b), for an initial state π = (1, 2, 3).
Here, we observe that S (2) CETO overlaps almost entirely with S TO , but neither is fully contained by the other. This qualitative characteristic is again present for different initial states with the same β-ordering.
In Fig. 4, the set of states that go beyond S TO (yellow) highlights the additional advantage brought forward by CETO 2 . In addition, the set of states between S TO and S ETO (pink) also has operational merits -there exist states which require genuine multi-level TO to achieve, but can be obtained by an alternative pathway that involves only basic, JC-like interactions when a catalyst is present. A few natural questions emerge: for one, it would be interesting to see how the gap between CETO n and ETO changes with n being the dimension of the allowed catalyst. In Sec. 4, some hints for this question is given. A second question would be how would CETO n fare when compared to CTO, i.e. the catalytic version of thermal operations. The question of a generally fixed n is difficult to answer, however, in a follow-up work, we proved that the set of energy-incoherent state transitions reachable by CETO converges to that of CTO, for the special case when n is allowed to be arbitrarily large. In other words, when any catalyst is allowed, catalytic elementary thermal operations are as powerful as catalytic thermal operations for incoherent inital states [61].

Tracking catalytic processes
Here we explicitly construct a simple series of ETO swaps that leads to the transformation of an initial state p into an extreme point of S (2) CETO (p), and track changes in the system free energy throughout the process.
In general, a state q being an extreme point of S CETO (p) does not guarantee q⊗c to be an extreme point of the enlarged S ETO (p⊗c) in system-catalyst composite space. Thus, (non-trivially) different paths of catalytic processes exist. In Fig. 5, we choose the shortest swap series, which comprises seven ETO steps, to realise the CETO transition from an initial state p 1 to a new extreme state of S (2) CETO (p 1 ). Note that such a choice may not always be unique; see Appendix F for details. Since adopting a qubit catalyst doubles the total dimension, and recovering the catalyst also requires additional swaps, more two-level swaps are needed to achieve a similar final state (as compared to a non-catalytic ETO). For the first four steps, the nonequilibrium free energy of the system, F S decreases; see Fig. 5 (b). After two consecutive ETO steps on the joint system, the catalytic path seems identical to a non-catalytic ETO on the system. Nevertheless, on the global picture, correlations are already starting to build up with the catalyst, as shown in Fig. 6. In the third and fourth steps, F S is further reduced until it has almost a half of the original  nonequilibrium contribution. Recalling that free energies are the monotones of TO, this rapid depletion of the athermality would have been irreversible, if we have access only to the system. Yet, during catalysis, some of the free energy is transferred to either 1) the catalyst populations or 2) correlations between system and catalyst. This build-up is critical for the sixth and final steps of the process: stored free energy is what enables the system to again increase in local nonequilibrium free energy, therefore achieving a final state q outside S ETO (p 1 ) (and in this case, q is even outside S TO (p 1 )). See Appendix F for a more detailed analysis. Similar behaviours are observed for other monotones, such as generalised free energies with α ̸ = 1. Note that the most conservative form of catalysis is assumed: the catalyst is returned exactly and without correlations with the system. Hence, catalyst local free energy goes back to its original level, and mutual information also returns to zero, as shown in (b) and (d) of Fig. 6. Their role is restricted to temporary free energy storages. Furthermore, the total free energy always decreases after each swap, which is expected, since the process is an ETO on the joint system.

Higher-dimensional catalysts for transitions between thermal states
In general, identifying the whole set of reachable states S (d) CETO (p) with d-dimensional catalysts is highly challenging, even when p is three-dimensional and d = 3. However, by developing a theoretical tool, namely Thm. 8, we show that the numerical cost is dramatically reduced for initial states p whose β-orders are monotonic in energy levels. A particularly important class of states that satisfy this property is the set of thermal states with different temperatures. When β h < β, i.e. the state τ β h is hotter than the environment with temperature β −1 , the β-order π(τ β h ) = (d, d − 1, · · · , 1). Similarly, colder states with β c > β have the order π(τ βc ) = (1, 2, · · · , d). If we further employ a catalyst c from the set of states which are sufficiently thermal, the monotonicity of β-order would be preserved, i.e. π(p ⊗ c) is also monotonic in the total energy. In such cases, the analysis of higher-dimensional catalysts becomes computationally tractable. We will refer to such catalyst states as minimallydisturbing catalysts, in the sense that they do not disturb the β-ordering of the system-catalyst composite.
To demonstrate the benefit of this reduction, let us consider the cooling process Figure 7. The cooling performance of CETO from a thermal state to another thermal state, quantified by final inverse temperature β c attainable when the catalyst state is fixed. We set the initial system inverse temperature to be β h = 0.5, and the ambient temperature to be β = 1. The system of interest is threedimensional with energy levels (0, 0.4, 0.5) and catalysts of dimension dim(c) from two to nine, sixteen, and thirty are used. dim(c) = 1 case corresponds to noncatalytic ETO, and the blue dashed line above marks the inverse temperature β TO that non-catalytic TO can achieve. We searched over catalyst state distributions among minimally-disturbing catalysts. The results from the worst performing catalysts in each dimension are marked with purple circles, while blue diamonds are from the best catalysts in each dimension.
via (C)ETO starting from a high temperature thermal state τ β h with β h < β to a colder thermal state τ βc . For any temperature β −1 h , it is always possible to reach the ambient temperature β −1 by a full thermalization with the environment. However, with ETO and TO, colder temperatures β c > β can be achieved, which has been studied in [62] for the non-catalytic case of a qubit. One of our main theses is to showcase the effectiveness of small catalysts with practicable procedures. To corroborate this claim and investigate the scaling of catalytic advantage with respect to catalyst size, we apply Thm. 8 to find the limits of the cooling performance for a qutrit when using catalysts of varying dimensions, ranging from two to thirty. The catalyst Hamiltonian is trivial for simplicity. Fig. 7 shows the coldest achievable β c from minimally-disturbing catalysts, where the worst and the best cases are marked with purple circles and blue diamonds, respectively. Even with qubit catalysts, almost half of the gap between the TO limit (dashed line) and the ETO limit (dim(c) = 1) is covered. The maximal catalytic advantage (blue solid line) gradually increases with the catalyst size, and at dim(c) = 16, best catalysts among the sample surpass TO limit, whilst at dim(c) = 30, most of the samples perform better than TO. Note that we have limited the range of catalyst distribution to fix the initial composite state β-order; hence there might exist (not minimally-disturbing) catalyst states that activate a better cooling process than the ones marked in Fig. 7. Also, even the worst case catalysts provide some advantage from the same reason. Usually, if the catalyst is almost pure or pure, catalytic advantage vanishes. Nevertheless, our results give an efficiently computable lower bound to the achievable amount of cooling when any d-dimensional catalyst is allowed.
To estimate the true minimum and maximum cooling performance achievable by the given set of catalysts, we employed multiple strategies, including uniform random sampling and gradient-descent-like search. Interestingly, regardless of the catalyst dimension, the worst catalysts are given by maximally mixed states which are the Gibbs states for the trivial Hamiltonian. Using Gibbs states as a catalyst can provide no advantage in the framework of TO, since they form the free states. However, for a set of operations with innate Markovianity, such as ETO, Gibbs states can activate state transitions as a catalyst by providing additional non-Markovianity. This is reminiscent of the discussion in Sec. 3.2, where the catalyst functions as a temporary storage during the evolution. See [63] for a similar setup and [61] for the ultimate power of Gibbs state catalysts. The best catalyst distributions we obtained are non-trivial and would be of interest for future study.
Overall, the results in this section display that small catalysts do provide substantial advantage in the setting of ETO, where simple two-level swaps are sufficient to execute the procedure. Also, even when the catalysts are not fine-tuned, when they are in a certain regime, catalysis still enhances ETO considerably. In particular, Gibbs states are assumed to be easily preparable, and thus can be good candidate states for a more realistic catalytic protocol.

Characterizing S ETO
Currently, the only known way to determine whether p ETO −−→ q, is to construct the full set S ETO (p) by finding all the extreme points of this convex polytope, and check if q ∈ S ETO (p). Lostaglio et al.
[38] provided a systematic way of finding all extreme states of S ETO (p) for an arbitrary dimension d, which involves an exhaustive search among all possible β-swap sequences with a bounded length ℓ max . In other words, this procedure identifies an upper bound on the number of extreme points In [38], ℓ max ≤ d! is shown, which means that Eq. (15) grows super-exponentially with the dimension of the system. This presents a serious roadblock to both understanding and determining the possibility of state transitions via ETO, a reason why ETO, despite its strong physical motivation, has not been extensively studied. We improved this result in the following ways: (b) for a subset of initial states where the β-ordering is monotonic in energy, we derive a tight upper bound for the number of extreme points, and fully identify the corresponding β-swaps (Thm. 8).
(iii) We use computational algorithms to obtain heuristics by random sampling of initial states. We present a comparison of these results with theoretical analysis in Table 2.

An exact characterisation for d = 3
A qutrit system is the simplest setup where S ETO deviates from S TO . Furthermore, the no-go results established for qutrits, which rule out certain swap series from generating extreme points of S ETO , hold true for any three levels of higherdimensional scenarios. Here, we provide a full characterisation of S ETO (p) for any qutrit state p by defining two simple sets.
(i) The set containing the initial p after undergoing not more than 2 non-identical neighbouring swaps: (ii) The set of states that undergo three neighbouring swaps or a non-neighbouring swap: Theorem 5. For any p ∈ V 3 with π(p) k = π k , let Θ ETO and Ξ ETO be defined as in Eqs. (16) and (17), where we have dropped the explicit p for notational convenience. Then, Θ One of the important techniques used in proving Thm. 5 is identifying the β-swap series that produce states which are extreme points of S TO . As a result of Thm. 1 and Cor. 4, such states are unique extreme points for final states of their particular β-ordering. This identification utilises results established in the language of biplanar transportation matrices in [56]. We leave the full proof for Appendix C and focus on a few notable points made there. Firstly, observe that even non-neighbouring swaps can generate extreme points. This fact rules out the naive attempt of working only with states of the form q ( ⃗ j) , and forces us to consider the entire set of swap series, which is in contrast to the case of MTP, where the extreme state verifying algorithm only needs to search for neighbouring swaps [57]. However, a necessary condition for β (π 1 ,π 3 ) p to be extremal in S ETO (p) exists and alleviates the complications.
This lemma can be proven by straightforward calculations of swapped states, as described in Appendix B.4. Two simplifications emerge from this result. Firstly, in numerical computations, all non-neighbouring swaps that give orders different from what the lemma imposes can be removed. Secondly, when analytical studies are carried out without specifying an initial state, the ambiguity of β-orders after the swap vanishes when the resulting state is shown to be an extreme point of S ETO .

Improved characterisation for higher-dimensions
Finding the extreme points of S ETO (p) is generally a hard task. In particular, multiple extreme points that correspond to the same β-ordering often exist, and the total number of extreme points is only known to be upper bounded by Eq. (15), which yields an extremely loose number, scaling as d 2d!/(d−3) even with our improved result Thm. 9. Lemma 7 provides some intuition into the lower bound on the number of extreme points for generic initial states.
Lemma 7. For any state p ∈ V d , the reachable state set S ETO (p) has at least one extreme point r having π(r) = ψ for any ordering ψ.
The proof can be found in Appendix B.5. Note that Lemma 7 does not imply that all initial states have at least d! distinct extreme points. For instance, a Gibbs state τ β cannot be transformed into any other state via ETO and thus S ETO (τ β ) has a single (extreme) point τ β . In this case, g(τ β ) is fully degenerate and π(τ β ) can be any permutation of (1, · · · , d).
Similar to the τ β example, some initial states admit a simpler S ETO structure. One of our main analytical results shows that the situation simplifies drastically and achieves the lower bound of Lemma 7, when p is known to have a particular β-ordering, i.e. when it is monotonic in energy, π(p) = (1, · · · , d) or π(p) = (d, · · · , 1).
Theorem 8. If π(p) is monotonic in energy, extreme points of S ETO (p) are achieved if and only if the corresponding β-swap series that produce them are (i) always neighbouring, (ii) containing no repetition of each swap.
See Appendix D for the proof. Several important simplifications follow from the above lemma, whenever π(p) is monotonic in energy. From the no-repetition condition, ℓ max = d(d − 1)/2 is obtained. The equivalence of β-swaps outputting the same target state β-order also guarantees the uniqueness of extreme points of S ETO (p) at each order, setting the maximum number of extreme points to be d!. More importantly, given the target β-order, one can immediately identify a corresponding extreme point without the need of searching over all possible series, since we developed an explicit algorithm to evaluate this extreme point, which we call the standard formation (see Def. Appendix D.1 for details).
Notably, all thermal states τ β ′ , with temperature 1/β ′ that might be different from the ambient temperature 1/β, satisfy the monotonicity of the β-ordering. The analysis in Secton 4 deals with at most ninety dimensional system-catalyst composites. Without Thm. 8, evaluating the extreme points for such high dimensional systems is practically impossible. However, leveraging the nice ordering of the initial state, we were able to simulate over two million ninety dimensional systems in just a few hours.
For the case of generic initial states, we derive an improved upper bound on the β-swap series length that generates the ETO cone.
Theorem 9 (Improvement of Thm. 6 of [38]). S ETO (p) is the convex hull of all final states generated by for d ≥ 4 and all possible combinations of {(j n , k n )}.
The proof of Thm. 9 can be found in Appendix E. The main technique is again to identify a set of states that thermomajorise all the other states in S ETO that share the same β-order, either via applying the results of [56], or showing that they are tightly-majorised by the previous state.
However, even for generic initial states, most elements in Eq. (20) are not extremal in S ETO (p). As discussed in Sec. 2.3, non-neighbouring swaps often deplete athermality too rapidly and the resulting state ends up being close to equilibrium and far from extremality.

Gap between current theoretical characterisation and heuristical analysis
When naively constructing all β-swaps as given in Thm. 9, the number of the series that need to be checked scales as (d/ β ′ are also non-extreme. Thus by starting from length-1 series and checking extremality after each time the series is lengthened, we can greatly reduce the computational cost. There are two ways to verify the extremality of final states. For d = 3, 4 and qutrit-qubit cases, we explicitly used the fact that the resulting set is convex: (iii) Iterate steps 1 and 2 by applying single swaps to S i , save newly obtained extreme states as S i+1 , and update S j≤i+1 by eliminating non-extreme points. Update S = j S j .
The strength of this algorithm is its straightforwardness. It does not include any nonextreme point nor does it miss an extreme point of S ETO (p). However, especially in the higher-dimensional cases, constructing convex hulls and characterising extreme points are highly demanding. The second algorithm, which we used for d = 5, 6, 7, is a slight modification of the one presented in [57] for MTP and utilises Lemma 2: (i) Apply length-1 β-swaps to the set S 0 = {p} to get S 1 = {β (k,l) p} ∀k,l .
(ii) Apply length-1 β-swaps to the set S 1 to obtain S 2 and check thermomajorisation relations between states in S = j S j that share the same β-order. Eliminate the ones that are thermomajorised by other states.
(iii) Iterate step 2 for i times until S i+1 = ∅. Update S = j S j .
The difference from the algorithm for MTP is that our algorithm also considers nonneighbouring swaps. The second strategy has an important edge over the previous one: there are only m(d − 1) inequalities to check for each final state, where m is the number of already obtained extreme points having the same order. Hence for higher-dimensional systems, the second approach is preferable. Yet, non-extreme states that are achievable by convex combinations of different extreme states, but not thermomajorised by either of them, can be included in the final set. When interested only in the final set S ETO , counting some non-extreme states is still permissible.
To search for the maximum length ℓ max , randomly generated initial states and energy levels were used to construct the set S ETO . For d ≤ 6, ∼ O(10 4 ) random samples were tested for each dimension. For d = 7, only 200 cases were calculated due to its high computational cost (∼ few hours for each initial state). Table 2 presents results from the search, where ℓ max scales much slower than the theoretical prediction. Moreover, the typical number of extreme points also deviates significantly from the naive expectation, i.e. they do not scale exponentially with ℓ max .

Discussions and Conclusions
In search of more practical thermal processes, we analyse catalysis in elementary thermal operations with small catalysts. These operations offer a clearer path to implementation and less stringent experimental control requirements, compared to the more general and well-studied framework of thermal operations. However, it is known that ETO is only a strict subset of TO, and in particular, some of the states reachable via TO are no longer achievable via ETO. We try to alleviate this limitation by allowing catalysis while maintaining ease of execution by limiting the catalyst size.
Several roadblocks had to be removed in order to tackle this problem. Firstly, there is currently no efficient way of characterising the set of allowed state transitions under ETO. We partially overcome this challenge through two approaches. First, we fully solve the characterisation problem for three-dimensional systems and for a subset of initial states of generic dimension. Additionally, we improve the analytical upper bound of the computational cost needed for the most general cases. Armed with these tools, we demonstrate the existence of catalysis in ETO, where even small catalysts prove to be remarkably useful. In fact, the extreme case of employing a qubit catalyst alone nearly eliminates the gap between TO and ETO for qutrit system states, sometimes even providing additional advantages beyond TO. At the same time, our physically relevant example of a cooling protocol highlights the power of relatively small catalysts, approaching the TO limit without the need for fine-tuning the catalyst states.
Another obstacle is the initial lack of clarity on why catalysts work, despite reports of their existence (or non-existence) in various resource theories. To address this, we leverage the step-wise structure of ETO to track and analyse catalytic evolutions, by capturing snapshots of states after each ETO step. This approach opens up a new avenue for understanding the underlying origins of catalytic advantage. In our example, the catalyst's role was to receive the free energy flowing out from the system, either through reduced state population changes or correlations with the system. Without the catalyst, all changes in system free energy would dissipate into the surrounding bath, which thermalises after each swap. This interpretation could potentially be extended to catalysts in different resource theories. For instance, it would be intriguing to further investigate how residual correlations between catalyst and system (for the scenario of correlating catalysis) are exemplified in this picture.
Among our results, the theoretical upper bound presented in Thm. 9 is expected to have more room for tightening. The β-swapping operations (unless the energy levels are degenerate) are highly resource-depleting in general. Therefore, the application of these operations exponentially many times, e.g. using ℓ max number of times, is unlikely to produce a final state that is extreme in terms of athermality. Our heuristic approach strongly suggests that the maximum length scales polynomially, rather than exponentially. Yet, we currently do not have rigorous proof of this scaling, and we leave it for future studies.

Acknowledgements
We thank Matteo Lostaglio for insightful discussions. This work was supported by the start-up grant of the Nanyang Assistant Professorship of Nanyang Technological University, Singapore.
[30] Rivu Gupta, Arghya Maity, Shiladitya Mal, and Aditi Sen(De We organise the appendix as follows: Appendix A illustrates the real time dynamics of one elementary thermal operation step, assuming intensity-dependent Jaynes-Cummings interaction between two levels and a harmonic oscillator in Gibbs state. Appendix B contains various technical results we developed on elementary thermal operations (ETO), which are necessary for the analytical results in the main text. In particular, the proof of Lemmas 6 is provided in this section.
In Appendix C, we develop the full characterisation of ETO with d = 3 by proving Thm. 5. For general high-dimensional cases, we prove Thm. 9 in Appendix E.
Appendix D proves Thm. 8 claiming the drastic simplification of the S ETO extreme points construction for two special initial β-orders that are monotonic with the energy eigenvalues.
In Appendix F, we detail the methods of constructing the set of reachable states via catalytic ETO with a qubit catalyst. Here we also analyse an example transition obtained from this method.

Appendix A. Dynamics of an elementary thermal operation
In this section, we show that any two-level swap M (i,j) λ can be implemented through the intensity-dependent Jaynes-Cummings interaction between two levels to be swapped and a harmonic oscillator with the matching energy gap. Our model comprises two-level system, harmonic oscillator, and an interaction between them, which respectively read is a Pauli X-like operator in the subspace spanned by energy eigenvectors |1⟩ S |n⟩ B and |2⟩ S |n−1⟩ B possessing the same energy and g is the coupling strength parameter.
Note that the interaction term is energy-preserving, i.e. [ For simplicity, we shift to the interaction picture. We always consider the initial state to be a product of incoherent system state and a Gibbs state w.r.t. some inverse temperature β, which also commutes with H S + H B . Then, the state ϱ SB and the Hamiltonian H SB remain invariant in the interaction picture. The time-evolution operator is given by making use of the properties of X (n) , (A.8) After time t, the system reduced state becomes ρ S (t) = Tr B U (t)ϱ SB U (t) † (A.9) = 1 − e −βω sin 2 (gt) p 1 + sin 2 (gt)p 2 |1⟩⟨1| S + e −βω sin 2 (gt)p 1 + cos 2 (gt)p 2 |2⟩⟨2| S ≡ q 1 (t)|1⟩⟨1| S + q 2 (t)|2⟩⟨2| S .
Since the reduced state on S is quasi-classical at all time, we focus on the population vector q(t) = (q 1 (t), q 2 (t)), which can be written as a elementary thermal operation channel with λ(t) = sin 2 (gt). Eq. (A.10) specifies the continuous evolution during the twolevel swap process, achieving the β-swap at gt = π/2.

Appendix B. Elementary thermal operations for d-dimensional systems
Appendix B.1. Basic transformation of thermomajorisation curves after a β-swap Given a system characterised by Hamiltonian H, we denote its thermal state as τ β , and describe the initial state with respect to its energy population vector p. Under a β-swap β (k,l) as defined in Eq. (11), when E k ≤ E l . Then the element-wise ratios, g(q) as defined in Eq. (2), are transformed accordingly Furthermore, equalities for the above equations hold only under the following circumstances: Naturally, the β-swap operations also alter β-orderings of states. Let us denote the initial β-order as π(p) = (π 1 , · · · , π d ). If k = π i and l = π i±1 for some i, that is if β (k,l) is a neighbouring swap for a state p, then π(q) can be easily determined: where S i,j is a swap between i'th and j'th elements as introduced in Lemma 6.

Appendix B.2. Useful technical remarks
Here we present sundry remarks on β-swaps that are utilised in proofs of lemmas and theorems in the later part of the appendix. These results hold as equalities in the channel level and do not depend on the states these channels are acting on.
Remark 1. The β-swap series β (k,l) β (k,l) = M (k,l) λ for some λ ̸ = 0, 1, and thus always produces a non-extreme point except for in the trivial case where E k = E l . In that trivial case, two repeated swaps always result in identity, and λ = 0.
Remark 2. If E k = E l for some levels k, l, β (k,l) always connect one extreme point of S ETO to another.
Proof. If this statement does not hold, there exists a state q that is extremal for S ETO (p) but β (k,l) q = i λ i r i for some λ i ≥ 0 and r i ∈ S ETO (p). But then q = (β (k,l) ) 2 q = i λ i β (k,l) r i . Since β (k,l) r i ∈ S ETO (p), q cannot be extremal.
The equality is obtained through direct calculations, where we omit identities acting on irrelevant levels when writing ETO maps (and do so consistently in the rest of the appendix for notational brevity).

Appendix B.3. Proof of Lemma 3
Proof. For the first part of the Lemma: by Eq. (B.8), a single neighbouring swap β π j ,π j+1 gives us Each (a m , L q (a m )) corresponds to an elbow point of L q , which coincides with L p , implying that q is tightly thermomajorised by p.
For the second case when β (π i ,π i+c ) acts on non-neighbouring levels (c > 1) in p, the change in π(q) becomes a little less straightforward. That is, π(q) ̸ = S i,i+c (π(p)), (B.14) in general. First suppose E π i < E π i+c . Then g(q) π i+c = g(p) π i > g(q) π i = (1 − ∆ π i π i+c )g(p) π i + ∆ π i π i+c g(p) π i+c > g(p) π i+c , (B.15) which implies π i+c = π(q) i and π i = π(q) i+c ′ with 1 ≤ c ′ ≤ c. Here we imposed to make π i and π i+c truly non-neighbouring. From the first equality of Eq. (B.15), L q (a i ) = L p (a i ) follows. However, (i + 1)'th elbow for L q is strictly separated from L p , which is sufficient to prove the second part of the Lemma. We demonstrate this for each possible case: Case ii-a): a i+1 − a i−1 > τ β π i and c ′ ̸ = 1. Then π(q) i+1 = π i+1 . Using g(p) π i > g(p) π i+1 , The strict concavity of L p (Eq. (B.16)) imposes The argument above is easily generalizable for E π i > E π i+c , so this concludes the proof.

Appendix B.4. Proof of Lemma 6
Here, we want to prove that a β-swap involving two levels E j , E k produces an extreme point only if its sole effect on the β-ordering on the final state is a swap of j and k. This technical result is later used in establishing Lemma 10, a key tool used throughout in several subsequent proofs. We start by proving the lemma for the case of p ∈ V 3 . Denote the initial β-order as π(p) = (π 1 , π 2 , π 3 ). Consider the following three cases: (i) Suppose E π 1 < E π 3 and π(q) = (π 3 , π 1 , π 2 ), where q = β (π 1 ,π 3 ) p. The same final ordering is obtained after two neighbouring swaps, i.e.
For the general case of p ∈ V d , if π(q = β (π i ,π i+c ) p) ̸ = S i,i+c (π(p)), then the equivalent of q ′ above can be chosen as follows: (i) If E π i < E π i+c and π i ̸ = π(q) i+c , then q ′ = β (π i ,π i+c ) β (π i+c−1 ,π i+c ) p ≻ β q. Although π(q) ̸ = π(q ′ ) in general, q can always be obtained from q ′ by partial level thermalization between π i and π i+c−1 . To see this, notice that Using the same argument to Eq. (B.23), q ′ π i > q π i , and thus Combining with Eq. (B.24), q is obtained q ′ by partial thermalization between levels i and i + c − 1.
q is obtained q ′ by partial thermalization between levels i + 1 and i + c.
(iii) If E π i = E π i+c , we always get π i = π(q) i+c .
Therefore, the state q with order π(q) ̸ = S i,i+c (π(p)) is always non-extremal in S ETO (p).
Appendix B.5. Each β-order has at least one S ETO vertice: proof of Lemma 7 In this section, we show that it is impossible to have a β-order that has no vertice of S ETO . Lemma 7 hints the lower bound scaling of the number of extreme points in worst cases.
Proof. To start, we construct a series of sets S d ⊂ S d−1 ⊂ · · · ⊂ S 0 defined as follows: • S 0 = {q|q ∈ S ETO (p) and π(q) = ψ}, In other words, S 1 is the set of states in S ETO (p) with a specific β-order ψ, and with maximal ψ 1 population r ψ 1 . Likewise, S 2 is a subset of S 1 , having additionally the maximal ψ 2 population, and so on. Note that S d always has a single element for each fixed choice of ψ, which we denote as r ♯.
Suppose there is no extreme point of S ETO (p) corresponding to ψ. Then r can be written as a strict convex combination of extreme states e (i) , i.e.
Starting from j = 1, check the following: (i) For j > 1, we have e (i) ψ k = r ψ k , ∀k < j from the last iteration. For j = 1, we do not need any condition yet.
ψ k , ∀k ≤ j can be found. Let us show how to do this: • If g(e (i) ) ψ j−1 = g(r) ψ j−1 ≥ g(e (i) ) ψ j > g(r) ψ j , we can simply thermalise all the levels ψ k , ∀k > j of e (i) to have the same slope, which is smaller than g(e (i) ) ψ j . Then we obtain the desired state e ′ , since levels with degenerate slopes -all ψ k>j in this case -can be permuted within themselves in the β-order. • If g(e (i) ) ψ j > g(r) ψ j−1 > g(r) ψ j † †, we can first reduce e (i) ψ j by partially thermalizing with populations of levels ψ k>j until g(r) ψ j−1 ≥ e ′ ψ j > g(r) ψ j . Then, as in the previous case, thermalizing all the levels ψ k>j will give e ′ .
However, such e ′ satisfies e ′ ∈ S j−1 and e ′ ψ j > r ψ j , which contradicts the assumption that r ∈ S j .
(iii) If e (i) ψ j ≤ r ψ j for all i, from convexity of the combination, e (i) ψ j = r ψ j for all i. Proceed to j → j + 1.
♯ It should be noted at this point that r may not be the only extreme point that has β-ordering ψ -there could be states r ′ of the same β-order, where the first elbow is lower that of r, and the second elbow higher. Such states are not, however, contained in S d . † † Requiring g(r) ψj−1 > g(r) ψj is always possible by putting all the levels having the same slope g(r) ψj to come after ψ j in the order ψ.
If e (i) ψ j = r ψ j for all i and j, e (i) = r, which contradicts the assumption that r is not extremal.
Appendix C. Full characterisation of elementary thermal operations for d = 3 and the proof of Theorem. 5 To prove Thm. 5, we first establish three preliminary results for d = 3.
• Remark 5 identifies two-neighbouring-swap series that generate S TO extreme points, which are uniquely extremal for S ETO in their orders by Cor. 4. This remark is used repeatedly to prove the other two lemmas.
• Lemma 10 shows that non-neighbouring swaps cannot be used in series with more than one swap.
• Lemma 11 sets the maximum length of swap series to be three.
The set of candidates for extreme points is achieved after ruling out all the swaps yielding provably non-extreme states. Proof. Direct calculation gives By using the algorithm in Def. 6 of [56], one can verify that the above channels are biplanar extreme points of the set of thermal processes. According to Thm. 4 in [56], such biplanar extremal channels generate extreme points of S TO (p), when the initial state corresponds to a particular β-ordering that can be found in the process of decomposing the graph structure of the channel matrix. Performing this procedure according to [56] reveals that for β (2,3) β (1,2) and β (2,3) β (1,3) , the relevant input state β-order is given by (2, 1, 3) and (3, 1, 2); similarly for statements 2 & 3 in remark.
Proof. To prove this, we need to show that i) a neighbouring swap following a non-neighbouring swap produces non-extreme point and ii) a non-neighbouring swap following a neighbouring one also yields a non-extreme point. Since only extreme points are of our interest, using Lemma 6, we can safely assume that π(β (π i ,π j ) p) = S i,j (π) for any π = π(p) and i, j. We tackle each problem by further dividing cases.
We exhausted all possible cases and none of the swaps can create an extreme point of S ETO (p).
Lemma 11. For p ∈ V 3 having non-degenerate energy levels, l i=1 β i p can be extremal in S ETO (p) only when l ≤ 3.
Proof. We prove this lemma by showing l = 4 swaps always produce non-extreme states. From Lemma 10, non-neighbouring swaps do not need to be considered. Then the remaining two final states are q (1,2,1,2) and q (2,1,2,1) .
Case i: π(p) = (1, 2, 3) or (3, 2, 1). There are only three distinct β-swaps for three-dimensional systems. Then length-four series always include a repetition of the same swap, and Thm. 8 forbids such series to make extreme states.
When there is a degeneracy in energy levels, e.g., E k = E l , the equality β (k,l) β (k,m) = β (l,m) β (k,l) let us rearrange the swap to make β (k,l) appear only at the end of the series. From (β (k,l) ) 2 = 1, we can always make β (k,l) never appear or appear only once in ⃗ β. After this reduction, Lemma 11 also holds for degenerate energy systems.
Appendix D. Simplifications for initial orderings monotonic in energy levels: proof of Theorem 8 The technical proof of Thm. 8 can be sketched in the following steps: • Firstly, we introduce a specific series of β-swaps that transforms an initial βordering to a target ordering (Def. Appendix D.1).
• This structure, which we refer to as the standard formation, is then shown to be equivalent to any β-swap series, where i) all swaps are neighbouring to initial states of the form Eq. (19) and ii) each swap is applied at most once (Lemma 12).
• Finally, to prove Thm. 8, we show that whenever the initial β-ordering is monotonic in energy, then a transformation that is not according to a standard formation always leads to a non-extreme state. This Lemma allows us to conclude that the number of extreme points for such a S ETO (p) is at most d!, similar to that of S TO (p).
(ii) Iterate the above step for j = 2, · · · , d − 1, defining { ⃗ β (j) } d−1 j=1 . The standard formation series is then simply the concatenation By construction, there is no repetition of a swap in this series and they are all neighbouring when applied to an initial state with the order π.

An example of the standard formation
For the orderings π = 1234 and π ′ = 4231, the standard formation is The intermediate β-orderings given by this process are This formation has a nice property, namely all swaps in each block ⃗ β (j) acts on level π ′ j and the ones in ⃗ β (k) for any k > j does not act on level π ′ j . In the lemma below, we illustrate how certain classes of swap series, which turns out to be the ones producing extreme states, can always rearranged into a standard formulation.
Lemma 12. Given an initial state p with ordering π monotonic in energy, denote a β-swap series and π ′ to be the final β-ordering of the state p ′ = ⃗ βp. If ⃗ β is such that: 1) each β i is a distinct swap, and 2) when applied to p, is always a neighbouring swap, then ⃗ β can always be expressed in the form of a standard formation for (π, π ′ ).
Proof. We prove this by induction. Suppose the above lemma is true for p ∈ V d−1 .
The first goal is to prove this for initial ordering π = (1, 2, · · · , d) and ⃗ β that satisfies the conditions in the statement of the lemma. By identifying the first and the last swaps acting on the level d, one can decompose ⃗ β into Here, ⃗ β (Pre) are the swaps coming before the first swap acting on the level d, and ⃗ β (Post) are the ones after the last swap acting on the level d.
Case i: d = π ′ m , where m ̸ = 1 is the position of level d in the β-ordering of the final state p ′ . To make a rearrangement of swaps, we first remark a few points using sets A = {i|g i > g d } and B = {i|g i < g d }. We will update these sets after each swap.
At the beginning, there is no element in B and all the other levels except d are in the set A. Next, we note the following: (i) Swapping i ∈ A and j ∈ B is not allowed, since they are non-neighbouring.
(ii) Any i ∈ A can move to B only when β (i,d) is implemented.
(iii) Since initially B = ∅, any given level either stays in A at all times, or it moves to B at some point and remains so thereafter. This comes from the restriction that in order for i to move between A and B, the swap β (i,d) must be used.
The sets A and B after the whole transformation is determined by the target βordering π ′ , where we denote them as A f = {π ′ i |i < m} and B f = {π ′ i |i > m}. Given the constraints above, we know that ⃗ β describes a special process. See Fig. D1, for instance, for a visualization of this operation. A and B are separated by level d (point 1 above). Starting from B = ∅, some elements i ∈ A are transferred to B whenever β (i,d) is implemented. Once this happens, i cannot go back to A, since it would require the repetition of β (i,d) to do so. Visually, this is understood by saying that the bar representing level d in Fig. D1 is penetrable from the left only. At the end, A = A f and B = B f , where elements of A f never passed through level d (point 3 above). Lastly, if k, l ∈ A, then they have not experienced a swap with level d yet, explaining the point 4 above.
The next step we want to show is that w.l.o.g., a rearrangement, where ⃗ β (Pre) contains all β-swaps that are part of ⃗ β, acting on elements i ∈ A f , but not involving level d, is possible. To do so, we identify all swaps β (k,l) in ⃗ β (Post) ⃗ β (d−rel) such that k, l ∈ A at the time of swap. From the rightmost one, make a decomposition Notice that ⃗ β (b) only contains swaps among B ∪ {d}, which includes neither k nor l. From Remark 3, we then have that β (k,l) and ⃗ Figure D1. Illustration of changes in β-order starting from π = (1, 2, · · · , d) undergoing a swap series ⃗ β, where it is assumed that ⃗ β is always neighbouring and allows no repetition of a particular swap. Level d is depicted as a bar, dividing the levels into sets A and B. For an initial state whose β-order is monotonically increasing in energy, A = {1, · · · , d−1} while B = ∅. Three different types of swaps are possible: i) swapping between level d and its neighbouring element which is in A (1st swap above), ii) swapping among neighbouring levels in A (2nd, 3rd steps above), and iii) swaps among levels in B (4th explicit step above). When the type ii) and iii) swaps occur, A and B remain the same. On the other hand, type i) swaps move one element of A into B. The process of elements going from B to A is forbidden due to the no-repetition constraint. At the end of applying ⃗ β, A and B becomes A f and B f . The elements of A f never experience a swap with level d, while the elements of B f experienced exactly once swapping with level d.
If k ∈ A f , the existence of β (k,l) in ⃗ β indicates l ∈ A when the swap is applied. Hence, by repeating this until the end, all swaps β (k,l) with k ∈ A f are merged into ⃗ β (Pre) . Since ⃗ β (Pre) does not contain swaps acting on level d, it acts on at most d − 1 levels and can be reordered in the standard formation by the assumption that the lemma holds for states in V d−1 . Until now, there is no swap acting on level d and thus d = π( ⃗ β (Pre) p) d . Then levels in B f , which should be swapped with d in ⃗ β (d−rel) , occupy later d−m slots in the β-order: π( ⃗ β (Pre) p) m , π( ⃗ β (Pre) p) m+1 , · · · , π( ⃗ β (Pre) p) d−1 . By construction of the standard formation, ⃗ β (Pre) then can be decomposed into . Again, by assumption this swap can be rearranged as a standard formation. Concatenating standardised series ⃗ β (Post) ⃗ β (d−rel) ⃗ β (B f ) and ⃗ β (A f ) , we obtain the standard formation for the entire series.
We first prove the only if statement of the lemma, as follows: (i) We show that the repetition of any particular β-swap always leads a to nonextreme state if the series is all-neighbouring. This is done by contradiction: suppose that ⃗ β = β (k,l) ⃗ β ′ , w.l.o.g. assuming k < l. Furthermore, assume ⃗ β ′ to be a series satisfying the only if part of the statement but contains β (k,l) , causing β (k,l) to occur twice in ⃗ β. From Lemma 12, ⃗ β ′ can be written in a standard formation, which reads Notice that since ⃗ β is an all-neighbouring swap, this implies that after ⃗ β ′ , levels k and l should be neighbouring with g( ⃗ β ′ p) k ≤ g( ⃗ β ′ p) l , which then implies for some set of levels M ⊂ {m|m < k < l}, by construction of the standard formation. Here, ⃗ β (Irrel) is a series that acts on neither k nor l. Using Remarks 3 and 4, which always generates a non-extreme point from Remark 1.
(ii) Now we show that non-neighbouring swaps are also not allowed. and after all-neighbouring series ⃗ β (NS) . From the first part of the proof, ⃗ β (NS) also cannot include any repetition. Using Lemma 12, we rearrange ⃗ β (NS) into the standard formation.
1. ∃l satisfying k < l < m or k > l > m: firstly, note that the procedures from Eqs. (C.5)-(C.9) can be generalised for higher dimensions. If g(r) k > g(r) l > g(r) m and k < l < m for some r, 17) which implies that β (k,m) r can be obtained from β (l,m) β (k,m) β (k,l) r via partial thermalization of levels k and l (cf. d-dimensional case for Appendix B.4), and thus not extremal in S ETO (r). Similarly, the result also holds when k > l > m. By putting r = ⃗ β (NS) p, the state ⃗ βp is not extremal in S ETO ( ⃗ β (NS) p) and thus not extremal in S ETO (p).
(B) If l 0 and m are not neighbouring and there is no l satisfying k < l < m or k > l > m, the levels between l 0 and m are {j i } such that j i < k, m. By the construction of the standard formation and the resulting order we know that for all j i : i) β (j i ,l 0 ) and β (j i ,k) exist in ⃗ β (NS) , ii) β (j i ,k) proceeds β (j i ,l 0 ) , and iii) β (j i ,m) does not exist in ⃗ β (NS) . The last condition also implies that after β (j i ,l 0 ) , swaps acting on j i are only acting within the set {j i }, which we will denote as ⃗ β ({j i }) . Then since β (k,m) ⃗ β (Irrel) does not act on levels l 0 and all j i . Finally, l 0 is again neighbouring to m in π( ⃗ β (Irrel) β (l 0 ,m) ⃗ β (Pre) p) and we can use the argument from case ii.A to prove this state is non-extremal.
3. All l < k, m: denote the last such l swapped with k as l 0 and they are neighbouring from the structure of the standard formation. Similar to Eq. (D.19), we get β (k,m) β (k,l 0 ) part, which produces a non-extreme state by Lemma 10.
As a result, non-neighbouring swaps are completely ruled out from the candidate of extreme point producing β-swaps when starting from monotonic order states.
The sufficient condition of the lemma can be shown by recalling two properties: i) there exists at least one extreme point of S ETO for each β-order (Lemma 7) and ii) β-swap series satisfying the conditions of the lemma are all equivalent if the pair (π, π ′ ) is identical (Lemma 12), which makes them the only candidate for an extreme point with order π ′ .
Lastly, we note that the proof holds for π = (d, d − 1, · · · , 1) initial states since all we have used are Remark 4, Lemma 10, and Lemma 12, which hold even when the energy ordering is inverted.
Appendix E. Proof of Theorem 9 Before proving Thm. 9, we establish a result connecting lower dimensional results to higher dimensions. Lemma 13, combined with the fact that two disjoint neighbouring β-swaps produce S TO extreme points (using Remark 3 twice), shows that states after two different neighbouring β-swaps are always uniquely extremal in their order.
Cases 1-3 follow from a slight generalization of Remark 5. If a matrix M is a biplanar extremal thermal process (see [56] Def. 6 and Sec. IV), then M ⊕ 1 is also a biplanar extremal thermal process, since an identity connects i'th element only to i'th element and thus biplanar; and all elements of thermal processes are non-negative and upper bounded by 1, making an identity extremal. Hence, we only need to prove for cases 4 & 5.
Case 4: This can be proven by showing that all states q ′ ∈ S ETO (p) such that π(q ′ ) = π(q (j−1,j) ) are thermomajorised by q (j−1,j) . We will show this by contradiction. To do so, first note that q (j−1,j) is very close to being tightly-majorised by the initial p -in particular, all elbow points of L q (j−1,j) lie on L p except for the j'th one. Therefore, for any q ∈ S ETO (p) with the same β-ordering as q (j−1,j) , their elbows are aligned, and we know that L q (j−1,j) ≥ L q ′ at all elbows other than j'th. Now, assume that in fact q (j−1,j) ⊁ β q ′ . However, this can only happen if L q ′ > L q (j−1,j) at the j'th elbow. Since π(q (j−1,j) ) = (· · · , π j , π j+1 , π j−1 , · · · ), the above condition translates into where we have simply used the fact that L q ′ at the j-th elbow is equal to one minus the probability mass corresponding to the j + 1-th up to d-th elements in the β-order π(q ′ ). Let us denote A R = k∈R p k to be the probability mass over a set of levels R. Note that this quantity changes whenever a level i ∈ R is β-swapped with a level in i ′ ∈ R c . When level i ′ has a steeper slope than i, then A R increases after the β-swap; conversely, k∈R p k decreases. To make use of this observation for Eq. (E.5), take The only β-swaps that can reduce A R would be β (π j−1 ,π j ) and β (π j−1 ,π j+1 ) . Among different combinations of two swaps, q (j−1,j) = β (π j−1 ,π j+1 ) β (π j−1 ,π j ) p achieves the minimum probability mass over a set of levels R as can be seen from Thm. 5. Thus, Eq. (E.5) is impossible. Case 5: π j+1 < π j−1 , π j , we must try to maximise j−2 k=1 q π k + q π j+1 , which is achieved when q = q (j,j−1) as in Case 4. This concludes the proof. Now we prove Thm. 9.
i ′ ) for some j, j ′ and i ′ > i. We will show that above coincidences can happen only when i ′ for i ≥ i ′ + 2 is also not allowed from the same reason.

Appendix F. Catalytic elementary thermal operations: an example
We tackle the problem by the following procedure: given p ∈ V 3 , we choose a qubit catalyst c ∈ V 2 , and construct the reachable state set S ETO (p ⊗ c). The catalyst Hamiltonian is assumed to be w.l.o.g. degenerate, and hence a choice of c = (c 1 , 1 − c 1 ) is simply determined by a real-valued parameter c 1 . We then find the full characterisation of S ETO (p ⊗ c) by identifying its extreme points, and denote S CETO (p; c) to be the set of all states q such that q ⊗ c ∈ S ETO (p ⊗ c).
Lastly, this procedure is iterated for different choices of c, by varying the choice of c 1 in a sufficiently fine-grained manner.
Since both catalyst local free energy and mutual information increase, system local free energy should always decrease at this stage. The total state after these swaps is Catalyst local free energy decreases as a result of mixing, while correlation increases.
(iii) Last two swaps increases 3 * 1 level population to recover the original ratio between 3 * 1 and 3 * 2, while at the same time further reducing the system ground state population. For this particular choice of catalyst, we have a simplification in the sense that these swaps also balance 1 * 1 and 1 * 2, leading to q ⊗ c ≃ (0.0832, 0.1348, 0.1977, 0.3203, 0.1008, 0.1633), (F.14) q ≃ (0.2179, 0.5180, 0.2641), (F.15) with vanishing correlation and retrieval of the original catalyst. The system free energy increases here, since level |1⟩ S , which already has the lowest slope, loses more population and the new state will thermomajorises the old state. This behaviour of increasing free energy after swaps is strictly forbidden in non-catalytic setting, and allowed in this case by sacrificing correlations and catalyst free energy stored from previous operations.
Although there is a room for some change in orders between swaps that are commuting, the above is a typical strategy for constructing catalytic transformations: (i) exploits expanded dimensionality to swap a system plus catalyst level with more numbers of levels, paying i) temporary correlations and ii) local variations on catalyst as a cost; (ii) resolves correlations by mixing in the degenerate energy subspace; and (iii) recovers original catalyst distribution while increasing system local free energy.
Even for generalised free energies F α with different α values, similar behaviours are observed. In Fig. F1, α = 0.5 and 2 are presented as representative examples. In both of the cases, F α for a system reduced state (see (c) and (d) of Fig. F1) shares the decreasing/increasing trend, albeit with different slopes.