On the stability of tidal streams in action space

In the Gaia era it is increasingly apparent that traditional static, parameterized models are insufficient to describe the mass distribution of our complex, dynamically evolving Milky Way (MW). In this work, we compare different time-evolving and time-independent representations of the gravitational potentials of simulated MW-mass galaxies from the FIRE-2 suite of cosmological baryonic simulations. Using these potentials, we calculate actions for star particles in tidal streams around three galaxies with varying merger histories at each snapshot from 7 Gyr ago to the present day. We determine the action-space coherence preserved by each model using the Kullback-Leibler Divergence to gauge the degree of clustering in actions and the relative stability of the clusters over time. We find that all models produce a clustered action space for simulations with no significant mergers. However, a massive (mass ratio prior to infall more similar than 1:8) interacting galaxy not present in the model will result in mischaracterized orbits for stars most affected by the interaction. The locations of the action space clusters (i.e. the orbits of the stream stars) are only preserved by the time-evolving model, while the time-independent models can lose significant amounts of information as soon as 0.5--1 Gyr ago, even if the system does not undergo a significant merger. Our results imply that reverse-integration of stream orbits in the MW using a fixed potential is likely to give incorrect results if integrated longer than 0.5 Gyr into the past.


INTRODUCTION
In the modern era of surveys like Gaia (Collaboration et al. 2016), Sloan Digital Sky Survey (Majewski et al. 2017;Kollmeier et al. 2017), and Rubin-LSST (Ivezić et al. 2019) with high Galactic resolution, it is becoming obvious that our current static parametric models are unable to accurately describe the dynamically complex and evolving Milky Way (MW) halo. For years, researchers have used analytic density profiles, specified in terms of a handful of intuitive parameters, in order to understand the mass and shape of the MW (e. g., McMillan 2011;Bland-Hawthorn & Gerhard 2016;Sanderson et al. 2018); a crucial step in constraining the nature of dark matter (DM) (e.g., Mao et al. 2015;Robles et al. 2017; Thomas et al. 2018;Fitts et al. 2019;Sameie et al. 2021) and understanding the formation of the MW (e. g., Wechsler et al. 2002;Ludlow et al. 2013;Bonaca et al. 2020;Naidu et al. 2020;Li et al. 2022). These parametric models are efficient in approximating the present-day mass distribution of galaxies that match their symmetry and scaling assumptions, but they lack the necessary flexibility required to model galaxies with complicated ongoing mergers-and it is clear that the MW falls under this description, with a massive early merger (Gaia-Sausage-Enceladus (Helmi et al. 2018a;Belokurov et al. 2018)) followed by a quiescent period and the current merger with massive infalling satellites such as the Large Magellanic Cloud (LMC) (Besla et al. 2007;Kallivayalil et al. 2013;Peñarrubia et al. 2015;Vasiliev et al. 2021) and the progenitor of the Sagittarius stream (Sgr dSph) (Ibata et al. 1994;Newberg et al. 2002;Kirby et al. 2013) that change the potential of the MW in nonadiabatic ways (e.g., Antoja et al. 2011;Laporte et al. 2018;Erkal et al. 2019;Petersen & Peñarrubia 2020;Garavito-Camargo et al. 2021;Vasiliev et al. 2021;Lilleengen et al. 2022). The time-independent, relatively symmetric parametric models previously used to describe the MW's potential fail to capture this evolution. Re- A . cently, D'Souza & Bell (2022) used rigid spherical potentials for the MW-mass host and the most recent accreted satellite to backward integrate orbits of subhalos from their present-day positions. The found large uncertainties within the recovered orbits with respect to the true orbits. In order for a model to accurately describe the global potentials, it should be able to incorporate perturbations caused by mergers in a Galactic halo. At the same time, the amount of information available to constrain potential models is far larger, and spans a larger volume, than ever before. Together these developments argue for establishing a new framework to model the global MW potential profile that is not limited by parameterization of quantities such as radius and mass, but takes into account the time evolution of halos. One alternative and improved method is the multipole potential model based on basis function expansions (BFE), which can specify potentials of an arbitrary density distribution to high accuracy (Hernquist & Ostriker 1992;Lowing et al. 2011) and can be used to describe MW-like halos. Given the MW's dynamic nature, we expect that a time-evolving potential model constructed using BFE will perform better than traditional parametric static potential models. Much work has already been done on the feasibility of orbit reconstruction in a time-dependent potential based on BFE. Ngan et al. (2015) simulated orbits after tidal disruption of a star cluster using BFE under a smooth halo and a lumpy halo, they found that two orbits were similar in orbital positions and had similar apo-and pericentric distances. Later, Sanders et al. (2020) tested a true time-dependent BFE-based model and quantified the error in orbit reconstruction and showed high fidelity orbit reconstruction is attainable using BFE. Previous work focused on reconstructing and comparing the phase-space information of tidally disrupted star clusters of dwarf galaxies, but a lot of modern techniques use actions as proxies for preserving orbits (e.g., Sanderson et al. 2015Sanderson et al. , 2017O'Hare et al. 2020;Reino et al. 2020;Trick et al. 2021;?). We are specifically interested in how well different models preserve the actions of stellar streams and the orbits of halo objects in realistic simulations of MW-like galaxies. Stellar streams are formed when stars in a globular cluster or a dwarf galaxy are accreted into a more massive host galaxy due to tidal force from the gravitational potential of the host (Helmi & White 1999). These streams were first predicted from a galaxy merger simulation by (Toomre & Toomre 1972) followed by the first observational discovery by (Ibata et al. 1994;Johnston et al. 1995;Newberg et al. 2002) of a stream from the ongoing Sgr dSph merger, streams in globular clusters, and the Helmi streams (Grillmair et al. 1995;Helmi & White 1999). These streams are excellent probes of the shape of the MW's radial mass profile (e.g., Koposov et al. 2010;Bonaca et al. 2014;Pearson et al. 2015;Sanderson et al. 2017;Reino et al. 2020), its symmetry, (e.g., Law & Majewski 2010;Malhan & Ibata 2019) and, the orientation of its symmetry axes (Vera-Ciro & Helmi 2013;Vasiliev et al. 2021;Panithanpaisal et al. in preparation).
When a model is a good representation of the true potential it can be used to transform the stream's phase-space coordinates to the action-angle variables. These streams are highly clustered in action space (Helmi & White 1999;Sanderson et al. 2015) because they arise from the tidal disruption of globular clusters or dwarf galaxies, which have much smaller initial phase-space volumes than the MW. Mass models of the MW are also used to explore the past and future orbits of satellite galaxies, which could affect their formation histories Tomozeiu et al. 2016;Buck et al. 2019) and our conclusions about their masses pre-infall (Mayer et al. 2001;Łokas et al. 2010) and infall times (Sohn et al. 2020). The most famous case is that of the Large Magellanic Cloud (Besla et al. 2007;Kallivayalil et al. 2013), but connections between orbits and evolution have been explored for various other dwarfs as well, including Sgr (Johnston et al. 1995;Ibata et al. 1996;Johnston et al. 1999;Jiang & Binney 2000) and Antlia II (Chakrabarti et al. 2019). The development of high-precision astrometry allowed measuring proper motions of dwarfs with the Hubble Space Telescope (Sohn et al. 2020) and Gaia (Helmi et al. 2018b;Fritz et al. 2018;Pace & Li 2019) and have expanded the possibilities of analyses greatly in the past few years and promise to continue doing so in the future (Kallivayalil et al. 2015). However, reconstructed satellite orbits are necessarily predicated on some model of the MW, and the time range over which the orbit integration remains close to the true orbit is determined by the ability of the model to resemble the true potential . The positions of the centers of the clumps of stream stars in such a model, defined by the mean orbit of the group of stars forming each stream, should also be stable with time. Studying this action-space invariance can therefore be used to quantify how close a potential model is to the true potential in terms of how well it preserves two important properties of the system: the property that stars in the same stream are on neighboring orbits, quantified by the tightness of their action-space clump, and the stability of the mean orbit about which the stars scatter, quantified by the stability of the location of the clump in action space with time.
In this paper, we analyze streams in their action-space properties of streams in the MW-like cosmological simulations with different accretion histories from the Feedback In Realistic Environments (FIRE-2) suite  described in (Sec. 2.1). In Sec. 2.4 we list different potential models utilized and how well they preserve action-space information (Sec. 2.5). We specifically examine the ability of the multipole expansion (Sec. 4.1) to model galaxies with different merger histories, in order to understand to what extent these models can successfully produce a coherent, and consistent, action-space distribution for stellar streams. We use statistical measures of information detailed in Sec. 3 on our entire sample to quantify the degree of clustering and stability of action space for different potential models in Sec. 4.3 and 4.4 respectively. We present our results in Sec. 5.

ACTION SPACE OF TIDAL STREAMS IN COSMOLOGICAL SIMULATIONS
In this section, we introduce the framework used to compare potential models in terms of the degree of action space clustering and coherence for stars in stellar streams. We start with a brief description of the simulations utilized for the analysis ( §2.1-2.3) and introduce the basis expansion used for the potential models ( §2.4). Finally, we review how actions are calculated ( §2.5) and describe the statistics used to quantify the clustering and stability of the action space when assuming various potential models ( §3).

Simulations
We study cosmological zoom-in baryonic simulations of MW-mass galaxies from the Latte suite (introduced in Wetzel et al. 2016), which is part of the FIRE project. Simulations were run using the Gizmo code (Hopkins 2015), which utilizes a TREE+PM gravity solver, a Lagrangian meshless finite mass (MFM) solver for hydrodynamics that provides adaptive spatial resolution. The simulations in this paper were run with the FIRE-2 physics model , which implements star formation and stellar feedback parameters from realistic stellar evolution models (STARBURST99; Leitherer et al. 1999), consistent with the Lambda cold dark matter cosmology from the Planck Collaboration et al. (2015). Wetzel et al. (2016) and Hopkins et al. (2018) give an in-depth description of the simulations from the FIRE-2 suite. We select three galaxies from the Latte suite, referred to in this paper as m12i, m12f, and m12w, for our analysis that are similar to MW in their stellar and gas mass. m12i and m12f are similar to MW in their stellar morphology while m12w is less disky Sanderson et al. 2020). In the Latte suite, we chose halos agnostic to their formation history, so the sample is representative of varying merger histories, until the present day (redshift z = 0) (Samuel et al. 2021). These halos have a total mass of 1 − −2 × 10 12 M with average mass of baryons b = 7100 M and DM particle DM = 35, 000 M . Table 1 lists radial and tangential velocities of the most massive merging satellite at the first pericenter time peri where the satellite is peri from the center of the main halo, along with the mass ratios of the mergers observed in the simulations measured at peri , computed as, http://fire.northwestern.edu/latte • Infall mass ratio (IMR): The mass of the main halo divided by mass of the satellite just before infall.
• Total mass ratio (TMR): The mass of the main halo divided by the mass of the merging satellite.
• Pericenter mass ratio (PMR): The mass of the main halo enclosed within the peri divided by the mass brought in by the merging satellite enclosed within peri from the center of the satellite.
m12i lacks any major infalling satellites after 7 Gyr. m12i's largest merger has a TMR of 45.5 and a slow radial velocity rad , making it a rather weak interaction. This isolated halo acts as a baseline to study the adverse effects of mergers on potential modeling. m12f has two major mergers at 10.8 Gyr (m12f-1) and 7.2 Gyr (m12f-2) with TMRs 15.9 and 10.25, and PMRs 7.9 and 4.6, respectively. m12f-1 merger is comparable to the MW's interaction with the progenitor of the Sagittarius stream, Sgr dSph galaxy (Jiang & Binney 2000;Niederste-Ostholt et al. 2010;De Boer et al. 2015;Gibbons et al. 2017) in terms of merger ratios with the distance at the first pericenter 35.7 kpc. In case of m12w, the TMR is 8 and PMR is close to 2.6 similar to MW's merger with LMC which has an estimated TMR of 7.6-10 in the broadest confidence interval (Peñarrubia et al. 2015;Shipp et al. 2021;Vasiliev et al. 2021) with significantly higher radial velocity 106.8 km/s and the pericenter distance 7.2 kpc, which is not the case for the LMC (Besla et al. 2007).

Stream identification
Stellar streams in these halos are selected based on the number of star particles (between 120 and 10 5 particles), pairwise separation between them (maximum value of separation between any two star particles is greater than 120 kpc), the local velocity dispersion (Panithanpaisal et al. 2021), and bound to host between 2.7-6.5 Gyr ago. Since each star particle has a mass of ∼ 5000 , streams have a lower mass limit of 10 6 . We know precisely which stars belong to each stream [we have linked the stars to their DM halos in the merger tree (Panithanpaisal et al. 2021)]. The number of streams in each of the simulations are different, while m12i and m12f have nine and eight streams respectively, m12w only has two resolved streams. We track the formation and evolution of streams in time (Panithanpaisal et al. 2021) using the halo catalogs (Behroozi et al. 2012a) connected in time using consistent trees (Behroozi et al. 2012b).

Centering
The simulations are run in an arbitrary simulation box frame, where the total momentum of the system is nonzero. We first establish a galactocentric coordinate system (x,v) for each snapshot aligned with stars in the galaxy. We compute A . Note. peri : Time of closest approach between the main galaxy and the satellite. peri : Distance between satellite and main galaxy at closest approach. IMR: Mass Ratio at infall, main / sat just before infall, TMR: Total Mass Ratio, main / sat at peri , and PMR: Pericenter Mass Ratio, main (< peri )/ sat (< peri from sat ) evaluated at peri . rad , tan : radial and tangential velocities of the satellite with respect to the main galaxy. The non-zero rad is a limitation of finite time resolution but regardless comments on how radial the merger is.
the central position of each halo by calculating the center of mass (COM) of stars within a sphere and iteratively shrinking the sphere and recomputing the COM until a consistent central location is reached. Next, we estimate the central velocity using the mean velocities of stars within 10 kpc of the center. We pick this location as (x,v) = (0,0) and rotate all the snapshots of a simulation such that the Galactic disc is planar in xy at the present day, this is a good approximation as the angular momentum of the disk does not vary for the second half of the simulation when the star formation is steady rather than bursty (happens usually at z = 0.5) Santistevan et al. 2021;Gurvich et al. 2022). Fig. 1 plots the total COM displacement in simulation coordinates over a period of 7 Gyr. Note the accelerating COM positions in the case of m12f (red) and m12w (orange) due to ongoing mergers at respective peri (marked with a star). The infalling satellite in m12w accelerates the main halo significantly during the interaction, leading to a displacement in both positions and velocities (reflex motion). As such, the COM of the outer regions are significantly displaced from the Galactic disk due to this reflex motion (Erkal et al. 2019;Cunningham et al. 2020;Petersen & Peñarrubia 2020;Garavito-Camargo et al. 2021).
There are also high-frequency jitters in the COM position due to active star formation near the center (Orr et al. 2021), This jerky COM motion causes fluctuations in the central accelerations (green points in Fig. 2) that lead to a non-conserved reference frame. These jittery accelerations are insensitive to the centering method and the aperture used to find the central positions and velocities. We limit this nonconserved nature of the axis frame caused by jittery COM by fitting three cubic approximations (one for each axis) on the COM positions over time ensuring that the new functions are doubly differentiable. These expansions act as smooth proxies for the original center positions and velocities and minimize the effect of shaky accelerations. Since we are using cubic spline expansions, we track the same COM position in time and the maximum difference in central velocities is 17 km/s. Fig. 2 plots the total magnitude of spline acceler- Figure 1. The dynamically evolving COM in simulation coordinates for m12i (green), m12w (orange), and m12f (red). The re-centering is done with respect to the present-day positions. While m12i has smooth COM positions with time, accelerations in COM are prominent in m12f and m12w at peri (black star). This is due to the infalling satellites displacing the barycenter position and velocity (reflex motion) during the merger. This effect is stronger in m12w with a more massive satellite and lower peri . ations (black) for m12i, obtained via the second derivative of the spline functions, along with the magnitude of actual acceleration (green), found using numerical differentiation of real central positions for m12i. Note the highly erratic and larger actual accelerations in contrast with smooth and smaller spline accelerations. This methodology does not affect the potential modeling fits for the stellar streams as their orbits are far from the Galactic Center. The Cartesian accelerations along with their spline counterparts are posted in Appendix .1.
We use the spline approximations to define central positions, velocities, and fix the orientation of the disk in each simulation. Fig. 3 plots the pericenter position of the merging galaxy from the central halo for about 7 Gyr in the past. The m12f-1 merger has a prolonged infall time, while other mergers are relatively short. The shorter time scale mergers  are closer to the main halo at peri (black star) and difficult to incorporate into a time-evolving potential model.

Potential models
We use four types of potential models: (i). a timeindependent analytic potential (TIAP) model described by a combination of a Navarro-Frenk-White (NFW) profile for the DM halo (Navarro et al. 1996), a spherical power law for the central bulge (Binney & Tremaine 2011) (assuming spherical symmetry) and a Miyamoto-Nagai profile to model the stellar disk (Miyamoto & Nagai 1975) (assuming axisymmetry). These parametric potential profiles are traditionally utilized at z = 0 fitted to observational data (e.g McMillan 2011; Bovy 2014; McMillan 2016) or a simulation snapshot (Buist & Helmi 2014;Sanderson et al. 2017). (ii). a semitime independent analytic potential (STIAP) model that fixes the shape of the potential using TIAP but accounts for the changing mass with a linear mass ratio scaling at each time. (iii). a time-evolving multipole potential (TEMP) using BFEs, a low-order spherical harmonic and azimuthal harmonic basis expansion fit under axisymmetry conditions for DM and baryons, respectively, evaluated at each time snapshot independently, and (iv). a semi-time independent multipole potential (STIMP) model calculated at z = 0 using the same expansions and symmetrical structure as TEMP with a linear mass ratio scaling to account for the changing mass of the halo in time. We fit these different potential models using the galaxy dynamics package Agama (Vasiliev 2019).

Analytic models
We fit a parametric potential model for the DM halo, Galactic disk, and the central bulge of the galaxy independently at z = 0. We use the NFW profile to model the DM halo and hot gas ( gas ≥ 10 4.5 K) described by where Φ NFW ( ) is the potential at spherical radius , NFW is the mass enclosed in a radius of ∼ 5.3 and is the scale radius of the main halo. The NFW profile is parameterized using NFW and because of the sensitivity of the stellar orbits to the locally enclosed mass. We fit the total density of each DM halo using the density form of the NFW profile for DM particles within 300 kpc.
We model the Galactic Disk defined by the star particles and cold gas ( gas < 10 4.5 K) within 5 kpc ≤ cylindrical radius (R) ≤ 30 kpc and height | | ≥ 3 kpc from the COM using the Miyamoto-Nagai (MN) profile in azimuthal coordinates, given by We fit the 2D density profile of the simulation with the density functional form of MN profile using the scale mass ( MN ), scale radius ( ) and scale height ( ) for this Galactic disk.
The central bulge of stars and cold gas within ≤ 3 kpc of the COM is modeled using a spherical power-law-based A .
density profile: where is the spherical radius, 0 is the normalized density, cut is the outer cutoff radius, and , , , and are the power parameters. We again fit the density of the spherical bulge. For all our fits, we utilize a Nelder-Mead optimizer (Gao & Han 2012) to minimize differences in log-density (i.e., relative error). Table 2 lists the TIAP fits for the halos. Given the TIAP model, we from the STIAP by extrapolating the TIAP at other times with a mass ratio scaling, between the mass of the halo at each time step with respect to the total mass of the halo at the present day.

Low-order Multipole model
The multipole model uses a combination of BFEs to accurately describe the density and potential. These potential expansions are found by replacing two of the three integrals from the Green's function solution to the Poisson equation, with a sum in some BFEs (e.g Lowing et al. 2011;Garavito-Camargo et al. 2021). We use a spherical harmonic basis to describe the DM halo and hot gas ( gas ≥ 10 4.5 K) components of the potential and an azimuthal harmonic expansion for the stellar and cold gas ( gas < 10 4.5 K) components.
To model the DM and hot gas, we use a spherical-harmonic expansion: where ( , , ) are spherical coordinates in the centered frame, ℓ are the spherical harmonics for the multipole term ℓ, and −ℓ ≤ ≤ ℓ. Φ ℓ, ( ) is the amplitude: where ( ) can be expressed as two integrals of density at each radius, with normalization constants : This quantity is obtained from the particle distribution using a penalized spline estimate in logarithmically spaced radii fit to spherical harmonics coefficient calculated for each particle (Vasiliev 2018, Appendix A.2.5): where˜are normalized associated Legendre polynomials and To model the stars and cold gas, we use a BFE based on sets of Fourier harmonics in azimuthal coordinates, where potential and density are where and are given by: and Ξ is the Green's function in cylindrical coordinates. For the axisymmetric model utilized in this paper, only the = 0 harmonic exists for the azimuthal basis and only ℓ even terms are nonzero for the spherical harmonics expansion. We compute our expansion coefficients Φ , ( ) on a 1D grid of size of 25 kpc and Φ ( , ) on a 2D meridional plane of grid size 25 kpc in and (Vasiliev 2019).
These two expansions together account for the total mass distribution of matter and form the Galactic potential profile. For TEMP, we find these expansion coefficients at each time step in the specified coordinates form a time-evolving model. For STIMP, we use the expansion calculated at the present day and rescale the potential to the total halo mass (total mass in all species within vir ) at each time step similar to the STIAP. Fig. 4 plots the |Φ ℓ,0 | (amplitude) averaged over the radii, computed at even poles up to ℓ ≤ 4 of TEMP (the odd poles are zero due to axisymmetry). The increased amplitude in higher poles around the merger time (marked by black stars for m12f and m12w) implies an increase in total mass at the pericenter passage of the satellite as the merger causes the potential to deepen. The higher poles have limited ability to capture potential perturbations caused by the merging satellite even under axisymmetry assumptions.
We plot the rotation curves for the potential models along with the true rotational velocities (left) for our simulations in Fig. 5 at z = 0 for m12i (top), m12f (middle), and m12w We provide the low-order (ℓ max = 4 and max = 4) multipole potential models for the FIRE-2 suite of cosmological simulations computed at the present day under two different symmetry assumptions: axisymmetry, which can be used to compute actions in cylindrical coordinates, and with no symmetry assumptions, which is better at modeling mergers with high fidelity. These expansions and a demonstration Jupyter notebook are available at https://web.sas.upenn.edu/dynamics/data/pot_models/. Note. Note we perform least square optimization to find the best fit using the Nelder-Mead method described by (Gao & Han 2012).

Figure 4.
Amplitude in |Φ ℓ,0 | averaged over radii for even poles of the multipole model with ℓ ≤ 4 described in Sec. 2.4.2 for all the halos computed independently at each time step. Due to the axisymmetry assumption, the odd poles vanish and the amplitude is stored only in = 0 for even poles. The black stars on the plot mark the peri for m12f and m12w. The increased amplitude near the merger demonstrate the ability of the higher poles in spherical harmonics in modeling a satellite infall.
(bottom). All the potential models at the present day fit the actual rotation curve well within ±5% (percentage difference plots on the right). We fine-tune the TIAP parameters to improve the velocity fits under the values given in Table 2, but fine-tuning the rotation curves does not significantly change the parameters presented.
To summarize, we consider four potential models.
• TEMP: in which we compute the multipole expansion at each snapshot independent of the others • STIMP: where the multipole expansion is computed at present day and used for all time steps with a mass scaling hence the shape of the potential remains the same, while the mass evolves with time • TIAP: where the potentials are fit at present day and utilized at each time step • STIAP: that is TIAP used with a mass ratio scaling for at different time steps.
We list the potential models along with their shorthand and labels used throughout this paper in Table 3. Both the semitime-independent and time-independent models are henceforth labeled time-independent models (TI) due to fixed potential shape. Rotation velocity curve (left) computed using different potential models at present day along with the true rotation curve (black). Note that both multipole (blue) and analytic (green) models produce rotation curves consistent with the true curves (accuracy of ±5% at all radii (right)) of the true rotation for m12i (top), m12f (middle), and m12w (bottom). This serves as a preliminary check for our models.

Computing action space distributions
A . Semi-independent (z = 0) STIAP Analytic potential Independent (z = 0) TIAP Note. Note that Name is the abbreviation for the corresponding model used throughout the paper.The time independent potential has a fixed shape and mass at all time steps, while the semi-independent potentials only have a constraint on the shape with evolving mass scaling.
In order to analyze the stellar streams accreted in the main halo, we work in their three-dimensional action space J. Actions can be calculated for independent stars as long as they are in a bound orbit in an integrable potential. Then the actions are related by a canonical transformation from the six-dimensional phase space consisting of position (x) and momentum (p) given by : for the th component of J, where are the separable coordinates and their conjugate momenta. Analytic expressions for actions are found for some potentials by solving Hamilton-Jacobi equations with the separation of variables (Binney & Tremaine 2011). For our axisymmetric conditions, we compute actions ( , , ) in cylindrical coordinates computed using Stäckel fudge approximation (Binney 2012).
is the total angular momentum given by (x × p) , and hence, independent of the potential model utilized while the other two actions are potential dependent. These actions are adiabatically invariant in a slowly evolving potential and should therefore show high clustering and stability at all times for independent streams in simulations where the potential is adiabatic. Clumping occurs because these stars are formed in a small phase space of localized dwarf galaxies or globular clusters before tidally stripping and forming streams (Helmi & White 1999;Sanderson et al. 2015) and the action space retains phase-space information of a stream progenitor under the absence of a potential varying merger. As these actions strongly rely on the Galactic potential, action space coherence of stellar streams is an excellent probe to identify the true potential of a galaxy (Sanderson et al. 2015). In other words, the model that produces the highest clustering and stable action space throughout the history of a stellar streams is closest to the true potential of a galaxy and hence the best model. Ideally, a good model should also be able to accurately model the actions during mergers even though satellite interactions affect the DM distribution and perturb the potential (Garavito-Camargo et al. 2019;Garavito-Camargo et al. 2021). Lilleengen et al. (2019) tested whether they could constrain the gravitational potential of an Auriga halo with a similar method. They fit a potential to the simulated galaxy, varied one potential parameter, and calculated the spread in actions of an accreted group of star particles, expecting the minima of the standard deviation to be at the true parameter. This method did not reproduce the fitted potential and the actions of the star particles were not constant In this paper, We compute the actions for all star particles in each stream for a time period of 7 -13.8 Gyr (301 time snapshots) for the four models using particles from the actual N-body simulation and their true phase-space information at each time step. The alternative approach would be to take the initial conditions of particles from a stellar stream and integrate their orbit in time. Although this approach is far more complex, as reproducing trajectories in a smooth potential model ignores smaller scale perturbations on the orbits. Fig. 6 plots the stellar streams in action space (and -) for m12i (left), m12f (center), and m12w (right) along with the position space ( -) in insets (top right corner in first row) at present day for the TEMP model discussed in Sec. 2.4. Actions are highly clustered for each stream individually for both m12i and m12f. The stream colors correspond to the total stellar mass as shown in the color bar in Fig. 6. We note that more massive streams (green in Fig. 6) have a wider spread in action space. The few streams present in m12w are less massive compared to the other simulations and lack enough star particles to comment on the action space coherence.
Some of the bound star particles considered within the potential models can have unbound orbits (total energy greater than zero), which would not have defined actions. Therefore, the fraction of stars with defined actions should also be maximized for each potential model.

ACTION SPACE COHERENCE: THE KULLBACK-LEIBLER DIVERGENCE
We first analyze the action space of stellar streams for our simulations visually, to identify if the actions are clustered and stable for our models. To statistically quantify the degree of clustering and stability of the action space in a time series analysis we utilize the Kullback-Leibler divergence (KLD) (Kullback & Leibler 1951;Sanderson et al. 2015). KLD is a statistic that compares how similar two probability distributions are for continuous random variables. The KLD from a distribution ( ) to ( ) is given by where ( ) and ( ) are two independent probability distributions. Note that this statistic violates commutativity, i.e., KLD( : ) ≠ KLD( : ), which should be taken into account when comparing distributions. In our case, the distribution is the action space and since it is discrete and dependent on the number of stars, we estimate KLD via Monte Carlo integration to convert the above integral to a sum. Since we know both the membership information of stars and their actions J calculated using a potential Φ, we compute the KLDs individually for each stream for the action space J ( , , ) distribution and add them to give where is the number of stars in the th stream, is the total number of streams at that time snapshot. (J) is the probability distribution of the 3D action space of the th stream, while (J) is the background distribution used for the comparison estimated using density estimators or analytic functions. It should be noted that these quantities are dynamically evolving as the number of streams increases with time and the probability distribution changes. For two identical distributions, or two discrete samples drawn from the same distribution, the KLD is 0; the higher the value of the KLD, the greater the difference between the distributions.
In this work, we use the KLD in two ways: (i) to compare each action space distribution to a uniform distribution, which measures the degree of clustering (referred to as the uKLD), and (ii) to compare the action space distributions of neighboring snapshots in time, which quantifies the stability of the distribution over time (referred to as the nKLD. The KLD is calculated in nats, i.e., information per star particle on a logarithmic scale. Sanderson et al. (2015) provide a more detailed background on the KLD statistic and also describe the framework of our analysis.
To interpret these two statistics, we compare them with the amount of information present in each stream, calculated using a third version of the KLD called the mutual information (MI), which is the KLD of a distribution with the product of its marginal distributions (Sanderson et al. 2015) (refer to §3.3). Comparing this number to the change in the KLD with time allows us to interpret the loss of information in the distribution in terms of the orbital decoherence of stars in streams. These numbers allow us to define the best model of the potential as the one with (i) the highest degree of clustering (the highest uKLD value) and the greatest stability in action space (the A . lowest nKLD value) relative to the amount of information present in the streams (MI). We track the time evolution of KLD values with lookback time (i.e., the time before the present day defined as T LB = T snap − T z=0 ).

Clustering in time-evolving Action Space
To quantify the degree of clustering in the time-evolving action space, we pick (J) in Eq. 12 to be a uniform distribution U ( ) in each action from its maximum to its minimum value across all time steps for all the simulations (Reino et al. 2020). The distribution (J) is a function of phase space ( ) ≡ {x ( ), v) ( )} at time for each stream and potential model (Φ( )): where , , are the three actions in an axisymmetric potential. We evaluate the uKLD at each snapshot from 7 Gyr ago to the present day, for each different potential model in each simulation, using the same uniform distribution. We calculate the action space probability density (J) of each stream with a kernel density estimator (KDE) and evaluate the background uniform distribution U ( )U ( )U ( ) analytically. The potential model closest to the true potential of the simulation produces the highest clustering in action space and hence will differ most from the uniform distribution, leading to a high value of the uKLD.

Stability of Action Space with time
We describe stability of our action space by calculating the nKLD: the KLD between the action space distributions produced by our model potentials at different times in the same simulation. nKLD compares the action space distribution of simultaneous snapshots and to present-day snapshots to quantify whether they significantly differ in time. Ideally they should remain the same.
For the time-dependent model (TEMP) we compare the action space distributions of two neighboring snapshots at times −1 and , {J[ ( −1 ), Φ( −1 )]} and {J[ ( ), Φ( )]}, where the J are calculated using the phase-space positions of the stars and the model potential Φ at each snapshot. As for the uKLD, we calculate nKLD for each stream individually, starting from the snapshot where the stream is first formed (Panithanpaisal et al. 2021). Since different streams form at different times, we sum only over streams that are present in both snapshots, ( ) and calculate a weighted sum such that each stream contributes equally to the result (Reino et al. 2020): (14) For the TI models (STIMP, STIAP, and TIAP), the potential is fixed at = 0, so we instead compare the present-day action space distribution, {J[ ( 0 ), Φ( 0 )]}, to the other snapshots: (15) In both cases, we use a KDE to calculate and evaluate the action space probability densities . If Φ changes adiabatically, the action space distribution should be stationary, and the nKLD should be 0 (Sanderson et al. 2017). Hence, for the nKLD, a low nKLD corresponds to a higher stability of action space for a given potential model, indicating a consistent orbit family for the stream stars.

Information content of streams
Mutual information (MI) is the amount of information that can be obtained from one random variable by observing the other variable. Incidentally, one can think of MI as a measure of correlations between variables that is independent of the extent of distribution. Thus MI is used to establish the amount of orbital information a stream holds in it's action space. We compute the MI for each individual stream by calculating the KLD between the action space distribution of each stellar stream at present day, J[ ( 0 ), Φ( 0 )] , and the product of the marginal distributions of all streams in the simulation ( shuf J[ ( 0 ), Φ( 0 )] ), also at present day: We obtain the product of marginal distributions by shuffling the actions with respect to one another, as in Sanderson et al. (2015), and use a KDE to determine the densities of both distributions. The MI thus describes the average amount of information stored in the action space of each stream about the global action space of all streams in a simulation. We can then compare the value of MI to the nKLD to measure the amount of decoherence in terms of the amount present in each stream. A higher nKLD than MI at any lookback time (i.e the time before the present day defined as T LB = T snap − T z=0 ) indicates a loss of one stream's worth of information between snapshots.
In this section, we discuss how accurately the multipole framework can accommodate perturbations from merging galaxies, and at what mass ratio the axisymmetry assumptions begin failing to represent the potential adequately during the merger. We define failure as the inability to use the model to calculate approximate actions at all for some subset of bound stars in the simulated galaxy. This is a narrow definition of model failure, since a model can easily succeed in calculating approximate actions for stars that are not even close to adiabatically invariant, and are not closely related to the properties of their orbits. To diagnose this, we then use the KLD to measure the potential model's ability to maintain adiabatic invariance in approximate actions for stellar streams, both in terms of orbit coherence (i.e. the overall position of each stream in action space) and in terms of stream coherence (i.e. the size of the clump of stream stars in action space). Using these metrics, we compare the performance of the multipole approach, either time evolving or static, to the static, parameterized potential models currently popular for modeling the MW.

Multipole power spectrum
We start by examining the time-evolving multipole model for all the simulations. For each simulation, the expansion coefficients are calculated independently at each time snapshot at a range of galactocentric radii. Fig. 7 shows the absolute values of the expansion coefficients (color bar) Φ ℓ,0 ( ) (eq. 5) for even poles (ℓ ≤ 4) in the m12i (top), m12f (middle), and m12w (bottom) as a function of radius and time. For all the simulations, the ℓ = 0 mode represents the spherically averaged mass profile of the galaxy, which is relatively stationary in time. The amplitude for ℓ = 0 mode falls exponentially as a power law index of -0.4 with radius at the present day. On the other hand, higher poles show a wider variation between simulations, reflecting the differences in structure at the intermediate scales connected with LMC-like mergers. In m12i, which has primarily low-mass accretion at late times, the amplitude in the higher modes is evenly distributed in radius, with only some small perturbations. On the other hand, the merger information is clearly encoded in the higher poles for both the simulations with high-mass accretion (m12f and m12w ). We overplot the pericenter positions (red and purple) of the most massive merging subhalo[s] on Fig. 7. For m12f and m12w we see that the pericenter positions exactly coincide with the high-order perturbations present in the power spectrum (white regions). This behavior is seen in both m12f and m12w for ℓ ≥ 2 poles, suggesting the ability of the higher poles to accommodate the merger's effect on the overall mass distribution even though the model is constrained to be axisymmetric.
Mergers with the lowest PMR (see Table 1) have the largest relative amount of amplitude in the ℓ ≥ 2 poles. As expected, the merger with the highest PMR, in m12i, shows no significant influence on the model. Conversely, the mergers in m12f and m12w with much lower PMR, produce significant changes in the power spectrum (Fig. 7). The amplitude in the higher poles of the potential model changes quickly in the region of the merger. While the merger in m12w is most rapid and leads to a complete disruption in a short time span of 0.3 Gyr, merger 1 in m12f takes about 2 Gyr, gradually varying the potential until it completely merges at 12.3 Gyr. Hence, this merger takes time to tidally strip stars and change the global potential profile of the Galactic halo. The timescale of the merger is longer compared to characteristic dynamical times at radii of interests, and to the timescale at which snapshots are saved, and thus the multipole expansion can more easily adapt at each time step to reflect the true potential.
This behavior is similar to the Gibbs phenomenon observed in Fourier series decomposition (Gibbs 1898). The Gibbs phenomenon states that modeling a rapidly changing, discontinuous function using Fourier series (another orthogonal basis) leads to large oscillations and overshooting around the jump discontinuity that cannot be removed by adding higher poles. The merger in m12w and merger 2 in m12f show this type of behavior: the Galactic potential profile has a rapid perturbation due to the merging halo. As the timescale of the merger approaches the time between snapshots, the perturbation approaches a discontinuity from the perspective of the potential model. This provides a fundamental limit to efforts to reproduce the evolving potential from a finite series of snapshots. On the other hand, the basis expansion has relatively little trouble when the merger is more slowly evolving, as in merger 1 in m12f, since the function remains relatively smooth throughout time and space.
A useful global model for the MW potential would ideally be able to determine approximate actions for stars throughout the galaxy that reasonably describe their orbits. To quantify the first part of this requirement, in Fig. 8 we track the fraction of stars that are gravitationally bound to the main halo in each simulation (i.e. those for which actions should be defined) for which the axisymmetric BFE (specifically the TEMP model at = 0) can indeed determine defined actions. If the fraction drops below 1, it implies there are a substantial number of stars that the model potential incorrectly considers unbound, and hence cannot calculate their actions. In most cases, our BFE successfully calculates an action for every star, except in the merger in m12w, which has the lowest TMR and PMR. The fraction of stars with defined actions decreases around the merger time and falls to 0.6 at 8 Gyr, essentially implying a loss of nearly half of the the orbital information in the stars. We ascertain that this is due to the failure of the model to accommodate the merger by plotting the positions of the stars with undefined actions (magenta and yellow) in m12w along with the Galactic disk shown in Fig. 9 at 7.95 Gyr A .
(close to the time of the merger). The stars with failed actions are around the same radius as the perturbing satellite, which implies a failure to represent the potential localized near the perturber. These include both stars that are part of the merging satellite (yellow) and stars in the main halo at the same radius (magenta). This arises from the axisymmetry condition imposed on the potential model, as the perturber's mass is evenly distributed over a ring instead of being localized, which makes the kinetic energy of these particles to exceed the approximate potential energy. This effect is dominant for the massive satellite in m12w while the potential approximation with the symmetry condition works reasonably for m12i and m12f with less massive satellites. We conclude that, at least as far as producing defined actions, a low-order axisymmetric BFE can successfully accommodate mergers with TMR 10 and PMR 3. In §4.3 and §4.4 we examine whether the actions usefully describe stellar orbits. While our BFE can usually accommodate a merger except in extreme cases, this is a bigger challenge for parameterized models. In the BFE the basis is chosen to automatically satisfy the Poisson equation, but for a parameterized model it is far more complex to add a second set of components to the potential with a moving center, since in order to be self-consistent the model must solve the Poisson equation at all times. The multipole expansion and the parameterized models both perform well for simulations with no significant mergers, but when a relatively massive galaxy is near the host's center (TMR > 10 and/or PMR > 3) the multipole method is superior to parameterized models. Even the TEMP models, which are highly effective in modeling Sgr-like mergers, do not perform as well for LMC-like mergers.

Action Space coherence
The actions of stars in tidal streams should both be adiabatically invariant and remain significantly clustered for potential models that are good representations of the true potential of a halo. To test how well the BFE performs relative to a parameterized model, and whether it is important to model the time dependence in order to preserve the action space properties of streams, we use the various models we developed (Table  3) to calculate the actions of star particles in stellar streams in each simulated system at each snapshot from 7 Gyr ago to the present day. We then examine the clustering and stability of the action space distribution as a function of time.
We visually analyze the action space by making videos (Arora 2022) of the action distribution for our simulations over the 7 Gyr period, using the actions computed in each of the potential models in Table 3. These videos are posted at https://drive.google.com/drive/folders/ 1t2dLAQx4X5H7WOrUhX9nyrf2_pdgD2ga?usp=sharing and released as part of Arora (2022). Fig. 10 shows three representative time frames (columns) of two independent actions ( versus ) for m12i (1 st row) and m12f (2 nd row), computed using the TEMP model. The frames represent a time just before the PMR = 8 merger in m12f (10.52 Gyr, left column), during the merger (10.82 Gyr, middle column), and > 1 dynamical time after the merger (12.64 Gyr, right column). Note that in both simulations, a new stream forms in the time between the middle and final frames shown. The action space distributions for both m12i and m12f at the end of the simulation (13.7 Gyr), about 1 Gyr (or roughly another ∼ 1 − 2 dynamical times in the halo) after the final frame in this figure, are shown in Fig. 6.
The actions for individual streams, marked by points in a single color for each stream, start out clustered in both galaxies, then visibly spread out in the case with the merger (m12f, bottom row) to a far greater extent than in the case without a merger (m12i, top row). The gray and red streams in m12f in particular show substantial decoherence in action space. The centroids of the individual action space clusters corresponding to the streams also move rapidly between snapshots in the video of m12f during the merger. By the time the merger is finished (bottom right) the actions are permanently diffused and scrambled relative to their configuration prior to the merger. On the contrary, in m12i the actions of each stream's stars remain substantially more clustered around relatively stable centroids over the entire period of the (video).
The decoherence of action space in m12f is clearly present in all four potential models, as expected given the low TMR and PMR. On the other hand, m12i videos show action space coherence at most times for all our models. Sample action spaces under different models for m12i and m12f are posted in Appendix . .2. In Sec. 4.3 and 4.4), we quantitatively compare the degree of clustering and stability across potential models and simulations.

Action space coherence of streams
We quantify the degree of clustering of the action space for various potential models by calculating the KLD relative to a uniform-background distribution, as discussed in section 3.1, at each time step for all the potential models. A higher KLD value represents a more clustered action distribution. We refer to this quantity as the uniform-Background KLD or uKLD. Fig. 11 shows the uKLD for the four potential models, for m12i (left) and m12f (right). The orange points correspond with the STIMP (Section 2.4.2), red and green points correspond with TIAP and STIAP (Section 2.4.1) models, respectively, while the blue points correspond to the TEMP model (Section 2.4.2). The discrete jumps in the uKLD correspond to the addition of new stellar streams as they form; their tidal-disruption times as defined in Panithanpaisal et al. (2021) are marked with a dark gray vertical line. In the case of m12i, the uKLD for the TEMP model is systematically Figure 7. Absolute value of amplitude |Φ ℓ, | as defined in Eq. 5 (color bar) at all radii as a function of time for the TEMP model for all the simulations. The TMR of these halos can be found in Table 1. The pericenter positions of the merging satellites are shown in red and purple in m12f (middle) and m12w (bottom). Due to axisymmetry, only the even poles exists with = 0 components. Amplitude decrease as a function power law of radius with index ∼ −0.5 for the zeroth pole which has all the information of the total mass of the halo. Higher poles have relatively low and uniformly distributed amplitude for m12i (top). In m12f and m12w, the pericenter position of the merging satellites (red and purple) strongly coincides with the radial locations with higher amplitude in the spectrum for the duration of the mergers for ℓ = 2, 4. This strongly suggests that even with axisymmetry constraints, the TEMP model with higher poles does well in modeling a dual COM system such as merging satellites. lower by about 3-4 nats compared to the TI models (except TIAP), until all the stellar streams are added. We note that this disparity arises from the mass ratio scaling at earlier time steps. We compute the actions ( and ) of a particle of mass , and (x,v) at each time step using the model from the present day and then scaling them with the mass ratio at each time step. At earlier times, this ratio is lower than 1 which leads to an overall convergence in action values that appear as higher degree of clustering. As the halo stops growing and the mass ratio is close to 1, this disparity disappears after 11.5 Gyr and the KLDs become comparable with a difference of less than 1 nats between models.
These differences are substantially less than the information contributed by a single stream, which is between 8 and 10 nats, and can be attributed to numerical integration of the uKLD. Thus we conclude that the action space has a high degree of clustering for all the models in m12i, which has no significant interactions during the time period. For such a scenario, a model with a fixed shape and adiabatically increasing mass appears sufficient to preserve the clustering; indeed, the A .  STIMP model produces the highest clustering in the action space at all times for this simulation, although the TEMP model performs comparably to it once all of the streams are added.
The uKLD for m12f (Fig. 11, right panel) at first follows the same trend, where TEMP has a lower KLD value of about 3-4 nats until after a stream is added following the merger at 7.2 Gyr. The purple vertical lines and shaded band mark the times of mergers m12f-1 and m12f-2. Merger m12f-1 is long-lived and has a very small pericenter: the shading of the purple band corresponds to the distance of the merging galaxy from the host galaxy, with darker shades representing smaller distances. In all the models, the uKLD decreases substantially around the merger times. After the first dip in the uKLD at 7.2 Gyr (merger m12f-2), its value increases and stabilizes in all of our models. Subsequently, all the models have a relatively similar uKLD until the next merger (m12f-1), which lasts about 1.5 Gyr between 10.8 and 12.3 Gyr. A sudden dip in the uKLD is again apparent at the first pericenter of the merger, at 10.8 Gyr (darker shading). The uKLD then increases again as the merging satellite recedes from the host (lighter shading) and is highest when the merging halo is at the apocenter. Then as the satellite turns around, the uKLD decreases again more steeply than before, with a minimum at 12.2 Gyr. This second pericenter is closer than the first, and results in the complete tidal disruption of the satellite. Despite the difficulty in modeling the system when the merger is near pericenter, all models can produce a coherent action space for stellar streams with a similar degree of clustering, but time dependence and the flexibility of the potential model play a bigger role in this case than in the relatively stable m12i simulation: the STIAP (green) and TIAP (red) models lose the most information around the mergers (about 10 nats). Nonetheless, this example indicates that the BFE can indeed be used to model an SMC-mass merger, not only where the merging satellite is relatively far from the main halo, but even during and after a pericenter passage. Furthermore, although a stream is added to the action space distribution during (indeed, as a result of) the merger, the uKLD at the end of the merger has decreased to roughly its pre-merger value again, essentially representing the loss of one stream's worth of information in action space as a result of the perturbation to the potential. This holds even after the merger is complete and the axisymmetric model should in principle be a good representation of the potential again. Note that example action spaces for all models are posted in Appendix .2 and Fig. 14 and 15 for m12i and m12f respectively, at two time steps: 13.8 Gyr and 8 Gyr. uKLD changes can be seen in the clustering of streams in this space.

Orbital coherence
Even though our potential models produce action space clustering, the uKLD analysis does not capture the stability of the positions of the streams in action space, which represents the degree to which the median or parent orbit of the stream remains consistent. In a functional global model of the Figure 10. Action space ( vs. ) of stellar streams present in m12i (row 1) and m12f (row 2) found using the TEMP model. The three columns represent the time before the merger, during the merger, and after the merger in m12f at 10.52, 10.82 , and 12.64 Gyr, respectively. During the merger, the clustering of actions decreases in m12f (middle column). After the merger, the actions have been modified after the interaction (in m12f) and never retain their original values. While m12i has no such loss of clustering and instability in action space throughout different time frames. Note that a new stream forms in the third frame for both m12i and m12f. The complete time-evolving action space videos are posted at https://drive.google.com/drive/folders/1t2dLAQx4X5H7WOrUhX9nyrf2_pdgD2ga?usp=sharing for all the models and simulations.
Galactic potential, the actions should also be clustered under the assumption that the global potential profile is adiabatic, and the stream clusters should therefore have stable positions in action space over cosmic time. The stability of the action distribution over time is thus a measure of the degree to which the quantities that we are calculating are true actions, in the sense that they reflect the stationary properties of the stellar orbits, or relatively arbitrary clustered quantities without a clear connection to orbits.
To measure the stability of the action space distribution over time, we calculate the KLD between the action space distributions for pairs of neighboring snapshots in time for TEMP and KLD between each snapshot and present-day snapshot for TI models, as described in Section 3.2, again using the actions computed for star particles in streams using each of our candidate potential models. We refer to this quantity as the neighboring KLD or nKLD. The snapshots are spaced by 25 Myr or less; i.e., much less than one dynamical time for the orbits in question. Thus, a low nKLD between snapshots corresponds to a small amount of nonadiabatic variation in the parent orbits of the streams, while a large nKLD indicates that the parent orbits are changing nonadiabatically. In Fig. 12 we show the nKLD for all three potential models for m12i (left panel) and m12f (right panel). The higher nKLD values of the TI models indicate a lack of stability of its action space relative to the TEMP model (blue) in both simulations. We expect the nKLD of the TI models to increase with lookback time, as can be seen in Fig. 12, and this is true until about 2-3 Gyr in the past. Beyond this point, the nKLD oscillates around a roughly constant value, indicating complete decoherence of action space associated with a potential model fit to the present-day snapshot. In this regime, the action space of an individual snapshot calculated using a static model could still be clustered (as can be seen in Fig. 11), but the values of the actions can no longer be interpreted as information about the parent orbit of a given stream at these earlier epochs. Conversely, the TEMP model preserves a relatively constant, low nKLD even beyond a 7 Gyr lookback time. Realistically, we cannot construct such a model from only present-day information about the MW, but here we can use it to quantify the A . Figure 11. uKLD measures the action space clustering for streams in the simulations m12i (left) and m12f (right) from 6.8-13.8 Gyr. The two semi-TI models (orange and green) have systematically higher uKLD (a few nats) than the time-evolving model (blue) until all the stellar streams have formed. While the completely TI model (red) has the lowest uKLD. Discrete jumps in the uKLD are due to the formation of each new stream by tidal disruption; individual formation times are marked with gray vertical lines. While all the models produce a highly clustered action space around the present day, a drop in uKLD, indicating less action space clustering, can be clearly seen for m12f during both the mergers (purple line and purple shaded region). The prolonged merger in m12f is marked using a vertical purple band with the shading reflecting the proximity of the satellite to the Galactic Center (darker shades indicate smaller pericenter distance). The degree of clustering, and hence the accuracy of the potential model, decreases as the merging halo approaches the main halo. Fig. 3 shows the exact pericenter distances for the two mergers in this simulation.
information loss in TI models, to understand how far back in time we should expect a static model to produce actions that are connected to orbits. For our TI models, the action space is completely decoherent before 2.8 Gyr and 1.5 Gyr ago for m12i and m12f, respectively. By this point, TI models have lost almost two-thirds of the information present, as the nKLD value increases from less than 1 to almost 3 for the TI models. Also as expected, the static potential models for m12i preserve a connection between actions and orbits to earlier lookback times than for m12f. For m12f, the nKLD value peaks at around 1.5 Gyr, the time at which the merger in that galaxy is complete, for all of our potential models. Orbits prior to the merger are not recoverable in the TI models, as the nKLD values oscillate randomly between 1 and 3.
To interpret the nKLD as information loss, we compare it to the MI, defined in Section 3.3, in the action space distribution of each individual stream at the present day. This number quantifies the degree of correlation between actions that is expected for stars in a particular stream, and can be interpreted as the amount of information contained in one stream. The horizontal bands in Fig. 12 show the range containing 95% of the MI values over all the streams in m12i and m12f for the TEMP (blue) and TI (green) models, respec-tively. As one might expect, the TEMP model preserves this correlation slightly better than the TI models do, and we see that each stream contains about 1.5-2.25 nats of information per star. When the nKLD approaches this number, it is as if about one stream's worth of information has been lost in the time between two snapshots (Sanderson et al. 2017). This occurs at somewhat smaller lookback times in both models than the time when the nKLD begins to saturate at complete decoherence: at about 0.8 and 0.5 Gyr ago for m12i and m12f respectively, for both TI models. Also as expected, the nKLD values for the TEMP models never exceed the MI of a single stream, with some exceptions very far in the past where few streams were yet formed.
All our models perform relatively well around the present day. The insets in both panels of Fig. 12 show the nKLD for the last ∼ 20 snapshots, including the final 10 up to the present day, which are spaced 2 Myr apart. Here, the TI models perform comparably to the TEMP model, although at larger lookback times the nKLD for the TEMP model is bounded while the nKLD for the two TI models diverges. The divergence near the present day is steeper in m12f because of the ongoing merger in that simulation.
We conclude that although the TI models can produce action spaces in which streams are clustered at relatively large Figure 12. nKLD measures the stability of action space as a function of time for the simulations m12i (left) and m12f (right). For the TI models we compare the action space distribution at each snapshot to the present-day action space distribution, while for the TEMP model we compare the action space distributions of neighboring snapshots in time. For the TI models, the action space is completely decoherent (the nKLD is saturated) before 2.8 Gyr. For comparison, the shaded horizontal bands show the 95% range of MI computed for individual streams using the TEMP model (blue) and TI model (green), respectively. The time where our KLD value first exceeds the MI range, corresponding to an information loss equivalent to an entire stellar stream, is shown with a red dashed vertical line. lookback times, these actions cannot be connected with properties of orbits beyond a relatively short window into the past, since the clusters have shifted nonadiabatically from their present-day positions in action space. The TI models can represent the Galactic potential sufficiently well to preserve orbital properties furthest back into the past in cases with no large mergers, such as the m12i simulation. For these cases, the window in which TI modeling is sufficient appears to extend about 1.6 Gyr into the past. This window is shorter for more complicated potentials that include significant mergers: it reduces to less than 1 Gyr for m12f and fails completely for m12w.
Interestingly, given their single-origin expansion basis, the TEMP model demonstrates an exceptional ability to model potentials with a dual COM, as can be seen in Fig. 7. Problems only arise where the PMR approaches 2-indicating that at pericenter the merger is behaving like a 2:1 mergeror where the TMR is less than 10. Even in those cases, the major issue is with the axisymmetry assumptions, which fail to find actions for stars where this assumption is extremely wrong. This usually occurs only for a short time right around pericenter and for a limited number of halo stars, as shown in Fig. 8. However, the BFE methods still work well when these symmetry assumptions are relaxed, at the cost of an increase in the number of coefficients.

CONCLUSIONS
In this paper, we test different methods of galactic potential modeling by comparing how well they preserve stream orbital properties for a set of tidal streams created from disrupted satellite galaxies in zoomed cosmological-baryonic simulations. We compare an axisymmetric, time-evolving BFE potential model with more traditional static and/or parameterized models. Specifically, we use three TI potential models: an analytic potential model fit at the present day (TIAP), a TIAP model with a mass ratio scaling (STIAP) at all time steps, and a low-order ℓ = 4 multipole model fit at the present day (STIMP) assuming axisymmetry with a mass ratio scaling; and a time-evolving low-order ℓ = 4 multipole model (TEMP) where the expansion is computed at every time step in the simulation, also assuming axisymmetry. We also check the validity of our techniques for galaxies with a variety of merger histories and determine where even the best potential method breaks down.
We find that the multipole expansion can accurately model potential varying mergers with satellites such as the LMC and Sgr dSph by using the higher pole orders of the expansion, even when restricted to axisymmetry. The low-order axisym-A .
metric expansion preserves orbit structure and action-space coherence for stars in stellar streams for satellites that have the total merger ratio (the ratio between the total mass of the main halo and merging halo at the first pericenter) greater than 10, or a pericenter merger ratio (the ratio between the mass of the main halo enclosed within the pericenter position of the merging halo to the mass of the merging halo at the first pericenter) greater than 1 and have properties such as low radial velocities (order of 10 km/s) and pericenters outside the scale radius of DM halo. A good observational example of such a merger is the Sgr dSph which is thought to have a TMR greater than 10 (Niederste-Ostholt et al. 2010).
When the model does break down and lose significant orbital information, it is mainly due to a failure in the axisymmetry assumption and could be evaded by relaxing the symmetry, which however leads to added difficulties in computing actions. Interactions such as the ongoing merger of the LMC and MW, which has an estimated TMR of 7.6 (Peñarrubia et al. 2015;Vasiliev et al. 2021), are thus not expected to be well modeled by an axisymmetric low-order multipole with a single expansion center.
All types of models we tested could preserve the clustering of stream stars in action space. In cases with a large merger (by the above criteria) the degree of clustering decreases only temporarily during the merger. However, only the TEMP model demonstrated long-term orbital stability, defined as the consistency of the positions of the action-space clusters. All TI models demonstrate a total loss of information as early as 1 Gyr in the past for the simulations with no prominent mergers and about 0.7 Gyr in the past in the presence of even a Sgr-like merger, less than the dynamical times for the orbits of stream stars ∼ 1 − 3 Gyr. Orbits for objects in the MW halo projected outside this time window are likely to diverge significantly from reality.

APPENDIX
In this appendix, we present additional figures for reader's reference. Appendix .1 compares the true central accelerations with spline accelerations for the three Cartesian axes and Appendix .2 compares the action spaces for all the potential models at two different time steps: (i) (13.8 Gyr) the present day, (ii) (8 Gyr) a time far before present day for m12i and m12f.
.1. Spline fits for COM acceleration Fig. 13 plots the central accelerations (green) computed using numerical differentiation along with the spline acceleration (black) (refer to section 2.3), computed from the spline fits on the COM position along each Cartesian axes for m12i. Note that actual central accelerations are jittery, oscillatory, and large while the spline accelerations are relatively lower. Therefore, we use the spline centering technique to fit our potential models, which makes the overall system less non-conserved in comparison with the actual accelerations. These jitters in actual accelerations are caused due to active star formation near the center (Orr et al. 2021). Figure 13. The three Cartesian accelerations (green) of the COM position along with the spline accelerations (black) calculated using the smooth cubic spline fits for each coordinate axis for m12i. The actual acceleration is jittery, highly oscillatory, and large while the spline values are significantly smooth and lower in comparison. The spline fit COM positions do not adversely affect the action space with fictitious forces arising due to jittery COM and only depend on the potential model. . Fig. 14 and 15 plot the action space ( vs ) of stellar streams at 13.8 Gyr (Column 1) and 8 Gyr (Column 2 and 3) for the two multipole potential models described in Sec. 2.4.2 (row 1) and the two analytic models described in Sec. 2.4.1 (row 2) for m12i and m12f respectively. The insets on the top right plot the position-space distribution in the X-Z plane. All the models produce highly clustered actions at the present day; however, multipole models can be seen to have higher clustering at 13.8 Gyr for m12i(red and gray streams) while it is hard to visually comment on the degree of clustering incase of m12f. At 8 Gyr, semi-TI models produce more clustering, which is an artifact of scaling the actions with the mass. The mass ratio at earlier snapshots is much less than 1, and multiplying this scaling with actions increases the degree of clustering; as it scales downs the action values. Note the difference between TIAP and STIAP models at 8 Gyr as shown in Fig. 14 and 15. For e.g. the STIAP action space is computed by scaling the TIAP action space with the mass ratio which inherently increases the clustering. The relative positions of action clusters change significantly for m12f from 8 and 13.8 Gyr due to a potential varying satellite merger, while the positions remain consistent for isolated galaxies with no mergers in the case of m12i.