Dynamical model of the Milky Way using APOGEE and Gaia data

We construct a dynamical model of the Milky Way disk from a data set, which combines Gaia EDR3 and APOGEE data throughout Galactocentric radii between $5.0\leq R\leq19.5$ kpc. We make use of the spherically-aligned Jeans Anisotropic Method to model the stellar velocities and their velocity dispersions. Building upon our previous work, our model now is fitted to kinematic maps that have been extended to larger Galactocentric radii due to the expansion of our data set, probing the outer regions of the Galactic disk. Our best-fitting dynamical model suggests a logarithmic density slope of $\alpha_{\rm DM}=-1.602\pm0.079_{\rm syst}$ for the dark matter halo and a dark matter density of $\rho_{\rm DM}(R_{\odot})=(8.92\pm0.56_{\rm syst})\times 10^{-3}$ M$_{\odot}$ pc$^{-3}$ ($0.339\pm0.022_{\rm syst}$ GeV cm$^{3}$). We estimate a circular velocity at the solar radius of $v_{\rm circ}=(234.7\pm1.7_{\rm syst})$ km s$^{-1}$ with a decline towards larger radii. The total mass density is $\rho_{\rm tot}(R_{\odot})$=$(0.0672\pm0.0015_{\rm syst})$ M$_{\odot}$ pc$^{-3}$ with a slope of $\alpha_{\rm tot}$=$-2.367\pm0.047_{\rm syst}$ for $5\leq R\leq19.5$ kpc and the total surface density is $\Sigma(R{_\odot}, |z|\leq$ 1.1 kpc)=$(55.5\pm1.7_{\rm syst})$ M$_{\odot}$ pc$^{-2}$. While the statistical errors are small, the error budget of the derived quantities is dominated by the 3 to 7 times larger systematic uncertainties. These values are consistent with our previous determination, but systematic uncertainties are reduced due to the extended data set covering a larger spatial extent of the Milky Way disk. Furthermore, we test the influence of non-axisymmetric features on our resulting model and analyze how a flaring disk model would change our findings.


INTRODUCTION
Dynamical models are important for our understanding of galaxies. They describe the distribution of stars over orbits in a gravitational potential (Binney & Tremaine 1987). Hence, we can describe a galaxy as stars orbiting in a smooth gravitational potential by interpreting the combined stellar position and velocity information (e.g. Rix et al. 1997;Syer & Tremaine 1996;Binney & McMillan 2011;Cappellari 2008). This allows us to infer the gravitational potential, the circular velocity curve, mass distribution and dark matter density of a galaxy (e.g. . For most external galaxies dynamical models usually suffer from degeneracies due to limited data. Since only line-of-2000; Cappellari et al. 2006;van den Bosch et al. 2008) and made-to-measure models (e.g. Syer & Tremaine 1996;de Lorenzi et al. 2009;Dehnen 2009;Portail et al. 2017;Wegg et al. 2015). For this paper we will use the Jeans modelling approach. Jeans models are based on the Jeans equations (Jeans 1915(Jeans , 1922, which are derived from the steady-state Boltzmann equation under the assumption of axisymmetry. The steady-state collisionless Boltzmann equation needs to be satisfied by the distribution function of the system, which describes the position and velocity of the stars in equilibrium and steady state under the gravitational influence of a smooth potential (Binney & Tremaine 1987). The solutions of the Jeans equations describe the moments of the velocity distribution, and the density of a collection of stars in a gravitational potential. However, to obtain a unique solution one has to assume the shape and orientation of the velocity ellipsoid. Cappellari (2008) reviews the possible natural choices for this alignment, which are prolate spheroidal coordinates, spherical coordinates, and cylindrical ones.
Previously, we used the Gaia DR2 kinematics to construct an axisymmetric dynamical model of the Milky Way disk (Nitschai et al. 2020, hereafter Paper I). There we used the new spherically-aligned Jeans Anisotropic Modelling (JAM sph , Cappellari 2020) method, since the Gaia data (Hagen et al. 2019;Everall et al. 2019) showed that the velocity ellipsoid is closer to being spherically aligned than cylindrically (JAM cyl , Cappellari 2008). But we also compared the results against JAM cyl and found negligible differences.
The known deviations from equilibrium and axisymmetry for the Milky Way (Widrow et al. 2012;Antoja et al. 2018;Gaia Collaboration et al. 2018a) which are in conflict with our model assumptions, are not uncommon for other galaxies too, but models are still able to recover the average kinematic properties (Cappellari et al. 2013), even from less precise data. In the case of the JAM method, it has been used to model integral-field stellar kinematics of large numbers of external galaxies (Cappellari 2016). It has been tested against N-body simulations (Lablanche et al. 2012;Li et al. 2016) and in real galaxies against CO circular velocities (Leung et al. 2018), including barred and non-perfectlyaxisymmetric galaxies similar to the Milky Way. In both cases, it recovers unbiased density profiles, even more accurately than the Schwarzschild (Schwarzschild 1979) approach (Leung et al. 2018), which was also confirmed (Jin et al. 2019) by using Illustris cosmological N-body simulation (Vogelsberger et al. 2014). Hence, we expect that JAM will be able to capture the main kinematic features of the Milky Way and give accurate results for the total density and circular velocity, even more since we use the spherical alignment, JAM sph .
In this paper we build upon PAPER I, by extending the maximal Galactocentric radius of the data from ∼ 12 kpc to ∼ 20 kpc. This allows us to extend our model to the outer parts of the Galactic disk and to better constrain the model parameters. Furthermore, we improve our model by including the uncertainties of the kinematic maps at each position. The extended data set also allows us to test our model when using a flared outer disk, as suggested by e.g. Gyuk et al. (1999); Momany et al. (2006); López-Corredoira & Molgó (2014); Li et al. (2019).
The outline of this paper is as follows: we describe the data set and the derived kinematic maps with their uncertainties in Section 2. In Section 3 we describe the components of our mass model for the Galaxy and our modelling method. In the end, we show our results for this extended data set and the investigation in the effect of the non-axisymmetries in Section 4, and we also show the results for a model with a flared disk. Finally, in Section 5 we summarize our results. In the Appendix we include further tests for our model.

DATA
In this paper we combine two data sets, one are giant stars from Gaia early data release 3 (EDR3) with radial velocities (Gaia Collaboration et al. 2021) and one are the red giant branch (RGB) stars from APOGEE and Gaia . This combined data set will allow us to reach out to large Galactocentric radii.

Data set
Gaia provides precise parallax information locally but further away from the Sun the distance uncertainties dominate. In addition, it is known that there is an parallax offset in the Gaia Data Release 2 (DR2) (Lindegren et al. 2018;Zinn et al. 2019;Schönrich et al. 2019) and also in Gaia EDR3 a parallax bias has been discovered (Lindegren et al. 2021a). There has been an attempt to map the dependencies of the parallax bias, but since the dependencies are non-trivial it is not possible to have a definitive correction (Lindegren et al. 2021b).
In this work we update the data from PAPER I using Gaia EDR3 (Gaia Collaboration et al. 2021). The important updates for our study is the higher precision in the astrometry in EDR3, however the radial velocities have not been updated and have been adopted from DR2. In addition, EDR3 does not have extinction or reddening values, hence we cross matched EDR3 with DR2 using the given crossmatch catalog by Torra et al. (2021) in order to get these values. For distances we use the photogeometric distance estimates by Bailer-Jones et al. (2021) for the Gaia EDR3 in order to avoid the parallax bias and high uncertainties further away from the Sun.
The selection criteria we apply are the same as we already had in PAPER I. Since giant stars are the main contribution in Gaia at distances larger than 1 kpc from the Sun and can be measured out to large distances due to their brightness, we  select only them in this subsample. They are selected based on their absolute magnitude, M G = m G − 5 · log d + 5 − A < 3.9, given the extinction by Gaia DR2 and d being the distance from Bailer-Jones et al. (2021), their intrinsic colour (G BP − G RP ) 0 > 0.95, with the reddening taken from Gaia DR2 and parallaxes with relative uncertainty / > 5.
These data are precise only around the Sun but not far beyond.
In order to probe larger Galactocentric distances Hogg et al. (2019) (Skrutskie et al. 2006) and WISE (Wright et al. 2010). These data allow us to probe larger distances, out to Galactic radii of ∼ 20 kpc, than with Gaia data alone.
These RGB stars where selected with a cut at log g < 2.2, in order to have stars more luminous than the red clump. For more details on quality cuts for these stars see Hogg et al. (2019).
In order to extend our data set to larger Galactocentric radii, we combine the data set based on Gaia EDR3 with that from Hogg et al. (2019). For the combined data set we require that the proper motion and line-of-sight veloc-ities of all stars are finite and that they are in the area of −30 • < φ < 30 • . For sources that are in both data sets we prefer the values from Hogg et al. (2019) for our further calculations.
The remaining stars from the Hogg et al. (2019) data are 34, 180 and 2, 823, 509 stars that are only in Gaia. In total we have 2, 857, 689 stars that will be used in the further analysis.

Kinematic maps
In order to calculate the kinematic maps of the Milky Way, we transform the positions and velocities of the stars in our combined data set, into a Galactocentric reference frame.
We assume as distance to the Galactic centre R = 8.178 kpc (Gravity Collaboration et al. 2019), a vertical displacement of the Sun from the midplane of z = 0.02 kpc (Joshi 2007) and as solar velocities in cylindrical Galactic coordinates (U , V , W ) = (−11.1, 247.4, 7.2) km s −1 (from Schönrich et al. 2010;Gravity Collaboration et al. 2019;Reid et al. 2009, respectively). We require the data to have |z| ≤ 7 kpc because we have more stars towards positive z values, which are not covering the whole area and creating gaps in our data. Hence, to not have a data set that is highly asymmetric and not continues, we exclude from the beginning stars with too high z values. These stars that are excluded would anyways not be considered in our further analysis, since they are halo stars and lay outside of the disk z Figure 2. Uncertainty maps for the combined data. Similar to Figure 1, the colors represent the uncertainty value for each Voronoi bin.
In the top panels are again the velocities and in the bottom panels the velocity dispersions. From left to right are the velocities and velocity dispersions in R, φ and z. Inside the red boxes are the errors that we will use in calculating the χ 2 of our JAM model and the black lines are the stellar density (Section 3.1.1) contours.
region we want to model. We then divide our data in 200 pc × 200 pc cells and do Voronoi binning, using the VORBIN package 1 (Cappellari & Copin 2003) on this cells, requiring a minimum of 5 stars for each Voronoi bin. We use the Voronoi binning on our already divided into 200 pc × 200 pc data, to fully make use of the data, especially towards larger radii and z where the stars are not as dense and we would not get more than 5 stars in a cell. With that method we can combine the data from cells with not enough stars in them, but still keep as smallest bin the 200 pc × 200 pc cell in the areas where we have numerous stars and not have too tinny bins. For each bin we calculate the mean and standard deviation of the velocities. The final number of bin is 2, 899 and covers a volume with extreme cylindrical coordinates 0.18 kpc ≤ R ≤ 19.5 kpc and -5.02 kpc ≤ z ≤ 6.8 kpc.
The created velocity maps of the combined data set in cylindrical coordinates are shown in Figure 1. As expected the stars around the solar position extend to larger Galactocentric heights z than at larger or smaller radii. However, thanks to the Voronoi binning we get a continuous covered area up to a radius of 19.5 kpc without any gaps in the kinematics.
It is worth mentioning that the selection function of Gaia DR2  and specifically the RV sample ), since we only select stars in EDR3 that have radial velocities in DR2 and for APOGEE (Bovy et al. 2014;Mackereth & Bovy 2020) will not affect our model significantly, since we study the kinematics of the observed stars, at a given position. The stellar tracer density is assumed, not fitted and the selection function at a given position does not depend strongly on the velocities of the objects.
To find the uncertainties for our kinematic maps, we choose to use the bootstrapping method (e.g. Efron & Tibshirani 1993). We selected this method and not error propagation because it provides a more robust estimate of the uncertainties. The propagated uncertainties from radial velocities and proper motions measurements tend to underestimate the true uncertainties. In detail, we draw 100 random samples of stars with their velocities and redo the Voronoi binning, calculating again for each of these samples the mean velocity and the velocity dispersion for each bin. Afterwards we find the uncertainties of the bin by calculating the standard deviation of the 100 mean velocities and velocity dispersions.
The uncertainty maps are shown in Figure 2. Errors around the Sun and the midplane are much smaller (≤ 4 km s −1 ) than at larger heliocentric distances, because there are more stars detected and their measurements are more precise since they are closer to us.
The uncertainties are also higher towards the Galactic Center. In this area, starting at 5 kpc, the bar dominates the kinematics (Wegg et al. 2015;Bovy et al. 2019) and hence would break the axisymmetry assumptions of our model. To avoid that, we exclude data with R < 5 kpc. Additionally, we also exclude stars with |z| ≥ 2.5 kpc, which is the same range we had in PAPER I, in order to exclude halo stars. Similar to PAPER I, we ignore the halo distribution since it is not well known and in this regime we expect to mainly have disk stars. The red boxes in Figure 1 and 2 show the area we will investigate and it has a 1, 404 bins, the data outside the boxes is excluded from the further analysis.

Mass distribution of the Milky Way model
To construct a dynamical model we need an estimate of the tracer distribution and the mass density of the Galaxy. We will assume the same model for the Milky Way as in PAPER I, a detailed description and explanation can be found there and we give a brief summary in this section. In addition to this model we also investigate a model with a flaring disk in Section 3.1.2.

Distribution with an exponential disk model
Our main stellar model makes use of the Jurić et al. (2008) disk stellar distribution, normalized to the local luminosity density from Flynn et al. (2006). However, we ignore the stellar halo component because in the area we are probing with our data set, the stellar mass distribution is dominated by the disk component.
The disk is decomposed into a sum of two exponential components, the thin and the thick disk with different scale lengths (L thin , L thick ) = (2.6, 3.6) kpc and heights (H thin , H thick ) = (0.3, 0.9) kpc (Jurić et al. 2008): where f = 0.12 is the thick disk normalization relative to the thin disk (Jurić et al. 2008). In addition, we combined that with an axisymmetric bulge (McMillan 2017; Bissantz & Gerhard 2002). The luminosity density profile of this model is shown in the top panel of Figure 3.
We approximate this stellar model with a multi-Gaussian expansion (MGE, Emsellem et al. 1994;Cappellari 2002), using the MGE fitting method (Cappellari 2002) and MGEFIT software package 2 , which is necessary in order to apply the JAM model (Cappellari 2008(Cappellari , 2020. For the mass density of the Galaxy we need to include also the gas and dark matter contribution. For the gas component we add the H 2 and H II distribution from McMillan (2017). For simplicity we ignore the hole in the center of the density profile, since the hole is outside of the range we are probing. In PAPER I we have shown that this does not change the dynamical model significantly. Like for the stellar model, we use the MGEFIT package (Cappellari 2002) to fit the created image of the gas density. This MGE will be added to the potential density and be kept fixed during the model fit, to the quoted mass by (McMillan 2017).
As a dark matter halo we assume a generalized Navarro, Frenk and White profile (gNFW, Wyithe et al. 2001): where: r s the scale radius, α DM the dark matter slope and q DM the axial ratio. If α DM = -1, the profile represents the classical Navaro, Frenk, and White (NFW) dark matter profile (Navarro et al. 1996). The dark matter profile is a one-dimensional profile and we fit this with Gaussians using the MGE FIT 1D routine from the MGEFIT package. It can be made oblate or prolate, for use with the model depending on the value of q DM .
The model assumptions cause systematic uncertainties that strongly depend on the values we choose for the different parameters but also from the model itself. We have tested the effects of the disk parameters in PAPER I for the circular velocity and total density, in order to get an understanding of these uncertainties.

Distribution with a flared disk model
We know that the outer part of our Galaxy, beyond 15 kpc, is warped and flared, but the details of its shape are still uncertain Bland-Hawthorn & Gerhard (2016). These structures were first detected in the gas component of the disk (Kerr 1957;Oort et al. 1958;Grabelsky et al. 1987;May et al. 1997), but can also be detected in the three dimensional distribution of stars (e.g. Liu et al. 2017;Anders et al. 2019;Skowron et al. 2019).
A flared disk means that the vertical scale height grows with increasing radius. The stellar flare in the outer Galactic disk has been observed and the scale height modeled as a function of the radius by many studies for the outer disk (e.g. Gyuk et al. 1999;Alard 2000;López-Corredoira et al. 2002;Yusifov 2004;Momany et al. 2006;Reylé et al. 2009;López-Corredoira & Molgó 2014;Li et al. 2019).
Since, we are probing a large range of Galactic radii up to ∼ 20 kpc, our data extends to the regime where the flare is visible. Hence, we investigate, in addition to our main exponential model described in the previous subsection, how a flared disk would change our results.
Even though, there are many studies of the flare there is not one parametrization for it. We decided to base the flare we use on the work of (López-Corredoira & Molgó 2014), because they give a parametrization for the scale height for the thin and thick disk. In detail, we keep the density profile the same as for our main model and only modify the scale heights, H thin , H thick , for larger radii: Since the flare starts only at larger radii, we assume the scale height also for the thin disk to be constant up to the solar position. For R < R H thin = H thin (R ) and for R < R ft H thick = H thick (R ft ). We use as values for the scale heights at solar position, the values from Jurić et al. (2008), H thin (R ) = 300 pc and H thick (R ) = 900 pc. See Figure 3 bottom panel for the density profile with flare.

MGE with negative Gaussians but non-negative density everywhere
An MGE with only positive Gaussians cannot describe a density distribution like the one in Figure 3. A good fit can be obtained, for a restricted spatial extent, allowing for negative Gaussians (keyword negative=True in MGE FIT SECTORS of the MGEFIT package of Cappellari 2002). However, we verified that in this case the resulting MGE has strongly negative density at larger radii and produces clearly nonphysical results when used with JAM. The inability of constraining the density to be positive, and the need to test for the positivity of the total density, is a general problem when allowing for negative Gaussian in MGE fits.
To obtain an acceptable MGE fit to the flared disk, while guaranteeing a positive total density, we needed to make some modifications to the MGE FIT SECTORS procedure. For this, in addition to the usual positive Gaussians basis functions in the MGE, we introduced a set of new basis functions, . The x-axis is the radial direction, R, reaching up to ± 35 kpc away from the Galactic Center at (0, 0) kpc and the y-axis is the displacement from the midplane, z, ranging up to ± 15 kpc. The colormap is the logarithmic luminosity density. which we produced by co-adding four positive and negative Gaussians with the same absolute peak value, in such a way that their sum is everywhere positive as follows: Here, σ and q are two parameters that change for different MGE components (like the corresponding parameters for the MGE Gaussians), while f = 0.9 is a constant f 1, which is the same for all basis functions. With this definition, the new MGE basis function is G(R, z) = 0 along the R or z axes, and presents four bi-symmetric maxima, as illustrated in Figure 4.
The new basis function G(R, z) has an asymptotic behaviour at large radii G ∼ exp(−r 2 ) for r → ∞, as the standard MGE, while along the axes G ∼ R 2 for R → 0 and G ∼ z 2 for z → 0. This implies that, while it can describe decreasing profiles towards the centre, it cannot approximate drops of the central density profile steeper than ρ ∝ r 2 . For this reason, it can only approximate the flared density of Figure 3 down to a certain isophote.
Apart from the introduction of the new basis function, the fit is obtained using the same algorithm described in Cappellari (2002), while still enforcing positivity constraints for all basis functions. Once the fit converges, all the 4-Gaussian basis functions are split into their four individual Gaussians components in output, producing an MGE which is composed of both positive and negative Gaussians. In this way, no further changes are required to use the produced MGE model with other software like JAM.
The new MGE has 49 Gaussians, but these come from 28 basis-functions, some of which are constructed from groups of 4-Gaussians, which are fitted as single basis-functions. This new implementation of the MGE fit is/will be included in the new release of the MGE package (footnote 2).

Modelling
As dynamical model we use JAM sph approach 3 (Cappellari 2020), which is based on the solution of the axisymmetric Jeans equations and assumes a spherically aligned velocity ellipsoid (Bacon et al. 1983;Bacon 1985). The spherical alignment has proven to describe the Gaia data in the outer halo (Wegg et al. 2019) and in the disk region (Hagen et al. 2019;Everall et al. 2019). This was also confirmed by the results in PAPER I.
In Figure 1 we have drawn a red square around the data we will use for our model: we only fit the axisymetric inplane kinematics (v φ , σ r , σ φ , σ z ). Neither vertical motion, nor tracer distribution are fit. Due to the steady state assumption of our model, v R and v z are assumed to be zero and therefore they do not need to be fitted. The deviations from zero in the data due to non-axisymmetric signatures (Antoja et al. 2018;Gaia Collaboration et al. 2018a) are small enough for the purpose of our model.
The features which we do not include in our stellar distribution of the Galaxy are the bar in the inner region and warp in the outer disk since both signatures are non-axisymmetric and we cannot model them with JAM. In particular, the bar dominates within 5 kpc from the Galactic Centre (Wegg et al. 2015;Bovy et al. 2019), but as previously mentioned (see Section 2.2, cut at 5 kpc) we remove all the stars in the central area for our model. For the outer disk one would need to add a measurement of the warp, z w (R, φ), (e.g. Li et al. 2019) to the stellar distribution. However, since our model is axisymmetric, we cannot add this to our model. In future works one could investigate if a shift in z direction would be a possible parametrization of the warp for an axisymmetric model like ours.
For our modelling we keep the scale radius of the dark matter halo, r s , constant at 14.8 kpc ), since we know that it is not constrained well with our JAM model (PAPER I). In addition, 14.8 kpc agrees with literature values ranging from (∼ 10 to 20 kpc). Further, because our data are limited to the disk region, they are not strongly constraining the dark matter axial ratio. Hence, we keep q DM fixed to 1.3 , which they derive using globular clusters. See Appendix B for a model with free q DM .
The standard model has 7 free parameters: (i) the inner logarithmic slope of the dark matter halo (α DM ); (ii) the dark matter fraction within a sphere with radius R (f DM ); (iiivi) the velocity dispersion ratios or anisotropies (σ θ /σ r and σ φ /σ r ) for both the flattest (q MGE < 0.2) Gaussian components (subscript 1) of the MGE and the rest (subscript 2); and (vii) the mass-to-light ratio of the stellar component in the

Exponential disk model
Using the data and uncertainties described in Section 2 we perform MCMC fits, for the Milky Way distribution without flaring as described in Section 3.1.1. The best fitting model, gives a χ 2 ∼ 9.2 which is much larger than 1 that we want. This is mainly because there are systematic uncertainties due to non-axisymmetric features, spiral structure, the equilibrium assumption of JAM and the choice of the tracer distribution, which our model cannot take into account. This explains also why we get very small statistical errors for our free parameters. The systematic errors would dominate here and we can use this result to get an estimate of the systematic uncertainties, which are otherwise impossible to determine.
To get more reasonable errors, we check the precision to which our model explains the data. To do this we add quadratically a value in km s −1 to our bootstrapped errors: Where errors s.,i such that we get a χ 2 = 1 for each component i = (v φ , σ R , σ φ , σ z ). The added values for this model are (6.03, 5.01, 6.55, 3.32) km s −1 respectively. With these higher errors for the data we do a new MCMC fit and the best-fitting values are listed in the second column ('Model') of Table 1. However, the statistical uncertainties from the posterior of our the best-fitting parameters are only formal errors, which are quite small also because of the large number of stars that we use and systematic uncertainties dominate. To estimate our systematics we investigate the axisymmetry assumption of our model in Section 4.2 and we give the systematic errors as half of the difference between minimum and maximum of all our models with an exponential disk without flaring.
The posterior distribution is shown in the Appendix A Figure 9 and the JAM model is shown in the second column of Figure 5.
The circular velocity is plotted in Figure 6 and has a value of (234.7 ± 0.3 stat ± 1.7 syst ) km s −1 at the solar radius. The slope of the circular velocity curve between 6.2 kpc and 20.2 kpc, which is the region were it can be approxi- 234.7 ± 0.3stat ± 1.7syst 234.6 ± 0.4stat ± 1.7syst 234.6 ± 0.24stat acirc[ km s −1 kpc −1 ] −1.78 ± 0.05stat ± 0.34syst −1.69 ± 0.08stat ± 0.34syst −2.39 ± 0.05stat NOTE-The uncertainties given in this table as statistical errors are derived from the posterior distributions and are the formal errors. For the estimate of the systematic uncertainties we also used the values from Table 2 in Section 4.2 and we estimate them to be half of the difference between minimum and maximum of all our models with an exponential disk without flaring. mated by a straight line, is declining at (−1.78 ± 0.05 stat ± 0.34 syst ) km s −1 kpc −1 . This is consistent with the results of Eilers et al. (2019). A test to investigate the small offset between our and Eilers et al. (2019) measurements of the circular velocity can be found in the Appendix C. In Figure 6 we also show the contribution of the different components to the total circular velocity according to our best fit. However, one has to note that because the covariance of the massto-light ratio, the dark matter fraction, and the dark matter slope (see Figure 9), which is expected, it is not the only possible decomposition. One could decrease the dark matter contribution while increasing the stellar component and still get the same total result, which is what we constrain. The smaller than NFW (α DM = −1) dark matter slope also increases the dark matter contribution towards smaller radii inside R . This makes the dark matter contribution more dominant for our circular velocity at small radii in comparison to other works (e.g. Eilers et al. 2019, and what we see in Appendix C Figure 11) where usually a NFW profile is assumed. Hence, we only tightly constrain the total circular velocity and not the different components of it, which depends on the assumed stellar and dark matter density profiles. One should also note that even though we use giant stars, we do not take the asymmetric drift into account, since we use the full Jeans equations with JAM. The circular velocity is a result of the gravitational potential that we get from our best fit to the individual velocities.
Additionally, for the best fitting model the dark matter density at the solar radius is ρ DM (R ) = (0.00892 ± 0.00007 stat ± 0.00056 syst ) M pc −3 and the total density ρ tot (R ) = (0.0672 ± 0.0006 stat ± 0.0015 syst ) M pc −3 . The total density logarithmic slope in the range of 5 ≤ R ≤ 19.5 kpc is α tot = −2.367 ± 0.007 stat ± 0.047 syst . All statistical errors are the formal errors from the posterior distribution and do not include the systematic uncertainties that would increase the errors.
Comparing these results to our previous work (PAPER I), the JAM model agrees with our previous results and all parameters agree within the respective errors. Specifically, the total density, and in particular on the total slope and dark matter fraction, have smaller uncertainties and are more reliable, because we have a longer radial baseline, and more data. The discrepancy between the total logarithmic density slope of our best fitting model and that from PAPER I can be explained due to the different radial range probed (3.5 ≤ R ≤ 12.5 kpc) and the different r s (20 kpc) assumed for the dark matter halo. If we constrain our slope to a similar range as in PAPER I, our value would decrease to −2.278 ± 0.007 stat ± 0.047 syst and would get even smaller for a scale radius of 20 kpc. So in a similar range our finding for the total density slope is also consisted with that of PAPER I.
In addition, we also have calculated the total surface density as a function of radius by analytically integrating the MGE Gaussians. The integral of one Gaussian, between the cylindrical radii R 1 and R 2 and within the height range |z| < z max is given by the following expression This integral was performed for all the MGE Gaussians for the stellar, gas and dark matter components for -1.1 kpc ≤ z ≤ 1.1 kpc for direct comparison with . The resulting surface density is shown in Figure 7.
Our surface density is a little bit lower for radii smaller than R but agrees within the uncertainties at the solar radius, Σ(R , |z| ≤ 1.1 kpc) = (55.5 ± 0.5 stat ± 1.7 syst ) M pc −2 . Moreover, Piffl et al. (2014) give for the solar radius Σ(|z| ≤ 0.9 kpc) = (69 ± 15) M pc −2 and our value of Σ(R , |z| ≤ 0.9 kpc) = (51.1±0.4 stat ±1.7 syst ) M pc −2 lies within their 3σ range. Since also our total surface density agrees within the uncertainties with previous findings, it gives us another confirmation that our model constrains well the total density, circular velocity, and the potential, and only the exact decomposition into the different components (dark matter, stellar and gas), which is not the scope of this work, is not tightly constrained. Further, we also performed a fit allowing the 18 Gaussians in the MGE model of the stellar component, to have a different anisotropy value, (σ θ /σ r ) and (σ φ /σ r ). This model has a total of 39 free parameters and the ones that are not the anisotropies are listed in column 'Free anisotropies' in Table 1. This model gives a better fit to the data, since it has more parameters but all 'interesting' parameters agree within the 3σ error range with our main model.

Assessing systematic errors due to non-axisymmetries
Our Galaxy contains non-axisymmetric features due to the Galactic bar (e.g. Monari et al. 2016;Khoperskov et al. 2019), spiral arms (e.g. Reid et al. 2019;Eilers et al. 2020), the warp of the disk (e.g. Vázquez et al. 2008;Li et al. 2020), or interactions and mergers with satellite galaxies (e.g. Helmi et al. 2018;Koppelman et al. 2019). We now investigate how the model assumption of axisymmetry affects our results by dividing the data set into four sectors, depending on positive or negative z and φ: If the Milky Way was perfectly axisymmetric, our axisymmetric model should give the same results within the errors for each of the four sectors. Hence, we now fit for each sector separately and compare the different results, in order to estimate how much the non-axisymmetric features of our Galaxy influence our model and estimate the systematic uncertainties of the free parameters.
For these fits we use the same errors as for the fit to all data points. The resulting circular velocities are plotted in Figure 6 as fainter and thinner blue lines and all the fit parameters are listed in Table 2. The results show us that there are some differences depending on which part of the Galaxy we model, however most of the free parameters are within the 3σ range of each other and with the total model. Higher discrepancies exist between α DM , which are outside the 3σ range of the statistical errors.
We can use the discrepancies between the parameters for the different sectors to get an estimate for the systematic uncertainty of our model because of non-axisymmetric features in the data. The mean values in Table 2 is the mean of the individual sectors and the errors are half of the differences between the maximum and the minimum of the individual sectors. The last column of the table is the total systematic errors = (max-min)/2, including the best fitting values from Table 1 without flaring and the model with free anisotropies. These systematic errors from the sectors and from all our models without flaring might be the same if the minimum and maximum value are from the four sectors.
Additionally, for the four sectors the mean dark matter density at the solar radius is , ρ DM (R ) = (0.00935 ± 0.00056 syst ) M pc −3 and the mean total density, ρ tot (R ) = (0.0654 ± 0.0015 syst ) M pc −3 . The mean total density logarithmic slope, α tot = −2.322 ± 0.047 syst for 5 ≤ R ≤ 19.5 kpc and the mean local surface density Σ(R , |z| ≤ 1.1 kpc) = (54.6 ± 1.7 syst ) M pc −2 . All the errors given here are the same systematic errors as in Section 4.1 using the four sectors, the best-fitting model, and the model with free anisotropies.

Flared Disk Model
In Section 3.1.2 we also explained how we can change our stellar distribution to account for the flaring of the disk. Using this distribution we test how that affects our model.
We use the same errors as before and the fit results are listed in Table 1 with their formal errors. The model result is shown in the third column of Figure 5 and the posterior distribution is shown in the Appendix A Figure 10. From the posterior distribution and the table one can notice that the statistical uncertainty of (σ θ /σ r ) 1 is an order of magnitude smaller than for the rest of the velocity dispersion ratios, even though the fit has converged properly. This confirms that the formal errors are not reliable since they are too small and the systematic uncertainties of the model would dominate.
From the χ 2 value one can see that the fit with flaring is not as good as without it, which could be due to our choice of flare parameters. Also, if we compare the residuals of the two models, one can notice that the residuals get higher away from the midplane for the model with flaring. On the other hand, around the midplane it seems to be similarly good or only slightly higher than the model without the flaring. The differences between the models with and without flaring can also be seen in Figure 8.
We the data as well, so we cannot assume that they have the same systematics and further study would be needed. As mentioned above, the flared model might give slightly worse fit to our data only because of the flare parameters we selected, since we just added a parametrization of the scale height, h(z), to our normal model without flaring. Additionally, to fully test if the flared model provides a better fit to the data, one would need to do a detailed study of the flare and test which parametrization for the scale heights is the best for the Milky Way disk. This is beyond the scope of this paper and we just want to get a first idea of how our model changes if we include a flare.

CONCLUSION
In this paper we construct a dynamical Jeans model of the Milky Way disk using the JAM sph code (Cappellari 2020).
The data we use are based on Gaia EDR3, combined with the Hogg et al. (2019) data set, which are combined APOGEE, Gaia DR2, 2MASS and WISE data with precise spectrophotometric distance estimates. The combined data set enlarges the range of Galactocentric distances that we cover. We probe an area in range of 5.0 ≤ R ≤ 19.5 kpc and −2.5 ≤ z ≤ 2.5 kpc.
Our main results are consistent with PAPER I within the uncertainties, but more reliable because of the more extended radial range of our data.
The best fit value for the dark matter slope is α DM = −1.602 ± 0.015 stat ± 0.079 syst which is smaller than −1 (NFW) and the dark matter density at solar radius is (0.00892 ± 0.00007 stat ± 0.00056 syst ) M pc −3 . These values agree with our previous work (PAPER I) and are consistent with other works for the dark matter slope which indicate that a steeper slope than NFW is needed for the disk 2) and the model with free anisotropie. The radial range (5 kpc to 19.1 kpc) of the surface density in this plot, is between the minimum and maximum radius of the data which we used for our best-fitting model (see Figure 5).   region (Portail et al. 2017;Cole & Binney 2017). The dark matter density at solar position is slightly smaller than previous works (McKee et al. 2015;McMillan 2017, PAPER I), which might be because we constrain the total density and the decomposition is more uncertain, similar for what we explained for the circular velocity. However, the values still agree in the 3σ range of the uncertainties. The circular velocity at the solar position is (234.7 ± 0.3 stat ± 1.7 syst ) km s −1 with a mild decline towards larger radii of a circ = (−1.78 ± 0.05 stat ± 0.34 syst ) km s −1 kpc −1 , which is consistent with the results from Eilers et al. (2019). The total density at the solar radius is ρ tot (R ) = (0.0672 ± 0.0006 stat ± 0.0015 syst ) M pc −3 and total density logarithmic slope is α tot = −2.367 ± 0.007 stat ± 0.047 syst for 5 ≤ R ≤ 19.5 kpc and −2.278 ± 0.007 stat ± 0.047 syst for 3.5 ≤ R ≤ 12.5 kpc (as in PAPER I). All these values are consistent our previous work (PAPER I) within the errors. The total density slope is also consistent with the slope inferred from early-type disk galaxies (Cappellari et al. 2015) and the total solar density is again smaller compared to other works but agrees within the 3σ uncertainties of the work by McKee et al. (2015). Further, the local surface density is Σ(R , |z| ≤ 1.1 kpc) = (55.5 ± 0.5 stat ± 1.7 syst ) M pc −2 which is slightly lower than other findings (e.g. Piffl et al. 2014) but still within the uncertainties. Additionally, we also test how non-axisymmetries of the gravitational potential change our result.
We further investigate how a flared disk would change our results. It provides a similarly good fit but in the future one could perform a more thorough analysis of the parameter space to better constrain the flaring of the disk given its dynamical properties.
The regions which are not included in our model are the bar in the inner part and the warp in the outer disk. This features are non-axisymmetric and we cannot reproduce them with our model. We avoid the bar influence by excluding the region inside 5 kpc, but the warp is still effecting our model. Hence, one should note that the regions of our data strongly influenced by this feature (the outer disk) are not well reproduced with our model due to the axisymmetry assumption.
The kinematics used in this paper and the MGE components can be found as Supplementary data in the online version. The kinematic data can be used to recreated Figures 1, 2 and the best-fitting results of Table 1 while the Kinematics of the individual sectors were used to get the best-fitting parameters of Table 2. The MGE datasets are useful for Figures 1, 2, 5 and are described in Section 3. The full details are available in the package.

APPENDIX A. POSTERIOR DISTRIBUTION
The posterior distribution of the the best-fitting model without flaring (see Section 4.1) is shown in Figure 9. Additionally, Figure 10 shows the posterior distribution for the best-fitting model with a flared disk (see Section 4.3). We have also tested how the result changes with JAM cyl (Cappellari 2008) for the flared model and we see that the results are almost the same. The only difference is that the small formal errors of (σ θ /σ r ) 1 are getting of the same order as the other ratios and that χ 2 DOF is slightly smaller (χ 2 DOF = 1.07), which can be a coincidence since the flared model we use might be wrong, and the cylindrical alignment may happen to compensate for that.  Table 3.
What we see also from the posterior distribution, is that q DM tends to get high values and has a correlation with the mass-to-light ratio [(M * /L) V ]. Since, a too high q DM value is nonphysical we keep it fixed for our models in the main text.  (2019) and ours. We had already seen an offset in the previous calculation with the best-fitting JAM model in PAPER I.
To do that, we assume the same Milky Way model like in Eilers et al. (2019). We adopt a spherical Navarro-Frenk-White profile (Navarro et al. 1996), for the thin and thick disk we assume Miyamoto-Nagai profiles (Miyamoto & Nagai 1975), and for the bulge we assume a spherical Plummer potential (Plummer 1911), while adapting the parameter values of Pouliasis et al. (2017, model I). Furthermore, we use only the Hogg et al. (2019) data that satisfy |z| < 0.5 kpc or tan(z/R) < 6 • and and have more than 3 stars in each 200 pc × 200 pc bin. Finally, we assume R = 8.122 kpc (Gravity Collaboration et al. 2018), z = 0.025 kpc (Jurić et al. 2008) and as solar velocities in cylindrical Galactic coordinates (U , V , W ) = (−11.1, 245.8, 7.8) km s −1 (Reid & Brunthaler 2004).
Using the JAM cyl (Cappellari 2008) and the above assumptions we can reproduce almost perfectly the velocity curve by Eilers et al. (2019), see Figure 11 which is plotted as fig. 3  . Posterior distribution for the best-fitting JAM sph with an exponential disk model. This is the corner plot for the fit without a flared disk (see Section 4.1). The panels show posterior probability distributions marginalized over two dimensions (contours) and one dimension (histograms). The thick contours represent the 1σ, 2σ and 3σ confidence levels for one degree of freedom. The numbers with errors on top of each plot are the median and 16th and 84th percentiles of the posterior for each parameter (black dashed lines).
This, proves that the offset we have in our main model, is not caused by the modelling method but by the data set and the Galaxy model we assume. Small discrepancies are towards the Center where we do not have data and towards higher radii. The last one confirms our suspicion that the Pouliasis et al. (2017) model overestimates the mass in the inner parts and therefore underestimates the stellar mass towards larger radii to compensate it, which causes a high dark matter mass at larger radii.