The local dark matter density

J I Read

doi:10.1088/0954-3899/41/6/063101

1. Introduction

The local dark matter density (ρ_dm) is an average over a small volume, typically a few hundred parsecs¹, around the Sun. It is of great interest for two main reasons. Firstly, it encodes valuable information about the local shape of the Milky Way's dark matter halo² near the disc plane. This provides interesting constraints on galaxy formation models and cosmology (e.g. Dubinski 1994, Ibata et al 2001, Kazantzidis et al 2004, Macciò et al 2007, Debattista et al 2007, Lux et al 2012); on the merger history of our Galaxy (e.g. Lake 1989, Read et al 2008, 2009); and on alternative gravity (AG) theories (e.g. Milgrom 2001, Knebe and Gibson 2004, Read and Moore 2005, Nipoti et al 2007). Secondly, ρ_dm is important for direct detection experiments that hope to find evidence for a dark matter particle in the laboratory. The expected recoil rate (per unit mass, nuclear recoil energy E, and time) in such experiments is given by (e.g. Lewin and Smith 1996):

$\begin{equation} \frac{{\rm d}R}{{\rm d}E} = \frac{\tilde{\rho }_\mathrm{dm}\sigma _W |F(E)|^2}{2 m_W \mu ^2} \int _{v>\sqrt{m_N E/2\mu ^2}}^{v_\mathrm{max}} \frac{f({\bf v},t)}{v}{\rm d}^3 {\bf v} \end{equation} \tag{ 1 }$

where σ_W and m_W are the interaction cross section and mass of the dark matter particle (that we would like to measure); |F(E)| is a nuclear form factor that depends on the choice of detector material; m_N is the mass of the target nucleus; μ is the reduced mass of the dark matter–nucleus system; v = |v| is the speed of the dark matter particles; f(v, t) is the velocity distribution function; $v_\mathrm{max} = 533^{+54}_{-41}$ km s⁻¹ (at 90% confidence) is the Galactic escape speed (Piffl et al 2014); and $\tilde{\rho }_\mathrm{dm}$ is the dark matter density within the detector.

It is clear from equation (1) that the ratio σ_W/m_W trivially degenerates with $\tilde{\rho }_\mathrm{dm}$ . Thus, to measure the nature of dark matter from such experiments (in the event of a signal), we must have an independent measure of $\tilde{\rho }_\mathrm{dm}$ . This can be obtained by extrapolating from ρ_dm to the lab, accounting for potential fine-grained structure (Kamionkowski and Koushiappas 2008, Vogelsberger et al 2008, Zemp et al 2009, Peter 2009, Fantin et al 2011); I discuss this in section 2. We also need to know the velocity distribution function of dark matter particles passing through the detector: f(v, t). In the limit of small numbers of detected dark matter particles, this must be estimated from numerical simulations (section 2). However, for several thousand detections across a wide range of recoil energy, it can be measured directly (Peter 2011).

There are two main approaches to measuring ρ_dm. Local measures use the vertical kinematics of stars near the Sun—called 'tracers' (e.g. Kapteyn 1922, Oort 1932, Hill 1960, Oort 1960, Bahcall 1984b, 1984a, Bienayme et al 1987, Kuijken and Gilmore 1989c, 1989b, 1989a, 1991, Bahcall et al 1992, Creze et al 1998, Holmberg and Flynn 2000a, Siebert et al 2003, Holmberg and Flynn 2004, Bienaymé et al 2006, Garbari et al 2012, Smith et al 2012, Moni Bidin et al 2012, Bovy and Tremaine 2012, Zhang et al 2013). Global measures extrapolate ρ_dm from the rotation curve³ (e.g. Dehnen and Binney 1998a, Fich et al 1989, Merrifield 1992, Sofue et al 2009, Weber and de Boer 2010, Catena and Ullio 2010, McMillan 2011). More recently, there have been attempts to bridge these two scales by modelling the phase space distribution of stars over larger volumes around the Solar neighbourhood (Bovy and Rix 2013). The global measures often result in very small error bars (Catena and Ullio 2010; though see Salucci et al 2010 and Iocco et al 2011). However, these small errors hinge on strong assumptions about the Galactic halo shape—particularly near the disc plane (Weber and de Boer 2010). By contrast, local measures rely on fewer assumptions, but have correspondingly larger errors (e.g. Garbari et al 2012, Smith et al 2012, Zhang et al 2013). To avoid confusion, I will refer to results from global estimates that assume a spherically symmetric dark matter halo as an 'extrapolated' dark matter density, denoted ρ_{dm, ext}, while I will refer to local measures as ρ_dm. Combining measures of ρ_dm and ρ_{dm, ext}, we can probe the local shape of the Milky Way halo. If ρ_dm < ρ_{dm, ext}, then the dark matter halo at the Solar position R₀ ∼ 8 kpc is likely prolate (stretched) along a direction perpendicular to the disc plane. If ρ_dm > ρ_{dm, ext}, this could imply an oblate (squashed) halo, or a local dark matter disc (see figure 1). I discuss the theoretical implications of these different scenarios in section 2.

**Figure 1.** A schematic representation of local versus global measures of the dark matter density. The Milky Way disc is marked in grey; the dark matter halo in blue. Local measures—ρ_dm—are an average over a small volume, typically a few hundred parsecs around the Sun. Global measures—ρ_{dm, ext}—are extrapolated from larger scales and rely on assumptions about the shape of the Milky Way dark matter halo. (Here I define ρ_{dm, ext} such that the halo is assumed to be spherical.) Such probes are complementary. If ρ_dm < ρ_{dm, ext}, this implies a stretched or prolate dark matter halo (situation a, left). Conversely, if ρ_dm > ρ_{dm, ext}, this implies a squashed halo, or the presence of additional dark matter near the Milky Way disc (situation b, right). This latter is expected if our Galaxy has a 'dark disc' (see section 2).
Download figure:
Standard image High-resolution image

Measurements of ρ_dm have a long history dating back to Kapteyn (1922) who was one of the first to coin the term 'dark matter'. Using the measured vertical velocity of stars near the Sun, he compared the sum of their masses to the vertical gravitational force required to keep them in equilibrium, finding that:

'As matters stand it appears at once that this [dark matter] mass cannot be excessive.'

However, this early pioneering work treated the stars as a collisional gas, whereas stars are really a collisionless fluid that obeys similar but different equations of motion. This was corrected the same year by Jeans (1922), who laid down the basic theory for mass modelling of stellar systems that I outline in section 3. The technique was later refined and applied to improved data by Oort (1932), Hill (1960), Oort (1960), and Bahcall (1984a), 1984b). However, there were several problems with these early works: (i) their measurements relied on poorly calibrated 'photometric' estimates of the distances (section 3.6); (ii) stars were chosen that were sometimes too young to be dynamically well mixed in the disc (see section 3); (iii) populations were often assumed to be 'isothermal' with the vertical velocity dispersion constant with height (typically a poor approximation: Kuijken and Gilmore 1989a, Garbari et al 2011); and (iv) it was often unclear if the stars for which photometric density distributions could be estimated were the same stars for which the velocity distribution was measured (Kuijken and Gilmore 1989a, and see section 3). A key series of papers by Kuijken and Gilmore (1989a, 1989b, 1989c, 1991), improved on this by collecting an unprecedented amount of data, and compiling a volume complete sample of K-dwarf stars (that are particularly good for measuring photometric distances; section 3.6) towards the South Galactic Pole. A quarter of these had radial velocity measurements.

A further key improvement came with the Hipparcos satellite that launched in August 1989, providing positions and proper motions for ∼100 000 stars within ∼100 pc of the Sun (van Leeuwen 2007). It was a boon for the field, since prior to this only radial Doppler velocities and photometric distances were available. Several new measurements of ρ_dm using these new data followed (Creze et al 1998, Holmberg and Flynn 2000a, Siebert et al 2003, Holmberg and Flynn 2004, Bienaymé et al 2006).

Most recently, there have been a series of new measurements coming from new Galactic surveys—the Sloan Digital Sky Survey (SDSS; Smith et al 2012, Zhang et al 2013), and the RAdial Velocity Experiment (RAVE; Siebert et al 2008). These same surveys have recently found evidence for vertical density waves in the Milky Way disc (Widrow et al 2012, Williams et al 2013, Yanny and Gardner 2013), perhaps caused by the recent Sagittarius dwarf merger (Gómez et al 2013). This is something that may prove both a blessing and a curse for attempts to measure ρ_dm; I discuss this further in section 5.8.

All of the above measurements use stellar kinematics to probe the total Galactic potential near the Sun. To extract the local dark matter density from this, we must assume some weak field theory of gravity (to link the potential to the matter density; see section 2.1 and section 3), and we must subtract off the contribution from visible matter (i.e. stars, gas, stellar remnants etc). I call this from here on the baryonic matter density ρ_b. Estimates of this have also evolved with time, from an early estimate of ρ_b ∼ 0.038 M_⊙pc⁻³ ⁴ (Oort 1932) to the more modern value ρ_b = 0.0914 ± 0.009 M_⊙pc⁻³ (Flynn et al 2006). I discuss the latest constraints on ρ_b in section 3.5.

In addition to the above improvements in data, there has been a concerted push to better understand the model systematics that go into the measurement of ρ_dm. Early work by Statler (1989) and Kuijken and Gilmore (1989c) explored the effects of un-modelled coupling between radial and vertical star motions (see section 3), while tests on simple mock data drawn from an analytic Galactic model have been useful in determining the effect of errors due to measurement uncertainties and poor sampling (Kuijken and Gilmore 1991; Inoue and Gouda 2013; and see section 4). But a full test of methods on dynamically realistic N-body mock data has only come recently with Garbari et al (2011). This has exposed some rather surprising model biases that I discuss further in section 3 and section 4. Finally, new methods to combat such systematics are being developed (e.g. Garbari et al 2011, McMillan and Binney 2013) resulting in further new measurements of ρ_dm (Garbari et al 2012). I discuss these techniques in section 3 and compare and contrast the latest measurements in section 5.

A summary of measurements of ρ_dm from Kapteyn through to the present day is given in figure 2, where I mark also the latest limits on ρ_{dm, ext} from the rotation curve assuming a spherically symmetric dark matter halo (grey band⁵); all data and references are given in table 4. I discuss this figure in detail along with the latest constraints on ρ_dm in section 5.

With the successful launch of the Gaia satellite, measurements of ρ_dm are set to enter a golden age (e.g. Perryman et al 2001, Wilkinson et al 2005). There are significant challenges to be overcome (Rix and Bovy 2013, Binney 2013), but as has happened post-Hipparcos, Gaia will likely drive another step-wise improvement in the error bars on ρ_dm. I discuss this in section 5.9.

This article is organized as follows. In section 2, I discuss theoretical expectations for ρ_dm and its laboratory extrapolation $\tilde{\rho }_\mathrm{dm}$ in our current cosmology. In section 3, I present the key theory behind both local and global measures of the local dark matter density, with a particular focus on moment methods. In section 4, I present tests of different methods on simple 1D mock data, determining what quality and type of data best constrain ρ_dm. In section 5, I discuss historical measures of ρ_dm and summarise the latest measurements from different groups. I compare and contrast the advantages and disadvantages of different methods and data, and I assess where the key uncertainties remain. In section 5.9, I discuss how the Gaia satellite will transform our measurements of ρ_dm. Finally, in section 6, I present my conclusions.

2. Theoretical expectations for ρ_dm and its laboratory extrapolation $\tilde{\rho }_\mathrm{dm}$

Before discussing mass modelling theory and the latest results, it is worth a short digression to describe our theoretical expectations for ρ_dm (averaged over a few hundred parsecs), and its extrapolation to the dark matter density in the laboratory $\tilde{\rho }_\mathrm{dm}$ .

2.1. The cosmological model

Throughout this review, I will assume the 'standard' Λ Cold Dark Matter model, or ΛCDM, where the Λ refers to 'dark energy'—an apparent acceleration of the Universe at the present time. This is supported by a wealth of observational data. The cosmic microwave background radiation (Wright et al 1992, Planck Collaboration et al 2013); galaxy clustering (Croft et al 2002); baryon acoustic oscillations (Slosar et al 2013); and Type Ia SNe standard candles (Riess et al 1998, Perlmutter et al 1999) all point towards a cosmological model where the energy density of the Universe comprises just 5% in baryons (Ω_b); 27% in dark matter (Ω_dm); and 68% in dark energy ( $\Omega _\Lambda$ ). This is further supported by evidence for dark matter within galaxies and clusters from stellar/galaxy kinematics (e.g. Zwicky 1937, van der Kruit and Freeman 1984, Kleyna et al 2001, Adams et al 2012); stellar/gaseous rotation curves (e.g. Volders 1959, Freeman 1970, Bosma et al 1977, Bosma and van der Kruit 1979, Rubin et al 1980, van Albada et al 1985); and gravitational lensing (e.g. Walsh et al 1979, Clowe et al 2006).

The ΛCDM model has two unknown elements: dark energy and dark matter. The former appears to be consistent with a 'cosmological constant' that could result from vacuum energy, though this is far from established (e.g. Planck Collaboration et al 2013). The latter, we are better able to pin down. While it remains unclear exactly what dark matter is, it does appear to move as a collisionless non-relativistic fluid at least at the present time (Clowe et al 2006). AG theories like MOND (Milgrom 1983) and its relativistic extension TeVeS (Bekenstein 2004) face a host of observational challenges⁶ (e.g. Clowe et al 2006, Natarajan and Zhao 2008, Ibata et al 2011, Dodelson 2011). By contrast, dark matter as a collisionless fluid appears to give an excellent match to the growth of large scale structure in the Universe (e.g. Viel et al 2008). On smaller scales, there have been many claims of problems with ΛCDM, most notably the missing satellites and cusp-core problems. The former is a large discrepancy between the predicted and observed number of satellite galaxies around the Milky Way and M31 (Klypin et al 1999, Moore et al 1999); the latter is a discrepancy between predicted 'cuspy' dark matter density at the centres of dwarf galaxies (ρ = ρ₀[r/r₀]⁻¹) and observed constant density cores (ρ = ρ₀; Flores and Primack 1994, Moore 1994). Both problems have stood the test of time, with dark matter cores now being reported even within tiny dwarf spheroidals orbiting the Milky Way (e.g. Goerdt et al 2006, Walker and Peñarrubia 2011, Cole et al 2012). These small scale problems may be telling us something exciting about the nature of dark matter (e.g. Bode et al 2001, Rocha et al 2013) or inflation physics (e.g. Zentner and Bullock 2002). However, on scales below ∼1 Mpc 'baryon physics' (radiative cooling, star formation and feedback from stellar winds, supernovae and active galactic nuclei) become important. These difficult-to-model processes could physically reshape the dark matter at the centres of galaxies, solving the cusp-core problem without the need to resort to exotic cosmology (Read and Gilmore 2005, Mashchenko et al 2006, Pontzen and Governato 2012, Teyssier et al 2013). Such cored dwarfs are then much more easily tidally disrupted by the Milky Way (MW), plausibly solving the missing satellites problem too (Read et al 2006b, Zolotov et al 2012). I discuss this in more detail in section 2.3.

While dark matter is most likely some sort of collisionless fluid, it is not clear what it is made up of. Microlensing constraints from the Milky Way bulge and the nearby Large and Small Magellanic Clouds put an upper bound of the mass of 'compact object' dark matter of M_dm < 10⁻⁷ M_⊙ (e.g. Tisserand et al 2007). While no smoking gun, this and the other results above point towards dark matter being comprised of some new yet-to-be discovered weakly interacting particle that lies beyond the standard model of particle physics (e.g. Jungman et al 1996, Boyarsky et al 2009). The precise nature of this particle, however, remains elusive. It could be quite massive (∼ 10–1000 GeV), as predicted by some supersymmetric extensions to the standard model (e.g. Jungman et al 1996). This would make it non-relativistic at all times, so-called 'Cold Dark Matter' (CDM). However, other popular models like axions or sterile neutrinos predict a lighter particle (∼1–50 keV) that would be relativistic for a time in the early Universe (e.g. Boyarsky et al 2009), so-called 'Warm Dark Matter' (WDM). I focus on CDM in this review as it remains better-studied than WDM (see also the discussion in section 2.2), but note that WDM remains an exciting proposition that deserves to be more fully explored.

2.2. Dark matter only (DMO) simulations

In ΛCDM, structure grows via the hierarchical accretion of smaller sub-structures (White and Rees 1978). The process is highly nonlinear, requiring numerical N-body simulations to integrate the equations of motion (Dubinski and Carlberg 1991, Navarro et al 1996b, Stadel et al 2009, Springel et al 2008, Dehnen and Read 2011). Such simulations solve Newtonian gravity between N 'super-particles' on the background of an expanding Freedmann–Lemaître–Robertson–Walker metric. The super-particles have mass typically ≳ 10³ M_⊙ each and represent large smoothed patches of the collisionless dark matter fluid; they should not be confused with dark matter particles that are likely >10⁶⁰ of magnitude smaller in mass. This 'Newtonian approximation' is extremely good (Adamek et al 2013), and certainly more than adequate for calculating the phase space distribution function of dark matter in the Galaxy.

A detailed discussion of cosmological N-body simulations is beyond the scope of this present work (see e.g. Dehnen and Read 2011 and Kuhlen et al 2012b for reviews). Here, I simply note that the results from these simulations—at least for non-relativistic CDM—numerically converge on a well-defined asymptotic solution as the number of super-particles is increased (Heitmann et al 2007, Kim et al 2013, and see figure 3, panel (b)). In this sense, the results from these 'dark matter only', or DMO simulations as I will call them from now on, are robust. That said, problems still remain for simulations where there is a strong suppression in the small scale power spectrum, as in WDM simulations (Bode et al 2001, Avila-Reese et al 2001, Wang and White 2007, Hahn et al 2013). There, discreteness noise due to anisotropic force errors leads to the growth of spurious numerical substructures. A full solution to the problem remains elusive, though recent work shows promise. Hahn et al (2013) suggest a radical break from the standard N-body method by numerically modelling the folding of the dark matter phase sheet in phase space. Unlike standard N-body methods, they explicitly calculate the phase space distribution of the sheet by interpolating between particles. They then integrate over velocity to obtain the dark matter density field. The method shows great promise but becomes computationally expensive in regions where the sheet becomes highly foliated—i.e. at the centres of forming halos. Lovell et al (2014) propose a much less computationally expensive post-processing algorithm to prune spurious structures from standard N-body simulations. However, this can only remove surviving spurious structure, leading to the worry that already merged spurious halos may remain problematic.

**Figure 3.** Key predictions from dark matter only (DMO) cosmological simulations. (a) Projected density contours of the *Aquarius* Aq-A-1 DMO cosmological simulation of a halo of Milky Way mass (M₂₀₀ ∼ 10¹² M_⊙), run with 4.2 billion dark matter super-particles (Springel *et al* 2008). The size of the Galactic disc out to the Sun position R₀ = 8 kpc (not modelled in this simulation) is marked by the red horizontal line. (b) The spherically averaged dark matter density profile from the *GHALO* suite of Milky Way mass halo simulations (Stadel *et al* 2009). Four different resolutions (super-particle numbers) are marked, showing excellent numerical convergence. (c) The dark matter density Probability Distribution Function (PDF) in the *Aquarius* suite, calculated using a kernel average (64 smoothing neighbours) at each super-particle, normalized to a power law model fit over a thick ellipsoidal shell at 6–12 kpc from the halo centre (Vogelsberger *et al* 2009a). Simulations Aq-A-1 through Aq-A-5 (of decreasing numerical resolution, as marked) are over-plotted; only Aq-A-1 and Aq-A-2 resolve the high density tail due to subhalos. The black dashed line shows the intrinsic scatter due to Poisson noise in the density estimator. (d) The dark matter velocity PDF averaged over 2 kpc boxes at 7–9 kpc from the halo centre of Aq-A-1.
Download figure:
Standard image High-resolution image

2.2.1. Key predictions from DMO simulations

In this section, I summarize the key predictions, relevant for this review, from ΛCDM DMO simulations of Milky Way-mass Galactic halos. These are collated in figure 3.

The spherically averaged radial density profile

A first key prediction from DMO simulations is the spherically averaged radial density profile of dark matter halos. This is well-fit (at the ∼10% level; Merritt et al 2006, Stadel et al 2009) by a split power law that goes as roughly r⁻¹ in the centre and r⁻³ at the edge (Dubinski and Carlberg 1991, Navarro et al 1996b), the 'NFW' profile (see figure 3(b)):

$\begin{equation} \rho = \rho _0 \left(\frac{r}{r_s}\right)^{-1}\left(1+\frac{r}{r_s}\right)^{-2} \end{equation} \tag{ 2 }$

where r_s is a radial scale length; and ρ₀ is a density normalization. These are usually defined in terms of a 'concentration parameter' c = r₂₀₀/r_s; a 'virial radius' r₂₀₀; and a 'virial mass' M₂₀₀:

$\begin{equation} \rho _0 = \frac{200}{3} \frac{c^3}{\ln (1+c)-\frac{c}{1+c}} \rho _{\rm crit}; \end{equation} \tag{ 3 }$

where ρ_crit = 128.2 M_⊙ kpc⁻³ is the critical density of the Universe at redshift z = 0;

$\begin{equation} r_{200} = \left(\frac{3M_{200}}{4 \pi 200 \rho _{\rm crit}}\right)^{1/3} \end{equation} \tag{ 4 }$

is the 'virial radius' at which the mean enclosed density is 200 times ρ_crit; and M₂₀₀ is the 'virial mass'—the mass enclosed within r₂₀₀.

The NFW profile appears to be 'universal' in the sense that it gives a good fit to the full range of halo masses probed to date, from dwarf galaxy subhalos to giant galaxy cluster halos (Navarro et al 1996b, Springel et al 2008, Stadel et al 2009), though the physical reason for this universality remains to be fully understood (e.g. MacMillan et al 2006, Pontzen and Governato 2013).

Although there is significant scatter in r_s at a given M₂₀₀, there is a correlation between the two (Navarro et al 1996b, Bullock et al 2001, Macciò et al 2007). At redshift z = 0, Macciò et al (2007) find:

$\begin{equation} \log c = 1.02 [\pm 0.015]-0.109 [\pm 0.005] \left(\log \left[\frac{M_{200}}{1.47\,{\rm M}_\odot}\right] -12\right ) \end{equation} \tag{ 5 }$

with a scatter about this mean relation of σ_log c = 0.14 ± 0.013. Thus, for Milky Way mass halos (M₂₀₀ ∼ 10¹² M_⊙; Wilkinson and Evans 1999, Klypin et al 2002, McMillan 2011, Piffl et al 2013), we have r₂₀₀ = 210 kpc and $r_s = 19^{+7.5}_{-5.4}$ kpc at 68% confidence.

The shape of dark matter halos

DMO simulations also make predictions for the shape of dark matter halos, which are found to be triaxial (Dubinski and Carlberg 1991, Warren et al 1992, Navarro et al 1996b, Jing and Suto 2002, and see figure 3(a)). Consistent with earlier work, Macciò et al (2007) find a mean shape parameter 〈q〉 = (b + c)/2a ∼ 0.8 when averaged over the whole halo, where a > b > c are the long, intermediate and short axes of the figure. This corresponds to a typically prolate (egg-shaped) halo. Like the halo concentration parameter, 〈q〉 shows significant scatter at a given halo mass, slightly decreasing with halo mass (Macciò et al 2007). When not averaged over the whole halo, the shape parameter q is also a function of ellipsoidal radius (Jing and Suto 2002). An understanding of the expected distribution of halo shapes is important for ρ_dm when we try to extrapolate its value from larger scales (see figure 1), and when studying the expected scatter in ρ_dm at a given Galactocentric radius. I discuss this latter, next.

The local dark matter density

Defining the 'Solar neighbourhood' as a small volume around the Sun, we can use the above simulations to theoretically estimate ρ_dm for halos of Milky Way mass. The first and simplest analysis is to average ρ_dm in a spherical shell at the 'Solar circle', R₀ ∼ 8 kpc. Zemp et al (2009) perform this exercise for the high resolution 'VL-II' DMO simulation of a Milky Way mass halo, finding 〈ρ_dm〉_s = 0.01056 M_⊙ pc⁻³, which is remarkably close to that measured for the real Milky Way (see figure 2).

We can go further, however, and use the DMO simulations to estimate the expected scatter in ρ_dm. This is encoded in the dark matter density probability density function (PDF). Vogelsberger et al (2009a) calculate this at ∼8 kpc from the halo centre for the Aquarius suite of high resolution DMO simulations (figure 3(c)). They use a 'smoothed particle' kernel weighted density estimate calculated at the position of each super-particle in a thick ellipsoidal shell at 6–12 kpc from the halo centre. This is then normalized to a power law model fit to this same ellipsoidal shell. With this analysis, the resultant scatter in ρ_dm is remarkably small—consistent with the Poisson noise in the density estimator (black dashed line; figure 3(c)). (In other words, the scatter in ρ_dm is so small that they are unable to measure it above the intrinsic super-particle noise in the simulation.) However, this small scatter relies on the analysis being performed over ellipsoidal shells. Zemp et al (2009) perform a similar exercise for the VL-II simulation (Diemand et al 2007), but averaging ρ_dm over spherical volumes of radius 500 pc and normalizing to 〈ρ_dm〉_s. With this 'spherical' analysis, they find a scatter in ρ_dm of up to a factor of 2–3 within the 68% confidence interval of their density PDF. When averaging instead along just one axis of the triaxial halo figure, they find a small scatter similar to that reported in Vogelsberger et al (2009a). Thus, the scatter in ρ_dm reported by Zemp et al (2009) is entirely due to systematic differences in ρ_dm along the long, intermediate and short axis of the triaxial halo. If the Milky Way halo is triaxial and we allow the disc to be aligned along any of the principle axes, then such scatter should be considered as part of our theoretical uncertainty on ρ_dm. In practice, however, we cannot align discs arbitrarily within triaxial halos. Discs are unstable if aligned perpendicular to the intermediate axis of the figure (Heiligman and Schwarzschild 1979, Binney 1981, Debattista et al 2013). More importantly, baryons—stars and gas—that are not included in the DMO models, likely alter the expected halo shape, making halos much rounder and reducing the expected scatter in ρ_dm. I discuss this in section 2.3.

The two highest resolution Aquarius simulations—Aq-A-1 and Aq-A-2—are able to resolve the high density tail in the PDF due to subhalos at 8 kpc (figure 3(c), blue and red lines). While subhalos can significantly boost ρ_dm, the likelihood of this happening is very small (see section 2.2.2).

Finally, it is straightforward to show from these DMO simulations that, even up to ∼1 kpc above the disc of the Milky Way, we expect ρ_dm to be roughly constant when averaged over small 'Solar neighbourhood' volumes (Garbari et al 2011). This will provide a valuable simplification when trying to derive ρ_dm from real data, as we shall see in section 3.

The local velocity distribution function of dark matter

We can also use DMO simulations to predict the local velocity distribution function of dark matter in the Milky Way. This is important for direct detection experiments as I already discussed in section 1. The latest simulations are consistent with being close to Maxwellian, but not quite (Zemp et al 2009, Vogelsberger et al 2009a, and see figure 3(d)). The 'not-quite' is important, particularly at the high velocity tail end of the distribution. This is boosted in the simulations with respect to a pure Maxwellian profile, where the highest velocity particles come from recently accreted structure that is not fully phase-mixed (so-called 'debris flows' Kuhlen et al 2012a, Lisanti and Spergel 2012). These structures are a super-position of many tidal streams that intersect the Solar neighbourhood volume; they are particularly important for direct detection experiments that are sensitive to light or inelastic dark matter, or those with directional sensitivity (Kuhlen et al 2012a). Even more pronounced effects occur if an undisrupted but significant stream penetrates the Sun position (Stiff et al 2001). This is statistically unlikely, but—at least for the more massive streams—can be observationally tested by hunting for the visible stream-stars that would accompany such a 'dark stream' (Freese et al 2005). Lower mass satellite streams are potentially more problematic. These could also alter the local velocity PDF while being completely devoid of stars and essentially undetectable. I discuss these in section 2.2.2, below.

An example velocity PDF averaged over 2 kpc boxes at 7–9 kpc from the halo centre of the Aq-A-1 Aquarius simulation is shown in figure 3(d). Notice that, while the distribution is reasonably Maxwellian, there are prominent bumps and wiggles of larger magnitude than the box-to-box scatter. These depend on the particular formation history of a given dark matter halo (see figure 4 from Vogelsberger et al 2009a). As pointed out by those authors, if we enter an era where dark matter particles are routinely detected, then we could actually measure such bumps and wiggles for our own Galaxy. Since these encode information about our Galactic accretion history, we could conceive of unravelling our past via detailed modelling of the dark matter velocity PDF. I caution, however, that such bumps and wiggles may be at least partially erased by baryonic processes during Galaxy formation (section 2.3); this remains to be explored.

**Figure 4.** Including baryons in the cosmological simulations alters the predictions for ρ_dm. (a) Adding dissipative baryonic matter causes the dark matter halo become oblate and aligned with the disc (red horizontal line; Read *et al* 2009). (b) The presence of a massive disc at high redshift biases the accretion of satellites causing their tidal debris—both stars and dark matter—to settle into a rotating disc. This plot shows the distribution function of rotational velocity in the disc plane v_ϕ for the LPM simulation (Table 1; Read *et al* 2009). Without baryons (DMO; dotted), the dark matter distribution is well-fit by a single Gaussian. Including baryons (DM; black), it is skewed towards the rotating stellar disc (red); it is well-fit by a double Gaussian. This is a particularly extreme example since the LPM simulation had a massive near-planar merger at redshift z ∼ 1.
Download figure:
Standard image High-resolution image

2.2.2. Extrapolating from ρ_dm to ˜ρ_dm

Even with over a billion super-particles, the spatial resolution of the Aquarius Aq-A-1 DMO simulation is ∼20 pc (Springel et al 2008). While this is sufficient to model ρ_dm on the scales that we can hope to measure it in our Galaxy, it is many orders of magnitude away from $\tilde{\rho }_\mathrm{dm}$ . Thus, we must extrapolate from ρ_dm to obtain $\tilde{\rho }_\mathrm{dm}$ . The key concerns here are:

1.
unresolved substructrues, tidal debris/streams, and caustics (Stiff et al 2001, Freese et al 2005, Kamionkowski and Koushiappas 2008, Vogelsberger et al 2009b, Vogelsberger and White 2011, Fantin et al 2011); and
2.
the effect of the Solar system on the dark matter phase space distribution function (e.g. Peter 2009).

Streams and Caustics.

Vogelsberger and White (2011) use a novel 'sub-grid' stream model applied to the Aquarius simulation suite to show that unresolved streams are unlikely to significantly affect the smoothed results found in high resolution cosmological simulations (see also Fantin et al 2011). This is because of the sheer number of criss-crossing streams (∼10¹⁴) that co-add to make the distribution very smooth. The result is rather fortunate. Massive streams that could affect the velocity PDF are rare and in any case detectable because of their accompanying stars; lower mass streams that may be undetectable due to a lack of accompanying stars are common and, as a result, co-add to make the velocity PDF smooth. Caustics (regions of extremely high density caused by foliations of the dark matter phase sheet) appear to be similarly unimportant (Vogelsberger et al 2009b).

Unresolved substructure.

Kamionkowski and Koushiappas (2008) discuss the possibility that we lie within a small dark matter subhalo, significantly increasing $\tilde{\rho }_\mathrm{dm}$ with respect to ρ_dm. While this can occur, the probability that we lie on top of such a subhalo is quite small. Kamionkowski and Koushiappas (2008) derive a density PDF for $\tilde{\rho }_\mathrm{dm}$ that has a peak at $\tilde{\rho }_\mathrm{dm}< \rho _\mathrm{dm}$ , with a power law tail to high density caused by subhalos. The peak is lower than ρ_dm because of mass conservation. If we move more dark matter into substructures, then the tail to high density is boosted because there are more dense substructures, but the peak of the density PDF is shifted to lower density because there is less mass in the remaining smooth component. Since we are most likely to lie at or near the peak of the distribution, substructure halos have the effect, statistically, of reducing $\tilde{\rho }_\mathrm{dm}$ with respect to ρ_dm. Kamionkowski and Koushiappas (2008) extrapolate mass functions from N-body simulations down to the free streaming scale. Assuming a total mass fraction in substructure of 10%, the peak of the density PDF for $\tilde{\rho }_\mathrm{dm}$ is only very slightly shifted to ∼0.9ρ_dm, while the probability that $\tilde{\rho }_\mathrm{dm}$ is larger than ρ_dm is very small.

Solar system capture and scattering.

Finally, the effect of scattering within the Solar system is also likely to be small (Peter 2009), once both the capture and ejection of dark matter particles is taken into account (Edsjo and Peter 2010).

In conclusion, current state-of-the-art DMO simulations that achieve a spatial resolution of ∼20 pc appear to be adequate for making predictions for both ρ_dm and $\tilde{\rho }_\mathrm{dm}$ , under the assumption that baryons do not significantly alter the dark matter distribution. However, this assumption is most likely a poor one, as I discuss next.

2.3. The effect of baryons

While the DMO simulations are well understood, when including 'baryonic' matter (stars and gas) the simulations become significantly more complex (e.g. Mayer et al 2008). At present, the state-of-the art still leaves important physics below the resolution limit—so-called 'sub-grid' physics—leading to large discrepancies between groups (Mayer et al 2008, Scannapieco et al 2012). However, this situation is set to improve rapidly as both software algorithms and hardware improve (e.g. Dehnen and Read 2011). Recent simulations have now passed a critical resolution threshold of ∼100 pc that allows the most massive star forming regions to be resolved (Guedes et al 2011, Agertz et al 2011, Hopkins et al 2013), as well as beginning to resolve the scale height of the Milky Way thin disc (∼200 pc) for the first time. The most massive star forming regions are where the majority of massive stars explode as supernovae, returning heat and metals to the inter-stellar medium (ISM). This stellar feedback appears to be critical in forming galaxies that match the observed properties of real galaxies in the Universe (e.g. Mayer et al 2008, Guedes et al 2011, Agertz et al 2011, Hopkins et al 2013), though at present rather strong feedback—where a significant fraction of the available SNe energy couples very efficiently to the surrounding gas—appears to be required (e.g. Mashchenko et al 2008, Governato et al 2010, Guedes et al 2011, Teyssier et al 2013). Such feedback is not yet problematic given our uncertainties in how feedback operates (e.g. Agertz et al 2013), but more work needs to be done on modelling the small scale physics and its coupling to larger scales to determine whether or not feedback can really regulate the growth of galaxies, or whether we are missing some important ingredient in our cosmological model.

2.3.1. Qualitative predictions

While we are currently unable to make strong predictions when including baryonic processes, we can still study the expected changes to the DMO predictions in a more qualitative manner using the latest simulations. I discuss the key results from these here.

Most of the local mass near the Sun is in baryons, not dark matter.

The first important point to realize is that although we expect (and indeed observe) a significant amount of dark matter in our galaxy, the amount of dark matter expected in the vicinity of the Sun is actually rather small. This is because gas is a dissipative fluid. Unlike dark matter, gas can condense to form a rotationally supported disc that dominates the local gravitational potential. We can estimate the approximate about of dark matter expected in the vicinity of the Sun from the rotation curve assuming spherical symmetry. The enclosed mass at the Solar position R₀ ∼ 8 kpc is given by:

$\begin{equation} M_{\rm dm}(R_0) \sim \frac{v_c^2 R_0}{G} - M_d \end{equation} \tag{ 6 }$

where G is Newton's gravitational constant; v_c ∼ 220 km s⁻¹ is the local circular speed (Bovy et al 2012a, Schönrich 2012, Golubov et al 2013); and M_d ∼ 6 × 10¹⁰ M_⊙ is the mass of the Milky Way stellar disc (e.g. Binney and Tremaine 2008). This gives M_dm(R₀) ∼ 3 × 10¹⁰ M_⊙. Thus, about half of the mass of the Milky Way interior to R₀ is actually in baryons rather than dark matter (see e.g. Klypin et al 2002 for a more detailed analysis that arrives at the same conclusion). As we approach the disc plane, this becomes even more extreme. The scale height of the Milky Way thin disc is z₀ ∼ 200 pc, with most of the disc mass lying within ∼500 pc (e.g. Binney and Tremaine 2008). Thus, assuming a halo like that simulated in Springel et al (2008) normalized to the Milky Way rotation curve, dark matter comprises just ∼ one tenth of the mass in the Solar neighbourhood volume (8 < R₀ < 9 kpc; |z| < 500 pc).

The above makes hunting for the gravitational effect of dark matter near the Sun rather like looking for the proverbial needle in the haystack. This is one motivation for using extrapolations from larger scales where the dark matter dominates the potential. We are left in the end with a trade-off. We can average over large volumes over which we will see significant dark matter, but be necessarily less 'local', or we can average over a very small volume near the Sun, but be significantly more sensitive to our assumed baryonic mass model. I discuss this further in section 3.

Cusp-core transformations and halo shape change.

As gas collects and dominates the central potential of galaxies, it can cause a physical rearrangement of the dark matter distribution (simply through the gravitational interaction). Dark matter can contract in response to gas condensation (Young 1980, Blumenthal et al 1986), or even expand if energetic supernovae, or active galactic nuclei eject a significant amount of mass (Navarro et al 1996a). This latter process needs to repeat multiple times for the effects to be significant (Read and Gilmore 2005, Mashchenko et al 2006, Pontzen and Governato 2012, Teyssier et al 2013, Pontzen and Governato 2014). But if it does act, it will gradually transform dark matter cusps, predicted by DMO simulations (section 2.2), into constant density dark matter cores. Such cores have been observed in dwarf galaxies for over two decades now (e.g. Moore 1994, Flores and Primack 1994), lending support to such an idea. Further observational evidence has come more recently. If such cusp-core transformations occur, then the star formation history of dwarf galaxies should be bursty with a duty cycle of ∼250 Myrs, while their stars should be similarly heated leading to—at least in the older stellar populations—a significant vertical dispersion. Both of these predictions are consistent, and perhaps even favoured, by the latest data (Teyssier et al 2013). Such processes may even be important for galaxies as massive as the Milky Way (Dutton et al 2010, Macciò et al 2012).

Gas condensation also alters the shape of dark matter halos making them oblate and aligned with the disc, at least within ∼10 disc scale lengths (Katz and Gunn 1991, Dubinski 1994, Debattista et al 2007, Read et al 2009, and see figure 4(a)). This has three important effects on ρ_dm. Firstly, it makes assumptions of spherical symmetry for our Galaxy not unreasonable, despite the expectation in a DMO Universe that halos are triaxial (Dubinski and Carlberg 1991, Warren et al 1992, Navarro et al 1996b, Jing and Suto 2002, and see section 2.2.1). This means that spherical extrapolations from the rotation curve ρ_{dm, ext} could give a reasonable estimate of ρ_dm (see figure 1). Secondly, a more spherical halo significantly reduces the expected scatter in ρ_dm at the Solar neighbourhood (Pato et al 2010, and see discussion in section 2.2.1). Thirdly, oblate halos enhance ρ_dm. We can think of this enhancement as coming from a contraction of the dark matter halo due to the addition of a massive stellar disc. Bovy and Tremaine (2012) use a back-of-the-envelope calculation to argue that for the Milky Way, this enhancement should be about ∼30%. This matches recently published numerical results remarkably well (Pato et al 2010, Kuhlen et al 2013).

The formation of a 'dark disc'.

Finally, if a star/gas disc is already in place at high redshift then it will bias the further accretion of subhalos towards the disc plane. This is a result of momentum exchange due to gravitational scattering between the satellite and the disc stars: 'dynamical friction' (e.g. Binney and Tremaine 2008). The frictional force goes as:

$\begin{equation} M \dot{\bf v} = C \frac{\rho M^2}{v^3}{\bf v} \end{equation} \tag{ 7 }$

where M is the mass of the satellite; $\dot{\bf v}$ is the deceleration due to dynamical friction; ρ is the background density (i.e. stars, gas, dark matter etc); C is some constant of proportionality; and v = |v| is the velocity of the satellite relative to the background⁷.

Assuming a disc density (see section 3):

$\begin{equation} \rho = \rho _0 \exp (-|z|/z_0) \end{equation} \tag{ 8 }$

and assuming that the satellite travels on a straight line at velocity v through the disc, then its change in velocity over a single passage is given by:

$\begin{equation} \Delta v = |\dot{\bf v}| \Delta t = |\dot{\bf v}| \frac{2z_0}{v} \simeq \frac{2 C \rho _0 z_0 M}{v^3}. \end{equation} \tag{ 9 }$

This frictional force acts to drag the most massive satellites down towards the disc plane, leading to an accreted disc that contains both stars and dark matter (Lake 1989, Read et al 2008, 2009, Purcell et al 2009, Ling et al 2010, Kuhlen et al 2013, and see figure 4(b)).

There are three important points to note from equation (9). Firstly, the force depends on the satellite mass M and so will only be important for the most massive mergers (Read et al 2008). Secondly, the force is approximately proportional to the product of the disc scale height and central density: 2ρ₀z₀, which is nothing more than the disc surface density:

$\begin{equation} \Sigma = 2 \int _0^\infty \rho (z)\,{\rm d}z = 2\rho _0 z_0. \end{equation} \tag{ 10 }$

Thus, even if simulations do not properly resolve z₀ (most cosmological simulations do not), they can still largely capture the disc-plane dragging process correctly so long as they correctly capture Σ (Read et al 2009).

Finally, notice that the friction force goes as 1/v², where v = |v| ≃ |v_sat − v_disc| is the difference in velocity between the satellite and the background. Thus, the friction is significantly enhanced for satellites that co-rotate with the disc. For this reason, we expect the accreted disc stars and dark matter to largely co-rotate (Read et al 2008, 2009). Retrograde accreted material must also be present, but it is most likely to be sub-dominant to the prograde material, particularly as we approach the Solar neighbourhood.

Read et al (2008), 2009) estimate that the dark disc should contribute ∼ 0.25–1.5 times ρ_dm from the non-rotating smooth halo in our current cosmology, depending on the (rather uncertain) merger history and mass of our Galaxy. The dark disc also changes the velocity PDF of dark matter particles, producing a distribution that is better-fit by a double rather than single Gaussian, with interesting implications for both direct and indirect dark matter particle searches (Bruch et al 2009a, 2009b). Table 1 summarises the range of dark disc properties found by Read et al (2009) for three Milky Way mass galaxies with rather different merger histories, as marked. The most quiescent galaxy Q has a rather puny dark disc that contributes just ∼20% to ρ_dm, while the large planar merger (LPM) simulation has a massive ∼1:1 near-planar merger that produces a dark disc that dominates ρ_dm. This latter simulation introduces an alternate dark disc formation mechanism: if the mass ratio of the merger is small enough, then a gas rich merger can define the resultant disc plane, leading to a very significant dark disc (Read et al 2009). Such a scenario is not immediately implausible for the Milky Way. The LPM merger occurred at redshift z ∼ 1 which corresponds to ∼8 Gyr ago in our current cosmology. This is about the age separation of the Milky Way thin and thick discs (if there are indeed such distinct entities Bovy et al 2012b). Thus, any stellar heating induced by the merger could be hidden entirely in the thick disc stars—perhaps even explaining the origin of the thick disc.

Table 1. Dark disc properties for three numerical simulations of Milky Way mass galaxies taken from Read et al (2009). The original simulation labels are given in brackets; I use the more descriptive labels Q, LM and LPM in this review. Each galaxy had a rather different merger history: Q was rather Quiescent with no massive mergers since redshift z = 2; LM had two Large Mergers at z < 0.5; and LPM had a Large near-Planar Merger at z ∼ 1 (∼8 Gyr ago). The columns show: a description of the simulation; the dark disc to smooth halo density ratio averaged over |z| < 2.1 kpc; 7 < R < 8 kpc; the vertical velocity dispersion of the dark disc; the rotational velocity of the dark disc; and the ratio of the local to extrapolated dark matter density evaluated at 7 < R < 8 kpc (see text for details).

Description		ρ_dd/ρ_dm	σ_dd(km/s)	v_{rot, dd}(km/s)	(ρ_dm − ρ_{dm, ext})/ρ_{dm, ext}
Q	Quiescent (MW1)	0.23	50	54	0.175
LM	Late mergers (h204)	1.1	76	144	0.35
LPM	Large (∼1:1) planar	1.65	88	140	0.47
	merger (h258)

While the ratio ρ_dd/ρ_dm is of great interest for direct dark matter detection experiments (section 1), it is difficult to measure directly. Much more accessible is a comparison of the local to extrapolated dark matter density (cf figure 1):

$\begin{equation} \zeta = (\rho _\mathrm{dm}- \rho _\mathrm{dm,ext})/\rho _\mathrm{dm,ext}. \end{equation} \tag{ 11 }$

For ζ > 0, we have a flattened halo near the disc plane and/or a dark disc, while ζ < 0 implies a prolate halo. To calculate ζ from the simulation data, I average ρ_dm over |z| < 0.5 kpc and 7 < R < 8 kpc; and calculate ρ_{dm, ext} from the cumulative enclosed dark matter mass assuming spherical symmetry:

$\begin{equation} \rho _\mathrm{dm,ext}= \frac{M_{\rm dm}(R_2) - M_{\rm dm}(R_1)}{4\pi \overline{R}^2 \Delta R} \end{equation} \tag{ 12 }$

where R₂ = 8 kpc; R₁ = 7 kpc; $\overline{R} = 7.5$ kpc; and ΔR = R₂ − R₁. I compare the values of ζ for the Q, LM and LPM simulations to real data for the Milky Way in section 5.4.

The above findings for dark discs have largely been confirmed by more recent works (Purcell et al 2009, Ling et al 2010, Kuhlen et al 2013); however, there is some significant debate about how quiescent the merger history of our Galaxy was. The Eris simulation explored by Kuhlen et al (2013), for example, has a particularly quiescent merger history as compared to typical dark matter halos of similar mass. Purcell et al (2009) argue that this must be so, as otherwise mergers would dynamically over-heat the Milky Way thick stellar disc. However, such heating is reduced if mergers are of lower inclination and orbital eccentricity (exactly the same mergers that give rise to significant dark discs; Read et al 2008); or if gas—not present in the Purcell et al (2009) models—is included (Moster et al 2010).

Turning the above around, however, if it can be demonstrated that the Milky Way has a rather puny dark disc, then the implication is that its merger history must have indeed been rather quiescent. I discuss the possibility of empirically constraining the dark disc—and therefore the merger history of our Galaxy—next.

2.3.2. Hunting for the Milky Way's dark disc

One approach to constrain the Milky Way's dark disc is to hunt for the stars that must have been accreted along with it. These should show distinct chemistry and kinematics from the in-situ Milky Way population leading to the hope that they can be detected (Read et al 2008). A second possibility for detecting the 'dark disc' is via its dynamical influence—i.e. its contribution to ρ_dm (Read et al 2008, Garbari et al 2012). The expected scale height of the dark disc is large (2–3 kpc) and so, unless measurements probe high up above the Galactic disc, the approximation that ρ_dm is constant over the Solar neighbourhood remains reasonable even when considering the dark disc (Read et al 2008). This means, however, that the 'dark disc' will likely degenerate with the flattened oblate halo that is expected due to adiabatic contraction of the dark matter halo (see section 2.3.1). By combining measures of ρ_dm and ρ_{dm, ext} extrapolated from the rotation curve (figure 1) with chemo-dynamic Galactic 'archaeology' in the Milky Way, we can hope to break this degeneracy. There are several interesting scenarios:

1.
ρ_dm < ρ_{dm, ext}. In this case, there is no dark disc and the dark matter halo is likely prolate. This would have interesting implications for ΛCDM cosmology and/or galaxy formation theories as such a scenario is not expected. It would also essentially rule out weak field AG theories that require the gravitational potential to share symmetry properties with the disc (Helmi 2004, Read and Moore 2005).
2.
ρ_dm ≃ ρ_{dm, ext}. In this case the dark matter halo is near-spherical and there is no significant 'dark disc'. This implies a rather quiescent merger history for the Milky Way (Read et al 2008, Purcell et al 2009, Kuhlen et al 2013).
3.
ρ_dm > ρ_{dm, ext}. This implies either an oblate/squashed dark matter halo and/or a 'dark disc'. The degeneracy between these two scenarios can also be broken with improved data:

(a)
An oblate halo will additionally show flattening far from the disc plane. There may already be hints of such a flattening in the tidal debris of satellites orbiting around the Milky Way (e.g. Lux et al 2012) and in the kinematics of distant Milky Way 'halo stars' (e.g. Loebman et al 2012). Neither probe is conclusive at present. However, relatively small improvements in data promise significantly improved constraints (Lux et al 2013).
(b)
A 'dark disc' can be found via the stars that are accreted with it. These should show distinct chemistry and kinematics from the underlying in-situ disc population.

We explore which of the above scenarios, given current data, is most likely for the Milky Way in section 5.

2.3.3. Towards ab-initio simulations including baryonic physics

Ideally, we would like to be able to make robust quantitative predictions from numerical simulations that model both the dark matter and baryonic fluids. While this remains a significant challenge, resolving the correct spatial locations of the most massive star forming regions within galaxies (∼100 pc) is a key milestone that we have recently passed (Guedes et al 2011, Agertz et al 2011). For this reason, we can expect that the next generation of galaxy formation simulations will be significantly more predictive (e.g. Kim et al 2013).

3. Mass modelling theory

In this section, I briefly review the theory behind calculating the gravitational potential from an equilibrium distribution of 'tracer' stars moving in that potential. I focus mainly on stellar tracers in this review, discussing gas briefly in section 3.8.

A population of tracer stars obeys the collisionless Boltzmann equation:

$\begin{equation} \frac{{\rm d}f}{{\rm d}t} = \frac{\partial f}{\partial t} + \nabla _x f\cdot {\bf v} - \nabla _v f\cdot \nabla _x \Phi = 0 \end{equation} \tag{ 13 }$

where f(x, v) is the stellar distribution function; x and v are the positions and velocities, respectively; and Φ is the gravitational potential.

Assuming Newtonian weak field gravity, the force ∇_xΦ is related to the total mass density ρ (stars, gas, dark matter etc) through Poisson's equation:

$\begin{equation} \nabla _x \cdot \nabla _x \Phi = \nabla _x^2 \Phi = 4\pi G \rho . \end{equation} \tag{ 14 }$

If the system is in dynamic equilibrium (steady state), then we may neglect the partial time derivative of f in equation (13). This may not be a good approximation for the Milky Way if it has been recently bombarded by a satellite, or if the chosen tracers are not dynamically 'well mixed' in the disc. I discuss the choice of tracer stars in section 3.6; and recent evidences for disequilibria in the Milky Way disc in section 5.8.

Assuming equilibrium tracers for now, we drop the ∂f/∂t term. With this assumption—and armed with a measurement of the phase space distribution function f of our tracers—in principle, we can directly measure the gravitational force ∇_xΦ by solving equation (13). In practice, however, this is hard because f is six-dimensional (even a million stars gives only ten sample points per dimension) and we need to estimate the (noisy) partial derivatives of f. There are several solutions to this problem, each with advantages and disadvantages. I detail these, next.

3.1. Distribution function modelling

In distribution function modelling, we write down some parameterized (but possibly rather general) functional form for f(x, v). With a particular form in mind, the derivatives may be calculated either analytically or numerically without noise being an issue. Furthermore, since f—appropriately normalized—is really just a probability density distribution, we can directly calculate the likelihood of the data given the model:

$\begin{equation} \mathcal {L} = \prod _i \frac{f({\bf x}_i, {\bf v}_i)}{\int {\rm d}^3 {\bf x} {\rm d}^3 {\bf v} f({\bf x}, {\bf v})} \end{equation} \tag{ 15 }$

where the product is over all stars i with phase space position [x_i, v_i], while the integral is over the full distribution function. A useful trick is to take the logarithm of equation (15) that transforms the product into a more computationally manageable sum.

The advantages of such an approach are: i) we can directly model discrete data; and ii) we maximise the information content in the data by using the full shape information in the distribution function. The key disadvantage is that we must assume some form for f up-front. If our choice(s) for f do not include the correct solution, then we will obtain biased results no matter what quality or abundance of data are available (I give an example of this in section 4). Furthermore, it can often be very difficult to work out when this is happening.

One way to combat the above is to make f as general as possible. There are several approaches that may be considered as variants of one-another. I briefly describe these next, before shifting to moment methods (section 3.2) that are the main focus of this review.

3.1.1. 'Schwarzschild' or orbit modelling

In Schwarzschild modelling, we model the distribution function as a linear combination of many stellar orbits (Schwarzschild 1979). Starting with some assumed gravitational potential Φ, we build an orbit library: a large collection of representative orbits within this potential. This is usually comprised of regular orbits, though chaotic orbits can also be modelled as a constant additive phase space contribution (e.g. Binney and Tremaine 2008, Zhao 1996). The observed distribution of stars is then fit using a weighted sum over these orbits (for recent examples, see: van de Ven et al 2008, van den Bosch et al 2008, Vasiliev 2013).

Schwarzschild modelling has the advantage that the distribution function is directly constrained by the data in an essentially parameter free way (once the potential is prescribed). The disadvantages are mostly due to the computational cost of exploring a wide range of models. For discrete data, we require a large number of orbits to properly span the phase space (error-free data formally require infinitely many orbits; Magorrian 2013); while for each trial potential, we must begin over building the orbit-library from scratch. McMillan and Binney (2013) have argued recently that the intrinsic noise in the method owing to the finite number of orbits within the library could be a major barrier for exquisite data, unless the data are binned (for a discussion of the perils and pitfalls of binning data, see section 3.2). Furthermore, moving to libraries with an enormous number of orbits can lead to the danger of over-fitting noise in the data.

3.1.2. Made to Measure (M2M)

The made to measure (M2M) method was first proposed by Syer and Tremaine (1996). At heart, it is really an N-body method. However, it is different from typical N-body techniques in that each star has a constantly evolving orbit weight that pushes the simulated N-body system towards the real data. The idea is to maximise a merit function (Dehnen 2009):

$\begin{equation} Q = \mu S - \case{1}{2} C \end{equation} \tag{ 16 }$

where C is some constraint function that measures the goodness of fit; μ is a Lagrange multiplier; and S is some penalty function that forces us towards a single optimal solution; more on this shortly. The functions C and S are a matter of choice, but typically C is a χ²-like measure:

$\begin{equation} C = \sum _j^n \left(\frac{Y_j - y_j}{\sigma _j}\right)^2 \end{equation} \tag{ 17 }$

where Y_j are the data values with uncertainties σ_j; and y_j = ∑_iw_iK_j(x_i, v_i) are moments of the model weighted by a smoothing kernel K_j and some individual weights w_i (typically, a time averaged weight is used to avoid oscillating solutions; Dehnen 2009); and S is a pseudo-entropy:

$\begin{equation} S = -\sum _i \hat{w}_i \log \left(\frac{\hat{w}_i}{p_i}\right) \end{equation} \tag{ 18 }$

where $\hat{w}_i = w_i / \sum _j w_j$ are normalized weights, and p_i are priors on these weights.

The basic idea is then to solve the motion of the particles as a usual N-body problem:

$\begin{equation} \ddot{\bf x}_i = \nabla _x \Phi \end{equation} \tag{ 19 }$

where the potential Φ and accelerations ∇_xΦ are calculated using standard numerical techniques (e.g. Dehnen and Read 2011), while evolving the weights w_i with time to maximise Q:

$\begin{equation} \dot{w}_i = \epsilon w_i \frac{\partial Q}{\partial w_i} \end{equation} \tag{ 20 }$

where epsilon is a normalization parameter.

Modern implementations of the M2M method include: Bissantz et al (2004), de Lorenzi et al (2007), Rodionov et al (2009), Dehnen (2009), Long and Mao (2010) and Hunt and Kawata (2013). Each of these authors have extended and adapted the above classic methodology mainly to cope with the problem of orbit weight convergence.

The key advantage of M2M is that it naturally avoids assumptions about the form or shape of the gravitational potential, or the distribution function. Unlike the Schwarzschild method, the potential is fit simultaneously along with the orbit weights. However, it shares many of the same issues as Schwarzschild modelling. Searching through many models can be slow since M2M converges only on one 'best' solution; there may be others that are equally good (Dehnen 2009). There is a danger that solutions will not converge (Dehnen 2009) and, as with Schwarzschild, there is a danger of over-fitting noise in the data (de Lorenzi et al 2007). However, most of these issues will continue to improve with time as software and hardware algorithms improve (e.g. Dehnen and Read 2011). Indeed, this is what has driven a sudden interest in the method—largely untouched since Syer and Tremaine (1996)—over the past few years.

3.1.3. Action modelling

The Jeans theorem states that for regular orbits—and assuming a steady state galaxy—the distribution function may be written in terms of isolating integrals (e.g. Binney and Tremaine 2008). A particularly useful choice of canonical coordinates for the isolating integrals are the Action-Angle variables (e.g. Binney and Tremaine 2008, Binney 2013). These have the useful property that the actions J are conserved along each orbit, while the angles ${\boldsymbol \theta }$ increase linearly with time. From Hamilton's equations, we have:

$\begin{equation} \dot{\bf J} = \frac{\partial H}{\partial {\boldsymbol \theta }} = 0; \,\,\,\, \dot{\boldsymbol \theta } = \frac{\partial H}{\partial {\bf J}} = {\bf \Omega }({\bf J}) = {\rm const.} \end{equation} \tag{ 21 }$

⇒

$\begin{equation} {\bf J} = {\rm const.}; \,\,\,\, \theta _i = \theta _{0,i} + \Omega _i t \end{equation} \tag{ 22 }$

where H is the Hamiltonian.

In one dimension, the constant action and linearly increasing angle maps out a circle in phase space. In two dimensions, this becomes a torus; while in three dimensions, it is a 3-torus (recall that a circle is a 1-torus).

By the Jeans theorem, we can write the distribution function solely in terms of these actions: f ≡ f(J). Thus, once the orbital actions for a set of stars are known, the full distribution function is immediately known. This is a key strength of action modelling⁸. Like other methods, however, it also has some disadvantages. Firstly, the map from the observables [x, v] to the Actions $[{\bf J},{\boldsymbol \theta }]$ and visa-versa is non-trivial. Simple solutions are known for separable Stäckel potentials (Stäckel 1883, de Zeeuw 1985), but more general potentials require a numerical solution. One potential approach is torus modelling, where orbital tori in a general Galactic potential are fit by warping known tori from a simple toy potential (Kaasalainen and Binney 1994, Sanders 2012a, Binney 2013). A full solution for general potentials has not yet been presented, but may be achievable as an extension of existing techniques (Binney 2013). Secondly, only regular orbits can be modelled in this way. Binney (2013) cast this as an advantage in that it allows us to study the departure from regularity in a controlled manner. Perturbation theory about the best-fitting regular model, for example, has already proven to be able to recover the behaviour of irregular orbits in the case of a planar logarithmic potential (Kaasalainen 1994).

Binney (2012a) have recently introduced a useful approximation for calculating actions in potentials that are close to Stäckel form. This was applied to fit a simple parameterized distribution function to Solar neighbourhood data in Binney (2012b), illustrating the power of such an approach. The axisymmetric distribution function is assumed to take a 'quasi-isothermal' form:

$\begin{equation} f(J_r,J_z,L_z) = \frac{\Omega \Sigma \epsilon }{2 \pi ^2 \sigma _r^2 \sigma _z^2 \kappa }[1 + \tanh (L_z/L_0)]\,{\rm e}^{-\kappa J_z / \sigma _r^2} {\rm e}^{-\epsilon J_z / \sigma _z^2} \end{equation} \tag{ 23 }$

where J_r, J_z are the radial and vertical actions, respectively; L_z is the specific angular momentum of orbits within the disc plane; and Ω(L_z), κ(L_z) and epsilon (L_z) are the circular, radial and vertical epicyclic frequencies set by the gravitational potential. Under the epicycle approximation of near-circular orbits, these are given by (e.g. Binney and Tremaine 2008):

$\begin{equation} \Omega ^2 = \frac{L_z^2}{R^4} \,\,;\,\, \kappa ^2 = \left(R\frac{{\rm d}\Omega ^2}{{\rm d}R} + 4\Omega ^2\right)_{R_c,0} \,\,;\,\, \epsilon ^2 = \left(\frac{\partial ^2\Phi }{\partial z^2}\right)_{R_c,0}. \end{equation} \tag{ 24 }$

We must then further specify a form for the disc surface density Σ, the functions σ_r(L_z) and σ_z(L_z), and the gravitational potential Φ. Some simple choices for these (exponentials for Σ, σ_r and σ_z; and a Dehnen and Binney (1998b) model for the potential) are adopted in Binney (2012b). The 'Stäckel action' approximation is then required in order to map the observables [x, v] onto the actions J that appear in equation (23) for a given potential Φ(R, z) (Binney 2012a).

This same model has also been used recently by Bovy and Rix (2013) to measure the surface density of the Milky Way disc over a range of radii (4.5 < R < 9 kpc), for the first time. I discuss these measurements in section 5.

3.2. Moment methods: the Jeans equations

A completely different approach to distribution function modelling is to take instead moments of equation (13). Casting the steady state collisionless Boltzmann equation (equation (13) without the ∂f/∂t term) in cylindrical polar coordinates [R, ϕ, z], we have (e.g. Binney and Tremaine 2008):

$\begin{equation} v_R \frac{\partial f}{\partial R} + \frac{v_\phi }{R}\frac{\partial f}{\partial \phi } + v_z \frac{\partial f}{\partial z} - \left(\frac{\partial \Phi }{\partial R} - \frac{v_\phi ^2}{R}\right)\frac{\partial f}{\partial v_R} - \frac{1}{R}\left(v_Rv_\phi + \frac{\partial \Phi }{\partial \phi }\right)\frac{\partial f}{\partial v_\phi } - \frac{\partial \Phi }{\partial z}\frac{\partial f}{\partial v_z} = 0. \end{equation} \tag{ 25 }$

Multiplying through v_R, v_ϕ or v_z and integrating over all velocities derives the three Jeans equations (Jeans 1922, Binney and Tremaine 2008):

$\begin{equation} \frac{\partial (\nu \sigma _R^2)}{\partial R} + \frac{\partial (\nu \sigma _{Rz})}{\partial z} + \nu \left(\frac{\sigma _R^2 - \sigma _\theta ^2}{R} + \frac{\partial \Phi }{\partial R}\right) = 0 \quad R-\mathrm{Jeans} \end{equation} \tag{ 26 }$

$\begin{equation} \frac{1}{R^2}\frac{\partial (R^2\nu \sigma _{R\phi })}{\partial R} + \frac{\partial (\nu \sigma _{\phi z})}{\partial z} = 0 \quad \phi -\mathrm{Jeans} \end{equation} \tag{ 27 }$

$\begin{equation} \frac{1}{R}\frac{\partial (R\nu \sigma _{Rz})}{\partial R} + \frac{\partial }{\partial z}\big(\nu \sigma _z^2\big) + \nu \frac{\partial \Phi }{\partial z} = 0 \quad z-\mathrm{Jeans} \end{equation} \tag{ 28 }$

where:

$\begin{equation} \nu = \int {\rm d}^3 {\bf v} f({\bf x},{\bf v}) \end{equation} \tag{ 29 }$

is the density of the tracer stars, which is the zeroth moment of the distribution function;

$\begin{equation} \langle v\rangle _i = \frac{1}{\nu }\int {\rm d}^3 {\bf v} v_i f({\bf x},{\bf v}) \end{equation} \tag{ 30 }$

is the mean velocity (with i = R, ϕ, z), which is the first moment of the distribution function; and

$\begin{equation} \sigma _{ij} = \frac{1}{\nu }\int {\rm d}^3 {\bf v} (v_i - \langle v\rangle _i) (v_j - \langle v\rangle _j) f({\bf x},{\bf v}) \end{equation} \tag{ 31 }$

is the velocity dispersion tensor, which is a second velocity moment of the distribution function. (Note that ν should not be confused with the total matter density ρ that appears in the Poisson equation (equation (14)). The equality ν = ρ is only valid if the tracer stars comprise all of the gravitating mass.)

In principle, we may continue in the same vein adding ever higher order moment equations (for example, multiplying through by $v_R^2$ and integrating). This begins to constrain the shape of f at each point through its moments. (A Gaussian is fully defined by its first and second moments and thus the above equations are sufficient. However, more complex distributions will have non-trivial third, fourth and higher moments.) This is potentially valuable but highlights a key problem: such a set of moment equations has no closure relation (e.g. Binney and Tremaine 2008). Some distribution functions can be pathological, requiring a infinite set of moment equations⁹. Even then, such a set of moments may not correspond to a unique distribution function (the log-normal distribution is a simple example; e.g. Carron 2012).

The key advantages of Jeans methods are: i) they are extremely fast as compared to other methods, allowing large parameter spaces to be explored; and ii) no assumption about the form of f is required since we just constrain its moments. The key disadvantages are that we must bin the data in order to calculate the moments; the shape of the distribution function is not used; the set of moment equations is not closed (see above); and it is possible in some cases that a solution is found for which no actual distribution function exists (An and Evans 2006, Binney and Tremaine 2008). Data binning is a particular problem since it averages information away, while it must be performed in 'model' rather than 'data' space which can make it tricky to properly account for observational uncertainties. I discuss this further in section 3.7.

3.3. The 1D approximation

Given current data, solving all three Jeans equations (26)–(28) is neither practical nor possible (though this is beginning to change; see section 5). For this reason, simplifying assumptions are a necessity. Fortunately, for measurements close the Solar neighbourhood, we can approximately reduce the dimensionality of the problem to just motion in the z-direction.

Consider the Jeans equation perpendicular to the disc:

$\begin{equation} \underbrace{\frac{1}{R}\frac{\partial (R\nu \sigma _{Rz})}{\partial R}}_{{\rm tilt\,\,term\,\,} \mathcal {T}} + \frac{\partial }{\partial z}\big(\nu \sigma _z^2\big) + \nu \frac{\partial \Phi }{\partial z} = 0 \quad z-\mathrm{Jeans}. \end{equation} \tag{ 32 }$

In this equation, the radial and vertical motions couple only through the 'tilt' term $\mathcal {T}$ , marked above. Close to the disc plane, we may expand the gravitational potential in a Taylor series about [R₀, 0]:

$\begin{equation} \Phi (R_0+\Delta R,\Delta z) \simeq \Phi (R_0,0) + \Delta z \left. \frac{\partial \Phi }{\partial z}\right|_{R_0,0}+ \Delta R \left. \frac{\partial \Phi }{\partial R}\right|_{R_0,0} + O(\Delta ^2) \end{equation} \tag{ 33 }$

that to leading order is separable in ΔR and Δz. Therefore, close to the disc plane, the cross term in the velocity ellipsoid must vanish: σ_Rz = 0, and the term $\mathcal {T}$ should be small as compared to the other terms in equation (32). The question remains, however, how close is 'close'? This can be estimated by assuming some simple but well-motivated model for the Milky Way disc:

$\begin{equation} \nu \simeq \nu _0 \exp (-R/R_0)\exp (-z/z_0) \end{equation} \tag{ 34 }$

$\begin{equation} \sigma _z^2 \simeq \sigma _{z,0}^2 \exp (-R/R_1) \end{equation} \tag{ 35 }$

$\begin{equation} \sigma _{Rz} \simeq \sigma _{Rz,0} \exp (-R/R_2) \left(\frac{z}{z_0}\right)^n . \end{equation} \tag{ 36 }$

The vertical and radial exponential dependencies are reasonable given our current knowledge of the Milky Way (e.g. Binney and Tremaine 2008, Siebert et al 2008, Rix and Bovy 2013). The vertical polynomial term for σ_Rz∝zⁿ ensures that σ_Rz(R, 0) = 0, while allowing it to rise arbitrarily steeply otherwise.

Putting equations (34)–(36) into equation (32) gives:

$\begin{equation} \sigma _{Rz}\left[\frac{1}{R} - \frac{1}{R_0} - \frac{1}{R_2}\right] - \sigma _z^2\frac{1}{z_0} + \frac{\partial \Phi }{\partial z} = 0. \end{equation} \tag{ 37 }$

Using R = R₀ ∼ 8 kpc; R₀ ∼ R₂ ∼ 2 kpc; and z₀ ∼ 0.2 kpc, we can take the ratio of the first two terms to assess the relative importance of the tilt $\mathcal {T}$ for the Milky Way at the Solar neighbourhood:

$\begin{equation} f_\mathcal {T} \sim \frac{7}{40} \frac{\sigma _{Rz}(R_0,z)}{\sigma _z^2(R_0,z)} . \end{equation} \tag{ 38 }$

Equation (38) can be thought of as a percentage error introduced by neglecting $\mathcal {T}$ . Current constraints for the Milky Way (Siebert et al 2008) suggest that the tilt angle of the velocity ellipsoid at ∼1 kpc is:

$\begin{equation} \tan (2\delta ) = \frac{2\sigma _{Rz}^2}{\sigma _z^2 \sigma _R^2} \simeq 2\delta=14.6\pm 3.6^\circ . \end{equation} \tag{ 39 }$

Thus, at |z| ∼ 1 kpc and using σ_z ∼ 20 km s⁻¹; σ_R ∼ 40 km s⁻¹ (Soubiran et al 2003), we have $f_\mathcal {T}(1\,{\rm kpc}) \sim 0.12$ ; it will be smaller than this at lower heights. Thus, for |z| ≲ 1 kpc we can reasonably ignore $\mathcal {T}$ at the 10% level. For larger heights, we will need to measure $\mathcal {T}$ and include it in the analysis.

From here on, we drop the tilt term $\mathcal {T}$ . This gives us a one dimensional equation in z:

$\begin{equation} \frac{\partial }{\partial z}\big(\nu \sigma _z^2\big) + \nu \frac{\partial \Phi }{\partial z} = 0 \end{equation} \tag{ 40 }$

which has a formal analytic solution:

$\begin{equation} \frac{\nu }{\nu (0)} = \frac{\sigma _z^2(0)}{\sigma _z^2}\exp \left(-\int _0^z \frac{1}{\sigma _z^2(z^{\prime })}\frac{\partial \Phi (z^{\prime })}{\partial z^{\prime }}{\rm d}z^{\prime } \right). \end{equation} \tag{ 41 }$

Finally, we can relate the potential Φ to the total matter density via Poisson's equation. In cylindrical coordinates (and assuming azimuthal symmetry), this is:

$\begin{eqnarray} 4\pi G \rho & = & \frac{\partial ^2 \Phi }{\partial z^2} + \frac{1}{R}\frac{\partial }{\partial R} \left(R \frac{\partial \Phi }{\partial R}\right) \nonumber \\ & = & \frac{\partial ^2 \Phi }{\partial z^2} + \underbrace{\frac{1}{R}\frac{\partial v_c^2(R,z)}{\partial R}}_{\mathrm{rotation\,\,curve\,\,term}\,\,\mathcal {R}}. \end{eqnarray} \tag{ 42 }$

If the rotation curve term $\mathcal {R}$ is also small, then equation (42) becomes an equation also only in z and our system of equations (equations (40) and (42)) reduces to 1D motion perpendicular to the disc. We might expect $\mathcal {R}$ to be small given the flatness of the Milky Way rotation curve (v_c ∼ const gives $\mathcal {R}(z=0) \sim 0$ ). At heights |z| ≲ 1.5 kpc, Kuijken and Gilmore (1989c) show, for a range of plausible Milky Way potential models, that $\mathcal {R}(z)$ is also small, amounting to a correction of order a few per cent. Bovy and Tremaine (2012) show that this rises to ∼10% at |z| ∼ 4 kpc, while the error always leads to an underestimate of ρ_dm. Thus, for |z| ≲ 1 kpc, we may also safely drop the $\mathcal {R}$ term, leading to a 1D system of equations: the 1D approximation.

Armed with our 1D system of equations, we are left with a number of choices in how to solve them. Firstly, we can either simultaneously solve the Jeans and Poisson equations (equations (41) and (42)), or we can first solve equation (41) for the vertical force:

$\begin{equation} K_z = -\frac{\partial \Phi }{\partial z} \end{equation} \tag{ 43 }$

and then consider what this means for the mass distribution in the disc (Hill 1960). This latter has the advantage that we need not specify a gravitational model until the last possible moment (e.g. Nipoti et al 2007).

Another choice enters in that we can solve equation (41) for ν(z) given some measured or fitted σ_z(z), or we can do this the other way round:

$\begin{equation} \sigma _z(z)^2 = \frac{1}{\nu (z)}\int _0^z \nu (z^{\prime }) K_z(z^{\prime }) \,{\rm d}z^{\prime } + \frac{\sigma _z(0)^2 \nu (0)}{\nu (z)} \end{equation} \tag{ 44 }$

which can be advantageous since ν(z) is often better constrained than σ_z (e.g. Kuijken and Gilmore 1989c).

Finally, we can choose to constrain either the volume density ρ or the surface mass density Σ. Neglecting the rotation curve term $\mathcal {R}$ , this is given by:

$\begin{equation} \Sigma (z) = \int _{-z}^z \rho (z^{\prime })\,{\rm d}z^{\prime } = 2 \int _0^z \frac{1}{4\pi G} \frac{\partial ^2 \Phi }{\partial z^2} \,{\rm d}z^{\prime } = \frac{|K_z|}{2\pi G}. \end{equation} \tag{ 45 }$

This has the advantage that is it directly related to the vertical force K_z, whereas ρ requires another derivative of the potential. The mean enclosed dark matter density can be calculated from Σ as:

$\begin{equation} \langle \rho \rangle _\mathrm{dm}(z_\mathrm{max}) = \frac{\Sigma _z(z_\mathrm{max}) - \Sigma _b(z_\mathrm{max})}{2z_\mathrm{max}} \end{equation} \tag{ 46 }$

where Σ_b(z_max) is the baryonic contribution.

3.4. A 1D distribution function method

If the tilt term is zero rather than just small ( $\mathcal {T} = 0$ ), then we can make a further approximation that the distribution function is fully separable up to z ∼ 1 kpc:

$\begin{equation} f = f_{R,\phi }(R,v_R,v_\phi ) \times f_z(z, v_z). \end{equation} \tag{ 47 }$

This is a stronger assumption than we have assumed so far as I will discuss in section 4, but it is powerful. Now we can write the vertical density fall-off as an integral over a one-dimensional distribution function in the vertical energy $E_z = \frac{1}{2}v_z^2 + \Phi$ (Kuijken and Gilmore 1989c):

$\begin{equation} \nu (z) = \int _{-\infty }^{\infty } \,{\rm d}v_z f(z,v_z) = 2\int _\Phi ^\infty \frac{f(E_z)}{\sqrt{2(E_z - \Phi )}}\,{\rm d}E_z. \end{equation} \tag{ 48 }$

Applying an Abel transformation, we obtain (Kuijken and Gilmore 1989c, Binney and Tremaine 2008):

$\begin{equation} f(E_z) = -\frac{1}{\pi } \int _{E_z}^\infty \frac{\partial \nu }{\partial \Phi }\frac{1}{\sqrt{2(\Phi - E_z)}}\,{\rm d}\Phi \end{equation} \tag{ 49 }$

which may be directly compared with discrete data [z, v_z] to obtain a likelihood function:

$\begin{equation} \mathcal {L} = \prod _i^N \frac{f(E_{z,i})}{\int _0^\infty f(E_z) \,{\rm d}E_z}. \end{equation} \tag{ 50 }$

This is the method derived and used by Kuijken and Gilmore (1989b). We call this the 'KG' method from here on.

Flynn and Fuchs (1994), Holmberg and Flynn (2000b) and Holmberg and Flynn (2004) employ a very similar method, but rather than calculating a likelihood from f(E_z), they use the 1D distribution function to calculate the density fall-off of a tracer population moving in a potential Φ(z). Starting from equation (48), we define a z = 0 vertical velocity without loss of generality:

$\begin{equation} w = \sqrt{2[E_z - \Phi (0)]} = \sqrt{2E_z} ; \,\,\,\, \Phi (0) \equiv 0. \end{equation} \tag{ 51 }$

Substituting this into equation (48), we obtain:

$\begin{equation} \nu (z) = 2 \int _{\sqrt{2\Phi }}^\infty \frac{f(w)w \,{\rm d}w}{\sqrt{w^2-2\Phi }} \end{equation} \tag{ 52 }$

which has the advantage that ν(z) may be calculated using only the vertical velocity distribution function of nearby stars in the plane, f(w). Comparing this with the observed distribution ν_obs(z), we can hone in on the best-fitting Φ(z).

Since both the KG and HF methods assume a separable distribution function (equation (47)), we focus on the HF method as a proxy for both when confronting 1D methods with mock data in section 4.

3.5. The mass model

The total matter density ρ is a sum over all baryonic components (stars, gas, stellar remnants etc) and dark matter. The dark component is likely constant, at least up to z ∼ 1 kpc for which the 1D approximation is valid (section 2). (Recall that for z < 1 kpc this is true even if there is a 'dark disc', since this is expected to have a scale height of ∼2–3 kpc (section 2.3.2). Data probing to z ≳ 2 kpc would be potentially sensitive to the density fall-off of such a dark disc, making it interesting to relax the ρ_dm ∼ const. assumption. For this review, however, where most of the data are for z ≲ 2 kpc and we work typically under the assumption that the tilt is small, I assume ρ_dm = const.)

The baryonic components can be treated as a sum over many isothermals with constant σ_z (Flynn et al 2006). Isothermals are a convenient decomposition for the disc, since the solution to equation (40) is then analytic (Bahcall 1984b):

$\begin{equation} \nu _i = \nu _{0,i} \exp \left(-\frac{\Phi (z)}{\sigma _{z,i}^2}\right). \end{equation} \tag{ 53 }$

(Note that such a decomposition need not refer to physically distinct tracers, though it does in the Flynn et al (2006) model. A particular stellar type could be described, for example, by a linear sum over several isothermal components.)

This gives a total mass model:

$\begin{eqnarray} \rho & = & \rho _\mathrm{disc} + \rho _\mathrm{dm}\nonumber \\ & = & \sum _i \nu _{0,i} \exp \left(-\frac{\Phi (z)}{\sigma _{z,i}^2}\right) + \rho _\mathrm{dm}. \end{eqnarray} \tag{ 54 }$

The Flynn et al (2006) mass model is described in table 2. Integrating the total surface density, we obtain Σ_b = Σ_g + Σ_* + Σ_• = 49.3 ± 7.5 M_⊙pc⁻², where the gas contribution is Σ_g = 13.2 ± 6.6 M_⊙pc⁻²; the stellar contribution is Σ_* = 28.9 ± 2.9 M_⊙pc⁻²; and stellar remnants/brown dwarfs contribute Σ_• = 7.2 ± 0.7. This can be compared with a recent determination of Σ_* = 30 ± 1 M_⊙pc⁻² from SDSS¹⁰ (Bovy et al 2012b).

Table 2. (a) The disc mass model from Flynn et al (2006). The columns show: the mass component (stars/gas/stellar remnant); the mass density in the midplane ρ(0); the total column density Σ; and the vertical velocity dispersion σ_z. Uncertainties on the densities are of order ∼50% for all the gas components (indicated with *) and ∼10% for all the stellar components. (b) A new compilation of the integrated baryonic surface density Σ_b in gas $\Sigma _g = \Sigma _{\rm {H\, i}} + \Sigma _{{\rm H}_2} + \Sigma _{\rm Warm\,\,gas}$ ; stars Σ_*; and stellar remnants/brown dwarfs Σ_•.

$\epsfbox{images/jpg493802un01.eps}$

It is clear that the total error budget is dominated by the gas. Assuming a constant ρ_dm up to ∼1 kpc, the expected dark matter contribution is Σ_dm ∼ 16 M_⊙pc⁻² which is only ∼2 times the error on Σ_b. Thus, the only reason we can hope to measure ρ_dm at all is because we expect Σ_b and Σ_dm to have very different vertical dependences, with Σ_b largely reaching its asymptote by z ∼ 0.5 kpc, and Σ_dm continuing to grow up to 1 kpc and beyond (section 2).

Given the importance of the baryonic mass model, it is worth a moment to understand the origin of the above uncertainties and how we might do better. With the advent of SDSS, the uncertainty in the local stellar surface mass density Σ_* is now very small. Combining the Flynn et al (2006) constraints for Σ_• with the Bovy et al (2012b) value for Σ_*, we obtain a very accurate Σ_* + Σ_• = 37.2 ± 1.2 M_⊙pc⁻². The major source of error, however, is in the gas surface density Σ_g, which is primarily H i gas (see table 2). The large error on the H i contribution arises because of the difficultly of measuring distance for gas (see section 3.8). To convert the observations of temperature and velocity as a function of Galactic coordinates on the sky: T_gas(l, b, v) to a surface density Σ_g, we must assume some underlying mass model for the Galaxy (for a review see Kalberla and Kerp 2009). Using the results from such an analysis independently of measurements of ρ_dm immediately creates some inconsistency since the best fit mass model used to derive Σ_g may be rather different from the best fit that arises from the measurement of ρ_dm. I discuss this problem further in section 3.8. For now, I will side-step this thorny issue and simply discuss the measurements of Σ_g available in the literature to date. Holmberg and Flynn (2000a) split the H i into hot and cold components that each contribute $\Sigma _{\rm {H\, i}} \sim 4 \;\mathrm{M} _\odot \mathrm{pc}^{-2}$ (see table 2), whereas Wolfire et al (2003) favour $\Sigma _{\rm {H\, i}} \sim 5\; \mathrm{M} _\odot \mathrm{pc}^{-2}$ , and Kalberla and Dedes (2008) $\Sigma _{\rm {H\, i}} \sim 12\; \mathrm{M} _\odot \mathrm{pc}^{-2}$ . If we take the very latest value to be correct (not necessarily a safe thing to do) and assign an error based on the radial fluctuations in H i reported by Kalberla and Dedes (2008), then we obtain $\Sigma _{\rm {H\, i}} = 12 \pm 4\, \mathrm{M} _\odot \mathrm{pc}^{-2}$ . Including the contribution from warm gas and H₂ reported in table 2, I derive Σ_b = 54.2 ± 4.9 M_⊙pc⁻², where I have assumed a 50% error on the H₂ and warm gas contribution as previously. This formally more accurate Σ_g is reported also in table 2. I stress, however, that in future we ought to simultaneously fit for Σ_g alongside our fit for ρ_dm.

3.5.1. The rotation curve prior

It is desirable to model the local dark matter density ρ_dm independently of the rotation curve if possible for the reasons outlined in section 1. However, we can still use the rotation curve to put sensible bounds on ρ_dm. Some authors like Kuijken and Gilmore (1989c) have applied such priors, while others like Bahcall (1984b) have not (see figure 2). The precise form of any such prior depends on the choice of mass model. Kuijken and Gilmore (1989a, 1989b, 1989c, 1991), for example, use a series of spherical-halo Galactic mass models that are consistent with the known rotation curve to inform their prior. I describe this prior in more detail and explore its effect in section 4.

3.6. The choice of tracer

So far, we have assumed the existence of some equilibrium tracer stars with known position and velocity. In the 1D approximation, this means having perfect knowledge of the height and vertical velocity of each star: [z, v_z]. If using moment methods, we may then extract from this a density ν ≡ ν(z), and a vertical velocity dispersion σ_z ≡ σ_z(z). However, there are several practical problems that arise when attempting to measure [z, v_z] for real stars in the Milky Way. I briefly discuss these, next.

Selecting stars in 'equilibrium'.

Firstly, we require that the tracers are in dynamical equilibrium (steady state) such that we can neglect the partial time derivative of the distribution function (see section 3). For this reason, authors usually avoid young stars since these may not have had time to dynamically mix through the disc (e.g. Bahcall et al 1992). However, there is no guarantee that the disc has not been recently disturbed such that even old stars are currently out of equilibrium; I discuss recent evidence for such disequilibria in the Milky Way in section 5.

Selecting stars that reach to high ${\boldsymbol z}$ .

Secondly, we require stars that orbit relatively high up above the disc plane (z > 0.75 kpc) in order to break a degeneracy between the dark and stellar mass in the disc (Garbari et al 2011). I discuss this degeneracy further in section 4.

Obtaining a good measure of distance.

Thirdly, it is difficult to measure the distance z of a star accurately. In an ideal world, we would use the parallax distance method, since this is the most accurate available (e.g. Binney and Tremaine 2008). However, using the Hipparcos satellite, this is currently only possible for bright stars within ∼100 pc of the Sun (van Leeuwen 2007). This will change soon with the advent of Gaia (see section 5.9 and figure 11). In the meantime, we must make do with a photometric distance estimate. This relies on finding stars of a known luminosity¹¹ L—so-called 'standard candles'. The distance then follows from a flux measurement:

$\begin{equation} d_p^2 = \left[\frac{L_\lambda }{4\pi f_\lambda }\right] \end{equation} \tag{ 57 }$

where d_p is the photometric distance to the star; and L_λ and f_λ are the luminosity and flux at a given wavelength λ.

To use equation (57) to obtain a photometric distance, we must have some independent measure of L_λ for a given stellar type. We can obtain this by calibrating the relationship between L_λ, colour B − V, and metallicity [Fe/H] using nearby stars that have Hipparcos distances (see figure 5 ¹²). In doing this, however, there is a danger of mis-classifying the stellar type in the first place. Notice from figure 5(a) that K-giant stars (stars that have evolved off the main sequence; Phillips 1999) can masquerade as main sequence K-dwarf stars since they share similar colours. In practice, this is not a major problem because K-giants are so much brighter that K-dwarfs. Beyond about ∼200 pc, it becomes implausible that a distant K-giant could be mistaken for a nearby faint K-dwarf (Kuijken and Gilmore 1989c). Thus, a simple distance cut on z < 200 pc is sufficient to weed out K-giant contamination (Garbari et al 2012).

**Figure 5.** (a) A synthetic colour-magnitude diagram (CMD) generated using the *IAC-star* code (Aparicio and Gallart 2004); the stellar *spectral type* (B0V, A0V, ...etc) as a function of colour B − V, is marked (see footnote 11 for a definition of colour, magnitude, and spectral type). (b) A real CMD for 431 K-dwarf (K0V) stars selected from Kotoneva *et al* (2002) with Hipparcos distances (z < 100 pc; Garbari *et al* 2012). These can be used to calibrate a photometric distance at heights z > 100 pc for which Hipparcos distances are not accurate. Notice that the scatter in the relationship between *M_V* and B − V is largely due to metallicity [Fe/H] (colour contours); it can be significantly reduced if [Fe/H] is known.
Download figure:
Standard image High-resolution image

Obtaining good velocities.

We also require good velocities v_z for the stars. Radial velocities (along the line of sight) are most accurate since these derive from Doppler shifts; however, transverse velocities (so-called proper-motions) can also be obtained by waiting long enough that the stars move across the sky with respect to a fixed background (e.g. Wilkinson et al 2005). Here, Gaia will also be transformative, obtaining accurate proper motions out ∼1 kpc even for faint K dwarf stars (see section 5.9). Since in the 1D approximation, we require only v_z, one useful trick is to look in a direction where v_z can be measured using only Doppler shifts (i.e. where the line of sight points perpendicular to the disc plane); this trick was used by Kuijken and Gilmore (1989c) to obtain their K-dwarf sample.

The advantage of a 'volume complete' sample.

If we know that we have observed every single star of a given type up to some height z_c, then that sample is said to be volume complete up to z_c. The advantage of using such volume complete samples is that the density ν(z) simply follows from counting statistics. If, however, we are missing some stars because they become too faint to be reliably detected, or because they are obscured by dust, then we must correct for such incompleteness. Provided we know both the luminosity function of our stars (that can be a function of height), and our selection function, then there is no problem. But this is an area where systematic errors can creep in.

Ensuring consistency.

Ideally, we should use the same tracers for σ_z(z) that we use for ν(z). However, in practice, this is often not done as it is much easier to obtain data for ν(z) (that requires only imaging), than for σ_z(z) (that requires spectra and/or proper motions). This leads to an additional source of systematic error (Kuijken and Gilmore 1989a). The only way to truly avoid this problem is to use a consistent set of tracer stars.

Modern survey data.

Modern surveys RAVE and SDSS have collected velocities each for of order ∼10 000 stars within ∼2 kpc of the disc plane (Siebert et al 2008, Smith et al 2012, Zhang et al 2013). Such velocity data are exquisite and have been driving significant improvements in measurements of ρ_dm (see figure 2 and section 5). However, the challenge with these data is in understanding the survey selection function well enough that ν for a given tracer may be reliably determined (see e.g. discussion in Smith et al 2012).

Given the above list of complications, it is prudent to measure ρ_dm both with a very clean sample of stellar tracers, and using the latest survey data that have much improved statistics but for which it is significantly harder to estimate the systematic errors. I take this approach in section 5. For the 'clean' stellar sample, I present results from a recent re-analysis of the K-dwarf data from Kuijken and Gilmore (1989c). This uses a new distance calibration that takes advantage of the more modern Hipparcos data (Garbari et al 2012, and see figure 5). These data amount to some ∼2000 K-dwarf stars, a quarter of which have measured v_z; they are volume complete up to 1.1 kpc above the disc plane. The 'less clean' stellar sample comes from SDSS survey data. There, some ∼10 000 stars are available with measured [z, v_z] up to ∼2 kpc above the disc plane. However, the selection function for these stars is significantly more complex (Smith et al 2012, Zhang et al 2013). Finally, I review results for a recent study that also uses the SDSS data, but slices the stars into narrow 'mono-abundance populations' (MAPs) (Bovy and Rix 2013). These MAPs, appear to be well-fit by very simple quasi-isothermal distribution functions (see section 3.1.3), allowing for greatly simplified models to be applied to the data. If such assumptions are correct, then even tighter constraints on ρ_dm follow.

3.7. Errors and model degeneracies

Degeneracies.

If using the mass model described in 3.5, then we have over 30 parameters that may degenerate with one another. To cope with this, Garbari et al (2011) use a Markov Chain Monte Carlo (MCMC) method to efficiently explore this parameter space (see also Zhang et al 2013). As we move beyond 1D models, such methods for efficient parameter exploration will become increasingly important; I discuss this briefly in section 5. The MCMC method is also very useful when folding in observational uncertainties. I discuss this, next.

Observational Errors.

So far, we have assumed perfect data with no observational errors. In general, we will have non-Gaussian probability distribution functions that describe the likelihood of a position, velocity and tracer membership (star type, metallicity etc) of a given tracer star. If using a distribution function approach, including these errors is a straightforward (though perhaps computationally expensive) convolution¹³:

$\begin{equation} \mathcal {L}({\bf a}| {\bf m}) = \int _{\bf a_0} d {\bf a}_0 g({\bf a} - {\bf a}_0) \mathcal {L}_0({\bf a}_0|{\bf m}) \end{equation} \tag{ 58 }$

where $\mathcal {L}({\bf a}|{\bf m})$ is the probability of obtaining imperfect data a given some model parameters m; $\mathcal {L}_0({\bf a}_0\mid {\bf m})$ is the probability of obtaining perfect data a₀ given m; and g(a|a₀) is the probability of obtaining a given a₀ (i.e. the error probability distribution function).

If not using a distribution function method, the errors can be included in one of two ways. We may include the observational errors, along with the Poisson noise uncertainties, in the calculation of the binned ν(z) and σ_z(z). However, the resultant error PDFs are unlikely to be Gaussian and we should not use the usual χ² statistic when comparing these binned data with a given model. Alternatively, we can sample the error PDF to generate many different data sets that are each compared with a given model (this amounts to a Monte Carlo sampling of the convolution integral in equation (58)). If using an MCMC, this can then be easily included as a 'Monte Carlo within a Monte Carlo' (Garbari et al 2011). The downside to this approach is that we must generate many more models in our MCMC chain to ensure that both the model parameters and the data uncertainties are properly sampled.

3.8. Gas as a tracer of the potential

In addition to using stars, we may also use gas as a tracer of the Milky Way potential. For gas, the equations are slightly different since gas is a collisional rather than collisionless fluid. Like the stars, gas will obey the Poisson equation, but the collisionless Boltzmann equation (13) is replaced by the equation of hydrostatic equilibrium that balances pressure forces and gravity:

$\begin{equation} \nabla P_{\rm gas} = -\rho _{\rm gas} \nabla \Phi \end{equation} \tag{ 59 }$

where P_gas and ρ_gas are the gas pressure and density, respectively. Equation (59) amounts to an assumption of equilibrium for the gas that is potentially much more precarious than the similar assumption of steady state for the stars. This is because typically un-modelled physical process in the ISM, like supernovae, cosmic ray radiation, gas turbulence and magnetic fields, contribute an effective P_{gas, eff} that is not included in equation (59) (e.g. Elmegreen and Scalo 2004). This could lead to potentially large systematic errors on Φ. Levine et al (2008) recently found, for example, that their derived vertical derivative of the H i rotation curve in the Milky Way is too large to be explained by gravity alone.

To solve equation (59), we must also specify an equation of state of the gas that relates pressure to temperature. Usually, a polytrope is assumed: $P_{\rm gas} = A \rho _{\rm gas}^\gamma$ , where A is a constant. For an ideal gas, the gas temperature then follows from P_gas = ρ_gask_BT_gas/(μm_H), where k_B = 1.38 × 10⁻²³ m² kg s⁻² K⁻¹ is the Boltzmann constant; μ is the mean molecular weight; and m_H is the mass of a proton.

Aside from disequilibria and un-modelled physics, a further key complication when using gas is determining the distance. In practice, for the Milky Way we can only measure the temperature as a function of angle on the sky, usually expressed in Galactic coordinates l, b; and the line of sight velocity v that follows from the Doppler shift of the H i 21 cm line (e.g. Binney and Merrifield 1998). To obtain a distance from this, we must model the gas assuming both hydrostatic equilibrium and some background potential for the Milky Way (Kalberla 2003, Kalberla et al 2007). I discuss the results of such fits to the new Leiden–Argetina–Bonn (LAB) survey data in section 5.

4. Tests using mock data

Given the wide array of different methods outlined in section 3, it is helpful to compare and contrast these by applying them to mock data. This allows us to assess systematic errors that occur when model assumptions are violated, and to assess what type and quality of data are most important to improve estimates of ρ_dm.

Statler (1989) was one of the first to worry about systematic errors in measuring ρ_dm, focussing on the typically un-modelled tilt-term $\mathcal {T}$ ; Kuijken and Gilmore (1989c) estimated the order of magnitude effect of neglecting $\mathcal {T}$ and the rotation curve term $\mathcal {R}$ (section 3.3), and the effect of measurement errors; Kuijken and Gilmore (1989a) discussed the problems that can arise if tracers are inconsistent (see section 3.6); and Kuijken and Gilmore (1991) and Inoue and Gouda (2013) performed Monte-Carlo simulations of their full analysis pipeline, similar to those that I will perform section 4.1. However, the first detailed investigation using dynamically realistic mocks generated from evolved N-body simulations was performed by Garbari et al (2011). I discuss this work in section 4.2.

4.1. Simple 1D mock data

It is beyond the scope of this short review to compare and contrast all of the methods outlined in section 3. Instead, in this section I focus on very simple tests of the 1D Jeans method described in section 3.3. I will discuss distribution function methods in section 4.2. As we will see, this is already instructive. All of the mock data tests presented in this paper are available for download from the Gaia Challenge wiki site¹⁴, where tests of ever increasing sophistication are on-going.

To set up some simple mock data that are dynamically self-consistent, I use the 1D distribution function approximation outlined in section 3.4. This assumes that $\mathcal {T} = 0$ at all heights above the disc plane. I will also assume no observational uncertainties. This is essentially 'as good as it gets' and so such tests should allow us to estimate the absolute minimum uncertainty expected from data sets of a given size.

To make life even easier, I will assume a very simple parameterized form for the tracer density and gravitational potential as in Kuijken and Gilmore (1989c)¹⁵:

$\begin{equation} \nu (z) = \nu _0 \exp (-z/z_0) \end{equation} \tag{ 60 }$

and:

$\begin{equation} \Phi (z) = K(\sqrt{z^2 + D^2} - D) + Fz^2 \end{equation} \tag{ 61 }$

which gives:

$\begin{equation} K_z = -\left[\frac{Kz}{\sqrt{z^2 + D^2}} + 2 F z\right] \end{equation} \tag{ 62 }$

where z₀ is the tracer scale height; D is the disc scale height; and K and F set the vertical force contribution from the disc and dark halo, respectively. I adopt a system of units: kpc, M_⊙, km/s⁻¹.

Assuming Newtonian gravity and that the rotation curve term $\mathcal {R}$ (section 3.3) is small, we can relate the vertical force to a surface density (M_⊙ pc⁻²) via the Poisson equation:

$\begin{equation} \Sigma _z(z) \simeq \frac{|K_z|}{2\pi G} \end{equation} \tag{ 63 }$

where in the above system of units, G = 4.299.

Using these simple analytic forms, we can calculate the distribution function as a function of vertical energy:

$\begin{equation} f(E_z) = -\frac{1}{\pi }\int _{E_z}^{\infty } \frac{\mathcal {G}(\Phi ) \,{\rm d}\Phi }{\sqrt{2(\Phi - E_z)}}; \,\,\,\, \mathcal {G} \equiv \frac{1}{z_0}\frac{\nu }{K_z}. \end{equation} \tag{ 64 }$

This needs to be solved numerically which requires us to transform away the infinity in the upper integral limit and the root in the denominator. Using the substitution $\Phi = E_z \sec ^2\theta$ gives:

$\begin{equation} f(E_z) = \frac{-\sqrt{2 E_z}}{\pi } \int _{0}^{\pi /2} \sec ^2\theta \mathcal {G}(\theta ,E_z) \,{\rm d}\theta . \end{equation} \tag{ 65 }$

To draw a population of i stars, I first draw the positions z_i from equation (60). Then for each star, assuming values for [z₀, D, F, K], I calculate f(E_z) by numerically integrating equation (65). The vertical velocities are then drawn from f(E_z) using an accept/reject method, remembering to normalize f(E_z)/max [f(E_z)] at each star position z_i.

I set up three mock data sets as described in table 3, chosen to be a reasonable match to the Milky Way. The different mocks are designed to explore the effect of sampling and priors (Simple); modelling multiple populations with different scale height simultaneously (Simple2); and having stellar tracers high up above the disc plane (High).

Table 3. Mock data parameters. The columns show: mock description; tracer scale height z₀; disc and dark matter vertical force parameters K, F; disc scale height D; and a plot label that indicates which panels in figure 6 use each given mock. (Note that panel f) explores simultaneously fitting two tracers: Simple and Simple2 with different scale heights.) I adopt a system of units: kpc, M_⊙, km s⁻¹. The disc surface mass density follows from equation (63): Σ_b = K/(2πG) = 55.53 M_⊙pc⁻². The dark matter density follows similarly: Σ_dm = Fz/(2πG)⇒ρ_dm = F/(2πG1000) = 0.01 M_⊙pc⁻³.

Model	z₀	K	F	D	Plot
Simple	0.4	1500	267.65	0.18	(a)–(f)
Simple2	0.9	1500	267.65	0.18	(f)
High	0.65	1500	267.65	0.18	g), (h)

I then attempt to recover the surface mass density Σ_z(z) from these mock data. In the spirit of 'as good as it gets', I fit exactly the same input mass model to the data (equation (62)). I use the 1D Jeans approximation for this (section 3.3), and an MCMC to explore parameter degeneracies (section 3.7). I run 500 000 models for each MCMC chain and conservatively discard the first half to avoid bias induced by the initial chain parameters.

4.1.1. The effect of sampling error and priors

First, I consider how well we do using just 1000 stars but applying different levels of prior information. The results are shown in figure 6(a)–(d). I consider two different priors:

1.
Rot. The KG rotation curve prior (Kuijken and Gilmore 1989b, 1989c):
$\begin{equation} F = (0.041 - 0.0094 K \pm 0.008) \frac{c}{a} \end{equation} \tag{ 66 }$
where c/a describes the halo flattening perpendicular to the disc. I assume the default choice used by KG c/a = 1 which is valid for a spherical dark matter halo.
2.
Scale. Here, I assume significant knowledge about the baryonic mass distribution such that I can place strong priors on 0.1 < D < 0.25 kpc and 50 < Σ_b < 60 M_⊙pc⁻².

Without any priors (figure 6(a)), the recovery of Σ_z(z) is poor. The input model (blue) is recovered within the 95% confidence interval, but there are significant model degeneracies. Panels (b)–(d) explore the effect of increasing the prior constraints. In panel (b), I turn on the 'Rot' prior that wraps in information from the Milky Way rotation curve assuming spherical symmetry. This immediately tightens the error envelope but does not drastically reduce the uncertainties at z ∼ 1 kpc. In panel (c), I turn on the 'Scale' prior that puts constraints on the mass and scale length of the visible disc. This prior is quite reasonable in that such information are available for the real Milky Way (section 3.5). The errors are now significantly reduced at z ≲ 0.5, but the errors grow significantly at z ∼ 1 kpc—the region where we become sensitive to ρ_dm. Finally in (d), I add both the Rot prior that constraints ρ_dm and the Scale prior that constrains the visible disc. Now I obtain rather tight constraints that are closer to previously reported errors in the literature (e.g. Kuijken and Gilmore 1991, Holmberg and Flynn 2004).

It is clear from figure 6(a)–(d) that with tracer numbers of n_* ∼ 1000, we are rather sensitive to priors on the mass model. Once the prior from the rotation curve is taken away, the resultant errors on ρ_dm are large (Bahcall et al 1992, Garbari et al 2012).

Figure 6(e) shows what happens as we raise the sampling to 10,000 stars—about the number currently available from the SDSS survey data (Zhang et al 2013). Now, even without any prior constraints, the error envelope is rather tight—similar to that quoted recently by Zhang et al (2013).

4.1.2. Data high above the disc plane

Figure 6(g) and (h) explore the effect of using tracers high above the disc plane. Moni Bidin et al (2012) recently used a sample of ∼500 stars over heights ∼2–4 kpc to claim very tight constraints on ρ_dm, finding—at odds with previous studies and the Galactic rotation curve—a dearth of dark matter near the Sun (ρ_dm = 0 ± 0.001 M_⊙pc⁻²; see the point marked 'MB12' in figure 2, and table 4). Their formal uncertainties were also surprisingly small—much smaller than those in figure 6(g) and (h). There are likely several reasons for this. Firstly, Bovy and Tremaine (2012) showed that the Moni Bidin et al (2012) measurement hinged on an erroneous assumption that the mean azimuthal velocity of stellar tracers v_ϕ(R, z) is constant. Assuming instead that the Milky Way rotation curve is constant in the plane:

$\begin{equation} v_c^2(R,0) = \left. R\frac{\partial \Phi }{\partial R}\right|_{z=0} = {\rm const.} \end{equation} \tag{ 67 }$

which is a statement about the gravitational potential in the plane Φ(R, 0) rather than the stellar kinematics, they derive a value consistent with other measures in the literature (see the point marked 'BT12' on figure 2). Secondly, it is likely that the observational uncertainties in the MB12 data are underestimated (Sanders 2012b). Finally, with just 412 stars, they rely on knowing very well from which photometric sample these stars are drawn (in order to determine the density fall-off with height). Systematic errors could easily creep in here. If the data become inconsistent such that the density fall-off ν(z) is no longer consistent with σ_z(z), then attempts to fit models to these data could push model parameters into corners of parameter space, leading to erroneously small errors.

Table 4. Measurements of ρ_dm (top) and ρ_{dm, ext} (bottom). The columns show: label; reference; description of the study (for latest measurements only); order of magnitude tracer sample size (for latest measurements of ρ_dm only, calculated up to ∼1–2 kpc); and ρ_dm or ρ_{dm, ext} in M_⊙pc⁻³ and GeV cm⁻³. Notes: (a)ρ_dm: All values have been calculated from the quoted total density (black) or surface density (blue) in each study, assuming a baryonic contribution ρ_b = 0.0914M_⊙pc⁻³ or Σ_b = 55M_⊙pc⁻², respectively (section 3.5). All error bars represent either 1σ uncertainties or 68% confidence intervals. For the latest measurements, if the studies' determination of ρ_dm differs from that quoted here, the original determination (using the studies' favoured baryonic contribution) is also marked in square brackets; see text for further details. Studies that use a 'rotation curve' prior (see section 3.5.1), are marked with a dagger †. G12 and G12* use re-calibrated volume complete (VC) data from Kuijken and Gilmore (1989c); G12* invokes a stronger baryon prior: Σ_b = 55 ± 1 M_⊙ pc⁻² (see text for details). S12, Z13 and BR13 all use SDSS survey data that have a Complex Selection Function (CSF). S12 choose not to quote uncertainties owing to the difficulty of estimating systematic errors. BR13 slice the data into Mono Abundance Populations (MAPs), assuming that each of these can be fit by a quasi-isothermal distribution function (see section 3.1.3). All data points are plotted graphically in figure 2. (b) ρ_{dm, ext}: S10 use a non-parametric (NP) method with some of the weakest priors of all of the methods. CU10 use the most restrictive priors, assuming an NFW profile. WB10 explore different halo models (NFW and ISOthermal, amongst others) with weaker priors (WP). I11 wrap in microlensing (ML) data assuming weak priors and a gNFW profile (equation (2) with the power law indices allowed to vary).

Label	Reference	Description	Sampling	ρ_dm [M_⊙ pc⁻³]	ρ_dm [GeV cm⁻³]
(a) Local measures (ρ_dm)
Kapteyn	Kapteyn (1922)	–	–	0.0076	0.285
Jeans	Jeans (1922)	–	–	0.051	1.935
Oort	Oort (1932)	–	–	0.0006 ± 0.0184	0.0225 ± 0.69
Hill	Hill (1960)	–	–	−0.0054	−0.202
Oort	Oort (1960)	–	–	0.0586 ± 0.015	2.2 ± 0.56
Bahcall	Bahcall (1984a)	–	–	0.033 ± 0.025	1.24 ± 0.94
Bienayme†	Bienayme et al (1987)	–	–	0.006 ± 0.005	0.22 ± 0.187
KG†	Kuijken and Gilmore (1991)	–	–	0.0072 ± 0.0027	0.27 ± 0.102
Bahcall	Bahcall et al (1992)	–	–	0.033 ± 0.025	1.24 ± 0.94
Creze	Creze et al (1998)	–	–	−0.015 ± 0.015	−0.58 ± 0.56
HF†	Holmberg and Flynn (2000b)	–	–	0.011 ± 0.01	0.4 ± 0.375
HF†	Holmberg and Flynn (2004)	–	–	0.0086 ± 0.0027	0.324 ± 0.1
Bienayme	Bienaymé et al (2006)	–	–	0.0059 ± 0.005	0.51 ± 0.56
Latest measurements
MB12	Moni Bidin et al (2012)	CSF	412	0.00062 ± 0.001	0.023 ± 0.042
				[0 ± 0.001]	[0 ± 0.042]
BT12	Bovy and Tremaine (2012)	CSF	412	0.008 ± 0.003	0.3 ± 0.11
G12	Garbari et al (2012)	VC	2 × 10³	0.022 $^{+0.015}_{-0.013}$	0.85 $^{+0.57}_{-0.5}$
G12*	Garbari et al (2012)	VC + Σ_b	2 × 10³	0.0087 $^{+0.007}_{-0.002}$	0.33 $^{+0.26}_{-0.075}$
S12	Smith et al (2012)	CSF	10⁴	0.005 [no error]	0.19
				[0.015]	[0.57]
Z13	Zhang et al (2013)	CSF	10⁴	0.0065 ± 0.0023	0.25 ± 0.09
BR13	Bovy and Rix (2013)	CSF + MAP	10⁴	0.006 ± 0.0018	0.22 ± 0.07
				[0.008 ± 0.0025]	[0.3 ± 0.094]
(b) Global measures assuming spherical symmetry (ρ_{dm, ext})
S10	Salucci et al (2010)	NP	–	0.011 ± 0.005	0.43 ± 0.2
CU10	Catena and Ullio (2010)	NFW; SP	–	0.0103 ± 0.00072	0.385 ± 0.027
WB10	Weber and de Boer (2010)	NFW/ISO; WP	–	0.005 - 0.01	0.2 - 0.4
I11	Iocco et al (2011)	gNFW; WP; ML	–	0.005 - 0.015	0.2 - 0.56
M11	McMillan (2011)	NFW; SP	–	0.011 ± 0.0011	0.4 ± 0.04

4.1.3. Multiple tracer populations

Figure 6(f) considers modelling different stellar tracers simultaneously in the same potential. I consider two sample populations—Simple and Simple2 (table 3) with 5000 stars in each. We can think of these as being stars that are split, for example, by metallicity or abundance or stellar type. Since each population has a different scale length—z₀ = 0.4 kpc (Simple); and z₀ = 0.9 kpc (Simple2)—but they both live in the same potential, we should obtain tighter constraints on Σ_z(z) than we would do if modelling the same number of stars with a single population (this trick was recently employed in the context of measuring ρ_dm by Zhang et al 2013, for the first time). As can be seen, splitting the populations in this way does not yield significantly improved constraints. As compared to the Simple population with 10⁴ stars, the errors are somewhat larger at high z and smaller at low z. This is perhaps surprising given claims from the spherical Jeans modelling community of the power of population splitting (e.g. Battaglia et al 2008, Walker and Peñarrubia 2011). However, the reason for this is that in the spherical Jeans equations there is an unknown cross term that must be marginalized out: the velocity anisotropy β(r). In general, the poorly measured β(r) can take any value in the range −∞ < β < 1. By contrast, in the 1D approximation that we employ here, the equivalent cross term is the tilt $\mathcal {T}$ that we can safely assume is small. Thus, population splitting when modelling, for example, dwarf spheroidal galaxies of the Milky Way is invaluable in helping to break a degeneracy between the enclosed mass M(r) and β(r). In our 1D disc modelling, here, no such degeneracy exists and population splitting is correspondingly less powerful.

There are, however, two good reasons to still consider population splitting despite the above. Firstly, once we move high up above the disc plane, $\mathcal {T}$ is no longer small and population splitting will likely prove to be a powerful additional constraint. Secondly, population splitting in stellar abundance or age appears to produce stellar tracers that have a remarkably simple distribution function. More on this in section 5.7.

4.2. N-body mocks

The above simple 1D models are instructive in that they already give us a feel for the expected error given perfect data. However, the real Milky Way is dynamically more complex that our simple mock. Apart from observational uncertainties, we have uncertain tracer membership (section 3.6), disequilibria (section 5.8), and potentially significant contributions from the tilt $\mathcal {T}$ and the rotation curve $\mathcal {R}$ terms (section 3).

One way to test the above is to apply mass modelling methods to dynamically realistic N-body mock data. The first to attempt this was Garbari et al (2011); I briefly review the results of that work in this subsection. Garbari et al (2011) set up a mock Milky Way using a Widrow and Dubinski (2005) model, with 30 × 10⁶ star 'super-particles' (see section 2 for a definition of 'super-particles'), and 15 × 10⁶ and 0.5 × 10⁶ super-particles for the dark matter halo and stellar bulge, respectively. A contour plot of the stellar distribution viewed from above is shown in figure 7 for an 'unevolved' disc (a) that was run for t ∼ 50 Myrs to ensure equilibrium had been reached; and an 'evolved' disc (b) that was run for t ∼ 4 Gyr such that a bar and spiral arms similar to those seen in the real Milky Way formed. The unevolved disc satisfies by construction all of the assumptions in the 1D distribution function method outlined in section 3.4: $\mathcal {T} = 0$ exactly, and $\mathcal {R} \sim 0$ . By contrast the evolved disc does not, showing asymmetric variations as a function of angle around the disc.

Garbari et al (2011) test two different mass modelling methods: a generalized 1D moment method (section 3.3) that they call the 'Minimal Assumption' or (MA) method; and a 1D distribution function method—the HF method (section 3.4). The key difference between these two is that the MA method assumes only that $\mathcal {T}$ is small, while the HF method (like the KG method in section 3.4) assumes that the distribution function is exactly separable—i.e. that σ_Rz = 0 and therefore $\mathcal {T} = 0$ exactly. The key results are shown in figure 7. Firstly, Garbari et al (2011) apply the MA method to the unevolved disc (figure 7(c)). Notice that there is a strong degeneracy between ρ_dm and the visible in-plane matter density ρ_b if the tracers do not sample high up above the disc plane (compare the left and right panels). This was stressed also by Bahcall (1984b). We must sample several disc scale heights above the disc (≳ 750 pc) to be able to 'see' the dark matter—at least given current errors on the visible mass density (section 3.5). Secondly, consider the recovery of the evolved disc. Now the disc is axisymmetric and we must consider different 'wedges' as a function of angle around the disc, as marked in red on figure 7(b). Figure 7(d) shows the recovery of ρ_dm (blue contours) and ρ_b (red contours) as a function of wedge angle; the true answers for each wedge are marked by the solid circles and dashed lines. The top plot shows the recovery for the HF method; the bottom for the MA method. Notice that the HF method is biased in several wedges, giving a systematically wrong recovery of ρ_dm within the quoted uncertainties (the 90% confidence intervals are marked by horizontal lines). The reason for this is that the distribution function in most wedges is not a function of vertical energy, whereas the HF method assumes exactly this: f ≡ f(E_z). This is shown in the bottom two panels of figure 7(d). Notice that for the wedge at θ = 45°, f(E_z) measured at z = 0 pc (black) is in excellent agreement with the same measured at z = 500 pc (red). By contrast, for the wedge at θ = 180°, the distribution function clearly changes as we move from z = 0 pc to z = 500 pc. This is why the HF method recovers a systematically biased ρ_dm and ρ_b for this wedge.

**Figure 6.** 1D Mock data tests of the recovery of the disc surface mass density Σ_z(z). The solid, dotted and dashed lines show the median, 68% and 95% confidence intervals for 250 000 models sampled with an MCMC. The blue line shows the input model. The mock data are described in table 3 and are 'as good as it gets' in that I assume no observational errors; zero tilt and rotation curve terms $\mathcal {T}=0$ , $\mathcal {R}=0$ ; and perfect self-consistent tracer stars. Panels (a)–(d) explore the effect of increasingly strong model priors (as marked) for n_* = 10³ tracers from the Simple model (see text for details). Panel (e) shows results for 10⁴ tracers; and (f) the same split into two populations (Simple and Simple2; see table 3) with different scale heights. Finally, panels (g) and (h) explore results for just 500 tracers high above the disc plane (the 'High' mock data set).
Download figure:
Standard image High-resolution image

**Figure 6.** 1D Mock data tests of the recovery of the disc surface mass density Σ_z(z). The solid, dotted and dashed lines show the median, 68% and 95% confidence intervals for 250 000 models sampled with an MCMC. The blue line shows the input model. The mock data are described in table 3 and are 'as good as it gets' in that I assume no observational errors; zero tilt and rotation curve terms $\mathcal {T}=0$ , $\mathcal {R}=0$ ; and perfect self-consistent tracer stars. Panels (a)–(d) explore the effect of increasingly strong model priors (as marked) for n_* = 10³ tracers from the Simple model (see text for details). Panel (e) shows results for 10⁴ tracers; and (f) the same split into two populations (Simple and Simple2; see table 3) with different scale heights. Finally, panels (g) and (h) explore results for just 500 tracers high above the disc plane (the 'High' mock data set).
Download figure:
Standard image High-resolution image

**Figure 7.** Testing mass modelling methods using dynamically realistic N-body mock data; figures reproduced from Garbari *et al* (2011). (a) The N-body mock Milky Way disc viewed in stellar density contours from above. The 'cylinders' and 'wedges' used to represent Solar neighbourhood volumes are marked in red. (b) The same simulation evolved for ∼4 Gyr to form a bar and spiral arms. Notice that the disc is no longer axisymmetric. (c) Recovery of ρ_dm and the in-plane visible mass density ρ_b using the 'MA' method on the unevolved disc (see text for details), and considering tracers up to |z| < 250 pc (left) and |z| < 750 pc (right). (d) Recovery of ρ_dm (blue contours) and ρ_b (red contours) for the evolved (non-axisymmetric) disc, using the 'HF' method (top) and the 'MA' method (bottom). The true values are marked by the dashed lines and solid circles; the horizontal lines on each contour bar mark the 90% confidence intervals. Notice that the HF method performs well for the wedge at θ = 45°, but poorly at θ = 180°. The bottom two panels plot the distribution function as a function of vertical energy f(*E_z*) in the plane (black) and at z = 500 pc (red). Notice that these agree for θ = 45°, but depart strongly at θ = 180°. The θ = 180° wedge does not satisfy the assumption f ≡ f(*E_z*) and so the HF method produces a biased result for ρ_dm.
Download figure:
Standard image High-resolution image

The above demonstrates the importance of testing our methodology on dynamically realistic mock data. At first sight, the HF and MA methods make seemingly identical assumptions. But the subtle difference that the HF method assumes an exactly separable distribution function, while the MA method assumes only approximate separability (via an assumed small tilt term) leads to a potentially strong bias on ρ_dm for the HF method. Modern analyses use more sophisticated distribution functions (e.g. Binney 2012b, Bovy and Rix 2013, and see section 3.1.3). However, we must continue to test and hone such parameterized distribution function forms on simulations of ever increasing realism. This is a key goal of the Gaia Challenge project¹⁶.

5. Measurements of ρ_dm and ρ_{dm, ext}

In this section, I summarize historic and recent measurements of ρ_dm derived from local tracers in the disc, and ρ_{dm, ext} derived from the rotation curve.

5.1. Nearly a century of measurements of ρ_dm

A summary of measurements of ρ_dm from Kapteyn through to the present day is given in figure 2, where I mark also the latest limits on ρ_{dm, ext} from the rotation curve assuming a spherically symmetric dark matter halo (grey band). In compiling this list, I used the density (black) or surface density (blue) in each study, assuming a baryonic contribution ρ_b = 0.0914 M_⊙pc⁻³ or Σ_b = 55 M_⊙pc⁻², respectively (section 3.5). Note that this can, however, produce a different answer as compared to fitting the raw data using these values as a prior on ρ_b and/or Σ_b. I illustrate this point using the Garbari et al (2012) study in section 5.2.

There are a number of fascinating results that crop up from examining figure 2. Firstly, notice that right up to the mid and late 1980s, there was an enormous scatter in the results between different groups. Oort (1960), Bahcall (1984a) and Bahcall et al (1992) claimed evidence for significant dark matter in the disc, while Bienayme et al (1987) and Kuijken and Gilmore (1989a) found none. Post Hipparcos, there has been a dramatic convergence between groups towards values consistent with spherical extrapolations from the rotation curve (grey band). However, the observant reader will notice that there is quite some difference between the quoted errors. Part of this is explained by volume density (black/red) versus surface density (blue) estimates. The latter average over a region higher up above the disc plane that breaks degeneracies between the visible and dark matter mass (Bahcall 1984b, Garbari et al 2011), leading to greater accuracy. But even accounting for this, the error bars appear to remain static with time or grow despite the arrival of data from SDSS. This is because modern analyses now make many fewer assumptions than previously (Garbari et al 2012, Zhang et al 2013, and see section 3). As these data have improved, we have begun to address more refined questions about the dynamical state of our Galaxy. Secondly, notice that there are three post-Hipparcos measurements that appear discrepant at greater than 1σ. Creze et al (1998) are on the low-side. This is likely because they average over the smallest height of any of the studies: z_max = 125 pc, which is less than the scale height of the Milky Way thin disc (e.g. Binney and Tremaine 2008). At this height, we become very sensitive to errors in ρ_b. By contrast, Garbari et al (2012, G12) is on the high-side, though it does agree with the other measures within 2σ. I discuss this further in section 5.2. Finally, there is a third discrepant point. Moni Bidin et al (2012, MB12) have recently claimed to find no dark matter near the Sun at very high confidence. As discussed in section 4, this discrepancy results, at least in part, from a poor modelling approximation. Bovy and Tremaine (2012, BT12) re-analyse their data using more realistic model assumptions, finding a value consistent with the other measures (the BT12 data point is also marked on figure 2).

5.2. The latest local measures

Several groups have recently revisited local measurements of ρ_dm, as summarized in figure 2 and table 4 a. All of these measures of ρ_dm are complementary in the sense that: i) they rely on different prior assumptions, some stronger than others; and ii) while S12, Z13 and BR13 have ∼10,000 stars within ∼2 kpc, the ∼2000 K-dwarf stars in the G12 sample (re-calibrated from Kuijken and Gilmore 1989c) have a much simpler, volume complete, selection function.

The first thing to note is that all of the above studies agree within 2σ, while only G12 is discordant at 1σ. This is already remarkable given the different data sets and methodologies employed. However, it is interesting to understand why G12 is different. The reason can be seen in figure 8 that plots the derived ρ_dm from G12 against Σ_b (that is simultaneously fit for in their analysis). The 90% confidence intervals are marked both with (red) and without (black) correcting for the non-flatness of the local rotation curve. Notice that there is a degeneracy between Σ_b and ρ_dm that is weakly broken (the contours close), favouring $\Sigma _b = 45.5^{+5.6}_{-5.9} \;\mathrm{M} _\odot \mathrm{pc}^{-2}$ . This is systematically lower than Z13 who favour Σ_b = 55 ± 5 M_⊙pc⁻². If we invoke a stronger prior on the G12 analysis of Σ_b = 55 ± 1 M_⊙ pc⁻², in-line with Z13 and with the updated baryonic mass model compiled here (section 3.5), we obtain the green point labelled G12*. This is in much better accord with the other recent measures (see also figure 2 and table 4). (Note that using Σ_b = 55 M_⊙pc⁻² as a prior on the G12 analysis produces a very different result to simply subtracting Σ_b = 55 M_⊙pc⁻² from the G12 total surface mass density. The former uses knowledge of the baryonic mass distribution in the model fitting, the latter does not.)

**Figure 8.** The weakly-broken degeneracy between Σ_b and ρ_dm in the G12 analysis.
Download figure:
Standard image High-resolution image

The studies S12, Z13 and BR13 all use SDSS survey data that have a complex selection function (CSF). For this reason, S12 choose not to quote uncertainties owing to the difficulty of estimating systematic errors. By contrast, Z13 build on earlier work from Bovy et al (2012b) to characterize the survey systematics, computing a final error on their derived ρ_dm. Ideally, to explore potential systematics in such an analysis we should build sophisticated mock data that are both chemically and dynamically realistic. This is a key goal of the Gaia Challenge initiative¹⁷.

5.3. The latest global measures

In addition to recent work on measurements of ρ_dm, there have been several new measurements of ρ_{dm, ext}. These combine data from a wide range of tracers—stars and gas—in the Milky Way, fitting a global model for the Galaxy. Typically, it is assumed that the dark halo is spherical and in the data compilation in table 4b, I include values only obtained under this assumption. (Note that CU10, WB10 and M11 additionally use the local surface density of matter as a constraint on their models, taking the value from Kuijken and Gilmore (1991). However, since Kuijken and Gilmore (1991) use a prior from the rotation curve that assumes a spherical halo (see sections 3–5), I still consider these to be global models that constrain ρ_{dm, ext} rather than ρ_dm.)

From table 4, it is clear that the different studies agree within their quoted uncertainties, but also that a few studies have significantly smaller uncertainties than the others. The reason for this simply comes down to the strength of the assumed priors in each case. S10 and I11 use the weakest priors of all of these studies, the former employing a non-parametric method; the latter using a parametric method but with quite some freedom in the dark matter density distribution (they also include microlensing data constraints). Neither of these studies uses any prior on ρ_dm from local measures. This makes their measurements truly independent of local measures of ρ_dm. Since their results are similar, I use I11 to over-plot results for ρ_{dm, ext} on figure 2.

5.4. Constraints on the local Milky Way halo shape and an accreted dark disc

Comparing the global (ρ_{dm, ext}) and local (ρ_dm) measures, we can look for evidence for a flattened or prolate dark halo for our Galaxy at R₀, or the presence of a dark disc (see figure 1, and section 2.3). I quantify this in figure 9, where I plot ζ = (ρ_dm − ρ_{dm, ext})/ρ_{dm, ext} for four recent measurements of ρ_dm from table 4: G12, G12*, Z13 and BR13. I assume ρ_{dm, ext} = 0.38 ± 0.18 GeV cm⁻³ taken from I11 (see table 4). Over-plotted are the three ζ values reported in table 1 (dotted lines) for the case of a Quiescent Milky Way (Q), a Milky Way with significant Late Mergers (LM), and a Milky Way with a massive LPM, as marked. I also over-plot the effect of global halo flattening (red dashed lines). To derive these, I assume a Logarithmic halo model, for which the density at the Solar position [R₀, 0] is given by (e.g. Binney and Tremaine 2008):

$\begin{equation} \rho _L = \left(\frac{v_0^2}{4\pi G q^2}\right) \frac{(2q^2 + 1) R_c^2 + R_0^2}{(R_c^2 + R_0^2)^2} \end{equation} \tag{ 68 }$

where q is the potential flattening in the z direction; R_c = 15 kpc is a halo scale length, and v₀ = 220 km s⁻¹ sets the halo mass.

**Figure 9.** Constraints on the Milky Way halo shape at R₀ and/or an accreted dark disc from four recent measurements of ζ = (ρ_dm − ρ_{dm, ext})/ρ_{dm, ext} (data taken from table 4). The black dotted lines mark ζ values taken from three cosmological simulations of Milky Way mass halos (table 1); the red dashed lines show ζ for a simple flattened Logarithmic halo model (equation (68)).
Download figure:
Standard image High-resolution image

Using the above form for the dark matter halo, we can calculate the increase/decrease in the local dark matter density with respect to the spherical case for different values of q:

$\begin{equation} \zeta = (\rho _L - \rho _L(q = 1)) / \rho _L(q = 1) \end{equation} \tag{ 69 }$

This is overplotted on figure 9 for q = 0.7, 1 and 1.8, as marked (red dashed lines). The small q values correspond to an oblate (flattened) halo; the large q values to a prolate (stretched) halo.

As already reported in G12, notice that G12 favour significant flattening in the plane, suggesting an oblate halo and/or a significant dark disc. By contrast, G12*—that uses a stronger baryonic surface density prior of Σ_b = 55 ± 1 M_⊙pc⁻²—is perfectly consistent with a spherical halo at R₀. The errors are large, however, permitting both prolate and oblate halos within 1σ, consistent with all three 'dark disc' simulations: Q, LM and LPM. Only the latest SDSS constraints appear constraining at 1σ (Z13 and BR13). These favour prolate halos at R₀ with negative ζ. If correct, such a local prolate halo would be theoretically rather surprising (see section 2.3), and certainly bad news for 'AG' explanations of dark matter (e.g. Read and Moore 2005). However, the statistical significance for this is low. More interesting is the upper bound on these data points. Notice that they are only marginally consistent (at 1σ) with a Quiescent Milky Way with a spherical halo at R₀. If the latest measurements from SDSS are correct, the implication is that the Milky Way has a near-spherical or even prolate dark matter halo at R₀, no significant dark disc and—correspondingly—a rather quiescent merger history. I caution, however, that this result hinges on there being no large systematics that remain to be uncovered in the SDSS data, and on the local baryonic surface density being Σ_b ∼ 55 M_⊙pc⁻².

5.5. Independent measures of the Milky Way halo shape

Apart from the ρ_dm/ρ_{dm, ext} comparison, the strongest constraints on the Milky Way halo shape at the moment come from tidal streams. The archetype is the Sagittarius stream, an enormous structure that stretches across the Northern and Southern hemispheres, giving constraints on the halo shape at ∼15–50 kpc from the Galactic centre (Ibata et al 2001). Early work on the stream suggested a near-spherical halo for the Milky Way (Ibata et al 2001), but more recent data combined with more sophisticated modelling appear to favour a triaxial halo (Law and Majewski 2010). The trouble with the latter is that the best fitting model is unstable (Heiligman and Schwarzschild 1979, Binney 1981, Debattista et al 2013). Vera-Ciro and Helmi (2013) have suggested that this problem could be solved if the triaxiality is allowed to vary with ellipsoidal radius, similarly to what is expected from cosmological simulations (section 2.3). However, halos that have an axisymmetric inner region that aligns with an outer intermediate axis may also be unstable, once a fully self-consistent 'live' halo is taken into account (Debattista et al 2013).

Even if the stability issues for the triaxial solution can be resolved, a key problem remains. None of the current models fit the very latest data that favour a trailing arm with much larger apocentre than the leading arm (Belokurov et al 2013). This puzzling result could imply that the orbit of Sagittarius has evolved (Zhao 2004, Read et al 2008, Lux et al 2013), or that the Sagittarius progenitor had a more complex internal structure than is typically assumed (Peñarrubia et al 2010).

The complexity of the Sagittarius stream has driven an increased focus on thinner colder streams that are much simpler to model (though stream-orbit offsets must still be accounted for: Varghese et al 2011, Eyre and Binney 2011, Sanders and Binney 2013, Lux et al 2013). Lux et al (2012) have recently argued that a tentative turnaround in the NGC 5466 globular cluster stream at its western edge implies an oblate or triaxial Galactic halo, while Lux et al (2013) have shown that with just radial velocity data, the Pal 5 globular cluster stream will determine the flattening of the Milky Way halo—something that is within reach of current instrumentation. With full proper motion and distance data along these two streams, a triaxial halo could be confirmed or ruled out.

For the time-being, there is sufficient room for uncertainty in all of the above data that a spherical Milky Way halo remains a plausible fit. This could change rapidly, however, as models of already existent data for the Sagittarius stream improve, and as new data for the Pal 5 and NGC 5466 streams become available.

Finally, I stress that all of these stream data give shape constraints at radii significantly larger than that discussed in section 5.4. Combing local constraints at R₀ with stream data over a wide range of Galactocentric radii R holds the exciting promise of constraining radial variations in the shape of our dark matter halo, for the first time.

5.6. Constraints from H i gas

As discussed in section 3.8, the Milky Way H i gas disc also provides information about the Galactic gravitational potential both in and out of the disc plane. Kalberla et al (2007) have recently fit the Jeans–Poisson equations to new H i data from the LAB H i survey, finding some rather surprising results. Their favoured mass model has a significant dark disc and a dark 'ring' (see also similar results from de Boer and Weber 2011) that appears to align with the known Monoceros stellar over-density (Ibata et al 2003, Conn et al 2007, 2012). However, as discussed in section 3.8, using gas as a tracer presents a range of new complications. Magnetic fields, turbulence driven by gravity or supernovae, radiation pressure, and/or dis-equilibria can all play a role in determining the final distribution of H i gas in the Milky Way—particularly perpendicular to the Galactic plane. It is not clear whether these are a significant source of uncertainty in deriving the Galactic potential from the H i field, but Levine et al (2008) have recently pointed out that the vertical gradient in the H i rotation curve does seem to be too large to be explained by gravity alone.

5.7. Beyond the 1D approximation

The most recent measurement of ρ_dm from BR13 moves beyond the 1D assumption for the first time to constrain the disc surface density over a range of radii 4 kpc ≲ R ≲ 9 kpc. Similarly to an earlier study of Geneva Copenhagen Survey and RAVE data by (Binney 2012b), they use the quasi-isothermal distribution function and torus machinery described in section 3.1.3. An interesting difference is that Binney (2012b) slices the data into different stellar populations of a given age, each of which is assumed to be a quasi-isothermal, whereas BR13 slice on chemical abundance (MAPs). To the extent that abundance is an indicator of age, these two choices are rather similar (e.g. Haywood et al 2013).

Both Binney (2012b) and BR13 find that the Milky Way disc is close to maximal (where its contribution to the rotation curve is as large as could be allowed by the rotation curve data). Given that these two studies use rather different data sets, this does lend support to their model fits. If the Milky Way disc can really be sliced into quasi-isothermal MAPs as advocated by BR13, or narrow quasi-isothermal age intervals as advocated by Binney (2012b), then this is a truly remarkable result. How can our Galactic disc conspire after a cosmic time of violent mergers and gas accretion to have such a simple dynamical structure? This is particularly puzzling given that there is a known and relatively recent merger in the form of the Sagittarius dwarf (Ibata et al 1994). More on this, next.

5.8. Disequilibria

I have assumed so far throughout this review that the Milky Way disc is in dynamic equilibrium such that the partial time derivative of the distribution function can be neglected. The very presence of a bar and spiral arms in the Milky Way, along with evidence for 'moving groups' in the Solar neighbourhood all point towards disequilibria (e.g. Blitz and Spergel 1991, Dehnen 1998, Bissantz and Gerhard 2002, Antoja et al 2011). As we have seen in section 4.2, however, this sort of disequilibria is not a major source of systematic error, at least for current data quality. Interestingly, however, Widrow et al (2012) have recently reported evidence for a different type of disequilibria that might be more problematic. Using SDSS star counts and kinematics, they find evidence for vertical waves in the disc at R₀. This has been largely confirmed—at least in the stellar kinematics—by the RAVE survey (Williams et al 2013). I show the key plots that define the asymmetry in figure 10. Such density waves could be illusory, resulting from complex survey selection functions—after all, the agreement between SDSS and RAVE is only qualitative rather than quantitative (Williams et al 2013). However, if such waves are real, it raises some interesting questions. I discuss these, next.

**Figure 10.** Evidence for disequilibria in the Milky Way, and a possible culprit (figure reproduced from Gómez *et al* 2013). The top panels show residual vertical star counts: Δ = (n(z) − 〈n〉(z))/〈n〉(z) for SDSS data (grey dots taken from Widrow *et al* 2012); and for a simulated Milky Way disc, recently bombarded by the Sagittarius dwarf (blue/red data points with Poisson errors marked). The bottom plots show the similar results for the vertical velocity averaged in small bins in z. The left plots show the raw simulation data; the right, the same phase shifted to better match the observations. Notice that the simulations and observations show qualitatively similar wave-like structures in both density and velocity.
Download figure:
Standard image High-resolution image

What could have caused the perturbation?

Sánchez-Salcedo et al (2011) recently considered the longevity of perturbations to a simple 1D model of the Milky Way disc (see also a similar analysis in Widrow et al 2012). They showed that any excited modes decay rapidly on the order of ∼10 vertical crossing times for the disc. For the Milky Way thin disc this is ∼20h/σ_z ∼ 200 Myrs which is extremely short in astronomical terms. However, Purcell et al (2011) present a numerical model of the Sagittarius merger where it has undergone three close pericentric passages, the latest being close to the present time. Such continued and current interaction with the disc could excite modes that are still present today. Indeed, Gómez et al (2013) show that this same simulation leads to vertical modes in the disc that are similar to those found by Widrow et al (2012) and Williams et al (2013) (see figure 10). This is compelling but not necessarily conclusive. There are a number of unknowns in the Sagittarius modelling, not least the mass and properties of the progenitor system (see section 5.5). These exquisite stream data do, however, hold out some hope that we can quantitatively predict the effects of such a merger on the Milky Way disc in the not too distant future.

Alternatively, we need not necessarily appeal to Sagittarius to perform the perturbations. Our current ΛCDM cosmological model (section 2) has long predicted the presence of thousands of massive satellites orbiting the Milky Way that are not observed as visible galaxies (Moore et al 1999, Klypin et al 1999). A fascinating possibility is that such satellites really are there, orbiting as ghostly dark halos that constantly perturb the Milky Way disc.

Finally, it is possible that the perturbations are induced—at least in part—by the Milky Way spiral arms. Faure et al (2014) have recently used 3D test particle simulations to show that these can also induce vertical modes in the disc similar to those reported by the SDSS and RAVE surveys, though it is unclear if such a mechanism can explain the asymmetry in stellar density at large heights above the disc plane reported by Widrow et al (2012).

How do disequilibria bias ρ_dm measurements?

Ideally, we should address this question by performing the sort of mock data tests outlined in 4.2 but applied to discs that have been recently perturbed. This exercise remains to be performed and is certainly beyond the scope of this present review. However, Widrow et al (2012) do perform some simple 1D experiments of a perturbed disc. Like Sánchez-Salcedo et al 2011, they find that oscillations damp after ∼200 Myrs, but they also consider the associated oscillations excited in the vertical force K_z. For oscillations that match the amplitude of perturbations in the SDSS data (grey dots, figure 10), Widrow et al (2012) find a vertical force oscillation of δ|K_z|/2πG ∼ 2 M_⊙pc⁻² at z ∼ 1 kpc. Since this is about 10% of the expected dark matter contribution at this height, such disequilibria are unlikely to have a major impact on measures of ρ_dm.

Note that such a small effect should not be surprising. Consider some wave-like perturbation to the disc density ρ → ρ(1 + δ), with:

$\begin{equation} \delta \sim \delta _0 \sin (2\pi z/\lambda ). \end{equation} \tag{ 70 }$

Assuming an exponential disc ρ = ρ₀exp ( − |z|/z₀), this gives a perturbation to the vertical force at z ≫ z₀:

$\begin{equation} \Delta _K = \frac{\delta |K_z|}{|K_z|} = \frac{\delta \Sigma _z}{\Sigma _b} = \delta _0 \int _{0}^{\infty } {\rm e}^{-x} \sin \left(2\pi \frac{z_0}{\lambda } x\right)\,{\rm d}x. \end{equation} \tag{ 71 }$

Assuming δ₀ ∼ 0.1, z₀ ∼ 0.3 kpc and λ ∼ 1 kpc (Widrow et al 2012, and see figure 10), this gives Δ_K ∼ 0.04. For a disc surface density of Σ_b = 55 M_⊙pc⁻², this gives δ|K_z|/2πG = δΣ_z ∼ 2.2 M_⊙pc⁻², which is in excellent agreement with the number reported in Widrow et al (2012).

5.9. Gaia: precision measurements of ρ_dm

Over a mission lifetime of ∼9 years, the Gaia satellite will catalogue the positions and velocities of ∼ a billion stars in our Galaxy (Perryman et al 2001). Such a dataset will be transformative for measures of ρ_dm. Figure 11 shows the expected distance error for stars of different spectral type as a function of distance d and apparent magnitude m_V (equation (56)) at the end of the Gaia mission. I have plotted the spectral types: B0V, F0V, G0V, K0V and M0V, as in figure 5 (for a definition of spectral type and apparent magnitude see footnote 11). For K-dwarf stars, Gaia will measure distance to better than 10% accuracy out to ∼1 kpc even for the very faintest stars. Just considering K stars over 5.5 < M_V < 7.5, this amounts to some ∼18 × 10⁶ stars¹⁸. The position and proper motions for these stars will be similarly accurate out to this distance (e.g. Brown 2013).

**Figure 11.** The distance accuracy of the Gaia mission at completion. The coloured lines show the apparent magnitude *m_V* of stars of different spectral type (see figure 5 for a definition of spectral type) as a function of distance d. Distance accuracy horizons of 0.1, 1 and 10% are marked by the dashed lines. This figure was produced using https://pypi.python.org/pypi/PyGaia/ and a template from Anthony Brown.
Download figure:
Standard image High-resolution image

With such data—available for bright stars over a large volume around the Sun—we will be pushed beyond the simple 1D models typically employed to date. Including all of these stars in the analysis and splitting by spectral type and/or abundance (e.g. Bovy and Rix 2013), we will have an enormously valuable dataset for measuring ρ_dm, largely free from complex survey selection functions. Some problems will remain, however. An accurate 3D dust model for the Galaxy will be necessary to ensure that stars are not mis-classified (e.g. Marshall et al 2010). Ideally, we should fit such a dust model simultaneously alongside the dynamical model fit. Furthermore, it may be preferable to fit a full chemo-dynamic model to the Gaia data, rather than splitting on spectral type or abundance. This has several advantages: (i) all data may be used simultaneously; (ii) prior information about the inter-relationship between different stellar types could help to break model degeneracies (e.g. Binney 2013); and (iii) the result of such a fit would give us much more information than just the Galactic potential or ρ_dm—it would simultaneously constrain the formation history of these stars within our Galaxy. In the end, however, all of these different approaches are likely to be complementary. For the question of interest here—the measurement of ρ_dm—the cleanest approach of fitting volume complete stellar tracers seems like a good place to start.

6. Conclusions

I have presented a review of nearly a century of measurements of the mean density of dark matter near the Sun: ρ_dm. We are about to enter a golden age where such measurements become truly precise. Such accurate measures encode valuable dynamical information about our Galaxy, and are also of great importance for 'direct detection' dark matter experiments. I have reviewed theoretical expectations for ρ_dm, its laboratory extrapolation $\tilde{\rho }_\mathrm{dm}$ and the local velocity distribution function of dark matter f(v) (that is important for direct detection experiments). I presented the key theory behind measurements of ρ_dm in the Milky Way, and I collated both historical and modern measures. Finally, I looked ahead to what will soon be possible with the Gaia satellite. My key conclusions are as follows:

Numerical simulations of ρ_dm

State of the art dark matter only cosmological simulations make accurate predictions for the local phase space distribution function of dark matter (its mean density ρ_dm and velocity distribution function f(v)) on scales of ≳ 20 pc. Unresolved structure on smaller scales is unlikely to affect the conclusions of these simulations.
Although unresolved structures in the simulations are not likely important, baryonic processes are. Gas cooling, star formation and stellar feedback during galaxy formation likely rearrange the dark matter distribution in galaxies, even if the dark matter and baryons interact only via gravity. Baryons act to make halos oblate and aligned with the central disc, at least out to ∼10 disc scale lengths; to transform central dense dark matter cusps into cores (if stellar/black hole feedback is strong enough); and—through biased accretion—to lead to the formation of an accreted dark matter disc. Each of these processes affects the expectation values of ρ_dm and f(v) near the Sun.

Measurements of ρ_dm

A key source of uncertainty on ρ_dm is the baryonic contribution to the local dynamical mass: Σ_b. I have compiled a new measurement from literature data: Σ_b = 54.2 ± 4.9 M_⊙pc⁻², where the dominant source of uncertainty is in the H i gas contribution. Improving our determination of Σ_b warrants renewed attention.
Homogenizing Σ_b across different studies (using the above value), I find excellent agreement between different groups. In table 4, I have compiled a list of recent measures of both ρ_dm (calculated locally) and ρ_{dm, ext} (extrapolated from the rotation curve assuming spherical symmetry). Each of these studies is complementary. One—G12—uses a very clean dataset with a simple selection function, but poorer sampling (∼2000 stars). Three—S12, Z13, and BR13—use SDSS data with significantly improved sampling (∼10 000 stars), but with a significantly more complex data selection function. The latter studies have smaller formal errors, but present a greater challenge when estimating systematic errors.
Comparing the above measures of ρ_dm with spherical extrapolations from the Milky Way's rotation curve (ρ_{dm, ext} = 0.005 − 0.015 M_⊙pc⁻²; 0.2 − 0.56 GeV cm⁻³), the Milky Way is consistent with having a spherical dark matter halo at R₀ ∼ 8 kpc. The latest measurements of ρ_dm from SDSS appear to favour little halo flattening in the disc plane, suggesting that the Galaxy has little or no accreted dark matter disc and a correspondingly quiescent merger history (see figure 9). I caution, however, that this result hinges on there being no large systematics that remain to be uncovered in the SDSS data, and on the local baryonic surface density being Σ_b ∼ 55 M_⊙pc⁻².
There is a continuing need for detailed tests of our methodologies on dynamically realistic mock data. I illustrated this using both simple 1D tests and full 6D mock data based on an N-body simulation of the Milky Way. This latter reveals the surprising result that seemingly sensible assumptions about the distribution function of tracer stars in the disc can lead to significant systematic biases on ρ_dm. Such model systematics will likely become a dominant source of uncertainty on ρ_dm in the Gaia era.
Two groups have recently found evidence for disequilibria in the Milky Way in the form of vertical density/velocity waves in the Milky Way disc stars. I showed that, at the currently quoted wave amplitudes, these contribute a systematic error on ρ_dm of order ∼10%. This is not likely to be the dominant source of uncertainty on ρ_dm even with Gaia quality data. However, if such oscillatory modes persist as the data continue to improve, they will provide us with a brand new probe of Galactic structure.

Acknowledgments

This work has made use of the IAC-STAR synthetic CMD computation code. IAC-STAR is supported and maintained by the computer division of the Instituto de Astrofísica de Canarias. I would like to thank the Gaia Project Scientist Support Team and the Gaia Data Processing and Analysis Consortium (DPAC) for providing the PyGaia package that was used to make figure 11. I would like to acknowledge support from SNF grant PP00P2_128540/1 and ESF funding for the Gaia Challenge conference where much of the work in section 4.1 was conceived and undertaken. I would like to thank the Oxford University Press publication Monthly Notices of the Royal Astronomical Society and each of the individual authors concerned for the permission to reproduce material that contributed to figures 3, 4, 5, 7, 8 and 10. Finally, I would like to thank Chris Flynn, Silvia Garbari, Paul McMillan, Andrew Pontzen, Jo Bovy, George Lake and the anonymous referees for reading through early drafts of this review and for very helpful feedback.

The local dark matter density

Article metrics

Submit

Permissions

Share this article

Author e-mails

Author affiliations

Dates

Peer review information

Abstract

1. Introduction

2. Theoretical expectations for ρdm and its laboratory extrapolation \tilde{\rho }_\mathrm{dm}

2.1. The cosmological model

2.2. Dark matter only (DMO) simulations

2.2.1. Key predictions from DMO simulations

The spherically averaged radial density profile

The shape of dark matter halos

The local dark matter density

The local velocity distribution function of dark matter

2.2.2. Extrapolating from ρdm to ˜ρdm

Streams and Caustics.

Unresolved substructure.

Solar system capture and scattering.

2.3. The effect of baryons

2.3.1. Qualitative predictions

Most of the local mass near the Sun is in baryons, not dark matter.

Cusp-core transformations and halo shape change.

The formation of a 'dark disc'.

2.3.2. Hunting for the Milky Way's dark disc

2.3.3. Towards ab-initio simulations including baryonic physics

3. Mass modelling theory

3.1. Distribution function modelling

3.1.1. 'Schwarzschild' or orbit modelling

3.1.2. Made to Measure (M2M)

3.1.3. Action modelling

3.2. Moment methods: the Jeans equations

3.3. The 1D approximation

3.4. A 1D distribution function method

3.5. The mass model

3.5.1. The rotation curve prior

3.6. The choice of tracer

Selecting stars in 'equilibrium'.

Selecting stars that reach to high {\boldsymbol z}.

Obtaining a good measure of distance.

Obtaining good velocities.

The advantage of a 'volume complete' sample.

Ensuring consistency.

Modern survey data.

3.7. Errors and model degeneracies

Degeneracies.

Observational Errors.

3.8. Gas as a tracer of the potential

4. Tests using mock data

4.1. Simple 1D mock data

4.1.1. The effect of sampling error and priors

4.1.2. Data high above the disc plane

4.1.3. Multiple tracer populations

4.2. N-body mocks

5. Measurements of ρdm and ρdm, ext

5.1. Nearly a century of measurements of ρdm

5.2. The latest local measures

5.3. The latest global measures

5.4. Constraints on the local Milky Way halo shape and an accreted dark disc

5.5. Independent measures of the Milky Way halo shape

5.6. Constraints from H i gas

5.7. Beyond the 1D approximation

5.8. Disequilibria

What could have caused the perturbation?

How do disequilibria bias ρdm measurements?

5.9. Gaia: precision measurements of ρdm

6. Conclusions

Numerical simulations of ρdm

Measurements of ρdm

Acknowledgments

Footnotes

2. Theoretical expectations for ρ_dm and its laboratory extrapolation $\tilde{\rho }_\mathrm{dm}$

2.2.2. Extrapolating from ρ_dm to ˜ρ_dm

Selecting stars that reach to high ${\boldsymbol z}$ .

5. Measurements of ρ_dm and ρ_{dm, ext}

5.1. Nearly a century of measurements of ρ_dm

How do disequilibria bias ρ_dm measurements?

5.9. Gaia: precision measurements of ρ_dm

Numerical simulations of ρ_dm

Measurements of ρ_dm