Simulating Wind-Blown Nebulae from Single and Binary Massive Stars

Winds from massive stars expand supersonically into their surroundings, creating dynamic and fascinating nebulae that can give us insight into physical processes in interstellar plasma, and into the evolutionary history of the stars. Around single stars, parsec-scale bubbles such as bow shocks and ring nebulae are formed, whereas in colliding-wind binary (CWB) systems the high wind density produces intense time- and space-dependent emission across the electromagnetic spectrum from radio to gamma-rays. This contribution summarizes some recent results from 3D MHD modelling of bow shocks around runaway stars such as ζ Oph, and of the wind-collision zone of the CWB systems WR140 and WR21a. A resolution study of 3D simulations of bow shocks shows that X-ray emission from the shocked wind is time-variable and that converged results can be obtained once the Kelvin-Helmholtz instability at the contact discontinuity is resolved. Simulations of the CWB system WR140 show that inverse-Compton cooling of the shocked plasma can trigger runaway cooling when the orbit is near periastron, producing strong compression and dynamical instabilities. This sharply reduces the hard-X-ray emission around periastron, in agreement with observations. Scaling tests of the simulation software pion are also presented for a model of the CWB system WR21a run on up to 8192 cores using the HPC system Karolina.


Introduction
Massive stars drive strong winds into the interstellar medium (ISM), generating shockwaves and often producing bright nebulae including both thermal and non-thermal emission.The nebulae can be used to probe shock physics as well as to constrain mass-loss rates of stars [1], the evolutionary history of post-main-sequence stars [2,3], and the effects of radiative heating on winds [4].While most massive stars form in binary or triple systems within star clusters, some of these are ejected and disperse through the Galaxy [5].
Bow shocks form around massive stars when the relative motion between star and interstellar medium (ISM) exceeds the ISM sound speed, because of the hydrodynamic interaction between stellar wind and ISM.A two-shock structure forms, divided by a contact discontinuity and swept into a bow shape by the dynamical pressure of the relative motion [6].For hot stars the (inner) reverse shock is usually adiabatic whereas the (outer) forward shock is often radiative [7,8].Magnetic fields can impact the structure and stability of bow shocks by limiting the compressibility of the gas and by inhibiting instability growth at the contact discontinuity [9][10][11].Direct detection of the shocked wind through thermal X-ray emission has long been predicted [12,13] but, despite many searches [14,15] [11,16].This detection is important for measuring the energy content of wind bubbles, and to constrain effectiveness of physical processes such as thermal conduction and instability-driven turbulent mixing.These are the primary energy loss mechanisms for the hot phase of the ISM [17][18][19].
Colliding-wind binaries (CWBs) consist of two massive stars, each of which drives a strong wind, and the wind-wind collision between the two stars emits radiation at a level detectable above the stellar emission (usually at radio, optical, X-ray and sometimes γ-ray energies) [e.g.20,21].With typical orbital periods of days to a few years, they provide ideal laboratories to study shocks with predictable and time-dependent properties [22], particle acceleration [23] and dust formation [24].
The work presented here uses the radiation-magnetohydrodynamics (MHD) software pion [25], with some upgrades described in [26].pion solves the Euler or ideal-MHD equations on a uniform mesh using the finite-volume method with an integration scheme accurate to 2nd order in time and space.Static mesh-refinement (nested grids) is used to focus resolution and computing resources on regions of interest; this method is quite efficient when such regions can be predicted ahead of time [27].Parallelisation is via a hybrid MPI+OpenMP scheme using MPI for domain decomposition and OpenMP threads operating on 1D columns of cells in each sub-domain.The stellar wind boundary condition consists of a sphere of radius at least 7 cells on the coarsest level containing the boundary.Mass, momentum and energy flux into the domain from this sphere are calculated based on the adopted mass-loss rate and stellar-wind velocity of the star, with an option for wind acceleration according to a commonly used β-law prescription [26].For binary systems a simple N-body integrator calculates the orbital motion of the stars and the wind boundary is updated whenever the star moves by 0.1 cells.

Bow shocks around massive stars
The direct detection of the shocked stellar wind in bow shocks around hot stars has proven difficult because of the low density of the emitting gas.So far the only detection is of the hot plasma around the closest O star to Earth, ζ Oph, using Chandra observations [11,16].The observed soft, thermal X-ray emission has a temperature (T ∼ 0.2 keV) well below that of the shocked wind, and is thought to arise from gas near the contact discontinuity (CD) with temperature intermediate between the shocked wind and ISM.This layer may be produced by thermal conduction [18] and/or turbulent mixing [7,17] at the wind-ISM interface.Radiative cooling in this X-ray emitting layer is thought to be the primary energy-loss mechanism for the hot phase of the ISM [e.g.19], and so characterising its properties are important for our understanding of the multi-phase ISM.
Predictions for X-ray emission in the Bubble Nebula (NGC 7635) using 2D hydrodynamical simulations of this bow shock [13] proved to be overoptimistic (no emission was detected by [15]) for reasons that are still not clear.Conversely, [11] made 3D MHD simulations of the bow shock of ζ Oph, obtaining synthetic emission maps that show weaker thermal X-ray emission than is actually observed.The 2D hydrodynamic simulations [13] showed vigorous mixing of wind and ISM gas driven by the Kelvin-Helmholtz (KH) instability at the CD, whereas the 3D MHD simulations showed a smooth CD. [11] speculated that the combination of MHD inhibition of the KH instability [28], together with the diffusive HLL solver used for MHD, resulted in much weaker mixing and therefore fainter X-ray emission.One should distinguish between the former (physical) mechanism and the latter (purely numerical) effect in order to understand the conflicting results from modelling NGC 7635 and ζ Oph.
To this end, we performed a numerical study of a bow shock around a runaway massive star using different sets of equations (Euler, ideal MHD), solvers (HLL, HLLD), dimensionality (2D, 3D) and spatial resolution (spanning a factor of 8 in cell diameter).The full results will be presented in a paper currently in preparation; here we show some initial results of the study.) through the planes y = 0 and z = 0 for low-resolution simulation 1 (left two panels) and high-resolution simulation 3 (right two panels) at the times indicated.The star is at the origin, and the reference frame is such that the star is stationary on the grid, with ISM flowing from right to left at speed v .The stellar wind is mostly low density (blue) whereas ISM is high density (green/yellow), with a sharp interface at the CD.
A simulation is set up of a massive star moving at v = 30 km s −1 in the x direction through a uniform ISM with density ρ = 10 −23 g cm −3 .The mass-loss rate of the star is Ṁ = 10 −7 M yr −1 and wind terminal velocity v ∞ = 2000 km s −1 ; the star has surface rotation velocity v rot = 100 km s −1 , a radius of 10 R and a surface split-monopole magnetic field strength of 1 G.We assume the star emits enough ionizing photons to keep the domain photoionized, leading to a minimum equilibrium temperature around 8000 K in the ISM.For the MHD simulations the ISM magnetic field is set to a uniform value of B 0 = (1, 4, 0) µG, oriented almost perpendicular to v .A hybrid HLLD/HLL Riemann solver is used, such that HLL is used at strong shocks and when the density ratio between left and right states is more than 5 [26].A nested grid is set up with 3 levels of refinement focused on the bow shock, such that the finest level contains both the star and the apex of the bow shock.Simulation 1 has resolution 128 3 grid cells per level, 2 has 256 3 and 3 has 384 3 .A highresolution 2D simulation is overplotted (cyan dotted line) for comparison (with B 0 v ).
The data at t < 0.12 Myr show effects of the initial conditions.Simulations 1 and 2 settled down to a relatively stable state with a smooth CD, similar to the results of [11] (simulation 1 is shown in the left 2 panels of Fig. 1).In contrast, the (highestresolution) simulation 3 shows the development of KH instability at the CD, with growing amplitude waves seeded near the apex of the bow shock and moving downstream.This entrains significant quantities of ISM gas into the hot shocked wind, reducing its temperature, increasing its density, and hence increasing the X-ray emission.The symmetry axis of the termination shock is noticeably tilted with respect to v in the x-z plane; this is a temporary feature driven by the KH vortices and at other times it is tilted in the opposite sense.As well as mixing wind and ISM gas, the turbulent flow also inhibits flow of shocked wind through the −x boundary, increasing the pressure (and density) of the shocked-wind region in the downstream.This forces the wind termination shock closer to the star in the downstream direction than in simulation 1. Fig. 2 shows the time evolution of the X-ray luminosity from the full simulation domain for simulations 1-3.Simulation 1 has soft-X-ray luminosity of L X ∼ 3 × 10 29 erg s −1 with weak fluctuations on ∼ 0.1 Myr timescales.Simulation 2 has somewhat stronger fluctuations, up to a factor of 2 in L X , centred around the same value as simulation 1. Simulation 3 shows even larger fluctuations, up to a factor of 5 from maximum to minimum, driven by the periodic growth of KH vortices that induce strong mixing in the turbulent wake downstream from the star.Compared with a higher-resolution 2D simulation (cyan dotted line) the amplitude and timescale of the X-ray variability is quite consistent, albeit the simulations are not identical because in 2D the ISM magnetic field is initially parallel to v .The contrast between maximum and minimum emission is shown in Fig. 3, where we plot the X-ray surface brightness, projected along the ẑ direction, at two different times.In the right panel the regions of bright emission correspond to vortices that have entrained ISM gas into the wind bubble, producing soft-X-ray emitting plasma.One may expect that adding time-dependence to the ISM properties (e.g., turbulence and overdense clouds) will also induce further variability in observable properties [e.g.29].
We can draw two main conclusions from these simulations: (i) The simulations of [11] had insufficient spatial resolution to capture the turbulent mixing induced by KH instability in the wake behind the bow shock of ζ Oph, and this is most likely the reason why they predict weaker X-ray emission than is observed.(ii) Even in the absence of ISM inhomogeneity and turbulent flows, the X-ray emission of the shocked wind is time-variable, with fluctuations of up to a factor of 5-10 on a timescale of ∼ 0.05 Myr.Comparison of simulations with observations therefore requires a statistical sample of bow shocks for conclusions to be drawn about the effectiveness of this turbulent mixing process in nature as compared with simulations.

Colliding-wind binaries
In contrast to the large but relatively faint bow shocks around runaway massive stars, CWBs are typically point sources (except for interferometric observations) with intense multiwavelength emission, both thermal and non-thermal.The wind-wind interaction shares many characteristics with bow shocks, including the bow shape that arises when the winds have unequal momentum density.The gas density in the shocked plasma is orders of magnitude larger, and characteristic size orders of magnitude smaller, than bow shocks, and the shocked region is swept into a spiral shape by orbital motion.Radiative processes are very similar between the two classes of object, but in CWBs the radiation and matter densities are usually large enough that local absorption plays an important role in the observed emission.
In [26] we investigated the archetypal CWB WR 140, consisting of a carbon-type Wolf-Rayet star (WC) and a main-sequence O star in an eccentric 7.9 year orbit that has separation ranging from 1.4 au at periastron to 10 times that at apastron.The wind-wind interaction was simulated for about 300 days centred on periastron, to model the strong dip observed in the X-ray lightcurve around periastron [22].Using 3D MHD simulations with 7 levels of static mesh-refinement, we showed that such a dip is not produced by a standard calculation because the shocked wind of both stars remains largely adiabatic.The shocked layers remain dynamically stable and the X-ray luminosity through periastron follows the expectations for an adiabatic wind-collision zone, modified slightly to take account of wind acceleration.Some time ago it was proposed that inverse-Compton cooling of the thermal plasma could be important in CWBs [30], but this cooling process has been largely ignored in modelling efforts for the past 30 years.We calculated that this should be important in WR 140 around periastron, using analytic arguments and 2D simulations of wind-wind collisions.This extra radiative cooling, that is effective when the shocks are within a few stellar radii of a luminous star, is enough in the WR 140 system to trigger runaway radiative cooling in the shocked layer, and the associated dynamical instability seen in radiative shocks.
Fig. 4 shows two snapshots of a 3D MHD simulation of the WR 140 system about 17 days before periastron (left panel) and at periastron (right panel).Gas density is plotted in the upper panels, and 2-10 keV X-ray luminosity in the lower panels.In the left-hand panel the separation of the two stars is such that neither IC nor collisional cooling processes are effective, and so the bow shock appears smooth and dynamically stable.Once the separation of the stars crosses a critical threshold, where the cooling timescale becomes shorter than the advection timescale (the time for shocked gas to leave the wind-collision zone), then the shocked layer undergoes runaway cooling, losing pressure support and collapsing to a thin sheet that is dynamically unstable to the thin shell instability [31].While we do not spatially resolve the physical sheet thickness, the vigorous instability nevertheless induces growing oscillations that deform the shock fronts.This means that much of the shock experiences a normal flow velocity reduced by the cosine of the deformation angle, resulting in a lower post-shock temperature and weaker hard-X-ray emission.
This dip is one of the most striking features of the observed hard-X-ray lightcurve seen over at least 3 orbits [22].Our results can qualitatively match the data in terms of the sharpness and asymmetry of the dip, but compared with observations the onset of the increase is too late in the simulations, and the decrease in emission is not deep enough.Further work is required to investigate these discrepencies, although the qualitative agreement is very encouraging.In particular, radiative transfer calculations to take account of the variable local absorption in the WR wind would be very valuable, because the correction of the observed X-ray flux for absorption only considers a single value of the column density (the source is unresolved).

Parallel scaling: WR 21a as a case study
The binary system WR 21a is one of the most massive in the Galaxy, with component masses of 93 M and 53 M for the WR and O star, respectively [32].It is also one of the most X-ray luminous [33], a close binary with orbital period only 31.7 days and significant eccentricity, e = 0.695.We used this binary system as as the problem setup for testing the scaling of pion on the IT4I HPC system Karolina through a EuroHPC benchmarking project 1 .
A 3D hydrodynamics simulation was set up with 8 refinement levels, each containing 320 2 × 160 grid cells, with a domain size [8.192, 8.192, 4.096] × 10 14 cm on the coarsest level, centred on the origin.The system centre-of-mass is at the origin, the orbital plane is z = 0, and the stellar separation at periastron is 4.7 × 10 12 cm.The simulation was integrated from periastron to close to apastron and a snapshot was saved.Starting from this snapshot, the simulation was then restarted using 1, 2, 4, 8, 16, 32, and 64 nodes, each of which was run with 1, 2, 4, and 8 OpenMP threads per MPI process, for total core counts of 128 -8192 cores (Karolina has 128 cores per node).The simulation was run in each case for 199 coarse-level timesteps (25 472 on the finest level) to test the strong scaling.
Furthermore, lower-resolution simulations with the same setup were run to the same point in time, using 128 2 × 64, 160 2 × 80, 192 2 × 96 and 256 2 × 128 grid cells per level.These were run   Speedup, is shown relative to a simulation using 1 node with 128 MPI processes and 1 OpenMP thread per process.Curves are shown for N th = 1, 2, 4, and 8 threads per MPI process.For N th = 4 the number of ghost cells exceeds the number of grid cells for N > 2048 cores.with 1, 2, 4, and 8 nodes, and compared with the highest-resolution simulation run on 16 nodes, in all cases using 8 OpenMP threads per MPI process, i.e., 16 MPI processes per node.These simulations were used to test the weak scaling of pion.It is not a perfect weak-scaling test because the amount of work per core/node was not constant (generally the cell counts in pion should be divisible by 32 in each dimension), but it gives a good baseline measurement with which we can predict the performance at even higher resolution on a similar problem, which is the main aim of the calculation.
The starting snapshot for the strong-scaling test, and the weak-scaling test on 16 nodes, is plotted in Fig. 5, where gas density is shown in the orbital plane.The larger and more massive WR star is towards the top, and the O star with the weaker wind is below.The wind-collision region is close to the stability limit, with weak KH instability along the contact discontinuity providing the seed for overdensities that locally undergo runaway cooling, although the bulk of the shocked plasma behaves adiabatically.The lower-resolution simulations look similar but with less instability as resolution degrades.
Results for weak scaling are plotted in Fig. 6, where the upper panel shows the time required per cell-update per core, as well as the time spent in communication and in computation.The lower panel shows the relative amount of work per node/core for the different simulations, which is approximately constant.The upper panel shows that performance is consistent from 1 node (128 cores) up to 16 nodes (2048 cores), with wall time and compute time basically constant, and a weak trend of increasing time on communication with increasing node count, from 10% of the total runtime on 1 node to 20% on 16 nodes.An extrapolation of this performance would imply that higher resolution simulations could run on up to 8192 cores with < 30% of runtime on communications.
Strong scaling results are plotted in Fig. 7, where we show the speedup obtained relative to a simulation with 1 node (128 cores), 128 MPI processes and 1 OpenMP thread per process.The efficiency remains above 50% from 1 to 16 nodes (2048 cores), which is the point at which the number of ghost cells equals the number of grid cells for each MPI process (when run with 4 OpenMP threads per process).For 32 and 64 nodes there is more ghost-cell calculation than grid-cell calculation, and so we don't expect high efficiency.
The strong-scaling plot shows that running pion is most efficient with 4 OpenMP threads per MPI process: for lower thread counts the scaling drops off faster whereas for larger thread counts the overall speed is lower.After some investigation we discovered that the communication routines themselves are the bottleneck for large OpenMP thread counts, because the boundary communication is not yet multithreaded.The next step is to implement this multithreading so that pion can run efficiently on 8-20 threads per MPI process.

Summary
Computational research on wind-driven nebulae around single and binary massive stars is advancing rapidly, enabled by mature and robust simulation codes and increasing computational resources.This contribution summarizes published and ongoing work from my research group on bow shocks around massive stars, wind-wind collisions in binary systems, and code development and testing.A resolution study of 3D simulations of bow shocks shows that X-ray emission from the shocked wind is strongly time-variable and that results consistent with higher-resolution 2D simulations can be obtained once the KH instability at the contact discontinuity is resolved.This may contribute to resolving discrepencies between previous simulations and X-ray observations: the overprediction of X-rays from simulations of the Bubble Nebula [13,15] and underprediction from simulations of the bow shock of ζ Oph [11,16].With considerably increased computational cost, similar simulations could be run including diffusive processes such as thermal conduction [34] to attempt to resolve the physical diffusion length-scale and achieve numerically converged results.
Simulations of the CWB system WR 140 show that inverse-Compton cooling of the shocked plasma can trigger runaway cooling when the orbit is near periastron, producing strong compression and dynamical instabilities.This sharply reduces the hard-X-ray emission around periastron, in agreement with observations.
Scaling tests of pion were also presented for a model of the CWB system WR 21a run on up to 8192 cores, showing good strong scaling up to the point where the number of ghost cells equals the number of grid cells.Weak scaling was tested from 128 to 2048 cores and performance is consistent across this range.These scaling results show that pion can run efficiently for simulations of CWB systems on Tier-0 supercomputing systems.

Figure 1 .
Figure1.Slices of log 10 ρ/(g cm −3 ) through the planes y = 0 and z = 0 for low-resolution simulation 1 (left two panels) and high-resolution simulation 3 (right two panels) at the times indicated.The star is at the origin, and the reference frame is such that the star is stationary on the grid, with ISM flowing from right to left at speed v .The stellar wind is mostly low density (blue) whereas ISM is high density (green/yellow), with a sharp interface at the CD.

Figure 2 .
Figure 2. Soft X-ray (0.3-2 keV) luminosity of simulations 1 (blue solid line), 2 (red dashed line) and 3 (black dotdashed line) as a function of simulation time.A highresolution 2D simulation is overplotted (cyan dotted line) for comparison (with B 0 v ).The data at t < 0.12 Myr show effects of the initial conditions.

Figure 3 .
Figure 3. Projected soft-X-ray emission from simulation 3 at simulation times corresponding to weak (left) and strong (right) X-ray emission, separated by 37 000 years.The projection is along the ŷ direction, and the surface brightness is plotted in the units indicated in the plot title.Contours of radio emission measure (integral of electron density squared along the line of sight) are overplotted on a linear scale from the peak value to 10% of the peak, tracing the shocked ISM.

Figure 4 .
Figure 4. Snapshots of gas density shortly before (left) and at (right) periastron for the simulation of the WR 140 system.Gas density in the orbital plane z = 0 is shown on a logarithmic scale with the WR star being bright yellow and the O star a black circle.The hard-X-ray lightcurve is also plotted in the lower panels, and the vertical line shows the time of the snapshot.

Figure 5 .
Figure 5. Snapshot showing a slice of 3D simulation of the CWB system WR 21a close to apastron, in the orbital plane z = 0. Log of gas density is plotted on a logarithmic scale.The lower panel shows the X-ray luminosity of the wind-collision zone as a function of orbital phase, and the vertical line shows the current phase.The system centre of mass is at the origin, the stars are orbiting counter-clockwise, and not all of the simulation domain is shown.

Figure 6 .
Figure 6.Weak scaling of pion for simulations of the WR 21a system.The upper panel shows the time required per cell-update per core as a function of the number of nodes.The time required for computation and communication is also plotted.The lower panel shows the amount of work per core for the simulations (it was not possible to make this exactly equal in all cases because the number of cells in each dimension should be divisible by 32).

Figure 7 .
Figure 7. Strong scaling of pion for simulating the CWB system WR 21a over 199 coarse-level timesteps with 320 2 × 160 grid cells per level and 8 refinement levels.Speedup, is shown relative to a simulation using 1 node with 128 MPI processes and 1 OpenMP thread per process.Curves are shown for N th = 1, 2, 4, and 8 threads per MPI process.For N th = 4 the number of ghost cells exceeds the number of grid cells for N > 2048 cores.
, has only been detected in the bow shock of the closest 15th International Conference on Numerical Modeling of Space Plasma Flows Journal of Physics: Conference Series 2O star to Earth, ζ Oph