Blacklight: A General-relativistic Ray-tracing and Analysis Tool

Published 2022 September 6 © 2022. The Author(s). Published by the American Astronomical Society.
Citation: Christopher J. White 2022 ApJS 262 28. DOI: 10.3847/1538-4365/ac77ef


Abstract

We describe the Blacklight code, intended for postprocessing general-relativistic magnetohydrodynamic simulation data. Beyond polarized ray tracing of synchrotron radiation, it can produce a number of outputs that aid in analyzing data sets, such as maps of auxiliary quantities and false-color renderings. Additional features include support for adaptive mesh refinement input, slow-light calculations, and adaptive ray tracing. The code is written with ease of use, readability, and transparency as primary objectives, while it still achieves high performance. Blacklight is publicly available and released into the public domain.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Direct simulation of accreting black holes, employing general-relativistic magnetohydrodynamic (GRMHD) codes, is an indispensable method for understanding these physical systems. However, there is a disconnect separating the directly modeled quantities describing the plasma from the directly observed light at various frequencies. It is fortunate, then, that for supermassive black holes accreting well below the Eddington rate, the observed light in the millimeter-to-infrared range is largely due to optically thin synchrotron radiation emitted by the electrons. Such radiation, energetically unimportant for and neglected by the GRMHD evolution, can be recovered by postprocessing simulation data with ray-tracing codes, producing predictions for light curves, spectra, and even resolved images based on such simulations.

Ray tracing in this manner is critical for interpreting Event Horizon Telescope (EHT) observations of the M87 black hole (The EHT Collaboration 2019e, 2019f, 2021) and GRAVITY and EHT observations of Sagittarius A* (Sgr A*; GRAVITY Collaboration 2018; The EHT Collaboration 2022e), for example, and can be a key ingredient in data-analysis pipelines (Wong et al. 2022). Given that there are a number of GRMHD codes in active use, and that different analyses will make distinct demands upon ray tracing, there are a number of existing ray-tracing codes. These include BHOSS (Younsi et al. 2020), GRay (Chan et al. 2013) and GRay2 (Chan et al. 2018), grtrans (Dexter & Agol 2009; Dexter 2016), ibothros (Noble et al. 2007) and ipole (Mościbrodzka & Gammie 2018), Odyssey (Pu et al. 2016; Pu & Broderick 2018), RAIKOU (Kawashima et al. 2021), RAPTOR (Bronzwaer et al. 2018, 2020), and VRT2 (see Broderick & Blandford 2003, 2004), as summarized and compared in Gold et al. (2020).

A number of these codes are publicly available, though each is adapted to particular GRMHD codes, and none currently natively support Athena++ (White et al. 2016; Stone et al. 2020) simulation data. Furthermore, the exclusive focus of most codes has been, understandably, producing simple synchrotron images from simulation data, while the growing diversity and complexity of the research performed by this community suggest the time has come to consider more general tool sets for analyzing simulations.

Here we describe Blacklight, a publicly available code for state-of-the-art ray-tracing techniques, as well as additional analysis tools related to calculating geodesics in curved spacetimes. Blacklight is designed with the goal of enabling scientific research; beyond accuracy and performance, primary objectives influencing how the code is written include ease of use (no dependencies on external libraries), readability (clean, consistent coding, using encapsulation when and only when appropriate given the physics and equations being considered), and transparency (thorough documentation both in the source code and in the related wiki). It enables, for the first time, direct usage of Athena++ outputs, including data with static and adaptive mesh refinement. It also supports files generated by the iharm3D (Prather et al. 2021) version of HARM (Gammie et al. 2003), as well as a legacy HARM format.

Section 2 provides the details for the numerical choices made by Blacklight, including the integration of geodesics (Section 2.2), unpolarized radiation (Section 2.3), and polarized radiation (Section 2.4). We illustrate the correctness of the code with test problems from the literature in Section 3. Particularly noteworthy capabilities are discussed in Section 4, including slow light (Section 4.3), adaptive ray tracing (Section 4.4), integration of quantities other than the intensity of light (Section 4.6), false-color renderings (Section 4.7), and the production of true-color images (Section 4.8). Performance data can be found in Section 5. We describe the coding considerations and philosophy that characterize Blacklight in Section 6.

We use the (−, +, +, +) metric signature. In purely geometrical contexts (i.e., the metric and null geodesics), we omit factors of c and G, though we keep the black hole mass M as a convenient dimensional-consistency check, so that mass, length, time, and spin all have the same, nontrivial unit. When discussing plasma and radiation, c and G are restored, and equations hold in cgs units.

2. Integration Algorithms

Blacklight, like all such ray tracers, proceeds by defining a camera (essentially an array of pixels), calculating a ray for each pixel in the camera, and calculating the relevant properties of light along each ray. Each pixel defines the initial conditions (spacetime position and momentum) for its associated ray (null geodesic). Rays are calculated by integrating the geodesic equation on a stationary spacetime, starting at the camera and going backward in time to the source. The appropriate radiative transfer equation (for unpolarized light, polarized light, or some auxiliary quantity) is then integrated along each ray from the source to the camera.

The calculations are performed in Cartesian Kerr–Schild coordinates xα = (t, x, y, z) appropriate for a Kerr black hole with mass M and dimensionless spin a/M, where the metric components are

Equation (1a)

Equation (1b)

Here η is the Minkowski metric, and we define

Equation (2a)

Equation (2b)

Equation (2c)

Equation (2d)

These coordinates are horizon penetrating, though null geodesics traced backward can only asymptotically approach the horizon with infinite affine parameter. More importantly, they have no polar axis singularity, which might otherwise make accurate integration difficult in its vicinity. We will index these coordinates with α, β, γ, and δ (spacetime) or a and b (space).

At times we will make reference to spherical Kerr–Schild coordinates xμ = (t, r, θ, ϕ), indexed with μ and ν (spacetime) or i and j (space). These are related to their Cartesian counterparts via Equations (2c) and (2d), together with

Equation (3a)

Equation (3b)

The inverse transformation is

Equation (4a)

Equation (4b)

Equation (4c)

For completeness, we note the metric components are

Equation (5)

where $s=\sin \theta $.
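Since Equations (3)-(5) are displayed as images, a short worked sketch may help. The following is a minimal Python implementation of the standard spherical/Cartesian Kerr-Schild relations (x + iy = (r + ia)e^{iϕ} sin θ, z = r cos θ, with r recovered from the positive root of r⁴ − (R² − a²)r² − a²z² = 0, where R² = x² + y² + z²); it illustrates these well-known maps and is not code taken from Blacklight:

```python
import math

def sks_to_cks(r, theta, phi, a):
    """Forward map: Cartesian Kerr-Schild (x, y, z) from spherical (r, theta, phi).
    Uses x + i y = (r + i a) exp(i phi) sin(theta) and z = r cos(theta)."""
    s = math.sin(theta)
    x = (r * math.cos(phi) - a * math.sin(phi)) * s
    y = (r * math.sin(phi) + a * math.cos(phi)) * s
    z = r * math.cos(theta)
    return x, y, z

def cks_to_sks(x, y, z, a):
    """Inverse map: r solves the quartic r^4 - (R^2 - a^2) r^2 - a^2 z^2 = 0,
    whose positive root is written in closed form below."""
    rr = x * x + y * y + z * z
    r = math.sqrt(0.5 * (rr - a * a
                         + math.sqrt((rr - a * a) ** 2 + 4.0 * a * a * z * z)))
    theta = math.acos(z / r)
    phi = math.atan2(y, x) - math.atan2(a, r)  # arg(x + iy) minus arg(r + ia)
    return r, theta, phi
```

A round trip through both maps recovers the input to floating-point precision.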

We will also make reference to the normal frame related to spherical Kerr–Schild coordinates, indexing this with n for time and p and q for space. Vectors transform between the two according to

Equation (6a)

Equation (6b)

with $\alpha ={(-{g}^{{tt}})}^{-1/2}$ the lapse and βi = − gti /gtt the shift. This frame, familiar to those working with 3 + 1 decompositions, has metric components gnn = − 1, gnp = 0, and ${g}_{{pq}}={\delta }_{p}^{i}{\delta }_{q}^{j}{g}_{{ij}}$.

Blacklight can use simulation data in either Cartesian or spherical Kerr–Schild coordinates, transforming values in the latter case into a singularity-free system for processing.

2.1. Camera Definition

There are two common types of camera used in general-relativistic ray tracing, which we will term "plane-parallel" and "pinhole." The former takes pixels to be located at distinct points in a plane (suitably defined), each seeing light with the same parallel (again, suitably defined) momentum. The latter takes all pixels to be collocated in space but sensitive to different momentum directions. Plane-parallel cameras are appropriate for making images as seen at infinity without having to place the camera at large distances from the source, while pinhole cameras model what a small detector would see at its location. Codes in the literature use one or the other, and Blacklight supports both.

In either case, Blacklight requires the user to specify the position xi of the camera center, the normal-frame velocity up of the camera center, and the unnormalized momentum ${k}_{i}={\delta }_{i}^{p}{k}_{p}$ of the light received at the camera center. Velocities are specified as up rather than ui , as the latter fails to uniquely determine a future-directed 4-velocity within the ergosphere.

With the components of x, u, and k known in any coordinate system, we can define unit vectors for the line-of-sight, vertical, and horizontal directions. To do this, we work in the camera frame based on Cartesian Kerr–Schild coordinates, where the metric has components

Equation (7a)

Equation (7b)

Equation (7c)

Equation (7d)

The line-of-sight vector K is a unit vector parallel to k. Explicitly, we have

Equation (8a)

Equation (8b)

Equation (8c)

where the proportionality constant is set by requiring ${K}_{a^{\prime} }{K}^{a^{\prime} }=1$. The transformation rule

Equation (9a)

Equation (9b)

returns these components to the coordinate frame.

A vertical direction v begins with a fiducial "up" vector ${U}^{a^{\prime} }={\delta }_{z}^{a}$. In the special case that the camera is centered on the polar (z) axis, we instead take ${U}^{a^{\prime} }={\delta }_{y}^{a}$. Then v is defined from U orthogonal to K via Gram–Schmidt,

Equation (10a)

Equation (10b)

Equation (10c)

where the normalization is determined by ${v}_{a^{\prime} }{v}^{a^{\prime} }=1$. The orthogonal horizontal direction is defined as

Equation (11a)

Equation (11b)

where D is the determinant of ${g}_{a^{\prime} b^{\prime} }$ and [a b c] is the Levi–Civita symbol. Blacklight allows users to rotate the camera about its axis by an angle ψ, so that the horizontal and vertical directions become

Equation (12a)

Equation (12b)
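In flat spacetime, the constructions of Equations (8)-(12) reduce to ordinary 3-vector algebra, which may make the procedure easier to follow. The sketch below normalizes the line of sight, applies Gram-Schmidt to a fiducial up vector, takes a cross product for the horizontal direction (the flat-space limit of the Levi-Civita contraction in Equation (11)), and applies the roll angle ψ; the orientation and rotation sign conventions here are guesses, not necessarily Blacklight's:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [c / n for c in u]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def camera_basis(k, psi=0.0):
    """Flat-spacetime analog of Equations (8)-(12): line-of-sight K, vertical v
    (Gram-Schmidt against a fiducial up vector), horizontal h, then rotation of
    (h, v) by the camera roll angle psi."""
    K = normalize(k)
    U = [0.0, 0.0, 1.0]
    if abs(dot(U, K)) > 1.0 - 1e-12:   # camera on the polar axis: fall back to y
        U = [0.0, 1.0, 0.0]
    v = normalize([U[i] - dot(U, K) * K[i] for i in range(3)])
    h = cross(v, K)                    # orientation convention assumed here
    h_rot = [math.cos(psi) * h[i] + math.sin(psi) * v[i] for i in range(3)]
    v_rot = [-math.sin(psi) * h[i] + math.cos(psi) * v[i] for i in range(3)]
    return K, v_rot, h_rot
```

Because the rotation matrix is orthogonal, (K, v, h) remains an orthonormal triad for any ψ.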

Users specify a camera width w and linear resolution N. Consider a pixel with horizontal and vertical indices i, j ∈{0,...,N − 1}. Define the coefficients

Equation (13)

For plane-parallel cameras, the pixel position and momentum are defined to be

Equation (14a)

Equation (14b)

where the displacement in the camera frame is

Equation (15)

For pinhole cameras, they are

Equation (16a)

Equation (16b)

where we have

Equation (17a)

Equation (17b)

For both cameras, the remaining component ${k}_{\mathrm{pix}}^{t}$ is found such that ${g}_{\alpha \beta }{k}_{\mathrm{pix}}^{\alpha }{k}_{\mathrm{pix}}^{\beta }=0$. There is a unique positive root on and outside the ergosphere. Inside the ergosphere, the lesser of the two positive roots is chosen. Note the normalization of kpix is done in an arbitrary but well-defined way here without reference to the frequency of light being observed, fixing the scale of the affine parameter used in the geodesic equation.
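The null normalization of the remaining component amounts to solving a quadratic in k^t. A sketch with the root selection described above (unique positive root outside the ergosphere, lesser positive root inside); the index layout of the metric array is my own convention:

```python
import math

def null_kt(g, k_spatial, inside_ergosphere=False):
    """Solve g_{alpha beta} k^alpha k^beta = 0 for k^t given the spatial
    components (k^x, k^y, k^z) and the metric g as a 4x4 nested list.
    The quadratic is g_tt (k^t)^2 + 2 g_ta k^a k^t + g_ab k^a k^b = 0."""
    A = g[0][0]
    B = sum(g[0][a + 1] * k_spatial[a] for a in range(3))
    C = sum(g[a + 1][b + 1] * k_spatial[a] * k_spatial[b]
            for a in range(3) for b in range(3))
    disc = math.sqrt(B * B - A * C)
    roots = sorted(((-B - disc) / A, (-B + disc) / A))
    positive = [r for r in roots if r > 0.0]
    # Outside the ergosphere g_tt < 0, so exactly one root is positive.
    return positive[0] if inside_ergosphere else positive[-1]
```

In Minkowski spacetime with k_spatial = (1, 0, 0) this returns k^t = 1, as expected for a null vector.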

Eventually, the physical frequency ν (e.g., 230 GHz) must come into play. At the camera, the ray's physical momentum will be nkpix, where the normalization n is set by one of two user options. If the camera is to record frequency ν at its location (most appropriate for pinhole cameras), then

Equation (18)

Alternatively, if the camera is to record light that would be observed at frequency ν by a static observer at infinity (most appropriate for plane-parallel cameras), then

Equation (19)

2.2. Geodesics

Given initial contravariant position components xα and covariant null momentum components kα , Blacklight integrates the geodesic equation in the Hamiltonian form

Equation (20a)

Equation (20b)

Equation (20c)

(see James et al. 2015; Pu et al. 2016; Kawashima et al. 2021; Velásquez-Cadavid et al. 2022). It also sometimes tracks proper distance along the ray via

Equation (21)

Integration proceeds in the direction of negative λ (backward in time) from the camera. It terminates when the radial coordinate r crosses the threshold from less than to greater than that of the camera center. Even in horizon-penetrating coordinates, backward-propagating rays cannot reach the horizon in finite affine parameter. Geodesics are therefore also terminated if they cross inside an inner r threshold set by the user.

Blacklight offers three numerical schemes for integrating Equation (20a). One is the standard second-order Runge–Kutta (RK2) Heun method, which has the Butcher tableau given in Table 1, and another is the standard fourth-order Runge–Kutta (RK4) method with the Butcher tableau of Table 2. In both cases, the step size in affine parameter is taken to be

Equation (22)

where fstep is a user-specified constant and rhor is the outer horizon radius.

Table 1. Butcher Tableau for the RK2 Geodesic Integrator


Table 2. Butcher Tableau for the RK4 Geodesic Integrator

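To make the Hamiltonian form of Equation (20) concrete, here is a minimal RK2 (Heun) step for dx^α/dλ = g^{αβ}k_β and dk_α/dλ = −(1/2)(∂_α g^{βγ})k_β k_γ. The finite-difference metric derivative is purely illustrative (analytic derivatives are available for Kerr-Schild coordinates), and `ginv` is a user-supplied inverse-metric function; this is a sketch of the scheme, not Blacklight's implementation:

```python
def geodesic_rhs(x, k, ginv, eps=1e-6):
    """Right-hand side of the Hamiltonian-form geodesic equation:
    dx^a/dlam = g^{ab} k_b, dk_a/dlam = -(1/2)(d_a g^{bc}) k_b k_c,
    with the metric derivative taken by central finite differences."""
    g = ginv(x)
    dx = [sum(g[a][b] * k[b] for b in range(4)) for a in range(4)]
    dk = []
    for a in range(4):
        xp, xm = list(x), list(x)
        xp[a] += eps
        xm[a] -= eps
        gp, gm = ginv(xp), ginv(xm)
        dk.append(-0.5 * sum((gp[b][c] - gm[b][c]) / (2.0 * eps) * k[b] * k[c]
                             for b in range(4) for c in range(4)))
    return dx, dk

def heun_step(x, k, dlam, ginv):
    """One RK2 (Heun) step: Euler predictor, then trapezoidal corrector."""
    dx1, dk1 = geodesic_rhs(x, k, ginv)
    x1 = [x[a] + dlam * dx1[a] for a in range(4)]
    k1 = [k[a] + dlam * dk1[a] for a in range(4)]
    dx2, dk2 = geodesic_rhs(x1, k1, ginv)
    x_new = [x[a] + 0.5 * dlam * (dx1[a] + dx2[a]) for a in range(4)]
    k_new = [k[a] + 0.5 * dlam * (dk1[a] + dk2[a]) for a in range(4)]
    return x_new, k_new
```

In Minkowski spacetime the covariant momentum is exactly constant and the position advances linearly, a useful sanity check.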

In terms of robustness and ease of use, however, the recommended geodesic integrator is the "RK5(4)7M" method of Dormand & Prince (1980; DP), which is an adaptive fifth-order Runge–Kutta method that calculates RK4 values from the same set of function evaluations in order to estimate errors for each step. In comparing common integrators in the context of the computation of null geodesics around black holes, Velásquez-Cadavid et al. (2022) find DP to be far more accurate. The step size is adjusted dynamically based on this error estimate, in order to take steps as large as possible given a user-prescribed tolerance (see Press et al. 1997). Here the error is defined in terms of the initial, fourth-order, and fifth-order values of the eight dependent variables {ym } = {xα } ∪ {kα } in (20a) via

Equation (23)

where fabs and frel are provided by the user. Values of ε > 1 indicate a step must be reattempted with a smaller Δλ.

As described in Shampine (1986), the function evaluations can be arranged in a different way to provide a fourth-order accurate estimate of the midpoint of a step. Blacklight takes this midpoint to be the position and momentum of the step for the purposes of radiative transfer integration. In cases where the geodesic steps cover more proper distance than a user-specified limit (such that the radiative transfer equation would be poorly sampled even where the geodesic is calculated accurately), Blacklight subdivides the step. The aforementioned midpoint, together with function values and derivatives at both endpoints of the step, can be used to uniquely define a quartic expression, providing a fourth-order accurate interpolation anywhere within the step (Shampine 1986) and enabling this subdivision. The exact choice of coefficients used in DP implementations varies; the ones used here are given in Table 3.

Table 3. Butcher Tableau for the DP Geodesic Integrator

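The error norm of Equation (23) and the resulting step-size update can be sketched as follows. The exact scaling Blacklight uses may differ in detail; the safety factor and exponents below are the conventional choices for a fifth-order embedded pair (Press et al. 1997), and the function names are mine:

```python
def step_error(y0, y4, y5, f_abs, f_rel):
    """Scaled error estimate in the spirit of Equation (23): the fifth- and
    fourth-order solutions are compared component by component against a
    tolerance built from f_abs and f_rel."""
    return max(abs(a5 - a4) / (f_abs + f_rel * max(abs(a0), abs(a5)))
               for a0, a4, a5 in zip(y0, y4, y5))

def next_step(dlam, err, safety=0.9, grow=5.0, shrink=0.2):
    """Standard embedded-RK controller: shrink and retry when err > 1,
    otherwise grow as err^(-1/5) allows, with both factors clipped."""
    if err > 1.0:
        return max(shrink, safety * err ** -0.25) * dlam   # reject and retry
    factor = safety * err ** -0.2 if err > 0.0 else grow
    return min(grow, factor) * dlam
```

A rejected step shrinks by at most a factor of five, and an extremely accurate step grows by at most a factor of five, preventing wild oscillations in Δλ.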

In all cases, integration is performed in Cartesian Kerr–Schild coordinates. As these coordinates are stationary, use of Equation (20a) ensures the covariant energy is conserved along rays exactly in the code, as it is physically. Because Blacklight does not employ spherical coordinates here, exact conservation of ϕ covariant momentum (z angular momentum) is not guaranteed. However, this choice avoids numerical issues arising from geodesics approaching coordinate poles.

2.3. Unpolarized Radiation

Blacklight can create unpolarized images, ignoring polarization effects along each ray. In terms of the Lorentz invariants

Equation (24a)

Equation (24b)

Equation (24c)

the general-relativistic equation for unpolarized radiative transfer is

Equation (25)

where the affine parameter λ must be normalized such that Equation (20a) holds. We take jI and αI to be constant over each ray segment (typically no larger than nearby cells in the underlying simulation). The evolution of I from the beginning of a segment (where we denote it I) to the end (where we denote it I+), across an optical depth Δτ = αI Δλ, employs the well-known exact solution

Equation (26)

where we use the standard function $\mathrm{expm}1(\cdot )=\exp (\cdot )-1$ as written and fix a finite ${\rm{\Delta }}{\tau }_{\max }=100$ to avoid floating-point inaccuracies.
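A sketch of the update of Equation (26) over one segment, using expm1 for accuracy at small optical depth and the Δτ_max cap mentioned above; the explicit α_I → 0 branch is my addition for the pure-emission limit:

```python
import math

def transfer_step(I, j, alpha, dlam, dtau_max=100.0):
    """Advance the unpolarized transfer equation across one segment with
    constant invariant emissivity j and absorptivity alpha, using the exact
    constant-coefficient solution. expm1 keeps the optically thin limit
    accurate, and dtau_max caps the optical depth as described in the text."""
    dtau = min(alpha * dlam, dtau_max)
    if dtau <= 0.0:
        return I + j * dlam          # pure emission (alpha -> 0 limit)
    S = j / alpha                    # source function
    return I * math.exp(-dtau) - S * math.expm1(-dtau)
```

In the optically thick limit the intensity relaxes to the source function j/α, while for Δτ ≪ 1 it reduces to I + jΔλ without catastrophic cancellation.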

2.4. Polarized Radiation

Users can choose to integrate the equations of polarized radiative transfer, evolving four degrees of freedom and producing images in Stokes parameters Iν , Qν , Uν , and Vν at the camera. In addition to jI and αI , this requires knowing the Lorentz-invariant linearly polarized emissivities jQ and jU , circularly polarized emissivity jV , linearly polarized absorptivities αQ and αU , circularly polarized absorptivity αV , Faraday conversion rotativities ρQ and ρU , and Faraday rotation rotativity ρV .

There are two primary approaches used by polarized general-relativistic ray-tracing codes. Some (e.g., grtrans) make use of the convenient form of the Walker–Penrose constant (Walker & Penrose 1970) along null geodesics in Boyer–Lindquist coordinates to parallel-transport fiducial polarization directions backward from the camera, thus allowing each segment's local definition of the Stokes parameters to match those at the camera. Here we adopt the alternative approach of evolving the polarized state of the radiation forward in a way commensurate with both plasma and coordinate effects, as done in, e.g., ipole. This allows our use of Cartesian (or even more general) coordinates.

The method we use is well described by Mościbrodzka & Gammie (2018); we only briefly summarize it here. The polarized transfer equation can be written as both

Equation (27)

and

Equation (28)

Here A and B run over the Stokes indices (with repetition implying summation), S is the vector of Stokes parameters, J is the vector of emissivities, M is the matrix of absorptivities and rotativities, N is the complex Hermitian coherency tensor, and Γ is the collection of connection coefficients. The dependent variables are encoded in both S and N, either of which can be used to define the other in a given frame.

The general procedure is to use Strang splitting for each ray segment of affine parameter length Δλ, evolving Equation (28) without the unwieldy plasma terms for Δλ/2, then evolving Equation (27) without the equally unwieldy coordinate terms for Δλ, and finally evolving Equation (28) without plasma coupling for another Δλ/2. All coefficients are held constant over each substep. As the last substep in one segment and the first substep in the next segment evolve the same equation, we merge these two via a second-order predictor-corrector method with the Butcher tableau given in Table 4.

Table 4. Butcher Tableau for the Predictor-corrector Evolution of Equation (28)

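The structure of the Strang-split update can be illustrated on a scalar toy problem, with exponential decay standing in for one set of terms and a constant source for the other. This is not the polarized system itself, just a demonstration that the half/full/half pattern is second-order accurate:

```python
import math

def strang_step(S, a, b, dlam):
    """One Strang-split step for dS/dlam = a*S + b: a half step of the
    homogeneous operator, a full step of the source operator, then another
    half step of the homogeneous operator, each solved exactly."""
    S = S * math.exp(0.5 * a * dlam)   # half step, operator A
    S = S + b * dlam                   # full step, operator B
    S = S * math.exp(0.5 * a * dlam)   # half step, operator A
    return S

def integrate(S0, a, b, lam_total, n_steps):
    S, d = S0, lam_total / n_steps
    for _ in range(n_steps):
        S = strang_step(S, a, b, d)
    return S
```

Halving the step size reduces the global error by about a factor of four, confirming second-order convergence.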

Just as Equation (25) requires special care and exact solutions to avoid numerically unstable integration, so too does Equation (27). The full solution is described in a number of sources, such as Jones & O'Dell (1977), Landi Degl'Innocenti & Landi Degl'Innocenti (1985), and Mościbrodzka & Gammie (2018). The latter provides useful simplifying cases for when various coefficients vanish, which Blacklight employs when applicable.

2.5. Sampling

Given a simulation grid, possibly including mesh refinement, and a set of rays with sample points, Blacklight interpolates values from the former onto the latter. This interpolation is done in the primitive variables employed by Athena++: total rest-mass density ρ; total pressure p or electron entropy κe, depending on electron model (see Section 2.6); fluid velocity up , or the Cartesian Kerr–Schild normal-frame values if appropriate; and magnetic field Bi or Ba , as appropriate.

Interpolation can be done in nearest-neighbor fashion, where the values are copied from whatever unique cell bounds the sample point. Alternatively, Blacklight can perform trilinear interpolation in simulation coordinates. Athena++ domain-decomposes the simulation into logically Cartesian blocks of cells, and interpolation within a block costs negligibly more than nearest-neighbor lookup. Accurate interpolation across block boundaries, however, can dominate the cost of producing an image, given that adjacent blocks may be at different refinement levels and ghost zones are generally not included in data dumps. Interblock interpolation rarely makes a noticeable difference in the results, compared to intrablock interpolation; users can choose which method to employ.
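For reference, trilinear interpolation within a block can be sketched as below; the [iz][iy][ix] indexing and cell-corner convention are illustrative rather than Blacklight's actual data layout:

```python
def trilinear(block, fx, fy, fz):
    """Trilinear interpolation inside a 2x2x2 bracket of cell values.
    block is indexed [iz][iy][ix], and fx, fy, fz in [0, 1] are the
    fractional offsets of the sample point from the lower corner."""
    def lerp(a, b, f):
        return a + (b - a) * f
    c00 = lerp(block[0][0][0], block[0][0][1], fx)
    c01 = lerp(block[0][1][0], block[0][1][1], fx)
    c10 = lerp(block[1][0][0], block[1][0][1], fx)
    c11 = lerp(block[1][1][0], block[1][1][1], fx)
    return lerp(lerp(c00, c01, fy), lerp(c10, c11, fy), fz)
```

By construction this reproduces any function linear in each coordinate exactly.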

The plasma state at each sample point is fully defined after interpolation. Useful quantities that can be derived from the interpolated ones, possibly with user-defined assumptions about the plasma, include electron number density ne; electron dimensionless temperature Θe = kB Te/me c², with Te the electron temperature; fluid-frame magnetic field magnitude $B={({b}_{\alpha }{b}^{\alpha })}^{1/2}$, with bα = uβ (*F)β α the components of the magnetic 4-vector and F the electromagnetic field tensor; plasma σ = B²/ρ; and plasma β⁻¹ = B²/2p.
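These derived quantities are simple pointwise combinations of the interpolated primitives. A sketch (the cgs constants are supplied by me, and consistent code units for ρ, p, and B are assumed; this is not Blacklight's interface):

```python
# Physical constants in cgs (values supplied for this illustration).
K_B = 1.380649e-16      # Boltzmann constant, erg/K
M_E = 9.1093837015e-28  # electron mass, g
C = 2.99792458e10       # speed of light, cm/s

def derived_quantities(rho, p, B, T_e):
    """Pointwise derived quantities from Section 2.5: dimensionless electron
    temperature Theta_e = k_B T_e / (m_e c^2), plasma sigma = B^2 / rho, and
    inverse plasma beta = B^2 / (2 p)."""
    theta_e = K_B * T_e / (M_E * C * C)
    sigma = B * B / rho
    inv_beta = B * B / (2.0 * p)
    return theta_e, sigma, inv_beta
```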

2.6. Coefficients

It is standard practice to rotate one's coordinate system at each point along a ray in order to make jU , αU , and ρU vanish. For the remaining eight coefficients, Blacklight uses the fitting formulas provided by Marszewski et al. (2021), which conveniently unify and correct the synchrotron literature. These formulas cover the cases of thermal, power-law, and kappa distributions for the electrons, always assumed to be isotropic. Blacklight can calculate the coefficients resulting from any admixture of these three populations. Thermal electrons are distributed in Lorentz factor γ according to the Maxwell–Jüttner distribution based on Θe,

Equation (29)

where K2 is the modified Bessel function of the second kind. Power-law electrons are defined by an index p and Lorentz-factor cutoffs ${\gamma }_{\min }$ and ${\gamma }_{\max }$,

Equation (30)

The kappa distribution is defined by index κ and width w,

Equation (31)

In all cases, the electron number density is given by

Equation (32)

where the user specifies the molecular weight μ and electron-to-ion number-density ratio ne/ni.

Unlike the power-law and kappa-distribution parameters, Te varies across the simulation domain. As the simulations for which Blacklight is intended often treat the plasma as a single fluid with a single temperature T, we must invoke a prescription for determining Te as a function of the local plasma state available in postprocessing. The code supports the common choice of declaring a plasma-beta-dependent ratio between the ion temperature and electron temperature,

Equation (33)

with global constants Rhigh and Rlow, as introduced by Mościbrodzka et al. (2016). In this case, the electron temperature is

Equation (34)
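Equation (33) is the standard prescription of Mościbrodzka et al. (2016), which interpolates between Rlow in magnetized (low-β) regions and Rhigh in the weakly magnetized disk body. A sketch, together with one common way of extracting Te from a single-fluid temperature assuming an electron-proton plasma with ne = ni; the precise combination in Equation (34) depends on the adopted composition:

```python
def temp_ratio(beta, r_high, r_low):
    """Ion-to-electron temperature ratio of Equation (33):
    R = (R_high * beta^2 + R_low) / (1 + beta^2)."""
    b2 = beta * beta
    return (r_high * b2 + r_low) / (1.0 + b2)

def electron_temperature(T_total, beta, r_high, r_low):
    """One common reduction: for n_e = n_i the total pressure gives
    T_i + T_e = 2 T, so T_e = 2 T / (1 + R). The combination actually used
    in Equation (34) may differ with composition assumptions."""
    R = temp_ratio(beta, r_high, r_low)
    return 2.0 * T_total / (1.0 + R)
```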

Alternatively, Blacklight can use the electron entropy available in two-temperature simulations as prescribed by Ressler et al. (2015). In this case, the simulation output must provide the values of an entropy-like variable ${\kappa }_{{\rm{e}}}\propto \exp ({s}_{{\rm{e}}}/{k}_{{\rm{B}}})$, where se is the electron entropy per electron. Then the electron temperature is given by

Equation (35)

(see Sądowski et al. 2017).

3. Test Problems

The most comprehensive comparison of ray-tracing codes to date is given by Gold et al. (2020). Here we apply to Blacklight the two quantitative tests from that work used to compare all codes, illustrating the correctness of geodesics and unpolarized images produced from simple plasma models.

3.1. Null Geodesic Deflection

For null geodesics confined to the midplane in Kerr spacetime and outside the photon ring(s), solutions to the geodesic equation involve relatively tractable integrals. In Gold et al. (2020), the codes' values of the azimuthal deflection angles Δϕ of such geodesics are compared to the values given by formulas in Iyer & Hansen (2009) for the asymptotic deflection from past to future infinity. Here we repeat this test with the following modification. Given that the geodesic origin (source) and termination point (camera) are at finite distance (r = 1000 GM/c² here), we compare the result of ray tracing to the finite-distance formulas from Iyer & Hansen,

Equation (36a)

Equation (36b)

Equation (36c)

where u = 1/r, r0 is the radial coordinate of closest approach, and b = − kϕ /kt is the impact parameter.

We place a camera centered at r = 1000 GM/c², θ = π/2, and ϕ = 0 around a black hole with a/M = 0.9. There are 51 pixels across the middle of the image, spanning a width w = 36 GM/c². Of these pixels, 14 are ignored in this analysis; they lie inside the photon ring, given they have impact parameters satisfying b− ≤ b ≤ b+ for the critical values

Equation (37)

Figure 1 shows the deflection angles calculated by Blacklight compared to Equation (36). The left and right sides of the figure correspond to the left and right sides of the image. For plotting purposes, we subtract the asymptotic, flat-spacetime "deflection" of ${\rm{\Delta }}{\phi }_{\mathrm{flat}}=\mathrm{sgn}(b)\pi$. Here we employ the DP integrator with tolerances fabs, frel = 10⁻⁸, and the agreement is very good, with absolute errors in the range 10⁻⁶–10⁻⁵. A similar test in Gold et al. (2020, see their Figure 1) shows most codes' errors being 10⁻⁵ or greater, though in that case, the finite-distance code values are compared to the asymptotic formula.

Figure 1. Deflection angles obtained by Blacklight when integrating the geodesic equation for rays in the midplane around a spinning black hole. The line is calculated from the analytic formula, with the differences shown in the lower panel. This test uses the DP integrator with fabs, frel = 10⁻⁸.

We can compare the errors obtained with other numerical tolerances and integration schemes. The top panel of Figure 2 shows the errors for the DP integrator using tolerances ranging over 10⁻¹²–10⁻⁵, plotted against the number of steps taken to trace each ray. The adaptive nature of this integrator means stricter tolerances can be satisfied using only slightly more points near closest approach. For this reason, we recommend DP over the RK integrators for general-purpose ray tracing, where reasonable tolerances set by the user almost always yield reasonably accurate results without the danger of requiring intractably many steps.

Figure 2. Errors in midplane Kerr deflection angles using the DP, RK4, and RK2 schemes. The error for each ray is plotted against the total number of steps used to integrate that ray. Colors correspond to the user-specified numerical control parameters. The DP integrator is a robust choice, with the RK methods providing versatility for users needing more control over the computation–accuracy trade-off.

When testing the RK4 and RK2 integrators, we vary fstep over 0.002–0.1. The solution improves with decreasing step size, as expected, sometimes even outperforming typical DP results. However, this accuracy can come at a large cost in terms of the number of steps taken. Conversely, the RK integrators allow very quick, if somewhat inaccurate, integrations using relatively few steps. Relative to DP, they permit users to make larger trade-offs between accuracy and performance. It is possible that these nonadaptive, variable-step-size methods could be improved for this type of problem with a better heuristic than Equation (22).

3.2. Model Images

Gold et al. (2020) compare the images produced by unpolarized transfer using five sets of parameterized formulas for velocity, emissivity, and absorptivity, shown in their Figure 2. We repeat these tests with the same parameters and camera settings as in that work, choosing the DP integrator, a plane-parallel camera, and a frequency of 230 GHz at the location of the camera. The resulting images, shown in Figure 3, agree with those obtained by other codes.

Figure 3. Images produced by Blacklight for the five parameterized models in Gold et al. (2020), showing the expected features and brightnesses. In order to match the color scales in that work, Iν in erg s⁻¹ cm⁻² sr⁻¹ Hz⁻¹ is multiplied by 10⁵(75/128)² f⁻¹, where f is 1.6465, 1.436, 0.4418, 0.2710, and 0.0255 for the five tests, respectively.

We report the total fluxes for these images in Table 5. Comparing to Table 2 of Gold et al. (2020), Blacklight lies comfortably in the middle of the ranges spanned by the seven codes in that work. Our table shows how the minimum and maximum values reported in that comparison deviate from the values obtained here, with the typical minima 1% lower than the Blacklight value and the typical maxima 1% higher.

4. Code Features

Blacklight has a number of features beyond simply integrating the monochromatic equation of radiative transfer, all of which are intended to aid in the analysis or presentation of simulation data. Some features change how these data are sampled (data cuts, flat spacetime, slow light, and adaptive ray tracing), while others change what quantities are projected onto the image (nonthermal electrons, auxiliary images, false-color renderings, and true-color images). Each of these features can be toggled independently and can be combined with any of the others.

In illustrating these features, we employ a fiducial data set and camera parameters. The physical system is that of the thick, aligned disk from White et al. (2020) around a black hole with dimensionless spin 0.9, with the free scales arbitrarily tuned to match values appropriate for Sgr A*: a black hole mass of 4.152 × 10⁶ M☉ (The GRAVITY Collaboration 2019) and an average flux of 2.4 Jy (Doeleman et al. 2008). Regions with plasma σ exceeding unity are treated as vacuum.

The fiducial camera is plane-parallel, with full width 24 GM/c². Its center is held static at Kerr–Schild r = 100 GM/c² and θ = 45°, receiving light with ki ∝ (1, 0, 0). Unless otherwise stated, it collects light that reaches 230 GHz at infinity, using 256² pixels. Figure 4 shows the four Stokes parameters for the fiducial image.

Figure 4. Stokes parameters for the fiducial image in units of erg s⁻¹ cm⁻² sr⁻¹ Hz⁻¹.

4.1. Data Cuts

It is already common practice to treat any part of a simulation with σ ≳ 1 as vacuum, given that GRMHD jets are often numerically mass loaded and would be unrealistically bright if simulated values there were taken at face value. In fact, this cut is employed in all simulation images shown in this work. Given the importance of probing the origin of image features, Blacklight offers a simple user interface for making similar cuts on a variety of local conditions. Upper and lower cutoffs can be specified for ρ, ne, p, Θe, B, σ, and/or plasma β−1.

As an illustration, the top panels of Figure 5 show the fiducial image and the same image made by considering β⁻¹ < 2 to be vacuum, highlighting only the coronal regions of the simulation. Some parts of the image become brighter with this cut, indicating the full data set results in some foreground absorption by less magnetized plasma.

Figure 5. Comparison of the fiducial image (upper left), made with only the standard σ ≤ 1 cut, with the result of various other cuts on the data. Only sufficiently magnetized (β⁻¹ ≥ 2) plasma contributes to the upper right image. The lower panels exclude plasma further than (left) or nearer than (right) the plane orthogonal to the line of sight passing through the origin.

Blacklight also provides users with the ability to cut the data based on position. Upper and lower cutoffs can be specified for the radial coordinate r. Data can also be cut on one side or the other of an arbitrary plane, defined as all points x such that $({x}^{a}-{x}_{0}^{a}){n}_{a}=0$ for a user-provided origin x0 and normal n. The lower panels of Figure 5 show what happens when this plane is chosen to bisect the domain into a foreground and a background relative to the camera.
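Since the sign of $({x}^{a}-{x}_{0}^{a}){n}_{a}$ determines which side of the plane a point lies on, a spatial version of this cut reduces to one dot product per cell. A sketch under that assumption (names are illustrative, not Blacklight's):

```python
import numpy as np

def plane_side(x, x0, n):
    """Signed value of (x - x0) . n for points x of shape (N, 3);
    positive values lie on the side the normal n points toward."""
    return (x - x0) @ n

# Bisect into foreground and background with a plane through the
# origin whose normal points toward the camera
points = np.array([[1.0, 0.0, 0.0], [-2.0, 0.0, 0.0]])
keep = plane_side(points, np.zeros(3), np.array([1.0, 0.0, 0.0])) > 0
```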

4.2. Flat Spacetime

While having gravity bend geodesics with respect to even a Cartesian coordinate system is critical for obtaining accurate images of light emitted by the modeled system, the resulting mapping from the 3D space of a simulation to a 2D image is very non-intuitive. As an analysis tool intended to be more general than a mere imager, Blacklight allows the user to toggle the effect of gravity. The resulting images more closely correspond to the projections of data one might obtain with standard 3D visualization software, showing features "where they are" rather than where their light reaches the camera.

There is ambiguity in how to treat spacetime as flat in this way. Blacklight uses the following procedure. Simulation data are transformed into Cartesian Kerr–Schild coordinates, both in terms of fluid elements' positions and in terms of the vector components. That is, we take the simulation to provide, perhaps in a roundabout way, ρ, p or κ_e, u^α, and b^α, all as functions of x^α. These values are then reinterpreted to be functions of Cartesian Minkowski coordinates, with vector components interpreted in that coordinate system. Geodesics are calculated in flat spacetime, and the radiative transfer equation integrates the reinterpreted quantities in flat spacetime as well. Qualitatively similar results could be obtained by, e.g., transforming the simulation to spherical Kerr–Schild and then reinterpreting values as being in spherical Minkowski coordinates.

The effect on images of artificially assuming a flat spacetime is illustrated in Figure 6. Though this technique does not produce images as would be observed in nature, it can be useful to identify which image features arise as a result of lensing. In the example shown, one can clearly see that the black hole shadow is magnified by gravity. This feature is particularly useful when used in conjunction with false-color renderings (Section 4.7 below).

Figure 6. Comparison of the fiducial image (left) with the result of treating spacetime as flat (right). The effect of gravity here is to magnify the image relative to flat-spacetime expectations, as well as to produce a photon ring.

4.3. Slow Light

Most ray tracing is calculated with the fast-light approximation, whereby each time slice from a simulation is taken, in turn, to represent the system in a time-invariant state while light propagates from the plasma to the camera. This is sufficient for many purposes, but the emitting region can be tens or hundreds of gravitational times across, while the dynamical times can be less than 10 gravitational times, with turbulence and caustic-crossing timescales even shorter. Thus certain applications, especially those concerned with high-frequency variability, require slow-light ray-tracing algorithms that account for the time evolution of the system.

Blacklight can read a sequence of simulation outputs, sampling the entire spacetime volume needed to make an image. Whenever a ray needs simulation values from a given spacetime point, those values are interpolated in time just as in space. Users have the option to use either linear interpolation between the two adjacent time slices, or else simple nearest-neighbor sampling. The latter can be useful for simulation dumps that are not too finely sampled (Δt ≳ 1 GM/c³), as linear interpolation on timescales longer than the magnetic field coherence time tends to systematically underestimate the magnitude of the field (Dexter et al. 2010).
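The two sampling options can be sketched as follows, with scalar values standing in for full simulation dumps (a simplification; the actual code also interpolates gridded data in space):

```python
import bisect

def sample_in_time(t, times, slices, linear=True):
    """Sample a quantity at time t from a sorted sequence of dump
    times and their values.  Linear interpolation blends the two
    adjacent dumps; nearest-neighbor returns the closer one, which
    avoids the field-weakening bias of interpolating across more than
    a coherence time.  (Sketch only; scalars stand in for dumps.)"""
    i = bisect.bisect_right(times, t)
    i = max(1, min(i, len(times) - 1))  # clamp to a valid interval
    t0, t1 = times[i - 1], times[i]
    if linear:
        w = (t - t0) / (t1 - t0)
        return (1.0 - w) * slices[i - 1] + w * slices[i]
    return slices[i - 1] if t - t0 <= t1 - t else slices[i]
```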

Figure 7 shows the intensity from a fast-light calculation, together with a comparable slow-light image. The fast-light case is the fiducial simulation at time 10,500 GM/c³, though the camera center (and thus outer cutoff for data being used) is moved in to r = 25 GM/c². Even with such a small spatial extent, the longest-duration geodesics need 159 GM/c³ to propagate from their termination points to the camera. The slow-light calculation uses 160 data dumps, equally spaced from 10,371 to 10,530 GM/c³. The offset in this latter time (the time at the camera) from the fast-light time roughly accounts for the light-travel time for the bulk of the emission to the camera. The two images are quite similar, though the approaching plasma on the left side of the image is able to emit light along a number of rays at different times as it moves relativistically.

Figure 7. Results of a fast-light calculation performed with a snapshot from the fiducial simulation (left), together with a comparable slow-light calculation using 160 snapshots spaced by Δt = 1 GM/c³. Though the images largely agree, the time-variability properties of the light curves generated from sequences of such images could well be different.

4.4. Adaptive Ray Tracing

It is sometimes desirable to use a large number of rays (i.e., pixels) in one region of an image, resolving small-scale features, while not wasting resources in other parts of the image. More codes are adopting adaptive ray-tracing techniques in order to facilitate more efficient computation; Wong (2021) refines regions with particularly long geodesic path lengths, Gelles et al. (2021) use an estimate of the error derived from the intensities of nearby pixels, and Davelaar & Haiman (2022) use a scheme very similar to what we describe here in order to trace through a binary black hole system.

The adaptivity in Blacklight is reminiscent of mesh refinement in Athena++. Rather than considering a pixel to be a point, the code considers a pixel to be a square region of the image plane, sampled by a ray placed at its center. This is illustrated in Figure 8. It is a simple matter to recursively subdivide pixels into four, with each iteration augmenting a given ray with four new rays. The root grid, which we will number adaptive level 0, is domain-decomposed into an integer number of square blocks of pixels, whose size is specified by the user. Each block is analyzed as a whole, though with no information from other blocks, as the code runs. If a refinement condition is triggered, every pixel in the block is divided into four, and the four new blocks covering the region, considered to be at the next adaptive level, are ray-traced and evaluated for further refinement. This process continues up to a user-specified maximum refinement level, and there are no limits to how rapidly refinement can change across the image plane. The code supports single-pixel blocks, allowing fine-grained control over which regions are refined.

Figure 8. Schematic illustrating the arrangement of pixels into blocks suitable for refinement. The root grid samples 8² rays (black points) in as many pixels (gray squares), which are arranged into four blocks of 4² pixels (black squares). The upper right block is refined into four new blocks (solid blue squares) of 4² pixels each (pale blue squares), sampled at their respective centers (blue points).
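The block-based scheme described above can be sketched as a recursive subdivision, with each block carrying its image-plane position, size, and adaptive level (a schematic, not Blacklight's internal representation):

```python
def refine_blocks(root_blocks, needs_refinement, max_level):
    """Subdivide blocks until the criterion is satisfied or max_level
    is reached.  Blocks are (x, y, size, level) tuples in image-plane
    coordinates; needs_refinement applies the user's criterion to one
    block at a time, with no information from other blocks."""
    done, stack = [], list(root_blocks)
    while stack:
        x, y, size, level = stack.pop()
        if level < max_level and needs_refinement((x, y, size, level)):
            half = size / 2.0
            # Replace the block with four children at the next level
            for dx in (0.0, half):
                for dy in (0.0, half):
                    stack.append((x + dx, y + dy, half, level + 1))
        else:
            done.append((x, y, size, level))
    return done

# Refine a single root block exactly once
blocks = refine_blocks([(0.0, 0.0, 1.0, 0)], lambda b: b[3] == 0, 3)
```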

Blacklight supports two types of refinement criteria. One, similar to static mesh refinement in Athena++ and other hydrodynamical codes, specifies regions in the image plane and corresponding minimum refinement levels. Any block whose center lies within a region and whose level is less than the given minimum will be refined. The other class of criteria is based on evaluating the set of intensity values ${I}_{\nu }^{i,j}$ (i and j indexing pixel column and row) at a chosen frequency across all pixels in the block. Users can specify thresholds in intensity ${I}_{\nu }^{i,j}$ itself; absolute pixel-to-pixel intensity gradient

$({{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu })_{x}^{i,j}=\tfrac{1}{2}({I}_{\nu }^{i+1,j}-{I}_{\nu }^{i-1,j}),$ (38a)

$({{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu })_{y}^{i,j}=\tfrac{1}{2}({I}_{\nu }^{i,j+1}-{I}_{\nu }^{i,j-1}),$ (38b)

$\left|{{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu }\right|^{i,j}={\left(\left(({{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu })_{x}^{i,j}\right)^{2}+\left(({{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu })_{y}^{i,j}\right)^{2}\right)}^{1/2};$ (38c)

relative pixel-to-pixel intensity gradient

$({{\rm{\nabla }}}_{\mathrm{rel}}{I}_{\nu })_{x}^{i,j}=({{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu })_{x}^{i,j}/{I}_{\nu }^{i,j},$ (39a)

$({{\rm{\nabla }}}_{\mathrm{rel}}{I}_{\nu })_{y}^{i,j}=({{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu })_{y}^{i,j}/{I}_{\nu }^{i,j},$ (39b)

$\left|{{\rm{\nabla }}}_{\mathrm{rel}}{I}_{\nu }\right|^{i,j}=\left|{{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu }\right|^{i,j}/{I}_{\nu }^{i,j};$ (39c)

absolute pixel-to-pixel Laplacian

$({{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu })_{x}^{i,j}={I}_{\nu }^{i+1,j}-2{I}_{\nu }^{i,j}+{I}_{\nu }^{i-1,j},$ (40a)

$({{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu })_{y}^{i,j}={I}_{\nu }^{i,j+1}-2{I}_{\nu }^{i,j}+{I}_{\nu }^{i,j-1},$ (40b)

$\left|{{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu }\right|^{i,j}=\left|({{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu })_{x}^{i,j}+({{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu })_{y}^{i,j}\right|;$ (40c)

and/or relative pixel-to-pixel Laplacian

$({{\rm{\nabla }}}_{\mathrm{rel}}^{2}{I}_{\nu })_{x}^{i,j}=({{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu })_{x}^{i,j}/{I}_{\nu }^{i,j},$ (41a)

$({{\rm{\nabla }}}_{\mathrm{rel}}^{2}{I}_{\nu })_{y}^{i,j}=({{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu })_{y}^{i,j}/{I}_{\nu }^{i,j},$ (41b)

$\left|{{\rm{\nabla }}}_{\mathrm{rel}}^{2}{I}_{\nu }\right|^{i,j}=\left|{{\rm{\nabla }}}_{\mathrm{abs}}^{2}{I}_{\nu }\right|^{i,j}/{I}_{\nu }^{i,j}.$ (41c)

One-sided differences are substituted as appropriate near the edges of blocks. If a user-specified critical fraction of pixels in a block surpasses a given threshold, the block is refined.
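A sketch of the gradient-based criterion, using central pixel-to-pixel differences in the interior of a block and one-sided differences at its edges; the exact normalization is an assumption, and the function name is illustrative rather than Blacklight's:

```python
import math

def block_needs_refinement(I, threshold, min_count):
    """Flag a block of intensities I (list of rows) for refinement
    when at least min_count pixels have an absolute pixel-to-pixel
    gradient magnitude above threshold."""
    ny, nx = len(I), len(I[0])

    def d1(get, k, n):
        # Central difference in the interior, one-sided at the edges
        if 0 < k < n - 1:
            return 0.5 * (get(k + 1) - get(k - 1))
        return get(min(k + 1, n - 1)) - get(max(k - 1, 0))

    count = 0
    for j in range(ny):
        for i in range(nx):
            gx = d1(lambda k: I[j][k], i, nx)
            gy = d1(lambda k: I[k][i], j, ny)
            if math.hypot(gx, gy) > threshold:
                count += 1
    return count >= min_count
```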

As an illustration, we consider the fiducial simulation with a root resolution of 32² pixels, broken into blocks of 8² pixels. The top panels of Figure 9 show the intensity, optical depth, and circular polarization at root level. Though rapidly computed, these images fail to show any features beyond the presence of a crescent with a spur. We then add up to three levels of refinement, refining any blocks for which $\left|{{\rm{\nabla }}}_{\mathrm{abs}}{I}_{\nu }\right|\gt 5\times {10}^{-5}\ \mathrm{erg}\,{{\rm{s}}}^{-1}\ {\mathrm{cm}}^{-2}\ {\mathrm{sr}}^{-1}\ {\mathrm{Hz}}^{-1}$ for five or more pixels. As shown in the lower panels, this criterion focuses pixels in the bright parts of the image with plentiful features, keeping resolution low in the extremities of the image and in the darker parts of the crescent.

Figure 9. Unrefined (top) and adaptively refined (bottom) images of the fiducial simulation. The root level consists of 32² pixels in 8² blocks, and refinement is controlled via the absolute pixel-to-pixel intensity gradient. Refinement successfully captures interesting structure in intensity, optical depth, and circular polarization, without increasing resolution in the outer parts of the image.

4.5. Nonthermal Electrons

As discussed in Section 2.6, Blacklight supports thermal, power-law, and kappa distributions for the electrons. A majority of literature results to date assume purely thermal electrons, though this is largely a de facto choice that should become less common as small-scale plasma research continues to inform the GRMHD community about the properties we expect the electrons to have. As more observations are conducted, especially outside the millimeter, alternative distributions may become more necessary to match what is seen in nature.

Figure 10 shows the fiducial image, made with a thermal distribution whose temperature varies across the simulation, together with nonthermal images. The power law is taken to have p = 3, ${\gamma }_{\min }=4$, and ${\gamma }_{\max }=1000;$ we parameterize the kappa distribution with κ = 4 and w = 1/2. These parameters result in total fluxes comparable to the thermal case, though the images are more extended and diffuse. The lower-right panel shows the results from declaring that the electrons are equally partitioned among the three cases, achieved by simply setting the relative fractions in an input file. While not physically motivated, this demonstrates how the code can be used to quickly investigate the effects of more complicated scenarios, such as adding some power-law electrons to a mostly thermal population.
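Because synchrotron coefficients are linear in the electron number density, partitioning the electrons among distributions amounts to weighting each component's coefficients by its number fraction. A minimal sketch under that assumption (the inputs are placeholder numbers, not a real emissivity calculation):

```python
def mixed_emissivity(j_thermal, j_powerlaw, j_kappa,
                     fractions=(1 / 3, 1 / 3, 1 / 3)):
    """Total emissivity for electrons split among three distributions,
    each component having been evaluated with the full density and
    then weighted by its number fraction."""
    f_th, f_pl, f_ka = fractions
    return f_th * j_thermal + f_pl * j_powerlaw + f_ka * j_kappa
```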

Figure 10. The fiducial image resulting from thermal electrons, and three variations made with alternate electron distributions. The power-law and kappa-distribution parameters are chosen to match the total flux of the fiducial image. The "mixed" case assumes one-third of the electrons are in each of the other cases.

4.6. Auxiliary Images

One of the primary goals of Blacklight is facilitating science by connecting the physical processes in a simulation with the resulting observables. One way this is accomplished is by providing a number of auxiliary "images" of quantities other than Stokes parameters. This allows for quick access to properties of geodesics and representative simulation values along those geodesics as functions of image-plane location.

One pair of simple parameters is the time and length of each geodesic. The time here is the difference in Kerr–Schild t between the origin of the geodesic (its termination point when tracing backward) and the camera. For length, Blacklight reports the Kerr–Schild proper length s obtained by integrating Equation (21) along the geodesic. Figure 11 shows these two quantities for the fiducial simulation. The time is particularly useful for gauging how many snapshots are required to perform a slow-light calculation with a given camera.

Figure 11. Elapsed time and proper length of geodesics in the fiducial simulation. The color scales are the same in geometric units, meaning the slight color differences reflect the fact that ds/dt is not necessarily unity (the speed of light) for null geodesics in a curved spacetime.

We consider three additional quantities indicative of some total property of each geodesic. The code can provide images of the elapsed affine parameter λ, scaled arbitrarily as defined in Section 2.1; it can report the integrated unpolarized emissivity

${I}_{j}=\int {j}_{\nu }\,{ds};$ (42)

and it can calculate the total unpolarized absorption optical depth,

$\tau =\int {\alpha }_{I}\,{ds}.$ (43)

These three quantities are displayed in the top panels of Figure 12.

Figure 12. Top: auxiliary images of affine parameter, unpolarized emission, and unpolarized optical depth in the fiducial simulation. Bottom: averages of electron temperature weighted by the scheme corresponding to the above quantity, as described in the text. Each weighting emphasizes different parts of the simulation.

More useful are averages of simulation quantities motivated by these three variables. As discussed in reference to data cuts (Section 4.1), Blacklight tracks the values of ρ, n_e, p, Θ_e, B, σ, and β⁻¹. For each such quantity q, it can report an affine-parameter-averaged value

$\langle q{\rangle }_{\lambda }=\displaystyle \frac{\int q\,d\lambda }{\int d\lambda },$ (44)

which weights the quantity relatively uniformly over the geodesic. Alternatively, the emission-weighted average

$\langle q{\rangle }_{{I}_{j}}=\displaystyle \frac{\int q\,{j}_{\nu }\,{ds}}{\int {j}_{\nu }\,{ds}}$ (45)

emphasizes those parts of the ray in which the most light is being emitted. Quantities similar to this have already been put to use in the literature. However, emission weighting can be deceptive when a bright emitting region is masked by a cooler, optically thick part of the simulation, in which case the average is not representative of the conditions where the observed photons are emitted. For this reason, we introduce an optical-depth-integrated value $\langle q{\rangle }_{\tau }$, calculated as the solution to

$\displaystyle \frac{d\langle q{\rangle }_{\tau }}{{ds}}={\alpha }_{I}\left(q-\langle q{\rangle }_{\tau }\right)$ (46)

in the same way that I is the solution to the unpolarized radiative transfer Equation (25). As the ray is integrated toward the camera, $\langle q{\rangle }_{\tau }$ is pulled toward the local value of q at a rate proportional to the absorptivity α_I. The values of $\langle {{\rm{\Theta }}}_{{\rm{e}}}{\rangle }_{\lambda }$, $\langle {{\rm{\Theta }}}_{{\rm{e}}}{\rangle }_{{I}_{j}}$, and $\langle {{\rm{\Theta }}}_{{\rm{e}}}{\rangle }_{\tau }$ are shown in the lower panels of Figure 12.
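The three weightings can be sketched for a discretized ray as follows; for simplicity, the uniform average here is taken over step length rather than affine parameter, and the optical-depth-integrated value uses a first-order update (both simplifications relative to the text):

```python
def ray_averages(q, j, alpha, ds):
    """Uniform, emission-weighted, and optical-depth-integrated
    averages of samples q along a ray with emissivities j,
    absorptivities alpha, and step lengths ds.  The last value is
    relaxed toward the local q at a rate proportional to alpha,
    integrating toward the camera."""
    total = sum(ds)
    avg_uniform = sum(qi * dsi for qi, dsi in zip(q, ds)) / total
    emis = sum(ji * dsi for ji, dsi in zip(j, ds))
    avg_emission = sum(qi * ji * dsi for qi, ji, dsi in zip(q, j, ds)) / emis
    avg_tau = 0.0
    for qi, ai, dsi in zip(q, alpha, ds):
        avg_tau += ai * dsi * (qi - avg_tau)
    return avg_uniform, avg_emission, avg_tau
```

Note how the emission-weighted value is pulled toward bright samples, while the optical-depth-integrated value reflects only segments with appreciable absorptivity.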

4.7. False-color Renderings

The auxiliary images of Section 4.6 all correspond to a single scalar value generated for each pixel in the image, not unlike true intensity images. Blacklight supports another type of image, which we call a false-color rendering, in which each ray has a human-perceivable color that is modified as it propagates through the simulation.

Consider a quantity q_1, drawn from the same set available to data cuts (Section 4.1) and auxiliary images (Section 4.6): {ρ, n_e, p, Θ_e, B, σ, β⁻¹}. Associate to this quantity a color c_1; an optical depth per unit length τ_s,1; and threshold values q_1,− and q_1,+, either of which may be infinite. A ray, initialized to black, can have its color c modified in much the same way as the optical-depth integration of Equation (46),

$\displaystyle \frac{{dc}}{d{\tau }_{1}}={c}_{1}-c,$ (47)

where the appropriate optical depth is related to proper distance s via

$\displaystyle \frac{d{\tau }_{1}}{{ds}}=\left\{\begin{array}{ll}{\tau }_{s,1}, & {q}_{1,-}\lt {q}_{1}\lt {q}_{1,+},\\ 0, & \mathrm{otherwise}.\end{array}\right.$ (48)

In this way, the ray's color approaches c_1 whenever it passes through regions of spacetime where q_1 is between its cutoff values.

Alternatively, let quantity q_2 have color c_2, opacity α_2, and threshold q_2,0. Then whenever the ray crosses a surface q_2 = q_2,0 (possibly only in a user-specified direction), we can blend its color with c_2 according to

$c\to (1-{\alpha }_{2})\,c+{\alpha }_{2}\,{c}_{2}.$ (49)

Renderings defined in this way can reveal three-dimensional structure in simulation data. Note that these are more akin to volume renderings in scientific visualization software than to externally illuminated isosurface renderings.
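Both color updates can be sketched per integration step; the relaxation form and first-order stepping here are assumptions consistent with the description above, with blending done componentwise in XYZ space:

```python
def advance_color(color, c1, tau_s, ds, in_range):
    """Relax the ray's color toward c1 over a step of proper length ds
    at optical depth per unit length tau_s, but only where the traced
    quantity lies between its cutoffs (first-order step)."""
    if not in_range:
        return color
    dtau = tau_s * ds
    return tuple(c + dtau * (t - c) for c, t in zip(color, c1))

def blend_on_crossing(color, c2, alpha2):
    """Alpha-blend the ray's color with c2 when it crosses a
    threshold surface."""
    return tuple((1.0 - alpha2) * c + alpha2 * t
                 for c, t in zip(color, c2))
```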

We pause to note that it is all too common to find color blending schemes that work in RGB color space. However, the mixing of colored light being emulated by the above equations will fail to match human color perception if done componentwise in this space, undermining the scientific accuracy of the final image. Internally, Blacklight uses the correct CIE XYZ color space designed for this purpose (Smith & Guild 1932), which has the added benefit of having a gamut that includes all colors visible to the human eye. Only the final combined image, generated from an arbitrary number of quantities q_i and their associated parameters, need be projected into RGB space. 8
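That final projection can use the standard XYZ-to-linear-sRGB matrix (D65 white point); only this last step leaves the XYZ space in which the light is mixed. A sketch:

```python
def xyz_to_linear_srgb(x, y, z):
    """Project CIE XYZ tristimulus values to linear sRGB components
    using the standard D65 matrix.  Out-of-gamut colors produce
    components outside [0, 1] and must be clipped or otherwise mapped
    for display."""
    r = 3.2406 * x - 1.5372 * y - 0.4986 * z
    g = -0.9689 * x + 1.8758 * y + 0.0415 * z
    b = 0.0557 * x - 0.2040 * y + 1.0570 * z
    return r, g, b
```

As a sanity check, the D65 white point (X, Y, Z) ≈ (0.9505, 1.0, 1.0888) maps to approximately (1, 1, 1).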

Figure 13 shows an example of a three-component rendering from the fiducial simulation. The camera is placed close to the midplane (θ = 70°), and the field of view is enlarged for the first set of images. Blue traces regions with ρ > 5 × 10⁻¹⁸ g cm⁻³, with ${\tau }_{s}={(30\ {GM}/{c}^{2})}^{-1};$ red traces regions with σ > 1 (the standard σ cut has been removed), with ${\tau }_{s}={(25\ {GM}/{c}^{2})}^{-1};$ and green tracks β⁻¹ = 1 surfaces, with α = 0.25. Material outside r = 45 GM/c² is ignored here.

Figure 13. False-color renderings of the fiducial simulation, highlighting high densities ρ (blue), β⁻¹ = 1 surfaces (green), and large magnetizations σ (red). For ρ and σ, the color bars indicate the proper distance needed for a ray to achieve a given color; the color bar for β counts how many surfaces the ray crosses. The zoomed-out images use flat spacetime (first panel) and the appropriate Kerr spacetime (second panel) for ray tracing. For the zoomed-in images (third panel, flat; fourth panel, curved), only matter with z < −r_hor is considered.

The first panel, made assuming flat spacetime, is akin to what general-purpose scientific visualization software would produce. This shows the spatial structure of the simulation data in an intuitive way. The second panel more directly connects the simulation variables to lensed images, as it is made with rays that follow the actual null geodesics of the spacetime. The last two panels illustrate the same dichotomy, but zoomed in to the black hole, with material outside r = 20 GM/c² excluded. Additionally, for these latter renderings, we exclude all material with coordinate z > −r_hor. With this cut, we can clearly tell that much of the highly magnetized plasma seen above the black hole actually resides below it.

Figure 14 provides another illustration of what can be learned from false-color renderings. The central panel shows the true 230 GHz image of the fiducial simulation as seen from the pole (θ = 0), with all material outside r = 10 GM/c² omitted. On the left, we illuminate dense (ρ > 2 × 10⁻¹⁷ g cm⁻³) regions, with ${\tau }_{s}={(5\ {GM}/{c}^{2})}^{-1}$. On the right, we illuminate regions with relativistically hot electrons (Θ_e > 20), with ${\tau }_{s}={(1\ {GM}/{c}^{2})}^{-1}$. Comparisons of this sort can show at a glance what simulation variables are most correlated with image structures. Here, the spiral-armed nature of the image outside the photon ring is matched in density but not electron temperature.

Figure 14. False-color, face-on images generated from the innermost regions of the fiducial simulation. Intensity (middle) can be compared with high densities (left) or high electron temperatures (right), with the former better matching the spiral structure easily seen by eye. For ρ and Θ_e, the color bars indicate the proper distance needed for a ray to achieve a given color.

4.8. True-color Images

The vast majority of ray-traced black hole accretion images in the literature are monochromatic, showing I_ν at a single frequency. It is, however, a simple matter to simultaneously calculate images for a number of different frequencies given a fixed camera and fixed simulation data, and there is information contained in the relationships between images at different frequencies.

Here we explore a novel means of presenting such multifrequency data via true-color images, as seen in an appropriately boosted reference frame. The resulting images are particularly useful for outreach and public engagement, where it is often not emphasized that the ubiquitous black–reddish-orange–yellowish-white color maps used in science are merely imparting eye-pleasing colors on what is essentially gray-scale information.

Given values I_ν,i at multiple frequencies ν_i, we can self-consistently Lorentz-boost the intensity into the human visible range. Suppose we wish frequency ν_0 to boost to wavelength ${\lambda }_{0}^{\prime} $ in a new frame. Then the intensities transform as

${I}_{\nu ^{\prime} }^{\prime} ={\left(\nu ^{\prime} /\nu \right)}^{3}\,{I}_{\nu },\qquad \nu ^{\prime} =\frac{\nu }{{\nu }_{0}}\,\frac{c}{{\lambda }_{0}^{\prime} }.$ (50)

Given I_λ at sample wavelengths, it is straightforward to integrate against tabulated matching functions (Stockman et al. 1999; Sharpe et al. 2005, 2011) based on modern measured cone response functions (Stockman & Sharpe 2000), yielding the XYZ color coordinates of the spectrum. A standard transformation can project the resulting pixel color into RGB values. The only remaining free parameter is the overall normalization of the XYZ values, which changes the intensity but not the hue of the color. It is essentially (the reciprocal of) the camera's exposure.

We apply this procedure to the fiducial simulation, viewed face-on (θ = 0), at an angle (θ = 45°), and edge-on (θ = 90°). In each case, we use 12 frequencies, evenly spaced in wavelength from 152 to 324 GHz. Choosing ν_0 = 230 GHz and ${\lambda }_{0}^{\prime} =550\ \mathrm{nm}$ (near the center of human perception), the boosted wavelengths $\lambda ^{\prime} $ cover the visual range of 390–830 nm. This corresponds to traveling toward the source at a Lorentz factor of approximately 1200. The resulting images are shown in the upper panels of Figure 15, with the lower panels showing the standard 230 GHz monochromatic brightness maps from the same points of view.
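The frequency-to-wavelength mapping implied by this choice is λ′ = λ₀′ ν₀/ν, and the quoted visual range can be checked directly:

```python
def boosted_wavelength(nu_ghz, nu0_ghz=230.0, lam0_nm=550.0):
    """Wavelength (nm) to which an observed frequency boosts when the
    frame is chosen so that nu0 lands at lam0; the specific intensity
    itself transforms via the invariance of I_nu / nu^3."""
    return lam0_nm * nu0_ghz / nu_ghz

# End points of the 152-324 GHz band map near 830 nm and 390 nm
lam_red = boosted_wavelength(152.0)
lam_blue = boosted_wavelength(324.0)
```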

Figure 15. Top: true-color images of the fiducial simulation at three different viewing angles, obtained by blueshifting multifrequency data with a Lorentz factor of approximately 1200. Light from the left sides of the middle and right panels is emitted by plasma traveling toward the camera, and the resulting Doppler boost can be seen in the bluer color of the light. Bottom: the corresponding 230 GHz monochromatic images.

Such boosted true-color images of synchrotron emission tend to be largely white in color. This is to be expected, as the spectrum is broad compared to the factor of ∼2 range in sensitivity of the human eye, and it tends to be monomodal. Shifting the peak frequencies to wavelengths longer or shorter than ∼550 nm can make it redder or bluer, but it would be surprising to find more exotic colors as a result of this technique. Still, there are subtle color effects in Figure 15, including the fact that the approaching side of the disk is literally blueshifted in hue.

5. Performance

Though creating a single ray-traced image is much less expensive than running a GRMHD simulation, or even advancing such a simulation from one snapshot to the next, a thorough analysis based on imaging can come at considerable cost given the number of parameters such as camera position, frequency, and electron model that can be varied. We therefore provide performance numbers for Blacklight, 9 in order to estimate how costly any particular analysis might be.

Here we consider the polarized ray tracing of a single GRMHD snapshot, with no slow light, adaptivity, auxiliary images, or false-color renderings. The snapshot consists of a 77 × 64 × 128 spherical grid. Tests are run on four different architectures:

1. Stellar is a campus cluster at Princeton University, consisting of 2.9 GHz Cascade Lake nodes (Intel Xeon Platinum 8268) with 96 physical cores per node.

2. Tiger is another campus cluster at Princeton University, consisting of 2.4 GHz Skylake nodes (Intel Xeon Gold 6148) with 40 physical cores per node.

3. The Stampede2 cluster at the Texas Advanced Computing Center (TACC), available through the Extreme Science and Engineering Discovery Environment (XSEDE) program, has 2.1 GHz Skylake nodes (Intel Xeon Platinum 8160) with 48 physical cores per node.

4. Stampede2 also has 1.4 GHz Knights Landing nodes (Intel Xeon Phi 7250) with 68 physical cores per node.

First, we demonstrate the "strong scaling" of the code, fixing the image resolution at 256² pixels and measuring performance while varying the number of threads used by the code, from a single thread to one thread per physical core. The results, shown in Figure 16, indicate typical rates of 100–1000 rays computed per second per thread. This includes the integration of both the geodesic equation and the radiative transfer equation, as well as the (small) amortized costs associated with bookkeeping and with reading and sampling the simulation data.

Figure 16. Parallel scaling of Blacklight with number of threads on four different machines, measured by averaging performance across multiple runs to generate each data point. The drop at high thread counts suggests that batch processing of many images should be done with multiple independent processes on a node, each with a small number of cores.

There is a considerable decline in performance with increasing thread count. This is largely unrelated to the hardware itself having less capability than the number of cores might suggest, as happens when processors throttle highly parallel computations for thermal reasons or when the memory bus throughput becomes a limiting factor; the performance numbers vary little between the cases of keeping the rest of the node idle and keeping all cores busy with comparable work. The loss in efficiency is also not the result of large sections of serial code; most of the computation in Blacklight is distributed across threads via OpenMP parallel directives, where each thread works on a contiguous block of pixels. However, these parallel loops suffer performance losses with large numbers of threads, likely due to cache coherency issues. Different threads simultaneously try to read from and write to nearby (but not identical) locations in memory as they work on the parts of various arrays associated with their block of pixels.

This scaling, however, essentially only affects users wishing to use an entire node to rapidly compute a single image at very high resolution. The more common scientific use case is that of computing many images at more modest resolution (e.g., 128² or 256²), where multiple instances of the code can fit within the memory available on a single node. In this case, about four cores can be assigned to each of about 10–30 processes, with each process launching as many threads as cores. The result will be the performance seen toward the left side of Figure 16, even with the entire node dedicated to ray tracing.

Figure 17 shows that image resolution has little effect on performance as measured in cost per pixel. Here we fix all runs to use four threads while varying the resolution of the output image; image sizes are limited by the memory available on each node. In one case, Blacklight is used to trace 10 images at once, rather than just one, using identical camera settings and input files with identical layouts. There is some efficiency benefit from amortizing the one-time cost of integrating the geodesic equation over many images.

Figure 17. Blacklight performance as a function of output image resolution on four different machines, measured by averaging the results of multiple runs to generate each data point. Four threads are used in all cases. Solid lines indicate a single image is calculated with each call of the code; the dashed line corresponds to having Blacklight trace 10 images simultaneously.

Taking 256² pixels using four threads on Stellar as a representative case, each image takes 30 s to run. Of this time, 75% is spent integrating the polarized radiative transfer equation, including calculating synchrotron coefficients; 13% is spent resampling values from the simulation grid onto geodesics; 12% is spent integrating the geodesic equation; and less than 1% is spent on all other tasks. Relative to the same 30 s total, integrating the unpolarized transfer equation takes 15% of the time, making the total unpolarized runtime 12 s, 39% that of the polarized case. If we instead produce only the 26 auxiliary images described in Section 4.6, integrating the image takes 29% of 30 s, resulting in a runtime of 16 s, 54% that of the polarized case. Finally, if we only produce a single false-color rendering with three components, as in Figure 13, the integration takes 14% of 30 s, leading to a total runtime of 11 s, 38% that of the polarized case.

Further optimization of Blacklight is underway, with the goal of achieving near-perfect scaling, as should be possible given the embarrassingly parallel nature of ray tracing many pixels. Additionally, profiling tools indicate parts of the code are not yet fully vectorized (compiled to single instruction, multiple data (SIMD) instructions). We note that the absolute performance is still competitive with other codes. Using a 288 × 128 × 128 iharm3D simulation data dump, we trace a 256² polarized image with both Blacklight and ipole on Stellar, using icpc 2021.1.2 to compile the former and gcc 8.5.0 for the latter. 10 Though ipole has better scaling with number of threads, Blacklight calculates and processes sample points along rays faster when using fewer than 48 threads. With a single thread, Blacklight processes 3.4 × 10⁵ samples per second, compared to 1.0 × 10⁵ for ipole.

6. Code Philosophy

The aiding of scientific investigations is the primary motivation for Blacklight, and this is reflected in several aspects of the code.

Given that most academic researchers have access to disparate hardware and software environments with little direct technical support, it is important that codes for researchers work with minimal manual configuration. Blacklight has no dependencies on external libraries, placing the burden on the developer rather than the user to ensure it can interface with HDF5 and NumPy files, for example. It requires no third-party configuration tools. The aim is to have code that compiles on most Unix-like machines with no effort. The code is written in standard C++17, which is supported by all major compiler vendors.

Users of scientific codes should be able to discern the exact algorithms being employed in the source code, perhaps modifying the methods to suit their needs. Thorough documentation certainly helps, and the wiki for Blacklight is approximately one-third the size of the source code proper. Ideally, users should be able to inspect the source code and quickly find and understand pertinent sections. To this end, a moderate amount of encapsulation is employed (the code has just five main classes, for reading an input file, reading simulation data, integrating geodesics, integrating radiation, and outputting results), enough to allow users to understand parts of the code without understanding the entire code base, but not so much as would result in chasing the logic of a small task through many files. Blacklight uses only select features of C++ beyond what is contained in C; it is largely imperative in style, as is appropriate for implementing straightforward algorithms that employ long sequences of mathematical equations.

Finally, Blacklight is publicly available. In fact, it is placed into the public domain. There is only one version of the code, as described here; users need not worry that an official, private, better version exists. The main repository is meant to be selective in what it incorporates, so that users can have confidence that updating to the latest version will not reduce functionality or accuracy in their own workflows.

It is hoped that these precepts will lead to a code that avoids the pitfalls common to academic tools for scientific computation. Blacklight can then assist and enable black hole accretion research, especially in the arena of understanding physical processes by connecting astrophysical simulations to astronomical observations.

We are indebted to those who led the way in providing the GRMHD community with ray-tracing codes, especially C. Gammie, J. Dexter, M. Mościbrodzka, and B. Prather, who answered many of this author's questions about such codes; to O. Blaes for extensive discussion regarding the nuances of synchrotron radiation; and to B. Oyang, for providing the initial motivation to write this software.

This work used the Princeton Research Computing clusters Stellar and Tiger managed and supported by the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's High Performance Computing Center and Visualization Laboratory at Princeton University, as well as the Extreme Science and Engineering Discovery Environment (XSEDE) cluster Stampede2 at the Texas Advanced Computing Center (TACC) through allocation AST200005.

Footnotes

  • 1  

    An even earlier application of ray tracing to GRMHD simulations can be found in Schnittman et al. (2006), though with an emphasis on X-ray emission and stellar-mass black holes.

  • 2  

    Notable exceptions include the interactive modes of GRay, GRay2, and Odyssey, and the virtual-reality data products from RAPTOR (Davelaar et al. 2018). Interestingly, of the existing codes mentioned, these are precisely the four that work with GPUs, leveraging the ability to rapidly produce a single image to enable these techniques.

  • 3  

    Version 1.0 of Blacklight is released concurrently with this article; this version has been deposited to Zenodo (White 2022).

  • 4  

    The code is placed into the public domain under the Unlicense and is currently hosted on GitHub: https://github.com/c-white/blacklight.

  • 5  
  • 6  

    These coordinates are used by GRay2, though most existing ray tracers employing a fixed metric choose spherical Kerr–Schild or Boyer–Lindquist coordinates.

  • 7  

    We note there is a sign error in the third of five terms in Equation (24) in Mościbrodzka & Gammie (2018); the error is purely typographical and does not appear in the source code for ipole.

  • 8  

    Even this last projection is of course only necessary for producing standard-format digital images. Colors used for physical printing or further manipulation should use the XYZ data directly.

  • 9  

    All measurements use version 1.0 of the code (commit 1ab5bf2e on 2022 May 22).

  • 10  

    ipole additionally calculates an unpolarized image, though this should be only a small fraction of the total cost.
