Blacklight: A General-Relativistic Ray-Tracing and Analysis Tool

We describe the Blacklight code, intended for post-processing general-relativistic magnetohydrodynamic simulation data. Beyond polarized ray tracing of synchrotron radiation, it can produce a number of outputs that aid in analyzing data sets, such as maps of auxiliary quantities and false-color renderings. Additional features include support for adaptive mesh refinement input, slow-light calculations, and adaptive ray tracing. The code is written with ease of use, readability, and transparency as primary objectives, while it still achieves high performance. Blacklight is publicly available and released into the public domain.


INTRODUCTION
Direct simulation of accreting black holes, employing general-relativistic magnetohydrodynamic (GRMHD) codes, is an indispensable method for understanding these physical systems. However, there is a disconnect separating the directly modeled quantities describing the plasma from the directly observed light at various frequencies. It is fortunate, then, that for supermassive black holes accreting well below the Eddington rate, the observed light in the millimeter-to-infrared range is largely due to optically thin synchrotron radiation emitted by the electrons. Such radiation, energetically unimportant for and neglected by the GRMHD evolution, can be recovered by post-processing simulation data with ray-tracing codes, producing predictions for light curves, spectra, and even resolved images based on such simulations.
Ray tracing in this manner is critical for interpreting Event Horizon Telescope (EHT) observations of the M87 black hole (The EHT Collaboration 2019e,f, 2021) and GRAVITY and EHT observations of Sagittarius A* (Sgr A*) (GRAVITY Collaboration 2018; The EHT Collaboration 2022e), for example, and can be a key ingredient in data-analysis pipelines (Wong et al. 2022). Given that there are a number of GRMHD codes in active use, and that different analyses make distinct demands upon ray tracing, a number of ray-tracing codes already exist. These include BHOSS, GRay (Chan et al. 2013) and GRay2 (Chan et al. 2018), grtrans (Dexter & Agol 2009; Dexter 2016), ibothros (Noble et al. 2007) and ipole (Mościbrodzka & Gammie 2018), Odyssey (Pu et al. 2016; Pu & Broderick 2018), RAIKOU (Kawashima et al. 2021), RAPTOR (Bronzwaer et al. 2020), and VRT2 (cf. Broderick & Blandford 2003, 2004), as summarized and compared in Gold et al. (2020).[1] A number of these codes are publicly available, though each is adapted to particular GRMHD codes, and none currently natively supports Athena++ (Stone et al. 2020; White et al. 2016) simulation data. Furthermore, the exclusive focus of most codes has been, understandably, producing simple synchrotron images from simulation data,[2] while the growing diversity and complexity of the research performed by this community suggest the time has come to consider more general tool sets for analyzing simulations.
Here we describe Blacklight,[3] a publicly available[4] code for state-of-the-art ray-tracing techniques, as well as additional analysis tools related to calculating geodesics in curved spacetimes. Blacklight is designed with the goal of enabling scientific research; beyond accuracy and performance, primary objectives influencing how the code is written include ease of use (no dependencies on external libraries), readability (clean, consistent coding, using encapsulation when and only when appropriate given the physics and equations being considered), and transparency (thorough documentation both in the source code and in the related wiki[5]). It enables, for the first time, direct usage of Athena++ outputs, including data with static and adaptive mesh refinement. It also supports files generated by the iharm3D version of HARM (Gammie et al. 2003), as well as a legacy HARM format. Section 2 provides the details of the numerical choices made by Blacklight, including the integration of geodesics (§2.2), unpolarized radiation (§2.3), and polarized radiation (§2.4). We illustrate the correctness of the code with test problems from the literature in Section 3. Particularly noteworthy capabilities are discussed in Section 4, including slow light (§4.3), adaptive ray tracing (§4.4), integration of quantities other than the intensity of light (§4.6), false-color renderings (§4.7), and the production of "true-color" images (§4.8). Performance data can be found in Section 5. We describe the coding considerations and philosophy that characterize Blacklight in Section 6.

[1] An even earlier application of ray tracing to GRMHD simulations can be found in Schnittman et al. (2006), though with an emphasis on X-ray emission and stellar-mass black holes.
[2] Notable exceptions include the interactive modes of GRay, GRay2, and Odyssey, and the virtual-reality data products from RAPTOR. Interestingly, of the existing codes mentioned, these are precisely the four that work with GPUs, leveraging the ability to rapidly produce a single image to enable these techniques.
[3] See https://doi.org/10.5281/zenodo.6591886 for all versions of the code and https://doi.org/10.5281/zenodo.6591887 for version 1.0, released concurrently with this work.
[4] The code is placed into the public domain. It is currently hosted at https://github.com/c-white/blacklight.
We use the (−, +, +, +) metric signature. In purely geometrical contexts (i.e., the metric and null geodesics), we omit factors of c and G, though we keep the black hole mass M as a convenient dimensional-consistency check, so that mass, length, time, and spin all have the same, nontrivial unit. When discussing plasma and radiation, c and G are restored, and equations hold in CGS units.

INTEGRATION ALGORITHMS
Blacklight, like all such ray tracers, proceeds by defining a camera (essentially an array of pixels), calculating a ray for each pixel in the camera, and calculating the relevant properties of light along each ray. Each pixel defines the initial conditions (spacetime position and momentum) for its associated ray (null geodesic). Rays are calculated by integrating the geodesic equation on a stationary spacetime, starting at the camera and going backward in time to the source. The appropriate radiative transfer equation (for unpolarized light, polarized light, or some auxiliary quantity) is then integrated along each ray from the source to the camera.
The calculations are performed in Cartesian Kerr-Schild coordinates[6] x^α = (t, x, y, z), appropriate for a Kerr black hole with mass M and dimensionless spin a/M, where the metric components are

\[ g_{\alpha\beta} = \eta_{\alpha\beta} + f\, l_\alpha l_\beta. \tag{1} \]

Here η is the Minkowski metric, and we define

\[ f = \frac{2Mr^3}{r^4 + a^2 z^2}, \tag{2a} \]
\[ l_\alpha = \left( 1,\ \frac{rx + ay}{r^2 + a^2},\ \frac{ry - ax}{r^2 + a^2},\ \frac{z}{r} \right), \tag{2b} \]
\[ r^2 = \frac{1}{2} \left( R^2 - a^2 + \sqrt{(R^2 - a^2)^2 + 4 a^2 z^2} \right), \tag{2c} \]
\[ R^2 = x^2 + y^2 + z^2. \tag{2d} \]

These coordinates are horizon penetrating, though null geodesics traced backward can only asymptotically approach the horizon with infinite affine parameter. More importantly, they have no polar axis singularity, which might otherwise make accurate integration difficult in its vicinity. We will index these coordinates with α, β, γ, and δ (spacetime) or a and b (space). At times we will make reference to spherical Kerr-Schild coordinates x^µ = (t, r, θ, φ), indexed with µ and ν (spacetime) or i and j (space). These are related to their Cartesian counterparts via (2c) and (2d), together with

\[ \theta = \cos^{-1}(z/r), \qquad \phi = \operatorname{atan2}(y, x) - \tan^{-1}(a/r). \tag{3} \]

The inverse transformation is

\[ x = (r \cos\phi - a \sin\phi) \sin\theta, \qquad y = (r \sin\phi + a \cos\phi) \sin\theta, \qquad z = r \cos\theta. \tag{4} \]

For completeness, we note the metric components are

\[ g_{tt} = -1 + \frac{2Mr}{\Sigma}, \quad g_{tr} = \frac{2Mr}{\Sigma}, \quad g_{t\phi} = -\frac{2Mar s^2}{\Sigma}, \quad g_{rr} = 1 + \frac{2Mr}{\Sigma}, \quad g_{r\phi} = -a s^2 \left( 1 + \frac{2Mr}{\Sigma} \right), \quad g_{\theta\theta} = \Sigma, \quad g_{\phi\phi} = s^2 \left( r^2 + a^2 + \frac{2Ma^2 r s^2}{\Sigma} \right), \tag{5} \]

where s = sin θ and Σ = r² + a² cos² θ. We will also make reference to the normal frame related to spherical Kerr-Schild coordinates, indexing this with n for time and p and q for space. Vectors transform between the two according to

\[ A^n = \alpha A^t, \qquad A^p = \delta^p_i \left( A^i + \beta^i A^t \right), \tag{6} \]

with α = (−g^{tt})^{−1/2} the lapse and β^i = −g^{ti}/g^{tt} the shift. This frame, familiar to those working with 3+1 decompositions, has metric components g_{nn} = −1, g_{np} = 0, and g_{pq} = δ^i_p δ^j_q g_{ij}. Blacklight can use simulation data in either Cartesian or spherical Kerr-Schild coordinates, transforming values in the latter case into a singularity-free system for processing.
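As a concrete illustration, the Kerr-Schild radius appearing in (2c) and (2d) and the resulting metric components can be evaluated in a few lines of Python (function names here are ours, for illustration only; Blacklight itself is written in C++ with no external dependencies):

```python
import math

def ks_radius(x, y, z, a):
    """Kerr-Schild radial coordinate r from Cartesian (x, y, z) and spin a."""
    R2 = x*x + y*y + z*z
    r2 = 0.5 * (R2 - a*a + math.sqrt((R2 - a*a)**2 + 4.0*a*a*z*z))
    return math.sqrt(r2)

def ks_metric(x, y, z, a, M=1.0):
    """Covariant Cartesian Kerr-Schild metric g_ab = eta_ab + f l_a l_b."""
    r = ks_radius(x, y, z, a)
    f = 2.0*M*r**3 / (r**4 + a*a*z*z)
    l = (1.0, (r*x + a*y)/(r*r + a*a), (r*y - a*x)/(r*r + a*a), z/r)
    eta = [[-1.0 if i == 0 and j == 0 else (1.0 if i == j else 0.0)
            for j in range(4)] for i in range(4)]
    return [[eta[i][j] + f*l[i]*l[j] for j in range(4)] for i in range(4)]

# with a = 0 this reduces to Schwarzschild in Kerr-Schild form: g_tt = -(1 - 2M/r)
g = ks_metric(4.0, 0.0, 3.0, 0.0)  # a point with r = 5
```

Setting a = 0 recovers the Schwarzschild metric in Kerr-Schild form, a convenient sanity check.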

Camera Definition
There are two common types of camera used in general-relativistic ray tracing, which we will term "plane-parallel" and "pinhole." The former takes pixels to be located at distinct points in a plane (suitably defined), each seeing light with the same parallel (again, suitably defined) momentum. The latter takes all pixels to be collocated in space but sensitive to different momentum directions. Plane-parallel cameras are appropriate for making images as seen at infinity without having to place the camera at large distances from the source, while pinhole cameras model what a small detector would see at its location. Codes in the literature use one or the other, and Blacklight supports both.
In either case, Blacklight requires the user to specify the position x i of the camera center, the normal-frame velocity u p of the camera center, and the unnormalized momentum k i = δ p i k p of the light received at the camera center. Velocities are specified as u p rather than u i , as the latter fails to uniquely determine a future-directed 4-velocity within the ergosphere.
With the components of x, u, and k known in any coordinate system, we can define unit vectors for the line-of-sight, vertical, and horizontal directions. To do this we work in the camera frame based on Cartesian Kerr-Schild coordinates, where the metric has components g_{\hat t \hat\alpha} = -\delta_{\hat t \hat\alpha} (7a). The line-of-sight vector K is a unit vector parallel to k, where the proportionality constant is set by requiring K_{\hat a} K^{\hat a} = 1. The transformation rule returns these components to the coordinate frame. A vertical direction v begins with a fiducial "up" vector U^{\hat a} = \delta^{\hat a}_z. In the special case that the camera is centered on the polar (z) axis, we instead take U^{\hat a} = \delta^{\hat a}_y. Then v is defined from U orthogonal to K via Gram-Schmidt:

\[ v^{\hat a} \propto U^{\hat a} - (U_{\hat b} K^{\hat b}) K^{\hat a}, \]

where the normalization is determined by v_{\hat a} v^{\hat a} = 1. The orthogonal horizontal direction is defined as

\[ h^{\hat a} = D^{-1/2} [\hat a \hat b \hat c]\, v_{\hat b} K_{\hat c}, \]

where D is the determinant of g_{\hat a \hat b} and [\hat a \hat b \hat c] is the Levi-Civita symbol. Blacklight allows users to rotate the camera about its axis by an angle ψ, so that the horizontal and vertical directions become

\[ h^{\hat a} \to h^{\hat a} \cos\psi - v^{\hat a} \sin\psi, \qquad v^{\hat a} \to v^{\hat a} \cos\psi + h^{\hat a} \sin\psi. \]

Users specify a camera width w and linear resolution N. Consider a pixel with horizontal and vertical indices i, j ∈ {0, . . . , N − 1}, and define the corresponding image-plane offsets from the camera center in terms of w, N, i, and j. For plane-parallel cameras, the pixel's position is displaced from the camera center by these offsets along h and v in the camera frame, while the pixel's momentum matches that of the camera center. For pinhole cameras, the pixels share the position of the camera center, while the momentum of each pixel is tilted from K toward the corresponding image-plane point. For both cameras, the remaining component k^{\rm pix}_t is found such that g^{\alpha\beta} k^{\rm pix}_\alpha k^{\rm pix}_\beta = 0. There is a unique positive root on and outside the ergosphere. Inside the ergosphere, the lesser of the two positive roots is chosen. Note the normalization of k_{\rm pix} is done in an arbitrary but well-defined way here, without reference to the frequency of light being observed, fixing the scale of the affine parameter used in the geodesic equation.
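The camera-basis construction can be sketched in flat space, where the metric reduces to the identity and index placement is immaterial; this is a simplification, since Blacklight performs these operations with the camera-frame metric. Function names are illustrative:

```python
import math

def camera_basis(k, up=(0.0, 0.0, 1.0)):
    """Flat-space sketch: line-of-sight K parallel to k, vertical v from
    Gram-Schmidt on an 'up' vector, horizontal h = v x K."""
    norm = math.sqrt(sum(c*c for c in k))
    K = tuple(c / norm for c in k)
    # special case from the text: camera on the polar axis -> use y as "up"
    if abs(abs(sum(u*c for u, c in zip(up, K))) - 1.0) < 1e-12:
        up = (0.0, 1.0, 0.0)
    dot = sum(u*c for u, c in zip(up, K))
    v_raw = tuple(u - dot*c for u, c in zip(up, K))
    vn = math.sqrt(sum(c*c for c in v_raw))
    v = tuple(c / vn for c in v_raw)
    h = (v[1]*K[2] - v[2]*K[1], v[2]*K[0] - v[0]*K[2], v[0]*K[1] - v[1]*K[0])
    return K, v, h

def rotate(h, v, psi):
    """Rotate the horizontal/vertical pair about the line of sight by psi."""
    c, s = math.cos(psi), math.sin(psi)
    h2 = tuple(c*hc + s*vc for hc, vc in zip(h, v))
    v2 = tuple(-s*hc + c*vc for hc, vc in zip(h, v))
    return h2, v2
```

The returned triple is orthonormal by construction, and the ψ rotation mixes only h and v, leaving the line of sight fixed.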
Eventually, the physical frequency ν (e.g., 230 GHz) must come into play. At the camera, the ray's physical momentum will be n k_{\rm pix}, where the normalization n is set by one of two user options. If the camera is to record frequency ν at its location (most appropriate for pinhole cameras), then

\[ -n k^{\rm pix}_\alpha u^\alpha = \nu. \]

Alternatively, if the camera is to record light that would be observed at frequency ν by a static observer at infinity (most appropriate for plane-parallel cameras), then

\[ -n k^{\rm pix}_t = \nu. \]

Geodesics
Given initial contravariant position components x^α and covariant null momentum components k_α, Blacklight integrates the geodesic equation in the Hamiltonian form

\[ \frac{dx^\alpha}{d\lambda} = g^{\alpha\beta} k_\beta, \qquad \frac{dk_\alpha}{d\lambda} = -\frac{1}{2} \frac{\partial g^{\beta\gamma}}{\partial x^\alpha} k_\beta k_\gamma \tag{20} \]

(cf. James et al. 2015; Pu et al. 2016; Kawashima et al. 2021; Velásquez-Cadavid et al. 2022). It also sometimes tracks proper distance along the ray via

\[ \frac{ds}{d\lambda} = \left( g^{pq} k_p k_q \right)^{1/2}. \]
Integration proceeds in the direction of negative λ (backward in time) from the camera. It terminates when the radial coordinate r crosses the threshold from less than to greater than that of the camera center. Even in horizon-penetrating coordinates, backward-propagating rays cannot reach the horizon in finite affine parameter. Geodesics are therefore also terminated if they cross inside an inner r threshold set by the user.
Blacklight offers three numerical schemes for integrating (20). One is the standard second-order Runge-Kutta (RK2) Heun's method, which has the Butcher tableau given in Table 1, and another is the standard fourth-order Runge-Kutta (RK4) method with the Butcher tableau of Table 2. In both cases, the step size in affine parameter is given by (22), in terms of a user-specified constant f_step and the outer horizon radius r_hor. In terms of robustness and ease of use, however, the recommended geodesic integrator is the "RK5(4)7M" method of Dormand & Prince (1980, DP), an adaptive fifth-order Runge-Kutta method that calculates fourth-order Runge-Kutta values from the same set of function evaluations in order to estimate errors for each step. In comparing common integrators in the context of the computation of null geodesics around black holes, Velásquez-Cadavid et al. (2022) find DP to be far more accurate. The step size is adjusted dynamically based on this error estimate, in order to take steps as large as possible given a user-prescribed tolerance (cf. Press et al. 1997). Here the error ε is defined in terms of the initial, fourth-order, and fifth-order values of the eight dependent variables, with absolute and relative tolerances f_abs and f_rel provided by the user. Values of ε > 1 indicate a step must be reattempted with a smaller ∆λ.
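The acceptance logic of such an embedded-pair integrator can be sketched as follows; the exact error norm and step-size heuristics used by Blacklight may differ, and the parameter names here are illustrative:

```python
def step_error(y0, y4, y5, f_abs, f_rel):
    """Error estimate for one trial step: the maximum over the dependent
    variables of |y5 - y4| scaled by f_abs + f_rel * max(|y0|, |y5|).
    Values greater than 1 mean the step must be retried with smaller dlam."""
    eps = 0.0
    for a, b, c in zip(y0, y4, y5):
        scale = f_abs + f_rel * max(abs(a), abs(c))
        eps = max(eps, abs(c - b) / scale)
    return eps

def next_step_size(dlam, eps, order=5, safety=0.9, grow=5.0, shrink=0.2):
    """Adjust the affine-parameter step from the error estimate, with the
    usual safety factor and bounds on how fast the step may change."""
    if eps == 0.0:
        return dlam * grow
    return dlam * min(grow, max(shrink, safety * eps ** (-1.0 / order)))
```

An accepted step (ε ≤ 1) grows the next step slightly; a rejected one (ε > 1) shrinks it before the retry.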
As described in Shampine (1986), the function evaluations can be arranged in a different way to provide a fourth-order accurate estimate of the midpoint of a step. Blacklight takes this midpoint to be the position and momentum of the step for the purposes of radiative transfer integration. In cases where the geodesic steps cover more proper distance than a user-specified limit (such that the radiative transfer equation would be poorly sampled even where the geodesic is calculated accurately), Blacklight subdivides the step. The aforementioned midpoint, together with function values and derivatives at both endpoints of the step, can be used to uniquely define a quartic expression, providing a fourth-order accurate interpolation anywhere within the step (Shampine 1986) and enabling this subdivision. The exact choice of coefficients used in DP implementations varies; the ones used here are given in Table 3.
In all cases, integration is performed in Cartesian Kerr-Schild coordinates. As these coordinates are stationary, use of (20) ensures the covariant energy is conserved along rays exactly in the code, as it is physically. Because Blacklight does not employ spherical coordinates here, exact conservation of φ covariant momentum (z angular momentum) is not guaranteed. However, this choice avoids numerical issues arising from geodesics approaching coordinate poles.

Unpolarized Radiation
Blacklight can create unpolarized images, ignoring polarization effects along each ray. In terms of the Lorentz invariants

\[ I = \frac{I_\nu}{\nu^3}, \qquad j_I = \frac{j_\nu}{\nu^2}, \qquad \alpha_I = \nu \alpha_\nu, \]

the general-relativistic equation for unpolarized radiative transfer is

\[ \frac{dI}{d\lambda} = j_I - \alpha_I I, \tag{25} \]

where the affine parameter λ must be normalized such that (20) holds. We take j_I and α_I to be constant over each ray segment (typically no larger than nearby cells in the underlying simulation). The evolution of I from the beginning of a segment (where we denote it I_−) to the end (where we denote it I_+), across an optical depth ∆τ = α_I ∆λ, employs the well-known exact solution

\[ I_+ = \begin{cases} I_- e^{-\Delta\tau} - \dfrac{j_I}{\alpha_I}\,\mathrm{expm1}(-\Delta\tau), & \alpha_I > 0 \text{ and } \Delta\tau < \Delta\tau_{\max}; \\ \dfrac{j_I}{\alpha_I}, & \alpha_I > 0 \text{ and } \Delta\tau \ge \Delta\tau_{\max}; \\ I_- + j_I\,\Delta\lambda, & \alpha_I = 0, \end{cases} \tag{26} \]

where we use the standard function expm1(·) = exp(·) − 1 as written and fix a finite ∆τ_max = 100 to avoid floating-point inaccuracies.
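The segment update with constant coefficients can be written directly from this exact solution; a minimal Python sketch (function name ours):

```python
import math

def update_intensity(I_minus, j_I, alpha_I, dlam, dtau_max=100.0):
    """Advance the invariant intensity across one ray segment over which the
    invariant emissivity j_I and absorptivity alpha_I are held constant."""
    if alpha_I > 0.0:
        dtau = alpha_I * dlam
        if dtau < dtau_max:
            # exact solution: I+ = I- e^{-dtau} - (j_I/alpha_I) expm1(-dtau)
            return I_minus * math.exp(-dtau) - (j_I / alpha_I) * math.expm1(-dtau)
        return j_I / alpha_I          # saturated to the source function
    return I_minus + j_I * dlam       # zero-absorption limit
```

Using expm1 rather than 1 − exp(−∆τ) preserves accuracy for optically thin segments, where ∆τ is tiny.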

Polarized Radiation
Users can choose to integrate the equations of polarized radiative transfer, evolving four degrees of freedom and producing images in Stokes parameters I ν , Q ν , U ν , and V ν at the camera. In addition to j I and α I , this re-quires knowing the Lorentz-invariant linearly polarized emissivities j Q and j U , circularly polarized emissivity j V , linearly polarized absorptivities α Q and α U , circularly polarized absorptivity α V , Faraday conversion rotativities ρ Q and ρ U , and Faraday rotation rotativity ρ V .
There are two primary approaches used by polarized general-relativistic ray-tracing codes. Some (e.g., grtrans) make use of the convenient form of the Walker-Penrose constant (Walker & Penrose 1970) along null geodesics in Boyer-Lindquist coordinates to parallel-transport fiducial polarization directions backward from the camera, thus allowing each segment's local definition of the Stokes parameters to match those at the camera. Here we adopt the alternative approach of evolving the polarized state of the radiation forward in a way commensurate with both plasma and coordinate effects, as done in, e.g., ipole. This allows our use of Cartesian (or even more general) coordinates.
The method we use is well described by Mościbrodzka & Gammie (2018); we only briefly summarize it here. The polarized transfer equation can be written as both

\[ \frac{dS_A}{d\lambda} = J_A - M_{AB} S_B \tag{27} \]

and

\[ \frac{dN^{\alpha\beta}}{d\lambda} = -\Gamma^\alpha{}_{\gamma\delta} k^\gamma N^{\delta\beta} - \Gamma^\beta{}_{\gamma\delta} k^\gamma N^{\alpha\delta} + (\text{plasma terms}). \tag{28} \]

Here A and B run over the Stokes indices (with repetition implying summation), S is the vector of Stokes parameters, J is the vector of emissivities, M is the matrix of absorptivities and rotativities, N is the complex Hermitian coherency tensor, and Γ is the collection of connection coefficients. The dependent variables are encoded in both S and N, either of which can be used to define the other in a given frame.
The general procedure is to use Strang splitting for each ray segment of affine parameter length ∆λ, evolving (28) without the unwieldy plasma terms for ∆λ/2, then evolving (27) without the equally unwieldy coordinate terms for ∆λ, and finally evolving (28) without plasma coupling for another ∆λ/2. All coefficients are held constant over each substep. As the last substep in one segment and the first substep in the next segment evolve the same equation, we merge these two via a second-order predictor-corrector method with the Butcher tableau given in Table 4. Just as (25) requires special care and exact solutions to avoid numerically unstable integration, so too does (27). The full solution is described in a number of sources, such as Jones & O'Dell (1977), Landi Degl'Innocenti & Landi Degl'Innocenti (1985), and Mościbrodzka & Gammie (2018).[7] The latter provides useful simplifying cases for when various coefficients vanish, which Blacklight employs when applicable.
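The splitting pattern for one segment can be sketched schematically; the operator names below are placeholders for the coordinate and plasma substeps, and the toy operators commute, so the split composition happens to be exact and easy to check:

```python
import math

def strang_step(S, apply_coord, apply_plasma, dlam):
    """One Strang-split segment: a half-step of the coordinate (parallel
    transport) terms, a full step of the plasma terms, then another
    half-step of the coordinate terms, coefficients frozen in each."""
    S = apply_coord(S, 0.5 * dlam)
    S = apply_plasma(S, dlam)
    S = apply_coord(S, 0.5 * dlam)
    return S

# toy commuting operators: each substep scales the state exponentially
coord = lambda S, dl: S * math.exp(0.1 * dl)
plasma = lambda S, dl: S * math.exp(0.2 * dl)
S1 = strang_step(1.0, coord, plasma, 1.0)  # equals exp(0.3) for this toy
```

For non-commuting operators (the realistic case), Strang splitting is second-order accurate in ∆λ, which motivates merging the adjacent half-steps as described above.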

Sampling
Given a simulation grid, possibly including mesh refinement, and a set of rays with sample points, Blacklight interpolates values from the former onto the latter. This interpolation is done in the primitive variables employed by Athena++: total rest-mass density ρ; total pressure p or electron entropy κ e , depending on electron model (see §2.6); fluid velocity u p , or the Cartesian Kerr-Schild normal-frame values if appropriate; and magnetic field B i or B a , as appropriate.
Interpolation can be done in nearest-neighbor fashion, where the values are copied from whichever unique cell bounds the sample point. Alternatively, Blacklight can perform trilinear interpolation in simulation coordinates. Athena++ domain-decomposes the simulation into logically Cartesian blocks of cells, and interpolation within a block costs negligibly more than nearest-neighbor lookup. Accurate interpolation across block boundaries, however, can dominate the cost of producing an image, given that adjacent blocks may be at different refinement levels and ghost zones are generally not included in data dumps. Interblock interpolation rarely makes a noticeable difference in the results compared to intrablock interpolation; users can choose which method to employ.
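Within a single block, trilinear interpolation reduces to the textbook eight-corner weighted sum; a sketch in index units (assuming unit cell spacing for simplicity, which Blacklight does not require):

```python
def trilinear(block, x, y, z):
    """Trilinear interpolation of cell values block[i][j][k] at fractional
    index coordinates (x, y, z), assumed to lie within the block interior."""
    i0, j0, k0 = int(x), int(y), int(z)
    fx, fy, fz = x - i0, y - j0, z - k0
    val = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                w = ((fx if di else 1.0 - fx)
                     * (fy if dj else 1.0 - fy)
                     * (fz if dk else 1.0 - fz))
                val += w * block[i0 + di][j0 + dj][k0 + dk]
    return val

# a linear field is reproduced exactly by trilinear interpolation
block = [[[1.0*i + 2.0*j + 3.0*k for k in (0, 1)]
          for j in (0, 1)] for i in (0, 1)]
```

A linear field is reproduced exactly, which is a useful correctness check for any such implementation.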
The plasma state at each sample point is fully defined after interpolation. Useful quantities that can be derived from the interpolated ones, possibly with user-defined assumptions about the plasma, include the electron number density n_e; the electron dimensionless temperature Θ_e = k_B T_e / m_e c², with T_e the electron temperature; the fluid-frame magnetic field magnitude B = (b_α b^α)^{1/2}, with b^α = u_β (*F)^{βα} the components of the magnetic 4-vector and F the electromagnetic field tensor; plasma σ = B²/ρ; and plasma β^{−1} = B²/2p.
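A few of these derived quantities follow directly from the interpolated primitives; a sketch with CGS constants (the σ and β⁻¹ expressions assume B² is supplied in units consistent with ρ and p; function names are ours):

```python
K_B = 1.380649e-16      # Boltzmann constant, erg/K (CGS)
M_E = 9.1093837015e-28  # electron mass, g
C   = 2.99792458e10     # speed of light, cm/s

def dimensionless_temperature(T_e):
    """Theta_e = k_B T_e / (m_e c^2) for an electron temperature in kelvin."""
    return K_B * T_e / (M_E * C * C)

def magnetization_and_inverse_beta(rho, p, B2):
    """sigma = B^2 / rho and beta^{-1} = B^2 / (2 p), in consistent units."""
    return B2 / rho, B2 / (2.0 * p)
```

By construction, Θ_e = 1 when k_B T_e equals the electron rest energy m_e c².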

Coefficients
It is standard practice to rotate one's coordinate system at each point along a ray in order to make j_U, α_U, and ρ_U vanish. For the remaining eight coefficients, Blacklight uses the fitting formulas provided by Marszewski et al. (2021), which conveniently unify and correct the synchrotron literature. These formulas cover the cases of thermal, power-law, and kappa distributions for the electrons, always assumed to be isotropic. Blacklight can calculate the coefficients resulting from any admixture of these three populations. Thermal electrons are distributed in Lorentz factor γ according to the Maxwell-Jüttner distribution based on Θ_e:

\[ \frac{dn_e}{d\gamma} = \frac{n_e\, \gamma^2 \beta}{\Theta_e K_2(1/\Theta_e)} \exp\left( -\frac{\gamma}{\Theta_e} \right), \]

where K_2 is the cylindrical Bessel function and β = (1 − γ^{−2})^{1/2}. Power-law electrons are defined by an index p and Lorentz-factor cutoffs γ_min and γ_max:

\[ \frac{dn_e}{d\gamma} = \frac{n_e (p - 1)}{\gamma_{\min}^{1-p} - \gamma_{\max}^{1-p}}\, \gamma^{-p}, \qquad \gamma_{\min} \le \gamma \le \gamma_{\max}. \]

The kappa distribution is defined by an index κ and width w:

\[ \frac{dn_e}{d\gamma} \propto \gamma (\gamma^2 - 1)^{1/2} \left( 1 + \frac{\gamma - 1}{\kappa w} \right)^{-(\kappa + 1)}. \]

In all cases, the electron number density n_e is determined from ρ, where the user specifies the molecular weight µ and the electron-to-ion number-density ratio n_e/n_i. Unlike the power-law and kappa-distribution parameters, T_e varies across the simulation domain. As the simulations for which Blacklight is intended often treat the plasma as a single fluid with a single temperature T, we must invoke a prescription for determining T_e as a function of the local plasma state available in post-processing. The code supports the common choice of declaring a plasma-beta-dependent ratio between the ion temperature and the electron temperature,

\[ \frac{T_i}{T_e} = R_{\rm high} \frac{\beta^2}{1 + \beta^2} + R_{\rm low} \frac{1}{1 + \beta^2}, \]

with global constants R_high and R_low, as introduced by Mościbrodzka et al. (2016); the electron temperature then follows from this ratio and the single-fluid temperature. Alternatively, Blacklight can utilize the electron entropy available in two-temperature simulations as prescribed by Ressler et al. (2015). In this case, the simulation output must provide the values of an entropy-like variable κ_e ∝ exp(s_e/k_B), where s_e is the electron entropy per electron. The electron temperature is then given in terms of κ_e and ρ (cf. Sądowski et al. 2017).
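As a check on the power-law case, the distribution with the standard normalization constant (p − 1)/(γ_min^{1−p} − γ_max^{1−p}) integrates to n_e over [γ_min, γ_max]; a quick midpoint-rule verification (function name ours):

```python
def powerlaw_density(gamma, ne, p, gmin, gmax):
    """Isotropic power-law distribution dn_e/dgamma, normalized to ne."""
    if gamma < gmin or gamma > gmax:
        return 0.0
    norm = ne * (p - 1.0) / (gmin**(1.0 - p) - gmax**(1.0 - p))
    return norm * gamma**(-p)

# midpoint-rule check that the distribution integrates to ne
ne, p, gmin, gmax = 1.0, 2.5, 10.0, 1.0e4
N = 200000
dg = (gmax - gmin) / N
total = sum(powerlaw_density(gmin + (i + 0.5) * dg, ne, p, gmin, gmax) * dg
            for i in range(N))
```

The numerical integral recovers n_e to well within the quadrature error, confirming the normalization.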

TEST PROBLEMS
The most comprehensive comparison of ray tracing codes to date is given by Gold et al. (2020). Here we apply to Blacklight the two quantitative tests from that work used to compare all codes, illustrating the correctness of geodesics and unpolarized images produced from simple plasma models.

Null Geodesic Deflection
For null geodesics confined to the midplane in Kerr spacetime and outside the photon ring(s), solutions to the geodesic equation involve relatively tractable integrals. In Gold et al. (2020), the codes' values of the azimuthal deflection angles ∆φ of such geodesics are compared to the values given by formulas in Iyer & Hansen (2009) for the asymptotic deflection from past to future infinity. Here we repeat this test with the following modification. Given that the geodesic origin (source) and termination point (camera) are at finite distance (r = 1000 GM/c² here), we compare the result of ray tracing to the finite-distance formulas from Iyer & Hansen, given in (36), where u = 1/r, r_0 is the radial coordinate of closest approach, and b = −k_φ/k_t is the impact parameter.
We place a camera centered at r = 1000 GM/c², θ = π/2, and φ = 0 around a black hole with a/M = 0.9. There are 51 pixels across the middle of the image, spanning a width w = 36 GM/c². Of these pixels, 14 are ignored in this analysis; they lie inside the photon ring, given that they have impact parameters satisfying condition (37). Figure 1 shows the deflection angles calculated by Blacklight compared to (36). The left and right sides of the figure correspond to the left and right sides of the image. For plotting purposes, we subtract the asymptotic, flat-spacetime "deflection" of ∆φ_flat = sgn(b) π.
Here we employ the DP integrator with tolerances f_abs = f_rel = 10^−8, and the agreement is very good, with absolute errors in the range 10^−6–10^−5. A similar test in Gold et al. (2020, cf. their Figure 1) shows most codes' errors being 10^−5 or greater, though in that case the finite-distance code values are compared to the asymptotic formula. We can compare the errors obtained with other numerical tolerances and integration schemes. The top panel of Figure 2 shows the errors for the DP integrator using tolerances ranging over 10^−12–10^−5, plotted against the number of steps taken to trace each ray. The adaptive nature of this integrator means stricter tolerances can be satisfied by using only slightly more points near closest approach. For this reason, we recommend DP over the RK integrators for general-purpose ray tracing, where reasonable tolerances set by the user almost always yield reasonably accurate results without the danger of requiring intractably many steps.
When testing the RK4 and RK2 integrators, we vary f_step over 0.002–0.1. The solution improves with decreasing step size, as expected, sometimes even outperforming typical DP results. However, this accuracy can come at a large cost in terms of the number of steps taken. Conversely, the RK integrators allow very quick and somewhat inaccurate integrations using relatively few steps. Relative to DP, they permit users to make larger trade-offs between accuracy and performance. It is possible that these non-adaptive variable-step-size methods could be improved for this type of problem with a better heuristic than (22).

Figure 2. Errors in midplane Kerr deflection angles using the DP, RK4, and RK2 schemes. The error for each ray is plotted against the total number of steps used to integrate that ray. Colors correspond to the user-specified numerical control parameters. The DP integrator is a robust choice, with the RK methods providing versatility for users needing more control over the computation-accuracy trade-off.

Model Images
Gold et al. (2020) compare the images produced by unpolarized transfer using five sets of parameterized formulas for velocity, emissivity, and absorptivity, shown in their Figure 2. We repeat these tests with the same parameters and camera settings as in that work, choosing the DP integrator, a plane-parallel camera, and frequencies of 230 GHz at the location of the camera. The resulting images, shown in Figure 3, agree with those obtained by other codes.
We report the total fluxes for these images in Table 5. Comparing to Table 2 of Gold et al. (2020), Blacklight lies comfortably in the middle of the ranges spanned by the seven codes in that work. Our table shows how the minimum and maximum values reported in that comparison deviate from the values obtained here, with the typical minima 1% lower than the Blacklight value and the typical maxima 1% higher.

CODE FEATURES
Blacklight has a number of features beyond simply integrating the monochromatic equation of radiative transfer, all of which are intended to aid in the analysis or presentation of simulation data. Some features change how that data is sampled (data cuts, flat spacetime, slow light, and adaptive ray tracing), while others change what quantities are projected onto the image (nonthermal electrons, auxiliary images, false-color renderings, and "true-color" images). Each of these features can be toggled independently, working with any others.
In illustrating these features, we employ a fiducial data set and camera parameters. The physical system is that of the thick, aligned disk from White et al. (2020) around a black hole with dimensionless spin 0.9, with the free scales arbitrarily tuned to match values appropriate for Sgr A*: a black hole mass of 4.152 × 10^6 M_⊙ (GRAVITY Collaboration 2019) and an average flux of 2.4 Jy (Doeleman et al. 2008). Regions with plasma σ exceeding unity are treated as vacuum.
The fiducial camera is plane-parallel, with full width 24 GM/c². Its center is held static at Kerr-Schild r = 100 GM/c² and θ = 45°, receiving light with k_i ∝ (1, 0, 0). Unless otherwise stated, it collects light that reaches 230 GHz at infinity, using 256² pixels. Figure 4 shows the four Stokes parameters for the fiducial image.

Data Cuts
It is already common practice to treat any part of a simulation with σ ≳ 1 as vacuum, given that GRMHD jets are often numerically mass loaded and would be unrealistically bright if simulated values there were taken at face value. In fact, this cut is employed in all simulation images shown in this work. Given the importance of probing the origin of image features, Blacklight offers a simple user interface for making similar cuts on a variety of local conditions. Upper and lower cutoffs can be specified for ρ, n_e, p, Θ_e, B, σ, and/or plasma β^{−1}.
As an illustration, the top panels of Figure 5 show the fiducial image and the same image made by considering β −1 < 1/2 to be vacuum, highlighting only the coronal regions of the simulation. Some parts of the image become brighter with this cut, indicating the full data set results in some foreground absorption by less magnetized plasma.
Blacklight also provides users with the ability to cut the data based on position. Users can specify upper and lower cutoffs on the radial coordinate r. They can also cut data on one side or another of an arbitrary plane. The plane is defined to be all points x such that (x^a − x_0^a) n_a = 0 for a user-provided origin x_0 and normal n. The lower panels of Figure 5 show what happens when this plane is chosen to bisect the domain into a foreground and a background relative to the camera.
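The plane cut reduces to a sign test on (x − x_0)·n for each sample point; a minimal sketch (function name ours):

```python
def plane_cut(points, origin, normal, keep_positive=True):
    """Mask points on one side of the plane (x - x0) . n = 0, returning
    True where the point is kept and False where it is treated as vacuum."""
    out = []
    for x in points:
        s = sum((xc - oc) * nc for xc, oc, nc in zip(x, origin, normal))
        out.append(s >= 0.0 if keep_positive else s <= 0.0)
    return out
```

Choosing the normal along the line of sight splits the domain into the foreground and background halves used in Figure 5.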

Flat Spacetime
While having gravity bend geodesics with respect to even a Cartesian coordinate system is critical for obtaining accurate images of light emitted by the modeled system, the resulting mapping from the 3D space of a simulation to a 2D image is very non-intuitive. As an analysis tool intended to be more general than a mere imager, Blacklight allows the user to toggle the effect of gravity. The resulting images more closely correspond to the projections of data one might obtain with standard 3D visualization software, showing features "where they are" rather than where their light reaches the camera.
There is ambiguity in how to treat spacetime as flat in this way. Blacklight uses the following procedure. Simulation data is transformed to Cartesian Kerr-Schild coordinates, both in terms of fluid elements' positions and in terms of the vector components. That is, we take the simulation to provide, perhaps in a roundabout way, ρ, p or κ e , u α , and b α , all as functions of x α . These values are then reinterpreted to be functions of Cartesian Minkowski coordinates, with vector components interpreted in that coordinate system. Geodesics are calculated in flat spacetime, and the radiative transfer equation integrates the reinterpreted quantities in flat spacetime as well. Qualitatively similar results could be obtained by, e.g., transforming the simulation to spherical Kerr-Schild and then reinterpreting values as being in spherical Minkowski coordinates.
The effect on images of artificially assuming a flat spacetime is illustrated in Figure 6. Though this technique does not produce images as would be observed in nature, it can be useful for identifying which image features arise as a result of lensing. In the example shown, one can clearly see that the black hole shadow is magnified by gravity. This feature is particularly useful when used in conjunction with false-color renderings (§4.7 below).

Slow Light
Most ray tracing is performed with the fast-light approximation, whereby each time slice from a simulation is taken, in turn, to represent the system in a time-invariant state while light propagates from the plasma to the camera. This is sufficient for many purposes, but the emitting region can be tens or hundreds of gravitational times across, while the dynamical times can be less than ten gravitational times, with turbulence and caustic-crossing timescales even shorter. Thus certain applications, especially those concerned with high-frequency variability, require slow-light ray-tracing algorithms that account for the time evolution of the system.
Blacklight can read a sequence of simulation outputs, sampling the entire spacetime volume needed to make an image. Whenever a ray needs simulation values at a given spacetime point, those values are interpolated in time just as in space. Users can choose either linear interpolation between the two adjacent time slices or simple nearest-neighbor sampling. The latter can be useful for simulation dumps that are not too finely sampled (∆t ≳ 1 GM/c^3), as linear interpolation on timescales longer than the magnetic field coherence time tends to systematically underestimate the magnitude of the field (Dexter et al. 2010).
Figure 7 shows the intensity from a fast-light calculation, together with a comparable slow-light image. The fast-light case is the fiducial simulation at time 10,500 GM/c^3, though the camera center (and thus the outer cutoff for data being used) is moved in to r = 25 GM/c^2. Even with such a small spatial extent, the longest-duration geodesics need 159 GM/c^3 to propagate from their termination points to the camera. The slow-light calculation uses 160 data dumps, equally spaced from 10,371 to 10,530 GM/c^3. The offset in this latter time (the time at the camera) from the fast-light time roughly accounts for the light-travel time for the bulk of the emission to the camera. The two images are quite similar, though the approaching plasma on the left side of the image, moving relativistically, is able to emit light along a number of rays at different times.

Figure 7. Results of a fast-light calculation performed with a snapshot from the fiducial simulation (left), together with a comparable slow-light calculation using 160 snapshots spaced by ∆t = 1 GM/c^3. Though the images largely agree, the time-variability properties of the light curves generated from sequences of such images could well be different.
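The two time-sampling options can be sketched as follows, for a quantity stored at uniformly spaced snapshot times t_n = t0 + n ∆t (a hypothetical helper, not Blacklight's internals):

```cpp
#include <cmath>
#include <vector>

// Sample a scalar quantity at time t from snapshots q[n] taken at
// times t0 + n*dt, using either linear interpolation between the two
// adjacent slices or nearest-neighbor selection.
double sample_in_time(const std::vector<double>& q, double t0, double dt,
                      double t, bool linear) {
  double s = (t - t0) / dt;                  // fractional snapshot index
  int n = static_cast<int>(std::floor(s));
  if (n < 0) return q.front();               // clamp before first snapshot
  if (n + 1 >= static_cast<int>(q.size())) return q.back();  // and after last
  double f = s - n;                          // fraction into interval [n, n+1]
  if (linear)
    return (1.0 - f) * q[n] + f * q[n + 1];  // linear in time
  return (f < 0.5) ? q[n] : q[n + 1];        // nearest neighbor
}
```

In the full code the same choice applies to every fluid quantity, with spatial interpolation performed at each of the bracketing slices first.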

Adaptive Ray Tracing
It is sometimes desirable to use a large number of rays (i.e., pixels) in one region of an image, resolving small-scale features, while not wasting resources in other parts of the image. More codes are adopting adaptive ray-tracing techniques in order to facilitate more efficient computation: Wong (2021) refines regions with particularly long geodesic path lengths, Gelles et al. (2021) use an estimate of the error derived from nearby pixels' intensities, and Davelaar & Haiman (2021) use a scheme very similar to what we describe here in order to trace through a binary black hole system.
The adaptivity in Blacklight is reminiscent of mesh refinement in Athena++. Rather than considering a pixel to be a point, the code considers a pixel to be a square region of the image plane, sampled by a ray placed at its center. This is illustrated in Figure 8. It is a simple matter to recursively subdivide pixels into four, with each iteration augmenting a given ray with four new rays. The root grid, which we number adaptive level 0, is domain-decomposed into an integer number of square blocks of pixels, whose size is specified by the user. Each block is analyzed as a whole, though with no information from other blocks, as the code runs. If a refinement condition is triggered, every pixel in the block is divided into four, and the four new blocks covering the region, considered to be at the next adaptive level, are ray-traced and evaluated for further refinement. This process continues up to a user-specified maximum refinement level, and there are no limits on how rapidly refinement can change across the image plane. The code supports single-pixel blocks, allowing fine-grained control over which regions are refined.

Figure 8. Schematic illustrating the arrangement of pixels into blocks suitable for refinement. The root grid samples 8^2 rays (black points) in as many pixels (gray squares), which are arranged into four blocks of 4^2 pixels (black squares). The upper-right block is refined into four new blocks (solid blue squares) of 4^2 pixels each (pale blue squares), sampled at their respective centers (blue points).
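The recursive subdivision of blocks can be sketched as follows (illustrative data structures, not the actual Blacklight classes):

```cpp
#include <functional>
#include <vector>

// A block covers a square region of the image plane; refining it yields
// four half-size blocks at the next adaptive level, each sampled with the
// same number of pixels as its parent.
struct Block {
  double x, y, width;  // lower-left corner and extent in the image plane
  int level;           // adaptive level (root grid = 0)
};

// Recursively refine: any block passing the user-supplied criterion is
// replaced by its four children, up to max_level; others become leaves.
void refine(const Block& b, int max_level,
            const std::function<bool(const Block&)>& needs_refinement,
            std::vector<Block>& out) {
  if (b.level < max_level && needs_refinement(b)) {
    double h = b.width / 2.0;
    refine({b.x,     b.y,     h, b.level + 1}, max_level, needs_refinement, out);
    refine({b.x + h, b.y,     h, b.level + 1}, max_level, needs_refinement, out);
    refine({b.x,     b.y + h, h, b.level + 1}, max_level, needs_refinement, out);
    refine({b.x + h, b.y + h, h, b.level + 1}, max_level, needs_refinement, out);
  } else {
    out.push_back(b);
  }
}
```

In practice the criterion inspects the ray-traced intensities of the block's own pixels, so each block can be evaluated without communicating with its neighbors.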
Blacklight supports two types of refinement criteria. One, similar to static mesh refinement in Athena++ and other hydrodynamical codes, specifies regions in the image plane and corresponding minimum refinement levels. Any block whose center lies within a region and whose level is less than the given minimum will be refined. The other class of criteria is based on evaluating the set of intensity values I_ν^{i,j} (i and j indexing pixel column and row) at a chosen frequency across all pixels in the block.
Users can specify thresholds in the intensity I_ν^{i,j} itself; the absolute pixel-to-pixel intensity gradient; the relative pixel-to-pixel intensity gradient; the absolute pixel-to-pixel Laplacian; and/or the relative pixel-to-pixel Laplacian. One-sided differences are substituted as appropriate near the edges of blocks. If a user-specified critical fraction of pixels in a block surpasses a given threshold, the block is refined.
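As a concrete sketch of one such criterion (illustrative, with a hypothetical function name), the absolute-gradient test over an n × n block of pixel width w might look like:

```cpp
#include <cmath>
#include <vector>

// Decide whether a block should be refined based on the absolute
// pixel-to-pixel intensity gradient. Centered differences are used in the
// interior and one-sided differences at block edges, as described above.
// The block is refined if at least critical_fraction of its pixels exceed
// the gradient threshold.
bool refine_on_gradient(const std::vector<double>& I, int n, double w,
                        double threshold, double critical_fraction) {
  int count = 0;
  auto at = [&](int i, int j) { return I[j * n + i]; };  // row-major block
  for (int j = 0; j < n; ++j)
    for (int i = 0; i < n; ++i) {
      double dx = (i == 0)     ? (at(1, j) - at(0, j)) / w
                : (i == n - 1) ? (at(n - 1, j) - at(n - 2, j)) / w
                               : (at(i + 1, j) - at(i - 1, j)) / (2.0 * w);
      double dy = (j == 0)     ? (at(i, 1) - at(i, 0)) / w
                : (j == n - 1) ? (at(i, n - 1) - at(i, n - 2)) / w
                               : (at(i, j + 1) - at(i, j - 1)) / (2.0 * w);
      if (std::sqrt(dx * dx + dy * dy) > threshold) ++count;
    }
  return count >= critical_fraction * n * n;
}
```

The relative-gradient and Laplacian variants follow the same pattern with different stencils and normalizations.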
As an illustration, we consider the fiducial simulation with a root resolution of 32^2 pixels, broken into blocks of 8^2 pixels. The top panels of Figure 9 show the intensity, optical depth, and circular polarization at root level. Though rapidly computed, these images fail to show any features beyond the presence of a crescent with a spur. We then add up to three levels of refinement, refining any blocks for which |∇_abs I_ν| > 5 × 10^−5 erg s^−1 cm^−2 sr^−1 Hz^−1 for five or more pixels. As shown in the lower panels, this criterion focuses pixels in the bright parts of the image with plentiful features, keeping resolution low in the extremities of the image and in the darker parts of the crescent.

Nonthermal Electrons
As discussed in §2.6, Blacklight supports thermal, power-law, and kappa distributions for the electrons. A majority of literature results to date assume purely thermal electrons, though this is largely a de facto choice that should become less common as small-scale plasma research continues to inform the GRMHD community about the properties we expect the electrons to have. As more observations are conducted, especially outside the millimeter, alternative distributions may become more necessary to match what is seen in nature. Figure 10 shows the fiducial image, made with a thermal distribution whose temperature varies across the simulation, together with nonthermal images. The power law is taken to have p = 3, γ_min = 4, and γ_max = 1000; we parameterize the kappa distribution with κ = 4 and w = 1/2. These parameters result in total fluxes comparable to the thermal case, though the images are more extended and diffuse. The lower-right panel shows the results from declaring that the electrons are equally partitioned among the three cases, achieved by simply setting the relative fractions in an input file. While not physically motivated, this demonstrates how the code can be used to quickly investigate the effects of more complicated scenarios, such as adding some power-law electrons to a mostly thermal population.

Auxiliary Images
One of the primary goals of Blacklight is facilitating science by connecting the physical processes in a simulation with the resulting observables. One way this is accomplished is by providing a number of auxiliary "images" of quantities other than Stokes parameters. This allows for quick access to properties of geodesics and representative simulation values along those geodesics as functions of image-plane location.
One pair of simple parameters is the time and length of each geodesic. The time here is the difference in Kerr-Schild t between the origin of the geodesic (its termination point when tracing backward) and the camera. For length, Blacklight reports the Kerr-Schild proper length s obtained by integrating (21) along the geodesic. Figure 11 shows these two quantities for the fiducial simulation. The time is particularly useful for gauging how many snapshots are required to perform a slow-light calculation with a given camera.
We consider three additional quantities indicative of some total property of each geodesic. The code can provide images of the elapsed affine parameter λ, scaled arbitrarily as defined in §2.1; it can report the integrated unpolarized emissivity

I_j = ∫ j_I dλ; (42)

and it can calculate the total unpolarized absorption optical depth

τ_I = ∫ α_I dλ.

Figure 9. Unrefined (top) and adaptively refined (bottom) images of the fiducial simulation. The root level consists of 32^2 pixels in 8^2 blocks, and refinement is controlled via the absolute pixel-to-pixel intensity gradient. Refinement successfully captures interesting structure in intensity, optical depth, and circular polarization, without increasing resolution in the outer parts of the image.

These three quantities are displayed in the top panels of Figure 12. More useful are averages of simulation quantities motivated by these three variables. As discussed in reference to data cuts (§4.1), Blacklight tracks the values of ρ, n_e, p, Θ_e, B, σ, and β^−1. For each such quantity q, it can report an affine-parameter-averaged value ⟨q⟩_λ, which weights the quantity relatively uniformly over the geodesic. Alternatively, the emission-weighted average ⟨q⟩_{I_j} emphasizes those parts of the ray in which the most light is being emitted. Quantities similar to this have already been put to use in the literature. However, emission weighting can be deceptive when a bright emitting region is masked by a cooler, optically thick part of the simulation, in which case the average is not representative of the conditions where the observed photons are emitted. For this reason, we introduce an optical-depth-integrated value ⟨q⟩_τ, calculated as the solution to

d⟨q⟩_τ/dλ = α_I (q − ⟨q⟩_τ), (46)

in the same way that I is the solution to the unpolarized radiative transfer equation (25). As the ray is integrated toward the camera, ⟨q⟩_τ is pulled toward the local value of q at a rate proportional to the absorptivity α_I. The values of ⟨Θ_e⟩_λ, ⟨Θ_e⟩_{I_j}, and ⟨Θ_e⟩_τ are shown in the lower panels of Figure 12.
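The relaxation of the optical-depth-integrated average toward the local quantity can be sketched as a simple march along the ray (illustrative, not the code's actual integrator; the exact relaxation over each step is used for stability):

```cpp
#include <cmath>
#include <vector>

// Integrate an optical-depth-weighted average toward the camera: q_tau is
// pulled toward the local quantity q at a rate set by the absorptivity
// alpha_I, i.e. dq_tau/ds = alpha_I * (q - q_tau). Samples are ordered
// from the far end of the ray toward the camera, with uniform step ds.
double optical_depth_average(const std::vector<double>& q,
                             const std::vector<double>& alpha, double ds) {
  double q_tau = 0.0;  // ray initialized with no accumulated value
  for (std::size_t i = 0; i < q.size(); ++i) {
    double f = 1.0 - std::exp(-alpha[i] * ds);  // exact relaxation factor
    q_tau += f * (q[i] - q_tau);
  }
  return q_tau;
}
```

An optically thick sample near the camera dominates the result, which is exactly the behavior that makes this weighting robust against bright emission hidden behind absorbing material.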

False-Color Renderings
The auxiliary images of §4.6 all correspond to a single scalar value generated for each pixel in the image, not unlike true intensity images. Blacklight supports another type of image, which we call a false-color rendering, in which each ray has a human-perceivable color that is modified as it propagates through the simulation.
Consider a quantity q_1, drawn from the same set available to data cuts (§4.1) and auxiliary images (§4.6): {ρ, n_e, p, Θ_e, B, σ, β^−1}. Associate to this quantity a color c_1; an optical depth per unit length τ_{s,1}; and threshold values q_{1,−} and q_{1,+}, either of which may be infinite. A ray, initialized to black, can have its color c modified in much the same way as the optical-depth integration (46), where here the appropriate optical depth is related to proper distance s via

dτ_1 = τ_{s,1} ds,

accumulated only where q_1 lies between its cutoff values. In this way, the ray's color approaches c_1 whenever it passes through regions of spacetime where q_1 is between its cutoff values. Alternatively, let quantity q_2 have color c_2, opacity α_2, and threshold q_{2,0}. Then whenever the ray crosses a surface q_2 = q_{2,0} (possibly only in a user-specified direction), we can blend its color with c_2 according to

c → (1 − α_2) c + α_2 c_2.

Renderings defined in this way can reveal three-dimensional structure in simulation data. Note that these are more akin to volume renderings in scientific visualization software than to externally illuminated isosurface renderings.

Figure 10. The fiducial image resulting from thermal electrons, and three variations made with alternate electron distributions. The power-law and kappa-distribution parameters are chosen to match the total flux of the fiducial image. The "mixed" case assumes one-third of the electrons are in each of the other cases.

Figure 11. Elapsed time and proper length of geodesics in the fiducial simulation. The color scales are the same in geometric units, meaning the slight color differences reflect the fact that ds/dt is not necessarily unity (the speed of light) for null geodesics in a curved spacetime.
We pause to note that it is all too common to find color-blending schemes that work in RGB color space. However, the mixing of colored light being emulated by the above equations will fail to match human color perception if done componentwise in this space, undermining the scientific accuracy of the final image. Internally, Blacklight uses the CIE XYZ color space designed for this purpose (Smith & Guild 1932), which has the added benefit of a gamut that includes all colors visible to the human eye. Only the final combined image, generated from an arbitrary number of quantities q_i and their associated parameters, need be projected into RGB space. Figure 13 shows an example of a three-component rendering from the fiducial simulation. The camera is placed close to the midplane (θ = 70°), and the field of view is enlarged for the first set of images. Blue traces regions with ρ > 5 × 10^−18 g cm^−3, with τ_s = (30 GM/c^2)^−1; red traces regions with σ > 1 (the standard σ cut has been removed), with τ_s = (25 GM/c^2)^−1; and green tracks β^−1 = 1 surfaces, with α = 0.25. Material outside r = 45 GM/c^2 is ignored here.
The first panel, made assuming flat spacetime, is akin to what general-purpose scientific visualization software would produce. This shows the spatial structure of the simulation data in an intuitive way. The second panel more directly connects the simulation variables to lensed images, as it is made with rays that follow the actual null geodesics of the spacetime. The last two panels illustrate the same dichotomy, but zoomed in to the black hole, with material outside r = 20 GM/c^2 excluded. Additionally, for these latter renderings, we exclude all material with coordinate z > −r_hor. With this cut, we can clearly tell that much of the highly magnetized plasma seen above the black hole actually resides below it. Here, the spiral-armed nature of the image outside the photon ring is matched in density but not electron temperature.

Figure 12. Top: auxiliary images of affine parameter, unpolarized emission, and unpolarized optical depth in the fiducial simulation. Bottom: averages of electron temperature weighted by the scheme corresponding to the above quantity, as described in the text. Each weighting emphasizes different parts of the simulation.

Figure 13. False-color renderings of the fiducial simulation, highlighting high densities ρ (blue), β^−1 = 1 surfaces (green), and large magnetizations σ (red). For ρ and σ, the color bars indicate the proper distance needed for a ray to achieve a given color; the color bar for β counts how many surfaces the ray crosses. The zoomed-out images use flat spacetime (first panel) and the appropriate Kerr spacetime (second panel) for ray tracing. For the zoomed-in images (third panel, flat; fourth panel, curved), only matter with z < −r_hor is considered.

True-Color Images
The vast majority of ray-traced black hole accretion images in the literature are monochromatic, showing I ν at a single frequency. It is, however, a simple matter to simultaneously calculate images for a number of different frequencies given a fixed camera and fixed simulation data, and there is information contained in the relationships between images at different frequencies.
Here we explore a novel means of presenting such multi-frequency data via "true-color" images, as seen in an appropriately boosted reference frame. The resulting images are particularly useful for outreach and public engagement, where it is often not emphasized that the ubiquitous black-reddish-orange-yellowish-white color maps used in science are merely imparting eye-pleasing colors on what is essentially grayscale information.
Given values I_{ν,i} at multiple frequencies ν_i, we can self-consistently Lorentz-boost the intensity into the human visible range. Suppose we want frequency ν_0 to boost to wavelength λ_0 in a new frame. The intensities then transform according to the Lorentz invariance of I_ν/ν^3. Given I_λ at sample wavelengths, it is straightforward to integrate against tabulated matching functions (Stockman et al. 1999; Sharpe et al. 2005, 2011) based on modern measured cone response functions (Stockman & Sharpe 2000), yielding the XYZ color coordinates of the spectrum. A standard transformation can project the resulting pixel color into RGB values. The only remaining free parameter is the overall normalization of the XYZ values, which changes the intensity but not the hue of the color; it is essentially (the reciprocal of) the camera's exposure. We apply this procedure to the fiducial simulation, viewed face-on (θ = 0), at an angle (θ = 45°), and edge-on (θ = 90°). In each case, we use 12 frequencies, evenly spaced in wavelength, from 152 GHz to 324 GHz. Choosing ν_0 = 230 GHz and λ_0 = 550 nm (near the center of human perception), the boosted wavelengths λ cover the visual range of 390–830 nm. This corresponds to traveling toward the source at a Lorentz factor of approximately 1200. The resulting images are shown in the upper panels of Figure 15, with the lower panels showing the standard 230 GHz monochromatic brightness maps from the same points of view.
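The boost bookkeeping above can be sketched directly (illustrative helpers, not the code's API): mapping an emitted frequency ν_0 to an observed wavelength λ_0 requires a Doppler factor D = c / (λ_0 ν_0), and for head-on motion D = γ(1 + β), which inverts exactly to γ = (D² + 1)/(2D).

```cpp
#include <cmath>

// Doppler factor needed to shift emitted frequency nu0 (Hz) to observed
// wavelength lambda0 (m): D = nu_observed / nu_emitted = c / (lambda0 * nu0).
double doppler_factor(double nu0_hz, double lambda0_m) {
  const double c = 2.998e8;  // speed of light, m/s
  return c / (lambda0_m * nu0_hz);
}

// For motion directly toward the source, D = gamma * (1 + beta); solving
// exactly gives gamma = (D^2 + 1) / (2 D), approximately D / 2 for D >> 1.
double lorentz_factor_head_on(double doppler) {
  return (doppler * doppler + 1.0) / (2.0 * doppler);
}
```

For ν_0 = 230 GHz and λ_0 = 550 nm this gives D ≈ 2400 and γ ≈ 1200, matching the figure quoted in the text.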
Such boosted true-color images of synchrotron emission tend to be largely white in color. This is to be expected, as the spectrum is broad compared to the factor of ∼2 range in sensitivity of the human eye, and it tends to be monomodal. Shifting the peak frequencies to wavelengths longer or shorter than ∼550 nm can make it redder or bluer, but it would be surprising to find more exotic colors as a result of this technique. Still, there are subtle color effects in Figure 15, including the fact that the approaching side of the disk is literally blueshifted in hue.

PERFORMANCE
Though creating a single ray-traced image is much less expensive than running a GRMHD simulation, or even advancing such a simulation from one snapshot to the next, a thorough analysis based on imaging can come at considerable cost given the number of parameters such as camera position, frequency, and electron model that can be varied. We therefore provide performance numbers for Blacklight, 9 in order to estimate how costly any particular analysis might be.
Here we consider the polarized ray tracing of a single GRMHD snapshot, with no slow light, adaptivity, auxiliary images, or false-color renderings. The snapshot consists of a 77 × 64 × 128 spherical grid. Tests are run on four different architectures:

• Stellar is a campus cluster at Princeton University, consisting of 2.9 GHz Cascade Lake nodes.

First, we demonstrate the "strong scaling" of the code, fixing the image resolution to be 256^2 pixels and measuring performance while varying the number of threads used by the code, from a single thread to one thread per physical core. The results, shown in Figure 16, show typical values of 100–1000 rays computed per second per thread. This includes the integration of both the geodesic equation and the radiative transfer equation, as well as the (small) amortized costs associated with bookkeeping and reading and sampling the simulation data.

Figure 14. False-color, face-on images generated from the innermost regions of the fiducial simulation. Intensity (middle) can be compared with high densities (left) or high electron temperatures (right), with the former better matching the spiral structure easily seen by eye. For ρ and Θ_e, the color bars indicate the proper distance needed for a ray to achieve a given color.

Figure 15. Top: true-color images of the fiducial simulation at three different viewing angles, obtained by blueshifting multifrequency data with a Lorentz factor of approximately 1200. Light from the left sides of the middle and right panels is emitted by plasma traveling toward the camera, and the resulting Doppler boost can be seen in the bluer color of the light. Bottom: the corresponding 230 GHz monochromatic images.
There is a considerable decline in performance with increasing thread count. This is largely unrelated to the hardware itself having less capability than the number of cores might suggest, as happens when processors throttle highly parallel computations for thermal reasons or when the memory bus throughput becomes a limiting factor; the performance numbers vary little between the cases of keeping the rest of the node idle and keeping all cores busy with comparable work. The loss in efficiency is also not the result of large sections of serial code; most of the computation in Blacklight is distributed across threads via OpenMP parallel directives, where each thread works on a contiguous block of pixels. However, these parallel loops suffer performance losses with large numbers of threads, likely due to cache coherency issues. Different threads simultaneously try to read from and write to nearby (but not identical) locations in memory as they work on the parts of various arrays associated with their block of pixels.
This scaling, however, essentially only affects users wanting to use an entire node to rapidly compute a single image at very high resolution. The more common scientific use case is computing many images at more modest resolution (e.g., 128^2 or 256^2), where multiple instances of the code can fit within the memory available on a single node. In this case ∼4 cores can be assigned to each of ∼10-30 processes, with each process launching as many threads as cores. The result will be the performance seen toward the left side of Figure 16, even with the entire node dedicated to ray tracing. Figure 17 shows that image resolution is not a factor in performance as measured in cost per pixel. Here we fix all runs to use four threads, varying the resolution of the output image. Image sizes are limited by the memory available to each node. In one case, Blacklight is used to trace 10 images at once, rather than just one, using identical camera settings and input files with identical layouts. There are some efficiency benefits from amortizing the one-time cost of integrating the geodesic equation over many images.
Taking 256^2 pixels using four threads on Stellar as a representative case, each image takes 30 s to run. Of this time, 75% is spent integrating the polarized radiative transfer equation, including calculating synchrotron coefficients; 13% is spent resampling values from the simulation grid onto geodesics; 12% is spent integrating the geodesic equation; and less than 1% is spent on all other tasks. Relative to the same 30 s total, integrating the unpolarized transfer equation takes 15% of the time, making the total unpolarized runtime 12 s, 39% that of the polarized case. If we instead produce only the 26 auxiliary images described in §4.6, the integration takes 29% of 30 s, resulting in a runtime of 16 s, 54% that of the polarized case. Finally, if we only produce a single false-color rendering with three components, as in Figure 13, the integration takes 14% of 30 s, leading to a total runtime of 11 s, 38% that of the polarized case.
Further optimization of Blacklight is underway, with the goal of achieving near-perfect scaling, as should be possible given the embarrassingly parallel nature of ray tracing many pixels. Additionally, profiling tools indicate parts of the code are not yet fully vectorized (compiled to single-instruction, multiple-data (SIMD) instructions). We note that the absolute performance is still competitive with other codes. Using a 288 × 128 × 128 iharm3D simulation data dump, we trace a 256^2 polarized image with both Blacklight and ipole on Stellar, using icpc 2021.1.2 to compile the former and gcc 8.5.0 for the latter. Though ipole has better scaling with the number of threads, Blacklight calculates and processes sample points along rays faster when using fewer than 48 threads. With a single thread, Blacklight processes 3.4 × 10^5 samples per second, compared to 1.0 × 10^5 for ipole.

CODE PHILOSOPHY
The aiding of scientific investigations is the primary motivation for Blacklight, and this is reflected in several aspects of the code.
Given that most academic researchers have access to disparate hardware and software environments with little direct technical support, it is important that codes for researchers work with minimal manual configuration. Blacklight has no dependencies on external libraries, placing the burden on the developer rather than the user to ensure it can interface with HDF5 and NumPy files, for example. It requires no third-party configuration tools. The aim is to have code that compiles on most Unix-like machines with no effort. The code is written in standard C++17, which is supported by all major compiler vendors.
Users of scientific codes should be able to discern the exact algorithms being employed in the source code, perhaps modifying the methods to suit their needs. Thorough documentation certainly helps, and the wiki for Blacklight is approximately one-third the size of the source code proper. Ideally, users should be able to inspect the source code and quickly find and understand pertinent sections. To this end, a moderate amount of encapsulation is employed (the code has just five main classes, for reading an input file, reading simulation data, integrating geodesics, integrating radiation, and outputting results), enough to allow users to understand parts of the code without understanding the entire code base, but not so much as would result in chasing the logic of a small task through many files. Blacklight uses only select features of C++ beyond what is contained in C; it is largely imperative in style, as is appropriate for implementing straightforward algorithms that employ long sequences of mathematical equations.
Finally, Blacklight is publicly available. In fact, it is placed into the public domain. There is only the one version of the code as described here; users need not worry that an official, private, better version exists. The main repository is meant to be selective in what it incorporates, so that users can have confidence that updating to the latest version will not reduce functionality or accuracy in their own workflow.
It is hoped that these precepts will lead to a code that avoids the pitfalls common to academic tools for scientific computation. Blacklight can then assist and enable black hole accretion research, especially in the arena of understanding physical processes by connecting astrophysical simulations to astronomical observations.

ACKNOWLEDGMENTS
We are indebted to those who led the way in providing the GRMHD community with ray-tracing codes, especially C. Gammie, J. Dexter, M. Mościbrodzka, and B. Prather, who answered many of this author's questions about such codes; to O. Blaes for extensive discussion regarding the nuances of synchrotron radiation; and to B. Oyang, for providing the initial motivation to write this software.