GRB 221009A: Discovery of an Exceptionally Rare Nearby and Energetic Gamma-Ray Burst

We report the discovery of the unusually bright long-duration gamma-ray burst (GRB), GRB 221009A, as observed by the Neil Gehrels Swift Observatory (Swift), Monitor of All-sky X-ray Image, and Neutron Star Interior Composition Explorer Mission. This energetic GRB was located relatively nearby (z = 0.151), allowing for sustained observations of the afterglow. The large X-ray luminosity and low Galactic latitude (b = 4.°3) make GRB 221009A a powerful probe of dust in the Milky Way. Using echo tomography, we map the line-of-sight dust distribution and find evidence for significant column densities at large distances (≳10 kpc). We present analysis of the light curves and spectra at X-ray and UV–optical wavelengths, and find that the X-ray afterglow of GRB 221009A is more than an order of magnitude brighter at T 0 + 4.5 ks than that from any previous GRB observed by Swift. In its rest frame, GRB 221009A is at the high end of the afterglow luminosity distribution, but not uniquely so. In a simulation of randomly generated bursts, only 1 in 104 long GRBs were as energetic as GRB 221009A; such a large E γ,iso implies a narrow jet structure, but the afterglow light curve is inconsistent with simple top-hat jet models. Using the sample of Swift GRBs with redshifts, we estimate that GRBs as energetic and nearby as GRB 221009A occur at a rate of ≲1 per 1000 yr—making this a truly remarkable opportunity unlikely to be repeated in our lifetime.


INTRODUCTION
Massive stars exhibit a broad continuum of properties in their terminal explosions.At one end are the cosmological long-duration gamma-ray bursts (GRBs), capable of coupling tremendous energies ( 10 51 erg) to highly collimated ejecta with a bulk Lorentz factor Γ 0 100 (Piran 2004).On the other end, the energy budget of most stripped-envelope core-collapse supernovae is dominated by the (quasi)-isotropic supernova emission, with photospheric velocities of tens of thousands of km s −1 (e.g., Liu et al. 2016).Intermediate between these two extremes lies the growing class of low-luminosity GRBs and relativistic supernovae (e.g., Margutti et al. 2014).Typified by the prototypical low-luminosity GRB 980425 associated with SN 1998bw (Galama et al. 1998;Kulkarni et al. 1998), these sources couple several orders of magnitude less energy to their moderately relativistic ejecta (∼ 10 48 erg), and lack the high degree of collimation of cosmological GRBs.
In the nearby universe (z 0.3), where high-energy facilities are sensitive to low-luminosity GRBs, highluminosity GRBs are exceedingly rare due to their much lower volumetric rate.Yet an energetic GRB observed within this volume could produce unprecedented brightness.Here we report the discovery of GRB 221009A, an extremely luminous GRB (Kann & Agui Fernandez 2022) in our cosmic backyard.
On 2022 October 9 at 14:10:17 UT, the Burst Alert Telescope (BAT; Barthelmy et al. 2005), onboard the Neil Gehrels Swift Observatory (Swift; Gehrels et al. 2004), triggered twice in rapid succession on a new cosmic source in the constellation Sagitta.Following its automated burst response, Swift promptly slewed to the location of the first trigger, detecting a bright transient seen with both the Swift X-Ray Telescope (XRT; Burrows et al. 2005) and the Ultraviolet/Optical Telescope (UVOT; Roming et al. 2005).Due to the rarity of such a repeated BAT image trigger and the proximity of the source to the Galactic plane (b = 4.3 • ), it was initially classified as a new Galactic X-ray and optical transient, and therefore was designated Swift J1931.1+1946(Dichiara et al. 2022).Monitor of All-sky X-ray Image (MAXI; Matsuoka et al. 2009) reported the detection of bright X-ray emission from this location shortly thereafter (Negoro et al. 2022).
Subsequently the Gamma-ray Burst Monitor (GBM; Meegan et al. 2009) onboard Fermi reported the detection of an exceptionally bright long-duration GRB ≈55 min prior to the initial BAT trigger, with a consistent localization (Veres et al. 2022).Due to issues in receiving data, the automated classification and localization notices associated with this onboard GBM trigger were not distributed to the world.However, the Fermi team rapidly communicated the existence of the GBM trigger, and its spatial coincidence with the double BAT trigger, to the Swift team.This spatial association, together with analysis of prompt XRT data that showed a smooth GRB-like power-law decline, led to the conclusion that Swift J1931.1+1946 was in fact a GRB, GRB 221009A (Kennea et al. 2022).For the first time in the ∼18 years since launch, BAT triggered not on the GRB prompt emission, but instead on the bright highenergy afterglow when GRB 221009A entered the field of view.
The unusual brightness of GRB 221009A prompted widespread follow-up at multiple wavelengths.Addi-tional X-ray detections were reported by the Neutron Star Interior Composition Explorer (NICER; Iwakiri et al. 2022) and the Nuclear Spectroscopic Telescope Array (NuSTAR; Brethauer et al. 2022).The Large High Altitude Air Shower Observatory reported detecting photons up to 18 TeV (Huang et al. 2022).Changes in the strength of signals propagated by radio transmitters were recorded at the time of the GBM trigger as the photons from the GRB ionized Earth's atmosphere (Schnoor et al. 2022;Guha & Nicholson 2022).Spectroscopic observations of the afterglow and the host galaxy provided a redshift of z = 0.151 (de Ugarte Postigo et al. 2022;Castro-Tirado et al. 2022;Izzo et al. 2022), corresponding to a distance of 749.3 Mpc.
This paper is organized as follows: §2 contains analysis of the observations taken by Swift (BAT, XRT, and UVOT), MAXI, and NICER; in §3, we present analysis of the dust scattering echo, broadband spectrum, and light curve of the burst afterglow; we discuss how GRB 221009A compares to other GRBs in §4, investigate the astrophysical rate of similar events, and the nature of energetic GRBs; in §5, we present our conclusions.We show that due to the combination of proximity and large (but not unprecedented) intrinsic luminosity GRB 221009A has a much brighter X-ray afterglow than previously observed Swift GRBs, and such luminous nearby events are extremely rare occurrences.
For this paper, we assume a cosmology with H 0 = 67.36,Ω m = 0.3153, Ω v = 0.6847 (Planck Collaboration et al. 2020), and z = 0.151 unless otherwise stated.We adopt the GBM trigger time as the burst onset, i.e., T 0 = 13:16:59.99UTC on 2022 October 9. Magnitudes are reported on the Vega system, and uncertainties are given at a 90% confidence interval (unless otherwise noted).

Swift Burst Alert Telescope
Figure 1 shows the BAT raw light curves (i.e., not background-subtracted), summed over all detectors, from T 0 − 500 s to T 0 + 5000 s.The location of GRB 221009A was occulted by the Earth until T 0 + 1870 s.At ∼T 0 + 1100 s, the overall count rate began to rise due to increased particle background as Swift approached the South Atlantic Anomaly (SAA).From T 0 +1317-2183 s, BAT data collection was disabled as Swift transited the SAA.The count rate remained elevated while exiting the SAA until the spacecraft slew beginning at T 0 + 2496 s, when we attribute the linearly decaying enhanced count rate to emission from GRB 221009A.However, the source location was not in the coded portion of the BAT field of view until a slew beginning at T 0 + 3095 s.
Finally after this slew completed, BAT triggered on GRB 221009A, at 14:10:18 (T 0 + 3199 s; trigger ID 1126853) and 14:17:06 (T 0 + 3607 s; trigger ID 1126854) UTC.The event data from these two triggers cover a time range from T 0 + 2960 s to T 0 + 4570 s.The maskweighted light curve shows steadily declining emission present when the burst location came into the BAT field of view at T 0 + 3173 s, and extending beyond the available event data range.The time-averaged spectrum from T 0 + 3302 s to T 0 + 4538 s is best fit by a simple power-law model, with Γ = 2.08 ± 0.03.The fluence in the 15-150 keV band is (7.4 ± 0.1) × 10 −5 erg cm −2 .
The smooth temporal evolution observed by BAT, together with the large temporal offset between the BAT and GBM triggers, indicate that BAT triggered on the afterglow of GRB 221009A.This marks the first such occurrence of a BAT afterglow trigger in the 18 years of Swift operations.
Given the exceptionally bright afterglow, we searched for even later emission in the BAT survey mode data using the BatAnalysis python package (Parsotan et al., in prep.).In individual pointings of survey mode data the afterglow was detected until 2022 October 9 21:55:38 UT (T 0 + 31 ks).We attempted to fit the spectra of each survey dataset with a power-law (cflux*po) model in XSPEC (Arnaud 1996) to obtain fluxes and photon indices.Non-detections were then analyzed to obtain 5σ upper limits following the procedure outlined in Laha et al. (2022).
At later times, the survey data were binned daily and then mosaiced together.The spectrum from each mosaiced image was fitted, and the flux and photon indices derived similar to the procedure above.The results of this analysis are plotted in the top panel of Figure 2, and listed in Appendix A.

Swift X-Ray Telescope
The XRT began observing GRB 221009A at 14:13:09 UT (T 0 + 3370 s, 170 s after the first BAT trigger) and located a bright afterglow in the initial 0.1-s image-mode exposure, after which observations began in Windowed Timing (WT) mode.The initial WT count rate was 910 ± 40 ct s −1 (all XRT count rates are corrected for the effects of pile-up and hot columns, see e.g., Evans et al. 2007); making it more than an order of magnitude brighter at this time -in observed flux -than any other GRB observed by XRT.Due to the high count rate, the XRT remained initially in WT mode, in which only one-dimensional spatial information is collected.Significant structures are present in the 1-D spatial profiles, with a clear excess compared to the expected point spread function (PSF), which were evolving with time (see Appendix B).This resembles the behavior expected when dust clouds in our Galaxy scatter X-rays from the GRB prompt emission, that were not initially traveling towards Earth, back into our line of sight.At T 0 + 89 ks, the GRB had faded sufficiently that the XRT automatically switched to Photon Counting (PC) mode.The 2-D image from this observation, shown in Figure 3, confirmed the presence of a complex series of expanding, bright rings associated with a dustscattering echo (see also Tiengo et al. 2022), the properties of which are discussed in § 3.1.
The presence of scattered emission complicates the data analysis, as it can contribute events to the regions over which source and background counts are accumulated.Both the intensity and spectrum of the rings are spatially variable, thus the selected 'background' region may in fact not be representative of the background within the source region.As a result, the automated XRT analysis (Evans et al. 2007(Evans et al. , 2009) ) needs some modification.For PC mode this is relatively simple, as we have full 2-D imaging and the innermost rings were reasonably well separated from the GRB itself.We restricted the source region to a radius of 20 pixels (47 ), and manually defined a background region as free from dust contamination as possible (identified from a stacked image of all PC mode data).
For WT mode this is more problematic since the CCD is read out in columns, rather than pixels.After investigating the variation in the echo contribution to WT-mode data, we modified the default WT extraction regions to minimize its impact (see Appendix B).We found that the WT flux is accurate to ∼6%.We further verified this by extracting a light curve using only data above 4 keV, where dust scattering becomes less efficient, given the roughly ν −2 dependence of the scattering cross section; only minimal changes in the light curve shape are observed.On the other hand, the dust has more significant impact on the WT spectra (Appendix B), resulting in an increase in the best-fit photon index up to ∼7% and a significant increase to the best-fit absorption column (up to ∼27%).
We modified the settings for the automated light curve 1 to limit the source extraction region to 20 pixels radius, and to use the PC-mode background region (described in the Appendix B); due to the unusual brightness of the source we also increased the number . Combined X-ray and UV/optical light curve of GRB 221009A.The upper panel shows observed X-ray flux from MAXI, NICER, XRT (all 0.3-10.0keV) and .BAT data are taken from detections of the afterglow in Survey data; note that the final BAT data point combines all observations integrated over 1 day.The dotted line shows a broken power-law fit to the MAXI, NICER, and XRT data.The inset figure shows the first MAXI, BAT and XRT data, with the black dashed line indicating a fit to the XRT data alone, and red dashed line a fit between the first MAXI point and the first XRT detection.The lower panel shows 7 filter optical and UV data from UVOT as obtained after the subtraction of late-time template images.The dashed line shows a power-law fit to the white band data.
of counts per bin by a factor of ten compared to the default (see Evans et al. 2007 for details).We fit the time-averaged PC mode spectrum using XSPEC with a power-law model and two absorption components.The first of these was a tbabs fixed to the Galactic value of 5.38×10 21 cm −2 (Willingale et al. 2013); the second was a ztbabs with N H free and redshift fixed at 0.151.The fit yields N H = (1.4 ± 0.4) × 10 22 cm −2 , with a photon index Γ = 1.8±0.2.We used these results to convert the light curve into observed 0.3-10 keV flux.The resulting flux measurements are shown in Figure 2.
To investigate possible spectral evolution, we extracted a series of time-resolved spectra using the 'Add time-sliced spectrum' option on the UKSSDC website 2 (Evans et al. 2009), which ensures that the modifications to the default processing parameters, described above, are employed.From the WT-mode data we created one spectrum from the first spacecraft orbit (T 0 +3.4-4.5 ks), and then spectra in 10 ks chunks from T 0 +10-50 ks.For PC mode, we extracted spectra over the intervals T 0 +67-200 ks, 200-400 ks, 400-1000 ks and > 1000 ks.For WT mode we used only grade 0 events, as is recommended for very absorbed objects 3 .We fitted these spectra independently with the model defined above; the results are shown in Table 1.The photon index ranges between ∼ 1.6 and ∼ 1.9 with the harder profiles observed during the first block of observations (from 3.4 3 https://www.swift.ac.uk/analysis/xrt/digest cal.php#abs to 4.5 ks) and around 1-2 days (from 68 to 175 ks) after T 0 .

Swift Ultraviolet/Optical Telescope
The UVOT began settled observations of the field of GRB 221009A at 2022 October 9 14:13:17 UT, T 0 +3.4 ks (179 s after the first BAT trigger; Dichiara et al. 2022;Kuin et al. 2022).A nearby star, 5 away from the GRB, contaminates the photometry when using the standard circular aperture of 5 radius.In order to minimize the star's contribution to the measurement of the afterglow, we use a 2.5 aperture to extract source counts.To be consistent with the UVOT calibration, these count rates were then corrected to 5 using the curve of growth contained in the calibration files.Background counts were extracted using two circular regions of radius 15 located in source-free regions.The count rates were obtained from the image lists using the Swift tool uvotsource.
As GRB 221009A fades, even using a small aperture, the contribution from the nearby star increases in dominance.In order to estimate the level of contamination, for each filter we combined the late time exposures between T 0 + (3.4 × 10 6 ) s and T 0 + (4.4 × 10 6 ) s.We extracted the count rate in the late combined exposures using the same 2.5 aperture and applied an aperture correction to 5 .These were subtracted from the source count rates to obtain the afterglow count rates.The afterglow count rates were converted to magnitudes using the UVOT photometric zero points (Poole et al. 2008;Breeveld et al. 2011), as well as to fluxes.To improve the signal-to-noise ratio, the count rates in each filter were binned using ∆t/t =0.2.The final photometry is plotted in the bottom panel of Figure 2 and listed in Appendix C.

MAXI
MAXI, which is onboard the International Space Station (ISS), performs scanning observations of about 80% of the sky every 92 min.Prior to the GBM detection of the prompt emission from GRB 221009A, the two Gas Slit Camera (GSC; Mihara et al. 2011) units, GSC 4 and GSC 5, scanned the source region at 12:25 UT on 2022 October 9 (T 0 −3.1 ks), and no enhancement was seen at that time.The 2-10 keV 3σ upper limit is 0.074 ct cm −2 s −1 , corresponding to a flux of approximately 7.9×10 −10 erg cm −2 s −1 .At 13:58 UT (T 0 +2.5 ks) the GSCs detected a very bright transient with a 2-10 keV flux of 5.97 ± 0.19 ct cm −2 s −1 (∼ 6.5×10 −8 erg cm −2 s −1 ; Negoro et al. 2022).Unfortunately, at the time the data were collected, there was a loss of the signal to the ground, and the data were stored in the High Rate Communication Outage Recorder in the ISS, leading to a delay in reporting the transient.Before downloading these stored data at around 15:55 UT, the MAXI/GSC Nova-Alert System (Negoro et al. 2016) triggered on the source at 15:31:03 UT (T 0 +8.0 ks), about 81 min after the BAT trigger.In this second observation by MAXI, the 2-10 keV flux had dropped to 1.00±0.08ct cm −2 s −1 (about 1×10 −8 erg cm −2 s −1 ).
Thanks to the exceptional brightness, the GSCs significantly detected the afterglow during four scans that followed the second detection, to 21:41 UT, six scans in total (e.g., Kobayashi et al. 2022).Typically, GRBs are not detected beyond the first one or two scans, taken at ∼ 92 min intervals .After the six detections, the sky region containing the source was not observable by MAXI until 15:31 UT on 2022 October 11.
We performed a spectral analysis using data obtained from GSC 4 and GSC 5 (for details see Appendix D).We first attempted a standard analysis using the same model applied to the XRT spectra, i.e., a power-law with two absorption components, with a fixed Galactic absorption column density of N H = 5.38 × 10 21 cm −2 .However, we obtained steeper photon indices and/or lower intrinsic absorption column densities than those of the XRT (see Table 6).If we fixed N H,zabs at the weighted mean value obtained with the XRT at T 0 +(3.4 − 50) ks, 1.29×10 22 cm −2 , even steeper photon indices were obtained.
These discrepancies can be explained by the GSC spectra containing dust scattered soft X-rays, which can not be spatially resolved by the GSC.Mitigation of the dust scattering process is discussed in Appendix D; to summarize, an additional spectral component to account for the scattered emission was included with a power-law index derived from XRT PC-mode data and the associated flux allowed to vary as a free parameter.The afterglow photon index was fit as a free parameter in the data from the first two scans, but fixed at the weighted mean XRT value at T 0 +(10 − 50) ks due to poor statistics.The resulting spectral parameters are displayed in Table 1, and flux measurements (extrapolated to the 0.3-10.0keV band) are plotted in Figure 2.

NICER
NICER received notification of this GRB from MAXI on the ISS via OHMAN (On-orbit Hookup of MAXI and NICER) at 14:10:57 UT, but could not begin regular observations due to poor visibility until T 0 + 52.870 ks.Only two observations were possible during this early phase.The first and second observations were at T 0 + 14.003 ks and T 0 + 19.560 ks with an exposure time of 29 s and 62 s, respectively.After T 0 + 52.870 ks, NICER intermittently made observations over a period of about 42 days.
NICER's X-ray Timing Instrument (XTI) consists of 56 modules, each comprising an X-ray concentrator optic and silcon drift detector (Gendreau et al. 2012).During these observations 52 XTI modules were operational.We reduced the data with the nicerl2 command and generated level 2 cleaned events.To avoid potential contamination of instrumental noise, we further excluded XTI module numbers 14 and 34.We defined a time interval in which an exposure time of more than 200 s can be continuously secured as a one-time interval and created an energy spectrum at each time interval (except for the first two observations) with good statistics.
To select a time period with less background influence due to charged particles, we also checked the distribution of the rate of 12-15 keV photons, where the optics have less effective area.Most data are distributed around 0.048 ct s −1 and can be represented by a normal distribution with a standard deviation of 0.018 ct s −1 .Therefore, we removed data with a rate greater than 2σ from the mean in the 12-15 keV range after T 0 +100 ks seconds where the background contribution becomes large (> 0.084 ct s −1 ).We also did not use data after T 0 +1000 ks because the background rate increased due to a geomagnetic storm.The background spectral model is estimated by the 3C50 background model (Remillard et al. 2022) using the nibackgen3c50 command.
To explore the spectral evolution as in §2.2, we divided the NICER data into six time intervals and fitted the time-resolved spectra with a power-law model and the same two absorption components.Since NICER is a non-imaging detector, it cannot remove photons originating from the dust echo within its 5 diameter fieldof-view.To evaluate the impact of the dust echo, we fitted the data in two energy ranges, 1-8 keV and 4-8 keV bands.Since the data fitting in the 4-8 keV range does not allow us to constrain the intrinsic N H value, we fixed it to 1.29×10 22 cm −2 which is the weighted mean value obtained by XRT at T 0 +(10.0-961.6)ks (Table 1).Comparing the fitting results in the two energy bands, the results for the 4-8 keV band show that the photon indices are systematically harder than those for the 1-8 keV band results and consistent with the XRT results.Therefore, we conclude that 4-8 keV light curve is much less affected by the dust echo, and report these results in Table 1.

Dust Scattering Echo
While complicating the point source analysis of the afterglow emission, the dust scattering echo contains a rich amount of information about the intervening Galactic dust along the line of sight to GRB 221009A.Dust echoes from GRBs offer the unique opportunity to measure distances to dust clouds geometrically, with a precision limited only by the angular resolution of the telescope.This is because the distance D GRB to the X-ray source is effectively infinite, relative to the distance to the dust.This allows us to eliminate the source distance from the scattering angle equation (e.g., Heinz et al. 2015) and solve it directly for the distance D dust to the dust (using the small angle approximation): where ∆t is the time delay between the GRB and the exposure and θ is the off-axis angle of the ring relative to the position of the GRB.
Because the vast majority of the fluence of the GRB was concentrated in the prompt GRB emission, to lowest order, we can approximate the input light curve that caused the echo as a delta function in time.This implies that there is a single, unique ∆t for each photon detected by the XRT, and thus, each photon can be referenced to a specific dust distance in eq. 1.This makes the analysis of this echo substantially simpler than in comparable echoes from Galactic X-ray transients.
Details of our analysis of the dust echoes from GRB 221009A are provided in Appendix E. Figure 4 shows the best fit column density histogram for a standard Mathis, Rump, & Nordsieck (MRN) model (Mathis et al. 1977), assuming a soft X-ray fluence of F 0.8−5 keV = 2.1×10 −3 erg cm −2 (Lesage et al. in prep.).The figure shows clear evidence for several dust components on the near side of the Galaxy, with the largest column densities found within distances of about 1 kpc, as expected given the Galactic latitude of GRB 221009A.
However, there is also clear evidence for dust at larger distances, located well above the Galactic disk (see also Negro et al. 2023).While the column densities of these dust concentrations are small, the large cross section at the correspondingly small scattering angles of this more distant dust makes the scattering from these clouds detectable in the radial intensity profiles, as can be seen in Figure 13, for example.As such, the dust echo from GRB 221009A proves that echo tomography is exquisitely sensitive to small dust concentrations at large distances that are not typically measurable with other techniques.
The largest column density cloud is located approximately between 400 and 600 pc distance, with a very nearby dust cloud at a distance of about 200 pc and a third cloud at distance approximately 700 pc.Naturally, these structure are very local compared to the large scale Galactic structure.The very rapid detection of the echo by the XRT allows us to map very nearby dust typically hard to study using dust echoes.We find a total Galactic column density of N H ∼ 5.9× 10 21 cm −2 , broadly consistent with the Galactic column density of 5.38 × 10 21 cm −2 found by Willingale et al. (2013).Given that the largest uncertainty in this value derives from the poorly constrained estimate of the soft X-ray fluence of GRB 221009A, we can infer that our assumed fluence of F 0.8−5 keV = 2.1 × 10 −3 ergs cm −2 is likely correct within roughly 10%.
We find that a simple MRN distribution provides an excellent fit to the temporal evolution of the radial intensity profile, without the need for more complex dust chemistry (such as ice mantles).In this sense, the dust towards GRB 221009A appears to be typical for interstellar dust seen along sightlines to other recent X-ray light echoes, such as the echo from V404 Cyg (Beardmore et al. 2016;Heinz et al. 2016;Vasilopoulos & Petropoulou 2016).

Broadband Spectral Energy Distribution
The combined XRT+MAXI+NICER dataset allows us to explore the soft X-ray afterglow evolution in great detail.In the earliest soft X-ray spectra (T 0 +2.5-10 ks), the afterglow exhibits strong spectral evolution, with the XRT WT-mode power-law photon index increasing (i.e., getting softer) by ∼ 17% (Table 1).This change is much larger than the inferred WT-mode systematic uncertainty (Appendix B), and is corroborated by the 2-20 keV MAXI spectra derived around this time (Appendix D).
From T 0 + 10-50 ks, the spectra show no definitive signs of evolution; there are hints at a reduction in the host galaxy absorption throughout the XRT WT-mode data, although this trend is reversed when XRT switched to PC mode.Given the differences between the WT-and PC-mode results are much less than the dust-induced systematics on the WT results (Appendix B), and no similar spectral evolution is observed in the NICER spectra, we do not believe there is supportable evidence for spectral evolution at these times.
On the longer timescales probed by the XRT PC-mode data, clear spectral evolution is seen and takes the form of an increasing (i.e., softer) photon index, which is confirmed by the 4-8 keV NICER spectra (Table 1).
To constrain the broadband spectral behavior, we built and fit two spectral energy distributions (SEDs) at T 0 + 4.2 ks and T 0 + 43 ks, following the procedure outlined in Schady et al. (2010Schady et al. ( , 2007)), using data from the BAT survey mode, XRT and UVOT.For the 4.2 ks SED we use data in the range 2.5-4.7 ks.For the 43 ks SED we used data in the range 23-103 ks.Details of the SED construction are provided in Appendix F.
The SEDs were fit using XSPEC.We tested two different models for the continuum: a single power-law, and a broken power-law with the change in spectral slope fixed to be ∆β = 0.5 (corresponding to the expected change in spectral slope caused by the synchrotron cooling frequency; Sari et al. 1998).In each of these models, we also included two dust and gas multiplicative components to account for the Milky Way and host galaxy dust extinction and photoelectric absorption (tbabs, ztbabs, zdust)4 .We also include a component for attenuation by the intergalactic medium (zigm).The Galactic components were frozen to the reddening and column density values from Schlegel et al. (1998) and Willingale et al. (2013), respectively.In the analysis, we found consistent results between Milky Way, Small Magellanic Cloud (SMC) and Large Magellanic Cloud dust extinction laws, therefore we only report the fits using the SMC dust extinction law.
Due to the uncertain contribution to the XRT WT spectra by the dust rings, we extracted WT spectra using the 5-20 pixel annular extraction region identified in Appendix B for the spectrum for the early SED, and a 10 pixel radius circular extraction region for the later time SED, chosen to minimize the fraction of dust scattered emission in the extraction region, with a background spectrum taken from a 75-125 pixel annular region. 5ince the intrinsic N H in the WT fits is sensitive to the dust contamination (see Appendix B and Table 5), we also provide fits using BAT and UVOT alone.The results of the analysis are given in Table 2 and plotted in Figure 5.
Examining the SED at T 0 +4.2 ks including BAT, XRT and UVOT data, we find that a broken power-law is preferred, with the F -test finding a statistical improvement of 5σ.The break energy required is at ∼ 7 keV.At the same epoch, excluding the XRT data we find again a broken power-law model is preferred, with the F -test suggesting the improvement is > 3σ.The photon index for the broken power-law model is consistent with the fits including the XRT spectral files.However, the intrinsic dust extinction is lower: 0.23 ± 0.05 mag compared to 0.51 ± 0.03 mag including all data.Not surprisingly, excluding the XRT data greatly increases the uncertainty in the break frequency.
For the SED at T 0 +43 ks, a broken power-law is again preferred, with the F -test suggesting a break is statistically required at > 3σ.The break frequency remains in the X-ray bandpass, with evidence for a decline in energy over this period.The fits still prefer a significant host galaxy reddening, but the exact value of E(B − V ) host again varies depending on the inclusion or exclusion of the XRT data.
To summarize, both at T 0 + 4.2 ks and T 0 + 43 ks, we find strong evidence favoring a broken power-law model, with the lower energy photon index Γ ≈ 1.7.Both epochs favor a break energy in the X-ray bandpass, and a modest amount of intrinsic absorption/reddening in the host galaxy: E(B − V ) host ≈ 0.3-0.5 mag; N H ≈ (1.1-1.4)×10 22cm −2 .More precise constraints, however, are precluded due to the complications resulting from the dust scattering echoes.

Light Curve
The combined X-ray and UV/optical light curves of GRB 221009A are shown in Figure 2 (top and bottom panels, respectively).The XRT, MAXI, and NICER measurements are placed in a common bandpass from 0.3-10.0keV.The BAT survey mode data, however, are not extrapolated to this band and instead plotted from 14-195 keV -due to the inference of a spectral break Table 2. Fits to the two spectral energy distributions, at T0 + 4.2 ks and T0 + 43 ks, built using UV/optical (U), X-ray (X) and gamma-ray (B) data.Both epochs were fit with a power-law (POW) and a broken power-law (BKPOW) continuum, accounting for Galactic and host galaxy gas and dust.The columns are time of the SED; model; host galaxy equivalent column density N H,X,host ; host galaxy reddening, E(B-V) host ; power-law photon index or the first photon index of the broken power-law model Γ, (Γ2 is fixed to be Γ + 0.5); break energy for the broken power-law spectral models E bk , χ 2 & degrees of freedom (d.o.f); and the null hypothesis probability.below this band ( §3.2), these points are instead meant to be illustrative.While soft X-ray observations of the afterglow did not begin until T 0 +2.5 ks, the light curve clearly lacks signatures of the canonical X-ray afterglow behavior at early times, including the prompt decay phase, the plateau phase, and any prominent flaring (Nousek et al. 2006).This relatively simple monotonic decay has been noted previously in other GRBs detected at high (≥GeV) energies (Yamazaki et al. 2020).

SED
We fit the joint 0.3-10.0keV (XRT+MAXI+NICER) X-ray light curve using the Markov Chain Monte Carlo software package emcee (Foreman- Mackey et al. 2019).An additional fixed fractional error was included as a nuisance parameter to account for cross-calibration uncertainty.We find a broken power-law model is strongly preferred over a single power-law, although still formally a more complex behavior is indicated.The best-fit parameters for the broken power-law model are α 1,x = −1.498± 0.004, α 2,X = −1.672± 0.008, and t break,X = (7.9+1.1 −1.0 ) × 10 4 s (68% confidence intervals).The initial MAXI detection at T 0 + 2.5 ks clearly indicates a shallower decay at early times (T 0 + 3.3 ks), while a late time excess (T 0 + 2 Ms) may indicate some energy injection (e.g., Zhang et al. 2006).
For the UV/optical data, we constructed a combined single-band light curve following the procedure outlined in Schady et al. (2010).We then fit this light curve using the identical procedure described above for the X-ray data (including the cross-calibration nuisance parameter).We find that a single power-law and broken powerlaw model provide comparable quality fits to the datathe statistically preferred model depends sensitively on how late-time (> 10 5 ks) measurements are included in the fit.The best fit index derived for the single powerlaw model is α O = −1.13± 0.01, while for the broken power-law model the parameters are α O,1 = −0.98 +0.11  −0.05 , α O,2 = −1.31+0.05 −0.07 , and t break,O = (2.2 +1.7 −1.1 ) × 10 4 s.Regardless of the model selected, the UV/optical light curve clearly declines more slowly than the X-ray emission.Given the large foreground extinction, these wavelengths are unaffected by contamination from an emerging supernova or an underlying host galaxy, and thus this accurately reflects the afterglow behavior.

Comparison with previous GRBs
The top panel of Figure 6 plots the observed 0.3-10.0keV flux light curve of GRB 221009A, overplotted on the entire sample of XRT light curves since Swift launch (all uncorrected for absorption).At the time of the first XRT observations, GRB 221009A is approximately an order of magnitude brighter than any previous X-ray afterglow.Put differently, forming the distribution of X-ray afterglow flux at T 0 + 4.5 ks (a time selected to minimize the number of events requiring interpolation due to orbital gaps in light curve coverage), GRB 221009A falls 5.5σ above the mean.
However, GRB 221009A is also one of the nearest GRBs detected since the launch of Swift -12 events (out of ≈400) have lower redshifts reported in the online Swift GRB Table 6 .To compare to the intrinsic properties of the X-ray afterglow sample, we converted the observed count-rate light curve into intrinsic luminosity in the 0.3-10 keV energy band in the GRB comoving frame.The luminosity in a given bin is given by: where D L is the luminosity distance (749.3Mpc), R is the measured 0.3-10 keV count rate (in the observer frame), C u is the conversion from count-rate to unabsorbed 0.3-10 keV flux (also in the observer frame) obtained from the spectral fits above ( §2.2), and k is the k-correction of Bloom et al. (2001), which corrects from observed 0.3-10 keV flux to the 0.3-10 keV flux in the GRB comoving frame.To calculate k, we assume the spectrum is an unbroken power-law with the photon index Γ = 1.8.The time axis must also be corrected to the observer frame, by dividing the time by 1 + z = 1.151.
The resultant light curve is shown in the bottom panel of Figure 6.While the lack of early time observations somewhat complicates the comparison, it is clear that GRB 221009A is at the high end of the X-ray afterglow luminosity distribution (1.8σ above the mean).But unlike when considering this comparison in flux space, GRB 221009A is by no means an outlier.Rather we conclude that the exceptional observed brightness of the X-ray afterglow of GRB 221009A results from the combination of being very nearby and very intrinsically luminous: while neither the distance nor the luminosity are unprecedented, together they make GRB 221009A unique amongst the Swift afterglow sample.
Comparing GRB 221009A to the broader population of UV/optical afterglows is further complicated by the large (and uncertain) line-of-sight extinction, both foreground and in the host galaxy.With m U = 17.59 +0.13  −0.12 mag at T 0 + 3574 s, GRB 221009A is significantly fainter than the brightest UVOT-detected afterglows, such as GRB 080319B (m U ≈ 15.4 mag; Racusin et al. 2009;Bloom et al. 2009) and GRB 130427A (m U ≈ 13.9 mag; Maselli et al. 2014) at comparable observerframe times post-burst.But even applying only a correction for foreground (i.e., Milky Way) extinction of A U ≈ 6.8 mag (Schlafly & Finkbeiner 2011), the UV/optical emission from GRB 221009A would likely have been the brightest observed to date had it occurred at high Galactic latitude.
If we adopt E(B − V ) host = 0.4 mag and an SMClike extinction law ( §3.2), we infer a U -band host extinction of 1.8 mag.Neglecting k-corrections (which are likely to be much smaller than the uncertainty in the extinction correction), we find an absolute magnitude of M U ≈ −29.3 mag (AB).Comparing to Figure 7 in Perley et al. (2014), we find that the UV/optical light curve of GRB 221009A occupies a similar location in luminosity space as the X-ray afterglow: towards the most luminous end, but entirely consistent with the existing distribution.

Astrophysical Rate
Although Swift was behind the Earth at T 0 , we use the GBM light curve and spectrum (Lesage et al. in prep.) to estimate what GRB 221009A would have looked like to the BAT, as well as to determine the occurrence rate of such events.Adopting the time-averaged spectrum of the main pulse from T 0 + 218.5-277.9s (10-1000 keV fluence of 8.293 × 10 −2 erg cm −2 ), we derive a BAT (15-150 keV band) fluence of 1.86 × 10 −2 erg cm −2 s −1 and an average flux of 3.13 × 10 −4 erg cm −2 s −1 .This fluence from the main pulse alone is 50× higher than the largest fluence detected to date by BAT (GRB 130427A; Maselli et al. 2014).
At z = 0.151, the 15-150 keV (rest-frame) isotropic prompt energy release for the main pulse is E γ,iso = 9 × 10 53 erg.We employ the 15-150 keV energy range to facilitate a direct comparison with the entire population of BAT GRBs with measured redshifts.
To estimate the relative occurrence rate of such an energetic event, we compare GRB 221009A with the intrinsic long GRB luminosity function derived in Lien et al. (2014).We consider only the main pulse, as this dominates the burst energetics, and it is difficult to ascertain when the prompt emission ends, given the afterglow was bright enough to trigger the BAT at T 0 + 3.4 ks.
We randomly generated 10 4 GRBs using the intrinsic GRB rate and luminosity distribution in Lien et al. (2014), and each simulated burst was randomly assigned a pulse structure drawn from the real BAT GRB sample.We calculated the E γ,iso of these simulated bursts, and found that only one burst had an energy release (slightly) higher than the main pulse of GRB 221009A.Therefore, we conclude only ∼1/10 4 long GRBs are as energetic as GRB 221009A.
Using this value, we crudely estimate the rate at which such energetic GRBs are detectable by BAT: where R GRB is the intrinsic all-sky long-GRB rate from Lien et al. (2014), f Eiso = 1.0 × 10 −4 is the fraction of GRBs in this intrinsic sample that have E γ,iso larger than GRB 221009A, f FOV = 1/6 is the fraction of sky covered by the BAT field of view, and f survey = 0.8 is the fraction of time BAT is able to trigger on GRBs (neglecting spacecraft slews, SAA passage, etc.).In other words, we need to wait ∼ 1/0.06 = 17 yr for an event like the main pulse of GRB 221009A to occur in the BAT field of view.
We emphasize, however, that the above estimate simply indicates that BAT should have detected ∼1 GRB more energetic than GRB 221009A over its lifetime, independent of distance.But in addition to being highly energetic, GRB 221009A is also one of the most nearby GRBs detected by Swift.To estimate the intrinsic (volumetric) rate of comparable events, we utilize the BAT trigger simulator (Lien et al. 2014) to calculate the detectability of the prompt emission at different redshifts under different representative geometries.The full details of the simulation are provided in Appendix G.
Using this framework, we derive an upper limit on the local volumetric rate of GRB 221009A-like events of R GRB,comov (z = 0) ≤ 6.1 × 10 −4 Gpc −3 yr −1 .Integrating this flat comoving rate from z = 0 to z = 10, we obtain an upper-limit on the all-sky rate of such events of ≤ 0.5 yr −1 .Comparing to the all-sky intrinsic long GRB rate of ∼ 4571 yr −1 in Lien et al. (2014), the fraction of GRB 221009A-like events is roughly 0.5/4571 ≤ 1.0 × 10 −4 , which is similar to the relative rate derived above.
Even more remarkable, we can use these results to derive the rate of GRB 221009A-like GRBs within the volume out to z = 0.151 (1.1 Gpc −3 ).This implies we would need to wait over ≈10 3 years to detect another GRB 221009A-like event within this volume.The combination of the large energy release and the small distance make GRB 221009A truly a once in a lifetime phenomenon.

The Nature of Energetic GRBs
Interpreting GRB 221009A in the context of the standard afterglow synchrotron model (e.g., Sari et al. 1998) presents a number of challenges, in particular if the X-ray and optical emission is assumed to arise from a common origin (c.f., Ghisellini et al. 2007;de Pasquale et al. 2009).To begin with, the broadband spectral fitting performed in §3.2 strongly favors the presence of a spectral break around the soft X-ray band, with the change in spectral slope being consistent with the cooling frequency ν c .However, if we adopt relatively standard afterglow parameters ( B = 0.01, p = 2.5, η γ = 0.15), then we find that a cooling frequency ≈ 5 keV (10 18 Hz) at ≈0.1 d post-trigger implies a jet expanding into an extremely low density circumburst environment: n 0 ≈ 10 −3 cm −3 for a constant-density (ISMlike) environment.Such low densities have been inferred previously for highly energetic events (e.g., Cenko et al. 2011), but this remains difficult to reconcile with the massive star progenitors of long GRBs.
Furthermore, standard afterglow closure relations (e.g., Racusin et al. 2009) are unable to reproduce the observed spectral and temporal power-law indices.For example, the broadband SED fits presented in §3.2 find an optical spectral index β O ≈ 0.7.This is consistent with the temporal decay observed for a jet expansion into a constant-density medium (α 0 ≈ 1.1; p ≈ 2.4).However, this would predict a significantly shallower temporal decay in the X-rays (α X ≈ 1.3) than what is observed.While expansion into a wind-like medium would predict a steeper X-ray temporal decay, the optical emission is inconsistent with this picture.
Given the exceptionally large E γ,iso , GRB 221009A requires a narrow opening angle (and hence an early jet break time) in order to be compatible with the geometry-corrected energy release inferred from the pre-Swift sample (e.g, Frail et al. 2001).Assuming an efficiency of ≈15% in converting bulk kinetic energy into prompt γ-ray emission, a beaming-corrected energy re-lease of ≤ 10 52 erg requires an opening angle of θ j 4 • .For a top-hat jet expanding into a constant-density medium, this implies a jet break time of: where n 0 is the circumburst density in units of 0.1 cm −3 (Sari et al. 1999).Note that due to the -1/3 exponent, this result is relatively insensitive to the exact value of the circumburst density.With X-ray coverage extending out to T 0 + 70 d, the Swift+MAXI+NICER X-ray light curve is more than adequate to search for such a geometric signature.But out to late times, the temporal decay does not steepen beyond α X = 1.7, significantly shallower than the α > 2 behavior anticipated for a simple on-axis top-hat jet.
The X-ray afterglow steepening observed at T 0 + 8 × 10 4 s could conceivably be attributed to a jet break (see also D'Avanzo et al. 2022).In this case, the corresponding jet would indeed be extremely narrow, with an implied opening angle of ≈ 2 • .However, as described above, the post-break decay index is significantly shallower than expected for an on-axis top-hat jet.Interestingly, similar behavior was observed in GRB 130427A and attributed to time-variable microphysical parameters (Maselli et al. 2014).One alternative explanation is a complex angular structure for the jet (i.e., compared to the simple top-hat model assumed above, with a constant energy as a function of angle).The presence of energetic material outside the jet "edges" would naturally account for the shallower decline initially (e.g., O'Connor et al., in prep.).However, eventually when viewed sufficiently far off-axis the decay should steepen, something not observed in the X-ray light curve out to T 0 + 70 d.A detailed exploration of the jet structure would require additional multi-wavelength observations, in particular radio coverage, and is thus beyond the scope of this work.

CONCLUSIONS
Here we present the combined Swift, MAXI, and NICER observations of GRB 221009A, spanning from T 0 + 2.5 ks to T 0 + 73 days and from UV to hard X-ray energies, providing the opportunity to study the prolonged afterglow in exquisite detail.The primary conclusions are as follows: • Through dust echo tomography, we identify a series of dust clouds along the line-of-sight in our galaxy at distances ranging from 200 pc to beyond 10 kpc (i.e., well above the Galactic disk).As such, the dust echo from GRB 221009A proves that echo tomography is acutely sensitive to small dust concentrations at large distances that are not easily measurable with other techniques.
• The X-ray afterglow of GRB 221009A is more than an order of magnitude brighter at T 0 + 4.5 ks (in terms of measured flux) than any previous GRB observed by Swift.However, the intrinsic X-ray afterglow luminosity is at the high end, but consistent with, the distribution of Swift GRBs with redshifts.Thus, GRB 221009A is unique both because it is luminous and very nearby (by GRB standards).
• We calculate the rate of GRB 221009A-like events (i.e., GRBs with prompt emission as energetic/luminous as that inferred by the GBM).We find that only ∼1 in 10 4 GRBs has an E γ,iso comparable to GRB 221009A.When factoring in the distance, we find that the rate of GRBs as energetic and nearby as GRB 221009A is 1 per 1000 years.
• From the extensive multi-wavelength data sets, we find that the afterglow emission from GRB 221009A is not well described by standard synchrotron afterglow theory.In particular, the break observed in the X-ray light curve at T 0 + 79 ks is inconsistent with a jet break from an onaxis top-hat jet.Either the jet structure is more complex, with significant power outside the jet core, or GRB 221009A is not narrowly collimated.The latter scenario would have profound implications for the energy budget of the event.
While GRB 221009A disappeared behind the Sun for most observatories at ≈70 d post-burst, the X-ray and radio afterglow are still anticipated to be easily detectable after the field becomes visible again in ≈February 2023.Coupled with the exquisite multiwavelength data collected from radio to VHE gammarays, the broadband story of this exceptional event will continue to unfold over the coming months, years, and (perhaps) decades.
We acknowledge the use of public data from the Swift data archive.3, and the results of the daily mosaicing are presented in Table 4.

B. Swift XRT DUST AND BACKGROUND SUBTRACTION
In WT mode the Swift XRT CCD is read out in columns and the spatial information collapsed to one single dimension.As a result, the 'source' region will contain all dust photons 'above' and 'below' the GRB in terms of CCD position, while the background regions will sample a different vertical slice through the dust rings, with potentially different spectral properties from those in the source region.Furthermore, the columns are summed over the full 600-pixel  height of the XRT CCD, but only the central 200 columns are read out, which can make a suitable background region difficult to obtain.
Example time-sliced 1-D profiles from the WT-mode data are shown in Figure 7.These suggest that within a 20-pixel source accumulation region, the data are dominated by the source.The profile from the first snapshot of WT data is shown in more detail in Figure 8, along with a fit of the profile expected from a point source; the lower panel shows the difference between the data and model.As the GRB was so bright in this snapshot, the central core profile is suppressed due to pile-up (resulting in a negative residual, which is not shown for clarity); the central 10 columns were subsequently excluded in the spectral extraction to remove the piled-up core.The remaining residuals suggest the scattering echoes contribute ∼ 10 − 15 % of the observed count rate in a 5 − 20 pixel radius extraction region centred on the source; thus the GRB spectral normalization will be overestimated by a similar amount due to the echo contamination in this snapshot.
To investigate the effect of the echo on the subsequent WT-mode flux measurements, we took the PC mode image from T 0 + 1.036 − 1.111 day and summed it in one dimension so as to mimic the WT mode profiles.We did this twice; first using all data, then a second time after removing a model of the central source PSF, which was fit to the inner-most 20-pixel radius data.The 1-D intensity profiles thus obtained show the source-less data are approximately flat, with a variation of < 25 % between the source outer extraction radius and the wings out to 6 arcminutes.Within the source accumulation region, ∼ 75 % of the events come from the source.Thus, the source light curve is not significantly altered by the variation in the estimated background level obtained from the wings of the 1-D profile where the dust dominates.NH (10 22 cm −2 ) 1.1 ± 0.2 1.4 ± 0.2 27% Photon index 1.47 ± 0.08 1.57 ± 0.09 7% Flux (0.3-10 keV, 10 −10 erg cm −2 s −1 ) 1.64 ± 0.06 1.74 ± 0.06 6% To determine the impact of dust on the WT-mode spectroscopic data, we used the PC mode data in ObsID 01126853004, comprising 3.1 ks of data collected between 89 and 101 ks after the Fermi trigger.7 First we extracted data from a vertical region corresponding to a typical WT-mode background region, and also from a 20-column radius region around the source, excluding the source itself.The spectrum of the dust in the source columns is visibly slightly harder than that in the background region; fitting an absorbed power-law in XSPEC showed the difference to be minimal, however, with the best-fitting parameters varying by less than 6% between the two spectra and comfortably agreeing within their 90%-confidence errors.To better quantify the effect of this possible change in dust spectrum between source and background regions, we extracted two spectra from the PC mode data.The first was taken from a 20-pixel radius circle centred on the source (i.e. with no dust present) with a background spectrum taken from the dust-free region identified above.In the second case we mimicked WT data, extracting data from the full vertical height of the CCD for regions corresponding to those used in the WT-mode analysis (Figure 9).We fit these two spectra independently in XSPEC using the absorbed power-law model.The results are shown in Table 5 and in Figure 10.While all parameters agree to within their errors, the best-fit photon index is 7% higher (i.e., softer) and the absorption column 27% higher in the WT-style data.Since these data were gathered when the source was fainter than in the real WT data (i.e. the impact of dust contamination is at its worst in this experiment), we adopt these percentages as the maximum inaccuracy expected from our WT-mode spectral fits.C. UVOT AFTERGLOW PHOTOMETRY The final photometry measured for GRB 221009A is displayed in Table 8.

D. MAXI SPECTRAL ANALYSIS AND DUST SCATTERING MITIGATION
Source spectra at the first (T 0 +2.5 ks) and second (T 0 +8.0 ks) scans were extracted from a 3.0 • × 4.0 • rectangular region centered on the source (Figure 11, left), corresponding to FwhiteM of 1.5 deg and 1.5-2 deg of the PSF for the scan and its perpendicular (anode) directions, respectively (Sugizaki et al. 2011).Since the source was detected near the center of the GSC 4 detector and at the edge of the GSC 5 detector (β = 2-3 deg, see Figure 2 in Mihara et al. 2011), we extracted the background spectra from two 2.4 • × 4.0 • rectangular regions before and after scanning the source region, avoiding a shadow region near the center frame of the GSC 4 camera body (at β 0) and a high background region in the GSC 5. GSC 4 spectra at or after the scan at 17:04 (T 0 +13.6 ks) were obtained from a circular region within a radius of 1.5 arc-deg, and background ones were from an annulus region with the inner and outer radii of 2.0 and 4.0 arc-deg overlapped with a 8.0 • × 3.4 • rectangular region (Figure 11, right).GSC 5 spectra at those scans were not used because of low signal-to-noise data.
We evaluated fluxes of the direct afterglow component in the following two ways: First, we fitted the spectra in the 4-20 keV band where the contribution from the dust scattering component is small (middle rows in Table 6).This led to harder fitted photon indices in the first and second scan spectra, Γ = 1.75 ± 0.09 and Γ = 2.07 +0.28  −0.26 , than those in the simple 2-20 keV fits, the former being closer to that of the first XRT observation, 1.61 ± 0.02.
Next we directly investigated energy spectra of the dust scattering component using the XRT data.To emulate the dust scattering spectrum, we fit the XRT spectra of two bright outer dust-echo rings at T 0 +95.4 ks and T 0 +146.7 ks Figure 12.GSC spectra at T0+2.5 ks (left panel) and at T0+8.0 ks (right).The blue solid lines show the direct power-law component with Γ free for the spectrum at T0+2.5 ks and fixed at 1.85 at T0+8.0 ks.The light blue dashed lines are the powerlaw model with Γ = 3.94.The red points and lines are background data and models, respectively.The background model is the sum of two power-laws.The best-fit data to the model ratio is also shown in each panel.
with the above power-law model with the fixed column densities.The resultant weighted mean power-law spectral index is Γ dust = 3.94 ± 0.04.Assuming this value does not change with time, we attempted to fit the GSC spectra at 2-20 keV with a model with two power-laws, allowing the normalization of both components and the non-dust photon index for the first and second scan spectra to be free (Figure 12).We summarize the fitting results in the lower rows in Table 6.Interestingly, the obtained parameters for the first and second scan spectra are almost perfectly consistent with those in the single power-law fits at 4-20 keV even if the photon index is fixed or not, though the uncertainty of the dust flux (the normalization of the power-law with Γ = 3.94) is large.We also note that the first scan spectrum tends to be harder than the second scan one in all the cases.
The differential scattering cross-section has energy E and scattering angle θ s dependence of approximately exp(−αE 2 θ 2 s ) for θ s 1 where α is a constant of proportional to the square of the size of grain (e.g., Mauche & Gorenstein 1986).Thus, an actual dust scattered component is considered to have a harder spectrum than assumed here especially in the first and second scan observations.If the time and spectral evolution of the dust echo ring is understood, the fitting parameters can be more constrained.This is, however, beyond the scope of this paper.Finally, we note that if we assume Γ = 3.5 for the scattered power-law component we got a steeper photon index (Γ = 1.75 ± 0.14) and a lower 0.3-10 keV absorbed flux of 5.76 +0.82  −0.70 × 10 −8 erg cm −2 s −1 for the direct power-law component in the first scan spectrum.

E. DUST ECHO MODELING
In order to examine the properties of the dust scattered echo, images were created from the XRT PC mode event data over the 0.8 − 5.0 keV band, where the dust reprocessing cross-section peaks.Given the scattered emission evolves radially with time, the images were initially extracted over per snapshot intervals (with typical exposures ranging from 0.3 to 1.6 ks), until day 22 post trigger when per ObsID images were created (with exposures from 3.3 to 5.2 ks).Vignetting-corrected exposure maps were also generated, spanning the observing times for each image.
Radial profiles were then generated as follows.The GRB position was obtained on a per image basis by fitting the expected XRT PSF model (including a piled-up profile modification when required, as described in Evans et al. 2020) to the imaging data out to 47.1 where the central source dominates.This position was then used as the location about which radial profiles were generated, initially in 2.357 linear bins from 0.79 to 13.75 radius, following the removal of 12 faint point sources (with count rates less than 3.5 × 10 −3 count s −1 , determined from 67 ks of late time data ranging from T 0 + 43 to 64 days after the trigger).The central source profile was retained and combined with a nominal background level determined from late time data to provide the non-halo background estimate.Example radial profiles are shown in Figure 13 which shows the echo emission expanding and decreasing in intensity with time.Later profiles were stacked to increase signal to noise in the plot, with radial bins scaled so that radial bins correspond

]
Figure 13.XRT PC mode radial intensity profiles in the 0.8-5.0keV band (blue solid lines); later profiles show stacked observations in indicated time windows, with radii scaled to the beginning of the observing window using eq.( 1) so that radial intensity features at the same distance appear at the same radius.Overplotted lines show the PSF and background model (green dashed) and the dust model (orange dashed) and the best fit model (solid red).
to the first profile in the set of stacked observations according to eq. ( 1) in order to align radial intensity features of different observations.The plots show clear excess over the PSF from the afterglow emission by the point source, with each peak (i.e., each ring in the image) corresponding to a separate dust cloud along the line of sight.
Approximating the light curve of the prompt emission as a delta function, the radial intensity of the echo at time t = t GRB + ∆t plotted in Figure 13 can be written as where n(D) is the dust volume density as a function of distance D(θ, ∆t) along the line of sight for a given scattering angle θ and time since the burst ∆t and τ phot is the optical depth to photo-electric absorption along the line of sight.The observed radial intensity distribution is then given by the convolution of eq.(E1) with the telescope PSF.
The flux of a given cloud of column density N at distance D (used to perform the dist model fits) is given by (Heinz et al. 2016) Because the time delay from the prompt emission and the radius θ of each ring are known, eq. ( E1) can be used to measure the product of burst fluence, density, and scattering cross section, and, with knowledge of the fluence and a model for the scattering cross section, we can, in principle, solve for the dust density distribution n dust as a function of line-of-sight (LOS) distance and the azimuthal angle along the ring for each observation.
In practice, estimating the fluence at soft X-ray energies (where dust scattering is most efficient and where the echo is observed) requires extrapolation from the sensitivity band of the instruments that detected the prompt emission (namely, GBM; Lesage et al. in prep.), corrected for intrinsic absorption.This fluence value, derived using GBM data (after pile-up correction procedures), has an uncertainty of ∼ 50% that propagates to our results.
In addition, models for the dust scattering cross section vary from cloud to cloud and are themselves not well constrained.Thus, a conservative approach will treat both dσ/dΩ and n Dust as functions to be fitted simultaneously, with the understanding that constraints derived for each will carry some degeneracy.
That said, the relative column density distribution is well constrained by this process, provided that dust chemistry and grain size distribution do not vary drastically from cloud to cloud.
At typical soft X-ray energies around 2keV, the scattering cross section is roughly constant at angles smaller than about 100 while falling off rapidly at larger scattering angles (roughly as dσ/dΩ ∝ θ 3.5 ).This results in a rapid decline in the intensity and flux of the echo as a function of time as the ring size expands, as can be seen in Figure 13.It further implies that, beyond a ring radius of about 100 , smaller rings, which are farther away, will have a larger intensity per column density compared to larger rings, which are closer to the observer.
On the other hand, the Galactic latitude of GRB 221009A of b = 4.3 • suggests that the Galactic column density distribution should be dominated by nearby dust within the plane, within a few kpc.For example, dust at the intersection of the LOS with the Solar circle on the far side of the Galaxy would lie 780pc (almost 8 dust scale heights, e.g., Li et al. 2018) above the Galactic plane, where we would not expect to see substantial amounts of dust (however, any dust present at those distances will benefit from the intensity enhancement discussed in the previous paragraph).
To constrain the dust column density and cross section models, we decomposed the dust distribution along the line of sight into 100 logarithmically spaced bins and determined the least squares fit for the column density for each bin.To this end, we constructed 1-D XRT point spread functions to approximate the radial profile for each dust ring.We then defined the fitting function as given by eq.(E1) for each column density component.We added the point source PSFs for each profile to the fitting function.We then fitted all 100 components simultaneously to all the background-subtracted and radial intensity profiles between day 1 and 42 post-burst.The resulting inferred density distribution is plotted in Figure 4.

F. BROADBAND SED CONSTRUCTION
The BAT survey spectra are produced using the BatAnalysis python package (Parsotan et al., in prep.).Using the HEASoft batsurvey script, the BatAnalysis package calculates the count rate of the GRB in each of the 8 energy bins used in the BAT survey (14-20, 20-24, 24-35, 35-50, 50-75, 75-100, 100-150, and 150-195 keV) and the errors associated with each energy bin.The package also generates the detector response matrix using the HEASoft batdrmgen script for the BAT survey observation of interest.
For the time intervals of the two SEDs, we created time-sliced X-ray spectra being careful to limit dust echo contamination (see §3.2).The source spectral files were grouped to ≥ 20 counts per energy bin.The spectral files were normalized to correspond to the 0.3-10 keV flux of the afterglow at T 0 + 4.2 ks or T 0 + 43 ks.The flux used to normalize a given spectrum is determined by fitting a power-law to the data within the SED time range and the best-fit decay index is used to compute the flux at the mid-point of the SED, in the same way as was done for the UVOT data.
To build the optical spectral files we follow the methodology provided in Schady et al. (2010).This essentially sets the count rates in the spectral files to that determined at an instantaneous epoch by extrapolating the UVOT light curves.In order to obtain the count rates at the instantaneous epoch, we first combined the data from the different filters into a single-flow light curve filter.The light curves of the different filters were normalized to that in the v filter.The normalization was determined by fitting a power-law to each of the light curves in a given time range simultaneously.The power-law indices were constrained to be the same for all the filters and the normalizations were allowed to vary between the filters.The ratios of the power-law normalizations between each filter and the v filter were then used to shift the individual light curves to the same normalization as the v filter, thus resulting in a single filter light curve.To construct the two SEDs at T 0 + 4.2 ks and T 0 + 43 ks, we first determine the temporal slope from the single filter light curve within the corresponding time interval.By fixing the power-law index at this value, we then fit a power-law to the individual filter light curves.We use the derived normalizations to compute the count rate and count rate error at the required time, which was then applied to the relevant spectral file.

G. BAT TRIGGER SIMULATION
To estimate the intrinsic rate of such events, we utilize the BAT trigger simulator (Lien et al. 2014) to calculate the detectability of the prompt emission at different redshifts under different representative instrumental setups.Specifically, our setup uses (1) one standard average background level, (2) four different locations on the BAT image plane (i.e., Grid ID of [14,15,16,17] or equivalent boresight angle of [56,45,27,0] deg), which represent different detector sensitivity and cover different locations within the BAT field of view, and (3) four different numbers of enabled detectors (29000, 22000, 18000, and 15000; these numbers represent the change of average number of enabled detectors from 2005 to 2022).We ran simulations with a combination of each of these setups with a sample of redshifts from z = 0.1 to z = 12 with increments of ∆z = 1.0 (i.e., z = 0.1, 1.0, 2.0, ..., 12.0).The simulation results are summarized in Table 7.
For each setup, we estimate the expected number of detections based on the highest detectable redshift z lim and an assumed intrinsic rate with the following equation: R GRB (z < z lim ) is the all-sky intrinsic rate up to the redshift limit for this burst to be detected by BAT with a specific setup of Grid ID and number of enabled detectors, and is calculated by integrating the comoving rate by taking into account of the volume of the universe through the following equation (see, e.g., Lien et al. 2011, for details of the derivation): R GRB (z < z lim ) = 4π where the comoving distance r comov (z) = ( C H0 ) z 0 1/H(z) dz.f survey is the fraction of time that BAT is capable of triggering, and we adopt f survey = 0.8 based on the study in Lien et al. (2016).f fov = (2.1 sr)/(4π sr) is the fraction of sky covered by the entire BAT field of view down to a partial coding fraction of ∼ 0.1, and f grid = [0.460,0.346, 0.109, 0.085] is the fraction of BAT field of view for Grid ID 14, 15, 16, and 17 8 .In other words, f fov × f grid is the fraction of bursts in the entire sky that would have the partial coding fraction of the specific Grid ID. f ndet is the fraction of GRBs in the BAT field of view that would occur with this number of enabled detectors.For simplicity, we assume an equal number of GRBs occur with these four numbers of enabled detectors.That is, f ndet = 0.25.
The total number of detections can then be calculated by adding up N det,isetup for all 16 combinations of instrumental setups listed in Table 7.That is, We assume a flat intrinsic comoving rate of R GRB,comov (z = 0) and adjust the value until the detection rate matches with that of a GRB 221009A-like event.We set the upper limit of the BAT detection rate of such events to be 1 per 18 years of the Swift mission lifetime, because the prompt emission of the burst actually occurred outside of the BAT field of view.This gives us a corresponding upper limit of R GRB,comov (z = 0) ≤ 6.1 × 10 −4 Gpc −3 yr −1 .Integrating this flat comoving rate from z = 0 to z = 12, we obtain an upper-limit on the all-sky rate of GRB 221009A-like events to be 0.5 yr −1 .Comparing to the all-sky intrinsic long-GRB rate of ∼ 4571 yr −1 in Lien et al. (2014), the fraction of GRB 221009A-like events is roughly 0.5/4571 ≤ 1.0 × 10 −4 .

Figure 1 .
Figure1.BAT raw light curve from the rate data.Panel (a) shows the raw light curve in the 15-350 keV band with a time binning of 1.6 s.This light curve was made from BAT quad-rate data, which records continuous count rates from all active detectors in four different energy ranges(15-25, 25-50, 50-100, and 100-350 keV).The time period when the GRB was occulted by the Earth (within 69 • of the Earth center) is marked in green.Spacecraft slew times are marked in gray.The two BAT trigger times are marked by red lines.Panel (b) shows the partial coding fraction as a function of time.A partial coding fraction of zero indicates that the GRB was outside of the BAT coded field of view, and a value of one indicates that the source is in the highest sensitivity region of the BAT coded field of view.

Figure 3 .
Figure 3. XRT PC-mode 0.8 − 5.0 keV images from ObsIDs 01126853004 (top left), 01126853006 (top right), 01126853008/09 (bottom left) and 01126853010/11 (bottom right), illustrating the echo expansion with time.The echo event radial positions were scaled by t −0.5 to each observation mid-time (given in the upper left corner of each image in days since GBM trigger) to counter the halo expansion within each observation.

Figure 4 .
Figure 4. Dust column density (expressed as equivalent hydrogen column density, assuming solar abundances) determined from fitting dust scattering models to the radial intensity profiles as a function of distance, for a standard MRN model.The dependence on the uncertain soft X-ray fluence of GRB 221009A is shown in the y-axis label.Vertical gray lines indicate 1σ error bars.

Figure 6 .
Figure 6.A comparison of GRB 221009A with all of those observed by XRT.The greyscales indicate the number of GRBs in each (time, brightness) bin, the blue light curve is GRB 221009A, with two notable bright GRBs, GRB 130427A (purple) and GRB 080319B (orange) shown for comparison.Top: Observer frame, 0.3-10 keV observed flux light curves.Bottom: Comoving-frame, 0.3-10 keV intrinsic (unabsorbed) luminosity light curves; the greyscale sample data includes only those GRBs with published redshifts.

Figure 7 .
Figure 7. XRT WT-mode 1-D PSF profiles as a function of time in the 0.8-5.0keV band, illustrating the evolution of the echo in the early WT data.

Figure 8 .
Figure 8. Upper panel: XRT WT mode 1-D PSF profile from the first snapshot of data at T0 + 3.4 ks (solid line) and the modelled point source profile (dashed line).Lower panel: the data minus model residual.

Figure 9 .
Figure 9.The extracted PC-mode data designed to mimic the WT-mode extraction to investigate the impact of dust.Left: the source region, right: the background region.The regions are tilted at the roll angle of the spacecraft.

Figure 10 .
Figure10.A comparison of a PC-mode spectrum (black), in which dust has been correctly accounted for, and a mimicked WT-mode spectrum (red) of the same data, enabling us to quantify the impact of the dust on the WT mode data.

Figure 11 .
Figure 11.GSC 2-20 keV images obtained with GSC 5 at T0+2.5 ks (left) and GSC 4 at T0+13.6 ks (right).Source and background regions are shown by the solid lines and the dashed lines, respectively.

Table 1 .
Spectral parameters determined from fitting a power-law model to the timeresolved X-ray data from Swift, MAXI and NICER.
Spectral energy distributions at T0 + 4.2 ks and T0 + 43 ks, containing BAT, XRT and UVOT data.The top panel displays the best fit unabsorbed broken power-law model (red) together with the data (black) and the data folded with the model (orange).The bottom panel shows the ratio between the data and the model folded through the instrument response.
This work made use of data supplied by the UK Swift Science Data Centre at the University of Leicester.This research has made use of MAXI data provided by RIKEN, JAXA and the MAXI team.PAE, APB, KLP, NPMK acknowledge UKSA funding.The material is based upon work supported by NASA under award number 80GSFC21M0002.Swift at Penn State is supported by NASA contract NAS5-00136.EA, MGB, AD, PDA, AM, MP, BS, SC and GT acknowledge funding from the Italian Space Agency, contract ASI/INAF n.I/004/11/4.PDA acknowledges support from PRIN-MIUR 2017 (grant 20179ZF5KS).GY acknowledges the support of NASA Postdoctoral Program at the Goddard Space Flight Center, administered by Oak Ridge Associated Universities under contract with NASA.SH acknowledges support from NSF grant AST2205917.We acknowledge the hard work of the Swift and NICER operations teams in performing these observations.

Table 3 .
BAT Survey Observations of GRB 221009A.Upper limits are calculated with an assumed photon index of 1.

Table 4 .
Swift BAT Daily and Total Mosaics of GRB 221009A.Upper limits are calculated with an assumed photon index of 1.

Table 5 .
An estimate of the impact of dust on the WT-mode spectral fit results; see text for details. 1 (PC-WT)/PC Parameter PC result Pseudo-WT result Percentage Difference 1

Table 6 .
MAXI observation logs and spectral fit results.All the exposure times are 47 s except for 48 s at 21:42:30.All the fluxes are in units of 10 −8 erg cm −2 s −1 .The photon index of 1.85 with no error is a fixed value.Note all errors given are 1σ.

Table 7 .
Detectable redshift limit for each instrumental setup."Grid ID" indicates the location at the BAT image plane and corresponds to different partial coding fractions and thus different detector sensitivities."ndet" refers to the number of enabled detectors.

Table 8
(continued)Note-Columns 1 and 2 give the mid-time of the exposure in kiloseconds since the GBM trigger and the length of the exposure divided by 2. Magnitudes are given in the Vega system.Uncertainties are given at the 1σ level.Observations with S/N greater than or equal to 2 were considered detections.For non-detections with S/N less than 2, 3σ upper limits are given.The values in this table have not been corrected for Galactic extinction.