Photon Energy-Resolved Velocity Map Imaging from Spectral Domain Ghost Imaging

We present an approach that combines photon spectrum correlation analysis with the reconstruction of three-dimensional momentum distribution from velocity map images in an efficient, single-step procedure. We demonstrate its efficacy with the results from the photoionization of the $2p$-shell of argon using the FLASH free-electron laser~(FEL). Distinct spectral features due to the spin-orbit splitting of Ar$^+(2p^{-1})$ are resolved, despite the large average bandwidth of the ionizing pulses from the FEL. This demonstrates a clear advantage over the conventional analysis method, and it will be broadly beneficial for velocity map imaging experiments with FEL sources. The retrieved linewidth of the binding energy spectrum approaches the resolution limitation prescribed by the spectrometers used to collect the data. Our approach presents a path to extend spectral-domain ghost imaging to the case where the photoproduct observable is high-dimensional.


I. INTRODUCTION
Modern x-ray spectroscopy provides a sensitive probe of local electronic density in molecular systems with atomic-site specificity [1][2][3]. As a result, x-ray free-electron lasers (XFELs) with continuous wavelength tunability throughout the soft x-ray region, unparalleled peak brightness, and short temporal duration, have enabled significant advances in time-resolved measurements of molecular dynamics [4][5][6][7]. Oftentimes the intrinsic photon energy jitter of a self-amplified spontaneous emission (SASE) XFEL, along with the strongly fluctuating sub-structure, is thought to be the limiting factor for achievable energy resolution in timeresolved spectroscopy [6][7][8]. However, the spectral fluctuations inherent to SASE operation can, in fact, be exploited as a notable advantage by correlating x-ray observables with properties of the incident pulse on a shot-to-shot basis. This is a powerful approach to improving the resolution of experiments at XFEL facilities, beyond the bandwidth limit. In particular, the application of the spectral-domain ghost imaging (SDGI) technique has demonstrated sub-bandwidth resolution for spectroscopies employing XFELs [9][10][11][12].
Thanks to their high throughput, 4π collection solid-angle, and ability to provide angleresolved photoproduct yields, velocity map imaging (VMI) spectrometers [13] have emerged as a popular instrument for time-resolved XFEL studies of dynamics in gas phase systems [3,6,7,[13][14][15][16][17]. A VMI spectrometer typically measures the two-dimensional (2D) projection of the three-dimensional (3D) momentum distribution of charged particles. In most cases, however, the quantity of interest is the underlying 3D distribution or another related quantity such as the kinetic energy (KE) spectrum, access to which requires postprocessing of the 2D projection that inverts the Abel transform [18][19][20][21][22]. In trying to apply SDGI to retrieve the photon energy-resolved KE spectrum of charged particles from a VMI data set, the momentum projection, inherent to the VMI operation principal, could complicate the procedure. A feature with a specific KE is mapped to a broad range of pixels on the VMI detector, and thus features with distinct KEs will produce heavily overlapping features in the projected distribution This is in contrast with previous applications of SDGI, where KE is more directly related to the experimental measurement [10]. In this work, we present a single-step regression approach which can simultaneously reconstruct the spectral response and the 3D momentum distribution by exploiting correlation between the single-shot VMI image and the corresponding photon spectrum demonstrating that the projection inherent in the VMI concept is not detrimental to the SDGI procedure.
FIG. 1: Schematic of a typical experimental layout that correlation analysis approaches can be applied to. XFEL pulses are incident on gaseous sample. The vital observables for our approach are the charged particles' VMI images and the x-ray spectra simultaneously measured with a high-resolution photon spectrometer on a shot-to-shot basis. Throughout this work, we refer the VMI axis as x direction, and in this figure the x-ray is polarized along z direction and propagates along y.
The primary components of a typical experiment where correlation analysis can be applied is illustrated in Fig. 1. Briefly, a stochastic light source (here the SASE XFEL [23]) produces pulses that intercept a grating. The reflected beam (zeroth order) is directed toward the interaction point of a VMI spectrometer, which records the single-shot photoelectron momentum distribution. The incident spectrum is measured from the first order diffraction of the grating [24]. In Sec. II, we describe our single-step regression approach. In Sec. III, we demonstrate it with an experiment at the Free-electron LASer in Hamburg (FLASH), where spectral resolution below the average bandwidth of the incident x-ray pulses has been achieved. We point out that sub-bandwidth resolution is a common advantage of photon spectrum correlation analysis approaches. In Sec. IV, we describe the advantage we observe in regularizing with a single-step reconstruction compared to other possible implementations. Our approach promises to enhance the energy resolution of XFEL experiments using VMI, opening up opportunities to study time-dependent phenomena through the variation in finely resolved energy structures [7,25].

A. Model
The primary quantity of interest in VMI measurements is the 3D momentum distribution of charged particles, f (p), which depends on the spectral profile of the incoming pulse, according to where a(ω) is the frequency spectrum of an incident pulse, and χ(p, ω) is the (linear) response to the spectral intensity at frequency ω. This model applies to light-matter interactions in the linear regime, such as single-photon ionization. The image obtained in a VMI spectrometer is given by the Abel transform of f with a mapping from momentum to position where x is the direction of projection, α ∝ mt −2 ToF is the magnification factor depending on the time-of-flight t ToF and particle mass m, andχ(r, ω) = (2mα) 3/2 χ(p, ω) is the ω-dependent momentum distribution.
Our analysis approach is summarized graphically in Fig. 2. The spectral response of the sample, χ(p, ω), is extracted by fitting a model to a combination of single-shot VMI images and photon spectra. The most likelyχ is the function that minimizes the difference between the measured and predicted electron momentum distributions, subject to regularization terms that favor sparsity and smoothness in the ω-dependent distributionχ. We restrict our considerations to cases with cylindrical symmetry, where the axis of symmetry (z), is oriented perpendicular to the direction of projection (x), such as the case shown in Fig. 1. Other configurations satisfying the same symmetry requirements are discussed in Sec. IV D . Such a model is sufficient to describe experiments using either linearly-or FIG. 2: Schematic of the overall analysis procedure. Given a single-shot spectral profile, our model composes a 3D momentum distribution based on parameters C and conducts the Abel transform, obtaining an expected 2D distribution to be compared with the corresponding VMI image. Tuning the model parameters arrives at an optimal point of the objective function that consists the total discrepancy and the regularization terms. The optimal set of parameters is eventually converted to the photon-frequency dependent momentum distribution.
circularly-polarized laser pulses acting on isotropic samples. The Abel transform is uniquely invertible under this symmetry condition, and the quality of the reconstruction will depend on the number of FEL shots used.

B. Implementation
In general, the inverse Abel transform is ill-conditioned, being susceptible to experimental noise [26]. Several inversion algorithms [18][19][20][21][22] have been developed to mitigate this issue and to robustly reconstruct the underlying 3D momentum distribution. In our method we adopt the basis functions employed in the well-known pBasex [18] algorithm to represent χ(r, ω) and combine the inversion procedure with spectral domain ghost imaging [9,10]. pBasex expands the three-dimensional momentum distribution with a basis set that is the product of radial basis functions and Legendre polynomials: where P l (z/r) is the l-th order Legendre polynomial, and the sum over l can be truncated at twice the highest order of light-matter interaction N L (i.e. twice the number of photons involved in the ionization process). {f k (r)} N K k=1 is a set of radial basis functions, and c lk (ω) are their ω-dependent coefficients.
Combining Eqn. (3) with Eqn. (2), the projected 2D distribution can be written as: where G lk (y, z) = f k (r)P l (z/r)dx/(2π), is the projection of each basis function perpendicular to the axis of cylindrical symmetry. It is worth noting, that after discretization on a Cartesian grid R q = (y q , z q ), the projections G lk,q = G lk (y q , z q ) are the same integrals encountered in the standard pBasex inversion method (Eqn. 4 of Ref. [18]). Being further uniformly discretized in photon energy ω w , Eqn. (4) iŝ where for each single-shot, i,B i,q is the expected intensity at pixel q given the spectral profile A i,w = A i (ω w ) and the spectral response of the sample encoded in the coefficients C lk,w = C lk (ω w ). Solving for the coefficients C from the expected imageB is straightforward, as the tensor H ≡ G ⊗ A can be reshaped into a matrix H iq,lkw = G lk,q A i,w , whose pseudoinverse mapsB to C. However, solving C from the measured images B by application of the pseudo-inverse, is highly sensitive to experimental noise in both B and photon spectra A. A more robust approach is to minimize an objective function, h consisting of h 0 , which quantifies the model-measurement discrepancy, and regularization terms favoring expected qualities of C lk,w , such as sparsity and smoothness [10], which is similar to the procedure discussed in Ref. [27]. In our case, h(C) is where is the Gaussian log-likelihood up to a global factor, with weights W q over each pixel q in the VMI image. Weights W q can be chosen to enhance the sensitivity of the reconstruction to certain regions of the VMI image, and henceforth we denote W to be the diagonal matrix constructed by them. h sp (C) and h sm,· (C) are the sparsity and smoothness regularization terms, respectively. We separate smoothness into two terms h sm,ω (C) and h sm,r (C), to differentiate the smoothness along the frequency (ω), and radial (r) directions. The corresponding hyperparameters, λ sp , λ sm,ω , λ sm,r control the degree to which sparsity and smoothness are enforced in the retrievedχ. For each direction in d = ω, r, the smoothness term quantifies the roughness with the second order difference, i.e. h sm,d (C) = L (d) C 2 2 , d = ω, r, with L (d) representing the finite-difference Laplacian operator along direction d. Common choices for the form of h sp include the L 1 -norm [28] and L 2 -norm squared [29]. We choose the latter for the demonstration in Sec. III. We discuss strategies for choosing the proper values for the hyperparameters in the supplementary material.
From the point of view of implementation, the discrepancy term, h 0 (C; A, G, B), is quadratic in the C coefficients, conducting the summation over shot i to formulate h 0 into a quadratic form prior to the optimization significantly improves the efficiency, although at the cost of caching a large matrix A T A ⊗ GW G T in memory. We demonstrate this reconstruction method for a dataset collected from soft x-ray ionization of argon above the 2p ionization threshold [30][31][32]. The experiment is conducted using the CAMP instrument [33] at beamline BL1 of the FLASH [23]. Argon gas is introduced via supersonic expansion to produce a continuous molecular beam. Following collimation through two skimmers and an aperture, the beam is intercepted by focused x-ray pulses produced by the FEL in the interaction region of the spectrometer. The velocities of the photoelectrons are mapped to a position-sensitive microchannel plate/phosphor screen detector and recorded with a CMOS camera at 10 Hz.
The pulses have an average bandwidth of 5 eV full-width at half maximum (FHWM) at 264 eV central photon energy. The estimated pulse duration is 100 − 150 fs (FWHM) with a mean pulse energy of 6 µJ/pulse at the sample. The incident photon spectrum is recorded on a shot-to-shot basis using an upstream variable line spacing (VLS) spectrometer [24]. The zeroth order beam from the grating is delivered to the interaction point, while the first order is collected by a detector to image the spectrum. This measurement is performed with linearly polarized light from the FEL, which implies that the resulting electron momentum distribution has reflection symmetry about the xy plane. In this case, we can limit the sum in Eqn. (3) to only even-order Legendre polynomials [34]. The dataset consists of 2.8 × 10 4 FEL shots, and the representative single-shot raw VMI images and photon spectra are shown in Fig. 3, where the average VMI image and photon spectrum are also shown. The single-shot VMI images are quite sparse, containing 40 electrons in each image on average.
From the minimization of Eqn. (6), we obtain the most probable parameters C lk,w given the dataset, which represents the fitted χ(p, ω). Using Eqn. (5), the average image is reconstructed by averaging the expected intensityB i,q across all FEL shots. As shown in Fig. 3 (f), the reconstructed average image agrees well with the average of raw images except being less noisy. The reconstructed average image, nevertheless, is the projected momentum distribution of the mean photon spectrum and can hardly show the underlying fine structure. With the same set of most-probable parameters C lk,w , a better way to reveal the fine structure is to reconstruct the KE spectrum for each l-th Legendre polynomial component and each photon energy ω, which is discretized as with E r = αr 2 denotes the KE at radial grid point r. The ionization from argon 2p-shell is a single-photon process, so we visualize I 0 (E, ω) and I 2 (E, ω), in Fig. 4 (b)(c) respectively. We observe two dispersive features that correspond to the spin-orbit split cationic states of argon, 2 P 1/2 and 2 P 3/2 [32]. For each photon energy in Fig. 4 (b), we shift the reconstructed KE spectrum to binding energy (BE) according to BE = ω − KE, which is shown in Fig. 4 (e). The spin-orbit splitting is well resolved with our approach. We use a previously reported measurement [32] of this splitting, 2.12 eV, to calibrate the VMI and VLS energy axes. With this calibration, we extract a line-width of 1.1 eV FWHM from Fig. 4 (e), which is resolved beyond the width of the averaged FEL pulse spectrum, 5 eV FWHM. Furthermore, the extracted line-width is below the average bandwidth of a single pulse 3.5 eV FWHM. Thus, with the application of spectral domain ghost imaging in this XFEL experiment, we are able to achieve sub-bandwidth resolution.
In Fig. 4 (c) and (f) we plot the projection of the 3D momentum distribution onto the second-order Legendre polynomial I 2 (E, ω), which is related to the photoemission anisotropy parameter, β 2 [35]. We extract a value of 0.46 ± 0.05 and 0.39 ± 0.05 for the anisotropy parameter for the 2 P 1/2 and 2 P 3/2 ionic states, respectively. These values are averaged over a range of photon energies from 261 to 265 eV, which is in good agreement with previous measurements in Ref. [36] and calculations in Ref. [37]. The results in Fig. 4 (b)(c)(e)(f) are obtained with a set of N K = 125 radial basis functions and uniform pixel weight W .
In contrast, directly applying pBasex on the cumulative image results in a single spectral feature with a width of 5.4 eV FWHM. Moreover, grouping the FEL shots by the centre-of-mass of photon spectrum does not resolve the two spin-orbit features, and the corresponding average binding energy spectrum has a width of 5.1 eV (FWHM), as shown in Fig. 4 (a) and (d). In order to obtain the resolution observed in Fig. 4 (b) and (c), we need to explicitly consider the correlations between the single-shot spectra and the VMI images.

A. Resolution achieved in our demonstration experiment
To characterize the resolution of this technique, it is useful to define the correlation length of the spectral measurement. The correlation length is a measure of how strongly coupled the intensity fluctuations are between neighboring pixels in the photon spectrum. Here, we define this metric δ A to be the average distance between two frequencies where the Pearson's correlation coefficient drops to 1/e [38]. The correlation between spectrometer pixels arises from two sources: the resolution of the photon spectrometer, and the intrinsic variation due to spectral fluctuations of the source. In our experiment, the correlation length is measured to be δ A = 22 VLS pixels, corresponding to 0.99 eV, as shown in Fig. S1(a) of the supplementary material. This is much larger than the instrumental resolution of the photon spectrometer [24], and thus, we conclude that the dominant contribution to the correlation length is the intrinsic correlation of the FEL source.
The average width of the photoemission features in the BE spectrum in Fig. 4 (e) is determined to be σ BE = 0.48 eV in standard deviation. This width is the result of the finite VMI energy resolution and the correlation in the spectral measurement, which approximately sum in quadrature σ 2 BE ≈ σ 2 VMI +σ 2 φCorr . The contribution from spectral correlation, σ φCorr , is not necessarily the correlation length δ A , but the two quantities are related by a constant of proportionality. As described in the supplemental material, and shown in Fig. S1(b), σ 2 φCorr is estimated to be 0.11 eV 2 based on the measured δ A = 22 px. Having accounted for σ 2 φCorr , we find the remaining contribution to σ 2 BE to be σ 2 VMI = 0.10 eV 2 . This corresponds to a VMI energy resolution of 2.3% at KE = 15 eV, which is consistent with the reported value of 2% for this instrument [33]. We note that the average bandwidth of incident x-ray pulses is not a limiting factor for the resolution σ BE of our technique. Instead, it is determined by the kinetic energy resolution of the VMI spectrometer and the correlation length of the photon spectrum measurement.

B. Extension of SDGI
Spectral domain ghost imaging, in general, exploits spectral fluctuations of the incident source to capture the sample response. The key development in the present work is incorporating the native projection of VMI into the ghost imaging model. This method unravels the projection of different Newton spheres, while simultaneously correlating these unravelled shells with the incident photon spectra. More specifically, our approach models the photoproduct momentum space with a reduced dimensional representation (similar to the pBasex approach) and inverts the tensor product of spectral integration and Abel transform in a single step. Such an approach is a unique way to extend the scope of SDGI. Although under ideal circumstances it is equivalent to conducting the two regression steps sequentially, the single-step approach has several advantages over either sequential approach.
In comparison to the sequential application of SDGI followed by Abel inversion, the singlestep approach represents the momentum distribution with a polar basis set, reducing the dimensionality of the problem and coordinating the comprehensive behavior of the momentum distribution over multiple pixels. When applying SDGI directly to the 2D VMI data, incorporating the smoothness term that penalizes high spatial frequencies in momentum space is inefficient and in most cases intractable. This is because the dimension of the image space is too large. Conceding this smoothness over momentum space, we can implement a sequential approach by applying SDGI to each VMI pixel independently and subsequently applying pBasex to the result. This approach remains efficient, but it is plagued by compromised performance in noise handling, as shown in the supplementary material Fig. S2. The key difference between this sequential approach and the single-step is that the regression is performed independently for each pixel in the first of the two sequential steps.
Another feasible sequential approach is to apply single-shot Abel inversion followed by SDGI. Although the regularization terms can remain in the same form as in the single-step approach, in this approach the Gaussian log-likelihood in the objective function (Eqn. 6) is replaced by, where G + = (GW G T ) −1 GW is the pseudo-inverse of the Abel transform, which is conducted separately for each photon energy bin. In our demonstration experiment, both sequential approaches show inferior performance compared to the single-step approach, which is illustrated in the comparison in the supplementary material. Under ideal circumstances, when regularization becomes unnecessary to suppress experimental noise, the optimal point of the original objective function in Eqn. (6) is where A + = (A T A) −1 A T , and in going from Eqn. (10a) to (10b) we have changed the order of operations between the pseudo-inverse and tensor-product of A and G. Equation (10a) describes the single-step approach and Eqn. (10b) denotes both the sequential approaches described above. Thus it can be seen that in the absence of regularization, both sequential approaches are equivalent to the single-step method. The regularization in Eqn. (6) necessitates the computationally heavy step of inverting an N ω N L N K -dimensional matrix, which is not true for the sequential approaches. However, as described above, the single-step outperforms the sequential approaches by virtue of being more robust to noise. Similar results between the single-step and sequential approaches may be achieved in other measurements, and the precise choice of method depends critically on the properties of the measurement; e.g. signal-to-noise ratio, size of the dataset, and number of parameters to fit.

C. Connection to Covariance Approach
We note that there are a number of mathematical techniques to extract correlations in large datasets. While the method presented in this work relies on linear regression, covariance analysis is another common method for extracting correlations [7,[39][40][41][42]. The photoproduct yield can be correlated with other intrinsic or extrinsic measurements, to extract signal from noisy data. Both regression and covariance have been employed in ghost imaging experiments in the spectral domain, and these two techniques are closely related.
The connection is shown by regressing the mean-subtracted VMI images ∆B i ≡ B i − B on the mean-subtracted spectra ∆A i ≡ A i − A , with · denoting the average across all shots. Given the standard definition of sample covariance Cov[X, Y ] = ∆X T ∆Y /(N shot − 1), the unregularized spectral regression of ∆B on ∆A gives where the right hand equality demonstrates this is identical to applying the inverse of the photon autocovariance matrix to the photon-electron covariance matrix. The transform from the projected momentum distribution to the coefficients C is a linear operation in the electron momentum space, which is in tensor product with the operations in the photon energy space. Therefore the aforementioned single-step and sequential regression approaches are all applicable to the mean-subtracted data. Comparing to the original approach elaborated in Sec. II, regressing the mean-subtracted data is more robust to static background in both the VMI images and photon spectra, but it may suffer from the loss of information due to subtraction of the mean, especially when the spectral fluctuation is limited. Regardless whether the mean is subtracted or not, one can further constrain the parameters with additional prior knowledge about the sample, to solve the most probable parameters in a space with dimension lower than N ω N L N K . Although related, these further restricted approaches require more assumptions and are less generalisable than the main approach we present in this work. We provide an example in the supplementary material.

D. Applicable Apparatus Configurations
As for standard inverse Abel transform procedures, a prerequisite of our method is the cylindrical symmetry about any axis z that is perpendicular to the projection axis of the VMI x, for which the coordinate system shown in Fig. 1 is not the only configuration. Here we describe two other configurations without exhausting all possible cases. For circularly polarized x-rays, such as in [43], the layout in Fig. 1 is still applicable except that the symmetry axis z is along the axis of beam propagation. For a co-axial VMI [44] with linearly polarized x-rays propagating along the VMI axis x, z falls along the x-ray polarization axis.

V. CONCLUSION AND OUTLOOK
We present a regression approach which can achieve VMI measurements at resolution better than the inherent bandwidth of a noisy photon source, by simultaneously reconstructing the spectral response and performing the inverse Abel transform. Our approach demonstrates a clear advantage over the conventional binning-and-averaging method. In our experimental demonstration on the photoionization of argon, the retrieved linewidth is dominated by the resolution limit prescribed by the VMI resolution and the measured spectral correlation. We anticipate that the outlined approach will be of great use in the emerging field of time-resolved inner-shell photoelectron spectroscopy at FELs, which promises to interrogate photoinduced nuclear and electronic dynamics in a site-selective manner.
We demonstrate the connection between covariance and regression analysis and we relate and compare our single-step approach to other sequential approaches. Our method extends the scope of SDGI by adopting a reduced dimensional representation for the properties of the photoproducts, and it can be directly applied to any measurement where VMI images are recorded in coincidence with the incident photon spectrum. This makes the technique broadly applicable to many different light sources.

VI. DATA AVAILABILITY
The data that support the findings of this study are available from the authors upon reasonable request. An implementation of the proposed approach can be found in our custom package developed for general SDGI [45]. The fluctuation of the spectral intensity of XFEL pulses has an intrinsic correlation between neighboring frequencies. The correlation between intensity at two frequencies can be quantified with Pearson's correlation coefficient where ∆A(ω) ≡ A(ω) − A(ω) is the fluctuation of the spectral intensity about the mean value. To extract a single correlation length, we average over a range of frequencies, ω L < ω < ω U , where the mean photon spectrum is above 1/e 2 of the maximum to obtain the average autocorrelation function and the correlation length is defined to be the frequency Ω c that the function falls to F (Ω c ) = 1/e. The autocorrelation function of the 2.8 × 10 4 XFEL pulses used in our demonstration experiment is shown in Fig. S1(a), and the correlation length is determined to be δ A = 22 VLS pixels, while the instrumental resolution is ∼ 1 px [1]. The correlation length δ A is not exactly the same as σ 2 φCorr for at least two reasons. First, the autocorrelation function is not exactly the photon spectrum, but the normalized autocovariance of the fluctuating spectrum. Therefore, if the point spread in the photon spectrum is modelled with a gaussian of width σ a0 , the autocorrelation function is also expected to be a gaussian with standard deviation √ 2σ a0 , and so at the 1/e of the maximum, δ A = 2σ a0 , thus we estimate σ a0 = 11px . Secondly, averaging over ω in obtaining the binding energy spectrum may have alleviated the blurring due to the finite δ A . Both effects maintain the linear dependence of σ 2 φCorr on δ A , but the equality may not be exact. Returning to the discussion in Section IV.A, recall that the finite resolution of the electron binding energy spectrum is given by, σ 2 BE ≈ σ 2 VMI + σ 2 φCorr . Rather than estimating σ φCorr directly from a measurement, it is easier to increase σ 2 φCorr by blurring the photon spectra, and extrapolate the behavior of this curve. To estimate σ 2 φCorr in our experiment, we blur the raw VLS spectra with a gaussian kernel of varying width σ g . At each σ g we repeat the single-step regression to obtain the binding energy spectrum, and we fit the two features to a double-gaussian model, obtaining an average linewidth σ BE of the two spin-orbit states as a function of σ 2 g . The double-gaussian model is constrained to have the same standard deviation for both peaks, which is reasonable in this case because the intrinsic linewidths of the two states only differ by 1meV [2]. Convolved with the gaussian kernel, the gaussian point spread model of the photon spectrum remains gaussian, but broadened to σ a = σ 2 a0 + σ 2 g . Given the slope shown in Fig. S1(b) , we can extrapolate to σ 2 a = σ 2 a0 + σ 2 g = 0, and we estimate that σ 2 BE would decrease by 0.11 eV 2 if δ A were 0, and thus we take σ 2 φCorr = 0.11 eV 2 as our estimate, as stated in the main text.

II. COMPARISON TO SEQUENTIAL APPROACHES
Besides the single-step approach presented in the main text, there are two sequential approaches to apply spectral-domain ghost imaging (SDGI) and Abel inversion with pBasex to the data. One method involves independently conducting SDGI on each pixel of the VMI detector followed by applying pBasex to the SDGI reconstruction. The other method applies pBasex to single-shot VMI images followed by SDGI on the single-shot coefficients. In the first sequential method, the smoothness regularization over the image space is inefficient, or even intractable to implement in some cases, because a VMI image typically has (4 × 10 2 ) 2 ≈ 2 × 10 5 pixels in a quadrant, which amounts to a forbidding dimension of (2 × 10 5 ) 2 /2 = 2 × 10 10 for the smoothness operator.
Both sequential approaches should perform well given sufficient number of hits per image, sufficient number of shots and sufficient computational power. Given the same realistic dataset, however, they are less robust to noise than the single-step approach we describe in the main text. As shown in Fig. S2 (a-d), both sequential approaches are able to resolve the two spin-orbit states, but the signal-to-noise ratio is lower than the results obtained with the single-step approach in Fig. S2 (e-f). The is particularly true for the l = 2 Legendre polynomial components. We quantify this result using the peak-signal-to-noise ratio (PSNR), referencing to the fitted result with a double-gaussian model: PSNR (I(KE, ω), I ref (KE, ω)) = 20 log 10 max KE,ω I ref where I(KE, ω) denotes a spectrum of interest, I ref (KE, ω) denotes the reference spectrum, and · denotes the average over (KE, ω) space. In our case, the reference I ref is obtained by fitting I(KE, ω) to a model containing two photoelectron features with Gaussian lineshape: It is interesting to note that, in the pBasex+SDGI sequential approach, applying the Gaussian-likelihood SDGI to single-shot pBasex results is robust to reconstruction artifacts caused by the sparsity of hits in single-shot images. The coefficients resulting from single-shot Abel inversion G + B only come into the SDGI objective function through the discrepancy h 0,p+S defined in the main text Eqn. (9), and that is again quadratic in C, so G + B only involves in the optimization through the linear term in C, whose coefficient is As the shot index i is summed over, we see that the Abel inversion is equivalently applied to the average of images weighted by spectral intensity. In other words, what appeared to be a single-shot Abel inversion is the Abel inversion on the averaged image weighted by A w , conducted for each photon energy bin ω w separately. In short, due to the linearity of Abel inversion and the fact that Gaussian-likelihood effectively contracts shot index, single-shot Abel inversion is not detrimental. This explains why in Fig. S2 , the pBasex+SDGI sequential approach outperforms the SDGI+pBasex approach.

III. HYPERPARAMETER TUNING
The result of regularized regression depends on the hyperparameters that control the regularization terms. In the previous works [3][4][5], where regularized regression spectraldomain ghost imaging was applied, the hyperparameters are either tuned in simulation or set empirically. In this work, we use k-fold cross validation [6] to evaluate the performance of models with different hyperparameters. The metric employed to score the models is the mean L 2 residue averaged over the validation sets, as explained below. We inspect the resulting ω-dependent energy spectrum from the hyperparameters that well-perform in terms of the metric. The hyperparameter is chosen based on the joint consideration of the residue's dependence on hyperparameters and the qualitative behavior of the resulting spectrum.
The k-fold cross validation procedure is conducted as follows. We randomly shuffle and split the whole dataset into k sets. Holding out one as the validation set, we solve the optimization problem with the rest of the dataset (i.e. the training set), the objective function being specified by Eqn. (6) in the main text. At each hyperparameter value (λ sp , λ sm,ω , λ sm,r ), we permute through all k possible validation sets and obtain k fitted models, each of which is evaluated on their corresponding validation set, and the average residue is the performance at this hyperparameter value. Specifically, the mean L 2 residue of a model represented by coefficients C on a validation set (A (v) , B (v) ) is where h 0 is squared-error function defined in Eqn. (5) in the main text, N (v) is the number of shots in the validation set. Note that the training set is normalized prior to optimization, which helps reduce the effective magnitude of hyperparameters to around unity, otherwise the magnitude of hyperparameters that have effects on the outcome may be hard to search. Specifically, we normalize the photon spectrum A and the projections of basis G such that i,w A i,w = N ω , kl,q G 2 kl,q = N q . The outcome optimal C has been scaled back to the original scale before scoring.
We cross validate with a k = 10 split. The hyperparameters are log-uniformly sampled in several patches of region. The same residue evaluated on the training sets is expected to by and large decrease with decreasing hyperparameters. Thus it can serve as a sanity check and provide a reference for the level of fluctuation in the hyperparameter tuning. We visualize the metric on both validation and training sets in Fig. S3 . The five best points in validation metric are marked by the crosses in Fig. S3 (a) (c), and after inspecting the resulting spectra, we choose (λ sp , λ sm,ω , λ sm,r ) = (0.5, 300, 45) as marked by the star.

IV. MODELING WITH BINDING ENERGY SPECTRUM
In this approach, we make use of the prior knowledge that, in the range of photon energy involved in this argon experiment, 1. only single-photon dispersive features are present, 2. their cross-section is, to a good approximation, independent from photon energy.
With these additional assumptions, we can further reduce the model parameters to the ionization intensity spectrum in binding energy s(BE). In this way, the photoelectron spectrum in KE is expected to be a convolution between the spectrum s(BE) and photon spectrum a(ω)Î (E) = dωs(ω − E)a(ω) , and photon-electron covariance in energy-energy space is expected to be We invert photon autocovariance out of photon-electron covariance by minimizing the objective function in Eqn.
FIG. S3: Hyperparameter tuning with 10-fold cross validation. The dependence of mean L 2 residue on hyperparameters (λ sp , λ sm,ω , λ sm,r ), evaluated on (a,c) validation sets and (b,d) training sets. A global baseline of 5.5 has been subtracted prior to visualization. The three dimensional space (λ sp , λ sm,ω , λ sm,r ) is projected into the (λ sp , λ sm,ω ) and (λ sp , λ sm,r ) spaces. The best five hyperparameters in validation metric are marked by orange crosses in (a,c), with increasing marker size indicating higher ranking. The magenta star marks the hyperparameter we selected.
where s b is the numerical second-order derivative at point b, and λ 1 and λ 2 are regularization hyperparameters.
We apply the linear transforms in electron momentum space to the photon-electron-image covariance Cov[A, B], obtaining the photon-electron covariance in energy-energy representation Cov[A, I]. The covariance matrices Cov[A, I] and Cov[A, A] are visualized in Fig. S4  (a)(b). The two spin-orbit features are visible in Cov[A, I], and by solving the constrained optimization problem, we obtain the most probable s(BE) with two fully separated features, as shown in Fig. S4 . This convolution model completely relies on the accurate calibration of the kinetic energy axis and photon energy axis, which has been done with our main approach with the slope of the photoelectron feature in the resulting I 0 (E, ω). As discussed in the main text, the comparative advantages of these related approaches may be case dependent, and the precise choice of method depends critically on the properties and purpose of the  measurements.