Direction-dependent Corrections in Polarimetric Radio Imaging. II. A-solver Methodology: A Low-order Solver for the A-term of the A-projection Algorithm

P. Jagannathan; S. Bhatnagar; W. Brisken; A. R. Taylor

doi:10.3847/1538-3881/aa989f

1. Introduction

The aperture illumination pattern (AIP) of an antenna determines the directional gain and sensitivity of the antenna to the sky brightness distribution. For an interferometric baseline consisting of a pair of antennas, the outer convolution of the two AIPs determines the Mueller matrix. The Mueller matrix encodes the mixing of the input polarization signals, including the effects of the off-axis leakage of one polarization product into another. It also largely determines the imaging performance. Accurate knowledge of the antenna AIP is essential for high-fidelity imaging performance of a radio interferometric array.

The current and next generation of interferometric arrays are outfitted with dual-polarization, wide-bandwidth receivers having high fractional bandwidths (total bandwidth/center frequency), e.g., in the case of the Very Large Array (VLA), by as much as 66%–75% (L, S, and C bands). The directional properties of the AIP in each polarization will change significantly across the band. In addition to the smoothly varying geometric frequency scaling of the AIP, effects can arise due to imperfect optical alignments or standing waves between optical elements of the antenna (e.g., Popping & Braun 2008). Standard calibration and imaging algorithms that do not account for the directional and frequency dependence of the antenna AIP lead to errors whose magnitude increases with distance from the antenna pointing direction and that are particularly significant for polarization imaging (Jagannathan et al. 2017). With a known AIP, the direction-dependent errors can be corrected over the field of view using the A-Projection algorithm (Bhatnagar et al. 2008, 2013).

The AIP can be modeled to the first order by a simple ray-tracing geometric model. Geometric models of aperture illumination deliver sufficient accuracy in the regime where the incident wavelength of electromagnetic waves is much smaller than the blocking antenna structures in the optical path. At its highest operating frequencies of tens of GHz, the VLA falls in this geometric regime. However, at lower (gigahertz) frequencies, purely geometric approaches are insufficient, and higher order effects from diffraction and scattering significantly alter the AIP. These effects introduce higher order frequency-dependent terms. Full electromagnetic (EM) simulations of the antenna (Young et al. 2013) in principle allow for the accurate modeling of the AIP, including higher order effects. However, such approaches are computationally expensive (even more so for high spectral resolution simulation across wide bandwidths) and are limited by the accuracy of the antenna models and illumination patterns, which are given as an initial input. Results from such EM simulations often do not accurately reflect the real AIPs and are difficult (and expensive) to perturb to fit the measured AIPs.

In the following sections of this paper, we describe a new hybrid method, called the A-solver, that uses holographic measurements in combination with low-order parametric modeling of the antennas to efficiently create a high spectral resolution model of the full-polarization AIP over the very wide bandwidth of the VLA. Using the parameterized beam and simulations of point sources across the field, we quantify the effect of the parameterized AIP on imaging. The detailed working of the full Mueller A-Projection algorithm and the use of frequency-dependent parameters on imaging of real VLA data will be the focus of a forthcoming third paper.

1.1. Primary Beam Correction and Imaging

Corrections for the AIP can be carried out during imaging in the aperture plane (A-Projection algorithm) or post deconvolution in the image plane using the Fourier transform of the AIP, the antenna primary beam (PB). The PB of radio antennas varies with direction and frequency. For altitude-azimuth mounted antennas the sky brightness distribution rotates with respect to the antenna primary beam as a function of the antenna parallactic angle. Consequently, for long integration observations during which the parallactic angle changes, the response of the array to a radio source includes an instrumental component that varies with time, frequency, and polarization. In Paper I (Jagannathan et al. 2017), we showed the errors introduced in polarimetric imaging when the time dependence of the antenna PB is unaccounted for, in particular for the case of altitude-azimuth (Alt-Az) mounted telescope arrays. Observations with equatorial mounted antennas or Alt-Az antennas with a third axis of motion to maintain a fixed parallactic angle (McConnell et al. 2016) allow for a simple correction in the form of a direction-dependent flux subtraction post imaging. This technique was used to good effect for the Canadian Galactic Plane Survey (Taylor et al. 2003) using the equatorial mount antennas of the Dominion Radio Astronomy Observatory synthesis radio telescope. Alternatively, for "snapshot" observations across narrow bandwidths, image plane PB corrections for all polarizations are highly effective, as demonstrated by the NVSS (Condon et al. 1998).

At low frequencies where the ionosphere plays a limiting role in full PB direction-dependent imaging, peeling-based methods (Cotton 2008 and Intema et al. 2009) are effective in producing high quality PB-corrected images in Stokes I for narrowband surveys like TGSS (Intema et al. 2017). For wide-field dipole arrays, such as the MWA, implementations such as the Real-Time System (RTS, Mitchell et al. 2008 and Ord et al. 2010), and WSCLEAN (Offringa et al. 2014) allow for image plane PB corrections. Measurements of the polarized MWA PB (Sutinjo et al. 2015) provide image plane PB models, which are modeled in terms of spherical harmonic functions on the sky (Wayth et al. 2016). However, small discrepancies in the PB model across the wide frequency bands of modern telescopes manifest as a scaling error as a function of declination and frequency, as also noted in the GLEAM survey (Hurley-Walker et al. 2014 and Hurley-Walker et al. 2017). Polarization observations at low frequencies with the MWA utilize the rotation measure induced by the ionosphere over multiple epochs as a tool in identifying the shifted rotation measure away from RM = 0 (Lenc et al. 2016, 2017). All of these approaches require imaging and deconvolution using fractions of data partitioned along all or some of the axes (time, frequency, polarization, and baseline).

Aperture plane corrections using the wide-band A-Projection algorithm (Bhatnagar et al. 2008 and Bhatnagar et al. 2013) work on unpartitioned data, thus benefiting from the full sensitivity of modern wide-band telescopes during the nonlinear operation of image modeling (a.k.a. the "deconvolution" step). Rau et al. (2016) demonstrate that multi-scale multi-frequency synthesis (minor cycle) in conjunction with AW-Projection (major cycle) performs significantly better than image plane corrections in joint deconvolution of multipointing radio deep fields. In this approach, the modeling can be shown to take advantage of the full available continuum sensitivity of the instrument. With algorithms that require partitioning the data along time or frequency or both, the available SNR for modeling processes is significantly lower. This work therefore focuses primarily on the full-polarization modeling of the PB for projection algorithms to enable wide-band full-polarization imaging, including joint-mosaic imaging of complex fields involving a large number of overlapping pointings.

2. Theory

The measurement equation for a single interferometer baseline, calibrated for direction-independent (DI) terms,⁵ is given by

$\begin{eqnarray}&&{{\boldsymbol{V}}}_{{ij}}^{\mathrm{Obs}}(\nu ,t)={W}_{{ij}}(\nu ,t)\int {{\mathsf{M}}}_{{ij}}({\boldsymbol{s}},\nu ,t){\boldsymbol{I}}({\boldsymbol{s}},\nu ){e}^{\iota {{\boldsymbol{b}}}_{{ij}}\cdot {\boldsymbol{s}}}d{\boldsymbol{s}},\end{eqnarray} \tag{ 1 }$

where ${{\boldsymbol{V}}}_{{ij}}^{\mathrm{Obs}}$ is the visibility measured by a pair of antennas i and j, with a projected separation of ${{\boldsymbol{b}}}_{{ij}}$; ${W}_{{ij}}$ are the effective weights; ${\boldsymbol{I}}({\boldsymbol{s}},\nu )$ is the full-polarization vector of the sky brightness distribution as a function of direction, ${\boldsymbol{s}}$; and ν is the observing frequency. ${{\mathsf{M}}}_{{ij}}$ is the Mueller matrix, which encodes the effects of antenna directional gain and polarization leakage on the measured visibilities. ${{\mathsf{M}}}_{{ij}}$ can be written in terms of the antenna voltage pattern (VP), ${{\mathsf{E}}}_{i}$ (following Hamaker et al. 1996), as

$\begin{eqnarray}&&{{\mathsf{M}}}_{{ij}}({\boldsymbol{s}},\nu ,t)={{\mathsf{E}}}_{i}({\boldsymbol{s}},\nu ,t)\otimes {{\mathsf{E}}}_{j}^{* }({\boldsymbol{s}},\nu ,t).\end{eqnarray} \tag{ 2 }$
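
As a minimal numerical sketch of Equation (2), the Mueller matrix at a single direction, frequency, and time can be formed as the Kronecker product of one antenna Jones matrix with the complex conjugate of the other. The Jones matrix values below are hypothetical and chosen only to show how small off-diagonal (leakage) terms in ${{\mathsf{E}}}_{i}$ and ${{\mathsf{E}}}_{j}$ populate the off-diagonal elements of ${{\mathsf{M}}}_{{ij}}$; a circular (R, L) basis is assumed.

```python
import numpy as np

# Hypothetical 2x2 antenna Jones matrices (voltage patterns) evaluated at one
# direction s, frequency and time, in an assumed circular (R, L) basis.
# Diagonal terms are the R and L gains; off-diagonal terms are leakages.
E_i = np.array([[0.95 + 0.00j, 0.02 + 0.01j],
                [0.01 - 0.02j, 0.93 + 0.00j]])
E_j = np.array([[0.97 + 0.00j, 0.015 - 0.01j],
                [0.02 + 0.01j, 0.96 + 0.00j]])

# Equation (2): outer (Kronecker) product of E_i with the conjugate of E_j.
M_ij = np.kron(E_i, np.conj(E_j))

# A leakage-free antenna pair would give a diagonal 4x4 matrix, so the largest
# off-diagonal magnitude is a simple measure of polarization mixing.
off_diagonal = M_ij - np.diag(np.diag(M_ij))
max_leakage = np.abs(off_diagonal).max()
print(M_ij.shape, max_leakage)
```

In a full treatment this product would be evaluated at every direction in the field of view, which is what makes ${{\mathsf{M}}}_{{ij}}$ direction dependent.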

${{\mathsf{M}}}_{{ij}}$ appears in Equation (1) inside the integral. Its effects therefore cannot be calibrated independently of the imaging process to reconstruct the sky brightness distribution ( ${\boldsymbol{I}}$ ). They need to be corrected for as part of the imaging process using projection algorithms like A-Projection. Projection algorithms are a class of radio interferometric imaging algorithms that correct for the terms inside the integral in Equation (1) by applying the inverse of the terms during convolutional gridding as part of the imaging process (transforming visibility data to the image domain). In an iterative ${\chi }^{2}$ -minimization scheme (e.g., Cornwell 1995 and Rau & Cornwell 2011), the update direction is computed after projecting-out the direction-dependent (DD) effects, at full accuracy in the prediction stage. These algorithms, however, require a model for ${{\mathsf{M}}}_{{ij}}$ as an input, including all the dominant effects that need to be calibrated. Equation (1) can be recast, in terms of the AIP, which is a Fourier transform of ${{\mathsf{E}}}_{i}$ as

$\begin{eqnarray}&&{{\boldsymbol{V}}}_{{ij}}^{\mathrm{Obs}}(\nu ,t)={W}_{{ij}}(\nu ,t){ \mathcal F }\,[({{\mathsf{E}}}_{i}({\boldsymbol{s}},\nu ,t)\otimes {{\mathsf{E}}}_{j}^{* }({\boldsymbol{s}},\nu ,t))\cdot {\boldsymbol{I}}({\boldsymbol{s}},\nu )]\end{eqnarray} \tag{ 3 }$

$\begin{eqnarray}&&={W}_{{ij}}(\nu ,t)[{{\mathsf{A}}}_{{ij}}\,\star \,{{\boldsymbol{V}}}_{{ij}}],\end{eqnarray} \tag{ 4 }$

where ${ \mathcal F }$ is the Fourier transform operator and ${{\boldsymbol{V}}}_{{ij}}={ \mathcal F }{\boldsymbol{I}}$ is the true full-polarization visibility vector of the sky brightness distribution. ${{\mathsf{A}}}_{{ij}}$ is the Fourier transform of ${{\mathsf{M}}}_{{ij}}$ and can be decomposed into antenna-based quantities as

$\begin{eqnarray}&&{{\mathsf{A}}}_{{ij}}={{\mathsf{A}}}_{i} \circledast {{\mathsf{A}}}_{j}^{* }.\end{eqnarray} \tag{ 5 }$

Here ${{\mathsf{A}}}_{i}$ and ${{\mathsf{A}}}_{j}$ are the AIPs for the two antennas. Given a model for the AIPs, ${{\mathsf{A}}}_{j}^{M}$ , the A-Projection algorithm computes the image as ${ \mathcal F }\,[{{\mathsf{A}}}_{{ij}}^{{M}^{\dagger }}\,\star \,{{\boldsymbol{V}}}_{{ij}}^{\mathrm{Obs}}]$ and the resulting images are normalized by an appropriate function of ${ \mathcal F }\,[{W}_{{ij}}({{\mathsf{A}}}_{{ij}}^{{M}^{\dagger }}\,\star \,{{\mathsf{A}}}_{{ij}})]$ (see Bhatnagar et al. 2013 for details). ${{\mathsf{A}}}_{{ij}}^{M}$ is constructed from the models for the AIP of the individual antennas, ${{\mathsf{A}}}_{i}^{M}$ and ${{\mathsf{A}}}_{j}^{M}$ according to Equation (5). The ability to compute these models accurately and efficiently is therefore crucial for correcting the effects of ${{\mathsf{M}}}_{{ij}}$ .
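
The step from Equation (3) to Equations (4) and (5) rests on the convolution theorem: the Fourier transform of a product of image-domain terms equals the convolution of their transforms. The sketch below checks this numerically for a scalar, one-dimensional analog with random (hypothetical) voltage patterns. It is not the full matrix-valued outer convolution of Equation (5), and `A_j_conj` is taken here to mean the transform of the conjugated pattern, which is an assumption about notation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
# Hypothetical 1D complex voltage patterns sampled at n sky directions.
E_i = rng.normal(size=n) + 1j * rng.normal(size=n)
E_j = rng.normal(size=n) + 1j * rng.normal(size=n)

# Image-domain route: multiply the patterns, then Fourier transform.
lhs = np.fft.fft(E_i * np.conj(E_j))

# Aperture-domain route: transform each term, then circularly convolve.
A_i = np.fft.fft(E_i)
A_j_conj = np.fft.fft(np.conj(E_j))  # transform of the conjugated pattern


def circular_convolve(a, b):
    """Direct O(n^2) circular convolution, written out for clarity."""
    size = len(a)
    return np.array([
        sum(a[k] * b[(m - k) % size] for k in range(size))
        for m in range(size)
    ])


# The 1/n factor comes from NumPy's unnormalized forward FFT convention.
rhs = circular_convolve(A_i, A_j_conj) / n

print(np.allclose(lhs, rhs))
```

The same identity is what allows a correction that is multiplicative in the image domain to be applied as a convolution during gridding.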

2.1. Aperture Illumination Pattern from Holography

Holography directly measures the antenna VP, E_i, using the signals from a strong unpolarized, compact calibrator radio source at a grid of positions (l, m) over the AIP. This replaces $I({\boldsymbol{s}},\nu )$ in Equation (1) with an approximation of a Kronecker delta function in ${\boldsymbol{s}}$ at each (l, m). Typically, a subset of the array antennas, whose VP is measured, scans the source, while the rest of the antennas are used as reference antennas and are pointed toward the source (i.e., the source is placed at l = m = 0). The reference antennas provide the reference signal, with respect to which the signal from the scanning antennas is measured, and when projected in the antenna Az–El plane, gives a sampled map of the complex VP, ${E}_{i}(l,m)$ , for all of the scanning antennas.

An important limitation of this method is the low signal in areas of the AIP with low directional gain. An accurate aperture model would require the holography measurement to sample beyond the first side lobes in a dense grid with a high signal-to-noise ratio. Since we are interested in the Fourier transform of the antenna VP, truncation of the measured VP after the first side lobe gives rise to errors (due to aliasing) in ${{\mathsf{A}}}_{i}={ \mathcal F }[{E}_{i}]$ .

${{\mathsf{A}}}_{i}^{{M}^{\dagger }}$ is applied as a convolutional correction while gridding the observed visibility data onto a regular grid as described in Equation (3). There are two kinds of oversampling required to represent the convolution function (CF) in its appropriate digital form for gridding. For computational efficiency reasons, the CF used for gridding in general (not just for projection algorithms) is a look-up table. To minimize quantization errors, the CF look-up table needs to be oversampled by a factor represented by the symbol O_ap (typically ${O}_{{ap}}\geqslant 20$ ). Holographic measurements are the measurements of the antenna VP (E_i) itself. To minimize aliasing as well as to measure the various features of the antenna VP accurately, the holographic measurements are also oversampled. However, due to practical limitations, a much smaller oversampling factor ( $1.5\times$ ) was used and found to be sufficient (see Section 3.2) for these antenna VP measurements. Since the holographic oversampling factor in the antenna VP measurement is much less than the oversampling factor needed for gridding (O_ap), a parameterized model of the antenna AIP is required.

3. A-Solver: Ray-tracing as a Parameterized Predictor of the Antenna AIP

3.1. Physical Modeling of the AIP

Approaches to computer models of the AIP or VP can be broadly classified as Physical Modeling or Phenomenological Modeling. While a detailed discussion of these styles of modeling is beyond the scope of this paper, we mention here that Physical Modeling (e.g., simulators using Physical Optics (PO) or full-EM simulators) minimizes the required degrees of freedom in the model and follows the physics of the problem, both of which have significant numerical and computational advantages. Physical Modeling also leads to a fundamental understanding of the instrument. Phenomenological modeling on the other hand⁶ ignores physics and models individual effects as free parameters.

A simple model of the far-field radiation pattern can be computed using geometric optics (GO). This works well for smooth surfaces away from edges and structures that are much smaller than the wavelength of radiation, where diffractive effects become important. For observations above several gigahertz with the VLA, diffractive effects are expected to be small (e.g., see Bhatnagar et al. 2008 where instrumental Stokes-V is modeled using a GO simulator for the VP). An adaptation of a GO simulator (Brisken 2003) exists in CASA, and is referred to as the "CASSBEAM" simulator. The code, though VLA-centric, can be used to model Cassegrain antennas in general. The simulator takes as input a parametric description of the structure of the antenna. The VLA antennas are a shaped Cassegrain system, with a nearly parabolic primary and a hyperbolic secondary designed to attain a more uniform illumination of the antenna aperture. The shaped aperture alters the side-lobe levels of the antenna far-field, and the side-lobe azimuthal symmetry is altered by the presence of the quadrupod legs holding up the secondary. The general shape of the main lobe and the side lobes is also altered by the central blockage due to the sub-reflector.

The set of parameters used in our work to describe the structure and optics of a VLA antenna are shown in Figure 1 and listed in Table 1. The shape of the secondary reflector is not part of the model and is computed on-the-fly during ray tracing by enforcing the optical path length of the rays to be a constant, from the time of the first incidence. The algorithm computes the changes in the electric field for the different reflections, following the rays to the feed where the electric fields in a natural linear polarization basis are transformed into a circular basis having been multiplied by the feed illumination function and the feed illumination taper function. CASSBEAM computes all the elements of the direction-dependent antenna Jones matrix—the VP for the two orthogonal polarizations along the diagonal and the leakage patterns on the anti-diagonal, including the effects of the off-axis location of the feeds (see Figure 1 of Jagannathan et al. 2017 and Equation (3) of Bhatnagar et al. 2008).

Table 1. Antenna Parameters in Ray Tracing

Description	L Band Values
Antenna name	VLA
Sub-reflector height	8.47852
Position of feed in x	−0.10026
Position of feed in y	0.97019
Position of feed in z	1.67640
Sub-reflector Angle	9.26
Width of strut legs	0.27
Strut legs distance from vertex	7.55
Height of strut legs above vertex	10.93876
Radius of central hole	2.0
Radius of the antenna	12.5
Band reference frequency	1.5
Feed taper polynomial	10.0, 2.0
Order of feed taper polynomial	2

Note. All measurements of length are in meters. All angle measures have units of degrees. All frequencies are in gigahertz. Polynomial coefficients are unitless quantities. All dimensions provided here are from Napier (1996).


This simple geometric model of the antenna aperture illumination is insufficient for A-Projection. At the L, S, and C bands (1–2, 2–4, and 4–8 GHz) of the VLA, secondary reflections, diffraction, and scattering play a major role, while standing waves in the optics play a significant role in all the bands. Secondary scatterings involving the feed, sub-reflector, support beams, and struts (all structures of the order of the wavelength of the incident electromagnetic waves) alter the PB. The effects on the antenna PB are twofold: flux is redistributed from the main lobe to the side lobes, and higher order frequency-dependent effects are introduced in the antenna PB that alter the effective off-axis leakage and the angle of polarization squint.

To refine the AIP model, we developed the A-solver approach, in which we perturb the model parameters such that the predicted AIP fits the holographic measurement of the AIP. The latter is a measure of the real AIP, which is usually significantly different from the idealized AIP.

3.2. Holography

For the holography data used here, an unpolarized source, 3C147 (<0.04% polarized), was scanned in a 35 × 35 grid in the antenna reference frame with a step size of Δl, Δm = 2.′5057, out to the second null (at 1 GHz). In addition, a polarized calibrator, 3C286, was observed to provide polarization angle calibration. Half the array was utilized as reference antennas, while the other (target) antennas scanned the source. For more details on the holographic measurement, see Perley (2016).

The visibility data were imported into AIPS to obtain the antenna grid coordinates (l, m) for each holography scan. The uncalibrated data were exported as a UVFITS file and imported into CASA as a Measurement Set. The calibrator fluxes were set using Perley & Butler (2013a) and Perley & Butler (2013b) for both 3C147 and 3C286 in all polarizations across the full bandwidth. On-axis gain, bandpass, frequency-dependent polarization leakage, and polarization position angle calibration were carried out and the calibration solutions were applied to the data using the APPLYCAL task in CASA. Subsequently, utilizing the CASA toolkit, data from baselines between the target antennas and each of the reference antennas were averaged to improve the signal-to-noise ratio of the measured VPs. Furthermore, the data recorded on a grid point for 10 s was averaged. This gave the final set of antenna VP data per channel per holography grid. The VP data were then interpolated onto a 128 × 128 grid on a per channel basis to create a 1024-channel image cube for each of the target antennas, in polarizations R and L (the diagonal elements of the DD antenna Jones matrix) and leakage patterns ${\mathtt{R}}\leftarrow {\mathtt{L}},{\mathtt{L}}\leftarrow {\mathtt{R}}$ (the anti-diagonal elements of DD antenna Jones matrix).

3.3. A-solver Optimization Procedure

The ray-tracing AIP simulator code within the CASA imaging R&D code base was modified to accept input parameters from a Python wrapper. The simulator was thereby exposed as a parameterized function in Python and used as the function to be optimized by the Nelder–Mead simplex algorithm (Nelder & Mead 1965), which fitted the function parameters against each of the individual channel images of the target antenna VP cube produced from the holography data (see Section 3.2). The optimization parameters chosen were the apparent blockage (Rhole), the feed illumination taper function (ftaper as a fourth-order polynomial), and the antenna pointing offset in R.A. and decl. (xoffset, yoffset).⁷ The apparent blockage parameter, along with the feed illumination taper function, altered the antenna AIP and consequently the antenna PB, without altering the optical path of the incident radiation. While these parameters appear independent, they are not orthogonal and together produce the best antenna AIP for a given frequency. An initial run, including only the apparent blockage and feed illumination taper, gave higher systematic gradients. Such gradients signify physical antenna pointing errors, which were independently parameterized and included as part of the optimization procedure. The simplex algorithm traversed each parameter space independent of the others, varying them until a joint minimum was found. The residuals before and after the joint minimization are shown in the upper and lower panels of Figure 2. The choice of the simplex algorithm over more computationally optimal algorithms arises from the lack of a priori knowledge of the gradients of the various minimization parameters in the seven-dimensional parameter space.
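
The structure of such a derivative-free fit can be sketched with SciPy's Nelder–Mead implementation. The forward model below is a hypothetical stand-in (a Gaussian voltage pattern with a width and a pointing offset), not the CASSBEAM ray tracer, and the parameter values and tolerances are illustrative assumptions; only the pattern of wrapping a simulator as a cost function over a measured per-channel map follows the description above.

```python
import numpy as np
from scipy.optimize import minimize


def toy_beam(params, l, m):
    """Hypothetical stand-in for the ray-tracing simulator.

    The real A-solver would call CASSBEAM with Rhole, the ftaper
    coefficients, xoffset and yoffset; here a Gaussian with a width and a
    pointing offset plays that role.
    """
    width, x_off, y_off = params
    return np.exp(-((l - x_off) ** 2 + (m - y_off) ** 2) / (2.0 * width ** 2))


# Synthetic, noise-free "holography" map for one channel on a 35 x 35 grid.
axis = np.linspace(-1.0, 1.0, 35)
l, m = np.meshgrid(axis, axis)
true_params = np.array([0.40, 0.05, -0.03])
measured = toy_beam(true_params, l, m)


def cost(params):
    # Sum of squared residuals between the model and the measured map.
    return np.sum((toy_beam(params, l, m) - measured) ** 2)


start = np.array([0.50, 0.0, 0.0])  # idealized, unperturbed starting model
result = minimize(cost, start, method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-12, "maxiter": 2000})
print(result.x, result.fun)
```

Because Nelder–Mead needs only cost-function evaluations, the simulator can be treated as a black box, at the price of many forward-model calls per channel.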

**Figure 2.** Upper panel images show the residuals (normalized with respect to peak intensity at the beam center) of $| {E}_{i}^{\mathrm{Holo}}-{{ \mathcal F }}^{-1}{A}_{i}^{\mathrm{ideal}}|$ measured at 1.353 GHz. The upper left panel shows the residual for the left-circular (`L`) polarization and the upper right panel for the right-circular (`R`). Similarly, the lower panels show $| {E}_{i}^{\mathrm{Holo}}-{{ \mathcal F }}^{-1}{A}_{i}^{M}|$ .

The CASSBEAM code uses OpenMP thread parallelization and was set to launch four threads per process call. This parallelization allowed for the fast production of a new beam model for every convergence iteration. Despite this parallelization, the minimization takes 4 hr per channel per polarization to converge to a solution. So a serial minimization would take 2 × 1024 × 4 hr to derive a channelized solution for an antenna. Since each channel minimization based on our parameterization is independent of the next channel, a simple frequency-based parallelization was used to trigger a parallel minimization run of 1024 channels and two polarizations on the Amazon Web Services (AWS) compute cluster. This reduced the compute time down to 6 hr per antenna. Our results for three antennas are discussed below. We should note here that the run time would be unreasonably long if a full Physical Optics simulator like GRASP⁸ was used instead.
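
The compute budget quoted above can be checked with a few lines of arithmetic. The inputs are the numbers stated in the text; the derived speed-up is simply their ratio and is not a figure reported by the authors.

```python
hours_per_fit = 4        # per channel, per polarization (from the text)
n_channels = 1024
n_polarizations = 2

# Serial cost of a fully channelized solution for one antenna.
serial_hours = n_polarizations * n_channels * hours_per_fit
serial_days = serial_hours / 24.0

parallel_hours = 6       # reported wall-clock time per antenna on AWS
speedup = serial_hours / parallel_hours

print(serial_hours, round(serial_days, 1), round(speedup))
```

A serial run would therefore take 8192 hr, close to a year per antenna, which is why the per-channel independence of the fit matters in practice.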

4. Results

In order to highlight the efficacy of the parameterized model ${A}_{i}^{M}$ as against the ideal model of the AIP, Figure 2 shows a comparison between $| {E}_{i}^{\mathrm{Holo}}-{{ \mathcal F }}^{-1}{A}_{i}^{\mathrm{ideal}}|$ (top row) and $| {E}_{i}^{\mathrm{Holo}}-{{ \mathcal F }}^{-1}{A}_{i}^{M}|$ (bottom row). ${A}_{i}^{\mathrm{ideal}}$ refers to the default aperture illumination produced by the CASSBEAM code, and ${A}_{i}^{M}$ is derived from the parameter values obtained by the optimization procedure discussed in Section 3.3. The first side lobe is underestimated in the upper panels of Figure 2 by 50% in both polarizations (L is upper left and R is upper right). Within the main lobe of the VP, the residuals in the upper panels show a systematic offset in power in both polarizations. An offset within the main lobe that affects both polarizations equally is a sign of mechanical antenna pointing error. In contrast, the lower panel images (lower left for L and lower right for R) of the parameterized model residuals show no sign of side-lobe power discrepancy or residual pointing error. These are residuals for one frequency channel. The optimized residuals show similar improvement across the entire bandwidth at the VLA L band for all the optimized antennas.

The optimization procedure solved for the pointing offset of the antenna and then fitted the data for Rhole, the apparent blockage. Heiles et al. (2001) demonstrated that a blocked aperture leads to increased power in the side lobes. They also show that the size and extent of the VP side lobes can be effectively shaped by tapering the illumination of the feeds. In line with their finding, we were able to effectively model the first side-lobe power by altering the apparent blockage in ray tracing in conjunction with ftaper, the feed illumination taper polynomial function utilized in our code. We find that an increased apparent blockage and a sharper tapering function for feed illumination, determined per channel across the entire band, allow for the capture of all significant changes in the antenna VP out to the first side lobe. We also note that the trend captured in the optimized parameters correlates with the measured wide-band sensitivity of the VLA L band (Momjian et al. 2014), which suggests that our optimized models correctly estimate the departures from an idealized antenna.

4.1. Apparent Central Blockage

The central blockage in an antenna reduces the aperture efficiency and increases the side-lobe levels, an aspect that is alleviated by the shaped surface design and off-axis feed geometry of the VLA, which improve the uniformity of the aperture illumination and increase the aperture efficiency (chapter 3, Taylor et al. 1999). The frequency dependence of the VP across the bandwidth, in particular the presence of a standing wave, alters the first side lobe and the shape of the polarization properties of the VP for the JVLA antenna across L band. Solving for a frequency-dependent apparent blockage allowed us to capture the frequency-dependent variation in the per-channel solutions of the Rhole parameter. Plotted in Figure 3 in red, green, and blue is the apparent blockage parameter for three different antennas derived from the optimization spanning seven spectral windows. The effect of the standing wave is captured in the variation of the Rhole parameter with frequency. This frequency-dependent variation of the Rhole parameter, in turn, can be fit using a combination of a straight line in frequency per spectral window and a sinusoidal function. The data are fit per spectral window, each containing 64 MHz of data, utilizing the Astropy models package (Astropy Collaboration et al. 2013). The fit reveals that the oscillations in frequency have a period of $\sim 17\,\mathrm{MHz}$ . The period of this oscillation corresponds to the inverse of twice the light travel time from the feed to the secondary, consistent with the presence of a standing wave between the antenna secondary and the feed. With this fit of a line and a sine function, the number of parameters that determine the frequency-dependent behavior of the antenna AIP is reduced to five numbers per spectral window. (The data and the fits to the data are available upon request.) The standing wave in the apparent blockage is a static effect that arises from a second reflection between the feed and the antenna secondary.
While these static effects are common to all the antennas analyzed, there were differences in the average trend per frequency from one antenna to another. These antenna-to-antenna variations can be naturally accounted for in the general A-Projection framework. The variations in the parameter could arise from differences in the optics from antenna to antenna, where small differences lead to measurable differences in the antenna PB.
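
A line-plus-sinusoid fit of this kind can be sketched as follows. This is not the Astropy-based fit used for the paper: as a simpler substitute, the sketch scans trial periods and solves the remaining linear terms by least squares at each one. The synthetic Rhole values, the noise-free data, and the trial-period grid are all assumptions made for illustration. The last step converts the fitted period into the one-way path length implied by a round-trip reflection, $d=c/(2\,{\rm{\Delta }}\nu )$.

```python
import numpy as np

# Synthetic apparent-blockage values for one 64 MHz spectral window.
# All numbers here are made up for illustration.
nu = np.arange(64.0)                      # MHz offset within the window
true_period = 17.0                        # MHz
rhole = 2.0 + 0.002 * nu + 0.1 * np.sin(2 * np.pi * nu / true_period + 0.7)


def fit_for_period(period):
    """Least-squares fit of a + b*nu + c*sin + d*cos at a fixed period."""
    phase = 2 * np.pi * nu / period
    design = np.column_stack(
        [np.ones_like(nu), nu, np.sin(phase), np.cos(phase)])
    coeffs, *_ = np.linalg.lstsq(design, rhole, rcond=None)
    rss = np.sum((design @ coeffs - rhole) ** 2)
    return rss, coeffs


# Scan trial periods; the linear terms are solved exactly at each one.
trial_periods = np.arange(10.0, 30.0, 0.05)
fits = [fit_for_period(p) for p in trial_periods]
best = int(np.argmin([rss for rss, _ in fits]))
best_period = trial_periods[best]
intercept, slope, c_sin, c_cos = fits[best][1]
amplitude = np.hypot(c_sin, c_cos)

# One-way path length implied by a round-trip standing wave.
c_light = 299792458.0                     # m/s
path_length_m = c_light / (2.0 * best_period * 1e6)
print(best_period, amplitude, path_length_m)
```

The five fitted quantities per window in this sketch are the intercept, slope, amplitude, period, and phase (the last two sinusoid coefficients being equivalent to an amplitude and a phase). For a period near 17 MHz the implied one-way path is roughly 8.8 m.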

**Figure 3.** Fit to the recovered *apparent* blockage parameter for antennas 6, 10, and 12, in red, green, and blue respectively, with the lines representing the fit and the points representing the derived apparent blockage data across 448 MHz of data.

4.2. Feed Illumination Taper

The VLA receiver feeds lie on a circle around the optical axis. The feeds are illuminated by the sub-reflector, and the angular span of the illumination can be altered by tapering the feed illumination pattern. The tapered illumination pattern reduces the amount of radiation received from the edges of the dish, which, while marginally reducing aperture efficiency, effectively stops the feed from receiving spillover radiation. In addition to reducing the spillover, it also alters the shape and the gain of the PB side lobe. The parameterized AIP model allowed for the taper function (a fourth-order polynomial) to vary along with the central blockage to optimally match the shape and structure of the antenna VP out to the first lobe. In Figure 4, the normalized amplitude of the feed taper function is plotted against the angular distance from the feed axis for antenna 12, at 1.0, 1.5, and 2.0 GHz in red, green, and blue dashed lines respectively. The feed taper function obtained from the optimization is plotted in light red, green, and blue solid lines respectively. The taper functions determined from parameter optimization have stronger tapering and a sharper fall-off, resulting in less feed illumination overall, to match the VPs determined through holography.

**Figure 4.** Dashed red, green, and blue lines show the feed taper function at 1, 1.5, and 2.0 GHz respectively, used to derive A_i^ideal. The solid, red, green, and blue (overwritten by the dashed red line) lines show the feed taper function at 1, 1.5, and 2.0 GHz, respectively, used to derive A_i^M for antenna 12.

4.3. Pointing Offset

Figure 5 plots the per-channel solutions for the pointing offsets for antennas 6, 10, and 12 in blue, green, and red, respectively, in units of the half power beam width (HPBW). Any linear scaling with frequency is therefore removed, and all optical effects that scale linearly with frequency should appear as flat curves in this plot. On the other hand, effects like the mechanical pointing offsets, which are not optical effects, should appear in this plot with a linear slope as a function of frequency. The mean separation between the R and L beams is $\sim 5.7 \%$ , corresponding to the known polarization squint due to the off-axis optics of the JVLA antenna. The solid lines and the fainter points plotted above the curves showing the R- and L-beam offsets are the mechanical pointing offsets for the three antennas. Antenna 6 shows the largest pointing offset, indicated by the line fit with a slope of ∼2.′4. All antennas show mild variations in the pointing offsets with frequency. Frequency-dependent pointing error over and above the squint can be caused by an uncorrected second-order term in phase across the antenna. The higher order phase terms also affect ${E}^{R\leftarrow L}$ and ${E}^{L\leftarrow R}$ adversely, introducing squash and other higher order distortions. Modeling these higher order phase errors in the off-diagonal Jones matrix is covered in Section 4.6. Once the pointing offset had been solved for per channel, solutions for the apparent blockage (Rhole) and the feed illumination taper polynomial (ftaper) were derived.
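
The reasoning behind plotting offsets in HPBW units can be illustrated with a short calculation. A mechanical pointing error is a fixed angle on the sky, whereas the HPBW shrinks as 1/ν, so the same error expressed in HPBW units grows linearly with frequency. The beam-width coefficient of 1.02 and the 0.5 arcmin offset below are illustrative assumptions; the 25 m diameter follows from the 12.5 m antenna radius in Table 1.

```python
import math

C_LIGHT = 299792458.0   # m/s
DISH_DIAMETER = 25.0    # m, twice the 12.5 m radius listed in Table 1
HPBW_FACTOR = 1.02      # assumed coefficient; the true value depends on taper


def hpbw_arcmin(freq_hz):
    """Approximate half power beam width, HPBW ~ k * lambda / D."""
    radians = HPBW_FACTOR * (C_LIGHT / freq_hz) / DISH_DIAMETER
    return math.degrees(radians) * 60.0


def offset_in_hpbw(offset_arcmin, freq_hz):
    """Express a fixed angular offset as a fraction of the HPBW."""
    return offset_arcmin / hpbw_arcmin(freq_hz)


mechanical_offset = 0.5  # arcmin, hypothetical fixed pointing error
for f_ghz in (1.0, 1.5, 2.0):
    print(f_ghz, round(offset_in_hpbw(mechanical_offset, f_ghz * 1e9), 4))
```

An optical effect whose angular size itself scales as 1/ν, such as the squint, would instead give the same fraction of the HPBW at every frequency.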

4.4. Antenna AIP and Imaging

A sub-optimal AIP model, ${{\mathsf{A}}}_{{ij}}^{M}$ , will create errors in the image that can be characterized in the residual image. The residual error contribution in a snapshot for a single baseline i − j can be written as

$${I}^{\mathrm{res}}={I}^{\mathrm{psf}}\,\star \,[{\rm{\Delta }}{M}_{{ij}}\cdot I^\circ ], \tag{6}$$

where I^res is the residual image, I^psf is the telescope point-spread function to be deconvolved, $I^\circ$ is the true sky distribution, and ${\rm{\Delta }}{M}_{{ij}}={ \mathcal F }[{A}_{{ij}}^{\mathrm{True}}]-{ \mathcal F }[{A}_{{ij}}^{M}]$ is the difference between the Fourier transforms of the true and the model antenna AIPs.
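
Equation (6) can be sketched numerically for a single scalar element. The example below is a toy under stated assumptions: Gaussian stand-ins replace both $\mathcal{F}[A^{\mathrm{True}}]$ and $\mathcal{F}[A^{M}]$, the PSF is a narrow Gaussian, and the convolution is circular (FFT based). The function names and all widths are hypothetical.

```python
import numpy as np


def gaussian_2d(n, fwhm_pix):
    """Unit-peak circular Gaussian centered on pixel (n//2, n//2)."""
    y, x = np.indices((n, n)) - n // 2
    return np.exp(-4.0 * np.log(2.0) * (x**2 + y**2) / fwhm_pix**2)


def residual_image(psf, pb_true, pb_model, sky):
    """I_res = I_psf (convolved with) [Delta M . I_sky], scalar toy version."""
    delta_m = pb_true - pb_model
    err = delta_m * sky
    # Move the PSF peak to pixel (0, 0) so the FFT convolution is not shifted.
    psf_ft = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(psf_ft * np.fft.fft2(err)))


n = 64
pb_true = gaussian_2d(n, 20.0)    # stand-in for the true PB
pb_model = gaussian_2d(n, 22.0)   # stand-in for a sub-optimal model PB
psf = gaussian_2d(n, 3.0)         # toy point-spread function

sky = np.zeros((n, n))
sky[40, 32] = 1.0                 # 1 Jy point source, 8 pixels off axis

res = residual_image(psf, pb_true, pb_model, sky)
res_perfect = residual_image(psf, pb_true, pb_true, sky)
```

With a perfect model the residual vanishes identically. With the mismatched model the residual at the source position equals the PB error there, spread over the image by the PSF.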

Let us consider PB^True (or equivalently ${ \mathcal F }[{A}_{{ij}}^{\mathrm{True}}]$ ) to denote the PB of the antenna AIP with optimized Rhole and ftaper parameters, and PB^def to denote the PB of the antenna AIP with frequency-independent Rhole and ftaper parameters. The left panel of Figure 6 then shows the fractional error $({\mathrm{PB}}^{\mathrm{True}}-{\mathrm{PB}}^{\mathrm{def}})/{\mathrm{PB}}^{\mathrm{True}}$ when using the standard sub-optimal AIP, as against the optimized AIP, for Stokes I at 1.448 GHz for antenna 12. The optimized beam is overlaid as contours in pink. The error within the main lobe of the PB is at the level of several percent, a significant change for high-fidelity, noise-limited wide-field imaging, which typically requires dynamic ranges in excess of 10,000:1. The left panel of Figure 6 also demonstrates that errors in flux reconstruction of $\gt 5 \%$ start beyond the 0.05 gain position of the PB main lobe and continue to increase to nearly 40%–60% across the first side lobe. On the right in Figure 6 is the fractional error in polarized intensity, $({\mathrm{PB}}^{\mathrm{True}}-{\mathrm{PB}}^{\mathrm{def}})/{\mathrm{PB}}^{\mathrm{True}}$ . The error in the polarized intensity varies between 10% and 20% across the PB out to the first side lobe.

**Figure 6.** Plotted is the fractional change in the antenna PB, $({\mathrm{PB}}^{\mathrm{def}}-{\mathrm{PB}}^{\mathrm{True}})/{\mathrm{PB}}^{\mathrm{True}}$ , with magenta contours overlaid of PB^True at 80%, 50%, 10%, 5%, and 1% power at 1.448 GHz of antenna 12. The left panel is the fractional change in total intensity, while the panel on the right is the fractional change in linear polarized intensity.

While the fractional error in the PB gives us the instantaneous error in the residual image for a particular frequency, the effect on the total continuum sensitivity offered by the wide bandwidths is obtained by examining the fractional error in the wide-band PB. The instantaneous wide-band PB is defined as ${\sum }_{{\nu }_{0}}^{{\nu }_{1}}\mathrm{PB}(\nu )$ , spanning the range of frequencies from ${\nu }_{0}$ to ${\nu }_{1}$ . The wide-band PB represents the effective forward gain of broadband continuum imaging. The effective wide-band sensitivity extends far beyond the null of the narrowband PB (Bhatnagar et al. 2011). WB A-Projection uses the wide-band PB to normalize the image in the final imaging step of the flat-noise implementation of the algorithm (see Bhatnagar et al. 2013 for more details). Shown in Figure 7 is the fractional error in the wide-band PB at the reference frequency of 1.5 GHz. Overlaid in pink are the contours of the instantaneous wide-band $\sum {\mathrm{PB}}^{\mathrm{True}}$ . The fractional error in the PB means that the error in gain of the PB-corrected image is $\sim 5 \%$ at the 0.1 gain of the PB and increases to $\sim 20 \%$ at 0.01 PB gain (this includes the first side lobe). Since every pixel in the wide-band PB image is the sum of the pixel values at all the frequencies, the fractional error beyond 0.1 PB gain is dominated by the lower frequencies (larger beam size), while the error within the 0.1 PB gain is dominated by the higher frequencies.

**Figure 7.** Plotted is the fractional change in the antenna wide-band PB, $({\mathrm{PB}}^{\mathrm{True}}-{\mathrm{PB}}^{\mathrm{def}})/{\mathrm{PB}}^{\mathrm{True}}$ across 1 GHz of bandwidth, with magenta contours overlaid of PB^True at 80%, 50%, 10%, and 1% power at the reference frequency, 1.5 GHz of antenna 12.
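
The definition of the wide-band PB as a sum over channels, and the statement that its sensitivity extends past the narrowband null, can be illustrated with a one-dimensional toy model. The sketch assumes a uniformly illuminated 1D aperture of width 25 m, whose power pattern is a squared sinc; this is not the VLA beam. The "true" pattern simply uses a hypothetical 24 m effective width to produce a non-zero fractional error.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s


def pb_1d(theta_rad, freq_hz, width_m=25.0):
    """Power pattern of a uniformly illuminated 1D aperture (toy model)."""
    # np.sinc(x) = sin(pi x) / (pi x); the first null is at sin(theta) = lambda / D.
    return np.sinc(width_m * np.sin(theta_rad) * freq_hz / C) ** 2


def wideband_pb(theta_rad, freqs_hz, width_m=25.0):
    """Sum of the per-channel PBs between nu_0 and nu_1."""
    return sum(pb_1d(theta_rad, f, width_m) for f in freqs_hz)


freqs = np.linspace(1.0e9, 2.0e9, 65)            # 1 GHz of bandwidth
theta = np.radians(np.linspace(0.0, 1.5, 601))   # angle from the pointing center

# First null of the narrowband PB at the 1.5 GHz reference frequency.
theta_null = np.arcsin(C / (1.5e9 * 25.0))
narrow_at_null = pb_1d(theta_null, 1.5e9)
wide_at_null = wideband_pb(theta_null, freqs) / len(freqs)

# Fractional error between a hypothetical "true" and a "default" wide-band PB.
wb_def = wideband_pb(theta, freqs, width_m=25.0)
wb_true = wideband_pb(theta, freqs, width_m=24.0)  # hypothetical effective width
frac_err = (wb_true - wb_def) / wb_true
```

At the narrowband null the normalized wide-band gain is still a few percent in this toy model, because the nulls of the individual channels fall at different angles.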

4.5. Imaging Simulations

We used point-source simulations to contrast the parameterized AIP model with the frequency-independent model for full Mueller imaging. Eight unpolarized point sources (I = 1 Jy; Q, U, V = 0) were placed across the main lobe and first side lobe of the antenna PB. The data were simulated for a total integration time of 15 minutes, with a bandwidth of 64 MHz centered at 1.4 GHz, to produce a full Mueller predicted measurement set (MS; refer to Figure 4 of Jagannathan et al. 2017 for a schematic) for the VLA in C-configuration. The median values of the apparent central blockage (refer to Section 4.1) and feed illumination taper (refer to Section 4.2) of antennas 6, 10, and 12 were used as inputs to CASSBEAM. The resulting Jones matrix was used as an input in our simulations.

The MS was then imaged with full Mueller A-Projection with the CFs produced with (a) frequency-independent (default) parameters for the feed illumination taper and central blockage, and (b) the updated (frequency-dependent) parameters. We refer to the PB derived from the default parameters as PB^def and to the PB derived from the updated parameters as PB^True. The reconstructed fluxes as a function of the PB gain are shown in Figure 8. The blue curve (using the optimized parameters, PB^True) shows that we are able to reconstruct the flux in total intensity accurately when utilizing an accurate frequency-dependent AIP in the full Mueller A-Projection algorithm. The green curve (standard parameters, PB^def) shows that when using a frequency-independent AIP pattern we begin to incur errors that increase from ∼2% at the 0.3 gain in the PB to ∼9% at the 0.01 gain within the main lobe. In addition to the six sources in the main lobe, two more sources were placed in the side lobes, where the standard parameters overestimate the flux by ∼25%, as we divide by PB^def, which underestimates the power in the side lobes.

**Figure 8.** Plotted in the figure is the PB-corrected point-source flux. Plotted in blue are the full Mueller-imaged and PB^True-corrected point-source fluxes for the parameterized frequency-dependent model. Plotted in green is the full Mueller-imaged and PB^def-corrected point-source fluxes for the frequency-independent model.
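
The mechanism behind the flux errors in Figure 8 can be sketched with a scalar toy calculation: the apparent flux of a source is its true flux attenuated by the true PB, and the PB correction divides by whichever model PB is assumed. The Gaussian beams, the 2% width mismatch, and the function names below are all hypothetical; the sketch does not reproduce the simulated values quoted above.

```python
import numpy as np


def gaussian_pb(theta_arcmin, fwhm_arcmin):
    """Toy Gaussian primary beam gain at angle theta from the pointing center."""
    return np.exp(-4.0 * np.log(2.0) * (theta_arcmin / fwhm_arcmin) ** 2)


def pb_corrected_flux(true_flux_jy, theta_arcmin, fwhm_true, fwhm_model):
    """Flux recovered after dividing the apparent flux by the model PB."""
    apparent = true_flux_jy * gaussian_pb(theta_arcmin, fwhm_true)
    return apparent / gaussian_pb(theta_arcmin, fwhm_model)


theta = np.array([0.0, 10.0, 20.0, 30.0])  # source offsets, arcmin

# Correct model: the 1 Jy input flux is recovered at every offset.
flux_true_model = pb_corrected_flux(1.0, theta, 30.0, 30.0)
# Mismatched model (hypothetical 2% wider beam): the error grows off axis.
flux_def_model = pb_corrected_flux(1.0, theta, 30.0, 30.6)
flux_error = np.abs(flux_def_model - 1.0)
```

The sign of the error depends on whether the model PB over- or underestimates the true gain at the source position; in this toy case the model gain is too high, so the corrected flux is too low.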

Figure 9 shows the difference image, ${I}^{\mathrm{True}}-{I}^{\mathrm{def}}$ , with the point-source locations indicated with white circles. The color scale is chosen to highlight the deconvolution errors introduced. These errors are more prominent beyond the 5% PB gain mark within the main lobe. The deconvolution errors represent a degradation in imaging fidelity, even though the effects are markedly visible only when the imaging dynamic range is in excess of 10,000:1.

Note that the A-Projection framework used for imaging naturally includes antenna-to-antenna variations, in particular, to account for the AIP of heterogeneous arrays, such as ALMA. In this paper, we have modeled the dominant static term of the antenna AIP in terms of the feed illumination taper and the apparent central blockage parameters. While the pointing offset we solve for is used to derive a better fit to the antenna AIP, we are aware that it is a time-varying quantity that cannot be described as a part of the A-Solver approach in this paper. Time-dependent pointing effects, however, can be solved for by means of the Pointing Selfcal approach (Bhatnagar & Cornwell 2004; Bhatnagar & Cornwell 2017). Time-dependent shape changes that affect the antenna AIP derived from the A-Solver methodology would only affect the highest dynamic range imaging studies ( $\geqslant {10}^{6}$ ) for homogeneous arrays. In such a case, a coupled shape and pointing self-calibration approach would be required.

A few computational points of note with respect to the full Mueller A-Projection (FM-AWP) framework are worth mentioning at present. A more detailed presentation of the algorithm and its performance on observations will be presented in a forthcoming paper. The CF production in FM-AWP, as implemented in CASA, is on a per-spectral window basis (typically 16–32 spectral windows across a VLA observing band). The CFs are produced once at the start of the imaging and cached. In a typical A-Projection imaging cycle, 80% of the time is spent in the gridding of the data. The CF production is a significantly smaller fraction, typically 10% of the total imaging time, and is a one-time cost as the CFs are cached. Within the new imager framework, CF production and gridding are parallelized⁹ by means of MPI. This parallel framework has made the FM-AWP algorithm computationally feasible.

4.6. Off-diagonal Antenna Jones

In Paper I (Jagannathan et al. 2017), we demonstrated the effects of beam squash on polarimetric imaging. Squash is caused by a second-order phase term (Heiles et al. 2001) and, in conjunction with other second-order phase terms like defocus and coma, affects polarimetric imaging adversely. Reconstructing the polarized emission from the sky requires the use of all the terms of the antenna Jones matrix. In the prior sections, we have dealt with the frequency dependence of the antenna AIP primarily in the context of the diagonal elements of the antenna Jones matrix. Modeling the off-diagonal Jones elements requires the inclusion of higher order distortions, which was done by including a general second-order polynomial in phase in addition to the Rhole and ftaper parameters.
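
The role of a second-order phase polynomial in the aperture can be sketched with a scalar toy aperture. This is not the CASSBEAM ray-traced model, and it does not produce the polarization leakage terms. The example only shows that adding a quadratic phase to a blocked, tapered aperture turns a real-valued voltage pattern into a complex one and lowers the on-axis gain. The quadratic amplitude taper, the grid sizes, and the phase coefficients are hypothetical.

```python
import numpy as np


def model_aperture(n, r_dish_pix, r_hole_pix, edge_taper, phase_coeffs):
    """Toy aperture: annulus with amplitude taper and second-order phase.

    phase_coeffs = (cxx, cxy, cyy) multiply x^2, x*y, y^2 (radians), with
    x and y in units of the dish radius. All values are illustrative.
    """
    y, x = (np.indices((n, n)) - n // 2) / r_dish_pix
    r = np.hypot(x, y)
    amp = 1.0 - (1.0 - edge_taper) * r**2          # hypothetical quadratic taper
    mask = (r <= 1.0) & (r >= r_hole_pix / r_dish_pix)
    cxx, cxy, cyy = phase_coeffs
    phase = cxx * x**2 + cxy * x * y + cyy * y**2
    return np.where(mask, amp * np.exp(1j * phase), 0.0)


def voltage_pattern(aperture):
    """E = F^-1[A], with both arrays centered on pixel (n//2, n//2)."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(aperture)))


n, c = 128, 64
vp_flat = voltage_pattern(model_aperture(n, 20.0, 3.0, 0.3, (0.0, 0.0, 0.0)))
# An x^2 - y^2 term is an astigmatism-like second-order phase distortion.
vp_phase = voltage_pattern(model_aperture(n, 20.0, 3.0, 0.3, (0.8, 0.0, -0.8)))

peak_flat = np.abs(vp_flat[c, c])
peak_phase = np.abs(vp_phase[c, c])
```

In a fit of the kind described here, the three phase coefficients would be additional free parameters alongside the blockage radius and the taper coefficients.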

In Figure 10, the panels on the left represent the real (upper left) and imaginary (lower left) parts of the off-diagonal antenna Jones matrix element ${E}^{R\leftarrow L}$ of the model. The panels on the right show the real (upper right) and imaginary (lower right) parts of the same element, ${E}^{R\leftarrow L}$ , of the measured holographic map. The inclusion of a second-order phase term in the antenna alters the side-lobe flux but does not alter the general morphology of the clover-leaf pattern, whereas the morphology is altered in the measured maps in the panels on the right. The altered morphology of the lobes mimics a rotation of the VP. We therefore introduced rotation of the antenna VP as an additional free parameter in the minimization, which leads to the more realistic model VP shown in the center panels. A rotation of $\approx 18^\circ$ gave the least residuals with respect to the holographic data. A similar rotation is quite clearly seen in the polarization squint vector as well, for all antennas and bands in the holographic measurements. The physical origin of this rotation is not yet understood.

**Figure 10.** Off-diagonal antenna Jones matrix element $R\leftarrow L$ of ${E}_{i}^{M}={{ \mathcal F }}^{-1}[{{\mathsf{A}}}_{i}^{M}]$ at 1.013 GHz for antenna 27, with the upper panels showing the real part of ${E}_{i}^{M}$ and the lower panels showing the imaginary part. In the left-most panels (upper and lower), ${E}_{i}^{M}$ includes an optimized second-order polynomial in phase. In the center panels, ${E}_{i}^{M}$ includes an $\approx 18^\circ$ rotation in addition to the second-order polynomial in phase. In the two panels on the right, the real and imaginary parts of the measured holographic ${E}_{i}^{M}$ are shown.

5. Conclusions

The imaging performance of A-Projection is determined by our knowledge of the AIP. High dynamic range and high-fidelity polarimetric imaging across wide fields requires an extremely accurate understanding of the antenna AIP across the full bandwidth. The A-Solver approach, which solves for the frequency-dependent AIP of antennas based on a parameterized model whose values are determined by comparison to holographic data, is a viable approach to obtaining an accurate VP, as demonstrated in this paper. The parameterized model captures the rapid frequency dependence of the AIP, including the effects of standing waves. Modeling the central blockage as an apparent blockage in the model allowed for the accurate reconstruction of the amplitude of the VP side lobe as a function of frequency. The parameterized model of the AIP is a naturally compact representation, requiring fewer parameters to capture higher order frequency-dependent effects than frequency-dependent modeling of the antenna VP.

An important point to note is that the product of the two AIPs making the PB for each baseline is, in general, a complex valued function, and not a purely real function as is assumed when imaging without using the A-Projection algorithm (the effective PB with A-Projection is $\sqrt{{\mathrm{PB}}^{M}\cdot {\mathrm{PB}}^{^\circ \dagger }}$ , which is real-valued to the extent that the model PB^M accurately models the real ${\mathrm{PB}}^\circ$ ). This could be due to differences between the two AIPs involved and/or non-Hermitian structure of the AIPs due to various EM or antenna structural effects. The PB pattern is already quite complex, and as discussed in Section 3.1, directly modeling even the real-valued PB is difficult, approximate, and needs many more free parameters. In addition to this, modeling of the complex valued PB also has all the additional numerical complications involved in directly fitting to any complex valued data. In contrast, the physical modeling approach described in this paper models the PB in the aperture plane. This not only requires a significantly smaller number of parameters, but the parameters themselves are also real-valued, describing the physics of the optics (here, via the antenna structural parameters). The fitting procedure, therefore, deals with real-valued parameters. This has significant numerical advantages, as well as computational advantages in the production of parameterized CFs for a given frequency.

This work was done using the R&D branch of the CASA code base. We thank R. Perley for carrying out the holography and O. Smirnov for carrying out various illuminating numerical experiments with the data. We thank James Robnett and Erik Bryer for their extensive assistance in deployment of the minimization runs on AWS. Support for this work was provided by the NSF through the Grote Reber Fellowship Program administered by Associated Universities, Inc./National Radio Astronomy Observatory. The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.

Software: CASA (McMullin et al. 2007).
