VieRDS: A Software to Simulate Raw Telescope Data for very Long Baseline Interferometry

J Gruber; A Nothnagel; J Böhm

doi:10.1088/1538-3873/abeca4

1. Introduction

The technique known as Very Long Baseline Interferometry (VLBI) utilizes several independent and usually spatially far distant radio telescopes to record and accurately time-tag the filtered, down-converted, sampled and quantized electric field strength measurements of one common compact extra-galactic radio source. The digitzed electric field strength measurements of each telescope can then be compared with the other telescopes to determine the difference in signal arrival time for each baseline. The precise measurement of the difference in signal arrival time represents the fundamental observable for VLBI and is used for the estimation of accurate radio telescope coordinates on Earth and extra-galactic radio source positions (Sovers et al. 1998 and references therein). These coordinates can then be used to produce terrestrial and celestial reference frames, for example, the International Terrestrial Reference Frame (ITRF, Altamimi et al. 2016) and the International Celestial Reference Frame (ICRF, Charlot et al. 2020). Furthermore, the variable Earth rotation and orientation is monitored with highest precision, and many other parameters of the Earth system can be derived (Schuh & Behrend 2012). The current aim of VLBI as specified in the concept design of the VLBI Global Observing System (VGOS, formerly known as VLBI2010) is to provide 1 mm accuracy in station position from 24 hr observations and 0.1 mm yr⁻¹ in station velocity on global scales (Petrachenko et al. 2009).

As the first step of the whole geodetic VLBI process, each telescope independently tracks a common source for a predefined integration time of a few to several tens of seconds. The main receiver components are used for amplifying, down-conversion, quantization, and sampling (time tagging) of the incoming electromagnetic radiation. The reception, digitization, and time tagging within a single radio telescope can be referred to as the observation process. The data is then recorded on some media in a suitable digital format such as the VLBI Data Interchange Format (VDIF, Whitney et al. 2010).

Since the incoming radiation is downconverted from the radio frequency band (also referred to as sky frequency) to the base of the frequency spectrum, the term baseband data is also used to describe the raw telescope data. For technical and other VLBI-related reasons, the received wide band spectrum is currently processed in a series of electronically formed channels of up to 32 MHz width. They are distributed in bands with large spanned bandwidth (up to 1 GHz) for applying the bandwidth synthesis technique (Rogers 1970).

To allow for calibration of the individual data paths with their dispersive properties at a later processing step, an artificial signal in the form of tones of 1, 5, or 10 MHz repetition rate is injected into the receiving chain. The high temporal stability of this signal is essential to achieve the accuracy goals of any VLBI system. The final output of each telescope can be considered to be the digital representation of the incoming radiation of a radio source with the presence of overall system noise and the phase calibration signal. In our terminology, we call it raw telescope data, or shortly refer to it as "raw data."

The baseline-source-geometry at the time of the observation causes a time delay of the signal arriving at the two telescopes forming the baseline. Inverting the problem, a set of different delay observations are used to deduce the geometry. To find the delays, a correlation process needs to be employed which assumes an auto-correlation of the same signal received from the radio source observed simultaneously. In this process, the recorded electromagnetic field strength of each telescope for matching observations is compared to each other by using efficient correlation algorithms (Nothnagel 2019). This is mostly implemented on High-Performance Clusters (HPC), such as the Distributed FX (DiFX) style correlator (Deller et al. 2011), the K5 correlator at Kashima Space Research Center (Kondo et al. 2004), or the SFXC software correlator (Keimpema et al. 2015) developed at the Joint Institute for VLBI in Europe (JIVE).

Due to the transition from the legacy S/X system which is observing at two frequency bands, to a broadband VGOS system observing at four frequency bands, new raw data processing algorithms are implemented to adjust to the new system (Niell et al. 2018). Most notably the correlation and fringe-fitting analysis stages are affected by this transition due to dual-linearly polarized instead of circularly polarized feeds, and broadband observations, for example. Both analysis stages have in common that they deal with the phases and magnitudes of the raw telescope data. For correlation, the phases and magnitudes of the raw data are processed, whereas within fringe-fitting the phases and magnitudes provided by the interferometer are processed. In contrast, geodetic parameter estimation uses the group delay obtained in fringe-fitting by quasi applying linear regression to the interferometric phases and magnitudes in a complex sense. Hence, any nonlinear effects of the interferometric phases and magnitudes that are not modeled accurately within correlation and fringe-fitting, degrade the accuracy of geodetic parameter estimation.

The VGOS system is characterized by an increased effective bandwidth for each observation (Petrachenko et al. 2009). On the one hand, increased effective bandwidth increases the precision of the observation (Rogers 1970). On the other hand, new systematic influences within the interferometric phase over the broadband spectrum must be taken into account, such as the effect of source structure (Anderson & Xu 2018) or the nonlinear group delay contribution due to the ionosphere (Cappallo et al. 2014). Besides the new broadband VGOS observing system, also VLBI observations to artificial satellites which emit a deterministic signal instead of a noise-like signal have been observed and analyzed successfully (e.g., Hellerschmied et al. 2018). Such developments expand the scientific sphere of activity for VLBI further, but also require much more attention in the correlation and fringe-fitting analysis stages.

Considering the ongoing developments within the VLBI technique with an increased demand to improve correlation and fringe-fitting algorithms we think that the simulation of raw data can be a convenient tool to support these tasks. The steady improvement in the performance of modern CPU and increasing storage capabilities of computer memory motivate the development of a software simulation tool furthermore. This is because raw VLBI data is characterized by very high sampling rates and large data volumes. Also, VLBI observations are considered to be very expensive in terms of hardware and manpower due to the employment of radio telescopes with their need for active operations. A tool to simulate the VLBI observation process and then perform correlation and fringe-fitting of these data can be used to study the relation of adverse effects introduced in a controlled environment and the resulting geodetic observables. This also allows for backward engineering where the origin of systematics detected in the output can be investigated by modifying the input raw data on purpose.

Raw data simulation software has already been successfully used for various tasks and has been proven to be very practical. For example, the program called "enoise" (Nishioka 2015) developed at the VLBI group of Academia Sinica Institute of Astronomy and Astrophysics (ASIAA) was used to test the so-called zoom bands for the DiFX correlator. The simulator "Datasim" (Meyer-Zhao 2018) was used to test higher sampling rates for the Atacama Large millimeter/submillimeter Array (ALMA). Both software packages are enhanced versions of a program called "anoise" developed by Geoff Crew of MIT Haystack Observatory.

In this article, we present a new software realization implemented in MATLAB^© ¹ that is capable of simulating the observation process and providing simulated raw telescope data for VLBI in a data format readable by a correlator. The main focus of this article is to present digital signal processing algorithms to enable simulations of systematic effects in phase and magnitude of the frequency spectrum of the raw data signal. For the first time, the simulation of a dispersive group delay for a VGOS broadband frequency setup, and the simulation of the characteristic station frequency response of two VGOS telescopes are shown demonstrating the capabilities of the presented software. Furthermore, the non-zero baseline simulation capabilities are demonstrated for a very long baseline near the equator. The new software is called VieRDS and is distributed within the umbrella of the Vienna VLBI and Satellite Software (VieVS, Böhm et al. 2018).

In the following section, we describe the key characteristics of our simulation concept. In Section 3, we present the signal model with the amplitude scaling and the signal processing algorithms. The software architecture using MATLAB^© is shown in Section 4. In Section 5, our processing chain for real and simulated raw data is presented and three examples are shown for demonstration purposes. The article closes with conclusions and an outlook.

2. VieRDS Simulation Concept

Due to the nature of simulations, the simulated raw data provided by VieRDS can only be an approximation, although a very close one, of the real observation process. Before we enter the digital signal creation description, we describe the key characteristics of our simulation concept by focusing on its approximations, assumptions, and limitations first. These aspects will be needed in subsequent chapters.

2.1. Three-component Signal Model

The first consideration targets the composition of the signal of the filtered, down-converted, sampled and quantized electric field strength measurements as it is recorded by the VLBI radio telescope. In VieRDS we model the recorded signal as a linear superposition of a source signal, a system signal, and a phase calibration signal. The simulation concept allows for separate treatment of each of these signal components. This means, that the signal processing algorithms presented in Section 3 such as the application of a delay, delay rate, and filter operations can be applied to the source, the system, and the phase calibration signal, separately.

Other signal components such as disturbing signals due to radio-frequency interference (RFI) or further radio sources are not included in the signal model of the current version of VieRDS and will be addressed at subsequent development stages.

2.2. Simulation Per Channel

An important part of VieRDS is related to the simulation of individual channels for bandwidth synthesis. Real radio sources emit electromagnetic radiation continuously over a broad frequency spectrum. The receivers of radio telescopes are designed to accept a limited part of that, either GHz-wide parts of the spectrum as in legacy S/X band systems or as wideband radiation between 2 and 14 GHz in the new VGOS systems. After passing amplifiers, mixers for heterodyning to lower intermediate frequencies, and filters, the spectrum is divided in several channels of reduced bandwidth. Within those channels, the radio telescopes virtually intersect with the radio spectrum of the radio source and provide a sampling of the continuously emitted radio spectrum. Since each channel corresponds to the observation at different radio frequencies, the signal of each channel carries unique information that is independent of the other channels. This circumstance is applied in the simulation concept of VieRDS and each channel is processed independently. When two or more telescopes observe at the same frequency channel in the simulation scenario, the source signal is not independent and correlates between the telescopes for the common channel. The signals of different frequency channels are uncorrelated.

2.3. Digital Signal Creation in the Time-domain at Baseband

Besides the concepts of the superposition of the signal components and their separate treatment, as well as the individual simulation approach per channel, VieRDS creates the signals in the time-domain initially at baseband. The simulation of the down-conversion from the sky frequency to baseband is not simulated because it is considered to be a linear shift of the channel frequencies from sky to baseband frequencies. Due to the assumption of linearity, it is possible to conveniently simulate effects at baseband which are effectively occurring at sky frequencies. For example, the derivation of the group delay from a linear slope of phase ϕ over frequency ν

$\begin{eqnarray}&&\tau =\displaystyle \frac{d\phi }{d\nu }\end{eqnarray} \tag{ 1 }$

is considered to be the same for the sky and baseband frequencies. To account for the remaining phase offset due to the frequency conversion, VieRDS features a tool to shift the phases of each channel by a constant value. The relationship of (1) is the same for any sky to baseband frequency conversion, e.g., for legacy S/X systems as well as for VGOS systems, and for any bandwidth considered.

The concept of the phase shift operation implemented in VieRDS is illustrated in Figure 1 with an example of a VGOS wideband system. In this example, the phases are approximated for a group delay of 300 ps leading to the respective slope of phase over frequency, as expressed mathematically in (1). In the broad spectrum, the phase theoretically cycles repeatedly. The bottom part then depicts four bands of 500 MHz each which are filtered and heteorodyned from the sky frequency band. The orange-colored phase behavior of the baseband at the bottom then just represents the filtered parts of the sky spectrum which are also shown in orange at the top part. Here, the phase offsets are still the same in the sky and basebands. Since VieRDS conceptually generates the signals at baseband, the phase offset of the corresponding channel sky frequency must be taken into account. Within VieRDS the phase value at the lower edge of the channel sky frequencies is calculated and handed over to a Digital Signal Processing (DSP) operation (described in detail in Section 3.5), which performs a phase shift of the corresponding baseband signal.

In a real VLBI radio telescope, the signal arrives in its analog form before the actual sampling and digitization are carried out. This operation requires a local oscillator whose frequency is referenced to an atomic clock providing high-precision frequency standards (Thompson et al. 2017). VieRDS creates the signals in the time-domain at baseband and initially in a digital representation without any usage of a hardware device that generates an analog signal. The stochastic and systematic effects introduced by the uncertainties in the atomic clock's spectral purity and long-term stability are not yet simulated in the current version of VieRDS.

As described in Section 2.2, the receiving capabilities of a VLBI radio telescope virtually intersect with the broad frequency spectrum of the continuously emitted electromagnetic radiation only at band-limited channels. The bandwidth of the channels can be controlled by the sampler's sampling frequency. By applying the Nyquist–Shannon sampling theorem (Shannon 1949), which permits a discrete sequence of samples to capture all the information from a band-limited continuous-time signal, it is possible to create the signal for each channel received by a VLBI radio telescope initially in a digital representation in the form of discrete samples. Since all the analog signal information can be captured by the discrete signal, as long as the Nyquist–Shannon sampling rate criterion is satisfied, it is possible to conveniently simulate effects in the digital representation effectively occurring in the analog signal representation. The Nyquist–Shannon theorem serves as a fundamental bridge between analog-time and discrete-time signals in general, and is heavily used within the simulation concept of VieRDS. The digital signal creation by VieRDS is described in more detail in Section 3.1.

In a VLBI experiment, the number of bits used for signal quantization is defined before the actual VLBI observation occurs. Usually, one or two-bit quantization of the electric field strength measurements is set in the so-called scheduling process. The VLBI samplers use this information and quantize the voltage amplitudes according to the VLBI schedule. The quantization process can then be interpreted as a mapping from a continuous voltage amplitude to a discrete voltage amplitude value. With such an analog to digital conversion of the amplitudes in addition to the sampling of the time-axis as described before, the analog to digital conversion can be described completely. The time-axis sampling process is not simulated because the Nyquist–Shannon concept is used as described before. In contrast, VieRDS simulates the quantization by first generating the amplitude values with 64 bit precision and then mapping the 64 bit representation to the number of bits defined in the schedule, e.g., one or two-bit. In general, this approach serves as a method for simulation of any amplitude quantization and is mathematically described in Section 3.7. It should be emphasized that with this functionality, VieRDS is already prepared for any other level of quantization, e.g., for spacecraft tracking signals.

2.4. Frequency Response and Equivalent Parameter Combination

VieRDS is capable of applying an arbitrary magnitude filter to the simulated signal. This filter can be used to approximate an ideal bandpass filter and allows to model the frequency response of the telescope and its electronics. VieRDS currently supports the application of one single magnitude filter which can be applied to any signal component in any channel.

A radio telescope consists of several interconnected electronic devices. Each device applies a certain filter on the received electromagnetic radiation. Within VieRDS we follow the approach that all filters that occur in the receiver chain can be approximated and modeled with one single filter.

The approach of parameter combination for the frequency response is also used for the amplitude scale of the system noise. In the concept of VieRDS, the system noise is not only related to the signal contributed by the receiver itself. The system signal can be used to model the receiver noise but can also be utilized for other noise contributions that occur within the field of view of the telescope. The additional noise contribution is modeled by the amplitude scaling factor of the system noise signal. For example, the noise contribution due to the cosmic microwave background radiation and/or the temperature-dependent noise contribution of the troposphere can be combined into one single amplitude scale factor of the system noise. Again, each channel can be modeled differently which enables the simulation of frequency-dependent noise contributions. This changes the signal-to-noise ratio (S/N) per channel and, e.g., allows simulation studies to quantify the effect of amplitude calibration in the fringe-fitting process.

2.5. Circular Polarization and Upper Sideband Response

Another key characteristic of the VLBI observation process concerns the polarization of the feed horns. The current version of VieRDS is based on the simulation of circularly polarized receivers. The simulation of linearly polarized feed horns has not yet been implemented. This fact is considered in the calculation of the amplitude of the received source signal, and is described in more detail in Section 3.2.

Real radio telescopes are capable of providing the upper and lower sideband of any channel at sky frequency. Currently, only the upper sideband is simulated in VieRDS. Modeling of the lower sideband, which has a reversed frequency spectrum, is not available yet and, thus, the double sideband response of the receiving system is not featured yet either.

2.6. Delay Reference Location

Since VieRDS deals with the simulation of single-telescope data and does not operate on products of baseline data, it is convenient to select a common reference point for all telescopes to calculate the delay. For this purpose, the geocenter is selected. Hence, the delay parameters for each telescope are calculated with respect to the geocenter.

2.7. Single Scan Simulation with a Focus on Correlation and Fringe-fitting

The simulation of VLBI observables for geodetic parameter estimation has been used several times in the past, e.g., to investigate the influence of the tropospheric turbulence by using of Monte Carlo simulations for the VGOS era (Pany et al. 2010). Furthermore, simulations of VLBI observables have been proven to be very useful to evaluate the performance of VLBI schedules before the actual observation. Thus, Monte Carlo simulations are also part of existing state-of-the-art VLBI scheduling packages (e.g., Schartner et al. 2019). The main characteristic of those Monte Carlo simulations is the vast number of observations which are simulated. In contrast, the simulation of raw data as presented in this article is designed to simulate single scans with the main focus on systematic effects that are visible in the interferometric phases and magnitudes used in the correlation and fringe-fitting. However, due to the simulation of overall system noise, stochastic error sources that affect the delay measurement precision can be simulated in VieRDS as well. Hence, VieRDS is designed to simulate stochastic as well as systematic influences to investigate the precision and accuracy of single scan VLBI observations.

3. Digital Signal Creation and Processing Algorithms of VieRDS

In this section, the digital signal generation of each signal component is described. The signals in a digital format are directly generated at baseband in the time-domain following the Nyquist–Shannon theorem (Shannon 1949) to reconstruct a virtual analog signal that serves as input for the VLBI samplers.

Furthermore, the digital signal processing algorithms to apply a delay, delay rate, constant phase offset, arbitrary magnitude filter, and quantization, are presented. These can be applied to each signal component independently if needed. In the first step, each algorithm is explained as a stand-alone mathematical operation for each of the three individual signal components. The section closes with the integration of all individual algorithms into one signal model referred to as the filtered three-component signal model.

3.1. Signal Components and Digital Representation

The VLBI raw data signal, which is recorded by each station, is considered to be a linear superposition of three individual signals which in turn are implemented in VieRDS. The main component, of course, represents the radiation of the radio source, referred to as the source signal (x_src). This component is common for all telescopes that are part of the simulation. In contrast, the second and third components are telescope-specific signals and are unique for each telescope. One of the telescope-specific signals reflects the independent radiation within the field of view of the telescope (e.g., the atmospheric noise contribution) and additionally, the thermal radiation emitted by the receiver itself. This component is called the system signal (x_sys). Since the sum of Gaussian noise components is again Gaussian noise with a variance which is the sum of the individual components' variances, x_sys can be used to simulate an arbitrary number of noise components by simply adjusting the variance of one single noise component x_sys.

The third signal component is dedicated to the instrument's phase calibration signal that is usually part of each VLBI telescope (x_pcal). The linear superposition of all three signal components is implemented as a vector addition operation:

$\begin{eqnarray}&&x[n]={x}_{\mathrm{src}}[n]+{x}_{\mathrm{sys}}[n]+{x}_{\mathrm{pcal}}[n],\end{eqnarray} \tag{ 2 }$

where x[n] represents the signal stage before quantization and formatting. The time index n is defined through the sampling interval T_x of the corresponding fictitious analog signal x(t) by x[n] = x(nT_x). It should be noted here, that the choice of the time resolution of all signal components leads to an implicit digitization which is equivalent to the system's sampling frequency f_s (16, 32, or 64 MHz are common values for VLBI samplers). Thus, the analog to digital conversion with respect to the time axis, the actual sampling, is not part of the raw data simulation, and the analog representation x(t) is never touched in the simulation. Since f_s, which is used for the digital signal creation of each channel, matches the values of real values of VLBI samplers, the Nyquist–Shannon sampling rate criterion (Shannon 1949) is satisfied for all individual channels, and the digital signal representation captures all the information of the analog signal representation for each channel.

In contrast, the amplitude values of the signal components are initially represented by floating-point numbers and stored with a precision of 64 bit. MATLAB^© constructs the double data type according to IEEE® Standard 754 for double precision.² Usually, the sampling systems of the VLBI telescopes operate with a depth of one or two bit. An amplitude quantization within VieRDS is implemented to convert the 64 bit representations to the desired bit depth of the VLBI receiving system (see Section 3.7 for more details on the quantization algorithm within VieRDS).

The length of the signals can be adjusted by setting the observation duration T_obs, yielding the same number of samples N for each signal component by N = T_obs · f_s. Thus, the time indices n range from values 0 to N − 1 for each component. The number of samples, which is the product of observation duration and sampling frequency, is proportional to the requirements of the computer memory and should not exceed certain limits. For example, to achieve a slightly higher sampling frequency as it is the case for VGOS, the observation time can be decreased accordingly. However, the parameterization is not limited by the software itself and memory-intensive simulation tasks can be outsourced to high-performance clusters.

3.2. Signal Classes and Amplitude Scale

Two signal classes are currently supported and can be assigned to the three signal components described in the previous Section 3.1. The non-deterministic source and system signal components are represented by normally distributed random numbers with a mean value of zero, where the variance ${\sigma }_{j}^{2}$ of the normal distribution can be expressed by the power P_j of a finite discrete signal x[n], as

$\begin{eqnarray}\begin{array}{rcl}{P}_{j} & = & \displaystyle \frac{1}{N}\displaystyle \sum _{n=0}^{N-1}| {x}_{j}[n]{| }^{2}={\sigma }_{j}^{2},\\ j & = & \mathrm{src},\mathrm{sys}.\end{array}\end{eqnarray} \tag{ 3 }$

More details on the calculation of the signal power can be found, e.g., in Winser & Cranos (2017) for example. The "power-variance" relation presented in (3) is used to scale the amplitudes of the series of random numbers by the square root of the power in Watts. Furthermore, the conversion from the effective temperature T_j to power is implemented as

$\begin{eqnarray}&&{P}_{j}=2k\,{T}_{j}\,{f}_{\mathrm{bw}}\end{eqnarray} \tag{ 4 }$

where k is Boltzmann's constant and f_bw the bandwidth of the signal (Thompson et al. 2017). Thus, it is possible to determine the variance of the random numbers based on the effective temperature. With (4) the power calculation of circularly polarized feeds is presented. The power calculation of linear polarized feeds must be treated differently and is not yet implemented in the current version of VieRDS.

The effective temperature for the system signal is referred to as system temperature which can be set to an appropriate value by the user. Also, the conversion from equivalent flux density to power is implemented as well to enable the determination of the variance of the source and system signal based on the source flux and System Equivalent Flux Density (SEFD) in Jansky. The corresponding conversion formula is implemented as,

$\begin{eqnarray}&&{P}_{j}={S}_{j}\,{A}_{\mathrm{eff}}\,{f}_{\mathrm{bw}},\end{eqnarray} \tag{ 5 }$

where S_j is the flux density in Jansky, and A_eff the effective area of the antenna in m². This function calculates the power which is collected by an antenna with an effective area under a certain bandwidth load from a source with a certain strength (Thompson et al. 2017).

A second signal class is introduced to simulate the phase calibration signal of the radio telescope and is implemented with the approach of Nosov (2019) by

$\begin{eqnarray}\begin{array}{rcl}{x}_{\mathrm{pcal}}[n] & = & \displaystyle \sum _{k=0}^{{N}_{\mathrm{pcal}}-1}{A}_{\mathrm{pcal}}\,\cos (2\pi \,{{kf}}_{\mathrm{pcal}}\,t\\ & & +{\phi }_{\mathrm{pcal},{\rm{k}}}),\,t=n\,{T}_{x}.\end{array}\end{eqnarray} \tag{ 6 }$

Implemented as such, the phase calibration signal represents a sum of individual tones k with amplitude A_pcal, frequency spacing f_pcal, and phase offset ϕ_pcal,k. In total, there are N_pcal tones per channel bandwidth available. The values for all those parameters can be directly set by the user, except the tone amplitude is calculated via a relative power p_cal,rel defined by the user with respect to the sum of the source and system power by

$\begin{eqnarray}\begin{array}{rcl}{A}_{\mathrm{pcal}} & = & \sqrt{\displaystyle \frac{2{P}_{\mathrm{pcal}}}{{N}_{\mathrm{pcal}}}},\\ {P}_{\mathrm{pcal}} & = & {p}_{\mathrm{cal},\mathrm{rel}}({P}_{\mathrm{src}}+{P}_{\mathrm{sys}}).\end{array}\end{eqnarray} \tag{ 7 }$

In legacy S/X systems, the specifications call for approximately 1 % of relative power (p_cal,rel) (See, for example, Nosov 2019 for the "Quasar" VLBI network) with an equivalent requirement for VGOS (G. Rajagopalan 2021, personal communication).

The superposition of the signal components with different signal classes yields a signal that consists of a deterministic part and a non-deterministic part. The level of the contribution of the deterministic part originating from the phasecal signal to the noise-like source and system signal is critical for the correlation sensitivity and plays an important role in the design of phase calibration systems of VLBI radio telescopes. VieRDS enables the simulation of such a calibration signal together with the noise-like source and system signals.

3.3. Delay Application

The application of a signal delay is of particular interest to the simulation concept for allowing a later end-to-end validation and for experimental setups with subsequent correlation and fringe fitting. For this reason, a signal delay algorithm is implemented with a two-step approach to delay the discrete signal along its time axis.

3.3.1. Integer-sample Delay

Within the first step, the signal is delayed by an integer number of samples. This delay is called the integer-sample delay and is realized by zero-padding the signal at the start in the case of a negative delay with respect to the geocenter. The generated sample overhang at the end of the signal is truncated to preserve the original signal length. In the case of a positive delay, the signal is truncated after the integer-delay number of samples and zero-padded at the end.

Due to zero-padding and truncating either at the start or at the end of the signal, samples are lost that carry information from the source signal. For the largest possible delay of 0.021 s between two telescopes at the Earth, 2.1% of samples are lost for a 1 s scan and will not contribute to the correlation. However, by adjusting the source flux properly the S/N loss due to zero-padding and trimming can be compensated easily.

3.3.2. Fractional-sample Delay

Typically, the samples that are recorded by radio telescopes are separated in time by several tens or hundreds of nanoseconds (for example 15.625 ns for a rather high sampling frequency of 64 MHz). The delay contributions of various geodetic and astrometric parameters can be much smaller and be at a level of a few picoseconds. To enable the simulation of such effects, a precise algorithm is required, which is capable of shifting the discrete signal along the time axis by a fraction of the actual sampling interval. This delay is called fractional-sample delay.

Considering the situation from the other end, i.e., from the correlation process within the DiFX correlator, the fractional-sample delay correction is implemented as a phase slope correction across the observed bandwidth (Deller et al. 2007). Applied on the two-sided frequency response of the simulated real-valued discrete signal, this would yield a non-symmetrical frequency spectrum which has a complex-valued time-domain representation. Although it is possible to store complex signals within VDIF and use them in the DiFX correlator, it is not feasible for the simulation of real-valued raw telescope data. Hence, a fractional-sample delay algorithm based on a time-domain convolution with a windowed sinc function is implemented in VieRDS which provides a real-valued signal.

The sinc function h that is used for convolution to delay the signal by a fractional-sample is implemented as

$\begin{eqnarray}\begin{array}{rcl}h[n] & = & \displaystyle \frac{\sin (\pi (n-u))}{\pi (n-u)},\\ n & = & -{N}_{h}/2,\ldots ,0,1,2,\ldots ,{N}_{h}/2,\end{array}\end{eqnarray} \tag{ 8 }$

wheres N_h is the length of the convolution function. The fractional-sample delay u can take any value between − N_h/2 and + N_h/2. To avoid ringing artifacts at the passband edges (mathematically called Gibbs phenomenon), a Chebyshev window function provided by the MATLAB^© Digital Signal Processing Toolbox³ is used and multiplied with the sinc function h. The convolution of this function with the input signal yields the fractional-sample delayed input signal and represents a filter operation. The filter's accuracy can be tuned by setting the filter length (N) and the sidelobe attenuation (r). By default, N is set to 101 samples, and r is set to 130 dB, which yields a maximum delay error of 0.01 ps for typical VLBI sampling frequencies. However, the filter length and the sidelobe attenuation can be set by the user to increase the accuracy of the fractional-sample delay application, but with the disadvantage of increased processing time.

To evaluate the delay error's impact, five different (N, r) configurations are investigated. The configurations are (101, 70), (101, 130), (101, 210), (51, 130), and (201, 130). Changes in the filter length N impact the computation complexity of the fractional delay algorithm (8) significantly because, for each convolution step, the length of the convolution sum increases with increased filter length. The advantage of increased N is that the delay error decreases. Changes in the sidelobe attenuation (r) do not significantly affect the processing time but impact the lowpass filter bandwidth. Furthermore, increasing r decreases the delay error but to the disadvantage of decreased lowpass filter bandwidth.

The five configurations are depicted in Figure 2 to show the impact on the delay error and the change of the lowpass filter bandwidth. For each configuration, a delay error of 0.1 ps is plotted too, which serves as a reference value for the filter design. Within the development of VieRDS the aim is to provide a fractional-sample delay algorithm that can delay discrete signals with a maximum error of 0.1 ps. This error limit can be accomplished by certain (N, r) configurations such as (51, 130), (101, 210), and (101, 130), whereas the latter represents the default configuration within VieRDS because it is a convenient compromise of delay precision and filter bandwidth.

The integer-sample delay introduced by the filter operation can be calculated by (N_h − 1)/2 when using an odd number for the filter length. After convolution, the signal is corrected by the filter delay by using the integer-sample delay application as described in Section 3.3.1.

3.4. Delay Rate Application

Due to effects such as the motion of the telescope with respect to the incoming radiation or a drift of the hydrogen maser reference frequency, the filtered and down-converted electric field strength measurements are sampled at different points on the time axis for each radio telescope. This manifests itself in a changing delay between radio telescopes for each sample. Within this text, this change is called the delay rate, which encompasses nonlinear delay changes over time as well.

An algorithm is implemented in VieRDS which shifts the phase of each sample according to the delay rate value of each sample. This can be realized with a complex mixing operation as it is described by Deller et al. (2007). Again, this yields a complex-valued discrete signal which is not feasible for the simulation of raw radio telescope data as described in the section about the fractional delay application in Section 3.3.2. Hence, a single-sideband amplitude modulation which results in a real-valued signal is implemented in VieRDS to apply a changing delay over time. This effectively removes the upper sideband and makes it possible to realize a real-valued delay rate application. The upper sideband generation is inherent to a real mixing operation and must be removed to provide a phase shift of each sample that is valid for the entire bandwidth.

In the following, the delay rate application due to the radial velocity of the radio telescope is described. The apriori parameterization of other effects that change the sample position on the time axis has not yet been implemented within VieRDS. However, the presented delay rate algorithm can be used for the application of other effects as well in future software versions.

The simulated input signal x is multiplied by a sinusoid of frequency f₀ and the result is added to the Hilbert transform H of x multiplied by a phase-shifted sinusoid of frequency f₀ as,

$\begin{eqnarray}&&\begin{array}{l}y[n]=x[n]{\rm{\cos }}(2\pi {f}_{0}{\rm{\Delta }}\tau [n])\\ \quad +\,\mathrm{Im}(H(x[n])){\rm{\sin }}(2\pi {f}_{0}{\rm{\Delta }}\tau [n]).\end{array}\end{eqnarray} \tag{ 9 }$

The output signal y[n] represents the result of the single-sideband modulation of the input signal x[n], where Δτ[n] is the delta delay (see below). The sky frequency f₀ is used to simulate the delay rate at the corresponding sky frequency of the received electromagnetic radiation. For a common sky frequency of 8 GHz for VLBI, the phase rotation is periodic with 125 ps. Delay rates for telescopes near the equator observing a source with a low decl. can yield values of up to several hundreds of nanoseconds per second. Such a high delay rate cannot be applied with the algorithm described by (9) because the phase rotation of full circles causes delay offsets when the full circles are reached. For this reason, a second step after the single sideband amplitude modulation is introduced where a so-called block-wise integer and fractional-delay application for these offsets is carried out. The block length indicates a time interval. The station-based geocenter delay per sample τ[n] is split up into the sum of the integer-sample delay τ_int,l and fractional-sample delay τ_frac,l per block l, and a delta delay Δτ[n], as,

$\begin{eqnarray}&&\tau [n]={\tau }_{\mathrm{int},l}+{\tau }_{\mathrm{frac},l}+{\rm{\Delta }}\tau [{n}_{i}].\end{eqnarray} \tag{ 10 }$

The integer and fractional-sample delay stay constant for the entire block such that Δτ[0] = 0 is fulfilled. The changes of τ[n] over n are reflected by the delta delay Δτ[n]. It compensates for the limitation due to the periodicity of the delay rate application over the time-span of the modulation defined by the reciprocal of the sky frequency. The length of the block is calculated for each telescope separately regarding the extent of the delay rate over time per telescope. Within each block the following condition must be fulfilled,

$\begin{eqnarray}&&{f}_{0}{\rm{\Delta }}\tau [{n}_{i}]\lt 1\end{eqnarray} \tag{ 11 }$

where n_i = 0,..,N_block − 1 is the time-index of the signal block, with N_block samples per block. Thus, the higher the delay rate the shorter the block interval is chosen to stay within the unambiguous period of the phase rotation function (sine and cosine functions in Equation (9)).

The delay rate algorithm described in this text applies a changing delay over time on the signal. The correlation software has to compensate for the changing delay which is referred to as fringe rotation within many previous works about the theory of radio interferometry (Thompson et al. 2017). For extremely low delay rates the changing delay compensation can even be applied to the correlation product (Deller et al. 2007). Since VieRDS operates only with single telescope data without the usage of the correlation product, the term fringe-rotation is avoided throughout this text because it is conceptually related to the correlation step.

3.5. Constant Phase Shift

In the frequency domain, the delay of a signal can be viewed as a phase shift Δϕ(f), which increases or decreases linearly with frequency, without the need for an offset at the base of the spectrum (Δϕ(0) = 0). This can be realized with the algorithms described in Section 3.3. The concept of VieRDS is that the signal for each channel with a different frequency at the sky is initially created at baseband. To simulate a delay that is linear across several channels with different sky frequencies, the signal of each channel must be shifted by a constant phase value. This is realized by replacing the argument of the sinusoid in (9) by a constant phase value Δϕ_a in radians, as

$\begin{eqnarray}&&\begin{array}{l}y[n]=x[n]\cos ({\rm{\Delta }}{\phi }_{a})\\ \quad +\,\mathrm{Im}(H(x[n]))\sin ({\rm{\Delta }}{\phi }_{a}).\end{array}\end{eqnarray} \tag{ 12 }$

All samples of the input signal x[n] are shifted by the same phase value ϕ and yield the real-valued output signal y[n]. Within VieRDS the lower sky frequency limit f_a of the channel frequency band is used to calculate the phase offset for the phases of the channel sky frequencies by

$\begin{eqnarray}&&{\rm{\Delta }}{\phi }_{a}=2\pi \,\tau \,{f}_{a}.\end{eqnarray} \tag{ 13 }$

The shift by the constant phase value for the signals of the individual channels at baseband takes the true phase values at sky frequency into account. The slope of the phases is realized with the delay application described in Section 3.3. Thus, the phase values of the signal, which is initially created at baseband, correspond to the phase values of the respective sky frequencies. The application of a constant phase shift makes it also possible to conveniently simulate other systematic phase changes. The concept of the phase shift application at baseband is illustrated in Figure 1.

3.6. Arbitrary Magnitude Filter

For modeling various realistic system responses, VieRDS features a tool to change the magnitudes of the frequency response for all simulated signal classes described in Section 3.2. Arbitrary magnitude values per frequency across the channel bandwidth can be handed over to the software, and a corresponding magnitude filter is applied. A shape-preserving piece-wise cubic interpolation for the input magnitude values is carried out at the query points. The query point resolution on the frequency axis can be defined by the user. The arbitrary magnitude filter design object provided by the MATLAB^© Digital Signal Processing Toolbox is used to create a filter function based on the interpolated magnitude values. The filter is applied with a convolution on the simulated input signal. The filter delay is corrected by the use of the integer-sample delay correction described in Section 3.3.1. An example of such an application of the implemented arbitrary magnitude filter of the real station frequency response of two VGOS telescopes is shown in Section 5.2 to test this algorithm.

3.7. Quantization

Quantization of the filtered and down-converted field strength measurements is an essential element in the backend of each radio telescope. The amplitude values of the simulated raw data are represented by floating-point numbers and stored with a precision of 64 bit prior to quantization. Within VieRDS, the simulation of the quantization is realized by mapping the 64 bit representation to a lower bit depth. Currently, the mapping to a one and a two bit representation is implemented.

For one bit quantization, the floating-point signal x[n] is loaded and quantized according to Equation (14), yielding a signal x_1bit that can take either two values for each sample: −0.5 and +0.5 (based on the VLBI Data Interface Specification (VDIF)),

$\begin{eqnarray}{x}_{1\mathrm{bit}}[n]=\left\{\begin{array}{ll}-0.5, & x[n]\lt 0\\ +0.5, & x[n]\geqslant 0\end{array}.\right.\end{eqnarray} \tag{ 14 }$

For two bit quantization, three thresholds are used to map the 64 bit to the two bit representation. There is one threshold located at zero, as it is the case for the one bit quantization. Additionally, two thresholds are symmetrically located around zero. If the 64 bit amplitude value of the simulated signal falls inside the symmetrical thresholds, the value is mapped to ±0.5 considering the sign of the value. If the 64 bit value falls outside the thresholds, the value is mapped to ±1.5 considering the sign of the value. The size of the threshold can be set by the user by a factor that is proportional to the standard deviation of the input signal. This factor is called the quantization factor q. This approach uses the standard deviation because the input signal is considered to be a highly non-deterministic signal with a normal distribution. If the quantization factor is set to one, the threshold value is equal to the standard deviation σ of the input signal. The quantization from 64 bit to two bit is realized by

$\begin{eqnarray}{x}_{2\mathrm{bit}}[n]=\left\{\begin{array}{ll}-1.5, & x[n]\lt -q\sigma \\ -0.5, & -q\sigma \leqslant x[n]\lt 0\\ +0.5, & 0\leqslant x[n]\lt q\sigma \\ +1.5, & q\sigma \leqslant x[n]\end{array}.\right.\end{eqnarray} \tag{ 15 }$

The quantized signal x_2bit is represented by four possible amplitude values: −1.5, −0.5, +0.5 and +1.5.

The current backend systems such as RDBE (Niell et al. 2010) and DBBC (Tuccari et al. 2019) set their thresholds/quantization factors by testing the previous 5 or 1 s, respectively, for making 32% and 18% of the next samples lie below and above the limit (G. Tuccari 2021, personal communication). The bit distribution of 32% and 18% corresponds to a quantization factor of 0.9. The optimal quantization factor based on a statistical analysis is 1.08 according to Thompson et al. (2017). The default quantization factor value of VieRDS is set to 1.08, but can be adjusted to any value by the user.

3.8. Filtered three-component Signal Model

The final output signal provided by VieRDS is a function of the individual signal processing algorithms described in Sections 3.3–3.7. This comprises the application of a delay, delay rate, phase offset, arbitrary magnitude filter, and quantization algorithm. The application of the integer and the fractional delay can be assigned to a corresponding delay application function called H_τ. Likewise, the delay rate application, constant phase shift, arbitrary magnitude filtering, and quantization can be assigned to functions ${H}_{\dot{\tau }}$ , H_ϕ, H_M, and H_Θ, respectively.

Each signal component x_i[n] with i = src, sys, pcal, can be processed by a different parameterization of the corresponding processing algorithms H_τ,i, ${H}_{\dot{\tau },i}$ , H_ϕ,i and H_M,i and amplitude scaled by σ_i. Quantization, however, can only be carried out for the superimposed signal by applying the quantization function H_Θ,b, where b represents the number of bits used for quantization. In addition to the component-wise processing, the superimposed signal can also be processed by a certain parameterization (index = 0) assigned to the functions H_τ,0, ${H}_{\dot{\tau },0}$ , H_ϕ,0 and H_M,0.

Rewriting the superposition formula (2) and applying the component-wise signal processing algorithms yields

$\begin{eqnarray}&&\begin{array}{l}{x}_{H}[n]=\displaystyle \sum _{i=1}^{3}{H}_{M,i}\{{H}_{\phi ,i}\{{H}_{\tau ,i}\\ \quad \{{H}_{\dot{\tau },0}\{{\sigma }_{i}{x}_{i}[n]\}\}\}\}.\end{array}\end{eqnarray} \tag{ 16 }$

Applying the signal processing algorithms including the quantization algorithm on the superimposed and filtered signal x_H yields

$\begin{eqnarray}&&\begin{array}{l}q[n]={H}_{{\rm{\Theta }},b}\{{H}_{M,0}\{{H}_{\phi ,0}\{{H}_{\tau ,0}\\ \quad \{{H}_{\dot{\tau },0}\{{x}_{H}[n]\}\}\}\}\}.\end{array}\end{eqnarray} \tag{ 17 }$

The quantized signal q[n] represents the final output of a single telescope simulation provided by VieRDS. The sequence of applications is delay rate, delay, phase offset, and arbitrary magnitude filtering. The main feature of the simulation concept of VieRDS is clearly visible and based on the fact that each signal component can either be parameterized independently or as superimposed signal. This flexibility enables various different setups for simulation studies.

4. VieRDS Software Architecture and Databases

In the previous section, the theory of the digital signal creation and the processing algorithms are described. In this section, the integration of those algorithms into the software architecture is shown and the modular software structure is explained. VieRDS consists of several modules that are run sequentially (Figure 3). The first module loads the input data, the second one calculates the model parameters, and within the third module, the signals are created and signal processing is carried out. In the last module, the storing of the simulated data in a convenient format is carried out. The flowchart of VieRDS consisting of the high-level module architecture, the individual signal creation/processing steps per channel, the input and output databases, and the channel-wise parallel computing capabilities is shown in Figure 3.

**Figure 3.** Flowchart of VieRDS. The input databases and the output databases are depicted in blue and red color, respectively. The black box is the high-level presentation of VieRDS with the modules for reading the input data (Input), calculating the model parameters (Model), creating and applying the signal processing algorithms (Signal), and formatting the raw data in the VDIF format (Formatter). The orange box presents the signal module's design with the simulation per station (St) for each channel (Ch) separately. Processes chained from top to bottom are run in sequence, whereas processes at the same level horizontally can be run in parallel. Hence, each channel can be processed in parallel. The FTC{} algorithm represents the filtered three-component signal model applied per station and channel, yielding the quantized signal q[n] described in Section 3.8.
Download figure:
Standard image High-resolution image

VieRDS is implemented with MATLAB^© and makes heavy use of the Digital Signal Processing Toolbox (see footnote 3). The parallel computing toolbox⁴ is used within modules to enable parallel computations on multicore computers, GPUs, and computer clusters.

The use of VieRDS is not limited to high performance clusters. The bottleneck for simulations with VieRDS is computer memory, since the signals are stored with their entire length in memory before they are written to VDIF. If the signal length, which depends on the sampling frequency and the scan length, is kept small (several seconds for sampling frequencies up to, e.g., 64 MHz), VieRDS can be run on modern personal computers. VieRDS was tested on a four-processor Intel system with 16 GBytes of total memory, and the runtime of a legacy 8 channel X-band simulation with 16 MHz sampling frequency, one second scan length, and 1 bit quantization was 68 s. Doubling the scan length to two seconds resulted in a runtime of 134 s. A single VGOS channel with 64 MHz sampling frequency, 2 bit quantization and one second scan length took 57 s. All run times included the process of writing the respective VDIF files.

4.1. Input Module

VieRDS takes a simple text file in YAML⁵ ^, ⁶ format as input. The input parameters are defined per station comprising the following parameters:

1.
Station name according to Terrestrial Reference Frame (TRF) database.
2.
Source name according to Celestial Reference Frame (CRF) database.
3.
Sampling frequency (MHz).
4.
Observation duration (s).
5.
Date of observation, defined as signal arrival time at geocenter.
6.
Number of bits.
7.
VDIF and VEX⁷ file specifications.

Additionally, a class of station parameters exists that can be defined per station and per frequency channel as well:

1.
Sky frequency (MHz), defined at the channel center.
2.
Quantization factor.
3.
Delay (s) and phase offset (rad) for each signal component: source signal, system signal, phase calibration signal, and superimposed signal.
4.
Filter length and sidelobe attenuation of the fractional-sample delay application.
5.
Source Flux (Jy), which by definition is a station-independent parameter; as it is available per station, it can be used to simulate, for example, station-dependent aperture efficiency parameters that result in reduced or increased recorded signal power per station.
6.
Station SEFD (Jy).
7.
Effective telescope area (m²).
8.
Arbitrary magnitude filter for each signal component: Name of text file that stores magnitude values per frequency (GHz), interpolation resolution (Hz), filter order, name of MATLAB^© filter design object.
9.
Phase calibration signal parameters: relative power (%), frequency spacing (MHz), phase offset per tone (rad).

Those parameters enable the simulation of frequency-dependent parameters for each station separately. For example, it is possible to simulate frequency-dependent signal delays, frequency-dependent source fluxes, or/and frequency-dependent station sensitivities. Furthermore, delay and phase offset parameters can be defined for each signal component individually, and the superimposed signal as well.

VieRDS distinguishes between two different simulation approaches driven by the location of the radio telescopes. For the first case, the radio telescopes are placed at the same location. The digitized signals at both telescopes are affected by the same delay and delay rate due to equal location with respect to the incoming radiation. This results in a delay and delay rate difference of zero. For this reason, the delay and the delay rate is not simulated. However, delay and phase offsets can be simulated also by setting those values specified in the YAML input file. These values need not be related to the location. They can take any value to simulate other (frequency-dependent) effects. This approach is referred to as zero baseline simulation and a TRF and a CRF database are not required.

For the second case, the geocenter delay and delay rate parameters are calculated based on the location specified in the TRF and the CRF database. The parameter values are calculated within VieRDS and need not be set specifically. However, additive delay parameters can be set as it is the case for the zero baseline simulation in the YAML input file. These values are added to the delay parameters calculated based on the TRF and CRF database. This approach is referred to as non-zero baseline simulation.

4.2. Model Module

Once the data is read in, converted to SI units, and stored into channel-based MATLAB^© internal data structures, the model parameters are calculated derived from the input parameters.

In the case of the non-zero baseline option, the geocenter delay for each telescope is calculated at equi-distant points along the time axis of the observation period. Per default, a point in time is defined every 100 ms. Since a geocenter delay is required for each sample, a piece-wise cubic interpolation between the apriori value is carried out to obtain a delay value for each sample.

The geocenter delay calculation within VieRDS is considered to be very coarse because it only depends on the apriori data of TRF station and CRF source coordinates, and Earth Orientation Parameters (EOP). The transformation from the TRF to the CRF is carried out by the use of the IAU2000 finals⁸ EOP data including a Lagrange interpolation for finer resolution in time. The Vienna solutions obtained by VieVS (Böhm et al. 2018) for the TRF and CRF databases are used within VieRDS. For the transformation from the TRF to the CRF the MATLAB^© functions of VieVS are used as well, but they are implemented independently into VieRDS. Hence, VieRDS can be used as a standalone software without usage of the VLBI analysis software VieVS.

After the calculation of the non-zero baseline parameters, further individual model parameters for the specific telescope and channel are calculated that are independent of the zero or non-zero baseline simulation method. These calculations include the determination of the sampling interval by T_x = 1/f_s which is required for the signal creation, and the determination of the lower and upper frequency limit f_a,b using the sky frequency defined at channel center f₀, and the channel bandwidth defined as f_bw = f_s/2,

$\begin{eqnarray}&&{f}_{a,b}={f}_{0}\pm {f}_{\mathrm{bw}}/2.\end{eqnarray} \tag{ 18 }$

Additionally, the amplitude values of the signal classes as described in Section 3.2 are calculated. Furthermore, the phase offsets and delay parameters as described in Section 3.3.2 are calculated and are stored for later use in the signal generation and signal processing module.

Once the model parameter calculation is complete, the input and model parameters are written in a corresponding VLBI Experiment file (VEX), which is a framework for defining the setup details and schedule for a VLBI experiment.⁹ This VEX file includes all file specific blocks that are required for DiFX correlation and can be specified in the input YAML file (Section 4.1). Thus, compatibility with the DiFX software correlator is ensured to realize the correlation of the simulated raw telescope data provided by VieRDS.

4.3. Signal Module

After the calculation of the model parameters, the source, the system, and the phase calibration signals are generated according to the descriptions in Section 3.1. Due to the memory-intensive characteristic of the signal generation and the signal processing afterward, the module for parameter calculation and signal creation are clearly separated to allow for parallelization over all frequency channels.

In the first step, the source signal with unscaled amplitude values is created. The generation of white noise is realized with the MATLAB^© function randn, which returns a vector of random scalars drawn from the standard normal distribution.¹⁰ The length of the vector corresponds to the length of the discrete signal which in turn is equal to the number of samples (e.g., 32 × 10⁶ samples per second for a channel with 16 MHz bandwidth). The source signal will be kept until the very end of the simulation run of each telescope because it is the common signal for all participating telescopes. Please note again that the noise level of the media traversed by the source signal with respect to each telescope is incorporated in the system signal. The system and the phase calibration signal are created independently for each telescope and deleted once the simulation run of each telescope is finished. The amplitude values of the system and phase calibration are scaled as described in Section 3.2.

In the next step, the delay rate and delay algorithms are applied sequentially on the signal components. VieRDS offers the possibility to delay each component individually and/or the superimposed signal as described in Section 3.8 about the filtered three-component signal model. Additionally, VieRDS features the application of an arbitrary magnitude filter per signal component as well.

Once the delay rate, the delay, and the magnitude filter are applied, the signals are added by a vector addition operation. This yields the superimposed signal with 64 bit amplitude resolution. The system signal and the phase calibration signal are deleted, only the raw source signal with unscaled amplitude is kept for further simulation. The quantization of the superimposed signal represents the last step in the signal creation and processing module.

4.4. Formatter Module

The data storing of the simulated quantized signal for each radio telescope in a proper data format is the final task within the VieRDS software architecture. The VLBI Data Interface Specification (VDIF) format is a natural choice for this task. It is a standardized transport-independent VLBI data-interchange format that is suitable for all types of VLBI data transfer and disk-file storage (Whitney et al. 2010). A module is implemented in VieRDS to convert the internally stored signal data of MATLAB^© into the VDIF format. This tool features the storage of multi-channel data and the storage of several bits per sample. Data in this format can be read by commonly used correlators such as the Distributed FX correlator (DiFX). Hence, it is possible to create simulated raw data which can be further correlated in the same processing chain as real observed data. The comparison of real and simulated data is therefore easily possible.

5. Proof of Concept

The VieRDS algorithms that are previously described in Section 3 and which are embedded in the software architecture as described in Section 4 are now tested. Three proof of concept (POC) experiments are carried out. For all experiments, the same processing chain is used which is described in detail in the following Section 5.1. The first experiment deals with the simulation of the station characteristic frequency response of two VGOS stations. The goal of this experiment is to reflect the real station characteristic frequency response including the simulation of the phase calibration signal. The second experiment tests the broadband delay capabilities of VieRDS. Here, we simulate a group delay from a linear and a nonlinear phase behavior for a typical VGOS broadband frequency setup. The third experiment challenges the non-zero baseline simulation capabilites of VieRDS by generating simulated raw telescope data for a very long baseline near the equator with high geocenter delay rate values at both telescopes.

The results of all three experiments can be collected to evaluate an important VieRDS feature: the user-controlled manipulation of the magnitude and phase of the simulated raw data signal.

5.1. Processing Chain for Real and Simulated Raw Data

To analyze the simulated raw VLBI data obtained by VieRDS, a processing chain that includes a software correlator is set up. For this purpose, we installed DiFX at the Vienna Scientific Cluster (VSC-4). The simulated raw data can then be correlated by DiFX to form the final correlation product, the so-called visibilities. Additionally, real data is processed by the same processing chain as well to allow for profound comparisons between real and simulated data. Of course, simulated data can be processed without the usage of real data in the processing chain as well.

For all three simulation experiments, the input data for the correlator is stored in VDIF format. The correlation results that are provided by DiFX are initially stored in the native Swinburne format and are then converted to the FITS¹¹ format. The analysis of the real and simulated data for the following demonstrations are carried out by inspection of the FITS file data. Except for the third simulation experiment, the non-zero baseline simulation, the correlation output of DiFX is further processed by the Haystack Observatory Postprocessing System (HOPS), developed at MIT Haystack Observatory (Whitney et al. 2004). The usage of simulated raw data in the real correlation processing chain is depicted in Figure 4.

**Figure 4.** Flowchart of the processing chain. The steps for real data as implemented in our processing infrastructure are depicted in black color. The radio telescopes digitize the electromagnetic field strength measurements and store it in VDIF format. The data is then handed over to the DiFX software correlator. The real observation can be substituted by the simulation of the observing process and the simulated data can also be stored in the VDIF format (orange color). It is possible to feed the DiFX software correlator with simulated data.
Download figure:
Standard image High-resolution image

5.2. Simulation of a Characteristic Frequency Response at a Station

The receiving system of a radio telescope is represented by a sophisticated interconnection of electronic devices and can be characterized by the magnitudes of the frequency response at the station. The realization of the receiving system can be somewhat different between radio telescopes and a station specific frequency response can be found. This circumstance is used to test the arbitrary magnitude filter algorithm within VieRDS as described in Section 3.6. In the following, the simulation of a station specific frequency response is shown based on real visibility data of auto-correlation data compared against the frequency response of the real observed counterpart.

The real data is obtained from the VGOS intensive session VI9290 using the channel observed from 3000.4 to 3032.4 MHz. A real scan with the baseline KOKEE12M and WETTZ13S is correlated using the processing chain depicted in Figure 4. The visibilities which are stored in the Swinburne format are converted to the FITS format and only the auto-correlation visibilities are extracted. Due to the correlation configuration, the spectral resolution of this data is 250 kHz, yielding 128 samples for the bandwidth spectrum. The square root of the absolute visibility values are calculated and referred to as the magnitudes of the frequency response at the station. The magnitude spectrum is used as input for the arbitrary magnitude filter applied to the simulated raw data signal.

Since VieRDS features the simulation of the phase calibration signal as an independent method by the superposition of the phase calibration signal as an individual signal onto the source and system signal, the phase calibration tones are suppressed here. The suppression of the phase calibration tones is required because the sampling in the frequency domain of the real signal is of too low precision, resulting in smeared peaks. The arbitrary magnitude filter accounts for the magnitude shape without the phase calibration tones.

To remove the phase calibration tones from the real magnitudes a moving median algorithm with a window length of 5 samples is applied. The resulting signal serves as the input for the arbitrary magnitude filter parameterization. A filter order of 300 is chosen and the real normalized magnitude values are interpolated with a spectral resolution of 1 kHz. The filter is applied to the simulated raw data. The simulated raw data consists of a source, a system signal, and a phase calibration signal with 5 MHz frequency spacing. The relative power of the phase calibration signal is set to 0.9% for KOKEE12M and 4.0% for WETTZ13S. The numbers where deduced through matching the real station frequency response with the simulated relative power of the phase calibration signal. This is significantly stronger than the required relative power suggested by G. Rajagopalan for VGOS (2021, personal communication) but reflects the real situation. The application of this simulation scenario is an excellent example of the reverse-engineering capabilities of VieRDS.

The algorithm implemented in VieRDS to create a phase calibration signal is described in Section 3.2. The absolute scale of the magnitudes is not of interest, for this purpose the data is normalized before comparison. The resulting simulated magnitude values are displayed in orange color, whereas the real observation data is depicted in black color, for KOKEE12M and WETTZ13S in Figure 5.

**Figure 5.** VieRDS is designed to simulate the characteristic frequency response of VLBI radio telescopes. The magnitudes of the frequency response of a 32 MHz channel for KOKEE12M (left) and WETTZ13S (right) is shown. The real observed frequency response is depicted in black color, whereas the simulated frequency response is depicted in orange color. The characteristic bandpass frequency response is clearly visible for both, real and simulated data, and reflects the characteristic bandpass response per station. The difference between real and simulated data is comparably small. Furthermore, the phase calibration signal with tones every 5 MHz are clearly visible for real and simulated data.
Download figure:
Standard image High-resolution image

We see that the real and simulated data is in good agreement. The simulated phase calibration tones are present and can be detected by the DiFX algorithm for phase calibration signal extraction. The frequency spacing of the simulated tones matches the real tones and the tone amplitudes are in good agreement. The cumulative difference between real and simulated data is for both stations below 1.3%.

To reproduce the simulation scenario the YAML input data can be downloaded from https://www.vlbi.at/data/correlation/public/raw_data_simulation/characteristic_station_frequency_response.yaml.

It should be noted here that the station frequency response can also be calculated by direct Fourier transformation of the raw VDIF samples. However, since from a correlator's point of view the station frequency response is often analyzed by examining the auto-correlation curve, the method described above was chosen.

5.3. Simulation of a Dispersive Group Delay

A dispersive group delay shows a nonlinear slope for the phase over frequency. In this example, we show the simulation of such a delay due to the frequency-dependent ionospheric delay contribution.

To test the dispersive group delay simulation, a VGOS frequency setup is used because the dispersive contribution is known to be significantly visible in the broadband frequency spectrum. The simulation setup consists of the simulation of 32 channels within the frequency range from 3.00 to 10.68 GHz. This setup is consistent with the frequency setup used for the real observed VI9290 VGOS session. The channel bandwidth is 32 MHz.

A slant total electron content (sTEC) of 8 TEC units is used to calculate the ionospheric delay τ_ion by

$\begin{eqnarray}&&{\tau }_{\mathrm{ion}}=\displaystyle \frac{40.31\,\mathrm{sTEC}}{{{cf}}^{2}},\end{eqnarray} \tag{ 19 }$

in addition to a constant group delay of 600 ps. Based on this frequency-dependent delay the phase offset and the group delay is calculated for each simulation channel. The delay and the phase offset algorithm as described in Sections 3.3 and 3.5 are then applied using these values.

A scan of two stations with the VGOS frequency setup is simulated. Two experiments are carried out. First, the signal of the first telescope is delayed by the constant 600 ps delay. Second, the signal of the first telescope is delayed by the sum of the dispersive and constant delays. No delay is applied to the signal of the second telescope for both experiments. Thus, a constant group delay is simulated for the first experiment, and a dispersive group delay is simulated for the second experiment and correlated with the undelayed signal of the second telescope. The processing chain described in Section 5.1 is used to correlate the simulated raw data of the two telescopes. The phase values are extracted from the resulting FITS file and plotted in Figure 6.

**Figure 6.** The interferometric phases obtained by DiFX correlation for the simulated raw data of a linear group delay (top left) and a nonlinear group delay due to the ionosphere (bottom left) are depicted, with an enlarged display of the corresponding lower frequency band (top/bottom right). The dispersive group delay of the ionosphere shows a characteristic nonlinear slope for the phase over frequency. Within VieRDS this behavior is modeled by applying a phase offset and linear group delay per channel. The simulation is parameterized by using a linear group delay of 600 ps and an sTEC value of 8 TECu. The interferometric phase values provided by DiFX are within 0 and 2π (black color). For a clear depiction, these values are several times unwrapped, yielding multiple versions of the original values separated by 2π (gray color). The phase values are highlighted with the orange color where they match the apriori data (dashed line in black color). The frequency range comprises a full VGOS broadband setup consisting of four individual bands with 8 channels per band yielding 32 channels that were simulated in total.
Download figure:
Standard image High-resolution image

In the figure, the frequency dependence of the phase values is clearly visible and the phase values align very well with the calculation of the apriori ionospheric delay. The characteristic ionospheric delay with negative group delays for low radio frequencies and positive group delays for larger radio frequencies is a noticeable feature. To reproduce the simulation scenario, the YAML input data can be downloaded from https://www.vlbi.at/data/correlation/public/raw_data_simulation/dispersive_group_delay.yaml.

5.4. Non-zero Baseline Simulation

VieRDS offers the possibility to run the VLBI raw data simulation in two different modes. One mode is the zero baseline simulation, and the other mode is called the non-zero baseline simulation. In the zero baseline simulation, the participating telescopes are placed at the same location, which results in delay and delay rate values of zero size between the telescopes. This fact is utilized for zero baseline simulations by not simulating/applying any apriori delay and delay rate correction, which decreases the computational complexity significantly.

In contrast, for non-zero baseline simulations, the telescopes are located according to their apriori coordinates, forming non-zero baseline lengths. The geocenter delay for VLBI telescopes can yield delay values of up to 0.021 s (travel time of radio waves for a distance that corresponds to the Earth radius) and delay rates of up to several hundreds of nanoseconds per second. These apriori delays and delay rates are considered for non-zero baseline simulations.

The POC examples demonstrated in the previous sections are zero baseline simulations, which do not take into account any apriori delay and delay rates. In this section, a non-zero baseline simulation of two telescopes observing with legacy X-band (8 channels) is demonstrated. The observing mode of the simulation is parameterized as follows:

1.
X-band center frequencies (MHz): 8212.99, 8252.99, 8352.99, 8512.99, 8732.99, 8852.99, 8892.99, 8932.99,
2.
Channel bandwidth: 8 MHz,
3.
Number of bits used for quantization: one.

A coarse apriori delay and delay rate calculation is carried out based on the EOP, TRF, and CRF data as described in Section 4.2. Consequently, significant residuals will be obtained when correlating with the DiFX correlator, which computes the aprioris with a more sophisticated delay and delay rate model. For example and most importantly, a sophisticated relativistic model including yearly and daily aberration effects (Petit & Luzum 2010) is not yet implemented in VieRDS. However, the residuals obtained by DiFX correlation of VieRDS data are usually within the search range of the single-band delay resolution function (several μs) and can be easily compensated for by making small adjustments in the correlator clock model.

To challenge the non-zero baseline simulation capabilities of VieRDS, a very long baseline parallel to the equator (HARTRAO—WARK12M) with a low decl. source (2358+189) is selected for raw data simulation. This geometry yields a rather high absolute delay and delay rate (Table 1).

Table 1. Geocenter Delay and Delay Rate of Station HARTRAO and WARK12M when Observing the Source 2358 + 189 at 2020 January 28, 17:30:00

	Geocenter Delay μs	Geocenter Delay Rate μs s⁻¹
HARTRAO	6758.473225	−1.106016
WARK12M	−18926.863755	0.480697

Download table as: ASCII Typeset image

The simulated non-zero baseline raw data is correlated with the DiFX correlator described in Section 5.1. In contrast to the POC of Section 5.2 and 5.3, DiFX treats the simulated raw data as if it was observed at the station corresponding to the precise TRF coordinates specified in the VEX file. In analogy to the delay and delay rate application by VieRDS, the DiFX correlator corrects the delay and delay rate effects with its independent algorithms described in Deller et al. 2007, which are currently more accurate than those implemented in VieRDS.

The Haystack Observatory Postprocessing System (HOPS), developed at MIT Haystack Observatory (Whitney et al. 2004) further processes the DiFX correlator output to obtain the characteristic single-band delay, multi-band delay, and delay rate resolution function. These functions can be used to inspect the correlation results. Furthermore, the phase residuals are depicted to prove the phase stability over time.

The graphs of the single-band delay, multi-band delay, and delay rate resolution function obtained by HOPS are depicted in Figure 7 and show symmetrical behavior with a distinct peak. This indicates a successful channel-wise cross-correlation and successful bandwidth synthesis. The offsets from zero delay (for this simulation experiment 1.3 μs for the single-band delay), are caused by the differences in the apriori delay calculation of DiFX and VieRDS, most notably by the missing aberration and relativistic corrections in VieRDS. However, the single-band delay resolution function's peak lies in the valid search range of the correlation configuration of DiFX and fourfit.¹²

Despite the apriori delay offset, the VLBI fringe-detection is successful. The linear behavior with a small random noise component of the phase residuals over time per channel indicates no unwanted phase corruption during the VieRDS delay rate application (Figure 7). The delay rate residual value is 92.6 ps s⁻¹, which indicates a small offset for the apriori delay rate calculation between DiFX and VieRDS.

6. Summary and Conclusions

In this paper, we described the digital signal processing algorithms that are required to simulate raw telescope data for VLBI. We outlined the key characteristics of a novel simulation concept and presented the implementation with MATLAB^©. The key characteristics are as follows:

1.
The received signal by the radio telescope is simulated as the superposition of a source, a system, and a phase calibration signal. The system signal includes the noise contribution of the media traversed by the radiation.
2.
The source signal is simulated for the frequencies of each channel separately and is applied identically to all bit streams belonging to a simultaneous scan of a radio source.
3.
The system and phase calibration signal are simulated for each telescope and each frequency channel, separately.
4.
Amplitude scale, delay, delay change over time, phase offset, and arbitrary magnitude filter can be simulated for each signal component and frequency channel separately.
5.
Amplitude quantization is possible with 1 and 2 bit.
6.
The simulated raw data signal is stored in VDIF format and is thus compatible with the DiFX software correlator.

To validate the software, we performed a proof of concept (POC) demonstration. It contained a raw data simulation followed by correlation and fringe-fitting with DiFX and the fourfit program, respectively. The POC provided the following insights:

1.
The characteristic frequency response at the station was correctly simulated for each channel reflecting the real frequency response.
2.
The frequency spacing of the tones and the power of the phase calibration signal was simulated as specified.
3.
For a VGOS broadband frequency setup, VieRDS simulated correctly the dispersive group delay caused by the ionosphere.
4.
The non-zero baseline simulation of a legacy X-band observation with a very long baseline near the equator showed a successful fringe-detection with fourfit.

From the results obtained by the POC demonstration, we can derive one important characteristic of VieRDS: the user-controlled manipulation of the magnitude and phase of the simulated raw data signal enables a variety of applications for the simulation of the VLBI observation process. In the first place, we can simulate standard and optimal configurations and environments as we see them in the majority of VLBI experiments. However, of more interest are those system responses, which are rather pathological, and where we can test hyphotheses of what may have caused specific abnormal correlation and fringe fitting results. In this respect, VieRDS is a very useful tool for figuring out problems virtually in a reverse-engineering mode. Conceivable tests may be of how and to what extent corrupted phase-cal tones may affect the delay observable or assessing the impact of amplitude calibration, just to mention two of the many issues which can be addressed.

It should be mentioned here that the simulation experiments of the POC section are a demonstration with the aim to verify the practical potential of VieRDS. Detailed statistical comparisons of measured observations and their simulated equivalent, e.g., fringe amplitude or phase variation over time, will be performed and discussed in subsequent publications. This article focuses on describing the main features of the simulation concept, digital signal generation, digital signal processing algorithms, and software architecture, while the POC section can be considered as a feasibility study to assess the practicality of VieRDS.

In the near future we plan to apply the current capabilities of VieRDS for simulation studies to reveal a few of the systematic effects in the VLBI observation process. The VGOS setup and its peculiarities are certainly a wide field for testing and improving the VLBI observables. The modular and flexible software structure implemented in MATLAB^© makes it possible to easily extend the current capabilities, e.g., with the simulation of source structure effects which are critical for VGOS. We also aim to include the simulation of dual-linear polarized receivers. The successful simulation of the phase calibration signal also shows the potential of the simulation of other non-stochastic signal classes, such as artificial satellite signals or disturbing RFI signals.

The authors are grateful to the Austrian Science Fund (FWF) for supporting this work with the project P31625 (VGOS Squared). The computational results presented have been achieved in part using the Vienna Scientific Cluster (VSC).

VieRDS: A Software to Simulate Raw Telescope Data for very Long Baseline Interferometry

Article metrics

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. VieRDS Simulation Concept

2.1. Three-component Signal Model

2.2. Simulation Per Channel

2.3. Digital Signal Creation in the Time-domain at Baseband

2.4. Frequency Response and Equivalent Parameter Combination

2.5. Circular Polarization and Upper Sideband Response

2.6. Delay Reference Location

2.7. Single Scan Simulation with a Focus on Correlation and Fringe-fitting

3. Digital Signal Creation and Processing Algorithms of VieRDS

3.1. Signal Components and Digital Representation

3.2. Signal Classes and Amplitude Scale

3.3. Delay Application

3.3.1. Integer-sample Delay

3.3.2. Fractional-sample Delay

3.4. Delay Rate Application

3.5. Constant Phase Shift

3.6. Arbitrary Magnitude Filter

3.7. Quantization

3.8. Filtered three-component Signal Model

4. VieRDS Software Architecture and Databases

4.1. Input Module

4.2. Model Module

4.3. Signal Module

4.4. Formatter Module

5. Proof of Concept

5.1. Processing Chain for Real and Simulated Raw Data

5.2. Simulation of a Characteristic Frequency Response at a Station

5.3. Simulation of a Dispersive Group Delay

5.4. Non-zero Baseline Simulation

6. Summary and Conclusions

Footnotes

VieRDS: A Software to Simulate Raw Telescope Data for very Long Baseline Interferometry

Article metrics

Share this article

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. VieRDS Simulation Concept

2.1. Three-component Signal Model

2.2. Simulation Per Channel

2.3. Digital Signal Creation in the Time-domain at Baseband

2.4. Frequency Response and Equivalent Parameter Combination

2.5. Circular Polarization and Upper Sideband Response

2.6. Delay Reference Location

2.7. Single Scan Simulation with a Focus on Correlation and Fringe-fitting

3. Digital Signal Creation and Processing Algorithms of VieRDS

3.1. Signal Components and Digital Representation

3.2. Signal Classes and Amplitude Scale

3.3. Delay Application

3.3.1. Integer-sample Delay

3.3.2. Fractional-sample Delay

3.4. Delay Rate Application

3.5. Constant Phase Shift

3.6. Arbitrary Magnitude Filter

3.7. Quantization

3.8. Filtered three-component Signal Model

4. VieRDS Software Architecture and Databases

4.1. Input Module

4.2. Model Module

4.3. Signal Module

4.4. Formatter Module

5. Proof of Concept

5.1. Processing Chain for Real and Simulated Raw Data

5.2. Simulation of a Characteristic Frequency Response at a Station

5.3. Simulation of a Dispersive Group Delay

5.4. Non-zero Baseline Simulation

6. Summary and Conclusions

Footnotes