Optimized spectral analysis in magnetic resonance spectroscopy for early tumor diagnostics

Molecular imaging through magnetic resonance spectroscopy (MRS) can provide information about key metabolites. Conventional applications of MRS are hampered by data analysis via the fast Fourier transform (FFT). Most MRS studies for cancer detection have relied upon estimations of a mere handful or even a single composite metabolite, e.g. total choline. These have yielded incremental improvements in diagnostic accuracy. In vitro studies reveal richer metabolic information for identifying cancer, particularly in closely-overlapping resonances. Among these are phosphocholine, a marker of malignant transformation. The FFT cannot assess these congested spectral components. This can be done by the fast Pade transform (FPT), an advanced, high-resolution, quantification-equipped method, applied to MRS time signals as encoded from patients with breast cancers and other cancers, with benign pathology and with normal tissue, as illustrated herein for the latter. With realistic noise levels, the FPT accurately computes the metabolite concentrations, including phosphocholine, which completely underlies phosphoethanolamine. In sharp contrast, the FFT produces a rough envelope spectrum with only a few shortened, broadened peaks, and key metabolites altogether absent. The FPT clearly separates true metabolites from spurious resonances. The efficiency and high resolution of the FPT translates into shortened examination time of the patient. These capabilities strongly suggest that by applying the FPT to time signals encoded in vivo from breast cancer and other malignancies, MRS will fulfill its potential to become a clinically- reliable, cost-effective method for early cancer detection.


Introduction
Magnetic resonance-based modalities are promising for early cancer detection, particularly since they do not entail exposure to ionizing radiation. Magnetic resonance imaging (MRI) is usually very sensitive, but is costly and not sufficiently specific. Molecular imaging through magnetic resonance spectroscopy (MRS) can potentially improve the specificity of MRI, by providing information about key metabolites. In this paper, we will focus upon MRS-based diagnostics and how these can be improved through optimization of signal processing methods. The actual problem area addressed will be related to screening and early detection of breast cancer. This problem is chosen in light of the fact that worldwide, breast cancer is the most commonly diagnosed malignancy among women and a major cause of death [1]. Survival is profoundly impacted by the stage at diagnosis [2]. Screening significantly improves survival, since with early detection, adequate and timely care can be provided [3]. However, consensus has not been reached as to the best strategies for early breast cancer detection [4]. There is widespread recognition that screening programs need to be improved and made more cost effective [1]. We examine these topics from the standpoint of mathematical optimization via advanced signal processing, as this aspect of the problem pertains to MRS for breast cancer diagnostics.

Screening for breast cancer: The current status of diagnostic imaging
Currently, anatomical imaging through mammography together with ultrasound and magnetic resonance imaging (MRI) is the mainstay for breast cancer screening, with MRI generally showing the best sensitivity [5]. However, MRI has poor specificity and, as such, is associated with high biopsy rates, very often for benign lesions [3]. Expense is another drawback of MRI, such that it is currently considered cost-effective only for women at very high breast cancer risk [6].
Since these anatomic imaging modalities are insufficiently specific, functional and molecular imaging through magnetic resonance (MR) are garnering attention vis-à-vis breast cancer diagnostics [7]. Diffusion-weighted MRI (DWI) helps distinguish benign lesions and normal breast from cancer. Since malignancy is often associated with increased cellularity or intracellular density, the random motion of free water may be restricted, as reflected in a decreased apparent diffusion coefficient (ADC). However, a low ADC can occur with fibrotic tissue or benign proliferative changes, while necrotic tissue in a cancer is associated with increased ADC. Problems remain in selecting optimal cutpoints and recording parameters for DWI. Thus, breast cancer is not detected via DWI with full certainty [8].
Improved identification of breast cancer can potentially be achieved via molecular imaging through MRS or MR spectroscopic imaging (MRSI), whereby the metabolic characteristics of the examined tissue are assessed. When a single MRS voxel is inadequately representative, MRSI provides volumetric coverage. A recent meta-analysis of in vivo MRS and MRSI studies using 1.5T or 3T scanners reveals a pooled sensitivity of 73% and specificity of 88% for detecting over 700 breast cancers versus over 400 scans of non-cancerous breast [9]. These analyses were based on the total choline (tCho) peak, whose resonant frequency is 3.2 parts per million (ppm). As an indicator of cell membrane turnover, tCho was used as a sign of cancer. Since signal-to-noise ratio (SNR) increases linearly with magnetic field strength, higher field scanners would be expected to improve SNR, to enhance MRS-based breast diagnostics. However, the diagnostic accuracy of MRS was not better with 3T compared to 1.5T scanners [9]. Using higher field (4T or 7T) MR scanners [10][11][12], tCho was detected in normal breast and benign lesions and was undetectable in several cancers. Overlap was seen among the ranges of tCho for these three categories of breast tissue. The high costs would also preclude high-field scanners from most breast cancer screening protocols. All told, tCho as a single composite spectral entity has not yielded adequate diagnostic accuracy for breast cancer screening. Further difficulties arise due to the various cutpoints of tCho used to define breast cancer.

Inadequacies due to reliance upon conventional data processing in MRS and MRSI
Whereas isolated resonances are better resolved with higher field scanners, widths are broadened when MRS time signals are encoded with stronger magnets. This adversely impacts on overlapping resonances that are abundant in MR spectra of the breast. Choline detection can be hampered with higher field magnets due to lipid-induced sidebands [13]. These lipids can be suppressed by prolonging the echo time (TE). However, potentially informative metabolites with short T 2 relaxation time will also have decayed with long TE. Furthermore, the lipid itself may be part of the actual disease process, such that diagnostic information could be lost with this encoding strategy [14].
Nuclear magnetic resonance performed in vitro on extracted breast specimens reveals more abundant metabolic information [15]. Components of tCho: phosphocholine (PC) resonating at 3.22 ppm, glycerophosphocholine (GPC) at 3.23 ppm and free choline (Cho) at 3.21 ppm can better characterize breast tissue [16]. With malignant transformation in the breast, PC is increased in association with a "glycerophosphocholine to phosphocholine switch" due to the over-expressed enzyme choline kinase responsible for PC synthesis [16][17][18]. Besides the resonant frequencies of PC, GPC and free choline being extremely close to one another, the far more abundant phosphoethanolamine (PE) also resonating at 3.22 ppm, and which is not a cancer biomarker, obscures the PC peak on total shape spectra.
Encoded MRS time signals require mathematical transformation into their spectral representation in the frequency domain. Currently, this is done by the fast Fourier transform (FFT), built into all clinical MR scanners. The FFT is a low-resolution, linear, non-parametric processor. As a single polynomial, the FFT passes all the noise from the time domain into the frequency domain, with no way to suppress this noise corruption. The FFT can only produce a total shape spectrum; the underlying components cannot be generated via Fourier analysis. Thus, the overlapping resonances contained within the tCho peak cannot be disentangled by the FFT. No trustworthy information can be gleaned regarding the abundance of phosphocholine, PC, nor about the concentration of PE which, as mentioned, is completely superimposed on PC. Via the FFT, not even the number of overlapping resonances contained within a given peak can be determined. Post-processing via fitting is employed to surmise how many components underlie a given peak. Consequently, it is likely that false information will be generated by overfitting (overmodelling) and/or that physical resonances are missed because of underfitting (undermodelling). Fitting procedures cannot generate reliable quantitative information even about predetermined resonances, due to the likelihood of bias. Only by serendipity will a correct guess be made [19,20]. It has therefore been concluded that the inability to autonomously reconstruct the spectral parameters, i.e. the fundamental complex frequencies and amplitudes, completely eliminates the FFT as a signal processor for solving the quantification problem, the "raison d'être" of MRS [19,20].

The fast Padé transform: An advanced signal processor well-suited to MRS and MRSI
The fast Padé transform (FPT) is a high-resolution, non-linear signal processor with parametric and non-parametric forms. It is well-suited to MRS and MRSI [19][20][21][22], and can solve the above-described difficulties in breast cancer diagnostics. The number of genuine resonances contained in an encoded MRS time signal is handled as a parameter by the FPT, and is ascertained exactly. The complexvalued fundamental frequencies and amplitudes are precisely reconstructed for each resonance. With these parametric data, the key clinical information, the metabolite concentrations are accurately computed, even when resonances closely overlap [19,[21][22]. The spectrum generated by the FPT is the quotient of two polynomials (P/Q). There are two equivalent variants: and inside and outside the unit circle for the causal and anti-causal representation, respectively [19][20][21][22]. These two variants provide a fully self-contained checking procedure. The FPT algorithmically separates the genuine and spurious resonances by identifying pole-zero confluences. This is vital when there are many spurious resonances as in most MRS quantification problems, since over-determined systems of linear equations are encountered. The FPT-generated spectrum as the quotient of two polynomials (P/Q) produces a unique set of genuine spectral poles and zeros. The denominator polynomial Q is associated with the system poles and represents the positions (chemical shifts) and widths of peaks in a spectrum. The numerator polynomial P is associated with the valleys between adjacent peaks, and the system zeros are described thereby. Coincident poles and zeros, termed Froissart doublets [23], are unstable, roaming stochastically in the complex frequency plane with the slightest perturbation, and they never converge. Consequently, these spurious resonances are akin to noise which must be identified and discarded from the final results of the analysis. The two polynomials P and Q provide two degrees of freedom which facilitate noise suppression, as follows. In the Padé-reconstructed spectrum , the total number of resonances K´ is larger than its genuine counterpart K in the unknown ratio P K / Q K . The overestimation K = K´ K is equi-partitioned between and , such that and will have an equal number of the same spurious resonances in . This excess, spurious content cancels out in . After this cancellation, only the true information P K / Q K remains in the reconstructed data. Further details are given Refs. [19,[23][24][25][26][27][28].

Applications of the FPT to MRS time signals from breast and rationale for the present study
We have applied the FPT to noise-free MRS time signals as encoded from extracted breast specimens from normal breast, fibroadenoma and breast cancer [15]. In all three cases, the FPT resolved and precisely quantified the nine resonances, including PC and PE that were almost completely overlapping [4,19,26]. Many spurious resonances were also produced; at convergence 750 resonances were retrieved, i.e. 741 spurious plus the 9 physical resonances. Although fewer than 2% of the resonances were genuine, the FPT delineated these two categories with certainty in all cases. The FPT was then applied to MRS time signals associated with breast cancer and benign breast pathology (fibroadenoma) with added noise (σ = 0.0289 RMS, where RMS is the root-mean-square of the noise-free time signal, RMS = 1/2 and σ is the standard deviation). Convergence was achieved in both cases at = 1700, with accurate reconstruction of the spectral parameters for all 9 genuine metabolites. Also, 841 spurious resonances were generated and were recognized as Froissart doublets with pole-zero confluences and zero-valued amplitudes, together with marked instability in the face of changes in noise level or signal length [27,28]. By this three-fold "signature" the nonphysical content is distinguished from physical resonances [7,23,27,28].
Based upon these results, we herein apply the FPT to noise-corrupted time signals from breast without pathology to provide normative data. Of particular interest is the capability of the FPT to resolve the resonances between 3.21 ppm and 3.23 ppm, where, as noted, elevated phosphocholine is a marker of malignant transformation [16,18]. We assess the performance of the FPT in handling noise-corrupted MRS time signals in this controlled setting, as a key step towards clinical application of Padé-optimized MRS for breast cancer diagnostics.

The input data: an MRS time signal as encoded from normal, non-infiltrated breast
The time signal of the typical quantum-mechanical form was generated according to: where N is the total signal length. These input data are based upon encoded time signals from tissue samples of normal, non-infiltrated breast from twelve patients, as per Ref. [15]. Each c n is a sum of K = 9 damped complex exponentials Herein, k and d k are the fundamental angular frequencies and amplitudes; k = 2 k , where k is the linear frequency. The time signals of the form (1) were quantified via the FPT, as per Refs. [19][20][21][22].
Since k is complex-valued, such that for Im( k ) > 0, as in (1), quantity c n decreases exponentially over time n (n = 0, 1, 2,…, N 1). Thus, exp (in k ) = exp (-n k + i n k ), where and . The rhs of Eq. (1) is then a linear combination of K exponentially attenuated frequency-dependent cosinusoids and sinusoids each of which is premultiplied by the constant, real intensity , (2) where is the phase of the complex amplitude . Herein, for conciseness, we present explicitly only the reconstructions of the . The converged findings are identical to those obtained from the [19][20][21][22]. The diagonal (M = K ) and non-diagonal (M K ) complex-valued spectra in the are: Here, the symbol signifies the error which is a series in powers . Thus, the approximation in the general includes exactly signal points from the input data = with . The MRS time signals from Ref. [15] of length N = 65536 were recorded at a Larmor frequency ( L ) of 600 MHz and with a static magnetic field strength . A bandwidth (BW) of 6 MHz was used, where the inverse of this BW is the sampling time . The nine resonances identified in Ref. [15] were grouped into two frequency bands. The first was from 1.3 ppm to 1.5 ppm and the second from 3.2 ppm to 3.3 ppm. Within the second band were seven resonances, including two which very closely overlapped at 3.22 ppm: phosphocholine, PC resonance #4 and phosphoethanolamine, PE resonance #5 separated by 0.001 ppm. In the present study, we used a 32-fold shorter time signal of total length N = 2048. The amplitudes d k were computed based upon the reported concentrations of metabolites C met for normal non-infiltrated breast tissue from twelve patients [15] (ww) of the tissue. In Ref. [15], the relaxation times were not reported. We set the line widths (full-widths at half-maximum, FWHMax) to be from 0.0008 to 0.0009 ppm. The peaks are represented by Lorentzians, consistent with the time signal from Eq. (1). In the absorption mode, we have used mainly the diagonal, and paradiagonal, forms, since these two variants for l = 0 and l = 1 are empirically observed to be more stable compared to those using other values of l (l = 2, 3, …) in . The line widths are proportional to Im ( k ). The smallest chemical shift difference of 0.001 ppm is only 1.11 to 1.25 times larger than the line widths of 8 x 10 -4 to 9 x 10 -4 ppm.

Addition of noise corruption to the input data
We performed the reconstructions using the input data corrupted by noise. The latter were generated by adding complex-valued random zero-mean Gauss-distributed white noise of preassigned levels to the noiseless MRS time signal. The noise levels were σ = 0.00289 RMS and σ = 0.0289 RMS with, as noted, RMS being the root-mean-square of the noise-free time signal.

Comparisons of the FPT and the FFT in non-parametric signal processing
It is only for total shape spectra that the Padé and Fourier spectra can directly be compared, because as a non-parametric estimator, that is all the FFT can compute. The Fourier spectrum is given by: where 2 k/T are fixed Fourier grid frequencies that are unrelated to k from c n . Here, T is the total duration or acquisition time of the signal, T = N . The variable (6) is an undamped harmonic. The non-parametric data are provided by the FPT when the Padé polynomials and are extracted from the time signal {c n }. The ratio is the complex-valued total shape spectrum whose real part is the absorption spectrum. The FPT generates the total shape spectrum at any given set of sweep frequencies that need not correspond to the preassigned Fourier grid [19,20]. Thus, the FPT has interpolation capabilities relating directly to the features of the analyzed time signal.

Quantification of MRS time signals from normal, non-infiltrated breast by the parametric FPT
The MRS data are quantified by the FPT via polynomial rooting, whereby the roots of the characteristic equations of the numerator (P K ) and the denominator (Q K ) polynomials generate the zeros and poles of the Padé spectrum , respectively. Since the rational function is meromorphic, the zeros of are the poles of . Meromorphic functions have poles as the only singularities. Roots of equation reconstruct the fundamental or eigen-frequencies . The amplitude is given via the analytical expression for the Cauchy residue of taken at the k th pole . Although the quantification problem entails finding the eigen-set , it is not the concluding part of spectral analysis, since this does not generate the eigenset with K being the true number of resonances. Rather, because an over-determined system of linear equations is set up, a larger collection is produced where is the sum of K genuine and spurious resonances . Thus, we obtain instead of . To omit the false resonances that are not the constituents of the K input eigen-frequencies and the K associated eigenamplitudes, we root the numerator polynomial via to obtain the zeros of the spectrum . A subsequent binning is carried out according to stable versus unstable resonances as a function of the varying degree of polynomials and . Finally, the exact number K of genuine resonances is reconstructed from after discarding the unstable resonances as spurious.

Results
We systematically augmented the signal length for the same bandwidth (i.e. 3 acquisition times) to check the constancy of the spectral parameters. The reconstructed spectral parameters and metabolite concentrations are shown in panels (2) to (4) of table 1 at total orders K = 350, 850 and 1020. The partial signal length N P is given by N P = 2K for total signal length N = 2048. In panel (2) of table 1, at N P = 700, only 8 of the 9 genuine resonances were resolved. Between 3.220 ppm and 3.221 ppm only 1 resonance was generated at 3.221 ppm, whose computed concentration is the sum of PC + PE ("4 + 5"). At N P =1700 convergence was attained for all the resonances, such that both PC and PE are fully resolved with their correct amplitudes and computed concentrations (panel (3)). At all higher N P and at full signal length N convergence remained stable, as illustrated in the bottom panel (4) of table 1 at N P = 2040. Figure 1 summarizes key clinical MRS data for normal, non-infiltrated breast tissue. In panel (1) the input data, the real part (Re) of the complex time signal is shown, as per the encoded data from normal non-infiltrated breast reported in Ref. [15]. To the noiseless time signal, we added Gaussiandistributed noise with σ = 0.0289 RMS. The imaginary part (Im) of the time signal is similar (omitted). The output data are shown either for the interval between 3.2 ppm and 3.3 ppm or [3.16, 3.34] ppm. These are presented as spectra in panels (2), (3), (4) and as metabolite concentrations in panel (5). The absorption total shape spectrum reconstructed via the FFT in panel (2) at full signal length (N = 2048) is totally inaccurate and non-convergent, with only a few rough peaks, while the FPT has converged at N P = 1700 on panel (3) (total shape spectrum) and panel (4) (component shape spectra) where PC is seen to completely underlie PE. The converged Padé-generated concentration map at N P = 1700 is presented on panel (5), wherein the concentrations of all the metabolites, including PC are accurately computed. Signal-noise separation is presented in panels (6) and (7) where FWHMax (full width at half maximum: spectral poles) equals FWHMin (full width at half minimum: spectral zeroes) for non-physical data symbolized as ๏ (Froissart doublets). These show marked instability at the two noise levels: panel (6) ( = 0.00289 RMS) and panel (7) ( = 0.0289 RMS). The zero peak heights in (8) (the abscissa crosses the centers of the empty circles) are the 3rd "signature" of spurious resonances. In contradistinction, for physical resonances FWHMax FWHMin. For such results, the reconstructed and input poles (O and ) coincide, as indicated by . These are stable against the two noise levels differing by a factor of 10 in panels (6) and (7). Also, the related peak heights are non-zero (panel (8)). Although such true structures may show pole-zero near-coincidences and some of the peak heights (GPC, PC and CHO) are close to zero, they show no change for different noise levels, as seen in the dash-lined box in panels (6) and (7).

Discussion and Conclusions
The algorithmic success of the FPT is a direct result of its quantum mechanical underpinnings, whereby the frequency spectrum is represented as the unique ratio of two polynomials as in the finiterank Green's function [19,20]. Its superior resolution compared to the FFT demonstrated herein and elsewhere [19,20,22,27,28] has several reasons. One is the interpolation capability of the FPT. This occurs because a spectrum in the , given by a polynomial quotient where can be taken at any sweep frequency ω. Consequently, frequency ω can be selected independently of the employed number of signal points c n . Thus, the FPT achieves interpolation in spectra, based on the actual analyzed time signal, whereas the minimal separation ω min in an FFT spectrum is determined by the total acquisition time T. The FFT spectrum is defined only on the Fourier grid points (k = 0,1,2,3,…, N 1), where N is the total signal length. Thus, the FFT cannot interpolate. The superfluous data (zeros) added in the encoding domain do not improve resolution in the Fourier spectrum. Further, the FPT can extrapolate beyond the N input signal points because the implicitly present inverse of is an infinite sum (Maclaurin series). The FPT has an extra degree of freedom to cancel out noise from the numerator and the denominator , whereas       breast [15]. Input data: time signal in panel (1). Output data: absorption spectra in panels (2), (3) and (4); concentrations of metablites in panel (5); signal-noise separation via Froissart doublets in panels (6), (7) and (8). Symbols: input data (+) and output data (O , •). To avoid clutter in panels (6) and (7), the full width at the half maximum, FWHM, is doubled at frequency 3.220 ppm (as PC * ). For more details, see the main text. in the FFT, as a single polynomial and being linear, noise is imported intact from the measured time domain to the computed frequency spectrum [19,20]. Due to the superior resolution performance of the FPT, fewer and shorter MRS time signals could be encoded, thereby diminishing examination time. This translates into greater comfort for the patient and lower cost. For normal, non-infiltrated breast, the concentration of PC was 15% that of the overlying PE, whereas for the other two cases (fibroadenoma and breast cancer), the relative concentrations were 30% [15]. Moreover, the PC concentration was less than 20% of that for fibroadenoma [19,26]. Nevertheless, the PC concentration, albeit very low, was computed precisely by the FPT in the presence of moderate levels of noise. This genuine metabolite was clearly distinguished from the numerous spurious resonances by its stability in the face of different noise levels, as well as at different partial signal lengths after convergence. All the other genuine metabolites were also confidently identified, with their concentrations precisely computed by the FPT, even though several (notably the other components of total choline) were likewise only sparsely present. Precisely the capabilities of the FPT demonstrated here help pave the way for developing urgently-needed normative data bases, to which comparisons could be confidently made with various benign breast pathologies, as well as with breast cancer of various types.
The results presented herein provide further evidence that mathematical optimization through the FPT will be of vital importance for realizing the potential of magnetic resonance spectroscopy, MRS, and spectroscopic imaging, MRSI, for early cancer diagnostics, in particular for breast cancer. As long as MRS and MRSI rely upon the FFT, plus post-processing via fitting, breast cancer cannot be distinguished from normal, non-infiltrated breast nor from benign breast lesions with sufficient confidence. Moreover, the low resolution of the FFT renders MRS and MRSI of the breast extremely time-consuming and therefore uncomfortable for the patient and costly. Encoding for MRS entails poor SNR as a major problem. Exclusive reliance upon the FFT requires encoding many transients ( 200) to be averaged for improving SNR. This keeps the patient longer in the scanner, resulting in increased inhomogeneity of the static magnetic field. During this extended encoding time, the patient moves, so that more and more transients are likely not to be associated with the same intended voxel. This leads to incoherent averaging, when adding together these different time signals. Hence, this illustrates the need to reformulate the entire concept of encoding in MRS by relying upon another theory which requires fewer encoded transients ( 20) and yet improves SNR and resolution. The FPT is optimally poised to serve this purpose critical to the use of MRS in daily clinical practice.
The closely-overlapping resonances, notably the components of total choline, within the MR spectra of the breast appear to be of crucial diagnostic importance for early cancer diagnostics. The FFT plus non-unique fitting procedures cannot provide any certainty about these overlapping resonances. Thus, even at high magnetic field strengths, proton MRS has thus far only generated estimates of total choline. With these, only incremental improvement are seen compared to anatomical imaging with MRI, in the accuracy with which breast cancer is distinguished from non-malignant breast tissue.
The high-resolution, quantification-equipped FPT represents a paradigm shift for signal processing. This has been confirmed in interdisciplinary research fields and applications, where generic time signals comprised of complex damped harmonics are routinely encountered. These capacities of the FPT are also shown herein. Namely, in the presence of realistic noise levels, the FPT clearly distinguishes spurious resonances from genuine metabolites and exactly computes the concentrations of the latter, including phosphocholine, PC, albeit only minimally present for normal, non-infiltrated breast. The high resolution of the FPT together with its enhancement of SNR yields shorter examination time, thus increasing patient comfort plus lowering costs. Particularly with closelyoverlapping resonances in the MR spectra, convergence is attained by a greatly over-determined system of linear equations. The signal-noise separation procedure uniquely linked to the FPT is the means by which genuine i.e. physical resonances are identified and the metabolite concentrations precisely computed. Simultaneously, the same procedure identifies the much more numerous spurious resonances via coincident poles and zeros (Froissart doublets), accompanied by zero or near zero amplitudes and marked instability against various levels of truncation and/or noise. The results presented in this paper in which normative data for the breast are generated by the FPT in the presence of noise, bring us one step closer to applying the fast Padé transform to time signals encoded in vivo from normal, benign and cancerous breast in order for MRS and MRSI to realize their potential to, at last, become a reliable, cost-effective method for breast cancer diagnostics.