An algorithm for constrained one-step inversion of spectral CT data

We develop a primal-dual algorithm that allows for one-step inversion of spectral CT transmission photon counts data to a basis map decomposition. The algorithm allows for image constraints to be enforced on the basis maps during the inversion. The derivation of the algorithm makes use of a local upper bounding quadratic approximation to generate descent steps for non-convex spectral CT data discrepancy terms, combined with a new convex-concave optimization algorithm. Convergence of the algorithm is demonstrated on simulated spectral CT data. Simulations with noise and anthropomorphic phantoms show examples of how to employ the constrained one-step algorithm for spectral CT data.

Use of energy information in X-ray CT has been proposed almost since the conception of CT itself [12]. Dual-energy CT acquires transmission intensity at either two energy windows or for two different X-ray source spectra. Despite the extremely coarse energy resolution, the technique is effective because, for many materials, only two physical processes, the photo-electric effect and Compton scattering, dominate X-ray attenuation in the diagnostic energy range [2]. Within the context of dual-energy CT, the processing methods for energy-windowed intensity data have been classified in two broad categories: pre-reconstruction and post-reconstruction [13]. The majority of processing methods for multi-window data also fall into these categories.
In pre-reconstruction processing of the multi-energy data, the X-ray attenuation map is expressed as a sum of terms based on physical process or basis materials [2]. The multi-energy data are converted to sinograms of the basis maps, then any image reconstruction technique can be employed to invert these sinograms. The basis maps can subsequently be combined to obtain images of other desired quantities: estimated X-ray attenuation maps at a single energy, material maps, atomic number, or electron density maps. The main advantage of pre-reconstruction processing is that beam-hardening artifacts can be avoided, because consistent sinograms of the basis maps are estimated prior to image reconstruction. Two major challenges for pre-reconstruction methods are the need to calibrate the spectral transmission model and to acquire registered projections. Photon-counting detectors ease the implementation of projection registration, because multiple energy-thresholding circuits can operate on the same detection element signal. Accounting for detection physics and spectral calibration by data pre-processing or incorporation directly in the image reconstruction algorithm remains a challenge for photon-counting detectors [1].
For post-reconstruction processing, the energy-windowed transmission data are processed by the standard negative logarithm to obtain approximate sinograms of a weighted energy-averaged attenuation map, followed by standard image reconstruction. The resulting images can be combined to obtain approximate estimates of images of the same physical quantities as mentioned above.

November 12, 2015 DRAFT

[...] linear spectral CT data model; and Sec. IV demonstrates the proposed algorithm with simulated spectral CT transmission data.

II. ONE-STEP IMAGE RECONSTRUCTION FOR SPECTRAL CT
A. Spectral CT data model

For the present work, we employ a basic spectral model for the energy-windowed transmitted X-ray intensity along a ray ℓ, where the transmitted X-ray intensity in the energy window w for ray ℓ is given by

I_wℓ = ∫ dE S_wℓ(E) exp( −∫_ℓ dt µ(E, r(t)) ).    (1)

Here t ∈ ℓ denotes that we are integrating along the ray while E integrates over the range of energy; S_wℓ(E) is the product of the X-ray beam spectrum intensity and detector sensitivity for the energy window w and transmission ray ℓ at energy E; and µ(E, r) is the linear X-ray attenuation coefficient for energy E at the spatial location r. Let I(0)_wℓ be the transmitted intensity in the setting where no object is present between the X-ray beam and the detector (i.e. attenuation is set to zero), given by

I(0)_wℓ = ∫ dE S_wℓ(E),    so that    S_wℓ(E) = I(0)_wℓ s_wℓ(E),

where s_wℓ(E) represents the normalized energy distribution of X-ray intensity and detector sensitivity. Image reconstruction for spectral CT aims to recover the complete energy-dependent linear attenuation map µ(E, r) from intensity measurements I_wℓ in all windows w and rays ℓ comprising the X-ray projection data set.
Throughout the article we use the convention that N_x is the dimension of the discrete index x. For example, the spectral CT data set consists of N_w energy windows and N_ℓ transmission rays.
This inverse problem is simplified by exploiting the fact that the energy dependence of the X-ray attenuation coefficient can be represented efficiently by a low-dimensional expansion. For the present work, we employ the basis material expansion

µ(E, r) = Σ_m (µ_m(E)/ρ_m) ρ_m f_m(r) = Σ_m µ_m(E) f_m(r),

where ρ_m is the density of material m; the X-ray mass attenuation coefficients µ_m(E)/ρ_m are available from the National Institute of Standards and Technology (NIST) report by Hubbell and Seltzer [23]; and f_m(r) is the fractional density map of material m at location r. For the present spectral CT image reconstruction problem, we aim to recover the f_m(r), which we refer to as the material maps.
Proceeding with the spectral CT model, we discretize the material maps f_m(r) by use of an expansion set,

f_m(r) = Σ_k f_km φ(map)_k(r),

where the φ(map)_k(r) are the representation functions for the material maps. For the 2D/3D image representation standard pixels/voxels are employed, that is, k indexes the pixels/voxels.
With the spatial expansion set, the line integration over the material maps is represented by a matrix X with entry X_ℓk measuring the length of the intersection between ray ℓ and pixel k:

(X f_m)_ℓ = Σ_k X_ℓk f_km,    where formally    X_ℓk = ∫_ℓ dt φ(map)_k(r(t)).

This integration results in the standard line-intersection method for the pixel/voxel basis.
The discretization of the integration over energy E in Eq. (1) is performed by use of a Riemann sum approximation,

I_wℓ = I(0)_wℓ Σ_i s_wℓi exp( −Σ_m µ_mi (X f_m)_ℓ ),

where i indexes the discretized energy E_i and µ_mi = µ_m(E_i). With the Riemann sum approximation we normalize the discrete window spectra,

Σ_i s_wℓi = 1.

Modeling photon-counting detection, we express X-ray incident and transmitted spectral fluence in terms of numbers of photons per ray (as before, the ray ℓ identifies the source/detector-bin combinations) and energy window w:

ĉ_wℓ(f) = N_wℓ Σ_i s_wℓi exp( −Σ_m µ_mi (X f_m)_ℓ ),    (3)

where N_wℓ is the incident spectral fluence and ĉ_wℓ is interpreted as a mean transmitted fluence.
Note that in general the right-hand side of Eq. (3) evaluates to a non-integer value, and as a result the left-hand side variable cannot be assigned to an integer as would be implied by reporting transmitted fluence in terms of numbers of photons. This inconsistency is rectified by interpreting the left-hand side variable, ĉ_wℓ, as an expected value.
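The discrete counts model of Eq. (3) can be sketched in a few lines of code. This is a minimal illustration, not the paper's implementation; the array shapes and the toy spectrum and attenuation values are assumptions made for the example.

```python
import numpy as np

def mean_counts(N_w, s, mu, Xf):
    """Mean transmitted photon counts for one ray (discrete model, Eq. (3)).

    N_w : incident fluence per window, shape (n_windows,)
    s   : normalized window spectra s_wi, shape (n_windows, n_energies)
    mu  : material attenuation mu_mi, shape (n_materials, n_energies)
    Xf  : material line integrals (X f_m)_l for this ray, shape (n_materials,)
    """
    line_atten = mu.T @ Xf                  # sum_m mu_mi (X f_m)_l, per energy i
    return N_w * (s @ np.exp(-line_atten))  # c_hat_w = N_w sum_i s_wi exp(-...)

# toy example: two windows, three energies, two materials (assumed values)
s = np.array([[0.7, 0.3, 0.0], [0.0, 0.4, 0.6]])    # rows sum to 1 (normalized)
mu = np.array([[0.5, 0.3, 0.2], [0.9, 0.6, 0.4]])   # hypothetical mu_mi values
c_hat = mean_counts(np.array([1e4, 1e4]), s, mu, np.array([1.0, 0.5]))
```

With zero attenuation the model returns the incident fluence N_wℓ, and any nonzero line integral attenuates the counts, as the continuous model requires.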

B. Constrained optimization for one-step basis decomposition
For the purpose of developing one-step image reconstruction of the basis material maps from transmission counts data, we formulate a constrained optimization involving minimization of a non-convex data-discrepancy objective function subject to convex constraints. The optimization problem of interest takes the form

f* = arg min_f { Σ_w D(c_w, ĉ_w(f)) + Σ_i δ(f ∈ P_i) },    (4)

where c_wℓ are the measured counts in energy window w and ray ℓ; D(·, ·) is a generic data discrepancy objective function; and the indicator functions δ(P_i) enforce the convex constraints f ∈ P_i, where the P_i are convex sets corresponding to the desired constraints (for instance, non-negativity of the material maps). The indicator function is defined

δ(f ∈ P) = 0 if f ∈ P;  ∞ otherwise.    (5)

Use of constrained optimization with TV constraints is demonstrated in Sec. IV.

Data discrepancy functions:
For the present work, we consider two data discrepancy functions: transmission Poisson likelihood (TPL) and least-squares (LSQ),

D_LSQ(c_w, ĉ_w(f)) = (1/2) Σ_ℓ ( c_wℓ − ĉ_wℓ(f) )²,
D_TPL(c_w, ĉ_w(f)) = Σ_ℓ [ ĉ_wℓ(f) − c_wℓ + c_wℓ log( c_wℓ / ĉ_wℓ(f) ) ].

The TPL data discrepancy function is derived from the negative log likelihood of a stochastic model of the counts data; that is, minimizing D_TPL is equivalent to maximizing the likelihood. Note that in defining D_TPL we have subtracted a term independent of f from the negative log likelihood so that D_TPL is zero when c = ĉ(f), and positive otherwise. From a physics perspective, the important difference between these two data discrepancy functions is how they each weight the individual measurements; the LSQ function treats all measurements equally while the TPL function gives greater weight to higher count measurements. We point out this property to emphasize that the TPL data discrepancy can be useful even when there are data inconsistencies due to other physical factors besides the stochastic nature of the counts measurement. This alternate weighting is also achieved without introducing additional parameters, as would be the case for a weighted quadratic data discrepancy. From a mathematics perspective, both data discrepancy functions are convex in ĉ_w(f), but they are non-convex functions of f. It is the non-convexity with respect to f that drives the main theoretical and algorithmic development of this work. Although we consider only these two data fidelities, the same methods can be applied to other functions.
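The two discrepancy functions above can be sketched directly; this is a minimal illustration with toy count values, using the convention 0·log 0 = 0 so that zero-count measurements are handled.

```python
import numpy as np

def d_lsq(c, c_hat):
    # least-squares (LSQ): all measurements weighted equally
    return 0.5 * np.sum((c - c_hat) ** 2)

def d_tpl(c, c_hat):
    # transmission Poisson likelihood (TPL), shifted so that D = 0 at c_hat = c;
    # the convention 0 * log(0) = 0 handles zero-count measurements
    safe_c = np.where(c > 0, c, 1.0)
    term = np.where(c > 0, c * np.log(safe_c / c_hat), 0.0)
    return np.sum(c_hat - c + term)

c = np.array([100.0, 10.0, 0.0])      # measured counts (toy values)
c_hat = np.array([90.0, 12.0, 1.0])   # model counts (toy values)
```

Both functions vanish when the model matches the data exactly and are positive otherwise, consistent with the shifted negative log likelihood described in the text.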

Convex constraints:
The present algorithm framework allows for convex constraints that may improve reconstruction of the basis material maps. In Eq. (4) the constraints are coded with indicator functions, but here we express the constraints by the inequalities that define the convex set to which the material maps are restricted. When the basis materials are identical to the materials actually present in the subject, the basis maps can be highly constrained. Physically, the fractional densities represented by each material map must take on a value between zero and one, and the corresponding constraint is

0 ≤ f_km ≤ 1 for all k, m.

Similarly, the sum of the fractional densities cannot be greater than one, leading to a constraint on the sum of the material maps,

Σ_m f_km ≤ 1 for all k.

Care must be taken, however, in using these bound and sum constraints when the basis materials used for computation are not the same as the materials actually present in the scanned object.
The bounds on the material maps and their sum will likely need to be loosened, and therefore they may not be as effective.
In medical imaging, where multiple soft tissues comprise the subject, it is standard to employ a spectral CT materials basis which does not include many of the tissue/density combinations present. The reason for this is that soft tissues such as muscle, fat, brain, blood, etc., all have attenuation curves similar to water, and recovering each of these soft tissues individually becomes an extremely ill-posed inverse problem. For spectral CT, it is common to employ a two-material expansion set, such as bone and water, and possibly a third material for representing contrast agent that has a K-edge in the diagnostic X-ray energy range. The displayed image can then be the basis material maps or the estimated X-ray attenuation map at a single energy E, also known as a monochromatic image,

µ_k(E) = Σ_m µ_m(E) f_km.    (10)

A non-negativity constraint can be applied to the monochromatic image at one or more energies. This constraint makes physical sense even when the basis materials are not the same as the materials in the subject.
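Forming a monochromatic image from the material maps, per Eq. (10), is a single weighted sum; a minimal sketch, with hypothetical attenuation values and a flattened pixel index:

```python
import numpy as np

def monochromatic_image(f, mu_E):
    """mu_k(E) = sum_m mu_m(E) f_km  (Eq. (10)).
    f    : material maps, shape (n_materials, n_pixels)
    mu_E : basis-material attenuation at energy E, shape (n_materials,)"""
    return mu_E @ f

# two materials, four pixels; values are assumptions for illustration
f = np.array([[1.0, 0.0, 0.5, 0.0],    # e.g. a "bone" map
              [0.0, 1.0, 0.5, 0.0]])   # e.g. a "brain" map
mono = monochromatic_image(f, np.array([0.6, 0.2]))  # mu_m(E) at some energy E
```

The non-negativity constraint on the monochromatic image described above amounts to requiring `mono >= 0` for the chosen energies, even when individual material map values are unconstrained in sign.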
Finally, we formulate ℓ1-norm constraints on the gradient magnitude images, also known as the total variation (TV), in order to encourage gradient magnitude sparsity in either the basis material maps or the monochromatic image. In applying TV constraints to the basis material maps, we allow for different constraint values γ_m for each material,

‖ |∇f_m| ‖₁ ≤ γ_m,

where ∇ represents the finite-differencing approximation to the gradient, and we use |·| to represent a spatial magnitude operator so that |∇f_m| is the gradient magnitude image (GMI) of material map m. Similarly, a TV constraint can be formulated so that it applies to the monochromatic image at energy E,

‖ |∇µ(E)| ‖₁ ≤ γ_mono,

where the constraint can be applied at one or more values of E.
The constraints involving TV of the material maps and the monochromatic image are demonstrated specifically in Sec. IV. Many other convex constraints can be incorporated into the presented framework, such as constraints on a generalized TV computed from multiple monochromatic images [24].
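The TV quantity being constrained can be sketched concretely. This is a minimal forward-difference version (with replicated-edge boundary handling); the discretization conventions are an assumption, not the paper's exact choice.

```python
import numpy as np

def gradient_magnitude(f):
    # forward-difference gradient; appending the last row/column replicates
    # the edge so the boundary differences are zero
    dx = np.diff(f, axis=0, append=f[-1:, :])
    dy = np.diff(f, axis=1, append=f[:, -1:])
    return np.sqrt(dx ** 2 + dy ** 2)   # the gradient magnitude image (GMI)

def tv(f):
    # total variation: l1 norm of the gradient magnitude image
    return gradient_magnitude(f).sum()
```

A constraint of the form ‖|∇f_m|‖₁ ≤ γ_m then restricts `tv(f_m)` to lie at or below γ_m; for a piecewise-constant map, TV counts boundary edges weighted by the jump size, which is why the constraint promotes gradient magnitude sparsity.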

III. A FIRST-ORDER ALGORITHM FOR SPECTRAL CT CONSTRAINED OPTIMIZATION
The proposed algorithm derives from the primal-dual algorithm of Chambolle and Pock (CP) [25], [26], [27]. Considering the general constrained optimization form in Eq. (4), the second term coding the convex constraints can be treated in the same way as shown in Refs. [28], [29].
The main algorithmic development, presented here, is the generalization and adaptation of CP's primal-dual algorithm to the minimization of the data discrepancy term, the first term of Eq. (4).
We derive the data fidelity steps, focusing specifically on the steps for D_TPL and D_LSQ.
Optimizing the spectral CT data fidelity: We first sketch the main developments of the algorithm for minimizing the non-convex data discrepancy terms, and then explain each step in detail. The overall design of the algorithm is comprised of two nested iteration loops. The outer iteration loop involves derivation of a convex quadratic upper bound to the local quadratic Taylor expansion about the current estimate for the material maps. The inner iteration loop takes descent steps for the current quadratic upper bound. Although the algorithm construction formally involves two nested iteration loops, in practice the number of inner loop iterations is set to one. Thus, effectively the algorithm consists only of a single iteration loop where a re-expansion of the data discrepancy term is performed at every iteration.
The local convex quadratic upper bound, used to generate descent steps for the non-convex data discrepancy terms, does not fit directly with the generic primal-dual optimization form used by CP. A convex-concave generalization to the CP primal-dual algorithm is needed. The resulting algorithm, called the mirrored convex-concave (MOCCA) algorithm, is presented in detail in Ref. [30]. Writing L(f) for either data discrepancy viewed as a function of the material maps, the local quadratic Taylor expansion about an expansion center f0 is

L(f) ≈ L(f0) + ∇_f L(f0)ᵀ (f − f0) + (1/2)(f − f0)ᵀ ∇²_f L(f0) (f − f0).    (11)

To obtain the desired expansions, we need expressions for the gradient and Hessian of each data discrepancy. The gradient of L_TPL(f) is derived explicitly in Appendix A; we do not show the details for the other derivations. The data discrepancy gradients are

∇_f L_TPL(f) = Zᵀ A(f)ᵀ r,    (12)
∇_f L_LSQ(f) = Zᵀ A(f)ᵀ diag(ĉ(f)) r,    (13)

where r and r(log) denote the residuals in terms of counts or log counts,

r = c − ĉ(f),    r(log) = log c − log ĉ(f);

Z represents the combined linear transform that accepts material maps, performs projection, and then combines the resulting sinograms to form monochromatic sinograms at energy E_i,

(Zf)_iℓ = Σ_m µ_mi (X f_m)_ℓ;

and A(f) is a term that results from the gradient of the logarithm of the estimated counts log ĉ(f),

(A(f))_(wℓ),(iℓ) = N_wℓ s_wℓi exp( −(Zf)_iℓ ) / ĉ_wℓ(f).
Using the same variable and linear transform definitions, expressions for the two Hessians, ∇²_f L_TPL(f) and ∇²_f L_LSQ(f), can be derived (Eqs. (18) and (19)). Substituting either Eq. (18) or (19) for the Hessian and either Eq. (12) or (13) for the gradient into the Taylor expansion in Eq. (11) yields the quadratic approximation to the data discrepancy terms of interest. This quadratic is in general non-convex because both Hessian expressions can have negative eigenvalues.
The Hessian is accordingly split into the difference of two positive semidefinite matrices,

∇²_f L(f) = ∇²₊L(f) − ∇²₋L(f),

where ∇²₊L(f) and ∇²₋L(f) are both positive semidefinite (see Appendix B for more details); the resulting split expressions for the TPL and LSQ cases are given in Eqs. (20)-(23). To summarize, the expression for the convex local upper bound to the quadratic approximation in Eq. (11) is

Q(c, f0; f) = L(f0) + ∇_f L(f0)ᵀ (f − f0) + (1/2)(f − f0)ᵀ ∇²₊L(f0) (f − f0),    (24)

where ∇²₊ is used instead of ∇²_f in the quadratic term. Here Q depends parametrically on the counts data c, through the function L (see Eq. (11)), and on the expansion center f0. The gradients of L at f0 are obtained from Eqs. (12) and (13), and the Hessian upper bounds are available from Eqs. (20) and (22). Note that the quadratic expression in Eq. (24) is not necessarily an upper bound of the data discrepancy functions, even locally, because we bound only the quadratic expansion. We employ the convex function Q(c, f0; f) combined with convex constraints to generate descent steps for the generic non-convex optimization problem specified in Eq. (4).
B. The motivation and application of MOCCA

1) Summary of the Chambolle-Pock (CP) primal-dual framework: The generic convex optimization addressed in Ref. [25] is

min_x F(Kx) + G(x),    (25)

where F and G are convex, possibly non-smooth, functions and K is a matrix multiplying the vector x. The ability to handle non-smooth convex functions is key for addressing the convex constraints of Eq. (4). In the primal-dual picture this minimization is embedded in a larger saddle point problem,

min_x max_y ⟨Kx, y⟩ − F*(y) + G(x),    (26)

using the Legendre transform, or convex conjugation,

F*(y) = max_z ⟨y, z⟩ − F(z),    (28)

and the fact that F** = F if F is a convex function. The CP primal-dual algorithm of interest solves Eq. (26) by iterating on the following steps:

y^(n+1) = arg min_y { F*(y) − ⟨y, K x̄^(n)⟩ + (1/(2σ)) ‖y − y^(n)‖² },    (29)
x^(n+1) = arg min_x { G(x) + ⟨y^(n+1), Kx⟩ + (1/(2τ)) ‖x − x^(n)‖² },    (30)
x̄^(n+1) = 2 x^(n+1) − x^(n),    (31)

where n is the iteration index; σ > 0 and τ > 0 are the primal and dual step sizes, respectively, and these step sizes must satisfy the inequality στ‖K‖² < 1, where ‖K‖₂ is the largest singular value of K. Because this algorithm solves the saddle point problem, Eq. (26), one obtains the solution to the primal problem, Eq. (25), along with its Fenchel dual problem, Eq. (32). The fact that both Eqs. (25) and (32) are solved simultaneously provides a convergence check: the primal-dual gap, the difference between the objective functions of Eqs. (25) and (32), tends to zero as the iteration number increases.
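The CP iteration above can be illustrated on a toy problem, min_x (1/2)‖Kx − b‖², for which F(z) = (1/2)‖z − b‖² so that F*(y) = (1/2)‖y‖² + ⟨b, y⟩ has a closed-form dual step, and G = 0. This is a sketch of the generic iteration of Eqs. (29)-(31), not the paper's spectral CT instance.

```python
import numpy as np

def cp_least_squares(K, b, n_iter=2000):
    """Chambolle-Pock iterations for min_x 0.5*||Kx - b||^2.
    Here F(z) = 0.5*||z - b||^2, G = 0; the dual update is the closed-form
    prox of F*(y) = 0.5*||y||^2 + <b, y>."""
    sigma = tau = 0.9 / np.linalg.norm(K, 2)   # sigma*tau*||K||^2 = 0.81 < 1
    x = np.zeros(K.shape[1])
    x_bar = x.copy()
    y = np.zeros(K.shape[0])
    for _ in range(n_iter):
        y = (y + sigma * (K @ x_bar - b)) / (1.0 + sigma)  # dual step, Eq. (29)
        x_new = x - tau * (K.T @ y)                        # primal step (G = 0)
        x_bar = 2.0 * x_new - x                            # prediction, Eq. (31)
        x = x_new
    return x

x_sol = cp_least_squares(np.array([[2.0, 0.0], [0.0, 1.0]]), np.array([2.0, 1.0]))
```

For this toy K the least-squares solution is x = (1, 1), and the iterates approach it; in the paper's setting F is instead built from the spectral CT quadratic Q and G carries the constraints.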
In some settings, the requirement στ‖K‖² < 1 may be impractical or too conservative, and the CP algorithm can instead be implemented with diagonal matrices Σ and T in place of σ and τ [26], with the condition ‖Σ^(1/2) K T^(1/2)‖ < 1 and the revised steps

y^(n+1) = arg min_y { F*(y) − ⟨y, K x̄^(n)⟩ + (1/2) ‖y − y^(n)‖²_Σ⁻¹ },
x^(n+1) = arg min_x { G(x) + ⟨y^(n+1), Kx⟩ + (1/2) ‖x − x^(n)‖²_T⁻¹ },

where for a positive semidefinite matrix A the norm ‖z‖_A is defined as ‖z‖_A = √(zᵀAz).

2) The need to generalize the CP primal-dual framework: To apply the CP primal-dual algorithm to Q for fixed f0, we need to write Eq. (24) in the form of the objective function in Eq. (25). Manipulating the expression for Q and dropping all terms that are constant with respect to f, we obtain Eq. (36), in which the matrices D and E are nonnegative and depend on c and f0; b is a vector which also depends on c and f0; and K is a matrix that depends only on f0. Both terms of Q are functions of Kf, and accordingly Q is identified with the function F in the objective function of Eq. (25),

F_Q(z) = F_Q+(z) − F_Q−(z),    (37)

where F_Q+(z) and F_Q−(z) are convex functions of z. That F_Q(z) is not convex implies that F_Q cannot be written as the convex conjugate of F*_Q(y), and performing the maximization over y in Eq. (26) no longer yields Eq. (25).
3) Heuristic derivation of MOCCA: To generalize the CP algorithm to allow the case of interest, we consider the function F to be convex-concave,

F(z) = F₊(z) − F₋(z),

where F₊ and F₋ are both convex. The heuristic strategy for MOCCA is to employ a convex approximation formed by linearizing the concave part about a point z0,

F_convex(z0; z) = F₊(z) − ⟨∇F₋(z0), z⟩    (38)

(again we drop terms that are constant with respect to z). We then execute an iteration of the CP algorithm on the convex function F_convex(z0; z), modify the point of convex approximation z0, and repeat the iteration. The question then is how to choose z0, the center for the convex approximation, in light of the fact that the optimization of F in the CP algorithm happens in the dual space with F*; see Eq. (29).
A corresponding primal point to a point in the dual space can be determined by selecting the maximizer of the objective function in the definition of the Legendre transform. Taking the gradient of the objective function in Eq. (28) and setting it to zero yields

y = ∇F(z0).    (39)

We use this relation to find the expansion point for the primal objective function that mirrors the current value of the dual variables.
Incorporating the convex approximation F_convex(z0; z) about the mirrored expansion point z0 into the CP algorithm yields the iteration steps for MOCCA, Eqs. (40)-(43), where F*_convex(z0; y) is the convex conjugate of F_convex(z0; z) with respect to the second argument. The first line obtains the mirror expansion point using Eq. (39), and the right-hand side expression is found by setting to zero the gradient of the objective function in Eq. (41); the second line makes use of the convex approximation F_convex in the form of its convex conjugate; and the remaining two lines are the same as those of the CP algorithm. For the simulations in this article, all variables are initialized to zero. Convergence of MOCCA, the algorithm specified by Eqs. (40)-(43), is investigated in an accompanying paper [30], which also develops the algorithm for a more general setting. We return to the intermediate convex problem, Eq. (36), to which we apply MOCCA, re-expand the spectral CT data discrepancy at the current estimate of the material maps, and iterate this procedure until convergence. We refer to iterations of the core MOCCA algorithm as "inner" iterations, and the process of iteratively re-expanding the data discrepancy and applying MOCCA constitutes the "outer" iterations. Because MOCCA allows for non-smooth terms, the convex constraints described in Sec. II-B can be incorporated, and the inner iterations aim at solving the intermediate problem. For the remainder of this section, for brevity, we drop the constraints and write the update steps accounting only for the spectral CT data fidelity. The full algorithm with the convex constraints discussed in Sec. II-B can be derived using the methods described in [27], and an algorithm instance with TV constraints on the material maps is covered in Appendix C.
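The outer-loop re-expansion idea can be illustrated on a toy difference-of-convex quadratic. Here the splitting H = (H + βI) − βI plays the role of ∇²₊L − ∇²₋L (both parts PSD when β ≥ max(0, −λ_min(H))), and the concave part is linearized at the current estimate at each outer iteration. This is a sketch of the majorization/re-expansion idea only, not MOCCA itself, which additionally carries dual variables; the toy H, g, and β values are assumptions.

```python
import numpy as np

def dc_quadratic_min(H, g, beta, f0, n_outer=300):
    """Minimize q(f) = g.f + 0.5 f.H f by successive convex approximation,
    using the PSD splitting H = H_plus - H_minus with H_plus = H + beta*I and
    H_minus = beta*I. At each outer iteration the concave part -0.5 f.H_minus f
    is linearized at the current estimate and the convex quadratic is solved."""
    n = len(g)
    H_plus = H + beta * np.eye(n)
    f = f0.copy()
    for _ in range(n_outer):
        # convex majorant minimizer: H_plus f = H_minus f_n - g
        f = np.linalg.solve(H_plus, beta * f - g)
    return f

H = np.array([[3.0, 1.0], [1.0, 2.0]])   # toy positive-definite Hessian
g = np.array([-1.0, 1.0])
f_star = dc_quadratic_min(H, g, beta=5.0, f0=np.zeros(2))
```

Any fixed point satisfies (H₊ − H₋)f = −g, i.e. Hf = −g, so the re-expansion scheme recovers the stationary point of the original quadratic; a larger β makes each majorant looser and the outer loop correspondingly slower, which mirrors the convergence-rate trade-off the λ parameter controls in the full algorithm.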
In applying MOCCA to Q(c_w, f0; f), we use the convex and concave components from F_Q in Eq. (37) to form the local convex quadratic expansion needed in MOCCA; see Eq. (38). The corresponding dual function is needed to derive the MOCCA dual update step at Eq. (41). We note that because the material maps f enter Q(c_w, f0; f) only after linear transformation, Kf, comparison with the generic optimization problem in Eq. (25) gives G(f) = 0 for the present case, where we only consider minimization of the data discrepancy.
In using an inner/outer iteration, a basic question is how accurately the inner problem needs to be solved. It turns out that it is sufficient to employ a single inner iteration, so that effectively the proposed algorithm no longer consists of nested iteration loops. Instead, the proposed algorithm performs re-expansion at every iteration, Eqs. (47)-(52). Re-expanding at every step is not guaranteed to converge, and an algorithm control parameter λ is introduced that balances algorithm convergence rate against possibly unstable iteration; see Sec. IV for a demonstration of how λ impacts convergence.
A similar strategy was used together with the CP algorithm for non-convex image regularity norms; see [28].
The first line of the algorithm, Eq. (47), explicitly assigns the current material maps estimate to the new expansion point. In this way it is clear in the following steps whether f^(n) enters the equations through the re-expansion center or through the steps of MOCCA. For the spectral CT algorithm it is convenient to use the vector step sizes Σ^(n) and T^(n), defined in Eq. (48), from the preconditioned form of the CP algorithm [26], because the linear transform K₁(f0) changes at each iteration as the expansion center changes. Computation of the vector step sizes involves only single matrix-vector products of |K₁(f0)| and |K₁(f0)|ᵀ with a vector of ones, 1, as opposed to performing the power method on K₁(f0) to find the scalar step sizes σ and τ, which would render re-expansion at every iteration impractical. In Eq. (48), the parameter λ enters in such a way that the product Σ^(1/2) K T^(1/2) remains constant. For the preconditioned CP algorithm, λ defined in this way will not violate the step-size condition. The dual and primal steps in Eqs. (50) and (51), respectively, are obtained by analytic computation of the minimizations in Eqs. (41) and (42), using Eq. (45) and G(f) = 0, respectively. The primal step at Eq. (51) and the primal variable prediction step at Eq. (52) are identical to the corresponding CP algorithm steps at Eqs. (30) and (31), respectively. The presented algorithm accounts only for the spectral CT data fidelity optimization. For the full algorithm incorporating TV constraints used in the results section, see the pseudocode in Appendix C.
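The vector step sizes described above follow the preconditioned CP rule: Σ from row sums of |K| (the product |K|1) and T from column sums (the product |K|ᵀ1), with λ rescaling the two while leaving Σ^(1/2)KT^(1/2) unchanged. A minimal sketch of this computation (the function name and the λ parameterization are assumptions):

```python
import numpy as np

def vector_step_sizes(K, lam=1.0, eps=1e-12):
    """Diagonal (vector) step sizes of the preconditioned CP algorithm.
    Sigma_i = lam / sum_j |K_ij|  (row sums,    |K| 1)
    T_j     = 1 / (lam * sum_i |K_ij|)  (column sums, |K|^T 1)
    The elementwise products Sigma_i * T_j do not depend on lam."""
    absK = np.abs(K)
    Sigma = lam / np.maximum(absK @ np.ones(K.shape[1]), eps)
    T = 1.0 / (lam * np.maximum(absK.T @ np.ones(K.shape[0]), eps))
    return Sigma, T

K = np.array([[1.0, 2.0], [3.0, 4.0]])
Sigma, T = vector_step_sizes(K)
```

For these diagonal choices the condition ‖Σ^(1/2)KT^(1/2)‖ ≤ 1 holds automatically (the preconditioner result of Ref. [26]), which is what makes re-expansion at every iteration affordable: no power iteration on the changing K₁(f0) is required.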

C. One-step algorithm µ-preconditioning
One of the main challenges of spectral CT image reconstruction is the similar dependence on energy of the linear X-ray attenuation curves for different tissues/materials. This causes the rows of the attenuation matrix µ_mi to be nearly linearly dependent, or equivalently its condition number is large. There are two effects of the poor conditioning of µ_mi: (1) the ability to separate the material maps is highly sensitive to inconsistency in the spectral CT transmission data, and (2) the poor conditioning of µ_mi contributes to the overall poor conditioning of spectral CT image reconstruction, negatively impacting algorithm efficiency. To address the latter issue, we introduce a simple preconditioning step that orthogonalizes the attenuation curves. We call this step "µ-preconditioning" to differentiate it from the preconditioning of the CP algorithm.
To perform µ-preconditioning, we form the matrix

M = µ µᵀ

and perform the eigenvalue decomposition

M = V S Vᵀ,    S = diag(s₁, …, s_Nm),

where the eigenvalues are ordered s₁ ≥ s₂ ≥ … ≥ s_Nm. The singular values of µ are given by the √s_i, and its condition number is √(s₁/s_Nm). The preconditioning matrix for µ is given by

P = S^(1/2) Vᵀ.

Implementation of µ-preconditioning consists of the following steps:
• Transformation of material maps and attenuation matrix — The appropriate transformation is arrived at by inserting the identity matrix, in the form of P⁻¹P, into the exponent of the intensity counts data model in Eq. (3):

Σ_m µ_mi f_km = Σ_m µ′_mi f′_km,    where    f′ = P f    and    µ′ = (P⁻¹)ᵀ µ.

• Substitution into the one-step algorithm — Substitution of the transformed material maps and attenuation matrix into the one-step algorithm given by Eqs. (47)-(52) is fairly straightforward. All occurrences of f are replaced by f′, and the linear transform K₁ is replaced by its counterpart built from µ′ by use of Eqs. (16) and (17). Using µ-preconditioning, care must be taken in computing the vector step sizes Σ and T in Eq. (48). Without µ-preconditioning, the absolute value symbols are superfluous, because K₁ has non-negative matrix elements. With µ-preconditioning, the absolute value operation is necessary, because K₁ may have negative entries through its dependence on Z and in turn µ′.
• Formulation of constraints — The previously discussed constraints are functions of the untransformed material maps. As a result, in using µ-preconditioning, where we solve for the transformed material maps f′, the constraints should be formulated in terms of f = P⁻¹f′. The explicit pseudocode for constrained data-discrepancy minimization using µ-preconditioning is given in Appendix C.
After applying the µ-preconditioned one-step algorithm the final material maps are arrived at through Eq. (58).
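The µ-preconditioning construction can be sketched numerically. The convention P = S^(1/2)Vᵀ (so that f′ = Pf and µ′ = S^(−1/2)Vᵀµ has orthonormal rows) is one consistent choice; the paper's exact convention may differ, and the toy attenuation values are assumptions.

```python
import numpy as np

def mu_precondition(mu):
    """Orthogonalize the attenuation matrix mu (materials x energies).
    Eigendecompose M = mu mu^T = V S V^T and form P = S^{1/2} V^T; then
    mu' = S^{-1/2} V^T mu satisfies mu' mu'^T = I, and the exponent is
    preserved: mu^T f = mu'^T (P f)."""
    M = mu @ mu.T
    s, V = np.linalg.eigh(M)        # ascending eigenvalues
    s, V = s[::-1], V[:, ::-1]      # reorder so s_1 >= ... >= s_Nm
    P = np.diag(np.sqrt(s)) @ V.T
    mu_prime = np.diag(1.0 / np.sqrt(s)) @ V.T @ mu
    cond = np.sqrt(s[0] / s[-1])    # condition number sqrt(s_1 / s_Nm)
    return P, mu_prime, cond

# two nearly parallel attenuation curves (hypothetical values)
mu = np.array([[0.5, 0.3, 0.2], [0.9, 0.6, 0.4]])
P, mu_prime, cond = mu_precondition(mu)
```

For curves this similar the condition number is large, illustrating the near linear dependence discussed above; after the transformation the rows of µ′ are orthonormal while the forward model is unchanged, since µᵀf = µ′ᵀ(Pf).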

D. Convergence checks
Within the present primal-dual framework we employ the primal-dual gap for checking convergence. The primal-dual gap that we seek is the difference between the convex quadratic approximation, using the first matrix block in Eq. (45), which is the objective function in the primal minimization, and the objective function in the Fenchel dual maximization problem. These problems are derived from the general forms in Eqs. (25) and (32), and the constraint in the dual maximization comes from the fact that G(f) = 0 in the primal problem; see Sec. 3.1 in [27]. For a convergence check we inspect the difference between these two objective functions. Note that the constant term z0ᵀE₁z0 cancels in this subtraction and plays no role in the optimization algorithm, and could thus be left out. If the material maps f^(n) attain a stable value, the constraint K₁ᵀy = 0 is necessarily satisfied, from inspection of Eq. (51). When other constraints are included, the estimates of the material maps should be checked against these constraints and the primal-dual gap is modified accordingly.

IV. RESULTS

We demonstrate use of the one-step spectral CT algorithm on simulated transmission data modeling an ideal photon-counting detector. The X-ray spectrum, shown in Fig. 1, is assumed known. In modeling the ideal detector, the spectral response of an energy-windowed photon count measurement is taken to be the same as that of Fig. 1 between the bounding threshold energies of the window and zero outside. We conduct two studies. The first is focused on demonstrating convergence and application of the one-step algorithm with recovery of material maps for a two-material head phantom using the following minimization problems:

TPL-TV:     arg min_f Σ_w D_TPL(c_w, ĉ_w(f))  subject to  ‖|∇f_m|‖₁ ≤ γ_m;
LSQ-TV:     arg min_f Σ_w D_LSQ(c_w, ĉ_w(f))  subject to  ‖|∇f_m|‖₁ ≤ γ_m;
TPL-monoTV: arg min_f Σ_w D_TPL(c_w, ĉ_w(f))  subject to  ‖|∇µ(E)|‖₁ ≤ γ_mono.

Note that for monoenergetic image reconstruction, the TV constraint is placed on the monoenergetic image, but the optimization is performed over the individual material maps f_m, and the monoenergetic image is formed after the optimization using Eq. (10).

Aside from the system specification parameters, such as number of views, detector bins, and image dimensions, the algorithm parameters are the TV constraints γ_m for TPL-TV and LSQ-TV or γ_mono for TPL-monoTV, and the primal-dual step-size ratio λ. The TV constraints γ_m or γ_mono affect the image regularization, but λ is a tuning parameter which does not alter the solution of TPL-TV, LSQ-TV, or TPL-monoTV; it is used to optimize the convergence speed of the one-step image reconstruction algorithm.
A. Head phantom studies with material map TV constraints

For the present studies, we employ a two-material phantom derived from the FORBILD head phantom shown in Fig. 2. The spectral CT transmission counts are computed by use of the discrete-to-discrete model in Eq. (3). The true material maps f_k,bone and f_k,brain are the 256×256 pixel arrays shown in Fig. 2, with corresponding linear X-ray attenuation coefficients µ_bone,i and µ_brain,i.

The mean of the transmission measurements is arrived at by assuming 4 × 10⁶ total photons are incident at each detector pixel over the complete X-ray spectrum. As the simulated scan acquires only 128 views, the total X-ray exposure is equivalent to acquiring 512 views at 1 × 10⁶ photons per detector pixel.
We obtain multiple material map reconstructions, varying the TV constraints among values greater than or equal to the actual values for the known bone and brain maps. The results for TPL-TV are shown in Figs. 6 and 7, and those for LSQ-TV are shown in Figs. 8 and 9. In all images the bone and brain maps recover the main features of the true phantom maps, and the main difference among the images is the structure of the noise. The noise texture of the recovered brain maps appears patchy for lower and grainy for larger TV constraint values. Also, in comparing the brain maps for TPL-TV in Fig. 7 and LSQ-TV in Fig. 9, streak artifacts are more apparent in the latter, particularly for the larger TV constraint values.
It is instructive to examine the convergence metrics in Fig. 10 and image convergence in Fig. 11 for this noisy simulation. The presentation parallels the noiseless results in Figs. 3 and 4, respectively. The differences are that results are shown for a single λ, and that the image RMSE is shown for two different TV-constraint settings in the present noisy simulations. The cPD and TV plots all indicate convergence to the solution for TPL-TV and LSQ-TV. Note, however, that the value of the data discrepancy objective function settles on a positive value, as expected for inconsistent data. The data discrepancy, however, does not provide a check on convergence: if the data discrepancy changes with iteration we do not have convergence, but the converse is not necessarily true. It is also reassuring to observe that the convergence rates for the set values of λ are similar between the noiseless and noisy results. This similarity is not affected by the fact that the TV constraints are set to different values in each of these simulations.
The RMSE comparison of the recovered material maps with the true phantom maps shown in Fig. 11 indicates an average error of less than 1% for the bone map and just under 2% for the brain map (100× the RMSE values can be interpreted as a percent error because the material maps have a value of 0 or 1). The main purpose of showing these plots is to quantify the difference between the TPL and LSQ data discrepancy terms. We would expect lower image RMSE for TPL-TV, because the simulated noise is generated by a transmission Poisson model. Indeed the image RMSE is lower for TPL-TV, and the gap between TPL-TV and LSQ-TV is larger for looser TV constraints. We do point out that lower image RMSE may not translate into better image quality, because image quality depends on the imaging task for which the images are used. Task-based assessment would take into account features of the observed signal, noise texture, and possibly background texture and observer perception [34].
One of the benefits of using the TV constraints instead of TV penalties is that the material maps reconstructed using the TPL and LSQ data discrepancy terms can be compared meaningfully.
The TV constraint parameters will result in material maps with exactly the chosen TVs, while to achieve the same with the penalization approach the penalty parameters must be searched to achieve equivalent TVs. Generating simulation results also becomes more efficient, because this parameter search is avoided, and accordingly we select the simpler optimization problem.
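Setting constraint values as fractions of the phantom map TV (as is done in the noisy simulations below) only requires evaluating the TV itself. A minimal sketch, assuming an isotropic forward-difference discretization (one common choice; the discretization is not pinned down in this passage):

```python
import numpy as np

def total_variation(f):
    """Isotropic TV of a 2D map with forward differences; the last row/column
    difference is zero (replicated boundary). One common discretization."""
    dx = np.diff(f, axis=0, append=f[-1:, :])
    dy = np.diff(f, axis=1, append=f[:, -1:])
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

# TV constraint values can then be set as fractions of the phantom map TV.
phantom = np.zeros((32, 32))
phantom[8:24, 8:24] = 1.0
gamma = 0.95 * total_variation(phantom)   # e.g. a 95% TV constraint
```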
For the chest phantom simulations, the scanning configuration is again 2D fan-beam CT with a source-to-isocenter distance of 80 cm and a source-to-detector distance of 160 cm. The physical size of the phantom pixel array is 29 × 29 cm². The number of projection views is 128 over a 2π scan. Five X-ray energy windows are simulated in the energy ranges [20, 50]
The mono-energetic images show the high-contrast structures with few artifacts. We point out that the one-step algorithm yields three basis material maps (not shown); the mono-energetic images are formed by use of Eq. (10). We derive the gradient in Eq. (12), motivating the definition of the linear transform A. Recall Eqs. (3) and (16); the gradient of L_TPL then follows. Continuing the algebraic manipulation, we insert the identity I. The other necessary gradient and Hessian computations follow from similar manipulations.

APPENDIX B
POSITIVE DEFINITENESS
Using Eq. (62), we prove the inequality in Eq. (61); the inequality follows by noting that b_s ≥ 0 by assumption, so the sum is a linear combination of positive definite matrices with non-negative coefficients.
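The fact used here, that a combination of positive (semi-)definite matrices with non-negative coefficients is again positive (semi-)definite, can be illustrated numerically (the matrices and coefficients below are arbitrary examples, not quantities from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(n):
    """A A^T is always positive semi-definite."""
    a = rng.standard_normal((n, n))
    return a @ a.T

# Non-negative coefficients, playing the role of b_s >= 0 in Appendix B.
mats = [random_psd(4) for _ in range(3)]
coeffs = [0.5, 2.0, 0.0]

combo = sum(b * m for b, m in zip(coeffs, mats))
eigs = np.linalg.eigvalsh(combo)
# All eigenvalues are (numerically) non-negative, so the combination is PSD.
```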
That only the first block of the full quadratic expression appears is explained in Sec. III-B4, and the form of D_1, E_1, and b_1 given in Sec. III-B2 determines whether we are addressing TPL-TV or LSQ-TV. The data discrepancy term of this optimization problem is the same as the objective function of Eq. (45), but it differs from Eq. (45) in that we have added the convex constraints on the material map TV values. We write Eq. (64) using indicator functions (see Eq. (5)) to encode the TV constraints, and we introduce the µ-preconditioning transformation described in Sec. III-C.

where f′ = P f are the transformed (µ-preconditioned) material maps from Sec. III-C. Note that the TV constraints apply to the untransformed material maps f = P⁻¹ f′.
Writing the constrained TV optimization in the general form F(Kx) + G(x): To derive the CP primal-dual algorithm, we write Eq. (65) in the form of Eq. (25). We note that all the terms involve a linear transform of f′, and accordingly we make the following assignments. Note that we use the short-hand that the gradient operator, ∇, applies to each of the material maps in the composite material map vector P⁻¹ f′. The Legendre transform in Eq. (27) provides the necessary dual functions F*_1, F*_2, and G* by direct computation; from Sec. 3.1 of Ref. [27], G*(f) = δ(f = 0).
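As a sanity check on the Legendre-transform machinery, the conjugate F*(y) = sup_x [x y − F(x)] of the scalar function F(x) = x²/2 is y²/2. A grid-based sketch (the grid and test points below are arbitrary, chosen only so the supremum lies inside the grid):

```python
import numpy as np

# Grid-based Legendre transform F*(y) = sup_x [x*y - F(x)] for F(x) = x^2/2.
x = np.linspace(-10.0, 10.0, 20001)
F_vals = 0.5 * x ** 2

def conjugate(y):
    """Approximate the sup over the grid; accurate when the maximizer x = y
    lies inside [-10, 10]."""
    return np.max(y * x - F_vals)

# Compare against the closed form F*(y) = y^2/2 at a few points.
errors = [abs(conjugate(y) - 0.5 * y ** 2) for y in (-1.5, 0.0, 2.0)]
```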
Convex conjugate of F_2: We sketch the derivation of F*_2(y_grad); for this derivation we drop the "grad" subscript. The objective functions of the primal and dual problems, in Eqs. (64) and (69) respectively, are needed to generate the conditional primal-dual gap plots in Fig. 3.
The material map TV proximity step: To derive the TPL-TV and LSQ-TV one-step algorithms, we need the proximity minimization in Eq. (41). The proximity problem splits into "sino" and "grad" sub-problems, and the "sino" sub-problem results in Eq. (50). We solve here the "grad" proximity optimization to obtain the pseudo-code for TPL-TV and LSQ-TV.
The proximity minimization is a projection of g′_m onto a weighted ℓ1-ball, and the solution takes the thresholding form max(g′_m − α_0 w, 0).
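Such a weighted ℓ1-ball projection reduces to the stated thresholding once the threshold α_0 is found. A minimal sketch using bisection for α_0 (the paper's pseudo-code computes α_0 by its own procedure; the weights and test vector here are arbitrary):

```python
import numpy as np

def project_weighted_l1(g, w, gamma, iters=100):
    """Euclidean projection of a non-negative vector g onto
    {x >= 0 : sum(w * x) <= gamma}, with w > 0. The KKT conditions give the
    thresholding form max(g - alpha_0 * w, 0); alpha_0 is found by bisection."""
    if np.sum(w * np.maximum(g, 0.0)) <= gamma:
        return np.maximum(g, 0.0)        # already inside the ball
    lo, hi = 0.0, np.max(g / w)          # at hi the thresholded vector is zero
    for _ in range(iters):
        alpha = 0.5 * (lo + hi)
        if np.sum(w * np.maximum(g - alpha * w, 0.0)) > gamma:
            lo = alpha                   # threshold too small, still infeasible
        else:
            hi = alpha
    return np.maximum(g - alpha * w, 0.0)

# Example: with equal weights, projecting [3, 1] onto the gamma = 2 ball.
proj = project_weighted_l1(np.array([3.0, 1.0]), np.ones(2), 2.0)
```

The weighted sum of the thresholded vector decreases monotonically in α, which is what makes the bisection valid.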

The normalized window spectrum is s_w(E) = S_w(E) / ∫ S_w(E′) dE′ [30].
For the one-step spectral CT image reconstruction algorithm we present: the local convex quadratic upper bound, a short description of MOCCA and its application in the present context, preconditioning, and convergence checks for the spectral CT image reconstruction algorithm.
A. A local convex quadratic upper bound to the spectral CT data discrepancy terms
1) Quadratic expansion: We carry out the derivations on D_LSQ and D_TPL in parallel. The local quadratic expansion for each of these data discrepancy terms about the material maps
November 12, 2015 DRAFT

2) A local convex upper bound to L(f): The key to deriving a local convex upper bound to the quadratic expansion of L(f) is to split the Hessian expressions into positive and negative components. Setting the negative components to zero and substituting this thresholded Hessian into the Taylor expansion yields a quadratic term with non-negative curvature. (As an aside, a tighter convex local quadratic upper bound would be attained by diagonalizing the Hessian and forming a positive semi-definite Hessian by keeping only the eigenvectors corresponding to non-negative eigenvalues, but for realistically sized tomography configurations such an eigenvalue decomposition is impractical.) The algebraic steps split the Hessian into positive and negative components in the form

4) Application of MOCCA to optimization of the spectral CT data fidelity: The MOCCA algorithm handles a fixed convex-concave function F, convex function G, and linear transform K. In order to apply it to the spectral CT data fidelity, we propose employing the local quadratic expansion, re-expanded at every iteration; the iterates, including y(−1), are initialized to zero vectors. Before explaining each line of the one-step spectral CT algorithm specified by Eqs. (47)-(52), we point out important features of the use of re-expansion at every iteration: (1) there are no nested loops; (2) the size of the system of equations is significantly reduced; note that only the first matrix block of K, D, E, and b (see Eq. (36) for their definition) appears in the steps of the algorithm. By re-expanding at every iteration, the set of update steps for the second matrix block becomes trivial.

Fig. 1. Normalized spectrum of a typical X-ray source for CT operating at a potential of 120 kV.

arg min_f D_LSQ(c, ĉ(f)) such that ‖f_m‖_TV ≤ γ_m ∀ m. The pseudo-code for TPL-TV and LSQ-TV is given explicitly in Appendix C. The second study simulates a more realistic application on an anthropomorphic chest phantom containing multiple tissues/materials at multiple densities. For this study we demonstrate one-step image reconstruction of a mono-energetic image at energy E using TPL-monoTV: arg min_f D_TPL(c, ĉ(f)) such that ‖f^(mono)(E)‖_TV ≤ γ_mono.

Fig. 2. Bone and brain maps derived from the FORBILD head phantom. Both images are shown in the gray scale window [0.9, 1.1].

Fig. 5. Difference between the estimated brain and bone maps after 5,000 iterations and the corresponding phantom maps, shown in a 1% gray scale window [-0.01, 0.01] for TPL-TV and a 0.1% window [-0.001, 0.001] for LSQ-TV, for different values of λ with ideal, noiseless data. The difference images are displayed in a region of interest around the sinus bones.

Fig. 6. Reconstructed bone map by use of TPL-TV from simulated noisy projection spectral CT transmission data. The material map TV constraints are varied according to fractions of the corresponding phantom material map TV.

Fig. 7. Reconstructed brain map by use of TPL-TV from simulated noisy projection spectral CT transmission data. The material map TV constraints are varied according to fractions of the corresponding phantom material map TV.

Fig. 8. Reconstructed bone map by use of LSQ-TV from simulated noisy projection spectral CT transmission data. The material map TV constraints are varied according to fractions of the corresponding phantom material map TV.

Fig. 9. Reconstructed brain map by use of LSQ-TV from simulated noisy projection spectral CT transmission data. The material map TV constraints are varied according to fractions of the corresponding phantom material map TV.

Fig. 12. (Left) Chest phantom displayed at 70 keV in a gray scale window of [0, 0.5] cm⁻¹. (Right) Reconstruction by use of unregularized TPL. The estimated material maps are combined to form the shown monochromatic image estimate at 70 keV (gray scale [0, 1.0] cm⁻¹). For reference, the TV values of the phantom and the unconstrained reconstructed image are 2,587 and 7,686, respectively.
, [50, 60], [60, 80], [80, 100], and [100, 120] keV. The lowest energy window is selected wider than the other four to avoid photon starvation. Noise is added in the same way as in the previous simulation. The transmitted counts data follow a Poisson model with a total of 4 × 10⁶ photons per detector pixel. The mono-energetic image at 70 keV, along with unregularized image reconstruction by TPL, is shown in Fig. 12. The TPL mono-energetic image reconstruction demonstrates the impact of the simulated noise on the reconstructed image. In Fig. 13 we show the resulting mono-energetic images from TPL-monoTV at three values of the TV constraint. The reconstructed images are shown globally in a wide gray scale and in an ROI focused on the right lung in a narrow gray scale window. The values of the TV constraint are selected based on visualization of the fine structures in the lung; for viewing these features, relatively low values of TV are selected. We note that in the global images the same TV values

Fig. 13. Estimated monochromatic images by use of TPL-monoTV. The left column shows the complete image in a gray scale window of [0, 0.5] cm⁻¹. The right column magnifies a region of interest (ROI) in the right lung, with the gray scale narrowed to [0, 0.1] cm⁻¹ in order to show the soft tissue detail. The top set of images corresponds to the phantom; the location of the ROI is indicated in the left phantom image, with the inset using the narrow [0, 0.1] cm⁻¹ gray scale. The second, third, and fourth rows correspond to images obtained with different TV constraints on the mono-energetic image at 70 keV.
The selected optimization problems and simulation parameters are chosen to demonstrate possible applications of the one-step algorithm for spectral CT. Comparison of the TPL and LSQ data discrepancies in Figs. 7 and 9 does show fewer artifacts for TPL-TV, where the simulated noise model matches the TPL likelihood. In practice, we may not see the same relative performance on real data: the simulations ignore some important physical factors of spectral CT, and image quality evaluation depends on the task for which the images are used.
V. CONCLUSION
We have developed a constrained one-step algorithm for inverting spectral CT transmission data directly to basis material maps. The algorithm addresses the associated non-convex data discrepancy terms by employing a local convex quadratic upper bound to derive the descent step. While we have derived the algorithm for the TPL and LSQ data discrepancy terms, the same strategy can be applied to derive one-step algorithms for other data fidelities. The one-step algorithm derives from the convex-concave optimization algorithm, MOCCA, which we developed to address an intermediate problem arising from use of the local convex quadratic approximation. The simulations demonstrate the one-step algorithm for TV-constrained data discrepancy minimization, where the TV constraints can be applied to the individual basis maps or to an estimated monochromatic X-ray attenuation map. Future work will investigate robustness of the one-step algorithm to data inconsistency due to spectral miscalibration error, X-ray scatter, and the various physical processes involved in photon-counting detection. The one-step algorithm's ability to incorporate basis map constraints in the inversion process should provide a means to control artifacts due to such inconsistencies. We are also pursuing a generalization of the present algorithm to allow for auto-calibration of the spectral response of the CT system.
APPENDIX A
GRADIENT OF L_TPL
APPENDIX C
ONE-STEP ALGORITHM FOR TPL-TV AND
LSQ-TV WITH µ-PRECONDITIONING
TV-constrained optimization: To derive the one-step algorithm used in the Results section, we write down the intermediate convex optimization problem that involves the first block of the local quadratic upper bound to D_TPL or D_LSQ:

F*_2(g′) = Σ_m γ_m ‖g′_m‖_∞.
Fig. 14. Schematic illustrating the solution of max_{g′_m} [g′_m · g_m − δ(‖g′_m‖_1 ≤ γ_m)]. The input vector g_m and the maximizing vector g′_m are indicated on a 2D schematic, but the argument applies to the full N_k-dimensional space of g_m. Because g′_m is a vector of magnitudes, each component is non-negative, g′_{m,k} ≥ 0. The indicator function confines g′_m below the line (hyper-plane) Σ_k g′_{m,k} = γ_m. The combination of these constraints confines g′_m to the shaded triangle in the schematic. The maximizer g′_m is the vector that maximizes the dot product g′_m · g_m (equivalently, the projection of g′_m onto g_m, as indicated by the dashed line from the head of g′_m to the arrow indicating g_m). This dot product is maximized by choosing g′_m = γ_m ê_{k-max}, aligned along the unit vector corresponding to the largest component of g_m. The largest component of g_m is also known as the infinity norm, ‖g_m‖_∞. Thus we have (γ_m ê_{k-max}) · g_m = γ_m ‖g_m‖_∞.
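The Fig. 14 argument can be checked numerically: over the feasible set {g′ ≥ 0 : Σ_k g′_k ≤ γ}, the maximum of the dot product with g is γ‖g‖_∞ (the dimensions and values below are arbitrary toy choices):

```python
import numpy as np

# Over {g' >= 0 : sum(g') <= gamma}, the maximum of g'.g is gamma*||g||_inf,
# attained at gamma times the unit vector of g's largest component.
rng = np.random.default_rng(4)
g = np.abs(rng.standard_normal(6))      # vector of magnitudes, g_k >= 0
gamma = 2.5

maximizer = np.zeros_like(g)
maximizer[np.argmax(g)] = gamma         # gamma * e_{k-max}
best = gamma * np.max(g)                # gamma * ||g||_inf

# Random feasible points never exceed the analytic maximum.
raw = rng.random((1000, 6))
feasible = raw / raw.sum(axis=1, keepdims=True) * (gamma * rng.random((1000, 1)))
values = feasible @ g
```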