A survey of computational frameworks for solving the acoustic inverse problem in three-dimensional photoacoustic computed tomography

Photoacoustic computed tomography (PACT), also known as optoacoustic tomography, is an emerging imaging technique that holds great promise for biomedical imaging. PACT is a hybrid imaging method that can exploit the strong endogenous contrast of optical methods along with the high spatial resolution of ultrasound methods. In its canonical form that is addressed in this article, PACT seeks to estimate the photoacoustically-induced initial pressure distribution within the object. Image reconstruction methods are employed to solve the acoustic inverse problem associated with the image formation process. When an idealized imaging scenario is considered, analytic solutions to the PACT inverse problem are available; however, in practice, numerous challenges exist that are more readily addressed within an optimization-based, or iterative, image reconstruction framework. In this article, the PACT image reconstruction problem is reviewed within the context of modern optimization-based image reconstruction methodologies. Imaging models that relate the measured photoacoustic wavefields to the sought-after object function are described in their continuous and discrete forms. The basic principles of optimization-based image reconstruction from discrete PACT measurement data are presented, which includes a review of methods for modeling the PACT measurement system response and other important physical factors. Non-conventional formulations of the PACT image reconstruction problem, in which acoustic parameters of the medium are concurrently estimated along with the PACT image, are also introduced and reviewed.


Introduction to PACT
Photoacoustic computed tomography (PACT), also known as optoacoustic tomography, is a rapidly emerging imaging technique that holds great promise for biomedical imaging (Kruger et al., 1995;Kruger et al., 1999;Xu and Wang, 2002;Xu and Wang, 2003b;Oraevsky and Karabutov, 2003). PACT is a hybrid technique that exploits the photoacoustic effect for signal generation. When a sufficiently short optical pulse is employed to irradiate an object such as biological tissue, the photoacoustic effect results in the generation of acoustic signals within the object. After propagating out of the object, these signals can be measured by use of wide-band ultrasonic transducers. The goal of PACT in its canonical formulation is to reconstruct an image that represents a map of the initial pressure distribution within the object from knowledge of the measured photoacoustically-induced acoustic signals. The initial pressure distribution is proportional to the absorbed optical energy distribution within the object, which can reveal diagnostically useful information based on endogenous hemoglobin contrast or exogenous contrast if molecular probes are utilized. As such, PACT can be viewed either as an ultrasound mediated optical modality or an ultrasound modality that exploits optical-enhanced image contrast (Xu and Wang, 2006b).
Over the past few decades, there have been numerous fundamental studies of photoacoustic imaging of biological tissue (Oraevsky et al., 1997;Wang et al., 1999;Kruger et al., 1999;Paltauf et al., 1996;Oraevsky and Karabutov, 2000;Maslov and Wang, 2008;Esenaliev et al., 1999), and the development of PACT continues to progress at a tremendous rate . Biomedical applications of PACT include small animal imaging Ma et al., 2009;Brecht et al., 2009) and human breast imaging (Andreev et al., 2000;Manohar et al., 2005;Becker et al., 2018;Lou, 2017), to name only a few. For additional information regarding applications of PACT, the reader is referred to the many review articles that have been published on this topic (Li and Wang, 2009;Beard, 2011;Wang, 2003Wang, -2004Wang and Hu, 2012;Ntziachristos and Razansky, 2010).
PACT is a computed imaging modality that utilizes an image reconstruction algorithm for image formation. From a physical perspective, the image reconstruction problem in PAT corresponds to an acoustic inverse source problem (Anastasio et al., 2007). When an idealized imaging scenario is considered, a variety of analytic solutions to the PACT inverse problem are available (Li and Wang, 2009;Rosenthal et al., 2013;Kuchment and Kunyansky, 2011;; however, in practice, numerous challenges exist that can limit their applicability. Alternatively, optimization-based, or iterative, image reconstruction methods for PACT provide the opportunity to enhance image quality by compensating for physical factors, noise, and data-incompleteness. While such approaches are routinely employed in the broader image reconstruction community, relatively few research groups have explored such modern reconstruction methods for PACT. Although they can be computationally demanding, the advent and use of modern parallel computing technologies  have rendered these reconstruction algorithms feasible for many PACT applications. In this article, the PACT image reconstruction problem is reviewed within the context of modern optimization-based image reconstruction methodologies. This review is restricted to the problem of estimating the initial pressure distribution and does not address the more complicated problem of recovering the optical properties of an object (Saratoon et al., 2013;Yuan and Jiang, 2006). Imaging models that relate the measured photoacoustic wavefields to the sought-after object function are described in their continuous and discrete forms. These models will describe physical nonidealities in the data such as those introduced by acoustic inhomogeneity, attenuation, and the response of the imaging system. The basic principles of optimization-based PACT image reconstruction from discrete measurement data are presented, which includes descriptions of forward operators that accurately describe the physics of image acquisition. Furthermore, the derivation of the adjoints of the corresponding forward operators are also reviewed. The adjoint operators facilitate the application of gradientbased approaches in solving the optimization-based PACT image reconstruction problem. Non-conventional formulations of the PACT image reconstruction problem, in which acoustic parameters of the medium are concurrently estimated along with the PACT image, are also introduced and reviewed.
The article is organized as follows. In Section 2, two canonical forward models employed for PACT are reviewed that are based on the acoustic wave equation and integral geometry formulations. The explicit formulation of discrete imaging models are described in Section 3. Optimization-based image reconstruction methods that are based on the discrete imaging models are presented in Section 4. Furthermore, joint reconstruction approaches to PACT image reconstruction whereby acoustic parameters of the medium are concurrently estimated along with the initial pressure distribution are discussed in Section 6. Section 7 concludes the article by providing a brief overview of the challenges and opportunities for research related to image reconstruction for practical applications of PACT.

Canonical forward models in their continuous forms
A schematic of a general PACT imaging experiment is shown in Fig. 1. A sufficiently short (Diebold, 2009b) laser pulse is employed to irradiate an object at time t = 0 and the photoacoustic effect results in the generation of internal pressure distribution inside of the object, which is denoted as p 0 (r), r ∈ R 3 . This pressure distribution subsequently propagates outwards in the form of acoustic waves and is detected by the wide-band point-like ultrasonic transducers located on a measurement aperture Ω 0 ⊂ R 3 . The use of alternative measurement technologies, such as integrating line or area detectors (Burgholzer et al., 2005;Paltauf et al., 2007), have also been widely explored but will not be reviewed here. The measurement aperture Ω 0 is a two-dimensional (2D) surface that partially or completely surrounds the object. The propagated pressure wavefield at time t > 0 will be denoted as p(r, t).
Below, two types of 3D canonical forward models that relate the measured propagated pressure wavefield to the sought after object function p 0 (r) are described in their continuous forms. These models, which form the basis for mathematical studies of the inverse problem in PACT, do not model transducer characteristics. The effects of finite sampling and other physical factors associated with transducer characteristics are addressed in the discrete versions of these models presented in Section 3.

Wave-equation based PACT forward model in its continuous form
Consider an idealized PACT experiment in which an object with known heterogeneous acoustic properties is irradiated with laser pulse to photoacoustically induce an initial pressure distribution p 0 (r). The initial pressure distribution subsequently generates outwardly propagating broadband acoustic waves in the surrounding medium. The acoustic properties of the surrounding medium modulate the behavior of the propagating acoustic waves based on the acoustic wave equation. Let c 0 (r) denote the medium's ambient speed of sound (SOS) distribution, ρ(r, t) and ρ 0 (r) denote the distributions of the medium's acoustic and ambient densities, respectively. Letṡ(r, t) denote the particle velocity in the medium. The initial pressure distribution p 0 (r) and all quantities that describe the properties of the medium are assumed to be represented by bounded functions that have compact supports.
For a wide range of PACT applications, acoustic absorption is not negligible (Treeby et al., 2010;La Rivière et al., 2006;Burgholzer et al., 2007;Modgil et al., 2012;Deán-Ben et al., 2011). Hence, the strong dependence of the amplitude, spectrum and shape of the propagating broadband acoustic pressure signals on the absorption characteristics of the medium needs to be modeled. It is well known that over diagnostic ultrasound frequency ranges, the acoustic absorption in biological tissue can be described by a frequency power law of the form (Szabo, 1994;Szabo, 2004) where f is the temporal frequency in MHz, α 0 (r) is the spatially varying frequencyindependent absorption coefficient in dB MHz −y cm −1 , and y is the power law exponent that is typically in the range of 0.9-2.0 (Szabo, 2004). Note that classical lossy wave equations predict an absorption that is either frequency independent or proportional to frequency squared (Markham et al., 1951).
To describe the effects of power law absorption and dispersion, formulations of the wave equation that model time-domain fractional derivative operators as convolutions have been proposed (Szabo, 1994;Szabo, 2004;Podlubny, 1998;Moshrefi-Torbati and Hammond, 1998). Numerical implementation of time-domain fractional derivative operators for accurately modeling 3D acoustic wave propagation poses a significant memory burden (Yuan et al., 1999). To overcome this memory burden, some work has focused on devising a recursive algorithm to compute the time-domain fractional derivative (Liebler et al., 2004). However, this approach is heuristic and requires a priori optimization for each value of y. To circumvent this, a lossy wave equation modeling power law absorption using fractional Laplacian operators has been proposed (Chen and Holm, 2004). The fractional Laplacian operators can be easily implemented numerically to effectively model the power law absorption in 3D lossy media. Although, the proposed fractional Laplacian derivative operator-based lossy wave equation can accurately model power law absorption, it does not exhibit the correct dispersive sound speed relation (Sushilov and Cobbold, 2004). In order to account for the dispersion inconsistencies, a lossy wave equation that utilizes two fractional Laplacian derivative operators was proposed (Treeby and Cox, 2010a;Treeby et al., 2010). These two fractional Laplacian derivative operators account for the required power law absorption and dispersion terms, separately.
Given the frequency-dependent power law absorption model, one can formulate the PACT forward model in a lossy, acoustically heterogeneous fluid media as (Treeby and Cox, 2010a;Morse and Ingard, 1968;Morse and Ingard, 1961): subject to the initial conditions Here, the spatially varying quantities µ(r) and η(r) describe the acoustic absorption and dispersion proportionality coefficients that are defined as The second and third terms on the right hand side of Eqn. (2c) account for the required power law absorption and dispersion terms separately through two fractional Laplacian derivative operators. The initial value problem defined in Eqn.
(2) describes the propagation of photoacoustically generated pressure data p(r, t) given the spatially varying sound speed c 0 (r), ambient density ρ 0 (r), and the acoustic absorption coefficient α 0 (r). Consider that the measured pressure data p(r, t)| r=r are recorded outside the support of the object for r ∈ Ω 0 and t ∈ [0, T ]. The continuous PACT forward model consists of the composition of the partial differential equation (PDE) described in Eqn.
(2) and an observation operator that restricts the pressure recorded to the measurement surface Ω 0 . Hence, the mapping from the initial pressure distribution p 0 (r) to the pressure recorded on the continuous measurement aperture Ω 0 in a acoustically heterogeneous fluid media can be expressed as: where the wave equation-based forward operator H wave : describes a continuous-to-continuous (C-C) mapping between two function spaces. The reason for the use of such terminology is that the wave equation-based forward operator maps a function of continuous variable (as opposed to a discrete variable) to another function of a continuous variable. There is no implication that either the function itself is continous or even that the mapping is continuous.

Integral geometry-based forward model
In the special case of a lossless and acoustically homogeneous infinite medium, the solution to the wave equation in Eqn.
(2) can be expressed as (Xu and Wang, 2006a) p(r , t) = 1 4πc 2 where c 0 denotes the constant sound speed value, δ(t) denotes the one-dimensional (1D) Dirac delta function, and the integral geometry-based C-C forward operator is denoted as Equation (5) can be conveniently expressed in terms of the spherical Radon transform (SRT) (Kuchment and Kunyansky, 2011) as where the data function g(r , t) is related to the measured pressure data p(r , t) by In the special case of a 2D problem, the spherical Radon transform model reduces to a circular Radon transform Wang et al., 2008).
For the case of a known weakly heterogeneous acoustic media, Eqn. (6) can be replaced by a generalized Radon transform model that is derived from the wave equation by use of a geometrical acoustics approximation. In that case, the measured pressure signals are related to p 0 (r) through integration over nonspherical isochronous surfaces (Miller et al., 1987;Modgil et al., 2010;Jose et al., 2012).
Heuristic strategies have also been proposed to mitigate image artifacts that result from employing the idealized forward model in Eqn. (6) in the presence of unknown acoustic heterogeneity. For example, half-time and partial-time image reconstruction methods have been proposed for PACT image reconstruction from temporally truncated measurements that exclude components of the measured data that are strongly aberrated (Poudel, Matthews, Li, Anastasio and Wang, 2017;Anastasio, Zhang, Sidky, Zou, Xia and Pan, 2005;Anastasio, Zhang, Pan, Zou, Ku and Wang, 2005).

Review of semidiscrete and discrete forward modeling
As with any digital imaging system, the data acquired in a PACT experiment represent a finite collection of numbers that form a vector. As such, the PACT forward operator is fundamentally a continuous-to-discrete (C-D) mapping (Barrett and Myers, 2013) that relates p 0 (r) to the measurement vector. The C-D operator for PACT can be expressed as where D is a discretization operator that spatially and temporally samples the pressure wavefield p(r , t) and H CC represents a C-C PACT forward operator such as H int or H wave . Let u ∈ R QK×1 denote a lexicographically ordered representation of the sampled pressure data and let [u] m denote its m-th component. When ideal point-like ultrasound transducers are assumed, the measured samples of the pressure wavefield are given by for k = 0, 1, ..., K − 1. Here, k and K denote the temporal sample index and total number of temporal samples, respectively, and the vectors r q ∈ R 3 , q = 0, 1, 2, ..., Q − 1, describe the locations of the Q transducers on the aperture Ω 0 . More generally, as discussed in Section 3.4, the measured pressure data can be described as (Wang and Anastasio, 2015) [ where σ q (r 0 ) and τ k (t) are functions that describe the spatial and temporal sampling apertures of the transducers, respectively.
In order to employ the algebraic or optimization-based image reconstruction methods described later, the C-D forward operator must be approximated by a discreteto-discrete (D-D) one. To accomplish this, a finite dimensional representation of p 0 (r) can be employed. An N -dimensional representation of p 0 (r) can be described as where N > 0 and the superscript a indicates that p a 0 (r) is an approximation of p 0 (r). The functions φ n (r) are called expansion functions and the expansion coefficients [θ] n are elements of the N -dimensional vector θ. The goal of image reconstruction is to estimate θ for a fixed choice of the expansion functions φ n (r). As described below, different choices for the φ n (r) and rules for determining θ will result in different D-D forward models. The specific choice of expansion functions may be motivated by various theoretical and practical reasons including a desire to minimize representation error, incorporation of a priori information regarding the object, and efficient computation. Popular choices of expansion functions in PACT include cubic or radially symmetric expansion functions known as Kaiser-Bessel (KB) window functions (Ephrat et al., 2008;Paltauf et al., 2002;Wang et al., 2013;Wang, Schoonover, Su, Oraevsky and Anastasio, 2014), and linear interpolation functions (Zhang et al., 2009;Wang et al., 2013;Dean-Ben et al., 2012;Ding et al., 2017). In general, these D-D imaging models will have distinct numerical properties that will affect the performance of iterative reconstruction algorithms . D-D forward models can be established by substitution of a finite-dimensional object representation into the C-D imaging model: where the D-D operator H is commonly referred to as the system matrix. An element in the n-th row and m-th column of H will be denoted by [H] n,m .
Next, examples of D-D PACT imaging models for use with homogeneous and heterogeneous acoustic media are reviewed.

D-D forward models for use with homogeneous media
In this subsection, two popular D-D PACT imaging models for homogeneous acoustic media are reviewed that employ different choices for the expansion functions.

Interpolation-based D-D PACT model
In the interpolation-based D-D PACT imaging model, the the associated expansion functions can be expressed as (Kak and Slaney, 2001) where ∆ s is the distance between neighboring points on an uniform and isotropic Cartesian grid. The coefficient vector θ can be defined as : where r n = (x n , y n , z n ) T denotes the location of the n-th Cartesian grid node. The corresponding D-D PACT imaging model based on the interpolation expansion functions can be expressed as where Here, G and D are discrete approximations of the SRT operator (Eqn. (6)) and the differential operator (Eqn. (7)), respectively. The discrete SRT operator G can be implemented in a "temporal-sample-driven" manner; namely, the pressure data are computed by accumulating the contributions from voxels on a discretized spherical shell surface defined by the current data sample : where N i , N j denotes the numbers of angular divisions over the polar and azimuth directions within the local spherical coordinate system centered at the q-th transducer r q with a radius of k∆ t c 0 , in which the center of i-th polar division and j-th azimuth division is denoted by r k,i,j . The differential operator D can be implemented as Due to their large sizes, the explicit storage of system matrices is typically infeasible with current computer technologies except for small scale problems. By use of Eqns.
(17) and (18) the action of H int can be computed without having to explicitly store its elements.

3.2.2.
Kaiser-Bessel function-based D-D PACT model The Kaiser-Bessel (KB) function-based D-D PACT model employs the KB functions of order m as the expansion functions. These KB functions are defined as (Lewitt, 1990;Schweiger and Arridge, 2017;Wang, Schoonover, Su, Oraevsky and Anastasio, 2014) b where x ∈ R + , a ∈ R + denotes the support radius of the KB function, γ ∈ R + describes the smoothness of the KB function. The term I m (x) denotes the modified Bessel function of the first kind of order m. The associated expansion function can be expressed as which is simply the KB function centered at location r n . When employing the KB expansion functions, it is convenient to formulate the corresponding D-D PACT imaging model in the temporal frequency domain . Letũ denotes the discrete Fourier transform (DFT) of the measurement data u. The KB function-based D-D PACT imaging model can be expressed as (Wang, Schoonover, Su, Oraevsky and Anastasio, 2014) where the elements of the system matrixH KB can be written as the multiplication of two terms: wherep Here, L denotes the total number of frequency components and ∆ f is the frequency sampling interval. The first quantityp KB (f ) in Eqn. (22) is the temporal Fourier transform of the acoustic pressure generated by a KB function located at origin, wherê j m (x) is the m-th order spherical Bessel function of the first kind (Diebold, 2009a). The second quantityh s,point q (r n , f ) is the spatial impulse response (SIR) in the temporal frequency domain. Under the assumption of an idealized point-transducer model, this quantity reduces to the Green function given in Eqn. (23), where r s q denotes the location of the q-th transducer and r n denotes the location of the center of the n-th KB expansion function (Wang, Schoonover, Su, Oraevsky and Anastasio, 2014).
3.3. D-D forward models for use with heterogeneous acoustic media 3.3.1. Overview of approaches The establishment of D-D imaging models for lossy, heterogeneous fluid media requires the introduction of finite-dimensional approximations of the object function p 0 (r) as well as the known medium acoustic parameters such as c(r), ρ 0 (r) and α(r). In principle, these medium acoustic parameters can be estimated from adjunct ultrasound tomography data (Manohar et al., 2007;Jin and Wang, 2006). When such adjunct image data are not available, one can attempt to estimate the acoustic parameters concurrently with p 0 (r) as described in Section 6. For PACT in heterogeneous media, the choice of finite-dimensional representation of the object and medium parameters is dictated by the numerical method employed to solve the coupled first order differential equations described in Eqn. (2). Most methods for computing a numerical solution of the acoustic wave equation in lossy heterogeneous media fall into one of three major categories: finite difference (FD) methods, finite element (FE) methods, and spectral methods.
The FD and FE methods are also known as local methods because the wave propagation equations are solved at each point based only on conditions at neighboring points. In contrast, spectral methods are global, as information from the entire wavefield is employed to numerically propagate the wavefield. As the spectral methods leverage global information, they allow computation to be performed on coarser grid while maintaining high accuracy (Gottlieb and Orszag, 1977). As opposed to FD or FE methods that require grid spacing on the order 10 points per minimum wavelength, spectral methods, in theory, only require 2 points per minimum wavelength (Cox, Kara, Arridge and Beard, 2007). Due to such relaxed spatial sampling constraints, spectral methods are well suited for large-scale, high frequency PA wave solvers. One of the most popular spectral methods used for solving the acoustic wave equation is the kspace pseudospectral method (Mast et al., 2001;Tabei et al., 2002;Cox, Kara, Arridge and Beard, 2007). Because of its popularity in the PACT community, the D-D imaging model presented below will be based upon the k-space pseudospectral time-domain (PSTD) method (Cox, Kara, Arridge and Beard, 2007).

D-D forward model based on the k-space PSTD method
Below, a general formulation of a D-D forward model based on the k-space PSTD method is briefly outlined. Because of the highly technical nature of the formulation, the reader is referred to the literature for specific details .
The finite-dimensional approximations of the object function p 0 (r) as well as the medium parameters c(r), ρ 0 (r) and α(r) are constructed by sampling these quantities on a Cartesian grid. If {φ int n (r)} N −1 n=0 denotes the set of 3D Cartesian expansion functions defined in Eqn. (13), the initial pressure distribution p 0 (r) can be approximated as a finite dimensional vector θ ∈ R N ×1 . To ensure the stability of the k-space PSTD method, the material properties such as the speed of sound c(r), the ambient density ρ 0 (r), the absorption coefficient α(r), and the wavefield parameters such as pressure p(r, t), particle velocityṡ(r, t), and acoustic density ρ(r, t), are sampled at different points of staggered Cartesian grids (Treeby and Cox, 2010b). Furthermore, the wavefield parameters such as the pressure, particle velocity and acoustic density are staggered temporally. Thus, the finite-dimensional representations of the medium parameters and the wavefield parameters for use in a D-D imaging model will generally employ different expansion functions. Letṡ represent the discrete approximations of the vectorvalued particle velocity and acoustic density over the whole 3D Cartesian grid at the k th time step along the i th direction. Let p k ∈ R N ×1 represent the acoustic pressure sampled at all spatial grid points at the k th time step. Let × 1 vector containing all the wavefield variables at the time step k∆t. The image acquisition process in PACT can be mathematically modeled as the propagation of the wavefield parameters forward in For specific details regarding the definition of the temporal matrix T k (k = 1, · · · , K − 1) ∈ R 7N K×7N K , the reader is referred to the literature . From the initial conditions defined in Eqn.
(2), the vector where T 0 ∈ R 7N ×N maps the initial pressure distribution to the initial value of the wavefield variables at time t = 0 . In general, the transducer locations r q at which the pressure are recorded will not coincide with the vertices of the 3D Cartesian grid at which the propagated wavefields are computed. The pressure at the transducer locations can be related to the computed field quantities via an interpolation operation defined as where M ∈ R QK×7N M . The choice of the interpolation operation is a parameter that will guide the construction of the matrix M. Some of the most commonly employed interpolation strategies include trilinear interpolation or Delaunay triangulation-based interpolation (Lee and Schachter, 1980). By use of Eqns. (25), (24) and (26), one obtains The explicit form of the system matrix H P S ∈ R QK×N , that describes the propagation of PA waves based on Eqn.
(2) can be therefore expressed as Here, the subscript P S stands for the fact that the D-D imaging model is computed using the k-space PSTD method.

Incorporation of transducer responses in D-D forward models
In practice, the ultrasound transducers employed in a PACT imager are imperfect and result in measurements of the PA wavefield that are averaged over finite temporal and spatial apertures as described in Eqn. (10). In the ultrasound imaging community, the effects of these sampling apertures are characterized by the transducer's electrical impulse response (EIR) and spatial impulse response (SIR). Failure to account for these effects can result in reconstructed estimates of p 0 (r) that contain distortions and/or degraded spatial resolution (Xu and Wang, 2003a). When iterative image reconstruction methods are to-be-employed, a natural way to compensate for the EIR and SIR is to include them in the constructed D-D forward model Rosenthal et al., 2011;Ding et al., 2017). Below, the D-D forward models for use with homogeneous acoustic media introduced in Section 3.2 are extended to incorporate the transducer responses. The extension of the the D-D forward model for use with heterogeneous acoustic media introduced in Section 3.3 to include transducer responses is relatively straightforward and is described in the literature .
3.4.1. Incorporating the EIR in D-D forward models for use with homogeneous acoustic media Because degradation of the measured data by the EIR is described by a linear time-invariant model, it can be readily incorporated into the system matrices. For example, the interpolation-based system matrix with EIR can be re-defined as  H where D, G are the discrete approximation so the differential operator and the SRT operator and Here, [h e ] k = ∆ t h e (t)| t=k∆t is the EIR signal sampled in the temporal domain, F and F −1 represent the discrete Fourier transform and inverse discrete Fourier transform, respectively, and p ideal denotes the temporally sampled ideal pressure signal vector that would be recorded by an idealized point transducer.
A similar approach can be adopted to incorporate EIR in the KB function-based system matrix. In this case, the convolution operation can be implemented as an element-wise multiplication in the temporal frequency domain Wang et al., 2013) and the elements of the system matrix are re-defined as where samples ofh e (f ) are computed as the discrete Fourier transform of h e .

Incorporating the SIR in D-D forward models for use with homogeneous acoustic media
The spatial impulse response (SIR) describes the spatial averaging of an acoustic signal that occurs when the signal is measured by use of a transducer possessing a non-zero active area. Specifically, consider a point acoustic source located at position r n whose temporal response is described by δ(t). The SIR h s q (r n , t) represents the measurement of the radiated wavefield by a transducer indexed by q that has an idealized EIR.
Various SIR models have been proposed (Harris, 1981;Lockwood and Willette, 1973;Stepanishen, 1971) and employed in PACT Mitsuhashi et al., 2014;Rosenthal et al., 2011;Queirós et al., 2013;Ding et al., 2017;Wiskin et al., 2012;Sandhu et al., 2015). Because the degradation of the measured signal by the SIR is not generally described by a linear time-invariant model, it is not as straightforward to compensate for as is the EIR.
For the interpolation-based D-D imaging model in Section 3.2, Ding et al. proposed to approximately incorporate the SIR by summing the pressure signal over a collection of points on the transducer surface. However, in order to accurately model the effects of the SIR, it may be necessary to sample a large number of points on the transducer surface, which can result in a large computational burden. It remains generally difficult to accurately incorporate the SIR in an interpolation-based PACT D-D forward model and therefore certain implementations of this model have ignored the SIR .
In the KB function-based system matrix, the SIR can be readily incorporated as an additional element-wise multiplication step in the temporal frequency domain: Here,h s q (r n , f ) denotes the temporal Fourier transform of h s q (r n , t) that can be computed by integrating the Green function in Eqn. (23) over the q-th transducer surface S q : Note that the integral in Eqn. (33) resembles the Rayleigh integral (Kirkup, 1994). As a specific example, consider a transducer element that possesses a rectangular detecting surface of area a × b. Under the validity of the far field assumption (Mitsuhashi et al., 2014;Born and Wolf, 2013): Eqn. (33) can be evaluated to determineh s q (r n , f ) as (Stepanishen, 1971) where x tr nq , y tr nq are the transverse components of r n in a local coordinate system centered at r s q . Given r n = (x n , y n , z n ) and the transducer location r s q specified in spherical polar coordinates r s q = (r q , θ q , φ q ), the values of the transverse coordinates can be computed as : 3.4.3. Patch-based estimation of the SIR When the far field condition in Eqn. (34) is violated, the SIR expression in Eqn. (35) can be inaccurate and its use may lead to artifacts in the reconstructed images (Mitsuhashi et al., 2014). Moreover, when the transducer surface is not flat, it may be difficult to determine an analytic expression for the SIR. These issues can be mitigated by use of "divide-and-integrate" approaches, including line-detector-based method (Rosenthal et al., 2011) and patch-based method (Mitsuhashi et al., 2014). In such approaches, the surface of the transducer is computationally decomposed into smaller elements whose SIRs can be more readily determined. In the line-detector based model, each transducer element surface is decomposed into a number of parallel straight lines, whose SIRs can be analytically expressed. In the patch-based method, each transducer element surface is divided into smaller planar patches that each satisfy the far field approximation (Mitsuhashi et al., 2014); subsequently, in either case, the SIR of the transducer can be approximated by computing the average of the SIRs of the sub-elements (lines or patches): whereh s q,i (r n , f ) denotes the SIR corresponding to i-th sub-element. In the patch-based approach,h s q,i (r n , f ) can be computed with the aid of Eqn. (35). It should be noted that, in the line-detector model, the approximation only accounts for the SIR effect in the direction that is parallel to the straight lines, it still assumes zero thickness (pointlike) transducer in the perpendicular direction. The patch-based model employs the far-field approximation for both directions in the transducer plane, therefore providing compensation in both directions. In addition, the patch-based model can be extended beyond planar transducers by decomposing an arbitrary transducer surface into smaller patches to estimate its SIR.
A computer simulation study examining the effects of accurately modeling the SIR for flat rectangular transducers in the context of PACT image reconstruction is discussed below (Mitsuhashi et al., 2014). The numerical phantom used for the study consisted of spherical objects placed within a PACT system as shown in Fig. 2(a). The simulated pressure data for each sphere were generated by numerically convolving a closed form expression of waves generated by a solid sphere (Diebold, 2009a), with a semi-analytical SIR specifically for spherical waves (Jensen, 1999). During reconstruction, the Kaiser-Bessel function-based D-D PACT imaging model introduced in Section 3.2.2 was employed with different SIR models: Fig. 2 37). The reconstructed images where obtained by solving a penalized-least-squares optimization problem with quadratic smoothness penalty using the conjugate gradient method. These results show that ignoring transducer SIR in the PACT imaging model leads to significant distortion in reconstructed images. Incorporating an accurate transducer SIR model can greatly reduce such artifacts. For objects far away from the transducers, the far-field-based SIR model may suffice. However, for closer objects, patch-based or other "divide-andintegrate" SIR models can further improve the reconstruction accuracy.

Image reconstruction approaches
In this section, a brief survey of conventional image reconstruction methods for PACT is provided. Subsequently, optimization-based image reconstruction methods that are based upon D-D forward models are presented in Section 5.

Brief overview of analytic reconstruction methods
A large number of analytic, or non-iterative, tomographic reconstruction methods for PACT are currently available. Several review articles provide detailed descriptions of these (Kuchment and Kunyansky, 2011;Li and Wang, 2009), so only a quick overview is provided here. Analytic reconstruction methods are commonly based upon a C-C PACT forward model that assumes a homogeneous acoustic media, such as those described in Section 2.2. Analytic reconstruction formulae of filtered backprojection (FBP) form are available for special detector geometries (Finch and Patch, 2004;Xu and Wang, 2005;Kunyansky, 2007a;Finch et al., 2007;Haltmeier, 2014). Such reconstruction methods typically assume an acoustically homogeneous and lossless medium and full-view acoustic detection in which the pressure measurements are densely sampled on a closed surface that encloses the object. Additionally, the available analytic methods are unable to compensate for the ultrasound transducer responses . Other ad hoc reconstruction techniques such as inversion of the Radon transform (Kruger et al., 1995), as well as delay-and-sum techniques (Hoelen and de Mul, 2000) have been employed for PACT image reconstruction. PACT reconstruction methods that are based on harmonic decomposition have also been proposed (Norton and Linzer, 1981). For special geometries such as planar detector geometries, such reconstruction methods can be implemented efficiently by use of the fast Fourier transform (FFT) (Köstli et al., 2001;Fawcett, 1985). This approach has been extended for use with novel measurement geometries that utilize reverberant cavities , and for use with closed measurement surfaces for which the eigenfunction of the Dirichlet Laplacian are explicitly known (Kunyansky, 2007b). These approaches possess computationally efficient implementations and, in certain cases, can reconstruct 3D images thousands of times faster than backprojection-type methods (Kunyansky, 2007b). Similar to the FBPtype algorithms, the harmonic decomposition-based reconstruction methods typically assume an acoustically homogeneous media and do not compensate for the transducer responses.

Time-reversal reconstruction methods
As an alternative to the analytic reconstruction approaches discussed above, a variety of time-reversal (TR) based reconstruction methods have been proposed (Xu and Wang, 2004;Stefanov and Uhlmann, 2009;Hristova et al., 2008;Treeby et al., 2010;Palacios, 2016). Image reconstruction via TR is achieved by running a numerical model of the wave equation-based forward problem backwards in time. Namely, the measured acoustic pressure signals are re-transmitted into the domain in a time-reversed fashion. As such, they can readily account for wave propagation in heterogeneous media and can accommodate arbitrary detection geometries.
4.2.1. Formulation of TR image reconstruction for PACT Consider a compactly supported initial pressure distribution p 0 (r) in an infinite 3D homogeneous lossless acoustic medium, and let the domain Ω correspond to the interior of a closed measurement surface Ω 0 that encloses the to-be-imaged object. According to Huygen's principle, there exists a time T > 0 for which the radiated wavefield inside Ω vanishes ∀t ≥ T . Because Ω contains no energy ∀t ≥ T , one can solve the wave equation backwards in time inside Ω with zero initial conditions and the boundary condition given by the measured data on the surface Ω 0 . As described mathematically below, this process of rewinding the waves backward in time to reconstruct the p 0 (r) is defined as TR. When finite sampling effects are ignored, it has been demonstrated that time reversal yields a theoretically exact reconstruction of p 0 (r) for the case of a 3D acoustically homogeneous medium if T is large enough to allow the energy to escape the domain Ω . In even dimensions or when the medium is acoustically heterogeneous, this result does not hold true. Nevertheless, the TR method can still be employed to obtain an estimate of p 0 (r). When the speed of sound distribution is non-trapping (i.e. all of the acoustic energy escapes the domain Ω), the errors in the estimates can be bounded as a function of the acquisition time T (Hristova, 2009).
In addition, the same principle of replaying the wavefield back in time can also be applied in lossy fluid media. Intuitively, to compensate for the amplitude loss in the wavefields, the wavefields should be corrected by gain factors. To account for acoustic absorption, the absorption term in Eqn. (2c) must be reversed in sign when computing the time-reversed wavefields. In contrast, to account for the dispersion, the sign of the dispersive speed is left unchanged (Treeby et al., 2010). Thus, the TR reconstruction method for use with lossy heterogeneous media solves the following system of equations for the time-reversed pressure wavefield p(r, t) (Treeby et al., 2010): subject to the conditions: Here, u(r, t) denotes the measured pressure data for the case where finite sampling effects are neglected and idealized point-like ultrasound transducer are employed. The sought after estimate of p 0 (r) is specified by the TR method aŝ In operator form, Eqns. (38) and (39) can be expressed aŝ where H T R : L 2 (Ω 0 × [0, T ]) → L 2 (Ω) denotes the time reversal operator described in Eqn. (38).

4.2.2.
Modified TR image reconstruction based on a Neumann series method As described above, the canonical TR method is mathematically exact when the radiated wavefield p(r, t) decays to zero inside the domain Ω for some finite time T . However, in even dimensions and/or when the medium is acoustically heterogeneous and nontrapping, this condition may not be satisfied (Hristova et al., 2008;Hristova, 2009). A reconstruction method based on a Neumann series (NS) expansion has been developed that can be employed for accurate image reconstruction in the presence of acoustically heterogeneous media and an irregular observation geometry for a finite acquisition time T (Qian et al., 2011;Stefanov and Uhlmann, 2009). The NS-based reconstruction method is accurate when the pressure p(r, t) is known on the whole boundary Ω 0 and T is greater than a stability threshold (Qian et al., 2011). To define the NS-based reconstruction method, one needs to first introduce the modified TR operator. The modified TR operator for lossy heterogeneous media solves Eqns. (38a), (38b) and (38c) subject to the initial and boundary value conditions: where P Ω : L 2 (Ω 0 ) → L 2 (Ω) is a harmonic extension operator. The harmonic extension operator is defined as P Ω g(r, T ) = φ(r), where φ(r) is the solution to the elliptic boundary value problem One can summarize Eqns. (41), (39) and (42) aŝ where H T R mod : L 2 (Ω 0 × [0, T ]) → L 2 (Ω) is the modified TR reconstruction operator specified by Eqns. (41) and (42).
Given the modified TR operator, the NS-based reconstructed initial pressure distribution can be defined aŝ where H wave is the C-C wave equation-based forward operator in Eqn. (4) and I is the identity operator. Although the NS-based reconstruction is guaranteed to be convergent when the speed of sound distribution is non-trapping, it has been successfully applied to the non-trapping case as well (Qian et al., 2011).
As the sum defined in Eqn. (44) extends from m = 0 to m = ∞, it cannot exactly be implemented. It is interesting to note that the NS method can be interpreted as an iterative time reversal method (Arridge et al., 2016). Let us define the i th estimate of the reconstructed initial pressure distribution aŝ The partial sum in Eqn. (45) can be expressed aŝ From Eqn. (46), the NS partial sum can be interpreted as an iterative update step, where the i th iteration refers to the i th partial sum.

Optimization-based image reconstruction
A general approach to PACT image reconstruction is to formulate the sought-after estimate of p 0 (r) as the solution of an optimization problem. In fact, most modern image reconstruction methods for computed imaging modalities including X-ray computed tomography and magnetic resonance imaging are formulated in this way (Fessler, 1994;Wernick and Aarsvold, 2004;Pan et al., 2009). There are potential practical and conceptual advantages to optimization-based image reconstruction over analytic methods. For example, because they are based on D-D forward models, optimizationbased image reconstruction methods can comprehensively compensate for the imaging physics and other physical factors such as responses of the measurement system that are not easily incorporated into an analytic method. Moreover, such methods provide a general framework for incorporating regularization, which can mitigate the effects of measurement noise and data incompleteness. Because optimization-based methods are often implemented by use of iterative algorithms, they are generally more computationally demanding than analytic methods; however the use of modern parallel computing technologies  can render iterative three dimensional model-based reconstruction scheme for arbitrary photoacoustic acquisition geometries feasible for many PACT applications. Consider a D-D imaging model u = Hθ as described in Section 3. An optimizationbased image reconstruction method seeks to determine an estimate of θ by solving an optimization program that can be generally specified aŝ Here, F (·) is the objective function to be minimized that depends on the D-D forward operator H, the measured data u, and the vector θ that specifies the estimate of p 0 (r). The functions {f i (θ)} Nc i=0 and the closed set {C i } Nc i=0 describe the set of N c constraints that the solution must satisfy. It is important to note that numerous choices must be made in the design of the reconstruction method. These include the specification of H, F (·), and f (θ), as well as the optimization algorithm that is employed to solve Eqn. (47). It is the joint specification of these quantities that defines the sought after solution and the numerical properties of the reconstruction method (Zhang et al., 2009). It should also be noted that many generic iterative methods that have been employed for PACT image reconstruction (

Penalized least squares methods
A special case of Eqn. (47) corresponds to a penalized least squared (PLS) estimator, and is given byθ Since the initial pressure distribution is non-negative, the above problem can be constrained so the solution satisfies θ ≥ 0. The objective function in the PLS estimator is expressed as two terms. The quantity 1 2 ||u − Hθ|| 2 W is referred to as the data fidelity term that corresponds to a weighted least squares functional. The matrix W ∈ R QK×QK that defines the weighted l 2 norm is symmetric and positive definite, whereas u ∈ R QK×1 corresponds to the measurement vector. This data fidelity functional is convex and differentiable with respect to θ. For the case of Gaussian measurement noise, this data fidelity term can be interpreted as a negative log-likelihood function; however, its use is not restricted to that case.
The quantity R(θ) is a penalty term that can be designed to regularize the inverse problem, and γ ∈ R is a regularization parameter that controls the amount of regularization. A classic form of regularization is Tikhonov regularization (Golub et al., 1999), where the regularization term is specified as where P ∈ R N ×N is a symmetric positive definite matrix. As the Tikhonov regularization term is differentiable with respect to θ, a variety of gradient-based methods can be employed to solve Eqn. (48). Gradient-based methods that can be employed to solve Eqn. (48) can be broadly grouped into two classes: first order methods and second order methods. As part of first order methods, the Taylor approximation used to compute the descent direction involves a linear or first order approximation of the cost function. Hence, information about the gradient of the function is employed to compute the descent direction. Some of the most commonly employed first order methods include the steepest descent method, and the conjugate gradient method. There also exists a class of first order methods that employ information about the past gradients/momentum to speed up the convergence rate of first order algorithms. Such methods are referred as Nesterov methods, whereby linear combinations of present and past gradients are used to compute the descent direction (Nesterov, 1983;Nesterov, 1988). In addition, a family of methods called Krylov-based methods, the subset of which is are the conjugate gradient methods, are also employed to solve smooth, convex programs defined in Eqn. (48) (Paige and Saunders, 1982;Gutknecht, 2007;Greenbaum, 1997). The design and study of variants of the Nesterov method and the Krylov-based methods to solve smooth, convex optimization programs is an active research area (Ghai et al., 2019;Su et al., 2014;Dax, 2019).
The second class of methods regularly employed to optimization programs are the second order methods. The Taylor approximation used to compute the descent direction for second order methods is a quadratic or second order approximation of the cost function. Hence, information about the inverse Hessian of the cost function is employed to compute the descent direction. Methods that explicitly compute the inverse Hessian are referred to as Newton methods. Although the newton methods have favorable converge properties, the computational burden associated with computing a Hessian and inverting it is prohibitively large. Hence, a variety of methods that approximate the inverse Hessian from first order gradient information are commonly employed. Such methods are referred to as Quasi-Newton methods. A variety of Quasi-Newton methods have been proposed and comprehensively studied (Wright and Nocedal, 1999;Dennis and Moré, 1977;Nocedal, 1980;Conn et al., 1991;Khalfan et al., 1993;Dai, 2002).
The set of methods discusssed above can handle cost functions that have smooth regularization and data fidelity terms that are differentiable. However, certain modern approaches to regularization employ non-smooth choices for R(θ) based on the l 1 -norm: where Φ ∈ R N ×N is a sparsifying transformation that is chosen such that θ is sparse in its range. Popular choices for Φ include the discrete wavelet transform or finite difference operators. Total variation (TV) regularization (Sidky and Pan, 2008) can be achieved as a special case when the latter are employed. Such choices are based on the observation that many objects possess a sparse representation in the wavelet or gradient domain (Starck et al., 2010;Chartrand, 2009). Although Eqn. (50) is convex, it is not differentiable with respect to θ. However, a variety of non-smooth optimization methods, such as proximal-gradient methods, can be employed to solve Eqn. (48) in this case. Proximal methods are a type of forward-backward splitting approach, which alternates between computing a gradient step on the data fidelity term and a solution to a proximal problem involving the regularization (Parikh et al., 2014;Combettes and Pesquet, 2011;Beck and Teboulle, 2009b). Moreover, the proximal gradient methods can also be accelerated by utilizing momentum information from past gradients or by deploying second order information to solve the proximal problem (Becker and Fadili, 2012;Beck and Teboulle, 2009b;Beck and Teboulle, 2009a;Pan et al., 2013;Nesterov et al., 2007;Lee et al., 2012;Stella et al., 2017). Examples of PACT images of a live mouse reconstructed by use of two different reconstruction methods are displayed in Fig. 3. A description of the imaging system and experimental details have been described in previous works Brecht et al., 2009). The small animal imager consisted of an arc-shaped probe of 64 transducers along the vertical axis. Thus, a tomographic view consisted of data recorded at 64 transducers along the arc. The complete tomographic dataset was acquired by rotating the object 360 • around the vertical axis. A 180-view dataset was used to reconstruct 3D images of the mouse trunk as shown in Fig. 3. The image shown in Fig. 3(a) was reconstructed by use of a FBP algorithm, while the image in Fig. 3(b) was reconstructed by use of the PLS estimator in Eqn. (48) that employed a TV penalty. The image corresponding to the PLS-TV estimate possesses a higher contrast than the images reconstructed by use of the FBP algorithm and reveal a much sharper body vascular tree. These observations are consistent with the fact that optimization-based image reconstruction methods can produce images that possess different physical and statistical characteristics than the ones produced by analytical method. An important consideration in the formulation of an optimization-based image reconstruction method is the choice of the penalty function. To demonstrate how different choices can influence image quality, Fig. 4 displays images of a simple experimental phantom reconstructed by use of Eqn. (48) with two different choices for the penalty function. Figures 4 (a) and (b) correspond to use of quadratic smoothness Laplacian regularization and TV regularization, respectively. The quadratic Laplacian penalty corresponds to the l 2 -norm of a discrete Laplacian operator acting on the image estimate. In this example, use of the TV penalty resulted in images with reduced artifact levels and enhanced spatial resolution as compared to use of the quadratic Laplacian penalty.
All the images shown in Figures 3 and 4 were reconstructed by use of an iterative image reconstruction model that assumed homogeneous acoustic media and thus could not account for the acoustic heterogeneity of the medium. However, knowledge about the spatial distribution of acoustic heterogeneity of the medium can be incorporated into an iterative reconstruction algorithm based on the imaging model described in Eqn. (28). In vivo experimental studies have been conducted to validate the aforementioned iterative algorithm whereby the spatial distribution of the acoustic properties of the medium are accounted for in the reconstruction algorithm. For the in vivo studies, the experimental data were acquired from a mouse trunk using a small animal imaging system that has been described in detail in earlier works . As opposed to the previous studies where the medium was assumed to be homogeneous, in this study the acoustic variation between the background and the bulk mouse tissue were accounted for in the reconstruction algorithm. A reconstructed initial pressure distribution image for fixed constant sound speed value is shown in Fig. 5(a). In the reconstructed image, strong surface and interior vessel structures are observed. While the interior vessel structures appear out of focus, the surface vessels appear to be in focus. When assuming a homogeneous acoustic media, the image reconstruction algorithm could not produce images where both the surface and the interior vessels could be concurrently focused with a single tuned speed of sound value. To account for this, an image reconstruction algorithm that assigned different speed of sound values to the bulk tissue and the background was employed. The PLS-TV algorithm that was based on the imaging model defined in Eqn. (28) was employed to reconstruct the initial pressure distribution. The reconstructed initial pressure distribution when a heterogeneous speed of sound distribution is used is shown in Fig. 5

Computation of adjoint operators and objective function gradients in PACT
The reconstruction approaches described above are general and applicable to a wide range of computational inverse problems. Here, PACT-specific details regarding the implementation of optimization-based image reconstruction methods are reviewed. Specifically, methods for computing the adjoints of some of the D-D forward operators of Section 3 are presented. The explicit computation of the adjoint of the D-D forward operator facilitates the computation of gradients of data fidelity functionals. A necessary step in solving optimization problems of the form given in Eqn. (48) is the computation of the gradient of the data fidelity term with respect to θ. Formally, this quantity is given by where H † denotes the adjoint of the system matrix H. As such, a key step in computing the data fidelity gradient is computing the action of the adjoint operator. As mentioned previously, for many problems of interest, H is too large to store in memory and its action is commonly computed on-the-fly by use of an algorithm. The same is true for H † . Although the weighted l 2 -norm is one of the most commonly employed data fidelity terms, a variety of convex data fidelity terms can be employed for PACT image reconstruction. Some, of the alternative convex data fidelity terms include weighted l 1 -norm, KL-divergence, etc. Furthermore, the weight matrix associated with the weighted l 2 -norm is a design parameter that can be constructed to incorporate a priori information about the uncertainties associated with the imaging model. In cases where an arbitrary data fidelity term is employed, the gradient of the data fidelity term can be computed efficiently through use of the adjoint state method (Norton, 1999;Plessix, 2006). In the adjoint state method, the gradient of the data fidelity term with respect to the model parameters are computed through the use of adjoint state variables. For PACT applications, the adjoint state variables are solutions to the adjoint of the wave equation defined in Eqn.
(2). Hence, the computational complexity associated with the computation of the gradient of an arbitrary data fidelity term through the use of the adjoint state method would at least be on the same order as computing the action of the discrete adjoint operator H † .
Although not discussed below, it should be noted that another option for establishing H † is to employ an adjoint-then-discretize approach (Arridge et al., 2016). In such an approach, the explicit analytical form of the adjoint of the C-C forward operator is established. Subsequently, the C-C adjoint operator is discretized to obtain an estimateĤ † of the D-D adjoint operator H † . From an implementation perspective, such approaches can sometimes be more convenient than computing the adjoint operator corresponding to the assumed D-D forward operator H. However, in cases whereĤ † does not accurately mimic H † , the behavior of iterative algorithms can be altered (Zeng and Gullberg, 2000). The spectral properties of the forward-adjoint operator pair that govern convergence properties and the restrictions associated in choosing such operator pair have been studied (Zeng and Gullberg, 2000;Lou et al., 2019).

Adjoint for interpolation-based forward model for homogeneous medium
Here, the interpolation-based D-D forward model described in Secs. 3.2.1 and 3.4.1 is considered. The matched adjoint operator corresponding to H int in Eqn. (29) is defined as H † int = G † D † H e † , where (·) † denotes the transpose of the corresponding D-D operator (i.e., a matrix). These operators can be computed as ) While the forward operator H int can be efficiently implemented by use of parallel computing techniques , the adjoint operator H † int is difficult to implement efficiently. This is partly due to the fact that when implementing H † in the interpolation-based model, it is difficult to satisfy the principle of "partition on target results rather than sources" (Kirk and Wen-Mei, 2016) for safe and efficient GPU implementation. In addition, evaluating the backward operator G † relies on expensive atomic operations on the GPU , resulting in a 6-10 times longer runtime for H † int than for H int . To address this problem, an approximation of the adjoint operator can be employed that closely approximates the true adjoint but is computationally more efficient to compute (Lou et al., 2019). Such operators, denoted as H † unmatched , are referred to as unmatched adjoint operators. An unmatched adjoint operator that approximates H † int by use of a simplified voxel-driven model can be defined as: where Here,k ≡ ( r s q −rn c 0 )/∆ t and the value of [g]k is approximated by linearly interpolating from its two neighboring samples: with k denoting the integer part ofk. The operator H † unmatched approximates the operation of H † int , but is much faster to compute. It can be seen from Eqns. (55) and (57) that the computation of the exact adjoint G † involves QKN i N j ×O(1) calculations, while the approximated adjoint G † unmatched involves only Q × O(1) calculations. Though it is difficult to express the computational complexity of the matched adjoint operator in strict big-O notation, because N i and N j change with q and k, the product of these two terms can be approximately on the order of 10,000 for 3D OAT studies; in such cases, QKN i N j × O(1) is significantly larger than Q × O(1).
To demonstrate the use of the proposed unmatched adjoint operator, images were reconstructed from experimental whole-body mouse PACT data by use of the PLS estimator in Eqn. (48) with a quadratic penalty term specified by the l 2 norm of the 3D gradient of the object. The 3D PACT dataset was acquired by use of a previously reported small animal imaging system . The system employed an arc-shaped transducer array containing 64-elements that spanned 152 degrees over a circle of radius 65 mm. During scanning, the object was rotated over a full 360 degrees with 180 equispaced tomographic views. A Landweber iterative algorithm (Landweber, 1951) was employed to compute two different PLS estimates; in one case, H † int was employed and in the other H † unmatched was employed. The reconstructed PACT mouse images with regularization parameter λ = 0.001 are shown in Fig. 6(a)  iterations, respectively. In addition, Fig. 6(c) displays the objective function values versus iteration number (top row) and computational time (bottom row). These results demonstrate that, for this example, use of H † unmatched reduced computational times by approximately a factor of 6 while producing images that are visually comparable in image quality to those obtained by use of H † int . However, as mentioned above for the adjoint-then-discretize approach, employing unmatched adjoint operators in iterative image reconstruction algorithms is not without risk. Improperly designed unmatched adjoint operators may lead to algorithm divergence. In addition, the converged solution obtained by use of an unmatched adjoint operator will generally be different from the true solution. That being said, a carefullydesigned unmatched operator can still be of great practical value for accelerating iterative reconstruction for large scale 3D problems (Lou et al., 2019).

Adjoint for full-wave forward model
Here, the adjoint operator corresponding to the full-wave D-D forward model described in Section 3.3.2 for use with known heterogeneous acoustic media is considered. Given the definition of H P S in Eqn. (28), the adjoint of H P S can be computed as where the adjoint of the temporal propagator matrices T k ∈ R 7N K×7N K can be derived from Eqn.
(2) . The adjoint of the forward operator described in Eqn. (58) can be interpreted as an explicit reversal of the computational steps of the forward scheme .

Joint reconstruction of initial pressure distribution and acoustic medium parameters
The forward models and image reconstruction methods surveyed above assume either that (1) the to-be-imaged object and surrounding medium are acoustically homogeneous or (2) the object and coupling medium are acoustically heterogeneous and the spatially variant acoustic parameters are known. However, it remains generally difficult and/or inconvenient to accurately estimate the spatially variant acoustic parameters. Accordingly, the vast majority of reported PACT studies assume that the medium is acoustically homogeneous and tune the values of the spatially invariant acoustic parameters to mitigate the impact of acoustic aberrations on the reconstructed image. Below, a non-conventional approach to PACT image reconstruction is reviewed in which p 0 (r) and the spatially variant distribution of the acoustic parameters are jointly estimated. This is referred to as a joint reconstruction (JR) approach for PACT Jiang et al., 2006). A physical motivation for attempting JR results from the observation that spatial variations in the acoustic parameters induce aberrations in the photoacoustic wavefields; consequently, the measured PACT data encodes some information about the acoustic parameters. Several JR methods have been proposed in which the initial pressure distribution and speed of sound (SOS) distribution are estimated concurrently from PACT data alone Yuan et al., 2006;Chen et al., 2013;Kirsch and Scherzer, 2012;Huang et al., 2016).
Motivated by the earlier numerical investigations, mathematical studies of the JR problem have been conducted (Liu and Uhlmann, 2015;Hickmann, 2010;Kirsch and Scherzer, 2012;Stefanov and Uhlmann, 2012). These studies established that, when neglecting the discrete sampling effects, the initial pressure distribution and the SOS distribution can be uniquely determined from the measured PACT data only under certain restrictive assumptions (Liu and Uhlmann, 2015). In addition, for a linearized version of the JR problem, it was demonstrated that the SOS distribution and p 0 (r) could not be stably recovered from PACT data alone (Stefanov and Uhlmann, 2012). For a linearized version of the forward problem, where the linearization is represents a smoothing operator for p 0 (r) and c(r) 2 , the JR problem is unstable in any scale of Sobolev spaces (Stefanov and Uhlmann, 2012). This suggests that solving the JR problem from PACT data alone where the imaging model is described by Eqn. 4 is an extremely challenging undertaking. Various approaches have been proposed to overcome these challenges. A possible approach to mitigate the problem is to augment the PACT measurements with a sparse set of ultrasound tomography data .
Below, based on the D-D forward model for use with heterogeneous acoustic media, a general formulation of the JR problem in a discrete setting is provided. Subsequently, a low-dimensional parameterized version of the JR problem is reviewed that holds promise for practical applications.

JR algorithm
Here, a JR problem is considered in which the spatially variant SOS distribution of the object and coupling medium are unknown, but all other acoustic parameters are known. Let c ∈ R N ×1 denote a finite-dimensional representation of the SOS distribution, the explicit form of which will depend on the choice of numerical scheme that is adopted to solve the wave equation in Eqn. (2).
Consider the D-D PACT forward model in Eqn. (12) that is re-stated as where H(c) ∈ R QK×N denotes a D-D forward model for use with heterogeneous acoustic media as described in Section 3.3.2. The notation H(c) is employed to make explicit the dependence of forward operator on the unknown discretized SOS distribution c.
The JR problem can be formulated aŝ θ,ĉ = argmin θ≥0,c>0 where R c (c) and R θ (θ) are convex penalty functions whose relative weights are determined by the regularization parameters β 1 and β 2 , respectively. To solve the JR problem in Eqn. (60), an alternating minimization optimization can be employed as described in Algorithm 1 in which two subproblems are solved alternatively until a convergence condition is satisfied (Huang et al., 2016) . The two subproblems can be expressed as Subproblem 2:ĉ = argmin For a fixed c, Subproblem 1 corresponds to the conventional PACT reconstruction problem stated in Eqn. (48), which is a convex optimization problem since the objective function is convex for fixed c. In Subproblem 2, θ is fixed and an estimate of the SOS distribution c is determined. This corresponds to a non-convex problem. When solving Eqn. (62) by use of gradient-based methods, the computation of the gradient of the data fidelity term with respect to the model parameter c can be accomplished by use of the adjoint state method (Norton, 1999;Plessix, 2006). Since Subproblem 2 is non-convex, obtaining its accurate solution represents a challenge. In the subsequent section, some insights obtained from computer-simulation Algorithm 1 Alternating optimization approach to JR of θ and c Input: θ 0 , c 0 , θ , c , β 1 , β 2 . Output:θ,ĉ 1: k = 0, k is the iteration number. 2: while θ < θ F and c < c F do 3: 8: k = k + 1, 9: end while 10:θ ← θ (k) ,ĉ ← c k studies that shed a light on the numerical instability and the non-uniqueness properties of the JR problem will be reviewed.
6.1.1. Numerical instability of JR methods To explore the numerical instability associated with solving Subproblem 2, computer-simulation studies were performed to provide insights into how small perturbations in the assumed θ affects the accuracy of the reconstructed c (Huang et al., 2016). Figures 7(a) and 7(c) display two similar normalized numerical phantoms depicting θ. The RMSE between these phantoms is 0.004. Simulated ideal PACT measurements were computed corresponding to the cases where each of the two θ was paired with a distinct SOS distribution c. The utilized c are not shown but is nearly identical to the reconstructed image shown in Fig. 7(b). It is important to note that the (θ, c) pair in the top row of Fig. 7 produces nearly identical PA data to that produced by the (θ, c) pair in the bottom row. The simulated noiseless normalized pressure data at an arbitrary transducer location produced by the two (θ, c) pairs are shown in Fig. 8. The normalized pressure signals in Fig. 8 are observed to overlap almost completely and the RMSE between the two sets of PA data was 3.2e − 4. Figures 7(b) and 7(d) display the reconstructed estimates of c when the θ specified in Fig. 7(a) and 7(c) was assumed, respectively. These results demonstrate that the problem of reconstructing c for a given θ is ill-posed in the sense that small changes in θ can produce significant changes in the reconstructed estimate of c. This observation is consistent with the theoretical results (Stefanov and Uhlmann, 2012).
Due to the non-convex nature of Subproblem 2, the optimization algorithm can return a local minimum (i.e., an inaccurate JR solution) as opposed to a global minimum (i.e., an accurate JR solution). Even though the estimate of c obtained at the local minimum is inaccurate, an accurate estimate of the initial pressure distribution θ can sometimes be obtained. Various numerical studies have been conducted to show the non-uniqueness properties of the JR algorithm. Figures 9(a) and 9(b) display two numerical phantoms that describe the initial pressure distribution and the speed of sound distribution of a mouse phantom, respectively. The JR algorithm was applied to the forward PACT measurement data generated from the corresponding phantoms. The alternating minimization algorithm described in Algorithm 1 was used to solve the JR algorithm. The results from the JR algorithm are displayed in Fig. 10. From the reconstruction results, one can observe that even though the sound speed distribution is highly inaccurate, the initial pressure distribution can be effectively estimated. This demonstrates that JR algorithms can be employed to improve PACT image quality even if the produced estimates of the sound speed distribution is highly inaccurate.

Parameterized JR
To mitigate the instability of JR, a modified version of the JR problem can be formulated in which p 0 (r) and only a lower dimensional estimate of SOS distribution are estimated Matthews et al., 2018;Poudel et al., 2018). The SOS representation c is typically formed by use of pixel expansion functions. To constrain the JR problem, a lower dimensional parameterized representation of the SOS distribution, c p ∈ R D , can be introduced as where Γ ∈ R N ×D is a binary matrix that maps c p to c. Let I j denote the set of pixel indices in c that corresponds to the j th -parameterized SOS value (i.e., component of c p ), Γ can be defined as The parameterized JR problem is given by  θ,ĉ p = argmin θ≥0,cp>0 Once c p is estimated, an estimate of c can be obtained by use of Eqn. (63). The gradient of the cost function with respect to c p can be related to the gradient with respect to c via the chain rule. When Γ corresponds to a linear mapping, one obtains where the gradient of the cost function with respect to c can be computed by use of the adjoint state method (Plessix, 2006). Subsequently, an algorithm similar to Algorithm 1 can be employed for image reconstruction, where c is replaced by c p . Alternatively, a weighted proximal gradient method has been proposed to solve Eqn. (65) in which the necessary gradients can be computed with only two wave solver runs .
To demonstrate the use of the parameterized JR algorithm, images were reconstructed from experimental mouse data using the PACT imager described earlier . Figure 11 displays zoomed-in regions of images of a live mouse reconstructed by use of an iterative method that employed a constant SOS value of 1.500 mm/µs (subfigure (a)) and a JR method that employed a two-parameter SOS model (subfigure (b)) . To establish the two-parameter SOS model, the outer boundary of the mouse was manually segmented from the estimate of p 0 (r) that was reconstructed using the fixed SOS value. The simulation grids consisted of 2048 × 2048 pixels with a pixel size of 0.05 mm. In the JR method, the initial guess for the parameterized sound speed distribution was 1.48 mm/µs for the background and 1.54 mm/µs for the mouse, while the initial guess for θ was the zero-vector. The JR result demonstrates improvement over the image obtained with the constant SOS. The largest difference can be seen in the rightmost surface vessel, which appears as an arc in the constant sound speed image and a point in the JR image. In addition, several interior vessels are better focused in the JR image.

Other challenges
There are numerous other challenges and opportunities for research related to image reconstruction for practical applications of PACT that are beyond the scope of this article. Some of these are outlined below.

Approximating PACT as a 2D problem
In practice, PACT imaging systems are often designed to image 2D sections through a 3D object. This can be accomplished by employing an array of elevationally focused ultrasound transducers. For example, several groups have developed systems that employ a circular ring array (or part of a ring) that surrounds the 2D section to be imaged Li et al., 2017;Gamelin et al., 2009;Ma et al., 2009;Lin, Hu, Shi, Appleton, Maslov, Li, Zhang and Wang, 2018). In such systems, a 3D object can be imaged on a 2D slice-by-slice basis and the resulting images stacked together along the direction perpendicular to the plane of the array to estimate a 3D reconstructed volume. There are practical advantages to such approaches. From a hardware perspective, 2D systems can be less costly to construct than full 3D systems and may not require mechanical scanning of the transducer array to image a 2D slice. When advanced electronics are employed, 2D systems can yield near real-time data-acquisition; this can result in excellent temporal resolution and spatial resolution that is not strongly compromised by physiological motion. Image reconstruction can also be performed in near real-time (Buehler et al., 2010).
However, from an image reconstruction perspective, there are potential challenges associated with such 2D approaches to PACT associated with the fact that PACT is inherently a 3D technique. If (idealized) focused transducers could be employed in a PACT imager to isolate an infinitesimally thin slice of the object, the spherical Radon transform forward model described in Section 2.2 would reduce to a circular Radon transform . In this case, the reconstruction problem could be treated in 2D. However, the focused piezoelectric transducers employed in practice do not perfectly reject acoustic signals that arrive from outside the plane of interest. Moreover, the focusing properties of such transducers possess a temporal frequency dependence as described in Section 3.4.1. Such factors and the fact that the PA wavefield propagates according to a 3D wave equation result in a mismatch between the measured data and a canonical 2D PACT forward model. Finally, when 2D PACT images corresponding to different parallel slices of the object are stacked together to estimate a 3D reconstructed volume, the resolution is generally severely compromised in the direction perpendicular to the plane of the array. The presence of artifacts in 2D PACT images due to the presence of 3D out-of-plane scatterers can also significantly degrade image quality. Similar issues have been studied in the geophysics community and medical ultrasound computed tomography communities ( mitigate some of these out of plane scattering artifacts by carefully designing the elevational focusing properties of the transducers (Duric et al., 2014;Li et al., 2017).
The discrete forward models and optimization-based image reconstruction methods reviewed in the previous sections can address these issues and potentially improve the accuracy and spatial resolution of images reconstructed by use of ring-array systems. Namely, to compensate for the transducer focusing effects, the SIRs of the transducers can be incorporated into a 3D discrete forward model. In practice, however, it may be challenging to accurately model the SIR for the case where an acoustic lens is employed. By use of a 3D forward model, p 0 (r) within a thin 3D volume can be estimated. In such approaches, parameters such as voxel size must be designed carefully to balance the effects of data-incompleteness and model error.

Spatiotemporal image reconstruction
The majority of PACT reconstruction algorithms assume static imaging conditions; namely, the object of interest is considered to be fixed and independent of time. This assumption, however, is violated in many biomedical applications of PACT (Gamelin et al., 2009;Li et al., 2010;Buehler et al., 2010;Xiang et al., 2013;Lou et al., 2016), where the sought-after initial pressure distribution can depend on time denoted as p 0 (r, t). Spatiotemporal, or dynamic, image reconstruction methods for PACT seek to reconstruct a sequence of images that correspond to a collection of temporal samples. In a dynamic PACT study, the measurement data collected at a particular time point is referred to as a data frame. It is assumed that p 0 (r) remains static during the acquisition of each data frame. This condition can approximately be satisfied if the temporal resolution of the imaging system is sufficiently high, such as when fixed arrays of transducers are employed.
A simple approach to dynamic reconstruction is to simply employ a (static) conventional PACT reconstruction method to reconstruct estimates of p 0 (r, t) corresponding to times t at which the data frames were acquired. Such approaches are referred to as frame-by-frame reconstruction (FBFIR) methods (Li et al., 2010;Buehler et al., 2010;Xiang et al., 2013;Chatni et al., 2012). A limitation of FBFIR methods is that they do not exploit statistical correlations between data frames and are therefore computationally and statistically suboptimal (Wernick et al., 1999;Tsao et al., 2003;Lingala et al., 2011). Below, a few representative examples of spatiotemporal image reconstruction strategies for dynamic PACT that leverage correlations between data frames are reviewed.
Unlike the FBFIR methods, spatiotemporal image reconstruction (STIR) methods for dynamic PACT jointly reconstruct the sequence of object frame estimates by using all of the data frames. One such method that has received attention is the low rank matrix estimation (LRME)-STIR method (Wang, Xia, Li, Wang and Anastasio, 2014;Trémoulhéac et al., 2014). The LRME-STIR method exploits the fact that, for many dynamic PACT applications, the image sequence can be approximately described as a low-rank matrix whose rank is typically much smaller than the number of temporal samples. Hence, the LRME-STIR method consists of applying a data denoising step followed by image reconstruction conducted in the domain of the singular system of the low rank data matrix. The performance of the LRME-STIR method was compared with that of conventional FBFIR method with both computer-simulated and 2D experimental data. The results of the study demonstrated that the LRME-STIR method was not only computationally more efficient but also yielded more accurate dynamic PACT images than conventional FBFIR methods (Wang, Xia, Li, Wang and Anastasio, 2014).
Recently, to improve upon low-rank approximation-based methods, a generalized spatio-temporal modeling framework has been proposed that can encode information about a wide range of dynamics (Lucka et al., 2018). In this approach, the image dynamics are described by an explicit partial differential equation (PDE) model chosen a priori such that it reflects the underlying dynamics. Subsequently, the image sequence and the corresponding motion parameters of the PDE are jointly estimated by minimizing a variational energy cost function. In particular, the motion model employed was based on the popular optical flow equation. The approach was applied successfully to a simulated data as well as to a challenging 3D scenario with experimental data. The results of the study showed that both the simulated and the experimental databased reconstructions showed a significant improvement in image quality over FBFIR reconstructions (Lucka et al., 2018). Furthermore, the reconstructed motion fields, or the parameters of the PDE provided additional information regarding the dynamics. For the case where the temporal dynamics are relatively simple, low-dimensional parametric models can also be effectively be used to constrain the dynamic image reconstruction problem (Chung and Nguyen, 2017).
In general, there remains an important need for the development of reconstruction methods for dynamic PACT that are computationally efficient and can compensate for important physical factors. For example, there have been no reported methods for dynamic PACT that can compensate for the effects of unknown heterogeneities in the acoustic properties of an object.

PACT in the presence of elastic media
Conventional PACT reconstruction methods assume that the to-be-imaged object can be described as a fluid acoustic medium. When only soft biological tissues are present, this assumption is well-accepted. However, calcified tissues or bone are more accurately described as an elastic, or solid, medium. In elastic media, the propagation of both longitudinal pressure waves as well as transverse shear waves are supported, each having a distinct speed of propagation (Fry F. and Barger, 1978). As such, to preserve image quality, PACT reconstruction methods for use in applications in which elastic solids are present need to be predicated upon the elastic wave equation as described below.
Here, tr (·) is the operator that calculates the trace of a matrix and I ∈ R 3×3 is the identity matrix. In Eqn. (69c), it has been assumed that the initial pressure distribution p 0 (r) is compactly supported in a fluid medium where the shear modulus µ(r) = 0. In transcranial PACT (Huang et al., 2012), for example, this corresponds to the situation where the initial photoacoustic wavefield is produced within the soft tissue enclosed by the skull. Equation (69) represents a generalization (but with a diffusive instead of power law attenuation model) of the canonical PACT forward model for fluid media in Eqn.
(2). For the case of layered elastic media, an analytic image reconstruction formula has been established (Schoonover and . More generally, a methodology for establishing a D-D forward model based on Eqn. (69) and the associated matched adjoint operator has been established and investigated within the context of transcranial PACT . The discretization of the elastic wave equation was based on the finite difference in time domain and space, which facilitated large scale parallelization of the forward and adjoint operators using multiple GPUs. By use of the D-D operators, a variety of optimization-based image reconstruction methods can be employed ). An alternative formulation of the D-D forward and adjoint operators that was established by use of the k-space pseudo-spectral method has also been reported (Javaherian and Holman, 2018).
There remains an important need for the development of reconstruction methods for transcranial PACT that are computationally efficient and can compensate for important physical factors. For example, there have been no reported methods for transcranial PACT that can compensate for the effects of unknown acoustic heterogeneities within the skull of a subject.

Summary
In this topical review, a survey of the acoustic inverse problem in PACT has been presented. Specifically, the PACT image reconstruction problem of estimating the photoacoustically-induced initial pressure distribution has been reviewed within the context of modern optimization-based image reconstruction methodologies. Imaging models that relate the measured photoacoustic wavefields to the sought-after initial pressure distribution were described in their continuous and discrete forms. Subsequently, a description of how important physical factors relevant to PACT can be incorporated into the imaging models was also provided.
Details regarding the implementation of optimization-based PACT reconstruction methods that are not widely available in other review papers were presented. For example, descriptions of the implementations of different discrete adjoint operators employed in PACT were provided. Such adjoint operators are a key component of gradient-based algorithms that are employed to solve optimization-based reconstruction problems. Additionally, non-conventional formulations of the PACT image reconstruction problem were also reviewed, in which the acoustic parameters of the medium are concurrently estimated along with the initial pressure distribution.