Analysis of the inverse problem for determining nematic liquid crystal director profiles from optical measurements using singular value decomposition

We use the problem of determining nematic liquid crystal director profiles from optical measurements as an example to illustrate that what is often treated purely as a data fitting problem is really an inverse problem and that useful insights can be obtained by treating it in this way. Specifically we illustrate the analysis of the sufficiency of data and the sensitivity of a solution to measurement errors. We assume a stratified medium where the Berreman method can be used for the optical forward problem and we consider the inverse problem to be the determination of an anisotropic dielectric permittivity tensor from optical data. A numerical singular value decomposition (SVD) analysis reveals that although this inverse problem is severely ill-conditioned it is possible to determine depth-dependent information provided the medium is sufficiently birefringent and that, as one might expect, a larger range of incident angles gives greater information. Analytical solutions of the Berreman equations for general perturbations of an orthorhombic crystal confirm the uniqueness of solution for the linearized problem and give further insights into the severely ill-posed nature of the inverse problem.


Introduction
Many inverse problems occur in experimental physics, although these are often just treated as data fitting problems, where the parameters in a forward model are varied until a good fit to the experimental data is obtained. Examples include optically probing nematic liquid crystal (LC) cells to obtain information about the director profile through the cell (the example used here to illustrate the approach being presented) and determining material structure using x-rays, or neutron scattering. In these problems some initial questions that arise are as follows.
1. Can the required parameters be deduced from the data? 2. How is the solution affected by the amount and type of data collected? 3. How can a solution be efficiently obtained? 4. How does experimental noise in the data limit the information that we can reliably deduce from it?
These questions are typical of many such inverse problems that occur in experimental physics and where a similar analysis to that given here may also be appropriate. Nematic LCs are generally made up of rod-like molecules which show a nematic phase, i.e. a phase where the centres of mass of the molecules have no positional order, but where there is local orientational order. At any point, r, in the material we can define an average direction, which we represent by a unit vector, n(r) and which we call the director. We make no distinction between n(r) and −n(r). In the absence of external influences the LC director is the same everywhere, there is no distortion. In an LC cell the cell walls are treated to provide boundary conditions that the director must satisfy. The LC takes up a configuration which minimizes the free energy, which in its simplest form (with rigid anchoring at the boundaries and no applied  fields) is given in (1). The three terms in the integral represent the allowable distortions for a nematic LC [1]: splay, twist and bend.
These are given in terms of the elastic constants K 11 , K 22 and K 33 . These can be measured experimentally and are typically of the order of 10 −11 Nm −2 .
Optically probing nematic LC cells, although experimentally straightforward, relies on difficult data fitting to give the director profile [2]- [5]. Work at Hewlett Packard Laboratories, Bristol (HPLB) is focused on developing a technique that can efficiently obtain director profiles for such cells to allow different surface treatments and LC materials to be evaluated.
The experimental set-up uses a polarimeter to measure the normalized Stokes parameters [6] of monochromatic light transmitted by a test cell as a function of the incident angle and polarization state of the input beam. The data is then compared to results obtained from an optical model of the system and the model parameters adjusted to give a good fit to the data.
In this paper results are presented in terms of the fundamental model parameters, the elements of the dielectric tensor. Figure 1 shows the experimental set-up used at HPLB. A cylindrical mount enables data to be collected over a wide range of incident angles. For each incident angle normalized Stokes parameters are measured for a number of input polarizations. 4 For the Stokes parameter measurement the z-axis is taken to be in the direction of the beam and the alignment of the polarimeter defines the x-and y-axes. In the experimental set-up used at HPLB the x-axis is horizontal and the y-axis vertical and into the optical bench.

Experimental method
Similar experiments are carried out in the Department of Physics at the University of Exeter. Later in this paper we will apply the same techniques to their experimental setup.

The forward problem
The forward problem is the calculation of the Stokes parameters for a given incident angle and input polarization.
Taking the z-axis in the direction of the beam and defining the x-and y-axes relative to the laboratory system the electric field vector, E, can be written E = (E x , E y ) T (where T denotes the transpose) with E x = a cos(ωt − kz) and E y = b cos(ωt − kz + δ). The Stokes parameters [6] are defined as 5 To calculate the E-vector we use the Berreman 4 × 4 matrix method [7]. This method supersedes the simpler 2 × 2 extended Jones method [8] used elsewhere in that reflected waves are also correctly included.
Maxwell's equations (in SI units) for a dielectric are [9] ∇ · D = 0, Putting (x, y, z) = (x 1 , x 2 , x 3 ) we write each of the terms in the solution as giving We assume that D is linearly related to E and that B is linearly related to H with relative permeability µ = 1 and write 6 5 The Stokes parameters are used to characterize any light beam. When the light is fully polarized (as in our case) S 2 0 = S 2 1 + S 2 2 + S 2 3 and there is some redundancy in the definition. Note also that for our experiment normalized Stokes parameters are used, each value is divided by S 2 0 . 6 In SI units 0 c 2 = 10 7 /4π (by definition) and µ 0 0 c 2 = 1. is the dielectric tensor.

DEUTSCHE PHYSIKALISCHE GESELLSCHAFT
Choosing our co-ordinate system so that the xz-plane is the incident plane (k 2 = 0) and substituting for D and B gives 7 were ξ depends on the incident angle and the refractive index of the input medium (usually air). Eliminating E 3 and H 3 gives an ordinary differential equation for the Berreman where the matrix M is The final value problem for this linear ordinary differential equation (ODE) can be solved to give the linear relationship between 'initial data' X(0) and X(z) = P(z)X(0), where P(z) is a propagation matrix. Given the field vector in the input plane we can calculate the field vector at the output (z = d). We will write P(d) as simply P. We now assume that we have an input wave, X i , a reflected wave, X r and a transmitted wave, X t . We then have This equation can be re-arranged (see appendix) to give X r and X t as functions of the input field vector, X i . For a region where M is independent of z we write each of the terms in the solution as and Berreman's equation (12) then reduces to an eigenvalue equation The eigen-solutions of this equation give the modes that propagate in that region. 7 We use the usual summation convention, i.e. we sum over repeated indices. So, for example, 1k E k = 11 E 1 + 12 E 2 + 13 E 3 .

Linearization and the singular value decomposition (SVD)
In this case the specific questions that we would like to answer are as follows.
1. Can the dielectric tensor through the cell be deduced from the data? 2. How is the solution affected by the range of incident angles and input polarizations used? 3. Given the limited accuracy of the polarimeter how much information can be deduced from the data?
As a first step to answering these questions we have carried out an analysis of the inverse problem by linearization and using SVD. The SVD is a widely used tool in inverse problems as it can be used to study the number of unknown parameters one can expect to reliably recover from a given system of measurements of a specified precision. The truncated SVD (TSVD) gives an explicit inversion of a linearized problem ignoring components in the solution that are not justified by the precision of the data. The forward problem can be written as with S the vector of measured Stokes parameters, and F the corresponding calculated values for the vector of model parameters p. We wish to solve the inverse problem, i.e. given a set of measured Stokes parameters find the model parameters consistent with this data. This inverse problem is ill-posed in the sense that without including additional a priori information widely varying parameter values will fit the data to a given precision. To understand this difficulty we first linearize the problem about some initial guess for the parameters p 0 giving where J is the Jacobian matrix, J ij = ∂S i /∂p j . A solution to the (nonlinear) inverse problem is obtained by iteratively solving the linearized problem and updating the model parameters. Given an initial set of parameters, p 0 we obtain the next estimate by solving the equation for δp. Solution of this equation is ill-conditioned and the results strongly affected by any noise in the data. To investigate this we look at the SVD of the Jacobian, J. Any m × n(m n) real matrix J can be written [10] as where U is an m × m orthogonal matrix (its column vectors, u i , are orthonormal). V is an n × n orthogonal matrix (with column vectors, v i . V T is the transpose of V ). is an m × n matrix, arranged so that the upper n × n block is diagonal with these diagonal elements in descending order, and zeros elsewhere. The m − n rows below are all zeros. We will also write this as = diag(σ 1 , . . . , σ n ). The σ i are called singular values. If rank (J ) = r then the last (n − r) singular values will be zero and we will write = diag(σ 1 , . . . , σ r , 0, . . . , 0). This is the SVD of the matrix. Given this decomposition we can write the least-squares solution of (20) as Small singular values give large contributions to the solution vector and this is where the illconditioning arises. A measure of this ill-conditioning is the ratio of the largest to the smallest singular value, the condition number of the matrix [11]. Any noise on the data carries across to δS and is then amplified by the smaller singular values. Put another way, if we only know the data δS to some limited relative precision ε, we are not justified in recovering components of the parameter vector u T i δp for σ i /σ 1 < ε. The graph of the decay of the singular values can therefore be used to indicate how many independent components of δp we can expect to recover for a given measurement precision. We can see how this varies for different data sets and choices of parametrization. The TSVD is an approximate least squares solution where the sum in (22) is taken only as far as some k < min(m, n) determined by the accuracy of the data.

Numerical results
Because the dielectric tensor is symmetric we need six parameters to specify it completely. For N layers we have 6N parameters in all. For the results given below the Jacobian is calculated for sets of initial parameters (p 0 ) calculated by assuming that the dielectric tensor is derived from an LC layer and that the LC director in this layer is aligned in the following ways (see figure 2).
H The director is parallel to the incident (xz)-plane. It is parallel to the z-axis at z = 0, and is parallel to the x-axis at z = d (d is the thickness of the cell). The tilt of the director varies linearly across the cell. A hybrid aligned nematic (HAN) cell. This configuration closely models the experimental set-up at HPLB. 8  T The director is parallel to the xy-plane and is aligned at −π/4 to the x-axis at z = 0 and twists linearly through the cell so that it is aligned at π/4 to the x-axis at z = d. A twist cell with no pre-tilt.
P The director is parallel to the x-axis throughout the cell. A planar cell with no pre-tilt.
In order to focus on the LC layer of the cell we choose the ordinary and extraordinary refractive indices (n o and n e ) so that their average is equal to the refractive index of the surrounding medium (glass). 9 The LC is further assumed to be lossless so its refractive index is real. Given the refractive indices and the azimuthal (φ) and tilt (θ) angles of the director the elements of the dielectric tensor are given by 11 Figure 3 shows how the normalized singular values, σ i /σ 1 , vary with N, the number of layers in the system. 10 As the number of layers increases so does the number of parameters. However there is no significant change in the number of singular vectors that can usefully be used to construct a solution (those for which σ i /σ 1 < ε). For the other results presented here we will therefore use N = 50. Figure 4 shows how the normalized singular values depend on the initial parameters, p 0 . For σ i /σ 1 > 0.01 we see that the normalized singular values show little dependence on the initial parameter set, although, of course, the corresponding singular vectors may be quite different. For the other results presented below we just use the HAN-like profile. Figure 5 shows how the normalized singular values vary with different incident angle ranges. As the angle range increases we see a marked increase in the number of singular vectors that we  can use to construct a solution. Experimental considerations mean that the angle range cannot be increased above −70 to +70. Figures 6 and 7 show the normalized singular values plotted for different values of the birefringence, n. As n becomes small the number of singular vectors that can be used to construct a solution drops to five, even with a large angle range. Work in photoelastic tomography [12] shows that when the birefringence is weak the ray approximation can be used to reduce the forward problem to a Radon transform. In this case only five singular vectors can be obtained and extra rays passing through all of the layers (as they must in this case) give no extra information. LCs are highly birefringent and the ray approximation cannot be used, the problem cannot be reduced in this way. We can obtain more than five singular vectors and as we see in figure 5 having extra rays, i.e. using a larger angle range in the experiment, does give us extra information.

Theoretical details
While the numerical studies we have presented stand on their own, there are some limited theoretical results that are also illuminating. First we show how the linearization of the Berreman equations, and consequently the Jacobian J, can be derived without recourse to a finite difference approximation perturbing parameter values. We begin with the Berreman system of ODEs in the form for some fixed ξ. Of course h = 0 in Berreman's equations but we will have cause to solve for nonzero h(z) in the course of our linearization. For clarity we omit the ξ subscripts in what follows. We note that there is a Green's function G so that the solution to (24) can be written and we shall write the right most term in operator notation as G[h]. Now we consider a hypothetical perturbation of the material parameters resulting in a change in the Berreman matrix to M ξ + δM ξ . For simplicity we will assume the initial condition X(0) is fixed and, although in our practical situation that is not the case, we will rectify the deficiency below. Taking M to be an initial guess, and h in (24) to be zero, the new electromagnetic field X + δX, with δX(0) = 0, satisfies using (24). We see that this is a differential equation of the same form as (24) but in δX, and in operator form is Using the standard Neumann series for operators we can write 12 DEUTSCHE PHYSIKALISCHE GESELLSCHAFT a series that will converge provided the operator norm ω c GδM < 1, which can be ensured by taking a sufficiently small perturbation in δM. We now see that to first order in δM or returning to the differential form to first order. The final data are obtained by solving this ODE for its final value δX(d), or from the integral In our practical application M depends on the parameters for each layer, but the dependence is a simple algebraic formula that can easily be differentiated, so the partial derivatives of the final value X ξ (d) with respect to each parameter is readily calculated using the chain rule. Let us pause to consider what type of inverse problem we have. ξ depends on the incident angle and enters the formula for M (see (13)) as terms up to second order. If M was simply a linear function of ξ we would have something very close to the inverse Sturm-Liouville problem of recovering the coefficients of an operator from its boundary eigendata. Indeed before the elimination of the z components of the electric and magnetic field we had a 6 × 6 system of second order partial differential equations where the coefficient matrix is linear in ξ. Similar, but not identical inverse spectral problems are treated for example by [13,14]. Just as in typical inverse spectral problems our boundary data is an analytic function of ξ so for exact data a knowledge of X ξ (d) for ξ in some interval is sufficient to determine the data for all ξ by analytic continuation. However analytic continuation is extremely ill-posed so we expect that in numerical examples a wider range of incident angles will result in a better conditioned system.
We can explore this phenomenon in a little more depth, and provide some insight into the numerical results, by considering perturbation about a homogeneous M. In this case we have an explicit formula for the Green's function and find that the solution for (24) is As for non-lossy materials M is real, we can consider (34) as a generalized Fourier transform of δMX. The successful recovery of the unknown spatially varying coefficients in δM relies on knowing their moments with respect to a sufficiently large range of functions. Here ξ plays a role analogous to a frequency variable and it is reasonable to expect that a wider range of incident angles would yield more information. The eigenvalues q i of M, are typically distinct and for non-dissipative media are real. Let U be the matrix of eigenvectors of M, and Q = diag(q i ) so that (34) becomes We note that exp(− iω c Qz)A exp( iω c Qz)has elements e i(q i −q j )z A ij . As we know all possible pairs of initial and final data X(0), X(d), we know the linear response map T defined by δX(d) = TX(0). From this data, and the known initial orthorhombic we also know the matrix and so where theˆdenotes the Fourier transform and A is considered to be extended by zero outside the interval [0, d].
Here the q i are functions of ξ and in general varying ξ over an interval will vary q i − q j over an interval, for i = j. We see therefore that this data, over a variety of incident angles gives us some information about the deviation of the permittivity tensor from our initial constant assumption, in terms of the deviation of the measurements from those we could calculate for the constant case. If it were not for the awkward fact that U is also a function of ξ, we could invoke the Paley-Wiener theorem that says that the Fourier transform of a compactly supported function is entire. Knowing an entire function on an interval is enough to determine the Taylor series, and hence the function everywhere. However such analytic continuation is severely ill-posed, and we can expect our problem to be as well. In simple cases we can expect to calculate U explicitly and come to a definite conclusion about the sufficiency of data for the linearized problem, and an example of this is given in the next section.

Orthorhombic case
Following Berreman, we attempt to understand one of the simplest cases. We consider a general perturbation in permittivity tensor δ about the permittivity for an orthorhombic crystal with principle axes aligned with the coordinate axes. As before we take the incident plane to be the xz-plane. and then consider the coefficient of each δ ij in U −1 δMU, noting that we will be able to obtain Fourier data for off-diagonal terms: We see that for all except δ 13 there are nontrivial off diagonal components so that for data using incident light in this plane for an interval of incident angles the Fourier transforms of all other δ ij are determined for some interval of frequencies, hence by analytic continuation the perturbation is determined uniquely everywhere. Although of course their recovery in the presence of noise will be unstable. Note now that by rotating the plane of incidence through a right angle, the x-and y-axis swap roles and we can with the data for these two planes recover all the perturbed coefficients. Now let us look in a little more detail at the off diagonal terms above. For each of the identifiable δ ij we recover P ξδij (q i (ξ) − q j (ξ)) where P ξ is either a known constant or a known multiple of ξ or q ±1 k or ξ/q k for k = 1 or 3. In any case these are Fourier integral operators applied to δ ij . One can expect different spectral information on δ ij from a term appearing in the 1 2 or 3 4 position from those in the 1 3 and 2 4 positions. In the first case q 1 − q 2 = 2q 1 , q 3 − q 4 = 2q 3 , and as dq i /dξ = 0 at ξ = 0 a small range of incident angle about zero gives only a small spectral window. By contrast q 1 and q 3 will be close for a weakly anisotropic medium, indeed for giving a wider spectral window for a range of incident angles near zero.  Figure 8. The experimental set-up used at Exeter University. The photo detector and polarizer for the reflected beam rotate at twice the speed of the cell.

An alternative approach
Researchers in the Department of Physics at the University of Exeter have been working on similar problems for some time [2]- [5]. Figure 8 shows the experimental set-up that they use. Instead of a cylindrical mount, Exeter use prisms to allow high angles of incidence to be achieved, although this does restrict the incident angle range over which measurements can be taken. For each incident angle s and p polarized light is input and the reflected and transmitted intensities for s and p polarizations are measured. 11 This gives eight measurements for each incident angle: R ss , R sp , R ps , R pp , T ss , T sp , T ps and T pp . We have carried out a similar analysis for the Exeter experiment, assuming that the azimuthal angle for the cell is 45 • and that the range of incident angles used is the same as in a typical experiment (60-72 • ). The results for the different test director profiles are shown in figure 9. The distribution of singular values is very similar and so their problem is equally ill-posed. Any methods developed to facilitate the solution of our inverse problem should work in this case also.

Conclusions
Our numerical investigation shows that a dielectric tensor depending on depth can be recovered from the experimental measurements used by both the HPLB and Exeter groups. While there is no rank deficiency indicating a unique solution for exact data, this is a severely ill-posed problem so some regularization will be needed to recover the dielectric tensor field from experimental data. The number of parameters that can be recovered for a given data error increases as the range of incident angle is increased. In a special case of a general perturbation of an orthorhombic material, the linearized inverse problem has a unique solution, provided two planes of incidence are used. Further study is needed to apply the techniques of inverse spectral theory for systems of ordinary differential equations to this problem in an attempt to understand the question of sufficiency of data more completely.
Some simplification of the inverse problem can be obtained by assuming that the LC is uniaxial (in inverse problem terms this is the inclusion of prior information). We can also use the Euler-Lagrange equations obtained by minimizing the free energy (1) to constrain any solution.
Results for this uniaxial case, efficient methods of solving the inverse problem using the Euler-Lagrange equations and results using experimental data will be reported separately. A more general problem where we remove the hypothesis that the medium is stratified has been treated numerically using the finite element method for forward solution and is reported in [15]. The question of sufficiency of data for this more general case is still an open problem. Similarly, assuming that there is only an outgoing transmitted wave, we can write We then have