The path of light

Rafael G González-Acuña; Héctor A Chaparro-Romo

doi:10.1088/978-0-7503-4705-1ch1

Optical Path Theory

Fundamentals to freeform adaptive optics

Chapter 1 • Free to read

The path of light

Rafael G González-Acuña and Héctor A Chaparro-Romo
Published June 2022 • Copyright © IOP Publishing Ltd 2022
Pages 1-1 to 1-27

Download PDF chapter

Download ePub chapter

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

Download complete PDF book, the ePub book or the Kindle book

Chapter navigation

Preview

Export citation and abstract

BibTeX RIS

Permissions

Get permission to re-use this book

Abstract

In this chapter, we cover the preliminary concepts necessary to understand the general formula for adaptive optics mirrors. These concepts are Fermat's principle, aberration theory, stigmatism, adaptive optics, and null lenses.

1.1. Purpose and introduction to this treatise

This treatise was written in order to explain a recently-published equation in greater detail; this equation is the general formula for adaptive optics mirrors, which was published in January 2021 and highlighted with the 'editor's pick' distinction. As a result of this publication, the authors decided to write a treatise in which an explanation of the abovementioned formula could be set out in more detail. Thus, a book on the topic of the published article would be quite useful. The book would have to address the introductory topics, the solution, the implications, and the potential applications. We decided to do all this because we agreed that the abovementioned equation is the analytical closed-form solution to the fundamental problem in adaptive optics.

We have called this project 'optical path theory' because the derivation of this enigmatic equation was only necessary to study the path of light. In other words, we only use Fermat's principle and its implications. For us, Fermat's principle is the only axiom of optical path theory and all its implications are the theorems. Thus, the aforementioned equation is just one more theorem derived from Fermat's principle. This is something that we explore in the following chapters.

This treatise is written in three parts: the first is the introduction, consisting of this single chapter. In this chapter, we will discuss the issues necessary to understand the equation this book is about. In addition, this chapter will study other concepts related to possible applications, such as adaptive optics and null lenses. These concepts will only be covered in a superficial way and the reader is recommended to check the bibliography of this chapter on these topics for more in-depth and detailed reading.

The second part deals with the abovementioned equation in two dimensions and its direct implications for aspherical design. The third part consists of the generalization of the chapters of the second part to three dimensions, i.e. we move from aspherical designs to freeform designs.

A characteristic of this treatise is that from chapter 2, each equation is directly explained step by step and the code is later presented in the Mathematica coding language, ready to be executed, so that the reader can easily reproduce the content of this book. Mathematica, officially called Wolfram Mathematica, is a package containing several built-in libraries that have applications in symbolic computing, matrices, plotting, implementing algorithms, creating user interfaces, and interacting with programs written in other programming languages.

1.2. The optical path and Fermat's principle

This treatise studies optics using the paradigm of geometric optics. Geometric optics is the study of the behaviour of light according to geometry. How would light relate to geometry? Well, everything makes sense if and only if you have a postulate that justifies it. That postulate is Fermat's principle. Fermat's principle is the only axiom of the whole theory that we are going to develop.

An axiom is a truth in itself that does not need to be proved. What we ask ourselves is: if this is true, what else is true? What are the implications if Fermat's principle is true? The most important implication is that under this paradigm, the behaviour of light tends to be geometric in nature. We mean that the propagation of light tends to be geometric in nature.

Rays are typically used to describe the propagation of light in geometric optics. Rays are useful abstractions that approximate the paths along which light propagates under certain conditions. The first premise of the propagation of light in straight lines is only valid when light travels within a homogeneous medium. A homogeneous medium has a constant refractive index, which it is a dimensionless number that describes how fast light travels through it. The refractive index is defined by

$\begin{eqnarray}&&n=\displaystyle \frac{c}{\mathit{\unicode[Book Antiqua]{x76}}}\end{eqnarray} \tag{ 1.1 }$

where c is the speed of light in a vacuum and $\mathit{\unicode[Book Antiqua]{x76}}$ is the phase velocity of light in the medium. In this book, we only work with homogeneous mediums. The premise of 'in straight lines' comes, in fact, from Fermat's principle.

According to Fermat's principle, a ray of light can pass between two points in the shortest possible time when travelling between them.

Fermat's principle can be expressed as follows:

The optical length of the path followed by light between two fixed different points is the global minimum. The optical length is the physical length multiplied by the refractive index of the medium.

It is important to remark on the word global minimum. Generally, the global minimum refers to the smallest value of a set. Consider for a moment that we have a set whose elements are all those possible optical paths between two points. There is an optical length for each of these paths. According to Fermat's principle, the only path in our set that is physically valid is the one that has the shortest optical path length (OPL). The OPL is the physical path multiplied by the refractive index of the medium.

Fermat's principle can be mathematically described as the amount of time it takes for a ray point to travel between two points, T, as follows:

$\begin{eqnarray}&&T=\displaystyle {\int }_{{t}_{0}}^{{t}_{1}}{dt}=\displaystyle \frac{1}{c}\displaystyle {\int }_{{t}_{0}}^{{t}_{1}}\displaystyle \frac{c}{\mathit{\unicode[Book Antiqua]{x76}}}\displaystyle \frac{{ds}}{{dt}}{dt}=\displaystyle \frac{1}{c}\displaystyle {\int }_{A}^{B}{nds},\end{eqnarray} \tag{ 1.2 }$

where the points are A and B, ds is an infinitesimal displacement along the ray, $\mathit{\unicode[Book Antiqua]{x76}}={ds}/{dt}$ is the speed of light in the medium, $n=c/\mathit{\unicode[Book Antiqua]{x76}}$ is the refractive index of that medium, t₀ is the starting time (the ray is in A), and t₁ is the arrival time at point B. The optical path length of a ray from A to B is given by the following integral:

$\begin{eqnarray}&&S=\displaystyle {\int }_{A}^{B}{nds},\end{eqnarray} \tag{ 1.3 }$

which is related to the travel time by $S={cT}$ . The optical path length is a purely geometrical quantity, since time is not considered in its calculation. The global minimum of the light's travel time between two points A and B is equivalent to the global minimum of the optical path length between A and B.

Fermat's principle not only predicts the straight path of the light across a homogeneous medium but also describes the change of the light's direction when it touches a mirror or passes through another medium. The first phenomenon is called reflection and the second is called refraction. We will explore them in the following subsections.

1.3. The law of reflection

Reflection is the change in the direction of a ray at an interface between two different media, so that the ray returns into the medium from which it originated. This media is usually a polished surface or a mirror.

When light is reflected, it behaves according to Fermat's principle. Please see figure 1.1; a light beam is shown that is incident to a surface. Reflection occurs at the surface. We can see that the ray, if it originates at ${x}_{1},{y}_{1}$ , travels in a straight line to the point where it is reflected in the surface and then reaches point ${x}_{2},{y}_{2}$ . The time it takes for light to travel from point ${x}_{1},{y}_{1}$ to point ${x}_{2},{y}_{2}$ can be written as follows:

$\begin{eqnarray}&&T(x)=\displaystyle \frac{1}{c}\left[\sqrt{{\left(x-{x}_{1}\right)}^{2}+{y}_{1}^{2}}+\sqrt{{\left({x}_{2}-x\right)}^{2}+{y}_{2}^{2}}\right],\end{eqnarray} \tag{ 1.4 }$

where x is the point on the plane in the horizontal direction where the light strikes. By computing $T(x)$ , we find that light moves in straight lines in places with constant refractive indices, just as Fermat's principle predicts.

**Figure 1.1.** Schematic diagram of ray tracing for the phenomenon of light ray reflection. A ray of light falls on the reflective surface at the point $(x,0)$ and is deflected at the same angle as the one at which it arrived; both rays coexist within the same optical medium.
Download figure:
Standard image High-resolution image

**Figure 1.1.** Schematic diagram of ray tracing for the phenomenon of light ray reflection. A ray of light falls on the reflective surface at the point $(x,0)$ and is deflected at the same angle as the one at which it arrived; both rays coexist within the same optical medium.
Download figure:
Standard image High-resolution image

However, we are unaware of how the light reflects. There is a critical point at $(x,0)$ . The slopes of the straight lines are determined by the value of x. In accordance with Fermat's principle, light moves within the shortest amount of time. Therefore, we can derive the following with respect to x and equal to zero:

$\begin{eqnarray}&&\displaystyle \frac{\partial T(x)}{\partial x}=0,\end{eqnarray} \tag{ 1.5 }$

and by computing the derivative

$\begin{eqnarray}&&\displaystyle \frac{x-{x}_{1}}{\sqrt{{\left(x-{x}_{1}\right)}^{2}+{y}_{1}^{2}}}=\displaystyle \frac{{x}_{2}-x}{\sqrt{{\left({x}_{2}-x\right)}^{2}+{y}_{2}^{2}}},\end{eqnarray} \tag{ 1.6 }$

from figure 1.1, we can relate the angles ${\theta }_{1}$ and ${\theta }_{2}$ using the last expression:

$\begin{eqnarray}&&\sin \,{\theta }_{1}=\sin \,{\theta }_{2},\end{eqnarray} \tag{ 1.7 }$

to finally obtain

$\begin{eqnarray}&&\boxed{{\theta }_{1}={\theta }_{2}}.\end{eqnarray} \tag{ 1.8 }$

A simple representation of reflection in a flat mirror can be seen in figure 1.1, where the incident angle ${\theta }_{1}$ is equal to the reflected angle ${\theta }_{2}$ , which leads to ${\theta }_{1}={\theta }_{2}$ . In angular notation, this is Snell's law for reflection.

By computing the second derivative, we can show that this is a real minimum:

$\begin{eqnarray}\begin{array}{rcl}c\displaystyle \frac{{\partial }^{2}T}{\partial {x}^{2}} & \,=\, & \displaystyle \frac{1-{\sin }^{2}\,{\theta }_{1}}{\sqrt{{\left(x-{x}_{1}\right)}^{2}+{y}_{1}^{2}}}+\displaystyle \frac{1-{\sin }^{2}\,{\theta }_{2}}{\sqrt{{\left({x}_{2}-x\right)}^{2}+{y}_{2}^{2}}}\\ & = & (1-{\sin }^{2}\,{\theta }_{2})\,\left[\displaystyle \frac{1}{\sqrt{{\left(x-{x}_{1}\right)}^{2}+{y}_{1}^{2}}}+\displaystyle \frac{1}{\sqrt{{\left({x}_{2}-x\right)}^{2}+{y}_{2}^{2}}}\right].\end{array}\end{eqnarray} \tag{ 1.9 }$

Since $\sin \,{\theta }_{2}^{2}\leqslant 1$ is positive for all values of x, then equation (1.9) is an absolute minimum.

1.4. The law of refraction

Light is refracted when passing through an area with a changing refractive index. A simple case of refraction occurs at a point where there is an interface between a uniform medium with an index of refraction of n₁ and another medium with an index of refraction of n₂. Snell's Law, also known as the law of refraction, describes how light is deflected in such a situation. This is a result of Fermat's principle.

Refraction is illustrated in figure 1.2, in which two transparent media with equal refractive indexes interact at a flat surface. Based on the figure, we can deduce that light travels as follows:

$\begin{eqnarray}&&R(x)={n}_{1}\sqrt{{d}_{1}^{2}+{x}^{2}}+{n}_{2}\sqrt{{d}_{2}^{2}+{\left(L-x\right)}^{2}}.\end{eqnarray} \tag{ 1.10 }$

**Figure 1.2.** Schematic diagram of ray tracing for the phenomenon of light ray refraction. A ray of light falls on the refractive surface and is deflected at a different angle than the one it exhibited on entry; the participating rays are in different optical media.
Download figure:
Standard image High-resolution image

The height of the light ray at its initial position is defined as d₁. In a coordinate system, x is the horizontal distance between the light ray's origin and its initial position. In the diagram below, L represents the horizontal distance between the starting and finishing positions. Finally, in the diagram above, d₂ is the final height of the light ray.

By differentiating equation (1.10) with respect to x, we can find the path that the light covers in the shortest amount of time.

$\begin{eqnarray}&&\displaystyle \frac{{dR}(x)}{{dx}}={n}_{1}\displaystyle \frac{x}{\sqrt{{d}_{1}^{2}+{x}^{2}}}-{n}_{2}\displaystyle \frac{(L-x)}{\sqrt{{d}_{2}^{2}+{\left(L-x\right)}^{2}}}.\end{eqnarray} \tag{ 1.11 }$

As we can see in figure 1.2, $\sin \,{\theta }_{1}$ is equal to

$\begin{eqnarray}&&\sin \,{\theta }_{1}=\displaystyle \frac{x}{\sqrt{{d}_{1}^{2}+{x}^{2}}}.\end{eqnarray} \tag{ 1.12 }$

As we can also see from the same figure, $\sin \,{\theta }_{2}$ is

$\begin{eqnarray}&&\sin \,{\theta }_{2}=\displaystyle \frac{L-x}{\sqrt{{d}_{2}^{2}+{\left(L-x\right)}^{2}}}.\end{eqnarray} \tag{ 1.13 }$

By replacing equation (1.11) and (1.12) in equation, we obtain

$\begin{eqnarray}&&\boxed{{n}_{1}\,\sin \,{\theta }_{1}={n}_{2}\,\sin \,{\theta }_{2}.}\end{eqnarray} \tag{ 1.14 }$

If the light rays are moving through a homogeneous medium, their paths are straight. A change of homogeneous medium will result in refraction (deviation) of the light. This is expressed by the equation above. In angular notation, this is Snell's law for refraction.

1.5. The vector form of Snell's law

In vector form, Snell's law can be expressed in a more manageable manner. To obtain Snell's law in vector form, let ${\vec{{\boldsymbol{a}}}}_{1}$ be the incident unit vector, let ${\vec{{\boldsymbol{a}}}}_{2}$ be the refracted unit vector, and let $\vec{{\boldsymbol{n}}}$ be the normal unit vector of the surface; ${\theta }_{1}$ is the incident angle with respect to $\vec{{\boldsymbol{n}}}$ and ${\theta }_{2}$ is the refracted angle with respect to $-\vec{{\boldsymbol{n}}}$ .

From figure 1.2 we can now express the refracted vector, which is given by

$\begin{eqnarray}&&{\vec{{\bf{a}}}}_{2}=\displaystyle \frac{{n}_{1}}{{n}_{2}}{\vec{{\bf{a}}}}_{1}+\left({\vec{{\bf{a}}}}_{2}\cdot \vec{{\bf{n}}}-\displaystyle \frac{{n}_{1}}{{n}_{2}}{\vec{{\bf{a}}}}_{1}\cdot \vec{{\bf{n}}}\right)\vec{{\bf{n}}},\end{eqnarray} \tag{ 1.15 }$

and using the definition of the dot product between vectors,

$\begin{eqnarray}&&{\vec{{\bf{a}}}}_{2}=\displaystyle \frac{{n}_{1}}{{n}_{2}}{\vec{{\bf{a}}}}_{1}+\left(\cos \,{\theta }_{2}-\displaystyle \frac{{n}_{1}}{{n}_{2}}\,\cos \,{\theta }_{1}\right)\vec{{\bf{n}}},\end{eqnarray} \tag{ 1.16 }$

equation (1.16) can also expressed as

$\begin{eqnarray}&&{\vec{{\boldsymbol{a}}}}_{2}=\displaystyle \frac{{n}_{1}}{{n}_{2}}{\vec{{\boldsymbol{a}}}}_{1}+\left(\displaystyle \frac{{n}_{1}}{{n}_{2}}| \vec{{\boldsymbol{n}}}| | {\vec{{\boldsymbol{a}}}}_{1}| \cos \,{\theta }_{1}-| \vec{{\boldsymbol{n}}}| | {\vec{{\boldsymbol{a}}}}_{2}| \cos \,{\theta }_{2}\right)\vec{{\boldsymbol{n}}}.\end{eqnarray} \tag{ 1.17 }$

From figure 1.2 we deduce that the vertical component of the refracted blue ray ${\vec{{\bf{a}}}}_{2}$ is,

$\begin{eqnarray}&&\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{2}=| \vec{{\bf{n}}}| | {\vec{{\bf{a}}}}_{2}| \cos \,{\theta }_{2}.\end{eqnarray} \tag{ 1.18 }$

Remember that we are working with unit vectors, so $| \vec{{\bf{n}}}| =| {\vec{{\bf{a}}}}_{2}| \equiv 1$ ,

$\begin{eqnarray}&&\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{2}=\cos \,{\theta }_{2},\end{eqnarray} \tag{ 1.19 }$

and squaring both sides, we obtain

$\begin{eqnarray}&&{\left(\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{2}\right)}^{2}=\cos \,{\theta }_{2}^{2}=1-\sin \,{\theta }_{2}^{2}=1-\displaystyle \frac{{n}_{1}^{2}}{{n}_{2}^{2}}\,\sin \,{\theta }_{1}^{2}.\end{eqnarray} \tag{ 1.20 }$

From the definition of the cross-product, we know that

$\begin{eqnarray}&&\vec{{\bf{n}}}\times {\vec{{\bf{a}}}}_{1}=| \vec{{\bf{n}}}| | {\vec{{\bf{a}}}}_{1}| \sin \,{\theta }_{1}({\vec{{\bf{e}}}}_{1}\times {\vec{{\bf{e}}}}_{2}),\end{eqnarray} \tag{ 1.21 }$

and also that $| \vec{{\bf{n}}}| =| {\vec{{\bf{a}}}}_{1}| =1$ .

$\begin{eqnarray}&&\vec{{\bf{n}}}\times {\vec{{\bf{a}}}}_{1}=\sin \,{\theta }_{1};\end{eqnarray} \tag{ 1.22 }$

squaring both sides in the last expression, we obtain

$\begin{eqnarray}&&{\left(\vec{{\bf{n}}}\times {\vec{{\bf{a}}}}_{1}\right)}^{2}=\sin \,{\theta }_{1}^{2}.\end{eqnarray} \tag{ 1.23 }$

By replacing the above expression in ${\left(-\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{2}\right)}^{2}\,=1-({n}_{1}^{2}/{n}_{2}^{2})\sin \,{\theta }_{1}^{2}$ , we get

$\begin{eqnarray}&&\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{2}=\sqrt{1-\displaystyle \frac{{n}_{1}^{2}}{{n}_{2}^{2}}{\left(\vec{{\bf{n}}}\times {\vec{{\bf{a}}}}_{1}\right)}^{2}}.\end{eqnarray} \tag{ 1.24 }$

On the other hand, the other terms of equation (1.17) are

$\begin{eqnarray}&&\displaystyle \frac{{n}_{1}}{{n}_{2}}[{\vec{{\bf{a}}}}_{1}+| \vec{{\bf{n}}}| | {\vec{{\bf{a}}}}_{1}| \cos \,{\theta }_{1}\vec{{\bf{n}}}]=\displaystyle \frac{{n}_{1}}{{n}_{2}}[{\vec{{\bf{a}}}}_{1}-(\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{1})\vec{{\bf{n}}}].\end{eqnarray} \tag{ 1.25 }$

As a consequence, Snell's law in vector form is

$\begin{eqnarray}&&\boxed{{\vec{{\bf{a}}}}_{2}=\displaystyle \frac{{n}_{1}}{{n}_{2}}[{\vec{{\bf{a}}}}_{1}-(\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{1})\vec{{\bf{n}}}]-\vec{{\bf{n}}}\sqrt{1-\displaystyle \frac{{n}_{1}^{2}}{{n}_{2}^{2}}{\left(\vec{{\bf{n}}}\times {\vec{{\bf{a}}}}_{1}\right)}^{2}}}.\end{eqnarray} \tag{ 1.26 }$

Equation (1.26) will be very useful and will be applied in the following chapters.

The vector notation form of Snell's law for reflection is equation (1.27); we set $\tfrac{{n}_{1}}{{n}_{2}}=-1$ , thus

$\begin{eqnarray}&&\boxed{{\vec{{\bf{a}}}}_{2}=-[{\vec{{\bf{a}}}}_{1}-(\vec{{\bf{n}}}\cdot {\vec{{\bf{a}}}}_{1})\vec{{\bf{n}}}]-\vec{{\bf{n}}}\sqrt{1-{\left(\vec{{\bf{n}}}\times {\vec{{\bf{a}}}}_{1}\right)}^{2}}.}\end{eqnarray} \tag{ 1.27 }$

1.6. The wavefront and the Malus–Dupin theorem

Light follows the shortest path, according to Fermat's principle. One can think of the refractive index as a measurement of a material's speed of light. Therefore, the incident and refracted rays are related equally in angular terms, according to Fermat's law. Although Snell's law and Fermat's principle share some information in common, they are not the same. Snell's law only relates to the angles at which light passes from one medium to another and is an implication of Fermat's principle.

An inherent property of geometric optics that connects it to wave optics is the Malus–Dupin theorem. According to the Malus–Dupin theorem, if one follows the same optical paths along all the rays emitted from a source, then there is a surface normal to all those rays. 'Wavefront' is the name given to the surface perpendicular to the rays. Fermat's principle provides proof of the Malus–Dupin theorem. The Malus–Dupin theorem holds, no matter how many reflections or refractions a ray may experience en route to its destination.

Two infinitesimally separate paths are taken, $[{AB}]$ and $[{AB}^{\prime} ]$ , where A is the focus and B and $B^{\prime}$ are the arrival points separated by equal optical paths. As a result, we designate the following optical paths:

$\begin{eqnarray}&&{L}_{B}=\displaystyle {\int }_{A}^{B}n(\vec{{\bf{r}}}){ds}\end{eqnarray} \tag{ 1.28 }$

and

$\begin{eqnarray}&&{L}_{B^{\prime} }=\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}^{\prime} ){ds}^{\prime} ,\end{eqnarray} \tag{ 1.29 }$

where $\vec{{\bf{r}}}$ and $\vec{{\bf{r}}}^{\prime}$ are the respective positional vectors, ds and ${ds}^{\prime}$ are the respective differentials of space, and $n(\vec{{\bf{r}}})$ is the refractive index. In this case, we admit that the refractive index can be differentiable. We also consider the following relationship: $\mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow}=\vec{{\bf{r}}}+\varepsilon \vec{{\bf{b}}}$ , in which $\vec{{\bf{dr}}}^{\prime} =\vec{{\bf{dr}}}+\varepsilon \vec{{\bf{db}}}$ is true if and only if $\varepsilon \approx 0$ . See figure 1.3.

**Figure 1.3.** Schematic diagram of the Malus–Dupin theorem, an abstraction of the relationship between the vectors ${\bf{r}}$ and ${\bf{r}}^{\prime}$ that describes the relationship between the propagation of the wave and the light rays.
Download figure:
Standard image High-resolution image

**Figure 1.3.** Schematic diagram of the Malus–Dupin theorem, an abstraction of the relationship between the vectors ${\bf{r}}$ and ${\bf{r}}^{\prime}$ that describes the relationship between the propagation of the wave and the light rays.
Download figure:
Standard image High-resolution image

Our derivation is based on the first-order Taylor series,

$\begin{eqnarray}&&f(\vec{{\bf{r}}})=f({\vec{{\bf{r}}}}_{0})+{\rm{\nabla }}f({\vec{{\bf{r}}}}_{0})\cdot (\vec{{\bf{r}}}-{\vec{{\bf{r}}}}_{0}).\end{eqnarray} \tag{ 1.30 }$

We can obtain the refractive index in terms of a first-order Taylor series development,

$\begin{eqnarray}&&n(\mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow})=n(\vec{{\bf{r}}})+{\rm{\nabla }}n(\vec{{\bf{r}}})\cdot (\mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow}-\vec{{\bf{r}}})=n(\vec{{\bf{r}}})+{\rm{\nabla }}n(\vec{{\bf{r}}})\cdot \varepsilon \vec{{\bf{b}}}.\end{eqnarray} \tag{ 1.31 }$

We do the same for the positional vector:

$\begin{eqnarray}&&| \mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow}| =| \vec{{\bf{r}}}| +{\rm{\nabla }}| \vec{{\bf{r}}}| \cdot (\mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow}-\vec{{\bf{r}}})\end{eqnarray} \tag{ 1.32 }$

$\begin{eqnarray}&&=\,| \vec{{\bf{r}}}| +\displaystyle \frac{\partial }{\partial r}r\vec{{\bf{u}}}\cdot (\vec{{\bf{r}}}^{\prime} -\vec{{\bf{r}}})\end{eqnarray} \tag{ 1.33 }$

$\begin{eqnarray}&&=\,| \vec{{\bf{r}}}| +\vec{{\bf{u}}}\cdot (\mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow}-\vec{{\bf{r}}})\end{eqnarray} \tag{ 1.34 }$

$\begin{eqnarray}&&=\,| \vec{{\bf{r}}}| +\vec{{\bf{u}}}\cdot \mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow}-\vec{{\bf{u}}}\cdot \vec{{\bf{r}}}\end{eqnarray} \tag{ 1.35 }$

$\begin{eqnarray}&&=\,| \vec{{\bf{r}}}| +\vec{{\bf{u}}}\cdot \vec{{\bf{r}}}^{\prime} -| \vec{{\bf{r}}}| \end{eqnarray} \tag{ 1.36 }$

$\begin{eqnarray}&&=\,\vec{{\bf{u}}}\cdot \mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow},\end{eqnarray} \tag{ 1.37 }$

where $\vec{{\bf{r}}}=\vec{{\bf{u}}}$ , so $\vec{{\bf{u}}}$ is the vector in the direction of the ray. Then, for the differential ${ds}^{\prime}$ ,

$\begin{eqnarray}&&{ds}^{\prime} =| \vec{{\bf{dr}}}^{\prime} | =\vec{{\bf{u}}}\cdot \vec{{\bf{dr}}}^{\prime} =\vec{{\bf{u}}}\cdot \vec{{\bf{dr}}}+\vec{{\bf{u}}}\cdot \varepsilon \vec{{\bf{db}}}={ds}+\varepsilon \vec{{\bf{u}}}\cdot \vec{{\bf{db}}}.\end{eqnarray} \tag{ 1.38 }$

The last expression is in equation (1.29),

$\begin{eqnarray}&&{L}_{B^{\prime} }=\displaystyle {\int }_{A}^{B^{\prime} }n(\mathop{{\bf{r}}^{\prime} }\limits^{\longrightarrow}){ds}^{\prime} =\displaystyle {\int }_{A}^{B^{\prime} }[n(\vec{{\bf{r}}})+{\rm{\nabla }}n\cdot \varepsilon \vec{{\bf{b}}}]({ds}+\varepsilon \vec{{\bf{u}}}\cdot \vec{{\bf{db}}});\end{eqnarray} \tag{ 1.39 }$

expanding the above equation, we obtain

$\begin{eqnarray}&&{L}_{B^{\prime} }=\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}){ds}+\displaystyle {\int }_{A}^{B^{\prime} }{\rm{\nabla }}n\cdot \varepsilon \vec{{\bf{db}}}+\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}})\varepsilon \vec{{\bf{u}}}\cdot \vec{{\bf{db}}}+\displaystyle {\int }_{A}^{B^{\prime} }{\rm{\nabla }}n\cdot \varepsilon \vec{{\bf{b}}}\varepsilon \vec{{\bf{u}}}\cdot \vec{{\bf{db}}}.\end{eqnarray} \tag{ 1.40 }$

In the last expression we set ${\varepsilon }^{2}\to 0$ , because we considered that $\varepsilon \approx 0$ , so,

$\begin{eqnarray}&&{L}_{B^{\prime} }=\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}){ds}+\displaystyle {\int }_{A}^{B^{\prime} }{\rm{\nabla }}n\cdot \varepsilon \vec{{\bf{db}}}+\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}})\varepsilon \vec{{\bf{u}}}\cdot \vec{{\bf{db}}}.\end{eqnarray} \tag{ 1.41 }$

We calculate the optical path difference ${\rm{\Delta }}L$ using equation (1.28) minus equation (1.41):

$\begin{eqnarray}&&{\rm{\Delta }}L=\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}){ds}+\displaystyle {\int }_{A}^{B^{\prime} }{\rm{\nabla }}n\cdot \varepsilon \vec{{\bf{b}}}{ds}+\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}})\varepsilon \vec{{\bf{u}}}\cdot \vec{{\bf{db}}}-\displaystyle {\int }_{A}^{B}n(\vec{{\bf{r}}}){ds}.\end{eqnarray} \tag{ 1.42 }$

In the last equation, we now collect the terms multiplied by ε:

$\begin{eqnarray}&&{\rm{\Delta }}L=\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}){ds}-\displaystyle {\int }_{A}^{B}n(\vec{{\bf{r}}}){ds}+\varepsilon \left[\displaystyle {\int }_{A}^{B^{\prime} }{\rm{\nabla }}n\cdot \vec{{\bf{b}}}{ds}+\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}})\vec{{\bf{u}}}\cdot \vec{{\bf{db}}}.\right]\end{eqnarray} \tag{ 1.43 }$

By differentiating the trajectories, we find that $d(n\vec{{\bf{u}}}\cdot \vec{{\bf{b}}})=d(n\vec{{\bf{u}}})\cdot \vec{{\bf{b}}}+n\vec{{\bf{u}}}\cdot \vec{{\bf{db}}}\,={\rm{\nabla }}n\cdot \vec{{\bf{b}}}{ds}+n\vec{{\bf{u}}}\cdot \vec{{\bf{db}}}$ , so equation (1.43) becomes

$\begin{eqnarray}&&{\rm{\Delta }}L=\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}){ds}-\displaystyle {\int }_{A}^{B}n(\vec{{\bf{r}}}){ds}+\varepsilon \left[\displaystyle {\int }_{A}^{B^{\prime} }d(n\vec{{\bf{b}}}\cdot \vec{{\bf{u}}})\right].\end{eqnarray} \tag{ 1.44 }$

If we assume that we have chosen $B\equiv B^{\prime}$ then

$\begin{eqnarray}&&{\rm{\Delta }}L=\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}){ds}-\displaystyle {\int }_{A}^{B^{\prime} }n(\vec{{\bf{r}}}){ds}+\varepsilon \left[\displaystyle {\int }_{A}^{B^{\prime} }d(n\vec{{\bf{b}}}\cdot \vec{{\bf{u}}})\right]=0\end{eqnarray} \tag{ 1.45 }$

turns into

$\begin{eqnarray}&&{\rm{\Delta }}L=+\varepsilon \left[\displaystyle {\int }_{A}^{B^{\prime} }d(n\vec{{\bf{b}}}\cdot \vec{{\bf{u}}})\right]=0.\end{eqnarray} \tag{ 1.46 }$

Thus,

$\begin{eqnarray}&&\displaystyle {\int }_{A}^{B^{\prime} }d(n\vec{{\bf{b}}}\cdot \vec{{\bf{u}}})=0,\end{eqnarray} \tag{ 1.47 }$

which implies

$\begin{eqnarray}&&n\vec{{\bf{b}}}\cdot \vec{{\bf{u}}}=0.\end{eqnarray} \tag{ 1.48 }$

Dividing by n in both sides, we obtain

$\begin{eqnarray}&&\vec{{\bf{b}}}\cdot \vec{{\bf{u}}}=0,\end{eqnarray} \tag{ 1.49 }$

which also implies

$\begin{eqnarray}&&\vec{{\bf{b}}}\perp \vec{{\bf{u}}}\end{eqnarray} \tag{ 1.50 }$

and since $\vec{{\bf{u}}}$ is the vector in the direction of the ray and $\vec{{\bf{b}}}$ is the vector tangential to the wavefront, we can conclude that rays are perpendicular to waves.

As a final remark, in this book we also call the wavefront predicted by the Malus–Dupin theorem 'the eikonal.'

1.7. Optical path difference and phase difference

At this point we have shown that the trajectory of light is described by rays and that there is a surface perpendicular to the rays. In reality, that surface, the wavefront, is a surface with a constant phase. To consider what we mean by 'phase,' we first consider a point source of light emitting rays in all possible directions around it. The OPL of a ray is given by the length travelled multiplied by the refractive index where the ray travels. For our example, the OPL is a radius multiplied by the refractive index. These rays form a surface perpendicular to themselves that expands as the light travels further from the source; this surface is the wavefront. The wavefront is a surface with equal phase, so it is also called a phase front. The concept of phase comes from the electromagnetic nature of the wave phenomenon. An electric field can be represented as

$\begin{eqnarray}&&u={u}_{o}\,\sin \,{\rm{\Phi }}={u}_{o}\,\sin \,\left(\displaystyle \frac{2\pi (x-{ct})}{\lambda }\right),\end{eqnarray} \tag{ 1.51 }$

where u_o is the amplitude of the field, x is a variable representing special propagation, t is the time, c is the speed of light, and λ is the wavelength. The term Φ is the phase, which is a point in the cycle of the sine function. So if ${\rm{\Phi }}=0$ then $u=0$ or if ${\rm{\Phi }}=90^\circ$ then $u={u}_{o}$ . Going back to our point source, the light emitted from it has the same phase ${{\rm{\Phi }}}_{o}$ , no matter the direction of the ray. So along the ray path, the cyclic nature of light is presented in the number of wavelengths necessary to fit the OPL. See figure 1.4. Thus, at the surface of the wavefront, the rays have the same phase ${{\rm{\Phi }}}_{o}$ . For this example, the wavefront is stigmatic, since it comes from a single point source and in fact, it is a spherical wavefront. If this point source is far away, at infinity, then we can see that the spherical wavefront turns into into a flat wavefront.

**Figure 1.4.** Schematic diagram showing the phase between different waves that propagate from the same source. Two possible cases are shown: (1) when the waves travel in phase, (2) when there is a phase shift between them.
Download figure:
Standard image High-resolution image

1.8. Stigmatism and aberrated wavefronts

Once we know how light propagates, it is time to manipulate it to get artifacts of interest.

Let us suppose that one has a solid body and that the refractive index of the environment is different from that of the body. When the physical body focuses or disperses rays due to the difference of refraction indices between it and the medium, it is a lens. The study of lenses is central to geometric optics and in this treatise we are going to study them in an interesting way.

Light is focused and scattered by lenses. Mirrors also possess such features. All this happens due to the laws of reflection and refraction. Surface normals determine both the law of reflection and the law of refraction. Consequently, the shape of the lens/mirror plays a crucial role in determining its function.

A basic principle of the focusing of light is that after rays are emitted by a point object, they converge into an image. Consequently, stigmatic lenses and mirrors are required. The stigmatism of an optical system describes the process of focusing a point object into a point image. Such points are called a stigmatic pair of the optical system. Mirrors and stigmatic lenses need to have very specific shapes.

Stigmatic mirrors are formed from conic sections of reflective surfaces.

Rays reflected by mirrors with parabolic surfaces converge into a common focus when parallel rays hit them. When a mirror has a spherical surface, all rays that emerge from a point object are reflected back to the same location if and only if the object is situated at the centre of the circumference. Rays reflected from a focus of an elliptical mirror appear as a point image at the other focus. Last but not least, rays that originate from a point object are reflected in a single virtual point image when they are reflected by mirrors with hyperbolic surfaces. Similarly, other curved surfaces may also focus light, but not on a single point. See figure 1.5.

**Figure 1.5.** Schematic diagrams of the reflective behaviours of conical mirrors. Top right: a spherical mirror; top left: an elliptical mirror; bottom right: a hyperbolic mirror; bottom right: a parabolic mirror.
Download figure:
Standard image High-resolution image

In terms of refractive surfaces, the stigmatic surface is the Cartesian oval, which is a fourth-order function. Another way of explaining this is that when the rays emitted from a point object are refracted, they focus on a single image point. See figure 1.6.

**Figure 1.6.** Schematic diagram of the refractive behaviour of a Cartesian oval surface. By definition, the Cartesian oval is the stigmatic surface.
Download figure:
Standard image High-resolution image

So, recalling the Malus–Dupin theorem that rays are perpendicular to the wavefront, observe that the spherical wavefronts are generated by point objects, and flat wavefronts are also generated by point objects but this time, the objects are far away. Stigmatic optics deals with the optical artifacts that convert spherical wavefronts to other spherical wavefronts or plane wavefronts.

Optical path theory is more general than stigmatic optics; it contemplates the path of light described by its pure nature, the Fermat principle. It also may study the path of light in optical systems with aberrations. A system has aberrations if it is not stigmatic for all points of an object. It is not always possible to resolve the point object at a single point on the image space, in which case, the optical system has aberrations. Depending on the type of aberration, the region of space where the image is formed differs.

The two main types of errors are chromatic and monochromatic. Chromatic aberrations are caused by variations in the refractive index of a lens with regard to the wavelength. Monochromatic aberrations occur because of optical system geometry. There are three laws associated with it: the reflection law, Snell's law, and Fermat's principle (note that the first two laws are implications of Fermat's principle). Information about the imaging system geometry is available to Fermat's principle. It is crucial to remove monochromatic aberrations by designing the optical system to have a specific shape. The most common monochromatic aberrations are spherical aberration, along with coma, astigmatism, field curvature, and image distortion.

1.8.1. Spherical aberration

The term spherical aberration refers to a situation in which a point object located on an optical axis lacks a stigmatic correlation with a point image. Rays leaving a point object on an optical axis do not converge to a point image. Because spherical lenses exhibit this phenomenon, it is called spherical aberration. Aspherical lenses are lenses that are non-spherical in shape.

In most cases, they aim to reduce spherical aberrations. Figure 1.7 shows an example of spherical aberration generated by a spherical surface with a constant refractive index n along the material.

1.8.2. Coma

Coma is the aberration of an image caused by an object that is outside the optical axis of the system. As a result of coma, the image appears distorted, with a tail like a comet. Thus, when an optical system has coma, a point outside the optical axis does not have a stigmatic relationship with a point image, since the image is not a point, but a region.

The coma is shown in figure 1.8, in which a ray crosses the surfaces of index refraction n and is refracted; the ray then crosses the interface of interposition. Some of the rays are inverted due to the refraction, as seen in the above figure. These photos depict an inversion that appears to be that of a coma or a comet.

1.8.3. Astigmatism

When two perpendicular planes have different image points, an image becomes astigmatic. These planes are usually called the meridional and sagittal planes. In a three-dimensional model using the x, y, and z coordinate system, the meridional plane is the y − z plane and the sagittal plane is the x − z plane. The direction z is called the optical axis; this is the axis where the light travels, usually from $-z$ to z. An illustration of astigmatism is presented in figure 1.9.

**Figure 1.9.** Schematic diagram of the aberration of astigmatism at a refractive surface. Note that two image points are formed, one corresponding to the vertical axis and the other to the horizontal axis.
Download figure:
Standard image High-resolution image

1.8.4. Distortion

A rectilinear projection is used to measure distortion in an image-forming system. When the system is functioning as it should, a rectilinear projection (a projection that renders straight lines in a scene as straight lines in the image) passes through. Figure 1.10 shows the most common distortions.

**Figure 1.10.** (Left) barrel and (right) pincushion distortion. Schematic diagram of the distortion aberration at a refractive surface. Note that two cases can occur: barrel distortion, which is an expansion of the image and pincushion distortion, which is image compression.
Download figure:
Standard image High-resolution image

1.8.5. Field curvature

The Petzval field curvature, named for Joseph Petzval, is an optical aberration in which an object is focused as a curved image. Figure 1.11 shows the field curvature due to a Cartesian oval.

1.9. Adaptive optics

Adaptive optics is a technique that makes it possible to correct the evolving and non-predictive deformations of a wavefront in real time using a deformable mirror. Deformable mirrors compensate for non-predictive deformations in real time. In this section, we introduce the general concept of adaptive optics, and in the following subsection, we explore their main components.

Adaptive optics was first developed in the 1950s; its main field of use is astronomy, but it has started to extend to many other fields such as ophthalmologic medicine and telecommunications. Here, we are only going to focus on its astronomical applications.

Astronomers have found that the atmosphere introduces aberrations into the input beams of light that enter telescopes. These aberrations affect image quality and they are very non-predictive. Thus, the main limitation of the quality of astronomical observations is no longer the physical dimensions of the mirrors but atmospheric disturbances.

This observation prompted the creation of high-altitude observatories or even the sending of telescopes into space, since space telescopes are free from atmospheric problems. The problem of atmospheric turbulence can also be solved by adaptive optics, which use computer-controlled rapidly deformable mirrors to compensate for wavefront distortions.

Adaptive optics is an optical technique that allows us to counteract, in real time, the effects of the Earth's atmosphere on the formation of astronomical images. To achieve this, a deformable mirror supported by a set of computer-controlled actuators is inserted into the optical path of the telescope. In astronomy, this technique is used in particular by terrestrial telescopes to correct star observations, among other things.

In the case of stars, if we have the impression that a star is twinkling, it is not because it emits light in a non-constant way, but because of the atmospheric turbulence which distorts the image we have of it; such turbulence distorts the wavefront and thus the phase. Indeed, a star, assumed to be a point in the visible sky and located at a very great distance compared to the scale of the Earth, emits light with a spherical wavefront which, on our scale, can be considered a plane. However, if we consider the case of a telescope that has a primary mirror with a diameter of several tens of meters, the wavefront incident upon the surface of the primary mirror undergoes random deformations when it passes through the atmosphere because of variations in refractive index caused by atmospheric turbulence.

This is explained by the dependence of the refractive index on the temperature and the local pressure of the atmosphere encountered. The optical path traversed by a light ray is defined as the integral of ${nds}$ , where n is the refractive index and ds is the elementary displacement along the path. The light rays do not travel the same optical path: the wavefront that is observed is then no longer plane, and the image is distorted. In adaptive optics, a wavefront analyser is used to estimate the disturbance due to the atmosphere, and a mirror is deformed so as to exactly compensate for this disturbance. Thus the image after reflection by the mirror is almost as if there had been no degradation by atmospheric turbulence.

In order to use this technique, a reference star is needed in the stellar field: the analysis of its aspect makes it possible to evaluate the disturbances to which the image is subjected in real time. The computer reacts by sending commands to the deformable mirror actuators many times every second; the mirror then takes a shape that compensates for image defects.

This system can also use a reference to an artificial star produced by a laser beam that passes through the layers of air causing poor image quality. Any point or large object in the field of view, such as a galaxy, thus improves its sharpness. It should be noted that adaptive optics not only compensates for the variable disturbances induced by the atmosphere, but is also capable of correcting a good part of the intrinsic aberrations due to telescope optics.

Although it is a technique that finds its natural field of application in astronomy, it is currently also being investigated in adaptive optics applied to human vision. To do this, we proceed to analyse the images of point objects (artificial 'stars') projected onto the retina through the front lenses of the eye (cornea and crystalline). The study of these images makes it possible to evaluate the aberrations due to the organ of vision, and to act on external adaptive systems that introduce the necessary corrections into the incident light beam so that the images projected on the retina are as sharp as possible.

In practice, the implementation of an adaptive optics system begins with the construction of a control matrix. This matrix represents the actuators to be modified to reproduce each of the optical aberrations of the base of the Zernike polynomials.

Zernike polynomials are a series of polynomials that are orthogonal to the unit disk. From the analysis of the disturbance of the wavefront by the atmosphere via a wavefront analyser, we can decompose the wavefront fault on the basis of Zernike polynomials to compensate for the delays using the deformable mirror. In practice, only a limited number of Zernike orders need to be corrected to obtain a sufficiently small residual defect.

If we consider a light wave that has passed through an imperfect system, the wavefront at the output of the system is not completely flat: we define the phase shift function Φ, which at any point of a plane front denotes the phase shift between the theoretical light wave in the geometrical optics model and the real light wave, while taking into account the defects; it would be equal to the null function if the system were perfect. It is then possible to approximate this so-called aberrant phase using a linear combination of Zernike polynomials; each of the polynomials of the base is considered to correspond to a different category of aberration.

Thus, in adaptive optics, it is possible to use a wavefront analyser coupled to a computer system capable of calculating Φ and its decomposition into Zernike polynomials in real time. This allows us to know the nature of the aberrations of the system under study at any time and possibly correct them using a deformable mirror. It is important to remark that in this treatise we are not going to directly use the Zernike polynomials, since the equations that describe the adaptive optical system that we are going to study are simpler if we study them in Cartesian coordinates instead of working in a unitary disk such as that of the Zernike polynomials. However, Zernike polynomials have a direct transformation or translation to Cartesian coordinates. So all the mathematical models that we will present in the following chapter can ultimately be expressed in terms of Zernike polynomials.

1.9.1. The main configuration in adaptive optics

The general configuration of adaptive optics is presented in figure 1.12.

1.9.2. The crux in adaptive optics: phase conjugation

Therefore the crux in adaptive optics is the surface, which when given an input wavefront cancels its phase, thereby obtaining a flat output wave. This procedure is also called phase conjugation.

Mathematically speaking, phase conjugation is fairly simple, but in practice, it is not. So far in this treatise we have not mentioned it, but the reader should know that light, as a propagating electric field, is an electromagnetic wave represented by an amplitude and a phase:

$\begin{eqnarray}&&A(\rho ,\theta )\,\exp \,[-i{\rm{\Phi }}(\rho ,\theta )].\end{eqnarray} \tag{ 1.52 }$

Phase conjugation is a procedure that flattens the phase of the electric field by multiplying it by $\exp [+i{\rm{\Phi }}(\rho ,\theta )]$ and then by the complex conjugate of the field. Physically, this means we need to add aberrations to the field at the right time using the deformable surface. Even though the concept of phase conjugation is based on the physical description of light in terms of electric fields, we are going to analytically solve it using only optical path theory and the Fermat principle, in chapters 2 and 6.

Figure 1.13 shows an example of phase conjugation performed by a deformable mirror.

1.9.3. Deformable mirrors

As we mentioned previously, deformable mirrors are used in large terrestrial telescopes to correct observations of celestial bodies. Deformable mirrors can be segmented, continuous, or edge-actuated mirrors. A segmented mirror has flat segments that are moved either up and down by only one actuator or it may have three or more actuators that handle tip, tilt, and piston motions. Continuous mirrors work similarly to segmented mirrors, but the actuators deform a continuous mirror with an up-and-down movement. Finally, in edge-actuated mirrors, structural deformation of the surface takes place using mechanical modes.

Figure 1.14 shows schematics of segmented mirrors, continuous mirrors, and edge-actuated mirrors and figure 1.15 presents the different types of actuator array used in deformable mirrors.

1.9.4. Wavefront sensors

We mentioned that typical adaptive optical systems need a wavefront sensor, also called a wavefront analyser, to measure the wavefront of the light, so that the deformable mirror can subsequently perform phase conjuration. A wavefront analyser is an optical device used to analyze the shape of a wave surface, as the name suggests. The principle of the wavefront analyser is to break down a wavefront into elementary wavefronts and to determine the orientation of each elementary wavefront. When integrated, the measurements of these orientations allow the shape of the wavefront to be recovered.

Shack–Hartmann sensors are instruments used to measure the deformation of the wavefront (or phase) of an optical beam. Their use is quite widespread, and they are typically used in adaptive optical systems. In this section, we are going to study this type of sensor. The Hartmann–Shack wavefront analyser was first conceived by Johannes Franz Hartmann and later by American physicist Roland Shack.

In 1880, Johannes Franz Hartmann, a German astronomer, had the idea of placing a pierced plate in front of a Cassegrain-type telescope and of capturing the signal received from a star on a photographic plate in an intra- or extra-focal plane. Such a pierced plate samples the measured wavefront. If the plate has several holes then the behaviour of the wavefront is analysed in small portions, since the holes break down the elementary wavefront into wavefront portions. In Hartmann's time, measurements were made with a double decimetre directly onto the photographic plate. Later on, in 1970, Hartmann's principle was improved by the American physicist Roland Shack who replaced the holes with micro-lenses, making the measurement more precise.

A Shack–Hartmann sensor is made up of a matrix of micro-lenses, placed in front of a light-sensitive sensor, for example, a CCD (charge-coupled device) or a CMOS (complementary metal–oxide–semiconductor) camera. These two components are located in parallel planes. When a beam arrives, each micro-lens generates a focal point on the sensor relative to its reference position, which corresponds to an undistorted waveform. The position of the focal point varies, depending on the local deformation of the wavefront. See figure 1.16.

**Figure 1.16.** Diagram of the principle of the Shack–Hartmann sensor. The beam light is focused at a different point for each subaperture. The position of the centroid of the spot is proportional to the tilt of each subaperture.
Download figure:
Standard image High-resolution image

The image given by the sensor is thus a matrix of spots, the number of which corresponds to the number of micro-lenses. The deviation of each spot with respect to its reference gives the local derivative of the wavefront; the set of deviations directly gives the gradient of the wavefront. Algorithms then make it possible to recover the wavefront. If a wavefront is flat or spherical and thus has a point light source, the micro-lenses generate a matrix of spots whose spacing is perfectly regular. This spacing is directly related to the curvature of the wavefront, which is directly related to the distance from the source. The more it is curved, the greater the spacing between the spots. The sensor makes it possible to perform the wavefront measurement in real time. As the sensor is connected to a computer, a computer program searches the image for the position of the spots, calculates their deviation from their respective reference spot, then recovers the wavefront.

1.10. Optical testing

In the previous section and its subsections, we studied adaptive optics because it is a direct application of the mathematical models that we are going to study in the next chapters. We now turn to optical testing for the same reason, which is that the mathematical models that we are going to study in the next chapters have direct applicability to optical testing.

We use 'optical testing' to denote the family of techniques used to test an optical element; normally, this means the surface of an optical element. As we have mentioned previously, if an optical element has defects on its surface, these change its performance because the laws of refraction and reflection directly depend on the normal vector of the surface. The history of optical testing starts in the seventeenth century. Galileo was the first astronomer to want to perform optical tests. He wanted to establish whether the surfaces of his lenses had the shapes he wanted, but he was not able to do it.

Today, interferometers are widely used in optical testing. The mathematical models that we are going to study in the next chapters, along with interferometers, can be very useful for optical testing. Thus, in the next subsection, we explore interferometers further.

1.10.1. Interferometry

The interferometry technique uses the superimposition of waves, typically electromagnetic waves, in order to obtain information. In optical testing, interferometry is used to extract information about the surface quality and surface irregularities of an optical element. Interferometry uses devices called interferometers to extract the abovementioned information, but interferometers are also widely used in science and industry to measure small displacements, changes in refractive index, etc.

In an interferometer, light from a single source is split into two beams that travel different optical paths and are then recombined to produce interference. The light is split into two identical beams by a beam splitter, e.g. a partially reflecting mirror. Each of these beams travels via a different route, called a path, and they are recombined before arriving at a detector. The resulting interference fringes at the detector provide information about the difference in the optical path lengths. The path difference is the difference between the distances travelled by the beams, which creates a phase difference between them. It is this introduced phase difference that creates the interference pattern between the initially identical waves.

If a single beam has been split along two paths, then the phase difference provides an indication of anything that changes the phase along the paths. This could be a physical change in the length of the path itself or a change in the refractive index along the path. Interferometry uses the principle of superposition to combine the waves so that the result of their combination has a significant property that is a diagnosis of the original state of the waves. This works because when two waves of the same frequency combine, the resulting intensity pattern is determined by the phase difference between the two waves: in-phase waves experience constructive interference, while out-of-phase waves experience destructive interference. Waves that are not completely in phase or completely out of phase will have an intermediate intensity pattern, which can be used to determine their relative phase difference.

Although there are several types of interferometer, we are just going to examine the one of the most used interferometers, the Fizeau interferometer. The Fizeau interferometer interferes with the reflections of two optical surfaces placed opposite each other, in order to control their quality, for example. See figure 1.17.

**Figure 1.17.** Diagram of a typical Fizeau interferometer used for optical testing. The source of light is expanded and then divided. One of the light beams goes to the reference surface and test mirror and the other goes directly to the imaging lens. Once both beams reach the imaging lens, the indifference patterns are shown in the interferogram.
Download figure:
Standard image High-resolution image

The Fizeau interferometer is relevant to our proposal because it is frequently used for optical testing. We will therefore explore it further.

The Fizeau interferometer is an optical assembly based on the principle of amplitude division interferometry, proposed in 1862 by the French physicist Hippolyte Fizeau. The Fizeau interferometer consists of a source, a beam splinter, a diverging lens, a test mirror, and the interferogram (where the interference is displayed).

The light source of the interferometer is typically a laser beam. The laser beam passes through the beam splitter and is divided into two beams. One of the beams goes directly to the interferogram. The other is refracted by the lens and then reflected by the mirror. There, it reverses, passes through the lens again and ends up at the interferogram. See figure 1.17.

If the lens and the mirror are such that the optical path lengths (OPLs) of all rays are the same then a uniform fringe pattern appears in the interferogram at the output of the Fizeau interferometer. When there is an optical path difference (OPD) between the rays that passed through the lens and the mirror, the Fizeau interferometer output contains nonuniform fringes.

The nonuniform fringes are caused by the OPD. Thus, Fizeau interferometers are very useful when lenses or mirrors are being tested to verify a certain shape. If a pair of mirrors or lenses has uniform fringes at the output, then the rays that passed through the lens and the mirror have a zero OPD. A defect on the surfaces of the lenses or mirrors can generate a non-zero OPD.

A null lens is a common name for these arrays. Null lenses are those lenses that, given a mirror, null the effect of the mirror in the Fizeau interferometer. Alternatively, if the shape of a mirror nulls the effect of the lens at the Fizeau interferometer, the mirror can be referred to as a null mirror. An example is shown in figure 1.18.

**Figure 1.18.** Diagram of a null lens with its respective null mirror.
Download figure:
Standard image High-resolution image

The null lens has an input spherical wavefront and the output wavefront is such that when the light strikes the mirror, the normal vectors of the mirror surface are parallel to the striking rays. This implies that the mirror is spheric.

1.11. End notes

In this chapter, we focused on the geometrical behaviour of light. From it, we successfully found Fermat's principle, Snell's law and the Malus–Dupin theorem.

We then demonstrated several essential concepts, such as stigmatism, and how the absence of stigmatism in a system forms aberrations.

Finally, we studied several techniques for which the equations of the following chapters have potential applications. These techniques are adaptive optics and optical testing.

The path of light

Chapter navigation

Export citation and abstract

Permissions

Share this chapter

Affiliations

Dates

Chapter DOI

Books links

Abstract

1.1. Purpose and introduction to this treatise

1.2. The optical path and Fermat's principle

1.3. The law of reflection

1.4. The law of refraction

1.5. The vector form of Snell's law

1.6. The wavefront and the Malus–Dupin theorem

1.7. Optical path difference and phase difference

1.8. Stigmatism and aberrated wavefronts

1.8.1. Spherical aberration

1.8.2. Coma

1.8.3. Astigmatism

1.8.4. Distortion

1.8.5. Field curvature

1.9. Adaptive optics

1.9.1. The main configuration in adaptive optics

1.9.2. The crux in adaptive optics: phase conjugation

1.9.3. Deformable mirrors

1.9.4. Wavefront sensors

1.10. Optical testing

1.10.1. Interferometry

1.11. End notes