Topical Review The following article is Open access

Zernike polynomials and their applications

Kuo Niu and Chao Tian

Published 15 November 2022 © 2022 The Author(s). Published by IOP Publishing Ltd
Citation: Kuo Niu and Chao Tian 2022 J. Opt. 24 123001, DOI 10.1088/2040-8986/ac9e08

2040-8986/24/12/123001

Abstract

The Zernike polynomials are a complete set of continuous functions orthogonal over a unit circle. Since their introduction by Zernike in 1934, they have been widely used in fields ranging from optics and vision science to image processing. However, owing to the lack of a unified definition, many confusing indexing schemes have been used over the past decades, and the mathematical properties of the polynomials are scattered across the literature. This review provides a comprehensive account of Zernike circle polynomials and their noncircular derivatives, including history, definitions, mathematical properties, roles in wavefront fitting, relationships with optical aberrations, and connections with other polynomials. We also survey state-of-the-art applications of Zernike polynomials in a range of fields, including the diffraction theory of aberrations, optical design, optical testing, ophthalmic optics, adaptive optics, and image analysis. Owing to their elegant and rigorous mathematical properties, the range of scientific and industrial applications of Zernike polynomials is likely to expand. This review is expected to clear up the confusion between different indices, provide a self-contained reference guide for beginners as well as specialists, and facilitate further developments and applications of the Zernike polynomials.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The Zernike polynomials are a sequence of continuous functions that form a complete orthogonal set over a unit disk. They are named after the optical physicist Frits Zernike (figure 1), winner of the 1953 Nobel Prize in Physics and the inventor of phase-contrast microscopy. Since most optical systems have circular apertures, Zernike polynomials are useful for wavefront analysis and thus play important roles in optics. Zernike polynomials can be divided into two basic types, i.e. Zernike circle polynomials and Zernike annular polynomials, which are defined over a unit disk and an annular unit disk, respectively. The Zernike circle polynomials were first introduced by Zernike in 1934 as eigenfunctions of a second-order rotationally invariant partial differential equation to describe the phase contrast method [1, 2] and were derived by Bhatia and Wolf in 1954 from the requirements of orthogonality and invariance [3]. The Zernike annular polynomials first appeared in a report of Perkin-Elmer Corporation in 1971 [4] and were discussed by Tatian in 1976 for aberration balancing in optical systems with annular pupils from the standpoint of lens design [5]. They were systematically studied and explicitly given by Mahajan in 1981 [4].

Figure 1. Frits Zernike (1888–1966), a Dutch physicist and mathematician who was awarded the Nobel Prize in Physics in 1953 for his invention of the phase contrast microscope. Figure reproduced with permission from [7]. Reprinted from [7], Copyright © 2012 Elsevier B.V. All rights reserved.

Zernike polynomials gradually attracted interest after their introduction (figure 2) and have found widespread applications in optics and image processing. In 1942, Bernard Nijboer, a PhD student of Zernike, expanded the aberration functions of a symmetrical optical system into a series of Zernike polynomials and formulated an efficient representation of the complex amplitude distribution of a point object in the image plane [6]. This work allows analytical evaluation of diffraction integrals and the point spread function (PSF) of a general optical system and is referred to as the Nijboer–Zernike theory. However, the Nijboer–Zernike theory is valid only for small aberrations and produces accurate results only at positions close to the geometrical focus. Sixty years later, Janssen formulated a general expression in terms of power-Bessel series and extended the Nijboer–Zernike theory to optical systems with large aberrations [8]. The extended Nijboer–Zernike theory can analytically compute the PSF of an aberrated optical system described by Zernike coefficients and has accelerated further developments in focused-field diffraction theory. While the development of the diffraction theory of aberrations relies solely on analytical derivations, Zernike polynomial-based wavefront analysis depends on the use of computers. In the 1970s, with the rise of adaptive optics, Noll proposed a modified set of Zernike polynomials by normalizing and sorting the polynomials for statistical analysis of wavefront aberrations caused by atmospheric turbulence [9]. At the same time, Loomis at the University of Arizona introduced a reordered subset of Zernike polynomials into the interferogram processing software FRINGE for wavefront analysis in interferometric measurements [10, 11]. This subset, called the Zernike fringe set, contains only 37 terms but corresponds well to the classical aberrations.
In 1980, Teague extended the applications of Zernike polynomials from optics to image processing and pioneered Zernike moments, which are rotation invariant and can be used as shape descriptors for pattern recognition [12]. The Zernike moments have since become a valuable shape descriptor for image analysis. In the 21st century, the development of Zernike polynomials has gradually matured, and several Zernike sets were standardized by the American National Standards Institute (ANSI) [13, 14] and the International Organization for Standardization (ISO) [15–17] to promote effective communication (see figure 3).

Figure 2. Number of publications on Zernike polynomials from 1970 to 2020. Source: Web of Science; keyword: Zernike polynomials; search criteria: topic.

Figure 3. Key events in the history of Zernike polynomials (ZPs).

The widespread use of Zernike polynomials stems from their unique mathematical properties. First, Zernike polynomials are orthogonal over a unit circle. The orthogonality makes the expansion coefficients of a wavefront function independent of the number of terms [18]. It also enables convenient mathematical manipulations of wavefronts, such as addition, subtraction, translation, rotation, and scaling. Second, while other polynomials orthogonal over a unit disk also exist, Zernike polynomials are unique in that they correspond closely to classical aberrations, such as astigmatism, coma, and spherical aberration [19, 20]. This enables fast classification and quantification of wavefront aberrations. Third, Zernike polynomials make it easy to evaluate the image quality of an optical system, since the system PSF can be analytically computed from the Zernike expansion coefficients of wavefront aberrations based on the (extended) Nijboer–Zernike theory [6, 8, 21]. In addition, Zernike polynomials can serve as a basis set for wavefront reconstruction in slope-sensitive wavefront sensors, such as the Shack–Hartmann wavefront slope sensor [22, 23] and lateral shearing interferometers [24], which are important wavefront sensing tools in ophthalmic optics and adaptive optics.

Nowadays, a variety of indices for Zernike polynomials are in use by authors and authorities around the world. Well-known ones include the Noll indices [9, 25], the OSA/ANSI indices [13–15, 17, 26, 27], the Fringe/University of Arizona indices [10, 19], the ISO 14999 indices [16, 28], the Born and Wolf indices [29], and the Malacara indices [30, 31]. Each indexing scheme adopts a different naming, normalization, and indexing strategy, and even the coordinate system may differ, which causes considerable confusion among researchers working with the polynomials and hinders effective communication. Moreover, mathematical properties of Zernike polynomials developed in the past few decades, such as derivatives, Fourier transforms, and recurrence relations, are scattered in the literature, and no work summarizes these results. This motivates us to prepare a review of the Zernike polynomials with the aims of clearing up the confusion between different indices, summarizing mathematical properties, surveying state-of-the-art applications, and providing a quick reference guide for scientists and engineers in this community.

The remainder of this review is organized as follows (see figure 4). Section 2 reviews different indexing schemes for the Zernike circle polynomials, their mathematical properties, roles in wavefront fitting, relationships with classical Seidel aberrations and the Strehl ratio, and connections with other important functions, such as the XY monomials and the Legendre polynomials. Section 3 discusses orthonormal polynomials over noncircular pupils based on the Zernike circle polynomials, with an emphasis on Zernike annular polynomials, whose definition, mathematical properties, and roles in wavefront fitting are presented. Section 4 surveys state-of-the-art applications of Zernike polynomials in a range of fields, including diffraction theory, optical design, optical testing, ophthalmic optics, adaptive optics, and image analysis. Finally, section 5 draws concluding remarks. Table 1 lists the acronyms and symbols used in this review.

Figure 4. Major topics discussed in this review.

Table 1. List of acronyms and symbols.

| Symbol | Meaning |
| --- | --- |
| 3D | Three-dimensional |
| PSF | Point spread function |
| $(x,y),\;(\rho ,\theta )$ | Cartesian coordinates and polar version in the spatial domain |
| $(u,v),\;(r,\phi )$ | Cartesian coordinates and polar version in the frequency domain |
| ${F_k}$ | Orthonormal polynomials over arbitrary pupil shapes |
| $G_n^i$ | Zernike resizing factor |
| $I$ | Intensity; moment invariants |
| ${J_n}(x)$ | Bessel functions |
| ${M_{nm}}$ | Zernike moments |
| $N_n^m$ | Normalization factor |
| ${P_n}^{(\alpha ,\beta )}$ | Jacobi polynomials |
| ${P_n}$ | Legendre polynomials |
| $P(\rho ,\theta )$ | Pupil function |
| $R_n^m(\rho )$ | Zernike radial polynomials |
| $T\,_n^m$ | XY monomials |
| $U(r,\phi ,\upsilon )$ | Complex amplitude on the image plane |
| $U_n$ | Chebyshev polynomials |
| $V\,_n^l$ | Complex Zernike circle polynomials of degree n with an azimuthal frequency l |
| $W$ | Wavefront aberration |
| $\overline W $ | Wavefront mean value |
| ${Z_j}(\rho ,\theta ),Z_n^m(\rho ,\theta )$ | Zernike circle polynomials of degree n with an azimuthal frequency m |
| ${Z_j}(\rho ,\theta ;\varepsilon ),Z_n^m(\rho ,\theta ;\varepsilon )$ | Zernike annular polynomials with an obscuration ratio of $\varepsilon$ |
| $\Phi (\rho ,\theta )$ | Phase function |
| ${a_j},a_n^m$ | Zernike expansion coefficients |
| ${b_j},b_n^m$ | Transformed Zernike expansion coefficients |
| $f(x,y)$ | Original images |
| $g(x,y)$ | Degraded images |
| $\omega _n^m,\omega _n^l$ | Weighting factor for (complex) annular Zernike radial polynomials |
| $\psi (x,y)$ | Basis functions |
| ${\mathcal{M}_{nl}}$ | Pseudo Zernike moments |
| $\mathcal{R}_n^m$ | Pseudo Zernike radial polynomials |
| $\Re _n^m$ | Radon transform of Zernike radial polynomials |
| $\mathcal{V}_n^l$ | Pseudo Zernike polynomials |
| ${\mathcal{Z}_j}(r,\phi )$ | Fourier transform of Zernike circle polynomials |
| ${\mathcal{Z}_j}(r,\phi ;\varepsilon )$ | Fourier transform of Zernike annular polynomials |
| $\mathbf{A}$ | Coefficient matrix |
| $\mathbf{a}$ | Zernike expansion coefficients vector |
| $\mathbf{W}$ | Wavefront vector |
| $\mathbf{s}$ | Wavefront slope vector |

2. Zernike polynomials over circular pupils

2.1. Definitions

Zernike polynomials over circular pupils are called Zernike circle polynomials or simply Zernike polynomials. They are defined over a unit disk and are most conveniently expressed in polar coordinates (ρ, θ), where ρ is the normalized radial coordinate (0 ⩽ ρ ⩽ 1) and θ is the polar angle measured counterclockwise from the +x-axis (0 ⩽ θ < 2π), as shown in figure 5(a). The polar coordinates ρ and θ can be converted to the Cartesian coordinates x and y using the trigonometric functions:

$$x = \rho \cos \theta ,\qquad y = \rho \sin \theta \tag{1}$$

Figure 5. Definition of a unit circle: (a) Coordinate system used in this review. (b) Coordinate system not recommended.

Likewise, the Cartesian coordinates x and y can be converted to polar coordinates ρ and θ by:

$$\rho = \sqrt {{x^2} + {y^2}} ,\qquad \theta = \arctan (y/x) \tag{2}$$

It is worth noting that while most people follow the convention that θ is positive when measured counterclockwise from the +x-axis, some authors, such as Born [32] and Malacara [30, 31], measure the polar angle from the +y-axis in the clockwise direction [33], which stems from early (pre-computer) aberration theory and is not recommended [27].
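As a minimal sketch of the coordinate conventions described above, the following Python helpers convert between the two systems. The function names are illustrative and are not taken from this review. The inverse conversion uses a quadrant-aware arctangent folded into [0, 2π), and the last helper assumes that the alternative convention measures the angle clockwise from the +y-axis, as stated in the text.

```python
import math


def polar_to_cartesian(rho, theta):
    """Polar (rho, theta) to Cartesian (x, y); theta is counterclockwise from +x."""
    return rho * math.cos(theta), rho * math.sin(theta)


def cartesian_to_polar(x, y):
    """Cartesian (x, y) to polar (rho, theta), with theta folded into [0, 2*pi)."""
    rho = math.hypot(x, y)
    # atan2 resolves the quadrant; the modulo maps (-pi, pi] onto [0, 2*pi).
    theta = math.atan2(y, x) % (2.0 * math.pi)
    return rho, theta


def from_clockwise_y_convention(theta_alt):
    """Assumed mapping: angle measured clockwise from +y -> counterclockwise from +x."""
    return (math.pi / 2.0 - theta_alt) % (2.0 * math.pi)
```

For example, a point on the −y-axis has θ = 3π/2 in the recommended convention, whereas a plain arctangent of y/x would be undefined there.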

Zernike polynomials have acquired several different indexing schemes as they evolved, which causes confusion among researchers, especially beginners. In this section, we classify the indices in the literature into six groups, i.e. the Noll indices, the OSA/ANSI indices, the Fringe indices, the ISO 14999 indices, the Born and Wolf indices, and the Malacara indices, and compare their naming, normalization, and indexing strategies.

2.1.1. The Noll indexing scheme.

When Zernike first introduced the orthogonal polynomials, radial polynomials and azimuthal functions were explicitly given [1, 2]. However, the normalization and ordering methods for these polynomials were not specified. In 1976, Noll sorted and normalized Zernike polynomials to facilitate statistical analysis of wavefront distortion caused by atmospheric turbulence [9]. The indexing scheme was later followed by many authors [34–36] and was used in commercial software, such as Zemax [25], as the standard indices. Note that the 'standard' indices in Zemax are not associated with any ANSI or ISO standards. In this section, we summarize the Noll indexing scheme, discuss normalized and non-normalized Zernike circle polynomials, and extend the definitions from the real domain to the complex domain.

2.1.1.1. Real Zernike circle polynomials.

In the Noll indices, the normalized or orthonormal Zernike circle polynomials are defined as the products of normalization factors, radial polynomials, and azimuthal (angular) functions, which are written as [9]:

Equation (3)

where the index n is the degree of the radial polynomials, $R_n^m(\rho )$; the index m is the azimuthal frequency describing the repetition of the angular function; n and m are non-negative integers that satisfy n − m ⩾ 0 with n − m even; j is a mode-ordering number starting from 1, and its relationships with n and m are presented in equations (8) and (9). There are a total of (n + 1)(n + 2)/2 linearly independent polynomials of degree ⩽ n. The radial polynomial $R_n^m(\rho )$ is defined as [1, 6]:

Equation (4)
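The radial polynomial can be sketched in a few lines of Python. This is a minimal illustration that assumes the widely quoted finite factorial sum, $R_n^m(\rho ) = \sum\nolimits_{s = 0}^{(n - m)/2} {{{( - 1)}^s}(n - s)!\,{\rho ^{n - 2s}}/[s!\,((n + m)/2 - s)!\,((n - m)/2 - s)!]} $; the function names are hypothetical and not part of this review.

```python
from math import factorial


def radial_coefficients(n, m):
    """Return {power: coefficient} of R_n^m(rho) from the finite factorial sum."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("need n >= |m| and n - |m| even")
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s)
               * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        # The ratio is a signed multinomial coefficient, so the division is exact.
        coeffs[n - 2 * s] = num // den
    return coeffs


def radial(n, m, rho):
    """Evaluate R_n^m at the radial coordinate rho."""
    return sum(c * rho ** p for p, c in radial_coefficients(n, m).items())
```

With these assumptions, `radial_coefficients(4, 0)` returns the coefficients of $6{\rho ^4} - 6{\rho ^2} + 1$, which matches the radial part of the primary spherical aberration term in table 2.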

The radial polynomials of the first few degrees are shown in figure 6. It is easy to verify the following relations:

Equation (5)

Equation (6)

Figure 6. Zernike radial polynomials of the first few degrees when m = 0, 1, and 2.

The normalized Zernike circle polynomials meet the following orthonormality condition:

Equation (7)

where ${\delta _{jj^{\prime}}}$ is the Kronecker delta function.
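The orthonormality condition can be checked exactly with rational arithmetic. The sketch below assumes the standard factorial-sum form of the radial polynomials, the Noll normalization factors $\sqrt {n + 1} $ (m = 0) and $\sqrt {2(n + 1)} $ (m ≠ 0) that appear in table 2, and an inner product divided by π; all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def radial_coefficients(n, m):
    """Return {power: coefficient} of R_n^m(rho) (assumed factorial-sum form)."""
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s)
               * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        coeffs[n - 2 * s] = num // den
    return coeffs


def radial_inner_product(n1, n2, m):
    """Exact integral of R_n1^m(rho) * R_n2^m(rho) * rho over [0, 1]."""
    total = Fraction(0)
    for p, a in radial_coefficients(n1, m).items():
        for q, b in radial_coefficients(n2, m).items():
            # Integral of rho**(p + q + 1) over [0, 1] is 1 / (p + q + 2).
            total += Fraction(a * b, p + q + 2)
    return total


def noll_norm_squared(n, m):
    """(1/pi) times the disk integral of a normalized polynomial squared."""
    factor_squared = (n + 1) if m == 0 else 2 * (n + 1)
    # Angular integral in units of pi: 2 for m = 0, 1 for cos^2 or sin^2.
    angular_over_pi = 2 if m == 0 else 1
    return factor_squared * angular_over_pi * radial_inner_product(n, n, m)
```

Under these assumptions `noll_norm_squared` returns exactly 1 for every admissible (n, m), and `radial_inner_product` vanishes for different degrees sharing the same m. Polynomials with different m, or a sine and a cosine term of the same m, are orthogonal through the angular integral alone.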

The orthonormal Zernike circle polynomials can be sorted by either the single index, j, or the double indices, n and m. The former is useful for describing Zernike expansion coefficients while the latter is useful for unambiguously describing the functions. To convert a given value j to n and m, one can use the following relationships [36]:

Equation (8)

where ⌊x⌋ denotes the floor function that gives as output the greatest integer less than or equal to x. For example, ⌊2.4⌋ = 2. To convert given values of n and m to j, the following relationship can be used:

Equation (9)
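The index conversions can be illustrated with a short Python sketch. It is derived from the ordering visible in table 2 (degree n ascending, then m ascending, with even j assigned to cosine terms and odd j to sine terms) and is not necessarily written in the same closed form as equations (8) and (9). The function names and the `trig` labels are hypothetical.

```python
from math import isqrt


def noll_to_nm(j):
    """Map a Noll index j >= 1 to (n, m, trig), trig in {'none', 'cos', 'sin'}."""
    if j < 1:
        raise ValueError("Noll indices start from 1")
    # Degree n: largest n with n(n + 1)/2 <= j - 1.
    n = (isqrt(8 * (j - 1) + 1) - 1) // 2
    k = j - 1 - n * (n + 1) // 2          # position inside degree n, 0..n
    m = 2 * ((k + 1) // 2) if n % 2 == 0 else 2 * (k // 2) + 1
    if m == 0:
        return n, 0, "none"
    return n, m, ("cos" if j % 2 == 0 else "sin")


def nm_to_noll(n, m, trig="none"):
    """Inverse mapping; m is non-negative and trig selects the cos or sin term."""
    base = n * (n + 1) // 2 + m
    if m == 0:
        return base + 1
    # The two terms with this (n, m) occupy base and base + 1;
    # the cosine term takes the even index, the sine term the odd one.
    even = base if base % 2 == 0 else base + 1
    odd = base if base % 2 == 1 else base + 1
    return even if trig == "cos" else odd
```

For instance, `noll_to_nm(12)` gives `(4, 2, 'cos')`, in agreement with the 0° secondary astigmatism row of table 2.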

Table 2 lists the first 37-term real orthonormal Zernike circle polynomials in the polar and Cartesian coordinate systems and the values for n, m, and j.

Table 2. First 37-term orthonormal Zernike circle polynomials under the Noll indices [25, 36].

| j | n | m | $Z_j(\rho ,\theta )$ | $Z_j(x,y)$ | Aberration |
| --- | --- | --- | --- | --- | --- |
| 1 | 0 | 0 | $1$ | $1$ | Piston |
| 2 | 1 | 1 | $2\rho \cos \theta $ | $2x$ | x-tilt |
| 3 | 1 | 1 | $2\rho \sin \theta $ | $2y$ | y-tilt |
| 4 | 2 | 0 | $\sqrt 3 (2{\rho ^2} - 1)$ | $\sqrt 3 [2({x^2} + {y^2}) - 1]$ | Defocus |
| 5 | 2 | 2 | $\sqrt 6 {\rho ^2}\sin 2\theta $ | $2\sqrt 6 xy$ | 45° Primary astigmatism |
| 6 | 2 | 2 | $\sqrt 6 {\rho ^2}\cos 2\theta $ | $\sqrt 6 ({x^2} - {y^2})$ | 0° Primary astigmatism |
| 7 | 3 | 1 | $\sqrt 8 (3{\rho ^3} - 2\rho )\sin \theta $ | $\sqrt 8 y[3({x^2} + {y^2}) - 2]$ | Primary y-coma |
| 8 | 3 | 1 | $\sqrt 8 (3{\rho ^3} - 2\rho )\cos \theta $ | $\sqrt 8 x[3({x^2} + {y^2}) - 2]$ | Primary x-coma |
| 9 | 3 | 3 | $\sqrt 8 {\rho ^3}\sin 3\theta $ | $\sqrt 8 y(3{x^2} - {y^2})$ | |
| 10 | 3 | 3 | $\sqrt 8 {\rho ^3}\cos 3\theta $ | $\sqrt 8 x({x^2} - 3{y^2})$ | |
| 11 | 4 | 0 | $\sqrt 5 (6{\rho ^4} - 6{\rho ^2} + 1)$ | $\sqrt 5 [6{({x^2} + {y^2})^2} - 6({x^2} + {y^2}) + 1]$ | Primary spherical aberration |
| 12 | 4 | 2 | $\sqrt {10} (4{\rho ^4} - 3{\rho ^2})\cos 2\theta $ | $\sqrt {10} ({x^2} - {y^2})[4({x^2} + {y^2}) - 3]$ | 0° Secondary astigmatism |
| 13 | 4 | 2 | $\sqrt {10} (4{\rho ^4} - 3{\rho ^2})\sin 2\theta $ | $2\sqrt {10} xy[4({x^2} + {y^2}) - 3]$ | 45° Secondary astigmatism |
| 14 | 4 | 4 | $\sqrt {10} {\rho ^4}\cos 4\theta $ | $\sqrt {10} [{({x^2} + {y^2})^2} - 8{x^2}{y^2}]$ | |
| 15 | 4 | 4 | $\sqrt {10} {\rho ^4}\sin 4\theta $ | $4\sqrt {10} xy({x^2} - {y^2})$ | |
| 16 | 5 | 1 | $\sqrt {12} (10{\rho ^5} - 12{\rho ^3} + 3\rho )\cos \theta $ | $\sqrt {12} x[10{({x^2} + {y^2})^2} - 12({x^2} + {y^2}) + 3]$ | Secondary x-coma |
| 17 | 5 | 1 | $\sqrt {12} (10{\rho ^5} - 12{\rho ^3} + 3\rho )\sin \theta $ | $\sqrt {12} y[10{({x^2} + {y^2})^2} - 12({x^2} + {y^2}) + 3]$ | Secondary y-coma |
| 18 | 5 | 3 | $\sqrt {12} (5{\rho ^5} - 4{\rho ^3})\cos 3\theta $ | $\sqrt {12} x({x^2} - 3{y^2})[5({x^2} + {y^2}) - 4]$ | |
| 19 | 5 | 3 | $\sqrt {12} (5{\rho ^5} - 4{\rho ^3})\sin 3\theta $ | $\sqrt {12} y(3{x^2} - {y^2})[5({x^2} + {y^2}) - 4]$ | |
| 20 | 5 | 5 | $\sqrt {12} {\rho ^5}\cos 5\theta $ | $\sqrt {12} x[16{x^4} - 20{x^2}({x^2} + {y^2}) + 5{({x^2} + {y^2})^2}]$ | |
| 21 | 5 | 5 | $\sqrt {12} {\rho ^5}\sin 5\theta $ | $\sqrt {12} y[16{y^4} - 20{y^2}({x^2} + {y^2}) + 5{({x^2} + {y^2})^2}]$ | |
| 22 | 6 | 0 | $\sqrt 7 (20{\rho ^6} - 30{\rho ^4} + 12{\rho ^2} - 1)$ | $\sqrt 7 [20{({x^2} + {y^2})^3} - 30{({x^2} + {y^2})^2} + 12({x^2} + {y^2}) - 1]$ | Secondary spherical |
| 23 | 6 | 2 | $\sqrt {14} (15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\sin 2\theta $ | $2\sqrt {14} xy[15{({x^2} + {y^2})^2} - 20({x^2} + {y^2}) + 6]$ | 45° Tertiary astigmatism |
| 24 | 6 | 2 | $\sqrt {14} (15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\cos 2\theta $ | $\sqrt {14} ({x^2} - {y^2})[15{({x^2} + {y^2})^2} - 20({x^2} + {y^2}) + 6]$ | 0° Tertiary astigmatism |
| 25 | 6 | 4 | $\sqrt {14} (6{\rho ^6} - 5{\rho ^4})\sin 4\theta $ | $4\sqrt {14} xy({x^2} - {y^2})[6({x^2} + {y^2}) - 5]$ | |
| 26 | 6 | 4 | $\sqrt {14} (6{\rho ^6} - 5{\rho ^4})\cos 4\theta $ | $\sqrt {14} [{({x^2} + {y^2})^2} - 8{x^2}{y^2}][6({x^2} + {y^2}) - 5]$ | |
| 27 | 6 | 6 | $\sqrt {14} {\rho ^6}\sin 6\theta $ | $\sqrt {14} xy[32{x^4} - 32{x^2}({x^2} + {y^2}) + 6{({x^2} + {y^2})^2}]$ | |
| 28 | 6 | 6 | $\sqrt {14} {\rho ^6}\cos 6\theta $ | $\sqrt {14} [32{x^6} - 48{x^4}({x^2} + {y^2}) + 18{x^2}{({x^2} + {y^2})^2} - {({x^2} + {y^2})^3}]$ | |
| 29 | 7 | 1 | $4(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\sin \theta $ | $4y[35{({x^2} + {y^2})^3} - 60{({x^2} + {y^2})^2} + 30({x^2} + {y^2}) - 4]$ | Tertiary y-coma |
| 30 | 7 | 1 | $4(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\cos \theta $ | $4x[35{({x^2} + {y^2})^3} - 60{({x^2} + {y^2})^2} + 30({x^2} + {y^2}) - 4]$ | Tertiary x-coma |
| 31 | 7 | 3 | $4(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\sin 3\theta $ | $4y(3{x^2} - {y^2})[21{({x^2} + {y^2})^2} - 30({x^2} + {y^2}) + 10]$ | |
| 32 | 7 | 3 | $4(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\cos 3\theta $ | $4x({x^2} - 3{y^2})[21{({x^2} + {y^2})^2} - 30({x^2} + {y^2}) + 10]$ | |
| 33 | 7 | 5 | $4(7{\rho ^7} - 6{\rho ^5})\sin 5\theta $ | $4[4{x^2}y({x^2} - {y^2}) + y{({x^2} + {y^2})^2} - 8{x^2}{y^3}][7({x^2} + {y^2}) - 6]$ | |
| 34 | 7 | 5 | $4(7{\rho ^7} - 6{\rho ^5})\cos 5\theta $ | $4[x{({x^2} + {y^2})^2} - 8{x^3}{y^2} - 4x{y^2}({x^2} - {y^2})][7({x^2} + {y^2}) - 6]$ | |
| 35 | 7 | 7 | $4{\rho ^7}\sin 7\theta $ | $8{x^2}y[3{({x^2} + {y^2})^2} - 16{x^2}{y^2}] + 4y({x^2} - {y^2})[{({x^2} + {y^2})^2} - 16{x^2}{y^2}]$ | |
| 36 | 7 | 7 | $4{\rho ^7}\cos 7\theta $ | $4x({x^2} - {y^2})[{({x^2} + {y^2})^2} - 16{x^2}{y^2}] - 8x{y^2}[3{({x^2} + {y^2})^2} - 16{x^2}{y^2}]$ | |
| 37 | 8 | 0 | $3(70{\rho ^8} - 140{\rho ^6} + 90{\rho ^4} - 20{\rho ^2} + 1)$ | $3[70{({x^2} + {y^2})^4} - 140{({x^2} + {y^2})^3} + 90{({x^2} + {y^2})^2} - 20({x^2} + {y^2}) + 1]$ | Tertiary spherical |

The non-normalized real Zernike circle polynomials can be obtained by dropping the normalization factors from the normalized Zernike circle polynomials as:

Equation (10)

They satisfy the following relationship:

Equation (11)

The orthogonality of non-normalized Zernike circle polynomials can be written as:

Equation (12)

Note that the integral in the denominator is equal to π. Figures 7 and 8 show the three-dimensional (3D) visualization of the non-normalized Zernike circle polynomials up to the sixth degree and their corresponding interferometric fringe patterns as in optical testing [37].
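As a cross-check of the normalization factors in table 2, the orthogonality constants of the non-normalized polynomials can be worked out from a standard property of the radial polynomials. The relations below are quoted from the general literature as an assumption; they are not a transcription of equation (12).

```latex
\int_0^1 R_n^m(\rho)\, R_{n'}^m(\rho)\, \rho \,\mathrm{d}\rho
  = \frac{\delta_{nn'}}{2(n+1)},
\qquad
\frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi}
  \left[ R_n^m(\rho)\cos m\theta \right]^2 \rho \,\mathrm{d}\theta \,\mathrm{d}\rho
  = \begin{cases}
      \dfrac{1}{n+1},    & m = 0, \\[2mm]
      \dfrac{1}{2(n+1)}, & m \neq 0 .
    \end{cases}
```

The same constants hold when the cosine is replaced by a sine for m ≠ 0. Their reciprocals are the squares of the factors $\sqrt {n + 1} $ and $\sqrt {2(n + 1)} $ that multiply the radial polynomials in table 2.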

Figure 7. Pyramid of the non-normalized Zernike circle polynomials up to the sixth degree under the Noll indexing scheme.

Figure 8. Interferometric fringe patterns corresponding to the Zernike aberrations shown in figure 7.

2.1.1.2. Complex Zernike circle polynomials.

The Zernike circle polynomials in the complex domain were not defined in Noll's original definition [9]. However, they can be obtained based on Bhatia and Wolf's work [3] by replacing the azimuthal functions in real Zernike circle polynomials with a complex exponential function. The orthonormal complex Zernike circle polynomials can be written as [4]:

Equation (13)

where n is a non-negative integer, l is an integer, n − |l| ⩾ 0 and is even. The radial polynomial is defined as:

Equation (14)

The normalized complex Zernike circle polynomials meet the following orthonormality condition:

Equation (15)

where * denotes complex conjugate.

The non-normalized complex Zernike circle polynomials are defined as [32]:

Equation (16)

The orthogonality can be expressed as:

Equation (17)

The complex and real definitions of Zernike circle polynomials are related via Euler's formula [3, 32]. The complex version is useful for defining Zernike moments in image analysis, which will be discussed in section 4.6.
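The relation between the two definitions can be made explicit under one stated assumption: that the orthonormal complex polynomials take the form $V_n^l = \sqrt {n + 1} \,R_n^{|l|}(\rho )\,{e^{il\theta }}$, which is consistent with an orthonormality integral divided by π. With that assumption, and writing m = |l| > 0, Euler's formula gives the following sketch.

```latex
\sqrt{2(n+1)}\, R_n^m(\rho)\cos m\theta
  = \frac{V_n^{m} + V_n^{-m}}{\sqrt{2}},
\qquad
\sqrt{2(n+1)}\, R_n^m(\rho)\sin m\theta
  = \frac{V_n^{m} - V_n^{-m}}{\sqrt{2}\, i},
\qquad
\sqrt{n+1}\, R_n^0(\rho) = V_n^{0}.
```

The left-hand sides are the real orthonormal terms of table 2, so each real cosine and sine pair is an equal-weight combination of the two complex polynomials with azimuthal frequencies +m and −m.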

2.1.2. The OSA/ANSI indexing scheme.

The OSA/ANSI indices for Zernike circle polynomials were initially developed by an OSA Standards Taskforce in 1999 to reach consensus recommendations on definitions, conventions, and standards for reporting the optical aberrations of human eyes [26, 27]. They were later standardized in ANSI Z80.28 [13, 14] and ISO 24157 [15, 17] and adopted in some commercial software, such as the COMSOL Ray Optics Module [38].

The Zernike circle polynomials in the OSA/ANSI indices employ a right-handed coordinate system, as shown in figure 9, and are defined in the real domain as [13, 14, 26, 27]:

Equation (18)

Figure 9. Conventional right-handed coordinate system for the eye in Cartesian and polar forms.

where n is a non-negative integer, m is an integer, n − |m| ⩾ 0 and is even, j is a mode-ordering number starting from 0. The radial polynomial, $R_n^m(\rho )$, is defined as:

Equation (19)

The normalization factor, $N_n^m$, can be written as:

Equation (20)

The Zernike circle polynomials under the OSA/ANSI indices can be sorted by either the single index, j, or the double indices, n and m. To achieve conversion among these indices, one can use the following relationships [26, 27]:

Equation (21)

where ⌈x⌉ denotes the ceiling function that gives as output the least integer greater than or equal to x. For example, when j = 4, n = ⌈1.7⌉ = 2, m = 0. Table 3 lists the first 37-term Zernike circle polynomials in the OSA/ANSI indices and the values for n, m, and j.
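The conversions can be sketched in Python as follows. The sketch assumes the commonly quoted OSA/ANSI relations j = [n(n + 2) + m]/2, n = ⌈(−3 + √(9 + 8j))/2⌉, and m = 2j − n(n + 2), which reproduce the worked example for j = 4, together with the normalization factor √(2(n + 1)/(1 + δ_m0)) implied by table 3. The function names are hypothetical.

```python
import math


def osa_index(n, m):
    """Single OSA/ANSI index j (starting from 0) for degree n and signed frequency m."""
    if abs(m) > n or (n - abs(m)) % 2:
        raise ValueError("need n >= |m| and n - |m| even")
    return (n * (n + 2) + m) // 2


def osa_to_nm(j):
    """Recover (n, m) from the single index j using the ceiling relation."""
    n = math.ceil((-3.0 + math.sqrt(9.0 + 8.0 * j)) / 2.0)
    m = 2 * j - n * (n + 2)
    return n, m


def osa_norm(n, m):
    """Assumed normalization factor: sqrt(n + 1) for m = 0, sqrt(2(n + 1)) otherwise."""
    return math.sqrt((n + 1) if m == 0 else 2 * (n + 1))
```

For j = 4 the ceiling argument is about 1.7, so `osa_to_nm(4)` returns `(2, 0)`, the defocus term. The floating-point square root is adequate for the small indices used in practice; an integer square root would be a safer choice for very large j.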

Table 3. First 37-term Zernike polynomials under the OSA/ANSI indices [14, 27, 39].

| j | n | m | $Z_j$ | Aberration |
| --- | --- | --- | --- | --- |
| 0 | 0 | 0 | $1$ | Piston |
| 1 | 1 | −1 | $2\rho \sin \theta $ | y-tilt |
| 2 | 1 | 1 | $2\rho \cos \theta $ | x-tilt |
| 3 | 2 | −2 | $\sqrt 6 {\rho ^2}\sin 2\theta $ | 45° Primary astigmatism |
| 4 | 2 | 0 | $\sqrt 3 (2{\rho ^2} - 1)$ | Defocus |
| 5 | 2 | 2 | $\sqrt 6 {\rho ^2}\cos 2\theta $ | 0° Primary astigmatism |
| 6 | 3 | −3 | $\sqrt 8 {\rho ^3}\sin 3\theta $ | |
| 7 | 3 | −1 | $\sqrt 8 (3{\rho ^3} - 2\rho )\sin \theta $ | Primary y-coma |
| 8 | 3 | 1 | $\sqrt 8 (3{\rho ^3} - 2\rho )\cos \theta $ | Primary x-coma |
| 9 | 3 | 3 | $\sqrt 8 {\rho ^3}\cos 3\theta $ | |
| 10 | 4 | −4 | $\sqrt {10} {\rho ^4}\sin 4\theta $ | |
| 11 | 4 | −2 | $\sqrt {10} (4{\rho ^4} - 3{\rho ^2})\sin 2\theta $ | 45° Secondary astigmatism |
| 12 | 4 | 0 | $\sqrt 5 (6{\rho ^4} - 6{\rho ^2} + 1)$ | Primary spherical aberration |
| 13 | 4 | 2 | $\sqrt {10} (4{\rho ^4} - 3{\rho ^2})\cos 2\theta $ | 0° Secondary astigmatism |
| 14 | 4 | 4 | $\sqrt {10} {\rho ^4}\cos 4\theta $ | |
| 15 | 5 | −5 | $\sqrt {12} {\rho ^5}\sin 5\theta $ | |
| 16 | 5 | −3 | $\sqrt {12} (5{\rho ^5} - 4{\rho ^3})\sin 3\theta $ | |
| 17 | 5 | −1 | $\sqrt {12} (10{\rho ^5} - 12{\rho ^3} + 3\rho )\sin \theta $ | Secondary y-coma |
| 18 | 5 | 1 | $\sqrt {12} (10{\rho ^5} - 12{\rho ^3} + 3\rho )\cos \theta $ | Secondary x-coma |
| 19 | 5 | 3 | $\sqrt {12} (5{\rho ^5} - 4{\rho ^3})\cos 3\theta $ | |
| 20 | 5 | 5 | $\sqrt {12} {\rho ^5}\cos 5\theta $ | |
| 21 | 6 | −6 | $\sqrt {14} {\rho ^6}\sin 6\theta $ | |
| 22 | 6 | −4 | $\sqrt {14} (6{\rho ^6} - 5{\rho ^4})\sin 4\theta $ | |
| 23 | 6 | −2 | $\sqrt {14} (15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\sin 2\theta $ | 45° Tertiary astigmatism |
| 24 | 6 | 0 | $\sqrt 7 (20{\rho ^6} - 30{\rho ^4} + 12{\rho ^2} - 1)$ | Secondary spherical |
| 25 | 6 | 2 | $\sqrt {14} (15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\cos 2\theta $ | 0° Tertiary astigmatism |
| 26 | 6 | 4 | $\sqrt {14} (6{\rho ^6} - 5{\rho ^4})\cos 4\theta $ | |
| 27 | 6 | 6 | $\sqrt {14} {\rho ^6}\cos 6\theta $ | |
| 28 | 7 | −7 | $4{\rho ^7}\sin 7\theta $ | |
| 29 | 7 | −5 | $4(7{\rho ^7} - 6{\rho ^5})\sin 5\theta $ | |
| 30 | 7 | −3 | $4(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\sin 3\theta $ | |
| 31 | 7 | −1 | $4(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\sin \theta $ | Tertiary y-coma |
| 32 | 7 | 1 | $4(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\cos \theta $ | Tertiary x-coma |
| 33 | 7 | 3 | $4(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\cos 3\theta $ | |
| 34 | 7 | 5 | $4(7{\rho ^7} - 6{\rho ^5})\cos 5\theta $ | |
| 35 | 7 | 7 | $4{\rho ^7}\cos 7\theta $ | |
| 36 | 8 | −8 | $\sqrt {18} {\rho ^8}\sin 8\theta $ | |

2.1.3. The Fringe indexing scheme.

The Zernike circle polynomials under the fringe indexing scheme (also known as the USAF set) were first developed by John Loomis in an interferogram analysis program called FRINGE at the University of Arizona, Optical Sciences Center in the 1970s [10, 11, 40, 41]. They are a low-order Zernike set supplemented with radial polynomials of higher order and are preferred for lens design and optical metrology because they group terms according to optical wavefront aberration order [42, 43].

The Zernike circle polynomials under the fringe indices do not have normalization factors and can be written as:

Equation (22)

where n is a non-negative integer, m is an integer, n − |m| ⩾ 0 and is even, and j is a mode-ordering number starting from 0 (in CODE V and Zemax, j starts from 1 instead of 0). The radial polynomial is expressed as:

Equation (23)

Note that the above formulas are modified from the Wyant and Creath formula [19] to facilitate comparisons with other indices. The final mathematical expression for each term, as listed in table 4, is the same as that in Wyant's notation.

Table 4. Fringe set of the Zernike circle polynomials [19, 25, 29].

| j | N | n | m | $Z_j(\rho ,\theta )$ | Aberration |
| --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | 0 | $1$ | Piston |
| 1 | 1 | 1 | +1 | $\rho \cos \theta $ | Tilt X |
| 2 | 1 | 1 | −1 | $\rho \sin \theta $ | Tilt Y |
| 3 | 1 | 2 | 0 | $2{\rho ^2} - 1$ | Defocus |
| 4 | 2 | 2 | +2 | ${\rho ^2}\cos 2\theta $ | Astigmatism X |
| 5 | 2 | 2 | −2 | ${\rho ^2}\sin 2\theta $ | Astigmatism Y |
| 6 | 2 | 3 | +1 | $(3{\rho ^3} - 2\rho )\cos \theta $ | Coma X |
| 7 | 2 | 3 | −1 | $(3{\rho ^3} - 2\rho )\sin \theta $ | Coma Y |
| 8 | 2 | 4 | 0 | $6{\rho ^4} - 6{\rho ^2} + 1$ | Primary spherical |
| 9 | 3 | 3 | +3 | ${\rho ^3}\cos 3\theta $ | Trefoil X |
| 10 | 3 | 3 | −3 | ${\rho ^3}\sin 3\theta $ | Trefoil Y |
| 11 | 3 | 4 | +2 | $(4{\rho ^4} - 3{\rho ^2})\cos 2\theta $ | Secondary X astigmatism |
| 12 | 3 | 4 | −2 | $(4{\rho ^4} - 3{\rho ^2})\sin 2\theta $ | Secondary Y astigmatism |
| 13 | 3 | 5 | +1 | $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\cos \theta $ | Secondary X coma |
| 14 | 3 | 5 | −1 | $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\sin \theta $ | Secondary Y coma |
| 15 | 3 | 6 | 0 | $20{\rho ^6} - 30{\rho ^4} + 12{\rho ^2} - 1$ | Secondary spherical |
| 16 | 4 | 4 | +4 | ${\rho ^4}\cos 4\theta $ | Tetrafoil X |
| 17 | 4 | 4 | −4 | ${\rho ^4}\sin 4\theta $ | Tetrafoil Y |
| 18 | 4 | 5 | +3 | $(5{\rho ^5} - 4{\rho ^3})\cos 3\theta $ | Secondary X trefoil |
| 19 | 4 | 5 | −3 | $(5{\rho ^5} - 4{\rho ^3})\sin 3\theta $ | Secondary Y trefoil |
| 20 | 4 | 6 | +2 | $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\cos 2\theta $ | Tertiary X astigmatism |
| 21 | 4 | 6 | −2 | $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\sin 2\theta $ | Tertiary Y astigmatism |
| 22 | 4 | 7 | +1 | $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\cos \theta $ | Tertiary X coma |
| 23 | 4 | 7 | −1 | $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\sin \theta $ | Tertiary Y coma |
| 24 | 4 | 8 | 0 | $70{\rho ^8} - 140{\rho ^6} + 90{\rho ^4} - 20{\rho ^2} + 1$ | Tertiary spherical |
| 25 | 5 | 5 | +5 | ${\rho ^5}\cos 5\theta $ | Pentafoil X |
| 26 | 5 | 5 | −5 | ${\rho ^5}\sin 5\theta $ | Pentafoil Y |
| 27 | 5 | 6 | +4 | $(6{\rho ^6} - 5{\rho ^4})\cos 4\theta $ | Secondary X tetrafoil |
| 28 | 5 | 6 | −4 | $(6{\rho ^6} - 5{\rho ^4})\sin 4\theta $ | Secondary Y tetrafoil |
| 29 | 5 | 7 | +3 | $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\cos 3\theta $ | Tertiary X trefoil |
| 30 | 5 | 7 | −3 | $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\sin 3\theta $ | Tertiary Y trefoil |
| 31 | 5 | 8 | +2 | $(56{\rho ^8} - 105{\rho ^6} + 60{\rho ^4} - 10{\rho ^2})\cos 2\theta $ | Quaternary X astigmatism |
| 32 | 5 | 8 | −2 | $(56{\rho ^8} - 105{\rho ^6} + 60{\rho ^4} - 10{\rho ^2})\sin 2\theta $ | Quaternary Y astigmatism |
| 33 | 5 | 9 | +1 | $(126{\rho ^9} - 280{\rho ^7} + 210{\rho ^5} - 60{\rho ^3} + 5\rho )\cos \theta $ | Quaternary X coma |
| 34 | 5 | 9 | −1 | $(126{\rho ^9} - 280{\rho ^7} + 210{\rho ^5} - 60{\rho ^3} + 5\rho )\sin \theta $ | Quaternary Y coma |
| 35 | 5 | 10 | 0 | $252{\rho ^{10}} - 630{\rho ^8} + 560{\rho ^6} - 210{\rho ^4} + 30{\rho ^2} - 1$ | Quaternary spherical |
| 36 | 6 | 12 | 0 | $924{\rho ^{12}} - 2772{\rho ^{10}} + 3150{\rho ^8} - 1680{\rho ^6} + 420{\rho ^4} - 42{\rho ^2} + 1$ | |

Defining N = (n + |m|)/2, Zernike fringe polynomials can be sorted as follows. First arrange N in ascending order from 0 to 6; then sort n in ascending order for a given value of N; finally organize m in descending order for given values of N and n. Compared with other Zernike sets, the Zernike fringe set is unique in that it has only 37 terms (N ⩽ 6); as table 4 shows, the only N = 6 term retained is the rotationally symmetric one with n = 12 and m = 0. This small polynomial set is useful for interferogram analysis and automatic lens design and is widely adopted in commercial optical software, such as Zemax [25], CODE V [29], OSLO [44, 45], and MetroPro [46].
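The sorting rule can be sketched in Python. The sketch reads the truncation off table 4: every admissible (n, m) with N ⩽ 5 is kept, and from N = 6 only the term with n = 12 and m = 0 is appended. That reading, the zero-based position, and the function name are assumptions of this illustration.

```python
def fringe_terms():
    """Return the 37 (n, m) pairs of the fringe set in table-4 order (j = 0..36)."""
    # Admissible pairs: |m| <= n and n - |m| even, restricted to N = (n + |m|)/2 <= 5.
    candidates = [
        (n, m)
        for n in range(0, 11)
        for m in range(-n, n + 1, 2)
        if (n + abs(m)) // 2 <= 5
    ]
    # Sort by N ascending, then n ascending, then m descending.
    ordered = sorted(
        candidates,
        key=lambda t: ((t[0] + abs(t[1])) // 2, t[0], -t[1]),
    )
    # Only the rotationally symmetric N = 6 term is retained (see table 4).
    return ordered + [(12, 0)]
```

The position of a pair in the returned list is its zero-based index j, so `fringe_terms()[8]` is `(4, 0)`, the primary spherical term. Keeping all N = 6 terms would give 49 polynomials rather than 37.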

2.1.4. The ISO-14999 indexing scheme.

The ISO-14999 indices were first published by ISO in the technical report ISO/TR 14999-2 in 2005 [28] for describing wavefronts in interferometric measurements of optical elements and optical systems, and were updated in 2019 [16].

The Zernike circle polynomials under the ISO-14999 indexing scheme do not have normalization factors and can be written as [16, 28]:

Equation (24)

where n is a non-negative integer, m is an integer, n − |m| ⩾ 0 and is even, and j is a mode-ordering number starting from 0 (j = 0, 1, 2, ..., ∞). The radial polynomial is expressed as:

Equation (25)

Defining N = n + |m|, the Zernike circle polynomials under the ISO-14999 indices are sorted as follows. First arrange N in ascending order from 0 to ∞; then sort n in ascending order for a given value of N; finally organize m in descending order for given values of N and n. The ISO-14999 indices share almost the same definition as the fringe indices, except that the ISO-14999 set contains an infinite number of terms. In fact, the fringe set is a subset of the ISO-14999 set, which is called the Extended Fringe Zernike Polynomials in CODE V [29]. Table 5 lists the first 37 terms of the ISO-14999 set and the values for n, m, N, and j.
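Because the ordering has no upper limit on N, a closed-form index is convenient. The sketch below assumes the commonly quoted one-based fringe expression, (1 + (n + |m|)/2)² − 2|m| + (1 if m < 0, else 0), and subtracts one to obtain a zero-based index; it is an illustration checked against the rows of table 5, not a formula taken from the ISO documents, and the function name is hypothetical.

```python
def iso_index(n, m):
    """Zero-based index j of (n, m) in the N-ascending, n-ascending, m-descending order."""
    if abs(m) > n or (n - abs(m)) % 2:
        raise ValueError("need n >= |m| and n - |m| even")
    half_sum = (n + abs(m)) // 2          # equals N / 2 with N = n + |m|
    sine_offset = 1 if m < 0 else 0       # the -m (sine) term follows the +m (cosine) term
    return (1 + half_sum) ** 2 - 2 * abs(m) + sine_offset - 1
```

For example, `iso_index(8, -2)` returns 32, in agreement with the 45° quaternary astigmatism row of table 5. Under the same assumption the n = 12, m = 0 polynomial sits at j = 48 in the infinite set, whereas the truncated fringe set lists it as its 37th and last term.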

Table 5. First 37-term Zernike circle polynomials under the ISO-14999 indices [16, 28].

j N n m Zj Aberration
0 0 0 0 $1$ Piston
1 2 1 1 $\rho \cos \theta $ x-tilt
2 2 1 −1 $\rho \sin \theta $ y-tilt
3 2 2 0 $2{\rho ^2} - 1$ Defocus
4 4 2 2 ${\rho ^2}\cos 2\theta $ 0° Primary astigmatism
5 4 2 −2 ${\rho ^2}\sin 2\theta $ 45° Primary astigmatism
6 4 3 1 $(3{\rho ^3} - 2\rho )\cos \theta $ Primary x-coma
7 4 3 −1 $(3{\rho ^3} - 2\rho )\sin \theta $ Primary y-coma
8 4 4 0 $6{\rho ^4} - 6{\rho ^2} + 1$ Primary spherical aberration
9 6 3 3 ${\rho ^3}\cos 3\theta $
10 6 3 −3 ${\rho ^3}\sin 3\theta $
11 6 4 2 $(4{\rho ^4} - 3{\rho ^2})\cos 2\theta $ 0° Secondary astigmatism
12 6 4 −2 $(4{\rho ^4} - 3{\rho ^2})\sin 2\theta $ 45° Secondary astigmatism
13 6 5 1 $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\cos \theta $ Secondary x-coma
14 6 5 −1 $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\sin \theta $ Secondary y-coma
15 6 6 0 $20{\rho ^6} - 30{\rho ^4} + 12{\rho ^2} - 1$ Secondary spherical
16 8 4 4 ${\rho ^4}\cos 4\theta $
17 8 4 −4 ${\rho ^4}\sin 4\theta $
18 8 5 3 $(5{\rho ^5} - 4{\rho ^3})\cos 3\theta $
19 8 5 −3 $(5{\rho ^5} - 4{\rho ^3})\sin 3\theta $
20 8 6 2 $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\cos 2\theta $ 0° Tertiary astigmatism
21 8 6 −2 $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\sin 2\theta $ 45° Tertiary astigmatism
22 8 7 1 $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\cos \theta $ Tertiary x-coma
23 8 7 −1 $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\sin \theta $ Tertiary y-coma
24 8 8 0 $70{\rho ^8} - 140{\rho ^6} + 90{\rho ^4} - 20{\rho ^2} + 1$ Tertiary spherical
25 10 5 5 ${\rho ^5}\cos 5\theta $
26 10 5 −5 ${\rho ^5}\sin 5\theta $
27 10 6 4 $(6{\rho ^6} - 5{\rho ^4})\cos 4\theta $
28 10 6 −4 $(6{\rho ^6} - 5{\rho ^4})\sin 4\theta $
29 10 7 3 $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\cos 3\theta $
30 10 7 −3 $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\sin 3\theta $
31 10 8 2 $(56{\rho ^8} - 105{\rho ^6} + 60{\rho ^4} - 10{\rho ^2})\cos 2\theta $ 0° Quaternary astigmatism
32 10 8 −2 $(56{\rho ^8} - 105{\rho ^6} + 60{\rho ^4} - 10{\rho ^2})\sin 2\theta $ 45° Quaternary astigmatism
33 10 9 1 $(126{\rho ^9} - 280{\rho ^7} + 210{\rho ^5} - 60{\rho ^3} + 5\rho )\cos \theta $ Quaternary x-coma
34 10 9 −1 $(126{\rho ^9} - 280{\rho ^7} + 210{\rho ^5} - 60{\rho ^3} + 5\rho )\sin \theta $ Quaternary y-coma
35 10 10 0 $252{\rho ^{10}} - 630{\rho ^8} + 560{\rho ^6} - 210{\rho ^4} + 30{\rho ^2} - 1$ Quaternary spherical
36 12 6 6 ${\rho ^6}\cos 6\theta $
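The ISO-14999 sorting rule (N = n + |m| ascending, then n ascending, then m descending) can be sketched in a few lines of Python. The generator name `iso14999_indices` is hypothetical and is not taken from the standard or from any software package; the sketch only assumes the ordering rule as stated in the text.

```python
from itertools import count, islice

def iso14999_indices():
    """Yield (n, m) pairs in ISO-14999 order, the first pair being j = 0."""
    for N in count(0, 2):                  # N = n + |m| is always even
        for n in range(N // 2, N + 1):     # n ascending; n >= |m| forces n >= N/2
            abs_m = N - n
            yield (n, abs_m)               # m descending: +|m| first ...
            if abs_m > 0:
                yield (n, -abs_m)          # ... then -|m|

# First 37 terms, so that first37[j] is the (n, m) pair of mode j
first37 = list(islice(iso14999_indices(), 37))
```

Under this rule, `first37[8]` is the primary spherical term (n = 4, m = 0) and `first37[36]` is (n = 6, m = 6), in agreement with the rows of table 5.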

2.1.5. The Born and Wolf indexing scheme.

In the classic textbook Principles of Optics, Born and Wolf reviewed the definition of Zernike circle polynomials and used it for the expansion of aberration functions [32, 47]. Many authors [12, 48] later followed the Born and Wolf definition and treated it as the 'standard' indexing scheme. However, as pointed out in Born and Wolf's book (appendix VII in the 7th expanded edition [32]), the indices actually originated from Bhatia and Wolf's work published in 1954 [3].

The Zernike circle polynomials under the Born and Wolf indices do not have normalization factors and can be written as [3, 32]:

Equation (26)

where n is a non-negative integer, m is an integer, n − |m| ⩾ 0 and is even, and j is a mode-ordering number starting from 1 (j = 1, 2, ..., ∞). The radial polynomial is expressed as:

Equation (27)

The Zernike circle polynomials under the Born and Wolf indices are sorted as follows. First arrange n in ascending order from 0 to ∞ and then sort m in descending order for a given value of n. The Born and Wolf indices are used by several authors [12, 48] and software [29]. For example, although the software CODE V does not explicitly define the standard Zernike polynomials, the tabulated polynomials in its manual [29] have the same expressions as those in the Born and Wolf indices. Table 6 lists the first 37-term Zernike polynomials under the Born and Wolf indices and the values for n, m, and j.

Table 6. First 37-term Zernike circle polynomials under the Born and Wolf indices [29].

j n m Zj (ρ, θ) Aberration
1 0 0 $1$ Piston
2 1 1 $\rho \cos \theta $ x-tilt
3 1 −1 $\rho \sin \theta $ y-tilt
4 2 2 ${\rho ^2}\cos 2\theta $ 0° Primary astigmatism
5 2 0 $2{\rho ^2} - 1$ Defocus
6 2 −2 ${\rho ^2}\sin 2\theta $ 45° Primary astigmatism
7 3 3 ${\rho ^3}\cos 3\theta $
8 3 1 $(3{\rho ^3} - 2\rho )\cos \theta $ Primary x-coma
9 3 −1 $(3{\rho ^3} - 2\rho )\sin \theta $ Primary y-coma
10 3 −3 ${\rho ^3}\sin 3\theta $
11 4 4 ${\rho ^4}\cos 4\theta $
12 4 2 $(4{\rho ^4} - 3{\rho ^2})\cos 2\theta $ 0° Secondary astigmatism
13 4 0 $6{\rho ^4} - 6{\rho ^2} + 1$ Primary spherical aberration
14 4 −2 $(4{\rho ^4} - 3{\rho ^2})\sin 2\theta $ 45° Secondary astigmatism
15 4 −4 ${\rho ^4}\sin 4\theta $
16 5 5 ${\rho ^5}\cos 5\theta $
17 5 3 $(5{\rho ^5} - 4{\rho ^3})\cos 3\theta $
18 5 1 $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\cos \theta $ Secondary x-coma
19 5 −1 $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\sin \theta $ Secondary y-coma
20 5 −3 $(5{\rho ^5} - 4{\rho ^3})\sin 3\theta $
21 5 −5 ${\rho ^5}\sin 5\theta $
22 6 6 ${\rho ^6}\cos 6\theta $
23 6 4 $(6{\rho ^6} - 5{\rho ^4})\cos 4\theta $
24 6 2 $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\cos 2\theta $ 0° Tertiary astigmatism
25 6 0 $20{\rho ^6} - 30{\rho ^4} + 12{\rho ^2} - 1$ Secondary spherical
26 6 −2 $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\sin 2\theta $ 45° Tertiary astigmatism
27 6 −4 $(6{\rho ^6} - 5{\rho ^4})\sin 4\theta $
28 6 −6 ${\rho ^6}\sin 6\theta $
29 7 7 ${\rho ^7}\cos 7\theta $
30 7 5 $(7{\rho ^7} - 6{\rho ^5})\cos 5\theta $
31 7 3 $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\cos 3\theta $
32 7 1 $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\cos \theta $ Tertiary x-coma
33 7 −1 $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\sin \theta $ Tertiary y-coma
34 7 −3 $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\sin 3\theta $
35 7 −5 $(7{\rho ^7} - 6{\rho ^5})\sin 5\theta $
36 7 −7 ${\rho ^7}\sin 7\theta $
37 8 8 ${\rho ^8}\cos 8\theta $
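As an illustration of the Born and Wolf ordering, the following sketch maps the single index j to (n, m) and evaluates the corresponding unnormalized polynomial. The function names are hypothetical. The radial polynomial uses the explicit factorial sum quoted in table 8, and the angular part follows the definition given there for this scheme (cos mθ for m ⩾ 0, sin |m|θ for m < 0).

```python
import math
import numpy as np

def radial(n, m, rho):
    """Explicit-sum radial polynomial R_n^{|m|}(rho); zero if n - |m| is odd or negative."""
    rho = np.asarray(rho, dtype=float)
    m = abs(m)
    total = np.zeros_like(rho)
    if m > n or (n - m) % 2:
        return total
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * math.factorial(n - s)
             / (math.factorial(s)
                * math.factorial((n + m) // 2 - s)
                * math.factorial((n - m) // 2 - s)))
        total = total + c * rho ** (n - 2 * s)
    return total

def born_wolf_nm(j):
    """Map j = 1, 2, ... to (n, m): n ascending, m running from n down to -n in steps of 2."""
    n = 0
    while j > n + 1:          # degree n contributes n + 1 terms
        j -= n + 1
        n += 1
    return n, n - 2 * (j - 1)

def zernike_bw(j, rho, theta):
    """Unnormalized Zernike circle polynomial Z_j under the Born and Wolf indices."""
    n, m = born_wolf_nm(j)
    angular = np.cos(m * theta) if m >= 0 else np.sin(-m * theta)
    return radial(n, m, rho) * angular
```

For example, `born_wolf_nm(13)` gives (4, 0), the primary spherical aberration term of table 6.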

2.1.6. The Malacara indexing scheme.

Different from the aforementioned five indexing schemes, the Malacara indices use a different coordinate convention, where the polar angle, θ, is measured clockwise from the +y-axis. The Zernike circle polynomials under the Malacara indices do not have normalization factors and can be written as:

Equation (28)

where n is a non-negative integer, m is an integer, n − |m| ⩾ 0 and is even, and j is a mode-ordering number starting from 1 (j = 1, 2, ..., ∞). The radial polynomial is expressed as:

Equation (29)

The Zernike circle polynomials in the Malacara indices have the same ordering scheme as the Born and Wolf indices and are sorted as follows. First arrange n in ascending order from 0 to ∞ and then sort m in descending order for a given value of n. The Malacara indices are mainly used in the first and second editions of the well-known book Optical Shop Testing [30, 31], the third edition of which, however, defines the Zernike circle polynomials under the Noll indexing scheme [36]. Table 7 lists the first 37-term Zernike circle polynomials under the Malacara indices and the values for n, m, and j.

Table 7. First 37-term Zernike circle polynomials under the Malacara indices [31].

j n m Zj (ρ, θ) Aberration
1 0 0 $1$ Piston
2 1 1 $\rho \sin \theta $ x-tilt
3 1 −1 $\rho \cos \theta $ y-tilt
4 2 2 ${\rho ^2}\sin 2\theta $ 0° Primary astigmatism
5 2 0 $2{\rho ^2} - 1$ Defocus
6 2 −2 ${\rho ^2}\cos 2\theta $ 45° Primary astigmatism
7 3 3 ${\rho ^3}\sin 3\theta $
8 3 1 $(3{\rho ^3} - 2\rho )\sin \theta $ Primary x-coma
9 3 −1 $(3{\rho ^3} - 2\rho )\cos \theta $ Primary y-coma
10 3 −3 ${\rho ^3}\cos 3\theta $
11 4 4 ${\rho ^4}\sin 4\theta $
12 4 2 $(4{\rho ^4} - 3{\rho ^2})\sin 2\theta $ 0° Secondary astigmatism
13 4 0 $6{\rho ^4} - 6{\rho ^2} + 1$ Primary spherical aberration
14 4 −2 $(4{\rho ^4} - 3{\rho ^2})\cos 2\theta $ 45° Secondary astigmatism
15 4 −4 ${\rho ^4}\cos 4\theta $
16 5 5 ${\rho ^5}\sin 5\theta $
17 5 3 $(5{\rho ^5} - 4{\rho ^3})\sin 3\theta $
18 5 1 $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\sin \theta $ Secondary x-coma
19 5 −1 $(10{\rho ^5} - 12{\rho ^3} + 3\rho )\cos \theta $ Secondary y-coma
20 5 −3 $(5{\rho ^5} - 4{\rho ^3})\cos 3\theta $
21 5 −5 ${\rho ^5}\cos 5\theta $
22 6 6 ${\rho ^6}\sin 6\theta $
23 6 4 $(6{\rho ^6} - 5{\rho ^4})\sin 4\theta $
24 6 2 $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\sin 2\theta $ 0° Tertiary astigmatism
25 6 0 $20{\rho ^6} - 30{\rho ^4} + 12{\rho ^2} - 1$ Secondary spherical
26 6 −2 $(15{\rho ^6} - 20{\rho ^4} + 6{\rho ^2})\cos 2\theta $ 45° Tertiary astigmatism
27 6 −4 $(6{\rho ^6} - 5{\rho ^4})\cos 4\theta $
28 6 −6 ${\rho ^6}\cos 6\theta $
29 7 7 ${\rho ^7}\sin 7\theta $
30 7 5 $(7{\rho ^7} - 6{\rho ^5})\sin 5\theta $
31 7 3 $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\sin 3\theta $
32 7 1 $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\sin \theta $ Tertiary x-coma
33 7 −1 $(35{\rho ^7} - 60{\rho ^5} + 30{\rho ^3} - 4\rho )\cos \theta $ Tertiary y-coma
34 7 −3 $(21{\rho ^7} - 30{\rho ^5} + 10{\rho ^3})\cos 3\theta $
35 7 −5 $(7{\rho ^7} - 6{\rho ^5})\cos 5\theta $
36 7 −7 ${\rho ^7}\cos 7\theta $
37 8 8 ${\rho ^8}\sin 8\theta $
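The sine/cosine swap between tables 6 and 7 follows from the coordinate convention alone. A minimal sketch, with hypothetical helper names, of the two polar-angle conventions: in the Malacara convention the angle is measured clockwise from the +y axis, so that x = ρ sin θ and y = ρ cos θ.

```python
import numpy as np

def standard_polar(x, y):
    """Polar coordinates with theta measured anticlockwise from the +x axis."""
    return np.hypot(x, y), np.arctan2(y, x)

def malacara_polar(x, y):
    """Polar coordinates with theta measured clockwise from the +y axis."""
    return np.hypot(x, y), np.arctan2(x, y)

x, y = 0.3, 0.4                       # arbitrary sample point inside the unit circle
rho, theta_std = standard_polar(x, y)
_, theta_mal = malacara_polar(x, y)

x_tilt_malacara = rho * np.sin(theta_mal)   # table 7, j = 2
x_tilt_standard = rho * np.cos(theta_std)   # table 6, j = 2
```

Both tilt expressions evaluate to the same Cartesian coordinate x, and for this first-quadrant point the two angles are related by θ_Malacara = π/2 − θ_standard.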

2.1.7. Comparisons.

The different indexing schemes of Zernike circle polynomials are compared and summarized in table 8 from the perspectives of coordinate system, normalization, and ordering strategy. In particular, the sorting of the polynomials under different indices is shown in figure 10 for the first few degrees. An illustration of the sources and applications of the six indices is presented in figure 11. For convenience, the Noll indices will be used in the remaining part of the article unless otherwise stated.

Figure 10. Comparison of the low-order sequences of different indices.

Figure 11. Summary of the different indexing schemes and their sources. ZPs: Zernike polynomials.

Table 8. Comparison of different indices for Zernike circle polynomials in the real domain.

 | Noll | OSA/ANSI | Fringe/U of Arizona | ISO-14999 | Born and Wolf | Malacara
Sources | Noll, 1976 [9]; Zemax Standard ZPs [25]; Optical Shop Testing (3rd ed), 2007 [36] | OSA VSIA Standards Taskforce, 2000 [26]; ANSI Z80.28 [13, 14]; ISO 24157 [15, 17]; COMSOL Ray Optics Module [38] | Loomis, 1970s [10, 49]; Wyant, 1992 [19]; Zemax Fringe ZPs a [25]; CODE V Fringe ZPs a [29]; OSLO b [44, 45]; MetroPro [46] | ISO/TR 14999–2, 2005 [16, 28]; CODE V Extended Fringe ZPs a [29] | Bhatia and Wolf, 1954 [3]; Born and Wolf, 1964 [50]; CODE V Standard ZPs [29] | Optical Shop Testing (1st & 2nd eds), 1978 and 1992 [30, 31]
Domain | Real | Real | Real | Real | Real | Real
Coordinate system | 0 ⩽ ρ ⩽ 1, θ from +x axis anticlockwise | 0 ⩽ ρ ⩽ 1, θ from +x axis anticlockwise | 0 ⩽ ρ ⩽ 1, θ from +x axis anticlockwise | 0 ⩽ ρ ⩽ 1, θ from +x axis anticlockwise | 0 ⩽ ρ ⩽ 1, θ from +x axis anticlockwise | 0 ⩽ ρ ⩽ 1, θ from +y axis clockwise
Definition | $\begin{array}{l} {Z_j}(\rho ,\theta ) = Z_n^m(\rho ,\theta ) \\ = \left\{ {\begin{array}{l} N_n^mR_n^m\cos m\theta ,\;m \ne 0, j {\text{ is even}} \\ N_n^mR_n^m\sin m\theta ,\;m \ne 0, j {\text{ is odd}} \\ N_n^mR_n^m,\;m = 0 \\ \end{array}} \right. \\ \end{array} $ | $\begin{array}{l} {Z_j}(\rho ,\theta ) = Z_n^m(\rho ,\theta ) \\ = \left\{ {\begin{array}{l} N_n^mR_n^{|m|}(\rho )\cos (m\theta ),\;\;m \geqslant 0 \\ N_n^mR_n^{|m|}(\rho )\sin (\left| m \right|\theta ),\;m < 0 \\ \end{array}} \right. \\ \end{array} $ | $\begin{array}{l} {Z_j}(\rho ,\theta ) = Z_n^m(\rho ,\theta ) \\ = \left\{ {\begin{array}{l} R_n^{|m|}(\rho )\cos (m\theta ),\;\;m \geqslant 0 \\ R_n^{|m|}(\rho )\sin (\left| m \right|\theta ),\;m < 0 \\ \end{array}} \right. \\ \end{array} $ | $\begin{array}{l} {Z_j}(\rho ,\theta ) = Z_n^m(\rho ,\theta ) \\ = \left\{ {\begin{array}{l} R_n^{|m|}(\rho )\cos (m\theta ),\;\;m \geqslant 0 \\ R_n^{|m|}(\rho )\sin (\left| m \right|\theta ),\;m < 0 \\ \end{array}} \right. \\ \end{array} $ | $\begin{array}{l} {Z_j}(\rho ,\theta ) = Z_n^m(\rho ,\theta ) \\ = \left\{ {\begin{array}{l} R_n^{|m|}(\rho )\cos (m\theta ),\;\;m \geqslant 0 \\ R_n^{|m|}(\rho )\sin (\left| m \right|\theta ),\;m < 0 \\ \end{array}} \right. \\ \end{array} $ | $\begin{array}{l} {Z_j}(\rho ,\theta ) = Z_n^m(\rho ,\theta ) \\ = \left\{ {\begin{array}{l} R_n^{|m|}(\rho )\sin (m\theta ),\;\;m > 0 \\ R_n^{|m|}(\rho )\cos (m\theta ),\;m \leqslant 0 \\ \end{array}} \right. \\ \end{array} $
Radial polynomial | $R_n^m = \sum\limits_{s = 0}^{(n - m)/2} {\frac{{{{( - 1)}^s}(n - s)!}}{{s!\left( {\genfrac{}{}{.3pt}{3}{{n + m}}{{\text{2}}} - s} \right)!\left( {\genfrac{}{}{.3pt}{3}{{n - m}}{{\text{2}}} - s} \right)!}}} {\rho ^{n - 2s}}$ | $R_n^{|m|} = \sum\limits_{s = 0}^{(n - \left| m \right|)/2} {\frac{{{{( - 1)}^s}(n - s)!}}{{s!\left( {\genfrac{}{}{.3pt}{3}{{n + \left| m \right|}}{{\text{2}}} - s} \right)!\left( {\genfrac{}{}{.3pt}{3}{{n - \left| m \right|}}{{\text{2}}} - s} \right)!}}} {\rho ^{n - 2s}}$ | same as OSA/ANSI | same as OSA/ANSI | same as OSA/ANSI | same as OSA/ANSI
Normalization | Yes | Yes | No | No | No | No
Normalization factor | $N_n^m = \sqrt {\frac{{2(n + 1)}}{{(1 + {\delta _{m0}})}}} $ | $N_n^m = \sqrt {\frac{{2(n + 1)}}{{(1 + {\delta _{m0}})}}} $ | – | – | – | –
Term number | Infinite | Infinite | 37 | Infinite | Infinite | Infinite
Double indices | n: non-negative integer; m: non-negative integer; n − m ⩾ 0 and even | n: non-negative integer; m: integer; n − |m| ⩾ 0 and even | n: non-negative integer; m: integer; n − |m| ⩾ 0 and even; N = (n + |m|)/2 | n: non-negative integer; m: integer; n − |m| ⩾ 0 and even; N = n + |m| | n: non-negative integer; m: integer; n − |m| ⩾ 0 and even | n: non-negative integer; m: integer; n − |m| ⩾ 0 and even
Single index | j = 1, 2, 3, ..., ∞ | j = 0, 1, 2, ..., ∞ | j = 0, 1, 2, ..., 36 | j = 0, 1, 2, ..., ∞ | j = 1, 2, 3, ..., ∞ | j = 1, 2, 3, ..., ∞
Indexing scheme | n from 0 to ∞; m in ascending order | n from 0 to ∞; m in ascending order | N from 0 to 6; n in ascending order; m in descending order | N from 0 to ∞; n in ascending order; m in descending order | n from 0 to ∞; m in descending order | n from 0 to ∞; m in descending order
Major applications | General optics | Ophthalmic optics | Optical metrology | Optical metrology | Optical design | Rarely used

a In CODE V and Zemax, the single index j for the fringe Zernike polynomials (ZPs) starts from 1.
b In OSLO, the polar angle θ is measured from the +y axis [44].

2.2. Mathematical properties

In this section, we review major mathematical properties of Zernike circle polynomials, including orthogonality, symmetry, Fourier transform, integral representation of radial polynomials, derivatives, and recurrence relations. For more properties, one can refer to [51, 52].

2.2.1. Orthogonality.

The orthogonal relationships of real and complex Zernike circle polynomials have been presented and can be found in equations (7), (12), (15) and (17). Moreover, the radial and azimuthal functions of Zernike circle polynomials are also orthogonal and satisfy the following relationships [36]:

Equation (30)

Equation (31)

Note that the Noll indices are used here.
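The radial orthogonality relation, in the form tabulated in table 9 (the integral of $R_n^m R_{n'}^m \rho$ over [0, 1] equals $\delta_{nn'}/[2(n + 1)]$), can be checked exactly with polynomial arithmetic. The sketch below is illustrative only; the helper names are hypothetical and NumPy's `Polynomial` class is used for the integration.

```python
import math
from numpy.polynomial import Polynomial

def radial_poly(n, m):
    """R_n^m as a Polynomial in rho (explicit sum; assumes 0 <= m <= n and n - m even)."""
    coef = [0.0] * (n + 1)
    for s in range((n - m) // 2 + 1):
        coef[n - 2 * s] = ((-1) ** s * math.factorial(n - s)
                           / (math.factorial(s)
                              * math.factorial((n + m) // 2 - s)
                              * math.factorial((n - m) // 2 - s)))
    return Polynomial(coef)

def radial_inner(n1, n2, m):
    """Integral over [0, 1] of R_n1^m(rho) R_n2^m(rho) rho d(rho)."""
    integrand = radial_poly(n1, m) * radial_poly(n2, m) * Polynomial([0.0, 1.0])
    antiderivative = integrand.integ()
    return antiderivative(1.0) - antiderivative(0.0)
```

With these helpers, `radial_inner(2, 4, 0)` vanishes to rounding error, while `radial_inner(2, 2, 0)` returns 1/6 = 1/[2(2 + 1)].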

2.2.2. Symmetry.

The symmetry of Zernike circle polynomials can be expressed as:

Equation (32)

2.2.3. The Fourier transform.

Define the Fourier transform pair as:

Equation (33)

Equation (34)

where ${\mathcal{Z}_j}$ is the Fourier transform of Zj and (u, v) are Cartesian coordinates in the frequency domain. Using (r, φ) to denote the polar coordinates in the frequency domain and applying the transformation relationships x = ρcosθ, y = ρsinθ, u = rcosφ, v = rsinφ, the Fourier transform of Zernike circle polynomials can be written as [9, 53]:

Equation (35)

where Jn (x) is the nth-order Bessel function of the first kind and is defined as [54]:

Equation (36)

The Fourier transform of Zernike circle polynomials is useful for the conversion between Zernike coefficients and Fourier series coefficients of a wavefront [53].

2.2.4. Integral representation of radial polynomials.

Substituting equation (35) into the inverse Fourier transform of Zernike circle polynomials (equation (34)), an integral representation of Zernike radial polynomials can be obtained as [2, 6]:

Equation (37)

This integral representation is useful for deriving a recurrence relation of the derivatives of Zernike radial polynomials [9].

2.2.5. Derivatives.

The integral representation for the radial polynomials provides a good starting point for calculating derivatives. The derivatives of radial polynomials can be written in a recursion relation as [9]:

Equation (38)

In polar coordinate system, the partial derivatives of Zernike circle polynomials under the Noll indices with respect to x and y can be written as [18, 55]:

Equation (39)

where the normalization factor and the azimuthal function are:

Equation (40)

Equation (41)

In Cartesian coordinate system, the partial derivatives of Zernike circle polynomials under the OSA/ANSI indices with respect to x and y can be written as [14]:

Equation (42)

where n' increases with a step of 2 in the summations and, when (|m| + 1) is larger than (n − 1), the first summation term does not exist. The Cartesian derivatives of Zernike circle polynomials can also be obtained using recurrence relations, as reported in [55–57]. The derivatives of Zernike circle polynomials are useful for certain problems, such as ray tracing in optical design [58] and wavefront reconstruction in wavefront sensing [22, 24].
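The recursion for the radial derivatives can be exercised numerically. The sketch below assumes the form listed in table 9, $\mathrm{d}R_n^{|m|}/\mathrm{d}\rho = n[R_{n-1}^{|m|-1} + R_{n-1}^{|m|+1}] + \mathrm{d}R_{n-2}^{|m|}/\mathrm{d}\rho$, together with the symmetry $R_n^{-m} = R_n^m$ and the convention that $R_n^m \equiv 0$ when n < |m|. The function names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import Polynomial

def radial_poly(n, m):
    """R_n^{|m|} as a Polynomial in rho; identically zero outside the valid index range."""
    m = abs(m)
    if n < 0 or m > n or (n - m) % 2:
        return Polynomial([0.0])
    coef = [0.0] * (n + 1)
    for s in range((n - m) // 2 + 1):
        coef[n - 2 * s] = ((-1) ** s * math.factorial(n - s)
                           / (math.factorial(s)
                              * math.factorial((n + m) // 2 - s)
                              * math.factorial((n - m) // 2 - s)))
    return Polynomial(coef)

def radial_deriv(n, m):
    """dR_n^{|m|}/d(rho) built from the recursion instead of direct differentiation."""
    m = abs(m)
    if n < 0 or m > n or (n - m) % 2:
        return Polynomial([0.0])
    return (n * (radial_poly(n - 1, m - 1) + radial_poly(n - 1, m + 1))
            + radial_deriv(n - 2, m))

# Compare with term-by-term differentiation for one case, R_4^2 = 4 rho^4 - 3 rho^2
rho = np.linspace(0.0, 1.0, 11)
recursive = radial_deriv(4, 2)(rho)
direct = radial_poly(4, 2).deriv()(rho)
```

Here `recursive` and `direct` both sample 16ρ³ − 6ρ, which provides a quick consistency check of the recursion.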

2.2.6. Recurrence relations.

Computation of high-order Zernike circle polynomials is necessary for some applications, such as Zernike moments-based image analysis. Although the radial polynomials are explicitly formulated (equation (19)), direct numerical computation suffers from low computational efficiency and possible cancellation errors [59–61]. To deal with these problems, various recurrence relations have been proposed for evaluating the radial polynomials [60–64]. Here we briefly review four widely used recurrence methods: the modified Kintner method [59, 65], the Prata method [66], the q-recursive method [59], and the Shakibaei and Paramesran method [62].

The modified Kintner method was first proposed by Kintner in 1976 [65] and improved by Chong et al in 2003 [59] by adding recurrence relations for the special cases n − m = 0 and n − m = 2. The improved recurrence relation can be expressed as:

Equation (43)

where n and m are non-negative integers that satisfy n − m ⩾ 0 with n − m even; the coefficients are given by:

Equation (44)

The modified Kintner method is a degree-varying (n-varying) approach that computes radial polynomials at higher order from those at lower order for a fixed value of m.
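A minimal sketch of an n-varying recursion of this kind is given below. The coefficients k1 to k4 and the two starting polynomials ($R_m^m = \rho^m$ and $R_{m+2}^m = (m + 2)\rho^{m+2} - (m + 1)\rho^m$) are written in the form commonly quoted for the modified Kintner method; since the equations themselves are not reproduced in this text, they should be read as an assumption, and the sketch therefore includes the explicit factorial sum as an independent reference. Function names are hypothetical.

```python
import math
import numpy as np

def radial_explicit(n, m, rho):
    """Reference: explicit factorial sum for R_n^m (0 <= m <= n, n - m even)."""
    rho = np.asarray(rho, dtype=float)
    total = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * math.factorial(n - s)
             / (math.factorial(s)
                * math.factorial((n + m) // 2 - s)
                * math.factorial((n - m) // 2 - s)))
        total = total + c * rho ** (n - 2 * s)
    return total

def radial_kintner(n, m, rho):
    """R_n^m by stepping the degree upwards in steps of 2 at fixed m."""
    rho = np.asarray(rho, dtype=float)
    r_lower = rho ** m                                       # special case n - m = 0
    if n == m:
        return r_lower
    r_upper = (m + 2) * rho ** (m + 2) - (m + 1) * rho ** m  # special case n - m = 2
    for k in range(m + 4, n + 1, 2):
        k1 = (k + m) * (k - m) * (k - 2) / 2.0
        k2 = 2.0 * k * (k - 1) * (k - 2)
        k3 = -m * m * (k - 1) - k * (k - 1) * (k - 2)
        k4 = -k * (k + m - 2) * (k - m - 2) / 2.0
        r_lower, r_upper = r_upper, ((k2 * rho**2 + k3) * r_upper + k4 * r_lower) / k1
    return r_upper

rho = np.linspace(0.0, 1.0, 21)
r60 = radial_kintner(6, 0, rho)    # expected to follow 20 rho^6 - 30 rho^4 + 12 rho^2 - 1
```

Each polynomial costs a fixed number of multiplications per step, independent of the degree, which is the efficiency argument for recurrences over the factorial sum.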

The Prata method was proposed by Prata and Rusch in 1989 [66] and the recurrence relation can be written as:

Equation (45)

where,

Equation (46)

The q-recursive method was proposed by Chong et al [59] and the three-term recurrence relation can be written as:

Equation (47)

where the coefficients are given by:

Equation (48)

Different from the modified Kintner method and the Prata method, the q-recursive method is an m-varying method that computes radial polynomials at lower m from those at higher m for a fixed radial order n.

The Shakibaei and Paramesran method uses a particularly simple recursion, in which a radial polynomial is expressed as a linear combination of three earlier computed radial polynomials as [62]:

Equation (49)

The recursion can be initialized with the conditions $R_0^0(\rho ) = 1$ and $R_n^m(\rho ) \equiv 0$ when n < m. According to [67], the recursion outperforms the Prata method and the q-recursive method in both speed and accuracy in an image processing setting.
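The following sketch illustrates a three-term recursion of this type. It assumes the relation $R_n^m(\rho) = \rho[R_{n-1}^{|m-1|}(\rho) + R_{n-1}^{m+1}(\rho)] - R_{n-2}^m(\rho)$, which is the form usually attributed to Shakibaei and Paramesran; because the equation is not reproduced in this text, this form is an assumption and is checked against the explicit factorial sum. The names `radial_sp` and `radial_explicit` are hypothetical.

```python
import math
from functools import lru_cache
import numpy as np
from numpy.polynomial import Polynomial

RHO = Polynomial([0.0, 1.0])   # the monomial rho

@lru_cache(maxsize=None)
def radial_sp(n, m):
    """R_n^m (m >= 0) as a Polynomial, built only from previously computed terms."""
    if n < m:                      # initial condition: R_n^m = 0 for n < m
        return Polynomial([0.0])
    if n == 0:                     # initial condition: R_0^0 = 1
        return Polynomial([1.0])
    return (RHO * (radial_sp(n - 1, abs(m - 1)) + radial_sp(n - 1, m + 1))
            - radial_sp(n - 2, m))

def radial_explicit(n, m):
    """Reference: explicit factorial sum (0 <= m <= n, n - m even)."""
    coef = [0.0] * (n + 1)
    for s in range((n - m) // 2 + 1):
        coef[n - 2 * s] = ((-1) ** s * math.factorial(n - s)
                           / (math.factorial(s)
                              * math.factorial((n + m) // 2 - s)
                              * math.factorial((n - m) // 2 - s)))
    return Polynomial(coef)

rho = np.linspace(0.0, 1.0, 21)
r31 = radial_sp(3, 1)(rho)         # expected to follow 3 rho^3 - 2 rho
```

The cache stores each (n, m) pair once, so computing a full set of radial polynomials up to a given degree reuses all lower-order results.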

2.2.7. Summary.

The mathematical properties of Zernike circle polynomials are summarized in table 9.

Table 9. Properties of Zernike circle polynomials and their radial polynomials.

Properties Radial polynomials Zernike circle polynomials
Commutativity a $\left\langle {R_n^m,R_{{n}{^{\prime}}}^{{m}{^{\prime}}}} \right\rangle = \int_0^1 {R_n^mR_{{n}{^{\prime}}}^{{m}{^{\prime}}}{\text{d}}\rho } = \left\langle {R_{{n}{^{\prime}}}^{{m}{^{\prime}}},R_n^m} \right\rangle $ $\left\langle {{Z_i},{Z_j}} \right\rangle = \int_{\text{0}}^{{\text{2}}\pi } {\int_{\text{0}}^{\text{1}} {{Z_i}{Z_j}\rho {\text{d}}\rho {\text{d}}\theta } } = \left\langle {{Z_j},{Z_i}} \right\rangle $
Homogeneity b $\left\langle {cR_n^m,R_{{n}{^{\prime}}}^{{m}{^{\prime}}}} \right\rangle = \int_0^1 {cR_n^mR_{{n}{^{\prime}}}^{{m}{^{\prime}}}{\text{d}}\rho } = c\left\langle {R_n^m,R_{{n}{^{\prime}}}^{{m}{^{\prime}}}} \right\rangle $ $\left\langle {c{Z_i},{Z_j}} \right\rangle = \int_{\text{0}}^{{\text{2}}\pi } {\int_{\text{0}}^{\text{1}} {c{Z_i}{Z_j}\rho {\text{d}}\rho {\text{d}}\theta } } = c\left\langle {{Z_i},{Z_j}} \right\rangle $
Distributivity $\langle R_n^m,R_{{n^\prime }}^{{m^\prime }} + R_{n''}^{m''}\rangle = \langle R_n^m,R_{{n^\prime }}^{{m^\prime }}\rangle + \langle R_n^m,R_{n''}^{m''}\rangle $ $\left\langle {{Z_i},{Z_j} + {Z_k}} \right\rangle = \left\langle {{Z_i},{Z_j}} \right\rangle + \left\langle {{Z_i},{Z_k}} \right\rangle $
Zero mean value ${\overline Z _j} = \int_\Sigma {{Z_j}(\rho ,\theta )\rho {\text{d}}\rho {\text{d}}\theta } = 0, \;j \ne 0$
Orthogonality $\int_0^1 {R_n^m\left( \rho \right)} R_{{n}{^{\prime}}}^m\left( \rho \right)\rho {\text{d}}\rho = \frac{1}{{2\left( {n + 1} \right)}}{\delta _{{nn}{^{\prime}}}}$ $\begin{array}{l} {\text{Non-normalized real ZPs:}} \\[4pt] \;\;\frac{1}{{{\pi }}}\int_0^{2{{\pi }}} {\int_0^1 {{Z_j}} } {Z_{{j}{^{\prime}}}}(\rho )\rho {\text{d}}\rho {\text{d}}\theta = \frac{{1 + {\delta _{m0}}}}{{2(n + 1)}}{\delta _{jj^{\prime}}} \\[4pt] {\text{Normalized real ZPs:}} \\[4pt] \;\;\frac{1}{{{\pi }}}\int_0^{2{{\pi }}} {\int_0^1 {{Z_j}} } {Z_{{j}{^{\prime}}}}(\rho )\rho {\text{d}}\rho {\text{d}}\theta = {\delta _{jj^{\prime}}} \\[4pt] {\text{Non - normalized complex ZPs:}} \\[4pt] \;\;\frac{1}{{{\pi }}}\int_0^{2{{\pi }}} {\int_0^1 {{{[V_n^m(\rho )]}^ * }} } V_{{n}{^{\prime}}}^{{m}{^{\prime}}}(\rho )\rho {\text{d}}\rho {\text{d}}\theta = \frac{1}{{n + 1}}{\delta _{mm^{\prime}}}{\delta _{{nn}{^{\prime}}}} \\[4pt] {\text{Normalized complex ZPs:}} \\[4pt] \;\;\frac{1}{{{\pi }}}\int_0^{2{{\pi }}} {\int_0^1 {{{[V_n^m(\rho )]}^ * }} } V_{{n}{^{\prime}}}^{{m}{^{\prime}}}(\rho )\rho {\text{d}}\rho {\text{d}}\theta = {\delta _{mm^{\prime}}}{\delta _{{nn}{^{\prime}}}} \\[4pt] \end{array} $
Symmetry $R_n^m(\rho ) = R_n^{ - m}(\rho )$ $Z_n^m(\rho ,\theta ) = {( - 1)^m}Z_n^m(\rho ,\theta + {{\pi }})$
Fourier transform $\begin{aligned} & {{\mathcal{Z}}_j}(k,\phi ) \\ & = \left\{ {\begin{array}{*{20}{l}} {\sqrt {2(n + 1)} {{( - 1)}^{n/2 - m}}\frac{{{J_{n + 1}}(2{{\pi }}k)}}{k}\cos (m\phi ),}&{m \ne 0,\;j {\text{ is }} {\text{even}}} \\[6pt] {\sqrt {2(n + 1)} {{( - 1)}^{n/2 - m}}\frac{{{J_{n + 1}}(2{{\pi }}k)}}{k}\sin (m\phi ),}&{m \ne 0,\;j {\text{ is }} {\text{odd}}} \\[6pt] {\sqrt {(n + 1)} {{( - 1)}^{n/2}}\frac{{{J_{n + 1}}(2{{\pi }}k)}}{k},}&{m = 0} \end{array}} \right. \\ \end{aligned}$
Integral representation $R_n^m\left( \rho \right) = 2{{\pi }}{\left( { - 1} \right)^{\frac{{n - m}}{2}}}\int_0^\infty {{J_{n + 1}}\left( {2{{\pi }}k} \right){J_m}\left( {2{{\pi }}k\rho } \right)} {\text{ d}}k$
Derivative $\frac{{{\text{d}}R_n^{|m|}(\rho )}}{{{\text{d}}\rho }} = n\left[R_{n - 1}^{|m| - 1}(\rho ) + R_{n - 1}^{|m| + 1}(\rho )\right] + \frac{{{\text{d}}R_{n - 2}^{|m|}(\rho )}}{{{\text{d}}\rho }}$ $\begin{gathered} \frac{{\partial {Z_j}(\rho ,\theta )}}{{\partial x}} = \left[ {\frac{{\partial R_n^m(\rho )}}{{\partial \rho }}{\Theta ^m}(\theta )\cos \theta - \frac{{R_n^m(\rho )}}{\rho }\frac{{\partial {\Theta ^m}(\theta )}}{{\partial \theta }}\sin \theta } \right] \\ \frac{{\partial {Z_j}(\rho ,\theta )}}{{\partial y}} = \left[ {\frac{{\partial R_n^m(\rho )}}{{\partial \rho }}{\Theta ^m}(\theta )\sin \theta + \frac{{R_n^m(\rho )}}{\rho }\frac{{\partial {\Theta ^m}(\theta )}}{{\partial \theta }}\cos \theta } \right] \\ \end{gathered} $
Recurrence relation $\begin{gathered} R_n^m = \frac{1}{{{k_1}}}\left[ {({k_2}{\rho ^2} + {k_3})R_{n - 2}^m(\rho ) + {k_4}R_{n - 4}^m(\rho )} \right]\; \\ R_n^m(\rho ) = {k_1}\rho R_{n - 1}^{\left| {m - 1} \right|}(\rho ) + {k_2}R_{n - 2}^m(\rho )\quad\qquad\;\; \\ R_n^m(\rho ) = {k_1}R_n^{m + 4}(\rho ) + \left( {{k_2} + \frac{{{k_3}}}{{{\rho ^2}}}} \right)R_n^{m + 2}(\rho )\; \\ \end{gathered} $

a Angle brackets denote the inner product of two functions. b c is a constant.

2.3. Wavefront fitting

2.3.1. Mathematical formulation.

A wavefront function W defined over a unit circle can be represented by the linear combination of finite terms of Zernike circle polynomials as [31, 34, 68, 69]:

Equation (50)

where R is the radius of the pupil, 0 ⩽ ρ ⩽ 1, J is the maximum number of polynomial terms, aj are the expansion coefficients, and Zj is the jth Zernike circle polynomial. The equation can be equivalently expressed in Cartesian coordinates as:

Equation (51)

Written in discrete and matrix forms, equation (51) becomes:

Equation (52)

where,

Equation (53)

where K is the total number of data points within the unit circle. Generally, equation (52) is an overdetermined linear system, where there are more equations (K) than unknowns (J). It can be written into the normal equation [34, 68]:

Equation (54)

where the superscript T denotes matrix transpose. The solution can be obtained by matrix inversion as:

Equation (55)

Figure 12 shows an example illustrating circular wavefront decomposition using orthonormal 37-term Zernike circle polynomials under the Noll indices. The amplitude of each coefficient indicates the strength of corresponding aberrations (table 2).
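The least-squares fit of equations (52)–(55) can be sketched as follows. The example uses only the first six orthonormal terms in Noll order, written in Cartesian form, and a synthetic noise-free wavefront with made-up coefficients; the function name `noll_basis` and all numerical values are assumptions for illustration. The normal-equation solution is shown next to `numpy.linalg.lstsq`, which solves the same problem without forming $Z^{\text{T}}Z$ explicitly.

```python
import numpy as np

def noll_basis(x, y):
    """First six orthonormal Zernike terms (Noll order) sampled at points (x, y)."""
    r2 = x**2 + y**2
    return np.column_stack([
        np.ones_like(x),                   # Z1: piston
        2.0 * x,                           # Z2 = 2 rho cos(theta)
        2.0 * y,                           # Z3 = 2 rho sin(theta)
        np.sqrt(3.0) * (2.0 * r2 - 1.0),   # Z4: defocus
        np.sqrt(6.0) * 2.0 * x * y,        # Z5 = sqrt(6) rho^2 sin(2 theta)
        np.sqrt(6.0) * (x**2 - y**2),      # Z6 = sqrt(6) rho^2 cos(2 theta)
    ])

# K sample points on a square grid, restricted to the unit circle
grid = np.linspace(-1.0, 1.0, 41)
X, Y = np.meshgrid(grid, grid)
inside = X**2 + Y**2 <= 1.0
x, y = X[inside], Y[inside]

a_true = np.array([0.10, -0.25, 0.40, 0.30, -0.15, 0.05])   # made-up coefficients
Z = noll_basis(x, y)               # K x J matrix of sampled polynomials
W = Z @ a_true                     # synthetic wavefront samples

a_normal = np.linalg.solve(Z.T @ Z, Z.T @ W)       # normal equations
a_lstsq, *_ = np.linalg.lstsq(Z, W, rcond=None)    # orthogonalization-based solver
```

For well-sampled low-order fits the two solutions agree to rounding error; for many terms or sparse sampling, forming $Z^{\text{T}}Z$ squares the condition number, which is why a dedicated least-squares solver is often preferred in practice.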

Figure 12. Wavefront decomposition using orthonormal Zernike circle polynomials under the Noll indices: (a) wavefront and (b) 37 expansion coefficients.

Zernike-based wavefront fitting has several useful properties [18]. First, truncating the expansion of a wavefront does not change the expansion coefficients. In other words, the expansion coefficients are independent of each other:

Equation (56)

Second, all Zernike terms except the piston term have a mean value of zero and, therefore, the mean value of a wavefront equals the piston coefficient, i.e.:

Equation (57)

Third, the wavefront variance equals the sum of the squares of the expansion coefficients, excluding the piston coefficient, i.e.:

Equation (58)

The properties of Zernike-based wavefront fitting are summarized in table 10.

Table 10. Properties of Zernike-based wavefront fitting [18, 35, 36].

Property Formula
Coefficients independence ${a_j} = \frac{{\text{1}}}{{{\pi }}}\int_{\text{0}}^{{\text{2}}\pi } {\int_{\text{0}}^{\text{1}} {W(\rho ,\theta ){Z_j}\rho {\text{d}}\rho {\text{d}}\theta } } $
Wavefront mean value $\overline W (\rho ,\theta ) = \frac{{\text{1}}}{{{\pi }}}\int_{\text{0}}^{{\text{2}}\pi } {\int_{\text{0}}^{\text{1}} {W(\rho ,\theta )\rho {\text{d}}\rho {\text{d}}\theta } } = {a_0}$
Wavefront mean square value $\overline {{W^{\,2}}} (\rho ,\theta ) = \frac{{\text{1}}}{{{\pi }}}\int_{\text{0}}^{{\text{2}}\pi } {\int_{\text{0}}^{\text{1}} {{W^{\,2}}(\rho ,\theta )\rho {\text{d}}\rho {\text{d}}\theta } } = \sum\limits_{j = 0}^\infty {a_j^2} $
Wavefront variance ${\sigma ^2} = \overline {{W^{\,2}}} - {(\overline W )^2} = \sum\limits_{j = 1}^\infty {a_j^2} $
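The mean and variance properties in table 10 can be verified numerically for a small orthonormal expansion. The sketch below uses Gauss-Legendre quadrature in ρ and a uniform rule in θ, which integrate this low-order example exactly up to rounding; the four coefficients are made-up values, and the first entry plays the role of the piston coefficient.

```python
import numpy as np

# Quadrature over the unit disk, including the 1/pi normalization of table 10
nodes, weights = np.polynomial.legendre.leggauss(20)
rho = 0.5 * (nodes + 1.0)                   # map [-1, 1] to [0, 1]
w_rho = 0.5 * weights
n_theta = 64
theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
R, T = np.meshgrid(rho, theta, indexing="ij")
quad = ((w_rho * rho)[:, None] * (2.0 * np.pi / n_theta) / np.pi) * np.ones_like(T)

# Orthonormal terms: piston, x-tilt, defocus, astigmatism
terms = [
    np.ones_like(R),
    2.0 * R * np.cos(T),
    np.sqrt(3.0) * (2.0 * R**2 - 1.0),
    np.sqrt(6.0) * R**2 * np.cos(2.0 * T),
]
a = np.array([0.3, -0.5, 0.2, 0.1])         # made-up expansion coefficients
W = sum(c * t for c, t in zip(a, terms))

mean = np.sum(quad * W)                     # should equal the piston coefficient
mean_square = np.sum(quad * W**2)           # should equal the sum of all a_j^2
variance = mean_square - mean**2            # should equal the sum without piston
```

The identities hold only for orthonormal polynomials; with unnormalized sets, each squared coefficient must be weighted by the norm of its polynomial.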

2.3.2. Transformation of Zernike coefficients with pupil translation, rotation, or resizing.

Zernike polynomials and their associated coefficients are commonly used to quantify the wavefront aberrations of the eye. When the aberrations of different eyes, pupil sizes, or corrections are compared or averaged, it is important that the Zernike coefficients have been calculated for the correct position, orientation, and size of the pupil. In this section, we discuss transformation relationships of Zernike expansion coefficients for translated, rotated, and resized pupils, which are shown in figure 13.

Figure 13. Coordinate transformations of a wavefront: (a) original wavefront, (b) translated wavefront by Δx and Δy, (c) rotated wavefront by an angle α, (d) resized wavefront from a larger pupil (radius: R1) to a smaller pupil (radius: R2).

2.3.2.1. Translation.

Translating a pupil (figure 13(b)) changes the expansion coefficients of a wavefront defined over it. Assuming that the displacements along the x and the y axis are Δx and Δy, respectively, the translated wavefront function can be expanded using the Taylor series [14, 70] as:

Equation (59)

New wavefront expansion coefficients $b_n^m$ can be obtained by computing the first-order derivatives of the Zernike circle polynomials in equation (59), which can be expressed as linear combinations of the untransformed Zernike circle polynomials (see section 2.2.5).

2.3.2.2. Rotation.

Rotating a pupil (figure 13(c)) similarly changes the expansion coefficients of a wavefront defined over it, which should be taken into consideration for applications such as vision correction surgery [71]. For a wavefront counterclockwise rotated with respect to its original coordinate system by an angle α, transformed Zernike expansion coefficients $b_n^m$ can be derived from original expansion coefficients $a_n^m$ as [18]:

Equation (60)
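Because a rotation leaves the radial part and the normalization factor untouched, it only mixes the cosine and sine coefficients of the same n and |m|. The sketch below derives the mixing under one explicit convention, which is an assumption and may differ in sign from equation (60): the rotated wavefront is taken as W'(ρ, θ) = W(ρ, θ − α), with cos mθ attached to the +m coefficient and sin mθ to the −m coefficient. The function name `rotate_pair` is hypothetical.

```python
import numpy as np

def rotate_pair(a_cos, a_sin, m, alpha):
    """Coefficients of cos(m theta) and sin(m theta) after rotating the wavefront by alpha."""
    c, s = np.cos(m * alpha), np.sin(m * alpha)
    b_cos = a_cos * c - a_sin * s
    b_sin = a_cos * s + a_sin * c
    return b_cos, b_sin

# Check against a direct evaluation of the rotated angular function
m, alpha = 3, 0.4
a_cos, a_sin = 0.7, -0.2                     # made-up coefficients
theta = np.linspace(0.0, 2.0 * np.pi, 50)
rotated = a_cos * np.cos(m * (theta - alpha)) + a_sin * np.sin(m * (theta - alpha))
b_cos, b_sin = rotate_pair(a_cos, a_sin, m, alpha)
rebuilt = b_cos * np.cos(m * theta) + b_sin * np.sin(m * theta)
```

Terms with m = 0 are unchanged by rotation, and the quantity $a_{\cos}^2 + a_{\sin}^2$ of each pair is preserved.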
2.3.2.3. Resizing.

Comparison of Zernike expansion coefficients of wavefronts over different non-normalized pupils requires the same aperture size. Therefore, it is necessary to calculate expansion coefficients for an arbitrary pupil size based on the expansion coefficients of the full pupil. Many transformation relationships for pupil resizing have been developed [72–81] and two simpler methods were described by Dai [18, 82] and Janssen [79, 83].

Suppose there are two wavefronts, W1 and W2, defined over concentric pupils with radii of R1 and R2, respectively, as shown in figures 13(a) and (d). W2 is part of W1 and R2 ⩽ R1. The Zernike expansion of the wavefront W2 can be written as:

Equation (61)

where 0 ⩽ ρ ⩽ 1 and bj are the expansion coefficients in the OSA/ANSI indices for the wavefront W2. Defining a scale factor ε = R2/R1, the expansion can also be written as:

Equation (62)

where aj are the expansion coefficients for the wavefront W1. Combining equations (61) and (62), Dai gives [18, 82]:

Equation (63)

where nmax is the maximum radial degree of the Zernike circle polynomials and $G_n^i(\varepsilon )$ is a resizing factor, defined as [18]:

Equation (64)

where i ⩽ ⌊(nmax − n)/2⌋ and ⌊x⌋ denotes the floor function, which returns the greatest integer less than or equal to x. The results show that each transformed expansion coefficient $b_n^m$ is a linear combination of the untransformed coefficients $a_{n'}^m$ with n' = n, n + 2, ..., and that more untransformed coefficients are involved in the calculation of transformed coefficients of lower degree. Table 11 lists the expressions for the resizing factor $G_n^i(\varepsilon )$ and transformed expansion coefficients $b_n^m$ for nmax = 6.

Table 11. Resizing factor $G_n^i(\varepsilon )$ and transformed expansion coefficients $b_n^m$ for nmax = 6 [18].

nmax n i $G_n^i(\varepsilon )$ $b_n^m$
6 0 0 $1$ $G_0^0(\varepsilon )a_0^0 + G_0^1(\varepsilon )a_2^0 + G_0^2(\varepsilon )a_4^0 + G_0^3(\varepsilon )a_6^0$
6 0 1 $ - \sqrt 3 (1 - {\varepsilon ^2})$
6 0 2 $\sqrt 5 (1 - 3{\varepsilon ^2} + 2{\varepsilon ^4})$
6 0 3 $ - \sqrt 7 (1 - 6{\varepsilon ^2} + 10{\varepsilon ^4} - 5{\varepsilon ^6})$
6 1 0 $\varepsilon $ $G_1^0(\varepsilon )a_1^m + G_1^1(\varepsilon )a_3^m + G_1^2(\varepsilon )a_5^m$
6 1 1 $ - 2\sqrt 2 (1 - {\varepsilon ^2})\varepsilon $
6 1 2 $\sqrt 3 (3 - 8{\varepsilon ^2} + 5{\varepsilon ^4})\varepsilon $
6 2 0 ${\varepsilon ^2}$ $G_2^0(\varepsilon )a_2^m + G_2^1(\varepsilon )a_4^m + G_2^2(\varepsilon )a_6^m$
6 2 1 $ - \sqrt {15} (1 - {\varepsilon ^2}){\varepsilon ^2}$
6 2 2 $\sqrt {21} (2 - 5{\varepsilon ^2} + 3{\varepsilon ^4}){\varepsilon ^2}$
6 3 0 ${\varepsilon ^3}$ $G_3^0(\varepsilon )a_3^m + G_3^1(\varepsilon )a_5^m$
6 3 1 $ - 2\sqrt 6 (1 - {\varepsilon ^2}){\varepsilon ^3}$
6 4 0 ${\varepsilon ^4}$ $G_4^0(\varepsilon )a_4^m + G_4^1(\varepsilon )a_6^m$
6 4 1 $ - \sqrt {35} (1 - {\varepsilon ^2}){\varepsilon ^4}$
6 5 0 ${\varepsilon ^5}$ $G_5^0(\varepsilon )a_5^m$
6 6 0 ${\varepsilon ^6}$ $G_6^0(\varepsilon )a_6^m$
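The entries of table 11 can be generated programmatically. The closed form used below, $G_n^i(\varepsilon) = \varepsilon^n\sqrt{(n + 2i + 1)(n + 1)}\sum_{j=0}^{i}(-1)^{i+j}\frac{(n + i + j)!}{(n + j + 1)!\,(i - j)!\,j!}\varepsilon^{2j}$, is an assumption: it is the form commonly quoted for Dai's resizing factor with orthonormal polynomials, and it is not reproduced in this text, so the sketch should be judged by its agreement with the tabulated expressions. The function names `resize_factor` and `resize_coefficients` are hypothetical.

```python
import math

def resize_factor(n, i, eps):
    """Resizing factor G_n^i(eps) for orthonormal Zernike polynomials (assumed closed form)."""
    prefactor = eps**n * math.sqrt((n + 2 * i + 1) * (n + 1))
    total = 0.0
    for j in range(i + 1):
        total += ((-1) ** (i + j) * math.factorial(n + i + j)
                  / (math.factorial(n + j + 1)
                     * math.factorial(i - j)
                     * math.factorial(j))
                  * eps ** (2 * j))
    return prefactor * total

def resize_coefficients(a, m, eps, n_max):
    """Map {n: a_n^m} for one azimuthal frequency m to {n: b_n^m} on the smaller pupil."""
    b = {}
    for n in range(abs(m), n_max + 1, 2):
        b[n] = sum(resize_factor(n, i, eps) * a.get(n + 2 * i, 0.0)
                   for i in range((n_max - n) // 2 + 1))
    return b

eps = 2.0 / 3.0                                  # e.g. a 3 mm pupil reduced to 2 mm
a_m0 = {0: 0.1, 2: 0.2, 4: -0.3}                 # made-up coefficients for m = 0
b_m0 = resize_coefficients(a_m0, 0, eps, n_max=4)
```

Only coefficients with the same m and equal or higher radial degree contribute to each $b_n^m$, which is the structure visible in the last column of table 11.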

In addition to Dai's formula (equation (63)), a concise expression with an elegant proof is given by Janssen and Dirksen as [79, 83]:

Equation (65)

where n' = n, n + 2, ..., and $R_n^{n + 2} \equiv 0$. The Janssen and Dirksen expression is mathematically equivalent to the Dai's formula but has the advantages of simplicity and only involving the radial polynomials, Zernike which can provide better numerical stability for high radial degrees.

A simple numerical simulation is presented in figure 14 to demonstrate the idea of wavefront resizing. Figure 14(a) is the original wavefront defined over a 3 mm-radius pupil, and its first 37 expansion coefficients, $a_j$, under the OSA/ANSI indices are shown in figure 14(b). Figure 14(c) illustrates the Zernike expansion coefficients $b_j$ for the 2 mm-radius portion of the original wavefront based on the conversion relationship in equation (63). Figures 14(d) and (e) show the wavefront reconstructed from the transformed coefficients $b_j$ and the ground truth, respectively. The wavefront difference map is displayed in figure 14(f). The simulation suggests that equation (63) is effective for computing Zernike expansion coefficients over an arbitrary pupil size in wavefront resizing.

Figure 14. Wavefront resizing example. (a) and (b) Wavefront over a 3 mm-radius pupil and its first 37 expansion coefficients aj in the OSA/ANSI indices. (c) Transformed Zernike expansion coefficients bj for a 2 mm-radius portion of the original wavefront. (d) Reconstructed wavefront using bj . (e) True wavefront within the 2 mm-radius aperture. (f) Wavefront difference map between (d) and (e).

2.4. Relationships with Seidel aberrations and Strehl ratio

2.4.1. Relation with Seidel aberrations.

The performance of an optical system can be characterized by the deformation of the wavefront emerging from the exit pupil relative to its reference sphere, that is, wavefront aberration, as shown in figure 15. The wavefront aberration for a rotationally symmetric system can be expanded by a set of power series in the four variables (ρ, θ) (polar coordinates of the exit pupil) and (r, φ = 0) (polar coordinates of an image point on the image plane) as [19, 20, 84, 85]:

Equation (66)

Figure 15. Wavefront aberration of a rotationally symmetric optical system. Wa: aberrated wavefront. Wr: reference sphere. ${P_0}$ and ${P^{\prime}_0}$: object point and its Gaussian image point, respectively. (r, φ): polar coordinates of the Gaussian image point on the image plane. Since the optical system is rotationally symmetric, the coordinate system of the image plane can be chosen such that φ = 0.

where l is a non-negative integer describing the dependence of the given term upon the distance of the image point from the axis; n and m are two non-negative integers determining the type of aberration. The first two terms in equation (66) represent the transverse (W111) and the longitudinal (W020) focal shifts, respectively. The remaining aberration terms constrained by the relation l + n = 4 are called primary or Seidel aberrations, which include five monochromatic aberrations, namely, spherical aberration (W040), coma (W131), astigmatism (W222), field curvature (W220), and distortion (W311). These aberrations are sometimes called third-order aberrations when referring to ray aberration, which can be obtained as the derivative of wavefront aberration. For a fixed image point, r is a constant and can be absorbed into the coefficients. Assuming the relative aperture and the size of the field to be such that higher-order terms can be ignored, the expression of the wavefront aberration in equation (66) reduces to [19, 20]:

Equation (67)

Table 12 lists the first- and third-order aberrations.

Table 12. First- and third-order aberrations [84].

| l | n | m | Coefficient | Expression | Aberration |
|---|---|---|---|---|---|
| 1 | 1 | 1 | W111 | $r\rho \cos \theta $ | Transverse focal shift |
| 0 | 2 | 0 | W020 | ${\rho ^2}$ | Longitudinal focal shift |
| 2 | 2 | 2 | W222 | ${r^2}{\rho ^2}{\cos ^2}\theta $ | Astigmatism |
| 0 | 4 | 0 | W040 | ${\rho ^4}$ | Spherical aberration |
| 1 | 3 | 1 | W131 | $r{\rho ^3}\cos \theta $ | Coma |
| 2 | 2 | 0 | W220 | ${r^2}{\rho ^2}$ | Field curvature |
| 3 | 1 | 1 | W311 | ${r^3}\rho \cos \theta $ | Distortion |

The wavefront aberration for a rotationally symmetric system can also be expanded in a Zernike series instead of a power series. Assuming the first nine terms of Zernike circle polynomials are used for the expansion, the wavefront aberration can be written as [20]:

Equation (68)

It can be further rearranged as [20]:

Equation (69)

wherein the expressions for the coefficients and phase angles are tabulated in table 13. The expansion in equation (69) has a similar form to equation (67) indicating the coefficients of Seidel aberrations can be converted from Zernike expansion coefficients. However, one should keep in mind that without considering field dependence, the terms in equation (69) are not true Seidel aberrations. Wavefront measurement using an interferometer only provides data at a single field point. For this reason, field curvature looks like defocus and distortion like tilt. A set of wavefronts from different object points should be measured to determine the Seidel aberrations unambiguously from a Zernike expansion.

Table 13. Relationship between Zernike coefficients and Seidel aberrations [20].

| Aberration | Coefficient | Phase |
|---|---|---|
| Piston | ${a_p} = {a_0} - {a_3} + {a_8}$ | |
| Tilt | ${a_t} = \sqrt {{{({a_1} - 2{a_6})}^2} + {{({a_2} - 2{a_7})}^2}} $ | ${\phi _t} = \arctan \left[ {({a_2} - 2{a_7})/({a_1} - 2{a_6})} \right]$ |
| Defocus (a) | ${a_d} = 2{a_3} - 6{a_8} \pm \sqrt {a_4^2 + a_5^2} $ | |
| Astigmatism (b) | ${a_a} = \mp 2\sqrt {a_4^2 + a_5^2} $ | ${\phi _a} = \frac{1}{2}\arctan ({a_5}/{a_4})$ |
| Coma | ${a_c} = 3\sqrt {a_6^2 + a_7^2} $ | ${\phi _c} = \arctan ({a_7}/{a_6})$ |
| Spherical | ${a_s} = 6{a_8}$ | |

(a) The sign in the defocus coefficient is chosen to minimize the magnitude of the coefficient [20].
(b) The sign in the astigmatism coefficient is chosen to be opposite to the sign in the defocus coefficient [20].
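The conversion in table 13 is simple enough to express as a short routine. The following is a minimal sketch, assuming the nine coefficients $a_0$ to $a_8$ are supplied in the ordering used in equation (68). The function name is hypothetical, and `math.atan2` is used in place of the plain arctangent of the table so that the phase stays defined when a denominator is zero.

```python
import math

def zernike_to_seidel(a):
    """Map Zernike coefficients a[0..8] to the field-independent terms of table 13."""
    astig = math.hypot(a[4], a[5])
    base = 2 * a[3] - 6 * a[8]
    # Footnote (a): pick the sign that minimizes the magnitude of the defocus term.
    sign = 1 if abs(base + astig) <= abs(base - astig) else -1
    tilt_x, tilt_y = a[1] - 2 * a[6], a[2] - 2 * a[7]
    return {
        "piston": a[0] - a[3] + a[8],
        "tilt": (math.hypot(tilt_x, tilt_y), math.atan2(tilt_y, tilt_x)),
        "defocus": base + sign * astig,
        # Footnote (b): the astigmatism sign is opposite to the defocus sign.
        "astigmatism": (-sign * 2 * astig, 0.5 * math.atan2(a[5], a[4])),
        "coma": (3 * math.hypot(a[6], a[7]), math.atan2(a[7], a[6])),
        "spherical": 6 * a[8],
    }
```

As noted in the text, the result describes a single field point only, so these terms are not true Seidel aberrations unless several field points are combined.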

2.4.2. Relation with Strehl ratio.

The Strehl ratio is defined as the ratio of the intensity I at the Gaussian image point in the presence of aberration to the intensity I0 at the same point in the absence of aberration, as shown in figure 16. It is given by [19]:

Equation (70)

Figure 16. The Strehl ratio is the ratio of the central irradiance in an aberrated image to the central irradiance in an aberration-free image. This plot shows the irradiance distribution along the x axis normalized by its aberration-free value at the center. NA: numerical aperture.

where W is the wavefront aberration with respect to the best reference sphere, expressed in units of wavelength. The Strehl ratio is a good measure of image quality when an optical system is well corrected. For modest amounts of aberration, equation (70) can be approximated as [86, 87]:

Equation (71)

where σ² is the variance of the wavefront across the pupil and is defined as [19]:

Equation (72)

Under this approximation, the Strehl ratio depends only on the variance of the wavefront and decreases as the variance increases. The variance, in turn, can be characterized by the Zernike expansion coefficients.
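To make the link between expansion coefficients and the Strehl ratio concrete, the sketch below computes the wavefront variance from a list of orthonormal Zernike coefficients and evaluates three approximations that are commonly quoted in the literature. Several points are assumptions rather than statements about equation (71): the coefficients are taken to be orthonormal and given in waves with piston first (so the variance is the sum of squares of the non-piston terms), and no claim is made about which of the three forms equation (71) uses.

```python
import math

def wavefront_variance(coeffs):
    """Variance (waves^2) from orthonormal Zernike coefficients, piston first."""
    return sum(c * c for c in coeffs[1:])

def strehl_approximations(sigma_waves):
    """Three common small-aberration estimates of the Strehl ratio."""
    phase_var = (2 * math.pi * sigma_waves) ** 2  # phase variance in rad^2
    return {
        "marechal": (1 - phase_var / 2) ** 2,
        "quadratic": 1 - phase_var,
        "exponential": math.exp(-phase_var),
    }

# Hypothetical coefficients in waves: piston, then two aberration terms.
sigma = math.sqrt(wavefront_variance([0.3, 0.03, 0.04]))
estimates = strehl_approximations(sigma)
```

The piston term is excluded because it shifts the mean of the wavefront without changing its variance.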

2.5. Relationships with other functions

2.5.1. XY monomials.

XY monomials are products of powers of x and y and are defined in Cartesian coordinates as [88]:

Equation (73)

where n and m are non-negative integers and n ⩾ m. The XY monomials are also frequently used for representing wavefront aberrations, largely because they are a simple and complete set of basis functions. However, they have been less popular than Zernike polynomials, especially since the 1980s, because they are not orthogonal [88]. The conversion of wavefront expansion coefficients between XY monomials and Zernike polynomials has been discussed by several authors and can be found in [88, 89].

2.5.2. Jacobi polynomials.

The Jacobi polynomials are a class of classical orthogonal polynomials and can be defined by Rodrigues formula as [51]:

Equation (74)

where α, β > −1. Their explicit expressions are given as [51]:

Equation (75)

They are orthogonal with respect to the weight $(1 - x)^\alpha (1 + x)^\beta$ on the interval [−1, 1]:

Equation (76)

The Zernike radial polynomials are a special case of the Jacobi polynomials multiplied by $\rho^m$ with [90]:

Equation (77)

The first few terms of the Jacobi polynomials are illustrated in figure 17. For more information about the Jacobi polynomials, one can refer to [51, 91].
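The connection between the Jacobi and Zernike radial polynomials can be checked numerically. The sketch below assumes one commonly quoted form of the relation, $R_n^m(\rho) = \rho^m P_{(n - m)/2}^{(0, m)}(2\rho^2 - 1)$, which may differ from equation (77) by an equivalent change of argument and sign convention. The helper names are hypothetical, and the explicit sum is valid for non-negative integer α and β only.

```python
import math

def jacobi(k, alpha, beta, x):
    """P_k^(alpha, beta)(x) for non-negative integer alpha, beta (explicit sum)."""
    return sum(
        math.comb(k + alpha, k - s) * math.comb(k + beta, s)
        * ((x - 1) / 2) ** s * ((x + 1) / 2) ** (k - s)
        for s in range(k + 1)
    )

def radial_from_jacobi(n, m, rho):
    """Zernike radial polynomial via the assumed Jacobi relation."""
    k = (n - m) // 2
    return rho ** m * jacobi(k, 0, m, 2 * rho ** 2 - 1)

def radial_explicit(n, m, rho):
    """Zernike radial polynomial from its explicit factorial sum, for comparison."""
    return sum(
        (-1) ** s * math.factorial(n - s)
        / (math.factorial(s)
           * math.factorial((n + m) // 2 - s)
           * math.factorial((n - m) // 2 - s))
        * rho ** (n - 2 * s)
        for s in range((n - m) // 2 + 1)
    )
```

With m = 0 the Jacobi polynomial reduces to a Legendre polynomial (α = β = 0), so the same routine also illustrates the Legendre relation for the rotationally symmetric terms.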

Figure 17. Jacobi polynomials up to the 5th degree for α = β = 3.

2.5.3. Legendre polynomials.

The Legendre polynomials, sometimes called Legendre functions of the first kind, are solutions to the Legendre differential equation. They are a special class of the Jacobi polynomials with α = β = 0 and can be defined by Rodrigues formula as [51]:

Equation (78)

Their explicit expressions are given as [51]:

Equation (79)

The Legendre polynomials are orthogonal over the interval [−1, 1]:

Equation (80)

They relate to the Zernike radial polynomials via [21]:

Equation (81)

The first few terms of the Legendre polynomials are illustrated in figure 18. For more information about the Legendre polynomials, one can refer to [51, 92].

Figure 18. Legendre polynomials up to the 5th degree.

2.5.4. Bessel functions.

The nth-order Bessel function of the first kind is defined as [92]:

Equation (82)

The Bessel functions relate to the Zernike radial polynomials via [32]:

Equation (83)

which is of great importance for the reduction of the diffraction integral in the Nijboer–Zernike theory [6, 21]. The first few terms of the Bessel functions are illustrated in figure 19.
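A quick numerical check can make this relation tangible. The sketch below assumes that the relation takes the classical form $\int_0^1 R_n^m(\rho) J_m(v\rho) \rho \, \mathrm{d}\rho = (-1)^{(n - m)/2} J_{n + 1}(v)/v$, which is the identity usually used in the Nijboer–Zernike theory; whether equation (83) is written in exactly this way is an assumption. The Bessel function is evaluated from its power series and the integral with Simpson's rule, both chosen only to keep the example dependency-free.

```python
import math

def bessel_j(order, x, terms=40):
    """J_order(x) from its power series; adequate for moderate |x|."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(k + order))
        * (x / 2) ** (2 * k + order)
        for k in range(terms)
    )

def zernike_radial(n, m, rho):
    """Zernike radial polynomial R_n^m(rho) from the explicit factorial sum."""
    return sum(
        (-1) ** s * math.factorial(n - s)
        / (math.factorial(s)
           * math.factorial((n + m) // 2 - s)
           * math.factorial((n - m) // 2 - s))
        * rho ** (n - 2 * s)
        for s in range((n - m) // 2 + 1)
    )

def hankel_lhs(n, m, v, steps=2000):
    """Simpson estimate of the integral of R_n^m(rho) J_m(v rho) rho over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps + 1):
        rho = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * zernike_radial(n, m, rho) * bessel_j(m, v * rho) * rho
    return total * h / 3

def hankel_rhs(n, m, v):
    """Closed-form right-hand side of the assumed identity."""
    return (-1) ** ((n - m) // 2) * bessel_j(n + 1, v) / v
```

The two functions should agree to within the quadrature error for any admissible pair (n, m) and moderate v.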

Figure 19. Bessel functions of the first kind up to the 4th order.

2.5.5. Chebyshev polynomials.

The Chebyshev polynomials of the second kind and of degree n are defined as [51]:

Equation (84)

They relate to the Radon transforms of Zernike radial polynomials $\Re _n^m$ via [93]:

Equation (85)

Equation (85) can be used to compute the Zernike radial polynomials for large values of the degree n [94]. The first few terms of the Chebyshev polynomials of the second kind are illustrated in figure 20.
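One way such a computation can look in practice is sketched below. It uses a discrete-Fourier-type formula from the literature, $R_n^m(\rho) = \frac{1}{N}\sum_{k = 0}^{N - 1} U_n(\rho \cos(2\pi k/N)) \cos(2\pi m k/N)$ with N > n + m. Treating this as the computational consequence of equation (85) is an assumption, and the function names are hypothetical. The Chebyshev polynomials are evaluated with their three-term recurrence, which avoids the large alternating factorial terms of the explicit sum.

```python
import math

def chebyshev_u(n, x):
    """U_n(x) via the recurrence U_{k+1} = 2 x U_k - U_{k-1}."""
    u_prev, u = 1.0, 2.0 * x
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u = u, 2.0 * x * u - u_prev
    return u

def radial_via_chebyshev(n, m, rho):
    """Zernike radial polynomial from N = n + m + 1 equally spaced samples."""
    samples = n + m + 1  # any N > n + m makes the discrete sum exact
    return sum(
        chebyshev_u(n, rho * math.cos(2 * math.pi * k / samples))
        * math.cos(2 * math.pi * m * k / samples)
        for k in range(samples)
    ) / samples

def radial_explicit(n, m, rho):
    """Reference values from the explicit factorial sum (small n only)."""
    return sum(
        (-1) ** s * math.factorial(n - s)
        / (math.factorial(s)
           * math.factorial((n + m) // 2 - s)
           * math.factorial((n - m) // 2 - s))
        * rho ** (n - 2 * s)
        for s in range((n - m) // 2 + 1)
    )
```

For low degrees the two routines agree to rounding error, which gives a simple sanity check before the Chebyshev route is used at high degree.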

Figure 20. Chebyshev polynomials of the second kind up to the 4th degree.

2.5.6. Pseudo Zernike polynomials.

The pseudo Zernike polynomials (see table 14), first derived by Bhatia and Wolf in 1954 [3], are a set of polynomials orthogonal over a unit circle and analogous to complex Zernike circle polynomials. They are obtained by eliminating the condition n − |l| = even from the definition of the complex Zernike circle polynomials in equation (16). Specifically, the pseudo Zernike polynomials are defined as:

Equation (86)

Table 14. Pseudo Zernike polynomials up to the 5th degree.

| j | n | l | νj (ρ, θ) |
|---|---|---|---|
| 1 | 0 | 0 | 1 |
| 2 | 1 | −1 | $\rho \exp ( - {\text{i}}\theta )$ |
| 3 | | 0 | $3\rho - 2$ |
| 4 | | 1 | $\rho \exp ({\text{i}}\theta )$ |
| 5 | 2 | −2 | ${\rho ^2}\exp ( - {\text{i}}2\theta )$ |
| 6 | | −1 | $(5{\rho ^2} - 4\rho )\exp ( - {\text{i}}\theta )$ |
| 7 | | 0 | $10{\rho ^2} - 12\rho + 3$ |
| 8 | | 1 | $(5{\rho ^2} - 4\rho )\exp ({\text{i}}\theta )$ |
| 9 | | 2 | ${\rho ^2}\exp ({\text{i}}2\theta )$ |
| 10 | 3 | −3 | ${\rho ^3}\exp ( - {\text{i}}3\theta )$ |
| 11 | | −2 | $(7{\rho ^3} - 6{\rho ^2})\exp ( - {\text{i}}2\theta )$ |
| 12 | | −1 | $(21{\rho ^3} - 30{\rho ^2} + 10\rho )\exp ( - {\text{i}}\theta )$ |
| 13 | | 0 | $35{\rho ^3} - 60{\rho ^2} + 30\rho - 4$ |
| 14 | | 1 | $(21{\rho ^3} - 30{\rho ^2} + 10\rho )\exp ({\text{i}}\theta )$ |
| 15 | | 2 | $(7{\rho ^3} - 6{\rho ^2})\exp ({\text{i}}2\theta )$ |
| 16 | | 3 | ${\rho ^3}\exp ({\text{i}}3\theta )$ |
| 17 | 4 | −4 | ${\rho ^4}\exp ( - {\text{i}}4\theta )$ |
| 18 | | −3 | $(9{\rho ^4} - 8{\rho ^3})\exp ( - {\text{i}}3\theta )$ |
| 19 | | −2 | $(36{\rho ^4} - 56{\rho ^3} + 21{\rho ^2})\exp ( - {\text{i}}2\theta )$ |
| 20 | | −1 | $(84{\rho ^4} - 168{\rho ^3} + 105{\rho ^2} - 20\rho )\exp ( - {\text{i}}\theta )$ |
| 21 | | 0 | $126{\rho ^4} - 280{\rho ^3} + 210{\rho ^2} - 60\rho + 5$ |
| 22 | | 1 | $(84{\rho ^4} - 168{\rho ^3} + 105{\rho ^2} - 20\rho )\exp ({\text{i}}\theta )$ |
| 23 | | 2 | $(36{\rho ^4} - 56{\rho ^3} + 21{\rho ^2})\exp ({\text{i}}2\theta )$ |
| 24 | | 3 | $(9{\rho ^4} - 8{\rho ^3})\exp ({\text{i}}3\theta )$ |
| 25 | | 4 | ${\rho ^4}\exp ({\text{i}}4\theta )$ |
| 26 | 5 | −5 | ${\rho ^5}\exp ( - {\text{i}}5\theta )$ |
| 27 | | −4 | $(11{\rho ^5} - 10{\rho ^4})\exp ( - {\text{i}}4\theta )$ |
| 28 | | −3 | $(55{\rho ^5} - 90{\rho ^4} + 36{\rho ^3})\exp ( - {\text{i}}3\theta )$ |
| 29 | | −2 | $(165{\rho ^5} - 360{\rho ^4} + 252{\rho ^3} - 56{\rho ^2})\exp ( - {\text{i}}2\theta )$ |
| 30 | | −1 | $(330{\rho ^5} - 840{\rho ^4} + 756{\rho ^3} - 280{\rho ^2} + 35\rho )\exp ( - {\text{i}}\theta )$ |
| 31 | | 0 | $462{\rho ^5} - 1260{\rho ^4} + 1260{\rho ^3} - 560{\rho ^2} + 105\rho - 6$ |
| 32 | | 1 | $(330{\rho ^5} - 840{\rho ^4} + 756{\rho ^3} - 280{\rho ^2} + 35\rho )\exp ({\text{i}}\theta )$ |
| 33 | | 2 | $(165{\rho ^5} - 360{\rho ^4} + 252{\rho ^3} - 56{\rho ^2})\exp ({\text{i}}2\theta )$ |
| 34 | | 3 | $(55{\rho ^5} - 90{\rho ^4} + 36{\rho ^3})\exp ({\text{i}}3\theta )$ |
| 35 | | 4 | $(11{\rho ^5} - 10{\rho ^4})\exp ({\text{i}}4\theta )$ |
| 36 | | 5 | ${\rho ^5}\exp ({\text{i}}5\theta )$ |

where n is a nonnegative integer, l is an integer, and n − |l| ⩾ 0; the radial polynomials of pseudo Zernike polynomials can be written as:

Equation (87)

The relation between the pseudo Zernike radial polynomials (equation (87)) and the Zernike radial polynomials (equation (19)) is given by [3]:

Equation (88)

The first few terms of the pseudo Zernike radial polynomials are illustrated in figure 21. Pseudo Zernike polynomials can be used for wavefront sensing [83], and to define pseudo Zernike moments, which can generate moment invariants as shape descriptors for pattern recognition (section 4.6.2).
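The radial parts listed in table 14 can be generated programmatically. The sketch below assumes that the pseudo Zernike radial polynomials take the Bhatia–Wolf form $\sum_{s = 0}^{n - |l|} (-1)^s \frac{(2n + 1 - s)!}{s!\,(n - |l| - s)!\,(n + |l| + 1 - s)!} \rho^{n - s}$. This form is an assumption about equation (87), adopted because it reproduces the entries of table 14, and the function name is hypothetical.

```python
import math

def pseudo_zernike_radial_coeffs(n, l):
    """Return {power: integer coefficient} of the pseudo Zernike radial polynomial."""
    l = abs(l)
    coeffs = {}
    for s in range(n - l + 1):
        num = (-1) ** s * math.factorial(2 * n + 1 - s)
        den = (math.factorial(s)
               * math.factorial(n - l - s)
               * math.factorial(n + l + 1 - s))
        coeffs[n - s] = num // den  # each ratio is an exact integer
    return coeffs

# n = 2, l = 1 gives {2: 5, 1: -4}, i.e. 5ρ² − 4ρ, matching table 14.
example = pseudo_zernike_radial_coeffs(2, 1)
```

Unlike the Zernike radial polynomials, consecutive powers of ρ appear here because the parity condition on n − |l| has been dropped.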

Figure 21. Pseudo Zernike radial polynomials with the azimuthal index l = 1.

3. Zernike polynomials over noncircular pupils

3.1. Zernike polynomials over arbitrary pupil shapes

Zernike circle polynomials are in widespread use for wavefront analysis in optical systems with circular pupils. They are unique in the sense that they are not only orthogonal across a unit circle, but they also represent balanced aberrations yielding minimum variance. However, in practice, optical systems do not always have circular pupil shapes. Non-circular pupils, such as annular, hexagonal, elliptical, rectangular, and square, are also very common. For example, many telescopes, such as the Hubble space telescope, have annular pupils [95, 96]; some mirrors of large telescopes are segmented into small hexagonal segments to facilitate fabrication, testing, and alignment [97]; the pupil of a human eye is slightly elliptical [98]; rectangular or square optics are applied in anamorphic optical systems [99, 100] and high-powered laser systems [101]. In such cases, Zernike circle polynomials are no longer orthogonal and their advantages are lost. It is necessary to construct new orthonormal polynomials for aberration representation. Methods for constructing orthonormal polynomials mainly include the recursive Gram–Schmidt process [37] and the nonrecursive matrix approach [102]. The Gram–Schmidt orthogonalization approach is briefly summarized below.

Using the Gram–Schmidt orthonormalization process [103], a set of polynomials Fj (x, y) orthogonal over noncircular pupils can be constructed from Zernike circle polynomials as [4, 37, 104]:

Equation (89)

where $\overline {{Z_{j + 1}}{F_i}} $ denotes the mean value of ${Z_{j + 1}}{F_i}$ and is defined as:

Equation (90)

where A is the area of the region of integration. Nj +1 is a normalization factor and can be expressed as:

Equation (91)

The constructed polynomials satisfy the following orthonormality condition:

Equation (92)

Since an orthonormal polynomial is a linear combination of Zernike circle polynomials (equation (89)), the wavefront decomposition with a set of orthonormal polynomials over noncircular pupils is identical to the decomposition with a corresponding set of Zernike circle polynomials. However, in this case, the Zernike circle polynomials do not represent balanced aberrations and their expansion coefficients lack physical significance [105].

The constructed orthogonal polynomials are determined recursively and each term is a linear combination of Zernike circle polynomials with no higher radial order. The Gram–Schmidt orthonormalization approach can be applied to construct orthonormal polynomials over any pupil shape [106, 107]. Figure 22 presents five common noncircular pupils, including annular, rectangular, square, hexagonal, and elliptical pupils. Orthonormal polynomials over these noncircular pupils can be obtained using the Gram–Schmidt orthogonalization process and are tabulated in table 15.

Figure 22. Common noncircular pupils inscribed inside a unit circle. (a) Annular pupil (obscuration ratio: ε). (b) Rectangular pupil (half width: a). (c) Square pupil. (d) Hexagonal pupil. (e) Elliptical pupil (semi-minor axis: a).

Table 15. Orthonormal polynomials over noncircular pupils [37, 104, 108].

| Pupil shape | Orthogonal polynomials $F_{j + 1}$ | $\overline {{Z_{j + 1}}{F_i}} $ |
|---|---|---|
| Annular | ${F_{j + 1}} = {N_{j + 1}}\left[ {{Z_{j + 1}} - \sum\limits_{i = 1}^j {\overline {{Z_{j + 1}}{F_i}} } {F_i}} \right]$ (same form for all pupil shapes) | $\overline {{Z_{j + 1}}{F_i}} = \frac{1}{\pi (1 - {\varepsilon ^2})}\int_0^{2\pi } {\int_\varepsilon ^1 {{Z_{j + 1}}{F_i}\,\rho \,{\text{d}}\rho \,{\text{d}}\theta } } $ |
| Rectangular | | $\overline {{Z_{j + 1}}{F_i}} = \frac{1}{{4a\sqrt {1 - {a^2}} }}\int_{ - \sqrt {1 - {a^2}} }^{\sqrt {1 - {a^2}} } {\int_{ - a}^a {{Z_{j + 1}}{F_i}\,{\text{d}}x\,{\text{d}}y} } $ |
| Square | | $\overline {{Z_{j + 1}}{F_i}} = \frac{1}{2}\int_{ - 1/\sqrt 2 }^{1/\sqrt 2 } {\int_{ - 1/\sqrt 2 }^{1/\sqrt 2 } {{Z_{j + 1}}{F_i}\,{\text{d}}x\,{\text{d}}y} } $ |
| Hexagonal | | $\overline {{Z_{j + 1}}{F_i}} = \frac{2}{{3\sqrt 3 }}\int_{{\text{hexagon}}} {{Z_{j + 1}}{F_i}\,{\text{d}}x\,{\text{d}}y} $ |
| Elliptical | | $\overline {{Z_{j + 1}}{F_i}} = \frac{1}{{\pi a}}\int_{ - 1}^1 {\int_{ - a\sqrt {1 - {x^2}} }^{a\sqrt {1 - {x^2}} } {{Z_{j + 1}}{F_i}\,{\text{d}}y\,{\text{d}}x} } $ |
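The Gram–Schmidt construction of equations (89)–(92) can be illustrated numerically for the square pupil. The sketch below is a simplified stand-in rather than the analytic procedure: the area integrals are replaced by sample means over a uniform midpoint grid (an assumption), only the first six Noll-ordered Zernike circle polynomials are used, and all function names are hypothetical.

```python
import math

def zernike_circle(j, x, y):
    """First six Noll-ordered orthonormal circle polynomials (j = 0 is Z_1)."""
    r2 = x * x + y * y
    return [
        1.0,                                 # piston
        2.0 * x,                             # x tilt
        2.0 * y,                             # y tilt
        math.sqrt(3.0) * (2.0 * r2 - 1.0),   # defocus
        math.sqrt(6.0) * 2.0 * x * y,        # astigmatism, sin(2 theta)
        math.sqrt(6.0) * (x * x - y * y),    # astigmatism, cos(2 theta)
    ][j]

def mean(values):
    return sum(values) / len(values)

def gram_schmidt(points, n_terms=6):
    """Orthonormalize Z_1..Z_n_terms over the sampled pupil, as in equation (89)."""
    basis = []
    for j in range(n_terms):
        z = [zernike_circle(j, x, y) for x, y in points]
        f = z[:]
        for prev in basis:
            proj = mean([zv * pv for zv, pv in zip(z, prev)])  # sampled mean of Z F_i
            f = [fv - proj * pv for fv, pv in zip(f, prev)]
        norm = math.sqrt(mean([fv * fv for fv in f]))          # plays the role of 1/N
        basis.append([fv / norm for fv in f])
    return basis

# Square pupil inscribed in the unit circle, sampled on a 21 x 21 midpoint grid.
half = 1.0 / math.sqrt(2.0)
grid = [-half + 2.0 * half * (i + 0.5) / 21 for i in range(21)]
square_points = [(x, y) for x in grid for y in grid]
F = gram_schmidt(square_points)
```

The same routine applies to any pupil shape once `square_points` is replaced by samples of that pupil; only the sampling changes, not the orthogonalization.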

3.2. Zernike polynomials over annular pupils

Annular pupils play an important role in optical systems such as telescopes for astronomical observation [96] and stitching interferometers that test aspheric wavefronts through annular sub-apertures [109–111]. Orthonormal Zernike polynomials over annular pupils, called Zernike annular polynomials, can be constructed from Zernike circle polynomials using the Gram–Schmidt orthogonalization process. Zernike annular polynomials first appeared in a report of the Perkin-Elmer Corporation in 1971 [4], were later discussed by Tatian in 1974 [5], and were systematically studied and explicitly given by Mahajan in 1981 [4].

Zernike annular polynomials are defined over a unit annular disk with an obscuration ratio of ε (0 ⩽ ε < 1) and can be most conveniently expressed in polar coordinates (ρ, θ), where ρ is the normalized radial coordinate (ε ⩽ ρ ⩽ 1) and θ is the polar angle measured counterclockwise from the +x-axis (0 ⩽ θ < 2π), as shown in figure 23.

Figure 23. Unit annular pupil with an obscuration ratio of ε.

3.2.1. Definition.

3.2.1.1. Real Zernike annular polynomials

Real Zernike annular polynomials have normalized and non-normalized forms. The normalized form defined under the Noll indexing scheme can be written as [4, 35, 112]:

Equation (93)

where the index n is the degree of the radial polynomials $R_n^m(\rho ;\varepsilon )$; the index m is the azimuthal frequency describing the repetition of the angular function; n and m are non-negative integers satisfying n − m ⩾ 0 and n − m even; j is a mode-ordering number starting from 1; and ε is the obscuration ratio. There are a total of (n + 1)(n + 2)/2 linearly independent polynomials of degree up to and including n. The radial polynomials $R_n^m(\rho ;\varepsilon )$ can be obtained by Gram–Schmidt orthogonalization and are given by:

Equation (94)

where:

Equation (95)

and the weighting factor $\omega _n^m$ can be determined according to the orthogonality condition of the radial polynomials as:

Equation (96)

Exemplary profiles of the radial polynomials are shown in figure 24. It is easy to verify that when ε = 0, Zernike annular polynomials reduce to circle polynomials. The normalized Zernike annular polynomials meet the following orthonormality condition:

Equation (97)

Figure 24. Annular Zernike radial polynomials (ε = 0.5) of the first few degrees when m = 0, 1, and 2.

where ${\delta _{jj^{\prime}}}$ is the Kronecker delta function.
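Two families of annular radial polynomials have simple closed forms that are easy to verify against table 16. The sketch below assumes the standard results $R_n^0(\rho;\varepsilon) = R_n^0\big(\sqrt{(\rho^2 - \varepsilon^2)/(1 - \varepsilon^2)}\big)$ and $R_n^n(\rho;\varepsilon) = \rho^n / (\sum_{i = 0}^{n} \varepsilon^{2i})^{1/2}$, and checks the radial orthogonality integral numerically with Simpson's rule. The expected norm $(1 - \varepsilon^2)/[2(n + 1)]$ is likewise an assumption about the normalization, and the function names are hypothetical.

```python
import math

def zernike_radial(n, m, rho):
    """Circle radial polynomial R_n^m(rho) from the explicit factorial sum."""
    return sum(
        (-1) ** s * math.factorial(n - s)
        / (math.factorial(s)
           * math.factorial((n + m) // 2 - s)
           * math.factorial((n - m) // 2 - s))
        * rho ** (n - 2 * s)
        for s in range((n - m) // 2 + 1)
    )

def annular_radial_m0(n, rho, eps):
    """R_n^0(rho; eps) for even n, via the stretched radial variable."""
    t = math.sqrt(max(0.0, (rho * rho - eps * eps) / (1.0 - eps * eps)))
    return zernike_radial(n, 0, t)

def annular_radial_nn(n, rho, eps):
    """R_n^n(rho; eps) = rho^n / sqrt(1 + eps^2 + ... + eps^(2n))."""
    return rho ** n / math.sqrt(sum(eps ** (2 * i) for i in range(n + 1)))

def radial_inner(f, g, eps, steps=2000):
    """Simpson estimate of the integral of f(rho) g(rho) rho over [eps, 1]."""
    h = (1.0 - eps) / steps
    total = 0.0
    for i in range(steps + 1):
        rho = eps + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f(rho) * g(rho) * rho
    return total * h / 3.0
```

Setting `eps = 0` recovers the circle polynomials, consistent with the reduction noted in the text.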

Similar to Zernike circle polynomials, orthonormal Zernike annular polynomials can be sorted by either the single index, j, or the double indices, n and m. The former is useful for describing Zernike expansion coefficients while the latter is useful for unambiguously describing the functions. To convert between the indices n, m, and j, one can use the relationships described in equations (8) and (9). Table 16 lists the first 28 orthonormal real Zernike annular polynomials in the polar coordinate system and the values for n, m, and j. For more terms up to the 45th, one can refer to tables 5–7 in [104].

Table 16. Zernike annular polynomials under the Noll indices up to the sixth degree (n = 6).

| j | n | m | $R_n^m(\rho ;\varepsilon )$ | ${Z_j}(\rho ,\theta ;\varepsilon )$ | Aberration |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 1 | $R_0^0(\rho ;\varepsilon )$ | Piston |
| 2 | 1 | 1 | $\frac{\rho }{(1 + {\varepsilon ^2})^{1/2}}$ | $2R_1^1(\rho ;\varepsilon )\cos \theta $ | X Tilt |
| 3 | | 1 | | $2R_1^1(\rho ;\varepsilon )\sin \theta $ | Y Tilt |
| 4 | 2 | 0 | $\frac{2{\rho ^2} - 1 - {\varepsilon ^2}}{1 - {\varepsilon ^2}}$ | $\sqrt 3 R_2^0(\rho ;\varepsilon )$ | Defocus |
| 5 | | 2 | $\frac{{\rho ^2}}{(1 + {\varepsilon ^2} + {\varepsilon ^4})^{1/2}}$ | $\sqrt 6 R_2^2(\rho ;\varepsilon )\sin 2\theta $ | Primary Y astigmatism |
| 6 | | 2 | | $\sqrt 6 R_2^2(\rho ;\varepsilon )\cos 2\theta $ | Primary X astigmatism |
| 7 | 3 | 1 | $\frac{3(1 + {\varepsilon ^2}){\rho ^3} - 2(1 + {\varepsilon ^2} + {\varepsilon ^4})\rho }{(1 - {\varepsilon ^2})[(1 + {\varepsilon ^2})(1 + 4{\varepsilon ^2} + {\varepsilon ^4})]^{1/2}}$ | $\sqrt 8 R_3^1(\rho ;\varepsilon )\sin \theta $ | Primary Y coma |
| 8 | | 1 | | $\sqrt 8 R_3^1(\rho ;\varepsilon )\cos \theta $ | Primary X coma |
| 9 | | 3 | $\frac{{\rho ^3}}{(1 + {\varepsilon ^2} + {\varepsilon ^4} + {\varepsilon ^6})^{1/2}}$ | $\sqrt 8 R_3^3(\rho ;\varepsilon )\sin 3\theta $ | |
| 10 | | 3 | | $\sqrt 8 R_3^3(\rho ;\varepsilon )\cos 3\theta $ | |
| 11 | 4 | 0 | $\frac{6{\rho ^4} - 6(1 + {\varepsilon ^2}){\rho ^2} + 1 + 4{\varepsilon ^2} + {\varepsilon ^4}}{(1 - {\varepsilon ^2})^2}$ | $\sqrt 5 R_4^0(\rho ;\varepsilon )$ | Primary spherical |
| 12 | | 2 | $\frac{4{\rho ^4} - 3[(1 - {\varepsilon ^8})/(1 - {\varepsilon ^6})]{\rho ^2}}{\left\{ (1 - {\varepsilon ^2})^{ - 1}\left[ 16(1 - {\varepsilon ^{10}}) - 15(1 - {\varepsilon ^8})^2/(1 - {\varepsilon ^6}) \right] \right\}^{1/2}}$ | $\sqrt {10} R_4^2(\rho ;\varepsilon )\cos 2\theta $ | Secondary X astigmatism |
| 13 | | 2 | | $\sqrt {10} R_4^2(\rho ;\varepsilon )\sin 2\theta $ | Secondary Y astigmatism |
| 14 | | 4 | $\frac{{\rho ^4}}{(1 + {\varepsilon ^2} + {\varepsilon ^4} + {\varepsilon ^6} + {\varepsilon ^8})^{1/2}}$ | $\sqrt {10} R_4^4(\rho ;\varepsilon )\cos 4\theta $ | |
| 15 | | 4 | | $\sqrt {10} R_4^4(\rho ;\varepsilon )\sin 4\theta $ | |
| 16 | 5 | 1 | $\frac{10(1 + 4{\varepsilon ^2} + {\varepsilon ^4}){\rho ^5} - 12(1 + 4{\varepsilon ^2} + 4{\varepsilon ^4} + {\varepsilon ^6}){\rho ^3} + 3(1 + 4{\varepsilon ^2} + 10{\varepsilon ^4} + 4{\varepsilon ^6} + {\varepsilon ^8})\rho }{(1 - {\varepsilon ^2})^2[(1 + 4{\varepsilon ^2} + {\varepsilon ^4})(1 + 9{\varepsilon ^2} + 9{\varepsilon ^4} + {\varepsilon ^6})]^{1/2}}$ | $\sqrt {12} R_5^1(\rho ;\varepsilon )\cos \theta $ | Secondary X coma |
| 17 | | 1 | | $\sqrt {12} R_5^1(\rho ;\varepsilon )\sin \theta $ | Secondary Y coma |
| 18 | | 3 | $\frac{5{\rho ^5} - 4[(1 - {\varepsilon ^{10}})/(1 - {\varepsilon ^8})]{\rho ^3}}{\left\{ (1 - {\varepsilon ^2})^{ - 1}[25(1 - {\varepsilon ^{12}}) - 24(1 - {\varepsilon ^{10}})^2/(1 - {\varepsilon ^8})] \right\}^{1/2}}$ | $\sqrt {12} R_5^3(\rho ;\varepsilon )\cos 3\theta $ | |
| 19 | | 3 | | $\sqrt {12} R_5^3(\rho ;\varepsilon )\sin 3\theta $ | |
| 20 | | 5 | $\frac{{\rho ^5}}{(1 + {\varepsilon ^2} + {\varepsilon ^4} + {\varepsilon ^6} + {\varepsilon ^8} + {\varepsilon ^{10}})^{1/2}}$ | $\sqrt {12} R_5^5(\rho ;\varepsilon )\cos 5\theta $ | |
| 21 | | 5 | | $\sqrt {12} R_5^5(\rho ;\varepsilon )\sin 5\theta $ | |
| 22 | 6 | 0 | $\frac{20{\rho ^6} - 30(1 + {\varepsilon ^2}){\rho ^4} + 12(1 + 3{\varepsilon ^2} + {\varepsilon ^4}){\rho ^2} - (1 + 9{\varepsilon ^2} + 9{\varepsilon ^4} + {\varepsilon ^6})}{(1 - {\varepsilon ^2})^3}$ | $\sqrt 7 R_6^0(\rho ;\varepsilon )$ | Secondary spherical |
| 23 | | 2 | $\frac{15(1 + 4{\varepsilon ^2} + 10{\varepsilon ^4} + 4{\varepsilon ^6} + {\varepsilon ^8}){\rho ^6} - 20(1 + 4{\varepsilon ^2} + 10{\varepsilon ^4} + 10{\varepsilon ^6} + 4{\varepsilon ^8} + {\varepsilon ^{10}}){\rho ^4} + 6(1 + 4{\varepsilon ^2} + 10{\varepsilon ^4} + 20{\varepsilon ^6} + 10{\varepsilon ^8} + 4{\varepsilon ^{10}} + {\varepsilon ^{12}}){\rho ^2}}{(1 - {\varepsilon ^2})^2\left[ (1 + 4{\varepsilon ^2} + 10{\varepsilon ^4} + 4{\varepsilon ^6} + {\varepsilon ^8})(1 + 9{\varepsilon ^2} + 45{\varepsilon ^4} + 65{\varepsilon ^6} + 45{\varepsilon ^8} + 9{\varepsilon ^{10}} + {\varepsilon ^{12}}) \right]^{1/2}}$ | $\sqrt {14} R_6^2(\rho ;\varepsilon )\sin 2\theta $ | Tertiary Y astigmatism |
| 24 | | 2 | | $\sqrt {14} R_6^2(\rho ;\varepsilon )\cos 2\theta $ | Tertiary X astigmatism |
| 25 | | 4 | $\frac{6{\rho ^6} - 5[(1 - {\varepsilon ^{12}})/(1 - {\varepsilon ^{10}})]{\rho ^4}}{\left\{ (1 - {\varepsilon ^2})^{ - 1}[36(1 - {\varepsilon ^{14}}) - 35(1 - {\varepsilon ^{12}})^2/(1 - {\varepsilon ^{10}})] \right\}^{1/2}}$ | $\sqrt {14} R_6^4(\rho ;\varepsilon )\sin 4\theta $ | |
| 26 | | 4 | | $\sqrt {14} R_6^4(\rho ;\varepsilon )\cos 4\theta $ | |
| 27 | | 6 | $\frac{{\rho ^6}}{(1 + {\varepsilon ^2} + {\varepsilon ^4} + {\varepsilon ^6} + {\varepsilon ^8} + {\varepsilon ^{10}} + {\varepsilon ^{12}})^{1/2}}$ | $\sqrt {14} R_6^6(\rho ;\varepsilon )\sin 6\theta $ | |
| 28 | | 6 | | $\sqrt {14} R_6^6(\rho ;\varepsilon )\cos 6\theta $ | |

The non-normalized Zernike annular polynomials can be obtained by dropping the normalization factors from the orthonormal Zernike annular polynomials as:

Equation (98)

They satisfy the following orthogonality condition:

Equation (99)

Figures 25 and 26 show the 3D visualization of the non-normalized Zernike annular polynomials up to the sixth degree for ε = 0.6 and their corresponding interferometric fringe patterns as in optical testing [37].

Figure 25. Pyramid of the non-normalized Zernike annular polynomials up to the sixth degree with an obscuration ratio of 0.6.

Figure 26. Interferometric fringe patterns of the Zernike annular aberrations shown in figure 25.

3.2.1.2. Complex Zernike annular polynomials

Complex Zernike annular polynomials have normalized and non-normalized forms. The normalized complex Zernike annular polynomials defined in the Noll indices can be written as [4, 35]:

Equation (100)

where n is a non-negative integer, l is an integer, n − |l| ⩾ 0 and is even, and the radial polynomial is defined as:

Equation (101)

and the weighting factor $\omega _n^{|l|}$ is given by:

Equation (102)

The normalized complex Zernike annular polynomials satisfy the following orthonormality condition:

Equation (103)

The non-normalized complex Zernike annular polynomials can be written as:

Equation (104)

They satisfy the following orthogonality condition:

Equation (105)

3.2.1.3. Summary.

The definitions for the real and complex Zernike annular polynomials are summarized in table 17.

Table 17. Summary of the definition of Zernike annular polynomials.

 Real Zernike annular polynomialsComplex Zernike annular polynomials
SourcesTatian, 1974 [5] Mahajan, 1981 [4] Zemax [25] Mahajan, 1981 [4]
IndexingThe Noll indicesThe Noll indices
Definition $\begin{array}{l} {Z_j}(\rho ,\theta ;\varepsilon ) = Z_n^m(\rho ,\theta ;\varepsilon ) \\[4pt] = N_n^m\left\{ {\begin{array}{*{20}{l}} {R_n^m(\rho ;\varepsilon )\cos m\theta ,}\;{m \ne 0, \;j {\text{ is even}}} \\[4pt] {R_n^m(\rho ;\varepsilon )\sin m\theta ,}\;\;{m \ne 0,\; j {\text{ is odd}}} \\[4pt] {R_n^m(\rho ;\varepsilon ),}\;\qquad \quad {m = 0} \end{array} } \right. \\ \end{array} $ $V\,_n^l(\rho ,\theta ;\varepsilon ) = N_n^lR_n^l(\rho ;\varepsilon )\exp ({\text{i}}l\theta )$
Radial polynomialsEquation (94)Equation (101)
Coordinate system ε ⩽ ρ ⩽ 1, θ from +x axis anticlockwise ε ⩽ ρ ⩽ 1, θ from +x axis anticlockwise
NormalizationOptionalOptional
Normalization factor $N_n^m = \left\{ \begin{array}{l} 1,\;\;\;\;\;\;\;\;\;\;\;\;\;\;{\text{non-normalized}} \\[6pt] \sqrt {\frac{{2(n + 1)}}{{(1 + {\delta _{m0}})}}} ,\;{\text{normalized}} \\ \end{array} \right.$ $N_n^l = \left\{ \begin{array}{l} 1,\;\;\;\;\;\;\;\;{\text{non-normalized}} \\[6pt] \sqrt {n + 1} ,\;{\text{normalized}} \\ \end{array} \right.$
Term numberInfiniteInfinite
Indices j = 1, 2, 3, ..., n: non-negative integer m: non-negative integer n − m ⩾ 0 and even n: non-negative integer l: integer n − |l| ⩾ 0 and even
Relationship between indices $j = \left\{ {\begin{array}{*{20}{l}} {\frac{{n(n + 1)}}{2} + 1,}&{m = 0} \\[6pt] {\left[ {\frac{{n(n + 1)}}{2} + m,\frac{{n(n + 1)}}{2} + m + 1} \right],}&{m \ne 0} \end{array}} \right.$ $\begin{array}{l} n = \left\lfloor {\left( {\sqrt {2j - 1} + 0.5} \right) - 1} \right\rfloor \\[8pt] m = \left\{ {\begin{array}{*{20}{l}} {2 \times \left\lfloor {\frac{{2j + 1 - n\left( {n + 1} \right)}}{4}} \right\rfloor ,}&{n{\text{ is even}}} \\[8pt] {2 \times \left\lfloor {\frac{{2\left( {j + 1} \right) - n\left( {n + 1} \right)}}{4}} \right\rfloor - 1,}&{n{\text{ is odd}}} \end{array}} \right. \\ \end{array} $
Ordering n, m both in ascending order
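The index relationships summarized in table 17 translate directly into code. The following is a minimal sketch of the conversion between the Noll single index j and the double indices (n, m), transcribed from the formulas in the table. The function names are hypothetical, and for m ≠ 0 the pair of candidate j values is returned because the choice between them is fixed by the cosine or sine parity rule rather than by (n, m) alone.

```python
import math

def noll_to_nm(j):
    """Radial degree n and azimuthal frequency m for the Noll index j >= 1."""
    n = math.floor(math.sqrt(2 * j - 1) + 0.5) - 1
    if n % 2 == 0:
        m = 2 * ((2 * j + 1 - n * (n + 1)) // 4)
    else:
        m = 2 * ((2 * (j + 1) - n * (n + 1)) // 4) - 1
    return n, m

def nm_to_noll(n, m):
    """Noll indices j associated with (n, m): one value for m = 0, two otherwise."""
    base = n * (n + 1) // 2
    return (base + 1,) if m == 0 else (base + m, base + m + 1)

# Even j takes the cosine term and odd j the sine term when m != 0.
pairs = [noll_to_nm(j) for j in range(1, 11)]
```

Round-tripping each j through `noll_to_nm` and `nm_to_noll` gives a quick consistency check against the j, n, and m columns of table 16.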

3.2.2. Mathematical properties.

3.2.2.1. Orthogonality.

The orthogonal relationships of real and complex Zernike annular polynomials have been presented and can be found in equations (97), (99), (103) and (105). Moreover, the radial polynomials of Zernike annular polynomials are also orthogonal over the annular aperture and satisfy the following relationships [4]:

Equation (106)

3.2.2.2. Recurrence relation.

The recurrence relationship for generating radial polynomials of Zernike annular polynomials was derived by Tatian in 1974 [4, 5] and can be written as:

Equation (107)

where n and m obey the same conditions defined for Zernike annular polynomials (n and m are non-negative integers, n − m ⩾ 0 and even); k is a non-negative integer (k = 0, 1, 2, ...); l = (n − m)/2; u = ρ²; and $Q_l^m(u)$ is a set of orthogonal polynomials obtained by orthogonalizing the sequence 1, u, ..., $u^l$ over the interval (ε², 1) with the weight function $u^m$, which can be written as [4, 5]:

Equation (108)

The coefficient $h_l^m$ is:

Equation (109)

In particular, when m = n,

Equation (110)

The above recurrence relationship can be initialized with $R_0^0(\rho ;\varepsilon ) = 1$.

3.2.2.3. The Fourier transform.

The Fourier transform of Zernike annular polynomials was derived by Dai and Mahajan [95] and can be written as:

Equation (111)

where (r, φ) denotes the polar coordinates in the frequency domain and:

Equation (112)

where J is the Bessel function of the first kind (equation (82)) and:

Equation (113)

Equation (114)

A list of the first few terms for $H_n^m$, ${g_{{n}{^{\prime}}}}$, and ${h_{{n}^{^{\prime\prime}}}}$ can be found in [95]. The expression in equation (111) reduces to the Fourier transform of Zernike circle polynomials (equation (30)) when ε = 0.

3.2.3. Wavefront fitting.

The orthogonality of Zernike annular polynomials makes them an excellent basis for wavefront analysis in annular optical systems. An annular wavefront can be represented by a linear combination of a finite number of Zernike annular polynomials as [113]:

Equation (115)

where J is the total number of polynomial terms, $a_j$ is the jth expansion coefficient, and $Z_j$ is the jth Zernike annular polynomial. The equation can be equivalently expressed in Cartesian coordinates as:

Equation (116)

Written in discrete and matrix forms, equation (116) becomes:

Equation (117)

where:

Equation (118)

where K is the total number of data points within the annular aperture. Generally, equation (117) is an overdetermined linear system with more equations (K) than unknowns (J). It can be converted into the normal equation [34, 68]:

Equation (119)

where the superscript T denotes matrix transpose. The solution can be obtained by matrix inversion as:

Equation (120)
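The fitting procedure above can be sketched numerically. This is a minimal illustration, assuming a wavefront sampled on a square grid and masked to an annulus; the four basis terms are simple placeholders (piston, tilts, and a defocus-like term), not the orthonormal Zernike annular polynomials, and `fit_modes` is a hypothetical helper name. The sketch solves the overdetermined system with `numpy.linalg.lstsq` and compares the result with the explicit normal-equation solution.

```python
import numpy as np


def fit_modes(wavefront, modes, mask):
    """Least-squares expansion coefficients of `wavefront` over `mask`.

    modes: list of 2D arrays, each one basis term sampled on the same grid.
    """
    Z = np.column_stack([mode[mask] for mode in modes])  # K x J matrix
    w = wavefront[mask]                                   # K samples
    coeffs, *_ = np.linalg.lstsq(Z, w, rcond=None)
    return coeffs, Z, w


# Sample an annulus with obscuration ratio eps on a square grid.
n, eps = 65, 0.4
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
rho2 = x**2 + y**2
mask = (rho2 <= 1.0) & (rho2 >= eps**2)

# Placeholder low-order terms (assumption): NOT the orthonormal annular set.
modes = [np.ones_like(x), x, y, rho2]
true_coeffs = np.array([0.2, -0.5, 0.3, 0.8])
wavefront = sum(c * mode for c, mode in zip(true_coeffs, modes))

coeffs, Z, w = fit_modes(wavefront, modes, mask)
normal_eq = np.linalg.solve(Z.T @ Z, Z.T @ w)  # explicit normal equations
print(coeffs, normal_eq)
```

In practice an SVD- or QR-based solver such as `lstsq` is usually preferred over forming $Z^T Z$ explicitly, because the normal equations square the condition number of the problem.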

Figure 27 shows an example illustrating annular wavefront decomposition using 28-term orthonormal Zernike annular polynomials under the Noll indices. The amplitude of each coefficient indicates the strength of corresponding aberrations (table 16).

Figure 27. Annular wavefront decomposition using orthonormal Zernike annular polynomials under the Noll indices: (a) wavefront and (b) 28 expansion coefficients.

4. Applications

The unique properties of Zernike polynomials have made them an attractive mathematical tool in many fields. In this section, we survey their applications in a range of fields, including diffraction theory, optical design, optical testing, adaptive optics, ophthalmic optics, and image analysis, as illustrated in figure 28.

Figure 28. Major applications of Zernike polynomials.

4.1. Diffraction theory

4.1.1. The diffraction theory of aberrations.

Zernike polynomials have important applications in the diffraction theory of aberrations, which is concerned with the study of how wavefront aberrations affect image formation in practical optical systems [32, 47]. In a perfect optical imaging system, the light waves from a point object emerge in the image space as spherically convergent waves and form the well-known Airy pattern. However, a perfect imaging system never exists in practice. Waves emerging from a practical optical system deviate from a spherical wave and possess complicated forms.

Consider the wave propagation model illustrated in figure 29, where an aberrated wavefront at the exit pupil converges to the image plane. Let $W_a$ and $W_r$ denote the aberrated wavefront and its Gaussian reference wavefront in units of length, respectively. The position of the exit pupil is defined by the Cartesian coordinates (x, y, z) or the cylindrical coordinates (ρ, θ, z); the position of the image plane is defined by the Cartesian coordinates (ξ, η, ζ) or the cylindrical coordinates (r, φ, υ). The complex amplitude distribution at the exit pupil, called the pupil function, can be written as [114]:

Equation (121)

Figure 29. Geometry of wave propagation from the exit pupil to the image plane. $W_a$: aberrated wavefront; $W_r$: reference wavefront.

where A(ρ, θ) is the amplitude function and Φ is the phase function in the form of:

Equation (122)

where λ is the wavelength. According to the scalar Debye integral [32, 114], the normalized complex amplitude, U(r, φ, υ), in the focal region of the image plane is given by:

Equation (123)

where υ is defined as the negative axial coordinate (−ζ) normalized with respect to the axial diffraction unit, $\lambda /(\pi s_0^2)$, and $s_0$ is the numerical aperture (NA) of the focusing beam. When the image plane is at the best focus (υ = 0), the complex amplitude U(r, φ, υ) in equation (123) reduces to the Fourier transform of the pupil function P(ρ, θ).

The PSF, defined as the diffraction pattern of a point object in the image plane, can be written as the squared modulus of the complex amplitude U [115, 116], i.e.

Equation (124)

The image of an extended object formed by an optical system is the convolution of the object itself with the PSF of the system, which can be mathematically modeled as [117]:

Equation (125)

where f and g denote the object and the image, respectively, and ∗ represents convolution. To understand the impact of wavefront aberrations on the final image quality, the PSF needs to be evaluated. In the next section, we briefly review the analytical PSF computation approaches first developed by Nijboer and Zernike [6, 21] and later extended by Janssen [8], where expanding wavefront aberrations at the exit pupil using Zernike circle polynomials is the key.
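Before turning to the analytical approaches, the two relations above (the PSF as the squared modulus of the Fourier transform of the pupil function at best focus, and the image as the convolution of the object with the PSF) can be illustrated with a purely numerical sketch. It uses an FFT rather than the analytical Nijboer–Zernike expressions; the grid size, the zero-padding factor, the defocus-like aberration, and the helper name `psf_from_pupil` are illustrative assumptions.

```python
import numpy as np


def psf_from_pupil(phase, amplitude, pad=4):
    """PSF at best focus: squared modulus of the FFT of the pupil function."""
    n = phase.shape[0]
    pupil = amplitude * np.exp(1j * phase)
    padded = np.zeros((pad * n, pad * n), dtype=complex)
    padded[:n, :n] = pupil  # zero-padding refines the sampling of the PSF
    field = np.fft.fftshift(np.fft.fft2(padded))
    psf = np.abs(field) ** 2
    return psf / psf.sum()  # normalize to unit total energy


n = 64
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
rho2 = x**2 + y**2
aperture = (rho2 <= 1.0).astype(float)

# Illustrative aberration (assumption): defocus-like phase, about 1 rad rms.
phase = 1.0 * np.sqrt(3.0) * (2.0 * rho2 - 1.0)

psf_ideal = psf_from_pupil(np.zeros_like(phase), aperture)
psf_aberrated = psf_from_pupil(phase, aperture)
strehl = psf_aberrated.max() / psf_ideal.max()  # peak ratio, below 1

# Image formation, g = f * PSF, via the convolution theorem (circular).
obj = np.zeros_like(psf_ideal)
obj[100:140, 120:130] = 1.0
kernel = np.fft.ifftshift(psf_aberrated)  # move the PSF centre to index (0, 0)
image = np.real(np.fft.ifft2(np.fft.fft2(obj) * np.fft.fft2(kernel)))
print(f"peak ratio (aberrated / ideal): {strehl:.3f}")
```

The FFT-based convolution is circular, so an object close to the array border would wrap around; padding the object is the usual remedy.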

4.1.2. PSF computation using the Nijboer–Zernike theory.

In general, analytical evaluation of the diffraction integrals in equation (123) is difficult except for some specific cases. In 1942, Bernard Nijboer, a PhD student of Zernike, expanded the aberration function at the exit pupil into a series of Zernike circle polynomials and formulated an efficient representation of the complex amplitude distribution in the image plane [6, 21]. This work allows analytical evaluation of the diffraction integral and the PSF of a general optical system and is referred to as the Nijboer–Zernike theory. For completeness, we briefly review the basic principle of the theory, which is well summarized in [114].

In the Nijboer–Zernike theory, the pupil function is assumed to be uniform in amplitude and thus can be written as a purely phase-aberrated function, i.e.

Equation (126)

Expanding the pupil function into a Taylor series, the diffraction integral in equation (123) becomes:

Equation (127)

Expanding the phase function, Φ(ρ, θ), into a set of Zernike circle polynomials gives:

Equation (128)

Substituting the Zernike expansion into the integral (equation (127)) and performing the integration over θ using elementary Bessel function operations, we obtain:

Equation (129)

where $\alpha _n^m$ are the Zernike expansion coefficients and $J_m$ is a Bessel function of the first kind and of order m. Note that in the reduction, the phase aberration is considered small enough that truncation of the infinite Taylor series in equation (127) after the term k = 1 is allowed. The above equation can be further reduced using the relationship in equation (83) as:

Equation (130)

where the prime indicates that the term m = n = 0 should be excluded from the summation.

The expression of the complex amplitude, U, provides an analytical method for evaluating the PSF of an optical system. Although elegant in expression, the Nijboer–Zernike approach is not widely used in practice [114], largely because the derivation requires the amplitude over the pupil to be uniform and the wavefront aberration to be sufficiently small (on the order of a fraction of the wavelength [32]).
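The small-aberration restriction can be made concrete with a short numerical check. The sketch below, which is an illustration rather than part of the theory in [114], compares the pupil average of exp(iΦ) (the on-axis amplitude at best focus) with the average of its first-order truncation 1 + iΦ for a defocus-like aberration of increasing rms value; the grid, the aberration, and the function name are assumptions.

```python
import numpy as np

n = 201
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
rho2 = x**2 + y**2
inside = rho2 <= 1.0
# Defocus-like term with approximately unit rms over the disk (assumption).
defocus = np.sqrt(3.0) * (2.0 * rho2[inside] - 1.0)


def on_axis_amplitude(rms, first_order):
    """Pupil average of exp(i*Phi), or of its truncation 1 + i*Phi."""
    phi = rms * defocus
    integrand = 1.0 + 1j * phi if first_order else np.exp(1j * phi)
    return integrand.mean()


for rms in (0.1, 0.5, 1.5):  # rms phase aberration in radians
    err = abs(on_axis_amplitude(rms, False) - on_axis_amplitude(rms, True))
    print(f"rms = {rms:.1f} rad: truncation error = {err:.3f}")
```

The error of the first-order truncation grows roughly quadratically with the rms phase for small aberrations, which is why the approach is limited to aberrations of a fraction of a wavelength.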

Figure 30 illustrates the appearance of the PSF of an optical system when only a single Zernike term (root mean square value: 0.1 μm, wavelength: 570 nm) is present in the wavefront aberration. Figure 31 presents an example showing that wavefront aberrations degrade the image quality of an optical system.

Figure 30. PSFs of the Zernike circle polynomials (root mean square value: 0.1 μm) up to the fourth degree under the Noll indices.

Figure 31. Convolution of an object with the PSF of an optical system leads to degraded images.

4.1.3. PSF computation using the extended Nijboer–Zernike theory.

The Nijboer–Zernike theory is only valid in the case of small aberrations and can only produce accurate values of the PSF at positions close to the geometrical focus. To deal with this problem, Janssen in 2002 formulated a general expression in terms of the power-Bessel series and extended the Nijboer–Zernike theory to optical systems with large aberrations [8, 114, 118, 119]. The extended Nijboer–Zernike theory can analytically compute the PSF of an aberrated optical system described by Zernike coefficients and has accelerated further developments in focused field diffraction theory.

The extended Nijboer–Zernike theory adopts a generalized definition for the pupil function and expands it using Zernike circle polynomials as [114]:

Equation (131)

where the amplitude, A(ρ, θ), and phase aberration, Φ(ρ, θ), are real-valued; the coefficients, $\beta _n^m$, are in general complex-valued; $Z_n^m$ are the non-normalized complex Zernike circle polynomials defined in equation (16). Substituting the pupil function into the diffraction integral (equation (123)), the complex amplitude, U, can be expressed as:

Equation (132)

where:

Equation (133)

Janssen derived a Bessel series representation for the integral in equation (133) and reformulated the above equation as:

Equation (134)

where:

Equation (135)

Equation (136)

The symbol $\binom{n}{k}$ denotes the binomial coefficient (the number of combinations) and is defined as:

Equation (137)

Note that equation (134) suffers from loss of digits and slow convergence for larger υ under standard precision. An advanced version of the extended Nijboer–Zernike (ENZ) theory has been developed to virtually eliminate the convergence problem by replacing the power-Bessel series in equation (134) with a Bessel–Bessel series [120]. Using equation (134), we can compute the PSF of an optical system with an exit pupil defined by a set of β-coefficients (equation (131)). The extended Nijboer–Zernike theory has been used in several applications, such as aberration retrieval in high-NA optical lithography systems [121–123] and acoustic diffraction problems [124, 125].

4.2. Optical design

Optical design is the process of designing an optical system to meet specific performance requirements and constraints. Owing to their unique properties, Zernike polynomials are beneficial to wavefront analysis and surface representation in modern lens design programs. In wavefront analysis, since Zernike expansion coefficients are independent and directly represent balanced aberrations, it is convenient to decompose wavefront aberrations of an optical system into a set of Zernike polynomials to evaluate the contribution of each aberration [25]. Moreover, the coefficients of Zernike polynomials can also be used as variables of the merit function of a lens system to facilitate system optimization [126].

In surface representation, Zernike polynomials have emerged as a means of describing the shape of freeform optical surfaces [127–131]. State-of-the-art lens design programs, such as Zemax and CODE V, empower optical designers to use Zernike polynomials to represent freeform surfaces, which are called Zernike surfaces. For example, Zernike phase surfaces and Zernike sag surfaces are defined and used in Zemax. The Zernike phase surfaces are standard surfaces, such as planes, spheres, and conics, superimposed with phase terms defined by Zernike polynomials [132]. The phase term can be written as:

Equation (138)

where m represents the diffraction order, $Z_j$ are the Zernike circle polynomials, $a_j$ are the expansion coefficients, ρ is the normalized radial coordinate, and θ is the polar angle. This surface type is well suited to modeling system aberrations for which measured interferometer data are available [132]. The Zernike sag surfaces are defined as the conic surface (figure 32) plus additional deformation terms characterized by even orders of the power series and finite terms of Zernike polynomials [58, 132]. They are given by:

Equation (139)

Figure 32. Conic surfaces.

where c denotes the curvature of the base conic; $r = \sqrt{x^2 + y^2}$ is the radial ray coordinate in lens units; k is the conic constant; $\alpha_j$ and $a_j$ are the coefficients of the power series and the Zernike polynomials, respectively; J is the maximum number of terms of the Zernike polynomials; ρ is the normalized radial coordinate and θ is the polar angle. The Zernike phase surfaces (equation (138)) describe the phase variation of a surface while the Zernike sag surfaces (equation (139)) characterize surface deformations. These Zernike surfaces can also employ Zernike annular polynomials to define the aspheric terms when an optical system has an annular pupil. Figure 33 is an example showing the design of a long wave infrared reflective imaging system optimized with Zernike surfaces [133].
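A sag surface of this kind can be evaluated as follows. This is a minimal sketch under stated assumptions: the base term is the standard conic sag, the power-series terms are taken as $\alpha_i r^{2i}$ starting at i = 1, and the Zernike terms are passed in as callables of (ρ, θ). The function name, the argument names, and the normalization radius `r_norm` are hypothetical and do not reproduce the parameterization of any particular lens design program.

```python
import numpy as np


def zernike_sag(x, y, c, k, alphas, zernike_coeffs, zernike_terms, r_norm):
    """Conic base sag + even power-series terms + Zernike deformation terms.

    alphas[i] multiplies r**(2*(i+1)); zernike_terms are callables Z(rho, theta).
    """
    r2 = x**2 + y**2
    sag = c * r2 / (1.0 + np.sqrt(1.0 - (1.0 + k) * c**2 * r2))  # base conic
    for i, alpha in enumerate(alphas, start=1):
        sag = sag + alpha * r2**i                                  # r**(2i) terms
    rho = np.sqrt(r2) / r_norm        # normalized radial coordinate
    theta = np.arctan2(y, x)          # polar angle
    for a_j, term in zip(zernike_coeffs, zernike_terms):
        sag = sag + a_j * term(rho, theta)
    return sag


def defocus(rho, theta):
    """One non-normalized Zernike term, used here only as an example."""
    return 2.0 * rho**2 - 1.0


# Sphere of radius 10 (c = 0.1, k = 0) plus a small defocus deformation.
z = zernike_sag(3.0, 4.0, c=0.1, k=0.0, alphas=[], zernike_coeffs=[0.01],
                zernike_terms=[defocus], r_norm=5.0)
print(z)
```

With all deformation coefficients set to zero and k = 0, the function reduces to the sag of a sphere, $R - \sqrt{R^2 - r^2}$ with R = 1/c, which provides a simple check.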

Figure 33. Optical system design with Zernike polynomials. (a) Optical layout of a long wave infrared reflective imaging system optimized with Zernike surfaces. (b) Housing structure of the optical system. Reprinted with permission from [133] © The Optical Society.

4.3. Optical testing

Optical testing is concerned with testing the optical quality of optical systems by optical techniques [134, 135]. The applications of Zernike polynomials in optical testing are mainly concentrated in the field of optical surface or wavefront measurement by phase-shifting interferometry [136], the principal purpose of which is to determine the aberrations present in an optical component or an optical system [19]. There are many different types of phase-shifting interferometers used in practice, such as the Fizeau, Mach-Zehnder, and Twyman-Green interferometers [137–140]. Here we use a phase-shifting Twyman-Green interferometer as an example to demonstrate the usefulness of Zernike polynomials in precise surface figure measurement.

A typical optical layout of a phase-shifting Twyman-Green interferometer is shown in figure 34. The emitted beam from a laser source is first collimated by a beam expander and then divided by a beam splitter into two parts. The reflected part (red) propagates to a reference mirror and is then reflected back serving as the reference beam. The transmitted part (blue), after passing through a compensation lens, is incident onto the optical surface under test and then reflected back along the same path. The reference beam and the measurement beam meet at the beam splitter, interfere with each other, and produce fringe patterns with periodic intensity modulation. The fringe patterns carry the surface figure information of the optical component under test and are finally recorded by a charge-coupled device (CCD) detector. Phase shifting is achieved by moving the reference mirror a certain amount with a piezoelectric transducer (PZT).

Figure 34. A phase-shifting Twyman-Green interferometer for optical surface testing. PZT: piezoelectric transducer. CCD: charge-coupled device. Adapted with permission from [141] © The Optical Society.

The intensity of a fringe pattern can be mathematically modeled as [142–144]:

Equation (140)

where a(x, y) and b(x, y) are the background and the modulation terms, respectively, and Φ(x, y) is the phase map to be recovered. There are three unknowns in equation (140), indicating that at least three frames of phase-shifted interferograms are needed to recover the phase function. This is known as the three-step or three-bucket phase demodulation algorithm. In practice, many other phase-shifting algorithms are in use, such as the four-step, five-step, and least-squares algorithms [137, 138, 142]. Herein the four-step algorithm will be introduced. Suppose that four interferograms with phase shifts of 0, π/2, π, and 3π/2 are collected. Their intensity functions can be written as:

Equation (141)

The phase, Φ(x, y), can be simply calculated as:

Equation (142)

Since the value of the inverse tangent function is within [−π, π], the calculated phase, Φ(x, y), is typically wrapped. An extra process, called phase unwrapping [145, 146], is needed to yield a continuous phase map.
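The four-step demodulation and the subsequent unwrapping can be sketched on synthetic data. The example assumes the fringe model $I_n = a + b\cos(\Phi + \delta_n)$ with shifts $\delta_n$ = 0, π/2, π, 3π/2, so that $I_4 - I_2 = 2b\sin\Phi$ and $I_1 - I_3 = 2b\cos\Phi$; if a different sign convention is used for the phase shift, the numerator changes sign. The one-dimensional `numpy.unwrap` stands in for a full 2D phase-unwrapping method.

```python
import numpy as np


def four_step_phase(i1, i2, i3, i4):
    """Wrapped phase from frames shifted by 0, pi/2, pi, 3*pi/2."""
    return np.arctan2(i4 - i2, i1 - i3)


# Synthetic test: a smooth phase ramp plus curvature along one line.
x = np.linspace(-1.0, 1.0, 400)
true_phase = 12.0 * x + 6.0 * x**2
background, modulation = 1.0, 0.7
frames = [background + modulation * np.cos(true_phase + shift)
          for shift in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2)]

wrapped = four_step_phase(*frames)
unwrapped = np.unwrap(wrapped)  # 1D unwrapping; 2D maps need a dedicated method

# Unwrapping recovers the phase only up to a constant multiple of 2*pi.
offset = unwrapped[0] - true_phase[0]
print("constant offset in units of 2*pi:", offset / (2 * np.pi))
```

The background and modulation terms cancel in the two differences, which is why the result does not depend on a(x, y) or b(x, y) as long as b is nonzero.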

Generally, misalignment errors, such as tilt and defocus, are present in the phase function, Φ(x, y), and need to be removed to reveal the true surface figure. This can be achieved by expanding the phase function into finite terms of Zernike polynomials and eliminating the coefficients of the tilt and defocus terms. Mathematically, the phase expansion can be written as:

Equation (143)

where $a_j$ are the expansion coefficients, which can be computed by the least-squares method described in section 2.3.1. The final figure map of the surface under test can be obtained as [141, 147]:

Equation (144)

where $a_0$, $a_1$, $a_2$, and $a_3$ represent the coefficients of the piston, x-tilt, y-tilt, and defocus terms of the Zernike expansion, respectively. The Zernike expansion coefficients, $a_j$, can be further used to calculate the PVr (peak-to-valley robust) [148], which is a robust amplitude parameter for describing the figure error of the optical surface under test.
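The removal of the alignment terms can be sketched as a fit-and-subtract step. In this minimal example the four terms are written in non-normalized Cartesian form (1, x, y, 2ρ² − 1), only these four terms are fitted, and the synthetic surface error is an astigmatism-like term; the function name and all numerical values are illustrative assumptions, not the procedure of [141, 147].

```python
import numpy as np


def remove_alignment_terms(phase, mask, x, y):
    """Fit piston, x-tilt, y-tilt and defocus over `mask` and subtract them."""
    rho2 = x**2 + y**2
    terms = [np.ones_like(x), x, y, 2.0 * rho2 - 1.0]
    A = np.column_stack([t[mask] for t in terms])
    coeffs, *_ = np.linalg.lstsq(A, phase[mask], rcond=None)
    fitted = sum(c * t for c, t in zip(coeffs, terms))
    figure = np.where(mask, phase - fitted, np.nan)  # NaN outside the aperture
    return figure, coeffs


n = 129
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
mask = x**2 + y**2 <= 1.0

surface = 0.05 * (x**2 - y**2)  # astigmatism-like figure error (synthetic)
misalignment = 0.3 + 1.2 * x - 0.7 * y + 0.4 * (2 * (x**2 + y**2) - 1)
figure, coeffs = remove_alignment_terms(surface + misalignment, mask, x, y)
print(coeffs)
```

Here the residual equals the synthetic surface error because the astigmatism-like term is orthogonal to the four fitted terms on this symmetric grid; for a general surface, fitting more terms before discarding the first four reduces cross-talk.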

For ease of understanding, the whole procedure for surface figure retrieval from a set of phase-shifted fringe patterns is illustrated in figure 35 and a state-of-the-art commercial Fizeau interferometer for optical testing is shown in figure 36.

Figure 35. Surface figure retrieval from a set of phase-shifted fringe patterns. (a) Phase-shifted interferograms. (b) Demodulated phase. (c) Unwrapped continuous phase. (d) 37 expansion coefficients of the Zernike circle polynomials under the fringe indices. (e) Recovered surface figure. λ: wavelength.

Figure 36. A commercial laser Fizeau interferometer for optical surface testing. Reproduced with permission from Zygo Corporation. ©2021 Zygo Corporation.

4.4. Ophthalmic optics

The eye, like any other optical system, suffers from a number of specific optical aberrations [149]. Aberrations of eyes with refractive errors include lower-order aberrations and higher-order aberrations. Lower-order aberrations, such as myopia, hyperopia, and regular astigmatism, account for approximately 90% of the overall ocular aberration and are the most common causes of visual impairment [150]. In contrast, higher-order aberrations, such as spherical aberration, coma, and trefoil, account for less than 10% of ocular aberrations but may significantly affect visual performance when the pupil is large [149, 151]. Measuring the aberrations of the human eye can provide objective and quantitative data for vision correction and is of critical importance to certain corrective measures [23, 152], such as wavefront-guided refractive surgery [153, 154], which has been a paradigm shift in the field of refractive error correction. The most commonly used tool for the measurement of ocular aberrations is the Shack–Hartmann wavefront slope sensor [155], which was developed by Shack and Platt in the late 1960s [156, 157] and evolved from the Hartmann screen test. The Shack–Hartmann wavefront slope sensor can measure wavefronts like an interferometer but uses less expensive optical components.

The setup and principle for aberration measurement of the eye using a Shack–Hartmann wavefront slope sensor are illustrated in figure 37. An incident infrared light beam is reflected by a beam splitter and focused onto the retina. Since the beam diameter is small (approximately 1 mm), the light spot on the retina can be regarded as a point source independent of eye aberrations. This point source emits spherical waves, which will be affected by eye aberrations and become aberrated planar waves when leaving the eye. The aberrated waves pass through the beam splitter and are detected by a Shack–Hartmann wavefront slope sensor, which consists of a 2D microlens array and a CCD camera located at the focal plane of the microlenses. In this arrangement, the whole aberrated wavefront is actually divided into many small areas, which can be locally treated as plane waves and are individually focused onto the CCD camera. When the eye is aberration-free, the outgoing wavefront from the eye is planar and the CCD camera detects a regular spot pattern (shown as black dots in figure 37(b)). In contrast, when the eye has aberrations, the outgoing wavefront from the eye is aberrated and individual parts of the wavefront are tilted with respect to the reference wavefront, resulting in displaced focal spots (shown as red dots in figure 37(b)) after being imaged onto the CCD camera. The magnitude of the position shifts of the displaced spots reflects the tilt amount of the measured wavefront and can be used to recover the original wavefront using the algorithm described below.

Figure 37. Measurement of ocular wavefront aberration with a Shack–Hartmann wavefront slope sensor. (a) Setup schematic. (b) Principle of Shack–Hartmann wavefront sensing. Aberrated wavefront passing through the lenslet array is focused on a CCD. f: focal length of the lenslet array; Δx and Δy: shift of the actual spot with respect to the ideal spot in the x and y directions, respectively.

In a Shack–Hartmann wavefront slope sensor, the relationship between the position shift of an actual spot and the slope of an aberrated wavefront can be written as [22, 23]:

Equation (145)

where f is the focal length of the microlens array; Δx and Δy denote the shifts of the actual spot with respect to its ideal position in the x and y directions, respectively. Based on these relationships, the aberrated wavefront, W(x, y), can be recovered using either zonal or modal algorithms [22, 24]. In a modal reconstruction, the wavefront is represented by finite terms of Zernike circle polynomials as:

Equation (146)

where $a_j$ are the expansion coefficients. Taking the derivatives of both sides of equation (146) with respect to x and y at each sampling point gives [23]:

Equation (147)

where K is the total number of sampling points. Substituting equation (145) into equation (147) yields a matrix equation as:

Equation (148)

where:

Equation (149)

Equation (150)

Equation (151)

s is a 2K × 1 column vector containing the measured wavefront slope data, a is a J × 1 column vector containing the unknown Zernike expansion coefficients, and A is a 2K × J coefficient matrix whose elements can be computed using the derivative formulas of Zernike polynomials in equation (42). The unknown a can be computed by matrix inversion as:

Equation (152)

Substituting the expansion coefficients (equation (152)) into equation (146) gives the wavefront aberrations of the eye. The first measurement of ocular aberration using a Shack–Hartmann wavefront slope sensor was performed by Liang et al in 1994 [23]. The wavefront sensor was later improved by increasing the sampling density to provide more complete descriptions of the aberrations of the eye, including irregular and classical aberrations [152]. Since then, measuring ocular aberrations with a Shack–Hartmann wavefront slope sensor has become common in clinical practice. Ocular aberrations can also be measured by a wavefront curvature sensor, in which curvature polynomials can be used to obtain Zernike aberration coefficients [158].
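The modal reconstruction can be sketched with a small synthetic example. It assumes that the slopes follow from the spot shifts as Δx/f and Δy/f, that the lenslet centres are given in normalized pupil coordinates, and that a hypothetical five-term, non-normalized Cartesian basis with analytic derivatives is sufficient; piston is omitted because slope data carry no information about it. The names `GRADIENTS` and `reconstruct_modal` are illustrative.

```python
import numpy as np

# Hypothetical low-order basis: (dZ/dx, dZ/dy) given analytically per mode.
GRADIENTS = [
    (lambda x, y: np.ones_like(x),  lambda x, y: np.zeros_like(x)),  # Z = x
    (lambda x, y: np.zeros_like(x), lambda x, y: np.ones_like(x)),   # Z = y
    (lambda x, y: 4.0 * x,          lambda x, y: 4.0 * y),   # Z = 2(x^2 + y^2) - 1
    (lambda x, y: 2.0 * x,          lambda x, y: -2.0 * y),  # Z = x^2 - y^2
    (lambda x, y: 2.0 * y,          lambda x, y: 2.0 * x),   # Z = 2xy
]


def reconstruct_modal(dx_shift, dy_shift, focal_length, x, y):
    """Solve s = A a in the least-squares sense for the modal coefficients."""
    s = np.concatenate([dx_shift, dy_shift]) / focal_length  # 2K slopes
    A = np.vstack([
        np.column_stack([gx(x, y) for gx, _ in GRADIENTS]),  # K rows: d/dx
        np.column_stack([gy(x, y) for _, gy in GRADIENTS]),  # K rows: d/dy
    ])
    coeffs, *_ = np.linalg.lstsq(A, s, rcond=None)
    return coeffs


# Synthetic lenslet centres inside the unit pupil and noise-free spot shifts.
grid_x, grid_y = np.meshgrid(np.linspace(-0.9, 0.9, 10), np.linspace(-0.9, 0.9, 10))
inside = grid_x**2 + grid_y**2 <= 1.0
x, y = grid_x[inside], grid_y[inside]
true_coeffs = np.array([0.1, -0.2, 0.5, 0.3, -0.15])
f = 5.0e-3  # lenslet focal length (illustrative value)
slope_x = sum(c * g[0](x, y) for c, g in zip(true_coeffs, GRADIENTS))
slope_y = sum(c * g[1](x, y) for c, g in zip(true_coeffs, GRADIENTS))
coeffs = reconstruct_modal(f * slope_x, f * slope_y, f, x, y)
print(coeffs)
```

With measured data, the spot shifts contain noise and the pupil coordinates must be scaled consistently with the physical pupil radius, so the recovered coefficients are estimates rather than exact values.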

In addition to wavefront reconstruction in a Shack–Hartmann wavefront slope sensor, Zernike polynomials are also very useful in the analysis of the aberrations of the eye [159]. Since Zernike polynomials are orthogonal over a circular disk, a wealth of measurable metrics, such as root mean square error, equivalent defocus, and spherocylindrical refraction values [155], can be derived from their expansion coefficients for a more illustrative description of eye aberrations. Consensus recommendations on definitions, conventions, and standards of Zernike polynomials were developed by OSA in 1999 for reporting the optical aberrations of the human eye [27]. The recommendations were later standardized in ANSI Z80.28 [14] and ISO 24157 [15, 17] and accepted by the vision community.

Figure 38 shows a photograph of a commercially available aberrometer, which uses a Shack–Hartmann wavefront slope sensor for aberrations measurement. Zernike polynomials are used for wavefront reconstruction and aberration reporting of the eye [155].

Figure 38. Photograph of a commercially available aberrometer that uses a Shack–Hartmann wavefront slope sensor. Left: Photograph of ZEISS i.Profiler. Right: interface of its analysis software. Reproduced with permission from © ZEISS.

4.5. Adaptive optics

Ground-based telescopes are important tools for exploring the universe. Their image quality is critical to astronomical observations but can be degraded significantly by optical aberrations induced by atmospheric turbulence. Light coming from distant stars arrives as plane waves before reaching the atmosphere of the earth and can theoretically form images limited only by the optical diffraction limit. However, due to the effect of atmospheric turbulence, the light wavefront will be distorted when propagating through the atmosphere, degrading the image quality of a telescope. Adaptive optics is a technology that can improve the performance of an astronomical telescope by compensating wavefront aberrations induced by atmospheric turbulence using wavefront correctors [160, 161]. The technique was first envisioned by Babcock in 1953 [162] but did not come into common usage until the 1990s.

A typical adaptive optics system for an astronomical telescope consists of three principal subsystems: a wavefront sensor, a deformable mirror, and a control computer [163], as illustrated in figure 39. Its working principle is sketched in figure 40 and can be understood as follows. A telescope captures the light from the object of interest, such as a distant star or a satellite. Before being focused on the camera, the light is first sampled by a wavefront sensor, such as a Shack–Hartmann wavefront sensor, and the sampling data are transferred to a control computer. The control computer performs mathematical reconstruction to recover the wavefront distortion of the sampled light and drives a servo system to control the wavefront corrector, such as a deformable mirror, to compensate for the wavefront distortion. After compensation, the wavefront of the light should be less distorted, yielding images with improved quality at the camera. If the light from the object is too faint to determine the wavefront distortion, reference sources, such as nearby natural guide stars or artificial guide stars, can be used to facilitate the correction process. Figure 41 is an example showing that adaptive optics can improve the image quality of a telescope significantly.

Figure 39. Schematic of an astronomical telescope equipped with an adaptive optics system, which contains a deformable mirror, a wavefront sensor, and a control computer. Reproduced with permission from [163].

Figure 40. Working principle of adaptive optics.

Figure 41. Adaptive optics sharpens a telescope's view. (a) No adaptive optics. (b) With adaptive optics. AO: adaptive optics. Reproduced from [164], with permission from Springer Nature.

The use of Zernike polynomials in adaptive optics can be reflected in two aspects. On one hand, Zernike polynomials provide a unique set of functions for the representation, reconstruction, and analysis of wavefront distortions in adaptive optics. Generally, atmospheric turbulence, described by the Kolmogorov model [160], generates smoothly varying optical wavefronts [161], which can be decomposed into different modes by Zernike polynomials [9, 161]. The decomposition makes it possible to use modal algorithms to reconstruct and analyze wavefronts measured by slope-sensitive wavefront sensors, such as the Shack–Hartmann wavefront slope sensor described in section 4.4. On the other hand, Zernike polynomials offer a modal basis for the compensation of wavefront distortions caused by atmospheric turbulence. In practice, both zonal and modal approaches are used for wavefront compensation in adaptive optics [161]. The zonal approach achieves the compensation by an array of independent subapertures while the modal approach compensates for distorted wavefronts over the whole aperture. High-order aberrations are suited to the zonal method, while low-order aberrations can be compensated for more effectively by Zernike-based modal methods. Although Zernike polynomials are not statistically orthogonal and are not independent [161] when used for turbulence compensation, they are near optimum for low-order corrections [9, 161, 165].

To date, adaptive optics has become a standard instrumentation suite and is in widespread use in a range of biomedical and industrial applications, such as retinal imaging [166–168], optical microscopy [169–172], optical tweezers [173], micro/nanofabrication [174–176], and optical storage [177]. Figure 42 shows representative applications of adaptive optics in laser fabrication, optical coherence tomography (OCT), and super-resolution microscopy with improved performance.

Figure 42. Applications of adaptive optics in industry and biomedicine. (a) Ultrafast laser fabrication beneath the surface of diamond without and with aberration correction. Adapted with permission from [174] © The Optical Society. (b) Ultrafast laser writing of structures in glasses without and with wavefront correction. Adapted with permission from [175] © The Optical Society. (c) In vivo OCT tomograms of a normal human eye in the foveal region without and with adaptive optics. Adapted with permission from [178] © The Optical Society. (d) Mitochondria (magenta) and the plasma membrane (green) in a cell ∼150 μm deep in a zebrafish hindbrain imaged without and with aberration correction and deconvolution. Reproduced from [172], with permission from Springer Nature.

4.6. Image analysis

In addition to applications in optics, Zernike polynomials also play an important role in moments-based image analysis. Image moments are real- or complex-valued quantities used to characterize an image function and describe its features. Moments are commonly used in statistics to characterize the distribution of random variables and in mechanics to measure the mass distribution of a body. The use of moments for image analysis is straightforward if we treat the pixel intensity of a binary or gray level image as a random variable. Image moments, M, can be considered as projections of an image function onto a set of basis functions and are mathematically defined as:

Equation (153)

where f(x, y) is the image function and ψ(x, y) is the basis function.

Image moments have been intensively studied in image analysis because they can be used to construct moment invariant features for the description and recognition of deformed objects and patterns. Common moments used for image analysis include geometric moments, rotational moments, complex moments, and orthogonal moments [48, 179]. Among them, geometric moments, which use a power series as the basis function [$\psi(x, y) = x^n y^l$], are the earliest. Based on geometric moments, Hu first introduced moment invariants in 1962 using the theory of algebraic invariants and constructed seven moment invariants under linear transformations (translation, rotation, scaling, and skew) [180]. This work opened the door to image analysis and pattern recognition based on moment invariants. In contrast to geometric moments, orthogonal moments are a family of image moments that use orthogonal polynomials as the kernel. Orthogonal moments have a simple inverse transform and minimum information redundancy compared with geometric moments and are widely used in practice. Zernike moments are an important type of orthogonal moments.

4.6.1. Zernike moments and fast calculation.

Zernike moments for image analysis and pattern recognition were first introduced by Teague in 1980 [12]. They are defined over a unit circle by employing Zernike polynomials as the basis function and can be written as [12]:

Equation (154)

where $V_n^l$ are the non-normalized complex Zernike polynomials (equation (16)), the asterisk denotes the complex conjugate, and $M_{nl}$ is the Zernike moment of degree n with repetition l. Here n is a non-negative integer, l is an integer, and n − |l| ⩾ 0 and is even. The completeness and orthogonality of $V_n^l$ allow for the representation of a square integrable image function, f(x, y), defined on a unit disk using Zernike polynomials as [181, 182]:

Equation (155)

The expression in equation (155) suggests that the image, f(x, y), can be theoretically reconstructed from its Zernike moments. However, the practical importance of this property is limited because moments are generally not a good tool for image compression [183]. For a digital image, equation (154) can be written in discrete form as:

Equation (156)

where ${x^2} + {y^2} \leqslant 1$.
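A discrete Zernike moment of this kind can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the exact form of equation (156): the normalization factor (n + 1)/π, the mapping of pixel centres of a square image onto [−1, 1] × [−1, 1], the explicit pixel-area factor, and the function names `radial_poly` and `zernike_moment` are all choices made here for illustration. The radial polynomial is evaluated from its standard explicit (factorial) expression.

```python
import numpy as np
from math import factorial


def radial_poly(n, m, rho):
    """Explicit radial polynomial R_n^m(rho) for 0 <= m <= n with n - m even."""
    total = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        # The coefficient is an exact integer, so integer division is safe.
        coeff = factorial(n - s) // (
            factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s))
        total += (-1) ** s * coeff * rho ** (n - 2 * s)
    return total


def zernike_moment(image, n, l):
    """Discrete Zernike moment M_nl of a square image (assumed conventions)."""
    if abs(l) > n or (n - abs(l)) % 2:
        raise ValueError("require |l| <= n and n - |l| even")
    size = image.shape[0]
    # Pixel centres mapped onto [-1, 1]; the unit disk is inscribed in the image.
    coords = (2.0 * np.arange(size) + 1.0 - size) / size
    x, y = np.meshgrid(coords, coords)
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    inside = rho <= 1.0  # keep only pixels with x^2 + y^2 <= 1
    # Complex conjugate of V_n^l = R_n^|l|(rho) * exp(i * l * theta).
    kernel = radial_poly(n, abs(l), rho) * np.exp(-1j * l * theta)
    pixel_area = (2.0 / size) ** 2
    return (n + 1) / np.pi * np.sum(image[inside] * kernel[inside]) * pixel_area


# Hypothetical usage: a uniform image has M_00 close to 1 and M_11 close to 0.
uniform = np.ones((64, 64))
print(abs(zernike_moment(uniform, 0, 0)), abs(zernike_moment(uniform, 1, 1)))
```

Pixels outside the inscribed circle are simply discarded in this sketch; other mappings of the image onto the unit disk are possible and lead to slightly different numerical values.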

The fundamental feature of Zernike moments is their rotational invariance. If the image, f(x, y), is rotated by an angle, α, the Zernike moments of the rotated image are given by [184]:

Equation (157)

Equation (157) indicates that rotating an image introduces a phase shift to the Zernike moments but does not alter their magnitudes. It follows that the magnitudes of the Zernike moments, $|{M_{nl}}|$, can be used as rotationally invariant features of the image function, f(x, y). Moreover, since ${M_{n, - l}} = {M_{nl}}^*$ and hence $\left| {{M_{n, - l}}} \right| = \left| {{M_{nl}}} \right|$, only $|{M_{nl}}|$ with l ⩾ 0 need to be used as Zernike feature descriptors, as shown in table 18. Based on the Zernike moments, two primary rotation invariants, $I_{n0}$ and $I_{nl}$, can be constructed as [12, 185, 186]:

Equation (158)

Table 18. List of Zernike moments from degree zero to nine.

Degree   Moments   No. of moments
0   ${M_{00}}$   1
1   ${M_{11}}$   1
2   ${M_{20}},{M_{22}}$   2
3   ${M_{31}},{M_{33}}$   2
4   ${M_{40}},{M_{42}},{M_{44}}$   3
5   ${M_{51}},{M_{53}},{M_{55}}$   3
6   ${M_{60}},{M_{62}},{M_{64}},{M_{66}}$   4
7   ${M_{71}},{M_{73}},{M_{75}},{M_{77}}$   4
8   ${M_{80}},{M_{82}},{M_{84}},{M_{86}},{M_{88}}$   5
9   ${M_{91}},{M_{93}},{M_{95}},{M_{97}},{M_{99}}$   5

The two invariants in equation (158) are the most important and most frequently used Zernike shape descriptors.

The rotation invariance of Zernike moments is illustrated by a numerical experiment. Figure 43 shows a gray image of 512 × 512 pixels and its five rotated versions with rotation angles of 60°, 120°, 180°, 240°, and 300°, respectively. Table 19 presents the magnitudes of the Zernike moments up to the third degree and their statistics (mean μ and standard deviation σ). The standard deviations are small relative to the means (σ/μ below 0.1%), suggesting that Zernike moments are excellent shape descriptors for object recognition. In this example, exact rotation invariance is not obtained because of discretization errors, which have been discussed by several authors [182, 187].

Figure 43. Zernike moments are rotationally invariant. From left to right: the original image and its versions rotated counterclockwise in successive steps of 60 degrees.

Table 19. Magnitudes of some Zernike moments for the rotated images in figure 43 and their statistics.

Rotation   $\left| {{M_{00}}} \right|$   $\left| {{M_{11}}} \right|$   $\left| {{M_{20}}} \right|$   $\left| {{M_{22}}} \right|$   $\left| {{M_{31}}} \right|$   $\left| {{M_{33}}} \right|$
0°   29 686.85   2295.99   4959.74   3543.28   3219.67   1585.55
60°   29 682.93   2296.33   4969.33   3543.65   3217.36   1584.38
120°   29 685.44   2296.22   4964.34   3548.26   3220.37   1586.41
180°   29 686.85   2295.99   4959.74   3543.28   3219.67   1585.55
240°   29 682.93   2296.33   4969.33   3543.65   3217.36   1584.38
300°   29 685.44   2296.22   4964.34   3548.26   3220.37   1586.41
μ   29 685.07   2296.18   4964.47   3545.06   3219.13   1585.47
σ   1.78   0.15   4.29   2.48   1.41   0.91
σ/μ (%)   0.006   0.007   0.09   0.07   0.04   0.06
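A simplified version of this experiment can be sketched in Python. The sketch below is not the procedure used to produce table 19: it assumes a random test image, the (n + 1)/π normalization, a pixel grid mapped onto [−1, 1] × [−1, 1], and a rotation of exactly 90° performed with `numpy.rot90`, which permutes pixels without interpolation. Under these assumptions the magnitudes $|{M_{nl}}|$ should agree to rounding error, whereas rotations by arbitrary angles require resampling and give small deviations of the kind reported in the table. All function names are hypothetical.

```python
import numpy as np
from math import factorial


def radial_poly(n, m, rho):
    """Explicit radial polynomial R_n^m(rho) for 0 <= m <= n with n - m even."""
    total = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        coeff = factorial(n - s) // (
            factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s))
        total += (-1) ** s * coeff * rho ** (n - 2 * s)
    return total


def zernike_moment(image, n, l):
    """Discrete Zernike moment of a square image (assumed (n + 1)/pi factor)."""
    size = image.shape[0]
    coords = (2.0 * np.arange(size) + 1.0 - size) / size  # pixel centres
    x, y = np.meshgrid(coords, coords)
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    inside = rho <= 1.0
    kernel = radial_poly(n, abs(l), rho) * np.exp(-1j * l * theta)
    return (n + 1) / np.pi * np.sum(image[inside] * kernel[inside]) * (2.0 / size) ** 2


def magnitude_features(image, max_degree):
    """Return {(n, l): |M_nl|} for l >= 0 and n - l even, up to max_degree."""
    return {(n, l): abs(zernike_moment(image, n, l))
            for n in range(max_degree + 1) for l in range(n % 2, n + 1, 2)}


rng = np.random.default_rng(0)
image = rng.random((32, 32))  # hypothetical test image
features = magnitude_features(image, 3)
rotated = magnitude_features(np.rot90(image), 3)  # exact 90-degree rotation

max_change = max(abs(features[key] - rotated[key]) for key in features)
print(sorted(features), max_change)
```

The dictionary keys reproduce the first four rows of table 18, namely (0, 0), (1, 1), (2, 0), (2, 2), (3, 1), and (3, 3).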

Fast computation of Zernike moments is an important topic in Zernike moments based image analysis. To compute Zernike moments at lower degrees, one can directly use the explicit definition of Zernike polynomials. However, for Zernike moments at higher degrees, this method is not recommended because it involves the calculation of factorial functions present in the radial polynomial, which is computationally expensive [59]. In these cases, one can employ fast computation algorithms, such as the recursive methods based on the recurrence relationships of the radial polynomials (section 2.2.6) [59, 62, 64], among others [63, 188]. The recursive algorithms are reportedly more efficient and particularly suitable for fast calculation of Zernike moments.
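The idea of replacing factorials with a recurrence can be illustrated with a short Python sketch. The recurrence used here, $R_n^m = \rho \left( R_{n-1}^{|m-1|} + R_{n-1}^{m+1} \right) - R_{n-2}^m$ with out-of-range terms taken as zero and $R_0^0 = 1$, is one relation that can be checked against the explicit formula; it is an assumption of this sketch and is not necessarily the relation given in section 2.2.6 or used in the cited algorithms. The names `radial_direct` and `radial_table` are hypothetical.

```python
import numpy as np
from math import factorial


def radial_direct(n, m, rho):
    """Reference implementation: explicit factorial form of R_n^m(rho)."""
    rho = np.asarray(rho, dtype=float)
    total = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = factorial(n - s) // (
            factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s))
        total += (-1) ** s * coeff * rho ** (n - 2 * s)
    return total


def radial_table(max_degree, rho):
    """Return {(n, m): R_n^m(rho)} for 0 <= m <= n <= max_degree, n - m even.

    Each entry is built from entries of degree n - 1 and n - 2 only, so no
    factorials are evaluated. Missing (out-of-range) entries count as zero.
    """
    rho = np.asarray(rho, dtype=float)
    zero = np.zeros_like(rho)
    table = {(0, 0): np.ones_like(rho)}
    for n in range(1, max_degree + 1):
        for m in range(n % 2, n + 1, 2):
            lower = table.get((n - 1, abs(m - 1)), zero)
            upper = table.get((n - 1, m + 1), zero)
            previous = table.get((n - 2, m), zero)
            table[(n, m)] = rho * (lower + upper) - previous
    return table


rho = np.linspace(0.0, 1.0, 11)
table = radial_table(10, rho)
worst = max(np.max(np.abs(values - radial_direct(n, m, rho)))
            for (n, m), values in table.items())
print(len(table), worst)
```

Because one pass fills the whole table up to the chosen degree, all radial polynomials needed for a set of moments are obtained together, which is where recursive schemes save work compared with evaluating each polynomial from its factorial form.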

As an important type of orthogonal moment, Zernike moments employ Zernike polynomials as the basis function and overcome the drawbacks of geometric moments. They have minimum information redundancy and can represent features in a more efficient, non-redundant way. To date, Zernike moments have been widely used in many applications, such as pattern recognition [184, 189–191], multimedia watermarking [192, 193], and medical image analysis [194, 195].

4.6.2. Pseudo Zernike moments.

Pseudo Zernike moments stem from the pseudo Zernike polynomials (described in section 2.5.5), which are also a set of polynomials orthogonal over a unit circle analogous to the conventional Zernike polynomials. Pseudo Zernike moments, denoted by ${\mathcal{M}_{nl}}$, are defined as [48, 186]:

Equation (159)

where $\mathcal{V}_n^l$ denotes the pseudo Zernike polynomials defined in equation (86), n is a non-negative integer, l is an integer, and n − |l| ⩾ 0. Pseudo Zernike moments are analogous to conventional Zernike moments (equation (154)) and also possess the property of rotation invariance. However, they remove the constraint that n − |l| be even and thus provide more moment invariants up to the same degree n [pseudo Zernike moments contain $(n + 1)^2$ invariants, while Zernike moments have (n + 1)(n + 2)/2]. Pseudo Zernike moments have been shown to be less sensitive to image noise than conventional Zernike moments [48]. They also have fast computation algorithms [196, 197] and have been used in a range of image analysis and pattern recognition applications [198, 199].
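The two counts quoted above can be checked by enumerating the admissible index pairs (n, l) directly. The short Python sketch below does this for all degrees up to a given maximum; the function names are hypothetical and the enumeration counts every pair with both signs of l.

```python
def zernike_indices(max_degree):
    """All (n, l) with 0 <= n <= max_degree, |l| <= n and n - |l| even."""
    return [(n, l) for n in range(max_degree + 1)
            for l in range(-n, n + 1) if (n - abs(l)) % 2 == 0]


def pseudo_zernike_indices(max_degree):
    """All (n, l) with 0 <= n <= max_degree and |l| <= n (no parity rule)."""
    return [(n, l) for n in range(max_degree + 1) for l in range(-n, n + 1)]


for degree in range(4):
    print(degree, len(zernike_indices(degree)), len(pseudo_zernike_indices(degree)))
```

Restricting the Zernike list to l ⩾ 0 leaves ⌊n/2⌋ + 1 moments of each degree n, which matches the last column of table 18.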

5. Discussion and conclusion

Although Zernike polynomials have been successfully used in a range of fields, it is important to be aware of potential pitfalls. First, Zernike circle polynomials are only orthogonal over a unit circle. For systems with non-circular pupils, such as annular and hexagonal pupils, Zernike circle polynomials are neither orthogonal nor representative of balanced aberrations. In these cases, orthonormal polynomials can be constructed by orthogonalizing Zernike circle polynomials across the pupil [37, 105], as discussed in section 3.1. Second, Zernike polynomials are only orthogonal in the continuous sense. In general, they are therefore not strictly orthogonal over a discrete set of data points in numerical simulations or real experiments. Potential errors should be taken into consideration when data points are sparse or unevenly distributed [200, 201]. Third, when comparing Zernike expansion coefficients of two wavefronts, it is important to specify the pupil diameters since the expansion coefficients vary with aperture size. This is especially true when comparing the aberrations of the eye from two measurements. Furthermore, Zernike polynomials may fail to represent some complex, irregular surfaces or shapes using a reasonable number of terms. Representative examples include fabrication errors present in the single-point diamond turning process [19] and irregular corneal aberrations of postsurgical or pathological eyes [202, 203].
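The second pitfall can be made concrete with a small numerical check. The Python sketch below samples four low-order polynomials on a square grid clipped to the unit disk and forms their Gram matrix, which would be the identity if orthonormality held exactly on the samples. The choice of polynomials (piston, two tilts, and defocus in an orthonormal, Noll-type normalization), the grid layout, and the function name `sampled_gram` are assumptions made for illustration.

```python
import numpy as np


def sampled_gram(size):
    """Gram matrix of four low-order Zernike polynomials sampled on a grid.

    The grid has size x size pixel centres on [-1, 1] x [-1, 1]; only points
    inside the unit disk are kept and each point carries equal weight.
    """
    coords = (2.0 * np.arange(size) + 1.0 - size) / size
    x, y = np.meshgrid(coords, coords)
    inside = x ** 2 + y ** 2 <= 1.0
    x, y = x[inside], y[inside]
    rho2 = x ** 2 + y ** 2
    basis = np.stack([
        np.ones_like(x),                    # piston
        2.0 * x,                            # tilt about one axis
        2.0 * y,                            # tilt about the other axis
        np.sqrt(3.0) * (2.0 * rho2 - 1.0),  # defocus
    ])
    return basis @ basis.T / x.size


coarse = sampled_gram(8)    # sparse sampling: 52 points inside the disk
fine = sampled_gram(256)    # dense sampling
print(coarse[0, 3], fine[0, 3])
```

On the sparse grid the piston and defocus terms have a sampled inner product of about 0.04 instead of zero, while terms of opposite symmetry (for example piston and tilt) remain orthogonal because the grid is symmetric. Coupling of this kind is what a least-squares fit, rather than a direct inner product, accounts for when the data are discrete.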

In conclusion, we provide a comprehensive account of the development of Zernike polynomials in the past several decades, including the history, definitions, mathematical properties, roles in wavefront fitting, relationships with associated physical concepts, and connections with other polynomials, and survey their state-of-the-art applications. Potential pitfalls when using the Zernike polynomials are also discussed.

For Zernike polynomials over circular pupils, there are at least six different indexing schemes used by national and international standards, commercial software, and prominent scientists, including the Noll, OSA/ANSI, Fringe (University of Arizona), ISO-14999, Born and Wolf, and Malacara indices. All schemes share the same expression for the radial polynomials, which are eigenfunctions of a second-order rotationally invariant partial differential equation [1, 6]. However, they differ from each other in naming, normalization, and indexing strategies, which are compared and summarized in table 8. Zernike polynomials possess rigorous mathematical properties, such as orthogonality and symmetry, and are closely related to other functions, such as XY monomials, Jacobi polynomials, Legendre polynomials, Bessel functions, and pseudo Zernike polynomials. Their Fourier transform, integral representation, derivatives, and recurrence relations can be explicitly obtained to facilitate solving complex problems. Zernike polynomials are well-suited for wavefront analysis in optics because they correspond closely to Seidel aberrations. The wavefront fitting problem can be solved using the least-squares method. Expansion coefficients represent the standard deviations of corresponding aberration terms (except the piston term) and contain a wealth of information about the wavefront. The expansion coefficients can be easily transformed when the original wavefront is translated, rotated, or resized (section 2.3.2).

Zernike circle polynomials are only orthogonal over the interior of a unit circle. Polynomials orthogonal over non-circular pupils can be constructed based on Zernike circle polynomials. The most commonly used construction approach is the recursive Gram–Schmidt orthogonalization method. Based on this method, orthonormal polynomials over five noncircular pupils, including annular, rectangular, square, hexagonal, and elliptical pupils common in optics, are discussed. The orthonormal polynomials over annular pupils, called Zernike annular polynomials, are reviewed with emphasis due to their practical significance. The Zernike annular polynomials are defined based on the Noll indices and their recurrence relations and Fourier transform are explicitly presented. The Zernike annular polynomials have similar corresponding relationships with Seidel aberrations as Zernike circle polynomials and are well-suited for wavefront analysis over annular pupils.

We also survey state-of-the-art applications of Zernike polynomials in a range of fields, including the diffraction theory of aberrations, optical design, optical testing, ophthalmic optics, adaptive optics, and image analysis. In the diffraction theory of aberrations, Zernike polynomials are used to expand the wavefront aberration at the exit pupil of an optical system and the corresponding expansion coefficients are used to compute the PSF at the image plane according to the (extended) Nijboer–Zernike theory. In optical design, Zernike polynomials are used to analyze the wavefront aberration of a designed optical system, represent freeform surfaces, and facilitate system optimization. In optical testing, Zernike polynomials are used to fit measured interferometric wavefronts and remove misalignment errors. In ophthalmic optics, Zernike polynomials are used to reconstruct ocular wavefronts measured by a Shack–Hartmann wavefront slope sensor and report optical aberrations of the eye. In adaptive optics, Zernike polynomials are used for the representation, reconstruction, and compensation of optical wavefronts distorted by atmospheric turbulence. In image analysis, Zernike polynomials are used to define Zernike moments and pseudo Zernike moments, which possess the property of rotation invariance and can be used as shape descriptors for pattern recognition.

This review aims to clear up the confusion among different indexing schemes, provide a self-contained reference guide for beginners as well as specialists, and facilitate further developments and applications of Zernike polynomials.

Acknowledgments

We thank Xiaoxiao Ma, Heren Li and Jin Wei for proofreading the manuscript. This work is supported in part by the National Natural Science Foundation of China (NSFC) under Grant Nos. 62122072, 12174368, and 61705216, the National Key R&D Program under Grant No. 2022YFA1404400, the Institute of Artificial Intelligence at Hefei Comprehensive National Science Center under Grant No. 21KT016, the Anhui Science and Technology Department under Grant Nos. 202203a07020020 and 18030801138, the Zhejiang Lab under Grant No. 2019MC0AB01, the Research Fund of the Chinese Academy of Sciences (CAS), and the Research Fund of the Double First Class Initiative.

Data availability statement

The data generated and/or analyzed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.

Conflict of interest

The authors declare no conflicts of interest.
