High-NA optical edge detection via optimized multilayer films

There has been a significant effort to design nanophotonic structures that process images at the speed of light. A prototypical example is edge detection, for which photonic-crystal-, metasurface-, and plasmon-based designs have been proposed and in some cases experimentally demonstrated. In this work, we show that multilayer optical interference coatings can achieve visible-frequency edge detection in transmission with high numerical aperture, two-dimensional image formation, and straightforward fabrication techniques, a combination unique among nanophotonic approaches. We show that the conventional Laplacian-based transmission spectrum may not be ideal once the scattering physics of real designs is considered, and that better performance can be attained with alternative spatial filter functions. Our designs, comprising alternating layers of Si and SiO$_2$ with total thicknesses of only $\approx 1\,{\rm \mu m}$, demonstrate the possibility for optimized multilayer films to achieve state-of-the-art edge detection and, more broadly, analog optical implementations of linear operators.


I. INTRODUCTION
In this article, we show that optimally designed multilayer dielectric films can achieve high-numerical-aperture optical edge detection.
In a typical scenario, one might have a scene or object illuminated by a laser or narrow-bandwidth source, in which case an optical device that generates an image of edges offers the prospect for speed-of-light detection [1-11]. Of the many edge-detection designs to date [2-11], none offers all three of: a high-numerical-aperture, two-dimensional field of view; the prospect for realistic fabrication; and transmission-mode operation. Multilayer films are commonly fabricated to high precision for wide-ranging applications [12,13], and guarantee a two-dimensional field of view by their rotational symmetry. Thus the key question is whether they can achieve high-fidelity, transmission-mode edge detection for a large numerical aperture, which we show is indeed possible with structural optimization.
The canonical approach to achieving edge-detection behavior [14,15] is to target a transmission profile that scales with $k_\rho^2$, for in-plane wavevector $k_\rho$, as such a profile will mimic the effect of the in-plane Laplacian operator, $\nabla_\perp^2$, on the incoming field. We show that targeting such a quadratic profile with a multilayer structure can successfully produce an effective design. Yet the design is imperfect, and the requirement of a $k_\rho^2$ profile is an over-prescription of the response function. Any transmission response that acts as a high-pass filter, i.e., that filters out small spatial frequencies (corresponding to nearly constant in-plane spatial modes), can produce high-quality edges. To demonstrate this, we show that a transmission profile that scales as the cube of the incident angle, $\sim \theta^3$, can produce even higher-quality images for the same design parameters. Our designs offer the highest theoretical performance to date, should be straightforward to fabricate, and reveal the potential of such multilayer structures for analog optical devices.
The essential feature of optical analog edge detection, as shown in Fig. 1, is to engineer the light field of a coherent image, isolating its edges in the transmission or reflection spectrum of an optical device. A classic approach [16] to analog edge detection is to use a lens to Fourier transform the incoming waves and an aperture to filter out the low in-plane wavevector components, with two free-space propagation regions that allow the wave field to evolve into its Fourier and inverse Fourier transforms. The key drawback is that the setup must be large and bulky to accommodate the free-space propagation. An emerging alternative is to use coherent scattering effects to isolate edges in a more compact device architecture, including photonic-crystal slabs [3,9,10], dielectric metasurfaces [5,6,8], plasmonic films [2,7], dielectric interfaces [11], and split-ring-resonator metamaterials [4]. The incoming field, for a given polarization at a frequency of interest, can be written as a linear combination of plane waves, $\int E_{\rm in}(\mathbf{k}_\rho)\, e^{i(\mathbf{k}_\rho \cdot \boldsymbol{\rho} + k_z z)}\, {\rm d}\mathbf{k}_\rho$. One method to identify edges is to find a structure whose wavevector-dependent transmission or reflection mimics the in-plane Laplacian operator, $\nabla_\perp^2$. In spatial Fourier space, the Laplacian corresponds (up to an overall sign) to multiplying the incoming plane waves by $k_\rho^2$, suppressing low spatial frequencies relative to high ones. The effectiveness of the Laplacian can be attributed to the fact that edges are high-spatial-frequency components of images, whereas a low-contrast background comprises primarily low spatial frequencies.
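The correspondence between the Laplacian and a $k_\rho^2$ filter follows in one step from the plane-wave expansion above. The short derivation below is a sketch at a fixed plane $z$; the symbol $\tilde{E}$ for the plane-wave amplitudes at that plane is our notation, not the original's.

```latex
E(\boldsymbol{\rho}) = \int \tilde{E}(\mathbf{k}_\rho)\, e^{i \mathbf{k}_\rho \cdot \boldsymbol{\rho}}\, {\rm d}\mathbf{k}_\rho
\quad\Longrightarrow\quad
\nabla_\perp^2 E(\boldsymbol{\rho})
  = \int \left(-k_\rho^2\right) \tilde{E}(\mathbf{k}_\rho)\, e^{i \mathbf{k}_\rho \cdot \boldsymbol{\rho}}\, {\rm d}\mathbf{k}_\rho ,
\qquad k_\rho^2 = k_x^2 + k_y^2 .
```

A uniform background ($k_\rho = 0$) is therefore removed entirely, while sharp features with large $k_\rho$ are weighted most strongly. The overall minus sign does not affect the output intensity.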

II. OPTIMIZATION METHOD
Multilayer films (i.e., optical interference coatings) have well-established fabrication techniques [17,18], and their optical response is necessarily isotropic under rotations around their propagation axis. Thus if their transmission coefficients can be optimized to have the right profile, they can simultaneously satisfy the three key requirements (high-NA 2D field of view, simple fabrication, and transmission-mode operation). We define a target transmission coefficient, $t_{\rm target}$, as a function of angle (or, equivalently, in-plane wavevector), which serves as the ideal transmission function for edge detection. We take the allowed materials in the multilayer to be given, and use the thicknesses of the corresponding layers, $w_\ell$ for each layer $\ell$, as the designable degrees of freedom. Our optimization problem, then, is to minimize the error between the designed and targeted transmission coefficients over a sufficiently dense discrete set of incoming angles $\theta$ [Eq. (1)], where $t(\theta; w)$ is the transmission coefficient of a given multilayer stack with layer thicknesses $w$. One can optimize the transmission coefficient for edge-image formation at any output plane beyond the multilayer; to demonstrate how compact this approach can be, we take the image plane to be the exterior of the rear surface itself.
For optimization of a large number of layers, one must be able to rapidly compute gradients of the objective function, Eq. (1), with respect to the many degrees of freedom. Here we briefly outline how the gradients are computed, using a method similar to that of the "needle" approach to multilayer-film design [19-22]. In a multilayer medium, the continuous translational and rotational symmetry prevents coupling between different in-plane wavevectors.
At each wavevector, the standard matrix approach [12] connects the forward- and backward-going wave amplitudes in the incident region to the equivalent amplitudes in the transmission region through matrices $P_\ell$ and $D_\ell$ that represent propagation through, and interface reflections at, layer $\ell$. The reflection coefficient $r$ and transmission coefficient $t$ satisfy the matrix equation of Ref. [12], Eq. (2), in which the incident wave is normalized to unit amplitude, and the reflection coefficient, transmission coefficient, and the matrices $P_\ell$ and $D_\ell$ all vary with wavevector $k_\rho$. One can then solve for $r$ and $t$ from the components of the total matrix $M$; in particular, $t$ is given by $t = 1/M_{11}$ (Ref. [12]). The derivative of $t$ with respect to the thickness of layer $i$ is then given by ${\rm d}t/{\rm d}w_i = -(1/M_{11}^2)\, {\rm d}M_{11}/{\rm d}w_i$. A change in the thickness of layer $i$, while keeping all other layer thicknesses fixed, does not affect interface transmission and reflection, and only changes the propagation matrix $P_i$ in the product of Eq. (2). The derivative of the $M$ matrix, Eq. (3), is therefore the same matrix product with $P_i$ replaced by ${\rm d}P_i/{\rm d}w_i$, which can be rapidly computed.
Figure 2 demonstrates the capability for high-efficacy edge detection with a computationally optimized multilayer stack. We consider up to 20 alternating layers of Si and SiO$_2$, with refractive indices $3.77 + 0.01i$ and $1.47$, respectively, for light incident at 700 nm wavelength. We choose materials such as silicon for their large refractive indices, which facilitate stronger interactions with light [23] and high-performance designs. Although one might be concerned about the lossiness of silicon, the designs are ultra-thin, with total silicon thicknesses on the order of half a micron, well below the > 5 µm absorption depth of silicon at 700 nm wavelength. To perform the optimization, we use the gradient computed via Eq. (3) in a local, gradient-based interior-point method [24-27] that is run until convergence.
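The transfer-matrix evaluation of $t$ and its thickness gradient can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the designs: it assumes the standard form $(1, r)^T = M\,(t, 0)^T$ with $M = D_0^{-1} \left[\prod_\ell D_\ell P_\ell D_\ell^{-1}\right] D_{N+1}$, an $e^{-i\omega t}$ time convention, air on both sides, and a least-squares objective $\sum_\theta |t(\theta; w) - t_{\rm target}(\theta)|^2$ standing in for Eq. (1). All function names, the four-layer stack, and its thicknesses are hypothetical; only the refractive indices and wavelength come from the text.

```python
import numpy as np

def dyn_matrix(n, kz, k0, pol):
    """Dynamical matrix mapping (forward, backward) amplitudes to tangential fields."""
    if pol == "s":
        return np.array([[1, 1], [kz / k0, -kz / k0]], dtype=complex)
    cos_t = kz / (n * k0)  # cosine of the (possibly complex) angle in the medium
    return np.array([[cos_t, cos_t], [n, -n]], dtype=complex)

def transmission_and_gradient(w, n_layers, theta, wavelength, pol="p"):
    """Return t and dt/dw_i for a stack in air at incidence angle theta (radians)."""
    k0 = 2 * np.pi / wavelength
    k_rho = k0 * np.sin(theta)  # in-plane wavevector, conserved across layers
    kz_of = lambda n: np.sqrt((n * k0) ** 2 - k_rho ** 2 + 0j)

    layer, dlayer = [], []
    for n, d in zip(n_layers, w):
        kz = kz_of(n)
        D = dyn_matrix(n, kz, k0, pol)
        D_inv = np.linalg.inv(D)
        phase = np.exp(1j * kz * d)
        P = np.diag([1 / phase, phase])                    # propagation matrix
        dP = np.diag([-1j * kz / phase, 1j * kz * phase])  # its thickness derivative
        layer.append(D @ P @ D_inv)
        dlayer.append(D @ dP @ D_inv)

    D_air = dyn_matrix(1.0, kz_of(1.0), k0, pol)
    D_air_inv = np.linalg.inv(D_air)

    def m11(mats):
        M = D_air_inv
        for A in mats:
            M = M @ A
        return (M @ D_air)[0, 0]

    M11 = m11(layer)
    t = 1 / M11
    # Only the factor belonging to layer i changes when w_i changes.
    grad = np.array([-m11(layer[:i] + [dlayer[i]] + layer[i + 1:]) / M11 ** 2
                     for i in range(len(layer))])
    return t, grad

def objective_and_gradient(w, n_layers, thetas, t_target, wavelength, pol="p"):
    """Assumed least-squares objective and its gradient with respect to w."""
    f, g = 0.0, np.zeros(len(w))
    for theta, tt in zip(thetas, t_target):
        t, dt = transmission_and_gradient(w, n_layers, theta, wavelength, pol)
        f += abs(t - tt) ** 2
        g += 2 * np.real(np.conj(t - tt) * dt)
    return f, g

# Hypothetical 4-layer Si/SiO2 stack (thicknesses in nm), not the published design.
n_si, n_sio2 = 3.77 + 0.01j, 1.47
n_layers = [n_si, n_sio2, n_si, n_sio2]
w = np.array([50.0, 80.0, 60.0, 100.0])
thetas = np.deg2rad(np.linspace(0, 80, 41))
t_target = (thetas / thetas[-1]) ** 3
f, g = objective_and_gradient(w, n_layers, thetas, t_target, wavelength=700.0)
```

The returned pair `(f, g)` is in the form expected by most gradient-based optimizers. For clarity the sketch recomputes the full matrix product for every layer; caching prefix and suffix products would reduce the cost from quadratic to linear in the number of layers.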
Simulations of image formation are done by Fourier transforming the incident field, multiplying by the wavevector-dependent transmission, and then inverse Fourier transforming at the output plane. As our input we consider a Yale logo, containing edges oriented in almost every direction. We take the image to occupy a numerical aperture of ≈ 0.98, corresponding to polar angles ranging from 0° to 80°. In one set of optimizations, we target a Laplacian-based transmission coefficient function given by $t_{\rm target}(\theta) = \alpha k_\rho^2$, where $\alpha$ is a constant optimization hyperparameter. An ideal design would have both polarizations follow the lineshape of $t_{\rm target}$; in practice, however, we find that such designs appear to be impossible to achieve in the multilayer form factor. An alternative solution, at the cost of halving the brightness, is to fully reflect one polarization and perform the filtering operation in the transmission spectrum of the other polarization. (Note that the orientations of the usual s and p polarizations of plane waves are wavevector-dependent, and a linearly polarized incoming wave contains each.) After running the optimization for many values of $\alpha$ and many layer-thickness initializations, the design shown in Fig. 2(b), slightly more than 1 µm in total thickness, emerges as optimal (cf. SM for detailed design data). Figure 2(c) shows the actual transmission coefficient of the design (red and green lines), compared to the target (blue). The transmission of s-polarized waves is nearly 0 for all incident angles (red dashed line), so only p-polarized waves (red solid line) contribute to the final edge detection. The targeted quadratic $k_\rho^2$ distribution is not exactly mimicked, but the variation in phase is less than $\pi$ over the whole angular range, which ensures effective interference at the image plane.
Interestingly, the behavior of the transmission in the ultra-high-angle range, between 80° and 90°, shows the difficulty of designing high-NA edge-detection devices in transmission mode: any multilayer dielectric film will tend towards perfect reflection at glancing incidence (90°), in which case a quadratic transmission profile that increases from 0 to a maximum at, say, 80° is highly unnatural and hard to achieve. Conversely, reflection-mode devices are significantly simpler to design. Yet transmission is the ideal operational mode for high-speed edge detection, and the designs presented in Fig. 2 achieve this with high fidelity at high NA (0.98). The intensity of the output field is shown in Fig. 2(d), where one can see that the edges, oriented in all directions, are clearly resolvable.
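The image-formation simulation described in this section (Fourier transform, multiplication by the angle-dependent transmission, inverse Fourier transform) can be sketched as below. This is a scalar illustration only: it ignores the s/p polarization decomposition, uses an ideal cubic target in place of the transmission of an actual multilayer, and discards components beyond the maximum angle. The function name `form_image`, the pixel size, the square test object, and the normalization $\alpha = 1/\theta_{\max}^3$ are all assumptions made for the example.

```python
import numpy as np

def form_image(field, dx, wavelength, t_of_theta, theta_max):
    """Filter a 2D field in Fourier space with an angle-dependent transmission."""
    ny, nx = field.shape
    k0 = 2 * np.pi / wavelength
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    k_rho = np.hypot(kx[None, :], ky[:, None])         # in-plane wavevector magnitude
    theta = np.arcsin(np.clip(k_rho / k0, 0.0, 1.0))   # polar angle of each plane wave
    # Components outside the design range are simply discarded (an assumption).
    t = np.where(theta <= theta_max, t_of_theta(theta), 0.0)
    return np.fft.ifft2(t * np.fft.fft2(field))

theta_max = np.deg2rad(80.0)                            # NA = sin(80 deg), about 0.98
cubic_target = lambda theta: (theta / theta_max) ** 3   # hypothetical normalization

# Toy input: a bright square on a dark background (lengths in micrometers).
image = np.zeros((128, 128))
image[40:88, 40:88] = 1.0
out = form_image(image, dx=0.55, wavelength=0.7,
                 t_of_theta=cubic_target, theta_max=theta_max)
intensity = np.abs(out) ** 2
```

To simulate a specific design rather than an ideal target, `t_of_theta` would be replaced by an interpolation of the computed multilayer transmission coefficient, applied separately to the s and p components of the incident field.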

III. RESULTS
As mentioned in the introduction, edge detection requires good filtering of the spatial frequencies of the incoming wave, but such filtering does not necessarily require the $\sim k_\rho^2$ dependence of the Laplacian operator. The discrepancy between the optimal-design transmission coefficient and that of the target in Fig. 2(c) suggests that it may be impossible for multilayer structures to achieve perfect quadratic scaling of their transmission coefficients alongside minimal phase variations, which would imply that the optimal design of Fig. 2(b) may have paid some penalty in attempting to minimize the difference with a quadratic target, instead of simply aiming for good filtering properties.
In Fig. 2(e), we show an alternative design that emerged for a target transmission coefficient given by $t_{\rm target}(\theta) = \alpha \theta^3$ (cf. SM). We chose this function to more closely match the transmission curves of real multilayer designs, and Fig. 2(f) shows the much closer match between the transmission coefficient of the new optimal design and that of the new target. In Fig. 2(g) the output-field intensity again shows very good edge resolution, although it is difficult by eye to detect which of the designs of Fig. 2(b,e) is better.
Figure 3 quantitatively compares the edge image quality of the two designs of Fig. 2(b,e). We measure the width of every edge that is present in the output, and compare between the two designs (also using the ground truth that is known from the input image). Figure 3(a,b) demonstrates prototypical results, where the design using the $k_\rho^2$ target transmission exhibits thicker edges, as measured by the number of pixels above the background intensity (dashed lines), than its counterpart designed with $t_{\rm target} \sim \theta^3$. The histogram of Fig. 3(c) shows the relative numbers of edges with a given width (in terms of number of pixels) for the two designs, with the $\theta^3$ design offering slightly better performance and a smaller average edge width. (The number of edges missed entirely is nonzero but very small for both designs.) This demonstrates that, although the Laplacian operator may be a good starting point for edge-detection design, it is neither required nor necessarily globally optimal, which is likely true for alternative edge-detection approaches (metasurface, plasmonic, etc.) as well.
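The edge-width metric used above (number of pixels above the background intensity) can be illustrated with a short sketch. The function below is a hypothetical reading of that metric, assuming a one-dimensional intensity profile taken across a single edge and a fixed background threshold; the two sample profiles are made up for illustration and are not data from Fig. 3.

```python
import numpy as np

def edge_width(profile, background):
    """Count contiguous pixels around the peak whose intensity exceeds `background`."""
    profile = np.asarray(profile, dtype=float)
    peak = int(np.argmax(profile))
    if profile[peak] <= background:
        return 0  # the edge is missed entirely
    left = peak
    while left > 0 and profile[left - 1] > background:
        left -= 1
    right = peak
    while right < len(profile) - 1 and profile[right + 1] > background:
        right += 1
    return right - left + 1

# Illustrative profiles across one edge (arbitrary intensity units).
narrow = [0.0, 0.1, 0.6, 1.0, 0.7, 0.1, 0.0]
wide = [0.0, 0.3, 0.6, 1.0, 0.7, 0.4, 0.0]
widths = [edge_width(p, background=0.2) for p in (narrow, wide)]
histogram = np.bincount(widths)  # number of edges found at each width
```

Collecting such widths over every edge in the output image, and binning them as in the last line, gives a histogram of the kind described for Fig. 3(c).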

IV. DISCUSSIONS
In Fig. 4 we compare the multilayer designs of Fig. 2(b,e) to other recent state-of-the-art designs [2-11]. The designs in the bottom region (white) of the figure operate in reflection mode, which is easier to design but not ideal to implement. The design denoted by the black cross is multilayer-film-based, but it has large phase variations in the transmission coefficient as a function of angle (> 180°), which significantly blur the edge resolution (cf. SM). The designs in the middle region (light grey) work in transmission mode but only over a limited range of azimuthal angles. (We also note that the metasurface design of Ref. [6], denoted by a purple hexagon, requires lenses and thus does not offer space savings.) The designs in the top region (dark grey) all work in transmission mode and have rotational symmetry, under continuous or discrete rotations, that makes their scattering response independent (or nearly so) of azimuthal angle. Of these designs, three [3,9,10] operate only for low numerical aperture (which corresponds to operation over only a narrow range of wavevectors), while the fourth [4] is difficult or impossible to fabricate. The optimized multilayer films of this work show a clear advantage along these three dimensions.
Finally, we analyze the effects of fabrication errors on the designed structures. We simulate random errors in the layer thicknesses by sampling from a Gaussian distribution $\mathcal{N}(0, \sigma)$, where 0 is the mean shift from the desired thicknesses and $\sigma$ is the standard deviation. We find that for the design with $t_{\rm target} \propto k_\rho^2$, the objective function in Eq. (1) for p-polarization increases on average by at most 25% when $\sigma \leq 3$ nm, while for s-polarization the transmission hardly changes from 0 at all angles. For the design with $t_{\rm target} \propto \theta^3$, the equivalent threshold for a 25% increase is $\sigma \leq 1.8$ nm. The performance error varies smoothly with the fabrication error, and such tolerances are far larger than the angstrom-level thickness errors of state-of-the-art LPCVD fabrication [28-30]. Designs based on alternative, easier-to-synthesize materials can be obtained using the same techniques described above and achieve similar performance.
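The tolerance analysis above amounts to a Monte Carlo average over Gaussian thickness perturbations. The sketch below shows that procedure in isolation; the function name, the number of samples, the clipping of negative thicknesses, and especially the quadratic `toy_objective` are assumptions for illustration. In an actual study the objective would be the transmission error of Eq. (1) evaluated for the perturbed stack.

```python
import numpy as np

def mean_relative_increase(objective, w_opt, sigma, n_samples=200, seed=0):
    """Average relative increase of `objective` under N(0, sigma) thickness errors."""
    rng = np.random.default_rng(seed)
    w_opt = np.asarray(w_opt, dtype=float)
    f0 = objective(w_opt)
    rel = []
    for _ in range(n_samples):
        dw = rng.normal(0.0, sigma, size=w_opt.shape)  # independent error per layer
        w = np.clip(w_opt + dw, 0.0, None)             # thicknesses cannot be negative
        rel.append((objective(w) - f0) / f0)
    return float(np.mean(rel))

# Hypothetical stand-in: a quadratic bowl around a made-up design (thicknesses in nm).
w_design = np.array([50.0, 80.0, 60.0, 100.0, 70.0])
toy_objective = lambda w: 1.0 + 0.01 * np.sum((w - w_design) ** 2)

for sigma in (1.0, 2.0, 3.0):
    print(sigma, mean_relative_increase(toy_objective, w_design, sigma))
```

Sweeping `sigma` and locating where the mean increase crosses a chosen level (25% in the text) yields a tolerance figure of the kind quoted above.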

V. OUTLOOK
Looking forward, computational optimization of multilayer structures may enable a wide range of analog linear operators, beyond just edge detection. Instead of defining specific target transmission coefficient profiles, as in Eq. (1), one could utilize a data-driven approach that matches the desired features in a given scattered field with those of known image/field pairs. We implemented such an approach specifically for edge detection, but the performance of the optimal designs was nearly equivalent to the best designs already shown here. Another possible direction to explore is towards significantly thicker multilayer designs, which may enable efficient multi-frequency performance, though in such a case local-optimization techniques might falter and require global-optimization techniques instead.

Figure caption (fragment): ... angular range of the input image taken to correspond to NA = 0.98. The edges obtained from the plasmonic film design in Ref. [1] (green bars) have an average width of 5.7 pixels and a modal width of 6 pixels, a few times larger than the output edges from our transmission-mode multilayer designs (blue and red bars).