Shapelet-based orientation and defect identification method for nanostructured surface imaging

Structure-property relations are of fundamental importance for continued progress in materials research. Determining these relationships for nanomaterials introduces additional challenges, especially when nanostructure is present, either through self-assembly or nano-lithographic processes. Recent advances have been made for quantification of nanostructured surfaces, for which many robust experimental imaging methods exist. One promising approach is based on the use of shapelet functions for image analysis, which may be used as a reduced basis for surface pattern structure resulting from a broad range of phenomena (e.g. self-assembly). These shapelet-based methods enable automated quantification of nanostructured images, guided by the user/researcher, providing pixel-level information of local order without requiring detailed knowledge of order symmetries. In this work, enhancements to the existing shapelet-based response distance method are developed which enable further analysis of local order, including quantification of local orientation and identification of topological defects. The presented shapelet-based methods are applied to a representative set of images of self-assembled surfaces from experimental characterization techniques including scanning electron microscopy, atomic force microscopy, and transmission electron microscopy. These methods are shown to be complementary in implementation and, importantly, provide researchers with a robust and generalized computational approach to comprehensively quantify nanostructure order, including local orientation and boundaries within well-aligned grains.


Introduction
In contrast to relatively reliable and robust image-based characterization methods for nanomaterials [1], post-processing methods for this class of images are in a far less advanced state [2]. Post-processing methods enabling quantification of nanostructure are vital for scientific analysis and the development of structure-property relationships. These quantification tasks are relatively challenging due to the presence of complex nanostructures resulting from self-organization and self-assembly processes [2], in addition to the presence of measurement uncertainty and noise. However, analyzing ordered nanostructures and defects associated with this order is important to identify connections between nanomaterial synthesis, function, and application [3].
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Recently there has been an increased focus on the presence of defects in nanomaterials and their resulting properties. Defects can be generally classified as orientational (disclinations) and translational (dislocations) [4]. Directly related to the presence of defects is variation of local nanostructure orientation, where interfaces between well-ordered regions (e.g. grains) are composed of defect structures (e.g. grain boundaries). These types of defects and defect structures, amongst others, are often inherent characteristics of self-assembly processes [5]. For example, they are intrinsic to liquid crystal phases and self-assembly materials such as colloids and block copolymers [6].
Variation and manipulation of defects in nanostructured materials has been shown to enable the 'tuning' of material properties (electrical, optical, magnetic, chemical) to obtain desired functionality [6,7]. This approach, so-called defect engineering, has seen success in crystalline inorganic hard matter and organic hard materials [6]. Although defect engineering in soft matter is less prevalent, recent work in this field has demonstrated significant promise. Defect engineering can be important for specific self-assembly applications, such as microelectronics, where material requirements may involve extremely low defect densities or specific defect placement within integrated circuits [5].
A recently developed method for analysis of nanostructured surfaces (e.g. two-dimensional order) utilizes shapelet functions [8], originally developed for the analysis of galactic images. This shapelet-based response distance approach [9,10] has proven useful for automated quantification of nanostructured surfaces. Suderman et al [9] showed that a subset of polar shapelets [11] can be used as a local reduced basis to extract and compare a broad range of pattern orders observed through topographical imaging of nanostructured surfaces, independent of the interactions resulting in order (self-assembly, nanoimprint lithography, electrochemical etching, etc). This method is differentiated from past approaches, such as bond-orientational order analysis [2,12], as it does not require image thresholding or segmentation and instead resolves pixel-level detail of local order. This response distance method [9] is supervised, requiring the researcher or user to specify a reference subdomain where the pattern is uniform (undeformed), using this information to analyze the entire image. Results of applying the response distance method to simulated nanomaterial surface images are shown in figure 1. While these example images were generated from deterministic simulations of self-assembly, the response distance method was recently enhanced to enable analysis of images with uncertainty and measurement noise [10], such as those resulting from experimental characterization techniques including scanning electron microscopy, atomic force microscopy, and transmission electron microscopy.
The computational complexity of the original response distance method [9] was significantly reduced in later work [10], which increased the robustness of the analysis in the presence of measurement uncertainty (noise). While these recently developed shapelet-based response distance methods have been applied to quantify the local degree of order, their utility to quantify local pattern orientation and identify topological defects has not been adequately explored.
The overall objective of this work is to extend the shapelet-based response distance method [9] to include quantification of local pattern orientation and identification of topological defects. Given that the current approach is supervised by requiring user input for a well-ordered subdomain, enhancements requiring minimal user input are prioritized, with specific objectives as follows:
• Analyze the effect of incorporating higher-order shapelets within the existing response distance method [9].
• Utilize shapelet orientation at maximum response, resulting from steerable filter theory [13], to determine local pattern orientation.
• Develop a defect identification method for self-assembly images which directly identifies topological defects and defect structures.
The paper is organized into the following sections: section 2 presents a background on the shapelet-based method for self-assembly imaging. Section 3 presents three sets of results: (a) incorporation of higher-order shapelets into the response distance method [9], (b) quantification of local pattern order using shapelet orientation at maximum response, and (c) the development of a defect response distance method to identify topological defects with validation using stripe, square, and hexagonal pattern orders. Section 4 presents conclusions and future work.

Shapelet-based pattern analysis
Shapelets are a class of orthogonal functions originally developed for the decomposition and reconstruction of images of galaxies [8]. However, they have recently been shown to be useful for analyzing patterns present in self-assembly imaging with a generalized method first presented by Suderman et al [9]. Several shapelet formulations have been developed, with polar shapelets [11] found to be particularly suited for quantification of local pattern order.
Polar shapelets have several parameterizations, with parameters n and m corresponding to modulation in the radial and polar dimensions, respectively. Past work focused on self-assembly considers a subset of shapelets where n = 0, which results in rapid radial decay. This subset of shapelets was further reformulated so that the radial scale is constant for increasing m and they are orthonormal, similar to an alternative formulation presented in equation (8) of [11]. The constant-scale orthonormal shapelet formulation used in this work follows [10], where the radial (constant) scale of the shapelet is λ and a rescaling of the standard polar shapelet is achieved through reformulation of the scale parameter, with f being a geometric factor mapped to specific m values; see [10] for more details. Figure 2 shows the real and imaginary components for orthonormal polar shapelets where n = 0 and m ∈ [1, 8] with a length-to-image scale ratio of λ/N = 0.25.
As is standard practice in image processing, shapelet functions are treated as kernels through projection onto a discrete domain, which then enables them to be used in convolution operations with target images. However, shapelet orientation has a strong effect on the resulting convolution, which was addressed by Suderman et al [9] who showed that shapelets are steerable [13]. Steerable kernels/filters may be decomposed into the following general form,
$$g^{\theta}(r, \phi) = \sum_{j=1}^{M} k_j(\theta)\, g_j(r, \phi),$$
where θ is the angle of rotation of the filter $g^{\theta}$, $k_j(\theta)$ are rotation-dependent coefficients, and $g_j(r, \phi)$ are basis functions from a minimum set of M. The use of steerable formulations of filters enables computationally efficient ($\mathcal{O}(1)$) analysis of local orientation resulting in maximum (and minimum) response [13], section V.A.
Setting the first derivative (with respect to orientation θ) of the convolution of the steerable filter with an image function I to zero, and solving the resulting expression, yields both the maximum response of the steerable filter and the corresponding orientation at maximum response [14],
$$\frac{\partial}{\partial \theta}\left[ g^{\theta} \otimes I \right] = 0,$$
where $g^{\theta}$ is the steerable filter rotated by θ, I is the target image (function), and ⊗ represents the convolution operation. Suderman et al [9] demonstrated that shapelets are steerable, noting that this steerable formulation is real-valued. Based on [13,14] and solving equation (4) for the steerable shapelet in equation (5) (see Supplementary data for more details), the maximum shapelet response S_max (with respect to orientation θ) and the orientation at maximum response (θ_max) for steerable shapelets are expressed in terms of C_i, the result of convolving a steerable basis function with I, the target image (function).
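The steering step can be illustrated with a minimal sketch. It assumes a simplified, unnormalized m-fold kernel, (r/β)^m exp(-r²/2β²) with cos(mφ) and sin(mφ) angular parts, standing in for the orthonormal formulation of [10]; the names `shapelet_basis`, `steered_response`, and `beta` are hypothetical and are not taken from the reference implementation [20]. Under that assumption the kernel rotated by θ equals cos(mθ) g_cos + sin(mθ) g_sin, so the response is cos(mθ) C_1 + sin(mθ) C_2, and setting its θ-derivative to zero gives a maximum of sqrt(C_1² + C_2²) at mθ = atan2(C_2, C_1).

```python
import numpy as np
from scipy import ndimage


def shapelet_basis(m, beta, half_width):
    """Two real basis kernels of a simplified m-fold shapelet (assumed form, not [10])."""
    y, x = np.mgrid[-half_width:half_width + 1, -half_width:half_width + 1].astype(float)
    r = np.hypot(x, y)
    phi = np.arctan2(y, x)
    radial = (r / beta) ** m * np.exp(-r ** 2 / (2 * beta ** 2))
    return radial * np.cos(m * phi), radial * np.sin(m * phi)


def steered_response(image, m, beta):
    """Maximum response and orientation at maximum response from two convolutions."""
    image = np.asarray(image, dtype=float)
    g_cos, g_sin = shapelet_basis(m, beta, int(np.ceil(4 * beta)))
    c1 = ndimage.convolve(image, g_cos, mode="wrap")
    c2 = ndimage.convolve(image, g_sin, mode="wrap")
    # R(theta) = cos(m*theta)*c1 + sin(m*theta)*c2 is maximized where
    # m*theta = atan2(c2, c1); the maximum value is hypot(c1, c2).
    s_max = np.hypot(c1, c2)
    theta_max = np.mod(np.arctan2(c2, c1), 2 * np.pi) / m  # folded into [0, 2*pi/m]
    return s_max, theta_max
```

Only two convolutions per shapelet order are needed, independent of the angular resolution required, which is the practical advantage over evaluating many explicitly rotated kernels.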
The response distance method developed by Suderman et al [9] is a supervised image analysis technique that quantifies local pattern order. The supervision required from the user involves the selection of a local reference region within the image in which the desired pattern is uniform (undeformed). A set of response vectors is then found for steerable shapelets centered at each pixel in the reference region. This reference set of response vectors is then used to compute the response distance for each pixel in the image, with the distance computed as the minimum of the 2-norm between the local response vector and all response vectors in the reference set,
$$d_{i,j} = \min_{\vec{R} \in \mathcal{R}} \lVert \vec{r}_{i,j} - \vec{R} \rVert_2,$$
where $\vec{r}_{i,j}$ denotes the local response vector at pixel location {i, j} and $\mathcal{R}$ is the reference set of response vectors. Akdeniz et al [10] enhanced the response distance method by applying k-means clustering [15] on the user-specified reference set of response vectors to obtain a condensed representation with cluster centroid vectors. This reduces the set of reference response vectors from the number of pixels in the reference region (typically 10²-10³) to a fixed set of centroid vectors (≈20). Application of k-means clustering [15] reduced the computational complexity of the method by at least an order of magnitude (from ∝10¹ s to ∝10⁰ s) for the self-assembly images used both here and in past work [10].
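The response distance computation and its k-means condensation can be sketched as follows. This is an illustrative outline rather than the implementation in [20]: the function names, the (rows, columns, modes) array layout, and the use of SciPy's `cKDTree` and `kmeans2` are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.spatial import cKDTree


def response_distance(responses, reference_vectors):
    """Minimum 2-norm distance from every pixel's response vector to a reference set.

    responses: array of shape (rows, cols, n_modes), one response vector per pixel.
    reference_vectors: array of shape (n_ref, n_modes).
    """
    rows, cols, n_modes = responses.shape
    tree = cKDTree(reference_vectors)
    # query() returns the Euclidean distance to the nearest reference vector
    dist, _ = tree.query(responses.reshape(-1, n_modes))
    return dist.reshape(rows, cols)


def condensed_reference(reference_vectors, k=20, seed=0):
    """Replace the full reference set with k cluster centroids (k-means)."""
    centroids, _ = kmeans2(reference_vectors, k, minit="++", seed=seed)
    return centroids
```

With the full reference set, pixels inside the reference region have distance zero by construction; with centroids, they have a small non-zero distance while each pixel is compared against only k vectors.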

Results and discussion
The surface self-assembly imaging data used in this study is consistent with past work [9,10]. The results presented in this work are divided into three sections:
• Higher-order shapelet basis: the effect of including a higher-order (m ≥ 7) shapelet basis is analyzed and compared to past response distance methods [9,10], where a lower-order (m ≤ 6) shapelet basis was used based on the maximum symmetry determined by the user.
• Local pattern orientation: a method is proposed and validated that uses existing information from the response distance method, the orientation at maximum response, to determine local pattern orientation. This is achieved through the use of steerable filter theory [13] and the 'steerable' shapelet formulations presented in [9,10], as opposed to brute-force techniques, which involve explicitly computing a large number of shapelet (filter) orientations [19].
• Defect identification method: a defect response distance method is presented which identifies the presence of local pattern distortions, defects (dislocations, disclinations [4]), and defect structures (grain boundaries [4]).
The analysis and methods presented are applied to a representative set of self-assembly images (figure 3). It is important to note, however, that all of these images (i) include only one type of self-assembly pattern (stripe, square, or hexagonal) and (ii) do not include significant areas of disorder. An open source software reference implementation provided in [20] can be used to reproduce the presented results and apply these methods for the analysis of other self-assembly images.

Higher-order shapelets for pattern analysis
Previously developed shapelet response distance methods [9,10] used lower-order (m ≤ 6) shapelets based upon two considerations:
• from a qualitative perspective, dominant symmetries observed in surface self-assembly imaging range from 2-fold (stripe) to 6-fold (hexagonal),
• the magnitude of the response to shapelets of higher order (m ≥ 7) decreases rapidly with increasing order (m).
However, given the significantly reduced computational complexity involved in the shapelet-based response distance method presented in [10], quantitative analysis of higher-order shapelet response can be considered. This is further motivated by the work of Massey et al [11], which involved the use of higher-order shapelet expansions (m 20) to optimize the tradeoff between galactic image reconstruction quality and computational complexity.
In order to quantitatively assess the overall contribution of higher-order shapelets, shapelet response vectors were computed (see section 2) for the reference self-assembly images in figure 3. The increase in computational complexity of the response distance method associated with the incorporation of higher-order shapelets was found to be negligible, ≈1%-10% of the lower-order shapelet computation time, depending on image size. For each image, the sum of shapelet responses over the entire domain (herein the cumulative response) was compared for each individual shapelet component, as shown in figure 4. In agreement with past work, for all surface self-assembly images, lower-order shapelet responses are dominant in magnitude; however, the response for shapelets with m ∈ [7, 10] decays but is still significant. Additionally, for all experimental images, the rate of decay is much lower, with higher-order shapelet responses being 10% of the maximum response up to m = 29. This is especially pronounced for experimental images of hexagonal surface self-assembly (figure 3(c)).
These results show that higher-order shapelets do capture some information from self-assembly images, especially those with significant noise. This response behavior is similar to what is observed when applying the discrete Fourier transform (DFT) to images with considerable noise and/or spatial discontinuities, where coefficients for the DFT for higher frequency modes have increasing magnitude/moduli [21], section 7.2.
To determine the local pattern features which are not captured using lower-order shapelets with the response distance method [10], analysis that included higher-order shapelets was performed and the response distance results were compared. Maximum shapelet order (m′) was determined for each image by iteratively computing the cumulative response for increasing shapelet order, until the cumulative response was found to be below a normalized threshold value (ε = 0.1) with respect to the maximum cumulative response. Referring back to figure 4, the computed truncation index m′ is highlighted in the caption for each of the reference self-assembly images in figure 3.
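The truncation rule can be written compactly. The sketch below is one reading of the procedure, under the assumption that m′ is the last shapelet order whose normalized cumulative response is still at or above ε before the first drop below it; the function name and argument layout are hypothetical.

```python
import numpy as np


def truncation_order(cumulative_response, orders, eps=0.1):
    """Return the last order whose normalized cumulative response stays >= eps.

    cumulative_response[i] is the sum of the response for shapelet order
    orders[i] over all pixels; values are normalized by their maximum.
    """
    scaled = np.asarray(cumulative_response, dtype=float)
    scaled = scaled / scaled.max()
    m_prime = orders[0]
    for m, value in zip(orders, scaled):
        if value < eps:
            break  # first order that falls below the threshold ends the search
        m_prime = m
    return m_prime
```

For a per-pixel response array of shape (rows, cols, n_orders), the cumulative response would be obtained as `responses.sum(axis=(0, 1))` before calling this function.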
The response distance results for both lower-order and (incorporation of) higher-order shapelets are qualitatively (visually) similar and, consequently, these results are not shown here (see Supplementary data for more details). After taking the difference in response distance using lower- versus including higher-order shapelet response and superimposing on the original images (figure 5), the relation between higher-order shapelet response and local pattern features is clearer. Focusing on this result for simulated self-assembly images of stripe (figure 5(a)) and hexagonal (figure 5(b)) patterns, higher-order shapelet-based response is observed to be significantly greater in areas with (i) pattern distortion (curvature or dilation) and (ii) topological defects. This is beneficial compared to the lower-order shapelet result in that, while regions with pattern distortion and defects have lower order than uniform regions, they still have order and are topologically different from regions without order (disorder). This is consistent with order parameter representations of local alignment in liquid crystalline phases [22], where order parameters are composed of two components, (order) magnitude and phase. Higher-order contributions to local order have also been shown to be important for quantification of alignment of nanostructures, such as nanorods and nanowires [23]. Topological defects, such as dislocations and disclinations, correspond to regions with non-zero order magnitude and degenerate phase [4]. This is most evident for the hexagonal simulation image (figure 5(b)), where higher-order shapelet-based response directly corresponds to both disclinations (both plus and minus [4], figure 9.2.19) and dislocations, which are composed of adjacent pairs of oppositely charged disclinations for a hexagonal pattern.
Focusing on this result for experimental self-assembly images of stripe (figure 5(c)), square (figure 5(d)), and hexagonal (figure 5(e)) patterns, higher-order shapelet-based response is observed to be significantly greater throughout the image. The difference in response when using lower-order and including higher-order shapelets can now be attributed to both (i) pattern distortion/defects and (ii) local uncertainty (noise). That is, in addition to significantly increased higher-order shapelet response in areas with distortion and defects, there is also a difference in uniform regions where there is high-frequency noise. As with the simulated images, the difference in response distance for experimental images is most significant in areas directly corresponding to disclination defects.

Orientation
The shapelet response distance method also provides local orientation (at maximum response, see section 2) for each shapelet function. This local orientation is a result of the use of steerable shapelet formulations [13,14], which involves convolutions with two (basis) components of each shapelet function, instead of multiple rotations at fixed intervals as with other methods [19]. Quantification of local pattern orientation and spatial variation is as important as quantifying the degree of order itself. However, this is not a straightforward task given that each shapelet of order m may have a different orientation at every pixel within an image. Furthermore, the orientation for each shapelet ranges over [0, 2π/m], further complicating the use of the local shapelet orientation vector in order to approximate local pattern orientation.
An approach to determine local pattern orientation was developed, inspired by a similar approach using bond-orientational order theory [12] but retaining the inherent benefits of the shapelet-based approach, where image segmentation is not required. Additionally, the shapelet method does not require different segmentation and pattern feature identification steps as with bond-orientational order theory. Instead of integrating all local orientational information provided by the shapelet orientation vector, the dominant rotational symmetry (m″) of the pattern is determined and orientation from only this shapelet mode is used. For stripe, square, and hexagonal patterns, the dominant rotational symmetries (m″) are 1, 4, and 6 respectively. This results in a maximum orientation value of 2π, π/2, and π/3 for stripe, square, and hexagonal patterns (respectively) due to degeneracy in rotational symmetry.
However, unlike local pattern response, local orientation is not a feature-scale attribute, which results in highly varying local orientation vectors. The proposed method uses local m″-fold shapelet orientation only in regions with relatively high m″-fold shapelet response, masking out local orientation values where pattern intensity is low. Standard image processing methods (masking, dilation, and blending) are then applied to interpolate local orientation in those regions with low pattern response. This approach is demonstrated for both simulation-based stripe (figure 1(a)) and hexagonal (figure 1(b)) self-assembly images and shown in figure 6:
Masking: figures 6(a) and (b) show the result of masking the m″-fold shapelet orientation based on a threshold of scaled local response. This threshold, unique to each image, is found via an iterative scheme, beginning with the largest (and least strict) threshold and analyzing the resulting orientation plot after dilation and blending operations. If there exist undefined orientation regions after blending (e.g. either masked out and/or not interpolated from blending), the masking threshold is reduced and the process is repeated until the post-blending image contains an insignificant number of undefined orientations (<1% of all pixels). This results in masking out regions of the pattern in which either (i) the pattern is relatively uniform with low image intensity (in between pattern features) or (ii) there are defects present (both orientational and translational).
Dilation: figures 6(c) and (d) show the result of dilation of the masked orientation images in order to approximate the local orientation of areas with relatively low pattern response. Morphological greyscale dilation, conveniently available from the Scientific Python package multi-dimensional image processing module (scipy.ndimage) [24], is used to 'fill in' void space between neighboring well-defined response regions. The dilation kernel size is chosen to be 2λ, where λ is the characteristic wavelength of the pattern and also the approximate distance between well-defined response regions. This scale was chosen to allow for adequate overlapping of neighboring well-defined response regions, which ultimately defines orientation in void space and across orientational boundaries.
Blending: figure 7 shows the result of blending the dilated orientation image for both simulated stripe (figure 1(a)) and hexagonal (figure 1(b)) self-assembly microscopy images superimposed on the original pattern. The blending operation is performed using a median filter [24] with kernel size 4λ, so that the effective local orientation at each pixel is the median between approximately two layers of surrounding well-defined features. This allows for effective transitions in orientation between well-defined response regions and void space/orientational boundaries.
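The masking, dilation, and blending steps can be strung together in a short sketch using scipy.ndimage. The kernel sizes (2λ and 4λ) and the <1% stopping criterion follow the description above; the function name, the sentinel value for masked pixels, and the specific threshold schedule are assumptions made for illustration and may differ from the reference implementation [20].

```python
import numpy as np
from scipy import ndimage

UNDEFINED = -1.0  # sentinel below any valid orientation (orientations are >= 0)


def blend_orientation(theta, response, wavelength, max_undefined=0.01):
    """Interpolate m''-fold orientation from high-response pixels to the whole image.

    theta: per-pixel orientation at maximum response, in [0, 2*pi/m''].
    response: per-pixel m''-fold shapelet response (same shape as theta).
    wavelength: characteristic pattern wavelength (lambda) in pixels.
    """
    scaled = response / response.max()
    dilate_size = max(1, int(round(2 * wavelength)))
    blend_size = max(1, int(round(4 * wavelength)))
    # Start with the largest threshold and reduce it until almost no pixel
    # is left undefined after dilation and blending (assumed schedule).
    for threshold in np.linspace(0.95, 0.05, 19):
        masked = np.where(scaled >= threshold, theta, UNDEFINED)
        # Flat greyscale dilation is a neighborhood maximum; it treats the
        # orientation as an ordinary grey value and ignores angular wrap-around.
        dilated = ndimage.grey_dilation(masked, size=(dilate_size, dilate_size))
        blended = ndimage.median_filter(dilated, size=blend_size)
        if np.mean(blended < 0) < max_undefined:
            break
    return blended, threshold
```

Because masked pixels carry a value below every valid orientation, the dilation always prefers a defined neighbor over an undefined one, and any pixel that is still negative after blending counts as undefined.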
For both simulated stripe (figure 6(a)) and hexagonal (figure 6(b)) masks, the resulting local orientation in uniform areas is qualitatively consistent and correct, with respect to manual (human) analysis. Several uniform regions in both images have gradual (defect-free) changes in local orientation corresponding to elastic-like pattern deformation. This is quantified well, with gradual local computed orientation changes corresponding to those observed manually. For sharp changes in orientation, which correspond to the presence of grain boundaries, there are corresponding sharp changes in computed orientation which validate the specific blending method used in that these orientational interfaces should be sharp and not diffuse. It is important to note that, even though the presented method estimates local orientation values in regions of orientational defects (disclinations), local orientation is degenerate in these areas, given that local m-fold order is present.
The shapelet-based local orientation method was then applied to experimental images of stripe (figure 3(a)), square (figure 3(b)), and hexagonal (figure 3(c)) self-assembly, with results shown in figures 7(c), (d), and (e) respectively. The presence of noise and non-ideal pattern features is not found to have a significant effect on the accuracy of the computed local orientation using the method described here. Furthermore, like the shapelet-based response distance method, the local orientation method is generalized for arbitrary pattern orders and is able to analyze patterns with non-convex features (e.g. stripe patterns).

Defect identification method
Changes in local orientation in self-assembly are mediated by the presence of orientation defects, disclinations, and defect structures such as arrays of dislocations and grain boundaries [4]. Thus identification of defects is both complementary to local orientation information and an important property for structure-property relations. While crystalline and liquid crystalline phases have well-defined topological defects [4,26] (orientational versus translational, dimensionality, charge), in self-assembly there can be significant order deformation which is not energetically possible for order on molecular and atomic scales. Thus the use of the shapelet-based response distance method [9,10] with user supervision for defect identification is challenging, in that the user would need to identify multiple subdomains with pattern defects of greatly varying character. This is the opposite of the original approach, where the user selects a subdomain containing uniform (undeformed) pattern order.
To address this challenge, a complementary method for defect identification, the defect response distance method, is proposed which uses k-means clustering [15] of shapelet response vectors from the whole image instead of a predefined subdomain. The cluster associated with each pixel (response vector) in the image is then visualized so that the user can select multiple clusters associated with one or more defects and/or defect structures. Figures 8(a Stripe patterns required k 4, where cluster centroids associated with ordered regions have higher response for specific subsets of m-fold shapelet symmetries and those associated with defects have relatively equal response. For square and hexagonal patterns, additional clusters were needed (k 8 and k 10 respectively) with three or four centroids associated with defects, depending on both the amount of noise present in the image and user discretion for selecting defect clusters. Figures 8(d)-(f) show visualizations of cluster centroid weights for shapelet order m 7. Higher-order shapelets for each image (maximum shapelet order m′ found in figure 4) were included in this analysis, but their corresponding centroid weight values were omitted for brevity.
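The whole-image clustering step can be sketched as follows. This is a minimal illustration, assuming per-pixel response vectors stored in an array of shape (rows, cols, n_modes) and SciPy's `kmeans2` as the clustering routine; the function name is hypothetical and the reference implementation [20] may be organized differently.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def cluster_responses(responses, k, seed=0):
    """Cluster every pixel's response vector and return centroids plus a label map."""
    rows, cols, n_modes = responses.shape
    centroids, labels = kmeans2(
        responses.reshape(-1, n_modes), k, minit="++", seed=seed
    )
    # The label map has the image's shape, so it can be displayed directly
    # for the user to pick the clusters that coincide with defects.
    return centroids, labels.reshape(rows, cols)
```

Displaying the returned label map (for example as a color-coded overlay) gives the per-pixel cluster visualization from which the user selects defect-related clusters; inspecting each centroid's components shows whether its response is concentrated in a few m-fold modes or spread nearly equally across them.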
User supervision is significantly more involved than in the response distance method: previously the user was required to choose a single reference subdomain, but now both the number of clusters k and the set of defect-related cluster centroids must be chosen (at runtime). Given this information, the response distance method can then be applied separately to each defect cluster by using the centroid vector as the reference (set), instead of a collective set of reference response vectors associated with uniform order (as in section 3.1). This defect response distance method was performed for each of the reference self-assembly images in figure 3 and the results are shown in figure 9, with intermediate results for the other experimental images (e.g. similar to those in figure 8) provided in the Supplementary data. Additionally, a median filter [24] with kernel size λ/2, where λ is the characteristic length scale (wavelength) of the pattern, was used to smooth the defect response distance result before superimposing on the original image, as shown in figure 9. This kernel size is significantly smaller than that used for smoothing (blending) of dilated orientation results (see section 3.2) because of the very localized nature of defects and defect structures.
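Applying the response distance to each selected defect centroid, followed by median smoothing at scale λ/2, might look like the sketch below. The function name, the argument layout, and the choice to return one map per selected centroid are assumptions for illustration, not details confirmed by the source.

```python
import numpy as np
from scipy import ndimage


def defect_response_distance(responses, centroids, defect_ids, wavelength):
    """Distance of every pixel's response vector to each user-selected defect centroid.

    responses: array of shape (rows, cols, n_modes).
    centroids: array of shape (k, n_modes) from clustering the whole image.
    defect_ids: indices of the centroids the user identified as defect-related.
    Returns an array of shape (len(defect_ids), rows, cols); small values mark
    pixels whose response resembles the corresponding defect centroid.
    """
    size = max(1, int(round(wavelength / 2)))  # median kernel of about lambda/2
    maps = []
    for idx in defect_ids:
        # A single centroid is the whole reference set, so the minimum over
        # the set reduces to one 2-norm per pixel.
        dist = np.linalg.norm(responses - centroids[idx], axis=-1)
        maps.append(ndimage.median_filter(dist, size=size))
    return np.stack(maps)
```

Keeping one map per centroid leaves the choice of how to combine them (for example a pixel-wise minimum) to the user.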
Referring to defect response distance results for both hexagonal self-assembly images (figures 9(b) and (e)), the method identifies both positive and negative disclinations (local 5-fold and 7-fold symmetric regions). In figure 9(b), several grain boundaries and pairs of adjacent oppositely charged disclinations, corresponding to dislocations in a hexagonal pattern, are also correctly identified. In figure 9(e), defect identification results are similarly found to be correct; however, the presence of noise in this (experimentally determined) image results in a weaker response. This is particularly true for positive disclinations, but each one is correctly identified to some extent.
Referring to defect response distance results for both stripe self-assembly images (figures 9(a) and (c)), the method responds much more strongly due to the presence of different grain boundary defect structures. It is important to note that stripe patterns are not composed of convex pattern structures and, instead, they are non-convex and semi-continuous within the image. Thus instead of localized point-like defect structures as are present in the representative hexagonal patterns, defect structures in stripe patterns are found to be line-like, similar to lower dimensional orientational inversion wall defects as seen in liquid crystalline phases [26]. Both positive and negative disclinations are identified by the method, along with individual dislocations and arrays of dislocations. Finally, inversion wall-like defect lines are also correctly identified due to the discontinuity of local orientation that they represent.
Referring to defect response distance results for the experimental square self-assembly image (figure 9(d)), topological defects (disclinations, dislocations) and grain boundary defect structures are adequately identified.
While the defect response distance method is found to correctly identify regions which include defects, it is not currently suitable for defect classification or quantification of the number of defects present in a given pattern. However, the ratio of the image area to that in which defects are identified is an important quantity that could be used for structure-property relationships.

Conclusions
Significant enhancements to the previous shapelet-based response distance method [9,10] were developed which enabled automated quantification of local pattern orientation and identification of topological defects via the defect response distance method within images of nanostructured surfaces. These enhanced methods were applied to a representative set of nanostructured surface images, including those with varying surface patterns (stripe, square, hexagonal) and including a representative set of images from experimental characterization techniques (TEM, AFM, and SEM). The incorporation of higher-order shapelets within the existing response distance method [9,10] was shown to improve the performance of the method at identifying regions of the pattern with deformation and topological defects. Including higher-order shapelet response was especially important for experimental images with significant noise.
A method for determining local pattern orientation was presented and applied which uses information pre-computed by the response distance method, specifically, the steerable [13] shapelet orientation at maximum response. This shapelet-based method was validated using the set of reference self-assembly nanostructure images and was able to quantify local orientation for each pattern, including those composed of non-convex features (e.g. stripe patterns). This additional data enables researchers to identify local grains and their orientation, and to compute estimates of the orientational correlation length [12].
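To illustrate how a local orientation field could feed an orientational correlation estimate, the sketch below assumes a per-pixel orientation array `theta` (in radians) and an assumed rotational symmetry order `m`. It uses the generic order parameter exp(i m theta) and a simple 1/e crossing criterion; the function names are hypothetical, and this is not necessarily the estimator used in [12].

```python
import numpy as np

def orientational_correlation(theta, m, max_shift):
    """Correlation of the m-fold order parameter exp(i*m*theta) along the x axis.

    Returns g[r] for pixel shifts r = 0..max_shift (requires max_shift < image width).
    """
    psi = np.exp(1j * m * np.asarray(theta, dtype=float))
    width = psi.shape[1]
    if max_shift >= width:
        raise ValueError("max_shift must be smaller than the image width")
    g = np.empty(max_shift + 1)
    for r in range(max_shift + 1):
        a = psi[:, : width - r]   # pixels at position x
        b = psi[:, r:]            # pixels at position x + r
        g[r] = np.mean(np.real(a * np.conj(b)))
    return g

def correlation_length(g):
    """First shift (in pixels) at which g drops below 1/e, or None if it never does."""
    below = np.nonzero(g < np.exp(-1.0))[0]
    return int(below[0]) if below.size else None

# Toy field: two grains of a 2-fold (stripe-like) pattern misoriented by pi/4.
theta = np.zeros((6, 8))
theta[:, 4:] = np.pi / 4
g = orientational_correlation(theta, m=2, max_shift=4)
length = correlation_length(g)
```

Using exp(i m theta) rather than theta itself makes orientations that differ by a multiple of 2π/m equivalent, which is one way to handle the degeneracy arising from shapelet rotational symmetry.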
Finally, a supervised method for identification of topological defects, the defect response distance method, was developed and validated. The defect response distance method allows the user or researcher to select defect response centroids which are locally present in dislocation and disclination defects (and defect structures), and then automates the identification of all similar defects throughout the image. The method is found to identify defects for both convex (square, hexagonal) and non-convex (stripe) patterns in the presence of measurement noise.
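The supervised workflow summarized above can be illustrated with a simplified stand-in: given a per-pixel vector of shapelet responses and a few user-selected defect pixels, compute the minimum Euclidean distance from every pixel to the selected response vectors. The array layout, the function name `min_distance_to_selected`, the random toy data, and the `tolerance` value are all assumptions for illustration; the published method may define its centroids and distance differently.

```python
import numpy as np

def min_distance_to_selected(responses, selected_pixels):
    """Minimum Euclidean distance from each pixel's response vector to the
    response vectors at user-selected pixels.

    responses: array of shape (H, W, K), K shapelet responses per pixel (assumed layout).
    selected_pixels: list of (row, col) tuples chosen by the user.
    """
    responses = np.asarray(responses, dtype=float)
    rows, cols = zip(*selected_pixels)
    centroids = responses[list(rows), list(cols)]                   # shape (S, K)
    diff = responses[:, :, None, :] - centroids[None, None, :, :]   # shape (H, W, S, K)
    return np.linalg.norm(diff, axis=-1).min(axis=-1)               # shape (H, W)

# Toy data: random responses standing in for real shapelet responses.
rng = np.random.default_rng(0)
responses = rng.random((5, 5, 3))
selected = [(1, 2), (3, 3)]

distance = min_distance_to_selected(responses, selected)
tolerance = 0.1  # hypothetical cutoff for "similar to a selected defect"
defect_mask = distance <= tolerance
```

Pixels with a small distance respond similarly to the selected examples; the cutoff would need to be chosen per image, which is where user supervision enters in this sketch.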
The shapelet-based methods presented in this work provide researchers in nanoscience and nanotechnology with comprehensive and robust computational techniques for local quantification of nanostructure surface order/disorder, orientation, and the presence of topological defects and defect structures. Future work includes: (i) the integration of more advanced machine learning methods to reduce or remove user supervision and (ii) extending the response distance method to incorporate phase-contrast imaging in addition to topography imaging for AFM-based experimental analysis. Furthermore, the development of three-dimensional (3D) shapelet formulations would enable extension of the presented response distance method to the analysis of tomography images of self-assembled volumes. Implementations of the methods developed here are provided for the research community as an open source software package [20].

Figure 1. Simulated stripe and hexagonal self-assembly surface images and their response distance from [10]: (a)-(b) simulation results of stripe and hexagonal self-assembly patterns respectively. Reprinted with permission from [9]. (c)-(d) response distance scalar field from [10] for figures 1(a) and (b) respectively. Reprinted with permission from [10].

Figure 2. Real and imaginary basis shapelet components for n = 0 and m ∈ [1, 8] projected onto a discrete domain. All kernel windows are the same size and shapelet scale is fixed, highlighting the effect of the geometric factor f [10]. Length-to-image scale ratio is l/N = 0.25.

... being a combination of simulation (figures 1(a)-(b)) and experimental images of nanostructured surfaces. Experimental imaging includes the use of a representative set of surface patterns (stripe, square, and hexagonal) and microscopy techniques including TEM (figure 3(a)), AFM (figure 3(b)), and SEM (figure 3(c)) which include significant measurement noise.

Figure 3. The set of simulated and experimental images of self-assembled surfaces used in this study: (figure 1(a)) simulation results of a stripe self-assembled surface from [9], (figure 1(b)) simulation results of heteroepitaxial surface self-assembly [9], (a) a TEM image of block copolymer surface self-assembly with stripe order [16], (b) an AFM image of a square patterned nanotemplate formed around surface self-assembled block copolymers [17], and (c) an SEM image of block copolymer surface self-assembly with hexagonal order [18]. From [9,16-18], reprinted with permission from APS, AAAS, AAAS, and Springer Nature, respectively.

Figure 4. The cumulative shapelet response for shapelets of order m ≤ 30 for the reference self-assembly images in figure 3: (a) stripe, (b) square, and (c) hexagonal pattern types. The cumulative response is scaled as a proportion of the maximum cumulative response (which happens to occur at m = 6 for all pattern types). The dotted purple line at m = 6 highlights the cutoff between lower-order (m ≤ 6) and higher-order (m ≥ 7) shapelets. The computed maximum shapelet order (m′) is as follows: 10 for both simulated stripe (figure 1(a)) and hexagonal (figure 1(b)) reference images, and 29, 14, and 23 for the experimental stripe (figure 3(a)), square (figure 3(b)), and hexagonal (figure 3(c)) reference images respectively.

Figure 5. The normalized difference in the response distance between using only lower-order (m ≤ 6) shapelets and also incorporating higher-order (m ≥ 7) shapelets, superimposed onto the original image for the reference self-assembly images in figure 3. Brighter regions correspond to larger differences in response distance.

Figure 7. The result of applying the blending operation via median filter [24] and superimposing this onto the original image for the reference self-assembly images in figure 3: (a)-(b) simulated stripe (figure 1(a)) and hexagonal (figure 1(b)) images respectively, and (c)-(e) experimental stripe (figure 3(a)), square (figure 3(b)), and hexagonal (figure 3(c)) images respectively. The cyclic color map 'hsv' from matplotlib [25] is used to account for degeneracy in shapelet rotational symmetry.