Derivation and analysis of the analytical velocity and vortex stretching expressions for an O (N log N)-FMM

. In the current paper, a method for deriving the analytical expressions for the velocity and vortex stretching terms as a function of the spherical multipole expansion approximation of the vector potential is presented. These terms are essential in the context of 3D Lagrangian vortex particle methods combined with fast summation techniques. The convergence and computational eﬃciency of this approach is assessed in the framework of an O ( N log N )-type Fast Multipole Method (FMM), by using vorticity particles to simulate a system of coaxial vortex rings for which also the exact results are known. It is found that the current implementation converges rapidly to the exact solution with increasing expansion order and acceptance factor. An investigation into the computational eﬃciency demonstrated that the O ( N log N )-type FMM is already viable for a particle size of only several thousands and that this speedup increases signiﬁcantly with the number of particles. Finally, it is shown that the implementation of the FMM with the current analytical expressions is at least twice as fast as when opting for using even the simplest implementation of ﬁnite diﬀerences instead. ) are presented over a range of expansion orders ( p ). All results are normalized to the computational eﬀort of only the streamfunction. In the right part of table 1, a comparison of computing the streamfunction and all of its derivatives is presented between using the analytical expressions, a ﬁrst-order accurate forward-diﬀerence scheme (7 nodes), and a second-order accurate central-diﬀerence scheme (19 nodes). These results are normalized to the result of the computational eﬀort with the analytical expressions.


Introduction
Vortex methods, which rely on the Navier-Stokes (NS) equations in terms of vorticity, are of continuous interest in fluid mechanics due to some desirable features. Especially the absence of a regular mesh is important: due to its Lagrangian nature, a vortex method is auto-adaptive, meaning that the refined complexity of the flow is automatically taken into account (i.e. in contrast to a Eulerian method, no grid study is required to achieve a fine or course grid distribution in regions of different vorticity scales). This allows for a compact computational domain and infinite boundary conditions which are automatically satisfied. As vortex methods form an N-body problem involving O(N 2 ) interactions to be solved, efficient algorithms that scale the problem down to O(N log N ) or even O(N ) are a necessity [1]. Broadly speaking, these algorithms fall in two groups: mesh-based interpolation techniques such as Vortex-In-Cell (VIC) methods, and hierarchical algorithms such as treecodes and Fast Multipole Methods (FMM).
For the purpose of modelling wind-turbine wakes a 3D vortex particle method has been developed, which has an FMM applied for the necessary downscaling of the computational effort to O(N log N ) [2]. The implemented FMM is CPU-based and available as an extension of the open-source Parallel Particle-Mesh (PPM) library [3], but relies on the singular or vector potential Coulomb kernel as a solution to the Poisson equation. To be applicable in the context of vortex methods, the algorithm has to be adjusted to retrieve the Biot-Savart kernel and vortex stretching terms. These quantities are linked to the Coulomb (or Laplace) kernel via its first and second derivatives in space, such that those should be analytically obtainable. Although FMMs are commonly applied in vortex methods, the mathematics involved are notoriously dounting, and descriptions on how to derive the necessary expressions are rather limited.
Hu [4] applies a vortex element method for rotorcraft aeromechanical analyses, and discusses to some detail the description of the Biot-Savart kernel in terms of multipole expansions in a Cartesian coordinate framework. However, the stretching term is not treated and for the present work it is of interest to find expressions in the spherical coordinate system. Yokota [5,6,7] applies the FMM for a vortex particle method and describes the expressions for both the Biot-Savart kernel and stretching terms in several of his papers. Conveniently, he adopts the notation by Greengard [8] in spherical coordinates as is the case for the current FMM, but how to deal with the operators and conversions of the derivative terms from spherical to Cartesian coordinates is not discussed.
As a guideline, a more detailed description is presented in the current work on how to retrieve the Biot-Savart kernel and stretching terms described through multipole expansions. An approach is followed whereby the first and second order gradient terms of the streamfunction are derived in spherical coordinates and then converted into Cartesian coordinates afterwards. The overall aim of the paper is to investigate the applicability of the O(N logN )-type FMM on the CPU in the context of vortex particle methods for wind energy purposes. A test case using two coaxial vortex rings is therefore applied for which the solution of both the velocity and vortex stretching is analytically available. Along this test case, the convergence of the proposed solution is then assessed regarding the expansion order and acceptance factor, and the computational speed-up with the number of particles is analysed.
Moreover it is noted that the expressions for the derivatives are possibly quite expensive to compute [9]. A 2.5-fold increase in the computational effort of the Biot-Savart kernel has been observed relative to the potential calculation [4], while the velocity+stretching terms were identified to be 6 times more expensive than the potential+force kernels combined [6]. It is therefore also investigated if an approach with finite differences might be a more viable solution, whereby a stencil is applied locally around each Lagrangian target point.

Analytical description of the FMM
In this section, the analytical description of the potential through the use of a multipole expansion approximation, as used by the FMM, is summarized. Thereafter a discussion is presented on how to retrieve the velocity and vortex stretching terms analytically from these expressions. The discussion is limited to the use of multipole expansions, as local expansions are not required for the O(N log N )-type FMM. In this case, the derivation is left to the reader as the expressions for the multipole expansion and local expansion are very similar. Furthermore, it is noted that the expressions are based on a singular vorticity field, as it is assumed that the singular and smooth solutions converge in the far-field.

The streamfunction
The main characteristic of a vortex particle method is that the flow is described by a vector field of vorticity point sources ω which form the solution to the Poisson equation (equation 1). As such, a vector potential or streamfunction exists which is obtained by convolution of the vorticity vector with the Green's function G. If the particle strength α is defined as the domain integral over the vorticity around a certain source, the outcome of the convolution is obtained as in equation 2 In the framework of multipole expansions, a potential at position i is usually rewritten in terms of an approximation of some far-field sources at positions j using the spherical harmonics Y m n (see equation 3). In this expression the terms S m n and M m n are known as the singular and multipole moments respectively (equations 4 and 5). The spherical coordinates {ρ, α, β} refer to the position of the source terms and {r, θ, φ} to the position of the target, and α is the source in the form of a particle strength. p is the expansion order, whereby the expansion approaches the exact solution for increasing p.
The spherical harmonics themselves are outlined by equation 6 and are a function of the associated Legendre polynomials P m n , for which numerically stable solutions are given by equations 7 through 9. !! denotes the double factorial which is a type of factorial whereby the odd terms are rejected.

The velocity
Of interest is to find expressions for the velocity and vortex stretching terms. The velocity is related to the streamfunction as its curl (equation 10).
The partial derivatives of the streamfunction are a function of the singular moments only: In equation 12 the term ∂Y m n ∂θ appears which is itself a function of the derivatives of the associated Legendre polynomials. Therefore it requires a bit more algebra to be solved, as shown in equations 14 through 17.
These expressions fully define the first spatial partial derivatives of the streamfunction. As a final step, the result is transformed from spherical into Cartesian coordinates. By applying the chain rule, we obtain the following expressions (equations 18 through 20). The outcome is then plugged into equation 10.
The values of the computational terms ∂ρ/∂x, ∂θ/∂x, etc. are dependent on the definition of angles, but for the currently applied system the results are added in Appendix A.

The vortex stretching
The computation of the vortex stretching follows the same path as that of the velocity, but the process is somewhat tedious and the outcome a lot more elongated. To start off, the vortex stretching is defined as the material derivative of the particle strength and can be expressed as a function of the second spatial derivatives of the potential, see equations 21 and 22.
All components of the potential second derivatives in spherical coordinates are summarized in equations 23 through 28.
Where ∂Y m n /∂θ was already given in equation 18. The term ∂ 2 Y m n /∂θ 2 is obtained from equations 29 through 32.
n − m (32) Likewise as for the velocity case, the spherical coordinates have to be transformed into the Cartesian system. Applying the chain rule for instance the derivation to x and y yields equation 33.
The other five forms of partial derivatives follow the same structure. Again, the results for the terms which are not a function of the potential are presented in Appendix A for the current spherical system. With these results, expression 22 for the vortex stretching is completely described.

Setup of the simulation 3.1. A set of coaxial vortex rings
For the analysis of the FMM with the extension to the Biot-Savart kernel and the vortex stretching terms, a simple test case is considered consisting of two coaxial vortex rings with equal properties (see figure 1). The rings are discretized using N singular vortex particles per ring, whereby the overall quantities of the circulation Γ, radius R, and ring spacing in axial direction from center to center ∆z are all set to unity. Each particle then receives a particle strength of α = 2πRΓ/N . Analytical solutions to this problem are well described, and for instance available through [10]. Noting that the vortex stretching is linearly dependent on the radial velocity of the ring, the solutions for both rings are u R ≈ ±9.0982075336049 · 10 −2 [m/s] and Dα/Dt ≈ ±1.1433144779363 · 10 −5 [m 3 /s 2 ].

Finite difference schemes
Apart from retrieving the Biot-Savart kernel and stretching terms through the analytical expressions of the multipole approximation, an approach with finite differences is assessed.
The amount of extra nodes can be reduced to 6 by discarding ψ i+2,j,k , see equation 36. This requires knowledge of ∂ψ i,j,k ∂x with an accuracy of at least O(∆x 2 ), such that this term has to be retrieved analytically at the target point.
It is assessed whether a first-order or even second-order finite-difference approximation can outweigh the analytical expressions in computational effort. It is also verified whether a firstorder finite-difference approximation would be sufficient at all; even though double-precision floating-point operations are applied, the squared form of the very small stencil spacing in the denominator could pose an obstacle.

Simulation setup
Throughout the simulations, a serial computation on the CPU is carried out using a doubleprecision floating-point accuracy. The CPU is a dual-core Intel i7-3537U processor with a base and turbo frequency of 2.00 and 3.10 GHz respectively. The two rings which are equal in properties, share the same symmetry axis and circulation rotational direction. The following cases are considered: The convergence of the Biot-Savart kernel and vortex stretching terms are assessed for a range of expansion orders and acceptance factors. Each ring is discretized by 5 · 10 4 particles, which allows for a fine hierarchical octa-tree structure to be constructed.
The speed-up of the FMM is compared relative to a direct O(N 2 ) computation, over a range of particle numbers. The number of particles is kept equal between the two rings.
The computational effort of a single interaction using the analytical expressions is mapped. This is done for the case when only the streamfunction is calculated, the streamfunction and its first derivatives, and finally for the streamfunction and both of its derivatives.
Finally, the computational effort of a single interaction using the analytical expressions versus a finite-difference approach is identified. A first-order accurate forward and a second-order accurate central finite-difference scheme are assessed, whereby the first scheme requires 7 nodes in total and the latter scheme 19 nodes. It is also checked whether a first-order accurate scheme would be sufficient at all for the stretching terms, due to the round-off errors involved.

Convergence of the velocity and stretching
The error plots are presented as a function of the expansion order, for the velocity (figure 2) and vortex stretching ( figure 3). Four cases are included for an acceptance factor (θ) of 0.5, 1.5, 2.0, and 3.0. It is visible that the solutions obtained by the FMM converge rapidly towards the exact solution for increasing expansion order. Also, an increase of acceptance factor shows convergence. Even for the case of θ < 1, where theory states that convergence is not guaranteed, the exact solution is approached although being it with a much smaller slope.

Computational effort of the FMM
The computational speed-up of the FMM relative to a direct calculation is presented over a range of expansion orders (figure 4) and acceptance factors (figure 5). It is visible that the threshold (speed-up=10 0 ) using a low expansion order and small acceptance factor is already achieved for less than a thousand particles and close to a 10-fold speed-up is apparent at 10 4 particles. A linear trend in the log-log plot is noticeable, which flattens out probably due to the expensive computations involved in constructing the hierarchical data tree. Anyhow, these figures demonstrate that the FMM is already relevant for vortex methods applying cases with only a few thousand particles.

Computational effort of the kernels
To assess the feasibility of applying finite differences, a small study into the computational effort of the analytical expressions is presented. In the left part of table 1, the computational effort of computing only the streamfunction ( ψ), the stream function and its first derivative ( ψ, ψ ), or all terms ( ψ , ψ , ψ ) are presented over a range of expansion orders (p). All results are normalized to the computational effort of only the streamfunction. In the right part of table 1, a comparison of computing the streamfunction and all of its derivatives is presented between using the analytical expressions, a first-order accurate forward-difference scheme (7 nodes), and a second-order accurate central-difference scheme (19 nodes). These results are normalized to the result of the computational effort with the analytical expressions.    From table 1 it is apparent that the computational effort increases when also the first and second derivatives of the streamfunction are considered. However, due to the fact that a lot of symmetries and recurrences are involved in the calculations, this increase in computational effort is manageable. Therefore, applying even the simplest form of finite differences does not outweigh the computational efficiency of using analytical expressions, as visible in the right part of table 1.
In figure 6 the accuracy of the vortex stretching is presented using a forward (O(h)) and central finite-difference scheme (O(h 2 )). A range of acceptance factors (θ = 1.5, 3.0, 6.0) is included, while the expansion order is kept fixed at p = 4. It is visible that for the forward finite-difference scheme an acceptable accuracy is never reached. If the stencil spacing is too large, we run into the approximation constraints, while the stencil spacing cannot be too small due to round-off errors. At least a second-order finite-difference scheme has thus to be applied.
From these results it has to be concluded that using analytical expressions for the Biot-Savart kernel and vortex stretching terms is faster than applying even the simplest form of finite-difference scheme.

Conclusions
An O(N log N )-type FMM has been applied for the speed-up of a 3D Lagrangian vortex particle method. The Biot-Savart kernel and vortex stretching terms have been derived analytically from the spherical multipole approximation of the streamfunction. The convergence and computational effort of these kernels have been assessed along a test case of two coaxial vortex rings, discretized by singular particles.  Figure 6. Accuracy of the vortex stretching using a finite-difference stencil (p = 4).
It was found that both the Biot-Savart kernel and vortex strething term converge rapidly with increasing expansion order and acceptance factor. Even for an acceptance factor smaller than 1, convergence was manifested. Results for the computational effort versus the particle size demonstrated that for less than 10 4 particles an accurate multipole approximation with the FMM was already faster than the direct method. Particle sizes up to 4 · 10 4 have been applied, for which a speed-up of more than 10 times was achieved for a solution which was accurate up to ≈ 10 −4 .
The computational breakdown of the multipole expansion has also been analysed, to assess whether it would be more beneficial to use a finite difference approach to bypass the expensive calculation of the multipole derivative terms. It was found that these derivative terms are far less expensive than on forehand expected, due to the abundance of recurring and symmetrical terms in the computations. Applying the currently derived expressions, the computation of the Biot-Savart kernel and stretching terms combined would only result in about a 3-fold increase in computational effort of the interaction. The application of finite differences results in at least a 6-fold increase in computational effort, which obviously does not justify its implementation.