Lagrangian particle tracking in the presence of obstructing objects

Volumetric flow measurement techniques have become the state of the art for characterizing a broad range of flow fields. Still, certain major limitations remain, which hinder the application of these techniques to some of the more complex flow configurations. In particular, flow measurements in the presence of obstructing objects require time-consuming measurement strategies and careful adjustment of the experimental equipment to avoid inaccurate results. In this study, these limitations are mitigated by supplying a known object's shape and position, in the form of depth maps, to commonly used Lagrangian particle tracking schemes like Shake-the-Box (STB) as well as to volume self-calibration methods. The incorporation of these depth maps is computationally inexpensive and straightforward to implement. To evaluate the performance of this novel object-aware Lagrangian particle tracking (OA-LPT) approach, synthetic as well as experimental test data are created and the reconstruction quality is evaluated. It is shown that OA-LPT is capable of providing full flow-field information, whereas the default STB implementation fails to correctly reconstruct particles in the partly occluded regions.


Introduction
Over recent years, volumetric Lagrangian particle tracking (LPT) techniques have matured enough to be applicable to a wide variety of flow cases. They have been utilized to study the flow within models of the human ventricle [1], to characterize the transport of aerosols [2] at a cubic-meter scale, to assess the aerodynamic performance of cyclists [3] or to determine the wake of a ducted propeller [4], among many other applications. In all of these examples, the measured flow fields are strongly influenced by an object: the flow is bounded by the ventricle's wall, driven by the propeller or the breathing and heated mannequin, or obstructed by the cyclist. Yet the object's geometrical properties (shape, location, orientation) are still neglected during the flow reconstruction. Despite the presence of the object, the reconstruction is possible in these cases because the model is either transparent and does not obstruct the views of the cameras [1], placed just outside of the measurement volume [4] so as not to interfere with the reconstruction algorithm, or the full flow field is assembled from several sub-volumes, each recorded with a different location of the multi-camera system mounted on a robot arm, to overcome the optical blockage [3].
These few examples already show that the use of volumetric flow measurement systems is possible for a wide range of applications, but a significant effort has to be made to work around some limitations of the involved systems and algorithms, especially when fluid-structure interactions are the key investigated feature. Recently, the European research project HOMER [5] focused on the problem of simultaneously measuring the fluid phase together with several moving non-deformable and deformable objects. Many promising solutions have been proposed within this framework [6-10], but a general approach for actually incorporating the measured object information into the particle tracking schemes, with the aim of informing the particle reconstruction about the fluid-solid boundary, is still lacking.
With this work, we propose a general workflow for LPT methods to close this missing link and to enable a complete flow-field reconstruction around solid objects. Within tomographic particle image velocimetry (Tomo-PIV, [11]), the use of voxelized masks deduced from visual-hull approaches [12, 13] has been suggested to capture the object shape. Here, we introduce the concept of depth maps for LPT. Depth maps are already widely used in various other fields of computer vision (e.g. [14, 15]), but have not yet been fully utilized in the experimental fluid dynamics community. For the purpose of particle reconstruction, they offer a more memory-efficient way of encoding the object information than voxelized 3D masks, since only an additional 2D image must be stored for each camera instead of a whole volumetric grid.
In the next sections, more detailed information is provided on the calculation of these depth maps and their use for volumetric flow measurement techniques. This work is structured in the following way: in section 2 the foundational methods for achieving the object-aware LPT flow-field reconstruction are introduced, focusing exemplarily on the Shake-the-Box (STB) technique [2, 16]. Implementation and modification remarks on the geometrical calibration and further changes to the volume self-calibration (VSC) are given as well. In sections 3 and 4, a synthetic and an experimental test case are introduced and analyzed with the aim of comparing the performance of the proposed modified LPT scheme with the standard particle tracking implementation that does not use the object information. Finally, in section 5, the key findings of this study are summarized and several connecting aspects for future studies are outlined.

Obtaining object information
The first task is to obtain the required knowledge about the object's shape and its location with respect to the multi-camera set-up before using such information in LPT. The following list (see also figure 1), although certainly not exhaustive, provides an overview of possible solutions to this problem:
(i) The exact geometry of the object is precisely known a priori:
(a) The object is manufactured from a CAD drawing.
(b) The object is photogrammetrically reconstructed with a separate measurement system prior to the flow measurements.
(ii) The object shape is unknown and must be determined from the measurement images themselves:
(a) Using the visual-hull method [12].
(b) Photogrammetrically, e.g. from (projected) speckle patterns or marks on the object surface and stereo-DIC (digital image correlation) [6, 8, 9, 17] during the actual LPT experiment.
(c) Estimating the object shape from intermediate particle reconstructions, detecting the object as a void in the reconstruction result [18].
Object shape and position are preferably measured by the same cameras as used for the flow measurement [6, 8, 9, 19]. Alternatively, a separate system is employed [20, 21], which adds experimental complexity but offers more flexibility. The easiest way of obtaining the object information is by using a computer model (i), e.g. a CAD drawing or a digital twin of the real object obtained from a photogrammetric reconstruction. With such digital models, the shape is precisely known and only the position of the object must be fitted into the multi-camera coordinate system. This can be done either by precisely mounting the object with respect to the defined origin of the measurement system or by applying a few markers to the body. Those markers can be reconstructed by the very same LPT camera set-up to fit the object's location [6].
This is only feasible, however, if the object is rigid during the measurements or undergoes a known deformation during the image acquisition. Otherwise, both the object's shape and location must be continuously determined from the measurement images themselves (ii). If the object is fairly easy to segment from the flow tracers and an automatic mask can be calculated, one way to achieve this is the visual-hull method. For example, this has been implemented for a synthetic aperture PIV approach with the aim of measuring the wake flow of a living and freely moving archer fish [13].
Another alternative is a separate stereo digital image correlation system to detect the shape and location of the model beforehand or continuously during the experiment. This requires preparing the model's surface with a suitable (projected) speckle or marker pattern. A further interesting approach is to use the particle reconstruction itself. In areas where no particles are found (and if the reconstruction quality is sufficiently high), a void remains in the particle cloud distribution. This void can be detected and the object surface is characterized by the void's geometry [18]. This may result in an iterative procedure alternating between particle reconstruction, object detection and a repeated, modified LPT reconstruction using the object information, as discussed below. With a known object shape model, the task reduces to fitting the six parameters of the object's location and orientation.
As mentioned before, there are many approaches to determine the object's shape and location. In the following, we assume that the object shape is precisely known from a digital CAD model and that its position is determined with sufficient accuracy, and we focus on how such object information can be used in common LPT schemes. One way, already proposed by [12] in the context of Tomo-PIV, is to store the object information in a volumetric grid, where each voxel occupied by the object is flagged and treated accordingly during the reconstruction. For Tomo-PIV this can be the optimal way, since the flow field is reconstructed on the same voxel grid. For LPT methods, where such a voxel grid is not used, an alternative and much more memory-efficient way of working with the object data is presented here. In our approach, the object's shape and position are encoded in a depth map mask $DM_i(x_i, y_i)$, which is defined separately for each involved camera $i$ and which holds the depth value (Z-position) of the object surface as a scalar value for each camera pixel. A line of sight (LOS) from each camera pixel is projected through the measurement volume and it is determined whether the LOS intersects the object's surface. If it does, the Z-position of the intersection point is stored as the pixel value; if not, the value can either be set to $-\infty$/$+\infty$ for cameras looking from the front/back ('front' meaning located toward positive Z) or, more practically, to the maximum Z-extent of the measurement domain.
The concept of depth maps is very common in other computer vision applications (e.g. [14, 15]) but has not been introduced for three-dimensional flow reconstructions so far. Besides the low memory requirement, the calculation of the depth map can be implemented efficiently using fast ray-triangle intersection algorithms [22]. With this newly introduced approach of encoding the object's location and shape, all necessary modifications to the LPT methodology are discussed in the next sections. Although the covered points are exemplarily implemented in the STB framework [16], they are easily adoptable to other LPT schemes.
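As an illustration, the following minimal sketch computes such a depth map for one camera from a triangulated surface mesh by brute-force ray casting with the Möller-Trumbore intersection test. The helper `pixel_ray`, returning the LOS origin and direction of a pixel from the camera calibration, is a hypothetical placeholder, and a production implementation would use an acceleration structure (e.g. a bounding volume hierarchy) instead of testing every triangle per pixel.

```python
import numpy as np

def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moeller-Trumbore test: ray parameter t of the hit, or None on a miss."""
    e1, e2 = v1 - v0, v2 - v0
    h = np.cross(direction, e2)
    a = np.dot(e1, h)
    if abs(a) < eps:                      # ray parallel to the triangle plane
        return None
    f = 1.0 / a
    s = origin - v0
    u = f * np.dot(s, h)
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = f * np.dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return None
    t = f * np.dot(e2, q)
    return t if t > eps else None

def compute_depth_map(pixel_ray, triangles, shape, z_fill):
    """DM(x, y): Z-position of the first surface intersection along each
    pixel's LOS; pixels whose LOS misses the object are set to z_fill."""
    dm = np.full(shape, z_fill, dtype=float)
    for py in range(shape[0]):
        for px in range(shape[1]):
            origin, direction = pixel_ray(px, py)
            hits = [t for tri in triangles
                    if (t := ray_triangle(origin, direction, *tri)) is not None]
            if hits:
                dm[py, px] = (origin + min(hits) * direction)[2]  # Z of nearest hit
    return dm
```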

Modifications to STB
For STB, modifications are only required in the Iterative Particle Reconstruction (IPR) [23] step. The complete particle matching and tracking part can be left unaltered. All subroutines affected by the depth maps are illustrated in figure 2. These are: the initial particle triangulation, the rendering of the projection and residual images, and the position and intensity update ('shaking').

Particle triangulation.
In standard IPR, the particle triangulation, i.e. the computation of possible particle world positions in the measurement volume, starts with detecting particle peaks above some user-selected intensity threshold in all camera images and determining their sub-pixel peak locations. Taking a particle location $(x_1, y_1)$ in the image of e.g. camera 1, a search is performed along the corresponding epipolar line in the image of camera 2. With the basic calibration mapping functions $(x_i, y_i) = M_i(X, Y, Z)$ for camera $i$ at some world position $(X, Y, Z)$, as well as the inverse function $(X, Y, Z) = M_i^{-1}(x_i, y_i, Z)$, such an epipolar line is given by

$$(x_2, y_2)(Z) = M_2\big(M_1^{-1}(x_1, y_1, Z)\big), \qquad Z_{\min} \le Z \le Z_{\max}. \qquad (1)$$

Particles in the image of camera 2 along the epipolar line and within a distance of the allowed triangulation error $\pm\varepsilon$ are taken as possible matches, and the two positions $(x_1, y_1)$ and $(x_2, y_2)$ are triangulated into a corresponding best-fit world position $(X, Y, Z)$. Projecting this position back onto the images of the other cameras 3, 4, ..., it is then checked whether particle peaks are also visible within a small region of $\pm\varepsilon$, arriving at all possible combinations of detected particles in all images and their triangulated world positions.
For standard STB experiments, all cameras should be able to see the complete measurement volume, and all particles are imaged by all cameras. Therefore, it is not too critical with which initial two-camera combination the epipolar line search is started, and a standard set of fixed camera combinations can be used for all calculations. In the case of an obstructing object blocking the view for certain cameras, however, it becomes crucial in which order the different camera views are combined to triangulate the single particles. Therefore, all possible combinations of the starting camera and the second camera, in which candidates are searched for along the epipolar line, are tested (i.e. 12, 13, 14, ..., 23, 24, ...) to ensure the detection of all particles. A similar extension to the triangulation scheme was already proposed in the past [24, 25] with the aim of better resolving overlapping particles and suppressing ghost particles more effectively. Whereas in those approaches it is mainly important to rotate the first camera position and not all camera combinations need to be considered, the full set of first and second camera combinations is necessary here in order to correctly cover all partly obstructed subregions. As described above, all particle candidates detected by the first two cameras are triangulated to an intermediate 3D position $(X, Y, Z)$, and here, for the first time, the cameras' depth maps are consulted to verify whether the particle is indeed visible by those two cameras ($Z$ compared against $DM_1(x_1, y_1)$ and $DM_2(x_2, y_2)$, with the direction of the inequality depending on the viewing side). Also, when back-projecting $(X, Y, Z)$ onto the remaining camera images, only those cameras are taken into account whose depth maps indicate that they see the particle. A user-defined threshold on the number of cameras specifies that only those regions of the measurement volume are reconstructed where at least e.g. three or four cameras see the particles. It must further be noted that, due to the multiple camera combinations, particle duplicates are highly likely to occur. Those cases must be handled by simply removing particles that are too close to each other or by clustering nearby particles, as, for example, proposed in the multiset triangulation method [24].
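A minimal sketch of this object-aware triangulation logic is given below. It assumes that a routine `pair_candidates(i, j)` yields the best-fit world positions from the epipolar search of cameras i and j, and that `peak_near(k, x, y, eps)` reports whether camera k has a detected peak within ±eps of an image position; both are hypothetical placeholders for the corresponding IPR routines.

```python
from itertools import permutations

def sees(cam, point):
    """Depth-map visibility test for one camera and one world position."""
    x, y = cam['M'](point)                           # mapping function M_i
    dm = cam['DM'][int(round(y)), int(round(x))]     # surface depth on this LOS
    return point[2] > dm if cam['front'] else point[2] < dm

def triangulate_object_aware(cams, pair_candidates, peak_near, eps, min_cams=3):
    """Initial triangulation testing every ordered two-camera combination,
    keeping only candidates supported by enough cameras that actually see
    the corresponding part of the volume."""
    accepted = []
    for i, j in permutations(range(len(cams)), 2):
        for point in pair_candidates(i, j):
            if not (sees(cams[i], point) and sees(cams[j], point)):
                continue                             # pair cannot see this region
            support = [k for k in range(len(cams))
                       if sees(cams[k], point)
                       and peak_near(k, *cams[k]['M'](point), eps)]
            if len(support) >= min_cams:
                accepted.append(point)
    return accepted   # duplicates from different pairs are merged afterwards
```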

Shaking.
Once the initial particle distribution is found by the triangulation step, all particle positions are refined by incrementally shifting ('shaking') each particle position in each spatial direction until a best fit with the actual measurement images is found. With the new approach, only cameras that fulfill the depth map criterion for a given particle are used during the shaking. In a similar manner, the intensity update of the particle at its new final position is also only performed with cameras that are able to see the region where the particle is located.
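The following fragment sketches this restriction, assuming a simple coordinate-wise shake step; `fit_error` and `refit_intensity`, which score and refit a particle against the current residual images, are hypothetical placeholders, and `sees` is the depth-map test from the triangulation sketch above.

```python
def shake_particle(p, cams, residuals, delta=0.1):
    """One shaking update: trial shifts along each axis, scored only on the
    cameras whose depth maps confirm that they can see the particle."""
    usable = [c for c in cams if sees(c, p['pos'])]      # depth-map criterion
    for axis in range(3):
        for step in (-delta, +delta):
            trial = p['pos'].copy()
            trial[axis] += step
            if fit_error(trial, usable, residuals) < fit_error(p['pos'], usable, residuals):
                p['pos'] = trial
    # the intensity update is likewise restricted to the visible cameras
    p['int'] = refit_intensity(p, usable, residuals)
```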

Projection and residual image rendering.
Already during the shaking procedure, particle projection images (a representation of how the camera image would look with the current distribution of particles) and residual images (original image minus projected image) are rendered for every camera to guide the position fitting. This is also done before any further outer IPR loop, which starts by detecting yet undetected additional particles in the residual images (see figure 2). Again, the camera depth maps are used to decide whether a certain particle should be plotted in the projection images, which in the end should resemble the recorded images as closely as possible given the true particle distribution. The quality of the residual images, i.e. how empty they are, greatly impacts the performance of the whole IPR process.
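A sketch of such a depth-map-aware rendering step is shown below, with particles drawn as Gaussian blobs of an assumed width; for brevity each blob is evaluated on the full image instead of a small local window, and the blob model is a simplification of the calibrated particle imaging model an actual implementation would use.

```python
import numpy as np

def render_projection(cam, particles, shape, sigma=1.0):
    """Projection image for one camera: plot only particles that the camera
    can see according to its depth map (sees() as defined above)."""
    img = np.zeros(shape)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    for p in particles:
        if not sees(cam, p['pos']):          # occluded by the object: skip
            continue
        x, y = cam['M'](p['pos'])
        img += p['int'] * np.exp(-((xx - x)**2 + (yy - y)**2) / (2 * sigma**2))
    return img

def render_residual(cam, recorded, particles):
    """Residual image: recorded image minus the current projection."""
    return recorded - render_projection(cam, particles, recorded.shape)
```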

Multi-camera calibration
As a prerequisite for the whole particle reconstruction, the multi-camera set-up must be calibrated appropriately in order to determine the geometric relationships between the single camera views and to identify how the cameras' image coordinate systems correspond to the common world coordinate system. In the most general sense, the result of the camera system calibration is a mapping function $M_i$ for each camera, which maps a 3D space point $(X, Y, Z)$ to an image point $(x_i, y_i)$, or vice versa.
For volumetric flow measurement applications, the two most widely used calibration models are the pinhole camera model [26, 27] and some type of polynomial mapping function [28]. The pinhole camera model describes the light paths of the cameras with physically meaningful parameters, namely the intrinsic (focal length, principal point, ...) and extrinsic parameters (location and orientation in space). As a result of this physically sound modeling, calibration with this model is very robust.
A common approach to calibrate the camera system with the pinhole model is to record arbitrarily shifted, tilted, and rotated views of the calibration plate (ideally two-level and, for set-ups with front and back cameras, double-sided) under random orientations with varying visibility from the different involved cameras. When using a robust and tolerant bundle-adjustment fit [29], it is not required that all cameras see all markers of the calibration plate all of the time. As long as a continuous link between all views can be made through suitable camera correspondences and each camera has seen the target at least once, the equation system can be solved for the complete measurement system at once, distributing the reprojection errors optimally across all cameras. The accuracy of the standard pinhole model is reduced once optical interferences, interfaces with changing refractive indices or other strong distortions are introduced into the light paths, requiring modifications to the standard model [30] or more advanced ray-tracing models for a particular experimental set-up [31, 32].
For such scenarios, e.g. a camera system looking from the outside into a water tank with curved walls under oblique angles, polynomial mapping functions with more parameters have proven to be a suitable alternative. Typically, a planar calibration target is scanned through the volume at several equidistant co-planar planes. Within each Z-plane, e.g. a third-order 2D polynomial in X and Y is used to fit an optimal mapping even under non-linear distortions. In between the planes, the mapping function can be interpolated with a linear fit along the LOS, as suggested by Paolillo and Astarita [32].
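A minimal sketch of such a multi-plane polynomial model is given below. It fits a third-order 2D polynomial per calibration plane from mark world positions and their measured image positions and, as a simplification of the LOS interpolation of [32], interpolates the mapped image coordinates linearly in Z between the planes.

```python
import numpy as np

def poly_terms(X, Y, order=3):
    """Monomial terms X^m Y^n with m + n <= order."""
    return np.stack([X**m * Y**n for m in range(order + 1)
                                 for n in range(order + 1 - m)], axis=-1)

def fit_plane(XY_world, xy_image, order=3):
    """Least-squares 2D polynomial mapping (X, Y) -> (x, y) for one Z-plane,
    from the calibration mark world and image positions."""
    A = poly_terms(XY_world[:, 0], XY_world[:, 1], order)
    coeff, *_ = np.linalg.lstsq(A, xy_image, rcond=None)   # (n_terms, 2)
    return coeff

def map_point(coeffs, z_planes, X, Y, Z, order=3):
    """Multi-plane mapping: evaluate the per-plane polynomials at (X, Y) and
    interpolate linearly in Z (z_planes assumed in ascending order)."""
    t = poly_terms(np.asarray(X), np.asarray(Y), order)
    xy = np.array([t @ c for c in coeffs])                 # (n_planes, 2)
    return np.array([np.interp(Z, z_planes, xy[:, k]) for k in range(2)])
```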
Besides the typically employed calibration recording strategies (arbitrary views for the pinhole model, co-planar planes for the polynomial model), both models can also be adjusted to work with the respective other recording approach. This can be important in the context discussed here of flow measurements with obstructing objects, where it may not be possible to choose freely between the two recording methods. For example, if the investigated object cannot be removed from the measurement domain prior to the calibration, scanning the calibration plate through the volume at co-planar positions might not be possible. In those cases, it is recommended to perform the camera system calibration with recordings of the target under arbitrary views. When the use of a planar calibration target is not possible at all, e.g. due to limited physical access to the measurement volume, alternative calibration approaches, such as a wand-based calibration [33], a dumbbell calibration [34] or even a non-invasive calibration [35], might be adequate to place calibration points around the object.

Modifications to VSC
Background.
The VSC procedure [36, 37] uses real recordings with seeding particles to quantify and correct remaining calibration errors, which may vary spatially and/or over time, e.g. due to camera vibrations. The magnitude of the disparity errors can be as large as 10 px or more. VSC works by dividing the volume into $n_X \times n_Y \times n_Z$ subvolumes. For each triangulated particle, the disparity $(dx_i, dy_i)$ is the difference between the triangulated world position projected back onto the image of camera $i$ and the measured particle peak position $(x_i, y_i)$:

$$(dx_i, dy_i) = M_i(X, Y, Z) - (x_i, y_i). \qquad (2)$$

The most prominent disparity vector $(dx_i, dy_i)$ for each subvolume is determined statistically by plotting a small blob for each particle into a disparity map and selecting the position of the highest value in this map. The final field of disparity vectors is then used to correct the original calibration by fitting a new mapping function $M'_i$ for each camera:

$$M'_i(X, Y, Z) = M_i(X, Y, Z) - (dx_i, dy_i). \qquad (3)$$

$M'_i$ can have the same functional form as $M_i$, but one can also switch here, e.g. from a pinhole model to a multi-plane polynomial model with more degrees of freedom (one Z-plane for each Z-plane of subvolumes), which is often more appropriate for stronger optical distortions.
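The statistical peak detection can be sketched as follows: each particle disparity is splatted as a small Gaussian blob into a 2D map for its subvolume, and the position of the highest map value is taken as the subvolume's disparity vector (blob width and map range are assumed values here).

```python
import numpy as np

def subvolume_disparity(disparities, d_range=3.0, bins=64, blob_sigma=0.5):
    """Statistical disparity estimate for one subvolume and one camera:
    plot a small Gaussian blob per particle disparity (dx, dy) into a 2D
    map and return the location of its highest peak."""
    axis = np.linspace(-d_range, d_range, bins)
    gx, gy = np.meshgrid(axis, axis)
    dmap = np.zeros((bins, bins))
    for dx, dy in disparities:
        dmap += np.exp(-((gx - dx)**2 + (gy - dy)**2) / (2 * blob_sigma**2))
    iy, ix = np.unravel_index(np.argmax(dmap), dmap.shape)
    return axis[ix], axis[iy]    # most prominent (dx, dy) for this subvolume
```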
For spatially varying but temporally constant disparity errors, the statistics can be improved by accumulating the disparity information of several images, i.e. time steps. For strong local optical distortions, a fine-scale correction field can be created or updated [38, 39], which is then added to the original mapping function, i.e. the disparities $(dx_i, dy_i)$, available at the center of each subvolume, are stored directly as a volumetric grid with appropriate trilinear/spline interpolation in between, without fitting a new $M'_i$.
For a temporal change of the calibration function, a modified calibration function needs to be generated for each time step using only a single image with less statistics. Since vibrations mostly involve a single global shift of the camera images, or in rare cases some rotation, a single or only a few subvolumes are required, which then contain enough statistical information for calculating valid disparity vectors. Vibrations of more than 10 px have been successfully corrected in the past [40]. Often, when not explicitly checking for it, small vibrations of e.g. less than 0.5 px remain unnoticed, but reducing the errors from e.g. 0.3 px to 0.1 px already provides a significant improvement in the reconstruction quality of STB, especially for high seeding densities. The complete VSC procedure is usually repeated a few times until the remaining disparity errors are ideally reduced to less than 0.1 px everywhere in the volume.
In the context of obstructions, it must be noted that standard VSC, without any information about object shape and position, assumes during triangulation that a particle is visible by all $n$ cameras, and a combination of $n$ suitable particle peak locations is then triangulated into a best-fit world position using an appropriate cost function (here the $L_2$-norm of all $(dx_i, dy_i)$, as recommended by [41]). In obstructed regions not visible by all cameras, the fitted world positions are therefore almost always non-existent ghost particles with random disparities.
For illustration, this is shown in figure 3(a) for synthetic data with a cylinder aligned along the Y-axis, viewed by six cameras placed in the XZ-plane around the cylinder. The images of the six cameras have random global offsets added in the range of ±1 px relative to the calibration. Further details about the synthetic data are given in the next section. The disparity vectors with non-yellow background (dy shown for camera 1) are all invalid outliers; no valid disparity vectors can be computed in the obstructed center region.

Modifications to VSC.
Modifications are needed for the VSC triangulation, very similar to the changes in the IPR triangulation (section 2.2). Again, all two-camera combinations are checked for the initial search along the epipolar line of the second camera, and the object depth maps provide the information which part of the volume is visible by which set of cameras. For each subvolume, the set of cameras is determined that can see the center location of the subvolume. Disparities of particles located within a subvolume are only added to its disparity maps if the particles are visible by the same set of cameras (or more cameras). Note that often a smaller fraction of a subvolume may be visible by fewer or more cameras. Particles within the subvolume seen by fewer cameras are discarded. For particles visible by more cameras, the triangulation and disparity calculation is only done on the set of cameras associated with the center point of the subvolume. This way, all particles contributing to the disparity map of a subvolume originate from the same set of cameras. Otherwise, no single true disparity peak would be generated, as explained below.
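In outline, the per-subvolume bookkeeping can look like the following sketch, reusing the depth-map test `sees` from section 2.2 (data layout and helper names are assumptions):

```python
def subvolume_camera_sets(cams, centers):
    """For each subvolume center, the set of cameras that can see it."""
    return [frozenset(k for k, cam in enumerate(cams) if sees(cam, ctr))
            for ctr in centers]

def contributes(particle_cams, subvol_cams):
    """A particle enters the disparity statistics of its subvolume only if
    it is seen by at least the subvolume's camera set; the disparity itself
    is then computed on exactly that set."""
    return subvol_cams <= particle_cams       # subset test
```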
With this modified scheme, all valid disparity vectors are recovered, as shown in figure 3(b). As a side note, when using only a single subvolume (mainly relevant for correcting large shifts at the beginning or for single-image vibration corrections), the depth maps are not checked and only those particles are considered which are visible by all cameras. Unfortunately, disparity vectors may change significantly based on which set of cameras they originate from. As an example, consider four inline cameras along the X-axis with constant image shifts in the y-direction of (+1.0, 0.0, −1.0, 0.0) px. The $L_2$-norm triangulation fit computes dy-correction values of (−1.0, 0.0, +1.0, 0.0) px, but when e.g. camera 3 is missing, the corrections are about (−0.6, +0.4, n/a, +0.4) px. Every combination of cameras might lead to different disparity vectors. While all those disparity vectors are correct in a strict sense, such a disparity field is obviously not optimal for fitting a new smooth $M'_i$ according to equation (3). One would like to have the same constant disparity at all locations.

Neighborhood consistency algorithm.
There is a way to repair the disparity field to recover the single constant shift (as used here) or, in the general case, to make the disparity field smooth and consistent again in a global sense. For each subvolume, three free parameters $(\Delta X, \Delta Y, \Delta Z)$ are available, which can be added in equation (2) to all camera mapping functions, equivalent to a shift of the subvolume in the world coordinate system, while still preserving the basic disparity and mapping relationship between all cameras:

$$(dx_i, dy_i) + (\Delta x_i, \Delta y_i) = M_i(X + \Delta X, Y + \Delta Y, Z + \Delta Z) - (x_i, y_i), \qquad (4)$$

with first-order expansion:

$$(\Delta x_i, \Delta y_i) \approx \frac{\partial M_i}{\partial X}\,\Delta X + \frac{\partial M_i}{\partial Y}\,\Delta Y + \frac{\partial M_i}{\partial Z}\,\Delta Z. \qquad (5)$$

Thus, each local set of disparities $(dx_i, dy_i)$ can be transformed with the goal of recovering a smoother version of the complete disparity vector field. This is accomplished as follows:
(i) Take all subvolumes visible by all cameras and compute a reference set of median disparity vectors $(\overline{dx}_i, \overline{dy}_i)$ for all cameras. The median is more robust than the average in case of outliers.
(ii) For each subvolume not visible by all cameras, find the optimal set of $(\Delta X, \Delta Y, \Delta Z)$ to replace $(dx_i, dy_i)$ by $(dx_i, dy_i) - (\Delta x_i, \Delta y_i)$, such that the difference to the reference set of disparities is minimized:

$$\min_{\Delta X, \Delta Y, \Delta Z} \sum_i \big\| (dx_i, dy_i) - (\Delta x_i, \Delta y_i) - (\overline{dx}_i, \overline{dy}_i) \big\|^2. \qquad (6)$$
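Under the first-order expansion, finding the optimal shift is a small linear least-squares problem per subvolume. A minimal sketch, assuming the 2×3 Jacobians ∂M_i/∂(X, Y, Z) at the subvolume center are available (e.g. from finite differences of the mapping functions):

```python
import numpy as np

def consistency_shift(disp, ref, jacobians):
    """Find the subvolume shift (dX, dY, dZ) that moves the local set of
    disparities as close as possible to the reference (median) set, using
    the first-order expansion of the mapping functions (equation (6))."""
    # stack one 2x3 block J_i and one 2-vector (disp_i - ref_i) per camera
    A = np.vstack(jacobians)                           # (2 n_cams, 3)
    b = np.concatenate([d - r for d, r in zip(disp, ref)])
    delta, *_ = np.linalg.lstsq(A, b, rcond=None)      # (dX, dY, dZ)
    corrected = [d - J @ delta for d, J in zip(disp, jacobians)]
    return delta, corrected
```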
In the experiments we have encountered so far, there has always been a region visible by all cameras. If this is not the case, the algorithm above takes the subvolumes with the most cameras as a reference in step (i). Some modifications may be needed in the future for such special topologies. In case of VSC with a single subvolume, typically used for initially correcting large shifts or vibrations, the software only uses the particles visible by all cameras, as in standard VSC. For the synthetic data example above, the constant global disparity shifts are completely recovered, as shown in figure 3(c), after which the correction of the calibration mapping functions can proceed as usual. Repeating VSC confirms that the average remaining disparity decreases down to 0.03 px (max. 0.1 px).
The local neighborhood consistency algorithm is not necessary when there is only a constant disparity shift in the complete volume and the subvolume(s) seen by all cameras can determine this shift sufficiently. Further, when VSC can be applied to image recordings without the object in place, applying standard VSC to the recordings with the object in place is usually fully sufficient to correct possible constant shifts. For remaining spatially varying disparities and enough available subvolumes viewed by all cameras (i.e. the object is small compared to the measurement volume), standard VSC might also be sufficient to correct the calibration function accordingly. For the experiments investigated so far, accurate calibration functions could often be achieved without VSC modifications, but typically requiring a few more VSC iterations. More experience gained from future experiments will provide more clarity about circumstances and issues potentially requiring further modifications.
As a final remark on the extended VSC approach, it has to be noted that the self-calibration and the depth map determination may have to be performed in an iterative manner. Especially if large initial disparities (>1-2 px) need to be corrected, the initial depth maps will no longer be completely accurate. Therefore, before starting the next VSC step, it is recommended to recalculate the depth maps with the updated calibration information. In later steps, when only smaller sub-pixel corrections are performed, the pixel-discretized depth maps offer sufficient accuracy without the need to recalculate them.

Image generation
For a first validation of the proposed methods, an image set of synthetic two-pulse data is created. These images (900 × 600 px) simulate a virtual hardware set-up of six ideal (no distortions, no noise) cameras with telecentric lenses. The cameras are placed in an in-line configuration around a cylinder with a diameter of 200 vox, with three cameras on each side of the object. The maximum covered viewing angle of each of the two camera groups is ±40°. A total volume of interest of 900 × 500 × 400 vox is considered, and the cylinder is placed exactly in the center of the domain with its rotational axis aligned with the Y coordinate axis (figure 4(a)). The simulated camera set-up, schematically represented in figure 4(c), mimics a configuration where some regions are observed by all cameras and some regions can be seen by as few as three cameras. For a later additional analysis with a four-camera configuration, cameras 2 and 5 will be omitted, reducing the minimum camera coverage to two (figure 4(d)). The tracer particles are rendered with a constant diameter of 2 px and a peak intensity of 500 counts. Overall, 20 000 particles are simulated in this way, resulting in a seeding density of 0.05 ppp.
Furthermore, two flow cases are considered for these synthetic images. In the first case, the tracers exhibit a constant displacement of v = 5 vox in the direction of the cylinder's axis in between the double frames. This very simplified flow modeling is sufficient for focusing on the particle reconstruction, and no further complexity for the tracking is incorporated. The second case is a laminar flow around the cylinder with maximum particle displacements of 10 vox; it will be used for the four-camera configuration with the aim of adding a velocity gradient in the Z-direction and thus useful temporal information for the particle tracking.

Particle reconstruction
The evaluation of the synthetic image data is done with the two-pulse STB approach [42] as implemented in DaVis 11 (LaVision GmbH). Flow-field reconstructions are performed with the default STB implementation and with the additional incorporation of depth maps. An exemplary depth map for one of the cameras is shown in figure 4(b). For the IPR reconstruction, four inner and four outer loops are conducted (figure 2) and particles are triangulated within a tolerance of 1 px. In order to achieve a converged solution, eight iterations of particle reconstruction and particle tracking are chosen. Although the synthetic particles are only created in a volume spanning 400 vox in depth, the reconstructed volume during two-pulse STB is set to a depth of 500 vox, which allows the amount of ghost particles outside of the true volume to be identified easily. Knowing the ground truth, the reconstruction results can be analyzed directly with respect to the number of found true particles and ghost particles, and with respect to the positional accuracy of the true particles. To be classified as a true particle, a reconstructed particle must lie within a distance of 1 vox of a ground-truth particle; otherwise it is considered a false particle.
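Such a ground-truth comparison can be sketched with a nearest-neighbour query (here with SciPy's cKDTree; the 1 vox tolerance follows the criterion above):

```python
import numpy as np
from scipy.spatial import cKDTree

def classify(reconstructed, ground_truth, tol=1.0):
    """Fraction of recovered true particles, ghost fraction and RMS
    position error of the matched particles (distances in voxels)."""
    dist, idx = cKDTree(ground_truth).query(reconstructed,
                                            distance_upper_bound=tol)
    matched = np.isfinite(dist)               # unmatched queries return inf
    true_frac = np.unique(idx[matched]).size / len(ground_truth)
    ghost_frac = np.count_nonzero(~matched) / len(reconstructed)
    rms = np.sqrt(np.mean(dist[matched] ** 2))
    return true_frac, ghost_frac, rms
```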

Results
Starting with the six-camera configuration, the reconstruction results of both the standard and the newly introduced STB approach are presented in figures 5(a) and (b), respectively. The particle distribution is presented in a top-down view onto the XZ projection plane, covering the complete measurement domain. For standard STB, which assumes that every particle is visible by all cameras, the majority of particles is missing in a large central region; only 65% of all true particles are reconstructed. Those empty regions coincide exactly with the areas where the number of involved cameras drops below six (see figure 4(c)). The root-mean-square (RMS) error of the reconstructed particle positions is 0.18 px. As a result of the missing true particles in the center of the domain, their unresolved intensity signal contributes to a larger number of ghost particles (4%), which are easily identifiable outside of the actual measurement domain.
When utilizing the object information in the modified STB reconstruction, 99.8% of all true particles are correctly found with a ghost fraction of only 0.2%. The positional accuracy, defined as the RMS error between detected and true position of all found true particles, is better than 0.02 vox. These numbers are very close to the reconstruction quality one would obtain in the same case without any object in place (true particles: 99.7%, ghost particles: 0.06%, RMS position error: 0.02 vox). Only the number of ghost particles is slightly higher, which can most likely be attributed to the regions where particles can only be triangulated by three cameras, resulting in a higher chance of producing ghost particles.
With this data set, we also want to explore how the reconstruction is affected by erroneous object information. Such misinformation can occur, e.g. due to an inaccurate fitting of surface markers or due to an unnoticed shift of the complete model during the image acquisition. The latter case is considered now by artificially shifting the original object by 10 vox in the negative X direction (with equivalent shifts in the camera depth maps) and restarting the same STB reconstruction on the same images as before. The obtained particle distribution is presented in figure 5(c).
Three effects stemming from the depth map misalignment can be deduced from this result. Most noticeably, empty lines (more precisely, completely empty sheets in the volume) appear, in which no particles are found. Here, the number of cameras assumed to see the particles is higher than what is physically the case, and therefore particle signals are searched for in too many cameras. Since IPR cannot find sufficient intensity information in all of the supposedly involved cameras, possible particles are discarded. As a second effect, there is a small particle-free area close to the left side of the cylinder. Due to the shifted depth maps, the IPR algorithm assumes that this area is blocked by the cylinder and therefore prevents the occurrence of any particles there. Thirdly, since the correct reconstruction of all true particles (found true particles: 96.4%) is hindered by the two aforementioned effects, the likelihood of creating ghost particles increases (ghost particle fraction: 2.0%) and the positional error rises slightly to 0.06 px as well. This is especially noticeable by the presence of particles outside of the actual measurement volume. All of this indicates that the accuracy of the object detection must be sufficiently high to prevent such reconstruction errors.
Finally, a four-camera configuration is tested, where, due to obstructions, parts of the volume are only visible by two cameras (figure 6(a)). The number of cameras required to triangulate a particle is now set to only two, and the complete flow field around the object can be filled with true particle information. However, especially for higher seeding densities, the number of possible particle candidates with only two cameras exceeds that with three or more cameras by far, since every particle within the specified triangulation error around the epipolar line needs to be considered. As a result, the number of falsely reconstructed ghost particles increases significantly in those areas. The ratio of correctly found true particles is 91%, but even more ghost particles (92%) are produced at the same time. As a consequence of the high ghost particle fraction, the RMS position error shows a rather high value of 0.15 px. In regions where all cameras are able to see the particles, the reconstruction is almost as good as before.
Although the two-pulse STB method used here should yield more accurate results than a pure IPR particle reconstruction step, this is not the case for this data set. The reason is the constant velocity field. Such a flow scenario can be very disadvantageous, as has already been discussed for Tomo-PIV [43], since ghost particles move with the average velocity of all actual, true particles. As a result, ghost particles do not decorrelate over time and cannot be eliminated efficiently. It has been concluded that higher spatial velocity gradients are required for a faster temporal decorrelation of the ghost particles. By employing a more complex flow field with stronger velocity gradients, the reconstruction quality can be enhanced, even in the four-camera case discussed here. Based on this, the flow field is now set to a laminar velocity field (Stokes flow) around the cylinder, with the bulk flow in the positive X direction. Furthermore, the IPR parameters are set to a more constrained reconstruction by raising the number of outer iterations to 16, lowering the allowed triangulation error to 0.5 vox and incorporating an intermediate median filter in between the IPR and tracking iterations. The improved particle tracks for this data set are shown in figure 6(b). The reconstruction yields a true particle recovery of 98% with only a 2% ghost particle ratio. The RMS error of the true particle positions is about 0.03 px.

Set-up
As an experimental test case, an object-camera configuration close to the synthetic data is chosen, with a six-camera set-up around a cylindrical object (figure 7). The experiment is conducted in a water tank (1000 × 400 × 400 mm³) and the flow within the tank is created by a pump placed in one corner of the tank. The emitted jet is aimed directly at the object, creating an impinging flow scenario. The obstructing object itself is a stepped cylinder (material: black anodized aluminum, small diameter: 30.5 mm, large diameter: 55.9 mm; see the CAD representation in figure 8(d)) and is mounted in the center of the water tank. The cylinder is placed in such a way that the overlapping fields of view of the cameras cover the flow around the small- and large-diameter sections as well as a small part underneath the cylinder.
Two image sets are recorded, each with a total of 500 images per camera: one without the object, to be able to use a standard VSC approach, and a second recording with the object in place, featuring the obstruction. Fluorescent particles (UVPMS-BO-1.00, Cospheric; material: polyethylene, diameter: 125-150 µm, density: 1000 kg m⁻³, emission wavelength: 606 nm) are added to the water as tracer material. Illumination is provided by two pulsed LED panels (LED Flashlight 300 blue, LaVision GmbH; emitter area: 300 × 100 mm, peak emission wavelength: 450 nm) with pulse lengths of 250 µs. Such a short pulse length makes it possible to overdrive the LED emitters, resulting in a pulse energy of 72.5 mJ. For a sharper definition of the illuminated volume, black apertures are installed on the short side walls of the glass tank. All in all, a measurement volume of 240 × 200 × 300 mm³ is achieved. For the calibration of the multi-camera configuration, a double-sided two-level calibration plate (309-15, LaVision GmbH, number of calibration marks: 441) is imaged at multiple shifted, tilted and rotated positions in the volume without the object in place, using a pinhole camera calibration.

Object detection and localization
As pointed out in the previous sections, precise knowledge of the object's location in the camera coordinate system is necessary to calculate a precise depth map. This could be achieved by using the same mounting point for the calibration plate and the stepped cylinder, e.g. at the Z = 0 location, and aligning the cylinder's central axis with the Y-axis. Despite such effort, however, the real object location was still too far off from the modeled CAD location, leading to reconstruction artifacts in preliminary LPT results. In order to overcome this problem, the object's location is determined by the camera system itself. During the manufacturing of the cylinder, no special markers were placed on the surface (which is recommended for the future), but a sufficient number of tracer particles got stuck to the cylinder during adjustment runs of the experiment. Since these particles were stationary during the image acquisition, their 3D positions reconstructed using IPR are taken as supporting points to fit the known CAD geometry to the precise location.
This whole procedure is illustrated in figure 8. Starting from the raw particle recordings (figure 8(a)), a minimum intensity image over the whole time series is calculated (figure 8(b)). This way, the stationary surface tracers can be separated from the moving flow tracers. The minimum intensity images are then used as input for an IPR processing. Since the seeding density of these minimum intensity images is very sparse, the IPR can be configured with very loose camera constraints, so that all particles around the object can be reconstructed without introducing too many ghost particles. The obtained surface particle distribution is illustrated in figure 8(c). For the final determination of the object's location in the measurement system's coordinate system, these particles are used as support points to fit the CAD model of the stepped cylinder to its exact position via an iterative closest point algorithm [44]. Finally, the depth maps for each camera are calculated as outlined in section 2.1. The depth map for one of the six cameras is shown in figure 8(d).
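A minimal point-to-point ICP sketch for this registration step is given below: points sampled from the CAD surface are rigidly fitted to the reconstructed stationary tracers, with the standard Kabsch/SVD solution for the rigid transform; the sampling of the CAD surface and any outlier rejection are left out for brevity.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(model_pts, surface_pts, iters=50):
    """Rigidly fit sampled CAD surface points (model_pts, Nx3) to the
    reconstructed stationary surface tracers (surface_pts, Mx3)."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = model_pts @ R.T + t
        # pair every surface tracer with its closest CAD sample point
        idx = cKDTree(moved).query(surface_pts)[1]
        src, dst = moved[idx], surface_pts
        # optimal incremental rigid transform via SVD (Kabsch)
        src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
        U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
        dR = Vt.T @ U.T
        if np.linalg.det(dR) < 0:                 # avoid reflections
            Vt[-1] *= -1
            dR = Vt.T @ U.T
        dt = dst.mean(0) - dR @ src.mean(0)
        R, t = dR @ R, dR @ t + dt                # accumulate the transform
    return R, t
```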

VSC
VSC applied to the recording without the object reveals that the initial camera pinhole model has spatially varying disparities/calibration errors of up to 5-6 px. Even after a few VSC iterations keeping the pinhole model, the errors remain above 3 px. As mentioned before, the reason is that the optical distortions due to the refractive index changes at the air-glass-water interfaces are not fully accounted for by the standard pinhole model, here implemented with two additional radial and tangential distortion terms. Indicative of this situation is the high residual calibration error (measured versus computed mark positions) of the initial pinhole calibration of more than 1 px, whereas good calibrations typically have residual errors of less than 0.1 px.
After switching to a two-plane third-order polynomial model and a few more VSC iterations with 5 × 5 × 2 subvolumes, the average/maximum disparities reduce to almost perfect values of 0.04/0.19 px. Then, using the actual recording with the object in place, another VSC pass with a single subvolume using particles visible by all six cameras is conducted to check whether the calibration is still accurate. Here, VSC detects and corrects shifts in the camera positions of about 0.1-0.2 px.
This VSC procedure can also be carried out right from the start using the recording with the object, since the object itself occupies only a small part of the measurement volume. With 5 × 5 × 2 subvolumes, only a single y-column in the middle of the volume contains invalid disparity vectors, which can be filtered out by an appropriate local median filter together with a 5%-10% intensity threshold of the disparity peak height relative to the maximum peak height, since outlier disparity peaks are much lower than true ones. The high quality of the final calibration function remains the same as before.
Nevertheless, for this experiment, even standard VSC works well. By using the modified VSC as described in section 2.4, one or two fewer correction passes are needed, since the disparities in front of the object can now be computed as well, which adds stability to the convergence process. The overall processing time increases, however, since the particle triangulation with all camera combinations adds computational complexity.

LPT
The reconstruction is now performed with the proposed object-aware STB method and compared with the default STB implementation without any object knowledge. The investigated flow configuration is a jet impinging on a curved surface. For such a type of flow, Schanz et al [45] proposed a variable time-stepping scheme (VT-STB) with the aim of better capturing the high dynamic velocity range. Such a VT-STB approach is used here as well, next to a more standard single-pass processing strategy. During the VT-STB processing, the complete calculation is split into five passes with decreasing image increments of 5, 3, 2, 1 and 1. In addition, the VT-STB approach is combined with an alternating forward-backward tracking scheme [16], which inverts the temporal direction of the reconstruction with each pass. For the particle detection, an intensity threshold of 10 counts out of 255 counts (8-bit images) is set, and particle candidates for the 3D reconstruction are searched with an allowed triangulation error of 1 px. Four outer and four inner IPR iterations are chosen. Ghost particles are filtered out once their intensity drops below 20% of the average intensity of all reconstructed particles. Furthermore, IPR is set to allow the reconstruction of a particle if it is seen by at least three out of the six cameras, to account for the most obstructed subvolumes (figure 7). Trajectories are initialized over four time steps with a maximum allowed displacement of 20 vox in between frames.

Results
As a first result of the different STB schemes, all reconstructed particle trajectories at the beginning of the time series (time steps 1-15) are presented in figure 9. The standard STB approach without variable timing fails to initially reconstruct any particles in the partially occluded regions of the measurement domain (figure 9(a)). Only by letting STB predict and extend already established trajectories into those areas with multi-pass VT-STB (figure 9(b)) does the void fill up successively. Still, empty regions are visible and the measurement volume is not homogeneously reconstructed. When informing the STB algorithm about the object using the depth maps, the flow field can immediately be reconstructed all around the object without any non-physical voids (figure 9(c)). When additionally employing the variable time-stepping scheme, the trajectories look very similar, but the reconstructed size of the domain can be slightly extended (figure 9(d)). As an additional visualization, an animation of the reconstructed particle trajectories with the novel object-aware Lagrangian particle tracking (OA-LPT) approach is provided in the supplementary material.
For a more quantitative comparison of the tracking capabilities of the different approaches, the number of simultaneously tracked particles over time is plotted in figure 10.
Here, the same trends as already pointed out are observed. Both default STB implementations (blue curves) show noticeably lower particle numbers than the new object-aware STB method (red curves). Even the multi-pass scheme with five passes cannot reach the same level of tracked particles as the single-pass calculation of the object-aware STB reconstruction. The maximum number of tracked particles is found, as expected, for the VT-STB including the object depth maps. A maximum of 30 256 simultaneously tracked particles is achieved, which is 16% higher than with the VT-STB approach without the object information (26 081 tracked particles).
The correct handling of the object's shape information is further confirmed by inspecting two volumetric slices at different heights of the cylinder (figure 11). All in all, the experimental data confirm the correct implementation of the object-aware STB technique under real measurement conditions and showcase a superior reconstruction quality over the standard STB method.

Conclusion and outlook
With the experimental challenge of measuring the complete flow field around arbitrary objects in mind, we introduced a new concept for overcoming current shortcomings of state-of-the-art LPT schemes when handling partially occluded regions. This new concept of OA-LPT involves the determination of depth maps for each camera, which efficiently encode the information on the object's shape and location in space. Especially when dealing with stationary and rigid objects, this approach is very fast, memory efficient and easy to implement. We discussed this practical scenario as a whole, including a brief description of how to obtain the object information, an explanation of the necessary modifications to the particle reconstruction scheme, as well as a presentation of a new method for obtaining consistent VSC results across the different measurement regions.
A performance assessment of the proposed changes is conducted with both synthetic and real experimental data, each featuring a six-camera set-up with a view-blocking cylinder in the middle of the measurement domain. The standard STB implementation fails to reconstruct the majority of particles in the occluded areas when no further information on the object is present. Only by using very loose constraints during the particle triangulation could the complete volume be filled, but at the same time the suppression of ghost particles suffers substantially. By utilizing the introduced depth maps, particles can be reconstructed in any subregion of the complete domain while also minimizing the occurrence of reconstruction artifacts. In general, it is highly recommended for OA-LPT that each part of the measurement volume be visible by at least three cameras to ensure a reliable particle triangulation.
So far, the object handling is only implemented for static, solid objects, where the only major challenge is to accurately position the used CAD geometry in the cameras' coordinate system. By employing a depth map calculation for each time step, the proposed approach can be extended to a time-varying method, which is already under development and will be presented in the future. Furthermore, subsequent studies incorporating more complex geometries than shown in this work are already in preparation, and they will give a deeper insight into how the object's geometrical complexity influences the final reconstruction quality.
Future work on this topic should cover a thorough investigation of how accurately the object must be located, and which object registration method for static or deforming objects may be the most practical one to achieve this, since it was shown that even a slight mismatch in the registration can lead to reconstruction artifacts. OA-LPT also opens up new possibilities for data post-processing. With accurate object information and the complete flow field around the object, binning methods [46] or more advanced data assimilation schemes like VIC# [47] or FlowFit [48] could be improved by employing correct boundary conditions during the conversion step, even enabling the calculation of accurate surface pressure fields. Data assimilation schemes with a proper boundary treatment have recently been proposed [49], and it is envisioned that these methods can be combined in succeeding works.

Figure 1 .
Figure 1. Overview of different approaches to detect the geometrical properties of an arbitrary object for use in Lagrangian particle tracking methods such as Shake-the-Box.

Figure 2 .
Figure 2. Extended Iterative Particle Reconstruction scheme for Shake-the-Box with the incorporation of depth maps to characterize an obstructing object in the measurement domain.

Figure 3 .
Figure 3. Central XZ-plane of the dy-disparity field calculated from the volume self-calibration over 100 images for camera 1 of the synthetic test case, featuring images with a seeding density of 0.01 ppp. (a) Standard volume self-calibration without any information about obstructing objects, (b) modified volume self-calibration with informed triangulation about which subvolumes are visible by which cameras and (c) modified volume self-calibration with informed triangulation and neighborhood consistency correction.

Figure 4 .
Figure 4. Schematic representations of the synthetic experiments. (a) Extent of the measurement domain with indicated position of the cylindrical object. (b) Exemplary depth map for camera 6 used in the new Shake-the-Box implementation. (c) and (d) Color-coded slices at y = 0 represent the number of cameras able to see a specific part of the volume for the case with all six cameras (c) and with four cameras (d).

Figure 5 .
Figure 5. Top-down view onto the XZ projection plane with the reconstructed particles for (a) default Shake-the-Box, (b) Shake-the-Box utilizing the object information and (c) Shake-the-Box utilizing the object information with slightly misaligned (horizontally shifted by 10 vox) depth maps. The position of the cylinder is indicated by the dashed black line.

Figure 6 .
Figure 6. Top-down view onto the XZ projection plane with the reconstructed particles for the synthetic test case with a reduced number of four cameras. (a) Constant particle shift in the direction of the cylinder's rotation axis Y with v = 5 vox. (b) Laminar flow around the cylinder with a free-stream particle shift of u = 5 vox.

Figure 7 .
Figure 7. Sketch of the experimental set-up for the flow measurements around a stepped cylinder. As an inset graphic, the overlapping areas of all six cameras are visualized.

Figure 8 .
Figure 8. Data processing steps preparing the final Shake-the-Box calculations. (a) Raw image of camera 1 (colors inverted for clarity), (b) pre-processed minimum intensity image of camera 1 to separate stationary tracers on the object surface from flow tracers, (c) CAD model of the stepped cylinder fitted to the reconstructed surface tracers and (d) determined depth map for camera 1.

Figure 9 .
Figure 9. Reconstruction results of the default ((a) and (b)) and the modified ((c) and (d)) Shake-the-Box implementation, for both a single-pass ((a) and (c)) and a variable time-step ((b) and (d)) evaluation. Particle trajectories are drawn over 15 consecutive time steps at time step 7 out of 500 and are color coded with respect to their local streamwise velocity.

Figure 10 .
Figure 10. Tracked particles over time for the experimental test case for different Shake-the-Box evaluation strategies: single-pass STB without depth maps, single-pass STB with depth maps, VT-STB without depth maps and VT-STB with depth maps.

Figure 11 .
Figure 11. Close-ups of the reconstruction results (slices of 50 mm thickness) at two different heights for the Shake-the-Box approach utilizing depth maps. (a) Region around the smaller diameter, (b) region around the larger diameter of the cylinder. Particle trajectories are drawn over 49 consecutive time steps at time step 375 out of 500 and color coded with respect to their local streamwise velocity.