Optimal Transport Approaches for Cloud Fusion in 3D Reconstruction

Cloud fusion plays a crucial role in achieving accurate 3D models. This paper introduces two optimal transport-based approaches, Gaussian Fusion and Cyclical Fusion, to showcase the effectiveness of optimal transport in establishing dense correspondence among incoming 3D point clouds. These approaches comprise several important components. Firstly, the fusion correspondences are derived from the optimal transport problem: the core principle of Gaussian Fusion and Cyclical Fusion is to move points along geodesic curves so as to exploit significant local geometric information. Secondly, this paper explores the fundamental concepts underlying Gaussian Fusion (displacement interpolation) and Cyclical Fusion (cyclical monotonicity), which greatly enhance the accuracy and completeness of the reconstruction process, and provides evidence for the uniqueness of displacement interpolation as geodesics on the L2-Wasserstein space and for the rationality of cyclical monotonicity in this context. Finally, the proposed approaches excel in capturing intricate surface details, particularly on small objects, when compared to the original fusion scheme, which often introduces severe artifacts.


Optimal Transport
In mathematics and economics, the study of optimal transportation is commonly referred to as transportation theory or transport theory. The problem was first formalized by the French mathematician Gaspard Monge in the 18th century. About a century and a half later, the Soviet mathematician A.N. Tolstoi conducted an in-depth mathematical analysis of the transportation problem. The field saw significant advances during World War II, thanks to the contributions of Leonid Kantorovich, a Soviet mathematician and economist. As a result, the problem is also known as the Monge-Kantorovich transportation problem [1], in honor of those who posed and solved it.
McCann [2] further explores this problem by introducing displacement interpolation using the quadratic cost in Euclidean space. This interpolation preserves mass while smoothly transitioning between the probability measures of the Monge-Kantorovich problem. To illustrate the Monge-Kantorovich problem, consider the task of reshaping a heap of cargo, represented by the function $f(x)$, into a desired shape at a different location, represented by the function $g(y)$. The cost function $c(x, y)$ plays a crucial role in determining the optimal transport plan $\pi$, which is obtained by minimizing the integral of $c(x, y)$:

$$\inf_{\pi} \int_{X \times X} c(x, y)\, d\pi(x, y),$$

where the infimum runs over all plans $\pi$ that carry $f$ onto $g$.

To elaborate further on displacement interpolation, let us assume that cargo transport has some dynamics, specifically that the transport is time-dependent. The cost function $c^{0,1}(x, y)$ is then associated with a Lagrangian action $\mathcal{A}(\gamma)$ on $X \times X$, defined as follows:

$$c^{0,1}(x, y) = \inf \{ \mathcal{A}(\gamma) : \gamma_0 = x,\ \gamma_1 = y,\ \gamma \in \mathcal{C} \}.$$

Here, $\mathcal{C}$ denotes a class of continuous curves on $X$, and $\gamma$ is a curve with Lagrangian action $\mathcal{A}(\gamma)$, where $\gamma_0 = x$ at the beginning of the curve and $\gamma_1 = y$ at its end. The action $\mathcal{A}(\gamma)$ is defined as:

$$\mathcal{A}(\gamma) = \int_0^1 L(\gamma_t, \dot{\gamma}_t, t)\, dt,$$

where $L$ is the Lagrangian. The infimum of the action functional $\mathcal{A}(\gamma)$ over all such curves is attained by a minimizing, constant-speed curve, commonly referred to as the geodesic. In the context of fusion, displacement interpolation moves points along this geodesic curve.

Gaussian Fusion [3] assumes that a sufficiently small local surface follows a Gaussian distribution. While an explicit solution exists for the geodesic curve between Gaussian measures on the L2-Wasserstein space [2,5], it cannot be applied directly to noisy and corrupted point clouds, because the covariance can be inaccurate when a voxel contains few points. In such cases, an alternative method, the Hitchcock-Koopman formulation, is employed. Optimal transport is accurate in determining dense correspondence because of a substantial invariance property named cyclical monotonicity. Another optimal transport-based method, Cyclical Fusion [4], further exploits this invariance property to accelerate fusion.
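For two equally weighted point sets of the same size, an optimal plan can be taken to be a one-to-one assignment, so the Monge-Kantorovich problem reduces to finding the cheapest permutation. The following Python sketch illustrates this discrete case by exhaustive search under the quadratic cost. The function names and the toy coordinates are illustrative assumptions, not part of Gaussian Fusion or Cyclical Fusion, and the brute-force search is only practical for a handful of points.

```python
from itertools import permutations


def squared_distance(p, q):
    """Quadratic cost c(x, y) = |x - y|^2 between two 3D points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def optimal_assignment(source, target):
    """Exhaustively solve the discrete transport problem for equal weights.

    Returns (best_perm, best_cost), where best_perm[i] is the index of the
    target point matched to source[i]. Intended for very small inputs only.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes equally sized point sets")
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(target))):
        cost = sum(squared_distance(source[i], target[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


# Hypothetical toy clouds: cloud_b is cloud_a shifted slightly and reordered.
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
cloud_b = [(0.1, 1.1, 0.0), (0.1, 0.1, 0.0), (1.1, 0.1, 0.0)]
perm, cost = optimal_assignment(cloud_a, cloud_b)
print(perm, round(cost, 6))
```

In this toy case the cheapest assignment undoes the reordering, matching each point of `cloud_a` to its shifted copy in `cloud_b`.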

3D Reconstruction with RGB-D Sensors
The availability and affordability of RGB-D sensors have significantly improved with the rise of active optical techniques. Numerous research works have leveraged these sensors for real-time reconstruction of indoor scenes. Their limited resolution and accuracy often stem from cost and size constraints. Moreover, pose estimation is constrained by matching accuracy. As a result, accurately reconstructing intricate geometric details using consumer RGB-D sensors remains a complex task. The choice of fusion strategy significantly impacts the quality of the reconstruction. Many methods utilize a back-projection strategy, which varies depending on the surface representation. Two primary representations are commonly used for reconstructing 3D data. The voxel grid representation utilizes the truncated signed distance function, averaging its values during back-projection from the depth map. The surfel/point-based model offers an alternative to voxel-based models. Consequently, fusion by back-projection and averaging often leads to artifacts when reconstructing intricate geometric details. We therefore make two key observations: 1) inaccuracies in back-projection correspondences stem from uncertainties caused by sensor noise; 2) relying solely on averaging makes it difficult to filter out uncertainties in sensor depth and pose.
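The averaging of truncated signed distance values is commonly implemented as a weighted running mean per voxel. The sketch below assumes that standard form; `fuse_tsdf` and the sample values are hypothetical and not taken from any specific system. It also shows how a single outlying sample shifts the fused value, which is the weakness noted in observation 2.

```python
def fuse_tsdf(stored_value, stored_weight, observed_value, observed_weight=1.0):
    """Weighted running average of one voxel's truncated signed distance."""
    new_weight = stored_weight + observed_weight
    new_value = (stored_value * stored_weight
                 + observed_value * observed_weight) / new_weight
    return new_value, new_weight


# Three consistent samples for one voxel (hypothetical values, in meters).
value, weight = 0.0, 0.0
for sample in [0.02, 0.02, 0.02]:
    value, weight = fuse_tsdf(value, weight, sample)
clean_value = value

# One noisy sample is averaged in rather than rejected.
noisy_value, noisy_weight = fuse_tsdf(value, weight, 0.10)
print(clean_value, noisy_value)
```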
The depth map obtained from RGB-D sensors, as illustrated in Figure 1(a), can be transformed into point clouds. Given two input point clouds $A, B \subset \mathbb{R}^3$, where each point $p_i^A \in A$, $p_j^B \in B$ is given by its world coordinates $(x, y, z)$, optimal transport methods seek a dense correspondence between $A$ and $B$. Gaussian Fusion [3] treats this as an optimal transport problem, namely moving points from $A$ to $B$ with minimum cost, and introduces the displacement interpolation approach to determine the new fused point between $A$ and $B$.
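In Euclidean space with the quadratic cost, displacement interpolation moves each point along the straight segment to its matched point. The following Python sketch illustrates this under the assumption that the correspondences are already known. The names `displacement_point`, `fuse_clouds`, and `matches`, and the choice of the midpoint `t = 0.5`, are illustrative assumptions rather than the fusion rule of Gaussian Fusion.

```python
def displacement_point(p_a, p_b, t):
    """Point at time t on the straight segment from p_a to p_b, 0 <= t <= 1."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return tuple((1.0 - t) * a + t * b for a, b in zip(p_a, p_b))


def fuse_clouds(cloud_a, cloud_b, matches, t=0.5):
    """Interpolate every matched pair (i, j), i indexing cloud_a, j cloud_b."""
    return [displacement_point(cloud_a[i], cloud_b[j], t) for i, j in matches]


# Hypothetical clouds and correspondences.
cloud_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
cloud_b = [(2.0, 2.0, 0.0), (0.0, 2.0, 2.0)]
matches = [(0, 1), (1, 0)]
fused = fuse_clouds(cloud_a, cloud_b, matches, t=0.5)
print(fused)
```

Setting `t = 0` returns the matched points of A and `t = 1` those of B, so `t` controls how far the fused point moves along the segment.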

Displacement Interpolation
We follow the concise definitions given by Villani [6] and Takatsu [5]. First, we present the definition using Borel probability measures without loss of generality. Subsequently, we introduce the linear map from one Gaussian measure to another. Let $(X, d)$ be a complete separable metric space. Consider Borel probability measures $\mu_0$ and $\mu_1$ on $X$, both belonging to $P_2(X)$, that is, having finite second moments. These measures fulfil the following condition:

$$\int_X d(x_0, x)^2\, d\mu_i(x) < \infty, \qquad i = 0, 1,$$

for some (and hence any) $x_0 \in X$. Consider a transportation plan $\pi$ between $\mu_0$ and $\mu_1$ on the product space $X \times X$. It is worth noting that the marginals of $\pi$ are $\mu_0$ and $\mu_1$:

$$\pi[E \times X] = \mu_0[E], \qquad \pi[X \times E] = \mu_1[E].$$

The condition holds for all Borel sets $E \subset X$. The L2-Wasserstein distance $W_2(\mu_0, \mu_1)$ between $\mu_0$ and $\mu_1$ in $P_2(X)$ is defined as follows:

$$W_2(\mu_0, \mu_1) = \left( \inf_{\pi \in \Pi(\mu_0, \mu_1)} \int_{X \times X} d(x, y)^2\, d\pi(x, y) \right)^{1/2}.$$

Here $\Pi(\mu_0, \mu_1)$ is the collection of transportation plans between $\mu_0$ and $\mu_1$, and a plan attaining the infimum is called optimal. $W_2$ serves as the distance function on $P_2(X)$, while $(P_2(X), W_2)$ is referred to as the L2-Wasserstein space over $X$. For Euclidean space, optimal transport plans can be characterized by push-forward. Consider a measurable map $T: \mathbb{R}^d \to \mathbb{R}^d$. The push-forward $T_\# \mu_0$ of a Borel probability measure $\mu_0$ on $\mathbb{R}^d$ is defined as follows:

$$(T_\# \mu_0)[E] = \mu_0[T^{-1}(E)].$$

This holds for all Borel sets $E \subset \mathbb{R}^d$.
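As a small numerical illustration of the L2-Wasserstein distance, consider two empirical measures on the real line with the same number of equally weighted samples. In this one-dimensional special case the optimal plan matches the samples in sorted order, so the distance can be computed directly. The function name `wasserstein2_1d` and the sample values are illustrative assumptions; the sketch does not cover the 3D setting of the fusion methods.

```python
import math


def wasserstein2_1d(xs, ys):
    """L2-Wasserstein distance between two equally weighted 1D samples.

    In one dimension the sorted (monotone) matching is optimal for the
    quadratic cost, so no optimization is needed.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("expected two non-empty samples of equal size")
    total = sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys)))
    return math.sqrt(total / len(xs))


# The second sample is the first one shifted by 3 and shuffled.
distance = wasserstein2_1d([0.0, 1.0, 2.0], [3.0, 5.0, 4.0])
print(distance)
```

A pure translation by 3 gives a distance of 3, as expected for a rigid shift of the whole measure.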
Corollary [3] states that displacement interpolation can be represented as geodesics on the L2-Wasserstein space $(P_2(X), W_2)$ (see Figure 2). Here, $\mu_0$ and $\mu_1$ are Borel probability measures belonging to $P_2(X)$, the space of probability measures with finite moments of order 2 on a complete, separable, locally compact metric space $(X, d)$. The corollary establishes that, for a continuous curve $(\mu_t)_{0 \le t \le 1}$ in $P_2(X)$ joining $\mu_0$ to $\mu_1$, the following two properties are equivalent. First, $\mu_t$ is the distribution of $\gamma_t$ for each $t$, where $(\gamma_0, \gamma_1)$ is an optimal coupling and $\gamma$ is a random geodesic with constant speed. Second, $(\mu_t)_{0 \le t \le 1}$ is a geodesic curve in the space $(P_2(X), W_2)$. Corollary [3] follows from the displacement interpolation theorem and its corollaries [6], which hold for $p > 1$. It should be noted that the geodesic curves in $P_p(X)$ differ for different $p > 1$. In other words, geodesics in the L2-Wasserstein space are not the same as those in the Lp-Wasserstein space for $p > 2$. Furthermore, [6] establishes the uniqueness of displacement interpolation: the displacement interpolation is unique if the optimal transference plan is unique.
Corollary [6] (Uniqueness of displacement interpolation). If there exists a single optimal transportation plan $\pi$ between $\mu_0$ and $\mu_1$, and if, $\pi(dx_0\, dx_1)$-almost surely, $x_0$ and $x_1$ are joined by a unique shortest curve, then there is a unique displacement interpolation $(\mu_t)_{0 \le t \le 1}$ joining $\mu_0$ to $\mu_1$.
The proof of Corollary [6] (Uniqueness of displacement interpolation) is closely connected to the original displacement interpolation theorem [6]. According to that theorem, a displacement interpolation takes the form $\mu_t = (e_t)_\# \Pi$, where $e_t$ denotes evaluation at time $t$, $\Pi$ is a probability measure on shortest curves, and $\pi := (e_0, e_1)_\# \Pi$ is an optimal transportation plan. By assumption, there exists only one such $\pi$. Let $Z$ be the set of pairs $(x_0, x_1)$ joined by more than one minimizing curve; the assumption states that $\pi[Z] = 0$. For $(x_0, x_1) \notin Z$, there exists a unique geodesic $\gamma = S(x_0, x_1)$ joining $x_0$ to $x_1$. Therefore, $\Pi$ must coincide with $S_\# \pi$.
The proof of Corollary [3] (Displacement interpolation as geodesics on the L2-Wasserstein space) is closely tied to Property (ii) of the displacement interpolation theorem [6]. Following [6], let $\mathcal{A}^{s,t}(\gamma)$ be the action of the curve $\gamma$ from time $s$ to time $t$,

$$\mathcal{A}^{s,t}(\gamma) = \sup_{N \in \mathbb{N}} \; \sup_{s = t_0 < t_1 < \dots < t_N = t} \; \sum_{i=0}^{N-1} \frac{d(\gamma_{t_i}, \gamma_{t_{i+1}})^2}{t_{i+1} - t_i},$$

and let the associated cost be

$$c^{s,t}(x, y) = \frac{d(x, y)^2}{t - s}.$$

The hypotheses on local compactness and on the cost are satisfied for this particular action. The hypothesis of local compactness is used to demonstrate the coercivity of the action. The key point here is that these assumptions hold.
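The geodesic property can be checked numerically in the simplest Gaussian case. For one-dimensional Gaussian measures, the L2-Wasserstein distance and the displacement interpolation have well-known closed forms: the distance combines the differences of means and of standard deviations, and the interpolation is linear in both. The Python sketch below uses these closed forms to verify that the curve has constant speed. The function names and parameter values are illustrative assumptions, and the one-dimensional setting is a simplification of the local Gaussian model used in Gaussian Fusion.

```python
import math


def w2_gaussian_1d(m0, s0, m1, s1):
    """L2-Wasserstein distance between N(m0, s0^2) and N(m1, s1^2)."""
    return math.hypot(m0 - m1, s0 - s1)


def interpolate_gaussian_1d(m0, s0, m1, s1, t):
    """Mean and standard deviation of the displacement interpolation at t."""
    return (1.0 - t) * m0 + t * m1, (1.0 - t) * s0 + t * s1


# Hypothetical endpoints: N(0, 1^2) and N(3, 5^2).
m0, s0, m1, s1 = 0.0, 1.0, 3.0, 5.0
total = w2_gaussian_1d(m0, s0, m1, s1)

# Distance between two intermediate measures on the curve.
ma, sa = interpolate_gaussian_1d(m0, s0, m1, s1, 0.25)
mb, sb = interpolate_gaussian_1d(m0, s0, m1, s1, 0.75)
partial = w2_gaussian_1d(ma, sa, mb, sb)
print(total, partial)
```

The distance between the measures at `t = 0.25` and `t = 0.75` is half the total distance, which is the constant-speed geodesic property stated in the corollary.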
In summary, Gaussian Fusion leverages the solution of the optimal transport problem to obtain accurate dense correspondence. The approach moves points from cloud A to cloud B along geodesic curves, using displacement interpolation as geodesics on the L2-Wasserstein space. However, it remains to explain why the solution of the optimal transportation problem is suitable for establishing dense correspondence between point clouds. We therefore provide further insight into the geometric property known as cyclical monotonicity.

Cyclical Monotonicity
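First, we give the definition of cyclical monotonicity following [1]. A set of correspondences is cyclically monotone with respect to the cost $c$ if, for any $N$ and any correspondences $(p_1^A, p_1^B), \dots, (p_N^A, p_N^B)$ in the set,

$$\sum_{i=1}^{N} c(p_i^A, p_i^B) \le \sum_{i=1}^{N} c(p_i^A, p_{i+1}^B), \qquad p_{N+1}^B = p_1^B.$$

For the quadratic cost and $N = 2$, this reduces to $(p_2^A - p_1^A) \cdot (p_2^B - p_1^B) \ge 0$, that is, the angle between the vectors $(p_2^A - p_1^A)$ and $(p_2^B - p_1^B)$ in $A, B \subset \mathbb{R}^3$ remains acute. Conversely, if the angle formed by these vectors is obtuse, the correspondences become distorted (see Figure 2). In the special case where $(p_2^A - p_1^A) \cdot (p_2^B - p_1^B) = 0$, the two vectors are either orthogonal or degenerate, with $p_2^A = p_1^A$ or $p_2^B = p_1^B$. This geometric constraint, which holds for all pairs of correspondences and any $N$, shares similarities with the rotation-invariant constraint but encompasses wider criteria. The solution of the Monge-Kantorovich problem satisfies this strong geometric constraint, which makes it well suited for determining dense correspondence in 3D point clouds. However, while the solution of the optimal transport problem always satisfies cyclical monotonicity, the converse is not necessarily true. To accelerate fusion, Cyclical Fusion employs the Cyclical Monotonicity Verification Scheme instead of directly solving the computationally burdensome Monge-Kantorovich problem.

[Algorithm: Cyclical Monotonicity Verification Scheme. Only the closing steps of the listing are recoverable: a swap of two adjacent entries inside a conditional, the end of a for loop, an increment of the loop counter, and the end of the enclosing while loop.]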
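The pairwise form of the constraint under the quadratic cost can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the authors' Cyclical Monotonicity Verification Scheme, whose full listing is not available here: it only tests the two-correspondence condition, which is necessary but not sufficient for full cyclical monotonicity, and it repairs violations by swapping targets. All function names and the toy data are hypothetical.

```python
def sub(u, v):
    """Component-wise difference u - v."""
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def violates_pairwise(a1, b1, a2, b2):
    """True if (a1, b1) and (a2, b2) form an obtuse angle (N = 2 violation)."""
    return dot(sub(a2, a1), sub(b2, b1)) < 0.0


def repair_by_swapping(cloud_a, cloud_b, match, max_passes=100):
    """Swap the targets of violating pairs until no pairwise violation remains.

    match[i] is the index in cloud_b assigned to cloud_a[i]. Each swap of a
    violating pair strictly lowers the total quadratic cost, so the loop
    terminates; max_passes is only a safety bound.
    """
    match = list(match)
    for _ in range(max_passes):
        swapped = False
        for i in range(len(match)):
            for j in range(i + 1, len(match)):
                if violates_pairwise(cloud_a[i], cloud_b[match[i]],
                                     cloud_a[j], cloud_b[match[j]]):
                    match[i], match[j] = match[j], match[i]
                    swapped = True
        if not swapped:
            break
    return match


# Hypothetical example: the initial matching crosses the two points.
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
repaired = repair_by_swapping(cloud_a, cloud_b, [1, 0])
print(repaired)
```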

3D Reconstruction and Experimental Results
Finally, we present the experimental results of our optimal transport methods, Gaussian Fusion and Cyclical Fusion. Our objective is to achieve accurate reconstruction of real objects using a cost-effective RGB-D sensor. For a comprehensive comparison with other methods, please refer to Table 1 and Figure 4. The ground-truth model of the artificial skull is obtained with an industrial scanner to enable quantitative evaluation, which covers both accuracy and completeness. Completeness is evaluated based on the average distance between the reconstructed surface and the ground truth, where lower values indicate better results. Accuracy, on the other hand, is evaluated by measuring the average distance between the ground truth and the reconstructed surface, with lower values indicating higher accuracy. Visual results are presented for both of our techniques, Gaussian Fusion and Cyclical Fusion. Under the same inputs, our method consistently achieves the highest accuracy score, indicating superior geometric detail and overall preservation of the global shape, as depicted in Table 1 and Figure 4.
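Average surface distances of this kind are often approximated on sampled points by a directed mean nearest-neighbor distance. The Python sketch below assumes that approximation; the function name and the toy points are hypothetical, and which direction corresponds to accuracy and which to completeness follows the evaluation protocol of the experiments, which is not restated here. The brute-force search is meant for small point sets only.

```python
import math


def mean_nearest_distance(source, reference):
    """Mean distance from each source point to its nearest reference point."""
    if not source or not reference:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for p in source:
        total += min(math.dist(p, q) for q in reference)
    return total / len(source)


# Hypothetical sampled points from a reconstruction and a ground-truth scan.
reconstruction = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0)]

forward = mean_nearest_distance(reconstruction, ground_truth)
backward = mean_nearest_distance(ground_truth, reconstruction)
print(forward, backward)
```

The two directions generally differ, which is why the evaluation reports two separate numbers.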

Conclusion
In this study, we introduce novel approaches to 3D reconstruction by incorporating optimal transport with the L2-Wasserstein distance. To address the limitations of back-projection fusion methods, our method utilizes displacement interpolation. We demonstrate the effectiveness of optimal transport theory in determining the 3D relationships among real-world point clouds, leveraging its substantial geometric constraint known as cyclical monotonicity. Our experimental results highlight

Figure 1. RGB-D sensor (Intel L515) output and reconstruction results: (a) depth map with object distance to pixel; (b) RGB map; (c) reconstruction results (fusion of multiple frames, Gaussian Fusion).