Real-time 360-degree view for the operator of milliAmpere 2

In the evolving domain of autonomous marine operations, accurate perception and representation of the surrounding environment are crucial for safe and effective operation. This paper addresses this issue by developing and testing a near real-time 360-degree bird's eye view system for situations where the ferry, milliAmpere 2, has to be manually controlled by a local operator onboard. The goal was to aid the operator during the critical docking phase by displaying the surrounding area of the ferry from a bird's eye view. The bird's eye view was produced by applying inverse perspective mapping to the undistorted images from the 8 cameras onboard. The system was implemented in Python and aimed for a run-time of less than 200ms. This goal was reached during the initial phase of the work; however, during live testing, only near real-time performance was achieved. Despite some shortcomings, the operators found the system to be a "useful additional assistance" during the docking process.


Introduction
The growing demand for sustainable transportation in cities has posed challenges for building bridges over rivers and channels. Trondheim's municipality faced similar obstacles when proposing a bridge construction project in 2016. As a result, researchers at NTNU embarked on a project to find alternative solutions, leading to the development of the milliAmpere 2, an autonomous passenger ferry.
Inverse perspective mapping (IPM) is a well-studied technique in the automotive industry for lane tracking and obstacle avoidance [1,2,3,4]. While IPM has been primarily explored in that domain, its application in the maritime industry is still emerging. Some notable studies have investigated the use of IPM in different maritime contexts. In [5], a path planning and navigation method for autonomous vessels using a convolutional neural network (CNN) and IPM is presented. A similar method is proposed in [6] to obtain 3D ship detection and tracking.
One of the main challenges with IPM is the progressively lower pixel density for objects further away from the camera. Interpolation can reduce this problem by filling empty pixels with RGB values from surrounding pixels. Image interpolation has been researched in many different fields: medical imaging, remote sensing, target detection and recognition, radar imaging, forensic science, and surveillance systems [7]. In addition to covering such a wide range of fields, a large number of different interpolation methods have been developed. In [8], 7 different interpolation methods for medical imaging were compared, covering traditional interpolation methods such as nearest neighbor, linear interpolation, and Gaussian interpolation with different kernel sizes. In recent years, more complex and modern interpolation methods have been developed. One of these is Super-Resolution, which is able to enhance low-resolution images or video frames by increasing their spatial resolution [7].
This work is based upon the Master's thesis [9] and addresses the challenges of implementing a real-time 360-degree visualization system for use on the milliAmpere 2 autonomous ferry. A proof-of-concept system is demonstrated on milliAmpere 2 with operators in a scenario comparable to the ferry's intended use [10] and evaluated based on both real-time performance and operator feedback. The research implements the IPM method described by [1] and extends it to multiple cameras and a new environment.
The outline of this paper is as follows. Section 2 introduces the theory behind IPM, section 3 details the image processing pipeline, while section 4 presents the various optimization steps implemented to achieve real-time performance. The experimental setup is described in section 5, with run-time and visual results in sections 6 and 7. Section 8 then concludes the work.

Camera model
A camera is a mapping between the 3D world and a 2D image [11]. In this work, the pinhole camera model was chosen as the primary camera model. It maps a point in 3D space onto a 2D plane. This plane is often called the image plane or the focal plane and is represented by the frame $F_I$. The position of the point $x_I$ is determined by the intersection between the image plane and the line drawn from the point $X_c$ to the camera center, through the image plane. From the geometry of the camera model, the relationship between a point $X_c$ and its pixel coordinates $u$ can be expressed in homogeneous coordinates as

$$\lambda \tilde{u} = K X_c, \qquad K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}, \tag{1}$$

where $K$ is the intrinsic camera matrix with focal lengths $f_x$, $f_y$ and principal point $(c_x, c_y)$, $\tilde{u} = [u, v, 1]^T$, and $\lambda$ is the depth of the point.
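As a concrete illustration, a minimal Python sketch of this projection (the values in $K$ are made-up examples, not the calibration of the onboard cameras):

```python
import numpy as np

# Illustrative intrinsics: focal lengths fx, fy and principal point (cx, cy).
K = np.array([[800.0,   0.0, 612.0],
              [  0.0, 800.0, 512.0],
              [  0.0,   0.0,   1.0]])

def project(X_c: np.ndarray) -> np.ndarray:
    """Project a 3D point in the camera frame onto the image plane."""
    u_h = K @ X_c              # homogeneous pixel coordinates, lambda * [u, v, 1]
    return u_h[:2] / u_h[2]    # divide out the depth lambda

print(project(np.array([1.0, 0.5, 10.0])))  # -> [692. 552.]
```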

Inverse perspective mapping
Inverse perspective mapping is the problem of determining the world coordinate position of a point based on its pixel location in an image. This is an underdetermined problem, where $X_v$, $Y_v$, $Z_v$ are unknown and $u$, $v$ are known. In a maritime context, the ocean surface can be modeled as a flat plane with a known elevation from which the pixels originate. The plane $\Pi_{sea}$ is assumed to have the elevation $Z_v = 0$. A method to solve the IPM is described in [11].
With the flat-plane assumption, the third column of the projection matrix $P = K[R \mid t]$ can be dropped, leaving a $3 \times 3$ matrix $P'$, and the solution is given by Equation 2:

$$\begin{bmatrix} X_v \\ Y_v \\ 1 \end{bmatrix} = \frac{1}{\lambda} P'^{-1} \tilde{u}, \tag{2}$$

where the scale factor $\lambda$ is fixed by normalizing the third component to one. For this to be a valid solution, $P'$ has to be invertible.
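A minimal sketch of this inversion, assuming $K$, $R$, $t$ map vessel-frame points into the camera frame (an illustrative convention, not necessarily the one used onboard):

```python
import numpy as np

def ipm_point(u: float, v: float, K, R, t) -> np.ndarray:
    """Map pixel (u, v) to the point (X_v, Y_v) on the plane Z_v = 0."""
    P = K @ np.hstack([R, t.reshape(3, 1)])   # 3x4 projection matrix K[R | t]
    P_prime = P[:, [0, 1, 3]]                 # drop the Z column since Z_v = 0
    X_h = np.linalg.inv(P_prime) @ np.array([u, v, 1.0])
    return X_h[:2] / X_h[2]                   # normalize the third component
```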

Interpolation
Interpolation is the process of estimating intermediate values in a signal at continuous positions from a set of discrete samples [12]. The three most commonly used interpolation methods are nearest neighbor, bilinear (also called linear), and bicubic. For image data, nearest neighbor interpolation considers the four pixels surrounding the target position and assigns the value of the one closest to it. Bilinear interpolation also considers the four closest pixel values, but weights them according to their distance from the target pixel.
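As an illustration, a minimal bilinear lookup on a regular image grid (no bounds checking; a sketch, not the onboard implementation):

```python
import numpy as np

def bilinear(img: np.ndarray, x: float, y: float) -> np.ndarray:
    """Interpolate img at the continuous position (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))   # top-left of the 2x2 neighborhood
    wx, wy = x - x0, y - y0                       # distance-based weights
    return ((1 - wx) * (1 - wy) * img[y0,     x0    ] +
            wx       * (1 - wy) * img[y0,     x0 + 1] +
            (1 - wx) * wy       * img[y0 + 1, x0    ] +
            wx       * wy       * img[y0 + 1, x0 + 1])
```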

Image processing pipeline
MilliAmpere 2 is equipped with 8 electro-optical cameras of the type FLIR Blackfly S 50-S5C with a 6mm lens. Images are provided at a rate of 5Hz with a resolution of 1224px × 1024px. Each camera has a field of view (FOV) of 77.8°. However, due to the cameras' positions and orientations, blind zones on the port and starboard sides prevent full near-range coverage.

Image Processing Pipeline
The main steps of the image processing pipeline are presented in Figure 2. IPM was solved according to [13]. The system solves Equation 2 to find $X_v$ and $Y_v$, under the assumption that the target plane $\Pi_{sea}$ has $Z_v = 0$. After calculating the $F_v$ coordinates for all pixels in the image, a filter removed all points further than 10m from the center of $F_v$. The interpolation method used in this work is custom-made and inspired by [14]. The algorithm can be described in four steps: (i) triangulate the irregular input data (pixel positions) using Delaunay triangulation; (ii) for every point in the new grid, search the triangulation to identify the simplex (a triangle) that encompasses the point; (iii) calculate the barycentric coordinates of each new grid point relative to the vertices of the surrounding simplex; (iv) compute an interpolated value for each grid point via linear interpolation, using the barycentric coordinates as weights on the RGB values at the three vertices of the enclosing simplex.
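The four steps map naturally onto SciPy's Delaunay utilities; the following is a minimal sketch of the same idea (shapes and names are illustrative, and the onboard interpolation is the custom implementation, not this one):

```python
import numpy as np
from scipy.spatial import Delaunay

def barycentric_interpolate(points, values, grid):
    """Linearly interpolate scattered RGB samples onto a regular grid.

    points: (N, 2) irregular sample positions, values: (N, 3) RGB samples,
    grid: (M, 2) target grid positions. Returns (M, 3) interpolated RGB.
    """
    tri = Delaunay(points)                 # (i) triangulate the scattered input
    simplex = tri.find_simplex(grid)       # (ii) enclosing triangle per grid point
    inside = simplex >= 0                  # points outside the hull stay black

    # (iii) barycentric coordinates via the stored affine transforms
    trans = tri.transform[simplex[inside]]             # (K, 3, 2)
    delta = grid[inside] - trans[:, 2]
    bary2 = np.einsum('kij,kj->ki', trans[:, :2], delta)
    bary = np.c_[bary2, 1.0 - bary2.sum(axis=1)]       # (K, 3) weights

    # (iv) weighted sum of the RGB values at the triangle's three vertices
    verts = tri.simplices[simplex[inside]]             # (K, 3) vertex indices
    out = np.zeros((grid.shape[0], 3))
    out[inside] = np.einsum('ki,kij->kj', bary, values[verts])
    return out
```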
After applying IPM to the images from all 8 cameras, the next step was to stitch them together. The IPM provides the position of each pixel in the captured images, represented as points $X_v$ in the vessel frame. To facilitate transformation into an image of arbitrary resolution, the points $X_v$ were normalized, mapping them to the range $[-1, 1]$. The final step involved multiplying the normalized points by the "intrinsic" matrix $K_{IPM}$, where $res_x$ and $res_y$ denote the resolution of $I_{IPM}$, to obtain the pixel coordinates $u_{IPM}$.
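One plausible form of $K_{IPM}$, consistent with mapping the normalized range $[-1, 1]$ onto pixel indices (the exact matrix is not reproduced here, so this form is an assumption):

```python
import numpy as np

res_x = res_y = 1500  # example resolution of the IPM image I_IPM

# Assumed form: scale [-1, 1] to [0, res] and shift the origin to the corner.
K_IPM = np.array([[res_x / 2, 0.0,       res_x / 2],
                  [0.0,       res_y / 2, res_y / 2],
                  [0.0,       0.0,       1.0      ]])

def to_pixels(X_norm: np.ndarray) -> np.ndarray:
    """Map normalized ground-plane points (N, 2) to pixel coordinates (N, 2)."""
    X_h = np.c_[X_norm, np.ones(len(X_norm))]  # homogeneous coordinates
    return (K_IPM @ X_h.T).T[:, :2]
```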
Given the cameras' FOV and positions, some of the cameras observe the same area. Due to different viewing angles and local light conditions, this overlap creates unwanted noise in the IPM image and was therefore removed.

Code optimization
To function in real-time operation, the system must be able to generate bird's eye view images at a rate no slower than that of the incoming images. For milliAmpere 2, the cameras capture images at a rate of 5Hz, which imposes a run-time requirement of less than 200ms per IPM image. This section describes the equipment used during the first phase of the research as well as the optimization steps and methods applied, with their respective run-times. For detailed descriptions, see [9].

Code optimization setup
The computer used for this phase ran Ubuntu 20.04 and ROS 2 Foxy Fitzroy on an Intel Core i7-8700 CPU with a base clock frequency of 3.20GHz, a max clock frequency of 4.60GHz, 6 cores, and 12 threads. The run-time data presented in section 4.2 were captured using cProfile on the main() function. The presented run-times are the average run-time of the function make_BEW(), which produces the IPM image. The timer starts after the make_BEW() function has received all 8 undistorted images, and ends when the IPM image is complete. Overhead, such as undistortion, is excluded.
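For reference, this kind of measurement can be reproduced with Python's standard profiler; a minimal sketch, assuming main() is the pipeline entry point:

```python
import cProfile
import pstats

# Profile the entry point, then report cumulative statistics restricted to
# make_BEW(), whose per-call average is the run-time reported here.
cProfile.run('main()', 'profile.out')
pstats.Stats('profile.out').sort_stats('cumulative').print_stats('make_BEW')
```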

Optimization steps and results
The initial performance benchmark for this paper was bilinear interpolation at a resolution of 3000px × 3000px, with an average run-time of 34s. During the initial phase of the research, multiple interpolation methods were tested and benchmarked.

The most important methods and their run-times are presented in Figure 3, where the first method was based on linear interpolation at a resolution of 3000px × 3000px. Method 2 reduced the resolution to 1500px × 1500px and switched to nearest neighbor interpolation. In method 3, the center of the image was filled with black pixels and removed from the interpolation process. Method 4 introduced Delaunay triangulation, which was combined with parallel processing in method 5. Finally, method 6 introduced the custom interpolation method described in section 3. This reduced the run-time from the original 34.0s for method 1 to 0.6s for method 6.

Additional optimization was then conducted to achieve the required run-time of < 200ms, with the results shown in Figure 4. From this baseline, step 1 moved the image masking and distance checking to program initialization. Step 2 removed pixels that overlap between two cameras from the process. In step 3 the final, 8th camera was introduced, while step 4 optimized the indexing function in the interpolation. Step 5 did the same for the function that extracts relevant areas from individual images. Finally, step 6 introduced in-place processing for certain calculations, replacing the previous copy-based methods; a minimal sketch of steps 1 and 6 is given below. This reduced the run-time from 617ms at the baseline to 179ms, yielding real-time performance. Step 7 was then performed as an additional test, with the resolution reduced from 1500px × 1500px to 1100px × 1100px, further reducing the run-time to 130ms.
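As an illustration of steps 1 and 6, a minimal sketch under assumed data shapes (the class and variable names are hypothetical):

```python
import numpy as np

class BEWBuilder:
    """Masking and distance checks run once at initialization (step 1);
    per-frame results are written into a preallocated buffer (step 6)."""

    def __init__(self, pixel_positions: np.ndarray, max_dist: float = 10.0):
        # Precompute once which IPM points lie within 10m of the vessel center.
        self.keep = np.linalg.norm(pixel_positions, axis=1) <= max_dist
        self.buffer = np.empty((int(self.keep.sum()), 3), dtype=np.float32)

    def update(self, rgb_values: np.ndarray) -> np.ndarray:
        # Write into the preallocated buffer instead of allocating a new copy.
        np.copyto(self.buffer, rgb_values[self.keep])
        return self.buffer
```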

Experimental setup
The ferry milliAmpere 2 was used to capture images, with all 8 FLIR Blackfly S 50-S5C optical cameras in use. The onboard computer was equipped with an AMD EPYC 7313P "MILAN" CPU with a base clock frequency of 3.0GHz, a max clock frequency of 3.7GHz, 16 cores, and 32 threads [15]. The code ran in a Docker container using Ubuntu 22.04 and ROS 2 Humble Hawksbill. The experiment took place on May 10, 2023, in the canal between Fosenkaia and Ravnkloa in Trondheim, Norway, and the system was tested with two operators. The experiment involved crossing the canal twice with four dockings and had two main goals: assessing the real-time performance of the system onboard milliAmpere 2 and evaluating its usefulness for the operator during docking operations.

Run-time results
The run-times of the visualization system during live testing are presented in Figures 5 and 6, using a resolution of 1100px × 1100px for the bird's eye view. For the sake of fast and convenient prototyping, the system was implemented in Python. At the onset of the project, we assumed that a run-time of less than 200ms could be achieved, which would be sufficient for the purpose of using the bird's eye view as a navigation aid. In the pre-recorded setting, this was indeed achieved. In the real-time experiment it did not hold, as shown by the run-time histogram in Figure 5. There are several reasons for this: the CPU clock speed on milliAmpere 2 is about 20% lower than that of the computer used for code optimization, and the integration of the code into the onboard ROS 2 system resulted in increased overhead.
Analysis of Figure 5 reveals two peaks, centered around 250ms and 215ms. This is supported by the data in Figure 6, which shows two distinct periods: prior to the 35-minute mark, the average run-time is around 250ms, followed by a noticeable reduction to approximately 215ms thereafter. The exact reasons for these fluctuations are challenging to determine due to limited data on concurrent systems and potential data variations. However, it is clear that the performance of the system is significantly affected by concurrent processes and programs. For further details, see [9].

Visual accuracy and operator usefulness
Accurate camera calibration, both intrinsic and extrinsic, is crucial for optimal bird's eye view images. Unfortunately, due to time constraints, precise calibration was not achieved during the research. Instead, the extrinsic parameters relied on the CAD model of milliAmpere 2. The consequences of this inaccurate calibration are evident in various bird's eye view examples. In Figure 7, for instance, the straight lines on the dock appear discontinuous, exhibiting abrupt steps when crossing the edges of the cameras' FOV. The position and size of the inserted ferry visualization were determined manually, which resulted in slight inaccuracies in scale. Nonetheless, operator feedback emphasized that it was a very positive addition to the system.
Despite the run-time being, on average, above the real-time goal of 200ms, the system was still considered fast enough to provide useful aid. The slower speed at which the ferry navigates during docking mitigated the impact of the slightly longer run-time. The operators' questionnaire feedback indicated that the refresh rate of the bird's eye view was deemed high enough, but that higher refresh rates would be beneficial, in addition to camera synchronization. A video from the experiment with the bird's eye view and operator footage is available.

Conclusion and future work
In this work, a 360-degree bird's eye view system written in Python for milliAmpere 2 was presented. The run-time goal was not fully achieved on the ferry due to CPU speed and overhead in the autonomy system. Despite some shortcomings, the operators reported that the system provided useful additional assistance during the docking process.

Figure 1. An image of milliAmpere 2 from the side.

Figure 2. The main steps in the image processing pipeline.

Figure 3. Run-times of the tested interpolation methods.

Figure 4. Run-times for the additional optimization steps based on method 6.

Figure 5. Run-time histogram from the live experiment. Each bin represents one millisecond.

Figure 6. Run-times for the entire experiment.