Research on vision control methods based on multi-sensor fusion

This paper focuses on the application scenarios of electric power live line working robots, addressing tasks such as live wire stripping and live wire connection/disconnection. It utilizes Mixed Reality (MR) technology for rapid environment modeling and visual signal acquisition. Building on research into robot contact-force sensing, real-time force feedback, and obstacle avoidance based on the fusion of robot motion paths and environmental data, it establishes a novel electric power live line working robot system centered on dexterous dual arms at the operation end, force-feedback teleoperation controllers, a multi-sensor fusion acquisition system, and robot motion planning and control technology. Through prototyping, a demonstration application is realized: a semi-autonomous electric power live line working robot system that combines remote operation with sensor-based MR for task execution. This teleoperation- and Mixed Reality-based semi-autonomous approach to live-line work is poised to play a significant role in advancing and demonstrating the use of robots in electrical grid operations.


Research Overview
The existing electric power live line working robot systems on the market, including those from State Grid Ruijia, Yijiahe, and State Grid Intelligent, are comprehensively analyzed. Although these systems have been iterated for decades, they have not been deployed at scale. Beyond system stability, the main obstacle is the ease of use of human-computer interaction: without it, robot operation struggles to match the efficiency of manual work [1,2]. Taking the most common task, three-phase wire stripping, as an example, manual operation usually takes about half an hour, whereas a robot usually takes one and a half hours [3].
To improve operating efficiency and the usability of human-computer interaction, effort is needed at both the system and software levels. The design mainly considers two aspects:
• The convenience of human-computer interaction needs to be realized at the software-system level. Wherever the robot can operate autonomously, manual intervention should be limited to confirmation; wherever autonomous judgment is not possible, the task is carried out through master-slave teleoperation [4,5].
• An efficient and engaging training method is established for the robot system, so that front-line employees are willing to accept training in robot use and are genuinely willing to operate the electric power live line working robot system.
Through these measures, operating efficiency is first brought essentially in line with manual operation, and through continuous iteration it can surpass manual operation [6,7]. For positioning control, this paper adopts a two-step approach: vision-guided coarse positioning of the robot, followed by human-robot collaborative teleoperation for fine positioning. This balances the automation of artificial intelligence against the environmental adaptability of human-robot collaboration, improving the robustness and environmental adaptability of the system's control method.

Overall System Planning
The system software mainly includes robot-arm motion control and environmental information acquisition software, mixed reality fusion and modeling software, and teleoperation control software. At the slave station (the robot-arm control terminal), motion control and environmental information acquisition are primarily implemented on ROS (Robot Operating System). ROS is a highly flexible software architecture designed for writing robot software [8]. It encompasses a wide range of tools, libraries, and standardized protocols aimed at simplifying the creation of complex and robust robot behavior across various robot platforms. The motion control component chiefly involves computing the mapping between the robot's joint space and Cartesian space, as well as motion trajectory planning. Additionally, a lidar measures distance information within the working environment by emitting electromagnetic waves. After basic data processing such as filtering and compression, this data is integrated with information from visible-light cameras, infrared sensors, and other acquisition instruments, and then visualized and fused; this software is also implemented within the ROS environment. Throughout this process, multiple data sources are leveraged for effective control and utilization. The ultimate goal of information fusion is to extract more valuable insights by combining the multi-level, multi-faceted information gathered individually by each sensor. Through software development, a clean and straightforward human-computer interaction interface is established that simultaneously displays visible and invisible environmental information within the operator's field of view, producing a multi-dimensional environmental model, as depicted in Figure 3.

The fused information is then subjected to three-dimensional reconstruction to represent three-dimensional objects. The most fundamental step is drawing the outline of three-dimensional objects, using points and lines to construct the outer boundary of the entire object; that is, only boundaries are used to represent the object. This part of the software is typically implemented with the Unity engine. The most common boundary representation of three-dimensional graphics objects is a set of surface polygons enclosing the object's interior. The polygons can exactly define the surface of a polyhedral object; for other objects, a polygon mesh can be embedded to approximate the surface, and the approximation of curved surfaces can be improved by subdividing the surface into smaller polygons. Building upon this, time-efficient texture-mapping modeling can be performed to handle the object's appearance. This not only enhances the level of detail and realism but also provides better spatial cues and reduces the number of visible polygons, thereby improving frame refresh rates and real-time dynamic display in complex scenes. Finally, control buttons, operation points, and other information are overlaid onto the interactive interface, forming a video stream that is transmitted to the control terminal [1-3].
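As a minimal illustration of the boundary representation described above (a sketch in Python rather than the paper's Unity implementation, with a hypothetical cube as example data), a three-dimensional object can be stored as a vertex list plus index-based polygon faces, from which the wireframe edge set follows directly:

```python
def mesh_edges(faces):
    """Collect the unique undirected edges bounding each polygon face."""
    edges = set()
    for face in faces:
        n = len(face)
        for i in range(n):
            a, b = face[i], face[(i + 1) % n]
            edges.add((min(a, b), max(a, b)))  # store edges undirected
    return edges

# A unit cube: 8 vertices and 6 quadrilateral faces (boundary representation).
# Vertex index = 4x + 2y + z for coordinates in {0, 1}.
vertices = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
faces = [
    (0, 1, 3, 2), (4, 5, 7, 6),  # x = 0 and x = 1 faces
    (0, 1, 5, 4), (2, 3, 7, 6),  # y = 0 and y = 1 faces
    (0, 2, 6, 4), (1, 3, 7, 5),  # z = 0 and z = 1 faces
]

edges = mesh_edges(faces)
print(len(vertices), len(faces), len(edges))  # a cube: 8 vertices, 6 faces, 12 edges
```

Subdividing each face into smaller polygons, as the text notes for curved surfaces, only grows these two lists; the representation itself is unchanged.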
In power grid operations such as wire stripping and connection, the remote control mode and the autonomous mode complement each other. On one hand, the robot arm does not yet possess the ability to autonomously plan rational movements in complex power grid environments; under human decision-making and planning, remote control achieves the initial, global-level positioning of the robot arm, preparing it for autonomous operation. On the other hand, executing precise operations by remote control with only video monitoring for assistance places a significant burden on the human operator; autonomous fine-grained operation at the local level compensates for this lack of precision in remote control.

Machine Vision Control Method for Robot Systems
An aerial electric power live line working robot is a robot capable of performing live-line work on high-altitude power distribution lines. It is operated remotely, replacing manual labor in live-line tasks. Compared with traditional manual live-line work, it eliminates the risk to human safety, roughly doubles work efficiency, and achieves complete physical isolation between humans and electricity throughout the process, effectively enhancing the quality and efficiency of live-line work. During aerial live-line work, the robot must accurately identify the cables in order to carry out tasks such as cable gripping, stripping, and hanging. Existing systems generally rely on global laser modeling or visual positioning for cable recognition. However, with global laser modeling, the sensor's inherent accuracy is typically in the range of 3 cm to 5 cm; once system errors are taken into account, 1 cm precision is difficult to achieve, which falls short of the accuracy required for cable gripping, stripping, and hanging. A vision-only approach to cable positioning, especially outdoors under strong light or at night, faces difficulties with interference resistance.

Secondary Branch Gripping and Wire Stripping Position Calculation
In live-line work, the electric power live line working robot must effectively perceive specific objects in the surrounding environment, such as power lines, insulators, and utility poles. It must quickly and accurately acquire a point cloud model of the surroundings to support subsequent operations. However, existing live-line working robots require an operator to manually select two endpoints each for the mainline and branch lines on the point cloud model before the stripping and gripping points can be calculated; these endpoints then serve as prior conditions for the algorithm that computes the mainline stripping points and branch-line gripping points. This manual process demands high accuracy and consistency, is time-consuming, and is cumbersome. To address these limitations, this paper introduces an automatic cable recognition method based on retroreflective markers and lidar, which frees the robot from its dependence on manual operation and enables rapid and effective automatic point selection. The method comprises the following steps:
• Step 1: Two retroreflective markers are attached to the target cable (branch line). The lidar on the electric power live line working robot scans the target cable, and the corresponding point cloud frames are stitched together.
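Point cloud stitching in Step 1 amounts to transforming each lidar frame into a common (robot) coordinate frame and concatenating the points. A minimal sketch, assuming a known yaw-plus-translation pose per frame (the paper does not specify its registration method, and all numbers below are illustrative):

```python
import math

def transform_point(p, yaw, t):
    """Rotate point (x, y, z) about the z-axis by `yaw` radians,
    then translate by t = (tx, ty, tz)."""
    x, y, z = p
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y + t[0], s * x + c * y + t[1], z + t[2])

def stitch(frames):
    """Merge lidar frames, each paired with its (yaw, translation) pose,
    into one point cloud in the robot's coordinate system."""
    cloud = []
    for points, yaw, t in frames:
        cloud.extend(transform_point(p, yaw, t) for p in points)
    return cloud

# Two hypothetical frames observing the same cable from different poses.
frame_a = ([(1.0, 0.0, 2.0), (1.1, 0.0, 2.0)], 0.0, (0.0, 0.0, 0.0))
frame_b = ([(0.0, -1.0, 2.0)], math.pi / 2, (1.05, 0.0, 0.0))
merged = stitch([frame_a, frame_b])
print(len(merged))  # 3 points, all now in the robot frame
```

In practice the per-frame poses come from the robot's kinematics or a registration step; the concatenation itself is as simple as shown.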

Experimental Testing and Results
Figure 4. Photographs of the electric power live line working robot system.
The electric power live line working robot system resulting from the system design consists of the master station, the human-machine interaction system, and the slave station, as shown in Figure 4. The images at top left, bottom left, and right show the master station of the electric power live line working robot, the human-machine interaction system and its interface, and the robot's slave station, respectively. The slave station is equipped with intelligent operating tools and can complete live-line work on all three phases. In this experiment, the operator at the master station grasps the master robot to control the dual-arm slave robot at the slave station. During operation, the 3D image in the helmet is critical, as shown at the bottom of the left image in Figure 4. High-speed WiFi 6 is used for image transmission, keeping the delay of image transfer from slave to master under 330 milliseconds. The system can autonomously reconstruct a virtual environment based on the real surroundings; it not only offers a highly immersive, interactive, multi-sensory experience for operators but also enhances their work efficiency and safety. For the construction of virtual scenes, a novel scene recognition and reconstruction technique is proposed, based on the principles of binocular stereovision using a binocular camera. The method is designed primarily for live-line work scenarios in power distribution networks, focusing on the power lines. It involves image preprocessing, target recognition and extraction, binocular stereo measurement, generation of depth point cloud images, and finally the reconstruction of a three-dimensional model of the target. Electric utility workers can simply wear virtual reality devices to enter the virtual live-line work environment and then use the remote operation system of the robotic arm to perform high-altitude live-line work.
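The binocular stereo measurement step recovers depth from disparity. As a minimal sketch (the paper does not give its exact formulation, and the camera parameters below are hypothetical): for a rectified stereo pair with focal length f (in pixels), baseline B, and disparity d, the depth is Z = f·B/d:

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Depth of a point from a rectified stereo pair: Z = f * B / d,
    with f in pixels, baseline B in metres, disparity d in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return f_px * baseline_m / disparity_px

# Hypothetical values: 1000 px focal length, 12 cm baseline, 40 px disparity.
z = depth_from_disparity(1000.0, 0.12, 40.0)
print(z)  # 3.0 metres
```

Applying this per matched pixel yields the depth point cloud image mentioned above, from which the three-dimensional model of the target is reconstructed.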

Conclusion
In this study, an innovative design for a robot visual control system is developed, incorporating multiple sensors and multimodal fusion, including binocular cameras and lidar. The research focuses on insulation bucket positioning and on robot visual recognition and positioning algorithms, resulting in a mechanism for environmental object recognition and positioning through multimodal fusion technology. This mechanism effectively complements the fine adjustments of master-slave teleoperation.
Additionally, the study covers the construction of the mixed reality system, the interaction and communication protocols between the master and slave stations, the composition of the slave-station control system, and the information transmission mechanisms among the components. The research leads to the design of a remote control system based on 3D scene construction and mixed-reality fusion feedback, which significantly improves operational transparency and system safety.

• Step 2: Range filtering and high-reflectance threshold extraction are performed on the stitched point cloud obtained in Step 1, yielding point cloud clusters at the locations of the retroreflective markers.
• Step 3: Statistical filtering and cluster analysis are conducted on the point cloud clusters extracted in Step 2, and centroids are extracted to obtain the effective retroreflective marker centroids.
• Step 4: The distances between the centroids of the effective retroreflective markers obtained in Step 3 are calculated. Then, considering the positions of the different cables, their proximity and branch connectivity relationships are determined. Finally, each retroreflective marker is matched with the corresponding cable to achieve automatic cable recognition [6-9].
Specifically, in Step 2, after point cloud stitching is completed, each point in the cloud consists of spatial coordinates and reflectance, represented as p(x, y, z, intensity) with p ∈ MatchedPoints_robot. Extracting the point cloud data that simultaneously satisfies p.x ∈ (x_min, x_max), p.y ∈ (y_min, y_max), p.z ∈ (z_min, z_max), and p.intensity > intensity_threshold yields the point cloud clusters at the locations of the retroreflective markers. Here, intensity is the reflectance information, MatchedPoints_robot denotes the stitched point cloud in the robot's coordinate system, (x_min, x_max), (y_min, y_max), and (z_min, z_max) are the value ranges of p(x, y, z) along the three axes of the lidar's coordinate system, and intensity_threshold is the high-reflectance threshold defined for identifying the retroreflective markers.
The clustering analysis in Step 3 proceeds as follows:
o Step 3.1: A point p in space is selected, a kd-tree is used to find the n points nearest to it, the distances from these n points to p are evaluated, and the points whose distances are less than the threshold r are placed into class Q.
o Step 3.2: Another point within class Q is selected and Step 3.1 is repeated until no new points are added to class Q.
o Step 3.3: It is determined whether the number of points in class Q is within the set threshold. If it is, the search is complete; if not, the procedure returns to Step 3.2.
• Step 5: The electric power live line working robot sends the matching results from Step 4 to the visual interface for display. Users can view the matching results through the visual interface; if the automatic calculation results are judged to be incorrect, manual intervention can be initiated to update and recalculate the erroneous points until the data is confirmed.
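A compact sketch of Steps 2 to 4, with brute-force neighbor search standing in for the kd-tree; names such as intensity_threshold follow the text, while the specific ranges, thresholds, and cloud points are illustrative assumptions:

```python
import math

def in_range(v, lo, hi):
    return lo < v < hi

def extract_marker_points(cloud, box, intensity_threshold):
    """Step 2: keep points inside the (x, y, z) ranges whose
    reflectance exceeds the high-reflectance threshold."""
    xr, yr, zr = box
    return [p for p in cloud
            if in_range(p[0], *xr) and in_range(p[1], *yr)
            and in_range(p[2], *zr) and p[3] > intensity_threshold]

def euclidean_clusters(points, r):
    """Step 3: region-growing Euclidean clustering; neighbors within
    radius r join the current class Q (brute force, no kd-tree)."""
    unseen = set(range(len(points)))
    clusters = []
    while unseen:
        queue = [unseen.pop()]
        cluster = []
        while queue:
            i = queue.pop()
            cluster.append(i)
            near = [j for j in unseen
                    if math.dist(points[i][:3], points[j][:3]) < r]
            for j in near:
                unseen.remove(j)
            queue.extend(near)
        clusters.append(cluster)
    return clusters

def centroid(points, idxs):
    n = len(idxs)
    return tuple(sum(points[i][k] for i in idxs) / n for k in range(3))

# Hypothetical stitched cloud p(x, y, z, intensity):
# two retroreflective markers ~0.5 m apart plus low-reflectance clutter.
cloud = [(1.00, 0.0, 2.0, 220), (1.02, 0.0, 2.0, 230),   # marker 1
         (1.50, 0.0, 2.0, 225), (1.52, 0.0, 2.0, 235),   # marker 2
         (3.00, 1.0, 0.5, 40)]                            # clutter
box = ((0.0, 2.0), (-1.0, 1.0), (1.0, 3.0))
marker_pts = extract_marker_points(cloud, box, intensity_threshold=200)
clusters = euclidean_clusters(marker_pts, r=0.1)
centroids = [centroid(marker_pts, c) for c in clusters]
# Step 4: the centroid spacing identifies the marker pair on one cable.
print(len(centroids), math.dist(centroids[0], centroids[1]))
```

The real system would follow this with the proximity and connectivity checks of Step 4 to assign each marker pair to its cable before displaying the result in Step 5.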