Research on Image Recognition Algorithm Technology for Power Line Business Audit

The development of the electric power business is no longer limited to in-person service at the business hall; it can also be carried out over the mobile Internet. At present, when the power supply company reviews business such as name changes, transfers, and reclassification or peak shaving, customers submit their applications through different channels, so the certificate pictures they upload vary widely, which causes considerable trouble for the staff. This paper studies an image recognition algorithm applicable to power business audits that can accurately identify various types of certificate images, laying the groundwork for automating the overall business process. Users can handle the entire electricity business process online through a mobile phone app or a computer, and can learn the main items of each step, the materials to be provided, and the procedures to be completed. Electricity customers do not have to go out; they only need to log on to the Internet to handle all their business, which greatly improves the efficiency of handling various power businesses and saves time.


Introduction
With the transformation of the power business, the company's daily response center receives a large number of business work orders. The main categories are name change, transfer, and reclassification/peak-shaving audits, and work orders to be reviewed usually arrive every hour, so the staff's workload keeps growing. Once an audit is overdue or wrong, there is a risk of receiving complaints. Taking the State Grid Ningbo Yinzhou District Power Supply Company as an example, statistics show that the company processed more than 80,000 transactions in 2017, of which change business accounted for 91.23%. Processing a single change transaction takes about 8 to 10 minutes, and four staff members are needed every day to approve this business; even so, the results are unsatisfactory, efficiency is low, and customer complaints arise easily [1]. At present, the name change, transfer, and reclassification/peak-shaving audit module in the State Grid system has a complicated operation process, with the following main shortcomings: 1) Because a large amount of content must be loaded, staff often face unavoidable waiting time; 2) The website system is affected by the browser and screen resolution, so pages display incompletely and operating efficiency suffers; 3) The uploaded pictures come from many sources, including self-service devices in the business hall and photos taken on users' phones, so their clarity varies greatly; 4) The operation steps are cumbersome and the audit content is extensive, so staff must constantly stare at the numbers in the pictures, and densely packed characters make fatigue and errors likely. In addition, the reviewer must verify whether the certificate number and address on the real estate certificate, application form, housing voucher, purchase contract, and other materials are consistent with the system; but after photocopies or photos of these materials are uploaded, their sizes and angles are inconsistent, the contrast is low, and the tonal range is not obvious, which hinders recognition by the human eye.
Figure 2: Analysis of the real estate certificate information collection system.
In general, the main difficulties are: 1) The sources of the images are diverse and the images differ greatly from one another, which poses a huge challenge to recognition accuracy.
2) The proportion of the image occupied by the area to be identified is inconsistent, which makes locating the identification area difficult. In addition, images may be inverted, twisted, tilted, or affected by glare and highlights; these large variations pose a huge challenge to the accuracy of the classification algorithm.
3) The website system is complex and the amount of data is large, so separating and extracting the key information consumes manpower, and making the simplified audit module stable and reliable takes time to debug.

Design of image recognition algorithm
Text content recognition means recognizing the text information contained in an ordinary picture [4]. Because of limitations imposed by the shooting environment, external light, shooting angle, image compression algorithm, shooting equipment, and other factors, the quality, sharpness, and resolution of the pictures vary greatly. To make the text to be recognized contrast strongly with the background color, the R component is selected to convert the color image to grayscale, and the image is then binarized using a global threshold and a local threshold. 1) Binarize the image: first use the Otsu algorithm to obtain the global threshold T of the entire image, then use the Bernsen method to calculate the grayscale mean Tbn of the current window, and finally use the maximum and minimum grayscale values of the entire image to compute a correction factor b, as in the following formula:
b = C (g2 - g1)
where C is an empirical coefficient, usually taken as 0.12, g2 is the maximum grayscale value in the image, and g1 is the minimum.
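The hybrid thresholding step can be sketched as follows. The paper does not show how T, Tbn, and b are combined into the final per-pixel threshold, so the blend 0.5 * (T + Tbn) + b below is an assumption, as are the window size and the use of plain NumPy in place of an image library:

```python
import numpy as np

def otsu_threshold(gray):
    """Global threshold T: maximise the between-class variance (Otsu)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of class 0
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.argmax(np.nan_to_num(sigma_b)))

def binarize(gray, win=15, C=0.12):
    """Hybrid binarisation: global Otsu threshold T, local Bernsen window
    mean Tbn, and correction factor b = C * (g2 - g1)."""
    T = otsu_threshold(gray)
    b = C * (int(gray.max()) - int(gray.min()))
    pad = win // 2
    padded = np.pad(gray.astype(float), pad, mode="edge")
    out = np.zeros_like(gray)
    h, w = gray.shape
    for i in range(h):
        for j in range(w):
            Tbn = padded[i:i + win, j:j + win].mean()   # local window mean
            # blend of global and local thresholds plus the correction term
            # (the combination rule is not given in the text; this is assumed)
            out[i, j] = 255 if gray[i, j] > 0.5 * (T + Tbn) + b else 0
    return out
```

For dark text on a light background this yields a white background with black glyphs, which the inversion in step 2 then flips.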
2) Black-and-white inversion: invert the image obtained above so that the background is black and the text is white, then use findContours to detect the outer contours of the white pixel blocks in the binary image and extract the contours that satisfy the required aspect ratio and area.
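A sketch of this step follows. Inversion is simply 255 - binary; a BFS flood fill stands in for OpenCV's findContours so the example has no external dependency, and the min_area and aspect-ratio bounds are illustrative values, not the paper's:

```python
import numpy as np
from collections import deque

def character_boxes(binary, min_area=20, ar_range=(0.2, 1.2)):
    """Bounding boxes of white pixel blobs in a black-background binary
    image, filtered by area and width/height ratio."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0, x0] != 255 or seen[y0, x0]:
                continue
            q = deque([(y0, x0)])          # BFS over the 4-connected blob
            seen[y0, x0] = True
            ys, xs = [y0], [x0]
            while q:
                cy, cx = q.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and \
                            binary[ny, nx] == 255 and not seen[ny, nx]:
                        seen[ny, nx] = True
                        q.append((ny, nx))
                        ys.append(ny)
                        xs.append(nx)
            bw, bh = max(xs) - min(xs) + 1, max(ys) - min(ys) + 1
            if len(xs) >= min_area and ar_range[0] <= bw / bh <= ar_range[1]:
                boxes.append((min(xs), min(ys), bw, bh))   # (x, y, w, h)
    return boxes
```

Blobs that are too small (noise specks) or whose shape cannot be a character are discarded, matching the aspect-ratio and area filtering described above.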
3) Character segmentation: in the cropped character image, the characters and the background color are clearly distinguishable, so the colors are inverted and the characters are segmented. 4) Feature extraction: extract the feature vector of each character, namely the gradient distribution features, the gray distribution features, the horizontal projection histogram, and the vertical projection histogram; each character then corresponds to a 1 x 72 feature vector. 5) Neural network training: the training pictures are all obtained by segmenting characters from multiple images; features are then extracted as described above to build the training matrix and label matrix for model training, mainly using a BP multi-layer neural network and a deep learning model for character recognition training. 6) Classification: the classifier trained through the neural network framework classifies the extracted character feature vectors to achieve character recognition [5].
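The feature extraction of step 4 can be sketched as below. The paper states the four feature families and the total length of 72 but not how the dimensions are divided, so the 32 + 16 + 12 + 12 split, the 12 x 12 normalised size, and the zone layout are all assumptions:

```python
import numpy as np

SIZE = 12  # characters normalised to SIZE x SIZE (assumed; no size is given)

def _resize_nn(img, size=SIZE):
    """Nearest-neighbour resize, avoiding an image-library dependency."""
    h, w = img.shape
    ys = (np.arange(size) * h / size).astype(int)
    xs = (np.arange(size) * w / size).astype(int)
    return img[np.ix_(ys, xs)]

def char_features(char_img):
    """1x72 vector: 8-bin gradient orientation histogram per 2x2 zone (32),
    zone-mean grey levels on a 4x4 grid (16), and row/column projection
    histograms (12 + 12)."""
    g = _resize_nn(char_img).astype(float) / 255.0
    # gradient distribution: magnitude-weighted orientation histograms
    gy, gx = np.gradient(g)
    ang = np.arctan2(gy, gx)
    mag = np.hypot(gx, gy)
    half = SIZE // 2
    grad_feats = []
    for zy in range(2):
        for zx in range(2):
            a = ang[zy * half:(zy + 1) * half, zx * half:(zx + 1) * half]
            m = mag[zy * half:(zy + 1) * half, zx * half:(zx + 1) * half]
            hist, _ = np.histogram(a, bins=8, range=(-np.pi, np.pi),
                                   weights=m)
            grad_feats.append(hist)
    grad_feats = np.concatenate(grad_feats)          # 4 zones * 8 bins = 32
    # grey distribution: mean intensity of each cell on a 4x4 grid
    cell = SIZE // 4
    gray_feats = g.reshape(4, cell, 4, cell).mean(axis=(1, 3)).ravel()  # 16
    h_proj = g.sum(axis=1)                           # horizontal projection
    v_proj = g.sum(axis=0)                           # vertical projection
    return np.concatenate([grad_feats, gray_feats, h_proj, v_proj])     # 72
```

The resulting 72-dimensional vectors form the rows of the training matrix fed to the classifier in steps 5 and 6.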

Research on Algorithm Design of Image Recognition System
The criterion of the identification algorithm is the minimum mean square error, that is, minimizing the expected value of the square of the difference e(n) between the ideal signal d(n) and the filter output. Expressing the weight coefficients and the inputs in vector form as W and X(n), the error signal can be written as
e(n) = d(n) - W^T X(n),
and the square of the error is
e^2(n) = d^2(n) - 2 d(n) X^T(n) W + W^T X(n) X^T(n) W.
Taking the mathematical expectation of both sides yields the mean square error
J = E[e^2(n)] = E[d^2(n)] - 2 P^T W + W^T R W,
where the cross-correlation vector is defined as
P = E[d(n) X(n)]
and the autocorrelation matrix as
R = E[X(n) X^T(n)].
Differentiating the mean square error with respect to the weight vector W gives the gradient
grad(n) = dJ/dW = -2P + 2RW.
Setting grad(n) = 0 gives the optimal weight coefficient vector
W_opt = R^{-1} P,
and substituting W_opt back into the mean square error gives the minimum mean square error
J_min = E[d^2(n)] - P^T W_opt.
The basis of this algorithm is the steepest descent method from optimization: the "next moment" weight coefficient vector moves against the gradient,
W(n+1) = W(n) - mu * grad(n).
Replacing the true gradient with the instantaneous estimate obtained from a single sample,
grad_est(n) = -2 e(n) X(n),
the Widrow-Hoff identification algorithm is finally
W(n+1) = W(n) + 2 mu e(n) X(n).
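The Widrow-Hoff update can be sketched in a few lines. The step size mu and the filter order below are illustrative choices, not values from the paper:

```python
import numpy as np

def lms_identify(x, d, order=4, mu=0.01):
    """Widrow-Hoff (LMS) identification: replace the true gradient with
    its single-sample estimate, giving W(n+1) = W(n) + 2*mu*e(n)*X(n)."""
    w = np.zeros(order)
    errs = []
    for n in range(order - 1, len(x)):
        X = x[n - order + 1:n + 1][::-1]   # tap-delay input vector X(n)
        e = d[n] - w @ X                   # error e(n) = d(n) - W^T X(n)
        w = w + 2 * mu * e * X             # stochastic-gradient update
        errs.append(e)
    return w, np.array(errs)
```

Given input/output data from an unknown FIR system, the weight vector converges toward the system's coefficients while the error sequence decays.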
At this point, on the one hand, the recovery error can be written as
eta(n) = s(n) - W^T X(n);
on the other hand, W^T X(n) can be regarded as a prediction of x(n), so the prediction error can be defined as
e(n) = x(n) - W^T X(n).
The purpose of designing the adaptive filter is naturally to minimize the recovery error eta(n). But because the real signal s(n) is unknown, eta(n) is unobservable and cannot be calculated. The prediction error e(n), by contrast, is observable, and with the observation model x(n) = s(n) + v(n) its relationship to the recovery error is
e(n) = eta(n) + v(n).
Since the noise sequence v(n) is independent, minimizing the unobservable recovery error eta(n) is equivalent to minimizing the observable prediction error e(n). Specifically, consider minimizing
J(n) = sum_{i=1..n} lambda^{n-i} e^2(i),
where lambda is the forgetting factor, usually slightly less than 1. Defining the exponentially weighted correlation quantities
R(n) = sum_{i=1..n} lambda^{n-i} X(i) X^T(i),  P(n) = sum_{i=1..n} lambda^{n-i} X(i) x(i),
and assuming that R(n) is non-singular, the minimizer is
W(n) = R^{-1}(n) P(n).
This is the formula for the filter parameters; it is written W(n) because the weight vector changes with time.
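The time-varying solution W(n) = R^{-1}(n) P(n) is usually computed with the standard recursive least squares (RLS) recursion, which propagates an estimate of R(n)^{-1} instead of inverting R(n) at every step. A minimal sketch follows; the initialisation P(0) = delta * I and the parameter values are assumptions:

```python
import numpy as np

def rls_filter(x, d, order=4, lam=0.99, delta=100.0):
    """Recursive least squares minimising sum_i lam**(n-i) * e(i)**2."""
    w = np.zeros(order)
    P = delta * np.eye(order)              # running estimate of R(n)^-1
    errs = []
    for n in range(order - 1, len(x)):
        X = x[n - order + 1:n + 1][::-1]   # input vector X(n)
        e = d[n] - w @ X                   # a-priori prediction error e(n)
        k = P @ X / (lam + X @ P @ X)      # gain vector
        w = w + k * e                      # W(n) = W(n-1) + k(n) e(n)
        P = (P - np.outer(k, X @ P)) / lam # inverse-correlation update
        errs.append(e)
    return w, np.array(errs)
```

Compared with the LMS update, RLS converges in far fewer samples at the cost of O(order^2) work per step.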

Advantages of online image review system
The system adds online application and status inquiry. Applicants can log in to the power company's online service platform and apply without leaving home. On the platform, the service guide and handling process are clear at a glance, and applicants can check the progress of their application at any time through the "business status" column. The review service has also added service outlets, making it easier for power users to handle business nearby, and it reduces the number of visits to the service window: the online pre-review can be completed by the next day, which is very convenient [7].

Conclusion
In general, the main difficulties are as follows: 1) The sources of the images are diverse and the images differ greatly from one another, which poses a huge challenge to recognition accuracy.
2) The proportion of the image occupied by the area to be identified is inconsistent, which makes locating the identification area difficult. In addition, images may be inverted, twisted, tilted, or affected by glare and highlights; these large variations pose a huge challenge to the accuracy of the classification algorithm.
3) The website system is complex and the amount of data is large, so separating and extracting the key information consumes manpower, and making the simplified audit module stable and reliable takes time to debug.