Long-Term Weather Prediction Based on GA-BP Neural Network

In this paper, a genetic algorithm (GA) is used to optimize the weights and thresholds of the BP neural network, which mitigates several of its known weaknesses: it is sensitive to the initial weights and thresholds, it converges slowly, and it is prone to falling into local minima. To verify the improvement, the BP neural network and the GA-optimized BP (GA-BP) neural network were each used to simulate the weather in Rizhao City, Shandong Province. The simulation results show that the GA-BP model predicts long-term temperature more accurately than the BP model. The GA-BP neural network was then further optimized by selecting the optimal number of hidden-layer nodes and changing the impact factors, and the prediction results were fitted. The analysis shows that the improved GA-BP model achieves higher prediction accuracy.


Introduction
The human brain is a network of interwoven neurons. It can carry out complex processes such as thinking, perception, association, and memory, so it is regarded as a highly effective information processing system. Artificial neural networks (ANNs) [1], composed of many interconnected neurons, are complex network systems that simulate the structure and function of the human brain. An ANN is self-adaptive and can solve nonlinear problems; it processes information by adjusting the connection strength between neurons. Research on neural networks can be divided into theoretical research and applied research. Applied research in turn covers two areas: the software simulation and hardware implementation of ANNs, and their use in specific fields such as pattern recognition, combinatorial optimization, and signal processing. By the middle and late 1980s, different types of ANN models had been proposed. At the same time, scholars at home and abroad accelerated the exploration of their theory and applications, and a large number of research results emerged, which further promoted the development of artificial neural networks.
The application of artificial neural networks in meteorology started late; its main applications [2] now include short-term climate prediction, agricultural meteorology, typhoon prediction, soil moisture prediction, visibility prediction, and air pollution prediction. Compared with the traditional numerical forecast model, applying an artificial neural network to weather forecasting reduces the complexity of implementation, relaxes the limitations on data, and improves forecast accuracy by comprehensively considering various nonlinear factors. It is therefore better suited to people's needs for accurate weather forecasts. With the rapid development of artificial neural network technology, the BP algorithm was introduced into ANNs, giving the BP neural network. The BP neural network is widely applied in prediction, has attracted the attention of many scholars, and has produced abundant research results. For example, Cao X F [3] established a feedforward BP neural network and used a variable gradient method to carry out correlation analysis on the meteorology of wheat scab in the Yangtze River Basin, obtaining effective and accurate predictions that provided theoretical guidance for controlling wheat scab in this area. Yuan M Y [4] used a BP prediction model to forecast precipitation and temperature in Heilongjiang Province; the temperatures predicted for 1998 and 1999 were close to the observed temperatures. Later, some scholars further improved the BP neural network model by using different algorithms to process the input factors at the input layer.
For example, in [5], rainfall-runoff is divided into flood season and non-flood season using a partition algorithm and then predicted with a BP neural network model. The results show that the improved model significantly improves rainfall-runoff prediction. Liu D [6] took the probability of daily and lunar phases as the input factors of a BP neural network and established a long-term weather forecast model; the results showed that the model combined with BP was more accurate than direct long-term prediction from the probability of daily and lunar phases. Using a BP neural network greatly improves prediction accuracy. However, network training is usually a complex process, and the weights change with the network and with the training sample set, which makes the training results random and prone to falling into local extrema, so many problems in the field of prediction remain to be studied. Some scholars have therefore targeted the shortcomings of the BP neural network and tried to improve it with a genetic algorithm. For example, Ai H F [7] added a genetic algorithm to the BP neural network model to establish a GA-BP prediction model and simulated the PM2.5 content in Changchun. Zhao Z [8] predicted and simulated the PM2.5 value of a certain place with a GA-BP model. Yang Y [9] used a GA-BP neural network to predict and analyze the air index during the heating period. Zhang P D [10] used GA-BP and BP neural networks to predict and simulate urban air quality. These results all showed that the improved BP neural network model had higher prediction accuracy.
In this research, a BP model and its GA-optimized counterpart are trained on temperature-related data from Rizhao City, Shandong Province, and then used to predict the temperature in recent years. The results show that the improved GA-BP model is more accurate than the BP model alone, which gives it practical application value. The improved GA-BP model can serve the meteorological demand for long-term weather forecasts and can provide a reference for the prevention of some climatic disasters.

GA-BP Algorithm Implementation Process
The BP (back propagation) neural network [11] is a common model among artificial neural networks. Its algorithm proceeds as follows:
Step 1: In the forward pass, the sample data first enters the input layer and is weighted and passed to the hidden layer. Each hidden-layer neuron sums all of its inputs and produces an output through the transfer function; these outputs are in turn weighted and passed to the output layer.
Step 2: Compute the error between the expected output $d_o(k)$ and the actual output $yo_o(k)$ with the error function.
Step 3: In the error back-propagation stage, the error is sent back through the network and the weights between connected layers are adjusted in reverse order. Learning and training are repeated in this cycle until the output layer produces a satisfactory result.
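For Step 2, a common form of the error function is half the sum of squared differences over the output nodes. It is given here as an assumption, not as the paper's exact expression; the symbol $q$ (the number of output nodes) is introduced only for illustration.

```latex
e = \frac{1}{2} \sum_{o=1}^{q} \bigl( d_o(k) - yo_o(k) \bigr)^2
```

With a single output node, as in the temperature model used later, the sum reduces to one squared term.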
The purelin function is used as the transfer function in the output layer of the BP model, while the tansig function is used in the hidden layer. Its mathematical model is shown in figure 1.
The genetic algorithm (GA) is an evolutionary algorithm developed on the basis of natural selection and the principles of genetics. Figure 2 shows the flow of the BP neural network algorithm optimized by the genetic algorithm.
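Returning to the BP network itself, the forward pass of Step 1 and the error of Step 2 can be sketched in plain Python. This is a minimal illustration, not the authors' MATLAB code: it assumes a single hidden layer with tansig activation (mathematically equivalent to tanh), a purelin (identity) output node, a half squared error, and the 5-input, 10-hidden-node layout used later in the paper. The weights and the input vector are random placeholders.

```python
import math
import random

def tansig(x):
    # MATLAB's tansig is mathematically equivalent to the hyperbolic tangent.
    return math.tanh(x)

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> tansig hidden layer -> purelin (identity) output."""
    hidden = [tansig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # purelin leaves the weighted sum unchanged
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def squared_error(d, y):
    """Half squared error between expected output d and actual output y (assumed form)."""
    return 0.5 * (d - y) ** 2

random.seed(0)
n_in, n_hidden = 5, 10
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = random.uniform(-1, 1)

x = [0.2, -0.5, 0.1, 0.7, -0.3]   # hypothetical normalized inputs
y = forward(x, w_hidden, b_hidden, w_out, b_out)
e = squared_error(0.4, y)          # 0.4 is a hypothetical expected output
print(y, e)
```

Back propagation (Step 3) would then adjust `w_hidden`, `b_hidden`, `w_out`, and `b_out` along the gradient of this error; that part is omitted here.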

Establishment of Forecast Model
In this paper, monthly meteorological data for Rizhao, Shandong Province, from January 1973 to December 2018 are used (from China Meteorological Network). The data include monthly average wind speed, mean sea level pressure, monthly precipitation, monthly precipitation days, and monthly average visibility, among others. A total of 552 sets of data were simulated in MATLAB: the first 456 sets, from 1973-2010, were used to train the model, and the last 96 sets, from 2011-2018, were used for prediction.
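The sample counts follow directly from the date ranges. The short sketch below reproduces the chronological split; the function name `chronological_split` is hypothetical and the records are only (year, month) labels, not the actual meteorological data.

```python
def chronological_split(start_year, end_year, last_train_year):
    """Split monthly records into training and prediction sets by year."""
    months = [(y, m) for y in range(start_year, end_year + 1) for m in range(1, 13)]
    train = [ym for ym in months if ym[0] <= last_train_year]
    test = [ym for ym in months if ym[0] > last_train_year]
    return train, test

train, test = chronological_split(1973, 2018, 2010)
print(len(train), len(test), len(train) + len(test))  # 456 96 552
```

Splitting by time rather than at random keeps every prediction sample later than all training samples, which matches the forecasting setting described here.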
In constructing the BP model, five meteorological input parameters are used, namely monthly average wind speed, mean sea level pressure, monthly precipitation, monthly precipitation days, and monthly average visibility, with one output parameter, monthly average temperature. The input layer therefore has 5 nodes and the output layer has 1 node.
The concrete steps of the GA-BP algorithm [12] are as follows:
Step 1: Import the sample data, normalize it, and build the network. In the GA-BP model, the genetic algorithm runs for 60 iterations with a population size of 60, a crossover probability of 0.3, and a mutation probability of 0.01; the initial population is generated randomly.
Step 2: Calculate the fitness value of each individual in the population to find the optimal individual. Then apply selection, crossover, and mutation, identify the best and worst individuals and their positions in the population by fitness value, and update the best individual.
Step 3: Determine whether evolution is over. If so, proceed to the next step; if not, go back to Step 2 and continue the iterative optimization.
Step 4: Finish the genetic optimization and assign the network parameters of the optimal individual to the BP neural network for training.
Step 5: Set the network parameters of the BP neural network and train it for 100 iterations, with a learning rate of 0.1, a momentum coefficient of 0.8, and a target accuracy of 0.001.
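Steps 1 to 4 can be sketched as a small genetic algorithm over the weights and thresholds of a 5-10-1 network. This is an illustration under stated assumptions, not the authors' implementation: the paper gives the population size, iteration count, and crossover and mutation probabilities, but roulette-wheel selection, single-point crossover, uniform re-sampling as mutation, the sum of absolute errors as the fitness measure, and the synthetic training data are all assumptions made here. The BP training of Step 5 is not included.

```python
import math
import random

N_IN, N_HID = 5, 10                          # 5 inputs, 10 hidden nodes, 1 output
N_GENES = N_IN * N_HID + N_HID + N_HID + 1   # all weights and thresholds: 71

def predict(genes, x):
    """Decode a chromosome into a 5-10-1 network and run one forward pass."""
    out = genes[-1]                                   # output threshold
    for j in range(N_HID):
        w = genes[j * N_IN:(j + 1) * N_IN]            # weights into hidden node j
        b = genes[N_IN * N_HID + j]                   # threshold of hidden node j
        v = genes[N_IN * N_HID + N_HID + j]           # weight from hidden node j to output
        out += v * math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return out

def error(genes, data):
    """Sum of absolute prediction errors; lower is better (assumed fitness measure)."""
    return sum(abs(predict(genes, x) - d) for x, d in data)

def roulette(pop, errs, rng):
    """Roulette-wheel selection with weights proportional to 1/error (assumed)."""
    weights = [1.0 / (e + 1e-12) for e in errs]
    return [list(ind) for ind in rng.choices(pop, weights=weights, k=len(pop))]

def ga_optimize(data, pop_size=60, generations=60, p_cross=0.3, p_mut=0.01, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_GENES)] for _ in range(pop_size)]
    errs = [error(ind, data) for ind in pop]
    best_err = min(errs)
    best = list(pop[errs.index(best_err)])
    history = [best_err]
    for _ in range(generations):
        pop = roulette(pop, errs, rng)
        for i in range(0, pop_size - 1, 2):           # single-point crossover on pairs
            if rng.random() < p_cross:
                cut = rng.randrange(1, N_GENES)
                pop[i][cut:], pop[i + 1][cut:] = pop[i + 1][cut:], pop[i][cut:]
        for ind in pop:                               # per-gene mutation
            for g in range(N_GENES):
                if rng.random() < p_mut:
                    ind[g] = rng.uniform(-1, 1)
        errs = [error(ind, data) for ind in pop]
        worst = errs.index(max(errs))                 # overwrite the worst with the best so far
        pop[worst], errs[worst] = list(best), best_err
        gen_best = min(errs)
        if gen_best < best_err:                       # update the best individual
            best_err, best = gen_best, list(pop[errs.index(gen_best)])
        history.append(best_err)
    return best, history

# Synthetic, already-normalized samples standing in for the meteorological data.
data_rng = random.Random(1)
data = []
for _ in range(12):
    x = [data_rng.uniform(-1, 1) for _ in range(N_IN)]
    data.append((x, 0.5 * x[0] - 0.3 * x[1]))

best, history = ga_optimize(data)
print(history[0], history[-1])
```

The returned `best` chromosome plays the role of the optimal individual in Step 4: its 71 values would be unpacked into the initial weights and thresholds of the BP network before gradient training begins. Because the worst individual is always overwritten by the best one found so far, the recorded best error never increases from one generation to the next.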

Prediction of Temperature by BP Model
The BP neural network takes into account many nonlinear factors that cause temperature change, and it has a strong capacity for nonlinear mapping. With 5 nodes in the input layer and 1 node in the output layer, the number of hidden-layer nodes is set to 10 based on experience. The resulting network structure is shown in figure 3. Figure 4 shows the error curve of the network during training. The network reaches its best output performance at the 8th iteration, with an output error of 0.012946, and training stops at the 14th iteration. Since the target error accuracy for training is 0.001, the network does not reach the expected error target.

Prediction of Temperature by GA-BP Model
The improved BP neural network model based on a genetic algorithm addresses the sensitivity of the traditional BP neural network to thresholds and weights. According to the error performance curve during training (figure 7), the optimized network reached its best output performance at the 3rd iteration, with an error of 0.01215, and training stopped at the 9th iteration. Although the output error is lower than that of the BP model, it still fails to meet the target.
Following the prediction steps of the BP model, the trained network was first used to simulate the first 456 samples, giving the fit between their real and predicted values. The model was then used to predict the final 96 samples, giving the fitting curve of the real and predicted values (figure 8) and the error curve of the model (figure 9). According to figure 9, the maximum error of this model is 7 degrees Celsius.
Comparing the prediction results of the two network models in the figures above shows that the GA-BP model predicts the temperature of Rizhao, Shandong Province more accurately than the BP model.

Improvement of GA-BP Model
Adding input nodes to the GA-BP neural network model can improve the accuracy of the prediction results, so the number of input-layer nodes was increased from 5 to 7; the improved model is denoted GA-BP1. Because the network output error varies with the number of hidden-layer nodes, the error performance is measured by the mean squared error (MSE), and the number of hidden-layer nodes is determined according to the empirical formula [13]. The MSE is calculated as follows:
$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - a_i\right)^2$$
where $t_i$ is the network output value, $a_i$ is the expected output value of the sample, and $N$ is the number of samples.
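As a small illustration, the MSE with network outputs t_i, expected outputs a_i, and N samples can be computed as below, assuming the usual definition as the mean of the squared differences. The function name `mse` and the sample values are hypothetical.

```python
def mse(outputs, targets):
    """Mean squared error between network outputs t_i and expected outputs a_i."""
    if len(outputs) != len(targets) or not outputs:
        raise ValueError("outputs and targets must be non-empty and of equal length")
    return sum((t - a) ** 2 for t, a in zip(outputs, targets)) / len(outputs)

# Hypothetical normalized outputs and targets for three samples.
print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))
```

Evaluating this quantity for each candidate number of hidden-layer nodes and keeping the smallest value is the selection procedure the text describes.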
After repeated training experiments with 7 input-layer nodes and 1 output-layer node, the network output error for each number of hidden-layer nodes was obtained (table 2). The MSE is lowest, at 0.0012, when the hidden layer has 10 nodes, so the number of hidden-layer nodes is set to 10.
To make the improved results more convincing, the model GA-BP2 was added; the differences among GA-BP, GA-BP1, and GA-BP2 are given in table 3. The networks were then trained and, following the same steps used for the BP and GA-BP models, the error performance curves of the GA-BP1 and GA-BP2 models (figure 10a, figure 10b), the simulation curves of the real and predicted air temperature (figure 11a, figure 11b), and the error curves (figure 12a, figure 12b) were obtained in turn.
According to the network error curves (figure 10a, figure 10b), the optimal output errors of GA-BP1 and GA-BP2 are 0.0013240 and 0.00054331 respectively. GA-BP2 falls below the 0.001 target accuracy and GA-BP1 comes close to it, so GA-BP2 has the better error performance. The simulation curves of the real and predicted air temperature (figure 11a, figure 11b) show that the predicted values of both models are close to the real values. According to the error curves (figure 12a, figure 12b), the maximum difference between the predicted and real values is close to 4 degrees Celsius for GA-BP1 and less than 2.5 degrees Celsius for GA-BP2. Both are smaller than the 7 degrees Celsius maximum error of the GA-BP model. In conclusion, the GA-BP1 and GA-BP2 models are superior to the GA-BP model in prediction performance.

Comparison and Analysis of Results
The simulation results of the four models above show that prediction accuracy increases from model to model. To show the degree of improvement of each model more intuitively, the fitting curves of the four models are plotted from their prediction results (figure 13).
Note: The vertical axis represents the true value and the horizontal axis represents the predicted value; the model corresponding to each fitting curve is marked in the figure.