Analysis and Research on the Performance of Solar Concentration Based on Big Data and Machine Learning

Empirical formulas for calculating the optical efficiency of parabolic trough collectors (PTCs) are difficult and costly to obtain from rigorous comparative experiments, whereas simpler optical modeling methods do not adequately capture realistic optical effects. In this article, algorithms are developed to calculate the geometric concentration ratio (Cg) of linear Cassegrainian solar concentrators (CSCs) with a secondary flat mirror, based on tracing the edge rays from the solar source to a flat-plate receiver. Machine learning methods implemented in Python are then used to analyze and process the large amount of data generated, and the functional relationship between the concentration ratio and each parameter is obtained. The trained regression models fit the generated data closely, with a coefficient of determination of R² = 1.00.


Introduction
Empirical formulas for calculating the optical efficiency of parabolic trough collectors (PTCs) are generally difficult and costly to obtain from rigorous comparative experiments, whereas simpler optical modeling methods do not adequately capture realistic optical effects. New computing methods or empirical fitting formulas for PTC optical efficiency that are accurate, fast, and cost-effective are therefore urgently needed. As an accurate optical efficiency computing method for the geometric design, energy analysis, and optimization of solar concentrating collectors, Cheng Z D et al. [1] proposed fitting formulas for PTCs by combining Monte Carlo ray tracing (MCRT) with a population-based particle swarm optimization algorithm. The optical efficiency fitting formulas were obtained by analyzing the Monte Carlo ray tracing data through a variable reduction technique, while the coefficients of the fitting formulas were obtained from the particle swarm optimization algorithm. Rodriguez S D et al. [4] developed methods for improving the concentration ratio of parabolic troughs using a second-stage flat mirror.

Other scholars have conducted research on big data and machine learning. Wu S B et al. [2] proposed a big data security framework based on encryption. Bode G et al. [3] reported a real-world application of machine-learning-based fault detection trained with experimental data. Wu S B et al. [5] analyzed the crime data of YD county and proposed crime prediction using data mining and machine learning.

In this paper, machine learning methods implemented in Python are used to analyze the performance of a solar concentrator with a secondary flat reflector. Regression models are trained on the generated data to obtain an accurate functional relationship between the related parameters and the solar concentration ratio.

Linear Regression
Linear regression is a simple prediction model in which the error terms are assumed to follow a Gaussian distribution $N(0, \sigma^2)$ [6]. The linear regression formulation of the problem in the present study is expressed as follows:

$$ y = a_0 + \sum_{i=1}^{n} a_i x_i + \varepsilon \qquad (1) $$

where $x_i$ represents nodal features such as reflectance, transmittance, absorptivity, intercept factor, and incidence angle distribution, $a_i$ is the corresponding coefficient, and $a_0$ is the intercept. On the right-hand side of Eq. (1), $\varepsilon$ is the only stochastic variable, while the other terms are fixed observation values.
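As an illustration of how the coefficients of such a model can be estimated, the following minimal sketch solves the least-squares normal equations in plain Python. It assumes a model with an intercept term plus one coefficient per feature; the function names `solve_linear_system`, `fit_linear`, and `predict` are hypothetical and are not taken from the original program.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update A and b together.
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Choose the largest pivot in this column to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_linear(X, y):
    """Return [a0, a1, ..., an] that minimize the sum of squared residuals."""
    rows = [[1.0] + list(map(float, r)) for r in X]  # prepend intercept column
    k = len(rows[0])
    # Normal equations: (Z^T Z) a = Z^T y, with Z the augmented feature matrix.
    ZtZ = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Zty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(ZtZ, Zty)


def predict(coeffs, x):
    """Evaluate a0 + sum(a_i * x_i) for one feature vector x."""
    return coeffs[0] + sum(a * xi for a, xi in zip(coeffs[1:], x))
```

For larger datasets a numerical library would normally replace the hand-written solver; the plain-Python version is used here only to keep the sketch dependency-free.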

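R-square Value ($R^2$)
In statistics, the coefficient of determination $R^2$ is used to measure the predictive ability of a model. The specific expression of $R^2$ is as follows [7]:

$$ R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2} $$

In the formula, $y_i$ is an observed value, $\hat{y}_i$ is the corresponding predicted value, and $\bar{y}$ represents the mean value of $y_i$. For a least-squares fit with an intercept evaluated on its training data, the value of $R^2$ ranges from 0 to 1. The larger the value of $R^2$, the better the prediction. If the value is 1, the regression model fits the actual data perfectly.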
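The coefficient of determination can be computed directly from observed and predicted values. The sketch below uses the standard definition (one minus the ratio of the residual sum of squares to the total sum of squares); the function name `r_squared` is hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot
```

With this definition, a model that always predicts the mean of the observations scores 0, and a model that performs worse than the mean on held-out data can score below 0.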

Machine Learning-Based Linear Regression
Real-world machine learning and analysis problems are best approached with an end-to-end workflow that runs from data collection to the use of machine learning techniques that convert the data into valuable information or knowledge. Supervised machine learning techniques train a model on labeled data and then predict the results for unseen test samples. Processing steps such as feature scaling, extraction, and selection must remain the same in both phases, so that the features extracted from unseen test samples in the prediction phase match the features used to train the model [8].
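The requirement that preprocessing stay identical between training and prediction can be sketched as follows: the scaling statistics are computed once from the training data and then reused unchanged on the test data. This is a minimal illustration using standardization; the names `fit_scaler` and `apply_scaler` are hypothetical, and the paper does not state which scaling method was used.

```python
from statistics import mean, pstdev


def fit_scaler(train_rows):
    """Compute per-feature mean and standard deviation from training data only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, scaler):
    """Standardize rows with statistics that were fitted on the training data."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
```

Calling `fit_scaler` on the test set as well would give the test features a different scale from the one the model was trained on, which is the mismatch the paragraph above warns against.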
Researchers studying the concentration ratio of a secondary flat concentrator mainly simplify the complex problem and then simulate it with software tools. In this study, machine learning is employed to solve the complex concentration ratio calculation problem, and linear regression is then used to generate a simple fitting expression. This expression is convenient for calculating optical efficiency and studying the optical performance of PTCs. Linear regression thus provides simple relationships that can be exploited for optics-related calculations.
First, we used this approach to design the secondary reflection hyperbolic concentrator presented in Sections 3.1 and 3.2, after selecting the best performance parameter. Then, we calculated the concentration ratio introduced in Section 3.2 and used a Python program to generate a dataset of the parameters related to the concentration ratio. Finally, taking the concentration ratio as the target, a linear regression method from machine learning was used to fit it. The learning results were evaluated with the coefficient of determination ($R^2$), with the fit improving as $R^2$ approaches 1; the regression coefficients were obtained at the same time.
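Evaluating the regression on unseen samples requires setting part of the generated dataset aside before training. The following sketch shows one simple, reproducible way to do this; the function name `train_test_split`, the 20% test fraction, and the fixed seed are assumptions for illustration, not values reported in the paper.

```python
import random


def _pick(indices, seq):
    """Select the items of seq at the given indices, preserving index order."""
    return [seq[i] for i in indices]


def train_test_split(rows, targets, test_fraction=0.2, seed=0):
    """Shuffle sample indices reproducibly and split features and targets together."""
    if len(rows) != len(targets):
        raise ValueError("rows and targets must have the same length")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # local generator: global state untouched
    n_test = max(1, int(round(len(indices) * test_fraction)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (_pick(train_idx, rows), _pick(test_idx, rows),
            _pick(train_idx, targets), _pick(test_idx, targets))
```

Splitting by index keeps each feature row paired with its own target, so the model is trained and evaluated on consistent samples.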

Algorithms to Correlate the CR and Receiver's Position of Linear CSCs
For convenience of analysis, a coordinate system with the origin at the bottom of the parabolic trough and the x-axis normal to the aperture is used in this work, as shown in figure 1. In this coordinate system, the primary parabola is expressed by

$$ y^2 = 4 f x, \qquad |y| \le \frac{D}{2}, \qquad D = 4 f \tan\frac{\varphi}{2} $$

where $\varphi$, $D$ and $f$ are the rim angle, width and focal length of the primary parabolic trough, respectively.
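The geometry of the primary parabola can be explored numerically. The sketch below assumes the standard relations for a parabola with its vertex at the origin and its axis along x, namely y^2 = 4 f x and D = 4 f tan(rim angle / 2); the function names are hypothetical.

```python
import math


def focal_length(aperture_width, rim_angle_deg):
    """Focal length from aperture width D and rim angle: f = D / (4 tan(phi / 2))."""
    half_angle = math.radians(rim_angle_deg) / 2.0
    return aperture_width / (4.0 * math.tan(half_angle))


def parabola_x(y, f):
    """Depth x of the parabola y^2 = 4 f x at height y above the axis."""
    return y * y / (4.0 * f)


def rim_angle_deg(aperture_width, f):
    """Rim angle in degrees recovered from aperture width D and focal length f."""
    return math.degrees(2.0 * math.atan(aperture_width / (4.0 * f)))
```

For a rim angle of 90 degrees the rim of the trough lies level with the focus, so `parabola_x(D / 2, f)` equals `f`, which gives a quick consistency check on the two relations.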

Design Description of CSC with a Flat Mirror
A system whose secondary reflector is a flat mirror is called a Newtonian system. It consists of a primary mirror and a secondary mirror: the primary mirror is a paraboloid of revolution and the secondary mirror is a plane. The system exploits the reflection properties of both surfaces: a concave parabolic mirror converges all light parallel to the optical axis to its focus, and the planar reflection relocates the focus of the concentrated light.
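Both reflection properties can be checked with a few lines of vector arithmetic. The sketch below assumes a two-dimensional parabola y^2 = 4 f x with incoming rays travelling along the negative x direction, and, as a further assumption, a flat secondary mirror lying on a plane x = mirror_x perpendicular to the axis; all function names are hypothetical.

```python
import math


def reflect(direction, normal):
    """Mirror a 2D direction vector about a surface with the given normal."""
    nx, ny = normal
    length = math.hypot(nx, ny)
    nx, ny = nx / length, ny / length
    dx, dy = direction
    dot = dx * nx + dy * ny
    return (dx - 2.0 * dot * nx, dy - 2.0 * dot * ny)


def reflect_off_parabola(y0, f):
    """Reflect an axis-parallel ray that hits y^2 = 4 f x at height y0.

    Returns the hit point and the reflected direction.
    """
    point = (y0 * y0 / (4.0 * f), y0)
    normal = (-4.0 * f, 2.0 * y0)  # gradient of g(x, y) = y^2 - 4 f x
    return point, reflect((-1.0, 0.0), normal)


def crossing_on_axis(point, direction):
    """x-coordinate where the ray crosses the axis y = 0 (requires direction[1] != 0)."""
    px, py = point
    dx, dy = direction
    t = -py / dy
    return px + t * dx


def mirrored_focus(f, mirror_x):
    """Image of the focus (f, 0) in a flat mirror on the plane x = mirror_x."""
    return (2.0 * mirror_x - f, 0.0)
```

For any nonzero height `y0`, `crossing_on_axis(*reflect_off_parabola(y0, f))` returns `f`, and moving the flat mirror along the axis shifts the image of the focus by twice the mirror displacement.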

Definition
The concentration ratio Cg is defined in strictly geometric terms, as the ratio of the aperture area to the absorber surface area. When shading is taken into account, we denote the ratio by Ce and call it the efficient concentration ratio. Shading occurs when direct sunlight fails to reach a mirror because that mirror is in the shadow of another mirror.
Another important quantity is the intercept factor, denoted IF, which is the fraction of the radiation reflected by the concentrator that is intercepted by the receiver.
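These definitions translate into simple ratios. In the sketch below, only the geometric concentration ratio follows directly from the definition in the text. The shading correction (subtracting the shaded aperture area) and the intercept factor expressed as intercepted power over reflected power are assumptions made for illustration, since the corresponding formulas are not reproduced here; all function names are hypothetical.

```python
def geometric_concentration_ratio(aperture_area, absorber_area):
    """Cg: ratio of aperture area to absorber surface area."""
    if absorber_area <= 0:
        raise ValueError("absorber_area must be positive")
    return aperture_area / absorber_area


def shading_corrected_ratio(aperture_area, shaded_area, absorber_area):
    """ASSUMPTION: shading simply removes shaded_area from the collecting aperture."""
    if not 0 <= shaded_area <= aperture_area:
        raise ValueError("shaded_area must lie between 0 and aperture_area")
    return geometric_concentration_ratio(aperture_area - shaded_area, absorber_area)


def intercept_factor(intercepted_power, reflected_power):
    """ASSUMPTION: fraction of reflected radiation that reaches the receiver."""
    if reflected_power <= 0:
        raise ValueError("reflected_power must be positive")
    return intercepted_power / reflected_power
```

For a linear concentrator of fixed length, the areas can be replaced by the aperture width and the receiver width without changing the ratios.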

Linear CSC with a Secondary Flat Reflector (CSC-F)
Consider two further rays r1 and r2 that make an angle θ with the axis of the parabola, as shown in figure 2. After reflection they intersect at a point E, the intersection of the straight lines that pass through points A1 and A2 along the directions of rays r1 and r2 after reflection. The position of E, the focal belt location of the secondary reflection and, from these, the efficient geometrical concentration ratio (ECR) are then computed.

Figure 4 shows the relation between the ECR and the focal belt location of the flat mirror as the rim angle increases. As the rim angle varies from 27° to 87° at θ = 0.269°, the ECR of the flat mirror first increases from 82.306 to 104.906 at a rim angle of 45° and then decreases to 13.777. Over the same range of rim angles, the focal belt location decreases from 107.207 to 26.557.
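Locating the point E amounts to intersecting two straight lines, each given by a point and a direction. The sketch below shows this generic step in two dimensions; it is not the paper's closed-form expression, and the function name `intersect_lines` is hypothetical.

```python
def intersect_lines(p1, d1, p2, d2, eps=1e-12):
    """Intersection of the lines p1 + s*d1 and p2 + t*d2 in the plane."""
    # The 2D cross product of the directions vanishes for parallel lines.
    det = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(det) < eps:
        raise ValueError("lines are parallel and have no unique intersection")
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    # Solve for the parameter s along the first line.
    s = (rx * d2[1] - ry * d2[0]) / det
    return (p1[0] + s * d1[0], p1[1] + s * d1[1])
```

Here `p1` and `p2` would play the role of the reflection points A1 and A2, and `d1` and `d2` the directions of r1 and r2 after reflection.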

Big Data Visualization Based on Python
To analyze the data and study the relationship between the rim angle and Ce, f, Blocking, Spot, Ce/C0, Cg/C0 and Ce/Cg, we used the design of the secondary reflection flat-mirror concentrator derived in the previous section. The data obtained from multiple iterations were filtered, and the relevant data are shown in figure 6 (θ = 0.5°) and figure 7 (θ = 1.0°).

Function Relationships between Efficient Concentration Ratio and All Sorts of Parameters
The mean squared errors on the training and test sets are 0.00 and 0.00, respectively. The coefficient of determination is $R^2 = 1.00$.
(4) Relationship between rim angle, f, m, blocking, spot and Cg/C0. The coefficient of determination is $R^2 = 1.00$.
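The mean squared error reported for the training and test sets can be computed as in the following sketch. The function name `mean_squared_error` is hypothetical, and the formatting helper only illustrates how a value is rounded to two decimals for reporting.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observations and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def format_two_decimals(value):
    """Format a metric with two decimals, as metrics are often reported."""
    return f"{value:.2f}"
```

A value printed as 0.00 is not necessarily exactly zero: any error below 0.005 rounds to 0.00, so reporting more digits can help distinguish a near-perfect fit from an exact one.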

Conclusions
The main objective of this work was to demonstrate how machine learning methods can be used to derive empirical formulas and the functional relationships between parameters and dependent variables from large datasets. The data generated for a secondary flat concentrator system were used to train and test machine learning models, and the fit was very good: the coefficient of determination used as the evaluation standard reached $R^2 = 1.00$. The approach provides a useful method and guidance for machine learning on large datasets and may be applicable to other regression problems of this kind.