Operation State Identification of Commercial & Industrial Users

Users’ electricity usage information helps to improve the performance of load forecasting and demand response. Users’ metering data contains abundant usage information, and various approaches have been developed to extract it. Since a user has several specific operation states and the user’s electricity consumption has particular features in each state, an approach based on identifying the user’s operation state is developed in this paper. The three-phase power sampled at 15-minute intervals over a day is used as the fingerprint of that day. For users with anomalous usage, the load data is analysed to obtain the load fingerprint of each day. The load fingerprints are then clustered with the affinity propagation algorithm. If the load fingerprint of a user on suspicious days with much lower electricity consumption is similar to that on holidays, the anomalous consumption could be caused by a change of operation state.


Introduction
Load forecasting and demand response for the users of a power utility play a key role in maintaining the stability of a power system with a high share of renewable power. A considerable number of data-driven detection methods take a sudden change of load or power consumption [1] as the core indicator and design classification [2-4] or clustering algorithms [5-7] that can identify abnormal power consumption reliably and effectively. In practical systems, environmental protection inspections, noise restrictions during the college entrance examination, security inspections and many other causes can lead to a sudden change in a user's electricity consumption, which is easily confused with electricity theft and leads to false positives. This is an important obstacle that hinders data-driven detection methods from becoming practical. Although a short-term drop in power consumption can be identified from the cumulative abnormality of the user [8], it is difficult to set a threshold with universal applicability. Considering the impact of weekends, engineering practice generally regards a drop in power consumption lasting more than 3 days as abnormal. However, users often need a long time for equipment overhaul, shutdown for retrofitting and fire protection rectification, so such interference is difficult to eliminate on the basis of duration alone.
Anomaly detection is generally evaluated by accuracy, which requires the number of misclassified samples (the sum of the false positives and the false negatives of abnormal samples) to be minimized. Power utilities serve a large number of customers, a considerable number of whom have abnormal power consumption. Because the manpower available for power use inspection is limited, power utilities have to implement load forecasting and demand response with a data-driven approach. Determining the operating state of a user could be helpful here [9].
This paper argues that the susceptibility to interference has two root causes: identifying abnormal consumption from the consumption level alone provides too little information, and declines in consumption caused by changes in the normal production and operation state are not recognised. The equipment used by a user in each operation state leaves a specific load fingerprint in the metering data. When a normal user stops or reduces production and the power consumption drops sharply, the sudden drop in power and load may be confused with other phenomena, but the power consumption behaviour mode will still differ significantly. The load fingerprint, which characterises the user's power consumption behaviour mode, can therefore be used to identify the power consumption mode corresponding to the user's production and operation state, and thus to distinguish whether a power reduction is due to a change of production and operation state or to other reasons.

Identifying power consumption mode with load fingerprint
Non-invasive load analysis was proposed by Hart in 1992. Using the total electricity consumption of all electrical equipment recorded by the metering device at the user's point of connection to the system, it decomposes the consumption of each piece of equipment to obtain the energy consumption of the equipment and the electricity consumption pattern of the user [10].
In non-invasive load analysis, the power consumption behaviour of electrical equipment is also called the load fingerprint, and it is the key to load decomposition. Non-invasive load analysis combines the load fingerprints of specific electrical equipment with identification rules, and then identifies the equipment through pattern matching to obtain its power consumption information [11]. At present, the load fingerprint of electrical equipment is generally divided into high-frequency and low-frequency fingerprints. The high-frequency fingerprint mainly includes the current and power transient waveforms and the width and height of the jump edges during equipment operation [12]; the low-frequency fingerprint mainly includes the active/reactive power [10], harmonic composition, etc. Because transient signals are more distinctive, early non-invasive load analysis mostly used transient signals for identification and for extracting user load information [13,15]. Because the analysis of high-frequency fingerprints demands far higher equipment performance than existing non-invasive monitoring equipment such as smart meters can provide, current research focuses on steady-state signals [16,17] and explores various algorithms to extract load information.
Because power users usually arrange production activities day by day, this paper uses the metering data of one day at 30 min / 15 min intervals as the load fingerprint to identify the operation state of a user and the corresponding power consumption behaviour pattern.
The composition of the electrical equipment of industrial and commercial users is basically fixed. The combination of equipment put into service is determined by the scheduling arrangement corresponding to the operation state, and the combination of equipment in service under each operation condition forms a relatively fixed power usage pattern. For a specific user, the number of power usage modes is limited by the number of operation states, and the user's power consumption always switches between these few modes. For industrial enterprises, common modes include the Spring Festival holiday, the weekend holiday, and full-load or normal production. For commercial users, there may be winter, summer, and spring/autumn patterns.
Under normal circumstances, all of a user's operation states can be covered by one year of historical data. For normal users, external interference only makes the user switch from a state of higher power consumption to a state of lower consumption; this is a transition between normal operation states and does not produce a new state. If the power consumption behaviour mode displayed in the low-power period does not belong to any previously existing mode, it indicates that the user's low power usage is abnormal.

Power usage pattern recognition based on load fingerprint
Among electrical equipment, low-power equipment is generally a single-phase load, while high-power equipment is mostly a three-phase load. During the rest period at night, the main loads are low-power single-phase loads such as lighting. At this time the user load is small and the three-phase symmetry is at its lowest. During the normal production and operation period in the day, many pieces of equipment are in use and the power is large; the degree of three-phase symmetry is significantly higher than at night, and the heavier the load, the higher the degree of three-phase symmetry.
The user's 15/30 min three-phase metering data over a day contains information such as load level and three-phase symmetry, and can be used as a load fingerprint to identify the use of electrical equipment and the corresponding production and operation state. For the purpose of visualization, the maximum value of the user's three-phase load is used as the reference value. After the per-unit value of the three-phase power at each time is calculated, the three-phase power of the user on working days and holidays is displayed as a scatter cloud, as shown in Fig. 1. Fig. 1 (a) is the three-phase power scatter cloud diagram of a working day, and Fig. 1 (b) is that of a weekend. The three coordinate axes are the power of the three phases. The following can be observed.
On the working day, many three-phase symmetrical high-power pieces of equipment are in use, the load level and the three-phase symmetry are relatively high, and the three-phase power points are tightly distributed along the diagonal; the scatter points at night, when the load is low, are concentrated in the lower right corner near zero power.
On the holiday, a large number of the three-phase symmetrical high-power pieces of equipment used in normal production and operation are out of service, and the equipment in use is mainly single-phase low-power equipment. The load level and the three-phase symmetry are significantly lower than on the working day. The three-phase power points are loosely distributed around the diagonal, and the number of points scattered around the lower right corner near zero power is significantly higher than on the working day.
Whatever kind of anomaly affects the user, the load fingerprint (including load level and three-phase symmetry) will always change so that it deviates from the power consumption modes corresponding to the user's inherent production and operation states. Therefore, the three-phase power of one day, used as the load fingerprint, can identify both the different production and operation states of normal users and the state changes caused by anomalies. It provides incremental information for distinguishing and identifying the production and operation state mode of the user.
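The construction of the daily load fingerprint described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `base_power` and `load_fingerprint`, the phase ordering (A, B, C), and the toy readings (4 samples per day instead of 96) are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Day = Tuple[Sequence[float], Sequence[float], Sequence[float]]  # (phase A, B, C)


def base_power(days: Sequence[Day]) -> float:
    """Largest single-phase reading over all days, used as the per-unit base."""
    return max(p for day in days for phase in day for p in phase)


def load_fingerprint(day: Day, base: float) -> List[float]:
    """Concatenate the per-unit phase A, B and C samples of one day."""
    pa, pb, pc = day
    if not (len(pa) == len(pb) == len(pc)):
        raise ValueError("phase series must have equal length")
    if base <= 0:
        raise ValueError("base power must be positive")
    return [p / base for phase in (pa, pb, pc) for p in phase]


# Hypothetical readings in kW (illustrative values only).
workday: Day = ([40.0, 80.0, 100.0, 60.0],
                [38.0, 82.0, 98.0, 61.0],
                [41.0, 79.0, 99.0, 59.0])
holiday: Day = ([5.0, 6.0, 8.0, 5.0],
                [2.0, 3.0, 2.0, 2.0],
                [9.0, 4.0, 5.0, 7.0])

base = base_power([workday, holiday])
fp_work = load_fingerprint(workday, base)
fp_holiday = load_fingerprint(holiday, base)
```

With 15-minute data each phase contributes 96 samples, so one fingerprint would hold 288 per-unit values. Using one base for all days keeps the load level comparable across days, which is what allows a light-load holiday to be told apart from a working day.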

Load fingerprint clustering based on neighbor propagation algorithm
The affinity propagation (AP) algorithm [17], referred to in this paper as the nearest neighbour propagation algorithm, is a clustering algorithm based on message passing between neighbouring samples. Its purpose is to find the optimal set of class representatives, so that the sum of the similarities from all samples to their nearest class representatives is maximised [19,20].
The nearest neighbour propagation algorithm first considers all N samples of the data set as candidate class representatives and establishes, for each sample, its degree of attraction to the other samples. That is, the similarity between any two samples $x_i$ and $x_k$ (when the Euclidean distance is used as the measure, $s(i,k) = -\|x_i - x_k\|^2$) is stored in the N×N similarity matrix. The algorithm uses $s(k,k)$ to represent the extent to which the sample $x_k$ is suitable as a class representative. The initial assumption is that all samples have the same probability of being selected as class representatives, that is to say, all $s(k,k)$ are set to the same value p. In order to select appropriate class representatives, the algorithm continually collects evidence from the samples. Two message parameters are therefore introduced, the degree of attraction r and the degree of belonging a, which represent different competitive purposes. $r(i,k)$ is sent from $x_i$ to $x_k$ and represents the accumulated evidence for how well suited $x_k$ is to serve as the class representative of $x_i$; $a(i,k)$ is sent from $x_k$ to $x_i$ and represents the accumulated evidence for how appropriate it is for $x_i$ to select $x_k$ as its class representative. For any sample $x_i$, the sum of the attraction $r(i,k)$ and the belonging $a(i,k)$ is calculated over all samples $x_k$, and the sample $x_k$ with the largest sum is the class representative of $x_i$. Through the message passing between nodes, each node accumulates evidence for serving as a centre point, and the optimal set of cluster representatives is finally determined to complete the clustering.

Fig 2. Flow chart of neighbour propagation clustering
The nearest neighbour propagation algorithm does not require the number of clusters, which cannot be known in advance, to be specified, and its cluster centres are actual samples, which makes the result easy to interpret. This paper therefore chooses the nearest neighbour propagation algorithm to cluster the user's power consumption modes according to the load fingerprint. The cluster analysis process is shown in Fig. 2. The specific process is described as follows:
(1) Set the algorithm parameters, including the attenuation coefficient $\lambda$, the maximum number of iterations $T$, and the maximum number of iterations $T'$ for which the cluster centres remain unchanged.
(2) Take the single-day per-unit three-phase power scatter data as a sample point.
(3) Euclidean distance is selected as the similarity measure between the sample points. The similarity $s(i,k)$ between different sample points $x_i$ and $x_k$ is calculated as follows:
$$ s(i,k) = -\|x_i - x_k\|^2, \quad i \neq k $$
$s(i,k)$ is taken as the element at the corresponding position of the similarity matrix, and the median of all the similarity elements is taken as the diagonal element $s(k,k)$, which generates the similarity matrix.
(4) Initialize the degree of attraction matrix R and the degree of belonging matrix A to N×N zero matrices, so that $r^{0}(i,k) = 0$ and $a^{0}(i,k) = 0$. At iteration $t+1$ the elements of the two matrices are updated using the standard damped update rules of affinity propagation:
$$ r^{t+1}(i,k) = (1-\lambda)\Big[s(i,k) - \max_{k' \neq k}\big\{a^{t}(i,k') + s(i,k')\big\}\Big] + \lambda\, r^{t}(i,k) $$
$$ a^{t+1}(i,k) = (1-\lambda)\min\Big\{0,\ r^{t+1}(k,k) + \sum_{i' \notin \{i,k\}} \max\big\{0, r^{t+1}(i',k)\big\}\Big\} + \lambda\, a^{t}(i,k), \quad i \neq k $$
$$ a^{t+1}(k,k) = (1-\lambda)\sum_{i' \neq k} \max\big\{0, r^{t+1}(i',k)\big\} + \lambda\, a^{t}(k,k) $$
(5) The clustering centre is determined by summing the degree of belonging and the degree of attraction of each sample point. For the sample point $x_i$, let $k$ be the index at which $a(i,k) + r(i,k)$ takes its maximum value. If $k = i$, the sample point $x_i$ is itself determined to be a cluster centre; if $k \neq i$, the sample point $x_k$ is determined to be the cluster centre of $x_i$. The clustering algorithm ends when the cluster centres have remained unchanged for the pre-set maximum number $T'$ of iterations, or when the number of iterations reaches the pre-set maximum number $T$.
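The message-passing procedure described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: it uses the standard damped affinity propagation updates, takes the preference as the median of the off-diagonal similarities, and declares a sample a cluster centre when a(k,k) + r(k,k) > 0. The function name `affinity_propagation`, the parameter name `stable_iter`, and the one-dimensional toy points are hypothetical.

```python
from statistics import median


def affinity_propagation(points, damping=0.5, max_iter=500, stable_iter=50):
    """Return (cluster-centre indices, centre index assigned to each point)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two sample points")
    # Similarity: negative squared Euclidean distance between samples.
    s = [[-sum((u - v) ** 2 for u, v in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    # Diagonal (preference): median of the off-diagonal similarities.
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = pref
    r = [[0.0] * n for _ in range(n)]  # degree of attraction
    a = [[0.0] * n for _ in range(n)]  # degree of belonging
    centres, unchanged = [], 0
    for _ in range(max_iter):
        # Update the degree of attraction r(i, k) with damping.
        for i in range(n):
            for k in range(n):
                best_other = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)
        # Update the degree of belonging a(i, k) with damping.
        for k in range(n):
            pos = [max(0.0, r[j][k]) for j in range(n)]
            total = sum(pos) - pos[k]  # positive attraction from all j != k
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
        current = [k for k in range(n) if a[k][k] + r[k][k] > 0]
        unchanged = unchanged + 1 if current and current == centres else 0
        centres = current
        if unchanged >= stable_iter:  # centres stable for stable_iter rounds
            break
    if not centres:
        return [], [-1] * n
    labels = [i if i in centres else max(centres, key=lambda k: s[i][k])
              for i in range(n)]
    return centres, labels


# Toy samples: two well-separated groups on a line (illustrative only).
toy_points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
centres, labels = affinity_propagation(toy_points)
```

The triple loop is cubic in the number of samples per iteration, so this form is only suitable for small illustrations; a vectorised or library implementation would be preferable for a full year of daily fingerprints.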

Secondary screening of power consumption anomaly based on load fingerprint clustering
The composition of the power consumption equipment of industrial and commercial users is relatively fixed. The combination of equipment in use is determined by the shift arrangement corresponding to the production and operation state. The power consumption mode and the corresponding production and operation state of a user can be identified from the load fingerprint [21,22]. Because the number of production and operation states of a specific user is always limited, its power consumption mode changes between a limited number of states, and the corresponding load fingerprints can be grouped into different clusters according to the production and operation states [23,24]. The load power corresponding to these production and operation conditions differs, but they are all normal load conditions. Stealing electricity changes the structure of the load fingerprint and forms a new load fingerprint cluster.
According to the user's historical power consumption data, the load fingerprints of the user under different production and operation conditions can be determined. During the detection of power stealing, if abnormal power consumption of the user is detected, the load fingerprints in the period of abnormal consumption can be clustered together with the load fingerprints of the historical data without abnormal consumption and compared according to the production and operation states that the fingerprints mark, so as to judge whether the consumption has decreased because the user switched production and operation state or because of genuinely abnormal power consumption. The specific process is shown in Fig. 3, and the steps are as follows. First, an existing algorithm is used to detect abnormal power consumption and identify the users suspected of power stealing. Second, the power consumption data of the suspected users for nearly one year is selected and, with the maximum single-phase power as the reference value, the per-unit value of the 15/30 min three-phase power of each day is calculated as the load fingerprint. Third, the nearest neighbour propagation algorithm is used to cluster and analyse the load fingerprints before and during the suspected power stealing period.

Fig 3. Flow chart of clustering based electricity theft verification
Finally, determine whether the load fingerprints in the period of abnormal power decline form a new cluster centre. If, after clustering, the daily load fingerprints in the period of abnormal power consumption fall into the same clusters as the earlier fingerprints without adding new clusters, the decrease in power consumption can be attributed to a change of production and operation state, and the suspicion of power stealing can be ruled out. If the load fingerprints in the suspected period are grouped into new clusters, the case is identified as suspected power stealing that needs on-site inspection and confirmation.
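The decision rule in this final step can be sketched as below. The sketch assumes that cluster labels for all days are already available from the clustering step and that the suspicious days are marked by an earlier detector; the function name `screen_suspicious_days` and the toy labels are hypothetical.

```python
from typing import Dict, List, Sequence


def screen_suspicious_days(labels: Sequence[int],
                           suspicious: Sequence[bool]) -> Dict[str, List[int]]:
    """Split suspicious days by whether their cluster also contains normal days.

    'explained': the day shares a cluster with historical normal days, so the
                 drop is attributed to a change of operation state.
    'suspected': the day lies in a cluster made up only of suspicious days,
                 i.e. a new cluster that calls for on-site inspection.
    """
    if len(labels) != len(suspicious):
        raise ValueError("labels and suspicious flags must have equal length")
    normal_clusters = {lab for lab, flag in zip(labels, suspicious) if not flag}
    result: Dict[str, List[int]] = {"explained": [], "suspected": []}
    for day, (lab, flag) in enumerate(zip(labels, suspicious)):
        if flag:
            key = "explained" if lab in normal_clusters else "suspected"
            result[key].append(day)
    return result


# Toy example: days 4-6 were flagged; day 6 clusters with normal days.
toy_labels = [0, 0, 1, 1, 2, 2, 1]
toy_flags = [False, False, False, False, True, True, True]
screening = screen_suspicious_days(toy_labels, toy_flags)
```

Only the days listed under `suspected` would be passed on for on-site inspection in this sketch.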

Numerical simulation
The proposed method is tested and simulated on the 15-minute interval metering data of 40 users in four industries (knitted products manufacturing, metal products manufacturing, cement products manufacturing and plastic products manufacturing) verified by the power grid company. A textile factory user is taken as an example below to illustrate the process and effect of the proposed method in screening the user's abnormal electricity consumption. Fig. 4 shows the daily power consumption curve of the user in 2019. It can be seen from the figure:
(1) The electrical load of the textile mill has a clear periodicity, and the load on weekends is significantly lower than on weekdays.
(2) The first ten days of February are the Spring Festival holiday, during which the user is in a state of complete shutdown. The load before and after the Spring Festival holiday gradually falls and recovers as workers leave for home and return.
(3) In mid-to-late August, the power consumption of the user decreased significantly. On-site inspection confirmed that the user was stealing electricity by shunting the phase B current transformer, so that the measured load of this phase was only about 20% of the actual load. The theft period is from Aug. 9 to Sept. 2.
(4) On the first day of the May Day holiday and the first 4 days of the National Day (October 1) holiday, users are close to
The load fingerprint clustering analysis of the user's power stealing period and the previous historical load data is carried out with the nearest neighbour propagation algorithm. In the simulation experiment, the parameters of the algorithm are set as follows. The attenuation coefficient λ mainly affects the convergence time of the clustering and has no effect on the clustering results; in this paper it is set to 0.5. The maximum number of iterations is set to 500, the maximum number of iterations for which the clustering centres remain unchanged is set to 50, and the preference is set to the median of all values in the similarity matrix. According to the nearest neighbour propagation analysis, the data of the power stealing period and the previous 9 months are grouped into 6 clusters. The date composition, corresponding production and operation state and cluster centre of each cluster are listed in Table 1. For the convenience of comparison and explanation, the daily load curve and three-phase power scatter cloud diagram of each cluster centre are also plotted, as shown in Fig. 5 and Fig. 6.

Conclusions
Data-driven detection methods are easily affected by power fluctuations, which leads to a high false positive rate and hinders their industrial application. To address this problem, a load fingerprint based approach is proposed to identify the operation state of a user. Based on an analysis of the correspondence between the user's production and operation state and the power consumption mode, a load fingerprint model based on the user's three-phase power throughout the day is constructed, which can identify the production and operation state corresponding to the user's power consumption mode from the load fingerprint. Numerical simulation of multiple users in 4 industries verifies the effectiveness of the proposed method.
Fig 1. Scatter plot of 3-phase power consumption of a user: (a) working day mode; (b) holiday mode

Fig 4. Abnormal electricity user load curve

The load fingerprint is calculated according to the user's daily metering data. The user's load data can be expressed as $X = \{x_d\},\ d = 1, \cdots, 365$, where $x_d = [P^{A}_{d,1}, \cdots, P^{A}_{d,96};\ P^{B}_{d,1}, \cdots, P^{B}_{d,96};\ P^{C}_{d,1}, \cdots, P^{C}_{d,96}]$.

Fig 5. Load profile of clustering centres
Fig 6. Scatter plot of cluster centres

Combined with the charts, the detailed analysis of the clusters is as follows:
Cluster 1 denotes the operation state of the user during the Spring Festival and May Day holidays. The workers are on vacation and the machines are stopped. Only basic loads such as lighting are in use. In the three-phase power scatter diagram, the points are concentrated in the lower right corner near zero power and in the low-power region at the bottom.
Cluster 2 denotes the operation state of the user on normal working days. The textile mill is in production mode. The equipment runs without stopping, and the load is evenly distributed throughout the day, being slightly larger in the day than at night. Because the load is large throughout the day, all the scatter points are far away from the lower right corner near zero power and from the low-power region at the bottom, and are distributed along the diagonal in the upper part of the scatter diagram.
Cluster 3 denotes the periods before and after the Spring Festival in which workers successively leave for home or return to work. During these periods only part of the equipment is put into operation in the daytime because the plant is not fully started up, and the load is slightly lower than on a normal working day. The three-phase power scatter diagram is divided into two parts: the daytime measurement data, like those of cluster 2, are distributed along the diagonal in the upper part of the scatter diagram, while the night-time measurement data are concentrated in the lower right corner near zero power.
Cluster 4 denotes the normal weekly rest day and the Qingming Festival. The weekly rest day of the textile factory is set on Friday. The load level in both day and night is significantly lower than on the normal working day of cluster 2. The three-phase power scatter cloud is distributed at the bottom.
Cluster 5 and cluster 6 denote the normal working day and the weekly rest day during the power stealing period, respectively. As nearly 80% of the phase B current is shunted, the phase B daily load curve on the cluster centre day is significantly lower than that of the other two phases. Both clusters clearly deviate from the diagonal in the scatter cloud and are distributed near the low-power area of phase B on the right side of the graph. The points of cluster 6 are clustered tightly in the lower right corner near zero power because of the light load during the day. Therefore, two new clusters are formed during the period of decline in power consumption, which are different from the previous ones.
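The departure of clusters 5 and 6 from the diagonal can be illustrated numerically. The sketch below measures the Euclidean distance from a per-unit three-phase power point to the line where all three phases are equal; this indicator, the function name `diagonal_deviation` and the per-unit value 0.8 are assumptions made for illustration, while the 20% measured share of phase B is taken from the case described in the text.

```python
import math


def diagonal_deviation(pa: float, pb: float, pc: float) -> float:
    """Distance from the point (pa, pb, pc) to the diagonal pa = pb = pc."""
    mean = (pa + pb + pc) / 3.0
    return math.sqrt((pa - mean) ** 2 + (pb - mean) ** 2 + (pc - mean) ** 2)


# A balanced working-day point (hypothetical per-unit value 0.8 on each phase).
balanced = diagonal_deviation(0.8, 0.8, 0.8)

# The same point when phase B is metered at only 20% of its actual value.
shunted = diagonal_deviation(0.8, 0.2 * 0.8, 0.8)
```

A balanced point lies on the diagonal, whereas the shunted point is displaced by a distance proportional to the load level, which is consistent with the working-day cluster of the theft period sitting clearly off the diagonal while the light-load rest-day cluster stays close to the zero-power corner.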

Table 1. Clustering of load signature