Proceedings of Engineering and Technology Innovation, vol. 1, 2016, pp. 15-18
Copyright © TAETI

The Application of Target Analysis in Electricity Demand-Side Management

Fung-Fei Chen 1, Seng-Cho Chou 2,*, Chiao-Yi Wang 2, Tai-Ken Lu 3

1 Taiwan Power Company Research Institute, Taipei, 100, Taiwan
2 Department of Information Management, National Taiwan University, Taipei, 100, Taiwan
3 Department of Electrical Engineering, National Taiwan Ocean University, Keelung, 202, Taiwan

Received 13 April 2015; received in revised form 11 May 2015; accepted 18 June 2015

Abstract

Recently, target analysis combined with database technology and data mining has been widely used in industries such as marketing, finance, insurance, telecommunications, advertising, and e-commerce. Because of the unique complexities of user behavior in electricity demand, examples of target analysis applications in this field have yet to be seen. Considering the industry’s urgent need to enhance the efficiency of electricity demand-side management, this study aims to build a mining analysis model for potential target users of interruptible load that both fully reflects consumer behavior characteristics and serves as a rule for static comparisons. The results of a data mining analysis of the Taiwan Power Company (Taipower)’s interruptible loads 1 to 6 show that the number of potential target users is 1,669, which is 21% of the original mining population. Additionally, the target users classified as having “the most potential” for all categories of interruptible load accounted for only 0.76% of the total mining population (= 59/7814), verifying the mining effects.

Keywords: Target analysis, data mining, association rule, electricity demand-side management

1. Introduction

In management, target analysis has always been crucial to enterprise operations because it enables full marketing exploitation of the 80/20 rule and increases the opportunities for product recommendation and cross selling. Another aspect of target analysis that has attracted significant attention and extensive discussion in recent years is its ability to identify valuable potential target users. Target analysis coupled with database technology and data mining, which emerged from the analysis of large amounts of data, is currently applied extensively in the fields of marketing, finance, insurance, manufacturing, and medical care [2, 4, 13, 16]. By employing target analysis data mining, marketers can obtain hidden knowledge or identify the most commercially valuable information from large amounts of unsorted data. When used appropriately, target analysis can provide significant overall benefits for the organization.

Most previous studies employed decision tree analysis or data mining through artificial neural networks to identify and extract the characteristics of potential target users, before establishing comparison rules to enable companies to examine potential customers as marketing targets [3, 5, 6, 8, 10, 12, 19]. The purpose of these investigations was to determine the consumer behavior characteristics of existing customers, and then use these characteristics to identify hidden potential target customers. However, in industries where consumer behavior is influenced by industry policies or other external factors, existing customers do not exhibit the same consumer behavior as potential customers. Establishing a comparison rule between existing and potential customers is therefore difficult, resulting in a lack of research in this area.
This phenomenon is particularly common in telecommunication and utility industries, where marketing measures may adversely affect the consumer behaviors of existing users, decreasing the objectivity of the consumer behavior characteristics identified using target analysis and further hindering comparisons [1, 14, 18].

To address these issues and compensate for the disadvantages of current target marketing analysis methods, we conduct a case study of Taipower’s demand-side management data mining analysis of potential target users. This study also constructs a mining analysis model for potential target users, which both fully reflects consumer behavior characteristics and serves as a rule for static comparisons. We hope that the target analysis mining model proposed in this study can provide a list of the most valuable potential users of an interruptible load. The analysis model is also expected to provide a useful reference for identifying potential target users for other purposes related to electricity demand, thereby facilitating the demand-side management of electricity.

2. Experiment

2.1. Research design

Taipower’s demand-side management measures include time-of-use rates, seasonal rates, interruptible loads, and air conditioning control; among these, interruptible load is the most effective for reducing peak power loads. Although Taipower introduced its interruptible load policy in 1987 and offers seven interruptible load categories, there were only 670 interruptible load users by the end of 2005. Compared to the target population of over 10,000 qualified users, the number of actual users is minimal, presenting significant scope for promotion. Therefore, target analysis and mining analysis of large amounts of user data must be employed to identify potential target users and expand the scope of promotion [9].

Target analysis generally includes customer data collection, data analysis, the extraction of segmentation variables, and the establishment of segmentation market characteristics [7]. When identical consumer behavior data are unavailable and it is uncertain whether the basic variables are sufficient for market segmentation, the consumer behavior of existing customers can be used as the foundation for segmentation to identify the basic variable characteristics of each market segment. However, applying the existing target marketing analysis method, which selects the differing basic variables of the segmentation market for characteristic analysis, may lead to losses of consumer behavior-related data or deficiencies in the descriptive variable dimensions, thereby preventing the effective convergence of the target marketing scope.

To solve this problem, we determine which behavioral data of existing customers have changed because of promotional measures, identify the basic variables that are related to this behavior but not subject to such influence, and eventually establish rational comparison rules for potential target customers as the target analysis method in this study. The target marketing analysis model for cases that lack identical consumer behavior data is shown in Fig. 1.

Fig. 1 The target marketing model used when there is insufficient information about equivalent consumption behavior

*Corresponding author. Email address: chou@ntu.edu.tw

2.2. Data description

This study primarily examines the data of low-voltage users who qualify for Taipower’s interruptible load policies, categorizing them as either usage data or basic data.
Designed to record the various electricity consumption behaviors of existing users who adopt interruptible load policies, the usage data includes 79 variables written in numerical form, excluding “electricity no.” and “industry.” To ensure the quality of this research, we first omit a number of insignificant columns based on the experience of experts. Then, to facilitate subsequent data analysis, we categorize the remaining 51 fields into five electricity consumption factors according to their features and relevance.

(1) “Frequent and Peak” demand or reading group: All 16 variables show users’ frequent electricity consumption behavior during peak hours.
(2) “Off-Peak” demand or reading group: All 8 variables show users’ electricity consumption behavior during off-peak hours.
(3) “Saturday Semi-peak” demand or reading group: All 8 variables show users’ electricity consumption behavior during semi-peak hours.
(4) “Comparative” demand or reading group: All 11 variables show users’ varying electricity consumption behaviors in different periods.
(5) “Others” group: The remaining 8 variables.

Additionally, of the 38 basic data variables, which include both classifying and analysis variables, a number of variables that were irrelevant to electricity consumption behavior were omitted based on expert experience. Data of the 7 remaining variables were used to develop comparison rules for identifying the potential users who may accept interruptible load policies, as shown in Table 1.

Table 1 The relevant variables of users’ basic data

By examining the electricity consumption behaviors of existing users under the various interruptible load policies and segmenting them by their objective effectiveness or by the amount they can generate, we aim to identify the customers with the most potential, the second most potential, and the least potential among the users of the various interruptible loads.
This study also correlates the basic files of existing users and extracts the unique characteristics from the basic data of this beneficial user type. A comparison rule for identifying potential target users is then developed to narrow the target marketing scope and make the best use of the limited time and cost resources.

2.3. Data mining process

Because of the requirements of the analysis model and the characteristics of the relevant data adopted in this study, we employ the following techniques for data mining analysis: principal component analysis, data clustering, correlation analysis, association rules, and MANOVA tests. The resulting data mining analysis process is shown in Fig. 2.

Fig. 2 Data analysis process

2.4. Results

The results of all data mining stages are presented below.

2.4.1. Analysis of data dimension reduction

To determine the most suitable linear equations, this study adopted principal component analysis. By condensing the variables into a representative indicator, the research complexity and dimensions were reduced effectively, facilitating the smooth operation of subsequent research processes. The principal component analysis equation is as follows:

$$
\begin{aligned}
Y_1 &= a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p = \mathbf{a}_1^{\top}\mathbf{X} \\
&\;\;\vdots \\
Y_p &= a_{p1}X_1 + a_{p2}X_2 + \cdots + a_{pp}X_p = \mathbf{a}_p^{\top}\mathbf{X}
\end{aligned}
\tag{1}
$$

where Y_1 to Y_p are the representative principal component indicators extracted from a certain factor dimension. Depending on the differences in the internal data, each factor dimension may possess 1 to p representative indicators, where p is the number of variables in that dimension. A suitable critical threshold (cumulative explanatory variance) can be determined for the selection.

After repeated verification, we found that principal component analysis of dimensions 4 and 5 does not provide good results and fails to converge dimensions effectively. To ensure the credibility, reliability, and explanatory power of the data analysis results, only the first representative principal component indicators of the first three usage data dimensions described in Section 2.2 (PC1 is the first representative principal component indicator of dimension 1, PC2 is that of dimension 2, etc.) were used as segmentation variables for future data collection, as shown in Table 2.

Table 2 Cumulative explanatory variance of representative principal component indicators

2.4.2. Cluster analysis

The most commonly used data clustering methods include hierarchical clustering, non-hierarchical clustering, and artificial neural networks [15]. Among them, the K-means method provides superior clustering results and higher efficiency compared to hierarchical clustering and artificial neural network clustering [11, 15, 17]. Using two-dimensional and three-dimensional scatter plots and a chi-square plot, we verified that the usage data in this study is normally distributed. After assessments and experimentation, the K-means method was employed for cluster analysis. Tables 3 and 4 show the cluster analysis results for the data of existing users.

Table 3 Cluster analysis results: the number of existing users in each cluster

Table 4 Cluster analysis results: the number of interruptible load users belonging to more than one cluster

2.4.3. MANOVA test

A MANOVA test was conducted in this study to determine whether the clustering results differentiate between the clusters. Using Wilks’ lambda, all impact factors (i.e., segmentation variables) were examined to determine whether they had a significant influence on the overall segmentation results.
If the probability associated with the Wilks’ lambda statistic, evaluated against the critical chi-square distribution value, is minimal (< 0.0001), we can conclude that the impact factor has substantial influence (i.e., discriminatory ability), as shown in Eq. (2). From a quantitative analysis perspective, if each impact factor has significant influence, the credibility of the cluster analysis results can be verified, as shown in Table 5.

Table 5 MANOVA test results for interruptible load 1

$$
\begin{aligned}
&H_0:\ \tau_1 = \tau_2 = \cdots = \tau_g = 0 \quad \text{(no factor effects)} \\
&H_1:\ \text{at least one } \tau_i \neq 0 \\
&\text{Let } \Lambda^{*} = \frac{|SSP_{res}|}{|SSP_{fac} + SSP_{res}|} \\
&\text{For large samples, reject } H_0 \text{ if } -\left[ gb(n-1) - \frac{p + 1 - (g-1)}{2} \right] \ln \Lambda^{*} > \chi^{2}_{(g-1)p}(\alpha)
\end{aligned}
\tag{2}
$$

where SSP_fac and SSP_res denote the factor and residual sum-of-squares-and-cross-products matrices, respectively.

2.4.4. Correlation analysis

Because the usage behavior of existing interruptible load users changes, basic data were used to analyze user characteristics. To ensure that the basic variables retained after correlation reflect both the electricity consumption behavior of all users and the results of cluster analysis, correlation analysis should be performed between the analytical variables and the three segmentation variables in Table 2 that were employed for cluster analysis (Table 6). Suitable variables identified in the analysis can be used for further mining analysis. An example of the test results for interruptible load 1 is shown below.

Table 6 Correlation analysis results for interruptible load 1

Of the analytical variables of interruptible load 1, CS_CAPACITY and CS_UP_CAPACITY are the most relevant to users’ consumption behaviors; thus, they can be included as the descriptive variables of interruptible load 1.

2.4.5. Association rule analysis

Association rule analysis was employed to establish data correlation rules. Candidate rules are first checked against the thresholds for minimal support and minimal confidence, and the Apriori algorithm is then used to select the appropriate correlation rules.
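
The support and confidence checks described above can be illustrated with a minimal, self-contained sketch of Apriori-style rule mining. This is not the tooling used in the study: the user records, the attribute labels (`contract=A`, `industry=11`, `usage=5`), and the function names `apriori` and `association_rules` are hypothetical, and the "importance" threshold reported with the results is not modeled here.

```python
from itertools import combinations

def apriori(transactions, min_support_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count how many records contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support_count}
    all_frequent = dict(frequent)
    k = 2
    while frequent:
        prev = list(frequent)
        # Join step: build size-k candidates from frequent (k-1)-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support_count}
        all_frequent.update(frequent)
        k += 1
    return all_frequent

def association_rules(all_frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for single-item consequents."""
    rules = []
    for itemset, count in all_frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            # confidence(A -> b) = support(A and b) / support(A)
            confidence = count / all_frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, item, confidence))
    return rules

# Hypothetical basic-data records, one set of attribute labels per user.
users = [
    {"contract=A", "industry=11", "usage=5"},
    {"contract=A", "industry=11", "usage=5"},
    {"contract=A", "industry=11", "usage=5"},
    {"contract=A", "industry=11"},
    {"contract=A", "industry=11"},
    {"contract=A", "industry=11", "usage=5"},
    {"contract=B", "industry=11"},
    {"contract=A", "industry=20"},
]

frequent = apriori(users, min_support_count=6)
rules = association_rules(frequent, min_confidence=0.8)
for antecedent, consequent, confidence in sorted(rules, key=lambda r: r[1]):
    print(sorted(antecedent), "->", consequent, round(confidence, 3))
```

In this toy data, only the pair of `contract=A` and `industry=11` reaches the support count of 6, and both rules derived from it have a confidence of 6/7, which exceeds the 0.8 threshold.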

Data Analysis Results: An example of the representative characteristic behaviors and correlation mining thresholds of users with “the most potential” for interruptible load 1 is shown below.

• Number of Existing Users: 6
• Minimal Support: 6
• Minimal Confidence: 0.8
• Importance: 0.2
• Essential Association Rules:
  For users with an “A” contract type, the likelihood that they are from the top 20 industry types = 11 (basic metal industries) is > 0.8.
  For three-stage users in Category 2, the likelihood that they are from the top 20 industry types = 11 (basic metal industries) is > 0.8.
  For users whose usage type = 5, the likelihood that they are three-stage users in Category 2 is > 0.8.
  For users whose usage type = 5, the likelihood that their contract type = 5 is > 0.8.

When the necessary correlation rules for all clusters are condensed into rigorous comparison rules, the number of potential target users and existing users of every cluster can be identified, as shown in Table 7. The total number of potential target users shown in Table 7 is 4,687. However, because a number of them overlap between clusters, the actual number of potential target users is 3,770. Additionally, some of the clusters have few existing users (no more than 10, and some as few as 1 or 2); thus, the comparison rules established with these clusters are relatively loose and can result in an overestimation of potential target users. By contrast, potential target users mined from clusters with more existing users yield superior convergence. The potential target users from these clusters (excluding clusters with only 1 or 2 existing users) total 2,058; excluding overlapped users, the actual number of potential target users is 1,669, which is 21% of the original mining population.
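
The difference between the summed per-cluster counts and the actual number of potential target users arises because a user matched by several clusters is counted once per cluster in the sum. A minimal sketch of this de-duplication follows, assuming hypothetical cluster names and user IDs; only the figures 1,669 and 7,814 are taken from this paper.

```python
# Hypothetical user IDs matched by the comparison rules of three clusters.
cluster_matches = {
    "cluster_a": {"U01", "U02", "U03"},
    "cluster_b": {"U02", "U03", "U04"},
    "cluster_c": {"U05"},
}

# Summing per-cluster counts counts overlapping users more than once.
total_with_overlap = sum(len(ids) for ids in cluster_matches.values())

# The set union keeps each user exactly once.
unique_users = set().union(*cluster_matches.values())

def share_percent(count, population):
    """Share of the mining population, in percent."""
    return 100.0 * count / population

# Reported figures: 1,669 potential target users out of a population of 7,814.
reported_share = share_percent(1669, 7814)

print(total_with_overlap, len(unique_users))
print(f"{reported_share:.1f}%")
```

With the reported figures, the share evaluates to about 21.4%, which agrees with the rounded value of 21% given in the text.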
The number of target users with the most potential from all interruptible load policies is 0.76% of the overall mining population (= 59/7814).

Table 7 Correlation rule comparison analysis results for all clusters

2.4.6. Generating a list of potential target users

Using the correlation rules extracted from the clusters, we compile a list of potential target users from among the customers not yet using interruptible load policies. Table 8 shows the first five results for users with the most potential in interruptible load 1.

Table 8 Interruptible load 1: users with the most potential

3. Conclusions

By applying target analysis to electricity demand-side management and mining analysis to a list of potential target users, this study develops a mining analysis model for potential target users of interruptible load that both fully reflects consumer behavior characteristics and serves as a rule for static comparisons. This model was designed to compensate for the deficiencies of the current target marketing analysis model. The results show that the mining analysis model proposed in this study can effectively narrow the target marketing scope to 21% of the overall mining population, a condensing capability of 79%, which facilitates the segmentation and differentiation of customers with various potential benefits and value. The proportion of target users with the most potential for all interruptible load policies is 0.76% of the overall mining population (= 59/7814). Each step of the mining analysis process underwent thorough quantitative testing (theoretical verification) and qualitative testing (industry knowledge and experience-based assessments). The practical significance of this study is that it provides the electricity industry with information on the most valuable potential target users.
The mining analysis model proposed in this study can be employed by relevant marketing decision makers to identify potential target customers for other demand-side management policies and can serve as a useful reference for utilities in the future.

References

[1] A. Al-Ghandoor, et al., "Residential past and future energy consumption: potential savings and environmental impact," Renewable and Sustainable Energy Reviews, 2008.
[2] E. Bayam, J. Liebowitz, and W. Agresti, "Older drivers and accidents: a meta-analysis and data mining application on traffic accident data," Expert Systems with Applications, vol. 29, pp. 598-629, 2005.
[3] J. Z. Bloom, "Market segmentation: a neural network application," Annals of Tourism Research, vol. 32, pp. 93-111, 2005.
[4] R. J. Brachman, T. Khabaza, W. Kloesgen, G. Piatetsky-Shapiro, and E. Simoudis, "Mining business databases," Communications of the ACM, vol. 39, pp. 42-48, 1996.
[5] S. W. Changchien and T. C. Lu, "Mining association rules procedure to support on-line recommendation by customers and products fragmentation," Expert Systems with Applications, vol. 20, pp. 325-335, 2001.
[6] S. Daskalaki, I. Kopanas, M. Goudara, and N. Avouris, "Data mining for decision support on customer insolvency in telecommunications business," European Journal of Operational Research, vol. 145, pp. 239-255, 2002.
[7] S. Dibb and P. Stern, "Questioning the reliability of market segmentation techniques," Omega International Journal of Management Science, vol. 23, pp. 625-636, 1995.
[8] S. H. Ha and S. C. Park, "Application of data mining tools to hotel data mart on the Intranet for database marketing," Expert Systems with Applications, vol. 15, pp. 1-31, 1998.
[9] http://www.taipower.com.tw, 2012.
[10] S. Y. Hung, D. C. Yen, and H. Y. Wang, "Applying data mining to telecom churn management," Expert Systems with Applications, vol. 31, pp. 515-524, 2006.
[11] A. K. Jain, M. N. Murty, and P. J. Flynn, "Data clustering: a review," ACM Computing Surveys, vol. 31, pp. 264-323, 1999.
[12] T. S. Lee, C. C. Chiu, Y. C. Chou, and C. J. Lu, "Mining the customer credit using classification and regression tree and multivariate adaptive regression splines," Computational Statistics and Data Analysis, vol. 50, pp. 1113-1130, 2006.
[13] D. R. Liu and Y. Y. Shih, "Integrating AHP and data mining for product recommendation based on customer lifetime value," Information & Management, vol. 42, pp. 387-400, 2005.
[14] S. Roberts, "Demographics, energy and our homes," Energy Policy, vol. 36, pp. 4630-4632, 2008.
[15] S. S. Shahapurkar and M. K. Sundareshan, "Comparison of self-organizing map with k-means hierarchical clustering for bioinformatics applications," Neural Networks, Proceedings. 2004 IEEE International Joint Conference on, pp. 1221-1226, 2004.
[16] W. E. Spangler, M. Gal-Or, and J. H. May, "Using data mining to profile TV viewers," Communications of the ACM, vol. 46, pp. 67-72, 2004.
[17] A. Ultsch, "Self-organizing neural networks perform different from statistical k-means clustering," Proc. Conf. Soc. for Information and Classification, Basel, 1995.
[18] W. O. Onuh, A. T. Valerio, and A. Permalino Jr., "Residential demand for electricity Dasmarinas, Cavite, Philippines," Journal of Global Business & Economics, vol. 2, pp. 1-22, 2011.
[19] C. H. Wu, S. C. Kao, Y. Y. Su, and C. C. Wu, "Targeting customers via discovery knowledge for the insurance industry," Expert Systems with Applications, vol. 29, pp. 291-299, 2005.