CHEMICAL ENGINEERING TRANSACTIONS, VOL. 61, 2017
A publication of The Italian Association of Chemical Engineering
Online at www.aidic.it/cet
Guest Editors: Petar S Varbanov, Rongxin Su, Hon Loong Lam, Xia Liu, Jiří J Klemeš
Copyright © 2017, AIDIC Servizi S.r.l.
ISBN 978-88-95608-51-8; ISSN 2283-9216

Research on Online Multiple Model Soft-sensor

Shijie Wang(a), Zhenlei Wang(a,*), Xin Wang(b)

(a) Key Laboratory of Advanced Control and Optimization for Chemical Processes, East China University of Science and Technology, Shanghai 200237, China
(b) Center of Electrical & Electronic Technology, Shanghai Jiao Tong University, Shanghai 200240, China
wangzhen_l@ecust.edu.cn

Most multiple model soft-sensors adapt to new operating conditions through offline updating. Replacing online models with offline ones inevitably reduces the efficiency of the soft-sensor and costs both manpower and time: maintenance staff need time to re-train complete models, which requires a large amount of historical data, before the existing models are replaced with new ones. This paper proposes a soft-sensor in which sub-models can be added or removed online. Density-based spatial clustering of applications with noise (DBSCAN) is employed for clustering analysis. Compared with the traditional kernel fuzzy clustering method (KFCM), DBSCAN is better at filtering out noise and at deciding whether a new working condition has appeared. However, the clustering results of DBSCAN are extremely sensitive to its input parameters. In this study, kernel density estimation (KDE) is applied to determine the number of subsets, and a novel method is proposed to determine the parameters. New sub-models can be added directly to the online models once they have been trained. The soft-sensor output is obtained by combining the sub-models in either a switching or a weighted manner.
The proposed method is applied to the measurement of the cracking depth of an ethylene cracking furnace, which demonstrates its practicability and effectiveness.

1. Introduction

It is essential for the process industry to achieve real-time estimation and control of key variables, which are closely related to safety and efficiency. In an actual plant, changes in feed components, operating conditions and other factors mean that the same process often has multiple operating points. To predict key variables under these conditions, the multiple model soft-sensor was proposed (Chu et al., 2009). It establishes multiple sub-models to cover the data characteristics of each operating point, and a clustering algorithm is applied to divide the data into subsets. Yang et al. (2012) proposed a soft sensor using a multi-model neural network (MNN) based on modified kernel fuzzy clustering (KFCM) to relieve a single data-based soft sensor of its heavy computational burden and poor accuracy. Tang et al. (2014) applied the Dempster-Shafer (D-S) rule and the least squares support vector machine (LSSVM) to improve the accuracy of weighted multiple soft-sensor models. For time-varying systems, a multi-mode moving-window Gaussian process regression (MWGPR) approach for ARX modelling has been used to capture process nonlinearity or switching dynamics (Xiong et al., 2016).

However, in all of the above methods, whether offline or online, the original model structure has to be changed when the model is updated, rather than being extended or reduced on the basis of the original part. Soft-sensors based on adaptive methods can update models with recently collected data, but the newly established models are only local models, which are valid only for the current operating conditions. If a soft-sensor is required to contain all the characteristics, the old models must be preserved. Re-clustering requires a large amount of historical data, which increases the difficulty of on-site maintenance.
It is also difficult to pick out the corresponding data from the database when a feature is no longer needed in the current models. To solve this problem, this study proposes a novel multiple model soft-sensor method based on density clustering. With the help of kernel density estimation (KDE), new characteristics of the inputs can be found. New subsets are collected by density-based spatial clustering of applications with noise (DBSCAN). New sub-models established by LSSVM are attached directly to the currently running soft-sensor structure without stopping the current models. Sub-models can also be deleted on demand. This method avoids overturning the original model, which greatly reduces the complexity of model training and retains the effective historical characteristics.

Please cite this article as: Wang S., Wang Z., Wang X., 2017, Research on online multiple model soft-sensor, Chemical Engineering Transactions, 61, 1795-1800 DOI:10.3303/CET1761297

2. Density-based spatial clustering of applications with noise and kernel density estimation

2.1 Kernel density estimation

Kernel density estimation is a nonparametric method commonly used to estimate an unknown density function. No assumptions about the data need to be made in advance: the method studies the distribution characteristics of the data set entirely from the data samples. In this study it is applied to analyse the distribution of unknown inputs, and the results help the DBSCAN algorithm to determine the number of subsets. Let $x_1, x_2, \ldots, x_n$ be independent, identically distributed random variables in the data set $D^m$ ($m$ dimensions). The density function $f_h(x)$ is defined as:

$$f_h(x) = \frac{1}{n} \sum_{i=1}^{n} K_{m,h}(x - x_i) \tag{1}$$

where $K_{m,h}(x) = \prod_{j=1}^{m} \frac{1}{h_j} K\!\left(\frac{x^{(j)}}{h_j}\right)$, $x^{(j)}$ is the $j$-th component of $x$, and $K(\cdot)$ is the kernel function, such as the Gaussian function.
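As a rough illustration of Eq. (1), the following Python sketch evaluates the product-kernel estimate on synthetic data and counts the peaks obtained for several values of $h$. It assumes a Gaussian kernel. The function names `kde` and `count_peaks`, the two-mode synthetic data and the values of $h$ are illustrative assumptions and are not taken from this study.

```python
import numpy as np

def gaussian_kernel(u):
    """One-dimensional standard Gaussian kernel K(u)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kde(points, samples, h):
    """Evaluate the product-kernel estimate of Eq. (1).

    points:  (q, m) array of locations where the density is evaluated
    samples: (n, m) array of observed data x_1..x_n
    h:       scalar or length-m array of per-dimension smoothing parameters
    """
    points = np.asarray(points, dtype=float)
    samples = np.asarray(samples, dtype=float)
    h = np.broadcast_to(np.asarray(h, dtype=float), (samples.shape[1],))
    u = (points[:, None, :] - samples[None, :, :]) / h   # shape (q, n, m)
    k = np.prod(gaussian_kernel(u) / h, axis=2)          # K_{m,h}(x - x_i)
    return k.mean(axis=1)                                # average over the n samples

def count_peaks(density):
    """Count strict local maxima of a density sampled on a 1-D grid."""
    d = np.asarray(density)
    return int(np.sum((d[1:-1] > d[:-2]) & (d[1:-1] > d[2:])))

# Synthetic one-dimensional data with two operating modes (assumed for illustration).
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(0.0, 0.3, 200),
                          rng.normal(3.0, 0.3, 200)]).reshape(-1, 1)
grid = np.linspace(-2.0, 5.0, 701).reshape(-1, 1)

for h in (0.05, 0.3, 3.0):
    density = kde(grid, samples, h)
    print(f"h={h}: {count_peaks(density)} peak(s)")
```

In this sketch the number of peaks is simply read off a grid; for real multi-dimensional inputs a more careful peak search would probably be needed.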
$h$ is called the bandwidth (BW) and has a great effect on the result of KDE. When $h$ increases, some features are filtered out by the density estimate, so that the final result is one large cluster. In contrast, when $h$ decreases, many small clusters are highlighted, but they may not be effective peaks that contain valid features.

Figure 1: Density distribution based on KDE

Figure 1 shows the density estimate of a data set. When $h$ is 0.15, a reasonable distribution is obtained. When $h$ is raised to 0.2, some features merge. In contrast, when $h$ is reduced to 0.05, some features become particularly prominent. Thus, as long as the estimate shows a suitable set of peaks, the current distribution of the data set can be determined easily.

2.2 Density-based spatial clustering of applications with noise

DBSCAN is a density-based clustering method. Its main idea is that, within the neighbourhood of radius Eps, the number of objects around each object of a cluster must be equal to or greater than the given value MinPts. Eps is a radius-like parameter that mainly limits the size of an object's neighbourhood. MinPts is the smallest number of objects required in the spherical area that has the object as its centre and Eps as its radius. The following distance formula is used to count the objects contained in the neighbourhood:

$$D(p,q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_m - q_m)^2} \tag{2}$$

According to the number of objects contained in its neighbourhood, each object falls into one of three categories. The first is the core point, whose neighbourhood contains at least MinPts objects and which usually plays an important role in the cluster. It connects to the other core points and boundary points in the surrounding dense region to form one cluster.
The second is the boundary point, whose neighbourhood contains fewer than MinPts objects but at least one core point; it is attached as a member to that nearby core point. If no core point can be found in the neighbourhood, the object does not belong to any cluster and is regarded as noise.

Setting appropriate values of MinPts and Eps is a long-standing problem. In particular, Eps directly affects the distribution of the core points, so minor changes may lead to very different clustering results. As described in Section 2.1, KDE helps to solve this problem. In this study, the distribution of the inputs is first examined to determine the number of clusters, and Eps is then approximated by bisection. The specific steps are as follows:

Step 1: Let $x_1, x_2, \ldots, x_n$ be independent, identically distributed random variables in the data set $D^m$ ($m$ dimensions). Calculate the centre point of $D^m$ and, with the distance formula, obtain the maximum distance in $D^m$, called DMax, and the minimum distance in $D^m$, called DMin. Set the number of clusters $C$ according to KDE. Set the initial value of MinPts between 50 and 100. Go to Step 2.
Step 2: Calculate $Eps = (DMax + DMin)/2$. Then run $DBSCAN(D^m, Eps, MinPts)$ to obtain the actual number of clusters $C'$. Go to Step 3.
Step 3: If $C' > C$ (too many clusters, so Eps is too small), set $DMin = Eps$. If $C' < C$ (too few clusters, so Eps is too large), set $DMax = Eps$. If $C' = C$, go to Step 4; otherwise go back to Step 2.
Step 4: Adjust MinPts according to the actual results. If it is increased, the number of core points is reduced, but more noise points are rejected.

Using the above method, subsets that contain new characteristics are collected. This screening is carried out only on the data whose predictions contain errors, and the original models are retained, so there is no need to store the data used for the original models. Compared with other clustering methods, the amount of training data involved in clustering can be greatly reduced, which eliminates unnecessary duplication and saves time.
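The bisection search for Eps in Steps 1 to 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this study: DMax and DMin are taken to be the largest and smallest distances from the samples to the centre point, the compact `dbscan` function is a plain textbook version written for the example, the name `search_eps` is hypothetical, and MinPts is set to 5 only because the synthetic data set is tiny.

```python
import numpy as np

def dbscan(X, eps, min_pts):
    """Minimal DBSCAN. Returns labels: -1 for noise, 0..k-1 for clusters."""
    n = len(X)
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))  # Eq. (2)
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    core = np.array([len(nb) >= min_pts for nb in neighbours])
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not core[p]:
                continue  # boundary points join a cluster but do not extend it
            for q in neighbours[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels

def search_eps(X, target_clusters, min_pts, max_iter=50):
    """Bisection on Eps until DBSCAN returns the target number of clusters."""
    centre = X.mean(axis=0)
    d = np.sqrt(((X - centre) ** 2).sum(axis=1))
    d_min, d_max = d.min(), d.max()        # assumed meaning of DMin and DMax
    for _ in range(max_iter):
        eps = (d_min + d_max) / 2.0        # Step 2
        labels = dbscan(X, eps, min_pts)
        found = int(labels.max()) + 1
        if found == target_clusters:       # Step 3: C' = C
            return eps, labels
        if found > target_clusters:
            d_min = eps                    # too many clusters: Eps too small
        else:
            d_max = eps                    # too few clusters: Eps too large
    raise ValueError("no Eps found for the requested number of clusters")

# Three small synthetic clusters on regular grids (assumed data).
g = np.arange(5) * 0.2 - 0.4
block = np.array([(x, y) for x in g for y in g])
X = np.vstack([block + (0.0, 0.0), block + (2.5, 0.0), block + (10.0, 0.0)])

eps, labels = search_eps(X, target_clusters=3, min_pts=5)
print("Eps:", round(float(eps), 3), "clusters:", int(labels.max()) + 1)
```

The search only explores Eps values between DMin and DMax and relies on the cluster count falling as Eps grows, so a data set whose clusters separate only at an Eps below DMin would not converge in this sketch.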
By applying KDE, the new characteristics of the current operating conditions are revealed in detail. Conventional clustering methods, in contrast, lack a way to determine the number of clustering centres, which affects the actual clustering results and hence the soft-sensor.

3. Online multiple model soft-sensor

The support vector machine is a small-sample machine learning method that can be used for classifier design and numerical regression. In this study, subsets representing different characteristics of the operating conditions are selected by DBSCAN clustering, and the sub-models are established by LSSVM. The predictions of the sub-models, combined in a switching or weighted manner, form the final output of the soft-sensor. After the sub-models are established, the membership used to combine them is obtained from the following formula:

$$U_{ik} = \frac{\left[2 - 2K(\mathbf{x}_k, \mathbf{v}_i)\right]^{-1/(m-1)}}{\sum_{j=1}^{c} \left[2 - 2K(\mathbf{x}_k, \mathbf{v}_j)\right]^{-1/(m-1)}} \tag{3}$$

where $U_{ik}$ is the membership of the $k$-th input data $\mathbf{x}_k$ to the $i$-th clustering centre $\mathbf{v}_i$ and $c$ is the number of clustering centres. As in KFCM, $m$ in the exponent is the fuzzy weighting exponent rather than the data dimension of Section 2.1. If the switching way is chosen to combine the multiple models, $U_{ik}$ is replaced by $A_k$:

$$A_k = \arg\max_{i} U_{ik} \tag{4}$$

Therefore, the output of the multiple model soft-sensor is:

$$Output_k = \begin{cases} \sum_{i=1}^{c} U_{ik}\, Lssvm_i & \text{(weighted)} \\ Lssvm_{A_k} & \text{(switching)} \end{cases} \tag{5}$$

The structure of the online multiple model soft-sensor is shown in Figure 2. The process in the red box covers the selection of subsets and the establishment of sub-models described above. When new data arrive, their memberships are calculated with respect to the existing sub-models, and the result is the value blended from all the sub-models. Estimation results that contain bias are stored in the database to await a new clustering analysis. The model in the dashed box represents the portion that can be added or deleted online.
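Equations (3) to (5), together with the online addition and removal of sub-models, can be sketched in Python as below. The sketch rests on several assumptions that are not stated in this study: $K$ is taken to be a Gaussian kernel with width `sigma`, the fuzzy weighting exponent defaults to 2, and the sub-models are plain callables standing in for trained LSSVM regressors. The class name `MultiModelSoftSensor` and its methods are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    """Assumed kernel K(x, v); the study does not fix a specific kernel here."""
    return np.exp(-np.sum((x - v) ** 2) / (2.0 * sigma ** 2))

class MultiModelSoftSensor:
    """Holds sub-models that can be added or removed while predictions continue."""

    def __init__(self, fuzzifier=2.0, sigma=1.0):
        self.fuzzifier = fuzzifier   # m in Eq. (3), assumed value
        self.sigma = sigma
        self.centres = {}            # name -> clustering centre v_i
        self.models = {}             # name -> callable sub-model (LSSVM stand-in)

    def add(self, name, centre, model):
        self.centres[name] = np.asarray(centre, dtype=float)
        self.models[name] = model

    def remove(self, name):
        del self.centres[name]
        del self.models[name]

    def memberships(self, x):
        """Eq. (3): memberships of input x to every current sub-model."""
        x = np.asarray(x, dtype=float)
        names = list(self.models)
        d = np.array([2.0 - 2.0 * gaussian_kernel(x, self.centres[n], self.sigma)
                      for n in names])
        d = np.maximum(d, 1e-12)     # guard against division by zero at a centre
        w = d ** (-1.0 / (self.fuzzifier - 1.0))
        return dict(zip(names, w / w.sum()))

    def predict(self, x, mode="weighted"):
        """Eq. (5): weighted blend, or the single best sub-model when switching."""
        u = self.memberships(x)
        if mode == "switching":
            best = max(u, key=u.get)             # A_k in Eq. (4)
            return self.models[best](x)
        return sum(u[n] * self.models[n](x) for n in u)

# Two constant stand-in sub-models (assumed, for illustration only).
sensor = MultiModelSoftSensor()
sensor.add("A", [0.0, 0.0], lambda x: 1.0)
sensor.add("B", [4.0, 0.0], lambda x: 3.0)
print(sensor.predict([2.0, 0.0]))                      # blend of A and B
sensor.add("C", [2.0, 0.0], lambda x: 10.0)            # added without retraining A or B
print(sensor.predict([2.0, 0.0], mode="switching"))    # picks the nearest sub-model
```

Because the memberships are recomputed from the current set of centres on every call, adding or removing a sub-model changes the weights of the remaining ones automatically, which mirrors the behaviour described for the dashed box in Figure 2.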
When the number of training data in the database reaches the threshold, or a manual instruction is received, a new clustering subset is collected by DBSCAN and KDE. If the set of sub-models changes, the corresponding memberships change at the same time. The model structure can be implemented by the following steps:

Step 1: Decide whether the set of sub-models needs to change. To add models, go to Step 2. To delete models, go to Step 3. If no action is needed, repeat Step 1. Maintain a training data set that collects the data whose predictions contain bias.
Step 2: Export the data from the training data set, remove gross errors, calculate the KDE, carry out DBSCAN clustering and train the sub-models by LSSVM. If the accuracy of the sub-models is satisfactory, go to Step 3. Otherwise, repeat Step 2 or go back to Step 1.
Step 3: Add or delete the sub-models concerned and recalculate the memberships of the new set of sub-models. Subsequent predictions use the updated models and memberships. Go back to Step 1.

Figure 2: The structure of online multiple model soft-sensor

4. Case study

4.1 Description of depth estimation of ethylene cracking furnace

Ethylene cracking depth is an important indicator of the degree of cracking and is directly related to the yield of the product. Cracking is the first step in the ethylene production process; it splits the raw materials into ethylene, propylene and other mixed products. The cracking reaction takes place in the ethylene cracking furnace, as shown in Figure 3.

Figure 3: Structure diagram of tube cracking furnace

Cracking depth is usually expressed as the ratio of the yield of propylene to that of ethylene. Real-time estimation of the cracking depth helps operators to grasp the actual operating condition of the cracking furnace and also benefits real-time control and process optimization. However, the main problems faced by the soft-sensor are changes in inputs, process parameters and so on.
Unpredictable factors, such as furnace tube coking and noise interference, are also detrimental to the measurement. When faced with multiple operating conditions, the structure of a traditional soft-sensor becomes complex, and its accuracy and generalization ability are reduced. Problems such as the huge amount of historical data, complex calculation and the requirement that the model must not be shut down during updating make maintenance more difficult when models are updated. The method proposed in this study is used to solve these problems, and its results are compared with those of the single model LSSVM and KFCM-LSSVM.

4.2 The measurement of cracking depth of ethylene cracking furnace

Part of the process data of an ethylene cracking furnace was selected and analysed. The feed of the cracking furnace varies, for example naphtha (NAP), liquefied petroleum gas (LPG) and the mixed feed GAS+NAP, and the corresponding operating conditions change accordingly. In this case, the single model LSSVM, KFCM-LSSVM and KDE-DBSCAN-LSSVM were each used to estimate the cracking depth. The results are shown in Table 1 and Figure 4.

The first row of Table 1 gives the results of the single model LSSVM. When the operating condition changes, the original model estimates the current operating condition poorly, so the model has to be re-trained at each change of condition. Because each training run targets one particular working condition, the corresponding fit, measured by MSE and COR, is relatively good. KFCM-LSSVM retrains the sub-models each time a new condition occurs. Compared with the single model LSSVM, it can effectively predict data similar to the data in its subsets, that is, it records some of the historical data models. However, updating requires re-clustering all the data, and the training sets grow larger as time passes, which costs more time and more computing resources.
In addition, there are two different NAP feed conditions in the actual process, which KFCM-LSSVM cannot identify effectively, resulting in a decline in model accuracy. In this study, KDE-DBSCAN-LSSVM retains the original models and uses the current data to train new sub-models when the working condition changes. As demonstrated in Figure 4, every sub-model is stored by the system, and a new sub-model is added when inputs containing a new characteristic occur. When new NAP data arrive, the original NAP model can be deleted or invalidated (its membership is held at 0), and a new sub-model is then added. Although the total number of support vectors is large, each sub-model needs to contain only part of them. The method can be regarded as an accumulation of knowledge: with the passage of time, the model structure is improved and becomes more complete, approaching the full range of conditions of the real process.

Table 1: Results of soft-sensors

Feed       Method        MSE      COR     SVs    FL
NAP        Single        0.0010   97.03   392    Y
NAP        KFCM          0.0010   97.03   392    Y
NAP        KDE-DBSCAN    0.0010   97.03   392    Y
LPG        Single        0.0164   99.61   249    Y
LPG        KFCM          0.0202   99.40   280    Y
LPG        KDE-DBSCAN    0.0217   99.29   669    P
GAS+NAP    Single        0.0255   96.90   179    Y
GAS+NAP    KFCM          0.0101   99.48   425    Y
GAS+NAP    KDE-DBSCAN    0.0008   99.66   776    P
NAP        Single        0.0005   89.94   505    Y
NAP        KFCM          0.0035   50.92   425    Y
NAP        KDE-DBSCAN    0.0005   90.70   1292   P

Here MSE is the mean square error, COR is the correlation, SVs is the total number of support vectors contained in the model, and FL is a flag indicating how the model has to be updated: Y means that the model structure is completely updated, N means no update, and P means a partial update.

Figure 4: Result of KDE-DBSCAN-LSSVM

5. Conclusion

In this study, a multiple model soft-sensor whose structure supports the online addition and removal of sub-models is proposed. Compared with the single model soft-sensor, it simplifies the structure of each sub-model and, when updating, only some sub-models need to change rather than the whole original model structure.
This is conducive to online maintenance, and the impact on the current on-site operation is also smaller. Compared with the traditional multiple model soft-sensor, DBSCAN reduces noise and finds new characteristics in new process data. Traditional KFCM needs to re-cluster all the data, whereas this method needs to cluster only part of the data, which significantly reduces the computational complexity. Thus, this method can be applied to processes with multi-mode operating conditions. It gives operators the flexibility to make rapid and effective changes, which ensures the stability and accuracy of the soft-sensor. Although industrial data show that this method is feasible, a theoretical proof is still lacking and will be discussed in a future study.

Acknowledgments

This work was supported by the National Key Technology Support Program (2015BAF22B02), the National Natural Science Foundation of China (61403141, 21406061), the Shanghai Natural Science Foundation of China (14ZR1421800) and the State Key Laboratory of Synthetical Automation for Process Industries (PAL-N201404).

References

Jin H.P., Chen X.G., Yang K., Wang L., 2015, Multi-model adaptive soft sensor modeling method using local learning and online support vector regression for nonlinear time-variant batch processes. Chemical Engineering Science, 131: 282-303.
Li X.L., Su H.Y., Chu J., 2009, Multiple model soft sensor based on affinity propagation, Gaussian process and Bayesian committee machine. Chinese Journal of Chemical Engineering, 17(1): 95-99.
Tang K., Wang X., Wang Z.L., 2014, Multi-model soft sensor based on Dempster-Shafer rule. Control Theory & Applications, 31(5): 632-637.
Wang L., Chen X.G., Yang K., Jin H.P., 2017, Soft sensor modeling based on variable partition ensemble method for nonlinear batch processes. 7th International Conference on Electronics and Information Engineering. International Society for Optics and Photonics: 103222E-103222E-8.
Wang Z.L., Tang K., Wang X., 2014, A multi-model soft sensing method based on DS and ARIMA model. Control and Decision, 29(7): 1160-1166.
Xiong W.L., Zhang W., Xu B.G., Huang B., 2016, JITL based MWGPR soft sensor for multi-mode process with dual-updating strategy. Computers & Chemical Engineering, 90: 260-267.
Zhang W.Q., Fu Y.J., Yang H.Z., 2012, Multi-model soft-sensor modeling based on improved clustering and weighted bagging. CIESC Journal, 9: 005.
Zhang W., Xiong W.L., Xu B.G., 2015, Multi-model combination modeling based on just-in-time learning using Gaussian process regression. Information and Control, 44(4): 487-492.
Zhou H.F., Wang P., Li H.Y., 2012, Research on adaptive parameters determination in DBSCAN algorithm. Journal of Information & Computational Science, 9(7): 1967-1973.