CHEMICAL ENGINEERING TRANSACTIONS
VOL. 81, 2020
A publication of The Italian Association of Chemical Engineering
Online at www.cetjournal.it
Guest Editors: Petar S. Varbanov, Qiuwang Wang, Min Zeng, Panos Seferlis, Ting Ma, Jiří J. Klemeš
Copyright © 2020, AIDIC Servizi S.r.l.
ISBN 978-88-95608-79-2; ISSN 2283-9216

Fault Diagnosis Algorithm of Chemical Process Based on Information Entropy

Cheng Ji^a, Xuebing Zhu^a, Fangyuan Ma^a,b, Jingde Wang^a, Wei Sun^a,*

^a College of Chemical Engineering, Beijing University of Chemical Technology, North Third Ring Road 15, Chaoyang District, Beijing, 100029, China
^b Center of process monitoring and data analysis, Wuxi Research Institute of Applied Technologies, Tsinghua University, Wuxi, 214072, China
sunwei@mail.buct.edu.cn

In modern chemical industries, process failures constantly cause significant economic losses and unnecessary energy consumption. To eliminate them in time, early identification of the root cause of an abnormal process deviation is crucial. Causal analysis based on process knowledge generally plays an important role in process fault isolation, but as process complexity increases, the root cause becomes difficult to obtain with knowledge-based methods alone. The wide application of distributed control systems has made large amounts of process data available, which has made data-driven fault diagnosis an active field in recent years. In previous research, contribution plots have been widely used in practice to find the variables that contribute most to a fault. However, the propagation of contributions among variables makes the results fluctuate from one sample point to the next. In this study, a novel fault diagnosis method based on information entropy and the signed directed graph (SDG) is proposed. Information entropy is first applied to select the variables that are significant to a specific fault according to the distribution features of the variables.
Then, the propagation path among the selected nodes is identified by the SDG model to diagnose the root cause. Because less correlated variables are excluded, the results at each sample are relatively consistent compared with contribution plots. To verify its effectiveness, the proposed method is applied to the benchmark Tennessee Eastman process. The propagation path of most process faults is identified and matches the fault description well, indicating that the proposed method performs well in diagnosing process faults.

1. Introduction

Process monitoring has been widely used in chemical processes to ensure safety and to avoid unnecessary operating costs. Although abnormal process deviations can be effectively detected by numerous multivariate statistical methods that extract cross-correlations (Qin, 2012) and spatial correlations (Ma et al., 2019) among data, diagnosing the root cause of these deviations still poses many challenges.

Fault diagnosis methods can generally be classified into model-based, knowledge-based and data-based methods (Frank, 1990). With the rapid development of the chemical industry, the dynamic model and prior knowledge of a high-dimensional nonlinear process are hardly available, which makes model-based and knowledge-based methods unsuitable for real-time application. Benefiting from the wide application of distributed control systems (DCS), research on data-based methods has received increasing attention. For fault diagnosis, contribution plots are a well-known method in practical application (Miller et al., 1998). Contribution plots generally indicate how much each variable affects the T^2 or squared prediction error (SPE) statistic from principal component analysis (PCA). The variables with the largest contributions are considered the root cause of the fault.
However, because process variables interact rapidly when a fault occurs, the contribution of the root cause variable propagates to other variables, so the variables with the largest contributions may not be the actual root cause of the fault. Vedam and Venkatasubramanian (1999) proposed a PCA-SDG method, in which the SPE statistic from PCA is used to determine the node thresholds and a consistent path is found in the SDG model. But the contribution can still propagate among variables as the fault spreads, making it difficult to identify the root cause of the fault. Information entropy (IE) is a data-based statistical tool that is calculated from the distribution of variables. Considering that the distribution of some variables under normal conditions differs from that in fault states, the IE value can be used to select the fault-correlated variables.

On the other hand, since historical fault data with known fault types are available for the Tennessee Eastman process (TEP), most fault diagnosis methods in the current literature rely on supervised classification of clearly labelled data sets, which are hard to obtain in a real industrial process. Instead, most faults occurring in industrial operation differ in one way or another from those in the historical record and will be misclassified if only known fault features are considered.

The aim of this work is to combine IE and the PCA-SDG method to identify the propagation path of a particular fault solely from data features. The benchmark TEP is analysed. With fault-correlated variables selected by IE, the root cause of the fault is diagnosed based on the SDG model.

DOI: 10.3303/CET2081091
Paper Received: 26/03/2020; Revised: 01/05/2020; Accepted: 08/05/2020
Please cite this article as: Ji C., Zhu X., Ma F., Wang J., Sun W., 2020, Fault Diagnosis Algorithm of Chemical Process Based on Information Entropy, Chemical Engineering Transactions, 81, 541-546 DOI:10.3303/CET2081091

2. Methodology

In this section, the fault detection and diagnosis methods used in this study and their implementation are introduced in preparation for subsequent modelling.

2.1 Principal component analysis (PCA) and contribution plots

Because PCA performs well in feature extraction and dimension reduction, it is used in this study for fault detection. For a standardised data matrix X of size n×m, which contains n samples and m variables, the covariance matrix of X is calculated as

$$\mathrm{Cov}(X) = \frac{X^{T}X}{n-1} \tag{1}$$

The matrix X is decomposed into a score matrix T and a loading matrix P by singular value decomposition,

$$X = TP^{T} + E = t_{1}p_{1}^{T} + t_{2}p_{2}^{T} + \cdots + t_{k}p_{k}^{T} + E \tag{2}$$

where E is the residual matrix and k is the number of principal components. In PCA-based methods, the T^2 statistic, the SPE statistic and their control limits are then used to detect process faults. When a fault is detected, contribution plots are usually used for fault diagnosis by calculating the effect of each process variable on the T^2 or SPE statistic. Contribution plots are easy to calculate, but they may lead to misdiagnosis because the contribution of the root cause variable propagates quickly to other variables.

2.2 Signed directed graph (SDG) method

SDG is a commonly used knowledge-based fault diagnosis method in which the causal relationships between variables are represented as a directed graph. The diagnosis is obtained by identifying a single consistent path from the root node to all abnormal nodes. Conventionally, the node thresholds are determined by a single-variable statistical method, which means that 2m thresholds need to be determined when m variables are given. Considering the performance of PCA in multivariate statistics, when a fault is detected the threshold can instead be calculated from the variable contributions to SPE, which is a single statistic in the residual space (Vedam and Venkatasubramanian, 1999).
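As an illustration of the variable contributions to SPE mentioned above, the following Python sketch fits a PCA model as in Eq(2) and splits the SPE of one sample into per-variable terms. It is a minimal sketch on synthetic data, not the implementation used in this study; the function names fit_pca and spe_contributions, the number of retained components and the artificial offset are assumptions made for illustration.

```python
import numpy as np

def fit_pca(X_train, k):
    """Return (mean, std, P): scaling parameters and the m-by-k loading matrix."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0, ddof=1)        # n - 1 in the denominator, as in Eq(1)
    Z = (X_train - mu) / sd                 # standardised data matrix
    # SVD of Z: the rows of Vt are the loading vectors p_1, p_2, ...
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mu, sd, Vt[:k].T

def spe_contributions(x, mu, sd, P):
    """Squared residual of each variable for one sample; the sum is the SPE."""
    z = (x - mu) / sd
    residual = z - P @ (P.T @ z)            # part of z outside the PCA subspace
    return residual ** 2

# Hypothetical example: 200 samples of 5 correlated variables, 2 components kept.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))
mu, sd, P = fit_pca(X, k=2)

x_fault = X[0].copy()
x_fault[3] += 5.0 * sd[3]                   # artificial offset on the fourth variable
contrib = spe_contributions(x_fault, mu, sd, P)
print("SPE =", contrib.sum())
print("contributions =", np.round(contrib, 3))
```

Because the residual of each variable is computed through all loading vectors, an offset on a single variable generally produces non-zero contributions on other variables as well, which is the propagation of contribution discussed in this work.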
With this SPE-based approach, only one threshold needs to be determined, which greatly reduces the computation.

2.3 Fault-correlated variables selection

To avoid the wrong propagation path being obtained from the contribution plot because of control response, a variable selection method based on IE is introduced in this section.

2.3.1 Information entropy

Information entropy is a measure of the uncertainty of a random variable proposed by Shannon (1948). Given a random variable X, its IE value is calculated as

$$H(X) = -\sum_{i=1}^{n} p(x_{i}) \log\big(p(x_{i})\big) \tag{3}$$

where n is the number of samples and p(x_i) is the probability of the sample x_i. The base of the logarithm determines the unit of the IE value. Generally, base 2 is used and the unit is the bit.

2.3.2 Control limit determination

According to Eq(3), the IE value depends only on the probability distribution of the variable. It can be shown with the Lagrange multiplier method that the IE value reaches its maximum when the variable is uniformly distributed. At steady state, the distribution of a variable is nearly uniform. When a fault occurs, the distribution of fault-correlated variables is broken, which reduces their IE values. Given a variable X (x_1, x_2, …, x_n) at steady state, the control limit can be determined through repeated calculation of IE over a moving window X_t (x_t, x_{t+1}, …, x_{t+h}), where t denotes the t-th window and h is the window length. The significance level is defined as

$$\left| H(X) - \lambda \right| > 6\sigma \tag{4}$$

where λ and σ are the mean and standard deviation of the IE values calculated at steady state. A six sigma threshold is chosen here for the significance level. Once a fault occurs, the variables whose IE values H(X) exceed the threshold in Eq(4) are selected as fault-correlated variables.

3. Fault diagnosis model and its application on Tennessee Eastman process

3.1 Fault diagnosis model based on IE and SDG

In the proposed methodology, the IE algorithm and the SDG model are employed to select the fault-correlated variables and to identify the fault propagation paths, respectively. Following the procedure in Figure 1, historical data are used to build a PCA model and to determine the threshold of the IE value. Once a fault is detected, the IE value of each variable is calculated and compared with its threshold. A variable whose IE value exceeds the threshold is regarded as a fault-correlated variable. The propagation path among these variables is then identified by an SDG model and the root cause is located. Since variables that are less correlated with the specific fault are excluded first, the final result obtained from the SDG model can be correctly located at the variable that is the root cause of the fault.

Figure 1: Fault detection and diagnosis procedure

3.2 Tennessee Eastman process

TEP is a widely used chemical process benchmark simulated from a real industrial plant by the Eastman Chemical Company. It contains five process units: the reactor, condenser, compressor, vapor/liquid separator and stripper. The reactor is the most critical unit, where gaseous reactants A, C, D, E and inert B are mixed to produce liquid products G and H. F is also produced as a byproduct. The flow diagram is shown in Figure 2.

Figure 2: Revised Tennessee Eastman process model (Bathelt et al., 2015)

3.3 Data description

The simulation data sets of TEP are downloaded from the website (Braatz, 2002) and contain 41 measured variables and 11 manipulated variables. The sampling period is 3 minutes, except for the 19 measured composition variables, whose sampling period ranges from 6 to 15 minutes. In this study, the 19 composition variables are excluded. All 33 variables used in this work and their symbols are listed in Table 1 (Downs and Vogel, 1993).
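The IE-based selection described in Sections 2.3 and 3.1 can be sketched in Python as follows. This is an illustrative sketch on synthetic data rather than the implementation used in this study: the histogram estimate of p(x_i), the number of bins, the window length and all function names are assumptions, and a deviation of more than six standard deviations from the steady-state mean IE is used as the selection rule.

```python
import numpy as np

def shannon_entropy(window, edges):
    """IE in bits, Eq(3), with p(x_i) estimated from a histogram (an assumption)."""
    clipped = np.clip(window, edges[0], edges[-1])   # out-of-range values fall in the end bins
    counts, _ = np.histogram(clipped, bins=edges)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def fit_ie_limit(x_steady, h=50, n_bins=10):
    """Bin edges plus mean and standard deviation of IE over moving windows."""
    edges = np.linspace(x_steady.min(), x_steady.max(), n_bins + 1)
    ie = np.array([shannon_entropy(x_steady[t:t + h], edges)
                   for t in range(len(x_steady) - h + 1)])
    return edges, ie.mean(), ie.std()

def select_variables(X_steady, X_window, names, h=50, n_bins=10):
    """Names of the variables whose IE deviates from the mean by more than 6 sigma."""
    selected = []
    for j, name in enumerate(names):
        edges, lam, sigma = fit_ie_limit(X_steady[:, j], h, n_bins)
        if abs(shannon_entropy(X_window[:, j], edges) - lam) > 6 * sigma:
            selected.append(name)
    return selected

# Hypothetical data: 500 steady-state samples of three variables.
rng = np.random.default_rng(1)
X_steady = rng.uniform(0.0, 1.0, size=(500, 3))
X_window = X_steady[-50:].copy()    # most recent window, copied before modification
X_window[:, 1] = 0.95               # second variable stuck at one value
print(select_variables(X_steady, X_window, ["F4", "P7", "T9"]))
```

The labels are borrowed from the variable symbols of this work only for readability. With real TEP data, X_steady would hold the training set and X_window the most recent h samples after a fault is detected.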
The training data set contains 500 samples and each faulty testing data set contains 960 samples. There are 20 preset faults, each of which occurs at the 160th sample point. The description of these fault types can also be found in the literature (Downs and Vogel, 1993).

Table 1: The variables in Tennessee Eastman process

Variable   Description
F1         A feed (stream 1)
F2         D feed (stream 2)
F3         E feed (stream 3)
F4         A and C feed (stream 4)
F5         Recycle flow (stream 8)
F6         Reactor feed rate (stream 6)
P7         Reactor pressure
L8         Reactor level
T9         Reactor temperature
F10        Purge rate (stream 9)
T11        Product separator temperature
L12        Product separator level
P13        Product separator pressure
F14        Product separator underflow (stream 10)
L15        Stripper level
P16        Stripper pressure
F17        Stripper underflow (stream 11)
T18        Stripper temperature
F19        Stripper steam flow
C20        Compressor work
T21        Reactor cooling water outlet temperature
T22        Separator cooling water outlet temperature
V23        D feed flow (stream 2)
V24        E feed flow (stream 3)
V25        A feed flow (stream 1)
V26        A and C feed flow (stream 4)
V27        Compressor recycle valve
V28        Purge valve (stream 9)
V29        Separator pot liquid flow (stream 10)
V30        Stripper liquid prod flow (stream 11)
V31        Stripper steam valve
V32        Reactor cooling water flow
V33        Condenser cooling water flow

4. Results and discussion

In this section, the diagnosis results for fault 7 obtained with the proposed method are presented and compared with the conventional contribution plots method. As shown in Figure 3, when the fault occurs at the 160th sample point, it is immediately detected by the PCA model. First, the contribution plots method is applied to diagnose the variables correlated with this fault. The contribution plots at several sampling points within 10 minutes after the fault occurs are shown in Figure 4.
It can be seen that the variables with high contribution rates vary with time, making it difficult to determine a threshold for the variable nodes in the SDG model to find the propagation path. Although it is not shown in this paper, the differences between the contribution plots at successive samples become much larger as the fault spreads, because the contribution of one variable propagates to all variables (Qin, 2012). The method mentioned before, which determines the threshold from the PCA-based SPE statistic, will therefore still give inconsistent results at different samples and may fail to exclude the nodes that are not the root cause of the fault.

Figure 3: Monitoring result by PCA model

Figure 4: (a) Contribution plot in the 160th sample (b) Contribution plot in the 163rd sample

To select the fault-correlated variables effectively, the IE value of each variable is calculated and compared with the threshold obtained from the training data set. The IE value of each variable and its control limit at the third sample point after the fault occurs are shown in Figure 5. The threshold of each variable is represented by a dashed line and, as mentioned before, a variable is selected as a fault-correlated variable if its IE value exceeds the threshold. Figure 5 clearly shows that the IE values of variables F4, P7, T9, P13, P16, T21 and V26 exceed their thresholds. These variables are selected and brought into the SDG model to further identify the fault propagation path. To compare its performance with the contribution plots method, the IE values at different samples are compared with the thresholds. The results are shown in Table 2. Although the results are not completely consistent, the number of fault-correlated variables increases with time, which to some extent reflects the spread of the fault through the process. It is worth mentioning that the results are exactly the same from sample 162 to 165.
This indicates that the results at any one of these four sample points can reliably be used to identify the fault propagation.

Table 2: Variables selection results by IE

Sampling point   Fault-correlated variables
160th            F4, T9, P16
161st            F4, P7, T9, P13, P16, T21
162nd            F4, P7, T9, P13, P16, T21, V26
163rd            F4, P7, T9, P13, P16, T21, V26
164th            F4, P7, T9, P13, P16, T21, V26
165th            F4, P7, T9, P13, P16, T21, V26
166th            F4, P7, T9, T11, P13, P16, T21, V26
167th            F4, P7, T9, T11, P13, P16, T21, V26, V27

Figure 5: IE value of each variable and threshold

Since the variables correlated with the fault have been selected, they are next brought into the SDG model of TEP, which is established from process knowledge (Wan et al., 2013), to identify the root cause. The final fault diagnosis result is shown in Figure 6. The manipulated variable A and C feed flow, represented by a red node, is diagnosed as the root cause of this fault because no fault path leads into it. The blue nodes represent the variables that fluctuate as the fault spreads, and the variable represented by a green node is the final result caused by the fault because no fault path leads out of it. According to the description of fault 7, the fault occurs because of the loss of pressure in reactant C, which supports the diagnosis result. This indicates that the proposed fault diagnosis method can be applied to isolate the root cause and identify the propagation path of faults in TEP.

Figure 6: Propagation path result of fault 7

5. Conclusions

In this work, fault diagnosis based on IE and SDG is implemented and tested on TEP. The results of correlated variable selection by IE at different sample points are relatively consistent compared with the contribution plots method. The propagation paths among the selected variables are identified by the SDG model.
By combining process data and expert knowledge, the root cause of process faults in TEP is successfully isolated and matches the fault description in the literature well. Since the proposed method is unsupervised and process faults do not need to be defined in advance, it has the potential to be applied to actual industrial processes. However, there are certain limitations and challenges for industrial applications, as the SDG model obtained from expert knowledge may no longer be valid once the control strategy changes. It can be expected that the fault propagation path can be identified from real-time operation data with more efficient feature extraction algorithms.

Acknowledgements

The authors gratefully acknowledge the following institutions for support: The National Natural Science Foundation of China (Grant No. 21878012).

References

Bathelt A., Ricker N.L., Jelali M., 2015, Revision of the Tennessee Eastman process model, IFAC-PapersOnLine, 48(8), 309-314.
Braatz R., the Braatz Research Group, 2002, TE process simulator software, accessed 03.12.2019.
Downs J., Vogel E., 1993, A plant-wide industrial process control problem, Computers & Chemical Engineering, 17(3), 245-255.
Frank P.M., 1990, Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy: A survey and some new results, Automatica, 26(3), 459-474.
Ma F., Han X., Lin D., Zhong J., Wang J., Sun W., 2019, Early identification of process deviation based on the spatial correlation of measurements, Chemical Engineering Transactions, 74, 607-612.
Qin S.J., 2012, Survey on data-driven industrial process monitoring and diagnosis, Annual Reviews in Control, 36(2), 220-234.
Shannon C.E., 1948, A mathematical theory of communication, Bell System Technical Journal, 27(3), 379-423.
Vedam H., Venkatasubramanian V., 1999, PCA-SDG based process monitoring and fault diagnosis, Control Engineering Practice, 7(7), 903-917.
Wan Y., Yang F., Lv N., Xu H., Ye H., Li W., Xu P., Song L., Adam K.U., 2013, Statistical root cause analysis of novel faults based on digraph models, Chemical Engineering Research and Design, 91(1), 87-99.