International Journal of Computers, Communications & Control, Vol. II (2007), No. 2, pp. 185-194

Fault Detection for Large Scale Systems Using Dynamic Principal Components Analysis with Adaptation

Jesús Mina, Cristina Verde

Abstract: Dynamic Principal Component Analysis (DPCA) is an adequate tool for monitoring large scale systems from a model of multivariate historical data under the assumption of stationarity; however, false alarms occur when the new observations are non-stationary during the monitoring phase. In order to reduce the false alarm rate, this paper extends DPCA based monitoring to non-stationary data of linear dynamic systems by including an on-line means estimator, so that new observations are standardized according to the estimated means. The effectiveness of the proposed methodology is evaluated for fault detection in an interconnected tanks system.

Keywords: Fault Detection, Statistical Analysis, Dynamic Principal Component Analysis, Time Series Analysis, Non-Stationary Signals.

1 Introduction

On-line process monitoring for fault detection and isolation (FDI) is an important task to ensure plant safety and product quality. One of the most consolidated FDI techniques of the last twenty years is the analytical approach, which is based on explicit modeling, that is, on models obtained from first physical principles; several analytical approaches are reviewed in, e.g., [1]. For large scale processes whose analytical models are unavailable or difficult to obtain, data-driven FDI techniques help to overcome the modeling problem. These techniques are based on implicit modeling through multivariate statistical methods, some of which are summarized in [2].

Principal Component Analysis (PCA) is a multivariate statistical method which models the linear correlation structure of a multivariate process from nominal historical data.
PCA transforms a set of multivariate observations to a lower dimensional orthogonal space, retaining most of the variability of the original data [3]. Because of the simplification and the orthogonality obtained with PCA, it has been used successfully for fault diagnosis, as in [4] and [5].

It is important to note that, like other multivariate statistical methods, PCA works under three assumptions: the data follow a multivariate normal distribution; there is no auto-correlation among observations; and the variables are stationary, that is, they keep a constant mean and standard deviation over time [6], [7]. For data with a non-normal distribution it is possible to apply an appropriate transformation, such as a square root or a logarithm [8], in order to improve the distribution of the data. For dynamic systems, the auto-correlation of the variables is taken into account by incorporating time lags of the time series during the modeling stage; this extension is called dynamic principal component analysis, DPCA [9].

Since PCA and DPCA assume stationarity during the modeling process, a high rate of false alarms is generated in the diagnosis stage if the test data are non-stationary. The non-stationarity problem has been tackled with adaptive versions of PCA, as in [10] and [11]. Although these algorithms adapt the means, the covariance, and the PCA model, they cannot be used for FDI tasks, since the adaptation is based on the variations of the actual multivariate observations without distinguishing the real causes of the changes in the variables.

Copyright © 2006-2007 by CCC Publications. Supported by DGAPA-UNAM-IN11403-2 and the EOLI Project of the European Community INCO program, contract number ICA4-CT-2002-10012.

A non-stationary condition has many possible causes, e.g. component aging, faults, or even normal changes in the operating point of the plant. This problem motivated the development of a fault detection algorithm which is robust to changes in the operating point but sensitive to faults. The proposal is based on the fact that the correlation structure between system signals is invariant under nominal conditions, i.e. the relations between variables remain the same despite nominal changes in the mean of the signals; this is a result of the affinity property of nominal signals in linear systems [12]. Therefore this work proposes: in the modeling stage, to obtain a nominal DPCA model from nominal historical data and to identify a set of nominal inputs-output relations; in the diagnosis stage, to keep the nominal DPCA model but estimate the actual means of the input variables through an exponentially weighted moving average (EWMA), and to estimate the means of the output variables from the input means through the identified nominal inputs-output relations, in order to carry out an appropriate standardization with respect to the estimated means.

In the following sections the recursive means estimation process is summarized; next, the proposed extension of DPCA based fault detection for changes in the operating point is described. Finally, the methodology is evaluated for fault detection in a system of three interconnected tanks.

2 Background

2.1 Identification of the Inputs-Output Relations

The means of the input signals are estimated recursively using EWMA, and the means of the outputs are estimated from the estimated input means through nominal inputs-output relations. It is proposed here to identify the inputs-output relations with moving average models, MA(), for each one of the output variables.
Let us consider the case of a MIMO linear system with $r$ inputs and $s$ outputs

$$ y = A u \tag{1} $$

$$ u = u_n + \eta, \qquad y = y_n + \nu \tag{2} $$

where $\eta$ and $\nu$ are stationary white noise vectors added to the inputs and to the outputs, with zero mean and variances $\sigma_\eta^2$ and $\sigma_\nu^2$, respectively.

Each one of the output variables can be expressed as a linear combination of the inputs with corresponding time lag orders $q_{1i}, q_{2i}, \dots, q_{ri}$

$$ y_{ni}(t) = \sum_{k=0}^{q_{1i}} a_{k1i}\, u_1(t-k) + \cdots + \sum_{k=0}^{q_{ri}} a_{kri}\, u_r(t-k) \tag{3} $$

for $i = 1, \dots, s$. The parameters $a_{k1i}, \dots, a_{kri}$ are obtained through correlation analysis [13]. The identified input-output relations can be expressed in compact form as follows

$$ \hat{y}_{ni}(t) = \hat{a}_i\, \vec{u}_i(t) \tag{4} $$

where

$$ \hat{a}_i = \left[\, \hat{a}_{01i} \;\cdots\; \hat{a}_{q_{1i}1i} \;\cdots\; \hat{a}_{0ri} \;\cdots\; \hat{a}_{q_{ri}ri} \,\right] $$

and

$$ \vec{u}_i(t) = \left[\, u_1(t) \;\cdots\; u_1(t-q_{1i}) \;\cdots\; u_r(t) \;\cdots\; u_r(t-q_{ri}) \,\right]^T $$

The orders $q_{ji}$ of the model (4) are selected through a validation procedure, taking the value that minimizes the sum of squared errors as a function of $q_{ji}$. Finally, from (4) for all of the outputs, the relations in matrix notation are given by

$$ \hat{y}_n(t) = \hat{A}\, \vec{u}(t) \tag{5} $$

where

$$ \hat{y}_n(t) = \left[\, \hat{y}_{n1}(t) \;\cdots\; \hat{y}_{ns}(t) \,\right]^T_{(s \times 1)} $$

$$ \vec{u}(t) = \left[\, u_1(t) \;\cdots\; u_1(t-L) \;\cdots\; u_r(t) \;\cdots\; u_r(t-L) \,\right]^T_{(o \times 1)} $$

and $\hat{A} \in \Re^{s \times o}$ is made up of the coefficients

$$ \hat{a}_{kji} = \begin{cases} \hat{a}_{kji}, & k \le q_{ji} \\ 0, & q_{ji} < k \le L \end{cases} \qquad \text{for } j = 1, \dots, r, \quad i = 1, \dots, s $$

with $L = \max\{q_{11}, \dots, q_{r1}, \dots, q_{1s}, \dots, q_{rs}\}$ and $o = r(L+1)$.

2.2 Recursive Means Estimation

Once the MA models (5) for the system (1) have been identified, the recursive means estimation is carried out in the following way. The recursive estimate of the input means is computed as

$$ \hat{\mu}_{\vec{u}}(t) = \beta\, \hat{\mu}_{\vec{u}}(t-1) + (1-\beta)\, \vec{u}(t) \tag{6} $$

with $0 < \beta \le 1$ the forgetting factor of the EWMA.
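As an illustration of the recursion (6), the following minimal Python sketch updates the mean estimate of an input vector one sample at a time. The function names `ewma_update` and `run_ewma`, the zero initial estimate and the constant test input are assumptions made for this example only; they are not taken from the paper.

```python
def ewma_update(mu_prev, u, beta):
    """One step of (6): mu(t) = beta * mu(t-1) + (1 - beta) * u(t)."""
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must satisfy 0 < beta <= 1")
    return [beta * m + (1.0 - beta) * x for m, x in zip(mu_prev, u)]


def run_ewma(samples, beta, mu0):
    """Apply ewma_update over a sequence of input vectors."""
    mu = list(mu0)
    for u in samples:
        mu = ewma_update(mu, u, beta)
    return mu


# Hypothetical data: a two-element input vector held constant for 200 samples.
samples = [[2.0, 4.0]] * 200
mu_u = run_ewma(samples, beta=0.95, mu0=[0.0, 0.0])
print(mu_u)  # close to [2.0, 4.0]; the initial error decays as beta**k
```

A value of β close to 1 gives a smooth but slowly adapting estimate, while smaller values follow changes in the mean faster at the cost of passing more noise.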
On the other hand, the output means at time instant $t$ are estimated from the input means given in (6) through (5), according to

$$ \hat{\mu}_{y}(t) = \hat{\mu}_{y_n}(t) = \hat{A}\, \hat{\mu}_{\vec{u}}(t) \tag{7} $$

3 DPCA Based Fault Detection with Mean Parameter Estimation

The DPCA statistical tool is used to obtain an implicit model of a dynamic system from nominal historical data, and this implicit model is then used to carry out fault detection tasks. The proposed algorithm is illustrated in Fig. 1. The idea is not only to obtain a DPCA based statistical model but also to identify nominal inputs-output relations. Then, by estimating the actual means of the input variables through EWMA and estimating the means of the output variables from the input means using the nominal inputs-output relations, an appropriate standardization can be carried out.

Figure 1: Proposed fault detection algorithm. (a) Modeling stage carried out off-line, (b) Detection stage carried out on-line

3.1 DPCA Based Statistical Modeling

Let the matrix $X$ be a set of historical data made up of $n_t$ observations of $r$ input variables and $s$ output variables, taken from a dynamic system working under nominal conditions around an operating point

$$ X = \left[\, u_1 \;\cdots\; u_r \;\; y_1 \;\cdots\; y_s \,\right]_{(n_t \times p)} \tag{8} $$

with $p = r + s$. Each column of $X$ represents an auto-correlated time series. In DPCA the serial correlation is included by constructing the so-called trajectory matrix, applying $w$ time lags to each time series, that is

$$ \vec{X} = \left[\, \vec{U}_1 \;\cdots\; \vec{U}_r \;\; \vec{Y}_1 \;\cdots\; \vec{Y}_s \,\right]_{(n \times m)} \tag{9} $$

where e.g. $\vec{U}_1 = [\, u_1(t) \;\; u_1(t-1) \;\dots\; u_1(t-w) \,]$; $n = n_t - w$ and $m = p\,(w+1)$.

To prevent particular variables from dominating the modeling process, it is convenient to standardize the data in matrix $\vec{X}$ with respect to its means and standard deviations. Thus, the means of $\vec{X}$ are given by

$$ \hat{\mu}_{\vec{X}} = \left[ \frac{1}{n}\, \vec{X}^T \mathbf{1} \right]^T = \left[\, \hat{\mu}_{\vec{U}} \;\; \hat{\mu}_{\vec{Y}} \,\right]_{(1 \times m)} \tag{10} $$

where $\mathbf{1} = [1, 1, \dots, 1]^T \in \Re^{n}$.
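The construction of the trajectory matrix (9) and of its column means (10) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the data are stored as a list of rows, the function names `trajectory_matrix` and `column_means` are hypothetical, and the toy numbers are invented for the example.

```python
def trajectory_matrix(X, w):
    """Build the lagged matrix (9) from X given as a list of rows.

    X has nt rows and p columns; the result has n = nt - w rows and
    m = p * (w + 1) columns, ordered variable by variable:
    [x1(t), x1(t-1), ..., x1(t-w), x2(t), ...].
    """
    nt, p = len(X), len(X[0])
    rows = []
    for t in range(w, nt):
        row = []
        for j in range(p):
            row.extend(X[t - k][j] for k in range(w + 1))
        rows.append(row)
    return rows


def column_means(M):
    """Column means of a list-of-rows matrix, as in (10)."""
    n = len(M)
    return [sum(col) / n for col in zip(*M)]


# Hypothetical toy data: nt = 5 observations of p = 2 variables, w = 2 lags.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
Xt = trajectory_matrix(X, w=2)
mu_X = column_means(Xt)
print(len(Xt), len(Xt[0]))  # 3 6
print(Xt[0])                # [3.0, 2.0, 1.0, 30.0, 20.0, 10.0]
print(mu_X)                 # [4.0, 3.0, 2.0, 40.0, 30.0, 20.0]
```

The printed shape agrees with $n = n_t - w = 3$ and $m = p\,(w+1) = 6$.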
On the other hand, the covariance matrix of $\vec{X}$ is given by

$$ S = \frac{1}{n-1} \left( \vec{X} - \mathbf{1}\hat{\mu}_{\vec{X}} \right)^T \left( \vec{X} - \mathbf{1}\hat{\mu}_{\vec{X}} \right) $$

from which the standard deviations can be obtained

$$ \hat{\sigma}_{\vec{X}} = \sqrt{\operatorname{diag}(S)} = \left[\, \hat{\sigma}_{\vec{U}} \;\; \hat{\sigma}_{\vec{Y}} \,\right]_{(1 \times m)} \tag{11} $$

Thus, the data standardization is computed in the following way

$$ \tilde{X}(i,j) = \frac{\vec{X}(i,j) - \hat{\mu}_{\vec{X}}(j)}{\hat{\sigma}_{\vec{X}}(j)} $$

for $i = 1, \dots, n$ and $j = 1, \dots, m$.

The uncorrelated principal components $Z$ of dimension $n \times l$ are obtained through the following transformation

$$ Z = \tilde{X} V_t \tag{12} $$

where the orthonormal transformation matrix $V_t \in \Re^{m \times l}$ is composed of an appropriate selection of $l$ eigenvectors, called loading vectors, of the correlation matrix $R = \frac{1}{n-1} \tilde{X}^T \tilde{X}$. The data matrix $\tilde{X}$ can be expressed as

$$ \tilde{X} = \hat{\tilde{X}} + E = Z V_t^T + E $$

where $\hat{\tilde{X}}$ is the information captured by the $l$ principal components and $E$ is the information in the neglected $m - l$ components. So, for detection purposes it is possible to use the Hotelling statistic $T^2_{Z_i}$ from $Z$ and/or the squared prediction error (SPE) from $E$. In this paper only the Hotelling statistic is used for fault detection, because our interest is not the evaluation of these two statistics but the illustration of the reduction of false alarms.

For each $l$-variate observation $Z_i$ in $Z$, the univariate Hotelling statistic $T^2_{Z_i}$ is given by

$$ T^2_{Z_i} = Z_i\, S_Z^{-1} Z_i^T \tag{13} $$

where $S_Z$ is the covariance matrix of $Z$. Finally, a threshold of normal condition is calculated from the probability density function of the set of values $T^2_{Z_i}$. For a beta distribution of the data set $T^2_{Z_i}$, [6] propose, among others, the threshold UCL as

$$ UCL = \frac{ (n-1)^2 \left( \frac{l}{n-l-1} \right) F\!\left( \frac{\alpha}{2};\, l,\, n-l-1 \right) }{ n \left( 1 + \left( \frac{l}{n-l-1} \right) F\!\left( \frac{\alpha}{2};\, l,\, n-l-1 \right) \right) } \tag{14} $$

where $n$ and $l$ are the dimensions of $Z$ and $\alpha$ is the level of significance.
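A minimal sketch of (13) and (14) in plain Python is given below. It relies on two assumptions that go beyond the text: since the principal component scores are uncorrelated, $S_Z$ is treated as diagonal, so its inverse reduces to a division by the score variances; and the critical value $F(\alpha/2;\, l,\, n-l-1)$ is supplied by the caller (from a table or a statistics library) rather than computed here. The function names `hotelling_t2` and `ucl_threshold` and all numbers in the example are hypothetical.

```python
def hotelling_t2(z, variances):
    """T^2 of one score vector z, as in (13), for a diagonal S_Z."""
    return sum(zk * zk / vk for zk, vk in zip(z, variances))


def ucl_threshold(n, l, f_crit):
    """Threshold (14); f_crit stands for F(alpha/2; l, n-l-1)."""
    c = l / (n - l - 1)
    return (n - 1) ** 2 * c * f_crit / (n * (1.0 + c * f_crit))


# Hypothetical example: l = 2 retained components with score variances 4 and 1.
t2 = hotelling_t2([2.0, 1.0], [4.0, 1.0])
print(t2)  # 2.0

# Dimensions n = 301, l = 68; f_crit = 1.0 is a placeholder, not a table value.
limit = ucl_threshold(301, 68, 1.0)
print(limit)
print(t2 > limit)  # False: this observation would be classified as normal
```

Passing the critical value as an argument keeps the sketch free of external dependencies; in practice it would come from the F distribution with $l$ and $n-l-1$ degrees of freedom.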
In a conventional DPCA based modeling approach the implicit model consists of the mean and standard deviation vectors (10), (11); the loading vectors in $V_t$; the variance of the principal components given by $S_Z$; and the nominal threshold UCL. However, according to the proposal only $\hat{\sigma}_{\vec{X}}$, $V_t$, $S_Z$ and UCL are kept, since $\hat{\mu}_{\vec{X}}$ will be estimated recursively. In addition to the statistical modeling, the identification of the inputs-output relations (5) is carried out.

3.2 Fault Detection

DPCA detects a deviation of the vector of actual observations $\vec{x}_a$ from the nominal reference in terms of its mean and its standard deviation. However, it is important to note that the modeling process is based on the data set $\vec{X}$, which was obtained at a particular operating point of the system, so any change in the nominal values of the signals is interpreted by DPCA as a fault even when the process is healthy. This misinterpretation is due to the time-varying behavior of the components in (10).

For linear systems, a change in the operating point means a new assignment of the input variables with consequent variations in the output variables, that is, changes in the mean values of the input and output variables but no changes in their correlation structure. Faults in the system, however, produce changes both in the mean values and in the correlation structure between variables. Thus, it is proposed here to estimate the statistical set (10) on line during the detection stage, using the nominal linear inputs-output relations, in order to adapt the standardization procedure.

So, according to the proposed extension of the DPCA based fault detection algorithm, the procedure to evaluate and classify an actual observation $\vec{x}_a \in \Re^{1 \times m}$ is summarized as follows. Let the actual observation vector, with input and output variables expressed in $w$ time lags, be

$$ \vec{x}_a = \left[\, \vec{u}_{a1} \;\cdots\; \vec{u}_{ar} \;\; \vec{y}_{a1} \;\cdots\; \vec{y}_{as} \,\right] \tag{15} $$

1. Estimate through (6) the means of the actual input data, $\hat{\mu}_{\vec{u}_a}$, and through (7) the nominal means of the output variables, $\hat{\mu}_{\vec{y}}$; next construct the vector

$$ \hat{\mu}_{\vec{x}_a} = \left[\, \hat{\mu}_{\vec{u}_a} \;\; \hat{\mu}_{\vec{y}} \,\right]_{(1 \times m)} \tag{16} $$

2. Standardize the $m$ terms in (15) using the estimated means given in (16) and the historical standard deviations (11), that is

$$ \tilde{x}_a(j) = \frac{\vec{x}_a(j) - \hat{\mu}_{\vec{x}_a}(j)}{\hat{\sigma}_{\vec{X}}(j)} $$

for $j = 1, \dots, m$.

3. Transform the vector $\tilde{x}_a$ to the principal components subspace, $z_a$, through $V_t$

$$ z_a = \tilde{x}_a V_t $$

4. Map $z_a$ to the behaviour symptom $T^2_{z_a}$ through

$$ T^2_{z_a} = z_a\, S_Z^{-1} z_a^T $$

5. If the resulting value exceeds the normal condition threshold UCL, then a fault is present in the system.

The key to the proposed methodology is the continuous estimation of the nominal means (16) using the nominal linear inputs-output relations (5), in order to carry out an appropriate standardization. In the following section the proposed fault detection algorithm is applied to detect faults in a system of three interconnected tanks, considering simple relations for the means estimation.

4 Three Tanks System

The tanks system is composed of three cylindrical tanks, interconnected at the bottom by pipes, with valve V1 in the link between tanks 2 and 3 and valve V2 in the link between tank 2 and the outside; the aperture of these valves can be manipulated in order to emulate faults (e.g. pipe blockage), see Fig. 2. The tank dimensions are $h_T = 0.63$ m and $A_T = 0.01539$ m$^2$. The system is fed by two inputs, $Q_1$ to tank 1 and $Q_2$ to tank 2, which are measured, as are the output variables $h_1$, $h_2$ and $h_3$, which correspond to the tank levels. The mathematical model is the following

$$
\begin{aligned}
A_T \frac{dh_1}{dt} &= Q_1 + Q_{31} - Q_{10}, & Q_{10} &= K_1 \sqrt{h_1}, \quad Q_{31} = K_{31}\, \rho(h_3 - h_1) \\
A_T \frac{dh_3}{dt} &= Q_{23} - Q_{31}, & Q_{23} &= K_{23}\, \rho(h_2 - h_3) \\
A_T \frac{dh_2}{dt} &= Q_2 - Q_{23} - Q_{20}, & Q_{20} &= K_2 \sqrt{h_2}
\end{aligned}
\tag{17}
$$

where $\rho(x) \triangleq \operatorname{sgn}(x) \sqrt{|x|}$.
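The right-hand side of the model (17) can be written as a short Python function, shown here as a sketch rather than as the simulator used in the paper. The function names `rho` and `tank_derivatives` and the argument layout are assumptions of this example. The numerical values are those reported for the simulated operating point, with the last two coefficients read as the nominal values of $K_{23}$ and $K_2$; treating that point as an approximate equilibrium is an interpretation, which the example checks by evaluating the derivatives there.

```python
import math

A_T = 0.01539  # tank cross-section in m^2 (from the text)


def rho(x):
    """rho(x) = sgn(x) * sqrt(|x|)."""
    return math.copysign(math.sqrt(abs(x)), x)


def tank_derivatives(h, q, k):
    """Right-hand side of (17).

    h = (h1, h2, h3) levels, q = (Q1, Q2) inflows,
    k = (K1, K31, K23, K2) flow coefficients.
    Returns (dh1/dt, dh2/dt, dh3/dt).
    """
    h1, h2, h3 = h
    q1, q2 = q
    k1, k31, k23, k2 = k
    q10 = k1 * math.sqrt(h1)
    q31 = k31 * rho(h3 - h1)
    q23 = k23 * rho(h2 - h3)
    q20 = k2 * math.sqrt(h2)
    dh1 = (q1 + q31 - q10) / A_T
    dh3 = (q23 - q31) / A_T
    dh2 = (q2 - q23 - q20) / A_T
    return dh1, dh2, dh3


# Operating point values as reported for the simulation experiments.
h0 = (0.147, 0.276, 0.195)
q0 = (4.75e-5, 7.35e-5)
k0 = (1.816e-4, 1.005e-4, 7.804e-5, 9.804e-5)  # K23, K2: assumed nominal values
dh = tank_derivatives(h0, q0, k0)
print(dh)  # all three derivatives are below 1e-4 m/s in magnitude

# A blockage between tanks 2 and 3 can be emulated by setting K23 to zero
# (an assumption about how the fault enters the model).
dh_fault = tank_derivatives(h0, q0, (k0[0], k0[1], 0.0, k0[3]))
print(dh_fault[2] < 0.0)  # True: tank 3 starts to drain toward tank 1
```

Such a function can be passed to any fixed-step or library ODE integrator to generate level trajectories for nominal and faulty conditions.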
Figure 2: Three Tanks System

For the experiments the system was simulated at the following operating point: $Q_1^0 = 4.75 \times 10^{-5}$ m$^3$/s ($\sigma^2_{Q_1} = 1.07 \times 10^{-10}$); $Q_2^0 = 7.35 \times 10^{-5}$ m$^3$/s ($\sigma^2_{Q_2} = 1.05 \times 10^{-10}$); $h_1^0 = 0.147$ m ($\sigma^2_{h_1} = 1.96 \times 10^{-4}$); $h_2^0 = 0.276$ m ($\sigma^2_{h_2} = 4.81 \times 10^{-4}$); $h_3^0 = 0.195$ m ($\sigma^2_{h_3} = 2.65 \times 10^{-4}$); $K_1 = 1.816 \times 10^{-4}$, $K_{31} = 1.005 \times 10^{-4}$, $K_2^0 = 9.804 \times 10^{-5}$ and $K_{23}^0 = 7.804 \times 10^{-5}$.

Taking a set of 400 nominal observations measured every 10 s, a DPCA based principal components space of dimensions $301 \times 68$ was obtained, so for $\alpha = 0.01$ the resulting threshold is UCL = 95.886. On the other hand, the identified inputs-output relations were $h_1 = f(Q_1, Q_2, q_1)$, $h_2 = f(Q_1, Q_2, q_2)$ and $h_3 = f(Q_1, Q_2, q_3)$, with time lags of order $q_1 = 61$, $q_2 = 61$ and $q_3 = 60$, respectively.

4.1 Detection Results

Using a forgetting factor of $\beta = 0.95$ for the recursive input means estimation (6), the fault detection algorithm is evaluated considering the following cases:

1. Fault condition: blockage in the pipe which links tanks 2 and 3; the fault occurs at 8000 s.

2. Normal operation of the system during 15000 s, with changes in the mean of U1 of +20% for 3000 s < t < 6000 s and -20% for 9000 s < t < 12000 s, and in the mean of U2 of +20% for 4500 s < t < 7500 s and -20% for 10500 s < t < 13500 s.

3. Change in the mean of U1 of +20% from 4000 s, and fault condition (blockage in the pipe between tanks 2 and 3) at 8000 s.

The first test compares the performance of the conventional DPCA based fault detection and the fault detection based on the proposed algorithm under fault conditions. The monitoring results are given in Fig. 3, which shows that both algorithms are able to detect the fault. The second test evaluates the performance of both algorithms under changes in the operating point. The monitoring results are given in Fig. 4, where it is clearly observed that the traditional DPCA-based fault detection (Mon1) interprets the normal changes in the operating point as faults, whereas the proposed algorithm (Mon2) is robust to these changes, which reduces the false alarm rate. Finally, the third test shows the capability of the proposed fault detection algorithm to distinguish between normal variations in the operating point and the presence of faults, see Fig. 5.

Figure 3: Fault condition: UCL - Threshold of normal condition; Mon1 - DPCA-based monitoring; Mon2 - DPCA with adaptation

Figure 4: Normal condition: UCL - Threshold of normal condition; Mon1 - DPCA-based monitoring; Mon2 - DPCA with adaptation

Figure 5: Normal and Fault condition: UCL - Threshold of normal condition; Mon1 - DPCA-based monitoring; Mon2 - DPCA with adaptation

5 Conclusions

A modification to the DPCA algorithm for fault detection has been proposed, in which an appropriate standardization with respect to on-line estimated statistical parameters is carried out, provided that simple healthy relations between variables can be obtained. This idea makes it possible to deal with non-stationary signals and to reduce significantly the rate of false alarms. A series of tests showed the effectiveness of the proposed fault detection algorithm in distinguishing between normal changes in the signals and the variations due to the presence of faults.

References

[1] R. J. Patton, P. M. Frank, R. N. Clark, Issues of Fault Diagnosis for Dynamic Systems, Springer-Verlag, 1989, London.

[2] L. H. Chiang, E. L. Russell, R. D. Braatz, Fault Detection and Diagnosis in Industrial Systems, Advanced Textbooks in Control and Signal Processing, Springer-Verlag, 2001, London.

[3] J. E. Jackson, A User's Guide to Principal Components, John Wiley, 1991, New York.

[4] J. V. Kresta, J. F. MacGregor, T. E. Marlin, Multivariate Statistical Monitoring of Process Operating Performance, The Canadian Journal of Chemical Engineering, Vol. 69, pp. 35-47, February, 1991.

[5] A. Raich, A. Çinar, Statistical Process Monitoring and Disturbance Diagnosis in Multivariable Continuous Processes, AIChE Journal, Vol. 42, No. 4, pp. 995-1009, April, 1996.

[6] N. D. Tracy, J. C. Young, R. L. Mason, Multivariate Control Charts for Individual Observations, Journal of Quality Technology, Vol. 24, No. 2, pp. 88-95, April, 1992.

[7] A. Norvilas, A. Negiz, J. DeCicco, A. Çinar, Intelligent Process Monitoring by Interfacing Knowledge-based Systems and Multivariate Statistical Monitoring, Journal of Process Control, Vol. 10, No. 4, pp. 341-350, August, 2000.

[8] D. C. Montgomery, Introduction to Statistical Quality Control, John Wiley, 2001, New York.

[9] W. Ku, R. H. Storer, Ch. Georgakis, Disturbance Detection and Isolation by Dynamic Principal Component Analysis, Chemometrics and Intelligent Laboratory Systems, Vol. 30, No. 1, pp. 179-196, November, 1995.

[10] N. B. Gallagher, B. M. Wise, S. W. Butler, D. D. White, G. G. Barna, Development and Benchmarking of Multivariate Statistical Process Control Tools for a Semiconductor Etch Process: Improving Robustness Through Model Updating, ADCHEM'97, Banff, Canada, pp. 78-83, 9-11 June, 1997.

[11] W. Li, H. H. Yue, S. Valle-Cervantes, S. J. Qin, Recursive PCA for Adaptive Process Monitoring, Journal of Process Control, Vol. 10, No. 5, pp. 471-486, October, 2000.

[12] T. Kailath, A. H. Sayed, B. Hassibi, Linear Estimation, Prentice Hall, 2000, New Jersey.

[13] G. E. P. Box, G. M. Jenkins, G. C. Reinsel, Time Series Analysis: Forecasting and Control, Prentice Hall, 1994, New Jersey.

Jesús Mina, Cristina Verde
Instituto de Ingeniería-UNAM
Automatización
Coyoacán, DF, 04510, México, Fax: (52)-55-56233600 ext 8052
E-mail: jminaa@iingen.unam.mx, verde@servidor.unam.mx

Received: January 15, 2007

Jesús Mina received the BS degree in Electric Eng. from Tuxtla Gutiérrez Technological Institute, México, in 1999; the MS degree in Electronic Eng. from the Research and Technological Development National Center, México, in 2002; and is currently a student in the PhD program in Electrical Eng. of the National University of México. He was a professor from 2002 to 2003 at the Zacatepec Technological Institute. He has carried out research in non-linear control for active power filters and is currently interested in fault diagnosis based on multivariate statistical analysis. Jesús Mina is a member of the International Society of Automation.

Cristina Verde received the BS degree in Electronic and Communication Eng. from National Polytechnic, México, in 1973, the MS degree in Electrical Eng. from the National Polytechnic, México, in 1974, and the PhD degree in Electrical Eng. from Duisburg University, Germany, in 1983. In 1984 she joined the National University of México (UNAM) and became the head of the automatic control department in 1988. She has been the Coordinator of the Postgraduate School in Computer Science and Engineering of the National University. She has applied control theory to improve the distribution, regulation and quality of water in México, and her main research interests include automatic fault detection and diagnosis for dynamic systems and the integrity of industrial processes. She received the Sor Juana Ines de la Cruz Prize, given by the National University to outstanding women in the engineering field, in 2005.