Kurdistan Journal of Applied Research (KJAR)
Print-ISSN: 2411-7684 | Electronic-ISSN: 2411-7706
Website: Kjar.spu.edu.iq | Email: kjar@spu.edu.iq
Kurdistan Journal of Applied Research | Volume 6 – Issue 2 – December 2021 | 94

A Comparison Between the MEWMA and Mahalanobis Distance Control Chart

Kawa M. Jamal Rashid
University of Sulaimani, College of Administration and Economics, Informatics Statistics Department
Email: kawa.rashid@univsul.edu.iq

Article Info
Volume 6 – Issue 2 – December 2021
DOI: 10.24017/science.2021.2.9
Article history: Received 17/8/2021, Accepted 15/9/2021

ABSTRACT
Statistical Process Control (SPC) is an approach that uses statistical techniques to monitor a process. Control charts are widely used in quality control. A traditional variable control chart includes three lines: the Upper Control Limit (UCL), the Lower Control Limit (LCL), and the Centre Line (CL), all of which are represented by numeric values. The centre line of a control chart marks the average value of the quality characteristic under investigation. Depending on the numeric values of the observations, a process is either "in control" or "out of control", provided there is no uncertainty about the observations and their values during the production process. However, when these observations include human judgments, assessments, and choices, the quality characteristic should be treated as a continuous random variable (xi) of the manufacturing process. The Multivariate Exponentially Weighted Moving Average (MEWMA) and Mahalanobis distance control chart techniques are used when more than one variable is involved. Today quality control has become one of the most essential techniques for studying all the variables that control production or consumption decisions. The primary aim of quality control is to guarantee that the goods, services, or processes delivered satisfy particular standards and are reliable and satisfactory. The approach is based directly on the product's quality.
This study shows that the MD control chart gives a good result. The paper aims to compare the MEWMA and Mahalanobis distance charts for three components of drinking water produced at the ALA Factory in Sulaimani province, Iraq.

Keywords: Statistical Process Control, Exponentially Weighted Moving Average (EWMA) chart, Multivariate Exponentially Weighted Moving Average (MEWMA) chart, Mahalanobis distance

Copyright © 2021 Kurdistan Journal of Applied Research. All rights reserved.

1. INTRODUCTION
Quality control methods are among the most critical tools for studying all the variables that control production. The primary strategy of quality control is to ensure that products are of high quality and that the processes provided meet consumers' requirements; it is employed to ensure levels of quality in a product. Quality control problems in industry may involve more than a single quality characteristic. Walter A. Shewhart invented the quality control chart in 1924, which has formed the basis of extensive work in multivariate quality control; he was the first to recognize the need to consider quality control problems as multivariate. During the period 1930-1940, Hotelling performed a great deal of work on multivariate statistical control procedures. Hotelling's T² control chart is one of the most widely used tools in multivariate statistical process control.

Multivariate Statistical Process Control is a collection of advanced techniques for monitoring and controlling a process. Multivariate approaches should have three important properties: (1) they answer the question "Is the process in control?"; (2) the Type I error must be specified; (3) they must take into account the variables' interrelationships [1][2][5][6].
The exponentially weighted moving average (EWMA) chart (Roberts, 1959) and the cumulative sum (CUSUM) chart (Page, 1954) are often used for detecting shifts in a sequence of independent normal observations with common variance coming from a particular process. The EWMA chart relies on the specification of a target value and a known or reliable estimate of the standard deviation. For this reason, the moving average chart is best used after process control has been established [6][8]. The EWMA control chart is a good alternative to the Shewhart control chart when we are interested in detecting small shifts. The performance of the EWMA control chart is approximately equivalent to that of the cumulative sum control chart, and in some ways it is easier to set up and operate. As with the CUSUM, the EWMA is typically used with individual observations [2][3][4]. If the observations $X_i$ are independent random variables with variance $\sigma^2$, then the variance of the EWMA statistic $Z_i$ (defined in Eq. (25)) is:

$$\sigma_{z_i}^2 = \sigma^2 \left(\frac{\lambda}{2-\lambda}\right)\left[1-(1-\lambda)^{2i}\right] \tag{1}$$

2. THE MULTIVARIATE NORMAL DISTRIBUTION
The multivariate normal distribution model extends the univariate normal distribution model to fit multivariate observations. It is an essential direction in statistics that analyzes the relationships among more than one variable, and the dependence between variables and between groups of variables. Generally, we use the normal distribution to describe the behavior of a continuous quality characteristic. The univariate normal probability density function is:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty \tag{2}$$

The mean of the normal distribution is $\mu$, and the variance is $\sigma^2$. Note that (apart from the factor $-\tfrac{1}{2}$) the term in the exponent of the normal distribution can be written as follows:

$$(x-\mu)\,(\sigma^2)^{-1}\,(x-\mu) \tag{3}$$

This quantity measures the squared standardized distance from $x$ to the mean $\mu$, where the term "standardized" means that the distance is expressed in standard deviation units.
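Equations (2) and (3) can be checked numerically. The following is a minimal sketch in plain Python; the function names and the values of the mean, standard deviation, and observation are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density, Eq. (2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / math.sqrt(2 * math.pi * sigma ** 2)

def squared_standardized_distance(x, mu, sigma):
    """Squared standardized distance, Eq. (3): (x - mu) (sigma^2)^-1 (x - mu)."""
    return (x - mu) * (1.0 / sigma ** 2) * (x - mu)

# Hypothetical process mean, standard deviation, and observation (illustration only).
mu, sigma, x = 10.0, 2.0, 14.0

d2 = squared_standardized_distance(x, mu, sigma)
print(d2)             # squared distance in squared standard deviation units
print(math.sqrt(d2))  # distance in standard deviation units

# The exponent of Eq. (2) is -1/2 times the distance of Eq. (3).
print(normal_pdf(x, mu, sigma))
print(math.exp(-0.5 * d2) / math.sqrt(2 * math.pi * sigma ** 2))
```

With these values the observation lies two standard deviations from the mean, so the squared standardized distance is 4, and the last two printed lines agree.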
The same approach can be used in the multivariate normal distribution case. Suppose that we have p variables, given by $x_1, x_2, \ldots, x_p$, arranged in a p-component vector $x' = [x_1, x_2, \ldots, x_p]$. Let $\mu' = [\mu_1, \mu_2, \ldots, \mu_p]$ be the vector of the means of the x's, and let the variances and covariances of the random variables in x be contained in a (p × p) covariance matrix $\Sigma$. The main diagonal elements of $\Sigma$ are the variances of the x's, and the off-diagonal elements are the covariances. The squared standardized distance from x to µ is then [5][7]:

$$(x-\mu)'\,\Sigma^{-1}\,(x-\mu) \tag{4}$$

In multivariate analysis, the Mahalanobis distance (MD) has been a fundamental statistic. It was introduced by the famous Indian statistician Prof. P. C. Mahalanobis in 1936 and has been applied by researchers in several different areas [6][8][10]. The MD measures the distance of unknown observations from reference points in multivariate space, or the distance between vectors, and has different practical uses, such as measuring the difference between pairs of individuals and comparing the similarity of observations. It is a multivariate equivalent of the Euclidean distance. One of the main reasons for using the MD is that it is very sensitive to intervariable changes in the reference data; the idea is to measure how many standard deviations an observation lies from the mean [6][9]. The Mahalanobis distance formula is:

$$MD_j = D_j^2 = Z_{ij}'\, C^{-1}\, Z_{ij} \tag{5}$$

$$Z_{ij} = (z_{1j}, z_{2j}, \ldots, z_{kj}), \qquad j = 1 \text{ to } n, \quad i = 1 \text{ to } k \tag{6}$$

$$z_{ij} = \frac{X_{ij} - m_i}{s_i} \tag{7}$$

where $m_i$ and $s_i$ are the mean and standard deviation of the i-th variable. In vector form:

$$D^2 = (x-m)^T\, C^{-1}\, (x-m) \tag{8}$$

Where:
D²: is the MD
x: is the vector of observations
m: is the vector of means
C⁻¹: is the inverse covariance matrix

In effect, the MD transforms the variables into uncorrelated ones, rescales them to unit variance, and finally calculates the Euclidean distance. These three steps are intended to address the issues that the Euclidean distance has with correlated variables measured on different scales.
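Equation (8) can be sketched in a few lines for the two-variable case, where the inverse of the covariance matrix has a closed form. This is an illustrative sketch only: the function name, the mean vector, and the covariance matrix (unit variances, correlation 0.8) are hypothetical and are not taken from the data of this paper.

```python
def mahalanobis_sq_2d(x, m, C):
    """D^2 = (x - m)^T C^-1 (x - m) for p = 2 variables, Eq. (8)."""
    d1, d2 = x[0] - m[0], x[1] - m[1]
    (a, b), (c, d) = C
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form 2x2 inverse: C^-1 = (1/det) * [[d, -b], [-c, a]]
    return (d * d1 * d1 - (b + c) * d1 * d2 + a * d2 * d2) / det

m = (0.0, 0.0)
C = ((1.0, 0.8), (0.8, 1.0))   # hypothetical: unit variances, correlation 0.8
identity = ((1.0, 0.0), (0.0, 1.0))

along = (1.0, 1.0)     # moves with the correlation
against = (1.0, -1.0)  # same Euclidean distance, moves against the correlation

print(mahalanobis_sq_2d(along, m, C))         # about 1.11
print(mahalanobis_sq_2d(against, m, C))       # about 10.0
print(mahalanobis_sq_2d(along, m, identity))  # 2.0, the squared Euclidean distance
```

Both points are at the same Euclidean distance from the mean, but the point that contradicts the correlation structure receives a much larger MD. With an identity covariance matrix the MD reduces to the squared Euclidean distance.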
To interpret Eq. (8), consider the expression $(x-m)^T C^{-1} (x-m)$. The term $(x-m)$ is essentially the distance of the vector from the mean. This is then divided by the covariance matrix (that is, multiplied by the inverse of the covariance matrix). What is the effect of dividing by the covariance? If the variables in the data set are strongly correlated, then the covariance will be high, and dividing by a large covariance will effectively reduce the distance. Likewise, if the x's are not correlated, then the covariance is not high, and the distance is not reduced much. The MD therefore effectively addresses both the problem of scale and the problem of correlation between the variables [5][7].

Let $X_i$, i = 1, ..., n, be random vectors with p components, with mean $E(X_i) = \mu$ and covariance matrix $Cov(X_i) = \Sigma$. The MD space $y_i$ is generated by:

$$y_i = \Sigma^{-1/2}\,(x_i - \mu), \qquad i = 1, 2, \ldots, n \tag{9}$$

3. HOTELLING'S T² DISTRIBUTION
Some of the multivariate procedures for control charts are based heavily on Hotelling's T² distribution, introduced by Harold Hotelling (1947), an American mathematical statistician and influential economic theorist. The T² distribution is a multivariate sampling distribution and the basis of several multivariate control charts. The T² control chart can be used with data in subgroups or with individual observations [1][5]. The univariate statistic is

$$t^2 = \frac{(\bar{x}-\mu_0)^2}{s^2/n} \tag{10}$$

and its multivariate analogue is the quadratic form

$$(x-\mu)'\,S^{-1}\,(x-\mu) \tag{11}$$

where x is the sample mean vector and S is the sample covariance matrix. The S matrix is a little more challenging to find. It is found from the vector of moving differences for each variable. For each variable, $V_i$ is found where:

$$V_i = X_{i+1} - X_i$$

This is done for every variable, and the matrix V contains the results for all variables:

$$V = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_{m-1} \end{bmatrix}$$

where m is the number of samples.
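The moving differences can be formed directly from the data matrix. The sketch below uses plain Python lists; the function name and the small 4 × 2 data matrix are hypothetical values for illustration, not measurements from this study.

```python
def moving_differences(X):
    """Return V, whose i-th row is v_i = x_{i+1} - x_i, for an m x p data matrix X."""
    return [[nxt - cur for cur, nxt in zip(row, next_row)]
            for row, next_row in zip(X, X[1:])]

# Hypothetical data: m = 4 samples (rows) of p = 2 variables (columns).
X = [
    [10.0, 5.0],
    [12.0, 4.0],
    [11.0, 6.0],
    [13.0, 7.0],
]

V = moving_differences(X)
print(V)                  # [[2.0, -1.0], [-1.0, 2.0], [2.0, 1.0]]
print(len(V), len(V[0]))  # 3 2, that is (m - 1) rows and p columns
```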
V has a column for each variable and (m − 1) rows. S can be found using the following:

$$S = \frac{V'V}{2(m-1)} \tag{12}$$

where V′ is the transpose of V. Then T² is given by:

$$T^2 = (x-\bar{x})'\,S^{-1}\,(x-\bar{x})$$

For the T² control chart, the only limit we need to calculate is the UCL. The upper control limit based on individual observations is given by the following [7][8][11]:

$$UCL = \frac{(m-1)^2}{m}\,\beta_{\alpha,\;p/2,\;(q-p-1)/2} \tag{13}$$

where m = number of samples, p = number of variables, β = the Beta distribution, α = the significance level, and

$$q = \frac{2(m-1)^2}{3m-4} \tag{14}$$

It is well known that for a random sample of size n from a univariate normal population, $X \sim N(\mu, \sigma^2)$, if we compute the sample mean $\bar{X}$ and the sample variance $S^2$, then

$$t = \frac{\bar{X}-\mu}{s/\sqrt{n}}$$

has a t distribution with (n − 1) degrees of freedom [10][11].

Suppose that $X_1, X_2, X_3, \ldots, X_n$ are jointly distributed with a multivariate normal distribution. Let the means be $\mu_1, \mu_2, \mu_3, \ldots, \mu_n$ and the variances be $\sigma_1^2, \sigma_2^2, \sigma_3^2, \ldots, \sigma_n^2$. For n > 1 the characteristics may or may not be independent, and the covariance of $X_i$ and $X_j$ is given as:

$$\sigma_{ij} = E\left[(X_i-\mu_i)(X_j-\mu_j)\right] \tag{15}$$

The variance of $X_i$ is:

$$\sigma_{ii} = E\left[(X_i-\mu_i)(X_i-\mu_i)\right] = E\left[(X_i-\mu_i)^2\right] = \sigma_i^2 \tag{16}$$

The random variables can be written as a vector:

$$X_k = \begin{bmatrix} X_{1k} \\ X_{2k} \\ \vdots \\ X_{pk} \end{bmatrix}, \qquad k = 1, 2, 3, \ldots, m; \quad j = 1, 2, 3, \ldots, p; \quad j = h \tag{17}$$

The estimate of the vector E(X) is:

$$\hat{\mu} = \begin{bmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_p \end{bmatrix}$$

where

$$\bar{X}_j = \frac{1}{m}\sum_{k=1}^{m} X_{jk} \quad \text{and} \quad S_j^2 = \frac{1}{m-1}\sum_{k=1}^{m}\left(X_{jk}-\bar{X}_j\right)^2, \qquad j = 1, 2, 3, \ldots, p; \quad j = h \tag{18}$$

Therefore we can calculate the value of Hotelling's T². If we have two characteristics, $X_1$ and $X_2$, with a joint distribution, and with sample means, variances, and the covariance between the two variables, then under these conditions the formula for Hotelling's T² is:

$$T^2 = \frac{n}{S_1^2 S_2^2 - S_{12}^2}\left[S_2^2(\bar{x}_1-\bar{x}_{01})^2 + S_1^2(\bar{x}_2-\bar{x}_{02})^2 - 2S_{12}(\bar{x}_1-\bar{x}_{01})(\bar{x}_2-\bar{x}_{02})\right] \tag{19}$$

and, in vector form,

$$T_k^2 = (X_k-\hat{\mu})'\,S^{-1}\,(X_k-\hat{\mu}) \tag{20}$$

where S is the symmetric matrix:

$$S = \begin{bmatrix} S_1^2 & S_{12} & \cdots & S_{1p} \\ & S_2^2 & \cdots & S_{2p} \\ & & \ddots & \vdots \\ & & & S_p^2 \end{bmatrix} \tag{21}$$

The upper control limit is therefore given by:

$$UCL = \frac{(m-1)^2\,\left[p/(m-p-1)\right]\,F_{\alpha,\,p,\,m-p-1}}{m + m\,\left[p/(m-p-1)\right]\,F_{\alpha,\,p,\,m-p-1}} \tag{22}$$

$$UCL = \frac{p\,(m^*+1)(m^*-1)}{(m^*)^2 - m^*p}\,F_{\alpha,\,p,\,m^*-p} \tag{23}$$

where m is the number of samples, and m* = m if no sample is excluded. The UCL can also be set equal to the chi-square value:

$$UCL = \chi^2_{\alpha,\,p}$$

The percentile points of Hotelling's T² can be obtained from the percentile points of the F-distribution through the relation:

$$T^2_{\alpha,\,p,\,n-1} = p\left(\frac{n-1}{n-p}\right)F_{\alpha,\,p,\,n-p} \tag{24}$$

where $F_{\alpha,\,p,\,n-p}$ represents the F-distribution with p and n − p degrees of freedom.

EWMA control charts are used to monitor the mean of a process when the interest is in identifying minor shifts. They incorporate historical data in addition to current data, but the weights attached to the data decrease exponentially as the observations become less recent. The EWMA statistic $Z_i$ is defined as:

$$Z_i = \lambda x_i + (1-\lambda)Z_{i-1} \tag{25}$$

where 0 < λ ≤ 1 is a constant, and the starting value required with the first sample at i = 1 is the process target value, so that $Z_0 = \mu_0$. Sometimes the average of the data is used as the starting value, so that $Z_0 = \bar{x}$. If the observations $X_i$ are independent random variables with variance $\sigma^2$, then the variance of $Z_i$ is:

$$\sigma_{z_i}^2 = \sigma^2\left(\frac{\lambda}{2-\lambda}\right)\left[1-(1-\lambda)^{2i}\right] \tag{26}$$

The EWMA control chart is then constructed by plotting $Z_i$ versus the sample number i. The centre line and control limits for the EWMA control chart are as follows [6][7][10]:

$$UCL,\ LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}\left[1-(1-\lambda)^{2i}\right]} \tag{27}$$

The term $[1-(1-\lambda)^{2i}]$ approaches unity as i gets larger, so the UCL and LCL become:

$$UCL,\ LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}} \tag{28}$$

$$CL = \mu_0$$

where σ is the standard deviation, L is the distance between the control limits and the centre line (CL) in multiples of the standard deviation of the EWMA statistic, and λ is the constant such that 0 < λ ≤ 1.
The quantity Z0 is the starting value; it is taken equal to the target mean µ, or to the average of the initial data when information on the target mean is not available [3][4][5][11].

4. APPLICATION: NUMERICAL ILLUSTRATION
This section uses numerical analysis and quality control charts to compare the Mahalanobis Distance (MD) and MEWMA control charts, to illustrate the methods presented above, and to show which one gives a good outcome. The paper uses data on three chemical components of water: Conductivity (X1), TDS (X2), and ACL Temp. (X3). We have a random sample of size (30) of chemical analysis data for this application, analyzed with the SPSS and STATGRAPHICS – XVI software; the data are given in Table (1). We determine the MD values and the Multivariate T-Squared values, as shown in Table (1). Table (1) shows that the maximum value of MD is (12.57) and the maximum value of T-Squared is (42.53).

Table 1: Three variables of the components of water.

| Sample | Conductivity (X1) | TDS (X2) | ACL Temp. C (X3) | EWMA for MD Value | EWMA for T-Square Value | MD Value | T-Squared Value |
|---|---|---|---|---|---|---|---|
| 1 | 183.67 | 117.67 | 15.00 | 4.83 | 15.98 | 12.57 | 42.53 |
| 2 | 187.33 | 120.00 | 17.33 | 4.15 | 14.73 | 1.43 | 9.74 |
| 3 | 189.33 | 121.33 | 18.00 | 3.83 | 13.57 | 2.54 | 8.90 |
| 4 | 191.67 | 122.67 | 17.67 | 3.84 | 13.81 | 3.90 | 14.78 |
| 5 | 189.00 | 120.67 | 15.33 | 4.82 | 18.23 | 8.72 | 35.93 |
| . | . | . | . | . | . | . | . |
| 25 | 168.00 | 107.67 | 19.33 | 2.47 | 8.18 | 3.00 | 11.84 |
| 26 | 169.00 | 108.33 | 19.33 | 2.46 | 8.57 | 2.42 | 10.13 |
| 27 | 174.67 | 112.00 | 20.33 | 2.38 | 9.00 | 2.08 | 10.71 |
| 28 | 173.67 | 111.33 | 20.00 | 2.17 | 8.92 | 1.32 | 8.64 |
| 29 | 170.33 | 109.33 | 19.67 | 2.20 | 9.21 | 2.33 | 10.35 |
| 30 | 177.00 | 113.67 | 19.33 | 2.19 | 8.05 | 2.12 | 3.41 |

Figure 1: MD and T-Square Chart

Figure (1) shows the MD and T-Square chart.
It is seen that the MD chart is well distributed and that only one point is out of control; the upper control limit of the MD chart is (11.62), as shown in Figure (2). In Figure (3), the Multivariate T-Squared chart, there are (5) points out of control.

Figure 2: MD Control Chart (Multivariate Control Chart, MD; UCL = 11.62)
Figure 3: T-Square Chart (Multivariate Control Chart, T-Squared; UCL = 11.62)
Figure 4: EWMA Chart for MD (x-axis: Observation; y-axis: EWMA; UCL = 5.23, CL = 2.90, LCL = 0.57)
[Quantile Plot of MD-Value and T-Square; y-axis: proportion]
Figure 5: EWMA Chart for T-Square (x-axis: Observation; y-axis: EWMA; UCL = 15.38, CL = 9.34, LCL = 3.30)

Figure (4) shows the EWMA chart of the MD values: the UCL and LCL are (5.23, 0.57), and the CL is (2.9). There are two points out of control, with process capability (2.33). For T-Square, the UCL and LCL are (15.38, 3.3) and the CL is (9.34); there are (4) points out of control, with process capability (6.04), as shown in Figure (5).

Figure 6: Density of EWMA Chart for T-Square

The density trace of MD is better than that of T-Square, as shown in Fig (6). The median of MD is (2.098) and that of T-Square is (7.6); by the Mann-Whitney test, there is a statistically significant difference between the medians at the 95.0% confidence level. The interval for the mean of MD is (2.9 ± 1.12), and for T-Square it is (9.34 ± 3.58), with 95% confidence. The variances of MD and T-Square are (9.31, 93.84).
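The EWMA columns of Table (1) can be checked against the recursion of Eq. (25). The sketch below is an illustration under two assumptions that are not stated in the text: a smoothing constant λ = 0.2, and starting values Z0 equal to the centre lines reported in Figures (4) and (5), namely 2.90 for MD and 9.34 for T-Square. The function name is hypothetical.

```python
def ewma(values, lam, z0):
    """EWMA recursion of Eq. (25): Z_i = lam * x_i + (1 - lam) * Z_{i-1}."""
    z, out = z0, []
    for x in values:
        z = lam * x + (1 - lam) * z
        out.append(z)
    return out

LAM = 0.2  # assumed smoothing constant, not stated in the paper

# Samples 1-5 of Table (1).
md_values = [12.57, 1.43, 2.54, 3.90, 8.72]
t2_values = [42.53, 9.74, 8.90, 14.78, 35.93]
ewma_md_table = [4.83, 4.15, 3.83, 3.84, 4.82]
ewma_t2_table = [15.98, 14.73, 13.57, 13.81, 18.23]

# Starting values assumed equal to the reported centre lines.
ewma_md = ewma(md_values, LAM, z0=2.90)
ewma_t2 = ewma(t2_values, LAM, z0=9.34)

for name, computed, table in (("MD", ewma_md, ewma_md_table),
                              ("T-Square", ewma_t2, ewma_t2_table)):
    worst = max(abs(c - t) for c, t in zip(computed, table))
    print(name, [round(c, 2) for c in computed], "max deviation:", round(worst, 4))
```

Under these assumptions the recursion agrees with the tabulated EWMA values for samples 1 to 5 to within 0.01, which is consistent with rounding of the published figures. Samples 6 to 24 are not printed in Table (1), so the later rows cannot be checked this way.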
Figure 7: Probability Plot of EWMA Chart for T-Square

The box-and-whisker plot shows a good separation, as seen in Fig (8). The lower and upper quartiles of MD are (1.102, 3.26), and those of the Multivariate T-Squared are (3.05, 11.23); the interquartile ranges of MD and T-Square are (1.95, 7.97). As for the histogram, the MD values have a better frequency distribution than T-Square, as shown in Fig (9).

Figure 8: Box Plot of MD and T-Square value
Figure 9: Histogram Frequency of MD and T-Square value

5. CONCLUSION
The main goal of this research paper was to compare the MD control chart and the Multivariate T-Squared control chart. Control charts were used throughout the study. According to the comparison study, the comparison between the MEWMA and MD control charts shows that the MD control chart gives a good result. In general, the MD control chart responds well to the variables and reflects the relationships between them.

REFERENCES
[1] A. Faraz, "Hotelling's T² Control Chart with Double Warning Lines", Quality Management Department, Air Defense Group, Aerospace Industries Organization, Tehran, Iran.
[2] A. Mitra, "Fundamentals of Quality Control and Improvement", Fourth Edition, Auburn University, College of Business, Auburn, Alabama. John Wiley & Sons, Inc., 2016.
[3] S.T.A. Niaki, M. Malaki, M.J. Ershadi, "A particle swarm optimization approach on economic and economic-statistical designs of MEWMA control charts", Scientia Iranica, 2011.
[4] D.C. Montgomery, "Introduction to Statistical Quality Control", Seventh Edition, Arizona State University, 2013.
[5] K. Yang, J. Trewn, "Multivariate Statistical Methods in Quality Management", The McGraw-Hill Companies, 2004.
[6] G. Taguchi and R. Jugulum, "The Mahalanobis-Taguchi Strategy: A Pattern Technology System", John Wiley & Sons, Inc., 2002.
[7] K. J. Rashid, "Design of an Exponentially Weighted Moving Average (EWMA) and An Exponential Weighted Root Mean Square (EWRMS) Control Chart", International Journal of Advanced Engineering Research and Science (IJAERS), Vol-4, Issue-3, Mar-2017.
[8] R. Mawonike, "Multivariate Exponentially Weighted Moving Average for Monitoring Coke Production: A Case of Wankie Colliery Company", Global Journal of Theoretical and Applied Mathematics Sciences, ISSN 2248-9916, Volume 7, Number 1 (2017), pp. 1-16.
[9] S. Bersimis and J. Panaretos, "Multivariate Statistical Process Control Charts and the Problem of Interpretation: A Short Overview and Some Applications in", University of Economic and Business, Department of Statistics, Athens, Greece, 2005.
[10] T.T. Allen, "Introduction to Engineering Statistics and Lean Sigma: Statistical Quality Control and Design of Experiments and Systems", Second Edition, 2010.
[11] T. P. Ryan, "Statistical Methods for Quality Improvement", Third Edition, John Wiley & Sons, Inc., Hoboken, New Jersey, 2011.