INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
ISSN 1841-9836, 13(6), 972-987, December 2018.

Scheme for Statistical Analysis of Some Parametric Normalization Classes

A. Krylovas, N. Kosareva, E.K. Zavadskas

Aleksandras Krylovas
Department of Mathematical Modelling, Vilnius Gediminas Technical University, Sauletekio av. 11, Vilnius, Lithuania
aleksandras.krylovas@vgtu.lt

Natalja Kosareva
Department of Mathematical Modelling, Vilnius Gediminas Technical University, Sauletekio av. 11, Vilnius, Lithuania
natalja.kosareva@vgtu.lt

Edmundas Kazimieras Zavadskas*
Laboratory of Operational Research, Institute of Sustainable Construction, Vilnius Gediminas Technical University, Sauletekio av. 11, Vilnius, Lithuania
*Corresponding author: edmundas.zavadskas@vgtu.lt

Abstract: In this research, 7 parametric classes of normalization functions depending on 1 or 2 parameters are proposed for MCDM problem solution. Monte Carlo experiments were carried out to perform a comparative statistical analysis and to find optimal parameter values for the case of a Gaussian distribution of decision-making matrix elements. Optimal parameter values were ascertained for each normalization method. The normalization formulas were compared with each other in terms of their efficiency. The Logarithmic and Max normalization formulas demonstrated the highest rates of best-alternative identification. The proposed methodology for determining optimal parameter values of normalization formulas could be applied by approximating real data with appropriate probability distributions.

Keywords: normalization methods, multi-criteria optimization, Monte Carlo method, comparative statistical analysis, SAW.

1 Introduction

Multiple criteria decision making (MCDM) methods deal with the ranking of alternatives (projects) by measurements or evaluations of the projects according to a finite number of attributes (criteria).
Ranking results depend on many components of this process; the main ones that influence the final results are
• the data normalization formula,
• the weight determination method,
• the data aggregation method.

Copyright ©2018 CC BY-NC

Data normalization is an important part of a decision-making problem, but it still receives too little attention in the scientific literature. Nevertheless, a number of scientific papers have shown that the choice of data normalization method often significantly affects the accuracy of the MCDM problem solution. The main topic of existing articles is the investigation of various normalization formulas together with TOPSIS, one of the most popular MCDM methods for ranking alternatives. Jahan and Edwards [4] proposed a comprehensive systematized review of existing normalization methods. Some of them are traditionally used with certain MCDM methods, for example, the well-known tandem of vector normalization (Van Delft and Nijkamp [17]) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method (Hwang and Yoon [3]). The study of Celen [1] revealed that TOPSIS with vector normalization generated the most consistent results among the four most popular normalization procedures. Among the linear normalization procedures, the max-min and max methods appeared as possible alternatives to the vector normalization procedure. A simulation comparison of normalization procedures for TOPSIS in terms of their ranking consistency and weight sensitivity was carried out by Chakraborty and Yeh [2]. The study results also justify the use of the vector normalization procedure for TOPSIS. The influence of normalization tools on the COPRAS-G method applied to a material selection task was studied by Yazdani et al.
[19]. In the study of Podviezko and Podvezko [13] it is shown that different types of transformation and normalization of data applied to popular MCDA methods, such as SAW or TOPSIS, may produce considerable differences in evaluation. The authors stated that attention has to be paid to choosing the type of normalization that reflects the preferences of the decision-maker. Kosareva et al. [7] carried out a comparative statistical analysis of 5 widely used normalization methods with the SAW method for ranking the alternatives and a ternary estimates matrix. It is notable that the results differ strongly for benefit-type and cost-type attributes. The Minmax method is in most cases significantly better than the others. In the study of Peldschus [12] the impact of linear, concave and convex function profiles for mapping onto a dimensionless interval (normalization) was investigated. A review of the normalization methods used in construction engineering and management, and of their applications there, is presented by Kaplinski and Tamošaitienė [6]. Milani et al. [11] examined how different families of norms affect the result of solving an engineering decision problem by the entropy and TOPSIS methods. It was verified that the linear optimization norms cannot affect the rank of alternatives significantly. In contrast, nonlinear norms may yield some deviations, mainly for alternatives that are inherently close.

Recently, some new normalization formulas were proposed in the literature. The research of Zavadskas and Turskis [20] is focused on introducing a new logarithmic method for decision-making matrix normalization. Based on the formulas of Weitendorf [18] and Juttler [5], Stanujkic et al. [14] proposed a new normalization procedure. The idea is to use the distance from the preferred ratings, which respects the decision-maker's preferences.
This procedure, adapted for negotiations, was integrated with the Step-Wise Weight Assessment Ratio Analysis (SWARA) and Additive Ratio Assessment (ARAS) methods in the research of Stanujkic et al. [15]. Data normalization, as well as measurement scales, inconsistency issues, missing judgement estimation methods, etc., have been extensively studied for the pairwise comparison matrix (PCM). A comprehensive literature review on PCM is provided in Kou et al. [8]. A group decision-making (GDM) method for integrating heterogeneous information was proposed by Li et al. [9].

The purpose of this study is to propose a new methodology for constructing parametric normalization methods and to carry out their comparative statistical analysis. 7 classes of parametric data normalization procedures are presented. Some of the proposed normalization methods with particular parameter values are well known and widespread, while others are less popular. However, none of these methods has so far been applied over a wide range of parameter values.

The article is organized as follows. In Section 2, seven classes of parametric normalization functions are introduced, and their properties and dependency on parameter values are discussed. The experiment design and a detailed description of the procedure for generating the initial data matrices are given in Section 3. Monte Carlo experiment results and a comparative statistical analysis of the normalization methods depending on the parameter values are presented in Section 4. Conclusions and future research are discussed in Section 5.

2 Parametric data normalization methods

In this article the Simple Additive Weighting (SAW) method (MacCrimmon [10]) with equal weights is applied for MCDM problem solution. Let us suppose that the initial data are the results of some measurements, expert evaluations, etc., and are written in an m×n matrix $X = (x_{ij})_{m \times n}$.
The element $x_{ij}$ of the decision-making matrix is the evaluation of alternative $i$ ($i = 1, 2, \dots, m$) by criterion $j$ ($j = 1, 2, \dots, n$). The decision-making matrix after the normalization procedure is denoted $\tilde{X} = (\tilde{x}_{ij})_{m \times n}$, $0 \le \tilde{x}_{ij} \le 1$. Let $w_j$, $j = 1, 2, \dots, n$, be criteria weights satisfying the conditions $\sum_{j=1}^{n} w_j = 1$, $0 \le w_j \le 1$. Then the SAW aggregated value is calculated for each alternative:

$$Q_i = \sum_{j=1}^{n} w_j \tilde{x}_{ij}, \quad i = 1, 2, \dots, m.$$

If $Q_1 \ge Q_2 \ge \dots \ge Q_m$, then the alternatives are ranked as follows: $\text{altern}_1 \succeq \text{altern}_2 \succeq \dots \succeq \text{altern}_m$.

The essence of any normalization procedure is the mapping of real values $x_{ij} \in [m_j, M_j] \subset (-\infty, +\infty)$, which have a certain meaning, into the interval $[0, 1]$, which in the general case represents these numbers unevenly. For example, the functions $x^{\alpha} : [0, 1] \to [0, 1]$, depending on the parameter $\alpha$, map the upper half of the input values, $x \in [0.5, 1]$, onto only 13% of the normalized range when $\alpha = 0.2$ (because $1 - 0.5^{0.2} \approx 0.13$), onto 30% when $\alpha = 0.5$, onto 75% when $\alpha = 2$ and onto as much as 97% when $\alpha = 5$ (see Figure 1).

Figure 1: Normalization functions $x^{\alpha} : [0, 1] \to [0, 1]$.

It means that the aggregated values $Q_i$ can depend strongly not so much on the absolute values of $x_{ij}$ as on the relationships $x_{i'j} < x_{i''j}$. In this research we limit ourselves to direct optimization, when higher values of criteria are better, omitting inverse optimization, when lower values are treated as better. So, the only essential requirement for normalization functions is that they must be monotonically increasing. The other restriction of our research is that the elements of the initial decision-making matrix are randomly generated from the Gaussian (normal) probability distribution.

Practically applied normalization methods can be more complex than the functions depicted in Figure 1 and can therefore have a great deal of influence on alternative comparisons and decision-making results.
Therefore, the literature deals with a large number of normalization methods; new methods are being developed, and they are studied and compared.

Denote $m_j = \min_{1 \le i \le m} x_{ij}$, $M_j = \max_{1 \le i \le m} x_{ij}$, $j = 1, 2, \dots, n$. In Table 1 we propose 7 classes of normalization functions $[m_j, M_j] \to [0, 1]$ depending on 2 parameters $\alpha \ge 0$ and $k \ge 0$. These classes describe 7 different normalization approaches. Note that methods (1), (6) and (7) depend exclusively on the parameter $\alpha$, whereas methods (2)–(5) depend on the two parameters $\alpha \ge 0$ and $k \ge 0$. When $\alpha = 1$, methods (1), (6) and (7) are known from the literature as: (1) Minmax normalization, (6) Logarithmic normalization, (7) Max normalization. Plots of functions (1)–(7) for $\alpha = 1$, $k = 1$ are given in Figure 2, and plots of functions (1), (6) and (7) for $\alpha = 0.5, 2$ in Figure 3. Graphs of functions (2)–(5) with parameter values $\alpha = 0.5, 2$; $k = 2, 3, 5$ are depicted in Figure 4.

Table 2 presents an example of data matrix normalization by methods (1)–(7) when $\alpha = 1$, $k = 1$ and the initial data matrix is

$$(x_{ij})_{4 \times 4} = \begin{pmatrix} 120 & 1.2 & 1 & 4 \\ 250 & 2.5 & 2 & 5 \\ 3600 & 36 & 5 & 8 \\ 6400 & 64 & 10 & 13 \end{pmatrix}.$$

The second-column elements of matrix $(x_{ij})$ are obtained by dividing the first-column elements by 100, while the fourth-column elements are calculated by adding the constant 3 to the third-column elements. Observe that data normalization formulas (1)–(5) are invariant with respect to a linear transformation $ax + b$, $a > 0$; formula (7) is invariant with respect to multiplication/division but not with respect to addition, while formula (6) is invariant with respect to neither multiplication nor addition. All of the first 5 methods map the lowest value to 0 and the highest value to 1; method (7) maps the highest value to 1.

Several questions naturally arise considering this issue.
Are the mutual differences between the results of applying these methods to MCDM problem solution significant? What are the "best" values of the parameters $\alpha$ and $k$? How do the results vary when the values of $\alpha$ and $k$ are varied? How do the results vary when the values of $m$ and $n$ are varied?

Table 1: Formulas for 7 parametric normalization methods in the case of direct normalization.

| Formula number | Normalization method | Direct normalization formula |
|---|---|---|
| (1) | Minmax normalization (Weitendorf, 1976) | $\tilde{x}_{ij} = \left(\frac{x_{ij} - m_j}{M_j - m_j}\right)^{\alpha}$ |
| (2) | Exponential normalization | $\tilde{x}_{ij} = e^{-k\left(\frac{M_j - x_{ij}}{x_{ij} - m_j}\right)^{\alpha}}$ |
| (3) | Logarithmic maxmin normalization | $\tilde{x}_{ij} = \frac{1}{\left(1 + k \ln\left(\frac{M_j - m_j}{x_{ij} - m_j}\right)\right)^{\alpha}}$ |
| (4) | Arctangent normalization | $\tilde{x}_{ij} = \left(\frac{2}{\pi} \arctan\left(k\,\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)\right)^{\alpha}$ |
| (5) | Double exponential normalization | $\tilde{x}_{ij} = \left(\frac{1 - e^{-k\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}}{1 + e^{-k\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}}\right)^{\alpha}$ |
| (6) | Logarithmic normalization (Zavadskas and Turskis, 2008) | $\tilde{x}_{ij} = \left(\frac{\ln(x_{ij})}{\ln\left(\prod_{i=1}^{m} x_{ij}\right)}\right)^{\alpha}$, $x_{ij} \ge 1$ |
| (7) | Max normalization (Stopp [16], 1975) | $\tilde{x}_{ij} = \left(\frac{x_{ij}}{M_j}\right)^{\alpha}$ |

Figure 2: Functions (1)–(7), when $\alpha = 1$, $k = 1$.

Figure 3: Functions (1), (6) and (7), when $\alpha = 0.5$ and $\alpha = 2$.

3 Experiment design

The case under investigation is when the decision-making matrix elements are measurements made with sufficient precision and having the Gaussian probability distribution $x_{ij} \sim N(\mu_i, \sigma)$. We suppose that the first-row elements of the matrix have, on average, higher values than the elements of the other rows: $\mu_1 > \mu_2 = \dots = \mu_m$. Only those matrices were analysed for which the first alternative (first row) does not dominate any other alternative, i.e. for which the following condition is not fulfilled for any $i = 2, \dots, m$:

$$(x_{11}, x_{12}, \dots, x_{1n}) \succeq (x_{i1}, x_{i2}, \dots, x_{in}), \ \text{if} \ x_{11} \ge x_{i1},\ x_{12} \ge x_{i2},\ \dots,\ x_{1n} \ge x_{in}. \quad (8)$$

If condition (8) is valid, then, because every normalization function is monotonically increasing, the weighted averages $Q_i = \sum_{j=1}^{n} w_j \tilde{x}_{ij}$ will satisfy the inequality $Q_1 \ge Q_i$ for any values of the weights $w_j$.
Therefore, the result of all normalization methods will be the same: the first alternative is better than the i-th alternative. So, when random matrices are generated, those matrices whose first row and at least one i-th row satisfy the domination property (8) are rejected. As a result, the number of correctly decided cases decreases.

Table 2: Normalization methods application example when $\alpha = 1$ and $k = 1$.

| Normalization method | Formula | Normalized matrix $\tilde{x}_{ij}$ |
|---|---|---|
| (1) | $\tilde{x}_{ij} = \frac{x_{ij} - m_j}{M_j - m_j}$ | $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0.021 & 0.021 & 0.11 & 0.11 \\ 0.55 & 0.55 & 0.44 & 0.44 \\ 1 & 1 & 1 & 1 \end{pmatrix}$ |
| (2) | $\tilde{x}_{ij} = e^{-\left(\frac{M_j - x_{ij}}{x_{ij} - m_j}\right)}$ | $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 2.8 \cdot 10^{-21} & 2.8 \cdot 10^{-21} & 0.0003 & 0.0003 \\ 0.45 & 0.45 & 0.29 & 0.29 \\ 1 & 1 & 1 & 1 \end{pmatrix}$ |
| (3) | $\tilde{x}_{ij} = \frac{1}{1 + \ln\left(\frac{M_j - m_j}{x_{ij} - m_j}\right)}$ | $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0.21 & 0.21 & 0.31 & 0.31 \\ 0.63 & 0.63 & 0.55 & 0.55 \\ 1 & 1 & 1 & 1 \end{pmatrix}$ |
| (4) | $\tilde{x}_{ij} = \frac{2}{\pi} \arctan\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)$ | $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0.01 & 0.01 & 0.08 & 0.08 \\ 0.57 & 0.57 & 0.43 & 0.43 \\ 1 & 1 & 1 & 1 \end{pmatrix}$ |
| (5) | $\tilde{x}_{ij} = \frac{1 - e^{-\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}}{1 + e^{-\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}}$ | $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0.01 & 0.01 & 0.06 & 0.06 \\ 0.55 & 0.55 & 0.38 & 0.38 \\ 1 & 1 & 1 & 1 \end{pmatrix}$ |
| (6) | $\tilde{x}_{ij} = \frac{\ln(x_{ij})}{\ln\left(\prod_{i=1}^{m} x_{ij}\right)}$ | $\begin{pmatrix} 0.18 & 0.02 & 0 & 0.18 \\ 0.20 & 0.10 & 0.15 & 0.21 \\ 0.30 & 0.41 & 0.35 & 0.27 \\ 0.32 & 0.47 & 0.5 & 0.34 \end{pmatrix}$ |
| (7) | $\tilde{x}_{ij} = \frac{x_{ij}}{M_j}$ | $\begin{pmatrix} 0.02 & 0.02 & 0.1 & 0.31 \\ 0.04 & 0.04 & 0.2 & 0.38 \\ 0.56 & 0.56 & 0.5 & 0.62 \\ 1 & 1 & 1 & 1 \end{pmatrix}$ |

For example, the first and second rows of the decision-making matrix

$$X^{(1)} = \begin{pmatrix} 52.34 & 66.31 & 63.38 & 67.01 \\ 48.90 & 62.05 & 56.54 & 53.93 \\ 72.05 & 56.24 & 61.58 & 48.05 \\ 52.10 & 65.00 & 71.95 & 56.82 \\ 77.69 & 70.34 & 65.00 & 55.65 \end{pmatrix}$$

satisfy the inequalities $52.34 \ge 48.90$, $66.31 \ge 62.05$, $63.38 \ge 56.54$, $67.01 \ge 53.93$, therefore $(52.34, 66.31, 63.38, 67.01) \succeq (48.90, 62.05, 56.54, 53.93)$ and matrix $X^{(1)}$ is rejected. Matrix

$$X^{(2)} = \begin{pmatrix} 57.56 & 66.42 & 58.31 & 55.51 \\ 58.51 & 63.08 & 67.93 & 46.31 \\ 55.36 & 57.58 & 59.16 & 72.82 \\ 38.95 & 58.40 & 40.93 & 57.93 \\ 53.68 & 45.57 & 70.41 & 59.98 \end{pmatrix}$$

is appropriate.
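The rejection rule (8) can be illustrated with a short Python sketch. It is not the authors' C++ program; the function names `dominates` and `is_admissible` are hypothetical, and the two matrices are the examples $X^{(1)}$ and $X^{(2)}$ quoted above, read as 5 alternatives by 4 criteria.

```python
def dominates(row_a, row_b):
    """Return True if row_a >= row_b in every criterion (condition (8))."""
    return all(a >= b for a, b in zip(row_a, row_b))


def is_admissible(matrix):
    """A matrix is kept only if the first row dominates no other row."""
    first = matrix[0]
    return not any(dominates(first, other) for other in matrix[1:])


# Example matrices from the text (rows = alternatives, columns = criteria).
X1 = [
    [52.34, 66.31, 63.38, 67.01],
    [48.90, 62.05, 56.54, 53.93],
    [72.05, 56.24, 61.58, 48.05],
    [52.10, 65.00, 71.95, 56.82],
    [77.69, 70.34, 65.00, 55.65],
]
X2 = [
    [57.56, 66.42, 58.31, 55.51],
    [58.51, 63.08, 67.93, 46.31],
    [55.36, 57.58, 59.16, 72.82],
    [38.95, 58.40, 40.93, 57.93],
    [53.68, 45.57, 70.41, 59.98],
]

print(is_admissible(X1))  # False: row 1 dominates row 2, so X1 is rejected
print(is_admissible(X2))  # True: row 1 dominates no other row, so X2 is kept
```

In a generator loop, a rejected matrix would simply be redrawn until `is_admissible` returns `True`.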
The results of the 7 normalization procedures applied to $X^{(2)}$ with $\alpha = 1$, $k = 1$ are as follows:

$$\tilde{X}^{(2)}_{(1)} = \begin{pmatrix} 0.95 & 1.0 & 0.59 & 0.35 \\ 1.0 & 0.84 & 0.92 & 0.0 \\ 0.84 & 0.58 & 0.62 & 1.0 \\ 0.0 & 0.62 & 0.0 & 0.44 \\ 0.75 & 0.0 & 1.0 & 0.52 \end{pmatrix}, \quad \tilde{X}^{(2)}_{(2)} = \begin{pmatrix} 0.95 & 1.0 & 0.50 & 0.15 \\ 1.0 & 0.83 & 0.91 & 0.0 \\ 0.83 & 0.48 & 0.54 & 1.0 \\ 0.0 & 0.54 & 0.0 & 0.28 \\ 0.72 & 0.0 & 1.0 & 0.39 \end{pmatrix},$$

$$\tilde{X}^{(2)}_{(3)} = \begin{pmatrix} 0.95 & 1.0 & 0.65 & 0.49 \\ 1.0 & 0.85 & 0.91 & 0.0 \\ 0.85 & 0.65 & 0.68 & 1.0 \\ 0.0 & 0.67 & 0.0 & 0.55 \\ 0.78 & 0.0 & 1.0 & 0.60 \end{pmatrix}, \quad \tilde{X}^{(2)}_{(4)} = \begin{pmatrix} 0.97 & 1.0 & 0.61 & 0.31 \\ 1.0 & 0.88 & 0.94 & 0.0 \\ 0.88 & 0.60 & 0.65 & 1.0 \\ 0.0 & 0.64 & 0.0 & 0.42 \\ 0.80 & 0.0 & 1.0 & 0.52 \end{pmatrix},$$

$$\tilde{X}^{(2)}_{(5)} = \begin{pmatrix} 1.0 & 1.0 & 0.62 & 0.26 \\ 1.0 & 0.99 & 1.0 & 0.0 \\ 0.99 & 0.59 & 0.67 & 1.0 \\ 0.0 & 0.66 & 0.0 & 0.37 \\ 0.91 & 0.0 & 1.0 & 0.49 \end{pmatrix}, \quad \tilde{X}^{(2)}_{(6)} = \begin{pmatrix} 0.21 & 0.21 & 0.20 & 0.20 \\ 0.21 & 0.20 & 0.21 & 0.19 \\ 0.20 & 0.20 & 0.20 & 0.21 \\ 0.19 & 0.20 & 0.18 & 0.20 \\ 0.20 & 0.19 & 0.21 & 0.20 \end{pmatrix},$$

$$\tilde{X}^{(2)}_{(7)} = \begin{pmatrix} 0.98 & 1.0 & 0.83 & 0.76 \\ 1.0 & 0.95 & 0.96 & 0.64 \\ 0.95 & 0.87 & 0.84 & 1.0 \\ 0.67 & 0.88 & 0.58 & 0.80 \\ 0.92 & 0.69 & 1.0 & 0.82 \end{pmatrix}.$$

Figure 4: Graphs of functions (2)–(5) with different parameter $\alpha$ and $k$ values: a) Function (2), b) Function (3), c) Function (4), d) Function (5).

We can see that the matrices $\tilde{X}^{(2)}_{(1)}$ to $\tilde{X}^{(2)}_{(5)}$ differ only slightly from one another, whereas there are essential differences between the first 5 matrices and $\tilde{X}^{(2)}_{(6)}$, $\tilde{X}^{(2)}_{(7)}$. When a proper matrix $X$ is generated and one of the normalizations (1)–(7) is done, the SAW aggregated values with equal weights are calculated for each row:

$$Q_i = \sum_{j=1}^{n} \frac{1}{n} \cdot \tilde{x}_{ij}, \quad i = 1, 2, \dots, m. \quad (9)$$

The first alternative is considered to be the best one when $Q_1 > Q_2$, $Q_1 > Q_3$, ..., $Q_1 > Q_m$. 10 series of 100 experiments were conducted. After 100 realizations of the experiment, denote by $ID$ the number of cases in which the first alternative was identified as the best one (the number of correct MCDM problem solutions out of 100). Denote also $W_{ID} = \frac{ID}{100}$, i.e. the proportion of correct MCDM problem solutions.
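Formula (9) and the "first alternative is best" criterion can be sketched as follows in Python (again, not the authors' C++ program). The helpers `minmax_normalize`, `saw_scores` and `first_is_best` are hypothetical names; Minmax normalization (1) serves as the example method, the columns are assumed to be non-constant so that $M_j > m_j$, and the input is the matrix $X^{(2)}$ from the text.

```python
def minmax_normalize(matrix, alpha=1.0):
    """Formula (1): ((x - m_j) / (M_j - m_j)) ** alpha, column by column.

    Assumes every column has M_j > m_j (no constant columns).
    """
    columns = list(zip(*matrix))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [((x - lo) / (hi - lo)) ** alpha for x, lo, hi in zip(row, lows, highs)]
        for row in matrix
    ]


def saw_scores(normalized):
    """Formula (9): Q_i = (1/n) * sum_j x~_ij, i.e. SAW with equal weights."""
    n = len(normalized[0])
    return [sum(row) / n for row in normalized]


def first_is_best(scores):
    """True when Q_1 > Q_i for every i = 2, ..., m."""
    return all(scores[0] > q for q in scores[1:])


X2 = [
    [57.56, 66.42, 58.31, 55.51],
    [58.51, 63.08, 67.93, 46.31],
    [55.36, 57.58, 59.16, 72.82],
    [38.95, 58.40, 40.93, 57.93],
    [53.68, 45.57, 70.41, 59.98],
]

Q = saw_scores(minmax_normalize(X2, alpha=1.0))
print([round(q, 3) for q in Q])
print(first_is_best(Q))
```

For $X^{(2)}$ the third alternative receives the highest score (about 0.76 against about 0.72 for the first), so under method (1) with $\alpha = 1$ this realization would not be counted in $ID$.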
Table 3: Numbers of cases with the best first alternative ($ID$) for normalization methods (1)–(7), when $\alpha = 1$ and $k = 1$.

| Method | Series 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| (1) | 62 | 54 | 62 | 56 | 58 | 52 | 49 | 58 | 52 | 45 |
| (2) | 69 | 60 | 66 | 57 | 56 | 56 | 50 | 57 | 54 | 46 |
| (3) | 55 | 50 | 59 | 55 | 51 | 45 | 45 | 57 | 55 | 44 |
| (4) | 64 | 56 | 60 | 58 | 56 | 51 | 50 | 57 | 52 | 46 |
| (5) | 62 | 56 | 59 | 59 | 53 | 49 | 47 | 53 | 51 | 47 |
| (6) | 69 | 66 | 66 | 73 | 63 | 64 | 58 | 64 | 63 | 56 |
| (7) | 69 | 65 | 66 | 72 | 63 | 62 | 61 | 65 | 63 | 54 |

Table 3 shows the experiment results: how many times, after application of methods (1)–(7) with $\alpha = 1$ and $k = 1$, the first alternative was found to be the best option. Each row contains the results of 10 experiment series for one normalization method.

Next, Monte Carlo experiments were conducted by changing the values of the parameters $\alpha$ and $k$ as follows. The first-row elements of matrix $X$ were generated according to the Gaussian distribution with mean $\mu_1 = 67$ and standard deviation $\sigma = 15$, and the elements of the other rows with means $\mu_i = 57$, $i = 2, 3, \dots, m$, and standard deviation $\sigma = 15$. Calculations were performed using a C++ program.

4 Experiment results

4.1 Dependence of the best alternative detection accuracy on parameter α

At the first stage of the experiments we fixed the value $k = 1$ and varied $\alpha$, choosing an individual range for each normalization method. 100 series of 100 experiments were repeated for each value of the parameter $\alpha$, and the empirical averages $\overline{W}_{ID} = \frac{1}{100} \sum_{i=1}^{100} W_{ID_i}$ were calculated. The purpose is to detect the $\alpha$ value which maximizes $\overline{W}_{ID}$. Table 4 shows how the $\overline{W}_{ID}$ values depend on $\alpha$. The 95% confidence intervals [w1_0.95; w2_0.95] for the mean values $E W_{ID}$ are given in the last column of the table. The maximum empirical averages $\overline{W}_{ID}$ of correct MCDM problem solutions were achieved for the following $\alpha$ values: $\alpha = 2$ for methods (1), (3), (4) and (7), $\alpha = 0.75$ for method (2), $\alpha = 4$ for method (5), $\alpha = 20$ for method (6). The dependence of the empirical mean values on $\alpha$, as well as the upper and lower bounds of the confidence intervals of the mean values, are presented graphically in Figs. 5–6.
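The procedure described in this section can be put together in a minimal Python sketch, under stated assumptions: $m = 5$ alternatives and $n = 4$ criteria (the size of the example matrices; the text does not state the sizes used in the experiments), Minmax normalization (1) as the example method, and a normal-approximation 95% interval (mean ± 1.96·s/√N), which is only an assumption about how the confidence intervals were computed. All function names are hypothetical and this is not the authors' C++ program.

```python
import random
import statistics


def generate_matrix(rng, m=5, n=4, mu_first=67.0, mu_other=57.0, sigma=15.0):
    """Draw a Gaussian decision matrix; redraw while row 1 dominates another row."""
    while True:
        rows = [
            [rng.gauss(mu_first if i == 0 else mu_other, sigma) for _ in range(n)]
            for i in range(m)
        ]
        first = rows[0]
        if not any(all(a >= b for a, b in zip(first, other)) for other in rows[1:]):
            return rows


def first_is_best(matrix, alpha):
    """Minmax-normalize with power alpha, then test Q_1 > Q_i for all i > 1.

    Row sums are compared instead of Q_i; with equal weights 1/n the
    ordering is the same.
    """
    cols = list(zip(*matrix))
    lows, highs = [min(c) for c in cols], [max(c) for c in cols]
    scores = [
        sum(((x - lo) / (hi - lo)) ** alpha for x, lo, hi in zip(row, lows, highs))
        for row in matrix
    ]
    return all(scores[0] > q for q in scores[1:])


def run_experiment(alpha, series=100, runs=100, seed=0):
    """Return the mean of W_ID over all series and a 95% interval for the mean."""
    rng = random.Random(seed)
    w = [
        sum(first_is_best(generate_matrix(rng), alpha) for _ in range(runs)) / runs
        for _ in range(series)
    ]
    mean = statistics.mean(w)
    half = 1.96 * statistics.stdev(w) / series ** 0.5  # normal approximation
    return mean, (mean - half, mean + half)


mean, (low, high) = run_experiment(alpha=2.0)
print(f"mean W_ID = {mean:.4f}, 95% CI = [{low:.4f}; {high:.4f}]")
```

Scanning `alpha` over a grid and keeping the value with the largest mean mirrors the search for the optimal $\alpha$; the numbers produced depend on the assumed $m$, $n$ and seed, so they are not expected to match the reported values exactly.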
To improve the graphic images, smoothing spline fitting of the Table 4 data was produced with MATLAB. The question arises: are the detected differences between normalization methods with different $\alpha$ values statistically significant? Student's t-tests were applied to test the hypotheses $H_0$ of equal average values $E W_{ID}$ for $\alpha = 1$ and the corresponding optimal $\alpha$ values. Table 5 shows the p-values of the respective t-tests. Significant differences in the best alternative detection accuracy at significance level 0.05 were observed for normalization methods (1), (3)–(6). Methods (2) and (7) did not show significant differences. Consequently, it makes sense to use formulas (1), (3)–(6) with the appropriate optimal $\alpha$ values that differ from $\alpha = 1$ (see the optimal $\alpha$ values in Table 5), while it makes sense to use $\alpha = 1$ for methods (2) and (7).

Table 4: $\overline{W}_{ID}$ values and confidence intervals for mean values $E W_{ID}$ for 7 normalization methods at $k = 1$.

| Normalization method | α | $\overline{W}_{ID}$ | [w1_0.95; w2_0.95] |
|---|---|---|---|
| (1) | 1 | 0.4784 | [0.4678; 0.4890] |
| | 1.25 | 0.4899 | [0.4790; 0.5008] |
| | 1.5 | 0.4998 | [0.4885; 0.5111] |
| | 2 | 0.5015 | [0.4911; 0.5119] |
| | 3 | 0.4958 | [0.4860; 0.5055] |
| | 4 | 0.4905 | [0.4804; 0.5006] |
| | 5 | 0.4859 | [0.4758; 0.4960] |
| (2) | 0.5 | 0.4952 | [0.4854; 0.5050] |
| | 0.75 | 0.4987 | [0.4876; 0.5098] |
| | 1 | 0.4936 | [0.4836; 0.5036] |
| | 1.25 | 0.4897 | [0.4796; 0.4998] |
| | 1.5 | 0.4874 | [0.4774; 0.4974] |
| | 2 | 0.4807 | [0.4707; 0.4907] |
| (3) | 0.5 | 0.3438 | [0.3342; 0.3534] |
| | 1 | 0.4485 | [0.4373; 0.4597] |
| | 1.5 | 0.4859 | [0.4758; 0.4960] |
| | 2 | 0.4979 | [0.4875; 0.5083] |
| | 3 | 0.4959 | [0.4862; 0.5056] |
| | 4 | 0.4919 | [0.4821; 0.5017] |
| (4) | 1 | 0.4751 | [0.4643; 0.4859] |
| | 1.5 | 0.4947 | [0.4842; 0.5052] |
| | 2 | 0.5009 | [0.4904; 0.5114] |
| | 3 | 0.498 | [0.4880; 0.5080] |
| | 4 | 0.4929 | [0.4827; 0.5031] |
| | 5 | 0.4906 | [0.4803; 0.5009] |
| (5) | 1 | 0.4684 | [0.4582; 0.4786] |
| | 2 | 0.4914 | [0.4812; 0.5016] |
| | 3 | 0.4951 | [0.4845; 0.5057] |
| | 4 | 0.4952 | [0.4844; 0.5060] |
| | 5 | 0.494 | [0.4834; 0.5046] |
| | 6 | 0.4927 | [0.4823; 0.5031] |
| | 7 | 0.4923 | [0.4822; 0.5024] |
| (6) | 1 | 0.5370 | [0.5258; 0.5482] |
| | 2 | 0.5396 | [0.5287; 0.5505] |
| | 3 | 0.5444 | [0.5336; 0.5552] |
| | 4 | 0.5486 | [0.5377; 0.5595] |
| | 5 | 0.5508 | [0.5401; 0.5616] |
| | 10 | 0.5596 | [0.5496; 0.5696] |
| | 20 | 0.5609 | [0.5512; 0.5706] |
| | 30 | 0.5523 | [0.5420; 0.5626] |
| (7) | 0.25 | 0.5394 | [0.5284; 0.5504] |
| | 0.5 | 0.5400 | [0.5290; 0.5510] |
| | 0.75 | 0.5396 | [0.5286; 0.5506] |
| | 1 | 0.5403 | [0.5293; 0.5513] |
| | 2 | 0.5419 | [0.5310; 0.5528] |
| | 3 | 0.5384 | [0.5276; 0.5492] |
| | 4 | 0.5340 | [0.5232; 0.5448] |

Figure 5: Dependence of empirical average accuracy $\overline{W}_{ID}$ on the parameter $\alpha$ values for (1)–(6) normalization methods and 95% confidence intervals bounds (one panel per method; horizontal axis: α; curves: W_ID, w1_0.95, w2_0.95).

Next, the ANOVA procedure was applied to check the hypothesis of equal averages $E W_{ID}$ for the 7 normalization methods with optimal $\alpha$ values. The results show that significant differences exist between the methods. The Bonferroni test was chosen for post hoc multiple comparisons. It revealed that there are significant differences neither between the averages $E W_{ID}$ for methods (1)–(5) nor between methods (6) and (7). However, when comparing any method from the first group (methods (1)–(5)) with any of the methods of the second group (methods (6)–(7)), statistically significant differences were found. The conclusion can be made that methods (6) and (7) are on average more precise than methods (1)–(5), since the average accuracy $E W_{ID}$ obtained by methods (6)–(7) is significantly higher than the accuracy of normalization formulas (1)–(5). Figure 7 shows the mean plot from the ANOVA procedure output, depicting $E W_{ID}$ for the optimal $\alpha$ values corresponding to the 7 normalization methods.

Figure 6: Dependence of empirical average accuracy $\overline{W}_{ID}$ on the parameter $\alpha$ values for (7) method and 95% confidence intervals bounds.

Figure 7: Mean plot from ANOVA procedure comparing $E W_{ID}$ for optimal $\alpha$ values corresponding to 7 normalization methods.

Table 5: Hypotheses of equal average values $E W_{ID}$ testing results for $\alpha = 1$ and optimal $\alpha$ value for (1)–(7) normalization methods for the first experiment.

| Normalization method | Optimal α value | $H_0$ | p-value |
|---|---|---|---|
| (1) | 2 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 2)$ | 0.002 |
| (2) | 0.75 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 0.75)$ | 0.499 |
| (3) | 2 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 2)$ | 0.000 |
| (4) | 2 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 2)$ | 0.001 |
| (5) | 4 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 4)$ | 0.000 |
| (6) | 20 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 20)$ | 0.002 |
| (7) | 2 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 2)$ | 0.838 |

Table 6: $\overline{W}_{ID}$ values, corresponding optimal $\alpha$ values and t-test results for (1)–(7) normalization methods for the second experiment.

| Normalization method | Optimal α value | $\overline{W}_{ID}$ | $H_0$ | p-value |
|---|---|---|---|---|
| (1) | 2 | 0.6967 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 2)$ | 0.000 |
| (2) | 0.75 | 0.6966 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 0.75)$ | 0.594 |
| (3) | 2 | 0.6959 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 2)$ | 0.000 |
| (4) | 2 | 0.701 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 2)$ | 0.000 |
| (5) | 3 | 0.6879 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 3)$ | 0.002 |
| (6) | 10 | 0.7775 | $E W_{ID}(\alpha = 1) = E W_{ID}(\alpha = 10)$ | 0.003 |
| (7) | 1 | 0.7631 | - | - |

Table 7: Recommended normalization formulas (1)–(7) with optimal α values and k = 1.
(1) $\tilde{x}_{ij} = \left(\frac{x_{ij} - m_j}{M_j - m_j}\right)^{2}$
(2) $\tilde{x}_{ij} = e^{-\left(\frac{M_j - x_{ij}}{x_{ij} - m_j}\right)}$
(3) $\tilde{x}_{ij} = \frac{1}{\left(1 + \ln\left(\frac{M_j - m_j}{x_{ij} - m_j}\right)\right)^{2}}$
(4) $\tilde{x}_{ij} = \left(\frac{2}{\pi} \arctan\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)\right)^{2}$
(5) $\tilde{x}_{ij} = \left(\frac{1 - e^{-\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}}{1 + e^{-\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}}\right)^{4}$
(6) $\tilde{x}_{ij} = \left(\frac{\ln(x_{ij})}{\ln\left(\prod_{i=1}^{m} x_{ij}\right)}\right)^{20}$, $x_{ij} \ge 1$
(7) $\tilde{x}_{ij} = \frac{x_{ij}}{M_j}$

To evaluate the stability of the obtained results, the Monte Carlo experiments were repeated with other parameters of the decision-making matrices. The elements of matrix $X$ were generated as follows: $x_{1j} \sim N(67, 15)$, $x_{ij} \sim N(52, 15)$, $i = 2, 3, \dots, m$. As the differences $\mu_1 - \mu_i$ increased, it is natural to expect that the best alternative detection accuracy will be higher than in the previous experiment. The results of the second experiment are shown in Table 6. They essentially confirmed the results of the first experiment.

Due to the randomness of the experiments we observe some variability in the obtained results: the optimal $\alpha$ value for normalization method (5) changed from 4 to 3, for method (6) from 20 to 10, and for method (7) from 2 to 1. However, the t-test did not detect significant differences between $E W_{ID}$ at the mentioned $\alpha$ levels. The ANOVA results and conclusions are also the same. So, the conclusions remained unchanged: it is recommended to use optimal $\alpha$ values that differ from 1 for normalization methods (1), (3)–(6) and to apply $\alpha = 1$ for methods (2) and (7). Moreover, normalization formulas (6) and (7) lead to significantly higher average accuracy of best alternative detection. The results of the experiments revealed that it is reasonable to use the normalization formulas with the $\alpha$ values specified in Table 7.

4.2 Dependence of the best alternative detection accuracy on parameter k

Data normalization formulas (2)–(5) depend on both parameters $\alpha$ and $k$. So, it is of interest to find out whether $E W_{ID}$ changes significantly when $k$ changes. Calculations were carried out for a few values of the parameter $\alpha$ and, for each fixed $\alpha$, for a few values of $k$.
The elements of matrix $X$ were generated as follows: $x_{1j} \sim N(67, 15)$, $x_{ij} \sim N(57, 15)$, $i = 2, 3, \dots, m$. 100 Monte Carlo experiments were repeated for each combination of parameter values and the empirical mean values $\overline{W}_{ID}$ were calculated. Then the maximum value of $\overline{W}_{ID}$ was found, together with the corresponding optimal value of the parameter $k$. Calculation results for normalization formulas (4) and (5) are presented in Table 8. The maximum $\overline{W}_{ID}$ values and the corresponding parameter $\alpha$ and $k$ values are as follows: normalization method (2): $\overline{W}_{ID} = 0.511$, $\alpha = 0.75$, $k = 1$; (3): $\overline{W}_{ID} = 0.505$, $\alpha = 2$, $k = 1$; (4): $\overline{W}_{ID} = 0.516$, $\alpha = 2$, $k = 1$; (5): $\overline{W}_{ID} = 0.509$, $\alpha = 1$, $k = 0.5$. Figure 8 depicts the dependence of the average accuracy $\overline{W}_{ID}$ on $k$. For normalization formulas (2)–(4) the optimal value of the parameter $k$ is 1; only for method (5) does the optimal value differ from 1 ($k = 0.5$). Next, a t-test was applied to test the hypotheses $H_0$ of equal average values $E W_{ID}$ for $k = 1$ and the corresponding optimal parameter $k$ values. The difference between $E W_{ID}$ with parameters $\alpha = 1$, $k = 0.5$ and $E W_{ID}$ with parameters $\alpha = 1$, $k = 1$ is significant at significance level 0.05.

Figure 8: Dependence of empirical average accuracy $\overline{W}_{ID}$ on parameter $k$ values for (2)–(5) normalization methods (one panel per method; horizontal axis: $0.1 \le k \le 10$; curves for α = 0.5, 0.75, 1.0, 1.5, 2.0).

Table 8: WID calculated by varying α and k values for normalization methods (4) and (5).
Method (4)

| k \ α | 0.5 | 0.75 | 1 | 1.5 | 2 | 3 |
|---|---|---|---|---|---|---|
| 0.1 | 0.497 | 0.486 | 0.500 | 0.482 | 0.486 | 0.479 |
| 0.2 | 0.510 | 0.494 | 0.500 | 0.496 | 0.484 | 0.479 |
| 0.5 | 0.475 | 0.500 | 0.501 | 0.509 | 0.495 | 0.497 |
| 1 | 0.422 | 0.460 | 0.486 | 0.502 | 0.516 | 0.506 |
| 2 | 0.353 | 0.399 | 0.435 | 0.478 | 0.488 | 0.504 |
| 5 | 0.303 | 0.317 | 0.358 | 0.381 | 0.401 | 0.449 |
| 10 | 0.285 | 0.297 | 0.307 | 0.324 | 0.345 | 0.378 |

Method (5)

| k \ α | 0.5 | 0.75 | 1 | 1.5 | 2 | 3 |
|---|---|---|---|---|---|---|
| 0.1 | 0.487 | 0.482 | 0.479 | 0.485 | 0.484 | 0.485 |
| 0.2 | 0.497 | 0.483 | 0.479 | 0.486 | 0.484 | 0.483 |
| 0.5 | 0.483 | 0.502 | 0.509 | 0.504 | 0.497 | 0.498 |
| 1 | 0.423 | 0.466 | 0.482 | 0.489 | 0.494 | 0.494 |
| 2 | 0.339 | 0.385 | 0.411 | 0.442 | 0.454 | 0.473 |
| 5 | 0.296 | 0.302 | 0.318 | 0.339 | 0.354 | 0.381 |
| 10 | 0.287 | 0.278 | 0.290 | 0.299 | 0.296 | 0.323 |

So, it is recommended to use $\alpha = 1$, $k = 0.5$ for normalization method (5):

$$\tilde{x}_{ij} = \frac{1 - e^{-\frac{1}{2}\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}}{1 + e^{-\frac{1}{2}\left(\frac{x_{ij} - m_j}{M_j - x_{ij}}\right)}},$$

whereas for methods (2)–(4) it is reasonable to use $k = 1$ with the respective optimal $\alpha$ value (see the relevant formulas in Table 7).

5 Conclusions and future research

The purpose of this article is to analyze some parametric normalization formulas and to establish how various data normalization methods and parameter values affect the accuracy of MCDM problem solution. 7 data normalization methods were investigated (see formulas (1)–(7)). Some of them are generalizations of well-known normalization methods, which they reproduce for certain parameter values. Data matrices were randomly generated according to the Gaussian probability distribution. In all conducted Monte Carlo experiments the decision-making matrices were generated with the first alternative as the best one. Then the alternatives were ranked by the overall aggregated value of the SAW method with equal weights. The measure of identification accuracy is the percentage of correct identifications of the best alternative. The results of the experiments are as follows.

1. The identification accuracy obtained by methods (6)–(7) is significantly higher than the accuracy of normalization methods (1)–(5).
2. Variation of the parameter $k$ revealed that for normalization method (5) it is reasonable to use the combination of parameters $\alpha = 1$, $k = 0.5$, whereas for methods (2)–(4) it is reasonable to use $k = 1$ with the respective optimal $\alpha$ value (see the relevant formulas in Table 7).
3. The optimal $\alpha$ value has some "degrees of freedom": for example, it is possible to choose $\alpha = 10$ instead of $\alpha = 20$ for normalization formula (6), since there is no significant difference between the identification accuracy at these values.
4. This research was carried out for the special case when the elements of the decision matrix have a Gaussian distribution. If more information is available, the same methodology can be applied by approximating real data with appropriate probability distributions.
5. The parameters of the initial decision-making matrix were chosen so that the identification rate of the best alternative would be higher than 50%. Experiments revealed that the identification accuracy is higher with a larger number of criteria $n$ and lower with a larger number of alternatives $m$.

Bibliography

[1] Celen, A. (2014). Comparative Analysis of Normalization Procedures in TOPSIS Method: With an Application to Turkish Deposit Banking Market, Informatica, 25(2), 185–208, 2014.
[2] Chakraborty, S.; Yeh, C.-H. (2009). A Simulation Comparison of Normalization Procedures for TOPSIS, International Conference on Computers and Industrial Engineering (CIE39): Troyes, France, JUL 06-09, 1-3, 1815–1820, 2009.
[3] Hwang, C.L.; Yoon, K. (1981). Multiple attribute decision making - methods and applications, 2nd ed., Springer-Verlag, Berlin, 1981.
[4] Jahan, A.; Edwards, K.L. (2015). A state-of-the-art survey on the influence of normalization techniques in ranking: Improving the materials selection process in engineering design, Materials & Design, 65, 335–342, 2015.
[5] Juttler, H. (1966).
Untersuchungen zur Fragen der Operationsforschung und ihrer Anwendungsmoglichkeiten auf okonomische Problemstellungen unter besonderer Berucksichtigung der Spieltheorie [Investigations on the question of operational research and its application to economic problems with special consideration of the game theory]: Doctoral dissertation, Fakultat der Humboldt-Universitat, Berlin, 1966.
[6] Kaplinski, O.; Tamošaitienė, J. (2015). Analysis of Normalization Methods Influencing Results: A Review to Honour Professor Friedel Peldschus on the Occasion of his 75th Birthday, Procedia Engineering, 122, 2–10, 2015.
[7] Kosareva, N.; Krylovas, A.; Zavadskas, E.-K. Statistical analysis of MCDM data normalization methods using Monte Carlo approach. The case of ternary estimates matrix, Economic Computation and Economic Cybernetics Studies and Research. (In Press).
[8] Kou, G.; Ergu, D.; Lin, C.; Chen, Y. (2016). Pairwise comparison matrix in multiple criteria decision making, Technological and Economic Development of Economy, 22(5), 738–765, 2016.
[9] Li, G.; Kou, G.; Peng, Y. (2018). A Group Decision Making Model for Integrating Heterogeneous Information, IEEE Transactions on Systems, Man, and Cybernetics-System, 48(6), 982–992, 2018.
[10] MacCrimmon, K.R. (1968). Decision making among multiple-attribute alternatives: a survey and consolidated approach, No. RM-4823-ARPA, Santa Monica: RAND Corporation, 1968.
[11] Milani, A.S.; Shanian, A.; Madoliat, R.; Nemes, J.A. (2005). The effect of normalization norms in multiple attribute decision making models: a case study in gear material selection, Struct Multidiscip Optimization, 29(4), 312–318, 2005.
[12] Peldschus, F. (2018). Recent Findings from Numerical Analysis in Multi-Criteria Decision Making, Technological and Economic Development of Economy, 24(4), 1456–1478, 2018.
[13] Podviezko, A.; Podvezko, V. (2015). Influence of data transformation on multicriteria evaluation result, Procedia Engineering, 51(1), 151–157, 2015.
[14] Stanujkic, D.; Magdalinovic, N.; Jovanovic, R. (2013). A multi-attribute decision making model based on distance from decision maker's preferences, Informatica, 24(1), 103–118, 2013.
[15] Stanujkic, D.; Zavadskas, E.-K.; Karabasevic, D.; Turskis, Z.; Keršulienė, V. (2017). New group decision-making ARCAS approach based on the integration of the SWARA and the ARAS methods adapted for negotiations, Journal of Business Economics and Management, 18(4), 599–618, 2017.
[16] Stopp, F. (1975). Variantenvergleich durch Matrixspiele, Wissenschaftliche Zeitschrift der Hochschule für Bauwesen Leipzig, Heft 2, 1975.
[17] Van Delft, A.; Nijkamp, P. (ed.) (1977). Multi-criteria analysis and regional decision-making, Studies in Applied Regional Science, Springer, 1977.
[18] Weitendorf, D. (1976). Beitrag zur optimierung der räumlichen struktur eines gebäudes: Dissertation A., Weimar: Hochschule für Architektur und Bauwesen, 1976.
[19] Yazdani, M.; Jahan, A.; Zavadskas, E.-K. (2017). Analysis in Material Selection: Influence of Normalization Tools on COPRAS-G, Economic Computation and Economic Cybernetics Studies and Research, 51(1), 59–74, 2017.
[20] Zavadskas, E.-K.; Turskis, Z. (2008). A New Logarithmic Normalization Method in Games Theory, Informatica, 19(2), 303–314, 2008.