Decision Making: Applications in Management and Engineering
Vol. 3, Issue 1, 2020, pp. 1-21.
ISSN: 2560-6018, eISSN: 2620-0104
DOI: https://doi.org/10.31181/dmame2003001s

A ROUGH SET THEORY APPLICATION IN FORECASTING MODELS

Haresh Kumar Sharma 1, Kriti Kumari 2 and Samarjit Kar 1*

1 Department of Mathematics, National Institute of Technology, Durgapur, India
2 Department of Mathematics, Banasthali University, Jaipur, Rajasthan, India

* Corresponding author.
E-mail addresses: hareshshrm@gmail.com (H.K. Sharma), kriti.kri89@gmail.com (K. Kumari), samarjit.kar@maths.nitdgp.ac.in (S. Kar)

Received: 5 September 2019; Accepted: 11 November 2019; Available online: 15 November 2019.

Original scientific paper

Abstract. This paper examines the performance of different forecasting methods for tourism demand, which can be employed as statistical tools for time series forecasting. The Holt-Winters (HW), Seasonal Autoregressive Integrated Moving Average (SARIMA) and Grey model (GM (1, 1)) are three important statistical models in time series forecasting. The paper analyzes and compares the performance of these forecasting models using three rough set methods: Total Roughness (TR), Min-Min Roughness (MMR) and Maximum Dependency of Attributes (MDA). The study identifies the best time series forecasting model among the three models considered. The comparison shows that HW and SARIMA are superior to GM (1, 1) for forecasting seasonal time series under the TR, MMR and MDA criteria. In addition, the study shows that the GM (1, 1) grey model is unsuitable for seasonal time series data.

Key words: Forecasting, Mean Absolute Percent Error (MAPE), Rough Set, Total Roughness, Maximum Dependency Degree

1. Introduction

Planning for the future has emerged as a key component in the success of a large number of entrepreneurs around the world.
It makes it easier for countries to formulate economic policies and to act upon them efficiently. Accuracy is essential to reliable prediction, which helps governments formulate development policies for the economy, infrastructure and many other sectors of the global and domestic economy. It also improves the decision-making process. Time series modeling and forecasting play a key role in accurate prediction. Because current trends are incorporated into predictions of the future, it is necessary to use highly consistent and precise forecasting tools.

Over the last couple of decades, a wide variety of forecasting models has become available for the study of tourist arrivals and demand forecasting (Chu, 1998; Lim and McAleer, 2002; Wang, 2004). These studies suggest the Autoregressive Integrated Moving Average (ARIMA) model as a most suitable model for tourism demand forecasting. ARIMA was introduced by Box and Jenkins (1976) and is presently the most widely accepted model for forecasting univariate time series data. The ARIMA model combines the autoregressive (AR) and moving average (MA) models and produces an optimal univariate forecast. Moreover, the ARIMA model has gained worldwide confidence because of its ability to handle stationary and non-stationary series with seasonal and non-seasonal elements (Pankratz, 1983). SARIMA, in turn, is designed specifically for time series data with trends and seasonal patterns. The Holt-Winters method has also gained popularity for capturing trend and seasonality (Winters, 1960). The HW seasonal method consists of a forecast equation and smoothing equations for level, trend, and seasonality.
Later, grey system theory was developed by Deng (1982). In this theory, a system whose internal information, such as system characteristics, operation mechanisms and architecture, is completely clear is called a white system. An added advantage of the theory is that it can not only estimate an uncertain system but also sometimes produce ideal results. For example, Tseng et al. (2001) reported the application of the grey model to forecasting time series data for the Taiwan machinery industry and soft drinks. Nguyen et al. (2013) studied the forecasting of tourist arrivals in Vietnam using the GM (1, 1) grey model.

Over the years, a variety of forecasting criteria have been used to select the best time series models (Lim and McAleer, 2002; Wang, 2004). Modeling and forecasting involve a large number of criteria. For instance, Chu (1998) employed the MAPE and U-statistic criteria to compare the Holt-Winters and SARIMA models, while Chen et al. (2009) applied the MAPE criterion to evaluate the forecasting accuracy of the Holt-Winters, SARIMA, and Grey models. In earlier research, the accuracy of forecasting models has been evaluated using error-based criteria (Goh and Law, 2002; Law and Au, 1999; Law, 2000). One model may appear to be the best under one set of criteria, while another model may turn out to be the best under a different set. Moreover, these indicators have been heavily exploited, and only marginal improvements can be expected from their continued use.

This research proposes a new approach that applies rough set theory to select highly accurate models in time series forecasting. Rough set theory was introduced to deal with vagueness, imprecision, and uncertainty. The original rough set theory depends on the equivalence relation (indiscernibility relation). This approach takes the attributes into consideration in accordance with their normalized values (Goh and Law, 2003).
One advantage of rough set theory over statistical analysis lies in the process of attribute selection using rough set indicators (Hassanein and Elmelegy, 2013; Herawan, 2010).

The current research extends the work of Chen et al. (2009), who used the Holt-Winters (HW) model, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the Grey Model (GM (1, 1)). These models were used to forecast monthly inbound air travel arrivals to Taiwan and were distinguished on the basis of their respective performance. Mean Absolute Percent Error (MAPE) was used as the indicator of forecast accuracy. Based on the results, they concluded that the HW and SARIMA models are more reliable than GM (1, 1).

The objective of this paper is to identify the best forecasting models using the TR, MMR, and MDA rough set indicators. Based on a rough set information table, these techniques are used to calculate the roughness of the models, and the three models are then compared in accordance with their roughness. The study shows that GM (1, 1) is an inadequate model for forecasting with seasonality compared with the HW and SARIMA models.

The rest of the paper is organized as follows. Section 2 contains the literature review. Section 3 briefly introduces the basic concepts of rough set theory and some related properties. Section 4 presents an algorithmic approach for the evaluation of rough data using the MAPE indicator. Section 5 discusses the experimental design and the experimental results. Finally, the paper is concluded in Section 6.

2. Literature review

In recent years, rough set theory has been employed in various studies to select clustering attributes. For example, Mazlack et al.
(2000) proposed the Bi-Clustering (BC) technique, which depends on balanced/unbalanced bi-valued attributes, and the Total Roughness (TR) technique, which is based on the average accuracy of roughness (Pawlak, 1982; Pawlak & Skowron, 2007). The TR technique is useful for selecting the clustering attributes in a data set, where the maximum TR corresponds to the maximum accuracy for selecting clustering attributes. Three indicators of rough set theory, i.e. TR, MMR, and MDA, have been used successfully. For instance, Parmar et al. (2007) developed a new method called Min-Min Roughness (MMR) to develop the BC technique for information systems with many-valued attributes. In this technique, the attributes for approximation are calculated using the well-known lower and upper approximations of a subset of the universe in the information system. Herawan et al. (2010) developed a new technique known as Maximum Dependency of Attributes (MDA) to select clustering attributes. The MDA technique is based on the dependency of attributes in an information system, using rough set theory. The three techniques TR, MMR and MDA provide the same outcome in selecting attributes, which makes the rough set criteria very useful for attribute selection.

However, the previous literature does not link rough set theory with time series modeling for selecting the best forecasting models. In time series analysis and forecasting, selecting a highly accurate model is very important. Hence, this research proposes a rough set criterion, distinct from traditional statistical indicators, that provides strong evidence for selecting the most suitable time series models.

Rough set theory has been consistently employed in a variety of research areas for the extraction of decision rules (Law & Au, 1998, 2000; Goh & Law, 2003; Liou et al., 2016). Celotto et al. (2012) applied a rough set theory based forecasting model to data on tourist service demand.
Moreover, Li et al. (2011) predicted tourism in Tangshan city, China, using a rough set model. Golmohammadi and Ghareneh (2011) analyzed the importance of travel attributes using a rough set approach. Celotto et al. (2015) applied rough set theory to summarize tourist evaluations of a destination. Faustino et al. (2011) presented a rough set analysis of electrical charge demand in the United States and of the level of the Sapucaí River in Brazil. Liou (2016) used rough set theory to study airline service quality in Taiwan. Sharma et al. (2019) proposed a hybrid rough set based forecasting model and applied it to an air transportation passenger data set of tourism demand in Australia.

Rough set theory is used to handle the roughness of data and has been successfully applied to various real-life decision-making problems (Karavidić & Projović, 2018; Roy et al., 2018; Vasiljević et al., 2018). Moreover, the rough set concept can be applied to sets characterized by immaterial facts, for which statistical tools fail to provide fruitful outcomes (Pawlak, 1991). Pamučar et al. (2018) proposed an interval rough number enabled AHP-MABAC model for the evaluation of web pages. Sharma et al. (2018) applied a modified rough AHP-MABAC method for prioritizing Indian railway stations.

3. Rough Set Theory

Rough set theory was first introduced by Pawlak (1982). The rough set concept is a mathematical technique for tackling vagueness, imprecision, and uncertainty (Pawlak, 1982; Pawlak & Skowron, 2007). It is a vital tool for examining the degree of dependency among attributes and for minimizing the number of attributes within a data set. Its success is partly owed to the following properties: (1) Analysis is performed on the hidden facts in the data; (2) Supplementary information on the data, such as expert knowledge or thresholds, is not required; (3) The equivalence relation is a basic idea of classical rough set theory.
Attributes, in turn, may take either symbolic or real values. Pawlak established rough set theory on the assumption that some information is associated with every member of the universe of discourse. For example, if the objects are patients suffering from a certain disease, the symptoms of the disease form the information about the patients. Objects characterized by the same information are indiscernible (similar) in view of the available information about them. The indiscernibility relation generated in this way is the mathematical foundation of rough set theory. The basic concept of rough set theory is the induction of approximations: its main aim is to approximate a set by a pair of crisp sets, called the lower and upper approximations of the set.

3.1. Indiscernibility Relation

Let $U$ be a non-empty finite set of objects, known as the universe, and let $A$ be a finite set of attributes. The pair $S = (U, A)$ is known as an information system. Every non-empty subset $B$ of $A$ is associated with an equivalence relation $IND_S(B)$,

$$IND_S(B) = \{ (y_i, y_j) \in U \times U \mid \forall b \in B,\ b(y_i) = b(y_j) \} \tag{1}$$

where $b(y_i)$ represents the value of attribute $b$ for the element $y_i$. $IND_S(B)$ is called the indiscernibility relation on $U$. The notation $[y_i]_{IND_S(B)}$ denotes the equivalence class of $y_i$ under the indiscernibility relation; it is also called the elementary set with respect to $B$.

3.2. Lower and Upper Approximation

The lower and upper approximations of a set (Pawlak, 1982; Pawlak, 1991) are defined as follows. For an information system $S = (U, A)$, a set of attributes $B \subseteq A$ and a set $Y \subseteq U$, the lower and upper approximations of $Y$ are, respectively,

$$\underline{Y_B} = \{ y_i \in U \mid [y_i]_{IND_S(B)} \subseteq Y \} \tag{2}$$

$$\overline{Y_B} = \{ y_i \in U \mid [y_i]_{IND_S(B)} \cap Y \neq \emptyset \} \tag{3}$$

The lower approximation contains all objects that certainly belong to $Y$, and the upper approximation contains all objects that possibly belong to $Y$. The boundary region is the set of objects that are possibly, but not certainly, members of $Y$:

$$BND_B(Y) = \overline{Y_B} - \underline{Y_B} \tag{4}$$

The boundary region of an exact (crisp) set is empty, since the lower and upper approximations of an exact set coincide. If the boundary region of a set is non-empty, i.e. $BND_B(Y) \neq \emptyset$, then the set $Y$ is referred to as rough (vague).

3.3. Roughness (R)

Inexactness of a category (set) is the reason for the existence of a boundary region. As the boundary region of a category grows, the accuracy of the category decreases. To model this kind of imprecision, the concept of accuracy of approximation (Pawlak, 1991) is used. The accuracy measure is

$$\alpha_B(Y) = \frac{\mathrm{card}\ \underline{Y_B}}{\mathrm{card}\ \overline{Y_B}}$$

The accuracy is intended to capture the degree of completeness of our knowledge about the category (set). Obviously, $0 \leq \alpha_B(Y) \leq 1$. If $\alpha_B(Y) = 1$, $Y$ is exact with respect to $B$; if $\alpha_B(Y) < 1$, $Y$ is rough with respect to $B$.

Assume that an attribute $a_i \in A$ has $m$ distinct values, say $\alpha_k$, $k = 1, 2, \ldots, m$. Let $Y(a_i = \alpha_k)$, $k = 1, 2, \ldots, m$, be the subset of objects having the value $\alpha_k$ of attribute $a_i$.
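The definitions introduced so far (indiscernibility classes, lower and upper approximations, boundary region and accuracy) can be illustrated with a minimal Python sketch. The toy information table, the attribute names `a1` and `a2`, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper's data. Returning an accuracy of 1 for an empty upper approximation is also an assumed convention.

```python
from collections import defaultdict

# Hypothetical toy information system S = (U, A):
# keys are objects of U, inner keys are attributes of A.
TABLE = {
    "y1": {"a1": "low", "a2": "yes"},
    "y2": {"a1": "low", "a2": "yes"},
    "y3": {"a1": "mid", "a2": "yes"},
    "y4": {"a1": "mid", "a2": "no"},
    "y5": {"a1": "high", "a2": "no"},
}


def equivalence_classes(table, attrs):
    """Partition U into the classes of the indiscernibility relation on attrs (Eq. 1)."""
    classes = defaultdict(set)
    for obj, row in table.items():
        # Objects with identical values on every attribute in attrs are indiscernible.
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())


def lower_upper(table, attrs, target):
    """Return the lower and upper approximations of target (Eqs. 2 and 3)."""
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attrs):
        if cls <= target:   # class lies entirely inside the target set
            lower |= cls
        if cls & target:    # class overlaps the target set
            upper |= cls
    return lower, upper


def accuracy(table, attrs, target):
    """Accuracy of approximation: card(lower) / card(upper)."""
    lower, upper = lower_upper(table, attrs, target)
    # Assumed convention: an empty upper approximation is treated as exact.
    return len(lower) / len(upper) if upper else 1.0


# Y is the set of objects with a2 = "yes", approximated using B = {a1}.
Y = {obj for obj, row in TABLE.items() if row["a2"] == "yes"}
lower, upper = lower_upper(TABLE, ["a1"], Y)
boundary = upper - lower  # boundary region (Eq. 4)
print(sorted(lower), sorted(upper), sorted(boundary), accuracy(TABLE, ["a1"], Y))
```

In this toy table the class of objects with `a1 = "mid"` contains one member of Y and one non-member, so it falls in the boundary region and Y is rough with respect to `a1`.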
In the TR technique (Mazlack, 2000), the roughness of the set $Y(a_i = \alpha_k)$, $k = 1, 2, \ldots, m$, with respect to $a_j$, where $i \neq j$, is denoted by $R_{a_j}(Y \mid a_i = \alpha_k)$ and is defined by

$$R_{a_j}(Y \mid a_i = \alpha_k) = \frac{|\underline{Y_{a_j}}(a_i = \alpha_k)|}{|\overline{Y_{a_j}}(a_i = \alpha_k)|}, \quad k = 1, 2, \ldots, m \tag{5}$$

3.3.1. Mean roughness (MR)

The mean roughness of an attribute $a_i \in A$ with respect to another attribute $a_j \in A$, where $i \neq j$, is given by

$$Rough_{a_j}(a_i) = \frac{\sum_{k=1}^{|V(a_i)|} R_{a_j}(Y \mid a_i = \alpha_k)}{|V(a_i)|} \tag{6}$$

where $V(a_i)$ is the set of all values of attribute $a_i \in A$.

3.3.2. Total Roughness (TR)

The total roughness of the attribute $a_i \in A$ with respect to the attributes $a_j \in A$, where $i \neq j$, is denoted by $TR(a_i)$ and is defined by

$$TR(a_i) = \frac{\sum_{j=1,\, j \neq i}^{|A|} Rough_{a_j}(a_i)}{|A| - 1} \tag{7}$$

The attribute with the maximum value of TR is the best choice of clustering attribute.

3.3.3. Minimum-Minimum Roughness (MMR)

Starting from the TR technique, the MMR mean roughness of attribute $a_i$ with respect to attribute $a_j$, where $i \neq j$, is defined by

$$MMRough_{a_j}(a_i) = 1 - \frac{\sum_{k=1}^{|V(a_i)|} R_{a_j}(Y \mid a_i = \alpha_k)}{|V(a_i)|} = 1 - Rough_{a_j}(a_i) \tag{8}$$

3.4. Maximum Dependency Attribute (MDA) (Herawan et al., 2010)

Suppose $S = (U, A)$ is an information system and let $a_i$ and $a_j$ be any subsets of $A$. The dependency of attribute $a_i$ on $a_j$ in a degree $k$ (0