International Journal of Interactive Mobile Technologies (iJIM) – eISSN: 1865-7923 – Vol 16 No 09 (2022)
Paper—Information Systems for Cultural Tourism Management Using Text Analytics and Data…

Information Systems for Cultural Tourism Management Using Text Analytics and Data Mining Techniques

https://doi.org/10.3991/ijim.v16i09.30439

Thanet Yuensuk1, Potsirin Limpinan1, Wongpanya Sararat Nuankaew1, Pratya Nuankaew2
1Faculty of Information Technology, Rajabhat Maha Sarakham University, Maha Sarakham, Thailand
2School of Information and Communication Technology, University of Phayao, Phayao, Thailand
pratya.nu@up.ac.th

Abstract—Using technology to deliver content that matches specific human interests is gaining attention. As a result, each individual can be presented with what he or she wants. Therefore, this research aims to develop a cultural tourism recommendation application using machine learning technology. It has three objectives: to develop a predictive model for cultural tourism management using text mining techniques, to evaluate the effectiveness of the cultural tourist attraction management model, and to assess the satisfaction of using the application for cultural tourism management. The research data were collected from the Facebook conversations of 385 tourists (3,257 transactions) who had traveled to a famous tourist destination in Maha Sarakham Province. The prediction models were developed with three classification techniques: Naïve Bayes, Neural Network, and K-Nearest Neighbors. Model performance was evaluated with a confusion matrix and cross-validation. In addition, a questionnaire was used to assess satisfaction with the application. The results showed that the model with the highest accuracy was built with the Naïve Bayes technique, with an accuracy of 91.65%. The level of satisfaction with the application was high, with a mean satisfaction of 4.17 (S.D. = 0.66).
It was therefore concluded that the application was accepted by its target users and can be expanded in further, more widespread research.

Keywords—cultural tourism management, opinion data mining, text mining, tourist attraction, tourist experience

1 Introduction

Chatbot technology is an artificial intelligence technology that simulates conversation by providing information or answers to users' questions, whether they arrive as text or voice messages [1]–[3]. A chatbot is powered by Artificial Intelligence (AI): it applies the principles of machine learning to analyze user interactions [4]–[7] and to select the most appropriate answer to the user's question. It combines natural language processing technology to translate computer language into a language that users can easily understand [8]. Moreover, continuous improvements in big data, in analysis with machine learning tools, and in decision-making capabilities have resulted in the adoption of chatbot technology. Therefore, chatbot applications are becoming more and more popular [9], [10]. They are used to manage daily routines, provide necessary information through automated telephone systems, provide business information about products or services, and recommend the initial purchase of goods or services. However, the most popular chatbot technology is the rule-based bot. Its workflow is built on Q&A rules: if a user asks a question, the bot provides the information that corresponds to the question, and troubleshooting follows the rules set by the developers.

In the tourism dimension, the researchers found a problem: the attractions that travelers wish to visit are often inconsistent with their experiences and backgrounds.
As a result, traveling to various places can be unhappy and unsettling. For this reason, the primary goal of the chatbot technology implementation in this research is to provide automated communication services that users can access and interact with through existing platforms. Please note that today's chatbot technology is not a replacement for all human conversations, especially when the conversation is very complicated [1], [2], [10]. This point is also considered a limitation of chatbot technology. However, if the technology is continuously developed, it can be applied more efficiently.

Therefore, this research has the important goal of developing a cultural tourism recommendation application using machine learning technology. There are three objectives. The first objective is to develop a predictive model and the recommendation application for cultural tourism management using text mining techniques. The second objective is to evaluate the effectiveness of the cultural tourist attraction recommendation model. The last objective is to assess the satisfaction of using the recommendation application for cultural tourism management.

The research methodology is based on the data mining development process known as CRISP-DM: CRoss-Industry Standard Process for Data Mining [11]–[13]. There are six steps: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The research data were compiled from 385 tourists who have traveled to a famous tourist attraction in Maha Sarakham Province. The calculation and selection of the research sample are described in the section on the determination of the population and the sample. The data collected from tourists is a questionnaire on attitudes towards tourism in a particular location. In addition, the research framework and methodological concepts are presented in Figure 1.

Fig. 1. The research framework and methodological concepts

Figure 1 presents the research framework and methodological concepts. It consists of three key phases. The first phase is the process of text analysis, which analyzes the travelers' experiences and backgrounds. The second phase is the process of model and application development. The third phase is the process of model evaluation and application assessment, which aims to assess model performance and application satisfaction.

The presentation of the research content is divided into six sections. The first section presents the introduction and the research background. The second section presents related research and theories. The third section describes the research process, which is divided into six steps based on the principles of data mining development. The fourth section summarizes the research findings. The fifth section discusses the research findings and the outcomes obtained from the research. The last section is the conclusion, which provides a summary of the gist and the recommendations derived from this research. Finally, the researchers hope that this research will be of public interest, that it will be accepted, and that its results will be extended in the future.

2 Literature reviews

2.1 Tourists' attitudes towards technology

Understanding tourists' attitudes towards technology is an important area of tourism research because it relates to the interaction between the ability to use technology and the tourist experience of a dream destination, which influences sustainable tourism development [14]. Technology is therefore being used as a tool to present alternatives to tourists [1], [2], [15]–[17].
Interesting examples are the use of technology to offer themes and attractions that match an individual's personality [15], the use of big social data to analyze tourist behavior [18], a study of factors and influences on tourists [14], [19], and so on.

In addition, the use of technology in tourism is trending around the world [1], [15], [20], [21]. It underscores that technology is part of the travel apparatus, and technology adoption has three emerging dimensions. The first dimension is building a smart tourism city by integrating technology networks. It consists of the use of technology in tourist activities such as online vending, smart security service networking, improved transport services, linguistic services, and smart city robots that guide visitors [5], [6], [22], [23]. The second dimension builds on the first. It is the use of massive data to analyze how big data technologies can improve tourism [4], [24]. Smart city tourism is affected by, and benefits from, the various sporting and entertainment activities that take place in the city at different times. Thousands of people accept and are interested in participating in these events, although the types of events are diverse. For this reason, smart city tourism technology needs to be consistent and transferable, because event managers have to manage crowds safely and maximize event revenue. The last dimension is creating awareness to create an experience for tourists. In this regard, augmented reality (AR) technology can be used [25]–[27]. It is used to provide information or to offer directions.
For example, when tourists use a maps application, they can search for landmarks, and when they point their camera at a landmark while viewing the screen, they can choose to view information or historical images overlaid on the current scene.

From the literature review, it was found that understanding tourists' attitudes is an important question in the analysis for designing technology that is suitable for tourists. It is therefore sensible to study the context and attitudes of tourists, and the study in this research is carried out as a scientific process.

2.2 Big data and innovation in tourism

Big data is an innovation of technology development based on information and communication technology. It is generated by referencing multiple datasets from different sources. It is supported by advanced data storage, analysis, and processing technologies. Beyond that, big data thinking is an opportunity based on the exponential growth of data volumes and the generation of unstructured data. With very large amounts of data, it is possible to study and discover patterns in the data. A number of innovations that use data to support the tourism environment have emerged [18], [22], [28]. They feature text-mining analysis for predictions in a variety of ways, including topic extraction, text classification, sentiment analysis, text clustering, and so on [28]. Text mining is also used in decision-making models and application development, such as tourist recommendation systems, tourist satisfaction models, and AI technology-based services [9], [22], [27], [28].

For Thailand, a major part of national income comes from tourism, and Thailand defines tourism as an important asset of the country. Research in Thailand is bringing innovation and big data together to drive tourism research [29]–[31]. Consequently, these have inspired the researchers to study and develop this research.
Furthermore, the perspective drawn from related research is that the researchers want to apply accepted machine learning technology to study the behavior of tourists in Thailand. The issues and procedures for conducting the research are presented in the research methodology section.

3 Research methodology

The research methodology consists of five key elements: determination of the population and the sample, development of research instruments, data collection, model generation and efficiency, and using statistics to analyze application satisfaction. The researchers took the following actions.

3.1 Determination of the population and the sample

The population in this research was tourists who had experience of traveling to cultural attractions in Maha Sarakham Province between 2021 and 2022. Please note that tourism data had not been collected between 2021 and 2022. Therefore, the researchers treated the population size as unknown when calculating the sample [32], [33].

A common goal of social science research is to collect data representative of a population. A subset of the population is selected as a sample to represent the entire population in a study. Generally, there are several methods for determining the sample size of a study, depending on whether the population size is known or unknown. This study employed Cochran's technique in determining the sample size.
The formula used to calculate the sample size for continuous data and unknown population size is

$$n = \frac{z^{2}\,\sigma^{2}}{e^{2}}$$

where n is the sample size; z is the z value at the chosen reliability level (z = 1.96 at a reliability level of 95%, or significance level 0.05; z = 2.58 at a reliability level of 99%, or significance level 0.01); σ is the standard deviation of the population; and e is the acceptable sampling error (≈ ±5%). If σ is unknown, e is defined as a percentage of σ, such as 8% of σ (e = 0.08σ) or 10% of σ (e = 0.10σ). Given that the largest acceptable error (e) is 10% of the population standard deviation and that the significance level is 0.05 for an unknown population, the calculation gives n = (1.96)^2 σ^2 / (0.10σ)^2 = 384.16, which rounds up to the 385 samples used in the research.

3.2 Development of research instruments

The research instruments consisted of three tools. The first tool is a chatbot model for cultural tourism management in Maha Sarakham Province. The second tool is a chatbot application for cultural tourism management in Maha Sarakham Province. The first two tools are detailed in the model generation and efficiency section. The last tool is the satisfaction questionnaire. It was used to assess satisfaction with the chatbot application among a sample of 30 students, selected by simple random sampling from the Faculty of Information Technology, Rajabhat Maha Sarakham University. It consists of eight assessment issues as shown in Table 5.

3.3 Data collection

Data collection is classified into two parts. The first part is to collect data for modeling purposes. The data collection at this stage gathers the tourists' conversations through Facebook Messenger. It contains data from 385 Facebook accounts with 3,257 conversations.
This data was gathered in Phase 1: Text Processing to Analyze Traveler Experiences and Backgrounds and was used in Phase 2: Modeling and Application Development, as shown in Figure 1. The second part is to collect data for application assessment purposes. It collects information from 30 students in the Faculty of Information Technology, Rajabhat Maha Sarakham University.

3.4 Model generation and efficiency

Model development and efficiency testing were carried out according to the six-step data mining process known as CRISP-DM: CRoss-Industry Standard Process for Data Mining. The six steps are grouped into two major phases. The first phase is text processing to analyze the travelers' experiences and backgrounds, as shown in Figure 1. It consists of three stages of CRISP-DM: Business Understanding, Data Understanding, and Data Preparation. The second phase is modeling and application development. It consists of three stages of CRISP-DM: Modeling, Evaluation, and Deployment. The details of each step are as follows.

Business understanding. When tourists' inquiries about the cultural attractions in Maha Sarakham Province are answered by people in the traditional way, the answers may be delayed or inconsistent. Tourists are then affected by boredom and misinformation. Moreover, the vast amount of tourism information is scattered among various departments, which complicates its collection and coordination. These problems made the researchers realize the importance of what modern technology can solve. The researchers therefore developed an application to guide the management of cultural tourism, using text mining analytics and machine learning technology to develop the application.

Data understanding. Understanding data means understanding the origin of the data and its form.
Data today consists mainly of unstructured data, which manifests itself in the form of text and images [28], [34]. The major problem is that most tourist inquiries are not written as well-formed sentences. This creates misunderstandings, confuses the reader, and thus leads to the wrong information being fed back. Some of the unstructured data the researchers found appeared in online conversations. The researchers then analyzed these data, collected from 385 Facebook user accounts, using text mining analysis techniques. The Facebook Messenger question data consists of eight question sets: (1) beverage shops and restaurants, (2) costs and fees for visiting, (3) facilities and data centers, (4) identity of community, (5) local wisdom, (6) location, (7) public transport, and (8) route. This data is manipulated in the data mining analysis phase, which is described in the data preparation section.

Data preparation. Data preparation is part of the first phase shown in Figure 1. It consists of five steps. The first step is to collect the questions and analyze the messages. This step explores how to classify questions using text mining techniques. It uses text mining processes and techniques to solve text classification problems. The goal is to segment and classify the text to identify key features for modeling. The data were then analyzed and modeled with the classification technique that gave the best classification performance. The second step is the word segmentation stage. This research uses Python and Google Colab with an open-source program that segments Thai words. The result is then used to index keywords and to exclude unimportant parts. The third step is the elimination of stop words. Word types that are eliminated include prepositions, conjunctions, pronouns, adverbs, and interjections. The fourth step is to select and manage the question category feature.
The fifth step is to create a keyword index. This research indexes the keywords with TF weighting: each text is represented as a word vector, and the weights of the vector are calculated for the index.

After completing the five-step process, the data were ready for model development. The methods and tools used to develop the model are discussed in the following sections.

Modeling. The text analysis modeling of this research was based on three machine learning techniques: the Naïve Bayes technique, the Neural Network technique, and the K-Nearest Neighbors (K-NN) technique [35]–[37]. These three techniques are popular for classification in supervised learning. The working principles and benefits of these three techniques are as follows.

The Naïve Bayes classifier is a famous classification model based on Bayes' rule. Its advantage is that it is easy to understand and easy to evaluate. It also works quickly. If the conditional independence assumption holds, it can produce excellent results.

A neural network is a set of algorithms that attempt to recognize hidden relationships in a dataset through a process that mimics the way the human brain works. Neural networks can adapt to changing information. Therefore, the network produces the best results without redesigning the output criteria. It consists of three main parts: an input layer, a processing layer, and an output layer. The input layer may be weighted according to various criteria. The processing layer is hidden from view and is responsible for the connections between nodes, which act like neurons. Finally, the output layer presents the result.

The K-Nearest Neighbors (K-NN) algorithm is a machine learning algorithm that is easy to use and whose results are easy to understand.
It can be used to effectively solve classification and regression problems. The K-NN algorithm works on the principle of similarity: similar things are close to each other. K-NN therefore calculates the distance between an object and the members of each class.

After the data and the tools have been prepared, the next step is to test the developed models with the methods presented in the next section.

Evaluation. The model performance testing used for this research consisted of two tools. The first tool is a method of dividing the data to test the model, known as the "Cross-validation method". Its principle is to divide the data into two parts. The first part, called the "Training Dataset", is used to create the model. The rest of the data, called the "Testing Dataset", is used to test the model. The data is divided into k intervals, called "k-Fold".

The second tool to analyze model performance is known as "Confusion Matrix Performance". There are three criteria for determining the model's performance. The first criterion is accuracy. It represents the overall correctness of the model, expressed as a percentage. The second criterion is precision. It shows, for each answer (class), the share of the model's predictions of that class that are correct, expressed as a percentage. The third criterion is recall. It shows, for each answer (class), the share of the actual items of that class that the model identifies, expressed as a percentage. All criteria calculations are shown in Figure 2.

Fig. 2. The confusion matrix performance calculations

Deployment. The process of deploying a chatbot to recommend cultural attractions in Maha Sarakham Province is divided into two major steps. The first step is to prepare the chatbot server.
This research used Chatfuel's server and platform features to build a chatbot using natural language processing (NLP) based on machine learning techniques. It has a feature that helps in understanding the intent and entity of users in conversation. The second step is to connect the Facebook Messenger APIs. This process requires a tool by Facebook called "Facebook for Developers", which developers use to create applications that interact with Facebook. The communication process and the chatbot working process are shown in Figure 3.

Fig. 3. The communication process and the chatbot working process

Figure 3 shows the process of using the chatbot application. When tourists enter questions into the Facebook chatbot application, the questions are analyzed by the model and feedback is returned to the tourists. After the development of the chatbot application was completed, the researchers tested it with a target audience of 30 samples.

3.5 Using statistics to analyze application satisfaction

Basic statistics were used to analyze user satisfaction with the chatbot application that recommends cultural attractions in Maha Sarakham Province. The tools used are frequency, mean, standard deviation, and percentage calculations.

The satisfaction assessment criteria consisted of 5 levels. Level 1 has a rating of 1, which means extremely dissatisfied. Level 2 has a rating of 2, which means dissatisfied. Level 3 has a rating of 3, which means moderately satisfied. Level 4 has a rating of 4, which means satisfied. Level 5 has a rating of 5, which means extremely satisfied.

Interpretation is divided into 5 levels as follows: The mean is between 1.00 and 1.80. It can be interpreted as strongly disagree. The mean is between 1.81 and 2.60.
It can be interpreted as disagree. The mean is between 2.61 and 3.40. It can be interpreted as neither agree nor disagree. The mean is between 3.41 and 4.20. It can be interpreted as agree. The mean is between 4.21 and 5.00. It can be interpreted as strongly agree.

4 Research results

The study of the chatbot application for cultural tourism management was carried out, and the data were analyzed to determine the prototype model and the application satisfaction. The issues are summarized as follows.

4.1 Predictive model for cultural tourism management

The research compiled a collection of general tourist inquiries. It consists of questions frequently asked by tourists in the form of conversational sentences, not in a database format. The collection is taken through the data preparation process, and once the formatted data is obtained, the structured data is ready for modeling. In this process, the Thai Word Segmentation program is used to segment the text so that irrelevant words can be removed. The prepared data enters the model development process, which consists of three techniques: Naïve Bayes, Neural Network, and K-Nearest Neighbors. All the models were then evaluated with the cross-validation method and the confusion matrix to determine their performance. The performance test results for each model are as follows.

Modeling with Naïve Bayes technique. The classification results of the Naïve Bayes model, with precision, recall, and F1-score values, are shown in Table 1.

Table 1.
Performance analysis results with Naïve Bayes technique

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| (1) Beverage shops and restaurants | 94.90%* | 85.05% | 93.47%* |
| (2) Costs and fees for visiting | 89.36% | 87.62% | 90.81% |
| (3) Facilities and data centers | 83.60% | 92.08% | 84.72% |
| (4) Identity of community | 92.31% | 96.06%* | 93.39% |
| (5) Local wisdom | 91.92% | 94.49% | 88.35% |
| (6) Location | 89.69% | 85.87% | 90.16% |
| (7) Public transport | 89.32% | 92.00% | 88.46% |
| (8) Route | 91.01% | 92.05% | 91.53% |

Table 1 shows the results of the Naïve Bayes classification of the question sets. The highest precision was 94.90%, in the beverage shops and restaurants class. The highest recall was 96.06%, in the identity of community class. Lastly, the highest F1-score was 93.47%, in the beverage shops and restaurants class. The results of these performance tests were further compared with the other models.

Modeling with Neural Network technique. The classification results of the Neural Network model, with precision, recall, and F1-score values, are shown in Table 2.

Table 2. Performance analysis results with Neural Network technique

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| (1) Beverage shops and restaurants | 96.91%* | 94.57%* | 91.26% |
| (2) Costs and fees for visiting | 92.22% | 88.12% | 88.77% |
| (3) Facilities and data centers | 87.82% | 86.24% | 90.73% |
| (4) Identity of community | 89.55% | 94.26% | 89.55% |
| (5) Local wisdom | 92.55% | 89.55% | 93.55%* |
| (6) Location | 83.33% | 93.84% | 88.46% |
| (7) Public transport | 91.75% | 85.57% | 89.90% |
| (8) Route | 95.18% | 89.77% | 92.40% |

Table 2 shows the results of the Neural Network classification of the question sets. The highest precision was 96.91%, in the beverage shops and restaurants class.
The highest recall was 94.57%, also in the beverage shops and restaurants class. Lastly, the highest F1-score was 93.55%, in the local wisdom class. The results of these performance tests were further compared with the other models.

Modeling with K-Nearest Neighbors technique. The classification results of the K-Nearest Neighbors model, with precision, recall, and F1-score values, are shown in Table 3.

Table 3. Performance analysis results with K-Nearest Neighbors technique

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| (1) Beverage shops and restaurants | 92.55% | 89.55% | 92.55%* |
| (2) Costs and fees for visiting | 88.55% | 89.26% | 87.55% |
| (3) Facilities and data centers | 93.91% | 91.57% | 91.26% |
| (4) Identity of community | 87.82% | 86.24% | 90.73% |
| (5) Local wisdom | 92.22% | 88.12% | 88.77% |
| (6) Location | 91.75% | 85.57% | 89.90% |
| (7) Public transport | 95.18%* | 89.77% | 91.40% |
| (8) Route | 89.55% | 94.26%* | 89.55% |

Table 3 shows the results of the K-Nearest Neighbors classification of the question sets. The highest precision was 95.18%, in the public transport class. The highest recall was 94.26%, in the route class. Lastly, the highest F1-score was 92.55%, in the beverage shops and restaurants class. The results of these performance tests were further compared with the other models.

The results of the efficiency analysis in Table 1 to Table 3 show that the performance of each class differs from model to model. Table 4 reports the performance of the models in terms of accuracy and model development time.

Table 4.
Classification of model performance

| Classifiers | Accuracy | Time (s) |
|---|---|---|
| Naïve Bayes | 91.65% | 10 |
| Neural Network | 86.67% | 24 |
| K-Nearest Neighbors | 91.48% | 18 |

Table 4 compares the classification of the question sets by the three techniques: Naïve Bayes, Neural Network, and K-Nearest Neighbors. Naïve Bayes had the highest accuracy, 91.65%, and took the least time to develop the model, 10 seconds. Once an efficient and reasonable model was obtained, the next part was the implementation of the Facebook Messenger chatbot, an example of which is shown in Figure 4.

Fig. 4. Deploying the model to Facebook Messenger chatbot

Figure 4 shows an example of the model deployment in the Facebook Messenger chatbot program. The application is based on a model that was developed with machine learning and tested for performance, resulting in a chatbot system on Facebook Messenger that can answer questions automatically. After the application was obtained, it was tested with the defined target audience.

4.2 Chatbot satisfaction assessment results

The results of satisfaction with the chatbot application for cultural tourism management in Maha Sarakham Province for each stage are presented in Table 5 as mean, standard deviation, and interpretation. The sample group was 30 students of the Faculty of Information Technology, Rajabhat Maha Sarakham University.

Table 5. Chatbot satisfaction assessment results

| Stage | Mean | S.D. | Interpretation |
|---|---|---|---|
| 1. Satisfaction toward providing responses to questions via chatbots | 4.30 | 0.67 | Strongly agree |
| 2. Satisfaction toward providing appropriate and up-to-date information | 4.13 | 0.68 | Agree |
| 3. Satisfaction toward using appropriate language via chatbots | 4.02 | 0.69 | Agree |
| 4. Satisfaction toward the correctness of answering questions | 4.25 | 0.68 | Strongly agree |
| 5. Satisfaction toward the speed in responding to questions | 4.24 | 0.65 | Strongly agree |
| 6. Satisfaction toward a diverse set of responses | 4.23 | 0.62 | Strongly agree |
| 7. Satisfaction toward the friendliness of the system | 4.35 | 0.63 | Strongly agree |
| 8. Satisfaction toward the functionality and complexity of the system | 4.32 | 0.60 | Strongly agree |
| Overall Satisfaction | 4.17 | 0.66 | Agree |

Table 5 presents the results of the user satisfaction assessment of 30 students. Overall, users had a high level of satisfaction with the chatbot application for cultural tourism management, with an average satisfaction rating of 4.17. The issue with the highest satisfaction was the friendliness of the system, with an average satisfaction rating of 4.35. The other issues were accepted as well. It can be concluded that the developed application is accepted by the defined target audience.

5 Research discussions

The results of this research on the development of an application for cultural tourism management in Maha Sarakham Province, in which the model development steps followed the six-step CRISP-DM data mining process and satisfaction with the application was assessed, can be summarized in two main points.

5.1 Success in the development of predictive models

The first success is that the prediction model for cultural tourism management of Maha Sarakham Province could be developed within the specified scope. In the model development, three techniques were compared to select the most efficient model.
The three techniques are Naïve Bayes, Neural Network, and K-Nearest Neighbors. The comparison results of the three models are as follows.

For the classification of the question sets using the Naïve Bayes technique, the class with the highest precision was the beverage shops and restaurants class, with a precision of 94.90%. In addition, the recall performance was highest, at 96.06%, in the identity of community class. Lastly, the F1-score performance was highest, at 93.47%, in the beverage shops and restaurants class, as detailed in Table 1.

For the classification of the question sets using the Neural Network technique, the class with the highest precision was the beverage shops and restaurants class, with a precision of 96.91%. In addition, the recall performance was highest, at 94.57%, in the beverage shops and restaurants class. Lastly, the F1-score performance was highest, at 93.55%, in the local wisdom class, as detailed in Table 2.

Lastly, for the classification of the question sets using the K-Nearest Neighbors technique, the class with the highest precision was the public transport class, with a precision of 95.18%. In addition, the recall performance was highest, at 94.26%, in the route class. Lastly, the F1-score performance was highest, at 92.55%, in the beverage shops and restaurants class, as detailed in Table 3.

Moreover, the overall accuracy of the developed models is presented in Table 4. The model created using the Naïve Bayes technique had the highest accuracy and the shortest development time, and it was therefore selected for use in the Facebook Messenger chatbot. An example of the model deployment in the Facebook Messenger chatbot application is shown in Figure 4.
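To illustrate how per-class figures such as those in Tables 1 to 3, and an overall accuracy such as that in Table 4, can be derived from a confusion matrix, the following minimal Python sketch may be helpful. It is not the code used in this research: the three-class matrix, its counts, and the names `per_class_metrics`, `accuracy`, `TOY_LABELS`, and `TOY_MATRIX` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def per_class_metrics(labels: List[str],
                      matrix: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, and F1-score for every class.

    matrix[i][j] is the number of questions whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        true_positive = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column sum
        actually_k = sum(matrix[k])                     # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(matrix: List[List[int]]) -> float:
    """Share of all questions that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical toy data (NOT the confusion matrix from this research).
TOY_LABELS = ["Location", "Route", "Public transport"]
TOY_MATRIX = [
    [9, 1, 0],  # true class: Location
    [2, 6, 2],  # true class: Route
    [1, 1, 8],  # true class: Public transport
]

if __name__ == "__main__":
    for name, values in per_class_metrics(TOY_LABELS, TOY_MATRIX).items():
        print(f"{name}: precision={values['precision']:.2%}, "
              f"recall={values['recall']:.2%}, f1={values['f1']:.2%}")
    print(f"accuracy={accuracy(TOY_MATRIX):.2%}")
```

Under cross-validation, the same calculation would be applied to the confusion matrix accumulated over the held-out folds, and the model with the highest accuracy would then be chosen, which is the selection rule described above.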
5.2 Success in application development

The developed application was tested to assess the satisfaction of testers with the use of the Facebook Messenger chatbot application. The application testers were a selected sample of 30 students of the Faculty of Information Technology, Rajabhat Maha Sarakham University. A detailed summary of the eight assessments is presented in Table 5.

The assessment of the chatbot application for the management of cultural tourism through the Facebook Messenger platform found that the application testers had a high level of satisfaction. The overall satisfaction of the application testers was found to be an average of 3.98, which indicates that the testers accept the use of the developed application. The point that the testers accepted the most was the satisfaction toward the correctness of answering questions, with an average of 4.37 (S.D. equal to 0.45). The second most accepted point was the satisfaction toward providing appropriate and up-to-date information, with an average of 4.10 (S.D. equal to 0.36). In addition, the other issues were accepted as well. It can be concluded that the developed application is accepted in all dimensions.

6 Conclusion

Innovation is not just about creating something new; it should make current technology uphold the foundation of a national institution, which means a nation's community. Therefore, this research aims to develop innovations for communities and developing countries. The goal of this research is to bring university knowledge to the local community. Its main objective is to develop technology that enables communities to benefit from modern innovations.
The sub-objectives of this research consist of three issues: to develop a predictive model for cultural tourism management using text mining techniques, to evaluate the effectiveness of the cultural tourist attraction management model, and to assess the satisfaction of using the application for cultural tourism management. The research data was collected from Facebook conversations of 385 tourists (3,257 transactions) who had traveled to a famous tourist destination in Maha Sarakham Province. It was found that informants were very pleased with the research objectives. The prediction model development tools are three classification techniques: Naïve Bayes, Neural Network, and K-Nearest Neighbor. The model performance evaluation tools consist of a confusion matrix and cross-validation methods. The model development and model testing tools made it possible to discover unique models and to find the correlation of data and models that enables the system to deliver results to users with high satisfaction. In addition, a questionnaire was used to assess the satisfaction of the application, indicating acceptance of the innovation that had been created. The results showed that the model with the highest accuracy was the one built with the Naïve Bayes technique, with an accuracy of 91.65%. Simultaneously, the level of satisfaction with the application was high, with an average satisfaction equal to 3.98 (S.D. equal to 0.69). The conclusions of the testers using the developed application show that the researchers developed a responsive information system. It was therefore concluded that the application was accepted and can be further expanded to support more widespread research. The researchers have strong confidence that what has been studied in this research has been successful and will be accepted for future use.

7 Acknowledgment

This research project was supported by the Thailand Science Research and Innovation Fund and the University of Phayao (Grant No. FF65-UoE006).
In addition, this research was supported by many advisors, academicians, researchers, students, staff, and agencies from two organizations: the School of Information and Communication Technology at the University of Phayao, and the Faculty of Information Technology at the Rajabhat Maha Sarakham University. The authors would like to thank all of them for their support and collaboration in making this research possible.

8 References

[1] G. Sperlí, “A cultural heritage framework using a deep learning based chatbot for supporting tourist journey,” Expert Syst. Appl., vol. 183, p. 115277, Nov. 2021. https://doi.org/10.1016/j.eswa.2021.115277
[2] M. Casillo, F. Clarizia, G. D’Aniello, M. De Santo, M. Lombardi, and D. Santaniello, “CHAT-Bot: a cultural heritage aware teller-bot for supporting touristic experiences,” Pattern Recognit. Lett., vol. 131, pp. 234–243, Mar. 2020. https://doi.org/10.1016/j.patrec.2020.01.003
[3] I. D. Wahyono et al., “Shared nearest neighbour in text mining for classification material in online learning using mobile application,” Int. J. Interact. Mob. Technol. IJIM, vol. 16, no. 04, Art. no. 04, Feb. 2022. https://doi.org/10.3991/ijim.v16i04.28991
[4] J. Romão, K. Kourtit, B. Neuts, and P. Nijkamp, “The smart city as a common place for tourists and residents: a structural analysis of the determinants of urban attractiveness,” Cities, vol. 78, pp. 67–75, Aug. 2018. https://doi.org/10.1016/j.cities.2017.11.007
[5] G. Zhou, F. Kurauchi, S. Ito, and R. Du, “Identifying golden routes in tourist areas based on AMP collectors,” Asian Transp. Stud., vol. 8, p. 100052, Jan. 2022. https://doi.org/10.1016/j.eastsj.2021.100052
[6] M. Visan, S. L. Negrea, and F. Mone, “Towards intelligent public transport systems in Smart Cities; Collaborative decisions to be made,” Procedia Comput. Sci., vol. 199, pp. 1221–1228, Jan. 2022. https://doi.org/10.1016/j.procs.2022.01.155
[7] M. Paolanti et al., “Tourism destination management using sentiment analysis and geo-location information: a deep learning approach,” Inf. Technol. Tour., vol. 23, no. 2, pp. 241–264, Jun. 2021. https://doi.org/10.1007/s40558-021-00196-4
[8] S. Lu, G. Li, and M. Xu, “The linguistic landscape in rural destinations: a case study of Hongcun Village in China,” Tour. Manag., vol. 77, p. 104005, Apr. 2020. https://doi.org/10.1016/j.tourman.2019.104005
[9] M. Li, D. Yin, H. Qiu, and B. Bai, “A systematic review of AI technology-based service encounters: implications for hospitality and tourism operations,” Int. J. Hosp. Manag., vol. 95, p. 102930, May 2021. https://doi.org/10.1016/j.ijhm.2021.102930
[10] S. Mohamad Suhaili, N. Salim, and M. N. Jambli, “Service chatbots: a systematic review,” Expert Syst. Appl., vol. 184, p. 115461, Dec. 2021. https://doi.org/10.1016/j.eswa.2021.115461
[11] M. Cazacu and E. Titan, “Adapting CRISP-DM for social sciences,” BRAIN Broad Res. Artif. Intell. Neurosci., vol. 11, no. 2Sup1, Art. no. 2Sup1, May 2021. https://doi.org/10.18662/brain/11.2Sup1/97
[12] C. Schröer, F. Kruse, and J. M. Gómez, “A systematic literature review on applying CRISP-DM process model,” Procedia Comput. Sci., vol. 181, pp. 526–534, Jan. 2021. https://doi.org/10.1016/j.procs.2021.01.199
[13] J. Venter, A. de Waal, and C. Willers, “Specializing CRISP-DM for evidence mining,” in Advances in Digital Forensics III, New York, NY, 2007, pp. 303–315. https://doi.org/10.1007/978-0-387-73742-3_21
[14] K. Panwanitdumrong and C.-L. Chen, “Investigating factors influencing tourists’ environmentally responsible behavior with extended theory of planned behavior for coastal tourism in Thailand,” Mar. Pollut. Bull., vol. 169, p. 112507, Aug. 2021. https://doi.org/10.1016/j.marpolbul.2021.112507
[15] C.-K. Lee, M. S. Ahmad, J. F. Petrick, Y.-N. Park, E. Park, and C.-W. Kang, “The roles of cultural worldview and authenticity in tourists’ decision-making process in a heritage tourism destination using a model of goal-directed behavior,” J. Destin. Mark. Manag., vol. 18, p. 100500, Dec. 2020. https://doi.org/10.1016/j.jdmm.2020.100500
[16] E. M. Abuelrub and H. M. Solaiman, “A tourism E-guide system using mobile integration,” Int. J. Interact. Mob. Technol. IJIM, vol. 4, no. 2, Art. no. 2, Mar. 2010. https://doi.org/10.3991/ijim.v4i2.1051
[17] C. Suphachaimongkol, C. Ratanatamskul, S. Silapacharanan, and P. Utiswannakul, “Development of mobile application for sustainable creative tourism assessment using confirmatory factor analysis approach,” Int. J. Interact. Mob. Technol. IJIM, vol. 13, no. 06, Art. no. 06, Jun. 2019. https://doi.org/10.3991/ijim.v13i06.10500
[18] M. T. Cuomo, D. Tortora, P. Foroudi, A. Giordano, G. Festa, and G. Metallo, “Digital transformation and tourist experience co-design: big social data for planning cultural tourism,” Technol. Forecast. Soc. Change, vol. 162, p. 120345, Jan. 2021. https://doi.org/10.1016/j.techfore.2020.120345
[19] S. Tse and V. W. S. Tung, “Understanding residents’ attitudes towards tourists: connecting stereotypes, emotions and behaviours,” Tour. Manag., vol. 89, p. 104435, Apr. 2022. https://doi.org/10.1016/j.tourman.2021.104435
[20] S. (Sixue) Jia, “Motivation and satisfaction of Chinese and U.S. tourists in restaurants: a cross-cultural text mining of online reviews,” Tour. Manag., vol. 78, p. 104071, Jun. 2020. https://doi.org/10.1016/j.tourman.2019.104071
[21] J. Wen, S. (Sam) Huang, and T. Ying, “Relationships between Chinese cultural values and tourist motivations: a study of Chinese tourists visiting Israel,” J. Destin. Mark. Manag., vol. 14, p. 100367, Dec. 2019. https://doi.org/10.1016/j.jdmm.2019.100367
[22] A. Gutiérrez, A. Domènech, B. Zaragozí, and D. Miravet, “Profiling tourists’ use of public transport through smart travel card data,” J. Transp. Geogr., vol. 88, p. 102820, Oct. 2020. https://doi.org/10.1016/j.jtrangeo.2020.102820
[23] D. Ilić, I. Milošević, and T. Ilić-Kosanović, “Application of unmanned aircraft systems for smart city transformation: case study Belgrade,” Technol. Forecast. Soc. Change, vol. 176, p. 121487, Mar. 2022. https://doi.org/10.1016/j.techfore.2022.121487
[24] E. Sigalat-Signes, R. Calvo-Palomares, B. Roig-Merino, and I. García-Adán, “Transition towards a tourist innovation model: the smart tourism destination: reality or territorial marketing?,” J. Innov. Knowl., vol. 5, no. 2, pp. 96–104, Apr. 2020. https://doi.org/10.1016/j.jik.2019.06.002
[25] T.-L. Huang, “Restorative experiences and online tourists’ willingness to pay a price premium in an augmented reality environment,” J. Retail. Consum. Serv., vol. 58, p. 102256, Jan. 2021. https://doi.org/10.1016/j.jretconser.2020.102256
[26] E. R. Fino, J. Martín-Gutiérrez, M. D. M. Fernández, and E. A. Davara, “Interactive tourist guide: connecting Web 2.0, augmented reality and QR codes,” Procedia Comput. Sci., vol. 25, pp. 338–344, Jan. 2013. https://doi.org/10.1016/j.procs.2013.11.040
[27] N. Chung, H. Han, and Y. Joun, “Tourists’ intention to visit a destination: the role of augmented reality (AR) application for a heritage site,” Comput. Hum. Behav., vol. 50, pp. 588–599, Sep. 2015. https://doi.org/10.1016/j.chb.2015.02.068
[28] Q. Li, S. Li, S. Zhang, J. Hu, and J. Hu, “A review of text corpus-based tourism big data mining,” Appl. Sci., vol. 9, no. 16, Art. no. 16, Jan. 2019. https://doi.org/10.3390/app9163300
[29] H. A. T. Nguyen et al., “Comparative carbon footprint assessment of agricultural and tourist locations in Thailand,” J. Clean. Prod., vol. 269, p. 122407, Oct. 2020. https://doi.org/10.1016/j.jclepro.2020.122407
[30] S. Fuktong et al., “A survey of stereotypic behaviors in tourist camp elephants in Chiang Mai, Thailand,” Appl. Anim. Behav. Sci., vol. 243, p. 105456, Oct. 2021. https://doi.org/10.1016/j.applanim.2021.105456
[31] Y. Jeaheng and H. Han, “Thai street food in the fast growing global food tourism industry: Preference and behaviors of food tourists,” J. Hosp. Tour. Manag., vol. 45, pp. 641–655, Dec. 2020. https://doi.org/10.1016/j.jhtm.2020.11.001
[32] J. T. Roscoe, Fundamental research statistics for the behavioral sciences. New York: Holt, Rinehart and Winston, 1969. Accessed: Feb. 08, 2022. [Online]. Available: http://archive.org/details/fundamentalresea0000rosc
[33] J. Charan and T. Biswas, “How to calculate sample size for different study designs in medical research?,” Indian J. Psychol. Med., vol. 35, no. 2, pp. 121–126, Apr. 2013. https://doi.org/10.4103/0253-7176.116232
[34] L. Zhang, Z. Qi, and F. Meng, “A review on the construction of business intelligence system based on unstructured image data,” Procedia Comput. Sci., vol. 199, pp. 392–398, Jan. 2022. https://doi.org/10.1016/j.procs.2022.01.048
[35] R. Blanquero, E. Carrizosa, P. Ramírez-Cobo, and M. R. Sillero-Denamiel, “Variable selection for Naïve Bayes classification,” Comput. Oper. Res., vol. 135, p. 105456, Nov. 2021. https://doi.org/10.1016/j.cor.2021.105456
[36] A.-A. Tulbure, A.-A. Tulbure, and E.-H. Dulf, “A review on modern defect detection models using DCNNs – Deep convolutional neural networks,” J. Adv. Res., vol. 35, pp. 33–48, Jan. 2022. https://doi.org/10.1016/j.jare.2021.03.015
[37] J. R. Rico-Juan, J. J. Valero-Mas, and J. Calvo-Zaragoza, “Extensions to rank-based prototype selection in k-Nearest Neighbour classification,” Appl. Soft Comput., vol. 85, p. 105803, Dec. 2019. https://doi.org/10.1016/j.asoc.2019.105803

9 Authors

Thanet Yuensuk is currently an instructor at the Faculty of Information Technology, Rajabhat Maha Sarakham University, Maha Sarakham, 44000, Thailand. (Email: thanet.yu@rmu.ac.th) His research interests are learning media development, knowledge management, information systems development, and information technology management.

Potsirin Limpinan is currently an assistant professor at the Faculty of Information Technology, Rajabhat Maha Sarakham University, Maha Sarakham, 44000, Thailand. (Email: potsirin.li@rmu.ac.th) Her research interests are learning media development, knowledge management, information systems development, and information technology management.

Wongpanya Sararat Nuankaew is currently an assistant professor at the Faculty of Information Technology, Rajabhat Maha Sarakham University, Maha Sarakham, 44000, Thailand. (Email: wongpanya.nu@rmu.ac.th) Her research interests are digital education, innovation and knowledge management, data science, and big data and information technology management.

Pratya Nuankaew is currently an instructor at the School of Information and Communication Technology, University of Phayao, Phayao, 56000, Thailand. (Email: pratya.nu@up.ac.th) He is the corresponding author on this research. His research interests are applied informatics technologies, behavioral sciences analysis with technologies, computer-supported collaborative learning, data science in education, educational data mining, learning analytics and learning styles, learning strategies for lifelong learning, self-regulated learning, social network analysis, and ubiquitous computing.

Article submitted 2022-02-26. Resubmitted 2022-03-27. Final acceptance 2022-03-29. Final version published as submitted by the authors.