IIUM Engineering Journal, Vol. 22, No. 1, 2021 Buntoro et al.
https://doi.org/10.31436/iiumej.v22i1.1532

IMPLEMENTATION OF A MACHINE LEARNING ALGORITHM FOR SENTIMENT ANALYSIS OF INDONESIA’S 2019 PRESIDENTIAL ELECTION

GHULAM ASROFI BUNTORO1*, RIZAL ARIFIN1, GUS NANANG SYAIFUDDIIN1, ALI SELAMAT2,3, ONDREJ KREJCAR3, AND HAMIDO FUJITA4

1Faculty of Engineering, Universitas Muhammadiyah Ponorogo, Indonesia
2Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia
3Faculty of Informatics and Management, University of Hradec Kralove, Czech Republic
4Faculty of Software and Information Science, Iwate Prefectural University, Iwate, Japan

*Corresponding author: ghulam@umpo.ac.id

(Received: 3rd July 2020; Accepted: 1st October 2020; Published on-line: 4th January 2021)

ABSTRACT: In 2019, citizens of Indonesia participated in the democratic process of electing a new president, vice president, and various legislative candidates for the country. The 2019 Indonesian presidential election was very tense in terms of the candidates' campaigns in cyberspace, especially on social media sites such as Facebook, Twitter, Instagram, Google+, Tumblr, LinkedIn, etc. The Indonesian people used social media platforms to express their positive, neutral, and also negative opinions on the respective presidential candidates. The campaigning of respective social media users on their choice of candidates for regents, governors, and legislative positions up to presidential candidates was conducted via the Internet and online media. Therefore, the aim of this paper is to conduct sentiment analysis on the candidates in the 2019 Indonesia presidential election based on Twitter datasets. The study used datasets on the opinions expressed by the Indonesian people available on Twitter with the hashtags (#) containing "Jokowi and Prabowo."
We conducted data pre-processing using a selection of comments, data cleansing, text parsing, sentence normalization and tokenization of the Indonesian-language text, and determination of class attributes. Finally, we classified the Twitter posts with the hashtags (#) using the Naïve Bayes Classifier (NBC) and a Support Vector Machine (SVM) to achieve the highest possible accuracy. The study provides benefits in terms of helping the community to research opinions on Twitter that contain positive, neutral, or negative sentiments. Sentiment analysis of the candidates in the 2019 Indonesian presidential election on Twitter using non-conventional processes resulted in cost, time, and effort savings. The results showed that the combination of the SVM machine learning algorithm and alphabetic tokenization produced the highest accuracy value of 79.02%, while the lowest accuracy value of 44.94% was obtained with the combination of the NBC machine learning algorithm and N-gram tokenization.

KEYWORDS: sentiment analysis; president; Indonesia; naïve Bayes classifier; support vector machine

1. INTRODUCTION

The turmoil resulting from organizing the 2019 Indonesian general election, notably the presidential election, has been felt since last year. This has applied not only in the real world but also in cyberspace, mainly on social media sites such as Twitter, Instagram, and Facebook, which people used to discuss their potential presidential candidates. The stages of the general election and presidential election in 2019 were announced by the Indonesian General Elections Commission (KPU). The names of the presidential candidates had been widely discussed on social media as far back as the candidate registration phase in early 2019 by the Indonesian KPU [1].
The virtual world is largely free and difficult to control; everyone is free to speak or give an opinion on their respective candidates. The opinions expressed by the public may be positive, neutral, or negative. Information technology has developed so fast that there is now a significant amount of online media, ranging from news outlets to social networking sites such as Facebook, Twitter, Path, Instagram, Google+, and many more. Twitter has a total of 330 million active users to date, of whom around 100 million are active daily, and around 500 million tweets are made worldwide every day [2].

Social media is used not only for making friends but also for activities such as promoting and selling merchandise, political party promotion, and campaigns for regent, presidential, and legislative candidates. The team supporting a candidate for president or regional head, for example, may justify any means of campaigning for their candidate, as evidenced by the many black campaigns against candidates during the campaign period [3], especially on social media. Today, campaigning and image-building take place not only in the real world but also in the virtual world. Social media, especially Twitter, is now one of the most effective and efficient campaign venues.

Sentiment analysis continues to be used as part of opinion mining research. It is the process of understanding, extracting, and processing textual data automatically to obtain the sentiment information contained in an opinion sentence [4]. In this study, sentiment analysis was conducted to retrieve the opinions expressed in the Indonesian language on Twitter about the candidates in the 2019 Indonesian presidential election and to determine whether those opinions were positive, neutral, or negative.
To test the accuracy of the sentiment analysis in this study, we used two machine learning algorithms, namely the Naïve Bayes Classifier (NBC) and the Support Vector Machine (SVM) [5], and 7 tokenizations: the Alphabetic Tokenizer, Character N-gram Tokenizer, Unigram, Bigram, Trigram, N-gram, and Word Tokenizer. The results allow the accuracy of each combination of algorithm and tokenization to be compared for sentiment analysis of the 2019 Indonesian presidential candidates.

2. RELATED WORK

Previous sentiment analysis research used machine learning to classify Turkish political news [6], determining whether the sentiment expressed was positive or negative. Different features of the Turkish political news were extracted and used with the Naïve Bayes Classifier (NBC), Maximum Entropy (ME), and SVM machine learning algorithms to produce a classification model.

Sentiment analysis has also been used to group texts according to their positive or negative orientation [7]. That paper reports experiments that applied SVMs to standard benchmark datasets to train sentiment analysis classifiers. N-grams and different weighting schemes were used to extract the most classic features. The study also explored Chi-Square weighting to select informative features for the classification method. Its experimental results reveal that Chi-Square feature selection can significantly enhance classification accuracy.

The main challenge for law enforcement in recent years has been the automatic detection of abusive language in online media [8].
The authors of [8] made three contributions. First, they developed a deep learning architecture that uses word frequency vectorization to implement its features. Second, they proposed a method that, because it does not use pre-trained word embeddings, is language-independent. Third, they conducted a comprehensive evaluation of their model using a public dataset of labelled tweets and an open-source implementation built using Keras. That paper presents an ensemble classifier for detecting hate speech in short texts, such as opinion tweets used as corpus datasets [9]. The classifier uses deep learning and takes as input a set of features related to user behaviour, such as the tendency to send abusive messages, in combination with machine learning algorithms [10,11].

Sentiment analysis research has also been carried out using a hybrid approach [12], in which association rule mining, dependency parsing, and SentiWordNet were applied to solve the aspect-based sentiment analysis problem [13]. The performance of that research was evaluated using negative and racial domains and other benchmarks to assess the accuracy of aspect-based sentiment classification.

3. PROPOSED METHODOLOGY

3.1 Tweet Data Collection

Tweet data were collected by crawling [14] Twitter with the R programming language in R-Studio. The data taken comprised only tweets in Indonesian: 5,000 tweets containing the Jokowi keywords and 5,000 tweets containing the Prabowo keywords, giving a total of 10,000 tweets. The data were taken randomly from ordinary users of Twitter (Fig. 1).

Fig. 1: Proposed method of Twitter sentiment analysis.

Figure 2 shows the R code [15] that was used to crawl the data from Twitter.
The tweet data on Jokowi comprised 5,000 tweets. The examples of original tweets shown in the figure contain a lot of noise in the form of symbols and links.

Fig. 2: Crawling tweets of opinions about Jokowi from Twitter at R-Studio.

Fig. 3: Crawling tweets of opinion about Prabowo from Twitter at R-Studio.

Figure 3 shows the R code for crawling the Prabowo data from Twitter, together with the number of tweets retrieved, which was 5,000. The left-hand panes in Figures 2 and 3 also contain examples of tweets in their original form, with a lot of noise in the form of symbols and links.

3.2 Data Pre-Processing

The data pre-processing stage [16] in this study consisted of 4 steps, which are described as follows:

A. Selection of Comments

At this stage, comments containing the hashtag (#) keywords Jokowi and Prabowo were selected; any data that did not contain them were deleted. During crawling, every comment with either hashtag is retrieved, even when both appear in the same sentence. Identical comments are then deleted, even if they come from different Twitter accounts, so that only unique tweet data remain.

B. Cleansing

This process cleaned up the comments from Twitter, which still contained a lot of noise. The opinion sentences obtained from Twitter usually contained a certain level of noise, i.e., random errors or variance in measured variables; therefore, the noise had to be eliminated. The items removed were usually HTML characters, symbols, emoticon icons, hashtags (#), usernames (@username), URL addresses (http://websitename.com), and email addresses (name@websitename.com).

C. Parsing

The third data pre-processing step in this study was parsing [17]. The aim was to break the document into a string of words and then analyse the collection of words by separating them and determining the syntactic structure of each word.

D. Sentence Normalization

The aim of this step was to normalize the sentences taken from Twitter. For example, a sentence containing slang (Gaul) or Alay words [18] would be normalized so that it could be recognized as language that follows the KBBI (Kamus Besar Bahasa Indonesia, the Great Dictionary of the Indonesian Language) [19]. The normalization of sentences involved the following processes:

• Stretch punctuation and symbols other than the alphabet
Stretching punctuation involves inserting spaces around punctuation that is attached to the word before or after it. The aim is to prevent punctuation and/or symbols other than letters from being merged with words during the tokenization process.

• Change to all lowercase letters

• Normalization of words
The rules in the normalization process are shown in Table 1.

Table 1: Rules for normalizing words
Non-standard / slang      Normal
Suffix –ny                Suffix –nya
Suffix –nk                Suffix –ng
Suffix –x                 Suffix –nya
Suffix –z                 Suffix –s
Suffix –dh                Suffix –t
Repeat words: sama2       Repeat words: sama-sama
Spelling: oe              Alphabet: u
Spelling: dj              Alphabet: j

• Eliminate repeated letters
When happy or upset, people may write opinions based on their emotions, and in written form they often repeat the same letter; for example, "kereeen" to express pleasure. A word with repeated letters such as "kereeen" will be normalized to "keren" ("cool").

3.3 Tokenization

After normalization, each sentence was broken down into tokens [20] using a delimiter such as the space character. The tokenizers used in this study are:
• Alphabetic Tokenizer: The tokens are formed only from contiguous sequences of alphabetic characters; for example, "aku, anak, asli, baik, bagus, cara, cinta, demi, engkau, enak, film".
• Character N-gram Tokenizer: This tokenizer divides a word into sequences of characters; for example, "pe, mi, lu, pe, mi, li, han, u, mum".
• Unigram: This tokenizer divides the sentence into tokens, with each token consisting of only one word; for example, "Pemilu".
• Bigram: This tokenizer divides the sentence into tokens, with each token consisting of two words; for example, "Pemilihan Umum".
• Trigram: This tokenizer divides the sentence into tokens, with each token consisting of three words; for example, "Pemilihan Umum Indonesia".
• N-gram Tokenizer: This tokenizer divides the string into n-grams with the minimum and maximum number of grams as specified; for example, "pemilihan, pemilihan umum, pemilihan umum Indonesia, aku, aku anak, aku anak indonesia".
• Word Tokenizer: This tokenizer divides the text into tokens based on basic words; for example, "aku, akun, akuntansi, alam, alami, alamiah".

3.4 Determination of Class Attribute

After pre-processing, the next stage in this research is to determine the class attribute. The class attribute used here is the sentiment class; in this study, there are 3 class attributes [21], namely positive, neutral, and negative. The use of 3 class attributes provides a more detailed and accurate classification of public opinion toward a particular object.

3.5 Load Dictionary

Following the class attribute determination, the next step is to apply the Lexicon-based method [22]. The dictionary used in this study comprises positive words (positive keywords), negative words (negative keywords), and negation words (negation keywords). The following is a sample of the dictionary and its contents:

• Positive keywords; for example, "amanah, ahli, jujur, adil, keren".
• Negative keywords; for example, "apatis, benci, dosa, jahat, buruk".
• Negation keywords; for example, "lebih, kurang, tidak, bukan".
• Dictionary of slang conversion to KBBI; for example, "cyg = sayang, lbh = lebih, krn = karena, jd = jadi, spt = seperti, ciyus = serius".

3.6 Determination of Sentiment

This is the process used for determining the sentiment (Positive, Neutral, or Negative) of the Twitter data once the pre-processing has been performed. The sentiment determination in this study used the Lexicon-based (Dictionary-based) method, implemented in Python, with the positive and negative dictionaries. The polarity score of an opinion word (p) is 1 if the word is in the positive dictionary, meaning the word is positive. A word that is in neither the positive nor the negative dictionary is worth 0, meaning it is neutral, while a word in the negative dictionary is worth -1, meaning it is negative [23]. The sentiment of a sentence is determined by summing the polarity scores of the n opinion words (p) that comment on the feature (f).

After determining which words in a Twitter opinion sentence are positive, neutral, or negative, the weight of the sentence is calculated by totalling the value of each opinion word. If the total value of the opinion words in the sentence is ≥ 1, then the sentiment of the opinion sentence is positive; if the total is 0, then the sentiment is neutral; and if the total is ≤ -1, then the sentiment is negative. The determination of sentiment is summarized in Table 2.
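The scoring rule described in Section 3.6 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the two small dictionaries are hypothetical and contain only the sample keywords quoted in Section 3.5, the negative threshold is read as a total of -1 or lower, and the handling of negation keywords is not specified in the text and is therefore omitted.

```python
# Hypothetical mini-dictionaries built only from the sample keywords in
# Section 3.5; the full dictionaries used in the study are not reproduced here.
POSITIVE = {"amanah", "ahli", "jujur", "adil", "keren"}
NEGATIVE = {"apatis", "benci", "dosa", "jahat", "buruk"}


def word_polarity(word):
    """Return +1 for a positive word, -1 for a negative word, 0 otherwise."""
    if word in POSITIVE:
        return 1
    if word in NEGATIVE:
        return -1
    return 0


def sentence_sentiment(tokens):
    """Sum the word polarities and map the total to a sentiment class."""
    total = sum(word_polarity(token) for token in tokens)
    if total >= 1:
        return "positive"
    if total <= -1:  # assumption: negative means a total of -1 or lower
        return "negative"
    return "neutral"


# Two positive words and no negative words give a total of 2.
print(sentence_sentiment("presiden yang jujur dan adil".split()))  # positive
```

Because the class depends only on the sign of the total, a sentence with one positive and one negative word is labelled neutral under this rule.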
Table 2: Determination of Sentiment
Sentiment    Value
Positive     ≥ 1
Neutral      0
Negative     ≤ -1

3.7 Classification Processes

Once the sentiment value of each opinion sentence has been established using Python, the next step is the sentiment classification process. The classification process uses the WEKA 3.8.3 machine learning tool [24], and the machine learning algorithms used in this study are NBC and SVM. In the classification process, the data were tested using the 10-fold cross-validation method [25]. The method works by dividing the dataset into 10 parts, with 9/10 of the parts used as training data and 1/10 used as testing data. The process is run 10 times, each time with a different combination of the 10 parts as training and testing data.

Fig. 4: Illustration of 10-fold cross-validation.

3.8 Evaluation of Results

The evaluation stage examines the Accuracy, Precision, and Recall of the experiments that have been carried out. The evaluation is conducted using a Confusion Matrix [26], whose indicators are the true positive rate (TP rate), true negative rate (TN rate), false positive rate (FP rate), and false negative rate (FN rate). The TP rate is the percentage of the positive class that is successfully classified as positive, while the TN rate is the percentage of the negative class that is successfully classified as negative. The FP rate is the percentage of the negative class that is classified as positive, and the FN rate is the percentage of the positive class that is classified as negative.

Table 3: Confusion Matrix
                            Predicted
Actual          Positive (A)   Neutral (B)   Negative (C)
Positive (A)    AA             AB            AC
Neutral (B)     BA             BB            BC
Negative (C)    CA             CB            CC

4. EXPERIMENT AND RESULTS

In this study, the dataset was derived from tweets of public opinion on the Indonesian 2019 presidential candidates. The data were taken using the crawling method [27] with the R programming language in R-Studio from Twitter. The data taken were only tweets in Indonesian, comprising 5,000 tweets containing the Jokowi keywords and 5,000 tweets containing the Prabowo keywords, giving a total of 10,000 tweets. The tweet data were taken randomly from both ordinary users and online news media on Twitter.

Following the data pre-processing, tokenization, and class attribute determination steps, the dataset used for this study contained opinion sentences from Twitter classified into their respective sentiment classes (Positive, Neutral, or Negative) using Python. The number of sentences in the dataset is not the same as the number of tweets crawled because identical opinion sentences are deleted during pre-processing to obtain unique data, whereas crawling retrieves all opinion sentences, including duplicates. Table 4 contains the results of the determination of the sentiment class using the Lexicon-based method [28] in Python with three attribute classes, namely positive, neutral, and negative.

Table 4: Results of Determination of Sentiment Classes
Sentiment    Total
Positive     2688
Neutral      4666
Negative     2646

After the sentiment value of each opinion sentence has been determined, the opinion sentences are formed into a dataset in the Attribute-Relation File Format (ARFF) [29] as the input for classifying data with WEKA. The tweet data were then classified and tested for accuracy using the NBC and SVM machine learning algorithms in WEKA version 3.8.3. This study uses the 10-fold cross-validation method for the process of classifying or testing tweet data.
In this process, the data are divided into 10 parts, with 9/10 of the parts used for training and 1/10 used for testing. The procedure is repeated 10 times, each time with a different combination of the 10 parts as training and testing data. Table 5 compares the results of the NBC and SVM machine learning algorithms.

Table 5: Comparison of Classification Results*
*The Precision and Recall values are the averages of the positive and negative class values.

The information in Table 5 enables a comparison of the accuracy, precision, recall, TP rate, and TN rate values for each trial carried out with the NBC and SVM machine learning algorithms. The rows list the tokenizations used in this study, while the columns give the accuracy, precision, recall, TP rate, and TN rate values for each trial conducted. The steps from data pre-processing to the determination of the sentiment class produced the dataset of this research, which was then used as the input to the classification process in WEKA. The classification tests with the 7 tokenizations produced values for accuracy, precision, recall, TP rate, and TN rate for each trial.

Fig. 5: Accuracy level of different machine learning algorithms and tokenization methods. The blue and orange charts correspond to the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM), respectively.

Figure 5 shows the accuracy obtained in this study with the two machine learning algorithms, NBC and SVM, and the 7 tokenizations: Alphabetic Tokenizer, Character N-gram Tokenizer, Unigram, Bigram, Trigram, N-gram, and Word Tokenizer.
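To make the difference between these tokenizations concrete, the following Python sketch approximates three of them as they are described in Section 3.3. It is an illustration under stated assumptions rather than the WEKA tokenizers used in the study: the function names `alphabetic_tokens`, `word_ngrams`, and `char_ngrams` are hypothetical, and the input is assumed to be ASCII text that is lowercased as in the normalization step.

```python
import re


def alphabetic_tokens(text):
    """Keep only contiguous runs of letters (alphabetic tokenization)."""
    return re.findall(r"[a-z]+", text.lower())


def word_ngrams(tokens, n_min, n_max):
    """Build word n-grams for every n between n_min and n_max inclusive.

    n_min = n_max = 1 gives unigrams, 2 gives bigrams, 3 gives trigrams.
    """
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def char_ngrams(word, n_min, n_max):
    """Build character n-grams of a single word."""
    return [word[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(word) - n + 1)]


tokens = alphabetic_tokens("Pemilihan Umum Indonesia 2019!")
print(tokens)                     # ['pemilihan', 'umum', 'indonesia']
print(word_ngrams(tokens, 2, 2))  # ['pemilihan umum', 'umum indonesia']
print(char_ngrams("umum", 2, 3))  # ['um', 'mu', 'um', 'umu', 'mum']
```

With `n_min=1` and `n_max=3`, `word_ngrams` mixes unigrams, bigrams, and trigrams in one feature set, which corresponds to the N-gram tokenization with a specified minimum and maximum number of grams.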
[Fig. 5 chart: accuracy (%) by tokenization method for NBC and SVM]

Naïve Bayes Classifier (NBC)
Tokenizer          Accuracy (%)   Precision (%)   Recall (%)   TP Rate (%)   TN Rate (%)
Alphabetic         49.94          52.5            49.9         51.4          50.5
Character N-gram   51.81          54.4            51.8         53.7          51.7
Unigram            50.11          52.6            50.1         53.3          48.4
Bigram             50.51          61.2            50.5         79.7          25.3
Trigram            55.98          68.5            56           21            20.3
N-gram             44.94          51.6            44.9         64.2          46
Word               50.11          52.6            50.1         53.3          48.4

Support Vector Machine (SVM)
Tokenizer          Accuracy (%)   Precision (%)   Recall (%)   TP Rate (%)   TN Rate (%)
Alphabetic         79.02          79              79           73.8          70.3
Character N-gram   58.47          59.20           58.50        55.70         64.40
Unigram            78.9           78.9            78.9         73.2          70.8
Bigram             69.21          72.1            69.2         51.5          48
Trigram            61.21          71.7            61.2         30.1          31
N-gram             77.82          77.8            77.8         72.6          69.4
Word               78.9           78.9            78.9         73.2          70.8

Accuracy was one of the main parameters in the assessment of the sentiment analysis model used in this study. Accuracy is the number of data items that were correctly classified according to their sentiment class divided by the total number of data items classified. Therefore, the more data that were correctly classified according to the sentiment class, the higher the accuracy value. The highest accuracy value, 79.02%, was obtained with the combination of the SVM machine learning algorithm and alphabetic tokenization. In this study, machine learning methods such as the SVM algorithm produced the highest accuracy because they work by recognizing word patterns. This machine learning algorithm is capable of easily recognizing and memorizing word patterns for a certain sentiment class in an opinion sentence.
In addition, alphabetic tokenization can improve accuracy by breaking a sentence into words, which makes sentences that carry sentiment easier to classify. The lowest accuracy value in this study, 44.94%, was obtained for the NBC machine learning algorithm with N-gram tokenization.

Fig. 6: Precision level of different machine learning algorithms and tokenization methods. The blue and orange charts correspond to the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM), respectively.

Fig. 7: Recall level of different machine learning algorithms and tokenization methods. The blue and orange charts correspond to the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM), respectively.

Figure 6 shows that the highest Precision value of 79% was obtained by the SVM machine learning algorithm with alphabetic tokenization, while the lowest precision value of 51.6% (Table 5) came from the NBC machine learning algorithm with N-gram tokenization. Figure 7 shows that the highest Recall value of 79% was obtained with the SVM machine learning algorithm and alphabetic tokenization, while the lowest Recall value of 44.9% (Table 5) was obtained with the NBC machine learning algorithm and N-gram tokenization.
These values follow from the definitions of the two measures. Precision is the number of positive-class items that were correctly classified as positive divided by the total number of items classified as positive, whereas recall is the number of positive-class items that were correctly classified as positive divided by the number of actual positive-class items.

Fig. 8: TP Rate level of different machine learning algorithms and tokenization methods. The blue and orange charts correspond to the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM), respectively.

Fig. 9: TN Rate level of different machine learning algorithms and tokenization methods. The blue and orange charts correspond to the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM), respectively.

Figure 8 shows that both the highest and the lowest TP Rate were obtained using the NBC machine learning algorithm. The TP Rate denotes the proportion of positive tweet data that were correctly classified according to their sentiment class, which in this case was positive. In contrast, as shown in Fig. 9 and Table 5, the highest TN Rate (70.8%) was obtained using the SVM machine learning algorithm, whereas the lowest TN Rate (20.3%) came from the NBC machine learning algorithm with Trigram tokenization. The TN Rate indicates the proportion of negative tweet data that were correctly classified according to their sentiment class, which in this case was negative.
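As a worked illustration of these definitions, the Python sketch below computes accuracy and per-class precision and recall from a three-class confusion matrix laid out as in Table 3 (rows are actual classes, columns are predicted classes). The function name `per_class_metrics` is hypothetical and the counts are invented for the example; they are not results from this study.

```python
def per_class_metrics(matrix, labels):
    """Compute accuracy and per-class precision/recall.

    matrix[i][j] is the number of items whose actual class is labels[i]
    and whose predicted class is labels[j].
    """
    size = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(size))
    accuracy = correct / total

    metrics = {}
    for k, label in enumerate(labels):
        true_pos = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(size))  # column sum
        actual_k = sum(matrix[k])                             # row sum
        precision = true_pos / predicted_k if predicted_k else 0.0
        recall = true_pos / actual_k if actual_k else 0.0
        metrics[label] = {"precision": precision, "recall": recall}
    return accuracy, metrics


# Invented counts, for illustration only.
labels = ["positive", "neutral", "negative"]
matrix = [
    [50, 30, 20],  # actual positive
    [10, 80, 10],  # actual neutral
    [15, 25, 60],  # actual negative
]
accuracy, metrics = per_class_metrics(matrix, labels)
print(round(accuracy, 3))                          # 0.633
print(round(metrics["positive"]["precision"], 3))  # 0.667
print(metrics["positive"]["recall"])               # 0.5
```

In this layout, the recall of a class is the same quantity as its true positive rate: the share of items of that class that the classifier labelled correctly.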
From the research carried out, it can be seen that the model delivered the greatest accuracy when using the combination of the SVM machine learning algorithm and alphabetic tokenization (79.02%), while the lowest accuracy value was obtained with the combination of the NBC machine learning algorithm and N-gram tokenization (44.94%). The accuracy results were quite good; however, the model still made a number of mistakes because the distribution of sentiment classes in the dataset was not as balanced as this study intended. The use of datasets with an imbalanced distribution leads to minority class data being incorrectly classified as majority class data [30], which results in a large difference in values because most classifiers classify the majority class correctly more often than the minority class [31].

5. CONCLUSIONS

From the series of studies conducted, we can conclude that the sentiment analysis model built was suitable for determining the sentiment of public opinion on Twitter with respect to the 2019 Indonesian presidential candidates. The study aimed to determine which machine learning algorithms were suitable for the classification of public opinion on Twitter, and also to test which of 7 tokenizations produced high accuracy when combined with the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM) machine learning algorithms. The sentiment analysis revealed that there was much negative public sentiment on Twitter aimed at the 2019 Indonesian presidential candidates. The greatest accuracy value, 79.02%, was obtained when using the combination of the SVM machine learning algorithm and alphabetic tokenization. The lowest accuracy value in this study, 44.94%, was obtained for the NBC machine learning algorithm with N-gram tokenization.
This study has therefore demonstrated that the SVM machine learning algorithm produces higher accuracy compared to the NBC machine learning algorithm. It is suggested that further research should endeavour to use more data and real-time data from both Twitter and other social media sites such as Facebook and YouTube.

ACKNOWLEDGEMENTS

This research was funded by Universiti Teknologi Malaysia (UTM) under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876, and the Fundamental Research Grant Scheme (FRGS) Vot 5F073 supported under the Ministry of Education Malaysia. The work is partially supported by the SPEV project, University of Hradec Kralove, FIM, Czech Republic (ID: 2103-2019), “Smart Solutions in Ubiquitous Computing Environments” and by the project of excellence 2019/2205, Faculty of Informatics and Management, University of Hradec Kralove. Partial funding by Universitas Muhammadiyah Ponorogo through the Institute for Research and Public Services (LPPM) under the contract no. 115/VI.4/PN/2018 is also acknowledged. We are also grateful for the support of Ph.D. students Sebastien Mambou and Michal Dobrovolny in consultations regarding application aspects.

REFERENCES

[1] KPU - Portal Publikasi Pemilihan Umum 2019. Available: https://infopemilu.kpu.go.id/.
[2] Instagram by the Numbers: Stats, Demographics & Fun Facts, Omnicore Agency, 2020. Available: https://www.omnicoreagency.com/instagram-statistics/.
[3] Januru L. (2016) Analisis Wacana Black Campaign (Kampanye Hitam) Pada PILPRES Tahun 2014 di Media Kompas, Jawa Pos Dan Kedaulatan Rakyat, Natapraja, 4(2): 181-194.
[4] Indurkhya N, Damerau FJ. (2010) Handbook of Natural Language Processing, 2nd ed. Chapman & Hall/CRC.
[5] Wagh B, N RW. (2016) Sentimental Analysis on Twitter Data using Naive Bayes, IJARCCE, 5(12): 316-319.
[6] Yuret D, Türe F. (2006) Learning morphological disambiguation rules for Turkish, Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, pp. 328-334.
[7] Zainuddin N, Selamat A. (2014) Sentiment analysis using Support Vector Machine, in I4CT 2014 - 1st International Conference on Computer, Communications, and Control Technology, Proceedings, pp. 333-337.
[8] Pitsilis GK, Ramampiaro H, Langseth H. (2018) Effective hate-speech detection in Twitter data using recurrent neural networks, Appl. Intell., 48(12): 4730-4742.
[9] Pak A, Paroubek P. (2010) Twitter as a corpus for sentiment analysis and opinion mining, Proc. 7th Int. Conf. Lang. Resour. Eval. LREC 2010, pp. 1320-1326.
[10] Kolchyna O, Souza TT, Treleaven P, Aste T. (2015) Twitter Sentiment Analysis: Lexicon Method, Machine Learning Method and Their Combination. arXiv: Computation and Language.
[11] Nezhad ZB, Deihimi MA. (2019) A combined deep learning model for Persian Sentiment Analysis, IIUM Eng. J., 20(1): 129-139.
[12] Zainuddin N, Selamat A, Ibrahim R. (2016) Improving Twitter Aspect-Based Sentiment Analysis Using Hybrid Approach. In: Nguyen NT, Trawiński B, Fujita H, Hong TP. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9621. Springer, Berlin, Heidelberg.
[13] Zainuddin N, Selamat A, Ibrahim R. (2018) Hybrid sentiment classification on twitter aspect-based sentiment analysis. Appl. Intell., 48: 1218-1232.
[14] Purohit NS, Angadi AB, Bhat M, Gull KC. (2015) Crawling through web to extract the data from Social networking site-Twitter, 2015 Natl. Conf. Parallel Comput. Technol. PARCOMPTECH 2015.
[15] R: What is R?. Available: https://www.r-project.org/about.html.
[16] Sammut C. (2011) Genetic and Evolutionary Algorithms. In: Sammut C, Webb GI. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA.
[17] Utomo FS, Suryana N, Azmi MS. (2020) Stemming impact analysis on Indonesian Quran translation and their exegesis classification for ontology instances. IIUM Eng. J., 21(1): 33-50.
[18] Lutfiatun A, Novitasari A, Helfiyana A. (2018) Bahasa Alay Pada Chating Di Medsos Remaja Millenial (Bahasa Alay Vs Remaja Millenial), Prosiding SENASBASA (Seminar Nasional Bahasa dan Sastra), pp. 34-41.
[19] KBBI Daring. Available: https://kbbi.kemdikbud.go.id/
[20] Weka_tokenizers: R/Weka Tokenizers in RWeka: R/Weka Interface. Available: https://rdrr.io/cran/RWeka/man/Weka_tokenizers.html.
[21] Jabreel M, Moreno A. (2019) A Deep Learning-Based Approach for Multi-Label Emotion Classification in Tweets, Appl. Sci., 9(6).
[22] Musto C, Semeraro G, Polignano M. (2014) A comparison of lexicon-based approaches for sentiment analysis of microblog, CEUR Workshop Proc., 1314: 59-68.
[23] Buntoro GA. (2017) Analisis Sentimen Calon Gubernur DKI Jakarta 2017 Di Twitter, Integer J. Maret, 1(1): 32-41.
[24] Machine Learning Group - Department of Computer Science: University of Waikato. Available: https://www.cs.waikato.ac.nz/research/research-groups/machine-learning-group
[25] Wiley M, Wiley JF. (2019) Advanced R statistical programming and data models: Analysis, machine learning, and visualization. Apress Media LLC.
[26] Confusion Matrix for Your Multi-Class Machine Learning Model. Available: https://towardsdatascience.com/confusion-matrix-for-your-multi-class-machine-learning-model-ff9aa3bf7826.
[27] Hernandez-Suarez A, Sanchez-Perez G, Toscano-Medina K, Martinez-Hernandez V, Sanchez V, Perez-Meana H. (2018) A Web Scraping Methodology for Bypassing Twitter API Restrictions, pp. 1-7.
[28] Haniewicz K, Rutkowski W, Adamczyk M, Kaczmarek M. (2013) Towards the Lexicon-Based Sentiment Analysis of Polish Texts: Polarity Lexicon.
[29] Attribute-Relation File Format (ARFF). Available: https://www.cs.waikato.ac.nz/~ml/weka/arff.html.
[30] Handling imbalanced datasets in machine learning - Towards Data Science. Available: https://towardsdatascience.com/handling-imbalanced-datasets-in-machine-learning-7a0e84220f28.
[31] Longadge R, Dongre S. (2013) Class Imbalance Problem in Data Mining Review, Int. J. Comput. Sci. Netw., vol. 2, no. 1.