Journal of Applied Engineering and Technological Science Vol 4(2) 2023: 1012-1021

Classification of Multiple Emotions in Indonesian Text Using the K-Nearest Neighbor Method

Ahmad Zamsuri1*, Sarjon Defit2, Gunadi Widi Nurcahyo3
Faculty of Computer Science, Universitas Lancang Kuning, Indonesia1
Faculty of Computer Science, Universitas Putra Indonesia "YPTK" Padang, Indonesia1,2,3
ahmadzamsuri@unilak.ac.id1, sarjon_defit@upiyptk.ac.id2, gunadiwidi@yahoo.co.id3
Received: 12 April 2023, Revised: 10 June 2023, Accepted: 12 June 2023
*Corresponding Author

ABSTRACT
Emotions are expressions manifested by individuals in response to what they see or experience. In this study, emotions were examined through individuals' tweets about the 2024 election issues in Indonesia. The collected tweets were labeled with emotions using the emotion wheel, which consisted of six categories: joy, love, surprise, anger, fear, and sadness. After labeling, the texts were weighted using TF-IDF (Term Frequency-Inverse Document Frequency) and Bag-of-Words (BoW) techniques. The models were then evaluated using the K-Nearest Neighbor (KNN) algorithm with three data-splitting ratios: 80:20, 70:30, and 60:40. Accuracy was first calculated for the six-label models; the labels were then merged into positive and negative categories, and the modeling was repeated using the same process as for the six labels. The results show that TF-IDF outperformed BoW. The highest accuracy was achieved with the 80:20 data-splitting ratio: 58% for the six-label classification and 79% for the two-label classification.
Keywords: Emotions, TF-IDF, BoW, KNN, Data Splitting

1. Introduction
Emotions are psychological states that involve feelings and mental states that can arise in response to certain stimuli or experiences (Pace-Schott et al., 2019). Emotions involve complex subjective experiences and can influence a person's behaviour, perception, and physical responses. Emotions vary and include feelings such as joy, sadness, anger, fear, love, disgust, and many more (Gu et al., 2019). Emotions can be expressed through comments on an article or by expressing opinions on social media about the subject being discussed (Graciyal & Viswam, 2021).
Research related to emotions has been discussed extensively by previous researchers. One line of work analyzes emotions using sentiment analysis (Chenna et al., 2021). Sentiment analysis is a method in natural language processing used to determine and analyze the sentiment or emotional attitude in text or data (Anam et al., 2022). Its main goal is to identify whether a text contains positive, negative, or neutral sentiment (Iglesias & Moreno, 2020). In this study, however, sentiment analysis is based on emotion labels, as done by previous researchers using labels such as anger, anticipation, disgust, fear, joy, love, optimistic, pessimistic, sad, surprise, and trust (Alturayeif & Luqman, 2021). Other studies discuss emotions using labels such as neutral, worry, happiness, sadness, love, surprise, fun, relief, empty, enthusiasm, boredom, and anger (Kiran Kumar & Kumar, 2021). In conducting sentiment analysis on emotions, previous research has used various machine learning algorithms.
The algorithms used include K-Nearest Neighbor (KNN) (Atika Sari & Hari Rachmawanto, 2022; Sarimole & Rosiana, 2022), Support Vector Machine (SVM) (Gunawan et al., 2022), Naïve Bayes (Samsir et al., 2021; Jaya & Wahyudi, 2022), and others. This study uses the KNN algorithm because it has achieved high accuracy, above 78%, in previous emotion-oriented sentiment analyses. A study by Pamuji (2021) achieved an accuracy of 88%, Satyanarayana et al. (2021) obtained a relatively high accuracy of 94.06%, and Hustinawaty et al. (2019) achieved an accuracy of 83%. Most previous studies used only a single data split and compared several algorithms; for example, Saifullah et al. (2021) used KNN, Bernoulli, Decision Tree, SVM, Random Forest, and XGBoost, while Kang et al. (2012) used SVM and Naïve Bayes. In this study, only one algorithm is used, and three data splits are compared, namely 80:20, 70:30, and 60:40.
To obtain good accuracy, this study applies data preprocessing steps consisting of data cleaning, case folding, tokenizing, filtering, stemming, and transformation (Kusumawati et al., 2022). It also employs word weighting using TF-IDF. Term Frequency-Inverse Document Frequency (TF-IDF) is an algorithmic method used to calculate the weight of each commonly used word; it is known for its efficiency, simplicity, and accurate results. It calculates the Term Frequency (TF) and Inverse Document Frequency (IDF) values for each token (word) in each document within the corpus. Simply put, TF-IDF weights how important a word is to a document relative to the rest of the corpus (Putra et al., 2022). Besides TF-IDF, this research also uses the Bag-of-Words (BoW) approach. BoW is one of the simplest methods for converting text data into vectors that a computer can process; essentially, it only counts the frequency of word occurrences across the documents (Juluru et al., 2021). In the BoW approach, documents are treated as "bags" of words, i.e., the sets of words present in each document. In this representation, information about word order and sentence structure is lost, and only word-occurrence frequency is retained (Kowsari et al., 2019).
After word weighting, the next step is modeling with six labels: joy, love, surprise, anger, fear, and sadness. In addition, this research also conducts testing with two labels, positive and negative, obtained by merging the six labels: joy and love fall under the positive label, while surprise, anger, fear, and sadness fall under the negative label. A sketch of this label merging is given below.
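To make the label merging concrete, the short Python sketch below maps the six emotion labels onto the two polarity labels. The dictionary, function name, and example labels are illustrative assumptions, not artifacts of the study's dataset.

```python
# Minimal sketch of the 6-label to 2-label merging (hypothetical helper names).
EMOTION_TO_POLARITY = {
    "joy": "positive",
    "love": "positive",
    "surprise": "negative",
    "anger": "negative",
    "fear": "negative",
    "sadness": "negative",
}

def merge_labels(emotion_labels):
    """Map six-class emotion labels onto the two-class positive/negative labels."""
    return [EMOTION_TO_POLARITY[label] for label in emotion_labels]

print(merge_labels(["joy", "fear", "sadness"]))  # ['positive', 'negative', 'negative']
```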
2. Literature Review
Research related to emotions has been conducted extensively in previous studies. Table 1 provides an overview of studies that have investigated emotions.

Table 1 - Previous studies related to emotions
No | Researcher | Algorithm | Feature Extraction | Labels | Accuracy
1 | Fernandes et al. (2020) | KNN | MFCC | Happy, Sad, Anger, Fear, Normal | Happy: 86%; Sad: 81%; Anger: 85%; Fear: 78%; Normal: 90%
2 | Ramdani et al. (2022) | Multinomial Naïve Bayes and Decision Tree | CountVectorizer and TF-IDF | Positive and negative | Multinomial Naïve Bayes: 83%; Decision Tree: 82%
3 | Santhosh Baboo & Amirthapriya (2022) | SVM, NB, RF, LR, SGB | TF-IDF | Positive and negative | SVM: 95%; NB: 87%; RF: 93%; LR: 92%; SGB: 90%
4 | Sajib et al. (2019) | NB, SVM, LR | Word frequency, word weight, term document, word n-gram | Positive and negative | Naïve Bayes: 79.3%; SVM: 82.3%; LR: 74.9%
5 | Sailunaz & Alhajj (2019) | SVM, Random Forest, Naïve Bayes | NAVA (Noun, Adverb, Verb and Adjective) | Guilt, Joy, Shame, Fear, Sadness, Disgust | Highest accuracy 43% with Naïve Bayes
6 | Chenna et al. (2021) | SVM, KNN, Decision Tree, Naïve Bayes | - | Happy, Angry, Sad, Surprise, Fear | SVM: 86%; KNN: 87%; Decision Tree: 89%; Naïve Bayes: 93%

The research conducted by Fernandes et al. (2020) performed sentiment analysis on emotions by categorizing them into 5 labels; the modeling achieved label-specific accuracies such as 86% for 'happy' and 81% for 'sad'. Other studies, by Ramdani et al. (2022), Santhosh Baboo and Amirthapriya (2022), and Sajib et al. (2019), examined emotions using 2 labels, positive and negative, with accuracies ranging from 79% to 93%. However, the research of Sailunaz and Alhajj (2019), which used more than 2 labels (guilt, joy, shame, fear, sadness, and disgust), obtained a relatively low accuracy of 43%. The present research utilizes the K-Nearest Neighbor (KNN) algorithm, chosen for its high accuracy in previous studies; for instance, Chenna et al. (2021) achieved 87% accuracy, while Suprayogi obtained 85%. Based on these studies, the current research applies the KNN algorithm to analyze emotions with 6 labels.

3. Research Methods
Figure 1 illustrates the flowchart of the methodology used in this research.
Fig. 1. Flowchart of the Methodology
The dataset used in this research consists of tweets extracted from Twitter, specifically on the topic of pilpres2024 (the 2024 Indonesian presidential election), and comprises 1,649 instances. These data were labeled with emotions derived from an emotion wheel. An emotion wheel is a visual model used to depict various human emotions; it places emotions in a circular diagram consisting of different sectors or categories and helps identify and describe the nuances and variations of the emotions people experience. After labeling, the data were preprocessed with the following steps.
a. Data Cleaning: Data cleaning is a procedure to ensure the correctness, consistency, and usability of the data in a dataset. It involves detecting errors or corruption in the data and then fixing or removing the affected data if necessary (Angloher et al., 2023).
b. Case Folding: Case folding is the process of converting all text to the same case, either lowercase or uppercase (Fauzi, 2019).
c. Tokenizing: Tokenizing is the process of splitting the text into separate word tokens, delimited by spaces or other characters (Friedman, 2023).
d. Filtering: Filtering, also known as stop-word removal, is a preprocessing step that eliminates conjunctions, connectors, and other common words, so that only important words are retained (Madhavan et al., 2021).
e. Stemming: Stemming reduces the number of distinct word indexes in the data by returning a word with suffixes or prefixes to its base form, thereby grouping words that share the same base and meaning but differ in affixes. The NLTK library provides modules for stemming, including Porter, Lancaster, WordNet, and Snowball (Rifai & Winarko, 2019).
f. Transformation: Transformation is a step in data preprocessing used to convert the original data into a form more suitable or useful for machine-learning modeling. Its goal is to improve data quality, reduce bias, enhance understanding, or improve model performance (Awan et al., 2018).
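The sketch below strings these six steps into a single Python function. It is only an outline under simplifying assumptions: the stop-word list is a tiny illustrative sample rather than a complete Indonesian list, and the stemmer is left as a plug-in argument (the NLTK stemmers named above target English, so an Indonesian stemmer would normally be substituted); it is not the authors' exact pipeline.

```python
import re

# Illustrative (incomplete) Indonesian stop-word list; a real pipeline would load a full list.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "untuk", "pada", "dengan", "ini", "itu"}

def preprocess(tweet, stemmer=None):
    """Apply the six preprocessing steps described above to a single tweet."""
    # a. Data cleaning: drop URLs, mentions, hashtags, digits, and punctuation.
    text = re.sub(r"https?://\S+|@\w+|#\w+", " ", tweet)
    text = re.sub(r"[^a-zA-Z\s]", " ", text)
    # b. Case folding: normalize everything to lowercase.
    text = text.lower()
    # c. Tokenizing: split the text into word tokens on whitespace.
    tokens = text.split()
    # d. Filtering (stop-word removal): keep only informative words.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # e. Stemming: reduce words to their base form if a stemmer object is supplied
    #    (an Indonesian stemmer would be plugged in here; omitted to keep the sketch dependency-free).
    if stemmer is not None:
        tokens = [stemmer.stem(t) for t in tokens]
    # f. Transformation: rejoin tokens into one cleaned string for the vectorizers
    #    (one possible reading of the transformation step).
    return " ".join(tokens)

print(preprocess("Senang sekali dengan debat #pilpres2024 di https://t.co/xyz!"))
```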
The next step is word weighting, applied to the cleaned data produced by the preprocessing stage, using TF-IDF and, in a separate configuration, the Bag-of-Words (BoW) technique. Word weighting, or feature extraction, is employed to improve accuracy; this research compares which feature extraction performs best for both the 6-label and 2-label classifications using the KNN algorithm. Modeling is then conducted with the KNN algorithm using three data-splitting ratios.
KNN is chosen for its simplicity and ease of implementation, as it is a straightforward algorithm to understand (Uddin et al., 2022). The basic concept of KNN is to assign a data point the class that appears most frequently among its K nearest neighbors, which makes it relatively easy to implement without many tuning parameters (Pamuji, 2021). Furthermore, KNN is a non-parametric algorithm: it makes no specific assumptions about the data distribution (Wang et al., 2020), which allows it to perform reasonably on data without clear patterns or distributions. KNN also tolerates noise fairly well, since points contaminated by noise or outliers do not drastically change a classification that relies on the majority vote of the nearest neighbors. A sketch of this modeling stage is given below.
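The following is a minimal sketch of the modeling stage, assuming scikit-learn. The toy corpus is a placeholder for the preprocessed tweets, and the value of K is left at the library default because the paper does not report K or other hyperparameters.

```python
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

def evaluate(texts, labels, vectorizer, test_size):
    """Train and evaluate KNN for one feature extractor and one train/test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=42, stratify=labels)
    # Fit the vectorizer on the training texts only, then train KNN on top of it.
    model = make_pipeline(vectorizer, KNeighborsClassifier())  # default K (not reported in the paper)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))

# Hypothetical toy corpus standing in for the 1,649 preprocessed tweets and their labels.
texts = ["senang sekali hasil debat", "cinta damai pemilu", "takut ada kecurangan",
         "marah dengan berita hoaks", "sedih melihat kampanye", "kaget hasil survei"] * 2
labels = ["positive", "positive", "negative", "negative", "negative", "negative"] * 2

for name, make_vec in [("TF-IDF", TfidfVectorizer), ("BoW", CountVectorizer)]:
    for test_size in (0.2, 0.3, 0.4):  # the 80:20, 70:30, and 60:40 splits compared here
        acc = evaluate(texts, labels, make_vec(), test_size)
        print(f"{name:6s} split {1 - test_size:.0%}:{test_size:.0%}  accuracy = {acc:.2f}")
```

With the toy corpus replaced by the actual preprocessed tweets and their six (or two) labels, this loop covers the experimental grid reported in the next section: two feature extractors times three data splits.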
4. Results and Discussions
KNN 6 Label Using TF-IDF
Figure 2 presents the accuracy obtained from the modeling with data-splitting ratios of 60:40, 70:30, and 80:20.
Fig. 2. The accuracy results of KNN 6 Label using TF-IDF
In Figure 2, the highest accuracy is obtained with the 80:20 data split, namely 58%. A previous study that also used KNN with TF-IDF to analyze five emotion labels (anger, happiness, sadness, love, and fear) reached an accuracy of 51%, below the result obtained in this study (Nugroho et al., 2022). Another study performed sentiment analysis on emotions but calculated the accuracy for each label separately, yielding accuracies above 90% (Kaur & Bhardwaj, 2019). Per-label accuracy therefore tends to be higher than the overall accuracy when TF-IDF is used as the feature extraction method.
KNN 6 Label Using BoW
The next test was conducted on the 6 labels using Bag-of-Words (BoW). Figure 3 presents the accuracy obtained using KNN with BoW for the 6 labels.
Fig. 3. The accuracy results of KNN 6 Label Using BoW
The results from KNN with BoW show a decrease in accuracy. The highest accuracy in this test is again obtained with the 80:20 data split, namely 54%, a 4% decrease compared to KNN with TF-IDF.
The Comparison of Model Evaluation Using 6 Labels
Figure 4 shows that the accuracy of the KNN algorithm with TF-IDF feature extraction is better than with BoW.
Fig. 4. The Comparison of Model Evaluation Using 6 Labels (accuracy, precision, recall, and F1 for the 60:40, 70:30, and 80:20 splits with TF-IDF and BoW)
KNN 2 Label Using TF-IDF
Figure 5 presents the accuracy of TF-IDF with KNN using 2 labels.
Fig. 5. The accuracy results of KNN 2 Label using TF-IDF
From Figure 5, the 80:20 data split again achieves the highest accuracy, reaching 79%. Using 2 labels also improves accuracy substantially compared to using 6 labels. By comparison, Alzami et al. (2020) obtained an accuracy of only 50.3%, while Junadhi et al. (2022) achieved 56%. Another study obtained 69.68% in analyzing Arabic tweets (Aloqaily et al., 2020), and Ritha et al. (2023) obtained 62% using a dataset of 1,600 instances, similar in size to the dataset used in this study.
KNN 2 Label Using BoW
The evaluation of the model using KNN with BoW feature extraction and 2 labels is shown in Figure 6.
Fig. 6. The accuracy results of KNN 2 Label using BoW
From Figure 6, there is a decrease in accuracy of approximately 2% for the 80:20 data split. The highest accuracy achieved is 77%, which is still higher than the 70:30 and 60:40 splits. Other studies that also used KNN with BoW obtained accuracies of 52% (Mujahid et al., 2021) and 64% (Alzami et al., 2020).
The Comparison of Model Evaluation Using 2 Labels
In Figure 7, the TF-IDF accuracy with the 80:20 data split again outperforms the others, as in the 6-label experiment.
Fig. 7. The Comparison of Model Evaluation Using 2 Labels (accuracy, precision, recall, and F1 for the 60:40, 70:30, and 80:20 splits with TF-IDF and BoW)
In previous studies, several researchers have improved the KNN algorithm. For instance, Alzami et al. (2020) achieved 70% accuracy with KNN by using a hybrid of TF-IDF and Word2Vec (W2V) features. Another study improved KNN with a BoW ensemble feature, raising KNN accuracy to 96% (Irfan et al., 2018). Apart from enhancing KNN through feature extraction, some researchers also used hybrid approaches with other algorithms. Rani and Singh Gill (2020) achieved 91% accuracy by combining a dictionary-based classifier with a stack-based ensemble of KNN, SVMRadial, and C5.0. Another study combined KNN with Decision Tree in a hybrid approach and obtained 80% accuracy (Khattak et al., 2021). Furthermore, other studies have developed KNN to improve accuracy in multiclass scenarios. Pandian and Balasubramani (2020) employed the FA-KNN hybrid approach to enhance multiclass classification, reaching 91% accuracy. Menaouer et al. (2022) compared several KNN ensembles (KNN Stacking, KNN Boosting, and KNN Bagging) on a 7-label classification task and found that KNN Bagging achieved the highest accuracy at 88.6%. From these studies, it can be concluded that such improvements, whether through hybrid feature extraction or additional algorithms, significantly influence the accuracy results; a sketch of one such ensemble, bagged KNN, is given below.
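As an illustration of the ensemble direction mentioned above, the sketch below wraps KNN in a bagging ensemble on top of TF-IDF features using scikit-learn. It demonstrates the general technique, not the cited authors' implementations; the number of estimators is an arbitrary choice.

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Bagging trains several KNN models on bootstrap samples of the training data
# and aggregates their votes, which can reduce the variance of a single KNN model.
bagged_knn = make_pipeline(
    TfidfVectorizer(),
    BaggingClassifier(KNeighborsClassifier(), n_estimators=10, random_state=42),
)
# Usage (with the texts/labels from the earlier sketch):
# bagged_knn.fit(texts, labels)
# print(bagged_knn.score(texts, labels))
```

A stacking variant can be built in the same way with scikit-learn's StackingClassifier, combining KNN with other base learners.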
5. Conclusion
Based on the discussion above, the researchers conclude that the KNN algorithm with an 80:20 data-splitting ratio yields the best accuracy. This holds for both feature extraction methods used, TF-IDF and BoW, and for both the 6-label and 2-label settings; when compared directly, TF-IDF outperforms BoW. Relative to previous studies that relied solely on the KNN algorithm, the results obtained in this research are better. KNN's accuracy can be enhanced further through hybrid approaches with other techniques such as boosting, bagging, SVM, and others; future research is therefore recommended to explore hybrid methods with KNN to improve on the accuracy obtained in this study.

References
Aloqaily, A., Al-hassan, M., Salah, K., Elshqeirat, B., Almashagbah, M., & Al Hussein Bin Abdullah, P. (2020). Sentiment Analysis for Arabic Tweets Datasets: Lexicon-Based and Machine Learning Approaches. Journal of Theoretical and Applied Information Technology, 29, 4. www.jatit.org
Alturayeif, N., & Luqman, H. (2021). Fine-grained sentiment analysis of Arabic COVID-19 tweets using BERT-based transformers and dynamically weighted loss function. Applied Sciences (Switzerland), 11(22). https://doi.org/10.3390/app112210694
Alzami, F., Udayanti, E. D., Prabowo, D. P., & Megantara, R. A. (2020). Document Preprocessing with TF-IDF to Improve the Polarity Classification Performance of Unstructured Sentiment Analysis. Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, 235–242. https://doi.org/10.22219/kinetik.v5i3.1066
Anam, M. K., Mahendra, M. I., Agustin, W., Rahmaddeni, & Nurjayadi. (2022). Framework for Analyzing Netizen Opinions on BPJS Using Sentiment Analysis and Social Network Analysis (SNA). Intensif, 6(1). https://doi.org/10.29407/intensif.v6i1.15870
Angloher, G., Banik, S., Bartolot, D., Benato, G., Bento, A., Bertolini, A., Breier, R., Bucci, C., Burkhart, J., Canonica, L., D’Addabbo, A., Di Lorenzo, S., Einfalt, L., Erb, A., Feilitzsch, F. v., Iachellini, N. F., Fichtinger, S., Fuchs, D., Fuss, A., … Waltenberger, W. (2023). Towards an automated data cleaning with deep learning in CRESST. European Physical Journal Plus, 138(1). https://doi.org/10.1140/epjp/s13360-023-03674-2
Atika Sari, C., & Hari Rachmawanto, E. (2022). Sentiment Analyst on Twitter Using the K-Nearest Neighbors (KNN) Algorithm Against Covid-19 Vaccination. Journal of Applied Intelligent System, 7(2), 135–145. https://doi.org/10.33633/jais.v7i2.6734
Awan, S. E., Bennamoun, M., Sohel, F., Sanfilippo, F. M., Chow, B. J., & Dwivedi, G. (2018). Feature selection and transformation by machine learning reduce variable numbers and improve prediction for heart failure readmission or death. PLoS ONE, 14(6). https://doi.org/10.1371/journal.pone.0218760
Chenna, A., Srinivas, B., & Nagaraju, S. (2021). Emotion and Sentiment Analysis from Twitter Text. Turkish Journal of Computer and Mathematics Education, 12(12), 4614–4620.
Fauzi, M. A. (2019). Word2Vec model for sentiment analysis of product reviews in Indonesian language. International Journal of Electrical and Computer Engineering (IJECE), 9(1), 525. https://doi.org/10.11591/ijece.v9i1.pp525-530
Fernandes, J. B., Bhargavi, Ch., Arshad, S., Kumar, S., & Sandeep, G. (2020). Emotion recognition in speech signals using optimization based multi-SVNN classifier. International Journal of Scientific & Technology Research, 9(1), 3998–4001.
Friedman, R. (2023). Tokenization in the Theory of Knowledge. Encyclopedia, 3(1), 380–386. https://doi.org/10.3390/encyclopedia3010024
Graciyal, D. G., & Viswam, D. (2021). Social Media and Emotional Well-being: Pursuit of Happiness or Pleasure. Asia Pacific Media Educator, 31(1), 99–115. https://doi.org/10.1177/1326365X211003737
Gu, S., Wang, F., Patel, N. P., Bourgeois, J. A., & Huang, J. H. (2019). A model for basic emotions using observations of behavior in Drosophila. Frontiers in Psychology, 10. https://doi.org/10.3389/fpsyg.2019.00781
Gunawan, L., Anggreainy, M. S., Wihan, L., Santy, Lesmana, G. Y., & Yusuf, S. (2022). Support vector machine based emotional analysis of restaurant reviews. International Conference on Computer Science and Computational Intelligence, 216, 479–484. https://doi.org/10.1016/j.procs.2022.12.160
Hustinawaty, Dwiputra, R. A. A., & Rumambi, T. (2019). Public Sentiment Analysis of Pasar Lama Tangerang Using K-Nearest Neighbor Method and Programming Language R. Jurnal Ilmiah Informatika Komputer, 24(2), 129–133. https://doi.org/10.35760/ik.2019.v24i2.2367
Iglesias, C. A., & Moreno, A. (2020). Sentiment Analysis for Social Media. Applied Sciences (Special Issue). www.mdpi.com/journal/applsci
Irfan, M. R., Fauzi, M. A., Tibyani, T., & Mentari, N. D. (2018). Twitter Sentiment Analysis on 2013 Curriculum Using Ensemble Features and K-Nearest Neighbor. International Journal of Electrical and Computer Engineering (IJECE), 8(6), 5409. https://doi.org/10.11591/ijece.v8i6.pp5409-5414
Juluru, K., Shih, H. H., Murthy, K. N. K., & Elnajjar, P. (2021). Bag-of-words technique in natural language processing: A primer for radiologists. Radiographics, 41(5), 1420–1426. https://doi.org/10.1148/rg.2021210025
Junadhi, Agustin, Rifqi, M., & Anam, M. K. (2022). Sentiment Analysis of Online Lectures using K-Nearest Neighbors based on Feature Selection. Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI), 11(3), 216–225. https://doi.org/10.23887/janapati.v11i3.51531
Kang, H., Yoo, S. J., & Han, D. (2012). Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews. Expert Systems with Applications, 39(5), 6000–6010. https://doi.org/10.1016/j.eswa.2011.11.107
Kaur, R., & Bhardwaj, V. (2019). Gurmukhi Text Emotion Classification System using TF-IDF and N-gram Feature Set Reduced using APSO. International Journal on Emerging Technologies, 10(3), 352–362. www.researchtrend.net
Khattak, A., Asghar, M. Z., Ishaq, Z., Bangyal, W. H., & Hameed, I. A. (2021). Enhanced concept-level sentiment analysis system with expanded ontological relations for efficient classification of user reviews. Egyptian Informatics Journal, 22(4), 455–471. https://doi.org/10.1016/j.eij.2021.03.001
Kiran Kumar, P., & Kumar, I. (2021). Emotion detection and sentiment analysis of text. Proceedings of the International Conference on Innovative Computing & Communication (ICICC), 1–4. https://ssrn.com/abstract=3884914
Kowsari, K., Meimandi, K. J., Heidarysafa, M., Mendu, S., Barnes, L., & Brown, D. (2019). Text classification algorithms: A survey. Information, 10(4), 150. https://doi.org/10.3390/info10040150
Kusumawati, N., Maspupah, U., Sari, D. R., Hamzah, A., Lukito, D., & Dwi Saputra, D. (2022). Comparing Algorithm for Sentiment Analysis in Healthcare and Social Security Agency (BPJS Kesehatan). Techno Nusa Mandiri: Journal of Computing and Information Technology, 19(1). https://doi.org/10.33480/techno.v19i1.3167
Madhavan, M. V., Pande, S., Umekar, P., Mahore, T., & Kalyankar, D. (2021). Comparative analysis of detection of email spam with the aid of machine learning approaches. IOP Conference Series: Materials Science and Engineering, 1022(1). https://doi.org/10.1088/1757-899X/1022/1/012113
Menaouer, B., Zahra, A. F., & Mohammed, S. (2022). Multi-Class Sentiment Classification for Healthcare Tweets Using Supervised Learning Techniques. International Journal of Service Science, Management, Engineering, and Technology, 13(1), 1–23. https://doi.org/10.4018/ijssmet.298669
Mujahid, M., Lee, E., Rustam, F., Washington, P. B., Ullah, S., Reshi, A. A., & Ashraf, I. (2021). Sentiment analysis and topic modeling on tweets about online education during COVID-19. Applied Sciences (Switzerland), 11(18). https://doi.org/10.3390/app11188438
Nugroho, K. S., Bachtiar, F. A., & Mahmudy, W. F. (2022). Detecting Emotion in Indonesian Tweets: A Term-Weighting Scheme Study. Journal of Information Systems Engineering and Business Intelligence, 8(1), 61–70. https://doi.org/10.20473/jisebi.8.1.61-70
Pace-Schott, E. F., Amole, M. C., Aue, T., Balconi, M., Bylsma, L. M., Critchley, H., Demaree, H. A., Friedman, B. H., Gooding, A. E. K., Gosseries, O., Jovanovic, T., Kirby, L. A. J., Kozlowska, K., Laureys, S., Lowe, L., Magee, K., Marin, M. F., Merner, A. R., Robinson, J. L., … VanElzakker, M. B. (2019). Physiological feelings. Neuroscience and Biobehavioral Reviews, 103, 267–304. https://doi.org/10.1016/j.neubiorev.2019.05.002
Pamuji, A. (2021). Performance of the K-Nearest Neighbors Method on Analysis of Social Media Sentiment. JUISI, 7(1), 32–37.
Pandian, M. N. R., & Balasubramani, M. (2020). An Efficient Hybrid Classification Algorithm for Heart Prediction in Data Mining. European Journal of Molecular & Clinical Medicine, 7(4), 1946–1954.
Putra, R. S., Agustin, W., Anam, M. K., Lusiana, L., & Yaakub, S. (2022). The Application of Naïve Bayes Classifier Based Feature Selection on Analysis of Online Learning Sentiment in Online Media. Jurnal Transformatika, 20(1), 44. https://doi.org/10.26623/transformatika.v20i1.5144
Ramdani, C. M. S., Rachman, A. N., & Setiawan, R. (2022). Comparison of the Multinomial Naive Bayes Algorithm and Decision Tree with the Application of AdaBoost in Sentiment Analysis Reviews PeduliLindungi Application. International Journal of Information System & Technology, 6(4), 419–430. https://doi.org/10.30645/ijistech.v6i4.257
Rani, S., & Singh Gill, N. (2020). Hybrid Model for Twitter Data Sentiment Analysis Based on Ensemble of Dictionary-Based Classifier and Stacked Machine Learning Classifiers - SVM, KNN and C5.0. Journal of Theoretical and Applied Information Technology, 29, 4. www.jatit.org
Rifai, W., & Winarko, E. (2019). Modification of Stemming Algorithm Using a Non-Deterministic Approach to Indonesian Text. IJCCS (Indonesian Journal of Computing and Cybernetics Systems), 13(4), 379. https://doi.org/10.22146/ijccs.49072
Ritha, N., Hayaty, N., Matulatan, T., Uperiati, A., Rathomi, M., Bettiza, M., & Farasalsabila, F. (2023). Sentiment Analysis of Health Protocol Policy Using K-Nearest Neighbor and Cosine Similarity. ICSEDTI, 1–9. https://doi.org/10.4108/eai.11-10-2022.2326274
Saifullah, S., Fauziyah, Y., & Aribowo, A. S. (2021). Comparison of machine learning for sentiment analysis in detecting anxiety based on social media data. Jurnal Informatika, 15(1), 45. https://doi.org/10.26555/jifo.v15i1.a20111
Sailunaz, K., & Alhajj, R. (2019). Emotion and sentiment analysis from Twitter text. Journal of Computational Science, 36, 1–18. https://doi.org/10.1016/j.jocs.2019.05.009
Sajib, M. I., Shargo, S. M., & Hossain, Md. A. (2019). Comparison of the efficiency of Machine Learning algorithms on Twitter Sentiment Analysis of Pathao. International Conference on Computer and Information Technology, 1–6. https://doi.org/10.1109/ICCIT48885.2019.9038208
Samsir, Irmayani, D., Edi, F., Harahap, J. M., Jupriaman, Rangkuti, R. K., Ulya, B., & Watrianthos, R. (2021). Naives Bayes Algorithm for Twitter Sentiment Analysis. Journal of Physics: Conference Series, 1933(1). https://doi.org/10.1088/1742-6596/1933/1/012019
Santhosh Baboo, S., & Amirthapriya, M. (2022). Sentiment Analysis and Automatic Emotion Detection Analysis of Twitter Using Machine Learning Classifiers. International Journal of Mechanical Engineering, 7(2), 1161–1171.
Sarimole, F. M., & Rosiana, A. (2022). Classification of Maturity Levels in Areca Fruit Based on HSV Image Using the KNN Method. Journal of Applied Engineering and Technological Science (JAETS), 4(1), 64–73. https://doi.org/10.37385/jaets.v4i1.951
Satyanarayana, K., Shankar, D., & Raju, D. (2021). An Approach for Finding Emotions Using Seed Dataset with KNN Classifier. Turkish Journal of Computer and Mathematics Education, 12(10).
Uddin, S., Haque, I., Lu, H., Moni, M. A., & Gide, E. (2022). Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Scientific Reports, 12(1). https://doi.org/10.1038/s41598-022-10358-x
Jaya, R. T., & Wahyudi, T. (2022). Classification of Booster Vaccination Symptoms Using Naive Bayes Algorithm and C4.5. Journal of Applied Engineering and Technological Science (JAETS), 4(1), 131–138. https://doi.org/10.37385/jaets.v4i1.941
Wang, Y., Zhang, Y., Lu, Y., & Yu, X. (2020). A Comparative Assessment of Credit Risk Model Based on Machine Learning - a Case Study of Bank Loan Data. Procedia Computer Science, 174, 141–149. https://doi.org/10.1016/j.procs.2020.06.069