aspect-based sentiment analysis on amazon product reviews

muhammad abubakar*, amir shahzad, husna abbasi
comsats university islamabad, abbottabad campus, pakistan
*corresponding email: abubakarhameedch@gmail.com

abstract

this paper focuses on amazon product reviews. the goal of this study is to compare two natural language processing (nlp) approaches for evaluating the sentiment of amazon product reviews. customers can learn about a product's quality by reading reviews. several review characteristics, such as quality, time of evaluation, material in terms of product lifespan, and excellent past customer feedback, have an impact on product rankings. analysing these reviews manually is not only time-consuming but also prone to errors. as a result, automatic models and procedures are required to manage product reviews effectively. nlp is the most practical method for training a neural network in this era of artificial intelligence. first, the naive bayes classifier was used to analyse consumer sentiment in this study. the support vector machine (svm) then categorized user sentiments into binary categories. the goal of the approach is to predict some of the most important aspects of amazon product reviews and then analyse customer attitudes towards these aspects. the suggested model is validated using a large-scale real-world dataset gathered specifically for this purpose. the dataset is made up of thousands of manually annotated product reviews gathered from amazon. after the input was passed through the network model, term frequency (tf) and inverse document frequency (idf) pre-processing methods were used to evaluate the features. the resulting precision, recall, and f1 scores are very promising.
article info

article history: received 18 dec 2021; revised 20 dec 2021; accepted 25 dec 2021; available online 26 dec 2021

keywords: naïve bayes, text classification algorithms, natural language processing, support vector machines, nlp, svm

international journal of informatics information system and computer engineering 2(2) (2021) 94-99

1. introduction

amazon is the largest online retailer in the world, as well as a significant cloud computing service provider (rain, 2013). the company began as a bookseller but has since expanded to include a wide range of consumer items and digital media, as well as its own electronic devices such as the kindle e-reader, the kindle fire tablet, and fire tv, a streaming media adaptor. people nowadays prefer to trade things on an e-commerce website rather than at a physical store because of the time savings and convenience (bhatt et al., 2015). before purchasing a product, it is usual practice to read the product reviews. reviews sway the consumer's opinion of the product either positively or negatively. reading thousands of reviews, on the other hand, is not humanly practical. even in this era of ever-improving natural language processing algorithms, it takes time to wade through hundreds of comments to identify a product, whereas the polarity of the reviews in a specific category can be used to assess a product's popularity among consumers all around the world. this project aims to categorize customers' positive and negative product reviews, as well as to construct a supervised learning model that determines the polarity of a wide range of reviews. according to an amazon study from last year, 88% of online customers trust reviews as much as a personal recommendation (dey et al., 2020).
a high number of positive reviews makes a powerful statement that establishes the credibility of an online product. the absence of reviews for books or any other item on the internet creates a sense of distrust among potential customers. pre-processing is used in this study to reduce the dimensionality of the features of the multi-domain sentiment dataset. following that, any words whose frequency exceeds a certain threshold value are treated as features (haque et al., 2018).

2. related work

this section presents the results of classic polarity analysis schemes based on user reviews on the amazon e-commerce website (xiao et al., 2021). the criteria for compositional sentiment were set by zhang et al. to find out how much sentiment a text contains. the system makes explicit use of machine learning. in that work, film reviews were classified into binary classes using support vector machine (svm) and naive bayes classifiers (joseph, 2020). the accuracy of the naive bayes model has been improved, while the svm model has been extended. to summarize, there have been no studies comparing svm with the naive bayes classifier. a comparison of two nlp approaches for analyzing the sentiment of amazon product reviews is presented in this study (more et al., 2020). comparative polarity analysis using existing machine learning algorithms has also been carried out to evaluate the sentiment of amazon product reviews (karthikayini et al., 2017). dadhich uses a hybrid rule-based approach to create an automatic comment analyzer (dadhich et al., 2022). salmony also conducted a survey of sentiment analysis on amazon product reviews to assist customer decision making (salmony et al., 2021).

3. methodology

amazon, as the numerous reviews available show, is one of the most well-known e-commerce companies. the dataset was unlabelled, so it needed to be labelled before it could be used in a supervised learning model (pandey, 2019). only amazon product feedback, specifically book feedback, was used for this study. to evaluate polarity, about 147,000 book reviews were analyzed. data collection was completed as the first step in the data labelling process. manual labelling is impractical because the dataset contains a high number of reviews. term frequency (tf) and inverse document frequency (idf), elimination of relevant nouns, and frequent-noun identification methods were used to extract the dataset's features (jagdale et al., 2019).

tf-idf: tf-idf is a retrieval strategy that considers the term frequency (tf) as well as the inverse document frequency (idf). tf and idf scores are assigned to each word or phrase. the product of a term's tf and idf scores is the tf-idf weight of that term. the tf of a word represents how often it occurs, whereas the idf measures how rare the term is across the documents of the corpus. content whose words have a high tf-idf weight tends to rank among the top search results, which makes it possible to avoid stop words while also effectively locating words with a higher search volume but a lower level of competition (fang, 2015).

4. results and discussion

the purpose of this part is to assess the experiment's performance. evaluation metrics are important in determining classification efficiency, and assessing accuracy is the easiest way to do so. the system is assessed using three widely used statistical measures, precision, recall, and the f-measure, all of which are derived from a confusion matrix. the confusion matrix is divided into four categories: true positive, true negative, false positive, and false negative (see figures 1 and 2). true positive describes a situation in which the system accurately predicts the positive class.
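as an illustration of how the measures named above follow from a confusion matrix, the sketch below recomputes accuracy, precision, recall, and f1 score for the naive bayes confusion matrix reported later in this section. it is a minimal sketch, not the authors' code: the function name `per_class_metrics` is hypothetical, and it assumes that rows are true classes and columns are predicted classes, which is consistent with the row sums matching the reported support values (24, 39, 937).

```python
def per_class_metrics(cm):
    """Return (accuracy, report) for a square confusion matrix.

    Assumption: cm[i][j] counts samples of true class i predicted as class j.
    report[k] is a (precision, recall, f1) tuple for class k.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    report = []
    for k in range(n):
        tp = cm[k][k]                               # true positives for class k
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, actually another class
        fn = sum(cm[k]) - tp                        # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report.append((precision, recall, f1))
    return correct / total, report


# Naive Bayes confusion matrix as reported in this section.
nb_cm = [[0, 0, 24],
         [0, 0, 39],
         [0, 0, 937]]

accuracy, report = per_class_metrics(nb_cm)
for label, (p, r, f) in enumerate(report):
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.3f}")
```

for class 2 this gives a precision of 0.937 and a recall of 1.0, in line with the reported 0.94 and 1.00. classes 0 and 1 never receive a prediction, so their scores are zero, which is why the high accuracy hides poor performance on the minority classes.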
false positive describes a situation in which the system predicts the positive class inaccurately. the support vector machine (svm) and naive bayes confusion matrices are shown in tabular form, and a separate tabular format is used to display both the statistical measures and the nlp results (table 1).

table 1. svm confusion matrix

class      count
positive   3694
neutral    158
negative   90

in the training dataset, 3694 (~93.7%) sentiments are labelled as positive, 158 (~4.0%) as neutral, and 90 (~2.3%) as negative, so this is an imbalanced classification problem.

naive bayes

[[  0   0  24]
 [  0   0  39]
 [  0   0 937]]

              precision    recall  f1-score   support
0                  0.00      0.00      0.00        24
1                  0.00      0.00      0.00        39
2                  0.94      1.00      0.97       937
micro avg          0.94      0.94      0.94      1000
macro avg          0.31      0.33      0.32      1000
weighted avg       0.88      0.94      0.91      1000

accuracy: 93.7

precision is the ratio of correctly predicted positive cases to all cases predicted as positive: precision = tp / (tp + fp).

tf/idf vectorizer and logistic regression for under-sampled data

[[ 10   6   8]
 [ 15   7  17]
 [314 195 428]]

              precision    recall  f1-score   support
0                  0.03      0.42      0.06        24
1                  0.03      0.18      0.06        39
2                  0.94      0.46      0.62       937
micro avg          0.45      0.45      0.45      1000
macro avg          0.34      0.35      0.24      1000
weighted avg       0.89      0.45      0.58      1000

accuracy: 44.5

figure 1. true and false positive rate for under-sampled data (plot title: characteristic of logistic regression of under sampled data)

tf/idf and logistic regression for over-sampled data

[[ 13   3   8]
 [ 10  10  19]
 [214 171 552]]

              precision    recall  f1-score   support
0                  0.05      0.54      0.10        24
1                  0.05      0.26      0.09        39
2                  0.95      0.59      0.73       937
micro avg          0.57      0.57      0.57      1000
macro avg          0.35      0.46      0.31      1000
weighted avg       0.90      0.57      0.69      1000

accuracy: 57.5

logistic regression on over-sampled data performs better than on under-sampled data.

figure 2. true and false positive rate for over-sampled data (plot title: characteristic of logistic regression of over sampled data)

neural network

[[  9   2  13]
 [  0  12  27]
 [  2   8 927]]

              precision    recall  f1-score   support
0                  0.82      0.38      0.51        24
1                  0.55      0.31      0.39        39
2                  0.96      0.99      0.97       937
micro avg          0.95      0.95      0.95      1000
macro avg          0.77      0.56      0.63      1000
weighted avg       0.94      0.95      0.94      1000

using class weights does not improve the performance.

5. conclusion

in order to investigate the polarity of amazon product ratings, this study compared svm and naive bayes classifiers. following the pre-processing step, almost 2250 features and over 6000 data samples were used to train the models. the svm classifier in this system has a precision of 0.00 percent, a recall of 0.00 percent, and an f1 score of 0.00 percent. the svm and naive bayes models yield 93.7 percent accuracy, which is confirmed to be superior to traditional approaches. according to the findings of the experiments, the svm can determine the polarity of amazon product feedback with a higher accuracy rate.

references

bhatt, a., patel, a., chheda, h., & gawande, k. (2015). amazon review classification and sentiment analysis. international journal of computer science and information technologies, 6(6), 5107-5110.
dadhich, a., & thankachan, b. (2022). sentiment analysis of amazon product reviews using hybrid rule-based approach. in smart systems: innovations in computing (pp. 173-193). springer, singapore.
dey, s., wasif, s., tonmoy, d. s., sultana, s., sarkar, j., & dey, m. (2020, february). a comparative study of support vector machine and naive bayes classifier for sentiment analysis on amazon product reviews. in 2020 international conference on contemporary computing and applications (ic3a) (pp. 217-220). ieee.
fang, x., & zhan, j. (2015). sentiment analysis using product review data. journal of big data, 2(1), 1-14.
haque, t. u., saber, n. n., & shah, f. m. (2018, may). sentiment analysis on large scale amazon product reviews.
in 2018 ieee international conference on innovative research and development (icird) (pp. 1-6). ieee.
jagdale, r. s., shirsat, v. s., & deshmukh, s. n. (2019). sentiment analysis on product reviews using machine learning techniques. in cognitive informatics and soft computing (pp. 639-647). springer, singapore.
joseph, r. p. s. (2020). amazon reviews sentiment analysis: a reinforcement learning approach (doctoral dissertation, ms thesis, griffith college dublin, ireland).
karthikayini, t., & srinath, n. k. (2017, december). comparative polarity analysis on amazon product reviews using existing machine learning algorithms. in 2017 2nd international conference on computational systems and information technology for sustainable solution (csitss) (pp. 1-6). ieee.
more, g., behara, h., & suresha, a. m. (2020). sentiment analysis on amazon product reviews with stacked neural networks. no. october.
pandey, p., & soni, n. (2019, february). sentiment analysis on customer feedback data: amazon product reviews. in 2019 international conference on machine learning, big data, cloud and parallel computing (comitcon) (pp. 320-322). ieee.
rain, c. (2013). sentiment analysis in amazon reviews using probabilistic machine learning. swarthmore college.
salmony, m. y. a., & faridi, a. r. (2021, april). supervised sentiment analysis on amazon product reviews: a survey. in 2021 2nd international conference on intelligent engineering and management (iciem) (pp. 132-138). ieee.
xiao, y., qi, c., & leng, h. (2021, march). sentiment analysis of amazon product reviews based on nlp. in 2021 4th international conference on advanced electronic materials, computers and software engineering (aemcse) (pp. 1218-1221). ieee.

international journal of informatics information system and computer engineering 3(1) (2022) 71-79

bts application: online thesis consultation

bella hardiyana
school of information science, japan advanced institute of science and technology, japan
*corresponding email: bella.hardiayana@email.unikom.ac.id

abstract

the learning process at universities has been hindered by the covid-19 pandemic, so activities that should be carried out face to face must be done online. one of the activities affected is thesis consultation. thesis consultation should be done in person and verified on the attendance card, but it can no longer be carried out as usual. the consultation can only be done online, by sending files to the supervisor, who then reviews the work. however, when the consultation is done online, it is not easy to fill out the attendance card, which must be signed in person. the signing process also moves online: the digital version of the attendance card is sent to the supervisor, signed by the supervisor, and sent back to the student. the purpose of this research is to design a thesis consultation information system in which everything is centralized and documented in one platform. this research used a qualitative descriptive analysis method and a system development method using prototypes. the results showed that the system design that was built can serve as a medium for exchanging files between supervisors and students. the output of this system is documented. the history of each consultation carried out is recorded so that the attendance card is filled in automatically.
article info

article history: received 25 may 2022; revised 30 may 2022; accepted 10 june 2022; available online 26 june 2022

keywords: technology, information system, computer science, application, assignment, thesis

1. introduction

the coronavirus is the cause of the covid-19 pandemic, which is currently spreading in various countries (skovlund et al., 2021). it causes an infectious disease that spreads very quickly, with human fluids as the medium of transmission (singhal, 2020). the virus targets people indiscriminately and tends to be more dangerous for the elderly and for people with a history of severe illness (etard et al., 2020). with the outbreak of covid-19, many activities have been hindered, if not stopped. covid-19 has paralyzed national economies, interrupted the distribution of logistics, and hindered the education sector, because the virus spreads very quickly through physical contact and fluids (alagu et al., 2021). this has affected many sectors of daily life, one of which is education. offline learning activities must move online to avoid the spread of the virus, so all students are forced to adapt to the new method. higher education is also constrained, for example in carrying out the final project. the final project involves a consultation process that is usually carried out face to face by lecturers and their student groups, but it is now constrained by the policy of limiting social activities, so everyone must connect through an online system (kintama et al., 2021).
therefore, to solve these problems, it is necessary to build a medium that connects lecturers and students in the consultation process and can monitor the progress of the final project. in china, online learning platforms were in use even before the covid-19 pandemic. during the pandemic, however, the use of online platforms has skyrocketed, and it is not uncommon for people to worry about the security and speed of video conference data transfer. for this reason, video conferencing service providers have moved quickly to overcome these problems by releasing updates that minimize application bugs (han et al., 2021). in addition, studies conducted in japan indicate that many universities were not ready organizationally or operationally when facing the pandemic. therefore, many universities cooperate with other universities in running online classes to support their learning. many universities are also preparing post-pandemic scenarios so that their teaching remains relevant to the surrounding community (izumi et al., 2021). in line with this, a survey of students in india found online education to be a feasible alternative during the pandemic, although 65.9% of respondents felt that learning in physical classrooms was more effective than learning online. for this reason, students hope that online learning will be optimized by delivering more diverse materials such as case studies, gamification, and interactive classes (chakraborty et al., 2021). building on this previous research, this study takes up the theme of online learning during the pandemic, especially the final project guidance process, which has been forced online by the covid-19 pandemic. the purpose of this research is to design a thesis consultation information system in which everything is centralized and documented in one platform. this research used a qualitative descriptive analysis method and a system development method using prototypes. the results show that the system design that was built can serve as a medium for exchanging files between supervisors and students.

2. method

this research used descriptive qualitative analysis with an object-oriented systems approach. the object-oriented approach makes developers focus on creating classes, which are the blueprints of the objects in a system. this concept divides the system components into several objects that interact with each other to run the system (aman, 2021). data were collected through observation and direct interviews at the research object. interviews were used to determine the needs of the users who will use this application, and observations were made of the activities carried out before the system existed (greer et al., 2020). the system was developed using the rapid application development (rad) method. rad is a software development process model that moves linearly and repeatedly through development but is limited to a short time frame, because this method is intended for systems that are not too complex (pricillia, 2021). much of the work is done in one stage of development before the final stage of implementation. the stages of the rad development method are shown in fig. 1 (rosmalia et al., 2021).

fig. 1. rapid application development (rad) method

3. results and discussion

the thesis consultation information system was developed using the php programming language with the codeigniter framework and a database managed by the mysql dbms. the proposed system has two main functions, used by two types of users, namely lecturers and students.
each user has similar functions in the system, but each has its own rights and characteristics. student users can consult with their supervisor by submitting files through the menu provided (upload guidance). the student then waits for the lecturer to verify attendance, after which the student can see the comments in the menu provided (revision consultation). students can also see the history of the consultations carried out with their supervisor (consultation history). the use case diagram designed for the thesis consultation information system is shown in fig. 2.

fig. 2. use case diagram of thesis consultation

in this information system, students can upload a draft of the file to be consulted in stages, for example by uploading it according to the chapter being consulted. in addition to the facility for uploading thesis drafts, there is a facility for lecturers to provide comments, and these comments are recorded on the attendance card. for each stage of guidance, if the student's work is in accordance with the results of the revision, the lecturer provides validation through the system, which is recorded on the attendance card in the form of initials for the stage of consultation carried out. in addition to thesis draft consultation, it is also possible to test programs made by students. students can test their program by recording their screen, uploading the recording to youtube, and attaching the link to the program testing consultation form in this information system. as in the previous stages, in the program testing consultation lecturers are given a feature to comment on what the student has presented. lecturers are also given an approval feature to validate the student's submission at this stage.
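the validation flow described above can be sketched as a small data model. this is only an illustrative sketch in python, not the php/codeigniter implementation of the system: the class names `Consultation` and `AttendanceCard`, the function `review`, the "needs revision" status, and the sample initials are all hypothetical, and the sketch simplifies the description by writing an entry to the attendance card only when a stage is accepted.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Consultation:
    """One uploaded draft for one chapter (hypothetical model)."""
    chapter: str
    draft_file: str
    status: str = "submitted"
    comment: Optional[str] = None


@dataclass
class AttendanceCard:
    """Collects (chapter, lecturer initials, comment) for each validated stage."""
    student: str
    entries: List[Tuple[str, str, str]] = field(default_factory=list)


def review(card: AttendanceCard, consultation: Consultation,
           initials: str, comment: str, accept: bool) -> Consultation:
    """Record the lecturer's comment; on acceptance, add an entry to the card."""
    consultation.comment = comment
    if accept:
        consultation.status = "accepted"
        card.entries.append((consultation.chapter, initials, comment))
    else:
        consultation.status = "needs revision"  # hypothetical status name
    return consultation


card = AttendanceCard(student="student a")
first = review(card, Consultation("chapter 1", "chapter1_v1.pdf"),
               initials="xy", comment="revise the background", accept=False)
second = review(card, Consultation("chapter 1", "chapter1_v2.pdf"),
                initials="xy", comment="ok", accept=True)
print(first.status, second.status, card.entries)
```

keeping the attendance card as a list of validated stages means the downloadable card can be generated from the consultation history rather than filled in by hand.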
after all the stages of consultation have been carried out and validated by the supervisor, students only need to download the attendance card file to be used as evidence of consultation, which is a condition for the thesis trial. in designing information systems, activity diagrams serve as an overview of the system flow and of what the system can do. the designed activity diagram is illustrated in fig. 3.

fig. 3. thesis consultation activity diagram

the thesis consultation process starts with students accessing the page and uploading the results of their thesis work. the form contains a brief description of the draft uploaded in this guidance session. the student then uploads the draft thesis file to be reviewed by the lecturer concerned. in this form, students can choose which chapter will be the topic of the consultation so that the lecturer can monitor the progress of the student group under his or her guidance. fig. 4 shows the interface of the thesis draft upload page.

fig. 4. upload thesis draft by student

after the guidance draft has been uploaded successfully, it appears on the supervisor page. the supervisor downloads the uploaded file in order to review it. after the review, the lecturer responds to the student with a note of the improvements the student needs to make. in this form, lecturers can also upload their review files to be sent back to students so that students can see in detail the parts that need to be improved. at this verification stage, if the lecturer considers the draft appropriate, the lecturer can change the status of the chapter's guidance to "accepted". the guidance verification page can be seen in fig. 5 and fig. 6.

fig. 5. review of thesis draft by lecturer

fig. 6. consultation history by student

lecturers can check the student attendance list through the consultation list menu. the lecturer can also revise files submitted by students in the consultation verification menu and can see the consultation history with students in the consultation history menu. fig. 7 shows the design of the consultation history interface that can be accessed by lecturers.

fig. 7. list of students in guidance

4. conclusion

the making of this thesis consultation information system required real data and problems that actually occur in an educational environment, namely that of the universitas komputer indonesia (unikom). because the covid-19 pandemic made it difficult for students to receive thesis guidance in person, an information system was created with the php programming language and the mysql database. this information system makes the thesis guidance process easier for students, because they do not need to come to campus for guidance and can instead receive it simply by opening the platform provided. the author hopes that this system can make it easier for lecturers and students to carry out the educational process in this pandemic situation.

references

skovlund, c. w., friis, s., dehlendorff, c., nilbert, m. c., & mørch, l. s. (2021). hidden morbidities: drop in cancer diagnoses during the covid-19 pandemic in denmark. acta oncologica, 60(1), 20-23.
singhal, t. (2020). a review of coronavirus disease-2019 (covid-19). the indian journal of pediatrics, 87(4), 281-286.
etard, j. f., vanhems, p., atlani-duault, l., & ecochard, r. (2020).
potential lethal outbreak of coronavirus disease (covid-19) among the elderly in retirement homes and long-term facilities, france, march 2020. eurosurveillance, 25(15), 2000448.
alagu lakshmi, s., shafreen, r. m. b., priya, a., & shunmugiah, k. p. (2021). ethnomedicines of indian origin for combating covid-19 infection by hampering the viral replication: using structure-based drug discovery approach. journal of biomolecular structure and dynamics, 39(13), 4594-4609.
kintama, a. y., larasati, d. a., & yuliana, l. (2021). bimbingan skripsi daring selama pademi covid-19 pada mahasiswa pgsd uwks: hambatan dan solusi. trapsila: jurnal pendidikan dasar, 3(1), 57-71.
han, x., zhou, q., shi, w., & yang, s. (2021). online learning in vocational education of china during covid-19: achievements, challenges, and future developments. journal of educational technology development and exchange (jetde), 13(2), 4-31.
izumi, t., sukhwani, v., surjan, a., & shaw, r. (2021). managing and responding to pandemics in higher educational institutions: initial learning from covid-19. international journal of disaster resilience in the built environment, 12(1), 51-66.
chakraborty, p., mittal, p., gupta, m. s., yadav, s., & arora, a. (2021). opinion of students on online education during the covid-19 pandemic. human behavior and emerging technologies, 3(3), 357-365.
aman, m. (2021). pengembangan sistem informasi wedding organizer menggunakan pendekatan sistem berorientasi objek pada cv pesta. jurnal janitra informatika dan sistem informasi, 1(1), 47-60.
greer, b. d., mitteer, d. r., briggs, a. m., fisher, w. w., & sodawasser, a. j. (2020). comparisons of standardized and interview-informed synthesized reinforcement contingencies relative to functional analysis. journal of applied behavior analysis, 53(1), 82-101.
pricillia, t. (2021). perbandingan metode pengembangan perangkat lunak (waterfall, prototype, rad). jurnal bangkit indonesia, 10(1), 6-12.
rosmalia, l., jaroji, j., & teddyyana, a. (2021). aplikasi pendataan dan monitoring industri kecil dan menengah (ikm) menggunakan metode rapid application development. zonasi: jurnal sistem informasi, 3(2), 71-86.

image mosaicking using low-distance high-resolution images captured by an unmanned aerial vehicle

faez m. hassan, hussein abdelwahab mossa
physics department, college of education, mustansiriyah university, baghdad, iraq

abstract

regional surveys place a high demand on coverage. to adequately cover a large area while retaining high resolution, mosaics of the area can be created from a variety of scenes. this paper describes a mosaicking procedure that consists of a series of processing steps used to combine multiple aerial images. these images were taken during cropcam unmanned aerial platform flight missions over the desired area in order to map a large geographical region quickly. the results of periodic processing can be compared and analyzed to monitor a large area for future research or during an emergency situation in the covered area. digital imagery captured from the air has proven to be a valuable resource for studying land cover and land use. for this study, airborne digital camera images were chosen because they provide data with a higher spatial resolution for mapping a small research area. images were captured from an elevation of 320 meters using a standard digital camera on board the autopiloted uav. when compared to other airborne studies, this technique was less expensive. according to this study, a digital camera on board an autopiloted uav serves as a sensor that can be helpful in planning and developing a limited coverage area after mosaicking.
article info

article history: received 18 nov 2021; revised 20 nov 2021; accepted 25 nov 2021; available online 26 dec 2021

keywords: image mosaic, crop cam uav, aerial photography

international journal of informatics information system and computer engineering 2(2) (2021) 44-52
journal homepage: http://ejournal.upi.edu/index.php/ijost/

1. introduction

aerial photography serves as a common foundation for large-scale mapping. it has been widely used for creating and updating maps, as well as for keeping gis databases up to date (neteler & mitasova, 2004; xu et al., 2016). in remote sensing and geoinformation sciences, the term "mosaicking" is frequently used when two or more contiguous images are stitched together to create a single image file (aber, et al., 2010). stitching functions are now available in the majority of digital camera software, and there is a wide range of both free and commercial panorama software for fully automating the merging of multiple photographs into larger composites. these tools have the potential to produce visually appealing results, but the results are not geometrically correct, which means that the image tiles can still be detected by their angular skew and straight edges (aber, et al., 2010). one of the most significant image data processing techniques in uav systems is real-time mosaicking, which allows georeferenced uav images to be combined with geographic information for a quick reaction to time-sensitive events (zhou, et al., 2006; kim, et al., 2017). aerial photographs can be used to study changes in the earth's features as time passes.
those images are especially useful in analyses of land cover because they compare older data sets with new data sets, which can be available for a wide range of studies (ren, et al., 2017). information on current land use allows agencies and researchers to identify patterns in land cover and, as a result, make more informed decisions about analyses of development suitability, proposed land uses, and long-term planning (gómez-candón, et al., 2014). the data can show how development has changed over time, which can be applied as a guide for future research on land cover (ahmad, a., 2011; hassan, et al., 2010; gomarascs, m. a. 2009). the digital images captured are available in a short time and are accompanied by latitude, longitude, and altitude coordinates (zhao., et al., 2019). by manipulating the visualization of digital images, the user can keep track of what is going on at the ground, observe the most recent developments, and prevent problems from spiraling out of control. the unmanned aerial vehicle (uav) can be hand-launched and can autonomously fly from takeoff to landing (cropcam, 2008; felderhof, et al., 2008). both flights were made to capture visible imagery with a resolution of ground level of 9 cm over the selected area in each flight, and all of the images obtained were in jpeg format. the visible images that resulted demonstrated a clear distinction between urban and green land surfaces (avola, et al., 2018). flights of uav’s have been completed successfully at all of the research sites selected for this study. the number of photos taken during each flight plan was adequate for covering the research area. a mosaic image is a fabricated composition created from a series of images obtained by comprehending the geometric relationships between images (fuyi, et al., 2012; lim, et al., 2009; hassan, et al., 2011). the entire survey area should be covered by aerial images with a sufficient amount of overlap between them. 
typically, the degree of overlap in the route direction should be between 60% and 65%, and no less than 53%, while in the lateral direction it should be between 30% and 40%, and no less than 15% (wang, et al., 2007). each fly file in this study was set up with a 60% overlap along the flight line (end lap) and a 30% overlap between adjacent runs (side lap). the images taken in this study therefore overlapped substantially, both within and between runs, which meant that they could be stitched together into a good mosaic for further analysis.

2. method
penang island, which is located in northern malaysia between latitudes 5° 12' n and 5° 30' n and longitudes 100° 09' e and 100° 26' e, was chosen as the study area. the cropcam uav system was used to collect images for this study, as shown in figure 1. the aircraft was outfitted with navigation and autopilot systems that allowed it to follow predetermined waypoints and thus cover the target area. to collect digital remote sensing images of the study area, a pentax optio a40 digital camera was affixed to the body of the uav as a low-cost imaging sensor. the optio a40, pentax's first digital camera with a resolution of 12.0 effective megapixels, is capable of producing high-resolution images.

fig. 1. cropcam uav system used in this study.

flights were conducted over each study site to capture visible imagery covering the entire site with high ground resolution. the imagery was captured by the uav at a low altitude (320 meters above ground level), which allowed imagery to be obtained even when there was cloud cover, an advantage over manned aircraft and satellite imagery. the software packages autopano giga 2.2 pro and ptgui 8.3.10 were used to stitch (mosaic) the images captured by the cropcam uav platform.
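as a rough illustration of what the 60% end lap and 30% side lap settings imply, the sketch below computes the distance between successive photo centres from an image footprint and a fractional overlap. the footprint values (360 m × 270 m) are assumptions, derived from the 9 cm ground resolution mentioned in the introduction and an assumed 4000 × 3000 pixel frame with its short side along the flight direction; they are not reported in this paper, and the function name is hypothetical.

```python
def exposure_spacing(footprint_m: float, overlap: float) -> float:
    """Distance between neighbouring photo centres for a fractional overlap.

    footprint_m: ground length covered by one image in the direction considered.
    overlap:     fraction of that length shared with the neighbouring image (0 <= overlap < 1).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in the range [0, 1)")
    # Only the non-overlapping part of each footprint advances the coverage.
    return footprint_m * (1.0 - overlap)


if __name__ == "__main__":
    # Assumed footprint: 4000 x 3000 pixels at 0.09 m per pixel (illustrative only).
    across_track_m = 4000 * 0.09
    along_track_m = 3000 * 0.09
    print("spacing between exposures:", exposure_spacing(along_track_m, 0.60), "m")
    print("spacing between flight runs:", exposure_spacing(across_track_m, 0.30), "m")
```

under these assumptions, a larger overlap shortens the spacing and therefore increases the number of photos needed for the same area, which is the trade-off behind the overlap ranges quoted above.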
to cover a larger area, each flight's raw images were mosaicked together. to achieve a good mosaic, a couple of control points shared by photos taken in succession were added manually. because it was pre-programmed into the fly files, the overlap between adjacent photos was ideal. this enabled the stitching of images into seamless final mosaics, as well as the improvement of the georeferencing process (mengxiao, et al., 2018). figure 2 depicts the mosaicking procedure used in this study. the seamlines between individual mosaic pieces can be placed manually or automatically so as to be as inconspicuous as possible, and radiometric matching techniques can be used to account for color and brightness differences (tian, et al., 2020). all of the image mosaics in this study were created with the software packages autopano and ptgui. the raw images from each flight were mosaicked with lens distortion correction and color equalization. as shown in figure 2, a nearest-neighbor interpolator and the smartblend blending algorithm were used to render the mosaics.

fig. 2. modular workflow for image mosaicking process.

3. results and discussion
figures 3 and 4 show examples of unprocessed digital images collected during cropcam uav flight missions over the chosen study sites. figures 5 and 6 depict image mosaics created from the raw images collected during each flight over the study sites. the rmse of image mosaicking is displayed in table 1.

fig. 3. samples of cropcam raw images (first flight on june 20th, 2011).
fig. 4. samples of cropcam raw images (second flight on december 12th, 2011).
fig. 5.
uncontrolled image mosaic of the selected area in penang island created with autopano pro and ptgui software from 65 images taken by cropcam uav on june 20th, 2011 (first flight).
fig. 6. uncontrolled image mosaic of the selected area in penang island created with autopano pro and ptgui software from 86 images taken by cropcam uav on december 12th, 2011 (second flight).

table 1. mosaicking process results

| flight mission | number of stitched images | panorama fov | rmse (cm) | quality status | panorama at 100 dpi (m) |
|---|---|---|---|---|---|
| first flight | 65 | 68.03° × 51.31° | 2.6 | v. good | 3.57 × 2.69 |
| second flight | 86 | 79.71° × 40.10° | 2.1 | v. good | 3.76 × 1.59 |

an examination of the generated image mosaics shows that they are of good quality. features such as roads and buildings are joined with minimal distortion. furthermore, owing to the image blending method used, the mosaics are good enough to be used in further image analysis. radiometric variations between overlapping views are common in uav-acquired images, so each image region carries its own color, brightness, and contrast into the blending process; after blending, the overlapping regions merge into one another with no discernible pattern. there is sufficient evidence in this study to show that the cropcam flight missions succeeded in obtaining the desired high-resolution images, and the mosaicking results are visually pleasing. image mosaics frequently reveal differences in exposure across or between photographs, so color matching between the stitched images is required to hide the seams. the mosaicking results show that the software used in this study can solve the problem of uneven brightness in mosaics.
4.
conclusion
this paper describes a simple but effective procedure for mosaicking high-resolution images captured by a uav flying at low altitude. the performance of the proposed method was demonstrated in a case study on penang island, malaysia. the proposed method outperformed highly developed commercial software on uav images and achieved better mosaicking results. it generates mosaicked images with reduced spectral distortion and increased spatial accuracy. moreover, the mosaicking process is much faster than that of other software packages. the rmse values in the experimental results are low (2.6 cm and 2.1 cm, table 1). the method presented in this paper saves 40% of the time. furthermore, the mosaicked images generated by the proposed technique are strikingly similar to the original uav images. in this paper, we propose a small-scale uav-based system for creating low-altitude image mosaics that are incremental and georeferenced in real time.

acknowledgments
we acknowledge mustansiriyah university, baghdad, iraq and universitas komputer indonesia, indonesia.

references
aber, j. s., marzolff, i., & ries, j. (2010). small-format aerial photography: principles, techniques and geoscience applications. elsevier.
ahmad, a. (2011). digital mapping using low altitude uav. pertanika journal of science and technology, 19(s), 51-58.
avola, d., cinque, l., foresti, g. l., martinel, n., pannone, d., & piciarelli, c. (2018). a uav video dataset for mosaicking and change detection from low-altitude flights. ieee transactions on systems, man, and cybernetics: systems, 50(6), 2139-2149.
cropcam. (2008). cropcam user's guide-application version, canada.
felderhof, l., gillieson, d., zadro, p., & van boven, a. (2008). linking uav (unmanned aerial vehicle) technology with precision agriculture.
fuyi, t., chun, b. b., jafri, m. z. m., san, l.
h., abdullah, k., & tahrin, n. m. (2012, november). land cover/use mapping using multi-band imageries captured by cropcam unmanned aerial vehicle autopilot (uav) over penang island, malaysia. in unmanned/unattended sensors and sensor networks ix (vol. 8540, pp. 147-152). spie.
gomarascs, m. a. (2009). basics of geomatics. london and new york, springer.
gómez-candón, d., de castro, a. i., & lópez-granados, f. (2014). assessing the accuracy of mosaics from unmanned aerial vehicle (uav) imagery for precision agriculture purposes in wheat. precision agriculture, 15(1), 44-56.
hassan, f. m., lim, h. s., & jafri, m. m. (2011). cropcam uav for land use/land cover mapping over penang island, malaysia. pertanika journal of science & technology, 19(s), 69-76.
hassan, f. m., lim, h. s., matjafri, m. z., & othman, n. (2010, july). an assessment of low-cost cropcam uav images for land cover/use over penang island, malaysia. in aip conference proceedings (vol. 1250, no. 1, pp. 23-26). american institute of physics.
kim, j. i., kim, t., shin, d., & kim, s. (2017). fast and robust geometric correction for mosaicking uav images with narrow overlaps. international journal of remote sensing, 38(8-10), 2557-2576.
lim, h. s., jafri, m. z. m., abdullah, k., hassan, f., & saleh, n. m. (2009). feasibility of using multi-band imageries captured by cropcam unmanned aerial vehicle autopilot for land cover mapping. journal of materials science and engineering, 3(12), 26-31.
mengxiao song, zheng ji, shan huang & jing fu. (2018). mosaicking uav orthoimages using bounded voronoi diagrams and watersheds. international journal of remote sensing, 39(15-16), 4960-4979.
neteler, m., & mitasova, h. (2013). open source gis: a grass gis approach (vol. 689). springer science & business media.
ren, x., sun, m., zhang, x., & liu, l. (2017). a simplified method for uav multispectral images mosaicking.
remote sensing, 9(9), 962. tian, y., sun, a., luo, n., & gao, y. (2020). aerial image mosaicking based on the 6dof imaging model. international journal of remote sensing, 41(1), 74-89. wang, p. and xu, y., (2007). photogrammetry, wuhan: wuhan university press, pp.16 17 . xu, y., ou, j., he, h., zhang, x., & mills, j. (2016). mosaicking of unmanned aerial vehicle imagery in the absence of camera poses. remote sensing, 8(3), 204. zhao, j., zhang, x., gao, c., qiu, x., tian, y., zhu, y., & cao, w. (2019). rapid mosaicking of unmanned aerial vehicle (uav) images for crop growth monitoring using the sift algorithm. remote sensing, 11(10), 1226. zhou, g., wu, j., wright, s., & gao, j. (2006). high-resolution uav video data processing for forest fire surveillance. old dominion univ., norfolk, va, tech. rep. national sci. foundation. 31 | international journal of informatics information system and computer engineering 3(2) (2022) 1-20 unique aspects of usage of the quadratic cryptanalysis method to the gost 28147-89 encryption algorithm bardosh akhmedov*, rakhmatillo aloev** university of uzbekistan named after mirzo ulugbek tashkent, uzbekistan *corresponding email: shirin07@ya.ru a b s t r a c t s a r t i c l e i n f o in this article, issues related to the application of the quadratic cryptanalysis method to the five rounds of the gost 28147-89 encryption algorithm are given. for example, the role of the bit gains in the application of the quadratic cryptanalysis method, which is formed in the operation of addition according to mod232 used in this algorithm is described. in this case, it is shown that the selection of the relevant bits of the incoming plaintext and cipher text to be equal to zero plays an important role in order to obtain an effective result in cryptanalysis. 
article history: received 18 dec 2022, revised 20 dec 2022, accepted 25 dec 2022, available online 26 dec 2022
keywords: gost 28147-89, selected plaintext, quadratic approximation, correlation matrix, quadratic cryptanalysis
international journal of informatics, information system and computer engineering 3(2) (2022) 31-40

1. introduction
to verify and evaluate the strength of encryption algorithms, linear, differential, linear-differential, algebraic, and correlation cryptanalysis are used. many works are devoted to improving applications of linear cryptanalysis. using several linear approximations simultaneously for one combination of key bits (kaliski & robshaw, 1994; quisquater, 2004) can increase the efficiency of the linear cryptanalysis method. a method for improving linear cryptanalysis (in particular, for the cipher loki91) has been proposed that takes into account the probabilistic behavior of some bits instead of their fixed values when approximating (sakurai & furuya, 1997).

2. literature review
2.1. linear cryptanalysis
a series of works is devoted to the resistance of various encryption algorithms to the linear cryptanalysis method. in (chee et al., 1994), l.knudsen considered the construction of feistel-type encryption schemes that are resistant to linear and differential cryptanalysis methods. v.shorin, v.zheleznyakov and e.gabidulin proved in 2001 that the russian algorithm gost 28147-89 is resistant to these methods (with no fewer than five rounds of encryption for linear cryptanalysis and seven rounds for differential cryptanalysis). a large number of works are devoted to the study of various classes of approximating functions and to the construction of functions that are the most difficult to approximate in this way.
in these papers, bent functions (logachev et al., 2004; dobbertin & leander, 2004; chee et al., 1994) are considered, which are boolean functions of an even number of variables that are maximally distant from the set of all linear functions in the hamming metric, as well as their generalizations: semi-bent functions (dobbertin & leander, 2005), partially bent functions (qu et al., 2000), z-bent functions (pfitzmann, 2003), homogeneous bent functions (kuzmin et al., 2006), and hyper-bent functions (carlet & gaborit, 2006; youssef, 2007; kuz'min et al., 2008; knudsen & robshaw, 1996). the main idea of using nonlinear approximations in linear cryptanalysis (knudsen & robshaw, 1996) is to enrich the class of approximating functions (of m variables) with nonlinear functions and thereby improve the quality of approximation. in this case, the cryptanalyst has to deal with the difficulties of choosing nonlinear approximations and of combining nonlinear approximations of individual rounds.

2.2. gost 28147-89 encryption algorithm
the gost 28147-89 encryption algorithm (kuryazov et al., 2017; vinokurov) uses addition mod $2^{32}$. by the nature of this operation, the value of each resulting bit depends on the values of the incoming bits below it in order. the mathematical model of each bit of the result of this operation can be expressed as follows:

$$\begin{aligned} p_{32} &= (x_{32} + k_{32}) \bmod 2, \\ p_{31} &= (x_{31} + k_{31} + q_{32}) \bmod 2, \\ &\;\;\vdots \\ p_{2} &= (x_{2} + k_{2} + q_{3}) \bmod 2, \\ p_{1} &= (x_{1} + k_{1} + q_{2}) \bmod 2. \end{aligned}$$

the general mathematical model of addition mod $2^{32}$ can be expressed as follows (kuryazov et al., 2017):

$$p_i = (x_i + k_i + q_{i+1}) \bmod 2, \quad i = 32, \dots, 1, \quad q_{33} = 0. \qquad (1)$$

here $q_i$ is the carry produced by the addition at bit position $i$. when applying the linear cryptanalysis method, the influence of the bit in each position of the block on the output bits has to be taken into account, so the problem of constructing a boolean function for each bit of the result of addition mod $2^{32}$ was considered.
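the bit-level model (1) can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function name is hypothetical, bit 32 is taken to be the lowest-order bit (as in the model above, where no carry enters position 32), and the carry is computed with ordinary integer arithmetic rather than with any formula taken from the cited works.

```python
def add_mod32_bitwise(x: int, k: int) -> int:
    """Add two 32-bit words bit by bit, following model (1).

    Bit i = 32 is the lowest-order bit and i = 1 the highest-order bit.
    """
    q = 0  # q_33 = 0: no carry enters the lowest-order position
    p = 0
    for i in range(32, 0, -1):
        shift = 32 - i                # distance of bit i from the lowest-order bit
        x_i = (x >> shift) & 1
        k_i = (k >> shift) & 1
        total = x_i + k_i + q         # x_i + k_i + q_{i+1}
        p |= (total % 2) << shift     # p_i = (x_i + k_i + q_{i+1}) mod 2
        q = total // 2                # carry q_i passed on to position i - 1
    # The carry out of position 1 is discarded, which is the reduction mod 2^32.
    return p


if __name__ == "__main__":
    x, k = 0x89ABCDEF, 0x7FFFFFFF
    print(hex(add_mod32_bitwise(x, k)), hex((x + k) % 2**32))
```

the loop makes the dependence explicit: bit i of the result cannot be written in terms of x_i and k_i alone, because the carry q collects information from every lower-order position.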
an overview of this boolean function is as follows (kuryazov et al., 2017):

$$p_i = x_i \oplus k_i \oplus q_{i+1}, \qquad q_i = x_i k_i \oplus q_{i+1}(x_i \oplus k_i), \qquad i = 32, \dots, 1, \quad q_{33} = 0. \qquad (2)$$

based on the results of the research on the mod $2^{32}$ addition operation used in the gost 28147-89 encryption algorithm, the schematic view of one round of this algorithm is as follows (fig. 1) (kuryazov et al., 2017).

fig. 1. schematic view of one round of gost 28147-89 encryption algorithm

3. method
3.1. quadratic relations of a special form
in previous works, correlation matrix values for linear and quadratic dependences and the corresponding approximation equations with probability r=7/8 were obtained for the s-blocks of the gost 28147-89 algorithm. these equations are used effectively in linear cryptanalysis to find key bits with high probability (akhmedov & aloev, 2020; akhmedov, 2021). these equations are shown in table 1.

3.2. quadratic cryptanalysis
with these approximation equations, quadratic cryptanalysis was carried out for five rounds of a modification of the gost 28147-89 algorithm in which the xor operation replaces addition mod $2^{32}$, and the corresponding results were obtained (akhmedov & aloev, 2020; akhmedov, 2021). when this cryptanalysis, based on the correlation matrices of linear and quadratic relations, is carried over to the algorithm in which addition mod $2^{32}$ precedes the s-block substitutions, setting some bits of the plaintext and ciphertext to zero ensures that an effective approximation relation is formed.

4. results and discussion
based on the concepts presented above, the quadratic approximation equations in table 1 determined from the correlation matrices for the s3-block are analyzed.
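the probability attached to an approximation equation can be checked by exhaustive counting over the 16 inputs of a 4-bit s-block. the sketch below shows one way to do this; it is not the authors' procedure. the s-block used here is a hypothetical permutation chosen only for illustration (the actual gost s-block tables are not given in this section), the helper names are invented, and the convention that bit 1 is the highest-order bit of the 4-bit value is an assumption.

```python
def bit(value: int, i: int) -> int:
    """Bit i of a 4-bit value, with i = 1 the highest-order bit (assumed convention)."""
    return (value >> (4 - i)) & 1


def approximation_probability(sbox, lhs, rhs) -> float:
    """Fraction of the 16 inputs for which lhs(input bits) equals rhs(output bits)."""
    if sorted(sbox) != list(range(16)):
        raise ValueError("sbox must be a permutation of 0..15")
    hits = 0
    for x in range(16):
        p = {i: bit(x, i) for i in range(1, 5)}        # input bits p_1..p_4
        c = {i: bit(sbox[x], i) for i in range(1, 5)}  # output bits c_1..c_4
        if lhs(p) == rhs(c):
            hits += 1
    return hits / 16


# Hypothetical 4-bit permutation, NOT one of the GOST 28147-89 s-blocks.
HYPOTHETICAL_SBOX = [(7 * x + 3) % 16 for x in range(16)]


def quadratic_lhs(p) -> int:
    """(p1 xor p3)(p2 xor p4) xor p1 xor p3, the quadratic form used for the s3 block."""
    return ((p[1] ^ p[3]) & (p[2] ^ p[4])) ^ p[1] ^ p[3]


def c3(c) -> int:
    return c[3]


if __name__ == "__main__":
    prob = approximation_probability(HYPOTHETICAL_SBOX, quadratic_lhs, c3)
    print(f"approximation holds for {prob:.4f} of the inputs")
```

with a real s-block table substituted for the hypothetical one, the same count would show how far each equation's probability lies from 1/2, which is what makes an approximation useful.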
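1-round: for the s3 block, the approximation $(p_1 \oplus p_3)(p_2 \oplus p_4) \oplus p_1 \oplus p_3 = c_3$ holds with probability p=12/16. according to the position of the variables in the round transformation, the 11-bit left cyclic shift, and the addition of the corresponding left-half bits, the approximation equality takes the form

$$\{(p(41) \boxplus k_1(9)) \oplus (p(43) \boxplus k_1(11))\} \cdot \{(p(42) \boxplus k_1(10)) \oplus (p(44) \boxplus k_1(12))\} \oplus (p(41) \boxplus k_1(9)) \oplus (p(42) \boxplus k_1(10)) = y_1(32) \oplus p(32).$$

in order to avoid the carry problem in this equality, it is necessary to choose plaintexts that satisfy the condition p(42)=p(43)=p(44)=p(45)=0. since the carry into the sum of p(41) and $k_1(9)$ in this block does not affect the equality, this sum can be taken in the form $p(41) \boxplus k_1(9) = p(41) \oplus k_1(9)$. in this case, the equation

$$(p(41) \oplus k_1(9) \oplus k_1(11)) \cdot (k_1(10) \oplus k_1(12)) \oplus p(41) \oplus k_1(9) \oplus k_1(10) = y_1(32) \oplus p(32) \qquad (3)$$

is formed.

table 1. approximation equations for the s-blocks s1-s8 (columns: №, approximation equations, probability p, exclusion ∆)

s1
$p_1 \oplus p_4 = c_1c_2 \oplus c_2c_3 \oplus c_1c_4 \oplus c_3c_4 \oplus c_1 \oplus c_4$
$p_1p_3 \oplus p_1p_4 \oplus p_2p_3 \oplus p_2p_4 \oplus p_2 \oplus p_4 = c_1c_3 \oplus c_1c_4 \oplus c_2c_3 \oplus c_2c_4 \oplus c_1 \oplus c_3$
$p_1p_3 \oplus p_1p_4 \oplus p_2p_3 \oplus p_2p_4 \oplus p_1 \oplus p_3 = c_2 \oplus c_3 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_1 \oplus p_3 = c_2 \oplus c_3 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_1 \oplus p_3 = c_1c_3 \oplus c_2c_3 \oplus c_1c_4 \oplus c_2c_4 \oplus c_1 \oplus c_3 \oplus 1$
p = 1/8, ∆ = 3/4

s2
$p_1 \oplus p_3 \oplus p_4 = c_1 \oplus c_4 \oplus 1$
$p_1 \oplus p_2 = c_1 \oplus c_2 \oplus c_4$
$p_1 \oplus p_2 \oplus p_3 = c_1c_3 \oplus c_2c_3 \oplus c_1c_4 \oplus c_2c_4 \oplus c_2 \oplus c_3$
$p_1p_2 \oplus p_1p_4 \oplus p_2p_3 \oplus p_3p_4 \oplus p_2 \oplus p_3 = c_1 \oplus c_2 \oplus c_4$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_4 \oplus p_3 = c_3 \oplus c_4 \oplus 1$
p = 1/8, ∆ = 3/4

s3
$p_1 \oplus p_4 = c_1 \oplus c_3$
$p_2 \oplus p_3 \oplus p_4 = c_1 \oplus c_3 \oplus c_4$
$p_2 \oplus p_3 = c_1 \oplus c_2 \oplus c_3 \oplus 1$
$p_1 \oplus p_3 = c_1c_2 \oplus c_2c_3 \oplus c_1c_4 \oplus c_3c_4 \oplus c_1 \oplus c_2 \oplus 1$
$p_4 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_2 \oplus c_4 \oplus 1$
$p_1 \oplus p_4 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_3 \oplus c_4$
$p_1p_3 \oplus p_1p_4 \oplus p_2p_3 \oplus p_2p_4 \oplus p_1 \oplus p_3 = c_1 \oplus c_2 \oplus 1$
$p_1p_3 \oplus p_1p_4 \oplus p_2p_3 \oplus p_2p_4 \oplus p_2 \oplus p_4 = c_1 \oplus c_3$
$p_1p_2 \oplus p_1p_4 \oplus p_2p_3 \oplus p_3p_4 \oplus p_1 \oplus p_4 = c_1 \oplus c_2 \oplus c_3$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_1 \oplus p_3 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_1 \oplus c_3 \oplus 1$
$p_1p_3 \oplus p_1p_4 \oplus p_2p_3 \oplus p_2p_4 \oplus p_1 \oplus p_3 = c_1c_2 \oplus c_1c_3 \oplus c_2c_4 \oplus c_3c_4 \oplus c_3 \oplus c_4$
p = 1/8, ∆ = 3/4

s4
$p_3 = c_1 \oplus c_2 \oplus c_3 \oplus c_4 \oplus 1$
$p_1 \oplus p_3 \oplus p_4 = c_1$
$p_2 \oplus p_4 = c_3 \oplus 1$
$p_3 \oplus p_4 = c_1c_2 \oplus c_2c_3 \oplus c_1c_4 \oplus c_3c_4 \oplus c_2 \oplus c_3$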
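$p_3 \oplus p_4 = c_1c_2 \oplus c_1c_3 \oplus c_2c_4 \oplus c_3c_4 \oplus c_1 \oplus c_3$
$p_1 \oplus p_2 \oplus p_3 \oplus p_4 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_1 \oplus c_2 \oplus 1$
$p_1 \oplus p_2 \oplus p_4 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_3 \oplus c_4$
$p_1p_2 \oplus p_1p_4 \oplus p_2p_3 \oplus p_3p_4 \oplus p_2 \oplus p_3 = c_4 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_2 \oplus p_4 = c_3 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_1 \oplus p_2 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_1 \oplus c_3 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_1 \oplus p_2 = c_1c_3 \oplus c_2c_3 \oplus c_1c_4 \oplus c_2c_4 \oplus c_1 \oplus c_3 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_3 \oplus p_4 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_2 \oplus c_4$
p = 1/8, ∆ = 3/4

s5
$p_3 = c_1 \oplus c_2 \oplus c_3 \oplus c_4$
$p_1 \oplus p_2 \oplus p_3 \oplus p_4 = c_1c_2 \oplus c_2c_3 \oplus c_1c_4 \oplus c_3c_4 \oplus c_1 \oplus c_2$
$p_1p_2 \oplus p_1p_4 \oplus p_2p_3 \oplus p_3p_4 \oplus p_1 \oplus p_4 = c_1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_2 \oplus p_4 = c_1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_3 \oplus p_4 = c_1 \oplus c_2 \oplus c_3 \oplus 1$
p = 1/8, ∆ = 3/4

s6
$p_3 \oplus p_4 = c_1 \oplus c_3 \oplus c_4$
$p_1 \oplus p_2 \oplus p_3 = c_3$
$p_3 = c_1c_3 \oplus c_2c_3 \oplus c_1c_4 \oplus c_2c_4 \oplus c_2 \oplus c_3 \oplus 1$
$p_1 \oplus p_2 \oplus p_4 = c_1c_2 \oplus c_2c_3 \oplus c_1c_4 \oplus c_3c_4 \oplus c_2 \oplus c_3 \oplus 1$
$p_3 = c_1c_2 \oplus c_2c_4 \oplus c_1c_3 \oplus c_3c_4 \oplus c_1 \oplus c_2 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_2 \oplus p_4 = c_1 \oplus c_3 \oplus c_4$
p = 1/8, ∆ = 3/4

s7
$p_2 \oplus p_3 \oplus p_4 = c_2 \oplus c_4$
$p_1 \oplus p_4 = c_1c_3 \oplus c_2c_3 \oplus c_1c_4 \oplus c_2c_4 \oplus c_1 \oplus c_3 \oplus 1$
$p_3 \oplus p_4 = c_1c_2 \oplus c_1c_4 \oplus c_2c_3 \oplus c_3c_4 \oplus c_1 \oplus c_2 \oplus 1$
$p_1p_3 \oplus p_1p_4 \oplus p_2p_3 \oplus p_2p_4 \oplus p_1 \oplus p_3 = c_2 \oplus c_3 \oplus 1$
$p_1p_3 \oplus p_1p_4 \oplus p_2p_3 \oplus p_2p_4 \oplus p_2 \oplus p_4 = c_3$
$p_1p_2 \oplus p_2p_3 \oplus p_2p_3 \oplus p_3p_4 \oplus p_1 \oplus p_2 = c_2 \oplus c_3 \oplus 1$
$p_1p_2 \oplus p_1p_3 \oplus p_2p_4 \oplus p_3p_4 \oplus p_2 \oplus p_4 = c_3$
$p_1p_3 \oplus p_2p_3 \oplus p_1p_4 \oplus p_2p_4 \oplus p_1 \oplus p_3 = c_1c_3 \oplus c_2c_3 \oplus c_1c_4 \oplus c_2c_4 \oplus c_1 \oplus c_3 \oplus 1$
p = 1/8, ∆ = 3/4

s8
$p_1 \oplus p_3 = c_1 \oplus c_4 \oplus 1$
$p_2 = c_1 \oplus c_2$
$p_2 \oplus p_3 = c_2 \oplus c_3 \oplus c_4 \oplus 1$
$p_1 \oplus p_2 \oplus p_4 = c_1 \oplus c_2 \oplus c_3 \oplus c_4 \oplus 1$
$p_1 \oplus p_3 = c_1c_2 \oplus c_2c_3 \oplus c_1c_4 \oplus c_3c_4 \oplus c_2 \oplus c_3$
$p_2 = c_1c_2 \oplus c_1c_4 \oplus c_2c_3 \oplus c_3c_4 \oplus c_1 \oplus c_2$
$p_1 \oplus p_4 = c_1c_2 \oplus c_1c_4 \oplus c_2c_3 \oplus c_3c_4 \oplus c_3 \oplus c_4$
p = 1/8, ∆ = 3/4

2-round: for the s8 block, the equality $p_4 = c_1 \oplus c_2 \oplus c_4 \oplus 1$ holds with probability p=12/16. according to the position of the variables in the round transformation, the 11-bit left cyclic shift, and the addition of the corresponding left-half bits, the approximation equality takes the form

$$p_2(32) \boxplus k_2(32) = y_2(18) \oplus y_2(19) \oplus y_2(21) \oplus p(18) \oplus p(19) \oplus p(21) \oplus 1.$$

since $p_2(32)$ in this equality is the last (lowest-order) bit, it is not affected by any carry, and since no other incoming text and key bits are involved, the modular addition $p_2(32) \boxplus k_2(32)$ plays no role here.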
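the remark about the last bit can be checked directly: in addition mod $2^{32}$, no carry enters the lowest-order position (bit 32 in the paper's numbering), so that bit of the sum always equals the xor of the two input bits, while higher positions can differ. the following python sketch illustrates this; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_mod32(x: int, k: int) -> int:
    """Addition mod 2^32 of two 32-bit words."""
    return (x + k) & MASK32


def bit_from_lsb(value: int, n: int) -> int:
    """Bit n of value, counted from the lowest-order bit (n = 0)."""
    return (value >> n) & 1


def lowest_bit_matches_xor(x: int, k: int) -> bool:
    """True if the lowest-order bit of x + k mod 2^32 equals that of x xor k."""
    return bit_from_lsb(add_mod32(x, k), 0) == bit_from_lsb(x ^ k, 0)


if __name__ == "__main__":
    samples = [(0, 0), (0, 1), (1, 0), (1, 1), (0xFFFFFFFF, 0xFFFFFFFF), (0x12345678, 0x0F0F0F0F)]
    print(all(lowest_bit_matches_xor(x, k) for x, k in samples))
    # One position higher, a carry from the lowest bit can change the result:
    print(bit_from_lsb(add_mod32(1, 1), 1), bit_from_lsb(1 ^ 1, 1))
```

the second line of output contrasts the two operations at the next position, where 1 + 1 produces a carry that xor does not.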
in this case, the equation

$$p_2(32) \oplus k_2(32) = y_2(18) \oplus y_2(19) \oplus y_2(21) \oplus p(18) \oplus p(19) \oplus p(21) \oplus 1 \qquad (4)$$

is obtained. since $y_1(32) = p_2(32)$, combining equations (3) and (4) gives the following equation:

$$y_2(18) \oplus y_2(19) \oplus y_2(21) = (p(41) \oplus k_1(9) \oplus k_1(11)) \cdot (k_1(10) \oplus k_1(12)) \oplus p(41) \oplus k_1(9) \oplus k_1(10) \oplus p(32) \oplus p(18) \oplus p(19) \oplus p(21) \oplus k_2(32) \oplus 1 \qquad (5)$$

3-round: for the s3 block, the approximation $(p_1 \oplus p_3)(p_2 \oplus p_4) \oplus p_1 \oplus p_3 = c_3$ holds with probability p=12/16. according to the position of the variables in the round transformation, the 11-bit left cyclic shift, and the addition of the corresponding left-half bits, the approximation equality takes the form

$$\{(c(41) \boxplus k_5(9)) \oplus (c(43) \boxplus k_5(11))\} \cdot \{(c(42) \boxplus k_5(10)) \oplus (c(44) \boxplus k_5(12))\} \oplus (c(41) \boxplus k_5(9)) \oplus (c(42) \boxplus k_5(10)) = y_5(32) \oplus c(32).$$

in order to avoid the carry problem in this equality, it is necessary to choose ciphertexts that satisfy the condition c(42)=c(43)=c(44)=c(45)=0. since the carry into the sum of c(41) and $k_5(9)$ in this block does not affect the equality, this sum can be taken in the form $c(41) \boxplus k_5(9) = c(41) \oplus k_5(9)$. in this case, the equality

$$(c(41) \oplus k_5(9) \oplus k_5(11)) \cdot (k_5(10) \oplus k_5(12)) \oplus c(41) \oplus k_5(9) \oplus k_5(10) = y_5(32) \oplus c(32) \qquad (6)$$

results.

4-round: for the s8 block, the equality $p_4 = c_1 \oplus c_2 \oplus c_4 \oplus 1$ holds with probability p=12/16. according to the position of the variables in the round transformation, the 11-bit left cyclic shift, and the addition of the corresponding left-half bits, the approximation equality takes the form

$$p_4(32) \boxplus k_4(32) = y_4(18) \oplus y_4(19) \oplus y_4(21) \oplus c(18) \oplus c(19) \oplus c(21) \oplus 1.$$

since $p_4(32)$ in this equality is the last (lowest-order) bit, it is not affected by any carry, and since no other incoming text and key bits are involved, the modular addition $p_4(32) \boxplus k_4(32)$ plays no role here. in this case, the equality

$$p_4(32) \oplus k_4(32) = y_4(18) \oplus y_4(19) \oplus y_4(21) \oplus c(18) \oplus c(19) \oplus c(21) \oplus 1 \qquad (7)$$

is formed.
since $y_5(32) = p_4(32)$, combining equations (6) and (7) results in the following equation:

$$y_4(18) \oplus y_4(19) \oplus y_4(21) = (c(41) \oplus k_5(9) \oplus k_5(11)) \cdot (k_5(10) \oplus k_5(12)) \oplus c(41) \oplus k_5(9) \oplus k_5(10) \oplus c(32) \oplus c(18) \oplus c(19) \oplus c(21) \oplus k_4(32) \oplus 1 \qquad (8)$$

from equations (5) and (8), on the basis of $y_4(18) \oplus y_4(19) \oplus y_4(21) = y_2(18) \oplus y_2(19) \oplus y_2(21)$, we obtain

$$(p(41) \oplus k_1(9) \oplus k_1(11)) \cdot (k_1(10) \oplus k_1(12)) \oplus p(41) \oplus k_1(9) \oplus k_1(10) \oplus p(32) \oplus p(18) \oplus p(19) \oplus p(21) \oplus k_2(32) \oplus 1 = (c(41) \oplus k_5(9) \oplus k_5(11)) \cdot (k_5(10) \oplus k_5(12)) \oplus c(41) \oplus k_5(9) \oplus k_5(10) \oplus k_4(32) \oplus c(32) \oplus c(18) \oplus c(19) \oplus c(21) \oplus 1,$$

and this results in the following:

$$(p(41) \oplus k_1(9) \oplus k_1(11)) \cdot (k_1(10) \oplus k_1(12)) \oplus p(41) \oplus k_1(9) \oplus k_1(10) \oplus k_2(32) \oplus (c(41) \oplus k_5(9) \oplus k_5(11)) \cdot (k_5(10) \oplus k_5(12)) \oplus c(41) \oplus k_5(9) \oplus k_5(10) \oplus k_4(32) = p(32) \oplus p(18) \oplus p(19) \oplus p(21) \oplus c(32) \oplus c(18) \oplus c(19) \oplus c(21) \qquad (9)$$

the carry problem does not arise, because the texts are chosen to satisfy the following conditions:

$$\begin{cases} p(41) = p(42) = p(43) = p(44) = p(45) = 0 \\ c(41) = c(42) = c(43) = c(44) = c(45) = 0 \end{cases}$$

the solution to the carry problem described above depends on the s-block being chosen, and a different s-block requires a different approach.

5. conclusion
the results of the quadratic cryptanalysis method previously obtained for five rounds of a modification of the gost 28147-89 algorithm, in which the xor operation replaces addition mod $2^{32}$, were applied here to the algorithm with the mod $2^{32}$ addition operation. because of this operation, the number of unknowns in the equation increases, since the value of each resulting bit depends on the values of the bits preceding it in order. for this reason, it is desirable to set to zero the corresponding bits of the data entering the first round and leaving the fifth round in order to achieve an efficient result. since the second and fourth round input bits depend on the output values of the first and fifth round transformations, these bits cannot be selected. for this reason, it is necessary to consider these values as unknown.
the stages of using the quadratic cryptanalysis method for five rounds of gost 28147-89 algorithm are created. in order to achieve an effective result in this method, it is shown that it is important to select the zero values of the corresponding bits of data entering the first round and exiting the fifth round. references akhmedov b.b. “nonlinear cryptanalysis for modification of the xor encryption algorithm gost 28147-89”, i международная научно-практическая интернет-конференция «актуальные вопросы физикоматематических и технических наук: теоретические и прикладные исследования», г.киев. 2021 г. 81-97 стр. www.openscilab.org. akhmedov b.b., aloev r.d. application of quadratic cryptanalysis for a five round xor modification of the encryption algorithm gost 28147-89 // international journal of science and research (ijsr), https://www.ijsr.net/search_index_results_paperid.php?id=sr2081818033 5, volume 9 issue 8, august 2020, 1101 – 1109, issn: 2319-7064, india). carlet, c., & gaborit, p. (2006). hyper-bent functions and cyclic codes. journal of combinatorial theory, series a, 113(3), 466-482. chee, s., lee, s., & kim, k. (1994, november). semi-bent functions. in international conference on the theory and application of cryptology (pp. 105-118). springer, berlin, heidelberg. bardosh akhmedov and rakhmatillo aloev. unique aspects of usage of the quadratic … | 40 dobbertin, h., & leander, g. (2004, october). a survey of some recent results on bent functions. in international conference on sequences and their applications (pp. 1-29). springer, berlin, heidelberg. dobbertin, h., & leander, g. (2005). cryptographer's toolkit for construction of $8 $bit bent functions. cryptology eprint archive. kaliski, b. s., & robshaw, m. j. (1994, august). linear cryptanalysis using multiple approximations. in annual international cryptology conference (pp. 26-39). springer, berlin, heidelberg. knudsen, l. r., & robshaw, m. j. (1996, may). 
non-linear approximations in linear cryptanalysis. in international conference on the theory and applications of cryptographic techniques (pp. 224-236). springer, berlin, heidelberg. kuz’min, a. s., markov, v. t., nechaev, a. a., shishkin, v. a., & shishkov, a. b. (2008). bent and hyper-bent functions over a field of 2ℓ elements. problems of information transmission, 44(1), 12-33. kuzmin, a. s., markov, v. t., nechaev, a. a., & shishkov, a. b. (2006). approximation of boolean functions by monomial ones. logachev, o. a., sal’nikov, a. a., & yashchenko, v. v. (2004). boolean functions in coding theory and cryptology. mccme, moscow. pfitzmann, b. (ed.). (2003). advances in cryptology–eurocrypt 2001: international conference on the theory and application of cryptographic techniques innsbruck, austria, may 6–10, 2001, proceedings (vol. 2045). springer. qu, c., seberry, j., & pieprzyk, j. (2000). homogeneous bent functions. discrete applied mathematics, 102(1-2), 133-139. quisquater, b. a. d. c. c. (2004). m franklin m on multiple linear approximations. in advances in cryptology–crypto (vol. 2004). sakurai, k., & furuya, s. (1997, january). improving linear cryptanalysis of loki91 by probabilistic counting method. in international workshop on fast software encryption (pp. 114-133). springer, berlin, heidelberg. vinokurov, a. algorithm for cryptographic data transformation gost 28147 89. youssef, a. m. (2007). generalized hyper-bent functions over gf (p). discrete applied mathematics, 155(8), 1066-1070. кuryazov d.m., sattarov a.b., akhmedov b.b. блокли симметрик шифрлаш алгоритмлари бардошлилигини замонавий криптотаҳлил усуллари билан баҳолаш. ўқув қўлланма. т.: «aloqachi». 2017, 228 бет. 59 | international journal of informatics information system and computer engineering 3(1) (2022) 59-70 association analysis with apriori algorithm for electronic sales decision support system r. 
fenny syafariani
mathematics department, faculty of ocean technology engineering and informatics, universiti malaysia terengganu, malaysia
corresponding email: r.fenny.syafariani@email.unikom.ac.id

abstract
the purpose of this study was to determine the level of dependence among items, in order to find out which items depend on other items. the method used in this research is descriptive analysis with a qualitative approach using the apriori algorithm. the results show that the association analysis of the 26 transactions taken gives a value of 76.47%: a consumer who buys a laptop is likely to also buy a mouse.

article info
article history: received 25 may 2022, revised 30 may 2022, accepted 10 june 2022, available online 26 june 2022
keywords: technology, information system, apriori, algorithm, decision support system, electronic sales

1. introduction
data mining is a data processing method for finding patterns in data (ordila et al., 2020). there are many methods in data mining. one that is often used is the association method, or association rule mining, in particular with the apriori algorithm. the transaction data generated by the sales process are processed with association rules to obtain information about the products purchased together by buyers (riszky et al., 2019). various kinds of electronic goods are sold, such as laptops, printers, mice and so on. sales transactions continue to grow every day and lead to very large data stores (purnia et al., 2017). most sales transaction data is only used as an archive without being
however, this data set contains very useful information. by applying association analysis (association rule mining), it is hoped that association rules between combinations of items can be found, and that knowledge can be gained about applying association mining through the search for support and confidence.

previous research discussed "application of data mining for analysis of consumer purchase patterns with the fp-growth algorithm on motorcycle spare part sales transaction data" (fajrin et al., 2018). that research examined consumer buying patterns, because each consumer's buying pattern is different. such patterns need further analysis so that they produce useful information and maximize the benefits that can be obtained. another earlier study discussed "data mining analysis for clustering covid-19 cases in lampung province with the k-means algorithm" (nabila et al., 2021). it analyzed data on covid-19 cases in order to group the cases in lampung province, using the clustering method with the k-means algorithm. the dbi (davies-bouldin index) validation results obtained by manual calculation and with the rapidminer tool differ, with the manual calculation giving the better result, but both are close to 0, which means the evaluated clusters are good.

the first of these studies used the fp-growth algorithm and the second used the k-means method. the present research is "association analysis with apriori algorithm for decision support system for selling electronic goods" (riszky et al., 2019).
data mining and the apriori algorithm are very useful for finding out how frequently the electronic goods most in demand by consumers are sold. this information can support decisions about which types of electronic goods to stock in the future.

2. method
this study uses a descriptive analysis method with a qualitative approach (rahmawati et al., 2018). the data is processed with data mining techniques, using the apriori algorithm. the process of forming itemset combinations and making rules starts from data analysis. the data used is sales data for electronic goods. itemset combinations are formed from it, and association rules are formed from the interesting combinations. the data is then put into tabular format (tana et al., 2018; syahril et al., 2020; simbolon, 2019). the application used in the test is a microsoft excel database holding the data in tabular form; the sales transaction data (electronic goods sold) is converted into binary form (triyansyah et al., 2018). after that, combinations of two elements are formed with a minimum frequency of occurrence of 15 and a minimum confidence of 75%. to calculate support and confidence, the following formulas are used:

\[
\text{support} = \frac{\sum \text{transactions containing the items purchased together}}{\sum \text{total transactions}} \times 100\%
\]

\[
\text{confidence} = \frac{\sum \text{transactions containing the items purchased together}}{\sum \text{transactions containing the antecedent}} \times 100\%
\]

3. results and discussion

3.1. preprocessing
the dataset used as a test sample in this study consists of 26 transactions. the items sold in these transactions are printers, laptops, chargers, and mice. the following table shows the transaction data used as a sample.

table 1. transaction data table

transaction   item
e1            printer, laptop
e2            laptop, mouse
e3            charger, printer
e4            printer, mouse, laptop
e5            charger, printer, mouse
e6            laptop, mouse
e7            mouse, laptop, charger
e8            printer, laptop, mouse
e9            printer, laptop
e10           printer, mouse
e11           printer, laptop, mouse
e12           laptop, mouse
e13           charger, printer, mouse
e14           printer, laptop
e15           mouse, laptop, charger
e16           mouse, charger
e17           charger, printer
e18           charger, printer, mouse
e19           printer, mouse
e20           charger, printer, mouse
e21           laptop, mouse
e22           mouse, laptop, charger
e23           printer, laptop
e24           printer, mouse, laptop
e25           laptop, mouse
e26           printer, mouse, laptop

3.2. transaction data tabular format
the application used in the test is a microsoft excel database, so the data must be converted into binary form (sianturi et al., 2018). in the conversion, the transaction (slip) numbers are listed down the rows, while the item types become the columns, forming a table based on the real sales data (electronic goods sold). a cell where an item type meets a slip number that contains it is set to 1, and every other cell is set to 0. the result of converting the sales transaction data to tabular form is shown in the following table:

table 2.
data in the form of tabular data

transaction   printer   mouse   laptop   charger
e1            1         0       1        0
e2            0         1       1        0
e3            1         0       0        1
e4            1         1       1        0
e5            1         1       0        1
e6            0         1       1        0
e7            0         1       1        1
e8            1         1       1        0
e9            1         0       1        0
e10           1         1       0        0
e11           1         1       1        0
e12           0         1       1        0
e13           1         1       0        1
e14           1         0       1        0
e15           0         1       1        1
e16           0         1       0        1
e17           1         0       0        1
e18           1         1       0        1
e19           1         1       0        0
e20           1         1       0        1
e21           0         1       1        0
e22           0         1       1        1
e23           1         0       1        0
e24           1         1       1        0
e25           0         1       1        0
e26           1         1       1        0
total         17        20      17       10

in the tabular data above, the number of occurrences (electronic items sold) of each item is: printer = 17, mouse = 20, laptop = 17, and charger = 10.

3.3. formation of two elements combination pattern
the two-element patterns are formed with a minimum frequency of occurrence ф = 15. in the tabular data, one electronic item does not meet this frequency limit, namely charger = 10, so the two-element patterns are formed from pairs of the 3 remaining items: printer-mouse, printer-laptop, and mouse-laptop. the following are the tables of 2 element combinations:

table 3. two elements combination pattern (printer, mouse)

transaction   printer   mouse   f
e1            1         0       s
e2            0         1       s
e3            1         0       s
e4            1         1       p
e5            1         1       p
e6            0         1       s
e7            0         1       s
e8            1         1       p
e9            1         0       s
e10           1         1       p
e11           1         1       p
e12           0         1       s
e13           1         1       p
e14           1         0       s
e15           0         1       s
e16           0         1       s
e17           1         0       s
e18           1         1       p
e19           1         1       p
e20           1         1       p
e21           0         1       s
e22           0         1       s
e23           1         0       s
e24           1         1       p
e25           0         1       s
e26           1         1       p
total (p)                       11

table 4. two elements combination pattern (printer, laptop)

transaction   printer   laptop   f
e1            1         1        p
e2            0         1        s
e3            1         0        s
e4            1         1        p
e5            1         0        s
e6            0         1        s
e7            0         1        s
e8            1         1        p
e9            1         1        p
e10           1         0        s
e11           1         1        p
e12           0         1        s
e13           1         0        s
e14           1         1        p
e15           0         1        s
e16           0         0        s
e17           1         0        s
e18           1         0        s
e19           1         0        s
e20           1         0        s
e21           0         1        s
e22           0         1        s
e23           1         1        p
e24           1         1        p
e25           0         1        s
e26           1         1        p
total (p)                        9

table 5.
two elements combination pattern (mouse, laptop)

transaction   mouse   laptop   f
e1            0       1        s
e2            1       1        p
e3            0       0        s
e4            1       1        p
e5            1       0        s
e6            1       1        p
e7            1       1        p
e8            1       1        p
e9            0       1        s
e10           1       0        s
e11           1       1        p
e12           1       1        p
e13           1       0        s
e14           0       1        s
e15           1       1        p
e16           1       0        s
e17           0       0        s
e18           1       0        s
e19           1       0        s
e20           1       0        s
e21           1       1        p
e22           1       1        p
e23           0       1        s
e24           1       1        p
e25           1       1        p
e26           1       1        p
total (p)                      13

in the two-element tables above, p in the f column means that both items were sold in the same transaction, while s means that they were not. the total (p) row gives the frequency of each itemset. the support counts for the two-element patterns are therefore:
• printer - mouse = 11
• printer - laptop = 9
• mouse - laptop = 13

3.4. formation of three elements combination pattern
the two-element combinations above can be combined into a three-element set. the set formed from these 3 elements is laptop, printer, mouse. here is the table of 3 elements:

table 6. three elements combination pattern (printer, mouse, laptop)

transaction   printer   mouse   laptop   f
e1            1         0       1        s
e2            0         1       1        s
e3            1         0       0        s
e4            1         1       1        p
e5            1         1       0        s
e6            0         1       1        s
e7            0         1       1        s
e8            1         1       1        p
e9            1         0       1        s
e10           1         1       0        s
e11           1         1       1        p
e12           0         1       1        s
e13           1         1       0        s
e14           1         0       1        s
e15           0         1       1        s
e16           0         1       0        s
e17           1         0       0        s
e18           1         1       0        s
e19           1         1       0        s
e20           1         1       0        s
e21           0         1       1        s
e22           0         1       1        s
e23           1         0       1        s
e24           1         1       1        p
e25           0         1       1        s
e26           1         1       1        p
total (p)                                5

the three-element table shows that laptop, printer, and mouse were sold together in 5 transactions, so the support count of the three-element pattern is 5.

3.5. association rules
the support and confidence values of each frequent itemset are calculated so that candidate association rules appear (lestari, 2017).
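the item totals in table 2 and the pattern counts in tables 3 to 6 can be reproduced with a few lines of code. the following is a minimal python sketch, not the spreadsheet procedure used in the study: the rows are copied from table 2, while the names `ITEMS`, `ROWS`, `MIN_FREQUENCY`, and `itemset_count` are hypothetical and introduced only for illustration.

```python
from itertools import combinations

ITEMS = ("printer", "mouse", "laptop", "charger")
MIN_FREQUENCY = 15  # minimum frequency of occurrence stated in the method section

# binary rows copied from table 2, in the order e1 ... e26
ROWS = [
    (1, 0, 1, 0), (0, 1, 1, 0), (1, 0, 0, 1), (1, 1, 1, 0), (1, 1, 0, 1),
    (0, 1, 1, 0), (0, 1, 1, 1), (1, 1, 1, 0), (1, 0, 1, 0), (1, 1, 0, 0),
    (1, 1, 1, 0), (0, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 0), (0, 1, 1, 1),
    (0, 1, 0, 1), (1, 0, 0, 1), (1, 1, 0, 1), (1, 1, 0, 0), (1, 1, 0, 1),
    (0, 1, 1, 0), (0, 1, 1, 1), (1, 0, 1, 0), (1, 1, 1, 0), (0, 1, 1, 0),
    (1, 1, 1, 0),
]


def itemset_count(rows, itemset):
    """count the transactions in which every item of the itemset is 1."""
    columns = [ITEMS.index(name) for name in itemset]
    return sum(all(row[c] for c in columns) for row in rows)


# single-item totals (last row of table 2)
item_counts = {name: itemset_count(ROWS, (name,)) for name in ITEMS}

# items that reach the minimum frequency; charger (10) is dropped
frequent_items = [name for name in ITEMS if item_counts[name] >= MIN_FREQUENCY]

# two-element patterns (tables 3 to 5) and the three-element pattern (table 6)
pair_counts = {
    pair: itemset_count(ROWS, pair) for pair in combinations(frequent_items, 2)
}
triple_count = itemset_count(ROWS, tuple(frequent_items))

print(item_counts)
print(pair_counts)
print(triple_count)
```

counting directly on the binary rows mirrors the p/s marking in the tables: a row contributes to a pattern only when all of its columns for that pattern are 1.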
to calculate support and confidence, the following formulas are used:

\[
\text{support} = \frac{\sum \text{transactions containing the items purchased together}}{\sum \text{total transactions}} \times 100\%
\]

\[
\text{confidence} = \frac{\sum \text{transactions containing the items purchased together}}{\sum \text{transactions containing the antecedent}} \times 100\%
\]

the results are shown in the table below.

table 7. association rules candidate list

if antecedent, then consequent   support                  confidence
printer, mouse                   11/26 × 100% = 42.30%    11/17 × 100% = 64.70%
mouse, printer                   11/26 × 100% = 42.30%    11/20 × 100% = 55%
printer, laptop                  9/26 × 100% = 34.61%     9/17 × 100% = 52.94%
laptop, printer                  9/26 × 100% = 34.61%     9/17 × 100% = 52.94%
mouse, laptop                    13/26 × 100% = 50%       13/20 × 100% = 65%
laptop, mouse                    13/26 × 100% = 50%       13/17 × 100% = 76.47%

the table above gives the support and confidence of each candidate rule. the association rules that meet the minimum confidence of 75% are then selected, which gives the following:

table 8. association rules list

if antecedent, then consequent   support                  confidence
laptop, mouse                    13/26 × 100% = 50%       13/17 × 100% = 76.47%

from the analysis, 1 association rule meets the minimum confidence limit, namely laptop, mouse. the result is: "76.47% of transactions that contain a laptop also contain a mouse, and 50% of all transactions contain both items". apriori algorithm analysis can therefore be applied to support marketing strategies in a company or institution. data mining and the apriori algorithm are very useful for finding relationships in the sales frequency of the electronic goods most in demand by customers, which is valuable information for deciding which types of electronic goods to stock in the future.

4. conclusion
the apriori algorithm is used in association analysis to determine the level of dependence between items, that is, to find out which items depend on which other items, based on the 26 transaction records sampled.
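as a check on the figures summarized here, the support and confidence of the six candidate rules can be recomputed from the counts reported in tables 2 to 5. this is a minimal python sketch under that assumption; the names `Rule`, `candidate_rules`, and `MIN_CONFIDENCE` are hypothetical and are not part of the study's own tooling.

```python
from typing import NamedTuple

N_TRANSACTIONS = 26
MIN_CONFIDENCE = 75.0  # percent, the minimum confidence used in the study

# counts taken from the totals of table 2 and tables 3 to 5
ITEM_COUNTS = {"printer": 17, "mouse": 20, "laptop": 17}
PAIR_COUNTS = {
    ("printer", "mouse"): 11,
    ("printer", "laptop"): 9,
    ("mouse", "laptop"): 13,
}


class Rule(NamedTuple):
    antecedent: str
    consequent: str
    support: float      # percent of all transactions containing both items
    confidence: float   # percent of antecedent transactions containing both items


def candidate_rules(pair_counts, item_counts, n_transactions):
    """build both directions of every pair as a candidate rule."""
    rules = []
    for (a, b), both in pair_counts.items():
        for antecedent, consequent in ((a, b), (b, a)):
            support = both / n_transactions * 100
            confidence = both / item_counts[antecedent] * 100
            rules.append(Rule(antecedent, consequent, support, confidence))
    return rules


rules = candidate_rules(PAIR_COUNTS, ITEM_COUNTS, N_TRANSACTIONS)
selected = [rule for rule in rules if rule.confidence >= MIN_CONFIDENCE]

for rule in rules:
    print(f"{rule.antecedent} -> {rule.consequent}: "
          f"support {rule.support:.2f}%, confidence {rule.confidence:.2f}%")
```

support is symmetric in the two items, while confidence depends on which item is the antecedent, which is why only one direction of the laptop and mouse pair passes the 75% threshold.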
the association analysis of the sampled transactions shows that 76.47% of consumers who buy a laptop also buy a mouse, and that 50% of all transactions contain both items.

references
ordila, r., wahyuni, r., irawan, y., & sari, m. y. (2020). penerapan data mining untuk pengelompokan data rekam medis pasien berdasarkan jenis penyakit dengan algoritma clustering (studi kasus: poli klinik pt. inecda). jurnal ilmu komputer, 9(2), 148-153.
riszky, a. r., & sadikin, m. (2019). data mining menggunakan algoritma apriori untuk rekomendasi produk bagi pelanggan. jurnal teknologi dan sistem komputer, 7(3), 103-108.
purnia, d. s., & warnilah, a. i. (2017). implementasi data mining pada penjualan kacamata menggunakan algoritma apriori. ijcit (indonesian journal on computer and information technology), 2(2).
rahmawati, f., & merlina, n. (2018). metode data mining terhadap data penjualan sparepart mesin fotocopy menggunakan algoritma apriori. piksel: penelitian ilmu komputer sistem embedded and logic, 6(1), 9-20.
triyansyah, d., & fitrianah, d. (2018). analisis data mining menggunakan algoritma k-means clustering untuk menentukan strategi marketing. incomtech: jurnal telekomunikasi dan komputer, 8(3), 163-182.
nabila, z., isnain, a. r., permata, p., & abidin, z. (2021). analisis data mining untuk clustering kasus covid-19 di provinsi lampung dengan algoritma k-means. jurnal teknologi dan sistem informasi, 2(2), 100-108.
sianturi, f. a. (2018). penerapan algoritma apriori untuk penentuan tingkat pesanan.
jurnal mantik penusa, 2(1).
lestari, n. (2017). penerapan data mining algoritma apriori dalam sistem informasi penjualan. jurnal edik informatika penelitian bidang komputer sains dan pendidikan informatika, 3(2), 103-114.
tana, m. p., marisa, f., & wijaya, i. d. (2018). penerapan metode data mining market basket analysis terhadap data penjualan produk pada toko oase menggunakan algoritma apriori. jimp-jurnal informatika merdeka pasuruan, 3(2).
syahril, m., erwansyah, k., & yetri, m. (2020). penerapan data mining untuk menentukan pola penjualan peralatan sekolah pada brand wigglo dengan menggunakan algoritma apriori. jurnal teknologi sistem informasi dan sistem komputer tgd, 3(1), 118-136.
simbolon, p. h. (2019). implementasi data mining pada sistem persediaan barang menggunakan algoritma apriori (studi kasus: srikandi cash credit elektronic dan furniture). jurikom (jurnal riset komputer), 6(4), 401-406.
fajrin, a. a., & maulana, a. (2018). penerapan data mining untuk analisis pola pembelian konsumen dengan algoritma fp-growth pada data transaksi penjualan spare part motor. kumpulan jurnal ilmu komputer (klik), 5(01), 1-10.

international journal of informatics information system and computer engineering 3(2) (2022) 209-218
doi: https://doi.org/10.34010/injiiscom.v3i2.9037
p-issn 2810-0670 e-issn 2775-5584

log monitoring system using quick response (qr) code: a state university's covid-19 contact tracing system

anna monica c. paculaba*, mark angelo s. bathan, erickson l. niego
college of arts and sciences, samar state university, philippines
*corresponding email: monica.paculaba@ssu.edu.ph

abstract
contact tracing is the technique employed by public health units and the national close contact service to track down persons who may have been exposed to covid-19 through interaction with a suspected, confirmed, or probable case during their infectious period.
this study focused on the development of a log monitoring system using quick response (qr) codes at samar state university as the institution's tracing system for covid-19 prevention. the system was designed as a tool for managing the everyday logs of employees, students, and visitors in order to track down persons who have been in close contact with a covid-19 positive individual. the waterfall model was used in developing the system, and a descriptive research design was used to determine the effectiveness of the system in terms of functionality, reliability, usability, efficiency, maintainability, and portability. the participants of the study were the employees, students, and visitors of ssu. each participant was given an iso 9126 quality standard questionnaire to evaluate the effectiveness of the system. the results revealed that, using the system, contact tracing of individuals suspected of having covid-19 was done easily and reliably.

article info
article history: submitted/received 01 jul 2022; first revised 03 sept 2022; accepted 01 oct 2022; available online 13 oct 2022; publication date 01 dec 2022
keywords: information system, iso/iec 9126, monitoring system, system effectiveness.

1. introduction
during the current covid-19 pandemic and previous pandemics, a variety of digital health initiatives were used to control disease transmission.
these control techniques have been shown to be effective in decreasing the initial wave of covid-19 in various countries. among these strategies, contact tracing is regarded as the cornerstone of containment and receives a lot of attention. according to ferretti et al. (2020), contact tracing entails locating, quarantining, and notifying infected individuals' contacts. it is a technique employed by public health units and the national close contact service to track down persons who may have been exposed to covid-19 through interaction with a suspected, confirmed, or probable case during their infectious period (new zealand government ministry of health, 2020). using these kinds of technologies and tools is one of the most efficient ways to stop the transmission of the virus and to identify the primary and secondary contacts of confirmed covid-19 patients (dunford et al., 2020; chen et al., 2020). this has led to a fast rise of covid-19 contact tracing technologies in countries around the world.

in the philippines, various contact tracing systems have emerged. local government units (lgus) in the different regions of the country implemented their own contact tracing systems to boost their contact tracing programs amid the rising number of covid-19 cases. the local government unit of catbalogan city, samar, employed this type of system. it was deployed to the different establishments in the city to monitor the logs of individuals, using quick response (qr) codes to capture their data easily. in this way, persons who have been in close contact with a covid-19 positive individual can be located. every resident of the city, and every non-resident who works in the area, needs to register their information manually to obtain a "quarantine pass", an identification card with a qr code that serves as their access to enter different establishments.
one of the establishments that received this system is samar state university (ssu), a higher education institution located in the city of catbalogan. the system was implemented at ssu; however, it can only record the logs of employees and visitors who have quarantine passes. information about students who come from places outside catbalogan cannot be recorded. in addition, the system only captures the logs of those who enter the campus. information about individuals who visit the different departments, colleges, and offices within ssu is not covered by the system.

in the modern world, monitoring systems are widely used in organizations such as schools to keep track of day-to-day operations (balmes, 2016). with a unique qr code-based identity card, authorization and authentication in monitoring become very important for the growth of the organization (bole et al., 2016). moreover, among code-scanning technologies, qr code-based monitoring is reported to be the most accurate (wei et al., 2017). this led the researchers to the idea of developing a web-based log monitoring system with qr code integration for samar state university. the system is intended to quickly capture information from the employees, students, and visitors who come to the campus so that their daily logs can be traced, which helps in tracking down persons who were in close contact with a covid-19 positive individual.
individuals, including students, faculty, and employees of ssu as well as visitors from any part of the country who want to visit ssu, can easily obtain their "quarantine pass" because the system has automated registration. the system can record entrance and exit logs for the campus, and it also logs individuals who enter and exit the different departments, colleges, and offices of ssu. in this way, the time an individual spends in each office visited within the campus can be easily tracked, which is an important factor in determining whether a person is considered a close contact of a covid-19 positive individual. in addition, to test the effectiveness of the system, the iso/iec 9126 system quality standard is applied (iso/iec 9126, 2022).

2. method
this study used a developmental approach in developing the log monitoring system using quick response (qr) codes and a quantitative approach to assess the system's effectiveness upon implementation. the researchers chose the waterfall system development life cycle model, as shown in fig. 1, to achieve the objectives of the study. the waterfall model defines several consecutive phases that must be completed one after the other, moving to the next phase only when the preceding phase is completely done. for this reason, the waterfall model is recursive in that each phase can be endlessly repeated until it is perfected (bassil, 2012). this methodology is composed of five phases: planning and analysis, design, implementation, testing, and maintenance.

2.1. system planning and analysis
the researchers determined the requirements by gathering data from the end users. after the data was gathered, it was analyzed to test its validity.
the feasibility of incorporating the requirements into the system to be developed was also studied.

2.2. system design
the system was logically planned to fulfill the functional requirements identified during the analysis of the requirements specification. in addition, the database and the interface were sketched to show how they would be designed.

2.3. implementation
in this phase, the system requirements and blueprints were converted into actual code. the researchers used the php programming language as the base of the system and mysql for its database.

2.4. testing and integration
the product was tested. each function was verified and validated by the researchers to evaluate whether the vital specifications were met.

2.5. operation and maintenance
after the system was tested and evaluated, the errors discovered during its deployment were corrected and the system was refined and modified, taking into account the suggestions gathered from the end users.

2.6. system architecture and flow
the architecture and flow of the log monitoring system are presented in figs. 1 and 2.

2.7. data gathering procedure
the researchers provided a set of questionnaires as one of the basic instruments for conducting beta testing. a total of forty (40) respondents, 10 faculty, 10 admin employees, 10 students, and 10 visitors, tested the effectiveness of the system. the respondents tested the system's effectiveness in terms of its accuracy, functionality, reliability, and efficiency. the researchers also used applicable statistical tools for the complete evaluation of the system in order to acquire concrete feedback and opinions from the respondents.

fig. 1. developmental process
fig. 2.
system architecture and flow

the effectiveness of the system was assessed through three (3) layers of testing, namely alpha, beta, and full implementation. the mean ratings use the standard criteria under iso/iec 9126 in terms of functionality, reliability, usability, efficiency, maintainability, and portability.

3. results and discussion
the results of the study were derived in line with its objectives.

3.1. system's interface

3.1.1. log-in and registration
before the main setup form of the log monitoring system can be entered, a login form pops up, as shown in fig. 3, so that users can log in with their own username and password. the function of the login form is to take authentication credentials so that users can access the main form of the system. when the login form is submitted, the underlying code checks the submitted credentials against those stored in the mysql database. if they match, the user is granted access to the further features of the system. to obtain an account in the log monitoring system, a client must first register by providing personal information: first name, last name, contact number, address, client type (faculty, student, admin employee, or visitor), and their respective office or department. outsiders and visitors need to select the office or department that they want to visit.

3.1.2. homepage
once the user has successfully logged in to the system, the user is redirected to the homepage interface, as shown in fig. 4. the homepage shows a menu option where the qr code information is located. a unique qr code is generated automatically once the user provides their information and successfully accesses the system.
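the paper does not state what the generated qr code encodes, so the following is only an illustrative python sketch of one way a unique code could be produced at registration. it is not the authors' php implementation: the function name `new_qr_payload` and the field names are hypothetical, and turning the returned string into a qr image would need a separate qr library that is not shown here.

```python
import json
import secrets


def new_qr_payload(client_type: str, office: str) -> str:
    """return a json string holding a random, hard-to-guess token.

    assumption: the token is stored with the registration record in the
    database, so the scanner only needs the token to look the person up.
    """
    record = {
        "token": secrets.token_urlsafe(16),  # 16 random bytes, url-safe text
        "client_type": client_type,
        "office": office,
    }
    return json.dumps(record, sort_keys=True)


first = new_qr_payload("visitor", "registrar")
second = new_qr_payload("visitor", "registrar")
print(first)
print(first != second)
```

encoding a random token rather than the person's name, address, or contact number keeps personal details out of the printed code; whether the actual system does this is not stated in the paper.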
3.1.3. qr code scanner
this interface shows how the qr code is scanned, as seen in fig. 5. once a unique qr code is scanned, the information of the user who enters the university is automatically stored in the database.

3.1.4. log reports
the log reports interface is shown in figs. 6 and 7. it presents a summary of the logs of the persons who enter and exit the campus. once the summary of log reports is downloaded, it displays the persons who entered a specific office, the times they logged in and logged out, their contact number, address, and client type (faculty, student, admin employee, or visitor). reports can be downloaded in excel format.

fig. 3. log-in and registration form
fig. 4. homepage
fig. 5. qr code scanner
fig. 6. log report interface
fig. 7. summary of log report

3.2. end user's system performance evaluation
the result of the end-user evaluation shown in table 1 indicates that the system's overall performance is "highly effective", with a grand mean of 4.61. among the parameters, reliability received the highest mean, 4.68. this result implies that the system is free from error, is capable of handling errors, can resume and restore lost data after a failure, and can present integrated reports.
this result agrees with the study of tworek (2018), which links the reliability of an information system to information security, availability, and responsiveness. in other words, the system is assured to be accurate, conveniently accessible, user-friendly, accepted by its users, responsive, and highly available with high security.

table 1. end user's evaluation of system performance

criteria          mean value   descriptive remarks
functionality     4.65         highly effective
reliability       4.68         highly effective
usability         4.58         highly effective
efficiency        4.56         highly effective
maintainability   4.61         highly effective
portability       4.55         highly effective
grand mean        4.61         highly effective

legend:
4.51 - 5.00   highly effective
3.51 - 4.50   moderately effective
2.51 - 3.50   effective
1.51 - 2.50   slightly effective
1.00 - 1.50   not effective

4. conclusion
a log monitoring system using qr codes is an effective way to monitor the logs of the faculty, students, admin employees, and visitors who enter the campus. since the log-in and log-out of each person are recorded, contact tracing of individuals suspected of having covid-19 can be done easily and reliably, because the system records how long they stayed in each area or office they visited inside the university. for the expansion and sustainable use of the log monitoring system using qr codes, it is recommended that attendance monitoring of students, laboratory utilization monitoring, and other related activities be included in the system.
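to illustrate how recorded log-in and log-out times can support contact tracing, the following python sketch finds people whose stay in the same office overlapped with that of a confirmed case. it is a minimal sketch and not the system's actual php/mysql code: the `Stay` record, the sample logs, and the 15-minute threshold are all assumptions made for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class Stay(NamedTuple):
    person: str
    office: str
    log_in: datetime
    log_out: datetime


def overlap_minutes(a: Stay, b: Stay) -> float:
    """minutes during which both stays were in the same office."""
    if a.office != b.office:
        return 0.0
    start = max(a.log_in, b.log_in)
    end = min(a.log_out, b.log_out)
    return max((end - start).total_seconds() / 60, 0.0)


def close_contacts(stays, case, min_minutes=15.0):
    """people who shared an office with `case` for at least `min_minutes`."""
    case_stays = [s for s in stays if s.person == case]
    found = set()
    for mine in case_stays:
        for other in stays:
            if other.person != case and overlap_minutes(mine, other) >= min_minutes:
                found.add(other.person)
    return sorted(found)


def at(hour, minute):
    # illustrative date only; any single day works for this example
    return datetime(2022, 10, 3, hour, minute)


# made-up sample logs, not data from the study
logs = [
    Stay("a", "registrar", at(9, 0), at(10, 0)),
    Stay("b", "registrar", at(9, 30), at(9, 50)),
    Stay("c", "registrar", at(9, 55), at(10, 30)),
    Stay("d", "library", at(9, 0), at(10, 0)),
]

print(close_contacts(logs, "a"))
```

the threshold is a parameter because the paper does not say how long a shared stay must be before a person counts as a close contact.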
acknowledgement
the researchers would like to acknowledge the center for engineering, science and technology, and innovation (cesti) of samar state university for the funding support that made this research project possible.

references
balmes, i. l. (2016). online day to day monitoring system for lyceum of the philippines university-batangas. the 2nd multidisciplinary research and innovation for globally sustainable development.
bassil, y. (2012). a simulation model for the waterfall software development life cycle. international journal of engineering & technology (ijet), 2(5).
bole, g. r., more, s. p., parnak, a. a., & naik, l. s. (2016). qr code based effective employee maintenance system. international research journal of engineering and technology (irjet), 03(04), 604-607.
chen, s., yang, j., yang, w., wang, c., & bärnighausen, t. (2020). covid-19 control in china during mass population movements at new year. the lancet, 395(10226), 764-766.
dunford, d., dale, b., stylianou, n., lowther, e., ahmed, m., & de la torre arenas, i. (2020). coronavirus: the world in lockdown in maps and charts. bbc news, 9, 462.
ferretti, l., wymant, c., kendall, m., zhao, l., nurtay, a., abeler-dörner, l., ... & fraser, c. (2020). quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing. science, 368(6491), eabb6936.
iso/iec 9126 in software engineering. (2022). retrieved from geeksforgeeks: https://www.geeksforgeeks.org/iso-iec-9126-in-software-engineering/
new zealand government ministry of health. (2020). retrieved from https://www.health.govt.nz/system/files/documents/publications/covid-19_contact_tracing_qr_code_specification_25_may_2020.pdf
tworek, k. (2018).
reliability of information systems in organization in the context of banking sector: empirical study from poland. cogent business & management, 5(1), 1522752.
wei, x., manori, a., devnath, n., pasi, n., & kumar, v. (2017). qr code based smart attendance system. international journal of smart business and technology, 5(1), 1-10.

international journal of informatics information system and computer engineering 4(1) (2023) 23-30
doi: https://doi.org/10.34010/injiiscom.v4i1.9414 p-issn2810-0670 e-issn2775-5584

combination of technology acceptance model and decision-making process to study retentive consumer behavior on online shopping

edwin rudini1*, dwiza riana2, sri hadianti3
1,2program studi ilmu komputer, universitas nusa mandiri
3program studi informatika, universitas nusa mandiri
jln. jatiwaringin no.2 cipinang melayu, makasar jakarta timur, kota jakarta, 13620 dki jakarta
*corresponding email: edwinrudini98@gmail.com

abstract
during the spread of the covid-19 virus, indonesian people generally began to switch from conventional markets to buying and selling goods and services online, drawn by the various features and conveniences offered to users. the purpose of this study is to find out the extent to which indicators of satisfaction and trust influence consumer attitudes and behavior when deciding to make transactions at online shops. the study method uses a combination of the tam (technology acceptance model) and dmp (decision making process) models, with a sample of 110 respondents drawn from students and members of the public who have made transactions in online shops.
data analysis uses sem (structural equation modeling). the results showed that satisfaction and trust will influence consumers in shaping.

article history: submitted/received 03 dec 2022; first revised 31 dec 2022; accepted 20 feb 2023; first available online 10 april 2023; publication date 01 june 2023

keywords: tam, dmp model, trust, satisfaction, repurchase.

1. introduction
changing people's behavior within the framework of online stores is a big challenge for companies that want to serve all of people's needs and wants. information released by the ministry of communication and information explained that the value of online shopping transactions in 2021 will reach idr 337 trillion and the number of internet users will reach 210 million (rose et al., 2015). it can therefore be concluded that online trading has wide room for development. this encourages several large companies to invest in the advancement of online business in indonesia. the growing potential of online commerce is expected to produce more technology entrepreneurs and to encourage the growth of msmes, each using its own potential according to the characteristics of the company (haryanti & subriadi, 2020). icd research foundation estimates the growth potential of the online shop business in indonesia at 33.2% during 2020-2021, which makes indonesia one of the countries with the fastest and largest growth rate of online shopping-based business in the asia-pacific region (setyowati et al., 2021).
the increase in online commerce activity has not been matched by growth in the number of online shop buyers. this is due to a number of barriers, including low credit and debit card accessibility as well as shoppers' reluctance to shop online (nagy & hajdú, 2021). nielsen statistical surveys also indicate that buyers look for information on the internet before choosing to buy a product they need. trust in buying products online is a barrier that is difficult to control, because it is related to customer views and behavior (dennis et al., 2010). it is therefore necessary to study the attitudes and behavior of buyers towards online shopping so that business organizations can take advantage of existing opportunities. buyer behavior can be measured with a social behavior approach, which acts as a variable that influences customer views and behavior when shopping at online shops (petcharat & leelasantitham, 2021). for estimating the use of information technology, several models can be used, such as the technology acceptance model (tam) and the decision making process (dmp), which state that individual behavior is a measure of power and activity whereby individuals will make use of information systems and information technology if doing so is beneficial (wei et al., 2018). customer behavior in online commerce is also influenced by satisfaction with making transactions on the internet, which is a major factor that makes buyers prefer online stores. furthermore, buyer satisfaction with online transactions has been shown to affect customer confidence, which in turn affects buyers' views on repurchases. given the problems, obstacles, and difficulties, as well as the potential created by online merchants, the online shopping sector is expected to encourage the improvement of the indonesian economy (kim, 2020).

2.
method
the study method uses a combination of the tam (technology acceptance model) and dmp (decision making process) models. due to the large population and limited time and cost, the sample was limited to 110 respondents from the population studied. in addition, research methods can be used to evaluate and compare results and draw conclusions. samples were collected through targeted sampling of students and community members. research samples were selected on several criteria: being an active student, being willing to answer the survey distributed by the researchers, a minimum sample size of 15% of the total population, and having already completed an online shopping transaction (see tables 1 and 2). data were collected by distributing questionnaires through google forms to respondents according to the specified randomization criteria and by directly observing the object of study. the variables used in this study are independent variables, moderator variables, and a dependent variable. the independent variables are perceived usefulness (x1), perceived ease of use (x2), evaluation alternative (x3), and information search (x4); the moderator variables are trust (y1) and satisfaction (y2); and the dependent variable is purchase (z).

table 1. respondent demographics, total (n-384)

category      group       frequency   percentage (%)
gender        male        24          26.8
gender        female      86          73.20
nationality   indonesia   110         100
age           18-35       81          29.17
age           36-53       15          18.08
age           54-65       14          17.05

table 2.
variable and indicators

perceived usefulness (x1)
  online shopping platforms help you search and buy products faster than offline shopping
  online shopping platforms help you buy products cheaper than offline shopping
perceived ease of use (x2)
  you can use online shopping platforms with your own gadgets
  online shopping platforms have clear functions and are easy to understand
evaluation alternative (x3)
  the search function in online shopping platforms is necessary and benefits shoppers
  intend to buy products before going to the shopping cart
  functions in online shopping platforms help you to compare a product
information search (x4)
  the search function on online shopping platforms is quite helpful
  comparing the quality and price of similar products will make it easier for you to make a decision to buy
perceived trust (y1)
  online shopping platforms have accurate and clear results such as product details and prices
  you are sure to get the purchased products from the online shopping platform
satisfaction (y2)
  online shopping platforms have accurate and clear results such as product details and prices
  you are sure to get the purchased products from the online shopping platform
  an online shopping platform is required for you
repurchase (z)
  you can buy back from online shopping platforms
  you have repurchased the same product from an online shopping platform

3. results and discussion
this analysis was used to describe the results of a survey of students and members of the general public who answered a questionnaire measuring their trust and satisfaction in using online shopping (bhatti et al., 2020). the data was processed using smart partial least squares (pls) software version 3.2.9 and microsoft excel 2016 for windows.
the variables analyzed in this study include satisfaction before and after transactions (x1, x2, x3 and x4), trust (y1), satisfaction with online shopping repurchases (y2), and purchases (z) (riantini, n.d.). the research instrument uses the likert scale methodology. the likert scale consists of two types of statements, positive and negative, with positive statements scored from 4 points for strongly agree responses down to 1 point for strongly disagree responses. structural equation modeling (sem) with pls is used as the data analysis technique to develop predictive theories related to satisfaction and trust in the use of online shopping among students and the community (fedorko et al., 2018). pls model analysis is based on predictive measures with non-parametric properties. convergent validity is assessed from the loadings of the individual reflective indicators, with loading values of > 0.5 considered acceptable. discriminant validity is assessed by comparing the square root of the average variance extracted (ave) of each construct with the correlations between that construct and the other constructs in the model (sheth, 2020). discriminant validity is good if the square root of the ave is greater than the correlation between the construct and the other constructs in the model. structural models are tested using r squared for the dependent constructs, the stone-geisser q-squared test for predictive relevance, and t tests and significance for the structural path parameters. data analysis was carried out by entering all respondents' data and testing convergent validity, discriminant validity, and significance (cai et al., 2023). the results of the calculation show that all indicators meet a construct loading value of > 0.5, so all indicators can be used in tests using the pls model. the loading value of each indicator from the convergent validity calculation is shown in table 3.

table 3.
convergent validity value
(construct columns in the source: evaluation alternative, uses, facilities, belief, satisfaction, information search, purchase)

indicator   loading
ea1         0.879
ea2         0.830
is2         0.737
is3         0.941
peu1        1.000
ps1         0.899
ps2         0.897
pt1         0.920
pt2         0.880
pu2         1.000
rp2         1.000

table 4. discriminant validity

                         evaluation alternative   uses     facilities   belief   satisfaction   information search   purchase
evaluation alternative   0.855
uses                     0.283                    1.000
facilities               0.314                    0.496    1.000
belief                   0.344                    0.377    0.152        0.900
satisfaction             0.497                    0.455    0.510        0.353    0.898
information search       -0.099                   0.167    0.272        -0.191   0.145          0.845
purchase                 0.041                    0.065    0.192        -0.069   -0.109         0.177                1.000

table 5. average variance extracted (ave) and composite reliability

                            cronbach's alpha   rho_a   composite reliability   average variance extracted (ave)
evaluation of alternative   0.633              0.644   0.844                   0.731
uses                        1.000              1.000   1.000                   1.000
facilities                  1.000              1.000   1.000                   1.000
belief                      0.769              0.789   0.896                   0.811
satisfaction                0.760              0.760   0.893                   0.806
information search          0.634              0.845   0.831                   0.714
purchase                    1.000              1.000   1.000                   1.000

table 6.
path coefficient and decision

path                                     original sample (o)   sample mean (m)   standard deviation (stdev)   t statistics (|o/stdev|)   p value
evaluation alternative -> belief         0.232                 0.257             0.099                        2.331                      0.020
evaluation alternative -> satisfaction   0.320                 0.314             0.113                        2.836                      0.005
evaluation alternative -> purchase       0.111                 0.131             0.144                        0.775                      0.439
uses -> belief                           0.371                 0.352             0.104                        3.554                      0.000
uses -> satisfaction                     0.143                 0.146             0.094                        1.511                      0.132
uses -> purchase                         0.051                 0.055             0.158                        0.325                      0.745
facilities -> belief                     -0.046                -0.054            0.125                        0.367                      0.714
facilities -> satisfaction               0.284                 0.288             0.120                        2.373                      0.018
facilities -> purchase                   0.305                 0.315             0.135                        2.258                      0.025

the discriminant validity values (fornell-larcker criterion), the ave and composite reliability, and the path coefficients are shown in tables 4 – 6. table 6 shows the path coefficients, which measure the significance and strength of relationships. these figures are used to interpret the importance and strength of relationships between constructs. relationships in this table are judged against a significance level of 0.05. a path with a p value greater than 0.05 is considered not significant, while a path with a p value less than 0.05 is considered significant, which supports the strength of the relationship. the findings for each hypothesis are as follows:
h1a. evaluation of alternatives has an effect on trust.
h1b. evaluation of alternatives has a significant effect on satisfaction.
h1c. evaluation of alternatives has no effect on purchasing.
h2a. usability has a significant effect on trust.
h2b. usability has no effect on satisfaction.
h2c. usability has no effect on purchases.
h3a. convenience has a significant and positive effect on purchases.
h4a. trust has no effect on satisfaction.
h4b. trust has no effect on purchases.
h4c. satisfaction affects purchases.
h5a. seeking information has no effect on trust.
h5b. seeking information has no effect on satisfaction.

3.1. technology acceptance model (tam)
this model was originally created by davis and has become one of the most widely used models to explain how users accept new technologies. the model was developed from the theory of reasoned action and provides a basis for identifying how external variables such as beliefs, attitudes, and intentions influence the acceptance of new technologies (see fig. 1) (dennis et al., 2010). perceived usefulness is the degree to which individuals believe that using a technology improves the quality of their work (moe & fader, 2004).

3.2. perceived ease of use
ease of use means that individuals believe that using information technology systems will not cause problems or require too much effort (hernández et al., 2011).

3.3. dmp (decision making process)
the purchase decision process used in this study is related to the theory that the process is an orderly action and information search. this process includes searching for and gathering information about relevant products or services. it gives the buyer a wide range of products and services worth buying (cai et al., 2023).

table 7. r square adjust value

               r square   r square adjusted
belief         0.254      0.226
satisfaction   0.443      0.417
purchase       0.105      0.062

fig. 1. smart pls use tam & dmp model

3.4. evaluation of alternative
this involves evaluating and comparing different products or services worth buying (deananda et al., 2020).

4.
conclusion
customer satisfaction is influenced by the trust process. this research shows that customer attitudes and behaviors when shopping online are influenced by the process of customer trust in online stores. from this it follows that trust in online stores has a significant impact on customer attitudes and behavior. for an online shop business to succeed, the goal is to maintain customer trust well and to protect msme business actors who use information technology to further optimize sales.

references
bhatti, a., akram, h., basit, h. m., khan, a. u., mahwish, s., naqvi, r., & bilal, m. (2020). e-commerce trends during covid-19 pandemic. international journal of future generation communication and networking, 13(2), 1449–1452.
cai, x., cebollada, j., & cortiñas, m. (2023). impact of seller- and buyer-created content on product sales in the electronic commerce platform: the role of informativeness, readability, multimedia richness, and extreme valence. journal of retailing and consumer services, 70 (september 2022).
deananda, a., budiastuti, p., & muid, d. (2020). faktor-faktor pengaruh minat penggunaan sistem informasi akuntansi berbasis e-commerce pada aplikasi shopee dengan menggunakan technology acceptance model (tam). diponegoro journal accounting, 9(4), 1–10.
dennis, c., morgan, a., wright, l. t., & jayawardhena, c. (2010). the influences of social e-shopping in enhancing young women’s online shopping behaviour. journal of customer behaviour, 9(2), 151–174.
fedorko, i., bacik, r., & gavurova, b. (2018). technology acceptance model in e-commerce segment. management and marketing, 13(4), 1242–1256.
haryanti, t., & subriadi, a. p. (2020). factors and theories for e-commerce adoption: a literature review. international journal of electronic commerce studies, 11(2), 87–105. academy of taiwan information systems research.
hernández, b., jiménez, j., & martín, m. j. (2011). age, gender and income: do they really moderate online shopping behaviour? online information review, 35(1), 113–133.
kim, r. y. (2020). the impact of covid-19 on consumers: preparing for digital sales. ieee engineering management review, 48(3), 212–218. https://doi.org/10.1109/emr.2020.2990115
moe, w. w., & fader, p. s. (2004). dynamic conversion behavior at e-commerce sites. management science, 50(3), 326–335. https://doi.org/10.1287/mnsc.1040.0153
nagy, s., & hajdú, n. (2021). consumer acceptance of the use of artificial intelligence in online shopping: evidence from hungary. amfiteatru economic, 23(56), 1–1.
petcharat, t., & leelasantitham, a. (2021). a retentive consumer behavior assessment model of the online purchase decision-making process. heliyon, 7(10).
riantini, r. e. (n.d.). adoption of e-commerce online to offline with technology acceptance model (tam) approach.
rose, k., eldridge, s., & chapin, l. (2015). the internet of things (iot): an overview. int. journal of engineering research and applications, 5(12), 71–82.
setyowati, w., widayanti, r., & supriyanti, d. (2021). implementation of e-business information system in indonesia: prospects and challenges. international journal of cyber and it service management, 1(2), 180–188.
sheth, j. (2020). impact of covid-19 on consumer behavior: will the old habits return or die? journal of business research, 117, 280–283. https://doi.org/10.1016/j.jbusres.2020.05.059
wei, y., wang, c., zhu, s., xue, h., & chen, f. (2018). online purchase intention of fruits: antecedents in an integrated model based on technology acceptance model and perceived risk theory. frontiers in psychology, 9(aug).
international journal of informatics information system and computer engineering 3(2) (2022) 231-240
doi: https://doi.org/10.34010/injiiscom.v3i2.9504 p-issn 2810-0670 e-issn 2775-5584

coffee tree detection using convolutional neural network

ahmed abdulsalam naji hasan
department of computer science, faculty of applied science, taiz university, republic of yemen
corresponding email: aaftername@gmail.com

abstract
identifying plants is an important field in the environment because of their role in the continuation of human existence. finding a plant by traditional methods, such as looking at its physical properties, is a burdensome task. thus, several computational methods have been introduced for detecting trees. in this study we constructed a coffee tree dataset, because no publicly available dataset exists for detecting and classifying coffee trees in orchard environments, even though this tree plays a role in the health, industrial, and agricultural fields and in driving economic development. many machine learning algorithms have been used to detect and classify trees, with reliable results. in this study, we present a deep learning-based approach, in particular a convolutional neural network, for coffee tree detection and classification. the study focused on providing a dataset for the detection and classification of coffee trees and on improving the efficiency of the algorithm used in the detection and classification model. the proposed system achieved an accuracy of 0.97.
article history: submitted/received 05 may 2022; first revised 01 jul 2022; accepted 19 sept 2022; available online 21 oct 2022; publication date 01 dec 2022

keywords: coffee tree dataset, almawasit, wadi balabil, al-ghayl, joreenat, al-zeila

1. introduction
1.1. overview of the study
agriculture is extremely important to the continuation of human existence. it remains a major driver of many economies around the world, particularly in developing and underdeveloped economies. there is a growing demand for food and cash crops. machine learning techniques have been widely used in the control and management of agricultural crops and in the identification of crop types. several machine learning algorithms have been used in the agricultural field since the beginning of the twentieth century. for example, convolutional neural networks (cnn) have been used to identify trees, but this still needs a standard dataset designed to achieve high efficiency by taking full advantage of different collection devices, to help identify and detect different crop trees. coffee is one of these important crops and a driving force of the economy in various countries of the world (clarence et al., 2003). reflecting the importance of coffee, 1 october 2015 was marked as world coffee day. coffee is the basis of the economy of many of these countries, especially in the yemeni environment. thousands of families depend on the coffee crop to increase their annual income. approximately one million people work in this field, from its cultivation until its exportation.
it can be considered the main commodity that yemen exports to the world after oil. with the coffee crop, yemen has recorded a distinguished presence at the global level since the beginning of the sixth century as the first source of coffee (amamo, 2014). yemen is the only country in the world where the coffee tree is grown under climatic and environmental conditions that are not similar to other regions of the world. yemeni coffee, which is called arabica coffee, is considered the most famous and the most expensive coffee in the world (incomes et al., 2005). in the field of health, many studies have shown that coffee extracted from this tree is of great benefit to the body, as it treats many diseases such as type 2 diabetes, colon and prostate cancer (kolb et al., 2020), and liver diseases (nieber et al., 2017). coffee also helps to increase focus and endurance (karayigit et al., 2021). it protects the liver from cirrhosis, as it reduces liver enzymes and prevents liver cancer. coffee is also a rich source of antioxidants that protect people from tooth loss (priya et al., 2020) and gum disease (nagpal et al., 2014). thus, people have become more interested in planting coffee trees (al-zaidi et al., 2016). this study aims to detect and classify the coffee tree using a convolutional neural network to encourage farmers to plant coffee trees. it will assist in developing the coffee tree in different countries, knowing the taxes of coffee trees, inventory, the farm management plan, and increasing the yield. detection and classification of the coffee tree are also useful for building smart systems that classify many other important trees, especially those used in the field of medicine.

1.2.
literature review
computer vision has become very important for identifying agricultural cash crops, and it is an easy and inexpensive approach compared with traditional methods (chandra et al., 2020). several works have been introduced for detecting trees based on computer vision techniques. for example, zortea et al. presented a method for detecting citrus trees in high-density orchards from images captured by unmanned aerial vehicles (uavs). they used the cnn method depicted in fig. 1, which provided good results (zortea et al., 2018). fan et al. proposed a new algorithm based on deep neural networks, as shown in fig. 2, to detect tobacco plants in images captured by uavs. the results showed that the proposed algorithm performs well (fan et al., 2018). wu et al. presented a method for extracting apple tree crown information from remote imagery captured by uavs, using a faster r-cnn object detector to automatically detect and segment individual trees and measure the crown width, perimeter, and crown projection area of apple trees. the results were close to manual delineation; the technique can detect and count apple trees with an overall accuracy of 0.97 and estimate crown parameters with an overall accuracy exceeding 0.92 (wu et al., 2020). sun et al. presented an image dataset collected by mobile phone in natural scenes, which contains 10,000 images of 100 ornamental plant species on the beijing forestry university campus. a 26-layer deep learning model consisting of 8 residual building blocks was designed for large-scale plant classification in the natural environment. the proposed model achieves a recognition rate of 0.91 on the bjfu100 dataset, demonstrating that deep learning is a promising technology for smart forestry (sun et al., 2017).
santana et al. developed an algorithm for automatically counting coffee plants and determining the best age at which to monitor plants using remotely piloted aircraft (rpa) images; it achieved 96.8% accuracy with images without spectral treatment (santana et al., 2023). zheng et al. presented cropdeep, an agricultural species classification and detection dataset consisting of 31,147 images with over 49,000 annotated instances from 31 different classes. the images were collected with different cameras and equipment in greenhouses and captured in a wide variety of situations. the results show that current deep-learning-based methods perform well, with classification accuracy over 0.99 (zheng et al., 2019). diez et al. published a review of studies that use dl and rgb images gathered by uavs to solve practical forestry research problems. the review discussed three main forestry problems: (1) individual tree detection, (2) tree species classification, and (3) forest anomaly detection (forest fires and insect infestation). this review is useful for researchers who want to start working in this area (diez et al., 2021). gurumurthy et al. presented a method for semantic segmentation of mango trees in high resolution aerial imagery, and a novel method for individual crown detection of mango trees using the segmentation output. the results demonstrate the robustness of the proposed methods despite variations in factors such as scale, occlusion, lighting conditions, and surrounding vegetation (gurumurthy et al., 2019).

2.
method
in this chapter, we present a dataset of coffee trees, built because of the lack of publicly available datasets for detecting and classifying the coffee tree, and a system based on the cnn deep learning algorithm that detects and classifies the coffee tree among more than one type of tree, as shown in fig. 1. we briefly explain the main steps of the proposed method used to construct the dataset and the model based on the cnn technique (sk et al., 2021). the main phases are listed in table 1.

table 1. proposed system phases
dataset acquisition
images preprocessing
cnn classifier
experiments
testing of model

2.1. dataset acquisition
we constructed our own dataset from 417 coffee tree images and 493 images of other trees, due to the lack of publicly available datasets for detecting and classifying the coffee tree. the images used in this work were collected from different regions of the yemeni environment. images of coffee trees and other trees were collected from the city of taiz, from al-mawasit department, from several orchards in the region of bani hammad, from wadi balabil and alghayl, and from several valleys in the region of joreenat and al-zeila. we collected rgb images of coffee trees and other types of trees, following a similar standard to (rodríguez et al., 2020). the images were captured directly on site with a mid-range samsung j1 phone (3 megapixels resolution, focus of f/2.2, srgb color, jpg format, 1536 × 2560 and 1920 × 1920 pixels). the pictures were taken at different distances and in different lighting conditions. other images were obtained from google image search in different formats. we collected and stored them in a dataset called (coffee tree dataset). the images were combined in these ways in order to remove complexity for researchers and agricultural engineers, as shown in fig. 2.

fig. 1. proposed system.
fig. 2. coffee tree dataset.

2.2. images pre-processing
the collected images of coffee trees and other trees used in this study are of different sizes, as shown in fig 3. in the first step, we divided the data as follows: 60% for training, 20% for cross validation, and 20% for testing. when training the network, the input picture size is diversified so that the network generalizes better (zheng et al., 2019). all of these transformations are contained within ImageDataGenerator. this function takes an image as input and then applies a set of transformations such as increasing or decreasing brightness, flipping the image vertically or horizontally, rotating the image, and shifting pixels. to process the images, we used rotation_range=4, validation_split=0.20, rescale=1/255, width_shift_range=0.5, height_shift_range=0.5, shear_range=0.10, zoom_range=0.10, horizontal_flip=True, and fill_mode="nearest". in the second step, we converted all of the images to a size of 100 × 100 × 3 pixels (see fig 4).

fig. 3 images before preprocessing

fig. 4 images after preprocessing

2.3. cnn classifier
the deep convolutional neural network has become the dominant approach for image classification. year after year, various new architectures have been proposed.
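the 60/20/20 split and the augmentation settings described in section 2.2 can be sketched as follows. this is a minimal illustration under stated assumptions, not the author's code: the file names are hypothetical placeholders for the 417 coffee tree and 493 other images, `split_dataset` is a hypothetical helper, and the settings dictionary only records the values quoted above under the argument names of the keras ImageDataGenerator without importing or calling keras.

```python
import random

# Augmentation settings quoted in section 2.2, keyed by the argument names
# of Keras' ImageDataGenerator. This is a plain dictionary only; nothing in
# this sketch imports or calls Keras.
AUGMENTATION = {
    "rotation_range": 4,
    "validation_split": 0.20,
    "rescale": 1 / 255,
    "width_shift_range": 0.5,
    "height_shift_range": 0.5,
    "shear_range": 0.10,
    "zoom_range": 0.10,
    "horizontal_flip": True,
    "fill_mode": "nearest",
}

# Target size after preprocessing: 100 x 100 pixels, 3 colour channels.
TARGET_SHAPE = (100, 100, 3)


def split_dataset(items, train=0.6, val=0.2, seed=0):
    """Shuffle items and split them into train, validation, and test lists.

    The test list receives whatever remains after the first two parts.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


# HYPOTHETICAL file names standing in for the 417 coffee and 493 other images.
images = [f"coffee_{i}.jpg" for i in range(417)]
images += [f"other_{i}.jpg" for i in range(493)]

train_set, val_set, test_set = split_dataset(images)
print(len(train_set), len(val_set), len(test_set))
```

a simple random split like this does not keep the coffee and non-coffee proportions identical in each part; a stratified split would be one alternative if that matters.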
however, when choosing a deep architecture to solve a realistic problem, some aspects should be taken into consideration, such as the type and number of layers, because a higher number of parameters increases the complexity of the system and directly influences its memory use, computation, speed, and results. although each network has specific characteristics suited to its application, every deep-learning network has the same goal: to increase accuracy while reducing operation time and computational complexity. therefore, this study selected a modern deep learning architecture. a model is proposed to detect and classify the coffee tree among different trees, because grading trees by hand is inefficient and labor intensive (see fig. 4).

using the equation shown in fig. 5, we trained the model with the "adam" optimizer. training stopped at epoch 8. the model trained quickly and gave good results: it achieved an accuracy of 0.97 and an error rate of 0.04, although the images were complex.

fig. 5. equation for cnn classifier

3. testing of model

we used 20% of the coffee dataset for cross-validation as an initial stage, to evaluate the performance of the model and adjust some parameters after training. we then tested the model on a further 20% of the dataset, a sample that the model had not seen during training. the model achieved an accuracy of 0.75 and an error rate of 0.17, which we consider good given that the test data are few, new, and differ in characteristics from the training sample.

fig. 4. cnn classifier

after finishing, we found that the model achieved better results.
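the 60% / 20% / 20% division used for the training, cross-validation, and testing stages can be sketched as follows. this is a minimal illustration and not the authors' code: the function name split_dataset, the fixed seed, and the placeholder file names are assumptions, and only the image counts (417 coffee, 493 other) come from the text.

```python
import random


def split_dataset(items, train=0.6, val=0.2, seed=0):
    """Shuffle items and cut them into train / validation / test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed: an assumption, for repeatability
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains is the test set
    )


# Hypothetical file names; label 1 = coffee tree, 0 = other tree.
images = [("coffee_%03d.jpg" % i, 1) for i in range(417)]
images += [("other_%03d.jpg" % i, 0) for i in range(493)]

train_set, val_set, test_set = split_dataset(images)
print(len(train_set), len(val_set), len(test_set))  # 546 182 182
```

with 910 images in total, the three parts contain 546, 182, and 182 images. the shuffle keeps coffee and non-coffee images mixed in every part, although it does not guarantee identical class proportions.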
we also made predictions for two images: one coffee tree image and one unknown image. both images contain noise and are difficult for a human to classify by eye. we passed the two images to the model and found that it classified both correctly.

4. results

based on our experiments, we believe these results are satisfactory, considering that the input images were visually similar, that many of them contained noise, and that some images showed the coffee tree adjacent to other trees. despite these limitations, we achieved excellent results: an accuracy of 0.97, a precision of 0.98, a recall of 0.95, and a very low error rate of 0.08.

5. conclusions

in this study, we proposed a new and fast technique to detect and classify the coffee tree. the study provided a dataset of coffee trees and showed that a machine learning algorithm, the convolutional neural network (cnn), is able to detect and classify the coffee tree from images. convolutional layers can extract features at different levels of abstraction for classification. we obtained an average accuracy of 0.97 and an average error of 0.04. these results, achieved by the cnn algorithm, are very close to those of manual measurement and visual inspection. the proposed approach can be used in place of traditional methods for detecting and classifying the coffee tree as well as other trees.

references

al-zaidi, a. a., baig, m. b., shalaby, m. y., & hazber, a. (2016). level of knowledge and its application by coffee farmers in the udeen area, governorate of ibb – republic of yemen. the journal of animal and plant sciences, 26, 1797-1804.
amamo, a. a. (2014). coffee production and marketing in ethiopia.
eur j bus manag, 6(37), 109-22.
essien, e. r., atasie, v. n., okeafor, a. o., & nwude, d. o. (2020). biogenic synthesis of magnesium oxide nanoparticles using manihot esculenta (crantz) leaf extract. international nano letters, 10(1), 43-48.
chandra, a. l., desai, s. v., guo, w., & balasubramanian, v. n. (2020). computer vision with deep learning for plant phenotyping in agriculture: a survey. arxiv preprint arxiv:2006.11391.
clarence-smith, w. g., & topik, s. (eds.). (2003). the global coffee economy in africa, asia, and latin america, 1500–1989. cambridge university press.
diez, y., kentsch, s., fukuda, m., caceres, m. l. l., moritake, k., & cabezas, m. (2021). deep learning in forestry using uav-acquired rgb data: a practical review. remote sensing, 13(14), 2837.
fan, z., lu, j., gong, m., xie, h., & goodman, e. d. (2018). automatic tobacco plant detection in uav images via deep neural networks. ieee journal of selected topics in applied earth observations and remote sensing, 11(3), 876-887.
gurumurthy, v. a., kestur, r., & narasipura, o. (2019). mango tree net–a fully convolutional network for semantic segmentation and individual crown detection of mango trees. arxiv preprint arxiv:1907.06915.
incomes, s. i., & trade, e. (2005). moving yemen coffee forward.
karayigit, r., naderi, a., akca, f., cruz, c. j. g. d., sarshin, a., yasli, b. c., ... & kaviani, m. (2021). effects of different doses of caffeinated coffee on muscular endurance, cognitive performance, and cardiac autonomic modulation in caffeine naive female athletes. nutrients, 13(1), 2.
kolb, h., kempf, k., & martin, s. (2020). health effects of coffee: mechanism unraveled?. nutrients, 12(6), 1842.
nagpal, i. (2014). can milk, coffee and tea prevent dental caries?.
international journal, 1(4), 129.
nieber, k. (2017). the impact of coffee on health. planta medica, 83(16), 1256-1263.
priya, s. l., jagannathan, r., balaji, t. m., varadarajan, s., venkatakrishnan, c., rajendran, s., ... & devi, s. (2020). resveratrol and green coffee extract gel as anticaries agent. indian j. res. pharm. biotechnol, 8, 15-21.
rodríguez, j. p., corrales, d. c., aubertot, j. n., & corrales, j. c. (2020). a computer vision system for automatic cherry beans detection on coffee trees. pattern recognition letters, 136, 142-153.
santana, l. s., santos, g. h. r. d., bento, n. l., faria, r. d. o. (2023). identification and counting of coffee trees based on convolutional neural network applied to rgb images obtained by rpa. sustainability, 15(1), 820.
sk, rakibul, and ankita wadhawan. "identification of plants using deep learning: a." (2021).
sun, y., liu, y., wang, g., & zhang, h. (2017). deep learning for plant identification in natural environment. computational intelligence and neuroscience, 2017.
wu, j., yang, g., yang, h., zhu, y., li, z., lei, l., & zhao, c. (2020). extracting apple tree crown information from remote imagery using deep learning. computers and electronics in agriculture, 174, 105504.
zheng, y. y., kong, j. l., jin, x. b., wang, x. y., su, t. l., & zuo, m. (2019). cropdeep: the crop vision dataset for deep-learning-based classification and detection in precision agriculture. sensors, 19(5), 1058.
zortea, m., macedo, m. m., mattos, a. b., ruga, b. c., & gemignani, b. h. (2018, october).
automatic citrus tree detection from uav images based on convolutional neural networks. in 2018 31st sibgrapi conference on graphics, patterns and images (sibgrapi) (vol. 11).

gis-based urban village regional fire risk assessment and mapping

yonathan andri hermawan*, lia warlina*, masnizah mohd**
*departemen perencanaan wilayah dan kota, fakultas teknik dan ilmu komputer, universitas komputer indonesia, jl. dipati ukur 102-116, bandung, 40132, indonesia
**faculty of information science and technology, universiti kebangsaan malaysia, 43600 bangi, selangor, malaysia
*corresponding email: lia.warlina@email.unikom.ac.id

abstract

residential fires are one of the 13 disaster threats in indonesia. based on their causes, fires are classified as disasters caused by human negligence. this research aims to identify residential fire incidents, assess fire risk levels, and map the risk level. we used a geographic information system (gis) analysis approach and direct observation of the study area. the research location was the tamansari subdistrict in bandung city. the subdistrict of tamansari consists of 20 neighborhood units (rukun warga/rw) with 22,995 people and 6,598 households. we conducted a field survey from december 2019 to march 2020. we used a spatial approach to analyze fire risk in this residential area, using gis to map urban-village regional fire incidents and assess the risk level. there were four fire hazard variables: population density, building density, building quality, and road class. the vulnerability variables are based on the community's social parameters: population density, percentage of elderly people and children under five, people with disabilities, and the population's sex ratio.
the overlay of the hazard and vulnerability maps showed three neighborhood units (rukun warga/rw) with a high risk of fire, eight rws with a moderate risk of residential fires, and nine rws with a low risk of residential fires. the areas in the low-risk category must remain vigilant because the roads in these areas are relatively narrow.

article history: received 16 nov 2021, revised 20 nov 2021, accepted 25 nov 2021, available online 26 dec 2021
keyword: geographic information system (gis), urban village, fire, risk assessment

international journal of informatics, information system and computer engineering 2 (2) (2021) 31-43
journal homepage: http://ejournal.upi.edu/index.php/ijost/

1. introduction

fire is a disaster that, based on its causes, is classified as either a natural disaster or a non-natural disaster caused by human negligence (man-made disaster). natural causes of fire include lightning, earthquakes, volcanic eruptions, and drought, while human causes include gas leaks, electrical short circuits, and poor construction safety systems (granda & ferreira, 2019; kumar, jana, & ramamritham, 2020; zhang, yao, & sila-nowicka, 2018). a fire in an area causes economic loss; therefore, fire disaster management is needed. research on urban fire disaster management has been carried out in many countries, such as china, india, and iran, with results showing that densely populated urban areas are at risk of fire (chan et al., 2018; kumar, ramamritham, & jana, 2019; navitas, 2014; waheed, 2014; zhang, yao, sila-nowicka, & jin, 2020). there are many methods for analyzing fire risk in urban areas. fire risk analysis for residential buildings in china uses scenario clusters and their application (xin & huang, 2013).
research in iran uses a fusion method that combines spatial information produced with unmanned aerial vehicles (uavs) and attribute data surveyed from 150 high-rise buildings (masoumi, van genderen, & maleki, 2019). meanwhile, research in malmö, sweden, on identifying the distribution of fires has used social stress as one of the variables (guldåker & hallin, 2014).

fire is one of the 13 disaster threats in indonesia (badan nasional penanggulangan bencana, 2012). fig 1 shows the data on fire incidents in bandung city during 2019 (dinas kebakaran dan penanggulangan bencana kota bandung, 2020). fires in large cities such as bandung tend to be caused by human factors. the irregular shape and layout of houses also contribute to the frequent occurrence of fires in bandung. the total population of bandung city in 2020 reached 2,444,160 people, with a population density of 14.61 thousand people per km2 (bps-statistics of bandung municipality, 2021).

fig 1. fire incidents in bandung city in 2019 (dinas kebakaran dan penanggulangan bencana, 2020). data labels in the chart: 10, 12, 14, 16, 13, 26, 16, 43, 39, 51, 19, 13.

this research focuses on one of the urban villages in bandung, a dense housing area located in kelurahan tamansari (tamansari sub-district). dense settlements that have emerged in the city of bandung without good supervision have caused the spatial pattern of residential areas to become irregular and very difficult to control. the housing density increases the potential for fire because of a poor water pipe system, very narrow roads, and low-quality building materials. this research aims to identify residential fire incidents, assess fire risk levels, and map the risk level.

2. method

the research was located in tamansari sub-district. the study area consists of 20 neighborhood units (rukun warga/rw) with 22,995 people and 6,598 households.
we conducted the research project from december 2019 to march 2020, using geographic information system (gis) analysis and direct observation of the study area. both primary and secondary data were collected. we used a spatial approach to analyze fire risk in this residential area, applying gis to map urban-village regional fire incidents and assess the risk level. the two main variables used in this study to calculate fire risk are hazard and vulnerability. the level of disaster risk is calculated with the formula r = h × v / c (where r = risk; h = hazard; v = vulnerability; c = capacity) (badan nasional penanggulangan bencana, 2012). capacity is the ability of regions and communities to reduce threats and losses due to disasters. this research does not include capacity in the spatial analysis. there were four fire hazard variables: population density, building density, building quality, and road class (table 1). the vulnerability variables are based on the community's social parameters: population density, percentage of elderly people and toddlers, people with disabilities, and the population's sex ratio (table 2).

table 1. fire hazard variables (badan nasional penanggulangan bencana, 2012)
variables | level of hazard | weight
population density | < 150 persons/ha (low) | 1
population density | 150–200 persons/ha (moderate) | 2
population density | > 200 persons/ha (high) | 3
building density | < 40 units/ha (low) | 1
building density | 40–80 units/ha (moderate) | 2
building density | > 80 units/ha (high) | 3
building quality | < 5% (low) | 3
building quality | 5–15% (moderate) | 2
building quality | > 15% (high) | 1
road density | > 105 m/ha (high) | 3
road density | 75–105 m/ha (moderate) | 2
road density | < 75 m/ha (low) | 1

table 2.
fire vulnerability variables
variables | level of vulnerability | weight
population density | < 150 persons/ha (low) | 1
population density | 150–200 persons/ha (moderate) | 2
population density | > 200 persons/ha (high) | 3
number of elderly and toddlers | < 20% (low) | 1
number of elderly and toddlers | 20–40% (moderate) | 2
number of elderly and toddlers | > 40% (high) | 3
number of disabled residents | < 20% (low) | 1
number of disabled residents | 20–40% (moderate) | 2
number of disabled residents | > 40% (high) | 3
population sex ratio | 92.38–98.88 (low) | 1
population sex ratio | 98.89–105.39 (moderate) | 2
population sex ratio | 105.4–111.9 (high) | 3

3. results and discussion

3.1. distribution of fire locations in tamansari sub-district

because of urbanization, tamansari sub-district has a high population density of up to 228 people/hectare. the need for land is very high while its availability is insufficient, which produces dense settlements in urban areas. these settlements are prone to fire because of poor building quality and unstable building materials. based on data from the bandung city fire and disaster management service, there were fire incidents in the tamansari sub-district in 2015 – 2018. the fires in the tamansari subdistrict in 2015-2020 occurred at several points, namely in rws 1, 4, 9, 10, 11, 15, and 20 (fig 2). fig 3 shows one of the fire incidents in tamansari subdistrict.

fig 2. fire incidents in taman sari sub-district in 2015-2020
fig 3. fire incident in tamansari subdistrict

3.2. fire hazard level

the level of fire hazard was obtained from four fire hazard factors using the overlay method. table 3 shows that the total weight of the parameters is 11 (the maximum is 12). table 4 shows five rws with a low level, four rws with a moderate level, and 11 rws with a high level of fire hazard. the residential fire hazard map shows that tamansari village is dominated by a high fire hazard (fig 4). in contrast, the area with a low level of hazard is the trade and service area.
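the way the four hazard weights add up to the total of 11 can be sketched as below. the thresholds are those of table 1; the function names and the example values for building density, building quality, and road density are hypothetical, chosen only so that they fall in the classes reported for tamansari (the 228 persons/ha figure is the only measurement quoted in the text).

```python
def weight(value, low_limit, high_limit, inverted=False):
    """Map a measurement to a weight of 1, 2 or 3 using two class limits.

    Values below low_limit get 1, values up to high_limit get 2, larger
    values get 3. With inverted=True the scale is flipped (3, 2, 1), as
    Table 1 does for building quality.
    """
    if value < low_limit:
        w = 1
    elif value <= high_limit:
        w = 2
    else:
        w = 3
    return 4 - w if inverted else w


def hazard_score(pop_density, building_density, building_quality_pct, road_density):
    """Sum of the four Table 1 weights (minimum 4, maximum 12)."""
    return (
        weight(pop_density, 150, 200)                        # persons/ha
        + weight(building_density, 40, 80)                   # units/ha
        + weight(building_quality_pct, 5, 15, inverted=True) # percent
        + weight(road_density, 75, 105)                      # m/ha
    )


# 228 persons/ha is quoted in the text; 90, 10 and 110 are hypothetical.
total = hazard_score(228, 90, 10, 110)
print(total)  # 11
```

the inverted flag keeps a single helper for all four variables instead of a separate lookup for building quality.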
the physical condition of the area is an essential factor in the fire hazard in urban village settlements, as studied in bandung (permana, susanti, & wijaya, 2019) and surabaya (navitas, 2014). figs 5 and 6 show the physical condition of the urban village in tamansari subdistrict.

table 3. scores of fire hazard in tamansari subdistrict
parameter | fire hazard | weight
population density | > 200 persons per hectare | 3
building density | > 80 units per hectare | 3
building quality | 5–15% | 2
road density | > 105 meters per hectare | 3
total | | 11

table 4. level and areas of fire hazard in tamansari subdistrict
neighborhood units (rukun warga/rw) | area (hectares) | level of hazard
1 | 7.7 | low
2 | 6.7 | low
3 | 17.2 | low
4 | 1.9 | high
5 | 1.7 | high
6 | 1.8 | high
7 | 4.8 | moderate
8 | 3.4 | low
9 | 2.3 | moderate
10 | 5.2 | low
11 | 5.3 | high
12 | 4.5 | high
13 | 2.6 | high
14 | 1.7 | high
15 | 7.3 | high
16 | 3.8 | high
17 | 7.9 | moderate
18 | 7.4 | high
19 | 4.1 | moderate
20 | 4.6 | high

fig 4. map of fire hazard in tamansari subdistrict
fig 5. dense settlement in urban village (tamansari subdistrict)
fig 6. an example of low-quality building materials in tamansari subdistrict

3.3. fire vulnerability map

vulnerability is a community or social condition that leads to or causes an inability to face the threat of disaster (trainor et al., 2009). the vulnerability parameter used in this study is social vulnerability. it is described through the condition of the population, which includes the sex ratio, the disabled population, the dependent age groups of the elderly and infants, and population density, all of which can place residents in a vulnerable condition (sutanti, tjahjono, & syaufina, 2020; y. zhang, 2013).
table 5 shows the results of the combined analysis of the four vulnerability factors above, which are further classified into three classes: low, medium, and high. the level of fire vulnerability in the tamansari subdistrict has a score of 8, which falls into the moderate residential fire vulnerability category. table 6 shows the level and area of fire vulnerability in the tamansari subdistrict. there are nine rws with a low level, seven rws with a moderate level, and four rws with a high level of fire vulnerability. fig 7 shows the fire vulnerability of the urban village (tamansari subdistrict), which is caused by the social conditions of the people in the residential area. based on social conditions, the vulnerability level is dominated by rws with a low level.

table 5. scores of fire vulnerability in tamansari subdistrict
parameters | fire vulnerability | scores
population density | > 200 persons/hectare (high) | 3
number of elderly and toddlers (%) | 20–40% (low) | 1
number of disabled residents (%) | < 20% (low) | 1
sex ratio (%) | 92.38–98.88 (low) | 3
total | | 8

table 6. level and areas of fire vulnerability in tamansari subdistrict
community units (rukun warga/rw) | area (hectares) | level of vulnerability
1 | 7.7 | low
2 | 6.7 | low
3 | 17.2 | low
4 | 1.9 | moderate
5 | 1.7 | moderate
6 | 1.8 | low
7 | 4.8 | low
8 | 3.4 | low
9 | 2.3 | high
10 | 5.2 | low
11 | 5.3 | moderate
12 | 4.5 | moderate
13 | 2.6 | high
14 | 1.7 | low
15 | 7.3 | high
16 | 3.8 | high
17 | 7.9 | moderate
18 | 7.4 | moderate
19 | 4.1 | low
20 | 4.6 | moderate

fig 7. map of fire vulnerability of tamansari subdistrict

3.4. fire risk

table 7 and fig 8 describe the fire risk obtained by multiplying hazard and vulnerability, showing three rws with high risk, eight rws with medium risk, and nine rws with low risk. the tamansari subdistrict is therefore dominated by low fire risk.
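the multiplication of hazard and vulnerability can be reproduced from the per-rw scores reported in table 7. in the sketch below the scores are copied from that table, while the class boundaries (a product of 3 or less is low, 4 to 6 is moderate, above 6 is high) are an assumption inferred from the labels in the table and are not a rule stated in the text. capacity (c) is left out, as in the study, so r = h × v.

```python
from collections import Counter

# (hazard score H, vulnerability score V) for each RW, copied from Table 7.
SCORES = {
    1: (1, 1), 2: (1, 1), 3: (1, 1), 4: (3, 2), 5: (3, 2),
    6: (2, 1), 7: (2, 1), 8: (2, 1), 9: (2, 3), 10: (1, 1),
    11: (3, 2), 12: (3, 2), 13: (3, 3), 14: (3, 1), 15: (3, 3),
    16: (3, 3), 17: (2, 2), 18: (3, 2), 19: (2, 1), 20: (3, 2),
}


def risk_level(h, v):
    """Classify R = H x V. The cut-offs are inferred from Table 7 (assumption)."""
    r = h * v
    if r <= 3:
        return "low"
    if r <= 6:
        return "moderate"
    return "high"


levels = {rw: risk_level(h, v) for rw, (h, v) in SCORES.items()}
counts = Counter(levels.values())
print(counts["high"], counts["moderate"], counts["low"])  # 3 8 9
```

the counts agree with the three high-risk, eight moderate-risk, and nine low-risk rws reported for the subdistrict.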
however, a low-risk classification does not mean there is no potential for fire. tamansari subdistrict has narrow roads that make it challenging to handle fires in residential areas, so the actual danger can be higher even where the calculated risk is low. research on fire mitigation scenarios in dense settlements in sukahaji, bandung, showed that capacity optimization as a mitigation measure could be the main alternative in handling fire hazards in areas with medium-high population density. in addition, the early warning system is a crucial factor in mitigation efforts (sagala, adhitama, sianturi, & al faruq, 2016). to increase capacity in urban village housing, one proposed method uses existing resources for emergency response, including mosque loudspeakers, fire extinguishers, and prepared evacuation routes (pamungkas, rahmawati, larasati, rahadyan, & dito, 2017).

table 7. scores of fire risk of tamansari subdistrict
neighborhood unit (rw) | score of hazard (h) | score of vulnerability (v) | risk (h × v) | level of risk
1 | 1 | 1 | 1 | low
2 | 1 | 1 | 1 | low
3 | 1 | 1 | 1 | low
4 | 3 | 2 | 6 | moderate
5 | 3 | 2 | 6 | moderate
6 | 2 | 1 | 2 | low
7 | 2 | 1 | 2 | low
8 | 2 | 1 | 2 | low
9 | 2 | 3 | 6 | moderate
10 | 1 | 1 | 1 | low
11 | 3 | 2 | 6 | moderate
12 | 3 | 2 | 6 | moderate
13 | 3 | 3 | 9 | high
14 | 3 | 1 | 3 | low
15 | 3 | 3 | 9 | high
16 | 3 | 3 | 9 | high
17 | 2 | 2 | 4 | moderate
18 | 3 | 2 | 6 | moderate
19 | 2 | 1 | 2 | low
20 | 3 | 2 | 6 | moderate

fig 8. map of fire risk of tamansari subdistrict

4. conclusion

tamansari subdistrict is an urban village with dense settlements, where fires occur often. dense buildings and low-quality building materials caused these incidents. based on the fire hazard and vulnerability analysis, tamansari subdistrict has a very high fire hazard level, with a score of 11 out of 12. the level of vulnerability based on social aspects shows that kelurahan tamansari has a low vulnerability value of 8.
the fire risk map of the tamansari subdistrict is dominated by areas with a low risk of fire. the fire risk map shows three rws with a high level, eight rws with a moderate level, and nine rws with a low level of fire risk.

5. authors' note

the authors declare that there is no conflict of interest regarding the publication of this article. the authors confirmed that the paper was free of plagiarism.

6. acknowledgment

the authors would like to thank kelurahan tamansari for the data and information that were very useful for this research.

7. references

badan nasional penanggulangan bencana. (2012). peraturan kepala badan nasional penanggulangan bencana nomor 02 tahun 2012 tentang pedoman umum pengkajian risiko bencana. badan nasional penanggulangan bencana.
bps-statistics of bandung municipality. (2021). bandung municipality in figures 2021.
chan, e. y. y., lam, h. c. y., chung, p. p. w., huang, z., yung, t. k. c., ling, k. w. k., … chiu, c. p. (2018). risk perception and knowledge in fire risk reduction in a dong minority rural village in china: a health-edrm education intervention study. international journal of disaster risk science, 9(3), 306–318. doi: 10.1007/s13753-018-0181-x
dinas kebakaran dan penanggulangan bencana. (2020). data rekap kabakaran kota bandung [fire incident number]. bandung. retrieved from data.bandung.go.id/dataset/rekapitulasi-kejadian-kebakaran-di-kotabandung
granda, s., & ferreira, t. m. (2019). assessing vulnerability and fire risk in old urban areas: application to the historical centre of guimarães. fire technology, 55(1), 105–127. doi: 10.1007/s10694-018-0778-z
guldåker, n., & hallin, p.-o. (2014). spatio-temporal patterns of intentional fires, social stress and socio-economic determinants: a case study of malmö, sweden. fire safety journal, 70, 71–80. doi: 10.1016/j.firesaf.2014.08.015
kumar, v., jana, a., & ramamritham, k. (2020).
a decision framework to assess urban fire vulnerability in cities of developing nations: empirical evidence from mumbai. geocarto international, 1–17. doi: 10.1080/10106049.2020.1723718
kumar, v., ramamritham, k., & jana, a. (2019). resource allocation for handling emergencies considering dynamic variations and urban spaces: fire fighting in mumbai. proceedings of the tenth international conference on information and communication technologies and development, 1–11. ahmedabad india: acm. doi: 10.1145/3287098.3287099
masoumi, z., van genderen, j. l., & maleki, j. (2019). fire risk assessment in dense urban areas using information fusion techniques. isprs international journal of geo-information, 8(12), 579. doi: 10.3390/ijgi8120579
navitas, p. (2014). improving resilience against urban fire hazards through environmental design in dense urban areas in surabaya, indonesia. procedia social and behavioral sciences, 135, 178–183. doi: 10.1016/j.sbspro.2014.07.344
pamungkas, a., rahmawati, d., larasati, k. d., rahadyan, g. a., & dito, a. h. (2017). making a low risk kampong to urban fire. asian journal of applied sciences, 5(2), 367–375. doi: 10.24203/ajas.v5i2.4615
permana, a. y., susanti, i., & wijaya, k. (2019). kerentanan bahaya kebakaran di kawasan kampung kota. kasus: kawasan balubur tamansari kota bandung. jurnal arsitektur zonasi, 2(1), 32. doi: 10.17509/jaz.v2i1.15208
sagala, s. a. h., adhitama, p., sianturi, d. g., & al faruq, u. (2016). mitigation scenarios for residential fires in densely populated urban settlements in sukahaji village, bandung city. geoplanning: journal of geomatics and planning, 3(2), 147–160. doi: 10.14710/geoplanning.3.2.147-160
sutanti, n., tjahjono, b., & syaufina, l. (2020). analisis risiko bencana kebakaran di kecamatan tambora kota administrasi jakarta barat. tataloka, 22(2), 162–174. doi: 10.14710/tataloka.22.2.162-174
trainor, s.
f., calef, m., natcher, d., stuart chapin iii, f., mcguire, a. d., huntington, o., lovecraft, a. l. (2009). vulnerability and adaptation to climate-related fire impacts in rural and urban interior alaska. polar research, 28(1), 100–118. doi: 10.1111/j.1751-8369.2009.00101.x
waheed, m. a. a. (2014). approach to fire-related disaster management in high density urban-area. procedia engineering, 77, 61–69. doi: 10.1016/j.proeng.2014.07.007
xin, j., & huang, c. (2013). fire risk analysis of residential buildings based on scenario clusters and its application in fire risk management. fire safety journal, 62, 72–78. doi: 10.1016/j.firesaf.2013.09.022
zhang, x., yao, j., & sila-nowicka, k. (2018). exploring spatiotemporal dynamics of urban fires: a case of nanjing, china. isprs international journal of geo-information, 7(1), 7. doi: 10.3390/ijgi7010007
zhang, x., yao, j., sila-nowicka, k., & jin, y. (2020). urban fire dynamics and its association with urban growth: evidence from nanjing, china. isprs international journal of geo-information, 9(4), 218. doi: 10.3390/ijgi9040218
zhang, y. (2013). analysis on comprehensive risk assessment for urban fire: the case of haikou city. procedia engineering, 52, 618–623. doi: 10.1016/j.proeng.2013.02.195

international journal of informatics information system and computer engineering 3(2) (2022) 41-49

new modern approach to predict users' sentiment using cnn and blstm

r. sathish kumar*
*department of computer science and engineering, manakula vinayagar institute of technology, kalitheerthalkuppam, puducherry, india.
*corresponding email: sathishmail8@gmail.com

abstract

in today's world, social networks play a vital role and provide relevant information on user opinion. this paper presents an emotional health monitoring system that detects stress and the user's mood. depending on the results, the system sends happy, calm, relaxing, or motivational messages to users with a psychological disturbance.
it also sends warning messages to authorized persons when the monitoring system detects a depressive disturbance. sentences are classified with a convolutional neural network (cnn) and a bidirectional long short-term memory network (blstm). this method reaches an accuracy of 0.80 in detecting depressed and stressed users, and the system consumes little memory, processing, and energy. as future work, sarcastic sentences can be included in the dataset so that the proposed algorithm can also predict sarcastic data.

article history: received 18 dec 2022, revised 20 dec 2022, accepted 25 dec 2022, available online 26 dec 2022
keywords: sentimental analysis, recommendation system, deep learning, cnn, blstm, social networks.

1. introduction

nowadays, the number of active social network users has grown drastically. this high number of users is mainly due to the increase in the number of mobile devices, such as smartphones and tablets. online social networks (osn) have become a universal means of expressing opinions and feelings, and they reflect the bad habits or wellness practices of each user. in recent years, the analysis of messages posted on social networks has been used by many applications in healthcare informatics. at first, social media existed to help end users connect digitally with friends, colleagues, family members, and
as social media companies grew their user bases into the hundreds of millions, the business applications of facebook, twitter, and other social platforms began to take shape. social media companies had access to some of the richest trackable user data ever conceived. another downside of many of the internet’s segmented communities is that users tend to be exposed only to information they are interested in and opinions they agree with. this lack of exposure to novel ideas and contrary opinions can create or reinforce a lack of understanding among people with different beliefs, and make political and social compromise more difficult to come by. 2. literature survey 2.1. the use of social media for communication social media takes on many different forms including magazines, internet forums, weblogs, social blogs, micro blogging, wikis, podcasts, photographs or pictures, video, rating and social bookmarking (glavan et al., 2016; who, 2016; zhang et al., 2018). with the world in the midst of a social media revolution, it is more than obvious that social media like face book, twitter, orkut, myspace, skype etc., are used extensively for the purpose of communication. 2.2. music recommendation system based on user’s sentiments extracted from social networks. this paper presents a music recommendation system based on a sentiment intensity metric, named enhanced sentiment metric (esm) that is the association of a lexicon-based sentiment metric with a correction factor based on the user's profile (rosa et al., 2018; al-qurishi et al., 2018; sathish kumar & pariselvam, 2012). this correction factor is discovered by means of subjective tests, conducted in a laboratory environment. based on the experimental results. 2.3. hunting suicide notes in web2.0 preliminary findings: y. p. huang, t. goh, and c. l. liew. 
this paper explores the techniques used by other researchers to identify emotional content in unstructured data, and makes use of existing technologies to attempt to identify at-risk bloggers (huang et al., 2007; sathish et al., 2016; sathish et al., 2016; sathish et al., 2017). using a selection of real blog entries harvested from myspace.com, supplemented with artificial entries from our research, we test the accuracy of a simple algorithm for scoring the presence of certain key words and phrases in blog entries. despite the simplistic approach taken, the preliminary results of this study were very promising.
2.4. detecting stress based on social interactions in social networks:
in this paper, we find that a user's stress state is closely related to that of his/her friends in social media, and we employ a large-scale dataset from real-world social platforms to systematically study the correlation of users' stress states and social interactions (lin et al., 2017; sathish et al., 2019; thapliyal et al., 2017; berbano et al., 2017; sathish et al., 2019). we first define a set of stress-related textual, visual, and social attributes from various aspects, and then propose a novel hybrid model, a factor graph model combined with a convolutional neural network, to leverage tweet content and social interaction information for stress detection.
2.5. deep learning based document modeling for personality detection from text:
the authors train a separate binary classifier for each trait, with identical architecture, based on a novel document modeling technique. namely, the classifier is implemented as a specially designed deep convolutional neural network, with injection of the document-level mairesse features, extracted directly from the text, into an inner layer (majumder et al., 2017; xue et al, 2014; tsugawa et al., 2015; sathish et al., 2020; rodrigues et al., 2016).
the first layers of the network treat each sentence of the text separately; then the sentences are aggregated into the document vector. filtering out emotionally neutral input sentences improved the performance. this method outperformed the state of the art for all five traits, and the implementation is freely available for research purposes.
3. existing system
machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. machine learning (ml) is one of the most exciting technologies one could come across. as the name makes clear, it gives the computer the ability to learn, which makes it more similar to humans. machine learning is actively used today, perhaps in more places than one would expect (ma & hovy, 2016; lample et al., 2016; khodayar et al., 2017; guimaraes et al., 2017; araque et al., 2017).
3.1. types of machine learning:
machine learning implementations are classified into four major categories, depending on the nature of the learning “signal” or “response” available to a learning system.
1. supervised learning 2. unsupervised learning 3. reinforcement learning 4. semi-supervised learning
1. supervised learning. an algorithm learns from example data and associated target responses, which can consist of numeric values or string labels such as classes or tags, in order to later predict the correct response when presented with new examples.
2. unsupervised learning. an algorithm learns from plain examples without any associated response, leaving the algorithm to determine the data patterns on its own. this type of algorithm tends to restructure the data into something else, such as new features that may represent a class or a new series of uncorrelated values.
3. reinforcement learning. the algorithm is given examples that lack labels, as in unsupervised learning. however, each example can be accompanied by positive or negative feedback according to the solution the algorithm proposes. reinforcement learning is connected to applications in which the algorithm must make decisions (so the product is prescriptive).
4. semi-supervised learning. an incomplete training signal is given: a training set with some (often many) of the target outputs missing. a special case of this principle, called transduction, is where the entire set of problem instances is known at learning time, except that part of the targets are missing (fig. 1).
fig. 1. flow diagram – ml working
a common example of an application of semi-supervised learning is a text document classifier. this is the type of situation where semi-supervised learning is ideal, because it would be nearly impossible to find a large amount of labeled text documents: it is not time efficient to have a person read through entire text documents just to assign each a simple classification. semi-supervised learning therefore allows the algorithm to learn from a small amount of labeled text documents while still classifying a large amount of unlabeled text documents in the training data (fig. 2).
fig. 2. flow diagram – data splitting
3.2. limitations of existing work:
• the existing system shows an accuracy of only 60%.
• it uses only the random forest algorithm to process the data.
• it processes the results slowly, so the results appear with a delay.
• its user interface is not friendly.
• it shows errors in some results.
• it does not process all the data; it leaves some of the data out.
5. proposed work
5.1.
algorithms used
1) random forest
random forest is a machine learning algorithm that belongs to the supervised learning family. it can be used for both classification and regression problems in ml. it is based on the concept of ensemble learning. random forest is one of the best high-performance methods and is widely applied in numerous industries due to its effectiveness. it can handle data very effectively, whether it is binary, continuous, or categorical. random forest is difficult to beat in terms of performance. of course, one can always find a model that performs better, for example a neural network. still, such models take longer to construct, whereas random forest can handle a wide range of data types, including binary, categorical, and numerical. one of the finest aspects of random forest is that it can accommodate missing values, making it an excellent solution for anyone who wants to create a model quickly and efficiently (fig. 3).
fig. 3. architecture diagram – diagram random forest
2) blstm algorithm:
blstm is an extension of the traditional lstm. it can improve model performance on sequence classification problems. in problems where all time steps of the input sequence are available, blstm trains two lstms instead of one on the input sequence. at its core, an lstm preserves information from inputs that have already passed through it using the hidden state. a unidirectional lstm only preserves information from the past, because the only inputs it has seen are from the past. a bidirectional lstm runs the inputs in two directions, one from past to future and one from future to past. what distinguishes this approach from the unidirectional one is that the lstm that runs backwards preserves information from the future, so by combining the two hidden states the model can, at any point in time, use information from both the past and the future.
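the bidirectional idea described above can be illustrated with a minimal python sketch. it is not the model used in this work: a toy scalar tanh recurrent cell stands in for the lstm cell (it has no gates and no cell state), and the names `run_rnn` and `run_bidirectional` as well as the weights `w_x` and `w_h` are hypothetical choices made only for illustration. the sketch runs one pass from past to future and one from future to past, then pairs the two hidden states at every time step.

```python
import math


def run_rnn(inputs, w_x=0.5, w_h=0.8):
    """Run a toy scalar recurrent cell (assumption: a stand-in for an LSTM cell).

    Returns the hidden state after each time step.
    """
    h = 0.0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


def run_bidirectional(inputs, w_x=0.5, w_h=0.8):
    """Pair a past-to-future pass with a future-to-past pass.

    The backward pass reads the reversed sequence; its states are reversed
    again so that index t in both lists refers to the same time step.
    """
    forward = run_rnn(inputs, w_x, w_h)
    backward = run_rnn(inputs[::-1], w_x, w_h)[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    sequence = [1.0, 0.0, 0.0, -1.0]
    for t, (past_state, future_state) in enumerate(run_bidirectional(sequence)):
        print(f"t={t} forward={past_state:+.4f} backward={future_state:+.4f}")
```

in this sketch the forward state at the first step depends only on the first input, while the backward state at the same step already summarizes every later input. this is the property the text attributes to the bidirectional model.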
3) convolutional neural network (cnn):
a convolution is the simple application of a filter to an input that results in an activation, indicating the location and strength of a detected feature in the input, such as an image. the innovation of the cnn is the ability to automatically learn a large number of filters in parallel, specific to a training dataset, under the constraints of a particular predictive modelling problem such as image classification. the experimental analysis shows that the model gives an accuracy of 84%.
6. results and discussion
6.1. upload osn dataset
using this module, we upload the dataset to the application. user profile and user data: a database built from the data captured from osns. messages: there is a database with 360 messages, 90 messages for each kind (relaxing, motivational, happy, or calm messages), to be suggested to the user by the recommendation engine. users can choose in advance one or two kinds of messages to receive when they undergo a period of stress or depression. the messages were written by 3 specialists in psychology and validated by 3 other specialists.
6.2. generate train & test model from osn dataset:
using this module, we read all messages from the dataset and build a train and test model by extracting features from the dataset. depression or stress detection by machine learning: the sentences are extracted from the osn and filtered by machine learning to detect depression or stress conditions. this is implemented in the emotional health monitoring system.
6.3. build cnn blstm-rnn model using softmax:
using this module, we build a deep learning blstm model on the dataset and then calculate the blstm prediction accuracy using the test data. cnns have several different filters/kernels consisting of trainable parameters, which can convolve on a given image spatially to detect features like edges and shapes.
hence, they can boil a given image down into a highly abstracted representation that is easy to use for prediction.
6.4. upload test message & predict sentiment & stress:
using this module, we upload test messages and the application then detects stress by applying the blstm model to the test data. the proposed system uses users’ personal information and context information. however, users do not always post this information. when users do not post personal information, standard information is used, such as a sleep routine of 8 hours, no unhealthy habits, and no preferences about work or study. it is important to note that in our tests only 5% of the users did not post this information. a traditional recommendation system (rs) is also implemented, in which only the words searched by a person on the osn are used to feed the system, forming a content-based rs. for the sake of simplicity, the traditional content-based rs is not explained in this section (figs. 4-10).
fig. 4. uploading the dataset file, which contains the messages.
fig. 5. the records used to test the prediction performance.
fig. 6. blstm model generated; the accuracy is shown as 83.94%.
fig. 7. the iterations used to generate the prediction layers.
fig. 8. the random forest prediction accuracy is 36%, which is lower than the proposed blstm accuracy.
fig. 9. each message processed by the application is marked with a stress or non-stress status.
fig. 10. accuracy of both algorithms.
7. conclusion
various deep-learning techniques can be used for sentiment analysis and recommendation. the challenge is to develop accurate and computationally efficient medical data classifiers.
in this paper, the model contains the emotional health monitoring system, which uses the deep learning model and the sentiment metric named esm2. sentences are extracted from an osn, and the emotional health monitoring system then identifies which sentences present stress or depression content, using machine learning algorithms and the emotion of the sentence content.
references
al-qurishi, m., hossain, m. s., alrubaian, m., rahman, s. m. m., & alamri, a. (2017). leveraging analysis of user behavior to identify malicious activities in large-scale social networks. ieee transactions on industrial informatics, 14(2), 799-813.
araque, o., corcuera-platas, i., sánchez-rada, j. f., & iglesias, c. a. (2017). enhancing deep learning sentiment analysis with ensemble techniques in social applications. expert systems with applications, 77, 236-246.
berbano, a. e. u., pengson, h. n. v., razon, c. g. v., tungcul, k. c. g., & prado, s. v. (2017, september). classification of stress into emotional, mental, physical and no stress using electroencephalogram signal analysis. in 2017 ieee international conference on signal and image processing applications (icsipa) (pp. 11-14). ieee.
glavan, i. r., mirica, a., & firtescu, b. n. (2016). the use of social media for communication in official statistics at european level. romanian statistical review, 64(4), 37-48.
guimaraes, r. g., rosa, r. l., de gaetano, d., rodriguez, d. z., & bressan, g. (2017). age groups classification in social network using deep learning. ieee access, 5, 10805-10816.
huang, y. p., goh, t., & liew, c. l. (2007, december). hunting suicide notes in web 2.0-preliminary findings. in ninth ieee international symposium on multimedia workshops (ismw 2007) (pp. 517-521). ieee.
khodayar, m., kaynak, o., & khodayar, m. e. (2017). rough deep neural architecture for short-term wind speed forecasting. ieee transactions on industrial informatics, 13(6), 2770-2779.
kumar, r. s., & pariselvam, s. (2012). formative impact of gauss markov mobility model on data availability in manet. asian journal of information technology, 11(3), 108-116.
kumar, r. s., dhinesh, t., & kathirresh, v. consensus based algorithm to detecting malicious nodes in mobile adhoc network. international journal of engineering research & technology (ijert) vol, 6.
kumar, r. s., koperundevi, s., & suganthi, s. (2016). enhanced trust based architecture in manet using aodv protocol to eliminate packet dropping attacks. international journal of engineering trends and technology, 34, 21-27.
kumar, r., logeswari, r., devi, n., & bharathy, s. (2017). efficient clustering using ecatch algorithm to extend network lifetime in wireless sensor networks. int. j. eng. trends technol, 45, 476-481.
lample, g., ballesteros, m., subramanian, s., kawakami, k., & dyer, c. (2016). neural architectures for named entity recognition. arxiv preprint arxiv:1603.01360.
lin, h., jia, j., qiu, j., zhang, y., shen, g., xie, l., ... & chua, t. s. (2017). detecting stress based on social interactions in social networks. ieee transactions on knowledge and data engineering, 29(9), 1820-1833.
ma, x., & hovy, e. (2016). end-to-end sequence labeling via bi-directional lstm-cnns-crf. arxiv preprint arxiv:1603.01354.
majumder, n., poria, s., gelbukh, a., & cambria, e. (2017). deep learning-based document modeling for personality detection from text. ieee intelligent systems, 32(2), 74-79.
rodrigues, r. g., das dores, r. m., camilo-junior, c. g., & rosa, t. c. (2016). sentihealth-cancer: a sentiment analysis tool to help detecting mood of patients in online social networks. international journal of medical informatics, 85(1), 80-95.
rosa, r. l., rodriguez, d. z., & bressan, g. (2015). music recommendation system based on user's sentiments extracted from social networks. ieee transactions on consumer electronics, 61(3), 359-367.
sathish kumar r, abdulla m.g. (2019). “head gesture and voice control wheel chair system using signal processing”, asian journal of information technology, 18(18), issue 8, 1682-3915.
sathish kumar r, girivarman r, parameshwaran s, sriram v. (2020) "stock price prediction using deep learning and sentimental analysis", international journal of emerging technologies and innovative research 7(5), 346-354.
sathishkumar, r., kalaiarasan, k., prabhakaran, a., & aravind, m. (2019, march). detection of lung cancer using svm classifier and knn algorithm. in 2019 ieee international conference on system, computation, automation and networking (icscan) (pp. 1-7). ieee.
thapliyal, h., khalus, v., & labrado, c. (2017). stress detection and management: a survey of wearable smart health devices. ieee consumer electronics magazine, 6(4), 64-69.
tsugawa, s., kikuchi, y., kishino, f., nakajima, k., itoh, y., & ohsaki, h. (2015, april). recognizing depression from twitter activity. in proceedings of the 33rd annual acm conference on human factors in computing systems (pp. 3187-3196).
world health organization. (2016). world health statistics 2016: monitoring health for the sdgs sustainable development goals. world health organization.
xue, y., li, q., jin, l., feng, l., clifton, d. a., & clifford, g. d. (2014, april). detecting adolescent psychological pressures from micro-blog. in international conference on health information science (pp. 83-94). springer, cham.
zhang, y., xu, c., li, h., yang, k., zhou, j., & lin, x. (2018). healthdep: an efficient and secure deduplication scheme for cloud-assisted ehealth systems. ieee transactions on industrial informatics, 14(9), 4101-4112.
89 | international journal of informatics information system and computer engineering 4(1) (2023) 89-100
doi: https://doi.org/10.34010/injiiscom.v4i1.9588 p-issn 2810-0670 e-issn 2775-5584
development of an educational training game for ear sensitivity of intervals
michael nagaku milenn salim, hanhan maulana*
program studi teknik informatika, universitas komputer indonesia, indonesia
*corresponding email: hanhan@email.unikom.ac.id
abstract
this study aims to build a game that helps beginner musicians train their ear sensitivity to intervals without involving a teacher in the training process. the software development uses the multimedia development life cycle (mdlc) method. the software is built in the form of a game that can train the ear's sensitivity to intervals, share knowledge related to intervals, randomize questions, and validate answers without involving a teacher in the learning process. on average, respondents' answers about the application were positive. it can be concluded that the application can help users train their ear sensitivity to intervals and increase their knowledge of music theory, especially related to intervals.
article info
article history: submitted/received 20 dec 2022 first revised 03 jan 2023 accepted 07 mar 2023 first available online 18 apr 2023 publication date 01 jun 2023
keywords: game, ear training, music, multimedia, intervals
1. introduction
in the world of music, the pitch interval is the distance between one note and another (levine, m. 2011). intervals are what build music: when we listen to music, we listen to intervals (willis, g. 1998). a musician, whether beginner or expert, must know pitch intervals. the ability to identify the intervals of the notes heard is very helpful in musical activities. an ear that is sensitive to pitch intervals is a musician's most valuable asset (pavlik, p.
i., et al., 2013). games are fun entertainment media that can be played to fill spare time. games can also be used as fun learning media, commonly called educational games (jubaedi, a. d., & putra, r. e. 2018) (arsenault, d. 2009). a fun learning process can make the subject matter more interesting and facilitate its delivery (pradana, a. g. 2019). the use of educational games in learning can improve the quality and disruption in the learning process as well as train the user's ability to solve problems, find solutions, think quickly and improve (limin, s. 2022), (ardiningsih, d. 2019).
journal homepage: http://ejournal.upi.edu/index.php/ijost/
michael and hanhan. development of an educational training game for ear ,... | 90
a good musician has an ear that is sensitive to changes in pitch. beginner musicians generally know the concept of solfège (do, re, mi, fa, sol, la, si, do) but do not yet have a good ear. the ear's sensitivity to pitch intervals can develop over time, and this development may be aided by ear training exercises (willis, g. 1998). the music learning process, in this case ear training, requires a teacher who can set questions randomly and dynamically and correct errors during the learning process. hiring a teacher can cost a lot of money, and face-to-face learning during a pandemic can pose a danger of spreading covid-19 (santoso, a. m. h. 2022). the need for social distancing complicates the process of learning music, especially in terms of ear training.
by using educational games as learning media for ear training, the randomization of questions and the correction of errors can be automated dynamically, even under social distancing conditions. the use of educational games can also reduce the cost of learning music. several studies related to games for ear training or music theory have been carried out. a study entitled "adventure game as learning media for introducing music interval and ear training to kids" discusses the use of games as an educational medium for musical intervals and ear training for children (rizqyawan, m. i., & hermawan, g. 2015). another study, entitled "development of interactive quiz games as a formative evaluation instrument in music theory course", builds games to support student understanding in music theory courses (haditama, i., et al., 2016). these previous studies built desktop-based single-player games with 2d graphics. however, in these studies, input from users is still in the form of a multiple-choice quiz. ear training exercises that involve singing or humming can help the ear hear intervals better. therefore, input in the form of sound through a microphone is required so that the learning process can be more effective (willis, g. 1998). the software development method used is the multimedia development life cycle (mdlc). mdlc was chosen because educational games are part of interactive multimedia, so mdlc is a suitable method for developing this software (afrianto, i., & furqon, r. m. 2018).
2. method
in this study, the methodology includes the three stages shown in fig. 1, with the following explanation:
fig. 1. research methodology
2.1.
data collection stage
at this stage, the data to be used in the research is collected. the data collection process begins with a literature study of research and technology in the field of music, especially research that integrates technology into music. the literature study was conducted to find out what research has been done, expand knowledge, and widen the range of ideas. next is the observation stage, in which problems are identified from the results of the observations.
2.2. educational game development stage
the approach used in the development of the educational game is the multimedia development life cycle (mdlc). this approach has six stages that need to be carried out (mursid, r. 2018), (laksamana, d.j., et al 2021):
i) concept: the first stage is the concept. this study adopts the concept of an educational game with an outer space theme for learning music theory and training ear sensitivity.
ii) design: the second stage is where the design is carried out, covering the menu, character, and storyboard designs.
iii) material collecting: the third stage is for collecting material, in the form of audio, images, and data.
iv) assembly: the fourth stage is the development stage, where the components are built into a complete system.
v) testing: the fifth stage is the testing stage, where the testing process is carried out.
vi) distribution: the sixth stage is the distribution stage, which disseminates this educational game so that it is useful for the community.
the activities of the mdlc approach can be seen in fig. 2.
fig. 2. mdlc approach
2.2.1.
concept
the game that was built helps beginner musicians train their ear sensitivity to tone intervals. it is a desktop-based educational training game with 2d (2 dimensional) graphics. the game is described as follows:
i) the game plays questions in the form of interval sounds with the same root in each question
ii) players guess the sound by providing input in the form of the same interval sound through the microphone
iii) the game calculates the player's score on each level
iv) the next level is unlocked if the player reaches a minimum score of 75%
v) each level has a different difficulty
vi) the material to be taught and trained is part of the c major scale
vii) the game is a form of training that must be done repeatedly.
2.2.2. design
the design stage is the stage of working out the specifications of the game to be built. the system design is as follows:
i) storyline design: a spaceship pilot is assigned to collect precious space rocks. the pilot must sail through the vast darkness of outer space in order to collect the precious stones needed by the main ship. players play the spaceship character.
ii) level design: the levels are divided into seven sections. each level practices different pitch intervals. players start the game from level 1. the next level is unlocked when the player has reached the minimum score. the minimum score needed to unlock the next level is 75%; this value is taken from the national completeness target (yusuf, m. 2019), (aldwell, e., et al., 2018). the list of tone intervals that are trained at each level can be seen in table 1.
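the level rules above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the game's actual implementation: all names (`intervals_for_level`, `target_frequency`, `next_level_unlocked`) are hypothetical, the score is assumed to be the percentage of correctly guessed intervals, the root is assumed to be c3 (the storyboard design places the lowest "do" at c3), and pitches are assumed to follow twelve-tone equal temperament with a4 = 440 hz. the semitone sizes of the intervals are standard music theory.

```python
MIN_UNLOCK_SCORE = 75.0  # percent, the minimum score stated in the level design

# Semitone distance of each trained interval above the root (standard theory).
SEMITONES = {
    "unison": 0,
    "major second": 2,
    "major third": 4,
    "perfect fourth": 5,
    "perfect fifth": 7,
    "major sixth": 9,
    "major seventh": 11,
    "perfect octave": 12,
}
INTERVAL_ORDER = list(SEMITONES)  # dicts keep insertion order

# Assumption: equal temperament, A4 = 440 Hz; C3 lies 21 semitones below A4.
ROOT_HZ = 440.0 * 2 ** (-21 / 12)


def intervals_for_level(level):
    """Levels 1-7 each add one interval, following table 1."""
    if not 1 <= level <= 7:
        raise ValueError("interval levels run from 1 to 7")
    return INTERVAL_ORDER[: level + 1]


def target_frequency(interval):
    """Frequency in Hz of the note that lies `interval` above the C3 root."""
    return ROOT_HZ * 2 ** (SEMITONES[interval] / 12)


def score_percent(correct, total):
    """Assumed score: share of correctly guessed intervals, in percent."""
    if total <= 0:
        raise ValueError("a level needs at least one question")
    return correct / total * 100


def next_level_unlocked(correct, total):
    """The next level opens once the score reaches the 75% minimum."""
    return score_percent(correct, total) >= MIN_UNLOCK_SCORE


if __name__ == "__main__":
    for name in intervals_for_level(3):
        print(f"{name}: {target_frequency(name):.2f} Hz")
    print(next_level_unlocked(correct=6, total=8))
```

keeping the threshold and the interval table as data, rather than spreading them through the game logic, makes it easy to adjust the difficulty of a level without touching the scoring code.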
iii) storyboard design: the screen displays a series of notes vertically, the spaceship character, and space rocks that serve as hints for the tone intervals. the game plays tone interval questions as sound. if the player activates the hint feature, a rock is displayed that indicates the tone that must be played. if the player does not activate the hint feature, the rocks are invisible. players provide input in the form of sound through the microphone. the spaceship character moves up or down according to the frequency of the tone input by the user. the lowest “do” tone represents c3. the storyboard design can be seen in fig. 3.
table 1. intervals in each level
level | trained intervals
1 | unison, major second
2 | unison, major second, major third
3 | unison, major second, major third, perfect fourth
4 | unison, major second, major third, perfect fourth, perfect fifth
5 | unison, major second, major third, perfect fourth, perfect fifth, major sixth
6 | unison, major second, major third, perfect fourth, perfect fifth, major sixth, major seventh
7 | unison, major second, major third, perfect fourth, perfect fifth, major sixth, major seventh, perfect octave
8 | ghundul-ghundul pacul song
fig. 3. storyboard design
iv) gameplay design: this part is the gameplay, in which players train their ear sensitivity while facing obstacles. players are represented as a spaceship that sails through the vast darkness of outer space. in this mode, the player must capture as many precious space stones as possible.
players set the spaceship's flight path with sound through the microphone. the movement of the spaceship follows the input tone (frequency) provided by the player.
v) scoring design: the score at each level depends on how many intervals the player guesses correctly. the number of correctly guessed intervals is divided by the total number of questions at each level. the scoring calculation can be seen in formula (1).
$$S = \frac{B}{N} \times 100\% \qquad (1)$$
with the following variables: S = the player's score at that level; B = the number of correctly guessed intervals in the level; N = the total number of questions in the level.
vi) character design: players play a spaceship character, and the questions played appear as space rocks if players use the hint feature. the character designs used in the game can be seen in table 2.
2.2.3. material collecting
the material collecting stage is the stage of collecting the materials or assets used in developing the educational game. the assets collected can be seen in table 3.
table 2. character design
no | name | figure | information
1. | pesawat luar angkasa | (figure) | in-game player character
2. | bebatuan do-si | (figure) | characters for the do to si tone hints, with different colors
table 3. assets used
no | name | information
1. | play button | play button asset to enter the select window
2. | quit button | quit button asset to leave the game
3. | back button | back button asset to return to the previous window
4. | hint tidak aktif button | inactive hint button asset; indicates that the hint feature is not active
5. | hint aktif button | active hint button asset; indicates that the hint feature is active
6. | score button | score button asset to display the jscore window
7. | go button | button asset to display the jgo window
8. | laser biru soal | boundary asset marking the segment of the question sound
9. | laser kuning jawab | boundary asset marking the segment of the voice input
10. | laser hijau jawab benar | boundary asset marking the voice input segment; marks a correct input from the player
11. | laser merah jawab salah | boundary asset marking the voice input segment; marks a wrong input from the player
12. | tombol level 1-8 | level 1-8 button assets to display the jlevel1-8 windows
13. | papan penjelasan | explanation board asset to display explanatory information on each level
2.2.4. assembly
the assembly stage is the implementation stage, in which assets and designs are combined into a complete game that is ready to use. the main menu interface, the first display in the game, can be seen in fig. 4. fig. 5 shows the level selection interface with the hint feature control. fig. 6 shows the interface that appears when the user has selected a level. the level material interface also displays the materials for that level. the material that has been taught is then trained in the main game interface, shown in fig. 7.
fig. 4. home menu interface
fig. 5. level select menu interface
fig. 6. level material interface
fig. 7. main game interface
after the user completes the main game, the score obtained is saved by the system and can be viewed again in the score interface. the score interface can be seen in fig. 8.
fig. 8. score interface
2.2.5. testing
the testing stage is the stage of testing the game that was built. the tests carried out are blackbox testing, musical instrument testing, and beta testing.
2.2.6. distribution
the distribution stage is the stage where the game is uploaded to google drive, from which users can download it.
2.3.
conclusion drawing stage
the conclusion drawing stage is the stage where conclusions are drawn from the responses of users who have played the game.

3. results and discussion
the game that was built is tested with the blackbox method, followed by musical instrument testing and beta testing, which involves users. the first test is blackbox testing, which aims to find functional errors or bugs in the game. the blackbox tests of the main components can be seen in table 4.

the next test is musical instrument testing, which checks whether the sound of the instruments used is received well by the game. if the sound is processed properly, the spaceship moves up or down following the frequency of the given sound. the results of the musical instrument testing can be seen in table 5.

the last test is beta testing, which involves users as respondents who answer a set of prepared questions. the questions can be seen in table 6. respondents answer the questions in table 6 by choosing one of several weighted answers; the weight of each answer can be seen in table 7. the beta testing results for the prepared questions can be seen in table 8.

table 4. blackbox test results
no | tested component | testing scenario | test result
1. | home menu | buttons on the start menu | passed
2. | menu select level | buttons on the level menu | passed
   |  | selecting level 1 to level 2 | passed
3. | level menu | buttons on the level menu | passed
4. | gameplay | move the spaceship with the microphone | passed
   |  | answering questions correctly | passed
   |  | answering questions incorrectly | passed
   |  | buttons on the level menu | passed
5. | score | save game score | passed
   |  | view game score | passed

table 5. musical instrument test results
musical instrument | c3 | d3 | e3 | f3 | g3 | a3 | b3 | c4
nylon stringed guitar | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
steel stringed guitar | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
guitalele | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
human voice | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓

table 6. beta testing questions
no | question
1 | does playing this game increase your knowledge of music theory related to tone intervals?
2 | can the tone interval played in each question be heard clearly?
3 | does the ear training provided match your skill level?
4 | does this game present tone interval theory in an understandable way?
5 | do you feel comfortable practicing ear sensitivity using this game?

table 7. question weight
weight | information | likert scale
3 | yes | 0-33
2 | maybe | 34-66
1 | no | 67-100

table 8. beta test results
question | score | maximum score | respondents | value | result
1 | 83 | 93 | 21 yes, 10 enough, 0 no | 89.25% | yes
2 | 85 | 93 | 23 yes, 8 enough, 0 no | 91.40% | yes
3 | 75 | 93 | 15 yes, 14 enough, 2 no | 80.64% | yes
4 | 82 | 93 | 22 yes, 7 enough, 2 no | 90.11% | yes
5 | 80 | 93 | 19 yes, 11 enough, 1 no | 87.91% | yes

from the test results, 89.25% of users gained new knowledge related to tone interval theory, 91.40% of users said the tone played could be heard clearly, 80.64% of users felt that the training provided matched their level of expertise, 90.11% of users could understand the given tone interval theory, and 87.91% of users felt comfortable practicing ear sensitivity using this educational game.
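the value column of table 8 can be read as a weighted likert index. the sketch below is a minimal python illustration under the assumption that the value equals the weighted score divided by the maximum possible score (highest weight times number of respondents). the function name `likert_percentage` and the label `maybe` (written as "enough" in table 8) are illustrative choices, not taken from the game's code. under this assumption the sketch reproduces the values reported for questions 1 and 2.

```python
# weights from table 7: yes = 3, maybe ("enough" in table 8) = 2, no = 1
WEIGHTS = {"yes": 3, "maybe": 2, "no": 1}


def likert_percentage(counts):
    """Return (score, max_score, percentage) for one question.

    counts maps an answer label to the number of respondents who chose it.
    Assumption: percentage = score / (highest weight * respondents) * 100.
    """
    respondents = sum(counts.values())
    score = sum(WEIGHTS[answer] * n for answer, n in counts.items())
    max_score = max(WEIGHTS.values()) * respondents
    return score, max_score, round(score / max_score * 100, 2)


# answer counts for questions 1 and 2, copied from table 8
q1 = likert_percentage({"yes": 21, "maybe": 10, "no": 0})
q2 = likert_percentage({"yes": 23, "maybe": 8, "no": 0})
print(q1)  # (83, 93, 89.25)
print(q2)  # (85, 93, 91.4)
```

the same function can be applied to any other question by passing its answer counts.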
4. conclusion
based on the results of the research, analysis, system design, implementation, and testing, it can be concluded that the game can serve as an alternative tool that helps users or novice musicians train their ear sensitivity to tone intervals without involving a teacher in the learning process. the application also provides some insight into music theory related to tone intervals. this ear sensitivity training game still has shortcomings, so the following suggestions are offered for its further development:
1. learning and training materials can be expanded by adding other scales such as minor scales.
2. the sounds used as training questions can be increased in number to add difficulty and variety.

acknowledgments
we acknowledge bangdos, universitas pendidikan indonesia.

references
afrianto, i., & furqon, r. m. (2018). the herbalist game edukasi pengobatan herbal berbasis android. jurnal sistem informasi bisnis, 8(2), 141.
aldwell, e., schachter, c., & cadwallader, a. (2018). harmony and voice leading. cengage learning.
ardiningsih, d. (2019). pengembangan game kuis interaktif sebagai instrumen evaluasi formatif pada mata kuliah teori musik. jurnal inovasi teknologi pendidikan, 6(1), 92-103.
arsenault, d. (2009). video game genre, evolution and innovation. eludamos: journal for computer game culture, 3(2), 149-176.
haditama, i., slamet, c., & fauzy, d. (2016). implementasi algoritma fisher-yates dan fuzzy tsukamoto dalam game kuis tebak nada sunda berbasis android. jurnal online informatika, 1(1), 51-58.
jubaedi, a. d., & putra, r. e. (2018, november). pembelajaran alat musik kendang pada kesenian beluk berbasis game.
in prosiding seminar nasional rekayasa teknologi informasi | snartisi (vol. 1).
laksana, d. j., budiman, a., & apriandari, w. (2021). game edukasi pengenalan alat musik tradisional menggunakan metode mdlc berbasis android. jutisi: jurnal ilmiah teknik informatika dan sistem informasi, 10(1), 45-56.
levine, m. (2011). the jazz theory book. o'reilly media, inc.
limin, s. (2022). strategi pembelajaran pada mata kuliah teori musik dengan menggunakan aplikasi kahoot. psalmoz: a journal of creative and study of church music, 3(1), 10-19.
mursid, r. (2018). pengembangan media pembelajaran interaktif pada mata pelajaran bahasa inggris. jurnal teknologi informasi dan komunikasi dalam pendidikan, 5(2), 210-221.
pavlik, p. i., hua, h., williams, j., & bidelman, g. (2013, july). modeling and optimizing forgetting and spacing effects during musical interval training. in educational data mining 2013.
pradana, a. g. (2019, october). rancang bangun game edukasi "amudra" alat musik daerah berbasis android. in prosiding seminar nasional teknologi informasi dan komunikasi (senatik) (vol. 2, no. 1, pp. 49-53).
rizqyawan, m. i., & hermawan, g. (2015, october). adventure game as learning media for introducing music interval and ear training to kids. in 2015 international conference on automation, cognitive science, optics, micro electro-mechanical system, and information technology (icacomit) (pp. 172-175). ieee.
santoso, a. m. h. (2022). covid-19: varian dan mutasi. jurnal medika hutama, 3(02 januari), 1980-1986.
willis, g. (1998). ultimate ear training for guitar & bass. h. leonard.
yusuf, m. (2019). peningkatan kemampuan guru dalam menentukan kriteria ketuntasan minimal (kkm) melalui workshop di uptd sdn banda soleh 1 kecamatan kokop kabupaten bangkalan tahun 2019.
re-jiem (research journal of islamic education management), 2(1), 131-144.

53 | international journal of informatics information system and computer engineering 2(2) (2021) 53-65

risks of chronic kidney disease prediction using various data mining algorithms
akalya devi c*, fatima abdul jabbar**, kavi varshini s***, kriti s rithanya****, miruthubashini m*****, naveena k s******
*assistant professor, 2ug scholar, department of information technology, psg college of technology, coimbatore, india.
*corresponding email: akalya.jk@gmail.com

abstract
twenty million people have chronic kidney disease, in which patients experience a gradual deterioration of kidney function that ultimately results in kidney failure. early detection of chronic renal disease can help to slow its progression, avert complications, and reduce the risk of cardiovascular complications. data mining has been broadly used to support medical professionals and physicians in prediction and examination. in this paper, multiple data mining algorithms are applied to a problem in the field of medical diagnosis, and their effectiveness at predicting the outcome is examined. the study focuses on the diagnosis of chronic renal disease. the dataset used for this study consists of 400 instances and 25 attributes. the raw data is preprocessed to impute missing values and to determine which variables should be taken into account in the prediction models. prediction accuracy is used to compare and contrast the various predictive analytic models.

article info
article history: received 18 dec 2021, revised 20 dec 2021, accepted 25 dec 2021, available online 26 dec 2021
keywords: chronic kidney disease, k-nearest neighbor classification, predictive analytics, decision tree, data mining, support vector machine, random forest.
1. introduction
chronic kidney disease (ckd), or chronic kidney failure, is the progressive impairment of the kidney's ability to function normally. chronic kidney disease is induced primarily by high blood pressure (hypertension) and diabetes, along with several other factors, in particular smoking, obesity, heart disease, heredity, consumption of alcohol, usage of drugs, age, race, ethnicity, etc. in india and other developing nations, chronic diseases remain a leading cause of mortality. the number of deaths in india owing to chronic disease was estimated at 5.21 million in 2008 and was projected to rise to over 7.63 million by 2020. the disease develops through five distinct stages, each more severe than the last. stage 1 is when a person's kidney function falls below normal. as affected individuals progress into stage 2, they may experience a mild to moderate loss of renal function. the condition worsens in stage 3, where there is a moderate deterioration of kidney function, followed by severe damage to kidney function in stage 4. stage 5 is complete kidney failure (almustafa, 2021). the massive increase in the amount of medical data available to predict the disease has raised the question of how it can be effectively classified, managed, and transferred. effective methods are required to extract useful insight and knowledge from this raw data. data mining techniques are a dependable and pragmatic way of accomplishing this.
data mining is the process of analysing massive amounts of data and extracting knowledge from it. beyond the medical sector, such data are systematically organized and exploited in many real-time applications such as social networking sites and other online websites. data mining spans many domains, including graphic data mining, web data mining, text data mining, and image data mining. these domains support decision-making and the extraction of useful information from the dataset under investigation. prediction of the risk of chronic kidney disease is based on several health parameters, including random blood glucose level, blood pressure, serum creatinine level, and others. the supervised classification algorithms used to predict the risk of chronic kidney disease are decision tree classification, support vector machine (svm) classification, random forest classification, and k-nearest neighbor (knn) classification (aqlan et al., 2017). in our experiments, random forest and knn proved to be the best classifiers, with higher reliability than the decision tree and svm classifiers.

2. literature review
in this research paper, recent data mining procedures are used to classify and predict chronic kidney disease, considering various influencing factors such as blood pressure, red blood cell count, haemoglobin, etc. the techniques used in this paper provide higher accuracy than the techniques used in other existing works.

kaur g et al. applied two data mining classifiers, knn and svm, to predict chronic kidney disease and reported their accuracy and error percentage (arasu & thirumalaiselvi, 2017). bhatla n et al. analysed the most dangerous diseases, among which breast cancer, heart disease, and diabetes are the predominant ones (bhatla & jyoti, 2012). they surveyed 168 articles on techniques for diagnosing various diseases, and carefully examined the techniques, data mining approaches, and evaluation methodologies. kunwar v et al. used classification approaches such as naive bayes and artificial neural networks (ann) to predict chronic renal disease. according to the trial results obtained with the rapidminer tool, naive bayes generates more accurate outcomes than ann (gharibdousti et al., 2017). decision tree, linear regression, support vector machine, naive bayes, and artificial neural networks (anns) were among the classification strategies used (ilyas et al., 2021). the correlation matrix was used to investigate the correlation between features, which allowed the authors to observe the influence of individual attributes on the classification results. padmanaban k. a et al. applied data mining algorithms such as naive bayes and the decision tree algorithm to a chronic renal disease dataset (padmanaban & parthiban, 2016). after comparing several classification algorithms, they recommended decision tree classification, which reached good accuracy when its performance was assessed by specificity and sensitivity (kunwar, et al., 2021). sharma s et al. evaluated 12 data mining classification techniques by applying them to the ckd dataset (sharma et al., 2016). to determine efficiency, the predictions were compared with the actual medical outcomes. the metrics used to evaluate performance include predictive accuracy, precision, sensitivity, and specificity (kunwar et al., 2016).
with an accuracy of about 98.6%, a sensitivity of 0.9720, a precision of 1, and a specificity of 1, the decision tree showed the best performance. arasu d et al. employed major data mining methods, in particular clustering, classification, association analysis, and regression, to predict renal diseases (milley, 2000). these techniques showed only minor shortcomings in preprocessing and the other stages. various data mining techniques are evaluated and the major problems are briefly explained. vijayarani s et al. focused on a novel machine learning classification strategy to predict chronic renal illness, employing svm on a data sample of 400 observations and 24 attributes (vijayarani et al., 2015).

3. proposed work
millions of individuals die from ckd each year because they do not receive proper treatment. ckd risk factors fall into four main categories:
1. susceptibility factors, which increase susceptibility to renal damage.
2. initiation factors, which play a key role in initiating renal damage.
3. progression factors, which worsen kidney damage and accelerate the decline in kidney function once the damage has begun.
4. end-stage factors, which increase morbidity and mortality once kidney failure has occurred.

kidney diseases have been predicted and compared using svm and ann algorithms on the basis of accuracy and execution time. svm, knn, and some other algorithms have been used to assess performance on the ckd dataset from the uci repository; the raw data was cleaned and processed in the steps shown in figure 1. four classifiers are analyzed in section 3: decision tree, support vector machine (svm), random forest, and k-nearest neighbor (knn).
these techniques were chosen for examination because of their prevalence in the recent literature. a concise description of the chosen methods is given below.

3.1. data mining algorithms & techniques
a data mining algorithm is a well-defined procedure that takes data as input and produces output in the form of patterns or models. data mining comprises a number of algorithms and strategies, namely classification, clustering, prediction, association rules, neural networks, etc., to perform knowledge discovery from databases. table 1 shows the ckd stages and the evaluation plans employed here.

table 1. classification of ckd & evaluation plans
s. no | phase of ckd | gfr, glomerular filtration rate (ml/min/1.73 m²) | evaluation strategy
1. | kidney damage with normal gfr | 90 or above | treatment of coexisting conditions, reduction of risk factors for cardiovascular disease
2. | kidney damage with mild reduction in gfr | 60-89 | estimation of disease progression
3. | moderate reduction in gfr | 30-59 | assessment and treatment of disease complications
4. | severe reduction in gfr | 15-29 | preparation for kidney replacement therapy (dialysis, transplantation)
5. | renal failure | less than 15 | kidney transplant therapy

the predictive analysis is based on four selected data columns: age, blood pressure, red blood cells, and appetite. of these four fields, blood pressure and age hold numeric data, while red blood cells and appetite hold categorical data. the nominal data was converted into numeric types so that the classification techniques can handle the string-based categorical attributes, which statistical models cannot process directly. the proposed framework for the study is illustrated in figure 1.

figure 1. proposed framework for ckd analysis and prediction

the following basic steps were performed initially:
1. acquire the data from the local disk.
2. manually choose the columns with the help of the column identifiers.
3. convert all nominal values to numerical values.
4. after the categorical transformation, build the final data matrix.
5. search the final data matrix for missing values.
6. compute the mean of every column that represents a variable.
7. fill in each missing value with the mean of its column.
8. shuffle the data matrix so that the rows of the feature matrix are in random order.
9. divide the data into training and testing data matrices.
10. prepare the observation vectors for training and testing.

3.2. classification
classification is one of the best known and most common data mining approaches. entities are assigned to predefined categories called classes. every entity must be assigned to exactly one class, never to more than one and never to none. decision tree, svm, knn, and random forest are the classification algorithms included in this model.

3.2.1. decision tree classification
this method is especially useful for solving classification problems in which a tree is built to represent the classification process. once the tree is built, it is applied to every tuple in the database to produce a classification. classification tree analysis and regression tree analysis are the two forms of decision trees used in data mining; they are used when the predicted outcome is, respectively, the class to which the data belongs or a real number.
1. fitting the decision tree to the training set.
2. predicting the test result.
3. calculating the accuracy.
4. displaying the confusion matrix.
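the four decision tree steps above (fit, predict, accuracy, confusion matrix) can be sketched in plain python. this is a minimal illustration, not the authors' implementation: a depth-1 tree (a decision stump) stands in for a full decision tree, the two feature columns (serum creatinine, hemoglobin) and all values are invented toy data rather than rows from the uci dataset, and the helper names are hypothetical.

```python
from collections import Counter


def fit_stump(X, y):
    """Fit a depth-1 decision tree: choose the (feature, threshold) split
    that misclassifies the fewest training rows."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # a split must send rows to both sides
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:
        raise ValueError("no valid split found")
    return best[1:]  # (feature index, threshold, left label, right label)


def predict(stump, X):
    f, t, left_label, right_label = stump
    return [left_label if row[f] <= t else right_label for row in X]


def accuracy(y_true, y_pred):
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, b in zip(y_true, y_pred):
        matrix[index[a]][index[b]] += 1
    return matrix


# invented toy rows: [serum creatinine, hemoglobin]
X_train = [[0.8, 15.0], [1.0, 14.2], [0.9, 13.8], [3.2, 9.5], [4.1, 10.1], [2.8, 8.9]]
y_train = ["notckd", "notckd", "notckd", "ckd", "ckd", "ckd"]
X_test = [[1.1, 14.0], [3.5, 9.0], [0.7, 15.5], [2.5, 11.0]]
y_test = ["notckd", "ckd", "notckd", "ckd"]

stump = fit_stump(X_train, y_train)            # step 1: fit
y_pred = predict(stump, X_test)                # step 2: predict
acc = accuracy(y_test, y_pred)                 # step 3: accuracy
cm = confusion_matrix(y_test, y_pred, ["notckd", "ckd"])  # step 4
print(stump, acc, cm)
```

a full decision tree repeats the same split search recursively on each side of the split; the stump keeps the sketch short while following the same four steps.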
3.2.2. svm classification
svm is a supervised machine learning algorithm that can be used to solve classification and regression problems. it uses a technique called the kernel trick to transform the data and then, based on this transformation, finds an optimal boundary between the possible outputs (sinha & sinha, 2015). the following steps were performed:
1. the support vector machine (a classification technique) is applied to the available data for predictive analysis.
2. the classifier is trained using the training data matrix and the training observation vector.
3. the testing data matrix, which contains unseen data, is used to test the classifier.
4. the predictions (observations predicted by the svm classifier) are returned as output. the overall performance is computed by comparing the outputs of the svm classifier with the actual observations.

3.2.3. random forest classifier
random forest is a classifier that fits a number of decision trees on different subsets of a given dataset and averages their outputs to improve predictive accuracy. the following steps were performed (sinha & sinha, 2015):
1. fitting random forest classification to the training set.
2. predicting the test result.
3. calculating the accuracy.
4. displaying the confusion matrix.

3.2.4. knn classifier
knn is a distance-based technique that is typically used when the values of all attributes are continuous, although it can also be used with nominal attributes (subasi et al., 2017). an unknown sample is classified according to the class of the closest instance or instances in the training set. when more than one instance is used, the sample is assigned to the majority class among its k nearest neighbors (vijayarani et al., 2015).
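the distance-based vote described for knn can be sketched in plain python. this is a minimal illustration under stated assumptions, not the authors' code: euclidean distance and min-max scaling are assumed choices, the two feature columns (age, blood pressure) and all values are invented toy data, and the helper names are hypothetical.

```python
import math
from collections import Counter


def min_max_scale(rows, lows, highs):
    """Rescale every feature to [0, 1] using the given per-column minimum and maximum."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(X_train, y_train, x, k):
    """Return the majority class among the k training points closest to x."""
    ranked = sorted(zip(X_train, y_train), key=lambda pair: euclidean(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# invented toy rows: [age, blood pressure]
X_train = [[25, 70], [32, 80], [41, 75], [58, 95], [63, 100], [70, 90]]
y_train = ["notckd", "notckd", "notckd", "ckd", "ckd", "ckd"]

# scale with the training minimum and maximum so both features weigh equally
lows = [min(col) for col in zip(*X_train)]
highs = [max(col) for col in zip(*X_train)]
X_scaled = min_max_scale(X_train, lows, highs)
queries = min_max_scale([[60, 92], [30, 72]], lows, highs)

predictions = [knn_predict(X_scaled, y_train, q, k=3) for q in queries]
print(predictions)
```

scaling matters here because blood pressure and age are measured on different ranges; without it the feature with the larger range would dominate the distance.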
the steps taken were as follows:
1. k-nearest neighbor (a classification technique) is applied to the given data for predictive analysis.
2. before the process starts, the value of k, the number of neighbors to be considered, is initialized.
3. the knn classifier is trained with the specified k on the training data matrix and the training observation vector.
4. the classifier is tested on the test data matrix, which contains unseen data, and evaluated on the required metrics.
5. the predictions made by the knn classifier are returned.
6. the overall accuracy and performance of the knn classifier are estimated by comparing its predictions with the actual observations.

4. results and discussion
the chronic kidney disease (ckd) dataset was obtained from the uci machine learning repository and is used in this study for prediction and validation. the dataset includes both numerical and nominal attributes and contains missing values. it has 400 instances and 25 attributes: 24 predictive attributes and one class attribute (ckd or not-ckd). table 2 gives the attribute description of the dataset.

table 2. ckd dataset attributes description
s. no | attribute name | expansion
1 | age | age of the patient
2 | bp | blood pressure
3 | sg | specific gravity
4 | al | albumin
5 | su | sugar
6 | rbc | red blood cells
7 | pc | pus cell
8 | pcc | pus cell clumps
9 | ba | bacteria
10 | bgr | blood glucose random
11 | bu | blood urea
12 | sc | serum creatinine
13 | pot | potassium
14 | hemo | hemoglobin
15 | pcv | packed cell volume
16 | wc | white blood cell count
17 | rc | red blood cell count
18 | htn | hypertension
19 | dm | diabetes mellitus
20 | cad | coronary artery disease
21 | appet | appetite
22 | pe | pedal edema
23 | ane | anemia
24 | sod | sodium

data cleaning and preprocessing is the most critical point in the data mining procedure, as it strongly influences the rate of success. the categorical attributes were replaced with 0s and 1s corresponding to their values. the missing values were replaced with the mean of the respective attribute. because age covers a wide range, the age attribute was grouped into bins (sharma et al., 2016). the ckd dataset includes features that vary in magnitude, range, and units. to bring all the features to the same scale, feature scaling (data normalization) was carried out. the ckd dataset was split into 70% for training and 30% for testing. four data mining procedures, namely decision tree classification, support vector machine classification, random forest classification, and knn classification, were applied to the training and testing data, and their performance was measured using precision, f1-score, recall, accuracy, specificity, and sensitivity (vijayarani et al., 2015). table 3 presents the performance metrics used in this paper.

table 3. performance analysis metrics used
metric | definition | equation
precision | the ratio of correctly predicted positive observations to all predicted positive observations. | $\frac{TP}{TP + FP}$
recall (sensitivity) | the proportion of actual positives (yes's) that are correctly identified. | $\frac{TP}{TP + FN}$
f1-score | the harmonic mean of precision and recall. | $\frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$
accuracy | the model's ability to correctly predict the class label of new or previously unseen data. | $\frac{TP + TN}{TP + TN + FP + FN}$
specificity | the proportion of actual negatives (no's) that are correctly identified. | $\frac{TN}{TN + FP}$

here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives. the performance metrics of the proposed algorithms were derived using the equations listed in table 3. table 4 shows the results obtained for every algorithm.

table 4. performance measures of the proposed algorithms
model | precision (not-ckd) | precision (ckd) | recall (not-ckd) | recall (ckd) | f1-score (not-ckd) | f1-score (ckd) | specificity | accuracy
decision tree | 0.91 | 1.00 | 1.00 | 0.95 | 0.95 | 0.97 | 1.00 | 0.967
svm | 0.93 | 1.00 | 1.00 | 0.96 | 0.97 | 0.98 | 1.00 | 0.975
random forest | 0.95 | 0.99 | 0.98 | 0.97 | 0.96 | 0.98 | 0.987 | 0.975
knn | 0.95 | 1.00 | 1.00 | 0.97 | 0.98 | 0.99 | 1.00 | 0.983

the train score indicates how well the model fits the training data. similarly, the test score shows how the model performs on unseen data. the area under the curve (auc) score describes the model's overall ability to distinguish between the positive and negative classes. figure 2 compares the training score, test score, and mean auc score.

figure 2. depiction of train, test, and mean auc scores of the proposed algorithms

the mean absolute error is the average magnitude of the difference between an observation's predicted value and its true value.
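the metrics in table 3 and the mean absolute error can be computed directly from confusion matrix counts and label vectors. the sketch below is a minimal python illustration using the standard textbook definitions; the counts and label vectors are invented for the example and are not results from this study, and the function names are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion matrix counts.

    Assumes every denominator is nonzero (at least one predicted positive,
    one actual positive, and one actual negative).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# invented counts for illustration only
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# with 0/1 class labels the mean absolute error equals the error rate
mae = mean_absolute_error([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(f"mae: {mae:.3f}")
```

for a two-class problem coded as 0 and 1, the mean absolute error is simply one minus the accuracy, which is why the two figures move together.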
for the proposed algorithms, figure 3 illustrates the mean absolute error.

figure 3. illustration of the proposed algorithms' mean absolute error

the receiver operating characteristic curve (roc curve) shows the classification model's performance across all classification thresholds [10]. the roc curves for the algorithms employed in this study are illustrated in figure 4.

figure 4. plot of roc curve for the proposed algorithms

5. conclusion
the objective of this article is to analyze a variety of data mining techniques and algorithms used to predict chronic renal disease. ckd has been predicted and diagnosed using four data mining classifiers: decision tree, svm, random forest, and knn. knn gave the best accuracy at 98.3%, compared with decision tree (96.7%), svm (97.5%), and random forest (97.5%). the work can be extended by taking into account other parameters for the detection of kidney disease, such as food intake, living conditions like sanitation and availability of clean water, working environment, and environmental factors like pollution. further experimentation can be conducted using other classifiers like ann or by using ensemble techniques.

references
almustafa, k. m. (2021). prediction of chronic kidney disease using different classification algorithms. informatics in medicine unlocked, 100631.
aqlan, f., markle, r., & shamsan, a. (2017). data mining for chronic kidney disease prediction. in iie annual conference. proceedings (pp. 1789-1794).
institute of industrial and systems engineers (iise).
arasu, d., & thirumalaiselvi, r. (2017). review of chronic kidney disease based on data mining techniques. international journal of applied engineering research, 12(23), 13498-13505.
bhatla, n., & jyoti, k. (2012). an analysis of heart disease prediction using different data mining techniques. international journal of engineering, 1(8), 1-4.
gharibdousti, m. s., azimi, k., hathikal, s., & won, d. h. (2017). prediction of chronic kidney disease using data mining techniques. in iie annual conference. proceedings (pp. 2135-2140). institute of industrial and systems engineers (iise).
ilyas, h., ali, s., ponum, m., hasan, o., mahmood, m. t., iftikhar, m., & malik, m. h. (2021). chronic kidney disease diagnosis using decision tree algorithms. bmc nephrology, 22(1), 1-11.
kunwar, v., chandel, k., sabitha, a. s., & bansal, a. (2016, january). chronic kidney disease analysis using data mining classification techniques. in 2016 6th international conference-cloud system and big data engineering (confluence) (pp. 300-305). ieee.
milley, a. (2000). healthcare and data mining. health management technology, 21(8), 44-45.
padmanaban, k. a., & parthiban, g. (2016). applying machine learning techniques for predicting the risk of chronic kidney disease. indian journal of science and technology, 9(29), 1-6.
sharma, s., sharma, v., & sharma, a. (2016). performance based evaluation of various machine learning classification techniques for chronic kidney disease diagnosis. arxiv preprint arxiv:1606.09581.
sinha, p., & sinha, p. (2015). comparative study of chronic kidney disease prediction using knn and svm.
international journal of engineering research and technology, 4(12), 608-12.

subasi, a., alickovic, e., & kevric, j. (2017). diagnosis of chronic kidney disease by using random forest. in cmbebih 2017 (pp. 589-594). springer, singapore.

rubini, l. j., & eswaran, p. (2015). uci machine learning repository: chronic_kidney_disease data set.

vijayarani, s., dhayanand, s., & phil, m. (2015). kidney disease prediction using svm and ann algorithms. international journal of computing and business research (ijcbr), 6(2), 1-12.

75 | international journal of informatics information system and computer engineering 4(1) (2023) 75-88
doi: https://doi.org/10.34010/injiiscom.v4i1.9586
p-issn 2810-0670 e-issn 2775-5584

a computational bibliometric analysis of e-groceries analysis using vosviewer

rudhi lesmana*, m ihsan rifaldi
departemen manajemen, universitas komputer indonesia, indonesia
*corresponding email: rudhi.21221230@mahasiswa.unikom.ac.id

abstract

the purpose of this research is to combine mapping analysis in vosviewer with publish or perish software to perform a computational bibliometric analysis of the topic "e-groceries analysis." the method is a descriptive-quantitative approach combined with bibliometric analysis, with data retrieved from google scholar. the results show that research on e-groceries analysis increased every year: 25 articles in 2018, 32 in 2019, 49 in 2020, and 98 in 2021. further findings indicate that several sectors of e-groceries analysis remain understudied and could be examined further to increase the efficacy of e-groceries analysis. this research is also expected to serve as a reference for future work in defining and assessing the research subject, and in choosing fields to study within e-groceries analysis.
article history:
submitted/received 01 oct 2022
first revised 15 jan 2023
accepted 03 mar 2023
first available online 14 apr 2023
publication date 01 jun 2023

keywords: bibliometrics, e-groceries analysis, data analysis, vosviewer

international journal of informatics, information system and computer engineering

1. introduction

the customary sequences once followed when completing daily tasks, including shopping, have been disrupted and recombined in both time and place as a result of the internet (couclelis, h. 2000). online shopping allows customers to buy goods or services from a seller over the internet, fundamentally changing how information is gathered, compared, and used, as well as how purchases are made and delivered. people who use e-commerce can purchase products on their mobile devices while, for instance, traveling to work or waiting at the train station, without being bound by a store's opening and closing hours. the evolution of shopping significantly alters consumer behavior, and this behavior is closely related to transportation (suel, e., & polak, j. w. 2017). because grocery shopping is the most common and regular form of shopping, it has a particularly negative impact on the environment and on urban freight transportation. however, depending on customer behaviour and last-mile delivery strategies, switching from in-store to online purchasing can have both positive and negative effects on transportation.
in more detail, when customers order groceries online and request home delivery, the burden of the freight trip is transferred from the customer to the store. the final effect on urban freight transportation, however, is hard to predict because it depends on the kind of product, how often people shop, why they purchase, whether trips are chained together, and how quickly efficiency must be achieved (mokhtarian, p. l. 2004). therefore, this study aims to conduct a bibliometric analysis on the topic of purchasing decisions in using e-groceries services. it uses a mixed method: a literature review, publish or perish to collect data, and vosviewer to visualize the relationships between terms as well as other aspects such as research trends over the years. it is hoped that this research will contribute to finding the fields proposed in the topic of e-groceries analysis.

e-groceries is a business model that applies information technology to establish communication relationships and conduct transactions with customers regarding products, services, and distribution systems through internet media (muhammad, n. s., et al., 2016). previous studies on e-groceries analysis have been conducted. ayudhia et al. studied the e-groceries business model. pico and barcelo conducted a study that focuses on organic matter and microplastics; according to them, py-gc-ms is a valuable technique for e-groceries analysis, especially for covering crucial e-groceries aspects (pico, y., & barcelo, d. 2020). in addition, plenty of bibliometric analysis research has been carried out in various fields, such as computer science (al husaeni, d. f., & nandiyanto, a. b. d. 2022), educational research (al husaeni, d. f., et al., 2023), high school (al husaeni, d. n., & nandiyanto, a. b. d. 2023), techno-economic education (ragadhita, r., & nandiyanto, a. b. d. 2022), materials research (nandiyanto, a. b.
d., et al., 2020), vocational school (al husaeni, d. n., & nandiyanto, a. b. d. 2023), digital learning (al husaeni, d. f., & nandiyanto, a. b. d. 2022), scientific publications (mulyawati, i. b., & ramadhan, d. f. 2021), bioenergy management (soegoto, h., et al., 2022), chemical engineering (nandiyanto, a. b. d., et al., 2021), special needs education (al husaeni, d. f., et al., 2023), and covid-19 (hamidah, i., sriyono, s., & hudha, m. n. 2020). however, there has not yet been a bibliometric analysis of e-groceries analysis. therefore, this research aims to conduct one. the method combines a literature review, publish or perish 8 to gather the data, and vosviewer to visualize the connections between terms as well as research trends over the years. it is hoped that this research will help to discover the understudied fields in the topic of e-groceries analysis.

2. method

a descriptive-quantitative approach was applied in this study. in addition, a literature review was conducted to gain insights from previous research on bibliometric analysis and on the topic of e-groceries analysis. we collected the articles from journals indexed by google scholar because of its accessibility. publish or perish was chosen to gather the bibliometric data from google scholar (al husaeni, d. f., & nandiyanto, a. b. d. 2022). the bibliometric data were then saved in *.ris and *.csv formats for use in vosviewer and converted into *.xlsx for further analysis. the software versions used in this research are publish or perish 8 and vosviewer 1.6.17.
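as a minimal sketch of the analysis step described above, the python snippet below counts articles per publication year from a csv export. the column name `Year`, the function name `articles_per_year`, and the sample rows are assumptions made for illustration; they are not the authors' data, and a real export may use different headers.

```python
import csv
import io
from collections import Counter


def articles_per_year(csv_text, year_column="Year"):
    """Count rows per publication year; rows without a numeric year are skipped."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = (row.get(year_column) or "").strip()
        if year.isdigit():
            counts[int(year)] += 1
    return dict(sorted(counts.items()))


# hypothetical rows standing in for a real export (not the study's data)
SAMPLE_EXPORT = """Cites,Authors,Title,Year
12,Author A,Paper one,2018
7,Author B,Paper two,2019
3,Author C,Paper three,2019
0,Author D,Paper four,
"""

if __name__ == "__main__":
    for year, n in articles_per_year(SAMPLE_EXPORT).items():
        print(year, n)
```

skipping rows without a year, rather than guessing one, keeps the yearly totals conservative; whether such rows should instead be checked by hand is a choice left to the analyst.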
in this research, we sifted through the collected material and used the relevant facts to make arguments on the topic of e-groceries analysis. we retrieved the data from google scholar by entering the keyword "e-groceries analysis" in the title, keyword, and abstract fields of the publish or perish software. we obtained 993 articles on e-groceries analysis research published between 2017 and 2021. the collected articles were saved in *.ris format to be visualized in vosviewer as bibliometric maps and to analyze the research trend. before creating the map, irrelevant terms were filtered out (allan, r. n., et al., 1984). the visualization map is of three types: network visualization, overlay visualization, and density visualization.

3. results and discussion

3.1. research developments in the field of e-groceries analysis

fig. 1 describes the development of research in the field of e-groceries analysis from 2018 to 2021. it shows that research on e-groceries analysis increased every year: there were 25 articles in 2018, 32 in 2019, 49 in 2020, and 98 in 2021. based on the search results in publish or perish, 263 articles match the research topic. the 16 articles with the most citations, from 16 different publishers, are shown in table 1.

fig. 1. level of research development on e-groceries analysis

table 1. article data in the field of e-groceries analysis

no | authors | title | publisher | year | cites | refs
1 | oa hjelkrem, et al. | e-groceries: sustainable last mile distribution in city planning | wiley online library | 2021 | 255 | (bjørgen, a., et al., 2021)
2 | c fikar | a decision support system to investigate food losses in e-grocery deliveries | westminsterresearch.westminster.ac | 2018 | 63 | (fikar, c. 2018)
3 | by ekren, et al. | lateral inventory share-based models for iot-enabled e-commerce sustainable food supply networks | university of jaffna | 2021 | 57 | (ekren, b. y., et al., 2021)
4 | c thommis | logistieke uitdagingen in e-groceries | uis.brage.unit.no | 2021 | 52 | (thommis, c. 2021)
5 | m fernandez vazquez-noguerol | modeling and optimization of the supply chain in e-groceries | uir.unisa.ac.za | 2021 | 46 | (fernandez vazquez-noguerol, m. 2021)
6 | m mees | e-groceries: the effects of simulated sensory information and freshness guarantee information on consumer uncertainty | uijrt.com | 2019 | 44 | (mees, m. 2019)
7 | c berggren & s wikström | barriers online: exploring consumers' resistance to e-groceries | ubibliorum.ubi.pt | 2018 | 43 | (berggren, c., & wikström, s. 2018)
8 | ai pujol | digital nudging to enhance sustainable purchasing behaviours in e-groceries | turcomat.org | 2020 | 42 | (pujol, a. i. 2020)
9 | mfv noguerol | modeling and optimization of the supply chain in e-groceries | thesis.cust.edu.pk | 2021 | 40 | (noguerol, m. f. v. 2021)
10 | j meijboom | waste reduction in e-groceries fulfilment center: a case study at picnic | theseus.fi | 2019 | 38 | (meijboom, j. 2019)
11 | p gunawardana & pin fernando | does customer trust mediate the impact of e-service quality dimensions? lessons during covid-19 pandemic | tesi.luiss.it | 2021 | 30 | (gunawardana, p. k. a. t. d. r., & fernando, i. 2021)
12 | p gunawardana & pin fernando | does customer trust impact on e-service quality dimensions during covid-19 pandemic? | iopscience.iop.org | 2021 | 29 | (gunawardana, p. k. a. t. d. r., & fernando, i. 2021)
13 | p gunawardana & pin fernando | assessing the mediation role of the customer trust on e-service quality: lessons during covid-19 pandemic | cambridge university press | 2021 | 28 | (gunawardana, p. k. a. t. d. r., & fernando, i. 2021)
14 | y kusnadi & g pan | developing online business strategy with millennials through partnership with university | snejournal.org | 2020 | 15 | (kusnadi, y., & pan, g. 2020)
15 | vc ehrler, et al. | challenges and perspectives for the use of electric vehicles for last mile logistics of grocery e-commerce – findings from case studies in germany | sljmuok.sljol.info | 2021 | 15 | (ehrler, v. c., et al., 2021)
16 | m waitz, et al. | a decision support system for efficient last-mile distribution of fresh fruits and vegetables as part of e-grocery operations | search.proquest.com | 2018 | 13 | (waitz, m., mild, a., & fikar, c. 2018)

table 1 lists the 16 articles that match the research criteria. among them, the highest citation count is 255 and the lowest is 13.
table 1 also shows that the most-cited articles were published in 2018 and 2021. over 2018-2021, the highest citation count is 255, for an article published in 2021. in 2018, the highest citation count is 63.

3.2. visualization of the e-groceries analysis topic area using vosviewer

the visualization map of the e-groceries analysis topic was created using vosviewer software. according to al husaeni and nandiyanto, the minimum number of relationships between terms is set to two when creating a map using vosviewer software (peters, c. i. 1975). fig. 2 is the network visualization map generated by vosviewer from the terms present in the collected data. the map has 10 items (terms) grouped into 3 clusters, with 18 links and a total link strength of 166. cluster 1 is shown in red, cluster 2 in green, and cluster 3 in dark blue. items are assigned to clusters according to the strength of their connections with each other; further detail on each cluster is shown in figs. 3-7. the items in each cluster are as follows:

(i) cluster 1 (4 items): customer, e-grocery, home delivery, supply chain
(ii) cluster 2 (3 items): feature, main content skip, skip
(iii) cluster 3 (3 items): covid, pandemic, role

fig. 2. network visualization map of e-groceries analysis

fig. 3. cluster 1 visualization e-groceries analysis network.
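the map statistics reported above (items, links, total link strength) can be illustrated with a small co-occurrence count. the sketch below assumes that a link joins two terms that appear together in at least one document and that its strength is the number of such documents. the toy term sets and the function name `cooccurrence_links` are hypothetical, and vosviewer's own term extraction and counting options are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_links(documents):
    """Return {(term_a, term_b): strength}, counting each document once per pair."""
    links = Counter()
    for terms in documents:
        # sort so that each unordered pair always maps to the same key
        for pair in combinations(sorted(set(terms)), 2):
            links[pair] += 1
    return links


# hypothetical term sets, one per article (not the study's data)
DOCS = [
    {"e-grocery", "customer", "home delivery"},
    {"e-grocery", "covid", "pandemic"},
    {"e-grocery", "customer"},
]

links = cooccurrence_links(DOCS)
n_links = len(links)                      # number of distinct term pairs
total_link_strength = sum(links.values())  # sum of all pair strengths
print(n_links, total_link_strength)
```

for this toy input the count gives 6 links with a total link strength of 7, since only the pair 'customer' and 'e-grocery' occurs in two documents.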
the main node in cluster 1 is the term 'e-groceries'. this node is linked to the other nodes in cluster 1, namely 'supply chain', 'customer', and 'home delivery'. it is also linked to nodes in the other clusters. the linked nodes are:
• 'supply chain', 'customer', and 'home delivery' in cluster 1
• 'feature', 'main content skip', and 'skip' in cluster 2
• 'covid', 'pandemic', and 'role' in cluster 3

the main node in cluster 2 is the e-groceries term 'feature'. this node is linked to the other nodes in cluster 2, namely 'skip' and 'main content skip'. it is also linked to nodes in the other clusters. the linked nodes are:
• 'supply chain', 'customer', and 'home delivery' in cluster 1
• 'feature', 'main content skip', and 'skip' in cluster 2
• 'covid', 'pandemic', and 'role' in cluster 3

fig. 4. cluster 2 visualization e-groceries analysis network.

fig. 5. cluster 3 visualization e-groceries analysis network.

3.3. overlay visualization map of e-groceries analysis

the overlay visualization map shows the research trend of keywords by year. different colors indicate the years in which terms are commonly used: darker colors indicate that a keyword commonly appears in older years, while brighter colors indicate that it commonly appears in recent years. in fig. 6, the majority of keywords seem to have been popular in older years. however, there are recently emerging keywords in the collected data, such as 'covid', 'e-grocery', 'customer', and 'feature'.
these keywords can be linked to recent situations such as the covid-19 pandemic, the effort to minimize carbon footprints, and green energy development in the name of saving the environment.

3.4. density visualization of e-groceries analysis

density visualization shows how frequently terms occur in the collected data. color intensity and size are the primary indicators of density: an item with a large, bright coloration appears frequently in the collected data, and vice versa. the density visualization is shown in fig. 7. the density map is based on all the collected articles on e-groceries analysis from 2018-2022. fig. 7 shows several terms in yellow with a fairly large diameter: evidence, e-grocery, covid, customer, and feature.

fig. 6. overlay e-groceries analysis visualization

fig. 7. density visualization map of e-groceries analysis

4. conclusion

this study concludes that many topics remain poorly explored in the field of e-groceries analysis, for example "e-grocery" in cluster 1, "feature" in cluster 2, and "covid" in cluster 3. it is hoped that this research will contribute to identifying the fields to be studied in the topic of e-groceries analysis.

references

al husaeni, d. f., & nandiyanto, a. b. d. (2022). bibliometric using vosviewer with publish or perish (using google scholar data): from step-by-step processing for users to the practical examples in the analysis of digital learning articles in pre
and post covid-19 pandemic. asean journal of science and engineering, 2(1), 19-46.

al husaeni, d. f., & nandiyanto, a. b. d. (2022). mapping visualization analysis of computer science research data in 2017-2021 on the google scholar database with vosviewer. international journal of informatics, information system and computer engineering (injiiscom), 3(1), 1-18.

al husaeni, d. f., nandiyanto, a. b. d., & maryanti, r. (2023). bibliometric analysis of educational research in 2017 to 2021 using vosviewer: google scholar indexed research. indonesian journal of teaching in science, 3(1), 1-8.

al husaeni, d. f., nandiyanto, a. b. d., & maryanti, r. (2023). bibliometric analysis of educational research in 2017 to 2021 using vosviewer: google scholar indexed research. indonesian journal of teaching in science, 3(1), 1-10.

al husaeni, d. n., & nandiyanto, a. b. d. (2023). bibliometric analysis of high school keyword using vosviewer indexed by google scholar. indonesian journal of educational research and technology, 3(1), 1-12.

allan, r. n., billinton, r., & lee, s. h. (1984). bibliography of the application of probability methods in power system reliability evaluation 1977-1982. ieee power engineering review, (2), 24-25.

berggren, c., & wikström, s. (2018).
barriers online: exploring consumers' resistance to e-groceries.

bjørgen, a., bjerkan, k. y., & hjelkrem, o. a. (2021). e-groceries: sustainable last mile distribution in city planning. research in transportation economics, 87, 100805.

couclelis, h. (2000). from sustainable transportation to sustainable accessibility: can we avoid a new tragedy of the commons?. information, place, and cyberspace: issues in accessibility, 341-356.

ehrler, v. c., schöder, d., & seidel, s. (2021). challenges and perspectives for the use of electric vehicles for last mile logistics of grocery e-commerce – findings from case studies in germany. research in transportation economics, 87, 100757.

ekren, b. y., mangla, s. k., turhanlar, e. e., kazancoglu, y., & li, g. (2021). lateral inventory share-based models for iot-enabled e-commerce sustainable food supply networks. computers & operations research, 130, 105237.

fernandez vazquez-noguerol, m. (2021). modeling and optimization of the supply chain in e-groceries (doctoral dissertation, organización de empresas e márketing).

fikar, c. (2018). a decision support system to investigate food losses in e-grocery deliveries. computers & industrial engineering, 117, 282-290.

gunawardana, p. k. a. t. d. r., & fernando, i. (2021). does customer trust mediate the impact of e-service quality dimensions? lessons during covid-19 pandemic (preprint).

gunawardana, p. k. a. t. d. r., & fernando, i. (2021). does customer trust mediate the impact of e-service quality dimensions? lessons during covid-19 pandemic.

gunawardana, p. k. a. t. d. r., & fernando, p. i. n. (2021). assessing the mediation role of the customer trust on e-service quality: lessons during covid-19 pandemic. sri lanka journal of marketing, 7(3), 105.
hamidah, i., sriyono, s., & hudha, m. n. (2020). a bibliometric analysis of covid-19 research using vosviewer. indonesian journal of science and technology, 34-41.

kusnadi, y., & pan, g. (2020). developing online business strategy with millennials through partnership with university.

mees, m. (2019). e-groceries: the effects of simulated sensory information and freshness guarantee information on consumer uncertainty.

meijboom, j. (2019). waste reduction in e-groceries fulfilment center: a case study at picnic.

mokhtarian, p. l. (2004). a conceptual analysis of the transportation impacts of b2c e-commerce. transportation, 31, 257-284.

muhammad, n. s., sujak, h., & abd rahman, s. (2016). buying groceries online: the influences of electronic service quality (eservqual) and situational factors. procedia economics and finance, 37, 379-385.

mulyawati, i. b., & ramadhan, d. f. (2021). bibliometric and visualized analysis of scientific publications on geotechnics fields. asean journal of science and engineering education, 1(1), 37-46.

nandiyanto, a. b. d., al husaeni, d. n., & al husaeni, d. f. (2021). a bibliometric analysis of chemical engineering research using vosviewer and its correlation with covid-19 pandemic condition. journal of engineering science and technology, 16(6), 4414-4422.

nandiyanto, a. b. d., girsang, g. c. s., maryanti, r., ragadhita, r., anggraeni, s., fauzi, f. m., ... & al-obaidi, a. s. m. (2020). isotherm adsorption characteristics of carbon microparticles prepared from pineapple peel waste. communications in science and technology, 5(1), 31-39.

noguerol, m. f. v. (2021). modeling and optimization of the supply chain in e-groceries (doctoral dissertation, universidade de vigo).

peters, c. i. (1975). method of antenna tuning.
department of the navy washington dc.

pico, y., & barcelo, d. (2020). pyrolysis gas chromatography-mass spectrometry in environmental analysis: focus on organic matter and microplastics. trac trends in analytical chemistry, 130, 115964.

pujol, a. i. (2020). digital nudging to enhance sustainable purchasing behaviours in e-groceries.

ragadhita, r., & nandiyanto, a. b. d. (2022). computational bibliometric analysis on publication of techno-economic education. indonesian journal of multidiciplinary research, 2(1), 213-220.

soegoto, h., soegoto, e. s., luckyardi, s., & rafdhi, a. a. (2022). a bibliometric analysis of management bioenergy research using vosviewer application. indonesian journal of science and technology, 7(1), 89-104.

suel, e., & polak, j. w. (2017). development of joint models for channel, store, and travel mode choice: grocery shopping in london. transportation research part a: policy and practice, 99, 147-162.

thommis, c. (2021). logistieke uitdagingen in e-groceries.

waitz, m., mild, a., & fikar, c. (2018). a decision support system for efficient last-mile distribution of fresh fruits and vegetables as part of e-grocery operations.

241 | international journal of informatics information system and computer engineering 4(1) (2023) 96
doi: https://doi.org/10.34010/injiiscom.v3i2.9563
p-issn 2810-0670 e-issn 2775-5584

intention to adopt cloud-based e-learning in nigerian educational institutions

tom a. m 1, virgiyanti, w 2*
1 school of computing, universiti utara malaysia, malaysia
2 faculty of ocean engineering technology and informatics, universiti malaysia terengganu, malaysia
*corresponding email: wiwied.virgiyanti@umt.edu.my

abstract

institutions of higher education must utilize innovative information and communication technologies for teaching in nigeria.
thus, cloud-based e-learning is essential to curtail educational challenges such as limited infrastructure, limited funds, and the student-to-lecturer ratio. although there has recently been widespread enthusiasm for cloud computing in e-learning, adopting and strategically utilizing these technologies remains a significant challenge for higher education institutions. furthermore, there is limited understanding of how cloud-based e-learning can transform nigerian educational establishments. the technological components of cloud-based e-learning systems have been the subject of numerous studies, but little is known about how they operate from an organizational perspective. accordingly, using the technology-organization-environment theory, this study investigates the variables that influence the adoption of cloud-based e-learning. the findings show that relative advantage and competitive pressure have a significant impact on whether cloud-based e-learning is adopted, whereas compatibility, security, and top management commitment do not appear to be significant determinants. these findings will help nigerian education institutions, the ministry of education, and practitioners to understand the critical factors for adopting this technology for improved education.

article history:
submitted/received 31 jul 2022
first revised 26 aug 2022
accepted 30 sept 2022
available online 29 oct 2022
publication date 01 dec 2022

keywords: intention to adopt, cloud-based, e-learning, higher education institution

international journal of informatics, information system and computer engineering 3(2) (2022) 241-250

1.
introduction

in recent years, the advancement of technology has brought about innovations such as cloud computing, big data, ai, and blockchain (tom, virgiyanti, & rozaini, 2019). these innovations have brought more opportunities and challenges to businesses. e-learning is a critical driver for emancipation from poverty (world bank [wb], 2013). previous research has shown that the e-learning systems of emerging nations are experiencing a similar and severe crisis, which is primarily ascribed to the lack of clear education policies, infrastructure deficiencies, and inadequate investment in the education sector. this situation poses a constant threat to learning and to the continent's development as a whole. inadequate allocation of financial resources has been identified as nigeria's most significant challenge in education (asiyai, 2013; edomwonyi & osarumwense, 2017; virgiyanti & rozaini, 2019).

many universities in developing countries currently operate unreliable e-learning systems.
as a solution, cloud computing, with its location independence, can give staff and students in nigerian universities access to highly dependable and efficient systems similar to those found in developed countries. this would ultimately enhance the competitiveness of nigerian higher education institutions (heis). to examine the use of cloud-based e-learning from the managerial viewpoint of heis, more empirical data is needed. this research investigates the opinions of administrators in nigerian heis using the elements of technology, organization, and environment as well as external variables. the results will help to understand the critical elements influencing the uptake and use of cloud-based e-learning in nigeria.

2. theoretical and conceptual model

this research uses the technology, organization, and environment (toe) theory developed by tornatzky and fleischer (tornatzky & fleischer, 1990). the toe is an organizational theory that explains the likelihood that a company or group will embrace innovation. it proposes that these three contexts affect how organizations perceive the need for innovation and adopt it in order to stay competitive, and it includes both constraints that act as roadblocks and opportunities that serve as incentives for innovation (baker, 2012; virgiyanti & rozaini, 2019).

relative advantage (ra) and compatibility (com) are the technological factors considered when implementing cloud computing, including cloud-based e-learning platforms. information and communication technology (ict) is used by universities all over the globe to create efficient and effective learning environments for both staff and students.
as a result, when implementing an innovation, implementers must weigh the advantages and disadvantages of the technology. relative advantage (ra) is the benefit an organization expects to gain from a technology. hence, institutions considering an innovation compare its advantages with those of their existing systems. cloud computing has the edge over traditional server-based systems due to its flexibility, mobility, and scalability. adopting the cloud in academia opens up numerous avenues such as collaboration, discussion, availability of resources, and cost savings due to its pay-per-use model (tom, virgiyanti, & osman, 2019). we, therefore, posit that:
2.1. h1: relative advantage will positively influence the adoption of cloud-based e-learning in nigerian heis.
due to a lack of resources, developing nations, especially those in africa, face difficulties in enhancing their educational facilities and need assistance to fix their outdated, ineffective systems. compatibility is the degree to which an innovation is viewed as consistent with the users' existing values, norms, and experiences (rogers, 1995). as in developed countries, cloud computing supports a variety of applications and programming languages that can be easily incorporated into nigerian e-learning systems, giving users an edge in terms of freedom and productivity. hence, we posit that:
2.2. h2: compatibility will positively influence the adoption of cloud-based e-learning in nigerian heis.
for organizations considering cloud computing, security is a top concern because it could present challenges. since data ownership is still a contentious topic in the context of cloud computing, the confidentiality, integrity, and availability of an organization's data are extremely important.
2.3. h3: security will positively influence the adoption of cloud-based e-learning in nigerian heis.
security concerns are a major factor deterring the adoption of cloud computing, because security plays a crucial role in protecting an organization's information and data. this study emphasizes the significance of the organizational perspective in innovation adoption, particularly the importance of top management commitment (tmc). tornatzky also recognized tmc as a crucial factor in the innovation adoption process (tornatzky, 1990). hence, the involvement of top managers increases the chances of technology adoption. this is especially true for developing countries like nigeria, which have limited resources. using those limited resources strategically and adopting cloud-based e-learning will considerably improve the over-stretched learning systems. therefore, we posit that:
2.4. h4: top management commitment will positively influence the adoption of cloud-based e-learning in nigerian heis.
the environmental perspective also has an impact on how innovation is adopted in organizations. cloud computing can offer a competitive edge to nigerian heis, but external factors like government regulations, policies, and peer pressure from global competitors can either support or hinder the adoption of new technology for learning systems. competitive pressure (cp) refers to the level of competition that an organization faces from other institutions within academia. competition can drive heis to innovate and enhance the quality of their systems. therefore, the adoption of cloud-based e-learning can help nigerian heis to restructure and create new opportunities for higher education.
2.5. h5: competitive pressure will positively influence the adoption of cloud-based e-learning in nigerian heis.
this research also uses the toe and doi theories to predict the intention to adopt cloud-based e-learning in nigeria. the study uses survey questions to measure the critical toe-doi factors, which are shown in fig. 1. a seven-point likert scale was used to capture every answer.
fig. 1. the theoretical framework and hypothesis of the study
3. research methodology
a deductive research approach, which includes the formulation of quantifiable research questions, was used in this work. the unit of analysis is the senior managerial staff of the ict directorates. a simple random sampling method was used to choose the group of universities. the survey items were adapted from earlier studies and were divided into two parts: the first collected the respondents' personal data, and the second addressed adoption concerns for cloud-based e-learning. responses were gathered using a 7-point likert scale, and the survey underwent a content validity assessment. a total of 248 responses were obtained from the 454 questionnaires distributed to the target respondents.
4. measurement model
evaluating the measurement model is essential for assessing several properties of the data, including content validity, internal consistency reliability, convergent validity, and discriminant validity. these assessments help ensure that the data gathered are precise and trustworthy and that the survey questions measure the intended constructs (henseler et al., 2009; hair et al., 2011; hair et al., 2014).
4.1. individual item reliability and internal consistency
internal consistency is a measure of how closely the various items of a scale reflect the same underlying construct
(bijttebier et al., 2000; sun et al., 2007). in most cases, the cronbach's alpha statistic is used to assess this. composite reliability coefficients are also frequently used to estimate an instrument's internal consistency reliability (bacon et al., 1995; mccrae et al., 2011; peterson & kim, 2013). evaluating internal consistency is crucial to ensure the accuracy and reliability of a measurement instrument.
table 1. construct reliability and validity
factors | cronbach's alpha | rho_a | composite reliability | ave
com | 0.843 | 0.858 | 0.883 | 0.557
cp | 0.899 | 0.913 | 0.92 | 0.624
int | 0.791 | 0.801 | 0.877 | 0.703
ra | 0.816 | 0.843 | 0.86 | 0.508
sec | 0.828 | 0.859 | 0.883 | 0.655
tmc | 0.886 | 0.887 | 0.914 | 0.639
4.2. discriminant validity
discriminant validity is the degree to which conceptually similar constructs differ from one another (hair et al., 2010). it is assessed using the average variance extracted (ave) criterion proposed by fornell and larcker (fornell & larcker, 1981): the variance a construct shares with any other construct should be less than its ave, so the square root of each ave (the diagonal of table 2) should exceed that construct's correlations with the other constructs. when this holds, the constructs are considered distinct from one another. the research data satisfy the requirement for discriminant validity, as shown in table 2. the htmt (heterotrait-monotrait ratio of correlations) is an additional method used to assess the discriminant validity of the data. table 3 shows the results of this analysis, indicating that the htmt is also not a concern, so no items needed to be deleted in this phase.
table 2. fornell-larcker criterion
variables | com | cp | int | ra | sec | tmc
com | .746
cp | .576 | .79
int | .422 | .564 | .839
ra | .399 | .344 | .44 | .712
sec | .466 | .435 | .402 | .462 | .809
tmc | .569 | .599 | .438 | .398 | .623 | .799
table 3. heterotrait-monotrait ratio (htmt)
factors | com | cp | int | ra | sec | tmc
com
cp | 0.661
int | 0.475 | 0.633
ra | 0.425 | 0.365 | 0.501
sec | 0.53 | 0.48 | 0.471 | 0.515
tmc | 0.653 | 0.659 | 0.499 | 0.417 | 0.704
4.3. assessment of structural model
the structural model in this research was assessed using pls-sem (partial least squares structural equation modeling), as suggested by the conceptual model, which contained five hypotheses (see fig. 2) (hair et al., 2017). following the recommendations of hair et al. (2010, 2014, 2017), the significance of the path coefficients was assessed using the standard bootstrapping technique with 5,000 bootstrap samples. the normality of the data was estimated using the bootstrap findings. the structural model is shown in table 4, with government support as the moderator variable. according to the study's findings, there is a significant relationship between competitive pressure (cp) and the intention of nigerian heis to adopt cloud-based e-learning. relative advantage (ra) was also found to be significantly related to the intention to implement a cloud-based e-learning system in nigeria. compatibility, security, and top management commitment, however, did not significantly influence whether cloud-based e-learning was adopted in heis. these results shed light on the intentions of nigerian heis to implement cloud-based e-learning technology. table 5 presents the findings.
table 4. structural model with government support as the moderator variable
factors | sample mean (m) | t statistics (|o/stdev|) | p values
com -> int | 0.048 | 0.564 | 0.286
cp -> int | 0.398 | 5.157 | 0.000
ra -> int | 0.243 | 3.833 | 0.000
sec -> int | 0.085 | 1.012 | 0.156
tmc -> int | 0.029 | 0.306 | 0.380
fig. 2. the structural model
table 5. bootstrapping findings
hypothesis | statement | result
h1 | there will be a positive relationship between relative advantage (ra) and intention to adopt cloud-based e-learning. | supported
h2 | there will be a positive relationship between compatibility (com) and intention to adopt cloud-based e-learning. | not supported
h3 | there will be a positive relationship between security (sec) and intention to adopt cloud-based e-learning. | not supported
h4 | there will be a positive relationship between top management commitment (tmc) and intention to adopt cloud-based e-learning. | not supported
h5 | there will be a positive relationship between competitive pressure (cp) and intention to adopt cloud-based e-learning. | supported
5. discussion
this research set out to determine how willing the management of nigerian heis was to implement cloud-based e-learning within their organizations. to accomplish this objective, the research proposed an adoption model based on the technology-organization-environment (toe) theory and other pertinent environmental factors. studies on cloud computing use in heis around the globe are numerous, and the technology has many benefits over conventional e-learning platforms. therefore, incorporating cloud computing into nigerian heis is crucial. the results of the research show that the toe elements are important for understanding and influencing the intention to adopt cloud-based e-learning. relative advantage, a technological factor, had a significant effect on the intention to adopt cloud-based e-learning, according to the structural equation modeling (sem) analysis, with β-value = 0.062, t-value = 3.833, and p-value = 0.000. top management views the impact of this technology as vital, in line with rogers' diffusion of innovation (doi) theory
(rogers, 2003). according to them, adopting cloud-based e-learning will enhance the quality of processes and staff performance in their institutions, which is consistent with the results of tashkandi and al-jabri (tashkandi & al-jabri, 2015). on the other hand, with a p-value of 0.286 and a β-value of 0.072, compatibility is not a significant factor. this suggests that heis in nigeria are still in the early phases of adopting cloud-based e-learning and have not yet concentrated on interoperability with their existing operational systems (rogers, 2003). others, however, contend that compatibility is a crucial aspect of technology use in heis (hiran & henten, 2020). the study's findings also showed that top management commitment (tmc), as an organizational factor, had no significant bearing on the uptake of cloud-based e-learning in nigerian heis. this result highlights the difficulties and resistance that heis encounter when implementing innovation and change, and it implies that senior administrators should be more knowledgeable about cloud technology. the security factor was also found to be statistically insignificant. security nevertheless remains a major concern for many heis, and managers are still cautious about cloud computing because of issues such as data ownership, location, privacy, confidentiality, and data availability. data ownership, in particular, remains a major challenge for cloud computing, as the laws of the country where the data centres are located may require providers to make the data available to the authorities.
6. conclusion
the purpose of this research was to investigate how university senior management in northern nigeria perceived the use of cloud-based e-learning and the variables that might affect the intention to adopt it. based on the work of tornatzky and fleischer, the technology-organization-environment theory was applied and extended to include additional pertinent factors for the research (tornatzky & fleischer, 1990).
the pls-sem method was used to evaluate the respondents' data, and the results showed that relative advantage had a significant impact on management's intention to implement cloud-based e-learning in nigerian heis. the intention to adopt cloud-based e-learning was also found to be significantly affected by competitive pressure. however, factors usually regarded as crucial, namely compatibility, security, and top management commitment, were not significant in nigeria. to enhance their competitiveness with global peers, nigerian heis must gain an understanding of cloud computing and make informed decisions about adopting cloud-based e-learning. financial support is crucial for implementing this technology, which is widely used by developed nations to provide efficient and effective learning experiences. although more research is required, this survey provides useful information about the intention to implement cloud-based e-learning in nigerian heis. due to financial and organizational limitations, this research covered only a small number of universities in one area of nigeria. future studies could use a larger sample size and include responses from staff and other managerial levels. they could also use mixed-methods or qualitative techniques, such as online surveys and interviews, to gain a more thorough understanding of participants' views and of the crucial factors influencing the uptake of cloud-based e-learning in nigerian heis.
references
asiyai, r. i. (2013). challenges of quality in higher education in nigeria in the 21st century. international journal of educational planning & administration, 3(2), 159–172.
bacon, d. r., sauer, p. l., & young, m. (1995). composite reliability in structural equations modeling. educational and psychological measurement, 55(3), 394–406. https://doi.org/10.1177/0013164495055003003
bijttebier, p., delva, d., vanoost, s., bobbaers, h., lauwers, p., & vertommen, h. (2000). reliability and validity of the critical care family needs inventory in a dutch-speaking belgian sample. heart & lung, 29(4), 278–286. https://doi.org/10.1067/mhl.2000.107918
edomwonyi, j., & osarumwense, r. (2017). business education in nigeria: issues, challenges and way forward for national development. journal of collaborative research and development (jcrd), 5(1), 1–25.
fornell, c., & larcker, d. f. (1981). evaluating structural equation models with unobservable variables and measurement error. journal of marketing research, 18(1), 39. https://doi.org/10.2307/3151312
hair, j. f., black, w. c., babin, b. j., anderson, r. e., & tatham, r. l. (2014). multivariate data analysis (7th ed.). pearson education limited. www.pearsoned.co.uk
hair, j. f. j., black, w. c., babin, b. j., & anderson, r. e. (2010). multivariate data analysis (7th ed.). pearson. https://www.pearson.com/us/highereducation/program/hair-multivariate-data-analysis-7thedition/pgm263675.html
hair, joe f., ringle, c. m., & sarstedt, m. (2011). pls-sem: indeed a silver bullet. the journal of marketing theory and practice, 19(2), 139–152. https://doi.org/10.2753/mtp1069-6679190202
hair, joseph f., hult, jr., g. t. m., ringle, c., & sarstedt, m. (2014). a primer on partial least squares structural equation modeling (pls-sem). sage publications.
hair, joseph f., hult, g. t. m., ringle, c., & sarstedt, m. (2017). a primer on partial least squares structural equation modeling (pls-sem) (2nd ed.). https://uk.sagepub.com/en-gb/asi/a-primer-on-partial-least-squaresstructural-equation-modeling-pls-sem/book244583
henseler, j., ringle, c. m., & sinkovics, r. r. (2009). the use of partial least squares path modeling in international marketing. emerald group publishing limited, 20, 277–319. https://doi.org/10.1108/s1474-7979(2009)0000020014
hiran, k. k., & henten, a. (2020). an integrated toe-doi framework for cloud computing adoption in higher education: the case of sub-saharan africa, ethiopia. in soft computing: theories and applications (pp. 1281–1290). springer.
mccrae, r. r., kurtz, j. e., yamagata, s., & terracciano, a. (2011). internal consistency, retest reliability, and their implications for personality scale validity. personality and social psychology review, 15(1), 28–50. https://doi.org/10.1177/1088868310366253
peterson, r. a., & kim, y. (2013). on the relationship between coefficient alpha and composite reliability. journal of applied psychology, 98(1), 194–198. https://doi.org/10.1037/a0030767
rogers, e. m. (2003). diffusion of innovations (5th ed., vol. 1, issue 38).
sun, w., chou, c.-p., stacy, a. w., ma, h., unger, j., & gallaher, p. (2007). sas and spss macros to calculate standardized cronbach’s alpha using the upper bound of the phi coefficient for dichotomous items. behavior research methods, 39(1), 71–81. https://doi.org/10.3758/bf03192845
tashkandi, a. n., & al-jabri, i. m. (2015). cloud computing adoption by higher education institutions in saudi arabia: an exploratory study. cluster computing, 18(4), 1527–1537. https://doi.org/10.1007/s10586-015-0490-4
tom, a. m., virgiyanti, w., & osman, w. r. s. (2019). the impact of government support on the adoption of iaasbel by university’s top management. 2019 international conference on data and software engineering (icodse), 1–6.
tom, a.
m., virgiyanti, w., & rozaini, w. (2019). understanding the determinants of infrastructure-as-a service-based e-learning adoption using an integrated toe-doi model: a nigerian perspective. 2019 6th international conference on research and innovation in information systems (icriis), 1–6. https://doi.org/10.1109/icriis48246.2019.9073418
tornatzky, l. g., & fleischer, m. (1990). the processes of technological innovation. the journal of technology transfer, 16(1), 45–46. https://doi.org/10.1007/bf02371446
world bank. (2013). smarter education systems for brighter futures.

international journal of informatics information system and computer engineering 4(1) (2023) 49-60
doi: https://doi.org/10.34010/injiiscom.v4i1.9571 p-issn 2810-0670 e-issn 2775-5584
a computational bibliometric analysis of game advertising using vosviewer
yogie rinaldy ginting
department of mechanical engineering, school of mechanical and civil engineering, curtin university, australia
*corresponding email: y.ginting@curtin.edu.au
abstract
the purpose of this study is to perform a computational bibliometric analysis of the term “game advertising” by combining mapping analysis using vosviewer, publish or perish software, and google scholar. the method used is a bibliometric and descriptive quantitative approach. the data obtained are search results based on the keyword "game advertising" on google scholar. the search results show 989 articles published from 2017 to 2022, with the number decreasing every year except in 2021: 232 articles in 2017, 210 in 2018, 199 in 2019, and 136 in 2020, followed by a slight increase to 137 in 2021 and a sharp drop to 40 in 2022. the conclusion of this study shows the importance of conducting bibliometric analysis, especially in the field of advertising games. this research is expected to be a reference for further research in determining and analyzing the research theme.
article history: submitted/received 03 nov 2022, first revised 20 feb 2023, accepted 25 mar 2023, first available online 12 apr 2023, publication date 01 jun 2023
keywords: bibliometrics, game advertising, data analysis, vosviewer
1. introduction
advertising games are attitudes related to a person's financial problems, where responses to a statement or opinion can be used as a useful measurement variable to define advertising games as a state of mind, opinions and judgments about finances. a game consists of a set of rules that build competing situations between two or more people or groups, who choose strategies built to maximize their own winnings or to minimize their opponent's winnings. the rules specify the possible actions for each player, the information received by each player as play progresses, and the wins or losses in various situations (von neumann & morgenstern, 2007). the west kutai regency government aims to enhance the effectiveness and efficiency of public services by implementing e-government internally within government organizations.
one aspect of e-government that must be prioritized is government to citizen or government to customer (g2c) services, where the government provides public service information through information technology. online services can significantly reduce administrative, relational, and interaction costs compared to manual services, benefiting both the government and its stakeholders. additionally, there are technologies that can simplify administrative processes and reduce bureaucracy, thereby creating a positive business environment. however, based on direct observations including interviews and discussions with kesbangpol of west kutai regency, it is evident that the implementation of e-government is still in the preparation stage and has not yet reached the emerging level, the lowest level of adoption.
promotion is anything that is done to help sell a product or service at each point of the sales network, from the presentation materials a salesperson uses when making offers to commercial broadcasts, television, or newspaper advertisements that try to give customers a favorable impression of what is advertised (goutama, ae., 2018). advertising games (adv-games) are video games that include promotional content as a marketing technique, either overtly or implicitly. in an adv-game, the brand is integrated into the story, mission, and other game activities. this integration can be observed in a variety of game components, including characters with specific brands, gameplay that demonstrates the characteristics of specific products, advertising banners in a game segment, and other elements (aulia, et al, 2014). adv-games have been around for a long time, yet they continue to evolve and adapt. traditional games like monopoly, snakes and ladders, and chess may all be turned into adv-games by adding a hint of brand awareness.
following technical advancements, adv-games have moved into digital media, and various adv-game applications are now available on smartphones (terlutter and capella, 2013). advertising games (adv-games) can be an opportunity for business people to improve their marketing tools. to determine how well particular video games, including advertising games, can improve marketing, research using bibliometrics should also be carried out. this is because bibliometrics can be used to categorize certain topics in bibliographical form as well as to generate representative summaries of selected topics. vosviewer is a software tool that facilitates bibliometric analysis, offering the capability to generate, visualize, and analyze bibliometric maps. with vosviewer, different types of bibliometric network data can be evaluated, such as connections between journal publications or citations, associations between scientific terminology, and collaborative relationships among researchers. building upon previous research, this study aims to conduct bibliometric analysis on the topic of game advertising by utilizing vosviewer software for mapping analysis. the research methodology involves quantitative bibliometric analysis and a descriptive approach. data is collected from searches and processing using the keyword "game advertising" on platforms such as google scholar and publish or perish (kania and sabariah, 2013).
2. method
the research methodology employed in this study involves bibliometric, descriptive, and quantitative approaches. information was gathered from various published journals indexed by google scholar.
additionally, a literature review was conducted on the research topic "game advertising" using the publish or perish software, chosen specifically for identifying bibliometric data (arwendria, a., 2021). the data obtained from publish or perish were saved in *.ris format and imported into vosviewer software for analysis. publish or perish 8 and vosviewer 1.6.17 were the software tools used in this study. relevant materials related to game advertising were reviewed and selected, yielding a total of 989 research articles on game advertising published between 2017 and 2022. vosviewer software was then used to generate visualizations and analyze trends using bibliometric maps. the article data from the prepared database sources were mapped using three types of visualizations in vosviewer software: network visualization, overlay visualization, and density visualization. additionally, terms included in the vosviewer mapping visualization were filtered to refine the analysis (kurnia, s., 2021).
3. results and discussion
3.1. advancements in the field of game advertising research
fig. 1 illustrates the development of research in the field of game advertising from 2017 to 2022.
fig. 1. level of research development on game advertising
figure 1 shows that research on game advertising decreased every year from 2017 to 2022, except in 2021. there were 232 articles in 2017, 210 in 2018, 199 in 2019, and 136 in 2020. in 2021 there was a slight increase to 137 articles, and in 2022 the number of publications on game advertising dropped sharply to 40.
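the yearly trend described above can be checked with a few lines of python. this is only a minimal sketch, not part of the original analysis: the dictionary holds the article counts reported in the text, and the helper name `year_over_year` is a hypothetical choice made for illustration.

```python
# Yearly article counts for 2017-2022, copied from the text above.
counts = {2017: 232, 2018: 210, 2019: 199, 2020: 136, 2021: 137, 2022: 40}


def year_over_year(counts):
    """Return {year: (absolute change, percent change)} versus the previous year.

    Hypothetical helper written for this illustration only.
    """
    years = sorted(counts)
    changes = {}
    for prev, curr in zip(years, years[1:]):
        delta = counts[curr] - counts[prev]
        changes[curr] = (delta, round(100 * delta / counts[prev], 1))
    return changes


changes = year_over_year(counts)
for year, (delta, pct) in changes.items():
    print(f"{year}: {delta:+d} articles ({pct:+.1f}%)")

# Sum of the yearly counts listed in the text.
print("total across listed years:", sum(counts.values()))
```

the only positive change is the single additional article in 2021, which matches the description of a slight increase. the listed yearly counts add up to 954, whereas the search is reported to return 989 articles; the text does not say how the remaining articles are distributed.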
after conducting a search using the publish or perish software, we identified 989 articles that are relevant to the research topic. from this dataset, we further filtered and selected the 20 articles with the highest number of citations, sourced from 20 distinct journals and books (see table 1).
table 1. article data in the field of game advertising
no | authors | title | year | cites | refs
1 | herbert, et al. | marketing: grundlagen marktorientierter unternehmensführung konzepte – instrumente – praxisbeispiele | 2018 | 7173 | (herbert, et al, 2018)
2 | silver, et al. | mastering the game of go without human knowledge | 2017 | 7044 | (silver, et al, 2017)
3 | leiss, et al. | social communication in advertising | 2018 | 2195 | (leiss, et al, 2018)
4 | silver, et al. | a general reinforcement learning algorithm that masters chess, shogi, and go through self-play | 2018 | 2194 | (silver, et al, 2018)
5 | j koivisto & j hamari | the rise of motivational information systems: a review of gamification research | 2019 | 720 | (j koivisto & j hamari, 2019)
6 | k huotari & j hamari | a definition for gamification: anchoring gamification in the service marketing literature | 2017 | 685 | (k huotari & j hamari, 2017)
7 | evans, et al. | disclosing instagram influencer advertising: the effects of disclosure language on advertising recognition, attitudes, and behavioral intent | 2017 | 611 | (evans, et al, 2017)
8 | lee, et al. | advertising content and consumer engagement on social media: evidence from facebook | 2018 | 605 | (lee, et al, 2018)
9 | s heath | method for social networking interactions using online consumer browsing behavior, buying patterns, advertisements and affiliate advertising, for promotions, online | 2019 | 590 | (s heath, 2019)
10 | db nieborg & t poell | the platformization of cultural production: theorizing the contingent cultural commodities | 2018 | 503 | (db nieborg & t poell, 2018)
11 | wang, et al. | irgan: a minimax game for unifying generative and discriminative information retrieval models | 2017 | 460 | (wang, et al, 2017)
12 | steenkamp, et al. | competitive reactions to advertising and promotion attacks | 2018 | 367 | (steenkamp, et al, 2018)
13 | m meeker & l wu | internet trends 2018 | 2019 | 321 | (m meeker & l wu, 2019)
14 | j roozenbeek & s van der linden | fake news game confers psychological resistance against online misinformation | 2019 | 244 | (j roozenbeek & s van der linden, 2019)
15 | s de freitas | are games effective learning tools? a review of educational games | 2018 | 243 | (s de freitas, 2018)
16 | s dahl | social media marketing: theories and applications | 2021 | 213 | (s dahl, 2021)
17 | bt shapiro | positive spillovers and free riding in advertising of prescription pharmaceuticals: the case of antidepressants | 2018 | 188 | (bt shapiro, 2018)
18 | wang, et al. | display advertising with real-time bidding (rtb) and behavioral targeting | 2017 | 157 | (wang, et al, 2017)
19 | smith, et al. | food marketing influences children's attitudes, preferences and consumption: a systematic critical review | 2019 | 152 | (smith, et al, 2019)
table 1 lists 19 articles that match the research criteria. among these 19 articles, the highest citation count related to game advertising research is 7173 and the lowest is 152. table 1 also shows the most highly cited articles published between 2017 and 2022.
over 2017-2022, the most-cited article has 7173 citations. meanwhile, the article from 2021 has 213 citations. the year with the most-cited article is 2018, with 7173 citations.
3.2. visualization of the game advertising topic area using vosviewer
the visualization of the game advertising research area was carried out using vosviewer software, with a minimum of 3 relationships and 2 terms, as set by al husaeni and nandiyanto (utami and karlina, 2022). as a result, a total of 26 items were obtained and grouped into 4 clusters based on the analysis of the mapping visualization, namely:
(i) cluster 1 has 9 items: elements, engagement, game elements, gamers, gamified systems, market research, marketing communication, motivation, and tourism marketing (see figure 2).
(ii) cluster 2 has 8 items: application, case study, computer game, development, digital marketing, education, marketing strategy, and target market (see figure 3).
(iii) cluster 3 has 5 items, among them gamification, gamified application, interactivity, and service marketing perspective (see figure 4).
(iv) cluster 4 has 4 items: analysis, game advertising, marketing management, and research directions (see figure 5).
cluster 1 is shown in red, cluster 2 in green, cluster 3 in dark blue, and cluster 4 in yellow.
fig. 2. cluster 1 visualization of the game advertising network.
fig. 3.
cluster 2 visualization of the game advertising network.
fig. 4. cluster 3 visualization of the game advertising network.
fig. 5. cluster 4 visualization of the game advertising network.
3.3. network visualization of the game advertising topic area using vosviewer
in vosviewer, the mapping of terms is divided into three types. the first is network visualization, which shows the connections among the items on the map. relationships in network visualization are shown as lines that run from one item to another (see figure 6).
fig. 6. network visualization of game advertising.
the network visualization for "advertising game" obtained with vosviewer is displayed in fig. 6, in which each cluster of the investigated field is represented. the term game advertising itself belongs to cluster 4, with a total link strength of 40 and 43 occurrences, as indicated in figure 6. game advertising is connected to cluster 1 through the terms game elements, marketing communication, and gamers; to cluster 2 through the terms application, marketing strategy, and computer game; and to cluster 3 through the terms gamification and interactivity.
3.4. overlay visualization of the game advertising topic area using vosviewer
second, vosviewer provides mapping in overlay form. overlay visualization focuses on the novelty of a term in research.
the novelty of terms in research related to game advertising is shown in figure 7.
fig. 7. overlay visualization of game advertising.
overlay visualization shows how popular each term is from year to year. in this view, different colors indicate the period in which a term was studied. in this research, we use the years 2017 to 2022. darker colors, approaching purple, mean that research on a term was carried out closer to 2017. lighter colors, approaching yellow, mark terms that appear in the most recent research.
3.5. density visualization of game advertising
third, the last mapping type in vosviewer is density visualization. the density visualization of game advertising is shown in fig. 8. this mapping type uses colors to indicate the popularity of a term. if the color of a term is lighter, research on that term is becoming more popular. conversely, if the color is darker or faded, research on that term is decreasing in frequency. in figure 8, some terms are depicted in yellow with relatively larger diameters. these terms are emission, game advertising, analysis, application, and engagement.
fig. 8.
density visualization of the game advertising network.
the density visualization of game advertising research in fig. 8 is based on the analysis of all articles on game advertising from 2017-2022. in figure 8, the more yellow the color and the larger the diameter of the circle, the more dominant the keyword; where the color fades or blends into the green background, the keyword appears more rarely.
4. conclusion
this study aims to carry out a bibliometric analysis of the literature on game advertising. the keyword "game advertising" is used to obtain data, based on a topic area that covers keywords, abstracts, and titles. after processing and filtering the data, 989 relevant articles were obtained. vosviewer software is used to generate the mapping data, which are presented as network, overlay, and density visualizations. based on the mapping and analysis with vosviewer, research on the term game advertising in 2017-2022 decreased from year to year. this research uses the bibliometric method to identify the main themes of previous studies in the field, because this is important for assessing novelty in future research.
references
arwendria, a. (2021). publish or perish: bibliometric analysis of the literature on covid-19 on the google scholarship data base for 2019-2021. alma'arif: science islamic libraries and information, 1(1), 1-12.
aulia, eg.; nurkertamanda, d.; and budiawan, w. (2014). analysis of the effectiveness of dynamic in-game advertising on android games with eyetracking method. industrial engineering online journal, 3(1).
dahl, s. (2021). social media marketing: theories and applications. social media marketing, 1-100.
de freitas, s. (2018). are games effective learning tools? a review of educational games. journal of educational technology and society, 21(2), 74-84.
evans, nj; phua, j.; lim, j.; and jun, h. (2017). disclosing instagram influencer advertising: the effects of disclosure language on advertising recognition, attitudes, and behavioral intent. journal of interactive advertising, 17(2), 138-149.
goutama, ae (2018). games advertising vs tv advertising which one is more effective in building brands? stei journal of economics, 27(01), 1-9.
heath, s. (2019). u.s. patent no. 10,217,117. washington, dc: us patent and trademark office.
huotari, k.; and hamari, j. (2017). a definition for gamification: anchoring gamification in the service marketing literature. electronic markets, 27(1), 21-31.
kania, mbddm; and sabariah, a. (2013). analysis quality software against system unikom information. magazine unikom science.
koivisto, j.; and hamari, j. (2019). the rise of motivational information systems: a review of gamification research. international journal of information management, 45, 191-210.
kurnia, s. (2021). science, technology, engineering, art and mathematics (steam) in science education: analysis bibliometrics and mapping literature study use vosviewer software (doctoral dissertation, uin raden intan lampung).
lee, d.; hosanagar, k.; and nair, hs (2018). advertising content and consumer engagement on social media: evidence from facebook. management science, 64(11), 5105-5131.
leiss, w.; kline, s.; jhally, s.; botterill, j.; and asquith, k. (2018). social communication in advertising. routledge.
meeker, m.; and wu, l. (2018). internet trends 2018.
meffert, h.; burmann, c.; kirchgeorg, m.; and eisenbeiß, m. (2018). marketing: grundlagen marktorientierter unternehmensführung konzepte – instrumente – praxisbeispiele. springer-verlag.
roozenbeek, j.; and van der linden, s. (2019). fake news game confers psychological resistance against online misinformation. palgrave communications, 5(1), 1-10.
shapiro, bt (2018). positive spillovers and free riding in advertising of prescription pharmaceuticals: the case of antidepressants. journal of political economy, 126(1), 381-437.
silver, d.; hubert, t.; schrittwieser, j.; antonoglou, i.; lai, m.; guez, a.; ... and hassabis, d. (2018). a general reinforcement learning algorithm that masters chess, shogi, and go through self-play. science, 362(6419), 1140-1144.
silver, d.; schrittwieser, j.; simonyan, k.; antonoglou, i.; huang, a.; guez, a.; ... and hassabis, d. (2017). mastering the game of go without human knowledge. nature, 550(7676), 354-359.
smith, r.; kelly, b.; yeatman, h.; and boyland, e. (2019). food marketing influences children's attitudes, preferences and consumption: a systematic critical review. nutrients, 11(4), 875.
steenkamp, jbe.; nijs, vr.; hanssens, dm.; and dekimpe, mg (2018). competitive reactions to advertising and promotion attacks. in long-term impact of marketing: a compendium, 325-372.
terlutter, r.; and capella, ml (2013). the gamification of advertising: analysis and research directions of in-game advertising, advergames, and advertising in social network games. journal of advertising, 42(2-3), 95-112.
utami, sb; and karlina, n. (2022). bibliometric analysis: development of research and publications on program coordination using vosviewer. journal of cultural libraries, 9(1), 1-8.
von neumann, j.; and morgenstern, o. (2007).
theory of games and economic behavior. in theory of games and economic behavior. princeton university press.
wang, j.; yu, l.; zhang, w.; gong, y.; xu, y.; wang, b.; ... and zhang, d. (2017, august). irgan: a minimax game for unifying generative and discriminative information retrieval models. in proceedings of the 40th international acm sigir conference on research and development in information retrieval, 515-524.
wang, j.; zhang, w.; and yuan, s. (2017). display advertising with real-time bidding (rtb) and behavioral targeting. foundations and trends® in information retrieval, 11(4-5), 297-435.

samuel w lusweti et al. impact of number of artificial ants in aco … | 124
impact of number of artificial ants in aco on network convergence time: a survey
samuel w lusweti*, collins o odoyo, dorothy a rambim
department of information technology, school of computing and informatics, masinde muliro university of science and technology, kenya
*corresponding email: lusweti015@gmail.com
a b s t r a c t s a r t i c l e i n f o
due to the dynamic nature of computer networks today, there is a need to make networks self-organized. self-organization can be achieved by applying intelligent systems in the networks to improve convergence time. bio-inspired algorithms that imitate the foraging behaviour of natural ants have been seen to be more successful when applied to computer networks to make them self-organized. in this paper, we studied how ant colony optimization (aco) has been applied in networks as a bio-inspired algorithm, and its challenges. we identified the number of ants as a drawback to guide this research. we retrieved a number of studies carried out on the influence of ant density on deviation from the optimum, number of iterations and optimization time.
we found that even though some studies pointed out that the number of ants had no effect on algorithm performance, many others showed that the number of ants, which is a parameter to be set in the algorithm, significantly affects its performance. to help bridge the gap on whether or not the number of ants is significant, we give our recommendations, based on the results from various studies, in the conclusion section of this paper.
article history: received 25 may 2022 revised 30 may 2022 accepted 10 june 2022 available online 26 june 2022
keywords: convergence time, ant colony optimization, artificial ants, networks, parameter
international journal of informatics, information system and computer engineering 3(1) (2022) 124-135
1. introduction
the widely known metaheuristics form a family of optimization techniques that have been applied as combinatorial problem-solving techniques. over the years, metaheuristics have been applied in many fields to solve complex problems (liao et al., 2014). the aco algorithm is one example of these metaheuristics. others include particle swarm optimization, bee colony optimization, and bat optimization, among others. aco was introduced in the early 1990s (blum, 2005) and has for many years been applied in various fields to tackle complex optimization issues that are hard to solve using basic methods. for example, the algorithm has been successfully applied in data compression, gaming theory, feature selection, dispatch problems, parameter estimation in dynamic systems, satellite control, job scheduling problems, congestion control, social graph mining, medicine for decision making, and target tracking (kar, 2016).
aco has also been employed in computer networks to determine the shortest possible route for sending data from source to destination, similar to how ants forage (adiga et al., 2013), so that the chosen route becomes the lowest-cost route between any two connected nodes. this has improved network functioning while decreasing the latency that packets suffer, in the absence of the aco algorithm, when attempting to reach their target. because computer networks are important to the running of any institution, they must be active at all times, regardless of the obstacles they may encounter. as a result, technical adjustments in their upkeep are required. aco is employed here to make the network self-regulating and sustainable, so that any issues discovered in the network can be addressed by the network itself. some academics have proposed using aco to avoid the need for infrastructure such as nodes and switches, which might fail and cause the communication process to fail (dressler et al., 2010).
2. methods and materials
in this paper we adopted qualitative research methods. secondary data from 12 experiments published online were retrieved, and the results from the various studies were compared and analysed using thematic analysis.
3. application of aco in computer networks
the ant colony optimization algorithm belongs to a class of swarm-intelligent systems that are applied in solving np-hard problems. swarm-intelligent systems have not been fully explored in the literature to date (gustov & levina, 2021), but a lot is being done in aco, because among the many swarm-intelligent algorithms available, aco is the most studied. foraging ants in a real ant colony produce pheromone trails along the pathways they traverse, which others may follow to the food source. an ant in the aco algorithm is a mobile agent capable of dealing with computer network difficulties such as congestion and packet routing.
this is made possible by the continuous and consistent modification of the routing tables by the agents in response to any congestion in the network (sim & sun, 2002). as the agents make these modifications, they lay pheromone trails, which mark clear routes between any nodes in the system. these pheromone trails, acting as stigmergy, help future ants make routing decisions (kamali & opatrny, 2007). other bio-inspired algorithms that have been successfully applied in computer networks for adaptive routing include the artificial plant optimization (apo) algorithm for the implementation of the telecom sensor network, artificial neural networks (ann) for switching networks, the genetic algorithm (ga) for network path routing, and the leaping frog algorithm (lfa) for network design and scaling problems (kar, 2016). however, aco has been the most often used and researched optimization algorithm, since it has shown the highest performance and has been the most effective (caro & dorigo, 2004). as a result, this research concentrated on the design and use of aco in many sectors, and more especially on its application in computer networks, and on how it may be further improved in terms of the ideal number of ants to employ for better performance in these networks. forward and backward ants are used in these algorithms (jacobsen et al., 2011). the proactive aco routing algorithm antnet has been used effectively in packet-switched networks (caro & dorigo, 1998). the antnet algorithm launches a forward ant from the nest, or source node, at regular intervals towards its objective, the food or destination node. when a forward ant reaches its destination, it uses the list of nodes it visited to return to the nest or source and update the pheromone values deposited on the routes or connections.
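The forward and backward ant mechanism described above can be sketched as follows. This is a minimal illustration, not the AntNet implementation: the four-node `graph`, the helper names `make_tables`, `forward_ant` and `backward_ant`, and the constant `reinforcement` value are all assumptions made for the example (AntNet derives the reinforcement from measured trip times).

```python
import random

def make_tables(graph):
    """Start with uniform next-hop probabilities for every (node, destination) pair."""
    return {
        node: {
            dest: {nbr: 1.0 / len(nbrs) for nbr in nbrs}
            for dest in graph if dest != node
        }
        for node, nbrs in graph.items()
    }

def forward_ant(graph, tables, source, dest, rng):
    """Walk from source to dest, recording the list of visited nodes."""
    path = [source]
    node = source
    while node != dest:
        # Prefer unvisited neighbours to avoid loops; fall back to any neighbour.
        options = [n for n in graph[node] if n not in path] or list(graph[node])
        weights = [tables[node][dest][n] for n in options]
        node = rng.choices(options, weights=weights, k=1)[0]
        path.append(node)
    return path

def backward_ant(tables, path, reinforcement=0.3):
    """Retrace the recorded path and reinforce the links that were used."""
    dest = path[-1]
    for node, next_hop in zip(reversed(path[:-1]), reversed(path[1:])):
        row = tables[node][dest]
        for nbr in row:
            if nbr == next_hop:
                row[nbr] += reinforcement * (1.0 - row[nbr])  # used link gains
            else:
                row[nbr] -= reinforcement * row[nbr]          # other links lose

# Hypothetical four-node network with redundant links.
graph = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D"], "D": ["B", "C"]}
tables = make_tables(graph)
rng = random.Random(0)
path = forward_ant(graph, tables, "A", "D", rng)
first_hop = path[1]
before = tables["A"]["D"][first_hop]
backward_ant(tables, path)
after = tables["A"]["D"][first_hop]
```

The update keeps each routing-table row summing to one, so the entries can be read directly as next-hop probabilities for later ants.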
if aco is applied in an ideal network, the ants are translated into packets, and the routes they use become the links between the nodes on that network. if there are redundant links between the devices on the network, the packets are expected to take the shortest route to the destination if they are well optimized.
4. drawbacks and variations of aco
aco has shown certain benefits, such as positive feedback for quick solutions, dynamic applications that adjust to changes such as additional distances, and intrinsic parallelism. it has, however, shown several limitations, such as probability distribution changes due to iterations, and convergence time (selvi & umarani, 2010). more precisely, the disadvantage of anthocnet is the quantity of routing messages that must be delivered in the network before routes to the destination are formed, while the downside of antnet is the time necessary to build a route system between any two nodes in the network. this is referred to as the convergence time (selvi & umarani, 2010). last but not least, the stagnation of ants in the working process of the algorithm (caro & dorigo, 1998) is another problem common in aco algorithms. the parameter setting of a basic ant colony algorithm is the main cause of these variations and is still at an experimental stage today (wei, 2014).
4.1. aco algorithm parameter setting
the following are the parameters under consideration (wei, 2014):
m: number of ants.
α: relative importance of pheromone.
β: relative importance of the heuristic factor.
ρ: pheromone evaporation coefficient, while (1-ρ) indicates the pheromone persistence factor.
q: amount of pheromone released by ants.
the following is the general formula used in the algorithm with the above parameters.
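As a rough illustration of how the parameters m, α, β, ρ and Q enter the algorithm, the sketch below assumes the standard Ant System form: the move probability is proportional to pheromone raised to α times heuristic desirability raised to β, and the pheromone update keeps the fraction (1-ρ) and adds Q / L_k for each of the m ants. The function names, the three-city distance matrix and the choice of inverse distance as the heuristic are assumptions for the example, not taken from the paper.

```python
def transition_probabilities(i, allowed, tau, eta, alpha, beta):
    """Probability that an ant at city i moves to each city j in `allowed`."""
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def update_pheromone(tau, tours, lengths, rho, q):
    """Evaporate pheromone, then let each of the m ants deposit q / L_k on its tour."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= 1.0 - rho  # (1 - rho) is the persistence factor
    for tour, length in zip(tours, lengths):  # one entry per ant
        for i, j in zip(tour, tour[1:] + tour[:1]):  # closed tour edges
            tau[i][j] += q / length
            tau[j][i] += q / length

# Hypothetical symmetric 3-city instance; eta is assumed to be 1 / distance.
dist = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
eta = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(3)] for i in range(3)]
tau = [[1.0] * 3 for _ in range(3)]

probs = transition_probabilities(0, [1, 2], tau, eta, alpha=1.0, beta=1.0)
update_pheromone(tau, tours=[[0, 1, 2]], lengths=[5.0], rho=0.5, q=1.0)
```

Here the number of ants m is simply the number of tours passed to `update_pheromone`, which is where ant count affects both the amount of pheromone deposited per iteration and the work done per iteration.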
the ants independently select the next town or city to tour at time t; the probability of ant k moving from city i to city j is given by eq. (1) (caro & dorigo, 1998). after all the ants have toured all the cities in the search space, each of the paths is updated according to eq. (2). this study compares various studies on the impact of the number of ants (m), which appears in eq. (3), on the optimal solution and convergence time of the algorithm.
5. the effect of the number of ants (m) on aco optimization
according to colorni et al. (1991), the number of ants is among the controllable parameters affecting the performance of aco. we examine 12 experiments from three studies to see how the number of ants in the aco algorithm affects its convergence. here, we look at whether or not an optimal solution is found and how long it takes for the solution to be found.
5.1. experiment 1
this experiment was done by aydin and yilmaz (sivagaminathan & ramakrishnan, 2007). the two presented an investigation into the number of ants used in aco in relation to the number of iterations, the penalized objective function, and the optimization time. for the purposes of this study, the results for the number of iterations and the optimization time versus the number of ants are taken into account (see figs. 1 and 2).
fig. 1. the average iteration number versus number of ants (sivagaminathan & ramakrishnan, 2007)
fig. 2. the optimization time versus number of ants (sivagaminathan & ramakrishnan, 2007)
fig. 1 shows that as the number of ants rises, the number of iterations falls. in contrast, fig. 2 shows that the optimization time increases quickly as the population of ants grows.
in terms of the precise number of ants required for the optimum solution, this research does not give a direct answer; it only suggests that it is critical to identify the appropriate number of ants in order to get the best solution in the shortest amount of time. however, the experimental results in figs. 1 and 2 show that the fewer the ants, the greater the number of iterations but the shorter the optimization time. in other words, when the number of ants rises, the number of iterations decreases, yet the optimization time increases, since many ants take longer to converge. this experiment was limited to small problems, and the exact optimal number of ants could not be clearly established.
5.2. experiment 2
we examine this experiment done by alobaedy et al. (yilmaz & aydin). in this experiment, the researchers categorized optimization problems into small and medium scales, using data sets of 50 and 100 cities respectively. they measured the execution time, the best solution, and the total number of new solutions obtained, among other metrics. fig. 3 shows the execution time against the number of ants for the small scale problem of 50 cities; increasing the number of ants increases the execution time.
fig. 3. execution time against the number of ants (50 cities) (yilmaz & aydin)
when the experiment with 100 cities (medium scale) was run, as shown in fig. 4, a similar pattern was noted. however, due to the greater complexity of this problem, the execution time of the method increased in fig. 4.
fig. 4. execution time against the number of ants (100 cities) (yilmaz & aydin)
this research reveals that increasing the number of ants did not improve the algorithm's performance, whereas with a low number of ants the performance was enhanced. the experiment was performed on two problem variations, that is, small and medium sized problems.
for the small sized problem, the execution time was short and it was difficult to determine the exact number of ants needed to optimize the solution, whereas on the medium sized problem the execution time doubled and the optimum number of ants for that solution was found to be 16 out of 100.
5.3. experiment 3
in this experiment, christoffer and lars (alobaedy et al., 2017; stutzle & hoos, 2000; aydin) compared three aco variants (rankedas, eliteas, and mmas). eliteas relies on specialist ants. these secondary, or specialized, ants are used to impose the elitist approach, while the other ants, called normal ants, work differently. the specialist ants multiply the pheromone on the best solution found by normal ants (petterson & lundell, 2018), making it persist longer without decaying, unlike the normal routes in other ant systems. rankedas also uses specialist ants, but these ants deposit pheromone on many good paths instead of depositing all of it on the best solution found (petterson & lundell, 2018). every route is ranked by length, with the best-ranked route getting the most pheromone and the worst-ranked route receiving the least. lastly, mmas has no specialist ants, which means it only uses normal ants. here, the pheromone deposited on a given path can never exceed a maximum value or fall below a given minimum value. this ensures that the pheromone level on a path does not get so low that the path is rendered unusable, nor so high that it overshadows all the other routes (bullnheimer et al., 1997). this is achieved by smoothing the pheromone on the edges whenever concentration levels go below or above the extremes (see figs. 5-9).
5.3.1. eliteas
fig 5.
average deviation from optimum against ants in relation to 101 cities (alobaedy et al., 2017)
[plot: eliteas, 101 cities; x-axis: ants in relation to cities (1% to 100%); y-axis: average deviation from optimum; legend: specialists in relation to ants (1%, 25%, 50%, 75%, 100%)]
fig 6. average deviation from optimum against ants in relation to 225 cities (alobaedy et al., 2017)
[plot: eliteas, 225 cities; x-axis: ants in relation to cities (1% to 100%); y-axis: average deviation from optimum; legend: specialists in relation to ants (1%, 25%, 50%, 75%, 100%)]
in eliteas, as shown in figs. 5 and 6, only between 10% and 30% of the ants showed a better performance.
5.3.2. rankedas
fig 7. average deviation from optimum against ants in relation to 101 cities (alobaedy et al., 2017)
[plot: rankedas, 101 cities; x-axis: ants in relation to cities (1% to 100%); y-axis: average deviation from optimum; legend: specialists in relation to ants (1%, 25%, 50%, 75%, 100%, 5)]
fig 8. average deviation from optimum against ants in relation to 225 cities (alobaedy et al., 2017)
[plot: rankedas, 225 cities; x-axis: ants in relation to cities (1% to 100%); y-axis: average deviation from optimum; legend: specialists in relation to ants (1%, 25%, 50%, 75%, 100%, 5)]
fig 9. average deviation from optimum against ants in relation to 532 cities (alobaedy et al., 2017)
as figs. 5, 6, 7, 8 and 9 show, the results of rankedas and eliteas contrast with each other. a large percentage of specialized ants degrades the solution more than a low proportion of specialized ants. in rankedas, the number of normal ants affects the deviation from the optimum solution when 5 specialized ants are used, but when more specialized ants are used there is no effect. in this case, the optimal solution is obtained when 5 specialized ants and 100% regular ants are used in relation to cities (alobaedy et al., 2017).
nonetheless, in figure 9, with 532 cities, using more than 50% of the normal ants gives worse results, as the deviation from the optimum is high, as shown by the red line. it was found that, when implementing rankedas, the highest number of normal ants, that is 100% of ants, showed better performance in terms of convergence (see figs. 10-12).
5.3.3. min-max ant system
fig 10. average deviation from optimum against ants in relation to 101 cities (alobaedy et al., 2017)
[plot: mmas, 101 cities; x-axis: ants in relation to cities (1% to 100%); y-axis: average deviation from optimum; series: average, best-worst]
fig 11. average deviation from optimum against ants in relation to 225 cities (alobaedy et al., 2017)
fig 12. average deviation from optimum against ants in relation to 532 cities (alobaedy et al., 2017)
figs. 10 and 11 show that increasing the number of ants in mmas has no effect on the deviation from optimum results. except for fig. 12, which shows some variation, the overall performance of mmas is not affected by a large number of ants (alobaedy et al., 2017).
6. results and discussion
while eight experiments in the data collected show that an increase in the number of ants degrades the optimization of various aco versions, two of them indicate that it actually improves the optimization of the algorithm, and two more reveal that it has no impact on the results of the algorithm. to help understand the variation in rankedas, we single out figure 9, which shows worse results when the number of normal ants is increased. first, we look at the number of specialized ants, which is kept at 5 in all of figures 7, 8 and 9.
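The ratio of specialized to normal ants for figures 7, 8 and 9 can be checked with a short calculation. The sketch below assumes that the number of normal ants equals the stated percentage of the number of cities (the "ants in relation to cities" axis) and that 5 specialized ants are used throughout; the function name `normal_ants_per_specialist` is hypothetical.

```python
def normal_ants_per_specialist(cities, ant_percent, specialists=5):
    """Normal ants per specialized ant, with ants given as a percentage of cities."""
    normal_ants = cities * ant_percent / 100.0
    return normal_ants / specialists

# (cities, percentage of ants): the three optimum points and the 60% point of figure 9.
cases = [(101, 100), (225, 100), (532, 50), (532, 60)]
ratios = {case: normal_ants_per_specialist(*case) for case in cases}

for (cities, percent), ratio in ratios.items():
    print(f"{cities} cities at {percent}% ants -> 1:{ratio:.1f}")
```

Under these assumptions the optimum points fall at roughly 20, 45 and 53 normal ants per specialized ant, while the 60% point for 532 cities lies at roughly 64, which is where the deviation begins to grow.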
calculating the ratio of specialized ants to normal ants in the three experiments when the algorithm is at its optimum performance, we find 5:100, 5:225 and 5:266 (at 50% ants) for figures 7, 8 and 9 respectively. simplified, these ratios are about 1:20, 1:45 and 1:53. in figure 9, at 60% ants, where the graph starts to deviate from the optimum, the ratio is about 1:64 (319 normal ants to 5 specialized ants). from 60% onwards, the graph continues to deviate further from the optimal solution up to 100%. from this data we can see that, for the optimal functioning of the algorithm, the ratio of specialized ants to normal ants has to be considered. in figure 9, this ratio was not considered, as the number of specialized ants remained at 5. for every 1 specialized ant, there need to be about 20 to about 50 normal ants to optimize the solution. when aco algorithms are optimized, they can be applied in various fields of study. when aco is applied in computer networks, the packets are treated as ants and made as intelligent as the ants of the algorithm, which prevents packet looping and many other problems associated with unoptimized computer networks, especially under dynamic conditions. this helps improve the network convergence time without making it so short as to cause premature convergence or so long as to introduce a lot of latency in the network.
7.
conclusion after identifying the main challenges faced by aco which include stagnation of ants, actual number of routing messages that are needed, and convergence time and their main causes which are associated with parameter setting in the algorithm, we singled out one of the parameters which is the convergence time influenced by the number of ants in the solution space. however, the key problem is determining the ideal number of ants to utilize in the algorithm. it is difficult to quantify the number of ants necessary to solve a issue. first, some problems are less complex than others that means they need a smaller number of ants to be solved, and secondly, we have different types of aco algorithms working differently. a well optimized algorithm will have a short convergence time. we therefore considered the results from the experiments done by various researchers as shown in the 12 figures above and came up with the following conclusions that can help determine the optimum number of ants needed. 1) the type of aco algorithm in use, 2) the complexity of the problem under study, and 3) the ratio of the number of specialized ants to that of normal ants. further research needs to be carried out to determine how best we can calculate and determine the optimal number of ants needed. references liao, t., stützle, t., de oca, m. a. m., & dorigo, m. (2014). a unified ant colony optimization algorithm for continuous optimization. european journal of operational research, 234(3), 597-609. blum, c. (2005). ant colony optimization: introduction and recent trends. physics of life reviews, 2(4), 353-373. samuel w lusweti et al. impact of number of artificial ants in aco … | 134 kar, a. k. (2016). bio inspired computing–a review of algorithms and scope of applications. expert systems with applications, 59, 20-32. adiga, c. s., joshi, h. g., & harish, s. v. (2013). network routing problem-a simulation environment using intelligent technique. 
international journal of advanced computer research, 3(4), 166.
dressler, f., & akan, o. b. (2010). a survey on bio-inspired networking. computer networks, 54(6), 881-900.
gustov, v., & levina, a. (2021, april). electromagnetic fields as a sign of side-channel attacks in gsm module. in 2021 11th ifip international conference on new technologies, mobility and security (ntms) (pp. 1-5). ieee.
sim, k. m., & sun, w. h. (2002, november). multiple ant-colony optimization for network routing. in first international symposium on cyber worlds, 2002. proceedings. (pp. 277-281). ieee.
kamali, s., & opatrny, j. (2007, march). posant: a position-based ant colony routing algorithm for mobile ad-hoc networks. in 2007 third international conference on wireless and mobile communications (icwmc'07) (pp. 21-21). ieee.
di caro, g., & dorigo, m. (2004). ant colony optimization and its application to adaptive routing in telecommunication networks (doctoral dissertation, phd thesis, faculté des sciences appliquées, université libre de bruxelles, brussels, belgium).
jacobsen, r. h., zhang, q., & toftegaard, t. s. (2011). bioinspired principles for large-scale networked sensor systems: an overview. sensors, 11(4), 4137-4151.
caro, g. d., & dorigo, m. (1998, september). ant colonies for adaptive routing in packet-switched communications networks. in international conference on parallel problem solving from nature (pp. 673-682). springer, berlin, heidelberg.
selvi, v., & umarani, r. (2010). comparative analysis of ant colony and particle swarm optimization techniques. international journal of computer applications, 5(4), 16.
wei, x. (2014). parameters analysis for basic ant colony optimization algorithm in tsp. international journal of u-and e-service, science and technology, 7(4), 159-170.
colorni, a., dorigo, m., & maniezzo, v. (1991). distributed optimization by ant colonies. in proceedings of ecal91–european conference on artificial life.
sivagaminathan, r. k., & ramakrishnan, s. (2007). a hybrid approach for feature subset selection using neural networks and ant colony optimization. expert systems with applications, 33(1), 49-60.
yilmaz, a. h., & aydin, z. examination of parameters used in ant colony algorithm over truss optimization. balıkesir üniversitesi fen bilimleri enstitüsü dergisi, 24(1), 263-280.
alobaedy, m. m., khalaf, a. a., & muraina, i. d. (2017, may). analysis of the number of ants in ant colony system algorithm. in 2017 5th international conference on information and communication technology (icoic7) (pp. 1-5). ieee.
pettersson, l., & lundell johansson, c. (2018). ant colony optimization-optimal number of ants.
bullnheimer, b., hartl, r. f., & strauss, c. (1997). a new rank based version of the ant system. a computational study.
stützle, t., & hoos, h. h. (2000). max–min ant system. future generation computer systems, 16(8), 889-914.
aydin, z. izgara sistemlerin optimizasyonu üzerinden karinca koloni optimizasyon algoritmasinda karinca sayisinin belirlenmesi. uludağ üniversitesi mühendislik fakültesi dergisi, 22(3), 251-262.

ivan kristianto singgih. agricultural drone zoning and deployment strategy with multiple...| 66

agricultural drone zoning and deployment strategy with multiple flights considering takeoff point reach distance minimization

ivan kristianto singgih
quantum machine learning laboratory, school of industrial management engineering, korea university, seoul, republic of korea.
*corresponding email: 1ivanksinggih@gmail.com

abstract

in the agricultural sector, drones are used to spray chemicals on plants. a lawn mowing movement pattern is one of the most widely used methods when deploying drones because of its simplicity. a route planner determines some pre-set routes before the drones fly along them. each drone flight is limited by its battery level or its level of spray liquid. to complete the spraying task efficiently, multiple drones need to be deployed simultaneously. in this study, we examine a multiple drone zoning and deployment strategy that minimizes the cost of setting up equipment at the takeoff points, e.g., between flights. we propose a method to set the flight starting points and directions appropriately, given various target areas to cover. this is the first study that discusses the spraying drone zoning and deployment plan while minimizing the number of takeoff points, which plays an important role in reducing the drone setup and deployment costs. the suggested procedure helps drone route planners generate good routes within a short time. the generated routes could be used by the planners for their chemical spraying activity, or as initial input for a design that can be improved with the planners’ experience. our study shows that when generating an efficient route, we must consider the number of flight area levels, the directions of the drone movements, the number of u-turns of the drones, and the start points of the drone flights.

article info

article history: received 18 dec 2021, revised 20 dec 2021, accepted 25 dec 2021, available online 26 dec 2021

keywords: drone, routing, rule, spraying, zone control.
international journal of informatics, information system and computer engineering 2(2) (2021) 66-79

1. introduction

drones are one of the fastest-emerging technologies in the industry 4.0 era (fernández-caramés et al., 2019). they are used in various fields, including post-disaster observation (singgih et al., 2020), last-mile delivery (moadab et al., 2022), inspection (nguyen et al., 2018), medical treatment (pulver and wei, 2018), etc. in the agricultural field, drones are used for various purposes, including vegetation segmentation (su et al., 2021), soil mapping, disease monitoring, crop yield estimation (maddikunta et al., 2021), cultivation planning (kakamoukas et al., 2022), etc. in contrast to manual work, using drones allows us to perform high-precision work within a short time. such precision is enabled by internet of things and big data technology (patrik et al., 2019; sordan et al., 2022). although drone technology still needs to be supported by the establishment of internet connectivity for farmers to ensure its effective implementation (mehrabi et al., 2021), the technology itself is currently developing quickly (santangeli et al., 2020). in this study, we focus on drones that are used to spray chemicals on a large area (figure 1). before using the drones to spray chemicals on the plants, a drone route plan (figure 2) is required (hobby hangar, 2022).

to assess the novelty of our study, we conducted a literature review as follows. we start with chung et al. (2020). a drone routing problem can be classified into routing through target points and routing through target areas.
in the first classification, the drones must visit all given target points (coutinho et al., 2018; gu et al., 2020), e.g., to conduct item delivery or observation from such points, while in the latter classification, the drones need to cover all target areas, e.g., when spraying chemicals on a target agricultural field (faiçal et al., 2017). our studied problem belongs to the latter class; thus, we review the studies listed in table 4 (drone routing problems with area coverage classification) of chung et al. (2020), specifically the ones marked with ag (agriculture) as the application area. to ensure that we cover papers published after chung et al. (2020), we also searched for papers citing it. among a total of 24 papers from both sources, we found five related papers to be compared with ours. the other studies were excluded because they address target point visiting problems or focus on non-agricultural fields (e.g., traffic monitoring and delivery systems). comparisons between our study and those five related papers are presented in table 1. to the best of our knowledge, our study is the first one that considers drone zoning and routing when using multiple flights for agricultural purposes while minimizing the number of takeoff points.

the drones’ lawn mowing pattern is a sweeping movement (otto et al., 2018; avellar et al., 2015). such movements can be differentiated into (1) movements parallel to the longest side of the area and (2) movements perpendicular to the longest side of the area, as shown in figure 3. in contrast to spiral movement, lawn mowing is preferred when the target search area is large (cabreira et al., 2019). having such a simple movement pattern allows the planner to route the drones easily while ensuring high effectiveness in the drones’ movement. our study focuses on proposing a simple movement pattern to assist route planners with their manual routing procedure.
also, our proposed movement strategy could be used as a built-in route suggestion in a routing application, provided to the route planners for editing and approval. such a simple yet effective movement pattern would be easily understood by the route planners, allowing them to conduct a better route optimization. our study differs from previous studies by proposing a simple lawn mowing pattern that considers the distance between the end point of each travel and the start point of the next travel (take-off point). such consideration is important because the drones need their battery and chemical container to be replaced before continuing with their next travel (kim and lim, 2018; qin et al., 2021; jorge et al., 2021). minimizing the distances between take-off points minimizes the time required by the workers to pre-place the battery and chemical container replacements; thus, it significantly reduces the operational time and, in the end, reduces the cost and increases the benefit of using the drones.

figure 1. chemical spraying drone. source: ahmed et al. (2021)

figure 2. a pre-set drone route for chemical spraying on an agricultural area. source: hobby hangar (2022)

table 1. comparison with previous studies
characteristics | drone flight | routing direction(s) | details on routes | routing objective(s)
moon and shim (2009) | multiple | complex | exist | completion time
barrientos et al. (2011) | multiple | complex | exist | completion time
avellar et al. (2015) | multiple | lawn mowing (1 direction) | exist | completion time
barna et al. (2019) | single | lawn mowing (1 direction) | exist | captured photo quality
tu et al. (2020) | multiple | lawn mowing (1 direction), grid | not exist | captured photo quality
our study | multiple | lawn mowing (2 directions) | exist | completion time, number of takeoff points

figure 3. two types of lawn mowing patterns of the drones: (a) parallel to the longest side, (b) perpendicular to the longest side

when solving routing problems, various solution approaches could be used, e.g., mathematical models, exact heuristics, metaheuristics, simulation, and rule-based methods (amarat and zong, 2019; erdelić and carić, 2019; moghdani et al., 2021). lately, machine learning-based methods have also been proposed (arnold and sörensen, 2019; zhao et al., 2021). some of these methods (e.g., exact methods) produce better solution quality but require more computational effort. in contrast, rule-based methods are straightforward and can be applied more easily. the development of such rule-based methods is common for solving various combinatorial optimization problems, e.g., project scheduling (chakrabortty et al., 2020), job dispatching (đurasević and jakobović, 2020), and machine scheduling (gil-gala et al., 2019). for our problem, we develop a rule-based approach to provide drone route planners with the necessary insights for manually designing the routes.

the rest of the paper is structured as follows: section 2 explains the proposed routing procedure. section 3 presents the numerical experiments and discussions. finally, section 4 concludes the study.

2. method

when operating the drones, we need to account for real drone characteristics, e.g., the limitations on flight range (otto et al., 2018) and on the weight to carry (macrina et al., 2020). to ensure that each drone completes its tasks, we need to set up some locations within the working area of the drone with the equipment necessary for battery charging and chemical refilling or replacement. when any drone requires a temporary landing, the required equipment must be ready. please note that after landing, the drones take off again to continue the spraying process. therefore, we also call the temporary landing points take-off points. to ensure the readiness of the equipment, a straightforward approach is to minimize the distances between the take-off points. such a distance minimization can also be found in truck routing problems in a truck-drone collaborative parcel delivery system (wang et al., 2019). such a take-off point distance minimization is equivalent to reducing the number of take-off points, which reduces the effort to transport and prepare the equipment for the landing drones. our proposed drone zoning and deployment procedure is described in algorithm 1.

algorithm 1. drone zoning and deployment procedure
1: calculate the number of required drone flights: #_of_drone_flights = ⌊total grid area / max covered grid area per drone⌋
2: determine alternatives of identical drone flight area dimensions (length and width), which cover the whole spraying area
3: define the possible drone flight start and end positions simultaneously while minimizing the total distances from the drone take-off points
4: finalize the best drone deployment plan

the end points of each drone movement are determined based on the size of the target area and the movement direction of the drones, as shown in figure 3. as shown in figure 3(a), when the number of u-turns of a drone is even, the drone travel ends on the exact opposite side of the starting point. meanwhile, when the number of u-turns of the drone is odd (figure 3(b)), the drone travel ends on the same side as the starting point. considering this movement rule, we need to determine the movement direction of the drones based on the size of the target area. this choice significantly affects the positions and the number of the take-off points.
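step 1 of algorithm 1 and the u-turn parity rule can be illustrated with a short sketch. this is a minimal example under stated assumptions, not the author's implementation: the function names are hypothetical, and the flight count uses a ceiling instead of the floor written in algorithm 1, so that a leftover area still receives a flight (both give the same result when the total area is an exact multiple of the per-flight area).

```python
def number_of_flights(total_grid_area: int, max_area_per_flight: int) -> int:
    """Flights needed to cover the area (ceiling division, an assumption here;
    algorithm 1 writes a floor, which agrees whenever the division is exact)."""
    return (total_grid_area + max_area_per_flight - 1) // max_area_per_flight


def ends_on_start_side(num_u_turns: int) -> bool:
    """A flight with k u-turns consists of k + 1 straight passes.
    An odd k brings the drone back to the side it started from;
    an even k (including 0) leaves it on the opposite side."""
    return num_u_turns % 2 == 1


# illustrative values only
print(number_of_flights(95, 20))
for k in range(4):
    side = "same side" if ends_on_start_side(k) else "opposite side"
    print(f"{k} u-turns -> ends on the {side}")
```

knowing on which side each flight ends is what makes it possible to place the end of one flight next to the start of the following one.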
in general, minimizing the number of u-turns is preferred because travelling through the u-turn area causes a longer movement time due to the required deceleration and acceleration. however, allowing a moderate number of u-turns should be acceptable, considering that doing so could reduce the number of take-off points. please refer to the next section for examples and further analysis.

3. results and discussion

for the numerical experiment, we consider two problem instances. instance 1 considers a 200-grid area with 10 grid x 20 grid dimensions, while instance 2 considers a 180-grid area with 15 grid x 12 grid dimensions. for instances 1 and 2, the maximum grid areas covered by one drone flight are 20 and 30 grids, respectively. considering various movement rules, we generate five and three drone deployment plans for instances 1 and 2, as shown in figures 4 and 5, respectively. for simplicity, we observe the drone flight areas and drone movements in horizontal and vertical locations and directions. please note that the horizontal and vertical directions refer to the target area as observed from the top view.

the results are produced using algorithm 1. in step 1, it is straightforward to determine the number of flights based on the drone’s limited flight time, which is determined by the limited battery or carried chemical. in step 2, we define the same-shaped flight area for all drones. currently, we consider the same shape so that basic drone deployment and routing rules can be extracted easily. to observe various routing alternatives in steps 3 and 4, we test various drone movement strategies, e.g., (1) horizontal or vertical movements and (2) starting points at the outer side or inner side of the target area. for a drone flight, the start point and end point are labeled as “s” and “e”, respectively.
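as a sanity check of steps 1 and 2 on the two instances described above, the sketch below counts the flights and enumerates identical rectangular flight areas that tile the target area exactly. the enumeration rule (axis-aligned rectangles whose sides divide the target sides) and the reading of "10 grid x 20 grid" as width x height are assumptions made for illustration; the paper does not state how its alternatives were generated.

```python
def zone_dimension_alternatives(area_w: int, area_h: int, zone_area: int):
    """List (zone_w, zone_h) rectangles with the given area that tile an
    area_w x area_h target exactly (both sides must divide the target sides)."""
    options = []
    for zone_w in range(1, area_w + 1):
        if zone_area % zone_w or area_w % zone_w:
            continue  # zone_w must divide both the zone area and the target width
        zone_h = zone_area // zone_w
        if zone_h <= area_h and area_h % zone_h == 0:
            options.append((zone_w, zone_h))
    return options


# instance 1: 10 x 20 grids, 20 grids per flight; instance 2: 15 x 12 grids, 30 per flight
flights_1 = (10 * 20) // 20
flights_2 = (15 * 12) // 30
instance_1 = zone_dimension_alternatives(10, 20, 20)
instance_2 = zone_dimension_alternatives(15, 12, 30)

print(flights_1, instance_1)
print(flights_2, instance_2)
```

under these assumptions instance 1 needs 10 flights and instance 2 needs 6, which matches the ten start and end labels listed for figure 4(a).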
a “takeoff point” consists of a maximum of 4 landing and takeoff points that are placed adjacently, because we can place the equipment (recharged batteries and refilled chemical tanks) in the center of those points. from here on, we call each such location a “takeoff point”. we exclude the first start point and the last end point when counting the takeoff points because we assume that each drone is ready with all required equipment when starting its first flight and that, at the end of its last flight, there is no need to rush the final drone pickup. as an example, in figure 4(a), there are five takeoff points as follows: takeoff point 1 (e1, s2, e9, s10), takeoff point 2 (e2, s3, e8, s9), takeoff point 3 (e3, s4, e7, s8), takeoff point 4 (e4, s5, e6, s7), and takeoff point 5 (e5, s6). different background colors are used for each takeoff point group to present the results clearly.

(a) even horizontal flight area, vertical flight, odd u-turns, 5 takeoff points (best)
(b) even horizontal flight area, vertical flight, odd u-turns, 9 takeoff points
(c) even horizontal flight area, horizontal flight, even u-turns, 7 takeoff points
(d) odd horizontal flight area, horizontal flight, odd u-turns, 5 takeoff points
(e) even horizontal flight area, horizontal flight, no u-turns, 9 takeoff points
figure 4. drone deployment alternatives for instance 1

(a) odd horizontal flight area, vertical flight, odd u-turns, 5 takeoff points
(b) odd horizontal flight area, vertical flight, odd u-turns, 5 takeoff points
(c) odd horizontal flight area, horizontal flight, even u-turns, 4 takeoff points (best)
figure 5.
drone deployment alternatives for instance 2

the best routing alternatives (those that reduce the number of take-off points) for instances 1 and 2 are shown in figures 4(a) and 5(c), respectively. based on our observation, we conclude that a minimum number of take-off points can be produced by: (1) generating an even number of flight area levels, then ensuring that the take-off points are grouped in the middle of each two adjacent levels, as shown by figure 4(a), which has two horizontal flight area levels. the takeoff points can be accumulated in the middle of both horizontal levels because the number of u-turns is odd (which makes the drones return to the same middle side of the target area). (2) generating an odd number of flight area levels, then setting the drones’ perpendicular movements from those levels, as shown by figure 5(c), which has three horizontal flight area levels and a horizontal drone flight movement. in this example, the number of u-turns is even. if the number of u-turns were odd, we would follow the same routing solution structure shown in figure 4(a) by starting the drone movements from the middle part of two adjacent vertical flight areas. in addition to the findings above, we also conclude that flights with no u-turns do not minimize the number of take-off points because the start and end points are not placed adjacently. the findings in this study can be used as a good reference for drone flight planners when they predefine the flight routes (ma et al., 2019).

4. conclusion

we study a multiple drone zoning and deployment problem. our proposed method divides the target area into zones, then determines detailed drone movement directions to minimize the effort of preparing batteries and refilling chemicals for drones at the end of each flight. some useful insights are listed as recommendations for drone flight planners.
for future research, we suggest allowing different drone flight area sizes to increase the flexibility of the drone movements and reduce the effort for the equipment preparation.

references

ahmed, s., qiu, b., ahmad, f., kong, c.-w., & xin, h. (2021). a state-of-the-art analysis of obstacle avoidance methods from the perspective of an agricultural sprayer uav’s operation scenario. agronomy, 11, 1069.
amarat, s. b., & zong, p. (2019). 3d path planning, routing algorithms and routing protocols for unmanned air vehicles: a review. aircraft engineering and aerospace technology, 91(9), 1245–1255.
arnold, f., & sörensen, k. (2019). what makes a vrp solution good? the generation of problem-specific knowledge for heuristics. computers & operations research, 106, 280–288.
avellar, g. s. c., pereira, g. a. s., pimenta, l. c. a., & iscold, p. (2015). multi-uav routing for area coverage and remote sensing with minimum time. sensors, 15, 27783–27803.
barna, r., solymosi, k., & stettner, e. (2019). mathematical analysis of drone flight path. journal of agricultural informatics, 10(2), 15–27.
barrientos, a., colorado, j., del cerro, j., martinez, a., rossi, c., sanz, d., & valente, j. (2011). aerial remote sensing in agriculture: a practical approach to area coverage and path planning for fleets of mini aerial robots. journal of field robotics, 28(5), 667–689.
cabreira, t. m., brisolara, l. b., & paulo r. jr., f. (2019). survey on coverage path planning with unmanned aerial vehicles. drones, 3(1), 4.
chakrabortty, r. k., rahman, h. f., & ryan, m. j. (2020). efficient priority rules for project scheduling under dynamic environments: a heuristic approach. computers & industrial engineering, 140, 106287.
chung, s. h., sah, b., & lee, j. (2020). optimization for drone and drone-truck combined operations: a review of the state of the art and future directions.
computers and operations research, 123, 105004.
coutinho, w. p., battara, m., & fliege, j. (2018). the unmanned aerial vehicle routing and trajectory optimisation problem, a taxonomic review. computers & industrial engineering, 120, 116–128.
đurasević, m., & jakobović, d. (2020). comparison of schedule generation schemes for designing dispatching rules with genetic programming in the unrelated machines environment. applied soft computing journal, 96, 106637.
erdelić, t., & carić, t. (2019). a survey on the electric vehicle routing problem: variants and solution approaches. journal of advanced transportation, 5075671.
faiçal, b. s., freitas, h., gomes, p. h., mano, l. y., pessin, g., de carvalho, a. c. p. l. f., krishnamachari, b., & ueyama, j. (2017). an adaptive approach for uav-based pesticide spraying in dynamic environments. computers and electronics in agriculture, 138, 210–223.
fernández-caramés, t. m., blanco-novoa, o., froiz-míguez, i., & fraga-lamas, p. (2019). towards an autonomous industry 4.0 warehouse: a uav and blockchain-based system for inventory and traceability applications in big data-driven supply chain management. sensors, 19(10), 2394.
gil-gala, f. j., mencía, c., sierra, m. r., & varela, r. (2019). evolving priority rules for on-line scheduling of jobs on a single machine with variable capacity over time. applied soft computing journal, 85, 105782.
gu, q., fan, t., pan, f., & zhang, c. (2020). a vehicle-uav operation scheme for instant delivery. computers & industrial engineering, 149, 106809.
hobby hangar. (2022). align rm41501xw m4 high-performance agricultural drone. https://www.hobbyhangar.co.nz/align-rm41501xw-m4-highperformance-agricultural-drone.
jorge, h. g., de santos, l. m. g., álvarez, n. f., sánchez, j. m., & medina, f. n. (2021).
operational study of drone spraying application for the disinfection of surfaces against the covid-19 pandemic. drones, 5(1), 18.
kakamoukas, g. a., sarigiannidis, p. g., & economides, a. a. (2022). fanets in agriculture - a routing protocol survey. internet of things, 18, 100183.
kim, s. j., & lim, g. j. (2018). a hybrid battery charging approach for drone-aided border surveillance scheduling. drones, 2(4), 38.
ma, f., xu, z., & xiong, f. (2019). research on route planning of plant protection uav based on area modular division. 2019 11th international conference on intelligent human-machine systems and cybernetics (ihmsc) (pp. 101–104).
macrina, g., pugliese, l. d. p., guerriero, f., & laporte, g. (2020). drone-aided routing: a literature review. transportation research part c, 120, 102762.
maddikunta, p. k. r., hakak, s., alazab, m., bhattacharya, s., gadekallu, t. r., khan, w. z., & pham, q.-v. (2021). unmanned aerial vehicles in smart agriculture: applications, requirements, and challenges. ieee sensors journal, 21(16), 17608–17619.
mehrabi, z., mcdowell, m. j., ricciardi, v., levers, c., martinez, j. d., mehrabi, n., wittman, h., ramankutty, n., & jarvis, a. (2021). the global divide in data-driven farming. nature sustainability, 4, 154–160.
moadab, a., farajzadeh, f., & valilai, o. f. (2022). drone routing problem model for last-mile delivery using the public transportation capacity as moving charging stations. scientific reports, 12, 6361.
moghdani, r., salimifard, k., demir, e., & benyettou, a. (2021). the green vehicle routing problem: a systematic literature review. journal of cleaner production, 279, 123691.
moon, s.-w., & shim, h.-c. (2009). study on path planning algorithms for unmanned agricultural helicopters in complex environment. international journal of aeronautical & space sciences, 10(2), 1–11.
nguyen, v. n., jenssen, r., & roverso, d. (2018).
automatic autonomous vision-based power line inspection: a review of current status and the potential role of deep learning. international journal of electrical power & energy systems, 99, 107–120.
otto, a., agatz, n., campbell, j., golden, b., & pesch, e. (2018). optimization approaches for civil applications of unmanned aerial vehicles (uavs) or aerial drones: a survey. networks, 72(4), 411–458.
patrik, a., utama, g., gunawan, a. a. s., chowanda, a., suroso, j. s., shofiyanto, r., & budiharto, w. (2019). gnss-based navigation systems of autonomous drone for delivering items. journal of big data, 6, 53.
pergher, i., frej, e. a., roselli, l. r. p., & de almeida, a. t. (2020). integrating simulation and fitradeoff method for scheduling rules selection in job-shop production systems. international journal of production economics, 227, 107669.
pulver, a., & wei, r. (2018). optimizing the spatial location of medical drones. applied geography, 90, 9–16.
qin, y., kishk, m. a., & alouini, m.-s. (2021). on the influence of charging stations spatial distribution on aerial wireless networks. ieee transactions on green communications and networking, 5(3), 1395–1409.
santangeli, a., chen, y., kluen, e., chirumamilla, r., tiainen, j., & loehr, j. (2020). integrating drone‑borne thermal imaging with artificial intelligence to locate bird nests on agricultural land. scientific reports, 10, 10993.
singgih, i. k., lee, j., & kim, b.-i. (2020). node and edge drone surveillance problem with consideration of required observation quality and battery replacement. ieee access, 8, 44125–44139.
sordan, j. e., oprime, p., pimenta, m. l., chiabert, p., & lombardi, f. (2022). industry 4.0: a bibliometric analysis in the perspective of operations management. operations and supply chain management, 15(1), 93–104.
su, j., yi, d., su, b., mi, z., liu, c., hu, x., xu, x., guo, l., & chen, w.-h. (2021). aerial visual perception in smart farming: field study of wheat yellow rust monitoring.
ieee transactions on industrial informatics, 17(3), 2242–2249.
tu, y.-h., phinn, s., johansen, k., robson, a., & wu, d. (2020). optimising drone flight planning for measuring horticultural tree crop structure. isprs journal of photogrammetry and remote sensing, 160, 83–96.
wang, d., hu, p., du, j., zhou, p., deng, t., & hu, m. (2019). routing and scheduling for hybrid truck-drone collaborative parcel delivery with independent and truck-carried drones. ieee internet of things journal, 6(6), 10483–10495.
zhao, j., mao, m., zhao, x., & zou, j. (2021). a hybrid of deep reinforcement learning and local search for the vehicle routing problems. ieee transactions on intelligent transportation systems, 22(11), 7208–7218.

21 | international journal of informatics information system and computer engineering 3(2) (2022) 21-30

2-d attention-based convolutional recurrent neural network for speech emotion recognition

akalya devi c, karthika renuka d, aarshana e winy, p c kruthikkha, ramya p, soundarya s
assistant professor, 2ug scholar, department of information technology, psg college of technology, coimbatore, india
*corresponding email: cad.it@psgtech.ac.in

abstract

recognizing speech emotions is a formidable challenge due to the complexity of emotions. the performance of speech emotion recognition (ser) is significantly affected by the emotional signals retrieved from speech. the majority of emotional traits, on the other hand, are sensitive to emotionally neutral factors such as the speaker, speaking manner, and gender. in this work, we postulate that computing deltas for individual features preserves useful information that is mainly relevant to emotional traits while minimizing the influence of emotionally irrelevant components, thus leading to fewer misclassifications.
additionally, ser input commonly contains silent and emotionally irrelevant frames. attention mechanisms are effective at learning feature representations that are relevant to emotion. we therefore propose an attention-based two-dimensional convolutional recurrent neural network that learns discriminative features and predicts emotions. the log mel-spectrogram is used for feature extraction. the proposed technique is evaluated on the iemocap dataset and achieves 68% accuracy.

article history: received 18 dec 2022; revised 20 dec 2022; accepted 25 dec 2022; available online 26 dec 2022

keywords: 2-d, attention-based, convolutional recurrent neural network, speech emotion recognition

1. introduction

human speech emotion recognition has recently grown in importance as a way to improve the quality and efficiency of interactions between machines and humans (khalil et al., 2019). because both natural and artificial emotions are hard to define, recognizing human emotions is a challenging task in its own right. numerous investigations have studied which spectral and prosodic features lead to an accurate assessment of emotions (tzirakis et al., 2018). speech emotion recognition uses a processor to extract emotional information from speech signals (chen et al., 2018), and then compares and analyzes that information together with the distinctive factors. once the emotional information is extracted, various techniques are used to predict the emotion of the speech signal (khalil et al., 2019).
speech emotion detection is now a rapidly developing discipline that bridges the interaction between robots and humans. it is also a popular research area in signal processing and pattern recognition. emotions are important to human mental health; they are a way of expressing one's thoughts or state of mind to others. the major objective of ser is to improve human-machine interaction (hmi). it can also be used with lie detectors to monitor a subject's psychophysical state (lalitha et al., 2015). applications of ser include onboard car driving systems, spoken dialogue systems used in call center conversations, and the use of speech emotion patterns in medicine. hmi systems still have many problems to resolve, especially when they move from laboratory testing to real operation. efforts are therefore required to resolve these problems and improve machine emotion perception.

recently, deep neural networks (dnns) have gained popularity and made major advances in a number of machine learning fields, including continuous affect recognition. in most studies, hand-crafted features are fed to the dnn architectures. many dnn architectures have been proposed in this setting, including convolutional neural networks (cnns) and long short-term memory (lstm) networks. mao et al. (2014) first used cnns to learn affective-salient features for ser and reported strong results on several benchmark datasets. lee & tashev (2015) used recurrent neural networks (rnns) to learn long-range temporal correlations for ser. trigeorgis et al. (2016) trained a convolutional recurrent neural network (crnn) directly on raw audio data to predict continuous valence space.
additionally, attention-based rnns have proved effective at learning the alignment between input and output sequences, and they are well suited to ser. first, speech is a sequence of data with variable length. second, most speech datasets annotate emotion labels at the utterance level, even though utterances often contain long pauses and only a few words. selecting emotion-relevant frames is therefore crucial for ser. in this paper, we extend our model with a crnn-based attention mechanism that yields affective-salient features for the final emotion classification. we combine a crnn and an attention model into an architecture for ser called the 2-d attention-based convolutional recurrent neural network (acrnn). the main contributions of this paper are as follows:

1) we propose a 2-d crnn for ser that improves the ability to capture the time-frequency relationship.
2) we employ an additional attention model that automatically concentrates on the emotion-critical frames and produces discriminative utterance-level features for ser, to cope with silent frames and emotion-irrelevant frames.
3) experimental results include the accuracy, recall, precision and confusion matrix of the proposed model.

it is well known that most speech emotion datasets only have utterance-level class labels. most sentences, however, contain silent regions, short pauses, transitions between phonemes, unvoiced phonemes, and so on. clearly, not all parts of a sentence are emotionally relevant. unfortunately, lstm does not handle this situation well when analyzing acoustic features extracted from voice.
in emotion classification it is therefore useful to distinguish between emotionally relevant and emotionally irrelevant frames, and to know whether a speech frame is voiced or unvoiced. two kinds of method are commonly used: manually extracting the emotionally relevant speech frames, and using models that learn to distinguish them automatically. manual extraction requires different thresholds on different datasets, which limits its feasibility. human emotional expression is often gradual, so each voiced frame is useful for emotion classification. attention mechanisms can better match human emotional expression by capturing only the affective frames. in earlier work, local attention was added to lstm and different weights were assigned to each frame according to its emotional intensity.

2. literature survey

on the iemocap dataset, sarthak tripathi and homayoon beigi performed multimodal emotion detection and determined the best individual architectures for classifying each modality using data from speech, text and motion capture. the design of their merged model is modular, which makes it possible to upgrade any individual model without affecting the other modalities. they used motion capture data and 2d convolutions in place of video recordings and 3d convolutions (tripathi et al., 2018).

for the arabic dataset ksuemotions, mohammed zakariah and yaser mohammad seddiq performed speech emotion recognition. the feature extraction method used the time-frequency data from the spectrogram together with numerous modification and filtering techniques. the system was trained at the segment level and tested at both the file and segment levels (maji & swain, 2022).

to automatically extract affective salient features from raw spectral data, yawei mu and luis a.
hernandez gomez presented a distributed convolution neural network (cnn). they then applied a bidirectional recurrent neural network (brnn) to the cnn output to obtain temporal information. finally, they used an attention mechanism to target the emotion-relevant portions of the utterance in the brnn output sequence (jiang et al., 2021).

a convolutional-recurrent neural network with multiple attention mechanisms (crnn-ma) was proposed for ser by p. jiang, x. xu, h. tao, l. zhao, and c. zou. it feeds extracted mel-spectrums and frame-level features into parallel convolutional neural network (cnn) and long short-term memory (lstm) modules, respectively. the strategies they introduced for crnn-ma include a multi-dimensional attention layer and multiple self-attention layers in the cnn module, applied to frame-level weight components (jiang et al., 2021).

yadav, o. p., bastola, l. p., and sharma, j. presented a convolutional recurrent neural network (crnn) that combines a cnn and a bidirectional long short-term memory (bilstm) network to learn emotional features from log-mel scaled spectrograms of spoken utterances. the cnn convolution kernels learn local features, and a bilstm layer learns the temporal dependencies from those local features. speech utterances are preprocessed to remove distracting sounds and unnecessary information. methods for increasing the number of data samples are also studied, and the best ones are chosen to improve the model's recognition rate (yadav et al., 2021).

without employing any conventional hand-crafted features, wootaek lim, daeyoung jang, and taejin lee developed an ser approach based on concatenated cnns and rnns. cnns have exceptional recognition ability, particularly for computer vision tasks. recurrent neural networks (rnns) also perform sequential data processing tasks with a high degree of success.
applied to an emotional speech database, the proposed methods achieved higher classification accuracy than traditional classification methods (lim et al., 2016).

silent frames and emotionally irrelevant frames are frequent problems for speech emotion recognition (ser). meanwhile, attention mechanisms have proved exceptionally effective at learning feature representations that are relevant to a particular task. using the mel-spectrogram with deltas and delta-deltas as input, gayathri, p., priya, p. g., sravani, l., johnson, s., and sampath, v. presented an attention-based convolutional recurrent neural network (acrnn) to learn discriminative features for ser. their test results demonstrated the viability of the approach and achieved state-of-the-art performance in terms of unweighted average recall (gayathri et al., 2020).

2.1. proposed models and experimental setup

the proposed model for speech emotion recognition is a 2-d attention-based convolutional recurrent neural network.

2.2. speech emotion recognition

this section explains the proposed 2-d attention-based convolutional recurrent neural network. a convolutional neural network (cnn, or convnet) is particularly adept at processing input with a grid-like structure, such as an image; a digital image is a binary representation of visual data. recurrent neural networks (rnns) are neural networks in which the results of one step are fed into the computations of the next step; they operate on sequential or time series data. the convolutional recurrent neural network (crnn) model extracts features from successive windows by feeding each window frame by frame into the recurrent layer and using the outputs and hidden states of the recurrent units in each frame.
here we combine an attention mechanism with the cnn and rnn. attention makes learning easier and improves its quality by concentrating on certain portions of the input sequence in order to predict a particular portion of the output sequence. feature extraction converts raw data into manageable numerical features while preserving the information in the original data; it generally produces better outcomes than applying machine learning or deep learning models to the raw data directly. the log mel-spectrogram is used for feature extraction. the acrnn architecture, which combines a crnn with an attention model, is then applied, followed by a fully connected layer and a softmax layer for ser, as depicted in fig. 1.

fig. 1. acrnn architecture

cnns have recently demonstrated remarkable results in the ser field. the time domain and the frequency domain are equally important, and 2-dimensional convolution performs better with less data than 1-dimensional convolution. ser results, however, vary greatly between speakers because of large variations in tone, voice and other individual characteristics. to handle this variation, the log-mels with deltas and delta-deltas are used as the acrnn input, where the deltas and delta-deltas describe how the emotion changes over time.

the mel scale is a scale of pitches that sound equally distant from one another to the human ear. the distance in hertz between mel scale values, often called "mels", increases as the frequency increases. the name mel, which stands for melody, indicates that the scale is founded on pitch comparisons. extensive tests have shown that the mel spectrum matches the characteristics of human hearing, which is approximately linear below 1000 hz and logarithmic above 1000 hz; for this reason it is used to obtain the static log-mel spectrum.
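the mel scale described above can be made concrete with a conversion formula. the sketch below assumes the widely used 2595 * log10(1 + f / 700) variant; the text does not say which mel formula the authors used, and the function names hz_to_mel and mel_to_hz are illustrative only.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed 2595*log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


if __name__ == "__main__":
    # Compare the width of two octave bands in Hz and in mels.
    for low, high in [(1000.0, 2000.0), (2000.0, 4000.0)]:
        width_mel = hz_to_mel(high) - hz_to_mel(low)
        print(f"{low:.0f}-{high:.0f} hz: {high - low:.0f} hz wide, {width_mel:.1f} mels wide")
```

in hz the second band is twice as wide as the first, while in mels the two widths are much closer, which is the compression of high frequencies that the text describes.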
frequency and the mel spectrum are directly related. a mel spectrogram renders frequencies above a specific threshold (the corner frequency) logarithmically. for instance, in a spectrogram with a linear scale, the vertical space between 1,000 and 2,000 hz is half that between 2,000 and 4,000 hz, whereas in the mel spectrogram the two ranges occupy almost the same space. like human hearing, this scaling makes similar low-frequency sounds easier to tell apart than similar high-frequency sounds. the output of a mel spectrogram is obtained by multiplying the frequency-domain values by a filter bank.

given a speech signal, it is first normalized to zero mean and unit variance to reduce the differences between speakers. the signal is then divided into short frames using hamming windows with a duration of 25 ms and a shift of 10 ms. for each frame, the power spectrum is computed using the discrete fourier transform (dft) and then passed through mel-filter bank $i$ to produce the output $p_i$. the logarithm of $p_i$ gives the log-mels $m_i$, as shown in (1). the deltas of the log-mels are computed with (2), where $N$ is usually set to 2. similarly, the delta-deltas are computed as the time derivative of the deltas, as shown in (3).

$$m_i = \log(p_i) \tag{1}$$

$$m_i^{d} = \frac{\sum_{n=1}^{N} n\,(m_{i+n} - m_{i-n})}{2\sum_{n=1}^{N} n^{2}} \tag{2}$$

$$m_i^{dd} = \frac{\sum_{n=1}^{N} n\,(m_{i+n}^{d} - m_{i-n}^{d})}{2\sum_{n=1}^{N} n^{2}} \tag{3}$$

after computing the log-mels with deltas and delta-deltas, we obtain a 3-d feature representation $X \in \mathbb{R}^{t \times f \times c}$ as the cnn input, where $t$ is the time (frame) length, $f$ is the number of mel-filter banks, and $c$ is the number of feature channels. as in speech recognition [17], we set $f$ to 40 and $c$ to 3 in this task; the three channels are the static log-mels, the deltas, and the delta-deltas.

2.3. acrnn architecture

in this part, we integrate the crnn with an attention model, using the 2-d log-mels as input.
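the three-channel log-mel input defined by (1)-(3) can be sketched in a few lines of numpy. this is a minimal illustration rather than the authors' implementation: the function names, the default of 2 frames on each side, and the repetition of edge frames at the utterance boundaries are assumptions, and the log-mel matrix is taken as already computed.

```python
import numpy as np


def deltas(feat: np.ndarray, n_max: int = 2) -> np.ndarray:
    """Regression deltas of a (frames, bands) matrix, following eq. (2).

    Frames beyond the utterance boundaries are approximated by repeating
    the first/last frame (an assumption; the paper does not specify this).
    """
    num_frames = feat.shape[0]
    padded = np.pad(feat, ((n_max, n_max), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, n_max + 1))
    out = np.zeros(feat.shape, dtype=float)
    for n in range(1, n_max + 1):
        ahead = padded[n_max + n : n_max + n + num_frames]   # m_{i+n}
        behind = padded[n_max - n : n_max - n + num_frames]  # m_{i-n}
        out += n * (ahead - behind)
    return out / denom


def stack_channels(log_mel: np.ndarray, n_max: int = 2) -> np.ndarray:
    """Stack static, delta and delta-delta features into shape (t, f, 3)."""
    d = deltas(log_mel, n_max)
    dd = deltas(d, n_max)  # eq. (3): deltas of the deltas
    return np.stack([log_mel, d, dd], axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    log_mel = rng.normal(size=(300, 40))  # placeholder for 300 frames, 40 mel bands
    print(stack_channels(log_mel).shape)
```

with 40 mel bands the result has the (t, 40, 3) layout that the text gives for the cnn input.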
a 2-d cnn performs convolution over patches of the log-mels that contain only a few frames. the sequential 2-d cnn features are then fed into a long short-term memory (lstm) network for temporal summarization. the resulting series of high-level features enters the attention layer, which outputs utterance-level features. finally, the utterance-level features are used as input to a fully connected layer to obtain higher-level features for ser.

1) crnn model: the crnn extracts high-level features for ser from the given 2-d log-mels. the crnn used here consists of several 2-d convolution layers, one 2-d max-pooling layer, one linear layer, and one lstm layer. each convolutional layer has a 5 x 2 filter size; the first convolutional layer has 128 feature maps and the subsequent convolutional layers have 256 feature maps. we use a single max-pooling layer, with pooling size 2 x 2, after the first convolutional layer. adding a linear layer before feeding the 2-d cnn features into the lstm layer effectively reduces the number of model parameters without compromising accuracy; we find that a linear layer with 768 output units is appropriate as a dimension-reduction layer after the 2-d cnn. the 2-d cnn sequence features are then passed through a bidirectional rnn with 128 cells in each direction for temporal summarization. as a result, a sequence of 256-dimensional high-level feature representations is obtained.

2) attention layer: because not all frame-level crnn features contribute equally to the representation of speech emotion, an attention layer is employed to focus on emotion-relevant sections and produce discriminative utterance-level representations for ser.
instead of simply applying mean or max pooling across time, an attention model scores how much each high-level representation contributes to the utterance-level emotion representation. specifically, we first compute the normalized weight $a_t$ from the lstm output $h_t$ at time step $t$ using a softmax function, as in (4). we then compute the utterance-level representation $c$ as the weighted sum of $h_t$ with these weights, as in (5).

$$a_t = \frac{\exp(W \cdot h_t)}{\sum_{\tau=1}^{T} \exp(W \cdot h_\tau)} \tag{4}$$

$$c = \sum_{t=1}^{T} a_t h_t \tag{5}$$

finally, the utterance-level representations are fed through a fully connected layer with 64 output units to obtain higher-level representations that help the softmax classifier map the utterance representations into n different spaces, where n is the number of emotion classes. batch normalization (gayathri et al., 2020) is applied to the fully connected layer to speed up training and improve generalization.

we conduct ser experiments on the interactive emotional dyadic motion capture database (iemocap) to assess the performance of the proposed model. iemocap consists of five sessions; its utterances last 4.5 seconds on average and are sampled at 16 khz. each session is performed by two speakers (one male and one female) in both scripted and improvised scenes. only four emotions are considered here: angry, sad, happy and neutral. evaluation uses 10-fold cross validation: of the ten speakers, eight are used for training, one for testing and the remaining one for validation. we repeat each evaluation several times with different random seeds to obtain more reliable results. we divide the signal into 3 segments of equal length for better parallel acceleration.
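the attention pooling in (4) and (5) reduces a sequence of frame-level features to a single utterance-level vector. a minimal numpy sketch follows, assuming h holds the recurrent outputs as a (t, d) array and w is a learned d-dimensional weight vector; both names and shapes are illustrative, and the values here are random rather than trained. subtracting the maximum score is a standard numerical-stability step that leaves the softmax result unchanged.

```python
import numpy as np


def attention_pool(h: np.ndarray, w: np.ndarray):
    """Attention pooling over time.

    h: (T, D) frame-level features, w: (D,) weight vector.
    Returns (a, c): weights a of shape (T,) as in eq. (4) and the
    utterance-level vector c of shape (D,) as in eq. (5).
    """
    scores = h @ w                   # W . h_t for every time step
    scores = scores - scores.max()   # stability shift; softmax is unchanged
    a = np.exp(scores)
    a = a / a.sum()                  # eq. (4)
    c = a @ h                        # eq. (5): weighted sum of the h_t
    return a, c


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.normal(size=(150, 256))  # e.g. 150 frames of 256-dimensional features
    w = rng.normal(size=256)
    a, c = attention_pool(h, w)
    print(a.shape, c.shape)
```

when all scores are equal the weights become uniform and c reduces to mean pooling, so mean pooling is a special case of this layer.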
speech utterances shorter than 3 s are padded with zeros. the log-mels of the training and testing data are normalized using the global mean and standard deviation of the training set, with a window size of 25 ms and a shift of 10 ms. the model is implemented with the tensorflow and keras libraries (see figs. 2-4).

fig. 2. workflow for azure machine learning
fig. 3. workflow for azure machine learning
fig. 4. classification report of 2d attention based crnn

fig. 2 shows the classification report of the 1d cnn lstm, which has an accuracy of 56%, a precision of 59% and a recall of 56%. fig. 3 shows the classification report of the temporal 2-d cnn, which has an accuracy of 58%, a precision of 59% and a recall of 58%. fig. 4 shows the classification report of the 2-d acrnn, which has an accuracy of 68%, a precision of 67% and a recall of 68%. thus, our acrnn model performs better than the other models (see fig. 5).

fig. 5. classification report of 2d attention based crnn

fig. 5 displays the confusion matrix of the acrnn model. there are four emotions: 0 represents angry, 1 represents sad, 2 represents happy and 3 represents neutral. the diagonal values are the correctly predicted samples. the accuracy of our proposed 2-d acrnn model is 68%, which is higher than the accuracy of the 1d cnn lstm and the t-2d cnn. the weighted precision of our model is 0.67, the weighted recall is 0.68, and the weighted f1 score is 0.67. all these values are higher than the corresponding values for the 1d cnn lstm and the t-2d cnn, so our model outperforms these ser models on all metrics. the 3-d attention-based crnn of chen et al. (2018) has an average recall of 64.74%; our 2-d attention-based crnn outperforms it with a recall of 68% (see fig. 6).

fig. 6. comparison of models and their evaluation metrics

fig. 6 plots the evaluation metrics of each model. our model is the best on all metrics when compared with the other two models.

references

chen, m., he, x., yang, j., & zhang, h. (2018). 3-d convolutional recurrent neural networks with attention model for speech emotion recognition. ieee signal processing letters, 25(10), 1440-1444.
gayathri, p., priya, p. g., sravani, l., johnson, s., & sampath, v. (2020). convolutional recurrent neural networks based speech emotion recognition. journal of computational and theoretical nanoscience, 17(8), 3786-3789.
huang, c. w., & narayanan, s. s. (2016, september). attention assisted discovery of sub-utterance structure in speech emotion recognition. in interspeech (pp. 1387-1391).
huang, c., gong, w., fu, w., & feng, d. (2014). a research of speech emotion recognition based on deep belief network and svm. mathematical problems in engineering, 2014.
jiang, p., xu, x., tao, h., zhao, l., & zou, c. (2021). convolutional-recurrent neural networks with multiple attention mechanisms for speech emotion recognition. ieee transactions on cognitive and developmental systems.
khalil, r. a., jones, e., babar, m. i., jan, t., zafar, m. h., & alhussain, t. (2019). speech emotion recognition using deep learning techniques: a review. ieee access, 7, 117327-117345.
lalitha, s., mudupu, a., nandyala, b. v., & munagala, r. (2015, december). speech emotion recognition using dwt. in 2015 ieee international conference on computational intelligence and computing research (iccic) (pp. 1-4). ieee.
lee, j., & tashev, i. (2015, september). high-level feature representation using recurrent neural network for speech emotion recognition. in interspeech 2015.
lim, w., jang, d., & lee, t. (2016, december). speech emotion recognition using convolutional and recurrent neural networks. in 2016 asia-pacific signal and information processing association annual summit and conference (apsipa) (pp. 1-4). ieee.
maji, b., & swain, m. (2022). advanced fusion-based speech emotion recognition system using a dual-attention mechanism with conv-caps and bi-gru features. electronics, 11(9), 1328.
mao, q., dong, m., huang, z., & zhan, y. (2014). learning salient features for speech emotion recognition using convolutional neural networks. ieee transactions on multimedia, 16(8), 2203-2213.
nwe, t. l., foo, s. w., & de silva, l. c. (2003, december). detection of stress and emotion in speech using traditional and fft based log energy features. in fourth international conference on information, communications and signal processing, 2003 and the fourth pacific rim conference on multimedia. proceedings of the 2003 joint (vol. 3, pp. 1619-1623). ieee.
trigeorgis, g., ringeval, f., brueckner, r., marchi, e., nicolaou, m. a., schuller, b., & zafeiriou, s. (2016, march). adieu features? end-to-end speech emotion recognition using a deep convolutional recurrent network. in 2016 ieee international conference on acoustics, speech and signal processing (icassp) (pp. 5200-5204). ieee.
tripathi, s., tripathi, s., & beigi, h. (2018). multi-modal emotion recognition on iemocap dataset using deep learning. arxiv preprint arxiv:1804.05788.
tzirakis, p., zhang, j., & schuller, b. w. (2018, april). end-to-end speech emotion recognition using deep neural networks. in 2018 ieee international conference on acoustics, speech and signal processing (icassp) (pp. 5089-5093). ieee.
yadav, o. p., bastola, l. p., & sharma, j. (2021). speech emotion recognition using convolutional recurrent neural network.
31 | international journal of informatics information system and computer engineering 4(1) (2023) 31-48
doi: https://doi.org/10.34010/injiiscom.v4i1.9570 p-issn 2810-0670 e-issn 2775-5584

a computational bibliometric analysis of esport management using vosviewer

selvia lorena br ginting
faculty of information science and technology, universiti kebangsaan malaysia, malaysia
corresponding email: selvialorena@yahoo.com

abstract

this project combines a visualization study using vosviewer with the publish or perish software to conduct a computerized bibliometric analysis of the phrase "esports management." the method is a descriptive-quantitative approach in conjunction with bibliometric analysis. the data was obtained from the google scholar search results for "esports management." there were 999 articles published between 2017 and 2022, with an increase each year except from 2021 to 2022: 58 articles in 2017, 92 in 2018, 160 in 2019, 242 in 2020, and a large rise to 335 in 2021. in 2022, however, the number of articles decreased significantly to 64. the findings indicate that several sectors of esports management remain understudied and could be examined further to increase the efficacy of management in esports. it is anticipated that this research will also serve as an example for further studies in defining and evaluating the research subject, as well as for the management teams of esports participants.

article history: submitted/received 25 dec 2022; first revised 27 jan 2023; accepted 01 apr 2023; first available online 12 apr 2023; publication date 01 jun 2023

keywords: bibliometrics, esports management, data analysis, vosviewer.

1. introduction

esports is one of the first industries to confront the challenge of transitioning from global to local and from online to offline as a whole ecosystem (scholz, 2019; taylor, 2012; scholz, 2020). even though there are still disputes about the nature of esport as a "sport" (franke, 2015; hutchins, 2008; jenny et al., 2017; jonasson & thiborg, 2010; witkowski, 2010), the stronger opinion holds that it can be considered a sport based on factors such as the presence of opponents, rules and ethics, strategy, and winning and defeat (kenzhekanova, 2015). because esports is a relatively new industry, it is still evolving, and it has seen several groundbreaking changes in recent decades. since the financial model of the esports industry is unstable, esports organizations prioritize risk management related to future developments such as new markets, franchising, new titles, and the existing fragmentation of the esports industry (kenzhekanova, 2015). this makes it important to govern esports properly as a growing industry (scholz, 2019). however, research on esports is fragmented and contains multiple understudied fields, which makes it hard to study the topic thoroughly. a bibliometric study of esports, particularly of esports management, is one way to discover such understudied subjects. numerous bibliometric analyses have been conducted in different domains.
examples include digital learning (husaeni & nandiyanto, 2022), computer science (husaeni & nandiyanto, 2023), vocational school (husaeni & nandiyanto, 2023), high school (husaeni & nandiyanto, 2023), covid-19 (hamidah et al., 2020), and scientific publications (husaeni et al., 2022; soegoto et al., 2022). there have also been studies on esports, such as that of sousa et al., which addressed physiological and cognitive functions in competitive esports matches (sousa et al., 2020). chiu et al. undertook a bibliometric study of esports in general (chiu et al., 2021). yamanaka et al., büyükbaykal and burak, arwendria, and kurnia each carried out separate studies on the same subject as chiu et al. (yamanaka et al., 2021; büyükbaykal & burak, 2020; arwendria, 2021; kurnia, 2021). however, no bibliometric examination of esports management has been conducted. this research therefore aims to conduct a bibliometric analysis of esports management. a descriptive-quantitative approach was used, with a literature review as the data collection method. to analyse the data, vosviewer was used to illustrate the connections between terms and to discover term and publication trends between 2017 and 2022.

2. method

this study employed quantitative, descriptive, and bibliometric techniques. we compiled information from previously published journals indexed by google scholar, because google scholar is one of the most easily accessible sources of journals. using the program publish or perish, we conducted a literature review on "esports management". publish or perish was chosen to extract bibliometric data on the study subject (jenny et al., 2017). once the data has been saved from publish or perish as a *.ris file, it can be viewed with the vosviewer application.
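the per-year publication counts used in this kind of analysis can be derived from the exported *.ris file. the sketch below is a hypothetical helper, not part of publish or perish or vosviewer; it assumes the usual ris line layout of a two-letter tag followed by two spaces and a hyphen, a year stored under the PY or Y1 tag, and ER closing each record. real exports may differ, so treat it as a starting point.

```python
from collections import Counter


def count_by_year(ris_text: str) -> Counter:
    """Count records per publication year in RIS-formatted text.

    Each record contributes at most one year, taken from the first
    PY or Y1 line seen before the closing ER line.
    """
    counts = Counter()
    year = None
    for line in ris_text.splitlines():
        if line[2:5] != "  -":       # not a "TAG  - value" line
            continue
        tag, value = line[:2], line[6:].strip()
        if tag in ("PY", "Y1") and year is None and value[:4].isdigit():
            year = int(value[:4])
        elif tag == "ER":            # end of record
            if year is not None:
                counts[year] += 1
            year = None
    return counts


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = (
        "TY  - JOUR\nTI  - example article a\nPY  - 2017\nER  - \n"
        "TY  - JOUR\nTI  - example article b\nPY  - 2020\nER  - \n"
    )
    for year, n in sorted(count_by_year(sample).items()):
        print(year, n)
```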
this research uses vosviewer 1.6.17 and publish or perish 8 as its data collection and analysis programs. we combed through the data and used the pertinent records to support our claims on esports management. we retrieved data from google scholar using the term "esports management", matched against the title, keyword, and abstract fields of the publish or perish program; 999 records on the study of esports management were obtained. the research papers considered were published between 2017 and 2022. the compiled articles were then saved in *.ris format. after that, we created visualizations with the vosviewer program and used bibliometric maps to assess trends, mapping the article data from the prepared database source. vosviewer provides three types of map: network visualization, overlay visualization, and density visualization. we also applied filters to the terms shown in the vosviewer maps (jonasson & thiborg, 2010).

3. results and discussion

3.1. research developments in the field of esports management

fig. 1 shows how research on esports management evolved between 2017 and 2022. the number of publications increased every year from 2017 to 2021 and then fell in 2022: there were 58 articles in 2017, 92 in 2018, 160 in 2019, 242 in 2020, and a significant increase to 335 in 2021, followed by a drastic decrease to 64 publications in 2022.
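the year-on-year changes behind this trend follow directly from the counts quoted above. the short sketch below only restates those figures; the variable names are illustrative. it also shows that the six yearly counts sum to 951, whereas 999 search results are reported, and the text does not explain the difference of 48.

```python
# Yearly publication counts as quoted in the text.
counts = {2017: 58, 2018: 92, 2019: 160, 2020: 242, 2021: 335, 2022: 64}
reported_total = 999  # number of search results reported in the text

years = sorted(counts)
# Change relative to the previous year.
changes = {year: counts[year] - counts[prev] for prev, year in zip(years, years[1:])}
total = sum(counts.values())

if __name__ == "__main__":
    for year in years[1:]:
        print(f"{year}: {changes[year]:+d} articles vs {year - 1}")
    print("sum of yearly counts:", total)
    print("gap to reported total:", reported_total - total)
```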
we found 999 publications that match the study subject in the publish or perish search results. from these, we selected the 20 most-cited papers, drawn from 20 different journals and books (table 1).

fig. 1. level of research development on esports management

table 1. article data in the field of esports management

no | authors | title | year | cites | refs
1 | hamari and sjöblom | what is esports and why do people watch it? | 2017 | 900 | (hamari, 2017)
2 | jenny et al. | virtual (ly) athletes: where esports fit within the definition of “sport” | 2017 | 455 | (jenny, 2017)
3 | hallmann & giel | esports–competitive sports or recreational activity? | 2018 | 332 | (hallmann, 2018)
4 | bányai et al. | the psychology of esports: a systematic literature review | 2019 | 211 | (banyai et al., 2019)
5 | reitman et al. | esports research: a literature review | 2020 | 193 | (reitman et al., 2020)
6 | pizzo et al. | esport vs sport: a comparison of spectator motives | 2018 | 175 | (pizzo, 2018)
7 | difrancisco et al. | managing the health of the esport athlete: an integrated health management model | 2019 | 126 | (difrancisco et al., 2019)
8 | himmelstein et al. | an exploration of mental skills among competitive league of legend players | 2021 | 119 | (himmelstein et al., 2021)
9 | li | good luck have fun: the rise of esports | 2017 | 117 | (li, 2017)
10 | holden et al. | the future is now: esports policy considerations and potential litigation | 2017 | 108 | (holden et al., 2017)
11 | kane & spradley | recognizing esports as a sport | 2017 | 97 | (kane & spradley, 2017)
12 | ye et al. | mastering complex control in moba games with deep reinforcement learning | 2020 | 94 | (ye et al., 2020)
13 | jenny et al. | esports venues: a new sport business opportunity | 2018 | 91 | (jenny, 2018)
14 | griffiths | the psychosocial impact of professional gambling, professional video gaming & esports | 2017 | 71 | (griffiths, 2017)
15 | freeman & wohn | esports as an emerging research context at chi: diverse perspectives on definitions | 2017 | 70 | (freeman & wohn, 2017)
16 | ströh | the esports market and esports sponsoring | 2017 | 60 | (stroh, 2017)
17 | schaeperkoetter et al. | the “new” student-athlete: an exploratory examination of scholarship esports players | 2017 | 53 | (schaeperkoetter, 2017)
18 | chung et al. | will esports result in a higher prevalence of problematic gaming? a review of the global situation | 2019 | 49 | (chung, 2019)
19 | ye et al. | towards playing full moba games with deep reinforcement learning | 2020 | 44 | (ye, 2020)
20 | nagorsky & wiemeyer | the structure of performance and training in esports | 2020 | 43 | (nagorsky, 2020)

table 1 lists the twenty papers that meet the selection requirements. among these 20 publications, the highest citation count is 900 and the lowest is 43. the most-cited article, with 900 citations, was published in 2017, and the most-cited article from 2020 has 193 citations.

3.2. visualization of esports management topic area using vosviewer

al husaeni and nandiyanto claim that the vosviewer software is used in the esports management field because it has a limited number of relationships (büyükbaykal & burak, 2020). in this investigation, however, vosviewer was set to require a minimum of three connections.
thus, the end result consists of 26 items in a total of 10 clusters. using analytical mapping and visualization, the esports management landscape was divided as follows:
(i) cluster 1 (5 items): esport, esports consumer, esports game, esports research, gaming (see fig. 2).
(ii) cluster 2 (4 items): esports competition, esports event, esports tournament, sport management (see fig. 3).
(iii) cluster 3 (4 items): audience, esports community, esports fan, esports sponsorship (see fig. 4).
(iv) cluster 4 (3 items): digitalization, sport, strategic management (see fig. 5).
(v) cluster 5 (3 items): participant, professional esports player, video game (see fig. 6).
(vi) cluster 6 (3 items): athlete, esports athlete, esports organization (see fig. 7).
(vii) cluster 7 (1 item): competitive gaming (see fig. 8).
(viii) cluster 8 (1 item): development (see fig. 9).
(ix) cluster 9 (1 item): gambling (see fig. 10).
(x) cluster 10 (1 item): los esport (see fig. 11).

cluster 1 is shown in red, cluster 2 in green, cluster 3 in dark blue, cluster 4 in yellow, cluster 5 in purple, cluster 6 in cyan, cluster 7 in orange, cluster 8 in brown, cluster 9 in pink, and cluster 10 in draco turquoise.

fig. 2. cluster 1 visualization esports management network
fig. 3. cluster 2 visualization esports management network
fig. 4. cluster 3 visualization esports management network
fig. 5. cluster 4 visualization esports management network
fig. 6. cluster 5 visualization esports management network
fig. 7. cluster 6 visualization esports management network
fig. 8. cluster 7 visualization esports management network
fig. 9. cluster 8 visualization esports management network
fig. 10. cluster 9 visualization esports management network
fig. 11. cluster 10 visualization esports management network

3.3. network visualization of esports management topic area using vosviewer

network visualization is the first of the three mapping categories in vosviewer. it represents the terms as a connected network, in which a line joining two items indicates an existing relationship between them (see fig. 12). fig. 12 shows the network visualization for the item "esports game" obtained using vosviewer, together with the cluster to which each investigated term belongs.
the esports management climate, which includes cluster 10 and has a total link strength of 40 and an occurrence count of 43, is shown in fig. 12. it is connected to the following clusters: cluster 1 (5 items): esport, esports consumer, esports game, esports research, gaming; cluster 2 (4 items): esports competition, esports event, esports tournament, sport management; cluster 3 (4 items): audience, esports community, esports fan, esports sponsorship; cluster 4 (3 items): digitalization, sport, strategic management; cluster 5 (3 items): participant, professional esports player, video game; cluster 6 (3 items): athlete, esports athlete, esports organization; cluster 7 (1 item): competitive gaming; cluster 8 (1 item): development; cluster 9 (1 item): gambling; cluster 10 (1 item): los esport.

3.4. overlay visualization of esports management topic area using vosviewer

the second visualization type in vosviewer is overlay visualization. overlay mapping highlights how recent a research term is. the novelty of terms in research related to esports management is shown in fig. 13.

fig. 12. visualization esports management network
fig. 13. overlay esports management visualization

in the overlay depiction, the popularity of each term by year can be seen. different hues indicate the period in which a term was studied. in this research, the years 2017 to 2022 are considered.
darker colors, approaching purple, indicate that research on a term dates from closer to 2017, while lighter colors, approaching yellow, indicate more recent research.

3.5. density visualization of esports management

density visualization is the third and final mapping type in vosviewer. fig. 14 depicts the density visualization for esports management. in this view, the color around a term reflects how often it is studied. if the color becomes paler, interest in the term is increasing. conversely, if the color becomes darker or more faded, the frequency of research on the term is decreasing. terms shown in yellow in fig. 14 have a relatively large diameter; these are emission, esports, esports management, gaming, and development. the density map is based on all retrieved articles on esports management from 2017 to 2022. in fig. 14, a yellow area with a larger circle diameter indicates a higher keyword density, meaning that the keyword is more prevalent. if the color fades or blends with the green background, the keyword appears less frequently.

fig. 14. visualization density esports management network

4. conclusion

the purpose of this study is to evaluate and assess the bibliometric literature on esports management. the keyword "esports management" was used to retrieve data based on a subject area covering keywords, abstracts, and titles. after data processing and filtering, 999 relevant articles were obtained. the vosviewer software was used to generate the mapping data.
network, overlay, and density visualizations were used to map the data. the mapping and analysis with vosviewer show that research retrieved with the term esports management increased each year from 2017 to 2021 and then decreased in 2022. this research used bibliometric methods to identify the main themes of previous studies in the field, which is important for assessing novelty in future research.

references

al husaeni, d. f., & nandiyanto, a. b. d. (2022). bibliometric using vosviewer with publish or perish (using google scholar data): from step-by-step processing for users to the practical examples in the analysis of digital learning articles in pre and post covid-19 pandemic. asean journal of science and engineering, 2(1), 19-46.
al husaeni, d. f., & nandiyanto, a. b. d. (2022). mapping visualization analysis of computer science research data in 2017-2021 on the google scholar database with vosviewer. international journal of informatics, information system and computer engineering (injiiscom), 3(1), 1-18.
al husaeni, d. f., nandiyanto, a. b. d., & maryanti, r. (2023). bibliometric analysis of educational research in 2017 to 2021 using vosviewer: google scholar indexed research. indonesian journal of teaching in science, 3(1), 1-8.
al husaeni, d. n., & nandiyanto, a. b. d. (2023). bibliometric analysis of high school keyword using vosviewer indexed by google scholar. indonesian journal of educational research and technology, 3(1), 1-12.
bányai, f., griffiths, m. d., király, o., & demetrovics, z. (2019).
the psychology of esports: a systematic literature review. journal of gambling studies, 35, 351-365.
büyükbaykal, n. g., & burak, i̇. l. i̇. (2020). e-spor kavramına yönelik araştırmaların bibliyometrik analizi [bibliometric analysis of researches for e-sport]. uluslararası kültürel ve sosyal araştırmalar dergisi, 6(2), 572-583.
chiu, w., fan, t. c. m., nam, s. b., & sun, p. h. (2021). knowledge mapping and sustainable development of esports research: a bibliometric and visualized analysis. sustainability, 13(18), 10354.
chung, t., sum, s., chan, m., lai, e., & cheng, n. (2019). will esports result in a higher prevalence of problematic gaming? a review of the global situation. journal of behavioral addictions, 8(3), 384-394.
difrancisco-donoghue, j., balentine, j., schmidt, g., & zwibel, h. (2019). managing the health of the esport athlete: an integrated health management model. bmj open sport & exercise medicine, 5(1), e000467.
franke, t. (2013). the perception of esports-mainstream culture, real sport and marketisation. esports yearbook, 14, 111-144.
freeman, g., & wohn, d. y. (2017, may). esports as an emerging research context at chi: diverse perspectives on definitions. in proceedings of the 2017 chi conference extended abstracts on human factors in computing systems (pp. 1601-1608).
griffiths, m. d. (2017). the psychosocial impact of professional gambling, professional video gaming & esports. casino & gaming international, 28, 59-63.
hallmann, k., & giel, t. (2018). esports–competitive sports or recreational activity? sport management review, 21(1), 14-20.
hamari, j., & sjöblom, m. (2017). what is esports and why do people watch it? internet research, 27(2), 211-232.
hamidah, i., sriyono, s., & hudha, m. n. (2020).
a bibliometric analysis of covid-19 research using vosviewer. indonesian journal of science and technology, 34-41.
himmelstein, d., liu, y., & shapiro, j. l. (2021). an exploration of mental skills among competitive league of legend players. in research anthology on rehabilitation practices and therapy (pp. 1607-1629). igi global.
holden, j. t., kaburakis, a., & rodenberg, r. (2017). the future is now: esports policy considerations and potential litigation. j. legal aspects sport, 27, 46.
hutchins, b. (2008). signs of meta-change in second modernity: the growth of e-sport and the world cyber games. new media & society, 10(6), 851-869.
jenny, s. e., keiper, m. c., taylor, b. j., williams, d. p., gawrysiak, j., manning, r. d., & tutka, p. m. (2018). esports venues: a new sport business opportunity. journal of applied sport management, 10(1), 8.
jenny, s. e., manning, r. d., keiper, m. c., & olrich, t. w. (2017). virtual (ly) athletes: where esports fit within the definition of “sport”. quest, 69(1), 1-18.
jonasson, k., & thiborg, j. (2010). electronic sport and its impact on future sport. sport in society, 13(2), 287-299.
kane, d., & spradley, b. d. (2017). recognizing esports as a sport. the sport journal, 19(5).
kenzhekanova, k. k. (2015). linguistic features of political discourse. mediterranean journal of social sciences, 6(6 s2), 192.
kurnia, s. (2021). science, technology, engineering, art and mathematics (steam) di pendidikan sains: analisis bibliometrik dan pemetaan literatur penelitian menggunakan perangkat lunak vosviewer (doctoral dissertation, uin raden intan lampung).
li, r. (2017). good luck have fun: the rise of esports. simon and schuster.
masruroh, b., laksana, e. p., rosyida, f., harianti, l. r., & maysa, f. (2022). analisis sitasi jurnal pendidikan geografi: kajian, teori, dan praktik dalam bidang pendidikan dan ilmu geografi periode 2019-2021. jurnal integrasi dan harmoni inovatif ilmu-ilmu sosial (jihi3s), 2(3), 204-209.
mulyawati, i. b., & ramadhan, d. f. (2021). bibliometric and visualized analysis of scientific publications on geotechnics fields. asean journal of science and engineering education, 1(1), 37-46.
nagorsky, e., & wiemeyer, j. (2020). the structure of performance and training in esports. plos one, 15(8), e0237584.
nandiyanto, a. b. d., al husaeni, d. n., & al husaeni, d. f. (2021). a bibliometric analysis of chemical engineering research using vosviewer and its correlation with covid-19 pandemic condition. journal of engineering science and technology, 16(6), 4414-4422.
nandiyanto, a. b. d., girsang, g. c. s., maryanti, r., ragadhita, r., anggraeni, s., fauzi, f. m., ... & al-obaidi, a. s. m. (2020). isotherm adsorption characteristics of carbon microparticles prepared from pineapple peel waste. communications in science and technology, 5(1), 31-39.
pizzo, a., baker, b., na, s., lee, m., kim, d., & funk, d. (2018). esport vs sport: a comparison of spectator motives. faculty/researcher works.
ragadhita, r., & nandiyanto, a. b. d. (2022). computational bibliometric analysis on publication of techno-economic education. indonesian journal of multidiciplinary research, 2(1), 213-220.
reitman, j. g., anderson-coto, m. j., wu, m., lee, j. s., & steinkuehler, c. (2020). esports research: a literature review. games and culture, 15(1), 32-50.
schaeperkoetter, c. c., mays, j., hyland, s.
t., wilkerson, z., oja, b., krueger, k., ... & bass, j. r. (2017). the “new” student-athlete: an exploratory examination of scholarship esports players. journal of intercollegiate sport, 10(1), 1-21.
scholz, t. m. (2020). deciphering the world of esports. international journal on media management, 22(1), 1-12.
scholz, t. m., & scholz, t. m. (2019). a short history of esports and management. esports is business: management in the world of competitive gaming, 17-41.
scholz, t. m., & scholz, t. m. (2019). introduction: the emergence of esports. esports is business: management in the world of competitive gaming, 1-16.
soegoto, h., soegoto, e. s., luckyardi, s., & rafdhi, a. a. (2022). a bibliometric analysis of management bioenergy research using vosviewer application. indonesian journal of science and technology, 7(1), 89-104.
sousa, a., ahmad, s. l., hassan, t., yuen, k., douris, p., zwibel, h., & difrancisco-donoghue, j. (2020). physiological and cognitive functions following a discrete session of competitive esports gaming. frontiers in psychology, 11, 1030.
ströh, j. h. a. (2017). the esports market and esports sponsoring. tectum wissenschaftsverlag.
taylor, t. l. (2012). raising the stakes: e-sports and the professionalization of computer gaming. mit press.
witkowski, e. (2009). probing the sportiness of esports. esports yearbook. norderstedt: books on demand gmbh, 53-56.
yamanaka, g. k., campos, m. v., roble, o. j., & mazzei, l. c. (2021). esport: a state-of-the-art review based on bibliometric analysis. journal of physical education and sport, 21(6), 3547-3555.
ye, d., chen, g., zhang, w., chen, s., yuan, b., liu, b., ... & liu, w. (2020). towards playing full moba games with deep reinforcement learning. advances in neural information processing systems, 33, 621-632.
ye, d., liu, z., sun, m., shi, b., zhao, p., wu, h., ... & huang, l. (2020, april). mastering complex control in moba games with deep reinforcement learning. in proceedings of the aaai conference on artificial intelligence, 34(04), 6672-6679.

international journal of informatics information system and computer engineering 4(1) (2023) 1-10
doi: https://doi.org/10.34010/injiiscom.v4i1.9274
p-issn 2810-0670 e-issn 2775-5584

ict-based literacy evaluation in nigeria educational sector: case study in kwara state

hammed olalekan bolaji*, tinuke bilikis ibrahim-raji
1 department of science education, faculty of education, al-hikmah university, ilorin-nigeria
2 department of educational management and counseling, faculty of education, al-hikmah university, ilorin-nigeria
*corresponding email: atinukebilikis259@gmail.com

abstract

ict is now a necessity for both professionals and organizations due to its pervasiveness across all fields of human endeavour. the level of literacy skills plays a major role in its application to routine responsibilities and in the pace at which tasks are completed. for efficient service delivery in the public service, this study evaluated ict literacy abilities and their application among the staff of education agencies. it used a descriptive cross-sectional survey design. structured items on the ict skills assessment and utilization questionnaire (ictsauq) were administered to fifty staff using a convenience sampling technique. to respond to the research questions posed by this study, descriptive and inferential statistics were used. the research questions were addressed using percentages, means and standard deviations, and the hypothesis was analysed using a t-test.
the results showed that a basic understanding of ict helps with daily administrative tasks. however, the staff of the education agencies lacked the skills needed for their daily routine of managerial responsibilities and operations. hence, it was suggested that staff of the education agencies in kwara state be exposed to the ict skills required to perform routine functions at the optimal level. additionally, it was suggested that agency staff members be encouraged to consistently improve their ict literacy abilities through self-training and group work, to improve the competence of service delivery in the educational sector.

article history: submitted/received 18 jan 2023; first revised 20 feb 2023; accepted 25 march 2023; first available online 1 april 2023; publication date 01 june 2023

keywords: ict, utilization, literacy skills.

1. introduction

the level of literacy needed to operate computers and other related technologies effectively is known as information and communication technology (ict) ability (ukachi, 2015). the skill spectrum can range from a grasp of how to use common software systems and platforms to a practical knowledge of computer programming and its applications (alemu, 2015). the use of ict in educational practice is widely regarded as an empowering tool that encourages change and stimulates the development of 21st-century skills (bolaji & adeoye, 2022).
currently, the pervasiveness of ict in every sphere of human endeavour cannot be overemphasized, and education is also feeling its effect in routine activities and practices. educational administrators therefore need some level of literacy skill in the use of ict to perform their functions. the most crucial worldwide resource for self-actualization is ict with its corresponding literacy skills (olatoye, 2019). ict as a form of technological development can improve economic prospects and enhance governance and service delivery for the socio-economic development of a society (ogwu & ogwu, 2015). it is on this premise that emwanta (2013) advised that ict literacy skills and abilities should be acquired to maximize its potential for educational practices. the advent of numerous ict tools has resulted in significant changes in educational systems around the world. the provision of these amenities by government agencies in the workplace has been shown to improve service delivery efficiency (adeleke, 2016) and should be sustained with friendly ict policies. the federal government of nigeria created an ict policy in 2001 after considering its advantages, and the national information technology development agency (nitda) was founded as a result of the policy. its goals include ensuring that ict resources are easily accessible to support effective national development and integrating ict into the civil service, notably the education sector (nitda, 2017). furthermore, emwanta (2013) posited that policy is a vital instrument for the promotion and sustenance of national development. education policy is well recognized for its ability to transmit desirable values such as work ethics, loyalty, integrity and justice, all of which are necessary for individual survival and societal growth. education could be the most crucial tool for change, and staff, irrespective of gender, are the driving force behind it.
the gender of an individual often has a moderating influence on routine activities, and the use of ict is no exception. gender influences the use of ict for teaching, as revealed in the study conducted by irfan et al. (2014), which found that male teachers use ict more frequently than their female colleagues; this also extends to creating presentation materials for instructional delivery (guillen et al., 2019). it was further revealed, however, that gender did not have a major influence on the use of ict for information seeking, which is at variance with the initial finding. it can therefore be concluded that the influence of gender varies with the individual and the responsibilities routinely performed. gender might not necessarily influence every variable of an individual: as reported by egunjobi and fabunmi (2017), no relationship existed between competence and gender (olusegun & adesoji, 2017). in addition, gender did not have any moderating influence on the use of ict for information retrieval and its actual use (durante, 2013). the attitude of an individual while using ict might also not influence routine activities (adenuga et al., 2011). by contrast, gender was found to influence the use of ict for accessing social networks, with females more likely to use it for communication (hilbert, 2011). however, males were found to possess more skills in the use of ict (van dijk, 2015). furthermore, ict literacy was perceived to be higher among female students, although this did not give them greater operational skills than their male counterparts (zhou et al., 2014).
the primary goal of this research was to examine the degree and scope of ict literacy use among education agency workers. in particular, the study:
(i) evaluated the degree of ict literacy among the staff of kwara state's education agencies;
(ii) examined how the staff of education agencies in kwara state used their knowledge of ict;
(iii) determined how the staff of education agencies in kwara state acquire ict literacy;
(iv) examined the difficulties in teaching ict literacy to the staff of kwara state's education agencies.

the study addressed the following research questions:
(i) what is the degree of ict literacy among the staff of kwara state's education agencies?
(ii) do the kwara state employees of education agencies possess ict literacy skills?
(iii) how are ict literacy skills acquired by the kwara state education agency staff?
(iv) what difficulties do the staff of kwara state education agencies face while using their ict literacy skills?

2. method

this study is a descriptive study of the cross-sectional survey type. the instrument was a researcher-designed, close-ended questionnaire titled ict skills assessment and utilization questionnaire (ictsauq), used to collect data on ict literacy skills and their utilization among the staff of education agencies in kwara state. the sampling technique was non-probability convenience sampling, adopted because the researcher targeted specific participants: staff working in education-related agencies within the civil service. the agencies randomly selected for this study are the teaching service commission, state universal basic education board, scholarship board and mass literacy agency. hence, the sample size was 50, and 10 of each of the staff were conveniently selected across the education agencies in kwara state. the questionnaire has three sections of six items each to measure the variables under study.
the five-point likert scale was treated as an interval scale for all questionnaire statements. a mean from 1.00 to 1.80 signifies strongly disagree; from 1.81 to 2.60, disagree; from 2.61 to 3.40, undecided; from 3.41 to 4.20, agree; and from 4.21 to 5.00, strongly agree. the questionnaire was validated by three educators (an educational technologist, an educational manager and a computer educator) and thereafter subjected to a reliability test using cronbach's alpha, which yielded 0.87. the responses were collected manually and subjected to both descriptive and inferential statistical analysis. the research questions were answered using the mean and standard deviation, while the hypothesis was analyzed using a t-test.

3. results and discussion

across all the education agencies, there are more female than male respondents. overall, 60% of the respondents examined for this study (30 out of 50) were female. this implies that females dominate the staff strength of education agencies in kwara state, which lays the foundation for giving attention to gender with regard to the variables under study (see tables 1 and 2).

table 1. demographic characteristics of the respondents

| mdas (ministries, departments and agencies) | male | female | total |
|---|---|---|---|
| teaching service commission | 6 (40%) | 9 (60%) | 15 |
| state universal basic education board | 5 (33.33%) | 10 (66.67%) | 15 |
| scholarship board | 5 (50%) | 5 (50%) | 10 |
| mass literacy agency | 4 (40%) | 6 (60%) | 10 |
| total | 20 (40%) | 30 (60%) | 50 |

question 1: what is the level of ict literacy skills among the staff of education agencies in kwara state?

table 2. mean and standard deviation on ict literacy skills among staff of education agencies

| statements | n | mean | std. deviation |
|---|---|---|---|
| i can create a multimedia presentation (with sound, pictures and video). | 50 | 3.26 | 1.192 |
| i can copy files from one location into another location. | 50 | 3.42 | 1.052 |
| i can send and open an attachment from an email, using a computer email program. | 50 | 3.08 | 1.482 |
| i can use the world wide web address to find useful information | 50 | 3.84 | 0.650 |
| i can use search engines to search for information e.g., yahoo, google, and youtube. | 50 | 3.00 | 1.262 |
| i can use the internet and its various features | 50 | 3.86 | 0.808 |

from table 2, the mean of the first statement is 3.26, which shows that respondents were, on average, undecided about whether they can create a multimedia presentation (with sound, pictures and video). the mean of the second statement is 3.42, which means that respondents, on average, agreed that they can copy files from one location into another location. the mean of the third statement is 3.08, which means that respondents were, on average, undecided about whether they can send and open an attachment from an email using a computer email program. the mean of the fourth statement is 3.84, which means that respondents, on average, agreed that they can use a www address to find useful information. the mean of the fifth statement is 3.00, which means that respondents were, on average, undecided about whether they can use search engines such as yahoo, google, and youtube to search for information. the mean of the sixth statement is 3.86, which means that respondents, on average, agreed that they can use the internet and its various features (see table 3).
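the interpretation rule applied to the item means can be written down compactly. the sketch below is only an illustration: the interval boundaries are the ones stated in the method section, the means are copied from table 2, and the names `likert_bands` and `interpret_mean` are hypothetical helpers introduced here, not part of the study's analysis.

```python
# upper bound of each interval and its verbal label, as stated in the method section
likert_bands = [
    (1.80, "strongly disagree"),
    (2.60, "disagree"),
    (3.40, "undecided"),
    (4.20, "agree"),
    (5.00, "strongly agree"),
]


def interpret_mean(mean: float) -> str:
    """return the verbal label for the mean of a five-point likert item."""
    if not 1.0 <= mean <= 5.0:
        raise ValueError("a five-point likert mean must lie between 1 and 5")
    for upper_bound, label in likert_bands:
        if mean <= upper_bound:
            return label
    raise AssertionError("unreachable: 5.00 is the last upper bound")


# item means copied from table 2 (statement texts shortened for display)
table_2_means = {
    "create a multimedia presentation": 3.26,
    "copy files between locations": 3.42,
    "send and open an email attachment": 3.08,
    "use a www address to find information": 3.84,
    "use search engines": 3.00,
    "use the internet and its features": 3.86,
}

for statement, mean in table_2_means.items():
    print(f"{mean:.2f}  {interpret_mean(mean):<10}  {statement}")
```

keeping the boundaries in a single list makes it straightforward to apply the same rule to the means in tables 3, 4 and 5.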
question 2: do the staff of education agencies in kwara state utilize ict literacy skills?

table 3 shows that the mean of the first statement is 3.50, which shows that respondents, on average, agreed that they can start up and shut down the computer system. the mean of the second statement is 3.66, which means that respondents, on average, agreed that they can open, create, edit, back up, save and delete documents or files on the computer. the mean of the third statement is 3.20, which means that respondents were, on average, undecided about whether they can copy a file from a floppy disk or flash drive (usb). the mean of the fourth statement is 3.54, which means that respondents, on average, agreed that they can use microsoft word for typing. the mean of the fifth statement is 3.24, which means that respondents were, on average, undecided about whether they can use microsoft excel for analysis. the mean of the sixth statement is 3.40, which means that respondents were, on average, undecided about whether they can use microsoft powerpoint for presentation (see table 4).

question 3: how do the staff of education agencies in kwara state acquire ict literacy skills?

table 4 indicates that the mean of the first statement is 2.28, which shows that respondents, on average, disagreed with the statement that they went to seminars and conferences. the mean of the second statement is 3.72, which means that respondents, on average, agreed that they did personal and self-training. the mean of the third statement is 3.24, which means that respondents were, on average, undecided about whether they acquired ict literacy from colleagues in the office. the mean of the fourth statement is 2.30, which means that respondents, on average, disagreed that they acquired ict literacy skills through government training.
the mean of the fifth statement is 3.32, which means that respondents were, on average, undecided about whether they acquired ict literacy skills from in-house training (see table 5).

table 3. mean and standard deviation on the utilization of ict literacy skills among the staff of the education agencies

| statements | n | mean | std. deviation |
|---|---|---|---|
| i can start up and shut down a computer system. | 50 | 3.50 | 1.568 |
| i can open, create, edit, backup, save and delete documents or files on the computer | 50 | 3.66 | 1.136 |
| i can copy a file from a floppy disk or flash drive (usb) | 50 | 3.20 | 1.278 |
| i can use microsoft word for typing | 50 | 3.54 | 0.862 |
| i can use microsoft excel for analysis | 50 | 3.24 | 1.519 |
| i can use microsoft powerpoint for presentation | 50 | 3.40 | 0.928 |

table 4. mean and standard deviation on the acquisition of ict literacy skills

| statements | n | mean | std. deviation |
|---|---|---|---|
| i went to seminars and conferences | 50 | 2.28 | 1.070 |
| i did personal and self-training | 50 | 3.72 | 0.834 |
| through colleagues in the office | 50 | 3.24 | 0.938 |
| through government training | 50 | 2.30 | 1.233 |
| in-house training | 50 | 3.32 | 1.377 |

question 4: what are the challenges of literacy skill use of ict among the staff of education agencies in kwara state?

table 5. mean and standard deviation on the challenges of using ict literacy skills

| statements | n | mean | std. deviation |
|---|---|---|---|
| the use of productive e-resources may be constrained by inadequate ict skills. | 50 | 3.18 | 1.366 |
| the usage of electronic information resources might be hampered by inadequate ict literacy abilities. | 50 | 3.32 | 1.168 |
| electronic information resources are ineffectively used when ict skills are lacking. | 50 | 3.76 | 1.393 |
| the use of e-information resources is hindered by the inability to operate a computer. | 50 | 3.36 | 1.411 |
| access to electronic resources might be badly impacted by a lack of computer skills. | 50 | 3.36 | 1.453 |

in table 5, the mean of the first statement is 3.18, which falls in the undecided range: respondents were, on average, undecided about whether the use of productive e-resources may be constrained by inadequate ict skills. the mean of the second statement is 3.32, which means that respondents were, on average, undecided about whether the usage of electronic information resources might be hampered by inadequate ict literacy abilities. the mean of the third statement is 3.76, which means that respondents, on average, agreed that electronic information resources are ineffectively used when ict skills are lacking. the mean of the fourth statement is 3.36, which means that respondents were, on average, undecided about whether the use of e-information resources is hindered by the inability to operate a computer. the mean of the fifth statement is 3.36, which means that respondents were, on average, undecided about whether access to electronic resources might be badly impacted by a lack of computer skills.

4. discussion

the results in table 2 on multimedia presentations, search engine usage, and internet usage showed that these skills are essential for the efficient use of ict. this corroborates the finding of yushau and nannim (2020) that pupils have a basic comprehension of ict capabilities, including competency with ms word, ms powerpoint, internet searching, and other related abilities.
such competence is the driving force behind ict abilities. the data in table 3 revealed that certain employees have trouble starting up and shutting down computers and are less conversant with microsoft programs such as word, excel, and powerpoint. compared with all respondents, this percentage is minuscule. this could signal that the staff are making progress in their pursuit of ict and that they recognize the need to continually improve their ict skills in order to meet the demands of each cadre for ict usage. the results in table 4 showed how the staff of education agencies develop their ict skills: they gained them mainly through personal and self-training. this concurs with the results of olatoye et al. (2021), who reported that ict literacy abilities are more often used in training. the outcome in table 5 demonstrates that staff members do have difficulties due to a lack of ict skills and expertise. staff members also reported trouble using electronic information resources, which may be caused by a lack of ict literacy. this is in line with the results of makori and mauti (2016), who asserted that it is difficult to give pupils the ict skills and resources they need. according to makhmudov et al. (2020), not all subject teachers need to be experts at using computers, even if computer literacy is a must. for their teaching to be more effective and to better serve their students, they should possess a certain set of skills. these skills include the following (makhmudov et al., 2020).
a) the ability to read and write simple computer programs;
b) the ability to use computer programs and documentation that are educational in nature;
c) the ability to use computer terminology, especially as it relates to hardware;
d) the ability to identify educational problems that can and cannot be solved using the computer;
e) the ability to locate information on computing as it relates to education; and
f) the ability to discuss the moral and human-impact issues.

koltay (2011) made the case that information literacy is crucial for the growth of democracy, cultural participation, and active civic participation. knowledge workers who rely heavily on the internet and computer tools are especially in need of this literacy. information literacy also places a strong emphasis on the retrieval and careful selection of the information available in the workplace, in education, and in other areas of individual decision-making, particularly in the domains of citizenship and health. information literacy training places a strong emphasis on the critical thinking, metacognitive, and procedural skills needed to find information in particular fields, settings, and contexts. recognizing the quality, authenticity, and credibility of a message is prioritized (hobbs, 2006). in a study on how information literacy is seen in educational environments, the workplace, and the community, lloyd and williamson (2008) came to the conclusion that context is a significant component in shaping the phenomenon.
information literacy, according to catts and lau (2008), is relevant in all areas of human development and is defined as the capacity to recognize information needs, evaluate the quality of information, manage it, use it effectively and ethically, and produce and share the knowledge attained through the application of information. there are several common elements among the definitions given, perhaps the most significant being the understanding that information skills cannot be seen in isolation, since they are interrelated processes that entail how people think about and use information (eisenberg et al., 2004). the combination of information and computer literacy (ict literacy) has been explained by the oecd (organisation for economic co-operation and development) and by santos et al. (2019) as the interest, attitude and ability of individuals to properly use digital technology and communication tools to access, manage, integrate and evaluate information, construct new knowledge, and communicate with others in order to participate effectively in society, as shown in table 5.

5. conclusion

from the findings, ict utilization is evident in all the departments and agencies in the education ministries of kwara state. in conclusion, education agency staff possess ict skills that are useful for their profession, acquired mainly through self-training. however, gender has no significant influence on the ict skills of the staff. the need for effective utilization of ict devices within the agencies in the state education sector calls for prompt action by all relevant stakeholders to meet present global technology challenges. hence, tackling ict deficiency among the staff of the agencies requires urgent attention. government should ensure that staff development and ict utilization are prioritized and that funds are duly allocated for them in the education sector.
references

adeleke, d. s., & emeahara, e. n. (2016). relationship between information literacy and use of electronic information resources by postgraduate students of the university of ibadan. library philosophy and practice, 1.

adenuga, r. a., owoyele, j. w., & adenuga, f. t. (2011). gender and socio-economic background differentials in students' attitude to information and communication technology education in nigerian secondary schools: implications for policy, ict education and counselling. international journal of psychology and counseling, 3(9), 162-166.

alemu, b. m. (2015). integrating ict into teaching-learning practices: promise, challenges and future directions of higher educational institutes. universal journal of educational research, 3(3), 170-189.

bolaji, h. o., & adeoye, m. a. (2022). accessibility, usability, and readiness towards ict tools for monitoring educational practice in secondary schools in ilorin metropolis. indonesian journal of multidiciplinary research, 2(2), 257-264.

catts, r., & lau, j. (2008). towards information literacy indicators. technical report, unesco.

eisenberg, m. b., lowe, c. a., & spitzer, k. l. (2004). information literacy: essential skills for the information age. greenwood publishing group, 88 post road west, westport.

durante, f., fiske, s. t., kervyn, n., cuddy, a. j., akande, a., adetoun, b. e., ... & storari, c. c. (2013). nations' income inequality predicts ambivalence in stereotype content: how societies mind the gap. british journal of social psychology, 52(4), 726-746.

emwanta, m., & nwalo, k. i. n. (2013). influence of computer literacy and subject background on use of electronic resources by undergraduate students in universities in south-western nigeria. international journal of library and information science, 5(2), 29-42.

guillén-gámez, f. d., lugones, a., & mayorga-fernández, m. j. (2019). ict use by pre-service foreign languages teachers according to gender, age and motivation. cogent education, 6(1), 1574693.

hilbert, m. (2011, november). digital gender divide or technologically empowered women in developing countries? a typical case of lies, damned lies, and statistics. in women's studies international forum, 34(6), 479-489. pergamon.

hobbs, r. (2006). multiple visions of multimedia literacy: emerging areas of synthesis. international handbook of literacy and technology, 2, 15-28.

koltay, t. (2011). the media and the literacies: media literacy, information literacy, digital literacy. media, culture & society, 33(2), 211-221.

lloyd, a., & williamson, k. (2008). towards an understanding of information literacy in context: implications for research. journal of librarianship and information science, 40(1), 3-12.

makhmudov, k., shorakhmetov, s., & murodkosimov, a. (2020). computer literacy is a tool to the system of innovative cluster of pedagogical education. european journal of research and reflection in educational sciences, 8(5).

makori, e. o., & mauti, n. o. (2016). digital technology acceptance in transformation of university libraries and higher education institutions in kenya.

national information technology development agency (nitda). (2017, february 2). nitda inaugurates cobit 5 national implementation committee.

ogwu, e. n., & ogwu, f. c. (2016). comparative analysis of microsoft package (msp) competence among teacher trainee students in botswana and nigeria: implications for curriculum practices.

olatoye, o. i. (2019). ict literacy skills and demographic factors as determinants of electronic resources use among the undergraduate students in the selected universities the eastern cape, south africa.

olatoye, o. i., nekhwevha, f., & muchaonyerwa, n. (2021). ict literacy skills proficiency and experience on the use of electronic resources amongst undergraduate students in selected eastern cape universities, south africa. library management, 42(6/7), 471-479.

olusegun, e. a., & adesoji, f. f. (2017). gender influence of ict competence of undergraduates in state-owned universities in the south-west nigeria. journal of applied information science and technology, 10(1), 140-151.

santos, g. m., ramos, e. m., escola, j., & reis, m. j. (2019). ict literacy and school performance. turkish online journal of educational technology-tojet, 18(2), 19-39.

ukachi, n. b. (2015). information literacy of students as a correlate of their use of electronic resources in university libraries in nigeria. the electronic library, 33(3), 486-501.

van dijk, c. (2015). siektes en sindrome geassosieer met 'n hoë reënval - dr chris se notas. veeplaas, 6(1), 93.

yushau, b., & nannim, f. a. (2020). investigation into the utilization of ict facilities for teaching purposes among university lecturers: influence of gender, age, qualification and years of teaching experience. pedagogical research, 5(2).

zhou, j., chu, h., li, c., wong, b. h. y., cheng, z. s., poon, v. k. m., ... & yuen, k. y. (2014). active replication of middle east respiratory syndrome coronavirus and aberrant induction of inflammatory cytokines and chemokines in human macrophages: implications for pathogenesis. the journal of infectious diseases, 209(9), 1331-1342.
international journal of informatics information system and computer engineering 3(1) (2022) 1-18

mapping visualization analysis of computer science research data in 2017-2021 on the google scholar database with vosviewer

dwi fitria al husaeni, asep bayu dani nandiyanto
universitas pendidikan indonesia, indonesia
e-mail: dwifitriaalhusaeni@upi.edu

abstract

the purpose of this research is to examine the development of, and the interrelationships between, terms in computer science research using mapping analysis with vosviewer. the research data were collected from the google scholar database for the period 2017-2021 using the publish or perish 7 application. data collection was based on the keyword "computer science". the search found 992 articles that were considered relevant. the results showed that computer science research peaked in 2018 with a total of 232 articles and declined in 2019-2021. based on the mapping analysis carried out using the vosviewer application, computer science terms are connected to 4 main terms in each cluster, namely student, computer science education, education, and skills. computer science research is most strongly associated with the term student, with a link strength of 221. this research can be used as a reference in determining a research theme or discussion topic in the field of computer science.

article info

article history: received 5 may 2022; revised 15 may 2022; accepted 25 may 2022; available online 26 june 2022

keywords: bibliometric, computer science, mapping analysis, vosviewer

1. introduction

computer science is generally defined as the study of computers, hardware, and software (armoni & gal-ezer, 2014). computer science is rooted in electronics, mathematics, and linguistics (alhazov, 2010).
computer science covers a wide range of computer-related topics, from the abstract analysis of algorithms to more specific topics such as programming languages, software, and hardware. computer science focuses more on computer programming and software engineering. computer science is a branch of science that deals with computers and computing. it covers theory, component testing, and design, and includes questions related to the theoretical understanding of computer devices, programs, and systems; the experimental development and testing of computational concepts; design methodologies, algorithms, and tools to realize them; and analytical methods to demonstrate that an implementation conforms to requirements.

the pace of development is accelerating due to drastic technological changes. computer science personnel are therefore needed in the workplace, because technology can facilitate all human needs (tussyadiah, 2020). it is likewise necessary to track the development of research on computer science so that the field can continue to develop with the times. mapping visualization analysis can be used to determine the development of computer science research. currently, there are many studies on mapping analysis, including a new science mapping analysis software tool (cobo et al., 2012), science mapping analysis using an r-tool (aria & cuccurullo, 2017), equity mapping analysis (wolch et al., 2005; jurado de los santos et al., 2020; talen, 1998; mennis & jordan, 2005), mapping analysis of magnetic properties and energy (xiang et al., 2013), and mapping analysis of the pipeline for pooled rna-seq (hill et al., 2013).
however, there is no research on mapping analysis that discusses research in the field of computer science based on text data and bibliographic data using vosviewer. therefore, this research examines the mapping of data on computer science publications using vosviewer, visualizing the mapping in three ways: network visualization, overlay visualization, and density visualization. through this research, it can be seen how the terms of computer science research are connected, which facilitates the search for other topics of discussion that have high novelty in the field of computer science.

2. methodology

this study applies a mapping analysis method to a data set of articles published in journals from 2017 to 2021 and indexed by google scholar. the google scholar database was used because it is openly accessible. to obtain the research data, we used the reference manager application publish or perish 7. all data were obtained on 12 january 2022. publish or perish was used to search the literature on the predefined keyword "computer science". detailed information on installing and using the software (google scholar and publish or perish 7) and a step-by-step process for obtaining data were described in our previous study (al husaeni & nandiyanto, 2022). this research was carried out in several stages:
(i) determination of the study topic;
(ii) collection of publication data from the google scholar database using the publish or perish 7 application;
(iii) processing of the text data and bibliometric data of the articles obtained using the microsoft excel application, with conversion into three file formats, namely research information systems (*.ris), comma-separated value (*.csv) and excel workbook (*.xlsx);
(iv) visualization of the publication data mapping using the vosviewer application version 1.6.16; and
(v) analysis of the mapping results.

the mapping of text data and article bibliometric data is visualized in 3 types, namely network visualization, density visualization, and overlay visualization, based on the relationships between existing items. data mapping is carried out in 2 steps, namely mapping based on text data and mapping based on bibliographic data. data mapping based on text data found 5972 terms. the terms found were then filtered with several provisions, including a minimum of 10 occurrences per term. as a result, the number of terms used in the mapping analysis is 159. data mapping based on text data is used to see the relationships between the terms used in research in the field of computer science. the second data mapping, on the same data, was carried out based on bibliographic data. this mapping was carried out to find connections between authors and to identify the authors who contributed most to research in the field of computer science as recorded by google scholar. the rules used in making this data mapping are a maximum of 25 authors per document and a minimum of 3 documents per author. under these rules, 87 of 2073 authors met the criteria and entered the data mapping process.

3. result and discussion

3.1. publication data search results

the search for publication data on computer science found 992 articles in the google scholar database for 2017-2021.
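the two selection rules described in the methodology (a minimum number of occurrences per term, and a minimum number of documents per author with a cap on authors per document) can be illustrated with a short sketch. this is not vosviewer's implementation: the function names are hypothetical, and counting each term at most once per document is an assumption made here for simplicity.

```python
from collections import Counter


def filter_terms(term_lists, min_occurrences=10):
    """keep terms found in at least `min_occurrences` documents.

    `term_lists` holds one list of extracted terms per document.
    each term is counted once per document (an assumption of this sketch).
    """
    counts = Counter()
    for terms in term_lists:
        counts.update(set(terms))
    return {term: n for term, n in counts.items() if n >= min_occurrences}


def filter_authors(author_lists, min_documents=3, max_authors_per_doc=25):
    """keep authors with at least `min_documents` documents.

    documents with more than `max_authors_per_doc` authors are skipped.
    """
    counts = Counter()
    for authors in author_lists:
        if len(authors) > max_authors_per_doc:
            continue
        counts.update(set(authors))
    return {author: n for author, n in counts.items() if n >= min_documents}


# toy data with lowered thresholds, for illustration only
docs = [["student", "skill"], ["student"], ["student", "student", "gender"]]
print(filter_terms(docs, min_occurrences=2))

papers = [["author a", "author b"], ["author a"], ["author a", "author c"]]
print(filter_authors(papers, min_documents=2))
```

with the thresholds stated in the methodology, the defaults (10 occurrences, 3 documents, 25 authors) would be used instead of the lowered values in the toy example.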
table 1 presents some of the article data used in the vosviewer mapping analysis. all article data obtained were sorted by citation count, and the 20 articles with the highest citations are presented in table 1. from the data in table 1, the highest citations were dominated by articles published in 2017: 10 of the 20 articles, with an average of 114.4 citations. for these 2017 articles, the average was 22.88 citations per year.

table 1. computer science publication data

| no | authors | title | year | cites | cites per year | cites per author | refs |
|---|---|---|---|---|---|---|---|
| 1 | weintrop, d., & wilensky, u. | comparing block-based and text-based programming in high school computer science classrooms | 2017 | 204 | 40.80 | 102 | weintrop & wilensky (2017) |
| 2 | webb, m., davis, n., bell, t., katz, y. j., reynolds, n., chambers, d. p., & sysło, m. m. | computer science in k-12 school curricula of the 21st century: why, what, and when? | 2017 | 182 | 36.40 | 30 | webb et al. (2017) |
| 3 | borrego, c., fernández, c., blanes, i., & robles, s. | room escapes at class: escape games activities to facilitate the motivation and learning in computer science | 2017 | 182 | 36.40 | 46 | borrego et al. (2017) |
| 4 | sax, l. j., lehman, k. j., jacobs, j. a., kanny, m. a., lim, g., monje-paulson, l., & zimmerman, h. b. | anatomy of an enduring gender gap: the evolution of women's participation in computer science | 2017 | 177 | 35.40 | 35 | sax et al. (2017) |
| 5 | wang, d., liang, y., xu, d., feng, x., & guan, r. | a content-based recommender system for computer science publications | 2018 | 173 | 43.25 | 35 | wang et al. (2018) |
| 6 | vakil, s. | ethics, identity, and political vision: toward a justice-centered approach to equity in computer science education | 2018 | 114 | 28.50 | 114 | vakil (2018) |
| 7 | passey, d. | computer science (cs) in the compulsory education curriculum: implications for future research | 2017 | 95 | 19.00 | 95 | passey (2017) |
| 8 | giannakos, m. n., pappas, i. o., jaccheri, l., & sampson, d. g. | understanding student retention in computer science education: the role of environment, gains, barriers, and usefulness | 2017 | 87 | 17.40 | 22 | giannakos et al. (2017) |
| 9 | garcia, r., falkner, k., & vivian, r. | systematic literature review: self-regulated learning strategies using e-learning tools for computer science | 2018 | 80 | 20.00 | 27 | garcia et al. (2018) |
| 10 | leyton-brown, k., milgrom, p., & segal, i. | economics and computer science of a radio spectrum reallocation | 2017 | 66 | 13.20 | 22 | leyton-brown et al. (2017) |
| 11 | fields, d. a., kafai, y., nakajima, t., goode, j., & margolis, j. | putting making into high school computer science classrooms: promoting equity in teaching and learning with electronic textiles in exploring computer science | 2018 | 65 | 16.25 | 13 | fields et al. (2018) |
| 12 | qian, y., hambrusch, s., yadav, a., & gretter, s. | who needs what: recommendations for designing effective online professional development for computer science teachers | 2018 | 61 | 15.25 | 15 | qian et al. (2018) |
| 13 | weintrop, d. | block-based programming in computer science education | 2019 | 56 | 18.67 | 56 | weintrop (2019) |
| 14 | bonham, k. s., & stefan, m. i. | women are underrepresented in computational biology: an analysis of the scholarly literature in biology, computer science, and computational biology | 2017 | 55 | 11.00 | 28 | bonham & stefan (2017) |
| 15 | burnette, j. l., hoyt, c. l., russell, v. m., lawson, b., dweck, c. s., & finkel, e. | a growth mindset intervention improves interest but not academic performance in the field of computer science | 2020 | 53 | 26.50 | 13 | burnette et al. (2020) |
| 16 | ehrlinger, j., plant, e. a., hartwig, m. k., vossen, j. j., columb, c. j., & brewer, l. e. | do gender differences in perceived prototypical computer scientists and engineers contribute to gender gaps in computer science and engineering? | 2018 | 51 | 12.75 | 10 | ehrlinger et al. (2018) |
| 17 | bers, m. u. | coding as another language: a pedagogical approach for teaching computer science in early childhood | 2019 | 50 | 16.67 | 50 | bers (2019) |
| 18 | malik, s. i., & al-emran, m. | social factors influence on career choices for female computer science students | 2018 | 49 | 12.25 | 25 | malik & al-emran (2018) |
| 19 | nissim, k., bembenek, a., wood, a., bun, m., gaboardi, m., gasser, u., o'brien, d. r., steinke, t., & vadhan, s. | bridging the gap between computer science and legal approaches to privacy | 2017 | 48 | 9.60 | 8 | nissim et al. (2017) |
| 20 | hur, j. w., andrzejewski, c. e., & marghitu, d. | girls and computer science: experiences, perceptions, and career aspirations | 2017 | 48 | 9.60 | 16 | hur et al. (2017) |

3.2. research developments in computer science research

over the last 5 years (2017-2021), research on computer science published in google scholar indexed publications amounted to 992 articles.
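as a quick check on the figures reported in this section, the yearly publication counts from table 2 can be summed and compared in a few lines. the sketch below is illustrative only: the counts are copied from table 2, and the variable names are assumptions introduced here.

```python
# yearly publication counts copied from table 2
publications = {2017: 198, 2018: 232, 2019: 208, 2020: 206, 2021: 148}

total = sum(publications.values())
average = total / len(publications)
peak_year = max(publications, key=publications.get)
lowest_year = min(publications, key=publications.get)

# change in the number of articles relative to the previous year
changes = {
    year: publications[year] - publications[year - 1]
    for year in sorted(publications)
    if year - 1 in publications
}

print(f"total: {total}, average per year: {average:.1f}")
print(f"peak year: {peak_year}, lowest year: {lowest_year}")
for year, delta in changes.items():
    print(f"{year}: {delta:+d} articles compared with {year - 1}")
```

the year-over-year differences make the trend explicit: the only positive change is from 2017 to 2018, and every later year is lower than the one before it.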
the number of publications per year from 2017 to 2021 is 198, 232, 208, 206, and 148 articles, respectively. table 2 also shows that the largest number of articles on computer science was published in 2018 (232 articles) and the smallest in 2021 (148 articles). the average over the last 5 years is 198.4 articles per year. the development of research on computer science is shown more clearly in fig. 1.

table 2. development of computer science research.

| year | number of publications per year |
|---|---|
| 2017 | 198.0 |
| 2018 | 232.0 |
| 2019 | 208.0 |
| 2020 | 206.0 |
| 2021 | 148.0 |
| total | 992.0 |
| average | 198.4 |

fig. 1 shows that there were 198 articles on computer science in 2017 and that publications increased to 232 articles in 2018. however, the number decreased to 208 articles in 2019. research on computer science continued to decline after 2018, with 206 articles in 2020 and 148 articles in 2021. overall, an increase occurred only in 2018; in the following years, the number of publications continued to decline.

fig. 1. level of development in computer science research.

3.3. mapping analysis based on text data of computer science using vosviewer

the mapping analysis based on text data using the vosviewer application found 5792 terms relevant to computer science research. however, in this study, we set the minimum number of occurrences of a term to 10. as a result, 136 items were used in the data mapping analysis. based on network visualization, research related to computer science is divided into 5 clusters, with 5028 links. cluster 1 has 37 items, marked in red and presented in fig. 2.
the items in cluster 1 are access, assessment, challenge, classroom, computational thinking, computer science curriculum, computer science education, computer science educator, computer science program, computer science teacher, content, curricula, curriculum, development, effort, equity, evaluation, faculty, framework, implementation, implication, information technology, knowledge, learner, learning, opportunity, perception, practice, program, project, school, survey, teacher, teaching, technique, tool, and understanding.

cluster 2 has 36 items and is marked in green, shown in fig. 3. the items in cluster 2 are algorithm, application, approach, area, artificial intelligence, aspect, computer, computer science, computer science department, computer science perspective, computer science research, computer scientist, computing, data, data science, discipline, fact, field, focus, mathematics, model, paper, perspective, physics, problem, process, research, researcher, science, social science, system, technology, term, theoretical computer science, theory, and topic.

fig. 2. network visualization of the main term in cluster 1.

fig. 3. network visualization of the main term in cluster 2.

cluster 3 has 32 items and is marked in blue, shown in fig. 4. the items in cluster 3 are case, case study, change, college, computer engineering, computer science concept, computer science course, computer science degree, computer science major, computer science student, course, department, effectiveness, engineering, experience, factor, high school, higher education, idea, impact, introductory computer science, introductory computer science course, investigation, software, software engineering, strategy, student, study, success, systematic literature review, time, and university.
cluster 4 has 17 items marked in yellow, shown in fig. 5. the items in cluster 4 are analysis, attention, context, demand, education, effect, gender, group, importance, industry, initiative, level, participation, question, relationship, and role.

fig. 4. network visualization of the main term in cluster 3.

fig. 5. network visualization of the main term in cluster 4.

cluster 5 has 14 items and is marked in purple (fig. 6). the items in cluster 5 are ability, activity, attitude, career, computer science class, concept, goal, information, motivation, programming, review, skill, stem, and subject.

fig. 6. network visualization of the main term in cluster 5.

in mapping analysis using vosviewer, a cluster describes the relationship between one term and another (nandiyanto et al., 2021; al husaeni & nandiyanto, 2022; nandiyanto & al husaeni, 2021). each term is labeled and colored, and the color indicates the cluster in which the term is located. the size of each label indicates the frequency with which the term appears or is used in computer science research: the larger the label circle, the more often the term is used, and the smaller it is, the less often it is used (nandiyanto et al., 2021; al husaeni & nandiyanto, 2022; nandiyanto & al husaeni, 2021).

fig. 7 illustrates the network visualization in mapping analysis with vosviewer. the network visualization shows the relationship from one term to another and the occurrences of each term. based on fig. 7, the term computer science has the largest label size. this shows that the term computer science has a high frequency of occurrences and is also strongly connected with other terms.

fig. 7. network visualization of computer science research.
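the counts reported in sections 3.2 and 3.3 can be cross-checked with a few lines of arithmetic. the sketch below is not part of the original analysis: it simply re-enters the yearly values from table 2 and the five reported cluster sizes (all variable names are illustrative assumptions) and recomputes the total, the yearly average, the peak year, and the number of mapped items.

```python
# yearly publication counts as reported in table 2
publications = {2017: 198, 2018: 232, 2019: 208, 2020: 206, 2021: 148}

# cluster sizes as reported in section 3.3 (cluster number -> item count)
cluster_sizes = {1: 37, 2: 36, 3: 32, 4: 17, 5: 14}

total = sum(publications.values())                    # all articles, 2017-2021
average = total / len(publications)                   # mean articles per year
peak_year = max(publications, key=publications.get)   # year with most articles
items = sum(cluster_sizes.values())                   # items across all clusters

print(f"total={total}, average={average:.1f}, peak_year={peak_year}, items={items}")
```

under these inputs the recomputed values agree with the figures stated in the text: 992 articles in total, an average of 198.4 articles per year, a peak in 2018, and 136 items across the five clusters.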
in this study, we found 4 main terms that have a fairly high degree of connectedness with the term computer science, namely computer science education with a link strength of 116 (fig. 8a), student with a link strength of 221 (fig. 8b), education with a link strength of 78 (fig. 8c), and skill with a link strength of 58 (fig. 8d). the link strength range of terms that are related to research in the field of computer science can be seen in fig. 9. fig. 9 shows that research with the theme of computer science has the strongest link with the term student. this shows that many researchers study computer science in relation to students.

fig. 8. network of the relationship between computer science research and other terms: (a) computer science to computer science education; (b) computer science to students; (c) computer science to education; and (d) computer science to skills.

fig. 9. the link strength range of terms that are related to research in the field of computer science.

fig. 10 illustrates the overlay visualization of research in the field of computer science from 2017 to 2021. the overlay visualization shows the novelty of research on related terms (nandiyanto et al., 2021; al husaeni & nandiyanto, 2022; nandiyanto & al husaeni, 2021). much of the research on computer science was carried out in the 2018-2019 timeframe, as shown in fig. 11. research on the term computer science is concentrated in 2018.8-2019.0. therefore, computer science research in 2020-2021 still offers many opportunities for new research.
mapping analysis on overlay visualization data using vosviewer can be used as a reference for new research with the theme of computer science (nandiyanto et al., 2021; al husaeni & nandiyanto, 2022; nandiyanto & al husaeni, 2021).

fig. 10. overlay visualization of computer science research in 2017-2021.

fig. 11. overlay visualization of computer science research in 2018-2019.

3.4. mapping analysis based on bibliographic data of computer science using vosviewer

mapping analysis based on bibliographic data was conducted to see which authors contributed the most to the field of computer science research published and indexed by google scholar. fig. 12 shows the network visualization of the authors with the most contributions in the collected data. the data shows that goode, j. contributed the most to research in the field of computer science published and indexed by google scholar in 2017-2021, with 12 published articles.

fig. 12. network visualization of the author in computer science research.

from the results of this research, we can look for several topics of research in computer science education and their relationship to several other fields of discussion. we can also determine research themes that are more recent and in accordance with research trends in the year concerned, by looking at the track record of previous research.

4. conclusion

the publish or perish 7 application was used to collect data from the google scholar database for the period 2017-2021. the keyword "computer science" was used to collect data. the data search yielded 992 articles that were thought to be relevant. with a total of 232 papers, the results showed that computer science research was most popular in 2018. in the years 2019-2021, there was a decrease in computer science research.
computer science terms are linked to four key terms in each cluster, according to the mapping analysis performed with the vosviewer application: student, computer science education, education, and skills. the term "student" is the one most strongly connected with computer science research, with a link strength of 221.

references

al husaeni, d. f., & nandiyanto, a. b. d. (2022). bibliometric using vosviewer with publish or perish (using google scholar data): from step-by-step processing for users to the practical examples in the analysis of digital learning articles in pre and post covid-19 pandemic. asean journal of science and engineering, 2(1), 19-46.
alhazov, a., boian, e., burtseva, l., ciubotaru, c., cojocaru, s., colesnicov, a., demidova, v., ivanov, s., macari, v., magariu, g. & malahov, l. (2010). investigations on natural computing in the institute of mathematics and computer science. computer science journal of moldova, 53(2), 101-138.
aria, m., & cuccurullo, c. (2017). bibliometrix: an r-tool for comprehensive science mapping analysis. journal of informetrics, 11(4), 959-975.
armoni, m., & gal-ezer, j. (2014). early computing education: why? what? when? who?. acm inroads, 5(4), 54-59.
bers, m. u. (2019). coding as another language: a pedagogical approach for teaching computer science in early childhood. journal of computers in education, 6(4), 499-528.
bonham, k. s., & stefan, m. i. (2017). women are underrepresented in computational biology: an analysis of the scholarly literature in biology, computer science and computational biology. plos computational biology, 13(10), e1005134.
borrego, c., fernández, c., blanes, i., & robles, s. (2017). room escape at class: escape games activities to facilitate the motivation and learning in computer science. jotse, 7(2), 162-171.
burnette, j. l., hoyt, c. l., russell, v. m., lawson, b., dweck, c. s., & finkel, e.
(2020). a growth mind-set intervention improves interest but not academic performance in the field of computer science. social psychological and personality science, 11(1), 107-116.
cobo, m. j., lópez‐herrera, a. g., herrera‐viedma, e., & herrera, f. (2012). scimat: a new science mapping analysis software tool. journal of the american society for information science and technology, 63(8), 1609-1630.
ehrlinger, j., plant, e. a., hartwig, m. k., vossen, j. j., columb, c. j., & brewer, l. e. (2018). do gender differences in perceived prototypical computer scientists and engineers contribute to gender gaps in computer science and engineering?. sex roles, 78(1), 40-51.
fields, d. a., kafai, y., nakajima, t., goode, j., & margolis, j. (2018). putting making into high school computer science classrooms: promoting equity in teaching and learning with electronic textiles in exploring computer science. equity and excellence in education, 51(1), 21-35.
garcia, r., falkner, k., & vivian, r. (2018). systematic literature review: self-regulated learning strategies using e-learning tools for computer science. computers and education, 123, 150-163.
giannakos, m. n., pappas, i. o., jaccheri, l., & sampson, d. g. (2017). understanding student retention in computer science education: the role of environment, gains, barriers and usefulness. education and information technologies, 22(5), 2365-2382.
hill, j. t., demarest, b. l., bisgrove, b. w., gorsi, b., su, y. c., & yost, h. j. (2013). mmappr: mutation mapping analysis pipeline for pooled rna-seq. genome research, 23(4), 687-697.
hur, j. w., andrzejewski, c. e., & marghitu, d. (2017). girls and computer science: experiences, perceptions, and career aspirations. computer science education, 27(2), 100-120.
jurado de los santos, p., moreno-guerrero, a. j., marín-marín, j. a., & soler costa, r. (2020).
the term equity in education: a literature review with scientific mapping in web of science. international journal of environmental research and public health, 17(10), 3526.
leyton-brown, k., milgrom, p., & segal, i. (2017). economics and computer science of a radio spectrum reallocation. proceedings of the national academy of sciences, 114(28), 7202-7209.
malik, s. i., & al-emran, m. (2018). social factors influence on career choices for female computer science students. international journal of emerging technologies in learning, 13(5), 56-70.
mennis, j. l., & jordan, l. (2005). the distribution of environmental equity: exploring spatial nonstationarity in multivariate models of air toxic releases. annals of the association of american geographers, 95(2), 249-268.
nandiyanto, a. b. d., & al husaeni, d. f. (2021). a bibliometric analysis of materials research in indonesian journal using vosviewer. journal of engineering research, 9(asseee special issue), 1-16.
nandiyanto, a. b. d., al husaeni, d. n., & al husaeni, d. f. (2021). a bibliometric analysis of chemical engineering research using vosviewer and its correlation with covid-19 pandemic condition. journal of engineering science and technology, 16(6), 4414-4422.
nissim, k., bembenek, a., wood, a., bun, m., gaboardi, m., gasser, u., o'brien, d.r., steinke, t. & vadhan, s. (2017). bridging the gap between computer science and legal approaches to privacy. harvard journal of law & technology, 31, 687.
passey, d. (2017). computer science (cs) in the compulsory education curriculum: implications for future research. education and information technologies, 22(2), 421-443.
qian, y., hambrusch, s., yadav, a., & gretter, s. (2018). who needs what: recommendations for designing effective online professional development for computer science teachers. journal of research on technology in education, 50(2), 164-181.
sax, l.
j., lehman, k. j., jacobs, j. a., kanny, m. a., lim, g., monje-paulson, l., & zimmerman, h. b. (2017). anatomy of an enduring gender gap: the evolution of women’s participation in computer science. the journal of higher education, 88(2), 258-293.
talen, e. (1998). visualizing fairness: equity maps for planners. journal of the american planning association, 64(1), 22-38.
tussyadiah, i. (2020). a review of research into automation in tourism: launching the annals of tourism research curated collection on artificial intelligence and robotics in tourism. annals of tourism research, 81, 102883.
vakil, s. (2018). ethics, identity, and political vision: toward a justice-centered approach to equity in computer science education. harvard educational review, 88(1), 26-52.
wang, d., liang, y., xu, d., feng, x., & guan, r. (2018). a content-based recommender system for computer science publications. knowledge-based systems, 157, 1-9.
webb, m., davis, n., bell, t., katz, y. j., reynolds, n., chambers, d. p., & sysło, m. m. (2017). computer science in k-12 school curricula of the 21st century: why, what and when?. education and information technologies, 22(2), 445-468.
weintrop, d. (2019). block-based programming in computer science education. communications of the acm, 62(8), 22-25.
weintrop, d., & wilensky, u. (2017). comparing block-based and text-based programming in high school computer science classrooms. acm transactions on computing education (toce), 18(1), 1-25.
wolch, j., wilson, j. p., & fehrenbach, j. (2005). parks and park funding in los angeles: an equity-mapping analysis. urban geography, 26(1), 4-35.
xiang, h., lee, c., koo, h. j., gong, x., & whangbo, m. h. (2013). magnetic properties and energy-mapping analysis. dalton transactions, 42(4), 823-853.

sathish kumar et al. enhanced wearable strap for feminine using iot | 80

enhanced wearable strap for feminine using iot

sathish kumar, s. nandhini, r. sujitha
department of computer science and engineering, manakula vinayagar institute of technology, india
*corresponding email: sathishmail8@gmail.com

abstract

women are increasingly experiencing a slew of security difficulties while traveling alone at night, particularly in the it industry. despite the benefits of new technologies, the rate of crime against women continues to rise. even though numerous security gadgets are available on the market, women are unaware of them and do not use them. we develop a prototype design using the internet of things (iot), a term that refers to physical objects that are outfitted with sensors, computing power, software, and other technologies that enable them to connect and exchange data with other devices and systems over the internet or other communication networks. our project's goal is to give protection to working women and children. most crimes occur as a result of a person's lack of awareness. we plan to keep the person aware throughout by administering a vibration. the device comprises components such as the start button, arduino uno, panic button, gsm sim800c, gps neo-6m, pulse sensor, vibration motor, 6v transformer, and touch sensor. the current location is determined using gprs and the gps neo-6m. in emergency cases, a pulse sensor detects the person's actual pulse rate. the person must use a touch sensor to stop the vibration. if the vibration does not stop within the specified time, it is assumed that the person is not in an active state. the emergency alert is then sent to the predefined contacts stored on the arduino board. a 6v transformer supplies power to the entire device. the future scope of this project includes collecting datasets of all hospitals for emergency purposes and creating offline maps to locate the victim without an internet connection. the device could also contain a microphone and a camera to monitor the situation live.

article info
article history: received 25 may 2022, revised 30 may 2022, accepted 10 june 2022, available online 26 june 2022
keywords: arduino uno, panic button, gps neo-6m, gsm sim800c, vibration motor, pulse sensor

international journal of informatics, information system and computer engineering 3(1) (2022) 80-92

1. introduction

in terms of women's security, we are most likely living in the darkest period in our contemporary society's history. we want to make women feel as powerful as they have always been, strong enough to combat the parasites of our society, strong enough to overcome obstacles, and strong enough to protect themselves from sexual attacks. we want to empower those who are essential to our survival. our goal is to create a system that will make every location and hour safer for women once again, a mechanism that will restore humanity's communal nature. with a single touch of a button, this technology will geotag and send an sos alert to the local police station, close contacts, and anyone in and around the crime scene. the goal is to compensate for the time it takes the police to arrive at the scene. most of the incidents happened in public places, especially train stations, markets, and public gatherings. fig. 1 shows a pie chart of the percentage of incidents that happened in public places.

fig. 1. percentage of incidents that happened in public places

among the countries compared, thailand has the highest percentage of women who have suffered physical assault, ahead of the united kingdom, brazil, and india. the bar chart in fig. 2 represents the percentage of physical assault faced by women in different countries.

fig. 2. percentage of women who suffered physical assaults

according to the survey, the percentage of cases has increased in the last five years, and women between the ages of 18 and 24 in particular have faced consequences such as harassment. fig. 3 shows a bar chart of the percentage of cases by age group of the women who faced harassment.

fig. 3. most women harassed in public places in the last five years were 18-24 years old

2. literature survey

"smart wearable device for women safety using iot": v. hyndavi, n. sai nikhita, s. rakesh, 2021. any woman must touch the push button if she feels insecure or in danger. the victim's image will then be simultaneously taken and saved on the raspberry pi operating system (os). the previously saved os image is then removed automatically, because adding more photographs increases storage use (hyndavi et al., 2020).

"design and development of an advanced affordable wearable safety device for women: freedom against fearsome": israt humaira, kazi arman ahmed, sayantee roy, zareen tasnim safa, f. m. tanvir hasan raian, md. ashrafuzzaman, 2021. this device is simple in design, affordable, and practicable. by providing a hidden camera, gps and gsm module, shocking device, and voice recognizer, our products "bohnni" and "badhon" will ensure a safe atmosphere for women and children in any scenario.
there are already a variety of health and activity monitoring sensor belts on the market, but the design of an integrated safety device is fundamentally different since very accurate calculations and designs are required (humaira et al., 2021).

"iot based wearable device to monitor the signs of quarantined remote patients of covid-19": nizar al bassam, shaik asif hussain, ammar al qaraghuli, jibreal khan, e.p. sumesh, vidhya lavanya, 2021. internet of things (iot) technology has made remote health monitoring systems usable, accessible, and handy for measuring and recording patient parameters in a comfortable setting. the proposed iot-based health monitoring system can measure physiological parameters and health symptoms in covid-19-affected patients and transmit those data to an application peripheral interface (api), which serves as a database for browsing and monitoring the infection level (bassam et al., 2021).

"women self defense device": jayashree agarkhed, aishwarya rathi, maheshwari, faqarunnisa begum, 2021. our main intention in the current circumstance is that women should always feel free to roam. a camera and a voice recognition module may also be installed in a watch so that it captures and records all of the photos and speech of the victim, which are then delivered to the registered numbers and saved immediately (agarkhed et al., 2020).

"smart security device for women based on iot using raspberry pi": prottasha ghosh, tanjimmasroor bhuiyan, muhib ashraf nibir, md. emran hasan, md. rabiul islam, md. rokib hasan, tanvir hossain, 2021. this gadget may be changed with a clever and effective one, as security is a big issue for any woman in today's environment. for the first time, this sort of gadget has been used to capture direct evidence and store it on the web for later use (ghosh et al., 2021).
"iot based unified approach for women safety alert using gsm": venkatesh. k, parthiban. s, santhosh kumar. p, vinoth kumar. c. n. s, 2021. the goal of this project is to ensure that every lady in our general population believes that everything in the world is ok and secure. according to a survey in india, 53 out of 100 working women do not believe that everything is ok in the world when women are working at night. women experiencing barriers are also prevalent in delhi, mumbai, hyderabad, kolkata, and pune, according to a survey of 86 working women in india (venkatesh et al., 2021).

"arduino based smart shoe system for women safety, defense and integrated intelligent tracking": syed mosharaf hossain, subhendu chakraborty, 2021. this smart safety shoe for ladies is a ready-to-wear gadget for women's everyday use. to the best of the writers' knowledge, there is no such off-the-shelf, ready-to-use equipment that can be utilised by women for safety purposes. there is no need for a complicated charging circuit with this gadget, and no wires can be seen from the exterior. the main purpose of this project was to create a ready-to-use and portable smart ladies' safety gadget (hossain et al., 2021).

"iot based women security: a contemplation": deepinder kaur, ravitachahar, jatinder ashta, 2020. in comparison to mobile applications, preprogrammed smart hardware devices can perform efficiently in dangerous circumstances. in the future, a new device can be designed for women who use public transportation in which the passenger and driver must give biometric impressions before beginning the trip, and the information of both is used by gps in case of an unfortunate event, regardless of whether the passenger has a smart phone or not (kaur et al., 2020).

"smart safety and security solution for women using knn algorithm and iot": bysani sai yaswanth, darshan r s, pavan h, srinivasa d b, b t venkatesh murthy, 2020.
this study focuses on an iot-based self-security system that is comfortable, easy to use, and wearable, and that assists in sharing the user's position when they are panicked, as well as locating the closest safe area. the user's position is sent into the machine learning system, which then forecasts the location of the nearest safe spot (yaswanth et al., 2020).

"lifecraft: an android based application system for women safety": rabbina ridankhandoker, shahreen khondaker, fatiha-tus-sazia, fernaznarin nur, shaheena sultana, 2019. this study presents a new model of women's safety that attempts to create a particularly safe environment for women. this study compares the essential requirements of an intelligent security system with technological demand and system construction problems. it may be employed in a variety of situations, including sexual problems, accidents, hijackings, and public attacks (khandoker et al., 2019).

"protecht - implementation of an iot based 3-way women safety device": trisha sen, arpita dutta, shubham singh, vaegaenveen kumar, 2019. the iot-based 3-way women's safety device accomplishes its goal by offering users a self-defense mechanism as well as facilities for collecting information and recording evidence. when the user says the word "emergency" in the android app, the saved emergency contacts receive a message and a call, as well as the user's current position. the gadget responds quickly and may assist the user in remaining safe in any situation (sen et al., 2019).

"smarisa: a raspberry pi based smart ring for women safety using iot": navya r sogi, priya chatterjee, nethra u, suma v, 2018. this report proposed a method through which a woman can immediately notify the appropriate authorities if she is in danger. the suggested method obtains the device's coordinates by using gps tracking of the smart phone.
this method also uses the image's url and an alert message to notify the family and police officers (sogi et al., 2018).

"b with uiot based woman security system": akhila c m, jisna raju, lakshmi sethunath, vishnu prasad yadav, mrs. manju bravada’s, 2019. we have now created a project that will address crucial concerns that women confront in today's society and give them the strength to fight back. it includes a band prototype, shock shoe, and mobile device that displays the current location on a google map as well as live streaming on the internet (akhila et al., 2019).

"design of a smart safety device for women using iot": wasim akram, mohit jain, c. sweetlinhemalatha, 2019. this work proposes a solution that attempts to address the shortcomings of the current system while also ensuring women's false proof safety. according to the study, a smartphone app created for women's protection shows safe spots near the victim's present position on a map so that women may reach a safe place from their current location (akram et al., 2019).

"the personal stun-a smart device for women's safety": shivani ahir, smit kapadia, prof. jigar chauhan, prof. nidhi sanghavi, 2018. they concentrated on creating a smart bracelet that can be triggered by tapping the screen twice. the pulse sensor is an integrated sensor in the gadget that detects the person's pulse rate. the gadget also has a force sensor that activates when it is thrown with force. on the android platform, a smart application will be built that connects to the device through a bluetooth interface and displays the subject's detected data to the ice contacts (ahir et al., 2018).

"safeband: a wearable device for the safety of women in bangladesh": muhammad nazrul islam, nuzhat tabassum promi, jannatul maowa shaila, mohoshina akter toma, maria afnan pushpo, fatema binte alam, syeda nusraht khaledur, tasmiah tamzid anannya, md. fazle rabbi, 2018.
the 'safe band' system's assessment research revealed that the system efficiently executed all the capabilities and had a very good usability quality. in the future, it is planned to merge the entire gadget onto a single mini chip for real-world application, making transporting the system more efficient for women (islam et al., 2018).

"design and development of an iot based wearable device for the safety and security of women and girl children": anandjatti, madhvikannan, alisha rm, vijayalakshmi p, shresthasinha, 2018. they focused on using the internet of things in conjunction with wearable devices to ensure the protection and security of women and girls. body position is estimated using raw accelerometer data from a triple axis accelerometer (jatti et al., 2016).

3. objective of the research

women are not safe anywhere and are most vulnerable when traveling alone on lonely roads and in deserted places, so the purpose of our effort is to reduce the rate of crime against women. the smart band for women's safety is developed to keep the victim safe from danger. when a woman or a child feels threatened, she must activate the device by pressing the start button. after activating the device, she presses the panic button to send the message "help" along with the current location to the predefined contacts via the gsm module and gprs. at the same time, the pulse sensor detects the person's pulse rate. when the live location is sent, the person's pulse rate is also sent. in addition, a vibration motor is placed in the device to always keep the person alert. when in the active state, the person must use the reset button to stop the vibration.
if the vibration motor is not stopped within a certain period, assuming the person is not conscious, an emergency alert is sent to the predefined contacts with the subject "emergency". the device receives power from the transformer ranges of 6v. 4. existing work when women are traveling alone or in a distant region where they cannot acquire help or appropriate assistance, they are attacked (sogi et al., 2018). this research proposes an internet of things-based solution to the problem of women's safety that overcomes present technological restrictions. when a woman is in danger, the proposed design incorporates mechanisms that warn family members and a nearby police station so that help may be sent quickly. a shock waves generator is also included in the suggested design, which women may use to assault the perpetrator. some of the other components of the projected work that give further assistance to women are as follows: 1. sending group sms with the device and the victims' phones. 2. the victim's voice is recorded and later used as evidence against the offender. 3. locating a safe spot on a map from where the victim is now. 5. design of the existing work the suggested women's safety device supports a woman who is in a potentially perilous situation. the device is practically ready for everything that might go against the woman's wishes. an at mega 328 microcontroller oversees it. the gadget is activated by a fingerprint scanner, a gsm (global system for mobile communications) module transmits alarm messages, a buzzer warns the environment, and a shock wave generator is utilized for selfdefence. on an lcd, the message is shown. the following fig.4 explains how the gadget works: the woman's fingerprint must first be registered before the device can be activated. the gadget continues scanning the woman's fingerprint every minute as soon as she sathish kumar et al. enhanced wearable strap for feminine using iot | 86 turns it on. 
the gadget will be activated if the scanner finds no fingerprint, sounding the buzzer to alert people nearby. because it only scans the woman's fingerprint in an emergency, i.e., when she detects a threat, the device's quality is unaffected. the gps data is also relayed to the lcd and the gsm modem, which delivers the message to the woman's relatives and friends (sogi et al., 2018). even if she is pulled down from behind and is unable to activate an alarm, the device sends an emergency message to all of the woman's ice (in case of emergency) contacts, alerting them to her current position. the design also includes a shock wave generator that the woman can use as a weapon for self-defence. the shock wave generator's hardware comprises a switch, a transformer, and wires. one of the two free ends of the wires acts as the high-voltage source, while the other serves as the ground of the return path. because the loose ends are not in direct contact, the high voltage cannot arc until they touch the attacker's body, which acts as a conducting path between the two ends. the circuit has three important stages:
•the oscillator
•the voltage amplifier
•the power supply
when the battery is fully charged, the voltage is delivered to the oscillator stage. the transformer serves as an inverter, increasing the frequency of oscillation. the output of the transformer is then delivered to the capacitors, which store the charge until it is used to shock the aggressor. the existing work also adds some features that are supported through an android application, as shown in fig. 5.
fig. 4. workflow of the proposed design
fig. 5. additional features supported through android application
6. proposed work
our proposed smart band consists of hardware components connected to an arduino uno.
when the panic button is pressed, the gps and gsm modules activate, tracking the user's current location and sending a message that includes the victim's heart rate as measured by the pulse sensor. our primary goal is to keep the victim awake. to do this, we run a vibration motor for a set period. once the victim is awake, she can manually switch off the vibration; if it continues, another alarm message is sent to the predetermined contact. a switch is used to turn on the device. a push button triggers location tracking, message sending, and further processing (see figs. 6-10).
module 1: activating the device and tracing the location using the gps neo-6m module.
module 2: sending messages to the predefined contact.
module 3: pulse rate detection of the person.
module 4: finding the active state of the person using the vibration motor.
a) proposed architecture
fig. 6. architecture diagram
b) module 1: activating the device and tracing the location using the gps neo-6m module
global positioning system neo-6m:
parts used:
1. arduino uno board
2. neo-6m gps module
3. jumper wires
connections:
1. first, the rx pin of the gps module is connected to arduino pin 8.
2. the gps module's tx pin is then linked to pin 9 on the arduino.
3. the gps module's vcc pin is then linked to the 12v power supply.
4. the gps module's gnd is then linked to the power supply's gnd.
fig. 7. diagrammatic view of tracking and sending location
c) module 2: sending messages to the predefined contacts
global system for mobile communication sim 800c:
parts used:
1. arduino uno board
2. sim
3. jumper wires
connections:
1. the sim card must first be inserted into the sim cardholder of the sim 800c.
2. arduino pin 4 is linked to the p in the gsm.
3. arduino pin 3 is linked to the t in the gsm.
the panic button is connected to vcc, and the gnd is linked to the arduino gnd.
d) working description:
fig. 7 shows how the gps and gsm modules are connected to the arduino. the start switch must first be used to turn on the gadget. the panic button must then be held down for 5 seconds. once the gps has tracked the current location, the location is sent via the sim 800c gsm module to the designated contacts along with the message "help me."
e) module 3: pulse rate detection of the person
the arduino uno board has a pulse sensor attached to it.
parts used:
1. arduino uno board
2. pulse rate sensor
3. jumper wires
connections:
1. the pulse sensor's a0 is linked to the arduino uno's a0.
2. the arduino uno's gnd is connected to the pulse sensor's gnd.
3. the pulse sensor's vcc is connected to the start button.
fig. 8. code of gsm sending sms
f) working description:
fig. 8 shows how the gsm module is connected to the arduino and the workflow by which it sends messages to the predefined contacts. the pulse sensor monitors the person's pulse rate at frequent intervals. as soon as the panic button is pressed, the person's pulse rate is sent to the designated contacts together with the person's location. if the pulse rate rises above or falls below the threshold level, the alert message "emergency need help" is sent to the predefined contacts once more.
g) module 4: finding the active state of the person using the vibration motor
parts used:
1. vibration motor
2. arduino uno board
3. jumper wires
connections:
1. the arduino uno's gnd and the vibration motor's gnd are connected.
2. arduino pin 7 is connected to the vcc of the vibration motor.
fig. 9. code of vibration motor
h) working description:
fig. 9 shows how the vibration motor is connected to the arduino and how the vibration functions.
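the panic-button and pulse-rate behaviour described in d) and f) can be sketched as plain decision logic. the following c++ is a minimal, hardware-free sketch rather than the authors' firmware: the names `panicTriggered`, `nextAlert` and `formatMessage`, the message layout, and the 50-120 bpm limits are assumptions made for illustration (the paper gives the 5-second hold time and the two message texts, but not the threshold values or the code).

```cpp
#include <cstdint>
#include <string>

// Hold time taken from the working description (5 seconds).
constexpr std::uint32_t kPanicHoldMs = 5000;
// Hypothetical pulse limits: the paper only mentions "the threshold level".
constexpr int kMinBpm = 50;
constexpr int kMaxBpm = 120;

enum class Alert { None, Help, Emergency };

// True once the button has been held continuously for the required time.
// Unsigned subtraction stays correct if the millisecond counter wraps around.
bool panicTriggered(bool pressed, std::uint32_t pressedSinceMs, std::uint32_t nowMs) {
    return pressed && (nowMs - pressedSinceMs) >= kPanicHoldMs;
}

// True when the measured pulse lies outside the assumed safe band.
bool pulseOutOfRange(int bpm) {
    return bpm < kMinBpm || bpm > kMaxBpm;
}

// Decides which alert, if any, should be sent next.
// Assumption: the pulse-based alert is only raised after the first
// "help me" message has gone out, following the order in the text.
Alert nextAlert(bool helpAlreadySent, bool panicNow, int bpm) {
    if (!helpAlreadySent) {
        return panicNow ? Alert::Help : Alert::None;
    }
    return pulseOutOfRange(bpm) ? Alert::Emergency : Alert::None;
}

// Builds the SMS text; the "bpm= lat= lon=" layout is illustrative only.
std::string formatMessage(Alert alert, int bpm, double lat, double lon) {
    if (alert == Alert::None) return "";
    std::string text = (alert == Alert::Help) ? "help me" : "emergency need help";
    return text + " bpm=" + std::to_string(bpm) +
           " lat=" + std::to_string(lat) + " lon=" + std::to_string(lon);
}
```

on the arduino, the time arguments would typically come from `millis()`, and the returned text would be handed to the gsm module. keeping the decisions in pure functions lets them be checked on a desktop compiler before any wiring is done.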
the vibration motor is used to detect the person's active state. every 30 minutes, it vibrates for 5 seconds. if the person is in an active state, she must press the button to stop the vibration. if the vibration has not been stopped after a while, an alert message is sent to the predetermined contacts.
i) our prototype
fig. 8. our actual prototype
j) message along with the bpm value
fig. 9. gps location
fig. 10. gps location
7. result and discussion
a) device price comparison
we surveyed the prices of existing devices, and the comparison is shown in fig. 11.
fig. 11. based on cost comparison
b) pulse rate detection
a person's pulse rate is detected with the help of the pulse sensor. the bar chart in fig. 12 shows the average bpm of people grouped by age.
fig. 12. pulse rate detection level
c) vibration analysis
the rate of vibration is calculated from the peak values. fig. 13 represents the peak-to-peak values as waves over the period.
fig. 13. vibration rate level
the actual rate of vibration detected by our device is shown in fig. 14.
fig. 14. vibration detection rate
8. conclusion
our main intention is that women should always feel free to roam, wherever and whenever they wish, and should not experience adverse circumstances of any type. as a result, we created the smart band, which consists of several hardware parts connected to iot devices. since the device features gps and gsm modules, it can track the wearer's whereabouts and send messages to predefined contacts. in addition, a vibration motor is used to keep the wearer awake for a set amount of time.
we believe that women will benefit greatly from this device in difficult situations. future applications of the device will benefit youngsters, physically challenged women, and senior citizens. in addition to improving women's safety in india, this project can be used by parents to find out where their children are. we may further improve it by compiling hospital datasets, so that gps and gsm can be used to follow the wearer's location and send signals to the closest hospital when her pulse drops or she needs emergency care. we may also create offline maps to locate the victim even when there is no signal or internet connection, and include a camera and a microphone to monitor activities live when needed.
references
agarkhed, j., rathi, a., & begum, f. (2020, october). women self defense device. in 2020 ieee bangalore humanitarian technology conference (b-htc) (pp. 1-5). ieee.
ahir, s., kapadia, s., chauhan, j., & sanghavi, n. (2018, january). the personal stun-a smart device for women's safety. in 2018 international conference on smart city and emerging technology (icscet) (pp. 1-3). ieee.
akhila, c. m., raju, j., sethunath, l., yadav, v. p., & bhavadas, m. m. (2019). b with u-iot based woman security system. vol, 8, 507-509.
akram, w., jain, m., & hemalatha, c. s. (2019). design of a smart safety device for women using iot. procedia computer science, 165, 656-662.
al bassam, n., hussain, s. a., al qaraghuli, a., khan, j., sumesh, e. p., & lavanya, v. (2021). iot based wearable device to monitor the signs of quarantined remote patients of covid-19. informatics in medicine unlocked, 24, 100588.
ghosh, p., bhuiyan, t. m., nibir, m. a., hasan, m. e., islam, m. r., hasan, m. r., & hossain, t. (2021, january). smart security device for women based on iot using raspberry pi.
in 2021 2nd international conference on robotics, electrical and signal processing techniques (icrest) (pp. 57-60). ieee. hossain, s. m., & chakraborty, s. (2021). arduino based smart shoe system for women safety, defense and integrated intelligent tracking, www.ijaceeonline.com. 6(1), pp. 1–7. humaira, i., ahmed, k. a., roy, s., safa, z. t., raian, f. m. t. h., & ashrafuzzaman, m. (2021). design and development of an advanced affordable wearable safety device for women: freedom against fearsome. adv. sci., technol. eng. syst. j., 6(2), 829-836. hyndavi, v., nikhita, n. s., & rakesh, s. (2020, june). smart wearable device for women safety using iot. in 2020 5th international conference on communication and electronics systems (icces) (pp. 459-463). ieee. islam, m. n., promi, n. t., shaila, j. m., toma, m. a., pushpo, m. a., alam, f. b., ... & rabbi, m. f. (2018, november). safeband: a wearable device for the safety of women in bangladesh. in proceedings of the 16th international conference on advances in mobile computing and multimedia (pp. 76-83). jatti, a., kannan, m., alisha, r. m., vijayalakshmi, p., & sinha, s. (2016, may). design and development of an iot based wearable device for the safety and security of women and girl children. in 2016 ieee international conference on recent trends in electronics, information & communication technology (rteict) (pp. 1108-1112). ieee. kaur, d., chahar, r., & ashta, j. (2020, march). iot based women security: a contemplation. in 2020 international conference on emerging smart computing and informatics (esci) (pp. 257-262). ieee. khandoker, r. r., khondaker, s., nur, f. n., & sultana, s. (2019, december). lifecraft: an android based application system for women safety. in 2019 international conference on sustainable technologies for industry 4.0 (sti) (pp. 1-6). ieee. sen, t., dutta, a., singh, s., & kumar, v. n. (2019, june). protecht–implementation of an iot based 3–way women safety device. 
in 2019 3rd international conference on electronics, communication and aerospace technology (iceca) (pp. 1377-1384). ieee.
sogi, n. r., chatterjee, p., nethra, u., & suma, v. (2018, july). smarisa: a raspberry pi based smart ring for women safety using iot. in 2018 international conference on inventive research in computing applications (icirca) (pp. 451-454). ieee.
venkatesh, k., parthiban, s., kumar, p. s., & kumar, c. v. (2021, february). iot based unified approach for women safety alert using gsm. in 2021 third international conference on intelligent communication technologies and virtual mobile networks (icicv) (pp. 388-392). ieee.
yaswanth, b. s., darshan, r. s., pavan, h., srinivasa, d. b., & murthy, b. v. (2020, december). smart safety and security solution for women using knn algorithm and iot. in 2020 third international conference on multimedia processing, communication & information technology (mpcit) (pp. 87-92). ieee.

11 | international journal of informatics information system and computer engineering 4(1) (2023) 11-22
doi: https://doi.org/10.34010/injiiscom.v4i1.9558 p-issn 2810-0670 e-issn 2775-5584
computational thinking: the essential skill for being successful in knowledge science research
adam mukharil bachtiar
school of knowledge science, japan advanced institute of science and technology, 1-8 asahidai, nomi shi, ishikawa, japan
*corresponding email: s2220401@jaist.ac.jp
1. introduction
we are living in society 5.0, which focuses on increasing the capability of human beings to create innovation (carayannis & morawska-jancelewicz, 2022). the boundary between the real and digital worlds seems to be fading. interaction among humans is very intense on social media. digitalization is a demand that the constituent parties in society should fulfill.
higher education plays a significant role in ensuring the availability of human resources with the requisite skills (shin, 2015). digitalization leads society to a new concept called the vuca world (mack & khare, 2016). this concept added an extra challenge for society.
abstract
the vuca world concept was established in 2016 as the new universe of challenges in the 21st century. humans live in society 5.0 and the vuca world simultaneously. "digital" has been a much-discussed word since then. many requisite skills make up the survival kit for this kind of era. the effect of the vuca world is spreading to ways of thinking and of creating innovation, especially in the research domain. as a newcomer, knowledge science should state the requisite skills its researchers need to conduct their research successfully. many researchers have offered computational thinking as a candidate essential skill for coping with the effect of the vuca world. this study used a descriptive analysis method based on several literature reviews to map how computational thinking can serve as a best practice for knowledge science research. this study successfully revealed the connection between computational thinking and knowledge science research.
article info
article history: submitted/received 21 jan 2023; first revised 23 feb 2023; accepted 27 march 2023; first available online 09 april 2023; publication date 01 june 2023
keywords: computational thinking, requisite skills, research, knowledge science, vuca.
the vuca world brings uncertainty and an unpredictable future that disrupt the stability of the digital era (johansen & euchner, 2013). the presence of vuca threatens the process of creating innovation on digital platforms. society 5.0 has stated the requisite skills for humans in the 21st century as the primary skills to be achieved by undergraduate and graduate students. communication, critical thinking, collaboration, creativity, and problem-solving are some of the powerful words that appear in the list of requisite skills for the 21st century (gutiérrez-núñez et al., 2022; sermsri et al., 2022; van laar et al., 2020). but uncertainty still has to be addressed, which forces society to add an extra, complementary skill. in the middle of this chaos, knowledge science has become more mature as one of the knowledge domains. a classification into hard and soft research types was published to encourage the role of knowledge science in the higher education domain (huang et al., 2016). without leaving much room for debate about the position of knowledge science relative to the existing knowledge domains, many researchers in the field have successfully filled gaps in information science research. the difference in granularity between information and knowledge supports the conclusion that knowledge science deserves to be a new domain (zins, 2006). like other fields, knowledge science needs to answer the challenge posed by the vuca world. future researchers in this field should possess extra skills, beyond the essential skills of the 21st century, to survive and continue to produce research. some recent research points to a new skill called computational thinking. most studies reveal the importance of computational thinking in education, including the knowledge science domain. that thinking method has been shown to change how humans behave when solving problems (sermsri et al., 2022; kong, 2022; yasar et al., 2023).
the way of teaching has also been affected by computational thinking. today's scientists not only leverage computational tools to conduct their investigations but often must also contribute to designing the computational tools for their specific research (hurt et al., 2023). because of this need, this study focuses on revealing the intuition and way of thinking behind computational thinking when conducting knowledge science research. using descriptive analysis, this study examined each element of computational thinking in some knowledge science research to provide a general overview for future researchers in the knowledge science field.
2. method
this study involves two significant concepts: computational thinking and knowledge science. the two seem different, but computational thinking is a general thinking skill that can exist in every domain. the first sub-section explains computational thinking, and the second explains knowledge science as the domain.
2.1. computational thinking
jeannette wing, the founder of computational thinking (also known as ct), used terms such as problem-solving method and how computers execute the solution to give a powerful understanding of computational thinking (wing, 2014). in one famous statement, she said that a person who thinks computationally is involved in formulating problems and their solutions so that the solutions are represented in a form that an information-processing agent can effectively carry out (shute & asbell-clarke, 2017). the phrase "an information-processing agent" refers to the computer. some research demonstrates how computational thinking is an essential thinking method across many fields.
table 1 shows research on computational thinking in several science fields, both natural and social, and table 2 lists the four elements of ct.
table 1. computational thinking in some science research
study: explanation
qin, h., 2019: this study examines comprehensively how ct helps the participants learn about bioinformatics using computer laboratory exercises. the researchers can examine how to implement ct early in bioinformatics learning even though they cannot determine which elements are significant.
weintrop et al., 2016: this study argued about the position of ct in supporting the mathematics and science context. modeling and simulation became the part most significantly affected by ct.
chongo et al., 2021: chemistry is another field that ct has entered. the experiment using plugged and unplugged ct methods is the central part of this study.
güven & gulbahar, 2020: social science was predicted to be the last field that ct would enter. this study provided a comprehensive account of how to implement ct in social studies.
table 2. four elements of computational thinking (mack & khare, 2016)
element of ct: explanation
abstraction: a. determine the fundamental problem from all the phenomena. b. reformulate it into a solvable problem that can be treated as a computational case.
decomposition: break down the problem into several sub-problems that can be solved more intuitively.
algorithm: construct a series of structured steps to be followed in solving the problem.
pattern recognition: find the similarities and shared characteristics between problems.
those studies explain well that ct became a mature thinking skill within a short time of being introduced by its founder. to think using this approach, the researchers should implement four elements of ct.
with those elements, ct can be implemented.
2.2. knowledge science
according to nakamori (2011), knowledge science is an emerging discipline resulting from the demands of a knowledge-based economy and the information revolution. the distinction between information and knowledge triggered the shift of the field's name from information science to knowledge science. changing the area's name reflects that current information science primarily focuses on exploring the mediating aspects of human knowledge (zins, 2006). unlike information science, which focuses on manipulating the form and structure of information, knowledge science concentrates on optimizing the knowledge creation process, either by producing new knowledge using some methodologies or by serving the optimization of human and social concepts in the knowledge creation process. there are two classifications of research in knowledge science (huang et al., 2016; hlupic et al., 2002). the classification is based on the type of process in the knowledge management area, which is a significant area in the knowledge science field. figure 1 shows the research classifications in the knowledge science area. both types equally share the central roles of knowledge science, such as knowledge creation, knowledge sharing, knowledge management, and knowledge evaluation.
fig. 1. two classifications in knowledge science research
3. results and discussion
3.1. examples of hard and soft-type research
some examples of each type of knowledge science research are presented in this study before each element of ct is mapped onto the research. the examples give an intuition for distinguishing hard-type from soft-type research in knowledge science.
table 3 shows some examples of hard-type knowledge science research in the school of knowledge science, jaist. table 4 shows its opposite, soft-type research in the knowledge science domain.
table 3. examples of hard-type knowledge science research
research: short summary
ono et al., 2022: this research produces technology for skiers learning using virtual reality. deep learning was used to recognize the skiing posture to be evaluated.
tan et al., 2019: this research produces knowledge in the form of infographics about catalyst degradation mechanisms based on operand spectro imaging and unsupervised learning from 3d images.
hamanaka et al., 2016: this research focused on implementing lerdahl and jackendoff's (1983) generative theory of tonal music (gttm) to generate new music based on the training data.
miyata et al., 2012: this research generates several procedural technologies that can be used to generate pattern images (3d models).
torii et al., 2022: this research predicts how movement characterizes the degree of animacy and measures it using granger causality.
table 4. examples of soft-type knowledge science research
research: short summary
sinthupundaja et al., 2019: this research examined the importance of the causal combinations of knowledge-acquisition conditions using fuzzy set qualitative comparative analysis.
shahzad et al., 2016: this research aimed to identify whether integration between knowledge strategy and knowledge management (km) processes leads to organizational creativity and performance.
hashimoto, 2006: this research focused on modeling to clarify the evolutionary process of language, and evolutionary economics defines the dynamics of economic phenomena.
uchihira et al., 2012: this research generated a model to optimize the knowledge transfer process in r&d project management.
kim, 2017: this research aimed to identify the factors that influence the creation of innovative ideas.
two workshops were conducted to reveal the influential factors.
3.2. abstraction for simplification
the intuition behind abstraction is to determine the essential part of the problem and generalize the problem to find its proper form. it can simplify a complex problem into an intuitive, well-identified one. unnecessary elements of the problem can be excluded, which sharpens the researchers' focus. for example, hamanaka et al. (2016) and ohmura et al. (2018) studied the generation of music based on relationships described by a lattice probability distribution. that work found the essence of the relationship between the tones in a piece of music. the relationship was then used to generate new music. figure 2 gives an overview of the abstraction in that research. in soft-type research, abstraction can be used to generalize a procedure and its problem into one model, which is then improved during the research. a famous abstraction result in soft-type research is the seci model, which focuses on optimizing the knowledge creation process in an organization (farnese et al., 2019).
3.3. decomposition for reducing the complexity
the intuition behind decomposition is to break down the identified problem into several easy-to-handle chunks. the primary strategy is divide and conquer, which leads the researchers to the estimated solution. in some research, decomposition is not easy, especially when the problem involves a complex system. soft system methodology and the i-system can decompose the complexity among the constituent parties and their emergence (nakamori, 2011; mingers et al., 1992).
in knowledge science research, decomposition can be used to examine the interventions in an i-system; the solution is then approached along three dimensions: the scientific, collaboration, and creative dimensions. in the end, the three solutions offered by each dimension are integrated in the final phase of the i-system.
fig. 2. the abstraction of tonal music generation
the study by kim (2017) demonstrated how well decomposition can work. the study focused on revealing the influential factors in idea generation and enhancing them using analogical thinking. the experiment in that study was divided into two workshops. the first workshop focused on revealing the influential factors, and in the end three influential factors were revealed. the second workshop then focused on enhancing those factors using analogical thinking. the conclusion of a piece of research can be reached through several processes. each process produces an output that serves as the input to the following process. breaking the problem down properly, especially into processes that can run in parallel, increases efficiency and the likelihood of successful research.
3.4. algorithm to lead the research
the algorithm is the most familiar of all the ct elements. researchers must build a structured and sequenced process that leads from the problem to the solution. in knowledge science research, the algorithm can play the role of the methodology of a study. the study by miyata et al. (2012) demonstrated the algorithm in the form of procedural technology for pattern generation or 3d pattern generation. the pattern can be used in a kimono or building structure.
the step-by-step construction of the procedural technology from the actual pattern is one clear example of how the algorithm played an essential part in this research. as another example, an algorithm can serve as the procedure for conducting an experimental workshop. uchihira et al. (2012) built a model for optimizing the knowledge transfer process in research and development projects. the algorithm helped the study to illustrate the flow in a structured project case and to conduct an internalization workshop that consisted of six well-structured steps. the algorithm is thus about structuring not only the programming process but also the experimental process.
3.5. pattern recognition for finding the similarity
this element plays a significant role in coping with uncertainty. rather than finding similarities, researchers often focus on finding the differences among studies. most studies are excellent at generating new approaches or results. even though they seem to be different, they are connected to each other. the five examples of hard-type knowledge science research in table 3 focus on finding hidden knowledge using several methods from knowledge discovery methodology. the differences lie in the source and form of the knowledge. similar connecting lines also appear in soft-type knowledge science research. the experimental method and the proposed model are the shared characteristics between studies; the differences lie in the domain and the design of experiments. the ability to find similarities helps researchers to shorten the time needed to grasp the intuition behind the research. from the similarities, they can mark the areas of the research domain map that other researchers have already covered and find the gaps between them. table 5 shows similarities for some of the research in tables 3 and 4.
table 5. similarities between research at the school of knowledge science, jaist, and other research outside jaist
(for each study: result of research, then similar research as evidence)
ono et al., 2022
result of research: the proper interaction model for vr to help novice skiers.
similar research as evidence: similar research was found in creating an interaction model for elder skiers using vr technology. the recent study involved deep learning in evaluating the properness of body posture when skiing.
tan et al., 2019
result of research: the similarity concepts and infographics that were produced by the unsupervised learning in the experimental material.
similar research as evidence: several concepts of data analytics and machine learning can be applied in material science. unsupervised learning was found in one study to be a knowledge discovery process in grouping microstructure material. the experimental object and the intention of data analytics can be a differentiating factor between the studies.
hamanaka et al., 2016
result of research: the relationship rules between the tune in music for generating new music. the vertical rule is an outstanding result of this research.
similar research as evidence: there is similar research about the vertical pattern in music and how to discover it. the discovery process is called computational music analysis. even though the similarity was so high, the difference is in the form of explicit knowledge produced by the algorithm.
torii et al., 2022
result of research: the pattern of movement characteristics was measured by the degree of animacy and granger causality for the robotic domain.
similar research as evidence: the collision prediction from the robotic movement scenario also resulted from another research. the subdomain from the studies is different, and the focus of movement prediction can differ from the studies.
sinthupund aja et al, 2019 the concept of the causal combinations of knowledgeacquisition condition rather than using fuzzy logic, the other study used bayesian network as their primary method to reveal the causal combination of the knowledge-acquisition condition. the dissimilarity also can be found in the proposed concept of knowledge acquisition. shahzad et al, 2016 the validated research model of the hypothesis about the integration between knowledge strategy and nowledge management and its correlation to organizational creativity and performance there are some studies about integrating other possible factors into knowledge management strategy. this further study focused on integrating the aspect of intellectual capital into knowledge management. the dissimilarity factors are the proposed integrated factors, and the destination of the effect comes from the integration procedure. hashimoto, 2006 the new pattern when doing recursion is to make the hierarchical structure one research mentioned several patterns in the linguistic domain. both studies are about finding a pattern in the linguistic model, but the methods used are different and also for their intention. 4. conclusion from the revealing process of ct in some knowledge science research, there are some conclusions for this research, such as: (a) computational thinking is a complementary skill to 21st-century skills. (b) the primary elements of ct in knowledge science research are abstraction and pattern recognition. the other two elements are similar to other elements in different skills. (c) abstraction optimizes how knowledge science researchers generalize problems and makes a model from this. 
(d) pattern recognition focuses on finding the similarity among the studies so that the researchers can focus more on dissimilarity factors.

references

ashrafi, m., davoudpour, h., & khodakarami, v. (2017). a bayesian network to ease knowledge acquisition of causal dependence in cream: application of recursive noisy‐or gates. quality and reliability engineering international, 33(3), 479-491.
carayannis, e. g., & morawska-jancelewicz, j. (2022). the futures of europe: society 5.0 and industry 5.0 as driving forces of future universities. journal of the knowledge economy, 1-27.
chongo, s., osman, k., & nayan, n. a. (2021). impact of the plugged-in and unplugged chemistry computational thinking modules on achievement in chemistry. eurasia journal of mathematics, science and technology education, 17(4).
conklin, d. (2002, august). representation and discovery of vertical patterns in music. in music and artificial intelligence: second international conference, icmai 2002 edinburgh, scotland, uk, september 12–14, 2002 proceedings (pp. 32-42). berlin, heidelberg: springer berlin heidelberg.
de vries, a. w., faber, g., jonkers, i., van dieen, j. h., & verschueren, s. m. (2018). virtual reality balance training for elderly: similar skiing games elicit different challenges in balance training. gait & posture, 59, 111-116.
farnese, m. l., barbieri, b., chirumbolo, a., & patriotta, g. (2019). managing knowledge in organizations: a nonaka’s seci model operationalization. frontiers in psychology, 10, 2730.
gutiérrez-núñez, s. e., cordero-hidalgo, a., & tarango, j. (2022). implications of computational thinking knowledge transfer for developing educational interventions. contemporary educational technology, 14(3), ep367.
güven, i., & gulbahar, y.
(2020). integrating computational thinking into social studies. the social studies, 111(5), 234-248.
hamanaka, m., hirata, k., & tojo, s. (2016). implementing methods for analysing music based on lerdahl and jackendoff’s generative theory of tonal music. computational music analysis, 221-249.
hashimoto, t. (2006). evolutionary linguistics and evolutionary economics. evolutionary and institutional economics review, 3, 27-46.
hlupic, v., pouloudi, a., & rzevski, g. (2002). towards an integrated approach to knowledge management: ‘hard’, ‘soft’ and ‘abstract’ issues. knowledge and process management, 9(2), 90-102.
huang, f., gardner, s., & moayer, s. (2016). towards a framework for strategic knowledge management practice: integrating soft and hard systems for competitive advantage. vine journal of information and knowledge management systems.
hurt, t., greenwald, e., allan, s., cannady, m. a., krakowski, a., brodsky, l., ... & dorph, r. (2023). the computational thinking for science (ct-s) framework: operationalizing ct-s for k–12 science education researchers and educators. international journal of stem education, 10(1), 1-16.
johansen, b., & euchner, j. (2013). navigating the vuca world. research-technology management, 56(1), 10-15.
kemmerer, d. (2012). the cross‐linguistic prevalence of sov and svo word orders reflects the sequential and hierarchical representation of action in broca’s area. language and linguistics compass, 6(1), 50-66.
kitahara, a. r., & holm, e. a. (2018). microstructure cluster analysis with transfer learning and unsupervised learning. integrating materials and manufacturing innovation, 7, 148-156.
kong, s. c. (2022). problem formulation in computational thinking development for nurturing creative problem solvers in primary school.
education and information technologies, 1-20.
li, c., & bacete, g. (2017). workshop design for enhancing the appropriateness of idea generation using analogical thinking. international journal of innovation studies, 1(2), 134-143.
mack, o., & khare, a. (2016). perspectives on a vuca world. managing in a vuca world, 3-19.
mingers, j., & taylor, s. (1992). the use of soft systems methodology in practice. journal of the operational research society, 43(4), 321-332.
miyata, k. (2012, november). procedural technology and creativity mining. in 2012 seventh international conference on knowledge, information and creativity support systems (pp. 169-174). ieee.
nakamori, y. (ed.). (2011). knowledge science: modeling the knowledge creation process. crc press.
ohmura, h., shibayama, t., hirata, k., & tojo, s. (2018). music generation system based on human instinctive creativity. proceedings of computer simulation of musical creativity (csmc2018).
ono, s., kanai, h., atsumi, r., koike, h., & nishimoto, k. (2022, june). learning support and evaluation of weight-shifting skills for novice skiers using virtual reality. in adaptive instructional systems: 4th international conference, ais 2022, held as part of the 24th hci international conference, hcii 2022, virtual event, june 26–july 1, 2022, proceedings (pp. 226-237). cham: springer international publishing.
qin, h. (2009, march). teaching computational thinking through bioinformatics to biology students. in proceedings of the 40th acm technical symposium on computer science education (pp. 188-191).
sermsri, n., sukkamart, a., & kantathanawat, t. (2022). thai computer studies student teacher complex problem-solving skills development: a cooperative learning management model.
journal of higher education theory and practice, 22(16), 87-99.
shahzad, k., bajwa, s. u., siddiqi, a. f. i., ahmid, f., & raza sultani, a. (2016). integrating knowledge management (km) strategies and processes to enhance organizational creativity and performance: an empirical investigation. journal of modelling in management, 11(1), 154-179.
shin, j. c. (2015). mass higher education and its challenges for rapidly growing east asian higher education. mass higher education development in east asia: strategy, quality, and challenges, 1-23.
shute, v. j., sun, c., & asbell-clarke, j. (2017). demystifying computational thinking. educational research review, 22, 142-158.
sinthupundaja, j., chiadamrong, n., & kohda, y. (2019). internal capabilities, external cooperation and proactive csr on financial performance. the service industries journal, 39(15-16), 1099-1122.
tan, y., matsui, h., ishiguro, n., uruga, t., nguyen, d. n., sekizawa, o., ... & tada, m. (2019). pt–co/c cathode catalyst degradation in a polymer electrolyte fuel cell investigated by an infographic approach combining three dimensional spectroimaging and unsupervised learning. the journal of physical chemistry c, 123(31), 18844-18853.
torii, t., oguma, k., & hidaka, s. (2022). animacy perception of a pair of movements under quantitative control of its temporal contingency: a preliminary study. artificial life and robotics, 27(3), 448-454.
uchihira, n., hirabayashi, y., sugihara, t., hiraishi, k., & ikawa, y. (2012, july). knowledge transfer in r&d project management: application to business-academia collaboration project. in 2012 proceedings of picmet'12: technology management for emerging technologies (pp. 3473-3480). ieee.
van laar, e., van deursen, a. j., van dijk, j. a., & de haan, j. (2020). determinants of 21st-century skills and 21st-century digital skills for workers: a systematic literature review. sage open, 10(1), 2158244019900176.
wang, y., ye, x., yang, y., & zhang, w. (2017, november).
collision-free trajectory planning in human-robot interaction through hand movement prediction from vision. in 2017 ieee-ras 17th international conference on humanoid robotics (humanoids) (pp. 305-310). ieee.
weintrop, d., beheshti, e., horn, m., orton, k., jona, k., trouille, l., & wilensky, u. (2016). defining computational thinking for mathematics and science classrooms. journal of science education and technology, 25, 127-147.
wiig, k. m. (1997). integrating intellectual capital and knowledge management. long range planning, 30, 399-405. https://doi.org/10.1016/s0024-6301(97)90256-9
wing, j. m. (2014). computational thinking benefits society. 40th anniversary blog of social issues in computing, 2014, 26.
yasar, o., maliekal, j., veronesi, p., little, l., meise, m., & yeter, i. h. (2023). retrieval practices enhance computational and scientific thinking skills. stem, steam, computational thinking and coding: evidence-based research and practice in children’s development, 16648714, 142.
zins, c. (2006). redefining information science: from “information science” to “knowledge science”. journal of documentation, 62, 447-461.
international journal of informatics information system and computer engineering 3(1) (2022) 93-104

design and implementation of a cloud based decentralized cryptocurrency transaction platform

benjamin kommey*, eric tutu tchao, emmanuel osae-addo, asiedu biney kofi yeboah, derick biney
kwame nkrumah university of science and technology, kumasi, ghana
*corresponding email: bkommey.coe@knust.edu.gh

abstract

trading in the crypto-currency market has seen rapid growth and adoption, as has interest in crypto-related technologies like blockchain and smart contracts. smart contracts have gained popularity in building so-called decentralized applications (dapps) and decentralized finance (defi) apps, mainly because they are more secure, trustworthy, and largely distributed (removing centralized control). defi applications run on blockchain technology and are secured by blocks (nodes) connected by cryptographic hash links. defi applications have great potential in the crypto-currency trading domain, providing a more secure and reliable means of trading and performing transactions with crypto-currencies. only verified transactions are added to the blockchain after being approved by miners through a consensus mechanism, and the result is then replicated (distributed) among the nodes on the blockchain network. this research paper proposes a defi crypto exchange that integrates a numerous-signature stamp with a crypto api. a numerous-signature stamp solves the issue of transaction verifiability and authenticity. a crypto api provides the data about each cryptocurrency with which trades and transactions will be performed. this paper also discusses the technical background of the technology and a few related works. decentralization of transactions through smart contracts on the blockchain will improve trust, security and reliability of transactions and trades.
article history: received 25 may 2022; revised 30 may 2022; accepted 10 june 2022; available online 26 june 2022

keywords: trading, cryptocurrency, defi, finance, technology, applications.

1. introduction

a defining feature of cryptocurrencies is that they are generally not issued by any central authority, rendering them theoretically immune to government interference or manipulation. the ghana cedi is one of many currencies backed by the us dollar; however, owing to economic decline, its value has depreciated considerably since its inception. the value of the cedi, like that of any other fiat currency, is positively correlated with people's trust in their government and financial system, so a dramatic dip in that trust could lead to hyperinflation of the cedi. to cite a common example, hyperinflation made zimbabwean dollar notes worth less than the paper they were printed on. toby (2021) states that the african continent, being the most prominent user of mobile cash transactions, has advanced the modernization of its finance system. mobile money transactions boomed globally in 2020, especially in sub-saharan africa, which accounted for 43% of all new accounts, according to the gsm association. more than half of such accounts are in africa, which has been the fastest-growing region for mobile phone growth for several years. however, the security of such a system is flawed in several ways, making it a hunting ground for fraudsters. reliance on just a mobile phone and a sim card, the innovation behind its convenience, is also what enables exploitation.
if sim cards can be easily identified and targeted, there will always be loopholes for scamming people out of their hard-earned money. in ghana, there are at least three major telecommunications companies, each with its own mobile money system. this ensures to a degree that the system is not monopolized by one company, but it leads to another inconvenience: in a region where there are agents of only one mobile money system, say mtn, the users of other networks do not have access to their money. furthermore, transfers of cash between different networks, though not a herculean task for many, are far from seamless and incur extra charges. the complexity of cryptocurrency apps has made cryptocurrency the playground of a select few. even in far more developed countries the average person must be guided in crypto trading because most applications have so many technicalities in their usage. with ghana’s high illiteracy rate (and even higher financial illiteracy rate), ghana is no exception (mukhopadhyay, 2016).

the ghana cedi and mobile money systems are limited in the sense that neither can ensure a secure and fast money transfer to anyone around the world at any time. a cryptocurrency system provides protection against inflation caused by a government’s poor management of fiat currencies. a vast majority of the world is slowly embracing such systems, giving them a market cap of trillions, which is a testament to people's faith in them. this peer-to-peer system removes the need for the involvement of banks and governments. most cryptocurrencies are mined using a brute-force algorithm in such a way that the number of blocks mined per day remains approximately constant; this controls the rate at which new currency is introduced and prevents an excess influx that could lead to inflation.
moreover, cryptocurrencies are secured by cryptography, which makes them nearly impossible to counterfeit or double-spend. every transaction must be verified by thousands of computers around the world. they give the user pseudonymity, which makes them more secure than the mobile money system. they are utilized for cross-border transactions to a limited extent. an excellent illustration of such decentralized transfers is flash loans in decentralized finance. these loans can be executed instantly and are used in trading because they are done without supporting collateral. cryptocurrency is the new wave that the world is riding, an innovative technology with the prospect of dethroning the current finance system, and if care is not taken ghana will be left behind. this project was proposed to facilitate secure crypto transactions and trading.

trading involves buying and selling between a buyer and a seller. the commodities that the two parties can exchange include fiat, stocks, bonds, and cryptocurrency. cryptocurrency is a digital currency or digital asset that is decentralized and not regulated by any bank. the decentralization of cryptocurrencies is based on blockchain technology, a distributed ledger system in which the nodes are linked together in a peer-to-peer (p2p) network. each node in the network holds its own copy of the ledger, which is verified and synchronized with the creation of each new block through a consensus protocol. the consensus protocol eliminates the need for a centralized or trusted entity, such as a bank or government agency (uriawan, 2020). after reaching an all-time high of $3.3 trillion on december 29, 2021, the global crypto market capitalisation was $2.21 trillion (coinmarketcap).
the value of the cryptocurrency market has increased by a factor of six since november 2020, when it was only slightly more than $578 billion. in may 2021, the average daily trade volume reached a peak of $500 billion before levelling out at $120 billion (statista). due to its long history of increasing in value, investors continue to have faith in bitcoin, the first decentralized digital currency underpinned by blockchain technology. with a market cap of nearly $1 trillion, bitcoin held a market dominance of 40% as of december 2021, followed by ether, with a 20% dominance, and other altcoins (coins other than bitcoin) such as solana and xrp. the number of users of blockchain wallets (the instrument crypto owners need to store and manage their crypto) went from 0 to 80 million in the past 10 years (statista). this gives a clear snapshot of the increasing popularity of cryptocurrency as a form of payment (puspitawati, 2021).

the underpinning blockchain technology of cryptocurrencies inherently allows for fast, secure, and tamper-proof recording of transactions. nonetheless, cryptocurrencies are not immune to cybercrime, crypto-theft, and fraud. in 2021, global crypto thefts accounted for a loss of $681 million. the weak link in this system is largely the exchange or trading platform; most such platforms are built on a web 2.0 architecture using centralized protocols. this architecture leaves many exchange platforms vulnerable by design to account hacking and false-transaction scams, much like any other web application (rahayu, 2022).

the purpose of this paper is to discuss the context of smart contracts and how they can be implemented to develop a safer and more secure decentralized platform for trading and exchanging decentralized digital assets like cryptocurrencies, to usher us smoothly into the age of cashless systems and provide a more secure means of exchanging, accessing, and trading cryptocurrencies.

2. related works

this section gives a chronological condensation of cryptocurrencies from their conception to current trends and evolutionary milestones: from the first attempts in the 1980s, through the first token currencies of the 1990s, to blockchain technology and its derivatives.

one of the first attempts at cryptocurrencies occurred in the netherlands in the late 1980s. a group of developers sought to link money to smart cards, designed to counter night-time thefts at petrol stations. drivers would use these cards as a means of transaction instead of cash, leaving no paper money around for thieves. around that time, david chaum, an american cryptographer, experimented with another form of electronic cash. he imagined a token currency which could be sent among people privately and securely. he developed what he called the “blinding formula”, which would be used to encrypt information transferred between individuals. the “blinded cash”, as he called it, could be transferred among individuals holding a signature of authenticity. he went on to establish digicash, where he put his idea into practice. though his company went bankrupt, his concepts and encryption formulas played key roles in the development of subsequent currencies.

start-ups began making efforts to further the goals of digicash in the 1990s. companies like paypal, likely the company with the largest lasting impact on the financial world, were created. individuals could send and receive money over a web browser quickly and securely.
it inspired other startups like e-gold, which attempted to create a platform where precious metals like gold could be traded. e-gold gave individuals online credit in exchange for physical gold and other precious metals. it was eventually shut down due to scams and other issues (digital currency, 2007).

in 1998, wei dai, a computer engineer and graduate of the university of washington, first revealed b-money. it was intended to be an anonymous, distributed electronic cash system. wei described b-money as “a scheme for a group of untraceable digital pseudonyms to pay each other with money and to enforce contracts amongst themselves without outside help”. although b-money was never officially launched, it endeavoured to provide many of the services offered by cryptocurrencies today (buntinx, 2016).

around the same period, nick szabo created bit gold. bit gold came with a proof-of-work system that mirrors the bitcoin mining process in certain ways. proof-of-work is a consensus mechanism used to confirm and record cryptocurrency transactions; it is a means of adding new blocks of transactions to a cryptocurrency blockchain. it involves generating hash codes until one matches the target hash code for the block. the most revolutionary aspect of szabo’s bit gold was its decentralized status: bit gold sought to avoid reliance on centralized authorities. ultimately, bit gold proved as unsuccessful as b-money but gave inspiration for future digital currencies.

hashcash, one of the most successful pre-bitcoin digital currencies, was also developed in the mid-1990s. it was developed to minimize email spam and prevent ddos attacks. hashcash also used a proof-of-work algorithm, which would aid the generation and distribution of new coins as in modern cryptocurrencies.
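the proof-of-work idea described above can be illustrated with a minimal sketch. it is not the scheme used by bit gold, hashcash or bitcoin: it assumes sha-256 as the hash function, and it interprets "matching the target" as finding a nonce whose digest is numerically below a target value. the names mine, verify and difficulty_bits are hypothetical and chosen only for this illustration.

```python
import hashlib


def _digest(block_data: str, nonce: int) -> str:
    """Hash the block contents together with a candidate nonce."""
    return hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()


def mine(block_data: str, difficulty_bits: int = 12):
    """Brute-force the smallest nonce whose digest is below the target.

    Assumption for this sketch: a larger difficulty_bits means a smaller
    target, so more nonces must be tried on average.
    """
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = _digest(block_data, nonce)
        if int(digest, 16) < target:
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty_bits: int = 12) -> bool:
    """Checking a solution costs one hash, while finding it costs many."""
    target = 1 << (256 - difficulty_bits)
    return int(_digest(block_data, nonce), 16) < target


if __name__ == "__main__":
    nonce, digest = mine("a pays b 5")
    print(nonce, digest, verify("a pays b 5", nonce))
```

the asymmetry is the point of the design: the miner has to try many nonces, whereas any other node can check the claimed solution with a single hash.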
just like previous developments, hashcash became less effective due to the increased need for processing power, though most of its elements were used in the development of bitcoin (griffith, 2014).

a blockchain is essentially a network of connected computer systems that duplicates and distributes a digital ledger of transactions. a blockchain divides its data into temporally and cryptographically linked blocks, and it generally acts as a database that only allows reading and appending. the peer-to-peer nature of a blockchain network results from its decentralized architecture: users (peers) communicate with one another directly, without the aid of authorities or other trusted intermediaries. blockchain technology was used to implement bitcoin and other contemporary cryptocurrencies. many blockchain applications have been developed over the years and have revolutionised the way people view digital currencies. commonly cited applications include using digital assets on a blockchain to represent custom currencies and financial assets, the ownership of an underlying physical device, and non-fungible assets such as domain names, as well as more advanced applications such as decentralized exchanges.

another important area of blockchain technology is the use of smart contracts. these are systems which automatically move digital assets according to arbitrary pre-specified rules. for example, one might have a treasury contract of the form "a can withdraw up to x currency units per day, b can withdraw up to y per day, a and b together can withdraw anything, and a can shut off b's ability to withdraw".

when satoshi first established bitcoin in january 2009, he was simultaneously introducing two radical and untested concepts (nakamoto, 2008). the first is bitcoin itself, a decentralized peer-to-peer online currency that maintains a value without any backing, intrinsic value, or central issuer.
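returning briefly to the treasury contract rule quoted earlier in this section, the rule can be modelled in a few lines of ordinary code. the sketch below is an off-chain illustration in python, not a smart contract, and every name in it (treasury, withdraw, disable_b, new_day) is hypothetical. it assumes that a joint withdrawal by a and b remains possible after a has shut off b's individual withdrawals, which the quoted rule leaves open.

```python
from dataclasses import dataclass, field


@dataclass
class Treasury:
    balance: int
    limit_a: int            # x: daily cap for a acting alone
    limit_b: int            # y: daily cap for b acting alone
    b_enabled: bool = True  # a can switch this off
    spent_today: dict = field(default_factory=lambda: {"a": 0, "b": 0})

    def withdraw(self, signers: set, amount: int) -> bool:
        """Apply the pre-specified rules; return True if funds were moved."""
        if amount <= 0 or amount > self.balance:
            return False
        if signers == {"a", "b"}:
            pass  # a and b together: no daily cap (assumption: even if b is shut off)
        elif signers in ({"a"}, {"b"}):
            who = next(iter(signers))
            cap = self.limit_a if who == "a" else self.limit_b
            if who == "b" and not self.b_enabled:
                return False
            if self.spent_today[who] + amount > cap:
                return False
            self.spent_today[who] += amount
        else:
            return False  # unknown signer set
        self.balance -= amount
        return True

    def disable_b(self, caller: str) -> bool:
        """Only a may shut off b's ability to withdraw."""
        if caller != "a":
            return False
        self.b_enabled = False
        return True

    def new_day(self) -> None:
        """Reset the per-day counters."""
        self.spent_today = {"a": 0, "b": 0}
```

on a real blockchain the signer set would be established by verifying transaction signatures and the day boundary would come from block timestamps; both are simplified away here.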
so far, bitcoin as a currency has taken up the majority of public attention, in terms of both the political aspects of a currency without a central bank and its extreme price volatility. however, there is a second, equally important part of satoshi's grand experiment: the concept of a proof-of-work-based blockchain that allows public consensus on transactions. bitcoin can be described as a first-to-file system: if an individual has 60 btc and simultaneously sends the same 60 btc to entity a and to entity b, only the transaction that gets validated first will be processed. many modern cryptocurrencies then started emerging using the concepts behind bitcoin.

in a 2018 paper, jaehyung lee and minhyung cho introduced exeum, a decentralized architecture that issues pegged tokens backed by world assets, including fiat currencies. pegged tokens are over-collateralized by the virtual assets exchanged in the decentralized virtual asset trading provided by the structure, effectively recasting the price stability dilemma as one of maintaining the disparity between the virtual asset exchange and real-world exchanges. the system implemented several mechanisms to motivate market makers and preserve the peg: a rebate for maker orders, a swap rate adjusted based on the demand for the asset in the exchange, and loose protection of the peg by the market maker dapp and the initial reserve. the exeum project aims to democratize market-making activity by enrolling arbitrage miners, who use the market-making software provided by exeum (lee, 2018).

in a 2018 paper, chi ho lam disclosed a system based on a signature scheme to facilitate p2p cross-chain crypto asset exchange. the system provides a universal, secure and direct way for traders to exchange crypto assets across different chains without hassle.
the benefit of this mechanism was that it applied at the signature level instead of the protocol level (lam, 2019).

a 2019 study by stanisław drożdż and his colleagues from the complex systems theory department, institute of nuclear physics, polish academy of sciences, provided strong support for the hypothesis of the gradual development of a novel and partially independent market, analogous to the worldwide foreign exchange (forex) market, in which cryptocurrencies are traded in a free-standing manner. in more practical terms, this meant that not only bitcoin but even the whole emerging crypto market may eventually offer ’a hedge or safe haven’ for currencies, gold, and commodities.

in a 2020 paper, mohd faizal yusof and his colleagues aimed to clarify how to implement a cryptocurrency payment platform comprising a web component that allows end-users to declare the cryptocurrencies they own, a mobile component to support end-users who prefer mobile phones, and a backend system to manage the collection of zakat in cryptocurrencies and integration with the entire system (yusof, 2021).

in their article ‘crypto trading platform’, stefan ciberaj and martin tomášek included a prototype implementation of a proof-of-concept architecture for a bitcoin trading platform. the offered interfaces are an android client with a focus on user experience (ux) and the application programming interface (api) that the client uses, which is also directly accessible to users. the server leverages cloud computing design patterns and is made up of microservices. it has a three-tier architecture with a focus on scalability (fang, 2022).

in another 2021 work, berengoltz and his associates introduced a method and system for crypto-asset transfers.
the method includes sharding a wallet private key so that each shard is given to a different secure module, generating a signature in each secure module based on its respective shard of the sharded wallet private key and the obtained trading platform credentials, and verifying the crypto-asset transaction when a threshold of the generated signatures is found to match (berengoltz, 2021). an issue with a sharding-based strategy is the security concern that arises when a shard is hacked, leading to shard takeovers in which one shard attacks another and information is lost (frankenfield, 2021; presthus, 2017; binns, 2022) (fig. 1).

3. method

3.1. system architecture

fig. 1. system architecture

the system architecture is depicted in fig. 1. a user enters the url of the decentralized trading platform (e.g., slimetrader.com), which displays the user interface (ui). the static files for the ui are retrieved via ipfs (a decentralized off-chain storage solution). when the user initiates a transaction, forms are provided for entering the details of the transaction: the commodity or cryptocurrency to transfer, the recipient address and the amount. on clicking submit, the transaction is signed with the user’s private key by metamask (all write actions must be signed, otherwise the transaction will be rejected by the nodes on the blockchain). providers like metamask offer nodes that allow the user or frontend to connect to and interact with the blockchain. the smart contract contains the business logic that automatically processes the transaction details in the ethereum virtual machine: it checks that the sender's balance is sufficient for the transaction, debits the sender’s account, verifies the recipient’s wallet address, and credits the recipient’s address with the debited amount.
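the business logic just described can be sketched as a small ledger model. this is an illustrative python model under stated assumptions, not the platform's actual smart contract: the ledger class and transfer method are hypothetical names, the address check only tests the usual "0x" plus 40 hexadecimal characters shape of an ethereum address (no checksum validation), and all checks are performed before any balance is changed so that a rejected transfer leaves the ledger untouched.

```python
import re

# Shape of an Ethereum address: "0x" followed by 40 hex characters.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


class Ledger:
    def __init__(self, balances: dict):
        self.balances = dict(balances)

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        """Validate first, then debit the sender and credit the recipient."""
        if not ADDRESS_RE.fullmatch(recipient):
            raise ValueError("invalid recipient address")
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


if __name__ == "__main__":
    alice = "0x" + "a" * 40   # made-up addresses for the example
    bob = "0x" + "b" * 40
    ledger = Ledger({alice: 10})
    ledger.transfer(alice, bob, 4)
    print(ledger.balances)
```

raising an error on any failed check mirrors the way a contract call reverts: either the whole transfer happens or none of it does.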
the nodes verify the transaction via a consensus protocol; once approved, the transaction is hashed into a block on the blockchain (as a result, no third party or central authority is required to provide trust). the data stored in the blockchain is then queried and sent to the frontend to be displayed.

3.2. system block diagram

the cryptocurrency platform, represented as a block diagram in fig. 2, contains four (4) main blocks: the user block, frontend block, provider block and ethereum blockchain block. the transaction procedure is as explained in the subsection titled “system architecture”.

fig. 2. system block diagram

3.3. system workflow

the system workflow is shown in fig. 3. the user loads the application in an internet browser and logs in or signs up using a metamask wallet account. a user without a metamask account is directed to the metamask page to acquire one. on logging in, the dashboard is loaded to show transactions and the user's account holdings. to make a transaction, the user clicks the send button and fills in the displayed form with the details of the transaction, which include the name of the receiver and the amount to be transferred. the user then clicks the submit/make transfer button to initiate the transaction. a confirmation popup is displayed for the user to verify the transaction; if the user confirms, the transaction is made and the user is redirected to the dashboard, where current holdings are displayed. if the user’s wallet is credited, a notification appears as a popup; clicking on the notification directs the user to the dashboard to view the update.

fig. 3. system workflow

3.4. system software design

the system software was designed using a modular method.
the modules or software components are frontend, login, transitions, and backend; they are illustrated graphically in fig. 4 and described in detail in table 1.
fig. 4. system software design diagram
table 1. software design components and description
frontend development
next.js: next.js is a framework built on react, a javascript library designed to be declarative, component-based, and portable. react makes it easier to create interactive user interfaces. once simple views are designed for each state in an application, react efficiently updates and renders just the right components when data changes. since the logic of a component is written in javascript instead of templates, data can easily be passed through the app and state kept out of the dom.
sass: sass is the most mature, stable, and powerful professional css extension language in the world. sass boasts more features and capabilities than any other css extension language, which is why we opted for it in our project.
swiper.js: swiper is a modern mobile touch slider with hardware-accelerated transitions and native-like behaviour. it is intended for use in mobile websites, mobile web apps, and mobile native or hybrid applications. swiper, along with other components, is part of framework7, a fully featured framework for building ios and android apps. this helps a project like digi crypto achieve multi-platform compatibility. swiper comes with a very resourceful api that allows the creation of custom paginations, navigation buttons, parallax effects, and many other options.
log in
metamask: metamask was developed to help make ethereum-based websites secure and usable. it specifically takes care of account administration and of establishing a user's connection to the blockchain. users who already have the metamask extension installed can log in on the landing page by clicking a button. if metamask is not already installed, the user is routed to the official metamask extension download page to install it. metamask injects a global api, at window.ethereum, into the websites its users visit. with this api, websites can request users' ethereum accounts, read data from the associated blockchains, and suggest that users sign messages and transactions. the presence of the provider object indicates an ethereum user.
transitions
framer motion: framer is an interactive design tool for websites and prototyping. its strong points are building complete marketing sites, landing pages, online campaigns, and much more. it covers each step of the design process, from interactive prototypes to graphic mock-ups, but its key advantage is publishing right from the canvas. because a design can be shipped right away, framer is a quick tool for building sites, and all app transitions use it.
type.js: a javascript programming language tool.
coin ranking api: used to integrate cryptocurrency prices into an app or website. it gives access to high-quality data about all coins, such as price history, circulating supplies, exchanges, trading pairs, and much more. the account page is customized to fetch current data specifically about the user’s wallet coins or assets (see fig. 5).
backend development
smart contracts, goerli
fig. 5. display of coin ranking api site
to initiate testing, charts were built with recharts, which is built on top of svg elements with a lightweight dependency on d3 submodules. a chart was customized by tweaking component properties and passing in custom components, and was assembled quickly from decoupled, reusable react components. fig. 6 depicts the system dashboard with the recharts charts.
fig. 6.
display form functionality and validation were facilitated using the formik package, which takes care of the repetitive tasks: keeping track of values, errors, and visited fields, orchestrating validation, and handling submission. by staying within the core react framework and away from magic, formik makes debugging, testing, and reasoning about forms intuitive.
the system software application was deployed to ipfs using fleek, the deployment platform employed in this system. fleek supports continuous deployment: when a change is made and pushed to github, it automatically appears on the deployed site. the fleek interface is displayed in fig. 7.
fig. 7. display of fleek interface
smart contracts were written in solidity to define the logic for performing transactions and storing the data on the blockchain. a contract application binary interface (abi) was generated from the smart contracts. the abi encodes the interface of the smart contract for the ui; that is, it tells the contract-abstraction library (for example, ethers) which functions to provide. finally, a goerli wallet address was created and a goerli faucet was used to deposit virtual ether into the created wallet. goerli is a test net for deploying an application into a sandbox, that is, a test environment, during the development phase of a project. alchemy was used to deploy the smart contract to the goerli test net. the network to deploy the contract to (either the ethereum main net or a test net) and the wallet address to use were defined in the hardhat config file, which allows the conditions for deploying the smart contracts to be defined. lastly, a context processor was defined in the frontend to provide the contract abi and to define other functions for manipulating the wallet.
the processor also monitors wallet-related events performed by the user in the frontend.

4. results and discussion
various tests were run to check that all functionalities worked as expected, that is, that tokens could be sent and received on our decentralised application. the goerli network was used as the basis for intensive tests. this testing was done before any real testing on an ethereum network, since the deployment process is irreversible. the testing procedure was as follows. the goerli test net, through its goerli faucets, provided virtual ethereum tokens to two users signed on to the platform via metamask. the two users, user a and user b, signed into the application via metamask. with the virtual ethereum tokens provided by the goerli faucets, user a sent some tokens to user b’s account. this transaction was made possible by the metamask wallet. the wallet address of user b was provided to user a, who used it to locate user b’s account and send the virtual ethereum tokens. the tokens were successfully received by user b, indicating a successful transaction. the test results were as expected, i.e., the system was able to send and receive tokens seamlessly.

5. conclusion
this paper has established the problems associated with centralised finance applications and successfully demonstrated the process of sending tokens from one ethereum-based wallet to any other ethereum-based wallet using metamask. the project’s aim of developing a completely decentralised platform was achieved, since the use of metamask does not require any central body to make transactions or perform other functions. it was demonstrated that the frontend of the application could also be hosted on a decentralised platform and would allow for the exchange of tokens between wallets that are outside the ethereum network using bridging software.
it would also enable the direct deposit of fiat currencies and would integrate a platform to allow users to trade ethereum-based tokens.

references
berengoltz, p., ofrat, i., & shaulov, m. (2021). u.s. patent application no. 17/172,794.
binns, d. (2022). no free tickets: blockchain and the film industry. m/c journal, 25(2).
buntinx, j. p. (2016). top 4 cryptocurrency projects created before bitcoin.
digital currency business e-gold indicted for money laundering and illegal money transmitting: https://www.justice.gov/archive/opa/pr/2007/april/07_crm_301.html
fang, f., ventre, c., basios, m., kanthan, l., martinez-rego, d., wu, f., & li, l. (2022). cryptocurrency trading: a comprehensive survey. financial innovation, 8(1), 1-59.
frankenfield, j. (2021). bitcoin. investopedia, feb, 18.
griffith, k. (2014). a quick history of cryptocurrencies bbtc-before bitcoin. bitcoin magazine, 16.
lam, c. h. (2019). u.s. patent application no. 16/429,075.
lee, j., & cho, m. (2018). exeum: a decentralized financial platform for price-stable cryptocurrencies. arxiv preprint arxiv:1808.03482.
mukhopadhyay, u., skjellum, a., hambolu, o., oakley, j., yu, l., & brooks, r. (2016, december). a brief survey of cryptocurrency systems. in 2016 14th annual conference on privacy, security and trust (pst) (pp. 745-752). ieee.
nakamoto, s. (2008). bitcoin: a peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf
presthus, w., & o’malley, n. o. (2017). motivations and barriers for end-user adoption of bitcoin as digital currency. procedia computer science, 121, 89-97.
puspitawati, l., & ahmad, a. (2021). information system for forex investment and their effects on investment growth in foreign currencies. international journal of research and applied technology (injuratech), 1(1), 127-133.
rahayu, s. (2022). implementation of blockchain in minimizing tax avoidance of cryptocurrency transaction in indonesia. international journal of research and applied technology (injuratech), 2(1), 30-43.
toby shapshak: mobile money in africa reaches nearly $500bn during pandemic: https://www.forbes.com/sites/tobyshapshak/2021/05/19/mobile-money-in-africa-reaches-nearly-500bn-during-andemic/?sh=1175069b3493
uriawan, w. (2020). swot analysis of lending platform from blockchain technology perspectives. international journal of informatics, information system and computer engineering (injiiscom), 1(1), 103-116.
yusof, m. f., rasid, l. a., & masri, r. (2021). implementation of zakat payment platform for cryptocurrencies. azka international journal of zakat & social finance, 17-31.

219 | international journal of informatics information system and computer engineering 3(2) (2022) 219-230
doi: https://doi.org/10.34010/injiiscom.v3i2.8805 p-issn 2810-0670 e-issn 2775-5584
phishing website detection using several machine learning algorithms: a review paper
alexander veach*, munther abualkibash
school of information security and applied computing, eastern michigan university, ypsilanti, michigan, united states
*corresponding email: aveach1@emich.edu

abstract
phishing is one of the major web social engineering attacks. this has led to demand for a better way to predict and stop such attacks in a commercial environment. this paper seeks to understand the research done in the field and analyse the next steps forward. this is done by focusing on what goes into the selection of proper features, from manual selection to the use of genetic algorithms and boosting methods such as adaboost and multiboost. it then looks into the classifiers in use, with neural networks and ensemble algorithms prominent alongside some novel approaches. this information is then processed into a framework for cloud-based and client-based phishing website detection, alongside suggestions for possible future research and experiments that could help progress the field.
article history: submitted/received 02 aug 2022, first revised 05 sept 2022, accepted 02 oct 2022, available online 20 oct 2022, publication date 01 dec 2022
keywords: artificial intelligence, data science, machine learning, phishing.

1. introduction
phishing has become one of the most prevalent social engineering attacks in the digital environment. from personal accounts to corporate user accounts, all users must be aware of the potential dangers of a phishing attack. this has led to an ongoing battle to prevent phishing attacks by blocking dangerous websites and communications. there are many methods to fight these attacks, with many researchers looking to new advancements in machine learning and artificial intelligence as a potential solution. the method discussed in this paper is detecting phishing websites with machine learning algorithms. unfortunately, the problem lacks a catch-all solution, which has led to multiple different approaches. for example, one solution could suggest designing methods for on-hardware machine learning, which limits the choice of algorithms to simpler versions but allows for mass implementation. other solutions could focus on offloading the classification and model to a third-party service like microsoft azure or amazon web services, which circumvents the limitation on algorithms in exchange for another group of issues. once differences in feature selection, data sources, and much more are included, there is a multitude of potential solutions, with many researchers looking for the most effective one. the purpose of this paper is to look at the potential solutions and outline what the next steps for such research could be.

2. method
to analyze the current popular solutions and implementations of anti-phishing technologies using machine learning and artificial intelligence, a plethora of research was gathered from college repositories and online journal sites such as jstor. the gathered research amounted to 91 papers. these 91 papers were then read and analyzed, taking into account the classifiers and methods used and their differences. once that was complete, the papers relevant to the topic at hand and important for discussion were selected and used; there were 14 of these.

3. results and discussion
3.1. the material used
the application of machine learning against phishing is not a new development, and a multitude of research has been done over the last few years, especially for phishing urls. the following is some of the relevant research from the last few years. sanchez-paniagua et al. (2022) focused on analysing deep learning methods compared to other methods, namely ensemble and genetic selection algorithms.
in their study they found that their model using tf-idf + n-gram outperformed other methods by varying degrees, with the closest performers within 0.5 points of accuracy and the weakest as much as 10 points behind. the researchers also found that “...handcrafted url features decrease their performance over time, up to 10.42% accuracy in the case of the lightgbm algorithm from the year 2016 to 2020. for this reason, machine learning methods should be trained with recent urls to prevent substantial aging from the date of its release” (sanchez-paniagua et al., 2022).
xiao et al. (2020) focused on using a cnn with multi-head self-attention (mhsa) to determine whether links were legitimate or phishing. by using mhsa, the researchers found better accuracy and speed compared to cnn-lstm, with a difference of 0.002 in cnn-mhsa’s favour. for future work, xiao et al. (2020) propose updating the model to take html content into consideration to increase the accuracy further.
a different direction was pursued by suleman and awan (2019), who focused on the use of genetic algorithms (gas) such as “yet another generating genetic algorithm”, or yagga. testing it against other gas, they found an accuracy of 94.99% with an id3 classifier (suleman & awan, 2019).
a related line of work uses boosting: subasi and kremic (2020) compared adaboost and multiboosting for classifying phishing websites. the researchers found a high accuracy of 97.61% using an svm classifier with adaboost; however, the cost of that accuracy is that svm with adaboost reported a complexity, in seconds, of “8193.72” (subasi & kremic, 2020). another study comes from alsariera et al.
(2020), who focused on their forest penalizing attributes algorithm, which uses weights to de-emphasize inconsequential variables. the team then compared the results with meta-learning variants of the algorithm, specifically a bagging method and adaboost. they found that the adaboosted forest penalizing attributes algorithm had an accuracy of 97%, beating the 96.26% of the base classifier and the 96.58% of the bagged variant, and reported that “...false alarm notifications are next to zero” (alsariera et al., 2020).
a more distinctive approach is that of chen et al. (2020), who focused on the visual similarity of websites to determine whether a site is a phishing website. the approach uses wavelet hashing and the scale-invariant feature transform to determine similarity. the researchers found some success when using microsoft, dropbox, and bank of america as comparison points, with accuracy results of 98.14%, 98.61% and 99.95% respectively (chen et al., 2020).
another distinctive approach is that of ali and malebary (2020), who used particle swarm optimization (pso) to improve the detection of phishing websites. using the high-speed pso model, the team proposes feature weighting that works in much the same way as a genetic algorithm. compared with ga-based selection and weighting, the team found that “...pso-based feature weighting omitted between 7%-57% of irrelevant features” and that classifiers using their method “...outperformed these machine learning models with applying ig, chi-square, wrapper, ga-based features selection, and ga-based features weighting” (ali & malebary, 2020).
another approach takes the visual analysis of websites and combines it with a neural network classifier. this is what abdelnabi et al. (2020) proposed, using a triplet network to compare websites with popular websites listed on alexa. by using the ensemble method with neural networks, they outline a potential future path for website matching (abdelnabi et al., 2020).
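the hash-based comparison behind the visual similarity approaches can be illustrated with a small sketch. it deliberately swaps in a much simpler technique, an average hash over a grayscale pixel grid compared by hamming distance, and is not the wavelet hashing or scale-invariant feature transform used by chen et al. (2020). the function names and the representation of a screenshot as a nested list of brightness values are assumptions made for illustration.

```python
def average_hash(pixels):
    """Return a bit list: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def similarity(pixels_a, pixels_b):
    """Fraction of matching hash bits, between 0.0 and 1.0."""
    hash_a = average_hash(pixels_a)
    hash_b = average_hash(pixels_b)
    return 1 - hamming_distance(hash_a, hash_b) / len(hash_a)
```

a page whose screenshot scores close to 1.0 against a well-known brand while being served from an unrelated domain would be a candidate for closer inspection; both images must first be scaled to the same grid size.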
assefa and katarya (2022) analysed other deep learning methods and their results and compared them with an autoencoder, a form of unsupervised neural network. in their report they noted various limitations in other studies, such as non-comprehensive reporting, and compared those studies' achievements with the autoencoder method. they found that the autoencoder had an accuracy of 91.24% and that its performance could be improved with better data mining techniques (assefa & katarya, 2022).
mandadi et al. (2022) focused on finding the most important features, distinguishing three types: domain-based, html and javascript-based, and address bar-based features. in total, 17 features across these three categories were considered. they then tested the features with random forest and decision tree classifiers, which gave accuracies of 87.0% and 82.4% respectively (mandadi et al., 2022).
saravanan and subramanian (2020) used ga feature selection alongside an artmap supervised neural network. artmap is made up of “a pair of self-organized adaptive resonance theory (art) modules arta and artb. these two modules are interconnected by an inter-art self-associative memory and an internal controller, whose objective is to maximize the predictive generalization and to minimize the predictive error. each art module is associated with f1 and f2 layers which act as a short term memory and a long term memory for category selection” (2020). this model also uses the firefly algorithm to determine which features are useful. the study found their own algorithm to be the best performing in all performance measures except detection time, where svm performed better (saravanan & subramanian, 2020). mourtaji et al.
(2017) also outline which features they believe are best suited for detection. they describe five groups of features, including a lexical-based analytics method, an abnormal-based feature, a content-based analytics method, and an identity-based method. alongside these features they suggest a blacklist function. they used a linear regression classifier and reported an accuracy of 95.5% with a false positive rate of 1.4% (mourtaji et al., 2017).
zhou and zhang (2022) propose a dual-weight random forest algorithm that is “based on the combination of feature weight and decision tree weight”. the proposed classifier was tested against random forest, random forest with decision tree weight, and dynamic random forest, and had the highest accuracy at 94.93%, 2.22 points higher than the next highest, dynamic random forest, at 92.71% (zhou & zhang, 2022).

3.2. analysis
phishing is one of the most dangerous and effective online fraud methods in existence today. this concern has led to the search for a so-called “silver bullet” that would protect potentially affected parties from phishing attacks. many have looked towards machine learning and artificial intelligence to create an application that would detect threats and adapt to them to create the ultimate defense. however, there are many parts to consider, including which classifier should be used for training, which attributes should be weighed to determine a threat, and which dataset is best for training the model.
the first major question is which kind of data such a model should be trained on: should it be url focused, based upon the content of the website itself, or based on the website's meta content obtained with tools such as whois? url-based analysis is simple to implement and fast to process, but lacks other information from the website, which can decrease accuracy.
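to make url-based analysis concrete, the following sketch turns urls into character n-gram tf-idf vectors, the kind of representation named in the tf-idf + n-gram model of sanchez-paniagua et al. (2022). it is a plain textbook formulation using only the standard library; the n-gram size of 3, the unsmoothed idf, and the function names are assumptions and are not taken from that study.

```python
import math
from collections import Counter


def char_ngrams(url, n=3):
    """Split a URL into overlapping character n-grams."""
    return [url[i:i + n] for i in range(len(url) - n + 1)]


def tfidf_vectors(urls, n=3):
    """Return one {ngram: weight} dictionary per URL."""
    docs = [Counter(char_ngrams(url, n)) for url in urls]
    # Document frequency: in how many URLs each n-gram occurs.
    doc_freq = Counter(gram for doc in docs for gram in doc)
    total = len(docs)
    vectors = []
    for doc in docs:
        length = sum(doc.values())
        vectors.append({
            # Term frequency times unsmoothed inverse document frequency.
            gram: (count / length) * math.log(total / doc_freq[gram])
            for gram, count in doc.items()
        })
    return vectors
```

with this weighting, an n-gram that occurs in every url (for example "htt") gets a weight of zero, while fragments that are rare across the collection stand out; the resulting vectors would then be passed to whichever classifier is being evaluated.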
by contrast, analyzing the content of the web page alongside the url takes more time to execute but yields more accurate results. some even suggest image recognition models, such as chen et al. (2020) with their visual similarity model. when it comes to weighing features, some papers suggest using algorithms such as adaboost, multiboost, or genetic algorithms to predict which attributes lend themselves to correct identification, such as suleman and awan (2019), subasi and kremic (2020), and alsariera et al. (2020). by using these machine-proven attributes, many hope to increase the efficiency of the classification algorithms used. subasi and kremic (2020) noted that “adaboost achieved the superior classification accuracy, with svm 97.61%”, which beat their most accurate single classifier, random forest, which achieved “an accuracy of 97.26%”. another study, by sanchez-paniagua et al. (2022), reported that, when testing trained models on data from 2016, 2017 and 2020, “...all models struggled to endure over time and their performance decreased when tested on the following years’ dataset” (sanchez-paniagua et al., 2022). this shows the importance of a continually updated classification scheme.
there are many proposed answers to the question of which classifier to use, with two of the most common being neural network classifiers and random forest classification. random forest was the classifier of choice for many researchers in the studies surveyed. zhou and zhang (2022) used a modified version of random forest, the dual-weight random forest, and reported an accuracy of 94.93% when using k-means clustering for feature selection.
in studies that found other methods to be more effective, such as sanchez-paniagua et al. (2022), the difference was only 0.20 points of accuracy relative to lightgbm, at 94.67 (sanchez-paniagua et al., 2022). however, some report lower accuracy, such as mandadi et al. (2022), who reported an accuracy of 82.4% with 17 features using a phishtank dataset. this variance could be attributed to differences in feature selection and in the contents of the datasets used.
another common solution is the use of neural network classifiers such as cnn, lstm, gnn and many others. neural network classification is recommended about as often as random forest, with many studies finding high accuracy when predicting malicious phishing urls. as mentioned in the prior section, sanchez-paniagua et al. found that lightgbm had the highest tested accuracy of the classifiers used with static feature selection on the piu-60k dataset (sanchez-paniagua et al., 2022). other studies have reported similar results with neural networks, specifically those with deep learning capabilities. xiao et al. (2020) applied multi-head self-attention, or mhsa, to a convolutional neural network and found an accuracy of 0.9834, or 98.34 percent. the study proposed further ways to raise that number, with their main concern being to “decrease the input of [url’s length parameter]” (xiao et al., 2020).
novel applications of these classifiers are also well researched, with a common focus on visual detection of pages that too closely resemble other pages, as seen in the work of abdelnabi et al. (2020). they proposed a model that uses three convolutional networks to decide whether a page is phishing based on its similarity to other major pages collected from alexa. another distinctive approach is that of ali and malebary (2020), who propose a model based on particle swarm optimization feature weighting, which reportedly outperformed other weighting algorithms.
as with most topics, there is no single silver bullet for predicting whether a website is malicious. phishing methods change regularly to whatever is most effective at the time, which has led to a never-ending effort to prevent such attacks. this has led to a focus on using genetic algorithms or other methods to create a curated list of features. as noted by sanchez-paniagua et al., “compared to machine learning algorithms, both cnn models obtained better results than handcrafted features” (sanchez-paniagua et al., 2022). by using deep learning models, a higher level of accuracy can be maintained, at the price of more demanding requirements. neural network classifiers by design develop a much richer identification method, layering information in a way that imitates human neurons, which requires more processing power than simple classifiers such as a decision tree. these methods, when properly trained, can generate extremely accurate results. however, this is in and of itself a much more costly method, requiring a higher level of processing power and commonly using high-end graphics cards designed for that explicit purpose, such as the nvidia titan v.
on the other end of the spectrum are random forest and other ensemble classifiers, which instead rely on a series of classification tests to assure accuracy. thanks to this, ensemble classifiers require less processing power and have a better success rate when less data is provided. however, random forest lacks the potential depth of learning that deep learning neural networks can provide and does not adapt well to changes over time, as reported by sanchez-paniagua et al. (2022).
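the aging effect reported by sanchez-paniagua et al. (2022) suggests monitoring a deployed model on samples grouped by collection year. the sketch below shows one simple way to do so; the sample format `(year, url, is_phishing)`, the `predict` callback, and the 5-point retraining threshold are hypothetical choices made for illustration and are not taken from the surveyed studies.

```python
from collections import defaultdict


def accuracy_by_year(samples, predict):
    """samples: iterable of (year, url, is_phishing); predict: url -> bool."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for year, url, is_phishing in samples:
        total[year] += 1
        if predict(url) == is_phishing:
            correct[year] += 1
    return {year: correct[year] / total[year] for year in sorted(total)}


def needs_retraining(yearly_accuracy, max_drop=0.05):
    """Flag the model when a later year falls more than max_drop below the first year."""
    years = sorted(yearly_accuracy)
    baseline = yearly_accuracy[years[0]]
    return any(baseline - yearly_accuracy[year] > max_drop for year in years[1:])
```

a model trained on the earliest year would be evaluated on each later year in turn, and a flagged drop would trigger retraining on more recent urls.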
there are two further schools of thought when it comes to implementation: whether the software should run on the hardware it is installed on, or whether the processing should be offloaded to virtualized infrastructure in the cloud. both have their benefits and drawbacks. offloading the processing works better for mobile phones and other low-powered devices. however, this makes the service dependent on a stable connection and on consistent availability of the cloud service, which in turn requires an infrastructure that can support such needs. running on the physical machine itself, in contrast, limits the potential design of the model, as it must either be customized to each device or be designed to work with most devices, sacrificing customization. the benefit would be reliability, as the approach would only require the already trained model and the processing power of the device executing it. this would limit potential downtime and other server connectivity issues, but could cost more in the long run for businesses implementing this method. another issue would be retraining the models in a reasonable way to adapt to changes in phishing techniques; sanchez-paniagua et al. (2022) found as much as a 10% decrease in accuracy as malicious phishing links change.
the next most common solution was custom classifiers, unique analysis methods, or other similar methods, which made up nineteen of the ninety-one papers analyzed. these solutions focused on designing custom classifiers that would parse the target information, with claims that the unique solution was more effective than other common solutions. these classifiers are often similar to ensemble methods, which combine classifiers in a multilayered approach.
however, some are amalgamations designed to work as a single classifier instead of the multilevel classification that ensemble methods normally use, which is why they have their own category. some of these solutions claim a success rate of around 98 percent when tested, while others claim much lower results. for example, saravanan and subramanian (2020) combined a genetic algorithm to select important features with artmap, a neural network classifier based upon the firefly algorithm.
deep learning methods formed another group, also used in nineteen studies. this group is made up of classifiers such as cnn, dnn, gnn and their derivatives. these methods were used specifically to design evolving models that could potentially detect new attacks and adapt quickly. an issue with these studies is the resource-intensive nature of deep learning methods, which leaves only two options for implementation: require all hardware to meet the specification, or offload the ai to a cloud-based solution. by requiring a dedicated gpu, any company wishing to adopt the method will face a steep entry cost, which will be a barrier to general adoption, especially for major companies with tens of thousands of workers. the same is true for a cloud-based solution, as any corporation that wishes to adopt such a method will undoubtedly pay fees for its usage.
something that was noticed in many of the reports is a lack of standardization in reporting the information gained from experimentation. several papers reported only the accuracy without any of the other data points, leaving the reader to extrapolate how they reached that conclusion.
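the data points that accuracy alone hides can all be derived from the four confusion-matrix counts. the following sketch uses the standard definitions of accuracy, precision, recall, f1 and false positive rate; the function name and the dictionary layout are illustrative choices.

```python
def classification_report(tp, fp, tn, fn):
    """Derive summary metrics from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }
```

two models with the same accuracy can differ sharply in false positive rate, which matters for a tool that interrupts users whenever it raises a warning.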
this lack of standardized reporting has been noted in other papers such as “intelligent phishing website detection using deep learning”, where assefa and katarya (2022) note that 3 of the papers they analyzed either failed to provide enough details or reported results that were “not comprehensive”. the issue compounds as a sizable group of papers leave out important information, such as how they created their private dataset and other key details needed to replicate their findings. this information is critical for understanding how efficient each method is. it can be remedied by a standard for reporting the results of ai/ml for phishing detection that specifies which results are included in studies. this standard should require: a) the explicit location and name of the dataset used, b) the algorithm used, c) explicit instructions on how the model was trained, d) an in-depth breakdown of false positives and negatives and true positives and negatives, and e) an analysis of execution speed.

going forward, there appear to be two paths when it comes to designing a defensive tool against fraudulent websites. the first approach focuses on designing a client-focused service that runs a classifier on the hardware provided. the second approach focuses on designing a cloud-based solution, called through an api, to offload the compute-intensive work. both approaches have their own benefits and drawbacks, which are discussed in greater detail in the following sections, but either is a good first step for advancing anti-phishing measures.

3.3. example of a client-based solution

the following is a proposed framework for a client-based solution for an anti-phishing extension.
the solution should be built in a browser-native language, such as javascript, using available machine learning libraries such as tensorflow. when a website is accessed, the extension will check a maintained whitelist of commonly used and trusted websites, such as search engines and online office tools. if the website is not on the whitelist, the extension will harvest the data needed for classification by the model used. for this example it is assumed that an ensemble classifier such as random forest will be used. the classifier will account for multiple features, including domain information, the url, and the content of the website itself. a suitable feature set is similar to the one suggested by mandadi et al. (2022), which lists dns record, website traffic, age of domain, end period of domain, iframe redirection, status bar customization, disabling right click, website forwarding, domain, ip address, “@” symbol, length, depth, redirection “//”, “http/https” in domain name, use of url shortening services such as “tiny url”, and prefix or suffix “-” in domain. the extension should ship with a pre-built model based upon these features, with updates to reflect trends in current phishing websites. while the website is being classified, the extension should show an interim page that updates when classification is done, either sending the user to the website or informing the user of the detected security risk.

this model is comparatively easy to implement and can theoretically be run on most modern workstations. it can also be updated when performance drops due to changing trends in phishing, to counteract the loss in accuracy; however, doing so would require a dedicated team to continuously watch current trends in phishing websites. another weakness of this model is the potential for false positives and other accuracy issues, which would slow down the average user’s browsing.
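the url-based part of the feature set above can be sketched as follows. this is a minimal illustration in python, although the extension itself is proposed in javascript; it covers only features that can be read from the url string, and the whitelist and shortener lists are hypothetical placeholders, as are the function names `needs_classification` and `url_features`.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical lists for illustration only; a real extension would maintain these.
TRUSTED_HOSTS = {"www.google.com", "www.bing.com"}
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co"}


def needs_classification(url: str) -> bool:
    """Return False when the host is on the whitelist, True otherwise."""
    host = (urlparse(url).hostname or "").lower()
    return host not in TRUSTED_HOSTS


def url_features(url: str) -> dict:
    """Extract the URL-only features named in the feature list."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    try:
        ipaddress.ip_address(host)
        has_ip = True
    except ValueError:
        has_ip = False
    # Depth: number of non-empty path segments.
    depth = len([seg for seg in parsed.path.split("/") if seg])
    # A "//" after the scheme separator suggests an embedded redirect.
    scheme_end = url.find("://")
    rest = url[scheme_end + 3:] if scheme_end != -1 else url
    return {
        "has_ip_address": has_ip,
        "has_at_symbol": "@" in url,
        "url_length": len(url),
        "url_depth": depth,
        "double_slash_redirect": "//" in rest,
        "http_token_in_domain": "http" in host,
        "uses_shortener": host in SHORTENERS,
        "hyphen_in_domain": "-" in host,
    }


sample_url = "http://secure-login.example.com/a/b/c"
if needs_classification(sample_url):
    print(url_features(sample_url))
```

the resulting dictionary would be one input row for the classifier; the dns, traffic, and page-content features would have to be gathered separately.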
the proposed model will also need to determine rapidly whether the link is safe or unsafe, or else it will earn the ire of the end user. these factors would need to be mitigated for a commercial implementation, either by optimizing the classification process, by designing ways to run the classification without the user noticing, or by other similar ideas (see fig. 1).

fig. 1 a diagram of a simple client-based anti-phishing solution

3.4. example of a cloud-based solution

where the prior solution is relatively simple to implement, the following is much more difficult due to the necessity of powerful computational processes, which are hosted on either a public cloud service or a private cloud. this version would use a deep learning method, such as cnn-lstm, trained using information from repositories such as phishtank. the classifier should be guided to look at meta information, website content, and the website url itself. the trained model will then act upon information sent to it from client devices and determine whether the site is a phishing website or a legitimate website. the model will then add that information to the next training set, continuously updating the dataset so that it evolves to counter new methods of phishing as they appear, as suggested by sanchez-paniagua et al. (2022).

this model, while simple to outline, is difficult to execute for practical use. for deep learning to be effective, data needs to be consistently fed to the model for it to stay up to date. supporting this infrastructure would cost a lot of money or resources, alongside the customization needed to optimize the classification processes.
ignoring those issues, another issue is ensuring uptime for those dependent on the software. the cloud-focused model requires consistent back and forth between all users and the classification service at all times for effective use, which will also require a lot of resources to implement. once the model is properly trained and maintained, however, it has the potential for a higher accuracy than its ensemble-based counterpart above. in the deep learning studies surveyed for this paper, most reported an accuracy of 97% or more, exceeding the next highest classifier on average, which was often the random forest algorithm. therefore, there is potential for cloud-based anti-phishing techniques powered by machine learning and artificial intelligence, but the resource cost will limit effective implementation without serious capital investment (see fig. 2).

fig. 2 a diagram of a simple cloud-based anti-phishing solution

3.5. future work

a prudent first step would be to standardize the reporting of machine learning and artificial intelligence studies. currently there is no codified standard for reporting their results. some studies contain everything needed to replicate the experiments performed and to see how the conclusion was drawn; other studies leave out details needed for conclusive analysis or replication. mourtaji and bouhorma (2017), for example, outline their own framework and show results from it without supplying the dataset used in testing, which they claim to have populated from phishtank and alexa. providing the dataset used in testing to an online repository for verification allows doubt to be cleared and will be of great assistance to other researchers in the field.

by focusing on standards that ensure easy replication of results and clarity within the information reported, other researchers will be able to build on the research and develop new technologies. therefore, we would like to suggest a framework that includes these specifications for all reported testing: a repository containing the training dataset and testing dataset used, the features selected for classification, the classifier used alongside documentation of how to implement custom classifiers, the true positives and negatives alongside the false positives and negatives from the resulting validation tests, precision, recall, accuracy, and f1 score. alongside this information there should be enough instruction for the reader to validate the paper by replicating the experiment within. including this information ensures reliable replication, which makes the work easier to build upon and thus helps the proliferation of information.

on a more practical level, the next step should be creating working models and testing them in live environments. making a model, client or cloud based, will allow researchers to see the practical shortcomings of these methods and correct them. once the shortcomings are known, more development can take place, evolving the field and helping to combat one of the most common threats on the internet.

4. conclusion

phishing is one of the most common threats to cybersecurity in the current world. many organizations have become acutely aware of the potential danger of a successful attack. this has led to an increased focus on developing new technologies to prevent such attacks from taking place.
by using machine learning and artificial intelligence, many posit a learning defensive system that can prevent website phishing attacks and reduce potential vectors for attack. currently there is no cure-all, with many papers acknowledging that the ever-changing nature of website-based phishing attacks prevents a permanent solution. however, a well-automated system could go a long way toward preventing website-based phishing attacks and could be a useful solution for major organizations. most studies suggest that a web extension for modern web browsers such as google chrome is where companies should look for future developments. developing a working model for testing in live environments would advance the field by showing what potential shortcomings exist.

finally, there is a lack of standardization in the reporting of data across the multitude of studies focusing on the topic. to better advance the implementation of anti-phishing ml/ai in working prototypes, a standard of reporting would make it easier to gather information. always including the dataset used, the algorithm used, the instructions for training the model, a breakdown of the training and testing results, and a record of the time taken to execute a task would allow information to be disseminated and processed faster, which in turn could assist in the development of such anti-phishing technologies.

references

abdelnabi, s., krombholz, k., & fritz, m. (2020, october). visualphishnet: zero-day phishing website detection by visual similarity. in proceedings of the 2020 acm sigsac conference on computer and communications security (pp. 1681-1698).
ali, w., & malebary, s. (2020). particle swarm optimization-based feature weighting for improving intelligent phishing website detection. ieee access, 8, 116766-116780.
alsariera, y. a., elijah, a. v., & balogun, a. o. (2020). phishing website detection: forest by penalizing attributes algorithm and its enhanced variations. arabian journal for science and engineering, 45(12), 10459-10470.
assefa, a., & katarya, r. (2022, march). intelligent phishing website detection using deep learning. in 2022 8th international conference on advanced computing and communication systems (icaccs) (vol. 1, pp. 1741-1745). ieee.
chen, j. l., ma, y. w., & huang, k. l. (2020). intelligent visual similarity-based phishing websites detection. symmetry, 12(10), 1681.
mandadi, a., boppana, s., ravella, v., & kavitha, r. (2022, april). phishing website detection using machine learning. in 2022 ieee 7th international conference for convergence in technology (i2ct) (pp. 1-4). ieee.
mourtaji, y., & bouhorma, m. (2017, october). perception of a new framework for detecting phishing web pages. in proceedings of the mediterranean symposium on smart city application (pp. 1-6).
sánchez-paniagua, m., fernández, e. f., alegre, e., al-nabki, w., & gonzález-castro, v. (2022). phishing url detection: a real-case scenario through login urls. ieee access, 10, 42949-42960.
saravanan, p., & subramanian, s. (2020). a framework for detecting phishing websites using ga based feature selection and artmap based website classification. procedia computer science, 171, 1083-1092.
subasi, a., & kremic, e. (2020). comparison of adaboost with multiboosting for phishing website detection. procedia computer science, 168, 272-278.
suleman, m. t., & awan, s. m. (2019). optimization of url-based phishing websites detection through genetic algorithms. automatic control and computer sciences, 53(4), 333-341.
zhou, j., liu, y., xia, j., wang, z., & arik, s. (2020). resilient fault-tolerant anti-synchronization for stochastic delayed reaction-diffusion neural networks with semi-markov jump parameters. neural networks, 125, 194-204.
zhou, z., & zhang, c. (2022, may). phishing website identification based on double weight random forest. in 2022 3rd international conference on computer vision, image and deep learning & international conference on computer engineering and applications (cvidl & iccea) (pp. 263-266). ieee.

hammed olalekan bolaji and tajudeen oluwafemi bolaji. information and communication …| 50

information and communication technology utilization for managerial planning among educational administrators in secondary schools in ilorin metropolis

hammed olalekan bolaji1, tajudeen oluwafemi bolaji2*

*1department of science education, faculty of education, al-hikmah university, ilorin-nigeria. 2department of educational management & counselling, faculty of education, al-hikmah

*corresponding email: alikhlasschools@gmail.com

abstract

this study examines how secondary school administrators in the ilorin metropolis use ict for managerial planning. the principal must effectively and efficiently organize, manage, and supervise the school's business for it to run successfully. the study uses a survey-style descriptive design to determine how ict affects principals' administrative performance. the population of this study consisted of 292 participants from all 75 public secondary schools in the ilorin metropolis. the instrument for data collection was a structured questionnaire titled "information communication technology and principal's administrative effectiveness questionnaire" (ictpaeq), for which a reliability coefficient value of 0.85 was obtained.
the data gathered from the field was evaluated using pertinent descriptive statistics such as percentages, mean, and standard deviation, while research question 4 was addressed using pearson product-moment correlation (ppmc) statistics. this finding may be explained by the fact that most schools lacked these ict resources. the findings revealed that secondary school principals do better administratively the more they use ict resources. this further implies that secondary school management effectiveness may be in danger in the absence of ict, and that ict is a crucial factor in the efficient administration of secondary schools. based on the findings of this study, recommendations were made for secondary school administrators to become more proficient in the use of ict technologies and for enhanced training for school administrators who are not yet ict proficient.

article history: received 18 dec 2022, revised 20 dec 2022, accepted 25 dec 2022, available online 26 dec 2022

keywords: ict, school administrator, administrative performance.

international journal of informatics, information system and computer engineering

international journal of informatics information system and computer engineering 3(2) (2022) 50-64

1. introduction

it is now generally acknowledged that information and communication technology (ict), which is driving globalization, deregulation, and innovation, is the main factor influencing the economic landscape. ict now serves as the foundation for economic growth at both the macro and micro levels; individuals who choose not to take part in such advances therefore run the risk of becoming marginalized. the use of ict and the management of knowledge have advanced learning across a variety of media, which has improved human understanding in the 21st century. ict is therefore a crucial component in maintaining managerial planning, teaching, and learning. more importantly, education should be highly valued because it can revolutionize any culture. it is the process of acquiring the information and abilities needed to maintain growth in all areas and sectors of life, for individuals, groups, and organizations.

according to egwu (2016), a principal must effectively and efficiently organize, manage, and supervise the school's business for it to run successfully. as the head administrator of the secondary school, the principal is responsible for making optimal use of all available resources, including ict, to achieve school objectives. leaders at all levels must ensure that people and material resources are managed optimally for the educational system to accomplish its national plans and goals (nwune et al., 2016). nkwoh (2011) said that to effectively guide schools toward the achievement of educational goals, school principals must have a broad range of talents and be competent. according to carol and edward (2004), competency is the ability to complete a task successfully using knowledge, skills, attitude, and judgment; it is the capacity and expertise needed to complete a specific task. the ability to successfully manage resources for productivity is referred to as managerial competency. the management of instructional programs, staff and student personnel administration, financial and physical resource management, and community relationship management are among the duties of school administrators as outlined by heller (2012). the formulation of school action plans and the viability of the institution depend heavily on the efficient management of both human and material resources. therefore, teachers must have administrative abilities and demonstrate the ability to use and integrate technology effectively (teklemariam, 2009). the global revolution brought about by icts in teaching, learning, innovation, and institutional management must be embraced immediately (new media consortium, 2007). if the education industry is to fulfil its objectives and compete effectively on the global stage, school managers must play a crucial role in the usage and integration of ict. ict adoption in schools will not be successful unless the principal fully supports it, acquires the necessary knowledge, and offers staff professional development and encouragement during the transition (wilmore, 2000).

1.1. statement of the problem

stakeholders in the ilorin metropolis have expressed worry over the management situation in secondary schools. this worry has focused on school principals' failure to successfully perform their administrative tasks as a result of the significant growth in student enrollment, class size, and pupil and teacher data, which is typical of nigerian secondary schools (okon et al., 2015). administrative duties in schools would inevitably have to be processed using advanced tools and infrastructure, like ict. the more important concern is whether there are adequate information and communication technology (ict) facilities for effective secondary school administration and, if there are, how well they are being used. this is not unrelated to the government of nigeria's inadequate funding for the education sector, secondary school administrators' inability to keep up with the rate of ict development, the lack of personnel with the necessary skills to manage ict at all levels of administration
(muchiri, 2014), the ineffective institutional policies that are meant to encourage and regulate the use of ict, and the intermittent or nonexistent electrical supply in schools. observations made in our secondary schools indicated that this deficiency can have a significant impact on the managerial efficiency of secondary school principals. this raises the question of how much the availability and use of ict facilities affect the administrative effectiveness of public secondary school principals in the ilorin metropolis.

1.2. purpose of the study

this study's primary goal is to examine how secondary school administrators in the ilorin metropolis use ict for managerial planning. in particular, the study sought to:
1. assess the extent to which secondary schools in the ilorin metropolis have ict facilities.
2. examine how ict is used by secondary school principals in the ilorin metropolis for administrative purposes.
3. determine the degree of performance of secondary school principals in carrying out their regular administrative duties in the ilorin metropolis.
4. analyze the association between secondary school principals' administrative effectiveness and the use of ict facilities in the ilorin metropolis.

1.3. research questions

the following research questions were raised:
1. what degree of ict infrastructure is present in secondary schools in the ilorin metropolis?
2. what percentage of secondary school principals in the ilorin metropolis use ict resources for administrative tasks?
3. what level of efficiency do the ilorin metropolis' secondary school principals exhibit when carrying out their regular administrative duties?
4. does the use of ict facilities in the ilorin metropolis significantly affect the administrative effectiveness of secondary school principals?

2. literature review

conceptualization of information and communication technology

according to reports, the term information and communications technology (ict) was first used around the beginning of the 1990s to refer to computers' capabilities for communication, replacing the term information technology (it). information and communication technology (ict) is the fusion of communication, information, and technologies that are based on common digital technology, according to egoeze, misra, akman, and colomo-palacios (2014). ict is thus defined as an electronic instrument, equipment, or device that is used to gather, process, store, retrieve, or transfer information. according to salisu (2014), information and communication technologies (icts) include telecom networks as well as related hardware, software, and services (salisu, 2014; shah, 2014).

ict has been divided into four (4) different categories: communication technologies, network technologies, computer technologies, and mobile technologies. all forms of media used for transferring voice, video, data, or multimedia are included in communication technologies. personal area networks (pan), campus area networks (can), intranets, extranets, local area networks (lans), wide area networks (wans), and the internet are examples of network technology. computer technologies include removable storage devices such as optical discs, disks, and flash memory, as well as video books, multimedia projectors, interactive electronic displays, and personal computers (pcs), which are constantly being developed. mobile technologies, which e-learning also makes use of, include mobile phones, personal digital assistants (pdas), palmtops, and similar devices that physically carry information. ict was further divided into five groups by brinda et al. (2016) and ereh et al. (2012), including computing facilities and services; film/tape-based facilities, such as slide projectors, microfiche readers, micro card readers, and microprint readers; reproduction facilities, such as photocopying and duplicating machines; and telecommunication facilities.

the degree to which a secondary school can offer all the ict resources required is a reflection of the prestige of the institution. ict use in nigerian education is hindered by several variables. these include a lack of finance to support the technology's acquisition, a lack of training, a lack of staff enthusiasm to utilize icts as teaching tools in the classroom, and more (oluwalola, 2017; oyedeji, 2015).

arikewuyo (2009) added that the principal's duties include, but are not limited to: managing and allocating school resources effectively; allocating school space effectively; ensuring satisfactory standards of upkeep and cleanliness of school facilities; planning staff development; and directing curriculum change and implementation. according to adeniyi and omoteso (2014), in carrying out these duties, principals must show they are capable of leading by showcasing their professional expertise, organizational and managerial skills, and capacity to develop and implement sound school policies (see also adesina, 2015). administration is an international notion of utmost significance. adeniyi and omoteso (2014) define administration as the process of coordinating people and resources within an organization to achieve predetermined goals. additionally, the term "administration" refers to tasks like carrying out organizational objectives and decisions as well as doing in-depth research on these tasks. performance is a difficult concept to define and comprehend.
performance is frequently used as a synonym for other words, including effectiveness, efficiency, productivity, and competency. according to okon et al. (2011), performance is the degree of accomplishment given the time, resources, and circumstances. as a result, performance refers to a principal's execution of statutory tasks or functions that are directed at achieving goals established by the school. accordingly, achieving daily management goals that are both feasible and of the utmost importance to the school constitutes efficient administration. a proactive secondary school principal's duties therefore include organizing the available material and human resources and applying them consistently to the fulfilment of educational goals (national open university nigeria, 2014). the principal's administrative performance is the methodical coordination of human, material, and financial resources for the resolution of ongoing administrative issues in order to realize the planned objectives of secondary education.

studies on ict and principal’s administrative performance

in recent years, several surveys have been developed to gather data on the degree to which schools are building their capacity to integrate ict into learning, teaching, and management operations. oboegbulem and ugwu (2013) examined 30 schools in the southeastern states of abia, anambra, ebonyi, enugu, and imo that had access to ict and the internet. the results demonstrated, among other things, the need to use ict in school administration, especially during the current period of globalization. however, due to school administrators' inability to manage ict equipment in schools properly, secondary schools use ict only to a very limited extent.
an ex-post facto study by okon et al. (2015) among 255 principals from 85 public secondary schools in akwa ibom state, nigeria, found a substantial correlation between the use of ict for record-keeping and communication and the administrative performance of principals. results from a related study conducted by oyedemi (2015) on the opinions of 120 administrators regarding the use of ict for efficient school management in the ilesa local government area of osun state showed that school principals have a favourable reaction to the usage of ict tools. according to subair and bada (2014), who conducted a similar study in osun state with 100 principals of public secondary schools, school administrators are aware of the value of ict in school administration, but the main challenge is a lack of the necessary skill and knowledge to use these resources. additionally, it was discovered that the majority of schools lack the necessary ict resources and that very few principals can use the ict tools at their disposal for administrative functions. the study also discovered that the majority of principals used print technologies for a variety of administrative tasks. in nigeria's akwa ibom state, etudor-eyo et al. (2012) studied the use of ict and the communication effectiveness of 396 secondary school administrators. the results showed that administrators' use of ict and their effectiveness in communication are both high, that there is a significant positive relationship between the two, and that the use of ict significantly predicts secondary school administrators' communication effectiveness.
the empirical review above shows that only a small body of knowledge exists about ict and administrative performance in the research area (ilorin metropolis) and its geopolitical region (the north-central geopolitical zone of nigeria). the limited topic coverage of these studies is one of the primary constraints of current research, which justifies the necessity for additional study. additionally, the literature review reveals that, despite a large number of studies on this topic, the findings do not demonstrate any consistency in the relationship between the variables under examination. these findings point to a glaring gap in the literature that the current study aims to fill.

3. methodology

this study uses a survey-style descriptive design to determine how ict affects principals' administrative performance. according to atunde (2011), descriptive research gives the researcher the chance to sample the opinions of a sizable number of respondents from the study population in order to draw conclusions and make generalizations based on the replies received. the study area is the metropolis of ilorin. principals and vice-principals of secondary schools in the ilorin metropolis participated in the survey. ilorin west, ilorin south, and ilorin east are the three local government areas (lgas) that make up the metropolis. there are 75 public secondary schools in the ilorin metropolis overall (kwara state ministry of education and human capital, 2018; the federal republic of nigeria, 2013). the population of this study consisted of 292 participants from all 75 public secondary schools in the ilorin metropolis, comprising 75 principals and 217 vice-principals. 45 of the 75 public secondary schools in the ilorin metropolis were sampled using the stratified random sampling technique (15 schools were chosen from each of the lgas).
additionally, from each of the 45 sampled public secondary schools, the principal and two vice-principals were chosen, giving 135 respondents in total. the instrument for data collection was a structured, validated, and pretested questionnaire titled "information communication technology and principal's administrative effectiveness questionnaire" (ictpaeq); reliability coefficients of 0.85, 0.74, and 0.71 were obtained for sections b, c, and d, respectively. the questionnaire has two parts, a and b. part a asks respondents for personal information. part b is divided into four sections (a to d). the 15 items in section a were used to determine whether ict equipment was available in the schools. the ten items in section b gather data on how extensively principals use ict. items in section c ask respondents how successfully they use ict tools to carry out their administrative responsibilities. items in section d cover the barriers to effective ict use in secondary schools. sections a and b use options such as "available" and "not available," and "frequently used," "occasionally used," "seldom used," and "not used," while sections c and d use "strongly agree," "agree," "disagree," and "strongly disagree." the researcher administered the questionnaire with the help of three research assistants. this direct method allowed for prompt completion and return of the questionnaire copies. the response rate was 96.3 per cent (130 of the 135 administered questionnaires were returned correctly filled).
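the sample size and response rate described above can be reproduced with a few lines of arithmetic. the sketch below is only an illustration: the figures come from the text, while the variable names are hypothetical.

```python
# Figures are taken from the methodology text; variable names are illustrative.
lgas = ["ilorin west", "ilorin south", "ilorin east"]
schools_per_lga = 15          # schools drawn from each LGA (stratified sample)
respondents_per_school = 3    # one principal plus two vice-principals

sampled_schools = len(lgas) * schools_per_lga
sample_size = sampled_schools * respondents_per_school

returned = 130                # questionnaires returned and correctly filled
response_rate = 100 * returned / sample_size

print(sampled_schools, sample_size, round(response_rate, 1))
```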
to address research questions 1, 2, and 3, data gathered from the field were analysed using descriptive statistics (percentages, mean, and standard deviation), while research question 4 was addressed using the pearson product-moment correlation (ppmc).

research question 1: what degree of ict infrastructure is present in secondary schools in the ilorin metropolis?

when interpreting responses on availability, an overall percentage score between 75 and 100 per cent is considered high, a score between 50 and 74 per cent is considered moderate, and a score of less than 50 per cent is considered low. the findings for this research question are presented in table 1.

table 1. the level of availability of ict facilities in secondary schools in the ilorin metropolis (n = 130)
s/n | items | available f (%) | not available f (%)
1 | computer (desktop) | 84 (64.6%) | 46 (35.4%)
2 | printer | 78 (60.0%) | 52 (40.0%)
3 | internet services | 72 (55.4%) | 58 (44.6%)
4 | projector | 24 (18.5%) | 106 (81.5%)
5 | projector screen | 24 (18.5%) | 106 (81.5%)
6 | photocopying devices/xerox machines | 70 (53.8%) | 60 (46.2%)
7 | scanning machine | 48 (36.9%) | 82 (63.1%)
8 | computer accessories | 79 (60.8%) | 51 (39.2%)
9 | software | 33 (25.4%) | 97 (74.6%)
10 | radio | 104 (80.0%) | 26 (20.0%)
11 | television | 37 (28.5%) | 93 (71.5%)
12 | satellite disc | 31 (23.8%) | 99 (76.2%)
13 | handset/mobile phone | 130 (100.0%) | -
14 | laptop | 38 (29.2%) | 92 (70.8%)
15 | fax machine | - | 130 (100.0%)
overall | | 43.7% | 56.3%

table 1 shows that the sampled secondary schools had low availability of information and communication technology (ict) facilities (43.7 per cent overall). most of the schools lacked most of the ict resources.
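the overall availability figure in table 1 and its interpretation can be recomputed from the "available" counts. the sketch below assumes the overall score is the mean of the per-item percentages and treats the band edges as 75 and 50; the function and variable names are hypothetical.

```python
# "Available" counts per facility, taken from table 1 (n = 130 respondents).
N = 130
available = {
    "computer (desktop)": 84, "printer": 78, "internet services": 72,
    "projector": 24, "projector screen": 24, "photocopying devices": 70,
    "scanning machine": 48, "computer accessories": 79, "software": 33,
    "radio": 104, "television": 37, "satellite disc": 31,
    "handset/mobile phone": 130, "laptop": 38, "fax machine": 0,
}

def availability_level(pct):
    """Map a percentage score to the bands stated in the text (edges assumed)."""
    if pct >= 75:
        return "high"
    if pct >= 50:
        return "moderate"
    return "low"

# Percentage of respondents reporting each facility as available.
per_item = {item: 100 * f / N for item, f in available.items()}

# Overall score: mean of the 15 per-item percentages (an assumption).
overall = sum(per_item.values()) / len(per_item)

print(round(overall, 1), availability_level(overall))
```

under these assumptions the recomputed overall score matches the 43.7 per cent reported in the table and falls in the "low" band.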
availability ranged from 0% for fax machines to 18.5% for projectors and projector screens, 23.8% for satellite discs, and 25.4% for software. 35.4 per cent of secondary schools lacked a desktop computer, which ought to be a standard component of all administrative offices. however, the mobile phone (130 or 100.0%) and radio (80.0%) were the most accessible ict resources in public secondary schools in the ilorin metropolis.

research question 2: to what extent do secondary school principals in the ilorin metropolis use ict for administrative purposes?

research question 2 was analysed using mean scores. a mean score of 2.50 or above was classed as an accepted response and a mean score below 2.50 as a rejected response. the individual and grand mean scores were interpreted as follows: high extent (he), 3.25 to 4.00; moderate extent (me), 2.50 to 3.24; low extent (le), below 2.50 (table 2).

table 2. the level of ict used by secondary school principals in the city of ilorin for administrative purposes (n = 130)
s/n | items | mean | s.d | decision
16 | a device that processes and stores data for efficient management | 3.30 | 0.78 | he
17 | equipment for printing documents | 2.75 | 0.75 | me
18 | internet browsing services | 2.64 | 0.73 | me
19 | handset for communicating the most recent information about school-related concerns to staff, parents, and students | 3.96 | 0.94 | he
20 | scanners for scanning documents, such as passports | 2.19 | 1.12 | le
21 | radio for monitoring the most recent and relevant events or information worldwide | 3.50 | 0.85 | he
22 | flash drives are a type of computer accessory used for data and information storage | 2.89 | 0.78 | me
23 | satellite disc for remote viewing of international programming | 2.46 | 1.03 | le
24 | reproduction of staff, student, and school documents using a photocopier | 3.27 | 0.85 | he
25 | for writing and designing, use programs like microsoft office and corel draw | 1.98 | 1.07 | le
overall mean | | 28.95 | 8.90 |
grand mean | | 2.90 | 0.89 | moderate extent

table 2 shows the extent of ict use for administrative purposes by secondary school principals in the ilorin metropolis. the mean scores on items 19, 21, 16, and 24 (3.96, 3.50, 3.30, and 3.27, respectively) indicate that principals use mobile phones to communicate with staff, students, and other principals; radios to keep up with global news; computers to type, process, and store data for effective management; and photocopiers to make copies of documents. additionally, the mean values on items 22, 17, and 18 (2.89, 2.75, and 2.64, respectively) show that principals use computers and computer accessories to process and store data, printers to print documents, and internet services to browse. however, only a limited share of principals use satellite discs for viewing foreign programmes remotely, scanners for scanning passports and other documents, and software of various types, as indicated by mean values of 2.46, 2.19, and 1.98 on items 23, 20, and 25, respectively. as a result, the grand mean of 2.90 indicates a moderate extent of ict use for administrative purposes by principals in public secondary schools in the ilorin metropolis.

research question 3: what level of efficiency do the ilorin metropolis' secondary school principals exhibit when carrying out their regular administrative duties? mean computation was used to analyze research question 3.
accordingly, a mean score of 2.50 or above was considered an accepted response, whereas a mean score below 2.50 was considered a rejected response. the individual and overall scores were interpreted as follows: high level (hl), 3.25 to 4.00; moderate level (ml), 2.50 to 3.24; low level (ll), below 2.50 (table 3).

table 3. the level of principals' performance in carrying out their standard administrative responsibilities in secondary schools in the city of ilorin (n = 130)
s/n | items | mean | s.d | decision
26 | preserving data that can be updated on personnel or student data | 2.60 | 0.81 | ml
27 | composing and sending mail | 3.09 | 0.60 | ml
28 | preserving an updated inventory of the school's assets | 3.29 | 0.74 | hl
29 | keeping an updated inventory of the school's assets in storage | 1.75 | 1.04 | ll
30 | design and printing of student evaluations and testimonies | 2.42 | 0.91 | ll
31 | admission of new students and registration of current ones | 2.95 | 0.69 | ml
32 | online registration for public exams is available for students | 2.64 | 0.71 | ml
33 | early budget planning for the school | 2.56 | 0.73 | ml
34 | keeping and accessing student disciplinary records | 2.80 | 0.82 | ml
35 | creating internal school memos | 2.63 | 0.78 | ml
36 | preparing the workload of teachers | 2.75 | 0.75 | ml
37 | creating a school schedule | 2.70 | 0.81 | ml
38 | spreading information both inside and outside of the school | 2.85 | 0.78 | ml
39 | compiling academic performance data for students | 2.66 | 0.77 | ml
40 | keeping reliable records of students' academic progress | 2.24 | 1.02 | ll
overall mean | | 39.79 | 11.26 |
grand mean | | 2.65 | 0.75 | ml

table 3 shows that the grand mean of 2.65 is higher than the 2.50 criterion value. this suggests that secondary school principals performed their routine administrative duties at a moderate level.
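the decision rule used for tables 2 and 3 (a 2.50 criterion with high, moderate, and low bands) can be written as a small function. this is a minimal sketch with hypothetical names; the grand mean is recomputed from the rounded item means printed in table 3, so it may differ in the second decimal place from the reported value.

```python
def decision(mean_score):
    """Classify a mean score on the 4-point scale using the bands in the text."""
    if mean_score >= 3.25:
        return "high"
    if mean_score >= 2.50:
        return "moderate"
    return "low"

# Item means for items 26-40, as printed in table 3.
table3_means = [2.60, 3.09, 3.29, 1.75, 2.42, 2.95, 2.64, 2.56,
                2.80, 2.63, 2.75, 2.70, 2.85, 2.66, 2.24]

grand_mean = sum(table3_means) / len(table3_means)

print(round(grand_mean, 2), decision(grand_mean))
for item_no, m in enumerate(table3_means, start=26):
    print(item_no, m, decision(m))
```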
research question 4: does the use of ict facilities significantly affect the administrative effectiveness of secondary school principals in the ilorin metropolis? (table 4).

table 4. correlation between the use of ict resources and administrative effectiveness

descriptive statistics
variable | mean | std deviation | n
usage of ict | 28.9523 | 8.90304 | 130
administrative performance | 39.7853 | 11.25708 | 130

correlations
variable | statistic | usage of ict | administrative performance
usage of ict | pearson correlation | 1 | .631
usage of ict | sig. (2-tailed) | | .000
usage of ict | n | 130 | 130
administrative performance | pearson correlation | .631 | 1
administrative performance | sig. (2-tailed) | .000 |
administrative performance | n | 130 | 130
correlation is significant at the 0.05 level (2-tailed).

as seen in table 4, the calculated r-value (0.631) is greater than the critical r-table value (0.195) at the 0.05 significance level for 128 degrees of freedom. hence, the null hypothesis is rejected. this demonstrates that there was a strong positive correlation between secondary school principals' administrative effectiveness in the ilorin metropolis and their use of ict facilities.

4. discussion of findings

according to the study's findings, there were few information and communication technology (ict) facilities available in public secondary schools in the ilorin metropolis. this result was in line with a previous study by adeyemi and olaleye (2010), who discovered that secondary schools in ekiti state had limited access to ict equipment (adeyemi & olaleye, 2010; anukam et al., 2012). this finding is also in line with that of subair and bada (2014), who claimed that only a small percentage of osun state's public secondary schools have the necessary ict resources, making it difficult for principals to use those resources for administrative functions.
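the correlation test reported in table 4 can be illustrated with a short sketch. the respondents' raw scores are not published, so the `x` and `y` lists below are hypothetical toy data; only r = 0.631 and n = 130 come from the study. the t statistic shown is the standard transformation of pearson's r with n - 2 degrees of freedom, offered as one common way to check significance rather than as the authors' exact procedure.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical toy scores, only to show the function in use.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
toy_r = pearson_r(x, y)

# Values reported in table 4.
reported_t = t_statistic(0.631, 130)

print(round(toy_r, 4), round(reported_t, 2))
```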
results from the second research question showed that secondary school principals in the ilorin metropolis used ict to a moderate degree for administrative purposes (grand mean value of 2.90). this finding may be explained by the fact that most schools lacked these ict resources. it differs from the finding of adeyemi and olaleye (2010) that secondary school principals in ekiti state used ict equipment at a low rate. the current study also found that principals performed only moderately in carrying out their designated administrative duties. this finding may be related to the lack of ict resources in the schools as well as the sparing use of those resources. this result, however, contradicts that of adeyemi and olaleye (2010), who noted that the level of management of secondary schools in ekiti state, nigeria, was low. it is, however, consistent with other researchers' findings (makewa et al., 2013). further analysis revealed a substantial correlation between the use of information and communication technology (ict) resources and secondary school administrators' administrative performance. this shows that the more secondary school principals use ict resources, the better they perform administratively. this further implies that secondary school management effectiveness may be at risk in the absence of ict. this result is consistent with that of oboegbulem and ugwu (2013), who reported that the use of ict in school administration is essential, particularly in this era of globalization, but that the extent of its application in secondary schools is very low because school administrators in the south eastern states of nigeria lack the skills to manage ict facilities for the efficient administration of schools. this result was in line with that of etudor-eyo et al.
(2012), who claimed that ict had a significant impact on the efficient operation of secondary schools.

5. conclusion

according to the study's findings, information and communication technology (ict) significantly affects secondary school principals' administrative performance in the city of ilorin. this implies that ict is a crucial factor in the efficient administration of secondary schools. however, the researcher concluded that secondary schools in nigeria are not yet prepared for technological progress because of the low availability and moderate usage of ict facilities. based on the findings of this study, the following recommendations were made: the government should make ict facilities accessible in all secondary schools so that administrators can use them and become more skilled; and governments and ngos (non-governmental organizations) may on occasion provide improved training for school administrators who are not yet ict-savvy.

references
adeniyi, w. o., & omoteso, b. a. (2014). emotional intelligence and administrative effectiveness of secondary school principals in southwestern nigeria. international journal of psychology and behavioral sciences, 4(2), 79-85.
adesina, m. o. (2015). ict: its relevance in the teaching and learning of physical education in nigeria. journal of emerging trends in educational research and policy studies, 6(3), 236-239.
adeyemi, t. o., & olaleye, f. o. (2010). information communication and technology (ict) for the effective management of secondary schools for sustainable development in ekiti state, nigeria. american-eurasian journal of scientific research, 5(2), 106-113.
angie, o., & ugwu, r. n. (2013).
the place of ict (information and communication technology) in the administration of secondary schools in south eastern states of nigeria.
anukam, i. l., okunamiri, p. o., & ogbonna, r. n. o. (2012). basic text on educational management.
arikewuyo, m. o. (2009). professional training of secondary school principals in nigeria: a neglected area in the educational system. florida journal of educational administration & policy, 2(2), 73-84.
atunde, m. o. (2011). influence of management information system on academic staff effectiveness in kwara state colleges of education. unpublished m. ed thesis, national open university of nigeria.
brinda, t., mavengere, n., haukijärvi, i., lewin, c., & passey, d. (2016). stakeholders and information technology in education. springer international publishing.
egoeze, f., misra, s., akman, i., & colomo-palacios, r. (2014). an evaluation of ict infrastructure and application in nigeria universities. acta polytechnica hungarica, 11(9), 115-129.
ereh, c. e., & okon, n. n. (2015). keeping of teachers' record and principals' administrative effectiveness in akwa ibom state secondary schools, nigeria. international journal of education, learning and development, 4(1), 40-44.
etudor-eyo, e., emah, i. e., & ante, h. a. (2012). the use of ict and communication effectiveness among secondary school administrators. educare, 4(2).
makewa, l., meremo, j., role, e., & role, j. (2013). ict in secondary school administration in rural southern kenya: an educator's eye on its importance and use. international journal of education and development using ict, 9(2).
muchiri, g. m. (2014). factors influencing school principals' integration of ict in administration of public secondary schools in githunguri sub county, kiambu county, kenya (doctoral dissertation, university of nairobi).
national open university nigeria (2014). course material on theories and practice of public administration (pad813). national open university of nigeria press.
okon, f.
i., akpan, e. o., & ukpong, o. u. (2011). financial control measures and enhancement of principals' administrative effectiveness in secondary schools in akwa ibom state. african journal of scientific research, 7(1), 335-342.
okon, j. e., ekaette, s. o., & ameh, e. (2015). information and communication technology (ict) utilization and principals' administrative effectiveness in public secondary schools in akwa ibom state, nigeria. african educational research journal, 3(2), 131-135.
oluwalola, f. k. (2017). record keeping, information and communication technology in school management. ed. olubor, r. o. et al. educational management: new perspectives. amfitop books.
oyedeji, n. b. (2012). management in education: principles and practice (revised edition). success educational services.
oyedemi, o. a. (2015, july). ict and effective school management: administrators' perspective. in proceedings of the world congress on engineering 1, 1-3.
salisu, r. o. (2014). information and communication technology (ict) and registrars' administrative effectiveness in kwara state colleges of education.
shah, m. (2014). impact of management information systems (mis) on school administration: what the literature says. procedia-social and behavioral sciences, 116, 2799-2804.
the federal republic of nigeria (2013). national policy on education. nerdc press.

moksud alam mallik. an efficient fuzzy clustering algorithm for mining user session ...| 80

an efficient fuzzy clustering algorithm for mining user session clusters on web log data

moksud alam mallik1,2*, nurul fariza zulkurnain1
1international islamic university malaysia, kuala lumpur, malaysia. 2vnr vignana jyothi institute of engineering & technology, hyderabad, india.
*corresponding email: 1alammallik_m@vnrvjiet.in

abstracts / article info

data mining is extremely vital to get important information from the web.
additionally, web usage mining (wum) is essential for companies. wum lets organizations derive rich information about their future commercial prospects. the information gathered by web usage mining helps organizations deliver more effective results and increase sales. client access patterns can be mined from web access log data using wum techniques. because there are so many end-user sessions and url resources, the size of web user session data is enormous. human interactions and non-deterministic browsing patterns increase the ambiguity and vagueness of client session data. the fuzzy set-based approach can solve most of the challenges listed above. this paper proposes an efficient fuzzy clustering algorithm for mining client session clusters from web access log data to find groups of client profiles. in addition, the methods used to preprocess the web log data, including data cleaning, client identification, and session identification, are described. this includes the method for feature selection (or dimensionality reduction) and session weight assignment.

article history: received 18 dec 2021 revised 20 dec 2021 accepted 25 dec 2021 available online 26 dec 2021 aug 2018
keywords: data mining, web usage mining (wum), data preprocessing, fuzzy clustering.
international journal of informatics, information system and computer engineering 2(2) (2021) 80-93

1.
introduction data mining, the extraction of hidden predictive information from large data sets, is a powerful new technology with great potential to help organizations focus on the most important information in their data warehouses. data mining tools predict future trends and behaviours, allowing organizations to make proactive, data-driven decisions. using a combination of machine learning, statistical analysis, modelling techniques, and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future outcomes. data mining (knowledge discovery from data) is the extraction of interesting (that is, non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from a large amount of data. it goes by several other names, including knowledge discovery (mining) in databases (kdd), knowledge extraction, data/pattern analysis, and so forth (han et al., 2012; zahid et al., 2011; cooley et al., 1997). web mining is defined, in a broad sense, as the discovery and analysis of useful information from the world wide web. web mining has two parts: web content mining and web usage mining. web usage mining is the automatic discovery of user access patterns from web servers. every business collects a significant amount of data in its daily operations. web servers generate this information, which is saved in server access logs. analysing server access log data helps the organization to determine the lifetime value of customers, marketing strategies for products, effective promotional campaigns, etc. it also helps in restructuring websites so that they represent the organization and promote its products and services better on the www. web mining is generally divided into two parts.
the first part is domain dependent; it converts web data into a suitable transaction form. this includes transaction identification, preprocessing, and data integration. the second part is the domain-independent application of generic data mining and pattern matching techniques such as clustering (cooley et al., 1997). preprocessing, pattern extraction, and analysis of results are all included in wum. the preprocessing stage of web usage mining converts raw web log (click-stream) data into a set of user profiles. each profile contains a set of urls that correspond to a user session. different preprocessing tasks, such as data fusion and cleaning, user identification, and session identification, use several algorithms and heuristic methods. data fusion refers to merging log files from several web servers. data cleaning includes tasks such as removing extraneous references to embedded objects, style files, graphics, or sound files, and removing references caused by spider navigation. removing such unwanted content reduces the size of the input file and makes the mining task more efficient. so, during preprocessing we clean the data, identify users by ip address, and identify user sessions using time-oriented heuristics. we can assign a weight to each url based on the number of times it is accessed in different sessions; a weight can also be assigned to a session according to the number of urls it contains (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011). once user sessions are found, we can use them for clustering.
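user identification by ip address, session identification with a time-oriented heuristic, and session weighting can be sketched as follows. this is not the authors' implementation: the 30-minute timeout, the weight formula (url count divided by the largest url count), and all names are assumptions made for illustration.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity that end a session (assumed)

def build_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group (timestamp, ip, url) requests into per-user sessions.

    Each distinct IP address is treated as one user; a gap longer than
    `timeout` between consecutive requests starts a new session.
    """
    by_user = defaultdict(list)
    for ts, ip, url in sorted(requests):
        by_user[ip].append((ts, url))

    sessions = []
    for ip, hits in by_user.items():
        current = [hits[0][1]]
        last_ts = hits[0][0]
        for ts, url in hits[1:]:
            if ts - last_ts > timeout:
                sessions.append((ip, current))
                current = []
            current.append(url)
            last_ts = ts
        sessions.append((ip, current))
    return sessions

def session_weights(sessions):
    """Weight each session by its URL count relative to the longest session."""
    longest = max(len(urls) for _, urls in sessions)
    return [len(urls) / longest for _, urls in sessions]

# Hypothetical requests: (unix time in seconds, client ip, url).
requests = [
    (0, "10.0.0.1", "/a"), (60, "10.0.0.1", "/b"),
    (4000, "10.0.0.1", "/c"), (30, "10.0.0.2", "/a"),
]
sessions = build_sessions(requests)
weights = session_weights(sessions)
print(sessions, weights)
```

keeping small sessions with a low weight, instead of deleting them, is one way to retain their information while limiting their influence on the clusters.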
small sessions are often removed because they represent noise in the data. rather than removing them outright, we can handle them with a fuzzy set-theoretic approach. direct elimination of small-sized sessions may cause the loss of a large amount of information. so, we can assign a weight to every session based on the number of urls accessed in the session (see figure 1).

figure 1. structure of web usage mining

after this, we can apply the fuzzy clustering algorithm to identify user session clusters. fuzzy clustering supports fuzzy membership: a single data point can belong to several clusters at the same time. every data point has a degree of membership in each cluster; it may have a high degree of membership in some clusters and a low degree in others. each membership value ranges from zero to one, and the membership values of one session across all cluster centres sum to one. fuzzy clustering is therefore a good fit for real-world data. for instance, if a data point lies on the boundary between two or more clusters, fuzzy clustering gives it partial membership in each of them (bezdek et al., 1984). if the membership value is zero, the data point is not part of that cluster; a non-zero value shows that the data point is associated with that cluster. here we can discover similar user access patterns, i.e. the same url patterns, by applying the fuzzy clustering algorithm. the output of this step is a set of separate user session clusters (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011).
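the membership properties described above (values between zero and one that sum to one across clusters) can be seen in the standard fuzzy c-means membership update of bezdek et al. (1984). the sketch below illustrates that standard formula only, not the algorithm proposed in this paper; the fuzzifier m = 2, the euclidean distance, and the toy vectors are assumptions.

```python
import math

def fcm_memberships(point, centres, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the distance
    from the point to centre i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centres]

    # A point lying exactly on one or more centres belongs fully to them.
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]

    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# Hypothetical two-dimensional session vectors and cluster centres.
centres = [(0.0, 0.0), (2.0, 0.0)]
boundary = fcm_memberships((1.0, 0.0), centres)   # equidistant from both
near_first = fcm_memberships((0.5, 0.0), centres)
print(boundary, near_first)
```

a point halfway between two centres receives equal partial membership in both, while a point closer to one centre receives a higher membership in that cluster.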
the literature review is in section 2, the proposed algorithm in section 3, the test results in section 4, and the conclusion and future improvements to this research in section 5.

2. literature survey

digitized information is easy to capture and very cheap to store. so a huge amount of data has been stored in different kinds of databases and other types of storage. the amount of stored data is growing at an exceptional rate, and this growing data accumulates in various large data stores. this situation requires powerful tools to extract knowledge from this ocean of information. with the very rapid growth of data sources available on the world wide web, it has become increasingly necessary for users to use automated tools to find the information resources they need, and to track and analyse their usage patterns. so, there is a need to create server-side and client-side tools that mine knowledge effectively (cooley et al., 1997). web usage mining is the discovery of user access patterns from web servers. how users access a website is critical to improving their use of the site. there are three steps: preprocessing, pattern extraction, and analysis of the results. different forms of noise are removed during the preprocessing stage. user and session identification are also completed in this stage. a wide variety of pattern extraction techniques, such as clustering and path analysis, are available, depending on the needs of the analyst. once web usage patterns are discovered, different techniques and tools can be used to analyse and understand them. a huge amount of irrelevant data is present in input web access logs. the many user sessions and url resources make the dimensionality of web user session data very high.
human interactions and non-deterministic browsing patterns increase the ambiguity and vagueness of user session data. the world wide web is a massive, dynamic data source that is both architecturally complex and constantly evolving. as a result, it is a fertile ground for data mining and web mining. web mining uses various data mining techniques to extract valuable information from the web. the majority of web data is unlabelled, distributed, heterogeneous, semi-structured, time-varying, and high-dimensional. the following categories of data can be found on the web: (i) the content of actual web pages; (ii) the intra-page structure of the web pages; (iii) the inter-page structure, which determines the linkage between web pages; (iv) usage data describing how web pages are used; (v) user profiles, which include demographic and registration information about users. web usage mining (wum) looks at the results of user interactions with a web server, including web logs, click streams, and database transactions at a website or a group of related sites. wum performs three principal steps: preprocessing, pattern extraction, and results analysis (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011). castellano et al. use lodap (log data preprocessor) to preprocess web log data (castellano et al., 2007; nasraoui et al., 2000). lodap is a software tool that processes web access data to remove irrelevant log entries, identify accesses made by users, and group user accesses into user sessions. every user session contains access information (number of visits, time of visit, and so on) about the pages viewed by a user; as a result, it describes that user's navigational behaviour.
the term "user identification" refers to the process of identifying unique users from web log data. generally, a log file in extended common log format provides only the computer's ip address and the user agent. websites that require user registration include additional login information that can be used to identify users. if login information is not available, each ip address is treated as a user. after this, we have to identify user sessions: the web log file is partitioned into parts known as user sessions, each of which is considered a single visit to a website. identifying user sessions from the web log file is a complicated task. the resulting sessions can be used as input to a variety of data mining algorithms (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011). for clustering user sessions, we employ the fuzzy c-means clustering technique, which requires initial cluster centres to be selected at random. the similarity measure is based on page visit time, using fuzzy intersection and union. even after preprocessing, noise is still present in the web log data. nasraoui et al. defined a similarity between user sessions, covering preprocessing and segmentation of web log data into sessions; user sessions can then be clustered using a fuzzy clustering technique. this noise affects the clustering result and similarity measures (olfa nasraoui et al., 2008). zahid et al. describe an existing web usage mining framework that uses the fuzzy set-theoretic approach in preprocessing and in clustering. it improves mining results compared with the crisp approach, because the fuzzy approach matches real-world scenarios more closely. it uses the fuzzy c-means algorithm for clustering (zahid et al., 2011; ansari et al., 2011).
using a fuzzy c-means clustering technique, castellano et al. aim to divide website users into different groups and generate session clusters. preprocessing should remove as much noise as possible, because noise affects the remaining operations such as session identification and session clustering. the fuzzy set-based approach can solve most of the challenges listed above. fcm needs an initial random selection of cluster centers. this work focuses on designing "an efficient fuzzy clustering algorithm for mining user session clusters from web access log data". it improves the quality of the clusters discovered (castellano et al., 2006).

3. method: proposed system

here a new efficient fuzzy clustering algorithm that can efficiently mine user session clusters from web access log data is proposed. the algorithm uses the minimum of medians when choosing cluster centers. this strategy reduces the mean squared error and eliminates the effect of outliers.

3.1. input data

the primary data sources used in web usage mining are the server log files, which include web server access logs and application server logs. the input server log data is downloaded from the site https://filewatch.net. filewatcher is an ftp search engine that monitors more than two billion files on more than 5,000 ftp servers. the downloaded file name is "pa.sanitized access.20070109.gz". a sample server log file entry is given below (table 1).

table 1. sample server log file entry

value                                                          description
1168300919.015                                                 the time of the request
1781                                                           the elapsed time for the http request
17.219.121.198                                                 ip address of the client
tcp_miss/200                                                   http reply status code
1333                                                           bytes sent in response to the request
get                                                            the requested action
http://www.quiethits.com/hitsurfer.php - direct/204.92.87.134  uri of the requested item, client user name, and hostname of the machine where we got the request
text/html                                                      content type of the object

3.2. data mining

every hour, popular websites generate gigabytes of web log data. managing such massive files is a difficult task. log file sizes can be reduced by performing data cleaning, which speeds up the mining tasks. when a user requests a web page by entering or clicking on a url, a single request usually causes several urls to be requested, such as figures and scripts. all urls with a graphic extension should therefore be removed. web robots are also identified and their requests are removed during data cleaning. in web log data, a web robot (also called a web wanderer, crawler, or spider) generates numerous request lines automatically. a robot's request is unwanted because it is generated by a machine, not by a user, so removing robot requests increases the accuracy of the clustering results. we employed two methods for removing robot requests: the first checks for requests for "robots.txt" in the web log data, and the second removes head requests (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011). next is the removal of urls with query strings. urls with query strings are normally used to request extra details from within a web page during the same session. since they are unnecessary, we remove them as well (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011).

the input file is 30.6 mb in size and has 206,914 entries.
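the three filters described above (graphic urls, robot requests, and urls with query strings) can be sketched as follows. this is a minimal illustration, not the authors' implementation: the field order follows the sample entry in table 1, the suffix list is assumed (the paper's own list is not reproduced here), the names `parse_line` and `clean_log` are hypothetical, and treating every client that requested robots.txt as a robot is one possible reading of the first robot check.

```python
from urllib.parse import urlsplit

# assumed suffix list: the paper refers to a suffix list but does not reproduce it
GRAPHIC_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".bmp", ".ico", ".css", ".js")


def parse_line(line):
    """split one log line (field order as in table 1) into named fields, or return None."""
    parts = line.split()
    if len(parts) < 7:
        return None
    try:
        return {
            "time": float(parts[0]),
            "elapsed": int(parts[1]),
            "ip": parts[2],
            "bytes": int(parts[4]),
            "method": parts[5].upper(),
            "url": parts[6],
        }
    except ValueError:
        return None  # malformed numeric field


def clean_log(lines):
    """apply the graphic-url, robot, and query-string filters to raw log lines."""
    entries = [e for e in (parse_line(l) for l in lines) if e is not None]
    # assumption: any client that asked for robots.txt is treated as a robot
    robot_ips = {
        e["ip"] for e in entries
        if urlsplit(e["url"]).path.lower().endswith("/robots.txt")
    }
    kept = []
    for e in entries:
        url = urlsplit(e["url"])
        if url.path.lower().endswith(GRAPHIC_SUFFIXES):
            continue  # graphic or embedded resource
        if e["ip"] in robot_ips or e["method"] == "HEAD":
            continue  # robot request
        if url.query:
            continue  # url with a query string
        kept.append(e)
    return kept
```

the robot ips are collected in a first pass so that requests a robot made before fetching robots.txt are also dropped.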
after removing urls with graphic content, the log file has 72,498 entries, which is about one third of the input file. after removing web robot requests, 72,305 entries remain. after removing urls with query strings, 59,054 entries remain in the log file. we then encode each ip address to hide the user's identity and to simplify later processing; each ip address is stored in a map together with its encoded id. likewise, each url is assigned a unique number and stored in a url map along with its number (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011). the data cleaning algorithm is given in the following scheme:

step 1: read each line of the input file one by one.
step 2: remove all urls with suffixes recorded in the suffix list.
step 3: remove all urls produced by web robots.
step 4: remove urls with query strings.
step 5: extract the ip address and store it in a map.
step 6: encode each url with its url number and store it in a map.
step 7: sort the lines by the encoded ip address.
step 8: write the required fields to an output file.

the output file after applying the above algorithm is shown in table 2. it is sorted in ascending order of the encoded ip address.

table 2. output file after data cleaning

ip   time            elapsed time  bytes  url
ip1  1168300931.828  142           1599   1
ip1  1168300935.244  501           1617   2
ip1  1168300936.604  1             1617   3
ip1  1168300941.345  2             1593   4
ip1  1168300957.585  186           1585   6
ip1  1168300985.665  145           1563   10

3.3. user identification

after cleaning the input web log data, we can identify users. since the log file doesn't contain user login information, we consider each ip address as a user. next, we separate all requests relating to each individual user.
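as a rough sketch of this grouping, the snippet below treats each ip address as one user, gives it an encoded id of the form used in table 2 (ip1, ip2, ...), and orders each user's requests by time. the function name `identify_users` and the dictionary-based entry layout (keys "ip" and "time") are assumptions made for illustration.

```python
from collections import defaultdict


def identify_users(entries):
    """group cleaned log entries by ip address and sort each group by time."""
    ip_ids = {}                  # ip address -> encoded id such as "ip1"
    by_user = defaultdict(list)  # encoded id -> requests made by that user
    for e in entries:
        # ids are assigned in order of first appearance in the log
        uid = ip_ids.setdefault(e["ip"], f"ip{len(ip_ids) + 1}")
        by_user[uid].append(e)
    for requests in by_user.values():
        requests.sort(key=lambda e: e["time"])
    return dict(by_user), ip_ids
```

keeping the ip-to-id map separate from the grouped requests means the real addresses can be discarded once the ids are assigned.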
the algorithm for user identification is shown in the following scheme (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011).

step 1: split every line in the input file into the required fields.
step 2: store the required fields in a map m1 with the ip address as the key and another map m2 as the value. the key of map m2 is the time and its value is the rest of the fields.
step 3: sort the inner map m2 by the time key.
step 4: print the content of map m1 to the output file.

the format of the output file produced after user identification is shown in table 3.

3.4. session identification

user session identification is the process of dividing the activity log of each user into sessions, each representing a single visit to the site. sites without user authentication information mostly rely on heuristic methods for sessionization. the sessionization heuristic helps in extracting the actual sequence of actions performed by one user during one visit to the site. in order to identify user sessions, we can experiment with two different time-oriented heuristics (toh), as described below (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011):

toh1: the duration of a session should not exceed a threshold α. let the timestamp of the first url request in a session be t1. a url request with timestamp ti is assigned to the same session if and only if ti - t1 ≤ α. the first url request with a timestamp larger than t1 + α is taken as the first request of the next session (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011):

step 1: repeat the following steps for every line in the input file.
step 2: if the line contains a user id, then userid = user id of the line.
step 3: print the line to the output file under this user id and the first session of the same user id.
step 4: if the line is the first accessed log entry of the user, then t1 = line.time; else t2 = line.time.
step 5: if t2 - t1 ≤ α, then print the line under the same session to the file.
step 6: otherwise (i.e., the condition in step 5 does not hold), output the user id and the corresponding line under a new session, and set t1 = line.time.

detailed information is shown in table 3.

table 3. algorithm to create user sessions taking into account toh1.

user  time            elapsed time  bytes  url
ip1   1168300931.828  142           1599   1
      1168300935.244  501           1617   2
      1168300936.604  1             1617   3
...
ip2   1168300953.645  648           260    5
      1168300990.665  143           260    14

toh2: the time spent on a page visit should not exceed a threshold α. let ti be the timestamp of the url most recently assigned to a session. the next url request belongs to the same session if and only if ti+1 - ti ≤ α, where ti+1 is the timestamp of the new url request. otherwise, this url becomes the first of the next session (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011). in our implementation, for the time being, we use toh1. we have chosen 30 minutes as the value of the threshold α. the algorithm for user session identification is given in the steps above, and the output file of session identification is shown in table 4.

3.5. dimensionality reduction

removing from the logs the references to low-support urls (i.e., those that are not supported by a specified number of user sessions) can provide an effective dimensionality reduction method while improving clustering. to implement this, we remove urls that occur only once (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011) (see table 4).

table 4. output file of session identification

user session  time            elapsed time  bytes  url
ip1s1         1168300931.828  142           1599   1
              1168300935.244  501           1617   2
              1168300936.604  1             1617   3
ip1s2         1168302738.407  81            1623   482
              1168302745.477  138           1559   483
...
ip2s49        1168300953.645  648           260    5
...

3.6. session weight assignment

the session files can be filtered before clustering to remove small sessions, with the purpose of removing variation from the data. however, deleting these small sessions directly may result in the loss of a significant amount of information, especially if the number of small sessions is large. instead, we assign a weight to each session based on the number of urls accessed in the session. session weight assignment is done based on the following equation (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011).

$$ W_{s_i} = \begin{cases} 0, & \text{if } |s_i| \le 1 \\ 1, & \text{if } |s_i| > 1 \end{cases} $$

where $|s_i|$ is the number of urls accessed in a particular session.

3.7. development of user session matrix

here we represent sessions using a matrix. every row denotes a session, and every column denotes a url. if a url occurs in a session, then the entry for that url in that session is greater than zero: it is the number of occurrences of that url in that session. if the url is not present, the entry is zero. sessions are stored as a sparse matrix in row-major form, which reduces processing time to a great extent. finally, we normalize the session matrix by dividing every column by its maximum value (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011). the fuzzy c-means technique is commonly employed for fuzzy clustering nowadays. so, in order to compare our new algorithm against the previous system, we used fcm (zahid et al., 2011; ansari et al., 2012; babuy et al., 2011; bezdek et al., 1984).

3.8. implementation of proposed system

the proposed system can be implemented as a fast fuzzy clustering technique for mining user session clusters from web log data, as described in section 5 titled "session clustering."
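before turning to the clustering itself, the preprocessing of sections 3.4 and 3.7 can be made concrete with a short sketch. it splits one user's requests into sessions with the toh1 rule (α = 30 minutes, expressed in seconds) and then builds a sparse, row-major session matrix whose columns are divided by their maximum value. the function names and the list-of-dictionaries representation of the sparse matrix are illustrative assumptions; dimensionality reduction and session weights are left out for brevity.

```python
def split_sessions_toh1(requests, alpha=30 * 60):
    """split one user's (time, url) requests into sessions using toh1.

    a request stays in the current session iff its timestamp is within
    alpha seconds of the first request of that session.
    """
    sessions = []
    t1 = None  # timestamp of the first request in the current session
    for t, url in sorted(requests):
        if t1 is None or t - t1 > alpha:
            sessions.append([])  # start a new session
            t1 = t
        sessions[-1].append(url)
    return sessions


def session_matrix(sessions):
    """build a sparse row-major session matrix, normalised by column maximum.

    each row is a dict {column index: value}; absent keys stand for zero.
    """
    urls = sorted({u for s in sessions for u in s})
    col = {u: j for j, u in enumerate(urls)}
    rows = []
    for s in sessions:
        row = {}
        for u in s:
            row[col[u]] = row.get(col[u], 0) + 1  # count occurrences of the url
        rows.append(row)
    col_max = {}
    for row in rows:
        for j, count in row.items():
            col_max[j] = max(col_max.get(j, 0), count)
    normalised = [{j: count / col_max[j] for j, count in row.items()} for row in rows]
    return urls, normalised
```

with the timestamps of table 4, the request at 1168302738.407 falls more than 1800 seconds after 1168300931.828, so it opens session ip1s2, which matches the table.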
the following is the main part of the processing: first, we take one session, say s1, and find the distance between this session and every other session (say s2, s3, s4, …, sn), multiplied by the membership of s1 in cluster center 1 (v1). next, we sort these values in ascending order and take the median. this step is repeated for all sessions s1, s2, s3, s4, …, sn. the medians obtained are then sorted and the smallest value is taken. the session corresponding to the smallest value is taken as the first cluster center in this round. all the above steps are repeated for cluster center 2 up to cluster center c (v1, v2, v3, …, vc). in this way, we get a new set of cluster centers in one round. new cluster centers are computed for a specified number of rounds, until we obtain optimal cluster centers.

3.9. modification in proposed system

for every cluster center, we select the smallest of the medians. however, the issue is that we get the same smallest median in each iteration, so in each cycle we get the same cluster center repeatedly. we therefore made a small change to the algorithm. instead of selecting the smallest median in each round, we choose the smallest median in the first round, the second smallest median in the second round, the third smallest median in the third round, and so on. with this change, the proposed algorithm shows signs of improved performance, which is superior to fcm.

3.9.1. fuzzy membership function

let $X = \{x_1, x_2, \ldots, x_m\}$ be the set of data points or sessions. each point is a vector of the form $x_i = \{x_{i1}, x_{i2}, \ldots, x_{in}\}$, $\forall i = 1, \ldots, m$.
let $V = \{v_1, v_2, \ldots, v_c\}$ be a set of n-dimensional vectors corresponding to the c cluster centers, where each cluster center is a vector of the form $v_j = \{v_{1j}, v_{2j}, \ldots, v_{nj}\}$, $\forall j = 1, \ldots, c$. let $u_{ij}$ denote the membership of data point (or session) $x_i$ in cluster j. the m×c membership matrix $U = [u_{ij}]$ shows the allocation of sessions to the different cluster centers. it satisfies the following conditions.

$$ \sum_{j=1}^{c} u_{ij} = 1, \quad \forall i = 1, \ldots, m $$

0<∑ 𝑢𝑖𝑗 𝑚 𝑖=1