COMPARATIVE ANALYSIS OF USING WORD EMBEDDING IN DEEP LEARNING FOR TEXT CLASSIFICATION

Mukhamad Rizal Ilham1, Arif Dwi Laksito2*)

Fakultas Ilmu Komputer, Universitas Amikom Yogyakarta
Yogyakarta, Indonesia
mukhamad.ilham@students.amikom.ac.id1, arif.laksito@amikom.ac.id2*)
(*) Corresponding Author

Abstract

Natural language processing (NLP) is a group of theory-driven computing techniques used to analyse and represent human language automatically. From part-of-speech (POS) parsing and tagging to machine translation and dialogue systems, NLP enables computers to carry out various natural language-related activities at all levels. In this research, we compared the FastText and GloVe word embedding techniques, which are used for text representation. This study aims to evaluate and compare the effectiveness of word embedding in text classification using LSTM (Long Short-Term Memory). The research stages are dataset collection, pre-processing, word embedding, data splitting, and, finally, deep learning. According to the experimental results, FastText is superior to the GloVe technique, with an accuracy that reaches 90%. The number of epochs did not significantly improve the accuracy of the LSTM model with either GloVe or FastText. It can be concluded that the FastText word embedding technique is superior to the GloVe technique.

Keywords: Word Embedding; Sentiment Analysis; Deep Learning; LSTM

INTRODUCTION

Natural language processing (NLP) is a group of theory-driven computing techniques used to analyse and represent human language automatically. NLP research has evolved from punch cards and batch processing, where decoding a single sentence took up to seven minutes, to an era like Google's, where millions of web pages can be processed in less than a second (Young, Hazarika, Poria, & Cambria, 2018).
From part-of-speech (POS) parsing and tagging to machine translation and dialogue systems, NLP enables computers to carry out various natural language-related activities at all levels. Text classification, which is used in applications such as information filtering, spam filtering, and text categorisation, is becoming increasingly crucial due to the vast quantity of data and the considerable daily growth in the number of documents produced. The main research topics include efficient document text representation and the selection of better deep learning algorithms. The technique of automatically comprehending, gathering, and analysing textual data to extract sentiment information from views expressed in text is known as sentiment analysis or opinion mining (Wang, Nulty, & Lillis, 2020).

Word embedding is a technique that converts words into continuous vectors of a certain length. These vectors summarise the words' syntactic and semantic information, which is why word embedding is considered a suitable feature representation in neural network models for natural language processing (NLP) tasks (Deho, Agangiba, Aryeh, & Ansah, 2018). Word weighting is a data pre-processing strategy that assigns an appropriate weight to each term to represent the term's relevance to the text. This model plays a vital role in improving text classification with high efficiency. Word embedding is a crucial technique in deep learning because it turns text into a form that a deep learning model can take as input.

Deep learning is a technique for feature extraction, pattern recognition, and classification that employs several layers of processing to build different models and execute classification tasks on the gathered data (Imaduddin, Widyawan, & Fauziati, 2019). The deep learning algorithm used in this research is LSTM (Long Short-Term Memory), a variation of the RNN (Recurrent Neural Network). LSTM overcomes the weakness of RNN, namely its inability to retain information during learning when too much data has to be stored.

The development of word embedding began with the bag-of-words technique, the first technique created for encoding words in vector form. In 1972, Karen Spärck Jones introduced the TF-IDF (Term Frequency - Inverse Document Frequency) technique, a combination of TF (term frequency) and IDF (inverse document frequency) that statistically measures how descriptive a word is across a collection of documents (Jones, 1972). In 2013, Tomas Mikolov and his colleagues at Google created the word2vec approach to make neural network-based word embedding training practical (Mikolov, Chen, Corrado, & Dean, 2013). A year later, Jeffrey Pennington and his colleagues created the GloVe (Global Vectors) technique, an extension of the efficient word2vec learning technique (Brennan et al., 2017). The most recent technique, FastText, was developed by Facebook in 2017 and is very fast and effective in learning word representations and classifying text (Bojanowski, Grave, Joulin, & Mikolov, 2017).
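As a brief illustration of the TF-IDF weighting just described, the sketch below computes TF-IDF vectors for a toy corpus. It is not part of this paper's experiments; it assumes scikit-learn's TfidfVectorizer, whose smoothed IDF variant differs slightly from the original 1972 formulation.

```python
# A minimal TF-IDF illustration on a toy corpus (not from the paper's dataset).
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the app keeps crashing after the update",
    "great app for streaming music",
    "music quality is great",
]

vectorizer = TfidfVectorizer()          # uses smoothed IDF by default
tfidf = vectorizer.fit_transform(corpus)

# Each row is a document vector; words confined to few documents, such as
# "crashing", receive higher weights than words spread across many documents.
print(vectorizer.get_feature_names_out())
print(tfidf.toarray().round(2))
```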
There have been numerous studies in the area of sentiment analysis that used word embedding techniques such as Bag of Words (Imaduddin et al., 2019; Marukatat, 2020), Word2Vec (AlSurayyi, Alghamdi, & Abraham, 2019; Imaduddin et al., 2019; Kilimci & Akyokus, 2019; Marukatat, 2020; Rahman, Sari, & Yudistira, 2021), doc2vec (Imaduddin et al., 2019), GloVe (AlSurayyi et al., 2019; Imaduddin et al., 2019; Kilimci & Akyokus, 2019), and FastText (Kilimci & Akyokus, 2019; Marukatat, 2020). Those word embedding approaches were then evaluated using RNN (Kilimci & Akyokus, 2019; Rahman et al., 2021), CNN (Kilimci & Akyokus, 2019), LSTM (Kilimci & Akyokus, 2019; Rahman et al., 2021), and Naïve Bayes (Rahman et al., 2021).

A study by (Deho et al., 2018) used word embedding to identify the polarity of sentiment (positive, negative, or neutral) in existing text, which improved the accuracy of sentiment categorisation; the study's findings indicate that the word embedding technique can increase the precision of text classification (Deho et al., 2018). Additionally, a new technique known as Improved Word Vector (IWV) was presented by (Rezaeinia, Ghodsi, & Rahmani, 2017) to increase the precision of pre-trained word embeddings in sentiment analysis, and the IWV approach significantly improved the researchers' proposed sentiment analysis technique.

In addition to proposing new approaches, other researchers have compared the performance of the word2vec Continuous Bag of Words (CBOW), word2vec, doc2vec, GloVe, and FastText embeddings. On the domain of hotel reviews from the Traveloka site, with a total of 5,000 reviews, the GloVe method achieved the highest accuracy, 95.52%, compared to the other methods (Imaduddin et al., 2019). This is in line with previous research (Kamiş & Goularas, 2019) finding that GloVe improved performance in almost all configurations.

The effectiveness of deep learning models has also been compared. Previous research by (AlSurayyi et al., 2019) compared RNN combined with LSTM, RNN combined with Bi-LSTM (Bidirectional LSTM), and CNN (Convolutional Neural Networks), with word representation using the word2vec and GloVe techniques, on a domain of restaurant reviews from Yelp. The results showed that RNN combined with Bi-LSTM using the GloVe technique achieved better accuracy than the other methods. Researchers (Rahman et al., 2021) compared LSTM, Naïve Bayes, and RNN with word representation using the word2vec technique; the outcomes demonstrated that LSTM was better than the other approaches.

This study uses GloVe and FastText word embedding in the LSTM model for text representation. It aims to analyse and compare the effectiveness of word embedding on text represented by the LSTM deep learning architecture.

RESEARCH METHODS

This research is expected to provide high-accuracy text classification and a sound performance comparison between the GloVe and FastText word embedding techniques evaluated using LSTM. To achieve the research objectives, we used the methodology shown in Figure 1.

Figure 1. Flowchart

Data collection and labelling

The Spotify application review dataset from the Kaggle.com website is used in this study.
The dataset used in our study can be downloaded from https://www.kaggle.com/datasets/mfaaris/spotify-app-reviews-2022. The Spotify app reviews on the Google Play Store were collected from January 1, 2022, to July 9, 2022. The dataset consists of 61,594 rows and five columns, namely time_submitted, review, rating, total_thumbsup, and replay. However, only the review and rating columns are used in this study.

Pre-processing

Before the data is used for sentiment analysis, several preparatory steps must be carried out to get the best classification results. In the first stage, symbols, punctuation marks, and emojis are removed from the dataset. The second stage, tokenisation and case folding, breaks the sentences in the dataset down into words, also known as tokens, and converts all capital letters into lowercase letters. The third step is filtering, or stop-word removal: keeping important words and discarding words that are unimportant or carry no meaning.

Figure 2. Percentage of labelling data

The library used for filtering is NLTK (Natural Language Toolkit), developed by Steven Bird and Edward Loper at the University of Pennsylvania in 2001 (Botrè, Lucarini, Memoli, & D'Ascenzo, 1981). The fourth step is stemming, which converts non-standard words into common base words by removing affixes. The last step is labelling the data. The data is grouped into positive and negative sentiments based on application ratings, as in the research (AlSurayyi et al., 2019; Imaduddin et al., 2019), which used only positive and negative sentiments in data labelling. This study's percentage of labelled data is shown in Figure 2.

Word Embedding

A. GloVe
Word embedding converts words into a continuous vector form with a predefined length. Many methods have been developed to convert words into vectors, including bag of words, TF-IDF (Jones, 1972), word2vec (Mikolov et al., 2013), GloVe (Brennan et al., 2017), and FastText (Bojanowski et al., 2017). GloVe is a method that combines the local context-based learning of word2vec with global statistical matrix factorisation techniques such as LSA (Brennan et al., 2017). The GloVe methodology combines matrix factorisation and the skip-gram method. The co-occurrence matrix built by GloVe (word context X) is used for prediction and calculation outside the existing corpus (Imaduddin et al., 2019).

$J = \sum_{i,j=1}^{V} f(X_{ij})\left(w_i^{\top}\tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij}\right)^2$ .......... (1)

where each element $X_{ij}$ indicates the number of times word $i$ occurs in the context of word $j$, $\tilde{w}_j$ is the vector for the context word, $w_i$ is the vector of the main word, and $b_i$, $\tilde{b}_j$ are the scalar biases for the main word and the context word (Kilimci & Akyokus, 2019).

B. FastText
FastText, an open-source project from Facebook Research, is a fast and efficient technique for learning word representations and performing text classification, often used for NLP. The distinguishing feature of FastText embeddings is that they analyse the internal structure of words. This works particularly well in morphologically rich languages, as it allows representations to be learned for the various morphological forms of a word (Bojanowski et al., 2017). FastText implements the negative-sampling skip-gram model proposed for word2vec with a modified scoring function: the score of a word is calculated by summing the vector representations of its n-grams,

$s(w, c) = \sum_{g \in G_w} z_g^{\top} v_c$ .......... (2)

where $G_w \subset \{1, \dots, G\}$ is the set of n-grams found in the word $w$.
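As an illustration of this step, the sketch below loads 300-dimensional pretrained GloVe and FastText vectors with gensim and builds an embedding matrix for the classifier. The paper does not name its exact pretrained files, so the gensim-data model names and the `word_index` placeholder are assumptions for illustration only.

```python
# Hedged sketch: loading 300-dimensional GloVe and FastText vectors and building
# an embedding matrix. The gensim-data model names below are stand-ins; the
# paper only states that English, 300-dimensional embeddings were used.
import numpy as np
import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-300")             # 300d GloVe stand-in
fasttext = api.load("fasttext-wiki-news-subwords-300")  # 300d FastText stand-in

def build_embedding_matrix(word_index, vectors, dim=300):
    """Map each tokenizer index to its pretrained vector (zeros if unknown)."""
    matrix = np.zeros((len(word_index) + 1, dim))
    for word, i in word_index.items():
        if word in vectors:
            matrix[i] = vectors[word]
    return matrix

# word_index would come from a tokenizer fitted on the reviews, e.g.
# {"app": 1, "music": 2, ...}; here it is a hypothetical placeholder.
```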
Split Dataset

The dataset is divided into training and testing data to train the machine learning model. In this experiment, the data is divided in a ratio of 80:20, with 80% of the data used to train the model and 20% used to test it.

LSTM

Long Short-Term Memory (LSTM) networks are a complex deep learning approach. LSTM works very well on various problems and is widely used by many researchers (AlSurayyi et al., 2019). Due to its complex dynamics, LSTM can easily "memorise" data over a lengthy period. The "long-term" memory is stored in a vector of memory cells $c_t^l \in \mathbb{R}^n$. Although LSTM designs vary in connection layout and activation function, all LSTM architectures have explicit memory cells that can store data for long periods, and the LSTM can replace, retrieve, or retain the memory cell contents for later use (Zaremba, Sutskever, & Vinyals, 2014). Three gates store information over an extended period: the forget gate removes information from the cell that is not needed, the input gate adds beneficial information, and the output gate pulls valuable information from the cell state for the output values. Through gates that let information flow through or be blocked, the LSTM unit decides what to store and when to permit reads, writes, and deletions (Kilimci & Akyokus, 2019).

Figure 3. LSTM

A collection of LSTM architectures, or memory cells, is shown in Figure 3.

$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$ .......... (3)
$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$ .......... (4)
$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$ .......... (5)
$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$ .......... (6)
$h_t = o_t \tanh(c_t)$ .......... (7)

where $i_t$ is the input gate, $f_t$ the forget gate, $o_t$ the output gate, $c_t$ the cell activation vector (the current memory state), $x_t$ the input at time t, $h_{t-1}$ the previous hidden state, $c_{t-1}$ the previous memory state, and $h_t$ the current hidden state.

Evaluation

The evaluation stage checks the accuracy of the experimental results and measures the performance of the resulting model. The performance of the algorithm is measured in this study using a confusion matrix, and the values used for assessment are True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).

$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$ .......... (8)
$Precision = \frac{TP}{TP + FP}$ .......... (9)
$Recall = \frac{TP}{TP + FN}$ .......... (10)
$F\text{-}Measure = \frac{2 \times Recall \times Precision}{Recall + Precision}$ .......... (11)
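As a minimal illustration of Eqs. (8) to (11), the sketch below derives the four metrics from a confusion matrix, assuming scikit-learn; the label arrays are hypothetical placeholders, not the study's test set.

```python
# Minimal sketch of the evaluation stage: computing Eqs. (8)-(11) from a
# confusion matrix with scikit-learn. The labels below are hypothetical;
# in the study they would be the test-set sentiments and LSTM predictions.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # 1 = positive, 0 = negative
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy  = (tp + tn) / (tp + tn + fp + fn)                 # Eq. (8)
precision = tp / (tp + fp)                                  # Eq. (9)
recall    = tp / (tp + fn)                                  # Eq. (10)
f_measure = 2 * recall * precision / (recall + precision)   # Eq. (11)

print(f"acc={accuracy:.2f} prec={precision:.2f} "
      f"rec={recall:.2f} f1={f_measure:.2f}")
```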
RESULTS AND DISCUSSION

In this section, we analyse and compare how well each word embedding performs when represented by the LSTM architecture. This research was conducted using Python version 3.10.7 and Jupyter Notebook version 6.4.11 on a device with an Intel Core i3-8100 processor, 8 GB of RAM, and the Windows 10 Pro operating system.

The dataset used in this study consists of 61,594 Spotify application reviews from the Google Play Store that have gone through pre-processing and labelling. As shown in Figure 2, the dataset is divided into positive and negative sentiments, with negative sentiments taken from reviews with ratings of 1-2 and positive sentiments from reviews with ratings of 4-5.

After the pre-processing phase, word embedding techniques were employed to convert words into vector form with a predetermined length. In this step, we compared GloVe and FastText for the English language with 300 dimensions. It took about 2 minutes 30 seconds to load the 4.7 GB of GloVe data, while it took 5 minutes 50 seconds to load the 4.2 GB of FastText data.

Table 1. Classification accuracy of the deep learning model with GloVe word embedding
GloVe       Accuracy   Time
epoch 50    89%        33 minutes 11 seconds
epoch 100   89%        1 hour 7 minutes 45 seconds
epoch 200   89%        2 hours 16 minutes 6 seconds

Table 2. Classification accuracy of the deep learning model with FastText word embedding
FastText    Accuracy   Time
epoch 50    90%        34 minutes 15 seconds
epoch 100   90%        1 hour 7 minutes 22 seconds
epoch 200   90%        2 hours 14 minutes 56 seconds

Further, the dataset is split into training and testing data in a ratio of 80:20, with 80% of the data used to train the model and 20% to test it. To balance the data, random oversampling (ROS) was used to duplicate minority-class samples and add them to the training dataset before the data split. There were 29,937 positive and 24,771 negative opinions before the ROS process; consequently, the number of minority (negative) labels was raised to 29,937, the same as the number in the majority (positive) class.

In this study, we used a single-layer LSTM architecture with 64 units, the Adam optimiser, and a learning rate of 0.001. The training was performed in three runs of 50, 100, and 200 epochs. Figures 4 to 6 illustrate the LSTM in the three runs using the GloVe word embedding approach; training and validation show a wide gap in both accuracy and loss.

Figure 4. Training (a) Accuracy and (b) Loss using LSTM and GloVe at 50 epochs

Figure 5. Training (a) Accuracy and (b) Loss using LSTM and GloVe at 100 epochs

Figure 6. Training (a) Accuracy and (b) Loss using LSTM and GloVe at 200 epochs

Figures 7 to 9 depict the accuracy and loss during LSTM training using FastText word embedding at 50, 100, and 200 epochs. The FastText word embedding technique is superior to the GloVe technique at all epochs, as seen in Tables 1 and 2. However, training and validation show a wide gap in accuracy and loss, similar to the GloVe word embedding.

Figure 7. Training (a) Accuracy and (b) Loss using LSTM and FastText at 50 epochs

Figure 8. Training (a) Accuracy and (b) Loss using LSTM and FastText at 100 epochs

Figure 9. Training (a) Accuracy and (b) Loss using LSTM and FastText at 200 epochs
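The experimental setup described above can be summarised in the following sketch, assuming a Keras implementation (the paper does not state its framework); `padded`, `labels`, and `embedding_matrix` (from the earlier sketch) are hypothetical placeholders.

```python
# Hedged sketch of the training setup: ROS balancing, an 80:20 split, and a
# single 64-unit LSTM with Adam (lr = 0.001), as described in the text.
from imblearn.over_sampling import RandomOverSampler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.initializers import Constant

# Oversample the minority (negative) class before the split, per the text.
X_bal, y_bal = RandomOverSampler(random_state=42).fit_resample(padded, labels)
X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.20, random_state=42)

model = Sequential([
    Embedding(input_dim=embedding_matrix.shape[0], output_dim=300,
              embeddings_initializer=Constant(embedding_matrix),
              trainable=False),      # frozen pretrained GloVe/FastText vectors
    LSTM(64),                        # single LSTM layer, 64 units
    Dense(1, activation="sigmoid"),  # binary sentiment output
])
model.compile(optimizer=Adam(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50)
```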
CONCLUSIONS AND SUGGESTIONS

Bag of Words, TF-IDF, word2vec, GloVe, and FastText are some of the word embedding methods for representing words in vector form. In this study, we compared GloVe and FastText word embedding, two state-of-the-art word representation techniques, using the deep learning LSTM architecture. According to the experimental results, FastText is superior to the GloVe technique, with an accuracy that reaches 90%. The number of epochs did not significantly improve the accuracy of the LSTM model with either GloVe or FastText. For all the scenarios tested, training and validation showed a wide gap in the model's accuracy and loss, so model improvement is needed in future research. Moreover, an early stopping method for model training is crucial for avoiding overfitting and underfitting; early stopping can also achieve model convergence in a precise number of epochs.

REFERENCES

AlSurayyi, W. I., Alghamdi, N. S., & Abraham, A. (2019). Deep learning with word embedding modeling for a sentiment analysis of online reviews. International Journal of Computer Information Systems and Industrial Management Applications, 11, 227–241. Retrieved from http://www.mirlabs.org/ijcisim/regular_papers_2019/IJCISIM_22.pdf

Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135–146. Retrieved from https://transacl.org/ojs/index.php/tacl/article/view/999

Botrè, C., Lucarini, C., Memoli, A., & D'Ascenzo, E. (1981). 397 - On the entropy production in oscillating chemical systems. Bioelectrochemistry and Bioenergetics, 8(2), 201–212. https://doi.org/10.1016/0302-4598(81)80041-4

Brennan, P. M., Loan, J. J. M., Watson, N., Bhatt, P. M., & Bodkin, P. A. (2017). Pre-operative obesity does not predict poorer symptom control and quality of life after lumbar disc surgery. British Journal of Neurosurgery, 31(6), 682–687. https://doi.org/10.1080/02688697.2017.1354122

Deho, O. B., Agangiba, W. A., Aryeh, F. L., & Ansah, J. A. (2018). Sentiment analysis with word embedding. 2018 IEEE 7th International Conference on Adaptive Science & Technology (ICAST), 1–4. https://doi.org/10.1109/ICASTECH.2018.8506717

Imaduddin, H., Widyawan, & Fauziati, S. (2019). Word embedding comparison for Indonesian language sentiment analysis. 2019 International Conference of Artificial Intelligence and Information Technology (ICAIIT), 426–430. https://doi.org/10.1109/ICAIIT.2019.8834536

Jones, K. S. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28(1), 11–21. https://doi.org/10.1108/eb026526

Kamiş, S., & Goularas, D. (2019). Evaluation of deep learning techniques in sentiment analysis from Twitter data. 2019 International Conference on Deep Learning and Machine Learning in Emerging Applications (Deep-ML), 12–17. https://doi.org/10.1109/Deep-ML.2019.00011

Kilimci, Z. H., & Akyokus, S. (2019). The evaluation of word embedding models and deep learning algorithms for Turkish text classification. 2019 4th International Conference on Computer Science and Engineering (UBMK), 548–553. IEEE. https://doi.org/10.1109/UBMK.2019.8907027

Marukatat, R. (2020). A comparative study of using bag-of-words and word-embedding attributes in the spoiler classification of English and Thai text.
In Studies in Computational Intelligence (Vol. 847). Springer International Publishing. https://doi.org/10.1007/978-3-030-25217-5_7

Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. 1st International Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings, 1–12. Retrieved from https://arxiv.org/abs/1301.3781

Rahman, M. Z., Sari, Y. A., & Yudistira, N. (2021). Analisis sentimen tweet COVID-19 menggunakan word embedding dan metode Long Short-Term Memory (LSTM) [Sentiment analysis of COVID-19 tweets using word embedding and the Long Short-Term Memory (LSTM) method]. Jurnal Pengembangan Teknologi Informasi Dan Ilmu Komputer, 5(11), 5120–5127. Retrieved from http://j-ptiik.ub.ac.id

Rezaeinia, S. M., Ghodsi, A., & Rahmani, R. (2017). Improving the accuracy of pre-trained word embeddings for sentiment analysis. ArXiv, 1–15. Retrieved from https://arxiv.org/abs/1711.08609

Wang, C., Nulty, P., & Lillis, D. (2020). A comparative study on word embeddings in deep learning for text classification. Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, 37–46. https://doi.org/10.1145/3443279.3443304

Young, T., Hazarika, D., Poria, S., & Cambria, E. (2018). Recent trends in deep learning based natural language processing [Review article]. IEEE Computational Intelligence Magazine, 13(3), 55–75. https://doi.org/10.1109/MCI.2018.2840738

Zaremba, W., Sutskever, I., & Vinyals, O. (2014). Recurrent neural network regularization. ArXiv, 1–8. Retrieved from http://arxiv.org/abs/1409.2329