International Journal on Advances in ICT for Emerging Regions 2022 15 (2):  

 

October 2022                                                             International Journal on Advances in ICT for Emerging Regions 

Facebook for Sentiment Analysis: Baseline Models to Predict Facebook Reactions of Sinhala Posts
Vihanga Jayawickrama∗, Gihan Weeraprameshwara∗, Nisansa de Silva∗, Yudhanjaya Wijeratne† 

∗ Department of Computer Science & Engineering, University of Moratuwa 

†LIRNEasia 

vihangadewmini.17@cse.mrt.ac.lk 

 
Abstract— Research on natural language processing in most regional languages is hindered by resource poverty. A possible solution for this is the utilization of social media data in research. For example, the Facebook network allows its users to record their reactions to text via a typology of emotions. This network, taken at scale, is therefore a prime dataset of annotated sentiment data. This paper uses millions of such reactions, derived from a decade's worth of Facebook post data centred on a Sri Lankan context, to model an eye-of-the-beholder approach to sentiment detection for online Sinhala textual content. Three different sentiment analysis models are built: one considering a limited subset of reactions, one considering all reactions, and one that derives a positive/negative star rating value. The efficacy of these models in capturing the reactions of the observers is then computed and discussed. The analysis reveals that the Star Rating Model is, for Sinhala content, significantly more accurate (0.82) than the other approaches. The inclusion of the Like reaction is discovered to hinder the capability of accurately predicting other reactions. Furthermore, this study provides evidence for the applicability of social media data to eradicate the resource poverty surrounding languages such as Sinhala.

 

Keywords— NLP, sentiment analysis, Sinhala, word vectorization

I. INTRODUCTION  

Understanding human emotions is an interesting, yet complex process which researchers and scientists around the world have been attempting to standardize for a long period of time. In the computational sciences, sentiment analysis has become a major research topic, especially in relation to textual content [1, 2]. Several fields, scattered in diverse arenas from product marketing to political manipulation, benefit from the advancements in sentiment

 
Correspondence: Vihanga Jayawickrama (E-mail: vihangadewmini.17@cse.mrt.ac.lk)
Received: 10-08-2022 Revised: 25-10-2022 Accepted: 28-10-2022
Vihanga Jayawickrama, Gihan Weeraprameshwara, and Nisansa de Silva are from the University of Moratuwa, Department of Computer Science and Engineering. (vihangadewmini.17@cse.mrt.ac.lk, gihanravindu.17@cse.mrt.ac.lk, nisansadds@cse.mrt.ac.lk)
Yudhanjaya Wijeratne is from LIRNEasia (yudhanjaya@lirneasia.net)
This paper is an extended version of the paper "Seeking Sinhala Sentiment: Predicting Facebook Reactions of Sinhala Posts" presented at the ICTer Conference (2021).
DOI: http://doi.org/10.4038/icter.v15i2.7247
© 2022 International Journal on Advances in ICT for Emerging Regions

 

  

analysis. Studies such as those conducted by Rudkowsky et al. [3], Aguwa et al. [4], and Zobal [5] have described the potential of sentiment analysis, attempted to introduce useful tools for the field, and discovered new knowledge.
Sentiment analysis of textual content can be approached in two ways:
1) Through the perspective of the creator
2) Through the perspective of the observer
Many research projects follow the first approach, but only a few, such as Hui et al. [6], have followed the second. Exploring the perspective of the observer is quite important since the emotional reactions of the author and the reader to the same content are not necessarily identical. For certain fields, such as movie reviews [7] or product reviews [8], the perspective of the author is much more valuable than that of the reader; however, this relationship does not always hold true. Much effort is expended in the field of political polling, for example, where the public perception of a speech is studied to assess its impact.
To the best of our knowledge, no attempt has been made to do such analysis in Sinhala, the subject of this study. Sinhala, like many other regional languages, suffers from resource poverty [9]. Previous research and resources available for NLP in Sinhala are limited and isolated [10, 11]. This is therefore an experimental attempt at bridging this knowledge gap. The objective is to predict the sentimental reaction of Facebook users to textual content posted on Facebook. This study uses a raw corpus of Sinhala Facebook posts scraped through Crowdtangle1 by Wijeratne and de Silva [12], and analyses the user reactions therein as a sentiment annotation that reflects the emotional reaction of a reader to the said post [13]. The Facebook reactions Like, Love, Wow, Haha, Sad, Angry, and Thankful are utilized as the sentiment annotation of a post within the scope of this project. Figure 1 illustrates the visual representations of the Facebook reactions presented to users and included in the dataset.

1 https://www.crowdtangle.com/

Fig. 1: Facebook Reactions

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and 

reproduction in any medium, provided the original author and source are credited. 

https://creativecommons.org/licenses/by/4.0/legalcode



Overall, three models were created and tested. For the first model, a reaction vector was created for each post from the normalized reaction counts of the Love, Wow, Haha, Sad, and Angry categories. The Like and Thankful reactions, which are outliers at the high and low ends of the count spectrum respectively, were ignored. The results showed that this procedure could predict reaction vectors with F1 scores ranging between 0.13 and 0.52. The second model was highly similar to the first, the only difference being the inclusion of the Like and Thankful reactions in the prediction. The resultant F1 scores ranged between 0.00 and 0.96.

In the third model, the reactions were combined to create a positivity/negativity value for each post, following the procedure presented by De Silva et al. [8]. Here, Love and Wow were considered positive, Sad and Angry were considered negative, and Haha was ignored due to its conflicting use cases. The normalization was carried out as before for the four included reactions, and the difference between positive and negative values was re-scaled into the range 1 to 5, in order to map to the popular star rating system utilized by De Silva et al. [8]. The F1 score of this star rating value ranged between 0.29 and 0.30. In contrast, the binary categorization of reactions as Positive and Negative exhibited promising results, with F1 scores in the range 0.70-0.71 for Positive and 0.41-0.42 for Negative.

Thus, it can be concluded that such a binary categorization system captures the sentimental reaction to a Facebook post more efficiently than the multi-category reaction value system, and provides a reasonably accurate measure for the imputation of such sentiment.

It should be reiterated that the values used here are completely independent of the intended or perceived sentiment of the original posts and depend solely on the sentiment expressed by the audience reactions. Further, the model only attempts to predict the positivity or negativity of the Facebook reactions added to a post by users, not the actual emotion elicited in the users by the post. While the two might be correlated, the exact nature of the relation would have to be further explored before reaching a definite conclusion. Figure 2 illustrates the scope of this research, where arrows indicate the influences among intended and perceived sentiments. This journal paper is an extension of our previously published conference paper [14].

II. BACKGROUND 

Many of the studies on sentiment analysis are focused on 

purposes such as understanding the quality of reviews given 

for products presented in e-commerce sites [8, 15, 16] or 

understanding the political preferences of people [3, 17]. 

Among the research on review analysis, the work of De Silva et al. [8] is prominent. Rather than conducting sentiment analysis at the sentence or document level, following the more traditional procedures which assume each sentence or document to reflect a single emotion, that study determined sentiments at the aspect level. Different aspects were extracted from each review, and a sentiment value was calculated for each aspect. Further, the study provides a set of guidelines for determining the semantic orientation of a subject using a sentiment lexicon, covering how to handle negations, words that intensify sentiment, words that shift the sentiment of a sentence, and groups of words used to express an emotion, all of which are important for converting sentiment in text into mathematical figures. The methodology presented by De Silva et al. [8] is crucial for this study since it provides the basis of one of the two workflows we discuss in this study to predict reactions for Sinhala text.

The work by Martin and Pu [16], research on creating a prediction model that can identify helpful reviews not yet voted on by other users, emphasizes the value of sentiment analysis. Rather than relying solely on structural aspects of a review, such as its length and readability score, the emotional context was also utilized in rating the reviews, with the support of the GALC lexicon, which represents 20 different emotional categories. One of the most important findings of the project was that the emotion-based model outperforms the structure-based model by 9%. The work of Singh et al. [15], too, used several textual features such as ease of reading, subjectivity, polarity, and entropy to predict the helpfulness ratio. The model is intended to assist in assigning a helpfulness value to a review as soon as it is posted, thus giving the spotlight to useful reviews over irrelevant ones. Both studies highlight the usefulness of understanding the reactions of readers to different content. Studies on political preferences cover a massive area. Many governments and political parties use social media to understand their audiences. Therefore, the power vested in sentiment analysis cannot be ignored.

The research by Caetano et al. [17] and Rudkowsky et al. [3] explains two different cases where sentiment analysis is utilized in politics. Caetano et al. analyse Twitter data to characterize the homophily of the Twitter audience, while Rudkowsky et al. demonstrate the usability of word embeddings over bag-of-words by developing a negative sentiment detection model for parliament speeches. Caetano et al. conclude that the homophily level increases with the multiplex connections of the audience, while Rudkowsky et al. state that the negativity of a parliament member's speeches correlates with the position they hold in parliament. While these instances may not be immediately identifiable as direct results of sentiment analysis, they are good examples of the wide range covered by sentiment analysis.

Facebook data plays a major part in our research; therefore, it is vital to explore previous research done on Facebook data. The work by Pool and Nissim [18] and Freeman et al. [19] uses datasets obtained from Facebook for emotion detection. The data scope covered by the work of Freeman et al. lacks diversity since that research focuses solely on scholarly articles. However, Pool and Nissim attempted to maintain a general dataset by using a variety of sources, ranging from the New York Times to SpongeBob. The motivation behind this wide range of sources was to pick the best sources to train ML models for each reaction. Pool and Nissim also looked into developing models with different features such as TF-IDF, embeddings, and n-grams. This comparison provides useful guidelines for selecting features in data. One of the most important aspects of the work by Pool and Nissim is that they took the extra step of testing their models with external datasets, namely AffectiveText [20], Fairy Tales [21], and ISEAR [22], to validate the developed model, since those are widely used datasets in the field of sentiment analysis. This provides a common ground to compare different sentiment



 

analysis models. The work of Graziani et al. [13] too follows 

the same procedure in comparing their model to those of 

others. 

While all the papers mentioned above provide quite useful information, almost all of them relate to English, which is a resource-rich language. In contrast, our project is based on the Sinhala language, which is resource-poor in the NLP domain [9]. Very few attempts have been made to detect sentiments in Sinhala content, and most of the attempts made were either abandoned or not released to the public [10]. This poses a major challenge to our work due to the scarcity of similar work in the domain.

Among the currently available research in this arena, Senevirathne et al. [23] is, to the best of our knowledge, the state-of-the-art Sinhala text sentiment analysis attempt. In that paper, Senevirathne et al. introduce a study of sentiment analysis models built using different deep learning techniques, as well as an annotated sentiment dataset consisting of 15,059 Sinhala news comments. The work was done to understand the reactions of readers. Furthermore, earlier attempts such as Medagoda et al. [24] provide insight into utilizing resources available for languages such as English to drive progress in sentiment analysis in Sinhala. The partially automated framework for developing a sentiment lexicon for Sinhala presented by Chathuranga et al. [25] is a noteworthy attempt at using a Part-of-Speech (PoS) tagged corpus for sentiment analysis. The authors proposed the use of adjectives tagged as positive or negative to predict the sentiment embedded in textual content.

Obtaining a corpus that would fit our purposes was the second major challenge we faced when working with Sinhala, given that, as Caswell et al. [26] observe, the majority of the publicly available datasets for low-resource languages are not of adequate quality. Fortunately, the work of Wijeratne and de Silva [12] provided an adequate dataset. The authors presented Corpus-Alpha, a collection of Sinhala Facebook posts; Corpus-Sinhala-Redux, containing posts with only Sinhala text; and a collection of stopwords. Both the raw corpus created by the authors and the stopwords are used in our work.

III. METHODOLOGY 

This study was conducted using the raw Facebook data corpus developed by Wijeratne and de Silva [12] through Facebook Crowdtangle. The corpus consists of 1,820,930 Facebook posts created by pages popular in Sri Lanka between 01-01-2010 and 02-02-2020 [12]. Table I describes the columns of the corpus that were utilized for the purpose of this study. The Facebook reactions, which are emotional reactions of Facebook users to content, are utilized as sentiment annotations within this study. Taken collectively, these user annotations can be considered an effective representation of the public perception of the given content.

 

A. Pre-processing 
 

The corpus was pre-processed by cleaning the Message column and normalizing reaction counts. Cleaning the Message column began with removing control characters from the text. Characters belonging to the Unicode categories Cc, Cn, Co, and Cs were replaced with a space [27]. The character with the Unicode value 8205, also known as the Zero Width Joiner, was replaced with a null string, while the other characters in category Cf were replaced by a space. The reason for this is that the Zero Width Joiner is often present in the middle of Sinhala words, especially when the Sinhala characters rakāransaya (රකාරාාංශය), yansaya (යාංසය), and rēpaya (රේඵය) are used.

From the resulting text, URLs, email addresses, user tags (of the format @user), and hashtags were removed. Since only Sinhala and English words are considered in this study, any words containing characters that are neither Sinhala nor ASCII were removed. The stop words for Sinhala developed from this corpus by Wijeratne and de Silva [12] were removed next. English letters in the corpus were then converted to lowercase. All remaining characters belonging to neither the Sinhala nor the English alphabet were replaced with white spaces. Numerical content was removed due to its high unlikelihood of being repeated in the same sequence order. Finally, multiple continuous white spaces in the corpus were replaced with a single white space. Once cleaned, entries whose Message column was merely a null or empty string were removed from the corpus. The final cleaned corpus consisted of 526,732 data rows.
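The cleaning steps above can be sketched in Python using the standard unicodedata and re modules. This is a minimal illustration, not the authors' exact implementation; the regular expressions and the Sinhala Unicode range check are our assumptions.

```python
import re
import unicodedata

def clean_message(text: str, stopwords: set) -> str:
    """Approximate the cleaning pipeline: control/format characters,
    Zero Width Joiner, URLs/emails/tags/hashtags, non-Sinhala/non-ASCII
    words, stopwords, lowercasing, numerals, and whitespace collapsing."""
    out = []
    for ch in text:
        if ch == '\u200d':              # Zero Width Joiner -> null string
            continue
        if unicodedata.category(ch) in ('Cc', 'Cn', 'Co', 'Cs', 'Cf'):
            out.append(' ')             # other control/format chars -> space
        else:
            out.append(ch)
    text = ''.join(out)

    # Remove URLs, e-mail addresses, user tags, and hashtags
    text = re.sub(r'https?://\S+|www\.\S+', ' ', text)
    text = re.sub(r'\S+@\S+', ' ', text)
    text = re.sub(r'[@#]\w+', ' ', text)

    # Drop words containing characters that are neither Sinhala nor ASCII
    def keep(word):
        return all('\u0d80' <= c <= '\u0dff' or ord(c) < 128 for c in word)
    words = [w for w in text.split() if keep(w)]

    # Remove stopwords, lowercase English, strip numerals and punctuation
    words = [w.lower() for w in words if w not in stopwords]
    text = ' '.join(words)
    text = re.sub(r'[^a-z\u0d80-\u0dff ]', ' ', text)
    return re.sub(r'\s+', ' ', text).strip()
```

Entries whose cleaned message is an empty string would then be dropped, as described above.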

 

B. Core Reaction Set Model 
 

In selecting the core reaction set, the Like and Thankful reactions were excluded because their counts are outliers in comparison to the other reactions; Like is an outlier on the higher end and Thankful on the lower end. The total count of each reaction in the corpus, along with their percentages, is given in Table II. A probable reason for the abnormal behaviour of these reactions is the duration for which they have been present on Facebook. Like was the first reaction introduced to the platform, back in 2009 [28]. Love,

Fig. 2: The scope of the research in comparison to the series of sentiments associated with a Facebook post 




 

 

T = n_L + n_W + n_H + n_S + n_A   (1)

N_r = n_r / T   (2)

 
TABLE II
TOTAL COUNTS OF REACTIONS IN THE CORPUS

Reaction | Count       | Percentage (All) | Percentage (Core)
Like     | 528,060,209 | 95.43            | -
Love     | 12,526,942  | 2.26             | 49.56
Wow      | 1,906,174   | 0.34             | 7.54
Haha     | 6,524,139   | 1.18             | 25.81
Sad      | 2,987,589   | 0.54             | 11.82
Angry    | 1,329,552   | 0.24             | 5.26
Thankful | 13,637      | 0.002            | -
    

Wow, Haha, Sad, and Angry reactions were introduced in 2016 [29]; however, Like remained the default reaction, which a simple click on the react button registers. The Thankful reaction was a temporary option introduced as part of Facebook's Mother's Day celebrations in May 2016 [30]. The reaction was removed from the platform after a few days, and was reintroduced in May 2017 only to be removed again after the Mother's Day celebrations [31].

Thus, the core reaction set was defined considering only the Love, Wow, Haha, Sad, and Angry reactions. The percentages of the core reactions are also shown in Table II, and Fig. 3 shows them as a pie chart. Initially, the normalization was done considering only the core reactions. Equation 1 obtains the sum of reactions (T) of an entry using the counts of Love (n_L), Wow (n_W), Haha (n_H), Sad (n_S), and Angry (n_A). Equation 2 gives the normalized value N_r for reaction r, where n_r is the raw count of the reaction and T is the sum obtained in Equation 1.
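As a sketch, the normalization in Equations 1 and 2 can be written as follows; the guard for posts with no core reactions is our assumption, since the paper does not state how zero totals were handled.

```python
def normalize_core_reactions(row: dict) -> dict:
    """Normalize the core reaction counts of a post (Equations 1 and 2):
    each count is divided by the sum over Love, Wow, Haha, Sad, Angry."""
    core = ['Love', 'Wow', 'Haha', 'Sad', 'Angry']
    total = sum(row[r] for r in core)          # T in Equation 1
    if total == 0:                             # posts with no core reactions
        return {r: 0.0 for r in core}
    return {r: row[r] / total for r in core}   # N_r in Equation 2
```

The resulting values of a post sum to 1, which is what later allows the min-based accuracy measure of Equation 5.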

The dataset was then divided into train and test subsets for the purpose of calculating and evaluating the accuracy of vector predictions. The Message column of the train set was
 

 
 

Fig. 3: Core Reaction Percentages 

 

tokenized into individual words, and a set operation was used to obtain the collection of unique words in each entry. Then, a dictionary was created for each entry by assigning the normalized reaction vector of the entry to each word. The dictionaries thus created were merged vertically, taking the average of the vectors assigned to a word across the dataset as the aggregate reaction vector of that word. Equation 3 describes this process, where V_W is the aggregate reaction vector for the word W, R_i is the reaction vector of the i-th entry (E_i), n is the number of entries, and ∅ is the empty vector.

 

V_W = ( Σ_{i=1}^{n} { R_i if W ∈ E_i; ∅ otherwise } ) / ( Σ_{i=1}^{n} { 1 if W ∈ E_i; 0 otherwise } )   (3)
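Equation 3 amounts to averaging, per word, the reaction vectors of all entries containing that word. A minimal sketch follows; the `entries` input format (a list of message/vector pairs) is an assumption for illustration.

```python
from collections import defaultdict

def build_word_vectors(entries):
    """Build the word -> aggregate reaction vector dictionary (Equation 3).
    `entries` is a list of (message, reaction_vector) pairs, where the
    vector holds the normalized reaction values of the entry."""
    sums = defaultdict(lambda: None)
    counts = defaultdict(int)
    for message, vector in entries:
        for word in set(message.split()):      # unique words per entry
            if sums[word] is None:
                sums[word] = [0.0] * len(vector)
            for k, v in enumerate(vector):
                sums[word][k] += v
            counts[word] += 1
    # Average the vectors assigned to each word across all entries
    return {w: [s / counts[w] for s in sums[w]] for w in sums}
```

A word appearing in no entry simply never enters the dictionary, matching the ∅ (empty vector) case in Equation 3.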

 

The dictionary thus created was used to predict the reaction vectors of the test dataset. Entries in the test set were tokenized and converted to unique word sets in the same way as the training set. Then, for each word in the word set of a message that also exists in the dictionary created above, the corresponding reaction vector was obtained from the dictionary. Entries none of whose words were found in the dictionary were assigned the mean vector of the train dataset. Equation 4 shows the calculation of the predicted vector V_M for a message M, where V_W is taken from the dictionary (populated as described in Equation 3) and N_M is the number of words in the message M.

 

TABLE I
FIELDS OF THE SOURCE DATASET THAT WERE USED FOR THIS STUDY

Field Name | Description                                | Data Type
Index      | Index of the dataset                       | int
Like       | The number of Like reacts on the post      | int
Love       | The number of Love reacts on the post      | int
Wow        | The number of Wow reacts on the post       | int
Haha       | The number of Haha reacts on the post      | int
Sad        | The number of Sad reacts on the post       | int
Angry      | The number of Angry reacts on the post     | int
Thankful   | The number of Thankful reacts on the post  | int
Message    | Textual content of the Facebook post       | string




V_M = ( Σ_{W ∈ M} V_W ) / N_M   (4)

 

C. Defining the Evaluation Statistics 
 

To evaluate the performance of the prediction process, a number of statistics were calculated. Equation 5 shows the calculation of the accuracy A_r for reaction r, where N_r is the expected (actual) value for the entry as calculated in Equation 2 and M_r is the predicted value calculated in Equation 4, with M_r ∈ V_M.

A_r = min(N_r, M_r)   (5)

The accuracy can be defined this way since we are solving a bin packing problem and the vector values sum to 1. Equations 6, 7, and 8 show the calculation of recall (R_r), precision (P_r), and F1 score (F1_r) respectively, with the same notation as Equation 5.

 

R_r = A_r / M_r   (6)

P_r = A_r / N_r   (7)

F1_r = 2 × A_r / (N_r + M_r)   (8)

 

The above measures were calculated for each entry of the dataset, and the average value of each measure was taken as the resultant performance measure for the dataset. These values were then averaged across 5 runs of the code.
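The per-reaction statistics of Equations 5-8 can be sketched as follows; the zero-denominator guards are our assumption, as the paper does not say how zero values were handled.

```python
def reaction_scores(actual: float, predicted: float) -> dict:
    """Per-reaction evaluation statistics (Equations 5-8).
    `actual` is N_r from Equation 2; `predicted` is M_r from Equation 4."""
    a = min(actual, predicted)                            # accuracy, Eq. 5
    recall = a / predicted if predicted else 0.0          # Eq. 6
    precision = a / actual if actual else 0.0             # Eq. 7
    denom = actual + predicted
    f1 = 2 * a / denom if denom else 0.0                  # Eq. 8
    return {'accuracy': a, 'recall': recall, 'precision': precision, 'f1': f1}
```

These per-entry scores would then be averaged over the dataset, and over the 5 runs, as described above.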

 

D. All Reaction Set Model 
 

The All Reaction Set Model was developed following the same procedure as the Core Reaction Set Model. In addition to the reactions included in the core reaction set, Like (n_Li) and Thankful (n_T) were considered in this step. Equation 9 shows how the sum of reactions T* is obtained, while the normalized value N*_r for each reaction is obtained as shown in Equation 10.

 
T* = n_Li + n_L + n_W + n_H + n_S + n_A + n_T   (9)

N*_r = n_r / T*   (10)

 

The sentiment vector for each entry was then generated 

following the same procedure as in III-B. The evaluation was 

done as mentioned in III-C.  

 

E. Star Rating Model 
 

The next step of the study was inspired by the procedure proposed by De Silva et al. [8]. They propose using the star rating associated with Amazon customer reviews to generate sentiment vectors. The star rating takes a value between 1 and 5, where 3 is considered neutral, and values above 3 and below 3 are considered positive and negative respectively. To adjust Facebook reactions to this scale, we classified the
 

Fig. 4: Reactions Considered for the Star Rating Model 

positivity of reactions as presented in Table IV. The positivity of the Haha reaction is considered uncertain due to its conflicting use cases: the reaction is often used both genuinely and sarcastically on the platform [32]. Therefore, the experiment was carried out considering only the Love, Wow, Sad, and Angry reactions. The normalization process described in Section III-B for the Core Reaction Set Model was updated by modifying Equation 1 as shown in Equation 11 and Equation 2 as shown in Equation 12, where T′ is the modified sum of reactions of the entry. Figure 4 presents the distribution of the selected reactions in the corpus.

 
T′ = n_L + n_W + n_S + n_A   (11)

N′_r = n_r / T′   (12)

 

The positive sentiment value E_(P,i) for entry i was calculated by summing the normalized Love (N′_L) and normalized Wow (N′_W) values, while the negative sentiment value E_(N,i) was calculated by summing the normalized Sad (N′_S) and normalized Angry (N′_A) values, as shown in Equations 13 and 14. Using E_(P,i) and E_(N,i), the aggregated sentiment for entry i was calculated as shown in Equation 15.
 

E_(P,i) = N′_(L,i) + N′_(W,i)   (13)

E_(N,i) = N′_(S,i) + N′_(A,i)   (14)

E_i = E_(P,i) − E_(N,i)   (15)

 

The Star Rating Value (S_i) for entry i is calculated over the entire dataset as shown in Equation 16, where I is the set of entries in the dataset.

S_i = 4 × ( (E_i − min_{E_j ∈ I} E_j) / (max_{E_j ∈ I} E_j − min_{E_j ∈ I} E_j) ) + 1   (16)

 

The sentiment vector (V_i) for entry i is defined in Equation 17, where E_(P,i), E_(N,i), and S_i are calculated as described above.

V_i = [E_(P,i), E_(N,i), S_i]   (17)
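The Star Rating Model computation of Equations 11-17 can be sketched end-to-end; the guards for zero reaction totals and a zero sentiment range are our assumptions.

```python
def star_rating_vectors(posts):
    """Compute the sentiment vector [E_P, E_N, S] of each post
    (Equations 11-17). `posts` is a list of dicts holding raw counts
    of the Love, Wow, Sad, and Angry reactions."""
    raw = []
    for p in posts:
        t = p['Love'] + p['Wow'] + p['Sad'] + p['Angry']   # Equation 11
        n = {r: (p[r] / t if t else 0.0)                   # Equation 12
             for r in ('Love', 'Wow', 'Sad', 'Angry')}
        e_pos = n['Love'] + n['Wow']                       # Equation 13
        e_neg = n['Sad'] + n['Angry']                      # Equation 14
        raw.append((e_pos, e_neg, e_pos - e_neg))          # Equation 15
    lo = min(e for _, _, e in raw)
    hi = max(e for _, _, e in raw)
    span = (hi - lo) or 1.0
    # Rescale the aggregated sentiment into the 1-5 star range (Equation 16)
    return [[p, n, 4 * (e - lo) / span + 1] for p, n, e in raw]
```

A purely positive post maps to a star rating of 5 and a purely negative one to 1, with the rest spread linearly in between.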

 
Once the vectors were computed, the processing of test 

and train sets, building of the dictionary, and evaluating the  




  
TABLE III
PERFORMANCE MEASURES OF VECTOR PREDICTIONS

                      Core Reaction Set Model                  All Reaction Set Model
Train (%)  Reaction   Accuracy Recall  Precision F1 Score     Accuracy Recall  Precision F1 Score
95         Like       -        -       -         -            0.9169   0.9651  0.9691    0.9626
           Love       0.3119   0.5863  0.7838    0.5164       0.0056   0.2510  0.6221    0.1769
           Wow        0.0298   0.3111  0.6373    0.2218       0.0005   0.1487  0.4550    0.0818
           Haha       0.1163   0.4241  0.6279    0.3060       0.0042   0.1646  0.6044    0.1068
           Sad        0.0497   0.2355  0.6206    0.1613       0.0015   0.1013  0.5829    0.0638
           Angry      0.0175   0.2059  0.5837    0.1318       0.0006   0.0880  0.5193    0.0495
           Thankful   -        -       -         -            0.0000   0.0007  0.0440    0.0000
90         Like       -        -       -         -            0.9170   0.9652  0.9691    0.9626
           Love       0.3119   0.5847  0.7833    0.5147       0.0056   0.2513  0.6225    0.1774
           Wow        0.0299   0.3110  0.6375    0.2216       0.0005   0.1486  0.4557    0.0818
           Haha       0.1160   0.4242  0.6261    0.3053       0.0042   0.1639  0.6043    0.1064
           Sad        0.0497   0.2360  0.6205    0.1616       0.0015   0.1009  0.5840    0.0636
           Angry      0.0174   0.2041  0.5834    0.1308       0.0006   0.0882  0.5162    0.0494
           Thankful   -        -       -         -            0.0000   0.0007  0.0376    0.0000
80         Like       -        -       -         -            0.9167   0.9649  0.9691    0.9625
           Love       0.3118   0.5854  0.7833    0.5153       0.0056   0.2515  0.6208    0.1770
           Wow        0.0298   0.3113  0.6370    0.2218       0.0005   0.1490  0.4527    0.0816
           Haha       0.1160   0.4238  0.6266    0.3052       0.0042   0.1647  0.6037    0.1067
           Sad        0.0499   0.2380  0.6176    0.1623       0.0015   0.1012  0.5825    0.0636
           Angry      0.0174   0.2045  0.5856    0.1314       0.0006   0.0889  0.5142    0.0497
           Thankful   -        -       -         -            0.0000   0.0007  0.0297    0.0000
70         Like       -        -       -         -            0.9167   0.9650  0.9690    0.9625
           Love       0.3117   0.5855  0.7829    0.5152       0.0056   0.2513  0.6216    0.1771
           Wow        0.0298   0.3110  0.6376    0.2217       0.0005   0.1484  0.4539    0.0814
           Haha       0.1158   0.4236  0.6263    0.3049       0.0042   0.1643  0.6045    0.1065
           Sad        0.0497   0.2368  0.6183    0.1616       0.0015   0.1014  0.5816    0.0637
           Angry      0.0174   0.2050  0.5847    0.1314       0.0006   0.0885  0.5155    0.0495
           Thankful   -        -       -         -            0.0000   0.0007  0.0342    0.0000
50         Like       -        -       -         -            0.9167   0.9650  0.9690    0.9625
           Love       0.3121   0.5863  0.7824    0.5156       0.0056   0.2513  0.6206    0.1768
           Wow        0.0298   0.3113  0.6361    0.2214       0.0005   0.1491  0.4519    0.0815
           Haha       0.1155   0.4236  0.6249    0.3043       0.0042   0.1643  0.6034    0.1063
           Sad        0.0496   0.2366  0.6195    0.1617       0.0015   0.1014  0.5815    0.0636
           Angry      0.0173   0.2041  0.5855    0.1310       0.0006   0.0886  0.5142    0.0494
           Thankful   -        -       -         -            0.0000   0.0007  0.0330    0.0000




 

 
 
Fig. 6: Changes of the F1 score of the Star Rating Model with Train-Test Division
 

 

TABLE IV
POSITIVITY AND NEGATIVITY OF FACEBOOK REACTIONS

Reaction | Positivity/Negativity
Love     | Positive
Wow      | Positive
Haha     | Uncertain
Sad      | Negative
Angry    | Negative

 

model were conducted as described in Sections III-B and III-C. The performance measures of this model were calculated using Gaussian distances.

 
1) Accuracy: The accuracy of the prediction for each post was measured in terms of the True Gaussian Distance of the post, which is defined as the Gaussian distance from the true Star Rating Value of the post to its predicted Star Rating Value, on a distribution centered on the true Star Rating Value. It should be noted that the raw star rating values, before discretization into classes, are utilized here. The accuracy A′_i of a post i with True Gaussian Distance G_(T,i) is calculated as shown in Equation 18. Equation 19 then describes the calculation of the accuracy A′_x for class x containing n_x posts.

 
A′_i = 1 − G_(T,i)   (18)

A′_x = ( Σ_{i=1}^{n_x} A′_i ) / n_x   (19)

 

2) Precision: To calculate the precision of predictions, the Gaussian Trespass of each post into its predicted class was considered. The trespass was measured as the Gaussian distance from the boundary of the true class of the post to the midpoint of its predicted class, on a Gaussian distribution centered on the midpoint of the true class. Equation 20 shows the calculation of the precision of each star rating class, where P′_x represents the precision value of class x, n_(cc,x) represents the number of correctly classified posts in class x, and T_i represents the trespass value of post i in class x.

 

\( \acute{P}_x = \dfrac{n_{cc,x}}{n_{cc,x} + \sum_{i=1}^{n_x} T_i} \)    (20)
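Given precomputed trespass values, Equation 20 reduces to a short aggregation; the trespass computation itself depends on Gaussian parameters not stated here, so only the aggregation is sketched:

```python
def class_precision(n_correct, trespasses):
    # Equation 20: the n_cc,x correctly classified posts of class x are
    # discounted by the summed Gaussian Trespass T_i of the posts in the class.
    return n_correct / (n_correct + sum(trespasses))
```

With no trespass at all the precision is 1, and it falls as posts trespass further into their predicted class.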

 

3) Recall: The recall value was calculated for each post in terms of its Class Gaussian Distance, defined as the Gaussian distance from the midpoint of the true Star Rating Class of the post to the midpoint of its predicted Star Rating Class, on a distribution centered on the midpoint of the true class. The recall value \(\acute{R}_x\) for a class \(x\) consisting of \(n_x\) Facebook posts, each with a recall of \(\acute{R}_i\), was obtained as depicted by Equation 21.

 

\( \acute{R}_x = \dfrac{\sum_{i=1}^{n_x} \acute{R}_i}{n_x} \)    (21)

 

4) F1 Score: The F1 score of each class was then calculated from the precision and recall values of the class, following the standard formula. Equation 22 portrays the calculation of the F1 score \(\acute{F1}_x\) for class \(x\).
 

\( \acute{F1}_x = \dfrac{2 \times \acute{P}_x \times \acute{R}_x}{\acute{P}_x + \acute{R}_x} \)    (22)
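Equation 22 is the standard harmonic mean of precision and recall; as a one-liner:

```python
def f1(precision, recall):
    # Equation 22: harmonic mean of the class precision and recall values.
    return 2 * precision * recall / (precision + recall)
```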

 

5) Overall Performance: The overall performance measures for star rating were calculated by taking a weighted mean of the per-class performance measures, with weights assigned based on class size.
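The weighted aggregation described above can be sketched as follows (the class measures and sizes are illustrative):

```python
def overall_measure(class_measures, class_sizes):
    # Weighted mean of per-class performance measures, with each class
    # weighted by the number of posts it contains.
    total = sum(class_sizes)
    return sum(m * n for m, n in zip(class_measures, class_sizes)) / total
```

For example, two classes with measures 0.9 and 0.5 and sizes 90 and 10 give an overall value of 0.86.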

IV. RESULTS

Table III shows the results obtained for the preference measure defined in Section III-C for the Core Reaction Set Model introduced in Section III-B and the All Reaction Set Model introduced in Section III-D. All reactions except Sad reach their highest F1 score at the 95%−05% train-test division, while the Sad reaction reaches its peak F1 score at the 80%−20% division. Interestingly, the performance of the model in predicting each reaction roughly follows a specific pattern: reactions used more often in the dataset tend to have a higher F1 score than reactions used less often, with the exception of the F1 score of Wow being higher than that of Sad. Figure 5 portrays the F1 score for each reaction as the train-test division varies for the Core Reaction Set Model. In the case of the All Reaction Set Model, as shown in Table III, while the F1 of Like was much higher than that of the other reactions, its inclusion brought significant reductions in the F1 scores of the other reactions. The Thankful reaction had an F1 of almost zero.

The overall results obtained for the Star Rating Model introduced in Section III-E are shown in Table V. In contrast to the results obtained for the Positive and Negative components, aggregating the reactions into a single Star Rating value has caused a significant decrease in precision, possibly due to the discrete nature of the Star Rating value, which is divided into bins at 0.5 intervals. Figure 6 portrays the change of the F1 value with the train-test division.
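The 0.5-interval binning mentioned above can be sketched as below; the rounding rule at bin boundaries is an assumption, since the paper does not specify it:

```python
def to_star_class(raw_value):
    # Snap a raw star rating onto the 0.5-wide class grid, clamped to [1.0, 5.0].
    return min(5.0, max(1.0, round(raw_value * 2) / 2))
```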




 

 
 

 
TABLE V 

PERFORMANCE MEASURES OF STAR RATING VECTOR PREDICTION 

 

 

 
 

Fig. 5: Changes of the F1 score of the Core Reaction Model with Train-Test Division

Furthermore, the results obtained for each Star Rating Class are displayed in Table VI. It can be observed that the model performs better at predicting more neutral star rating values. While the accuracy and recall measures show comparable performance across all classes, the difference becomes much more prominent in precision. Consequently, a notable increase in performance is observed for the more neutral classes in terms of F1 score. Further exploration revealed that the root cause of this issue is that the predictions of the model tend to lean towards the more neutral classes, as portrayed in Table VII. It should be noted that the extremely positive and extremely negative classes are significantly larger than the comparatively neutral classes.

As portrayed by Figure 5, the performance of the models remains largely unaffected by the chosen train-test division. The reason could be the large size of the dataset: the number of unique words in the training set does not change significantly across train-test divisions.

V. CONCLUSION 

 

Upon comparing the Star Rating Model with the Core Reaction Set Model, it becomes evident that the F1 scores improve significantly when separate reaction values are accumulated into the two categories Positive and Negative. A possible reason is that intra-category measurement errors are eliminated by the merging. However, merging all reactions into a single Star Rating value accentuates errors, which could be attributed to the additional error margin introduced by discretization. Further, the model predictions for Star Rating Classes closer to the median prove to be better than those for the edge classes. The negative effect of the Like and Thankful reactions, which were eliminated from the Core Reaction Set Model due to their abnormal counts, is also evident: as seen in the results of the All Reaction Set Model, their inclusion caused significant reductions in the F1 scores of the other reactions.

This study represents modelling efforts that may be considered classical and limited in nature. Recent years have seen significant growth in machine learning algorithms delivering exceptional results in many domains of text analysis, especially in finding non-linear relationships in the data.

Train Set (%)   Category      Accuracy   Recall   Precision   F1 Score
95              Positive      0.5406     0.7496   0.8601      0.7068
95              Negative      0.2062     0.4775   0.8067      0.4207
95              Star Rating   0.6930     0.6912   0.2259      0.2921
90              Positive      0.5420     0.7524   0.8589      0.7088
90              Negative      0.2052     0.4753   0.8069      0.4192
90              Star Rating   0.6931     0.6913   0.2267      0.2945
80              Positive      0.5416     0.7527   0.8571      0.7075
80              Negative      0.2038     0.4718   0.8077      0.4159
80              Star Rating   0.6917     0.6896   0.2236      0.2912
70              Positive      0.5410     0.7503   0.8588      0.7065
70              Negative      0.2046     0.4751   0.8051      0.4176
70              Star Rating   0.6925     0.6905   0.2280      0.2975
50              Positive      0.5403     0.7514   0.8572      0.7064
50              Negative      0.2040     0.4742   0.8053      0.4166
50              Star Rating   0.6915     0.6895   0.2298      0.2994




TABLE VI 
STAR RATING MODEL: CLASS PERFORMANCE MEASURES 

 

Star Rating Class   Train Set (%)   Accuracy   Precision   Recall   F1 Score
1.0                 95              0.5677     0.0001      0.5573   0.0021
1.0                 90              0.5700     0.0015      0.5598   0.0030
1.0                 80              0.5673     0.0007      0.5574   0.0015
1.0                 70              0.5686     0.0012      0.5586   0.0023
1.0                 50              0.5672     0.0013      0.5572   0.0026
1.5                 95              0.5886     0.0151      0.5981   0.0293
1.5                 90              0.5888     0.0127      0.5985   0.0248
1.5                 80              0.5888     0.0142      0.5969   0.0277
1.5                 70              0.5873     0.0164      0.5961   0.0319
1.5                 50              0.5880     0.0176      0.5965   0.0341
2.0                 95              0.6368     0.0841      0.6448   0.1457
2.0                 90              0.6392     0.1134      0.6470   0.1924
2.0                 80              0.6369     0.0953      0.6450   0.1653
2.0                 70              0.6373     0.1039      0.6449   0.1785
2.0                 50              0.6370     0.1074      0.6442   0.1839
2.5                 95              0.7177     0.4403      0.7288   0.5481
2.5                 90              0.7162     0.4191      0.7280   0.5318
2.5                 80              0.7174     0.4324      0.7270   0.5421
2.5                 70              0.7150     0.4286      0.7251   0.5385
2.5                 50              0.7147     0.4248      0.7251   0.5356
3.0                 95              0.7930     0.6707      0.8043   0.7314
3.0                 90              0.7892     0.6408      0.7981   0.7108
3.0                 80              0.8018     0.6696      0.8077   0.7320
3.0                 70              0.7932     0.6427      0.7982   0.7118
3.0                 50              0.7954     0.6543      0.8021   0.7206
3.5                 95              0.8513     0.8456      0.8677   0.8565
3.5                 90              0.8455     0.8357      0.8630   0.8491
3.5                 80              0.8473     0.8390      0.8640   0.8513
3.5                 70              0.8485     0.8319      0.8615   0.8465
3.5                 50              0.8470     0.8283      0.8600   0.8439
4.0                 95              0.8378     0.8135      0.8517   0.8321
4.0                 90              0.8334     0.7929      0.8443   0.8178
4.0                 80              0.8346     0.7888      0.8426   0.8148
4.0                 70              0.8309     0.7833      0.8405   0.8109
4.0                 50              0.8333     0.7913      0.8434   0.8165
4.5                 95              0.7642     0.6144      0.7819   0.6879
4.5                 90              0.7630     0.6130      0.7822   0.6872
4.5                 80              0.7584     0.6011      0.7784   0.6783
4.5                 70              0.7604     0.6108      0.7805   0.6853
4.5                 50              0.7604     0.6110      0.7803   0.6853
5.0                 95              0.7154     0.1554      0.7047   0.2545
5.0                 90              0.7144     0.1564      0.7037   0.2558
5.0                 80              0.7135     0.1564      0.7028   0.2558
5.0                 70              0.7160     0.1646      0.7054   0.2669
5.0                 50              0.7134     0.1653      0.7030   0.2677




 

TABLE VII 

STAR RATING MODEL: CONFUSION MATRIX OF CLASSES 

 

Kowsari et al. [33] highlight a number of pre-processing steps (such as dimensionality reduction using topic modelling or principal component analysis) and algorithms that may be combined with the feature engineering work presented here (especially the selection of useful data classes and the reduction to a star rating) for potentially more accurate models in the future. As noted therein, deep learning techniques hold particular promise. This is further explored in the work of Weeraprameshwara et al. [34], [35], which can be considered a continuation of this research; it tests new models and develops a new embedding system using the Facebook data.

This study uses a word embedding developed in the work of Senevirathne et al. [23] for the Facebook dataset. However, developing an embedding structure based on the dataset itself may provide better sentiment annotation. Further enhancements can be made by introducing granularity into the embedding structure, such as sentence embeddings.

An alternative approach to sophisticated modelling would be to examine pre-processing techniques that are not yet possible in Sinhala as of the time of writing, due to limited or missing language resources and tooling, as noted by de Silva [10]; building these tools may yield further increases in accuracy even with a simplistic model.
 

References

[1] V. Gamage, M. Warushavithana, N. de Silva and others, "Fast Approach to Build an Automatic Sentiment Annotator for Legal Domain using Transfer Learning," in Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2018.
[2] P. Melville, W. Gryc and R. D. Lawrence, "Sentiment analysis of blogs by combining lexical knowledge with text classification," in SIGKDD, 2009.
[3] E. Rudkowsky, M. Haselmayer, M. Wastian and others, "More than bags of words: Sentiment analysis with word embeddings," Communication Methods and Measures, vol. 12, p. 140–157, 2018.
[4] C. Aguwa, M. H. Olya and L. Monplaisir, "Modeling of fuzzy-based voice of customer for business decision analytics," Knowledge-Based Systems, vol. 125, p. 136–145, 2017.
[5] V. Zobal, Sentiment analysis of social media and its relation to stock market, Univerzita Karlova, Fakulta sociálních věd, 2017.
[6] J. L. O. Hui, G. K. Hoon and W. M. N. W. Zainon, "Effects of word class and text position in sentiment-based news classification," Procedia Computer Science, vol. 124, p. 77–85, 2017.
[7] R. Socher, A. Perelygin, J. Wu and others, "Recursive deep models for semantic compositionality over a sentiment treebank," in EMNLP, 2013.
[8] S. De Silva, H. Indrajee, S. Premarathna and others, "Sensing the sentiments of the crowd: Looking into subjects," in 2nd International Workshop on Multimodal Crowd Sensing, 2014.
[9] Y. Wijeratne, N. de Silva and Y. Shanmugarajah, "Natural language processing for government: Problems and potential," International Development Research Centre (Canada), 2019.
[10] N. de Silva, "Survey on publicly available Sinhala natural language processing tools and research," arXiv preprint arXiv:1906.02358, 2019.
[11] S. Ranathunga and N. de Silva, "Some languages are more equal than others: Probing deeper into the linguistic disparity in the NLP world," arXiv preprint arXiv:2210.08523, 2022.
[12] Y. Wijeratne and N. de Silva, "Sinhala language corpora and stopwords from a decade of Sri Lankan Facebook," arXiv preprint arXiv:2007.07884, 2020.
[13] L. Graziani, S. Melacci and M. Gori, "Jointly learning to detect emotions and predict Facebook reactions," in ICANN, 2019.
[14] V. Jayawickrama, G. Weeraprameshwara, N. de Silva and Y. Wijeratne, "Seeking Sinhala Sentiment: Predicting Facebook Reactions of Sinhala Posts," in 2021 21st International Conference on Advances in ICT for Emerging Regions (ICTer), 2021.
[15] J. P. Singh, S. Irani, N. P. Rana and others, "Predicting the "helpfulness" of online consumer reviews," Journal of Business Research, vol. 70, p. 346–355, 2017.

 
                              True Star Rating Class
Predicted Star
Rating Class       1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0
1.0                  5    51   848  2194  1516   920   256    19     2
1.5                  2    29   340  1444  1390  1159   406    37     0
2.0                  0     9    77   391   611   708   324    31     6
2.5                  1     4    22   157   324   414   258    27     0
3.0                  0     0     6   110   210   324   200    44     3
3.5                  0     0     8    61   216   466   329    93     2
4.0                  0     0     5    60   255   618   597   192     6
4.5                  0     2     7    98   446  1211  1602  1029    43
5.0                  1     4     7   210  1081  3594  8105  7798   844




[16] L. Martin and P. Pu, "Prediction of helpful reviews using emotions extraction," in AAAI, 2014.
[17] J. A. Caetano, H. S. Lima and others, "Using sentiment analysis to define twitter political users' classes and their homophily during the 2016 American presidential election," Journal of Internet Services and Applications, vol. 9, p. 1–15, 2018.
[18] C. Pool and M. Nissim, "Distant supervision for emotion detection using Facebook reactions," arXiv preprint arXiv:1611.02988, 2016.
[19] C. Freeman, M. K. Roy, M. Fattoruso and H. Alhoori, "Shared feelings: Understanding Facebook reactions to scholarly articles," in JCDL, 2019.
[20] C. Strapparava and R. Mihalcea, "SemEval-2007 Task 14: Affective Text," in Fourth International Workshop on Semantic Evaluations, 2007.
[21] E. C. O. Alm, Affect in Text and Speech, University of Illinois at Urbana-Champaign, 2008.
[22] K. R. Scherer and H. G. Wallbott, "Evidence for universality and cultural variation of differential emotion response patterning," Journal of Personality and Social Psychology, vol. 66, p. 310, 1994.
[23] L. Senevirathne, P. Demotte, B. Karunanayake and others, "Sentiment Analysis for Sinhala Language using Deep Learning Techniques," arXiv preprint arXiv:2011.07280, 2020.
[24] N. Medagoda, S. Shanmuganathan and J. Whalley, "Sentiment lexicon construction using SentiWordNet 3.0," in ICNC, 2015.
[25] P. D. T. Chathuranga, S. A. S. Lorensuhewa and M. A. L. Kalyani, "Sinhala sentiment analysis using corpus based sentiment lexicon," in ICTer, 2019.
[26] I. Caswell, J. Kreutzer and others, "Quality at a glance: An audit of web-crawled multilingual datasets," arXiv preprint arXiv:2103.12028, 2021.
[27] M. Davis and K. Whistler, "Unicode character database," Unicode Standard Annex, vol. 44, p. 95170–0519, 2008.
[28] J. Kincaid, Facebook Activates "Like" Button; FriendFeed Tires Of Sincere Flattery.
[29] L. Stinson, "Facebook Reactions, the Totally Redesigned Like Button, Is Here," Wired.
[30] C. Newton, Facebook tests temporary reactions with a flower for Mother's Day, The Verge, 2016.
[31] A. Liptak, Facebook brought back its flower reaction for Mother's Day, 2017.
[32] P. C. Kuo and others, "Facebook reaction-based emotion classifier as cue for sarcasm detection," arXiv preprint arXiv:1805.06510, 2018.
[33] K. Kowsari, K. Jafari Meimandi, M. Heidarysafa and others, "Text classification algorithms: A survey," Information, vol. 10, p. 150, 2019.
[34] G. Weeraprameshwara, V. Jayawickrama, N. de Silva and Y. Wijeratne, "Sentiment analysis with deep learning models: a comparative study on a decade of Sinhala language Facebook data," in 2022 The 3rd International Conference on Artificial Intelligence in Electronics Engineering, 2022.
[35] G. Weeraprameshwara, V. Jayawickrama, N. de Silva and Y. Wijeratne, "Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages," arXiv preprint arXiv:2210.14472, 2022.