International Journal on Advances in ICT for Emerging Regions 2022 15 (2): October 2022

Facebook for Sentiment Analysis: Baseline Models to Predict Facebook Reactions of Sinhala Posts

Vihanga Jayawickrama∗, Gihan Weeraprameshwara∗, Nisansa de Silva∗, Yudhanjaya Wijeratne†
∗Department of Computer Science & Engineering, University of Moratuwa
†LIRNEasia
vihangadewmini.17@cse.mrt.ac.lk

Abstract— Research on natural language processing in most regional languages is hindered by resource poverty. A possible solution for this is the utilization of social media data in research. For example, the Facebook network allows its users to record their reactions to text via a typology of emotions. This network, taken at scale, is therefore a prime dataset of annotated sentiment data. This paper uses millions of such reactions, derived from a decade's worth of Facebook post data centred on a Sri Lankan context, to model an eye-of-the-beholder approach to sentiment detection for online Sinhala textual content. Three different sentiment analysis models are built: one considering a limited subset of reactions, one considering all reactions, and one deriving a positive/negative star rating value. The efficacy of these models in capturing the reactions of the observers is then computed and discussed. The analysis reveals that the Star Rating Model is significantly more accurate (0.82) for Sinhala content than the other approaches. The inclusion of the Like reaction is discovered to hinder the capability of accurately predicting other reactions. Furthermore, this study provides evidence for the applicability of social media data to eradicate the resource poverty surrounding languages such as Sinhala.

Keywords— NLP, sentiment analysis, Sinhala, word vectorization

I.
INTRODUCTION

Understanding human emotions is an interesting yet complex process which researchers and scientists around the world have been attempting to standardize for a long period of time. In the computational sciences, sentiment analysis has become a major research topic, especially in relation to textual content [1, 2]. Several fields, scattered across diverse arenas from product marketing to political manipulation, benefit from the advancements in sentiment analysis. Studies such as those conducted by Rudkowsky et al. [3], Aguwa et al. [4], and Zobal [5] have described the potential of sentiment analysis, attempted to introduce useful tools for use in this field, and discovered new knowledge.

Correspondence: Vihanga Jayawickrama (E-mail: vihangadewmini.17@cse.mrt.ac.lk)
Received: 10-08-2022 Revised: 25-10-2022 Accepted: 28-10-2022
Vihanga Jayawickrama, Gihan Weeraprameshwara, and Nisansa de Silva are from the University of Moratuwa, Department of Computer Science and Engineering (vihangadewmini.17@cse.mrt.ac.lk, gihanravindu.17@cse.mrt.ac.lk, nisansadds@cse.mrt.ac.lk). Yudhanjaya Wijeratne is from LIRNEasia (yudhanjaya@lirneasia.net).
This paper is an extended version of the paper "Seeking Sinhala Sentiment: Predicting Facebook Reactions of Sinhala Posts" presented at the ICTer Conference (2021).
DOI: http://doi.org/10.4038/icter.v15i2.7247
© 2022 International Journal on Advances in ICT for Emerging Regions

Sentiment analysis of textual content can be approached in two ways: 1) through the perspective of the creator, or 2) through the perspective of the observer. Many research projects follow the first approach, but only a few, such as Hui et al. [6], have followed the second. Exploring the perspective of the observer is quite important, since the emotional reactions of the author and the reader to the same content are not necessarily identical.
For certain fields, such as movie reviews [7] or product reviews [8], the perspective of the author is much more valuable than that of the reader; however, this relationship does not always hold true. Much effort is generally expended in the field of political polling, for example, where the public perception of a speech is studied to assess impact. To the best of our knowledge, no attempt has been made to do such analysis in Sinhala, the subject of this study. Sinhala, similar to many other regional languages, suffers from resource poverty [9]. Previous research and resources available for NLP in Sinhala are limited and isolated [10, 11]. This is therefore an experimental attempt at bridging this knowledge gap. The objective is to predict the sentimental reaction of Facebook users to textual content posted on Facebook. This study uses a raw corpus of Sinhala Facebook posts scraped through Crowdtangle¹ by Wijeratne and de Silva [12], and analyses the user reactions therein as a sentiment annotation that reflects the emotional reaction of a reader to the said post [13]. The Facebook reactions Like, Love, Wow, Haha, Sad, Angry, and Thankful are utilized as the sentiment annotation of a post within the scope of this project. Figure 1 illustrates the visual representations of the Facebook reactions presented to users and included in the dataset.

¹https://www.crowdtangle.com/

Fig. 1: Facebook Reactions

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
https://creativecommons.org/licenses/by/4.0/legalcode

Overall, three models were created and tested. For the first model, a reaction vector was created for each post from the normalized reaction counts of the Love, Wow, Haha, Sad, and Angry categories. The Like and Thankful reactions, which are outliers at the positive and negative ends of the spectrum respectively, were ignored. The results showed that the procedure could predict reaction vectors with F1 scores ranging between 0.13 and 0.52. The second model was highly similar to the first, the only difference being the inclusion of the Like and Thankful reactions in the prediction. The resultant F1 scores ranged between 0.00 and 0.96. In the third model, the reactions were combined to create a positivity/negativity value for each post, following the procedure presented by De Silva et al. [8]. Here, Love and Wow were considered positive, Sad and Angry were considered negative, and Haha was ignored due to its conflicting use cases. The normalization was carried out as earlier for the four included reactions, and the difference between the positive and negative values was re-scaled into the range 1 to 5, in order to map to the popular star rating system utilized by De Silva et al. [8]. The F1 score of this star rating value ranged between 0.29 and 0.30.
In contrast, the binary categorization of reactions as Positive and Negative exhibited promising results, with F1 scores in the range 0.70-0.71 for Positive and 0.41-0.42 for Negative. Thus, it can be concluded that such a binary categorization system captures the sentimental reaction to a Facebook post more effectively than the multi-category reaction value system, and presents a measure of reasonable accuracy in the imputation of such sentiment. It should be re-iterated that the values used here are completely independent of the intended or perceived sentiment of the original posts and depend solely on the sentiment expressed by the audience reactions. Further, the model only attempts to predict the positivity or negativity of the Facebook reactions added to a post by users, and not the actual emotion induced in the users by the post. While the two may be correlated, the exact nature of the relationship would have to be explored further before drawing a definite conclusion. Figure 2 illustrates the scope of this research, where arrows indicate the influences among intended and perceived sentiments. This journal paper is an extension of our previously published conference paper [14].

II. BACKGROUND

Many of the studies on sentiment analysis focus on purposes such as understanding the quality of reviews given for products presented on e-commerce sites [8, 15, 16] or understanding the political preferences of people [3, 17]. Among the research on review analysis, the work of De Silva et al. [8] is prominent. Rather than conducting sentiment analysis following the more traditional procedure of identifying sentiments at the sentence level or the document level, which assumes each sentence and each document to reflect a single emotion, that study took the path of determining sentiments at the aspect level. Different aspects were extracted from the review, and for each aspect, a sentiment value was calculated.
Further, the study provides a set of guidelines for determining the semantic orientation of a subject using a sentiment lexicon, while guiding how to handle negations, words that intensify sentiment, words that shift the sentiment of a sentence, and groups of words used to express an emotion, all of which are important for converting sentiment in text into mathematical figures. The methodology presented by De Silva et al. [8] is crucial for this study since it provides the basis of one of the two workflows we discuss in this study to predict reactions for Sinhala text. The work by Martin and Pu [16], a study on creating a prediction model that could identify helpful reviews that have not yet been voted on by other users, emphasizes the value of sentiment analysis. Rather than relying solely on structural aspects of a review, such as its length and readability score, the emotional context was also utilized in rating the reviews, with the support of the GALC lexicon, which represents 20 different emotional categories. One of the most important findings of the project was that the emotion-based model outperforms the structure-based model by 9%. The work of Singh et al. [15] has also used several textual features, such as ease of reading, subjectivity, polarity, and entropy, to predict the helpfulness ratio. The model is intended to assist the process of assigning a helpfulness value to a review as soon as the review is posted, thus giving the spotlight to useful reviews over irrelevant ones. Both studies have highlighted the usefulness of understanding the reaction of the reader to different content. The studies on political preferences cover a massive area. Many governments and political parties use social media to understand their audiences. Therefore, the power vested in sentiment analysis cannot be ignored. The research done by Caetano et al. [17] and Rudkowsky et al. [3] explains two different cases where sentiment analysis is utilized in politics.
Caetano et al. attempt to analyse Twitter data and define the homophily of the Twitter audience, while Rudkowsky et al. demonstrate the usability of word embeddings over bag-of-words by developing a negative sentiment detection model for parliament speeches. Caetano et al. conclude that the homophily level increased with the multiplex connections of the audience, while Rudkowsky et al. state that the negativity of the speeches of a parliament member correlates with the position he holds in the parliament. While these instances may not be immediately identifiable as direct results of sentiment analysis, they are great examples of the wide range covered by sentiment analysis. Facebook data plays a major part in our research; therefore, it is vital to explore the previous research done on Facebook data. The works by Pool and Nissim [18] and Freeman et al. [19] use datasets obtained from Facebook for emotion detection. The data scope covered by the work of Freeman et al. lacks diversity, since that research is solely focused on scholarly articles. However, Pool and Nissim have attempted to maintain a general dataset by using a variety of sources, ranging from the New York Times to SpongeBob. The motivation behind this wide range of sources was to pick the best sources to train ML models for each reaction. Pool and Nissim have also looked into developing models with different features, such as TF-IDF, embeddings, and n-grams. This comparison provides useful guidelines for selecting features in data. One of the most important aspects of the work by Pool and Nissim is that they have taken the extra step of testing their models against external datasets, namely AffectiveText [20], Fairy Tales [21], and ISEAR [22], to prove the validity of the developed model, since those are widely used datasets in the field of sentiment analysis.
This provides a common ground to compare different sentiment analysis models. The work of Graziani et al. [13] also follows the same procedure in comparing their model to those of others. While all the papers mentioned above provide quite useful information, almost all of them relate to English, which is a resource-rich language. In contrast, our project is based on the Sinhala language, which is a resource-poor language in the NLP domain [9]. Very few attempts have been made to detect sentiments in Sinhala content, and most of the attempts made were either abandoned or not released to the public [10]. This poses a major challenge to our work due to the scarcity of similar work in the domain. Among the currently available research in this arena, Senevirathne et al. [23] is the state-of-the-art Sinhala text sentiment analysis attempt to the best of our knowledge. Through this paper, Senevirathne et al. have introduced a study of sentiment analysis models built using different deep learning techniques, as well as an annotated sentiment dataset consisting of 15,059 Sinhala news comments. The work was done to understand the reactions of readers. Furthermore, earlier attempts such as Medagoda et al. [24] provide insight into utilizing resources available for languages such as English to generate progress in sentiment analysis for Sinhala. The partially automated framework for developing a sentiment lexicon for Sinhala presented by Chathuranga et al. [25] is a noteworthy attempt at using a Part-of-Speech (PoS) tagged corpus for sentiment analysis. The authors proposed the use of adjectives tagged as positive or negative to predict the sentiment embedded in textual content.
Obtaining a corpus that would fit our purposes was the second major challenge we faced when working with Sinhala, given that, as Caswell et al. [26] observe, the majority of publicly available datasets for low-resource languages are not of adequate quality. Fortunately, the work of Wijeratne and de Silva [12] provided an adequate dataset. The authors presented Corpus-Alpha, a collection of Sinhala Facebook posts; Corpus-Sinhala-Redux, posts with only Sinhala text; and a collection of stopwords. Both the raw corpus created by the authors and the stopwords are used in our work.

III. METHODOLOGY

This study was conducted using the raw Facebook data corpus developed by Wijeratne and de Silva [12] through Facebook Crowdtangle. The corpus consists of 1,820,930 Facebook posts created by pages popular in Sri Lanka between 01-01-2010 and 02-02-2020 [12]. Table I describes the columns of the corpus that were utilized for the purpose of this study. The Facebook reactions, which are emotional reactions of Facebook users to content, are utilized as sentiment annotations within this study. When taken collectively, user annotations can be considered an effective representation of the public perception of the given content.

A. Pre-processing

The corpus was pre-processed by cleaning the Message column and normalizing reaction counts. Cleaning the Message column began with removing control characters from the text. Characters belonging to the Unicode categories Cc, Cn, Co, and Cs were replaced with a space [27]. The character with the Unicode value 8205, also known as the Zero Width Joiner, was replaced with a null string, while the other characters in category Cf were replaced by a space. The reason for this is that the Zero Width Joiner was often present in the middle of Sinhala words, especially when the Sinhala characters rakāransaya (රකාරාාංශය), yansaya (යාංසය), and rēpaya (රේඵය) were used.
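The character-level cleaning described above can be sketched as follows. This is a minimal illustration, not the original implementation: the function name is our own, and we fold the later whitespace-collapsing step of the pipeline into the same function for brevity.

```python
import re
import unicodedata

ZWJ = "\u200d"  # Zero Width Joiner (Unicode value 8205)

def clean_control_characters(text: str) -> str:
    """Replace Cc, Cn, Co, and Cs characters with spaces, delete the
    Zero Width Joiner, replace the remaining Cf characters with spaces,
    and collapse runs of white space (a later step of the pipeline)."""
    out = []
    for ch in text:
        if ch == ZWJ:
            continue  # ZWJ is replaced with a null string
        if unicodedata.category(ch) in ("Cc", "Cn", "Co", "Cs", "Cf"):
            out.append(" ")  # other control-like characters become spaces
        else:
            out.append(ch)
    return re.sub(r"\s+", " ", "".join(out)).strip()
```

Deleting the ZWJ outright (rather than replacing it with a space) keeps Sinhala words containing yansaya or rēpaya intact as single tokens.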
From the subsequent text, URLs, email addresses, user tags (of the format @user), and hashtags were removed. Since only Sinhala and English words are considered in this study, any words containing characters that are neither Sinhala nor ASCII were removed. The stop words for Sinhala developed from this corpus by Wijeratne and de Silva [12] were removed next. English letters in the corpus were then converted to lowercase. All remaining characters that belong to neither the Sinhala nor the English alphabet were replaced with white spaces. Numerical content was removed, since exact numerical sequences are highly unlikely to be repeated in the same order. Finally, runs of consecutive white spaces in the corpus were replaced with a single white space. Once cleaned, entries whose Message column was merely a null string or an empty string were removed from the corpus. The final cleaned corpus consisted of 526,732 data rows.

Fig. 2: The scope of the research in comparison to the series of sentiments associated with a Facebook post

B. Core Reaction Set Model

In selecting the core reaction set, the Like and Thankful reactions were excluded due to their counts being outliers in comparison to the other reactions, Like on the higher end and Thankful on the lower end. The total count of each reaction in the corpus, along with their percentages, is given in Table II. A probable reason for the abnormal behaviour of those reactions is the duration for which they have been present on Facebook. Like was the first reaction introduced to the platform, back in 2009 [28]. Love,
Wow, Haha, Sad, and Angry reactions were introduced in 2016 [29]; however, Like still retained its status as the default reaction, which a simple click on the react button produces. The Thankful reaction was a temporary option introduced as part of Facebook's Mother's Day celebrations in May 2016 [30]. The reaction was removed from the platform after a few days, and was reintroduced in May 2017, only to be removed again after the Mother's Day celebrations [31]. Thus, the core reaction set was defined considering only the Love, Wow, Haha, Sad, and Angry reactions. The percentages of the core reactions are also shown in Table II, and Fig. 3 shows the core reaction percentages as a pie chart.

TABLE II
TOTAL COUNTS OF REACTIONS IN THE CORPUS

Reaction | Count | Percentage (All) | Percentage (Core)
Like | 528,060,209 | 95.43 | -
Love | 12,526,942 | 2.26 | 49.56
Wow | 1,906,174 | 0.34 | 7.54
Haha | 6,524,139 | 1.18 | 25.81
Sad | 2,987,589 | 0.54 | 11.82
Angry | 1,329,552 | 0.24 | 5.26
Thankful | 13,637 | 0.002 | -

Fig. 3: Core Reaction Percentages

Thus, initially, the normalization was done considering only the core reactions. Equation 1 obtains the sum of reactions T of an entry using the counts of Love (n_L), Wow (n_W), Haha (n_H), Sad (n_S), and Angry (n_A). Equation 2 shows the normalized value N_r for reaction r, where n_r is the raw count of the reaction and T is the sum obtained in Equation 1.

T = n_L + n_W + n_H + n_S + n_A    (1)

N_r = n_r / T    (2)

The dataset was then divided into train and test subsets for the purpose of calculating and evaluating the accuracy of vector predictions. The Message column of the train set was tokenized into individual words, and a set operation was used to obtain the collection of unique words for each entry. Then, a dictionary was created for each entry by assigning the normalized reaction vector of the entry to each word.
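The normalization of Equations 1 and 2 can be sketched as below. The function name and the handling of posts with zero core reactions are our own assumptions; the original implementation is not published.

```python
CORE_REACTIONS = ["Love", "Wow", "Haha", "Sad", "Angry"]

def normalize_core_reactions(counts: dict) -> dict:
    """Normalize raw core-reaction counts (Equations 1 and 2).

    `counts` maps reaction names to raw counts n_r; the result maps each
    core reaction to N_r = n_r / T, where T sums only the core set, so
    Like and Thankful counts present in `counts` are ignored.
    """
    total = sum(counts[r] for r in CORE_REACTIONS)      # Equation 1
    if total == 0:
        # Assumed behaviour for posts with no core reactions at all.
        return {r: 0.0 for r in CORE_REACTIONS}
    return {r: counts[r] / total for r in CORE_REACTIONS}  # Equation 2
```

The resulting values sum to 1 for any post with at least one core reaction, which is the property the accuracy measure of Section III-C relies on.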
The dictionaries thus created were merged vertically, taking the average value of the vectors assigned to a word across the dataset as the aggregate reaction vector of that word. Equation 3 describes this process, where V_W is the aggregate reaction vector for the word W, R_i is the reaction vector of the i-th entry E_i, n is the number of entries, and ∅ is the empty vector.

V_W = ( Σ_{i=1}^{n} [ R_i if W ∈ E_i ; ∅ otherwise ] ) / ( Σ_{i=1}^{n} [ 1 if W ∈ E_i ; 0 otherwise ] )    (3)

The dictionary thus created was used to predict the reaction vectors of the test dataset. Entries in the test set were tokenized and then converted to unique word sets, similar to the aforementioned processing of the training set. Then, for each word of a message that also exists in the dictionary created above, the corresponding reaction vector was obtained from the dictionary. For entries of which none of the words were found in the dictionary, the mean vector value of the train dataset was assigned. Equation 4 shows the calculation of the predicted vector V_M for a message M, where V_W is taken from the dictionary (populated as described in Equation 3) and N_M is the number of words in the message M.

V_M = ( Σ_{W ∈ M} V_W ) / N_M    (4)

TABLE I
FIELDS OF THE SOURCE DATASET THAT WERE USED FOR THIS STUDY

Field Name | Description | Data Type
Index | Index of the dataset | int
Like | The number of Like reacts on the post | int
Love | The number of Love reacts on the post | int
Wow | The number of Wow reacts on the post | int
Haha | The number of Haha reacts on the post | int
Sad | The number of Sad reacts on the post | int
Angry | The number of Angry reacts on the post | int
Thankful | The number of Thankful reacts on the post | int
Message | Textual content of the Facebook post | string
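Equations 3 and 4 can be sketched as follows. All names here are illustrative, and, as an assumption on our part, the predicted vector is averaged over only those words of the message that appear in the dictionary.

```python
def build_word_vectors(entries):
    """Aggregate a reaction vector per word (Equation 3).

    `entries` is a list of (word_set, reaction_vector) pairs.  A word's
    vector V_W is the mean of the reaction vectors of all entries that
    contain the word.
    """
    sums, hits = {}, {}
    for words, vec in entries:
        for w in words:
            if w not in sums:
                sums[w] = [0.0] * len(vec)
                hits[w] = 0
            sums[w] = [a + b for a, b in zip(sums[w], vec)]
            hits[w] += 1
    return {w: [v / hits[w] for v in sums[w]] for w in sums}

def predict_message_vector(words, word_vectors, fallback):
    """Predict a message's reaction vector (Equation 4).

    Vectors of words found in the dictionary are averaged; if no word is
    known, the train-set mean vector (`fallback`) is returned.
    """
    known = [word_vectors[w] for w in words if w in word_vectors]
    if not known:
        return fallback
    return [sum(col) / len(known) for col in zip(*known)]
```

Averaging over only in-dictionary words keeps the predicted components summing to 1 even when a test message contains unseen vocabulary.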
C. Defining the Evaluation Statistics

To evaluate the performance of the prediction process, a number of statistics were calculated. Equation 5 shows the calculation of the accuracy A_r for reaction r, where N_r is the expected (actual) value for the entry as calculated in Equation 2 and M_r is the predicted value calculated in Equation 4, with M_r ∈ V_M.

A_r = min(N_r, M_r)    (5)

The accuracy can be defined this way since we are solving a bin packing problem and the vector values sum to 1. Equations 6, 7, and 8 show the calculation of recall (R_r), precision (P_r), and F1 score (F1_r) respectively, with the same notation as Equation 5.

R_r = A_r / M_r    (6)

P_r = A_r / N_r    (7)

F1_r = 2 × A_r / (N_r + M_r)    (8)

The above measures were calculated for each entry of the dataset, and the average value of each measure was assigned as the resultant performance measure of the dataset. Those values were then averaged across 5 runs of the code.

D. All Reaction Set Model

The All Reaction Set Model was developed following the same procedure as the Core Reaction Set Model. In addition to the reactions included in the core reaction set, Like (n_Li) and Thankful (n_T) were considered in this step. Equation 9 depicts how the sum of reactions T* is obtained, while the normalized value N*_r for each reaction is obtained as shown in Equation 10.

T* = n_Li + n_L + n_W + n_H + n_S + n_A + n_T    (9)

N*_r = n_r / T*    (10)

The sentiment vector for each entry was then generated following the same procedure as in Section III-B. The evaluation was done as described in Section III-C.

E. Star Rating Model

The next step of the study was inspired by the procedure proposed by De Silva et al. [8], who propose using the star rating associated with Amazon customer reviews to generate sentiment vectors.
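Returning to the evaluation statistics of Section III-C, the per-entry measures of Equations 5 through 8 can be sketched as below; the function name and the zero-division guards are our own additions.

```python
def entry_measures(actual, predicted):
    """Per-reaction measures of Equations 5-8 for a single entry.

    `actual` and `predicted` map each reaction r to N_r and M_r
    respectively (each vector sums to 1).  Per reaction, the accuracy is
    A_r = min(N_r, M_r), recall is A_r / M_r, precision is A_r / N_r,
    and F1 is 2 * A_r / (N_r + M_r).
    """
    measures = {}
    for r in actual:
        n, m = actual[r], predicted[r]
        a = min(n, m)                                   # Equation 5
        measures[r] = {
            "accuracy": a,
            "recall": a / m if m else 0.0,              # Equation 6
            "precision": a / n if n else 0.0,           # Equation 7
            "f1": 2 * a / (n + m) if (n + m) else 0.0,  # Equation 8
        }
    return measures
```

In the study these per-entry values are averaged over the dataset and then over 5 runs to produce the figures reported in Table III.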
The star rating takes a value between 1 and 5, where 3 is considered neutral, and values above 3 and below 3 are considered positive and negative respectively. To adjust Facebook reactions to this scale, we classified the positivity of reactions as presented in Table IV. The positivity of the Haha reaction is considered uncertain due to its conflicting use cases: the reaction is often used both genuinely and sarcastically on the platform [32]. Therefore, the experiment was carried out considering only the Love, Wow, Sad, and Angry reactions. The normalization process described in Section III-B for the Core Reaction Set Model was updated by modifying Equation 1 as shown in Equation 11 and modifying Equation 2 as shown in Equation 12, where T′ is the modified sum of reactions of the entry. Figure 4 presents the distribution of the selected reactions in the corpus.

Fig. 4: Reactions Considered for the Star Rating Model

T′ = n_L + n_W + n_S + n_A    (11)

N′_r = n_r / T′    (12)

The positive sentiment value E_(P,i) for entry i was calculated by summing the normalized Love (N′_L) and normalized Wow (N′_W) values, while the negative sentiment E_(N,i) was calculated by summing the normalized Sad (N′_S) and normalized Angry (N′_A) values, as shown in Equations 13 and 14. Using E_(P,i) and E_(N,i), the aggregated sentiment for entry i was calculated as shown in Equation 15.

E_(P,i) = N′_(L,i) + N′_(W,i)    (13)

E_(N,i) = N′_(S,i) + N′_(A,i)    (14)

E_i = E_(P,i) − E_(N,i)    (15)

The Star Rating Value S_i for entry i, which is calculated over the entire dataset, was computed as shown in Equation 16, where I is the set of entries in the dataset.

S_i = 4 × ( E_i − min_{E_j ∈ I} E_j ) / ( max_{E_j ∈ I} E_j − min_{E_j ∈ I} E_j ) + 1    (16)

The sentiment vector V_i for entry i is defined in Equation 17, where E_(P,i), E_(N,i), and S_i were calculated as mentioned before.
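The computations of Equations 11 through 17 can be sketched end to end as follows. The function name and input format are assumptions on our part, and each post is assumed to carry at least one of the four selected reactions, with the dataset spanning both positive and negative posts.

```python
def star_rating_vectors(posts):
    """Sketch of Equations 11-17: a sentiment vector [E_P, E_N, S] per post.

    Each post maps reaction names to raw counts.  Love and Wow count as
    positive, Sad and Angry as negative; Haha is excluded.  The aggregated
    sentiment E = E_P - E_N is min-max re-scaled over the whole dataset
    into the star-rating range 1 to 5 (Equation 16).
    """
    sentiments = []
    for p in posts:
        t = p["Love"] + p["Wow"] + p["Sad"] + p["Angry"]  # Equation 11
        e_pos = (p["Love"] + p["Wow"]) / t                # Equations 12-13
        e_neg = (p["Sad"] + p["Angry"]) / t               # Equation 14
        sentiments.append((e_pos, e_neg, e_pos - e_neg))  # Equation 15
    lo = min(e for _, _, e in sentiments)
    hi = max(e for _, _, e in sentiments)
    return [
        [e_p, e_n, 4 * (e - lo) / (hi - lo) + 1]          # Equations 16-17
        for e_p, e_n, e in sentiments
    ]
```

Because the re-scaling in Equation 16 depends on the dataset-wide minimum and maximum, the star rating of a post is only defined relative to the whole corpus, not in isolation.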
V_i = [ E_(P,i), E_(N,i), S_i ]    (17)

Once the vectors were computed, the processing of the test and train sets, the building of the dictionary, and the evaluation of the model were conducted akin to Sections III-B and III-C. The performance measures of the model were calculated using Gaussian distances.

TABLE III
PERFORMANCE MEASURES OF VECTOR PREDICTIONS

Train (%) | Reaction | Accuracy (Core) | Recall (Core) | Precision (Core) | F1 (Core) | Accuracy (All) | Recall (All) | Precision (All) | F1 (All)
95 | Like | - | - | - | - | 0.9169 | 0.9651 | 0.9691 | 0.9626
95 | Love | 0.3119 | 0.5863 | 0.7838 | 0.5164 | 0.0056 | 0.2510 | 0.6221 | 0.1769
95 | Wow | 0.0298 | 0.3111 | 0.6373 | 0.2218 | 0.0005 | 0.1487 | 0.4550 | 0.0818
95 | Haha | 0.1163 | 0.4241 | 0.6279 | 0.3060 | 0.0042 | 0.1646 | 0.6044 | 0.1068
95 | Sad | 0.0497 | 0.2355 | 0.6206 | 0.1613 | 0.0015 | 0.1013 | 0.5829 | 0.0638
95 | Angry | 0.0175 | 0.2059 | 0.5837 | 0.1318 | 0.0006 | 0.0880 | 0.5193 | 0.0495
95 | Thankful | - | - | - | - | 0.0000 | 0.0007 | 0.0440 | 0.0000
90 | Like | - | - | - | - | 0.9170 | 0.9652 | 0.9691 | 0.9626
90 | Love | 0.3119 | 0.5847 | 0.7833 | 0.5147 | 0.0056 | 0.2513 | 0.6225 | 0.1774
90 | Wow | 0.0299 | 0.3110 | 0.6375 | 0.2216 | 0.0005 | 0.1486 | 0.4557 | 0.0818
90 | Haha | 0.1160 | 0.4242 | 0.6261 | 0.3053 | 0.0042 | 0.1639 | 0.6043 | 0.1064
90 | Sad | 0.0497 | 0.2360 | 0.6205 | 0.1616 | 0.0015 | 0.1009 | 0.5840 | 0.0636
90 | Angry | 0.0174 | 0.2041 | 0.5834 | 0.1308 | 0.0006 | 0.0882 | 0.5162 | 0.0494
90 | Thankful | - | - | - | - | 0.0000 | 0.0007 | 0.0376 | 0.0000
80 | Like | - | - | - | - | 0.9167 | 0.9649 | 0.9691 | 0.9625
80 | Love | 0.3118 | 0.5854 | 0.7833 | 0.5153 | 0.0056 | 0.2515 | 0.6208 | 0.1770
80 | Wow | 0.0298 | 0.3113 | 0.6370 | 0.2218 | 0.0005 | 0.1490 | 0.4527 | 0.0816
80 | Haha | 0.1160 | 0.4238 | 0.6266 | 0.3052 | 0.0042 | 0.1647 | 0.6037 | 0.1067
80 | Sad | 0.0499 | 0.2380 | 0.6176 | 0.1623 | 0.0015 | 0.1012 | 0.5825 | 0.0636
80 | Angry | 0.0174 | 0.2045 | 0.5856 | 0.1314 | 0.0006 | 0.0889 | 0.5142 | 0.0497
80 | Thankful | - | - | - | - | 0.0000 | 0.0007 | 0.0297 | 0.0000
70 | Like | - | - | - | - | 0.9167 | 0.9650 | 0.9690 | 0.9625
70 | Love | 0.3117 | 0.5855 | 0.7829 | 0.5152 | 0.0056 | 0.2513 | 0.6216 | 0.1771
70 | Wow | 0.0298 | 0.3110 | 0.6376 | 0.2217 | 0.0005 | 0.1484 | 0.4539 | 0.0814
70 | Haha | 0.1158 | 0.4236 | 0.6263 | 0.3049 | 0.0042 | 0.1643 | 0.6045 | 0.1065
70 | Sad | 0.0497 | 0.2368 | 0.6183 | 0.1616 | 0.0015 | 0.1014 | 0.5816 | 0.0637
70 | Angry | 0.0174 | 0.2050 | 0.5847 | 0.1314 | 0.0006 | 0.0885 | 0.5155 | 0.0495
70 | Thankful | - | - | - | - | 0.0000 | 0.0007 | 0.0342 | 0.0000
50 | Like | - | - | - | - | 0.9167 | 0.9650 | 0.9690 | 0.9625
50 | Love | 0.3121 | 0.5863 | 0.7824 | 0.5156 | 0.0056 | 0.2513 | 0.6206 | 0.1768
50 | Wow | 0.0298 | 0.3113 | 0.6361 | 0.2214 | 0.0005 | 0.1491 | 0.4519 | 0.0815
50 | Haha | 0.1155 | 0.4236 | 0.6249 | 0.3043 | 0.0042 | 0.1643 | 0.6034 | 0.1063
50 | Sad | 0.0496 | 0.2366 | 0.6195 | 0.1617 | 0.0015 | 0.1014 | 0.5815 | 0.0636
50 | Angry | 0.0173 | 0.2041 | 0.5855 | 0.1310 | 0.0006 | 0.0886 | 0.5142 | 0.0494
50 | Thankful | - | - | - | - | 0.0000 | 0.0007 | 0.0330 | 0.0000

Fig. 6: Changes of the F1 score of the Star Rating Model with Train-Test Division

TABLE IV
POSITIVITY AND NEGATIVITY OF FACEBOOK REACTIONS

Reaction | Positivity/Negativity
Love | Positive
Wow | Positive
Haha | Uncertain
Sad | Negative
Angry | Negative

1) Accuracy: The accuracy of the prediction for each post was measured in terms of the True Gaussian Distance of the post, which is defined as the Gaussian distance to the predicted Star Rating Value of the post from its true Star Rating Value, on a distribution centred on the true Star Rating Value. It should be noted that the raw star rating values, before discretization into classes, are utilized here. The accuracy A′_i of a post i with True Gaussian Distance G_(T,i) is calculated as shown in Equation 18. Equation 19 then describes the calculation of the accuracy A′_x for class x containing n_x posts.

A′_i = 1 − G_(T,i)    (18)

A′_x = ( Σ_{i=1}^{n_x} A′_i ) / n_x    (19)

2) Precision: In order to calculate the precision of predictions, the Gaussian Trespass of each post into its predicted class was considered.
The trespass was measured as the Gaussian distance from the boundary of the true class of the post to the midpoint of its predicted class, on a Gaussian distribution centred around the midpoint of the true class. Equation 20 shows the calculation of the precision of each star rating class, where P′_x represents the precision value of class x, n_(cc,x) represents the number of correctly classified posts in class x, and T_i represents the trespass value of post i in class x.

P′_x = n_(cc,x) / ( n_(cc,x) + Σ_{i=1}^{n_x} T_i )    (20)

3) Recall: The recall value was calculated for each post in terms of its Class Gaussian Distance, which is defined as the Gaussian distance to the midpoint of the predicted Star Rating Class of the post from the midpoint of its true Star Rating Class, on a distribution centred on the midpoint of the true class. The recall value R′_x for a class x consisting of n_x Facebook posts, each with a recall of R′_i, was obtained as depicted by Equation 21.

R′_x = ( Σ_{i=1}^{n_x} R′_i ) / n_x    (21)

4) F1 Score: The F1 score of each class was then calculated from the precision and recall values of the class, following the standard formula. Equation 22 portrays the calculation of the F1 score F1′_x for class x.

F1′_x = ( 2 × P′_x × R′_x ) / ( P′_x + R′_x )    (22)

5) Overall Performance: The overall performance measures for the star rating were calculated by taking a weighted mean of the performance measures of the classes, with weights assigned based on class size.

IV. RESULTS

Table III shows the results obtained for the performance measures defined in Section III-C for the Core Reaction Set Model introduced in Section III-B and the All Reaction Set Model introduced in Section III-D. All reactions except Sad reach their highest F1 score at the 95%-5% train-test division, while the Sad reaction reaches its peak F1 score at the 80%-20% division.
Interestingly, the performance of the model in predicting each reaction seems to roughly follow a specific pattern: reactions that were used more often in the dataset tend to have a higher F1 score than reactions that were used less often, with the exception of the F1 score of Wow being higher than that of Sad. Figure 5 portrays the F1 score for each reaction as the train-test division varies for the Core Reaction Set Model. In the case of the All Reaction Set Model, as shown in Table III, while the F1 of Like was much higher than that of the other reactions, its inclusion brought forth significant reductions in the F1 scores of the other reactions. The Thankful reaction had an F1 of almost zero. The overall results obtained for the Star Rating Model introduced in Section III-E are shown in Table V. In contrast to the results obtained for the Positive and Negative components, aggregation of reactions into a single Star Rating value has caused a significant decrease in precision, possibly due to the discrete nature of the Star Rating value, which is divided into bins at 0.5 intervals. Figure 6 portrays the change of the F1 value with the train-test division.

Fig. 5: Changes of the F1 score of the Core Reaction Model with Train-Test Division

Furthermore, the results obtained for each Star Rating Class are displayed in Table VI. It can be observed that the model exhibits better performance with regard to predicting more neutral star rating values. While the accuracy and recall measures show comparable performance across all classes, this difference becomes much more prominent in precision. Consequently, a notable increase in performance is observed for more neutral classes in terms of F1 score.
Further explorations revealed that the root cause of this issue is that the predictions of the model lean towards the more neutral classes, as portrayed in Table VII. It should be noted that the extremely positive and extremely negative classes are significantly larger than the comparatively neutral classes. As portrayed by Figure 5, the performance of the models remains largely unaffected by the chosen train-test division. The reason could be the large size of the dataset: the number of unique words in the training set does not change significantly across the different train-test divisions.

V. CONCLUSION

Upon comparing the Star Rating Model with the Core Reaction Set Model, it becomes evident that the F1 scores improve significantly when the separate reaction values are merged into the two categories Positive and Negative. A possible reason is that intra-category measurement errors are eliminated by the merging. However, merging all reactions into a single Star Rating value accentuates errors, which could be attributed to the additional error margin introduced by discretization. Further, the model predictions for Star Rating Classes closer to the median prove to be better than those for the edge classes. The negative effect of the Like and Thankful reactions, which were eliminated from the Core Reaction Set Model due to their abnormal counts, was demonstrated as well: the inclusion of those reactions caused significant reductions in the F1 scores of the other reactions, as can be seen from the results of the All Reaction Set Model. This study represents modelling efforts that may be considered classical and limited in nature.
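The merging of separate reaction values into Positive and Negative categories discussed above can be illustrated with a short sketch. The reaction-to-category mapping below is a hypothetical example for illustration only; the paper's actual assignment is defined in Section III.

```python
# Hypothetical reaction-to-category mapping (illustration only; the
# paper's actual Positive/Negative assignment is given in Section III).
POSITIVE_REACTIONS = ("love", "haha", "wow")
NEGATIVE_REACTIONS = ("sad", "angry")

def merge_reactions(counts):
    """Collapse per-reaction counts into Positive/Negative totals,
    mirroring the accumulation of separate reaction values into two
    categories. `counts` maps reaction names to reaction counts."""
    positive = sum(counts.get(r, 0) for r in POSITIVE_REACTIONS)
    negative = sum(counts.get(r, 0) for r in NEGATIVE_REACTIONS)
    return {"positive": positive, "negative": negative}
```

Under such a merge, a per-reaction prediction error that stays within its category no longer affects the category total, which is one way the intra-category measurement errors mentioned above could cancel out.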
Train Set (%)   Category      Accuracy   Recall   Precision   F1 Score
95              Positive      0.5406     0.7496   0.8601      0.7068
95              Negative      0.2062     0.4775   0.8067      0.4207
95              Star Rating   0.6930     0.6912   0.2259      0.2921
90              Positive      0.5420     0.7524   0.8589      0.7088
90              Negative      0.2052     0.4753   0.8069      0.4192
90              Star Rating   0.6931     0.6913   0.2267      0.2945
80              Positive      0.5416     0.7527   0.8571      0.7075
80              Negative      0.2038     0.4718   0.8077      0.4159
80              Star Rating   0.6917     0.6896   0.2236      0.2912
70              Positive      0.5410     0.7503   0.8588      0.7065
70              Negative      0.2046     0.4751   0.8051      0.4176
70              Star Rating   0.6925     0.6905   0.2280      0.2975
50              Positive      0.5403     0.7514   0.8572      0.7064
50              Negative      0.2040     0.4742   0.8053      0.4166
50              Star Rating   0.6915     0.6895   0.2298      0.2994

TABLE VI: STAR RATING MODEL: CLASS PERFORMANCE MEASURES

Class   Train Set (%)   Accuracy   Precision   Recall   F1 Score
1.0     95              0.5677     0.0001      0.5573   0.0021
1.0     90              0.5700     0.0015      0.5598   0.0030
1.0     80              0.5673     0.0007      0.5574   0.0015
1.0     70              0.5686     0.0012      0.5586   0.0023
1.0     50              0.5672     0.0013      0.5572   0.0026
1.5     95              0.5886     0.0151      0.5981   0.0293
1.5     90              0.5888     0.0127      0.5985   0.0248
1.5     80              0.5888     0.0142      0.5969   0.0277
1.5     70              0.5873     0.0164      0.5961   0.0319
1.5     50              0.5880     0.0176      0.5965   0.0341
2.0     95              0.6368     0.0841      0.6448   0.1457
2.0     90              0.6392     0.1134      0.6470   0.1924
2.0     80              0.6369     0.0953      0.6450   0.1653
2.0     70              0.6373     0.1039      0.6449   0.1785
2.0     50              0.6370     0.1074      0.6442   0.1839
2.5     95              0.7177     0.4403      0.7288   0.5481
2.5     90              0.7162     0.4191      0.7280   0.5318
2.5     80              0.7174     0.4324      0.7270   0.5421
2.5     70              0.7150     0.4286      0.7251   0.5385
2.5     50              0.7147     0.4248      0.7251   0.5356
3.0     95              0.7930     0.6707      0.8043   0.7314
3.0     90              0.7892     0.6408      0.7981   0.7108
3.0     80              0.8018     0.6696      0.8077   0.7320
3.0     70              0.7932     0.6427      0.7982   0.7118
3.0     50              0.7954     0.6543      0.8021   0.7206
3.5     95              0.8513     0.8456      0.8677   0.8565
3.5     90              0.8455     0.8357      0.8630   0.8491
3.5     80              0.8473     0.8390      0.8640   0.8513
3.5     70              0.8485     0.8319      0.8615   0.8465
3.5     50              0.8470     0.8283      0.8600   0.8439
4.0     95              0.8378     0.8135      0.8517   0.8321
4.0     90              0.8334     0.7929      0.8443   0.8178
4.0     80              0.8346     0.7888      0.8426   0.8148
4.0     70              0.8309     0.7833      0.8405   0.8109
4.0     50              0.8333     0.7913      0.8434   0.8165
4.5     95              0.7642     0.6144      0.7819   0.6879
4.5     90              0.7630     0.6130      0.7822   0.6872
4.5     80              0.7584     0.6011      0.7784   0.6783
4.5     70              0.7604     0.6108      0.7805   0.6853
4.5     50              0.7604     0.6110      0.7803   0.6853
5.0     95              0.7154     0.1554      0.7047   0.2545
5.0     90              0.7144     0.1564      0.7037   0.2558
5.0     80              0.7135     0.1564      0.7028   0.2558
5.0     70              0.7160     0.1646      0.7054   0.2669
5.0     50              0.7134     0.1653      0.7030   0.2677

Recent years have seen a significant growth in machine learning algorithms delivering exceptional results in many domains of text analysis, especially in finding non-linear relationships in the data. Kowsari et al. [33] highlight a number of pre-processing steps (such as dimensionality reduction using topic modelling or principal component analysis) and algorithms that may be combined with the feature engineering work presented here (especially the selection of useful data classes and the reduction to a star rating) to build potentially more accurate models in the future. As noted therein, deep learning techniques hold particular promise. This is further explored in the work of Weeraprameshwara et al. [34], [35], which can be considered a continuation of this research; it tests new models and develops a new embedding system using the Facebook data. The present study uses a word embedding developed in the work of Senevirathne et al. [23] for the Facebook dataset. However, developing an embedding structure based on the dataset itself may provide better sentiment annotation. Further enhancements can be made by introducing granularity to the embedding structure, such as sentence embeddings.

TABLE VII: STAR RATING MODEL: CONFUSION MATRIX OF CLASSES
(Rows: Predicted Star Rating Class; columns: True Star Rating Class)

Predicted \ True    1.0   1.5   2.0   2.5    3.0    3.5    4.0    4.5   5.0
1.0                   5    51   848  2194   1516    920    256     19     2
1.5                   2    29   340  1444   1390   1159    406     37     0
2.0                   0     9    77   391    611    708    324     31     6
2.5                   1     4    22   157    324    414    258     27     0
3.0                   0     0     6   110    210    324    200     44     3
3.5                   0     0     8    61    216    466    329     93     2
4.0                   0     0     5    60    255    618    597    192     6
4.5                   0     2     7    98    446   1211   1602   1029    43
5.0                   1     4     7   210   1081   3594   8105   7798   844

An alternate approach to sophisticated modelling would be to examine pre-processing techniques that may not yet be possible in Sinhala as of the time of writing, due to limited or missing language resources and tooling, as noted by de Silva [10]; building these tools may yield further increases in accuracy even with a simplistic model.

References

[1] V. Gamage, M. Warushavithana, N. de Silva and others, “Fast Approach to Build an Automatic Sentiment Annotator for Legal Domain using Transfer Learning,” in Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2018.
[2] P. Melville, W. Gryc and R. D. Lawrence, “Sentiment analysis of blogs by combining lexical knowledge with text classification,” in SIGKDD, 2009.
[3] E. Rudkowsky, M. Haselmayer, M. Wastian and others, “More than bags of words: Sentiment analysis with word embeddings,” Communication Methods and Measures, vol. 12, pp. 140–157, 2018.
[4] C. Aguwa, M. H. Olya and L. Monplaisir, “Modeling of fuzzy-based voice of customer for business decision analytics,” Knowledge-Based Systems, vol. 125, pp. 136–145, 2017.
[5] V. Zobal, Sentiment analysis of social media and its relation to stock market, Univerzita Karlova, Fakulta sociálních věd, 2017.
[6] J. L. O. Hui, G. K. Hoon and W. M. N. W. Zainon, “Effects of word class and text position in sentiment-based news classification,” Procedia Computer Science, vol. 124, pp. 77–85, 2017.
[7] R. Socher, A. Perelygin, J. Wu and others, “Recursive deep models for semantic compositionality over a sentiment treebank,” in EMNLP, 2013.
[8] S. De Silva, H. Indrajee, S. Premarathna and others, “Sensing the sentiments of the crowd: Looking into subjects,” in 2nd International Workshop on Multimodal Crowd Sensing, 2014.
[9] Y. Wijeratne, N. de Silva and Y. Shanmugarajah, “Natural language processing for government: Problems and potential,” International Development Research Centre (Canada), 2019.
[10] N. de Silva, “Survey on publicly available Sinhala natural language processing tools and research,” arXiv preprint arXiv:1906.02358, 2019.
[11] S. Ranathunga and N. de Silva, “Some languages are more equal than others: Probing deeper into the linguistic disparity in the NLP world,” arXiv preprint arXiv:2210.08523, 2022.
[12] Y. Wijeratne and N. de Silva, “Sinhala language corpora and stopwords from a decade of Sri Lankan Facebook,” arXiv preprint arXiv:2007.07884, 2020.
[13] L. Graziani, S. Melacci and M. Gori, “Jointly learning to detect emotions and predict Facebook reactions,” in ICANN, 2019.
[14] V. Jayawickrama, G. Weeraprameshwara, N. de Silva and Y. Wijeratne, “Seeking Sinhala Sentiment: Predicting Facebook Reactions of Sinhala Posts,” in 2021 21st International Conference on Advances in ICT for Emerging Regions (ICTer), 2021.
[15] J. P. Singh, S. Irani, N. P. Rana and others, “Predicting the “helpfulness” of online consumer reviews,” Journal of Business Research, vol. 70, pp. 346–355, 2017.
[16] L. Martin and P. Pu, “Prediction of helpful reviews using emotions extraction,” in AAAI, 2014.
[17] J. A. Caetano, H. S. Lima and others, “Using sentiment analysis to define Twitter political users’ classes and their homophily during the 2016 American presidential election,” Journal of Internet Services and Applications, vol. 9, pp. 1–15, 2018.
[18] C. Pool and M. Nissim, “Distant supervision for emotion detection using Facebook reactions,” arXiv preprint arXiv:1611.02988, 2016.
[19] C. Freeman, M. K. Roy, M. Fattoruso and H. Alhoori, “Shared feelings: Understanding Facebook reactions to scholarly articles,” in JCDL, 2019.
[20] C. Strapparava and R. Mihalcea, “SemEval-2007 Task 14: Affective Text,” in Fourth International Workshop on Semantic Evaluations, 2007.
[21] E. C. O. Alm, Affect in Text and Speech, University of Illinois at Urbana-Champaign, 2008.
[22] K. R. Scherer and H. G. Wallbott, “Evidence for universality and cultural variation of differential emotion response patterning,” Journal of Personality and Social Psychology, vol. 66, p. 310, 1994.
[23] L. Senevirathne, P. Demotte, B. Karunanayake and others, “Sentiment Analysis for Sinhala Language using Deep Learning Techniques,” arXiv preprint arXiv:2011.07280, 2020.
[24] N. Medagoda, S. Shanmuganathan and J. Whalley, “Sentiment lexicon construction using SentiWordNet 3.0,” in ICNC, 2015.
[25] P. D. T. Chathuranga, S. A. S. Lorensuhewa and M. A. L. Kalyani, “Sinhala sentiment analysis using corpus based sentiment lexicon,” in ICTer, 2019.
[26] I. Caswell, J. Kreutzer and others, “Quality at a glance: An audit of web-crawled multilingual datasets,” arXiv preprint arXiv:2103.12028, 2021.
[27] M. Davis and K. Whistler, “Unicode character database,” Unicode Standard Annex, vol. 44, 2008.
[28] J. Kincaid, Facebook Activates "Like" Button; FriendFeed Tires Of Sincere Flattery.
[29] L. Stinson, “Facebook Reactions, the Totally Redesigned Like Button, Is Here,” Wired.
[30] C. Newton, Facebook tests temporary reactions with a flower for Mother's Day, The Verge, 2016.
[31] A. Liptak, Facebook brought back its flower reaction for Mother’s Day, 2017.
[32] P. C. Kuo and others, “Facebook reaction-based emotion classifier as cue for sarcasm detection,” arXiv preprint arXiv:1805.06510, 2018.
[33] K. Kowsari, K. Jafari Meimandi, M. Heidarysafa and others, “Text classification algorithms: A survey,” Information, vol. 10, p. 150, 2019.
[34] G. Weeraprameshwara, V. Jayawickrama, N. de Silva and Y. Wijeratne, “Sentiment analysis with deep learning models: a comparative study on a decade of Sinhala language Facebook data,” in 2022 The 3rd International Conference on Artificial Intelligence in Electronics Engineering, 2022.
[35] G. Weeraprameshwara, V. Jayawickrama, N. de Silva and Y. Wijeratne, “Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages,” arXiv preprint arXiv:2210.14472, 2022.