Paper—A Sentiment Analysis Tool for Determining the Promotional Success of Fashion Images on Insta… 

A Sentiment Analysis Tool for Determining the 
Promotional Success of Fashion Images on Instagram 

https://doi.org/10.3991/ijim.v11i2.6563 

Mohamed AbdelFattah 
British University in Egypt, Cairo, Egypt 

Mohamed.Abdelfattah@bue.edu.eg 

Dahab Galal* 
British University in Egypt, Cairo, Egypt 

Dahab.Galal@bue.edu.eg 

Nada Hassan 
British University in Egypt, Cairo, Egypt 

Nada.Hassan@bue.edu.eg 

Doaa S. Elzanfaly 
British University in Egypt, Cairo, Egypt 
Doaa.Elzanfaly@bue.edu.eg 

Greg Tallent 
London South Bank University, London, United Kingdom 

greg.tallent@lsbu.ac.uk 

Abstract—Sentiment Analysis (SA) or Opinion Mining is the process of 
analysing natural language text to detect an emotion or a pattern of emotions
towards a certain product in order to support a decision about that product.
SA is a topic within the text mining, Natural Language Processing (NLP) and web
mining disciplines. Research in SA is currently at its peak given the amount of
data generated by social media networks. The premise is that consumers express
exactly what they need, want and expect from a product, while companies lack
the tools to analyse and understand these feelings and to satisfy consumers
accordingly.

Instagram is one of the applications that generate a high rate of reactions and
sentiments in social networks. This study focuses on analysing the reactions
generated by the top 50 fashion houses on Instagram, using each house's 20
most-liked images. The approach taken in this study is to qualify the visual
aesthetics of fashion images and to establish why some succeed on social media
more than others.

The basic question asked in this paper is whether certain visual aesthetics
appeal more to users and are therefore more successful on social media than
others, as determined by a measure we introduce, ‘Social Value’. To answer it,
a sentiment analysis tool is developed to measure the proposed social value of
each image. The tool takes the comments on each image as input. Each comment
goes through a pre-processing phase, and each word is then looked up in a
lexicon to identify whether it is positive or negative. The output of

66 http://www.i-jim.org



the lexicon is a score assigned to each comment that indicates its degree of
positivity or negativity, or that it has no effect on the social value. In
addition to these results, the number of likes and shares is taken into
consideration when quantifying the image's value. A cumulative result is then
produced to determine the social value of an image.

Keywords—Sentiment Analysis; Opinion Mining; Instagram; Social Value; 
Aesthetics 

1 Introduction 

With the rise and dominance of social media in almost every activity of daily
life, the need to understand its power and potential has become more pressing.
Because social media enables user-generated content, opinion mining becomes
critical for investigating that content and identifying feedback towards a
certain product, political campaign, initiative, etc. There are different
levels of opinion mining, each depending on the domain in question. The level
of most interest with regard to social media is opinion mining at feature
level. This paper focuses on Instagram and the effect that the pictures posted
by major fashion houses have on customers' actual buying decisions. This
entails extracting the features of the object in question (whether a comment or
an image) and then determining whether the opinion is positive or negative.

This paper is organized as follows. Section two reviews the literature to
introduce different approaches used in sentiment analysis, gives a brief
overview of machine learning and lexicon-based approaches, and finally
discusses two recent techniques employed on social media platforms: localized
Twitter opinion mining using sentiment analysis, and unsupervised sentiment
analysis in social media. Section three introduces the proposed application,
its workflow and the modules it is composed of. A brief summary follows, and
then the future work for this application. This paper is the first in a series
of papers tackling sentiment analysis on Instagram.

2 Literature Review 

To determine the orientation of a particular opinion, two types of techniques
can be used: lexicon-based methods and machine learning methods. Machine
learning can be further divided into supervised, semi-supervised and
unsupervised learning. Supervised learning generally refers to the use of
classification techniques. A number of such techniques are used in opinion
mining, including Naïve Bayes, Support Vector Machines, and the multi-layer
perceptron (Miwari, Singh, & Srivastava, 2015). With such techniques, the
training data is already labelled with classes (in this case, the nature of a
particular comment), and the labelled data is used to build a model that is
later used to identify the class of unknown or unlabelled examples.

iJIM ‒ Vol. 11, No. 2, 2017 67



Where the provided documents are unlabelled, unsupervised techniques are used.
Algorithms in this family include Latent Dirichlet Allocation (LDA) and
Probabilistic Latent Semantic Analysis (PLSA) (Arora, Patil, & Correia, 2015).
They extract hidden topics from the document's text, and these topics are
considered to be features of the document (Benevenuto, Araujo, & Riberio,
2015). The challenge with such techniques is that a large amount of training
data is needed to obtain valid and useful information.

Semi-supervised techniques attempt to address the disadvantages of both
supervised and unsupervised methods. This relatively new approach learns from
both labelled and unlabelled data: a small amount of labelled data is used for
learning, and what is learned is then applied to the unlabelled data (Arora,
Patil, & Correia, 2015).

Lexicon-based approaches can also be used in opinion mining. These techniques
follow one of two methods: dictionary-based or corpus-based. Dictionary-based
methods find the opinion words within a document or sentence and then search
the dictionary for their synonyms and antonyms (Feldman, 2013). Corpus-based
methods, on the other hand, are provided with a list of opinion words and
search the entire corpus for similar words in a relevant context. This can be
done using statistical or semantic methods (Arora, Patil, & Correia, 2015;
Hridoy, Ekram, Islam, Ahmed, & Rahman, 2015).

Extensive work has been done using both approaches. Paltoglou & Thelwall (2012)
proposed a lexicon-based classifier that predicts the degree of emotional
valence in text as a way of tackling sentiment analysis (Xiaowen, Bing, &
Philip, 2008). The classifier is enhanced with “an extensive list of
linguistically driven functionalities”, which further improves the predictions
(Paltoglou & Thelwall, 2012). The work is considered unsupervised because it
uses no reference corpus and requires no training.

The proposed solution is intended to determine the intensity of an emotion,
which may be positive (+1), negative (-1) or neutral (0). The level of valence
is expressed as two separate ratings, one indicating the positivity (1…5) and
the other indicating the negativity (-1…-5) of the comment. The values 1 and -1
indicate the absence of the emotion in question. The score is expressed as
{Cpos, Cneg}. Within a given comment or sentence, the numbers of positive and
negative tokens are taken into consideration, and the class associated with the
maximum value is selected. Given a particular document, the algorithm
identifies all the emotional words (by searching through an emotional
dictionary) together with their polarity and intensity. The initial scores are
then modified according to the linguistic cues identified around them
(negation, capitalization, exclamations and emoticons, intensifiers, and
diminishers). Each of these categories has a different weight; the assignment
of weights is explained in detail in (Paltoglou & Thelwall, 2012). Applying
these steps yields the total score (Paltoglou & Thelwall, 2012). The work was
tested on three real data sets, and the results showed that the proposed
approach outperforms machine learning techniques in the majority of cases.
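The scoring scheme described above can be illustrated with a minimal sketch. It is a simplification written for this overview, not the classifier of Paltoglou & Thelwall: the dictionary entries, the modifier weights and the names `LEXICON`, `score_text` and `classify` are all hypothetical, only the word immediately before an emotional word is inspected, and emoticons are ignored.

```python
# Toy emotional dictionary: word -> signed strength (hypothetical values).
LEXICON = {"love": 4, "great": 3, "nice": 2, "bad": -3, "awful": -4, "hate": -4}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1, "really": 1}        # hypothetical weights
DIMINISHERS = {"slightly": 1, "somewhat": 1}   # hypothetical weights
PUNCTUATION = ".,!?"


def clamp(value, low, high):
    """Keep a strength inside the allowed range."""
    return max(low, min(high, value))


def score_text(text):
    """Return (c_pos, c_neg): positive strength in 1..5, negative in -5..-1."""
    c_pos, c_neg = 1, -1  # 1 and -1 mean the emotion is absent
    tokens = text.split()
    for idx, raw in enumerate(tokens):
        word = raw.strip(PUNCTUATION).lower()
        if word not in LEXICON:
            continue
        strength = LEXICON[word]
        sign = 1 if strength > 0 else -1
        magnitude = abs(strength)
        # Look only at the preceding token for a modifier (a simplification).
        prev = tokens[idx - 1].strip(PUNCTUATION).lower() if idx > 0 else ""
        if prev in INTENSIFIERS:
            magnitude += INTENSIFIERS[prev]
        elif prev in DIMINISHERS:
            magnitude -= DIMINISHERS[prev]
        elif prev in NEGATORS:
            sign = -sign  # negation flips the polarity
        # Capitalisation or a trailing exclamation mark adds emphasis.
        if raw.strip(PUNCTUATION).isupper() or raw.endswith("!"):
            magnitude += 1
        magnitude = clamp(magnitude, 1, 5)
        if sign > 0:
            c_pos = max(c_pos, magnitude)
        else:
            c_neg = min(c_neg, -magnitude)
    return c_pos, c_neg


def classify(text):
    """Return +1, -1 or 0 depending on which rating is stronger."""
    c_pos, c_neg = score_text(text)
    if c_pos > -c_neg:
        return 1
    if -c_neg > c_pos:
        return -1
    return 0
```

Under these assumed weights, a text with no dictionary words keeps the ratings (1, -1) and is classified as neutral, while a negated positive word contributes to the negative rating instead.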


Hridoy et al. (2015) proposed another, more specific technique that allows for
the “utilization and interpretation of twitter data to determine public
opinion”. A crucial step in this solution was the extraction and processing of
the data, given its unstandardized nature. The data was obtained from Twitter's
API and was then cleaned up using Java. The Stanford Natural Language
Processing (SNLP) tool was then used to label the data, since SNLP provides the
grammatical relations between words. Since not all relations are meaningful, a
selection of 50 relations was taken into consideration to identify the pieces
of useful information (Hridoy, Ekram, Islam, Ahmed, & Rahman, 2015).

A numeric value must be used to indicate the sentiment of a particular tweet.
To do so, SentiWordNet was used to assign scores to the tweets. Taking into
consideration each word and its part of speech within the sentence in question,
SentiWordNet assigns a numeric score to each word, and these scores are added
together to give the score for the entire tweet. The assigned score is a value
between -1 and 1; the lower the value, the more negative the sentiment, and
vice versa (Hridoy, Ekram, Islam, Ahmed, & Rahman, 2015). Since SentiWordNet
can only identify words and not sentences, part-of-speech tagging, which is
also provided by the SNLP tool, is used to differentiate the various sentences.
A custom program was implemented for the tagger since SentiWordNet only
recognizes verbs, adjectives, adverbs and nouns. The analysis can be further
generalized to investigate comments related to a particular location, gender,
etc. The experimental results showed that the adopted method provided
meaningful and relevant information for the problem in question and that it can
easily be generalized to fit other problem definitions
(Hridoy, Ekram, Islam, Ahmed, & Rahman, 2015).
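The restriction to verbs, adjectives, adverbs and nouns can be sketched as a small mapping step between the tagger and the lexicon. The sketch below assumes that the tagger emits Penn Treebank tags and that the lexicon is keyed by WordNet-style part-of-speech letters (n, v, a, r). The function names `coarse_pos` and `scorable_tokens` are hypothetical and are not taken from the custom program of Hridoy et al.

```python
# Map Penn Treebank tag prefixes to WordNet-style part-of-speech letters.
TAG_PREFIX_TO_POS = (("NN", "n"), ("VB", "v"), ("JJ", "a"), ("RB", "r"))


def coarse_pos(penn_tag):
    """Return 'n', 'v', 'a' or 'r' for a Penn Treebank tag, else None."""
    for prefix, pos in TAG_PREFIX_TO_POS:
        if penn_tag.startswith(prefix):
            return pos
    return None


def scorable_tokens(tagged_words):
    """Keep only the (word, pos) pairs that a four-class lexicon can score.

    `tagged_words` is a list of (word, penn_tag) pairs as a tagger would
    produce them; determiners, pronouns, prepositions and so on are dropped.
    """
    kept = []
    for word, tag in tagged_words:
        pos = coarse_pos(tag)
        if pos is not None:
            kept.append((word.lower(), pos))
    return kept
```

For a tagged sentence such as "The dress looks really lovely", only "dress", "looks", "really" and "lovely" would be passed on to the lexicon.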

Upon investigating the surveyed literature, it is evident that lexicon-based ap-
proaches are better suited when unsupervised learning is in order. In both the sur-
veyed works, the proposed lexicon-based approaches have proved to be more efficient 
than machine learning-based techniques with respect to unsupervised learning. As 
such, the solution proposed in this work follows a lexicon-based approach. 

3 Proposed Application 

The main purpose of this paper is to propose an application that can produce
the social value (impact) of a brand in the market through Instagram. Data
about the brand will be collected from the images the brand has uploaded.
Sentiment analysis will be applied to these images to identify the impact each
image has on the brand, and the individual results will then be accumulated
into a single social value.

Based on the research on sentiment analysis, there are various levels and
techniques by which the social value can be extracted from the data. The
proposed application focuses on feature-level mining to determine whether the
data is positive, negative, or neutral, and by what percentage. The extraction
technique used is the lexicon-based approach, which is based on natural
language processing and utilizes part-of-speech tagging and WordNet. In this
section we describe the model of the proposed application, presented in Fig. 1.


 
Fig. 1. Application Work Flow 

3.1 Data Module 

The data required for the proposed application is collected from Instagram. It
consists of the comments written by the followers of a certain brand. This data
is then pre-processed to prepare it for the sentiment analysis module. This
part describes both the comment extraction and comment pre-processing
components.

Comment Extraction: Comment extraction will be done through the PHP API
provided by Instagram. Instagram allows only the last 150 comments of each
selected image to be extracted as a security measure (ref Instagram). The
comments will be extracted and placed in a MySQL database, where each comment
is linked to the image it was extracted from. Each image also carries extra
information, such as the total number of likes, the total number of comments,
and the brand name, which will be used later.
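The storage layout described above, with each comment linked to its image and each image carrying its like count, comment count and brand name, might look like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that the example is self-contained. The table and column names, and the helper functions `open_store`, `save_image` and `comments_for`, are assumptions made for illustration; the call to the Instagram API is not shown.

```python
import sqlite3

# Hypothetical schema: one row per image, one row per extracted comment.
SCHEMA = """
CREATE TABLE images (
    image_id      TEXT PRIMARY KEY,
    brand         TEXT NOT NULL,
    like_count    INTEGER NOT NULL,
    comment_count INTEGER NOT NULL
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY AUTOINCREMENT,
    image_id   TEXT NOT NULL REFERENCES images(image_id),
    body       TEXT NOT NULL
);
"""


def open_store(path=":memory:"):
    """Open a database and create both tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_image(conn, image_id, brand, like_count, comment_count, comments):
    """Store one image together with the comments extracted for it."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO images VALUES (?, ?, ?, ?)",
            (image_id, brand, like_count, comment_count),
        )
        conn.executemany(
            "INSERT INTO comments (image_id, body) VALUES (?, ?)",
            [(image_id, body) for body in comments],
        )


def comments_for(conn, image_id):
    """Return the stored comments of one image in insertion order."""
    rows = conn.execute(
        "SELECT body FROM comments WHERE image_id = ? ORDER BY comment_id",
        (image_id,),
    )
    return [row[0] for row in rows]
```

Keeping the like and comment totals on the image row means the later evaluation steps can read them without counting comment rows, which matters because only a limited number of comments per image is extracted.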

Comment Pre-Processing: After the data has been collected, it needs to be
pre-processed to remove irrelevant comments that are not opinionated. The
Stanford natural language processing (SNLP) tool, an open source natural
language processing tool developed by Stanford University, will be used in this
stage. The tool will aid in retrieving parts of speech and can identify the
grammatical relationships between the words in a sentence.

The first step is to reduce every word in all comments to its root form by
using the SNLP tool to stem the words and to remove any emojis, punctuation, or
extra spaces. Emojis are removed to keep the pre-processing phase simple.
Secondly, we will use the tool to tag each word with its part of speech. After
completing this task, we need to decide whether each comment is opinionated.
This requires identifying the relationships between the words, a task the SNLP
tool supports: it defines 50 dependencies that describe the relationships
between words, but only nsubj, amod, and dobj will be used to eliminate
non-opinionated comments.

The nsubj dependency will be used to identify the relations between nouns and
the adjectives or verbs that complement them in a sentence. The amod dependency
will be used to identify the adjectives that modify a noun phrase. The final
dependency, dobj, will be used to identify the direct objects that a verb
refers to in a sentence. The elimination process targets comments that do not
contain at least one of the dependencies listed above (Hridoy, Ekram, Islam,
Ahmed, & Rahman, 2015).
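The cleaning and elimination steps can be sketched as follows. The sketch assumes that each comment has already been parsed and that its dependencies are available as (relation, governor, dependent) triples. That input format and the names `clean_comment`, `is_opinionated` and `filter_comments` are hypothetical. Stemming and the parser call itself are left out.

```python
import re

# The three dependency relations used to keep a comment.
OPINION_RELATIONS = {"nsubj", "amod", "dobj"}


def clean_comment(text):
    """Remove emojis and punctuation, then collapse repeated spaces."""
    # Anything that is neither a word character nor whitespace is dropped.
    no_symbols = re.sub(r"[^\w\s]", " ", text)
    return " ".join(no_symbols.split())


def is_opinionated(dependencies):
    """True if at least one dependency is nsubj, amod or dobj."""
    return any(relation in OPINION_RELATIONS for relation, _gov, _dep in dependencies)


def filter_comments(parsed_comments):
    """Keep the text of every comment that has a qualifying dependency.

    `parsed_comments` is a list of (text, dependencies) pairs.
    """
    return [text for text, deps in parsed_comments if is_opinionated(deps)]
```

A comment such as "beautiful dress" yields an amod relation and is kept, whereas a bare interjection or a string of emojis produces none of the three relations and is eliminated.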

3.2 Sentiment Analysis Module 

After processing the data and filtering out unwanted comments, we can assign a
sentiment value to each comment and define the social value of each brand. This
module works in three phases, defined by the comment analyser, image evaluator,
and brand evaluator components. In this section we define each component and
how the components are linked together.

Comment Analyser: This component is responsible for scoring each comment with a
cumulative sentiment score. Built into the SNLP tool is the sentiWord tool,
which classifies a word as positive, negative or neutral by scoring it on a
scale from -1 to 1: -1 is the most negative, 1 is the most positive, and 0 is
neutral, meaning the word does not affect the comment in any manner. The
sentiWord tool takes as input a word and its part-of-speech tag and produces a
sentiment score; the scores of all the words are then accumulated to give the
total sentiment score of the comment at hand.
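A minimal sketch of this accumulation is given below. The word scores are invented for illustration and do not come from any published lexicon. The names `WORD_SCORES` and `comment_score` are hypothetical, and a word that is missing from the lexicon is assumed to contribute 0.

```python
# Hypothetical per-word scores in [-1, 1], keyed by (word, part of speech).
WORD_SCORES = {
    ("love", "v"): 0.5,
    ("beautiful", "a"): 0.75,
    ("ugly", "a"): -0.625,
}


def comment_score(tokens, lexicon=WORD_SCORES):
    """Sum the scores of all (word, pos) tokens in one comment.

    Tokens that are absent from the lexicon are treated as neutral (0).
    """
    return sum(lexicon.get(token, 0.0) for token in tokens)
```

Note that although each word score lies between -1 and 1, a plain sum over several words is not confined to that range; whether the total should be normalised is a design decision this sketch leaves open.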

Image Evaluator: As stated before, each brand has a set of images from which we 
extracted the comments. In this component we will evaluate each image by identify-
ing its own sentiment analysis score using equation 1. 


$$\mathrm{Score}(\mathrm{image}_i) = \frac{1}{n} \sum_{j=1}^{n} \mathrm{Score}(\mathrm{comment}_j) \tag{1}$$

The score of an image is the cumulative score of all the comments related to
the image divided by the total number of comments related to the image. Here
‘i’ represents the image being scored, ‘n’ represents the total number of
comments related to image ‘i’, and ‘j’ represents the current comment.
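As a worked illustration of this definition, the image score is the arithmetic mean of the comment scores. In the sketch below the function name `image_score` is hypothetical, and returning 0 for an image with no remaining comments is an assumption, since that case is not covered by the definition.

```python
def image_score(comment_scores):
    """Average the sentiment scores of the comments on one image."""
    if not comment_scores:
        return 0.0  # assumption: an image without comments is neutral
    return sum(comment_scores) / len(comment_scores)
```

For example, four comments scored 1.0, 0.5, -0.5 and 0.0 sum to 1.0, so the image score is 1.0 / 4 = 0.25.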

Brand Evaluator: The final component is responsible for outputting the final
social value of each brand. After each image has been evaluated by the previous
component, this component takes the average of the image scores related to the
brand using equation 2.

$$\mathrm{Score}(\mathrm{brand}_i) = \frac{1}{n} \sum_{j=1}^{n} \mathrm{Score}(\mathrm{image}_j) \tag{2}$$

Where ‘i’ represents the brand at hand, ‘n’ represents the total images related to 
brand ‘i’ and ‘j’ represents the current image out of the ‘n’ images selected. 
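The two averaging steps can be chained as in the sketch below, which takes the comment scores of every image of a brand and returns the brand score. The function name `brand_score` and the input layout (a dictionary from image identifier to a list of comment scores) are hypothetical. Skipping images that have no comments, and returning 0 when none remain, are assumptions.

```python
from statistics import mean


def brand_score(images):
    """Average the per-image scores of one brand.

    `images` maps an image identifier to the list of its comment scores.
    Images without comments are skipped (an assumption of this sketch).
    """
    image_scores = [mean(scores) for scores in images.values() if scores]
    return mean(image_scores) if image_scores else 0.0
```

Because the brand score is a mean of means, every image carries the same weight regardless of how many comments it received.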

3.3 Output 

The application will output the final social value of any selected brand. Brand
experts will also be able to analyse fully how that social value was produced
by going through all the selected images. They can review which images received
the lowest scores in order to modify or remove them, and they can see which
images received the highest scores in order to promote those images further and
gain more traction in the market.
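The review step could be supported by a simple ranking of image scores, sketched below. The function name `rank_images` and the cut-off parameter `k` are hypothetical and are not part of the described application.

```python
def rank_images(image_scores, k=3):
    """Return the k lowest-scoring and the k highest-scoring images.

    `image_scores` maps an image identifier to its score. Each result is a
    list of (image_id, score) pairs, worst first and best first respectively.
    """
    ordered = sorted(image_scores.items(), key=lambda item: item[1])
    lowest = ordered[:k]
    highest = ordered[::-1][:k]
    return lowest, highest
```

The lowest list gives the candidates for modification or removal, and the highest list gives the candidates for further promotion.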

4 Conclusion  

By assigning pictures a newly proposed value, called “Social Value”, the impact
of a picture can be better quantified with reference to how people react to it.
Such quantification provides a strong baseline for companies (in the case of
this paper, fashion houses) on which to build their future marketing campaigns,
given the social impact of their previous images and how their followers
reacted to them. The aim is to arrive at a self-sufficient application that can
recommend to users the best features to include in their pictures to better
suit the tastes of their various followers on social media.

5 Future Work 

The results produced by this application will be presented to domain experts to
establish how accurate they are. After receiving feedback from the domain
experts, modifications can be made to improve the accuracy of the application.
The next step is to include sentence-level opinion mining in order to extract
the specific features that brand followers have targeted, and to add a machine
learning technique to further increase the accuracy of the social value of each
brand. This paper presents the first of a series of papers on sentiment
analysis of Instagram specifically.

6 References 

[1] Arora, A., Patil, C., & Correia, S. (2015). Opinion Mining: An Overview.
International Journal of Advanced Research in Computer and Communication
Engineering, 4 (11), 94-98.

[2] Benevenuto, F., Araujo, M., & Riberio, F. (2015). Sentiment Analysis Methods
for Social Media. WebMedia'15 (p. 11). New York, USA: ACM.

[3] Feldman, R. (2013). Techniques and Applications for Sentiment Analysis.
Communications of the ACM, 82-89. https://doi.org/10.1145/2436256.2436274

[4] Hridoy, S. A., Ekram, M. T., Islam, M. S., Ahmed, F., & Rahman, R. M. (2015).
Localized twitter opinion mining using sentiment analysis. Decision Analytics,
2 (8), 1-19.

[5] Miwari, R., Singh, A., & Srivastava, A. (2015). Opinion Mining Techniques on
Social Media Data. International Journal of Computer Applications, 118 (6).
https://doi.org/10.5120/20753-3149

[6] Paltoglou, G., & Thelwall, M. (2012). Twitter, MySpace, Digg: Unsupervised
Sentiment Analysis in Social Media. ACM Transactions on Intelligent Systems and
Technology, 3 (4). https://doi.org/10.1145/2337542.2337551

[7] Xiaowen, D., Bing, L., & Philip, Y. S. (2008). A Holistic Lexicon-based
Approach to Opinion Mining. International Conference on Web Search and Data
Mining. Palo Alto, California: ACM.

7 Authors 

Mohamed AbdelFattah is with British University in Egypt, Cairo, Egypt 
(Mohamed.Abdelfattah@bue.edu.eg). 

Dahab Galal (corresponding author) is with British University in Egypt, Cairo, 
Egypt (Dahab.Galal@bue.edu.eg, Tel.: +02 0100487 4830). 

Nada Hassan is with British University in Egypt, Cairo, Egypt 
(Nada.Hassan@bue.edu.eg). 

Doaa S. Elzanfaly is with British University in Egypt, Cairo, Egypt 
(Doaa.Elzanfaly@bue.edu.eg). 

Greg Tallent is with London South Bank University, London, United Kingdom 
(greg.tallent@lsbu.ac.uk). 

This article is a revised version of a paper presented at the BUE International Conference on
Sustainable Vital Technologies in Engineering and Informatics, held Nov 07, 2016 - Nov 09, 2016, in
Cairo, Egypt. Article submitted 21 December 2016. Published as resubmitted by the authors 22 February 2017.
