

To cite this article: Lamrhari, S., Elghazi, H. & El Faker, A. (2019) Business 
intelligence using the fuzzy-Kano model. Journal of Intelligence Studies in 
Business. 9 (2) 43-58. 

Article URL: https://ojs.hh.se/index.php/JISIB/article/view/408 

This article is Open Access, in compliance with Strategy 2 of the 2002 Budapest Open Access Initiative, which 
states: 

Scholars need the means to launch a new generation of journals committed to open access, and to help existing journals that 
elect to make the transition to open access. Because journal articles should be disseminated as widely as possible, these new 
journals will no longer invoke copyright to restrict access to and use of the material they publish. Instead they will use 
copyright and other tools to ensure permanent open access to all the articles they publish. Because price is a barrier to access, 
these new journals will not charge subscription or access fees, and will turn to other methods for covering their expenses. 
There are many alternative sources of funds for this purpose, including the foundations and governments that fund research, 
the universities and laboratories that employ researchers, endowments set up by discipline or institution, friends of the cause 
of open access, profits from the sale of add-ons to the basic texts, funds freed up by the demise or cancellation of journals 
charging traditional subscription or access fees, or even contributions from the researchers themselves. There is no need to 
favor one of these solutions over the others for all disciplines or nations, and no need to stop looking for other, creative 
alternatives. 

 
Journal of Intelligence Studies in Business 
Publication details, including instructions for authors and subscription 
information: https://ojs.hh.se/index.php/JISIB/index 

 
Business intelligence using the fuzzy-Kano model  

Soumaya Lamrhari^a*, Hamid Elghazi^b, Abdellatif El Faker^a
^a ENSIAS, Mohammed V University Rabat, Morocco; ^b National Institute of
Posts and Telecommunications Rabat, Morocco

*soumaya_lamrhari@um5.ac.ma  

 
 

Editor-in-chief: 
Klaus Solberg Søilen


ISSN: 2001-015X 

Vol. 9, No. 2 2019


Business intelligence using the fuzzy-Kano model 
 
Soumaya Lamrhari^a*, Hamid Elghazi^b and Abdellatif El Faker^a

^a ENSIAS, Mohammed V University Rabat, Morocco
^b National Institute of Posts and Telecommunications Rabat, Morocco
 
Corresponding author (*): soumaya_lamrhari@um5.ac.ma 
 
Received 25 September 2019 Accepted 24 October 2019 

ABSTRACT Today, understanding customer satisfaction is becoming a difficult and complex 
task for companies due to the explosive growth of the voice of the customer in online reviews. 
This has pushed companies to rethink their business strategies and resort to business 
intelligence techniques in order to help them in analyzing customer requirements and market 
trends. This paper proposes a decision support framework for dynamically transforming the 
voice of the customer data into actionable insight. The framework measures the customer 
satisfaction by extracting key products’ aspects along with customers’ sentiments from online 
reviews using a text mining technique: the latent Dirichlet allocation approach. We apply the 
Fuzzy-Kano model to classify the real customer requirements, then map them dynamically to
the SWOT matrix. The proposed approach is extensively tested on an empirical dataset using
several performance metrics, including accuracy, precision, recall, and F-score. The reported
results show that the latent Dirichlet allocation approach correctly extracted aspects with
97.4% accuracy and 92.4% precision.

KEYWORDS Business intelligence, customer satisfaction, decision support framework, 
Fuzzy-Kano model, latent Dirichlet allocation, online reviews, text mining, voice of the 
customer, web intelligence 

 
“The secret of successful retailing is to give 
your customers what they want.” 

 
Sam Walton 

 
1. INTRODUCTION 
In today’s competitive marketplace, business 
leaders have realized that customers are the 
major driving force leading a company to thrive 
(Carulli et al., 2013) (Lee et al., 2014). In fact, 
most of the product-based companies require 
an in-depth understanding of their customers’ 
satisfaction. Thus, they resort to business 
intelligence (BI) techniques in order to provide 
competitive products that meet the customer 
needs and go in line with the current market 
trend (Sabanovic and Søilen, 2012). The voice
of the customer (VOC) is a widely used term in
market research that describes the customers’ 
feedback about their expectations and 
experiences in relation to products and 
services. This is considered an essential first 
step in developing a successful product or 
service (Aguwa et al., 2012). The VOC is 
usually captured in a variety of ways such as 
questionnaire surveys, face to face interviews, 
telephone interviews, and discussion groups 
(Goodman, 2014) (Rese et al., 2015). However, 
most of these methods are demanding in terms 
of time, cost, and their geographic reach 
(Szolnoki and Hoffmann, 2013). Additionally, 
the participants’ willingness to provide actual 
input can impact the collected data quality 
(Reyes, 2016). Besides, the surveys are 
generally conducted occasionally, which makes
the timeliness of the gathered data
questionable (Culotta and Cutler, 2016). 
Consequently, we need to consider other 
alternative data sources to reveal customer 
expectations.  

The growing popularity of social media and 
BI in the last decade makes them a valuable 
digital channel for listening and capturing 
customers’ voices (Gioti et al., 2018). Unlike 
conventional approaches, the VOC on social 
media is publicly available, easily accessible 
anywhere and anytime at low cost. Examples 
of these VOCs include customer posts, 
comments, and reviews. Customer reviews can 
be considered a trustworthy VOC since they 
hold massive data where customers voluntarily 
share their experiences about a specific product 
or service after use or purchase. Unfortunately,
these reviews may not explicitly reflect
customer needs, so extracting those needs requires
more advanced data analysis methods. Therefore,
most companies have adopted BI techniques 
(Nyblom et al., 2012), such as text mining, to 
discover hidden patterns in this large amount 
of textual data to support the decision making 
process (Søilen et al., 2017) (Xu and Li, 2016) 
(Jia, 2018).  

Plenty of studies have been conducted to 
explicitly or implicitly understand customer 
satisfaction from online review content. For 
instance, Decker and Trusov (2010) applied an 
econometric framework based on Poisson 
regression, binomial regression, and latent 
class Poisson regression models. These models
are used to estimate the relative strength
of the effects of the attributes
identified in customer reviews of
mobile phones. Their findings
reveal that the negative binomial regression
approach provides significant parameter
estimates, which quantify the effects that the
product attributes have on overall customer
satisfaction. Park and Lee (2011) proposed a
systematic framework for extracting customer 
requirements from an online customer center 
and transforming them into product 
specifications data. In their approach, 
customer opinions are collected, then a text 
mining analysis is conducted on customer 
complaints to extract meaningful keywords. 
Based on the extracted VOCs, customers are 
clustered into different groups with similar 
needs. Then, the target groups are carefully
selected by the companies. Further, co-word
and decision tree analyses are used to
translate the customer requirements into
product specifications. Xiao et al. (2016)
established a novel econometric preference 
measurement model for extracting overall 
customers’ preferences from online product 
reviews. The model allows a semi-automatic 
extraction of product features along with the 
related reviewers’ sentiments. Then, aggregate 
customer preferences are extracted from online 
product reviews by a modified ordered choice 
model, which considers the variety of 
customers’ ratings and allows them to assign
rating scores with their own thresholds.
Furthermore, the identified customer 
requirements are classified into different 
categories, e.g. basic, performance, excitement, 
innovation-needed, reverse and divergent, by 
using a marginal effect-based Kano model, 
which is an extension of the classical Kano 
model that employs the marginal effect 
information disclosed by the proposed modified 
ordered choice model. 

In addition, other research studies have 
applied an aspect-based sentiment analysis 
approach for understanding customers’ 
satisfaction. This approach involves extracting 
aspects and finding their corresponding 
sentiments. Latent Dirichlet allocation (LDA) 
is considered a state-of-the-art modeling tool 
for extracting products’ features in the aspect-
based sentiment analysis (Saura et al., 2019). 
For instance, Farhadloo et al. (2016) proposed 
a Bayesian approach that models the customer 
satisfaction based on the individual aspect 
ratings. First, the study utilizes the aspect-
based sentiment analysis method described in 
(Farhadloo and Rolland, 2013) as a basis to 
transform unstructured input data into semi-
structured data. Then, the Bayesian method 
enables the extraction of the relative 
importance of each aspect of the product or 
service. For consumer-generated content in 
marketing, Tirunillai and Tellis (2014) 
proposed a unified framework that extracts the 
key latent quality dimensions (known as
“topics” in the LDA literature) of consumer
satisfaction and the associated sentiments
using an unsupervised Bayesian learning
algorithm based on LDA. Moreover, the approach
determines the validity, importance, dynamics,
and heterogeneity of the extracted dimensions.
In another context, Guo et al. (2017) put
forward an LDA-based approach to identify the
most important dimensions of customer service 
in the hotel sector. Then, they performed a 
perceptual mapping to represent the key 
dimensions influencing the visitors’ 
satisfaction and the visitors’ perceived ratings
across different hotel classifications. Qi et al. (2016)
proposed an automatic filtering model to mine 
customers’ requirements from online reviews. 
First, it filters out the reviews that are helpful 
for product improvement. Then, a lexicon-
based sentiment analysis, LDA, and page rank 
are used to rank the terms based on their 
frequencies and semantic relationships. In 
addition, the conjoint analysis and the Kano 
model are utilized to determine the product 
attribute weights and categories and evaluate 
their impact on customer satisfaction.  

Despite the contributions made by the 
aforementioned studies regarding the 
understanding of customer satisfaction from 
online reviews, they still have some drawbacks. 
First, in (Decker and Trusov, 2010), (Farhadloo
et al., 2016), (Qi et al., 2016), (Xiao et al., 2016),
and (Park and Lee, 2011), the authors quantified
the effects that customer requirements may
have on satisfaction by using various
modeling methods that measure product
attributes, e.g. weights and importance. In contrast,
in (Guo et al., 2017) and (Tirunillai and Tellis,
2014), the authors focused only on mining the
relevant products’ attributes. Second, most of
the existing studies that have measured the
effects of customer requirements on customer
satisfaction have not classified the identified
requirements from either the customer or the
provider perspective. Third, our approach
bears a close resemblance to the one proposed
by Qi et al. (2016), except that we
incorporate fuzzy analysis into the
Kano model instead of conjoint analysis.
With fuzzy analysis, the measurement of each
product attribute is expressed as a
degree of membership, allowing
customers to express their preferences toward
multiple attributes at the same time, unlike
conjoint analysis, where customers can only
express their preferences for a single attribute.

Based on the results reported in (Tirunillai 
and Tellis, 2014), (Qi et al., 2016), (Guo et al., 
2017), LDA has demonstrated good stability 
and satisfactory performance in terms of 
accurately extracting the key customer 
requirements from a large volume of online 
reviews. Therefore, we have selected it as a 
topic modeling method in our approach. To the 
best of our knowledge, this is the first attempt 
to combine LDA, the Fuzzy-Kano model and 
the SWOT method into one decision support 
framework for understanding customer 
satisfaction. Specifically, we analyze the
VOC collected from online reviews, then
extract the actual customer requirements
that have the greatest impact on customers’ experiences
with a given product or service.

Such a framework is beneficial for 
companies since it allows them to deeply 
understand the customers’ needs and 
proactively adapt their product/service or even 
their business model accordingly. It is 
composed of four major modules. The first one 
consists of collecting and preprocessing data 
from online customer reviews. The second one 
extracts the products’ aspects and the 
corresponding customers’ sentiments from the 
preprocessed data using LDA. The third 
module classifies the real customer needs that 
affect their satisfaction based on the Fuzzy-
Kano model. The fourth module maps the 
Fuzzy-Kano model’s output to a SWOT matrix 
in order to easily interpret the obtained results. 
The proposed approach is extensively 
evaluated using an empirical dataset, which 
includes mobile phone reviews collected from 
Amazon. The evaluation is based on several 
performance metrics including accuracy, 
precision, recall, and F-score. 

The remainder of this paper is organized as
follows. Section 2 provides the theoretical
background of the proposed framework.
Section 3 describes our methodology. In
Section 4, we evaluate the effectiveness of our
method using a real case study. In Section 5,
we draw some conclusions and shed light on
further research directions.

 
2.  THEORETICAL BACKGROUND  
2.1 Latent Dirichlet Allocation (LDA) 
In this paper, we seek a way to map customers’
reviews to topics without prior
knowledge of what those topics are. This is
an unsupervised classification
problem on natural language. LDA is an
unsupervised topic modeling approach widely
applied in natural language processing. The
present study deployed LDA (Blei, 2012) 
instead of other topic model approaches found 
in the literature because it relies on more 
comprehensive probabilistic assumptions on 
the text generation and has shown satisfactory 
performance and good stability when 
classifying large data sets (Lu et al., 2011) 
(Alghamdi and Alfalqi, 2015) (Hofmann, 2017). 
In LDA, each document consists of a mixture of
topics and each topic consists of a collection of
words. Consider a corpus $D$ of $M$
documents, each of length $N$. Each document
contains a sequence of $W$ words, each of which
is the $v$-th word in a vocabulary
of $V$ distinct terms, and $K$ is the total number of
topics. Thus:
 

• $\alpha$ and $\beta$ are the parameters of the prior
distributions over the per-document topic
distribution and the per-topic word
distribution, respectively.
• $\theta_m$ is the topic distribution for
document $m$.
• $\varphi_k$ is the word distribution for topic $k$.
• $z_{m,n}$ is the topic for the $n$-th word in
document $m$.
• $w_{m,n}$ is the specific word.
 
Formally, LDA generates a corpus $D$ of $M$
documents according to the following
generative process:

• Choose a topic distribution $\theta_i \sim \mathrm{Dir}(\alpha)$,
where $i \in \{1, \dots, M\}$ and $\mathrm{Dir}(\alpha)$ is
a Dirichlet distribution with scaling
parameter $\alpha$, which is typically sparse
($\alpha < 1$).
• For each topic $k \in \{1, \dots, K\}$, choose
$\varphi_k \sim \mathrm{Dir}(\beta)$, where $\beta$ is typically
sparse.
• For each of the word positions $i, j$,
where $j \in \{1, \dots, N_i\}$ and $i \in \{1, \dots, M\}$:
o Choose a topic $z_{i,j} \sim \mathrm{Multinomial}(\theta_i)$.
o Choose a word $w_{i,j} \sim \mathrm{Multinomial}(\varphi_{z_{i,j}})$.
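The generative process above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the study: the function name `generate_corpus`, the use of symmetric priors, and the default values of `alpha` and `beta` are assumptions made for the example.

```python
import numpy as np

def generate_corpus(M, K, V, doc_lengths, alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus by following the LDA generative process.

    M: number of documents, K: number of topics, V: vocabulary size.
    alpha and beta are assumed symmetric Dirichlet parameters (< 1, i.e. sparse).
    """
    rng = np.random.default_rng(seed)
    # theta[i]: topic distribution of document i, drawn from Dir(alpha)
    theta = rng.dirichlet(np.full(K, alpha), size=M)
    # phi[k]: word distribution of topic k, drawn from Dir(beta)
    phi = rng.dirichlet(np.full(V, beta), size=K)
    docs = []
    for i in range(M):
        # z[j]: topic of word position j in document i
        z = rng.choice(K, size=doc_lengths[i], p=theta[i])
        # w[j]: word id drawn from the word distribution of topic z[j]
        w = [int(rng.choice(V, p=phi[k])) for k in z]
        docs.append(w)
    return theta, phi, docs
```

For instance, `generate_corpus(M=3, K=2, V=10, doc_lengths=[5, 7, 4])` returns the sampled distributions together with three documents expressed as lists of word ids.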

 
Moreover, a graphical model can also mirror 
the generative process of documents. As 
depicted in Figure 1, the boxes refer to repeated 
contents where the number of repetitions is 
presented by the variable at the corner of the 
corresponding box. The blue node represents
the only observed variable (𝑤). The white nodes
denote latent variables (𝜑, 𝜃), and the gray nodes
represent hyperparameters (𝛼 and 𝛽). The
arrows indicate dependencies among the model 
parameters.  

Practically, the model must determine the 
hidden variables from the data, namely the 
document-topic distribution 𝜃, and the topic-
word distribution 𝜑. To this end, the Gibbs 
Sampling algorithm (Darling, 2011) is applied 
to estimate those two LDA parameters.  
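As an illustration of how Gibbs sampling can estimate the two distributions, the following is a minimal collapsed Gibbs sampler. It is a sketch under stated assumptions, not the study's code: documents are assumed to be lists of integer word ids, the priors are assumed symmetric, and the name `lda_gibbs` and the default hyperparameters are hypothetical.

```python
import numpy as np

def lda_gibbs(docs, K, V, alpha=0.5, beta=0.1, n_iter=200, seed=0):
    """Estimate theta (document-topic) and phi (topic-word) by collapsed Gibbs sampling."""
    rng = np.random.default_rng(seed)
    M = len(docs)
    n_dk = np.zeros((M, K))  # words in document d assigned to topic k
    n_kw = np.zeros((K, V))  # occurrences of word w assigned to topic k
    n_k = np.zeros(K)        # total words assigned to topic k
    z = []
    # Random initial topic assignment for every word position
    for d, doc in enumerate(docs):
        z_d = rng.integers(K, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                k = z[d][n]
                # Remove the current assignment from the counts
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic for this position
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][n] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1
    # Posterior mean estimates from the final counts
    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    phi = (n_kw + beta) / (n_kw + beta).sum(axis=1, keepdims=True)
    return theta, phi
```

Reading the estimates from a single final sample keeps the sketch short; averaging over several samples after a burn-in period is a common refinement.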
 
2.2 Kano Model  
The Kano model (Kano, 1984) is an effective
tool used by companies to integrate the VOC
into the product and service development
lifecycle. It describes a nonlinear
relationship between product quality and
customer satisfaction. It measures customer
sentiments to discover which customer
requirements have the highest impact on
customer satisfaction (Tontini et al., 2013).

The Kano model typically relies on surveys
and questionnaires administered to customers
to determine the requirements of a particular
product or service. For a given product aspect,
a functional question (aspect’s presence) and a
dysfunctional question (aspect’s absence) are
asked. Each question is answered
on a five-point scale: like, necessary,
neutral, unnecessary, and dislike. Based on a
statistical analysis of all the accumulated
survey responses, each answer pair is
aligned with the Kano evaluation (Table 1),
forming certain requirements (Ullah and
Tamaki, 2011). Table 1 shows that by
combining the two answers (functional and
dysfunctional), the product’s aspects can be
classified into six categories of requirements
that influence customer satisfaction:

 
• A “Must-be” (M) requirement is expected
by the customers. Its presence does not
lead to customer satisfaction, but its
absence leads to extreme customer
dissatisfaction.
• A “One-dimensional” (O) requirement is
the property of a customer need that
increases customer satisfaction when it
is fulfilled. Conversely, customer
satisfaction decreases when it is not
fulfilled.
• An “Attractive” (A) requirement is usually
uncommon or unexpected by the
customers. If included, it can truly
increase customer satisfaction; if not,
there is no feeling of dissatisfaction.
• “Indifferent” (I) requirements are those
whose presence or absence the customer
does not care about. That is, these
attributes cause neither
satisfaction nor dissatisfaction,
but that does not mean they
do not impact the company's production
decisions.
• “Reverse” (R) requirements are those
whose presence results in
dissatisfaction, since not all customers
are alike. In other words, what satisfies
one customer might
alienate another.
• A “Questionable” (Q)
requirement occurs when the
customer selects unclear answers
on both the functional and dysfunctional
sides.

Table 1 The standard Kano evaluation (Ullah and Tamaki,
2011). Rows give the functional answer and columns the
dysfunctional answer. Nec = necessary; Neu = neutral;
Unnec = unnecessary; Dis = dislike.

Functional    Dysfunctional
              Like   Nec   Neu   Unnec   Dis
Like          Q      A     A     A       O
Nec           R      I     I     I       M
Neu           R      I     I     I       M
Unnec         R      I     I     I       M
Dis           R      R     R     R       Q

Figure 1 The graphical representation of the LDA model,
redrawn from (Blei, 2012)
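The classical Kano evaluation amounts to a table lookup on the pair of answers. The sketch below encodes Table 1 directly; the names `ANSWERS`, `KANO_TABLE`, and `kano_category` are hypothetical, and taking the most frequent category over all respondents is only one simple way to aggregate responses, assumed here for illustration.

```python
from collections import Counter

# Answer scale, in the row/column order of Table 1
ANSWERS = ["like", "necessary", "neutral", "unnecessary", "dislike"]

# Rows: functional answer; columns: dysfunctional answer
KANO_TABLE = [
    ["Q", "A", "A", "A", "O"],
    ["R", "I", "I", "I", "M"],
    ["R", "I", "I", "I", "M"],
    ["R", "I", "I", "I", "M"],
    ["R", "R", "R", "R", "Q"],
]

def kano_category(functional, dysfunctional):
    """Return the Kano category for one (functional, dysfunctional) answer pair."""
    row = ANSWERS.index(functional)
    col = ANSWERS.index(dysfunctional)
    return KANO_TABLE[row][col]

def most_frequent_category(answer_pairs):
    """Assumed aggregation: the category chosen most often across respondents."""
    counts = Counter(kano_category(f, d) for f, d in answer_pairs)
    return counts.most_common(1)[0][0]
```

A respondent who likes having an aspect and dislikes its absence yields `kano_category("like", "dislike")`, which maps to the one-dimensional category O.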

 
However, the Kano questionnaires and
surveys allow respondents to select only a single
option from a set of options, so they are
unable to express their uncertainty toward
certain aspects by selecting more than one
choice. To address the uncertainty
in people’s satisfaction as well as the
vagueness of human thought, our study
combines the classical Kano model with
fuzzy analysis to obtain an equivalent
Fuzzy-Kano model that classifies the customers’
requirements based on fuzzy logic rather than
binary logic (Lee and Huang, 2009). The
Fuzzy-Kano model allows customers to express multiple
feelings across the different Kano
categories by giving fuzzy satisfaction values
to certain aspects. This fuzzy set of values is
represented by membership degrees
ranging from 0 to 1, reflecting the uncertainty,
where the sum of the elements is equal to 1.
Furthermore, this approach automates the
building of the Kano model. It incorporates the
VOCs into the Fuzzy-Kano model through LDA
to obtain much larger-scale data with more
reliable insights, since the classical Kano
model, when used alone, cannot directly handle
such data.
 
3.   METHODOLOGY 
The proposed framework is composed of four 
modules as illustrated in Figure 2: (1) data 
extraction and preprocessing; (2) aspect-
sentiment pairs extraction using LDA; (3) 
requirements classification based on the 
Fuzzy-Kano model; and (4) decision-making 
analysis driven by Fuzzy-Kano and SWOT. In 
this section, we describe each of these modules. 
 
 
Figure 2 The proposed decision support framework.

3.1 Data Extraction and Preprocessing
The first module consists of gathering online
customer reviews as the material for analysis
and saving them in the form of a table in which
each review denotes a document. Generally,
reviews contain emoticons, special characters,
punctuation, HTML tags, capital letters and
misspelled words. So, it is necessary to apply a
set of operations to each review before moving
to the next module. These preprocessing
operations include:
 

• Tokenization: breaking up a
sequence of textual content into words,
phrases, and symbols called tokens. These
tokens are used as input data for further
processing.
• Stop word removal: filtering out
irrelevant words and
characters from the data, such as prepositions
and pronouns.
• Part-of-speech (POS) tagging:
assigning a label such as noun, verb,
or adjective to each token (word) in a
text.
• Filtering tokens: discarding all
words whose length falls outside the range of
2 to 25 characters.
• Transforming cases:
converting all tokens into lowercase.
• Stemming: stripping affixes
from each word to obtain its root form.

 
Additionally, some reviews can be wrapped in 
a specific electronic file format, such as HTML, 
XML or JSON, which sometimes requires 
transformation into another format so as to be 
easily processed by the next modules. After 
performing the aforementioned preprocessing 
operations, a set of valid words is generated by 
excluding all meaningless words from the 
token list. Thus, a document-term matrix is 
produced, which indicates terms and their 
occurrence frequencies in each document. 
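A simplified version of this preprocessing pipeline can be written with the Python standard library alone. The sketch below is illustrative only: the stop-word list is a tiny hypothetical sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer, lowercasing is applied during tokenization for brevity, and POS tagging is omitted because it requires an NLP library.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the one used in the study)
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "of", "to", "this", "in", "on", "for", "with"}

def naive_stem(token):
    """Crude suffix stripping, a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(review):
    """Clean one review and return its list of stemmed tokens."""
    text = re.sub(r"<[^>]+>", " ", review)               # drop HTML tags
    tokens = re.findall(r"[a-z]+", text.lower())          # tokenize and lowercase
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop word removal
    tokens = [t for t in tokens if 2 <= len(t) <= 25]     # length filter
    return [naive_stem(t) for t in tokens]                # stemming

def document_term_matrix(reviews):
    """Return the sorted vocabulary and the term counts of each review."""
    counts = [Counter(preprocess(r)) for r in reviews]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix
```

Each row of the returned matrix corresponds to one review and each column to one vocabulary term, matching the document-term matrix described above.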
 
3.2 Aspect-Sentiment Pairs Extraction using LDA
In this module, we begin by implementing LDA 
to reveal all topics being discussed by 
customers in the reviews. For this, we compute 
the probability of each word in the review as 
written in equation 1: 
 

$$p(w \mid R) = \sum_{i=1}^{K} p(w \mid T) \times p(T \mid R_i) \tag{1}$$

where $p(w \mid T)$ is the probability of a word $w$
given a topic $T$, $p(T \mid R_i)$ is the probability of
a topic $T$ given a review $R_i$, and $K$ is the total
number of reviews in the overall collection.

Then, we extract aspects and sentiments 
that appear together in the same topic 
distribution according to the POS tagging 
process. Words describing sentiments are
mainly adjectives and adverbs,
whereas a product aspect is mainly
represented by nouns or noun phrases (Hu and
Liu, 2004a), but not all nouns refer to aspects.
Therefore, we first select the most
representative nouns as aspect candidates
according to their co-occurrence frequencies in
the review, as well as their appearance with
sentiment words. To identify the orientation of
sentiment words, WordNet (Miller, 1995) is used,
together with the opinion lexicon provided in (Hu
and Liu, 2004b) for sentiment words that are
not covered by WordNet. Next, we use the
popular approach of Hu and Liu (2004b) to
construct aspect-sentiment pairs, which is
based on extracting the adjectives near a
frequent aspect.

Practically, we define a nearby adjective as
the nearest opinion word to a specific aspect
in terms of token distance (measured as the
number of words separating it from that aspect).
The maximum number of nearest
sentiment words is set to two because
a third word, when found, usually
describes another aspect that
was ignored during processing. By doing so, we
prevent the incorrect attribution of a sentiment
word to an aspect. Moreover, once a sentiment
word is assigned to an aspect,
it is not considered in any later
attribution.

To compute the final sentiment score for an 
aspect (positive or negative), we sum up all 
sentiment word scores related to that aspect as 
follows: 

 
$$A_i.ss = \sum_{j} \frac{SW_j.ss}{dist(SW_j, A_i)} \tag{2}$$

where $A_i.ss$ is the sentiment score of an
aspect $A_i$, $SW_j.ss$ is the polarity score in $\{-1, 1\}$
given to the $j$-th sentiment word according to the
opinion lexicon, and $dist(SW_j, A_i)$ is the
distance between the aspect $A_i$ and the
identified sentiment word $SW_j$. This allows us
to identify the opinion word with the highest
weight, i.e. the nearest opinion word to the
aspect.
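The distance-weighted score of equation 2 can be sketched as follows. The function name `aspect_sentiment_score`, the representation of the lexicon as a dictionary mapping words to +1 or -1, and the use of the difference in token positions as the distance are assumptions made for this example. The sketch scores one aspect in isolation and does not enforce the rule that a sentiment word already assigned to one aspect is excluded from later attributions.

```python
def aspect_sentiment_score(tokens, aspect_index, lexicon, max_words=2):
    """Sum polarity / distance over the nearest sentiment words to an aspect.

    tokens: list of words in the sentence.
    aspect_index: position of the aspect word in tokens.
    lexicon: dict mapping sentiment words to a polarity of +1 or -1.
    max_words: maximum number of nearest sentiment words to use.
    """
    candidates = []
    for j, token in enumerate(tokens):
        if j != aspect_index and token in lexicon:
            distance = abs(j - aspect_index)  # token distance to the aspect
            candidates.append((distance, lexicon[token]))
    candidates.sort(key=lambda pair: pair[0])  # nearest sentiment words first
    return sum(polarity / distance for distance, polarity in candidates[:max_words])
```

With the tokens `["great", "battery", "but", "terrible", "screen"]` and the aspect "battery" at index 1, "great" contributes +1/1 and "terrible" contributes -1/2, so the nearer word dominates the score.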

 
3.3 Requirements Classification based on Fuzzy-Kano model
In this module, we use the aspect-sentiment
pairs generated previously in combination with
the Fuzzy-Kano model to classify the real
customer requirements that affect customer
satisfaction. In the document collection, each
comment is written by a customer, $c$, to express
a sentiment, $s$, toward several aspects, $asp$, of an
item, $i$. Using the quadruplet $\{s, i, asp, c\}$, we
form the matrix of aspect and sentiment
distribution, denoted as $A = (a_{ij})$ with
$1 \le i \le p$ and $1 \le j \le q$. For
instance, in equation 3, rows represent aspects
and columns denote items. The matrix entries
represent the customer’s sentiment $c_{pq}$ toward
the aspect $p$ of the item $q$. We assign +1 to a
positive attitude, -1 to a negative attitude, and
0 to a neutral attitude or no opinion expressed.
Then, we construct for each aspect a set of
n-dimensional vector distributions. For example,
the first row in the matrix indicates that for
aspect 1, the customer marks a negative
attitude for item 1, neutral or no feeling toward
item 2, and a positive attitude for item $q$. Thus,
each row in the matrix constitutes a customer’s
sentiment vector corresponding to that aspect.
 

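$$A = \begin{pmatrix}
-1 & 0 & \cdots & 1 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & 1
\end{pmatrix} \tag{3}$$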

 
To apply the Fuzzy-Kano model, we first calculate
for each aspect the customer’s degree of
preference when the aspect has a functional
presence and the customer’s degree of dislike
when the aspect has a dysfunctional absence or
insufficiency. Probability gives real knowledge
when the customer feelings are ambiguous or
uncertain. So, we calculate such degrees as
probabilities of preference and dislike. They
are given, respectively, in equations 4
and 5:

 
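$$\mathit{preference}(c, Asp_i) = \frac{N_c}{p \times q} \times \frac{S_i^{+}}{S_i} \tag{4}$$

$$\mathit{dislike}(c, Asp_i) = \frac{N_c}{p \times q} \times \frac{S_i^{-}}{S_i} \tag{5}$$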

 
where $\mathit{preference}(c, Asp_i)$ and
$\mathit{dislike}(c, Asp_i)$ represent the probabilities that
customer $c$ has a positive or negative
sentiment, respectively, for aspect $Asp_i$ of a
specific item; $N_c$ denotes the number of
sentiments, either positive or negative,
expressed by customer $c$ toward some
aspects; $p \times q$ refers to the dimension of the
aspect-sentiment matrix; $S_i^{+}$ and $S_i^{-}$ represent the
number of positive and negative sentiments
given by $c$ for aspect $Asp_i$, respectively; and $S_i$ is
the total number of sentiment attitudes
expressed by several customers for the aspect
$Asp_i$.
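Equations 4 and 5 are simple ratios, which the following sketch evaluates from the counts defined above. The function name `preference_and_dislike`, its argument names, and the choice to return zeros when no sentiment has been expressed for the aspect are assumptions made for illustration.

```python
def preference_and_dislike(n_c, p, q, s_pos, s_neg, s_total):
    """Evaluate equations 4 and 5 for one customer and one aspect.

    n_c: number of sentiments (positive or negative) expressed by the customer.
    p, q: dimensions of the aspect-sentiment matrix.
    s_pos, s_neg: positive and negative sentiments given by the customer for the aspect.
    s_total: total number of sentiment attitudes expressed for the aspect.
    """
    if s_total == 0:
        # Assumed convention: no expressed sentiment gives zero degrees
        return 0.0, 0.0
    scale = n_c / (p * q)
    preference = scale * s_pos / s_total
    dislike = scale * s_neg / s_total
    return preference, dislike
```

With hypothetical counts `n_c=6`, `p=3`, `q=4`, `s_pos=3`, `s_neg=1`, and `s_total=4`, the common factor is 6/12 and the two degrees are that factor times 3/4 and 1/4.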

 
Second, each of the obtained preference and
dislike values refers to a fuzzy set, which
contains elements that have varying degrees of
membership in the set. These degrees
correspond to the five standard Kano answers
(‘like’, ‘necessary’, ‘neutral’, ‘unnecessary’, and
‘dislike’). They are determined using
membership functions, where each element of
the fuzzy set is mapped to a value ranging from
0 to 1. In particular, we employ in this paper
the triangular membership function because of
its simplicity in determining the input
parameter values, namely $\mathit{preference}$ and
$\mathit{dislike}$ in our case (Umoh and Isong, 2013).
According to the triangular membership
method, the five standard Kano answers are
represented as five triangular fuzzy numbers
between $\tilde{0}$ and $\tilde{1}$, as follows:

 
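• Dislike: (0, 0, 0.25)
$$\mu(x) = \begin{cases} 0.25 - x & 0 \le x \le 0.25 \\ 0 & \text{otherwise} \end{cases}$$

• Unnecessary: (0, 0.25, 0.5)
$$\mu(x) = \begin{cases} x & 0 \le x \le 0.25 \\ 0.5 - x & 0.25 \le x \le 0.5 \\ 0 & \text{otherwise} \end{cases}$$

• Neutral: (0.25, 0.5, 0.75)
$$\mu(x) = \begin{cases} x - 0.25 & 0.25 \le x \le 0.5 \\ 0.75 - x & 0.5 \le x \le 0.75 \\ 0 & \text{otherwise} \end{cases}$$

• Necessary: (0.5, 0.75, 1)
$$\mu(x) = \begin{cases} x - 0.5 & 0.5 \le x \le 0.75 \\ 1 - x & 0.75 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

• Like: (0.75, 1, 1)
$$\mu(x) = \begin{cases} x - 0.75 & 0.75 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

where $x$ is the fuzzy set represented by the
degree of preference/dislike, and $\mu(x)$ is its
triangular membership function.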
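To make the mapping concrete, the sketch below evaluates the five triangular fuzzy numbers for a given degree. It uses the textbook triangular membership function, normalized so that the peak has membership 1; this normalization is an assumption of the example, since the expressions above are written without dividing by the width of 0.25. The names `triangular`, `KANO_FUZZY_NUMBERS`, and `membership_degrees` are hypothetical.

```python
def triangular(x, a, b, c):
    """Textbook triangular membership with feet a and c and peak b (a <= b <= c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge

# The five triangular fuzzy numbers listed above
KANO_FUZZY_NUMBERS = {
    "dislike": (0.0, 0.0, 0.25),
    "unnecessary": (0.0, 0.25, 0.5),
    "neutral": (0.25, 0.5, 0.75),
    "necessary": (0.5, 0.75, 1.0),
    "like": (0.75, 1.0, 1.0),
}

def membership_degrees(x):
    """Membership of a preference or dislike degree x in each standard answer."""
    return {name: triangular(x, *abc) for name, abc in KANO_FUZZY_NUMBERS.items()}
```

Under this normalization, a degree of 0.125 belongs equally to "dislike" and "unnecessary", and the five memberships of any degree in [0, 1] sum to 1, in line with the requirement stated in Section 2.2.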

Figure 3 illustrates the graphic
presentation of the triangular membership
function. The closer the value of the
preference/dislike degree is to one of the standard
Kano answers, the higher the membership degree to
it. For instance, when a $\mathit{preference}$ value $\beta$ is
located between 0 and 0.25, the
membership degrees to “dislike” and
“unnecessary” are $\alpha_1$ and $\alpha_2$, respectively.

In Table 2, we illustrate an example of a
customer’s membership degrees of preference
and dislike for aspect 1 in topic 0. Using Table
2 only, it is difficult to determine the proper
classification of the customer requirements.
Therefore, the customer’s membership degrees
of preference and dislike are transformed
into two five-element vectors, namely
$Pre = \{0.75, 0.21, 0.04, 0, 0\}$ and
$Dis = \{0, 0, 0, 0.91, 0.09\}$, as defined in (Lee and
Huang, 2009). Then, using the matrix
multiplication $Pre^{T} \otimes Dis$, a $5 \times 5$
two-dimensional Kano fuzzy relation matrix ‘$MS$’ is
obtained as:

 
$$MS = Pre^{T} \otimes Dis = \begin{bmatrix} 0 & 0 & 0 & 0.68 & 0.06 \\ 0 & 0 & 0 & 0.19 & 0.01 \\ 0 & 0 & 0 & 0.03 & 0.003 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad (6)$$

 
Following Table 1 from the literature, the customer requirements can also be written as a two-dimensional 5 × 5 evaluation matrix ‘𝑀𝐸’: 

 
$$ME = \begin{bmatrix} Q & A & A & A & O \\ R & I & I & I & M \\ R & I & I & I & M \\ R & I & I & I & M \\ R & R & R & R & Q \end{bmatrix} \qquad (7)$$

 
Once ‘MS’ is obtained, we sum the entries of the ‘MS’ matrix whose positions carry the same category in the evaluation matrix ‘ME’. As a result, the classification of the customer requirements is obtained as follows: 

 
$$R = \left\{ \frac{0.68}{A},\ \frac{0.013}{M},\ \frac{0.06}{O},\ \frac{0.22}{I},\ \frac{0}{R},\ \frac{0}{Q} \right\} \qquad (8)$$
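The two steps above (building the fuzzy relation matrix and summing its entries by Kano category) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors’ implementation; the function names `fuzzy_relation` and `aggregate` are hypothetical, and the products are kept unrounded, so the totals can differ in the last digits from the printed values, which appear to be truncated.

```python
# Membership vectors from Table 2, ordered as
# like, necessary, neutral, unnecessary, dislike.
PRE = [0.75, 0.21, 0.04, 0.0, 0.0]
DIS = [0.0, 0.0, 0.0, 0.91, 0.09]

# Kano evaluation matrix ME: one category letter per cell.
ME = [
    "QAAAO",
    "RIIIM",
    "RIIIM",
    "RIIIM",
    "RRRRQ",
]


def fuzzy_relation(pre, dis):
    """Outer product of the preference and dislike vectors (matrix MS)."""
    return [[p * d for d in dis] for p in pre]


def aggregate(ms, me):
    """Sum the entries of MS that share the same category in ME."""
    totals = {category: 0.0 for category in "AMOIRQ"}
    for i, row in enumerate(ms):
        for j, value in enumerate(row):
            totals[me[i][j]] += value
    return totals


MS = fuzzy_relation(PRE, DIS)
R = aggregate(MS, ME)

if __name__ == "__main__":
    for category, total in R.items():
        print(category, round(total, 4))
```

Because both membership vectors sum to 1, the six category totals also sum to 1, which makes the result easy to sanity-check.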

 
As mentioned earlier, the Kano model’s classification of requirements is qualitative and is judged to be ineffective for the quantitative evaluation of customer satisfaction. Therefore, Berger et al. (1993) proposed customer satisfaction coefficients that provide quantitative values of satisfaction and dissatisfaction in case of fulfillment or non-fulfillment of a customer requirement, as given in equations 9 and 10: 

 
$$CS_i^{+} = \frac{A_i + O_i}{A_i + O_i + M_i + I_i} \qquad (9)$$

$$CD_i^{-} = -\frac{O_i + M_i}{A_i + O_i + M_i + I_i} \qquad (10)$$

 
Table 2 An example of a customer’s membership degree to Kano’s standard answers for aspect 1 in Topic 0. S = standard answers; M = membership degrees; Nec = necessary; Neu = neutral; Unnec = unnecessary; Dis = dislike.

| M \ S | Like | Nec | Neu | Unnec | Dis |
|---|---|---|---|---|---|
| Preference | 75% | 21% | 4% | | |
| Dislike | | | | 91% | 9% |

Figure 3 The triangular membership function of the degree of preference/dislike to the Kano standard answers. 



where $CS_i^{+}$ and $CD_i^{-}$ are, respectively, the customer satisfaction and dissatisfaction coefficients of the $i$-th customer requirement, and $A_i$, $O_i$, $M_i$, and $I_i$ represent the probability distributions obtained from Kano’s evaluation for requirement $i$. Reverse and questionable requirements are ignored. Note that the minus sign in equation 10 emphasizes the negative impact on customer satisfaction, which decreases if these (one-dimensional and must-be) requirements are not fulfilled. On the other hand, the value of $CS_i^{+}$ is usually positive, indicating that customer satisfaction increases when these (attractive and one-dimensional) requirements are provided. 
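As a worked example of equations 9 and 10, the short Python sketch below computes both coefficients from the category values of equation 8. The function name `satisfaction_coefficients` is hypothetical, and the guard against an all-zero denominator is an addition made here for safety, not something stated in the text.

```python
def satisfaction_coefficients(a, o, m, i):
    """Return (CS+, CD-) following equations 9 and 10.

    a, o, m, i: shares of the attractive, one-dimensional, must-be and
    indifferent categories; reverse and questionable are ignored.
    """
    total = a + o + m + i
    if total == 0:
        raise ValueError("at least one of A, O, M, I must be non-zero")
    cs_plus = (a + o) / total
    cd_minus = -(o + m) / total
    return cs_plus, cd_minus


# Category values taken from equation 8.
cs, cd = satisfaction_coefficients(a=0.68, o=0.06, m=0.013, i=0.22)

if __name__ == "__main__":
    print(f"CS+ = {cs:.3f}, CD- = {cd:.3f}")
```

For this customer the satisfaction coefficient is close to 1 while the dissatisfaction coefficient is close to 0, which is the pattern expected of a requirement dominated by the attractive category.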

A positive satisfaction coefficient ranges from 0 to 1, while a negative satisfaction coefficient runs from 0 to -1. A value of zero implies no impact on customer satisfaction whether the requirement is met or not. The closer $CS_i^{+}$ is to 1, the higher the influence of meeting the requirement on customer satisfaction, and the closer $CD_i^{-}$ is to -1, the greater the influence of not meeting the requirement on customer dissatisfaction. In this way, all evaluated requirements can be represented graphically through a scatterplot, which is divided into four quadrants according to the satisfaction coefficient values. The X-axis is for CS+ and the Y-axis is for CD-. Each customer requirement can be assigned to a quadrant of the scatterplot based on the Kano requirements. As shown in Figure 4, the first quadrant stands for the one-dimensional requirements, the second quadrant stands for the attractive requirements, the third quadrant stands for the indifferent requirements, and the fourth quadrant stands for the must-be requirements. Therefore, in designing new products/services, priority should be given to requirements with a high CS+ and a CD- close to 0, i.e. attractive requirements, and when improving an existing product/service, more focus should be given to requirements with a high CS+ and a CD- of high magnitude, i.e. one-dimensional requirements. This rule guides a company’s decision-making team in determining which customer requirements have more impact on the company’s quality production process. 

 
3.4 Decision Making Analysis driven by Fuzzy-Kano and SWOT
In this module, we propose a bi-layered matrix 
that maps the Fuzzy-Kano outputs into the 
SWOT matrix in order to interpret the 
requirements from the customer and the 
provider perspectives, as shown in Figure 5. 
The upper matrix lists the requirements from 
the customer’s perspective. Its horizontal axis 
represents the fulfillment level of a 
requirement, deduced from the customer 
satisfaction and dissatisfaction coefficients 
previously calculated, while the other axis 
refers to the Fuzzy-Kano classification of the 
requirement. The upper matrix results are 
mapped into the SWOT matrix (lower matrix). 
SWOT is used as an analysis tool to provide 
insights about products by identifying their 
strengths and weaknesses (i.e. internal factors) 
along with potential opportunities and threats 
(i.e. external factors) (Phadermrod et al., 2019).  

As can be seen from Figure 5, the upper matrix includes six zones ranging from (a) to (f). Zone (a) contains unfulfilled must-be requirements. The product’s provider needs to fulfill these requirements in order to guarantee the minimum quality of the product. Zone (b) includes fulfilled must-be requirements, which means that the product already retains a minimum of quality. Zone (c) includes unfulfilled one-dimensional requirements. The product’s provider should invest more in improving these requirements in order to avoid customer dissatisfaction and increase customer satisfaction. Zone (e) contains unfulfilled attractive requirements. Even though these requirements will not cause customer dissatisfaction, since they are not expected by the customers, fulfilling them creates a product with a novel attractive aspect that can achieve unexpectedly positive effects. Zones (d) and (f) hold fulfilled one-dimensional and fulfilled attractive requirements, respectively. The product’s provider does not need to modify the product since those requirements are already at a high level of satisfaction. However, further effective improvements can dramatically raise customer satisfaction. The improvements to be made in the two zones differ: in (f) they are more innovative, while in (d) they are more realistic.

Figure 4 The Kano requirements classification according to customer satisfaction coefficients.

Figure 5 The KANO and SWOT bi-layered matrix.

In the lower matrix, the aforementioned 
zones are mapped to the SWOT matrix. Zones 
(a) and (c) include unfulfilled/must-be and 
unfulfilled/one-dimensional requirements 
which can be regarded as a weakness of the 
product or even a potential threat for the 
provider. Therefore, zones (a) and (c) can be put 
in the W-T cell. Zone (e) holds unfulfilled 
attractive requirements that can be 
interpreted differently depending on the 
studied case. They can be considered 
weaknesses that the product’s provider can 
minimize by further improving the product 
quality, turning those weaknesses into an 
opportunity. In this case, zone (e) can be put in 
the W-O cell. On the other hand, those 
requirements can be considered strengths if 
the provider includes them in the product and 
they were not expected by the customers. 
However, if these requirements do not meet the 
customers’ expectations, then they can become 
a potential threat. In this case, zone (e) can be 
put in the S-T cell. Zones (b), (d), and (f) 
respectively include the fulfilled/must-be, 
fulfilled/one-dimensional, and fulfilled/ 
attractive requirements that can be considered 
strengths since they can be easily fulfilled. In 
addition, adding new features to the product 
can be an opportunity to create a new market 
related to these features. Thus, these zones are 
put in the S-O cell. 

Note that the indifferent requirements are not considered in the bi-layered matrix, simply because they are of little or no consequence to the customer. So, the provider can ignore them to save time, cost, and resources.  
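The zone-to-cell mapping described in this module can be summarized as a lookup table. The Python sketch below is only an illustration: the boolean `fulfilled` flag is an assumption, since the fulfillment level is derived from the satisfaction coefficients without a stated cut-off, and the names `ZONES`, `SWOT_CELLS`, and `map_to_swot` are hypothetical.

```python
# Upper matrix: (Kano class, fulfilled?) -> zone label.
ZONES = {
    ("must-be", False): "a",
    ("must-be", True): "b",
    ("one-dimensional", False): "c",
    ("one-dimensional", True): "d",
    ("attractive", False): "e",
    ("attractive", True): "f",
}

# Lower matrix: zone -> candidate SWOT cells. Zone (e) has two possible
# readings depending on the studied case.
SWOT_CELLS = {
    "a": ["W-T"],
    "b": ["S-O"],
    "c": ["W-T"],
    "d": ["S-O"],
    "e": ["W-O", "S-T"],
    "f": ["S-O"],
}


def map_to_swot(kano_class, fulfilled):
    """Return (zone, SWOT cells), or None for indifferent requirements."""
    if kano_class == "indifferent":
        return None  # not considered in the bi-layered matrix
    zone = ZONES[(kano_class, fulfilled)]
    return zone, SWOT_CELLS[zone]


if __name__ == "__main__":
    print(map_to_swot("must-be", False))
    print(map_to_swot("attractive", False))
```

Keeping both candidate cells for zone (e) leaves the final choice between W-O and S-T to the analyst.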

 
4. EXPERIMENTS AND RESULTS 
In this section, we conduct a case study to 
evaluate the effectiveness and feasibility of the 
proposed framework using online mobile phone 
reviews collected from Amazon. In the 
following, we describe our dataset and show 
potential results. 
4.1 Dataset  

4.1.1 Preprocessing 
In order to evaluate the effectiveness and 
feasibility of the proposed framework, the first 
phase consists of collecting and preprocessing 
the required dataset. In this paper, a dataset of 
unlocked mobile phone reviews has been 
selected. This dataset was acquired from 
Amazon using the PromptCloud web scraping 
service (“PromptCloud”). It includes 
400,000 mobile phone reviews, containing 
product and customer information, ratings and 
plaintext reviews. In this study, we conducted 
the experiments on a subsample of the original 
dataset, which contains approximately 2000 
reviews.  
 
Table 3 Partial demonstration of experimental dataset.

| Review | Price | Rating |
|---|---|---|
| I feel so LUCKY to have found this used (phone to us & not used hard at all), phone on line from someone who upgraded and sold this one. My Son liked his old one that finally fell apart after 2.5+ yea... | 199.99 | 5.0 |
| It’s battery life is great. It’s very responsive to touch. The only issue is that sometimes the screen goes black and you have to press the top button several times to get the screen to re-illuminate. | 199.99 | 3.0 |

 
Table 3 illustrates some samples from the dataset. Each review includes a considerable amount of unnecessary data, which must be cleaned to reduce noise and to extract insightful information such as aspects and sentiments. The preprocessing operations applied in this work include tokenization, stop word removal, case transformation, stemming, and non-alphanumeric character removal. All the preprocessing operations were conducted using the Python NLTK toolkit (version 3.7). In addition, we grouped synonyms to reduce dimensionality by using a manually entered list of the most common synonyms, e.g. the words “cellphone”, “smartphone”, and “phones” are all transformed into “phone”. Negation handling is quite important in this study because it helps improve sentiment analysis accuracy. Therefore, we used the simplest approach proposed in (Das et al., 2001), which appends a negation tag “_NEG” to every word found between a negation and the first punctuation mark following it, so as to reverse the polarity of all these words while computing their scores. Misspelling is also taken into consideration since the reviews are usually hand-typed. Some predefined functions from the “autocorrect” package are used to deal with misspellings. POS tagging is used to find adjectives, which are considered sentiment words, as well as product aspects, where nouns (NN) and noun phrases (NNP) are considered potential aspect candidates.
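The negation-tagging rule can be illustrated with a short Python sketch. It is a simplified stand-in, not the code used in the study: the negation word list, the punctuation set, and the regex tokenizer are assumptions chosen for the example, and the name `tag_negation` is hypothetical.

```python
import re

# Hypothetical, deliberately short list of negation cues.
NEGATIONS = {"not", "no", "never", "cannot"}
PUNCTUATION = set(".,;:!?")


def tag_negation(tokens):
    """Append "_NEG" to every word between a negation cue and the next
    punctuation mark."""
    tagged = []
    negating = False
    for token in tokens:
        if token in PUNCTUATION:
            negating = False            # punctuation closes the scope
            tagged.append(token)
        elif token.lower() in NEGATIONS or token.lower().endswith("n't"):
            negating = True             # the cue itself is left untagged
            tagged.append(token)
        elif negating:
            tagged.append(token + "_NEG")
        else:
            tagged.append(token)
    return tagged


# Simple regex tokenizer used only for this example.
sentence = "The screen is not bright enough, but the sound is great."
tokens = re.findall(r"[\w']+|[.,;:!?]", sentence)
tagged_tokens = tag_negation(tokens)

if __name__ == "__main__":
    print(tagged_tokens)
```

In the example sentence only “bright” and “enough” fall inside the negation scope, so “great” keeps its original polarity when scores are computed.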
 
 
Table 4 Setting values for running LDA.

| Parameter settings | Values |
|---|---|
| Number of documents (𝑀) | 1593 |
| Number of topics (𝐾) | 20 |
| Number of iterations | 50 |
| 𝛼 = 1/𝐾 | 1/20 |
| 𝛽 = 1/𝐾 | 1/20 |

 
Table 5 List of aspects along with their sentiment polarity and scores for topic ID = 5.

| Aspect(s) | Polarity | Sentiment score |
|---|---|---|
| Battery safety | -1 | -0.72 |
| Booting time | -1 | -0.14 |
| Price | 1 | 0.53 |
| Speakers quality | 1 | 0.83 |
| Battery life | -1 | -0.57 |
| Shipping | 1 | 0.33 |
| Screen size | -1 | -0.92 |
| Internet speed | -1 | -0.10 |
| Weight | 1 | 0.69 |
| Camera resolution | 1 | 0.86 |

 
Moreover, we applied certain filtering 
operations, such as: excluding reviews without 
an adjective POS tag, since sentiments are 
mainly identified from adjectives; pruning 
words that are not recognized by the opinion 
lexicon or Wordnet; and keeping reviews in 
which an aspect appeared at least once. In the 
end, the final list was made up of 1763 reviews, 
which was split into 1593 reviews intended for 
training and 170 reviews for testing. The 
testing reviews were chosen randomly, and a 
new column was added, including aspects and 
the relative sentiments’ polarity. 
 

4.1.2 Extracting Topics and Constructing Aspect-Sentiment Pairs

Before proceeding with the LDA application, 
we prepared the data for phrase modeling, 
which consisted of grouping common words 
that often get a special meaning when they are 
used together. That is, we built bi-gram 
phrases from the reviews. Then, using the 
“GENSIM” library, we built our LDA model 
over the parameters cited in Table 4. The 
number of topics 𝐾 was set at 20 to avoid 
producing a general result with a lack of 
details. Moreover, a larger number of topics 
may take longer to converge. For the other 
parameters, GENSIM default values were 
used. 

Through the LDA model, we obtained the first output, namely the word-topic matrix. It included 20 meaningful topics, each represented as a weighted list of words in descending order. Figure 6 shows the first four topics with their top 20 most frequent words. Topics are identified only by an index; topic names can instead be defined manually by inferring them from the meanings of the relevant words. For instance, looking at the keywords of topic 1, we can summarize it as “phone screen and battery performance”. The second output generated by LDA was the document-topic matrix. An example of topic allocation to the first five documents (reviews) is illustrated in Figure 7.

Figure 6 List of top 20 keywords for the first four topics.

By extracting numerous aspects that 
customers are reviewing and their 
corresponding sentiments along with the 
accumulated sentiment scores calculated using 
equation 2, we gain insights into what 
negatively or positively impacts product 
reviews, as well as what the customers like or 
dislike about the product. Table 5 shows a 
partial list of such aspects along with their 
polarity classes and sentiment scores, grouped 
under topic ID 5.  
 

4.2 Evaluation and Results 
4.2.1 Results of the Extracting Aspect-Sentiment Pairs
To evaluate how the aspect-sentiment pair extraction approach performed, two sets of experiments were conducted: (i) measuring the effectiveness of the aspect extraction and (ii) measuring the effectiveness of the sentiment assignment to the correctly extracted aspects. In this regard, four performance metrics were used: accuracy (Acc), precision (P), recall (R), and F1-score (F1). Accuracy measures how often our model is correct, but when used alone it cannot be trusted to select a well-performing model. Therefore, we used the three other metrics to give more detailed insights into the performance characteristics of our method. Precision is the proportion of the results returned by our model that are relevant. A higher precision indicates more true positives and fewer false positives. On the other hand, recall expresses the proportion of all relevant results correctly classified by our model. High recall means fewer false negatives and more true positives. According to the confusion matrix notations (Ting, 2017), the accuracy, precision, and recall are computed respectively by the following equations:   

$$Acc = \frac{TP + TN}{TP + TN + FP + FN} \qquad (11)$$

$$P = \frac{TP}{TP + FP} \qquad (12)$$

$$R = \frac{TP}{TP + FN} \qquad (13)$$
 
where TP is the number of true positives, TN true negatives, FP false positives, and FN false negatives. The F1-score combines precision and recall and gives an overall view of the accuracy of the approach. The F1-score is given by: 

 
$$F_1 = 2 \cdot \frac{P \times R}{P + R} \qquad (14)$$
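Equations 11 to 14 translate directly into code. The Python sketch below is a minimal illustration with hypothetical function names; the confusion-matrix counts in the first call are made-up numbers for demonstration only, while the second call plugs in the precision and recall reported for experiment set (i).

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts
    (equations 11 to 14)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall, f1_score(precision, recall)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (equation 14)."""
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only; they are not taken from the experiments.
acc, p, r, f1 = classification_metrics(tp=8, tn=5, fp=2, fn=1)

# F1 recomputed from the precision and recall of experiment set (i).
f1_aspects = f1_score(0.924, 0.845)

if __name__ == "__main__":
    print(f"Acc={acc:.4f} P={p:.4f} R={r:.4f} F1={f1:.4f}")
    print(f"F1 for aspect extraction: {f1_aspects:.4f}")
```

Computing the F1-score from a reported precision and recall is a quick way to check that the figures in a results table agree with one another.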
 
In experiment set (i), TPs refer to the correctly extracted aspects. TNs are the aspects that were discarded by the model and did not appear in the test data either. FPs are words that the model classified as aspects but are not actually aspects. FNs are the aspects that the model labeled as not being aspects when they were actually aspects. In experiment set (ii), TPs refer to the aspects correctly classified with positive scores. FPs are the aspects incorrectly classified with positive scores. FNs are the aspects incorrectly classified with negative scores. 

 
Table 6 Performance results. Acc = accuracy; Pre = precision.

| Experiments set | Acc. | Pre | Recall | F1-score |
|---|---|---|---|---|
| (i) Aspects extraction | 97.4% | 92.4% | 84.5% | 88.27% |
| (ii) Sentiments assignment | 89.8% | 90.7% | 94.7% | 92.6% |

 
Table 6 reports the accuracy, precision, recall, and F1-score of the proposed aspect-sentiment pairs approach in experiment sets (i) and (ii). In (i), the model reports a high precision value (92.4%), meaning that most of the extracted aspects are actual aspects, with few FPs. The recall rate is 84.5%, suggesting that most of the actual aspects are retrieved, with few FNs. The F1-score is relatively high (88.27%), meaning that the model produces insightful results in terms of extracting the most discussed aspects of specific products. In (ii), the results differ from those of the first experiment set. In particular, the F1-score is 92.6%, which indicates that assigning the correct sentiment polarity performs better than the aspect extraction, which reports 88.27%. These results suggest that the extraction of aspect-sentiment pairs performs efficiently in identifying accurate aspects and assigning appropriate sentiments to them. This will help in feeding the Fuzzy-Kano model with accurate inputs, consequently providing valuable business insights.

Figure 7 Topic distribution for the first 5 documents. (Y-axis: probability, 0 to 100%; X-axis: Documents 1 to 5; legend: Topics 0, 9, 11, 15, and 19.)

4.2.2 Results of the Fuzzy-Kano Model

The Fuzzy-Kano model classified the ten 
aspects previously extracted into must-be, one-
dimensional, attractive, and indifferent 
requirements by calculating their degrees of 
preference and dislike. Table 7 highlights the 
findings of the assessed requirements’ 
classification along with their impact on 
customer satisfaction. 

According to the customer satisfaction 
coefficient (CS+/CD-) reported in Table 7, we 
can represent all the classified requirements 
via a scatterplot, as shown in Figure 8. 

 
Table 7 Fuzzy-Kano classification and customer satisfaction coefficients results. R.No. = requirement number; A. Req. = assessed requirements; Kano Class = Kano Classification.

| R. No. | A. Req. | Kano Class | CS+ | CD- |
|---|---|---|---|---|
| R0 | Battery safety | Must-be | 0.29 | -0.83 |
| R1 | Booting time | One-dimensional | 0.78 | -0.62 |
| R2 | Price | Indifferent | 0.06 | -0.05 |
| R3 | Speakers quality | One-dimensional | 0.54 | -0.58 |
| R4 | Battery life | Must-be | 0.46 | -0.89 |
| R5 | Shipping | Indifferent | 0.42 | -0.12 |
| R6 | Screen size | Attractive | 0.83 | -0.36 |
| R7 | Internet speed | One-dimensional | 0.60 | -0.70 |
| R8 | Weight | Attractive | 0.57 | -0.32 |
| R9 | Camera resolution | Attractive | 0.71 | -0.49 |

From Figure 8 and Table 7, the findings indicate that all the must-be requirements are battery-related, namely R0 and R4, since they have a higher level of dissatisfaction among the customers compared to other requirements. Furthermore, R1, R3, and R7 are all one-dimensional requirements, which implies that customers expect the companies to improve the performance of these product requirements. On the other hand, attractive requirements such as R6 and R9 have a greater impact on satisfaction if fulfilled, while R8 has a relatively lower impact on customer satisfaction when compared to R1. The indifferent attributes, R2 and R5, reflect a low impact on customer satisfaction and dissatisfaction; thus, they should be the last to be addressed, after the three other requirement categories. 
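The quadrant reading of the coefficients can be checked against Table 7 with a small Python sketch. The cut-off of 0.5 on each axis is an assumption made here, since the quadrant boundaries are not stated numerically; under that assumption the rule reproduces all ten Kano classes listed in Table 7. The names `TABLE_7` and `quadrant_class` are hypothetical.

```python
# (requirement, Kano class, CS+, CD-) copied from Table 7.
TABLE_7 = [
    ("R0", "Must-be", 0.29, -0.83),
    ("R1", "One-dimensional", 0.78, -0.62),
    ("R2", "Indifferent", 0.06, -0.05),
    ("R3", "One-dimensional", 0.54, -0.58),
    ("R4", "Must-be", 0.46, -0.89),
    ("R5", "Indifferent", 0.42, -0.12),
    ("R6", "Attractive", 0.83, -0.36),
    ("R7", "One-dimensional", 0.60, -0.70),
    ("R8", "Attractive", 0.57, -0.32),
    ("R9", "Attractive", 0.71, -0.49),
]


def quadrant_class(cs_plus, cd_minus, threshold=0.5):
    """Assign a Kano class from the coefficient quadrant.

    Assumption: both axes are split at `threshold` (0.5 by default).
    """
    high_satisfaction = cs_plus >= threshold
    high_dissatisfaction = abs(cd_minus) >= threshold
    if high_satisfaction and high_dissatisfaction:
        return "One-dimensional"
    if high_satisfaction:
        return "Attractive"
    if high_dissatisfaction:
        return "Must-be"
    return "Indifferent"


predicted = {req: quadrant_class(cs, cd) for req, _, cs, cd in TABLE_7}

if __name__ == "__main__":
    for req, kano, cs, cd in TABLE_7:
        print(req, kano, predicted[req])
```

A different cut-off would move borderline requirements such as R3 or R9 into another quadrant, so the threshold should be treated as a tunable parameter.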

4.2.3 Fuzzy-Kano and SWOT Mapping and Analysis Results

In this section, the identified requirements are 
mapped to the bi-layered matrix. First, they 
are classified according to the Fuzzy-Kano 
model from the customer’s perspective, then, 
classified according to the SWOT method from 
the provider’s perspective. The results of the 
mapping are shown in Figure 9. 

Considering the aforementioned results and the analysis reported in the fourth module of our proposed framework, R0 and R4 must be fulfilled to guarantee the minimum quality of the product and meet the customers’ requirements. These requirements are assigned to W-T, which motivates the provider to improve the battery performance, including safety and durability. In addition, internet speed (R7) is considered W-O from the provider’s perspective. Therefore, further enhancements of R7 will not only increase customer satisfaction but also decrease customer dissatisfaction. Requirements in zones (d) and (f), such as booting time (R1), speaker quality (R3), and weight (R8), are included in S-O, which means that those requirements are easy to fulfill, and further improvements by the provider will lead to a higher level of customer satisfaction than the current level. The requirements in zone (e) are related to S-T. Even though R9 and R6 are not expected by the customers, the provider should be able to assess the customers’ preferences and overcome the current threat by adding new value to the product, e.g. by improving the camera resolution.

Figure 8 The representation of the Fuzzy-Kano classification results according to CS+ and CD-.

 
5. CONCLUSION 
A good understanding of customer satisfaction 
is important for the survival of any company in 
today’s competitive market. No business can 
deny the critical role of the customers’ voices in 
increasing customer satisfaction. However, 
drawing insights from a huge amount of VOC 
data is challenging. Thus, companies resort to 
BI methods and tools to extract actionable 
information for improving their products and 
meeting their customers’ needs. 

This study proposes a decision-making 
framework for assisting companies in 
understanding their customers’ satisfaction 
through extracting meaningful insights from 
online VOC data. The proposed framework 
consists of four main modules: data extraction 
and preprocessing, aspect-sentiment pairs 
extraction using LDA, requirement 
classification based on the Fuzzy-Kano model, 
and decision-making analysis driven by Fuzzy-
Kano and SWOT.  

A case study including online reviews of 
mobile phones is considered to evaluate the 
performance of the aspect-sentiment pair 
extraction module based on several metrics 
including the accuracy, precision, recall, and F-
score. The results showed that the aspects were 
correctly extracted with an accuracy of 97.4% 
and a precision of 92.4%. Additionally, 
the sentiments were accurately assigned to the 
extracted aspects with an accuracy of 89.8% and 
a precision of 90.7%. These results 
constitute an accurate VOC input to feed the 
Fuzzy-Kano model. They allow us to classify 
the customer requirements that affect their 
satisfaction into four main categories: must-be, 
one-dimensional, attractive, and indifferent. 
Then, we can map them dynamically to the 
SWOT matrix in order to provide valuable and 
interpretable insights for companies. 

This framework has some potential 
limitations that serve as a direction for future 
work. First, the study is conducted on online 
reviews which are assumed to be hand-typed 
and written by honest reviewers (i.e. not fake). 
However, if these reviews have been 
maliciously manipulated, they may impact the 
analysis process and result in biased decisions. 
An efficient spam review detection technique 
would be needed to identify whether the 
reviews are real or fake. 

In addition, the aspect-sentiment pairs extraction module deals only with explicit aspects and does not tackle implicit ones. For example, in the sentence “The battery of this phone is pretty good”, the aspect “battery” appears explicitly. However, in the sentence “The phone lasts all day”, the aspect “battery” is implicit because it is not stated directly, but only inferred from the meaning of the sentence.

Figure 9 Requirements mapping results.

Furthermore, the dynamics of the Fuzzy-Kano model, i.e. the evolution of the customer requirements over time, are not included. For example, current attractive requirements can be transformed into must-be requirements in the coming years. 

6. REFERENCES 

Aguwa, C.C., Monplaisir, L., Turgut, O., 2012. 
Voice of the customer: Customer satisfaction 
ratio based analysis. Expert Systems with 
Applications 39, 10112–10119. 
https://doi.org/10.1016/j.eswa.2012.02.071 

Alghamdi, R., Alfalqi, K., 2015. A survey of topic 
modeling in text mining. Int. J. Adv. Comput. 
Sci. Appl.(IJACSA) 6. 

Berger, C.C., Blauth, R.E., Boger, D., 1993. 
Kano’s methods for understanding customer-
defined quality. 

Blei, D.M., 2012. Probabilistic Topic Models. 
Commun. ACM 55, 77–84. 
https://doi.org/10.1145/2133806.2133826 

Carulli, M., Bordegoni, M., Cugini, U., 2013. An 
approach for capturing the Voice of the 
Customer based on Virtual Prototyping. J 
Intell Manuf 24, 887–903. 
https://doi.org/10.1007/s10845-012-0662-5 

Culotta, A., Cutler, J., 2016. Mining Brand 
Perceptions from Twitter Social Networks. 
Marketing Science 35, 343–362. 
https://doi.org/10.1287/mksc.2015.0968 

Darling, W.M., 2011. A theoretical and practical 
implementation tutorial on topic modeling and 
gibbs sampling, in: Proceedings of the 49th 
Annual Meeting of the Association for 
Computational Linguistics: Human Language 
Technologies. pp. 642–647. 

Das, S.R., Chen, M.Y., Agarwal, T.V., Brooks, C., 
Chan, Y., Gibson, D., Leinweber, D., Martinez-
jerez, A., Raghubir, P., Rajagopalan, S., 
Ranade, A., Rubinstein, M., Tufano, P., 2001. 
Yahoo! for amazon: Sentiment extraction from 
small talk on the web, in: 8th Asia Pacific 
Finance Association Annual Conference. 

Decker, R., Trusov, M., 2010. Estimating 
aggregate consumer preferences from online 
product reviews. International Journal of 
Research in Marketing 27, 293–307. 
https://doi.org/10.1016/j.ijresmar.2010.09.001 

Farhadloo, M., Patterson, R.A., Rolland, E., 2016. 
Modeling customer satisfaction from 
unstructured data using a Bayesian approach. 
Decision Support Systems 90, 1–11. 
https://doi.org/10.1016/j.dss.2016.06.010 

Farhadloo, M., Rolland, E., 2013. Multi-Class 
Sentiment Analysis with Clustering and Score 
Representation, in: 2013 IEEE 13th 
International Conference on Data Mining 
Workshops. Presented at the 2013 IEEE 13th 
International Conference on Data Mining 
Workshops, pp. 904–912. 
https://doi.org/10.1109/ICDMW.2013.63 

Gioti, H., Ponis, S.T., Panayiotou, N., 2018. Social 
business intelligence: Review and research 
directions. Journal of Intelligence Studies in 
Business 8. 

Goodman, J., 2014. Customer experience 3.0: 
High-profit strategies in the age of techno 
service. Amacom. 

Guo, Y., Barnes, S.J., Jia, Q., 2017. Mining 
meaning from online ratings and reviews: 
Tourist satisfaction analysis using latent 
dirichlet allocation. Tourism Management 59, 
467–483. 
https://doi.org/10.1016/j.tourman.2016.09.009 

Hofmann, T., 2017. Probabilistic Latent Semantic 
Indexing. SIGIR Forum 51, 211–218. 
https://doi.org/10.1145/3130348.3130370 

Hu, M., Liu, B., 2004a. Mining Opinion Features 
in Customer Reviews, in: AAAI. 

Hu, M., Liu, B., 2004b. Mining and Summarizing 
Customer Reviews, in: Proceedings of the 
Tenth ACM SIGKDD International 
Conference on Knowledge Discovery and Data 
Mining, KDD ’04. ACM, New York, NY, USA, 
pp. 168–177. 
https://doi.org/10.1145/1014052.1014073 

Jia, S.S., 2018. Leisure Motivation and 
Satisfaction: A Text Mining of Yoga Centres, 
Yoga Consumers, and Their Interactions. 
Sustainability 10, 4458. 

KANO, N., 1984. Attractive quality and must-be 
quality. Hinshitsu (Quality, the Journal of 
Japanese Society for Quality Control) 14, 39–
48. 

Lee, H., Han, J., Suh, Y., 2014. Gift or threat? An 
examination of voice of the customer: The case 
of MyStarbucksIdea.com. Electronic 
Commerce Research and Applications 13, 
205–219. 


Lee, Y.-C., Huang, S.-Y., 2009. A new fuzzy 
concept approach for Kano’s model. Expert 
Systems with Applications 36, 4479–4484. 
https://doi.org/10.1016/j.eswa.2008.05.034 

Lu, Y., Mei, Q., Zhai, C., 2011. Investigating task 
performance of probabilistic topic models: an 
empirical study of PLSA and LDA. Inf 
Retrieval 14, 178–203. 
https://doi.org/10.1007/s10791-010-9141-9 

Miller, G.A., 1995. WordNet: a lexical database 
for English. Communications of the ACM 38, 
39–41. 

Nyblom, M., Behrami, J., Nikkilä, T., Solberg 
Søilen, K., 2012. An evaluation of Business 
Intelligence Software systems in SMEs-a case 
study. Journal of Intelligence Studies in 
Business 2, 51–57. 

Park, Y., Lee, S., 2011. How to design and utilize 
online customer center to support new product 
concept generation. Expert Systems with 
Applications 38, 10638–10647. 
https://doi.org/10.1016/j.eswa.2011.02.125 

Phadermrod, B., Crowder, R.M., Wills, G.B., 
2019. Importance-Performance Analysis 
based SWOT analysis. International Journal 
of Information Management 44, 194–203. 
https://doi.org/10.1016/j.ijinfomgt.2016.03.009 

PromptCloud: Fully Managed Web Scraping 
Service, n.d. URL 
https://www.promptcloud.com/ (accessed 
9.24.19). 

Qi, J., Zhang, Z., Jeon, S., Zhou, Y., 2016. Mining 
customer requirements from online reviews: A 
product improvement perspective. 
Information & Management, Big Data 
Commerce 53, 951–963. 
https://doi.org/10.1016/j.im.2016.06.002 

Rese, A., Sänn, A., Homfeldt, F., 2015. Customer 
integration and voice–of–customer methods in 
the German automotive industry. 
International Journal of Automotive 
Technology and Management. 

Reyes, G., 2016. Understanding non response 
rates: insights from 600,000 opinion surveys. 

Sabanovic, A., Søilen, K.S., 2012. Customers’ 
Expectations and Needs in the Business 
Intelligence Software Market. Journal of 
Intelligence Studies in Business 2. 

Saura, J.R., Palos-Sanchez, P., Grilo, A., 2019. 
Detecting indicators for startup business 
success: Sentiment analysis using text data 
mining. Sustainability 11, 917. 

Søilen, K.S., Tontini, G., Aagerup, U., 2017. The 
perception of useful information derived from 
Twitter: A survey of professionals. Journal of 
Intelligence Studies in Business, 7(3). 

Szolnoki, G., Hoffmann, D., 2013. Online, face-to-
face and telephone surveys—Comparing 
different sampling methods in wine consumer 
research. Wine Economics and Policy 2, 57–66. 
https://doi.org/10.1016/j.wep.2013.10.001 

Ting, K.M., 2017. Confusion Matrix, in: Sammut, 
C., Webb, G.I. (Eds.), Encyclopedia of Machine 
Learning and Data Mining. Springer US, 
Boston, MA, pp. 260–260. 
https://doi.org/10.1007/978-1-4899-7687-1_50 

Tirunillai, S., Tellis, G.J., 2014. Mining 
Marketing Meaning from Online Chatter: 
Strategic Brand Analysis of Big Data Using 
Latent Dirichlet Allocation. Journal of 
Marketing Research 51, 463–479. 
https://doi.org/10.1509/jmr.12.0106 

Tontini, G., Solberg Søilen, K., Silveira, A., 2013. 
How interactions of service attributes affect 
customer satisfaction: A study of the Kano 
model’s attributes. Total Quality Management 
& Business Excellence 24, 1253–1271. 

Ullah, A.M.M.S., Tamaki, J., 2011. Analysis of 
Kano-model-based customer needs for product 
development. Systems Engineering 14, 154–
172. https://doi.org/10.1002/sys.20168 

Umoh, U.A., Isong, B.E., 2013. Fuzzy logic based 
decision making for customer loyalty analysis 
and relationship management. International 
Journal on Computer Science and 
Engineering 5, 919. 

Xiao, S., Wei, C.-P., Dong, M., 2016. Crowd 
intelligence: Analyzing online product reviews 
for preference measurement. Information & 
Management 53, 169–182. 
https://doi.org/10.1016/j.im.2015.09.010 

Xu, X., Li, Y., 2016. The antecedents of customer 
satisfaction and dissatisfaction toward 
various types of hotels: A text mining 
approach. International Journal of Hospitality 
Management 55, 57–69. 
https://doi.org/10.1016/j.ijhm.2016.03.003