International Journal of Interactive Mobile Technologies (iJIM) – eISSN: 1865-7923 – Vol  16 No  24 (2022)


Paper—Machine Learning for Feeling Analysis in Twitter Communications: A Case Study in HEYDRU!... 

Machine Learning for Feeling Analysis in Twitter 
Communications: A Case Study in HEYDRU!, Perú 

https://doi.org/10.3991/ijim.v16i24.35493  

Rosa Alegre-Veliz1, Pedro Gaspar-Ortiz1, Javier Gamboa-Cruzado2,
Liset Rodríguez Baca1, Waldy Grandez Pizarro3, Rosa Menéndez Mueras2,
Carlos Chávez Herrera2
1 Universidad Autónoma del Perú, Lima, Perú 

2 Universidad Nacional Mayor de San Marcos, Lima, Perú 
3 Universidad de San Martín de Porres, Lima, Perú 

jgamboac@unmsm.edu.pe 

Abstract—Feeling analysis has become a trend, above all in digital product
development companies, where rapid and automatic analysis is essential. Feeling
analysis uses software to deal with emotions expressed in text, and it plays an
increasingly important role in workplaces. The constant growth of social networks,
especially Twitter, has widened the scope of what companies must understand about
the needs of their users or clients; it has therefore increased the complexity of
analyzing this social network, causing excessive expenses in time, personnel, and
money. This work presents a solution that applies Machine Learning (ML) to feeling
analysis in order to improve the analysis, its execution time, and customer
satisfaction. The scope of this research is limited to the Support Vector Machine
(SVM), a supervised learning technique. The model is derived from this ML technique
using cross validation. CRISP-ML(Q) is the applied methodology. The results show
that the use of ML allows efficient feeling analysis in Twitter communications.

Keywords—machine learning, feeling analysis, Twitter, algorithms, classifica-
tion, CRISP-ML(Q), SVM 

1 Introduction 

Today, attentive and effective service to the user is promoted as a new competitive
value within companies, and the position of the consumer as sovereign is evident.
The objective is therefore to avoid a poor relationship between the company and its
users and customers. Such a deterioration is not always clearly perceived by both
parties, but its effects are costly. The exploration and analysis of the content of
social networks has aroused the interest of both researchers and companies in
general.

Following these guidelines, [4] mentions that obtaining sentiment data from tweets
requires building learning models and obtaining labeled data, which are usually
difficult and expensive to obtain. For this reason, it promotes semi-supervised
learning, which makes use of a vast amount of data to classify the sentiment of
tweets. Likewise, [9] states that, given the large amount of data on audiences,
communicators should use micro-segmentation supported by data analysis; to that end
it applies the CRISP-DM methodology, relying on data analysis models that perform
clustering, prediction, and/or classification of data on a massive scale.

On the other hand, [19] studies the frequency of conversations about menthol
cigarettes in order to characterize the tweets and their sentiment. Using SVM
classifiers, a large amount of data was extracted and analyzed; one of the results
shows that 47% of the analyzed data was positive, although such content can convey
ideas that lead to addiction, in contrast to a greater number of negative opinions.
As [22] mentions, supporting the creation of successful startups requires a model
that analyzes tweets, with a feeling analysis supported by the SVM algorithm, to
determine how positive or negative comments about a startup can predict its
popularity [13]. Accordingly, a business plan can be generated, and continuous
feeling analysis can be carried out to keep the market active regarding the startup.

Currently, Machine Learning techniques are increasingly relevant at the business
level in terms of automation and restructuring; to this end, indicators backed by
adequate information should serve as the basis for guiding change in the company's
decision-making. Systematic data analysis should also be performed to develop a
predictive model; thus, the automatic detection of sentiments in tweets becomes a
powerful and useful tool for analyzing social networks and many other applications.

Given this worrisome reality, i.e., the inefficiency of software solutions for
sentiment analysis in Twitter communications using Machine Learning worldwide, the
present research aims to help close this technological and business gap.

The main objective of this research is to study the application of Machine Learning
to sentiment analysis in the technology company Heydru!, which is used as a case
study. It shows how Machine Learning can efficiently perform sentiment analysis in
the company and automate this process in its marketing and sales areas. The
proposed approach is to use decentralized technology, by which it is intended to
develop a system for issuing certificates.

2 Background and related works 

Currently, Machine Learning is one of the most popular ways to examine emotional
behaviors; it produces intelligent algorithms that can learn without relying on
rule-based programming. The application of machine learning has been prioritized in
various fields, chiefly the business environment [14][15]. Accordingly, different
agencies are adopting machine learning in their processes [8].

In recent years, feeling analysis has become a frequent research topic due to the
great demand in the market and the need to analyze public opinion [10]. Over time,
new techniques have appeared, as well as libraries and tools for sentiment analysis
processing [1]. With the rise of social networks, users also have all kinds of
facilities to express their opinions on different topics of interest [4]. Being
aware of the opinions regarding a brand or product and measuring their impact is
currently of vital importance for all companies, since the image of the company is
at stake [11].

There is also an intense application of technologies in analysis [2]. It is worth
noting the recent prevalence of social networks, especially Twitter, which generates
a large amount of data and messages that can be linked to a live event anywhere in
the world [5]. Given this diversity of audiences, they can be segmented by
geographical area or by hashtag [9]. Regarding the classification of the sentiment
of messages on Twitter, previous studies cover different themes and languages;
among them, the inclusion of emoticons is considered a relevant element to support
the context and to increase the accuracy of the model [14].

On the other hand, hashtags can be used as a labeling system to build the classifier
[12], while the semantics-based approach suggests the removal of stop words, that
is, words without apparent meaning of their own, such as the articles “a”, “an”, or
“the” [2]. In addition, Machine Learning technology provides the Support Vector
Machine (SVM) algorithm [3][15]. Machine Learning has also been applied in the field
of customer service: the analysis provided by the algorithm alerts the service team
to any new issues that need to be taken into account, which makes it possible to
prepare a plan or strategy [17].

This generates contributions in various fields in addition to the business field;
for example, it is a resource in presidential elections [2], where candidates are
classified according to positive and negative indicators to estimate who the winner
will be [6].

3 Research methodology 

3.1 Machine learning application development methodology 

For the development of the solution in this research, the CRISP-ML(Q) methodology
was used; it was recently proposed by the machine learning community to ensure the
quality of the project's results. Each phase is carefully maintained to reduce the
risk of performance degradation over time. It has six phases (see Figure 1):
business and data understanding; data engineering (data preparation); machine
learning model engineering; quality assurance for machine learning applications;
deployment; and monitoring and maintenance [31].


  
Fig. 1. CRISP-ML(Q) methodology workflow 

3.2 Applied research method 

Operationalization of the variables: the indicators and their details, considered in the 
research, are shown in Table 1. 

Table 1.  Operationalization of the dependent variable 

Indicator | Index | Unit of Measurement
Analysis time | [1-12] | Days
Analysis cost | [5400 – 108000] | Nuevo sol
Number of people involved | [1-11] | People
Acceptance level | Totally disagree, In disagreement, Neither agree nor disagree, In agreement, Totally agree | Likert scale

 
Research Design. This research uses a pure experimental design, which applies the
Post Test method to the Experimental Group (Ge) and to the Control Group (Gc). The
elements of the sample are chosen randomly (R), and a stimulus or experimental
condition (X) is applied to them. When the stimulus (the Machine Learning solution)
is applied to the Ge, values for the indicators are obtained (O1); when the stimulus
is not applied, values for the Gc are obtained (O2).

RGe     X     O1 
RGc     --     O2 
Universe and sample. The universe comprises all sentiment analysis processes for
Twitter communications in marketing agencies in Peru. The sample consists of
sentiment analysis processes for Twitter communications, with n = 30 transactions.

Data collection procedures. The direct observation technique was applied in the
research; an observation sheet was used as the instrument to collect the data for
each study indicator. This technique was applied from the beginning.

Statement of hypotheses. H1: If a Machine Learning application developed using
CRISP-ML(Q) is used, then the time to analyze feelings in Twitter communications is
reduced.

H2: If a Machine Learning application developed using CRISP-ML(Q) is used, then the
cost to analyze feelings in Twitter communications is reduced.

H3: If a Machine Learning application developed using CRISP-ML(Q) is used, then the
number of people involved in analyzing feelings in Twitter communications is
reduced.

H4: If a Machine Learning application developed using CRISP-ML(Q) is used, then the
end user's level of acceptance of the analysis of feelings in Twitter
communications is improved.

For the hypothesis test, with the purpose of contrasting each hypothesis, the
following was proposed. For the indicators that are expected to decrease:

μ1 = Population Mean (H1, H2, H3) for Gc PosTest
μ2 = Population Mean (H1, H2, H3) for Ge PosTest
Where: Ho: μ1 ≤ μ2 and Ha: μ1 > μ2

For the indicator that is expected to increase:

μ1 = Population Mean (H4) for Gc PosTest
μ2 = Population Mean (H4) for Ge PosTest
Where: Ho: μ1 ≥ μ2 and Ha: μ1 < μ2

To test the hypotheses, two statistical tests were applied using the Minitab
software: Student's t test (quantitative values) and the Mann-Whitney U test
(qualitative values).
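As an illustration of the two test statistics named above, the following Python sketch computes a pooled-variance Student's t statistic and a Mann-Whitney statistic for two independent samples. It is not the Minitab procedure used in the study: the function names and the sample values are hypothetical, equal variances are assumed for the t statistic, and the p-values are left to statistical software.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(control, experimental):
    """Two-sample Student's t statistic for mean(control) - mean(experimental).

    Assumes equal population variances (pooled variance estimate).
    """
    n1, n2 = len(control), len(experimental)
    pooled_var = ((n1 - 1) * variance(control)
                  + (n2 - 1) * variance(experimental)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(control) - mean(experimental)) / standard_error


def mann_whitney_u(control, experimental):
    """U statistic of the control sample: pairs where control > experimental.

    Ties contribute 0.5, which is the usual midrank convention.
    """
    u = 0.0
    for c in control:
        for e in experimental:
            if c > e:
                u += 1.0
            elif c == e:
                u += 0.5
    return u


def rank_sum_w(control, experimental):
    """Rank sum W of the control sample, obtained from U as W = U + n1(n1+1)/2."""
    n1 = len(control)
    return mann_whitney_u(control, experimental) + n1 * (n1 + 1) / 2


# Hypothetical analysis times in days (not data from the study).
gc_days = [10, 9, 11, 8, 12]   # control group, no ML solution
ge_days = [4, 3, 5, 2, 6]      # experimental group, ML solution applied

print(pooled_t_statistic(gc_days, ge_days))
print(mann_whitney_u(gc_days, ge_days), rank_sum_w(gc_days, ge_days))
```

A large positive t value supports Ha: μ1 > μ2 for an indicator that is expected to decrease. For ordinal data such as Likert responses, the rank-based statistic is the appropriate choice because it does not rely on the distances between categories.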

4 Case study 

The research was carried out for a specific case in the organization called Heydru! 
This company’s main activity is to develop software in Peru. The architecture of the 
solution is shown in Figure 2.  


 
Fig. 2. Twitter data analysis workflow 

Next, the case of the company Heydru! is described in detail, following step by
step the life cycle of the CRISP-ML(Q) methodology shown in Figure 1.

4.1 Business and data understanding 

Three success criteria are defined to validate the first phase. The business
criterion seeks to reduce the time spent on investigation, report creation, and
decision making. The machine learning criteria comprise performance, evaluated with
the F1 score, and soft measures that assess robustness, explainability, scalability,
complexity, and resource demand. Finally, the data feasibility criterion covers the
availability of data and resources and the regulatory constraints.

4.2 Data engineering (data preparation) 

This phase is divided into four subphases. In the first, data selection, the
following fields are selected: creation_date, id, text, source, reply_to_tweet,
reply_to_user_id, reply_to_user_name. The second, data cleaning, removes links,
mentions, hashtags, multiple spaces, and special characters, converts the text to
lowercase, and eliminates duplicates. In the third, feature engineering, no new
features are added to the data or derived from existing ones. The last subphase,
data standardization, defines functions that guarantee the reproducibility of the
application: store_tweet, divifitData, cross_validation and plot_matConfusion.
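The cleaning steps listed above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the function names clean_tweet and clean_corpus, the regular expressions, and the decision to drop hashtags entirely (rather than only the # symbol) are assumptions.

```python
import re


def clean_tweet(text):
    """Apply the cleaning steps to a single tweet and return the cleaned text."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove links
    text = re.sub(r"@\w+", " ", text)                     # remove mentions
    text = re.sub(r"#\w+", " ", text)                     # remove hashtags
    text = text.lower()                                   # lowercase
    # Remove special characters, keeping letters (including Spanish ones) and digits.
    text = re.sub(r"[^a-z0-9áéíóúüñ\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()              # collapse multiple spaces


def clean_corpus(tweets):
    """Clean every tweet, dropping empty results and duplicates (first one is kept)."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        text = clean_tweet(tweet)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


print(clean_tweet("Great   STARTUP!! @heydru #tech https://t.co/abc"))  # great startup
```

Removing duplicates after cleaning, rather than before, also catches tweets that differ only in links, mentions, or letter case.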

4.3 Engineering machine learning models 

First, the quality measure is defined, and a quality and validity check of the
model to be used is carried out. The F1 score is used; this metric combines the
precision and recall metrics:

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{1}$$
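As a worked example of Equation (1), the sketch below computes F1 from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the study.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 as the harmonic mean of precision and recall, per Equation (1)."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    if precision + recall == 0:
        return 0.0  # convention when there are no true positives
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision = 40/50 = 0.8, recall = 40/60 = 0.667.
print(round(f1_score(40, 10, 20), 4))  # 0.7273
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which makes it stricter than a simple average of the two.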

For the development of the product, the SVM (support vector machine) algorithm is
used, since a classification model is required and SVM was considered the most
suitable way to build it, working from a database and logical construction
diagrams. The greatest difficulty encountered when testing the model on the data
was data cleaning: the expected automation had not been planned, so a function for
adequate and uniform cleaning of the data had to be created. In this way, a data
set suitable for the application of the model was obtained.
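To make the combination of SVM and cross validation concrete, the following self-contained Python sketch trains a linear SVM by subgradient descent on the hinge loss and evaluates it with k-fold cross validation. It is an illustration under stated assumptions, not the model built in the study: the toy tweets, the bag-of-words features, the omission of a bias term, the hyperparameters, and all function names are hypothetical, and a production system would more likely rely on an established library.

```python
def build_vocabulary(texts):
    """Sorted list of the distinct words in the corpus."""
    return sorted({word for text in texts for word in text.split()})


def featurize(text, vocabulary):
    """Bag-of-words vector: how often each vocabulary word occurs in the text."""
    words = text.split()
    return [float(words.count(term)) for term in vocabulary]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(X, y, epochs=50, learning_rate=0.1, reg=0.01):
    """Subgradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1; no bias term is learned (an assumption of this sketch).
    """
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * dot(w, xi)
            # Regularization step: shrink the weights slightly.
            w = [wj * (1.0 - learning_rate * reg) for wj in w]
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1.0:
                w = [wj + learning_rate * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


def k_fold_indices(n, k):
    """Split the indices 0..n-1 into k interleaved folds."""
    return [list(range(start, n, k)) for start in range(k)]


def cross_validate(X, y, k=3):
    """Accuracy of the SVM on each held-out fold."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        train_idx = [i for i in range(len(X)) if i not in test_idx]
        w = train_linear_svm([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(w, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores


# Toy, already-cleaned tweets (hypothetical); labels: +1 positive, -1 negative.
tweets = [
    "good great", "bad awful", "great love", "awful hate",
    "love good", "hate bad", "good great love", "bad awful hate",
    "great", "awful", "love good great", "hate bad awful",
]
labels = [1, -1] * 6
vocabulary = build_vocabulary(tweets)
X = [featurize(tweet, vocabulary) for tweet in tweets]
fold_scores = cross_validate(X, labels, k=3)
print(fold_scores)  # [1.0, 1.0, 1.0] on this separable toy data
```

Accuracy is used here only to keep the sketch short; the per-fold score could equally be the F1 metric defined in Equation (1).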

4.4 Quality assurance for machine learning applications 

For this phase, approval and verification were obtained from an expert, who analyzed
the data and the respective model against the expected results. For this, the CEO
of the company Heydru! was introduced to an expert who validates the technologies
to be implemented in the company; additionally, the project was presented to the
person in charge of the marketing area to validate the operation of the project's
product.

4.5 Solution deployment 

The hardware requirements to deploy the project are as follows: 

─ Laptop: Core i7 9th Gen 3GHz, 8GB RAM 1600MHz, 500GB SSD, 2GB Integrated 
Graphics, Windows 10 Pro OS. 

─ Computer: Core i7 5th Generation 2.5GHz Processor, 12GB RAM 1800MHz, 
500GB SSD, NVIDIA 4GB, Windows 10 Pro OS. 

On the other hand, a contingency plan was established: a Script 2 must be generated, 
the queries are kept as a base, and the list of variables is established beforehand. It is 
verified that the input variables can only be manageable. 

For the implementation and deployment, online tests are carried out and one of them 
is the A/B test. 

To make use of the model, a web system is developed so that the end user can
analyze the feelings of their users (see Figures 3, 4, 5 and 6).


 
Fig. 3. Login of the sentiment analysis system 

 
Fig. 4. Inputs section and sentiment analysis process 


 
Fig. 5. Presentation of the sentiment analysis interface based on the word startup 

       
Fig. 6. Sentiment analysis interface in responsive format 

4.6 Monitoring and maintenance 

In this phase, practices are established to avoid a drop in the model's
performance: the model is supervised and evaluated continuously, and it is decided
when it needs to be trained again.

The ML model is then updated. In addition to monitoring and retraining, it is
valuable to reflect on the business use case and the ML task in order to adjust the
ML process.
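One possible way to decide when retraining is needed is sketched below in Python. The rule, the window size, the tolerance, and the function name are assumptions made for illustration; the source does not specify how the decision is taken.

```python
def needs_retraining(recent_f1_scores, baseline_f1, tolerance=0.05, window=3):
    """Flag retraining when the mean of the last `window` F1 scores falls more
    than `tolerance` below the F1 measured at deployment time."""
    if len(recent_f1_scores) < window:
        return False  # not enough monitoring data yet
    recent = recent_f1_scores[-window:]
    return sum(recent) / window < baseline_f1 - tolerance


# Hypothetical weekly F1 scores measured on freshly labeled tweets.
print(needs_retraining([0.82, 0.80, 0.70, 0.68, 0.66], baseline_f1=0.80))  # True
print(needs_retraining([0.81, 0.79, 0.80], baseline_f1=0.80))              # False
```

Averaging over a window rather than reacting to a single score avoids retraining on one unusually noisy batch.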


5 Experiments, results and discussions 

5.1 Results: Reduction / Increase of indicators I1, I2, I3, I4 

All of the data obtained from the Control Group (Gc) and the Experimental Group (Ge)
for each research indicator were recorded on an observation sheet. Of the times
recorded to analyze feelings, 50.43% are below the average time. Regarding the cost
of the sentiment analysis, 73.61% of the values are lower than the average. The
number of personnel involved is 50.0% less than the average for the given
implementation. Finally, the level of acceptance of the end user increased by
69.0%.

Direct observation was carried out using a stopwatch as the measuring instrument,
which was very useful for understanding the current state of the Twitter sentiment
analysis process. In addition, the Minitab software was used to perform the
statistical calculations that provide evidence for the results obtained.

Normality test. Next (See Figure 7), the test is performed to determine which indi-
cators have data with normal behavior. This serves to determine the statistical test to be 
applied. 

 
Fig. 7. Average sentiment analysis time in Twitter startup data 

For I1 (time to analyze), the p-values in the Gc post-test and the Ge post-test
(0.107 and 0.118) are greater than α (0.05); therefore, the values of this
indicator are consistent with a normal distribution. For I2 (cost of feeling
analysis in the Twitter data), the p-values in the Gc post-test and the Ge
post-test (0.271 and 0.243) are also greater than α (0.05); therefore, the values
of the cost indicator are likewise consistent with a normal distribution.

The same test was applied to Indicator 3 (number of people involved), whose data
also show normal behavior.

5.2 Discussion: Effect on sentiment analysis in Twitter communications 

Descriptive statistics. Table 2 shows the results obtained and the application of de-
scriptive statistics for each indicator. 

Table 2.  Operationalization of the dependent variable 

Sample | N | 95% confidence interval for the mean | Kurtosis | Asymmetry | Q3
I1: PosTest (Ge) | 30 | 2.8805 - 4.0529 | -0.504850 | 0.295076 | 5.0000
I2: PosTest (Ge) | 30 | 12022 – 14000 | -0.606303 | 0.381776 | 17000
I3: PosTest (Ge) | 30 | 3.0000 - 4.0000 | -0.668777 | 0.182359 | 5.0000
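The shape statistics reported in Table 2 can be illustrated with a short Python sketch. The formulas below are the simple moment-based definitions of asymmetry (skewness) and excess kurtosis; statistical packages often apply small-sample corrections, so their values can differ slightly. The function names and the sample data are hypothetical.

```python
from statistics import mean, quantiles


def central_moment(values, order):
    """Average of (value - mean) raised to the given power."""
    center = mean(values)
    return sum((v - center) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based asymmetry: positive when the right tail is longer."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3: negative when flatter than a normal curve."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


def third_quartile(values):
    """Q3: about 75% of the values are less than or equal to this value."""
    return quantiles(values, n=4)[2]


sample = [1, 2, 3, 4, 5]  # hypothetical values, not data from the study
print(skewness(sample), excess_kurtosis(sample), third_quartile(sample))
```

Under these definitions, the negative kurtosis values in Table 2 correspond to distributions flatter than the normal curve, and the positive asymmetry values correspond to a longer right tail, that is, most values concentrated at the low end.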

 
In summary, Table 2 shows that, for each indicator, around 95% of the values lie
within 2 standard deviations of the mean. The kurtosis indicates that the
distributions have low peaks; similarly, the asymmetry indicates that most of the
values are low, and the third quartile (Q3) indicates that 75% of the values are
less than or equal to this value. For indicator I1, the results are similar to
those of [12], whose research on the perceptions of Twitter users about menthol
cigarettes (analysis of content and sentiment) found that the average time to
analyze sentiments in Twitter communications for Ge (4 days) was significantly
shorter than for Gc (10 days). Similar results appear in [3], a survey and
comparative study of tweet sentiment analysis through semi-supervised learning,
which estimated that the time for Ge (2.12 days) was significantly shorter than for
Gc (8 days). Similarly, [20], a comparative study using KNN and Naive Bayes of
sentiment analysis on customer satisfaction with digital payment in Indonesia,
found that the time for Ge (5.30 days) was significantly shorter than for Gc (8
days). In addition, research on the detection of indicators for the success of an
emerging company, using sentiment analysis and text data mining, estimated that the
time for Ge (1.12 days) was significantly lower than for Gc (4 days) [15].

For indicator I2, the results for the average cost are similar to those of [4], on
the exploration of online customer reviews for the development of new products (the
case of identifying reinforcers in the cosmetic industry), where the cost for Ge
(S/. 18 789.00) was significantly lower than for Gc (S/. 55 432.00).

Similarly, the study [1], an empirical investigation on the prediction of customer
abandonment behavior using the Twitter mining approach, estimates that the cost for
Ge (S/. 9 780.00) was significantly lower than for Gc (S/. 27 987.30). Likewise,
the study [9] on the sentiment analysis of multimodal data from Twitter estimates
that Ge (S/. 15 400.00) was significantly lower than Gc (S/. 60 780.50). Finally, a
comparison of the performance of supervised machine learning models for the opinion
analysis of Covid-19 tweets estimates that Ge (S/. 10 950.23) was significantly
lower than Gc (S/. 19 789.56) [14].

Finally, the results for indicator I3 are similar to those of [19], on an efficient
preprocessing method for supervised feeling analysis through the conversion of
sentences into numerical vectors (a case study of Twitter), which estimates that
the number of people for Ge (2 people) was significantly smaller than for Gc (5
people). Likewise, [5], in the investigation on the analysis of social networks of
Covid-19 feelings with an application of Artificial Intelligence, estimates that Ge
(6 people) was significantly lower than Gc (10 people). The study [2], on a hybrid
N-gram model using Naive Bayes to classify political feelings on Twitter, estimates
that Ge (1 person) was significantly lower than Gc (3 people). Finally, an
investigation into deep sentiment analysis, a case study derived from Turkish
Twitter, estimates that Ge (4 people) was significantly lower than Gc (8 people)
[17].

In Peru, as in similar developing countries, Machine Learning solutions for the
analysis of sentiments in Twitter communications are still underdeveloped. This
poses a challenge for the development of such solutions aimed at commercial users
from different business sectors in various remote areas.

Inferential statistics. Tables 3 and 4 show the results of the statistical tests
applied to contrast the hypotheses.

Table 3.  Operationalization of the dependent variable 

Sample | n | Ho | t-value | p-value
I1: PosTest (Gc) / I1: PosTest (Ge) | 30 | μ1 > μ2 | 6.40 | 0.000
I2: PosTest (Gc) / I2: PosTest (Ge) | 30 | μ1 > μ2 | 17.23 | 0.000
I3: PosTest (Gc) / I3: PosTest (Ge) | 30 | μ1 > μ2 | 10.04 | 0.000

Table 4.  Operationalization of the dependent variable 

Sample | n | Ho | w-value | p-value
I4: PosTest (Gc) / I4: PosTest (Ge) | 30 | µ1 > µ2 | 475.00 | 0.000

 
Since all p-values are less than 0.05, the results provide enough evidence to
reject the null hypotheses (Ho) in favor of the alternative hypotheses (Ha). The
tests are therefore significant.

Research implications. The solution was also applied in the political sphere, to
evaluate comments on social networks, in this case Twitter, and thereby estimate a
candidate's level of acceptance. The application started with business and data
understanding, in which the scope is defined and the success criteria are
evaluated. In this case, the data correspond to the publications of the candidates
and the comments made by readers, who may also include the media, private
individuals, and other interested politicians. In the following phase, data
engineering, a function for cleaning the data and preparing them for the model is
planned and developed. During the model engineering phase, the model is applied to
the collected data; the model is also evaluated with noisy or incorrect input data
to check its validity. Finally, deployment is carried out by implementing the
programmed model on an existing software system.

The Machine Learning solution, which significantly improved the values of analysis
cost, number of people involved, and acceptance level, can be applied in a wide
variety of business sectors and geographical regions worldwide, today and in the
future.

6 Conclusions and future research 

The Machine Learning models used for the analysis of feelings in social networks
are effectively integrated in different areas, notably in marketing companies,
which optimize their work and reduce the time and personnel costs of carrying out
the analysis, while tracking the growth of the startups in which they invest and
obtaining good investment results. Most applications for the analysis of feelings
use predictive models with high precision, such as convolutional neural networks
(CNN) and the Support Vector Machine (SVM); they automate data management and the
interpretation of predictive analysis, and help to carry out and customize the
tasks of the assigned personnel. In the present investigation, an application for
the analysis of feelings was used to manage trends in startups, which is of great
importance for the marketing area of the company Heydru! The solution was developed
using the CRISP-ML(Q) methodology, which adapts to massively changing information
and ensures the quality of the deployed model as the information changes or grows.
As a result, the indicators of the process in the company Heydru! improved: the
time of an analysis, the cost of carrying out an analysis, the number of staff
involved, and the level of satisfaction, thereby addressing these limitations.

For future investigations, it is proposed to improve the algorithm in order to
increase the level of precision, adding neural networks of different types
[15][16][17] and using new Deep Learning techniques. The platform for the analysis
of feelings can be optimized to handle more information and to offer more
statistical graphs for the reports presented to the respective area, optimizing
personnel cost and the time dedicated to the analysis [13]. It is also necessary to
apply the CRISP-ML(Q) methodology continuously for this type of analysis, which
relies on constantly changing and growing information; since this methodology has
been introduced only recently, it is necessary to follow up on and study its
changes and updates.


It is necessary to expand the use of feeling analysis applications to different
areas that need to monitor the performance of a company, as well as to develop
cloud platforms [29][30] with adaptive use for feeling analysis; this would support
SMEs that want to implement the quality analysis process.

Some limitations have been identified in some algorithms for sentiment analysis on
Twitter, which limit the efficiency of the results for decision makers; however,
this has not impaired the interpretation of the results. Future implementations of
Machine Learning solutions should consider the use of more efficient algorithms to
achieve better results and thus reduce bias in decision making.

7 Acknowledgment 

We would like to thank the Universidad Autónoma del Perú, specifically the Systems
Engineering program, for its support of our work. We also thank the company Heydru!
for allowing us to apply our research in its institution.

8 References 

[1] L. Almuqren, F. S. Alrayes, and A. I. Cristea, “An empirical study on customer churn be-
haviours prediction using arabic twitter mining approach,” Futur. Internet, vol. 13, no. 7, pp. 
1–19, 2021, https://doi.org/10.3390/fi13070175  

[2] J. Awwalu, A. Bakar Abu, and M. Yaakub Ridzwan, “Hybrid N-gram model using Naïve 
Bayes for classification of political sentiments on Twitter,” Neural Comput. Appl., vol. 31, 
no. 12, pp. 9207–9220, 2019, https://doi.org/10.1007/s00521-019-04248-z  

[3] X. M. Cuzcano and V. H. Ayma, “A comparison of classification models to detect cyber-
bullying in the Peruvian Spanish language on twitter,” Int. J. Adv. Comput. Sci. Appl., vol. 
11, no. 10, pp. 132–138, 2020, https://doi.org/10.14569/IJACSA.2020.0111018  

[4] N. F. F. Da Silva, L. F. S. Coletta, and E. R. Hruschka, “A survey and comparative study of 
tweet sentiment analysis via semi-supervised learning,” ACM Comput. Surv., vol. 49, no. 
1, pp. 1–26, 2016, https://doi.org/10.1145/2932708  

[5] M. Denegri Coria, J. C. Morales Arevalo, J. L. Hilario-Rivas, R. Hilario-Cárdenas, and J. I. 
Prado-Juscamaita, “Supervised Sentiment Analysis Algorithms,” vol. 12, no. 14, pp. 2000–
2012, 2021, https://turcomat.org/index.php/turkbilmat/article/view/10547  

[6] M. Haddara, J. Hsieh, A. Fagerstrøm, N. Eriksson, and V. Sigurðsson, “Exploring customer
online reviews for new product development: The case of identifying reinforcers in the
cosmetic industry,” Manag. Decis. Econ., vol. 41, no. 2, pp. 250–273, 2020,
https://doi.org/10.1002/mde.3078

[7] M. Hung et al., “Social network analysis of COVID-19 sentiments: Application of artificial
intelligence,” J. Med. Internet Res., vol. 22, no. 8, pp. 1–13, 2020,
https://doi.org/10.2196/22590

[8] J. Ji, H. Wang, S. Song, and J. Pi, “Sentiment analysis of comments of wooden furniture 
based on naive Bayesian model,” Prog. Artif. Intell., vol. 10, no. 1, pp. 23–35, 2021, 
https://doi.org/10.1007/s13748-020-00221-3  

[9] M. Kapatamoyo, “Data analytics in mass communication: New methods for an old craft,” 
RISTI - Rev. Iber. Sist. e Tecnol. Inf., vol. 2019, no. E20, pp. 504–515, 2019. 


[10] J. D. Kinyua, C. Mutigwe, D. J. Cushing, and M. Poggi, “An analysis of the impact of President 
Trump’s tweets on the DJIA and S&P 500 using machine learning and sentiment analysis,” 
J. Behav. Exp. Financ., vol. 29, p. 100447, 2021, 
https://doi.org/10.1016/j.jbef.2020.100447

[11] A. Kumar and G. Garg, “Sentiment analysis of multimodal twitter data,” Multimed. Tools 
Appl., no. February, 2019, https://doi.org/10.1007/s11042-019-7390-1  

[12] S. Kurniawan, W. Gata, D. A. Puspitawati, I. K. S. Parthama, H. Setiawan, and S. Hartini, 
“Text Mining Pre-Processing Using Gata Framework and RapidMiner for Indonesian Sentiment 
Analysis,” IOP Conf. Ser. Mater. Sci. Eng., vol. 835, no. 1, 2020, 
https://doi.org/10.1088/1757-899X/835/1/012057

[13] J. Murga et al., “A Sentiment Analysis Software Framework for the support of Business 
information architecture in the tourist sector,” vol. 12390, 2020, 
https://doi.org/10.1007/978-3-662-62308-4_8

[14] H. L. Nguyen and J. E. Jung, “Statistical approach for figurative sentiment analysis on Social 
Networking Services: a case study on Twitter,” Multimed. Tools Appl., vol. 76, no. 6, 
pp. 8901–8914, 2017, https://doi.org/10.1007/s11042-016-3525-9

[15] J. Ochoa-Luna and D. Ari, “Deep Neural Network Approaches for Spanish Sentiment Analysis 
of Short Texts,” vol. 1, pp. 206–216, 2018, https://doi.org/10.1007/978-3-030-03928-8

[16] J. Ochoa-Luna and D. Ari, “Word embeddings and deep learning for spanish twitter sentiment 
analysis,” Commun. Comput. Inf. Sci., vol. 898, pp. 19–31, 2019, 
https://doi.org/10.1007/978-3-030-11680-4_4

[17] D. Palomino and J. Ochoa-Luna, “Advanced Transfer Learning Approach for Improving 
Spanish Sentiment Analysis,” Adv. Soft Comput., vol. 11835 LNAI, no. November, pp. 
112–123, 2019, https://doi.org/10.1007/978-3-030-33749-0_10  

[18] G. A. Pierina, P. J. Guzman Ramos, L. A. Chipana Vila, C. A. Trigoso Valeriano, and J. 
Fabian Arteaga, “Bag of embedding words for sentiment analysis of tweets,” Computers, 
vol. 14, no. 3, pp. 223–231, 2019, https://doi.org/10.17706/jcp.14.3.223-231  

[19] S. W. Rose, C. L. Jo, S. Binns, M. Buenger, S. Emery, and K. M. Ribisl, “Perceptions of 
menthol cigarettes among twitter users: Content and sentiment analysis,” J. Med. Internet 
Res., vol. 19, no. 2, pp. 1–16, 2017, https://doi.org/10.2196/jmir.5694  

[20] J. K. Rout, K. K. Raymond Choo, A. Kumar Dash, S. Bakshi, S. Kumar Jena, and K. L. 
Williams, “A model for sentiment and emotion analysis of unstructured social media text,” 
Electron. Commer. Res., vol. 18, no. 1, pp. 181–199, 2018, 
https://doi.org/10.1007/s10660-017-9257-8

[21] F. Rustam, M. Khalid, W. Aslam, V. Rupapara, A. Mehmood, and G. Sang Choi, “A performance 
comparison of supervised machine learning models for Covid-19 tweets sentiment 
analysis,” PLoS One, vol. 16, no. 2, pp. 1–23, 2021, 
https://doi.org/10.1371/journal.pone.0245909

[22] J. R. Saura, P. Palos-Sanchez, and A. Grilo, “Detecting indicators for startup business success: 
Sentiment analysis using text data mining,” Sustain., vol. 11, no. 3, pp. 1–14, 2019, 
https://doi.org/10.3390/su11030917

[23] P. Sharma and A. K. Sharma, “Experimental investigation of automated system for twitter 
sentiment analysis to predict the public emotions using machine learning algorithms,” 
Mater. Today Proc., 2020, https://doi.org/10.1016/j.matpr.2020.09.351

[24] H. A. Shehu et al., “Deep Sentiment Analysis: A Case Study on Stemmed Turkish Twitter 
Data,” IEEE Access, vol. 9, pp. 56836–56854, 2021, 
https://doi.org/10.1109/ACCESS.2021.3071393

[25] K. Sigit, A. P. Dewi, G. Windu, Nurmalasari, T. Muhamad, and N. Kadinar, “Comparison 
of Classification Methods on Sentiment Analysis of Political Figure Electability Based on 
Public Comments on Online News Media Sites,” IOP Conf. Ser. Mater. Sci. Eng., vol. 662, 
no. 4, 2019, https://doi.org/10.1088/1757-899X/662/4/042003

[26] M. K. Sohrabi and F. Hemmatian, “An efficient preprocessing method for supervised sentiment 
analysis by converting sentences to numerical vectors: a twitter case study,” 
Multimed. Tools Appl., 2019, https://doi.org/10.1007/s11042-019-7586-4

[27] G. Vizcarra, A. Mauricio, and L. Mauricio, “A deep learning approach for sentiment analysis 
in Spanish Tweets,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. 
Lect. Notes Bioinformatics), vol. 11141 LNCS, pp. 622–629, 2018, 
https://doi.org/10.1007/978-3-030-01424-7_61

[28] H. Wisnu, M. Afif, and Y. Ruldevyani, “Sentiment analysis on customer satisfaction of digital 
payment in Indonesia: A comparative study using KNN and Naïve Bayes,” J. Phys. Conf. 
Ser., vol. 1444, no. 1, 2020, https://doi.org/10.1088/1742-6596/1444/1/012034

[29] G. Zapata, J. Murga, C. Raymundo, J. Alvarez, and F. Dominguez, “Predictive model based 
on sentiment analysis for peruvian smes in the sustainable tourist sector,” IC3K 2017 - Proc. 
9th Int. Jt. Conf. Knowl. Discov. Knowl. Eng. Knowl. Manag., vol. 3, no. Kmis, 
pp. 232–240, 2017, https://doi.org/10.5220/0006583302320240

[30] G. Zapata, J. Murga, C. Raymundo, F. Dominguez, J. M. Moguerza, and J. M. Alvarez, 
“Business information architecture for successful project implementation based on sentiment 
analysis in the tourist sector,” J. Intell. Inf. Syst., vol. 53, no. 3, pp. 563–585, 2019, 
https://doi.org/10.1007/s10844-019-00564-x

[31] S. Studer et al., “Towards CRISP-ML(Q): A Machine Learning Process Model with Quality 
Assurance Methodology,” Machine Learning and Knowledge Extraction, vol. 3, no. 2, pp. 
392–413, Apr. 2021, https://doi.org/10.3390/make3020020  

9 Authors 

Rosa Alegre-Veliz, student at the Faculty of Engineering and Architecture at the 
Universidad Autónoma del Perú, with extensive experience in database modeling and 
software development (email: ralegrev@autonoma.edu.pe). 

Pedro Gaspar-Ortiz, student at the Faculty of Engineering and Architecture at the 
Universidad Autónoma del Perú, with extensive experience in database management 
and development of mobile applications (email: pgaspar@autonoma.edu.pe). 

Javier Gamboa-Cruzado, Systems Engineer, Doctor in Systems Engineering, Doctor 
in Administration. Professor-Researcher in the postgraduate programs at the Systems 
Engineering Faculty at the Universidad Nacional Mayor de San Marcos, Peru (email: 
jgamboac@unmsm.edu.pe). 

Liset Rodríguez Baca, Systems Engineer, graduate in Education, Master in Systems 
Engineering with Mention in Management and Management in Information Technology, 
Master in Strategic Business Management, Doctor in Education Sciences. Director 
of the Professional School of Systems Engineering at the Universidad Autónoma 
del Perú (email: liset.rodriguez@autonoma.pe). 

Waldy Grandez Pizarro, Computing and Systems Engineer. Professor at the Faculty 
of Engineering and Architecture at the Universidad de San Martin de Porres, Peru 
(email: wgrandezp@usmp.pe). 

Rosa Menéndez Mueras, Systems Engineer. Professor at the Facultad de Ingeniería 
de Sistemas e Informática at the Universidad Nacional Mayor de San Marcos, Peru 
(email: rmenendezm@unmsm.edu.pe). 

Carlos Chávez Herrera, Systems Engineer. Professor at the Facultad de Ingeniería 
de Sistemas e Informática at the Universidad Nacional Mayor de San Marcos, Peru 
(email: cchavezh@unmsm.edu.pe). 

Article submitted 2022-08-22. Resubmitted 2022-10-05. Final acceptance 2022-10-21. Final version 
published as submitted by the authors. 
