

 


A Triadic Formal Concept Analysis Approach to Analyzing Online Hate Speech 
in Facebook Comments 

 
Radu Mihai Meza 

Babeș-Bolyai University, Cluj-Napoca, Romania  
Strada Universității 7-9, Cluj-Napoca 400084 

Phone: 0264 405 300 
meza@fspac.ro 

 
Șerban Nicolae Meza  

Technical University Cluj-Napoca, Cluj-Napoca, Romania 
Strada Memorandumului 28, Cluj-Napoca 400114 

Phone: 0264 401 200 
Serban.Meza@com.utcluj.ro 

 
Abstract 
This paper outlines a language-independent methodology, grounded in computational thinking, for
identifying and analyzing the contexts, targets and contents of online hate speech manifested in
Facebook comments on popular Fan Pages or open groups. The three-step process involves data
collection via API tools, a preliminary co-occurrence analysis of user-defined semantic field codes,
and clustering and visual analysis using triadic formal concept analysis navigation tools.

 
Keywords: Hate Speech; Co-Occurrence Analysis; Triadic Formal Concept Analysis; 
Unstructured Data; Digital Media Analysis. 

 
1. Hate Speech in the Digital Media 
The issue of online hate speech has been the subject of heated policy debate in Europe and globally
over the past few years. Although hate speech acts have been regulated by European laws, online
communication through platforms owned by businesses located outside the users’ country and 
subject to different legislation raises new issues. As Facebook is presently used by approximately 
2.2 billion people globally, both governments and NGOs look towards the social media giant to 
ensure sound and effective mechanisms of dealing with hate speech acts per each country’s policy. 
At the forefront of the group of European countries pressuring Facebook to come up with improved 
methods is Germany, which has some of the strictest regulatory frameworks concerning hate
speech. In September 2015, after increasing pressure (Donahue, 2015) against the backdrop of the
refugee crisis, Facebook announced an initiative to increase its efforts to tackle racist content on its
German website. Furthermore, in early 2016, Facebook outsourced the monitoring and control of
racist posts amid increased public criticism of the company’s reluctance to deal with hate speech in
accordance with the European regulatory frameworks (“Facebook outsources fight against racist 
posts in Germany,” 2016). However, Facebook activity by users in both Europe and the United 
States at the end of 2016 has driven lawmakers to further increase pressure on Facebook to “clamp 
down on hate speech, fake news and other misinformation shared online, or face new laws, fines or 
other legal actions” (Scott & Eddy, 2016). There is clearly a job market opportunity for computer
scientists and digital media specialists who are equipped with the conceptual and technical skills to 
tackle this complex problem. They will need to use computational approaches to deal with hate 
speech in social media streams, to sift through large amounts of data to identify, analyze, and 
classify potentially dangerous speech, hate speech and offensive speech and prepare intervention 
strategies. 

 



BRAIN – Broad Research in Artificial Intelligence and Neuroscience 

Volume 10, Issue 1 (January - February, 2019), ISSN 2067-3957 
 


A Brief Conceptualization of Hate Speech 
The key issues in countering online hate speech are outlined in the 2015 UNESCO study
“Countering Online Hate Speech” (Gagliardone, Gal, Alves, & Martinez, 2015): definition,
jurisdiction, comprehension and intervention.

There are different definitions of “hate speech”, which frequently mix concrete threats to the
security of individuals and groups with expressions of frustration and anger. Also, online media 
communication platforms such as Facebook, Twitter or Google define their own policies towards 
admissible content published by users. However, as recent tensions have shown, these often clash 
with national legislation and consensus seems unlikely. 

Online networked communication platforms have given private spaces of expression a public 
function and the combined speed and reach of Internet communication raise new issues for 
governments trying to enforce national legislation in the virtual public sphere, often in contexts 
managed by companies located in other states. 

There seems to be a lack of comprehension of the relation between online hate speech
phenomena and offline speech and action or, more precisely, violent action. Gagliardone et al.
(2015) highlight the lack of studies examining the links between hate speech online and
other social phenomena.

Different contexts for online communication have led to different intervention strategies – 
from user flagging, reporting or ranking to monitoring, editorializing and counter-speaking. 
However, popular online social network type platforms seem reluctant to publish aggregate results 
that would allow an overview of the phenomenon. 

To summarize: media platforms define their own policies towards admissible content published by
users; governments find it hard to enforce national legislation in the virtual public sphere; there is
little comprehension of the relation between online hate speech phenomena and violent action; and
different contexts for online communication have given rise to different intervention strategies.
All four key issues, however, depend on the identification and analysis of hate speech in
semi-structured or unstructured data such as Facebook comments.

Reports and academic works emanating from NGO initiatives are starting to shape an 
emerging scholarship on the issue. When analyzing hate speech as an act of communication, 
overviews of the issue (Angi & Bădescu, 2014) recommend focusing on: 

• Content (what is being said); 
• Emitters (who is communicating); 
• Targets (who is the message about); 
• Context (including when the action takes place). 
 
What is clearly lacking is a large-scale, data-driven study of the contexts, emitters, contents and
targets of hate speech in social media, with generalizable results and evidence that could drive
policy in the matter.

 
2. Computational Approaches and Co-occurrence Analysis 
It is only very recently that digital social science academic research into the niche topic of 

online hate speech has emerged, using computational approaches towards collection and analysis of 
large datasets of comments on news sites, blogs and especially social media (Meza, 2016). 

From a methodological standpoint, detecting violent, obscene or hate speech is a problem for 
both media researchers and content managers or digital platform owners. Natural language 
processing is a complex task and there is a scarcity of tools available for most languages. 

Computational thinking was popularized a decade ago as “a fundamental skill used by
everyone in the world by the middle of the 21st century” (Wing, 2006). The concept developed and
is still developing as it is adopted in education, but problem-solving via computational thinking may
be defined by abstraction, automation and analysis (Lee et al., 2011).



R.M. Meza. Ș.N. Meza - A Triadic Formal Concept Analysis Approach to Analyzing Online Hate Speech in Facebook 
Comments 


Co-occurrence analysis is widespread in communication and information sciences, especially 
in library science, but also in machine translation or natural language processing. Recent efforts in 
computational linguistics applied to hate-speech use machine learning techniques similar to 
sentiment analysis in correlation with techniques for detecting terms used to reference racial, ethnic 
or religious groups (Gitari, Zuping, Damien, & Long, 2015). 

Digital media analysis may make use of API interaction tools for data collection from social
media, computational linguistics tools that allow the exploration of word or concept co-occurrence
networks, or user-friendly drag-and-drop visual environments for the analysis of large data sets, such
as Tableau. The latter are relevant because research shows that students in the Web 2.0 age prefer
efficient, easy-to-use, accessible applications.

 
3. Dyadic and Triadic Formal Concept Analysis (FCA) Preliminaries and Tools 
Formal Concept Analysis (FCA) is a method of knowledge representation introduced in the 

1980s by Rudolf Wille, rooted in the pragmatic philosophy of Charles Sanders Peirce, based on a 
binary incidence relation, and building on applied lattice and order theory. It has applications in 
various fields and its advantage lies in the possibility to visualize and explore formal concepts in a 
formal context (a data table that represents binary relations between items in a set of objects and 
items in a set of attributes) as representations of complete lattices. The mathematical foundations are 
described as follows (Ganter & Wille, 2012): 

A formal context is a triple K := (G, M, I), where G is a set whose elements are called
objects, M is a set whose elements are called attributes, and I is a binary relation between G
and M (i.e. I ⊆ G × M). (g, m) ∈ I is read “object g has attribute m”. For A ⊆ G, A' denotes the
set of attributes shared by all objects in A; for B ⊆ M, B' denotes the set of objects that have
all attributes in B.
A formal concept of a formal context (G, M, I) is a pair (A, B) with A ⊆ G, B ⊆ M, A' = B
and B' = A. The sets A and B are called the extent and the intent of the formal concept (A,
B), respectively. The subconcept-superconcept relation is formalized by:

 
(A1, B1)  ≤  (A2, B2):  ⇔ A1 ⊆ A2 (⇔ B1 ⊇ B2). 

 
The set of all formal concepts of a context K together with the order relation ≤ is always a
complete lattice (i.e. for each subset of concepts, there is always a unique greatest common
subconcept and a unique least common superconcept), called the concept lattice of K, also called
the conceptual hierarchy. In a line diagram (in FCA, the term line diagram is used for the Hasse
diagram of a lattice) each node represents a formal concept.
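
The definitions above can be illustrated with a minimal sketch. It is not taken from any FCA tool: the function names and the toy context are assumptions made for illustration, and the brute-force enumeration over all object subsets is only practical for very small contexts.

```python
from itertools import combinations

def up(objects, context, attributes):
    """A': the attributes shared by every object in `objects`."""
    return frozenset(m for m in attributes
                     if all((g, m) in context for g in objects))

def down(attrs, context, objects):
    """B': the objects that have every attribute in `attrs`."""
    return frozenset(g for g in objects
                     if all((g, m) in context for m in attrs))

def formal_concepts(objects, attributes, context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    for r in range(len(objects) + 1):
        for subset in combinations(sorted(objects), r):
            intent = up(subset, context, attributes)
            extent = down(intent, context, objects)
            concepts.add((extent, intent))
    return concepts

def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) iff A1 is a subset of A2."""
    return c1[0] <= c2[0]

# Hypothetical toy context: I is a set of (object, attribute) pairs.
G = {"g1", "g2", "g3"}
M = {"a", "b", "c"}
I = {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c"), ("g3", "b")}

concepts = formal_concepts(G, M, I)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

In this toy context every object has attribute b, so the top concept of the lattice has all three objects as its extent and {b} as its intent.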

Triadic Formal Concept Analysis (3FCA) (Lehmann & Wille, 1995) was introduced to model 
relations between three sets: 

A triadic context is defined as a quadruple K := (G, M, B, Y), where G, M, and B are sets and
Y is a ternary relation between G, M and B, i.e. Y ⊆ G × M × B; the elements of G, M, and B are
called objects, attributes and conditions, respectively, and (g, m, b) ∈ Y is read: the object g has the
attribute m under the condition b.

A triadic concept of a triadic context (G, M, B, Y) is defined as a triple (A1, A2, A3) with
A1 × A2 × A3 ⊆ Y which is maximal with respect to component-wise inclusion.

Recent work on triadic conceptual navigation (Kis, Sacarea, & Troanca, 2015; Rudolph, Săcărea, &
Troancă, 2015) has provided graphical navigation tools such as FCA Tools Bundle which use a
local navigation paradigm to make 3FCA visualizations intuitive and applicable.
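
The maximality condition in the definition of a triadic concept can be made concrete with a small brute-force sketch. It is an illustration under stated assumptions, not the algorithm used by FCA Tools Bundle: the function names and the toy data are hypothetical, and the enumeration is exponential in the number of attributes and conditions. A triple is kept when each component cannot be enlarged while the other two are held fixed, which is equivalent to maximality with respect to component-wise inclusion.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of `items`, as tuples."""
    items = sorted(items)
    return chain.from_iterable(combinations(items, r)
                               for r in range(len(items) + 1))

def triadic_concepts(G, M, B, Y):
    """Enumerate triples (A1, A2, A3) with A1 x A2 x A3 inside Y that are maximal."""
    found = set()
    for A2 in map(frozenset, powerset(M)):
        for A3 in map(frozenset, powerset(B)):
            # Largest object set compatible with the chosen A2 and A3.
            A1 = frozenset(g for g in G
                           if all((g, m, b) in Y for m in A2 for b in A3))
            # Largest attribute and condition sets given the other two components.
            A2_max = frozenset(m for m in M
                               if all((g, m, b) in Y for g in A1 for b in A3))
            A3_max = frozenset(b for b in B
                               if all((g, m, b) in Y for g in A1 for m in A2))
            if A2 == A2_max and A3 == A3_max:
                found.add((A1, A2, A3))
    return found

# Hypothetical toy triadic context: Y holds (object, attribute, condition) triples.
G = {"t1", "t2"}
M = {"x", "y"}
B = {"p", "q"}
Y = {("t1", "x", "p"), ("t1", "x", "q"), ("t2", "x", "p")}

tri = triadic_concepts(G, M, B, Y)
for A1, A2, A3 in tri:
    print(sorted(A1), sorted(A2), sorted(A3))
```

Besides the trivial triples with an empty component, the toy data yields two informative triadic concepts: both objects share x under condition p, and t1 has x under both conditions.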
 
 
 
 
 
 
 




4. Proposed Analysis Method 
The three-step, language-independent methodology, grounded in computational thinking, can be
summarized as follows:
 
4.1. Data collection via API interrogation tools or Web scraping 
Comment threads associated with Facebook Fan Page posts can be obtained via API 

interrogation with various tools such as Facepager (Keyling & Jünger, 2013). Other contexts such as
news website comment threads, internet forums or message boards can be scraped with generic
Web scraping tools.
 

4.2. Co-occurrence analysis of terms referring to potential targets of hate speech with 
codes for hate speech semantic fields 

Co-occurrence analysis is deemed an appropriate method for studying large data sets of short 
texts such as comments or Tweets. For the specific purposes of analyzing hate speech, codes may be 
defined to identify: 

• Key words, terms or expressions referring to targets of hate speech acts (e.g. immigrants,
different racial or ethnic groups);
• Key words, terms or expressions that constitute dangerous, hate, offensive or violent speech
(e.g. invectives, insults, swear words, negative stereotypes, calls to violent action).

 
Such a method is used in (Meza, 2016), yielding results in the form of co-occurrence 

frequency tables indicating the number of occurrences in the same comment (in the context of one 
or several Facebook Fan Pages) of the two types of codes. Visual representations are possible 
through tools like KH Coder (Higuchi, 2001), but lack navigation and concept discovery. 
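
The counting step described above can be sketched as follows. This is a minimal illustration, not the procedure used in (Meza, 2016) or in KH Coder: the codebooks, the placeholder terms and the lowercase substring matching of term stems are all assumptions made for the example.

```python
from collections import Counter

def matched_codes(text, codebook):
    """Return the codes whose term stems occur in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return {code for code, stems in codebook.items()
            if any(stem in lowered for stem in stems)}

def cooccurrence(comments, target_codes, attribute_codes):
    """Count comments in which a target code and an attribute code both occur, per page."""
    counts = Counter()
    for page, text in comments:
        targets = matched_codes(text, target_codes)
        attributes = matched_codes(text, attribute_codes)
        for t in targets:
            for a in attributes:
                counts[(t, a, page)] += 1
    return counts

# Hypothetical codebooks: each code maps to a list of term stems (placeholders only).
target_codes = {"Target1": ["groupa"], "Target2": ["groupb"]}
attribute_codes = {"Lazy": ["lazy", "lazi"], "Dirty": ["dirt"]}

# Hypothetical comments as (page, text) pairs.
comments = [
    ("Page1", "Those groupa people are so lazy"),
    ("Page1", "GroupA members are lazy and dirty"),
    ("Page2", "nothing to see"),
    ("Page2", "groupb is dirty"),
]

counts = cooccurrence(comments, target_codes, attribute_codes)
for key, n in sorted(counts.items()):
    print(key, n)
```

Because matching relies only on user-supplied term stems, the same sketch applies to any language for which a codebook can be written, at the cost of missing inflections or spellings the stems do not cover.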

 
4.3. 3FCA visualization and navigation 
By establishing threshold values (with respect to the size of the dataset), a co-occurrence 

matrix may be converted into a formal context. 3FCA allows the definition of objects (targets), 
attributes (hate codes over content) and conditions (Facebook Fan Pages). 
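
A minimal sketch of the thresholding step, assuming the co-occurrence counts are stored in a dictionary keyed by (target, attribute, page). The function name, the data and the threshold value are hypothetical.

```python
def to_triadic_context(counts, threshold=1):
    """Keep the (target, attribute, page) triples whose count reaches the threshold."""
    Y = {triple for triple, n in counts.items() if n >= threshold}
    # Objects, attributes and conditions are read off the surviving triples,
    # so codes or pages with no triple above the threshold are dropped.
    G = {g for g, _, _ in Y}
    M = {m for _, m, _ in Y}
    B = {b for _, _, b in Y}
    return G, M, B, Y

# Hypothetical co-occurrence counts.
counts = {
    ("Target1", "Lazy", "Page1"): 7,
    ("Target1", "Dirty", "Page1"): 1,
    ("Target2", "Dirty", "Page2"): 3,
}

G, M, B, Y = to_triadic_context(counts, threshold=2)
print(sorted(Y))
```

A higher threshold yields a sparser context with fewer concepts, which may be easier to read for large datasets but hides rare associations.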
 




 
Figure 1. Triadic formal context example 

 
Figure 1 shows a triadic formal context built with FCA Tools Bundle (Meza, 2016), 

illustrating how co-occurrence analysis data from large datasets of millions of comments can be 
converted into formal contexts. The objects are defined as the codes for the likely targets of hate 
speech. Attributes are defined as codes that cover semantic fields of hate speech such as common 
negative stereotypes. The conditions are defined as the public Facebook Fan Pages from which the
collected comments originate.

 




 
Figure 2. Local navigation with 3 locked conditions (left) and 2 locked conditions (right)

 
Line diagram visualizations of 3FCA local navigation (Figure 2) with locked conditions can 

be used to identify what is being said about which targets in which contexts. For example, Target5 is 
associated with the attribute Dirty in all contexts, whereas in two of them it is also associated with 
Thief. Target4 and Target2 are represented as Dangerous and Violent, both being associated with 
Stupid and Dirty. 
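
One plausible reading of locking conditions is the standard projection of a triadic context onto a dyadic one: a target keeps an attribute only if the pair holds under every locked condition. The sketch below follows that reading; it is an assumption about the idea, not the implementation in FCA Tools Bundle, and the data is hypothetical, chosen to mirror the Target5 example.

```python
def lock_conditions(Y, locked):
    """Project a triadic context onto (object, attribute) pairs valid under all locked conditions."""
    locked = set(locked)
    if not locked:
        raise ValueError("at least one condition must be locked")
    pairs = {(g, m) for g, m, _ in Y}
    return {(g, m) for g, m in pairs
            if all((g, m, b) in Y for b in locked)}

# Hypothetical triples (target, attribute, page).
Y = {
    ("Target5", "Dirty", "Page1"),
    ("Target5", "Dirty", "Page2"),
    ("Target5", "Dirty", "Page3"),
    ("Target5", "Thief", "Page1"),
    ("Target5", "Thief", "Page2"),
}

three_locked = lock_conditions(Y, {"Page1", "Page2", "Page3"})
two_locked = lock_conditions(Y, {"Page1", "Page2"})
print(sorted(three_locked))
print(sorted(two_locked))
```

The resulting set of pairs is an ordinary dyadic formal context, so its concepts can be drawn as a line diagram; locking fewer conditions can only add pairs, never remove them.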

 
5. Case Study: Comments on Facebook Pages of Alternative Online News Outlets 
In order to illustrate and evaluate the proposed methodology, a case study was designed. A 

dataset of 416,554 comments was collected from all the Facebook posts created in January-March
2017 on the Fan Pages of 23 Romanian-language alternative online news outlets. Data
collection was done by interrogating the Facebook Open Graph API.

Keyword search was used to identify comments mentioning four groups who are often 
targets of hate speech: Roma, Hungarians, Jews and LGBT. For each of the groups, variants of the
popular Romanian terms, both neutral and pejorative, were used. A total of 3,620 comments
contained references to the four target groups.




 
Figure 3. Co-occurrence matrix of target groups and attributes for comments dataset 

 
Keyword search was also used to detect terms referring to negative attributes such as Dirty,
Lazy, Liar, Stupid, Thieving and Violent, using several term stems from the appropriate semantic
families. A total of 28,914 matching records were found in the dataset.

Co-occurrence analysis produced the matrix in Figure 3, with 506 comments containing both 
terms referring to the target groups and the attributes searched. Three Facebook pages which 
contained the highest number of occurrences were chosen. 

The co-occurrence matrix is easily converted into a formal context by establishing a value 
threshold. In this case, the value threshold was set at a minimum (1). 

 

 
Figure 4. NewsFBPage1 

 
Figure 5. NewsFBPage2 

 
Figure 6. NewsFBPage3 

 
Once FCA is applied using FCA Tools Bundle (Meza, 2016) and the formal concepts 

detected, each Facebook page context can be visualized as a line diagram as shown in Figures 4, 5
and 6. This allows comparative analysis of the concept hierarchy and identification of similarities 
between formal concepts. 




 
Figure 7. Triadic FCA application 

 
Triadic FCA allows analysts to distinguish between discourse patterns in different contexts. 

For example, in Figure 7, we can see how the inclusion of one of the contexts may change the 
position of a target group in the concept hierarchy. 
 

6. Conclusion 
The proposed methodology improves on previous approaches by using 3FCA over formal
contexts derived from preliminary co-occurrence analysis of content, emitters and targets in one or
several contexts, namely comments on posts on popular Facebook Fan Pages.

The advantages of this methodology are that it is scalable (co-occurrence analysis of codes in 
short texts is not a very complex task for large datasets), language independent, and visually 
intuitive, hence easy to interpret for digital media specialists, linguists or social scientists. 
Introducing computational thinking in problem-solving like hate-speech analysis and making use of 
freely available software tools referenced here may provide new opportunities for digital social 
researchers, journalists, professional communicators and digital media specialists. Integrating API 
interrogation, Web scraping, co-occurrence analysis and 3FCA in future tools may prove useful in 
unstructured data analysis education. 

 
Acknowledgements 
This work was supported by a grant of the Ministry of Research and Innovation, CNCS -
UEFISCDI, project number PN-III-P1-1.1-TE-2016-0892, within PNCDI III.
 

References  
Angi, D., & Bădescu, G. (2014). Discursul instigator la ură în România. Fundația pentru 

Dezvoltarea Societății Civile. Retrieved from  
http://www.fdsc.ro/library/files/studiul_diu_integral.pdf. 

Donahue, P. (2015, September 26). Merkel Confronts Facebook’s Zuckerberg Over Policing Hate
Posts. Bloomberg.com. Retrieved from
https://www.bloomberg.com/news/articles/2015-09-26/merkel-confronts-facebook-s-zuckerberg-over-policing-hate-posts.

Facebook outsources fight against racist posts in Germany. (2016, January 15). Reuters. Retrieved 
from http://www.reuters.com/article/facebook-germany-idUSKCN0UT1GM. 




Gagliardone, I., Gal, D., Alves, T., & Martinez, G. (2015). Countering online hate speech.
UNESCO Publishing. Retrieved from
https://www.google.com/books?hl=en&lr=&id=WAVgCgAAQBAJ&oi=fnd&pg=PA3&dq=online+hate+speech+unesco&ots=TaamaoJQVB&sig=xUFAShQSkdkdHtMImSPL50myDRE.

Ganter, B., & Wille, R. (2012). Formal concept analysis: mathematical foundations. Springer
Science & Business Media. Retrieved from
https://www.google.com/books?hl=en&lr=&id=hNwqBAAAQBAJ&oi=fnd&pg=PA1&dq=Ganter,+B.,+Wille,+R.:+Formal+Concept+Analysis+-+Mathematical+Foundations.&ots=0cRL5XEe3p&sig=P4IzDkOx5d0I4IeSnYB6_5QVRJM

Gitari, N. D., Zuping, Z., Damien, H., & Long, J. (2015). A lexicon-based approach for hate speech
detection. International Journal of Multimedia and Ubiquitous Engineering, 10(4), 215–230.

Higuchi, K. (2001). KH Coder. A free software for quantitative content analysis or text mining.
Available at http://khc.sourceforge.net/en.

Keyling, T., & Jünger, J. (2013). Facepager. An application for generic data retrieval through
APIs. Source code available at https://github.com/strohne/Facepager.

Kis, L. L., Sacarea, C., & Troanca, D. (2015). FCA Tools Bundle - a Tool that Enables Dyadic and
Triadic Conceptual Navigation. Proc. of FCA4AI. Retrieved from
https://hal.univ-lorraine.fr/hal-01425672/document#page=45

Lee, I., Martin, F., Denner, J., Coulter, B., Allan, W., Erickson, J., … Werner, L. (2011).
Computational thinking for youth in practice. ACM Inroads, 2(1), 32–37.

Lehmann, F., & Wille, R. (1995). A triadic approach to formal concept analysis. In International 
Conference on Conceptual Structures (pp. 32–43). Springer. Retrieved from 
http://link.springer.com/chapter/10.1007/3-540-60161-9_27 

Meza, R. (2016). Hate Speech in the Romanian Online Media. Journal of Media Research, 
9(3(26)), 55–77. 

Rudolph, S., Săcărea, C., & Troancă, D. (2015). Towards a navigation paradigm for triadic 
concepts. In International Conference on Formal Concept Analysis (pp. 252–267). Springer. 
Retrieved from http://link.springer.com/chapter/10.1007/978-3-319-19545-2_16 

Scott, M., & Eddy, M. (2016, November 28). Facebook Runs Up Against German Hate Speech
Laws. The New York Times. Retrieved from
http://www.nytimes.com/2016/11/28/technology/facebook-germany-hate-speech-fake-news.html

Wing, J. M. (2006). Computational thinking. Communications of the ACM, 49(3), 33–35. 
 

Radu Mihai Meza (b. November 15, 1985) received his BSc in Computer Science (2008),
Bachelor of Communication Sciences - Journalism (2008), MA in Media Communication (2009)
and Ph.D. in Sociology (2012) from “Babeș-Bolyai” University, Cluj-Napoca. He is an associate
professor in the Department of Journalism, College of Political, Administrative and Communication
Sciences at “Babeș-Bolyai” University.
 

Șerban Nicolae Meza (b. May 7, 1983) received his Bachelor of Economics and Business
Management (2006) from “Babeș-Bolyai” University, Cluj-Napoca, B.Eng. in Telecommunications
(2007), M.Eng. in Signal and Image Processing (2008), and Ph.D. in Electronics and
Telecommunications from the Technical University of Cluj-Napoca. He is a lecturer in the Faculty of
Electronics, Telecommunications and Information Technology at the Technical University of
Cluj-Napoca.