Kurdistan Journal of Applied Research (KJAR)
Print-ISSN: 2411-7684 | Electronic-ISSN: 2411-7706
Website: Kjar.spu.edu.iq | Email: kjar@spu.edu.iq
Design and Implementation of a Chatbot for Kurdish Language Speakers Using Chatfuel Platform

Hemn Mela Karim Barznji
Information Technology, Computer Science Institute
Sulaimani Polytechnic University
Sulaimani, Iraq
dr.hemn@yahoo.com

Jamal Ali Hussein
Computer Department, College of Science
University of Sulaimani
Sulaimani, Iraq
jamal.ali@univsul.edu.iq
Article Info
Volume 5 - Issue 2 - December 2020
DOI: 10.24017/science.2020.2.10
Article history:
Received: 22 Sept 2020
Accepted: 30 December 2020
ABSTRACT
A chatbot is a software agent that conducts intelligent conversations between machines and
humans. Chatbots mostly depend on Natural Language Processing (NLP). In this paper, we
present the design and implementation of a chatbot that helps Kurdish speakers find answers
through online text conversations instead of direct contact with human agents. The NLP-based
software agent is implemented using the Chatfuel platform. Chatfuel uses artificial
intelligence to communicate with humans by simulating human conversations through voice
commands or texts. The proposed chatbot is tested as an electronic tourist guide that helps
visitors to the religious places in the mountainous village of Barzanja, which is located in
Iraqi Kurdistan. The case study is conducted using three hundred questions and answers, and
one hundred volunteers participated in it. A participant asks a question and the bot provides
an answer if it recognizes the question; otherwise, it provides a default answer along with a
suggestion of how to use the system properly. The data from this experiment are collected and
analyzed, and problems regarding the Kurdish language are detected. Designing software
agents for processing Kurdish texts faces many challenges. Kurdish texts have not yet been
processed using NLP. In addition, Kurdish font disorder and the lack of standardized
keyboards and writing styles make processing Kurdish text difficult. Furthermore, the Kurdish
language consists of a variety of dialects with different typing styles. In this research, we
specifically focus on the design of a software agent for the Central Kurdish (Sorani) dialect.
We managed to solve some of the problems related to the Kurdish language and suggest
solutions to others.
Keywords: Chatbot, Kurdish Language, NLP, Software Robotic, Artificial Intelligent, Kurd Agent.
Copyright © 2020 Kurdistan Journal of Applied Research. All rights reserved.
Kurdistan Journal of Applied Research | Volume 5 – Issue 2 – December 2020 | 118
1. INTRODUCTION
A chatbot is a software agent that conducts friendly, intelligent conversations between a
machine and a human. The term chatbot originally referred to text conversation, but chatbots
now also communicate through other means such as voice. Enhanced chatbots can also reply
using images, related links, galleries, videos, etc. [1]. The basic concept and objective of
chatbot creation is that the computer talks with a human in natural language, in a way that is
as human as possible. Based on this, a chatbot is built for conversation and usually offers a
specific service, such as searching the Internet, organizing files on a computer, arranging
engagements and appointments, and so on [2].
There are numerous chatbot applications that help users find flights, hotels, travel
destinations, and jobs. Chatbots are used in many areas, such as e-commerce, banking,
entertainment, health, and education [3]. Chatbots have many advantages over direct
conversations, such as availability, reduced costs, and the enhancement of social experiences.
There are many software applications available for creating chatbot agents. These applications
are simple to use because they allow users to create chatbots without writing any code, but
they also enable professional developers to write code if necessary. Common examples of AI
chatbot platforms are Chatfuel, Bot Framework, Wit.ai, Manychat, and Dialogflow.
We worked with the Chatfuel platform since it is one of the best chatbot engines that use
artificial intelligence (AI) to communicate with humans. It simulates human conversation
through voice commands, text conversations, or both. Its focus is on automation and
adaptability, from answering questions to collecting data [4]. We use this chatbot builder to
create a chatbot that acts as a tourist guide for Barzanja¹ village. Although we focus on
speakers of the Central Kurdish (Sorani) dialect who use the Kurdish alphabet, which is a
common writing style in Iraqi Kurdistan, the proposed chatbot system is capable of enhancing
its responses whenever users enter a new word or question, even if they use a different dialect
or writing style. A case study consisting of three hundred questions and answers and one
hundred participants is conducted.
While carrying out this research, we encountered several challenges, such as Kurdish font
disorder, different typing styles, punctuation, and the non-standardized Kurdish language. We
provide solutions to some of the problems related to using the Kurdish language in NLP
systems and suggest solutions to others.
1.1 Challenges and Problems
The following challenges and problems are related to NLP for the Kurdish language:
1- Writing Style Variation: the Kurdish language has several formal writing styles as well
as some informal ones, such as the Latin style, the English alphabet, the Central Kurdish
(Sorani) style and the Arabic alphabet. For example, the following words have the same
meaning ('Come' in English) but are written using either different words or the
same word with different writing styles:
بھو بةو وەرە وةرة بێ بيَ بي بيَ
2- Dialectal Variation: the Kurdish language has different dialects according to the area
of Kurdistan. Each dialect has its own grammar and vocabulary. Mixing these dialects is
problematic when using Kurdish text in NLP-based systems such as chatbots.
3- Orthographic Ambiguity and Inconsistency: in the Kurdish language, vocabulary,
grammar and writing styles sometimes cause ambiguity and inconsistency that are
difficult to detect and classify.
¹ A thirteenth-century village located in a mountainous area near the city of Sulaimani (Sulaymaniyah)
in Iraqi Kurdistan. It is home to many Islamic and Yarsani shrines and holy sites.
4- Morphological Richness: Kurdish words are inflected for several features, such
as gender, number, person, voice, aspect, etc., which take different forms according
to the dialect. In addition, some words have close spellings but totally different
meanings, as in the following pairs:
Milk شیر
Lion شێر
Short كوڵ
Blunt كول
5- Idiomatic Dialogue Expressions: some idiomatic expressions in the Kurdish
language are common while others are less common, which makes it challenging for
the bot to reply to a question. The following two expressions have a close
meaning but use different words:
Good Morning بھیانیت باش
Morning of Light بھیانیت رۆشن
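The writing-style variation listed above (item 1) is often reduced by normalizing the text before matching. The following is a minimal Python sketch of that idea, not the method used in this paper. The mapping tables are assumptions drawn from the variants shown in item 1 (for example وةرة versus وەرە and بيَ versus بێ), and the function name `normalize_kurdish` is hypothetical.

```python
# Two-character legacy sequences are replaced first, because their
# first character would otherwise be rewritten by the single-letter map.
DIGRAPHS = {
    "\u064A\u064E": "\u06CE",  # Arabic yeh + fatha, a legacy way to type the letter ێ
    "\u0647\u200C": "\u06D5",  # heh + zero-width non-joiner, typed for the vowel ە
}

# Single look-alike letters mapped to one canonical Central Kurdish letter
# (an assumed, incomplete table for illustration only).
SINGLE = {
    "\u064A": "\u06CC",  # Arabic yeh -> Farsi yeh (ی)
    "\u0649": "\u06CC",  # alef maksura -> Farsi yeh (ی)
    "\u0643": "\u06A9",  # Arabic kaf -> keheh (ک)
    "\u0629": "\u06D5",  # teh marbuta, typed in legacy fonts for ە
}


def normalize_kurdish(text: str) -> str:
    """Return text with the assumed variant spellings unified."""
    for old, new in DIGRAPHS.items():
        text = text.replace(old, new)
    text = "".join(SINGLE.get(ch, ch) for ch in text)
    # Collapse runs of whitespace so that spacing differences do not matter.
    return " ".join(text.split())
```

With such a step, two users who type the same word with different keyboards or fonts are more likely to reach the same stored question.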
1. LITERATURE REVIEW
Natural language processing (NLP) is new for the Kurdish language, so it is hard to find NLP
work on Kurdish in the literature. Therefore, we also review research on languages that are
close to Kurdish, such as Arabic.
An artificially intelligent chatbot agent for the Kurdish language has been proposed in [5]. It
uses the Artificial Intelligence Markup Language (AIML) on the free and open-source platform
Pandorabots with a Facebook account. It can answer queries in Kurdish. This system takes the
input in text format, displays the results as text, and provides accurate and quick answers to
users.
The writing style of the Arabic language is close to that of Kurdish. In [6], an Arabic chatbot
for children with Autism Spectrum Disorder (ASD) is developed based on pattern matching (PM).
A new Arabic short text similarity (STS) measure is used to extract facts from the user's
responses in order to match rules in a scripted conversation in a particular domain (science).
The researchers based the system on grammatical and morphological analysis.
The first chatbot for an Arabic dialect was presented in [7], which explores the challenges
that face the creation of conversational agents. It uses the Egyptian dialect of the Arabic
language. The researchers illustrate several solutions and explain all the elements of the
BOTTA chatbot. The database of BOTTA is available to all researchers who work on chatbots for
Arabic or for languages that are close to Arabic in their writing styles, such as Kurdish, Urdu
and Persian.
The research in [8] presents several obstacles and challenges that need to be resolved when
developing an effective Arabic chatbot. This is important for other languages that use an
alphabet close to the Arabic alphabet.
2. THE PROPERTIES OF THE KURDISH LANGUAGE
The Kurdish language is the backbone of this research, so we define and introduce it here,
focusing especially on the Central Kurdish (Sorani) branch.
The Kurdish language (Kurdish: Kurdí, كوردی, Kurdî) is a branch of the Indo-European family of
languages, and its dialects are members of the northwestern subdivision of the Indo-Iranian
languages. Kurdish is an independent language because it has all the features of a language,
such as historical development, continuity, a grammatical system and a rich living vocabulary
[9]. The Kurdish language belongs to the “Median” language or “Proto-Kurdish”. The people of
Kurdistan speak several dialects of the language. The Kurdish dialects are [10]:
1) The Northern Kurdish dialects, also called Kurmanjí and Badínaní.
2) The Central dialects, also called Soraní.
3) The Southern Kurdish dialects, also called Pehlewaní or “Pahlawanik”.
The other two branches of the Kurdish language are Dimílí, also called “Zaza”, and Hewramí,
also called Goraní. According to some linguistic references, the Lurrí (Luri) branch of the
southwestern Indo-Iranian languages is classified as a sub-branch of Kurdish [9].
The Kurdish nation is divided among five countries: Iraq, Iran, Turkey, Armenia and Syria.
Kurdish literature was written in Arabic, Persian or Turkish, although the Kurdish language,
written in the Central Kurdish (Sorani) and Kurdish Latin alphabet scripts, began to appear in
the seventh century AD. Nowadays, Kurdish is written in three different writing styles:
1) The Kurds of Iraq and Iran use the Central Kurdish alphabet, for example: کوردی.
2) The Kurds of Turkey and Syria use the Kurdish Latin alphabet, for example: Kurdî.
3) The Kurds of Armenia use the Cyrillic alphabet, for example: քրդի.
The Central Kurdish alphabet has 34 letters representing 37 sounds, whereas the Kurdish Latin
alphabet commonly has 31 letters [11], as shown in Table 1. In Sulaimani and Kirkuk, the
letter D is often softened to the point of being inaudible. The most prominent example of this
case is the present modal prefix "دە":
دەڕۆم (Standard) = ئەڕۆم (In Sulaimani and Kirkuk)
In the Kurdish language, especially in the Sorani dialect, no words begin with "ر"; all
initial Rs are the trilled "ڕ" [12] [13]:
رۆژگار = ڕۆژگار
Generally, the letters of the Kurdish language are pronounced as written, and they are divided
into two groups. Vowel letters consist of the long and short vowels: ئــ ، ا ، ئـا ، ە ، یــَ ، ێ ، یــــ ، ی ، و ، وو.
Consonant letters are the remaining letters, which represent consonant sounds (see Table 1).
Words are constructed by combining two or more letters [13] [14]. According to the parts of
speech and syntax, Kurdish words are classified into eight parts: the verb (فھرمان/كار), the
noun (ناو), the pronoun (ڕاناو/جێناو), the adjective (ئاوەڵناو), the adverb (ئاوەڵفرمان), the
tools (ئامڕازەكان), the interjection (قسھھھڵدان) and the number (ژمارە). The noun (ناو) is a
word that can take the suffixes (the - ەكھ), (a, an - ێك) and (plural - ان), and it is a main
part of speech. Nouns name people, things and places, and they are not related to time, for
example:
كچ ، ساڵ، ئاو، كوڕ، دڵشاد، ڕووپاك، گوڵھگھنم، شھكرە سێوو
The pronoun (ڕاناو/جێناو) substitutes for nouns or noun phrases and designates persons. It is
very important because it is commonly used. Generally, pronouns are divided into the following
groups. Personal pronouns, such as من، تۆ، ئھو، ئێمھ، ئێوە، ئھوان. Possessive pronouns, which
are divided into four groups: transitive past verb pronouns such as م - مان، ت - تان ، ی - یان,
intransitive past verb pronouns such as م - ین، ی یت - ن، - - ن, transitive and intransitive
present verb pronouns such as م - ین، ی یت - ن، ات ێت - ن, and imperative verb pronouns such as
ه ، ن. Demonstrative pronouns, such as ئھمھ - ئھمانھ ، ئھوە - ئھوانھ. Relative pronouns, such
as كھ. Interrogative pronouns and adverbs, such as كێ، چی، كام، كوێ، چۆن، چھند.
The adjective (ئاوەڵناو) is a word that can take (-er, تر) and (-est, ترین), such as زیرەكتر
(smarter). The adverb (ئاوەڵفرمان) is a word that modifies verbs and adjectives, such as
بھ خێرایی (quickly). The tools (ئامڕازەكان) are words that express relations such as cause
between words or sentences, such as بھ، لھ، دە، لھگھڵ، بۆ، لھسھر، ێك، ەكھ، و. The interjection
(قسھھھڵدان) usually expresses emotion and is capable of standing alone, such as ئۆی (Oh),
بھراست؟ (Really?) and یھڵ (Come on!). The number (ژمارە) consists of cardinal and ordinal
numbers. Finally, the verb (فرمان/كار) expresses existence, action, or occurrence
[13] [14] [15] [16] [17] [18] [19].
Table 1: Kurdish letters in the Central Kurdish (Sorani) alphabet and the Kurdish Latin alphabet.
No. | Sorani letter | Sorani examples | Kurdish sound | Latin examples | Latin letter
1 | ئـ | ئیش | /a/ | Amed; zana | A, a
2 | ا | ئاوات | /a:/, long a | Batman; kellebab | B, b
3 | ب | باران؛ داب | /b/ | Urdun; dund | C, c
4 | پ | پار؛ قاپ | /p/ | Çoman; kiç | Ç, ç
5 | ت | تاو؛ پیت | /t/ | Dihok; berd | D, d
6 | ج | جام؛ تاج | /d͡ʒ/ | Erzirrom; bere | E, e
7 | چ | چاو؛ خاچ | /t͡ʃ/ | Êwan, pêrê | Ê, ê
8 | ح | حھیران؛ حھ سار | /h/ | Firat; def | F, f
9 | خ | خاک؛ ناخ | /x/ | Gever; deng | G, g
10 | د | داس؛ ئازاد | /d/ | Hewlêr, Ah | H, h
11 | ر | برین؛ بیر | /r/ | Sirinçik | I, i
12 | ڕ | ڕاست؛ مھڕ | Bold /R/ | Îlam, sînî | Î, î
13 | ز | زانست؛ ناز | /z/ | Jawero; kîj | J, j
14 | ژ | ژیار؛ کیژ | /ʒ/ | Kobanê; erk | K, k
15 | س | سارد؛ کراس | /s/ | Laliş; mel | L, l
16 | ش | شین؛ باش | /ʃ/ | Mehabad; dem | M, m
17 | ع | عێراق؛ دهعبا | /gh/ | Nisêb | N, n
18 | غ | غونچھ ؛ قۆناغ | /gh/ | Pawe; esp | P, p
19 | ف | فیل؛ ماف | /f/ | Oremar; boso | O, o
20 | ڤ | ڤیان؛ حھ ڤده | /v/ | Qûçan; deq | Q, q
21 | ق | قیر؛ تاق | /Q/ | dar | R, r
22 | ک | کانی؛ پیک | /k/ | Ranye; perr | RR, rr
23 | گ | گا؛ سینگ | /ɡ/ | Sine; kras | S, s
24 | ل | لاو؛ دیل | /l/ | Şengal, baş | Ş, ş
25 | ڵ | گوڵھ؛ ماڵ | Bold /L/ | Tirbesipî; kat | T, t
26 | م | مار؛ سام | /m/ | Urdun; dund | U, u
27 | ن | ناو؛ بان | /n/ | Ûrmiye, sûtû | Û, û
28 | ھـ | ھیوا؛ بھھره | /h/ | Vêtnam; bav | V, v
29 | ه | ھھڵھ؛ ھھڵوژه | /e/ | Wan; naw | W, w
30 | و | وانھ ؛ داو | /u/ | Xaneqîn; qonax | X, x
31 | ۆ | دۆ؛ دۆشاو | /o/ | Yêrîvan; key | Y, y
32 | وو | دوو؛ بوو | /u:/, long /u/ | Zaxo; berz | Z, z
33 | ی | یاد؛ دایھ | /i/ | |
34 | ێ | دێ؛ ڕێ | Bold i | |
The sentence is the largest unit in the syntax of the Kurdish language. It consists of the
parts of speech above in the roles of subject, object, adverbial, adjunct, complement and verb:
ھێمن ڕادەكات – Hemin runs.
ھێمن و ھاورێكانى، ھھموو ڕۆژێ لھ باخچھكھدا ڕادەكھن – Hemin and his friends run at the park every day.
In addition to the transitive verb and the intransitive verb, there is a third type of verb
called the connecting (linking) verb. The word or phrase that accompanies a connecting verb is
not an object, but a complement. The subject complement can be a noun, an adjective or a
prepositional phrase. The most common linking verb is "بوون", which is equivalent to “to be” in
English:
چرۆ وانھبێژە – Chro is a lecturer. | چرۆ لھ زانكۆیھ – Chro is at the university.
The order of the components of a Kurdish sentence is subject (بكھر), object (بھركار) and verb
(كار/فرمان). Generally, the tenses in the Kurdish language are present and past [13] [20] [21].
Table 2: Kurdish language tenses
Tense (كات) | Type (جۆر) | Rule (یاسا) | Examples (نموونھ)
Past (ڕابردوو) | Simple (سادە) | past stem + attached pronoun | ھاتین، خواردمان
Past (ڕابردوو) | Continuous (بھردەوام) | دە + past stem + attached pronoun | دەھاتین، دەمانخوارد
Past (ڕابردوو) | Perfect (تھواو) | past stem + بوو + attached pronoun | ھاتبووین، خواردبوومان
Past (ڕابردوو) | Conditional (مھرجى) | ب + past stem + attached pronoun + ایھ | بھاتینایھ، بمانخواردایھ
Present (ڕانھبردوو) | Simple (سادە) | دە + future stem + attached pronoun | دەنووسین، دەنووین
Present (ڕانھبردوو) | Perfect (تھواو) | past stem + وو + attached pronoun + ــھ (with intransitive verbs, ــھ appears only with the third person singular) | خواردوومانھ، نوستووین
Present (ڕانھبردوو) | Simple Conditional (مھرجى سادە) | ب + future stem + attached pronoun | بڕۆین، بشۆین
Present (ڕانھبردوو) | Perfect Conditional (مھرجى تھواو) | past stem + ب + attached pronoun | ھاتبین، كردبمان
Present (ڕانھبردوو) | Imperative (داخوازى) | ب + future stem + ــھ (if the stem ends in a consonant) | بنووسھ، بشۆ
3. NATURAL LANGUAGE PROCESSING FOR KURDISH LANGUAGE
Natural language processing (NLP) is a branch of linguistics, computer science, and artificial
intelligence that helps computers understand, interpret and manipulate human language [22].
Although NLP was originally known as Natural Language Understanding (NLU), it is now well
understood that, while NLU is the goal of NLP, that goal has not yet been achieved. The main
goal of NLP is “to accomplish human-like language processing” [22]. Developing a program that
understands natural language remains a difficult problem. NLP has many applications, such as
searching and indexing large texts, word processing software, information retrieval, text
categorization using classification, automatic text summarization, and question answering
(QA) applications [23].
To understand NLP and apply it to the Kurdish language, two things are necessary. The first is
the components and grammar of the Kurdish language. The second is the components of NLP,
which are divided into Natural Language Understanding (NLU) and Natural Language Generation
(NLG) [24]. The main techniques of NLP are syntax analysis and semantic analysis:
First – Syntax Analysis: it refers to the way words are arranged in the structure of a text so
that sentences have grammatical meaning. It is also known as parsing, and it comprises several
techniques:
Tokenization and pattern matching are essential operations used to break up a string into
words, punctuation marks, numbers and other items. For example, the sentence
“Dr. Hawzhin, Mr. Sherko Barznji”, said Kurdistan, introducing us.
can be tokenized as follows, where each token is enclosed in single quotation marks:
‘"’ ‘Dr.’ ‘Hawzhin’ ‘,’ ‘Mr.’ ‘Sherko’ ‘Barznji’ ‘"’ ‘,’ ‘said’ ‘Kurdistan’ ‘,’ ‘introducing’ ‘us’ ‘.’
The important task in this step is finding the boundaries of words. In the Kurdish language,
words can be fully separated by a space, separated by a half-space, or joined to each other:
من پسپۆری بھرنامھ سازیم
من پسپۆری بھرنامھسازیم
In the first sentence, بھرنامھ سازیم counts as two words if we split on the space separator,
but in the second sentence بھرنامھسازیم is one word. The second form is correct, while the
first form is incorrect.
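The tokenization step described above can be sketched with a regular expression. This is an illustrative sketch, not the tokenizer used in this study. It assumes that the half-space is the zero-width non-joiner (U+200C) and that a small, hypothetical abbreviation list (`ABBREVIATIONS`) is available; both are assumptions.

```python
import re

ZWNJ = "\u200c"                   # zero-width non-joiner, assumed to be the "half-space"
ABBREVIATIONS = ("Dr.", "Mr.")    # assumed abbreviation list, kept as single tokens

TOKEN_RE = re.compile(
    "|".join(re.escape(a) for a in ABBREVIATIONS)   # known abbreviations first
    + r"|\d+(?:\.\d+)?"                             # integers and decimals
    + r"|[^\W\d_]+(?:" + ZWNJ + r"[^\W\d_]+)*"      # letters, optionally joined by a half-space
    + r"|[^\w\s]"                                   # any single punctuation mark
)


def tokenize(text: str) -> list[str]:
    """Split text into abbreviation, number, word and punctuation tokens."""
    return TOKEN_RE.findall(text)


sentence = '"Dr. Hawzhin, Mr. Sherko Barznji", said Kurdistan, introducing us.'
print(tokenize(sentence))
```

Because the word pattern allows a half-space inside a word, a compound typed with a half-space stays one token, while the same compound typed with an ordinary space is split into two tokens.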
Parts of speech (POS): another NLP task is part-of-speech tagging, which identifies the part of
speech of every word and groups words that have the same grammatical properties, for example:
Table 3: POS tags for an example Kurdish sentence
Kurdish word | English word | POS tag
من | I | Pronoun
دەرۆم | go | Verb
بۆ | to | Preposition
قوتابخانھ | School | Noun
. | . | Punctuation
Lemmatization is a common technique for reducing words to their dictionary form. It requires a
detailed dictionary in which the algorithm can look up words and link them to their respective
lemmas.
چوون -- دەچم -- دەچووم -- بچۆ --چوو بوو –چووە، باش -- باشتر -- باشترین
Stemming: it is the process of converting inflected or derived words to their stem, base or
root form. Commonly, it removes all the suffixes and affixes based on some predefined
linguistic rules.
ه--.ه ، ھھیھ –ھھیھ ، ھھبووە – ھھیھ ، ھھبێت – ھھیھ، ھھبوو – ھھیھ، ڕۆشتنھوە – ڕۆشتن، ڕۆشت بوو -- ڕۆشت
Stemming for the Kurdish language is classified into verbal stemming and non-verbal stemming.
Input: x
for rule in creation_rule_of_verb do
    if x matches rule then
        x′ = root of x
        if x′ is in verb_dict then
            add x′ to Collection_Of_Suggestion
        end
    end
end
if Collection_Of_Suggestion != empty then
    return shortest word in Collection_Of_Suggestion as stem
else
    return x
end
Figure 1: Algorithm of verbal stemming in the Kurdish language
Input: x
for collection in (Collection_of_Suffix, Collection_of_affix) do
    for y in collection do
        if x ends with y then
            x′ = x[0:(len(x) − len(y))]
            if x′ is in Lexicon_dictionary then
                add x′ to candidate_Collection
            end
        end
    end
end
if candidate_Collection != empty then
    return shortest word in candidate_Collection as stem
else
    return x
end
Figure 2: Algorithm of non-verbal stemming in the Kurdish language
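The non-verbal stemming procedure of Figure 2 can be written as a short Python function. The sketch below follows the steps of the figure; the suffix list and the lexicon are tiny hypothetical samples (taken from the noun and adjective endings mentioned in Section 2), not the resources used in this research.

```python
# Hypothetical sample resources; a real system would load full lists.
SUFFIXES = ["ەکە", "ێک", "ان", "ترین", "تر"]   # the, a/an, plural, -est, -er
LEXICON = {"کوڕ", "کچ", "باش"}                  # boy, girl, good


def stem_non_verbal(word, suffixes=SUFFIXES, lexicon=LEXICON):
    """Return the shortest lexicon entry obtained by removing one suffix."""
    candidates = []
    for suffix in suffixes:
        if word.endswith(suffix):
            # Cut the suffix off the end of the word (x' in Figure 2).
            base = word[: len(word) - len(suffix)]
            if base in lexicon:
                candidates.append(base)
    # Figure 2 returns the shortest candidate, or the input if none was found.
    return min(candidates, key=len) if candidates else word


for w in ["کوڕان", "باشترین", "کچێک"]:
    print(w, "->", stem_non_verbal(w))
```

The lexicon check is what keeps the procedure from stripping endings that only look like suffixes, since a candidate is accepted only if it is a known word.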
Parsing: this includes performing grammatical analysis of the given sentence. The syntactic
parser usually receives a sentence with its token boundaries marked as input and returns a
parsed syntax as output.
Second – Semantic Analysis: it refers to the meaning that is conveyed by the text and focuses
on identifying the meaning of language. It is the difficult part of NLP and has not yet been
fully solved. Several computational techniques and algorithms have been created to understand
and interpret words. Common techniques are the following. Named Entity Recognition (NER) is
one of the most common tasks in semantic analysis; it extracts entities from text. The entities
can be a name, a place, an email address, and more. Natural language generation uses a database
to obtain semantic goals and converts them into human language. It is a special technique that
is used to convert raw structured data into plain text.
[25] [26] [24] [23] [27] [28] [29] [30] [31]
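As an illustration of the named entity recognition task mentioned above, the following Python sketch extracts email addresses with a regular expression and looks up names and places in a small gazetteer. It is a simplified sketch, not the approach of the cited works; the gazetteer entries and the function name `extract_entities` are assumptions chosen from names that appear in this paper.

```python
import re

# Simple email pattern; adequate for illustration, not a full validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Hypothetical gazetteer mapping known names to entity types.
GAZETTEER = {"Barzanja": "PLACE", "Sulaimani": "PLACE", "Hemin": "PERSON"}


def extract_entities(text):
    """Return (entity, type) pairs: emails first, then gazetteer matches."""
    entities = [(m.group(), "EMAIL") for m in EMAIL_RE.finditer(text)]
    # Blank out emails so their parts are not looked up as separate words.
    remainder = EMAIL_RE.sub(" ", text)
    for token in re.findall(r"\w+", remainder):
        if token in GAZETTEER:
            entities.append((token, GAZETTEER[token]))
    return entities


print(extract_entities("Hemin visited Barzanja; write to kjar@spu.edu.iq"))
```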
4. CHATBOTS
The chatbot is a software agent based on artificial intelligence that is used in conversations
between users and a software robot [2]. This agent can interact with humans, using NLP as the
basis for this process [32]. A chatbot simulates conversations with human users, especially
over the Internet, but it is also possible to apply it as offline software for specific
purposes, such as travel guides, education or self-learning of languages [33]. The idea of the
chatbot goes back to the Turing test proposed by Alan Turing [34].
The Eliza chatbot is the first such agent; it was developed by Joseph Weizenbaum in the AI
Laboratory at the Massachusetts Institute of Technology (MIT) in 1966 [35] [36]. Parry is
another chatbot, created by the psychiatrist and computer scientist Kenneth Mark Colby at the
Department of Psychiatry at Stanford University in 1972 [35]. The chatbot Jabberwacky was
created by the British developer Rollo Carpenter in 1988. It was intended to simulate a natural
human dialogue [1]. In 1992, the Dr. Sbaitso chatbot was created by Creative Labs for MS-DOS.
In 1994, the term chatbot was coined. In 1995, ALICE, an acronym for “Artificial Linguistic
Internet Computer Entity”, was created by Richard Wallace. In 2001, Wallace published the AIML
specifications [2] [37].
SmarterChild was an intelligent chatbot created in 2001; its features included quick access to
data and funny personalized conversations [1]. In 2006, the Watson chatbot, a question
answering system, was created by IBM. In 2010, Siri was created by Apple as part of the Apple
operating system; it is a text and voice chatbot [1]. In 2012, the MITSUKU chatbot was created
by Steve Worswick. It uses AIML to understand the user’s response [38]. In the same year,
Google Now was developed by Google using NLP [39]. The Alexa chatbot was developed in 2015 by
Amazon; it is capable of voice interaction and uses NLP algorithms to receive, recognize and
respond to sounds [40]. In the same year, Microsoft created the Cortana bot for mobile devices
and personal computers that use the Windows operating system [41]. In 2016, the social
networking site Facebook provided a Messenger platform that allows developers to build bots for
Facebook users [42].
4.1. Types of Chatbot
Chatbots can be classified in several ways. Common categorizations according to different
parameters are the following.
Knowledge domain: chatbots are categorized based on the knowledge they have access to or the
amount of data they receive.
Service provided: chatbots are classified according to proxemics, the branch of knowledge that
deals with the amount of space that people feel it necessary to set between themselves and
others.
Goals: chatbots are categorized based on the primary objective they aim to achieve.
Input processing and response generation method: chatbots are divided into two models. In the
Rule-Based Approach (RBA), the chatbot answers questions based on a predefined set of rules
prepared in the early stages. In the Self-Learning Approach (SLA), the chatbot can learn on
its own using advanced technologies such as AI and machine learning. The SLA is divided into:
1- The retrieval-based approach has a much simpler structure for building bots and
provides more predictable results, because it applies functions to predefined
patterns of inputs and responses and uses a heuristic method to deliver a suitable
response. This approach is currently very common and more practical.
2- The generative approach is the future of chatbots and aims to build smarter
chatbots. Unfortunately, it is not yet widely used by developers, because it is
still mostly confined to laboratories.
If a chatbot converses about general topics and responds properly, it is an open-domain
chatbot. Otherwise, if a chatbot is about a specific topic and a specialized subject, it is a
closed-domain chatbot.
[43] [44] [45] [46]
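The retrieval-based approach described in this subsection can be sketched in a few lines of Python: the bot compares the user's question with predefined questions using a simple heuristic (here, word overlap) and falls back to a default answer when no stored question is close enough. The question-answer pairs, the threshold value and the function names are hypothetical and serve only as an illustration.

```python
import re

# Hypothetical predefined question-answer pairs for a closed-domain bot.
QA_PAIRS = {
    "where is barzanja": "Barzanja is a village near Sulaimani in Iraqi Kurdistan.",
    "what can i visit in barzanja": "You can visit the shrines and holy sites of the village.",
}
DEFAULT_ANSWER = "Sorry, I did not understand. Please ask a short question about Barzanja."


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if a | b else 0.0


def reply(question, threshold=0.5):
    """Return the answer of the most similar stored question, or the default."""
    q = tokens(question)
    best = max(QA_PAIRS, key=lambda stored: jaccard(q, tokens(stored)))
    if jaccard(q, tokens(best)) >= threshold:
        return QA_PAIRS[best]
    return DEFAULT_ANSWER


print(reply("Where is Barzanja?"))
print(reply("Tell me a joke"))
```

The threshold controls how predictable the bot is: a high value makes it answer only near-exact questions, while a low value makes it answer more often at the risk of a wrong match.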
4.2. The design techniques of chatbots
The design techniques used by chatbot developers are:
1) Parsing: it analyzes and processes the user's input by using several NLP
functions, such as the trees in Python NLTK [47].
2) Artificial Intelligence Markup Language (AIML): it is the main technique used to
design chatbots [48].
3) ChatScript: this technique helps in cases where AIML returns no match. It
concentrates on the best syntax to build a reasonable default answer. It offers a set of
features such as variables, concepts, facts, and and/or logic operations [47].
4) Pattern matching: this artificial intelligence technique is used in chatbot design
to match the user's input with the answers stored in the database and then
return the matching response [49].
5) SQL and relational databases: a method that has recently been used in chatbot
design so that the chatbot remembers previous conversations.
6) Markov chain: it is used in chatbots to create responses that are more likely to be
useful and therefore more accurate. The idea of the Markov chain is that there is a
probability of occurrence for each letter or word in the same textual dataset [3] [50].
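To make the Markov chain idea in item 6 concrete, the following Python sketch builds a first-order word chain from a small text and generates a reply by repeatedly choosing one of the words that followed the current word in that text. The corpus, the function names and the fixed random seed are assumptions for illustration only.

```python
import random
from collections import defaultdict


def build_chain(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    words = corpus.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length=5, seed=0):
    """Generate up to `length` words, starting from `start`."""
    rng = random.Random(seed)          # fixed seed keeps the sketch repeatable
    word, output = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:              # no known continuation: stop early
            break
        # Words that followed more often appear more often in the list,
        # so they are chosen with proportionally higher probability.
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


chain = build_chain("the village is old and the village is high")
print(generate(chain, "the", length=3))
```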
4.3. AIML – Artificial Intelligence Markup Language
AIML (Artificial Intelligence Markup Language) is a standard language for creating artificial
intelligence applications. It is built as a dialect of the Extensible Markup Language (XML).
AIML is very important for AI software agents, especially for the development of natural
language software agents, because it provides the theoretical structure for their semantics
and syntax. AIML was developed between 1995 and 2000 by the Alicebot free software community
and Dr. Richard S. Wallace. AIML is based on the techniques of pattern recognition or pattern
matching. It is used for natural language modeling of conversations between humans and
chatbots that follow a stimulus-response approach [51]. The main purpose of AIML is to define
the knowledge that a chatbot has [52].
Technically speaking, the basic building block of AIML is the tag. Each tag consists of an
opening (start) tag and a closing (end) tag.
AIML has some fixed tags. Category, pattern, and template are the three most common and
important ones. The category tag defines a unit of knowledge in a conversation. The pattern
tag identifies the user input, and the template tag holds the response to that specific user
input. These three tags, like all AIML tags, must be written between the opening (start) AIML
tag and the closing (end) AIML tag:
پرسیارى بەكارھێنەر
وە�مى گونجاو بۆ پرسیارى بەكارھێنەر
Other common AIML tags are the following:
1- random tag: it is used to return a randomly chosen response to the same input. This tag is
used with the li tag, which carries the different response items:
وشە یان ڕستە وەك وەالم
وشە یان ڕستەى لێكنزیك وەك وەالم
وشە یان ڕستەى لێكنزیك وەك وەالم
وشە یان ڕستەى لێكنزیك وەك وەالم
2- set and get tags: they are used with variables. The set tag stores a value in a variable,
and the get tag retrieves the value from a variable:
نرخ پێدانى ھەمیشە گۆڕاوو
وەرگرتنەوەى نرخى ھەمیشە گۆڕاوو
3- tag: it is used to respond based on the context:
وە�مدانەوە بە پێى دەق
4- tag: it is used to create a line break.
تكایە ئەم وشانە بخوێنەرەوە
ئاو باران با
5- Button tags: these tags are used to create a button that applies a specific action, as in the following:
ڕاستترین بۆچوون دەربارەى وشەى بەرزنجە
وشەكەلە ڕاستى بەرزەنجە یە، دەگەرێتەوە بۆ وشەى بەرزەنگە، كە وشەیەكى
كۆنى كوردى گۆرانە، واتای بەرز، بڵند، با�، كەژ و كێوو دەدات
The text tag is optional and is used to show the text that appears on the button, whereas the content of the postback
tag is shown by the chatbot when the user clicks on the button. Sometimes the chatbot developer uses the URL tag instead.
ڕاستترین بۆچوون دەربارەى وشھى بھرزنجھ
https://barzanja.com 2
6- Quick reply tags: these tags are another rich media element with text and postback, similar to the postback button.
The text tag appears on the reply response, and the postback tag sends a message to the bot.
مێژووى بھرزەنجھ
Barzanja History
7- image tag: it is a rich media element tag that is used in advanced AIML chatbot implementations to solve
some problems and to let the chatbot respond to the user with an image.
barznja.png
8- video tag: this tag allows the chatbot to send back a video as a response:
barzanja.mp4
9- Card tag: it is wrapped around other tags to collect several elements, such as the image tag, button tags, the title
tag, the subtitle tag, text and so on. The result contains all of the rich media elements and their navigation:
barznja.png
شارۆچكھى بھرزەنجھ
زێدى زانایان
مێژووى بھرزەنجھ
Barznja History
ناوداران و زانایان
Zanayan
2 This URL does not exist on the Internet. It is used only to test the URL tag.
سەرچاوو
س�وو
س�وو، دڵخۆشم بە بینینت
[54] [55] [56] [57]
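The category, pattern and template tags described in this section can be illustrated with a very small interpreter. The Python sketch below embeds a two-category AIML document as a string, parses it with the standard library, and matches user input against the patterns. The English patterns and responses are hypothetical, and the matcher is a simplification that supports only exact patterns and a trailing `*` wildcard; it is not a full AIML engine such as the one provided by Pandorabots.

```python
import xml.etree.ElementTree as ET

# Hypothetical AIML knowledge base with two categories.
AIML_SOURCE = """
<aiml version="2.0">
  <category>
    <pattern>HELLO</pattern>
    <template>Hello, nice to see you.</template>
  </category>
  <category>
    <pattern>WHERE IS *</pattern>
    <template>I do not know where that is yet.</template>
  </category>
</aiml>
"""


def load_categories(source):
    """Return a list of (pattern, template) pairs from an AIML string."""
    root = ET.fromstring(source)
    return [
        (c.findtext("pattern").strip(), c.findtext("template").strip())
        for c in root.iter("category")
    ]


def respond(user_input, categories, default="I have no answer for that."):
    """Return the template of the first matching pattern, else a default."""
    words = user_input.upper().strip(" ?.!").split()
    for pattern, template in categories:
        p = pattern.split()
        if p == words:
            return template
        # A trailing * stands for one or more remaining words.
        if p and p[-1] == "*" and len(words) >= len(p) and words[: len(p) - 1] == p[:-1]:
            return template
    return default


categories = load_categories(AIML_SOURCE)
print(respond("hello", categories))
print(respond("Where is Barzanja?", categories))
```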
4.4. Chatbot platforms and construction components
We created a chatbot to investigate the problems of the Kurdish language with natural language
processing techniques, and we then suggest solutions to these problems. We use the Chatfuel
platform to create a chatbot for Barzanja village that consists of 300 questions and answers.
There are several software platforms available for creating chatbot agents. Common examples of
AI chatbot platforms are Chatfuel, Bot Framework by Microsoft, Wit.ai, Manychat, Dialogflow,
IBM Watson (powered by neural networks), Botsify, Reply.ai, Aivo, Pandorabots, Boost.ai, and
MobileMonkey [58]. Table 4 compares three of these platforms [59] [60] [61].
Table 4: Comparison between Microsoft Bot, Dialogflow and IBM Watson
Microsoft Bot | Dialogflow Bot | IBM Watson
Developed by Microsoft. | Developed by Google. | Developed by IBM.
It has an open-source SDK that is used to test the bot before deployment into the channel. | Inline code and multi-functional intelligent integrations. | Watson offers a pre-trained and pre-integrated architecture.
It supports text, SMS, video and speech. | It supports natural language and speech-to-text conversations. | It supports natural language processing and a question-answering system.
Interacts with Skype, Slack, etc. | Interacts with Google, Alexa, etc. | Detects diseases.
AI and machine learning bot. | Machine learning and AI bot. | Neural network and AI bot.
4.5. Chatfuel platform
Chatfuel is a popular platform for creating chatbots. It provides a WYSIWYG interface
that allows users to create chatbots [62] [4]. It is therefore a useful platform, since it
provides AI technology for scripting conversations interactively. Several companies use the Chatfuel
platform, such as Adidas, Uber, TechCrunch, British Airways, Goal.com, Volkswagen, and
MTV [3].
Other main properties of the Chatfuel platform are that it provides templates and prebuilt components
from which to create a chatbot, and that it builds chatbots directly by asking users to choose from
suggested topics to produce a meaningful conversation [58] [4]. The design and
implementation of bots consist of the following basic and important components:
1) Automate: an essential part of creating any bot. It consists of the following:
A- Block: the main part of the bot, used as the base to which the cards are linked.
Blocks are like the web pages of a website [63], [4].
B- Cards: the content of a block, which includes elements such as text, images, galleries,
videos, audio, comments, quick replies, attributes, and so on.
C- Plugin: a small program that enhances the bot [58].
2) Live chat: the part used to monitor active users while they are chatting.
3) AI Setup: the part where all possible questions and answers are entered [63].
The structure of Chatfuel makes it easy to build conversations between a human and a bot. The user
opens Facebook Messenger and then types a phrase or taps a button to start a conversation.
The Chatfuel engine determines the user's action and redirects it to a block or text. It then
replies to the user with the correct answer or the nearest matching one. Figure 3 illustrates these steps.
Figure 3: The general diagram of Chatfuel applications [63]
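The flow described above (the user taps a button or types a phrase, the engine resolves the action to a block, and an unrecognized input falls back to a default answer) can be sketched in Python as follows. This is an illustrative model only: the class `Bot`, its fields, and the sample phrases are hypothetical, and the string similarity from `difflib` is a stand-in for Chatfuel's matching, which is not specified here.

```python
from dataclasses import dataclass
from difflib import get_close_matches


@dataclass
class Bot:
    blocks: dict          # block name -> list of cards (plain strings here)
    buttons: dict         # button label -> block name
    ai_rules: dict        # known phrase -> block name
    default_block: str = "Default Answer"

    def handle(self, action: str) -> list:
        """Return the cards of the block chosen for a tapped button or typed phrase."""
        text = action.strip().lower()
        if text in self.buttons:                      # the user tapped a button
            return self.blocks[self.buttons[text]]
        # The user typed a phrase: take the closest known phrase, if any.
        nearest = get_close_matches(text, list(self.ai_rules), n=1, cutoff=0.6)
        if nearest:
            return self.blocks[self.ai_rules[nearest[0]]]
        return self.blocks[self.default_block]        # nothing recognized


bot = Bot(
    blocks={
        "History": ["History of Barzanja ..."],
        "Geography": ["Geography of Barzanja ..."],
        "Default Answer": ["Sorry, I did not understand. Please tap a button."],
    },
    buttons={"barzanja history": "History", "barzanja geography": "Geography"},
    ai_rules={
        "what is the history of barzanja": "History",
        "where is barzanja": "Geography",
    },
)

print(bot.handle("Barzanja History"))                # button tap
print(bot.handle("what is the histry of barzanja"))  # typo, nearest known phrase
print(bot.handle("xyz"))                             # falls back to the default block
```

The `cutoff` value controls how tolerant the sketch is of misspellings; a lower value accepts looser matches at the cost of more wrong answers.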
5. THE PROPOSED BARZANJA CHATBOT
Barzanja Chatbot is a tourism guide to the religious holy sites in the village of Barzanja for
Kurdish language speakers. We built it using AIML, with Chatfuel as the platform. It
consists of 300 general and common questions with their answers.
Figure 4: The dashboard of the Barzanja chatbot
5.1. Barzanja chatbot components
The design and implementation of the Barzanja Chatbot software agent consist of blocks, cards,
and AI setup as the main components.
5.1.1. Blocks and Cards
A- Block: We use blocks to prepare answers and to connect them with questions. When a user
sends a question, the bot sends the matching block to the user as the answer. Other blocks hold
special information that the bot sends to users. For example, the welcome block appears
when a user gets started, while the default answer block replies to a user who sends
a question that the bot cannot recognize. In AIML, a block corresponds to a category tag; all other
tags are cards placed between the opening and closing category tags to create
the components of the block, such as text, image, and video. For example:
Figure 5: Wireframe diagram of a block in the Barzanja chatbot, consisting of a block name and cards.
B- Cards:
The most common cards in the Barzanja Chatbot are text cards, but we sometimes use cards that represent
images, galleries, quick replies, and so on. Figures 7 and 8 show examples of a block and its
contents (cards) with AIML code.
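The correspondence described above, where an AIML category plays the role of a block and the tags inside it play the role of cards, can be sketched in Python as follows. The AIML fragment is a hypothetical example written for this illustration (it is not the original code of the Barzanja chatbot), the tag names `text` and `image` inside the template are assumed, and `category_to_block` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical AIML category used only for illustration.
AIML_CATEGORY = """
<category>
  <pattern>WELCOME</pattern>
  <template>
    <text>Welcome to the Barzanja history page.</text>
    <image>barznja.png</image>
  </template>
</category>
"""


def category_to_block(xml_text):
    """Read one AIML category as a block: the pattern names it, the template holds its cards."""
    category = ET.fromstring(xml_text.strip())
    name = category.findtext("pattern").strip()
    cards = [
        (child.tag, (child.text or "").strip())   # (card type, card content)
        for child in category.find("template")
    ]
    return {"block": name, "cards": cards}


block = category_to_block(AIML_CATEGORY)
print(block["block"])
for card_type, content in block["cards"]:
    print(card_type, "->", content)
```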
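Figure 5 content: block name «پرسیارى بەكارھێنەر» (the user's question); card «وەڵامى گونجاو بۆ پرسیارى بەكارھێنەر» (a suitable answer to the user's question).
Figure 6 content: quick replies «مێژووى بەرزەنجە» (Barzanja History), «جوگرافیا» (Barzanja Geography), and «ناوداران» (Nawdaran).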
Figure 6: A quick reply card that the chatbot shows to users as suggested answers.
The Welcome Message is a common block and card in a chatbot. Its GUI and the corresponding
AIML are as follows:
Figure 7: GUI of block and card example
5.1.2. Set Up AI
We use this part of the dashboard to enter the 300 questions and the possible bot answers to
users’ questions.
Figure 8: Wireframe diagram of the general Set Up AI and the AIML code for Set Up AI
The AIML code below shows sample questions and the bot’s answers.
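Sample 1, user phrases (spelling variants of "Barzanja", "picture", and "album"):
#بەرزنجە # بەرزەنجە # بەرزنگە # برزنجە # زانیارى وێنەى # ئەلبوم # ئەلبووم# وینە # وێنە #
Sample 1, bot answer (image gallery):
شارۆچكەى بەرزەنجە BH.png (The town of Barzanja)
قەڵاى سرۆچك Sro.png (Srochik Castle)
مەرقەدەكان Mar.png (The shrines)
Sample 2, user phrases (variants of "What is Barzanja?" and "Introduce Barzanja"):
#بەرزنجە چیە؟ #برزنجە؟# بەرزنجە بناسێنە #برزنجە جیە؟# بەرزنجە جیە؟ #
Sample 2, bot answer:
شارۆچكەیەكە لە شارباژێڕ، 62 كیلۆمەتر لە سلێمانى یەوە دورە. زێدى زانایانى بەرزەنجى یە، ئەزھەرى كوردستانە.
(It is a town in Sharbazher, 62 km from Sulaimani. It is the home of the Barzanji scholars, the Azhar of Kurdistan.)
barznja.png
Welcome message:
بەخێربێیت (Welcome)
بەخێربێن بۆ لاپەرەى تایبەت بە مێژووى بەرزەنجە، كە سەرجەم باس و بابەتەكانى تایبەتە بە مێژووى بەرزەنجە لە كۆتاى سەدەى پێنجى كۆچی تا ئیمڕۆ.
ئەم لاپەرەیە لەلایەن نەوەكانى حاجى بابا شێخى بەرزەنجى یەوە بە رێوە دەبرێ.
دووبارە بەخێربێن.
(Welcome to the page dedicated to the history of Barzanja, whose topics all concern the history of Barzanja from the end of the fifth century AH to the present day. This page is run by the descendants of Haji Baba Sheikh Barzanji. Welcome once again.)
پەیوەندى تەلەفۆنى (Phone contact): 07701515582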
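The sample entries above pair several spelling variants of a question with one answer. A minimal Python sketch of that idea follows, assuming that `#` simply separates alternative phrasings and that matching is exact after trimming whitespace. The function names, the `DEFAULT ANSWER` placeholder, and the `GALLERY` label are hypothetical, and the sketch does not reproduce Chatfuel's own matching.

```python
# Three variants taken from the sample phrases: two spellings of "picture" and "album".
SAMPLE_LINE = "# وێنە # وینە # ئەلبووم #"


def parse_variants(line):
    """Split a '#'-delimited line into trimmed, non-empty phrase variants."""
    return [part.strip() for part in line.split("#") if part.strip()]


def build_rules(entries):
    """Map every phrase variant of every entry to that entry's answer."""
    rules = {}
    for variants_line, answer in entries:
        for variant in parse_variants(variants_line):
            rules[variant] = answer
    return rules


def reply(rules, question, default="DEFAULT ANSWER"):
    """Return the stored answer for a known variant, otherwise the default answer."""
    return rules.get(question.strip(), default)


rules = build_rules([(SAMPLE_LINE, "GALLERY")])
variants = parse_variants(SAMPLE_LINE)

print(len(variants))
print(reply(rules, variants[1]))
print(reply(rules, "unknown question"))
```

Listing every expected misspelling as its own variant is simple but scales poorly, which is one reason inconsistent Kurdish spellings make such rule tables hard to maintain.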
5.1.3. Live Chat
6. USING AND ANALYZING THE CHATBOT
7. RESULTS AND DISCUSSION
8. CONCLUSION AND FUTURE WORK