Kurdistan Journal of Applied Research (KJAR)
Print-ISSN: 2411-7684 | Electronic-ISSN: 2411-7706
Website: Kjar.spu.edu.iq | Email: kjar@spu.edu.iq

Design and Implementation of a Chatbot for Kurdish Language Speakers Using Chatfuel Platform

Hemn Mela Karim Barznji
Information Technology, Computer Science Institute, Sulaimani Polytechnic University, Sulaimani, Iraq
dr.hemn@yahoo.com

Jamal Ali Hussein
Computer Department, College of Science, University of Sulaimani, Sulaimani, Iraq
jamal.ali@univsul.edu.iq

Article Info
Volume 5 - Issue 2 - December 2020
DOI: 10.24017/science.2020.2.10
Article history: Received: 22 Sept 2020; Accepted: 30 December 2020

ABSTRACT
A chatbot is a software agent that is used to conduct intelligent conversations between machines and humans. Chatbots mostly depend on Natural Language Processing (NLP). In this paper, the design and implementation of a chatbot are presented to help Kurdish speakers find answers through online text conversations instead of direct contact with human agents. The NLP-based software agent is implemented using the Chatfuel platform. Chatfuel uses artificial intelligence to communicate with humans by simulating human conversations through voice commands or texts. The proposed chatbot is tested as an electronic tourist guide that helps visitors to the religious places in the mountainous village of Barzanja, which is located in Iraqi Kurdistan. The case study is conducted using three hundred questions and answers, and one hundred volunteers participated in it. A participant asks a question and the bot provides an answer if it recognizes the question; otherwise it provides a default answer along with a suggestion on how to use the system properly. The data from this experiment are collected and analyzed, and problems regarding the Kurdish language are detected. Designing software agents for processing Kurdish texts faces many challenges. Kurdish texts have not yet been processed using natural language processing (NLP).
In addition, Kurdish font disorder and the lack of standardized keyboards and writing styles make processing Kurdish text difficult. Furthermore, the Kurdish language consists of a variety of dialects with different typing styles. In this research, we specifically focus on the design of a software agent for the Central Kurdish (Sorani) dialect. We managed to solve some of the problems related to the Kurdish language and suggest solutions to others.

Keywords: Chatbot, Kurdish Language, NLP, Software Robotic, Artificial Intelligent, Kurd Agent.

Copyright © 2020 Kurdistan Journal of Applied Research. All rights reserved.

1. INTRODUCTION
A chatbot is a software agent that is used to conduct friendly, intelligent conversation between a machine and a human. The term chatbot originally referred to text conversation, but chatbots now also use other means of communication such as voice. Enhanced chatbots can also reply using images, related links, galleries, video, etc. [1]. The basic concept and objective of a chatbot is for the computer to converse with a human in natural language, in a way that is as human-like as possible. Based on this, a chatbot is built for conversation and usually offers a specific service such as searching the Internet, organizing files on a computer, or arranging engagements and appointments [2]. There are numerous chatbot applications that help users find flights, hotels, travel destinations, and jobs. Chatbots are used in many areas, such as e-commerce, banking, entertainment, health, and education [3]. Chatbots have many advantages over direct conversations, such as availability, reduced costs, and the enhancement of social experiences. There are many software applications available to create chatbot agents.
These applications are simple to use because they allow users to create chatbots without writing any code, while still enabling professional developers to write code if necessary. Common examples of AI chatbot platforms are Chatfuel, Bot Framework, Wit.ai, Manychat, and Dialogflow. We have worked with the Chatfuel platform since it is one of the best chatbot engines that use artificial intelligence (AI) to communicate with humans. It simulates human conversation through voice commands, text conversations, or both. Its focus is on automation and adaptability, from answering questions to collecting data [4]. We use this chatbot builder to create a chatbot that acts as a tourist guide for Barzanja¹ village. Although we focus on speakers of the Central Kurdish (Sorani) dialect who use the Kurdish alphabet, which is a common writing style in Iraqi Kurdistan, the proposed chatbot system is capable of improving its responses whenever users enter a new word or question, even if they use a different dialect or writing style. A case study consisting of three hundred questions and answers and one hundred participants is conducted. While carrying out this research, we encountered several challenges, such as Kurdish font disorder, different typing styles, punctuation, and the non-standardized Kurdish language. We provide solutions to some of the problems related to using the Kurdish language in NLP systems and suggestions for others.

1.1 Challenge and Problems
The following challenges and problems are related to NLP of the Kurdish language:

1- Writing Styles Variation: Kurdish has several formal writing styles along with some informal ones, such as the Latin style, the English alphabet, the Central Kurdish (Sorani) style, and the Arabic alphabet.
For example, the following words all have the same meaning ('come' in English) but are written either as different words or as the same word in different writing styles: بھو بةو وەرە وةرة بێ بيَ بي بيَ

2- Dialectal Variation: Kurdish has different dialects according to the area of Kurdistan. Each dialect has its own grammar and vocabulary. Mixing these dialects is problematic when Kurdish text is used in NLP-based systems such as chatbots.

3- Orthographic Ambiguity and Inconsistency: In Kurdish, vocabulary, grammar, and writing styles sometimes cause ambiguity and inconsistency that are difficult to detect and classify.

4- Morphological Richness: Kurdish words are inflected for several features, such as gender, number, person, voice, and aspect, whose forms differ according to the dialect. For example, the following pairs of words have close spellings but totally different meanings:
Milk شیر / Lion شێر
Short كوڵ / Blunt كول

5- Idiomatic Dialogue Expressions: Some idiomatic expressions in Kurdish are common while others are less common, which makes it challenging for the bot to reply to a question. The following two expressions have a close meaning but use different words:
Good Morning بھیانیت باش
Morning of Light بھیانیت رۆشن

¹ A thirteenth-century village located in a mountainous area near the city of Sulaimani (Sulaymaniyah) in Iraqi Kurdistan. It is home to many Islamic and Yarsani shrines and holy sites.

1. LITERATURE REVIEW
Natural language processing (NLP) is new for the Kurdish language, so it is hard to find NLP work on Kurdish in the literature. Therefore, we also review research on languages that are close to Kurdish, such as Arabic.
An artificial intelligence chatbot agent for the Kurdish language was proposed in [5], using the Artificial Intelligence Markup Language (AIML) on the free and open-source platform Pandorabots with a Facebook account. It can answer queries in Kurdish. The system takes its input in text format, displays the results as text, and provides accurate and quick answers to users. The writing style of Arabic is close to that of Kurdish. In [6], an Arabic chatbot for children with Autism Spectrum Disorder (ASD) is developed based on pattern matching (PM). A new Arabic short text similarity (STS) measure is used to extract facts from the user's responses and match them to rules in a scripted conversation in a particular domain (science). The researchers based the system on grammatical and morphological features. The first chatbot for an Arabic dialect was presented in [7], which explores the challenges facing the creation of conversational agents. It uses the Egyptian dialect of Arabic. The researchers illustrate several solutions and explain all elements of the BOTTA chatbot. The BOTTA database is available to all researchers working on chatbots for Arabic or for languages with a similar writing style, such as Kurdish, Urdu, and Persian. The research in [8] presents several obstacles and challenges that need to be resolved when developing an effective Arabic chatbot. This is important for other languages that use an alphabet close to the Arabic alphabet.

2. THE PROPERTIES OF THE KURDISH LANGUAGE
The Kurdish language is the backbone of this research, so we define and introduce it here, focusing especially on the Central Kurdish (Sorani) branch. Kurdish (Kurdish: Kurdí, كوردی, Kurdî) belongs to the Indo-European family of languages; its dialects are members of the northwestern subdivision of the Indo-Iranic languages.
Kurdish is an independent language because it has all the features of a language, such as historical development, continuity, a grammatical system, and a rich living vocabulary [9]. The Kurdish language descends from the "Median" or "Proto-Kurdish" language. The people of Kurdistan speak several dialects of the language. The Kurdish dialects are [10]:
1) Northern Kurdish dialects, also called Kurmanjí and Badínaní.
2) Central dialects, also called Soraní.
3) Southern Kurdish dialects, also called Pehlewaní or "Pahlawanik".
The other two branches of the Kurdish language are Dimílí, also called "Zaza", and Hewramí, also called Goraní. According to some linguistic references, Lurrí (Luri), which belongs to the southwestern branch of the Indo-Iranic languages, is classified as a sub-branch of Kurdish [9].

The Kurdish nation is divided among five countries: Iraq, Iran, Turkey, Armenia, and Syria. Kurdish literature was written in Arabic, Persian, or Turkish, although Kurdish written in the Central Kurdish (Sorani) and Kurdish Latin alphabet scripts began to appear in the seventh century AD. Nowadays, Kurdish is written in three different writing styles:
1) The Kurds of Iraq and Iran use the Central Kurdish alphabet, for example: کوردی.
2) The Kurds of Turkey and Syria use the Kurdish Latin alphabet, for example: Kurdî.
3) The Kurds of Armenia use the Cyrillic alphabet, for example: քրդի.
The Central Kurdish alphabet has 34 letters for the 37 sounds of Kurdish, whereas the Kurdish Latin alphabet commonly has 31 letters [11], as shown in Table 1. In Sulaimani and Kirkuk, the letter D is often softened to the point of being inaudible.
The most prominent example of this case is the present modal prefix "دە": دەڕۆم (Standard) = ئەڕۆم (in Sulaimani and Kirkuk).

In Kurdish, especially in the Sorani dialect, no word begins with "ر"; all initial Rs are the trilled "ڕ" [12] [13]: رۆژگار = ڕۆژگار

Generally, the letters of Kurdish are pronounced as written. They are divided into two groups:
Vowel letters: these consist of long and short vowels, namely ە , ئــ ، ا ، ئـا, and وو ,و ,ی یــــ ,ێ ، یــَ.
Consonant letters: these are the remaining letters, which represent consonant sounds (see Table 1).
Words are constructed by combining two or more letters [13] [14].

According to the parts of speech and syntax, Kurdish words are classified into 8 parts: the verb (فھرمان/كار), the noun (ناو), the pronoun (ڕاناو/جێناو), the adjective (ئاوەڵناو), the adverb (ئاوەڵفرمان), the tools (ئامڕازەكان), the interjection (قسھھھڵدان), and the number (ژمارە).

The noun (ناو) is a main part of speech. It is a word that names a person, thing, or place and that can take the suffixes ەكھ (the), ێك (a, an), and ان (plural). It is not related to time. Examples: كچ ، ساڵ، ئاو، كوڕ، دڵشاد، ڕووپاك، گوڵھگھنم، شھكرە سێوو

The pronoun (ڕاناو/جێناو) substitutes for nouns or noun phrases and designates persons; it is very important because it is so commonly used. Generally, pronouns are divided into:
- Personal pronouns, such as من، تۆ، ئھو، ئێمھ، ئێوە، ئھوان.
- Possessive pronouns, which are divided into four groups: transitive past verb pronouns, such as م - مان، ت - تان ، ی - یان; intransitive past verb pronouns, such as م - ین، یت - ن، - - ن; transitive and intransitive present verb pronouns, such as م - ین، یت - ن، ات ێت - ن; and imperative verb pronouns, such as ه ، ن.
- Demonstrative pronouns, such as ئھمھ - ئھمانھ ، ئھوە - ئھوانھ.
- Relative pronouns, such as كھ.
- Interrogative pronouns and adverbs, such as كێ, چی, كام, كوێ, چۆن, چھند.

The adjective (ئاوەڵناو) is a word that can take the suffixes تر (-er) and ترین (-est), such as زیرەكتر (smarter).
The adverb (ئاوەڵفرمان) is a word that modifies verbs and adjectives, such as بھ خێرایی (quickly). The tools (ئامڕازەكان) are words that are used to express cause and to create relations between words or sentences, such as بھ، لھ، دە، لھگھڵ، بۆ، لھسھر، ێك، ەكھ، و. The interjection (قسھھھڵدان) usually expresses emotion and is capable of standing alone, such as ئۆی (Oh), بھراست؟ (Really?), and �یھڵ (Come on!). The number (ژمارە) consists of cardinal and ordinal numbers. Finally, the verb (فرمان/كار) expresses existence, action, or occurrence [13] [14] [15] [16] [17] [18] [19].

Table 1: Kurdish alphabet letters: Central Kurdish (Sorani) alphabet and Kurdish Latin alphabet.
No. | Sorani letter | Sorani examples | Kurdish sound | Latin letter | Latin examples
1 | ئـ | ئیش | /a/ | A, a | Amed; zana
2 | ا | ئاوات | /a:/, long a | B, b | Batman; kellebab
3 | ب | باران؛ داب | /b/ | C, c | Urdun; dund
4 | پ | پار؛ قاپ | /p/ | Ç, ç | Çoman; kiç
5 | ت | تاو؛ پیت | /t/ | D, d | Dihok; berd
6 | ج | جام؛ تاج | /d͡ʒ/ | E, e | Erzirrom; bere
7 | چ | چاو؛ خاچ | /t͡ʃ/ | Ê, ê | Êwan, pêrê
8 | ح | حھیران؛ حھ سار | /h/ | F, f | Firat; def
9 | خ | خاک؛ ناخ | /x/ | G, g | Gever; deng
10 | د | داس؛ ئازاد | /d/ | H, h | Hewlêr, Ah
11 | ر | برین؛ بیر | /r/ | I, i | Sirinçik
12 | ڕ | ڕاست؛ مھڕ | Bold /R/ | Î, î | Îlam, sînî
13 | ز | زانست؛ ناز | /z/ | J, j | Jawero; kîj
14 | ژ | ژیار؛ کیژ | /ʒ/ | K, k | Kobanê; erk
15 | س | سارد؛ کراس | /s/ | L, l | Laliş; mel
16 | ش | شین؛ باش | /ʃ/ | M, m | Mehabad; dem
17 | ع | عێراق؛ دهعبا | /gh/ | N, n | Nisêb
18 | غ | غونچھ ؛ قۆناغ | /gh/' | P, p | Pawe; esp
19 | ف | فیل؛ ماف | /f/ | O, o | Oremar; boso
20 | ڤ | ڤیان؛ حھ ڤده | /v/ | Q, q | Qûçan; deq
21 | ق | قیر؛ تاق | /Q/ | R, r | dar
22 | ک | کانی؛ پیک | /k/ | RR, rr | Ranye; perr
23 | گ | گا؛ سینگ | /ɡ/ | S, s | Sine; kras
24 | ل | الو؛ دیل | /l/ | Ş, ş | Şengal, baş
25 | ڵ | گو�ڵھ؛ ماڵ | Bold /L/ | T, t | Tirbesipî; kat
26 | م | مار؛ سام | /m/ | U, u | Urdun; dund
27 | ن | ناو؛ بان | /n/ | Û, û | Ûrmiye, sûtû
28 | ھـ | ھیوا؛ بھھره | /h/ | V, v | Vêtnam; bav
29 | ه | ھھڵھ؛ ھھڵوژه | /e/ | W, w | Wan; naw
30 | و | وانھ ؛ داو | /u/ | X, x | Xaneqîn; qonax
31 | ۆ | دۆ؛ دۆشاو | /o/ | Y, y | Yêrîvan; key
32 | وو | دوو؛ بوو | /u:/, long /u/ | Z, z | Zaxo; berz
33 | ی | یاد؛ دایھ | /i/ | |
34 | ێ | دێ؛ ڕێ | Bold i | |

The sentence is the largest unit in Kurdish syntax. It consists of the parts of speech above in the roles of subject, object, adverbial, adjunct, complement, and verb:
Hemin runs: ھێمن ڕادەكات
Hemin and his friends run at the park every day: ھێمن و ھاورێكانى، ھھموو ڕۆژێ لھ باخچھكھدا ڕادەكھن

In addition to the transitive verb and the intransitive verb, there is a third type of verb called the connecting (linking) verb. The word or phrase that accompanies a connecting verb is not an object but a complement. The subject complement can be a noun, an adjective, or a prepositional phrase. The most common linking verb is "بوون", which is equivalent to "to be" in English:
Chro is in university: چرۆ لھ زانكۆیھ
Chro is lecturer: چرۆ وانھبێژە

The order of the components of a Kurdish sentence is subject (بكھر), object (بھركار), and verb (كار / فرمان). Generally, the tenses in Kurdish are present and past [13] [20] [21].

Table 2: Kurdish language tenses
Tense (كات) | Type (جۆر) | Rule (یاسا) | Example (نموونھ)
Past (ڕابردوو) | Simple (سادە) | ڕەگى ڕابردوو + جێناو (ڕاناو) ى لكاوو | ھاتین ، خواردمان
Past (ڕابردوو) | Continuous (بھردەوام) | دە + ڕەگى ڕابردوو + جێناو (ڕاناو) ى لكاوو | دەھاتین ، دەمانخوارد
Past (ڕابردوو) | Perfect (تھواو) | ڕەگى ڕابردوو + بوو + جێناو (ڕاناو) ى لكاوو | ھاتبووین ، خواردبوومان
Past (ڕابردوو) | Conditional (مھرجى) | ب + ڕەگى ڕابردوو + جێناو (ڕاناو) ى لكاوو + ایھ | بھاتینایھ ، بمانخواردایھ
Present (ڕانھبردوو) | Simple (سادە) | دە + ڕەگى داھاتوو + جێناو (ڕاناوو) لكاوو | دەنووسین ، دەنووین
Present (ڕانھبردوو) | Perfect (تھواو) | ڕەگى ڕابردوو + وو + ڕاناوو (جێناوو) ى لكاوو + ــھ ، بھڵام (ــھ) لھگھڵ تێنھپھڕدا و تھنیا لھگھڵ كھسی سێھھمى تاك دەردەكھوێ | خواردوومانھ ، نوستووین
Present (ڕانھبردوو) | Simple Conditional (مھرجى سادە) | ب + ڕەگى داھاتوو + ڕاناوو (جێناوو) ى لكاوو | بڕۆین ، بشۆین
Present (ڕانھبردوو) | Perfect Conditional (مھرجى تھواو) | ڕەگى ڕابردوو + ب + ڕاناوو (جێناوو) ى لكاوو | ھاتبین ، كردبمان
Present (ڕانھبردوو) | Imperative (داخوازى) | ب + ڕەگى داھاتوو + ــھ (ئھگھر ڕەگھكھ بھ كپ كۆتاى ھاتبێ) | بنووسھ ، بشۆ

In the rules above, ڕەگى ڕابردوو is the past stem, ڕەگى داھاتوو is the future (present) stem, and جێناو (ڕاناو) ى لكاوو is the bound pronoun.

3. NATURAL LANGUAGE PROCESSING FOR KURDISH LANGUAGE
Natural language processing (NLP) is a branch of linguistics, computer science, and artificial intelligence that helps computers understand, interpret, and manipulate human language [22]. Although NLP was originally known as Natural Language Understanding (NLU), it is now well understood that while NLU is the real goal of NLP, that goal has not yet been achieved. The main goal of NLP is "to accomplish human-like language processing" [22]. Developing a program that understands natural language remains a difficult problem. NLP has many applications, such as [23]:
- Searching and indexing large texts.
- Word processor software.
- Information retrieval.
- Text categorization using classification.
- Automatic text summarization software.
- Question Answering (QA) applications.

To understand NLP and apply it to Kurdish, two things are necessary. The first is the components and grammar of the Kurdish language. The second is the components of NLP, which is divided into Natural Language Understanding (NLU) and Natural Language Generation (NLG) [24].

The main techniques of NLP are syntax analysis and semantic analysis.

First – Syntax Analysis: it refers to how the words of a sentence are arranged in the structure of the text so that they have grammatical meaning. It is also known as parsing. It includes several techniques:

Tokenization and pattern matching are essential operations used to break up a string into words, punctuation marks, numbers, and other items. For example, the sentence
"Dr. Hawzhin, Mr. Sherko Barznji", said Kurdistan, introducing us.
can be tokenized as follows, where each token is enclosed in single quotation marks:
‘"’ ‘Dr.’ ‘Hawzhin’ ‘,’ ‘Mr.’ ‘Sherko’ ‘Barznji’ ‘"’ ‘,’ ‘said’ ‘Kurdistan’ ‘,’ ‘introducing’ ‘us’ ‘.’
The important task in this step is finding the boundaries of words. In Kurdish, the
In Kurdish language, the Kurdistan Journal of Applied Research | Volume 5 – Issue 2 – December 2020 | 123 boundary of words can determine using the fully separated by space, separated by half – space or be related to each other. من پسپۆری بھرنامھ سازیم من پسپۆری بھرنامھسازیم In the first sentence, the بھرنامھ سازیم is two words, if we determine by space separator, but in the second sentence the بھرنامھسازیم is one words. The second form is correct but the first form in incorrect. Parts of speech (POS): Another NLP task is speech tagging to identify the part of speech for every word and categorized of words that have same properties of grammatical, for example: Table 3: POS tag for Kurdish sentences example Kurdish sentence . من دەرۆم بۆ قوتابخانھ English sentence . School to go I POS Tags Punctuation Noun Preposition Verb Pronoun Lemmatization is a common technique to solve words in the form of their dictionary, which requires a detailed dictionary in which the algorithm can search for words and link them to their respective prepositions. چوون -- دەچم -- دەچووم -- بچۆ --چوو بوو –چووە، باش -- باشتر -- باشترین Stemming: it is a process to convert from inflected or derivates words to steam, base or root form. commonly, it removes all the suffixes and affixes based on some predefined linguistic rules. ه--.ه ، ھھیھ –ھھیھ ، ھھبووە – ھھیھ ، ھھبێت – ھھیھ، ھھبوو – ھھیھ، ڕۆشتنھوە – ڕۆشتن، ڕۆشت بوو -- ڕۆشت Stemming for Kurdish language classify to verbal stemming and non-verbal steaming. 
Input: x
for rule in creation_rule_of_verb do
    if x == rule then
        x′ = root of x
        if x′ is in verb_dict then
            add x′ to Collection_Of_Suggestion
        end
    end
end
if Collection_Of_Suggestion != empty then
    return shortest word in Collection_Of_Suggestion as stem
else
    return x
end
Figure 1: Algorithm of verbal stemming in Kurdish language

Input: x
for Collection_of_Suffix/Collection_of_affix do
    for y in Collection_of_Suffix/Collection_of_affix do
        if x ends with y then
            x′ = x[0:(len(x)−len(y))]
            if x′ is in Lexicon_dictionary then
                add x′ to candidate_Collection
            end
        end
    end
end
if candidate_Collection != empty then
    return shortest word in candidate_Collection as stem
else
    return x
end
Figure 2: Algorithm of non-verbal stemming in Kurdish language

Parsing: this involves performing grammatical analysis of the given sentence. The syntactic parser usually receives a sentence whose boundaries are marked as input and returns its parsed syntactic structure as output.

Second – Semantic Analysis: it refers to the meaning conveyed by the text and focuses on identifying the meaning of language. It is the difficult part of NLP and has not yet been fully solved. Several computational techniques and algorithms have been created to understand and interpret words. Common techniques are:

Named Entity Recognition (NER) is one of the most common tasks in semantic analysis. It extracts entities from text; the entities can be names, places, email addresses, and more.

Natural language generation: this involves using a database to obtain semantic goals and convert them into human language. It is a technique that is used to convert raw structured data into plain text [23] [24] [25] [26] [27] [28] [29] [30] [31].

4. CHATBOTS
A chatbot is a software agent based on artificial intelligence that is used in conversation between users and a software robot [2].
This agent interacts with humans using NLP as the basis of the process [32]. A chatbot simulates conversations with human users, especially over the Internet, but it can also be deployed as offline software for specific purposes, such as a travel guide, education, or self-learning of languages [33]. The idea of the chatbot goes back to the Turing test proposed by Alan Turing [34]. The Eliza chatbot was the first such agent; it was developed by Joseph Weizenbaum in the AI Laboratory at the Massachusetts Institute of Technology (MIT) in 1966 [35] [36]. Parry is another chatbot, created by the psychiatrist and computer scientist Kenneth Mark Colby at the Department of Psychiatry at Stanford University in 1972 [35]. The chatbot Jabberwacky was created by the British developer Rollo Carpenter in 1988. It was intended to simulate natural human dialogue [1]. In 1992, the Dr. Sbaitso chatbot was created by Creative Labs for MS-DOS. In 1994, the term chatbot was coined. In 1995, ALICE, an acronym for "Artificial Linguistic Internet Computer Entity", was created by Richard Wallace. In 2001, Wallace published the AIML specifications [2] [37]. SmarterChild was an intelligent chatbot created in 2001; its features included quick access to data and funny, personalized conversations [1]. In 2006, the Watson chatbot, a question answering system, was created by IBM. In 2010, Siri was created by Apple as part of the Apple operating system; it is a text and voice chatbot [1]. In 2012, the MITSUKU chatbot was created by Steve Worswick. It uses the AIML language to understand the user's response [38]. In the same year, Google Now was developed by Google using NLP [39]. The Alexa chatbot was developed in 2015 by Amazon; it is capable of voice interaction and uses NLP algorithms to receive sounds, recognize them, and respond [40]. In the same year, Microsoft created the Cortana bot for mobile devices and personal computers that use the Windows operating system [41].
In 2016, the social networking site Facebook provided a Messenger platform that allows developers to build bots for Facebook users [42].

4.1. Types of Chatbot
Chatbots can be classified in several ways. The common categories of chatbots, according to different parameters, are:
- Knowledge domain: chatbots are categorized based on the knowledge they have access to or the amount of data they receive.
- Service provided: bots are classified according to proxemics, the branch of knowledge that deals with the amount of space that people feel it necessary to set between themselves and others.
- Goals: chatbots are categorized based on the primary objective they aim to achieve.
- Input processing and response generation method: chatbots are categorized according to the method they use, which falls into 2 models:
  The Rule-Based Approach (RBA): the chatbot is trained on a predefined set of rules, established in the early stages, to answer questions.
  The Self-Learning Approach (SLA): the chatbot can learn on its own using advanced technologies such as AI and machine learning. It is divided into:
  1- Retrieval-based chatbots, which have a much simpler structure and provide more predictable results, because they apply functions to predefined patterns of input and responses and use a heuristic method to deliver a suitable response. This approach is currently very common and the more practical one.
  2- Generative-based chatbots, which are the future of chatbots and make it possible to build smarter bots. Unfortunately, this approach is not yet widely used by developers, because it is still mostly confined to laboratories.
If a chatbot converses on general topics and responds properly, it is an open-domain chatbot. Otherwise, if it deals with a specific topic and a specialized subject, it is a closed-domain chatbot [43] [44] [45] [46].

4.2.
The design techniques of chatbots
The design techniques used by chatbot developers are:
1) Parsing: it is used to analyze and process the input from users by using several NLP functions, such as Python NLTK trees [47].
2) Artificial Intelligence Markup Language (AIML): it is the main technique used to design chatbots [48].
3) Chat Script: this technique helps in cases where AIML returns no match. It concentrates on the best syntax to build a reasonable default answer. It offers a set of features such as variables, concepts, facts, and logical and/or operations [47].
4) Pattern Matching: this artificial intelligence technique is used in chatbot design to match the input from users against the answers stored in the database and then return the matching response [49].
5) SQL and relational database: a method that has recently been used in chatbot design to let the chatbot remember previous conversations.
6) Markov Chain: it is used in chatbots to create responses that are more probable and therefore more accurate. The idea of the Markov chain is that each letter or word in a given textual dataset has a probability of occurrence [3] [50].

4.3. AIML – Artificial Intelligence Markup Language
AIML is a standard markup language for creating artificial intelligence applications. It is built as a dialect of the Extensible Markup Language (XML). AIML is very important for AI software agents, especially in the development of natural language software agents, because it provides the theoretical structure for both semantics and syntax. AIML was developed between 1995 and 2000 by the Alicebot free software community and Dr. Richard S. Wallace. AIML is based on the techniques of pattern recognition, or pattern matching. It is applied to natural language modeling for conversations between humans and chatbots that use a stimulus-response approach.
[51] The main purpose of AIML is to define the knowledge that a chatbot has [52]. Technically speaking, the basic building block of AIML is the tag. Each tag consists of an open/start tag and a close/end tag. AIML has some fixed tags, of which category, pattern, and template are the three most common and important. The category tag defines a unit of knowledge in a conversation. The pattern tag identifies the user input, and the template tag holds the response to that specific input. These three tags, like all AIML tags, must be wrapped between the open/start AIML tag and the close/end AIML tag. In the example, the pattern holds the text پرسیارى بەكارھێنەر (the user's question).

Other common AIML tags are the following:

1- <random> tag: it is used to return different responses to the same input at random. This tag is used with the <li> tag, which carries the different response items:
  • وشە یان ڕستە وەك وەالم
  • وشە یان ڕستەى لێكنزیك وەك وەالم
  • وشە یان ڕستەى لێكنزیك وەك وەالم
  • وشە یان ڕستەى لێكنزیك وەك وەالم

2- <set> and <get> tags: they are used with variables. The set tag is used to store a value in a variable, and the get tag is used to read a value from a variable:
نرخ پێدانى ھەمیشە گۆڕاوو (assigning a value to the variable)
وەرگرتنەوەى نرخى ھەمیشە گۆڕاوو (retrieving the value of the variable)

3- Context tag: it is used to respond based on the context:
وەڵامدانەوە بە پێى دەق (responding according to the context)

4- Line-break tag: it is used to create a line break:
تكایە ئەم وشانە بخوێنەرەوە (please read these words)

5- Button tags: these tags are used to create a button that applies a specific action, as in the following:
كۆنى كوردى گۆرانە، واتای بەرز، بڵند، باڵا، كەژ و كێوو دەدات
The text tag is optional and is used to display the text that appears on the button, while the content of the postback tag is shown by the chatbot when the user clicks on the button. Sometimes the chatbot developer uses the URL tag instead.

6- Quick reply tags: these tags are another rich media element with text and postback, similar to the postback button. The content of the text tag appears on the reply, while the postback tag sends a message to the bot:
مێژووى بھرزەنجھ (Barzanja History)

7- Rich media tag: it is a rich media element tag that is used in advanced AIML chatbot implementations to solve certain problems and to enrich the chatbot's response to the user:
barznja.png

8-