© 2020 Adama Science & Technology University. All rights reserved.
Ethiopian Journal of Science and Sustainable Development, e-ISSN 2663-3205, Volume 8 (1), 2021
Journal Home Page: www.ejssd.astu.edu.et
ASTU Research Paper

Corpus-Based Word Sense Disambiguation for Ge'ez Language

Amlakie Aschale Alemu1, Kinde Anlay Fante2
1Department of Electrical and Computer Engineering, Faculty of Technology, Debre Tabor University, Debre Tabor, Ethiopia
2Faculty of Electrical and Computer Engineering, Jimma Institute of Technology, Jimma University, Jimma, Ethiopia
Corresponding author, e-mail: amlakieaschale19@gmail.com
https://doi.org/10.20372/ejssdastu:v8.i1.2021.283

Article History: Received 20 August 2020; Received in revised form 08 December 2020; Accepted 24 December 2020

Abstract: In natural language processing, languages have a number of ambiguous words. The absence of automatic word sense disambiguation for a language is a challenge for the development of natural language processing applications such as information extraction, information retrieval, and machine translation. The aim of this study is to design a word sense disambiguation prototype model for Ge'ez words using corpus-based techniques. Due to the unavailability of a Ge'ez WordNet and annotated datasets, six ambiguous words were chosen for this study: ሀለፈ (halafe), ቆመ (ḱome), ባረከ (bareke), አስተርዓየ (astaraye), ገብረ (gebre), and ሰዓለ (se'ale). A total of 2,119 Ge'ez sense examples were collected for the six ambiguous words from Ge'ez literature. The performance of three corpus-based machine learning techniques (AdaBoost, SMO, and ADTree) was tested on the WEKA package. We evaluated the performance of three corpus-based machine learning approaches, namely unsupervised, supervised, and semi-supervised, for the disambiguation of the six Ge'ez words. The experimental results show that the best performance is achieved using the ADTree algorithm (a semi-supervised machine learning approach). The proposed method achieved an average Precision, Recall, F1-score, and Accuracy of 92.1%, 91.3%, 91%, and 91.1%, respectively, using the ADTree algorithm. A window size of 4-4 was found to be optimal for identifying the meanings of the selected ambiguous Ge'ez words using the ADTree algorithm.

Keywords: Natural Language Processing; Word Sense Disambiguation; Semi-supervised; ADTree; Ge'ez Language

1. Introduction

In the 21st century, the growth of information technology has made a large volume of information available to society. The importance of language for using this information is obvious, since language serves as a medium of communication among people. Language has the potential to express a wide range of ideas and to convey complex thoughts. In particular, natural language is used to exchange information among humans and has now reached the point of serving as an evaluation criterion for technology. To make the available information useful for society, an interest has emerged in using technology to process natural language. In response to this need, Natural Language Processing (NLP) has emerged, with a main focus on the computational treatment of natural language (Getahun Wassie and Million Meshesha, 2014).
NLP is a field of computer science that deals with the interaction between computers and humans in natural language, and it aims to enhance both human-to-human and human-to-computer communication (Solomon Mekonnen, 2010). It is normally used to describe the function of software or hardware components in a computer system that analyze or synthesize spoken or written language. There are in fact two distinct focuses of NLP: language processing and language generation. The former refers to the analysis of language to produce a meaningful representation, while the latter refers to the production of language from a representation (Pal et al., 2013). The field of NLP was originally referred to as Natural Language Understanding (NLU) in the early days of Artificial Intelligence. It is well agreed today that while the goal of NLP is true NLU, that goal has not yet been accomplished. A full NLU system would be able to paraphrase an input text, translate the text into another language, answer questions about the contents of the text, and draw inferences from the text (Mahmoodvand and Hourali, 2017).

The aim of NLP is to study problems in the automatic generation and understanding of natural languages. Natural language is understood as a tool that people use to express themselves, and it has specific properties that affect the efficiency of textual information retrieval systems: linguistic variation and ambiguity. NLP is also a subfield of artificial intelligence and linguistics (Naseer and Hussain, 2009). Ambiguity is one of the greatest challenges in NLP; the term refers to something that can be understood in two or more possible ways, or that has more than one meaning. It can appear at the sentence level (structural or syntactic ambiguity), at the word level (lexical ambiguity), or at the phonological level (phonological ambiguity). Ambiguity is a universally recognized linguistic phenomenon, which arises from the structure of the language and can be explained in terms of analysis at different levels (Daniel Jurafsky, 2018). Hence, developing word sense disambiguation (WSD) is crucial for natural language applications such as information extraction, machine translation, information retrieval, question answering, text summarization, and others.

In the field of computational linguistics, word sense disambiguation is defined as the problem of computationally determining which "sense" of a word is activated by the use of the word in a particular context. Lexical disambiguation in its broadest definition is nothing less than determining the meaning of every word in context, which appears to be a largely unconscious process in people. Due to the importance of WSD for understanding semantics and for many real-world applications, researchers have long tried to tackle this problem. So far, different word sense disambiguation techniques have been proposed for the Ethiopian languages Amharic (Getahun and Million, 2014; Seid and Yaregal, 2017; Solomon Mekonnen, 2010), Afan Oromo (Workneh Tesema et al., 2016), and Tigrigna (Mersa Mebrhatu, 2018). To the best of our knowledge, no word sense disambiguation model has been reported for the Ge'ez language. The objective of this work was to develop a word sense disambiguation model for six ambiguous Ge'ez words.
We have designed three different corpus-based machine learning models to compare the performance of different techniques. Through experiments, we have explored the best model and model parameters for six ambiguous Ge'ez words.

2. Materials and Methods

This section describes the design of the WSD system for the Ge'ez language. It mainly focuses on the preparation of the corpus, word selection, the architecture of the Ge'ez WSD model, document pre-processing techniques, the preparation of machine-readable datasets, and the evaluation techniques of the model. According to different scholars, the words to disambiguate can be selected by the researchers from a WordNet, which is available online, or from different sources or documents of the language that are annotated manually.

2.1. WordNet

WordNet is a lexical database; it provides a large repository of lexical items for some languages and is available online. WordNet was designed to establish relations between the four main parts of speech (POS): noun, verb, adjective, and adverb. WordNet defines relations between synsets and relations between word senses. A specific meaning of one word under one type of POS is called a sense, and a synset represents the smallest unit in WordNet, describing a specific meaning of a word. It includes the word itself, its explanation, and the synonyms of its meaning. The difference is that lexical relations are relations between members of two different synsets, whereas semantic relations are relations between two whole synsets (Workneh Tesema et al., 2016).

2.2. Words selected from documents

According to Mersa Mebrhatu (2018), the construction of a sense-tagged corpus requires a great amount of time and cost. For this reason, we selected a small number of ambiguous words in this study. Corpora that have sense information for all words have been built recently, but they are not large enough to provide sufficient disambiguation information for all words. Methods based on sense-tagged corpora therefore have difficulties in disambiguating the senses of all words, so the selection of the ambiguous words used in this study was based on the number of senses of each word. According to Leykun Berhanu (2005), there are words with multiple senses, from two up to sixteen. Due to the unavailability of a Ge'ez WordNet and annotated datasets, six ambiguous words with two senses each were chosen for this study. These ambiguous words were selected from ግስ (gis), which is found in the ቅኔ (kinie) schools of the Ethiopian Orthodox Tewahido Church. WSD performance can be affected by the distribution of training data across senses, meaning that the numbers of sense examples should be as equal as possible; a balanced distribution of training data was employed to maximize performance in the work of Solomon Mekonnen (2010).

2.3. Data Collection

In this research, we used a corpus-based approach. It is challenging to acquire a sense-annotated corpus for WSD studies because there is no standard sense-annotated corpus or context-based repository (WordNet) for the Ge'ez language. For this reason, data collection had to come before corpus preparation. Accordingly, the researchers collected data from different sources such as the Ge'ez Bible, Sinksar, Fithanegest, Gedile Semaetat, and teaching materials from the Ge'ez Department of Bahir Dar University. We first collected a large dataset containing 193,000 sentences (instances). To retrieve the sentences containing the selected ambiguous words, we developed a simple algorithm that accepts an ambiguous word from the user and then displays the sentences containing that word; in this way we obtained 2,119 sentences (instances) from the collected data. According to Yemane Kaleta et al. (2016), selecting sentences with ambiguous words from a variety of domains is very important for building an efficient and reliable WSD prototype, since similar domains usually restrict words to one sense. Therefore, to build an efficient and reliable WSD prototype for the Ge'ez language, we collected data from different domain areas.
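This retrieval step can be illustrated with a minimal Python sketch (our own illustration, not the implementation used in the study; the corpus file name, and the assumptions that sentences end with '።' and that words are separated by white space or ':', are ours):

```python
# Minimal sketch of the sentence-retrieval step: given an ambiguous
# word, collect every corpus sentence that contains it.
# Assumption: the corpus is one plain-text file in which sentences end
# with the Ge'ez full stop '።' and words are separated by spaces or ':'.

def retrieve_sentences(corpus_path, ambiguous_word):
    with open(corpus_path, encoding="utf-8") as f:
        text = f.read()
    # Split the corpus into sentences on the Ge'ez full stop.
    sentences = [s.strip() for s in text.split("።") if s.strip()]
    # Keep only sentences whose word list contains the target word.
    return [s + " ።" for s in sentences
            if ambiguous_word in s.replace(":", " ").split()]

# Example (hypothetical file name):
# hits = retrieve_sentences("geez_corpus.txt", "ሀለፈ")
# print(len(hits))
```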
2.4. Proposed System Architecture

The flow of activities used to develop the proposed WSD system is given in Figure 1. The proposed system architecture contains several steps. The first step is accepting sentences that contain ambiguous words. The next step is applying preprocessing activities such as normalization, tokenization, stop-word removal, stemming, and transliteration. In unsupervised learning, unlabeled datasets are given to the selected clustering algorithms to build the WSD prototype model for Ge'ez. In supervised learning, labeled datasets are given to the selected classification algorithms to build the model. In semi-supervised learning, a small number of labeled seed examples together with a large number of unlabeled datasets are given to the selected clustering algorithms in order to obtain fully labeled datasets based on the labeled seed examples; those fully labeled datasets are then given to the selected classification algorithms to build the WSD prototype model of the language.

Figure 1: Corpus-Based Ge'ez WSD System Architecture

2.4.1. Preprocessing Phase

Preprocessing describes any type of processing performed on raw data to prepare it for the next processing procedure. Hence, preprocessing is the preliminary step that transforms the data into a format that can be processed more easily and effectively. Preprocessing must ensure that the source text presented to NLP is in a usable form. In this study, preprocessing is the primary step for making our datasets compatible with Weka, the machine learning tool used in our study. In the preprocessing stage of this study, tokenization, stemming, stop-word removal, transliteration, and normalization are performed.

2.4.1.1. Normalization: In this study, character normalization is performed because variant character forms are not suitable for the subsequent preprocessing stages. In the Ge'ez writing system, characters with the same sound can have different symbols. For the selected words, these different symbols must be treated as the same character. As a result, in this study, symbols with the same sound were converted to one common form. For example, if a character is one of ዐ, ኣ, ዓ (with the sound a), it is changed to the equivalent order of አ; similarly, ሐ, ሓ, ሃ, ኀ and ኃ (all with the sound h) are converted to ሀ, yielding ሀለፈ. By the same token, all orders of ሠ (with the sound s) are changed to their equivalent orders of ሰ, yielding ሰአለ. In general, we normalized the characters only with respect to the words selected in our study, not for all Ge'ez words.
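As a concrete illustration, this normalization step can be sketched as a simple character mapping (a minimal sketch; it covers only the base orders of the substitutions named above, whereas a full implementation would also map the remaining vowel orders of each character):

```python
# Minimal sketch of character normalization for the selected words:
# map homophone Ge'ez characters onto one canonical form.
NORMALIZATION_MAP = {
    "ዐ": "አ", "ኣ": "አ", "ዓ": "አ",                        # 'a' variants -> አ
    "ሐ": "ሀ", "ሓ": "ሀ", "ሃ": "ሀ", "ኀ": "ሀ", "ኃ": "ሀ",  # 'h' variants -> ሀ
    "ሠ": "ሰ",                                             # 's' variant -> ሰ
}

def normalize(text):
    # Replace each variant character by its canonical counterpart.
    return "".join(NORMALIZATION_MAP.get(ch, ch) for ch in text)

# Example: a variant spelling of ሀለፈ collapses to the common form.
# normalize("ሐለፈ")  ->  "ሀለፈ"
```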
2.4.1.2. Tokenization: Tokenization is very important to this study. It is the process of breaking sentences into words or tokens. The corpus, which is a set of sentences, is first tokenized into words. Tokenization is done by splitting on the white space, the word separator ':', and special symbols between words. All punctuation marks, numbers, and special characters are removed from the text before the data is processed, since these punctuation marks have no relevance for identifying the meaning of ambiguous words in WSD. Therefore, except for '።', which is used to detect the end of a sentence, all punctuation is detached from words in the tokenization process. Tokenization is used to obtain context words for disambiguation. For instance, the sentence ወተርፈ : አዳም : ውስተ : ምድረ : ኤዶም : ወቃየንሰ : ሀለፈ : ወሀደረ : ታህተ : ምስራቀ : ኤዶም። is tokenized into ወተርፈ, አዳም, ውስተ, ምድረ, ኤዶም, ወቃየንሰ, ሀለፈ, ወሀደረ, ታህተ, ምስራቀ, ኤዶም, ።

2.4.1.3. Stop-word removal: After tokenization, we removed Ge'ez stop words, as they have no effect on the meaning of the ambiguous words. In this study, stop-word removal is used to remove stop words from the corpus because their absence or presence contributes nothing to identifying the appropriate sense, and not all tokenized words are necessary for this work. Because there is no standard stop-word list for the language, we collected stop words that are conjunctions, prepositions, and articles, for instance ባህቱ, እንተ, ከማሁ, ኩሉ, እምዛ, እስመ, ድህረ, እምነ. Since stop words do not have significant discriminating power for the meaning of ambiguous words, we filtered the stop-word list to ensure that only content-bearing words are included. Nevertheless, stop words like ሀበ and መንገለ were not removed from the corpus because they play a significant role for the word ሀለፈ.

2.4.1.4. Stemming: Stemming is the process of reducing morphological variants of words to a base or root form. In morphologically rich languages like Ge'ez, a stemmer leads to significant improvements in WSD systems. In Ge'ez, different terms are generated from the same root word depending on grammatical use. To create derivational and inflectional word forms, Ge'ez makes use of prefixes, suffixes, and infixes. Therefore, the extra characters that change the root word into different forms are stemmed from the corpus using a stemming algorithm that we developed for the language. This algorithm removes prefixes and suffixes only, since it is an affix-removal stemmer; to obtain the common form of the ambiguous words, we normalized infixes of the root word manually. For example, the ambiguous word ገብረ may appear as ገብሩ, ይገብር, ይግበር, etc. after removing prefixes and suffixes. To make these forms suitable for the machine learning algorithms, we manually mapped all of them to the single word ገብረ. The same was applied to the other ambiguous words used in this study, but not to the context words: normalization after stemming was not applied to the context words because of the long time it would take to normalize all of them.
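The three preprocessing steps above can be summarized in a short sketch (our own illustration; the stop-word set is a sample from the list above, and the affix lists are hypothetical placeholders for the study's stemming rules):

```python
# Minimal sketch of the preprocessing pipeline: tokenize, remove stop
# words, then stem by affix removal. Affix lists are illustrative only.
import re

STOP_WORDS = {"ባህቱ", "እንተ", "ከማሁ", "ኩሉ", "እምዛ", "እስመ", "ድህረ", "እምነ"}
PREFIXES = ["ወ", "ይ", "እ"]   # hypothetical sample prefixes
SUFFIXES = ["ሙ", "ኒ", "ሁ"]   # hypothetical sample suffixes

def tokenize(sentence):
    # Detach the sentence-end mark '።' and split on white space, the
    # word separator ':' and commas; '።' is kept as its own token.
    sentence = sentence.replace("።", " ። ")
    return [t for t in re.split(r"[\s:,]+", sentence) if t]

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    # Affix removal: strip one known prefix and one known suffix, keeping
    # at least two characters of the stem.
    for p in PREFIXES:
        if token.startswith(p) and len(token) > len(p) + 1:
            token = token[len(p):]
            break
    for s in SUFFIXES:
        if token.endswith(s) and len(token) > len(s) + 1:
            token = token[:-len(s)]
            break
    return token

# Example:
# toks = tokenize("ወተርፈ : አዳም : ውስተ : ምድረ : ኤዶም።")
# -> ['ወተርፈ', 'አዳም', 'ውስተ', 'ምድረ', 'ኤዶም', '።']
# stems = [stem(t) for t in remove_stop_words(toks)]
```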
2.4.2. Transliteration

After the preprocessing tasks above were completed for the collected Ge'ez documents, transliteration was performed. Transliteration is the representation of the characters of one language by corresponding characters of another language. In this study, transliteration from Ge'ez characters into Latin characters was carried out to make the documents compatible with the machine learning tool Weka (Getahun Wassie and Million Meshesha, 2014). Since WEKA accepts the Attribute Relation File Format (ARFF) or Comma-Separated Values (CSV), these file formats can only be produced after transliteration has been performed. The transliteration of the Ge'ez corpus was conducted using the System for Ethiopic Representation in ASCII (SERA).

2.5. Preparing Datasets

In this study, we used a corpus-based approach, which uses both labeled and unlabeled datasets for training and testing. We prepared labeled datasets for supervised learning, unlabeled datasets for unsupervised learning, and semi-labeled datasets for semi-supervised training. In the semi-supervised machine learning approach, large numbers of sentences need not be annotated manually; instead, we select representative seed examples for each sense of the ambiguous words. For selecting representative seed examples, the labeled/unlabeled data distribution in the training set is typically 85-98% unlabeled, with the rest labeled (Mahmoodvand and Hourali, 2017). Accordingly, we prepared 12% labeled and 88% unlabeled datasets for each of the six chosen ambiguous words before clustering. For example, the word ገብረ has 160 instances, of which 12% are labeled and 88% are unlabeled, which amounts to 20 labeled and 140 unlabeled instances.
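The proportions can be expressed directly in code (a minimal sketch; in the study the seed instances themselves are chosen by information gain, as described in the next section, so the positional slice below only illustrates the counts):

```python
import math

# Minimal sketch of the 12% labeled / 88% unlabeled split used to
# prepare seed and unlabeled sets for each ambiguous word.
def split_seed(instances, labeled_fraction=0.12):
    n_labeled = math.ceil(len(instances) * labeled_fraction)
    return instances[:n_labeled], instances[n_labeled:]

# For 'ገብረ' with 160 instances: ceil(160 * 0.12) = 20 labeled seeds
# and 140 unlabeled instances, matching the counts given above.
```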
When we label seed examples automatically, we apply the following techniques.

2.5.1. Seed Selection Techniques

In this section, the techniques applied in this study are presented. This research was conducted using a corpus-based approach; both labeled and unlabeled documents were used in the semi-supervised approach. The seed selection technique employs the method proposed in (Getahun Wassie and Million Meshesha, 2014) and consists of four steps.

Step 1. Selecting representative seed examples for each class or sense of the ambiguous words: Selecting representative seed examples for each class is effective, and those selected seed words are used to label unlabeled documents. Selecting seed words to obtain representative seed examples for the semi-supervised approach is a challenging task (Getahun Wassie and Million Meshesha, 2014), and selecting improper seed examples results in poor performance. Improper seed examples can be selected when we tag (label) our datasets randomly by hand: when humans select seed examples randomly, they may choose seed words that cannot differentiate the senses (meanings) of the ambiguous word. To minimize the manual selection of the limited number of seed examples from the total dataset, we used tree algorithms, because tree algorithms represent the concepts related to the target word starting from the root node. Tree algorithms use information gain as lexical knowledge, and information gain can minimize the subjectivity problem in the manual selection of seed examples (Solomon Mekonnen, 2010). Based on this, we used the ADTree algorithm to select seed words: we classified our datasets using the ADTree algorithm for each ambiguous word and then visualized the resulting tree. From the tree visualization, we took the seed words with the highest information gain in the tree structure. Those seed words were then used to discriminate between the senses of the ambiguous word.

Step 2. Clustering both labeled and unlabeled seed examples using "classes to clusters evaluation" mode: Here, we did not use the resulting clusters from the ADTree algorithm for classification; we only used them to identify the cluster of missing instances based on the labeled seed examples. After clustering with the EM algorithm, which had the best performance among the clustering algorithms in this study, we can see the effect of the semi-supervised learning method on our work. The clustering result shows that the class labels of some of the seed examples were misclassified. However, such label changes were not accepted, because those seeds had been intentionally chosen and labeled with their sense class by experts.

Step 3. Feature extraction and selection: The success of machine learning requires instances to be represented using an effective set of features that are correlated with the categories of word senses. For this study, feature selection was performed by preparing an eight-eight window size, because the largest window size in our datasets is eight; instances with missing values were also removed from the feature sets. Feature selection is therefore a data reduction mechanism. Feature extraction produces a feature vector that represents the context words of each instance of a target word; each vector corresponds to one line of a comma-separated WEKA input file with an .arff or .csv extension. In our case, these vectors represent a text window of eight words on each side of the ambiguous word.

Step 4. Design of the classifier: Before classifying our datasets using the selected classification algorithms, we labeled our datasets manually depending on the selected seed words. The manually labeled seed examples are used as cluster labels during the clustering of both labeled and unlabeled documents (datasets). Knowing the cluster label of each instance is important for determining the class of missing instances by taking each cluster as a distinct class. This helps us to label unlabeled instances with their classes.

Table 1: Example of a WSD dataset for semi-supervised learning ('?' represents a missing value)

| LContext3 | LContext2 | LContext1 | Target word | RContext1 | RContext2 | RContext3 | Class |
|-----------|-----------|-----------|-------------|-----------|-----------|-----------|-------|
| ?         | ?         | emeze     | Halefe      | kaeba     | bahere    | horu      | pass  |
| ?         | tanesio   | emeheya   | Halefe      | halafa    | behera    | tirose    | pass  |
| halifo    | kaeba     | tirose    | Halefe      | wosidona  | galila    | maekala   | pass  |
| ?         | tanesio   | emeheya   | Halefe      | haba      | behera    | yehuda    | ?     |
| reeyo     | soba      | maseya    | Halefe      | haba      | bitaneya  | aseretu   | ?     |
| ahadu     | aseretu   | keleetu   | Halefe      | haba      | liqana    | kahenate  | died  |
| maseyo    | halafa    | aseretu   | Halefe      | aseretu   | keleetu   | aredaihu  | died  |
| ?         | ?         | sanita    | Halefe      | hagara    | nayene    | horu      | ?     |
| ?         | ?         | emeze     | Halefe      | soba      | baseha    | gize      | ?     |
| ahadu     | beesi     | kebure    | Halefe      | behera    | rehuqa    | yenesae   | ?     |

Table 1 indicates that we prepared our datasets with 18 attributes, two of which are the target word and the class. The remaining 16 attributes are context words used to determine the meaning of the ambiguous word, namely eight words to the left and eight words to the right of the target word. Attributes exceeding this size are removed. When we use eight words to the left and eight to the right, there may be missing values; those missing values are replaced by question marks (?), because the question mark is compatible with Weka. Reducing the dimensionality of datasets can improve the performance of WSD, because instances with redundancy and missing-value problems are reduced (Solomon Mekonnen, 2010).
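A minimal sketch of how one such feature row can be derived from a tokenized sentence follows (our own illustration; the '?' padding convention matches Table 1, and the function name is ours):

```python
# Minimal sketch of feature extraction: an n-n context window around the
# target word, padded with '?' for missing values as in Table 1.
def extract_window(tokens, target, n=8):
    i = tokens.index(target)                  # position of the ambiguous word
    left = tokens[max(0, i - n):i]
    right = tokens[i + 1:i + 1 + n]
    # Pad so every instance has exactly n left and n right attributes.
    left = ["?"] * (n - len(left)) + left
    right = right + ["?"] * (n - len(right))
    return left + [target] + right            # 2n + 1 attributes (+ class)

# Example with a 3-3 window, matching the first row of Table 1:
# extract_window(["emeze", "Halefe", "kaeba", "bahere", "horu"], "Halefe", n=3)
# -> ['?', '?', 'emeze', 'Halefe', 'kaeba', 'bahere', 'horu']
```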
2.6. Evaluation Techniques

To evaluate the performance of clustering and classification algorithms, several evaluation modes are available in WEKA: using the training set, a supplied test set, percentage split, classes-to-clusters evaluation mode, and cross validation. When we trained and tested our datasets under all of these evaluation modes, classes-to-clusters evaluation and cross validation proved the most effective. Therefore, for this study we used the 'classes to clusters evaluation' mode for the clustering algorithms and cross validation for the classification algorithms. In 'classes to clusters evaluation' mode, WEKA reports the clustering result as an error rate; the accuracy of a clustering algorithm was therefore obtained by subtracting the error rate from one hundred. This accuracy measures how well the algorithm generalizes in clustering.

For this study, 10-fold cross validation is used in our experiments. In this technique, the total dataset is first divided into 10 mutually disjoint folds of approximately equal size using stratified sampling. In stratified sampling, the folds are stratified so that the class distribution of the examples in each fold is approximately the same as in the initial data. We have a total of 2,119 manually tagged sense examples, divided into 10 folds of approximately equal size; as a result, each fold contains about 212 sense examples with a balanced number of senses per fold. After identifying and separating the training and testing sets from the total dataset, we remove the manually tagged sense labels from the test set. In this process, 90% of the data is used for training the system while the remaining 10% is used for testing, and the process is repeated ten times. After each training phase, the system was tested on an average of 212 Ge'ez sentences; each corresponding training set contained an average of 1,907 sentences.

The performance of classification algorithms is usually measured by parameters such as accuracy, recall, precision, and F-measure. These performance parameters are functions of the numbers of correctly and incorrectly classified instances, which are obtained from the confusion matrix of the WEKA output.
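For a word with two senses, these measures follow directly from the 2x2 confusion matrix; the sketch below uses the standard definitions (the example counts are invented, not taken from the study):

```python
# Minimal sketch: precision, recall, F1-score and accuracy for one sense
# from a 2x2 confusion matrix (tp, fp, fn, tn), standard definitions.
def scores(tp, fp, fn, tn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Example with made-up counts (not from the study):
# scores(tp=90, fp=8, fn=9, tn=105)
```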
3. Results and Discussions

This section presents the performance evaluation of the implemented model. To achieve our objectives, the following experiments (scenarios) were conducted on our prepared datasets:

• Comparison of the corpus-based approaches, namely supervised, semi-supervised, and unsupervised, under different modes;
• Investigation of the most effective approach and the most effective algorithm, among the selected algorithms, for improving the performance of the Ge'ez WSD model;
• Experiments with different context window sizes for the disambiguation of ambiguous words.

3.1. Comparison of Corpus-Based Approaches

To compare the results of the unsupervised, supervised, and semi-supervised machine learning approaches, we used the same datasets of the language, and the same classification algorithms for both the semi-supervised and supervised approaches. For the unsupervised approach, as in (Solomon Mekonnen, 2010), we used the selected clustering algorithms, namely Expectation Maximization, Simple K-Means, Farthest First, and Hierarchical Clusterer, to cluster our datasets; we used clustering algorithms for the unsupervised approach because clustering is an unsupervised technique. The comparison of the three machine learning approaches was therefore conducted on the same datasets. In addition, the comparison of the semi-supervised and supervised approaches was conducted on the same datasets using the same classification algorithms. The algorithms are SMO, Naïve Bayes, and Bagging, which are supervised learning methods, while AdaBoost and ADTree are used in the semi-supervised learning approach.

3.1.1. Unsupervised Learning

Unsupervised learning is an independent process in which no supervision is involved during the learning step. Unsupervised corpus-based methods do not rely on external knowledge sources such as machine-readable dictionaries (MRDs), concept hierarchies, or sense-tagged texts. These approaches are mainly clustering approaches, in which words and contexts are clustered; during clustering, each cluster corresponds to a sense of a target word. The goal of clustering is to group elements so as to maximize the similarity between elements in one cluster and minimize the similarity between elements belonging to different clusters.

3.1.2. Supervised Learning

Supervised learning is the use of algorithms that reason from externally supplied instances (a training set) to form classifiers that can differentiate new data. The goal of supervised learning is to build a model of the distribution of class labels in terms of predictor features. Building the model involves training and testing phases. The training phase requires a sense-annotated training corpus, from which syntactic and semantic features are extracted to build a classifier using machine learning techniques; in the testing phase, the classifier tries to find the appropriate sense of the word based on the surrounding words present in the instances.

3.1.3. Semi-Supervised Learning

Semi-supervised techniques involve training information, as in supervised learning, but less information is given at the initial training phase. Semi-supervised or minimally supervised methods are gaining popularity because of their ability to get by with only a small amount of annotated reference data while often outperforming totally unsupervised methods on large datasets. There is a host of diverse methods and approaches that learn important characteristics from auxiliary data and then cluster or annotate data using the acquired information.
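The overall semi-supervised workflow used in this study, namely seed labeling, label propagation through clustering, and final classification, can be summarized as follows (a minimal sketch of the control flow only; `cluster` and `train_classifier` are placeholders for the WEKA algorithms, EM and ADTree respectively, and each cluster is assumed to contain at least one seed):

```python
# Minimal sketch of the semi-supervised (bootstrapping) workflow.
# `cluster` and `train_classifier` stand in for the WEKA algorithms
# used in the study (EM clustering and the ADTree classifier).
def semi_supervised_wsd(seed_pairs, unlabeled, cluster, train_classifier):
    """seed_pairs: list of (instance, sense_label); unlabeled: instances."""
    seed_instances = [inst for inst, _ in seed_pairs]
    # 1. Cluster the labeled seeds together with the unlabeled instances
    #    ("classes to clusters evaluation" mode).
    clusters = cluster(seed_instances + unlabeled)
    # 2. Give every member of a cluster the majority sense of the seeds
    #    that fell into it (assumes each cluster contains >= 1 seed).
    fully_labeled = []
    for members in clusters:
        labels = [lbl for inst, lbl in seed_pairs if inst in members]
        majority = max(set(labels), key=labels.count)
        fully_labeled += [(inst, majority) for inst in members]
    # 3. Train the final classifier on the now fully labeled dataset.
    return train_classifier(fully_labeled)
```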
For comparison purposes, we take the maximum average performance (accuracy) of the three machine learning approaches, using the best-performing algorithm of each, since these record the best accuracy for the unsupervised, supervised, and semi-supervised methods respectively. The result is shown in Figure 2.

Figure 2: Average performance of the three machine learning approaches

From Figure 2, we can observe that the semi-supervised machine learning approach achieves the highest accuracy among the WSD prototype models. By using both labeled and unlabeled datasets, the performance of the WSD prototype model improved compared with the other approaches, because the unlabeled datasets are clustered using the manually labeled datasets during clustering. From this we can see that semi-supervised machine learning methods, using bootstrapping with the ADTree, AdaBoostM1, and SMO algorithms, are more suitable for the development of the Ge'ez WSD prototype model than the supervised and unsupervised machine learning methods. We achieved the best classifier performance for the ambiguous Ge'ez words using the semi-supervised corpus-based approach, because the seed words with high information gain obtained from the ADTree algorithm were used to select the representative seed examples in this study.

3.2. Comparison of Classification Algorithms for the Ge'ez Datasets

To investigate the best-performing classification and clustering algorithms for the Ge'ez WSD prototype model, we applied the three approaches, namely unsupervised, supervised, and semi-supervised, to the six selected ambiguous Ge'ez words. We used the results achieved with the semi-supervised methods, because their performance was preferable to that of the unsupervised and supervised learning methods. To investigate the selected semi-supervised algorithms, we used average accuracy, precision, recall, and F1-score to assess the performance of the three machine learning algorithms. The comparison of the three classification algorithms was based on the performance achieved in classifying the ambiguous Ge'ez words, and it was conducted on the same Ge'ez dataset. The result is shown in Figure 3.

Figure 3: Performance of the classification algorithms

From Figure 3, we observe that the ADTree algorithm achieves the best performance on our datasets: an average Precision, Recall, F1-score, and Accuracy of 92.1%, 91.3%, 91%, and 91.1%, respectively. Its efficiency was also better than that of the AdaBoostM1 and SMO algorithms, which performed comparably to each other on the Ge'ez WSD prototype model.

3.3. Determining the Optimal Context Window Size of the Language

To find the optimal context window size, different studies have been conducted using different WSD approaches for different languages. For Amharic, WSD studies by different researchers explored window sizes from 1-1 up to 10-10 to find the optimal context window size using different approaches (Solomon Mekonnen, 2010). A study on Amharic using a supervised machine learning method for five ambiguous words (mesasat, meTrat, qereSe, Atena, and mesal) recommended a 3-3 window size as effective with the Naïve Bayes algorithm.
Getahun Wassie and Million Meshesha (2014) studied Amharic WSD using a semi-supervised machine learning method and recommended a 2-2 or 3-3 window size, using five classification algorithms on five selected ambiguous words of the language (ATena, derese, tenesa, ale, bela). The 3-3 window size was effective for the bootstrapping and SVM algorithms (ADTree, AdaBoostM1, Bagging, and SMO), while the 2-2 window size was reported to be effective with the Naïve Bayes algorithm. For these reasons, we used semi-supervised algorithms to determine the context window size, because the semi-supervised approach scores the highest accuracy among the approaches. We found that a 4-4 window size is optimal for differentiating the meanings of the selected ambiguous Ge'ez words using the ADTree algorithm in our study.

Figure 4: Average accuracy of each window size using the semi-supervised approach algorithms

From Figure 4, we conclude that the semi-supervised algorithms, namely the bootstrapping algorithms (ADTree and AdaBoostM1) and the SVM algorithm (SMO), perform much better than the other algorithms. ADTree, AdaBoostM1, and SMO all achieved high performance on the given datasets; however, our focus is to determine which window size is suitable for the Ge'ez language. All three algorithms, ADTree, AdaBoostM1, and SMO, scored their highest accuracy at a window size of 4-4. Therefore, the 4-4 window size, with the ADTree algorithm, is the best for our Ge'ez datasets; SMO was the next best performer after ADTree at the 4-4 window size using the semi-supervised learning method. Based on our experiments, we conclude that the 4-4 window size performs best with the ADTree algorithm for the WSD prototype model on our Ge'ez datasets.
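This window-size experiment can be expressed as a simple search over candidate sizes (a minimal sketch reusing `extract_window` from the sketch in Section 2.5.1; `evaluate` is a placeholder for the WEKA 10-fold cross-validation run):

```python
# Minimal sketch of the window-size search: re-extract features with each
# candidate n-n window and keep the size with the best evaluation score.
def best_window_size(tokenized_sentences, target, evaluate, sizes=range(1, 9)):
    results = {}
    for n in sizes:
        dataset = [extract_window(toks, target, n)
                   for toks in tokenized_sentences]
        results[n] = evaluate(dataset)   # e.g. mean 10-fold CV accuracy
    return max(results, key=results.get), results

# In our experiments this search selected n = 4 (a 4-4 window) with ADTree.
```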
4. Conclusion and Recommendations

4.1. Conclusion

Natural language contains many words with more than one meaning, where the intended meaning is determined by context. The automated process of recognizing word senses in context is known as Word Sense Disambiguation (WSD). In this study, three experiments were conducted using different classification and clustering algorithms. The first experiment compared the results obtained with the three machine learning methods, namely unsupervised, semi-supervised, and supervised learning; in this experiment, the semi-supervised approach performed better than the other machine learning methods. Since the semi-supervised learning method was employed in this work, we used the final fully labeled dataset obtained from the unlabeled datasets; those unlabeled datasets were labeled after clustering, using the clustering assumption. For clustering, we used the EM algorithm, because it performed better than the other selected clustering algorithms. The second experiment was conducted to determine the best-performing algorithm for the selected Ge'ez datasets; this was found to be the ADTree algorithm, compared with both the clustering and classification algorithms selected in this study. The last experiment was conducted to investigate the optimal window size for determining the senses of each ambiguous word. From the experimental results, we found that a 4-4 window size can be considered optimal for Ge'ez WSD systems. In general, we conclude that semi-supervised learning is a promising learning method that performed best in our study. There are many potential algorithms that can be applied to Ge'ez WSD systems using semi-supervised corpus-based approaches, and ADTree is the best-performing algorithm for this language among the semi-supervised algorithms we tested.

4.2. Recommendations

Word sense disambiguation research requires a variety of linguistic resources, such as a thesaurus, a WordNet, machine-readable dictionaries, and an effective Ge'ez stemmer. The absence of a standard stop-word list for the language was a significant challenge, as Ge'ez lacks these resources. The lack of sense-annotated data for the language was another challenge, which forced us to limit our study to six ambiguous words of the language and our dataset to 2,119 sentences (instances). We therefore make the following recommendations, covering both resource development and future research directions for Ge'ez WSD:

• This study considers words that have only two senses. In the future, researchers should consider words with more than two senses.
• This study concentrated only on modeling WSD to tackle lexical ambiguity, which is at the word level. Further research is recommended to address other types of ambiguity in the Ge'ez language, such as character and structural (sentence-level) ambiguity.
• In addition to the corpus-based approaches, there are also knowledge-based and hybrid approaches that have been used for WSD in other languages. We recommend that these approaches be investigated for the Ge'ez language as well.

References

Eker, Ö. (2007). Developing Methods for Word Sense Disambiguation. Boğaziçi University.

Getahun Wassie, Ramesh, B. P., Solomon Teferra, & Million Meshesha (2014). A Word Sense Disambiguation Model for Amharic Words Using Semi-Supervised Learning Paradigm. Science, Technology and Arts Research Journal, 3(3): 147-155.

Jurafsky, D. (2000). Speech & Language Processing. Pearson Education India.

Leykun Berhanu (2005). Contemporary Challenges in the Ministry of the Ethiopian Orthodox Church. PhD thesis, Howard University.

Mahmoodvand, M., & Hourali, M. (2017). Semi-supervised approach for Persian word sense disambiguation. In 2017 7th International Conference on Computer and Knowledge Engineering (ICCKE), 104-110.

Mersa Mebrhatu (2018). Unsupervised Machine Learning Approach for Tigrigna Word Sense Disambiguation. Computer Engineering and Intelligent Systems, 9(6): 10-16.

Naseer, A., & Hussain, S. (2009). Supervised Word Sense Disambiguation for Urdu Using Bayesian Classification. Center for Research in Urdu Language Processing, Lahore, Pakistan.

Pal, A. R., Kundu, A., Singh, A., Shekhar, R., & Sinha, K. (2013). A Hybrid Approach to Word Sense Disambiguation Combining Supervised and Unsupervised Learning. 4(4): 89-101.

Seid Yesuf & Yaregal Assabie (2017). Amharic Word Sense Disambiguation Using WordNet. In The 5th International Conference on the Advancement of Science and Technology.

Solomon Mekonnen (2010). Word Sense Disambiguation for Amharic Text: A Machine Learning Approach. Unpublished Master's Thesis, 1-94.

Workneh Tesema, Tesfaye Debela & Kibebew Teferi (2016).
Towards the sense disambiguation of Afan Oromo words using hybrid approach (unsupervised machine learning and rule based). Ethiopian Journal of Education and Sciences, 12(1): 61-77.