LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

193 
 

Query Suggestion on Drugs e-Dictionary Using the 
Levenshtein Distance Algorithm 

    
Halimah Tus Sadiah

a1
, Muhamad Saad Nurul Ishlah

a2
, Nisa Najwa Rokhmah

b3 

 
a
Manajemen Informatika, Universitas Pakuan 

Jl.Pakuan, Bogor 16143 
1
sadiahht@unpak.ac.id (Corresponding author) 

2
nurul.ishlah@unpak.a.c.id   

 
b
Farmasi, Universitas Pakuan 

Jl.Pakuan, Bogor 16143 
3
nisanajwarokhmah@gmail.com    

 
Abstract 
 

The dictionary of medicine in the form of a thick book has many disadvantages, one of which is 
impractical. This is the reason for Indonesian developers to create a drugs e-Dictionary. But the 
drug e-Dictionary that has been developed is still in the form of a letter index so that users must 
search the terms one by one in sequential order. This has become so inefficient and ineffective 
that it is necessary to add a search function and query suggestion feature to the drug e-
dictionary. The purpose of this study is to build a query suggestion facility on drugs e-Dictionary 
using the Levenshtein Distance algorithm. The stages of this research consist of the 
Development of web-based drugs e-Dictionary, Implementation of the Levenshtein Distance 
Algorithm, Query Suggestion Testing, and Usage. The query suggestion function works by 
producing the closest word output contained in the database. Based on the results of the 
implementation of the Levenshtein Distance algorithm and test results, Drugs e-Dictionary can 
evaluate words that are not in the database. It reaches 90% accuracy of the inputted query, with 
90% precision and 90% recall in the confusion matrix.  

  
Keywords: Query Suggestion, Drugs e-Dictionary, Algorithm, Levenshtein Distance Algorithm 
  
 
1. Introduction 

The decision to use a drug (medication drug) always raise concern on the benefits and risks so 
that a Pharmacist needs a drug dictionary to search for previously unknown terms of medicine 
[1]. Besides, the drug dictionary becomes one of the learning tools that are used by 
Pharmacists, Students, and the Indonesian community in learning medicine or foreign terms 
about medicine. The drug dictionary that is used nowadays is in the form of a thick physical 
dictionary book. It turns out to have drawbacks, such as it is too heavy to be carried so that it is 
not practically handy. This is one of many reasons for Indonesian developers to compete in 
creating an electronic dictionary of drugs or what we know as the term drug e-dictionary. 

Most of the available drugs e-dictionaries that have been developed so far are still in the form of 
a letter-index based dictionary. It makes users have to search for words or terms one by one in 
a sequential fashion. This has become so inefficient and ineffective that it is necessary to add a 
search function to the drug e-dictionary. The search function on drugs e-dictionary is very 
important because it can be used as a shortcut when searching words or terms needed so that 
users can search for words effectively and efficiently [2][3].  

The drug e-dictionary search function needs to be optimized with the addition of the Query 
Suggestion facility. Query Suggestion is some interface between a user and a search engine 
[4]. This facility is an effective and efficient approach to help the user in the process of finding 
information by providing a suggestion for the user when mistyping is happened in the search 
form [2][3][5][6]. This feature is very important to be applied since it can improve the usability 
factor of searching [7][8]. It works by looking for the similarity between a correct query and a 

mailto:1sadiahht@unpak.ac.id
mailto:2nurul.ishlah@unpak.a.c.id


LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

194 
 

false query in a database [9]. This feature can be a solution for preventing the user from typing 
the wrong name of the drug. The Query Suggestion can be used in a search application by 
implementing the Levenshtein Distance Algorithm.  

Research on query suggestion has been done by Jiang et al. (2008), namely Query suggestion 
by query search: a new approach to user support in web search [3]. Meanwhile, research on the 
Levenshtein Distance algorithm was conducted by Ngafidin and Wibawanto (2015), namely the 
Implementation of the Autocomplete Feature and the Levenshtein Distance Algorithm to 
Increase the Effectiveness of Word Search in the Indonesian Big Dictionary (KBBI) [10]. This 
study aims to build a query suggestion facility using the Levenshtein Distance algorithm on 
drugs e-dictionary. This research is critical to do so that pharmacists, students, and the public 
can easily search for drug terms in the drug e-dictionary. 

 
2. Research Methods 

The research method used in this study consists of several stages, as shown in Figure 1, which 
is described below: 

1.  Development of Web-based e-Dictionary Drugs 
Development of Web-based e-Dictionary Drugs using the SDLC (System Development 
Life Cycle) method that has been adapted to the needs of Web-based Drugs e 
Dictionary [11][12]. The stages are Plan, Analysis, Design, Code, Testing. 

2.  Implementation of the Levenshtein Distance Algorithm 
The implementation is done by adding the Levenshtein Distance algorithm in the PHP 
programming language. 

3.  Testing Query Suggestion 
 Testing is done by inputting drug terms in the search form as many as 100 terms. The 

number of terms entered consists of 50 correct terms, 50 incorrect terms, or incorrect 
terms. 

4.  Usage 
Drugs e-Dictionary that has been tested is then hosted to be used by users. 

   
Figure 1. Research Method 

Start 

Plan Analysi
s 

Design Code Testing 

Implementation of the Levenshtein Distance 
Algorithm 

Testing Query Suggestion 

Usage 

Error   

N
o 

Ye
s  

end 


LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

195 
 

3. Result and Discussion 

3.1.  Web-based Drugs e-Dictionary Development 
3.1.1. Plan 

In the planning stage, data collection is carried out. Data was collected from the ISO Indonesian 
Information Specialist book [13]. The collected data consist of drug categories, drug names, 
indications, contradictions, side effects, drug interactions, dosages, packaging, and drugs 
warning. 

3.1.2. Analysis 

In this stage, system functionality requirements and non-system requirements are collected. 
There are 28 system functionality requirements, namely 10 front end system functionality 
requirements and 18 back end system functionality requirements. The non-functional 
requirements only produced 7 system non-functional requirements. 
 
3.1.3. Design 
Next, in this design stage, a search system flow will be developed. It is depicted in Figure 2. 
 

Figure 2. Developed Search System Flow 


LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

196 
 

Based on Figure 2, the flow of drug searching is described below: 
1. User accesses drugs e-dictionary website 
2. User searches for the name or drug term in the search form  

 If the user’s query is empty, then the system will show empty query notification or 
“query has not been inserted”.  

 If the inputted query is in the database, then the system will show search results. 

 If the inputted query is not available in the database, then the system will proceed 
with Levenshtein Distance Algorithm followed up with the query suggestion 
 

3.1.4. Code 

In this implementation stage, the system is developed in PHP language with MYSQLi for the 
database connection. The result is a web drug e-dictionary. Drugs e-dictionary consists of the 
main searching page, which searches based on drugs term as depicted in figure 3; searching 
based on disease indication, as shown in Figure 4; and A-Z index-based searching, as depicted 
in Figure 5. 

 
Figure 3. Drugs e-Dictionary website 
 

Figure 4. Homepage user interface based disease indication 
 

LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

197 
 

Figure 5. A-Z index-based searching page 
 
3.1.5. Testing 

A black box is used in the testing stage. It is usually called a system functional test [11]. Based 
on the test, 28 functions from the system is running as expected. 

3.2.  Levenshtein Distance Algorithm Implementation 

The Levenshtein Distance algorithm is an algorithm created by Vladimir Levenshtein in 1965 
[14]. This algorithm looks for the distance between the words entered by the user and the words 
stored in the system database by the method of calculating the number of differences between 
the two strings in the form of a matrix [15][16]. It works by calculating the distance between the 
two strings and then look for the minimum number of change operations to change from string A 
to string B. The calculation is represented using the Levenshtein distance calculation table, 
where the last value in the lower right corner is the final value of the second distance string. In 
the Levenshtein distance algorithm  there are three operations performed, namely the operation 
of changing characters, adding characters and deleting characters [17[18] [19]. Figure 6 is a 
pseudocode Levenshtein distance algorithm. 

 
Figure 6. Pseudocode Levenshtein Distance Algorithm 

 
LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

198 
 

The pseudocode of the algorithm, as depicted in figure 6, can be computed manually, as shown 
in figure 7 and figure 8. 

 
Let “Paraci” be inputted characters, and word in the database is Paraco. 
 
m = Inputted by user = Paraci 
n = Word in the database = Paraco 

d[0,0] = 0  

Initialize first row and first column with 0,1,2,. . . m   0,1,2,...n 

 
Figure 7. The First row and column initialization 
 

 For each character, compare each character from inputted word with an actual word in 
the database. If it is a match, then the cost is 0. Otherwise the cost will be 1 

 Check the minimum, 

d[i,j] 
Top = d[i,j]+1 
Side = d[i,j]+1 
Diagonal  = d[i,j]+ cost 

 Compare character P with P, put cost = 1 if differ, otherwise cost =0 

Check all values in d [i,j] 
Top = 1  

Minimum 
diagonal 

Diagonal = 0 
Side =  

 d [i,j] = d [i,j] + cost = 0 + 0 = 0 so on, so forth 
 

Figure 8. Manual computation process of The Levenshtein Distance algorithm 
 

In Figure 8, the distance generated is a value that is in the lower-right corner of the matrix, 
which is 1. The value of one means there is 1 operation performed. The value of one is 
generated from the operation of the sum of the cost values with a minimum diagonal value. The 
distance value obtained from the diagonal side means that the operation that works is a 


LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

199 
 

substitution. So for the PARACI String to be converted into a PARACO string, one operation is 
needed, namely the substitution of the first character ("I") to the character "O" so that the value 
of the Levenshtein Distance is equal to 1. The result of the Levenshtein Distance algorithm 
implementation on drugs e-dictionary, as depicted in figure 9. 

 
Figure 9. Result of Levenshtein Distance algorithm implementation on drugs e-dictionary 

 
3.2.1. Testing the Drugs e-Dictionary with Query Suggestion Added Facility 

Tests carried out in the form of validation testing by inputting 100 test queries into the search 
form. Table 1 summarizes the results of the validation test on the drug e-dictionary. Figure 9 
shows an example of query suggestion testing. 

 
Table 1. Example of Words in query suggestion validation testing 

No 
Inputted 
Query 
(drug 
name) 

Query 
Suggestion 

Output 

 
Category of 
Levenshtein 

Distance 
Algorithm 
Operation 

Notes Validation 

1 Paracetamol -  Paraceta
mol 

- The 
inputted 
query is 
correct 

Valid 

2 Kamols “Apakah 
maksud 
anda 
kamolas, ?” 

Kamolas Add letter a Incorrect 
query 
inputted, 
lacking 
letters 

Valid 

3 Zephanall “Apakah 
maksud 
anda 
Zephanal?” 

Zephanal delete letter l Incorrect 
query 
entered, 
Excess 

Valid 


LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

200 
 

No 
Inputted 
Query 
(drug 
name) 

Query 
Suggestion 

Output 

 
Category of 
Levenshtein 

Distance 
Algorithm 
Operation 

Notes Validation 

letters 
4 Paraci “Apakah 

maksud 
anda 
paraco?” 

Paraco Substitute i 
with o 

Incorrect 
query 
entered  
 

Valid 

5 Diparin “Apakah 
maksud 
anda 
Dapyrin ?” 

Dapyrin Query 
suggestion 
by closest 
word 

The 
inputted 
query 
does not 
exist in 
the 
database 

Valid 

 
In Table 1, validation tests are categorized into an insert, delete, and substitution operations. 
Whenever an inputted query is not in the database, the system will show notification of “The 
inputted query does not exist in the database”, which then will show Query suggestion by 
generating some terms that are closer in the database. Let’s take “diparin” as an inputted query 
(unknown term in the database). The system will show “dapyrin” as the suggestion (Table 1).  

The developed system uses a non-case sensitive query checking. Hence it will not affect the 
output, whether the inputted query is in an uppercase or lowercase. In addition, if the inputted 
query is a meaningless word, such as “ZZZZ”, then the system will show a word with initial letter 
Z that has the fewest number of words in the database, in this case “Zalona”. The system will 
search for any terms with minimal Levenshtein Distance algorithm operation.  

The evaluation of accuracy represented in a confusion matrix (Table 2), which has four 
classification process results, namely: True Positive (TP), True Negative (TN), False 
Positive (FP) dan False Negative (FN) [20].  

 
Table 2. Confusion Matrix 

Total 
Population 

Predicted: 
yes 

Predicted: 
no 

Actual: yes TP FN 

Actual: no FP TN 

 
Based on TN, FP, FN and TP accuracy are obtained (equation 1), precision (equation 2) and 
recall (equation 3). Based on equation 1, equation 2 and equation 3, we have a result of query 
accuracy for drug query of 90%, precsion=90%, recall = 90%. The confusion matrix for the 
evaluation result of drug terms is in Table 3. 
 

LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

201 
 

Table 3. Confusion Matrix for Drug Terms Evaluation 

100 Predicted: 
yes 

Predicted: 
no 

Actual: 
yes  

45 5 

Actual: 
no 

5 45 

 
4. Conclusion 

The drug e-dictionary search function needs to be optimized with the addition of the Query 
Suggestion facility. The Query Suggestion facility was developed using the Levenshtein 
Distance algorithm. Based on the results of the implementation, the Levenshtein Distance 
algorithm runs from the top left corner of a two-dimensional array that has been filled with 
several initial string characters and target strings and is given a cost value. The cost value at the 
lower right-hand end is the Distance edit value that represents the number of operations that the 
algorithm has to process. Based on the test results, the system can evaluate words that are not 
in the database with the query suggestion function closest to the database. It reaches 90% 
accuracy of the inputted query, with 90% precision and 90% recall in the confusion matrix. The 
future work is the implementation of n-gram on drugs e-dictionary and performing a comparative 
analysis of Levenshtein distance algorithm with n-gram. 

 
References  
 
[1]  Departemen Kesehatan RI. Tanggung Jawab Apoteker Terhadap Keselamatan Pasien 

(Patient Safety ). Jakarta: Direktorat Bina Farmasi Komunitas Dan Klinik Ditjen Bina 
Kefarmasian Dan Alat Kesehatan Departemen Kesehatan RI. 2008. 

[2]  Y. Song, & Li-wei He. 2010. Optimal Rare Query Suggestion With Implicit User. ACM 
Journals.pp: 901-910. 

[3]  S. Jiang, S. Zilles, & R. Holte. 2008. Query suggestion by query search: a new approach 
to user support in web search [Online]. [Cited 2018 August 1]. Available from 
www.cs.uregina.ca/~zilles/jiangZH09.pdf 

[4]  Y. Song, D. Zhou., & L.W. He. 2011. Post-ranking-query-suggestion-by-diversifying-
searchresul [Online]. [Cited 2018 August 1]. Available from https://www.microsoft.com/id-
id/: https://www.microsoft.com/en-us/research/publication/post-ranking-query-
suggestionbydiversifying-search-results/ 

[5]  J.-M.Yangy, R. Cai, F. Jingz, S.Wangy, L. Zhangy, & W.Y.Ma. 2008. Search-based Query 
Suggestion.[Online] [Cited 2018 August 1]. Available from 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.159.3499&rep=rep1&type=pdf 

[6] Q. Mei, D. Zhou & K Church 2008. Query Suggestion Using Hitting Time [Online]. [Cited 
2018 August 1]. Available from https://www.microsoft.com/enus/ 
research/wpcontent/uploads/2017/01/sugg.pdf 

[7] H. Cao, D. Jiang,J. Pei, Q. He, Z. Liao, E. Chen, & H. Li. 2008. Context-Aware Query 
Suggestion by Mining Click-Through [Online]. [Cited 2018 August 1]. Available from 
https://www.cs.sfu.ca/~jpei/publications/QuerySuggestion-KDD08.pdf 

[8]  Z.-J. Zha, L. Yang, T. Me., M. Wang, & Zengfu. Visual Query Suggestion. ACM Journals. 
pp. 15-24. 2009. 

[9]  S. Bathia, D. Majumdar, & P. Mitra. Query Suggestions in the Absence of Query Logs. 
ACM Journals, pp. 1-10.2011. 

[10]  K.N. Ngafidin & H. Wibawanto. Implementasi Fitur Autocomplete dan Algoritma 
Levenshtein Distance untuk Meningkatkan Efektivitas Pencarian Kata di Kamus Besar 
Bahasa Indonesia (KBBI). Jurnal Teknik Elektro. Vol. 7, No. 1, pp.1-6. 2015 

[11]  R Pressman dan B.R. Maxim. Software Engineering a Practitioners approach. McGraw-
Hill Education : New York. 2014. 

[12]  J Satzinger, R. Jackson, & S. Burd. System Analysis and Design in a changing World. 
USA: Course Technology Cengage Learning. 2010. 

http://www.cs.uregina.ca/~zilles/jiangZH09.pdf
https://www.microsoft.com/id-id/
https://www.microsoft.com/id-id/
https://www.microsoft.com/en-us/research/publication/post-ranking-query-suggestionbydiversifying-
https://www.microsoft.com/en-us/research/publication/post-ranking-query-suggestionbydiversifying-
https://www.microsoft.com/enus/
https://www.cs.sfu.ca/~jpei/publications/QuerySuggestion-KDD08.pdf


LONTAR KOMPUTER VOL. 10, NO. 3 DECEMBER 2019                p-ISSN 2088-1541   
DOI : 10.24843/LKJITI.2019.v10.i03.p07  e-ISSN 2541-5832 
Accredited B by RISTEKDIKTI Decree No. 51/E/KPT/2017 
 

202 
 

[13]  Ikatan Apoteker Indonesia. ISO Informasi Spesialite Obat Indonesia. Vol 52. 2019. 
Jakarta : Isfi Penerbitan.2019 

[14] Z.Afriansyah, D.Puspitaningrum, & Ernawati. Rancang Bangun Aplikasi Pencocokan DNA 
Manusia Menggunakan Algoritma Levenshtein Distance (Studi Kasus: Dna Kanker Hati 
Manusia). Jurnal Rekursif . Vol.  3, No. 2,pp. 61-67.2015. 

[15]  B. Pratama & S. Pamungkas, Analisis Kinerja Algoritma Levenshtein Distance Dalam 
Mendeteksi Kemiripan Dokumen Teks. Jurnal Log!k@ . Vol. 6, No. 2, pp. 131-143.2016 

[16]  T. Aprilianto, & A. Badawi. Sistem Koreksi Kata Dan Pengenalan Struktur Kalimat 
Berbahasa Indonesia Dengan Pendekatan Kamus Berbasis Levenshtein Distance. Jurnal 
SPIRIT. Vol.  9, No. 1, pp 48-61. 2017. 

[17]  R. Haldar, & D. Mukhopadhyay. 2011. Levenshtein Distance Technique in Dictionary 
Lookup Methods: An Improved Approach [Online]. [Cited 2018 August 1]. Available from  

[18]  R. Mishra, & N. Kaur. A Survey of Spelling Error Detection and Correction Techniques. 
International Journal of Computer Trends and Technology.  Vol. 3, No. 4, pp. 372-374. 
2013 

[19]  N. Ariyani, N., R. Sutardi, & Ramadhan. Aplikasi Pendeteksi Kemiripan Isi Teks Dokumen 
Menggunakan Metode Levenshtein Distance. semanTIK.Vol.  2, No. 1,pp. 279-286. 2016. 

[20]  M. Navin, Pankaja R. Performance Analysis of Text Classification Algorithms using 
Confusion Matrix. International Journal of Engineering and Technical Research (IJETR). 
Vol. 6, No. 2,pp. 75-78. 2016