AREL – Augmented Reality–based enriched learning experience


ACTA IMEKO 
ISSN: 2221-870X 
September 2022, Volume 11, Number 3, 1 - 5 

 
ACTA IMEKO | www.imeko.org September 2022 | Volume 11 | Number 3 | 1 

A V Geetha1, T Mala2 

1 Research Scholar, Department of Information Science and Technology, College of Engineering, Anna University, India  
2 Associate Professor, Department of Information Science and Technology, College of Engineering, Anna University, India  

 
Section: RESEARCH PAPER  

Keywords: Augmented reality; learning technologies; education; Vuforia 

Citation: A V Geetha, T Mala, AREL – Augmented Reality–based enriched learning experience, Acta IMEKO, vol. 11, no. 3, article 12, September 2022, 
identifier: IMEKO-ACTA-11 (2022)-03-12 

Section Editor: Zafar Taqvi, USA 

Received March 30, 2022; In final form July 30, 2022; Published September 2022 

Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, 
distribution, and reproduction in any medium, provided the original author and source are credited. 

Corresponding author: A V Geetha, e-mail: geethu15@gmail.com  

 
ABSTRACT
In the present era, teaching occurs either on a chalkboard or through a PowerPoint presentation projected on a wall. Traditional teaching
methods such as blackboards and PowerPoint presentations are being phased out in favor of enriched learning experiences provided
by emerging edtech. With the closure of schools due to COVID-19, the demand for online educational platforms has also increased.
Furthermore, some of the recent trends in edtech include personalized learning, gamification, and immersive learning with eXtended
Reality (XR) technologies. Due to its immersive experience, XR is a pioneering technology in education, with multiple benefits including
greater motivation, a positive attitude toward learning, and concrete learning of abstract concepts. Existing Augmented Reality
(AR) based education applications often rely on unimodal input, such as a marker-based trigger, to launch the educational content. Hence,
this work proposes a multi-modal interface that delivers content through both marker recognition and speech recognition.
Additionally, the proposed work is designed as a mobile-based AR platform with regional language support to increase the
accessibility of the AR content. Thus, the proposed mobile AR based enriched learning (AREL) platform provides a multi-modal,
mobile-based educational AR platform for primary students. Based on the feedback received after usage, it is observed that
AREL improves the learning experience of the students.

1. INTRODUCTION

Within a few months, the pandemic of coronavirus disease 2019
(COVID-19), caused by the novel virus SARS-CoV-2, has forced
enormous changes in the way businesses and other sectors
operate. According to the World Economic Forum, 1.2 billion
children in 186 countries were affected by school closures, as of
March 2021 [1]. Moreover, the new wave of cases in several
regions of the world impacts the return towards normalcy. Herd
immunity and vaccines provide only temporary relief to regions
affected by the new virus strains. Thus, online learning has
evolved into a viable alternative to traditional classroom-based
learning, with instruction delivered remotely using digital
platforms.

Even before the COVID era, there was a steady increase in the
growth rate and adoption of technology in education. According
to GlobeNewswire, it is estimated that the online education
market will reach $350 billion by 2025. In addition, in
response to COVID-19, several online education platforms,
such as DingTalk, have scaled their cloud services by more than
100,000 servers [2]. Augmented and Virtual Reality (AR/VR), an
emerging technology trend, can improve the online learning
experience by increasing engagement and retention.

VR headsets such as Google Cardboard (GC) have made the
technology accessible to most of the world's population. VR and
AR based online learning platforms offer experiential learning,
where the students learn through experience rather than through
traditional methods such as rote learning. Some of the benefits
of experiential learning are accelerated learning, engagement,
and easier understanding of complex concepts.

Traditional educational methods are progressively becoming
digital due to technological advancements. Mobile Augmented
Reality (AR) is the superimposition of virtual objects over
reality. AR is widely used in many fields such as manufacturing,
robotics, behavioral treatment, aircraft engineering design, and so
on [3]. Extended Reality (XR) refers to the spectrum of
real-virtual environments. XR has gradually
evolved in the realm of education, revolutionizing pedagogical
practices as well as students' educational experiences by
facilitating the understanding of complicated aspects in
education through the visual depiction of images based on
real-world data [4]. Specifically, AR and VR have been extensively used
in many educational applications. For example, in [5], a solar
system mobile app is designed to test the knowledge retention of
college students. In [6], a collaborative augmented reality system
is used to deliver geometrical concepts in mathematics; such
abstract concepts can be easily illustrated using an AR platform.
As a result, AR aids in the comprehension of challenging concepts
of the learning module.

In [7], the work presents a gesture-based AR system to aid in
understanding the anatomical structures of the human body. The
work in [8] focuses on comic-based education through
markerless AR for improving the metacognitive abilities of the
students. Thus, there are a variety of approaches for designing
educational apps based on AR and VR, such as head mounted
displays, markerless systems, and gesture-based systems. In
addition, mobile AR systems are effective because they allow for
portability and easy access.

Therefore, the proposed AREL system is a mobile AR-based
learning platform in which students can scan the contents of
their books to discover videos that appear magically over the
pages, transforming a plain textbook into a book with dynamic
information. Furthermore, AREL delivers the contents in the
regional language of the students to improve their engagement
with the app. AREL is made up of a collection of modules, such
as a speech recognition system and an image tracking and
registration module, that take advantage of mobile sensors and
computational power. The application is developed using the
Unity Engine and the Vuforia SDK.

The mobile application interfaces with the Vuforia cloud
target recognition system via a client-server architecture. To grab
students' attention and enhance their learning experience, the
content in the book is enriched with augmented graphics,
animations, and other edutainment features.

The mobile camera is linked to the scene's virtual camera as
soon as the program is launched. Once the target image is
recognized with the help of the Vuforia image database, using
pattern recognition algorithms, the corresponding output view is
rendered on the display unit. As a result, the output view
consists of virtual objects overlaid on the real-world objects. The
triggered audio output explains the concepts that are scanned,
which in turn improves the learning experience.

The learning module also consists of multiple-choice
questions on the contents taught, to assess the understanding of
the learned material. Based on the feedback received after
usage, it is observed that AREL improves the learning
experience.

2. METHODOLOGY 

AREL is designed as a multi-modal AR interface that delivers
mobile-based learning to students. Based on a survey of the
literature on the design of AR-based educational platforms, it is
observed that AR-based learning platforms improve engagement
and the comprehension of difficult concepts. AR also provides
experiential learning rather than traditional methods
such as rote learning or instructor-led learning. AREL is designed
as a mobile-based AR system to increase the accessibility of
AR-based learning for students. Thus, AREL complements and
improves the online learning solutions or protocols developed
during the COVID era.

2.1. Development Model 

AREL is developed using the Rapid Application
Development (RAD) model, as it accelerates system
development. The RAD model is appropriate when the product
development time is short and the project requires high
component reusability and modularity. Figure 1 illustrates the
RAD model followed in this work. The RAD model involves
four development phases, which are briefly explained as
follows:

• Requirement Gathering: In this phase, the objectives and
requirements for the product are gathered based on the
technical review. Thus, this phase aids in the understanding
of the project goals and expectations.

• User Description: To design the component, the developer
collects the description of the component design from its
user. Based on the design from the user, a prototype of the
application is developed, which is further reviewed, and the
design is updated.

• Implementation: In this phase, the developer implements
the requirements and performs testing on the product. For a
typical AR application, this phase involves UI
design, creation of 3D objects, coding, and testing of the
product.

• Evaluation: Once the product is completely developed, it is
tested to determine whether the user expectations are met. Once the
product is successfully evaluated, the project reaches the
users.

2.2. Software Used 

• Unity: Unity is a game development engine which is used to
create games for 3D environments such as VR and AR.
Unity supports scripting through C# [9]. Applications
developed using Unity can be exported to
platforms such as iOS, Android, or desktop platforms like
Windows. Unity provides a comprehensive framework for
adding interactive animations, audio, and physics-based
logical simulations for natural and close to real interactions.
Therefore, AREL is designed using the Unity engine.

• Vuforia Software Development Kit (SDK): Vuforia is an
SDK supported by Unity that enables the creation of AR
applications for mobile devices [10]. It uses computer vision-based
technologies to track image targets, object targets, or area
targets for marker-based AR applications. Upon recognition,
the virtual object is placed relative to the
marker and the virtual camera position. Vuforia is integrated
into the Unity engine for developing the AR concepts of
AREL.

• Google Speech-to-Text API: The Google Speech-to-Text API
[11] enables the integration of speech recognition into
a variety of applications. Upon receiving voice audio, the
service returns a transcript of the audio.
It uses sophisticated deep learning models,
ranging from Long Short-Term Memory (LSTM) recurrent neural
networks to sophisticated speech recognition algorithms, to
perform accurate recognition.

Figure 1. RAD Development Model.

3. AUGMENTED REALITY BASED ENRICHED LEARNING 

The objectives of the AREL system are as follows: 

• To design a mobile-based AR system which increases the
accessibility of AR-based learning for students.

• To design an AR platform that can act as a teaching aid for
the students.

• To complement and improve the online experience through
AR.

• To provide a multi-modal interface and regional language
support.

The objectives are achieved in AREL through its multi-modal
content delivery. AREL consists of two
modes of content delivery: a) image target-based
content delivery and b) speech-to-text based content delivery. This is
illustrated in Figure 2, where the application receives input from
the camera and microphone to deliver the content via AR.
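
As a rough illustration of the two delivery modes, the following Python sketch routes either a recognised image-target name or a spoken command to the same content entry. The names `ContentItem`, `CONTENT`, `IMAGE_TARGETS`, `SPEECH_COMMANDS`, and `resolve`, as well as the sample entries, are hypothetical and are not taken from the AREL implementation, which is built in Unity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentItem:
    """One unit of AR content (hypothetical structure)."""
    title: str
    media: str  # e.g. path of a video or 3D model to overlay


# Hypothetical lookup tables: both modalities point at the same content.
CONTENT = {
    "number_one": ContentItem("Number 1", "media/one.mp4"),
    "number_two": ContentItem("Number 2", "media/two.mp4"),
}
IMAGE_TARGETS = {"page_03_one": "number_one", "page_04_two": "number_two"}
SPEECH_COMMANDS = {"one": "number_one", "two": "number_two"}


def resolve(modality: str, value: str) -> Optional[ContentItem]:
    """Map a camera or microphone event to content, or None if unknown."""
    if modality == "image":
        key = IMAGE_TARGETS.get(value)
    elif modality == "speech":
        key = SPEECH_COMMANDS.get(value.strip().lower())
    else:
        raise ValueError(f"unsupported modality: {modality}")
    return CONTENT.get(key) if key is not None else None
```

Keeping a single content table behind both lookups means that a new lesson only has to be registered once to be reachable from either modality.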

3.1. Image target-based content delivery 

The AR interface of the system is developed using the Vuforia
SDK and uses its image target database system for processing the
image targets. Image targets are created from the children's
textbook and then processed in the target database system of
Vuforia. The database is then integrated with the application
through the Unity engine.

Upon scanning these image targets, the virtual camera
performs image recognition based on the features available in
the target database. Once a matching image target is found,
the relevant content is displayed as AR content.

Figure 3 illustrates the image-target based content delivery.
The output view consists of video content or 3D objects (with
audio description) overlaid on the real-world objects.
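
The placement of a virtual object relative to a recognised marker can be sketched as a composition of rigid transforms. The example below is a generic illustration in plain Python and does not use Vuforia's API; the function names, the matrix names, and the numeric offsets are assumptions chosen for the example.

```python
from typing import List

Matrix = List[List[float]]


def translation(x: float, y: float, z: float) -> Matrix:
    """Build a 4x4 homogeneous matrix for a pure translation."""
    return [
        [1.0, 0.0, 0.0, x],
        [0.0, 1.0, 0.0, y],
        [0.0, 0.0, 1.0, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def object_in_camera(camera_from_marker: Matrix, marker_from_object: Matrix) -> Matrix:
    """Pose of the virtual object expressed in the camera frame."""
    return matmul(camera_from_marker, marker_from_object)


# Assumed values: the tracker reports the marker 0.5 m in front of the camera,
# and the content is authored to sit 0.1 m above the marker.
camera_from_marker = translation(0.0, 0.0, 0.5)
marker_from_object = translation(0.0, 0.1, 0.0)
pose = object_in_camera(camera_from_marker, marker_from_object)
position = [row[3] for row in pose[:3]]  # x, y, z of the object in the camera frame
```

In a real tracker the marker pose also contains a rotation, which the same matrix product handles without changes to the composition step.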

3.2. Speech-to-text based content delivery 

AREL also supports speech-to-text based content delivery.
The audio samples received from the microphone are
pre-processed using noise cancellation, and the speech input is sent
to the speech-to-text service. Speech processing involves several
steps, including analysis, feature extraction, modelling, and
testing. The feature extraction process extracts unique features
of the audio using the Mel Frequency Cepstral Coefficients (MFCC)
technique.
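
The source does not state which MFCC parameters are used, and in AREL the recognition itself is performed by the external speech-to-text service. As background only, the sketch below shows the mel-scale mapping on which MFCC filter banks are commonly based, using the widely used formula m = 2595 log10(1 + f/700). The function names, the number of filters, and the frequency range are assumptions.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale (common 2595/700 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centres(low_hz: float, high_hz: float, n_filters: int) -> List[float]:
    """Centre frequencies (Hz) of n_filters spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(n_filters)]


# Assumed settings: 10 filters between 0 Hz and 8 kHz (16 kHz sampling rate).
centres = mel_filter_centres(0.0, 8000.0, 10)
```

Because the filters are evenly spaced in mel rather than in Hz, their centres are packed more densely at low frequencies, where most speech energy lies.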

Upon recognizing the sample, the speech input is converted
to text. If the text matches any of the speech commands, then the
appropriate AR content is displayed. The detailed steps for the
content delivery are as follows:

1) Record a short audio clip from the user's microphone.

2) Convert the audio into WAV format.

3) Upload the file to the Google server.

4) Once the uploaded file is processed, receive the output as a JSON file.

5) Process the JSON file to obtain the text command; the corresponding number is displayed.
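
The steps above can be sketched in Python as follows. The sketch wraps PCM samples in a WAV container with the standard `wave` module, builds a JSON body in the shape used by the `speech:recognize` method of the Google Cloud Speech-to-Text v1 REST API, and extracts the transcript from a response of that shape. The network call itself is omitted. The sample rate, the language code `ta-IN`, the helper names, and the sample response are assumptions, since the source does not describe these details.

```python
import base64
import io
import json
import struct
import wave
from typing import List, Optional

SAMPLE_RATE_HZ = 16000  # assumed recording rate


def pcm_to_wav_bytes(samples: List[int], sample_rate: int = SAMPLE_RATE_HZ) -> bytes:
    """Step 2: wrap 16-bit mono PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


def build_recognize_request(wav_bytes: bytes, language_code: str = "ta-IN") -> str:
    """Step 3: JSON body to upload to the recognition service."""
    body = {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": SAMPLE_RATE_HZ,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(wav_bytes).decode("ascii")},
    }
    return json.dumps(body)


def transcript_from_response(response_json: str) -> Optional[str]:
    """Steps 4-5: pull the top transcript out of the JSON response."""
    results = json.loads(response_json).get("results", [])
    if not results or not results[0].get("alternatives"):
        return None
    return results[0]["alternatives"][0]["transcript"].strip().lower()


# Illustrative response in the documented shape (not real service output).
sample_response = '{"results": [{"alternatives": [{"transcript": "Two", "confidence": 0.9}]}]}'
```

Normalising the transcript to lower case before matching keeps the later command lookup independent of how the service capitalises its output.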

Figure 4 depicts the working of the speech-to-text based
content delivery. Algorithm 1 depicts the process involved in the
speech-to-text based content delivery.

 
Algorithm 1: Speech-to-text based content delivery 

Input: Microphone Audio (A) 
Output: AR Content based on speech 
 
1: C := command_words
2: W := convert_audio_to_wav(A)
3: text_from_speech := speech_to_text_API(W)
4: If text_from_speech ∩ C ≠ ∅ then
5:           command := text_from_speech ∩ C
6:           load_content(command)
7: End If
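
A minimal Python sketch of the matching step in Algorithm 1 is given below. It assumes the transcript has already been obtained from the speech-to-text service, treats the transcript as a sequence of words, and loads content for the first word found in the command set. The command words, the function names, and the first-match rule are assumptions made for illustration.

```python
import re
from typing import Callable, Optional, Set

COMMAND_WORDS: Set[str] = {"one", "two", "three"}  # hypothetical command set C


def match_command(text_from_speech: str, commands: Set[str]) -> Optional[str]:
    """Return the first word of the transcript that is a known command."""
    for word in re.findall(r"\w+", text_from_speech.lower()):
        if word in commands:
            return word
    return None


def deliver_content(
    transcript: str,
    load_content: Callable[[str], None],
    commands: Set[str] = COMMAND_WORDS,
) -> bool:
    """Load AR content when the transcript contains a command word."""
    command = match_command(transcript, commands)
    if command is None:
        return False
    load_content(command)
    return True


# Usage: a list's append method stands in for the AR content loader.
loaded = []
deliver_content("Show me number Two please", loaded.append)
```

Matching on whole words rather than substrings avoids triggering a command such as "one" from an unrelated word such as "phone".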
 

4. RESULTS AND DISCUSSION 

AREL provides AR-based, multi-modal, audio-visual
learning with regional language support. To evaluate the
usability and the learning experience of the AREL system, an
observational study of the prototype system was conducted in a primary
school in Chennai, India. The participants of this study were aged
5 to 8 years. The students were instructed on how to use the app over
their physical book. During the experiment, the children were
asked to try both the speech mode of learning and the image
target-based mode. The screenshots of the image target-based content
delivery are shown in Figure 5. The screenshots of the
speech-based content delivery are illustrated in Figure 6.

During the analysis of the AREL experiment, the children
were tested on the concepts presented. It was observed that, after
using AREL, the children's answers improved.
After the experiment, the parents of the children were
asked to provide feedback on the usability, learning retention,
learning engagement, and overall experience. The results of the
survey are aggregated and tabulated, as shown in Figure 7,
where 1 represents the lowest rating, such as an unusable app,
poor learning retention, or poor learning engagement, and
10 represents the highest rating, that is, a user-friendly design
with high scores on the parameters related to learning.

Figure 2. Architecture of the AREL System.

Figure 3. Image target-based content delivery.

The average usability rating of the app is 7, which represents
the user-friendliness score of the application. As the
application supports both voice interaction and image-based
interaction, it helps the children to better explore their book. The children
were excited to see the virtual content appearing in real time over
their book.

According to the results of the experiment, the multi-modal
user interface with image-target and voice-based interactions, as
well as the augmented reality display integrating real and virtual
items, functions as a natural immersive experience for children.
As the children tried out the various interactions with the same
learning content, learning retention improved, as
indicated by the tabulated score in Figure 7.

The children expressed a strong willingness to explore the
application, indicating that it might be used as a fun and engaging
learning tool. This is also indicated by an average learning
engagement score of 8.33 from the survey. The overall
experience score of the mobile AR platform is 8, which indicates that the
learning experience and usability experience of the children are
positive.
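
The averages reported above are per-criterion means of ratings on a 1 to 10 scale. The sketch below shows one way such a summary could be computed. The ratings in it are placeholders invented for illustration and are not the study's data; the dictionary keys and the function name are likewise assumptions.

```python
from statistics import mean
from typing import Dict, List

# Placeholder ratings on the 1-10 scale; these are NOT the study's data.
ratings: Dict[str, List[int]] = {
    "usability": [6, 7, 8],
    "learning_retention": [7, 8, 8],
    "learning_engagement": [8, 9, 9],
    "overall_experience": [8, 8, 9],
}


def average_scores(responses: Dict[str, List[int]]) -> Dict[str, float]:
    """Arithmetic mean per criterion, rounded to two decimals."""
    for criterion, values in responses.items():
        if not values or any(not 1 <= v <= 10 for v in values):
            raise ValueError(f"invalid ratings for {criterion}")
    return {criterion: round(mean(values), 2) for criterion, values in responses.items()}


summary = average_scores(ratings)
```

Validating the range before averaging catches data-entry errors that would otherwise silently skew a small-sample mean.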

5. CONCLUSIONS 

The development and evaluation of a mobile AR based
enriched learning experience for learning language and math are
reported in this work. One of the benefits of an AR learning
experience over a standard book is that other intriguing aspects
like animation, virtual objects, sound, and video may be included
while the physical book is still present. The findings of the study
suggest that the presence of such aspects during the learning
process generates excitement, learning engagement, and
enjoyment. The findings are supported by the answers of the
children's parents to our survey questions. The findings also
show that the multi-modal interface of real and virtual things
provides a natural immersive experience as well as an engaging
and exciting learning tool for this age range. However, after
repeated usage of the same book, children may become bored
with AREL if they can guess what things will appear. Therefore,
as part of future work, including a surprise aspect in the
application could make it more enjoyable and engaging.

Each image target-based marker can present multiple types of
visual content; randomising the
presentation of such content could surprise the child and can be
included as a future enhancement. Furthermore, learning analytics
of user engagement and learning retention can be utilised to
evaluate the user experience, and personalised content for each
student will be explored in future work.
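
One possible form of the randomisation proposed as future work is sketched below: for each marker, a content variant is chosen at random while avoiding an immediate repeat. This is an illustration of the idea and not part of AREL; the class name, the marker name, and the variant labels are hypothetical.

```python
import random
from typing import Dict, List, Optional


class ContentRandomiser:
    """Pick content for a marker at random without repeating the last pick."""

    def __init__(self, variants: Dict[str, List[str]], seed: Optional[int] = None):
        self._variants = variants
        self._last: Dict[str, str] = {}
        self._rng = random.Random(seed)

    def next_content(self, marker: str) -> str:
        options = self._variants[marker]
        previous = self._last.get(marker)
        # Exclude the previous pick when an alternative exists.
        candidates = [o for o in options if o != previous] or options
        choice = self._rng.choice(candidates)
        self._last[marker] = choice
        return choice


# Hypothetical marker with three content variants.
randomiser = ContentRandomiser({"page_03_one": ["video", "3d_model", "animation"]}, seed=0)
picks = [randomiser.next_content("page_03_one") for _ in range(6)]
```

Excluding only the most recent variant keeps the selection unpredictable while guaranteeing that two consecutive scans of the same page never show identical content.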

Figure 4. Speech-to-text based content delivery.

Figure 5. Screenshots from image target-based content delivery.

Figure 6. Screenshots from speech-based content delivery.

Figure 7. Feedback for AREL.

REFERENCES

[1] The rise of online learning during the COVID-19 pandemic |
World Economic Forum. Online [Accessed 16 August 2022]
https://www.weforum.org/agenda/2020/04/coronavirus-education-global-covid19-online-digital-learning/

[2] Online Education Market Study 2019 | World Market Projected.
Online [Accessed 16 August 2022]
https://www.globenewswire.com/news-release/2019/12/17/1961785/0/en/Online-Education-Market-Study-2019-World-Market-Projected-to-Reach-350-Billion-by-2025-Dominated-by-the-United-States-and-China.html

[3] R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, B. 
MacIntyre, Recent advances in augmented reality, IEEE Comput. 
Graph. Appl., vol. 21, no. 6, Nov. 2001, pp. 34–47.  
DOI: 10.1109/38.963459  

[4] S. Alvarado, W. Gonzalez, T. Guarda, Augmented reality ‘Another 
level of education, Iber. Conf. Inf. Syst. Technol. Cist., vol. 2018-
June, 2018, pp. 1–5.  
DOI: 10.23919/CISTI.2018.8399331  

[5] K. T. Huang, C. Ball, J. Francis, R. Ratan, J. Boumis, J. Fordham, 
Augmented versus virtual reality in education: An exploratory 
study examining science knowledge retention when using 
augmented reality/virtual reality mobile applications, 
Cyberpsychology, Behav. Soc. Netw., vol. 22, no. 2, Feb. 2019, pp. 
105–110.  
DOI: 10.1089/cyber.2018.0150  

[6] H. Kaufmann, D. Schmalstieg, Mathematics and geometry 
education with collaborative augmented reality, Computers & 
Graphics, vol. 27, no. 3, 2003, pp. 339–345.  
DOI: 10.1016/S0097-8493(03)00028-1  

[7] F. Bernard, C. Gallet, H. D. Fournier, L. Laccoureye, P. H. Roche,
L. Troude, Toward the development of 3-dimensional virtual
reality video tutorials in the French neurosurgical residency
program. Example of the combined petrosal approach in the
French College of Neurosurgery, Neurochirurgie, vol. 65, no. 4,
Aug. 2019, pp. 152–157.
DOI: 10.1016/j.neuchi.2019.04.004

[8] A. M. Nidhom, A. A. Smaragdina, K. N. Gres Dyah, B. N. R. P. 
Andika, C. P. Setiadi, J. M. Yunos, Markerless Augmented Reality 
(MAR) through Learning Comics to Improve Student 
Metacognitive Ability, ICEEIE 2019 - Int. Conf. Electr. Electron. 
Inf. Eng. Emerg. Innov. Technol. Sustain. Futur., Oct. 2019, pp. 
201–205.  
DOI: 10.1109/ICEEIE47180.2019.8981411 

[9] Unity Real-Time Development Platform | 3D, 2D VR & AR
Engine. Online [Accessed 16 August 2022]
https://unity.com/

[10] Vuforia Developer Portal. Online [Accessed 16 August 2022] 
https://developer.vuforia.com/ 

[11] Quickstart: Transcribe speech to text by using client libraries |
Cloud Speech-to-Text Documentation | Google Cloud. Online
[Accessed 16 August 2022]
https://cloud.google.com/speech-to-text/docs/transcribe-client-libraries#before-you-begin

 