Gesture recognition of sign language alphabet with a convolutional neural network using a magnetic positioning system

ACTA IMEKO, ISSN: 2221-870X, December 2021, Volume 10, Number 4, 97 - 102

Emanuele Buchicchio1, Francesco Santoni1, Alessio De Angelis1, Antonio Moschitta1, Paolo Carbone1
1 Department of Engineering, University of Perugia, Italy

Section: RESEARCH PAPER
Keywords: gesture recognition; sign language; machine learning; CNN
Citation: Emanuele Buchicchio, Francesco Santoni, Alessio De Angelis, Antonio Moschitta, Paolo Carbone, Gesture recognition of sign language alphabet with a convolutional neural network using a magnetic positioning system, Acta IMEKO, vol. 10, no. 4, article 17, December 2021, identifier: IMEKO-ACTA-10 (2021)-04-17
Section Editors: Umberto Cesaro and Pasquale Arpaia, University of Naples Federico II, Italy
Received October 15, 2021; In final form December 4, 2021; Published December 2021
Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Corresponding author: Emanuele Buchicchio, e-mail: emanuele.buchicchio@studenti.unipg.it

ABSTRACT
Gesture recognition is a fundamental step towards efficient communication for the deaf through the automated translation of sign language. This work proposes the use of a high-precision magnetic positioning system for 3D positioning and orientation tracking of the fingers and palm of the hand. The gesture is reconstructed by the MagIK (magnetic and inverse kinematics) method and then processed by a deep learning classification model trained to recognize the gestures of the sign language alphabet. Results confirm the limits of vision-based systems and show that the proposed method, based on hand-skeleton reconstruction, has good generalization properties. The proposed system, which combines sensor-based gesture acquisition with deep learning techniques for gesture recognition, achieves signer-independent 100% classification accuracy after a few hours of training, using transfer learning on the well-known ResNet CNN architecture. The proposed training method for the classification model can be applied to other sensor-based gesture tracking systems and to other applications, regardless of the specific data acquisition technology.

1. INTRODUCTION
Sign language recognition (SLR) is a research area that involves gesture tracking, pattern matching, computer vision, natural language processing, linguistics, and machine learning [1]. The final goal of SLR is to develop methods and algorithms to build an SLR system (SLRS) capable of identifying signs, decoding their meaning, and producing some output that the intended receiver can understand (Figure 1). The general SLR problem includes the following tasks: 1) letter/number sign gesture recognition, 2) word sign gesture recognition, and 3) sentence-level sign language translation. Available literature surveys [2]-[5] report that recent research achieved accuracy in the range of 80–100% for the first two tasks using vision-based and sensor-based approaches. In this paper, we compare the performance of the two systems we developed: a vision-based system and a hybrid system with a sensor-based data acquisition stage and a vision-based classification stage.

Figure 1. Block diagram of a sign language recognition system (SLRS).

1.1. SLRS Performance Assessment
In the instrumentation and measurement field, machine learning is used for processing indirect measurement results. An indirect measurement is defined in [6] as a "method of measurement in which the value of a quantity is obtained from measurements made by direct methods of measurement of other quantities linked to the measurand by a known relationship." In common machine learning (ML) jargon [7], the quantities that can be measured with a direct method are denoted as features x1, x2, …, xn, and the measurand as y. The measurand y is linked to the features by a functional relationship y = f(x1, x2, …, xn). The process of estimating f is known as "training." During training, the ML model is fitted to the given dataset to find the best possible approximation of f according to the selected optimality criterion. The trained model produces an estimate of y in response to the input vector x = (x1, x2, …, xn).
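As a minimal illustration of this view of training as estimating f from direct measurements (not code from this work; the dataset, model type, and train/test split are placeholder assumptions), a generic classifier can be fitted on feature vectors and then used to estimate y for new inputs:

```python
# Illustrative sketch only: estimating y = f(x1, ..., xn) from examples.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_digits(return_X_y=True)            # features x1..xn and measurand y
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)      # candidate approximation of f
model.fit(X_train, y_train)                    # "training": estimate f from data

y_hat = model.predict(X_test)                  # estimates of y for new inputs x
print("accuracy:", accuracy_score(y_test, y_hat))  # ratio of correct predictions
```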
In the case of classification systems, the measurand y is the class to which an input vector x belongs. The most widely used performance metric for gesture SLRS is classification accuracy, defined as the ratio of correct predictions to the total number of predictions. In this work, accuracy was adopted both as the benchmark metric and as the model optimality criterion.

1.2. Sign Language
Sign language (SL) is defined as "any means of communication through bodily movements, especially of the hands and arms, used when spoken communication is impossible or not desirable" [8]. Modern sign language originated in the 18th century, when Charles-Michel de l'Épée developed a system for spelling out French words with a manual alphabet and expressing whole concepts with simple signs. Other national sign languages were developed from this system and became an essential means of communication among the hearing-impaired and deaf communities. According to the World Federation of the Deaf, over 200 sign languages exist today, used by 70 million deaf people [9]. Sign language involves facial expressions and different body parts, such as arms, fingers, hands, head, and body. One class of sign languages, also known as fingerspelling, is limited to a set of manual signs, performed with one hand, that represent the letters of an alphabet [10]. The ASL signs of the alphabet letters are shown in Figure 2.

Figure 2. Letters of the American Sign Language (ASL) alphabet [11].

1.3. Vision-Based vs. Sensor-Based Approaches for Hands Tracking and Gesture Recognition
Many common devices and applications rely on tracking hands, fingers, or handheld objects. Specifically, smartphones and smartwatches track 2D finger position, a mouse tracks 2D hand position, and augmented reality devices like the Microsoft HoloLens 2 track the 3D pose of the fingers. In addition to SLR, many other applications rely on hand gesture recognition, such as augmented reality [12], assistive technology [13], [14], collaborative robotics [15], telerobotics [16], home automation [17], infotainment systems [18], [19], intelligence and espionage [20], and many others [21]. In this paper, we focused on recognizing the static hand gestures associated with the letters of the alphabet for fingerspelling. Both computer-vision-based and sensor-based approaches were implemented for sign language alphabet recognition.
Hand feature extraction is a significant challenge for vision-based systems [11] because extraction is affected by many factors, such as lighting conditions, complex backgrounds in the image, occlusion, and skin color. Sensor-based gesture recognition systems are commonly implemented as gloves featuring various types of sensors. Sensor-based approaches have the advantage of simplifying the detection process and can help make the gesture recognition system less dependent on input devices. On the other hand, a disadvantage of sensor-based systems is that they can be expensive and too invasive for real-world deployment.

2. VISION-BASED SIGN LANGUAGE GESTURE RECOGNITION
Machine learning techniques are widely adopted for gesture classification tasks. Various public datasets are available for system performance assessment and benchmarking. The American Sign Language MNIST Dataset [22], a variant of the classic MNIST dataset [23] created for sign language gestures, is often used as a baseline. Other, more complex datasets such as [24], [25] are also available.

2.1. Classic Machine Learning and Convolutional Neural Network on MNIST Dataset
The American Sign Language MNIST Dataset is in a tabular format similar to the original MNIST dataset. Each row in the CSV file has a label and 784 pixel values ranging from 0 to 255, representing a single 28 × 28 pixel greyscale image. In total, the dataset contains 27,455 training cases and 7,172 test cases. Classification accuracy was selected as the primary metric for model performance assessment and for benchmarking against other comparable published works.
Two different models were trained to accomplish the letter/number gesture recognition task from static images using two different approaches: a classic ML model and a deep neural network (Figure 3). The first model was selected among many candidates obtained by applying different combinations of feature engineering techniques, ML algorithms, and ensemble methods using the Automated ML (AutoML) service of Azure Machine Learning. Azure Machine Learning [26] is a cloud-based platform that provides tools for the automation and orchestration of all training, scoring, and comparison operations. AutoML tests hundreds of models in a few hours through parallel job execution, with no human interaction after the initial experiment and remote compute target cluster setup. The experiment generated many models that achieve 100% classification accuracy. Among them, the "Logistic Regression"-based model has a smaller memory footprint at runtime.

Figure 3. Workflow for the comparison of various machine learning models for static gesture recognition using the Azure SDK, AutoML, and HyperDrive for operations automation.

The second model was created with a minimal custom convolutional neural network (CNN) architecture (2D convolution, max pooling, flatten, dense layer, dropout, dense) commonly used for simple deep learning image recognition tasks (Figure 4). The model was built and trained with the Keras library. Model hyperparameters such as the number of neurons in the layers, the batch size, the number of training epochs, and the dropout percentage were tuned using the HyperDrive service of Azure Machine Learning. The best-scoring model achieves a classification accuracy of 99.99%.

Figure 4. Deep CNN model architecture.

The best models from the two training pipelines were deployed as web services for production usage. The (zipped) size of the CNN model is about 17 MB, whereas the logistic regression model is only 0.8 MB. Simple and lightweight models should be preferred if there is no performance penalty.
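For illustration, a minimal Keras sketch of this kind of CNN is shown below. The layer sequence follows the architecture listed above, but the file name, filter count, dense-layer width, dropout rate, batch size, and epoch count are placeholder assumptions, not the tuned values selected by HyperDrive.

```python
# Illustrative sketch only: a minimal CNN of the kind described above.
# Hyperparameter values and the CSV file name are placeholder assumptions.
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers

train = pd.read_csv("sign_mnist_train.csv")        # label + 784 pixel columns per row
y_train = train["label"].to_numpy()
x_train = train.drop(columns="label").to_numpy()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0

num_classes = 25                                    # ASL MNIST labels 0-24 (no J or Z)
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),   # 2D convolution
    layers.MaxPooling2D(pool_size=2),                       # max pooling
    layers.Flatten(),                                       # flatten
    layers.Dense(128, activation="relu"),                   # dense layer
    layers.Dropout(0.4),                                    # dropout
    layers.Dense(num_classes, activation="softmax"),        # dense (output)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=10, validation_split=0.1)
```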
2.2. Vision-Based Classification Accuracy
The 100% accuracy was confirmed after deployment with test cases from the American Sign Language MNIST dataset. Simple classic ML models, however, could not recognize gestures in realistic images with variable backgrounds and light conditions. The CNN model scores over 90% accuracy on a subset of the "ASL Alphabet" [24] image dataset, which includes more realistic light and background conditions. Nevertheless, when deployed as a web service, the performance on the image stream from a live camera was not satisfactory for production usage in challenging conditions such as partial line-of-sight obstruction, shadows in the image, and confusing backgrounds, as in the test cases of the ASL Alphabet Test dataset [25].

3. SENSOR-BASED GESTURE RECOGNITION WITH DEEP CNN ON VISUAL GESTURE REPRESENTATION
Our experiments with the vision-based approach confirm both the performance and the limitations described in other works. Given these results, in this paper we propose an SLRS that combines a sensor-based approach in the acquisition stage with computer vision techniques in the gesture recognition stage (Figure 5).

Figure 5. Proposed SLRS with sensor-based data acquisition and vision-based gesture recognition.

3.1. Hand Tracking with the Magnetic Positioning System (MPS)
The Magnetic Positioning System (MPS) described in [27] is immune to many problems that affect computer vision techniques, such as occlusion, lighting conditions, shadows, and skin color. The MPS is composed of transmitting nodes and receiving nodes. The transmitting nodes are mounted on the fingers and hand to be tracked (Figure 6), whereas the receiving nodes are placed at known positions on the sides of the operational volume. An advantage of sensor-based systems is that they are not sensitive to illumination conditions and the other factors affecting vision-based systems. Furthermore, the MPS can also operate in the presence of obstructions caused by objects or body parts. Therefore, the proposed approach enables robust and reliable tracking of the hand and fingers. It is thus suitable for SLR and for the other applications of hand gesture recognition, such as human-machine interaction, virtual and augmented reality, robotic telemanipulation, and automation.

Figure 6. MPS transmitting coils mounted on a wearable glove.

3.2. Gesture Recognition Using Skeleton Reconstruction
Classic machine learning models can achieve 100% accuracy on static sign language recognition tasks on laboratory datasets like [24], and CNN deep learning models score high accuracy (over 90%) on realistic images with variable light. However, these high performances are not robust and cannot be easily replicated in real-world operating conditions. In our paper [11], we demonstrated that training the classification model on data from a tracking system gives substantial advantages in terms of robustness to environmental conditions and signer variability.
The hand gesture is reconstructed using the technique illustrated in [28], with the improvements added in [11], which we called MagIK (magnetic and inverse kinematics). The method, with some empirical modifications introduced in the model to optimize the reconstruction of the gesture across different test subjects, allows reconstructing the movement of the hand with 24 degrees of freedom (DOF). The positions and orientations of all the magnetic nodes estimated by the MPS are sent to a kinematic model of the hand to obtain the position and flexion of each joint and the position and orientation of the whole hand with respect to the MPS reference frame. As the last step, MagIK produces a visual representation, such as the examples shown in Figure 7. We call this technique "skeleton reconstruction".

Figure 7. Examples of ASL letters (Y and L) articulated while wearing the glove, and their respective reconstructions obtained through the kinematic model and MagIK technique.

3.3. Efficient Deep CNN Training for Sign Language Recognition
Many pre-trained deep learning models have proven adequate for image/video classification tasks. We chose the ResNet34 CNN because the ResNet (residual network) architecture achieves good results in image classification tasks and is relatively fast to train [29]. Figure 8 illustrates the training pipeline implemented with PyTorch and the FastAI library [30]. Transfer learning allows fast training of the deep CNN (ResNet34) model. The optimal learning rate for training was estimated with the Cyclical Learning Rates method [31] to avoid time-consuming multiple runs for hyperparameter sweeps. The rules of thumb for selecting the learning rate value from [31] are: 1) one order of magnitude less than where the minimum loss was achieved; and 2) the last point where the loss was clearly decreasing. The loss estimation plot (Figure 9) produced by the algorithm implementation in the FastAI library suggested a learning rate in the range 10⁻² – 10⁻³. Model fine-tuning was performed using the FastAI API with a sequence of freeze, fit-one-cycle, unfreeze, and fit-one-cycle operations using the "discriminative learning rate" method. The training continued until the error rate, validation loss, and training loss converged to zero after four epochs (Figure 10).

Figure 8. Training pipeline for ResNet34 CNN with transfer learning.
Figure 9. Loss estimation plot against learning rate values for optimal learning rate selection. The optimal value for training is in the range 10⁻² – 10⁻³.
Figure 10. Loss and error rate values recorded during the training process.

3.4. Gesture Classification Inference with MPS
After the fine-tuning process, the trained model was deployed in an inference pipeline (Figure 11) that takes the output generated by the MPS control software and, for each acquired frame: 1) reconstructs the gesture using the MagIK kinematic model; 2) exports the visual representation as a bitmap image; 3) feeds the generated gesture image to the CNN model and obtains the array of confidence values associated with each class of the training dataset; and 4) prints the label of the sign class with the highest confidence value.

Figure 11. Inference pipeline with MPS and skeleton reconstruction, and an example of execution from a Jupyter Notebook Python environment.
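A minimal FastAI sketch of this fine-tuning sequence and of the per-frame inference step is given below. It is illustrative only: the folder layout (one sub-folder per letter class of exported skeleton bitmaps), the file names, the epoch counts, and the learning-rate values are assumptions, and the MagIK reconstruction and bitmap export steps are taken as already done.

```python
# Illustrative sketch (not the paper's code). Assumes skeleton images exported by
# MagIK are stored in one sub-folder per letter class under "skeleton_images/".
from fastai.vision.all import *

dls = ImageDataLoaders.from_folder(Path("skeleton_images"),
                                   valid_pct=0.2, seed=42,
                                   item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate)  # pre-trained ResNet34

learn.lr_find()                  # loss-vs-learning-rate plot (cyclical LR method)

learn.freeze()                   # train only the new classification head first
learn.fit_one_cycle(2, 1e-2)
learn.unfreeze()                 # then fine-tune the whole network
learn.fit_one_cycle(2, lr_max=slice(1e-4, 1e-2))  # discriminative learning rates

# Per-frame inference: classify a skeleton bitmap exported for one acquired frame.
# "frame_0001.bmp" is a hypothetical file produced by the reconstruction step.
label, _, probs = learn.predict(PILImage.create("frame_0001.bmp"))
print(label, float(probs.max()))
```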
4. CONCLUSIONS
Classic machine learning models can achieve 100% accuracy on static sign language recognition tasks only on laboratory datasets [22]. Deep CNN models can accomplish the task with over 90% accuracy even on more realistic images [24]. However, these high performances are not robust and cannot be replicated in real-world operating conditions.
Combining sensor-based acquisition, visual reconstruction of the skeleton, and a deep CNN classification model, the proposed system achieves 100% inference accuracy on gestures performed by different people after a few epochs of training. We could not achieve 100% accuracy with classic machine learning under comparable experimental conditions. The sensor-based approach is immune to many problems that affect computer vision techniques, such as occlusion, lighting conditions, shadows, and skin color. Building a gesture recognizer on top of a tracking system, instead of classifying directly from a sensor stream, can help make the gesture recognition system less dependent on input devices. Skeleton tracking allows for good generalization: system performance is robust across different sign performers, and classification does not rely on specific hand characteristics. The classification method implemented in this work can be applied to almost any sensor-based dataset: the only requirement is to provide a convenient visual representation of the input data, to be used in both training and inference. After replacing MagIK with another method suitable for the specific application, the other stages of the training and inference pipelines do not need any change and can be directly reused for many other applications.

REFERENCES
[1] H. Cooper, B. Holt, R. Bowden, Sign language recognition, in: Visual Analysis of Humans, T. Moeslund, A. Hilton, V. Krüger, L. Sigal (Eds.), Springer, 2011. DOI: 10.1007/978-0-85729-997-0_27
[2] A. Wadhawan, P. Kumar, Sign language recognition systems: A decade systematic literature review, Arch. Comput. Methods Eng. 28 (2019) pp. 785–813. DOI: 10.1007/s11831-019-09384-2
[3] M. J. Cheok, Z. Omar, M. H. Jaward, A review of hand gesture and sign language recognition techniques, Int. J. Mach. Learn. Cyber. 10 (2019) pp. 131–153. DOI: 10.1007/s13042-017-0705-5
[4] R. Elakkiya, Machine learning based sign language recognition: A review and its research frontier, J. Ambient Intell. Hum. Comput., 2020. DOI: 10.1007/s12652-020-02396-y
[5] R. Rastgoo, K. Kiani, S. Escalera, Sign language recognition: A deep survey, Expert Syst. Appl. 164 (2021). DOI: 10.1016/j.eswa.2020.113794
[6] "IEC standard 60050-300", International Electrotechnical Vocabulary (IEV) - Part 300: Electrical and Electronic Measurements and Measuring Instruments, International Electrotechnical Commission, Jul. 2001.
[7] S. Shirmohammadi, H. Al Osman, Machine learning in measurement Part 1: error contribution and terminology confusion, IEEE Instrumentation & Measurement Magazine, 24(2) (2021) pp. 84-92. DOI: 10.1109/MIM.2021.9400955
[8] Encyclopedia Britannica, Sign Language. Online [Accessed December 05 2021] https://www.britannica.com/topic/sign-language
[9] World Federation of the Deaf. Online [Accessed December 05 2021] http://wfdeaf.org/our-work
[10] Fingerspelling, Wikipedia.
Online [Accessed December 05 2021] https://en.wikipedia.org/wiki/Fingerspelling
[11] M. Rinalduzzi, A. De Angelis, F. Santoni, E. Buchicchio, A. Moschitta, P. Carbone, P. Bellitti, M. Serpelloni, Gesture recognition of sign language alphabet using a magnetic positioning system, Appl. Sci. 11 (2021), art. 5594. DOI: 10.3390/app11125594
[12] J. Dong, Z. Tang, Q. Zhao, Gesture recognition in augmented reality assisted assembly training, J. Phys. Conf. Ser. 1176(3) (2019), art. 032030. DOI: 10.1088/1742-6596/1176/3/032030
[13] R. E. O. Ascari Schultz, L. Silva, R. Pereira, Personalized interactive gesture recognition assistive technology, in Proceedings of the 18th Brazilian Symposium on Human Factors in Computing Systems, Vitória, Brazil, 22–25 October 2019. DOI: 10.1145/3357155.3358442
[14] S. S. Kakkoth, S. Gharge, Real time hand gesture recognition and its applications in assistive technologies for disabled, in Proceedings of the Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, India, 16–18 August 2018. DOI: 10.1109/ICCUBEA.2018.8697363
[15] M. A. Simão, O. Gibaru, P. Neto, Online recognition of incomplete gesture data to interface collaborative robots, IEEE Trans. Ind. Electron. 66 (2019) pp. 9372–9382. DOI: 10.1109/TIE.2019.2891449
[16] I. Ding, C. Chang, C. He, A Kinect-based gesture command control method for human action imitations of humanoid robots, in Proceedings of the 2014 International Conference on Fuzzy Theory and Its Applications (iFUZZY2014), Kaohsiung, Taiwan, 26–28 November 2014, pp. 208–211. DOI: 10.1109/iFUZZY.2014.7091261
[17] S. Yang, S. Lee, Y. Byun, Gesture recognition for home automation using transfer learning, 2018 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), Bangkok, Thailand, 21–24 Oct. 2018, pp. 136–138. DOI: 10.1109/ICIIBMS.2018.8549921
[18] Q. Ye, L. Yang, G. Xue, Hand-free gesture recognition for vehicle infotainment system control, 2018 IEEE Vehicular Networking Conference (VNC), Taipei, Taiwan, 5–7 December 2018, pp. 1–2. DOI: 10.1109/VNC.2018.8628409
[19] Z. U. A. Akhtar, H. Wang, WiFi-based gesture recognition for vehicular infotainment system—An integrated approach, Appl. Sci. 9 (2019), art. 5268. DOI: 10.3390/app9245268
[20] Y. Meng, J. Li, H. Zhu, X. Liang, Y. Liu, N. Ruan, Revealing your mobile password via WiFi signals: Attacks and countermeasures, IEEE Trans. Mob. Comput. 19(2) (2019) pp. 432–449. DOI: 10.1109/TMC.2019.2893338
[21] M. J. Cheok, Z. Omar, M. H. Jaward, A review of hand gesture and sign language recognition techniques, Int. J. Mach. Learn. Cyber. 10 (2019) pp. 131–153. DOI: 10.1007/s13042-017-0705-5
[22] The American Sign Language MNIST Dataset. Online [Accessed December 05 2021] https://www.kaggle.com/datamunge/sign-language-mnist
[23] Y. LeCun, C. Cortes, MNIST handwritten digit database, AT&T Labs, 2010. Online [Accessed December 05 2021] http://yann.lecun.com/exdb/mnist
[24] ASL Alphabet. Online [Accessed December 05 2021] https://www.kaggle.com/grassknoted/asl-alphabet
[25] ASL Alphabet Test. Online [Accessed December 05 2021] https://www.kaggle.com/danrasband/asl-alphabet-test
[26] Azure Machine Learning Product Overview. Online [Accessed December 05 2021] https://azure.microsoft.com/it-it/services/machine-learning/#product-overview
[27] F. Santoni, A. De Angelis, A. Moschitta, P. Carbone, A multi-node magnetic positioning system with a distributed data acquisition architecture, Sensors 20(21) (2020), art. 6210, pp. 1-23.
DOI: 10.3390/s20216210
[28] F. Santoni, A. De Angelis, A. Moschitta, P. Carbone, MagIK: A hand-tracking magnetic positioning system based on a kinematic model of the hand, IEEE Transactions on Instrumentation and Measurement 70 (2021), art. 9376979. DOI: 10.1109/TIM.2021.3065761
[29] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27-30 June 2016, pp. 770-778. DOI: 10.1109/CVPR.2016.90
[30] J. Howard, S. Gugger, Fastai: A layered API for deep learning, Information 11(2) (2020), art. 108. DOI: 10.3390/info11020108
[31] L. N. Smith, Cyclical learning rates for training neural networks. Online [Accessed December 05 2021] https://arxiv.org/abs/1506.01186