International Journal of Interactive Mobile Technologies (iJIM) – eISSN: 1865-7923 – Vol. 15, No. 17, 2021

Leveraging Sensor Fusion and Sensor-Body Position for Activity Recognition for Wearable Mobile Technologies

https://doi.org/10.3991/ijim.v15i17.25197

Ashraful Alam, Anik Das, Md. Shahriar Tasjid, Ahmed Al Marouf(*)
Daffodil International University, Dhaka, Bangladesh
marouf.cse@diu.edu.bd

Abstract—Smart devices like smartphones and smartwatches have made this world smarter. These wearable devices are created through complex research methodologies to make them more usable and interactive with their users. Various interactive mobile applications, such as augmented reality (AR), virtual reality (VR), and mixed reality (MR) applications, depend heavily on the built-in sensors of smart devices. Many capabilities can be derived from sensors such as the accelerometer and gyroscope. Different physical activities, such as walking, jogging, and sitting, can be important for analyses such as health-state prediction and exercise-duration estimation, using those sensors together with artificial intelligence. In this paper, we have implemented machine learning and deep learning algorithms to detect and recognize eight activities, namely walking, jogging, standing, walking upstairs, walking downstairs, sitting, sitting-in-a-car, and cycling, with a maximum of 99.3% accuracy. A few activities, such as sitting and sitting-in-a-car, are almost identical in action but difficult to distinguish, which makes the prediction task more challenging. In this paper, we have hypothesized that with more sensors (sensor fusion) and more data collection points (sensor-body positions), a wider range of activities can be recognized and the recognition accuracies can be increased.
Finally, we showed that the combination of all the sensor data from both the pocket/waist and the wrist can be used to recognize a wide range of activities accurately. The possibility of using the proposed methodologies for futuristic mobile technologies is quite significant. The adoption of recent deep learning algorithms, such as the convolutional neural network (CNN) and the bi-directional Long Short-Term Memory (Bi-LSTM), demonstrated the high credibility of the presented methods in our experimentation.

Keywords—sensor fusion, human activity recognition, machine learning, deep learning, sensors

1 Introduction

We are living in an age of technological advancements. In this era, human-centric computing is an emerging field of research in which we can understand the nature of human behavior, habits, interests, etc. Human activity recognition is a type of task in which human movement is recognized or predicted by studying and analyzing human-computer interaction (HCI), video surveillance, wearable devices, or sensors. However, the challenging part is detecting and predicting activities with decent accuracy. In our work, we recognize human activity using the built-in sensors of smartphones. Sensors produce high-frequency data every second reflecting the physical movement of different parts of the body. Keeping a smartphone in the pocket will not give the same data as keeping it in the hand. Thus, precisely predicting human activities from a wide range of sensor data is a challenging task. Human activity recognition can be defined as "determining or recognizing the human activity using technological aspects".
In the literature, we found several ways to recognize human activities such as sitting, walking, running, walking upstairs, and walking downstairs. Among the technological aspects, the use of image processing [1–4] has been widely accepted, and the reviews [5–9] have shown various techniques focusing on the prediction task. Feature extraction and dimensionality-reduction-based approaches were used to obtain significant performance from the classifiers. Sensor-based approaches [2, 6] are not new in this area, but the usability of smartphone built-in sensors is still under research.

For this paper, we placed one smartphone on the wrist and another in the pocket to capture data for different activities. The sensors utilized for this task are the accelerometer and gyroscope of each smartphone. The contributions of this paper are:

─ Integration of multiple sensor data streams into one multi-modal dataset by a sensor fusion technique, and its utilization for further processing.
─ Consideration of sensor-body positions, such as wrist, pocket, and wrist-pocket, for collecting sensor data and utilizing it for further processing.
─ Application of classification algorithms, e.g., KNN, CNN, bi-directional LSTM, SVM, etc., for recognizing the different activities.

We have collected data for sitting-in-a-car and cycling, which are novel in this domain, and integrating these activities provides robustness in the training-testing method.

2 Related works

Intense research has been going on for the past two decades in the area of human-computer interaction, particularly on human activity recognition. Several attempts have been made to detect human activity from different sensing points attached to the body. In [10], accelerometer and microphone data were used to recognize activities in a certain environment. Their attempt was to achieve good accuracy even if the device's location on the body changes.
Another attempt [11] used only the tri-axial accelerometer of a phone to detect activities. They performed their experiment keeping the phone both on the wrist and in the pocket, then compared the accuracy of the two models. They used both individual classifiers and combinations of classifiers, and their results were promising. Vinh [12] used accelerometer data from both the hip and the waist. Their attempt was to detect activity using low-powered and low-cost devices. Bao [13] used bi-axial accelerometer data retrieved from the wrist, ankle, thigh, elbow, and hip. While collecting data, they did not instruct the users where to place the device while performing the activities; therefore, the data came from those five different body locations. After testing with several algorithms and comparing them, they showed that thigh and waist data combined can perform close to all five data points combined. The decision tree was the best classifier in their experiment. The work of Jatoba [14] and his team was done for monitoring the activities of patients, so they analyzed the data of a micro-accelerometer placed on the patient's chest. With KNN and CART, they were able to get decent accuracy.

Most of this work has been done on accelerometer data taken from either the waist or the wrist. However, Zhu [15] used accelerometer data from the foot and the waist, and their best-performing algorithm was the HMM. They reduced the complexity of the dataset by fusing the data collected from the foot and waist, thereby overcoming the HMM's need for a strong displacement of sensors to work well. Some attempts were also made in a discriminative way [16]. The work closest to ours is that of San-Segundo-Hernández [17] and E. Bulbul [18]. In [17], they used accelerometer data from the wrist and pocket.
They claimed that accelerometer data provided better results than gyroscope data. In contrast, instead of analyzing the accelerometer and gyroscope data separately, we treated both sensors' data as features and trained our models on the combined data. This way, a wide range of activities can be recognized accurately, as the models get more features to work on. We support this claim by analyzing different combinations of sensor data in section 5. Bulbul [18] used both sensors together to build their model and also achieved very good results, but they used only pocket data. Our work includes both sensors from two different devices, placed in the pocket and on the wrist. This way, the model can be trained to successfully distinguish among a wide range of activities even when they are quite similar, such as sitting in a moving car versus sitting in a chair in a stationary place. Similarly, hip-joint-based hand activity recognition was proposed in [38], where the distances between the hip joint and the hand joints were extracted using the Microsoft Kinect skeleton tracking system.

Different experiments have used different sensors placed at single or multiple body locations. Many algorithms have been implemented, e.g., SVM, ANN, HMM, CNN, and LSTM, but few of them perform well enough. It is evident from the comparison above that when different sensors are combined as features, or when more than one sensor is placed at multiple locations on the human body, better performance and accuracy are achieved [33]. In the context of artificial intelligence, utilizing big data for green and sustainable technologies, such as human activity recognition models, is very crucial and important [34–35]. For the betterment of health technologies, merging big data and green technologies is a must, and its significance can be found in [36–37].
Machine learning algorithms, especially classification algorithms, are utilized for problems similar to human activity recognition. Some of the significant problems addressed using machine learning algorithms are: detecting malicious links on the world wide web (www) [21], handwritten digit recognition [22], depression detection from image and video analysis [23], air temperature prediction [24], etc. Therefore, we have applied seven different classification algorithms, namely logistic regression, random forest, k-nearest neighbor (k-NN), support vector machine (SVM), gradient boosting, convolutional neural network (CNN), and bi-directional LSTM. The experimental results obtained from these classifiers are highlighted in the results and analysis section.

Table 1. Comparative analysis and summary of different approaches made to detect human activity

| Paper | Sensors | Sensor Placed | Algorithms | Best Accuracy Obtained |
|---|---|---|---|---|
| Lester [10] | Accelerometer, Microphone | Waist, wrist, shoulder | HMM | 90% |
| Akram Bayat [11] | Accelerometer | Either wrist or waist | SVM, LMT, Random Forest | 91.15% |
| E. Bulbul [18] | Accelerometer, Gyroscope | Pocket | SVM, k-NN, Bagging, Stacking | 99.4% |
| San-Segundo-Hernández [17] | Accelerometer | Pocket, wrist | CNN-LSTM, HMM | 99.4% |
| L. Vinh [12] | Accelerometer | Waist, hip | SMCRF | 88.38% |
| C. Zhu [15] | Accelerometer | Foot, waist | HMM | 90% |
| L. Bao [13] | Accelerometer | Wrist, ankle, thigh, elbow, hip | KNN, NB | 84% |
| L. C. Jatoba [14] | Accelerometer, SPI | Chest | KNN, CART | 95% |
| Z. He [19] | Accelerometer | Pocket | SVM | 97.51% |
| A. Khan [20] | Accelerometer | Chest | ANN | 97.9% |

3 Proposed methodology

3.1 Research subject and instrument

Nowadays, every person has at least one smartphone in their pocket. Along with that, people have started to wear wristbands instead of analog watches. These gadgets have different kinds of built-in sensors, among which accelerometers and gyroscopes are the most common.
These two sensors have many uses. In this paper, we focus on human activity detection and recognition using accelerometer and gyroscope sensor data. We detect eight activities: walking, standing, sitting, walking upstairs, walking downstairs, jogging, cycling, and sitting in a car. We collected this dataset using three Android-based smartphones: a Samsung Galaxy S10 Plus, a Redmi Note 7s, and a Huawei Y9. We used an application called "Sensor Data" to collect the required data.

3.2 Data collection procedure

Collecting high-frequency data properly was a challenging task. First, we recruited 17 volunteers to perform different activities for specific time periods. The Android application used to collect data was set at a 50 Hz sampling rate, so data was recorded in the dataset at a rate of 50 samples per second. We used two smartphones at a time: one in the pocket and the other tied to the wrist. The device tied to the wrist captures data similar to a wristwatch. For each activity, we thus obtained data from two locations (wrist and pocket). Each location provides both accelerometer and gyroscope data, each of which is three-dimensional with data points along the x, y, and z axes. So, we got a total of 12 columns for those 4 sensors, each having 3 axes. Finally, we stored each distinct type of data, such as 'Activity_Hand_accelerometer', 'Activity_Hand_gyroscope', 'Activity_Pocket_accelerometer', and 'Activity_Pocket_gyroscope', in text files. That is the procedure of our time-series data collection.

After finishing the data collection phase, we pre-processed the raw data to make a suitable dataset on which we could implement different algorithms. First of all, we removed unnecessary strings from the text files.
This was necessary because the mobile application generated some irrelevant strings at the beginning of each text file. Then we converted each text file into a comma-separated values (CSV) file. Here, we have 12 input columns for each activity and 1 output column in which we labelled the activity name. For each activity, we took 44,000 samples. After that, we merged all the CSV files into one to obtain our complete dataset, which contains a total of 352,000 samples, or instances. These instances are fed into the next step for pre-processing.

3.3 Data pre-processing

In the feature engineering step, a number of different operations were needed for data manipulation. We removed null values using mean and median imputation. There was noise in the dataset that needed to be removed, so we applied a Butterworth filter to remove the noise and smooth the dataset. Then we performed label encoding: as our output label was categorical, we converted it into numerical data using a label encoding function. Furthermore, we split our dataset into input and output columns, and split it into a train set and a test set at a ratio of 4:1, i.e., an 80%–20% distribution. The test set was kept aside for the final testing stage. In the training stage, k-fold cross-validation was used to train the models; this method is well established and widely accepted in the machine learning research community.

After splitting the dataset, we applied feature scaling to bring all the data points into a specific range, using RobustScaler and StandardScaler. We fitted and transformed the scalers on the training dataset but only applied the transformation to the test dataset. This prevents data leakage, so the test set remains completely unseen by the model.
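The pre-processing pipeline described above (noise filtering, label encoding, an 80/20 split, and leakage-free scaling) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the paper's actual pipeline; the filter order and cutoff frequency are our assumptions, since the paper does not report them.

```python
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical stand-in for the merged 12-column sensor matrix;
# the real dataset described in the paper has 352,000 rows.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12))
y = rng.choice(["Walking", "Jogging", "Sitting", "Cycling"], size=1000)

# Low-pass Butterworth filter (assumed order 3, 20 Hz cutoff at the
# paper's 50 Hz sampling rate) applied per column to smooth sensor noise.
b, a = butter(N=3, Wn=20, btype="low", fs=50)
X_smooth = filtfilt(b, a, X, axis=0)

# Encode the categorical activity label as integers.
y_enc = LabelEncoder().fit_transform(y)

# 80/20 train-test split; the scaler is fitted on the train split only
# and merely applied to the test split, preventing data leakage.
X_train, X_test, y_train, y_test = train_test_split(
    X_smooth, y_enc, test_size=0.2, random_state=42)
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)
```

RobustScaler can be substituted for StandardScaler in the same fit-on-train, transform-both pattern.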
As we have time-series data, we had to specify a fixed-size window, also known as a sliding window. We tested different window sizes, such as 2 seconds and 4 seconds, and observed the performance. Choosing the window size is tricky. The bigger the window, the better the result will be, but too big a window will overfit the model. Moreover, the processing becomes heavy, as each window contains a lot of data; this causes more problems if activities are detected in real time. On the other hand, a smaller window is very fast to process, but the result will not be as good as with a comparatively larger window. Considering this trade-off, we used a window size of 4 seconds. We also defined the hop size, which is known as the stride. The sliding window moves forward according to the stride, and at each step it creates one sample of an activity with dimensions 200×12. The hop size used was large enough to reduce overlapping, which ensures the diversity of samples across windows.

Using the sliding window on our dataset, we obtained 3D data. As mentioned earlier, 50 rows of data are recorded each second; therefore, for a window of 4 seconds, we get 200 rows and 12 columns of data. Each window represents one activity. This data shape is well suited to neural network models, so we used this dataset for our CNN and bi-directional LSTM models.

Fig. 1. Visualization of how the sliding window was used for data overlapping in the creation of the dataset

For the traditional machine learning models, however, we needed to flatten the dataset. Each activity sample's shape is 200×12 before flattening; after flattening, we get 2,400 columns per sample. So many columns can lead to overfitting, so for dimensionality reduction we used Principal Component Analysis (PCA).
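The windowing, flattening, and PCA steps above can be sketched as follows. The stream data, the hop size of 100 samples, the majority-label rule, and the number of PCA components are all illustrative assumptions; the paper specifies only the 200×12 window shape.

```python
import numpy as np
from sklearn.decomposition import PCA

def make_windows(data, labels, window=200, hop=100):
    """Segment an (n_rows, 12) sensor stream into (n_windows, window, 12)
    samples; each window gets its majority activity label."""
    X, y = [], []
    for start in range(0, len(data) - window + 1, hop):
        X.append(data[start:start + window])
        vals, counts = np.unique(labels[start:start + window],
                                 return_counts=True)
        y.append(vals[np.argmax(counts)])
    return np.asarray(X), np.asarray(y)

rng = np.random.default_rng(0)
stream = rng.normal(size=(2000, 12))   # hypothetical 40 s of data at 50 Hz
labels = np.repeat([0, 1], 1000)       # two toy activity labels
Xw, yw = make_windows(stream, labels)  # 3-D windows for CNN / Bi-LSTM

# The traditional classifiers need 2-D input: each 200x12 window is
# flattened to 2,400 columns and then reduced with PCA.
X_flat = Xw.reshape(len(Xw), -1)
X_red = PCA(n_components=5).fit_transform(X_flat)
```

With a hop smaller than the window, consecutive windows overlap, as depicted in Fig. 1.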
In the feature scaling step, there were also some problems with 3D data while implementing the deep learning models. Therefore, we also reshaped the whole training dataset as an alternative, iterating over all the activity samples and reshaping them. This completes the procedure of our data collection, data storing, data cleaning, and data pre-processing.

Fig. 2. How three-dimensional data is flattened into two-dimensional data

4 Data analysis

While performing any kind of activity, the orientation of the phone and smartwatch is expected to differ across different kinds of activities, and in most cases this difference does occur. But often two or more activities can produce similar kinds of data, meaning the data points will overlap in the same Cartesian region; in that case, models can make prediction errors. This can be seen in the Figure 3 visualization, where some of the data are plotted. Looking at the pocket accelerometer data, the downstairs and upstairs data have similar dispersion, range, and orientation in some regions.

Fig. 3. Comparative visualization of accelerometer and gyroscope data points for walking upstairs, walking downstairs, sitting, and sitting in a car

Therefore, machine learning algorithms might often struggle to distinguish between those activities. In cases like these, wrist data adds more variation. This is very hard to prove through visualization alone, so here we can only build intuition; in section 5, however, the hypothesis is confirmed by analyzing different sensor combinations, showing that the highest accuracy is reached when all the sensor data is taken as features.
Looking at the wrist accelerometer data, the data differs in dispersion and orientation between the upstairs and downstairs activities. Adding the gyroscope provides still more information that can easily be distinguished by the model. From the visualization of the gyroscope data, we can see that the upstairs and downstairs data differ a lot. So, visually, with these 4 sensors (2 on the wrist and 2 at the waist), the margin of error appears very low for any pair of activities that produce almost similar data.

Sitting in a car and sitting at home in a chair are almost identical in action and hard to distinguish. From the Figure 3 visualization, we can see that the accelerometer data of those two activities is hard to differentiate: the data points have almost the same dispersion and orientation. Therefore, any machine learning algorithm will make plenty of errors when trying to predict whether someone is sitting in a car or in a chair at home. In this scenario too, the gyroscope data adds variation that can easily be differentiated.

Fig. 4. Views from different angles of accelerometer data recorded from the waist for the 8 activities

Fig. 5. Views from different angles of gyroscope data recorded from the waist for the 8 activities

From the above two scenarios, we can see that when data from the waist accelerometer and gyroscope and the wrist accelerometer and gyroscope are analyzed together, we can predict the activities very accurately even if they are very similar in action. When all the activities are put together in a single graph, we can further visualize the above-mentioned problem. Here only waist data are plotted; this is enough to give an intuition for the wrist data as well. All the data plotted in these two figures (Figure 4, Figure 5) are from the same instance.
In other words, these are the data for a specific 10 seconds of the entire dataset, so that we can compare how the data points lie for different activities. We can clearly see that the upstairs and downstairs data, shown in yellow and cyan respectively, overlap in Figure 4. So even from those sensors, we might often not be able to distinguish between upstairs and downstairs, and some errors will eventually come up. But in Figure 5, we can easily distinguish between the cyan and yellow points, so these activities will be separated quite accurately. In the same way, other activities will be differentiated even if they are almost similar in action. That is the reason why we are able to achieve an accuracy of 99.3% even though some of the activities are very similar to each other.

5 Experimental results and discussion

In the past two decades, there have been many attempts to recognize human activities with precision. Our experiments show that when we combine the accelerometer and gyroscope data from the pocket and wrist as feature columns, it is possible to detect and recognize a wide variety of activities with precision. The dataset we built and experimented with supports our hypothesis. We compare the accuracies obtained from different combinations of sensors and different combinations of device positions. It will be evident how the accuracy is affected if we remove one or more sensors, or one device, from our model. Here, by lower bound we mean the lowest per-activity score achieved by a specific model among the eight activities, and by upper bound we mean the highest.
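The lower-bound and upper-bound scores reported below are simply the minimum and maximum of the per-activity f1-scores. A small sketch of how they can be computed, using made-up predictions rather than the paper's actual model output:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Toy ground truth and predictions over the eight activity classes;
# purely illustrative synthetic data.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 8, size=500)
y_pred = np.where(rng.random(500) < 0.9,     # ~90% correct predictions
                  y_true,
                  rng.integers(0, 8, size=500))

# One f1-score per activity class.
per_class_f1 = f1_score(y_true, y_pred, average=None, labels=np.arange(8))
lower_bound = per_class_f1.min()   # worst-recognized activity
upper_bound = per_class_f1.max()   # best-recognized activity
accuracy = accuracy_score(y_true, y_pred)
```

A high overall accuracy can coexist with a low lower bound, which is why both are reported in Tables 2–4.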
When the accelerometer and gyroscope data from a device placed in the pocket are used, we can see in Table 2 that most of the algorithms perform well, achieving 96% to 97% accuracy. But if we look carefully, the lower-bound score does not reach the expected level. Though this is better than recognizing activities from the accelerometer and gyroscope data of the device placed on the wrist, we can still improve the result by fusing all the data. So the models here do not perform very well on our dataset, which we built very casually, mimicking real-life actions. Among the traditional classification algorithms, random forest has shown (Table 2) quite satisfactory performance in terms of accuracy. It is evident that in many cases similar to this one, random forest is a good model [25], though SVM has been proven to work better in some special cases, such as fault classification in smart distribution networks [26], ozone prediction [27], cyberbullying identification [28], harmonic source identification [29], etc.

Table 2. Performance evaluation metrics using both sensors, with the device on the wrist or in the pocket

| Algorithms | Wrist: Lower Bound (f1-score) | Wrist: Upper Bound (f1-score) | Wrist: Accuracy | Pocket: Lower Bound (f1-score) | Pocket: Upper Bound (f1-score) | Pocket: Accuracy |
|---|---|---|---|---|---|---|
| Logistic Regression | 0.08 | 0.96 | 63% | 0.47 | 1.00 | 83% |
| Random Forest | 0.76 | 0.98 | 90% | 0.89 | 1.00 | 96% |
| KNN | 0.63 | 1.00 | 85% | 0.89 | 1.00 | 95% |
| SVM | 0.72 | 1.00 | 87% | 0.90 | 1.00 | 96% |
| Gradient Boosting Classifier | 0.77 | 0.99 | 90% | 0.93 | 1.00 | 97% |
| CNN | 0.66 | 0.99 | 84% | 0.88 | 1.00 | 96% |
| Bi-LSTM | 0.08 | 0.96 | 63% | 0.47 | 1.00 | 83% |

In Table 2, we can see that CNN did well: it predicted all the activities with decent accuracy. But at the individual activity level, its lower-bound f1-score is only 0.66 when the device is placed on the wrist.
Therefore, it is evident that although some activities were detected quite accurately, with an upper-bound f1-score of up to 1.00, the CNN struggled to detect certain activities, even though the CNN is an extremely powerful model that has shown quite satisfactory performance in many cases, such as license plate detection [30] and rice false smut detection [31]. So the conclusion from Table 2 is that, to predict different activities accurately, data from only the wrist or only the pocket is not sufficient.

Table 3. Performance evaluation metrics using a single sensor at both locations (wrist and pocket)

| Algorithms | Accelerometer: Lower Bound (f1-score) | Accelerometer: Upper Bound (f1-score) | Accelerometer: Accuracy | Gyroscope: Lower Bound (f1-score) | Gyroscope: Upper Bound (f1-score) | Gyroscope: Accuracy |
|---|---|---|---|---|---|---|
| Logistic Regression | 0.46 | 1.00 | 79% | 0.23 | 0.61 | 41% |
| Random Forest | 0.84 | 1.00 | 95% | 0.74 | 0.99 | 86% |
| KNN | 0.74 | 1.00 | 92% | 0.62 | 1.00 | 76% |
| SVM | 0.64 | 1.00 | 86% | 0.03 | 0.99 | 72% |
| Gradient Boosting Classifier | 0.78 | 1.00 | 93% | 0.71 | 1.00 | 86% |
| CNN | 0.87 | 1.00 | 95% | 0.88 | 1.00 | 93% |
| Bi-directional LSTM | 0.86 | 1.00 | 94% | 0.85 | 1.00 | 91% |

If only accelerometer data is used (Table 3) from the two devices placed in the pocket and on the wrist, the models do not improve at all. Here the best-performing algorithms are CNN and random forest. Though some activities were recognized almost without error, as the upper-bound f1-score is a perfect 1.00, the lower-bound f1-scores are still very low, and our aim is to recognize all the activities with the least margin of error. The same goes for the models when gyroscope data from those two locations is analyzed; they are in no way better than the previously analyzed configurations. What we can understand from these scenarios is that if we put all the data together, the sensors contribute complementary information wherever the data of two or more activities overlap for a single sensor or device.
Therefore, the accuracy will improve, as we claim in our hypothesis.

Table 4. Performance evaluation using all the devices and sensors together (device locations: wrist and pocket; sensors used: accelerometer and gyroscope)

| Algorithms | Lower Bound (f1-score) | Upper Bound (f1-score) | Accuracy |
|---|---|---|---|
| Logistic Regression | 0.60 | 0.99 | 84.2% |
| Random Forest | 0.89 | 1.00 | 95.8% |
| KNN | 0.74 | 1.00 | 90.7% |
| SVM | 0.90 | 1.00 | 96.7% |
| Gradient Boosting Classifier | 0.85 | 1.00 | 95.0% |
| CNN | 0.94 | 1.00 | 98.2% |
| Bi-LSTM | 0.99 | 1.00 | 99.3% |

To prove our point, the final models are built using the accelerometer and gyroscope data from the pocket and wrist together as features, which provides 12 features in total. Looking at Table 4, it can easily be seen that all the models improved substantially. Our best-performing model is the bi-directional LSTM, with a lower-bound f1-score of 0.99, which is very accurate for a model intended to recognize human activities with high precision. Our dataset includes activities that are very similar in action and hard to distinguish: for example, sitting in a chair at home and sitting in a car are very similar in action, and sitting and standing are both very stationary. Yet all of these were recognized very accurately by the bi-directional LSTM.

6 Conclusion

Activities can be recognized in different ways using different algorithms and sensors, placing the devices at different locations on the human body. But the aim is to recognize activity as accurately as possible, and most importantly to detect complex activities that may be similar in action. We have introduced an effective way to recognize activity using accelerometer and gyroscope sensors from the pocket and wrist. This helped us to accurately identify the eight activities, some of which are difficult to distinguish because of the similarities among them.
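As a concrete illustration, a bi-directional LSTM classifier of the kind that achieved the best results above can be sketched as follows. The hidden size and the use of the last time step for classification are our assumptions; the paper reports the 200×12 input windows and eight output classes but not the layer configuration.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    """Sketch of a Bi-LSTM activity classifier for 200x12 sensor windows.
    Layer sizes are illustrative assumptions, not the paper's reported setup."""
    def __init__(self, channels=12, hidden=64, n_classes=8):
        super().__init__()
        self.lstm = nn.LSTM(channels, hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)  # 2x for both directions

    def forward(self, x):                 # x: (batch, 200, 12)
        out, _ = self.lstm(x)             # (batch, 200, 2 * hidden)
        return self.head(out[:, -1, :])   # class logits from last time step

model = BiLSTMClassifier()
logits = model(torch.randn(4, 200, 12))   # a batch of 4 windows -> (4, 8)
```

Such a model would be trained with a cross-entropy loss on the windowed 3-D dataset described in section 3.3.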
Different traditional and deep learning algorithms have been applied to our dataset, as found useful in several machine-learning-related works [39–42]. Among them, the bi-directional LSTM gives the best accuracy, 99.3%, which is comparatively better. Thus, the idea of sensor fusion prevails, and the sensor-body position mechanism has given quite significant results in terms of accuracy. Our proposed methods could be applied in health-related solutions; for example, a diabetes patient's wearable sensors [32] could be fused, and sensor-body positions may affect the proper recognition of health conditions.

7 References

[1] Soleimani, Elnaz, and Ehsan Nazerfard. "Cross-subject transfer learning in human activity recognition systems using generative adversarial networks." Neurocomputing 426 (2021): 26–34. https://doi.org/10.1016/j.neucom.2020.10.056
[2] Erdaş, Çağatay Berke, and Selda Güney. "Human Activity Recognition by Using Different Deep Learning Approaches for Wearable Sensors." Neural Processing Letters (2021): 1–15. https://doi.org/10.1007/s11063-021-10448-3
[3] Tasnim, Nusrat, Mohammad Khairul Islam, and Joong-Hwan Baek. "Deep Learning Based Human Activity Recognition Using Spatio-Temporal Image Formation of Skeleton Joints." Applied Sciences 11, no. 6 (2021): 2675. https://doi.org/10.3390/app11062675
[4] Ke, Shian-Ru, Hoang Le Uyen Thuc, Yong-Jin Lee, Jenq-Neng Hwang, Jang-Hee Yoo, and Kyoung-Ho Choi. "A review on video-based human activity recognition." Computers 2, no. 2 (2013): 88–131. https://doi.org/10.3390/computers2020088
[5] Ramanujam, E., Thinagaran Perumal, and S. Padmavathi. "Human Activity Recognition with Smartphone and Wearable Sensors using Deep Learning Techniques: A Review." IEEE Sensors Journal (2021). https://doi.org/10.1109/JSEN.2021.3069927
[6] Abdel-Salam, Reem, Rana Mostafa, and Mayada Hadhood.
"Human Activity Recognition using Wearable Sensors: Review, Challenges, Evaluation Benchmark." arXiv preprint arXiv:2101.01665 (2021). https://doi.org/10.1007/978-981-16-0575-8_1
[7] Straczkiewicz, Marcin, Peter James, and Jukka-Pekka Onnela. "A systematic review of smartphone-based human activity recognition for health research."
[8] Vrigkas, Michalis, Christophoros Nikou, and Ioannis A. Kakadiaris. "A review of human activity recognition methods." Frontiers in Robotics and AI 2 (2015): 28. https://doi.org/10.3389/frobt.2015.00028
[9] Aggarwal, Jake K., and Lu Xia. "Human activity recognition from 3d data: A review." Pattern Recognition Letters 48 (2014): 70–80. https://doi.org/10.1016/j.patrec.2014.04.011
[10] Lester, Jonathan, Tanzeem Choudhury, and Gaetano Borriello. "A practical approach to recognizing physical activities." Pervasive Computing, Springer-Verlag Berlin Heidelberg, vol. 3968, pp. 1–16, 2006. https://doi.org/10.1007/11748625_1
[11] Akram Bayat, Marc Pomplun, and Duc A. Tran. "A Study on Human Activity Recognition Using Accelerometer Data from Smartphones." Elsevier B.V., vol. 34, pp. 450–457, MobiSPC-2014. https://doi.org/10.1016/j.procs.2014.07.009
[12] L. Vinh, S. Lee, H. Le, H. Ngo, H. Kim, M. Han, and Y.-K. Lee. "Semi-markov conditional random fields for accelerometer-based activity recognition." Applied Intelligence, vol. 35, pp. 226–241, 2011. https://doi.org/10.1007/s10489-010-0216-5
[13] L. Bao and S. S. Intille. "Activity recognition from user-annotated acceleration data." In Pervasive, pp. 1–17, 2004. https://doi.org/10.1007/978-3-540-24646-6_1
[14] L. C. Jatoba, U.
Grossmann, C. Kunze, J. Ottenbacher, and W. Stork, “Context-aware mobile health monitoring: Evaluation of different pattern recognition methods for classification of physical activity,” in 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 5250–5253, 2008. https://doi.org/10.1109/IEMBS.2008.4650398
[15] C. Zhu and W. Sheng, “Human daily activity recognition in robot-assisted living using multi-sensor fusion,” 2009 IEEE International Conference on Robotics and Automation, Kobe, 2009, pp. 2154–2159. https://doi.org/10.1109/ROBOT.2009.5152756
[16] T. Mitchell, “Decision Tree Learning,” in T. Mitchell, Machine Learning, The McGraw-Hill Companies, Inc., 1997, pp. 52–78.
[17] San-Segundo-Hernández, R., Blunck, H., Moreno-Pimentel, J., Stisen, A., & Gil-Martín, M. (2018). Robust Human Activity Recognition using smartwatches and smartphones. Eng. Appl. Artif. Intell. 72, 190–202. https://doi.org/10.1016/j.engappai.2018.04.002
[18] E. Bulbul, A. Cetin and I. A. Dogru, “Human Activity Recognition Using Smartphones,” 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), Ankara, 2018, pp. 1–6. https://doi.org/10.1109/ISMSIT.2018.8567275
[19] Z. He and L. Jin, “Activity recognition from acceleration data based on discrete cosine transform and svm,” in IEEE International Conference on Systems, Man and Cybernetics, pp. 5041–5044, 2009. https://doi.org/10.1109/ICSMC.2009.5346042
[20] A. Khan, Y.-K. Lee, S. Lee, and T.-S. Kim, “A triaxial accelerometer-based physical-activity recognition via augmented-signal features and a hierarchical recognizer,” IEEE Trans. Inf. Technol. Biomed., vol. 14, no. 5, pp. 1166–1172, 2010.
https://doi.org/10.1109/TITB.2010.2051955
[21] Ong Vienna Lee, Ahmad Heryanto, Mohd Faizal Ab Razak, Anis Farihan Mat Raffei, Danakorn Nincarean Eh Phon, Shahreen Kasim, Tole Sutikno, “A malicious URLs detection system using optimization and machine learning classifiers,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 17, no. 3, pp. 1210–1214, March 2020. https://doi.org/10.11591/ijeecs.v17.i3.pp1210-1214
[22] Owais Mujtaba Khandy, Samad Dadvandipour, “Analysis of machine learning algorithms for character recognition: a case study on handwritten digit recognition,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 21, no. 1, pp. 574–581, January 2021. https://doi.org/10.11591/ijeecs.v21.i1.pp574-581
[23] Arselan Ashraf, Teddy Surya Gunawan, Bob Subhan Riza, Edy Victor Haryanto, Zuriati Janin, “On the review of image and video-based depression detection using machine learning,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 19, no. 3, pp. 1677–1684, September 2020.
https://doi.org/10.11591/ijeecs.v19.i3.pp1677-1684
[24] Rana Muhammad Adnan, Zhongmin Liang, Alban Kuriqi, Ozgur Kisi, Anurag Malik, Binquan Li, Fatemehsadat Mortazavizadeh, “Air temperature prediction using different machine learning models,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 22, no. 1, pp. 534–541, April 2021. https://doi.org/10.11591/ijeecs.v22.i1.pp534-541
[25] Monirul Islam, Mohammod Abul Kashem, Jia Uddin, “Fish survival prediction in an aquatic environment using random forest model,” IAES International Journal of Artificial Intelligence (IJ-AI), vol. 10, no. 3, September 2021.
[26] Ong Wei Chuan, Nur Fadilah Ab Aziz, Zuhaila Mat Yasin, Nur Ashida Salim, Norfishah A. Wahab, “Fault classification in smart distribution network using support vector machine,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 18, no. 3, pp. 1148–1155, June 2020. https://doi.org/10.11591/ijeecs.v18.i3.pp1148-1155
[27] M. Tanaskuli, Ali N. Ahmed, Nuratiah Zaini, Samsuri Abdullah, Abdoulhdi A. Borhana, N. A. Mardhiah, Mathivanan Mathivanan, “Ozone prediction based on support vector machine,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 17, no. 3, pp. 1461–1466, March 2020. https://doi.org/10.11591/ijeecs.v17.i3.pp1461-1466
[28] Ni Made Gita Dwi Purnamasari, M. Ali Fauzi, Indriati Indriati, Liana Shinta Dewi, “Cyberbullying identification in twitter using support vector machine and information gain-based feature selection,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 18, no. 3, pp. 1494–1500, June 2020. https://doi.org/10.11591/ijeecs.v18.i3.pp1494-1500
[29] Mohd Hatta Jopri, Abdul Rahim Abdullah, Jingwei Too, Tole Sutikno, Srete Nikolovski, Mustafa Manap, “Support-vector machine and naïve bayes based diagnostic analytic of harmonic source identification,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 20, no. 1, pp. 1–8, October 2020.
https://doi.org/10.11591/ijeecs.v20.i1.pp1-8
[30] Naaman Omar, Adnan Mohsin Abdulazeez, Abdulkadir Sengur, Salim Ganim Saeed Al-Ali, “Fused faster RCNNs for efficient detection of the license plates,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 19, no. 2, pp. 874–982, August 2020. https://doi.org/10.11591/ijeecs.v19.i2.pp874-982
[31] Prabira Kumar Sethy, Nalini Kanta Barpanda, Amiya Kumar Rath, Santi Kumari Behera, “Rice false smut detection based on faster R-CNN,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 19, no. 3, pp. 1590–1595, September 2020. https://doi.org/10.11591/ijeecs.v19.i3.pp1590-1595
[32] Omar AlShorman, Buthaynah Alshorman, Fahed Alkahtani, “A review of wearable sensors-based monitoring with daily physical activity to manage type 2 diabetes,” International Journal of Electrical and Computer Engineering (IJECE), vol. 11, no. 1, pp. 646–653, February 2021. https://doi.org/10.11591/ijece.v11i1.pp646-653
[33] Ding, G., Tian, J., Wu, J., Zhao, Q., & Xie, L., “Energy efficient human activity recognition using wearable sensors,” IEEE Wireless Communications and Networking Conference Workshops (WCNCW), pp. 379–383, 2018. https://doi.org/10.1109/WCNCW.2018.8368980
[34] J. Wu, S. Guo, H. Huang, W. Liu and Y. Xiang, “Information and Communications Technologies for Sustainable Development Goals: State-of-the-Art, Needs and Perspectives,” IEEE Communications Surveys & Tutorials, vol. 20, no. 3, pp. 2389–2406, third quarter 2018. https://doi.org/10.1109/COMST.2018.2812301
[35] J. Wu, S. Guo, J. Li and D. Zeng, “Big Data Meet Green Challenges: Big Data Toward Green Applications,” IEEE Systems Journal, vol. 10, no. 3, pp. 888–900, Sept. 2016. https://doi.org/10.1109/JSYST.2016.2550530
[36] J. Wu, S. Guo, J. Li and D. Zeng, “Big Data Meet Green Challenges: Greening Big Data,” IEEE Systems Journal, vol. 10, no. 3, pp. 873–887, Sept. 2016. https://doi.org/10.1109/JSYST.2016.2550538
[37] R. Atat, L. Liu, J. Wu, G. Li, C. Ye and Y. Yang, “Big Data Meet Cyber-Physical Systems: A Panoramic Survey,” IEEE Access, vol. 6, pp. 73603–73636, 2018. https://doi.org/10.1109/ACCESS.2018.2878681
[38] Ahmed Al Marouf, Md. Ferdousur Rahman Sarker, Shah Md. Tanvir Siddiquee, “Recognizing Hand-based Actions based on Hip-Joint centered Features using KINECT,” 2nd International Conference on Electrical & Electronic Engineering (ICEEE), 27–29 December 2017, RUET, Rajshahi, Bangladesh. https://doi.org/10.1109/CEEE.2017.8412879
[39] Alghamdi, Mohammed I. “Survey on Applications of Deep Learning and Machine Learning Techniques for Cyber Security.” International Journal of Interactive Mobile Technologies (iJIM) [Online], 14.16 (2020): pp. 210–224. Web. 1 Jul. 2021. https://doi.org/10.3991/ijim.v14i16.16953
[40] Alqudah, Yazan, Belal Sababha, Esam Qaralleh, & Tarek Yousseff.
“Machine Learning to Classify Driving Events Using Mobile Phone Sensors Data.” International Journal of Interactive Mobile Technologies (iJIM) [Online], 15.02 (2021): pp. 124–136. Web. 1 Jul. 2021. https://doi.org/10.3991/ijim.v15i02.18303
[41] Samann, Fady, Adnan Mohsin Abdulazeez, & Shavan Askar. “Fog Computing Based on Machine Learning: A Review.” International Journal of Interactive Mobile Technologies (iJIM) [Online], 15.12 (2021): pp. 21–46. Web. 1 Jul. 2021. https://doi.org/10.3991/ijim.v15i12.21313
[42] Saraubon, Kobkiat, Nuttapong Wiriyanuruknakon, & Natdanai Tangthirasunun. “Flashover Prevention System using IoT and Machine Learning for Transmission and Distribution Lines.” International Journal of Interactive Mobile Technologies (iJIM) [Online], 15.11 (2021): pp. 34–48. Web. 1 Jul. 2021. https://doi.org/10.3991/ijim.v15i11.20753

8 Authors

Ashraful Alam is from the Department of Computer Science and Engineering of Daffodil International University (DIU). His research interests include machine learning, deep learning, wearable sensors, and artificial intelligence in healthcare.

Anik Das is from the Department of Computer Science and Engineering of Daffodil International University (DIU). His research interests include machine learning, deep learning, data science, and applications of artificial intelligence.

Md. Shahriar Tasjid is from the Department of Computer Science and Engineering of Daffodil International University (DIU). His research interests include machine learning, data science, wearable sensors, and artificial intelligence. He has a great passion for cycling around the country.

Ahmed Al Marouf is currently working as a senior lecturer at the Department of Computer Science and Engineering of Daffodil International University (DIU). He is currently pursuing his Ph.D. in Computer Science at the Department of Computer Science, University of Calgary, Alberta, Canada.
His research interests include computational intelligence, mobile technologies, and artificial intelligence in health science.

Article submitted 2021-07-01. Resubmitted 2021-08-06. Final acceptance 2021-08-06. Final version published as submitted by the authors.