

ACTA IMEKO 
ISSN: 2221-870X 
June 2020, Volume 9, Number 2, 75 - 82 

 


Automatic measurement of hand dimensions using consumer 
3D cameras 

Daniele Marchisotti1, Pietro Marzaroli1, Remo Sala1, Michele Sculati2, Hermes Giberti3, Marco 
Tarabini1 

1 Department of Mechanical Engineering, Politecnico di Milano, Via La Masa 1, Milan, Italy 
2 Private medical doctor, Bergamo, Via Ferruccio Galmozzi 12, Italy 
3 Dipartimento di Ingegneria Industriale e dell'Informazione, University of Pavia, Via Ferrata 1, Pavia, Italy 

 

 

Section: RESEARCH PAPER 

Keywords: hand; Kinect v2; Intel RealSense; NI LabVIEW; diet; calibration; measure. 

Citation: Daniele Marchisotti, Pietro Marzaroli, Remo Sala, Michele Sculati, Hermes Giberti, Marco Tarabini, Automatic measurement of the hand 
dimensions using consumer 3D cameras, Acta IMEKO, vol. 9, no. 2, article 12, June 2020, identifier: IMEKO-ACTA-09 (2020)-02-12 

Editor: Dušan Agrež, University of Ljubljana, Slovenia 

Received May 2, 2020; In final form June 12, 2020; Published June 2020 

Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, 
distribution, and reproduction in any medium, provided the original author and source are credited. 

Funding: This paper was supported by Politecnico di Milano and by the private medical practice of Michele Sculati. 

Corresponding author: Daniele Marchisotti, e-mail: daniele.marchisotti@polimi.it  

 

ABSTRACT

This article describes the metrological characterisation of two prototypes that use the point clouds acquired by consumer 3D cameras for the measurement of the geometrical parameters of the human hand. The initial part of the work is focused on the general description of the algorithms that allow for the derivation of the dimensional parameters of the hand. The algorithms were tested on data acquired using Microsoft Kinect v2 and Intel RealSense D400 series sensors. The accuracy of the proposed measurement methods has been evaluated in different tests aimed at identifying the bias errors deriving from point-cloud inaccuracy and the effects of the hand pressure and of the wrist flexion/extension. Results evidenced an accuracy better than 1 mm in the identification of the hand's linear dimensions and better than 20 cm³ for hand volume measurements. The relative uncertainty of linear dimensions, areas, and volumes was in the range of 1-10 %. Measurements performed with the Intel RealSense D400 were, on average, more repeatable than those performed with the Microsoft Kinect. The uncertainty values limit the use of these devices to applications where the requested accuracy is larger than 5 % (volume measurements), 3 % (area measurements), and 1 mm (hands' linear dimensions and thickness).

1. INTRODUCTION

There are several applications in medicine that require knowledge of the dimensions of the human hand. Oedema of the hand, resulting in an increase of the hand volume, often accompanies upper-extremity lymphedema, which develops after breast cancer-related surgical interventions [1][2]. In this case, measurement of the hand volume and of its changes over time can provide objective measures of the effectiveness of the therapy; however, to date, this measurement is not part of the routine assessments due to the absence of a readily available method for the estimation of the hand volume. Given the rapid increase in the prevalence of obesity [3] and the costs associated therewith [4], the benefits of a low-cost measurement system would also be considerable in the assessment of correct food portions. In fact, food portions have increased their energetic content from two to eight times in the past 40 years [5], and it has been found that body weight is related to food portion size [6]. Great benefits may be derived from assessing correct food portions using a simple intuitive comparison [7][8], and the size and geometric parameters of the hand can be used for this purpose [9][10]. Food portions could also be determined with a portable reference object (a ball or a cube), but the hand offers different reference volumes and areas (fist, palm, fingers) and cannot be left at home or forgotten somewhere. Since the anthropometric measurements of the hand vary greatly from person to person (up to 400 % according to [11]), it is necessary to verify the food volume to obtain precise diet indications.

Different measurement methods have been proposed for the automatic identification of one or more geometrical parameters of the hand.

The different technologies that can be used for the measurement of the hand shape for biometric identification have been reviewed by Duta [12]. Existing works typically focus on the identification of the hand silhouette (hand shape systems) or on the identification of specific geometric features (hand geometry systems). Both systems capture the subject's hand image with cameras or scanners and compare this information with database records to verify the subject's identity. After the review of Duta was published, several works focused on similar topics. Sanchez-Reillo et al. [13] described a system capable of extracting hand parameters from a 2D colour image. Kumar et al. [14] proposed the extraction, from a single image, of the palmprint and the hand geometry; the different hand features were then used to verify the classification capability on a dataset of 100 users.

The length of the fingers has been measured in several studies; the most widely used parameters are the distal extent of the fingertips and the length of the fingers. In a pilot study, Peters et al. [15] used a device first built by George [16] to identify the distance from the middle fingertip to the index and ring fingertips. The device is based on a sliding calliper with a resolution of 0.5 mm. Manning et al. [17] took photocopies of hands and made measurements on the photocopies. The uncertainty of the photocopy-based method was evaluated in [15] by assessing the measurement reproducibility: four observers took ten measurements of the hands of four subjects. The observed parameters were the finger lengths of the index, middle, and ring fingers on both hands; from these parameters it was possible to compute the differences in length among fingers. The standard deviation of the length measurement across the fingers and the different subjects was 0.64 mm. The standard deviation of the fingertip measurement was 0.52 mm, very close to the resolution of the calliper-based instrument.

The hand volume has been measured by Hughes using Archimedes' principle [18]: the technique involves the immersion of a hand in a water container placed on an electronic balance, with errors lower than 0.3 %. Mayrovitz et al. [19] proposed the estimation of the hand volume using geometric algorithms: the hand's linear dimensions were measured with a calliper, and the hand volume was calculated using an elliptical frustum model. The results of the model were compared to volumes determined by water displacement, with estimation errors typically lower than ±30 ml for hands with volumes between 220 and 600 ml.

The performance of a biometric system is typically evaluated by verifying the correct classification rates of images; however, as evidenced by Duta [12], a comparison between the performances of different systems is almost impossible for several reasons. From a metrological point of view, the most important issue is that the different experimental setups used for the acquisition strongly affect the classification capabilities. However, the different works do not focus on the accuracy of the systems used to acquire the images. Furthermore, all the biometric systems are based on 2D hand images, thus preventing their usage in all the applications that require knowledge of the hand thickness or volume. A typical example is dietary applications, for which the systems described in this work were developed. This article proposes the extraction of the following features (hereinafter, parameters), necessary for the identification of food portions in a volumetric diet:

 
 

1) Volumes
   a) Hand volume
   b) Palm volume

2) Areas
   a) Area of the hand
   b) Area of the palm
   c) Middle finger area

3) Dimensions
   a) Hand width
   b) Hand length
   c) Index, middle, and ring finger widths
   d) Thumb length
   e) Finger lengths
   f) Height (thickness) of the fingers, computed at the nails or at the first phalanx
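To make the parameter set concrete, the sketch below groups the measurands into a single container; the field names and the unit conventions are illustrative assumptions, not taken from the authors' LabVIEW code.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class HandParameters:
    """Geometrical parameters extracted from the hand point cloud.

    Units follow the paper: cm³ for volumes, cm² for areas,
    mm for linear dimensions.
    """
    # 1) Volumes
    hand_volume_cm3: float
    palm_volume_cm3: float
    # 2) Areas
    hand_area_cm2: float
    palm_area_cm2: float
    middle_finger_area_cm2: float
    # 3) Dimensions
    hand_width_mm: float
    hand_length_mm: float
    finger_widths_mm: Dict[str, float]   # index, middle, ring
    thumb_length_mm: float
    finger_lengths_mm: Dict[str, float]  # per finger
    finger_thickness_mm: float           # at the nail or first phalanx
```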
Many of the above parameters have no clear definition, and the presence of soft tissues implies a potentially relevant loading effect when contact measurement devices, such as callipers, are used. The main idea of our work is to derive the above-listed parameters from point clouds describing the hand dorsum. The accuracy of the different parameters is assessed through the metrological characterisation of two consumer 3D cameras. Although several works in the literature focus on the metrological performance of 3D cameras [20]-[23] and on the classification capabilities of point cloud-based methods for hand silhouette identification, this work focuses on the accuracy in the identification of the hand geometrical parameters using point clouds acquired by commercial 3D cameras. The article is structured as follows: the measurement method is described in Section 2; the experimental results are presented in Section 3 and discussed in Section 4. The study's conclusions are drawn in Section 5.

2. MEASUREMENT METHOD 

The measurement system described in this work is based on acquiring the point cloud of the hand dorsum with two 3D vision systems: the Microsoft Kinect V2 and the Intel RealSense D400 series (D415 and D435).

2.1. Sensors’ characteristics 

Kinect V2 is a Time-of-Flight (ToF) camera that uses Infrared 
(IR) emitters and sensors to reconstruct a 3D scene from the 
elapsed time between the emission of a light ray and its collection 
after reflection from a target. The metrological qualification of 
the camera [20] evidenced that the measurement uncertainty 
increases linearly with the distance between Kinect V2 and the 
measured point. The repeatability (expressed as standard 
deviation on the distance of planar objects perpendicular to the 
optical axis) is 1.2 mm at 1.5 m and 3.3 mm at the maximum 
distance (4.2 m). Moreover, there is a relationship between the 
measurement uncertainty and the radial coordinate of the sensor, 
as the IR light cone of emission is not homogeneous [20][21], 
and this might affect the volumes, areas, and dimensions of the 
hand parts. The minimum working distance of Kinect is 0.70 m. 
In these conditions, the field of view implies an observed area 
that is much larger than the hand size (a suitable measurement 
area can be defined by the size of an A4 piece of paper). In order 
to limit the effects of the different reflectivity of the reference 
surface and of the human skin, the subject is required to wear 
black silk gloves and to keep the instrument far from obstacles 
that may reflect or scatter IR radiation, as described in [23]. 

Two 3D cameras belonging to the Intel RealSense D400 
series were initially tested (D415 and D435). The cameras are 



 


based on stereo vision depth technology and consist of left and 
right imagers assisted by an IR projector. The latter emits an 
invisible static IR pattern to improve depth accuracy in scenes 
with low texture. The 3D scene is reconstructed by correlating 
points on the left and right images using built-in calibration 
equations [22]. The Intel RealSense data acquisition program allows for the selection among different settings for the 3D scene reconstruction: following the user manual, all the acquisitions were performed with the 'short-range' depth preset (given the fixed camera focal length and the necessity of filling the scene with the subject's hand) and automatic exposure. The optimal distances for obtaining hand 3D images with the RealSense D435 and D415 cameras are 0.4 m and 0.5 m, respectively; these are slightly larger than the minimum working distances (0.35 m and 0.45 m).

The sensors' characteristics are summarised in Table 1. Kinect V2 has a smaller depth measurement range than the RealSense cameras. The Kinect V2 resolution and sampling rate are lower than those of the RealSense cameras, and its minimum focusing distance is larger. The Kinect V2 Field of View (FoV) is similar to that of the D415 and smaller than that of the D435. The smaller FoV of the D415 makes it more suitable for measuring small objects; this camera was consequently chosen for the identification of the hand size. Preliminary tests evidenced that the RealSense cameras are less sensitive to lighting conditions and are not affected by obstacles close to the target, although the light reflection on the plane that supports the hand often leads to disturbances on the point cloud.

Table 1. Kinect V2 and RealSense D400 characteristics.

Characteristic               Kinect V2          RealSense D415      RealSense D435
Working principle            Time-of-Flight     Active IR stereo    Active IR stereo
Depth range                  0.7-4.2 m          0.4-10 m            0.2-10 m
Max depth resolution         512 x 424          1280 x 720          1280 x 720
Max colour resolution        1920 x 1080        1920 x 1080         1920 x 1080
FoV                          H: 70°, V: 60°     H: 69°, V: 42°      H: 91°, V: 65°
Max acquisition frequency    30 Hz              90 Hz               90 Hz
Working conditions           Indoor             Indoor/outdoor      Indoor/outdoor

2.2. Measurement procedure 

The geometry of the hand is obtained by subtracting the 3D image of the plane supporting the hand from the 3D image of the hand lying on the plane. With reference to the coordinate axes of Figure 1, the measurement process consists, therefore, of the identification of two point clouds:

1. the point cloud of the reference plane, with coordinates [Xp], [Yp], [Zp];

2. the point cloud of the hand lying on the reference plane, with coordinates [Xh], [Yh], [Zh].

The point cloud to be analysed, whose coordinates are [X], [Y], [Z], is defined by:

[X] = [Xh] − [Xp],  [Y] = [Yh] − [Yp],  [Z] = [Zh] − [Zp] . (1)

Figure 1. Scheme of the plane and image reference systems.
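A minimal sketch of this subtraction is given below, assuming the two clouds are organized (pixel-aligned) H x W x 3 NumPy arrays and assuming a 1 mm noise floor; neither assumption is stated in the paper.

```python
import numpy as np

def subtract_reference_plane(hand_cloud: np.ndarray,
                             plane_cloud: np.ndarray,
                             noise_floor_m: float = 0.001) -> np.ndarray:
    """Equation (1): subtract the reference-plane point cloud from the
    point cloud of the hand lying on that plane.

    Both clouds are assumed to be organized (H x W x 3) arrays in which
    each depth-image pixel maps to the same scene location in the two
    acquisitions, so the subtraction is element-wise.
    """
    diff = hand_cloud - plane_cloud
    # Where the hand is absent, the difference is only sensor noise;
    # a threshold on the height (Z) channel keeps the hand region.
    # The 1 mm noise floor is an assumption, not a value from the paper.
    hand_mask = np.abs(diff[..., 2]) > noise_floor_m
    diff[~hand_mask] = 0.0
    return diff
```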

2.3. Algorithms

Fit-for-purpose algorithms were implemented in LabVIEW 2017 in order to recognise the main parts of the hand and to compute the parameters listed in Section 1. The automation of the measurement process reduces the variability due to the observer, although the lack of a standardised model for the identification of specific hand parameters might introduce discrepancies between different measurement algorithms. The dimensions, areas, and volumes of the hand were computed with the algorithms extensively described in [23]: the different hand features are identified using simple image processing algorithms based on the extraction of the hand silhouette, which is obtained by detecting the borders of the binarized 2D image of the hand. The flowchart of the method is shown in Figure 2. Multiple point clouds of the hand are acquired by the sensor, and a temporal average is used to derive the point cloud for analysis. The reference plane is subtracted from the observed scene, as described in Section 2.2. The image is binarized to obtain the hand area, and a particle filter is used to delete measurement artefacts. Conventional image processing algorithms (image derivatives or contour extraction starting from the binary image) are then used to identify the hand feature parameters listed in Section 1 [23].

Figure 2. Flowchart of the analytical method for 3D point clouds.
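The pipeline can be condensed into the following sketch, written with OpenCV rather than the LabVIEW code used by the authors; the binarization threshold and the minimum particle area are illustrative assumptions.

```python
from typing import List
import cv2
import numpy as np

def extract_hand_silhouette(height_maps: List[np.ndarray],
                            height_threshold_mm: float = 3.0,
                            min_area_px: int = 500) -> np.ndarray:
    """Condensed sketch of the Figure 2 pipeline.

    height_maps: repeated hand-minus-plane height images in mm.
    Returns the border (contour) of the largest connected component.
    """
    # Temporal average of the repeated acquisitions.
    mean_height = np.mean(np.stack(height_maps), axis=0)

    # Binarization: pixels above the noise threshold belong to the hand.
    binary = (mean_height > height_threshold_mm).astype(np.uint8)

    # Particle filter: remove small connected components (artefacts).
    n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    for label in range(1, n_labels):
        if stats[label, cv2.CC_STAT_AREA] < min_area_px:
            binary[labels == label] = 0

    # Border detection: the hand silhouette is the largest contour.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea)
```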

2.4. Calibration

Tests were performed by comparing the lengths, distances, and volumes measured with the 3D cameras against standard measurement systems; more than 50 subjects participated in the experiments, which were performed in accordance with the Politecnico ethical guidelines.

2.4.1. Thickness

The hand thickness measured by the 3D cameras was compared to that measured by a Micro-Epsilon optoNCDT 1402 triangulation laser sensor, with a measurement range of 50 mm and a resolution of 5 µm [24]. The results reported in this work refer to the thickness measured at the centre of the middle fingernail; the finger represents a challenging position for measuring the thickness, given that small pressure variations at the fingertip result in a large variation of the measured height. However, similar experiments were also performed in different hand positions and provided similar results. The thickness of different subjects' right hands (18 for the RealSense, 33 for the Kinect) was measured; given that the tests were performed in different time periods, the subjects were not the same, thus preventing a direct comparison between the two measurement systems (described for a single subject in Section 2.4.4).

2.4.2. Volume

The reference volume of the hand was computed by immersing the hands of different subjects (33 for the RealSense, 30 for the Kinect) in water in a graduated beaker, according to the procedure described in [18]. The resolution of the graduated scale was 10 ml. Also in this case, since the tests were performed in different experimental sessions, the subjects were not the same.

2.4.3. Disturbances

A third series of tests was performed to investigate the effect of disturbances. The quantities that may worsen the measurements' reproducibility are the position of the forearm (i.e. the wrist flexion/extension) and the pressure between the hand and the supporting surface. Because of the compliance of the hand soft tissues, both these factors modify the local height of the hand and consequently might vary the measured hand volume. Tests were performed with a single subject under reproducibility conditions: the hand volume was measured with the Kinect V2 and RealSense D415 cameras upon varying:

a) the elbow extension (i.e. the forearm angle with respect 
to the horizontal plane); the wrist extension angles were 
-10 °, 0 °, 10 °, 20 °, 30 °; a total of 40 measurements 
were performed; 

b) the hand pressure against the reference plane (0 kPa, 
1.30 kPa, 1.94 kPa, 2.65 kPa, 3.31 kPa, 3.97 kPa). 

The setups for the identification of the effects of the forearm orientation and of the hand pressure are shown in Figure 3. Tests were performed with both the Kinect V2 and the RealSense D415. Wrist extension was controlled by positioning supports of different heights below the elbow. The extension α was computed as the arctangent of the support height h divided by the forearm length w:

α = arctan(h / w) . (2)

The hand contact pressure p was obtained by dividing the net force measured by the scale (the difference between the force F measured by the balance and the device weight W) by the hand area A measured by the proposed 3D measurement system:

p = (F − W) / A . (3)

Given that the contact area is smaller than the total hand area (because of the concave shape of the palm), the reported values underestimate the actual contact pressure; however, reporting pressure data allows for a comparison of the measurement results with possible forthcoming studies.
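Both influencing quantities follow directly from equations (2) and (3); a small sketch, with assumed input units, is given below.

```python
import math

def wrist_extension_deg(support_height_m: float,
                        forearm_length_m: float) -> float:
    """Equation (2): extension angle from the elbow-support height h
    and the forearm length w (inputs in metres, output in degrees)."""
    return math.degrees(math.atan(support_height_m / forearm_length_m))

def contact_pressure_pa(scale_force_n: float, device_weight_n: float,
                        hand_area_m2: float) -> float:
    """Equation (3): net force on the scale divided by the hand area
    measured by the 3D system (inputs in N and m², output in Pa)."""
    return (scale_force_n - device_weight_n) / hand_area_m2

# Example: a 60 mm support under a 250 mm forearm gives about 13.5 deg.
print(round(wrist_extension_deg(0.060, 0.250), 1))  # 13.5
```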

Figure 3. Scheme of the setup for the identification of the disturbances generated by the forearm angle (a) and hand pressure against the surface (b).

2.4.4. System comparison

The final analyses were performed to compare the results of the measurements obtained with the Kinect and RealSense cameras. Different parameters of the hands of a single subject were automatically extracted from 30 repeated measurements by means of the computation algorithms described in [23]. The compatibility of the measurements was verified through hypothesis testing on the equality of the means using Minitab.
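The paper does not state which test variant was selected in Minitab; the sketch below assumes a two-sample t-test with Welch's correction for possibly unequal variances.

```python
import numpy as np
from scipy import stats

def means_differ(kinect_values: np.ndarray,
                 realsense_values: np.ndarray,
                 alpha: float = 0.05) -> bool:
    """Two-sample t-test for H0: mu_K = mu_RS on repeated measurements.

    Welch's correction is used because the two cameras need not share
    the same variance (an assumption; the paper does not state which
    variant was selected in Minitab)."""
    _, p_value = stats.ttest_ind(kinect_values, realsense_values,
                                 equal_var=False)
    return p_value < alpha  # True: reject H0, the means differ
```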

3. RESULTS 

3.1. Hand thickness calibration

The fingers' thickness ranged between 9 and 12 mm. The Root Mean Square (RMS) of the difference between the height measured with the Kinect and the height measured by the laser is 1.1 mm. This value is compatible with the Kinect resolution (1 mm) and with the instrument uncertainty identified in [20], which was 1 mm when the measurand was placed at 0.75 m from the sensor. As shown in Figure 4, the hand thickness h can be determined from the thickness h_K measured by the Kinect using the following linear regression model:

h = 0.93 h_K . (4)

The standard error of the estimate is 0.85 mm. The R² value is 36.8 %; such a low value reasonably derives from the large variability of the measurand (uncontrolled pressure of the fingers) and from the limited variation of the fingers' thickness in comparison with the Kinect resolution. The first-order term of the regression model is between 0.9 and 0.95 at a 95 % confidence level. The distribution of residuals and the probability plot of standardised residuals evidenced the validity of the proposed regression model [25]-[28].

Figure 4. Correlation between the height of the hand measured by the laser sensor and the height of the hand measured by Kinect V2.

The same calibration was performed for the RealSense camera. The RMS of the difference between the height measured with the RealSense and the height measured by the laser is 1.1 mm. This value is compatible with the expected resolution of the RealSense D415 at a 0.5 m distance and is equal to the value obtained with the Kinect. As shown in Figure 5, similar to what was done for the Kinect, the finger thickness h was fitted by a linear regression model starting from the thickness h_RS measured by the RealSense:

h = 3.53 + 0.62 h_RS . (5)

The standard error of the estimate is 0.88 mm. R² is also low in this case (47.7 %). The distribution of residuals and their probability distribution evidenced the validity of the regression model.

Figure 5. Correlation between the height of the hand measured by the laser sensor and the height of the hand measured by RealSense D415.
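Calibration models such as (4) and (5), together with the reported goodness-of-fit metrics, can be reproduced with an ordinary least-squares fit; a minimal sketch:

```python
import numpy as np

def fit_linear_calibration(sensor_mm: np.ndarray, laser_mm: np.ndarray):
    """Fit h = a + b * h_sensor against the laser reference, as in
    equations (4) and (5), and compute the goodness-of-fit metrics
    reported in the paper."""
    b, a = np.polyfit(sensor_mm, laser_mm, deg=1)  # slope, intercept
    residuals = laser_mm - (a + b * sensor_mm)
    # Standard error of the estimate (two fitted coefficients).
    see = np.sqrt(np.sum(residuals ** 2) / (sensor_mm.size - 2))
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum(
        (laser_mm - laser_mm.mean()) ** 2)
    return a, b, see, r2
```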

3.2. Volume calibration

The results of the calibration of the Kinect are shown in Figure 6. Hand volumes ranged between 180 and 500 cm³. The RMS of the difference between the volume measured with the Kinect V2 and the volume measured by the beaker is 35 cm³. The hand volume V (in cm³) can be estimated from the volume V_K measured by the Kinect using the following quadratic regression model:

V = 48.44 + 0.63 V_K + 0.00095 V_K² . (6)

The standard error of the estimate is 17 cm³. The R² value is 96.7 %, evidencing a better correlation with respect to the thickness data presented in the previous paragraphs. The distribution of residuals and the probability plot of standardised residuals evidenced the validity of the quadratic regression model; conversely, the adoption of a linear model led to a parabolic distribution of residuals versus fits.

Figure 6. Correlation between the hand volume measured by the Kinect camera and the one measured with the reference system.

A similar regression was performed for the RealSense camera, and the results are shown in Figure 7. The RMS of the difference between the volume measured with the RealSense D415 and the reference volume was 64 cm³. To increase the accuracy of the volume results, the data were modelled with a linear regression model:

V = 33.71 + 0.75 V_RS . (7)

The standard error of the estimate was 16 cm³, i.e. comparable to the value of 17 cm³ obtained with the Kinect. R² was 95.2 %, and the analysis of the residuals evidenced the validity of the linear model.

Figure 7. Correlation between the hand volume measured by the RealSense camera and the one measured with the reference system.
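Applying the published calibration curves (6) and (7) to correct raw volume readings reduces to evaluating the polynomials; the coefficients in the sketch below are those reported above.

```python
def correct_kinect_volume(v_k_cm3: float) -> float:
    """Equation (6): quadratic calibration curve for Kinect V2 volumes."""
    return 48.44 + 0.63 * v_k_cm3 + 0.00095 * v_k_cm3 ** 2

def correct_realsense_volume(v_rs_cm3: float) -> float:
    """Equation (7): linear calibration curve for RealSense D415 volumes."""
    return 33.71 + 0.75 * v_rs_cm3

# Example: a raw Kinect reading of 300 cm³ corrects to about 323 cm³.
print(round(correct_kinect_volume(300.0)))  # 323
```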

3.3. Influencing quantities 

The results (Figure 8 and Figure 9) evidenced the necessity of 
controlling both the forearm position and the contact pressure. 




 


The wrist extension (Figure 8) affects the hand volume in a complex manner. The volume is maximum when the wrist is in the neutral position, i.e. when the forearm is parallel to the plane supporting the hand palm. In the presence of elbow flexion or extension, the measured volume decreases, reasonably because of the change in the pressure distribution on the hand palm and fingers. Results also evidence that the Kinect V2 seems more sensitive to the elbow angle than the RealSense D415; however, given that the tests were performed by different subjects, it is not possible to conclude that the bias is generated by an instrumental effect.

The effect of pressure is summarised in Figure 9: the increase in contact pressure results in a decrease of the measured volume because of an average reduction of the hand thickness deriving from the compression of the hand soft tissues. The trends measured by the two measurement systems are comparable, although the volume reductions measured by the Kinect (Figure 9 a) are, on average, larger than those measured by the RealSense. The reduction of volume at a pressure of 1.48 kPa is 13.8 % for the Kinect and 9.5 % for the RealSense; similar differences occur at higher pressures.

Figure 8. Boxplot of the hand volume measured by Kinect V2 (a) and RealSense D415 (b) upon varying the forearm angle. Data refer to two different groups of people.

Figure 9. Boxplot of the hand volume measured by Kinect V2 (a) and RealSense D415 (b) upon varying the pressure between the hand and the supporting surface. Data refer to two different groups of people.

3.4. Systems comparison

The comparison between the hand parameters extracted from 30 repeated measurements with the two 3D cameras is shown in Table 2. Data show that the differences between the average measurements obtained with the Kinect V2 and the RealSense range between 1 % (fist and flat fist volumes) and 16 % (index finger thickness).

In most cases (6 out of 9), the hypothesis test on the equality of the means of the measurements performed by the two systems led to a rejection of the null hypothesis μ_K = μ_RS. The most critical measurement was the thickness of the index finger: the difference between the two measurement systems was large in percentage (16 %) but comparable to the resolution of the two measurement systems (2 mm). Measurements with the Kinect were performed wearing a silk glove because of the necessity of keeping the sensor as close as possible to the hand (to ensure an adequate resolution) and of avoiding saturation problems in specific hand areas. This did not lead to a systematic overestimation of hand dimensions, areas, or volumes.

Table 2. Kinect V2 and RealSense D400 comparison concerning the measurements performed with the two measurement systems on 30 repeated measures of the same subject.

                                  Microsoft Kinect        Intel RealSense         Difference   95 % CI difference    P value
                                  Mean   St. Dev   COV    Mean   St. Dev   COV    (%)          min      max
Hand volume in cm³                311    15        5 %    317    10        3 %    -2 %         -13.5    -0.3          5 %
Fist volume in cm³                322    16        5 %    327    10        3 %    -1 %         -10.8    2.8           24 %
Flat fist volume in cm³           344    16        5 %    346    10.1      3 %    -1 %         -9.0     5.0           56 %
Hand area in cm²                  137    4         3 %    146    4         3 %    -7 %         -10.9    -7.0          0 %
Hand without thumb area in cm²    118    3.8       3 %    123    2.9       2 %    -5 %         -6.8     -3.3          0 %
Index thickness in cm             1.2    0.1       8 %    1.0    0.1       9 %    16 %         0.1      0.2           0 %
Middle finger width in cm         2.0    0.2       9 %    1.8    0.1       3 %    9 %          0.1      0.3           0 %
Hand width in cm                  8.4    0.1       2 %    8.0    0.2       3 %    6 %          0.4      0.6           0 %
Hand length in cm                 17.4   0.4       2 %    18.2   0.2       1 %    -5 %         -0.9     -0.6          0 %

4. DISCUSSION 

The results presented in this work evidence the accuracy limitations of commercial 3D cameras for the identification of the hands' geometrical parameters. Both the calibrations against the gold standard and the comparison between the two measurement systems evidenced accuracy limitations in the order of millimetres for the hand thickness, of a few cm² for area measurements, and of about 20 cm³ for volume measurements. The volume measurements were affected by the wrist extension and by the hand pressure against the surface; consequently, in order to increase the measurement reproducibility, it is necessary to perform all the measurements with a neutral wrist extension (and lateral deviation) while controlling the hand pressure. The possibility of compensating the disturbances generated by these factors seems limited and deserves future investigation.

The adoption of the fit-for-purpose calibration procedure allowed reducing the measurement uncertainty of both systems. The Kinect V2 thickness calibration improved the accuracy of the measurement system, reducing the RMS of the errors from 1.1 mm to 0.8 mm; for the same system, the volume calibration reduced the standard uncertainty from 35 cm³ to below 20 cm³. The RealSense D415 calibration lowered the RMS of the errors from 64 cm³ to 16 cm³ for the hand volume and from 1.1 mm to 0.9 mm for the thickness. These values are tolerable for the identification of food portions and are lower than the values reported in the literature for measurements based on the Archimedes principle (around 30 ml). The possibility of using more accurate sensors or scanning both sides of the hand is currently under evaluation and will be the topic of forthcoming studies.
 
Figure 8. Boxplot of the hand volume measured by Kinect V2 (a) and 
RealSense D415 (b) upon varying the forearm angle. Data refer to two 
different groups of people. 

 
Figure 9. Boxplot of the hand volume measured by Kinect V2 (a) and the 
RealSense D415 (b) upon varying the pressure between the hand and the 
supporting surface. Data refer to two different groups of people. 

Forearm angle [°]

V
 [c

m
³]

Forearm angle [°]

V
 [c

m
³]

(a)

(b) Pressure [kPa]

V
 [

cm
³]

V
 [

cm
³]

Pressure [kPa]

(a)

(b)



 

ACTA IMEKO | www.imeko.org June 2020 | Volume 9 | Number 2 | 81 

the identification of the food portions and are lower than the 
values reported in the literature for measurements based on the 
Archimedes principle (around 30 ml). The possibility of using 
more accurate sensors or scanning both sides of the hand is 
currently under evaluation and will be the topic of forthcoming 
studies. 

The comparison between the two measurement systems evidenced that the Intel RealSense D415 camera allows for the acquisition of more repeatable results (average COV 3 %) with respect to the Kinect V2 (average COV 5 %), thanks to its higher resolution and lower measurement distance. The main limitation in the use of the Kinect V2 was related to the necessity of wearing black gloves and of keeping the system far away from obstacles that could cause IR reflections. These limitations were not present when using the RealSense D415 sensor, which reconstructs the 3D scene with active IR stereo vision instead of the ToF principle of the Kinect.

5. CONCLUSIONS

This work described the characterisation of an automated system that can be used to determine the volume, silhouette, and other geometrical parameters of the hand, starting from the point cloud acquired with commercial 3D cameras. The system is composed of a Kinect V2 or Intel RealSense D415 sensor, which acquires a point cloud that is analysed using purposely developed algorithms; tests were performed with the Microsoft Kinect V2 and the Intel RealSense D415. We evidenced metrological performances comparable to those of the methods currently used in the literature (for instance, measuring the hand volume with a graduated beaker or the dimensions with a calliper). The main limitation of the systems based on the acquisition of the point cloud derives from the presence of disturbances that can have a strong influence on the final results. Measurements performed with the Kinect V2 were unreliable when the subject was not wearing gloves with optical properties similar to those of the reference plane; stereoscopic cameras did not evidence similar limitations. The accuracy of the measurement systems was increased by direct calibration, adopting a regression model for the compensation of the bias errors on the hand thickness and volume, obtaining errors lower than 1 mm for the thickness and lower than 20 cm³ for the volume. Area measurements did not require any compensation, as the in-plane measurements were free from bias components. Further efforts will be focused on the accuracy in the identification of the silhouette of the hand starting from 3D images.

REFERENCES 

[1] H. N. Mayrovitz, Assessing local tissue edema in postmastectomy 
lymphedema, Lymphology 40(2) (2007) pp. 87-94. 

[2] H. N. Mayrovitz, N. Sims, J. MacDonald, Assessment of limb 
volume by manual and automated methods in patients with limb 
edema or lymphedema, Advances in Skin & Wound Care 3(6) 
(2000) p. 272. 

[3] Obesity and overweight statistics, World Health Organization, 
2018 [Online]. Available:  
https://www.who.int/en/news-room/fact-
sheets/detail/obesity-and-overweight   

[4] S. B. Wolfenstetter, P. Menn, R. Holle, A. Mielck, C. Meisinger, 
T. von Lengerke, Body weight changes and outpatient medical 
care utilisation: Results of the MONICA/KORA cohorts S3/F3 
and S4/F4, Psychosoc Med. 2012.  
https://doi.org/10.3205/psm000087  

[5] C. Geraci, L. Bioletti, A. Sabbatini, M. Formigatti, C. Baldo, 
L. Bolesina, E. Donghi, M. Villa, M. Sculati, M. Petroni, I metodi 
volumetrici in dietetica preventiva e clinica. Rivista Italiana di 
Nutrizione e Metabolismo, June 2017. 

[6] O. Sculati, T. Spagnoli, M. Sculati, M. Formigatti, A. Sabbatini, 
L. Bolesina, New techniques in nutritional surveys and food 
educational programs: volumetric dietetics for portion size 
assessment, Ann Ig. 15(2) (2003) pp. 135-146. 

[7] M. Nelson, M. Atkinson, S. Darbyshire, Food photography II: use 
of food photographs for estimating portion size and the nutrient 
content of meals, British Journal of Nutrition 76(1) (1996) pp. 31-
49. 

[8] M. Nelson, J. Haraldsdóttir, Food photographs: practical 
guidelines I. Design and analysis of studies to validate portion size 
estimates, Public Health Nutrition 1(4) (1998) pp. 219-230. 

[9] O. Sculati, Prevenzione e terapia dietetica Una guida per medici e 
dietisti, Il pensiero Scientifico Editore; Maria Luisa Amerio, 
Giuseppe Fatati, Volume Dietetica e Nutrizione, Il Pensiero 
Scientifico Editore, First Edition, 2007. 

[10] O. Sculati, G. Bettoncelli, O. Brignoli, G. Corgatelli, D. Ponti, 
A. Rumi, A. Zucchi, Efficient prevention of overweight and 
obesity in the experience of family practitioners and nutrition units 
of the public health system in Lombardy, Annali di igiene: 
medicina preventiva e di comunita 18(1) (2006) pp. 41-48. 

[11] M. Sculati, H. Giberti, M. Tarabini, Method for evaluation of 
portion sizes based on 3D measurement of hands, Foundation 
Acta Pædiatrica 106 (Suppl. 470) (2017) pp. 6-9. 

[12] N. Duta, A survey of biometric technology based on hand shape, 
Pattern Recognition 42(11) (2009) pp. 2797-2806. 




 


[13] R. Sanchez-Reillo, C. Sanchez-Avila, A. Gonzalez-Marcos, 
Biometric identification through hand geometry measurements, 
IEEE Transactions on Pattern Analysis & Machine Intelligence, 
10 (2000) pp. 1168-1171. 

[14] A. Kumar, D. Wong, H. Shen, A. Jain, Personal verification using 
palmprint and hand geometry biometric, Proc. of the International 
Conference on Audio-and Video-Based Biometric Person 
Authentication, 2003, Berlin, Germany, pp. 668-678. 

[15] M. Peters, K. Mackenzie, P. Bryden, Finger length and distal 
finger extent patterns in humans, American Journal of Physical 
Anthropology: The Official Publication of the American 
Association of Physical Anthropologists 117(3) (2002) pp. 209-
217. 

[16] R. George, Human finger types, The Anatomical Record 46(2) 
(1930) pp. 199-204. 

[17] J. T. Manning, R. Trivers, R. Thornhill, D. Singh, The 2nd: 4th 
digit ratio and asymmetry of hand performance in Jamaican 
children, Laterality: Asymmetries of Body, Brain and Cognition, 
5(2) (2000) pp. 121-132. 

[18] S. Hughes, J. Lau, A technique for fast and accurate measurement 
of hand volumes using Archimedes’ principle, Australasian 
Physics & Engineering Sciences in Medicine, 31(1) (2008) p. 56. 

[19] H. N. Mayrovitz, N. Sims, C. J. Hill, T. Hernandez, A. Greenshner, H. Diep, Hand volume estimates based on a geometric algorithm in comparison to water displacement, Lymphology 39(2) (2006) pp. 95-103. 

[20] A. Corti, S. Giancola, G. Mainetti, R. Sala, A metrological 
characterization of the Kinect V2 time-of-flight camera, Robotics 
and Autonomous Systems 75 (2016) pp. 584-594. 

[21] H. Gonzalez-Jorge, P. Rodríguez-Gonzálvez, J. Martínez-Sánchez, D. González-Aguilera, P. Arias, M. Gesto, L. Díaz-Vilariño, Metrological comparison between Kinect I and Kinect II sensors, Measurement 70 (2015) pp. 21-26. 

[22] S. Giancola, M. Valenti, R. Sala, Metrological Qualification of the 
Intel D400™ Active Stereoscopy Cameras. In: A Survey on 3D 
Cameras: Metrological Comparison of Time-of-Flight, Structured-
Light and Active Stereoscopy Technologies, 2018, Springer, 
Cham, pp. 71-85. 

[23] M. Tarabini, D. Marchisotti, R. Sala, P. Marzaroli, H. Giberti, 
M. Sculati, A prototype for the automatic measurement of the 
hand dimensions using the Microsoft Kinect V2, Proc. of the 2018 
IEEE International Symposium on Medical Measurements and 
Applications (MeMeA), 2018, pp. 1-6.  

[24] Laser Instruction Manual and Datasheet. [Online]. Available: 
https://www.micro-epsilon.fr/download/manuals/man–
optoNCDT-1402–en.pdf   

[25] D. C. Montgomery, G. C. Runger, Applied Statistics and 
Probability for Engineers, John Wiley & Sons, 2010. 

[26] G. Moschioni, B. Saggin, M. Tarabini, 3-D sound intensity 
measurements: accuracy enhancements with virtual-instrument-
based technology, IEEE Trans. Instrumentation and 
Measurement, 57(9) (2008) 1820-1829. 

[27] B. Saggin, D. Scaccabarozzi, M. Tarabini, Instrumental phase-
based method for Fourier transform spectrometer measurements 
processing, Applied Optics 50(12) (2011) pp. 1717-1725. 

[28] B. Saggin, D. Scaccabarozzi, M. Tarabini, Metrological 
performances of a plantar pressure measurement system, IEEE 
Transactions on Instrumentation and Measurement 62(4) (2013) 
pp. 766-776. 

[29] G. Moschioni, B. Saggin, M. Tarabini, J. Hald, J. Morkholt, Use of 
design of experiments and Monte Carlo method for instruments 
optimal design, Measurement 46(2) (2013) pp. 976-984. 

 
