Knowledge Engineering and Data Science (KEDS), pISSN 2597-4602, eISSN 2597-4637
Vol 5, No 1, December 2022, pp. 27-40
https://doi.org/10.17977/um018v5i12022p27-40
©2022 Knowledge Engineering and Data Science | W: http://journal2.um.ac.id/index.php/keds | E: keds.journal@um.ac.id
This is an open access article under the CC BY-SA license (https://creativecommons.org/licenses/by-sa/4.0/)

Automatic 3D Cranial Landmark Positioning based on Surface Curvature Feature using Machine Learning

Putu Hendra Suputra a, b, 1, Anggraini Dwi Sensusiati c, 2, Myrtati Dyah Artaria d, 3, Gijsbertus Jacob Verkerke e, 4, Eko Mulyanto Yuniarno a, f, 5, *, I Ketut Eddy Purnama a, f, 6

a Department of Electrical Engineering, Institut Teknologi Sepuluh Nopember, Gedung B, C & AJ Kampus ITS, Keputih, Sukolilo, Surabaya 60111, Indonesia
b Department of Informatics, Universitas Pendidikan Ganesha, Jalan Udayana No. 11, Singaraja, Bali 81116, Indonesia
c Department of Radiology, Faculty of Medicine, Universitas Airlangga, Kampus A Universitas Airlangga, Jl. Mayjen. Prof. Dr. Moestopo 47, Surabaya 60131, Indonesia
d Department of Anthropology, Universitas Airlangga, Room 211, Building A, FISIP, Campus B, Jl. Airlangga No. 4-6, Gubeng, Surabaya 60115, Indonesia
e University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9713 GZ Groningen, Netherlands
f Department of Computer Engineering, Institut Teknologi Sepuluh Nopember, Gedung B & C, Kampus ITS Sukolilo, Surabaya 60111, Indonesia

1 hendra.suputra@undiksha.ac.id; 2 anggraini-d-s@fk.unair.ac.id; 3 myrtati.artaria@fisip.unair.ac.id; 4 g.j.verkerke@med.umcg.nl; 5 ekomulyanto@ee.its.ac.id*; 6 ketut@te.its.ac.id
* corresponding author

I. Introduction
Craniofacial reconstruction is the art of reconstructing the soft tissue lost from the skull to present a lifelike visual appearance of a face. Soft tissues are estimated based on the anthropometric features of the skull.
Forensic identification also requires landmark positioning for correspondence with reference facial landmarks [1][2]. An expert must determine the position of particular anatomical features (landmarks) as the soft tissue reference.

ARTICLE INFO
Article history:
Received 21 April 2022
Revised 28 July 2022
Accepted 8 August 2022
Published online 7 November 2022

ABSTRACT
Cranial anthropometric reference points (landmarks) play an important role in craniofacial reconstruction and identification. Knowledge of how to detect the position of landmarks is critical. This work aims to locate landmarks automatically. Landmark positioning using the Surface Curvature Feature (SCF) is inspired by conventional methods of finding landmarks based on morphometric features. Each cranial landmark has a unique shape. With appropriate 3D descriptors, a computer can draw associations between shapes and landmarks using machine learning. The challenge in classification and detection in three-dimensional space is to determine the model and data representation. Using raw three-dimensional data in machine learning poses a serious volumetric issue. This work uses the Surface Curvature Feature as a three-dimensional descriptor: it extracts the local surface curvature shape into a sequence of projection (depth) values. A machine learning method is developed to determine the position of landmarks based on local surface shape characteristics. Classification is carried out on the top-n prediction probabilities for each landmark class, which are then filtered to obtain pinpoint accuracy. The landmark prediction points are hypothetically clustered in a particular area, so a cluster-based filter is appropriate to isolate them. The learning model successfully detected the landmarks, with an average distance between the prediction points and the ground truth of 0.0326 normalized units. A cluster-based filter is implemented to increase accuracy with respect to the ground truth.
Thus, SCF is suitable as a 3D descriptor of cranial landmarks.

Keywords: Anatomical feature; Cranial landmark; Machine learning; Morphological approach; Surface curvature feature; Three dimensions

Cranial landmarks are reference points for adding soft tissue thickness. A three-dimensional perspective poses a challenge in placing landmarks compared to positioning on two-dimensional images (image superimposition). The popular facial reconstruction methods, the American [3] and Gerasimov [4] methods, are the most intuitive for a computer-based approach. Both simulate soft tissues guided by the placement of landmarks. Current computerized craniofacial reconstructions generally use a three-dimensional virtual model that mimics the manual reconstruction process [5][6]. Research related to computer-based craniofacial reconstruction is widely developed. However, most attention has been paid to reconstructing facial features based on soft tissue thickness; the positions of the skull landmarks are still determined manually. Knowledge and the ability to detect the position of anatomical features are critical [7][8][9][10][11][12][13][14]. The golden rule in determining the position of anatomical features is based on their surface shape. The use of the Surface Curvature Feature (SCF) for classifying skull surface landmarks is inspired by conventional methods of determining anthropometric reference points. The SCF was introduced as a morphometric feature model representing a local surface's shape [15]. Nonetheless, several studies of automatic 3D detection have been carried out. For anatomical landmark positioning, Jacinto [16] proposed a multi-atlas approach to orthopedic knee landmarks.
The local surface shape is used as a reference for the registration of pre-defined landmark positions. Other approaches to three-dimensional annotation systems were presented by Lee [17] and Lindner [18]. However, they use a 2D image or a shaded-2D image; their methods do not measure the skull model geometrically, as the original conventional method does. The challenge of using a three-dimensional image modality is the input dimension, which involves a very large volume of voxels. Instead of employing a full 3D model, this study interprets the three-dimensional shape information in context through a 3D descriptor; in other words, context-based information retrieval in three-dimensional form. Each landmark has a distinctive shape. By understanding a landmark's shape characteristics, the landmark's position on the skull can be determined. A regular surface's shape can be defined using a quantifiable model or a feature representation (3D descriptor). This modelling eases the computation of surface shape features. The curvature model must be able to describe a surface's shape quantitatively: whether a surface is concave or convex can be read from the curvature value. In this work, curvature features are used as the input to obtain characteristic descriptors of the physical shape of skull surfaces as landmarks. By solving the volumetric issue of 3D data in machine learning, the learning task can be solved with a simple multi-layer perceptron that draws relationships between surface shapes and landmarks. The network model uses an activation function that accommodates the range of SCF sequential values to maintain the three-dimensional information context. The model can classify whether a point on the surface of the skull is a landmark or not. A novel method is proposed in this work, oriented towards geometrical measurement in 3D spatial space.
A three-dimensional feature representation, the Surface Curvature Feature (SCF), is used to describe the local morphological shape of the landmarks. Machine learning is applied to detect and classify the skull surface landmarks. The contributions of the proposed method are as follows:
• The first study of the use of SCF as a three-dimensional feature representation (descriptor) for detecting anatomical features of the skull surface (landmarks);
• An intuitive method for recognizing landmarks based on the shape of the surface, inspired by conventional methods;
• A simple MLP engaged in drawing correlations on the implicit characteristics of a surface shape as a skull landmark.

II. Method
A. Related Works
Working with 3D objects for shape recognition cannot be separated from the 3D shape representation process. Generally, 3D representation is done using a feature-based approach, a view-based (facade) approach, or a graphical information-based approach [19]. The feature-based approach uses shape models in matching or classification. The view-based approach renders a 3D view into 2D at a certain point of view. The final approach uses information on the surface topology. The feature-based approach relies on the geometric characteristics of 3D objects. Geometry characteristics can be applied as global or local features, where global features are most widely used for classification work. Meanwhile, the view-based approach eliminates the volumetric load of 3D objects. The computational cost of 3D input data is much higher than that of a two- or one-dimensional format. Therefore, several ways have been developed to convert the structure of 3D shapes into 2D. Conversion of 3D data into 2D inevitably sacrifices some information, as long as the desired features are maintained.
The third approach emphasizes the relationships and connections among the various components of the object representation model. Computationally, the graph-based approach is considered inefficient compared to the other two approaches. While the first approach geometrically retrieves the shape of an object, the second approach must look at the object from various views to obtain the intended feature. Work related to the local shape approach on a three-dimensional surface includes the automatic positioning proposed by Jacinto [16], in which anatomical landmarks are detected by local rigid registration between a pre-defined landmark 3D model and the patient's structure model. The three-dimensional annotation systems by Lee [17] and Lindner [18] use a 2D image approach. The cephalometric annotation performed by Lee uses a shaded-2D image, with the landmark coordinates approximated using machine learning. Meanwhile, Lindner uses lateral cephalogram images. This method accurately detects landmark locations in the 2D image. However, the detection results cannot be used directly for facial reconstruction because they do not provide 3D spatial coordinate information. The orthopaedic landmark positioning proposed by Jacinto [16] outlines the multi-atlas approach. He proposed a method for determining the position of landmarks on a 3D orthopaedic atlas model of the patient's bone structure. There are two three-dimensional models, namely the patient and expert models. Both structure models consist of a triangulated mesh. The positioning of the landmarks is obtained by registering the patient model to the expert model. An expert identifies landmark positions on the expert model as a pre-defined landmark model, which is then registered to the patient model. The process includes two stages: computing the global mesh registration and local registration. The initial fitting stage of the patient model to the expert model is carried out using the Iterative Closest Point (ICP) algorithm.
This stage aims to attach the patient model to the expert model. The next step refines the fitting using adaptive local rigid registration. This stage places (transfers) the positions of the landmarks in the expert model to the patient model. The final position of the landmarks is refined using automatic selection of a set of best-positioned landmarks. This approach increases the efficiency of unsupervised landmark positioning. Lindner et al. [18] proposed a method for automatic landmark annotation on lateral cephalometric images. The positioning of these landmarks is a standard procedure in orthodontic diagnosis and treatment. The Fully Automatic Landmark Annotation System (FALA) that was developed works by utilizing machine learning to determine the landmark position based on landmark position data annotated manually by experts. Two stages were deployed: Random Forest regression, to make the system robust against various variations in the image, and the Constrained Local Model framework (RFRV-CLM), which specifically determined the position of the landmarks. The landmark annotation by Lindner [18] was performed on lateral cephalogram images, in other words, two-dimensional images taken from the side of the object. The landmark detection was carried out on the bone contour to isolate the desired landmark position. The challenge is to distinguish whether a landmark lies on the contour of the bone or not. For this reason, adding a patch size to the training samples allows the system to learn more about the environment in which a landmark is located during the learning process. A three-dimensional modality for detecting the position of landmarks based on machine learning was employed by Lee et al. [17], driven by a shift in the trend towards using three-dimensional images in surgical simulations. However, that study does not directly process three-dimensional images to detect landmarks.
Shaded two-dimensional renderings of three-dimensional imagery were used as the modality. The positions of the landmarks are annotated on the image so that machine learning can find their relationships. Lee identifies the main problem in processing three-dimensional data as the number of voxels. This challenge was overcome by working on two-dimensional shaded objects with variations in lighting and viewing angle. The method uses VGG-net to draw the correlation of landmark positions based on a manually marked dataset. Another work on cephalometric landmark detection was carried out in [20]. The challenges faced are also related to the position of landmarks in three-dimensional space. They propose multistage deep reinforcement learning for detecting landmarks. The modality used is similar to Lee [17], using 3D images rendered as 2D shaded images from various views. The work consists of two stages, namely the learning stage and the inferencing stage. The multi-stage approach is needed because landmarks positioned in empty 3D space (such as the Foramen magnum or Sella) cannot be depicted by simple 2D images; both single and multi-stage approaches can detect the other landmarks. The challenge in using three-dimensional data in machine learning is the enormous amount of data that must be processed. Several works, such as Lee [17], Jacinto [16], and Lindner [18], perform alternative three-dimensional contour processing with a two-dimensional approach. However, machine learning does not require raw image data as input. Refined or extracted information from the image can also be used if it represents the expected features. The Surface Curvature Feature method proposed by Yuniarno [15] works as a three-dimensional surface descriptor that translates the shape of the surface curvature into simpler context information for machine learning.
Unlike the works that use a view-based approach, Yuniarno [15] uses a feature-based method for point cloud registration. He proposed an iterative closest point (ICP) algorithm, as used by Jacinto, but utilizing surface curvature feature estimation (SCF), a feature-based approach that works on local features. The method fits k-NN local points to a hyperbolic paraboloid equation and was originally used to perform point cloud matching. In general, SCF acts as a model that describes the shape of a curvature. The iterative closest point-surface curvature feature (ICP-SCF) algorithm can describe a local surface so that point cloud matching can be performed. From the related works, two issues need to be addressed in completing the task of classifying and detecting landmarks in 3D space, unlike previous works that use two-dimensional modalities. First, this work focuses on positioning landmarks at the coordinate level. Second, it applies machine learning to classify and detect the position of a landmark. The first issue is solved by representing the local surface shape context using SCF. Unlike Yuniarno, in this work SCF is used naively as a surface curvature descriptor, and the extraction results are used as the learning input. For the second issue, we hypothesize that adjacent points will have similar surface shapes; if they belong to a landmark, then several points will be predicted as the same class with a similar probability level. Therefore, the top probability prediction points are taken and then filtered based on the most significant cluster.

B. Proposed Method
Considering the previous two issues, we propose a method aimed at detecting, classifying, and determining the position of cranial landmarks (Figure 1). The modality used is multislice-CT in DICOM format. Cranial sections are extracted as point clouds and stored as Cartesian coordinates. Next, the local surface context for each point cloud is extracted with SCF.
This descriptor solves the problem of volumetric input by converting 3D data into feature representation values. Associations between surface shapes and landmarks are obtained by deep learning. Points with top-N probabilities are then filtered to isolate the largest cluster as the landmark area.

C. Surface Curvature Feature
Each landmark can be recognized based on the characteristics of its shape. A data representation model and a three-dimensional surface descriptor model are needed to realize this idea. The cranium is represented as a point cloud. Spatial data processing using a local coordinate system can provide a better perspective and simplify measurements. The Surface Curvature Feature (SCF) is used as the three-dimensional descriptor. Surface curvature is the curvature of a curve on the surface that passes through a certain point [15]. Surface curvature must provide consistent information about the characteristics of a curve with a certain centre point p. The coordinate system serving as the parameter frame in which the curvature function works must be uniform. For spatial work in geometric spaces, local coordinate systems are often used. The local coordinate system is built on a point p, where the normal vector of the surface at p is used as the z coordinate axis. Meanwhile, the x and y coordinate axes are obtained from the principal directions of the surface at p.

D. Local Coordinate System
The local coordinate system (Figure 2) simplifies the marking and measurement process. This coordinate system is best used for smaller area extents. The conversion is done by finding a composite transformation that transforms the point cloud so that the normal vector of p coincides with the azimuth [0 0 1] as the local principal axis z. The transformation matrix is a composite of translation and rotation. The centre of rotation must be at the point p, so the translation is -p.
For the z-axis direction of the local coordinate system to be parallel to the surface's normal, the point cloud needs to be rotated. Rodrigues' rotation formula is applied to find the rotation R. It gives an efficient method for computing the rotation matrix corresponding to a rotation by an angle θ about a fixed axis specified by a unit vector k. After the surface point cloud is transformed with the translation and rotation, the next task is determining the principal directions.

Fig. 1. Landmark positioning process based on Surface Curvature Feature
Fig. 2. Local coordinate system on p. The normal vector n_p of P coincides with the azimuth as the local principal axis z

E. Principal Directions Estimation
A regular surface has two principal directions. The principal directions of a surface are used to determine two vectors as the x and y axes of the local coordinate system. They are perpendicular to each other and to the z-axis, which is parallel to the normal. A principal direction of a surface at p is a tangent direction at that point on a regular surface in which the normal curvature of the surface reaches an extremal value. The principal directions define the directions in which a curve on the surface must travel to obtain the minimum and maximum curvature. The minimum and maximum curvature conditions are obtained from the eigenvalues of the shape operator, while the principal directions are obtained from the eigenvectors that correlate with those eigenvalues. The maximum curvature and the minimum curvature of the curve are referred to as the principal curvatures [15]. The principal curvatures are the eigenvalues of the Weingarten matrix, built from the coefficients of the first fundamental form (E, F, G) and the second fundamental form (L, M, N):

W = \begin{bmatrix} E & F \\ F & G \end{bmatrix}^{-1} \begin{bmatrix} L & M \\ M & N \end{bmatrix} \quad (1)

If κ₁ and κ₂ are the eigenvalues of the Weingarten matrix W, then the minimum curvature is κ₂ and the maximum curvature is κ₁, with |κ₁| ≥ |κ₂|. The eigenvectors associated with the principal curvatures are called the principal directions.
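The construction above can be sketched in a few lines of numpy. This is an illustrative sketch, not the paper's implementation: the function names are ours, and the Weingarten matrix is assumed to be parameterized by the first and second fundamental form coefficients (the paper does not spell out the matrix entries). Rodrigues' formula builds the rotation that aligns the surface normal with the local z-axis [0 0 1]; the principal curvatures and directions come from the eigen-decomposition of the Weingarten matrix.

```python
import numpy as np

def rodrigues_rotation(k, theta):
    """Rotation matrix for an angle theta about a unit axis k (Rodrigues' formula)."""
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def align_normal_to_z(normal):
    """Rotation R such that R @ normal points along the local z-axis [0 0 1]."""
    n = normal / np.linalg.norm(normal)
    z = np.array([0.0, 0.0, 1.0])
    axis = np.cross(n, z)
    s = np.linalg.norm(axis)
    if s < 1e-12:                       # normal already (anti-)parallel to z
        return np.eye(3) if n[2] > 0 else np.diag([1.0, -1.0, -1.0])
    axis /= s
    theta = np.arccos(np.clip(np.dot(n, z), -1.0, 1.0))
    return rodrigues_rotation(axis, theta)

def principal_curvatures(E, F, G, L, M, N):
    """Eigen-decomposition of the Weingarten matrix W = I^-1 II (Equation 1).

    Returns (curvatures, directions) sorted so that |k1| >= |k2|.
    """
    W = np.linalg.inv(np.array([[E, F], [F, G]])) @ np.array([[L, M], [M, N]])
    vals, vecs = np.linalg.eig(W)
    order = np.argsort(np.abs(vals))[::-1]
    return vals[order], vecs[:, order]
```

As a sanity check, for the paraboloid z = (x² + y²)/2 at the origin the fundamental forms give E = G = 1, F = 0 and L = N = 1, M = 0, so both principal curvatures equal 1.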
A local coordinate system combines the principal directions and the surface normal vector. Together they construct a coordinate frame (Darboux frame).

F. Surface Curvature Feature Extraction
The surface curvature feature is a sequence of values that describes the curvature of a surface at a certain radius. Thirty-six projection starting points were prepared for the sequence (Figure 3). These points lie on the planar ring perpendicular to the normal at the centre point p, spaced every ten degrees around the Z-axis (the normal of p). The measurement starts from the first (zero-degree) projection point, whose position is at the ring radius on the X-axis. The subsequent points follow every 10 degrees counterclockwise. One direct use of SCF is to determine surface similarity: two surfaces can be said to be geometrically similar if they have similar surface curvature features. The similarity of surface geometry at two points can be compared by looking at the sequence values of the SCF. A deep learning approach can also be applied, considering the difficulty of finding a direct correlation between SCFs. By using deep learning, machines can relate the implicit definition of a classification based on a given feature. The machine will detect points on the surface with a certain surface shape. The surface shape is characterized by several SCFs that are categorized as similar (Figure 4). The principal direction is applied to ensure that similar surface shapes are measured from a uniform starting point. In other words, the SCF values will appear with a similar histogram pyramid for surfaces with a similar shape.

Fig. 3. Surface Curvature Feature extraction. This returns a sequential value for each surface with centre p. The sequential value is the projection distance of the planar ring to the surface. A planar ring has a radius as the specified parameter, with the centre at p

G.
Data Preparation
The data samples consisted of multi-slice Computed Tomography (CT) scans of skulls. The contours of the skulls were separated from soft tissue using the Hounsfield unit scale; CT is well suited to density-based segmentation [21]. The data are retrospective and anonymized, with a slice spacing of 0.625 mm and slices of 512×512 pixels. Each skull-face sample consists of 400 to 500 slices. The skull surface was then extracted from the multi-slice CT of the head based on the density difference between bone and soft tissue. The surface is stored as a point cloud. The surface shape of each local surface on the skull was extracted using a three-dimensional descriptor, the surface curvature feature (SCF). The dataset consists of 458,000 local surfaces, extracted as SCF values and split into training, validation, and testing data with a 60:20:20 ratio. To predict landmark positions, points are taken from the areas considered to contain the landmarks. They are frontomalare temporale-right (fmtr), zygion-right (zygr), glabella (glb), nasion (nas), zygoorbitale-right (zygoor), zygoorbitale-left (zygool), pogonion (pog), frontomalare temporale-left (fmtl), and zygion-left (zygl) [2][8][22][23]. The points are sampled within a radius of 0.3 units from the centre of the desired landmarks (Figure 5).

Fig. 4. Examples of four local surfaces, extracted by the SCF 3D descriptor into SCF values that are categorized as similar. A group of SCFs categorized as similar will be trained to detect certain surface shapes.
Fig. 5. Nine landmarks: (a) frontomalare temporale-right (fmtr) and zygion-right (zygr); (b) glabella (glb), nasion (nas), zygoorbitale-right (zygoor), zygoorbitale-left (zygool), and pogonion (pog); and (c) frontomalare temporale-left (fmtl) and zygion-left (zygl).

H. Point Clouds Normalisation
A point cloud sample P = {p_i} is a collection of points extracted from multi-slice computed tomography (CT).
The surfaces of the face and the skull are obtained by the outer contour extraction method [24] as point clouds, stored in an array of Cartesian coordinates ([x y z]). The outer points of the skull are recorded in Cartesian format. Each skull point cloud was normalized to obtain a uniform scale. The centre point c is determined as the average of the point cloud and becomes the origin of the coordinate system ([0 0 0]). Each skull is scaled so that the average distance from the point cloud to c is one. With this normalization, the new coordinates are computed using Equation (2):

\hat{p}_i = \frac{p_i - c}{\frac{1}{N} \sum_{j=1}^{N} \lVert p_j - c \rVert} \quad (2)

where N is the number of points and ‖p_j − c‖ is the distance from point p_j to the centre. Normalization reduces the dependence of gradients on the scale of the initial values during training. This allows higher learning rates without the risk of divergence.

I. Feature Extraction
The SCF sequential value is a series of values that represents the shape of a surface by measuring the distance from the ring projection to the surface. The ring radius and the point cloud sample radius are operator-defined parameters. Measurements were made on the normalized skull point cloud. In this study, the ring radius and sample radius were 0.25 and 0.5 normalized units, respectively (normalized units after application of Equation (2)). The principal direction axis determines the starting point of measurement, which moves counterclockwise every ten degrees. Figure 6 shows the measurement of the SCF sequential value as a surface's three-dimensional descriptor. A total of 36 user-defined projection starting points were prepared. These points lie in the plane perpendicular to the normal at p, spaced every 10 degrees around the Z-axis (the normal of p). The measurement starts from the first (zero-degree) projection point, whose position is at the ring radius on the X-axis. There are cases where the projection of the ring does not touch the surface; these are assigned a pre-defined maximum value. SCF values are generally in a small range.
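To make the two steps concrete, the following is a simplified numpy sketch of the normalization of Equation (2) and the ring-projection measurement. It is not the authors' implementation: the nearest-neighbour approximation of the projection, the miss threshold `tol`, and the miss value `d_max` are illustrative assumptions.

```python
import numpy as np

def normalize_cloud(points):
    """Equation (2): centre at the mean, scale so the mean distance to centre is 1."""
    c = points.mean(axis=0)
    shifted = points - c
    scale = np.linalg.norm(shifted, axis=1).mean()
    return shifted / scale

def scf_sequence(local_points, ring_radius=0.25, n_angles=36, d_max=1.0, tol=0.05):
    """SCF sketch: project ring points (every 10 degrees) onto the local surface.

    local_points are assumed to be expressed in the local frame, with z along
    the surface normal at the centre point.  The depth of the nearest surface
    point under each ring position is the sequence value; projections that
    miss the surface get d_max, which is then zeroed to avoid dominating.
    """
    angles = np.deg2rad(np.arange(n_angles) * (360.0 / n_angles))
    ring = np.stack([ring_radius * np.cos(angles),
                     ring_radius * np.sin(angles)], axis=1)
    seq = np.full(n_angles, d_max)
    for i, (rx, ry) in enumerate(ring):
        d2 = (local_points[:, 0] - rx) ** 2 + (local_points[:, 1] - ry) ** 2
        j = np.argmin(d2)
        if d2[j] < tol ** 2:               # projection actually hits the surface
            seq[i] = local_points[j, 2]    # signed depth along the local normal
    seq[seq == d_max] = 0.0                # suppress dominant miss values
    return seq
```

On a flat local surface (z = 0 everywhere) the sketch returns an all-zero sequence, as expected for zero curvature.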
The dominant maximum value reduces training accuracy: it is not an actual projection value but a given maximum assigned when the surface projection is not achieved. To eliminate the dominance of this large value, it is set to zero.

Fig. 6. Surface Curvature Feature extraction on a point cloud sample. The point cloud coordinate system is converted into local coordinates. The SCF sequential value is the result of projection measurements from the SCF ring to the local surface.

J. Neural Network Deployment
The input data is in the form of SCF values, which already constitute a feature, so it can naturally be processed directly by a multi-layer perceptron. A model is developed to detect landmarks in ten classes (glb, fmtr, fmtl, zygr, zygl, zygoor, zygool, pog, nasion, and non-landmark). It consists of an input layer, hidden layers, and an output layer, and is trained to produce the probability of membership of an SCF in each of the ten landmark classes. The probability value shows how certain the data is predicted to be a class member. The SCF sequential value is the projection distance of the ring onto the surface of the skull, using 36 equidistant projection points. On a convex surface, for example, all SCF sequence values will be positive in the extreme case; however, with variations in surface shape, they can range from negative to positive. An appropriate activation function is needed to accommodate this range of values; the activation function was chosen because it maintains negative magnitudes in the training input. Using this activation function in combination with the output layer gives significantly better results than a linear function. The loss function used in this classification model is categorical cross-entropy, as is common in multi-class classification.

III. Results and Discussions
An artificial neural network model has been created to classify nine landmarks into ten classes (Figure 7).
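The network described in the previous section can be sketched as a plain numpy forward pass. The paper does not state the hidden-layer sizes, the exact activation, or the weight initialization, so the values below are illustrative; tanh is used here only because the text requires an activation that preserves negative magnitudes, and softmax is assumed for the probability output.

```python
import numpy as np

rng = np.random.default_rng(0)

CLASSES = ["glb", "fmtr", "fmtl", "zygr", "zygl",
           "zygoor", "zygool", "pog", "nas", "non-landmark"]

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mlp_forward(scf, params):
    """Map 36-value SCF sequences to probabilities over the ten landmark classes."""
    h = scf
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)         # hidden layers: tanh keeps negative depths
    W, b = params[-1]
    return softmax(h @ W + b)          # output layer: class probabilities

def categorical_cross_entropy(probs, onehot):
    """Loss used for the multi-class landmark classification."""
    return -np.sum(onehot * np.log(probs + 1e-12), axis=-1).mean()

# Illustrative layer sizes (assumed; the paper does not specify them).
sizes = [36, 64, 32, len(CLASSES)]
params = [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
          for a, b in zip(sizes[:-1], sizes[1:])]

# Forward pass on a batch of five random SCF sequences.
probs = mlp_forward(rng.normal(size=(5, 36)), params)
loss = categorical_cross_entropy(probs, np.eye(len(CLASSES))[[0, 1, 2, 3, 4]])
```

Each row of `probs` sums to one, so the top-n rows per class can be ranked directly by their predicted probability, as done in the results below.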
Non-landmark points are treated as a separate label beside the nine landmarks. The dataset consists of SCF values of 458,000 local surfaces labeled with nine landmark classes plus one non-landmark class. The non-landmark class is included to increase confidence in the prediction results: most of the skull surface is not a landmark, so adding an explicit "not a landmark" class helps improve accuracy. The neural network model was trained for 200 epochs to ensure convergence of the training results (Figure 8). It can be seen that convergence is reached at around the 22nd epoch. In some epochs, insignificant spikes occur, and the learning returns to convergence. An untrained cranial point cloud was tested on the neural network model. The point cloud consists of 200,000 evenly distributed samples (local surfaces). Predictions for each landmark are scattered in several areas but are mostly clustered at the designated target. The class predictions in Figure 9 were obtained from the top hundred highest predicted probability values for each landmark class.

Fig. 7. Model summary with categorical cross-entropy loss

Ground truths are annotated by an expert on the cranial surface. The coordinates of the annotation points are extracted to be compared with the predicted results. The accuracy of each prediction point is measured as the three-dimensional Euclidean distance (Equation (3)) between the ground truth g and the prediction point p:

d(g, p) = \sqrt{(x_g - x_p)^2 + (y_g - y_p)^2 + (z_g - z_p)^2} \quad (3)

Seven of the nine landmarks give promising results compared to the ground truth. The predictions for zygoor and zygool have multiple clusters, with the largest being closer to the ground truth. Accuracy is then improved with the implementation of a distance radius-based filter and a cluster-based filter. Radius-based filters select the points closest to the centre of mass of the prediction points. The centre of mass is determined by the average coordinates of the predicted points.
The distance (Equation (3)) of each prediction point from the centre is calculated, and the points are sorted in ascending order. The filter aims to keep candidate prediction points close to the ground truth. The hypothesis is that the prediction points should converge close to the ground truth position; the radius-based filter assumes that the centre of mass lies at the ground truth position. This filter works well on several landmark classes such as FMTL, GLB, POG, NAS, and ZYGL, but for other classes it does not give consistent results. The problem is that the prediction points are spread over a wide range, and this filter is also unable to handle multi-cluster predictions. Filtering by cluster is carried out to cope with multi-cluster predictions. This filter is a simplification of the neighborhood-based filtering techniques by Han [25]. It relies on the number of neighbors of each prediction point within a normalized 0.05-unit radius. The idea is to eliminate isolated points or small clusters (those with few neighbors); thus, the tighter the filter, the more only the points of larger clusters remain. In the case of multi-cluster predictions, radius-based filters are not optimal: the average distance from the ground truth widens as the filter is tightened. In contrast, the cluster-based filter consistently provides better results than the radius-based filter (Table 1). On the zygool and zygoor landmarks, the prediction points are spread over a wide range. For zygool in particular, there is more than one cluster (Figure 9). Radius-based filters failed to improve accuracy when finding the top-20 points, while cluster-based filters cope with this problem.

Fig. 8. Training accuracy (top) and loss (bottom) history of the learning process over 200 epochs. At approximately the 22nd epoch, convergence is reached. In some epochs, insignificant spikes occur before the learning returns to convergence.
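The two filters can be sketched in a few lines of numpy. The 0.05-unit neighbour radius follows the text; the function names and the rest of the implementation are illustrative, not the authors' code.

```python
import numpy as np

def radius_filter(points, top_n):
    """Keep the top_n points closest to the centre of mass of the predictions."""
    centre = points.mean(axis=0)
    d = np.linalg.norm(points - centre, axis=1)      # Euclidean distance, Equation (3)
    return points[np.argsort(d)[:top_n]]

def cluster_filter(points, top_n, radius=0.05):
    """Keep the top_n points with the most neighbours within `radius`.

    Isolated points and small clusters have few neighbours and are dropped
    first, so tightening the filter leaves only the largest cluster's points.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    neighbours = (dist < radius).sum(axis=1) - 1     # exclude the point itself
    return points[np.argsort(neighbours)[::-1][:top_n]]
```

On a synthetic prediction set with one large cluster and a few far-away outliers, the cluster filter retains only points from the large cluster, which mirrors how it isolates the dominant cluster in the multi-cluster zygool case.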
The performance of cluster-based filters is generally better than that of mass-radius-based filters (Figure 10 and Figure 11). Referring to the positions of the actual landmarks, Figure 9 shows the visual predictions of the landmarks and the filtered results. The mean distance from the ground truth is reduced (better); the filter improves accuracy by eliminating points scattered too far from the centre of mass. The results shown in Figure 10, Figure 11, and Table 1 confirm that the filter can eliminate less relevant points, and the cluster-based filter consistently provides better results. This also strengthens the hypothesis that the predictions of a landmark will be clustered because they have similar SCF values; with neighborhood-based filtering techniques [25], they can be clustered.

Fig. 9. Results of top-hundred landmark detection, compared side by side with cluster-based filtering.

Using SCF with the Iterative Closest Point algorithm (ICP-SCF) successfully registered point clouds based on local-surface SCF [15]. A similar idea is used in this work with a machine learning approach. Learning runs convergently, and the prediction results show that the SCF can act as a surface descriptor whose landmark characteristics can be recognized. The main challenge in using 3D modalities is the large number of voxels [17]. This work addresses the issue by using a three-dimensional descriptor to draw the shape context of a three-dimensional surface: the learning input is information in the form of a 3D surface shape context, so the workload of the learning process is much reduced.

Table 1.
The average distance between the top-n prediction points and the ground truth points

Landmark  Filter          Top-100  Top-90  Top-80  Top-70  Top-60  Top-50  Top-40  Top-30  Top-20  Top-10  Top-5
GLB       Cluster filter  0.1420   0.0615  0.0457  0.0406  0.0384  0.0348  0.0316  0.0308  0.0287  0.0259  0.0228
GLB       Radius filter   0.1420   0.0531  0.0449  0.0417  0.0379  0.0353  0.0318  0.0289  0.0251  0.0280  0.0246
FMTR      Cluster filter  0.0973   0.0382  0.0336  0.0294  0.0258  0.0223  0.0199  0.0163  0.0134  0.0104  0.0125
FMTR      Radius filter   0.0973   0.0387  0.0372  0.0361  0.0378  0.0387  0.0390  0.0418  0.0478  0.0611  0.0651
ZYGR      Cluster filter  0.1933   0.0483  0.0435  0.0393  0.0371  0.0359  0.0324  0.0277  0.0243  0.0206  0.0162
ZYGR      Radius filter   0.1933   0.0484  0.0440  0.0417  0.0379  0.0354  0.0326  0.0334  0.0337  0.0322  0.0324
POG       Cluster filter  0.2150   0.1444  0.0623  0.0582  0.0554  0.0503  0.0506  0.0525  0.0527  0.0553  0.0512
POG       Radius filter   0.2150   0.1093  0.0623  0.0545  0.0474  0.0412  0.0357  0.0305  0.0249  0.0185  0.0136
FMTL      Cluster filter  0.0917   0.0395  0.0350  0.0338  0.0311  0.0301  0.0278  0.0244  0.0227  0.0184  0.0218
FMTL      Radius filter   0.0917   0.0400  0.0378  0.0359  0.0336  0.0310  0.0287  0.0267  0.0229  0.0225  0.0291
ZYGL      Cluster filter  0.1717   0.0944  0.0489  0.0414  0.0388  0.0353  0.0316  0.0315  0.0313  0.0331  0.0304
ZYGL      Radius filter   0.1717   0.0798  0.0492  0.0452  0.0433  0.0384  0.0397  0.0421  0.0429  0.0467  0.0562
ZYGOOR    Cluster filter  0.4486   0.3882  0.3394  0.2567  0.1774  0.0501  0.0350  0.0230  0.0206  0.0240  0.0316
ZYGOOR    Radius filter   0.4486   0.3385  0.2580  0.1953  0.1205  0.0781  0.0671  0.0806  0.1153  0.1968  0.2798
ZYGOOL    Cluster filter  0.7019   0.6982  0.6783  0.6244  0.5983  0.5412  0.4521  0.2040  0.0219  0.0177  0.0212
ZYGOOL    Radius filter   0.7019   0.6237  0.5780  0.5062  0.5040  0.5786  0.6806  0.7681  0.7726  0.6747  0.6177
NAS       Cluster filter  0.3305   0.2705  0.2427  0.1393  0.0666  0.0708  0.0719  0.0731  0.0721  0.0717  0.0695
NAS       Radius filter   0.3305   0.2150  0.0900  0.0634  0.0584  0.0551  0.0505  0.0494  0.0511  0.0675  0.0700

Fig. 10. Filtering landmark prediction points by distance radius.
The centre of mass of the predicted points serves as the centre of the radius.

The feature-based approach is not only suitable for classification based on global features; a good understanding of local surface features can also be used to detect the unit shape of a landmark. So far, such detection and prediction have only been carried out using view-based approaches, such as the works by Lindner [18], Lee [17], Jacinto [16], and Kang [20]. A method for automatically detecting the positions of landmarks is necessary in craniofacial reconstruction. Especially in computer-assisted reconstruction, the morphological approach is a computational opportunity because it can be simulated by a computer based on available medical modalities (CT, MRI, USG).

IV. Conclusions

The hypothesis that inspires this work is that each landmark has a unique surface-shape characteristic. SCF is used as a feature representation of the shape of the landmark surface. The neural network model shows convincing performance, with an average distance to the ground truth of 0.0326 normalized units. Several prediction points with the highest probability values are taken for each landmark; they are scattered but tend to cluster in the desired area. Cluster-based filters are better than mass-radius-based filters, consistently giving better pinpoint accuracy, especially in multi-cluster cases. Precision is measured based on the average distance of the top predictions to the ground truths. The cluster-based filter is needed for multi-cluster distributions to isolate the largest cluster, and it successfully copes with the multi-cluster case. Based on these results, the SCF was successful as a 3D descriptor representing local surface features. Following the success of this work, the next research step is its implementation as part of a craniofacial reconstruction framework.
Not only for forensics: the detection of landmarks is also helpful in simulating medical reconstruction and plastic surgery. Future research based on this work includes organ damage reconstruction with landmark-based template transfer; with a similar concept, research on implant reconstruction is also possible.

Declarations

Author contribution
All authors contributed equally as the main contributors of this paper. All authors read and approved the final paper.

Funding statement
Our thanks to Lembaga Pengelola Dana Pendidikan (LPDP) / Indonesia Endowment Fund for Education for making this research possible. LPDP grant number [201902210113890].

Conflict of interest
The authors declare no known conflict of financial interest or personal relationships that could have appeared to influence the work reported in this paper.

Additional information
Reprints and permission information are available at http://journal2.um.ac.id/index.php/keds.
Publisher's Note: Department of Electrical Engineering - Universitas Negeri Malang remains neutral with regard to jurisdictional claims and institutional affiliations.

Fig. 11. Filter based on the largest cluster

References
[1] P. T. Jayaprakash, "Conceptual transitions in methods of skull-photo superimposition that impact the reliability of identification: A review," Forensic Sci. Int., vol. 246, pp. 110–121, 2015.
[2] M. A. Lenza, A. A. De Carvalho, E. B. Lenza, M. G. Lenza, H. M. De Torres, and J. B. De Souza, "Radiographic evaluation of orthodontic treatment by means of four different cephalometric superimposition methods," Dental Press J. Orthod., vol. 20, no. 3, pp. 29–36, 2015.
[3] C. C. Snow, B. P. Gatliff, and K. R. McWilliams, "Reconstruction of facial features from the skull: An evaluation of its usefulness in forensic anthropology," Am. J. Phys. Anthropol., vol. 33, no. 2, pp. 221–227, 1970.
[4] H. Ullrich and C. N. Stephan, "Mikhail Mikhaylovich Gerasimov's Authentic Approach to Plastic Facial Reconstruction," Anthropologie, vol. LIV/2, pp. 97–107, 2016.
[5] P. Claes, D. Vandermeulen, S. De Greef, G. Willems, J. G. Clement, and P. Suetens, "Computerized craniofacial reconstruction: Conceptual framework and review," Forensic Sci. Int., vol. 201, no. 1–3, pp. 138–145, 2010.
[6] P. H. Suputra, A. D. Sensusiati, E. M. Yuniarno, M. H. Purnomo, and I. K. E. Purnama, "3D Laplacian Surface Deformation for Template Fitting on Craniofacial Reconstruction," in ICCCM'20: Proceedings of the 8th International Conference on Computer and Communications Management, 2020, pp. 27–32.
[7] M. de Buhan and C. Nardoni, "A facial reconstruction method based on new mesh deformation techniques," Forensic Sci. Res., vol. 3, no. 3, pp. 256–273, 2018.
[8] T. Gietzen et al., "A method for automatic forensic facial reconstruction based on dense statistics of soft tissue thickness," PLoS One, vol. 14, no. 1, pp. 1–19, 2019.
[9] P. Guyomarc'h, F. Santos, B. Dutailly, P. Desbarats, C. Bou, and H. Coqueugniot, "Three-dimensional computer-assisted craniometrics: A comparison of the uncertainty in measurement induced by surface reconstruction performed by two computer programs," Forensic Sci. Int., vol. 219, no. 1–3, pp. 221–227, 2012.
[10] L. Jiang, J. Zhang, B. Deng, H. Li, and L. Liu, "3D Face Reconstruction with Geometry Details from a Single Image," IEEE Trans. Image Process., vol. 27, no. 10, pp. 4756–4770, 2018.
[11] A. Lodha, M. Mehta, M. N. Patel, and S. K. Menon, "Facial soft tissue thickness database of Gujarati population for forensic craniofacial reconstruction," Egypt. J. Forensic Sci., vol. 6, no. 2, pp. 126–134, 2016.
[12] B. Rosario Campomanes-Álvarez et al., "Modeling Facial Soft Tissue Thickness for Automatic Skull-Face Overlay," IEEE Trans. Inf. Forensics Secur., vol. 10, no. 10, pp. 2057–2070, 2015.
[13] L. J. Short, B. Khambay, A. Ayoub, C. Erolin, C. Rynn, and C. Wilkinson, "Validation of a computer modelled forensic facial reconstruction technique using CT data from live subjects: a pilot study," Forensic Sci. Int., vol. 237, pp. 147.e1–147.e8, 2014.
[14] Y. Zhao et al., "Laplacian Musculoskeletal Deformation for Patient-Specific Simulation and Visualisation," in 2013 17th International Conference on Information Visualisation, Jul. 2013, pp. 505–510.
[15] E. M. Yuniarno, M. Hariadi, and M. H. Purnomo, "Point cloud registration for a non-deformable object using surface curvature features," J. Theor. Appl. Inf. Technol., vol. 51, no. 3, pp. 506–514, 2013.
[16] H. Jacinto, S. Valette, and R. Prost, "Multi-atlas automatic positioning of anatomical landmarks," J. Vis. Commun. Image Represent., vol. 50, pp. 167–177, 2018.
[17] S. M. Lee, H. P. Kim, K. Jeon, S. H. Lee, and J. K. Seo, "Automatic 3D cephalometric annotation system using shadowed 2D image-based machine learning," Phys. Med. Biol., vol. 64, no. 5, 2019.
[18] C. Lindner, C. W. Wang, C. T. Huang, C. H. Li, S. W. Chang, and T. F. Cootes, "Fully Automatic System for Accurate Localisation and Analysis of Cephalometric Landmarks in Lateral Cephalograms," Sci. Rep., vol. 6, pp. 1–10, 2016.
[19] H. Kanaan and A. Behrad, "Three-dimensional shape recognition and classification using local features of model views and sparse representation of shape descriptors," J. Inf. Process. Syst., vol. 16, no. 2, pp. 343–359, 2020.
[20] S. H. Kang, K. Jeon, S. H. Kang, and S. H. Lee, "3D cephalometric landmark detection by multiple stage deep reinforcement learning," Sci. Rep., vol. 11, no. 1, pp. 1–13, 2021.
[21] N. Chowdhury et al., "Concurrent segmentation of the prostate on MRI and CT via linked statistical shape models for radiotherapy planning," Med. Phys., vol. 39, no. 4, p. 2214, 2012.
[22] M. Bayome, J. Hyun Park, and Y. A. Kook, "New three-dimensional cephalometric analyses among adults with a skeletal class I pattern and normal occlusion," Korean J. Orthod., vol. 43, no. 2, pp. 62–73, 2013.
[23] A. H. Ross, D. E. Slice, and S. E. Williams, "Geometric Morphometric Tools for the Classification of Human Skulls. Research Report," pp. 1–59, 2010.
[24] M. A. Ulinuha, E. M. Yuniarno, S. M. S. Nugroho, and M. Hariadi, "Outer contour extraction of skull from CT scan images," IOP Conf. Ser. Mater. Sci. Eng., vol. 185, no. 1, 2017.
[25] X. F. Han, J. S. Jin, M. J. Wang, W. Jiang, L. Gao, and L. Xiao, "A review of algorithms for filtering the 3D point cloud," Signal Process. Image Commun., vol. 57, pp. 103–112, 2017.