HUNGARIAN JOURNAL OF INDUSTRY AND CHEMISTRY Vol. 47(1) pp. 33–39 (2019)
hjic.mk.uni-pannon.hu
DOI: 10.33927/hjic-2019-06

AUTOMATED LABELING PROCESS FOR UNKNOWN IMAGES IN AN OPEN-WORLD SCENARIO

DÁVID PAPP*1 AND GÁBOR SZŰCS1

1Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Magyar Tudósok krt. 2., H-1117 Budapest, HUNGARY

Most recognition systems presume a controlled, well-defined research setting, where all possible classes that can appear during a test are known a priori. This environment is referred to as the "closed-world" model, while the "open-world" model implies that unknown classes can be incorporated into a recognition algorithm during prediction. Therefore, recognition systems that operate in the real world have to deal with these unknown categories. Our objective was not only to detect data that originate from categories unseen during training, but to identify similarities between pieces of unknown data and then form new classes by automatically labeling them. Our Double Probability Model was extended by an image clustering algorithm, in which Kernel K-means was used. A new procedure, namely the Cluster Classification algorithm, is proposed for the detection of unknowns and automated labeling. These approaches facilitate the transition from open-set recognition to an open-world problem. The Fisher Vector (FV) was used for the mathematical representation of the images, and a Support Vector Machine was applied as a classifier. The measurement of similarity was based on the FV representations. Experiments were conducted on the Caltech101 and Caltech256 image datasets, and the Rand Index was evaluated over the unknown data. The results showed that our proposed Cluster Classification algorithm was able to maintain almost the same Rand Index even as the number of unknown categories increased.

Keywords: open-world problem, cluster classification, image classification, open-set recognition, image clustering

1. INTRODUCTION

In real-world scenarios, the size of the available dataset continues to increase, therefore, any machine learning algorithm that operates in such an environment has to be capable of coping with this growth. This is especially true in the case of image classification, because the growing test dataset can pose many difficulties, e.g. it is possible that some of the test images originate from categories that are unseen during training. Recognition systems should detect these unknown images and handle them in an appropriate way. In the rest of the paper, the terms "unknown class or category" represent classes or categories that are unseen during training, and "unknown image" denotes images that originate from unknown classes or categories. One way of handling detected unknown images is to measure their similarities and identify new categories. Subsequently, these new categories can be added to the set of known classes. Based on this, three modules are required to solve such problems in the real world, namely a recognition system equipped with an unknown detector, a labeling process and an incremental learning process.

*Correspondence: pappd@tmit.bme.hu

Let us assume that there are $K$ known classes ($C_1, C_2, \dots, C_K$) and $U$ unknown classes in the test set at any given moment, where $S_K$ and $S_U$ denote the sets of known and unknown classes, respectively. A few distinguishable cases can be identified depending on the value of $U$:

1. $U = 0$,
2. $U = 1$,
3. $U > 1$.
Furthermore, a few more cases depend on the amount and type of available information concerning $S_U$:

(A) Training images,
(B) Set of attributes,
(C) Number of unknown categories ($U$),
(D) Nothing.

The cases that include 1 or A (e.g. 1A, 2A, 1B, 1C) reduce to general multiclass classification, because all categories are known a priori and positive-negative samples are available for each category during training. When $U = 1$, the task is only to identify the unknown images, because they all originate from the same category and, therefore, the similarity measurement is unnecessary. In this paper, the situation when $U > 1$ is considered. As has been mentioned, 3A represents the traditional multiclass classification. 3B+3C is referred to as transfer learning or zero-shot learning [1], whereas according to the literature the case of 3C+3D is known as open-set recognition [2, 3] or the open-world problem [4]. The former refers to the detection of images that originate from unknown classes, while the latter includes the detection of unknown images and a labeling process to identify new classes, followed by the incremental learning of these new categories.

Our goal was to tackle the open-world problem as well as to develop an algorithm that is able to detect the unknown images and then introduce new classes by automatically labeling the unknown data using unsupervised learning. Previously, an algorithm referred to as the Double Probability Model (DPM) [5] was proposed, which is suitable as an unknown detector in an open-set environment.

There are several works that use a variant of the Support Vector Machine (SVM) to solve the unknown detection problem, such as the Support Vector Data Description [6], the One-class SVM [7, 8], the Reject Option SVM (RO-SVM) [9] and the novel Weibull-calibrated SVM (W-SVM) [3]. The latter was developed to operate under the Compact Abating Probability model, where the probability of class membership decreases (abates) as points move from known data towards unknown space. Scheirer et al. claim that the W-SVM outperforms their previous solutions, namely the 1-vs-Set Machine Training algorithm [2] and the Pi-SVM [10]. On the other hand, it was shown that the DPM outperforms the W-SVM [5], therefore, in this paper the DPM was used for unknown detection. Bendale and Boult defined open world recognition and presented the Nearest Non-Outlier algorithm in [4], which adds object categories incrementally while detecting outliers and managing open space risk. They defined open world recognition in the form of three sequential steps: a multiclass open-set recognition function with a novelty detector, a labeling process and an incremental learning algorithm. Although all of these steps should be automated, they presumed labels were obtained by human labeling. The main objective of our work and this paper is to propose an automated labeling process, the so-called Cluster Classification (CC).

In the next section, the DPM and image clustering methods are reviewed, subsequently, a baseline method is suggested for an open-world problem and finally our proposed algorithm, the CC, is presented. The third section contains experimental results and in the last section our conclusion is discussed.

2. Proposed open-world recognition system

2.1 Double Probability Model

The DPM [5] is based on the likelihood of a classifier and can be used with any kind of classifier that provides class membership probabilities for the images.
As a result, after training the classifier, it is capable of making predictions with reliability values (scores) for each class, i.e. decision vectors. The range of the scores depends on the type of classifier (sometimes it is from 0 to 1, but it can be over any range). Only one condition is required, namely that a larger score for class $C_i$ should represent a higher likelihood of being a member of class $C_i$.

In the training set or a validation set, the instances with corresponding scores are investigated in each class. The ground truth is known for this set, so the positive elements can be selected from each class. In order to calculate the conditional probability that a new instance belongs to class $C_i$ according to its score, the cumulative distribution function (CDF) of positive scores should be determined; furthermore, a reverse CDF of negative scores was created:

$F_{P_i}(x) = p\left(C_i \mid score < x\right)$,   (1)

$F_{N_i}(x) = p\left(\neg C_i \mid score > x\right)$,   (2)

where $P_i$ and $N_i$ denote the positive and negative elements, respectively. Note that the sum of these probabilities is not always equal to 1 (this is not a requirement).

The DPM was constructed based on the CDF and reverse CDF functions. During testing, the focus is on the likelihood of the occurrence of an unknown class compared with any of the known classes. Before the comparison, the probabilities of the known classes should be calculated. Scores ($score_i$ for class $C_i$) for a new instance are obtained as outputs from the original classifier, and based on them the probability of class $C_i$ occurring can be expressed as

$P_{C_i} = F_{P_i}(score_i) \prod_{j=1,\, j \neq i}^{K} F_{N_j}(score_j)$.   (3)

An expression for the probability of class $C_{K+1}$ is

$P_{C_{K+1}} = \prod_{j=1}^{K} F_{N_j}(score_j)$.   (4)

If the probability of being a member of class $C_{K+1}$ is higher than for any other (known) class, then the new instance will be a member of the unknown class. Otherwise, the prediction is based on the original classifier, i.e. the class with the largest score will be selected. The decision with regard to the prediction of test instance $j$ is formalized as

$d_j = \begin{cases} C_{K+1} & \text{if } P_{C_{K+1}} > \max_i \{P_{C_i}\} \\ \arg\max_i \{score_i\} & \text{otherwise.} \end{cases}$   (5)

At this point, the algorithm is able to decide whether a piece of test data originates from an unknown category. Should it originate from a known category, then its category can be determined based on the output of the classifier.
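To make the decision rule above concrete, the following minimal sketch implements Eqs. (1)-(5) with empirical CDFs estimated from a validation set. It is our own illustration, not the authors' code; the class name, the NumPy-based layout (one row of scores per instance, one column per class) and the tie handling are assumptions.

```python
import numpy as np

class DoubleProbabilityModel:
    """Minimal sketch of the DPM decision rule (Eqs. 1-5).

    Empirical CDFs are estimated per class from a validation set whose
    ground truth is known; the scores may come from any classifier.
    """

    def fit(self, scores, labels):
        # scores: (n, K) array of classifier reliability values
        # labels: (n,) ground-truth class indices in 0..K-1
        self.K = scores.shape[1]
        self.pos = [np.sort(scores[labels == i, i]) for i in range(self.K)]
        self.neg = [np.sort(scores[labels != i, i]) for i in range(self.K)]
        return self

    def _f_pos(self, i, x):
        # Empirical F_Pi(x): fraction of positive scores of class i below x
        return np.searchsorted(self.pos[i], x) / max(len(self.pos[i]), 1)

    def _f_neg(self, i, x):
        # Empirical reverse CDF F_Ni(x): fraction of negative scores above x
        n = max(len(self.neg[i]), 1)
        return 1.0 - np.searchsorted(self.neg[i], x, side="right") / n

    def predict(self, s):
        # s: (K,) decision vector (scores) of one test instance
        fn = np.array([self._f_neg(j, s[j]) for j in range(self.K)])
        p_unknown = fn.prod()                                     # Eq. (4)
        p_known = np.array([self._f_pos(i, s[i]) * np.delete(fn, i).prod()
                            for i in range(self.K)])              # Eq. (3)
        if p_unknown > p_known.max():                             # Eq. (5)
            return self.K         # index K encodes the unknown class C_{K+1}
        return int(np.argmax(s))
```

In such a setup, fit would be called on validation scores and predict on each test decision vector, with the returned index K read as "unknown".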
2.2 Unknown image clustering

The image representations were created according to the Bag-of-Words [11, 12] model. Based on its visual content, each image was represented by a single high-dimensional vector. In order to create these high-level descriptors, the local attributes of the images were investigated by calculating the low-level Scale Invariant Feature Transform (SIFT) [13] descriptor. Next, the Gaussian Mixture Model (GMM) [14–16] was used to define the visual code words, and Fisher Vectors [17, 18] were used to encode the low-level descriptors into high-level descriptors based on the visual code words. The Fisher Vectors were the final representations (image descriptors) of the images and were used as the input data for the clustering algorithm. After the final clusters of Fisher Vectors were formed, the image clusters could be produced by substituting the Fisher Vectors for the corresponding images.

The basis of our clustering approach is the well-known K-means clustering algorithm [19], which requires two important inputs, namely the initial cluster centers and the number of clusters. The K-means clustering algorithm aims to minimize the sum of squared distances from all points to their cluster centers:

$E = \min\left( \sum_{l=1}^{k} \sum_{x_i \in C_l} \| x_i - z_l \|^2 \right)$,   (6)

where $k$ denotes the number of clusters, $x_i$ represents a member of cluster $C_l$ and $z_l$ stands for its center. However, the Fisher Vector consists of 65,791 dimensions, and the basic K-means clustering algorithm performs less efficiently when the clusters are non-linearly separable or the data contains arbitrarily shaped clusters of different densities. Therefore, an upgraded version of the K-means clustering algorithm, referred to as Kernel K-means [20–22], was applied in the recognition system. The objective function of Kernel K-means is still to minimize the sum of squared distances, but it uses the kernel trick to transform the data points into an infinite feature space, $x_i \rightarrow \vartheta(x_i)$, as can be seen in

$E = \min\left( \sum_{l=1}^{k} \sum_{x_i \in C_l} \left\| \vartheta(x_i) - \frac{\sum_{x_j \in C_l} \vartheta(x_j)}{N_l} \right\|^2 \right)$,   (7)

where $N_l$ denotes the number of images in cluster $C_l$. The trick here is that explicit calculations in the feature space are never required, since the transformed data points only appear as part of inner products. Therefore, they can be substituted for their kernel representatives (the Gaussian kernel was implemented here).

In order to reduce the randomness of the final clusters, the PlusPlus cluster center initialization algorithm proposed by Arthur and Vassilvitskii [23] was used before the iterative steps. This approach aims to spread out the initial cluster centers and accelerate convergence. The first cluster center is randomly selected from the data points; after that, each subsequent cluster center is chosen from the data points with a probability proportional to its squared distance from the closest existing cluster center.

In the following subsections, the usage of the presented methods is discussed.

2.3 Baseline method

In this section, a baseline method for open-world recognition is presented. First, at training time, the classifier is trained on the training data with $K$ known classes; then, at testing time, classification of the test data ($K + U$ classes) is performed. The DPM is applied to the output of the classifier to detect the unknown images $U_{DPM}$:

$U_{DPM} = \bigcup_{j=1}^{N_U} \{ I_j \mid d_j = C_{K+1} \}$   (8)

where $I_j$ represents test instance $j$, $N_U$ denotes the number of test instances in the test data, $d_j$ stands for the decision of the DPM, and $\bigcup \{\dots\}$ is the union operation.

Now, let us assume that information concerning $U$ was provided (as in case 3C) and that $U$ was used as the number of clusters. Kernel K-means with PlusPlus cluster center initialization (KK++) was performed on $U_{DPM}$ with $k = U$ clusters (the input parameter of KK++), and then the appropriate labels were assigned to the unknown images:

$L_j = C_{K+i}, \quad i = \mathrm{out}\left(\mathrm{KK^{++}}\right), \qquad j = 1 \dots M, \; i = 1 \dots U$   (9)

where $M$ represents the number of unknown images; $L_j$ denotes the label of unknown image $U_{DPM_j}$ and $C_{K+i}$ the identity of cluster $i$. This concludes the baseline method for automated labeling. At this point, the classifier can be retrained based on the previously known and new labels, and then the new test data can be classified.
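As an illustration of this baseline labeling step, the sketch below implements Kernel K-means with a ++-style seeding directly on a precomputed Gaussian kernel matrix; the feature-space distance of a point to a cluster mean is expanded through kernel values only, so the mapping $\vartheta$ never has to be evaluated. The function names, the gamma default and the iteration cap are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(X, gamma=1e-3):
    # Pairwise Gaussian kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_kmeans(K, k, n_iter=50, seed=None):
    """Kernel K-means (Eq. 7) with ++-style seeding on the kernel matrix K.

    A sketch: production code would add restarts and empty-cluster handling.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    diag = np.diag(K)

    # ++-style seeding: the squared feature-space distance of point i to a
    # single seed point c expands to K_ii - 2*K_ic + K_cc.
    centers = [int(rng.integers(n))]
    for _ in range(k - 1):
        d2 = np.min([diag - 2.0 * K[:, c] + K[c, c] for c in centers], axis=0)
        centers.append(int(rng.choice(n, p=d2 / d2.sum())))
    labels = np.argmin([diag - 2.0 * K[:, c] + K[c, c] for c in centers], axis=0)

    for _ in range(n_iter):
        dist = np.empty((n, k))
        for c in range(k):
            mask = labels == c
            m = max(int(mask.sum()), 1)
            # Distance to the implicit mean of cluster c, via kernel values only
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Applied to the Fisher Vectors of the detected unknown images, something like `labels = kernel_kmeans(gaussian_kernel(fv_unknown), k=U)` would produce the cluster identities used in Eq. (9), where `fv_unknown` is a hypothetical array holding the $U_{DPM}$ representations row-wise.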
2.4 Cluster Classification

In this section, our proposed CC approach is presented, which is suitable for unknown detection and automated labeling. This algorithm contains extended training and testing phases. At training time, a classifier is trained on the training data with $K$ known classes, then pseudo-clusters are also created based on the $K$ known categories. This means that the ground-truth class labels, rather than a clustering algorithm, determine the final clusters, i.e. each category is a cluster. Subsequently, the images are substituted for their Fisher Vector representations and the cluster centers are calculated, which will be used in the testing phase.

Let us assume that $T$ categories are found in the testing phase, and that $T > K$. The test data is classified into the $K$ known categories and the DPM is applied based on the decision vectors to detect the unknown images $U_{DPM}$. The next step is to form clusters using the Kernel K-means clustering algorithm, starting from the $K$ cluster centers that were calculated at training time from the pseudo-clusters. Afterwards, the remaining $T - K$ cluster centers are determined following the PlusPlus initialization protocol. Furthermore, the training and test datasets were used together as the input data. Basically, with these modifications it was possible to guide the clustering algorithm and, therefore, create more accurate clusters.

The following step of the testing phase is to classify the clusters $\{C_i\}$ by weighted majority voting of the members of each cluster. The vote is based on the class membership probabilities ($P_{C_i}$; $i = 1 \dots K + 1$) calculated in Eqs. (3)-(4). As was seen in Section 1, the definition of problem 3C assumes that the number of unknown categories exceeds 1. Nonetheless, the output of the DPM only yields $K + 1$ alternatives instead of $T$. In spite of this, the classification of clusters that depends on $\{P_{C_i}\}$ can increase the number of alternatives to $T$, as will be seen later. In Section 1, a differentiation was made between known and unknown images, and now this differentiation is broken down even further. The training data contains only known images, because each of them belongs to one of the set of known categories ($S_K$). From now on, the union of known images of the training data will be denoted by $K_{GT}$, as can be seen in:

$K_{GT} = \bigcup_{j=1}^{N_K} \{ I_j \}$   (10)

where $N_K$ stands for the number of images in the training data. On the other hand, the test data contains both known and unknown images. Furthermore, based on the output of the DPM, the test data can be divided into two different subsets, namely predicted known images ($K_{DPM}$) and predicted unknown images ($U_{DPM}$), as can be seen in Eqs. (11) and (8), respectively:

$K_{DPM} = \bigcup_{j=1}^{N_U} \{ I_j \mid d_j \neq C_{K+1} \}$   (11)
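Before turning to the weighting, the training-time seeding described above can be sketched as follows. This is a hedged illustration with hypothetical helper names, assuming the Fisher Vectors are stored row-wise in NumPy arrays; the remaining $T - K$ centers would then be added by the ++ seeding from the previous sketch before running Kernel K-means on the combined training and test data.

```python
import numpy as np

def pseudo_cluster_centers(X_train, y_train, K):
    # One pseudo-cluster per known class: the ground-truth labels replace a
    # clustering algorithm, so each center is a per-class mean Fisher Vector.
    return np.stack([X_train[y_train == c].mean(axis=0) for c in range(K)])

def initial_assignment(X_all, centers):
    # Seed the testing-phase clustering: every image (training + test) starts
    # in the pseudo-cluster of its nearest training-time center.
    d2 = ((X_all ** 2).sum(axis=1)[:, None]
          + (centers ** 2).sum(axis=1)[None, :]
          - 2.0 * X_all @ centers.T)
    return d2.argmin(axis=1)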
The weight of the images can be calculated based on cluster coherence. The coherence of a cluster can be determined by comparing the number of known images to the number of predicted unknown images inside that given cluster. It should be noted that known images inside the clusters originate either from $K_{GT}$ or $K_{DPM}$, while the predicted unknown images are all part of $U_{DPM}$. If the number of known images equals or exceeds the number of unknown images, the cluster is said to exhibit "known coherence" (KC), and "unknown coherence" (UC) otherwise, as described in:

$C^{coh} = \begin{cases} KC & \text{if } \left\| K_{GT} \cup K_{DPM} \right\| \geq \left\| U_{DPM} \right\| \\ UC & \text{if } \left\| K_{GT} \cup K_{DPM} \right\| < \left\| U_{DPM} \right\| \end{cases}$   (12)

where $\|X\|$ represents the number of elements in $X$, and the superscript $coh$ indicates the coherence of cluster $C$. The weights can be calculated as described in Eqs. (13) and (14). Intuitively, if an image is known and located inside a UC cluster, then it is "punished" by assigning a lower weight to it; and vice versa, an unknown image is given a lower weight inside a KC cluster. Moreover, a larger difference between the numbers of known and unknown images implies a more severe punishment with regard to the value of the weights.

$w_j^{KC} = \begin{cases} 1 + \frac{\#known - \#unknown}{\#known + \#unknown} & \text{if } I_j \notin U_{DPM} \\ 1 - \frac{\#known - \#unknown}{\#known + \#unknown} & \text{if } I_j \in U_{DPM} \end{cases}$   (13)

$w_j^{UC} = \begin{cases} 1 + \frac{\#known - \#unknown}{\#known + \#unknown} & \text{if } I_j \in U_{DPM} \\ 1 - \frac{\#known - \#unknown}{\#known + \#unknown} & \text{if } I_j \notin U_{DPM} \end{cases}$   (14)

Thereafter, the final decision vector of cluster $C_i$ can be calculated as:

$V_i = \frac{1}{N_i} \sum_{j=1}^{N_i} w_j \cdot d_j$   (15)

where $N_i$ denotes the number of images in cluster $C_i$, $w_j$ represents the weight and $d_j$ stands for the decision vector ($\{P_{C_i}\}$) of image $j$. Note that $d_j$ possesses $K + 1$ elements (+1 from the DPM), therefore, vector $V_i$ also possesses $K + 1$ elements. Consequently, the element with the maximum value of $V_i$ determines the category of cluster $C_i$. The classification of cluster $C_i$ is formalized in:

$D_i = \begin{cases} \text{new class} & \text{if } V_{K+1} = \max_j \{V_j\} \\ \arg\max_j \{V_j\} & \text{otherwise} \end{cases}$   (16)

The results of the classification of the clusters can be considered as a labeling proposal, i.e. label each image inside cluster $C_i$ according to $D_i$. When decision $D_i$ for cluster $C_i$ is that it is part of a known category, then each image inside $C_i$ is labeled with that category. On the other hand, when $D_i$ is a new class, a new category is created and each image in $C_i$ is labeled with the new category. Basically, the CC algorithm follows this labeling proposal.
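A compact sketch of this cluster classification step, for one cluster, might look as follows. It follows the reward/punish intuition stated in the text for Eqs. (12)-(14), i.e. members that agree with the cluster's coherence are up-weighted and dissenting members down-weighted; the function name and input layout are our own assumptions.

```python
import numpy as np

def classify_cluster(decisions, is_unknown):
    """Weighted majority vote for one cluster (Eqs. 12-16).

    decisions:  (N_i, K+1) DPM probability vectors of the cluster members
    is_unknown: (N_i,) True where the DPM flagged the image as unknown
    Returns "new class" or the index of a known class.
    """
    n_unknown = int(is_unknown.sum())
    n_known = len(is_unknown) - n_unknown
    delta = abs(n_known - n_unknown) / (n_known + n_unknown)

    # Coherent members (known images in a KC cluster, unknown images in a UC
    # cluster) are rewarded with weight 1 + delta; dissenters get 1 - delta.
    agree = ~is_unknown if n_known >= n_unknown else is_unknown
    w = np.where(agree, 1.0 + delta, 1.0 - delta)

    V = (w[:, None] * decisions).mean(axis=0)      # Eq. (15)
    K = decisions.shape[1] - 1                     # last element is C_{K+1}
    if V[K] == V.max():                            # Eq. (16)
        return "new class"
    return int(V.argmax())
```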
3. Experimental Results

In order to measure the efficiency of the labeling process, experiments were conducted on the Caltech101 [24] and Caltech256 [25] datasets. Example images from these datasets are shown in Fig. 1. The former consists of 101 categories and 8,677 images, while the latter is composed of 30,607 images from 256 different classes. To create an open-world environment, 50 known and 50 unknown categories were randomly selected from the Caltech101 dataset, and 100 of each from the Caltech256 dataset. These random selections were repeated 5 times in order to average the results of each experiment and thereby obtain a more comprehensive overview of the efficiency of the CC algorithm with regard to these datasets.

Figure 1: Example images from the Caltech101 and Caltech256 datasets. The airplane, butterfly and windmill categories are represented by the left, middle and right columns, respectively.

All of the known categories were available from the beginning of the tests, but the unknown categories were added incrementally over 10 steps, and in each step the Rand Index (RI),

$RI = \frac{TP + TN}{TP + FP + TN + FN}$,   (17)

was evaluated over the unknown images, where TP, TN, FP and FN denote the number of true positive, true negative, false positive and false negative decisions, respectively. The RI measures the similarity between the ground truth and the predicted labels of the unknown images, in other words, the percentage of correct decisions (a short sketch of this pair-counting computation is given at the end of this section).

Two methods were assessed and compared, namely the baseline method (DPM+KK) and the CC, which were discussed in Sections 2.3 and 2.4, respectively. Both procedures used Fisher Vectors to mathematically represent the images, encoded from 128-dimensional SIFT descriptors using a GMM consisting of 256 code words; an SVM equipped with a radial basis function (RBF) kernel was applied as the classifier. The results can be seen in Fig. 2 and Table 1.

Figure 2: Averaged results of the 5 different test datasets randomly selected from each of the Caltech101 and Caltech256 datasets. The RI is plotted against the number of unknown categories. The diagrams compare the labeling performance of the DPM with Kernel K-means (DPM+KK) against that of the CC.

Table 1: Summary of the results obtained from the test data with the baseline (DPM+KK) and CC methods using the Caltech101 and Caltech256 datasets. The baseline column contains the evaluated RI values, which depend on the number of unknown categories (un. cat.), and the CC column presents the improvement achieved by CC as a percentage.

        Caltech101                       Caltech256
un. cat.  baseline  CC (%)      un. cat.  baseline  CC (%)
   5       0.629      6            10      0.514      1
  10       0.594     13            20      0.489      9
  15       0.567     15            30      0.484      6
  20       0.561     16            40      0.478      9
  25       0.550     17            50      0.452     13
  30       0.514     28            60      0.448     13
  35       0.536     18            70      0.433     19
  40       0.522     23            80      0.426     17
  45       0.505     24            90      0.412     22
  50       0.497     25           100      0.397     21

The first diagram shows the results obtained from the Caltech101 dataset and the second from the Caltech256 dataset. The DPM with Kernel K-means and the CC are represented by dashed and solid lines, respectively. In both experiments, the CC algorithm yielded a higher RI, although during the first step the difference between the two methods was minimal. It can be seen that the RI of DPM+KK starts to decrease as the number of unknown categories increases, while that of the CC remains by and large unchanged.
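For reference, the RI of Eq. (17) over the unknown images can be computed by pair counting, as in the following minimal sketch; this is our own illustration of the metric, not the authors' evaluation script.

```python
from itertools import combinations

def rand_index(truth, pred):
    """Pair-counting Rand Index (Eq. 17): the fraction of image pairs on
    which the predicted labeling agrees with the ground truth."""
    agree = total = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        agree += same_truth == same_pred   # counts both TP and TN pairs
        total += 1
    return agree / total

# e.g. rand_index([0, 0, 1, 1], [1, 1, 0, 2]) == 5/6
```

Equivalent functionality is available off the shelf as sklearn.metrics.rand_score in recent scikit-learn versions.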
4. Conclusion

In this paper, the problem of open world recognition was reviewed and the possible cases were differentiated based on our prior knowledge of and actual information about the test data and, thus, the unknown space. The DPM and the Kernel K-means algorithm were also reviewed in brief, followed by the presentation of two approaches which perform multiclass classification, automatically detect unknown images and propose a labeling for them. The first method is a baseline technique in which the DPM was applied, followed sequentially by Kernel K-means with the PlusPlus cluster center initialization algorithm. In contrast, our proposed CC is a complex method that combines the unknown detector and the clustering algorithm and seeks to determine the identity of the formed clusters, while refining the decisions made by the classifier and the unknown detector. The CC algorithm constructs a specific weight system to reward or punish images which were placed into a category that is presumably unsuitable for their estimated identity. Multiple experiments were conducted on two large datasets (Caltech101 and Caltech256), and the RI was evaluated with regard to the unknown images. The results showed that the CC outperformed the baseline method and was able to maintain almost the same RI even as the number of unknown categories increased.

Acknowledgement

The research was supported by the ÚNKP-18-3 New National Excellence Program of the Ministry of Human Capacities.

REFERENCES

[1] Lampert, C. H.; Nickisch, H.; Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 951–958. ISBN: 978-1-4244-3992-8 DOI: 10.1109/CVPR.2009.5206594

[2] Scheirer, W. J.; de Rezende Rocha, A.; Sapkota, A.; Boult, T. E.: Toward open set recognition, IEEE T. Pattern Anal., 2013, 35(7), 1757–1772. DOI: 10.1109/TPAMI.2012.256

[3] Scheirer, W. J.; Jain, L. P.; Boult, T. E.: Probability models for open set recognition, IEEE T. Pattern Anal., 2014, 36(11), 2317–2324. DOI: 10.1109/TPAMI.2014.2321392

[4] Bendale, A.; Boult, T.: Towards open world recognition, 2015 IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1893–1902. ISBN: 978-1-4673-6964-0 DOI: 10.1109/CVPR.2015.7298799

[5] Papp, D.; Szűcs, G.: Double probability model for open set problem at image classification, INFORMATICA, 2018, 29(2), 353–369. DOI: 10.15388/Informatica.2018.171

[6] Tax, D. M.; Duin, R. P.: Support vector data description, Machine Learning, 2004, 54(1), 45–66. DOI: 10.1023/B:MACH.0000008084.60811.49

[7] Cevikalp, H.; Triggs, B.: Efficient object detection using cascades of nearest convex model classifiers, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 3138–3145. ISBN: 978-1-4673-1226-4 DOI: 10.1109/CVPR.2012.6248047

[8] Schölkopf, B.; Platt, J. C.; Shawe-Taylor, J.; Smola, A. J.; Williamson, R. C.: Estimating the support of a high-dimensional distribution, Neural Comput., 2001, 13(7), 1443–1471. DOI: 10.1162/089976601750264965

[9] Zhang, R.; Metaxas, D. N.: RO-SVM: Support vector machine with reject option for image categorization, In: Chantler, M.; Fisher, B.; Trucco, M. (Eds.): Proceedings of the British Machine Vision Conference (BMVA Press, UK), 2006, pp. 123.1–123.10. ISBN: 1-901725-32-4 DOI: 10.5244/C.20.123

[10] Jain, L. P.; Scheirer, W. J.; Boult, T. E.: Multi-class open set recognition using probability of inclusion, In: Fleet, D.; Pajdla, T.; Schiele, B.; Tuytelaars, T. (Eds.): Computer Vision – ECCV 2014, Lecture Notes in Computer Science, 8691 (Springer, Cham, Switzerland), 2014, pp. 393–409. ISBN: 978-3-319-10577-2 DOI: 10.1007/978-3-319-10578-9_26

[11] Fei-Fei, L.; Fergus, R.; Torralba, A.: Recognizing and learning object categories, 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Short course, 2007. http://people.csail.mit.edu/torralba/shortCourseRLOC/

[12] Lazebnik, S.; Schmid, C.; Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, 2006 IEEE Conference on Computer Vision and Pattern Recognition, 2006, 2, pp. 2169–2178. ISBN: 0-7695-2597-0 DOI: 10.1109/CVPR.2006.68

[13] Lowe, D. G.: Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vision, 2004, 60(2), 91–110. DOI: 10.1023/B:VISI.0000029664.99615.94

[14] Reynolds, D. A.: Gaussian mixture models, In: Li, S. Z. (Ed.): Encyclopedia of Biometric Recognition, 1st ed. (Springer, Boston, USA), 2009, pp. 659–663. ISBN: 978-0-387-73003-5 DOI: 10.1007/978-1-4899-7488-4_196

[15] Tomasi, C.: Estimating Gaussian mixture densities with EM: A tutorial (Tech. rep., Duke University), 2004. https://www2.cs.duke.edu/courses/spring04/cps196.1/handouts/EM/tomasiEM.pdf

[16] Browne, R. P.; McNicholas, P. D.; Sparling, M. D.: Model-based learning using a mixture of mixtures of Gaussian and uniform distributions, IEEE T. Pattern Anal., 2012, 34(4), 814–817. DOI: 10.1109/TPAMI.2011.199
[17] Perronnin, F.; Dance, C.: Fisher kernel on visual vocabularies for image categorization, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8. ISBN: 1-4244-1179-3 DOI: 10.1109/CVPR.2007.383266

[18] Perronnin, F.; Sánchez, J.; Mensink, T.: Improving the Fisher kernel for large-scale image classification, In: Daniilidis, K.; Maragos, P.; Paragios, N. (Eds.): Computer Vision – ECCV 2010, Lecture Notes in Computer Science, 6314 (Springer, Berlin, Germany), 2010, pp. 143–156. ISBN: 978-3-642-15560-4 DOI: 10.1007/978-3-642-15561-1_11

[19] MacQueen, J.: Some methods for classification and analysis of multivariate observations, In: Le Cam, L. M.; Neyman, J. (Eds.): Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1 (University of California Press, Berkeley, USA), 1967, pp. 281–297.

[20] Chitta, R.; Jin, R.; Havens, T. C.; Jain, A. K.: Approximate kernel k-means: Solution to large scale kernel clustering, In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, New York, USA), 2011, pp. 895–903. ISBN: 978-1-4503-0813-7 DOI: 10.1145/2020408.2020558

[21] Dhillon, I. S.; Guan, Y.; Kulis, B.: Kernel k-means: spectral clustering and normalized cuts, In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, New York, USA), 2004, pp. 551–556. ISBN: 1-58113-888-1 DOI: 10.1145/1014052.1014118

[22] Papp, D.; Szűcs, G.: MMKK++ algorithm for clustering heterogeneous images into an unknown number of clusters, ELCVIA: Electronic Letters on Computer Vision and Image Analysis, 2017, 16(3), 30–45. DOI: 10.5565/rev/elcvia.1054

[23] Arthur, D.; Vassilvitskii, S.: k-means++: The advantages of careful seeding, In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (Society for Industrial and Applied Mathematics, Philadelphia, USA), 2007, pp. 1027–1035. ISBN: 978-0-898716-24-5
[24] Fei-Fei, L.; Fergus, R.; Perona, P.: Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories, Comput. Vis. Image Und., 2007, 106(1), 59–70. DOI: 10.1016/j.cviu.2005.09.012

[25] Griffin, G.; Holub, A.; Perona, P.: The Caltech 256, Caltech, Tech. Rep., 2012.