BRAIN. Broad Research in Artificial Intelligence and Neuroscience,
ISSN 2067-3957, Volume 1, July 2010,
Special Issue on Complexity in Sciences and Artificial Intelligence,
Eds. B. Iantovics, D. Rǎdoiu, M. Mǎruşteri and M. Dehmer

ON CONVERTING SOFTWARE SYSTEMS TO OBJECT
ORIENTED ARCHITECTURES

Istvan Gergely Czibula and Gabriela Czibula

Abstract. Object-oriented concepts facilitate the reuse of existing software.
Transforming procedural programs into object-oriented architectures is
therefore an important way to enhance the reuse of procedural programs.
Moreover, it would be useful to assist software developers with automatic
methods in transforming procedural code into equivalent object-oriented code.
In this paper we introduce a hierarchical clustering algorithm that can be
used to assist software developers in the process of transforming procedural
code into an object-oriented architecture.

Keywords: software engineering, procedural systems, object-oriented systems,
machine learning, clustering

2000 Mathematics Subject Classification: 68N30, 62H30.

I.G. Czibula, G. Czibula - On converting software systems to object
oriented architectures

1. Introduction

It is well known that software evolution is an inevitable process for software
systems. Repeated changes alter the structure of a system, rapidly degrading
it and making the system “legacy”. Reengineering seems to be a promising
approach to upgrade these systems according to the latest technologies [1].

Object-oriented concepts facilitate the reuse of existing software.
Transforming procedural programs into object-oriented architectures is
therefore an important way to enhance the reuse of procedural programs.

The evolution of legacy software systems often requires rewriting the system
in an object-oriented programming language. This activity, especially for
large software systems, is difficult and time-consuming.

It would therefore be useful to assist software developers with automatic
methods in transforming procedural code into equivalent object-oriented
code.

Unsupervised classification, or clustering, as it is more often called, is a
data mining activity that aims to differentiate groups (classes or clusters)
inside a given set of objects [2]. It is considered the most important
unsupervised learning problem.

The resulting subsets or groups, distinct and non-empty, are to be built so
that the objects within each cluster are more closely related to one another
than objects assigned to different clusters. Central to the clustering process is
the notion of degree of similarity (or dissimilarity) between the objects.

We have previously introduced in [3] a clustering approach for transforming
procedural systems into object-oriented ones. For this purpose, a partitional
clustering algorithm, named kOOS, was introduced. In this paper we aim at
extending the approach from [3] by introducing a hierarchical agglomerative
clustering algorithm that can be used for re-grouping the entities from an
existing software system written in a procedural language. The goal is to
obtain a partition of a software system, in which each cluster would correspond
to an application class from the equivalent object-oriented system.

The rest of the paper is structured as follows. Section 2 describes the
clustering approach, previously introduced in [3], for assisting developers
in transforming software systems written in procedural programming languages
into object-oriented systems. Section 3 introduces a hierarchical clustering
algorithm for transforming procedural systems into object-oriented ones.
Section 4 outlines some conclusions of the paper and indicates further
research directions.

2. A clustering approach for OO transformation. Background

We have previously introduced in [3] a clustering approach for transforming
procedural systems into object-oriented ones. In the following we will briefly
describe the proposed approach.

Let S = {s1, s2, ..., sn} be a non-object-oriented software system, where
each si, 1 ≤ i ≤ n, can be a subprogram (function or procedure), a global
variable, or a user-defined type.

In order to transform S into an object-oriented system, we have proposed
an approach consisting of two steps:

1. Data collection - The existing software system is analyzed in order
to extract the relevant entities from it: subprograms, local and global
variables, subprogram parameters, subprogram invocations, data types,
and the modules, source files or other structures used for organizing
the procedural code.

2. Grouping - The set of entities extracted in the previous step is grouped
into clusters. The goal of this step is to obtain clusters corresponding to
the application classes of the software system S.
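
The result of the data collection step can be pictured as a set of typed
records, one per entity of S. The sketch below is only an illustration: the
Entity record, its field names and the small stack program are assumptions
made here and are not part of the approach described in [3].

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Kind(Enum):
    SUBPROGRAM = "subprogram"
    GLOBAL_VARIABLE = "global variable"
    USER_TYPE = "user-defined type"


@dataclass(frozen=True)
class Entity:
    """One element s_i of the system S (hypothetical record layout)."""
    name: str
    kind: Kind
    module: str                          # source file or module of definition
    param_types: Tuple[str, ...] = ()    # meaningful for subprograms only
    return_type: Optional[str] = None    # meaningful for functions only
    uses: Tuple[str, ...] = ()           # names of entities used in the body


# A made-up procedural system S with n = 4 entities.
S = [
    Entity("Stack", Kind.USER_TYPE, "stack.pas"),
    Entity("top", Kind.GLOBAL_VARIABLE, "stack.pas"),
    Entity("push", Kind.SUBPROGRAM, "stack.pas",
           ("Stack", "Integer"), None, ("top",)),
    Entity("pop", Kind.SUBPROGRAM, "stack.pas",
           ("Stack",), "Integer", ("top",)),
]

# Group entity names by kind, as a quick check of what was collected.
by_kind = {k: [e.name for e in S if e.kind is k] for k in Kind}
```

The grouping step then operates on such a collection, regardless of how the
records were extracted from the source code.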

3. A Hierarchical Clustering Algorithm for OO
Transformation

In our clustering approach, the objects to be clustered are the entities from
the software system S, i.e., O = {s1, s2, ..., sn}. Our focus is to group similar
entities from S in order to obtain groups (clusters) that will represent classes
in the equivalent object-oriented version of the software system S.

In order to express the dissimilarity degree between the entities from the
software system S, we will use an adapted generic cohesion measure [4].
Consequently, the distance d(si, sj) between two entities si and sj is
expressed as in Equation (1).

\[
d(s_i, s_j) =
\begin{cases}
0 & \text{if } i = j \\
1 - \dfrac{|prop(s_i) \cap prop(s_j)|}{|prop(s_i)| + |prop(s_j)|} & \text{if } prop(s_i) \cap prop(s_j) \neq \emptyset \\
\infty & \text{otherwise}
\end{cases}
\tag{1}
\]

where, for a given entity e ∈ S, prop(e) defines a set of relevant properties of
e, expressed as follows.

• If e is a subprogram (procedure or function) then prop(e) consists of:
the subprogram itself, the source file or module where e is defined, the
parameter types of e, the return type of e if it is a function, and all
subprograms that invoke e.

• If e is a global variable then prop(e) consists of: the variable itself,
the source files or modules where the variable is defined, and all
subprograms that use e.

• If e is a user-defined type then prop(e) consists of: the type itself, all
subprograms that use e, all subprograms that have a parameter of type
e, and all functions that have e as return type.

Our distance, as defined in Equation (1), highlights the concept of cohesion
[5]. It is very likely that entities with low distances will be placed in the
same application class, and distant entities will belong to different application
classes.
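
The distance of Equation (1) can be sketched in a few lines of Python. This
is a minimal illustration, assuming that the property sets prop(e) have
already been computed and are given as sets of strings; the entity names and
property values below are invented for the example, and the denominator
follows Equation (1) as printed, |prop(si)| + |prop(sj)|.

```python
import math


def distance(i, j, prop):
    """Distance of Equation (1) between the entities named i and j.

    `prop` maps an entity name to its set of relevant properties
    (assumed to be precomputed; hypothetical representation).
    """
    if i == j:
        return 0.0
    common = prop[i] & prop[j]
    if not common:
        return math.inf                  # unrelated entities
    return 1.0 - len(common) / (len(prop[i]) + len(prop[j]))


# Invented property sets: push and pop live in the same module, share
# their types and are both invoked by main; log shares nothing with them.
prop = {
    "push": {"push", "stack.pas", "Stack", "Integer", "main"},
    "pop": {"pop", "stack.pas", "Stack", "Integer", "main"},
    "log": {"log", "util.pas", "String"},
}

d_push_pop = distance("push", "pop", prop)   # 4 shared properties: 1 - 4/10
d_push_log = distance("push", "log", prop)   # no shared property: infinity
```

In this example push and pop are at distance 0.6, while push and log are at
infinite distance because their property sets are disjoint.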

The proposed algorithm, HOOS, is based on the idea of hierarchical
agglomerative clustering and uses a heuristic for merging two clusters. We
use average link as the linkage metric; consequently, the distance
dist(k, k′) between two clusters k and k′ of a partition K is given in
Equation (2).

\[
dist(k, k') = \frac{1}{|k| \cdot |k'|} \cdot \sum_{e \in k,\, e' \in k'} d(e, e')
\tag{2}
\]
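
Equation (2) is the mean of the pairwise distances between the members of
the two clusters. The sketch below illustrates it on invented pairwise
distances; the entity names a, b, c, x and the helper d are assumptions made
for this example only.

```python
from itertools import product


def average_link(k1, k2, d):
    """Equation (2): mean of d(e, e') over all pairs e in k1, e' in k2."""
    total = sum(d(e1, e2) for e1, e2 in product(k1, k2))
    return total / (len(k1) * len(k2))


# Invented pairwise distances between four entities a, b, c, x.
pairwise = {
    frozenset({"a", "c"}): 0.6,
    frozenset({"b", "c"}): 0.8,
    frozenset({"a", "x"}): 0.7,
    frozenset({"b", "x"}): float("inf"),   # b and x share no property
}


def d(e1, e2):
    """Symmetric lookup in the table above (hypothetical helper)."""
    return 0.0 if e1 == e2 else pairwise[frozenset({e1, e2})]


finite = average_link({"a", "b"}, {"c"}, d)      # (0.6 + 0.8) / 2
infinite = average_link({"a", "b"}, {"x"}, d)    # contains an infinite term
```

Under a literal reading of Equations (1) and (2), a single pair of unrelated
entities makes the average infinite, so two such clusters can never be
merged under a finite threshold.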

The heuristic used in HOOS is that, at a given step, the two most similar
clusters (the pair of clusters with the smallest distance between them) are
merged only if the distance between them is less than or equal to a given
threshold, distMin. This means that the entities from the two clusters are
close enough to be placed in the same cluster (application class).

The main steps of the HOOS algorithm are:

• Each entity from the software system is put in its own cluster (singleton).

• The following steps are repeated until the partition of entities remains
unchanged (no more clusters can be selected for merging):

– select the two most similar clusters from the current partition, i.e.,
the pair of clusters that minimizes the distance from Equation (2).
Let us denote by dmin the distance between the most similar
clusters Ki and Kj;

– if dmin ≤ distMin (the given threshold), then clusters Ki and Kj
are merged; otherwise the partition remains unchanged.

In our approach we have chosen the value 1 for the threshold, because
distances greater than 1 are obtained only for unrelated entities (Equation
(1)). Each cluster of the resulting partition will represent an application
class in the equivalent object-oriented version of the software system S.
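
The steps above can be put together in a short Python sketch. It is an
illustration under stated assumptions rather than the authors'
implementation: the function names, the representation of prop as a
dictionary of string sets and the sample data are all invented here, and
the pair of closest clusters is found by exhaustive search.

```python
import math
from itertools import combinations, product


def distance(i, j, prop):
    """Equation (1), with prop given as a dict of property sets."""
    if i == j:
        return 0.0
    common = prop[i] & prop[j]
    if not common:
        return math.inf
    return 1.0 - len(common) / (len(prop[i]) + len(prop[j]))


def average_link(k1, k2, prop):
    """Equation (2): mean pairwise distance between two clusters."""
    total = sum(distance(a, b, prop) for a, b in product(k1, k2))
    return total / (len(k1) * len(k2))


def hoos(prop, dist_min=1.0):
    """Agglomerative clustering with a merge threshold (sketch of HOOS)."""
    clusters = [frozenset({e}) for e in prop]        # one singleton each
    while len(clusters) > 1:
        # Select the pair of clusters with the smallest distance.
        (k1, k2), d_min = min(
            (((a, b), average_link(a, b, prop))
             for a, b in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        if d_min > dist_min:
            break                                    # partition unchanged
        clusters = [k for k in clusters if k not in (k1, k2)] + [k1 | k2]
    return clusters


# Invented property sets for two unrelated groups of subprograms.
prop = {
    "push": {"push", "stack.pas", "Stack", "Integer", "main"},
    "pop": {"pop", "stack.pas", "Stack", "Integer", "main"},
    "log": {"log", "util.pas", "String", "report"},
    "fmt": {"fmt", "util.pas", "String", "report"},
}

classes = hoos(prop)    # threshold 1, as chosen in the text
```

On this sample the sketch first merges push and pop, then log and fmt, and
stops because the two remaining clusters are at infinite distance; each of
the two resulting clusters is a candidate application class.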

To validate the proposed clustering algorithm experimentally, we considered
a simple code example written in Pascal, for which the algorithm
successfully provided the equivalent object-oriented code.

4. Conclusions

We have presented in this paper a hierarchical agglomerative clustering
algorithm (HOOS) that can be used for assisting software developers in
transforming procedural software systems into object-oriented ones.

In comparison with existing approaches for transforming procedural systems
into object-oriented architectures [6, 1, 7], the approach proposed in this
paper offers an automatic method for the transformation process and
heuristically determines the number of candidate application classes.

Further work will be done in the following directions:

• To improve the distance function used in the clustering process.

• To extend our approach in order to also determine relationships (class
hierarchies) between the application classes obtained in the
object-oriented system.

• To apply the HOOS algorithm on large software systems.

• To apply other search-based approaches for transforming procedural
software systems into object-oriented ones.

Acknowledgement

This work was supported by CNCSIS - UEFISCSU, project number PNII
- IDEI 2286/2008.

References

[1] Cobo, H., Mauco, V., Romero, M.d.C., Rodríguez, C.: A tool to reengineer
legacy systems to object-oriented systems. In: ER ’99: Proceedings of the
Workshops on Evolution and Change in Data Management, Reverse
Engineering in Information Systems, and the World Wide Web and Conceptual
Modeling, London, UK, Springer-Verlag (1999) 186–197

[2] Han, J.: Data Mining: Concepts and Techniques. Morgan Kaufmann
Publishers Inc., San Francisco, CA, USA (2005)

[3] Czibula, I.: A clustering approach for transforming procedural into object-
oriented software systems. In: KEPT ’09: Proceedings of the Knowledge
Engineering: Principles and Techniques Conference. (2009) 185–188

[4] Simon, F., Loffler, S., Lewerentz, C.: Distance based cohesion measuring.
In: Proceedings of the 2nd European Software Measurement Conference
(FESMA), Technologisch Instituut Amsterdam (1999)

[5] Bieman, J.M., Kang, B.K.: Measuring design-level cohesion. IEEE
Transactions on Software Engineering 24 (1998) 111–124

[6] Zou, Y., Kontogiannis, K.: Incremental transformation of procedural
systems to object oriented platforms. In: COMPSAC ’03: Proceedings of the
27th Annual International Conference on Computer Software and
Applications, Washington, DC, USA, IEEE Computer Society (2003) 290

[7] Jacobson, I., Lindström, F.: Reengineering of old systems to an object-
oriented architecture. SIGPLAN Not. 26 (1991) 340–350

Istvan Gergely Czibula and Gabriela Czibula
Department of Computer Science, Babeş-Bolyai University
1, M. Kogalniceanu Street, Cluj-Napoca
email:{istvanc, gabis}@cs.ubbcluj.ro
