Concepts to Improve the Quality of Production Plans using Machine Learning


ACTA IMEKO 
ISSN: 2221-870X 
March 2020, Volume 9, Number 1, 32 - 39 

 
ACTA IMEKO | www.imeko.org March 2020 | Volume 9 | Number 1 | 32 

Concepts for improving the quality of production plans using 
machine learning 

Lukas Lingitz1, Wilfried Sihn1,2 

1 Fraunhofer Austria Research GmbH, Theresianumgasse 27, A-1040, Vienna, Austria 
2 TU Wien, Institute of Management Science, Theresianumgasse 7, A-1040, Vienna, Austria 

 
Section: RESEARCH PAPER 

Keywords: production planning; planning quality; master data; prediction; machine learning 

Citation: Lukas Lingitz, Wilfried Sihn, Concepts to Improve the Quality of Production Plans using Machine Learning, Acta IMEKO, vol. 9, no. 1, article 6, 
March 2020, identifier: IMEKO-ACTA-09 (2020)-01-06 

Editor: Lorenzo Ciani, University of Florence, Italy 

Received November 6, 2019; In final form February 10, 2020; Published March 2020 

Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, 
distribution, and reproduction in any medium, provided the original author and source are credited. 

Funding: This work was supported by the European Union under the programme Electronic Component Systems for European Leadership (ECSEL) EU, 
Project: Power Semiconductor and Electronics Manufacturing 4.0 (SemI40) (Grant Agreement No. 692466). 

Corresponding author: Lukas Lingitz, e-mail: lukas.lingitz@fraunhofer.at

 
ABSTRACT
There are always deviations between production planning and subsequent execution. Furthermore, it has been found that the reliability
of production plans and thus Planning Quality (PQ) can drop down to 25 % in the first three days after plan creation [1]. These deviations
are caused by uncertainties, such as inaccurate or insufficient planning data (including data quality and availability); inappropriate
planning and control systems; and unforeseeable events. Production planners therefore use buffers in the form of inventories or
extended transitional periods to create possibilities for implementing corrective measures in production control. Buffers, however, lead
to increased coordination and control efforts as well as to negative effects, particularly on the inventory, throughput time, and capacity
utilisation. The potential for more accurate planning remains largely unexploited. The objective of this paper is to investigate the
possibilities of increasing planning quality. Within a case study, the authors demonstrate how machine learning can be used to predict
cycle times. Furthermore, the increased accuracy compared to the current method is shown. Based thereon, two approaches are
presented, focusing on the reduction of gaps between the master data and predicted data used during the production planning process.
Moreover, further research needs are identified.

1. INTRODUCTION

Industry 4.0, Cyber-Physical Production Systems (CPPSs) [2],
[3], and the Industrial Internet of Things have a significant
influence on Production Planning and Control (PPC) today and
will do so in the future. Deploying CPPSs raises several
challenges for industries, as addressed in [4]. These challenges
include the extraction of knowledge from heterogeneous data
sources; interoperations with production information systems;
and new possibilities in terms of the changeability, adaptability,
and reconfigurability of production systems. Compared to
traditional production planning, which is based on a generally
static knowledge base, smart factories enable the collection and
exchange of real-time information between products, machines,
processes, operations [5], and systems. The application and
exchange of data by the elements of a smart factory lead to an
automated and decentralised production, which is an essential
characteristic of Industry 4.0 [6], [7]. However, there is a need to
study how the different solutions – enabled by digitalisation –
can support PPC and contribute to an increased corporate
competitiveness [8].

Planning Quality (PQ) is a commonly used term in industrial
practice when discussing PPC. However, the term is not clearly
defined in the scientific context. Therefore, the authors give a
proposal for a new definition of the term, trying to set a widely
accepted standard. After defining PQ, a case study is given,
wherein the suggested PQ has been increased through the
application of Machine Learning (ML) for lead time prediction.
Lastly, the authors give an outlook of two novel approaches that
will be the main subject of an ongoing research project.

The paper is structured as follows. In section 2, the authors
give a literature review concerning PQ and the application of ML
in the field of PPC. In section 3, a definition of the term PQ is
given. A case study from the semiconductor industry in section
4 demonstrates the accuracy of ML for cycle time prediction
compared to the currently used static approach. Section 5 relays 
the results of the case study and presents different approaches to 
how to improve PQ by applying ML. In section 6, the authors 
discuss these approaches and identify further research needs. 
Finally, a conclusion and outlook are given. 

2. RELATED LITERATURE  

Literature on PPC approaches the phrase PQ from different 
angles, considering several influencing factors. The first aspect 
focuses on classical logistical targets of PPC, such as flow times, 
due dates, setup costs, and product features. Since accurate and 
high-quality planning data is one of the most important parts of 
a good production plan, master data management is an important 
aspect in our discussion as well [9]. These first two approaches 
do not deal with (planned or unforeseeable) changes or 
uncertainties of the production environment, which are typical 
elements of the paradigm shift towards Industry 4.0 [10]–[12]. 
The concepts of robustness and resilience address such disturbances.
A production plan is robust if its performance is guaranteed – 
even when facing events not known at the time of planning [13], 
[14]. Resilience is, in contrast, the ability of a system to cope with 
changes of all kinds [15]. Two efficient ways of dealing with 
uncertainty are the application of stochastic fuzzy models and the 
use of adaptive and cooperative approaches [16]. The overall 
objective of PPC is the creation of reliable production plans, so 
their realisation on the shop floor should be close to – or ideally 
the same as – the production plan as it was originally planned. 
The deviation between planning and reality on the shop floor 
increases up to 75 % after just three days in medium-sized 
mechanical engineering enterprises [1]. As shown in the literature 
review, it is desirable to have more reliable production plans. 
Measuring the quality of the prediction, as an alternative, may 
reveal a potential for bridging the gap between planned and 
actual figures. Besides, the success of a good production plan 
depends on the decision-making process itself. In the era of 
Industry 4.0, the automation of decision-making processes and 
the level and way of human engagement are also essential topics 
[17], [18]. It can be concluded that there is no exact definition of 
PQ. In this paper, a novel industry-oriented concept for 
measuring, evaluating, and improving PQ will be developed. 

A general truth is that data does not bring any added value on 
its own, but in practice, domain-specific knowledge and 
algorithms are needed to extract useful information from 
heterogeneous and scalable data sources [19]. Simple statistical 
analysis is often not sufficient, as it is time-consuming and often 
does not lead to the desired results. Hence, automated data 
extraction and analytical methods are needed. Together with the 
rise of data science as one of the most popular emerging research 
and application fields today, ML has gained increasingly high 
attention in the recent past. 

At the very beginning of the development of ML, the vast 
majority of papers were published in journals related to the topic 
of computer science. However, with increasing demands of 
computational capabilities and big data analytics, the area is 
growing, with far-reaching applications in diverse disciplines. 
Today, many different disciplines use ML algorithms, as was
already observed in 2012 [20]. The first ML applications in
management science can be found in finance and marketing [21]. 
In 2009, Choudhary et al. identified that the emerging application 
of ML in PPC has not been systematically explored [22]. 
However, in the following years, several research papers were
published in the context of production management, focusing on
applying ML to advanced planning and scheduling [23]; quality 
improvement; process monitoring; and defect analysis [24]. Yet, 
researchers have not intensively focused on (sub-)topics relevant 
to PPC – such as flow time prediction, lot cycle time prediction, 
and lead time prediction; thus, the improvement potential has 
not been completely identified and maximised. 

The results of our literature review indicate that the current
trend in PPC is to employ ML-based simulation and optimisation 
algorithms. Furthermore, it can be recognised that the focus of 
most of the analysed publications is either given to production 
scheduling (47 %) or other applications (33 %), while the 
prediction of planning relevant times is rarely considered (20 %), 
as shown in [19]. 

3. PLANNING QUALITY – DEFINITION 

Until now, there has been no uniform mathematical definition 
and understanding of the term PQ in the scientific literature 
within the framework of PPC. Therefore, we briefly explain and 
define what is meant by the term PQ as we see it. 

PQ is intended as a key indicator for the planner to assess the
reliability of the production plan in the planning phase (time t) 
and to continuously improve the operational reliability of 
production plans in forthcoming phases (time t+n). In principle: 

• The PQ is high if there are ideally no deviations or at least 
deviations within an acceptable range (which can vary 
depending on different industrial contexts) between the 
production plan created in advance and the actual execution. 

Based upon this general statement, we need to derive some 
measurable values to determine exactly if the production plan 
and the execution show deviations or not. On a macroscopic 
level, we could say that PQ is high if at least the calculated end 
date of the last production step of a production order from the 
production plan meets the actual end date. We can also break this 
assumption down for every operation. This would mean that PQ 
is high if every operation/production step of the production 
order starts and ends on the designated day/shift/hour 
calculated during the plan creation. This leads to the conclusion 
that the times that are used for the plan creation need to be very 
accurate so that the start and end times of the setup, operation, 
transition, and waiting not only meet on average – if we look to 
a longer period of time – but also need to match the reality for 
every order in every situation (e.g. over- and underload). This 
brings us to the next definition: 

• The PQ is high if there are ideally no deviations or at least 
deviations within an acceptable range (which can vary 
depending on different industrial contexts) between the 
planning times created in advance and the actual times that 
result during the execution phase. 

There are many factors that could influence the actual times 
and cause deviations between the planning times and the actual 
times. One of the main problems is that we normally use static 
average values for the planning times. If we used ML models
with high prediction accuracy, trained with confirmation data
from the actual production system, to predict the actual times,
we could reformulate the statement above:

• The PQ is high if there are ideally no deviations or at least 
deviations within an acceptable range (which can vary 
depending on different industrial contexts) between 
predicted times and actual times (e.g. lead time, setup time, 
and operation time). Accurate dynamic prediction models
are required to continuously reduce the deviation between
reality and prediction AND

• There are ideally no deviations between the static planning 
times used for the production plan creation and the 
predicted times OR 

• Instead of static planning times, the prediction model itself 
is used to create production plans that have a high PQ. 

Based on the general statement that PQ measures the 
deviation between a production plan and the later execution, we 
now have two possibilities for creating production plans with 
higher PQ. We achieve this by having accurate prediction models 
of the planning times on the one hand and by using these models 
during the planning phase on the other hand. The definition 
shows two methods. The first option is to adapt the planning 
times to match the predicted times. In this case, we assume that 
the planning times are still one static value, so we might adapt 
the times in several iterations until we reach a convergence point. 
Alternatively, we can use the dynamic time models, which can be 
considered as a function with several variables as the input for 
the plan creation. 
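The deviation-based statements above can be illustrated with a small sketch. It is not the PQ index developed in the research project, which is not specified in this section. It assumes, purely for illustration, that PQ is scored as the share of operations whose absolute deviation between planned and actual time lies within an acceptable range. The function name `planning_quality`, the tolerance, and the sample values are hypothetical.

```python
def planning_quality(planned, actual, tolerance):
    """Share of operations whose |actual - planned| is within `tolerance`.

    Illustrative assumption only: the paper does not prescribe this formula.
    """
    if len(planned) != len(actual):
        raise ValueError("planned and actual must have the same length")
    if not planned:
        raise ValueError("at least one operation is required")
    # Count operations whose deviation stays inside the acceptable range.
    within = sum(1 for p, a in zip(planned, actual) if abs(a - p) <= tolerance)
    return within / len(planned)


# Hypothetical operation times in hours for four operations.
planned_hours = [8.0, 8.0, 8.0, 8.0]
actual_hours = [7.5, 9.5, 8.2, 12.0]
pq = planning_quality(planned_hours, actual_hours, tolerance=1.0)
```

In this toy example, two of the four operations stay within the one-hour tolerance. The acceptable range itself would have to be chosen per industrial context, as stated in the definitions above.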

The above definitions and statements give rise to the 
conclusion that there are two possible ways of creating 
production plans with high PQ. These two methods are shown 
in two approaches that are described and discussed in sections 5 
and 6 and are subject to further research within a funded research 
project. Within the upcoming case study, the possibility of 
predicting planning times based on confirmation data is 
demonstrated. 

4. CASE STUDY ON TIME PREDICTION 

In this section, the authors give an example of the application 
of ML for time prediction. After an introduction to the related 
industrial sector of the use case partner and a general description 
of the production system, the approach and the results are given. 
Based on the results, ideas of how to apply ML within the 
planning process are stated. 

4.1. Introduction 

As semiconductor industries offer a high availability of data 
for the production process, these industries take the lead when it 
comes to innovative applications in the area of industrial data 
science. Furthermore, the market is highly competitive, and high 
productivity as well as short lead times (among others) are the 
key factors for success. 

Most products in the semiconductor industry are built in 
layers, where the same or similar production steps are repeated 
in cycles to build integrated circuits on the layers. Nevertheless, 
most production systems do not consist of rigidly linked 
machines. Rather, they are made up of highly automated 
machines that perform specific operations organised in a job 
shop-type production system. The often broad product spectrum 
is driven by rapid product innovations and results in a high 
material flow complexity. Therefore, constant sequencing is 
needed. 

Within the research project, the goal has been to increase the 
PQ of a semiconductor manufacturer. Due to good data 
availability, the authors decided to apply ML techniques to build 
prediction models for lead time prediction that can subsequently 
be used to analyse the accuracy of the production plans. As the 
company had no simulation model of the plant and the authors 
did not build a model, the prediction models were solely built 
with readily available data from the shop-floor IT systems
(Enterprise Resource Planning [ERP] system and Manufacturing
Execution System [MES]).

4.2. Approach and results 

After collecting the process confirmation data from the MES, 
historical data about machines/equipment/work centres and 
customer-related information, features were generated based on 
the domain knowledge and experience of the process experts. 
Furthermore, some features that were relevant from a 
production logistics point of view were added. Many of the
created features are parameterised, for example ‘average lead
time of the last x lots’, where x can be any positive integer.
The feature catalogue with the most important features
can be found in [19]. 
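A parameterised feature of this kind can be sketched as follows. The sketch assumes that the lots are ordered by completion time and that only lots completed earlier may be used for the feature of a given lot. The function name and the sample values are hypothetical and not taken from the feature catalogue in [19].

```python
def avg_lead_time_of_last_x_lots(lead_times, x):
    """For each lot, return the mean lead time of the x lots before it.

    Lots are assumed to be sorted by completion time. Where fewer than x
    earlier lots exist, the feature is undefined and None is returned.
    """
    if x < 1:
        raise ValueError("x must be a positive integer")
    feature = []
    for i in range(len(lead_times)):
        if i < x:
            feature.append(None)
        else:
            window = lead_times[i - x:i]  # the x lots preceding lot i
            feature.append(sum(window) / x)
    return feature


# Hypothetical lead times in days, oldest lot first.
lead_times_days = [20.0, 22.0, 24.0, 30.0, 26.0]
feature_x2 = avg_lead_time_of_last_x_lots(lead_times_days, 2)
feature_x3 = avg_lead_time_of_last_x_lots(lead_times_days, 3)
```

Calling the function with several values of x yields several columns of the same feature type, which corresponds to adding the same feature with different parameters.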

According to the domain experts, the overall lead time 
(hereafter defined as the time span from the beginning of the first 
operation to the finish of the last operation of an order) is 
influenced by three main process steps. Therefore, we started the 
analysis with these steps. After calculating the features, several 
regression methods were tested to predict the cycle time. For the 
evaluation of the accuracy, the Normalised Root Mean Squared 
Error (NRMSE) was chosen as the measurement index. The
NRMSE is the square root of the mean of the squared deviations
between the predicted and the actual values of the dependent
variable, divided by the range of the actual values (largest minus
smallest) for normalisation purposes. The normalisation makes
different datasets and models comparable.
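The NRMSE described above can be written down in a few lines. The following is a minimal sketch that assumes normalisation by the range of the actual values. The sample cycle times are invented for illustration and are not data from the case study.

```python
import math


def nrmse(actual, predicted):
    """Root mean squared error divided by the range of the actual values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("range of actual values is zero; NRMSE is undefined")
    mse = sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / value_range


# Hypothetical cycle times in days.
actual_days = [20.0, 30.0, 40.0]
static_plan = [30.0, 30.0, 30.0]    # one static average for every lot
ml_prediction = [22.0, 29.0, 37.0]  # lot-specific predictions

nrmse_static = nrmse(actual_days, static_plan)
nrmse_ml = nrmse(actual_days, ml_prediction)
```

Multiplying the result by 100 gives the value in per cent. In this toy example, the lot-specific predictions yield a clearly smaller NRMSE than the single static value.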

The conclusion is that ensemble tree-based methods (bagged
regression trees, random forests, and boosted regression trees)
outperformed all other tested regression algorithms. Furthermore,
these methods give quantitative feedback about feature
importance, i.e. the influence of a variable on the prediction.

After these experiments, the scope of the study was 
broadened (time frame, products, and process steps). As a result, 
the lead time prediction for 80 process steps was done. 

For the broader scope, random forest also outperformed
linear regression with respect to several aspects (the need for data
cleaning, the training time, and the accuracy of the model). To
evaluate the advantages of the proposed approach compared to
the current approach – creating production schedules with static
target cycle times for every layer of every different product,
updated every quarter – the dynamic cycle time prediction was
done for each layer of the 14 most important products with the
random forest algorithm. A comparison of the NRMSE values is
given in Table 1 for the currently applied static planned cycle
times (average) and for the predicted dynamic cycle times (ML).
In most cases, random forest gives a more accurate prediction of
the cycle time, as the NRMSE values are mostly smaller,
especially when the sample is bigger. This is of high relevance in
industry, as the lead times, in general, are three to seven weeks
long, and even small deviations between predicted and actual end
dates, calculated based on the lead time, cause a delay in shipping
of several days.

Table 1. NRMSE values (in %) for the different cycle time calculations.

ProductID   Average   ML     Number of lots
1           17.9      11.7   2228
2           12.5      12.7   1103
3           15.9      13.7   964
4           11.8      18.0   237
5           23.1      18.4   289
6           20.6      18.8   345
7           23.7      20.4   142
8           27.4      25.1   118
9           25.9      29.0   32
10          56.0      41.0   11
11          37.4      43.4   32
12          19.4      66.3   145
13          18.3      -      7
14          117.8     -      9

However, this also means that the prediction accuracy is sensitive
to the amount of data, and below a certain limit, no cycle time
prediction with ML could be done (ProductIDs 13 and 14). The
inclusion of products with a minimum of 200 samples
(ProductIDs 1 to 6) leads to mean NRMSE values of 15.5 % for
the predicted dynamic cycle times (ML) and 17 % for the static
planned cycle times (average).
To highlight the difference between the static and dynamic
approach, the real cycle times (black dots) are compared with the
currently used static (orange line – some outliers are not printed
due to the scaling of the graph) and the predicted dynamic (green
dots) cycle times for a given product and layer in Figure 1. It must
be mentioned that a rolling forecast was applied to capture
trends. This means that the ML algorithm learns from the past
2000 observations and predicts the next 100 observations.
Therefore, for the first 2000 observations, no prediction is
available.
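The rolling forecast can be sketched as follows. The sketch reads 'the past 2000 observations' as a sliding window of the most recent 2000 observations, which is an interpretation and not confirmed by the text. The function names are hypothetical, and a simple mean model stands in for the random forest so that the example has no external dependencies.

```python
def rolling_windows(n_observations, train_size=2000, horizon=100):
    """Yield (train_indices, predict_indices) pairs for a rolling forecast."""
    start = train_size
    while start < n_observations:
        stop = min(start + horizon, n_observations)
        yield range(start - train_size, start), range(start, stop)
        start = stop


def rolling_forecast(features, targets, fit, predict,
                     train_size=2000, horizon=100):
    """Retrain on the latest window, then predict the next `horizon` lots."""
    predictions = [None] * len(targets)  # no forecast for the first window
    for train_idx, test_idx in rolling_windows(len(targets), train_size, horizon):
        model = fit([features[i] for i in train_idx],
                    [targets[i] for i in train_idx])
        y_hat = predict(model, [features[i] for i in test_idx])
        for i, value in zip(test_idx, y_hat):
            predictions[i] = value
    return predictions


# Stand-in model (assumption): predicts the mean of the training targets.
def fit_mean(x_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x_test):
    return [model] * len(x_test)


# Hypothetical data: 2250 observations with a steadily rising cycle time.
targets = [float(i) for i in range(2250)]
features = [[0.0] for _ in targets]
predictions = rolling_forecast(features, targets, fit_mean, predict_mean)
```

Because the window moves forward with every retraining, the stand-in model follows the upward trend in the toy data, which is the effect the rolling forecast is meant to capture.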

4.3. Outcome of the project 

For the deployment, an interactive web application was 
developed to support the whole cycle – data cleaning, outlier 
detection, feature generation, model training, and prediction. 
The whole process is depicted in Figure 2. For the development
of the web application, the authors used the free statistical
programming language ‘R’ with the respective package ‘Shiny’.

The user can upload new data – exported from different IT
systems – in a specific predefined format. After uploading,
automated data cleaning and outlier detection are executed.
Several algorithms prepare the data for the next step of feature
generation.

The planner can select from a set of predefined features
(e.g. the average lead time of the last x lots, where x can be any
positive integer defined by the user; for more, see [19])
and add those features to the dataset (1). Some of the features
are parameterisable, as explained above. The user can try
different parameters and add the same feature with different
selected parameters. In the next step, the user can train (2)
different models. They can set the number of observations used
for training and the number of predictions made before the next
training. For evaluation, they get the NRMSE (3) and can also see
the predicted values and compare them to the currently used target
values. Based on the results, they can adapt the parameters of the
features and select the best features (4) until an acceptable
accuracy level is reached. Lastly, a prediction for a certain lot
can be obtained by entering the values of the
features (5). An automated implementation of the developed
prediction model for the current planning process is in progress,
as the features used for training must also be calculated from the
production plan itself. For this reason, new approaches need to
be developed and researched further in order to be integrated
into the application.

What can be clearly seen is that even if the underlying 
planning data is updated regularly (in the current case, every three 
months), the cycle time is highly dependent on several 
influencing factors that cannot be depicted in a single, static 
planning value. Numerous influencing factors can be exploited 
with the help of different ML methods. This knowledge – gained 
from the available dataset – can be used to improve the quality 
of PPC in CPPSs. 

5. APPROACH OF THE INTEGRATION OF MACHINE 
LEARNING WITH PRODUCTION PLANNING 

In this section, the authors pick up the ideas from section 3 
and discuss the two possible approaches mentioned above. In 
the first approach, the planning times are adapted to match the 
predicted times in several iterations. This approach is called the 
‘evolutionary approach’. In the second approach, the dynamic 
time models are used directly for plan creation. This approach is 
called the ‘function-based approach’. 

5.1. Evolutionary approach 

The overall approach consists of six steps, numbered from 0 
to 5, while steps 1 to 5 run in ongoing loops and are depicted in 
Figure 3. It is assumed that in every iteration, PQ increases. This 
iterative process aims for continuous improvement and is the 
reason why the authors named the approach ‘evolutionary’. 

Figure 1. Comparing the real and predicted cycle times of layer 1 with the
currently used static cycle time.

Figure 2. Shiny web application scheme (diagram labels: ERP, MES, Input,
0. Upload XLS, R Shiny, 1. Feature Generation, 2. Training, 3. Test /
Evaluation ok? [No / Yes], 4. Feature Selection, 5. Prediction, Export XLS,
Output).

The planning starts with the generation of an initial
production plan, whereby the planning itself is carried out by the
conventional planning system, e.g. the ERP or MES of the
company, and the planning data is extracted from the master data
of the underlying system (steps 0 and 1). The production plan is
the input for the second step. In step 2, ML models that have
been trained by utilising historical data from the company
beforehand are employed to predict different planning times,
such as the cycle time, setup time, and operation time. In step 3,
the deviations between the planned times and the forecasted
times and the impact thereof are evaluated. The results are
summarised in a dashboard, showing the logistical fitness as
proposed by Lödding et al. [25], and the newly developed PQ
index. Depending on the impact of the deviation, the PQ index
gives feedback about the reliability of the production plan. In the
fourth step, the planner decides whether the production plan
meets the preferences and objectives of the company or not. If
not, it is possible to change the planning-related master data by
using the support system, which provides the planner with some
alternative proposals to change the data. The overall decision and
therefore the responsibility stays with the human (i.e. a
human-in-the-loop model [18]). After the refinement and adjustment of
the planning data, the planning can be re-initiated, and the cycle
starts again.

If the planner is satisfied with the Key Performance
Indicators (KPIs) in step 4, they can activate the plan, thus
transferring the planning-related master data that was used to
create the production plan to the planning systems. The
innovative characteristics of the proposed evolutionary approach 
are as follows: 

• the provision of an evaluation method for the 
assessment of the production plans in step 3,  

• the creation of a dashboard to comprehensibly provide 
feedback to the planner through the visualisation of the 
main KPIs, and 

• the establishment of a method for the proposition of 
more suitable planning data. 
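The iterative character of the evolutionary approach can be illustrated with a toy loop. It assumes a single static planning time, a hypothetical prediction function that depends on that planning time, and a fixed tolerance that replaces the planner's decision in step 4. None of these choices is prescribed by the approach itself.

```python
def evolve_planning_time(initial_time, predict_time, tolerance=0.1,
                         damping=0.5, max_iterations=50):
    """Adapt one static planning time until it matches the predicted time."""
    planning_time = initial_time
    history = [planning_time]
    for _ in range(max_iterations):
        predicted = predict_time(planning_time)  # step 2: predict the time
        deviation = predicted - planning_time    # step 3: evaluate deviation
        if abs(deviation) <= tolerance:          # step 4: accept the plan
            break
        planning_time += damping * deviation     # adjust the master data
        history.append(planning_time)            # and re-plan
    return planning_time, history


def toy_predictor(planning_time):
    """Hypothetical stand-in for a trained ML model (assumption)."""
    return 10.0 + 0.2 * planning_time


final_time, history = evolve_planning_time(8.0, toy_predictor)
```

In this toy setting, the planning time moves step by step towards the value at which plan and prediction agree (12.5 here), which corresponds to the convergence point mentioned in section 3. In practice, the planner remains in the loop and chooses among the proposed changes.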

5.2. Function-based approach 

The second approach differs in that the planning cannot be carried
out by the standard/conventional planning system. While in the
evolutionary approach, the quality of the production plan 
continuously increases, in the function-based approach, PQ is 
always set to high, and other logistical fitness factors of the plan 
are optimised iteratively, as shown in Figure 4. 

Step 0 is necessary for generating the prediction models and 
therefore must be done regularly. Based upon these ML models 
of planning-related data, a novel planning method generates 
production plans. The novelty of the planning models lies within 
the usage of the dynamic set of functions (i.e. an extendable 
vector of possible functions) for the planning-related data and 
not a single value or set of values. The general approach is
as follows: orders are subsequently scheduled, always taking the
current information of the production plan as information for
the prediction models of the planning-related data. Some features
for the prediction are not available at the time of planning, and
therefore, these dynamic elements are predicted using different
models. As there is no comparable approach in the field of
research, this is the most innovative element of the
approach.

After planning, the evaluation of the logistical fitness follows. 
The logistical goal weighting of the company (e.g. an average lead
time of six weeks and the timeliness of 87 % of all orders) defines
the benchmark that a production plan must achieve. Since PQ is
already high, only these KPIs must be checked. In step 3, the
planner again decides whether the plan is acceptable or not. If
not, they try another planning sequence. As a support function,
the system offers several strategies (based on the order size, due
date, customer priorities, etc.). Additionally, the planner can
make manual changes. Once the sequence is defined, they start the
planning again. These steps are repeated until the plan is
satisfactory. The plan can then be transferred/released to the
ERP or MES. In sum, the innovative characteristics of the proposed
function-based approach can be specified as follows:

• Providing a novel planning method for dynamic times. 

• Creating a recommendation method to propose 
changes in the planning sequence. 
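The sequential scheduling idea of the function-based approach can be sketched as follows. The sketch assumes a single resource and a hypothetical duration model that takes the current state of the plan as input. Both are simplifications for illustration and not the planning method under development.

```python
def schedule_orders(orders, predict_duration, start_time=0.0):
    """Schedule orders one after another on a single resource.

    The duration of each order is predicted from the order itself and
    from the plan built so far, instead of being a static planning value.
    """
    plan = []
    clock = start_time
    for order in orders:
        duration = predict_duration(order, plan)
        plan.append({"order": order["id"], "start": clock,
                     "end": clock + duration})
        clock += duration
    return plan


def toy_duration_model(order, plan):
    """Hypothetical model: each order already planned adds 10 % to the time."""
    return order["base_hours"] * (1.0 + 0.1 * len(plan))


orders = [
    {"id": "A", "base_hours": 10.0},
    {"id": "B", "base_hours": 10.0},
    {"id": "C", "base_hours": 5.0},
]
plan = schedule_orders(orders, toy_duration_model)
reversed_plan = schedule_orders(list(reversed(orders)), toy_duration_model)
```

Because the predicted times depend on the state of the plan, a different sequence leads to different planned times. In the toy example, the original sequence ends after about 27 hours and the reversed one after about 28 hours. This is why the planner evaluates alternative sequences against the logistical KPIs.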

6. RESULTS AND DISCUSSION 

The objective of both approaches is to increase PQ in the 
sense of smaller deviations between planning and subsequent 
execution. The key is in the incorporation of uncertainty in the 
planning phase by carrying out production planning on the basis 
of dynamic instead of static time values. Only with the use of 
dynamic time values (e.g. standard times for machining and 
setup) can the interdependencies of different influencing factors, 
which occur naturally, be depicted close to reality. These
influencing factors can be well known, but most planning systems
do not offer the possibility of considering them in planning or
scheduling activities. In the worst-case scenario, these
influencing factors remain unknown or at least undetected. An
example of the former case could be fluctuations in the
machining and setup times per week and/or shift. This is
commonly caused by different skill levels of employees or the
actual machine or tools that execute the job. Even though these
effects are well known, most companies do not have the
resources to record and document all the effects so that a system
would be able to process the information. Still, this information
is available in, for example, the confirmation data, and ML is able
to quantify the impact to some extent.

Figure 3. Evolutionary approach.

Figure 4. Function-based approach.

In order to be able to represent these dynamics as realistically
as possible by using an ML algorithm, it is essential to ensure
high data quality. In both approaches, the prediction provided by
the ML algorithm can only be as reliable and valid for the actual
production system as the underlying historical data source that is
used. For further discussion, we focus on different data sources
for MES data. We distinguish between Plant Data Acquisition
(PDA) and Machine Data Acquisition (MDA) data. MDA data
can generally be considered more reliable than PDA data, as
the actual machine status is automatically captured. For example,
statuses such as ‘no spindle rotation’ and ‘no malfunction’ indicate
that the process has been completed, and the time
feedback via MDA is recorded correctly. In the same case with
feedback via PDA, there can either be a time delay in the
feedback of the machine operator, or the feedback can be missing
entirely.

Therefore, in the case of PDA feedback, higher fluctuations 
and lower-quality training data are to be expected in comparison 
to MDA data. However, this does not mean that the approach is 
only applicable for MDA-generated data or data that is generated 
with a simulation model of the production system. Longer 
observation periods and therefore a larger data source help to 
increase the reliability. In conclusion, it can be stated that the use 
of data that reflects reality is essential. 

To ensure that the machine status is mapped in production
planning, it is also important to consider it as an input feature
of the ML algorithm. When the machine status data changes,
the underlying causes should be questioned rather than
directly translating the change in the data into
a change of the production plan. For example, following the
assumption that the machining times of the machine increase 
over time, a variety of causes can be assumed, such as a change 
in the condition of the machine tool or the production machine 
(e.g. blunt tool or slow feed). In this case, it seems reasonable to 
invest resources to adjust the machine condition instead of 
generally increasing the machining times. It is important to 
provide the production planner with a decision support system 
in order to present decision options and their effects on the 
planning system. It seems conceivable to constantly increase the 
processing times due to a lack of investment in old machines or 
to restore the original machine condition. In the second case, the 
actual machining times should be significantly reduced. A major 
advantage of the function-oriented approach is the time-
consuming master data maintenance that is not required. Instead, 
the approach generates benefits through the dynamic 
adjustments of the relevant values. However, it should be noted 
that the logic of the ML algorithm can only create a production 
plan on the basis of the current data situation. Missing or 
defective data usually leads to a reduced PQ. Even if the 
condition of the machine worsens, immediate repair of the 
machines and the associated shortening of the processing times 
is generally not to be expected, nor is it appropriate. The 
permanent monitoring and visualisation of the machine status 
can alternatively be used to initiate measures to reduce 
production time. An example is the maintenance or change of 
the production parameters. As an ideal solution, the integration 
of a logic into the ML algorithm can be considered, which 
enables the examination of data from different data sources, 
machines, etc. The algorithm can assist in the interpretation and 
decision-making or, if required, perform appropriate weighting 
of individual data and states. By weighting, the effects of 
individual influences on production planning can be adapted, and 
implicit knowledge of planning can be represented to further 
increase PQ. 

6.1. Evolutionary approach 

Since this approach uses the existing production planning
system, fewer interventions in the current system are required.
The existing systems do not become obsolete and can still be used
to generate production plans. The evolutionary approach is meant
to be an additional Decision Support System (DSS) alongside the
actual planning system. Within this DSS, the planned times are
compared with the predicted times (cycle time, setup time,
operating time, etc.). Feedback to the planner is given in the
form of various KPIs and the PQ index. Furthermore, the planner
receives feedback on why a certain predicted time differs from the
planned time: depending on the chosen ML method, ‘drivers’ of the
predicted times can be identified and reported to the planner.
Since the DSS is an add-on, the implementation is expected to be
easier and quicker. Furthermore, acceptance by the planners is
expected to be higher, as the actual planning is still done by the
existing system. The transparency of the system and the
sovereignty of the planner over decision-making are important
factors for employee acceptance and a significant advantage of
this approach.
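
The comparison performed by the DSS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Operation` record, the 10 % tolerance, and in particular the PQ measure used here (one minus the mean relative deviation between planned and predicted times) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    order_id: str
    planned_min: float    # planning time taken from master data (minutes)
    predicted_min: float  # time predicted by a trained ML model (minutes)


def relative_deviation(op: Operation) -> float:
    """Absolute deviation of the planned time, relative to the predicted time."""
    return abs(op.planned_min - op.predicted_min) / op.predicted_min


def pq_index(ops: list[Operation]) -> float:
    """Assumed PQ measure: 1 minus the mean relative deviation, floored at 0."""
    mean_dev = sum(relative_deviation(op) for op in ops) / len(ops)
    return max(0.0, 1.0 - mean_dev)


def flag_for_planner(ops: list[Operation], tolerance: float = 0.10) -> list[str]:
    """Return the orders whose planned time deviates by more than `tolerance`."""
    return [op.order_id for op in ops if relative_deviation(op) > tolerance]


# Illustrative values only, not data from the case study.
ops = [
    Operation("A-100", planned_min=60.0, predicted_min=60.0),
    Operation("A-101", planned_min=46.0, predicted_min=50.0),
    Operation("A-102", planned_min=30.0, predicted_min=40.0),
]
print(round(pq_index(ops), 3))  # aggregated feedback for the planner
print(flag_for_planner(ops))    # orders that need a closer look
```

In such a setup the existing planning system remains untouched. The DSS only reads planned and predicted times and reports where they diverge.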

6.2. Function-based approach 

As the production plan is created directly using the time
prediction models and these models are updated frequently, the
quality of the plan is independent of the stored master data.
Furthermore, the ‘dynamic’ models are updated automatically. The
authors expect this approach to require a smaller number of
iteration cycles than the evolutionary approach because the system
always uses the planning data that most likely depicts the later
execution. Iterations are still needed to meet the objectives for
the logistical KPIs. Therefore, the number of iterations is
expected to be similar to the number that is currently needed to
create a proper plan. However, there are several open research
questions that need to be answered. Within the function-based
approach, the planning algorithm chooses one order after the
other. This means that the prediction only considers orders that
have already been planned. However, subsequent orders have an
impact on the features that are used for the prediction; for
example, on the work in progress when an order arrives at a
workstation where the previously planned order is not yet
finished. It is therefore crucial to derive correlations between
planned orders and not yet planned orders in the planning process.
According to the authors, further research is needed to define a
way in which the necessary features for the time prediction can be
determined.
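
The open question described above can be made concrete with a small Python sketch. It assumes a single workstation and uses a hypothetical stand-in, `predict_cycle_time`, in place of a trained model. The linear dependence on work in progress (WIP) is an assumption chosen only to keep the arithmetic simple.

```python
def predict_cycle_time(base_min: float, wip: int) -> float:
    """Hypothetical stand-in for a trained model: time grows with WIP."""
    return base_min * (1.0 + 0.25 * wip)


def plan_sequentially(
    orders: list[tuple[str, float, float]],
) -> list[tuple[str, float, float]]:
    """Plan orders one after the other on a single workstation.

    Each order is (order_id, arrival_min, base_min). The WIP feature counts
    only orders that are already planned and still unfinished at arrival.
    """
    planned: list[tuple[str, float, float]] = []  # (order_id, start, end)
    for order_id, arrival, base in orders:
        wip = sum(1 for _, _, end in planned if end > arrival)
        start = max([arrival] + [end for _, _, end in planned])
        duration = predict_cycle_time(base, wip)
        planned.append((order_id, start, start + duration))
    return planned


# Illustrative orders: (order_id, arrival time, base processing time).
orders = [("O1", 0.0, 10.0), ("O2", 5.0, 10.0), ("O3", 8.0, 10.0)]
for row in plan_sequentially(orders):
    print(row)
```

In this sketch, order O1 is planned with a WIP of zero, although O2 and O3 arrive while it is still being processed. The feature value used for O1 therefore ignores orders that are planned later, which is exactly the effect that needs further research.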

The proposed approach is designed to replace the current
planning system with a new planning algorithm. The planner has
the option of weighting the KPIs (timely delivery, lead time,
stock, etc.) and thus manually adapting the priorities for
production planning. Since planned and predicted times coincide in
this approach, the PQ index is always high; it is therefore
important to check the corresponding KPIs and to implement the
production plan only when the KPIs deliver sufficient results. A
further advantage of this approach is that, in contrast to the
evolutionary approach, no adjustments of the master data have to
be carried out. Instead, feedback data has to be collected for
training the ML algorithm.
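
One possible way to express the KPI weighting and the release check is sketched below in Python. The KPI names, the normalisation of each KPI to the range 0 to 1, the weights, and the release threshold are all assumptions for illustration; the paper does not prescribe a specific aggregation.

```python
def weighted_kpi_score(kpis: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of KPI values normalised to [0, 1] (1 is best)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one KPI weight must be positive")
    return sum(weights[name] * kpis[name] for name in weights) / total_weight


def release_plan(
    kpis: dict[str, float], weights: dict[str, float], threshold: float = 0.8
) -> bool:
    """Release the plan only if the weighted KPI score reaches the threshold."""
    return weighted_kpi_score(kpis, weights) >= threshold


# Illustrative values: the planner prioritises timely delivery.
kpis = {"timely_delivery": 0.9, "lead_time": 0.7, "stock": 0.5}
weights = {"timely_delivery": 2.0, "lead_time": 1.0, "stock": 1.0}
print(weighted_kpi_score(kpis, weights))
print(release_plan(kpis, weights))
```

With these example numbers the plan would not be released, so a further planning cycle with adapted priorities would follow.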

6.3. Evaluation of the approaches 

For both approaches to function and to be used correctly to
increase PQ, it is also important to determine which approach is
applied and when. Therefore, in addition to further developing
the approaches, attention should be given to creating an 
evaluation method for the two approaches. This should clearly
demonstrate the advantages of each approach as well as the
complexity of implementation. It should also be ensured that the 
right approach is always used depending on the application, the 
industry, and other influencing factors in order to achieve 
optimal results. 

6.4. Implementation in ERP/MES 

In order to ensure that both approaches are fully functional
and can be used independently of the implemented ERP or MES
solution, they have to be made accessible regardless of the
platform that is used. The different variants of ERP and MES
systems in industry should not restrict the application of new
approaches. Therefore, the transfer of historical data for
training the ML model has to be designed as a generic solution
that allows data to be transferred at any time, while the training
data itself can consist of individual features depending on the
application case. Even if the accuracy of the production plan
prediction increases with the amount of training data used, it is
important not to select a disproportionately large amount of
data. The time horizon of the data also has an important
influence. The use of long-term data ensures that the results are
based on a long history, while the use of short-term data is
suitable for the representation of outliers and random events.
The storage of duplicate data in MES and ERP has to be prevented
as well.

In each planning run, only the delta between old and new data
is to be transferred, which keeps the transfer duration low. Data
transfers are possible not only for order confirmations but also
for partial confirmations via MES or ERP. At present, however,
only the historical data of completed orders (order
confirmations) is used to forecast production plans. Since the
time horizon of the data used can influence the result of the
prediction, two variants of time-based use can be distinguished.
In variant 1, the time horizon extends over the entirety of all
training data, with all previous training data having an unknown
influence on the results. In variant 2, the time horizon can be
selected individually according to the principle of rolling
planning, which has advantages, for example, for the more
accurate reproduction of short-term events. Here it is assumed
that old data is less valid and can lead to distortions. In the
further course of the research project, the effects of different
time horizons on the quality of production planning will be
investigated.
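
The delta transfer and the two time-horizon variants can be sketched in a few lines of Python. The record layout (`order_id`, `confirmed_on`, `cycle_min`), the function names, and the 90-day horizon are hypothetical and serve only to illustrate the idea; they do not correspond to a specific ERP or MES interface.

```python
from datetime import date, timedelta
from typing import Optional


def delta_records(all_records: list[dict], last_transfer: date) -> list[dict]:
    """Return only the confirmations completed after the last transfer."""
    return [r for r in all_records if r["confirmed_on"] > last_transfer]


def training_window(
    records: list[dict], today: date, horizon_days: Optional[int] = None
) -> list[dict]:
    """Variant 1 (horizon_days=None): all data. Variant 2: rolling horizon."""
    if horizon_days is None:
        return list(records)
    cutoff = today - timedelta(days=horizon_days)
    return [r for r in records if r["confirmed_on"] >= cutoff]


# Illustrative order confirmations, not data from the case study.
records = [
    {"order_id": "A", "confirmed_on": date(2019, 1, 15), "cycle_min": 50.0},
    {"order_id": "B", "confirmed_on": date(2019, 9, 1), "cycle_min": 55.0},
    {"order_id": "C", "confirmed_on": date(2019, 10, 20), "cycle_min": 61.0},
]
today = date(2019, 11, 1)

new_rows = delta_records(records, last_transfer=date(2019, 9, 30))
variant_1 = training_window(records, today)
variant_2 = training_window(records, today, horizon_days=90)
print(len(new_rows), len(variant_1), len(variant_2))
```

Variant 2 drops the oldest confirmation, which reflects the assumption that old data is less valid, at the cost of a smaller training set.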

An essential basis for the successful introduction and use of
these approaches in ERP or MES is an analysis and optimisation
of the existing process flows. A large part of the benefits that
can be realised by the integration can hardly be evaluated in
advance using quantitative criteria. The improvement of internal
production planning is an essential benefit of the presented
solution, but its support for the entire order processing is
difficult to quantify. The benefit assessment in particular is
considered problematic, since only parts of the achievable
benefits can be quantitatively assessed in advance; for instance,
adherence to delivery dates or error avoidance. Another element
that needs to be considered is the basic hardware requirements
at the factory level that the constant use of ML algorithms in
planning entails. The required computing power will be quantified
in the course of a research project.

7. CONCLUSIONS AND OUTLOOK 

In this work, a case study on the application of ML for time
prediction was presented, the term PQ was defined, and two
approaches for incorporating ML models for time prediction into
the production planning process were discussed.

The case study shows how ML can be used to predict cycle
times based on the readily available data of a semiconductor
manufacturer. The results show that the prediction is more
accurate than the quarterly adapted static cycle time, that is,
the arithmetic average of the cycle times of comparable lots
within the past quarter (a period of three months, with the first
quarter starting in January). The challenge lies in applying the
trained models in the planning phase. Therefore, two approaches
for utilising these trained ML models during plan creation are
presented: the evolutionary approach and the function-based
approach. In the first approach, new plans are created
repeatedly, while the planning times used are adapted after each
cycle. In addition, PQ is measured by calculating the deviation
between planning times and predicted times. If the deviation is
within an acceptable level, the planning stops. The latter
approach uses functions instead of static values for the planning
times. Therefore, there are no deviations between planning times
and predicted times. Still, several planning cycles are needed,
as other logistical KPIs need to be optimised.
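
The iteration of the evolutionary approach summarised above can be sketched as a simple loop in Python. The function name, the 5 % tolerance, the cycle limit, and the toy prediction function are assumptions for illustration. In a real setting, `predict` would re-run the planning system with the given planning times and query the trained ML models.

```python
from typing import Callable


def evolutionary_planning(
    planned: dict[str, float],
    predict: Callable[[dict[str, float]], dict[str, float]],
    tolerance: float = 0.05,
    max_cycles: int = 20,
) -> tuple[dict[str, float], int]:
    """Repeat planning until planned and predicted times agree within tolerance.

    `planned` maps an operation to its planning time; `predict` is a
    hypothetical callable returning predicted times for a plan built on them.
    """
    for cycle in range(1, max_cycles + 1):
        predicted = predict(planned)
        deviation = max(
            abs(planned[k] - predicted[k]) / predicted[k] for k in planned
        )
        if deviation <= tolerance:
            return planned, cycle
        planned = dict(predicted)  # adapt the planning times for the next cycle
    return planned, max_cycles


def toy_predict(planned: dict[str, float]) -> dict[str, float]:
    """Toy model: the predicted time depends on the planned time itself."""
    return {k: 40.0 + 0.5 * v for k, v in planned.items()}


final_times, cycles = evolutionary_planning({"milling": 60.0}, toy_predict)
print(final_times, cycles)
```

The toy model makes the predicted time depend on the plan itself, which is why more than one cycle is needed before the deviation falls below the tolerance.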

First, an evaluation method for the two approaches should be
developed. The function-based approach appears to be the smarter
one because it uses a new planning method based on historical
data and creates a production plan directly, instead of first
creating a production plan with master data and then comparing it
with predicted data. However, both approaches have to be examined
more closely in the further course of the research before an
exact assessment can be made.

To this end, the authors have successfully submitted a research
proposal called ‘MLinPPC’ (project number 877446) to the 32nd
‘Produktion der Zukunft’ call of the Austrian funding agency
‘Österreichische Forschungsförderungsgesellschaft mbH’. With
the approval, the two proposed approaches will be implemented
and tested together with two industrial companies. Furthermore,
the whole process, from data recording to PQ evaluation, will be
realised in order to assess both approaches.

Currently, production plans are created from ERP or MES data.
The acceptance level of employees in the planning department
depends strongly on how easily deviations between the predicted
and the classic (ERP/MES) production plan can be recognised. This
is a clear advantage of the evolutionary approach. The confidence
of the planner in a production plan that was created by an
unknown planning logic could initially be low and thus restrict
the implementation. Therefore, in addition to the development of
the algorithms themselves, further research must be carried out
to enable the planner to trace the results of the algorithms.
This is especially important during the deployment phase of the
new approaches.

It becomes clear that, no matter which approach is chosen, the
approach must be consistently transparent and comprehensible for
the employees. Only if the two approaches are accepted and
trusted can a correct implementation be achieved and the PQ in
production planning increased, which will lead to an optimisation
of logistics target values.

ACKNOWLEDGEMENT 

The authors would like to acknowledge the financial support 
received within the EU project: Power Semiconductor and 
Electronics Manufacturing 4.0 (SemI40), which is funded by the 
programme Electronic Component Systems for European 
Leadership (ECSEL) (Grant Agreement No. 692466). 



REFERENCES 

[1] G. Schuh (ed.), Ergebnisbericht des BMF-Verbundprojektes 
PROSense. Hochauflösende Produktionssteuerung auf Basis 
kybernetischer Unterstützungssysteme und intelligenter Sensorik, 
2015, pp. 7-8. 

[2] F. Ansari, M. Khobreh, U. Seidenberg, W. Sihn, A Problem-
Solving Ontology for Human-Centered Cyber Physical 
Production Systems, CIRP Journal of Manufacturing Science and 
Technology, Elsevier, 22C (2018) pp. 91-106. 

[3] L. Monostori, B. Kádár, T. Bauernhansl, S. Kondoh, S. Kumara,
G. Reinhart, O. Sauer, G. Schuh, W. Sihn, K. Ueda, Cyber-
Physical Systems in Manufacturing, CIRP Annals – Manufacturing
Technology 65 (2016) pp. 621-641.

[4] F. Ansari, Cyber-Physical Systems, Bericht für Ergebnispapier, 
‘Forschung, Entwicklung & Innovation in der Industrie 4.0’, 
Verein Industrie 4.0 Österreich (2018) pp. 26-28. 

[5] T. Bauernhansl, M. Hompel, B. Vogel-Heuser, Industrie 4.0 in 
Produktion, Automatisierung und Logistik, Anwendung, 
Technologien, Migration, 2014. 

[6] L. Yongkui, X. Xun, Industry 4.0 and Cloud Manufacturing: A 
Comparative Analysis, Journal of Manufacturing Science and 
Engineering 139 (2016). 

[7] A. Sanders, C. Elangeswaran, J. Wulfsberg, Industry 4.0 Implies
Lean Manufacturing – Research Activities in Industry 4.0 Function
as Enablers for Lean Manufacturing, JIEM 9(3) (2016) p. 811.

[8] S. S. Kamble, A. Gunasekaran, S. A. Gawankar, Sustainable 
Industry 4.0 Framework – A Systematic Literature Review 
Identifying the Current Trends and Future Perspectives, Process 
Safety and Environmental Protection 117 (2018) pp. 408-425. 

[9] A. Hees, G. Reinhart, Approach for Production Planning in 
Reconfigurable Manufacturing Systems, Procedia CIRP 33 (2015) 
pp. 70-75. 

[10] G. Schuh, T. Potente, S. Fuchs, C. Hausberg, Methodology for the 
Assessment of Changeability of Production Systems Based on 
ERP Data, Procedia CIRP 3 (2012) pp. 412-417. 

[11] P. Nyhuis, T. Heinen, C. Rimpau, E. Abele, A. Wörn, 
Wandlungsfähige Produktionssysteme. Theoretischer 
Hintergrund zur Wandlungsfähigkeit von Produktionssystemen, 
Werkstattstechnik online 98 (2008) pp. 85-91. 

[12] H.-P. Wiendahl, H. A. ElMaraghy, P. Nyhuis, M. Zäh, H.-H. 
Wiendahl, N. Duffie, M. Brieke, Changeable Manufacturing – 
Classification, Design and Operation, Annals of the CIRP 56(2) 
(2007) pp. 783-809. 

[13] T. Tolio, M. Urgo, J. Váncza, Robust Production Control Against
Propagation of Disruptions, CIRP Annals 60(1) (2011)
pp. 489-492.

[14] D. Gyulai, A. Pfeiffer, L. Monostori, Robust Production Planning 
and Control for Multi-Stage Systems with Flexible Final Assembly 
Lines, International Journal of Production Research 55(13) (2017) 
pp. 3657-3673. 

[15] U. Bergmann, M. Heinicke, Resilience of Productions Systems by 
Adapting Temporal or Spatial Organization, Procedia CIRP 57 
(2016) pp. 183-188. 

[16] D. Gyulai, B. Kádár, L. Monostori, Robust Production Planning
and Capacity Control for Flexible Assembly Lines, IFAC-
PapersOnLine 48(3) (2015) pp. 2312-2317.

[17] D. M. D’Addona, F. Bracco, A. Bettoni, N. Nishino, E.
Carpanzano, A. A. Bruzzone, Adaptive Automation and Human
Factors in Manufacturing – An Experimental Assessment for a
Cognitive Approach, CIRP Annals 67(1) (2018) pp. 455-458.

[18] F. Ansari, P. Hold, W. Sihn, Human-Centered Cyber Physical 
Production System: How Does Industry 4.0 Impact on Decision-
Making Tasks?, Proc. of the IEEE Technology and Engineering 
Management Society Conference, 27 June-1 July 2018. 

[19] L. Lingitz, V. Gallina, F. Ansari, D. Gyulai, A. Pfeiffer, W. Sihn, 
L. Monostori, Lead Time Prediction Using Machine Learning 
Algorithms – A Case Study by a Semiconductor Manufacturer, 
Procedia CIRP 72 (2018) pp. 1051-1056. 

[20] P. Domingos, A Few Useful Things to Know About Machine 
Learning, Commun. ACM 55(10) (2012) p. 78. 

[21] C. Rainer, Data Mining as Technique to Generate Planning Rules
for Manufacturing Control in a Complex Production System, in:
K. Windt (ed.), Robust Manufacturing Control, Lecture Notes in
Production Engineering, Springer, Berlin, Heidelberg, 2013,
pp. 203-214.

[22] A. K. Choudhary, J. A. Harding, M. K. Tiwari, Data Mining in 
Manufacturing – A Review Based on the Kind of Knowledge, J 
Intell Manuf 20(5) (2009) pp. 501-521. 

[23] B. Csáji, L. Monostori, Value Function Based Reinforcement
Learning in Changing Markovian Environments, Journal of
Machine Learning Research (JMLR), MIT Press and Microtome
Publishing 9 (2008) pp. 1679-1709.

[24] Y. Cheng, K. Chen, H. Sun, Y. Zhang, F. Tao, Data and 
Knowledge Mining with Big Data Towards Smart Production, 
Journal of Industrial Information Integration 9 (2018) pp. 1-13. 

[25] H. Lödding, Handbook of Manufacturing Control –
Fundamentals, Description, Configuration, Springer, Berlin,
Heidelberg, 2013, ISBN 978-3-642-24458-2.