CET Volume 86


                                                                    DOI: 10.3303/CET2186151 
 

Paper Received: 17 August 2020; Revised: 21 January 2021; Accepted: 9 May 2021 
Please cite this article as: Lisci S., Gitani E., Mulas M., Tronci S., 2021, Modeling a Biological Reactor Using Sparse Identification Method, 
Chemical Engineering Transactions, 86, 901-906  DOI:10.3303/CET2186151 

 CHEMICAL ENGINEERING TRANSACTIONS 
VOL. 86, 2021 

A publication of 

The Italian Association 
of Chemical Engineering 
Online at www.cetjournal.it 

Guest Editors: Sauro Pierucci, Jiří Jaromír Klemeš
Copyright © 2021, AIDIC Servizi S.r.l. 
ISBN 978-88-95608-84-6; ISSN 2283-9216

Modeling a Biological Reactor using Sparse Identification 
Method 

Silvia Lisci a, Elisa Gitani a, Michela Mulas b, Stefania Tronci a,*
a Dipartimento di Ingegneria Meccanica, Chimica e dei Materiali, Università degli Studi di Cagliari, Cagliari, Italy
b Department of Teleinformatics Engineering, Federal University of Ceará, Campus of Pici, Fortaleza (Ceará), Brazil 
 stefania.tronci@dimcm.unica.it.com  

In this work, a model-based controller for a fermentation bioreactor has been developed. By simulating the
model of the process, which acts as a virtual plant, input-output data have been generated and used to identify
the system with the sparse identification of nonlinear dynamics methodology. The identified model is then used
in a model-based algorithm to control the bioreactor temperature, where the manipulated action is obtained by
solving a constrained nonlinear optimization problem that minimizes the mismatch between the predicted
trajectory and the desired one. Good performance has been obtained by applying the proposed control
strategy to set-point changes and disturbance rejection.

1. Introduction

Industrial biological processes are considered an important technological asset for the production of
biochemicals and biofuels. Considerable effort has been devoted to developing mathematical models for
different types of biotechnological processes, to be used for design, operation, optimization, scale-up and
model-based control. However, even though they are widely used, bioprocesses have not reached the same
level of development as traditional chemical processes, particularly with regard to automatic control solutions.
Although bioreactors are relatively simple to operate, the complex network of reactions involved in
microorganism growth makes their control very challenging (Wang et al., 2018). Even slight changes in the
raw material characteristics or in the process operating conditions can affect the growth of the organisms and,
in turn, the product quality. This issue becomes more demanding when considering the production of
biochemicals or biofuels derived from waste, because the type of feedstock and the pre-treatment used to
obtain fermentable sugars have a strong effect on the fermentation, which in turn affects the purification step
(Robak and Balcerek, 2020).
The design of a proper control system can improve the efficiency of the biological system and reduce the
effect of incoming disturbances. Unfortunately, this is not an easy task because of model uncertainties, the
nonlinear nature of the system and the slow response of the process. The complexity of biological processes
is mostly due to the presence of living organisms, whose metabolism is sensitive to process conditions such as
temperature, pH and substrate concentrations (Spigno and Tronci, 2015; Pachauri et al., 2017). A good
description of the input-output relationships of the bioreactor is one of the main ingredients for
designing a proper controller and guaranteeing that the desired conditions are met. The model can be used
to derive the control action required to minimize (or maximize) an objective function, with the possibility of
including known operating constraints in the optimization problem. Caution must be taken when
developing the bioreactor model, because an inaccurate model could lead to either poor performance or
an unstable closed-loop system (Cogoni et al., 2014). On the other hand, first-principles models can be difficult
to obtain, particularly for bioreactors, where it is very difficult to understand and describe all the phenomena
that occur. Data-driven identification is a possible solution (Armenise et al., 2018; Taris et al., 2017),
but algorithms such as neural networks may require a large amount of data and, for complex systems, they
may have a large number of parameters (weights). Brunton et al. (2016) proposed a solution for system
identification based on sparse identification of nonlinear dynamics, which looks for the main functions describing



the dynamics of the observed states. The procedure was successfully applied to different dynamical systems,
considering measurement noise and partial observation of the states.
The main objective of the present work is to design a model-based controller for a fermentation bioreactor,
used as a case study, exploiting the sparse identification approach. The bioreactor model was proposed by
Nagy (2007) and includes a detailed kinetic model together with equations that express the heat transfer, the
dependence of the kinetic parameters on temperature, the mass transfer of oxygen, and the influence of
temperature and ionic strength on the mass transfer coefficient. By simulating the model of the process (virtual
plant), input-output data have been generated and used to identify the system with the sparse identification of
nonlinear dynamics methodology (Brunton et al., 2016; Kaiser et al., 2018). Only a subset of the states is
considered measured, as is usually the case in a real plant. The study aims at obtaining a simple and
parsimonious model that can be successfully applied in a nonlinear optimal control algorithm.

2. Fermentation reactor model

A virtual plant (Nagy, 2007) is used here to address the problem of developing a nonlinear model predictive
controller for a bioreactor.

Figure 1. The continuous fermentation bioreactor  

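The system consists of six states, which are biomass concentration ($c_X$), ethanol concentration ($c_P$), substrate
concentration ($c_S$), dissolved oxygen concentration ($c_{O_2}$), reactor temperature ($T_r$), and jacket temperature
($T_{ag}$), as reported in Eqs. (1-6). The reactant volume is constant.

$\dfrac{dc_X}{dt} = \mu_X c_X \dfrac{c_S}{K_S + c_S} e^{-K_P c_P} - \dfrac{F_e}{V} c_X$  (1)

$\dfrac{dc_P}{dt} = \mu_P c_X \dfrac{c_S}{K_{S1} + c_S} e^{-K_{P1} c_P} - \dfrac{F_e}{V} c_P$  (2)

$\dfrac{dc_S}{dt} = -\dfrac{1}{R_{SX}} \mu_X c_X \dfrac{c_S}{K_S + c_S} e^{-K_P c_P} - \dfrac{1}{R_{SP}} \mu_P c_X \dfrac{c_S}{K_{S1} + c_S} e^{-K_{P1} c_P} + \dfrac{F_i}{V} c_{S,in} - \dfrac{F_e}{V} c_S$  (3)

$\dfrac{dc_{O_2}}{dt} = k_l a \left( c_{O_2}^{*} - c_{O_2} \right) - r_{O_2} - \dfrac{F_e}{V} c_{O_2}$  (4)

$\dfrac{dT_r}{dt} = \dfrac{F_i}{V} (T_{in} + 273) - \dfrac{F_e}{V} (T_r + 273) - \dfrac{r_{O_2} \Delta H_r}{32 \rho_r C_{heat,r}} - \dfrac{K_T A_T (T_r - T_{ag})}{V \rho_r C_{heat,r}}$  (5)

$\dfrac{dT_{ag}}{dt} = \dfrac{F_{ag}}{V_j} (T_{in,ag} - T_{ag}) + \dfrac{K_T A_T (T_r - T_{ag})}{V_j \rho_{ag} C_{heat,ag}}$  (6)

$\mu_X = A_1 \exp\left( -\dfrac{E_{a1}}{R (T_r + 273)} \right) - A_2 \exp\left( -\dfrac{E_{a2}}{R (T_r + 273)} \right), \quad c_{O_2}^{*} = 14.6 - 0.3943 T_r + 0.007714 T_r^2 - 0.0000646 T_r^3$  (7)

where $c_{S,in}$ is the glucose concentration in the feed flow; $F_i$ is the flow entering the bioreactor; $F_e$ is the outlet
flow; $F_{ag}$ is the flow of the coolant agent; $K_i$ is the constant: for oxygen consumption ($i = O_2$), of growth
inhibition by ethanol ($i = P$), of fermentation inhibition by ethanol ($i = P1$), in the substrate term for growth
($i = S$), and in the substrate term for ethanol production ($i = S1$); $\mu_i$ is the maximum specific rate for: oxygen
consumption ($i = O_2$), fermentation ($i = P$), growth ($i = X$); $R_{SP}$ ($R_{SX}$) is the ratio of ethanol produced per
glucose consumed by fermentation (ratio of cell produced per glucose consumed for growth); $Y_{O_2}$ is the yield
factor for biomass on oxygen; $r_{O_2} = \mu_{O_2} \dfrac{1}{Y_{O_2}} c_X \dfrac{c_{O_2}}{K_{O_2} + c_{O_2}}$ is the oxygen consumption rate; $K_T$ is the heat
transfer coefficient; $A_T$ is the heat transfer area; $k_l a$ is the product between the mass-transfer coefficient and the
specific area; $c_{O_2}^{*}$ is the equilibrium concentration of oxygen in the liquid phase; $V$ is the volume of the mass
of reaction; $V_j$ is the volume of the jacket; $C_{heat,r}$ ($C_{heat,ag}$) is the heat capacity of the mass of reaction (heat
capacity of the cooling agent); $\rho_r$ ($\rho_{ag}$) is the reaction mass density (cooling agent density). Other details on
the model and the parameter values can be found in Nagy (2007). Table 1 reports the nominal conditions for
the fermentation system.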

Table 1: Input values at the nominal conditions

Input    F_i [L/h]    F_ag [L/h]    T_in,ag [°C]    c_S,in [g/L]
Value    51           18            15              60

3. Model identification

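A data-driven approach has been used to identify the model from the available outputs and inputs. The
algorithm used in this work is the one proposed by Brunton et al. (2016), which determines the governing
equations of the bioreactor by sparse identification of nonlinear dynamical systems. To obtain the model, the
available $n$ states and $l$ manipulated inputs have been collected, obtaining a time series of length $m$ for each
measurement. The sampled data can be arranged in matrices, as reported in Eq. (8).

$X = \begin{bmatrix} x^T(t_1) \\ \vdots \\ x^T(t_m) \end{bmatrix} = \begin{bmatrix} x_1(t_1) & \cdots & x_n(t_1) \\ \vdots & \ddots & \vdots \\ x_1(t_m) & \cdots & x_n(t_m) \end{bmatrix}, \quad U = \begin{bmatrix} u^T(t_1) \\ \vdots \\ u^T(t_m) \end{bmatrix} = \begin{bmatrix} u_1(t_1) & \cdots & u_l(t_1) \\ \vdots & \ddots & \vdots \\ u_1(t_m) & \cdots & u_l(t_m) \end{bmatrix}$  (8)

Defining the augmented vector $z = [x, u]$, the matrix of the collected data is reported in Eq. (9).

$Z = \begin{bmatrix} z^T(t_1) \\ \vdots \\ z^T(t_m) \end{bmatrix} = \begin{bmatrix} x_1(t_1) & \cdots & x_n(t_1) & u_1(t_1) & \cdots & u_l(t_1) \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ x_1(t_m) & \cdots & x_n(t_m) & u_1(t_m) & \cdots & u_l(t_m) \end{bmatrix}$  (9)

A set of candidate functions needs to be selected to describe the dynamics of the observed states, Eq. (10). In
this work, only terms up to quadratic order, Eq. (11), have been considered, aiming at a simple model.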

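$\Theta(Z) = \begin{bmatrix} 1 & Z & Z^{P_2} \end{bmatrix}$  (10)

The term $Z^{P_2}$ represents the quadratic terms of the model, as represented in Eq. (11).

$Z^{P_2} = \begin{bmatrix} z_1^2(t_1) & z_1(t_1) z_2(t_1) & \cdots & z_2^2(t_1) & \cdots & z_{n+l}^2(t_1) \\ z_1^2(t_2) & z_1(t_2) z_2(t_2) & \cdots & z_2^2(t_2) & \cdots & z_{n+l}^2(t_2) \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ z_1^2(t_m) & z_1(t_m) z_2(t_m) & \cdots & z_2^2(t_m) & \cdots & z_{n+l}^2(t_m) \end{bmatrix}$  (11)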
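
As an illustration of Eqs. (9)-(11), the candidate library can be assembled column by column from the sampled data. The following is a minimal sketch with NumPy, not the implementation used in this work: the function name `quadratic_library` and the random toy data are assumptions made only for the example, while the dimensions (four states, one input) follow the case study.

```python
import numpy as np
from itertools import combinations_with_replacement


def quadratic_library(X, U):
    """Build Theta(Z) = [1, Z, Z^{P2}] for the augmented data Z = [X, U].

    X has shape (m, n) and U has shape (m, l); one row per time sample.
    """
    Z = np.hstack([X, U])                      # augmented data, shape (m, n + l)
    m, d = Z.shape
    columns = [np.ones(m)]                     # constant term
    columns += [Z[:, j] for j in range(d)]     # linear terms z_j
    # quadratic terms z_i * z_j with i <= j (squares and cross products)
    columns += [Z[:, i] * Z[:, j]
                for i, j in combinations_with_replacement(range(d), 2)]
    return np.column_stack(columns)


# Toy data (illustrative only): m = 5 samples, n = 4 states, l = 1 input
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
U = rng.normal(size=(5, 1))
Theta = quadratic_library(X, U)
print(Theta.shape)  # (5, 21): 1 constant + 5 linear + 15 quadratic columns
```

With $n + l = 5$ variables the library has only 21 candidate functions, which keeps the regression problem small.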

The reconstructed dynamics has the following form:

$\dot{X} = \Theta(X, U) \, \Xi$  (12)
Each column of the matrix Ξ is a sparse vector of coefficients determining which terms are active in the
right-hand side of Eq. (12). The sparse solution has been calculated by means of the sequential thresholded
least-squares algorithm (Brunton et al., 2016), which starts with the least-squares solution for Ξ and sets to
zero all coefficients whose magnitude is smaller than a given cutoff value (λ). Then, another least-squares
solution for Ξ is obtained using only the functions that have not been eliminated in the previous step. The new
coefficients are thresholded again, and the procedure is repeated until convergence. To obtain the data set for
identification, the model (1-6) has been excited by appropriate input signals. The quality of the identification
depends on the quality of the data used to obtain the model, which means that the data should contain enough
information on the state dynamics.
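
The thresholding procedure described above can be sketched in a few lines. This is a hedged illustration rather than the code used in this work: the function name `stlsq`, the iteration limit and the toy system $\dot{x} = -2x + 0.5xu$ are assumptions introduced for the example. The cutoff may be a scalar or one value per state.

```python
import numpy as np


def stlsq(Theta, dXdt, cutoff, max_iter=10):
    """Sequentially thresholded least squares for dXdt = Theta @ Xi.

    Theta: (m, p) library, dXdt: (m, n) derivatives,
    cutoff: scalar or array of n cutoff values (one per state).
    """
    cutoff = np.asarray(cutoff, dtype=float)
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]    # initial dense solution
    for _ in range(max_iter):
        Xi_old = Xi
        small = np.abs(Xi_old) < cutoff                 # coefficients to eliminate
        Xi = np.zeros_like(Xi_old)
        for k in range(dXdt.shape[1]):                  # refit each state separately
            active = ~small[:, k]
            if active.any():
                Xi[active, k] = np.linalg.lstsq(
                    Theta[:, active], dXdt[:, k], rcond=None)[0]
        if np.allclose(Xi, Xi_old):                     # coefficients stopped changing
            break
    return Xi


# Toy check (illustrative): recover dx/dt = -2 x + 0.5 x u from noise-free data
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=50)
u = rng.uniform(-1.0, 1.0, size=50)
Theta = np.column_stack([np.ones(50), x, u, x * u])
Xi_true = np.array([[0.0], [-2.0], [0.0], [0.5]])
dXdt = Theta @ Xi_true
Xi = stlsq(Theta, dXdt, cutoff=0.1)
print(Xi.ravel())
```

In the toy check the constant and the linear input term are eliminated, so only the two active functions remain in the identified right-hand side.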



4. Control algorithm

The identified model was introduced in a model predictive control scheme as the internal model used for
prediction when calculating the control move. In each sampling period, the current temperature
measurement is obtained from the virtual plant, and the control action is calculated by solving the optimization
problem (13) (Ogunnaike and Ray, 1994):

$\min_{F_{ag}(k)} \sum_{i} \left\{ T_r^{sp}(k+i) - \hat{T}_r(k+i) - \left[ T_r(k) - \hat{T}_r(k) \right] \right\}^2 + \omega \left[ F_{ag}(k-1) - F_{ag}(k) \right]^2$  (13)

where $\hat{T}_r(k+i)$ is calculated by the identified model, $T_r(k)$ is the measured temperature, $T_r^{sp}$ is the set-point
trajectory, $F_{ag}(k)$ is the coolant flow rate (manipulated input) and $\omega$ is a positive parameter used to penalize
changes in the manipulated input in order to avoid aggressive control action. The sum runs over the steps $i$
of the prediction horizon.
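
The structure of problem (13) can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for the example: `predict` is a toy first-order surrogate, not the identified model; the horizon of 10 steps, the weight `omega` and the surrogate parameters are arbitrary; and a simple grid search over the admissible flow rates replaces the gradient-based solver used in this work.

```python
import numpy as np


def predict(T_now, F_ag, n_steps, dt=0.05):
    """Toy surrogate (NOT the identified model): explicit Euler steps of
    dT/dt = q - k * F_ag * (T - T_cool), with arbitrary illustrative values."""
    q, k, T_cool = 5.0, 0.01, 15.0
    T, out = T_now, []
    for _ in range(n_steps):
        T = T + dt * (q - k * F_ag * (T - T_cool))
        out.append(T)
    return np.array(out)


def objective(F_ag, F_prev, T_meas, T_hat_now, T_sp, omega):
    """Cost of Eq. (13) for a single candidate control move F_ag."""
    T_hat = predict(T_hat_now, F_ag, len(T_sp))   # predicted trajectory
    bias = T_meas - T_hat_now                     # plant/model mismatch at time k
    tracking = np.sum((T_sp - T_hat - bias) ** 2)
    move = omega * (F_prev - F_ag) ** 2           # penalty on input changes
    return tracking + move


def control_move(F_prev, T_meas, T_hat_now, T_sp, omega=1e-3,
                 F_min=0.0, F_max=200.0, n_grid=2001):
    """Bounded minimization by grid search (stand-in for a gradient-based solver)."""
    grid = np.linspace(F_min, F_max, n_grid)
    costs = [objective(F, F_prev, T_meas, T_hat_now, T_sp, omega) for F in grid]
    return float(grid[int(np.argmin(costs))])


# Reactor 1 degree above a 30 degree set-point, previous flow at the nominal 18 L/h
T_sp = np.full(10, 30.0)
F_next = control_move(F_prev=18.0, T_meas=31.0, T_hat_now=31.0, T_sp=T_sp)
print(F_next)
```

Because the temperature starts above the set-point, the computed move increases the coolant flow, while the bounds keep it within the admissible range.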

5. Results

5.1 Model identification 

Only four states have been considered observable ($c_S$, $c_{O_2}$, $T_r$, $T_{ag}$), together with one manipulated
variable ($F_{ag}$); therefore $n = 4$ and $l = 1$ in Eq. (8). The derivatives of the states in Eq. (12) have been obtained
numerically using a fourth-order approximation. Different input sequences have been used to excite the system,
and the best results in terms of reconstruction capability and robustness have been obtained with the
pseudo-random binary sequence (PRBS) shown in Figure 2. Data have been sampled every 0.025 h, and the
following cutoff values gave the best prediction capability:

$\lambda = [\lambda_{c_S}, \lambda_{c_{O_2}}, \lambda_{T_r}, \lambda_{T_{ag}}] = [0.1, 0.2, 0.3, 0.5]$  (14)
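
One common way to obtain a fourth-order accurate derivative from uniformly sampled data is the five-point central difference, sketched below. The text does not state which fourth-order scheme was used, so the stencil, the function name `derivative_4th_order` and the test signals are assumptions made for illustration; only the sampling interval of 0.025 h is taken from the case study.

```python
import numpy as np


def derivative_4th_order(x, dt):
    """Five-point central difference, fourth-order accurate.

    x: array of shape (m, n) sampled every dt. Returns the derivative at the
    interior samples 2 .. m-3, shape (m - 4, n).
    """
    x = np.asarray(x, dtype=float)
    return (-x[4:] + 8.0 * x[3:-1] - 8.0 * x[1:-3] + x[:-4]) / (12.0 * dt)


# Check on signals with known derivatives, sampled every 0.025 h
dt = 0.025
t = np.arange(41) * dt
x = np.column_stack([np.sin(t), t ** 3])
dxdt = derivative_4th_order(x, dt)
exact = np.column_stack([np.cos(t[2:-2]), 3.0 * t[2:-2] ** 2])
print(np.max(np.abs(dxdt - exact)))
```

The first and last two samples have no centered stencil, so the corresponding rows of the library must be dropped before solving Eq. (12).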

 Figure 2. PRBS signal used to excite the bioreactor.  
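
A PRBS of this kind can be generated, for example, with a linear feedback shift register. The sketch below is only an illustration: the text does not say how the sequence was produced, and the 7-bit register (polynomial x^7 + x^6 + 1), the hold time and the two flow levels are hypothetical choices, not the values behind Figure 2.

```python
def prbs7(n_bits, seed=0x7F):
    """Bits of a 7-bit LFSR sequence (x^7 + x^6 + 1), period 127."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of the two tap bits
        state = ((state << 1) | new_bit) & 0x7F       # shift and feed back
        bits.append(new_bit)
    return bits


def prbs_signal(n_samples, hold, low, high):
    """Two-level input: each PRBS bit is held for `hold` samples."""
    n_bits = (n_samples + hold - 1) // hold
    levels = [high if b else low for b in prbs7(n_bits)]
    return [v for v in levels for _ in range(hold)][:n_samples]


# Hypothetical excitation of the coolant flow around its nominal value (L/h)
F_ag_signal = prbs_signal(n_samples=160, hold=4, low=10.0, high=26.0)
print(F_ag_signal[:32])
```

The hold time sets the shortest interval between switches and should be chosen in relation to the process time constants, so that the data carry information on the dynamics of all observed states.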

The comparison between the states calculated by integrating the model (1-6) and those predicted with the
identification procedure described in Section 3 is reported in Figure 3 for the validation set. The identified
model shows good prediction capability, as shown by the fact that the prediction and virtual plant curves are
nearly superimposed.

5.2 Control 

The nonlinear predictive controller has been implemented using the bioreactor model in (1-6) as virtual plant.
The constrained optimization problem was solved using a gradient-based method, where the minimum flow
rate was set equal to 0 L/h and the maximum one to 200 L/h, according to Nagy (2007). The
sample time of the controller is 3 minutes. It is important to note that the reactor is controllable; therefore, it is
in principle possible to obtain the required product composition with the proper control action. The temperature
set-point can therefore be selected such that the required ethanol production is obtained.



Figure 3. Comparison between prediction (dashed red line) and actual values (black continuous line) of 
substrate concentration (upper panel, left), oxygen concentration (upper panel, right), reactor temperature 
(bottom panel, left), cooling agent temperature (bottom panel, right).  

Control performance has been evaluated in terms of set-point tracking using the same temperature set-point
variations proposed in Nagy (2007), who developed a neural network nonlinear temperature controller for the
bioreactor. The comparison shows that the simple identification algorithm and the parsimonious model used in
the present work lead to satisfactory results (Figure 4), with the controller able to efficiently drive the reactor
temperature to the set-point.

Figure 4. Proposed controller performance for set-point changes. Controlled reactor temperature (left panel, 
continuous line), temperature set-point (left, dashed line) and manipulated variable (right panel).   

The controller has also been evaluated in the presence of disturbances, obtained by varying the inlet
temperature of the cooling agent (Figure 5, right panel, right y-axis). The left panel of Figure 5 shows that
the deviation of the reactor temperature from the set-point of 30 °C is very small, thanks to
the optimal trajectory obtained for the manipulated variable, as reported in the right panel of Figure 5
(left y-axis).

[Figure 4 plot. Left panel: T_r [°C] versus Time [h]. Right panel: F_ag [l/h] versus Time [h].]


Figure 5. Proposed controller performance for disturbance rejection. Controlled reactor temperature (left
panel, continuous line), temperature set-point (left panel, dashed line). Manipulated variable (right panel,
continuous line, left y-axis) and inlet coolant agent temperature (right panel, dashed line, right y-axis).

6. Conclusions

One of the most demanding issues in the development of advanced control systems for
bioreactors is obtaining an accurate input-output model. This is because of the complex phenomena
occurring in the biosystem and the lack of adequate monitoring tools. This problem has been addressed here
by exploiting the sparse identification of nonlinear dynamics algorithm, which has been used to identify the
nonlinear model of a bioreactor for ethanol production by fermentation of glucose. The aim was to obtain a
parsimonious model with a limited number of measured states that could be used to develop a model-based
controller, where the manipulated variable action is calculated by solving a nonlinear optimization problem.
The identified model gave a good reconstruction of the observed states, which led to good performance of
the optimal model-based controller. Some issues need to be addressed in the future,
such as the introduction of measurement noise and the development of a MIMO optimal controller to guarantee
the product quality even in the presence of disturbances that can affect the conversion (e.g., variation of pH,
presence of contaminants in the reactor feed, variation of inlet substrate concentration).

References 

Armenise, G., Vaccari, M., Di Capaci, R.B., Pannocchia, G., 2018, An Open-Source System Identification 
Package for Multivariable Processes, In 2018 UKACC 12th International Conference on Control 
(CONTROL) (pp. 152-157). IEEE. 

Cogoni, G., Tronci, S., Baratti, R., Romagnoli, J.A., 2014, Controllability of semibatch nonisothermal 
antisolvent crystallization processes, Industrial and Engineering Chemistry Research, 53(17), 7056-7065. 

Brunton, S.L., Proctor, J.L., Kutz, J.N., 2016, Discovering governing equations from data by sparse
identification of nonlinear dynamical systems, Proceedings of the National Academy of Sciences, 113(15),
3932-3937.

Kaiser, E., Kutz, J.N., Brunton, S.L., 2018, Sparse identification of nonlinear dynamics for model predictive
control in the low-data limit, Proceedings of the Royal Society A, 474(2219), 20180335.

Lisci, S., Grosso, M., Tronci, S., 2020, A Geometric Observer-Assisted Approach to Tailor State Estimation in
a Bioreactor for Ethanol Production, Processes, 8(4), 480.

Nagy, Z.K., 2007, Model based control of a yeast fermentation bioreactor using optimally designed artificial
neural networks, Chemical Engineering Journal, 127(1-3), 95-109.

Ogunnaike, B.A., Ray, W.H., 1994, Process dynamics, modeling and control, Oxford University Press.

Pachauri, N., Singh, V., Rani, A., 2017, Two degree of freedom PID based inferential control of continuous
bioreactor for ethanol production, ISA Transactions, 68, 235-250.

Robak, K., Balcerek, M., 2020, Current state-of-the-art in ethanol production from lignocellulosic feedstocks,
Microbiological Research, 126534.

Spigno, G., Tronci, S., 2015, Development of hybrid models for a vapor-phase fungi bioreactor, Mathematical
Problems in Engineering, 2015.

Taris, A., Grosso, M., Brundu, M., Guida, V., Viani, A., 2017, Application of combined multivariate techniques
for the description of time-resolved powder X-ray diffraction data, J. of Applied Crystallography, 50, 451-461.

Wang, P., Tong, H., Shao, H., 2018, Application of Computer Monitoring Technology in Industrial Ethanol 
Production and Fermentation. Chemical Engineering Transactions, 71, 409-414. 

[Figure 5 plot. Left panel: T_r [°C] versus Time [h]. Right panel: F_ag [l/h] (left y-axis) and T_in,ag [°C] (right y-axis) versus Time [h].]