CHEMICAL ENGINEERING TRANSACTIONS, VOL. 86, 2021
A publication of The Italian Association of Chemical Engineering. Online at www.cetjournal.it
Guest Editors: Sauro Pierucci, Jiří Jaromír Klemeš
Copyright © 2021, AIDIC Servizi S.r.l. ISBN 978-88-95608-84-6; ISSN 2283-9216
Paper Received: 22 October 2020; Revised: 9 March 2021; Accepted: 26 April 2021
Please cite this article as: Vairo T., Reverberi A.P., Bragatto P.A., Milazzo M.F., Fabiano B., 2021, Predictive Model and Soft Sensors Application to Dynamic Process Operative Control, Chemical Engineering Transactions, 86, 535-540 DOI:10.3303/CET2186090

Predictive Model and Soft Sensors Application to Dynamic Process Operative Control

Tomaso Vairo (a,*), Andrea P. Reverberi (b), Paolo A. Bragatto (c), Maria F. Milazzo (d), Bruno Fabiano (a)

(a) DICCA - Civil, Chemical and Environmental Engineering Department, Polytechnic School - Genova University, via Opera Pia 15 - 16145 Genova, Italy
(b) DCCI, Chemistry and Industrial Chemistry Department - Genova University, Via Dodecaneso 31, 16146 Genova, Italy
(c) INAIL - Technological Innovation Department, Research Centre, via Fontana Candida, 1, 00078 Monteporzio, Italy
(d) Dip.Inge. - Department of Engineering, University of Messina, Contrada di Dio, 98166 Messina, Italy
(*) tomaso.vairo@edu.unige.it

As amply acknowledged, operational errors are among the most common causes of plant equipment deterioration; consequently, operational control plays a determining role in managing and slowing down the effects of aging and in enhancing safety. Given the ongoing trend towards advanced sensors and real-time performance monitoring, novel approaches represent an up-to-date research topic in changing risk environments. In designing and implementing reliable operational control systems based on data-driven models and machine learning for predicting the system behaviour, one of the critical issues to deal with is the co-existence of Boolean elements (e.g. failure of instruments) and analogical elements (deviation of process variables). Starting from this observation, this paper outlines a hybrid system consisting of a hierarchical predictive network, where the inputs to the analogical elements are the values of the process variables predicted by deep learning neural networks (soft sensors). The combination of the two approaches allows Boolean events and process variables to be integrated in an overall predictive dynamic model. To verify the actual capability of the system, a pilot application to a hydrocarbon storage park is considered. Given optimal training sets, the predictive system provides quasi-real-time predictions, with an overall accuracy in the case study higher than 98% over the whole simulation test series.

1. Introduction

As observed by Hutchins (1995), highly reliable performance depends upon deep knowledge of the operating environment and its limitations, and on the possibility of correcting observed errors in any part of the organization's performance. Operational errors are identified as one of the most important causes of plant equipment deterioration; consequently, operational control is one of the main systems for managing and slowing down the effects of aging and for enhancing safety. Process plants are complex systems and require precise supervision to remain within safe conditions but, owing to their nonlinear characteristics and multiple operating conditions, traditional process monitoring methods cannot always be applied effectively. The large datasets now collected in process plants, which describe the system state, should be properly processed and integrated to produce meaningful risk information. The prediction of critical process variables plays a central role in the prevention of major accidents, in line with the key concepts of resilience assessment (Pasman et al., 2020).
System resilience and dynamic risk management are the key pillars in the transition towards the new paradigm for safety science in the era of big data and Industry 4.0, recently defined as Safety 4.0 (Laciok et al., 2021). The novel concept of dynamic risk analysis has gained attention among industrial practitioners and researchers; coupled with digitalization, it is expected to improve safety and to create the need for a further revised version of the Seveso Directives (Laurent et al., 2021). As widely known, a trained machine learning (ML) model can be applied in industrial safety and process optimization to multiple targets: extracting useful and actionable information from historical data, and predicting, classifying and clustering new process outputs. The Bayesian inferential approach is widely applied in risk assessment (Vairo et al., 2019; Yang et al., 2013; Kalantarnia et al., 2009), even though the connection with the process variables may be difficult. In fact, the Bayesian approach is tailored for predicting Boolean events, such as failures, malfunctions and unavailability, but its predictive capability decreases when dealing with analogical variables. To solve this issue, different fault detection and diagnosis methods based on data-driven approaches have been proposed. For example, Adedigba et al. (2017) combined a Bayesian network (BN) with principal component analysis to detect faults in a crude oil distillation unit. Don & Khan (2019) integrated data-driven techniques, including a hidden Markov model (HMM) and a BN, attaining a reliable approach to Abnormal Event Management (AEM), which includes the detection, diagnosis and correction of abnormal conditions or faults in a process.
The development of the hybrid system presented here started from the risk assessment outcome, which leads to the identification of the precursors; then, a predictive soft sensor based on deep learning algorithms was developed to predict target variables and improve the decision-making process for the prevention of hazardous conditions. The main contribution of this paper is a novel process-monitoring framework based on machine learning for identifying the different abnormal events of the system, considering safety-relevant deviations leading to possible faults. To validate the presented work and analyse the performance of the proposed technique, an oil terminal case study is considered. Four main cases were selected to test the correction capability of the tool based on new evidence, namely safe operation, hard disrupted operation, human error disrupted operation, and event escalation. Finally, the predicted oscillations of critical variables were integrated into a hierarchical predictive model based on Bayesian inference.

2. Methodology

2.1. Theoretical background

As previously stated, the objectives of this paper can be summarized as follows:
• Application of a Deep Learning Artificial Neural Network (ANN) to predict outputs from plant monitoring datasets.
• Evaluation of predictions and accuracy of the model.
• Improvement of parameters to scope the development of ML models.

A deep neural network (DNN) is an ANN characterized by multiple layers between the input and output layers (Hinton et al., 2012). There are different types of neural networks, but they always consist of the same components: neurons, synapses, weights, biases, and functions. DNNs can model complex non-linear relationships. DNN architectures generate compositional models where the object is expressed as a layered composition of primitives (Szegedy et al., 2013).
The extra layers enable composition of features from lower layers, potentially modelling complex data with fewer units than a similarly performing shallow network (Bengio, 2009). Deep architectures include many variants of a few basic approaches. Each architecture has found success in specific domains, and it is not always possible to compare the performance of multiple architectures unless they have been evaluated on the same data sets. DNNs are typically feedforward networks, in which data flows from the input layer to the output layer without looping back. At first, the DNN creates a map of virtual neurons and assigns random numerical values, or "weights", to the connections between them. The weights and inputs are multiplied and return an output between 0 and 1. If the network does not accurately recognize a particular pattern, an algorithm adjusts the weights (Hof, 2018), making certain parameters more influential, until it determines the correct mathematical manipulation to fully process the data.

Figure 1: Conceptual elements of an ANN (μ = system inputs, Xn = data signal, Wn = synaptic weight, Bk = bias, vk = weighted input + bias, σ = activation function, Yk = system output)

Figure 1 summarizes the conceptual elements of the ANN framework developed here. The activation function performs data analysis and processing: it takes the input values as its argument and releases the output. When the sum of weighted inputs and biases exceeds a precise activation threshold, the activation function considers the argument valid and processable. The activation function operates within the hidden and output neurons. The bias parameter has a role similar to that of the weight parameters, as it shifts the input sum and thus determines whether that sum is considered acceptable by comparison with the activation threshold value.
The bias value added to the input signal influences the activation function argument and may be considered as a value added to the weighted inputs for the endorsement of the argument of the activation function. The weight parameter quantifies the importance of the input coming from each input neuron: the greater the weight of the input branch, the more important the information is for the ANN target. Weights and biases are corrected by the learning algorithm to adapt the ANN to the input dataset. In the training step, the goal is the minimization of the learning error (N = training data dimension), which is calculated in terms of square error (SE):

$$SE = \frac{1}{2} \sum_{i=1}^{N} \left( T_i - Y_i \right)^2 \qquad (1)$$

where T_i is the target (observed) value and Y_i is the network output for the i-th training sample. The output is:

$$Y = \sigma\left( A_1 W_1 + A_2 W_2 + \dots + A_n W_n + B \right) \qquad (2)$$

where: A_i = inputs; W_i = weights; B = bias; σ = activation function. Lastly, the cost function is obtained, in the same notation, as follows:

$$C = \frac{1}{2} \sum_{i=1}^{N} \left( T_i - Y_i \right)^2 \qquad (3)$$

2.2. The machine learning pipeline

A machine learning pipeline is used to help automate machine learning workflows. It operates by enabling a sequence of data to be transformed and correlated together in a model that can be tested and evaluated to achieve an outcome, whether positive or negative. Machine learning (ML) pipelines consist of several steps to train a model. They are iterative, in that every step is repeated to continuously improve the accuracy of the model and achieve a successful algorithm. To build better machine learning models and obtain reliable, accessible data, scalable and durable storage solutions are imperative, paving the way for on-premises object storage. A classic machine learning pipeline consists of a stepwise procedure including, in sequence: data collection and cleaning; feature extraction (labelling and dimensionality reduction); model validation; and, finally, prediction accuracy determination. A schematic layout of the conceived approach is reproduced in the form of an ML pipeline in Figure 2. It is noted that the test set and the training set have the same sampling interval.
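The data-handling steps of such a pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 80/20 split, the min-max scaling and the lag/horizon windowing are all assumptions chosen for illustration. The split is chronological, so that the training and test sets keep the same sampling interval, and the scaling bounds are learned on the training set only.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float],
                        train_fraction: float = 0.8) -> Tuple[List[float], List[float]]:
    """Split a time series without shuffling: both parts keep the original
    sampling interval and the test set follows the training set in time."""
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def fit_min_max(train: Sequence[float]) -> Tuple[float, float]:
    """Learn the scaling bounds from the training data only."""
    return min(train), max(train)


def scale(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map values to [0, 1] using bounds learned on the training set."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def make_windows(values: Sequence[float], n_lags: int,
                 horizon: int) -> List[Tuple[List[float], float]]:
    """Build (features, target) pairs: the last n_lags samples are the
    features, the value `horizon` steps after the last feature is the target."""
    pairs = []
    for end in range(n_lags, len(values) - horizon + 1):
        features = list(values[end - n_lags:end])
        target = values[end + horizon - 1]
        pairs.append((features, target))
    return pairs


# Hypothetical temperature readings, used only to exercise the functions.
readings = [float(v) for v in range(20, 30)]
train, test = chronological_split(readings, train_fraction=0.8)
lo, hi = fit_min_max(train)
train_pairs = make_windows(scale(train, lo, hi), n_lags=3, horizon=2)
```

Fitting the scaler on the training portion only avoids leaking information from the test period into the model, which matters when the test error is later used as evidence of predictive capability.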
Figure 2: Typical Machine Learning pipeline

3. Case study

In order to verify the actual capability of the framework, we considered as a pilot case study a coastal storage facility located in Northern Italy, close to environmentally sensitive areas (Vairo et al., 2017). The plant has a storage capacity of about 200,000 m³ divided into 21 tanks and covers a coastal area of 62,000 m². The facility is connected to the oil terminal pumping station via two 10” and one 16” oil pipelines, through which it is possible both to receive and to ship the product by sea. The handled products are mostly finished hydrocarbon (HC) products (gasoline and diesel) of foreign and national origin; they can be received both by sea, through the equipment of the oil terminal, and by the pipelines. The plant is equipped with a real-time monitoring system, properly updated to transfer actual process parameter values to the designed predictive system, with a focus on discovering unanticipated behaviours and creating reliable alarms, in accordance with the early-warning pillar of resilience. Figure 3 schematically depicts the Vapour Recovery Unit (VRU) section and the main lines, including the main vapour, vapour recovery, regeneration and adsorption streams, the vacuum pump suction circuit and the water drainage. Given the core activity of the storage site, in case of accidental HC releases the most probable accidental scenarios include either atmospheric releases and sea spills, liable to cause environmental damage in different matrices (Vairo et al., 2014), or fire/explosion scenarios due to the presence of ignition sources (Pesce et al., 2012). The core of the VRU is the activated carbon adsorption; the process is based on Pressure Swing Adsorption (PSA), with two adsorbers operating alternately, one active and the other at the regeneration stage.
As in many process engineering tasks, accurate temperature prediction and estimation of the heat transfer coefficient are required to guarantee optimal operative and safety performance (Reverberi et al., 2013). The activated carbon bed temperature represents the control parameter for the physical state of the unit and for the regeneration step sequence. Additionally, it is selected as the critical parameter for dynamic early warning of possible hazardous scenarios, with set points at T = 70 °C for alarm activation and T = 93 °C for emergency shutdown.

Figure 3: Schematic diagram of the Vapour Recovery Unit (VRU)

4. Results and discussion

Table 1 summarizes the process variables monitored in real time in the VRU section of the plant, with a sampling interval of 5 s. The whole dataset adopted in the study includes observations over a one-year time span.

Table 1: Hardware sensors and relevant process parameters monitored in the VRU process section.

Monitoring system             Aim
TT101                         Temperature sensor in the upper area of the adsorbent filter V-1
TT102                         Temperature sensor in the lower area of the adsorbent filter V-1
TT201                         Temperature sensor in the upper area of the adsorbent filter V-2
TT202                         Temperature sensor in the lower area of the adsorbent filter V-2
PIT101                        Inlet pressure transmitter for filter lines V-1 and V-2
PIT501                        Pressure transmitter on the vacuum line that manages SV-101/201 and SV-501
VOC_INLET                     Concentration of vapours entering the system
CIM_FLOW                      CIM flow rate inside the plant
Extinguishing water circuit   Extinguishing water circuit

The model is designed with 4 DL-ANNs (Deep Learning Artificial Neural Networks) based on the resilient backpropagation algorithm, with a structure having the following characteristics, as resulting after the validation phase: hidden layers: 6; neurons in each hidden layer: 24; learning rate: 1E-7; step max: 1E8; activation function: tanh; error function: SSE; prediction time interval: 15 min.
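To make the stated structure concrete, the following sketch builds a feedforward network with 6 hidden layers of 24 tanh neurons and evaluates the half sum of squared errors used as the learning error in Section 2.1. It is an illustration under stated assumptions, not the authors' code: the number of inputs (8), the linear output layer, the random untrained weights and all function names are hypothetical, and the resilient backpropagation training step is deliberately omitted. The last function maps a predicted bed temperature to the 70 °C and 93 °C set points, assuming a state is triggered as soon as the set point is reached.

```python
import math
import random
from typing import Dict, List, Sequence


def init_network(n_inputs: int, n_hidden_layers: int = 6, n_neurons: int = 24,
                 n_outputs: int = 1, seed: int = 0) -> List[Dict[str, list]]:
    """Create randomly initialised (untrained) weights and zero biases."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [n_neurons] * n_hidden_layers + [n_outputs]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        limit = 1.0 / math.sqrt(fan_in)  # assumed initialisation range
        layers.append({
            "W": [[rng.uniform(-limit, limit) for _ in range(fan_in)]
                  for _ in range(fan_out)],
            "b": [0.0] * fan_out,
        })
    return layers


def forward(layers: List[Dict[str, list]], inputs: Sequence[float]) -> List[float]:
    """Propagate inputs: tanh on hidden layers, linear output (assumption)."""
    activations = list(inputs)
    for index, layer in enumerate(layers):
        # Weighted input plus bias for every neuron of the layer.
        v = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(layer["W"], layer["b"])]
        is_output = index == len(layers) - 1
        activations = v if is_output else [math.tanh(x) for x in v]
    return activations


def sum_squared_error(targets: Sequence[float],
                      predictions: Sequence[float]) -> float:
    """Half sum of squared errors between targets and predictions."""
    return 0.5 * sum((t - p) ** 2 for t, p in zip(targets, predictions))


ALARM_C = 70.0      # alarm set point stated in the case study
SHUTDOWN_C = 93.0   # emergency shutdown set point stated in the case study


def classify_temperature(temp_c: float) -> str:
    """Map a predicted carbon bed temperature to an operating state."""
    if temp_c >= SHUTDOWN_C:
        return "emergency shutdown"
    if temp_c >= ALARM_C:
        return "alarm"
    return "normal"


network = init_network(n_inputs=8)        # 8 inputs is a hypothetical choice
prediction = forward(network, [0.1] * 8)  # scaled, made-up input vector
```

Because the prediction is issued ahead of time, a classification of this kind can raise the alert before the measured temperature actually reaches the set point, which is the early-warning use described for the soft sensor.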
Figures 4 a-d show the scatterplots of predicted values vs. plant observed values (monitored empirical data).

Figure 4: Scatterplots of predicted values (x-axis) vs. observed ground truth (y-axis) for temperature sensors TT101 (a), TT102 (b), TT201 (c), TT202 (d)

The robust and accurate prediction ability is reflected in a clear-cut overlap between predicted and field-observed data. The analogical variables are used in a Hierarchical Bayesian Network (HBN) to predict the posterior probability of failure of the component in accordance with the hybrid model, as shown in Figure 5. According to the outlined approach, by integrating the predictions of DL-ANN-based soft sensors in an HBN, it is possible to obtain a continuous probability trend even for those elements characterized by a Boolean risk, such as failures of plant components. The DL-ANN model provides the predictions of the variables. In the deep learning approach, the synaptic weights are modified in each cyclic iteration. The early warning principle of resilience is used to determine the set points of the variables. The BN model relies on the probabilistic data and on the data forecasted by the DL-ANN. The BN part of the hybrid model updates the risk probabilities by integrating the predictions of the variables step by step, taking into account the previous step (hierarchical). The algorithm performed the prediction with remarkable accuracy and precision, as demonstrated by the standard statistical indicators shown in Table 2.

Figure 5: Hybrid model for Boolean probability estimation

Table 2: Errors and Accuracy

           TT101   TT102   TT201   TT202
RMSE       0.008   0.003   0.003   0.004
MAE        0.005   0.002   0.003   0.003
Accuracy   0.999   0.999   0.999   0.999

5. Conclusions

This research work presents a hybrid model incorporating different data-driven models in a complete, logical and interconnected model (DL-ANN - HBN).
The former exhibits a robust predictive capability on the process variables, while the latter explores the interdependencies among the system components and their modification alongside process variable fluctuations. The latter can thus generate a dynamic risk indicator connected with the prediction of the process variables. The analysis of the outcomes demonstrated that the framework can be used for reliable predictions in terms of accuracy and test error performance. The tested K-fold cross-validation algorithm helped avert potential problems of over-fitting and under-fitting. The ANN algorithm represents an optimal solution for designing predictive soft sensors, provided that a large amount of data is available, obtained through real-time distributed monitoring systems. The hybrid model gives a real-time estimate of the likelihood of component failures by analysing the predictions of the variables, capturing the temporal and spatial dependencies of the relevant process parameters and the interdependencies in the component failure analysis. Even if easy to construct, it can provide robust performance, thus representing a sharp jump towards early detection of weak system signals and overall system resilience and resulting, in perspective, in a risk function characterized by predictive capabilities and the ability to be updated with time.

Acknowledgment

This research was funded by INAIL within the framework of the call BRIC/2018/ID2 (Project DYN-RISK).

References

Adedigba S.A., Khan F., Yang M., 2017, Dynamic failure analysis of process systems using principal component analysis and Bayesian network, Ind. Eng. Chem. Res., 56, 2094-2106.
Bengio Y., 2009, Learning deep architectures for AI, Foundations and Trends in Machine Learning, 2, 1-127.
Don M.G., Khan F., 2019, Dynamic process fault detection and diagnosis based on a combined approach of hidden Markov and Bayesian network model, Chemical Engineering Science, 201, 82-96.
Hinton G.E., Srivastava N., Krizhevsky A., Sutskever I., Salakhutdinov R.R., 2012, Improving neural networks by preventing co-adaptation of feature detectors, arXiv:1207.0580.
Hof R.D., 2018, Is Artificial Intelligence finally coming into its own?, MIT Technology Review.
Hutchins E., 1995, Cognition in the Wild, MIT Press, Cambridge, MA.
Kalantarnia M., Khan F., Hawboldt K., 2009, Dynamic risk assessment using failure assessment and Bayesian theory, J. Loss Prev. Process Ind., 22, 600-606.
Laciok V., Sikorova K., Fabiano B., Bernatik A., 2021, Trends and opportunities of tertiary education in safety engineering moving towards Safety 4.0, Sustainability, 13, 524. https://doi.org/10.3390/su13020524
Laurent A., Pey A., Gurtel P., Fabiano B., 2021, A critical perspective on the implementation of the EU Council Seveso Directives in France, Germany, Italy and Spain, Process Saf. Environ. Prot., 148, 47-74.
Pasman H., Kottawar K., Jain P., 2020, Resilience of process plant: what, why, and how resilience can improve safety and sustainability, Sustainability, 12, 6152.
Pesce M., Paci P., Garrone S., Pastorino R., Fabiano B., 2012, Modelling ignition probabilities in the framework of quantitative risk assessments, Chemical Engineering Transactions, 26, 141-146.
Reverberi A.P., Fabiano B., Dovì V.G., 2013, Use of inverse modelling techniques for the estimation of heat transfer coefficients to fluids in cylindrical conduits, Int. Commun. Heat Mass Transf., 42, 25-31.
Szegedy C., Toshev A., Erhan D., 2013, Deep neural networks for object detection, Advances in Neural Information Processing Systems, 2553-2561.
Vairo T., Milazzo M.F., Bragatto P., Fabiano B., 2019, A dynamic approach to fault tree analysis based on Bayesian Beliefs Networks, Chemical Engineering Transactions, 77, 829-834.
Vairo T., Currò F., Scarselli S., Fabiano B., 2014, Atmospheric emissions from a fossil fuel power station: Dispersion models comparison, Chemical Engineering Transactions, 36, 295-300.
Vairo T., Del Giudice T., Quagliati M., Barbucci A., Fabiano B., 2017, From land- to water-use-planning: A consequence-based case-study related to cruise ship risk, Safety Science, 97, 120-133.
Yang M., Khan F., Lye L., 2013, Precursor-based hierarchical Bayesian approach for rare event estimation: a case of oil spill accident, Process Safety and Environmental Protection, 91, 333-342.