Microsoft Word - 42landucci.docx CHEMICAL ENGINEERING TRANSACTIONS VOL. 82, 2020 A publication of The Italian Association of Chemical Engineering Online at www.cetjournal.it Guest Editors: Bruno Fabiano, Valerio Cozzani, Genserik Reniers Copyright © 2020, AIDIC Servizi S.r.l. ISBN 978-88-95608-80-8; ISSN 2283-9216 From Risk Assessment to Resilience Assessment. An Application to a HazMat Storage Plant Tomaso Vairoa*, Andrea P. Reverberib, Bruno Fabianoa a DICCA - Civil, Chemical and Environmental Engineering Dept. – Genoa University, via Opera Pia 15 - 16145 Genoa, Italy b DCCI - Chemistry and Industrial Chemistry Dept., Genoa University, via Dodecaneso 31 - 16145 Genoa, Italy tomaso.vairo@edu.unige.it The purpose of this work is to outline a framework for assessing the resilience of a petrochemical storage plant, through the construction of a dynamic hierarchical Bayesian network. The BN approach allows keeping memory of the states, in order to manage the actual safety and reliability evidences during the petrol transfer operation from storage tank to trucks in a repository of oil products. The proposed framework aims at assessing risk in process plants by analysing continuous process hazard data from a Bayesian point of view. A sequence of hazard functions derived for the FTAs, is modelled with a hidden Markov chain. The capability of the model implemented by means of Markov Chain Monte Carlo methods are tested at a real scale plant. Keywords: data driven model, hidden Markov models, resilience, semi-supervised learning,. 1. Introduction The main objective of risk assessment within industrial settings is the minimization of accident probability or, at least, the preservation of this probability below an acceptable value. QRA is a legislative mandatory requirement in a wide range of industrial plants according to specific Seveso Directives used as the basis of regulations, also outside of the European Union (Fabiano et al., 2017). As commented by Genserik and Pasman (2014), the obtained risk picture of the system is however static, being fully developed at the design stage. The Bayesian approach is currently widely recognised as a proper framework for analysing risk in industrial plants (Vairo et al. 2019, Yang et al. 2013, Kantalarmia et al. 2009). However, the traditional Bayesian approach is unable to keep memory of the previous states of the plant components and thus is unable to catch the transition from “safe” to “unsafe” states, identifying the trend exclusively on the basis of the current state of the system. On these grounds, several studies were performed focusing on a dynamic risk assessment by use of the BN in the process industries. As amply reported, even when performing an accurate risk analysis, it is not possible to rule out uncertainty completely, mainly due to lack of knowledge about the system and the physical variability of a system response (Markowski et al., 2009). Additionally, from a survey on his personal experience in 92 QRAs over a time span of 36 years, Taylor (2016) evidenced that the 26 major accidents were related to uncompleted hazard identification and management not correctly implementing HazId. In this paper, a detailed comparison between the traditional risk analysis and the proposed resilience assessment is carried out, referring selected scenarios involving a significant loss of containment (LOC). Different databases including Lloyds’ Register allow concluding that human error is the main cause of a lot of operational mishaps causing LOCs: they are either covered by the previous HazId phase, or they appear in ageing plants as new causes (e.g. Vairo et al., 2018). On these grounds, the resilience of the system was analysed, i.e. the capacity of the system to respond to disturbances that may occur during the ongoing operations, maintaining a dynamic stability. For each precursor event, resilience analysis was carried out using dynamic Bayesian networks. A priori probabilities obtained from conventional risk analysis procedure are updated on the basis of the evidence gathered in the plant during the operations, then stochastically disturbed by inserting them in Markov-Monte Carlo chains. As recently proposed by Don & Khan, (2019) integrated data driven techniques, including HMM and Bayesian network (BN), allows a successful approach to Abnormal Event Management (AEM) which includes the detection, diagnosis, and DOI: 10.3303/CET2082026 Paper Received: 20 December 2019; Revised: 16 April 2020; Accepted: 28 August 2020 Please cite this article as: Vairo T., Reverberi A., Fabiano B., 2020, From Risk Assessment to Resilience Assessment. an Application to a Hazmat Storage Plant, Chemical Engineering Transactions, 82, 151-156 DOI:10.3303/CET2082026 151 correction of abnormal conditions of faults in a process. Four main cases were selected to test the correction capability of the approach based on new evidence, namely safe operation, hard disrupted operation, human error disrupted operation, event escalation. The overall system resilience is evaluated by a dynamic safety indicator, in relation to the posterior probability density function of the disruptive events. At last, the overall study under development will consider critical comparison of results with the outputs of conventional risk assessment. 2. Methodology As detailed in the following, the development of the model includes a customized implementation of Hidden Markov Model (HMM) coupled with Bayesian inference applied to dynamic fault trees. 2.1 Theoretical approach HMM is a statistical Markov model in which the system being modelled is assumed to be a Markov process with unobservable (i.e. hidden) states, which in the given context represent the transitions between the “safe” and “unsafe” states of the system. The focus of this work is to evaluate the results that HMM semi-supervised learning models could achieve to perform a reliable forecasting of states sequences during critical operations in a Seveso upper tier plant, starting from operational experience and field data. Basic assumption in defining the space of probable sequences is that only pairwise dependencies over hidden states are assumed. HMM generates a sequence of T output variables yt conditioned on a parallel sequence of latent categorical state variables ∈ 1,… . These hidden state variables are assumed to form a Markov chain (MC) so that zt is conditionally independent of other variables, once given zt−1. MC is parameterized by a transition matrix θ where θk is a K-simplex for ∈ 1,… . The probability of transitioning to state zt from state zt−1 is: ~ [ ] (1) The output yt at time t is generated conditionally independently based on the latent state zt (Munkhammar et al. 2018). It is possible describing HMMs with a simple categorical model for outputs yt∈{1,…,V}. The categorical distribution for latent state k is parameterized by a V-simplex ϕk. The observed output yt at time t is generated based on the hidden state indicator zt at time t: ~ ∅ [ ] (2) So HMMs form a discrete mixture model where the mixture component indicators form a latent Markov chain. Given the transition and emission parameters, θk,k′ and ϕk,v and an observation sequence u1, …, uT∈{1,…,V}, the Viterbi algorithm computes the state sequence which is most likely to have generated the observed output u (Blasiak et al. 2011). As widely discussed (Vairo et al. 2019, Meel et al. 2006), fault tree analysis (FTA) can be effectively transposed into dynamic Bayesian Networks. The latter however, can perform forward analysis, being the inference process based on the naive assumption of conditional independence between basic events. In order to overcome this limitation it is possible building a hierarchical network; following the original reasoning by Chatzis et al. (2011), the Bayesian structure was developed starting from the Markov Chain of hidden states. 2.2 Model development The model is developed according to the conceptual scheme depicted in Figure 1. As the failure frequencies of root events are transposed into probability distribution, the first stage consists in performing MCMC sampling (with the conditional rules coming from the fault trees) and obtaining the distributions of intermediate events. The second stage is considering the intermediate distributions (whose observation are the data, related to the root events), and perform a second MCMC sampling from the intermediate (i.e. the posterior of root events) to obtain the Top Event distribution. The third stage aims at identifying the “hidden states” between safe state and failure by applying the customized Hidden Markov Model in which only the first and the last states are known, and the intermediate can be inferred from the observation and the probability distributions. In this way, the probability of correctly classifying the states sequence has a maximum a posterior probability (MAP) probability. From Bayes' rule we obtain: | = | (3) where, Z is the sequence of states and T the emissions (observations). Since P(T) is independent of the sequence Z , the discriminant function to be maximized is: 152 = | (4) Following assumptions are made in order to reduce the problem down to manageable size: • The size of the sequence of observations is not very large. Let n be the size of a word. Then P(C) is the frequency of occurrence of words. • Conditional independence among the features vectors. The shape of a character, which generates a given feature vector, is independent of the shapes of neighboring characters, depending only on the character in question. Under these assumptions, from Eq. (4), one can write: = | , . . . , (5) For the case of the Viterbi algorithm, if we assume that the process is first-order Markov, it follows: = | [ | | . . . | ] (6) Figure 1: Model conceptual scheme. 3. Model validation The case study is based on a petroleum products coastal storage facility located in Northern Italy, close to environmental sensitive areas, thus requiring safety protection priorities (Vairo et al., 2017). The facility is characterized by a storage capacity of about 200,000 cubic meters divided into 21 tanks and covers an area of 62.000 square meters. The facility is connected to the oil terminal pumping station via two 10” and one 16” oil pipelines, through which it is possible to both receive and ship the product by sea. The depot can also transfer product to nearby depots connected with two 6” pipelines. The products handled are mostly finished products (gasoline and diesel) of foreign and national origin; they can be received both by sea, through the equipment of the oil terminal and by pipelines (Figure 2). Figure 2: The coastal storage facility (Italy). 153 3.1 Case-study The operation on which the present study is focused, is the transfer of product from storage tank to tank truck (ATB), with associated Top Event is the product leakage in the loading area. As amply reported, in case of onshore hydrocarbon release of flammable hydrocarbons to the surrounding, environment several types of hazards and different evolving scenarios may be considered, with pool fire covering an approximate figure of 42% of all accidents (Palazzi et al., 2017). As discussed in detail in Pesce et al. (2012), the overall ignition probability should consider conditional probabilities for immediate and delayed ignition accounting for the release rate and the number of ignition sources within the LFL envelope. According to conventional risk analysis approach, the loss of product (including very minor leaks) in the loading area was cautiously estimated at an occurrence frequency equal to 10 -3 occasions per year. The preliminary validation relies on a limited set of identified root events resulting also from operating experience: product loss from valves / flanges, error in tank truck positioning, human error in hardware connections. According to the outlined approach, a resilience assessment is conducted and the overall safety of the operation is measured by means of a dynamic safety indicator. We consider one of the key aspect of the resilience, i.e. the system's ability to respond to disturbances that may occur during operations, while maintaining dynamic stability. For each of the identified deviations, the analysis of system safety was carried out using dynamic BNs. The prior probabilities are those of the traditional risk analysis of the safety report, which were subsequently updated on the basis of the evidence collected in the plant during the course of the operations, then stochastically disturbed by inserting them in MMCC (generation random of independent events following a given probability distribution). Four possible system conditions are identified to test the capability of the model, i.e. normal (safe) conduct of operations; disturbed conduct of operations: valve failures; disturbed conduct of operations: errors in the positioning of the tank; disturbed conduct of operations: human error in observing and escalating events. 4. Results and discussion In the following, we outline in form of immediate readability results obtained by sampling from posterior distributions (red dots) in terms of mean posterior probability of occurrence. As depicted in Fig. 3 (a), the trend of the posterior probability of leakage during steady-state operation decreases: DRA takes into account operational evidence (occurrence of malfunctions, near-misses, etc.), absent in this case, properly updating the likelihood of the given hypotheses. The probability density function depicted in Fig. 3 (b) represents the safety indicator of the system connected to a valve failure. In this case, the probability initially increases, as there is evidence of an actual malfunction. However, the system detects it, therefore corrective measures (for example a replacement intervention) can be put in place before the system fails. After the intervention, the system dynamically resumes stability and the probability is lowered. Fig. 4 (a) shows the safety indicator in case of human error during tank truck load. Analogously, the probability initially increases, because there is evidence of an actual ATB positioning error (detected by the system and corrective measures are enforced before system failure and accident escalation. After the intervention, the system dynamically resumes stability, and the probability is lowered. Fig. 4(b) shows the safety indicator in case of ATB positioning errors. In this case, the probability increases, because there is evidence of a human error in highlighting the malfunctions, which can thus cause an escalation of events. Human error is distributed stochastically, but the system itself, once it has evidence of a loss of containment, can activate the protections. The subsequent identification of Hidden States is performed by stochastically distributing not only the errors, but all the events, inserting them as well as random evidence of the HMM and then performing Montecarlo simulations. Figure 3: (a) Safe operation, leakage on loading area (b) Perturbed operation, leakage on loading area 154 Figure 4: Posterior pdf : (a) perturbed operation, leakage on loading area (b) leakage on loading area, escalation. Figures 5 and 6 clearly evidence the pdf of the cumulative errors and the MCMC trace respectively. After performing the posterior pdf analysis, it is possible emphasizing the most probable sequence of states of the system, visualizing the results as shown in Figure 7. According to this strategy, it is possible observing the overall transitions of the system and deriving quantitative safety indicators. Figure 5: Pdf of cumulative errors. Figure 6: MCMC overall trace. Figure 7: Overall most probable (>94%) sequence of states (transitions) vs. operation time (min). 5. Conclusions Novel dynamic risk approaches are intended to capture the changes and deviations in operations based on collected data. We present a scheme relying on HHM to update the risk level during operation starting from sequence data. Additionally, the method allows extracting precise information on likelihood evidence from HMM towards actual BN updating. From the comparison between the PDF of minor LOCs, it is shown the 155 system ability in resuming dynamic stability, without hazardous consequences, improving as well overall safety performance. The possible states of the system resulting according to the performed tests are as follows, starting from safe conditions in the operation absence. 1. When carrying out operations without disturbances, the system is considered at steady-state safe and the leak posterior probability corresponds to 0.0003. 2. In case of operation disturbances absorbed by hard or soft barriers, the system is safe. The leak posterior probability rises to 0.003 and suddenly returns to 0.0005. 3. When operations with disturbances are performed without preventive barriers actions, but proper protection systems intervene, the system is safe. The leak posterior probability rises to 0.08 due to event escalation, but following protection systems intervention, go back to 0.0002. The benchmark exercise shows that in the given cases, the process resilience is able to ensure the stability of the operations, in case of deviations from steady-state. Under the considered hypotheses, the oscillations of the dynamic safety indicator are contained within 0.3%, with 0-0.12% as 95-98% HPD (Highest Posterior Density). By setting up a dynamic system in which periodically the values are updated and statistically treated, coupled with Bayesian network ability, may enable monitoring resilience trends. Proper validation is still under development by extending a benchmark exercise starting from conventional risk analysis on the same installation. As a further development, the implementation will cover all plant sections, integrating a sensitivity analysis into the dynamic simulation, in order to quantify inputs uncertainties and output uncertainty range. References Blasiak S., Rangwala H., 2011, A Hidden Markov Model variant for sequence classification. IJCAI Proceedings-International Joint Conference on Artificial Intelligence, 22, 1192-1197. Chatzis S., Kosmopoulos D., 2011, A variational Bayesian methodology for hidden Markov models utilizing Student's-t mixtures, Pattern Recognition, 44, 295–306. Don M.G., Khan, F., 2019, Dynamic process fault detection and diagnosis based on a combined approach of hidden Markov and Bayesian network model, Chemical Engineering Science, 201, 82-96. Fabiano B., Vianello C., Reverberi A.P., Lunghi E., Maschio G. 2016, A perspective on Seveso accident based on cause-consequences analysis by three different methods, J. Loss Preven. Process Ind. 49, 18-35. Jain P., Rogers W.J., Pasman, H.J., Mannan M.S. 2018, A resilience-based integrated process systems analysis. Part II management system layer, Process Safety and Environmental Protection, 118, 115-124. Kalantarnia M., Khan F., Hawboldt K. 2009, Dynamic risk assessment using failure assessment and Bayesian theory, Journal of Loss Prevention in the Process Industries, 22, 600-606. Leveson N. 2004. A new accident model for engineering safer systems, Safety Science, 42, 237-270. Markowski A. S., Mannan, M. S., Bigoszewska, A., 2009, Fuzzy logic for process safety analysis, Journal of Loss Prevention in the Process Industries, 22, 695–702. Meel, A., Seider, W. 2006, Plant-specific dynamic failure assessment using Bayesian theory, Chemical Engineering Science, 61, 7036-7056. Palazzi E., Caviglione C., Reverberi A.P., Fabiano B., 2017, A short-cut analytical model of hydrocarbon pool fire of different geometries, with enhanced view factor evaluation, Process Safety and Environmental Protection, 110, 89-101. Pasman H.J., Reniers G., 2014, Past, present and future of Quantitative Risk Assessment (QRA) and the incentive it obtained from Land-Use Planning (LUP), J. Loss Preven. Process Ind., 28, 2–9. Pesce M., Paci P., Garrone S., Pastorino R., Fabiano B., 2012, Modelling ignition probabilities in the framework of quantitative risk assessments, Chemical Engineering Transactions, 26, 141-146. Munkhammar J. Widén J. 2018, An N-state Markov-chain mixture distribution model of the clear-sky index, Solar Energy 173, 487-495. Taylor J.R., 2016, Can process plant QRA reduce risk? – Experience of ALARP from 92 QRA studies over 36 years, Chemical Engineering Transactions, 48, 811-816. Vairo T., Del Giudice T., Quagliati M., Barbucci A., Fabiano B., 2017, From land- to water-use-planning: A consequence-based case-study related to cruise ship risk, Safety Science 97, 120-133. Vairo T., Reverberi A.P., Milazzo M.F., Fabiano B., 2018, Ageing and creeping management in major accident plants according to Seveso III Directive, Chemical Engineering Transactions, 67, 403-408. Vairo T., Milazzo M.F., Bragatto P., Fabiano B., 2019, A dynamic approach to fault tree analysis based on Bayesian Beliefs Networks, Chemical Engineering Transactions, 77, 829-834. Wang H., Khan F., Abimbola M. 2018, A new method to study the performance of safety alarm system in process operations, Journal of Loss Prevention in the Process Industries, 56, 104-118. Yang M. Kahn F., Lye L., 2013, Precursor-based hierarchical Bayesian approach for rare event estimation: a case of oil spill accident, Process Safety and Environmental Protection, 91, 333-342. 156