CHEMICAL ENGINEERING TRANSACTIONS
VOL. 77, 2019
A publication of The Italian Association of Chemical Engineering
Online at www.cetjournal.it
Guest Editors: Genserik Reniers, Bruno Fabiano
Copyright © 2019, AIDIC Servizi S.r.l.
ISBN 978-88-95608-74-7; ISSN 2283-9216

Managing Process Safety in the Age of Digital Transformation

Simon R. Jones
Sphera, Pavilion 3, Craigshaw Business Park, Aberdeen, AB12 3QH, UK
simon.jones@sphera.com

Leading organizations continue to implement Process Safety Management (PSM) frameworks in an effort to reduce risk and create safe, sustainable and reliable operations. However, major incidents still occur. Even with the best PSM initiatives in place, it is difficult for organizations to understand the overall health of their assets and their process safety barriers in production operations over time. According to Petrotechnics’ 2018 survey on process safety and risk management, 86% reported a gap between how process safety is intended and the reality of its implementation in operations. In 2017, an overwhelming 90% said risk awareness and safety would be improved with access to real-time process safety risk indicators – and yet the 2018 survey reveals that today 60% of companies are not proactively monitoring and managing impaired process safety barriers.

An emerging category of enterprise software system for Operational Risk Management seeks to close this gap. Operational risks arise from a complicated set of interrelated parameters and are viewed and managed in differing ways depending on the role and level in the organization. The challenge these systems seek to address is to simplify this complexity using a barrier health based risk model to enable users to focus on the risk-drivers that are most important from a process safety viewpoint.
This paper shares the outcomes of two projects where major operators have taken a new approach to operational risk assessment and asset integrity data to better support decision-making in relation to major hazard management.

1. Introduction

Process Safety approaches developed and implemented over the past 20–30 years have, few would dispute, enabled us to improve the design basis for our facilities. The use of a risk-based approach is commonplace and is a requirement of many regulatory bodies around the world. The process safety practices and guidelines for designing, managing and operating the facility are well-known and documented by regulatory and engineering bodies. The rules of how to run facilities, maximize production and manage risks are typically encoded in organizations’ Operational Management systems.

But how do we know good practice is being applied? How do we ensure it is effective, and how do we connect this practice to the frontline? Traditionally, KPIs and audits are our principal tools, yet despite our investments in performance monitoring and continuous improvement, the industry continues to experience catastrophic accidents.

The 25th edition of the Marsh Report (2018) outlines the most significant losses in the Hydrocarbon industry between 1978 and 2017, covering the costs of property damage, debris removal, and clean-up, all normalized to 2017 values. The report highlights that in the last two years there has been a spike in the number of high-cost downstream oil and gas losses which have had a significant impact on the global energy industry. For the same period, the authors draw attention to the findings from engineering surveys undertaken by the insurance sector. The findings highlight issues with work management systems, such as permit to work, shift handover and management of change, and also with fundamental inspection processes and practices.
The report also references other insurance industry loss analyses which indicate “mechanical integrity failure” was responsible for 57% of human-influenced losses, concluding: “Given these concerns, and combined with high operating rates, reduced staffing levels, and other cost saving programs, operators must maintain high levels of monitoring and vigilance to ensure that asset integrity is being maintained and accidents are eliminated.”

DOI: 10.3303/CET1977104
Paper Received: 15 February 2019; Revised: 25 May 2019; Accepted: 28 June 2019
Please cite this article as: Jones S., 2019, Managing Process Safety in the Age of Digital Transformation, Chemical Engineering Transactions, 77, 619-624 DOI:10.3303/CET1977104

2. Survey of process safety and operational risk engineers in 2018

In 2018, Petrotechnics conducted its second major survey of process safety and operational risk engineers. A total of 108 process safety, asset integrity and operational risk management senior leaders from around the world participated. 53% of respondents have more than 15 years’ experience, with 61% holding a Corporate or Regional management position. In terms of sectors represented, 55% came from the oil and gas sector, 20% from the chemicals sector, and the remaining 25% from a variety of other hazardous industries.

When asked in 2018 about the operational reality of risk from their viewpoint, the following figures emerge:
• 86% believe there are gaps between how process safety is intended and what actually happens on the plant/asset - up from the 2017 survey figure of 70%
• 56% found an increase in risk on their plant when undertaking periodic process safety reviews (for example, 3-5 year periodic reviews of PHAs, HAZOPs, HAZIDs, LOPA studies, etc.)

This speaks to the challenge faced when good design enters into service at the frontline – the operational environment causes processes, systems and equipment to degrade over time, thereby increasing risk.
It seems that this change in risk is only highlighted when periodic process safety reviews are undertaken.

When asked about the delivery of scheduled maintenance on safety-critical elements on facilities:
• An average of 73% of scheduled safety-critical maintenance is achieved, and 22% do not think it’s practical to achieve 100% scheduled safety-critical maintenance
• When asked why, conflicting priorities (75%) and limited resources (72%) are said to be the main challenges to delivering planned safety-critical maintenance

Following up on two key pieces of feedback from the 2017 Survey (Petrotechnics 2017a):
• “It’s important that we understand hazards on a real-time basis and that the continual state of barriers is maintained as designed to reduce incidents.”
• “Everyone would be more thoughtful on ensuring barriers perform to standards if they truly understood what the barrier was trying to prevent.”

The 2018 survey found that only 38% believe industry operators proactively manage process safety. Respondents also believe that companies do not have effective systems in place for:
• monitoring and managing impaired process safety barriers (60%)
• monitoring and managing deviations from management system requirements or expectations (64%)

3. Can digitalization help close the gap?

There is increasing focus and attention on the potential for new digitalization strategies to deliver increased value and sustainability in the energy and petrochemicals sector. Over 73% of industry leaders recognize the power of digitalization to accelerate and provide sustainable operational excellence (Petrotechnics 2017b). A reduction in operating costs, broader operational efficiencies, and fundamental transformation of the business are what is expected.
The promises of data connectivity and analytics suggest continuous uptime, rapid response to risk exposure, incremental revenue gains, and opportunities to better utilize assets, coordinate with operating and business needs, and improve the efficiency of field service groups.

According to McKinsey: "The effective use of digital technologies in the oil and gas sector could reduce capital expenditures by up to 20 percent; it could cut operating costs in upstream by 3 to 5 percent and by about half that in downstream." (Choudhry, H. et al., 2016)

In many ways, the dynamic nature of the frontline is ripe for technology to better support decision-makers. In fact, according to a Verdantix study, Operational Excellence and Industry 4.0 strategies are among the top factors triggering operational risk management implementation today. Over 73% consider digital technology valuable, if not essential, for effective operational risk management (Verdantix 2018a).

An emerging category of Operational Risk Management (ORM) enterprise software seeks to support safe and effective operational decisions by providing critical innovations (Verdantix 2018b) including:
• An enterprise ORM approach to barrier management, permit to work (PTW), management of change (MoC), incident management, risk assessments and process safety management - all accessible via desktop web applications and mobile devices for use in the field
• Wearable technology as a source of data for use in the field
• Industrial Internet of Things (IIoT) as a source of data from critical devices in the field
• Advanced modelling capabilities to create dynamic “digital twin” asset models
• Advanced analytics from big data to provide actionable insights on operational risk status and trends

4. A new model for Operational Risk Management

Operational risks arise from a complicated set of interrelated parameters and are viewed and managed in differing ways depending on the role and level in the organization.
The challenge lies in simplifying this complexity and enabling all levels of the organization to collectively focus on the elements of risk that are most important.

Operational risks can arise from critical equipment conditions - or non-conformances - and also from planned activities on the facility. A new approach to managing the cumulative impact of all these operational risks is to model their impact on process safety barrier groupings and associate them with the major accident hazards (MAH) under management. This simple, elegant approach enables operators to predict and better manage the outcome – whether that means postponing a particular planned activity or accelerating maintenance to address the deviations or non-conformances on the facility (Jones 2017).

At the heart of this model (Figure 1) is the need to carry out an operational risk assessment for any performance deviation or non-conformance identified on the facility. Examples include:
• Performance standard failure
• Verification inspection finding
• Overdue safety-critical maintenance
• Override of a safety-critical system or device
• Management of hydrocarbon leaks

Figure 1: Deviations and non-conformances as sources of risk

An engineering technical authority typically leads the risk assessment process. This process identifies the major accident hazards under management and the fundamental barriers impacted, and defines the interim control measures and authorizations required, together with the resulting residual risk associated with the individual deviation or non-conformance.

As illustrated in Figure 1, deviations and non-conformances may be tracked and managed through many different business processes and systems, such as asset integrity inspection systems, maintenance management systems, operator rounds or management of change processes, inspection data and environmental control systems.
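
To make the shape of such an assessment concrete, the following is a minimal Python sketch of the information the risk assessment described above might record for a single deviation. It is an illustration only: the class and field names, the equipment tags, the barrier and hazard labels, and the 1 to 5 residual risk scale are all hypothetical and are not taken from the paper or from any ORM product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class DeviationType(Enum):
    # The example categories listed in the text.
    PERFORMANCE_STANDARD_FAILURE = "performance standard failure"
    VERIFICATION_FINDING = "verification inspection finding"
    OVERDUE_SCE_MAINTENANCE = "overdue safety-critical maintenance"
    SAFETY_SYSTEM_OVERRIDE = "override of a safety-critical system or device"
    HYDROCARBON_LEAK = "management of hydrocarbon leaks"


@dataclass
class RiskAssessment:
    """Outcome of the technical-authority-led assessment of one deviation."""
    major_accident_hazards: List[str]
    barriers_impacted: List[str]
    interim_controls: List[str]
    authorizations_required: List[str]
    residual_risk: int  # assumed scale: 1 (low) to 5 (high)

    def __post_init__(self) -> None:
        if not 1 <= self.residual_risk <= 5:
            raise ValueError("residual_risk must be between 1 and 5")


@dataclass
class Deviation:
    source_system: str   # business system in which the deviation was raised
    equipment_tag: str   # hypothetical tag format
    kind: DeviationType
    assessment: Optional[RiskAssessment] = None


def awaiting_assessment(deviations: List[Deviation]) -> List[Deviation]:
    """Return deviations that have not yet had a risk assessment."""
    return [d for d in deviations if d.assessment is None]


overdue_detector = Deviation(
    "maintenance management system", "GD-101",
    DeviationType.OVERDUE_SCE_MAINTENANCE,
)
firewater_pump = Deviation(
    "asset integrity inspection system", "FWP-01",
    DeviationType.PERFORMANCE_STANDARD_FAILURE,
    assessment=RiskAssessment(
        major_accident_hazards=["hydrocarbon fire"],
        barriers_impacted=["active fire protection"],
        interim_controls=["restrict hot work", "mobilize temporary pump"],
        authorizations_required=["technical authority", "installation manager"],
        residual_risk=3,
    ),
)
pending = awaiting_assessment([overdue_detector, firewater_pump])
print([d.equipment_tag for d in pending])
```

In this sketch the overdue gas detector maintenance is flagged as still awaiting assessment, while the firewater pump deviation already carries its hazards, barriers, controls, authorizations and residual risk. In practice, the deviations feeding such records would still originate in the many separate business systems listed above.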
Therein lies the challenge and the opportunity from an Industry 4.0 perspective: if we can map these operational conditions, from whichever business system they arise, and provide a better illustration of their impact on operational risk and the major accident hazards under management, we help support better decision-making from a risk management perspective.

4.1 Connecting Activity Risks and Fundamental Barriers

Planned activity on the facility can also introduce potential process safety barrier impairments and increase risk exposure. These activities are typically planned and scheduled in a maintenance management system, and their execution is managed via a work permit process, supported by a job safety analysis (JSA). In addition, operational activities are managed through a combination of operational procedures and operator rounds practices.

The potential barrier impact of operational activity can be modelled – for instance, if a planned isolation is needed to prepare for confined space entry. In this case, it is reasonable to assume there is a potential impact on the process containment barrier for the period in which first line break is undertaken. Similarly, open flame hot work in a unit represents a degradation of the ignition control barrier for the period the permit is issued.

4.2 Towards a common currency of risk

If we have carried out operational risk assessments for all deviations and non-conformances on the facility, and we also know the potential barrier impairments introduced by planned work, we can have a more complete view of all activity and risk and their potential impact on the asset’s operational reality. We can also map this to a specific location and a given time or shift, and see the MAH risks under management.

From an Industry 4.0 data and systems perspective, we can use this model to connect disparate sources of data that represent all activity, deviations, and non-conformances on the facility and generate a common currency of risk.
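
One way to picture this common currency is the short Python sketch below. It treats both deviations and planned activities as barrier impairments with a location, a time window and a score, and sums the scores per barrier for a chosen location and time. The names, the example data, the scores and the simple additive scoring are assumptions made purely for illustration; the paper does not specify how the cumulative impact is calculated.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class BarrierImpairment:
    source: str       # "deviation" or "activity"
    description: str
    barrier: str      # process safety barrier affected
    location: str
    start: datetime
    end: datetime
    score: int        # hypothetical common risk score

    def active_at(self, when: datetime) -> bool:
        return self.start <= when < self.end


def cumulative_risk(items: List[BarrierImpairment],
                    location: str, when: datetime) -> Dict[str, int]:
    """Sum the scores of all impairments active at a location and time,
    grouped by barrier (additive scoring is an assumption)."""
    totals: Dict[str, int] = defaultdict(int)
    for item in items:
        if item.location == location and item.active_at(when):
            totals[item.barrier] += item.score
    return dict(totals)


impairments = [
    BarrierImpairment("activity", "isolation for confined space entry",
                      "process containment", "Unit 1",
                      datetime(2019, 1, 10, 8), datetime(2019, 1, 10, 16), 2),
    BarrierImpairment("activity", "open flame hot work",
                      "ignition control", "Unit 1",
                      datetime(2019, 1, 10, 10), datetime(2019, 1, 10, 14), 3),
    BarrierImpairment("deviation", "override of a safety-critical device",
                      "ignition control", "Unit 1",
                      datetime(2019, 1, 5), datetime(2019, 1, 20), 1),
    BarrierImpairment("deviation", "overdue safety-critical maintenance",
                      "process containment", "Unit 2",
                      datetime(2019, 1, 1), datetime(2019, 2, 1), 2),
]

late_morning = cumulative_risk(impairments, "Unit 1", datetime(2019, 1, 10, 11))
afternoon = cumulative_risk(impairments, "Unit 1", datetime(2019, 1, 10, 15))
print(late_morning)
print(afternoon)
```

With this example data, the hot work permit and the standing override both degrade the ignition control barrier in Unit 1 during the late morning, so their scores add; once the permit window closes, only the override remains. The Unit 2 deviation does not appear in either view because the query is by location.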
The cumulative impact of these risks can be modelled to help everyone understand and assess risk by the same criteria, to make better operational decisions and proactively intervene to prevent major hazard events.

5. Connecting the complete view of all risks and all activities to the frontline

Once we have a common currency of risk and a comprehensive view of all of the activity and risk, the next challenge is to get this information into the hands of all those who can benefit. Enabling the connected industrial worker using mobile devices is the next opportunity in the digital transformation journey. From a process safety standpoint, we must take care to ensure the data presented to the frontline - through intrinsically safe mobile devices - offers the dynamic information work teams need to support process safety hazard management.

If we think about the daily activities of the frontline worker, there are many situations where providing a common view of the operational reality of the facility - that is, an understanding of where equipment conditions or planned activities may impact operational risk - will help support effective decision-making.

Typical use cases for a connected frontline worker, using an appropriate mobile device, include enabling the user to:
• Know when and where to execute work
• Carry out task risk assessments (JSAs) at the worksite
• Manage work activities and associated tasks – for example, gas test recording
• Update the details of an isolation plan – recording isolation status, lock and lockbox information in real time

From a risk oversight perspective, users can:
• See when and where work and activity is happening on the facility
• Monitor real-time work execution and performance
• Understand the cumulative impact of all activity and risks on the facility, to support decision-making from an operational risk management perspective

6. Case Studies – applying the Operational Risk Management platform approach

Here we share case studies of two major international oil and gas industry operators who are implementing the advanced operational risk model using Industry 4.0 Operational Risk Management software.

6.1 Case study 1: Improving the quality of technical risk assessments and modelling their cumulative risk impact

A major international oil company operates multiple platforms offshore on the UK Continental Shelf (UKCS). This operator has a mature management system and a well-defined approach to process safety, asset integrity management and work control. By implementing their Operational Risk Management platform solution, a technical manager sought to further improve the risk assessments that are undertaken when critical equipment is not meeting its performance standard.

The existing practice was to immediately carry out an operational risk assessment once such a deviation had been identified from formal inspection, maintenance or operator activities. This initial risk assessment was approved by the local offshore installation manager and would be discussed with the engineering support team onshore. A general criticism from the sector regulator (not specific to this operator) was that such risk assessments in practice rarely identified the true hazard related to the failure of the protective function of the critical equipment.

This operator had a well-defined approach to managing safety-critical elements (SCE) and associated components and equipment. A performance standard was defined for the identified SCEs on each installation, which is related to the risk reduction credit taken in the regulatory Safety Case. To improve the quality of the initial risk assessment, the operator used its ORM software to present the assessor with templated risk assessments based on the type/category of SCE impaired.
The templates helped to:
• Define the true hazard that the non-conforming equipment as a class gives rise to
• Provide typical mitigating measures for the assessor to consider in order to minimize risk based on the equipment class/function
• Present relevant SCE performance standard content as checklists – which encouraged the assessor to:
    • Identify the level of safety or integrity criticality the deviation represents
    • Consider other protective functions that might compound the problem – that is, other deviations that also impact the area and major hazard under management

The ORM software also helped the assessor define whether the impairment would impact a local area of the facility or the whole platform – for example, a single gas detector may have a localized risk impact, whereas firewater pumps unable to deliver the required capacity impact the entire installation.

The ORM software was also used to drive a revised approval process:
• Formally involve the defined functional technical authority in the approval of the technical risk assessment, based on the equipment class/function; and,
• Based on the level of safety or integrity criticality identified in the assessment, indicate required approvals from asset and business managers, in addition to the local installation manager

The Operational Risk Management software is also used to manage all permitted activity on the operator’s facilities. This provides a combined view of all equipment risks and all activity risks on a barrier model, highlighting MAH risk pathways.

6.2 Case study 2: Delivering a real-time view of critical equipment status and its impact on risk

A major national oil company is building and will operate a world-class refinery in the Middle East. Currently in the greenfield phase, the operator has commissioned a significant Industry 4.0 initiative to develop and deliver a technology-driven approach to integrate a suite of business systems to better support asset and operations processes and management.
The operator has a sophisticated safety management system and clear corporate standards and practices. The operator wishes to make a real-time view of operational and process safety risk a central element of decision-making from plant start-up onwards.

The operator is implementing an Operational Risk Management software platform to manage all permitted activity and deviations on the facility. The ORM software will integrate with three other business systems to deliver a real-time view of the risk status associated with critical equipment:
• Data historian – for near real-time status of critical equipment (this data historian is itself tracking the operational DCS and critical alarms systems)
• Maintenance Management System (MMS) – for inspection and maintenance records and associated plans and schedules
• Operator rounds system

Since the project is still in the design phase, the project team was able to access design package materials from the EPC contractors responsible for each refinery unit to identify the critical equipment. This includes bringing together a variety of information in useful formats.
• From design phase HAZOP and asset integrity studies:
    • Health parameters associated with specific items of identified critical equipment
    • Critical equipment types/categories mapped to the fundamental process safety barrier model
• Records representing all critical equipment were set up in the ORM software
• Through integration, the ORM software “listens” to the health of critical equipment, based on the above parameters, from three sources:
    • Near real-time status of equipment from the data historian
    • Inspection records for critical equipment from operator rounds and inspection management system
    • Deferred planned maintenance for critical equipment from MMS
• Non-conformances are mapped to the fundamental process safety barrier model

The ORM software is used to manage all permitted activity on the facilities.
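
The "listening" step above can be illustrated with a small Python sketch. It assumes each critical equipment record holds one health parameter as an allowed range plus the barrier it belongs to, and it stands in for the three sources with plain dictionaries and lists. Every name, tag, limit and barrier label here is hypothetical; the paper does not describe the actual interfaces to the data historian, operator rounds system or MMS.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

NonConformance = Tuple[str, str, str]  # (equipment tag, source, detail)


@dataclass
class CriticalEquipment:
    tag: str
    barrier: str       # barrier category this equipment type maps to
    low_limit: float   # assumed health parameter range
    high_limit: float


def from_historian(equipment: Dict[str, CriticalEquipment],
                   readings: Dict[str, float]) -> List[NonConformance]:
    """Flag latest readings that fall outside the health parameter range."""
    found = []
    for tag, value in readings.items():
        eq = equipment.get(tag)
        if eq is not None and not (eq.low_limit <= value <= eq.high_limit):
            found.append((tag, "historian", "reading outside health limits"))
    return found


def from_inspections(equipment, findings) -> List[NonConformance]:
    """findings: list of (tag, passed) pairs from rounds or inspections."""
    return [(tag, "inspection", "failed inspection")
            for tag, passed in findings if tag in equipment and not passed]


def from_maintenance(equipment, deferred_tags) -> List[NonConformance]:
    """deferred_tags: tags with deferred planned maintenance."""
    return [(tag, "maintenance", "planned maintenance deferred")
            for tag in deferred_tags if tag in equipment]


def map_to_barriers(equipment, nonconformances) -> Dict[str, List[NonConformance]]:
    """Group non-conformances by the barrier of the affected equipment."""
    status: Dict[str, List[NonConformance]] = {}
    for item in nonconformances:
        status.setdefault(equipment[item[0]].barrier, []).append(item)
    return status


equipment = {
    "FWP-01": CriticalEquipment("FWP-01", "active fire protection", 8.0, 14.0),
    "PSV-201": CriticalEquipment("PSV-201", "relief systems", 0.0, 12.0),
    "GD-305": CriticalEquipment("GD-305", "detection systems", 0.0, 20.0),
}
nonconformances = (
    from_historian(equipment, {"FWP-01": 6.5, "GD-305": 3.0})
    + from_inspections(equipment, [("PSV-201", False), ("GD-305", True)])
    + from_maintenance(equipment, ["GD-305", "P-999"])  # P-999 is not critical
)
status = map_to_barriers(equipment, nonconformances)
for barrier, items in status.items():
    print(barrier, items)
```

Each source contributes one non-conformance in this example, and each lands against a different barrier. Keeping the barrier on the equipment record means a new source only has to report an equipment tag for its findings to appear in the barrier view.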
The integration described provides a combined real-time view of all equipment status risks and all activity risks on a barrier model, highlighting MAH risk pathways.

7. Conclusions

The oil and gas industry continues to experience major accidents, despite the application of mature approaches to process safety in the design phase of projects. There appear to be gaps that arise when this good design goes into operation. The dynamic nature of the frontline, coupled with the siloed nature of sources of information on planned activities and critical equipment, can give rise to major incidents.

An emerging category of enterprise software system for Operational Risk Management seeks to close this gap by applying proven risk models to support all levels of operational decision-making with an improved approach to risk management - which is more pragmatic, simple in concept, and informed by real-time risk status.

The concept of the connected industrial worker seeks to put real-time information in everyone’s hands - through intrinsically safe mobile devices - to help keep people and assets safe and productive. From a process safety standpoint, such devices provide the opportunity to support process safety hazard management by ensuring everyone knows what is happening, where it is happening, what is impacting process safety barriers and what is truly driving the complete picture of risk on the facility.

References

Choudhry, H. et al., 2016, The next frontier for digital technologies in oil and gas, accessed 31.08.16
Jones, S., 2017, Do we really know how to manage risk?, Proceedings of 2017 International Symposium of Mary Kay O’Connor Process Safety Centre, Texas A&M University, USA
Marsh & McLennan, 2018, The 100 Largest Loss events 1978 - 2017 - Large property damage losses in the Hydrocarbon Industry, 25th edition, London, UK
Petrotechnics, 2017a, 2017 Survey on Process Safety Management and Operational Risk Management, Petrotechnics, accessed 25.09.2017
Petrotechnics, 2017b, Operational Excellence Index 2017, Petrotechnics, 2017, accessed 21.11.2017
Verdantix, 2018a, Smart Innovators: Operational Risk Management Software, August 2018, London, UK
Verdantix, 2018b, Operational Risk Survey 2018: Budgets, Priorities & Tech Preferences, London, UK