CHEMICAL ENGINEERING TRANSACTIONS VOL. 77, 2019
A publication of The Italian Association of Chemical Engineering
Online at www.cetjournal.it
Guest Editors: Genserik Reniers, Bruno Fabiano
Copyright © 2019, AIDIC Servizi S.r.l.
ISBN 978-88-95608-74-7; ISSN 2283-9216

Measuring (Un)Safety. A Broad Understanding and Definition of Safety, Allowing for Instant Measuring of Unsafety

Peter J. Blokland a,*, Genserik Reniers a,b,c
a Safety & Security Science Group (S3G), Delft University of Technology, The Netherlands
b Center for Corporate Sustainability (CEDON), KULeuven, Campus Brussels, Belgium
c Dept. Engineering Management, Fac. of Applied Econ. Sciences (ENM), University of Antwerp, Belgium
p.j.blokland@tudelft.nl / peter.blokland@byaz.be

Industrial safety performance has, for a long time, been the domain of health and safety specialists, who measure injury and absenteeism rates to discover patterns and try to prevent accidents from happening. The drawback of this approach is that safety is reactive to accidents, which are mostly caused by operations. As a result, safety (performance) has become the reverse side of operations (performance) and is often seen as a hindrance to making the best possible profit for organisations. For some time now, however, industries have become aware of the possible benefits of a more proactive approach towards safety. More and more organisations are therefore looking for proactive methods of measuring and achieving safety performance. As a result, in recent years, important efforts have been undertaken to improve the understanding of safety culture and safety climate and of how to measure these concepts in organisations, for instance in the process industry and chemical plants. Likewise, substantial efforts have been made to determine and develop a wide range of leading and lagging safety indicators that can reflect and predict safety performance.
While developing leading indicators and making culture measurements are helpful, both measure safety conditions indirectly: an organisational culture or climate can be regarded as a specific indicator of possible future performance, in the same way that leading safety indicators aim to predict the future. Yet, few tools are currently available that instantly measure the actual safety conditions and performance in organisations and provide information that allows for benchmarking between different sectors and industries. Nevertheless, when safety and its opposite, “unsafety”, are carefully defined, it becomes imaginable to develop tools that instantly measure the safety performance and actual safety situation in organisations, so that the results can be used for benchmarking regardless of sector or industry. In this article we expound this way of thinking, based on an original paradigm about safety, unsafety and performance.

1. Introduction

One of the major challenges in safety science is to develop methodologies and systems that are able to proactively capture and recognise situations and patterns that have the potential to provoke severe accidents, instead of being obliged to use reactive approaches, such as learning from accident investigations once disasters have already occurred. In the past decades, different methods, resulting from different research traditions, have been developed to tackle this issue. Four historical traditions of empirical and theoretical research in the pursuit of a better understanding of this challenge can be distinguished: safety management systems (SMS), safety culture (SC), high-reliability organisations (HRO), and accident models and investigation (AIM). Each of them encompasses different purposes, covers a variety of scientific disciplines and investigates a range of socio-technical domains, such as energy, transportation or the process industry (Le Coze, 2013).
DOI: 10.3303/CET1977043
Paper Received: 9 January 2019; Revised: 5 May 2019; Accepted: 10 July 2019
Please cite this article as: Blokland P., Reniers G., 2019, Measuring (un)safety. A broad understanding and definition of safety, allowing for instant measuring of unsafety, Chemical Engineering Transactions, 77, 253-258

SMS is a concept which mainly results from empirical and conceptualised knowledge from industrial and consulting practices, as well as from guidance by control authorities and standards from international bodies or entities in specific industrial domains. Foremost, these practices and guidance are aimed at auditing and assessing safety in organisations. Safety culture, from a social engineering perspective, is concerned with designing and implementing safety programs in organisations. HRO is more descriptive of what safety in organisations requires in organisational design and structure, while AIM is geared towards investigating accidents (Le Coze, 2013). As such, all of these traditions complement each other in their search for increased safety performance in socio-technical systems. However, none of these traditions has led to a methodology or instrument that is capable of instantly measuring the results of any methodology in a way that can equally be used for all approaches.

Performance and safety have always been issues in organisations, certainly in an industrial context. That is why the cost of accidents, and the danger they present to organisations, was recognised long ago. A pioneer in this field was Herbert W. Heinrich, well known for his theories regarding human error and safety, expressed in concepts such as the domino theory or his accident pyramid (sometimes also called Heinrich’s safety triangle) (Heinrich, 1931, 1941). For many years these theories have dominated the realm of accident investigation and prevention, influencing a wide range of scholars in their search for safety indicators.
Various authors have indicated that well into the 1990s, and even up to today, one particular indicator has been the key indicator in the process industry: the Lost Time Incident Frequency (LTIF), presenting the number of absences from work due to an accident per million hours worked (Swuste et al., 2016). Safety performance indicators represent an important constituent of an SMS, involving the establishment, implementation and follow-up of corporate policies, acceptance criteria and goals related to safety. Safety indicators can be of a reactive (lagging indicators) or a proactive (leading indicators) nature, and in developing safety performance indicators there will be a balance between concentrating on direct indicators with sufficient and meaningful data and focussing on indirect indicators with enough data to provide early warnings, but with less direct safety relevance (Øien et al., 2011). To meet the need for quantification, dominant in industry, numbers of activities, incidents, interventions, etc. are counted. However, problems with quantification, both for process and for management/organisation indicators, have been mentioned several times: such numbers do not contain any information on quality (Swuste et al., 2016).

Following the 1986 Chernobyl disaster, the term “safety culture” started to be regularly used among a broad community of safety scientists. As a result, many contemporary organisations strive to understand and improve their safety culture in order to enhance their safety performance. A popular tool or approach to assess safety cultures is the use of maturity models. Maturity models involve defining maturity stages or levels which assess the completeness of the analysed objects, usually organisations or processes, via different sets of multi-dimensional criteria. However, the ‘process’ of using a maturity model seems to be more important than the actual ‘outcome’ (Goncalves Filho & Waterson, 2018).
A model that tries to capture safety cultures in a holistic way is “The Egg Aggregated Model of safety culture” (TEAM). It describes the complexity of a safety culture in three constituting and interacting domains (Technological, Organisational and Personal), in which visible and invisible factors can be distinguished. Measuring a safety culture then requires capturing both visible and invisible aspects, for which interviews and questionnaires are needed, in turn requiring collection, processing and interpretation of data to arrive at an indication of the level of safety in organisations (Vierendeels et al., 2018). So, the actual approaches to discover and describe the level of safety in socio-technical systems all work indirectly. They either describe what happened in the past to predict the future, or try to find parameters that are able to predict a future safety level.

2. Defining “safety” to discover “unsafety”

Many practitioners and scholars are searching for ways to predict and prevent accidents, approaching this problem from different angles. However, scarce literature can be found regarding the development of tools which instantly and continuously measure the actual safety conditions and performance in socio-technical systems and which can be used for direct feedback to managers down to the operational level, perhaps because this can only be realised by changing how organisations look at safety and performance. In fact, there is no commonly agreed-upon definition of “safety”, nor of its opposite, “unsafety”. Today, safety is mostly defined by an absence of accidents, but how does one measure the absence of something? This lack of common ground also leads to different standards with which one tries to measure the safety conditions in organisations, without the possibility of benchmarking and comparing results between sectors and industries. However, a standardised (commonly agreed upon) definition of risk exists: “Risk is the effect of uncertainty on objectives”.
Risk and safety are also related. People want to avoid running risks in order not to lose what they want (and be safe), while they also take risks in order to achieve something and get or keep what they want (and be safe). In a sense, this is what Hollnagel describes as Safety-I and Safety-II (Hollnagel, 2014), where Safety-II can be understood as a focus on excellent performance when taking risks, and Safety-I is the traditional view of safeguarding something from losses when running risks. The connection between risk and safety can therefore be seen as follows: risk is an uncertain effect on objectives, while the actual performance is the result of that uncertain effect. As such, the actual performance is the result of Safety-I + Safety-II, indicating a level of safety (Blokland & Reniers, 2018).

Using the above-stated connection between safety and risk, safety can be defined as “the condition/set of circumstances where the likelihood of negative effects on objectives is low”. This perspective on safety explains the difficulty of measuring safety in a general and standardised way. To measure safety, it would be necessary to take into account all objectives that are or should be safeguarded or achieved, and this at every moment in time. Furthermore, many “objectives” are not consciously monitored, and often one is not aware of individual or other objectives that are also part of the objectives to be considered when trying to determine the (total) level of safety in the concerned socio-technical system. Nevertheless, it is much easier to measure unsafety instead. When everything goes right, most of the time one is not aware of that fact, because this is a normal condition which the brain will dismiss. That is why measuring safety by leading indicators (SMS) or by modelling safety cultures does not give a conclusive answer regarding the level of safety in organisations.
Yet, when things don’t go as planned, intended or wanted, this will somehow be noticed. That is why safety is traditionally measured by counting identified mishaps, measuring “unsafety” instead of safety. In line with the above definitions, unsafety can be defined as “the condition/set of circumstances where the likelihood of negative effects on objectives is high” (Blokland & Reniers, 2017). In this way, it is possible to consider accidents and incidents as “unsafety”, which can also be defined as “negative effects on objectives”, or shorter, as “failed objectives”. These definitions are the basis for developing a new approach to measuring safety in organisations, as explained further in this paper.

3. Heinrich and Reason revisited

To a certain extent and in one form or another, Heinrich’s theories are still considered valid and used today. In fact, Reason’s Swiss cheese model (Reason, 1990) and all related models can be considered an extension or a more comprehensive development of the domino theory, indicating that accidents happen due to a multitude of factors, most of them human, that influence each other. As such, the holes in the slices of cheese in Reason’s model/metaphor can be viewed as similar to the dominoes in Heinrich’s model/metaphor. The strength of both models lies in their metaphors; they provide ways to understand the complex reality of accidents and loss and the presence and correlation of different categories of risk sources. However, the weakness of both models lies in the fact that they focus mainly on human behaviour and error; as a consequence, errors are categorised and given specific significance. Although it helps to have a limited number of categories and a specific significance, this sometimes leads people to dismiss the basic knowledge both models provide. Besides human error, other factors which are not represented in these models also play a role in organisational safety and performance.
This makes these models incomplete from the start when only these specific categories are considered, certainly when a very strict interpretation of their significance is maintained. A more holistic way to look at the Swiss cheese metaphor is to consider that the presence of cheese can be understood as everything that goes well. It relates to the objectives that have been achieved and which are safeguarded. Hence, the cheese stands for the objectives of sub-systems for which the value is present, and it can be considered the result of Safety-II. The cheese therefore represents the achieved and safeguarded objectives and the related created value. The holes in the cheese, on the other hand, are the sub-systems for which objectives are not achieved or not safeguarded and for which the value has been lost or has become out of reach. They represent the unsafety which Safety-I traditionally aims to prevent, or tries to compensate for by putting barriers in place. These are the objectives that have failed. As such, they are the different reasons (negative risk sources) which contribute to things going drastically wrong when they become connected. Thus, the holes in the cheese and their barriers represent Safety-I thinking.

4. A systems perspective

Leveson states that ‘safety’ is an emergent property of systems, not a component property (Leveson, 2011). It means that safety is something that needs to be pursued, achieved and safeguarded by the system, over and over again, as a never-ending story. Of course, a component can also be considered a system in its own right, but every system is also made up of sub-components which have other, more specific (sub-)objectives that need to be safe, different from the objectives of the overarching system, whose safety has to be achieved and maintained as well.
Systems are always part of larger systems and will always consist of subsystems, and each of them has its specific objective(s) (purposes) and is subjected to a set of risk sources that can affect those more specific objectives. Failure of objectives of sub-systems can therefore lead to failure of the objectives of the overarching system and ultimately of the whole socio-technical system, causing disaster. Today’s socio-technical systems are ever more complex. Complex and chaotic contexts are unordered: there is no immediately apparent relationship between cause and effect, and the way forward is determined based on emerging patterns. Furthermore, a complex system has the following characteristics:
• It involves large numbers of interacting elements.
• The interactions are nonlinear, and minor changes can produce disproportionately major consequences.
• The system is dynamic, the whole is greater than the sum of its parts, and solutions can’t be imposed; rather, they arise from the circumstances. This is frequently referred to as emergence.
• The system has a history, and the past is integrated with the present; the elements evolve with one another and with the environment; and evolution is irreversible.
• Though a complex system may, in retrospect, appear to be ordered and predictable, hindsight does not lead to foresight, because the external conditions and systems constantly change.
• Unlike in ordered systems (where the system constrains the agents) or chaotic systems (where there are no constraints), in a complex system the agents and the system constrain one another, especially over time. This means that we cannot forecast or predict what will happen. (Snowden & Boone, 2007)
There are cause-and-effect relationships between the agents in complex systems, but both the number of agents and the number of relationships defy categorisation or analytic techniques. Therefore, emergent patterns can be perceived but not predicted (Kurtz & Snowden, 2003).
That is why even leading indicators or safety maturity levels will not predict disaster, and although the measuring systems of SMS and the results of safety maturity models can provide interesting and very valid indications of the level of safety in organisations, only an instant notification and aggregation of as many failed objectives as possible (holes in the cheese) of a socio-technical system and its sub-systems can give a true indication of the level of safety in socio-technical systems. When the number and the importance of failed objectives of systems and their sub-systems are high, the safety level is low. But when the number of failed objectives is low, the level of safety is high. This is what Heinrich already noticed when studying accidents in the early decades of the twentieth century. It is also the consequence of the Swiss cheese metaphor: fewer and smaller holes in the cheese make a severe accident less likely, as the possibilities for the holes to connect are reduced.

5. Measuring unsafety to quantify a safety level

A precise measuring of the number of failed objectives (holes) and their level of importance (size of the holes) is, according to this systemic perspective, a direct way to measure the level of unsafety in any organisation or socio-technical system. The biggest challenge is to capture as many of these failed objectives as possible, regardless of the type of objective. These objectives can be technical, organisational or individual, including those of relevant stakeholders and failing technological devices. When this measuring can be instant and direct, an instant indication of a safety level can be obtained, independent of the size, industry or sector of the concerned organisation. However, to make this work, objectives need to be clustered in categories that apply equally to any organisation and that provide meaningful information.
Levels of impact of failed objectives also need to be determined in such a way that they have the same value for any type of organisation. This would allow for direct benchmarking and comparison between organisations irrespective of their size, sector or industry. Surely, organising such a measurement would be a real challenge, but it becomes possible when all significant stakeholders are willing to instantly report any noticed occurrence of a failed objective and its impact, regardless of the kind of objectives involved. Ideally, this results from a supporting organisational (no-blame) culture, allowing individual failed objectives to be reported as well, without consequences being attributed to such reporting.

6. Logical levels as impact categories

Another challenge for such a measuring/reporting system is to assign commonly accepted categories and levels of importance of objectives to facilitate the reporting of failed objectives. The obvious metrics to represent the level of impact of the lost value would be an indication in terms of money and/or time. However, these would not have the same meaning or weight for each organisation or socio-technical system. For small organisations, a certain amount of money and time can be much more significant than the same amount for a larger organisation. So, it is also necessary to provide a hierarchy of value that is equal for any organisation. One of the possible hierarchies that can be used to provide a quick judgement of the level of importance of a failed objective is the concept of the logical levels, attributed to Dilts and Bateson. Dilts (1996) defined the logical levels as leadership skills in applying the concept of Bateson (1972), who recognised “natural hierarchies of classification” in processes of learning, change and communication. Dilts (1990) called logical levels “…an internal hierarchy in which each level is progressively more psychologically encompassing and impactful.” (Janschitz & Zimmermann, 2010).
It means that an impact at a higher “logical level” will be perceived as more important. The scientific problem with the originally proposed logical levels is that the upper levels, as defined by Dilts, are considered to be “spiritual”. However, this is less of an objection when “spiritual” is replaced by “inspirational”. The inspiration of socio-technical systems lies in their purpose: the vision, mission and ambition that determine the objectives that matter and how they can be valued. In their article “Organizational change: A critical challenge for team effectiveness”, Goodman and Loh (2011) describe the logical levels related to change. It provides a good basis to see how the impact of a failed objective increases in importance when it concerns higher logical levels. The logical levels, in increasing order of importance, can be described as follows:
Environment: the lowest logical level, referring to what is outside the system: the place and time (where and when) in which the system pursues its objectives.
Behaviour: refers to specific actions: what each system does. This is the outward display of having successfully applied the key expected behaviours for achieving or safeguarding a particular objective.
Capabilities: also referred to as ‘competencies’. These are the skills, qualities and strategies that characterise the system. They are how the actions of the system are executed. They will often need to be defined, taught and practised in order to support the achievement and safeguarding of objectives.
Values and Beliefs (rules): ‘Values’ are what an individual or team/system holds to be important, so they act as the drivers for what the system does. ‘Beliefs’ are what an individual or team holds to be true, and so they influence what the system does and how it acts.
Identity: how a system sees itself. It consists of the core beliefs and values that define the system and which provide a sense of ‘what the system is’.
Purpose: refers to the larger system of which the system is part. It connects to a wider purpose: ‘for whom?’ or ‘what else?’
Using Dilts’ model of logical levels to distinguish different levels of importance in failed objectives therefore provides a powerful tool to determine and assess the impact of a failed objective on a socio-technical system.

7. Impact levels

Objectives can be individual, team-related, or at an organisational or even societal level. Another way to express the level of unsafety is therefore to indicate the corresponding level of the system that is impacted by a failed objective. This can be seen as a hierarchy of systems, ranging from an individual or a team to an organisation, or even society as a whole. The larger the system impacted, the more significant the level of unsafety. Again, these levels can be used for expressing a level of unsafety that is not industry-, size- or sector-specific.

8. Discussion

Measuring unsafety can be achieved by capturing and aggregating failed objectives. However, just counting numbers of occurrences will not provide a correct basis for comparing and benchmarking between organisations of different size, sector or industry. To determine the level of safety of a system in a way that can be compared to another system, irrespective of size, sector or industry, an indication in time and money is not sufficient. However, this can be solved by creating a multidimensional model that allows results to be aggregated in a way that is equal for all sorts of socio-technical systems. A first step is to distinguish the kind of objective that failed and to categorise failed objectives in groups of similar purpose (for instance: financial, technical, operational, reputational, physical, …). Furthermore, failed objectives can be scaled in size by categorising them according to the logical levels and the levels of impact, as discussed earlier.
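As an illustration of how the two ordinal hierarchies described above (the logical levels and the impacted-system levels) could be encoded in a reporting tool, consider the following sketch. The numeric ranks and the combined weighting are illustrative assumptions of this sketch, not part of a validated model.

```python
from enum import IntEnum

class LogicalLevel(IntEnum):
    """Dilts' logical levels, ordered by increasing impact (illustrative ranks)."""
    ENVIRONMENT = 1
    BEHAVIOUR = 2
    CAPABILITIES = 3
    VALUES_AND_BELIEFS = 4
    IDENTITY = 5
    PURPOSE = 6

class ImpactLevel(IntEnum):
    """Size of the system impacted by a failed objective (Section 7)."""
    INDIVIDUAL = 1
    TEAM = 2
    ORGANISATION = 3
    SOCIETY = 4

def importance(logical: LogicalLevel, impact: ImpactLevel) -> int:
    """Toy importance score: a higher logical level and a larger impacted
    system both increase the weight of a failed objective."""
    return int(logical) * int(impact)
```

Because both scales are ordinal rather than metric, the product used here is only one of many defensible weightings; what matters for benchmarking is that the same encoding is applied in every organisation.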
Additionally, the impact can be further refined by setting universal categories of time and money to value the loss incurred by the failed objective. For instance, time lost can be expressed in minutes, hours, days, weeks, months or even years, and money can be expressed in <10, <10², <10³, <10⁴, etc. of a currency, thus using levels that are the same worldwide. Measuring can be done by reporting anything that does not give (full) satisfaction, does not function as expected or does not reach the intended goal, and which has an impact on one or more objectives linked to the different categories expressed earlier. These reports can then be aggregated per category as indicated earlier. This is not an easy thing to achieve, but ways can be found to build a workable solution that allows for such a multidimensional, instant measuring system.

9. Practical issues and challenges

Still, such a measuring system needs to be easy to use and acceptable to the involved stakeholders. The proposed “proactive unsafety measuring system” aims to accommodate these prerequisites. Ease of use can be obtained by offering such an application on a smartphone, tablet or computer, categorising the objectives involved into categories that reflect the kind of loss incurred (loss categories) and categorising the negative effects by the range of logical levels impacted (impact category), the kind of system that is impacted (impact level) and its size expressed in clear numbers of money and/or time (severity level). This creates a multidimensional model that describes and reports any set of negative effects on objectives and “near misses” (losses), ranging from the smallest time loss to the biggest catastrophe, in only a few seconds. Acceptability needs to come from how collected data is represented, how managers use the obtained information and how the information is presented and fed back to the concerned stakeholders.
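As a minimal sketch of what such a multidimensional report record and its aggregation could look like (the field names, example categories and the exact encoding of the log-scale money buckets are illustrative assumptions, not a fixed specification):

```python
import math
from collections import Counter
from dataclasses import dataclass

def money_bucket(amount: float) -> str:
    """Map a monetary loss onto worldwide log-scale severity buckets:
    <10, <10^2, <10^3, ... of a currency."""
    if amount < 10:
        return "<10"
    return f"<10^{math.floor(math.log10(amount)) + 1}"

@dataclass(frozen=True)
class FailedObjectiveReport:
    """One instantly reported failed objective (a 'hole in the cheese')."""
    loss_category: str        # kind of loss, e.g. "operational" or "reputational"
    impact_category: str      # logical level impacted, e.g. "capabilities"
    impact_level: str         # impacted system: "individual", "team", "organisation", "society"
    money_lost: float         # severity in money (any currency)
    time_lost_minutes: float  # severity in time

def aggregate(reports: list[FailedObjectiveReport]) -> Counter:
    """Count reports per (loss category, money bucket), as one slice of
    the multidimensional aggregation a dashboard could display."""
    return Counter((r.loss_category, money_bucket(r.money_lost)) for r in reports)
```

For example, two operational losses of 500 and 800 currency units both fall in the <10^3 bucket, so they aggregate into a single cell of the dashboard; further slices per impact category and impact level can be built in the same way.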
Ideally, the data is fed into a dashboard that instantly translates the data into a safety situation of the entire organisation and its components.

10. Conclusions

For decades, scholars have been looking for ways to capture the level of safety in organisations, creating complicated measuring systems that capture a multitude of parameters determined by analysing organisations and their mishaps. But until now, no system has been capable of exactly and continuously indicating a quantified level of safety of an organisation. Starting with a clear definition of safety and unsafety and a clear notion of what unsafety represents in socio-technical systems, combined with the use of a multicriteria model using specific loss and impact categories together with impact and severity levels, it would be possible to create an aggregated model that can provide a clear and instant indication of levels of unsafety in organisations, irrespective of size, sector or industry.

Acknowledgments

This article is part of a research project funded by Netbeheer Nederland, the branch organisation of energy distribution companies in the Netherlands, searching for ways to be more proactive regarding safety in the gas distribution sector in the Netherlands. We also wish to thank Prof. Dr. R.W. (Rolf) Künneke (TU Delft) for his guidance and contributions to this project.

References

Bateson, G. (1972). Steps to an ecology of mind. Ballantine, New York.
Blokland, P., & Reniers, G. (2017). Safety and performance: Total Respect Management (TR3M): a novel approach to achieve safety and performance pro-actively in any organisation. Nova Science Publishers, New York.
Blokland, P., & Reniers, G. (2018). An ontological and semantic foundation for safety science. In: Haugen, S. (Ed.), Safety and reliability: safe societies in a changing world: proceedings of ESREL 2018, June 17-21, 2018, Trondheim, Norway (pp. 3157-3164).
Dilts, R. (1990). Changing Belief Systems with NLP. Meta Publications, Cupertino, CA.
Dilts, R. (1996). Visionary Leadership Skills. Meta Publications, Capitola, CA.
Goncalves Filho, A. P., & Waterson, P. (2018). Maturity models and safety culture: A critical review. Safety Science, 105, 192-211.
Goodman, E., & Loh, L. (2011). Organizational change: A critical challenge for team effectiveness. Business Information Review, 28(4), 242-250.
Heinrich, H. W. (1931). Industrial Accident Prevention. A Scientific Approach. First ed. McGraw-Hill Book Company, London.
Heinrich, H. W. (1941). Industrial Accident Prevention. A Scientific Approach. Second ed. McGraw-Hill Book Company, London.
Hollnagel, E. (2014). Safety-I and Safety-II: the past and future of safety management. Ashgate Publishing Ltd.
Janschitz, S., & Zimmermann, F. M. (2010). Regional modeling and the logics of sustainability: a social theory approach for regional development and change. Environmental Economics, 1(1), 134-142.
Kurtz, C. F., & Snowden, D. J. (2003). The new dynamics of strategy: Sense-making in a complex and complicated world. IBM Systems Journal, 42(3), 462-483.
Le Coze, J. C. (2013). Outlines of a sensitising model for industrial safety assessment. Safety Science, 51(1), 187-201.
Leveson, N. (2011). Engineering a safer world: Systems thinking applied to safety. MIT Press.
Øien, K., Utne, I. B., Tinmannsvik, R. K., & Massaiu, S. (2011). Building safety indicators: Part 2: application, practices and results. Safety Science, 49(2), 162-171.
Reason, J. (1990). Human error. Cambridge University Press, Cambridge.
Reason, J. (1997). Organizational accidents: the management of human and organizational factors in hazardous technologies. Cambridge University Press, Cambridge.
Snowden, D. J., & Boone, M. E. (2007). A leader's framework for decision making. Harvard Business Review, 85(11), 68.
Swuste, P., Theunissen, J., Schmitz, P., Reniers, G., & Blokland, P. (2016). Process safety indicators, a review of literature. Journal of Loss Prevention in the Process Industries, 40, 162-173.
Vierendeels, G., Reniers, G., van Nunen, K., & Ponnet, K. (2018). An integrative conceptual framework for safety culture: The Egg Aggregated Model (TEAM) of safety culture. Safety Science, 103, 323-339.