Acta Polytechnica Vol. 43 No. 3/2003

A General Approach to Study the Reliability of Complex Systems

G. M. Repici, A. Sorniotti

In recent years new complex systems have been developed in the automotive field to increase safety and comfort. These systems integrate hardware and software to guarantee the best results in vehicle handling and make products competitive on the market. However, the increase in technical details and the utilization and integration of these complicated systems require a high level of dynamic control system reliability. In order to improve this fundamental characteristic, methods can be extracted from those used in the aeronautical field to deal with reliability, and these can be integrated into one simplified method for application in the automotive field. Firstly, as a case study, we decided to analyse VDC (the Vehicle Dynamics Control system) by defining a possible approach to reliability techniques. A VDC Fault Tree Analysis represents the first step in this activity: FTA enables us to recognize the critical components in all possible working conditions of a car, including cranking, during 'key-on'-'key-off' phases, which is particularly critical for the electrical on-board system (because of voltage reduction). By associating FA (Functional Analysis) and FTA results with a good FFA (Functional Failure Analysis), it is possible to define the best architecture for the general system to achieve the aim of a high reliability structure. The paper will show some preliminary results from the application of this methodology, taken from various typical handling conditions from well established test procedures for vehicles.

Keywords: safety, systems reliability, fault tree analysis (FTA), functional analysis (FA), handling, vehicle dynamic control (VDC)

1 Introduction

Automobiles and land vehicles in general have seen a dramatic increase in complexity in recent years.
Today's automobile presents a higher than ever, and increasing, number of value-added features, many of which are controlled by the vehicle's electrical and electronic (E/E) system. In fact, a vehicle today has approximately twice as many E/E functions as one produced just 10 years ago. This trend requires electrical system designs that provide both increased functionality and increased reliability. This inflation effect has been caused mainly by two factors. The first is rising demands from the consumer. This has not only manifested itself through the desire for better performance or comfort, but also stems from increased awareness of safety related issues and the demand for more protection for the occupants of the vehicle. The second factor has been the development of various electronic techniques and equipment. This technology has pushed the limits imposed by the on-board systems and, specifically, has allowed the implementation of many functions controlled by hardware and software systems on board.

It is common when buying a car nowadays to find, under the hood and scattered throughout the vehicle, kilometres and kilometres of cables and wires, multiple control boxes and an equally high number of sensors picking up a very wide range of physical parameters. On top of this, all the electronic systems on board a car are interfaced in some way or another with the mechanical and hydraulic systems.

An example of typical functions now under the control or assistance of electronics is the control function. This affects the ride and the handling performance, above all else. When the driver makes a sudden manoeuvre, control is critical. It is just as essential in bad weather or on rough roads, especially on unpredictable road surfaces. Even under normal conditions, on straight roads and turns, or during braking and acceleration, control determines the ride and handling performance. Often, the level of control depends on the skill of the driver.
The ride and handling technologies emerging in the industry help offer significantly more control for every driver in every situation, regardless of skill [].

However, all this comes with a price tag. This is not only in terms of the final price for the user, but also in terms of increased design complexity, which places heavier loads on the design engineers and extends the time to prototype and testing. This key point has been taken as a main driver of our methodology. Later we will show how integrating the different analyses in an intelligent way can provide a way to develop preliminary estimates early in the development.

In this context the complex system identified is to be understood as the ensemble of subsystems in a vehicle integrating different and advanced functions. To give a brief idea, the list of state-of-the-art technologies applied might include: higher- and multiple-voltage power generation and storage, networked communications (multiplexing), fiber-optic communications, multi-drop wiring, networked controllers with distributed computing, standard interfaces, and mechatronics (electronics integrated into switches, connectors, sensors, and actuators). Fig. 1 shows a generic vehicle encompassing a set of advanced circuitries and components.

Turning to the aerospace field, we can see how avionics [5] has been a relevant part of the development of an airplane since the late 1940s. It has since developed into a variety of branches, covering the most varied functions on an airplane: communications, navigation, control, etc. It is still at a level of complexity much higher than that of a car, but several systems are comparable in terms of functions and criticalities. Some of the electronics mounted on a vehicle oversee safety, and the development of some specific aspects can be derived from analogous activities in the aerospace industry.
In particular we would like to point out here how evolution in avionics design has shifted with hardware miniaturization and the concomitant architectural integration strategies [2].

Fig. 1: Modern vehicle circuitry schematics (Courtesy of Delphi Automotive Systems)

In its basic form, an aircraft avionics system can be viewed as a large number of interconnected computers. Up to the 1980s there would be about ten computers, going up to around 30 for bigger airplanes. The capabilities developed over recent years have allowed a switch from these so called federated architectures to a system integration approach. Basically, functions can be mapped onto hardware as integrated computer nodes. The baseline for an avionics architecture can be represented as in Fig. 2. This can be taken as a starting point for an analysis that will lead us to transition development methodologies aimed at taking the higher reliability levels achieved in aerospace into the automotive field.

Aside from the specific architecture, which is not under discussion in this work, we have recognized how some typical methodologies have been applied in the aerospace field in a slightly different way than in the automotive field. In particular, well known analyses like FA (Functional Analysis), Fault Tree Analysis (FTA) and Failure Modes and Effect Analysis (FMEA) [9] have all been used extensively and have undergone improvements [6], [7]. Most significantly, however, they are all embedded in a well structured methodology that allows results and trade information to be gathered from the very beginning, so that the overall results can be evaluated. A major advantage of this approach is the integration of all the analyses, both horizontally and vertically over the different levels of definition, from equipment up to system level.
It is not necessary to recollect and reframe the results, since the analyses are all interlinked and evolve from common standpoints. At this point it is thus clear how this way of operating can be exported to the automotive sector with great advantage in terms of development and overall reliability.

Fig. 2: Typical bus configuration for elementary on board avionics [3] (digital data bus)

Fig. 3: Overall scheme, methodology (blocks: Software; Hardware; Reliability and safety requirements; Project requirements satisfaction)

2 Study scheme

Starting from the background illustrated in the previous section, we have identified the main target of our study as the definition of a general methodology to support, as an ancillary analysis, the development of the design of complex automotive systems. This will essentially be aimed at reaching a consistent level of reliability and safety. These are features intended for a generic system encompassing electro-mechanical components and containing consistent software functions. Our purpose has been to develop a true methodology. Though not as complex as others available in the literature [4], it will have the great advantage of being extremely lean and widely applicable. One of the main features is the possibility, which we intend to make use of, to apply it from the very beginning of the design, accompanying all the phases of the development of the project. Moreover, our intention has been to define, in an objective way, the requirements necessary to develop all the related subsystems as well as the embedded software. All of the above will take into account the targets established at the beginning of the project.

For the time being the main developmental and test benchmarks come from the automotive industry.
That is to say, some requirements have been extracted directly from the set of initial requirements and quality levels of the automotive industry. The purpose is to obtain significant reliability data well before the tests are launched, and in this way to identify the critical issues and act to correct them.

Fig. 3 shows the overall logic driving the study. The framework within which we have moved is given by the basic idea of integrating, from the very beginning, all the analyses and activities carried out, in parallel as much as possible, for both hardware and software [10]. It is essential that the developers of the software are fully aware of what is taking place on the hardware side, and vice versa. Running all the activities in parallel allows us to evaluate the impact on the different functions early in the project. As we can see from the figure, several cross checks are carried out during the development of the design. These are not intended as formal gates, but rather as check points for assessing the coherent development and exchange of information. The main drivers are the reliability and safety requirements. They are pursued throughout, and each analysis is aimed at implementing, verifying and then checking compliance with the target values. The analysis, as shown in Fig. 3, cascades from functional analysis down to the control loop, through fault trees, and then back to reassess the progress and integrate the results from downstream. The two planes, green and red for SW and HW, always move in tight parallel, maximizing the interchange.

3 Specific problem application

We now introduce the system we have chosen as a case study for the application of our methodology. Speaking in general terms, we can call Vehicle Dynamics Control System (VDC) a generic system aiming at increasing the level of safety during the operation of a common vehicle. The main function is to control the dynamic behaviour of the vehicle, intervening especially whenever the vehicle is approaching the limits of its usage envelope. As a first approach we see the action as being carried out by acting on the brakes and simultaneously controlling the torque produced by the engine.

Typically, a VDC includes functions related to the control of the braking actions (EBD), functions avoiding locking of the brakes during the braking action (ABS), the traction control system (TCS), a function controlling the release of torque in acceleration (ASR), and others. Each of the functions listed above covers several aspects and is carried out by processing various quantities. As an example, an ABS function has to control the various degrees of longitudinal variation of friction according to the different motion conditions. While making a turn, both longitudinal and lateral forces act on the vehicle, and an additional function is called in, Cornering Brake Control (CBC), which takes into account the different loads exerted on the ground by the inner and outer wheels.

The VDC is a very complex system. It is normally made up of a number of electro-hydraulic-mechanical components. Typically we have up to twelve two-way valves coupled to the limiting components, pumps, and actuators. In order to better analyse this part of the system, a dedicated action has been devoted to modeling all the hydraulic components. This modeling is essential in order to advance the knowledge of the system and so move on from an a priori logic to a responsive system which acts according to the real vehicle-ground interaction and the conditions encountered while operating. Once again we would like to underline the importance of integrating the knowledge related to the software running the system with the physical definition of the system itself, especially the electro-hydraulic portion of it.
From our point of view it is useless to investigate the physical functionalities of the system without a substantial verification of the data transmission logic. Hence the use of several commercial software packages for testing the data buses and data interchange.

Assuming now that we want to develop a system similar to VDC, the first step is to think out the overall structure of the system. After all the main functions are identified and a thorough description has been made, the next step is to start the preliminary design. There are several ways of doing this; to maximize the implementation of reliability and safety features from the very beginning [8], a potential development scheme is shown in Fig. 4.

The starting point is represented by a Functional Analysis: a target function is defined, followed by a detailed breakdown of all the sub-functions. In this phase experts from different fields (mechanics, electronics, electrical, etc.) work together to evaluate every single function necessary to comply with the required target. When all functions are clearly identified, it is possible to analyse the components implementing each function from both the hardware and the software point of view. In practical terms the theoretical structure derived from the Functional Analysis becomes a physical structure in which we can see every single element making up the general system.

At this point, a Fault Tree Analysis can be applied to the obtained scheme, and a control loop can be used to verify whether the requirements are satisfied in the case of failure of various components. The control loop is basically a series of logical steps taken by the system engineer, aimed at assessing the consequentiality of all the functions and their full satisfaction by the dedicated hardware.
Since the control loop procedure has not yet been formalized, check lists are being prepared in a generalized way and will be tested as more analyses are carried out on different subsystems. It is important to underline that it is also possible to verify through the control loop whether all fundamental functions have been correctly identified during the functional analysis step.

At the end of this process we have a general system architecture from the point of view of theoretical requirements and from the point of view of both physical hardware and software elements. As the project design unfolds, thorough adherence to the scheme assures that the safety and reliability allocations can be controlled in real time. In particular, the fault tree analysis results (see Fig. 5) can be used to check the failure rate target of each component, and so it is relatively easy to evaluate the trade-offs and part substitutions that raise the overall system reliability.

Reliability data is nowadays widely used in every engineering field. MIL-HDBK-217 and RAC are two of the most widely used sources containing collections of failure rates and other data for various components. Several specific databases have also been built over the years to support design choices in the field of aerospace. In the automotive field these databases are not yet developed to the same extent.

Fig. 4: Development scheme for a VDC-like system (labels: Physical scheme; Analysis of the component to realise required functions)

Fig. 5: VDC Fault Tree Analysis

Fig. 6: Reliability data building and management philosophy

Fig. 6 shows a typical basic policy to improve this situation. As an ancillary activity to this study, the procedure shown has been partially implemented. In particular the work has been focused on increasing the data yield from subcontractors and suppliers.
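The check of a fault tree result against a failure rate target, mentioned above in connection with Fig. 5, can be illustrated with a minimal sketch. It is not the authors' tool: the tree structure, the failure rates, the mission time and the target are all hypothetical placeholders, and the basic events are assumed to be independent with constant failure rates. The function names (`p_fail`, `gate_or`, `gate_and`, `evaluate`) are introduced here only for illustration; the node names loosely follow the labels of Fig. 5.

```python
import math

def p_fail(rate_per_hour, hours):
    """Failure probability of one component over `hours`, constant failure rate."""
    return 1.0 - math.exp(-rate_per_hour * hours)

def gate_or(probs):
    """Output event occurs if at least one independent input event occurs."""
    survive = 1.0
    for p in probs:
        survive *= 1.0 - p
    return 1.0 - survive

def gate_and(probs):
    """Output event occurs only if all independent input events occur."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def evaluate(node, hours):
    """Evaluate a tree of ('basic', rate) leaves and ('or' | 'and', [children]) gates."""
    kind, payload = node
    if kind == "basic":
        return p_fail(payload, hours)
    child_probs = [evaluate(child, hours) for child in payload]
    if kind == "or":
        return gate_or(child_probs)
    if kind == "and":
        return gate_and(child_probs)
    raise ValueError(f"unknown node kind: {kind}")

# Hypothetical failure rates in failures per hour (placeholders, not measured data).
can_failure = ("or", [
    ("basic", 2e-6),  # bus messages not correct
    ("basic", 1e-6),  # bus transmitter not present
    ("basic", 3e-6),  # invalid data on the CAN bus
])
sensor_faults = ("or", [
    ("basic", 5e-6),  # yaw sensor
    ("basic", 5e-6),  # lateral sensor
    ("basic", 4e-6),  # steering angle sensor
])
top_event = ("or", [can_failure, sensor_faults])

MISSION_HOURS = 1000.0  # assumed exposure time
TARGET = 0.05           # assumed maximum acceptable top-event probability

p_top = evaluate(top_event, MISSION_HOURS)
print(f"top-event probability: {p_top:.4f}, target met: {p_top <= TARGET}")
```

Replacing one leaf rate with that of a candidate substitute part and re-running `evaluate` gives a quick view of how a part substitution moves the top-event probability relative to the target.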
As far as we have seen so far, the data collection and management activities are carried out on the two different planes in two not completely compatible ways. It happens that the data relevant to the overall production is not necessarily the data that matters to the equipment manufacturer. This causes biased data collection and, subsequently, biased data transmission.

Due to all of the above it is sometimes very hard to correctly evaluate the overall reliability of a complex system. In the course of the study an alternative strategy, coping with the lack of data, has been evaluated. Starting from the reliability data available, a corrective factor PM (Proceeding Mutation) is generated through a series of analyses aimed at coupling the aeronautical components (i.e., sensors) and the automotive components.

Through the correct application of this factor to aeronautical data, estimated values can be obtained for automotive elements, and an approximate failure rate can be calculated for them. While the appropriate databases are being expanded and refined, a temporary collection of all the PM factors devised will be used and updated, constantly cross-checking the values against the results obtained from experimental tests.

[Fig. 5 node labels: CAN communication failure; configuration faults; bus messages not correct; bus transmitter not present; invalid data on the CAN bus; sensor calibration faults; yaw sensor; lateral sensor; steering angle sensor. Fig. 6 labels: corrective actions; analysis requests.]

4 Identification of the problem

Two main problems have been identified in the course of our work. They are diverse in nature, and can be solved immediately. Firstly, there is the need to identify a methodology that will help, by means of graphical support, in correctly identifying the physical structure of the system being analysed.
To this purpose we have looked at FA, which requires a functional description of the system, and FMEA, which is usually carried out at a more thorough level of detail and with specific descriptions. Hence we will start out from the former to describe the main objectives that our system architecture has to comply with, and then move on to the latter to analyse the different components, their features and the potential failure modes. In doing this the two methodologies come together as a whole and also work as a reciprocal verification.

The second main point is that all too often the analyses of the software and hardware components are carried out separately. In this way the data gathered from the two sides, even though formally correct and complete, are not structurally integrated, and so information regarding the interactions is lost.

To remedy this, a method [4] has been developed, called Hierarchically performed Hazard Origin and Propagation Studies, or HiP-HOPS. These techniques are founded on the principle that all the existing methodologies function well, but need a higher degree of integration to suitably fit the most modern complex systems. The work evolves through the integration of several analyses, with the main purpose of maximizing the automation of the procedures through the development of appropriate tools and software.

5 Fault injection techniques

In order to achieve a more complete reliability analysis, it is deemed useful to analyse system reactions to hardware and software failures. The technique explained in this work, based on fault injection, has proven to be cost effective and capable of providing valuable results. Using software tools like Amesim or Matlab Simulink, it is possible to develop software models to obtain a very close simulation of real events without the use of prototypes; in particular, it is possible to simulate the mathematical logic (Simulink) and the physical elements (Amesim) of a generic electro-mechanical system.
By analysing the mathematical equations and the logic which control the phenomena (Fig. 7 shows the traction control logic), it is possible to simulate a failure in the virtual model, using results from FTA and FA to isolate the most critical components. In this way we can study the behaviour of the system in critical conditions and evaluate whether the general response is sufficient to guarantee the minimum safety value. In addition to this, by using simulation techniques starting from integrated hardware and software FTA and FA analysis, a large number of results can be obtained in a short period of time, and the correctness of the project design can be evaluated before the construction of the physical system (i.e., the first prototypes).

Fig. 8 shows the scheme implemented in Simulink for fault injection. As an example, in order to better understand the process, we can take a failure in the data transmission system. After injecting a generic or specific error in the transmission protocol, or in the hardware implementing it, we evaluate the consequences, comparing the actual output with the set of expected values. The consequences, both over a short period and over a long period, are then evaluated. A wrong signal can be either a lack of information or a set of inconsistent bits. A preliminary result of this analysis is the establishment of the so-called safe operational time. This represents the minimum elapsed time during which the system, thanks to its robust design and reliability, can still operate within the safety limits. Later the analysis provides data about the period of latency and the reboot of the system.

Fig. 7: Traction Control System operational logic (labels: Vehicle State Information; Brake Torques; Control Compensator)

Fig. 8: Simulink scheme for fault injection techniques
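The data transmission example above can be sketched in a few lines of plain Python, used here as a stand-in for the Simulink and Amesim models, which are not reproduced in this paper. Everything in the sketch is an assumption made for illustration: the reference signal, the 16-bit frame encoding, the hold-last-value behaviour of the receiver, the safety limit, and the reading of "safe operational time" as the time from fault injection until the deviation from the expected value first exceeds that limit. The names `reference_signal`, `corrupt` and `safe_operational_time` are hypothetical.

```python
import math

DT = 0.01  # simulation step in seconds (assumed)

def reference_signal(t):
    """Hypothetical yaw-rate signal in deg/s, used as the expected value."""
    return 10.0 * math.sin(0.5 * t)

def corrupt(value, fault):
    """Return what the receiver sees when `fault` is active on the link."""
    if fault == "dropout":   # lack of information: no frame arrives
        return None
    if fault == "bitflip":   # inconsistent bits: one bit of the frame flips
        raw = int(round(value * 100)) & 0xFFFF  # 16-bit two's complement, 0.01 deg/s per bit
        raw ^= 1 << 10
        if raw >= 0x8000:
            raw -= 0x10000
        return raw / 100.0
    raise ValueError(f"unknown fault: {fault}")

def safe_operational_time(fault, fault_start=2.0, limit=3.0, t_end=10.0):
    """Seconds after injection until |received - expected| first exceeds `limit`.

    Returns None if the limit is never exceeded within the simulated window.
    The receiver is assumed to hold its last valid value during a dropout.
    """
    fault_step = int(round(fault_start / DT))
    last_valid = 0.0
    for k in range(int(round(t_end / DT)) + 1):
        expected = reference_signal(k * DT)
        received = expected
        if fault is not None and k >= fault_step:
            received = corrupt(expected, fault)
        if received is None:
            received = last_valid      # hold last valid sample
        else:
            last_valid = received
        if abs(received - expected) > limit:
            return (k - fault_step) * DT
    return None

for fault in (None, "dropout", "bitflip"):
    print(fault, safe_operational_time(fault))
```

In this toy setting a dropout is tolerated for a few seconds because the held value stays close to the slowly varying signal, whereas a flipped high-order bit violates the limit at once; a real study would draw the fault list from the FTA and FA results rather than from two hand-picked cases.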
6 Conclusion and recommendations

Our work has presented the preliminary results and the overall methodological approach to the problem of designing reliability and safety into automotive systems from the very beginning of the development. None of the component analyses used are brand new or innovative in themselves. That was not the purpose of the work. Nevertheless, the overall approach has been viewed as innovative and valuable in the automotive world, where interconnections between the different analyses, and also between software and hardware, functional and physical analysis, are still partially lacking, at least in comparison with the aerospace environment. We have also seen how other researchers, all of them producing outstanding and valuable results, have travelled this road. What is different in this structured method is the leaner approach, aiming at applying just the minimum analysis required at the right time in the development, avoiding massive efforts for preliminary design. We have also highlighted how to let the software and hardware sides talk together right from the beginning, in order to ensure that the development of the functions is correctly transformed into code and that the hardware implementation evolves at the same pace.

The fault injection technique has also proven its effectiveness in supporting the assessment of the system performance in the earlier design phases. The key point is to accurately select the events to generate and to support the simulations adequately, especially for those failures not easily reproducible on track.

The main advantage of applying this methodology is that it avoids common pitfalls and mistakes, especially in the earliest phases of the design, without overburdening the system with cumbersome procedures.

Acknowledgments

The authors wish to gratefully thank Prof. Paolo Maggiore for his support and invaluable help provided throughout the research work, and for reviewing this paper.
He is a well known expert in the field of aerospace systems reliability analysis and design.

References

[1] Delphi Automotive Systems: Ride and Handling Systems. USA, Delphi, 2000.
[2] Newport, J. R.: Avionics Systems Design. CRC Press, 1994.
[3] Henderson, M. F.: Aircraft Instruments and Avionics. Jeppesen Sanderson Training Products Inc., 1993.
[4] Papadopoulos, Y., McDermid, J., Sasse, R., Heiner, G.: Analysis and Synthesis of the Behaviour of Complex Programmable Electronic Systems in Conditions of Failure. Reliability Engineering and Systems Safety 71, 2001, p. 229-247.
[5] Helfrick, A.: Principles of Avionics. 2nd Edition, Avionics Communication Inc., 2002.
[6] Chiesa, S.: Affidabilità, Sicurezza e Manutenzione nel progetto dei Sistemi. Torino, CLUT, 1988.
[7] Galetto, F.: Affidabilità. Vol. 1 Teoria e metodi di calcolo. Torino, CLEUP Editore, 1981.
[8] Society of Automotive Engineers, ARP-4761: Aerospace Recommended Practice: Guidelines and Methods for Conducting Safety Assessment Process on Civil Airborne Systems and Equipment. 12th edition, SAE, USA, 1996.
[9] Palady, P.: Failure Modes and Effect Analysis. PT Publications, USA, 1995.
[10] Crow, K.: Value Analysis and Function Analysis System Technique. USA, DRM Associates.

Ing. Gianfrancesco Maria Repici
phone: +39 011 564 6858
fax: +39 011 564 6899
e-mail: gianfrancesco.repici@polito.it
Department of Aeronautical and Space Engineering

Ing. Aldo Sorniotti
phone: +39 011 564 6915
fax: +39 011 564 6999
e-mail: aldo.sorniotti@polito.it
Department of Mechanical Engineering

Politecnico di Torino
Corso Duca degli Abruzzi, 24
10129 Torino, Italy