Ratio Mathematica, 21, 2011, pp. 91-105 91 An Optimization Framework for “Build-or-Buy” Strategy for component Selection in a Fault Tolerant Modular Software System under Recovery Block Scheme P.C.Jha* Ritu Arora** U.Dinesh Kumar*** *Department of Operational Research , University of Delhi,India **Maharaja Agrasen Institute of Technology, GGSIP University, Delhi,India. ***Indian Institute of Management, Bangalore, India *jhapc@yahoo.com ** arora_ritu21@yahoo.co.in ***dineshk@iimb.ernet.in Abstract This paper discusses a framework that helps developers to decide whether to buy or build components of software architecture. Two optimization models have been proposed. First model is Bi-criteria optimization model based on decision variables in order to maximize the software reliability with simultaneous minimization of the overall cost of the system. The second optimization model deals with the issue of compatibility. Keywords : Modular software, software reliability, software cost, fault tolerance, software components, recovery block scheme 1. Introduction Science and technology demand high quality software for making improvement and breakthroughs. Today, computer hardware and software permeates our modern society. The newest cameras, VCRs, and automobiles cannot be controlled and operated without computers. When the requirement for and dependencies on computer increases, the possibility of crises from computer failures also increases. Software systems are developed as per the requirements given by the users. While developing the software, quality and reliability of the software are two key factors. Reliability of a software system is defined as the probability that software operates without failure in a specified environment, during a specified exposure period. Introduction of redundancy in the parts of the hardware and/or software components is one of the most followed ways to improve the reliability of the system under development. A careful use of redundancy may allow the system to tolerate faults. Despite that we still cannot guarantee error free software. A way of handling unknown and unpredictable mailto:*jhapc@yahoo.com mailto:arora_ritu21@yahoo.co.in mailto:***dineshk@iimb.ernet.in 92 software failures is through fault tolerance. One way to reduce the risks of software design faults and thus enhance software dependability is to use software fault tolerance techniques. Software fault tolerance techniques are employed during the procurement, or development, of the software. They enable a system to tolerate software faults remaining in the system after its development. When a fault occurs, these techniques provide mechanisms to the software system to prevent system failure from occurring. There are two structural methodologies for Fault Tolerant System i.e. Recovery Block Scheme and N-Version Scheme. In this paper, we will discuss optimization model for recovery block. Non functional aspects play a significant role in determining software quality. Given the fact that lack of proper handling of non functional aspects (Cysneiros et al, [5]) of a software application has led to a series of software failures, nonfunctional attributes such as reliability security and performance should be considered during the component selection phase of software development. This paper discusses a framework that helps developers to decide whether buying or building components of software architecture on the base of cost and non functional factors. While developing software, components can be both bought as COTS (Commercial Off-The Shelf) products, and probably adapted to work in the software system, or they can be developed in-house. This decision is known as “build-or-buy decision”. This decision affects the overall cost and reliability of the system. Most of today’s software systems include one or more COTS products. COTS are pieces of software that can be reused by software projects to build new systems. Benefits of COTS based development include significant reduction in the development cost, time and improvement in the dependability requirement. No changes are normally made to their source codes. COTS components are used without any code modification and inspection. The components, which are not available in the market or cannot be purchased economically, can be developed within the organization and are known as in- house built components. Kapur et al [8] discussed issues related to reliability of systems through weighted maximization of system quality subject to budgetary constraint. This paper discusses the issues related with reliability of the software systems and cost produced by integrating COTS or in-house build components. Large software system has modular structure to perform a set of functions. Each function is performed by different modules having different alternatives for each module. In case a COTS component is selected then different versions are available for each 93 alternative and only one version will be selected for each alternative of a module. If a component is in-house build component, then the alternative of a module is selected. A schematic representation of the software system is given in Figure 1. We are selecting the components for modules to maximize the system reliability by simultaneously minimizing the cost. The frequency with which the functions are used is not same for all of them and not all the modules are called during the execution of a function, the software has in its menu. Software whose failure can have bad effects afterwards can be made fault tolerant through redundancy at module level (Belli and Jadrzejowicz, [1]). We assume that functionally equivalent and independently developed alternatives (i.e In-house or COTS) for each module are available with an estimated reliability and cost. The first optimization model (optimization model-I) of this paper maximizes the system reliability with simultaneously minimizing the cost. The model contains four problems (P1), (P2), (P3) and (P4). Problem (P1) is not in normalized form, therefore, it has been normalized and transformed into problem (P3) and (P4). The second optimization model (optimization model-II) considers the issue of compatibility between different alternatives of modules as it is observed that some COTS components cannot integrate with all the alternatives of another module. The models discussed are illustrated with numerical example. 2. Notations R : System quality measure lf : Frequency of use, of function l ls : Set of modules required for function l iR : Reliability of module i L : Number of functions, the software is required to perform n : Number of modules in the software im : Number of alternatives available for module i ijV : Number of versions available for alternative j of module i Total number of tests performed on the in- house developed instance (i.e. alternative of module ) Number of successful (i .e failure free) test performed on the in-house developed instance (i.e. alternative of module ) 1 t : Probability that next alternative is not invoked upon failure of the current alternative : tot ij N j i : suc ij N j i 94 2t : Probability that the correct result is judged wrong. 3 t : Probability that an incorrect result is accepted as correct. ij X : Event that output of alternative j of module i is rejected. Yij : Event that correct result of alternative j of module i is accepted. sij : Reliability of alternative j of module i r : Reliability of version of alternative for module ijk k j i ijk C : Cost of version k of alternative j for module i ijk r : Reliability of version k of alternative j for module i ijk d : Delivery time of version k of alternative j for module i ij c : Unitary development cost for alternative j of module i ij t : Estimated development time for alternative j of module i ij  : Average time required to perform a test case for alternative j of module i ij  : Probability that a single execution of software fails on a test case chosen from a certain input distribution t y : 0, if constraint is active 1, if constraint is inactive th th t t    : ijk x : 1, if the of COTS alternative of the module is chosen 0, otherwise th th th k version j i   ijz : Binary variable taking value 0 or 1 1 , if alternative is present in module 0, otherwise j i   3. Optimization Models The first optimization model is developed for the following situations which also hold good for the second model, but with additional assumptions related to compatibility among alternatives of a module. ijy 1 if the th alternative of th module is in-house developed. 0 otherwise j i   95 The following assumptions are common for the optimization models are: 1. Software system consists of a finite number of modules. 2. Software system is required to perform a known number of functions. The program written for a function can call a series of modules  n . A failure occurs if a module fails to carry out an intended operation. 3. Codes written for integration of modules don’t contain any bug. 4. Several alternatives are available for each module. Fault tolerant architecture is desired in the modules (it has to be within the specified budget). Independently developed alternatives (primarily COTS/ In-House components) are attached in the modules and work similar to the recovery block scheme discussed in (Berman et al., [2] and Kumar, [9]). 5. The cost of an alternative is the development cost, if developed in house; otherwise it is the buying price for the COTS product. 6. Different In- house alternatives with respect to unitary development cost, estimated development time, average time and testability of a module are available. 7. Cost, reliability and development time of an in-house component can be specified by using basic parameters of the development process, e.g., a component cost may depend on a measure of developer skills, or the component reliability depends on the amount of testing. 8. Different versions with respect to cost, reliability and delivery time of a module are available. 9. Other than available cost-reliability versions of an alternative, we assume the existence of virtual versions, which has a negligible reliability of 0.001, zero cost and zero delivery time. These components are denoted by index one in the third subscript of , C and . ijk ijk ijk x r for example 1ij r denotes the reliability of first version of alternatives j for module i . 3.1 Model Formulation Let S be a software architecture made of n modules having im alternatives available for each module and each COTS alternatives has different versions. 3.1.1 Build versus Buy Decision For each module i , if an alternative is bought (i.e. some 1 ijk x  ) then there is no in-house development (i.e. 0 ij y  ) and vice versa. 96 1 =1; 1, 2,...., and 1, 2,...., ijV ij ijk i k y x i n j m     3.1.2 Redundancy Constraint The equation stated below guarantees that redundancy is allowed for the components. 2 ; 1, 2,...., and 1, 2,...., ijV ij ijk ij i k y x z i n j m      1 1; 1, 2,...., and 1, 2,...., ij ij i x z i n j m    1 1; 1, 2,.... im ij j z i n    3.1.3 Probability of Failure Free In-house Developed Components The possibility of reducing the probability that the alternative of module th th j i fails by means of a certain amount of test cases (represented by the variable tot ij N ). Cortellessa et al [4] define the probability of failure on demand of an in-house developed alternative of module th th j i , under the assumption that the on-field users’ operational profile is the same as the one adopted for testing (Bertolino and Strigini, [3]). Basing on the testability definition, we can assume that the number suc ij N of successful (i.e. failure free) tests performed on th j alternative of same module.  1 ; 1, 2,...., and 1, 2,....,suc totij ij ij iN N i n j m    Let A be the event “ suc ij N failure – free test cases have been performed ” and B be the event “ the alternative is failure free during a single run ”.If ij  is the probability that the in- house developed alternative is failure free during a single run given that suc ij N test cases have been successfully performed, from the Bayes Theorem we get ( / ) ( ) ( / ) ( / ) ( ) ( / ) ( ) ij P A B P B P B A P A B P B P A B P B     The following equalities come straightforwardly: ( / ) 1; ( ) 1 ; ( / ) (1 ) ; ( ) suc ijN ij ij ijP A B P B P A B P B        97 therefore, we have     1 ; 1, 2,...., and 1, 2,...., 1 1 suc ij ij ij iN ij ij ij i n j m             3.1.4 Reliability equation of both in-house and COTS components The reliability ( ij s ) of th j alternative of th i module of the software. ; 1, 2,...., and 1, 2,...., ij ij ij ij i s y r i n j m    where 1 ; 1, 2,...., and 1, 2,...., ijV ij ijk ijk i k r r x i n j m     3.1.5 Delivery time constraint The maximum threshold T has been given on the delivery time of the whole system. In case of a COTS components the delivery time is simply given by ijk d , whereas for an in- house developed alternative the delivery time shall be expressed as ( ) tot ij ij ij t N .   1 1 1 iji Vmn tot ij ij ij ij ijk ijk i j k y t N d x T                3.2 Objective Function 3.2.1 Reliability objective function Reliability objective function maximizes the system quality (in terms of reliability) through a weighted function of module reliabilities. Reliability of modules that are invoked more frequently during use is given higher weights. Analytic Hierarchy Process (AHP) can be effectively used to calculate these weights. 1 Maximize ( ) l L l i l i s R X f R     where i R is the reliability of module i of the system under Recovery Block stated as follows.     ,.........2,1 ; 1 1 1 niYPXPzR ij i ij z ij m j j k z ikiji                   1 3 21 1 1ij ij ijP X t s t s t       98    2 1ij ijP Y s t  3.2.2 Cost objective function Cost objective function minimizes the overall cost of the system. The sum of the cost of all the modules is selected from “build – or - buy” strategy. The in-house development cost of the alternative j of module i can be expressed as  totij ij ij ijc t N   1 1 1 Minimize C(X)= iji Vmn tot ij ij ij ij ij ijk ijk i j k c t N y C x              3.3 Optimization Model I In the optimization model it is assumed that the alternatives of a module are in a Recovery Block. In recovery block more than one alternative of a program exist. For COTS based software multiple alternatives of a module can be purchased from different vendors. Each module works under a recovery block. Upon invocation of a module the first alternative is executed and the result is submitted for acceptance test. If it is rejected, the second alternative is executed with the original inputs. The same process continues through attached alternative until a result is accepted or the whole recovery block (module) fails. Fault tolerance in a recovery block is achieved by increasing the number of redundancies. Problem (P1) 1 Maximize ( ) l L l i l i s R X f R      (1)   1 1 1 Minimize C(X)= iji Vmn tot ij ij ij ij ij ijk ijk i j k c t N y C x              (2) Subject to  and y are binary variable/ijk ijX S x      ,.........2,1 ; 1 1 1 niYPXPzR ij i ij z ij m j j k z ikiji             (3)  1 , 1, 2,...., and 1, 2,....,suc totij ij ij iN N i n j m    (4)       1 3 2 1 1 1 ij ij ijP X t s t s t          2 1ij ijP Y s t  99     1 ; 1, 2,...., and 1, 2,...., 1 1 suc ij ij ij iN ij ij ij i n j m             (5) ; 1, 2,...., and 1, 2,...., ij ij ij ij i s y r i n j m    (6) 1 =1; 1, 2,...., and 1, 2,...., ijV ij ijk i k y x i n j m     (7) 2 ijV ij ijk ij k y x z    ; 1, 2,...., and 1, 2,...., ii n j m  (8) 1 1; 1, 2,...., and 1, 2,...., ij ij i x z i n j m    (9) 1 1 ; 1, 2,...., im ij j z i n    (10)   1 1 1 iji Vmn tot ij ij ij ij ijk ijk i j k y t N d x T               (11)  Where X is a vector of elements : and ; 1, 2,..... ; 1, 2,...., ; k=1,2,....V ijk ij i ij x y i n j m  3.3.1 Normalization The problem (P1) is Bi- criteria optimization problem in which on one hand system reliability is maximized and other hand cost of selected components to form / assemble the system is minimized. The reliability which is unit free is measured between zero and one whereas cost has its unit. Two objectives can be converted to single objective programming problem either if both objectives are of same unit or if both objectives can be made unit free. To make cost function unit free, the following transformation is used. 1 1 1 iji Vmn ijk i j k c C      ,   1 1 imn tot ij ij ij ij i j c c t N     Now   , and 1 tot ij ij ij ijijk ijk ijijk ij c t NC C c C c c c c c        The resulting problem then can be rewritten as follows. Problem (P2) Maximize   F 1 1      L l si il l RfX 100 Minimize 2 1 1 1 ( ) iji Vmn ij ij ijk ijk i j k F X c y C x               Subject to SX  The problem (P2) can further be written as vector optimization problem as. Problem (P3) Vector Max  XF Subject to SX  where       TXFXFXF 21 , 3.3.2 Finding Properly Efficient Solution Definition 1 (Steuer, [10]): A feasible solution SX  * is said to be an efficient solution for the below problem if there exists no SX  such that    *XFXF  and    *XFXF  Definition 2 (Steuer, [10]): An efficient solution SX  * is said to be an properly efficient solution for the problem (P2) if there exist 0 such that for each r            XFXFXFXF jjrr ** / for some j with    *XFXF jj  and    *XFXF rr  for SX  . Using Geoffrion’s scalarization the problem (P2) reduces to Problem (P4) Maxize Z= 1 1 2 2F F  Subject to SX  0, 1 2121   Lemma(Geoffrion,[6]):The optimal solution of the problem (P4) for fixed 21 and  is a properly efficient solution for the problem (P3) as well as (P1). 3.4 Optimization Model II Optimization model II is an extension of optimization model I. As explained in the introduction, it is observed that some alternatives of a module may not be compatible with alternatives of another module (Jung and Choi, [7]). The next optimization model II addresses this problem. It is done, incorporating additional constraints in the optimization models. This constraint can be represented as chugsq t xx  , which means that if alternative s for module g is chosen, then 101 alternative ztu t ,........1 ,  have to be chosen for module h . We also assume that if two alternatives are compatible, then their versions are also compatible. , tgsq hu c t x x My  ghugs msVVq t ,.....,1 , ,......,2 c , ,.......,2  (12)    2 thut Vzy (13) Constraint (12) and (13) make use of binary variable t y to choose one pair of alternatives from among different alternative pairs of modules. Problem (P3) can be transformed to another optimization problem using compatibility constraints and if more than one alternative compatible component is to be chosen for redundancy, constraint (13) can be relaxed as follows.    2 thut Vzy (14) 4. Illustrative Examples Consider a software system having two modules with more than one alternative for each module. The data sets for COTS and in-house developed components are given in Table-1 and table II, respectively. Let      1 2 3 1 2 33, 1, 2,3 , 1,3 , s 2 , 0.5, 0.3 and 0.2L s s f f f       . It is also assumed that 01. and 05. ,01. 321  ttt FUNCTIONS VERSION MODULE MODULE MODULE Figure 1 Structure of the software 1f 2f lf 1m 2m nm A L T E R N A T I V E S 102 Table 1: Data set for COTS components Table 2Data set for In-House conponents 4.1 Optimization Model – I Table 3 presents the solution for optimization model I. The problem is solved using software package LINGO (Thiriez, [11]). The solution to the model gives the optimal component selection for the software system along with the corresponding cost and reliability of the overall system. The sensitivity analysis to the delivery time constraint has been performed. It is clearly seen from the table Alternati ves Versions 1 2 3 Cost Reliability Delivery Time Cost Reliability Delivery Time Cost Reliability Delivery Time 1 1 0 0.001 0 14 0.90 3 11 0.88 4 2 0 0.001 0 12.5 0.86 4 18 0.92 2 3 0 0.001 0 17 0.90 2 15 0.88 3 2 1 0 0.001 0 13 0.87 4 17. 5 0.86 2 2 0 0.001 0 11 0.91 5 12 0.89 4 3 0 0.001 0 18 0.89 2 15 0.86 3 4 0 0.001 0 13 0.86 4 14 0.88 3 3 1 0 0.001 0 16 0.85 3 18 0.90 2 2 0 0.001 0 16 0.89 3 17 0.87 2 Module i Alternatives ijt ij  ij c ij  1 1 8 0.005 5 0.002 2 6 0.005 4 0.002 3 7 0.005 4 0.002 2 1 9 0.005 5 0.002 2 5 0.005 2 0.002 3 6 0.005 4 0.002 4 5 0.005 3 0.002 3 1 6 0.005 4 0.002 2 5 0.005 3 0.002 103 that when the delivery time was 10 units, then only COTS components were selected. When the delivery time increases along with the COTS components, in house build components were also selected. When the delivery time was 12 units, only one in-house component was developed with the minimum cost 79 units attained at reliability level 0.85.Our system cost decreases while the corresponding reliability increases because the components developed in-house decreases the cost initially but later if the level of reliability has to be kept at 0.90 then by increasing delivery time by 5 and 9 units respectively, more in-house build components were selected which in turn increases the cost and reliability of the overall system. Redundancy is also there in all the four cases. Table 3: Solution of Optimization Model I 4.2 Optimization Model-II To illustrate optimization model for compatibility, we use previous results. Case 1. Delivery Time is assumed to be 10 units. We assume third alternative of second module is compatible with second and third alternatives of first module. 111 123 133 1x x x   Case No. Delivery Time COTS IN-House System Reliability Overall system Cost Joint Objective Value 1 10 111 123 132 211 221 232 242 311 322 1 1 1 x x x x x x x x x          Nil 0.84 82 0.66 2 12 111 123 132 211 232 241 311 322 1 1 1 x x x x x x x x         22 1y  0.85 79 0.68 3 17 111 123 132 211 221 232 311 1 1 1 x x x x x x x        24 32 1y y  0.93 86 0.74 4 21 111 132 211 221 232 311 1 1 1 x x x x x x       12 24 32 1y y y   0.94 92 0.75 104 211 221 232 242 1x x x x    311 322 1x x  It is observed that due to the compatibility condition, third alternative of first module is chosen as it is compatible with third alternative of second module. The system reliability for the above solution is 0.84 and cost is 81 units. Case 2. Delivery Time is assumed to be 12 units. We assume second alternative of third module is compatible with second and third alternatives of first module. 22 1y  ; 111 123 133 211 232 241 311 322 1 1 1 x x x x x x x x         It is observed that due to the compatibility condition, third alternative of first module is chosen as it is compatible with second alternative of third module. The system reliability for the above solution is 0.85 and cost is 77 units. Case 3. Delivery Time is assumed to be 10 units. We assume third alternative of second module is compatible with second and third alternatives of first module. 24 32 1y y  111 123 133 211 221 232 311 1 1 1 x x x x x x x        It is observed that due to the compatibility condition, third alternative of first module is chosen as it is compatible with third alternative of second module. The system reliability for the above solution is 0.94 and cost is 84 units. 5. Conclusions We have presented optimization models that supports the decision whether to buy software components or to build them in-house upon designing structure. A fault tolerant software structure for Recovery block scheme is discussed. A numerical example is presented to support these models. When delivery time is small then all the COTS components were selected and redundancy is allowed. But as the delivery time increases along with the COTS components in-house components 105 were also selected and different impacts on cost and reliability were considered. Redundancy was also there in all the cases. References [1] F. Belli and P. Jadrzejowicz, An approach to reliability optimization software with redundancy, IEEE Transaction of Soft. Engineering, vol.17/3(1991), pp. 310-312. [2] O. Berman and U. D.Kumar, Optimization models for reliability of modular software system, IEEE Transactions of Software Engineering, vol. 19/11(1993), pp.1119-1123. [3] A. Bertolino, and L. Strigini, On the use of testability measures for dependability assessment, IEEE Transactions on Software Engineering, 22/2(1996), pp.97-108. [4] V. Cortellessa, F. Marinelli, and P. Potena, An optimization framework for “build-or-buy” decisions in software architecture, Computers and Operations Research, vol.35(2008), pp. 3090-3106. [5] L. M. Cysneiros and J.C.S. Leite, Nonfunctional requirements: from elicitation to conceptual models, IEEE transactions on Software engineering, vol.30 (2004), pp. 328-350. [6] A. M. Geoffrion, Proper efficiency and theory of vector maximization, Journal of Mathematical Analysis and Application, vol.22 (1968), pp. 613-630. [7] H. W.Jung and B. Choi, Optimization models for quality and cost of modular software system, European Journal of Operations Research, vol.112(1998), pp. 613-619. [8] P. K. Kapur, A.K. Bardhan and P.C. Jha, Optimal reliability allocation problem for a modular software system, OPSEARCH, vol.40/2(2003). [9] U. D. Kumar, Reliability analysis of fault tolerant recovery block, OPSEARCH, vol.35(1998),pp. 281-294. [10] R. E. Steuer, Multiple Criteria optimization: theory, computation and application, Wiley, New York ,1986. [11] H. Thiriez, OR software LINGO, European Journal of Operational Research, vol.12(2000), pp.655-656.