Jtam-A4.dvi JOURNAL OF THEORETICAL AND APPLIED MECHANICS 55, 4, pp. 1269-1278, Warsaw 2017 DOI: 10.15632/jtam-pl.55.4.1269 COMPARISON OF BAYESIAN AND OTHER APPROACHES TO THE ESTIMATION OF FATIGUE CRACK GROWTH RATE FROM 2D TEXTURAL FEATURES Matej Mojzeš, Jaroḿir Kukal Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Department of Software Engineering, Praque, Czech Republic e-mail: mojzemat@fjfi.cvut.cz Hynek Lauschmann Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Department of Materials, Praque, Czech Republic The fatigue crack growth rate can be explained using features of the surface of a structure. Among other methods, linear regression can be used to explain crack growth velocity. Non- linear transformations of fracture surface texture features may be useful as explanatory variables. Nonetheless, the number of derived explanatory variables increases very quickly, and it is very important to select only fewof thebest performingones andpreventoverfitting at the same time. To perform selection of the explanatory variables, it is necessary to assess quality of the given sub-model.We use fractographic data to study performance of different information criteria and statistical tests as means of the sub-model quality measurement. Furthermore, to address overfitting,weprovide recommendationsbasedona cross-validation analysis. Among other conclusions, we suggest the Bayesian Information Criterion, which favours sub-models fitting the data considerably well and does not lose the capability to generalize at the same time. Keywords: quantitative fractography, optimization, heuristic, linear regression, sub-model selection 1. Introduction One of the tasks of quantitative fractography consists of modelling the relation between the fatigue crack growth rate (CGR) and textural features of images of fatigue fracture surfaces as explained in (Lauschmann and Siska, 2012; Nadbal et al., 2008; Lauschmann et al., 2006). For this purpose, e.g. amultilinear regressionmodel (Sekeresova andLauschmann, 2008;Kunz et al., 2010) or a neural network may be used. Of these two, the neural network allows us to analyze and describe the structure of the obtained modeland better imagine the textural subset which is mutually related with the CGR (Lauschmann and Goldsmith, 2009). The parameters of the respective regressionmodelmay be estimated using the least squares method. However, in real-world applications, the basic linear model is not flexible enough to fit the data. This can be solved by adding terms defined by non-linear functions of the basic features, e.g. logarithm, square root, etc. However, adding such features is soon limited by the given number of images. According toLauschmannandGoldsmith (2009), onepossiblewayaround this limitation is a two-phase stepwise regressionwith the first stage beingbottom-up stepwise regression beginning with a constant model and terminating at a given overfitting level p0. In each iteration, a new explanatory variable is included – the one which maximally decreases the sum of squares of residui. The second stage is top-down stepwise regression beginning with the final sub-model 1270 M.Mojzeš et al. from the first stage and terminating at the given final overfitting level pF . In this procedure, an explanatory variable is selected for the elimination via theWald test on a selected critical level. While keeping in mind the relevant motivation for this problem, we suggest that instead of the stepwise regression, an alternative statistical approach based on the method of sub- -model multiple testing may provide better results (Mojzeš et al., 2012). There is a vast set of possible criteria that evaluate the quality of a given sub-model and are to beminimized. Further in the paper, we elaborate on the selection and assessment of some of the criteria. They are are interesting in the fractographic context, but may be applied generally to multi-parametric recognition as well. 2. Material and methods 2.1. Linear model Denote by vj the crack growth rate assigned to the j-th image of the fracture surface, and by fuj the set of image features. Themultilinearmodel in its basic form is based on the formula log10vj ≈ c0+ ∑ u cufuj (2.1) Parameters cu can be estimated by the least squares method. Since the linear model is not flexible enough to fit the data, we may add different non-linear functions of basic features and, therefore, modify the model to the following form log10vj ≈ c0+ ∑ q cqhq (2.2) where h are selected from an extended set of features containing the features fu and a selection of their basic non-linear functions, e.g. {hv}⊂P∪Q∪R (2.3) where P= {fu} Q= {log10fu,f−1u ,f1/2u ,f2u}, R= {Fuv,F−1uv ,F1/2uv ,F2uv |u>v} (2.4) where Fuv = fufv. The next task will consist of defining a specific methodology on how to select and assess a distinct combination of explanatory variables, i.e. how to select the best sub-model from the extended feature set. 2.2. Sub-model selection A sub-model should be considered a nested subset of the full model consisting of all the variables from the extended feature set. There are two extreme cases: the first is the full model and the second one corresponds to the constant model. Letn∈ N be the length of v (i.e. thenumberof observations),m∈ N the extended feature set cardinality and k∈{0,1, . . . ,m} the number of explanatory extended feature set variables used in the sub-model. Also, let c= [c0,c1, . . . ,ck] be the vector representing sub-model coefficients calculated solvingEq. (2.2).Thisvector canbedivided intocred = [c1,c2, . . . ,ck] as its significant part and c0 = [c0] as the constant term coefficient. Furthermore, wemay express the sum of squares for the optimum c of a given sub-model as SSQ and SSQ0 as the sum of squares for c0. Lastly, we will use the sub-model error defined as s 2 e = SSQ n−k−1 (2.5) Comparison of Bayesian and other approaches to the estimation of fatigue crack... 1271 2.3. Data description The methods developed in this paper will be applied to fatigue fracture surfaces of three laboratory specimens of the heat-resistant steel P92. Compact tension specimens (Fig. 1) (Lau- schmann et al., 2011) were loaded in air at 20◦C by constant sinusoidal cycles with parameters of the external force according to Table 1. The loading frequency was lowered in steps from 13Hz to 4Hz in the final stage. Fig. 1. Compact tension specimen Table 1.External force parameters Specimen Fmin [N] Fmax [N] 1 140 3300 2 2000 4800 3 3300 5500 The fatigue crack surfaces were documented using SEMwith magnification 200×, real field of view was 0.6mm×0.45mm (examples in Fig. 2). The sequence of images was located in the middleof the crack surface along the sameaxis according towhich the crack lengthwasmeasured (Fig. 3). The recorded areas were mutually shifted by 0.4mm. The direction of crack growth in the imageswas bottom-up.Digital representation in 1200×1600 pixels and 256 brightness values was used. The estimates of CGR were computed from frequently repeated records of the crack length. The course of the CGR related to the crack length was estimated, and every image was assigned a value of the CGR pertinent to its middle. Fig. 2. Examples of SEMpictures of fracture surface; (a) low crack growth rate, (b) high crack growth rate 1272 M.Mojzeš et al. Fig. 3. Layout of snaps in crack surface (schematic plot) For image textural features, energies of a 2D discrete wavelet transformwere taken (Lausch- mannandGoldsmith, 2009). Decomposition using theType3Daubechieswavelet at 8 levels was computed by Matlab function wavedec2. The energy is the mean square of wavelet coefficients for a given level and direction. The basic sequence of features, x1,x2, . . . ,x24, may be regarded as a set of H1,V1,D1, . . . ,H8,V8,D8 where Hj,Vj,Dj are wavelet decomposition energies at the j-th le- vel in the horizontal, vertical and diagonal directions. The vector y represents decimal logarithm of the crack growth rate y= log10v. The analysed data consisted of n = 162 observations and a total of 1224 features in the expanded feature set. It comprises: • basic linear featuresP, card(P) = 24, • non-linear transformations of the basic features Q, card(Q)= 96, • dot product transformations of the basic featuresR, card(R)= 1104, as stated in (2.3). To minimize potential numerical errors when working with the data, input data standardi- zation was implemented as follows xk = hk−Eh√ Dh (2.6) using Eh and Dh as the mean value and dispersion of the explanatory data. Last, but not least – apart from the significance of the data, we canmake use also of physical distribution of the data in the given data set, which is divided randomly into three separate groups. This will be especially useful when dealing with the cross-validation. 2.4. Selection heuristic Searching for the best available sub-model is a binary optimization task that can be defined as minimization of the objective function f:D→ R where D= {x∈{0,1}m | 0¬x¬1} (2.7) is thebinarydomain.Here, thebinaryvectorx is directly representingutilization of the extended feature set, i.e. its components that are equal to “one” are included in the corresponding sub- -model. Therefore, 0 refers to the constant model, and 1 to the full model. Furthermore, suppose that we have an acceptable value of the objective function f∗. Then, we can define a set of solutions, the goal set, as G= {x∈D | f(x)¬ f∗} (2.8) where f ∗ ­min{f(x) | x∈D} (2.9) Comparison of Bayesian and other approaches to the estimation of fatigue crack... 1273 For that purpose, we may utilize some of the well-known heuristic algorithms. We have chosen physically motivated Fast Simulated Annealing (FSA) (Kvasnička et al., 2000) with reputable efficiency in the case of integer optimization tasks. FSA performs mutation on the ring neigh- bourhood N(x)= {y∈D |‖y−x‖1 =1} (2.10) Beginning with k = 0, Tk > 0 and the initial solution vector generated by uniform distribu- tion x0 ∼ U(D), we perform FSA mutation as a uniformly generated random binary vector yk ∼U(N(xk)). Using ηk ∼U([−1,+1]), we set xk+1 =      yk for f(yk)