Проблеми трибології (Problems of Tribology) 2015, № 1

Romanuke V.V.
Khmelnitskiy National University, Khmelnitskiy, Ukraine
E-mail: romanukevadimv@gmail.com

OPTIMIZING PARAMETERS OF THE TWO-LAYER PERCEPTRONS' BOOSTING ENSEMBLE TRAINING FOR ACCURACY IMPROVEMENT IN WEAR STATE DISCONTINUOUS TRACKING MODEL REGARDING STATISTICAL DATA INACCURACIES AND SHIFTS

UDC 539.375.6+539.538+519.237.8

There is a trial of optimization for improving accuracy in tracking metal tool wear states discontinuously, when the finite set of states has been statistically tied to the set of representative wear influencing factors. The range of wear states is presumed to be wholly sampled into those factors. The tracker is a static model based on a boosting ensemble of two-layer perceptrons with nonlinear transfer functions. It successfully regards statistical data inaccuracies and shifts in a problem of tracking 24 wear states featured with 16 wear influencing factors. Having increased the number of classifiers within the ensemble up to 30, the averaged gain with the optimized ensemble is about 56 % with respect to the best ensemble of three classifiers. Similarly, the variance of the tracking error rate over the 24 wear states is about 53 % lower. Nearly the same results are registered when the ensemble is composed without training, just setting every classifier's weight to one thirtieth. To perfect the accuracy further, such equally-weighted compositions shall be investigated in the sequel.

Key words: wear state, statistical data, data inaccuracies, data shifts, tracking model, accuracy, two-layer perceptron, boosting, boosting ensemble training, optimization, tracking error rate.

Tracking wear states regarding statistical data inaccuracies and shifts

Metal processing is an inseparable part of heavy industry.
For rationalized usage of billets and tools, preventing their underuse and overuse, metal wear states are tracked and forecasted. At this, unavoidable statistical data inaccuracies and shifts (SDIS) of wear influencing factors (WIF) should be regarded because of the high stochasticity of the wear process. In tracking wear states regarding SDIS, the tracking accuracy is improved either with refinement of continuous models or by accumulating additional statistics for discontinuous tracking models based on statistical correspondence [1, 2]. The statistical correspondence approach [1] looks well promising inasmuch as regarding SDIS is possible with manipulating huge statistics anyway. Nonetheless, the high stochasticity of the wear process provoking the mentioned SDIS allows using universal classifiers which perform greatly on Gaussian-noised data (GND). And namely GND are the specificity of wear influenced by a great deal of factors (which, upon the whole, are innumerable). This lets those noises be treated as normal.

Approaches to wear state tracking accuracy improvement and the gain

With universal classifiers of GND, there are two major approaches to improving their accuracy: training process optimization and boosting. Based on a boosting technique stated in [3] for an ensemble of three learners, where every learner is a two-layer perceptron with nonlinear transfer functions (2LPNLTF), the averaged gain of the boosting in tracking 24 wear states with 16 WIF exceeded 50 %. This gain is defined with the tracking error rate (TER), indicating virtually the percentage of the classifier's incorrect responses among the total inputs. At the highest level of SDIS in every state, the mean ensemble TER was 6.82 %, and the averaged TER varied between 0.96 % and 1.12 %. And with the ensemble, moreover, the variance of the wear states' TER became more than 50 % lower.
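As a minimal illustration of the TER as the share of incorrect responses among the total inputs (the function name and the toy data here are invented, not from the paper):

```python
import numpy as np

def tracking_error_rate(predicted_states, true_states):
    """TER, in percent: the share of inputs whose forecasted wear state
    does not match the true wear state."""
    predicted_states = np.asarray(predicted_states)
    true_states = np.asarray(true_states)
    return 100.0 * np.mean(predicted_states != true_states)

# One mismatch out of four inputs gives a TER of 25 %.
print(tracking_error_rate([1, 2, 3, 3], [1, 2, 3, 4]))  # 25.0
```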
However, the accuracy gains were reached at raw parameters of the boosting training process (BTP). The rule for redistributing weights of the training samples was naive as well. Thus, those parameters may be optimized in order to get the accuracy perfected.

The article goal and tasks

Taking the example of tracking 24 wear states with 16 WIF from [3], the 2LPNLTF-boosting gain is going to be improved. The improvement is the increment of the statistical approximation accuracy. For this, two raw parameters of BTP are swept within their ranges to find the lowest TER. Along with that, the linear function rule for redistributing weights of the training samples will be adjusted through a set of nonlinear functions. Eventually, we are to evaluate the new gain after testing the optimized 2LPNLTF-boosting.

Tracking wear states with boosting ensemble of 2LPNLTF

In general, there are Q WIF and N \in \mathbb{N} \setminus \{1\} wear states. The state forecasted by the 2LPNLTF is

s^* = \arg\max_{s = \overline{1, N}} v_s , \quad
v_s = \biggl[ 1 + \exp\biggl( - \sum_{k = 1}^{S_{\mathrm{HL}}} u_{k s} \Bigl[ 1 + \exp\Bigl( - \sum_{i = 1}^{Q} x_i a_{i k} - h_k \Bigr) \Bigr]^{-1} - b_s \biggr) \biggr]^{-1}   (1)

by \mathbf{X} = ( x_i )_{1 \times Q} and S_{\mathrm{HL}} neurons in the 2LPNLTF hidden layer. The 2LPNLTF coefficients in the matrices \mathbf{A} = ( a_{i k} )_{Q \times S_{\mathrm{HL}}}, \mathbf{U} = ( u_{k s} )_{S_{\mathrm{HL}} \times N}, \mathbf{H} = ( h_k )_{1 \times S_{\mathrm{HL}}}, \mathbf{B} = ( b_s )_{1 \times N}, particularly in [2, 3], were determined by three methods: "traingda", "traingdx", "trainscg". Initially, the finite statistical data set (FSDS) is \{ ( \mathbf{X}_j, w_j ) \}_{j = 1}^{L} by L \in \mathbb{N} \setminus \{1\}, where the wear state w_j \in \{ \overline{1, N} \} is labeled as \mathbf{X}_j = ( x_i^{(j)} )_{1 \times Q}. All possible wear states are represented in the FSDS: \bigcup_{j = 1}^{L} \{ w_j \} = \{ \overline{1, N} \}, with the s-th state w_{j_s} labeled as the pure representative (PR) \mathbf{X}_{j_s}.
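The forward pass (1) can be sketched as follows — a minimal illustration with logistic units in both layers, where the random coefficient matrices merely stand in for trained values and the helper names are invented:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forecast_state(x, A, U, h, b):
    """Sketch of (1): logistic hidden layer, then logistic output layer.
    Shapes follow the text: A is Q x S_HL, U is S_HL x N,
    h has S_HL entries, b has N entries."""
    hidden = sigmoid(x @ A + h)   # hidden-layer responses
    v = sigmoid(hidden @ U + b)   # output values v_s, s = 1..N
    return int(np.argmax(v)) + 1  # forecasted state s*, 1-based

rng = np.random.default_rng(0)
Q, S_HL, N = 16, 45, 24           # dimensions from the tracked example
x = rng.random(Q)                 # one WIF sample, entries in (0; 1)
s_star = forecast_state(x,
                        rng.standard_normal((Q, S_HL)),
                        rng.standard_normal((S_HL, N)),
                        rng.standard_normal(S_HL),
                        rng.standard_normal(N))
print(1 <= s_star <= N)  # True
```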
The 2LPNLTF training sets are \mathbf{Y}_0 = ( y_{i s} )_{Q \times N} : y_{i s} = x_i^{(j_s)} for PR, and

\mathbf{Y} = [ \mathbf{Y}_0 \;\; \mathbf{Y}_1 \;\; \ldots \;\; \mathbf{Y}_{H R} ] \text{ by } \mathbf{Y}_h = \mathbf{Y}_0 + \boldsymbol{\Theta}_h \circ \boldsymbol{\Xi}_h \;\; \forall \, h = \overline{1, H R}   (2)

for SDIS correspondingly, where R \in \mathbb{N}, every \boldsymbol{\Xi}_h = ( \xi_{i s} )_{Q \times N} is a matrix of values of the standard normal variate \mathcal{N}(0, 1), and \boldsymbol{\Theta}_h = ( \theta_{i s} )_{Q \times N} scales the noise level of the h-th copy. Denoting the \beta-th 2LPNLTF value v_s in (1) as v_s^{\langle \beta \rangle}, the boosted classifier output is

s^* = \arg\max_{s = \overline{1, N}} v_s^{\langle \Sigma \rangle} \text{ by } v_s^{\langle \Sigma \rangle} ( \mathbf{X} ) = \sum_{\beta = 1}^{B} \lambda_{\beta} \, v_s^{\langle \beta \rangle} ( \mathbf{X} )   (3)

for B \in \mathbb{N} \setminus \{1\} 2LPNLTF in the ensemble, where the weights \{ \lambda_{\beta} \}_{\beta = 1}^{B} are determined as follows. In training the ensemble, the set (2) is re-generated T times, forming an FSDS of M = R H N T training samples. These samples have the weights in \mathbf{D}(q) = \{ d_{\mu}(q) \}_{\mu = 1}^{M} at the q-th iteration of BTP by q = \overline{1, q_0} at some final iteration number q_0, where d_{\mu}(1) = M^{-1} \;\forall\, \mu = \overline{1, M}. The matrix \mathbf{A} = ( a_{\beta \mu} )_{B \times M} is of flags of the classifiers' correct responses, where a_{\beta \mu} = 1 for the correct classification of the \mu-th sample by the \beta-th 2LPNLTF, otherwise a_{\beta \mu} = 0. The weighted errors are in \mathbf{E}(q) = \{ \varepsilon_{\beta}(q) \}_{\beta = 1}^{B}, where the \beta-th classifier's weighted error

\varepsilon_{\beta}(q) = \sum_{\mu = 1}^{M} d_{\mu}(q) \bigl( 1 - a_{\beta \mu} \bigr) , \quad \beta = \overline{1, B} .   (4)

Starting from q = 1, there are found the best 2LPNLTF \beta^*(q) and the minimal weighted error \varepsilon^*(q),

\beta^*(q) = \arg\min_{\beta = \overline{1, B}} \varepsilon_{\beta}(q) \text{ and } \varepsilon^*(q) = \min_{\beta = \overline{1, B}} \varepsilon_{\beta}(q)   (5)

respectively, letting learn the coefficient

\lambda(q) = 1 - \varepsilon^*(q)   (6)

and calculate the new distribution \mathbf{D}(q + 1) of weights

d_{\mu}(q + 1) = \tilde{d}_{\mu} \Bigl/ \sum_{m = 1}^{M} \tilde{d}_{m} \text{ by } \tilde{d}_{\mu} = d_{\mu}(q) \exp\Bigl( \lambda(q) \bigl( 1 - 2 a_{\beta^*(q) \mu} \bigr) \Bigr)   (7)

over the M training samples. If \varepsilon^*(q) < 1 - N^{-1} then q is incremented and (4) — (7) are re-found.
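One BTP iteration, as read from (4) — (7), can be sketched like this — the toy flags matrix, the sizes, and the helper name are invented for illustration:

```python
import numpy as np

def btp_iteration(flags, d):
    """One iteration of the boosting training process, per (4)-(7).
    `flags` is the B x M matrix A of correct-response flags,
    `d` the current sample-weight distribution D(q)."""
    eps = (d * (1 - flags)).sum(axis=1)   # weighted errors (4)
    beta_star = int(np.argmin(eps))       # best classifier (5)
    lam = 1.0 - eps[beta_star]            # linear learning rule (6)
    # (7): raise the weights of samples the best classifier got wrong,
    # lower those it got right, then renormalize to a distribution.
    d_new = d * np.exp(lam * (1.0 - 2.0 * flags[beta_star]))
    return beta_star, lam, d_new / d_new.sum()

flags = np.array([[1, 1, 0, 1],           # classifier 0: one error
                  [1, 0, 0, 1]])          # classifier 1: two errors
d = np.full(4, 0.25)                      # d_mu(1) = 1/M
beta_star, lam, d = btp_iteration(flags, d)
print(beta_star, round(lam, 2))           # 0 0.75
```

After the update, the weight of the misclassified third sample grows above the weights of the correctly classified ones, which is what drives the next iteration toward the hard samples.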
If \varepsilon^*(q) \geqslant 1 - N^{-1} then q_0 = q and there are calculated the coefficients

\lambda_{\beta} = \sum_{q \,\in\, \{ q \,:\, \beta^*(q) = \beta \}} \lambda(q) \Bigl/ \sum_{p = 1}^{q_0} \lambda(p) \quad \forall \, \beta = \overline{1, B} .   (8)

For regarding SDIS modeled with \sigma_0 = 0.25 and the shift factor 1.5 for Q = 16, N = 24, \{ x_i^{(j_s)} \}_{i = 1}^{16} \subset (0; 1), by S_{\mathrm{HL}} = 45 and B = 3 in [3], the FSDS (2) for BTP was formed by R = 1, H = 20, T = 100. Denote the averaged TER by \tau(H, T, g) for the power g > 0 forming a set of nonlinear functions

\lambda(q) = \bigl[ 1 - \varepsilon^*(q) \bigr]^{g}

instead of (6). Thus, the problem is

\min_{H} \min_{T} \min_{g > 0} \tau(H, T, g)

and to find H^*, T^*, g^*, for which

\{ H^*, T^*, g^* \} = \arg\min_{H, \, T, \, g > 0} \tau(H, T, g) .   (9)

Having swept those three parameters of BTP within their reasonable ranges, it has been exposed that by H \in \{ \overline{1, 40} \}, T \in \{ \overline{1, 200} \}, and g \in (0; 10] the TER remains nearly the same. At the most, the averaged value \tau(H^*, T^*, g^*) doesn't seem to be significantly less than the TER registered in [3]. An important fact is that by H \in \{ \overline{1, 5} \} and T \in \{ \overline{1, 5} \} the inequality \varepsilon^*(q) \geqslant 1 - N^{-1} is never true for some BTP, so such BTP never terminate. Sometimes this gap can be dealt with by setting up a best classifier weighted error tolerance (BCWET) \varepsilon_{\mathrm{BCWET}} > 0. Then the items (4) — (7) are re-found while \varepsilon^*(q) < 1 - N^{-1} - \varepsilon_{\mathrm{BCWET}}. For this, the BCWET may be taken as \varepsilon_{\mathrm{BCWET}} \in [ 0.01 N^{-1}; \, 0.1 N^{-1} ] or other reasonable values. Nonetheless, here H^* = T^* = 6 and g^* = 1 because the greater the values of H and T, the longer the BTP is. However, when a much greater number of 2LPNLTF are assembled, the TER becomes lower. For instance, \tau(20, 100, 1) \approx 0.75 is expected with B = 30 classifiers. Amazingly enough, here the TER also isn't influenced much when H and T increase. And with H = T = 30 the BTP convergence is ensured even for \varepsilon_{\mathrm{BCWET}} = 0.
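Forming the ensemble weights (8) amounts to crediting each learned λ(q) to the winning classifier β*(q) and normalizing the totals to sum to one. A minimal sketch, with an invented iteration log standing in for a real BTP run:

```python
import numpy as np

def ensemble_weights(B, winners, lams):
    """Sketch of (8): accumulate lambda(q) per winning classifier
    beta*(q), then normalize over all iterations q = 1..q0."""
    totals = np.zeros(B)
    for beta, lam in zip(winners, lams):
        totals[beta] += lam
    return totals / totals.sum()

winners = [0, 2, 0, 1]        # beta*(q) over iterations q = 1..q0
lams = [0.8, 0.6, 0.4, 0.2]   # lambda(q) learned at each iteration
w = ensemble_weights(3, winners, lams)
print(np.round(w, 2))         # [0.6 0.1 0.3]
```

By construction the weights form a distribution over the B classifiers, so the boosted output (3) is a convex combination of the individual 2LPNLTF outputs.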
Optimization results and discussion

In the problem of tracking 24 wear states with universal GND classifiers, the boosting ensemble of 30 2LPNLTF is trained optimally under the parameters H^* = 30, T^* = 40, and g^* = 1. This is so, partially, because misconvergence was spotted by H = T = 20. Although the inequality \tau(60, 200, 1) < \tau(30, 40, 1) is expected, the BTP time is obviously shorter for lower H and T, and the difference between \tau(60, 200, 1) and \tau(30, 40, 1) is too small and unstable. The best ensemble has \tau(30, 40, 1) = 0.63 %, and it takes about 330 seconds to train it, i. e. to find the distribution \{ \lambda_{\beta} \}_{\beta = 1}^{30}. Moreover, \tau(30, 40, 1) = 4.8 % at the highest level of SDIS in every state, giving the 56 % gain with respect to the best ensemble of three 2LPNLTF. And the variance of the 24 wear states' TER has become 53 % lower with a 10 times greater number of 2LPNLTF.

The astounding event is that without factual training, just with setting \lambda_{\beta} = 1/30 \;\forall\, \beta = \overline{1, 30}, a TER close to the optimal \tau(30, 40, 1) is revealed. This is explained by the fact that all 2LPNLTF are roughly similarly-trained GND classifiers, without focusing on specific SDIS. Such a fact is a heuristic alternative to the optimization of BTP.

Conclusion

Tracking wear states regarding SDIS at low TER is essential for metal processing. With the optimized boosting 2LPNLTF ensemble as a wear state discontinuous tracking model, higher accuracy is reached. In the presented example, just every twentieth state is tracked erroneously at the highest level of SDIS. On average, just one state of 158 is tracked erroneously. All that is needed is a high-precision statistical correspondence of the 24 wear states to 24 groups of 16-dimensional points labeled as the 24 PR. The rest of the correspondence is modeled as the FSDS (2) owing to wear being valued as GND. It is a way of real implementation of the rationalized usage of billets and tools under control of their wear.
Generally, the optimized 2LPNLTF-boosting shall work in solving problems having various Q and N. A remarkable property of straight boosting under simply \lambda_{\beta} = B^{-1} \;\forall\, \beta = \overline{1, B} is going to be explored to perfect the tracker accuracy further.

References

1. Chungchoo C., Saini D. On-line tool wear estimation in CNC turning operations using fuzzy neural network model. International Journal of Machine Tools and Manufacture. 2002. Volume 42, Issue 1. P. 29 - 40.
2. Romanuke V.V. Wear state discontinuous tracking model as two-layer perceptron with nonlinear transfer functions being trained on an extended general totality regarding statistical data inaccuracies and shifts. Problems of tribology. 2014. N. 3. P. 50 - 53.
3. Romanuke V.V. Accuracy improvement in wear state discontinuous tracking model regarding statistical data inaccuracies and shifts with boosting mini-ensemble of two-layer perceptrons. Problems of tribology. 2014. N. 4. P. 55 - 58.

Received 09.02.2015

Romanuke V.V. Optimizing parameters of the two-layer perceptrons' boosting ensemble training for accuracy improvement in the wear state discontinuous tracking model regarding inaccuracies and shifts in statistical data.

A trial of optimization is presented for improving the accuracy of discontinuous tracking of metal tool wear states, when the finite set of these states has been statistically tied to a set of representative wear influencing factors. The range of wear states is considered to be wholly sampled by these factors. The tracker is a static model based on a boosting ensemble of two-layer perceptrons with nonlinear transfer functions.
It successfully regards inaccuracies and shifts in statistical data in the problem of tracking 24 wear states with 16 wear influencing factors. Having increased the number of classifiers in the ensemble to 30, the averaged gain with the optimized ensemble makes about 56 % with respect to the best ensemble of three classifiers. Analogously, the variance of the tracking error rate over the 24 wear states is almost 53 % lower. Approximately the same results are registered when the ensemble is composed without training, just with equating every classifier's weight to one thirtieth. Such equally-weighted compositions will be investigated further in order to obtain still more perfected accuracy.

Key words: wear state, statistical data, data inaccuracies, data shifts, tracking model, accuracy, two-layer perceptron, boosting, boosting ensemble training, optimization, tracking error rate.