HUNGARIAN JOURNAL 
OF INDUSTRIAL CHEMISTRY 

VESZPREM 
Vol. 29. pp. 123-127 (2001)

OBTAINING QUANTITATIVE INFORMATION ON THE FLUCTUATION OF
THE ACTIVE INGREDIENT CONTENT IN DRUGS - WHAT WOULD THE
CUSTOMER FIND

A. DREGELYI-KISS and S. KEMENY

(Department of Chemical Engineering, Budapest University of Technology and Economics, 
H-1521 Budapest, HUNGARY) 

Received: October 8, 2001 

This paper was presented at the 7th International Workshop on Chemical Engineering Mathematics, Bad Honnef, 
Germany, August 12-17, 2001 

The active ingredient content of tablets is not uniform due to inhomogeneity and the fluctuation of the process 
circumstances. Moreover, the measured data are subject to measurement (analytical) error. Both the consumer and the 
producer should be aware of the possible range of active ingredient content of the tablets. The analysis of variance 
technique was used in the context of nested designs. Several variance components and their confidence ranges were 
calculated utilising the Satterthwaite-approximation. 
The customers may also control the product quality. Our purpose is to study the measurement process as the customer
would perform it, asking in which range the customer finds the amount of the key compound in a
tablet purchased at a pharmacy. Various cases are compared concerning the measurement precision and the way the chemical
analysis is performed by the customer, calculating the ranges in which the active ingredient content could be found with
95% probability. The width of these ranges may be affected by the bias of the Satterthwaite-approximation.

Keywords: nested design, Satterthwaite-approximation, confidence intervals, variance components, drug analysis 

Introduction 

In the pharmaceutical industry there are strict guidelines
for checking the manufacturing processes in order to assure
steady quality. Companies have to
elaborate their own specifications related to the
processes, chemical analysis, etc. These guidelines
contain the appropriate design of experiments, describing
how to perform the measurements, and the statistical
methods used to appraise the results, for instance giving the
confidence interval for the expected value in a 3x3
design. In this paper we examine the relevant guidelines
and ask some questions from the customer's point of
view.

Data source 

Table 1 contains data obtained from a real
manufacturing process in the course of the current
guideline and process validation of the factory. During
the batch-wise production of drugs the tablets were
collected in lose-boxes. Tablets of one batch of the
finished product are collected into 13 or 14 lose-boxes. The
first five boxes are called the beginning of the batch
(first fraction), the 6-9/10th boxes are the middle and the
10/11-13/14th boxes are the end of the batch. In order to
control the process the active ingredient content of the
drugs has to be measured. During the sampling 2-3
tablets were taken from each of the three fractions of
three batches; they were pulverized, and the powder
samples (altogether 9) were analysed three times each.
The analytical procedure was high-temperature HPLC
with low-wavelength detection. These data are shown in
Table 1. The declared active ingredient content of the
tablets, calculated for the average mass of a tablet, is 2.5
mg ±5%, i.e. 2.375 mg - 2.625 mg; thus these samples
have met this requirement.

Contact information: E-mail: dregelyi.vmt@chem.bme.hu, kemeny.vmt@chem.bme.hu



Table 1 The active ingredient content of drugs in several batches, from different fractions of batches and with repeated chemical analysis

Batch  Sampling fraction  Mass [mg]    Batch  Sampling fraction  Mass [mg]    Batch  Sampling fraction  Mass [mg]
1      first              2.60         2      first              2.58         3      first              2.55
1      first              2.59         2      first              2.57         3      first              2.56
1      first              2.60         2      first              2.56         3      first              2.58
1      middle             2.62         2      middle             2.58         3      middle             2.56
1      middle             2.60         2      middle             2.58         3      middle             2.60
1      middle             2.62         2      middle             2.58         3      middle             2.57
1      end                2.57         2      end                2.59         3      end                2.57
1      end                2.57         2      end                2.59         3      end                2.56
1      end                2.58         2      end                2.57         3      end                2.57

ANOVA and variance components

Model

The measured data were processed by analysis of
variance (ANOVA); the Statistica for
Windows software was used for the calculations.

The experimental design contains batch as a random
factor with 3 levels (1, 2, 3), sampling fraction as a
random factor with 3 levels (first, middle, end), and
three repeated analyses as repetitions. The sampling
fraction factor is nested within batches.
The factors are:

$\alpha$: batch (r = 3 levels)
$\beta(\alpha)$: fraction within a batch (q = 3 levels)

Thus the measurements are assumed to follow the
nested random-effects model:

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{ijk} \qquad (1)$$
$$i = 1, \ldots, r; \quad j = 1, \ldots, q; \quad k = 1, \ldots, p$$

where $\mu$ is the expected value, $\alpha_i$ is the random effect of
the $i$th batch, $\beta_{j(i)}$ is the random effect of the $j$th fraction
within the $i$th batch, and $\varepsilon_{ijk}$ is the random noise for the
$k$th measurement taken from the $j$th fraction of the $i$th
batch.

Certain assumptions have to be fulfilled for the
ANOVA calculations. Assume that $\alpha_i$, $\beta_{j(i)}$ and $\varepsilon_{ijk}$ are
independent and identically distributed variables with
normal distribution, mean 0 and variances $\sigma_A^2$, $\sigma_{B(A)}^2$
and $\sigma_e^2$, respectively.
The null hypotheses:

$H_0: \sigma_A^2 = 0$, i.e. there is no batch effect.
$H_0: \sigma_{B(A)}^2 = 0$, i.e. the sampling fractions are not
different (there is no inhomogeneity).

The theoretical ANOVA table is given as Table 2,
with the calculated and expected mean squares and the
terms used in the F-tests to check the null hypotheses.

Results of ANOVA calculations 

The homoscedasticity and normality requirements were
checked and found to be satisfied. The analysis of variance
results are shown in Table 3. There is no significant
difference between batches, but the inhomogeneity is
significant at the 0.05 level.
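
As a cross-check, the mean squares and F ratios of Table 3 can be recomputed directly from the Table 1 data. The following is a minimal sketch in plain Python, not the Statistica calculation used by the authors; the variable names are illustrative assumptions, and the sums of squares follow the balanced nested-design formulas of Table 2.

```python
from statistics import mean

# Table 1 data: data[batch][fraction] holds the three repeated analyses [mg]
data = [
    [[2.60, 2.59, 2.60], [2.62, 2.60, 2.62], [2.57, 2.57, 2.58]],  # batch 1
    [[2.58, 2.57, 2.56], [2.58, 2.58, 2.58], [2.59, 2.59, 2.57]],  # batch 2
    [[2.55, 2.56, 2.58], [2.56, 2.60, 2.57], [2.57, 2.56, 2.57]],  # batch 3
]
r, q, p = len(data), len(data[0]), len(data[0][0])

# Averages over repetitions, over fractions, and the grand average
# (the design is balanced, so averaging the averages is exact)
cell_means = [[mean(cell) for cell in batch] for batch in data]
batch_means = [mean(row) for row in cell_means]
grand_mean = mean(batch_means)

# Sums of squares for batch, fraction within batch, and error
ss_batch = q * p * sum((m - grand_mean) ** 2 for m in batch_means)
ss_fraction = p * sum(
    (cell_means[i][j] - batch_means[i]) ** 2
    for i in range(r) for j in range(q)
)
ss_error = sum(
    (y - cell_means[i][j]) ** 2
    for i in range(r) for j in range(q) for y in data[i][j]
)

# Mean squares with their degrees of freedom
ms_batch = ss_batch / (r - 1)
ms_fraction = ss_fraction / (r * (q - 1))
ms_error = ss_error / (r * q * (p - 1))

# Batch is tested against fraction, fraction against error
f_batch = ms_batch / ms_fraction
f_fraction = ms_fraction / ms_error

print(f"grand mean = {grand_mean:.3f}")
print(f"MS: {ms_batch:.6f} {ms_fraction:.6f} {ms_error:.6f}")
print(f"F0: {f_batch:.3f} {f_fraction:.3f}")
```

Because batch is a random factor with the fraction nested in it, the batch mean square is divided by the fraction mean square and not by the error mean square.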

In the F test the batch mean square is compared
with the mean square of the sampling fraction, so a large
value of the latter may mask an otherwise important
effect of batches. This was checked by calculating the
probability of the error of the second kind ($\beta$) for a fixed
probability of the error of the first kind, $\alpha = 0.05$.

The alternative hypothesis considered for the
calculation is that the variance equals its point
estimate:

$$H_1: \sigma_A^2 = \hat\sigma_A^2 \qquad (2)$$

This means that the question is the probability of not
detecting a variance of the size actually estimated
($\hat\sigma_A^2 = 1.13 \cdot 10^{-4}$, see later).
The probability of not detecting it is:

$$\beta = P\left(F_0 < F_\alpha \mid H_1\right) = P\left(F < F_\alpha \, \frac{p\sigma_{B(A)}^2 + \sigma_e^2}{qp\sigma_A^2 + p\sigma_{B(A)}^2 + \sigma_e^2}\right) \qquad (3)$$

The degrees of freedom for calculating the $F_\alpha$ critical
value are $\nu_{numerator} = \nu_A = 2$ and $\nu_{denominator} = \nu_B = 6$. The
critical value itself is $F_{0.05} = 5.14$. With the expected mean
squares replaced by the calculated ones, the probability of
the error of the second kind is:

$$\beta = P\left(F < 5.14 \cdot \frac{5 \cdot 10^{-4}}{1.515 \cdot 10^{-3}}\right) = P(F < 1.696) = 0.74$$

The chance that the difference between batches remains
unobserved is $\beta = 0.74$ with $\alpha = 0.05$. This risk is very
high, thus it is advisable to keep the batch effect in the
model instead of neglecting it.
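
The value of β can be reproduced with a short calculation. This sketch is an illustration, not the authors' procedure: it relies on the fact that for 2 numerator degrees of freedom the F distribution function has a closed form, so no statistical library is needed. The function names are hypothetical.

```python
def f_cdf_two_num_df(x, den_df):
    """P(F < x) for an F distribution with 2 numerator d.f.

    With 2 numerator degrees of freedom the upper tail is
    (1 + 2x/den_df) ** (-den_df / 2).
    """
    return 1.0 - (1.0 + 2.0 * x / den_df) ** (-den_df / 2.0)


def f_crit_two_num_df(alpha, den_df):
    """Upper critical value F_alpha, obtained by inverting the tail formula."""
    return (den_df / 2.0) * (alpha ** (-2.0 / den_df) - 1.0)


ms_batch = 1.515e-3     # batch mean square (Table 3)
ms_fraction = 5.00e-4   # sampling fraction mean square (Table 3)
alpha = 0.05
den_df = 6              # degrees of freedom of the fraction mean square

f_crit = f_crit_two_num_df(alpha, den_df)

# Under the alternative hypothesis the observed ratio is distributed as
# (E[MS_A] / E[MS_B(A)]) * F; the expectations are replaced by the
# calculated mean squares, as in the numerical evaluation above.
scaled_limit = f_crit * ms_fraction / ms_batch
beta = f_cdf_two_num_df(scaled_limit, den_df)

print(f"F_crit = {f_crit:.2f}, limit = {scaled_limit:.3f}, beta = {beta:.2f}")
```

The closed form only holds for 2 numerator degrees of freedom; a design with more than three batches would need a general F distribution routine.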



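Table 2 The theoretical ANOVA table for the two-way nested random-effects model

$$
\begin{array}{llclll}
\text{Effect} & \text{Sum of Squares} & \text{df} & \text{Mean Squares} & \text{Expected MS} & F_0 \\
\hline
A & S_A = qp \sum_i \left(\bar y_{i..} - \bar y_{...}\right)^2 & r-1 & s_A^2 = \dfrac{S_A}{r-1} & qp\sigma_A^2 + p\sigma_{B(A)}^2 + \sigma_e^2 & \dfrac{s_A^2}{s_{B(A)}^2} \\
B(A) & S_{B(A)} = p \sum_i \sum_j \left(\bar y_{ij.} - \bar y_{i..}\right)^2 & r(q-1) & s_{B(A)}^2 = \dfrac{S_{B(A)}}{r(q-1)} & p\sigma_{B(A)}^2 + \sigma_e^2 & \dfrac{s_{B(A)}^2}{s_R^2} \\
\text{Error} & S_R = \sum_i \sum_j \sum_k \left(y_{ijk} - \bar y_{ij.}\right)^2 & rq(p-1) & s_R^2 = \dfrac{S_R}{rq(p-1)} & \sigma_e^2 &
\end{array}
$$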

Table 3 The ANOVA table: numerical evaluation

Effect                    df   Mean Squares   Expected MS                                     F0      p
A: batch                   2   0.001515       $9\sigma_A^2 + 3\sigma_{B(A)}^2 + \sigma_e^2$   3.030   0.123
B(A): sampling fraction    6   0.000500       $3\sigma_{B(A)}^2 + \sigma_e^2$                 3.970   0.011
Error                     18   0.000126       $\sigma_e^2$

It is important to estimate the variance components
($\sigma_A^2$, $\sigma_{B(A)}^2$ and $\sigma_e^2$) in order to split the variance of the
process into its different parts. The usual way is the method
of moments (ANOVA method), where the estimates
are obtained by equating the calculated mean squares to
their expected values in Table 2:

$$\hat\sigma_A^2 = \frac{s_A^2 - s_{B(A)}^2}{qp} = 1.13 \cdot 10^{-4} \qquad (4.a)$$

$$\hat\sigma_{B(A)}^2 = \frac{s_{B(A)}^2 - s_R^2}{p} = 1.25 \cdot 10^{-4} \qquad (4.b)$$

$$\hat\sigma_e^2 = s_R^2 = 1.26 \cdot 10^{-4} \qquad (4.c)$$

The estimated variances are clearly of the same
order of magnitude, thus neglecting the between-batch
variation is not justified.

Computation of the content range relevant for the customer

Model

Two questions arise:

• What is the range for the active ingredient content
of the tablet purchased by the customer at a
pharmacy?

• What is the range in which the customer would find
the content when analysing a tablet?

In the first case (range for the true content) the error
of the analysis does not affect the result; this is achieved
by assuming an infinite number of repetitions ($p' \to \infty$).

The interval in which the customer would measure the
active ingredient content of a tablet with 95% probability
depends on the precision of her own measurement
system and on the number of repetitions in the chemical
analysis.

The statistical treatment is common to the two
cases. Student's t distribution is used to calculate the
range for the content on the customer's side.
A deviation variable (d) is introduced:

$$d = \bar y_{.} - \bar y_{...} \qquad (5)$$

where $\bar y_{.}$ is the average value measured by the
customer and $\bar y_{...}$ is the grand average measured by the
manufacturer (calculated from Table 1, $\bar y_{...} = 2.580$).

The expected value of this deviation is $E(d) = 0$.
Its variance is a sum of two terms:

$$\mathrm{Var}(d) = \mathrm{Var}(\bar y_{.}) + \mathrm{Var}(\bar y_{...}) \qquad (6)$$

The two variances are added because the error of
the measurements by the manufacturer is independent
of that at the customer. These variances are expressed
in terms of the variance components:

$$\mathrm{Var}(\bar y_{.}) = \sigma_A^2 + \sigma_{B(A)}^2 + \frac{{\sigma'_e}^2}{p'} \qquad (7)$$

where ${\sigma'_e}^2$ is the variance of the measurement error
at the customer and $p'$ is the customer's number
of repetitions, and

$$\mathrm{Var}(\bar y_{...}) = \frac{1}{r}\sigma_A^2 + \frac{1}{rq}\sigma_{B(A)}^2 + \frac{\sigma_e^2}{rqp} \qquad (8)$$

Table 4 The dependence of customer's 95% range on the number of repetitions

p'          s_d^2        t(0.975, S)   95% interval (S)      width (S)   t(0.975, w.a.)   95% interval (w.a.)   width (w.a.)
1           4.20·10^-4   2.413         2.531 < y < 2.630     0.099       3.370            2.511 < y < 2.649     0.138
3           3.36·10^-4   2.742         2.530 < ȳ. < 2.631    0.100       3.688            2.513 < ȳ. < 2.648    0.135
5           3.19·10^-4   2.858         2.529 < ȳ. < 2.631    0.102       3.772            2.513 < ȳ. < 2.648    0.135
10          3.06·10^-4   2.966         2.529 < ȳ. < 2.632    0.104       3.841            2.513 < ȳ. < 2.648    0.134
20          3.00·10^-4   3.029         2.528 < ȳ. < 2.633    0.105       3.877            2.513 < ȳ. < 2.648    0.134
∞           2.94·10^-4   3.098         2.527 < y < 2.634     0.106       3.915            2.513 < y < 2.647     0.134
prescribed                             2.375 < y < 2.625     0.250

Subscripts: S means calculated with the Satterthwaite method, w.a. means using the weighted average method

It may well be assumed that the analytical method
and the measurement apparatus of the customer are
analogous to the system used by the analytical
laboratory of the manufacturer, thus the uncertainty of
their measurements is equal (${\sigma'_e}^2 = \sigma_e^2$). The number of
repetitions may not be the same, however. Upon
substitution, the resulting variance of the deviation
variable d is

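$$\mathrm{Var}(d) = \left(1 + \frac{1}{r}\right)\sigma_A^2 + \left(1 + \frac{1}{rq}\right)\sigma_{B(A)}^2 + \left(\frac{1}{p'} + \frac{1}{rqp}\right)\sigma_e^2 \qquad (9)$$

As the variance components ($\sigma_A^2$, $\sigma_{B(A)}^2$, $\sigma_e^2$) are
not known, they are estimated from the experimental
data:

$$s_d^2 = \left(1 + \frac{1}{r}\right)\hat\sigma_A^2 + \left(1 + \frac{1}{rq}\right)\hat\sigma_{B(A)}^2 + \left(\frac{1}{p'} + \frac{1}{rqp}\right)\hat\sigma_e^2 \qquad (10)$$

Upon substituting Eqs. (4.a)-(4.c) for the estimates
of the variance components the following expression is
obtained:

$$s_d^2 = \frac{r+1}{rqp}\, s_A^2 + \frac{q-1}{qp}\, s_{B(A)}^2 + \left(-\frac{1}{p} + \frac{1}{p'}\right) s_R^2 \qquad (11)$$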
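
The algebra leading from Eq. (10) to Eq. (11) can be checked numerically. The sketch below evaluates $s_d^2$ once from the estimated variance components and once directly from the mean squares, assuming the rounded mean squares of Table 3 and the design r = q = p = 3. The function names are illustrative assumptions.

```python
r, q, p = 3, 3, 3                               # batches, fractions, repetitions
ms_a, ms_b, ms_r = 1.515e-3, 5.00e-4, 1.26e-4   # mean squares from Table 3

# Method-of-moments estimates, Eqs. (4.a)-(4.c)
var_a = (ms_a - ms_b) / (q * p)
var_b = (ms_b - ms_r) / p
var_e = ms_r


def s_d2_from_components(p_cust):
    """Eq. (10): variance of d from the estimated variance components."""
    return ((1 + 1 / r) * var_a
            + (1 + 1 / (r * q)) * var_b
            + (1 / p_cust + 1 / (r * q * p)) * var_e)


def s_d2_from_mean_squares(p_cust):
    """Eq. (11): the same quantity as a linear combination of mean squares."""
    return ((r + 1) / (r * q * p) * ms_a
            + (q - 1) / (q * p) * ms_b
            + (1 / p_cust - 1 / p) * ms_r)


for p_cust in (1, 3, 5, 10, 20):
    print(p_cust,
          f"{s_d2_from_components(p_cust):.3e}",
          f"{s_d2_from_mean_squares(p_cust):.3e}")
```

The coefficient of the error mean square is negative whenever the customer repeats the analysis more often than the manufacturer (p' > p), which matters for the Satterthwaite approximation.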

The interval in which the customer would find the
average active ingredient content of a tablet with e.g. 95%
probability is calculated as:

$$\bar y_{...} - t_{1-\alpha/2,\nu}\, s_d < \bar y_{.} < \bar y_{...} + t_{1-\alpha/2,\nu}\, s_d \qquad (12)$$

The main difficulty of the further calculation lies in
the fact that the estimator of the resulting variance, being a
linear combination of mean squares, does not follow a
$\chi^2 \sigma_d^2 / \nu$ distribution, thus the range above is only
approximate. According to the Satterthwaite
approximation [1] the linear combination $\sum_i a_i s_i^2$ of
mean squares is treated as if it were distributed as $\chi^2 \sigma_d^2 / \nu$, with
degrees of freedom expressed as:

$$\nu = \frac{\left(\sum_i a_i s_i^2\right)^2}{\sum_i \dfrac{\left(a_i s_i^2\right)^2}{\nu_i}} \qquad (13)$$

where $s_i^2$ is the $i$th mean square, $\nu_i$ its degrees of
freedom, and $a_i$ is the coefficient of the $i$th mean square in
the linear combination.

Another method [2] suggests calculating $t_{1-\alpha/2,\nu}$ as a
weighted average of the appropriate t critical values
related to the calculated mean squares and degrees of
freedom:

$$t_{1-\alpha/2,\nu} = \frac{\sum_i a_i s_i^2 \, t_{1-\alpha/2,\nu_i}}{\sum_i a_i s_i^2} \qquad (14)$$

Results and discussion 

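Eq. (11) gives the following expression for $s_d^2$ if r = 3,
q = 3 and p = 3 are substituted for the number of batches,
the number of sampling fractions and the number of repetitions,
respectively:

$$s_d^2 = \frac{4}{27}\, s_A^2 + \frac{2}{9}\, s_{B(A)}^2 + \left(-\frac{1}{3} + \frac{1}{p'}\right) s_R^2 \qquad (11.a)$$

The degrees of freedom using Satterthwaite's
approximation are given as

$$\nu = \frac{\left(s_d^2\right)^2}{\dfrac{\left(\frac{4}{27} s_A^2\right)^2}{2} + \dfrac{\left(\frac{2}{9} s_{B(A)}^2\right)^2}{6} + \dfrac{\left[\left(-\frac{1}{3} + \frac{1}{p'}\right) s_R^2\right]^2}{18}} \qquad (13.a)$$

The other method, averaging the critical t values,
takes the following form:

$$t_{0.975,\nu} = \frac{\frac{4}{27}\, s_A^2\, t_{0.975,2} + \frac{2}{9}\, s_{B(A)}^2\, t_{0.975,6} + \left(-\frac{1}{3} + \frac{1}{p'}\right) s_R^2\, t_{0.975,18}}{s_d^2} \qquad (14.a)$$

Several p' values were taken for the calculations,
including $p' \to \infty$; the latter stands for the case of no
analysis on the customer's side, giving the range for the
true content.
The results for the 95% intervals are given in Table 4.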
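
As an illustration of how a row of Table 4 arises, the sketch below evaluates $s_d^2$ and the weighted average of critical t values for a given p', then forms the 95% interval around the grand average. It assumes the rounded mean squares of Table 3, the grand average 2.580, and two-sided 95% critical t values for 2, 6 and 18 degrees of freedom taken from standard t tables (not from the paper). All names are illustrative.

```python
import math

MEAN_SQUARES = {"A": 1.515e-3, "B(A)": 5.00e-4, "R": 1.26e-4}  # Table 3
DEG_FREEDOM = {"A": 2, "B(A)": 6, "R": 18}
T_975 = {2: 4.303, 6: 2.447, 18: 2.101}   # assumed: standard t table values
GRAND_MEAN = 2.580                         # manufacturer's grand average [mg]


def table4_row_weighted(p_cust):
    """Return s_d^2, the weighted-average t value and the 95% interval."""
    # Coefficients of the mean squares for r = q = p = 3;
    # 1 / math.inf evaluates to 0.0, which covers the true-content case.
    coeff = {"A": 4 / 27, "B(A)": 2 / 9, "R": 1 / p_cust - 1 / 3}
    terms = {k: coeff[k] * MEAN_SQUARES[k] for k in MEAN_SQUARES}
    s_d2 = sum(terms.values())
    # Critical t values averaged with weights a_i * s_i^2
    t_wa = sum(terms[k] * T_975[DEG_FREEDOM[k]] for k in terms) / s_d2
    half_width = t_wa * math.sqrt(s_d2)
    return s_d2, t_wa, (GRAND_MEAN - half_width, GRAND_MEAN + half_width)


for p_cust in (1, 3, 5, 10, 20, math.inf):
    s_d2, t_wa, (low, high) = table4_row_weighted(p_cust)
    print(f"p'={p_cust}: s_d2={s_d2:.2e}, t={t_wa:.3f}, "
          f"{low:.3f} < y < {high:.3f}")
```

The Satterthwaite column of Table 4 additionally needs critical t values for non-integer degrees of freedom, which this sketch deliberately leaves out.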

It is clearly seen that neither the true content nor the
values to be obtained by the customer upon chemical
analysis at 95% probability are within the required
range: the upper limits exceed the prescribed 2.625 mg,
so there is a clear overage. At the same time the
range of uncertainty is much narrower than would be
allowed. The uncertainty here means not only the
measurement error but also batch differences and
inhomogeneity within batches, as the tablet purchased
by the customer may come from any batch and from any
sampling fraction of a batch.

What is surprising in Table 4 is that the width of the
95% range obtained using the Satterthwaite approximation
increases with p'. That means that the more precise
the measurement is due to more repetitions, the wider
the interval. The reason comes from Eq. (13.a):
on increasing the number of repetitions (p') the numerator
decreases, while the denominator is almost unchanged.
This gives smaller degrees of freedom and a larger critical
t value at larger p'. This over-compensates the
reduction of $s_d^2$ and slightly broadens the interval. This
anomaly is an error of the approximation, but the
numerical consequence is not serious.
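
The trend described above can be traced numerically. This is a minimal sketch, assuming the rounded mean squares of Table 3 and the coefficients for r = q = p = 3; it lists $s_d^2$ and the Satterthwaite degrees of freedom for the p' values of Table 4. The function name is hypothetical.

```python
import math


def satterthwaite_df(p_cust):
    """Return (s_d^2, nu) for the customer's number of repetitions p'."""
    # (a_i * s_i^2, nu_i) for batch, fraction within batch, and error
    terms = [
        (4 / 27 * 1.515e-3, 2),
        (2 / 9 * 5.00e-4, 6),
        ((1 / p_cust - 1 / 3) * 1.26e-4, 18),
    ]
    s_d2 = sum(term for term, _ in terms)
    nu = s_d2 ** 2 / sum(term ** 2 / df for term, df in terms)
    return s_d2, nu


rows = []
for p_cust in (1, 3, 5, 10, 20, math.inf):
    s_d2, nu = satterthwaite_df(p_cust)
    rows.append((p_cust, s_d2, nu))
    print(f"p'={p_cust}: s_d2={s_d2:.2e}, nu={nu:.2f}")
```

Both columns fall as p' grows: the variance estimate shrinks, but the degrees of freedom shrink as well because the batch term with only 2 degrees of freedom dominates the denominator.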

The weighted average method results in wider 
intervals. 

Conclusion 

In the pharmaceutical industry there are strict guidelines
regarding e.g. active ingredient content. The problem is
whether the prescribed interval is related to the average
measured by a laboratory near the process or to the
average measured by the "customer" (this could even be
the next laboratory). If we take the customer's
uncertainty into consideration, the interval for the
average may be evaluated. There are two methods to
construct this interval: the first one uses the
Satterthwaite-approximation, the second one calculates
the average of critical t-values weighted by mean
squares. Due to the bias of the approximation, the larger
the number of repetitions, the broader
the interval. The second method gives a broader interval
for the average. In spite of the fact that all tablets
analysed individually conform to the specifications, the
interval in which values may occur is partly outside the
specifications.

Acknowledgement 

The authors wish to express their gratitude to Dr. K.
KOLLAR-HUNEK (Department of Chemical Informatics,
Budapest University of Technology and Economics,
BUTE). The work has been supported by the Hungarian
National Research Foundation OTKA under contract
number T0033005 and by the Varga Jozsef Foundation of the
Chemical Engineering Faculty of the BUTE.

SYMBOLS

$d$                deviation variable
$F$                Fisher test statistic
$H$                hypothesis
$p$                number of repetitions in the producer's laboratory
$p'$               number of repetitions performed by the customer
$q$                number of levels of the fraction factor
$r$                number of levels of the batch factor
$s^2$              mean square
$t_{\alpha,\nu}$   critical value of the Student distribution for probability $\alpha$ and $\nu$ degrees of freedom
$y_{ijk}$          the measured value
$\bar y_{.}$       the average value measured by the customer
$\bar y_{...}$     the average value measured by the laboratory of the manufacturer

Greek symbols

$\alpha$             the probability of the error of the first kind
$\alpha_i$           the effect of the batch
$\beta$              the probability of the error of the second kind
$\beta_{j(i)}$       the effect of the fraction within the batch
$\varepsilon_{ijk}$  the random noise
$\nu$                degrees of freedom
$\mu$                the expected value
$\sigma^2$           variance component

Subscript/Superscript

$'$    related to the customer
$A$    related to the factor of the batch
$B$    related to the factor of the fraction
$d$    related to the deviation variable
$e$    related to the error
$R$    related to the error

REFERENCES 

1. SATTERTHWAITE F. E.: Biometrics Bull., 1946,
2(6), 110-114

2. PAARK K. and BURDICK R. K.: Commun. Statist.-Theory
Meth., 1998, 27(11), 2807-2825

3. BOX G. E. P., HUNTER W. G., HUNTER J. S.:
Statistics for experimenters, J. Wiley and Sons,
1978

4. LORENZEN T. J., ANDERSON V. L.: Design of
experiments. A no-name approach, Marcel Dekker,
1993

5. KEMENY S., DEAK A.: Design of experiments,
Műszaki Könyvkiadó, Budapest, 2000 (in
Hungarian)

