HUNGARIAN JOURNAL OF 
  INDUSTRY AND CHEMISTRY 

Vol. 45(1) pp. 23–27 (2017) 

hjic.mk.uni-pannon.hu  

DOI: 10.1515/hjic-2017-0005   

REPLACEMENT OF BIASED ESTIMATORS WITH UNBIASED ONES IN 
THE CASE OF STUDENT'S t-DISTRIBUTION AND GEARY’S KURTOSIS 

GERGELY TÓTH* AND PÁL SZEPESVÁRY 

Institute of Chemistry, Faculty of Science, Eötvös Loránd University, 1/A Pázmány Péter sétány, 
Budapest, H-1117, HUNGARY  

Biased estimators are still found in several historically important and currently used tools of
statistical data analysis. This paper proposes replacing them with unbiased estimators, at least in
the case of the estimator of the population standard deviation for normal distributions. By removing
the incoherence in the Student’s t-distribution caused by the biased estimator, a corrected
t-distribution may be defined. Although the quantitative results in most data analysis applications
are identical for the original and the corrected t-distributions, the use of the latter is suggested
because of its theoretical consistency. Moreover, the frequent qualitative discussion of the
t-distribution has come under much criticism, because it concerns artefacts of the biased estimator.
In the case of Geary’s kurtosis, the same correction results in an expected value of
$(2/\pi)^{1/2}$ for normally distributed data, independent of the sample size. It is believed that
by removing the sample-size-dependent bias, the applicability domain of some normality tests can be
extended to small sample sizes.

Keywords: unbiased estimator, normal distribution, Anscombe-Glynn test, Jarque-Bera test, 
Bonett-Seier test 

1. Introduction 

The Student's t-distribution [1] is one of the most widely used statistical functions in
experimental practice. It is well documented that experts active in analytical, physical and
clinical chemistry, biology, agriculture, ecology and economics, as well as in forensic science,
and even legal representatives apply this tool to formulate solid statements, e.g. to estimate
population means or to differentiate between two sample means, which are in most cases necessarily
based on a limited number of observations. Fortunately, scientists are soundly supported in this
respect by many textbooks, standards and software systems, and, last but not least, by their
training in the elements of statistics. However, it should be a moral obligation to be aware of the
evolution of this routinely used method, its principles and its often tacit assumptions. Although
not crucial in daily use, it is worthwhile to know that besides Student’s t-distribution there are
other functions that may be suitable for the determination of percentiles in the same way as the
t-distribution. They may differ mainly in terms of alternative estimators for the population mean
and population standard deviation. An attractive variant is presented in this work, emphasizing
that it is theoretically fully consistent, in contrast to the Student's t-distribution.

                                                           
*Correspondence: toth@chem.elte.hu 

The Student’s t-distribution corresponds to the ratio of a normally distributed random variable to
a chi-distributed random variable (see the references in the historical review of Zabell [2]).
Chi-distributed variables presuppose normally distributed data as well. In his definition, Gosset
[1] used an estimate of the standard deviation that is biased with respect to the population
standard deviation and even to the variance in the case of normally distributed random variables,
as was shown earlier by Helmert [3-4]. Amazingly, when Fisher proposed a transformation of Gosset’s
original z variable to $t = z\sqrt{N-1}$ [5-6], he chose an estimate of the standard deviation that
is also biased with respect to σ. A corrected tc-distribution is proposed that fulfils all
theoretical requirements and yields a more normal distribution-like shape. It is consistently based
on normal sample data and uses an unbiased estimator.

A similar correction can be applied to Geary’s kur-
tosis that is the ratio of the mean deviation to the stand-
ard deviation [7]. In this work the use of the correction 
is proposed in order to eliminate the sample-size de-
pendency of the mean Geary’s kurtosis on normally 
distributed data. 

Finally, some remarks are made on statistical tests 
based on sample-size dependent values in order to ex-
tend their applicability to small sample sizes. 



2. Theoretical Background 

The square root of the mean of the squared centred observations of the random variable Y,

$$ s_N = \sqrt{\frac{\sum_{i=1}^{N} (y_i - \bar{y})^2}{N}} \qquad (1) $$

may serve as an estimator of the population standard deviation, σ, from a sample. N is the sample
size and $\bar{y}$ denotes the sample mean. Bessel pointed out the bias of $s_N$ long ago and
proposed:

$$ s = \sqrt{\frac{\sum_{i=1}^{N} (y_i - \bar{y})^2}{N-1}}. \qquad (2) $$

 
Although $s^2$ is already an unbiased estimator of the variance, $\sigma^2$, irrespective of the
distribution of Y, the statistics $s_N$ and $s$ are both biased in terms of the standard deviation,
σ. However, there is a correction [5-6, 8-9]:

$$ c_4(N) = \left(\frac{2}{N-1}\right)^{1/2} \frac{\Gamma\left(\frac{N}{2}\right)}{\Gamma\left(\frac{N-1}{2}\right)}, \qquad (3) $$

which applies to $s$ by transforming it, in the case of normally distributed variables, into an
unbiased statistic:

$$ s_c = \frac{s}{c_4(N)}. \qquad (4) $$

$c_4(N)$ follows from Helmert’s papers or Cochran's theorem [10]. For normally distributed Y,
$\sqrt{N-1}\,s/\sigma$ exhibits a chi distribution with N-1 degrees of freedom. $c_4(N)$ is the
expected value of $s/\sigma$. There is a $c_2(N)$ value as well, which is equal to the expected
value of $s_N/\sigma$. The correcting effect of $c_4(N)$ may be considerable for low values of N
(Table 1).

 
Table 1. Selected c4(N) corrections

N     c4(N)
2     0.7979
3     0.8862
4     0.9213
5     0.9400
10    0.9727
20    0.9869
30    0.9914

 
Figure 1. Student’s t-distribution compared with the 
standard normal distribution 

 
The correction term is frequently used in Statistical Process Control (SPC) to define ±3σ intervals
and process standard deviations that are determined for samples of different sizes.
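As an illustration of Eqs.(3) and (4), the following minimal Python sketch evaluates c4(N) and checks by simulation that s/c4(N) is close to σ on average for standard normal samples. It is not the authors' code; the function names, the number of trials and the random seed are assumptions chosen for this example only.

```python
import math
import random
import statistics


def c4(n: int) -> float:
    """Correction factor of Eq. (3): expected value of s/sigma for normal data."""
    if n < 2:
        raise ValueError("c4(N) is defined for N >= 2")
    # Work with log-gamma values so that large N does not overflow.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def mean_corrected_sd(n: int, trials: int = 20000, seed: int = 1) -> float:
    """Monte Carlo mean of s_c = s / c4(n) for standard normal samples (sigma = 1)."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # statistics.stdev uses the N-1 divisor, i.e. s of Eq. (2).
        total += statistics.stdev(sample) / c4(n)
    return total / trials


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 10, 20, 30):
        print(f"{n:3d}  {c4(n):.4f}")
    print("mean of s_c for N = 3:", mean_corrected_sd(3))
```

The first loop should reproduce the entries of Table 1 to four decimals; the last line gives a simulated mean of s_c that should lie close to σ = 1 even for N = 3.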

3. Results and Discussion 

3.1. Discussion of the density functions of the 
Student’s t-distribution 

Let the observations of Y be independent N(μ, σ²) variables. The

$$ t = \frac{\bar{y} - \mu}{s/\sqrt{N}} \qquad (5) $$

statistic exhibits the usual Student’s t-distribution with N-1 degrees of freedom. The density
function (Eq.(6)) can be derived as that of the quotient of normal and chi-distributed random
variables. The independence of the numerator and denominator can be shown, and it is also proved by
a series of theorems that t defined in Eq.(5) follows a Student’s t-distribution with N-1 degrees
of freedom (see the references in [2]):

$$ f(t) = \frac{\Gamma\left(\frac{N}{2}\right)}{\sqrt{(N-1)\pi}\;\Gamma\left(\frac{N-1}{2}\right)} \left(1 + \frac{t^2}{N-1}\right)^{-N/2} \qquad (6) $$

The distribution sketched in Fig.1 is widely known and needs no comment except for its flaw: it is
based on the biased statistic (Eq.(2)) for σ.

Table 2. Sample-size dependence of Geary’s and Pearson’s kurtosis. The c4(N)-corrected Geary’s and
the size-corrected Pearson’s kurtosis [11] provided the ∞ limit values for all sample sizes within
the statistical uncertainty of the $10^5$ random samples.

N      Geary’s kurtosis   Pearson’s kurtosis
3      0.9004             1.5000
4      0.8659             1.8005
5      0.8489             1.9996
6      0.8385             2.1437
7      0.8319             2.2501
8      0.8267             2.3340
9      0.8233             2.4006
10     0.8200             2.4557
20     0.8085             2.7141
50     0.8021             2.8820
100    0.7999             2.9408
∞      0.7979             3.0000

Replacing the standard deviation of the sample, s, in Eq.(5) with its unbiased equivalent, $s_c$,
corrected by c4(N), results in a new statistic:

$$ t_c = \frac{\bar{y} - \mu}{s_c/\sqrt{N}}, \qquad (7) $$

and in the density function:

$$ f(t_c) = \frac{1}{c_4(N)} \frac{\Gamma\left(\frac{N}{2}\right)}{\sqrt{(N-1)\pi}\;\Gamma\left(\frac{N-1}{2}\right)} \left(1 + \frac{t_c^2}{c_4^2(N)\,(N-1)}\right)^{-N/2} = \frac{1}{\sqrt{2\pi}} \left(1 + \frac{t_c^2}{c_4^2(N)\,(N-1)}\right)^{-N/2}, \qquad (8) $$

which is shown in Fig.2. As expected, the corrected Student’s t-distribution (Student’s
tc-distribution) is more sharply peaked and exhibits fatter tails than the standard normal
distribution, but, rather oddly, at its maximum, i.e. at $t_c = 0$, the value $f(t_c)$ does not
depend on N and equals $1/\sqrt{2\pi}$, the same as for the standard normal distribution. The
function $f(t_c)$ and the Student’s t-distribution clearly differ in this respect.
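The behaviour at the maximum can be checked numerically. The sketch below evaluates the Student density of Eq.(6) and the corrected density obtained from it by the change of variables t_c = c4(N)·t, which follows from Eqs.(4), (5) and (7). The function names are assumptions made for this illustration.

```python
import math


def c4(n: int) -> float:
    """Correction factor of Eq. (3)."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def student_pdf(t: float, n: int) -> float:
    """Eq. (6): Student density with N-1 degrees of freedom."""
    log_norm = (math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
                - 0.5 * math.log((n - 1) * math.pi))
    return math.exp(log_norm - (n / 2) * math.log1p(t * t / (n - 1)))


def corrected_pdf(tc: float, n: int) -> float:
    """Density of t_c = c4(N) * t, i.e. f(t_c) = f_t(t_c / c4) / c4."""
    c = c4(n)
    return student_pdf(tc / c, n) / c


if __name__ == "__main__":
    normal_peak = 1.0 / math.sqrt(2.0 * math.pi)
    for n in (2, 3, 5, 10, 30):
        print(n, student_pdf(0.0, n), corrected_pdf(0.0, n), normal_peak)
```

For every N the corrected density at zero coincides with the peak of the standard normal density, whereas the peak of the original Student density depends on N and lies below it.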

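The obvious differences between the Student and modified Student distributions do not complicate
their daily usage. The confidence intervals calculated using the s and t or the $s_c$ and $t_c$
estimators and distributions, respectively,

$$ \bar{y} \pm \frac{t_c^{\alpha/2,(N-1)}\, s_c}{\sqrt{N}} \quad \text{and} \quad \bar{y} \pm \frac{t^{\alpha/2,(N-1)}\, s}{\sqrt{N}} \qquad (9) $$

do not differ, because

$$ \frac{t_c^{\mathrm{crit}(\alpha/2,N-1)}\, s_c}{\sqrt{N}} = \frac{t^{\mathrm{crit}(\alpha/2,N-1)}\, c_4(N)\, \dfrac{s}{c_4(N)}}{\sqrt{N}} = \frac{t^{\mathrm{crit}(\alpha/2,N-1)}\, s}{\sqrt{N}}. \qquad (10) $$

Evidently, corresponding estimators and functions should be used.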
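The cancellation in Eq.(10) can be illustrated with a few lines of Python. The five data values below are hypothetical, and the critical value 2.776 is the commonly tabulated two-sided 95% value of the Student's t-distribution with 4 degrees of freedom; both are assumptions of this sketch rather than values taken from the paper.

```python
import math
import statistics


def c4(n: int) -> float:
    """Correction factor of Eq. (3)."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def half_widths(sample, t_crit):
    """Return the half-widths of both confidence intervals in Eq. (9)."""
    n = len(sample)
    s = statistics.stdev(sample)   # Eq. (2), N-1 divisor
    c = c4(n)
    s_c = s / c                    # Eq. (4), unbiased for normal data
    tc_crit = c * t_crit           # critical value of t_c, since t_c = c4(N) * t
    classic = t_crit * s / math.sqrt(n)
    corrected = tc_crit * s_c / math.sqrt(n)
    return classic, corrected


if __name__ == "__main__":
    data = [10.1, 9.8, 10.3, 10.0, 9.9]   # hypothetical measurements
    print(half_widths(data, t_crit=2.776))
```

Both half-widths agree up to floating-point rounding, which is the practical content of Eq.(10): the factor c4(N) moves from the estimator to the critical value and cancels.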

3.2. The effect of the correction to Geary’s 
kurtosis 

Geary’s kurtosis [7], $w_N$, is the ratio of the mean absolute deviation (MAD) to the standard
deviation (Eq.(11)). It is an alternative to Pearson’s kurtosis, which is based on the fourth
moment. The expected value of Geary’s kurtosis depends on the sample size even for normally
distributed data [7]. The mean values calculated from $10^5$ random samples from standard normal
distributions are shown in Table 2.

$$ w_N = \frac{\frac{1}{N}\sum_{i=1}^{N} \left| y_i - \bar{y} \right|}{s_N} \qquad (11) $$

If the c4(N) corrections (Table 1) are applied in the calculation of the ratio as
$w_{N,\mathrm{corr}} = w_N\, c_4(N)$, the expected value of the kurtosis is $(2/\pi)^{1/2}$ for all
sample sizes. This means that the platykurtic or leptokurtic character of a sample can be
identified without looking up the size-dependent dividing value in tables.
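A small simulation in the spirit of Table 2 may make the effect of the correction concrete. The sketch below computes Geary's kurtosis as in Eq.(11), with s_N of Eq.(1) in the denominator, and averages it over standard normal samples. The number of trials (fewer than the 10^5 used for Table 2), the seed and the function names are assumptions of this example.

```python
import math
import random


def c4(n: int) -> float:
    """Correction factor of Eq. (3)."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def geary_kurtosis(sample) -> float:
    """Eq. (11): mean absolute deviation divided by s_N (divisor N)."""
    n = len(sample)
    mean = sum(sample) / n
    mad = sum(abs(y - mean) for y in sample) / n
    s_n = math.sqrt(sum((y - mean) ** 2 for y in sample) / n)
    return mad / s_n


def mean_geary(n: int, trials: int = 20000, seed: int = 7) -> float:
    """Average Geary's kurtosis over simulated standard normal samples."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    total = 0.0
    for _ in range(trials):
        total += geary_kurtosis([rng.gauss(0.0, 1.0) for _ in range(n)])
    return total / trials


if __name__ == "__main__":
    limit = math.sqrt(2.0 / math.pi)
    for n in (3, 5, 10, 20):
        w = mean_geary(n)
        print(n, round(w, 4), round(w * c4(n), 4), round(limit, 4))
```

The uncorrected means should decrease with N roughly as in Table 2, while the corrected means w_N·c4(N) should stay near (2/π)^{1/2} for all sample sizes, within the simulation noise.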

3.3. Sample-size bias in statistical tests 

Geary’s kurtosis and its transformed values are used in normality tests due to their enhanced
sensitivity to some leptokurtic deviations from normality [12]. Contrary to the case of the
Student’s t-distribution, where the correction has no effect on the t-test, here the effect of the
correction does not cancel out. Generally, the size dependence decreases the performance of the
tests for small sample sizes. Users interpret this feature as a recommendation that the tests are
unsuitable for small sample sizes. The same neglect of sample-size dependence occurs in tests where
Pearson’s kurtosis is used. The calculated mean value of Pearson’s kurtosis is shown in Table 2;
its convergence to the theoretical value of 3 is rather slow. It should be noted here that the
sample-size unbiased estimator of kurtosis can be easily calculated [11].
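The sample-size dependence of Pearson's kurtosis and its removal can be sketched as follows. The code assumes that the size-corrected estimator meant here is the excess-kurtosis statistic G2 described by Joanes and Gill [11], which is unbiased for normal data; the trial count, seed and function names are likewise assumptions of this illustration.

```python
import random


def pearson_kurtosis(sample) -> float:
    """b2 = m4 / m2^2, with central moments computed using the divisor N."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((y - mean) ** 2 for y in sample) / n
    m4 = sum((y - mean) ** 4 for y in sample) / n
    return m4 / (m2 * m2)


def corrected_excess_kurtosis(sample) -> float:
    """G2 of Joanes and Gill [11] (assumed estimator); requires N >= 4."""
    n = len(sample)
    g2 = pearson_kurtosis(sample) - 3.0
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)


def mean_kurtoses(n: int, trials: int = 20000, seed: int = 11):
    """Average b2 and G2 over simulated standard normal samples."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    sum_b2 = 0.0
    sum_g2 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sum_b2 += pearson_kurtosis(sample)
        sum_g2 += corrected_excess_kurtosis(sample)
    return sum_b2 / trials, sum_g2 / trials


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        print(n, mean_kurtoses(n))
```

For normal data the mean of b2 is expected to follow 3(N-1)/(N+1), in line with the Pearson column of Table 2, while the mean of G2 should scatter around zero for every N.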

 
Figure 2. Corrected Student’s t-distribution compared 
with the standard normal distribution  

 

Table 3. Sample-size dependence of five normality tests based on unbiased or sample-size-dependent
biased estimators. The numbers show the ratio of the rejected null hypotheses to all trials at a
significance level of 0.05 from $10^5$ random samples. A hyphen marks a combination for which no
value is given.

N      Shapiro-Wilk   D’Agostino   Anscombe-Glynn   Bonett-Seier   Jarque-Bera
4      0.0502         -            -                -              -
5      0.0521         -            -                0.0373         -
6      0.0477         -            0.0189           0.0087         -
7      0.0492         -            0.0343           0.0282         -
8      0.0505         -            0.0365           0.0348         -
9      0.0507         0.0538       0.0369           0.0390         0.0024
10     0.0498         0.0528       0.0394           0.0417         0.0058
20     0.0497         0.0525       0.0406           0.0421         0.0089
50     0.0498         0.0500       0.0466           0.0470         0.0241
100    0.0499         0.0493       0.0533           0.0490         0.0368

 
In Table 3 the type-I error of some normality tests calculated on $10^5$ standard normal samples is
shown. The ‘moments’ package in R was used in the calculation [13]. Table 3 contains the fraction
of samples for which the H0 hypothesis of normality was rejected at the significance level of 0.05.
The Shapiro-Wilk method [14] uses the ratio of two unbiased estimators of the variance, and is
suitable for all sample sizes. The skewness test of D’Agostino [15] slightly overestimates the
number of rejected cases. It is based on the normalized third-order central moment definition of
skewness, where the expected value is estimated without bias. The Anscombe-Glynn test [16] applies
Pearson’s kurtosis without size correction, i.e. an estimate that is biased for normally
distributed data. The Bonett-Seier test [12] shown here uses Geary’s kurtosis, and the Jarque-Bera
test [17] combines skewness and Pearson’s kurtosis. The performance of the last three tests was
rather weak for small sample sizes, one cause of which might be the lack of correction for small
sample sizes even for normally distributed data. These tests are usually only recommended for
medium and large sample sizes. The correction should extend the applicability domain to small
sample sizes. Of course, the type-I error for normally distributed data is only one narrow aspect
of a test; a detailed analysis should be performed to investigate the effect of the correction on
many distributions, as e.g. in ref. [12].
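The kind of type-I error estimate reported in Table 3 can be sketched for one of the tests. The code below is a plain Python re-implementation of the textbook Jarque-Bera statistic with its asymptotic chi-square (2 degrees of freedom) critical value. It is not the routine of the R ‘moments’ package used for Table 3, and the trial count, seed and function names are assumptions, so the numbers need not match the table exactly.

```python
import math
import random


def jarque_bera(sample) -> float:
    """JB = N/6 * (S^2 + (K - 3)^2 / 4) with moment-based skewness S and kurtosis K."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((y - mean) ** 2 for y in sample) / n
    m3 = sum((y - mean) ** 3 for y in sample) / n
    m4 = sum((y - mean) ** 4 for y in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def rejection_rate(n: int, alpha: float = 0.05, trials: int = 20000, seed: int = 3) -> float:
    """Fraction of standard normal samples for which normality is rejected."""
    # The chi-square distribution with 2 degrees of freedom has the
    # survival function exp(-x/2), so its upper alpha quantile is -2 ln(alpha).
    critical = -2.0 * math.log(alpha)
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    rejected = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if jarque_bera(sample) > critical:
            rejected += 1
    return rejected / trials


if __name__ == "__main__":
    for n in (10, 20, 50, 100):
        print(n, rejection_rate(n))
```

With an unbiased, size-independent statistic the rejection rate would be expected near the nominal 0.05; for small N the uncorrected test should reject clearly less often, which is the behaviour discussed above.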

4. Conclusion  

Nowadays, data are evaluated by computers and biased 
estimators can be replaced by unbiased ones, even if 
their calculation schemes are complex.  

It has been shown that, in terms of Student’s t-distributions, two functions are available for
determining the confidence intervals of statistics, and they are completely equivalent as far as
practical applicability is concerned. They can, however, be distinguished theoretically. The
assertion that only the unbiased estimator should be recognized as the correct one implies the use
of the corrected Student’s t-distribution, f(tc). In that case the known shape of the Student’s
t-distribution may be labelled as an artefact and the usual application of the Student’s
t-distribution as a production of “correct numbers by an incoherent theory”.

In the case of Geary’s kurtosis, the correction removes the sample-size dependence from the
expected value. With this change, distinguishing platykurtic or leptokurtic features of the sample
is simpler than with the original version of Geary’s kurtosis. Furthermore, subtracting
$(2/\pi)^{1/2}$ results in a number that can be interpreted in a similar way to the excess kurtosis
obtained by subtracting 3 from Pearson’s kurtosis.

As a further study, the use of unbiased/sample-
size-dependent corrections to extend the applicability 
domain to small sample sizes in the case of normality 
tests is recommended. It is believed that the use of bi-
ased estimators was acceptable before the age of com-
puters and a systematic change to unbiased ones might 
be necessary in terms of statistics and standards with 
regard to industrial processes. 

Acknowledgement  

The authors are sincerely grateful for the valuable 
comments of Prof. S. Kemény and other participants 
during the scientific discussion after the presentation 
concerning Student’s t-distribution at the SCAC 2010 
conference in Budapest in September 2010. 

REFERENCES  

[1] Student (Gosset, W.S.): The Probable Error of a 
Mean, Biometrika 1908 6(1), 1–25 DOI: 
10.2307/2331554 

[2] Zabell, S.L.: On Student’s 1908 Article “The Prob-
able Error of a Mean”, J. Am. Stat. Assoc. 2008 
103(481), 1–7 DOI: 10.1198/016214508000000049 

[3] Helmert, F.R.: Über die Wahrscheinlichkeit der 
Potenzsummen der Beobachtungsfehler und über 
einige damit in Zusammenhang stehende Fragen, 
Z. Math. Phys. 1876 21, 192–218  

[4] Helmert, F.R.: Die Genauigkeit der Formel von
Peters zur Berechnung des wahrscheinlichen
Beobachtungsfehlers directer Beobachtungen gleicher
Genauigkeit, Astronomische Nachrichten 1876
88(8-9), 113–131 DOI: 10.1002/asna.18760880802

[5] Fisher, R.A.: Applications of “Student’s” distribution,
Metron 1925 5, 90–104

[6] Fisher, R.A.: Statistical methods for research
workers, (Oliver & Boyd, Edinburgh and London)
1925.

[7] Geary, R.C.: Moments of the ratio of the mean de-
viation to the standard deviation for normal sam-
ples, Biometrika 1936 28(3/4), 295–307 DOI: 
10.2307/2333953 

[8] Grubbs, F.E.; Weaver, C.L.: The best unbiased es-
timate of population standard deviation based on 
group ranges, J. Am. Stat. Assoc. 1947 42(238), 
224–241 DOI: 10.1080/01621459.1947.10501922 



[9] Vincze, I.: Matematikai statisztika ipari alkal-
mazásokkal, (Műszaki Kiadó, Budapest), 1975 pp. 
58, 89, 165–168 ISBN 963-10-0472-4 

[10] Cochran, W.G.: The distribution of quadratic forms 
in a normal system, with applications to the analy-
sis of covariance, Math. Proc. Cambridge 1934 
30(2), 178–191 DOI: 10.1017/S0305004100016595 

[11] Joanes, D.N.; Gill, C.A.: Comparing measures of 
sample skewness and kurtosis, J. Roy. Stat. Soc. D-
Sta. 1998 47(1), 183–189 DOI:10.1111/1467-9884.00122 

[12] Bonett, D.G.; Seier, E.: A test of normality with 
high uniform power, Comput. Stat. Data An. 2002 
40(3), 435–445 DOI: 10.1016/S0167-9473(02)00074-9 

[13] Komsta, L.; Novomestky, F.: Moments R package, 
version 0.14, 2015 at http://cran.r-project.org, ac-
cessed in October 2017 

[14] Shapiro, S.S.; Wilk, M.B.: An Analysis of Vari-
ance Test for Normality (Complete Samples), Bio-
metrika 1965 52(3/4), 591–611 DOI: 
10.1093/biomet/52.3-4.591 

[15] D’Agostino, R.B.: Transformation to Normality of 
the Null Distribution of G1, Biometrika 1970 
57(3), 679–681 DOI: 10.1093/biomet/57.3.679 

[16] Anscombe, F.J.; Glynn, W.J.: Distribution of the 
kurtosis statistic b2 for normal samples, Biometrika 
1983 70(1), 227–234 DOI: 10.1093/biomet/70.1.227 

[17] Jarque, C.M.; Bera, A.K.: Efficient tests for normality,
homoscedasticity and serial independence
of regression residuals, Economics Letters 1980
6(3), 255–259 DOI: 10.1016/0165-1765(80)90024-5