J. Nig. Soc. Phys. Sci. 4 (2022) 713

Journal of the Nigerian Society of Physical Sciences

Combating the Multicollinearity in Bell Regression Model: Simulation and Application

G. A. Shewa^a,*, F. I. Ugwuowo^b

^a Department of Mathematical Sciences, Taraba State University, Jalingo/Taraba, Nigeria
^b Department of Statistics, University of Nigeria, Nsukka/Enugu, Nigeria

Abstract

The Poisson regression model has been widely used to model count data. However, over-dispersion threatens the performance of the Poisson regression model. The Bell Regression Model (BRM) is an alternative means of modelling count data with over-dispersion. Conventionally, the parameters of the BRM are estimated using the Method of Maximum Likelihood (MML). Multicollinearity poses a challenge to the efficiency of the MML. In this study, we developed a new estimator to overcome the problem of multicollinearity. The theoretical, simulation and application results were in favor of this new method.

DOI: 10.46481/jnsps.2022.713

Keywords: Bell regression, Liu, Multicollinearity, Poisson regression, Ridge

Article History:
Received: 17 March 2022
Received in revised form: 07 June 2022
Accepted for publication: 08 June 2022
Published: 15 August 2022

© 2022 The Author(s). Published by the Nigerian Society of Physical Sciences under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0). Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

Communicated by: T. Latunde

1. Introduction

Regression modeling is crucial in describing the outcome (response) variable of interest as a function of predictor variable(s). In the linear regression model, the outcome variable is usually assumed to follow a normal distribution. However, in practice this may not hold. The Generalized Linear Model (GLM) is employed when the response variable fails to follow a normal distribution.
The GLM generalizes the linear regression model by relating the linear predictor to the response variable via a link function. Examples include the Poisson regression, negative binomial, Bell regression and Beta regression models, among others. It becomes inappropriate to adopt the linear regression model when the response variable is count data. Examples of count data include the number of thunderstorm occurrences, the number of accidents, the number of insurance claims, and the number of species in a habitat, among others.

* Corresponding author. Email address: gladysshewa@yahoo.com (G. A. Shewa)

The Poisson regression model is popularly employed to model count data. However, the major drawback of the model is that it restricts the variance to be equal to the mean; this assumption breaks down when the variance exceeds the mean [1]. The Bell regression model was introduced as an alternative to the Poisson regression model for modelling count data with over-dispersion [1]. The properties of the Bell regression model were discussed in detail by [1]. The parameters of the model are estimated using the Method of Maximum Likelihood (MML), but the efficiency of the MML suffers in the presence of multicollinearity. The Ridge and the Liu estimators were proposed for the parameter estimation of the Bell regression model with multicollinearity [2, 3].

The main objective of this study is to propose a new estimator to account for multicollinearity in the Bell regression model and to derive its properties. We illustrate the proposed estimator using real-life data previously modelled with the popular Poisson regression model. The rest of this article is organized as follows. In Section 2, we introduce the Bell regression model and its parameter estimation. Also, we discuss the new method of estimation and its properties.
The simulation and the practical illustration are in Sections 3 and 4, respectively. The concluding remark is given in the last section.

2. Existing Estimators in Bell Regression Model

Assume the probability distribution of the response variable y is as follows:

f(y) = \frac{\theta^{y} e^{1-e^{\theta}} B_{y}}{y!}, \quad y = 0, 1, 2, \ldots, (1)

where \theta > 0 and B_y = (1/e) \sum_{d=0}^{\infty} (d^{y}/d!) denotes the Bell numbers [1, 4, 5]. The Bell distribution in (1) has the following properties:

E(y) = \theta e^{\theta}, (2)

Var(y) = \theta (1 + \theta) e^{\theta}. (3)

The model is expressed as a function of the mean response. Assume there exists a function \varphi = \theta e^{\theta}, so that \theta = W_0(\varphi), where W_0 represents the Lambert function [1]. Therefore, equation (1) can be re-parameterized as follows:

f(y) = \frac{e^{1-e^{W_0(\varphi)}} W_0(\varphi)^{y} B_{y}}{y!}, \quad y = 0, 1, 2, \ldots, (4)

where \varphi > 0 is the mean response. Consequently,

E(y) = \varphi, (5)

Var(y) = \varphi (1 + W_0(\varphi)). (6)

The Probability Mass Function (pmf) in equation (4) belongs to the one-parameter exponential family. The Bell distribution is fit for modelling over-dispersed data because Var(y) > E(y). Assume y_i follows a Bell distribution with mean \varphi_i, y_i \sim Bell(W_0(\varphi_i)), such that

g(\varphi_i) = \eta_i = x_i^T \beta, \quad i = 1, 2, \ldots, n, (7)

where \beta = (\beta_1, \beta_2, \ldots, \beta_p)^T is the vector of the regression parameters, \eta_i is the linear predictor, and x_i^T = (x_{i1}, x_{i2}, \ldots, x_{ip}) denotes the p known predictors. The Bell regression model (BRM) is obtained by setting \theta_i = e^{x_i^T \beta}, so that \varphi_i = e^{x_i^T \beta} e^{e^{x_i^T \beta}} and \ln(\varphi_i) = x_i^T \beta + e^{x_i^T \beta}. The log-likelihood function becomes

l(\beta, \varphi_i) = \sum_{i=1}^{n} y_i x_i^T \beta + \sum_{i=1}^{n} \left(1 - e^{e^{x_i^T \beta}}\right) + \sum_{i=1}^{n} \log B_{y_i} - \log\left(\prod_{i=1}^{n} y_i!\right). (8)

The MML estimate is obtained by equating the first derivative of equation (8) to zero. The first derivative of equation (8) cannot be solved analytically since it is nonlinear in \beta.
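As a quick numerical check of the distribution in equations (1)-(3), the pmf can be evaluated directly. The following is a small Python sketch (not part of the original analysis): the Bell numbers are computed from the standard binomial recurrence, and the truncation point y = 40 is an arbitrary choice that makes the tail contribution negligible for the illustrative value \theta = 0.7.

```python
from math import comb, exp, factorial

def bell_numbers(n_max):
    """Bell numbers B_0..B_{n_max} via the recurrence B_{n+1} = sum_k C(n,k) B_k."""
    B = [1]
    for n in range(n_max):
        B.append(sum(comb(n, k) * B[k] for k in range(n + 1)))
    return B

def bell_pmf(y, theta, B):
    """Bell pmf f(y; theta) = theta^y * exp(1 - e^theta) * B_y / y!   (eq. 1)."""
    return theta**y * exp(1.0 - exp(theta)) * B[y] / factorial(y)

B = bell_numbers(40)
theta = 0.7                      # illustrative value, theta > 0
probs = [bell_pmf(y, theta, B) for y in range(41)]
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))

print(sum(probs))                # pmf sums to one (up to truncation)
print(mean, theta * exp(theta))  # E(y) = theta e^theta          (eq. 2)
print(var, theta * (1 + theta) * exp(theta))  # Var(y) = theta(1+theta)e^theta (eq. 3)
```

Since the variance \theta(1+\theta)e^{\theta} always exceeds the mean \theta e^{\theta}, the check also illustrates why the distribution accommodates over-dispersion.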
So, \beta is obtained iteratively using the Fisher-scoring algorithm [6], defined as follows:

\beta^{(r+1)} = \beta^{(r)} + I^{-1}(\beta^{(r)}) S(\beta^{(r)}), (9)

where I^{-1}(\beta) = \left(-E\left(\partial^2 l(\beta, \varphi_i)/\partial\beta\,\partial\beta^T\right)\right)^{-1}. Consequently, the MML estimator of \beta is defined as

\hat{\beta}_{MML} = H^{-1} X^T \hat{W} \hat{u}, (10)

where H = X^T \hat{W} X, \hat{u}_i = \log(\hat{\varphi}_i) + (y_i - \hat{\varphi}_i)/\sqrt{var(\hat{\varphi}_i)}, and \hat{W} = diag\left[(\partial\varphi_i/\partial\eta_i)^2 / V(y_i)\right]. The asymptotic covariance matrix is given by:

Cov(\hat{\beta}_{MML}) = (X^T \hat{W} X)^{-1}. (11)

Multicollinearity threatens the efficiency of the MML. Multicollinearity occurs when the predictors are correlated; it makes the MML estimate unstable and inflates the covariance matrix of the MML [7-9]. [10] developed the ridge estimator for the linear regression model, while [2] developed the Bell ridge regression model and defined it as follows:

\hat{\beta}_{k-B} = (H + kI)^{-1} X^T \hat{W} \hat{u}, (12)

where the tuning parameter k > 0. The bias, variance and Matrix Mean Squared Error (MSEM) of the Bell ridge estimator are as follows:

Bias(\hat{\beta}_{k-B}) = -kQ(H + kI)^{-1}\gamma, (13)

Variance(\hat{\beta}_{k-B}) = Q(H + kI)^{-1} H (H + kI)^{-1} Q^T, (14)

MSEM(\hat{\beta}_{k-B}) = Q(H + kI)^{-1} H (H + kI)^{-1} Q^T + k^2 (H + kI)^{-2} \gamma\gamma^T, (15)

where Q denotes the matrix of eigenvectors of X^T \hat{W} X and \gamma = Q^T \beta. [11] developed the Liu estimator for the linear regression model, while [3] developed the Bell Liu estimator, defined as follows:

\hat{\beta}_{d-B} = (H + I)^{-1} (H + dI) \hat{\beta}_{MML}, (16)

where the tuning parameter d > 0. The bias, variance and MSEM of the Bell Liu estimator are as follows:

Bias(\hat{\beta}_{d-B}) = -(1 - d) Q (H + I)^{-1} \gamma, (17)

Variance(\hat{\beta}_{d-B}) = Q (H + I)^{-1} (H + dI) H^{-1} (H + dI) (H + I)^{-1} Q^T, (18)

MSEM(\hat{\beta}_{d-B}) = Q (H + I)^{-1} (H + dI) H^{-1} (H + dI) (H + I)^{-1} Q^T + (1 - d)^2 (H + I)^{-2} \beta\beta^T, (19)

where Q denotes the matrix of eigenvectors of X^T \hat{W} X. Let Q^T X^T \hat{W} X Q = E = diag(e_1, \ldots, e_p), e_1 \ge e_2 \ge \cdots
\ge e_p, where E is the matrix of eigenvalues of X^T \hat{W} X and Q is the matrix whose columns are the eigenvectors of X^T \hat{W} X. Thus, the canonical model can be expressed in terms of M = XQ, \gamma = Q^T \beta and M^T \hat{W} M = E. Consequently, the MML in equation (10) can be re-written as

\hat{\gamma}_{MML} = E^{-1} M^T \hat{W} \hat{u}, (20)

Cov(\hat{\gamma}_{MML}) = E^{-1}. (21)

Thus, the Scalar Mean Squared Error (SMSE) is as follows:

SMSE(\hat{\gamma}_{MML}) = \sum_{j=1}^{p} e_j^{-1}. (22)

The ridge estimator in canonical form is as follows:

\hat{\gamma}_{k-B} = (E + kI)^{-1} M^T \hat{W} \hat{u}. (23)

The MSEM and SMSE of the ridge estimator in canonical form are calculated as

MSEM(\hat{\gamma}_{k-B}) = Q (E + kI)^{-1} E (E + kI)^{-1} Q^T + k^2 (E + kI)^{-2} \gamma\gamma^T, (24)

SMSE(\hat{\gamma}_{k-B}) = \sum_{j=1}^{p} \frac{e_j}{(e_j + k)^2} + k^2 \sum_{j=1}^{p} \frac{\gamma_j^2}{(e_j + k)^2}. (25)

The Liu estimator and its MSEM and SMSE in canonical form are given by:

\hat{\gamma}_{d-B} = (E + I)^{-1} (E + dI) \hat{\gamma}_{MML}, (26)

MSEM(\hat{\gamma}_{d-B}) = Q (E + I)^{-1} (E + dI) E^{-1} (E + dI) (E + I)^{-1} Q^T + (1 - d)^2 (E + I)^{-2} \gamma\gamma^T, (27)

SMSE(\hat{\gamma}_{d-B}) = \sum_{j=1}^{p} \frac{(e_j + d)^2}{e_j (e_j + 1)^2} + (1 - d)^2 \sum_{j=1}^{p} \frac{\gamma_j^2}{(e_j + 1)^2}. (28)

2.1. The Proposed Estimator

The Kibria-Lukman estimator for the linear regression model is defined as follows:

\hat{\gamma}_{KL} = (X^T X + kI)^{-1} (X^T X - kI) \hat{\gamma}_{MML}. (29)

Hence, the Kibria-Lukman estimator for the Bell regression model is as follows:

\hat{\gamma}_{kl-B} = (E + kI)^{-1} (E - kI) \hat{\gamma}_{MML}, (30)

MSEM(\hat{\gamma}_{kl-B}) = Q (E + kI)^{-1} (E - kI) E^{-1} (E - kI) (E + kI)^{-1} Q^T + (2k)^2 (E + kI)^{-2} \gamma\gamma^T, (31)

SMSE(\hat{\gamma}_{kl-B}) = \sum_{j=1}^{p} \frac{(e_j - k)^2}{e_j (e_j + k)^2} + (2k)^2 \sum_{j=1}^{p} \frac{\gamma_j^2}{(e_j + k)^2}. (32)

Also, [12] developed the Modified Ridge-Type estimator for the linear regression model, which is given by:

\hat{\gamma}_{MRT} = \left(X^T X + k(1 + d)I\right)^{-1} X^T y. (33)

The proposed estimator in this study is motivated by replacing \hat{\gamma}_{MML} in equation (30) with \hat{\gamma}_{MRT}. Hence, the Modified Kibria-Lukman estimator can be defined as follows:

\hat{\gamma}_{kd-B} = (X^T X - kI)(X^T X + kI)^{-1}\left(X^T X + k(1 + d)I\right)^{-1} X^T \hat{W} \hat{u}, \quad k > 0. (34)

The proposed estimator can be written in canonical form as follows:

\hat{\gamma}_{kd-B} = (E - kI)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1} M^T \hat{W} \hat{u}, \quad k > 0.
(35)

The statistical properties of \hat{\gamma}_{kd-B} are as follows:

Bias(\hat{\gamma}_{kd-B}) = -k\left((3 + d)E + k(1 + d)I\right)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1}\gamma, (36)

Var(\hat{\gamma}_{kd-B}) = (E - kI)^2 (E + kI)^{-2} \left(E + k(1 + d)I\right)^{-2}, (37)

MSEM(\hat{\gamma}_{kd-B}) = (E + kI)^{-2}(E - kI)^2\left(E + k(1 + d)I\right)^{-2} + k^2\left((3 + d)E + k(1 + d)I\right)^2 (E + kI)^{-2}\left(E + k(1 + d)I\right)^{-2}\gamma\gamma^T, (38)

SMSE(\hat{\gamma}_{kd-B}) = \sum_{j=1}^{p} \frac{(e_j - k)^2}{(e_j + k)^2 (e_j + k(1 + d))^2} + k^2 \sum_{j=1}^{p} \frac{(3e_j + e_j d + k(1 + d))^2 \gamma_j^2}{(e_j + k)^2 (e_j + k(1 + d))^2}. (39)

2.2. Theoretical Comparison based on MSEM and MSE

Lemma 2.1. Given a positive definite (p.d.) matrix M^* and some vector \theta^*, then M^* - \theta^*\theta^{*T} \ge 0 if and only if \theta^{*T} M^{*-1} \theta^* \le 1 [13].

Lemma 2.2. Let \hat{\theta}_1 = C_1 y and \hat{\theta}_2 = C_2 y be two estimators of \theta with covariance matrices Cov(\hat{\theta}_1) and Cov(\hat{\theta}_2), respectively. Suppose that Cov(\hat{\theta}_1) > Cov(\hat{\theta}_2), and let f_i = (C_i X - I)\theta, i = 1, 2, denote the bias. Then MSEM(\hat{\theta}_1) - MSEM(\hat{\theta}_2) > 0 if and only if f_2^T \left[Cov(\hat{\theta}_1) - Cov(\hat{\theta}_2) + f_1 f_1^T\right]^{-1} f_2 < 1, where MSEM(\hat{\theta}_i) = Cov(\hat{\theta}_i) + f_i f_i^T [14].

Theorem 1. Under the Bell regression model, if k > 0 and d > 0, then the proposed estimator \hat{\gamma}_{kd-B} is preferred to \hat{\gamma}_{k-B} if and only if

\gamma^T b \left[(E + kI)^{-1} E (E + kI)^{-1} + k^2 (E + kI)^{-2}\gamma\gamma^T - (E + kI)^{-2}(E - kI)^2\left(E + k(1 + d)I\right)^{-2}\right]^{-1} b^T \gamma < 1,

where b = -k\left((3 + d)E + k(1 + d)I\right)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1}.

Proof: We first show that the bias difference is positive.

Bias(\hat{\gamma}_{k-B}) - Bias(\hat{\gamma}_{kd-B}) = -k(E + kI)^{-1}\gamma + k\left((3 + d)E + k(1 + d)I\right)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1}\gamma
= -k \, diag\left\{ \frac{1}{e_j + k} - \frac{3e_j + de_j + k(1 + d)}{(e_j + k)(e_j + k(1 + d))} \right\}_{j=1}^{p} \gamma. (40)

For each j, -k\left[(e_j + k(1 + d)) - (3e_j + de_j + k(1 + d))\right] = k(2e_j + e_j d) > 0. Consequently, Bias(\hat{\gamma}_{k-B}) - Bias(\hat{\gamma}_{kd-B}) > 0.

We next show that the variance difference of the two estimators is positive definite (p.d.).
Var(\hat{\gamma}_{k-B}) - Var(\hat{\gamma}_{kd-B}) = (E + kI)^{-1} E (E + kI)^{-1} - (E + kI)^{-2}(E - kI)^2\left(E + k(1 + d)I\right)^{-2}
= diag\left\{ \frac{e_j}{(e_j + k)^2} - \frac{(e_j - k)^2}{(e_j + k)^2 (e_j + k(1 + d))^2} \right\}_{j=1}^{p}. (41)

This difference is p.d. since e_j (e_j + k(1 + d))^2 - (e_j - k)^2 > 0 for k, d > 0. Consequently, the proposed estimator is preferred.

Theorem 2. Under the Bell regression model, if k > 0 and d > 0, then the proposed estimator \hat{\gamma}_{kd-B} is preferred to \hat{\gamma}_{d-B} if and only if

\gamma^T b \left[(E + I)^{-2} E^{-1} (E + dI)^2 + (1 - d)^2 (E + I)^{-2}\gamma\gamma^T - (E + kI)^{-2}(E - kI)^2\left(E + k(1 + d)I\right)^{-2}\right]^{-1} b^T \gamma < 1,

where b = -k\left((3 + d)E + k(1 + d)I\right)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1}.

Proof: We first show that the bias difference is positive.

Bias(\hat{\gamma}_{d-B}) - Bias(\hat{\gamma}_{kd-B}) = -(1 - d)(E + I)^{-1}\gamma + k\left((3 + d)E + k(1 + d)I\right)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1}\gamma
= diag\left\{ -\frac{1 - d}{e_j + 1} + \frac{k(3e_j + de_j + k(1 + d))}{(e_j + k)(e_j + k(1 + d))} \right\}_{j=1}^{p} \gamma, (42)

which is positive for k, d > 0. Consequently, Bias(\hat{\gamma}_{d-B}) - Bias(\hat{\gamma}_{kd-B}) > 0.

We next show that the variance difference of the two estimators is positive definite (p.d.).

Var(\hat{\gamma}_{d-B}) - Var(\hat{\gamma}_{kd-B}) = (E + I)^{-2}(E + dI)^2 E^{-1} - (E + kI)^{-2}(E - kI)^2\left(E + k(1 + d)I\right)^{-2}
= diag\left\{ \frac{(e_j + d)^2}{e_j (e_j + 1)^2} - \frac{(e_j - k)^2}{(e_j + k)^2 (e_j + k(1 + d))^2} \right\}_{j=1}^{p}. (43)

This difference is p.d. since (e_j + d)^2 (e_j + k(1 + d))^2 (e_j + k)^2 > e_j (e_j + 1)^2 (e_j - k)^2 > 0 for k, d > 0. Consequently, the proposed estimator is preferred.

Theorem 3. Under the Bell regression model, if k > 0 and d > 0, then the proposed estimator \hat{\gamma}_{kd-B} is preferred to \hat{\gamma}_{kl-B} if and only if

\gamma^T b \left[(E + kI)^{-1}(E - kI) E^{-1} (E - kI)(E + kI)^{-1} + (2k)^2 (E + kI)^{-2}\gamma\gamma^T - (E + kI)^{-2}(E - kI)^2\left(E + k(1 + d)I\right)^{-2}\right]^{-1} b^T \gamma < 1,

where b = -k\left((3 + d)E + k(1 + d)I\right)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1}.
Proof: We first show that the bias difference is positive.

Bias(\hat{\gamma}_{kl-B}) - Bias(\hat{\gamma}_{kd-B}) = -2k(E + kI)^{-1}\gamma + k\left((3 + d)E + k(1 + d)I\right)(E + kI)^{-1}\left(E + k(1 + d)I\right)^{-1}\gamma
= -k \, diag\left\{ \frac{2}{e_j + k} - \frac{3e_j + de_j + k(1 + d)}{(e_j + k)(e_j + k(1 + d))} \right\}_{j=1}^{p} \gamma > 0 \quad \text{for } k, d > 0. (44)

Consequently, Bias(\hat{\gamma}_{kl-B}) - Bias(\hat{\gamma}_{kd-B}) > 0.

We next show that the variance difference of the two estimators is positive definite (p.d.).

Var(\hat{\gamma}_{kl-B}) - Var(\hat{\gamma}_{kd-B}) = (E - kI)^2 E^{-1}(E + kI)^{-2} - (E + kI)^{-2}(E - kI)^2\left(E + k(1 + d)I\right)^{-2}
= diag\left\{ \frac{(e_j - k)^2}{e_j (e_j + k)^2} - \frac{(e_j - k)^2}{(e_j + k)^2 (e_j + k(1 + d))^2} \right\}_{j=1}^{p}. (45)

This difference is p.d. since (e_j - k)^2 (e_j + k(1 + d))^2 (e_j + k)^2 > e_j (e_j + k)^2 (e_j - k)^2 > 0 for k, d > 0. Consequently, the proposed estimator is preferred.

2.3. Estimation of Shrinkage Parameters k and d

The shrinkage parameter k for the proposed estimator is obtained by differentiating its mean squared error. We adopted the Maple software to simplify the resulting expression into its simplest form. Hence, the ridge parameter k is defined as

\hat{k} = \left[ \prod_{j=1}^{p} \frac{\left(d + 1 + \sqrt{2d^2 + 6d + 4}\right) e_j}{1 + d} \right]^{1/p}, (46)

\hat{d} = \min\left\{ \frac{\hat{\gamma}_j^2}{1/e_j + \hat{\gamma}_j^2} \right\} \quad [15], (47)

where e_j is the j-th eigenvalue of M^T \hat{W} M. \hat{k} in (46) produced optimum performance for the proposed and the Kibria-Lukman estimators. Also, \hat{d} in (47) is adopted for the proposed and the Liu estimators. The shrinkage parameter for the ridge estimator is defined as:

\hat{k} = \frac{1}{\max(\hat{\gamma}_j^2)}. (48)

3. Simulation Study

In this study, we simulate using the R software with the help of the bellreg package [1, 16]. The predictors are generated in accordance with [7, 8, 9, 17-25]:

x_{ij} = \sqrt{1 - \rho^2}\, m_{ij} + \rho m_{i(j+1)}, \quad i = 1, \ldots, n; \; j = 1, \ldots,
p, (49)

where m_{ij} are independent standard normal pseudo-random numbers and \rho^2 denotes the correlation between the explanatory variables, with \rho = 0.8, 0.9, 0.99 and 0.999. It is assumed that y_i \sim Bell(W_0(\mu_i)), such that

\log(\mu_i) = \eta_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}. (50)

The sample sizes n are 50, 100, 200 and 500, while p is taken to be 4, 8 and 12. We choose the true regression parameters \beta such that \sum_{i=1}^{p} \beta_i^2 = 1 [26].

Table 1. Simulated result in terms of MSE when p = 4

ρ     | Estimator  | n = 50  | n = 100 | n = 200 | n = 500
0.8   | γ̂_MML     | 5.1368  | 1.9656  | 1.1792  | 1.0398
      | γ̂_k−B     | 4.3841  | 1.8570  | 1.1404  | 1.0386
      | γ̂_d−B     | 3.3761  | 1.9492  | 1.1699  | 1.0392
      | γ̂_kl−B    | 1.9286  | 1.5390  | 1.1393  | 1.0234
      | γ̂_kd−B    | 1.4073  | 1.3149  | 1.2070  | 1.0183
0.9   | γ̂_MML     | 5.6531  | 1.9926  | 1.3388  | 1.1347
      | γ̂_k−B     | 4.8511  | 1.9512  | 1.3192  | 1.1335
      | γ̂_d−B     | 4.7792  | 1.9629  | 1.3342  | 1.1338
      | γ̂_kl−B    | 3.8790  | 1.7835  | 1.3128  | 1.1234
      | γ̂_kd−B    | 2.5191  | 1.5851  | 1.3093  | 1.0191
0.99  | γ̂_MML     | 6.8980  | 2.9614  | 1.8329  | 1.2415
      | γ̂_k−B     | 5.5356  | 2.2753  | 1.3614  | 1.2360
      | γ̂_d−B     | 5.2118  | 2.4122  | 1.4868  | 1.2081
      | γ̂_kl−B    | 4.2356  | 2.1345  | 1.3545  | 1.2076
      | γ̂_kd−B    | 3.4242  | 1.6709  | 1.4037  | 1.2001
0.999 | γ̂_MML     | 15.4196 | 8.3391  | 8.2790  | 3.4418
      | γ̂_k−B     | 6.1161  | 3.7683  | 7.2700  | 2.0326
      | γ̂_d−B     | 5.6982  | 2.9212  | 4.2932  | 1.7886
      | γ̂_kl−B    | 4.8923  | 2.7634  | 2.4512  | 1.5531
      | γ̂_kd−B    | 3.4986  | 2.0303  | 1.6574  | 1.2747

The simulation study was conducted in the R language (via RStudio). The experiment was replicated 1000 times and the Mean Squared Error (MSE) was employed to evaluate the performance of the estimators:

MSE(\hat{\beta}^*) = \frac{1}{1000} \sum_{j=1}^{1000} (\hat{\beta}^*_j - \beta)^T (\hat{\beta}^*_j - \beta), (51)

where \hat{\beta}^*_j is the estimate in the j-th replication and \beta is the true parameter vector. The estimator with the minimum MSE is preferred to the estimator with the maximum MSE. The simulation results are presented in Tables 1-3, from which the following observations were made. The Mean Squared Errors for each of the estimators were computed at different specifications. The Method of Maximum Likelihood has the least performance in this study.
This is in line with the literature, as expected. We observed that its performance drops as the level of multicollinearity increases. The new estimator produces a better performance, in terms of minimum Mean Squared Error, than the ridge and the Liu estimators. The other noticeable trends from Tables 1-3 are as follows:
1. The MSE rises as the level of multicollinearity rises, keeping other factors constant.
2. Also, the MSE rises as the number of predictors increases, keeping other factors constant.
3. Increasing the sample size n results in a decrease in the MSE for all the estimators, keeping other factors constant.
4. These results agree with the theoretical section.

Table 2. Simulated result in terms of MSE when p = 8

ρ     | Estimator  | n = 50  | n = 100 | n = 200 | n = 500
0.8   | γ̂_MML     | 5.5490  | 5.2430  | 4.9503  | 1.4723
      | γ̂_k−B     | 5.5467  | 4.9460  | 4.1475  | 1.4700
      | γ̂_d−B     | 5.5375  | 4.9272  | 3.1725  | 1.4710
      | γ̂_kl−B    | 4.7675  | 4.3211  | 3.2415  | 1.3980
      | γ̂_kd−B    | 3.8533  | 3.1229  | 1.2796  | 1.3476
0.9   | γ̂_MML     | 6.1409  | 6.0730  | 5.2045  | 2.0049
      | γ̂_k−B     | 6.1372  | 6.0708  | 4.8290  | 2.0023
      | γ̂_d−B     | 6.1169  | 6.0617  | 4.1508  | 2.0024
      | γ̂_kl−B    | 5.9890  | 5.3678  | 4.0234  | 1.9087
      | γ̂_kd−B    | 4.3141  | 4.4619  | 3.4109  | 1.7746
0.99  | γ̂_MML     | 7.3489  | 7.7949  | 6.8973  | 3.0801
      | γ̂_k−B     | 7.3374  | 6.7782  | 5.8251  | 3.0555
      | γ̂_d−B     | 7.2683  | 6.7231  | 5.0666  | 3.0412
      | γ̂_kl−B    | 6.9764  | 6.0123  | 5.1034  | 2.9807
      | γ̂_kd−B    | 6.1818  | 5.5338  | 4.0534  | 2.4082
0.999 | γ̂_MML     | 17.7282 | 10.7036 | 10.0083 | 5.9225
      | γ̂_k−B     | 7.5111  | 6.9166  | 5.5853  | 4.7222
      | γ̂_d−B     | 8.0643  | 7.0671  | 6.7115  | 4.1167
      | γ̂_kl−B    | 8.0423  | 7.0654  | 6.0980  | 4.0088
      | γ̂_kd−B    | 7.0007  | 6.9086  | 4.4547  | 2.5738

Table 3.
Simulated result in terms of MSE when p = 12

ρ     | Estimator  | n = 50  | n = 100 | n = 200 | n = 500
0.8   | γ̂_MML     | 8.3479  | 6.6716  | 5.6898  | 2.5792
      | γ̂_k−B     | 8.3446  | 5.6696  | 5.4483  | 2.1417
      | γ̂_d−B     | 8.3357  | 5.6696  | 4.6284  | 2.2176
      | γ̂_kl−B    | 6.7845  | 5.3222  | 3.9908  | 2.2214
      | γ̂_kd−B    | 5.9400  | 4.1370  | 3.8202  | 1.8786
0.9   | γ̂_MML     | 9.7388  | 7.6736  | 6.0451  | 3.0170
      | γ̂_k−B     | 9.7362  | 6.8409  | 5.0426  | 2.6473
      | γ̂_d−B     | 9.7256  | 5.8649  | 5.0422  | 2.8627
      | γ̂_kl−B    | 8.7656  | 5.0119  | 4.8765  | 2.9080
      | γ̂_kd−B    | 7.8950  | 4.4621  | 4.2318  | 2.5720
0.99  | γ̂_MML     | 15.5725 | 9.5476  | 5.6815  | 4.7363
      | γ̂_k−B     | 10.6565 | 7.4973  | 5.6736  | 4.6573
      | γ̂_d−B     | 10.6124 | 7.3872  | 5.1511  | 4.6358
      | γ̂_kl−B    | 9.6787  | 7.0577  | 5.0335  | 4.4631
      | γ̂_kd−B    | 9.2586  | 6.7495  | 4.9946  | 2.9945
0.999 | γ̂_MML     | 31.7659 | 21.8870 | 11.1505 | 8.6753
      | γ̂_k−B     | 16.2341 | 10.0161 | 9.0282  | 6.1041
      | γ̂_d−B     | 12.1141 | 9.0713  | 7.0907  | 5.7521
      | γ̂_kl−B    | 11.0886 | 8.0989  | 6.5656  | 4.8989
      | γ̂_kd−B    | 11.1061 | 7.0443  | 5.0506  | 3.2270

Table 4. Bell regression estimates

Coef.     | γ̂_MML  | γ̂_k−B  | γ̂_d−B  | γ̂_kl−B | γ̂_kd−B
Intercept | -0.5423 | -0.1484 | -0.3006 | 0.0375  | 0.0322
x1        | 0.5992  | 0.3396  | -0.0034 | 0.3286  | -0.0459
x2        | 0.1630  | 0.1667  | 0.0119  | 0.1605  | 0.1685
x3        | -0.0117 | -0.0146 | 0.0023  | -0.0161 | -0.0136
k/d       | —       | 2.78554 | 0.1108  | 1.1333  | 18.0445 (0.1108)
MSE       | 1.7447  | 0.3914  | 0.5327  | 0.1493  | 0.0181

4. Application

In this section, we adopt the aircraft data to evaluate the performance of the existing estimators and the proposed one. This dataset has previously been modelled with the Poisson regression model [8, 9, 27], among others. There is one response variable and three predictors (see [8, 9] for details). The Poisson distribution fits the outcome variable well [8, 9, 27]. The model suffers from multicollinearity because the condition number is 219.3654 [8, 9]. However, the variance of the number of locations with damage on the aircraft is more than twice the mean (2.0569). With this, it is evident that the data exhibit over-dispersion. We fit the Bell regression model as an alternative to account for the over-dispersion in the data.
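The estimators under comparison differ only in how they shrink the canonical MML solution, so their SMSE formulas in equations (22), (25), (32) and (39) can be evaluated directly. The following Python sketch uses an illustrative ill-conditioned spectrum e and canonical coefficients γ (not the aircraft data), with k and d fixed by hand rather than by equations (46)-(48):

```python
import numpy as np

# Illustrative values only; in the paper e_j, gamma_j come from X^T W X,
# and k, d from equations (46)-(48).
e = np.array([110.0, 6.0, 0.4, 0.05])   # eigenvalues e_j (ill-conditioned)
g = np.array([0.6, 0.5, 0.4, 0.2])      # canonical coefficients gamma_j
k, d = 0.5, 0.1                         # shrinkage parameters, fixed by hand

smse_mml = np.sum(1.0 / e)                                                 # eq. (22)
smse_ridge = np.sum(e / (e + k)**2) + k**2 * np.sum(g**2 / (e + k)**2)     # eq. (25)
smse_kl = (np.sum((e - k)**2 / (e * (e + k)**2))
           + (2 * k)**2 * np.sum(g**2 / (e + k)**2))                       # eq. (32)
den = (e + k)**2 * (e + k * (1 + d))**2
smse_new = (np.sum((e - k)**2 / den)
            + k**2 * np.sum((3*e + e*d + k*(1 + d))**2 * g**2 / den))      # eq. (39)

print(smse_mml, smse_ridge, smse_kl, smse_new)
```

With this particular spectrum every shrinkage estimator improves on the MML; which shrinkage estimator wins depends on e, γ, k and d, which is why the choice of k and d in Section 2.3 matters for the comparisons in Table 4.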
Table 4 provides the regression estimates and the Mean Squared Error of each of the estimators adopted in this study. It is obvious from the results in Table 4 that the proposed estimator produces the lowest MSE and dominates the MML, the ridge and the Liu estimators. The MML has the highest mean squared error and is thus not recommended when there is multicollinearity.

5. Conclusion

The Poisson regression model is often employed to model count data. However, the Poisson regression model gives a poor fit for count data with over-dispersion. Recently, the Bell regression model was introduced as an alternative to the Poisson regression model for the purpose of accounting for over-dispersion in count data modelling. The conventional Method of Maximum Likelihood (MML) is employed to estimate the regression parameters. This estimator performs poorly when the regressors are correlated. The ridge and the Liu estimators were developed to combat correlated regressors in the Bell regression model. In this study, we developed a new method of parameter estimation in the Bell regression model to compete with the existing ones. We compared the performance of the new method with the existing methods. The new method dominates the existing methods by giving a minimum Mean Squared Error.

Acknowledgments

The authors are very grateful to the anonymous referees for their careful and diligent reading of the paper and helpful suggestions. This research has not received any sponsorship.

References

[1] F. Castellares, S. L. P. Ferrari & A. J. Lemonte, "On the Bell distribution and its associated regression model for count data", Applied Mathematical Modelling 56 (2017) 172, https://doi.org/10.1016/j.apm.2017.12.014
[2] M. Amin, M. N. Akram & A. Majid, "On the estimation of Bell regression model using ridge estimator", Communications in Statistics - Simulation and Computation 2021 (2021), https://doi.org/10.1080/03610918.2020.1870694
[3] A. Majid, M. Amin & M. N.
Akram, "On the Liu estimation of Bell regression model in the presence of multicollinearity", Journal of Statistical Computation and Simulation 2021 (2021) 21, https://doi.org/10.1080/00949655.2021.1955886
[4] E. T. Bell, "Exponential numbers", The American Mathematical Monthly 41 (1934a) 419.
[5] E. T. Bell, "Exponential polynomials", Annals of Mathematics 35 (1934b) 258.
[6] N. T. Longford, "A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects", Biometrika 74 (1987) 817.
[7] B. M. G. Kibria, "Performance of some new ridge regression estimators", Communications in Statistics - Simulation and Computation 32 (2003) 419.
[8] A. F. Lukman, B. Aladeitan, K. Ayinde & M. R. Abonazel, "Modified ridge-type for the Poisson regression model: simulation and application", Journal of Applied Statistics 2021 (2021a), https://doi.org/10.1080/02664763.2021.1889998
[9] A. F. Lukman, E. Adewuyi, K. Månsson & B. M. G. Kibria, "A new estimator for the multicollinear Poisson regression model: simulation and application", Scientific Reports 11 (2021b) 3732, https://doi.org/10.1038/s41598-021-82582-w
[10] A. E. Hoerl & R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems", Technometrics 12 (1970) 55.
[11] K. Liu, "A new class of biased estimate in linear regression", Communications in Statistics - Theory and Methods 22 (1993) 393.
[12] A. F. Lukman, K. Ayinde, S. Binuomote & A. C. Onate, "Modified ridge-type estimator to combat multicollinearity: application to chemical data", Journal of Chemometrics 2019 (2019) e3125, https://doi.org/10.1002/cem.3125
[13] R. Farebrother, "Further results on the mean square error of ridge regression", Journal of the Royal Statistical Society, Series B (Methodological) 38 (1976) 248.
[14] G. Trenkler & H. Toutenburg, "Mean squared error matrix comparisons between biased estimators - an overview of recent results", Statistical Papers 31 (1990) 179.
[15] M. R. Ozkale & S.
Kaciranlar, "The restricted and unrestricted two-parameter estimators", Communications in Statistics - Theory and Methods 36 (2007) 2707.
[16] Stan Development Team, RStan: the R interface to Stan, R package version 2.19.3 (2020), https://mc-stan.org
[17] M. Arashi, M. Roozbeh, N. A. Hamzah & M. Gasparini, "Ridge regression and its applications in genetic studies", PLoS One 16 (2021a) e0245376.
[18] M. Arashi, M. Norouzirad, M. Roozbeh & N. M. Khan, "A high-dimensional counterpart for the ridge estimator in multicollinear situations", Mathematics 9 (2021b) 3057, https://doi.org/10.3390/math9233057
[19] Y. M. Bulut, "Performance of the Liu-type estimator in the Bell regression model", 9th International Conference on Applied Analysis and Mathematical Modeling (ICAAMM21), Istanbul, Turkey (2021).
[20] O. G. Obadina, A. F. Adedotuun & O. A. Odusanya, "Ridge estimation's effectiveness for multiple linear regression with multicollinearity: an investigation using Monte-Carlo simulations", Journal of the Nigerian Society of Physical Sciences 3 (2021) 278.
[21] M. Qasim, K. Månsson, P. Sjolander & B. M. G. Kibria, "A new class of efficient and debiased two-step shrinkage estimators: method and application", Journal of Applied Statistics 2021 (2021), https://doi.org/10.1080/02664763.2021.1973389
[22] A. K. M. E. Saleh, M. Arashi & B. M. G. Kibria, Theory of Ridge Regression Estimation with Applications, John Wiley, USA (2019).
[23] M. Suhail, S. Chand & B. M. G. Kibria, "Quantile-based robust ridge M-estimator for linear regression model in presence of multicollinearity and outliers", Communications in Statistics - Simulation and Computation 50 (2021) 3194.
[24] N. K. Rashad & Z. Y. Algamal, "A new ridge estimator for the Poisson regression model", Iranian Journal of Science and Technology, Transactions A: Science 43 (2019), https://doi.org/10.1007/s40995-019-00769-3
[25] Z. Y. Algamal & Y.
Asar, "Liu-type estimator for the Gamma regression model", Communications in Statistics - Simulation and Computation 8 (2018) 2035.
[26] B. M. G. Kibria & A. F. Lukman, "A new ridge-type estimator for the linear regression model: simulations and applications", Scientifica 2020 (2020) 9758378, https://doi.org/10.1155/2020/9758378
[27] R. H. Myers, D. C. Montgomery, G. G. Vining & T. J. Robinson, Generalized Linear Models: with Applications in Engineering and the Sciences, Wiley, New York (2012).