©2022 Ada Academica, https://adac.ee. Eur. J. Math. Anal. 2 (2022) 7. doi: 10.28924/ada/ma.2.7
On the Stratonovich Estimator for the Itô Diffusion

Jaya P. N. Bishwal
Department of Mathematics and Statistics, University of North Carolina at Charlotte,

376 Fretwell Bldg, 9201 University City Blvd., Charlotte, NC 28223-0001, USA

Correspondence: J.Bishwal@uncc.edu

Abstract. For the parameter appearing non-linearly in the drift coefficient of a homogeneous Itô stochastic differential equation having a stationary ergodic solution, the paper obtains the strong consistency of an approximate maximum likelihood estimator based on a Stratonovich type approximation of the continuous Girsanov likelihood, under some regularity conditions, when the corresponding diffusion is observed at equally spaced dense time points over a long time interval in the high frequency regime. Pathwise convergence of stochastic integral approximations and their connection to discrete drift estimators is studied. Discrete drift estimators are often shown to converge in probability; we obtain convergence of the estimator with probability one. The Ornstein-Uhlenbeck process is considered as an example.
1. Introduction and Preliminaries

Parameter estimation in diffusion processes based on discrete observations is receiving a lot of attention nowadays in view of its applications in many fields such as biology, physics, oceanography and especially finance; see Kutoyants (2004) and Bishwal (2008, 2021). Consider the Itô stochastic differential equation
\[
dX_t = f(\theta, X_t)\,dt + dW_t, \quad t \ge 0, \qquad X_0 = X^0, \tag{1.1}
\]

where $\{W_t, t \ge 0\}$ is a one-dimensional standard Wiener process, $\theta \in \Theta$, $\Theta$ is a compact subset of $\mathbb{R}$, $f$ is a known real-valued function defined on $\Theta \times \mathbb{R}$, and the unknown parameter $\theta$ is to be estimated on the basis of observation of the process $\{X_t, t \ge 0\}$. Let $\theta_0$ be the true value of the parameter, which is in the interior of $\Theta$. We assume that the process $\{X_t, t \ge 0\}$ is observed at $0 = t_0 < t_1 < \dots < t_n = T$ with $\Delta t_i := t_i - t_{i-1} = T/n = h$, $i = 1, 2, \dots, n$, and $T = d n^{1/2}$ for some fixed real number $d > 0$. We estimate $\theta$ from the observations $\{X_{t_0}, X_{t_1}, \dots, X_{t_n}\}$. This model was first studied by Dorogovcev (1976), who obtained weak consistency of the conditional least squares estimator (CLSE) under some regularity conditions as $T \to \infty$ and $T/n \to 0$. Kasonga (1988) obtained the strong consistency of the CLSE under some regularity conditions as $n \to \infty$, assuming that $T = d n^{1/2}$ for some fixed real number $d > 0$.

Note that the conditional least squares estimator (CLSE) of $\theta$ is defined as
\[
\theta_{n,T} := \arg\min_{\theta \in \Theta} Q_{n,T}(\theta),
\qquad \text{where} \quad
Q_{n,T}(\theta) = \sum_{i=1}^{n} \frac{\left[ X_{t_i} - X_{t_{i-1}} - f(\theta, X_{t_{i-1}})\,h \right]^2}{\Delta t_i}.
\]

Received: 5 Dec 2021.
Key words and phrases. Itô stochastic differential equation; Stratonovich integral; diffusion process; discrete observations; high frequency; approximate maximum likelihood estimators; conditional least squares estimator; strong consistency; Monte Carlo methods.

Note that the CLSE, the Euler-Maruyama estimator and the IAMLE are the same estimator (see Shoji (1997)). For the Ornstein-Uhlenbeck process, Bishwal and Bose (2001) studied the rates of weak convergence of approximate maximum likelihood estimators, which are of conditional least squares type. For the Ornstein-Uhlenbeck process, Bishwal (2010a) studied the uniform rate of weak convergence for the minimum contrast estimator, which has a close connection to the Stratonovich-Milstein scheme. Bishwal (2009a) studied Berry-Esseen inequalities for the conditional least squares estimator in discretely observed nonlinear diffusions. Bishwal (2009b) studied a Stratonovich based approximate M-estimator of discretely sampled nonlinear diffusions. Bishwal (2011a) studied the Milstein approximation of the posterior density of diffusions. Bishwal (2010b) studied conditional least squares estimation in nonlinear diffusion processes based on Poisson sampling. Bishwal (2011b) obtained some new estimators of integrated volatility using stochastic Taylor type schemes, which could be useful for option pricing in stochastic volatility models.

In mathematical finance, almost sure optimal hedging has received recent attention. Gobet and Landon (2014) studied the optimal discretization error in the context of hedging error in a multidimensional Itô model, where the convergence is studied in an almost sure sense and the discrete trading dates are stopping times; this includes the sampling scheme of Karandikar (1995), who studied pathwise convergence of stochastic integrals. Bishwal (2011c) studied higher order approximation of hedging error in the mean square sense. Almost sure hedging and optimality of the discretization error motivate our study of almost sure consistency in the estimation problem.

Florens-Zmirou (1989) studied a minimum contrast estimator based on an Euler-Maruyama type first order approximate discrete time scheme of the SDE (1.1), which is given by
\[
Z_{t_i} - Z_{t_{i-1}} = f(\theta, Z_{t_{i-1}})(t_i - t_{i-1}) + W_{t_i} - W_{t_{i-1}}, \quad i \ge 1, \qquad Z_0 = X^0.
\]

The log-likelihood function of $\{Z_{t_i}, 0 \le i \le n\}$ is given by
\[
C \sum_{i=1}^{n} \frac{\left[ Z_{t_i} - Z_{t_{i-1}} - f(\theta, Z_{t_{i-1}})\,h \right]^2}{\Delta t_i},
\]

where $C$ is a constant independent of $\theta$. A contrast for the estimation of $\theta$ is derived from the above log-likelihood by substituting $\{Z_{t_i}, 0 \le i \le n\}$ with $\{X_{t_i}, 0 \le i \le n\}$. The resulting contrast is
\[
H_{n,T}(\theta) = C \sum_{i=1}^{n} \frac{\left[ X_{t_i} - X_{t_{i-1}} - f(\theta, X_{t_{i-1}})\,h \right]^2}{\Delta t_i},
\]
and the resulting minimum contrast estimator, called the Euler estimator, is
\[
\check{\theta}_{n,T} := \arg\min_{\theta \in \Theta} H_{n,T}(\theta).
\]
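As an illustration only (it is not part of the original paper), the following sketch simulates a path on the sampling grid $h = T/n$, $T = d n^{1/2}$ and minimizes the Euler contrast over a finite set of candidate values. The drift `ou_drift`, all helper names, the choice $C = 1$, the internal sub-stepping of the simulator and the candidate grid are assumptions made for the example.

```python
import math
import random

def simulate_path(drift, theta, n, d=1.0, x0=0.0, substeps=5, seed=0):
    """Euler-Maruyama path observed at t_i = i*h with h = T/n, T = d*sqrt(n)."""
    rng = random.Random(seed)
    T = d * math.sqrt(n)
    h = T / n
    dt = h / substeps  # finer internal step to limit simulation bias (assumption)
    x, path = x0, [x0]
    for _ in range(n):
        for _ in range(substeps):
            x += drift(theta, x) * dt + rng.gauss(0.0, math.sqrt(dt))
        path.append(x)
    return path, h

def euler_contrast(drift, theta, path, h):
    """Sum of squared one-step residuals divided by h (constant C taken as 1)."""
    return sum((b - a - drift(theta, a) * h) ** 2 / h
               for a, b in zip(path, path[1:]))

def argmin_on_grid(objective, grid):
    """Grid search standing in for the arg min over the compact set Theta."""
    return min(grid, key=objective)

def ou_drift(theta, x):
    # Illustrative drift f(theta, x) = theta * x (assumption for the example).
    return theta * x

path, h = simulate_path(ou_drift, theta=-1.0, n=4000, d=1.0, seed=1)
grid = [-3.0 + 0.01 * k for k in range(300)]  # candidates in [-3.00, -0.01]
theta_check = argmin_on_grid(
    lambda th: euler_contrast(ou_drift, th, path, h), grid)
```

A grid search is used here only because it needs no optimizer; for a drift that is linear in $\theta$ the minimizer is available in closed form, and the grid result should agree with it up to the grid spacing.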

Florens-Zmirou (1989) showed $L^2$ consistency of the estimator as $T \to \infty$ and $T/n \to 0$.

If continuous observation of $\{X_t\}$ on the interval $[0, T]$ were available, then the likelihood function of $\theta$ would be
\[
L_T(\theta) = \exp\left\{ \int_0^T f(\theta, X_t)\,dX_t - \frac{1}{2} \int_0^T f^2(\theta, X_t)\,dt \right\}, \tag{1.2}
\]

(see Liptser and Shiryayev (1977)). In our case we have discrete data and we have to approximate the likelihood to get the MLE. Taking an Itô type approximation of the stochastic integral and a rectangle rule approximation of the ordinary integral in (1.2), we obtain the approximate likelihood function
\[
L_{n,T}(\theta) = \exp\left\{ \sum_{i=1}^{n} f(\theta, X_{t_{i-1}})(X_{t_i} - X_{t_{i-1}}) - \frac{h}{2} \sum_{i=1}^{n} f^2(\theta, X_{t_{i-1}}) \right\}. \tag{1.3}
\]
An approximate maximum likelihood estimate (AMLE) based on $L_{n,T}$ is defined as
\[
\hat{\theta}_{n,T} := \arg\max_{\theta \in \Theta} L_{n,T}(\theta).
\]
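One way to see why minimizing the least squares contrast and maximizing the Itô type approximate likelihood give the same estimator is to expand the square. The short calculation below is added for illustration and is not taken from Shoji (1997); it assumes equal spacing $\Delta t_i = h$ and writes $\Delta X_i := X_{t_i} - X_{t_{i-1}}$.

```latex
\begin{aligned}
Q_{n,T}(\theta)
&= \frac{1}{h} \sum_{i=1}^{n} \bigl[ \Delta X_i - f(\theta, X_{t_{i-1}})\,h \bigr]^2 \\
&= \frac{1}{h} \sum_{i=1}^{n} (\Delta X_i)^2
   - 2 \sum_{i=1}^{n} f(\theta, X_{t_{i-1}})\,\Delta X_i
   + h \sum_{i=1}^{n} f^2(\theta, X_{t_{i-1}}) \\
&= \frac{1}{h} \sum_{i=1}^{n} (\Delta X_i)^2 - 2 \log L_{n,T}(\theta).
\end{aligned}
```

The first term does not depend on $\theta$, so the minimizer of $Q_{n,T}$ over $\Theta$ coincides with the maximizer of $L_{n,T}$.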

Weak consistency and other properties of this estimator were studied by Yoshida (1992) as $T \to \infty$ and $T/n \to 0$.

Note that the CLSE, the Euler estimator and the AMLE1 are the same estimator (see Shoji (1997)).

In order to obtain a better estimator, which may have a faster rate of convergence, we propose a new algorithm. Note that the Itô and the Stratonovich integrals are connected by
\[
\int_0^T f(\theta, X_t)\,dX_t = \int_0^T f(\theta, X_t) \circ dX_t - \frac{1}{2} \int_0^T \dot{f}(\theta, X_t)\,dt
\]

(see Ikeda and Watanabe (1989)). We transform the Itô integral in (1.2) to a Stratonovich integral, apply a Stratonovich type approximation of the stochastic integral and a rectangular rule type approximation of the ordinary integrals, and obtain the approximate likelihood
\[
\tilde{L}_{n,T}(\theta) = \exp\left\{ \frac{1}{2} \sum_{i=1}^{n} \bigl( f(\theta, X_{t_{i-1}}) + f(\theta, X_{t_i}) \bigr)(X_{t_i} - X_{t_{i-1}}) - \frac{h}{2} \sum_{i=1}^{n} \bigl( \dot{f}(\theta, X_{t_{i-1}}) + f^2(\theta, X_{t_{i-1}}) \bigr) \right\}. \tag{1.4}
\]
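The Itô-Stratonovich correction used to pass from (1.2) to (1.4) can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions, not the paper's procedure: it takes the integrand $g(x) = x$ along a simulated Brownian path (so the drift is zero), and compares the left-point (Itô type) sum with the trapezoidal (Stratonovich type) sum. All function names and the values of `T`, `n` and the seed are hypothetical.

```python
import math
import random

def brownian_path(n, T, seed=0):
    """Brownian motion sampled at n equally spaced steps on [0, T]."""
    rng = random.Random(seed)
    h = T / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(h)))
    return w

def left_point_sum(g, x):
    """Ito type approximation: integrand evaluated at the left endpoint."""
    return sum(g(a) * (b - a) for a, b in zip(x, x[1:]))

def trapezoid_sum(g, x):
    """Stratonovich type approximation: average of the two endpoints."""
    return sum(0.5 * (g(a) + g(b)) * (b - a) for a, b in zip(x, x[1:]))

T, n = 10.0, 10000
w = brownian_path(n, T, seed=42)
ito = left_point_sum(lambda x: x, w)
strat = trapezoid_sum(lambda x: x, w)
quad_var = sum((b - a) ** 2 for a, b in zip(w, w[1:]))
# For g(x) = x the gap equals half the realized quadratic variation,
# which is close to T / 2, matching the correction (1/2) * integral of g'.
gap = strat - ito
```

For this integrand the trapezoidal sum telescopes to $(W_T^2 - W_0^2)/2$, which is the Stratonovich integral, while the left-point sum differs from it by half the sum of squared increments.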

The Stratonovich approximate maximum likelihood estimator (SAMLE) based on $\tilde{L}_{n,T}$ is defined as
\[
\tilde{\theta}_{n,T} := \arg\max_{\theta \in \Theta} \tilde{L}_{n,T}(\theta).
\]
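A minimal sketch of how the logarithm of (1.4) might be evaluated from discrete observations is given below. It is an illustration, not the author's implementation: the drift `tanh_drift`, its derivative, the toy observations and the grid-search maximizer are all hypothetical choices, and the user is assumed to supply the derivative of $f$ with respect to $x$ explicitly.

```python
import math

def samle_loglik(f, f_dot, theta, path, h):
    """Log of the Stratonovich approximate likelihood (1.4)."""
    # Trapezoidal (Stratonovich type) approximation of the stochastic integral.
    strat = 0.5 * sum((f(theta, a) + f(theta, b)) * (b - a)
                      for a, b in zip(path, path[1:]))
    # Rectangle rule for the ordinary integrals, evaluated at left endpoints.
    corr = 0.5 * h * sum(f_dot(theta, a) + f(theta, a) ** 2 for a in path[:-1])
    return strat - corr

def samle_on_grid(f, f_dot, path, h, grid):
    """Grid search standing in for the arg max over the compact set Theta."""
    return max(grid, key=lambda th: samle_loglik(f, f_dot, th, path, h))

# Hypothetical drift, nonlinear in theta: f(theta, x) = -tanh(theta * x).
def tanh_drift(theta, x):
    return -math.tanh(theta * x)

def tanh_drift_dot(theta, x):
    # Derivative of tanh_drift with respect to x.
    return -theta * (1.0 - math.tanh(theta * x) ** 2)

toy_path = [0.0, 1.0, 3.0]  # made-up observations, for illustration only
toy_h = 0.5
value = samle_loglik(tanh_drift, tanh_drift_dot, 1.0, toy_path, toy_h)
```

Passing `f` and `f_dot` as separate callables keeps the sketch free of numerical differentiation; in practice one could replace the grid search by any one-dimensional optimizer over $\Theta$.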

This estimator is known to have a faster rate of convergence (in the mean square sense) than the conditional least squares estimator; see Bishwal (2009b).

For Monte Carlo simulations in finance, one would be interested in pathwise convergence of the estimator. In this paper we prove the strong consistency of the SAMLE under some regularity conditions given below as $n \to \infty$. We shall use the following notation: $\Delta X_i = X_{t_i} - X_{t_{i-1}}$, $\Delta W_i = W_{t_i} - W_{t_{i-1}}$, and $C$ is a generic constant independent of $h$, $n$ and other variables (it may depend on $\theta$). Prime denotes the derivative w.r.t. $\theta$ and dot denotes the derivative w.r.t. $x$. Suppose that $\theta_0$ denotes the true value of the parameter and $\theta_0 \in \Theta$. We assume the following conditions:

(A1) The parameter space $\Theta$ is compact.

(A2) $|f(\theta, x)| \le K(\theta)(1 + |x|)$, $|f(\theta, x) - f(\theta, y)| \le K(\theta)|x - y|$, and $|f(\theta, x) - f(\varphi, x)| \le C(x)|\theta - \varphi|$ for all $\theta, \varphi \in \Theta$, $x, y \in \mathbb{R}$, where
\[
\sup_{\theta \in \Theta} |K(\theta)| = K < \infty, \qquad E|C(X_0)|^m = C_m < \infty \ \text{ for some } m > 16.
\]

(A3) The diffusion process $X$ is stationary and ergodic with invariant measure $\nu$, i.e., for any $g$ with $E[g(\cdot)] < \infty$,
\[
\frac{1}{n} \sum_{i=1}^{n} g(X_{t_i}) \to E_\nu[g(X_0)] \quad \text{a.s. as } T \to \infty \text{ and } h \to 0.
\]
Further, $E|X_0|^m < \infty$ for some $m > 16$.

(A4) $E|f(\theta, X_0) - f(\theta_0, X_0)|^2 = 0$ iff $\theta = \theta_0$.

(A5) $f$ is a twice continuously differentiable function in $x$ with
\[
E \sup_t |\dot{f}(X_t)|^2 < \infty, \qquad E \sup_t |\ddot{f}(X_t)|^2 < \infty.
\]
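To make the conditions concrete, consider the linear drift $f(\theta, x) = \theta x$ with $\Theta$ a compact subset of $(-\infty, 0)$. The check below is an added illustration under the extra assumption that $X_0$ is drawn from the stationary law, which for this drift is the Gaussian law $N(0, 1/(2|\theta_0|))$; it is not a statement made in the paper.

```latex
\begin{aligned}
&|f(\theta, x)| = |\theta|\,|x| \le |\theta|\,(1 + |x|), \qquad
 |f(\theta, x) - f(\theta, y)| = |\theta|\,|x - y|, \qquad
 |f(\theta, x) - f(\varphi, x)| = |x|\,|\theta - \varphi|, \\
&\text{so one may take } K(\theta) = |\theta| \text{ and } C(x) = |x|; \qquad
 \dot{f}(\theta, x) = \theta, \quad \ddot{f}(\theta, x) = 0; \\
&E|f(\theta, X_0) - f(\theta_0, X_0)|^2 = (\theta - \theta_0)^2\, E X_0^2
 = \frac{(\theta - \theta_0)^2}{2|\theta_0|}, \text{ which vanishes iff } \theta = \theta_0.
\end{aligned}
```

Since a Gaussian law has finite moments of every order, the moment requirements in (A2) and (A3) hold for this example under the stationarity assumption.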

2. Main Results

We shall use the following theorem to prove the strong consistency of the SAMLE.

Theorem 2.1 (Frydman (1980)). Suppose the random functions $D_n$ satisfy the following conditions:
(C1) With probability one, $D_n(\theta) \to D(\theta)$ uniformly in $\theta \in \Theta$ as $n \to \infty$.
(C2) The limiting nonrandom function $D$ is such that $D(\theta_0) \ge D(\theta)$ for all $\theta \in \Theta$.
(C3) $D(\theta) = D(\theta_0)$ iff $\theta = \theta_0$.
Then $\theta_n \to \theta_0$ a.s. as $n \to \infty$, where $\theta_n = \arg\sup_{\theta \in \Theta} D_n(\theta)$.

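We need the following lemmas in order to prove our main result.

Lemma 2.1 Under (A1)-(A5),
\[
\sup_{\theta \in \Theta} \frac{1}{2T} \left\{ \sum_{i=1}^{n} \bigl[ v(\theta, X_{t_{i-1}}) + v(\theta, X_{t_i}) \bigr] \Delta W_i - \frac{h}{2} \sum_{i=1}^{n} \bigl[ \dot{v}(\theta, X_{t_{i-1}}) + \dot{v}(\theta, X_{t_i}) \bigr] \right\} \to 0 \quad \text{a.s.}
\]
as $T \to \infty$, $T/n \to 0$.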

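Proof. Let $v(\theta, x) := f(\theta, x) - f(\theta_0, x)$. Let the Fourier expansion of $v(\theta, x)$ in $L(\Theta)$ be given by
\[
v(\theta, x) = \sum_{m=1}^{\infty} a_m(x)\, e^{\pi j m \theta}, \qquad j = \sqrt{-1}, \quad x \in \mathbb{R},
\]
where $a_m(x)$ are the Fourier coefficients. Thus
\[
\begin{aligned}
&\frac{1}{2T} \left\{ \sum_{i=1}^{n} \bigl[ v(\theta, X_{t_{i-1}}) + v(\theta, X_{t_i}) \bigr] \Delta W_i - \frac{h}{2} \sum_{i=1}^{n} \bigl[ \dot{v}(\theta, X_{t_{i-1}}) + \dot{v}(\theta, X_{t_i}) \bigr] \right\} \\
&\quad = \frac{1}{2T} \left\{ \sum_{m=1}^{\infty} \sum_{i=1}^{n} \bigl[ a_m(X_{t_{i-1}}) + a_m(X_{t_i}) \bigr] e^{\pi j m \theta} \Delta W_i - \frac{h}{2} \sum_{m=1}^{\infty} \sum_{i=1}^{n} \bigl[ \dot{a}_m(X_{t_{i-1}}) + \dot{a}_m(X_{t_i}) \bigr] e^{\pi j m \theta} \right\},
\end{aligned}
\]
where
\[
|a_m(x)| \le c_m |x|, \qquad \sum_{m=1}^{\infty} m^{1+\gamma} c_m^4 < \infty.
\]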

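Let
\[
A_{m,n}(s) := \frac{1}{2} \sum_{i=1}^{n} \bigl[ a_m(X_{t_{i-1}}) + a_m(X_{t_i}) \bigr] I_{(t_{i-1}, t_i]}(s),
\]
where $I_{(t_{i-1}, t_i]}$, $i = 1, 2, \dots, n$, are indicator functions. Then
\[
\frac{1}{2} \sum_{i=1}^{n} \bigl[ a_m(X_{t_{i-1}}) + a_m(X_{t_i}) \bigr] \Delta W_i = \int_0^T A_{m,n}(s) \circ dW_s
\]
and
\[
\frac{h}{2} \sum_{i=1}^{n} \bigl[ \dot{a}_m(X_{t_{i-1}}) + \dot{a}_m(X_{t_i}) \bigr] = \int_0^T \dot{A}_{m,n}\,ds.
\]
But
\[
\int_0^T A_{m,n}(s) \circ dW_s - \frac{1}{2} \int_0^T \dot{A}_{m,n}\,ds = \int_0^T A_{m,n}(s)\,dW_s.
\]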

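By the exponential inequality for martingales, we have
\[
P\left\{ \int_0^T A_{m,n}(s)\,dW_s - \frac{\alpha}{2} \int_0^T A_{m,n}^2\,ds > \beta \right\} \le e^{-\alpha\beta}
\]
for any $\alpha, \beta > 0$. Thus
\[
P\left\{ \frac{1}{T} \int_0^T A_{m,n}(s)\,dW_s > \frac{\beta}{T} + \frac{\alpha}{2T} \int_0^T A_{m,n}^2\,ds \right\} \le e^{-\alpha\beta}
\]
and
\[
P\left\{ \left| \frac{1}{T} \int_0^T A_{m,n}(s)\,dW_s \right| > \frac{\beta}{T} + \frac{\alpha h}{8T} \sum_{i=1}^{n} \bigl[ a_m(X_{t_{i-1}}) + a_m(X_{t_i}) \bigr]^2 \right\} \le 2 e^{-\alpha\beta}.
\]
Since
\[
\frac{h}{2T} \sum_{i=1}^{n} \bigl[ a_m(X_{t_{i-1}}) + a_m(X_{t_i}) \bigr]^2 \le c_m^2\, \frac{h}{T} \sum_{i=1}^{n} \bigl[ X_{t_{i-1}}^2 + X_{t_i}^2 \bigr]
\]
and by (A3)
\[
\frac{h}{2T} \sum_{i=1}^{n} \bigl[ X_{t_{i-1}}^2 + X_{t_i}^2 \bigr] \to E(X_0^2) > 0 \quad \text{a.s.},
\]
there exists a random variable $V$ such that
\[
\frac{h}{2T} \sum_{i=1}^{n} \bigl[ X_{t_{i-1}}^2 + X_{t_i}^2 \bigr] < V \quad \text{a.s.}
\]
for all $T > 0$, $n = 1, 2, \dots$, where $P(V < \infty) = 1$. Denote
\[
Z_{m,n} := \frac{1}{t_n} \int_0^{t_n} A_{m,n}(s)\,dW_s.
\]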

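Recall that $T = t_n$. Choose
\[
\alpha := \frac{m^a}{t_n^{\delta}}, \qquad \beta := \frac{t_n^{\gamma}}{m^b},
\]
where $\delta < \gamma < 1$ and $\frac{1}{2} < b < \frac{1+\gamma}{2}$. Then
\[
P\left( |Z_{m,n}| > \frac{1}{t_n^{1-\gamma} m^b} + \frac{m^a c_m^2 V}{2 t_n^{\delta}} \right) < 2 e^{-m^{a-b} t_n^{\gamma-\delta}}.
\]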

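Thus
\[
\begin{aligned}
P\left( \sum_{m=1}^{\infty} Z_{m,n}^2 > \sum_{m=1}^{\infty} \left( \frac{1}{t_n^{1-\gamma} m^b} + \frac{m^a c_m^2 V}{2 t_n^{\delta}} \right)^2 \right)
&\le \sum_{m=1}^{\infty} P\left( Z_{m,n}^2 > \left( \frac{1}{t_n^{1-\gamma} m^b} + \frac{m^a c_m^2 V}{2 t_n^{\delta}} \right)^2 \right) \\
&= \sum_{m=1}^{\infty} P\left( |Z_{m,n}| > \frac{1}{t_n^{1-\gamma} m^b} + \frac{m^a c_m^2 V}{2 t_n^{\delta}} \right) \\
&\le 2 \sum_{m=1}^{\infty} e^{-m^{a-b} t_n^{\gamma-\delta}} \\
&\le 2 e^{-t_n^{\gamma-\delta}} \sum_{m=1}^{\infty} e^{-m^{a-b}}.
\end{aligned}
\]
Hence
\[
\sum_{n=1}^{\infty} P\left( \sum_{m=1}^{\infty} Z_{m,n}^2 > \sum_{m=1}^{\infty} \left( \frac{1}{t_n^{1-\gamma} m^b} + \frac{m^a c_m^2 V}{2 t_n^{\delta}} \right)^2 \right)
\le 2 \sum_{n=1}^{\infty} e^{-t_n^{\gamma-\delta}} \sum_{m=1}^{\infty} e^{-m^{a-b}} < \infty
\]
since $\gamma - \delta > 0$ and $a - b > 0$. The above implies
\[
\sum_{n=1}^{\infty} P\left( \sum_{m=1}^{\infty} Z_{m,n}^2 > \frac{2}{t_n^{2(1-\gamma)}} \sum_{m=1}^{\infty} m^{-2b} + \frac{V^2}{t_n^{2\delta}} \sum_{m} m^{2a} c_m^4 \right) < \infty.
\]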

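By the Borel-Cantelli lemma,
\[
\sum_{m=1}^{\infty} \left( \frac{1}{2 t_n} \sum_{i=1}^{n} \bigl[ a_m(X_{t_{i-1}}) + a_m(X_{t_i}) \bigr] \Delta W_i - \frac{h}{2 t_n} \sum_{i=1}^{n} \bigl[ \dot{a}_m(X_{t_{i-1}}) + \dot{a}_m(X_{t_i}) \bigr] \right)^2 \longrightarrow 0 \quad \text{a.s. as } n \to \infty.
\]
This completes the proof of the lemma.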
Lemma 2.2 Under (A1)-(A5), with probability one,
\[
\sup_{\theta \in \Theta} \left| \frac{1}{T} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \bigl[ f(\theta_0, X_s) - f(\theta_0, X_{t_{i-1}}) \bigr] v(\theta, X_{t_{i-1}})\,ds \right| \to 0.
\]
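Proof. For $m > 0$, we have
\[
E\left\{ \sup_{\theta \in \Theta} \left| \frac{1}{T} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \bigl[ f(\theta_0, X_s) - f(\theta_0, X_{t_{i-1}}) \bigr] v(\theta, X_{t_{i-1}})\,ds \right|^{2m} \right\}
= E\left\{ \sup_{\theta \in \Theta} \left| \frac{1}{T} \int_0^T G_n(s)\,ds \right|^{2m} \right\},
\]
where $G_n(s) := [f(\theta_0, X_s) - f(\theta_0, X_{t_{i-1}})]\,v(\theta, X_{t_{i-1}})$ if $t_{i-1} \le s \le t_i$. Hölder's inequality implies that
\[
\begin{aligned}
E\left\{ \sup_{\theta \in \Theta} \left| \frac{1}{T} \int_0^T G_n(s)\,ds \right|^{2m} \right\}
&\le T^{-2m} E\left\{ \sup_{\theta \in \Theta} T^{2m-1} \int_0^T |G_n(s)|^{2m}\,ds \right\} \\
&\le T^{-2m} E\left( \sup_{\theta \in \Theta} T^{2m-1} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} |f(\theta_0, X_s) - f(\theta_0, X_{t_{i-1}})|^{2m}\, |v(\theta, X_{t_{i-1}})|^{2m}\,ds \right) \\
&\le T^{-1} U_m \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} E\left( |f(\theta_0, X_s) - f(\theta_0, X_{t_{i-1}})|^{2m}\, |C(X_{t_{i-1}})|^{2m} \right) ds
\end{aligned}
\]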

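by condition (A2), where $U_m := \sup_{\theta \in \Theta} |\theta - \theta_0|^{2m} < \infty$. By the Cauchy-Schwarz inequality the above term is
\[
\begin{aligned}
&\le T^{-1} U_m \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \left( E|f(\theta_0, X_s) - f(\theta_0, X_{t_{i-1}})|^{4m} \right)^{1/2} \left( E|C(X_{t_{i-1}})|^{4m} \right)^{1/2} ds \\
&\le T^{-1} U_m K^{2m}(\theta_0) \left( E|C(X_0)|^{4m} \right)^{1/2} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \left( E|X_s - X_{t_{i-1}}|^{4m} \right)^{1/2} ds
\end{aligned}
\]
by condition (A2). Since $E|X_t - X_s|^{2m} \le M(t - s)^m$, from Gikhman and Skorohod (1975, p. 48), the above term is
\[
\begin{aligned}
&\le T^{-1} U_m K^{2m}(\theta_0) \left( E|C(X_0)|^{4m} \right)^{1/2} M^{1/2} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} (s - t_{i-1})^m\,ds \\
&= U_m K^{2m}(\theta_0) \left( E|C(X_0)|^{4m} M \right)^{1/2} T^{-1} \sum_{i=1}^{n} \frac{(\Delta t_i)^{m+1}}{m+1} \\
&\le \frac{U_m K^{2m}(\theta_0)}{m+1} \left( E|C(X_0)|^{4m} M \right)^{1/2} h^m n^{-m/2}, \qquad m > 4.
\end{aligned}
\]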

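Chebyshev's inequality and the above imply that for any $\epsilon > 0$,
\[
\sum_{n=1}^{\infty} P\left\{ \sup_{\theta \in \Theta} \left| \frac{1}{T} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \bigl[ f(\theta_0, X_s) - f(\theta_0, X_{t_{i-1}}) \bigr] v(\theta, X_{t_{i-1}})\,ds \right| > \epsilon \right\} < \infty.
\]
Hence the Borel-Cantelli lemma yields the result.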
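The arithmetic behind the last steps of this proof can be spelled out as follows. This is an added sketch that uses only the sampling scheme stated in Section 1, namely $\Delta t_i = h$, $T = nh$ and $T = d n^{1/2}$; the symbol $Y_n$ is introduced here as shorthand for the supremum in Lemma 2.2 and does not appear in the paper.

```latex
\begin{aligned}
\sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} (s - t_{i-1})^m\,ds &= \frac{n\,h^{m+1}}{m+1},
\qquad\text{so}\qquad
T^{-1} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} (s - t_{i-1})^m\,ds = \frac{h^m}{m+1}, \\
h = \frac{T}{n} = d\,n^{-1/2} \quad &\Longrightarrow \quad h^m = d^m n^{-m/2}, \\
P(Y_n > \epsilon) \le \epsilon^{-2m}\, E\,Y_n^{2m} \quad &\Longrightarrow \quad
\sum_{n=1}^{\infty} P(Y_n > \epsilon) \le \text{const} \cdot \epsilon^{-2m} \sum_{n=1}^{\infty} n^{-m/2} < \infty
\quad \text{whenever } m > 2.
\end{aligned}
```

The requirement $m > 4$ in the proof is therefore more than enough for the series to converge.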
Lemma 2.3 Under (A1)-(A5), with probability one,
\[
\frac{1}{T} \sum_{i=1}^{n} \bigl[ f(\theta, X_{t_{i-1}}) - f(\theta_0, X_{t_{i-1}}) \bigr]^2 \Delta t_i \to E|v(\theta, X_0)|^2
\]
uniformly in $\theta$ as $T \to \infty$, $T/n \to 0$.

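Proof. By the strong law of large numbers (ergodicity),
\[
\frac{1}{T} \int_0^T |v(\theta, X_s)|^2\,ds \to E|v(\theta, X_0)|^2
\]
a.s. as $T \to \infty$ for each $\theta \in \Theta$. Condition (A2) implies that
\[
\frac{1}{T} \int_0^T |v(\theta, X_s)|^2\,ds \le \frac{1}{T} |\theta - \theta_0|^2 \int_0^T |C(X_s)|^2\,ds \le \sup_{\theta \in \Theta} |\theta - \theta_0|^2\, \frac{1}{T} \int_0^T |C(X_s)|^2\,ds \le B
\]
almost surely for some random variable $B$ by (A1), (A2) and (A3). It also follows easily from (A1)-(A4) that
\[
\left| \frac{1}{T} \int_0^T |v(\theta_1, X_s)|^2\,ds - \frac{1}{T} \int_0^T |v(\theta_2, X_s)|^2\,ds \right| \le J |\theta_1 - \theta_2|
\]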

almost surely for some random variable $J$ and $\theta_1, \theta_2 \in \Theta$. Thus the family of functions
\[
\left\{ \frac{1}{T} \int_0^T |v(\cdot, X_s)|^2\,ds, \ T \ge 0 \right\}
\]
is equicontinuous. Hence by the Arzelà-Ascoli theorem, the convergence is uniform. Denote
\[
g_{2n}(\theta) := h \sum_{i=1}^{n} |v(\theta, X_{t_{i-1}})|^2.
\]

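Now it is enough to show that
\[
\frac{1}{T} \int_0^T |v(\theta, X_s)|^2\,ds - \frac{1}{T}\, g_{2n}(\theta) \to 0
\]
a.s. uniformly in $\theta$. We have
\[
\begin{aligned}
E\left\{ \sup_{\theta \in \Theta} \left| \int_0^T |v(\theta, X_s)|^2\,ds - g_{2n}(\theta) \right|^{2m} \right\}
&= E\left\{ \sup_{\theta \in \Theta} \left| \int_0^T |v(\theta, X_s)|^2\,ds - h \sum_{i=1}^{n} |v(\theta, X_{t_{i-1}})|^2 \right|^{2m} \right\} \\
&= E\left\{ \sup_{\theta \in \Theta} \left| \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \bigl( v(\theta, X_s) - v(\theta, X_{t_{i-1}}) \bigr) \bigl( v(\theta, X_s) + v(\theta, X_{t_{i-1}}) \bigr)\,ds \right|^{2m} \right\}.
\end{aligned}
\]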

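Hölder's inequality implies that the above expectation is
\[
\begin{aligned}
&\le T^{2m-1} E \sup_{\theta \in \Theta} \sum_{i=1}^{n} \left\{ \int_{t_{i-1}}^{t_i} |v(\theta, X_s) - v(\theta, X_{t_{i-1}})|^{2m}\, |v(\theta, X_s) + v(\theta, X_{t_{i-1}})|^{2m}\,ds \right\} \\
&\le T^{2m-1} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} E\Bigl[ \sup_{\theta \in \Theta} |v(\theta, X_s) - v(\theta, X_{t_{i-1}})|^{2m} \sup_{\theta \in \Theta} |v(\theta, X_s) + v(\theta, X_{t_{i-1}})|^{2m} \Bigr] ds \\
&\le T^{2m-1} K^{2m} 2^{2m} U_m \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} E\Bigl[ |X_s - X_{t_{i-1}}|^{2m} \bigl( |C(X_s)|^{2m} + |C(X_{t_{i-1}})|^{2m} \bigr) \Bigr] ds \\
&\le T^{2m-1} K^{2m} 2^{2m+1} U_m \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \left( E|X_s - X_{t_{i-1}}|^{4m} \right)^{1/2} \left( E|C(X_s)|^{4m} + E|C(X_{t_{i-1}})|^{4m} \right)^{1/2} ds \\
&\le T^{2m-1} K^{2m} 2^{2m+1} U_m M^{1/2} \left( E|C(X_0)|^{4m} \right)^{1/2} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} (s - t_{i-1})^m\,ds \qquad \text{(by stationarity)} \\
&\le R_m T^{2m-1} n (T/n)^{m+1},
\end{aligned}
\]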

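where $U_m := \sup_{\theta \in \Theta} |\theta - \theta_0|^{2m} < \infty$ and $R_m := K^{2m} 2^{2m+2} U_m M^{1/2} (E|C(X_0)|^{4m})^{1/2}$. Hence if $m > 4$,
\[
E\left\{ \sup_{\theta \in \Theta} \left| \frac{1}{T} \int_0^T |v(\theta, X_s)|^2\,ds - \frac{1}{T}\, g_{2n}(\theta) \right|^{2m} \right\} \le R_m (T/n)^m \le R_m h^{m/2} n^{-m/2}.
\]
A Borel-Cantelli argument yields the result.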
Now we are ready to present the main result of the paper:

Theorem 2.2 Under the conditions (A1)-(A5), the SAMLE is strongly consistent, i.e.,
\[
\tilde{\theta}_{n,T} \to \theta_0 \quad \text{a.s. as } T \to \infty, \ T/n \to 0.
\]
Proof. Let
\[
\tilde{l}_{n,T}(\theta) := \log \tilde{L}_{n,T}(\theta)
\]
and
\[
v(\theta, x) := f(\theta, x) - f(\theta_0, x).
\]

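Note that
\[
\begin{aligned}
\frac{1}{T} \left[ \tilde{l}_{n,T}(\theta) - \tilde{l}_{n,T}(\theta_0) \right]
&= \frac{1}{2T} \sum_{i=1}^{n} \bigl[ f(\theta, X_{t_{i-1}}) + f(\theta, X_{t_i}) \bigr](X_{t_i} - X_{t_{i-1}}) \\
&\quad - \frac{1}{2T} \sum_{i=1}^{n} \bigl[ f(\theta_0, X_{t_{i-1}}) + f(\theta_0, X_{t_i}) \bigr](X_{t_i} - X_{t_{i-1}}) \\
&\quad - \frac{1}{2n} \sum_{i=1}^{n} \bigl[ \dot{f}(\theta, X_{t_{i-1}}) - \dot{f}(\theta_0, X_{t_{i-1}}) \bigr] \\
&\quad - \frac{1}{2n} \sum_{i=1}^{n} \bigl[ f^2(\theta, X_{t_{i-1}}) - f^2(\theta_0, X_{t_{i-1}}) \bigr] \\
&= \frac{1}{2T} \left\{ \sum_{i=1}^{n} \bigl[ v(\theta, X_{t_{i-1}}) + v(\theta, X_{t_i}) \bigr] \Delta W_i - h \sum_{i=1}^{n} \dot{v}(\theta, X_{t_{i-1}}) \right\} \\
&\quad - \frac{1}{2n} \sum_{i=1}^{n} v^2(\theta, X_{t_{i-1}}) \\
&\quad - \frac{1}{T} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} v(\theta, X_{t_{i-1}}) \bigl[ f(\theta_0, X_t) + f(\theta_0, X_{t_{i-1}}) \bigr] dt \\
&\quad - \frac{1}{T} \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \bigl[ v(\theta, X_{t_i}) f(\theta_0, X_t) - v(\theta_0, X_{t_{i-1}}) f(\theta_0, X_{t_{i-1}}) \bigr] dt \\
&=: I_1 - I_2 - I_3 - I_4.
\end{aligned}
\]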

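Let
\[
D_{n,T}(\theta) := \frac{1}{T} \left[ \tilde{l}_{n,T}(\theta) - \tilde{l}_{n,T}(\theta_0) \right].
\]
Lemmas 2.1-2.3 show that
\[
D_{n,T}(\theta) \to D(\theta) \quad \text{a.s. as } T \to \infty, \ T/n \to 0,
\]
where
\[
D(\theta) := -\frac{1}{2} E|f(\theta, X_0) - f(\theta_0, X_0)|^2.
\]
Thus condition (C1) of Theorem 2.1 is satisfied. The limiting function $D(\theta)$ satisfies conditions (C2) and (C3) of Theorem 2.1: indeed, $D(\theta) \le 0 = D(\theta_0)$ for all $\theta \in \Theta$, and by (A4) equality holds iff $\theta = \theta_0$. Hence, as a consequence of Theorem 2.1, we obtain the result.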

3. Ornstein-Uhlenbeck Process

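Consider the Ornstein-Uhlenbeck process satisfying
\[
dX_t = \theta X_t\,dt + dW_t, \quad t \ge 0, \quad X_0 = 0, \quad \theta < 0.
\]
The Euler estimator (conditional least squares estimator) is given by
\[
\check{\theta}_{n,T} = \frac{\sum_{i=1}^{n} X_{t_{i-1}} (X_{t_i} - X_{t_{i-1}})}{h \sum_{i=1}^{n} X_{t_{i-1}}^2}.
\]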

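Strong consistency of this estimator is obtained in Kasonga (1988). As a consequence of Theorem 2.2, we obtain the strong consistency of three estimators with
\[
\tilde{\theta}_{n,T} = \frac{(X_T^2 - T)/2}{h \sum_{i=1}^{n} X_{t_{i-1}}^2}, \qquad
\bar{\theta}_{n,T,2} = \frac{X_T^2/2}{h \sum_{i=1}^{n} X_{t_{i-1}}^2}, \qquad
\hat{\theta}_{n,T,3} = \frac{-T/2}{h \sum_{i=1}^{n} X_{t_{i-1}}^2},
\]
which are the SAMLE, the YAMLE (Young AMLE) and the AMCE respectively, as $T \to \infty$ and $T/n \to 0$. The SAMLE is a linear combination (here, the sum) of the AMCE and the YAMLE.

Define the continuous MLE, YMLE and MCE respectively as
\[
\theta_{T,1} = \frac{\int_0^T X_t\,dX_t}{\int_0^T X_t^2\,dt}, \qquad
\theta_{T,2} = \frac{X_T^2/2}{\int_0^T X_t^2\,dt}, \qquad
\theta_{T,3} = \frac{-T/2}{\int_0^T X_t^2\,dt}.
\]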
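The closed form of the SAMLE for the Ornstein-Uhlenbeck drift can be recovered directly from (1.4). The derivation below is an added worked example; it uses $f(\theta, x) = \theta x$, $\dot{f}(\theta, x) = \theta$, $X_0 = 0$ and $nh = T$, and ignores the constraint $\theta \in \Theta$ when maximizing.

```latex
\begin{aligned}
\log \tilde{L}_{n,T}(\theta)
&= \frac{\theta}{2} \sum_{i=1}^{n} (X_{t_{i-1}} + X_{t_i})(X_{t_i} - X_{t_{i-1}})
   - \frac{h}{2} \sum_{i=1}^{n} \bigl( \theta + \theta^2 X_{t_{i-1}}^2 \bigr) \\
&= \frac{\theta}{2} \bigl( X_T^2 - X_0^2 \bigr) - \frac{\theta T}{2}
   - \frac{\theta^2 h}{2} \sum_{i=1}^{n} X_{t_{i-1}}^2, \\
\frac{\partial}{\partial \theta} \log \tilde{L}_{n,T}(\theta) = 0
&\iff \theta = \frac{(X_T^2 - T)/2}{h \sum_{i=1}^{n} X_{t_{i-1}}^2}.
\end{aligned}
```

The first sum telescopes because $(X_{t_{i-1}} + X_{t_i})(X_{t_i} - X_{t_{i-1}}) = X_{t_i}^2 - X_{t_{i-1}}^2$, and the log-likelihood is a concave quadratic in $\theta$, so the stationary point is the maximizer.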

Interpreting $\int_0^T X_t\,dX_t$ to be the Young (1936) integral, it equals $X_T^2/2$. Belfadli et al. (2011) (see also El Machkouri et al. (2016)) obtained the strong consistency of $\theta_{T,2}$ as $T \to \infty$. The YAMLE $\bar{\theta}_{n,T,2}$ is the Euler discretization of $\theta_{T,2}$. Lánska (1979) obtained strong consistency of the MCE $\theta_{T,3}$ as $T \to \infty$, whose Euler discretization is $\hat{\theta}_{n,T,3}$. Liptser and Shiryayev (1978) obtained strong consistency of the MLE $\theta_{T,1}$ as $T \to \infty$, whose Euler discretization is $\check{\theta}_{n,T,1}$.
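The discrete estimators of this section are simple enough to compute in a few lines. The sketch below is an illustration under stated assumptions rather than code from the paper: the function names, the dictionary keys, the seed and the parameter values are hypothetical, and the path is generated with the exact Gaussian transition of the Ornstein-Uhlenbeck process instead of an Euler scheme.

```python
import math
import random

def simulate_ou_exact(theta, n, h, x0=0.0, seed=0):
    """Sample X at t_i = i*h using the exact Gaussian transition (theta < 0)."""
    rng = random.Random(seed)
    decay = math.exp(theta * h)
    sd = math.sqrt((1.0 - math.exp(2.0 * theta * h)) / (-2.0 * theta))
    path = [x0]
    for _ in range(n):
        path.append(decay * path[-1] + sd * rng.gauss(0.0, 1.0))
    return path

def ou_estimators(path, h):
    """Euler (CLSE), YAMLE, AMCE and SAMLE for the drift theta * x, X_0 = 0."""
    n = len(path) - 1
    T = n * h
    denom = h * sum(x * x for x in path[:-1])
    x_T = path[-1]
    return {
        "euler": sum(a * (b - a) for a, b in zip(path, path[1:])) / denom,
        "yamle": 0.5 * x_T ** 2 / denom,
        "amce": -0.5 * T / denom,
        "samle": 0.5 * (x_T ** 2 - T) / denom,
    }

n = 10000
h = 1.0 / math.sqrt(n)  # h = T/n with T = d*sqrt(n), d = 1
est = ou_estimators(simulate_ou_exact(-1.0, n, h, seed=7), h)
toy = ou_estimators([0.0, 1.0, 3.0], 0.5)  # made-up path for a hand check
```

By construction the value under `"samle"` is the sum of the values under `"yamle"` and `"amce"`, which mirrors the linear combination noted above.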

Concluding Remark It would be interesting to extend the results of the paper to diffusions driven by persistent fractional Brownian motion, which are neither Markov processes nor semimartingales but preserve the long memory property of the model.
References

[1] R. Belfadli, K. Es-Sebaiy, Y. Ouknine, Parameter estimation for fractional Ornstein-Uhlenbeck processes: non-ergodic case, Front. Sci. Eng. 1 (2011) 41-56. https://doi.org/10.34874/IMIST.PRSM/fsejournal-v1i1.26873.
[2] J.P.N. Bishwal, Parameter estimation in stochastic differential equations, Springer Berlin Heidelberg, Berlin, Heidelberg, 2008. https://doi.org/10.1007/978-3-540-74448-1.
[3] J.P.N. Bishwal, Berry-Esseen inequalities for discretely observed diffusions, Monte Carlo Methods and Applications 15 (2009a) 229-239. https://doi.org/10.1515/MCMA.2009.013.
[4] J.P.N. Bishwal, M-Estimation for discretely sampled diffusions, Theory Stoch. Processes 15 (31) (2) (2009b) 62-83.
[5] J.P.N. Bishwal, Uniform rate of weak convergence of the minimum contrast estimator in the Ornstein-Uhlenbeck process, Methodol. Comput. Appl. Probab. 12 (2010a) 323-334. https://doi.org/10.1007/s11009-008-9099-x.
[6] J.P.N. Bishwal, Conditional least squares estimation in diffusion processes based on Poisson sampling, J. Appl. Probab. Stat. 5 (2) (2010b) 169-180.
[7] J.P.N. Bishwal, Milstein approximation of posterior density of diffusions, Int. J. Pure Appl. Math. 68 (4) (2011a) 403-414.
[8] J.P.N. Bishwal, Some new estimators of integrated volatility, Amer. Open J. Stat. 1 (2) (2011b) 74-80.
[9] J.P.N. Bishwal, Stochastic moment problem and hedging of generalized Black-Scholes options, Appl. Numer. Math. 61 (2011c) 1271-1280. https://doi.org/10.1016/j.apnum.2011.08.005.
[10] J.P.N. Bishwal, Parameter estimation in stochastic volatility models, Springer Nature Switzerland AG (forthcoming) (2021).
[11] J.P.N. Bishwal, A. Bose, Rates of convergence of approximate maximum likelihood estimators in the Ornstein-Uhlenbeck process, Computers Math. Appl. 42 (2001) 23-38. https://doi.org/10.1016/S0898-1221(01)00127-4.
[12] M. El Machkouri, K. Es-Sebaiy, Y. Ouknine, Least squares estimator for non-ergodic Ornstein-Uhlenbeck processes driven by Gaussian processes, J. Korean Stat. Soc. 45 (2016) 329-341.
[13] D. Florens-Zmirou, Approximate discrete-time schemes for statistics of diffusion processes, Statistics 20 (1989) 547-557. https://doi.org/10.1080/02331888908802205.
[14] R. Frydman, A proof of the consistency of maximum likelihood estimators of nonlinear regression models with autocorrelated errors, Econometrica 48 (1980) 853-860. https://doi.org/10.2307/1912936.
[15] E. Gobet, N. Landon, Almost sure optimal hedging strategy, Ann. Appl. Probab. 24 (2014). https://doi.org/10.1214/13-AAP959.
[16] N. Ikeda, S. Watanabe, Stochastic differential equations and diffusion processes, Second Edition, North-Holland, Amsterdam (Kodansha Ltd., Tokyo), 1989.
[17] R.L. Karandikar, On pathwise stochastic integration, Stoch. Processes Appl. 57 (1995) 11-18. https://doi.org/10.1016/0304-4149(95)00002-O.
[18] R.A. Kasonga, The consistency of a non-linear least squares estimator from diffusion processes, Stoch. Processes Appl. 30 (1988) 263-275. https://doi.org/10.1016/0304-4149(88)90088-9.
[19] Y.A. Kutoyants, Statistical inference for ergodic diffusion processes, Springer London, London, 2004. https://doi.org/10.1007/978-1-4471-3866-2.
[20] V. Lánska, Minimum contrast estimation in diffusion processes, J. Appl. Probab. 16 (1979) 65-75. https://doi.org/10.2307/3213375.
[21] R.S. Liptser, A.N. Shiryayev, Statistics of Random Processes I: General Theory, Springer-Verlag, Berlin, 1977.
[22] R.S. Liptser, A.N. Shiryayev, Statistics of Random Processes II: Applications, Springer-Verlag, Berlin, 1978.
[23] I. Shoji, A note on asymptotic properties of estimator derived from the Euler method for diffusion processes at discrete times, Stat. Probab. Lett. 36 (1997) 153-159. https://doi.org/10.1016/S0167-7152(97)00058-8.
[24] N. Yoshida, Estimation for diffusion processes from discrete observation, J. Multivar. Anal. 41 (1992) 220-242. https://doi.org/10.1016/0047-259X(92)90068-Q.
[25] L.C. Young, An inequality of the Hölder type, connected with Stieltjes integration, Acta Math. 67 (1936) 251-282. https://doi.org/10.1007/BF02401743.