International Journal of Analysis and Applications

Volume 17, Number 5 (2019), 686-710

URL: https://doi.org/10.28924/2291-8639

DOI: 10.28924/2291-8639-17-2019-686

ESTIMATION OF DIFFERENT ENTROPIES VIA TAYLOR ONE POINT AND

TAYLOR TWO POINTS INTERPOLATIONS USING JENSEN TYPE FUNCTIONALS

TASADDUQ NIAZ^{1,2,∗}, KHURAM ALI KHAN^{1}, ĐILDA PEČARIĆ^{3}, JOSIP PEČARIĆ^{4}

^{1}Department of Mathematics, University of Sargodha, Sargodha 40100, Pakistan

^{2}Department of Mathematics, The University of Lahore, Sargodha-Campus, Sargodha 40100, Pakistan

^{3}Catholic University of Croatia, Ilica 242, Zagreb, Croatia

^{4}RUDN University, Miklukho-Maklaya str. 6, 117198 Moscow, Russia

∗Corresponding author: tasadduq khan@yahoo.com

Abstract. In this work, we estimate several entropies and divergences, namely the Shannon entropy, the Rényi divergence and the Csiszár divergence, by using Jensen-type functionals. The Zipf-Mandelbrot law and the hybrid Zipf-Mandelbrot law are used to estimate the Shannon entropy. Further, Taylor one-point and Taylor two-point interpolations are used to generalize the new inequalities to m-convex functions.

1. Introduction and preliminary results

In numerical analysis, interpolation is a method of constructing new data points within the range of a discrete set of known data points. For example, an experiment may yield a number of data points that represent the values of a function at a limited number of values of the independent variable. It is then usually required to interpolate, that is, to estimate the value of the function at an intermediate value of the independent variable. Many interpolating polynomials can be found in the literature, for example the Taylor polynomial and the Lidstone polynomial.

Received 2019-05-27; accepted 2019-07-16; published 2019-09-02.

2010 Mathematics Subject Classification. 26D07, 94A17.

Key words and phrases. m-convex function; Jensen's inequality; Shannon entropy; f- and Rényi divergence; Taylor interpolation; entropy.

©2019 Authors retain the copyrights of their papers, and all open access articles are distributed under the terms of the Creative Commons Attribution License.



Int. J. Anal. Appl. 17 (5) (2019) 687

The most commonly used words, the largest cities of countries and the incomes of billionaires can be described in terms of Zipf's law. An f-divergence measures the distance between two probability distributions by taking an average value weighted by a specified function. Besides the f-divergence there are other divergences between probability distributions, such as the Csiszár f-divergence [15, 16], a special case of which is the Kullback-Leibler divergence, used to find an appropriate distance between probability distributions (see [19, 20]). The notion of distance is stronger than that of divergence because a distance also satisfies symmetry and the triangle inequality. Probability theory has applications in many fields, and divergences between probability distributions have many applications in these fields.

Many natural phenomena, such as the distribution of wealth and income in a society, the distribution of Facebook likes and the distribution of football goals, follow a power law distribution (Zipf's law). Like these phenomena, the distribution of city sizes also follows a power law distribution. Auerbach [2] was the first to suggest that the distribution of city sizes can be well approximated by a Pareto distribution (power law distribution). This idea was refined by many researchers, but Zipf [28] made the most significant contribution in this field. The distribution of city sizes has been investigated by many scholars of urban economics, such as Rosen and Resnick [26], Black and Henderson [3], Ioannides and Overman [14], Soo [27], Anderson and Ge [1] and Bosker et al. [4]. Zipf's law states that: “The rank of cities with a certain number of inhabitants varies proportional to the city sizes with some negative exponent, say that is close to unit”. In other words, Zipf's law states that the product of city sizes and their ranks appears roughly constant. This indicates that the population of the second largest city is one half of the population of the largest city, the population of the third largest city is one third of it, and the population of the n-th city is 1/n of the population of the largest city. This rule is called the rank-size rule and is also known as Zipf's law. Hence Zipf's law not only shows that the city size distribution follows the Pareto distribution, but also shows that the estimated value of the shape parameter is equal to unity.

In [17] L. Horváth et al. introduced some new functionals based on the f-divergence functionals and obtained some estimates for the new functionals. They obtained estimates for the f-divergence and the Rényi divergence by applying a cyclic refinement of Jensen's inequality. They also constructed some new inequalities for the Rényi and Shannon entropies and used the Zipf-Mandelbrot law to illustrate the results.

Inequalities involving higher order convexity have been used by many physicists in higher dimensional problems since the introduction of higher order convexity by T. Popoviciu (see [24, p. 15]). It is an interesting fact that some results that are true for convex functions do not remain valid for higher order convexity.

In [24, p. 16], the following criterion is given to check the m-convexity of a function.

If $f^{(m)}$ exists, then f is m-convex if and only if $f^{(m)} \ge 0$.

In recent years many researchers have generalized inequalities for m-convex functions; for example, S. I. Butt et al. generalized the Popoviciu inequality for m-convex functions using Taylor's formula, the Lidstone polynomial, the Montgomery identity, Fink's identity, Abel-Gontscharoff interpolation and the Hermite interpolating polynomial (see [5–9]).

In [23] T. Niaz et al. generalized the refinement of Jensen's inequality for m-convex functions using the Abel-Gontscharoff Green function and Fink's identity. In [18] K. A. Khan et al. used the refinement of Jensen's inequality to introduce new functionals based on an f-divergence functional, and estimated some bounds for the new functionals, the f-divergence and the Rényi divergence. They also constructed some new inequalities for Rényi and Shannon estimates, and generalized the new inequalities for m-convex functions using the Montgomery identity. Further, they used the hybrid Zipf-Mandelbrot law to estimate the Shannon entropy.

Jensen's inequality has been of great interest for many years. Researchers have given refinements of Jensen's inequality by defining some new functions (see [12, 13]). Like many researchers, L. Horváth and J. Pečarić in [10, 13] (see also [11, p. 26]) gave a refinement of Jensen's inequality for convex functions. They defined some essential notions to prove the refinement, given as follows:

Let X be a set, and:

P(X) := power set of X,

|X| := number of elements of X,

N := set of natural numbers including 0.

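Let $q \ge 1$ and $r \ge 2$ be fixed integers. Define the functions

\[ F_{r,s} : \{1,\dots,q\}^{r} \to \{1,\dots,q\}^{r-1}, \quad 1 \le s \le r, \]

\[ F_{r} : \{1,\dots,q\}^{r} \to P\left(\{1,\dots,q\}^{r-1}\right), \]

and

\[ T_{r} : P\left(\{1,\dots,q\}^{r}\right) \to P\left(\{1,\dots,q\}^{r-1}\right), \]

by

\[ F_{r,s}(i_1,\dots,i_r) := (i_1,i_2,\dots,i_{s-1},i_{s+1},\dots,i_r), \quad 1 \le s \le r, \]

\[ F_{r}(i_1,\dots,i_r) := \bigcup_{s=1}^{r} \{F_{r,s}(i_1,\dots,i_r)\}, \]

and

\[ T_{r}(I) = \begin{cases} \emptyset, & I = \emptyset; \\ \bigcup_{(i_1,\dots,i_r)\in I} F_{r}(i_1,\dots,i_r), & I \ne \emptyset. \end{cases} \]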

Next, let the function

\[ \alpha_{r,i} : \{1,\dots,q\}^{r} \to N, \quad 1 \le i \le q, \]

be defined by

$\alpha_{r,i}(i_1,\dots,i_r)$ is the number of occurrences of i in the sequence $(i_1,\dots,i_r)$.

For each $I \in P(\{1,\dots,q\}^{r})$ let

\[ \alpha_{I,i} := \sum_{(i_1,\dots,i_r)\in I} \alpha_{r,i}(i_1,\dots,i_r), \quad 1 \le i \le q. \]
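The index maps above are purely combinatorial, so they can be illustrated with a few lines of Python. The following is only a minimal sketch and not part of the original construction: tuples stand for elements of $\{1,\dots,q\}^{r}$, and the function names `F_rs`, `F_r`, `T_r`, `alpha_r` and `alpha_I` are hypothetical names chosen here for illustration.

```python
def F_rs(idx, s):
    """F_{r,s}: delete the s-th entry (1-based) of the r-tuple idx."""
    return idx[:s - 1] + idx[s:]


def F_r(idx):
    """F_r: the set of all (r-1)-tuples obtained by deleting one entry."""
    return {F_rs(idx, s) for s in range(1, len(idx) + 1)}


def T_r(I):
    """T_r: union of F_r over the tuples in I; the empty set maps to itself."""
    out = set()
    for idx in I:
        out |= F_r(idx)
    return out


def alpha_r(idx, i):
    """alpha_{r,i}: number of occurrences of i in the tuple idx."""
    return idx.count(i)


def alpha_I(I, i):
    """alpha_{I,i}: occurrences of i summed over all tuples of I."""
    return sum(alpha_r(idx, i) for idx in I)


# Illustrative index set (an assumption of this sketch, not from the paper).
I3 = {(1, 2, 3), (1, 1, 2)}
I2 = T_r(I3)
print(sorted(I2))                            # [(1, 1), (1, 2), (1, 3), (2, 3)]
print([alpha_I(I3, i) for i in (1, 2, 3)])   # [3, 2, 1]
```

Representing index sets as Python sets of tuples keeps the union in $T_r$ a one-line operation and makes duplicate tuples collapse automatically, as they do in the set-theoretic definition.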

(H1) Let n, m be fixed positive integers such that $n \ge 1$, $m \ge 2$, and let $I_m$ be a subset of $\{1,\dots,n\}^{m}$ such that

\[ \alpha_{I_m,i} \ge 1, \quad 1 \le i \le n. \]

Introduce the sets $I_l \subset \{1,\dots,n\}^{l}$ $(m-1 \ge l \ge 1)$ inductively by

\[ I_{l-1} := T_{l}(I_l), \quad m \ge l \ge 2. \]

Obviously $I_1 = \{1,\dots,n\}$ by (H1), and this ensures that $\alpha_{I_1,i} = 1$ $(1 \le i \le n)$. From (H1) we have $\alpha_{I_l,i} \ge 1$ $(m-1 \ge l \ge 1,\ 1 \le i \le n)$.

For $m \ge l \ge 2$ and for any $(j_1,\dots,j_{l-1}) \in I_{l-1}$, let

\[ H_{I_l}(j_1,\dots,j_{l-1}) := \left\{ \left((i_1,\dots,i_l),k\right) \in I_l \times \{1,\dots,l\} \;\middle|\; F_{l,k}(i_1,\dots,i_l) = (j_1,\dots,j_{l-1}) \right\}. \]

With the help of these sets they define the functions $\eta_{I_m,l} : I_l \to N$ $(m \ge l \ge 1)$ inductively by

\[ \eta_{I_m,m}(i_1,\dots,i_m) := 1, \quad (i_1,\dots,i_m) \in I_m; \]

\[ \eta_{I_m,l-1}(j_1,\dots,j_{l-1}) := \sum_{((i_1,\dots,i_l),k)\in H_{I_l}(j_1,\dots,j_{l-1})} \eta_{I_m,l}(i_1,\dots,i_l). \]

They define some special expressions for $1 \le l \le m$ as follows:

\[ A_{m,l} = A_{m,l}(I_m,x_1,\dots,x_n,p_1,\dots,p_n;f) := \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}} \right) f\left( \frac{\sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}} x_{i_j}}{\sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}}} \right) \]

and prove the following theorem.

Theorem 1.1. Assume (H1), and let $f : I \to R$ be a convex function, where $I \subset R$ is an interval. If $x_1,\dots,x_n \in I$ and $p_1,\dots,p_n$ are positive real numbers such that $\sum_{i=1}^{n} p_i = 1$, then

\[ f\left( \sum_{s=1}^{n} p_s x_s \right) \le A_{m,m} \le A_{m,m-1} \le \dots \le A_{m,2} \le A_{m,1} = \sum_{s=1}^{n} p_s f(x_s). \tag{1.1} \]
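A small numerical experiment can make the chain (1.1) concrete. The sketch below builds the sets $I_l$ and the weights $\eta_{I_m,l}$ level by level and evaluates every $A_{m,l}$ for a convex f. It is an illustration under stated assumptions, not the authors' implementation: the name `refinement_terms`, the test data and the index set are hypothetical, and the sketch normalises each level by the factor $(l-1)!/(m-1)!$, which is the choice for which the weights sum to one and $A_{m,1}$ reduces to $\sum_s p_s f(x_s)$ in the check below.

```python
from math import factorial


def refinement_terms(I_m, x, p, f):
    """Return [A_{m,1}, ..., A_{m,m}] for a set I_m of m-tuples over 1..n."""
    m = len(next(iter(I_m)))
    n = len(x)
    # alpha[i] = alpha_{I_m,i}: occurrences of i over all tuples of I_m.
    alpha = {i: sum(t.count(i) for t in I_m) for i in range(1, n + 1)}
    if min(alpha.values()) < 1:
        raise ValueError("(H1) requires every index 1..n to occur in I_m")
    eta = {t: 1 for t in I_m}  # eta_{I_m,m} = 1 on I_m
    A = {}
    for l in range(m, 0, -1):
        total = 0.0
        for t, e in eta.items():
            w = sum(p[i - 1] / alpha[i] for i in t)
            mean = sum(p[i - 1] / alpha[i] * x[i - 1] for i in t) / w
            total += e * w * f(mean)
        # ASSUMPTION of this sketch: normalising factor (l-1)!/(m-1)!.
        A[l] = factorial(l - 1) / factorial(m - 1) * total
        if l > 1:
            # eta_{I_m,l-1}(j): sum eta_{I_m,l}(t) over all (t, k) with F_{l,k}(t) = j.
            nxt = {}
            for t, e in eta.items():
                for k in range(l):
                    j = t[:k] + t[k + 1:]
                    nxt[j] = nxt.get(j, 0) + e
            eta = nxt
    return [A[l] for l in range(1, m + 1)]


# Hypothetical data: n = 3, m = 3, f(t) = t^2.
x = [1.0, 2.0, 4.0]
p = [0.2, 0.3, 0.5]
f = lambda t: t * t
A = refinement_terms({(1, 2, 3), (1, 1, 2)}, x, p, f)
jensen_left = f(sum(ps * xs for ps, xs in zip(p, x)))
jensen_right = sum(ps * f(xs) for ps, xs in zip(p, x))
print(jensen_left, A[::-1], jensen_right)
```

The printed list runs from $A_{m,m}$ to $A_{m,1}$, so for a convex f the numbers should be non-decreasing from left to right, with the Jensen bounds at the two ends.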


We define the following functionals by taking differences in the refinement of Jensen's inequality given in (1.1):

\[ \Theta_1(f) = A_{m,r} - f\left( \sum_{s=1}^{n} p_s x_s \right), \quad r = 1,\dots,m, \tag{1.2} \]

\[ \Theta_2(f) = A_{m,r} - A_{m,k}, \quad 1 \le r < k \le m. \tag{1.3} \]

Under the assumptions of Theorem 1.1, we have

\[ \Theta_i(f) \ge 0, \quad i = 1, 2. \tag{1.4} \]

Inequalities (1.4) are reversed if f is concave on I.

2. Inequalities for Csiszár divergence

In [15, 16] Csiszár introduced the following notion.

Definition 2.1. Let $f : R_+ \to R_+$ be a convex function, and let $r = (r_1,\dots,r_n)$ and $q = (q_1,\dots,q_n)$ be positive probability distributions. Then the f-divergence functional is defined by

\[ I_f(r,q) := \sum_{i=1}^{n} q_i f\left( \frac{r_i}{q_i} \right). \tag{2.1} \]

He also stated that by defining

\[ f(0) := \lim_{x\to 0^+} f(x); \quad 0 f\left(\frac{0}{0}\right) := 0; \quad 0 f\left(\frac{a}{0}\right) := \lim_{x\to 0^+} x f\left(\frac{a}{x}\right), \quad a > 0, \tag{2.2} \]

nonnegative probability distributions can be used as well.

In [17], L. Horváth et al. gave the following functional on the basis of the previous definition.

Definition 2.2. Let $I \subset R$ be an interval and let $f : I \to R$ be a function. Let $r = (r_1,\dots,r_n) \in R^n$ and $q = (q_1,\dots,q_n) \in (0,\infty)^n$ be such that

\[ \frac{r_s}{q_s} \in I, \quad s = 1,\dots,n. \]

Then they define the sum $\hat{I}_f(r,q)$ as

\[ \hat{I}_f(r,q) := \sum_{s=1}^{n} q_s f\left( \frac{r_s}{q_s} \right). \tag{2.3} \]
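The functional (2.3) is a plain weighted sum, which the following short Python sketch evaluates. The function name `csiszar_functional` and the sample vectors are assumptions made for illustration. The choice $f(t) = t \ln t$ is used because it turns $\hat{I}_f(r,q)$ into $\sum_s r_s \ln(r_s/q_s)$, which gives a simple consistency check.

```python
import math


def csiszar_functional(r, q, f):
    """Evaluate sum_s q_s * f(r_s / q_s); every q_s must be positive."""
    return sum(qs * f(rs / qs) for rs, qs in zip(r, q))


# Hypothetical sample distributions.
r = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

# With f(t) = t ln t, each term q_s f(r_s/q_s) equals r_s ln(r_s/q_s).
value = csiszar_functional(r, q, lambda t: t * math.log(t))
direct = sum(rs * math.log(rs / qs) for rs, qs in zip(r, q))
print(value, direct)
```

Passing f as a callable keeps the sketch generic, so the same routine can be reused with other convex functions.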

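We apply Theorem 1.1 to $\hat{I}_f(r,q)$.

Theorem 2.1. Assume (H1), let $I \subset R$ be an interval and let $r = (r_1,\dots,r_n)$ and $q = (q_1,\dots,q_n)$ be in $(0,\infty)^n$ such that

\[ \frac{r_s}{q_s} \in I, \quad s = 1,\dots,n. \]

(i) If $f : I \to R$ is a convex function, then

\[ \hat{I}_f(r,q) = \sum_{s=1}^{n} q_s f\left( \frac{r_s}{q_s} \right) = A^{[1]}_{m,1} \ge A^{[1]}_{m,2} \ge \dots \ge A^{[1]}_{m,m-1} \ge A^{[1]}_{m,m} \ge f\left( \frac{\sum_{s=1}^{n} r_s}{\sum_{s=1}^{n} q_s} \right) \sum_{s=1}^{n} q_s, \tag{2.4} \]

where

\[ A^{[1]}_{m,l} = \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right) f\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}}} \right). \tag{2.5} \]

If f is a concave function, then the inequality signs in (2.4) are reversed.

(ii) If $f : I \to R$ is a function such that $x \mapsto x f(x)$ $(x \in I)$ is convex, then

\[ \left( \sum_{s=1}^{n} r_s \right) f\left( \frac{\sum_{s=1}^{n} r_s}{\sum_{s=1}^{n} q_s} \right) \le A^{[2]}_{m,m} \le A^{[2]}_{m,m-1} \le \dots \le A^{[2]}_{m,2} \le A^{[2]}_{m,1} = \sum_{s=1}^{n} r_s f\left( \frac{r_s}{q_s} \right) = \hat{I}_{\mathrm{id}f}(r,q), \tag{2.6} \]

where

\[ A^{[2]}_{m,l} = \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}}} \right) f\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}}} \right). \]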

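Proof. (i) Taking $p_s = \frac{q_s}{\sum_{i=1}^{n} q_i}$ and $x_s = \frac{r_s}{q_s}$ in Theorem 1.1, we have

\[ f\left( \sum_{s=1}^{n} \frac{q_s}{\sum_{i=1}^{n} q_i} \frac{r_s}{q_s} \right) \le \dots \le \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i} \right) f\left( \frac{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i} \frac{r_{i_j}}{q_{i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i}} \right) \le \dots \le \sum_{s=1}^{n} \frac{q_s}{\sum_{i=1}^{n} q_i} f\left( \frac{r_s}{q_s} \right). \tag{2.7} \]

On multiplying by $\sum_{s=1}^{n} q_s$, we have (2.4).

(ii) Using $f := \mathrm{id} f$ (where “id” is the identity function) in Theorem 1.1, we have

\[ \sum_{s=1}^{n} p_s x_s \, f\left( \sum_{s=1}^{n} p_s x_s \right) \le \dots \le \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}} x_{i_j}}{\sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}}} \right) f\left( \frac{\sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}} x_{i_j}}{\sum_{j=1}^{l} \frac{p_{i_j}}{\alpha_{I_m,i_j}}} \right) \le \dots \le \sum_{s=1}^{n} p_s x_s f(x_s). \tag{2.8} \]

Now using $p_s = \frac{q_s}{\sum_{i=1}^{n} q_i}$ and $x_s = \frac{r_s}{q_s}$, $s = 1,\dots,n$, we get

\[ \sum_{s=1}^{n} \frac{q_s}{\sum_{i=1}^{n} q_i} \frac{r_s}{q_s} \, f\left( \sum_{s=1}^{n} \frac{q_s}{\sum_{i=1}^{n} q_i} \frac{r_s}{q_s} \right) \le \dots \le \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i} \right) \left( \frac{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i} \frac{r_{i_j}}{q_{i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i}} \right) f\left( \frac{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i} \frac{r_{i_j}}{q_{i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j} \sum_{i=1}^{n} q_i}} \right) \le \dots \le \sum_{s=1}^{n} \frac{q_s}{\sum_{i=1}^{n} q_i} \frac{r_s}{q_s} f\left( \frac{r_s}{q_s} \right). \tag{2.9} \]

On multiplying by $\sum_{s=1}^{n} q_s$, we get (2.6). □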

3. Inequalities for Shannon Entropy

Definition 3.1 (see [17]). The Shannon entropy of a positive probability distribution $r = (r_1,\dots,r_n)$ is defined by

\[ S := -\sum_{s=1}^{n} r_s \log(r_s). \tag{3.1} \]

Corollary 3.1. Assume (H1).

(i) If $q = (q_1,\dots,q_n) \in (0,\infty)^n$ and the base of log is greater than 1, then

\[ S \le A^{[3]}_{m,m} \le A^{[3]}_{m,m-1} \le \dots \le A^{[3]}_{m,2} \le A^{[3]}_{m,1} = \log\left( \frac{n}{\sum_{s=1}^{n} q_s} \right) \sum_{s=1}^{n} q_s, \tag{3.2} \]

where

\[ A^{[3]}_{m,l} = -\frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right) \log\left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right). \tag{3.3} \]

If the base of log is between 0 and 1, then the inequality signs in (3.2) are reversed.

(ii) If $q = (q_1,\dots,q_n)$ is a positive probability distribution and the base of log is greater than 1, then we have the following estimates for the Shannon entropy of q:

\[ S \le A^{[4]}_{m,m} \le A^{[4]}_{m,m-1} \le \dots \le A^{[4]}_{m,2} \le A^{[4]}_{m,1} = \log(n), \tag{3.4} \]

where

\[ A^{[4]}_{m,l} = -\frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right) \log\left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right). \]

Proof. (i) Using $f := \log$ and $r = (1,\dots,1)$ in Theorem 2.1 (i), we get (3.2).

(ii) It is a special case of (i). □
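The outer estimate in (3.4), namely that the Shannon entropy of a probability distribution on n points is at most log(n), can be checked numerically. The sketch below is an illustration only: the helper name `shannon_entropy`, the choice of base 2 and the sample distribution are assumptions of this example.

```python
import math


def shannon_entropy(r, base=math.e):
    """S = -sum_s r_s log(r_s) for a positive probability distribution r."""
    return -sum(rs * math.log(rs, base) for rs in r)


# Hypothetical distribution on n = 3 points, logarithm to base 2.
r = [0.5, 0.3, 0.2]
n = len(r)
S = shannon_entropy(r, base=2)
S_uniform = shannon_entropy([1.0 / n] * n, base=2)
print(S, S_uniform, math.log(n, 2))
```

In this example the uniform distribution attains the bound log(n), while the non-uniform distribution stays strictly below it.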

Definition 3.2 (see [17]). The Kullback-Leibler divergence between the positive probability distributions $r = (r_1,\dots,r_n)$ and $q = (q_1,\dots,q_n)$ is defined by

\[ D(r,q) := \sum_{s=1}^{n} r_s \log\left( \frac{r_s}{q_s} \right). \tag{3.5} \]
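Definition (3.5) translates directly into code. The following sketch, with the hypothetical helper `kl_divergence` and sample distributions chosen only for illustration, evaluates D(r, q) and D(q, r); the two values differ in this example, which illustrates why a divergence is a weaker notion than a distance.

```python
import math


def kl_divergence(r, q, base=math.e):
    """D(r, q) = sum_s r_s log(r_s / q_s) for positive distributions r, q."""
    return sum(rs * math.log(rs / qs, base) for rs, qs in zip(r, q))


# Hypothetical sample distributions.
r = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
d_rq = kl_divergence(r, q)
d_qr = kl_divergence(q, r)
print(d_rq, d_qr)
```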

Corollary 3.2. Assume (H1).

(i) Let $r = (r_1,\dots,r_n) \in (0,\infty)^n$ and $q := (q_1,\dots,q_n) \in (0,\infty)^n$. If the base of log is greater than 1, then

\[ \sum_{s=1}^{n} r_s \log\left( \frac{\sum_{s=1}^{n} r_s}{\sum_{s=1}^{n} q_s} \right) \le A^{[5]}_{m,m} \le A^{[5]}_{m,m-1} \le \dots \le A^{[5]}_{m,2} \le A^{[5]}_{m,1} = \sum_{s=1}^{n} r_s \log\left( \frac{r_s}{q_s} \right) = D(r,q), \tag{3.6} \]

where

\[ A^{[5]}_{m,l} = \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}}} \right). \]

If the base of log is between 0 and 1, then the inequalities in (3.6) are reversed.

(ii) If r and q are positive probability distributions and the base of log is greater than 1, then we have

\[ D(r,q) = A^{[6]}_{m,1} \ge A^{[6]}_{m,2} \ge \dots \ge A^{[6]}_{m,m-1} \ge A^{[6]}_{m,m} \ge 0, \tag{3.7} \]

where

\[ A^{[6]}_{m,l} = \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{q_{i_j}}{\alpha_{I_m,i_j}}} \right). \]

If the base of log is between 0 and 1, then the inequality signs in (3.7) are reversed.

Proof. (i) On taking $f := \log$ in Theorem 2.1 (ii), we get (3.6).

(ii) Since r and q are positive probability distributions, $\sum_{s=1}^{n} r_s = \sum_{s=1}^{n} q_s = 1$, so the smallest term in (3.6) is

\[ \sum_{s=1}^{n} r_s \log\left( \frac{\sum_{s=1}^{n} r_s}{\sum_{s=1}^{n} q_s} \right) = 0. \tag{3.8} \]

Hence for positive probability distributions r and q, (3.6) becomes (3.7). □

4. Inequalities for Rényi Divergence and Entropy

The Rényi divergence and entropy come from [25].

Definition 4.1. Let $r := (r_1,\dots,r_n)$ and $q := (q_1,\dots,q_n)$ be positive probability distributions, and let $\lambda \ge 0$, $\lambda \ne 1$.

(a) The Rényi divergence of order λ is defined by

\[ D_\lambda(r,q) := \frac{1}{\lambda-1} \log\left( \sum_{i=1}^{n} q_i \left( \frac{r_i}{q_i} \right)^{\lambda} \right). \tag{4.1} \]

(b) The Rényi entropy of order λ of r is defined by

\[ H_\lambda(r) := \frac{1}{1-\lambda} \log\left( \sum_{i=1}^{n} r_i^{\lambda} \right). \tag{4.2} \]

The Rényi divergence and the Rényi entropy can also be extended to non-negative probability distributions. If $\lambda \to 1$ in (4.1), we have the Kullback-Leibler divergence, and if $\lambda \to 1$ in (4.2), then we have the Shannon entropy. In the next two results, inequalities can be found for the Rényi divergence.
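The two limits mentioned above can be observed numerically. The sketch below is a minimal illustration with hypothetical helper names and sample distributions: it evaluates (4.1) and (4.2) with natural logarithms at an order close to 1 and compares the results with the Kullback-Leibler divergence and the Shannon entropy.

```python
import math


def renyi_divergence(r, q, lam):
    """D_lambda(r, q) from (4.1), natural logarithm, lam >= 0 and lam != 1."""
    total = sum(qs * (rs / qs) ** lam for rs, qs in zip(r, q))
    return math.log(total) / (lam - 1.0)


def renyi_entropy(r, lam):
    """H_lambda(r) from (4.2), natural logarithm, lam >= 0 and lam != 1."""
    return math.log(sum(rs ** lam for rs in r)) / (1.0 - lam)


# Hypothetical sample distributions.
r = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
kl = sum(rs * math.log(rs / qs) for rs, qs in zip(r, q))
shannon = -sum(rs * math.log(rs) for rs in r)

near_one = 1.0 + 1e-6
print(renyi_divergence(r, q, near_one), kl)
print(renyi_entropy(r, near_one), shannon)
```

The order `near_one` is an arbitrary choice for this example; the agreement is only approximate because the limit is replaced by a finite step.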


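Theorem 4.1. Assume (H1), and let $r = (r_1,\dots,r_n)$ and $q = (q_1,\dots,q_n)$ be probability distributions.

(i) If $0 \le \lambda \le \mu$ such that $\lambda, \mu \ne 1$, and the base of log is greater than 1, then

\[ D_\lambda(r,q) \le A^{[7]}_{m,m} \le A^{[7]}_{m,m-1} \le \dots \le A^{[7]}_{m,2} \le A^{[7]}_{m,1} = D_\mu(r,q), \tag{4.3} \]

where

\[ A^{[7]}_{m,l} = \frac{1}{\mu-1} \log\left( \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right)^{\frac{\mu-1}{\lambda-1}} \right). \]

The reverse inequalities hold in (4.3) if the base of log is between 0 and 1.

(ii) If $1 < \mu$ and the base of log is greater than 1, then

\[ D_1(r,q) = D(r,q) = \sum_{s=1}^{n} r_s \log\left( \frac{r_s}{q_s} \right) \le A^{[8]}_{m,m} \le A^{[8]}_{m,m-1} \le \dots \le A^{[8]}_{m,2} \le A^{[8]}_{m,1} = D_\mu(r,q), \tag{4.4} \]

where

\[ A^{[8]}_{m,l} = \frac{1}{\mu-1} \log\left( \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \exp\left( \frac{(\mu-1) \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \log\left( \frac{r_{i_j}}{q_{i_j}} \right)}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \right); \]

here the base of exp is the same as the base of log, and the reverse inequalities hold if the base of log is between 0 and 1.

(iii) If $0 \le \lambda < 1$ and the base of log is greater than 1, then

\[ D_\lambda(r,q) \le A^{[9]}_{m,m} \le A^{[9]}_{m,m-1} \le \dots \le A^{[9]}_{m,2} \le A^{[9]}_{m,1} = D_1(r,q), \tag{4.5} \]

where

\[ A^{[9]}_{m,l} = \frac{1}{\lambda-1} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right). \tag{4.6} \]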

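Proof. By applying Theorem 1.1 with $I = (0,\infty)$, $f : (0,\infty) \to R$, $f(t) := t^{\frac{\mu-1}{\lambda-1}}$,

\[ p_s := r_s, \quad x_s := \left( \frac{r_s}{q_s} \right)^{\lambda-1}, \quad s = 1,\dots,n, \]

we have

\[ \left( \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \right)^{\frac{\mu-1}{\lambda-1}} = \left( \sum_{s=1}^{n} r_s \left( \frac{r_s}{q_s} \right)^{\lambda-1} \right)^{\frac{\mu-1}{\lambda-1}} \le \dots \le \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right)^{\frac{\mu-1}{\lambda-1}} \le \dots \le \sum_{s=1}^{n} r_s \left( \left( \frac{r_s}{q_s} \right)^{\lambda-1} \right)^{\frac{\mu-1}{\lambda-1}}, \tag{4.7} \]

if either $0 \le \lambda < 1 < \mu$ or $1 < \lambda \le \mu$, and the reverse inequalities in (4.7) hold if $0 \le \lambda \le \mu < 1$. By raising to the power $\frac{1}{\mu-1}$, we have in all cases

\[ \left( \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \right)^{\frac{1}{\lambda-1}} \le \dots \le \left( \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right)^{\frac{\mu-1}{\lambda-1}} \right)^{\frac{1}{\mu-1}} \le \dots \le \left( \sum_{s=1}^{n} r_s \left( \left( \frac{r_s}{q_s} \right)^{\lambda-1} \right)^{\frac{\mu-1}{\lambda-1}} \right)^{\frac{1}{\mu-1}} = \left( \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\mu} \right)^{\frac{1}{\mu-1}}. \tag{4.8} \]

Since log is increasing if the base of log is greater than 1, (4.3) now follows. If the base of log is between 0 and 1, then log is decreasing and therefore the inequalities in (4.3) are reversed. For $\lambda = 1$ and $\mu = 1$ we obtain (ii) and (iii), respectively, by taking the limit as the corresponding parameter tends to 1. □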

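Theorem 4.2. Assume (H1), and let $r = (r_1,\dots,r_n)$ and $q = (q_1,\dots,q_n)$ be probability distributions. If either $0 \le \lambda < 1$ and the base of log is greater than 1, or $1 < \lambda$ and the base of log is between 0 and 1, then

\[ \frac{1}{\sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda}} \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \log\left( \frac{r_s}{q_s} \right) = A^{[10]}_{m,1} \le A^{[10]}_{m,2} \le \dots \le A^{[10]}_{m,m-1} \le A^{[10]}_{m,m} \le D_\lambda(r,q) \le A^{[11]}_{m,m} \le A^{[11]}_{m,m-1} \le \dots \le A^{[11]}_{m,2} \le A^{[11]}_{m,1} = D_1(r,q), \tag{4.9} \]

where

\[ A^{[10]}_{m,l} = \frac{1}{(\lambda-1) \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda}} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \]

and

\[ A^{[11]}_{m,l} = \frac{1}{\lambda-1} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right). \]

The inequalities in (4.9) are reversed if either $0 \le \lambda < 1$ and the base of log is between 0 and 1, or $1 < \lambda$ and the base of log is greater than 1.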

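Proof. We prove only the case when $0 \le \lambda < 1$ and the base of log is greater than 1; the other cases can be proved similarly. Since $\frac{1}{\lambda-1} < 0$ and the function log is concave, choosing $I = (0,\infty)$, $f := \log$, $p_s = r_s$, $x_s := \left( \frac{r_s}{q_s} \right)^{\lambda-1}$ in Theorem 1.1, we have

\[ D_\lambda(r,q) = \frac{1}{\lambda-1} \log\left( \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \right) = \frac{1}{\lambda-1} \log\left( \sum_{s=1}^{n} r_s \left( \frac{r_s}{q_s} \right)^{\lambda-1} \right) \le \dots \le \frac{1}{\lambda-1} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \le \dots \le \frac{1}{\lambda-1} \sum_{s=1}^{n} r_s \log\left( \left( \frac{r_s}{q_s} \right)^{\lambda-1} \right) = \sum_{s=1}^{n} r_s \log\left( \frac{r_s}{q_s} \right) = D_1(r,q), \tag{4.10} \]

and this gives the upper bound for $D_\lambda(r,q)$.

Since the base of log is greater than 1, the function $x \mapsto x f(x)$ $(x > 0)$ with $f := \log$ is convex. Since moreover $\frac{1}{\lambda-1} < 0$, Theorem 1.1 gives

\[ D_\lambda(r,q) = \frac{1}{\lambda-1} \log\left( \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \right) = \frac{1}{(\lambda-1) \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda}} \left( \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \right) \log\left( \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \right) \]

\[ \ge \dots \ge \frac{1}{(\lambda-1) \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda}} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \]

\[ = \frac{1}{(\lambda-1) \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda}} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \left( \frac{r_{i_j}}{q_{i_j}} \right)^{\lambda-1}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \]

\[ \ge \dots \ge \frac{1}{\lambda-1} \sum_{s=1}^{n} r_s \left( \frac{r_s}{q_s} \right)^{\lambda-1} \log\left( \frac{r_s}{q_s} \right)^{\lambda-1} \frac{1}{\sum_{s=1}^{n} r_s \left( \frac{r_s}{q_s} \right)^{\lambda-1}} = \frac{1}{\sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda}} \sum_{s=1}^{n} q_s \left( \frac{r_s}{q_s} \right)^{\lambda} \log\left( \frac{r_s}{q_s} \right), \tag{4.11} \]

which gives the lower bound for $D_\lambda(r,q)$. □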

By using Theorem 4.1, Theorem 4.2 and Definition 4.1, some inequalities for the Rényi entropy are obtained. Let $\frac{1}{n} = \left( \frac{1}{n},\dots,\frac{1}{n} \right)$ be a discrete probability distribution.

Corollary 4.3. Assume (H1), and let $r = (r_1,\dots,r_n)$ and $q = (q_1,\dots,q_n)$ be positive probability distributions.

(i) If $0 \le \lambda \le \mu$, $\lambda, \mu \ne 1$, and the base of log is greater than 1, then

\[ H_\lambda(r) = \log(n) - D_\lambda\left( r, \frac{1}{n} \right) \ge A^{[12]}_{m,m} \ge A^{[12]}_{m,m-1} \ge \dots \ge A^{[12]}_{m,2} \ge A^{[12]}_{m,1} = H_\mu(r), \tag{4.12} \]

where

\[ A^{[12]}_{m,l} = \frac{1}{1-\mu} \log\left( \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \times \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}^{\lambda}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right)^{\frac{\mu-1}{\lambda-1}} \right). \]

The reverse inequalities hold in (4.12) if the base of log is between 0 and 1.

(ii) If $1 < \mu$ and the base of log is greater than 1, then

\[ S = -\sum_{s=1}^{n} r_s \log(r_s) \ge A^{[13]}_{m,m} \ge A^{[13]}_{m,m-1} \ge \dots \ge A^{[13]}_{m,2} \ge A^{[13]}_{m,1} = H_\mu(r), \tag{4.13} \]

where

\[ A^{[13]}_{m,l} = \log(n) + \frac{1}{1-\mu} \log\left( \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \exp\left( \frac{(\mu-1) \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \log\left( n r_{i_j} \right)}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \right), \]

and the base of exp is the same as the base of log. The inequalities in (4.13) are reversed if the base of log is between 0 and 1.

(iii) If $0 \le \lambda < 1$ and the base of log is greater than 1, then

\[ H_\lambda(r) \ge A^{[14]}_{m,m} \ge A^{[14]}_{m,m-1} \ge \dots \ge A^{[14]}_{m,2} \ge A^{[14]}_{m,1} = S, \tag{4.14} \]

where

\[ A^{[14]}_{m,l} = \frac{1}{1-\lambda} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}^{\lambda}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right). \tag{4.15} \]

The inequalities in (4.14) are reversed if the base of log is between 0 and 1.

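Proof. (i) Suppose $q = \frac{1}{n}$; then from (4.1) we have

\[ D_\lambda(r,q) = \frac{1}{\lambda-1} \log\left( \sum_{s=1}^{n} n^{\lambda-1} r_s^{\lambda} \right) = \log(n) + \frac{1}{\lambda-1} \log\left( \sum_{s=1}^{n} r_s^{\lambda} \right), \tag{4.16} \]

therefore we have

\[ H_\lambda(r) = \log(n) - D_\lambda\left( r, \frac{1}{n} \right). \tag{4.17} \]

Now using Theorem 4.1 (i) and (4.17), we get

\[ H_\lambda(r) = \log(n) - D_\lambda\left( r, \frac{1}{n} \right) \ge \dots \ge \log(n) - \frac{1}{\mu-1} \log\left( n^{\mu-1} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \times \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}^{\lambda}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right)^{\frac{\mu-1}{\lambda-1}} \right) \ge \dots \ge \log(n) - D_\mu(r,q) = H_\mu(r). \tag{4.18} \]

(ii) and (iii) can be proved similarly. □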
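Identity (4.17), which drives the proof above, can be verified numerically for a concrete distribution. The sketch below is an illustration with hypothetical helper names and sample values; it uses natural logarithms and also checks the outer inequality $H_\lambda(r) \ge H_\mu(r)$ of (4.12) for one pair $\lambda \le \mu$.

```python
import math


def renyi_entropy(r, lam):
    """H_lambda(r) from (4.2), natural logarithm."""
    return math.log(sum(rs ** lam for rs in r)) / (1.0 - lam)


def renyi_divergence(r, q, lam):
    """D_lambda(r, q) from (4.1), natural logarithm."""
    total = sum(qs * (rs / qs) ** lam for rs, qs in zip(r, q))
    return math.log(total) / (lam - 1.0)


# Hypothetical distribution and orders.
r = [0.5, 0.3, 0.2]
n = len(r)
uniform = [1.0 / n] * n
lam, mu = 0.5, 2.0

lhs = renyi_entropy(r, lam)
rhs = math.log(n) - renyi_divergence(r, uniform, lam)
print(lhs, rhs, renyi_entropy(r, mu))
```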


Corollary 4.4. Assume (H1), and let $r = (r_1,\dots,r_n)$ and $q = (q_1,\dots,q_n)$ be positive probability distributions.

If either $0 \le \lambda < 1$ and the base of log is greater than 1, or $1 < \lambda$ and the base of log is between 0 and 1, then

\[ -\frac{1}{\sum_{s=1}^{n} r_s^{\lambda}} \sum_{s=1}^{n} r_s^{\lambda} \log(r_s) = A^{[15]}_{m,1} \ge A^{[15]}_{m,2} \ge \dots \ge A^{[15]}_{m,m-1} \ge A^{[15]}_{m,m} \ge H_\lambda(r) \ge A^{[16]}_{m,m} \ge A^{[16]}_{m,m-1} \ge \dots \ge A^{[16]}_{m,2} \ge A^{[16]}_{m,1} = H(r), \tag{4.19} \]

where

\[ A^{[15]}_{m,l} = \frac{1}{(\lambda-1) \sum_{s=1}^{n} r_s^{\lambda}} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}^{\lambda}}{\alpha_{I_m,i_j}} \right) \log\left( n^{\lambda-1} \frac{\sum_{j=1}^{l} \frac{r_{i_j}^{\lambda}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right) \]

and

\[ A^{[16]}_{m,l} = \frac{1}{1-\lambda} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{r_{i_j}^{\lambda}}{\alpha_{I_m,i_j}}}{\sum_{j=1}^{l} \frac{r_{i_j}}{\alpha_{I_m,i_j}}} \right). \]

The inequalities in (4.19) are reversed if either $0 \le \lambda < 1$ and the base of log is between 0 and 1, or $1 < \lambda$ and the base of log is greater than 1.

Proof. The proof is similar to that of Corollary 4.3, using Theorem 4.2. □

5. Inequalities by Using Zipf-Mandelbrot Law

In probability theory and statistics, the Zipf-Mandelbrot law is a power law distribution on ranked data. It is named after the linguist G. K. Zipf, who suggested a simpler distribution called Zipf's law.

Zipf's law is defined as follows (see [28]).

Definition 5.1. Let N be the number of elements, s be their rank and t be the value of the exponent characterizing the distribution. Zipf's law then predicts that out of a population of N elements, the normalized frequency of the element of rank s, f(s, N, t), is

\[ f(s,N,t) = \frac{\frac{1}{s^{t}}}{\sum_{j=1}^{N} \frac{1}{j^{t}}}. \tag{5.1} \]

The Zipf-Mandelbrot law is defined as follows (see [21]).

Definition 5.2. The Zipf-Mandelbrot law is a discrete probability distribution depending on three parameters $N \in \{1,2,\dots\}$, $q \in [0,\infty)$ and $t > 0$, and is defined by

\[ f(s;N,q,t) := \frac{1}{(s+q)^{t} H_{N,q,t}}, \quad s = 1,\dots,N, \tag{5.2} \]

where

\[ H_{N,q,t} = \sum_{j=1}^{N} \frac{1}{(j+q)^{t}}. \tag{5.3} \]

If the total mass of the law is taken over all of N, then for $q \ge 0$, $t > 1$, $s \in N$, the density function of the Zipf-Mandelbrot law becomes

\[ f(s;q,t) = \frac{1}{(s+q)^{t} H_{q,t}}, \tag{5.4} \]

where

\[ H_{q,t} = \sum_{j=1}^{\infty} \frac{1}{(j+q)^{t}}. \tag{5.5} \]

For q = 0, the Zipf-Mandelbrot law (5.2) becomes Zipf's law (5.1).
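Definitions 5.1 and 5.2 can be evaluated directly. The following sketch is a minimal illustration, with the hypothetical function name `zipf_mandelbrot` and arbitrary parameter values: it computes the probabilities (5.2) and checks that q = 0 and t = 1 reproduce the rank-size behaviour of Zipf's law, where the element of rank 2 has half the frequency of the element of rank 1.

```python
def zipf_mandelbrot(N, q, t):
    """Return [f(1; N, q, t), ..., f(N; N, q, t)] as defined in (5.2)."""
    weights = [1.0 / (s + q) ** t for s in range(1, N + 1)]
    H = sum(weights)  # H_{N,q,t} from (5.3)
    return [w / H for w in weights]


# Hypothetical parameters.
zm = zipf_mandelbrot(N=5, q=1.5, t=1.2)
zipf = zipf_mandelbrot(N=5, q=0.0, t=1.0)
print(sum(zm))
print(zipf[0] / zipf[1], zipf[0] / zipf[2])
```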

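Conclusion 5.1. Assume (H1) and let r be a Zipf-Mandelbrot law. By Corollary 4.3 (iii), if $0 \le \lambda < 1$ and the base of log is greater than 1, then

\[ H_\lambda(r) = \frac{1}{1-\lambda} \log\left( \frac{1}{H_{N,q,t}^{\lambda}} \sum_{s=1}^{N} \frac{1}{(s+q)^{\lambda t}} \right) \ge \dots \ge \frac{1}{1-\lambda} \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q)^{t} H_{N,q,t}} \right) \log\left( \frac{1}{H_{N,q,t}^{\lambda-1}} \frac{\sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q)^{\lambda t}}}{\sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q)^{t}}} \right) \ge \dots \ge \frac{t}{H_{N,q,t}} \sum_{s=1}^{N} \frac{\log(s+q)}{(s+q)^{t}} + \log(H_{N,q,t}) = S. \tag{5.6} \]

The inequalities in (5.6) are reversed if the base of log is between 0 and 1.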

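Conclusion 5.2. Assume (H1) and let $r_1$ and $r_2$ be Zipf-Mandelbrot laws with parameters $N \in \{1,2,\dots\}$, $q_1, q_2 \in [0,\infty)$ and $t_1, t_2 > 0$, respectively. Then from Corollary 3.2 (ii), if the base of log is greater than 1, we have

\[ D(r_1,r_2) = \sum_{s=1}^{N} \frac{1}{(s+q_1)^{t_1} H_{N,q_1,t_1}} \log\left( \frac{(s+q_2)^{t_2} H_{N,q_2,t_2}}{(s+q_1)^{t_1} H_{N,q_1,t_1}} \right) \ge \dots \ge \frac{(m-1)!}{(l-1)!} \sum_{(i_1,\dots,i_l)\in I_l} \eta_{I_m,l}(i_1,\dots,i_l) \left( \sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q_2)^{t_2} H_{N,q_2,t_2}} \right) \left( \frac{\sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q_1)^{t_1} H_{N,q_1,t_1}}}{\sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q_2)^{t_2} H_{N,q_2,t_2}}} \right) \log\left( \frac{\sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q_1)^{t_1} H_{N,q_1,t_1}}}{\sum_{j=1}^{l} \frac{1}{\alpha_{I_m,i_j} (i_j+q_2)^{t_2} H_{N,q_2,t_2}}} \right) \ge \dots \ge 0. \tag{5.7} \]

The inequalities in (5.7) are reversed if the base of log is between 0 and 1.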



6. Shannon Entropy, Zipf-Mandelbrot Law and Hybrid Zipf-Mandelbrot Law

Here we maximize the Shannon entropy using the method of Lagrange multipliers under some equality constraints and obtain the Zipf-Mandelbrot law.

Theorem 6.1. If $J = \{1,2,\dots,N\}$, then for a given $q \ge 0$ a probability distribution that maximizes the Shannon entropy under the constraints

\[ \sum_{s\in J} r_s = 1, \quad \sum_{s\in J} r_s \ln(s+q) := \Psi, \]

is the Zipf-Mandelbrot law.

Proof. Let $J = \{1,2,\dots,N\}$. We set the Lagrange multipliers λ and t and consider the expression

\[ \tilde{S} = -\sum_{s=1}^{N} r_s \ln r_s - \lambda \left( \sum_{s=1}^{N} r_s - 1 \right) - t \left( \sum_{s=1}^{N} r_s \ln(s+q) - \Psi \right). \]

Just for the sake of convenience, replace λ by $\ln\lambda - 1$; thus the last expression gives

\[ \tilde{S} = -\sum_{s=1}^{N} r_s \ln r_s - (\ln\lambda - 1) \left( \sum_{s=1}^{N} r_s - 1 \right) - t \left( \sum_{s=1}^{N} r_s \ln(s+q) - \Psi \right). \]

From $\tilde{S}_{r_s} = 0$, for $s = 1,2,\dots,N$, we get

\[ r_s = \frac{1}{\lambda (s+q)^{t}}, \]

and on using the constraint $\sum_{s=1}^{N} r_s = 1$, we have

\[ \lambda = \sum_{s=1}^{N} \frac{1}{(s+q)^{t}}, \]

where $t > 0$, concluding that

\[ r_s = \frac{1}{(s+q)^{t} H_{N,q,t}}, \quad s = 1,2,\dots,N. \]

□

Remark 6.2. Observe that for the Zipf-Mandelbrot law, the Shannon entropy can be bounded from above (see [22]):

\[ S = -\sum_{s=1}^{N} f(s,N,q,t) \ln f(s,N,q,t) \le -\sum_{s=1}^{N} f(s,N,q,t) \ln q_s, \]

where $(q_1,\dots,q_N)$ is a positive N-tuple such that $\sum_{s=1}^{N} q_s = 1$.
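The bound in Remark 6.2 can be tried out on a concrete Zipf-Mandelbrot law. The sketch below is only an illustration: the helper `zipf_mandelbrot`, the parameter values and the comparison tuple `q_tuple` are assumptions of this example. It also shows that the two sides coincide when the comparison tuple equals the law itself.

```python
import math


def zipf_mandelbrot(N, q, t):
    """Probabilities f(s; N, q, t), s = 1..N, of the Zipf-Mandelbrot law."""
    weights = [1.0 / (s + q) ** t for s in range(1, N + 1)]
    H = sum(weights)
    return [w / H for w in weights]


# Hypothetical parameters and comparison tuple (positive, sums to 1).
f = zipf_mandelbrot(N=5, q=1.0, t=1.2)
q_tuple = [0.1, 0.15, 0.2, 0.25, 0.3]

S = -sum(fs * math.log(fs) for fs in f)
upper = -sum(fs * math.log(qs) for fs, qs in zip(f, q_tuple))
upper_self = -sum(fs * math.log(fs2) for fs, fs2 in zip(f, f))
print(S, upper)
```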


Theorem 6.3. If $J = \{1,\dots,N\}$, then the probability distribution that maximizes the Shannon entropy under the constraints

\[ \sum_{s\in J} r_s = 1, \quad \sum_{s\in J} r_s \ln(s+q) := \Psi, \quad \sum_{s\in J} s r_s := \eta \]

is the hybrid Zipf-Mandelbrot law given as

\[ r_s = \frac{w^{s}}{(s+q)^{k} \Phi^{*}_{J}(k,q,w)}, \quad s \in J, \]

where

\[ \Phi^{*}_{J}(k,q,w) = \sum_{s\in J} \frac{w^{s}}{(s+q)^{k}}. \]

Proof. First consider $J = \{1,\dots,N\}$. We set the Lagrange multipliers and consider the expression

\[ \tilde{S} = -\sum_{s=1}^{N} r_s \ln r_s + \ln w \left( \sum_{s=1}^{N} s r_s - \eta \right) - (\ln\lambda - 1) \left( \sum_{s=1}^{N} r_s - 1 \right) - k \left( \sum_{s=1}^{N} r_s \ln(s+q) - \Psi \right). \]

On setting $\tilde{S}_{r_s} = 0$, for $s = 1,\dots,N$, we get

\[ -\ln r_s + s \ln w - \ln\lambda - k \ln(s+q) = 0. \]

Solving for $r_s$ gives $r_s = \frac{w^{s}}{\lambda (s+q)^{k}}$, and the constraint $\sum_{s=1}^{N} r_s = 1$ then yields

\[ \lambda = \sum_{s=1}^{N} \frac{w^{s}}{(s+q)^{k}}. \]

We recognize this as the partial sum of Lerch's transcendent, which we denote by

\[ \Phi^{*}_{N}(k,q,w) = \sum_{s=1}^{N} \frac{w^{s}}{(s+q)^{k}} \]

with $w \ge 0$, $k > 0$.

□

Remark 6.4. Observe that for the hybrid Zipf-Mandelbrot law, the Shannon entropy can be bounded from above (see [22]):

\[ S = -\sum_{s=1}^{N} f_h(s,N,q,k) \ln f_h(s,N,q,k) \le -\sum_{s=1}^{N} f_h(s,N,q,k) \ln q_s, \]

where $(q_1,\dots,q_N)$ is any positive N-tuple such that $\sum_{s=1}^{N} q_s = 1$.
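The hybrid Zipf-Mandelbrot law of Theorem 6.3 and the stationarity condition from its proof can be checked together. The sketch below is an illustration under stated assumptions: the function name `hybrid_zipf_mandelbrot` and the parameter values are hypothetical, and w is taken strictly positive so that ln w is defined. It builds the law, identifies the multiplier λ with the normalising sum, and evaluates the left-hand side of the stationarity equation for each s.

```python
import math


def hybrid_zipf_mandelbrot(N, q, k, w):
    """Return the probabilities w^s / ((s+q)^k * Phi) and the sum Phi."""
    weights = [w ** s / (s + q) ** k for s in range(1, N + 1)]
    Phi = sum(weights)  # partial sum of Lerch's transcendent
    return [v / Phi for v in weights], Phi


# Hypothetical parameters.
N, q, k, w = 6, 1.0, 1.5, 0.9
r, Phi = hybrid_zipf_mandelbrot(N, q, k, w)

# Residual of -ln r_s + s ln w - ln(lambda) - k ln(s+q) with lambda = Phi.
residuals = [
    -math.log(r[s - 1]) + s * math.log(w) - math.log(Phi) - k * math.log(s + q)
    for s in range(1, N + 1)
]
print(sum(r), max(abs(x) for x in residuals))
```

With w = 1 the factor $w^{s}$ disappears and the weights reduce to those of the Zipf-Mandelbrot law (5.2) with exponent k.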

Under the assumption of Theorem 2.1 (i), define the non-negative functionals as follows.

Θ3(f) = A[1]m,r −f
(∑n

s=1 rs∑n
s=1 qs

) n∑
s=1

qs, r = 1, . . . ,m, (6.1)

Θ4(f) = A[1]m,r −A
[1]
m,k, 1 ≤ r < k ≤ m. (6.2)


Int. J. Anal. Appl. 17 (5) (2019) 703

Under the assumption of Theorem 2.1 (ii), define the non-negative functionals as follows.

Θ5(f) = A[2]m,r −

(
n∑
s=1

rs

)
f

(∑n
s=1 rs∑n
s=1 qs

)
, r = 1, . . . ,m, (6.3)

Θ6(f) = A[2]m,r −A
[2]
m,k, 1 ≤ r < k ≤ m. (6.4)

Under the assumption of Corollary 3.1 (i), define the following non-negative functionals.

Θ7(f) = A
[3]
m,r +

n∑
i=1

qi log(qi), r = 1, . . . ,n (6.5)

Θ8(f) = A
[3]
m,r −A

[3]
m,k, 1 ≤ r < k ≤ m. (6.6)

Under the assumption of Corollary 3.1 (ii), define the following non-negative functionals give as.

Θ9(f) = A
[4]
m,r −S, r = 1, . . . ,m (6.7)

Θ10(f) = A
[4]
m,r −A

[4]
m,k, 1 ≤ r < k ≤ m. (6.8)

Under the assumption of Corollary 3.2 (i), let us define the non-negative functionals as follows.

Θ11(f) = A
[5]
m,r −

n∑
s=1

rs log

(
n∑
s=1

log
rn∑n
s=1 qs

)
, r = 1, . . . ,m (6.9)

Θ12(f) = A
[5]
m,r −A

[5]
m,k, 1 ≤ r < k ≤ m. (6.10)

Under the assumption of Corollary 3.2 (ii), define the non-negative functionals as follows.

Θ13(f) = A
[6]
m,r −A

[6]
m,k, 1 ≤ r < k ≤ m. (6.11)

Under the assumption of Theorem 4.1 (i), consider the following functionals.

Θ14(f) = A
[7]
m,r −Dλ(r, q), r = 1, . . . ,m (6.12)

Θ15(f) = A
[7]
m,r −A

[7]
m,k, 1 ≤ r < k ≤ m. (6.13)

Under the assumption of Theorem 4.1 (ii), consider the following functionals.

Θ16(f) = A
[8]
m,r −D1(r, q), r = 1, . . . ,m (6.14)

Θ17(f) = A
[8]
m,r −A

[8]
m,k, 1 ≤ r < k ≤ m. (6.15)

Under the assumption of Theorem 4.1 (iii), consider the following functionals.

$$\Theta_{18}(f) = A^{[9]}_{m,r} - D_\lambda(\mathbf{r}, \mathbf{q}), \quad r = 1, \ldots, m, \tag{6.16}$$

$$\Theta_{19}(f) = A^{[9]}_{m,r} - A^{[9]}_{m,k}, \quad 1 \le r < k \le m. \tag{6.17}$$



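Under the assumption of Theorem 4.2, consider the following non-negative functionals.

$$\Theta_{20}(f) = D_\lambda(\mathbf{r}, \mathbf{q}) - A^{[10]}_{m,r}, \quad r = 1, \ldots, m, \tag{6.18}$$

$$\Theta_{21}(f) = A^{[10]}_{m,k} - A^{[10]}_{m,r}, \quad 1 \le r < k \le m, \tag{6.19}$$

$$\Theta_{22}(f) = A^{[11]}_{m,r} - D_\lambda(\mathbf{r}, \mathbf{q}), \quad r = 1, \ldots, m, \tag{6.20}$$

$$\Theta_{23}(f) = A^{[11]}_{m,r} - A^{[11]}_{m,k}, \quad 1 \le r < k \le m, \tag{6.21}$$

$$\Theta_{24}(f) = A^{[11]}_{m,r} - A^{[10]}_{m,k}, \quad r = 1, \ldots, m, \; k = 1, \ldots, m. \tag{6.22}$$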

Under the assumption of Corollary 4.3 (i), consider the following non-negative functionals.

$$\Theta_{25}(f) = H_\lambda(\mathbf{r}) - A^{[12]}_{m,r}, \quad r = 1, \ldots, m, \tag{6.23}$$

$$\Theta_{26}(f) = A^{[12]}_{m,k} - A^{[12]}_{m,r}, \quad 1 \le r < k \le m. \tag{6.24}$$

Under the assumption of Corollary 4.3 (ii), consider the following functionals.

$$\Theta_{27}(f) = S - A^{[13]}_{m,r}, \quad r = 1, \ldots, m, \tag{6.25}$$

$$\Theta_{28}(f) = A^{[13]}_{m,k} - A^{[13]}_{m,r}, \quad 1 \le r < k \le m. \tag{6.26}$$

Under the assumption of Corollary 4.3 (iii), consider the following functionals.

$$\Theta_{29}(f) = H_\lambda(\mathbf{r}) - A^{[14]}_{m,r}, \quad r = 1, \ldots, m, \tag{6.27}$$

$$\Theta_{30}(f) = A^{[14]}_{m,k} - A^{[14]}_{m,r}, \quad 1 \le r < k \le m. \tag{6.28}$$

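Under the assumption of Corollary 4.4, define the following functionals.

$$\Theta_{31} = A^{[15]}_{m,r} - H_\lambda(\mathbf{r}), \quad r = 1, \ldots, m, \tag{6.29}$$

$$\Theta_{32} = A^{[15]}_{m,r} - A^{[15]}_{m,k}, \quad 1 \le r < k \le m, \tag{6.30}$$

$$\Theta_{33} = H_\lambda(\mathbf{r}) - A^{[16]}_{m,r}, \quad r = 1, \ldots, m, \tag{6.31}$$

$$\Theta_{34} = A^{[16]}_{m,k} - A^{[16]}_{m,r}, \quad 1 \le r < k \le m, \tag{6.32}$$

$$\Theta_{35} = A^{[15]}_{m,r} - A^{[16]}_{m,k}, \quad r = 1, \ldots, m, \; k = 1, \ldots, m. \tag{6.33}$$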

7. Generalization of refinement of Jensen’s, Rényi and Shannon type inequalities via Taylor one point and Taylor two points interpolations

In [5], the following function is considered in order to generalize Popoviciu’s inequality:

$$(u-v)_+ = \begin{cases} u-v, & v \le u; \\ 0, & v > u, \end{cases}$$



and the well-known Taylor formula is as follows.

Let $m$ be a positive integer and let $f : [\alpha_1,\alpha_2] \to \mathbb{R}$ be such that $f^{(m-1)}$ is absolutely continuous. Then for all $u \in [\alpha_1,\alpha_2]$, Taylor’s formula at the point $c \in [\alpha_1,\alpha_2]$ is

$$f(u) = T_{m-1}(f; c; u) + R_{m-1}(f; c; u), \tag{7.1}$$

where

$$T_{m-1}(f; c; u) = \sum_{l=0}^{m-1} \frac{f^{(l)}(c)}{l!}(u-c)^l,$$

and the remainder is given by

$$R_{m-1}(f; c; u) = \frac{1}{(m-1)!}\int_{c}^{u} f^{(m)}(t)(u-t)^{m-1}\,dt.$$
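Formula (7.1) can be checked numerically. The sketch below is illustrative only: the test function $f = \exp$, the values $m = 4$ and $c = 0.5$, and the composite Simpson quadrature (the helper name `simpson` is hypothetical) are assumptions made for the example.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the oriented integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def taylor_with_remainder(derivs_at_c, f_m, c, u):
    """T_{m-1}(f; c; u) + R_{m-1}(f; c; u), with derivs_at_c = [f(c), ..., f^{(m-1)}(c)]."""
    m = len(derivs_at_c)
    poly = sum(d / math.factorial(l) * (u - c) ** l for l, d in enumerate(derivs_at_c))
    remainder = simpson(lambda t: f_m(t) * (u - t) ** (m - 1), c, u) / math.factorial(m - 1)
    return poly + remainder

m, c = 4, 0.5
derivs = [math.exp(c)] * m  # every derivative of exp equals exp
error_right = abs(math.exp(1.7) - taylor_with_remainder(derivs, math.exp, c, 1.7))
error_left = abs(math.exp(0.1) - taylor_with_remainder(derivs, math.exp, c, 0.1))
print(error_right, error_left)
```

Both errors should be at the level of the quadrature error, for $u$ on either side of $c$.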

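Taylor’s formulas at the points $\alpha_1$ and $\alpha_2$ are given by:

$$f(u) = \sum_{l=0}^{m-1} \frac{f^{(l)}(\alpha_1)}{l!}(u-\alpha_1)^l + \frac{1}{(m-1)!}\int_{\alpha_1}^{\alpha_2} f^{(m)}(t)\,(u-t)_+^{m-1}\,dt, \tag{7.2}$$

$$f(u) = \sum_{l=0}^{m-1} \frac{(-1)^l f^{(l)}(\alpha_2)}{l!}(\alpha_2-u)^l - \frac{(-1)^{m-1}}{(m-1)!}\int_{\alpha_1}^{\alpha_2} f^{(m)}(t)\,(t-u)_+^{m-1}\,dt. \tag{7.3}$$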
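The role of the truncated power in (7.2) is that the integral over the whole interval $[\alpha_1,\alpha_2]$ only picks up $t \le u$. A minimal numerical sketch of (7.2), assuming $f = \sin$, $m = 4$, $[\alpha_1,\alpha_2] = [0,2]$ and a hypothetical Simpson helper, none of which come from the paper:

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the oriented integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sin_deriv(l, t):
    """l-th derivative of sin at t."""
    return math.sin(t + l * math.pi / 2)

def taylor_at_left_endpoint(f_deriv, a1, a2, u, m):
    """Right-hand side of (7.2); f_deriv(l, t) is the l-th derivative of f at t (m >= 2)."""
    poly = sum(f_deriv(l, a1) / math.factorial(l) * (u - a1) ** l for l in range(m))
    # Truncated power (u - t)_+^{m-1}: zero for t > u.
    kernel = lambda t: f_deriv(m, t) * max(u - t, 0.0) ** (m - 1)
    return poly + simpson(kernel, a1, a2) / math.factorial(m - 1)

a1, a2, u, m = 0.0, 2.0, 1.3, 4
error = abs(math.sin(u) - taylor_at_left_endpoint(sin_deriv, a1, a2, u, m))
print(error)
```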

We construct some new identities with the help of Taylor polynomial (7.1).

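Theorem 7.1. Assume (H1), and let $f : [\alpha_1,\alpha_2] \to \mathbb{R}$ be a function, where $[\alpha_1,\alpha_2] \subset \mathbb{R}$ is an interval. Also let $x_1, \ldots, x_n \in [\alpha_1,\alpha_2]$ and let $p_1, \ldots, p_n$ be positive real numbers such that $\sum_{i=1}^{n} p_i = 1$. Then we have the following identities:

(i)

$$\Theta_i(f) = \sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_1)}{l!}\,\Theta_i\left((u-\alpha_1)^l\right) + \frac{1}{(m-1)!}\int_{\alpha_1}^{\alpha_2} f^{(m)}(t)\,\Theta_i\left((u-t)_+^{m-1}\right)dt, \quad i = 1, 2, \ldots, 35. \tag{7.4}$$

(ii)

$$\Theta_i(f) = \sum_{l=2}^{m-1} \frac{(-1)^l f^{(l)}(\alpha_2)}{l!}\,\Theta_i\left((\alpha_2-u)^l\right) - \frac{(-1)^{m-1}}{(m-1)!}\int_{\alpha_1}^{\alpha_2} f^{(m)}(t)\,\Theta_i\left((t-u)_+^{m-1}\right)dt, \quad i = 1, 2, \ldots, 35. \tag{7.5}$$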

Proof. Using (7.2) and (7.3) in (1.3), we get the required result. □
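Identity (7.4) relies only on the functional being linear and vanishing on affine functions, which is why the sum starts at $l = 2$. Since the functionals $\Theta_i$ depend on quantities defined in earlier sections, the sketch below uses the plain Jensen functional $J(g) = \sum_i p_i g(x_i) - g\left(\sum_i p_i x_i\right)$ as a stand-in. That substitution, the data $x_i$, $p_i$, the choice $f = \exp$ and $m = 4$ are all assumptions made for illustration.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the oriented integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def jensen_gap(g, xs, ps):
    """Plain Jensen functional J(g) = sum p_i g(x_i) - g(sum p_i x_i)."""
    mean = sum(p * x for p, x in zip(ps, xs))
    return sum(p * g(x) for p, x in zip(ps, xs)) - g(mean)

a1, a2, m = 0.0, 2.0, 4
xs, ps = [0.2, 0.9, 1.6], [0.5, 0.3, 0.2]  # illustrative nodes and weights

# Polynomial part of (7.4); every derivative of exp at a1 equals exp(a1).
poly_part = sum(
    math.exp(a1) / math.factorial(l) * jensen_gap(lambda u, l=l: (u - a1) ** l, xs, ps)
    for l in range(2, m)
)

def kernel(t):
    """J applied to u -> (u - t)_+^{m-1}, for fixed t."""
    return jensen_gap(lambda u: max(u - t, 0.0) ** (m - 1), xs, ps)

integral_part = simpson(lambda t: math.exp(t) * kernel(t), a1, a2) / math.factorial(m - 1)

lhs = jensen_gap(math.exp, xs, ps)
rhs = poly_part + integral_part
print(lhs, rhs)
```

For this stand-in the kernel is non-negative, because $u \mapsto (u-t)_+^{3}$ is convex, so the integral part is non-negative and `lhs >= poly_part`.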

Theorem 7.2. Assume (H1), and let $f : [\alpha_1,\alpha_2] \to \mathbb{R}$ be a function, where $[\alpha_1,\alpha_2] \subset \mathbb{R}$ is an interval. Also let $x_1, \ldots, x_n \in [\alpha_1,\alpha_2]$ and let $p_1, \ldots, p_n$ be positive real numbers such that $\sum_{i=1}^{n} p_i = 1$. Let $f$ be an $m$-convex function such that $f^{(m-1)}$ is absolutely continuous. Then we have the following results:

(i) If

$$\Theta_i\left((u-t)_+^{m-1}\right) \ge 0, \quad t \in [\alpha_1,\alpha_2], \quad i = 1, 2, \ldots, 35,$$



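then

$$\Theta_i(f) \ge \sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_1)}{l!}\,\Theta_i\left((u-\alpha_1)^l\right), \quad i = 1, 2, \ldots, 35. \tag{7.6}$$

(ii) If

$$(-1)^{m-1}\,\Theta_i\left((t-u)_+^{m-1}\right) \le 0, \quad t \in [\alpha_1,\alpha_2], \quad i = 1, 2, \ldots, 35,$$

then

$$\Theta_i(f) \ge \sum_{l=2}^{m-1} \frac{(-1)^l f^{(l)}(\alpha_2)}{l!}\,\Theta_i\left((\alpha_2-u)^l\right), \quad i = 1, 2, \ldots, 35. \tag{7.7}$$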

Proof. Since $f^{(m-1)}$ is absolutely continuous on $[\alpha_1,\alpha_2]$, $f^{(m)}$ exists almost everywhere. As $f$ is $m$-convex, $f^{(m)}(u) \ge 0$ for all $u \in [\alpha_1,\alpha_2]$. Hence, using Theorem 7.1, we obtain (7.6) and (7.7). □

Theorem 7.3. Assume (H1), and let $f : [\alpha_1,\alpha_2] \to \mathbb{R}$ be a function, where $[\alpha_1,\alpha_2] \subset \mathbb{R}$ is an interval. Also let $x_1, \ldots, x_n \in [\alpha_1,\alpha_2]$ and let $p_1, \ldots, p_n$ be positive real numbers such that $\sum_{i=1}^{n} p_i = 1$. Then the following results are valid.

(i) If $f$ is $m$-convex, then (7.6) holds. Also, if $f^{(l)}(\alpha_1) \ge 0$ for $l = 2, \ldots, m-1$, then the right-hand side of (7.6) is non-negative.

(ii) If $m$ is even and $f$ is $m$-convex, then (7.7) holds. Also, if $f^{(l)}(\alpha_1) \le 0$ for $l = 2, \ldots, m-1$ and $f^{(l)} \ge 0$ for $l = 3, \ldots, m-1$, then the right-hand side of (7.7) is non-negative.

(iii) If $m$ is odd and $f$ is $m$-convex, then (7.7) is valid. Also, if $f^{(l)}(\alpha_2) \ge 0$ for $l = 2, \ldots, m-1$ and $f^{(l)}(\alpha_2) \le 0$ for $l = 2, \ldots, m-2$, then the right-hand side of (7.7) is non-positive.

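In [7, p. 20], the Green function $G : [\alpha_1,\alpha_2] \times [\alpha_1,\alpha_2] \to \mathbb{R}$ is defined as

$$G(u,v) = \begin{cases} \dfrac{(u-\alpha_2)(v-\alpha_1)}{\alpha_2-\alpha_1}, & \alpha_1 \le v \le u; \\[2mm] \dfrac{(v-\alpha_2)(u-\alpha_1)}{\alpha_2-\alpha_1}, & u \le v \le \alpha_2. \end{cases} \tag{7.8}$$

The function $G$ is convex and continuous with respect to $v$. Since $G$ is symmetric, it is also convex and continuous with respect to $u$.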

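Let $\psi \in C^2([\alpha_1,\alpha_2])$; then

$$\psi(t) = \frac{\alpha_2 - t}{\alpha_2 - \alpha_1}\,\psi(\alpha_1) + \frac{t - \alpha_1}{\alpha_2 - \alpha_1}\,\psi(\alpha_2) + \int_{\alpha_1}^{\alpha_2} G(t,v)\,\psi''(v)\,dv. \tag{7.9}$$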
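Representation (7.9) says that a $C^2$ function equals its linear interpolant at the endpoints plus a Green-function correction. A minimal numerical sketch, assuming $\psi = \sin$ on $[0, 2]$ and a hypothetical Simpson helper (illustrative choices only):

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the oriented integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def green(t, v, a1, a2):
    """Green function (7.8), evaluated at (t, v)."""
    if v <= t:
        return (t - a2) * (v - a1) / (a2 - a1)
    return (v - a2) * (t - a1) / (a2 - a1)

def reconstruct(psi, psi2, t, a1, a2):
    """Right-hand side of (7.9); psi2 is the second derivative of psi."""
    linear = ((a2 - t) * psi(a1) + (t - a1) * psi(a2)) / (a2 - a1)
    integrand = lambda v: green(t, v, a1, a2) * psi2(v)
    # G has a kink at v = t, so integrate the two smooth pieces separately.
    return linear + simpson(integrand, a1, t) + simpson(integrand, t, a2)

a1, a2, t = 0.0, 2.0, 0.7
value = reconstruct(math.sin, lambda v: -math.sin(v), t, a1, a2)
error = abs(math.sin(t) - value)
print(error)
```

Splitting the integral at $v = t$ is a numerical convenience only; it keeps the quadrature accurate across the kink of $G$.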

Theorem 7.4. Assume (H1), and let $f : [\alpha_1,\alpha_2] \to \mathbb{R}$ be a function, where $[\alpha_1,\alpha_2] \subset \mathbb{R}$ is an interval. Also let $x_1, \ldots, x_n \in [\alpha_1,\alpha_2]$ and let $p_1, \ldots, p_n$ be positive real numbers such that $\sum_{i=1}^{n} p_i = 1$. Then we have the



following results:

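(i) For $i = 1, 2, \ldots, 35$,

$$\Theta_i(f) = \int_{\alpha_1}^{\alpha_2} \Theta_i(G(t,v)) \left(\sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_1)(v-\alpha_1)^{l-2}}{(l-2)!}\right) dv + \frac{1}{(m-3)!}\int_{\alpha_1}^{\alpha_2} f^{(m)}(s) \left(\int_{s}^{\alpha_2} \Theta_i(G(t,v))(v-s)^{m-3}\,dv\right) ds. \tag{7.10}$$

(ii) For $i = 1, 2, \ldots, 35$,

$$\Theta_i(f) = \int_{\alpha_1}^{\alpha_2} \Theta_i(G(t,v)) \left(\sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_2)(v-\alpha_2)^{l-2}}{(l-2)!}\right) dv - \frac{1}{(m-3)!}\int_{\alpha_1}^{\alpha_2} f^{(m)}(s) \left(\int_{\alpha_1}^{s} \Theta_i(G(t,v))(v-s)^{m-3}\,dv\right) ds. \tag{7.11}$$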

Proof. Using (7.9) in $\Theta_i$, $i = 1, 2, \ldots, 35$, we get

$$\Theta_i(f) = \int_{\alpha_1}^{\alpha_2} \Theta_i(G(t,v))\,f''(v)\,dv. \tag{7.12}$$

Differentiating (7.2) twice, we get

$$f''(v) = \sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_1)}{(l-2)!}(v-\alpha_1)^{l-2} + \frac{1}{(m-3)!}\int_{\alpha_1}^{\alpha_2} f^{(m)}(u)\,(v-u)_+^{m-3}\,du. \tag{7.13}$$

Using (7.13) in (7.12) and applying Fubini’s theorem, we get (7.10). Similarly, using the second derivative of (7.3) in (7.12) and applying Fubini’s theorem, we get (7.11). □
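Identity (7.12) can be illustrated numerically for a functional that is linear and vanishes on affine functions. As a sketch only, the plain Jensen functional $J(g) = \sum_i p_i g(x_i) - g\left(\sum_i p_i x_i\right)$ is used below as a stand-in for $\Theta_i$; that substitution, the data $x_i$, $p_i$, the interval $[0,2]$ and $f = \exp$ are assumptions made for the example.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the oriented integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def green(t, v, a1, a2):
    """Green function (7.8), evaluated at (t, v)."""
    if v <= t:
        return (t - a2) * (v - a1) / (a2 - a1)
    return (v - a2) * (t - a1) / (a2 - a1)

def jensen_gap(g, xs, ps):
    """Plain Jensen functional J(g) = sum p_i g(x_i) - g(sum p_i x_i)."""
    mean = sum(p * x for p, x in zip(ps, xs))
    return sum(p * g(x) for p, x in zip(ps, xs)) - g(mean)

a1, a2 = 0.0, 2.0
xs, ps = [0.2, 0.9, 1.6], [0.5, 0.3, 0.2]  # illustrative nodes and weights
mean = sum(p * x for p, x in zip(ps, xs))

def gap_of_green(v):
    """J applied to t -> G(t, v), for fixed v."""
    return jensen_gap(lambda t: green(t, v, a1, a2), xs, ps)

# gap_of_green has kinks at the nodes x_i and at their weighted mean,
# so integrate between consecutive breakpoints.
breaks = sorted({a1, a2, mean, *xs})
rhs = sum(
    simpson(lambda v: gap_of_green(v) * math.exp(v), lo, hi)  # f'' = exp for f = exp
    for lo, hi in zip(breaks, breaks[1:])
)
lhs = jensen_gap(math.exp, xs, ps)
print(lhs, rhs)
```

Because $t \mapsto G(t,v)$ is convex, `gap_of_green` is non-negative for this stand-in, which recovers the non-negativity of the Jensen gap for convex $f$.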

Now we obtain a generalization of the refinement of Jensen’s inequality for $m$-convex functions.

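Theorem 7.5. Assume (H1), and let $f : [\alpha_1,\alpha_2] \to \mathbb{R}$ be a function, where $[\alpha_1,\alpha_2] \subset \mathbb{R}$ is an interval. Also let $x_1, \ldots, x_n \in [\alpha_1,\alpha_2]$ and let $p_1, \ldots, p_n$ be positive real numbers such that $\sum_{i=1}^{n} p_i = 1$. Let $f$ be an $m$-convex function such that $f^{(m-1)}$ is absolutely continuous. Then we have the following results:

(i) If

$$\int_{u}^{\alpha_2} \Theta_i(G(t,v))\,(v-u)^{m-3}\,dv \ge 0, \quad u \in [\alpha_1,\alpha_2], \quad i = 1, 2, \ldots, 35, \tag{7.14}$$

then

$$\Theta_i(f) \ge \int_{\alpha_1}^{\alpha_2} \Theta_i(G(t,v)) \left(\sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_1)(v-\alpha_1)^{l-2}}{(l-2)!}\right) dv, \quad i = 1, 2, \ldots, 35. \tag{7.15}$$

(ii) If

$$\int_{\alpha_1}^{u} \Theta_i(G(t,v))\,(v-u)^{m-3}\,dv \le 0, \quad u \in [\alpha_1,\alpha_2], \quad i = 1, 2, \ldots, 35, \tag{7.16}$$

then

$$\Theta_i(f) \ge \int_{\alpha_1}^{\alpha_2} \Theta_i(G(t,v)) \left(\sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_2)(v-\alpha_2)^{l-2}}{(l-2)!}\right) dv, \quad i = 1, 2, \ldots, 35. \tag{7.17}$$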



Proof. Similar to the proof of Theorem 7.2. □

Corollary 7.6. Assume (H1), and let $f : [\alpha_1,\alpha_2] \to \mathbb{R}$ be a function, where $[\alpha_1,\alpha_2] \subset \mathbb{R}$ is an interval. Also let $x_1, \ldots, x_n \in [\alpha_1,\alpha_2]$ and let $p_1, \ldots, p_n$ be positive real numbers such that $\sum_{i=1}^{n} p_i = 1$. Then the following results are valid.

(i) If $f$ is $m$-convex, then (7.15) holds. Also, if

$$\sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_1)(v-\alpha_1)^{l-2}}{(l-2)!} \ge 0, \tag{7.18}$$

then

$$\Theta_i(f) \ge 0, \quad i = 1, 2, \ldots, 35. \tag{7.19}$$

(ii) If $m$ is even and $f$ is $m$-convex, then (7.17) holds. Also, if

$$\sum_{l=2}^{m-1} \frac{f^{(l)}(\alpha_2)(v-\alpha_2)^{l-2}}{(l-2)!} \ge 0, \tag{7.20}$$

then (7.19) holds.

Remark 7.7. Bounds for the identities related to the generalization of the refinement of Jensen’s inequality can be investigated using inequalities for the Čebyšev functional, and results of Grüss and Ostrowski type can be constructed as in Section 3 of [5]. Moreover, non-negative functionals can be constructed from inequalities (7.6), (7.7), (7.15) and (7.17), together with the related mean value theorems, and new families of $m$-exponentially convex functions and Cauchy means related to these functionals can be constructed as in Section 4 of [5].

Funding

The research of the 4th author was supported by the Ministry of Education and Science of the Russian Federation (Agreement No. 02.a03.21.0008).

Competing interests

The authors declare that there is no conflict of interest regarding the publication of this paper.

Authors’ contributions

All authors jointly worked on the results and they read and approved the final manuscript.

Acknowledgements

The authors wish to thank the anonymous referees for their very careful reading of the manuscript and

fruitful comments and suggestions.



References

[1] Anderson, G., & Ge, Y. The size distribution of Chinese cities. Reg. Sci. Urban Econ., 35 (2005), 756-776.

[2] Auerbach, F. Das Gesetz der Bevölkerungskonzentration. Petermanns Geographische Mitteilungen, 59 (1913), 74-76.

[3] Black, D., & Henderson, V. Urban evolution in the USA. J. Econ. Geogr., 3 (2003), 343-372.

[4] Bosker, M., Brakman, S., Garretsen, H., & Schramm, M. A century of shocks: the evolution of the German city size distribution 1925-1999. Reg. Sci. Urban Econ., 38 (2008), 330-347.

[5] Butt, S. I., Khan, K. A., & Pečarić, J. Generalization of Popoviciu inequality for higher order convex function via Taylor’s polynomial. Acta Univ. Apulensis Math. Inform., 42 (2015), 181-200.

[6] Butt, S. I., Mehmood, N., & Pečarić, J. New generalizations of Popoviciu type inequalities via new Green functions and Fink’s identity. Trans. A. Razmadze Math. Inst., 171 (2017), 293-303.

[7] Butt, S. I., & Pečarić, J. Popoviciu’s Inequality for N-convex Functions. Lap Lambert Academic Publishing, (2016).

[8] Butt, S. I., & Pečarić, J. Weighted Popoviciu type inequalities via generalized Montgomery identities. Rad Hazu. Mat. Znan., 19 (2015), 69-89.

[9] Butt, S. I., Khan, K. A., & Pečarić, J. Popoviciu type inequalities via Hermite’s polynomial. Math. Inequal. Appl., 19 (2016), 1309-1318.

[10] Horváth, L. A method to refine the discrete Jensen’s inequality for convex and mid-convex functions. Math. Computer Model., 54 (2011), 2451-2459.

[11] Horváth, L., Khan, K. A., & Pečarić, J. Combinatorial Improvements of Jensen’s Inequality / Classical and New Refinements of Jensen’s Inequality with Applications. Monographs in Inequalities 8, Element, Zagreb, (2014).

[12] Horváth, L., Khan, K. A., & Pečarić, J. Refinement of Jensen’s inequality for operator convex functions. Adv. Inequal. Appl., 2014 (2014), Art. ID 26.

[13] Horváth, L., & Pečarić, J. A refinement of discrete Jensen’s inequality. Math. Inequal. Appl., 14 (2011), 777-791.

[14] Ioannides, Y. M., & Overman, H. G. Zipf’s law for cities: an empirical examination. Reg. Sci. Urban Econ., 33 (2003), 127-137.

[15] Csiszár, I. Information measures: a critical survey. In: Trans. 7th Prague Conf. on Info. Th., Statist. Decis. Funct., Rand. Proc. 8th Eur. Meeting Stat., Vol. B (1978), 73-86.

[16] Csiszár, I. Information-type measures of difference of probability distributions and indirect observations. Stud. Sci. Math. Hungar., 2 (1967), 299-318.

[17] Horváth, L., Pečarić, Đ., & Pečarić, J. Estimations of f- and Rényi divergences by using a cyclic refinement of the Jensen’s inequality. Bull. Malaysian Math. Sci. Soc., 42 (2019), 933-946.

[18] Khan, K. A., Niaz, T., Pečarić, Đ., & Pečarić, J. Refinement of Jensen’s inequality and estimation of f- and Rényi divergence via Montgomery identity. J. Inequal. Appl., 2018 (2018), Art. ID 318.

[19] Kullback, S. Information theory and statistics. Courier Corporation.

[20] Kullback, S., & Leibler, R. A. On information and sufficiency. Ann. Math. Stat., 22 (1951), 79-86. Math. Dokl. 4 (1963), 121-124.

[21] Lovričević, N., Pečarić, Đ., & Pečarić, J. Zipf-Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities. J. Inequal. Appl., 2018 (2018), Art. ID 36.

[22] Matić, M., Pearce, C. E., & Pečarić, J. Shannon’s and related inequalities in information theory. In: Survey on Classical Inequalities (pp. 127-164). Springer, Dordrecht, (2000).



[23] Niaz, T., Khan, K. A., & Pečarić, J. On generalization of refinement of Jensen’s inequality using Fink’s identity and Abel-Gontscharoff Green function. J. Inequal. Appl., 2017 (2017), Art. ID 254.

[24] Pečarić, J., Proschan, F., & Tong, Y. L. Convex Functions, Partial Orderings and Statistical Applications. Academic Press, New York, (1992).

[25] Rényi, A. On measure of information and entropy. In: Proceedings of the Fourth Berkeley Symposium on Mathematics, Statistics and Probability, pp. 547-561, (1960).

[26] Rosen, K. T., & Resnick, M. The size distribution of cities: an examination of the Pareto law and primacy. J. Urban Econ., 8 (1980), 165-186.

[27] Soo, K. T. Zipf’s law for cities: a cross-country investigation. Reg. Sci. Urban Econ., 35 (2005), 239-263.

[28] Zipf, G. K. Human behaviour and the principle of least-effort. Cambridge MA edn. Reading: Addison-Wesley, (1949).

