Mathematical Problems of Computer Science 54, 18–33, 2020.

UDC 519.2

A Neyman-Pearson Proper Way to Universal Testing

of Multiple Hypotheses Formed by Groups of

Distributions

Evgueni A. Haroutunian and Aram O. Yesayan

Institute for Informatics and Automation Problems of NAS RA

e-mail: eghishe@sci.am, armfrance@yahoo.fr

Abstract

The asymptotically optimal Neyman-Pearson procedures of detection for models
characterized by M discrete probability distributions arranged into K, 2 ≤ K ≤ M
groups considered as hypotheses are investigated. The sequence of tests based on a
growing number of observations is logarithmically asymptotically optimal (LAO) when
a certain part of the given error probability exponents (reliabilities) provides positives
values for all other reliabilities. LAO tests sequences for some models of objects,
including cases, when rejection of decision may be permitted, and when part, or all
given error probabilities decrease subexponentially with an increase in the of number of
experiments, are desined. For all reliabilities of such tests single-letter characterizations
are obtained. A simple case with three distributions and two hypotheses is considered.

Keywords: Statistical hypotheses testing, Families of hypotheses, Optimal de-
tection, Test with no match detection, Neyman-Pearson approach, Neyman-Pearson
Lemma, Principle of maximum of Kullback-Leibler distance, Error exponent.

1. Introduction

This paper is devoted to the generalization of Neyman-Pearson criterion for some specific
universal hypotheses testing problem pointed out in the title. In [8] and in the following
papers [9], [10], Cox formulated a number of divers examples of problems for two families of
hypotheses testing and developed a general modification of the Neyman-Pearson maximum-
likelihood ratio procedures for solving such problems. In a series of papers and in disseration
of F. Harmosi-nejad and all [27], two stage procedures were investigated for certain models of
problems of hypotheses testing. The first stage in these actions executes detecting between
families of distributions, and the second stage performs detection of certain distribution in
the selected family. Investigation of the present paper can be considered as a more detailed
analysis of this first-stage problems.

The asymptotically optimal testing of two hypotheses was investigated by Hoeffding in
[32], also the concept of universal hypotheses testing was introduced there.

18


E. Haroutunian and A. Yesayan 19

The hypotheses testing problems for two hypotheses were also studied by Borovkov [6],
Levy [35], van Trees [40], Csiszár and Longo [12], Tusnady [39], Longo and Sgarro [36].

Neyman-Pearson criterion of multiple hypotheses testing for discrete random variables
was explored in [25]. In publications [1], [24] and [26], many hypotheses logarithmically
asymptotically optimal (LAO) testing for the models consisting of many independent ob-
jects was investigated. Following Birgé [3], we called the sequence of tests logarithmically
asymptotically optimal (LAO), when for given values of some reliabilities (error probabil-
ity exponents) the test ensures the best values for the rest of them. Haroutunian [18]-[20]
investigated the problem of multiple hypotheses testing at the suggestion of R. L. Dobrushin.

Construction of LAO tests sequence is realized applying “Kullback-Leibler balls” around
the hypothetic distributions in the space of distributions as sets for detection of corresponding
hypotheses. This concept, introduced in [16]-[17] and applied in [20]-[31] and in the present
paper, conforms to the idea of “r-divergent sequences” defined in [17] and used in other
works.

Hypotheses testing with no-match decision was considered by Gutman [16]. In papers
[28]-[30] the results of researches of characteristics of LAO hypotheses testing with possibility
of rejection of decision for some models with one or multiple objects, with side information
are presented.

Our study is based on information theoretic methods including the method of types.
Applications of methods of information theory in mathematical statistics, in particular in
hypotheses testing, are exposed in the monographs by Csiszár and Körner [11], Blahut [5],
Cover and Thomas [7], Csiszár and Shields [13], Poor [37], Kullback [33] Haroutunian and
all [31] , in paper of Blahut [4].

The structure of this paper is as follows. Section 2 contains definitions, notations and
problem argument. In central Section 3 the construction of desired LAO tests is exposed for
model with groups of distributions and with possibility of rejection of decision. In Section 4,
the theorem of the Section 3 is reformulated for the case without rejection option. Section 5
is devoted to the models with some reliabilities equal to 0. In final Section 6, the testing for
simplest model with three distributions and two hypotheses is discussed. Conclusion also
contains some open problems.

2. Problem Presentation

Let P(X) be the space of all probability distributions (PDs) on a finite alphabet X . Let
X be a random variable (RV) taking values in the set X with one of M possible PDs
Gm ∈ P(X), m = 1,M. Let x = (x1,x2, ...,xN ), xn ∈ X , n = 1,N, be a vector of
results of N independent observations of the RV X. Then the PD GNm(x) =

N∏
n=1

Gm(xn) and

GNm ∈P(XN )
The M different PDs are arranged into K, 2 ≤ K ≤ M different groups B1, B2, ...,

BK, which we consider as K hypotheses (suppositions) Hk concerning the distributions of
the studied object. We consider also an empty “group” BK+1. These groups are mutually
disjoint, contain |B1|, |B2|, ..., |BK+1| PDs such that

K∑
k=1

|Bk| = M, |BK+1| = 0.


20 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr.

When |Bk| > 1, the hypothesis Hk is composite [6], [14], [35]. In applications, the groups
may be formed with some different values of parameters of a certain PD.

We study the hypothesis testing problem, which is that to decide, based on the observed
sample x, where this vector has originated from a source with a PD from a series Bk, k = 1,K,
or to accept BK+1, that is to reject to make any judgement. The procedure is the universal
test (do not specializing individual PDs in groups), we denote it by ΦN [34]. This problem
may also be considered as specific task of detection for multiple composite hypotheses, also
having the possibility to refuse any decision.

The test ΦN can be defined by partition of the space XN into K + 1 disjoint subsets
AN1 ,AN2 , ...,ANK+1, where ANk , k = 1,K, contains all vectors x for which the test adopts the
hypothesis Hk, and ANK+1 includes all vectors x for which the test refuses to take a certain
answer.

We denote by Φ the infinite sequence of tests ΦN. Let αl|k(ΦN ) for l 6= k, l = 1,K,
k = 1,K be the probability of the erroneous acceptance of the hypothesis Hl by the test ΦN
provided that the hypothesis Hk is true, we define (see [6], [35]):

αNl|k = αl|k(ΦN )
4
= max

Gm∈Bk
GNm(A

N
l ). (1)

When we decline any decision, but the hypothesis Hk is true, we consider the following
probability of error:

αNK+1|k = αK+1|k(ΦN )
4
= max

Gm∈Bk
GNm(A

N
K+1), k = 1,K. (2)

The probability of not accepting the true hypothesis Hk, we define in the following way:

αNk|k = αk|k(ΦN )
4
=

∑
l 6=k, l=1,K+1

αNl|k = max
Gm∈Bk

GNm(ANk ), k = 1,K. (3)

Note that our approach differs from the approaches in [6], where only αNk|k are studied, and

in [38] where the αNk|k are not considered.
We study the corresponding reliabilities (error probabilities exponents) El|k of the tests

sequence Φ:

El|k = El|k(Φ)
4
= lim

N→∞

(
−

1

N
log αNl|k

)
, k = 1,K, l = 1,K + 1. (4)

All reliabilities are arranged in (K + 1) ×K matrix. For instance, at K = 3 the matrix of
reliabilities has the following form

E(Φ) =


 E1|1 E2|1 E3|1 E4|1E1|2 E2|2 E3|2 E4|2
E1|3 E2|3 E3|3 E4|3


 .

Definitions (3) and (4) imply that

Ek|k = min
l 6=k, l=1,K+1

El|k, k = 1,K. (5)

We call the tests sequence Φ∗ logarithmically asymptotically optimal (LAO) for this
model if for given positive values of certain K elements of the reliabilities matrix E(Φ∗) the


E. Haroutunian and A. Yesayan 21

procedure Φ∗ provides maximal values for all other elements of it [3]. This criterion can be
considered as a proper specification of the Neyman-Pearson approach to the universal test
of multiple hypotheses in the sense of optimality of reliabilities. In certain publications, the
LAO approach is referred to as the “exponential rate optimal” (ERO) [32], [16], [39].

In opposition to the criterion adopted by Gutman [16], we recognize the asymmetry in the
importance of different hypotheses and consider unequal requirements to error probabilities,
or reliabilities of their detection.

We use the following notions and notations:
Shannon entropy of PD P on alphabet X :

H(P) = −
∑
x

P(x) log P(x),

divergence, Kullback-Leibler information, relative entropy, or“distance” of two PDs P1 and
P2 on X :

D(P1||P2) =
∑
x

P1(x) log
P1(x)

P2(x)
,

a new notion introduced in [22], divergence of three PDs P1,P2,P3 on X :

D(P1||P2||P3)
4
=
∑
x

P1(x) log
P2(x)

P3(x)
= D(P1||P2) −D(P1||P3).

As was noted in introduction, our study applies the method of types, developed in infor-
mation theory [7, 11, 13, 31]. The basic notion in this method is the notion of the type Qx of
the vector x ∈XN , which is equivalent to the statistical notion of the empirical distribution
of the sample x:

Qx = {Qx(x) = N(x/x)/N,x ∈X},
where N(x/x) is the number of repetitions of the element x in the sample x. We denote by
Q(XN ) the set of all possible types on XN . It is clear that Q(XN ) ⊂P(X).

We will denote divergence by DN (Q||P) when Q ∈ Q(XN ) and P ∈ P(X). Note that
DN (Q||P) → D(Q||P), when N →∞.

Let T NQ (X) be the family of all vectors x of the type Q. For Q /∈ Q(XN ) , we have
T NQ (X) = ∅. We will use the following estimates [11], [31]:

| Q(XN ) |≤ (N + 1)|X|, (6)

(N + 1)−|X| exp{NH(Q)}≤ |T NQ (X)| ≤ exp{NH(Q)}. (7)
We will denote for brevity:

for Q ∈P(X), D(Q||Bk)
4
= min

Gm∈Bk
D(Q||Gm), (8)

and for Q ∈Q(XN ), DN (Q||Bk)
4
= min

Gm∈Bk
DN (Q||Gm), (9)

for Rl ⊂P(X), D(Rl||Bk)
4
= min

Q∈Rl
D(Q||Bk), (10)

and for RNl ⊂Q(X
N ), DN (RNl ||Bk)

4
= min

Q∈RN
l

DN (Q||Bk). (11)

In the following sections, we present ways of optimal tests construction for the considered
models and investigate the corresponding error probabilities and reliabilities.


22 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr.

3. Testing for Model with Rejection Option

To construct the desired LAO test corresponding to preliminary given strictly positive num-
bers E1|1,E2|2, ...,EK|K we define the following subsets of distributions:

RNk
4
= {Q ∈Q(XN ) : DN (Q||Bk) ≤ Ek|k}, k = 1,K, (12)

RNK+1
4
= {Q ∈Q(XN ) : DN (Q||Bk) > Ek|k, k = 1,K}, (13)

Rk
4
= {Q ∈P(X) : D(Q||Bk) ≤ Ek|k}, k = 1,K, (14)

RK+1
4
= {Q ∈P(X) : D(Q||Bk) > Ek|k, k = 1,K}, (15)

It is clear that
RNk ⊂Rk, k = 1,K + 1. (16)

Define also the following values of reliabilities:

E∗k|k = E
∗
k|k(Ek|k)

4
= Ek|k, k = 1,K, (17)

E∗l|k = E
∗
l|k(El|l)

4
= D(Rl||Bk), k = 1,K, k 6= l, l = 1,K, (18)

E∗K+1|k = E
∗
K+1|k(E1|1,E2|2, ...,EK|K)

4
= D(RK+1||Bk) = E∗k|k, k = 1,K. (19)

Theorem 1: If all distributions Gm, m = 1,M, are different in the sense that
D(Gm′||Gm) > 0, m′ 6= m, and the strictly positive numbers E1|1,E2|2, ...,EK|K are such
that the following inequalities hold

E∗1|1 < min
l=2,K

D(Rl||B1) (20)

E∗k|k < min( min
l=1,k−1

E∗l|k(El|l), min
l=k+1,K

D(Rl||Bk)) (20′)

E∗K|K < min
l=1,K−1

E∗l|K(El|l) (20
′′)

then there exists an LAO sequence of tests, all elements of the reliability matrix E∗ = {E∗l|k}
of which are defined in (17)-(19) and are strictly positive.

When at least one of the inequalities in (20) is violated, then at least one element of the
matrix of reliabilities E∗ is equal to 0. More than that, if we try to detect with such El|l
which for some l ∈ [1; K + 1] and k ∈ [1; K] is greater than D(Rl||Bk), then the test for all
N = 1, 2, ... will make an error with the probability 1.

Proof: Having a collection of numbers satisfying the conditions (20) we pass to the proof
of the positive statement of the theorem, that is, to the construction of the test.

Consider a sequence of tests Φ∗, which is defined by partition of sample space XN on the
following K + 1 subsets:

AN∗k =
⋃

Q∈RN
k

T NQ (X), k = 1,K,

(21)

AN∗K+1 = X
N −

K⋃
k=1

AN∗k , N = 1, 2, ....


E. Haroutunian and A. Yesayan 23

This AN∗k ⊂ XN , because when Q /∈ Q(XN ), then T NQ (X) is empty. Let us prove that the
collection of sets in (21) determines a test, namely, each x belongs to one and only to one
subset AN∗k ,

AN∗k
⋂
AN∗r = ∅, r 6= k, and

K+1∑
k=1

AN∗k = X
N.

Really, for k = 2,K, r = 1,K − 1, for each k > r let us consider arbitrary x ∈AN∗k . We see
that in accordance with (21) and (12) there exists T NQx (X) ⊂A

N∗
k , such that D

N (Qx||Bk) ≤
Ek|k. As k > r from conditions (20) it follows that Er|r < E

∗
k|r(Ek|k). From definition (18) and

inequality DN (Qx||Bk) ≤ Ek|k we obtain Er|r < E∗k|r(Ek|k) = min
Q∈RN

k

D(Q||Br) ≤ DN (Qx||Br).

Hence Qx /∈RNr and from (21) it follows that x /∈AN∗r .
We can verify that AN∗K+1

⋂
AN∗k = ∅, k = 1,K, because if x ∈ AN∗K+1, then by (15) for

type Qx the inequality D
N (Qx||Bk) > Ek|k is true for k = 1,K. According to the definition

(21) of AN∗k , k = 1,K, we see that x /∈AN∗k .
The sample x from T NQ (X) ⊂Q(XN ) has the following probability:

GNm(x) =
N∏

n=1

Gm(xn) =
∏
x

Gm(x)
N(x/x) =

∏
x

Gm(x)
NQx(x)

= exp{N
∑
x

(−Qx(x) log
Qx(x)

Gm(x)
+ Qx(x) log Qx(x))}

= exp{−N[DN (Qx||Gm) + H(Qx)]}. (22)

Now for k = 1,K, using (3), (21), (6), (7), (12) and (22) we can upper estimate αk|k(Φ
∗
N ) as

follows:
αk|k(Φ

∗
N ) = max

Gm∈Bk
GNm(AN∗k ) = max

Gm∈Bk
GNm(

⋃
Q/∈RN

k

T NQ (X))

≤ (N + 1)|X| max
Gm∈Bk

max
Q:DN (Q||Bk)>Ek|k

GNm(T
N
Q (X))

≤ (N + 1)|X| max
Q:DN (Q||Bk)>Ek|k

exp{−NDN (Q||Bk)}

≤ exp{−N[ inf
Q:DN (Q||Bk)>Ek|k

DN (Q||Bk) −oN (1)]}

= exp{−N(Ek|k −oN (1))}, (23)

where oN (1) → 0 with N →∞. From here E∗k|k ≥ Ek|k, k = 1,K.
Now let us prove the lower inequalities for l = 1,K, k = 1,K, l 6= k. From (1), (21), (7)

and (22) we obtain

αk|k(Φ
∗
N ) = max

Gm∈Bk
GNm(AN∗k ) = max

Gm∈Bk
GNm(

⋃
Q/∈RN

k

T NQ (X))

≥ max
Gm∈Bk

max
Q/∈RN

k

GNm(T
N
Q (X))

≥ (N + 1)−|X| max
Gm∈Bk

max
Q:DN (Q||Bk)>Ek|k

exp{−NDN (Q||Gm)}

= exp{−N( min
Gm∈Bk

inf
Q:DN (Q||Bk)>Ek|k

DN (Q||Gm) + o(1))}


24 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr.

= exp{−N( inf
Q:DN (Q||Bk)>Ek|k

DN (Q||Bk) + o(1))}

= exp{−N(Ek|k + o(1))}. (24)
(23) and (24) give us (17). We can obtain similar upper estimates E∗l|k ≥ El|k for l =
1,K, k = 1,K, l 6= k. According to (1), (6), (7) and (10) we have

αl|k(Φ
∗
N ) = max

Gm∈Bk
GNm(A

N∗
l ) = max

Gm∈Bk
GNm(

⋃
Q∈RN

l

T NQ (X))

≤ (N + 1)|X| max
Gm∈Bk

max
Q∈RN

l

exp{−NDN (Q||Gm)}

= exp{−N(DN (Rl||Bk) −oN (1))}. (25)
Again for l = 1,K,k = 1,K,l 6= k, we lower estimate

αl|k(Φ
∗
N ) = max

Gm∈Bk
GNm(A

N∗
l ) = max

Gm∈Bk
GNm(

⋃
Q∈RN

l

T NQ (X))

≥ (N + 1)−|X| max
Gm∈Bk

max
Q∈RN

l

exp{−ND(Q||Gm)}

= exp{−N(D(Rl||Bk) + oN (1))} = exp{−N(El|k + oN (1))}. (26)
According to the definition (4), the reliability El|k(Φ

∗) of the test sequence Φ∗ is the limit
inferiour lim

N→∞
(− 1

N
log αl|k(Φ

∗
N )), taking into account (25), (26) and the continuity of the

functional DN (Q||Gl), we obtain that lim
N→∞

(− 1
N

log αl|k(Φ
∗
N )) exists and (18) is correct.

Similarly we can obtain upper and lower bounds for αK+1|k(Φ
∗
N ), k = 1,K. Applying the

analogous resoning we get (19).
The proof of the first part of the theorem will be accomplished if we demonstrate that

the sequence of tests Φ∗ is LAO, that is, for every other sequence of tests Φ∗∗ with the same
reliabilities E1|1, ...,EK|K for all l = 1,K + 1, l 6= k, k = 1,K, inequalities El|k(Φ∗∗) ≤
El|k(Φ

∗) hold. Suppose the contrary is the case, that is there exists sequence of tests Φ∗∗

defined by the sets DN1 , ...,DNK+1 such that

El|k(Φ
∗∗) > El|k(Φ

∗) for some l ∈ [1,K + 1], k ∈ [1,K], l 6= k. (27)

For tests Φ∗ and Φ∗∗ the space XN is decomposed into subsets AN∗l and, respectively, into
DNl , l = 1,K + 1, such that for l = 1,K and N large enough.

max
Gm∈Bl

GNm(A
N∗
l ) = max

Gm∈Bl
GNm(D

N
l ) = 1 − exp{−NEl|l} (28)

and AN∗l are constructed in (21) with sets of types T NQ (X) including almost all vectors x
having positive probability max

Gm∈Bl
Gm. From here for the set AN∗l −DNl ⊂DN∗l we have

lim
N→∞

max
Gm∈Bl

GNm(A
N∗
l −D

N
l ) ≤ lim

N→∞
exp(−NEl|l) = 0. (29)

By (4), (1) and (27)

El|k(Φ
∗∗) = lim

N→∞

(
−

1

N
log max

Gm∈Bk
GNm(D

N
l )
)


E. Haroutunian and A. Yesayan 25

≥ lim
N→∞

(
−

1

N
log max

Gm∈Bk
GNm(A

N
l )
)
.

That is for N large enough

max
Gm∈Bk

GNm(D
N
l ) ≤ max

Gm∈Bk
GNm(A

N∗
l ),

and hence
max
Gm∈Bk

GNm(A
N∗
l −D

N
l ) > 0,

which contradicts (29). So, we must conclude that (27) is not possible.
For the proof of the second part of Theorem 1, it is enough to note that if one of the

conditions (20) is violated, then from (12)-(19) it follows that at least one of the elements
El|k is equal to 0. In case when El|l > D(Rl||Bk) from (3), (21), (12), (22), (7) we have for
all N

αNl|l = max
Gm∈Bl

GNm(A
N
l ) = max

Gm∈Bl
GNm(

⋃
Q∈RN

k

T NQ (X)) ≥ max
Gm∈Bl

max
Q∈RN

l

GNm(T
N
Q (X))

= exp{−N min
Gm∈Bl

min
Q∈RN

l

DN (Q||Gm)} = exp{−N × 0} = 1,

because for Gm ∈BNl we have min
Gm∈Bl

min
Q∈RN

l

DN (Q||Gm) = 0. Theorem 1 is proved.

4. Case without Rejection of Decision

Consider also the standard case when the decision is obligatory. Again we have M possible
PDs Gm ∈P(X), m = 1,M, which are placed in K groups B1, B2, ...,BK, which we envisage
as hypotheses Hk, k = 1,K. The unknown hypothesis must be detected on the base of
sample x = (x1,x2, ...,xN ). The test ΦN can be designed by dividing the sample space XN
into K subsets AN1 ,AN2 , ...,ANK as acceptance regions for the hypotheses of the same number.
The test is characterized by error probabilities.

αNl|k = αl|k(ΦN )
4
= max

Gm∈Bk
GNm(A

N
l ), l 6= k, l,k = 1,K.

αNk|k = αk|k(ΦN )
4
=

∑
l 6=k, l=1,K

αNl|k, k = 1,K.

The reliabilities are difined as in (4)

El|k = El|k(Φ)
4
= lim

N→∞

(
−

1

N
log αNl|k

)
, k, l = 1,K.

We shape the LAO sequence of tests Φ for preliminary given positive numbers
E1|1,E2|2, ...,EK−1|K−1 by the following regions of PDs

Rk
4
= {Q ∈P(X),D(Q||Bk) ≤ Ek|k}, k = 1,K − 1,

RK
4
= {Q ∈P(X),D(Q||Bk) > Ek|k, k = 1,K − 1}.


26 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr.

Let the corresponding reliabilities be as in (17)-(19)

E∗k|k = E
∗
k|k(Ek|k)

4
= Ek|k, k = 1,K − 1, (30)

E∗l|k = E
∗
l|k(El|l)

4
= D(Rl||Bk), k = 1,K, k 6= l, l = 1,K − 1, (31)

E∗K|k = E
∗
K|k(E1|1,E2|2, ...,EK−1|K−1)

4
= D(RK||Bk), k = 1,K. (32)

Theorem 2: If all PDs Gm, m = 1,M, are different, D(Gm′||Gm) > 0, m′ 6= m, and the
strictly positive numbers E1|1,E2|2, ...,EK−1|K−1 are such that the following inequlities hold

E∗1|1 < min
l=2,K−1

D(Rl||B1)) (33)

E∗k|k < min( min
l=1,k−1

E∗l|k(E
∗
l|l), min

l=k+1,K−1
D(Rl||Bk)) k = 2,K − 2, (33′)

E∗K−1|K−1 < min
l=1,K−2

E∗l|K−1(E
∗
l|l) (33

′′)

then there exists an LAO sequence of tests, all elements of the reliabilities matrix E of which
are defined in (30)-(32) and are strictly positive. When one of the inequalities in (33) is
violated then at least one element of the matrix of reliabilities E∗ is equal to 0.

5. Some or All Given Reliabilities are Equal to Zero

The well-known Stein lemma [11] also called Chernoff-Stein lemma in [7] provides the es-
timate of the error probability for the case of two hypotheses. It asserts that when the
error probability αN1|1 is postulated as a constant, then the error probability α

N
2|2 goes to 0 as

exp{−ND(P1||P2)} as the number of observations N tends to infinity.
In this section, we present a generalization of Stein lemma in two directions. First, a

more general model, when Section 2 consists of M PDs grouped in K hypotheses and the
test has to detect an unknown hypothesis or reject any decision. And secondly, it is known
that some error probabilities αNl|l, or all K of them, tend to 0 when N goes to infinity as a

function δNl|l, such that

lim
N→∞

(
−

1

N
log δNl|l

)
= 0. (34)

In practice, δNl|l can be constants or polynomials by N. The following theorem is a general-
ization of a result from [23] as an addition to Theorem 1.

Theorem 3: When all distributions Gm, m = 1,M are different in the sense that
D(Gm′||Gm) > 0, m′ 6= m and given numbers Ek|k, k = 1,K partly or all of them are
equal to 0 and verify condition (34), then there exists an LAO test sequence Φ∗, the elements
of reliabilities matrix E(Φ∗) = {E∗l|k} of which are defined by (14), (15), (17)-(19), if the
conditions (20) hold. But in the case when the given El|l is equal to zero, the formula (18)
changes as follows:

E∗l|k = E
∗
l|k(0) = D(Bl||Bk), k = 1,K, k 6= l.

For the proof, it is enough to replace (12) by the following expressions:

RNl
4
= {Q ∈Q(XN ) : DN (Q||Bl) ≤−

1

N
log δNl|l}.


E. Haroutunian and A. Yesayan 27

6. Case of Three Distributions and Two Hypotheses

In this Section, we discuss a number of questions concerning the most simple model amongst
the considered in this paper. At first we represent a generalization of the fundamental result
of Neyman-Pearson lemma for the noted case of 3 PDs and two groups. There are given
three distributions G1, G2, G3 for a random variable X. These distributions are divided
into two groups (hypotheses) such that the first hypothesis H1 is the first distribution and
the second hypothesis is the group of two other PDs

H1 = (G1), H2 = (G2, G3). (35)

The statistician must accept or reject the first hypothesis on the base of the sample x.

Theorem 4: (Neyman-Pearson lemma) For a threshold t > 1, consider test Ψ∗N defined by
the region of acceptance AN∗ for hypothesis H1:

AN∗ = {x :
GN1 (x)

max(G2
N (x); G3

N (x))
> t}, (36)

and acceptance region AN∗ for H2. The corresponding error probabilities are

αN∗1|1 (t) = α
N∗
2|1 (t) = G

N
1 (AN∗)

αN∗2|2 (t) = α
N∗
1|2 (t) = max(G

N
2 (A

N∗
1 ); G

N
3 (A

N∗)).

Let AN ⊂ XN be the decision region for H1 of the another test ΦN with error probabilities
αN1|1 and α

N
2|2. If α

N
1|1 ≤ α

N∗
1|1 , then α

N
2|2 ≥ α

N∗
2|2.

Proof: The numbers N and t are fixed, we can do not note them during proof. Let
ΨAN∗ and ΨAN be indicator functions of the regions. It is not difficult to verify that for all
x ∈XN ,

(ΨAN∗(x) − ΨAN (x))(GN1 (x) − t max(G2
N (x); G3

N (x))) ≥ 0.

Then ∑
x∈XN

(ΨAN∗(x)G
N
1 (x) − tΨAN∗(x) max(G2

N (x); G3
N (x))

−ΨAN (x)GN1 (x) + tΨAN (x) max(G2
N (x); G3

N (x)))

=
∑

x∈AN∗
(GN1 (x) − t max(G

N
2 (x); G

N
3 (x))) −

∑
x∈AN

(GN1 (x) − t max(G
N
2 (x); G

N
3 (x)))

= (1 −α∗1|1) − tα
∗
2|2 − (1 −α1|1) + tα2|2

= (α1|1 −α∗1|1) + t(α2|2 −α
∗
2|2) ≥ 0.

So from α1|1 ≤ α∗1|1 it follows that α2|2 ≥ α
∗
2|2.

Now we reformulate Theorem 2 for the model given in (35).

Theorem 5: If PDs G1, G2, G3 are different, the strictly positive number E1|1 is such that

E1|1 ≤ min(D(G2||G1), D(G3||G1)),


28 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr.

for

R1
4
= {Q ∈P(X), D(Q||G1) ≤ E1|1},

R2
4
= {Q ∈P(X), D(Q||G1) > E1|1},

we consider
E∗1|1 = E1|1 = E

∗
2|1

E∗2|2 = E
∗
1|2 = min

Q∈R1
min(D(Q||G2),D(Q||G3))

then for the hypotheses in (35) there exists an LAO sequence of tests with strictly positive
reliabilities given above and with regions of decision for hypotheses Hk

AN∗k =
⋃

Q∈Rk
T NQ (X), k = 1, 2.

In the paper [20] of 1990 Haroutunian noted that “the principle of maximum of likelihood
is equivalent to the principle of maximum of Kullback-Leibler distance” and “the desired
tests sequence is constructed by means of distances between the sample distribution and the
hypothetical distributions”. It is worth to note that this assertion is something in common
with the following note in Cover and Thomas monograph of 1991 (p. 307) [7] concerning the
test of two hypotheses (the next one with our adopted exposition). ”In the above theorem
(the Neyman-Pearson lemma), we have shown that the optimum test is a likelihood ratio
test. We can rewrite the log-likelihood ratio as the difference between the relative entropy
distances of the sample type to each of the two distributions. Hence the likelihood ratio test
(in our notation)

GN1 (x)

G2
N (x)

> t > 1

is equivalent to

D(Qx||GN2 ) −D(Qx||G
N
1 ) >

log t

N

or (with our new notation of divergence of three PDs)

D(Qx||GN1 ||G
N
2 ) >

log t

N
.

It remains to add that for the case of simple model in (36) the likelihood ratio test is
equivalent to the following condition specifying the region of detection of the first hypothesis
in (36)

min[D(Qx||GN1 ||G
N
2 ),D(Qx||G

N
1 ||G

N
3 )] >

log t

N
.

7. Conclusion

Here we offer some concluding remarks and open problems. In this paper, we have discussed
error exponents trade-off of Neyman-Pearson suitable strategy of hypotheses testing for
models with M known discrete probability distributions joined in K (2 ≤ K ≤ M) clusters,
considered as hypotheses.

We presented a single letter characterization of the error exponents of all possible pairs
of hypotheses of tests for some cases. After a detailed proof of the point in question for a


E. Haroutunian and A. Yesayan 29

general case with the possibility of decision rejection, analogical results are announced the
case without a rejection option, the case when all or a part of the given reliabilities are
equal to zero, and finally, for a particular case of three disributions and two hypotheses. The
reasonings at the end of the previous section confirm the optimality of the tests considered
in the paper based on the use of distances between the sample and hypotheses.

For further works it is deserving exploration of characteristics of testing for generalization
and enlargement of models studied in this paper. Interesting is the case with multiple objects
[cf. 21, 24, 26, 27]. Significant are arbitrarily varying models with a sequence of states known
to the decision maker [cf. 12] and also the case when states are not known to the statistician
[cf. 2, 25]. Important is the problem of hypothesis identification [cf. 1, 18, 23]. Bayesian
framework of the problem, and sources with other than independent issues, for instance with
Markov dependence must also be investigated [cf. 16].

Acknowledgement

The authors are thankful to the reviewers for their helpful comments.

References

[1] R. F. Ahlswede and E. A. Haroutunian, “On logarithmically asymptotically optimal
testing of hypotheses and identification”, Lecture Notes in Computer Science, volume
4123, ”General Theory of Information Transfer and Combinatorics”, Springer, pp. 462–
478, 2006.

[2] R. F. Ahlswede, E. Aloyan E. A. Haroutunian, “On logarithmically asymptotically
optimal hypothesis testing for arbitrarily varying source with side information”, Lecture
Notes in Computer Science, volume 4123, ”General Theory of Information Transfer and
Combinatorics”, Springer, pp. 457-461, 2006.

[3] L. Birgé, “Vitesses maximals de décroissence des erreurs et tests optimaux associés”. Z.
Wahrsch. Verw. Gebiete, vol. 55, pp. 261–273, 1981.

[4] R. E. Blahut, “Hypotheses testing and information theory” IEEE Trans. Inform Theory,
vol. 20, no 4, pp. 405-417, 1974.

[5] R. E. Blahut, Principles and Practice of Information Theory, Addison-Wesley, Reading,
MA, 1987.

[6] A. A. Borovkov Mathematical Statistics (in Russian). Nauka, Novosibirsk, 1997.

[7] T. M. Cover and J. A. Thomas, Elements of Information Theory. Second Edition, New
York, Wiley, 2006.

[8] D. R. Cox, “Tests of separate families of hypotheses” In Proceeding 4th Berkley Simp.
Math. Statist. Prob., University of Callifornia Press, Berkelly, pp. 105-123, 1961.

[9] D. R. Cox, “Furthe results on tests of separate families of hypotheses”, Journal of the
Royal Statist. Society: Serie B, vol. 24, no. 2, pp. 406-424, 1962.

[10] D. R. Cox, “A return to an old paper: Tests of separate families of hypotheses”, Journal
of the Royal Statist. Society: Serie B, vol. 75, no. 2, pp. 207-2015, 2013.

[11] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memory-
less Systems, Academic press., New York, 1981.


30 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr.

[12] I. Csiszár and G. Longo, “On the error exponent for source coding and for testing simple
statistical hypotheses”, Studia Sc. Math. Hungarica, vol. 6, pp. 181–191, 1971.

[13] I. Csiszár and P. Shields, “ Information theory and statistics: A tutorial”, Foundations
and Trends in Communications and Information Theory, vol. 1, no. 4, 2004.

[14] M. Feder and N. Merhav, “ Universal composite hypotheses testing: A competetive
minimax approach,” IEEE Trans. Inform. Theory, vol. 48, pp. 1504-1517, June 2002.

[15] F. W. Fu and S. Y. Shan, “Hypotheses testing for arbitrarily varying source with expo-
nential type constraint”, IEEE Trans. on Inform. Theory, vol. 14, no. 2, pp. 892-895,
1998.

[16] M. Gutman, “Asymptotically optimal classification for multiple tests with empirically
observed statistics”, IEEE Trans. Inform. Theory, vol. 35, no. 2, pp. 401–408, 1989.

[17] T. S. Han and K. Kobayashi , “Exponential-type error probabilities for multiterminal
hypotheses testing”, IEEE Trans. Inform. Theory, vol. 35, no. 1, pp. 2-13, 1989.

[18] E. A. Haroutunian, “Many statistical hypotheses: interdependence of optimal test’s
error probabilities exponents”, (In Russian), Abstract of the report on the 3rd All-
Union school-seminar, “Program-algorithmical software for applied multi-variate statis-
tical analysis”, Tsakhkadzor, Part 2, pp. 177–178, 1988.

[19] E. A. Haroutunian, “On asymptotically optimal testing of many statistical hypothe-
ses conserning Markov chain”, (in Russian), Isvestia Akademii Nauk Armenii, Mathe-
matika, vol. 23, no. 1, pp. 76-80, 1988.

[20] E. A. Haroutunian, “Logarithmically asymptotically optimal testing of multiple statisti-
cal hypotheses”, Problems of Control and Information Theory, vol. 19(5-6), pp. 413–421,
1990.

[21] E. A. Haroutunian, “Reliability in multiple hypotheses testing and identification prob-
lems” Nato Science Series III, Computer and System Sciences, vol.198, IOS Press, pp.
189-201, 2003.

[22] E. A. Haroutunian, “On three hypotheses robust detection design under mismatch”,
Proceedings of International Conference CSIT 2019, pp. 125 – 128, Yerevan 2019.

[23] E. A. Haroutunian and P. M. Hakobyan, “On LAO testing of multiple hypotheses for
pair of objects”, Mathematical Problems of Computer Sciences, vol. 25, pp. 93-101,
2006.

[24] E. A. Haroutunian and P. M. Hakobyan, “Multiple hypotheses LAO testing for many
independent objects”, International Journal “Scholarly Research Exchange”, vol. 2009,
pp. 1-6, 2009.

[25] E. A. Haroutunian and P. M. Hakobyan, “On Neyman-Pearson principle in multiple
hypotheses testing”, Mathematical Problems of Computer Sciences, vol. 40, pp. 34-37,
2013.

[26] E. A. Haroutunian and P. M. Hakobyan, “Multiple objects: Error exponents in hy-
potheses testing and identification”, Lecture Notes in Computer Science, volume 7777,
”Ahlswede Festschrift”, Springer , pp. 313-345, 2013.

[27] E. A. Haroutunian, P. M. Hakobyan and F. Hormosi-nejad, “On two-stage LAO testing
of multiple hypotheses for the pair of families of distributions”, Journal of Statistics
and Econometrics Methods , vol. 2, no. 2, pp. 127-156, 2013.


E. Haroutunian and A. Yesayan 31

[28] E. A. Haroutunian, P. M. Hakobyan and A. O. Yessayan, “On multiple hypotheses
LAO testing with rejection of decision for many independent objects”, Proceedings of
International Conference CSIT, pp. 117 – 120, 2011.

[29] E. A. Haroutunian, P. M. Hakobyan and A. O. Yesayan, “Many hypotheses LAO testing
with rejection of decision for arbitrarily varying object”, Mathematical Problems of
Computer Sciences, vol. 35, pp. 77-85, 2011.

[30] E. A. Haroutunian, P. M. Hakobyan and A. O. Yessayan, “On multiple hypotheses LAO
testing with liberty of rejection of decision for two independent objects”, International
Journal, Information theories and applications, vol. 25 , no. 1, pp. 38-46, 2018.

[31] E. A. Haroutunian, M. E. Haroutunian and A. N. Harutyunyan, “Reliability criteria
in information theory and in statistical hypothesis testing”, Foundations and Trends in
Communications and Information Theory, vol. 4, no. 2-3, 2008.

[32] W. Hoeffding, “Asymptotically optimal tests for multinomial distributions,” The Annals
of Mathematical Statistics, vol. 36, pp. 369–401, 1965.

[33] S. Kullback, Information Theory and Statistics, New York, Wiley, 1959.

[34] E. Levitan and N. Merhav, “A competitive Neyman-Pearson approach to universal
hypothesis testing with applications”, IEEE Trans. Inform. Theory, vol. 48, no. 8, pp.
2215-2229, 2002.

[35] B. C. Levy, Principles of Signal Detection and Parameter Estimation, Springer, 2008.

[36] G. Longo and A. Sgarro, “ The error exponent for the testing of simple statistical
hypotheses: A combinatorial approach”, Journal of Combinatorics and Informational
System Science, vol. 5, no. 1, pp. 58-67, 1980.

[37] V. Poor, An Introduction to Signal Deyection and Estimation, Springer, 1994.

[38] E. Tuncel, “On error exponents in hypothesis testing’, IEEE Trans. Inform. Theory,
vol. 51, mo.8, pp. 2945-2950, 2005.

[39] G. Tusnády, “On asymptotically optimal tests,” Annals of Statatistics, vol. 5, no. 2, pp.
385-393, 1977.

[40] H. van Trees, Detection, Estimation and Modulation Theory, pt.1 New York, Wiley,
1968.

Submitted 28.07.2020, accepted 26.11.2020.


3 2 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr.

´³ßËáõÙÝ»ñÇ ËÙµ»ñáí Ï³½Ùí³Í µ³½Ù³ÃÇí í³ñÏ³ÍÝ»ñÇ

Ü»ÛÙ³ÝÇ-äÇñëáÝÇ Ñ³ÙÁÝ¹Ñ³Ýáõñ ï»ëï³íáñÙ³Ý

Ñ³ïáõÏ áõÕÇ

ºí·»ÝÇ ². Ð³ñáõÃÛáõÝÛ³Ý ¨ ²ñ³Ù ú. ºë³Û³Ý

ÐÐ ¶²² ÆÝýáñÙ³ïÇÏ³ÛÇ ¨ ³íïáÙ³ï³óÙ³Ý åñáµÉ»ÙÝ»ñÇ ÇÝëïÇïáõï

e-mail: eghishe@sci.am, armfrance@yahoo.fr

²Ù÷á÷áõÙ

Ð»ï³½áïí»É »Ý Ü»ÛÙ³ÝÇ-äÇñëáÝÇ ³ëÇÙåïáïáñ»Ý ûåïÇÙ³É Ñ³ÛïÝ³µ»ñÙ³Ý
ÁÝÃ³ó³Ï³ñ·»ñÁ ³ÛÝ Ùá¹»ÉÝ»ñÇ Ñ³Ù³ñ, áñáÝù µÝáõÃ³·ñíáõÙ »Ý M ¹ÇëÏñ»ï
Ñ³í³Ý³Ï³ÝáõÃÛáõÝÝ»ñÇ µ³ßËáõÙÝ»ñáí, áñáÝù ËÙµ³íáñí³Í »Ý Áëï K ¹³ë»ñÇ, 2 ·
K · M , áñáÝù ¹Çï³ñÏíáõÙ »Ý áñå»ë í³ñÏ³ÍÝ»ñ: ¸Çï³ñÏáõÙÝ»ñÇ ù³Ý³ÏÇ ³×Ç íñ³

Ñïåöèàëüíûé ïóòü Íåéìàíà-Ïèðñîíà ê óíèâåðñàëüíîé

ïðîâåðêå ìíîãèõ ãèïîòåç, ñôîðìèðîâàííûõ
ãðóïïàìè ðàñïðåäåëåíèé

Åâãåíèé À. Àðóòóíÿí è Àðàì Î. Åñàÿí

Èíñòèòóò ïðîáëåì èíôîðìàòèêè è àâòîìàòèçàöèè ÍÀÍ ÐÀ
e-mail: eghishe@sci.am, armfrance@yahoo.fr

Àííîòàöèÿ

Èññëåäóþòñÿ àñèìïòîòè÷åñêè îïòèìàëüíûå ïðîöåäóðû Íåéìàíà-Ïèðñîíà
îáíàðóæåíèÿ äëÿ ìîäåëåé, õàðàêòåðèçóþùèõñÿ M äèñêðåòíûìè ðàñïðåäåëåíèÿìè
âåðîÿòíîñòåé, ñãðóïïèðîâàííûìè â K ãðóïï, 2 · K · M, ðàññìàòðèâàåìûõ
êàê ãèïîòåçû. Ïîñëåäîâàòåëüíîñòü òåñòîâ, ïðè âîçðàñòàíèè ÷èñëà íàáëþäåíèé,
ÿâëÿåòñÿ ëîãàðèôìè÷åñêè àñèìïòîòè÷åñêè îïòèìàëüíîé (LAO), êîãäà îïðåäåëåííàÿ

´³Ý³ÉÇ µ³é»ñ` íÇ×³Ï³·ñ³Ï³Ý í³ñÏ³ÍÝ»ñÇ ëïáõ·áõÙ, í³ñÏ³ÍÝ»ñÇ ÁÝï³ÝÇùÝ»ñ,
ûåïÇÙ³É Ñ³ÛïÝ³µ»ñáõÙ, Ü»ÛÙ³Ý-äÇñëáÝÇ Ùáï»óáõÙ, Ü»ÛÙ³Ý-äÇñëáÝÇ É»ÙÙ³, ÎáõÉµ³ÏÇ-
ÈáÛµÉ»ñÇ Ñ»é³íáñáõÃÛ³Ý Ù³ùëÇÙáõÙÇ ëÏ½µáõÝù, ëË³ÉÇ óáõóÇã:

ÑÇÙÝí³Í ï»ëï»ñÇ Ñ³çáñ¹³Ï³ÝáõÃÛáõÝÁ Éá·³ñÇÃÙáñ»Ý ³ëÇÙåïáïÇÏáñ»Ý ûåïÇÙ³É
(È²ú) ¿, »ñµ ïñí³Í óáõóÇãÝ»ñÇ (Ñáõë³ÉÇáõÃÛáõÝÝ»ñÇ) áñáß³ÏÇ Ù³ëÁ ³å³ÑáíáõÙ
¿ÙÝ³ó³Í µáÉáñ Ñáõë³ÉÇáõÃÛáõÝÝ»ñÇ Ñ³Ù³ñ ¹ñ³Ï³Ý ³ñÅ»ùÝ»ñ áõÝ»Ý³ÉÁ: Î³éáõóí»É »Ý
ûµÛ»ÏïÝ»ñÇ áñáß Ùá¹»ÉÝ»ñÇ Ñ³Ù³ñ È²ú ï»ëï»ñÇ ³ÛÝ ¹»åù»ñÁ, »ñµ ÃáõÛÉ³ïñíáõÙ ¿
Ññ³Å³ñí»É áñáßáõÙ ÁÝ¹áõÝ»Éáõó ¨ »ñµ ëË³ÉÝ»ñÇ Ñ³í³Ý³Ï³ÝáõÃÛáõÝÝ»ñÇ ÙÇ Ù³ëÁ Ï³Ù
µáÉáñÁ ÷áñÓ»ñÇ ÃíÇ ³×Ç Ñ»ï Ù»Ïï»Õ Ýí³½áõÙ »Ý »ÝÃ³óáõóã³ÛÇÝ ûñ»Ýùáí: êï³óí»É »Ý
³Û¹åÇëÇ ï»ëï»ñÇ µáÉáñ Ñáõë³ÉÇáõÃÛáõÝÝ»ñÇ ÙÇ³ï³é µÝáõÃ³·ñáõÙÝ»ñÁ: ¸Çï³ñÏí³Í ¿
ÙÇ Ñ³ë³ñ³Ï ¹»åù, »ñµ µ³ßËáõÙÝ»ñÇ ÃÇíÁ »ñ»ù ¿, ÇëÏ í³ñÏ³ÍÝ»ñÇ ÃÇíÁ` »ñÏáõ:


E. Haroutunian and A. Yesayan 3 3

÷àñòü çàäàííûõ ýêñïîíåíò (íàäåæíîñòåé) îáåñïå÷èâàåò ïîëîæèòåëüíûå çíà÷åíèÿ
äëÿ âñåõ äðóãèõ íàäåæíîñòåé. Ñêîíñòðóèðîâàíû LAO ïîñëåäîâàòåëüíîñòè
òåñòîâ äëÿ íåêîòîðûõ ìîäåëåé, â òîì ÷èñëå â ñëó÷àÿõ, êîãäà ðàçðåøåí îòêàç
îò ïðèíÿòèÿ ðåøåíèÿ è êîãäà ÷àñòü èëè âñå çàäàííûå âåðîÿòíîñòè îøèáîê
óáûâàþò ñóáýêñïîíåíöèàëüíî ñ ðîñòîì êîëè÷åñòâà ýêñïåðèìåíòîâ. Ïîëó÷åíû
îäíîáóêâåííûå õàðàêòåðèñòèêè äëÿ âñåõ íàäåæíîñòåé òàêèõ òåñòîâ. Ðàññìîòðåí
ïðîñòîé ñëó÷àé ñ òðåìÿ ðàñïðåäåëåíèÿìè è äâóìÿ ãèïîòåçàìè.

Êëþ÷åâûå ñëîâà: ïðîâåðêà ñòàòèñòè÷åñêèõ ãèïîòåç, ñåìåéñòâà ãèïîòåç,
îïòèìàëüíîå îáíàðóæåíèå, ïîäõîä Íåéìàíà-Ïèðñîíà, ëåììà Íåéìàíà-Ïèðñîíà,
ïðèíöèï ìàêñèìóìà ðàññòîÿíèÿ Êóëüáàêà-Ëåéáëåðà, ïîêàçàòåëü îøèáêè.


	02_Haroutunian_54_18_33
	Haroutunian_new