Mathematical Problems of Computer Science 54, 18–33, 2020. UDC 519.2 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distributions Evgueni A. Haroutunian and Aram O. Yesayan Institute for Informatics and Automation Problems of NAS RA e-mail: eghishe@sci.am, armfrance@yahoo.fr Abstract The asymptotically optimal Neyman-Pearson procedures of detection for models characterized by M discrete probability distributions arranged into K, 2 ≤ K ≤ M groups considered as hypotheses are investigated. The sequence of tests based on a growing number of observations is logarithmically asymptotically optimal (LAO) when a certain part of the given error probability exponents (reliabilities) provides positives values for all other reliabilities. LAO tests sequences for some models of objects, including cases, when rejection of decision may be permitted, and when part, or all given error probabilities decrease subexponentially with an increase in the of number of experiments, are desined. For all reliabilities of such tests single-letter characterizations are obtained. A simple case with three distributions and two hypotheses is considered. Keywords: Statistical hypotheses testing, Families of hypotheses, Optimal de- tection, Test with no match detection, Neyman-Pearson approach, Neyman-Pearson Lemma, Principle of maximum of Kullback-Leibler distance, Error exponent. 1. Introduction This paper is devoted to the generalization of Neyman-Pearson criterion for some specific universal hypotheses testing problem pointed out in the title. In [8] and in the following papers [9], [10], Cox formulated a number of divers examples of problems for two families of hypotheses testing and developed a general modification of the Neyman-Pearson maximum- likelihood ratio procedures for solving such problems. In a series of papers and in disseration of F. Harmosi-nejad and all [27], two stage procedures were investigated for certain models of problems of hypotheses testing. The first stage in these actions executes detecting between families of distributions, and the second stage performs detection of certain distribution in the selected family. Investigation of the present paper can be considered as a more detailed analysis of this first-stage problems. The asymptotically optimal testing of two hypotheses was investigated by Hoeffding in [32], also the concept of universal hypotheses testing was introduced there. 18 E. Haroutunian and A. Yesayan 19 The hypotheses testing problems for two hypotheses were also studied by Borovkov [6], Levy [35], van Trees [40], Csiszár and Longo [12], Tusnady [39], Longo and Sgarro [36]. Neyman-Pearson criterion of multiple hypotheses testing for discrete random variables was explored in [25]. In publications [1], [24] and [26], many hypotheses logarithmically asymptotically optimal (LAO) testing for the models consisting of many independent ob- jects was investigated. Following Birgé [3], we called the sequence of tests logarithmically asymptotically optimal (LAO), when for given values of some reliabilities (error probabil- ity exponents) the test ensures the best values for the rest of them. Haroutunian [18]-[20] investigated the problem of multiple hypotheses testing at the suggestion of R. L. Dobrushin. Construction of LAO tests sequence is realized applying “Kullback-Leibler balls” around the hypothetic distributions in the space of distributions as sets for detection of corresponding hypotheses. This concept, introduced in [16]-[17] and applied in [20]-[31] and in the present paper, conforms to the idea of “r-divergent sequences” defined in [17] and used in other works. Hypotheses testing with no-match decision was considered by Gutman [16]. In papers [28]-[30] the results of researches of characteristics of LAO hypotheses testing with possibility of rejection of decision for some models with one or multiple objects, with side information are presented. Our study is based on information theoretic methods including the method of types. Applications of methods of information theory in mathematical statistics, in particular in hypotheses testing, are exposed in the monographs by Csiszár and Körner [11], Blahut [5], Cover and Thomas [7], Csiszár and Shields [13], Poor [37], Kullback [33] Haroutunian and all [31] , in paper of Blahut [4]. The structure of this paper is as follows. Section 2 contains definitions, notations and problem argument. In central Section 3 the construction of desired LAO tests is exposed for model with groups of distributions and with possibility of rejection of decision. In Section 4, the theorem of the Section 3 is reformulated for the case without rejection option. Section 5 is devoted to the models with some reliabilities equal to 0. In final Section 6, the testing for simplest model with three distributions and two hypotheses is discussed. Conclusion also contains some open problems. 2. Problem Presentation Let P(X) be the space of all probability distributions (PDs) on a finite alphabet X . Let X be a random variable (RV) taking values in the set X with one of M possible PDs Gm ∈ P(X), m = 1,M. Let x = (x1,x2, ...,xN ), xn ∈ X , n = 1,N, be a vector of results of N independent observations of the RV X. Then the PD GNm(x) = N∏ n=1 Gm(xn) and GNm ∈P(XN ) The M different PDs are arranged into K, 2 ≤ K ≤ M different groups B1, B2, ..., BK, which we consider as K hypotheses (suppositions) Hk concerning the distributions of the studied object. We consider also an empty “group” BK+1. These groups are mutually disjoint, contain |B1|, |B2|, ..., |BK+1| PDs such that K∑ k=1 |Bk| = M, |BK+1| = 0. 20 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr. When |Bk| > 1, the hypothesis Hk is composite [6], [14], [35]. In applications, the groups may be formed with some different values of parameters of a certain PD. We study the hypothesis testing problem, which is that to decide, based on the observed sample x, where this vector has originated from a source with a PD from a series Bk, k = 1,K, or to accept BK+1, that is to reject to make any judgement. The procedure is the universal test (do not specializing individual PDs in groups), we denote it by ΦN [34]. This problem may also be considered as specific task of detection for multiple composite hypotheses, also having the possibility to refuse any decision. The test ΦN can be defined by partition of the space XN into K + 1 disjoint subsets AN1 ,AN2 , ...,ANK+1, where ANk , k = 1,K, contains all vectors x for which the test adopts the hypothesis Hk, and ANK+1 includes all vectors x for which the test refuses to take a certain answer. We denote by Φ the infinite sequence of tests ΦN. Let αl|k(ΦN ) for l 6= k, l = 1,K, k = 1,K be the probability of the erroneous acceptance of the hypothesis Hl by the test ΦN provided that the hypothesis Hk is true, we define (see [6], [35]): αNl|k = αl|k(ΦN ) 4 = max Gm∈Bk GNm(A N l ). (1) When we decline any decision, but the hypothesis Hk is true, we consider the following probability of error: αNK+1|k = αK+1|k(ΦN ) 4 = max Gm∈Bk GNm(A N K+1), k = 1,K. (2) The probability of not accepting the true hypothesis Hk, we define in the following way: αNk|k = αk|k(ΦN ) 4 = ∑ l 6=k, l=1,K+1 αNl|k = max Gm∈Bk GNm(ANk ), k = 1,K. (3) Note that our approach differs from the approaches in [6], where only αNk|k are studied, and in [38] where the αNk|k are not considered. We study the corresponding reliabilities (error probabilities exponents) El|k of the tests sequence Φ: El|k = El|k(Φ) 4 = lim N→∞ ( − 1 N log αNl|k ) , k = 1,K, l = 1,K + 1. (4) All reliabilities are arranged in (K + 1) ×K matrix. For instance, at K = 3 the matrix of reliabilities has the following form E(Φ) =   E1|1 E2|1 E3|1 E4|1E1|2 E2|2 E3|2 E4|2 E1|3 E2|3 E3|3 E4|3   . Definitions (3) and (4) imply that Ek|k = min l 6=k, l=1,K+1 El|k, k = 1,K. (5) We call the tests sequence Φ∗ logarithmically asymptotically optimal (LAO) for this model if for given positive values of certain K elements of the reliabilities matrix E(Φ∗) the E. Haroutunian and A. Yesayan 21 procedure Φ∗ provides maximal values for all other elements of it [3]. This criterion can be considered as a proper specification of the Neyman-Pearson approach to the universal test of multiple hypotheses in the sense of optimality of reliabilities. In certain publications, the LAO approach is referred to as the “exponential rate optimal” (ERO) [32], [16], [39]. In opposition to the criterion adopted by Gutman [16], we recognize the asymmetry in the importance of different hypotheses and consider unequal requirements to error probabilities, or reliabilities of their detection. We use the following notions and notations: Shannon entropy of PD P on alphabet X : H(P) = − ∑ x P(x) log P(x), divergence, Kullback-Leibler information, relative entropy, or“distance” of two PDs P1 and P2 on X : D(P1||P2) = ∑ x P1(x) log P1(x) P2(x) , a new notion introduced in [22], divergence of three PDs P1,P2,P3 on X : D(P1||P2||P3) 4 = ∑ x P1(x) log P2(x) P3(x) = D(P1||P2) −D(P1||P3). As was noted in introduction, our study applies the method of types, developed in infor- mation theory [7, 11, 13, 31]. The basic notion in this method is the notion of the type Qx of the vector x ∈XN , which is equivalent to the statistical notion of the empirical distribution of the sample x: Qx = {Qx(x) = N(x/x)/N,x ∈X}, where N(x/x) is the number of repetitions of the element x in the sample x. We denote by Q(XN ) the set of all possible types on XN . It is clear that Q(XN ) ⊂P(X). We will denote divergence by DN (Q||P) when Q ∈ Q(XN ) and P ∈ P(X). Note that DN (Q||P) → D(Q||P), when N →∞. Let T NQ (X) be the family of all vectors x of the type Q. For Q /∈ Q(XN ) , we have T NQ (X) = ∅. We will use the following estimates [11], [31]: | Q(XN ) |≤ (N + 1)|X|, (6) (N + 1)−|X| exp{NH(Q)}≤ |T NQ (X)| ≤ exp{NH(Q)}. (7) We will denote for brevity: for Q ∈P(X), D(Q||Bk) 4 = min Gm∈Bk D(Q||Gm), (8) and for Q ∈Q(XN ), DN (Q||Bk) 4 = min Gm∈Bk DN (Q||Gm), (9) for Rl ⊂P(X), D(Rl||Bk) 4 = min Q∈Rl D(Q||Bk), (10) and for RNl ⊂Q(X N ), DN (RNl ||Bk) 4 = min Q∈RN l DN (Q||Bk). (11) In the following sections, we present ways of optimal tests construction for the considered models and investigate the corresponding error probabilities and reliabilities. 22 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr. 3. Testing for Model with Rejection Option To construct the desired LAO test corresponding to preliminary given strictly positive num- bers E1|1,E2|2, ...,EK|K we define the following subsets of distributions: RNk 4 = {Q ∈Q(XN ) : DN (Q||Bk) ≤ Ek|k}, k = 1,K, (12) RNK+1 4 = {Q ∈Q(XN ) : DN (Q||Bk) > Ek|k, k = 1,K}, (13) Rk 4 = {Q ∈P(X) : D(Q||Bk) ≤ Ek|k}, k = 1,K, (14) RK+1 4 = {Q ∈P(X) : D(Q||Bk) > Ek|k, k = 1,K}, (15) It is clear that RNk ⊂Rk, k = 1,K + 1. (16) Define also the following values of reliabilities: E∗k|k = E ∗ k|k(Ek|k) 4 = Ek|k, k = 1,K, (17) E∗l|k = E ∗ l|k(El|l) 4 = D(Rl||Bk), k = 1,K, k 6= l, l = 1,K, (18) E∗K+1|k = E ∗ K+1|k(E1|1,E2|2, ...,EK|K) 4 = D(RK+1||Bk) = E∗k|k, k = 1,K. (19) Theorem 1: If all distributions Gm, m = 1,M, are different in the sense that D(Gm′||Gm) > 0, m′ 6= m, and the strictly positive numbers E1|1,E2|2, ...,EK|K are such that the following inequalities hold E∗1|1 < min l=2,K D(Rl||B1) (20) E∗k|k < min( min l=1,k−1 E∗l|k(El|l), min l=k+1,K D(Rl||Bk)) (20′) E∗K|K < min l=1,K−1 E∗l|K(El|l) (20 ′′) then there exists an LAO sequence of tests, all elements of the reliability matrix E∗ = {E∗l|k} of which are defined in (17)-(19) and are strictly positive. When at least one of the inequalities in (20) is violated, then at least one element of the matrix of reliabilities E∗ is equal to 0. More than that, if we try to detect with such El|l which for some l ∈ [1; K + 1] and k ∈ [1; K] is greater than D(Rl||Bk), then the test for all N = 1, 2, ... will make an error with the probability 1. Proof: Having a collection of numbers satisfying the conditions (20) we pass to the proof of the positive statement of the theorem, that is, to the construction of the test. Consider a sequence of tests Φ∗, which is defined by partition of sample space XN on the following K + 1 subsets: AN∗k = ⋃ Q∈RN k T NQ (X), k = 1,K, (21) AN∗K+1 = X N − K⋃ k=1 AN∗k , N = 1, 2, .... E. Haroutunian and A. Yesayan 23 This AN∗k ⊂ XN , because when Q /∈ Q(XN ), then T NQ (X) is empty. Let us prove that the collection of sets in (21) determines a test, namely, each x belongs to one and only to one subset AN∗k , AN∗k ⋂ AN∗r = ∅, r 6= k, and K+1∑ k=1 AN∗k = X N. Really, for k = 2,K, r = 1,K − 1, for each k > r let us consider arbitrary x ∈AN∗k . We see that in accordance with (21) and (12) there exists T NQx (X) ⊂A N∗ k , such that D N (Qx||Bk) ≤ Ek|k. As k > r from conditions (20) it follows that Er|r < E ∗ k|r(Ek|k). From definition (18) and inequality DN (Qx||Bk) ≤ Ek|k we obtain Er|r < E∗k|r(Ek|k) = min Q∈RN k D(Q||Br) ≤ DN (Qx||Br). Hence Qx /∈RNr and from (21) it follows that x /∈AN∗r . We can verify that AN∗K+1 ⋂ AN∗k = ∅, k = 1,K, because if x ∈ AN∗K+1, then by (15) for type Qx the inequality D N (Qx||Bk) > Ek|k is true for k = 1,K. According to the definition (21) of AN∗k , k = 1,K, we see that x /∈AN∗k . The sample x from T NQ (X) ⊂Q(XN ) has the following probability: GNm(x) = N∏ n=1 Gm(xn) = ∏ x Gm(x) N(x/x) = ∏ x Gm(x) NQx(x) = exp{N ∑ x (−Qx(x) log Qx(x) Gm(x) + Qx(x) log Qx(x))} = exp{−N[DN (Qx||Gm) + H(Qx)]}. (22) Now for k = 1,K, using (3), (21), (6), (7), (12) and (22) we can upper estimate αk|k(Φ ∗ N ) as follows: αk|k(Φ ∗ N ) = max Gm∈Bk GNm(AN∗k ) = max Gm∈Bk GNm( ⋃ Q/∈RN k T NQ (X)) ≤ (N + 1)|X| max Gm∈Bk max Q:DN (Q||Bk)>Ek|k GNm(T N Q (X)) ≤ (N + 1)|X| max Q:DN (Q||Bk)>Ek|k exp{−NDN (Q||Bk)} ≤ exp{−N[ inf Q:DN (Q||Bk)>Ek|k DN (Q||Bk) −oN (1)]} = exp{−N(Ek|k −oN (1))}, (23) where oN (1) → 0 with N →∞. From here E∗k|k ≥ Ek|k, k = 1,K. Now let us prove the lower inequalities for l = 1,K, k = 1,K, l 6= k. From (1), (21), (7) and (22) we obtain αk|k(Φ ∗ N ) = max Gm∈Bk GNm(AN∗k ) = max Gm∈Bk GNm( ⋃ Q/∈RN k T NQ (X)) ≥ max Gm∈Bk max Q/∈RN k GNm(T N Q (X)) ≥ (N + 1)−|X| max Gm∈Bk max Q:DN (Q||Bk)>Ek|k exp{−NDN (Q||Gm)} = exp{−N( min Gm∈Bk inf Q:DN (Q||Bk)>Ek|k DN (Q||Gm) + o(1))} 24 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr. = exp{−N( inf Q:DN (Q||Bk)>Ek|k DN (Q||Bk) + o(1))} = exp{−N(Ek|k + o(1))}. (24) (23) and (24) give us (17). We can obtain similar upper estimates E∗l|k ≥ El|k for l = 1,K, k = 1,K, l 6= k. According to (1), (6), (7) and (10) we have αl|k(Φ ∗ N ) = max Gm∈Bk GNm(A N∗ l ) = max Gm∈Bk GNm( ⋃ Q∈RN l T NQ (X)) ≤ (N + 1)|X| max Gm∈Bk max Q∈RN l exp{−NDN (Q||Gm)} = exp{−N(DN (Rl||Bk) −oN (1))}. (25) Again for l = 1,K,k = 1,K,l 6= k, we lower estimate αl|k(Φ ∗ N ) = max Gm∈Bk GNm(A N∗ l ) = max Gm∈Bk GNm( ⋃ Q∈RN l T NQ (X)) ≥ (N + 1)−|X| max Gm∈Bk max Q∈RN l exp{−ND(Q||Gm)} = exp{−N(D(Rl||Bk) + oN (1))} = exp{−N(El|k + oN (1))}. (26) According to the definition (4), the reliability El|k(Φ ∗) of the test sequence Φ∗ is the limit inferiour lim N→∞ (− 1 N log αl|k(Φ ∗ N )), taking into account (25), (26) and the continuity of the functional DN (Q||Gl), we obtain that lim N→∞ (− 1 N log αl|k(Φ ∗ N )) exists and (18) is correct. Similarly we can obtain upper and lower bounds for αK+1|k(Φ ∗ N ), k = 1,K. Applying the analogous resoning we get (19). The proof of the first part of the theorem will be accomplished if we demonstrate that the sequence of tests Φ∗ is LAO, that is, for every other sequence of tests Φ∗∗ with the same reliabilities E1|1, ...,EK|K for all l = 1,K + 1, l 6= k, k = 1,K, inequalities El|k(Φ∗∗) ≤ El|k(Φ ∗) hold. Suppose the contrary is the case, that is there exists sequence of tests Φ∗∗ defined by the sets DN1 , ...,DNK+1 such that El|k(Φ ∗∗) > El|k(Φ ∗) for some l ∈ [1,K + 1], k ∈ [1,K], l 6= k. (27) For tests Φ∗ and Φ∗∗ the space XN is decomposed into subsets AN∗l and, respectively, into DNl , l = 1,K + 1, such that for l = 1,K and N large enough. max Gm∈Bl GNm(A N∗ l ) = max Gm∈Bl GNm(D N l ) = 1 − exp{−NEl|l} (28) and AN∗l are constructed in (21) with sets of types T NQ (X) including almost all vectors x having positive probability max Gm∈Bl Gm. From here for the set AN∗l −DNl ⊂DN∗l we have lim N→∞ max Gm∈Bl GNm(A N∗ l −D N l ) ≤ lim N→∞ exp(−NEl|l) = 0. (29) By (4), (1) and (27) El|k(Φ ∗∗) = lim N→∞ ( − 1 N log max Gm∈Bk GNm(D N l ) ) E. Haroutunian and A. Yesayan 25 ≥ lim N→∞ ( − 1 N log max Gm∈Bk GNm(A N l ) ) . That is for N large enough max Gm∈Bk GNm(D N l ) ≤ max Gm∈Bk GNm(A N∗ l ), and hence max Gm∈Bk GNm(A N∗ l −D N l ) > 0, which contradicts (29). So, we must conclude that (27) is not possible. For the proof of the second part of Theorem 1, it is enough to note that if one of the conditions (20) is violated, then from (12)-(19) it follows that at least one of the elements El|k is equal to 0. In case when El|l > D(Rl||Bk) from (3), (21), (12), (22), (7) we have for all N αNl|l = max Gm∈Bl GNm(A N l ) = max Gm∈Bl GNm( ⋃ Q∈RN k T NQ (X)) ≥ max Gm∈Bl max Q∈RN l GNm(T N Q (X)) = exp{−N min Gm∈Bl min Q∈RN l DN (Q||Gm)} = exp{−N × 0} = 1, because for Gm ∈BNl we have min Gm∈Bl min Q∈RN l DN (Q||Gm) = 0. Theorem 1 is proved. 4. Case without Rejection of Decision Consider also the standard case when the decision is obligatory. Again we have M possible PDs Gm ∈P(X), m = 1,M, which are placed in K groups B1, B2, ...,BK, which we envisage as hypotheses Hk, k = 1,K. The unknown hypothesis must be detected on the base of sample x = (x1,x2, ...,xN ). The test ΦN can be designed by dividing the sample space XN into K subsets AN1 ,AN2 , ...,ANK as acceptance regions for the hypotheses of the same number. The test is characterized by error probabilities. αNl|k = αl|k(ΦN ) 4 = max Gm∈Bk GNm(A N l ), l 6= k, l,k = 1,K. αNk|k = αk|k(ΦN ) 4 = ∑ l 6=k, l=1,K αNl|k, k = 1,K. The reliabilities are difined as in (4) El|k = El|k(Φ) 4 = lim N→∞ ( − 1 N log αNl|k ) , k, l = 1,K. We shape the LAO sequence of tests Φ for preliminary given positive numbers E1|1,E2|2, ...,EK−1|K−1 by the following regions of PDs Rk 4 = {Q ∈P(X),D(Q||Bk) ≤ Ek|k}, k = 1,K − 1, RK 4 = {Q ∈P(X),D(Q||Bk) > Ek|k, k = 1,K − 1}. 26 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr. Let the corresponding reliabilities be as in (17)-(19) E∗k|k = E ∗ k|k(Ek|k) 4 = Ek|k, k = 1,K − 1, (30) E∗l|k = E ∗ l|k(El|l) 4 = D(Rl||Bk), k = 1,K, k 6= l, l = 1,K − 1, (31) E∗K|k = E ∗ K|k(E1|1,E2|2, ...,EK−1|K−1) 4 = D(RK||Bk), k = 1,K. (32) Theorem 2: If all PDs Gm, m = 1,M, are different, D(Gm′||Gm) > 0, m′ 6= m, and the strictly positive numbers E1|1,E2|2, ...,EK−1|K−1 are such that the following inequlities hold E∗1|1 < min l=2,K−1 D(Rl||B1)) (33) E∗k|k < min( min l=1,k−1 E∗l|k(E ∗ l|l), min l=k+1,K−1 D(Rl||Bk)) k = 2,K − 2, (33′) E∗K−1|K−1 < min l=1,K−2 E∗l|K−1(E ∗ l|l) (33 ′′) then there exists an LAO sequence of tests, all elements of the reliabilities matrix E of which are defined in (30)-(32) and are strictly positive. When one of the inequalities in (33) is violated then at least one element of the matrix of reliabilities E∗ is equal to 0. 5. Some or All Given Reliabilities are Equal to Zero The well-known Stein lemma [11] also called Chernoff-Stein lemma in [7] provides the es- timate of the error probability for the case of two hypotheses. It asserts that when the error probability αN1|1 is postulated as a constant, then the error probability α N 2|2 goes to 0 as exp{−ND(P1||P2)} as the number of observations N tends to infinity. In this section, we present a generalization of Stein lemma in two directions. First, a more general model, when Section 2 consists of M PDs grouped in K hypotheses and the test has to detect an unknown hypothesis or reject any decision. And secondly, it is known that some error probabilities αNl|l, or all K of them, tend to 0 when N goes to infinity as a function δNl|l, such that lim N→∞ ( − 1 N log δNl|l ) = 0. (34) In practice, δNl|l can be constants or polynomials by N. The following theorem is a general- ization of a result from [23] as an addition to Theorem 1. Theorem 3: When all distributions Gm, m = 1,M are different in the sense that D(Gm′||Gm) > 0, m′ 6= m and given numbers Ek|k, k = 1,K partly or all of them are equal to 0 and verify condition (34), then there exists an LAO test sequence Φ∗, the elements of reliabilities matrix E(Φ∗) = {E∗l|k} of which are defined by (14), (15), (17)-(19), if the conditions (20) hold. But in the case when the given El|l is equal to zero, the formula (18) changes as follows: E∗l|k = E ∗ l|k(0) = D(Bl||Bk), k = 1,K, k 6= l. For the proof, it is enough to replace (12) by the following expressions: RNl 4 = {Q ∈Q(XN ) : DN (Q||Bl) ≤− 1 N log δNl|l}. E. Haroutunian and A. Yesayan 27 6. Case of Three Distributions and Two Hypotheses In this Section, we discuss a number of questions concerning the most simple model amongst the considered in this paper. At first we represent a generalization of the fundamental result of Neyman-Pearson lemma for the noted case of 3 PDs and two groups. There are given three distributions G1, G2, G3 for a random variable X. These distributions are divided into two groups (hypotheses) such that the first hypothesis H1 is the first distribution and the second hypothesis is the group of two other PDs H1 = (G1), H2 = (G2, G3). (35) The statistician must accept or reject the first hypothesis on the base of the sample x. Theorem 4: (Neyman-Pearson lemma) For a threshold t > 1, consider test Ψ∗N defined by the region of acceptance AN∗ for hypothesis H1: AN∗ = {x : GN1 (x) max(G2 N (x); G3 N (x)) > t}, (36) and acceptance region AN∗ for H2. The corresponding error probabilities are αN∗1|1 (t) = α N∗ 2|1 (t) = G N 1 (AN∗) αN∗2|2 (t) = α N∗ 1|2 (t) = max(G N 2 (A N∗ 1 ); G N 3 (A N∗)). Let AN ⊂ XN be the decision region for H1 of the another test ΦN with error probabilities αN1|1 and α N 2|2. If α N 1|1 ≤ α N∗ 1|1 , then α N 2|2 ≥ α N∗ 2|2. Proof: The numbers N and t are fixed, we can do not note them during proof. Let ΨAN∗ and ΨAN be indicator functions of the regions. It is not difficult to verify that for all x ∈XN , (ΨAN∗(x) − ΨAN (x))(GN1 (x) − t max(G2 N (x); G3 N (x))) ≥ 0. Then ∑ x∈XN (ΨAN∗(x)G N 1 (x) − tΨAN∗(x) max(G2 N (x); G3 N (x)) −ΨAN (x)GN1 (x) + tΨAN (x) max(G2 N (x); G3 N (x))) = ∑ x∈AN∗ (GN1 (x) − t max(G N 2 (x); G N 3 (x))) − ∑ x∈AN (GN1 (x) − t max(G N 2 (x); G N 3 (x))) = (1 −α∗1|1) − tα ∗ 2|2 − (1 −α1|1) + tα2|2 = (α1|1 −α∗1|1) + t(α2|2 −α ∗ 2|2) ≥ 0. So from α1|1 ≤ α∗1|1 it follows that α2|2 ≥ α ∗ 2|2. Now we reformulate Theorem 2 for the model given in (35). Theorem 5: If PDs G1, G2, G3 are different, the strictly positive number E1|1 is such that E1|1 ≤ min(D(G2||G1), D(G3||G1)), 28 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr. for R1 4 = {Q ∈P(X), D(Q||G1) ≤ E1|1}, R2 4 = {Q ∈P(X), D(Q||G1) > E1|1}, we consider E∗1|1 = E1|1 = E ∗ 2|1 E∗2|2 = E ∗ 1|2 = min Q∈R1 min(D(Q||G2),D(Q||G3)) then for the hypotheses in (35) there exists an LAO sequence of tests with strictly positive reliabilities given above and with regions of decision for hypotheses Hk AN∗k = ⋃ Q∈Rk T NQ (X), k = 1, 2. In the paper [20] of 1990 Haroutunian noted that “the principle of maximum of likelihood is equivalent to the principle of maximum of Kullback-Leibler distance” and “the desired tests sequence is constructed by means of distances between the sample distribution and the hypothetical distributions”. It is worth to note that this assertion is something in common with the following note in Cover and Thomas monograph of 1991 (p. 307) [7] concerning the test of two hypotheses (the next one with our adopted exposition). ”In the above theorem (the Neyman-Pearson lemma), we have shown that the optimum test is a likelihood ratio test. We can rewrite the log-likelihood ratio as the difference between the relative entropy distances of the sample type to each of the two distributions. Hence the likelihood ratio test (in our notation) GN1 (x) G2 N (x) > t > 1 is equivalent to D(Qx||GN2 ) −D(Qx||G N 1 ) > log t N or (with our new notation of divergence of three PDs) D(Qx||GN1 ||G N 2 ) > log t N . It remains to add that for the case of simple model in (36) the likelihood ratio test is equivalent to the following condition specifying the region of detection of the first hypothesis in (36) min[D(Qx||GN1 ||G N 2 ),D(Qx||G N 1 ||G N 3 )] > log t N . 7. Conclusion Here we offer some concluding remarks and open problems. In this paper, we have discussed error exponents trade-off of Neyman-Pearson suitable strategy of hypotheses testing for models with M known discrete probability distributions joined in K (2 ≤ K ≤ M) clusters, considered as hypotheses. We presented a single letter characterization of the error exponents of all possible pairs of hypotheses of tests for some cases. After a detailed proof of the point in question for a E. Haroutunian and A. Yesayan 29 general case with the possibility of decision rejection, analogical results are announced the case without a rejection option, the case when all or a part of the given reliabilities are equal to zero, and finally, for a particular case of three disributions and two hypotheses. The reasonings at the end of the previous section confirm the optimality of the tests considered in the paper based on the use of distances between the sample and hypotheses. For further works it is deserving exploration of characteristics of testing for generalization and enlargement of models studied in this paper. Interesting is the case with multiple objects [cf. 21, 24, 26, 27]. Significant are arbitrarily varying models with a sequence of states known to the decision maker [cf. 12] and also the case when states are not known to the statistician [cf. 2, 25]. Important is the problem of hypothesis identification [cf. 1, 18, 23]. Bayesian framework of the problem, and sources with other than independent issues, for instance with Markov dependence must also be investigated [cf. 16]. Acknowledgement The authors are thankful to the reviewers for their helpful comments. References [1] R. F. Ahlswede and E. A. Haroutunian, “On logarithmically asymptotically optimal testing of hypotheses and identification”, Lecture Notes in Computer Science, volume 4123, ”General Theory of Information Transfer and Combinatorics”, Springer, pp. 462– 478, 2006. [2] R. F. Ahlswede, E. Aloyan E. A. Haroutunian, “On logarithmically asymptotically optimal hypothesis testing for arbitrarily varying source with side information”, Lecture Notes in Computer Science, volume 4123, ”General Theory of Information Transfer and Combinatorics”, Springer, pp. 457-461, 2006. [3] L. Birgé, “Vitesses maximals de décroissence des erreurs et tests optimaux associés”. Z. Wahrsch. Verw. Gebiete, vol. 55, pp. 261–273, 1981. [4] R. E. Blahut, “Hypotheses testing and information theory” IEEE Trans. Inform Theory, vol. 20, no 4, pp. 405-417, 1974. [5] R. E. Blahut, Principles and Practice of Information Theory, Addison-Wesley, Reading, MA, 1987. [6] A. A. Borovkov Mathematical Statistics (in Russian). Nauka, Novosibirsk, 1997. [7] T. M. Cover and J. A. Thomas, Elements of Information Theory. Second Edition, New York, Wiley, 2006. [8] D. R. Cox, “Tests of separate families of hypotheses” In Proceeding 4th Berkley Simp. Math. Statist. Prob., University of Callifornia Press, Berkelly, pp. 105-123, 1961. [9] D. R. Cox, “Furthe results on tests of separate families of hypotheses”, Journal of the Royal Statist. Society: Serie B, vol. 24, no. 2, pp. 406-424, 1962. [10] D. R. Cox, “A return to an old paper: Tests of separate families of hypotheses”, Journal of the Royal Statist. Society: Serie B, vol. 75, no. 2, pp. 207-2015, 2013. [11] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memory- less Systems, Academic press., New York, 1981. 30 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr. [12] I. Csiszár and G. Longo, “On the error exponent for source coding and for testing simple statistical hypotheses”, Studia Sc. Math. Hungarica, vol. 6, pp. 181–191, 1971. [13] I. Csiszár and P. Shields, “ Information theory and statistics: A tutorial”, Foundations and Trends in Communications and Information Theory, vol. 1, no. 4, 2004. [14] M. Feder and N. Merhav, “ Universal composite hypotheses testing: A competetive minimax approach,” IEEE Trans. Inform. Theory, vol. 48, pp. 1504-1517, June 2002. [15] F. W. Fu and S. Y. Shan, “Hypotheses testing for arbitrarily varying source with expo- nential type constraint”, IEEE Trans. on Inform. Theory, vol. 14, no. 2, pp. 892-895, 1998. [16] M. Gutman, “Asymptotically optimal classification for multiple tests with empirically observed statistics”, IEEE Trans. Inform. Theory, vol. 35, no. 2, pp. 401–408, 1989. [17] T. S. Han and K. Kobayashi , “Exponential-type error probabilities for multiterminal hypotheses testing”, IEEE Trans. Inform. Theory, vol. 35, no. 1, pp. 2-13, 1989. [18] E. A. Haroutunian, “Many statistical hypotheses: interdependence of optimal test’s error probabilities exponents”, (In Russian), Abstract of the report on the 3rd All- Union school-seminar, “Program-algorithmical software for applied multi-variate statis- tical analysis”, Tsakhkadzor, Part 2, pp. 177–178, 1988. [19] E. A. Haroutunian, “On asymptotically optimal testing of many statistical hypothe- ses conserning Markov chain”, (in Russian), Isvestia Akademii Nauk Armenii, Mathe- matika, vol. 23, no. 1, pp. 76-80, 1988. [20] E. A. Haroutunian, “Logarithmically asymptotically optimal testing of multiple statisti- cal hypotheses”, Problems of Control and Information Theory, vol. 19(5-6), pp. 413–421, 1990. [21] E. A. Haroutunian, “Reliability in multiple hypotheses testing and identification prob- lems” Nato Science Series III, Computer and System Sciences, vol.198, IOS Press, pp. 189-201, 2003. [22] E. A. Haroutunian, “On three hypotheses robust detection design under mismatch”, Proceedings of International Conference CSIT 2019, pp. 125 – 128, Yerevan 2019. [23] E. A. Haroutunian and P. M. Hakobyan, “On LAO testing of multiple hypotheses for pair of objects”, Mathematical Problems of Computer Sciences, vol. 25, pp. 93-101, 2006. [24] E. A. Haroutunian and P. M. Hakobyan, “Multiple hypotheses LAO testing for many independent objects”, International Journal “Scholarly Research Exchange”, vol. 2009, pp. 1-6, 2009. [25] E. A. Haroutunian and P. M. Hakobyan, “On Neyman-Pearson principle in multiple hypotheses testing”, Mathematical Problems of Computer Sciences, vol. 40, pp. 34-37, 2013. [26] E. A. Haroutunian and P. M. Hakobyan, “Multiple objects: Error exponents in hy- potheses testing and identification”, Lecture Notes in Computer Science, volume 7777, ”Ahlswede Festschrift”, Springer , pp. 313-345, 2013. [27] E. A. Haroutunian, P. M. Hakobyan and F. Hormosi-nejad, “On two-stage LAO testing of multiple hypotheses for the pair of families of distributions”, Journal of Statistics and Econometrics Methods , vol. 2, no. 2, pp. 127-156, 2013. E. Haroutunian and A. Yesayan 31 [28] E. A. Haroutunian, P. M. Hakobyan and A. O. Yessayan, “On multiple hypotheses LAO testing with rejection of decision for many independent objects”, Proceedings of International Conference CSIT, pp. 117 – 120, 2011. [29] E. A. Haroutunian, P. M. Hakobyan and A. O. Yesayan, “Many hypotheses LAO testing with rejection of decision for arbitrarily varying object”, Mathematical Problems of Computer Sciences, vol. 35, pp. 77-85, 2011. [30] E. A. Haroutunian, P. M. Hakobyan and A. O. Yessayan, “On multiple hypotheses LAO testing with liberty of rejection of decision for two independent objects”, International Journal, Information theories and applications, vol. 25 , no. 1, pp. 38-46, 2018. [31] E. A. Haroutunian, M. E. Haroutunian and A. N. Harutyunyan, “Reliability criteria in information theory and in statistical hypothesis testing”, Foundations and Trends in Communications and Information Theory, vol. 4, no. 2-3, 2008. [32] W. Hoeffding, “Asymptotically optimal tests for multinomial distributions,” The Annals of Mathematical Statistics, vol. 36, pp. 369–401, 1965. [33] S. Kullback, Information Theory and Statistics, New York, Wiley, 1959. [34] E. Levitan and N. Merhav, “A competitive Neyman-Pearson approach to universal hypothesis testing with applications”, IEEE Trans. Inform. Theory, vol. 48, no. 8, pp. 2215-2229, 2002. [35] B. C. Levy, Principles of Signal Detection and Parameter Estimation, Springer, 2008. [36] G. Longo and A. Sgarro, “ The error exponent for the testing of simple statistical hypotheses: A combinatorial approach”, Journal of Combinatorics and Informational System Science, vol. 5, no. 1, pp. 58-67, 1980. [37] V. Poor, An Introduction to Signal Deyection and Estimation, Springer, 1994. [38] E. Tuncel, “On error exponents in hypothesis testing’, IEEE Trans. Inform. Theory, vol. 51, mo.8, pp. 2945-2950, 2005. [39] G. Tusnády, “On asymptotically optimal tests,” Annals of Statatistics, vol. 5, no. 2, pp. 385-393, 1977. [40] H. van Trees, Detection, Estimation and Modulation Theory, pt.1 New York, Wiley, 1968. Submitted 28.07.2020, accepted 26.11.2020. 3 2 A Neyman-Pearson Proper Way to Universal Testing of Multiple Hypotheses Formed by Groups of Distr. ´³ßËáõÙÝ»ñÇ ËÙµ»ñáí ϳ½Ùí³Í µ³½Ù³ÃÇí í³ñϳÍÝ»ñÇ Ü»ÛÙ³ÝÇ-äÇñëáÝÇ Ñ³ÙÁݹѳÝáõñ ï»ëï³íáñÙ³Ý Ñ³ïáõÏ áõÕÇ ºí·»ÝÇ ². гñáõÃÛáõÝÛ³Ý ¨ ²ñ³Ù ú. ºë³Û³Ý ÐÐ ¶²² ÆÝýáñÙ³ïÇϳÛÇ ¨ ³íïáÙ³ï³óÙ³Ý åñáµÉ»ÙÝ»ñÇ ÇÝëïÇïáõï e-mail: eghishe@sci.am, armfrance@yahoo.fr ²Ù÷á÷áõ٠лﳽáïí»É »Ý Ü»ÛÙ³ÝÇ-äÇñëáÝÇ ³ëÇÙåïáïáñ»Ý ûåïÇÙ³É Ñ³Ûïݳµ»ñÙ³Ý ÁÝóó³Ï³ñ·»ñÁ ³ÛÝ Ùá¹»ÉÝ»ñÇ Ñ³Ù³ñ, áñáÝù µÝáõó·ñíáõÙ »Ý M ¹ÇëÏñ»ï ѳí³Ý³Ï³ÝáõÃÛáõÝÝ»ñÇ µ³ßËáõÙÝ»ñáí, áñáÝù ËÙµ³íáñí³Í »Ý Áëï K ¹³ë»ñÇ, 2 · K · M , áñáÝù ¹Çï³ñÏíáõÙ »Ý áñå»ë í³ñϳÍÝ»ñ: ¸Çï³ñÏáõÙÝ»ñÇ ù³Ý³ÏÇ ³×Ç íñ³ Ñïåöèàëüíûé ïóòü Íåéìàíà-Ïèðñîíà ê óíèâåðñàëüíîé ïðîâåðêå ìíîãèõ ãèïîòåç, ñôîðìèðîâàííûõ ãðóïïàìè ðàñïðåäåëåíèé Åâãåíèé À. Àðóòóíÿí è Àðàì Î. Åñàÿí Èíñòèòóò ïðîáëåì èíôîðìàòèêè è àâòîìàòèçàöèè ÍÀÍ ÐÀ e-mail: eghishe@sci.am, armfrance@yahoo.fr Àííîòàöèÿ Èññëåäóþòñÿ àñèìïòîòè÷åñêè îïòèìàëüíûå ïðîöåäóðû Íåéìàíà-Ïèðñîíà îáíàðóæåíèÿ äëÿ ìîäåëåé, õàðàêòåðèçóþùèõñÿ M äèñêðåòíûìè ðàñïðåäåëåíèÿìè âåðîÿòíîñòåé, ñãðóïïèðîâàííûìè â K ãðóïï, 2 · K · M, ðàññìàòðèâàåìûõ êàê ãèïîòåçû. Ïîñëåäîâàòåëüíîñòü òåñòîâ, ïðè âîçðàñòàíèè ÷èñëà íàáëþäåíèé, ÿâëÿåòñÿ ëîãàðèôìè÷åñêè àñèìïòîòè÷åñêè îïòèìàëüíîé (LAO), êîãäà îïðåäåëåííàÿ ´³Ý³ÉÇ µ³é»ñ` íÇ׳ϳ·ñ³Ï³Ý í³ñϳÍÝ»ñÇ ëïáõ·áõÙ, í³ñϳÍÝ»ñÇ ÁÝï³ÝÇùÝ»ñ, ûåïÇÙ³É Ñ³Ûïݳµ»ñáõÙ, Ü»ÛÙ³Ý-äÇñëáÝÇ Ùáï»óáõÙ, Ü»ÛÙ³Ý-äÇñëáÝÇ É»ÙÙ³, Îáõɵ³ÏÇ- Èá۵ɻñÇ Ñ»é³íáñáõÃÛ³Ý Ù³ùëÇÙáõÙÇ ëϽµáõÝù, ë˳ÉÇ óáõóÇã: ÑÇÙÝí³Í ï»ëï»ñÇ Ñ³çáñ¹³Ï³ÝáõÃÛáõÝÁ Éá·³ñÇÃÙáñ»Ý ³ëÇÙåïáïÇÏáñ»Ý ûåïÇÙ³É (Ȳú) ¿, »ñµ ïñí³Í óáõóÇãÝ»ñÇ (Ñáõë³ÉÇáõÃÛáõÝÝ»ñÇ) áñáß³ÏÇ Ù³ëÁ ³å³ÑáíáõÙ ¿Ùݳó³Í µáÉáñ Ñáõë³ÉÇáõÃÛáõÝÝ»ñÇ Ñ³Ù³ñ ¹ñ³Ï³Ý ³ñÅ»ùÝ»ñ áõݻݳÉÁ: γéáõóí»É »Ý ûµÛ»ÏïÝ»ñÇ áñáß Ùá¹»ÉÝ»ñÇ Ñ³Ù³ñ Ȳú ï»ëï»ñÇ ³ÛÝ ¹»åù»ñÁ, »ñµ ÃáõÛɳïñíáõÙ ¿ Ññ³Å³ñí»É áñáßáõÙ ÁݹáõÝ»Éáõó ¨ »ñµ ë˳ÉÝ»ñÇ Ñ³í³Ý³Ï³ÝáõÃÛáõÝÝ»ñÇ ÙÇ Ù³ëÁ ϳ٠µáÉáñÁ ÷áñÓ»ñÇ ÃíÇ ³×Ç Ñ»ï Ù»Ïï»Õ Ýí³½áõÙ »Ý »Ýóóáõóã³ÛÇÝ ûñ»Ýùáí: êï³óí»É »Ý ³Û¹åÇëÇ ï»ëï»ñÇ µáÉáñ Ñáõë³ÉÇáõÃÛáõÝÝ»ñÇ Ùdzï³é µÝáõó·ñáõÙÝ»ñÁ: ¸Çï³ñÏí³Í ¿ ÙÇ Ñ³ë³ñ³Ï ¹»åù, »ñµ µ³ßËáõÙÝ»ñÇ ÃÇíÁ »ñ»ù ¿, ÇëÏ í³ñϳÍÝ»ñÇ ÃÇíÁ` »ñÏáõ: E. Haroutunian and A. Yesayan 3 3 ÷àñòü çàäàííûõ ýêñïîíåíò (íàäåæíîñòåé) îáåñïå÷èâàåò ïîëîæèòåëüíûå çíà÷åíèÿ äëÿ âñåõ äðóãèõ íàäåæíîñòåé. Ñêîíñòðóèðîâàíû LAO ïîñëåäîâàòåëüíîñòè òåñòîâ äëÿ íåêîòîðûõ ìîäåëåé, â òîì ÷èñëå â ñëó÷àÿõ, êîãäà ðàçðåøåí îòêàç îò ïðèíÿòèÿ ðåøåíèÿ è êîãäà ÷àñòü èëè âñå çàäàííûå âåðîÿòíîñòè îøèáîê óáûâàþò ñóáýêñïîíåíöèàëüíî ñ ðîñòîì êîëè÷åñòâà ýêñïåðèìåíòîâ. Ïîëó÷åíû îäíîáóêâåííûå õàðàêòåðèñòèêè äëÿ âñåõ íàäåæíîñòåé òàêèõ òåñòîâ. Ðàññìîòðåí ïðîñòîé ñëó÷àé ñ òðåìÿ ðàñïðåäåëåíèÿìè è äâóìÿ ãèïîòåçàìè. Êëþ÷åâûå ñëîâà: ïðîâåðêà ñòàòèñòè÷åñêèõ ãèïîòåç, ñåìåéñòâà ãèïîòåç, îïòèìàëüíîå îáíàðóæåíèå, ïîäõîä Íåéìàíà-Ïèðñîíà, ëåììà Íåéìàíà-Ïèðñîíà, ïðèíöèï ìàêñèìóìà ðàññòîÿíèÿ Êóëüáàêà-Ëåéáëåðà, ïîêàçàòåëü îøèáêè. 02_Haroutunian_54_18_33 Haroutunian_new