Empirical Null Distribution of -2log(lambda)

Bushra Shamshad
Department of Statistics, University of Karachi
Main University Rd, Karachi, Karachi City, Sindh 75270, Pakistan
Phone: +92 21 99261300
bshamshad@uok.edu.pk

Junaid Saghir Siddiqi
Department of Statistics, University of Karachi
Main University Rd, Karachi, Karachi City, Sindh 75270, Pakistan
Phone: +92 21 99261300
jssdr123@yahoo.com

Abstract
The approximation of the empirical distribution of the log-likelihood ratio test statistic (-2logλ; abbreviated as LRT) by a non-central chi-square distribution has been a concern in the field of structural equation modeling. Chun & Shapiro (2009) reported that the non-central chi-square is not a good choice under extremely severe misspecification. In this paper, we use a bootstrap sampling procedure to investigate the empirical null distribution of the LRT specifically in the context of a latent class model (LCM) within a frequentist framework (that is, the EM algorithm). We used two types of data sets. The first type comprises data sets to which LCM has already been applied (published results; named "training data"). The second type comprises data sets that have not been published earlier (i.e. "real" collected data; named "test data"). A non-central χ2 distribution with degrees of freedom equal to the expected value of the bootstrapped LRT and non-centrality parameter equal to the inverse of the variance of the bootstrapped LRT is found to fit the empirical null distribution of the LRT very well in the case of LCM. These results will help in obtaining the significance value of the LRT when deciding on the number of classes present in a latent variable.

Keywords: Latent Class Model, Likelihood Ratio Test Statistics, Bootstrapping, EM Algorithm, Non-Central χ2 Distribution, Goodness of Fit.

1. Introduction
The likelihood ratio is defined as the ratio of the likelihood of one model (stated in the null hypothesis) to that of another (stated in the alternative hypothesis). The corresponding statistical test is used to compare the fit of the two models, where one of them is nested in the other. When all regularity conditions are satisfied, -2logλ follows a chi-square distribution and can be used for testing the significance of the fitted model. In the context of latent class analysis and mixture distributions, it has been known for years that the regularity conditions for -2logλ do not hold, because the model parameters under the null hypothesis lie on the boundary of the parameter space (Aitkin, Anderson & Hinde, 1981). In other words, the usual likelihood ratio (LR) test cannot be carried out even though the two models (under the null and alternative hypotheses) have different numbers of parameters and one is nested in the other: a subset of the parameters is set to zero under the null hypothesis in order to compare it with the model stated in the alternative hypothesis. Therefore, the statistic fails to follow the asymptotic chi-square distribution, and the distribution of the likelihood ratio test statistic is undefined.

Wilks (1935; 1938) showed that for large samples, the distribution of the log-likelihood ratio test statistic (-2logλ) for nested models is asymptotically χ2 with degrees of freedom equal to the difference between the dimensions of the parameter sets involved in the test statistic (where the model stated in the null hypothesis is a special case of the model stated in the alternative hypothesis).

A number of simulation studies have examined whether the chi-square can be used as an approximate distribution of -2logλ. It was suggested that the distribution of -2logλ can be approximated by a chi-square with degrees of freedom equal to twice the number of manifest items in the model (Wolfe, 1970). A further study (Hartigan, 1977) indicated that the distribution can be approximated by a chi-square with degrees of freedom between p and p+1, where p is the number of manifest items. The problem in using the LR test was discussed by Aitkin & Wilson (1980), who reported that in small samples the test statistic might not follow the asymptotic distribution of the likelihood ratio. Furthermore, Aitkin et al. (1981) and McLachlan (1987) expressed reservations about the adequacy of the chi-square approximation to the null distribution of -2logλ, since the regularity conditions do not hold. The use of a prior distribution for the vector of mixing proportions was proposed by Aitkin & Rubin (1985). A note by Quinn, McLachlan & Hjort (1987) showed that the regularity conditions do not hold for the approach of Aitkin & Rubin (1985) either, and therefore the standard asymptotic result cannot be applied.

Various researchers have conducted studies assessing the null distribution of -2logλ. As mentioned earlier, under the usual testing procedure it is taken to be asymptotically chi-square with degrees of freedom equal to the difference between the numbers of parameters under the null and alternative models. Studies conducted in this respect express reservations: since the regularity conditions do not hold for -2logλ with mixture and latent class models, the asymptotic χ2 is unlikely to fit as a null distribution of -2logλ. Aitkin et al. (1981) mentioned the suggestions made by Wolfe (1970) and Hartigan (1977), on the basis of small simulations, that -2logλ could be approximated by a χ2 with 2p degrees of freedom and that it should lie between a χ2 with p and a χ2 with p+1 degrees of freedom, respectively.
They assessed the null distribution of -2logλ by simulating two sets of values (19 points each) from a single population in which 38 items (of teaching style data) were independent and the response probabilities were estimated from the real data, to test the hypothesis of a homogeneous population at the 5% level of significance. Both sets of values were rejected. The first 19 values were simulated under the null hypothesis of a no-class model against the alternative hypothesis of a two-class model, and the other set of values was generated under a two-class model against the alternative hypothesis of a 3-class model. Both sets were also considered as a single sample, but none of the results (individual and/or pooled) supported the asymptotic χ2 distribution. Thus, we need alternative method(s) for testing hypotheses on model parameters.

2. The Empirical Null Distribution of LRT (-2logλ)
Simulation studies have been carried out in the context of mixture modeling to approximate the theoretical null distribution of the LRT, including those by Titterington et al. (1985), McLachlan (1987), McLachlan & Basford (1988) and others (for details see McLachlan & Peel, 2000).

3. Bootstrap Likelihood Ratio Test (BLRT) statistic
The BLRT was first introduced by McLachlan (1987) for assessing the p-value of the likelihood ratio test statistic for the number of components in a normal mixture, taking the simplest situation, for a specified value g0 of g, of a no-class model H0: g = g0 against a two-class model H1: g = g1. Under the null hypothesis, the method starts by generating a bootstrap sample from the mixture model with the parameter vector φ replaced by its MLE computed from the original data. The value of -2logλ is obtained for each bootstrap sample after fitting the mixture model to it for g = g0 and g = g1 in turn.
The process is repeated independently M times, and the replicated values of -2logλ evaluated from the successive bootstrap samples provide an assessment of the bootstrap, and hence of the true, null distribution of -2logλ (McLachlan & Peel, 2000).

Lo, Mendell & Rubin (2001) developed a test using the theorem proposed by Vuong (1989), under the null hypothesis that the random sample is drawn from a g0-component normal mixture distribution versus the alternative that it is drawn from a g1-component normal mixture, where g0 < g1. The likelihood ratio statistic based on the Kullback-Leibler information criterion is asymptotically distributed as a weighted sum of independent χ2 variables with 1 degree of freedom. Through simulation results, the LMR test evaluates the improvement in fit between two successive models and provides a significance value to see whether there is any improvement in the fit for the higher-component model.

BRAIN – Broad Research in Artificial Intelligence and Neuroscience, Volume 10, Issue 2 (April, 2019), ISSN 2067-3957

The inconsistency in the mathematical proof of the Lo-Mendell-Rubin test for normal outcomes was pointed out by Jeffries (2003). Despite such criticism, many researchers use the LMR test empirically for determining the number of mixtures/classes. The "Mplus" software, developed by Muthen and Muthen (www.statmodel.com), provides the LMR p-value when fitting different models. A package named "MplusAutomation_0.5" (Hallquist, 2008) is also available in R; it provides programming for model fitting, calculation of significance values of related fit statistics and extraction of model parameters.

Nylund et al. (2007) used the previously proposed bootstrap likelihood ratio statistic for LCM and calculated the p-value from the BLRT estimate of the log-likelihood difference distribution, which indicates whether a (t-1)-class model is rejected in favor of the t-class model.
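As a rough illustration of how a p-value can be read off a set of bootstrap replicates of -2logλ, the following minimal sketch counts the replicates that are at least as large as the observed statistic. The function name, the example values and the (count + 1) / (B + 1) convention are assumptions made for illustration; the cited papers may use a different estimator.

```python
def bootstrap_p_value(observed_lrt, bootstrap_lrts):
    """Share of bootstrap replicates at least as large as the observed LRT.

    Assumption: the (count + 1) / (B + 1) convention is used so that the
    estimated p-value is never exactly zero.
    """
    exceed = sum(1 for value in bootstrap_lrts if value >= observed_lrt)
    return (exceed + 1) / (len(bootstrap_lrts) + 1)


# Hypothetical replicates of -2logλ generated under the null model.
replicates = [1.2, 3.4, 0.8, 5.1, 2.2, 7.9, 4.0, 2.9, 1.1, 6.3]
print(bootstrap_p_value(6.0, replicates))
```

A small p-value would speak against the (t-1)-class model in favor of the t-class model; with only a handful of replicates, as here, the estimate is very coarse.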
While comparing the performance of the BLRT with the LMR and a naïve chi-square test, Nylund et al. (2007) also examined the performance of information criteria (AIC, BIC, CAIC and ABIC) for the LCM, the factor mixture model and the growth mixture model. The method involves estimating the (t-1)- and t-class models to obtain the log-likelihood ratio statistic, which is considered as the initial estimate. On the basis of the simulation results, BIC was marked as the best indicator among the information criteria, and the bootstrap likelihood ratio test (BLRT) proved to be a very consistent indicator for determining the number of classes in the three types of mixture modeling, namely LCM, factor mixture modeling and growth mixture modeling. They considered both discrete and continuous LCM and used the normal distribution to calculate the significance value of the LRT regarding the number of components/classes in the model. Furthermore, they used a Bayesian approach and the MCMC simulation technique for assessing the significance value of the LRT.

Chun & Shapiro (2009) assessed the remarks made by Nylund et al. (2007) regarding the normal approximation. They reported that the LRT statistic can be approximated by a non-central χ2 under reasonable misspecification. According to their research, the findings may vary for different models. The power computation for the LRT was done by Gudicha, Schmittmann & Vermunt (2016), using two different methods to estimate the non-centrality parameter of the non-central χ2 distribution. We have used the method in the frequentist framework (i.e. the method proposed by Bartholomew in 1987 to find the solution of the unknown parameters of the LCM through the EM algorithm). Thus, there is room for investigating the asymptotic distribution of -2logλ purely in the context of LCM for categorical data.

4. Our Approach for BLRT
To establish the empirical distribution, we have used the approach proposed by McLachlan (1987) for a mixture of two multivariate normal distributions.
Nylund et al. (2007) used the same method for the bootstrap likelihood ratio statistic. They used the normal distribution to calculate the significance value, although they mentioned that it does not always fit. Our approach to finding and establishing the empirical distribution of the likelihood ratio test statistic uses confirmatory LCM, which is then further used for exploratory models. Initially, the hypothesis under consideration for the log-likelihood ratio statistic (LRTt-1,t) is the null hypothesis of a "t-1"-class model against the alternative hypothesis of a "t"-class model. Bootstrap samples are based on the fit of the LCM (via the EM algorithm), using the set of parameters stated under the null hypothesis. For each bootstrap sample, the LCM is estimated for the "t-1"- and "t"-class models to compute the value of -2logλ. The complete procedure for bootstrap sampling and storage of LRTt-1,t is presented below.

1. Estimate the (t-1)- and t-class models and calculate beforehand the initial estimate of the log-likelihood ratio statistic, (LRTt-1,t)initial, which will be used for calculating the significance value of LRTt-1,t.
2. Assume a hypothetical population based on the parameters of the (t-1)-latent class model stated in the null hypothesis. From the assumed hypothetical population, generate a bootstrap sample and estimate the (t-1)- and t-latent class models to calculate LRTt-1,t.
3. Repeat step 2, say, "B" times independently to compute LRTt-1,t at each replication.
4. The resultant vector of LRTt-1,t values is then used to evaluate the empirical distribution through a goodness of fit test.

The process is repeated independently 100 times. It should be noted that the estimation procedure used is the same for both the bootstrap samples and the model fitted beforehand for obtaining the parameters (McLachlan, 1987; McLachlan & Krishnan, 2008).
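Steps 1 to 3 of the procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the paper: it assumes binary items, random EM starting values, a fixed number of EM iterations, and clamping of estimates away from 0 and 1; all function names and the simulated data are hypothetical. The last lines compute the two moment-based quantities (df as the mean and ncp as the inverse variance of the replicates) that the paper uses for the non-central χ2 fit.

```python
import math
import random
import statistics
from collections import Counter


def joint(x, w, p):
    """Joint probability of binary pattern x with each latent class."""
    return [wk * math.prod(pkj if xj else 1.0 - pkj for xj, pkj in zip(x, pk))
            for wk, pk in zip(w, p)]


def log_lik(counts, w, p):
    """Log-likelihood of tabulated response patterns {pattern: count}."""
    return sum(c * math.log(sum(joint(x, w, p))) for x, c in counts.items())


def fit_lcm(counts, n_classes, rng, n_iter=100, n_starts=3, eps=1e-6):
    """EM fit of a binary-item latent class model; returns (loglik, weights, probs)."""
    n_items = len(next(iter(counts)))
    n = sum(counts.values())
    best = None
    for _ in range(n_starts):  # several random starts; the best solution is kept
        w = [1.0 / n_classes] * n_classes
        p = [[rng.uniform(0.2, 0.8) for _ in range(n_items)] for _ in range(n_classes)]
        for _ in range(n_iter):
            sum_r = [0.0] * n_classes
            sum_rx = [[0.0] * n_items for _ in range(n_classes)]
            for x, c in counts.items():
                # E-step: posterior class probabilities, weighted by the pattern count.
                jt = joint(x, w, p)
                total = sum(jt)
                for k in range(n_classes):
                    r = c * jt[k] / total
                    sum_r[k] += r
                    for j in range(n_items):
                        sum_rx[k][j] += r * x[j]
            # M-step: estimates are clamped so every pattern keeps positive probability.
            p = [[min(1.0 - eps, max(eps, sum_rx[k][j] / max(sum_r[k], eps)))
                  for j in range(n_items)] for k in range(n_classes)]
            w = [max(rk / n, eps) for rk in sum_r]
            w = [wk / sum(w) for wk in w]
        ll = log_lik(counts, w, p)
        if best is None or ll > best[0]:
            best = (ll, w, p)
    return best


def simulate(n, w, p, rng):
    """Draw n response patterns from a latent class model and tabulate them."""
    counts = Counter()
    for _ in range(n):
        k = rng.choices(range(len(w)), weights=w)[0]
        counts[tuple(int(rng.random() < pkj) for pkj in p[k])] += 1
    return counts


def bootstrap_lrt(counts, t, B, rng):
    """Observed LRT(t-1, t) and B parametric-bootstrap replicates under the (t-1)-class fit."""
    n = sum(counts.values())
    ll0, w0, p0 = fit_lcm(counts, t - 1, rng)
    ll1 = fit_lcm(counts, t, rng)[0]
    # Negative differences can only come from EM stopping at a poor local maximum; floor at 0.
    observed = max(0.0, 2.0 * (ll1 - ll0))
    reps = []
    for _ in range(B):
        boot = simulate(n, w0, p0, rng)
        b0 = fit_lcm(boot, t - 1, rng)[0]
        b1 = fit_lcm(boot, t, rng)[0]
        reps.append(max(0.0, 2.0 * (b1 - b0)))
    return observed, reps


rng = random.Random(1)
# Hypothetical 4-item data set generated from a 2-class model.
data = simulate(300, [0.6, 0.4], [[0.9, 0.8, 0.85, 0.9], [0.2, 0.3, 0.25, 0.2]], rng)
observed, reps = bootstrap_lrt(data, t=2, B=20, rng=rng)
df_hat = statistics.mean(reps)             # df = E(LRT), as proposed in the paper
ncp_hat = 1.0 / statistics.variance(reps)  # ncp = inverse of V(LRT), as proposed in the paper
print(round(observed, 2), round(df_hat, 3), round(ncp_hat, 3))
```

Working on tabulated response patterns rather than individual records keeps the EM loop short, since a test with 4 dichotomous items has at most 16 distinct patterns. Step 4, the goodness of fit test against the non-central χ2 distribution, is not included in this sketch.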
In the following subsections, we investigate the null distribution of the LRT through the bootstrap sampling technique on training and test data sets. The hypothetical population considered here is from the (t-1)-class model. To simulate the hypothetical population, the LCM is fitted to the data concerned and the estimates thus obtained are treated as parameters. The (t-1)- and t-class models are fitted to the sample drawn from the hypothetical population. LRTt-1,t is then calculated using the maximum likelihood estimates for the (t-1)- and t-class models from each bootstrap sample. The process is repeated a number of times (say, B times), and the resulting B values of LRTt-1,t are then assessed for the empirical distribution of -2logλ. The empirical distribution of -2logλ is found to be a non-central χ2 distribution with degrees of freedom equal to the expected mean of LRTt-1,t and non-centrality parameter (ncp) equal to the inverse of the expected variance of LRTt-1,t. The chi-square goodness of fit test shows that the non-central χ2 distribution (with df = E(LRTt-1,t), ncp = [V(LRTt-1,t)]^-1) fits the data very well, for each situation and for each data set considered. In the next sections we present the procedure for each training data set, i.e., the Mastery, Role Conflict, and Karachi University Teachers Society (KUTS) panel data. The goodness of fit test and its graphical representation for a single case of size B = 100 are presented for bootstrap LRT1,2 and LRT2,3 for both the Mastery and Role Conflict data, and for LRT1,2, LRT2,3 and LRT3,4 for the KUTS panel data (since a 3-class model fits the KUTS data best).

5. Mastery Data
The data of Macready & Dayton (1977), known as the "Mastery data", were also used by Bartholomew (1987) for applying the latent class model. The data concern a test based on 4 dichotomous items constructed on solving multiplication problems. The BLRT procedure is applied to the Mastery data in two situations: 1) 1 vs. 2-class model and 2) 2 vs. 3-class model. It is known that the Mastery data decompose into two classes of a single latent variable, named the "Master" and "Non-Master" classes (Macready & Dayton, 1977; Bartholomew, 1987). Although the procedure is repeated 100 times, we present one simulation result of the goodness of fit test for situations 1 and 2. It can be seen from Table 1 (see also Figure 1) that in both situations the non-central χ2 distribution (with df = E(BLRTt-1,t), ncp = [V(BLRTt-1,t)]^-1) is a good fit to the null distribution of the LRT: the p-value for the goodness of fit for LRT1,2 is 0.46701 with 3 df ( = no. of class intervals – 2 – 1, as 2 parameters are estimated), whereas for LRT2,3 the p-value of 0.36136 with 3 df also indicates a very good fit of the non-central χ2 distribution.

Table 1. Mastery data; observed frequencies from the bootstrapped sample and expected frequencies obtained by fitting the non-central χ2 distribution, for size B = 100: (a) LRT1,2; (b) LRT2,3

(a)
LRT1,2                     Observed frequency   Expected frequency
(0.548,2.56]               18                   16.99109
(2.56,4.56]                28                   27.755298
(4.56,6.57]                27                   23.037882
(6.57,8.58]                10                   14.861048
(8.58,10.6]                9                    8.40641
(10.6,12.6]                3                    4.38686
(12.6,14.6]                2                    2.167725
(14.6,36]                  3                    1.884808
Total                      100                  99.491121
χ2 (goodness of fit)       2.546142
Degrees of freedom (df)    3
p-value                    0.46701

(b)
LRT2,3                     Observed frequency   Expected frequency
(0,1.67]                   28                   28.393741
(1.67,3.39]                32                   30.546773
(3.39,5.11]                20                   19.279113
(5.11,6.83]                8                    10.702626
(6.83,8.55]                9                    5.591767
(8.55,10.3]                1                    2.819109
(10.3,12]                  1                    1.38809
(12,33.2]                  1                    1.278692
Total                      100                  99.999911
χ2 (goodness of fit)       3.203071
Degrees of freedom (df)    3
p-value                    0.361363

Figure 1. Mastery data (for B = 100); histograms with a superimposed non-central χ2 distribution curve: (a) histogram for BLRT1,2; (b) histogram for BLRT2,3.

6. Role Conflict Data
The Role Conflict data are taken from Coleman (1964).
Goodman (1974a; 1974b) used these data to explain a restricted LCM, which was further discussed by Bartholomew (1987). The data are panel data collected at two different points in time. Two questions were asked of the individuals each time. Each question was answered as either 'positive' or 'negative'. A restricted model with the assumption that there exist two latent variables, which together form four latent classes, is the solution for the data. The empirical distribution of the BLRT (B = 100) is presented in Table 2 for (a) Situation 1 (H0: 1-class model vs. H1: 2-class model) and (b) Situation 2 (H0: 2-class model vs. H1: 3-class model), along with the expected frequencies and the goodness of fit test. In each case, the chi-square test indicates the goodness of fit of the non-central χ2 (with df = E(LRTt-1,t), ncp = [V(LRTt-1,t)]^-1) to the empirical null distribution of the BLRT. That is, the p-value for the χ2 goodness of fit in the first situation is 0.276083 with 2 df, whereas in the second situation the p-value of 0.058646 with 3 df, although close to the 5% level, does not reject the non-central χ2 distribution either. The goodness of fit can also be seen in Figure 2 for both LRT1,2 and LRT2,3.

Table 2. Role Conflict data; observed frequencies from the bootstrapped sample and expected frequencies obtained by fitting the non-central χ2 distribution, for size B = 100: (a) LRT1,2; (b) LRT2,3

(a)
LRT1,2                     Observed frequency   Expected frequency
(0.45,2.8]                 15                   20.31083
(2.8,5.15]                 35                   31.96709
(5.15,7.49]                27                   23.35075
(7.49,9.84]                15                   12.99846
(9.84,12.2]                4                    6.299644
(12.2,14.5]                3                    2.806942
(14.5,16.9]                0                    1.182026
(16.9,38.5]                1                    0.779773
Total                      100                  99.6955
χ2 (goodness of fit)       3.867928
Degrees of freedom (df)    2
p-value                    0.276083

(b)
LRT2,3                     Observed frequency   Expected frequency
(0.298,1.91]               23                   20.37098
(1.91,3.53]                23                   26.60688
(3.53,5.14]                19                   20.69197
(5.14,6.75]                21                   13.54058
(6.75,8.37]                9                    8.11963
(8.37,9.98]                2                    4.617904
(9.98,11.6]                2                    2.534554
(11.6,32.7]                1                    2.812054
Total                      100                  99.29455
χ2 (goodness of fit)       7.458069
Degrees of freedom (df)    3
p-value                    0.058646

Figure 2. Role Conflict data (for B = 100); histograms with a superimposed non-central χ2 curve for (a) BLRT1,2; (b) BLRT2,3.

7. KUTS Panel Data
The KUTS panel data are the original results of the 1993-94 election of the teachers of the University of Karachi. Two groups (panels) were contesting; we named them "Rightist" and "Mix" based on their manifestos. The data were first used by Shamshad & Siddiqi (2012) to fit an LCM, who found that a 3-class model is the best among the class models considered. Following the strategy described earlier for the Mastery and Role Conflict data, we have assessed the empirical distribution of the BLRT for the KUTS panel data in three situations: (a) Situation 1 (H0: 1-class model vs. H1: 2-class model); (b) Situation 2 (H0: 2-class model vs. H1: 3-class model); (c) Situation 3 (H0: 3-class model vs. H1: 4-class model). The non-central χ2 distribution, when fitted to the BLRT in each situation, gives a very small test value (chi-square goodness of fit < 2), which indicates a high p-value: for test values 1.98, 0.1565 and 1.173 the p-values are 0.371, 0.924 and 0.882 for BLRT1,2, BLRT2,3 and BLRT3,4, respectively (see Table 3; Figure 3).

Table 3. KUTS panel data; observed and expected frequencies (obtained by fitting the non-central χ2 distribution) for size B = 100; bootstrapped sample of (a) LRT1,2; (b) LRT2,3; (c) LRT3,4

(a)
LRT1,2                     Observed frequency   Expected frequency
(0,2.37]                   21                   21.5922
(2.37,4.8]                 37                   35.4063
(4.8,7.22]                 19                   23.1141
(7.22,9.65]                12                   11.5248
(9.65,12.1]                8                    5.05212
(12.1,14.5]                2                    2.05339
(14.5,16.9]                0                    0.79404
(16.9,38.6]                1                    0.46302
Total                      100                  99.99994
χ2 (goodness of fit)       1.980392
Degrees of freedom (df)    2
p-value                    0.371504

(b)
LRT2,3                     Observed frequency   Expected frequency
(0,2.17]                   23                   22.83642
(2.17,4.4]                 34                   34.10453
(4.4,6.62]                 23                   22.26926
(6.62,8.84]                12                   11.5264
(8.84,11.1]                5                    5.340879
(11.1,13.3]                2                    2.319014
(13.3,15.5]                0                    0.964693
(15.5,37]                  1                    0.638765
Total                      100                  99.99995
χ2 (goodness of fit)       0.156566
Degrees of freedom (df)    2
p-value                    0.924703

(c)
LRT3,4                     Observed frequency   Expected frequency
(0,1.06]                   21                   21.84624
(1.06,2.16]                24                   24.17677
(2.16,3.27]                18                   18.18279
(3.27,4.38]                16                   12.58786
(4.38,5.49]                7                    8.380645
(5.49,6.59]                6                    5.45261
(6.59,7.7]                 3                    3.493749
(7.7,18.5]                 5                    5.828512
Total                      100                  99.94918
χ2 (goodness of fit)       1.173873
Degrees of freedom (df)    4
p-value                    0.882381

Figure 3. KUTS panel data (B = 100); histograms with a superimposed curve of the non-central χ2 distribution for bootstrapped (a) LRT1,2; (b) LRT2,3; (c) LRT3,4.

Table 4. Percent acceptance rate of the non-central χ2 distribution with df = E(BLRTt-1,t), ncp = [V(BLRTt-1,t)]^-1

Data / test                      Level of significance   B = 100   B = 200   B = 500
Mastery data
  LRT (1 vs. 2-class model)      5%                      86.73%    87.88%    88.89%
                                 1%                      95.92%    93.94%    91.92%
  LRT (2 vs. 3-class model)      5%                      80.81%    80.00%    55.00%
                                 1%                      96.97%    90.00%    67.00%
Role Conflict data
  LRT (1 vs. 2-class model)      5%                      85.86%    85.00%    85.98%
                                 1%                      90.91%    94.00%    94.39%
  LRT (2 vs. 3-class model)      5%                      86.00%    81.08%    74.26%
                                 1%                      95.00%    91.22%    87.13%
KUTS panel data
  LRT (1 vs. 2-class model)      5%                      78.57%    81.82%    27.27%
                                 1%                      88.78%    93.94%    43.43%
  LRT (2 vs. 3-class model)      5%                      82.83%    78.79%    60.61%
                                 1%                      93.94%    92.93%    78.79%
  LRT (3 vs. 4-class model)      5%                      82.65%    62.63%    42.98%
                                 1%                      96.94%    78.79%    55.26%

For each size (i.e. B = 100, 200 and 500), approximately 100 repetitions are carried out to validate the fit of the non-central χ2 distribution to the empirical null distribution of -2logλ for the Mastery, Role Conflict and KUTS panel data. The summary presented in Table 4 shows the percent acceptance rate of the non-central χ2 distribution at the 5% and 1% levels of significance. It can be concluded that the null distribution of the BLRT is non-central χ2 when modeling is done using LCM. For B = 500, the acceptance rate is quite low; this can be improved by re-evaluating the frequency distribution of the BLRT, as we used the same code for constructing the frequency distribution of the BLRT as for size 100, whereas for larger samples the distribution of the BLRT is highly skewed and needs appropriate class intervals. Thus, once again, the case for basing the decision regarding the number of classes in the model on the significance value of the BLRT, calculated using the non-central χ2 distribution (with df = E(LRTt-1,t), ncp = [V(LRTt-1,t)]^-1), is very strong.

8. Test Data Sets: Description of General Health Parameter (GHP) Data
The GHP data set was collected through a survey with a long questionnaire (more than 76 questions) from more than 1500 students studying at different government and private sector universities and colleges in Karachi, Pakistan, during the years 2008-2010. The purpose is to check whether the respondent is aware of his/her health condition. Asking health-related questions not only gives the respondent a chance to review his/her health condition, but also provides collective information about the health of teenagers in the society.
Even if the respondent did not answer correctly for any reason, he/she would at least think about and recognize any problem by themselves. The questionnaire was adopted from a World Health Organization survey and constructed separately in Urdu and English, keeping in mind that most of the target population might not be comfortable with the English version, as they are at a learning stage and could have difficulty in understanding the language, which might result in misleading information. Each question asks about the difficulty the respondents have had in doing work, moving around, listening, seeing, understanding, recognizing and remembering things, etc., over the last 30 days. In order to get a true response, each question is scaled from 1 to 5, 1 being (at a minimum) "none of the time" or "no problem" and 5 being "all of the time" or "extreme problem".

From this survey we have focused our attention on an important public health issue, problematic sleep, which requires accurate diagnosis. A sleep problem is referred to as both a symptom and a sign of a specific disorder known as "insomnia". Roth (2007) defines "insomnia" in survey studies as a positive response to either of the questions "Do you have difficulty falling or staying asleep?" or "Do you experience difficulty sleeping?" According to the International Classification of Sleep Disorders, 2nd edition (ICSD-2), "insomnia" is defined as complaints of difficulty in initiating or maintaining sleep, waking up too early, or sleep that is of poor quality, where such difficulties occur despite appropriate circumstances and adequate opportunities for sleep (Schutte-Rodin et al., 2008; Buysse, 2008) and result in at least one of the following daytime dysfunctions: daytime sleepiness; fatigue or anxiety; concentration, remembering or focusing problems; difficulty in complex mental tasks; social or vocational dysfunction; poor school/job performance; irritable or bad mood; tension, headache or gastrointestinal symptoms in response to sleep loss; reduced motivation, energy or initiative; proneness to errors/accidents at work or while driving; and concern about the sleep problem (ICSD-2).

Here, we consider variables that represent daytime impairment associated with insomnia as either a symptom or a diagnostic criterion. We named this set of variables the "GHP-Insomnia data". The selected questions are as follows:

R: How much difficulty did you have with concentrating and remembering things? (Concentration Problem)
C: How much difficulty did you have with analyzing and solving problems in day-to-day life? (Cognitive Issue)
L: How much difficulty did you have with learning new tasks, for example, learning how to get to a new place? (Learning Issue)
I: How much did you feel irritable or in a bad mood? (Irritable)
S: How much difficulty did you have in falling asleep, waking up frequently during the night or waking up too early in the morning? (Problematic sleep)

We have used a total of 1289 responses for this analysis after discarding the responses with missing values. For convenience, the 5-level Likert scale has been reduced to binary: "1-No difficulty" is marked as a negative response ("1"), and the remaining levels, indicating difficulty of any degree (that is, 2, 3, 4, 5), are marked as a positive response ("2"). A summary of latent class model fitting for up to 5 classes is presented in Table 5, in which the AIC is minimum for the 3-class model and the BIC for the 2-class model. The difference between the values of G2 and χ2 for the 1-class and 2-class models is also an indication of the presence of a latent variable in the data.

Table 6 presents the estimated model parameters for the 1- to 3-class models; the probabilities presented are those of a positive response to each item, along with the respective estimated standard errors. In the case of the 2-class model, the class proportions dividing the total sampled population into two groups are estimated as 67% and 33%. Class 1 (with the highest proportion, approximately 867 out of 1289 total respondents) shows high probabilities of having difficulty for each and every statement. Approximately 674 out of 867 individuals in this class faced difficulty in concentrating and remembering things (R), and 765 and 781 (out of 867) had difficulty with the cognitive issue (C) and felt irritable (I), respectively, in the same period. Approximately 75% and 70% of the respondents had problematic sleep (S) and issues in learning new tasks (L), respectively. Class 2, on the other hand, consists of those respondents who felt irritable (I) (approximately 311 out of 422) and had problematic sleep (S) (approximately 218 out of 422). The probabilities of a positive response to the questions related to the concentration issue (R), cognitive issue (C) and learning issue (L) are low in class 2. The probability of being irritable or in a bad mood is high among individuals in class 2, while the probability of having a sleep issue is moderately high. Individuals in class 1 seem to be at risk of insomnia.

In the 3-class model (see Table 6), the number of estimated parameters increases, but an additional class gives a clearer grouping of individuals. Only 8% of the individuals in the total sampled population gave, with high probability, a negative response regarding difficulty for every question asked.
In class 1 (with approximately 108 individuals out of 1289), none of the respondents felt irritable or in a bad mood, and only 37, 28, 31 and 39 (out of 108 individuals) showed difficulty with the concentration issue, cognitive issue, learning issue and problematic sleep, respectively; these probabilities are not very high. We marked this group as "healthy fellows". Class 2, with the maximum proportion of 835 out of 1289 individuals (64.7% class proportion), represents the group of people who are at great risk of insomnia, since the probabilities are very high for each and every question asked: they have disturbed sleep (S) (624 out of 835 gave a positive response), and they face difficulty in daytime activities such as remembering and concentrating (R), cognitive issues (C) and learning new tasks (L), and have a bad mood (I), with probabilities of around 78%, 88%, 74% and 89%, respectively. We marked class 2 as the "chronic-insomnia risk group". Respondents in class 3 (class proportion 26.89%) gave a 100% positive response to being irritable (I), and 204 out of 347 respondents complained of problematic sleep (S), although this group has no difficulty in analyzing and solving day-to-day problems (C) (62.8% of 347), in concentrating (R) (51.38%) or in learning new tasks (L) (81.54% of 347). For class 3, the probability of having disturbed sleep is slightly high; this might be due to the bad mood reported by all individuals in this group. Feeling irritable might be a sign of anxiety or depression due to social or psychological issues, which may result in nighttime sleep problems. We marked class 3 as the "chronic irritable group".

Table 5. Results of Fitting Latent Class Models for GHP data

                                            1 class    2 class    3 class    4 class    5 class
AIC                                         7752.916   7545.709   7536.830   7540.031   7546.453
BIC                                         7778.724   7602.487   7624.577   7658.748   7696.140
G2 (likelihood ratio/deviance statistic)    257.969    38.762     17.883     9.084      3.506
χ2 (chi-square goodness of fit)             341.947    37.597     17.706     8.896      3.406
Number of estimated parameters              5          11         17         23         29
Maximum log-likelihood                      -3871.45   -3761.85   -3751.41   -3747.01   -3744.22
-2logλ                                      -          219.206    20.880     8.800      5.577

Table 6. GHP survey data; latent class model parameter estimates for 1 to 3 classes (probabilities of having difficulty); estimated standard errors in brackets

                   1-latent class   2-latent class model    3-latent class model
Item               model            Class 1    Class 2      Class 1    Class 2    Class 3
R                  0.6633           0.7772     0.4290       0.3459     0.7778     0.4862
                   [0.0136]         [0.0207]   [0.0391]     [0.0567]   [0.0222]   [0.0457]
C                  0.6920           0.8823     0.3005       0.2632     0.8804     0.3716
                   [0.0137]         [0.0255]   [0.0543]     [0.0627]   [0.0302]   [0.0695]
L                  0.5516           0.7062     0.2335       0.2970     0.7368     0.1846
                   [0.0145]         [0.0254]   [0.0434]     [0.0558]   [0.0345]   [0.0703]
I                  0.8472           0.9006     0.7372       0.0000     0.8930     1.0000
                   [0.0102]         [0.0130]   [0.0285]     [0.0000]   [0.0143]   [0.0000]
S                  0.6734           0.7493     0.5172       0.3677     0.7481     0.5884
                   [0.0133]         [0.0192]   [0.0346]     [0.0565]   [0.0199]   [0.0378]
Class proportion                    0.6729     0.3271       0.0835     0.6476     0.2689
                                    [0.0452]   [0.0452]     [0.0110]   [0.0597]   [0.0549]

The cross classification table (see Table 7b) of latent class membership against problematic sleep (S: insomnia) for the 3-class model shows that 32.6% of the total individuals responded that they did not have problematic sleep at any time during the last month, whereas 32.58% had problematic sleep some of the time. The marginal totals for problematic sleep for the responses "A good bit of the time", "Most of the time" and "All of the time" are 17.22%, 10.7% and 6.8%, respectively.
Because we dichotomized the responses as “no difficulty (1)” and “having difficulty of any degree (2)”, the latter pooling responses 2, 3, 4 and 5, the cumulative share for responses 2 to 5 (“having difficulty of any degree”) is 67.33% of the total respondents. The proportions of class membership obtained through the cross classification against problematic sleep (S) are approximately the same as those obtained from the fitted model (see Table ), namely 8.9% (116 out of 1289), 66.4% (856 out of 1289) and 24.5% (317 out of 1289) for classes 1, 2 and 3, respectively (see Table: ). The distribution of individuals over the five response levels clearly shows that in class 1 (marked as “healthy fellows”), 66.37% of the individuals do not face any difficulty in sleeping, while only 2.58% and 6% face difficulty “all of the time” and “most of the time”, respectively. The cumulative percentage of the responses “some of the time” and “a good bit of the time” is 25% (29 out of 116). In class 2 (marked as “chronic-insomnia risk group”), 37.5% of 856 respondents reported no difficulty falling asleep at night. Although only 65 of 856 (6%) respondents had a sleep problem all of the time, the cumulative share of the remaining difficulty levels (low to moderately high; responses 2, 3 and 4) is 56.15%, which is quite alarming. The percentage distribution on the 5-level scale of falling asleep at bedtime for the last group (class 3, “chronic-irritable”) shows that a total of 73.7% of 317 respondents faced difficulty (responses 2, 3, 4 and 5). The individual percentages of having difficulty are also very high in class 3 compared with the other classes: 35%, 19.8%, 11.2% and 7.59% for responses 2, 3, 4 and 5. This might be due to the irritability they were facing during that time, which could result from social, psychological or mental pressure. The cross classification table for the 2-class model is also presented (see Table 7a) for the reader's interest.
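The assessment that follows relies on the fit described in the abstract: a non-central χ2 distribution whose degrees of freedom equal the mean of the bootstrapped -2logλ values and whose non-centrality parameter equals the inverse of their variance, checked with a χ2 goodness-of-fit test. A minimal Python sketch of that computation, using only the standard library, might look as follows. It is an illustration under stated assumptions, not the authors' code: the function names, the use of the sample variance, the bin edges and the placeholder input values (drawn from a central χ2 rather than from refitted latent class models) are all assumptions. The goodness-of-fit degrees of freedom are passed in explicitly; the paper reports 3 for the GHP data.

```python
import math
import random
import statistics


def reg_lower_gamma(a, x, max_terms=1000):
    """Regularized lower incomplete gamma P(a, x), via its power series."""
    if x <= 0:
        return 0.0
    term = total = 1.0
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    log_front = a * math.log(x) - x - math.lgamma(a + 1)
    return min(1.0, math.exp(log_front) * total)


def ncx2_cdf(x, df, ncp, max_terms=200):
    """Non-central chi-square CDF as a Poisson(ncp/2) mixture of central CDFs."""
    if x <= 0:
        return 0.0
    weight = math.exp(-ncp / 2)  # Poisson weight for j = 0
    total = 0.0
    for j in range(max_terms):
        total += weight * reg_lower_gamma(df / 2 + j, x / 2)
        weight *= (ncp / 2) / (j + 1)
        if weight < 1e-17 and j > ncp / 2:
            break
    return total


def fit_null(lrt_values):
    """df = mean and ncp = 1 / variance (sample variance is an assumption)."""
    df = statistics.fmean(lrt_values)
    ncp = 1.0 / statistics.variance(lrt_values)
    return df, ncp


def goodness_of_fit(lrt_values, edges, gof_df):
    """Pearson chi-square of binned counts against the fitted curve."""
    df, ncp = fit_null(lrt_values)
    n = len(lrt_values)
    stat = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(lo < v <= hi for v in lrt_values)
        expected = n * (ncx2_cdf(hi, df, ncp) - ncx2_cdf(lo, df, ncp))
        stat += (observed - expected) ** 2 / expected
    p_value = max(0.0, 1.0 - reg_lower_gamma(gof_df / 2, stat / 2))
    return stat, p_value


def significance(observed_lrt, lrt_values):
    """Upper-tail probability of an observed -2log(lambda) under the fit."""
    df, ncp = fit_null(lrt_values)
    return max(0.0, 1.0 - ncx2_cdf(observed_lrt, df, ncp))


if __name__ == "__main__":
    random.seed(1)
    # Placeholder values only: 100 draws from a central chi-square with 8 df.
    fake_lrt = [random.gammavariate(4.0, 2.0) for _ in range(100)]
    print("df, ncp:", fit_null(fake_lrt))
    # Hypothetical bin edges; 6 bins - 1 - 2 fitted quantities gives 3 df.
    print("fit:", goodness_of_fit(fake_lrt, [0, 4, 6, 8, 10, 12, 60], gof_df=3))
    print("p for observed 8.8:", significance(8.8, fake_lrt))
```

In an actual analysis, the placeholder list would be replaced by the -2logλ values computed from B bootstrap samples generated under the fitted null model, and the bin edges would be chosen so that no expected frequency is close to zero.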
We now assess the empirical null distribution of -2logλ for the GHP-Insomnia data. The estimated parameters of the 1- to 3-latent class models provided in Table--- are used to bootstrap hypothetical populations based on these estimates. For each of LRT1,2, LRT2,3 and LRT3,4, 100 bootstrap replications are obtained, and the fit of a non-central χ2 distribution with $df = E(LRT_{t-1,t})$ and $ncp = [V(LRT_{t-1,t})]^{-1}$ is assessed.

Table 7. GHP-Insomnia data: Cross classification tables of latent class membership against problematic sleep (insomnia) for: (a) 2-class model; (b) 3-class model

(a) 2-class model

| Problematic sleep (insomnia) | Class 1 “Healthy Fellows group” | Class 2 “Chronic-insomnia risk group” | Total |
|---|---|---|---|
| 1-None of the time | 189 | 232 | 421 |
| 2-Some of the time | 92 | 328 | 420 |
| 3-A good bit of the time | 44 | 178 | 222 |
| 4-Most of the time | 32 | 106 | 138 |
| 5-All of the time | 22 | 66 | 88 |
| Total | 379 | 910 | 1289 |

(b) 3-class model

| Problematic sleep (insomnia) | Class 1 “Healthy Fellows group” | Class 2 “Chronic-insomnia risk group” | Class 3 “Chronic-irritable group” | Total |
|---|---|---|---|---|
| 1-None of the time | 77 | 225 | 119 | 421 |
| 2-Some of the time | 22 | 300 | 98 | 420 |
| 3-A good bit of the time | 7 | 170 | 45 | 222 |
| 4-Most of the time | 7 | 96 | 35 | 138 |
| 5-All of the time | 3 | 65 | 20 | 88 |
| Total | 116 | 856 | 317 | 1289 |

The significance values of the χ2 goodness-of-fit tests for BLRT1,2, BLRT2,3 and BLRT3,4 are 0.265872, 0.664608 and 0.08312, each with 3 degrees of freedom; the corresponding observed test values are 219.206, 20.88 and 8.8, respectively (see Table 8; see also Figure 4). None of these is significant at the 5% level, which indicates a good fit of the non-central χ2 distribution for each BLRTt-1,t.

Table 8.
GHP-Insomnia data (size B = 100): observed frequencies and expected frequencies (obtained by fitting the non-central χ2 distribution) for (a) LRT1,2; (b) LRT2,3; (c) LRT3,4

(a) LRT1,2

| LRT1,2 | Observed Frequency | Expected Frequency |
|---|---|---|
| (0.73,3.57] | 16 | 10.80113 |
| (3.57,6.41] | 24 | 29.29937 |
| (6.41,9.24] | 25 | 27.83557 |
| (9.24,12.1] | 18 | 17.3231 |
| (12.1,14.9] | 11 | 8.624018 |
| (14.9,17.8] | 4 | 3.742099 |
| (17.8,20.6] | 1 | 1.479546 |
| (20.6,42.5] | 1 | 0.835142 |
| Total | 100 | 99.93998 |

χ2 (goodness of fit) test statistic: 3.959493; degrees of freedom (df): 3; P-value: 0.265872

(b) LRT2,3

| LRT2,3 | Observed Frequency | Expected Frequency |
|---|---|---|
| (0.75,3.08] | 12 | 12.08323 |
| (3.08,5.41] | 26 | 26.62737 |
| (5.41,7.74] | 29 | 25.28383 |
| (7.74,10.1] | 17 | 17.12687 |
| (10.1,12.4] | 7 | 9.693352 |
| (12.4,14.7] | 5 | 4.908126 |
| (14.7,17.1] | 2 | 2.303954 |
| (17.1,38.7] | 2 | 1.758664 |
| Total | 100 | 99.7854 |

χ2 (goodness of fit) test statistic: 1.577037; degrees of freedom (df): 3; P-value: 0.664608

(c) LRT3,4

| LRT3,4 | Observed Frequency | Expected Frequency |
|---|---|---|
| (0.155,2.21] | 15 | 16.22409 |
| (2.21,4.26] | 29 | 29.66279 |
| (4.26,6.31] | 31 | 23.90222 |
| (6.31,8.37] | 8 | 14.75638 |
| (8.37,10.4] | 11 | 7.985343 |
| (10.4,12.5] | 3 | 3.994056 |
| (12.5,14.5] | 2 | 1.895745 |
| (14.5,35.9] | 1 | 1.542697 |
| Total | 100 | 99.96332 |

χ2 (goodness of fit) test statistic: 6.672024; degrees of freedom (df): 3; P-value: 0.08312

Figure 4. GHP-Insomnia data (B=100); histogram along with a superimposed non-central χ2 curve for the bootstrap sample of (a) LRT1,2; (b) LRT2,3; (c) LRT3,4

9. Conclusion

We have considered four different data sets: Mastery, Role Conflict, KUTS panel and GHP-Insomnia data.
Establishing the empirical distribution of -2logλ when fitting a latent class model shows that a non-central χ2 distribution with $df = E(LRT_{t-1,t})$ and $ncp = [V(LRT_{t-1,t})]^{-1}$ fits each data set very well (with a high percentage), whether “training” or “test”, and that one can rely on the significance value of -2logλ calculated from this empirical null distribution (i.e. the non-central χ2 distribution).

References

Aitkin, M., & Rubin, D. B. (1985). Estimation and hypothesis testing in finite mixture models. Journal of the Royal Statistical Society, Series B (Methodological), 47(1), 67-75.
Aitkin, M., & Wilson, G. T. (1980). Mixture Models, Outliers, and the EM Algorithm. Technometrics, 22(3), 325-331.
Aitkin, M., Anderson, D., & Hinde, J. (1981). Statistical Modeling of Data on Teaching Styles. Journal of the Royal Statistical Society, Series A (General), 144(4), 419-461.
American Academy of Sleep Medicine. (2005). The International Classification of Sleep Disorders: Diagnostic and Coding Manual (2nd ed.). Westchester, IL: American Academy of Sleep Medicine.
Bartholomew, D. J. (1987). Latent Variable Models and Factor Analysis. London: Charles Griffin & Co. Ltd.
Buysse, D. J. (2008). Chronic insomnia. The American Journal of Psychiatry, 165(6), 678-686.
Chun, S. Y., & Shapiro, A. (2009). Normal versus Noncentral Chi-square asymptotics of misspecified models. Multivariate Behavioral Research, 44(6), 803-827.
Coleman, J. S. (1964). Introduction to Mathematical Sociology. New York: Free Press.
Gudicha, D. W., Schmittmann, V. D., & Vermunt, J. K. (2016). Power Computation for Likelihood Ratio Tests for the Transition Parameters in Latent Markov Models. Structural Equation Modeling: A Multidisciplinary Journal, 23(2), 234-245.
Goodman, L. A. (1974a). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 61(2), 215-231.
Goodman, L. A. (1974b). The analysis of systems of qualitative variables when some of the variables are unobservable. Part I-A modified latent structure approach. American Journal of Sociology, 79(5), 1179-1259.
Hallquist (2008-2011). MplusAutomation: Automating Mplus Model Estimation and Interpretation. R package version 0.5. http://CRAN.R-project.org/package=MplusAutomation.
Hartigan, J. A. (1977). Distribution Problems in Clustering. In J. V. Ryzin (Ed.), Classification and Clustering. New York: Academic Press.
Jeffries, N. (2003). A note on “Testing the number of components in a normal mixture”. Biometrika, 90, 991-994.
Lo, Y., Mendell, N., & Rubin, D. (2001). Testing the number of components in a normal mixture. Biometrika, 88, 767-778.
Macready, G. B., & Dayton, C. M. (1977). The use of probabilistic models in the assessment of mastery. Journal of Educational Statistics, 2, 99-120.
McLachlan, G. J. (1987). On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Applied Statistics, 36, 318-324.
McLachlan, G. J., & Krishnan, T. (2008). The EM Algorithm and Extensions. Hoboken, NJ: Wiley-Interscience.
McLachlan, G., & Peel, D. (2000). Finite Mixture Models. New York: Wiley.
McLachlan, G. J., & Basford, K. E. (1988). Mixture Models: Inference and Application to Clustering. New York: Marcel Dekker.
Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14(4), 535-569.
Quinn, B. G., McLachlan, G. J., & Hjort, N. L. (1987). A note on the Aitkin-Rubin approach to hypothesis testing in mixture models. Journal of the Royal Statistical Society, Series B, 49, 311-314.
Roth, T. (2007). Insomnia: Definition, prevalence, etiology, and consequences. Journal of Clinical Sleep Medicine, 3(5 Suppl), S7-S10.
Schutte-Rodin, S., Broch, L., Buysse, D., Dorsey, C., & Sateia, M. (2008). Clinical guideline for the evaluation and management of chronic insomnia in adults. Journal of Clinical Sleep Medicine, 4(5), 487-504.
Shamshad, B., & Siddiqi, J. S. (2012). Exploration of groups through latent structural model. Journal of Basic and Applied Science, 8(1), 145-150.
Titterington, D. M., Smith, A. F. M., & Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distributions. New York: Wiley.
Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57, 307-333.
Wilks, S. S. (1935). The likelihood test of independence in contingency tables. The Annals of Mathematical Statistics, 6(4), 190-196.
Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. The Annals of Mathematical Statistics, 9(1), 60-62.
Wolfe, J. H. (1970). Pattern clustering by multivariate mixture analysis. Multivariate Behavioral Research, 5, 329-350.

Bushra Shamshad received her B.Sc. (Honors), M.Sc. and Ph.D. degrees in Statistics from the Department of Statistics, University of Karachi, Pakistan, in 2002, 2003 and 2013, respectively. She has been associated with the Department of Statistics, Karachi University, since 2004, initially as a Co-operative Lecturer, and became a full-time Lecturer in 2006. She has been working as an Assistant Professor since 2011. Her awards and honors include First-Class-First Position and 2 gold medals in M.Sc. and First-Class-Second Position in B.Sc. (Honors). Her research interests are Multivariate Analysis, Categorical Data Analysis, Distribution Theory and Structural Equation Modeling.

Dr. Junaid Saghir Siddiqi was born on 01-05-1952. He completed his B.Sc. (Hons.) and M.Sc. in Statistics in 1972 and 1973, respectively, at the Department of Statistics, University of Karachi. He received his Ph.D. in Statistics from the University of Exeter, England, in 1992.
He initially worked as a Teaching / Research Assistant at Karachi University. He then joined the Government of Sindh as a Research Officer before joining the Department of Statistics, Karachi University, as a full-time Lecturer in October 1975. He retired as Professor of Statistics on 30th April 2012. Currently he is working as an adjunct Professor in the Department. His research interests include the application of multivariate methods in several social and economic fields. He has supervised 5 PhD students.