J. Nig. Soc. Phys. Sci. 2 (2020) 36–50
Journal of the Nigerian Society of Physical Sciences
Original Research

A Modified Forced Randomized Response Model

A. T. Adeniran∗, A. A. Sodipo, C. G. Udomboso
Department of Statistics, University of Ibadan, Ibadan, Nigeria.

Abstract
This paper proposes a new Randomized Response Model (RRM) to estimate the proportion of people possessing a sensitive attribute (S) under study. Simple random sampling with replacement and stratified simple random sampling schemes were adopted. Maximum likelihood and Bayesian estimation procedures for the proposed model were developed and compared. The sampling distribution (expectation and variance) of the proposed estimator under the two sampling techniques, an efficiency comparison of the proposed model with some existing models, and a numerical illustration of all the compared models were also explored. The study found that the proposed model outperformed other existing RRMs in terms of efficiency, and it proved to be more protective in designing surveys on sensitive issues.

Keywords: Randomized response, proportion, sensitive variable, maximum likelihood estimator, sampling distribution.

Article History: Received: 25 December 2019; Received in revised form: 15 February 2020; Accepted for publication: 17 February 2020; Published: 28 February 2020
©2020 Journal of the Nigerian Society of Physical Sciences. All Rights Reserved.
Communicated by: T. Latunde

1. Introduction
Surveys usually collect responses to a large number of items from each sample unit, and many institutions use empirical evidence from surveys to make their policies. Surveys thus play a prominent role in society, so collecting and interpreting survey data correctly is essential.
One of the most obvious problems in census or sample surveys is the inability of the researcher to collect responses on some or all of the items for a sampled unit, or the deletion of some responses because they fail to satisfy edit constraints [1]. In the vernacular of sample surveys, this is the problem of non-response. Non-response can arise for a variety of reasons, one of which is the sensitive nature of the survey question.

Socially sensitive questions on topics such as shoplifting, rash driving, tax evasion, felony, false declaration of assets, expenditure on addictions of various forms (drug abuse), kidnapping, occult affiliation, psychosocial status, and cheating in national examinations; or health habits and sexual orientation related questions on topics such as HIV infection, induced abortion, masturbation, ravishment, homosexuality, and other illegal, unethical, or prohibited attitudes or practices that receive disapproval by society, are thought to be threatening to respondents [2-3]. When sensitive topics are studied, respondents often react in ways that negatively affect the validity of the data, giving socially desirable answers to avoid social embarrassment and to project a positive self-image [4-5].

The Randomized Response Model (RRM), also known as the Randomized Response Technique (RRT) or Randomized Response Distribution/Design (RRD), is a survey method specifically developed to improve participation and the precision of answers to sensitive questions, because the frequency of socially undesirable, disgraceful, embarrassing, incriminating, ignominious, or highly stigmatizing variables has usually been found to be underestimated in surveys.

∗Corresponding Author. Tel. No: +2348035529045. Email address: at.adeniran@mail.ui.edu.ng (A. T. Adeniran)
In 1965, Warner explained that the reluctance of respondents to disclose sensitive or potentially harmful information would diminish when respondents are convinced that their anonymity is guaranteed: when incriminating answers can be concealed even from the interviewer, the need to present oneself in a positive way decreases and honest answering increases [6]. Following this assumption, [7] did the pioneering work on an RRT which required the interviewee to give a "yes" or "no" answer either to the sensitive question or to its negation, depending on the outcome of a randomizing device not revealed to anybody, including the interviewer. After [7] original RRM, different authors developed various RR models, including the Unrelated Questioning Technique (UQT), the Forced Randomized Response Model (FRRM), stratification of new and existing models in Randomized Response (RR) surveys, calibration techniques and so on. According to [8-9], "FRRM and UQT with known population prevalence of the innocuous attribute are the best, a major setback is that it is not easy to come up with unrelated question with known prevalence and variance near zero, hence, UQT seems harder to be adopted." To circumvent this problem, this study proposes a Modified Forced Randomized Response Model (MFRRM) with a definite "no" response to the unrelated question.

The structure of the paper is as follows. In Section 2, the literature related to the study is reviewed to identify existing works, their contributions and limitations. Section 3 introduces the newly proposed RRM, namely the modified forced randomized response model.
In the same section, the Maximum Likelihood (ML) and Bayesian Estimators (BE) of the proposed RRM are explored, and a comprehensive treatment of the statistical properties of the proposed estimator, such as unbiasedness, efficiency comparison with existing models, and cost efficiency of the proposed design, is carried out. Section 4 presents the analysis of the simulated data sets, the discussion of results, and the concluding remarks. Tables and charts showing the results are also presented in that section.

2. Existing Models
This section reviews related previous studies, starting with Warner's original model, through intermediate models, to the current advanced stage of robust quantitative RRM.

2.1. Warner Randomized Response Model
In Warner's (1965) original design, respondents are provided with a randomizer (say, a spinner) and instructed to answer one of two statements:
(i) I belong to the sensitive group (selected with probability $p$).
(ii) I do not belong to the sensitive group (selected with probability $1 - p$).
Respondents then answer "true" or "not true" (or "yes" or "no") according to their status on the sensitive question, without revealing to the interviewer which statement was selected by the randomizer. Figure 1 illustrates the randomization procedure of [7] original randomized response model. From Figure 1, the probabilities of "yes" and "no" answers are $\lambda = p\pi_s + (1-p)(1-\pi_s)$ and $1-\lambda = p(1-\pi_s) + (1-p)\pi_s$, respectively. Suppose $n_o$ and $n - n_o$ denote the total numbers of "yes" and "no" answers in the sample of $n$ respondents. Following the maximum likelihood principle, the likelihood function to maximize is
$$P(\lambda, n) = \binom{n}{n_o}\,[p\pi_s + (1-p)(1-\pi_s)]^{n_o}\,[p(1-\pi_s) + (1-p)\pi_s]^{n-n_o}. \tag{1}$$
Setting the derivative (w.r.t.
$\pi_s$) of the natural logarithm of equation (1) to zero, Warner's unbiased estimator of $\pi_s$ is
$$\hat{\pi}_s = \frac{\frac{n_o}{n} - (1-p)}{2p-1} = \frac{\hat{\lambda} - (1-p)}{2p-1}, \qquad p \neq \tfrac{1}{2}, \tag{2}$$
with variance
$$\mathrm{Var}(\hat{\pi}_s) = \frac{\pi_s(1-\pi_s)}{n} + \frac{p(1-p)}{n(2p-1)^2}, \tag{3}$$
where $n$ = number of observed respondents (sample size), $p$ = probability of selecting (and answering) the sensitive question, and $\pi_s$ = proportion of yes answers to the sensitive question.
Subsequently, both in theory and application, several other authors have suggested various alternative RR models, including [10-17] among many others. Recently, [18] used [7] RRM to measure corruption among public bureaucrats in Bolivia, Brazil, and Chile.

2.2. Forced Randomized Response Model (FRRM)
The forced response method, otherwise referred to as UQT with known $\pi_u$, was originated by [12] and later simplified by [19] with the assumption that $\pi_u$ is a predetermined parameter. That is, $\pi_u$ exists or must be known either from independent ad-hoc studies or from statistical abstracts. This assumption does not always hold. Even when $\pi_u$ is available in a register or statistical abstract, retrieving it from records may be expensive. If $\pi_u$ is known beforehand, solving for $\pi_s$ in [11] model and replacing $\lambda$ with its sample estimator ($\hat{\lambda} = \frac{n_o}{n}$) gives
$$\hat{\pi}_s = \frac{1}{p}\left[\hat{\lambda} - (1-p)\pi_u\right] = \frac{1}{p}\left[\frac{n_o}{n} - (1-p)\pi_u\right] \tag{4}$$
with variance
$$\mathrm{Var}(\hat{\pi}_s) = \frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)}{np}\left[\pi_s(1-\pi_u) + \pi_u(1-\pi_s) + \frac{(1-p)\pi_u(1-\pi_u)}{p}\right]. \tag{5}$$
Estimating $\pi_s$ in this manner is termed the "Forced Randomized Response Model or Technique (FRRM or FRRT)". Among applied researchers, FRRM is the most famous design. Illegal waste disposal [9], prevalence of civilian cooperation with militant groups in southeastern Nigeria [20], vote choice regarding a Mississippi abortion referendum [21], use of performance enhancing drugs [22], xenophobia and anti-Semitism in Germany [23], illegal poaching among South African farmers [24], and violation of regulatory laws by commercial firms [25] are a few of the numerous practical applications of the Forced-RR-UQT.

[Figure 1: Probability tree diagram of the Warner original RRM. From the origin, the first statement is selected with probability $p$: a respondent $\in S$ answers "yes" (probability $\pi_s$) and a respondent $\notin S$ answers "no" (probability $1-\pi_s$). The second statement is selected with probability $1-p$: a respondent $\notin S$ answers "yes" (probability $1-\pi_s$) and a respondent $\in S$ answers "no" (probability $\pi_s$).]

2.3. Mangat Improved Two Step Procedure
An alternative form of the forced response model was developed by [15] as an optimization of one of his earlier designs. Mangat's procedure requires all respondents that have the sensitive attribute (S) to answer truthfully without use of the randomizer. All respondents who do not have the sensitive attribute are required to use the randomizer to choose which of Warner's statements to answer. This means that all no-answers are true negatives, and that only the yes-answers are contaminated. [15] estimator of $\pi_s$ is
$$\hat{\pi}_s = \frac{1}{p}\left(\frac{n_o}{n} - 1 + p\right) = \frac{1}{p}\left(\hat{\lambda} - 1 + p\right) \tag{6}$$
with variance
$$\mathrm{Var}(\hat{\pi}_s) = \frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)(1-\pi_s)}{np}. \tag{7}$$
Even though Mangat's estimator is the most efficient when compared with the earlier similar estimators or models, his design is unrealistic: respondents with the sensitive attribute will be more inclined to lie, making the population estimates less valid and the results and inferences of his procedure less trustworthy [8], [26]. Recently, [27] modified and applied [15] RRM to estimate two sensitive attributes simultaneously.

2.4. Stratified Warner's Randomized Response Model
[16] presented a stratified RRT of [7] original model using an optimal allocation, which is more efficient and cost effective than the proportional allocation of [2] technique. The maximum likelihood estimate of $\pi_s$ using [16] RRM is
$$\hat{\pi}_s = \sum_{h=1}^{L} W_h\hat{\pi}_{sh} = \sum_{h=1}^{L} W_h\left[\frac{\frac{n_{oh}}{n_h} - (1-p_h)}{2p_h-1}\right] = \sum_{h=1}^{L} W_h\left[\frac{\hat{\lambda}_h - (1-p_h)}{2p_h-1}\right] \tag{8}$$
and the minimal variance of $\hat{\pi}_s$ is given by
$$\mathrm{Var}(\hat{\pi}_s) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{p_h(1-p_h)}{(2p_h-1)^2}\right\}^{\frac{1}{2}}\right]^2 \tag{9}$$
where $W_h = \frac{N_h}{N}$ = stratum weight, $\hat{\lambda}_h = \frac{n_{oh}}{n_h}$ = proportion of yes-answers in stratum $h$, $p_h$ = probability that a respondent in the sample from stratum $h$ has a sensitive question (S) card, and $\hat{\pi}_{sh} = \frac{\hat{\lambda}_h - (1-p_h)}{2p_h-1}$ = proportion of respondents with the sensitive trait in stratum $h$, for $h = 1, 2, \cdots, L$.

2.5. Mixed Stratified Randomized Response Model
[17] proposed a mixed stratified RRM by taking two independent samples of sizes $n_1$ and $n_2$. Two randomization devices, $R_1$ and $R_2$, were used in each sample to estimate the Human Immunodeficiency Virus (HIV) seroprevalence rate in Kaduna State, Nigeria. Following the same maximum likelihood estimation procedure, an unbiased mixed stratified seroprevalence rate estimator is given by
$$\hat{\pi}_s = \sum_{h=1}^{L} W_h\hat{\pi}_{sh} = \sum_{h=1}^{L} W_h\left[\frac{n_{h1}}{n_h}\hat{\pi}_{sh1} + \frac{n_{h2}}{n_h}\hat{\pi}_{sh2}\right] \tag{10}$$
with variance given by
$$\mathrm{Var}(\hat{\pi}_s) = \sum_{h=1}^{L} W_h^2\left[\frac{n_{h1}}{n_h^2}\left\{\frac{(1-\pi_{sh})(p_{h1}\pi_{sh} + 1 - p_{h1})}{p_{h1}}\right\} + \frac{n_{h2}}{n_h^2}\left\{\frac{(1-\pi_{sh})(p_{h2}\pi_{sh} + 1 - p_{h2})}{p_{h2}}\right\}\right] \tag{11}$$
where $\hat{\pi}_{sh1} = \frac{\hat{\lambda}_{h1} - (1-p_{h1})}{p_{h1}}$, $\hat{\pi}_{sh2} = \frac{\hat{\lambda}_{h2} - (1-p_{h2})}{p_{h2}}$ and $\hat{\pi}_{sh} = \frac{n_{h1}}{n_h}\hat{\pi}_{sh1} + \frac{n_{h2}}{n_h}\hat{\pi}_{sh2}$. If $p_{h1} = p_{h2} = p_h$ and $n_{h1} = n_{h2} = n_{oh}$, the above estimator reduces to
$$\hat{\pi}_s = \sum_{h=1}^{L} W_h\hat{\pi}_{sh} = \sum_{h=1}^{L} W_h\,\frac{2n_{oh}}{n_h}\left[\frac{\hat{\lambda}_h - (1-p_h)}{p_h}\right] \tag{12}$$
with variance
$$\mathrm{Var}(\hat{\pi}_s) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{(1-p_h)(1-\pi_{sh})}{p_h}\right\}^{\frac{1}{2}}\right]^2 \tag{13–14}$$
where $\hat{\pi}_{sh} = 2\hat{\lambda}_h\left[\frac{\hat{\lambda}_h - (1-p_h)}{p_h}\right]$ for $h = 1, 2, \cdots, L$.

3. Methodology
3.1. Randomization Procedure
A simple random sample with replacement (SRSWR) of $n$ respondents was selected from the population.
Each respondent in the sample of size $n$ was instructed to use the randomization device (R), which consists of a sensitive question (S) card selected with probability $p$ ($p \neq 1$ is a pre-assigned value set by the researcher) and an unrelated question (U) card selected with probability $(1-p)$. The paired unrelated or innocuous question has a definite "no" response, for example:
(i) S: Are you a member of the insurgence group?
(ii) U: Is this the month of February (when the research is conducted in the month of September)?
In applied research, the phrase inside the brackets of the unrelated question must be excluded. After the researcher explained the procedure, the randomization followed. Each subject was given box 1 (containing both cards) from which to select a card at random, tick "yes" or "no" (based on his or her true status, i.e., under the assumption that respondents answer truthfully), and return the ticked card through an opening in box 2. Respondents were not placed in close proximity and were instructed not to let anyone (the investigator inclusive) see the card drawn, so as to maintain the integrity of the randomized response technique and keep each respondent's privacy or anonymity protected. Figure 2 illustrates the proposed modified FRRM.

3.2. Estimation Procedure: Classical Approach
Let
$$X_i = \begin{cases} 0, & \text{if the } i\text{th respondent says no} \\ 1, & \text{if the } i\text{th respondent says yes.} \end{cases} \tag{15}$$
Elementary probability theory can then be used to get an unbiased estimate of the prevalence ($\hat{\pi}_s$) of the sensitive attribute in the population. The procedure is as follows: from Figure 2, the probabilities of a "yes" and a "no" response are $\lambda = p\pi_s$ and $1-\lambda = p(1-\pi_s) + (1-p) = 1 - p\pi_s$, respectively. Therefore, the likelihood function of $\pi_s$ is
$$L(\pi_s) = \binom{n}{n_o}\,[p\pi_s]^{n_o}\,[p(1-\pi_s) + (1-p)]^{n-n_o}, \tag{16}$$
where $n_o$ is the total number of yes-answers in the sample of $n$ respondents.
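The randomization procedure of Section 3.1 can be mimicked with a short simulation. The sketch below is only an illustration under the paper's truthful-answering assumption; the function name `simulate_yes_count` and the values of `n`, `p` and `pi_s` are hypothetical choices, not taken from the study. It checks that the observed proportion of "yes" answers is close to $\lambda = p\pi_s$.

```python
import random


def simulate_yes_count(n, p, pi_s, seed=0):
    """Simulate n respondents under the proposed design; return the 'yes' count."""
    rng = random.Random(seed)
    n_yes = 0
    for _ in range(n):
        has_trait = rng.random() < pi_s      # respondent's true (hidden) status
        drew_sensitive = rng.random() < p    # S card is drawn with probability p
        # S card: truthful answer; U card: the answer is always "no"
        if drew_sensitive and has_trait:
            n_yes += 1
    return n_yes


# Hypothetical illustration values (assumptions, not the paper's data)
n, p, pi_s = 100_000, 0.7, 0.3
n_yes = simulate_yes_count(n, p, pi_s)
lambda_hat = n_yes / n
print(f"observed yes proportion = {lambda_hat:.4f}, p * pi_s = {p * pi_s:.4f}")
```

Because a "yes" can only come from a respondent who both holds the trait and drew the S card, the interviewer never learns which card produced a "no".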
To estimate $\pi_s$, note that the binomial coefficient $\binom{n}{n_o}$ in (16) does not contain $\pi_s$, so the study treats it as a constant. Hence, the function to maximize reduces to
$$L(\pi_s) = [p\pi_s]^{n_o}\,[p(1-\pi_s) + (1-p)]^{n-n_o}. \tag{17}$$
Since the logarithm is a monotone function, the $\hat{\pi}_s$ that maximizes the likelihood function also maximizes the log-likelihood function. Therefore, to facilitate computation, taking the natural logarithm of (17) yields the log-likelihood function
$$l(\pi_s) = n_o\ln[p\pi_s] + (n-n_o)\ln[p(1-\pi_s) + (1-p)]. \tag{18}$$
Differentiating (18) with respect to $\pi_s$ gives
$$\frac{d}{d\pi_s}\,l(\pi_s) = \frac{n_o\,p}{p\pi_s} + \frac{-(n-n_o)\,p}{p(1-\pi_s) + (1-p)}. \tag{19}$$
Equating (19) to zero and simplifying the resulting algebraic equation, the maximum likelihood estimator of $\pi_s$, denoted in this study by $\hat{\pi}_{s\,\text{propo}}$, is
$$\hat{\pi}_{s\,\text{propo}} = \frac{n_o}{np}, \qquad p \neq 1. \tag{20}$$

Remarks 1
(i) Putting $p = 1$ in either (16) or (20), the proposed MFRRM reduces to the conventional method of direct questioning, and the proposed RRT estimator becomes the traditional measure of proportion. The value $p = 1$ gives an absolute lack of protection; hence, $p = 1$ is impracticable in any randomized response survey. When $0 < p < \frac{1}{2}$ or $\frac{1}{2} < p < 1$, the respondent gives partially useful but not exact information about the class (sensitive or not) to which he or she belongs.
(ii) Like [11], [15] and [36], the proposed estimator $\hat{\pi}_{s\,\text{propo}}$ is not admissible, since the range of $\hat{\pi}_{s\,\text{propo}}$ is not confined to $(0, 1)$. In fact, if $\frac{n_o}{n} > p$, that is, if the observed proportion of yes-answers is greater than the pre-assigned randomization parameter, the proposed estimator produces an estimate whose value is greater than unity.

[Figure 2: Probability tree diagram of the proposed MFRR model. From the origin, the sensitive question is selected with probability $p$: a respondent $\in S$ answers "yes" (probability $\pi_s$) and a respondent $\notin S$ answers "no" (probability $1-\pi_s$). The unrelated question (U) is selected with probability $1-p$ and is answered "no" with probability 1.]

3.2.1. Sampling Distribution of the Proposed Estimator ($\hat{\pi}_{s\,\text{propo}}$)
Theorem 3.1 (Unbiasedness of $\hat{\pi}_{s\,\text{propo}}$).
The proposed estimator is an unbiased estimator of the population proportion of the sensitive attribute under survey.

Proof 3.1. $\hat{\pi}_{s\,\text{propo}}$ is said to be unbiased for the population parameter $\pi_s$ if and only if $E(\hat{\pi}_{s\,\text{propo}}) = \pi_s$. Taking the expectation of (20),
$$E(\hat{\pi}_{s\,\text{propo}}) = E\left[\frac{n_o}{np}\right] = \frac{1}{np}\sum_{i=1}^{n} E(X_i) = \frac{1}{np}\,np\pi_s = \pi_s, \tag{21}$$
which proves the theorem.

Theorem 3.2 (Variance of $\hat{\pi}_{s\,\text{propo}}$). The variance of the proposed estimator is
$$\mathrm{Var}(\hat{\pi}_{s\,\text{propo}}) = \frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)\pi_s}{np}. \tag{22}$$

Proof 3.2. This follows from taking the variance of (20) and considering Section 5.5 of [28]:
$$\mathrm{Var}(\hat{\pi}_{s\,\text{propo}}) = \mathrm{Var}\left[\frac{n_o}{np}\right] = \frac{1}{(np)^2}\sum_{i=1}^{n}\mathrm{Var}(X_i) = \frac{np\pi_s\left[p(1-\pi_s) + (1-p)\right]}{(np)^2} = \frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)\pi_s}{np}.$$
The sample estimator of $\mathrm{Var}(\hat{\pi}_{s\,\text{propo}})$ is obtained by substituting (20) into the preceding equation:
$$\widehat{\mathrm{Var}}(\hat{\pi}_{s\,\text{propo}}) = \frac{\hat{\pi}_s(1-\hat{\pi}_s)}{n} + \frac{(1-p)\hat{\pi}_s}{np}. \tag{23}$$

Remark 2: Equation (23) consists of two parts. The first is the variance of the estimated proportion of the sensitive attribute that would apply if all respondents were willing to accept direct questioning. The second part is introduced by the use of the randomizer. Putting $\pi_u = 0$ in equations (4) and (5), the results coincide with the proposed estimator in (20) and its variance in (22), respectively.

3.2.2. Stratified Sampling Strategy for the Proposed MFRRM
In the stratified version of the proposed model, the population (P) is partitioned (using a suitable stratification factor) into $L$ strata, and a sample of size $n_h$ ($h = 1, 2, \cdots, L$) is selected by simple random sampling with replacement in each stratum such that $n = \sum_{h=1}^{L} n_h$. To get the full benefit from stratification, the study assumes that $N_h$ (the number of units in each stratum) is known. Following the same randomization procedure as in the proposed (SRSWR) model, respondents sampled from different strata use different randomization devices, each having a different preassigned probability $p_h$.
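Before turning to the stratum-level quantities, Theorems 3.1 and 3.2 for the SRSWR estimator can be checked numerically. The following Monte Carlo sketch is an illustration only: the helper names and the values of `n`, `p`, `pi_s` and `reps` are hypothetical, and each respondent's answer is drawn as a Bernoulli variable with success probability $\lambda = p\pi_s$, as implied by Figure 2.

```python
import random
import statistics


def mfrrm_estimate(n_yes, n, p):
    """ML estimator of equation (20): n_o / (n p)."""
    return n_yes / (n * p)


def mfrrm_variance(pi_s, n, p):
    """Theoretical variance of equation (22)."""
    return pi_s * (1 - pi_s) / n + (1 - p) * pi_s / (n * p)


# Hypothetical settings for the check (assumptions, not the paper's data)
rng = random.Random(42)
n, p, pi_s, reps = 200, 0.7, 0.3, 2000
lam = p * pi_s  # probability of a "yes" answer

estimates = []
for _ in range(reps):
    n_yes = sum(rng.random() < lam for _ in range(n))  # binomial(n, lam) draw
    estimates.append(mfrrm_estimate(n_yes, n, p))

mc_mean = statistics.fmean(estimates)      # should be close to pi_s
mc_var = statistics.pvariance(estimates)   # should be close to equation (22)
theo_var = mfrrm_variance(pi_s, n, p)
print(f"mean = {mc_mean:.4f}, MC variance = {mc_var:.6f}, eq. (22) = {theo_var:.6f}")
```

With enough replications the simulated mean settles near $\pi_s$ and the simulated variance near (22), which is what unbiasedness and Theorem 3.2 predict.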
Under the assumption that the respondents' "yes" or "no" reports are made truthfully and that $p_h \neq 1$ is set by the researcher, the probabilities of a "yes" and a "no" response from stratum $h$ are $\Pr(X_{ih} = 1) = \lambda_h = p_h\pi_{sh}$ and $\Pr(X_{ih} = 0) = 1 - \lambda_h = p_h(1-\pi_{sh}) + (1-p_h)$, respectively, for $h = 1, 2, \cdots, L$. Suppose $n_{oh}$ respondents report "yes" and $(n_h - n_{oh})$ report "no" in stratum $h$. The likelihood function of the sample in stratum $h$ is
$$L(\pi_{sh}) = [p_h\pi_{sh}]^{n_{oh}}\,[p_h(1-\pi_{sh}) + (1-p_h)]^{(n_h - n_{oh})}. \tag{24}$$
For computational convenience, the natural logarithm of the likelihood function is
$$l(\pi_{sh}) = n_{oh}\ln[p_h\pi_{sh}] + (n_h - n_{oh})\ln[p_h(1-\pi_{sh}) + (1-p_h)]. \tag{25}$$
To obtain the maximum likelihood estimator of $\pi_{sh}$, the study differentiates (25) with respect to $\pi_{sh}$ and equates the derivative to zero to get
$$0 = \frac{n_{oh}\,p_h}{p_h\pi_{sh}} - \frac{(n_h - n_{oh})\,p_h}{p_h(1-\pi_{sh}) + (1-p_h)}. \tag{26}$$
Solving equation (26) for $\pi_{sh}$, the estimator in terms of the responses of the respondents in stratum $h$ is given by
$$\hat{\pi}_{sh} = \frac{n_{oh}}{n_h p_h} = \frac{\hat{\lambda}_h}{p_h}, \tag{27}$$
where $\hat{\lambda}_h$ = proportion of "yes"-answers in stratum $h$ and $\hat{\pi}_{sh}$ is an unbiased estimate of $\pi_{sh}$.

Variance of $\hat{\pi}_{sh}$: The variance of $\hat{\pi}_{sh}$ is obtained by taking the variance of (27) as follows:
$$\mathrm{Var}(\hat{\pi}_{sh}) = \mathrm{Var}\left[\frac{n_{oh}}{n_h p_h}\right] = \frac{1}{(n_h p_h)^2}\mathrm{Var}(n_{oh}) = \frac{n_h p_h\pi_{sh}\left[p_h(1-\pi_{sh}) + (1-p_h)\right]}{(n_h p_h)^2} = \frac{p_h\pi_{sh}(1-\pi_{sh})}{n_h p_h} + \frac{\pi_{sh}(1-p_h)}{n_h p_h}.$$
Hence,
$$\mathrm{Var}(\hat{\pi}_{sh}) = \frac{\pi_{sh}(1-\pi_{sh})}{n_h} + \frac{\pi_{sh}(1-p_h)}{n_h p_h}. \tag{28}$$
Since selections in different strata are made independently, the maximum likelihood estimate of $\pi_s$ is easily shown to be
$$\hat{\pi}_{st\,\text{propo}} = \sum_{h=1}^{L} W_h\hat{\pi}_{sh} = \sum_{h=1}^{L} W_h\frac{\hat{\lambda}_h}{p_h} = \frac{1}{N}\sum_{h=1}^{L} N_h\frac{n_{oh}}{n_h p_h}, \tag{29}$$
where $N$ and $N_h$ denote the number of subjects in the whole population and in stratum $h$, respectively, and $W_h = \frac{N_h}{N}$ is the stratum weight, for $h = 1, 2, \cdots, L$, so that $\sum_{h=1}^{L} W_h = 1$.

3.2.3. Sampling Distribution of the Proposed Stratified Estimator $\hat{\pi}_{st\,\text{propo}}$
Theorem 3.3 (Unbiasedness of $\hat{\pi}_{st\,\text{propo}}$).
The proposed stratified estimator is an unbiased estimator of the population proportion of the sensitive attribute under study.

Proof 3.3. $\hat{\pi}_{st\,\text{propo}}$ is said to be unbiased for the population parameter $\pi_s$ if $E(\hat{\pi}_{st\,\text{propo}}) = \pi_s$. Taking the expectation of (29), and given that $\hat{\pi}_{sh}$ is unbiased for $\pi_{sh}$,
$$E\left(\hat{\pi}_{st\,\text{propo}}\right) = E\left[\sum_{h=1}^{L} W_h\hat{\pi}_{sh}\right] = \sum_{h=1}^{L} W_h E(\hat{\pi}_{sh}) = \sum_{h=1}^{L} W_h\pi_{sh} = \pi_s. \tag{30}$$

Theorem 3.4 (Variance of $\hat{\pi}_{st\,\text{propo}}$). The variance of $\hat{\pi}_{st\,\text{propo}}$ is
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2. \tag{31}$$

Proof 3.4. This follows from taking the variance of (29) and from Corollary 1 in Section 5.9 of [28]. Since each unbiased estimator $\hat{\pi}_{sh}$ has its own variance, the variance of $\hat{\pi}_{st\,\text{propo}}$ is
$$\mathrm{Var}\left(\hat{\pi}_{st\,\text{propo}}\right) = \mathrm{Var}\left[\sum_{h=1}^{L} W_h\hat{\pi}_{sh}\right] = \sum_{h=1}^{L} W_h^2\,\mathrm{Var}(\hat{\pi}_{sh}). \tag{32}$$
Substituting (28) into (32), we have
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \sum_{h=1}^{L}\frac{W_h^2}{n_h}\left[\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right]. \tag{33}$$
Recall that under simple random sampling, $\mathrm{Var}(\bar{y}) = (1-f)\frac{S_y^2}{n}$. Similarly,
$$\mathrm{Var}(\hat{\pi}_{sh}) = (1-f_h)\frac{S^2_{\pi_{sh}}}{n_h} = \left(1 - \frac{n_h}{n}\right)\frac{S^2_{\pi_{sh}}}{n_h}. \tag{34}$$
[28] established that when sampling with replacement, the sampling fraction is ignorable: as $n \to N$ relative to $n_h$, $\frac{n_h}{n} \to 0$ and $1 - f_h = 1 - \frac{n_h}{n} \to 1$. Therefore, equation (34) reduces to
$$\mathrm{Var}(\hat{\pi}_{sh}) = \frac{S^2_{\pi_{sh}}}{n_h}. \tag{35}$$
Solving equation (35) for $S_{\pi_{sh}}$ and substituting equation (28) into the resulting expression produces
$$S_{\pi_{sh}} = \left[\pi_{sh}(1-\pi_{sh}) + \frac{(1-p_h)\pi_{sh}}{p_h}\right]^{\frac{1}{2}}. \tag{36}$$
Under optimum allocation, the sample sizes are chosen to minimize the variance for a given cost. For fixed cost, by the Cauchy-Schwarz inequality, the sample size $n_h$ that minimizes $\mathrm{Var}(\hat{\pi}_{st\,\text{propo}})$ is given by
$$n_h = \frac{nW_hS_{\pi_{sh}}}{\sum_{h=1}^{L} W_hS_{\pi_{sh}}}. \tag{37}$$
The optimal allocation of $n$ to $n_1, n_2, \cdots, n_{L-1}, n_L$ that yields the minimum variance of $\hat{\pi}_s$ subject to $n = \sum_{h=1}^{L} n_h$ is obtained by substituting (36) into (37) to give
$$n_h = \frac{nW_h\left[\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right]^{\frac{1}{2}}}{\sum_{h=1}^{L} W_h\left[\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right]^{\frac{1}{2}}}. \tag{38}$$
The minimal variance of $\hat{\pi}_{st\,\text{propo}}$ is then obtained by substituting (38) into (33), so that
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \sum_{h=1}^{L} W_h^2\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\} \div \left[\frac{nW_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}}{\sum_{k=1}^{L} W_k\left\{\pi_{sk}(1-\pi_{sk}) + \frac{\pi_{sk}(1-p_k)}{p_k}\right\}^{\frac{1}{2}}}\right] \tag{39}$$
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \sum_{h=1}^{L} W_h^2\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\} \times \frac{\sum_{k=1}^{L} W_k\left\{\pi_{sk}(1-\pi_{sk}) + \frac{\pi_{sk}(1-p_k)}{p_k}\right\}^{\frac{1}{2}}}{nW_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}}. \tag{40}$$
Here the inner sum over $k$ is the normalizing sum from the denominator of (38), including its stratum weights. Further simplification of (40) yields
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2.$$
The unbiased minimal variance of $\hat{\pi}_{st\,\text{propo}}$ follows on replacing $n$ by $(n-1)$ in equation (31) to get
$$\widehat{\mathrm{Var}}(\hat{\pi}_{st\,\text{propo}}) = \frac{1}{n-1}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2. \tag{41}$$
If $n$ is large, however, the difference between equations (31) and (41) is negligible. If $p_h = p$ for $h = 1, 2, \cdots, L$, (31) reduces to
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p)}{p}\right\}^{\frac{1}{2}}\right]^2. \tag{42}$$
To facilitate computation of $\mathrm{Var}(\hat{\pi}_{st\,\text{propo}})$, let $\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}$ be replaced by $\varphi_{sh}$ in equation (31), so that
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\varphi_{sh}\right]^2. \tag{43}$$

3.3. Efficiency Comparison
The efficiency of the proposed model relative to Warner's original model, Boruch's FRRM and Mangat's improved two step procedure is judged by the Mean Square Error (MSE) criterion. Since the existing and the proposed estimators are all unbiased, the criterion for judging the performance of the proposed estimator reduces to a comparison of variances.

3.3.1.
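Efficiency comparison with Warner's (1965) Original RRM
In this study, we denote the variance under Warner's design by $\mathrm{Var}(\hat{\pi}_{s\,\text{warner}})$ and that under the proposed model by $\mathrm{Var}(\hat{\pi}_{s\,\text{propo}})$. Recall from equations (3) and (22) that
$$\mathrm{Var}(\hat{\pi}_{s\,\text{warner}}) = \frac{\pi_s(1-\pi_s)}{n} + \frac{p(1-p)}{n(2p-1)^2}$$
and
$$\mathrm{Var}(\hat{\pi}_{s\,\text{propo}}) = \frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)\pi_s}{np}.$$
The proposed modified FRRM is more efficient than Warner's RRM if $\mathrm{Var}(\hat{\pi}_{s\,\text{warner}}) - \mathrm{Var}(\hat{\pi}_{s\,\text{propo}}) \geq 0$. That is,
$$\left\{\frac{\pi_s(1-\pi_s)}{n} + \frac{p(1-p)}{n(2p-1)^2}\right\} - \left\{\frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)\pi_s}{np}\right\} \geq 0.$$
Multiplying through by $\frac{np(2p-1)^2}{1-p}$ (for $p < 1$, $p \neq \frac{1}{2}$), this implies that
$$p^2 - (2p-1)^2\pi_s \geq 0. \tag{44}$$
Since $(2p-1)^2 \leq p^2$ whenever $\frac{1}{3} \leq p \leq 1$, inequality (44) holds for every $\pi_s \in [0, 1]$ over that range of $p$, which includes the values $p \in [0.6, 0.8]$ considered in Section 4; for $p < \frac{1}{3}$ it holds only when $\pi_s \leq \frac{p^2}{(2p-1)^2}$. Therefore, for such $p$, the proposed model is more efficient than [7] original model.

3.3.2. Efficiency Comparison with Boruch (1971) Original FRRM
Recall from equation (5) that Boruch's variance of the estimator of $\pi_s$, denoted here by $\mathrm{Var}(\hat{\pi}_{s\,\text{boruch}})$, is
$$\mathrm{Var}(\hat{\pi}_{s\,\text{boruch}}) = \frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)}{np}\left[\pi_s(1-\pi_u) + \pi_u(1-\pi_s) + \frac{(1-p)\pi_u(1-\pi_u)}{p}\right]. \tag{45}$$
The proposed MFRRM is more efficient than Boruch's FRRM if $\mathrm{Var}(\hat{\pi}_{s\,\text{boruch}}) - \mathrm{Var}(\hat{\pi}_{s\,\text{propo}}) \geq 0$. That is,
$$\frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)}{np}\left[\pi_s(1-\pi_u) + \pi_u(1-\pi_s) + \frac{(1-p)\pi_u(1-\pi_u)}{p}\right] - \left\{\frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)\pi_s}{np}\right\} \geq 0, \tag{46}$$
which implies
$$\pi_s(1-\pi_u) + \pi_u(1-\pi_s) + \frac{(1-p)\pi_u(1-\pi_u)}{p} - \pi_s \geq 0,$$
and further simplification (for $\pi_u > 0$) gives
$$\pi_s \leq \frac{1 - \pi_u(1-p)}{2p}. \tag{47}$$
If the condition in (47) is satisfied, then the proposed MFRRM is more efficient than [12] original FRRM.

3.3.3. Efficiency Comparison with Mangat's (1994) Improved Two-Step Procedure
From equation (7), the variance of the estimator of $\pi_s$ under Mangat's design is
$$\mathrm{Var}(\hat{\pi}_{s\,\text{mangat}}) = \frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)(1-\pi_s)}{np}.$$
The proposed MFRRM is more efficient than Mangat's improved two-step procedure if $\mathrm{Var}(\hat{\pi}_{s\,\text{mangat}}) - \mathrm{Var}(\hat{\pi}_{s\,\text{propo}}) \geq 0$. That is,
$$\left\{\frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)(1-\pi_s)}{np}\right\} - \left\{\frac{\pi_s(1-\pi_s)}{n} + \frac{(1-p)\pi_s}{np}\right\} \geq 0.$$
After simplification,
$$1 - 2\pi_s \geq 0 \implies \pi_s \leq \frac{1}{2}. \tag{48}$$
The inequality (48) holds for all $\pi_s \leq \frac{1}{2}$. Hence, the proposed MFRRM is more efficient than Mangat's improved two step procedure if and only if $\pi_s \leq \frac{1}{2}$.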
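The variance comparisons in Sections 3.3.1 to 3.3.3 can be tabulated directly from equations (3), (5), (7) and (22). The sketch below is illustrative only: the function names and the grid of `p`, `pi_s` and `pi_u` values are hypothetical and are not the settings behind the paper's tables.

```python
def var_warner(pi_s, n, p):
    """Equation (3); requires p != 0.5."""
    return pi_s * (1 - pi_s) / n + p * (1 - p) / (n * (2 * p - 1) ** 2)


def var_boruch(pi_s, n, p, pi_u):
    """Equation (5), the forced response model with known pi_u."""
    extra = (pi_s * (1 - pi_u) + pi_u * (1 - pi_s)
             + (1 - p) * pi_u * (1 - pi_u) / p)
    return pi_s * (1 - pi_s) / n + (1 - p) / (n * p) * extra


def var_mangat(pi_s, n, p):
    """Equation (7)."""
    return pi_s * (1 - pi_s) / n + (1 - p) * (1 - pi_s) / (n * p)


def var_proposed(pi_s, n, p):
    """Equation (22), the proposed MFRRM."""
    return pi_s * (1 - pi_s) / n + (1 - p) * pi_s / (n * p)


# Hypothetical grid of design values (assumptions for illustration)
n, pi_u = 100, 0.25
for p in (0.6, 0.7, 0.8):
    for pi_s in (0.1, 0.3, 0.5, 0.7):
        print(p, pi_s,
              round(var_warner(pi_s, n, p), 6),
              round(var_boruch(pi_s, n, p, pi_u), 6),
              round(var_mangat(pi_s, n, p), 6),
              round(var_proposed(pi_s, n, p), 6))
```

On such a grid the proposed variance should fall below Mangat's only when $\pi_s \leq \frac{1}{2}$, in line with (48), and below Boruch's only when condition (47) holds.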
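3.3.4. Efficiency Comparison with Kim and Warde Stratified Model
Recall that the variances of the estimator of $\pi_s$ under Kim and Warde's model, denoted $\mathrm{Var}(\hat{\pi}_{s\,\text{kw}})$, and under the proposed stratified MFRRM are
$$\mathrm{Var}(\hat{\pi}_{s\,\text{kw}}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{p_h(1-p_h)}{(2p_h-1)^2}\right\}^{\frac{1}{2}}\right]^2$$
and
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2,$$
respectively. The proposed stratified MFRRM is more efficient than the [16] stratified model if the relative efficiency $\mathrm{RE} = \frac{\mathrm{Var}(\hat{\pi}_{s\,\text{kw}})}{\mathrm{Var}(\hat{\pi}_{st\,\text{propo}})} \geq 1$, that is, if $\mathrm{Var}(\hat{\pi}_{s\,\text{kw}}) - \mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) \geq 0$. Using this condition,
$$\frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{p_h(1-p_h)}{(2p_h-1)^2}\right\}^{\frac{1}{2}}\right]^2 - \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2 \geq 0. \tag{49}$$
The above inequality is true if, for each stratum $h$, $h = 1, 2, \cdots, L$, we have
$$\left[\pi_{sh}(1-\pi_{sh}) + \frac{p_h(1-p_h)}{(2p_h-1)^2}\right]^{\frac{1}{2}} - \left[\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right]^{\frac{1}{2}} \geq 0,$$
which implies
$$\frac{p_h(1-p_h)}{(2p_h-1)^2} - \frac{\pi_{sh}(1-p_h)}{p_h} \geq 0.$$
Multiplying the preceding inequality throughout by $\frac{p_h(2p_h-1)^2}{1-p_h}$ gives
$$p_h^2 - (2p_h-1)^2\pi_{sh} \geq 0. \tag{50}$$
Since $(2p_h-1)^2 \leq p_h^2$ whenever $\frac{1}{3} \leq p_h < 1$, the LHS of (50) is non-negative for every $\pi_{sh} \in (0, 1)$ over that range of $p_h$; for $p_h < \frac{1}{3}$ it is non-negative only when $\pi_{sh} \leq \frac{p_h^2}{(2p_h-1)^2}$. Hence, for such $p_h$, the proposed stratified MFRRM is more efficient than [16] stratified randomized response model.

3.3.5. Efficiency Comparison With Usman and Oshungade Mixed Stratified RRM
[17] introduced the Mixed Stratified RRM for an HIV sero-prevalence survey, with $\mathrm{Var}(\hat{\pi}_s)$ given by
$$\mathrm{Var}(\hat{\pi}_{s\,\text{usman}}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{(1-p_h)(1-\pi_{sh})}{p_h}\right\}^{\frac{1}{2}}\right]^2.$$
The proposed stratified Modified FRRM is more efficient than the [17] mixed stratified RRM if the relative efficiency $\mathrm{RE} = \frac{\mathrm{Var}(\hat{\pi}_{s\,\text{usman}})}{\mathrm{Var}(\hat{\pi}_{st\,\text{propo}})} \geq 1$, that is, if $\mathrm{Var}(\hat{\pi}_{s\,\text{usman}}) - \mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) \geq 0$.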
Therefore,
$$\frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{(1-p_h)(1-\pi_{sh})}{p_h}\right\}^{\frac{1}{2}}\right]^2 - \frac{1}{n}\left[\sum_{h=1}^{L} W_h\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2 \geq 0. \tag{51}$$
The inequality (51) holds if, for each stratum $h$ ($h = 1, 2, \cdots, L$), we have
$$\left\{\pi_{sh}(1-\pi_{sh}) + \frac{(1-p_h)(1-\pi_{sh})}{p_h}\right\}^{\frac{1}{2}} - \left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}} \geq 0, \tag{52}$$
which reduces to
$$\frac{(1-p_h)(1-\pi_{sh})}{p_h} - \frac{\pi_{sh}(1-p_h)}{p_h} \geq 0. \tag{53}$$
Multiplying inequality (53) through by $\frac{p_h}{1-p_h}$ yields
$$1 - 2\pi_{sh} \geq 0. \tag{54}$$
The inequality (54) holds for all $\pi_{sh} \leq \frac{1}{2}$. That is, the proposed model is more efficient than the [17] mixed stratified RR model if $\pi_{sh} \leq \frac{1}{2}$ for $h = 1, 2, \cdots, L$.

3.3.6. Cost and Efficiency of Stratification
It is essential to consider cases with more than two or three strata in terms of efficiency. [28] showed that the variance of the mean of a stratified random sample decreases as the number of strata increases. This section therefore explores the behaviour of $\mathrm{Var}(\hat{\pi}_{st\,\text{propo}})$ as the number of strata increases. Suppose $L$ strata of equal sizes are created such that $W_h = \frac{1}{L}$. Substituting $W_h = \frac{1}{L}$ in equation (31) produces
$$\mathrm{Var}(\hat{\pi}_{st\,\text{propo}}) = \frac{1}{nL^2}\left[\sum_{h=1}^{L}\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2. \tag{55}$$
Let $f(L) = \frac{1}{L^2}\left[\sum_{h=1}^{L}\left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}\right]^2$, where $L$ is a positive integer. We want to show that $f(L) - f(L+1) \geq 0$. Writing $\mathcal{L}(\pi_{sh}, p_h) = \left\{\pi_{sh}(1-\pi_{sh}) + \frac{\pi_{sh}(1-p_h)}{p_h}\right\}^{\frac{1}{2}}$,
$$f(L) - f(L+1) = \frac{1}{L^2}\left[\sum_{h=1}^{L}\mathcal{L}(\pi_{sh}, p_h)\right]^2 - \frac{1}{(L+1)^2}\left[\sum_{h=1}^{L+1}\mathcal{L}(\pi_{sh}, p_h)\right]^2$$
$$= \left[\frac{1}{L}\sum_{h=1}^{L}\mathcal{L}(\pi_{sh}, p_h)\right]^2 - \left[\frac{1}{L+1}\sum_{h=1}^{L+1}\mathcal{L}(\pi_{sh}, p_h)\right]^2$$
$$= \left[\frac{1}{L}\sum_{h=1}^{L}\mathcal{L}(\pi_{sh}, p_h) + \frac{1}{L+1}\sum_{h=1}^{L+1}\mathcal{L}(\pi_{sh}, p_h)\right] \times \left[\frac{1}{L}\sum_{h=1}^{L}\mathcal{L}(\pi_{sh}, p_h) - \frac{1}{L+1}\sum_{h=1}^{L+1}\mathcal{L}(\pi_{sh}, p_h)\right].$$
As the number of strata increases, it may be possible to divide a heterogeneous population into sub-populations, each of which is more homogeneous. So we may get
$$\left[\frac{1}{L}\sum_{h=1}^{L}\mathcal{L}(\pi_{sh}, p_h) - \frac{1}{L+1}\sum_{h=1}^{L+1}\mathcal{L}(\pi_{sh}, p_h)\right] \geq 0. \tag{56}$$
Under this assumption, $f(L)$ is a monotone decreasing function of $L$. Therefore, the variance of the proposed estimator gets smaller as the number of strata increases.

3.4. Bayesian Estimation Approach
In making generalizations, conclusions or predictions about unknown population parameter(s), it is usual to distinguish between the classical method of estimation, whereby inferences are based strictly on information obtained from a random sample, and the Bayesian method, which utilizes prior subjective knowledge about the probability distribution in conjunction with the information provided by the sample data. [29] and [30] estimated Warner's RRM using a Bayesian approach with binomial likelihood and beta prior as
$$\hat{\pi}_{bw} = \frac{\frac{(a+n_o)}{(a+b+n)} - (1-p)}{(2p-1)} \tag{57}$$
with posterior variance defined as
$$\mathrm{Var}(\hat{\pi}_{bw}) = \frac{n\pi_s(1-\pi_s)}{(a+b+n)^2} + \frac{np(1-p)}{(2p-1)^2(a+b+n)^2}. \tag{58}$$
Similarly, the model for the proposed sensitive random variable $X$ conditional on the unknown parameter $\lambda$, popularly called the likelihood function, is the density function $f(x|\lambda) = L(\lambda)$ given as
$$L(\lambda) = f(n_o|\lambda) = \binom{n}{n_o}\lambda^{n_o}(1-\lambda)^{n-n_o} = \binom{n}{n_o}(p\pi_s)^{n_o}\,[p(1-\pi_s) + (1-p)]^{n-n_o}. \tag{59}$$
The parameter $\lambda = p\pi_s$ is considered a random variable with a distribution $g(\lambda)$, called the prior distribution, which describes uncertainty about the parameter before the data are observed. A convenient family of densities for a proportion $\lambda$ is the beta, with kernel $g(\lambda) \propto \lambda^{a-1}(1-\lambda)^{b-1}$, $0 < \lambda < 1$.
Hence, g(λ) = 1 β(a, b) λa−1(1 −λ)b−1 = 1 β(a, b) (pπs) a−1[p(1 −πs) + (1 − p)] b−1, 0 < λ < 1 (60) where the hyper-parameters a and b are chosen to reflect the user’s prior beliefs about λ. Our goal is to start with this prior information and update it using the data to make the best pos- sible estimator of λ. Multiplying this beta prior with the likeli- hood function gives the joint density function h(no,λ) as h(no,λ) = f (no|λ)g(λ) = ( n no ) β(a, b) ( pπs) a+no−1[ p(1 −πs) + (1 − p)] b+n−no−1.(61) The marginal distribution (m(no)) can be obtained by integrat- ing out parameter λ from the joint distribution as m(no) = ∫ � h(no,λ)dλ = ∫ � f (no|λ)g(λ)dλ = ∫ 1 0 ( n no ) β(a, b) ( pπs) a+no−1[ p(1 −πs) + (1 − p)] b+n−no−1d( pπs) = ∫ 1 0 ( n no ) β(a, b) ( pπs) a+no−1(1 − pπs) b+n−no−1d( pπs). Therefore, m(no) = ( n no ) β(a, b) β(a + no, b + n − no). (62) Combining this beta prior with the likelihood function, one can find f (λ|no), called the posterior distribution/density for λ given X = no as f (λ|no) = f (no|λ)g(λ)∫ � f (no|λ)g(λ)dλ = h(no,λ) m(no) = ( n no ) β(a, b) ( pπs) a+no−1[ p(1 −πs) + (1 − p)] b+n−no−1 × β(a, b)( n no ) β(a + no, b + n − no) which gives f (λ|no) = ( pπs)a+no−1(1 − pπs)b+n−no−1 β(a + no, b + n − no) = (λ)a+no−1(1 −λ)b+n−no−1 β(a + no, b + n − no) . (63) Like g(λ), f (λ|no) also follows beta-distribution with updated parameters a + no and b + n − no. This is an example of a con- jugate analysis where the prior and posterior densities have the same functional form. The prior g(λ) is said to be a conjugate with respect to f (no|λ) [31, 32]. The Bayes estimator of h(λ) under the squared error loss is a ratio of integrals ĥλ = ∫ � h(λ)π(λ|no)dλ = ∫ � h(λ) f (no|λ)g(λ)dλ∫ � f (no|λ)g(λ)dλ (64) 44 Adeniran et al. / J. Nig. Soc. Phys. Sci. 
Following from (64) above, the posterior mean $\hat{\lambda}$ is

$$\hat{\lambda} = \int_0^1 \lambda\, f(\lambda|n_o)\, d\lambda = \int_0^1 \lambda\, \frac{\lambda^{(a + n_o) - 1}(1 - \lambda)^{(b + n - n_o) - 1}}{\beta(a + n_o,\, b + n - n_o)}\, d\lambda$$

$$p\hat{\pi}_s = \frac{1}{\beta(a + n_o,\, b + n - n_o)} \int_0^1 \lambda^{(a + n_o + 1) - 1}(1 - \lambda)^{(b + n - n_o) - 1}\, d\lambda = \frac{\beta(a + n_o + 1,\, b + n - n_o)}{\beta(a + n_o,\, b + n - n_o)}$$

$$= \frac{\Gamma(a + b + n)}{\Gamma(a + n_o)\, \Gamma(b + n - n_o)} \times \frac{\Gamma(a + n_o + 1)\, \Gamma(b + n - n_o)}{\Gamma(a + b + n + 1)}.$$

Using the recurrence $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$ (for integer $\alpha$, $\Gamma(\alpha) = (\alpha - 1)!$), the equation above reduces to $p\hat{\pi}_s = (a + n_o)/(a + b + n)$, which yields

$$\hat{\pi}_s = \frac{a + n_o}{p(a + b + n)}. \tag{65}$$

The variance of the Bayesian estimator of the proposed RRM follows by taking the variance of (65):

$$Var(\hat{\pi}_s) = Var\left[\frac{a + n_o}{p(a + b + n)}\right] = \frac{1}{[p(a + b + n)]^2}\, np\pi_s\left[p(1 - \pi_s) + (1 - p)\right]. \tag{66}$$

After some algebraic operations, equation (66) gives

$$\widehat{Var}(\hat{\pi}_s) = \frac{n\pi_s(1 - \pi_s)}{(a + b + n)^2} + \frac{n(1 - p)\pi_s}{p(a + b + n)^2}. \tag{67}$$

For large $n$, the influence of $a$ and $b$ in equations (65) and (67) becomes negligible because the weight $n/(a + b + n) \to 1$. Consequently, the posterior mean and variance approach the classical (maximum likelihood) estimators in equations (20) and (23), respectively.

4. Results and Discussion

This section presents the analysis of simulated, binomially distributed data (with parameters $(n, p)$) of different sample sizes to validate the developed model. [33] established that the best average value of $p$ across RR studies is 0.7. However, this study considered values of $p$ within the range [0.6, 0.8] in equal steps of 0.1. The proposed model was estimated, and a variance comparison was carried out to examine its performance against some other existing models using R version 3.6.0 [34] and the "LearnBayes" package in R [35]. The tables and figures showing the results are presented below, followed by the discussion of the results.

4.1. Discussion

In terms of minimum variance, columns 2 and 6 of Tables 4, 5 and 6 show that the proposed RR model is more efficient and precise than the original RRM of [7].
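The comparison between Warner's model and the proposed model can be reproduced from equations (57), (58), (65) and (67). The following is a minimal sketch rather than the authors' R code: the function names are hypothetical, and setting $a = b = 0$ is used here as a shortcut for the classical (maximum likelihood) versions, an assumption that agrees with the MLE columns of Tables 2 and 4.

```python
def warner_estimate(n, n_yes, p, a=0.0, b=0.0):
    """Equation (57); a = b = 0 gives the classical Warner estimate."""
    return ((a + n_yes) / (a + b + n) - (1 - p)) / (2 * p - 1)


def warner_variance(n, pi_s, p, a=0.0, b=0.0):
    """Equation (58); a = b = 0 gives the classical Warner variance."""
    return (n * pi_s * (1 - pi_s)
            + n * p * (1 - p) / (2 * p - 1) ** 2) / (a + b + n) ** 2


def proposed_estimate(n, n_yes, p, a=0.0, b=0.0):
    """Equation (65); a = b = 0 gives n_yes / (n * p)."""
    return (a + n_yes) / (p * (a + b + n))


def proposed_variance(n, pi_s, p, a=0.0, b=0.0):
    """Equation (67); a = b = 0 gives the classical variance."""
    return (n * pi_s * (1 - pi_s)
            + n * (1 - p) * pi_s / p) / (a + b + n) ** 2


# First rows of Table 2 (estimates) and Table 4 (variances): p = 0.7.
prior = {"a": 10.2, "b": 27.4}
print(proposed_estimate(200, 135, 0.7), proposed_estimate(200, 135, 0.7, **prior))
print(warner_estimate(200, 135, 0.7), warner_estimate(200, 135, 0.7, **prior))
print(proposed_variance(200, 0.1, 0.7), warner_variance(200, 0.1, 0.7))
```

With these inputs the proposed variance is roughly one tenth of Warner's, which is the pattern reported in columns 2 and 6 of Table 4.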
When $\pi_u \neq 0$ and $\pi_s$ is in the neighbourhood of 0.6 or below, columns 4 and 6 of Tables 4, 5 and 6 reveal that the proposed MFRRM is more efficient than the original FRRM of [12]. In addition, Table 7 shows that if $\pi_u = 0$, [12] and the proposed model produce the same results, not only for $Var(\hat{\pi}_s)$ but also for $\hat{\pi}_s$, across different sample sizes and regardless of the pre-assigned randomization parameter ($p$). Mangat's procedure is more efficient but less effective when compared with the earlier similar estimators (models). The proposed model is more protective than the RRM of [15]. Moreover, the results demonstrate that it is more efficient under the following condition: $\pi_s \le \frac{1}{2}$. This condition is already established in equation (48). A more detailed view of the results is given in columns 5 and 6 of Tables 4, 5 and 6, which show the conditional efficiency of the two designs. When $\pi_s > \frac{1}{2}$, Mangat's model is more efficient (see Table 6).

Like the model of [11], and especially if the sensitive variable under study is highly prevalent, the proposed estimator tends to produce estimates whose values are greater than unity. Tables 1, 2 and 3 confirm this possibility. The second column of Table 9 also shows that Kim and Warde's stratified Warner RRM has the same property of producing a proportion greater than unity.

Furthermore, Bayesian estimation of both Warner's model and the proposed MFRRM provides relatively more precise estimators than their classical (maximum likelihood) counterparts, conditional on the sample size ($n$) and the total number of yes-responses ($n_o$). In large samples, however, the Bayesian estimator asymptotically approaches the classical estimator (refer to columns 3 and 4, 7 and 8 of Tables 1 through 3; columns 2 and 3, 6 and 7 of Tables 4 through 6 for details). To update the user's prior knowledge about $\pi_s$, the study takes the posterior from the immediately preceding sample size as the new prior in the subsequent sampling to generate a new posterior, which produces Figure 3.
Figure 3 reveals that the posterior density is a compromise between the initial prior beliefs and the information in the data.

Again, using information from Table 8, the numerical illustration shows that the proposed stratified MFRRM is more efficient than Kim and Warde's stratified Warner RRM, as $Var(\hat{\pi}_{stpropo}) = 0.0007313622 < Var(\hat{\pi}_{skw}) = 0.002960187$ (compare column 4 of Tables 10 and 14). In addition, the proposed model is less efficient than the mixed stratified RRM of [17] (see column 4 of Tables 12 and 14). Both Warner's stratified and the mixed stratified RRMs tend to produce negative estimates $\hat{\pi}_{sh}$, which are far from reality since a proportion cannot be negative. The possibility of a negative stratum proportion is well known from their theoretical results, particularly when $\hat{\lambda}_h + p_h < 1$ (see first row, second column of Table 11). Theoretically and empirically, the proposed stratified RRM cannot produce a negative $\hat{\pi}_{sh}$ (refer to equation (27) and Table 13 for confirmation).

5. Concluding Remarks

This study proposed a modified version of FRRM for designing surveys that will elicit responses to sensitive issues from sample units. Extant models in the literature, such as the original RRM of [7], the original FRRM of [12], and the RRM of [15] among others, have focused on improving response in RR surveys. However, designs of some of these models are not
Table 1: Estimate of $\pi_s$ for the proposed model and some existing models when $p = 0.6$, $\pi_u = 0.75$, $a = 10.2$ and $b = 27.4$

| $n$ | $n_o$ | Warner RRM $\hat{\pi}_{mle}$ | Warner RRM $\hat{\pi}_{bayesian}$ | Boruch $\hat{\pi}_{mle}$ | Mangat $\hat{\pi}_{mle}$ | Proposed MFRRM $\hat{\pi}_{mle}$ | Proposed MFRRM $\hat{\pi}_{bayesian}$ |
|---|---|---|---|---|---|---|---|
| 200 | 118 | 0.950000 | 0.6978114 | 0.4833333 | 0.3166667 | 0.9833333 | 0.8992705 |
| 400 | 244 | 1.050000 | 0.9044790 | 0.5166667 | 0.3500000 | 1.0166667 | 0.9681597 |
| 600 | 362 | 1.016667 | 0.9187578 | 0.5055556 | 0.3388889 | 1.0055556 | 0.9729193 |
| 800 | 476 | 0.975000 | 0.9023400 | 0.4916667 | 0.3250000 | 0.9916667 | 0.9674467 |
| 1000 | 591 | 0.955000 | 0.8970702 | 0.4850000 | 0.3183333 | 0.9850000 | 0.9656901 |
| 10000 | 5949 | 0.974500 | 0.9684387 | 0.4915000 | 0.3248333 | 0.9915000 | 0.9894796 |
| 100000 | 60303 | 1.015150 | 1.0145265 | 0.5050500 | 0.3383833 | 1.0050500 | 1.0048422 |
| 1000000 | 600418 | 1.002090 | 1.0020281 | 0.5006967 | 0.3340300 | 1.0006967 | 1.0006760 |

Table 2: Estimate of $\pi_s$ for the proposed model and some existing models when $p = 0.7$, $\pi_u = 0.75$, $a = 10.2$ and $b = 27.4$

| $n$ | $n_o$ | Warner RRM $\hat{\pi}_{mle}$ | Warner RRM $\hat{\pi}_{bayesian}$ | Boruch $\hat{\pi}_{mle}$ | Mangat $\hat{\pi}_{mle}$ | Proposed MFRRM $\hat{\pi}_{mle}$ | Proposed MFRRM $\hat{\pi}_{bayesian}$ |
|---|---|---|---|---|---|---|---|
| 200 | 135 | 0.9375000 | 0.7777778 | 0.6428571 | 0.5357143 | 0.9642857 | 0.8730159 |
| 400 | 275 | 0.9687500 | 0.8793419 | 0.6607143 | 0.5535714 | 0.9821429 | 0.9310525 |
| 600 | 416 | 0.9833333 | 0.9211104 | 0.6690476 | 0.5619048 | 0.9904762 | 0.9549202 |
| 800 | 557 | 0.9906250 | 0.9429322 | 0.6732143 | 0.5660714 | 0.9946429 | 0.9673898 |
| 1000 | 699 | 0.9975000 | 0.9587510 | 0.6771429 | 0.5700000 | 0.9985714 | 0.9764291 |
| 10000 | 6958 | 0.9895000 | 0.9855244 | 0.6725714 | 0.5654286 | 0.9940000 | 0.9917282 |
| 100000 | 70104 | 1.0026000 | 1.0021962 | 0.6800571 | 0.5729143 | 1.0014857 | 1.0012550 |
| 1000000 | 700007 | 1.0000175 | 0.9999772 | 0.6785814 | 0.5714386 | 1.0000100 | 0.9999870 |

Table 3: Estimate of $\pi_s$ for the proposed model and some existing models when $p = 0.8$, $\pi_u = 0.75$, $a = 10.2$ and $b = 27.4$

| $n$ | $n_o$ | Warner RRM $\hat{\pi}_{mle}$ | Warner RRM $\hat{\pi}_{bayesian}$ | Boruch $\hat{\pi}_{mle}$ | Mangat $\hat{\pi}_{mle}$ | Proposed MFRRM $\hat{\pi}_{mle}$ | Proposed MFRRM $\hat{\pi}_{bayesian}$ |
|---|---|---|---|---|---|---|---|
| 200 | 151 | 0.9250000 | 0.7974186 | 0.7562500 | 0.6937500 | 0.9437500 | 0.8480640 |
| 400 | 308 | 0.9500000 | 0.8785801 | 0.7750000 | 0.7125000 | 0.9625000 | 0.9089351 |
| 600 | 470 | 0.9722222 | 0.9218946 | 0.7916667 | 0.7291667 | 0.9791667 | 0.9414210 |
| 800 | 624 | 0.9666667 | 0.9286055 | 0.7875000 | 0.7250000 | 0.9750000 | 0.9464542 |
| 1000 | 785 | 0.9750000 | 0.9439733 | 0.7937500 | 0.7312500 | 0.9812500 | 0.9579800 |
| 10000 | 7977 | 0.9961667 | 0.9928801 | 0.8096250 | 0.7471250 | 0.9971250 | 0.9946601 |
| 100000 | 80061 | 1.0010167 | 1.0006851 | 0.8132625 | 0.7507625 | 1.0007625 | 1.0005138 |
| 1000000 | 800137 | 1.0002283 | 1.0001952 | 0.8126712 | 0.7501713 | 1.0001712 | 1.0001464 |

Table 4: Performance comparison of the proposed model with some existing models when $p = 0.7$, $\pi_s = 0.1$, $\pi_u = 0.75$, $a = 10.2$ and $b = 27.4$

| $n$ | Warner RRM $V(\hat{\pi}_{mle})$ | Warner RRM $V(\hat{\pi}_{bayesian})$ | Boruch $V(\hat{\pi}_{mle})$ | Mangat $V(\hat{\pi}_{mle})$ | Proposed MFRRM $V(\hat{\pi}_{mle})$ | Proposed MFRRM $V(\hat{\pi}_{bayesian})$ |
|---|---|---|---|---|---|---|
| 200 | 0.0070125000 | 4.968668e-03 | 2.122194e-03 | 2.378571e-03 | 6.642857e-04 | 4.706760e-04 |
| 400 | 0.0035062500 | 2.929599e-03 | 1.061097e-03 | 1.189286e-03 | 3.321429e-04 | 2.775174e-04 |
| 600 | 0.0023375000 | 2.069939e-03 | 7.073980e-04 | 7.928571e-04 | 2.214286e-04 | 1.960828e-04 |
| 800 | 0.0017531250 | 1.599262e-03 | 5.305485e-04 | 5.946429e-04 | 1.660714e-04 | 1.514961e-04 |
| 1000 | 0.0014025000 | 1.302696e-03 | 4.244388e-04 | 4.757143e-04 | 1.328571e-04 | 1.234028e-04 |
| 10000 | 0.0001402500 | 1.392012e-04 | 4.244388e-05 | 4.757143e-05 | 1.328571e-05 | 1.318637e-05 |
| 100000 | 0.0000140250 | 1.401446e-05 | 4.244388e-06 | 4.757143e-06 | 1.328571e-06 | 1.327573e-06 |
| 1000000 | 0.0000014025 | 1.402395e-06 | 4.244388e-07 | 4.757143e-07 | 1.328571e-07 | 1.328472e-07 |

effective and their estimators are not efficient. Based on the simulated results and other evidence from the variance comparison, the proposed MFRRM outperformed some of these models. In terms of efficiency, the proposed model is preferred to the RRM of [7] as it exhibits minimal variance. In comparison with the FRRM of [12], the proposed MFRRM is more efficient and also more flexible, as it does not require prior knowledge of $\pi_u$, unlike Boruch's original FRRM. To some extent, the proposed model is more efficient than the RRM of [15]. Its practicability transcends Mangat's as it is wholly stochastic in procedure, thus
Table 5: Performance comparison of the proposed model with some existing models when $p = 0.7$, $\pi_s = 0.4$, $\pi_u = 0.75$, $a = 10.2$ and $b = 27.4$

| $n$ | Warner RRM $V(\hat{\pi}_{mle})$ | Warner RRM $V(\hat{\pi}_{bayesian})$ | Boruch $V(\hat{\pi}_{mle})$ | Mangat $V(\hat{\pi}_{mle})$ | Proposed MFRRM $V(\hat{\pi}_{mle})$ | Proposed MFRRM $V(\hat{\pi}_{bayesian})$ |
|---|---|---|---|---|---|---|
| 200 | 0.0077625000 | 5.500077e-03 | 2.550765e-03 | 2.485714e-03 | 2.057143e-03 | 1.457577e-03 |
| 400 | 0.0038812500 | 3.242926e-03 | 1.275383e-03 | 1.242857e-03 | 1.028571e-03 | 8.594088e-04 |
| 600 | 0.0025875000 | 2.291323e-03 | 8.502551e-04 | 8.285714e-04 | 6.857143e-04 | 6.072242e-04 |
| 800 | 0.0019406250 | 1.770306e-03 | 6.376913e-04 | 6.214286e-04 | 5.142857e-04 | 4.691493e-04 |
| 1000 | 0.0015525000 | 1.442021e-03 | 5.101531e-04 | 4.971429e-04 | 4.114286e-04 | 3.821506e-04 |
| 10000 | 0.0001552500 | 1.540891e-04 | 5.101531e-05 | 4.971429e-05 | 4.114286e-05 | 4.083520e-05 |
| 100000 | 0.0000155250 | 1.551333e-05 | 5.101531e-06 | 4.971429e-06 | 4.114286e-06 | 4.111194e-06 |
| 1000000 | 0.0000015525 | 1.552383e-06 | 5.101531e-07 | 4.971429e-07 | 4.114286e-07 | 4.113976e-07 |

Table 6: Performance comparison of the proposed model with some existing models when $p = 0.7$, $\pi_s = 0.6$, $\pi_u = 0.75$, $a = 10.2$ and $b = 27.4$

| $n$ | Warner RRM $V(\hat{\pi}_{mle})$ | Warner RRM $V(\hat{\pi}_{bayesian})$ | Boruch $V(\hat{\pi}_{mle})$ | Mangat $V(\hat{\pi}_{mle})$ | Proposed MFRRM $V(\hat{\pi}_{mle})$ | Proposed MFRRM $V(\hat{\pi}_{bayesian})$ |
|---|---|---|---|---|---|---|
| 200 | 0.0077625000 | 5.500077e-03 | 2.501020e-03 | 2.057143e-03 | 2.485714e-03 | 1.761239e-03 |
| 400 | 0.0038812500 | 3.242926e-03 | 1.250510e-03 | 1.028571e-03 | 1.242857e-03 | 1.038452e-03 |
| 600 | 0.0025875000 | 2.291323e-03 | 8.336735e-04 | 6.857143e-04 | 8.285714e-04 | 7.337293e-04 |
| 800 | 0.0019406250 | 1.770306e-03 | 6.252551e-04 | 5.142857e-04 | 6.214286e-04 | 5.668888e-04 |
| 1000 | 0.0015525000 | 1.442021e-03 | 5.002041e-04 | 4.114286e-04 | 4.971429e-04 | 4.617653e-04 |
| 10000 | 0.0001552500 | 1.540891e-04 | 5.002041e-05 | 4.114286e-05 | 4.971429e-05 | 4.934253e-05 |
| 100000 | 0.0000155250 | 1.551333e-05 | 5.002041e-06 | 4.114286e-06 | 4.971429e-06 | 4.967692e-06 |
| 1000000 | 0.0000015525 | 1.552383e-06 | 5.002041e-07 | 4.114286e-07 | 4.971429e-07 | 4.971055e-07 |

Table 7: Comparison of original FRRM and the proposed MFRRM when $p = 0.7$ and $\pi_u = 0$

| $n$ | $n_o$ | Boruch original FRRD $\hat{\pi}_{mle}$ | Boruch original FRRD $V(\hat{\pi}_{mle})$ | Proposed MFRRD $\hat{\pi}_{mle}$ | Proposed MFRRD $V(\hat{\pi}_{mle})$ |
|---|---|---|---|---|---|
| 200 | 135 | 0.9642857 | 2.514286e-03 | 0.9642857 | 2.514286e-03 |
| 400 | 275 | 0.9821429 | 1.257143e-03 | 0.9821429 | 1.257143e-03 |
| 600 | 416 | 0.9904762 | 8.380952e-04 | 0.9904762 | 8.380952e-04 |
| 800 | 557 | 0.9946429 | 6.285714e-04 | 0.9946429 | 6.285714e-04 |
| 1000 | 699 | 0.9985714 | 5.028571e-04 | 0.9985714 | 5.028571e-04 |
| 10000 | 6958 | 0.9940000 | 5.028571e-05 | 0.9940000 | 5.028571e-05 |
| 100000 | 70104 | 1.0014857 | 5.028571e-06 | 1.0014857 | 5.028571e-06 |
| 1000000 | 700007 | 1.0000100 | 5.028571e-07 | 1.0000100 | 5.028571e-07 |

Table 8: Samples and strata sizes

| Strata | $N_h$ | $n_h$ | $p_h$ | $n_{oh}$ | $W_h$ | $\hat{\lambda}_h$ |
|---|---|---|---|---|---|---|
| 1 | 876 | 69 | 0.4 | 27 | 0.08981852 | 0.3913043 |
| 2 | 2412 | 118 | 0.6 | 51 | 0.24730852 | 0.4322034 |
| 3 | 3012 | 279 | 0.7 | 115 | 0.30882805 | 0.4121864 |
| 4 | 3453 | 288 | 0.8 | 102 | 0.35404491 | 0.3541667 |
| Total | 9753 | 754 | | | 1.00 | |

Table 9: Computational procedure of Kim and Warde's stratified Warner RRM

| Strata | $\hat{\pi}_{sh}$ | $W_h\hat{\pi}_{sh}$ | $\hat{\phi}_{sh}$ | $W_h\phi_{sh}$ |
|---|---|---|---|---|
| 1 | 1.0434783 | 0.09372367 | 2.440211 | 0.2191762 |
| 2 | 0.1610169 | 0.03982086 | 2.476911 | 0.6125613 |
| 3 | 0.2804659 | 0.08661575 | 1.230571 | 0.3800348 |
| 4 | 0.2569444 | 0.09096987 | 0.797100 | 0.2822092 |
| Total | | 0.3111302 | | 1.493982 |
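The stratified figures for the proposed model can be recomputed from the strata data in Table 8. This is a hedged sketch, not the authors' code. It assumes the per-stratum estimate $\hat{\pi}_{sh} = \hat{\lambda}_h / p_h$, the quantity $\hat{\phi}_{sh} = \sqrt{\hat{\pi}_{sh}(1 - \hat{\pi}_{sh}) + (1 - p_h)\hat{\pi}_{sh}/p_h}$ (the per-unit form of (67) with $a = b = 0$), and the variance $(\sum_h W_h \hat{\phi}_{sh})^2 / n$. These forms are inferred here because they agree with the tabulated values; the function name and the normal quantile 1.959964 are likewise assumptions.

```python
import math

# (N_h, n_h, p_h, yes-responses n_oh) for each stratum, copied from Table 8.
STRATA = [
    (876, 69, 0.4, 27),
    (2412, 118, 0.6, 51),
    (3012, 279, 0.7, 115),
    (3453, 288, 0.8, 102),
]


def stratified_mfrrm(strata, z=1.959964):
    """Stratified estimate, variance and 95% interval under the assumed forms."""
    total_population = sum(row[0] for row in strata)
    total_sample = sum(row[1] for row in strata)
    pi_st = 0.0
    weighted_phi = 0.0
    for N_h, n_h, p_h, yes_h in strata:
        W_h = N_h / total_population           # stratum weight
        pi_h = (yes_h / n_h) / p_h             # lambda_hat_h / p_h
        phi_h = math.sqrt(pi_h * (1 - pi_h) + (1 - p_h) * pi_h / p_h)
        pi_st += W_h * pi_h
        weighted_phi += W_h * phi_h
    variance = weighted_phi ** 2 / total_sample
    half_width = z * math.sqrt(variance)
    return pi_st, variance, (pi_st - half_width, pi_st + half_width)


pi_st, variance, interval = stratified_mfrrm(STRATA)
print(pi_st, variance, interval)
```

Because each $\hat{\pi}_{sh}$ is a ratio of non-negative quantities, this form cannot return a negative stratum proportion, although it can exceed unity when $\hat{\lambda}_h > p_h$.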
[Figure 3: The Prior, the Likelihood and the Posterior for $n = 200$ and $n = 400$. Two panels, each plotting Density against $p$, with curves for the Prior, Likelihood and Posterior.]

Table 10: Summary of Kim and Warde's stratified Warner RRM

| $N$ | $n$ | $\hat{\pi}_s$ | $Var(\hat{\pi}_s)$ | 95% CI |
|---|---|---|---|---|
| 9753 | 754 | 0.3111302 | 0.002960187 | (0.2044932, 0.4177671) |

Table 11: Computational procedure of mixed stratified RRM

| Strata | $\hat{\pi}_{sh}$ | $W_h\hat{\pi}_{sh}$ | $\hat{\phi}_{sh}$ | $W_h\phi_{sh}$ |
|---|---|---|---|---|
| 1 | -0.40831758 | -0.03667448 | 1.2399337 | 0.1113690 |
| 2 | 0.04639471 | 0.01147381 | 0.8246085 | 0.2039327 |
| 3 | 0.13211914 | 0.04080210 | 0.6975762 | 0.2154311 |
| 4 | 0.13650174 | 0.04832774 | 0.5777054 | 0.2045337 |
| Total | | 0.06392917 | | 0.7352665 |

Table 12: Summary of mixed stratified RRM

| $N$ | $n$ | $\hat{\pi}_s$ | $Var(\hat{\pi}_s)$ | 95% CI |
|---|---|---|---|---|
| 9753 | 754 | 0.06392917 | 0.0007169984 | (0.01144755, 0.1164108) |

Table 13: Computational procedure of the proposed stratified MFRRM

| Strata | $\hat{\pi}_{sh}$ | $W_h\hat{\pi}_{sh}$ | $\hat{\phi}_{sh}$ | $W_h\phi_{sh}$ |
|---|---|---|---|---|
| 1 | 0.9782609 | 0.08786594 | 1.2201057 | 0.1095881 |
| 2 | 0.7203390 | 0.17814597 | 0.8256372 | 0.2041871 |
| 3 | 0.5888377 | 0.18184960 | 0.7031834 | 0.2171628 |
| 4 | 0.4427083 | 0.15673863 | 0.5978250 | 0.2116569 |
| Total | | 0.6046001 | | 0.7425948 |

Table 14: Summary of proposed stratified MFRRM

| $N$ | $n$ | $\hat{\pi}_{stpropo}$ | $Var(\hat{\pi}_{stpropo})$ | 95% CI |
|---|---|---|---|---|
| 9753 | 754 | 0.6046001 | 0.0007313622 | (0.5515954, 0.6576048) |

avoiding any possible identification of individual status. The motivation for adopting an RRT in surveys on sensitive issues is the protection of respondents' privacy; since the proposed MFRRM is more efficient, flexible and practical than some of the leading RRTs, the study suggests its adoption. The proposed model rests on the argument that the majority of people have a wrong intuition about the concepts of probability. This ignorance was used to the advantage of the survey under consideration by making the subjective privacy protection larger than the true statistical privacy protection.
Nevertheless, the proposed design is still in accordance with the principle of RRT, since the identity of the respondent is not required and the randomization process makes it impossible to know or even guess the response of a particular respondent. It is imperative to recall that RRT was introduced to protect respondents' privacy and encourage them to divulge truthful answers, rather than to trap them with probability or mathematical techniques that trace their real status on the sensitive issue(s) under study, as suggested by [37-38] and the proposed model. The study, however, suggests a further quest for more robust RRMs, in view of the possibility that respondents can subsequently assimilate the randomization process, thereby increasing false or evasive responses or, at worst, refusal to participate in such surveys.

Acknowledgements

The authors are grateful to the referee and editor for their valuable comments and suggestions.

References

[1] G. N. Amahia, “Factors, Prevention and Correction Methods for Non-Response in Sample Surveys”, Central Bank of Nigeria Journal of Applied Statistics, 1 (2010) 79.
[2] K. Hong, J. Yum & H. Lee, “A Stratified Randomized Response Technique”, The Korean Journal of Applied Statistics, 7 (1994) 141.
[3] G. S. Lee, K. H. Hong, J. M. Kim & C. K. Son, “Estimation of the Proportion of a Sensitive Attribute Based on a Two-Stage Randomized Response Model with Stratified Unequal Probability Sampling”, Brazilian Journal of Probability and Statistics, 28 (2014) 381.
[4] S. Sudman & N. M. Bradburn, “Asking Questions: A Practical Guide to Questionnaire Design”, San Francisco, CA: Jossey-Bass (1982).
[5] K. A. Rasinski, G. B. Willis, Baldwin, W. C. Yeh & L. Lee, “Methods of Data Collection, Perceptions of Risks and Losses, and Motivation to give Truthful Answers to Sensitive Survey Questions”, Applied Cognitive Psychology, 13 (1999) 465.
[6] C. N. Bouza, C. Herrera & P. G. Mitra, “A Review of Randomized Responses Procedures: The Qualitative Variable Case”, Revista Investigación, 31 (2010) 240.
[7] S. L. Warner, “Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias”, Journal of the American Statistical Association, 60 (1965) 63, DOI: 10.2307/2283137, http://www.jstor.org/stable/2283137.
[8] G. J. L. M. Lensvelt-Mulders, J. J. Hox & P. G. M. Van-der-Heijden, “How to Improve the Efficiency of Randomized Response Designs”, Quality and Quantity, 39 (2005) 253, DOI: 10.1007/s11135-004-0432-3.
[9] A. C. Y. Chong, A. M. Y. Chu, M. K. P. So & R. S. W. Chung, “Asking Sensitive Questions Using the Randomized Response Approach in Public Health Research: An Empirical Study on the Factors of Illegal Waste Disposal”, International Journal of Environmental Research and Public Health, 16 (2019) 1, DOI: 10.3390/ijerph16060970.
[10] D. G. Horvitz, B. V. Shah & W. S. Simmons, “The Unrelated Question Randomized Response Model, Proceedings in the Social Science Section”, American Statistical Association, (1967) 65.
[11] B. G. Greenberg, A. L. Abul-Ela, W. R. Simmons & D. G. Horvitz, “The unrelated question Randomized Response Model: Theoretical Framework”, Journal of the American Statistical Association, 64 (1969) 520, https://doi.org/10.1080/01621459.1969.10500991.
[12] R. F. Boruch, “Assuring Confidentiality of Responses in Social Research: A Note on Strategies”, The American Sociologist, 6 (1971) 308.
[13] J. J. A. Moors, “Optimization of the Unrelated Question Randomized Response Model”, Journal of the American Statistical Association, 66 (1971) 627, DOI: 10.1080/01621459.1971.10482320.
[14] N. S. Mangat & R. Singh, “An Alternative Randomized Response Procedure”, Biometrika, 77 (1990) 439, https://doi.org/10.1093/biomet/77.2.439.
[15] N. S. Mangat, “An Improved Randomized Response Strategy”, Journal of the Royal Statistical Society: Series B, 56 (1994) 93.
[16] J. M. Kim & W. D. Warde, “A stratified Warner's Randomized Response Model”, Journal of Statistical Planning and Inference, 120 (2004) 155.
[17] A. Usman & I. O. Oshungade, “A Mixed-stratified Randomized Response Model for HIV Seroprevalence Surveys”, Research Journal of Mathematics and Statistics, 4 (2012) 70.
[18] D. W. Gingerich, “Understanding Off-the-books Politics: Conducting Inference on the Determinants of Sensitive Behavior with Randomized Response Surveys”, Political Analysis, 18 (2010) 349.
[19] J. A. Fox & P. E. Tracy, “Randomized Response: A Method for Sensitive Surveys”, Beverly Hills, CA: Sage (1986).
[20] G. Blair, K. Imai & Y-Y Zhou, “Design and Analysis of the Randomized Response Technique”, Journal of the American Statistical Association, 110 (2015) 1304, DOI: 10.1080/01621459.2015.1050028.
[21] B. Rosenfeld, K. Imai & J. Shapiro, “An Empirical Validation Study of Popular Survey Methodologies for Sensitive Questions”, American Journal of Political Science, 60 (2015) 783, DOI: 10.1111/ajps.12205.
[22] J. H. Stubbe, A. M. Chorus, L. E. Frank, O. Hon & P. G. Heijden, “Prevalence of Use of Performance Enhancing Drugs by Fitness Centre Members”, Drug Testing and Analysis, 6 (2014) 434.
[23] I. Krumpal, “Estimating the Prevalence of Xenophobia and Anti-semitism in Germany: A Comparison of Randomized Response and Direct Questioning”, Social Science Research, 41 (2012) 1387.
[24] F. A. St-John, A. M. Keane, G. Edwards-Jones, L. Jones, R. W. Yarnell & J. P. Jones, “Identifying Indicators of Illegal Behaviour: Carnivore Killing in Human-Managed Landscapes”, Proceedings of the Royal Society B: Biological Sciences, 279 (2012) 804.
[25] H. Elffers, P. Van Der Heijden & M. Hezemans, “Explaining Regulatory Non-compliance: A Survey Study of Rule Transgression For Two Dutch Instrumental Laws, Applying the Randomized Response Method”, Journal of Quantitative Criminology, 19 (2003) 409, DOI: 10.1023/B:JOQC.0000005442.96987.9e.
[26] G. Diana, S. Riaz & J. Shabbir, “Hansen and Hurwitz Estimator with Scrambled Response on the Second Call”, Journal of Applied Statistics, 41 (2013) 596, http://dx.doi.org/10.1080/02664763.2013.846305.
[27] O. S. Ewemooje & G. N. Amahia, “Improved Randomized Response Technique for Two Sensitive Attributes”, Afrika Statistika, 10 (2015) 839, DOI: 10.16929/as/2015.639.78.
[28] W. G. Cochran, “Sampling Techniques”, 3rd edition, New York: John Wiley and Sons (1977), MR 0474575.
[29] R. L. Winkler & L. A. Franklin, “Warner's Randomized Response Model: A Bayesian Approach”, Journal of the American Statistical Association, 74 (1979) 207, DOI: 10.2307/2286752, https://www.jstor.org/stable/2286752.
[30] Z. Hussain, J. Shabbir & M. Riaz, “Bayesian Estimation using Warner's Randomized Response Model through Simple and Mixture Prior Distributions”, Communications in Statistics-Simulation and Computation, 40 (2010) 147, DOI: 10.1080/03610918.2010.532897.
[31] D. Fink, “A Compendium of Conjugate Priors: Environmental Statistics Group”, Department of Biology, Montana State University, Bozeman, MT 59717 (1997).
[32] F. J. Anscombe, “Bayesian Statistics”, Journal of Royal Statistical Society, 25 (1962) 1.
[33] A. Quatember, “A Standardization of Randomized Response Strategies”, Survey Methodology, 35 (2009) 143.
[34] R Core Team, “R: A language and environment for statistical computing”, R Foundation for Statistical Computing, Vienna, Austria (2019), https://www.R-project.org/.
[35] Jim Albert, “LearnBayes: Functions for Learning Bayesian Inference”, R package version 3.6.0 (2018), https://CRAN.R-project.org/package=LearnBayes.
[36] J. M. Kim, J. M. Tebbs & An Seung-Won, “Extensions of Mangat's Randomized-Response Model”, Journal of Statistical Planning and Inference, 136 (2006) 1554.
[37] P. G. M. Van der Heijden, G. Van Gils, J. Bouts & J. J. Hox, “A Comparison of Randomized Response, Computer-assisted self-interview, and Face-to-Face Direct Questioning”, Sociological Methods and Research, 28 (2000) 505.
[38] H. Zawar & K. Khushnoor, “On Estimation of Sensitive Mean Using Scrambled Data”, World Applied Sciences Journal, 23 (2013) 1201.