IJAHP ARTICLE: Sato/How to Measure Human Perception in Survey Questionnaires
International Journal of the Analytic Hierarchy Process, Vol. 1 Issue 2 2009, ISSN 1936-6744
http://dx.doi.org/10.13033/ijahp.v1i2.31

HOW TO MEASURE HUMAN PERCEPTION IN SURVEY QUESTIONNAIRES

Yuji Sato
Graduate School of Policy Science, Mie Chukyo University
Matsusaka, Japan
E-mail: ysatoh@mie-chukyo-u.ac.jp

ABSTRACT

The objective of this study was to examine the effectiveness of the multiple-choice method, the ranking method, the rating method and the Analytic Hierarchy Process (AHP) in measuring human perception. The AHP identifies not only the most important alternative but also each decision maker's preference for every alternative. Using the AHP to analyze the decision-making process may therefore yield a precise clarification of preferences among alternatives. Based on survey research on social issues, the results offer some evidence that the AHP is superior to traditional questionnaire methods in representing human perception.

Keywords: survey questionnaire, multiple-choice, ranking, rating, Feeling Thermometer

1. Introduction

Questionnaire design for survey research, such as public opinion polls, presents one of the biggest challenges for survey researchers in terms of accuracy in measuring respondents' perceptions (Traugott and Lavrakas, 2000). Consequently, many ways of asking questions have been proposed and much discussion has been generated. The set of categories or range of scores on a variable is called a scale, and the process of assigning scores to objects to yield a measure of a construct is called scaling. When a respondent applies judgment to assign scores to individuals or objects along the scale, a rating method is being used (Judd, Smith, & Kidder, 1991).
One rating method, the Feeling Thermometer (FT; Kabashima, 1998), has been used extensively in survey questionnaires. Its scale ranges from 0, the coldest feeling toward an alternative, to 100, the hottest, with 50 being neutral. In surveys, this method asks respondents to express their perceptions by indicating their "temperature" toward each alternative in a given question. Although this method helps respondents clarify their judgment of each alternative precisely, consistency among responses to the alternatives is not always satisfactory (Sato, 2005).

A more traditional method for measuring respondents' perceptions is the multiple-choice (MC) question format, which has been thought to be well suited to questionnaire formatting because respondents find the questions easy to answer and researchers can easily identify the main concerns of the respondents (Jerard, 1995; Downing, 2004). This method takes two forms: simple multiple-choice (SMC) and modified multiple-choice (MMC; Sato, 2004). In the SMC method, respondents must choose one from among the given alternatives. The SMC identifies only the most important alternative for each respondent, thus preventing the respondent from expressing his or her preference concerning a selected alternative over the others. Moreover, no information regarding the relationship among the non-selected alternatives is derived (Sato, 2004). In the MMC method, respondents have the option of indicating their top two (or more) alternatives. Since respondents are allowed to express their preferred alternatives, the MMC can be expected to be an effective way to make up for the lack of information incurred by the SMC.
Nevertheless, the difference in the degree of importance among the selected alternatives is not clarified, nor is information concerning non-selected alternatives reflected in the results (Sato, 2004). Consider, for example, a question asking respondents why they are non-partisan. In the SMC format, respondents must express their opinion by choosing one of the reasons given. Respondents with a definite reason could choose one alternative without confusion if they found that it exactly represented their perception, and the format could be expected to function quite well for them. On the other hand, some respondents may be non-partisan for no particular reason, while others are non-partisan for complex reasons. The MC would not be an effective format for those respondents.

Another method that has been applied is the ranking method used by Ronald Inglehart and Paul Abramson (1993) in their World Values Survey. This method asks respondents to rank all given alternatives in a question, from the most preferred to the least, thus allowing researchers to identify a respondent's preference order over all alternatives. The problem with this method, however, is that the more alternatives a questionnaire offers, the more difficult it is for the respondent to answer (Inglehart and Abramson, 1993). Another drawback is that it does not allow for ties (Sato, 2003). For example, consider asking executive staff members of a prefectural government who have authority for final budget decisions, "Which governmental projects should be budgeted with high priority for next year?" In the ranking format, respondents must express their opinion by ranking all projects given in the question. Respondents with definite preferences on the issue could rank all the projects without hesitation.
On the other hand, some respondents might have no definite preference concerning the issue, while others might have many ties in their priorities among projects.

One possible option for formatting questionnaires is to apply the AHP, a popular tool for decision-making developed by T. Saaty (1977, 1980). Since its release, many individuals and groups in various fields have used the AHP because of its user-friendly interface for multi-criteria decision-making (Vargas, 1990). In the AHP, data from a decision-maker's judgments, called pairwise comparisons, are aggregated, and the degree of importance of each alternative is quantified. This procedure identifies not only the most important alternative but also the preference for all alternatives for each decision-maker (Crawford and Williams, 1985). Using the AHP to analyze the decision-making process, therefore, results in a precise clarification of respondents' preferences for alternatives (Sato, 2007).

Here we summarize the four questioning methods above by taking the purchase of a fruit as an example. Suppose you were to buy one fruit at a store from among four alternatives: apple, banana, melon and orange. In a question asking your preference, the SMC method requires you to choose one alternative from among the four fruits, while the MMC, if it allows a second choice, asks you to choose up to two alternatives. The ranking method asks you to rank all four fruits by number from the most preferred to the least, and the Feeling Thermometer requires you to assign a number between 0 and 100 to each of the four fruits.
The application of the AHP asks you to conduct pairwise comparisons over all possible pairs of the four fruits, from which the weights of all alternatives are calculated. Your preference might, for example, be represented as follows:

                             apple    banana   melon    orange
SMC                            -        X        -        -
MMC (up to 2 alternatives)     -        X        -        X
Ranking                        4        1        3        2
Feeling Thermometer           22       95       43       78
AHP                          0.092    0.399    0.181    0.328

In this study, we compared the answers to four sets of questions on a particular issue, each formatted in a different way. The first two of the four sets consisted of questions formatted using two different types of the MC method and the AHP; the third set consisted of two pairs of questions formatted by the ranking method and the AHP; the fourth set consisted of two pairs of questions formatted by the rating method and the AHP. We then evaluated the four methods in terms of their appropriateness for representing each respondent's perception.

First, we focused on the difference in the aggregated ranking of alternatives across all respondents between the MC and AHP methods. The ranking derived from the SMC implies aggregated plurality, while that elicited from the AHP suggests aggregated intensity (a distinction raised in discussion at the May Conference, University of Michigan, May 2000). Since both rankings reflect the pattern of responses to the alternatives, they are likely to produce similar results. In addition, we evaluated the effectiveness of the MMC in terms of its ability to make up for the lack of information incurred by the SMC. Since the MMC is a type of MC question that allows respondents to indicate their second-best alternative, it may reflect each respondent's preference for alternatives more precisely than does the SMC. To compare the two methods, three questions on the same issue, formatted by the SMC, MMC and AHP methods, were posed: each asked about the reasons that respondents were non-partisan. Details of the data set are shown in Sections 2.1 and 2.2.

Second, we compared the preference orders of alternatives across all respondents between the ranking and AHP methods, and evaluated these two formats in terms of their appropriateness for representing each respondent's perception. The aggregated ranking of alternatives derived from the ranking method implies a preference order of alternatives for respondents; similarly, the aggregated weights of alternatives elicited from the AHP imply such a preference order. Since both preference orders reflect the entire trend of a population concerning each alternative, the two methods are likely to produce similar results. To compare them, two different pairs of questions on a particular issue were used: one concerned an abstract issue, the basic concepts of refining governmental program policy; the other related to a concrete issue, governmental projects with high priority. Details of the data set are shown in Section 2.3.

Third, we focused on the difference in the aggregated weights of alternatives across all respondents between the FT and AHP methods. The weight derived from the FT is based on absolute evaluation, while that elicited from the AHP is based on comparative evaluation. Since both aggregated weights reflect the pattern of responses to the alternatives, they are likely to produce similar results. In addition, we evaluated the effectiveness of each method in terms of its ability to supply independent variables in regression analyses. In these comparisons, a data set obtained from a 2004 survey on public opinion was employed. To compare the two methods, two types of questions on a particular issue were employed, each formatted in a different way, one using the FT and the other using the AHP.
Each asked about respondents' intention to vote for a party in the next election. Details of the data set are shown in Section 2.4.

2. Outline of surveys

In this study, we compared the four above-mentioned methods (the MC, ranking, rating and AHP methods) using survey data. The data sets were obtained from the following four surveys. For the AHP, respondents were asked to respond to a series of redundant pairwise comparisons. We thus needed to take into account the possible inconsistency of a pairwise comparison matrix in analyzing the elicited weights (Webber, Apostolou and Hassell, 1997). In this paper, however, we did not discard samples exceeding a threshold of the Consistency Index (C.I.), such as 0.15, because respondents whose C.I. exceeds the threshold are nevertheless members of the population of interest, such as the constituencies in elections.

2.1 Survey 1 (January 1999)

Survey 1 was carried out in January 1999, one month after the coalition cabinet of the Liberal Democratic Party and the Liberal Party was established in Japan. Respondents were 834 students at a university in Japan. The purpose of the survey was to identify "the political attitude of students when a coalition cabinet was established." The survey consisted of 30 questions. In Q.12, respondents were asked their party identification, and only the 398 respondents who answered "non-partisan" were posed the following three sub-questions, formatted in three ways, to clarify why they were non-partisan. Q.13 (hereinafter referred to as Q1S) and Q.29 (Q1M) in Survey 1 were respectively formatted by the SMC method (See Appendix 1), which requires respondents to choose only one from among four given reasons, and by the MMC method (See Appendix 2), which gives respondents the option of indicating their second-best alternative.
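Throughout the surveys below, each AHP-formatted question asks for pairwise comparisons, from which per-respondent weights and a Consistency Index are computed. The following is a minimal sketch of that computation using a hypothetical 4x4 judgment matrix on Saaty's 1-9 scale (not actual survey responses): the weights come from the principal eigenvector (Frobenius root) of the matrix, and C.I. = (lambda_max - n) / (n - 1).

```python
import numpy as np

# Hypothetical pairwise-comparison judgments over the four reasons of Q1A
# (reciprocal matrix on Saaty's 1-9 scale; NOT actual survey responses).
labels = ["Apathy", "Realignment", "Corruption", "Non-confidence"]
A = np.array([
    [1,   3,   1/2, 1/4],
    [1/3, 1,   1/5, 1/7],
    [2,   5,   1,   1/2],
    [4,   7,   2,   1  ],
])

# Weights = normalized principal eigenvector (Frobenius root) of A.
eigvals, eigvecs = np.linalg.eig(A)
k = int(np.argmax(eigvals.real))
w = np.abs(eigvecs[:, k].real)
w /= w.sum()

# Consistency Index: C.I. = (lambda_max - n) / (n - 1).  The paper keeps
# respondents even when C.I. exceeds a threshold such as 0.15.
n = A.shape[0]
CI = (eigvals[k].real - n) / (n - 1)

for name, weight in zip(labels, w):
    print(f"{name}: {weight:.3f}")
print(f"C.I. = {CI:.3f}")
```

For this fairly consistent matrix the heaviest weight falls on Non-confidence and the C.I. stays well below 0.15; a respondent with strongly intransitive judgments would instead produce a large C.I.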
Q.26 (Q1A) was formatted by the AHP method (See Appendix 3), in which respondents are required to conduct pairwise comparisons across all possible combinations of reasons. The reasons offered were: Too much political realignment; Political apathy; Non-confidence in parties and politicians; and Corruption of political ethics.

2.2 Survey 2 (April 2001)

Survey 2 was conducted in April 2001, the month when graduating students in Japan usually begin their job search. Respondents were 323 students at a university in Japan. The purpose of this survey was to clarify "the main concerns of Japanese graduating students as they begin their job search." The main concerns offered were: Job specifications; Welfare program of the company; Salary; and Place of employment. In this survey, three differently formatted questions were posed, each asking about the respondents' main concerns in their job-search activities. The first question (hereinafter referred to as Q2S) was formatted by the SMC, the second (Q2A) by the AHP and the third (Q2M) by the MMC. We omit the details of Q2S, Q2A and Q2M; suffice it to say that each had exactly the same format as Q1S, Q1A and Q1M in Survey 1, respectively (cf. Appendices 1, 2 and 3).

2.3 Survey 3 (January 2002)

Survey 3 was carried out in January 2002, the month in which executive staff members of a local government in Japan finalize the budget for the following fiscal year. Respondents were 35 executive staff members having authority for final budget decisions on governmental projects.
The purpose of the survey was to identify "the main concerns of executive staff members in budgeting." The survey included two issues: one asked about the basic concepts for refining public administration (an abstract issue); the other asked about actual governmental projects with high priority (a concrete issue). The first pair of questions (hereinafter referred to as Q3A1 and Q3R1) concerned the abstract issue, the basic concepts of refining governmental program policy. They were: Concept I, for cultural development of our area; Concept II, for safety of our social life; Concept III, for environmental preservation of our area; Concept IV, for economic growth of our area; and Concept V, for enhancement of our area. The respondents were asked, in two ways, which of the five basic concepts they thought significant for refining governmental program policy: Q3A1 was formatted by the AHP method, in which respondents were asked to conduct pairwise comparisons across all possible combinations of the basic concepts (cf. Appendix 3); Q3R1 used the ranking method (See Appendix 4), in which respondents were asked to rank all the concepts given in the question. The second pair of questions (hereinafter referred to as Q3A2 and Q3R2) related to the concrete issue, governmental projects with high priority (PHP, for short). They were: PHP I, support for school education (for cultural development of our area); PHP II, improvements to rivers, mountains and coasts (for safety of our social life); PHP III, preservation of water resources (for environmental preservation of our area); PHP IV, support for entrepreneurs (for economic growth of our area); and PHP V, construction and repair of roads (for enhancement of our area). The respondents were asked, in two ways, which of the governmental projects should be budgeted with high priority: Q3A2 employed the AHP method and Q3R2 the ranking method. Q3A2 and Q3R2 were formatted in the same way as Q3A1 and Q3R1, respectively (cf.
Appendices 3 and 4).

2.4 Survey 4 (March 2004)

Survey 4 was conducted in March 2004, one month after the Japan Self-Defense Forces were dispatched to Iraq. Respondents were lay citizens of a local city in Japan. The purpose of the survey was to identify "the political attitude of citizens when the Self-Defense Forces were dispatched to a country in a state of warfare." The sample size was 30; each respondent's political attitude was elicited in a one-on-one interview. The survey consisted of 33 questions. In Qs.9 and 17, respondents were asked their intention to vote for a party in the next election. The parties were: Liberal Democratic Party of Japan (LDP); Democratic Party of Japan (DPJ); New Komeito (NK); Japanese Communist Party (JCP); and Social Democratic Party (SDP). Q.9 (hereinafter referred to as Q4W) was formatted by the FT method (See Appendix 5), which requires respondents to assign a number between 0 and 100 according to their intention to vote for each party. Q.17 (Q4A) used the AHP method (cf. Appendix 3). The FT gave a Feeling Score for each party, and the AHP gave the weight of each party. We also asked respondents about their political ideology in Q.13, and whether they supported Junichiro Koizumi, Prime Minister and president of the LDP, in Q.33. Both questions were formatted by the SMC method, which requires respondents to choose one from among the given alternatives. Each of these issues was closely related to the party electorates would vote for in the House of Councilors election. The outputs from these questions would serve as ideal dependent variables in a regression analysis.

3. Results

In this section, we analyze responses to the SMC-formatted, MMC-formatted, ranking-formatted and FT-formatted questions of Surveys 1, 2, 3 and 4, based on the weight of each alternative elicited from the AHP.

3.1 Comparison between SMC and AHP

First, we focus on the difference in the aggregated ranking of alternatives between the SMC method and the AHP method; in particular, the observed ratio of each alternative derived from the SMC and the weight of each alternative elicited from the AHP are compared. Table 1 summarizes the results of two questions, Q1S and Q1A, from Survey 1. The numbers in the first row represent: the weights derived from Q1A, aggregated across the respondents who chose Apathy in answering Q1S; the observed number of those respondents; their ratio among all respondents; and the ratio's ranking among the four reasons. The second to fourth rows show the results in the same way as the first row. The fifth row represents the weight of each reason aggregated across all responses, and the total number of respondents. The last row shows the ranking of the average weight of each reason among all responses. Table 2 summarizes the results of two questions, Q2S and Q2A, from Survey 2, in the same way as Table 1.

Table 1
Reason for being non-partisan (SMC and AHP)

SMC (Q1S)        AHP (Q1A): aggregated weight over each answer to SMC   Obs.   Ratio   Ranking
                 Apathy   Realignment   Corruption   Non-confidence
Apathy           0.192    0.350         0.223        0.235               108    38.4%   1
Realignment      0.184    0.171         0.227        0.418                40    14.2%   4
Corruption       0.177    0.199         0.371        0.253                56    19.9%   3
Non-confidence   0.298    0.189         0.248        0.265                77    27.4%   2
Average          0.217    0.250         0.260        0.273               281    100%
Ranking          4        3             2            1
Table 2
Main concern of job-search activities (SMC and AHP)

SMC (Q2S)        AHP (Q2A): aggregated weight over each answer to SMC   Obs.   Ratio   Ranking
                 Specifications   Welfare   Salary   Place
Specifications   0.405            0.097     0.178    0.320                83    47.4%   1
Welfare          0.276            0.219     0.184    0.322                28    16.0%   3
Salary           0.356            0.179     0.266    0.199                46    26.3%   2
Place            0.306            0.066     0.087    0.540                18    10.3%   4
Average          0.361            0.134     0.193    0.312               175    100%
Ranking          1                4         3        2

As can be seen in Tables 1 and 2, the diagonal element is not always the maximum in each row; that is, the answer to the SMC and the most heavily weighted alternative in the AHP do not necessarily coincide. For example, in Table 1, respondents who answered Apathy on the SMC gave the most weight to Realignment in the AHP, while respondents who answered Realignment on the SMC gave the most weight to Non-confidence. Furthermore, the answer with the highest ratio on the SMC is Apathy, whereas the most heavily weighted reason in the AHP is Non-confidence. Thus, based on the SMC, the survey would conclude that the most important reason respondents were non-partisan is Apathy, followed by Non-confidence, and so on. In contrast, based on the AHP, the most important reason is Non-confidence, followed by Corruption. Clearly, the two methods yield different aggregated rankings of alternatives.

The SMC is attractive for two reasons: it is easy for respondents to fill out, and the main concerns of a respondent can be easily identified. However, it prevents respondents from expressing their preference for a particular alternative over the others, and no information regarding the relationship among non-selected alternatives can be derived. The AHP, on the other hand, makes it possible to reflect the relative importance of alternatives in the results, even though it requires respondents to answer complex questions and thus takes much more time than the SMC. Which of these methods more accurately represents respondents' perceptions is still an open question; verification requires more empirical tests. These results imply, however, that the output of the SMC, widely employed in survey research, might conceivably provide erroneous information.

3.2 Comparison between MMC and AHP

Next, we compare the ratio of each alternative derived from the MMC method with the weight of each alternative elicited from the AHP method. The MMC is a type of MC question format that allows respondents to express their top two alternatives, giving them a greater degree of freedom in answering. Each respondent's preference for alternatives is therefore likely to be specified more precisely than with the SMC. In Surveys 1 and 2, however, almost half the respondents in each survey chose only one alternative in the MMC. Accordingly, we define as "Singular" the group of respondents who chose only one alternative despite being given the option of indicating a second choice in the MMC (Q1M and Q2M in Surveys 1 and 2, respectively), and as "Plural" the group of respondents who chose two alternatives. In this section, we first focus on the difference in the aggregated ranking of alternatives between the "Singular" and the "Plural" groups of the MMC and the AHP. Table 3 summarizes the results of the two questions, Q1M and Q1A, for those defined as "Singular," in the same way as Tables 1 and 2; Table 5 similarly summarizes the results of the two questions Q2M and Q2A. Tables 4 and 6 summarize the results for those defined as "Plural," where two answers are counted for each respondent.
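The bodies of Tables 1 through 6 are produced by averaging the AHP weight vectors within each group of respondents who gave the same (S)MC answer, together with the observed counts and ratios. A minimal sketch of that aggregation, with made-up respondent rows (the DataFrame contents are invented for illustration) and assuming pandas:

```python
import pandas as pd

# Toy respondent-level records (invented, not the survey data): each row
# holds one respondent's SMC choice and his or her AHP weight vector.
df = pd.DataFrame({
    "smc_choice": ["Apathy", "Apathy", "Realignment", "Non-confidence"],
    "Apathy":         [0.20, 0.18, 0.15, 0.30],
    "Realignment":    [0.40, 0.32, 0.20, 0.15],
    "Corruption":     [0.15, 0.25, 0.25, 0.25],
    "Non-confidence": [0.25, 0.25, 0.40, 0.30],
})

# Mean AHP weight within each SMC answer group: the cells of Tables 1-6.
group_means = df.groupby("smc_choice").mean()

# Observed count and ratio of each answer: the Obs. and Ratio columns.
counts = df["smc_choice"].value_counts()
ratios = counts / counts.sum()

print(group_means)
print(ratios)
```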
As can be seen in Tables 3 and 5, the diagonal element is not always the maximum in each row; that is, a "Singular" answer to the MMC and the most heavily weighted alternative in the AHP do not necessarily coincide. For example, in Table 5, respondents who answered Specifications on the MMC gave the most weight to Place in the AHP, and respondents who answered Welfare on the MMC also gave the most weight to Place. Furthermore, the answer with the highest ratio on the MMC is Salary, whereas the most heavily weighted concern in the AHP is Specifications. As a result, these two methods yield different aggregated rankings of alternatives.

The fact that the choice on the MMC is more compatible than that on the SMC with the most heavily weighted alternative in the AHP might suggest that the MMC functioned well for the "Plural" group. As shown in Table 4, for example, the diagonal element is always the maximum in each row. Table 6, however, shows that the relationship between the choice on the MMC and the most heavily weighted alternative is chaotic. Furthermore, as was seen in Tables 3 and 5, the MMC and the AHP yield different aggregated rankings of alternatives in Tables 4 and 6 as well, while the rankings elicited from the AHP are robust (see Tables 1, 3 and 4 from Survey 1 and Tables 2, 5 and 6 from Survey 2).

Lastly, we evaluate the effectiveness of the MMC in making up for the lack of information encountered with the SMC. Since the MMC gives respondents the option of indicating their second-best alternative, whether or not a second-best alternative is chosen may depend on the strength of that alternative vis-à-vis the first. In other words, if the MMC functions well for the "Plural" group, it could contribute by supplying the missing information (e.g., the difference in the degree of importance among selected alternatives and the respondent's perception concerning non-selected ones).
Let wp1 and wp2 respectively denote respondent p's maximum and second-maximum elements of the eigenvector corresponding to the Frobenius root; then ∆wp ≡ wp1 − wp2 (≥ 0) represents the discrepancy between the maximum and second-maximum weights of alternatives for respondent p. The larger ∆wp is, the more clearly a respondent distinguishes his or her best alternative from the second best. Table 7 summarizes the numbers of "Singular" and "Plural" respondents in Surveys 1 and 2, stratified by the size of ∆wp. As shown in the table, the difference in the distribution of responses between the "Singular" and the "Plural" appears small. Indeed, in chi-square tests, χ²(Q1M) = 2.979 < χ²(6, 0.8) = 3.070 and χ²(Q2M) = 2.108 < χ²(6, 0.9) = 2.204 hold. These results imply that there may be no difference in ∆wp between the "Singular" and the "Plural"; in other words, whether or not respondents added a second choice in answering the MMC-formatted question may be independent of the discrepancy in the degree of importance between their best and second-best alternatives. Thus, we conclude that the MMC does not succeed in precisely specifying a respondent's preference among alternatives.

The MMC does, overall, seem to be a good option for designing questionnaires because it enhances respondents' degree of freedom in answering questions, and many researchers employ it extensively in their survey questionnaires. Insofar as our surveys are concerned, however, the MMC may not be effective in filling the information gap between the SMC and the AHP.
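The ∆wp stratification behind Table 7 can be sketched as follows. The weight vectors here are hypothetical (the second reuses the fruit example from the introduction), and the bin edges mirror Table 7's rows:

```python
import numpy as np

BIN_EDGES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 1.0]  # upper edges of Table 7's strata

def delta_w(weights):
    """Gap between a respondent's largest and second-largest AHP weight."""
    a = np.sort(np.asarray(weights, dtype=float))[::-1]
    return a[0] - a[1]

def stratum(dw, eps=1e-12):
    """Map a delta-w_p value to its Table 7 row label."""
    if dw <= eps:                      # exact ties go to the separate "0" row
        return "0"
    for lo, hi in zip(BIN_EDGES, BIN_EDGES[1:]):
        if lo < dw <= hi:
            return f"({lo}, {hi}]"

respondents = [
    [0.25, 0.25, 0.25, 0.25],        # four-way tie: delta-w_p = 0
    [0.092, 0.399, 0.181, 0.328],    # fruit example: 0.399 - 0.328 = 0.071
    [0.60, 0.20, 0.15, 0.05],        # clear favorite: delta-w_p = 0.40
]
for w in respondents:
    print(stratum(delta_w(w)))
```

Counting respondents per stratum, separately for the "Singular" and "Plural" groups, yields a table of the same shape as Table 7.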
1 Issue 2 2009 ISSN 936-6744 Analytic Hierarchy Process Table 3 Reason for being non-partisan (MMC “Singular” answer and AHP) Table 4 Reason for being non-partisan (MMC “Plural” answers and AHP) Table 5 Main concern of job-search activities (MMC “Singular” answer and AHP) MMC (Q1M) "Singular" answer Apathy Realignment Corruption Non-confidence Apathy 0.281 0.207 0.229 0.283 41 31.3% 2 Realignment 0.169 0.372 0.230 0.229 48 36.6% 1 Corruption 0.154 0.165 0.415 0.266 25 19.1% 3 Non-confidence 0.142 0.191 0.219 0.448 17 13.0% 4 Average 0.198 0.258 0.263 0.281 131 100% Ranking 4 3 2 1 AHP (Q1A): aggregated weight over each answer to MMC Obs. Ratio Ranking MMC (Q2M) "Singular" answer Specifications Welfare Salary Place Specifications 0.283 0.079 0.162 0.476 19 23.5% 2 Welfare 0.309 0.215 0.096 0.381 11 13.6% 4 Salary 0.367 0.169 0.314 0.150 38 46.9% 1 Place 0.247 0.140 0.082 0.530 13 16.0% 3 Average 0.320 0.149 0.212 0.319 81 100% Ranking 1 4 3 2 AHP (Q2A): aggregated weight over each answer to MMC Obs. Ratio Ranking MMC (Q1M) "Plural" answer Apathy Realignment Corruption Non-confidence Apathy 0.275 0.244 0.245 0.236 83 28.0% 1 Realignment 0.223 0.322 0.203 0.251 72 24.3% 3 Corruption 0.199 0.211 0.324 0.266 78 26.4% 2 Non-confidence 0.209 0.216 0.246 0.330 63 21.3% 4 Average 0.228 0.248 0.256 0.267 296 100% Ranking 4 3 2 1 AHP (Q1A): aggregated weight over each answer to MMC Obs. Ratio Ranking IJAHP ARTICLE: Sato/How to Measure Human Perception in Survey Questionnaires International Journal of the 73 Vol. 1 Issue 2 2009 ISSN 936-6744 Analytic Hierarchy Process Table 6 Main concern of job-search activities (MMC “Plural” answers and AHP) Table 7 Comparison of ∆wp between “Singular” and “Plural” 3.3 Comparison between ranking method and AHP First, we analyze the difference of the preference order for all alternatives concerning the abstract issue, between the ranking method and the AHP method. 
Specifically, the aggregated ranking of each concept for refining governmental program policy derived from the ranking method and the aggregated weight of each concept elicited from the AHP are compared. Table 8 summarizes the results of the two questions, questions Q3R1 and Q3A1, from Survey 3. The numbers in the first row are the aggregated ranking of each concept across all responses obtained from Q3R1. The second row represents the average weight for each concept aggregated across all responses derived from Q3A1. The last row indicates the change from year 2001 to 2002 in the actual implementation of the budget for each concept, which corresponds to one of the five budgeting categories in the local government. MMC (Q2M) "Plural" answer Specifications Welfare Salary Place Specifications 0.445 0.136 0.230 0.189 57 32.4% 1 Welfare 0.283 0.225 0.093 0.399 40 22.7% 3 Salary 0.426 0.134 0.286 0.154 45 25.6% 2 Place 0.328 0.196 0.063 0.413 34 19.3% 4 Average 0.381 0.167 0.181 0.271 176 100% Ranking 1 4 3 2 AHP (Q2A): aggregated weight over each answer to MMC Obs. Ratio Ranking Singular Plural Singular Plural 46 45 24 30 ( 0 , 0.1 ] 34 35 22 21 ( 0.1 , 0.2 ] 16 24 9 13 ( 0.2 , 0.3 ] 7 10 6 8 ( 0.3 , 0.4 ] 10 14 5 5 ( 0.4 , 0.5 ] 6 10 6 5 ( 0.5 , 1.0 ] 12 10 9 6 Q1M (Survey 1) Q2M (Survey 2) ∆ w p 0 IJAHP ARTICLE: Sato/How to Measure Human Perception in Survey Questionnaires International Journal of the 74 Vol. 1 Issue 2 2009 ISSN 936-6744 Analytic Hierarchy Process Table 8 Concepts of refining governmental program policy As shown in Table 8, the difference of the preference order for all alternatives between the ranking method and the AHP seems small. For example, the highest ranked concept in the aggregated ranking derived from the ranking method is Concept II; the most weighted concept in the AHP is also Concept II. 
Q3R1 & Q3A1                           Concept I   Concept II   Concept III   Concept IV   Concept V
Ranking                               2.000       1.250        3.500         4.500        3.750
Weight                                0.18803     0.36152      0.19979       0.08191      0.16874
Annual change in budget 2001→2002    -1.11%      0.355%       -1.62%        1.26%        -0.231%

The case is the same for the lowest-ranked concept in the ranking method and the lightest-weighted concept in the AHP: both are Concept IV. In addition, the correlation coefficient (CC) between the outputs of Q3R1 and Q3A1 is -0.87 (negative because a smaller ranking number indicates a stronger preference). Consequently, the preference order of the basic concepts derived from the ranking method coincides almost exactly with that elicited from the AHP. As a result, these two methods would likely produce similar results concerning respondents' preference order of alternatives. Nevertheless, the preference order of basic concepts for refining governmental program policy does not coincide with the annual change in the actual implementation of the budget. Specifically, the concept thought by executive staff members to be the most significant for refining governmental program policy (Concept II) and the category budgeted with the highest priority (corresponding to Concept III) are different. Consequently, the preference orders obtained from the ranking method and the AHP do not coincide with the annual change in budget. Indeed, the CCs between the preference orders of concepts and the actual change in budget are -0.44 and 0.32, respectively. These results may imply that both the ranking method (Q3R1) and the AHP (Q3A1) did not function well in capturing respondents' perceptions. We would not, however, be able to draw such a conclusion based solely on this survey, because the change rates in the budget were so small (between -1.62% and +1.26%) that the correlation between the preference order and the change rate may have been affected by measurement error.

Next, we focus on the difference in the preference order for all alternatives concerning the concrete issue between the ranking method and the AHP. Specifically, the aggregated ranking of each governmental project with high priority (PHP) obtained from the ranking method is compared with the aggregated weight of each project elicited from the AHP. Table 9 summarizes the results of the two questions, Q3R2 and Q3A2, in the same way as Table 8. As shown in Table 9, the preference orders for all alternatives derived from the ranking method and the AHP are different. For example, both the highest-ranked PHP in the aggregated ranking obtained from the ranking method and the most weighted PHP in the AHP are PHP I, while the lowest-ranked PHP in the ranking method (PHP V) and the lightest-weighted PHP in the AHP (PHP III) are different. Indeed, the CC between the outputs of Q3R2 and Q3A2 is -0.32. As a result, the ranking method and the AHP can lead to different conclusions.

Table 9 Governmental projects with high priority

In contrast to the case of the abstract issue, the preference order of PHPs induced from the AHP is highly correlated with the annual change in the actual implementation of the budget for the case of the concrete issue. As can be seen in Table 9, the preference order elicited from the AHP coincides almost exactly with the annual change in the actual implementation of the budget (CC = 0.98), while that generated by the ranking method does not (CC = -0.32). Although these results may have been affected by measurement error, they imply that each respondent's preference for alternatives elicited from the AHP is likely to be specified more precisely than it is with the ranking method.
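The correlation coefficients reported above can be reproduced directly from the aggregated values printed in Tables 8 and 9; a minimal sketch:

```python
# Reproduce the CCs between aggregated rankings and AHP weights
# reported for Tables 8 and 9. The CCs are negative because a
# smaller ranking number indicates a stronger preference.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Table 8 (Q3R1 & Q3A1): Concepts I-V
rank_q3r1 = [2.000, 1.250, 3.500, 4.500, 3.750]
weight_q3a1 = [0.18803, 0.36152, 0.19979, 0.08191, 0.16874]

# Table 9 (Q3R2 & Q3A2): PHPs I-V
rank_q3r2 = [2.000, 2.750, 2.750, 3.250, 4.250]
weight_q3a2 = [0.39234, 0.14968, 0.08169, 0.14725, 0.22904]

print(round(pearson(rank_q3r1, weight_q3a1), 2))  # -0.87
print(round(pearson(rank_q3r2, weight_q3a2), 2))  # -0.32
```

Both values agree with the CCs quoted in the text (-0.87 for the abstract issue, -0.32 for the concrete issue).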
3.4 Comparison between rating method and AHP

First, we focus on the difference in the aggregated ranking of alternatives between the FT method and the AHP method; in particular, the average Feeling Score for each party obtained from the FT is compared with the average weight of each party derived from the AHP. Since both rankings reflect the overall trend of a population concerning each party, they are likely to produce similar results. Table 10 summarizes the results of two questions, Q4W and Q4A, from Survey 4. The numbers in the first and second rows are the average Feeling Score and the average weight of each party aggregated across all responses, derived from Q4W and Q4A, respectively. The last row represents the correlation coefficients between the Feeling Score and the weight elicited from the AHP for each party among all respondents. As shown in Table 10, the Feeling Scores and the AHP weights imply that the most and the second most favored parties among respondents are DPJ and LDP, respectively; both the FT and the AHP clearly identify the top two parties. On the other hand, the rankings of the remaining parties, NK, JCP and SDP, are different, even though the differences in Feeling Score or weight among those parties are quite small. Indeed, the correlation coefficients between the Feeling Score and the weight of each party are reasonably high. As a result, answers on the FT and on the AHP do not necessarily coincide, which yields different aggregated rankings of alternatives. Both the FT and the AHP, however, produce similar results, and overall they seem to specify respondents' preferences for parties in the House of Councilors election.
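The agreement on the top two parties, and the disagreement on the rest, can be checked directly from the aggregated figures in Table 10; a short sketch:

```python
# Compare the aggregated rankings induced by the Feeling Thermometer
# and by the AHP weights (figures from Table 10).
parties = ["LDP", "DPJ", "NK", "JCP", "SDP"]
feeling = [53.6, 66.6, 19.2, 14.4, 14.2]
weight = [0.34662, 0.38969, 0.08945, 0.08007, 0.09417]

ft_order = sorted(parties, key=lambda p: -feeling[parties.index(p)])
ahp_order = sorted(parties, key=lambda p: -weight[parties.index(p)])

print(ft_order)   # ['DPJ', 'LDP', 'NK', 'JCP', 'SDP']
print(ahp_order)  # ['DPJ', 'LDP', 'SDP', 'NK', 'JCP']
# The top two parties coincide; the order of NK, JCP and SDP differs.
```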
Q3R2 & Q3A2                           PHP I     PHP II    PHP III   PHP IV    PHP V
Ranking                               2.000     2.750     2.750     3.250     4.250
Weight                                0.39234   0.14968   0.08169   0.14725   0.22904
Annual change in budget 2001→2002    -5.87%    -0.300%   10.3%     -7.72%    -8.16%

Table 10 Aggregated Feeling Score and weight of each party

Next, we evaluate the effectiveness of each method in terms of its usefulness as a source of independent variables in regression analyses; in particular, we formulate four regression models and compare their R2s. As independent variables, we employ the Feeling Score and the not-normalized weight of each party except the Social Democratic Party (SDP), because that party was a minority group (the numbers of members of the House of Councilors at the time the survey was carried out, in March 2004, were LDP: 116, DPJ: 70, NK: 23, JCP: 20, and SDP: 5). As for dependent variables, based on Quantification Theory I, we employ two variables: one is the respondent's political ideology (Q.13) in regression models 1 and 2, and the other is whether the respondent supports Prime Minister Koizumi or not (Q.33) in regression models 3 and 4. We offered six alternatives, including a "Don't know" answer, in Q.13; we set "Progressive" or "Slightly progressive" as 0 and "Conservative" or "Slightly conservative" as 1, and we omitted "Neutral" and "Don't know" responses from Regression Analysis A. In the same way, we set "Support Koizumi" as 0 and "Do not support Koizumi" as 1, and omitted "Don't know" responses from Regression Analysis B. Thus, the actual regression models can be formulated as follows.
Regression Analysis A
dependent variable: Political Ideology ("Progressive" or "Slightly progressive" = 0, "Conservative" or "Slightly conservative" = 1)
regression equations:
Political Ideology = a1 + a2*FS_LDP + a3*FS_DPJ + a4*FS_NK + a5*FS_JCP   (regression model 1)
Political Ideology = b1 + b2*w_LDP + b3*w_DPJ + b4*w_NK + b5*w_JCP   (regression model 2)

Regression Analysis B
dependent variable: Support Koizumi or not ("Support" = 0, "Do not support" = 1)
regression equations:
Koizumi = c1 + c2*FS_LDP + c3*FS_DPJ + c4*FS_NK + c5*FS_JCP   (regression model 3)
Koizumi = d1 + d2*w_LDP + d3*w_DPJ + d4*w_NK + d5*w_JCP   (regression model 4)

Q4W & Q4A                 LDP       DPJ       NK        JCP       SDP
Feeling Score             53.6      66.6      19.2      14.4      14.2
Weight                    0.34662   0.38969   0.08945   0.08007   0.09417
Correlation Coefficient   0.81257   0.72848   0.77450   0.77443   0.75042

Table 11 summarizes the results of Regression Analysis A. As can be seen in the table, the R2s of regression models 1 and 2 are not large enough to predict respondents' political ideology; a respondent's voting intention does not necessarily correlate with his or her political ideology. By employing the weights derived from the AHP, however, R2 is improved from 0.23755 to 0.42850. Furthermore, focusing on the p-values, the degree of precision of regression model 2 is higher than that of regression model 1.

Table 11 Results of Regression Analysis A (models 1 and 2)

Table 12 summarizes the results of Regression Analysis B. As shown in the table, the R2s are relatively larger than those of Regression Analysis A; whether a respondent supports Prime Minister Koizumi or not can be predicted to some extent based on regression models 3 and 4. By employing the weights derived from the AHP, R2 is again improved from 0.42293 to 0.51299.
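The four models above are ordinary least-squares regressions of a 0/1 dependent variable on four party scores plus a constant. A minimal sketch of the shape of model 2 follows; the data here are synthetic placeholders, not the survey responses, so only the mechanics are illustrated:

```python
import numpy as np

# Sketch of regression model 2: Political Ideology regressed on the
# not-normalized AHP weights of four parties plus a constant.
# The data below are synthetic placeholders, not the survey data.
rng = np.random.default_rng(0)
n = 25
w = rng.random((n, 4))            # columns stand in for w_LDP, w_DPJ, w_NK, w_JCP
ideology = rng.integers(0, 2, n)  # 0 = progressive, 1 = conservative

X = np.column_stack([np.ones(n), w])      # prepend the constant term b1
coef, *_ = np.linalg.lstsq(X, ideology, rcond=None)

fitted = X @ coef
ss_res = np.sum((ideology - fitted) ** 2)
ss_tot = np.sum((ideology - ideology.mean()) ** 2)
r2 = 1 - ss_res / ss_tot                  # coefficient of determination
print(coef, r2)
```

With the actual survey data in place of the placeholders, comparing `r2` across models 1-4 reproduces the R2 comparison discussed in the text.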
Furthermore, focusing on the p-values, the degree of precision of regression model 4 is higher than that of regression model 3.

Political ideology   model 1 (Feeling Score)   model 2 (weight)
                     coefficient               coefficient
constant              1.48488 **                1.35099 ***
LDP                  -1.58819 **               -0.00968 ***
DPJ                  -0.51939 *                -0.00175 *
NK                   -0.45019 *                -0.00330 **
JCP                  -1.62317 *                -0.00642 **
N = 25               R2 = 0.23755              R2 = 0.42850
                     SE = 0.48460              SE = 0.41955
p-value: 0.05 < p <= 0.1: *, 0.01 < p <= 0.05: **, p <= 0.01: ***

Table 12 Results of Regression Analysis B (models 3 and 4)

Support Koizumi      model 3 (Feeling Score)   model 4 (weight)
                     coefficient               coefficient
constant             -0.05558 *                -4.22262 **
LDP                   0.00918 **                5.29983 ***
DPJ                  -0.00148 *                 4.09808 **
NK                    0.00481 **                6.46932 ***
JCP                   0.00125 *                 9.04207 ***
N = 17               R2 = 0.42293              R2 = 0.51299
                     SE = 0.45130              SE = 0.41459
p-value: 0.05 < p <= 0.1: *, 0.01 < p <= 0.05: **, p <= 0.01: ***

The FT method overall seems to be a good option for designing questionnaires because it allows respondents to express their preferences for all alternatives, and it enhances respondents' degree of freedom in answering questions. Therefore, many researchers, especially in the field of political science, have extensively employed the FT in their survey questionnaires. Insofar as our survey is concerned, however, Feeling Scores may not be effective in predicting respondents' preferences in comparison with the weights elicited from the AHP. In both Regression Analysis A and B, the R2s based on the FT are not large enough to predict respondents' preferences. On the other hand, the results of this section imply that the AHP could quantify respondents' preferences in terms of the distribution of the weight of each alternative; by employing the AHP weights, the R2s in the regression analyses are improved over those based on Feeling Scores.
Since the Feeling Score measures the feeling toward parties while the AHP weight measures the preference for parties, we cannot simply compare these R2s and conclude that the weight is a better explanatory variable than the Feeling Score. The weights derived from the AHP, however, can be conjectured to function better as independent variables in regression analyses than Feeling Scores.

4. Concluding remarks

Questionnaire design poses one of the biggest challenges for survey researchers because how respondents are asked questions can have a great effect on results. One political scientist2 remarked that different question formats yielded different results, despite the fact that they were asking about exactly the same content. Consequently, various ways of eliciting opinions have been proposed and evaluated in order to clarify which best represents each respondent's perception. In particular, the multiple-choice method, the ranking method, and the rating method have often been compared, generating a great deal of discussion.

Each of the aforementioned methods has its pros and cons. The multiple-choice method has been most extensively used because of its ease of response and the ease with which it identifies the respondents' main concerns for the researcher. In addition, this method provides some formats that enhance a respondent's degree of freedom in answering questions. Meanwhile, no information regarding the non-selected alternatives or the relative importance among selected alternatives can be derived. The ranking method, often considered ideal for designing questionnaires, is also extensively used because it allows researchers to easily identify respondents' preference orders. This method, however, does not allow ties among alternatives, nor can it represent the degree of importance of each alternative.
The rating method, such as the Feeling Thermometer, has also been extensively employed because of the ease with which it identifies the respondents' concerns for all alternatives for the researcher. On the other hand, this method does not fit humans' natural way of thinking. In contrast, the AHP, a support system for decision-making, can be a possible option for formatting questionnaires. This method makes it possible to reflect the relative importance of alternatives in the results, even though it requires respondents to answer complex questions and thus necessitates much more time.

In this study, focusing on survey research in which respondents were asked questions involving social issues, we verified the effectiveness of the multiple-choice method, the ranking method, and the rating method by using the weights of alternatives elicited from the AHP method. The results were: (1) the simple or modified multiple-choice method and the AHP yielded different aggregated rankings of alternatives, while the weights elicited from the AHP were robust; (2) whether or not respondents added a second choice in answering the modified multiple-choice formatted questions was irrelevant to the discrepancy in the degree of importance between the best and the second-best alternatives; (3) for the abstract issue, the ranking method and the AHP yielded similar aggregated rankings of alternatives; (4) for the concrete issue, the results from the AHP coincided almost exactly with the annual change in the actual implementation of the budget, unlike those generated by the ranking method; (5) the Feeling Thermometer method and the AHP yielded similar aggregated rankings of alternatives; (6) in the regression analyses, Feeling Scores may not be effective in predicting respondents' preferences, while the weights derived from the AHP could predict their preferences to some extent.

These results, insofar as our survey is concerned, provide some evidence that the multiple-choice method, the ranking method, and the rating method do not succeed in precisely specifying a respondent's perception of alternatives and thus might not be appropriate for measuring human perception in questionnaires of survey research. The application of the AHP to questionnaire design in survey research, on the other hand, might very well be superior to those traditional methods.

Nevertheless, several issues remain. The first issue is: "What criteria would be appropriate for evaluating the effectiveness of the questionnaire format?" For example, in Survey 3, is the annual change in the actual implementation of a budget really adequate? This issue relates to the criterion for evaluating the method: the criterion needs to be sensitive in measuring the correlation between what is in respondents' minds and their actual behavior. As for the theme of the survey, the annual change in budget is still a functional criterion for the evaluation because the change is one of the distinguishing phenomena representing the priorities of executive staff members in the government.

2 Chris Achen, University of Michigan, February 2000 (personal communication).
On the other hand, other criteria for evaluation could be considered: the number of staff members assigned to a particular project, the frequency with which a project is a main agenda item in executive staff meetings, the number of times a particular project is covered in the public relations material of the government, and so on. Further empirical tests are required for verification. The second issue concerns what the results would be if the number of alternatives in a question were different from the numbers used in our surveys. Would similar results be obtained if a question offered nine alternatives, for instance? In this study, we analyzed only cases with four or five alternatives; investigations of up to at least ten alternatives would therefore be warranted in order to verify what an adequate format for questionnaire design would be.

REFERENCES

Downing, S. M. (2004). Reliability: On the reproducibility of assessment data. Medical Education, 38(9), 1006-1012.

Inglehart, R. & Abramson, P. (1993). Values and value change of five continents. Paper presented at the 1993 Annual Meeting of the American Political Science Association, Washington, D.C., September 1-5.

Jerard, K. (1995). Writing multiple-choice test items. Practical Assessment, Research and Evaluation, 4(9).

Judd, C. M., Smith, E. R. & Kidder, L. H. (1991). Research Methods in Social Relations (6th ed.). Orlando: Harcourt Brace & Co.

Kabashima, I. (1998). Seiken Koutai to Yuken-shya no Taido Henyou. Tokyo: Bokutaku-shya (in Japanese).

Saaty, T. L. (1980). The Analytic Hierarchy Process. New York: McGraw-Hill.

Saaty, T. L. (1994). Highlights and critical points in the theory and application of the Analytic Hierarchy Process. European Journal of Operational Research, 52, 426-447.

Sato, Y. (2003). Comparison between ranking method and the Analytic Hierarchy Process in program policy analysis. Proceedings of the Seventh International Symposium on the Analytic Hierarchy Process, 439-447.

Sato, Y. (2004). Comparison between multiple-choice and Analytic Hierarchy Process: Measuring human perception. International Transactions in Operational Research, 11(1), 77-86.

Sato, Y. (2005). Questionnaire design for survey research: Employing weighting method. Proceedings of the Eighth International Symposium on the Analytic Hierarchy Process, ISSN 1556-830X.

Sato, Y. (2007). Administrative evaluation and public sector reform: An Analytic Hierarchy Process approach. International Transactions in Operational Research, 14(5), 445-453.

Traugott, M. W. & Lavrakas, P. J. (2000). The Voter's Guide to Election Polls (2nd ed.). New York: Chatham House Publishers, Seven Bridges Press, LLC.

Webber, A., Apostolou, B. & Hassell, J. M. (1997). The sensitivity of the Analytic Hierarchy Process to alternative scale and cue presentations. European Journal of Operational Research, 96, 351-362.

Appendix 1 (Q1S: Q.13 in Survey 1, as formatted by the SMC method)

Which of the following was the most significant reason you were non-partisan? Choose one from among the four reasons and write the letter in the box.

Appendix 2 (Q1M: Q.29 in Survey 1, as formatted by the MMC method)

Which of the following was the most significant reason you were non-partisan? Choose one from among the four reasons and write the letter in the box. You may give an additional reason in the box marked "optional."

Appendix 3 (Q1A: Q.26 in Survey 1, as formatted by the AHP method)

If you compare each of the following pairs of reasons for being non-partisan, which do you think is more significant?
Compare each of the following pairs of reasons and mark the place along the segment.

(Response options listed in Appendices 1 and 2, with an additional box marked "optional" in Appendix 2:)
A: Too much political realignment
B: Political apathy
C: Non-confidence with party and politician
D: Corruption of political ethics

(Answer segments in Appendix 3, each running from "absolutely" through "equivalent" to "absolutely":)
Realignment vs. Non-confidence
Corruption vs. Apathy
Non-confidence vs. Corruption
Realignment vs. Corruption
Apathy vs. Realignment
Non-confidence vs. Apathy

Appendix 4 (Q3R1: one of the two first questions in Survey 3, as formatted by the Ranking method)

The five basic concepts for refining our governmental program policy are as follows:
Concept I: for cultural development of our area
Concept II: for safety of our social life
Concept III: for environmental preservation of our area
Concept IV: for economic growth of our area
Concept V: for enhancement of our area
Which of the following concepts are more significant than others? Rank all concepts, from the most preferred (1) to the least (5), and write the number in each box.

Appendix 5 (Q4W: Q.9 in Survey 4, as formatted by the Feeling Thermometer method)

The next House of Councilors election is scheduled this coming July. Which party are you willing to vote for in the election by the proportional representation system? Indicate your intention of voting for each of the following parties by a number from 0, the coldest feeling, to 100, the hottest, with 50 being neutral, and write the number in each box.
(Ranking boxes in Appendix 4:)
For cultural development of our area
For the safety of our social life
For environmental preservation of our area
For the economic growth of our area
For the attraction of our area

(Party boxes in Appendix 5:)
Liberal Democratic Party (LDP)
Democratic Party of Japan (DPJ)
New Komeito (NK)
Japan Communist Party (JCP)
Social Democratic Party (SDP)
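As an illustration of how responses to pairwise comparison questions such as those in Appendix 3 become the weights used throughout this paper: the AHP derives them from the principal eigenvector of the reciprocal comparison matrix (Saaty, 1980). A minimal sketch using the power method; the judgments in the matrix below are hypothetical, for illustration only:

```python
import numpy as np

# Hypothetical pairwise comparison matrix for the four reasons in
# Appendix 3 (rows/columns: Apathy, Realignment, Corruption,
# Non-confidence). Entry a[i][j] says how much more significant
# reason i is than reason j; a[j][i] is its reciprocal.
A = np.array([
    [1.0, 3.0, 5.0, 1.0],
    [1/3, 1.0, 3.0, 1/2],
    [1/5, 1/3, 1.0, 1/4],
    [1.0, 2.0, 4.0, 1.0],
])

# Power method: repeatedly apply A and renormalize to approximate
# the principal eigenvector, then normalize it to sum to 1.
w = np.ones(len(A)) / len(A)
for _ in range(100):
    w = A @ w
    w = w / w.sum()

print(np.round(w, 3))  # weights for the four reasons, summing to 1
```

Averaging such weights across respondents gives aggregated weights of the kind reported in Tables 3 through 6.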