UNIVERSITY OF CHITRAL JOURNAL OF LINGUISTICS AND LITERATURE VOL. 5 | ISSUE II | JULY – DEC | 2021 ISSN (E): 2663-1512, ISSN (P): 2617-3611 https://doi.org/10.33195/jll.v5iII.329

Evaluation of English Language Question Papers for Content Validity at Intermediate Level: A Case Study of Board of Intermediate and Secondary Education, Sukkur

Ikramullah Khan
Assistant Professor in English
Government Degree College & Postgraduate Studies Centre, Sukkur
ikramk607@gmail.com

Dr. Zulfiqar Ali Shah
Director, Institute of English Language and Literature
Shah Abdul Latif University, Khairpur, Pakistan
zulfiqar.shah@salu.edu.pk

Dr. Abdul Saeed
Assistant Professor of English
Sukkur Institute of Business Administration University, Pakistan
saeedabdulskr@gmail.com

Abstract

The purpose of this study was to examine whether the first-year annual English examination question papers administered by the Board of Intermediate and Secondary Education (BISE) Sukkur were synchronised with the National Curriculum Benchmarks (2006). Annual question papers for class XI from BISE Sukkur were chosen to examine content validity across a five-year period (2014-18). A survey questionnaire was used to obtain data from ELT experts. When compared to the Benchmarks of the National Curriculum of English Language (2006), the data revealed that the items on the annual question papers had very low content validity. The study recommends ways of developing effective question papers aligned with the Benchmarks.

Keywords: English language, content validity, Benchmarks, national curriculum, evaluation

Introduction

English has become the language of communication across the globe. It connects people from different corners of the world. It serves many purposes such as education, trade and tourism.
Similarly, the English language has gained a very prestigious position in Pakistan (Ahmed, 2012). It is used as a second language alongside the national language, Urdu, for educational and official correspondence. The Pakistani Ministry of Higher Education has made English language proficiency a requirement for employment in Pakistani universities. In addition to the scholastic benefits of English, it is also a means of achieving a high social status (Haidar, 2019). The higher one's proficiency, the better one's chances of gaining social prestige.

Having seen the importance of the English language worldwide, the government of Pakistan has made it a compulsory subject from primary to higher level of education. The aim is to make students proficient enough in all four language skills that they can compete for any world-level educational scholarship (Muhammad, 2016). The National Curriculum of Pakistan (2006) has set level-wise standards from primary to higher level of education to ensure the steady progress of students in English throughout their academic career. Students are assessed at the end of each academic year through written exams to determine whether they have made the required progress. Thus, evaluation is one of the most important and crucial factors in the process of teaching and learning. Only a proper evaluation process can estimate the achievement of students in their academic year (Hughes, 2003).
Proper assessment and evaluation help learners to focus on neglected or weak areas of study and to improve in them. A variety of methods and approaches are used to assess students' performance, and question papers are one way of assessing learners' achievement over the year. These question papers are therefore expected not only to cover lower- to higher-level skills, so as to assess the overall growth and understanding of the learners, but also to reflect the learning outcomes set by the relevant authority (Stoynoff, 2009). In other words, if question papers are not designed with the assessment of outcomes in mind, they will have a detrimental impact on teaching and learning.

The term "validity" refers to whether or not a test has measured what it was supposed to measure. According to Brennan et al. (2006), validity is an essential precept of evaluation and a crucial characteristic associated with the interpretations and uses of test scores. They further highlight that a test's validity supports the inference that the test assesses what it was expected to assess. Because content validity is an important component of an educational test, test developers should evaluate the content validity of any test to make it more successful and beneficial. To assess students' mastery of the subject, a valid question paper is required (Akhter, 2015). Content validity is the most common type of validity. It entails a thorough analysis of test content, in the form of test items, to ensure that the test covers and represents the proportion of the syllabus as well as the cognitive domain level of educational objectives. The content validity of a test is determined by specialists, who judge how valid a test is in terms of content and objective (Zamanzadeh et al., 2015).

Despite the teaching of English as a compulsory subject from primary to higher level of education in Pakistan, the results do not show the required proficiency level among learners. Researchers have highlighted several reasons for this. For example, it has been found that English teachers are not sufficiently skilled in modern teaching techniques. Most teachers still use only the grammar-translation method, which is inadequate to cover all four skills (Shamim, 2008). The English language syllabus does not conform to the specific curricular objectives, and the textbooks emphasise content rather than language acquisition skills. However, the validity of question papers is an area that has not yet been explored (Warsi, 2004). Therefore, the purpose of this study was to examine the questions of the annual examination papers of English at intermediate level for their content validity. The study aimed to investigate the gap, in terms of relevance, between the questions in the class XI English papers of the Sukkur Board and the Benchmarks of the National Curriculum (2006).

The study is important for several reasons. First, it compares the annual question papers of the most recent five years of the Board of Intermediate and Secondary Education Sukkur with the endorsed Benchmarks of the National Curriculum (2006) in terms of relevance. Second, the study assesses the consistency of question paper content with the stated objective for which the tests are administered. Third, the investigation helps English language instructors to understand content validity when creating question papers.
Fourth, the study proposes that test makers and the Sukkur Board synchronise the question papers with the National Curriculum (2006). Finally, it serves as a valuable resource for future studies on testing and assessment.

Research Questions

• To what extent are the annual question papers of English language for grade XI at the Intermediate level relevant to the prescribed Benchmarks of the National Curriculum (2006)?
• What measures can be taken to synchronise the setting of question papers with the desired outcomes as suggested by the Benchmarks?

Literature Review

Validity

Validity is one of the basic characteristics of a test. The degree to which a test measures what it is designed to measure is usually characterized as the basic attribute of assessment, 'validity'. The higher the test's validity, the more valuable it will be, and the more beneficial testing will be to future educational planning and implementation (Xi, 2021). The notion refers to the efficacy, correctness and meaningfulness of the specific inference made from the test (Fulcher, 2007). The general definition of validity is the extent to which a test measures what it is supposed to measure. Although researchers have used almost 35 different terms for different kinds of validity, Brown's (1980) classification is considered more appropriate as it covers a wide range of the term. Brown (1980) defined four types of validity. The first is predictive validity, which predicts students' future performance. The second is content validity, which assesses whether the test contains all aspects of the construct to be assessed.
The third is construct validity, which describes the efficacy of test items, that is, whether the exam questions are appropriate enough to assess what they are supposed to assess. The fourth is concurrent validity, which compares the items of one test with those of another test assessing the same level of competency. All of these forms of validity help to develop a valid test that assesses what it is supposed to assess. Test developers focus either on all of these kinds of validity or on one or two of them to make the test valid (Gipps, 2011). Although all kinds of validity are important for test development, content validity is considered the most important as it evaluates the contents of the test (Weir, 2005). Moreover, as the main focus of the study is to analyse the content validity of the last five years' English papers of class XI of the Sukkur Board, only content validity is discussed in the following section.

Content Validity and Assessment Test

Generally, content validity is the degree to which a test accurately reflects the subject area it is designed to examine. For a test to have high content validity, its material must be compatible with the testing purpose as well as the current understanding of the subject matter being tested (Martone, 2009). Content validity, also known as rational validity, indicates how far the content is represented in a test paper so that the full construct is measured.
For example, rather than asking unrelated questions, a question paper with content validity would represent the subject matter actually taught to the students (Carmines, 1979).

The importance of content validity has been stressed rigorously over the past several decades (Almanasreh et al., 2019; Vakili, 2018). It is deemed necessary to analyze not only whether the content of the test is compatible with the content of the curriculum taught but also the proportion of that compatibility. In general, content validity is determined deductively by creating a universe of items and systematically sampling within that universe to generate the test (Cronbach, 2017). Moreover, evidence of content validity does not require a complex, time-consuming analysis or massive samples; rather, it is assessed simply by comparing the content of the curriculum with that of the test (Colquitt et al., 2019).

Hughes (2020) addressed the issue of content convergence, emphasising that it deserves particular attention in achievement testing: if achievement tests are based on detailed teaching and textbook content, they will provide a more accurate picture of what has been accomplished in teaching and learning. These tests will most likely be judged in relation to the aims of the content. When writing an achievement test paper, a test designer should start with a specific outline of the topics, and the expert should explain what a student should focus on learning during the academic year. The primary goal of the test item designer is to assess the most important skills and knowledge that learners have acquired over time (Siddiek, 2018).

The Examination System of Pakistan

According to Nawani (2021), the current examination system in Pakistan does not focus on assessing students' critical and analytical ability.
Rather, it is more content-based than skill-based and focuses on assessing students' recall of factual information, which does not serve the purpose set by the National Curriculum (2006). Being superficial in nature and content, the examination system of Pakistan has been highly criticized (Fatima, 2020; Greaney, 1998; Mirza, 1999). Rehmani (2003) adds that teachers teach with the examination in mind and that their sole purpose is to prepare students to get through their exams rather than to help them learn practical skills, which in the context of English language teaching means language learning skills. Rehmani (2003, p. 4) further states, “There are model papers or guess paper guides available in the market with readymade answers based on the past five year papers”. Greaney (1998) pointed out many flaws in the Pakistani examination system, such as cramming, copy culture and the failure to assess higher-level cognitive skills. It was further pointed out that these shortcomings are playing havoc with the teaching-learning process. The education system in general, and the secondary level in particular, is not contributing to the attainment of higher-level cognitive skills; rather, it assesses superficial cognitive skills. Consequently, pupils tend to focus on the skills that help them to secure higher marks rather than on actual language learning skills. Warsi (2004) also pointed out the wide gap between textbooks and cognitive problem-solving tasks in exams. This may lead learners to pass exams with flying colors but fail in practical life.
Shah (2010) has highlighted that the education system in Pakistan is based simply on memorization and that, in this kind of educational atmosphere, students rely solely on knowledge of the prescribed textbook content rather than on the practical or creative use of their understanding. In this context, another study by Shah (2004) highlighted several problems, such as question papers containing many errors in content, language and technical construction. Shah also writes that paper setters for public exams may be highly skilled individuals with more than 5-10 years of teaching experience, but few have had adequate training in assessment and evaluation approaches. It is therefore clear that several issues are devaluing the validity of the examination system and require due attention.

Evaluation of Last Five-year Question Papers

There is a lot of literature on setting question papers based on Bloom’s taxonomy, but very few studies have evaluated exam questions, and very little research has evaluated question papers against the Pakistan National Curriculum (2006). Siddique (2013) conducted research on the evaluation of English language assessment criteria at upper secondary level in Pakistan. Her study aimed to explore the weaknesses of the assessment criteria in relation to students' performance in the subject in grades XI-XII. The research analyzed the theoretical assessment framework and the practical model used in public exams.
The data were collected from the National Curriculum document (2006), updated works of the English subject (2007-2010), experts, teachers and students of the respective classes. The results of the study showed that there is a pressing need to improve the quality of the assessment to achieve the desired learning outcomes. Furthermore, the assessment system suffers from multiple shortcomings, such as rote learning and the assessment of lower-level skills. Shah (2004) highlights the repetition of questions as a primary factor that reduces the validity of question papers. The study scrutinized the repetition of questions, essays and structured questions and claimed that it encourages students to take selective shortcuts and prepare only the repeated questions to get through exams. As a result, students memorize material that has been consistently repeated in tests over the past five years. Likewise, Martone (2009) conducted research to analyze whether there is synchronization between the English reading material recommended by the Sindh Text Book Board for the intermediate level and the annual question papers. The results reveal that the contents of the Sindh textbook are not in synchronization with the specifications of the National Curriculum (2006), which is one of the latest national curricula for Pakistan. This, as a result, may have a washback effect on learners.

Research Methodology

Participants

According to Retnawati (2016), content validity is determined by expert agreement, and this agreement determines content validity stratification. Rogers (2010) goes on to describe the content validity of a test as based on qualified assessments of test content in relation to the domains of the subject matter, as well as its representation of items.
According to Messick (1996), the outcome of the agreement specifies the synchronization of test material with a particular behavioral domain of interest, and the judges' subjectivity is mostly responsible for the content scrutiny. According to Lynn (1986), 10 experts are appropriate to examine content validity; hence, thirty testing specialists from the English language teaching and assessment field were purposively selected. All the participants had at least a Master's degree in the field as well as a minimum of three years’ experience of teaching and assessing English as a second language.

Tool

Dörnyei (2009) describes the Likert scale as a simple, versatile and reliable scaling technique. Therefore, the present study employed a Likert scale for data collection. Lynn (1986) applied this scale and found it valid and reliable. Considering the similarity in the purpose of the study, the questionnaire originally developed and applied by Davis (1992) was modified for use in our context. While modifying the instrument, its applicability and flexibility were also considered. The focus of the study was to determine the content validity of the English question papers of Intermediate Level grade XI from 2014 to 2018 of the Sukkur Board by comparing them with the Benchmarks of the National Curriculum of English (2006). Therefore, the survey questionnaire included the selected items from five years of English question papers, the Benchmarks of the National Curriculum of English (2006) and the Likert scale of content relevance in grid form.
These items were kept together in the survey questionnaire so that the experts had everything on a single page and could easily compare the items and provide their feedback. The obtained data were coded and transferred to a spreadsheet for tabulation and interpretation.

Procedure

There were ten English papers in total, containing more than 90 questions. As evaluating all of these questions would have been too time-consuming for the experts, 60 questions were chosen randomly for their evaluation. First-year question papers have two parts: the first part consists of composition-type questions, generally extended-response questions and short questions, whereas the second part consists only of objective-type questions carrying 20 marks. The present study focused on Part 1, the essay type, because the objective second part does not come under its scope. The modified questionnaire was divided into three grids, each of which contained a set of annual question paper items, the National Curriculum Benchmarks (2006) and the Scale of Relevance.

In total, thirty questionnaires were sent to the experts by email and courier, together with a consent form and an instruction letter, so that they could complete them appropriately. Fourteen of these questionnaires were returned by the experts. Ten were chosen for this research and four were dropped for technical reasons.
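The study does not say how the random selection of 60 questions described in the procedure above was carried out. A minimal sketch of one reproducible way to draw such a sample is given here. The pool size of 95, the item numbering and the seed are hypothetical placeholders (the study reports only that the questions exceeded 90), and `sample_items` is an illustrative name, not a tool the authors are known to have used.

```python
import random


def sample_items(item_ids, k=60, seed=None):
    """Draw k distinct item ids without replacement, returned in sorted order."""
    pool = list(item_ids)
    if k > len(pool):
        raise ValueError("cannot sample more items than the pool contains")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return sorted(rng.sample(pool, k))


# Hypothetical pool: the study reports more than 90 questions, not the exact count.
all_items = range(1, 96)
selected = sample_items(all_items, k=60, seed=2021)
print(len(selected), selected[:5])
```

Recording the seed alongside the sampled item numbers would let another researcher regenerate the same questionnaire.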
Data Analysis Procedure

The feedback provided by the experts on the survey questionnaire was analyzed for overall content validity. The data are described in tables according to the Content Validity Index (from .0 to .70 for invalid, .71 to .79 for good and .80 and above for excellent) and the kappa value (from .0 to .59 for poor, .60 to .73 for good and .75 and above for excellent). The results of the Experts’ Survey Questionnaire have been classified into ‘High Item Content Validity Index’ and ‘Low Item Content Validity Index’ on the basis of the Content Validity Index, and the key of the kappa value indicates the weightage of the item.

Table 1: High Item Content Validity Index, Experts’ Survey Questionnaire

No.   Questions (Excellent items)   I-CVI
1     Item No. 10                   0.8
2     Item No. 20                   0.9
3     Item No. 23                   0.8
4     Item No. 30                   0.9

CVI = Content Validity Index

Table 1 shows the valid items in the five-year papers of Intermediate English Class XI from the Sukkur Board. Only 4 items out of 60 are highly valid according to the experts' feedback, with a score of 0.8 or above on the Content Validity Index. In other words, these four items are completely aligned with the criteria of the National Curriculum Benchmarks (2006) and assess the outcomes required of students by the end of the year.

Table 2: Low Item Content Validity Index, Experts’ Survey Questionnaire

Sr. No.   Questions (Fair items)   I-CVI
1         Item No. 51              0.7
2         Item No. 17              0.7
3         Item No. 34              0.7
4         Item No. 37              0.7
5         Item No. 38              0.7
6         Item No. 43              0.7
7         Item No. 45              0.7
8         Item No. 46              0.7
9         Item No. 48              0.7
10        Item No. 60              0.7
11        Item No. 56              0.7

CVI = Content Validity Index

Table 2 shows the items with low validity in the five-year papers of Intermediate English Class XI from the Sukkur Board. In total, 11 items out of 60 have low validity according to the experts' feedback, with a score of .7 on the Content Validity Index. Although these items are not as valid as the items in Table 1, they do to some extent assess the outcomes required of students by the end of the year as set by the National Curriculum Benchmarks (2006). The statistics suggest that there is a critical need for congruence between the National Curriculum (2006) and the Intermediate English exam papers of the Sukkur Board. After comparing the annual question papers (2014-18) with the National Curriculum Benchmarks, the testing experts found that the Benchmarks were not adequately reflected in the test items. This reveals that the content validity of the Intermediate Class XI annual question papers was compromised.

Discussion

The findings clearly show that the five-year (2014-18) Intermediate English question papers of grade XI of the Sukkur Board are not valid and do not assess what they are supposed to assess in terms of students’ achievement by the end of the year. Only 4 items out of 60 are highly valid, whereas 11 items have low content validity. The rest of the items in the five-year papers are completely invalid according to the findings collected from the subject experts through the survey questionnaire. The situation highlights a wide gap between what is being taught and what is being assessed.
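The item-level figures reported in Tables 1 and 2 can be illustrated with a short sketch. The study does not give its formulas, so the sketch assumes a common convention: each expert rates an item on a 4-point relevance scale, the I-CVI is the proportion of experts who rate it 3 or 4, and kappa is the chance-adjusted ("modified") kappa derived from the I-CVI. The ratings are invented for illustration and the function names are hypothetical. The cut-offs in `classify_cvi` are the ones stated in the Data Analysis Procedure.

```python
from math import comb


def count_relevant(ratings, relevant_min=3):
    """Number of experts whose rating is at or above the 'relevant' threshold."""
    return sum(1 for r in ratings if r >= relevant_min)


def item_cvi(ratings, relevant_min=3):
    """I-CVI: proportion of experts who judged the item relevant."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return count_relevant(ratings, relevant_min) / len(ratings)


def modified_kappa(ratings, relevant_min=3):
    """Chance-adjusted agreement: (I-CVI - Pc) / (1 - Pc).

    Pc is the probability that exactly this many of the n experts would
    agree by chance, assuming each says 'relevant' with probability 0.5.
    """
    i_cvi = item_cvi(ratings, relevant_min)
    n = len(ratings)
    agree = count_relevant(ratings, relevant_min)
    p_chance = comb(n, agree) * 0.5 ** n
    return (i_cvi - p_chance) / (1 - p_chance)


def classify_cvi(i_cvi):
    """Bands as stated in the study: up to .70 invalid, .71-.79 good, .80+ excellent."""
    if i_cvi >= 0.80:
        return "excellent"
    if i_cvi > 0.70:
        return "good"
    return "invalid"


# Invented ratings from ten experts on an assumed 1-4 relevance scale.
ratings = [4, 4, 3, 3, 4, 3, 4, 4, 2, 1]
cvi = item_cvi(ratings)  # 8 of the 10 experts rated the item 3 or 4
print(cvi, classify_cvi(cvi), round(modified_kappa(ratings), 2))
```

Under the stated cut-offs an I-CVI of exactly 0.7 falls in the lowest band, whereas Table 2 labels such items "fair". The sketch follows the stated cut-offs and does not attempt to reconcile the two.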
Teachers and paper designers of the subject are evidently not on the same page, which may leave students in the lurch.

As for the 4 highly valid questions in the five-year English papers, all four seem to focus on higher-level cognitive skills rather than on students’ memory. One example is question number 20: ‘How does Monte Cristo prove that he is justified to take revenge from Count of Morcert?’. This and similar questions require students to understand, analyze and synthesize, and then to write the answer in a well-organized way for the reader's ease. Such answers need not only emphatic or chronological order to explain the incidents step by step but also the proper use of cohesive devices to connect the ideas. Moreover, they need an argument with supporting details of incidents to make them effective and acceptable.

The items with a low Content Validity Index number 11 out of 60, somewhat more than the items with a high Content Validity Index. This may be because the ‘invalid’ band is wide, from .0 to .70, whereas the band between ‘low validity’ and ‘high validity’ is very narrow, from .71 to .79. According to the experts’ evaluation, these low content validity questions do assess what they are supposed to assess, but their degree of validity is somewhat lower than that of the high-validity questions. This may be because the questions do not assess all the Benchmarks, addressing some while leaving others. The findings are similar to those of Siddique (2013), which point to the need to integrate question papers with the set objectives. Most of the questions are quite irrelevant and do not assess what they are supposed to assess in order to reflect the outcomes set by the National Curriculum Benchmarks (2006).

The findings suggest that test designers should be familiar with the procedures for designing papers.
Before setting up annual question papers, the paper designer should do a content analysis of the textbook. Based on the findings, test designers should include elements that closely match the requirements of the National Curriculum Benchmarks (2006). Moreover, the testing expert should conduct continuous seminars on assessment mechanisms in the domains of language testing to both English teachers https://doi.org/10.33195/jll.v5iII.310 https://doi.org/10.33195/jll.v5iII.310 Evaluation of English Language Question Papers for Content Validity at Intermediate Level: 306 UNIVERSITY OF CHITRAL JOURNAL OF LINGUISTICS AND LITERATURE VOL. 5 | ISSUE II | JULY – DEC | 2021 ISSN (E): 2663-1512, ISSN (P): 2617-3611 https://doi.org/10.33195/jll.v5iII.329 and test designers so that test designers may build content valid question papers and teachers can consider the requirements of the National Curriculum (2006). References Ahmed, M. (2012). Influence of regional languages on second language learning at secondary level Bahawalpur (Pakistan). International Journal of Social Sciences & Education, 2(1), 567-575. Akhter, N. (2015). Perceptions of academicians regarding assessment process of distance teacher education courses in Pakistan. Pakistan Journal of Commerce and Social Sciences (PJCSS), 9(1), 248–260. Almanasreh, E., Moles, R., & Chen, T. F. (2019). Evaluation of methods used for estimating content validity. Research in Social and Administrative Pharmacy, 15(2), 214–221. https://doi.org/10.1016/j.sapharm.2018.03.066 Brennan, R. L., on Measurement in Education, N. C., & others. (2006). Educational measurement. Praeger Publishers,. Brown, F. (1980). Perspective on validity. NCME Measurement News, 23(3), 3–4. Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment. Sage publications. Colquitt, J. A., Sabey, T. B., Rodell, J. B., & Hill, E. T. (2019). 
Content validation guidelines: Evaluation criteria for definitional correspondence and definitional distinctiveness. Journal of Applied Psychology, 104(10), 1243. https://doi.org/10.1037/apl0000406 Cronbach, L. J., & Meehl, P. E. (2017). Construct validity in psychological tests. Research Design (pp. 225–238). Routledge. Davis, L. L. (1992). Instrument review: Getting the most from a panel of experts. Applied Nursing Research, 5(4), 194–197. https://doi.org/10.1016/S0897-1897(05)80008-4 Dörnyei, Z., & Taguchi, T. (2009). Questionnaires in second language research: Construction, administration, and processing. Routledge. Fatima, Q., & Ali Akbar, R. (2020). Assessment of English Writing Learning Outcomes of https://doi.org/10.33195/jll.v5iII.310 https://doi.org/10.33195/jll.v5iII.310 https://doi.org/10.1016/j.sapharm.2018.03.066 https://psycnet.apa.org/doi/10.1037/apl0000406 https://doi.org/10.1016/S0897-1897(05)80008-4 Evaluation of English Language Question Papers for Content Validity at Intermediate Level: 307 UNIVERSITY OF CHITRAL JOURNAL OF LINGUISTICS AND LITERATURE VOL. 5 | ISSUE II | JULY – DEC | 2021 ISSN (E): 2663-1512, ISSN (P): 2617-3611 https://doi.org/10.33195/jll.v5iII.329 Students at Secondary School Certificate and Ordinary Level. Bulletin of Education and Research, 42(2), 53-68. Fulcher, G., & Davidson, F. (2007). Language testing and assessment. Routledge. Gipps, C. (2011). Beyond Testing (Classic Edition): Towards a theory of educational assessment. Routledge. Greaney, V., & Hasan, P. (1998). Public examinations in Pakistan: A system in need of reform. Education and the State: Fifty Years of Pakistan, 136–176. Haidar, S., & Fang, F. (2019). Access to English in Pakistan: a source of prestige or a hindrance to success. Asia Pacific Journal of Education, 39(4), 485–500. https://doi.org/10.1080/02188791.2019.1671805 Hughes, A. (2003). Testing for Language Teachers Cambridge: CUP. Hughes, Arthur. (2020). Testing for language teachers. 
Cambridge University Press.
John, S. (2013). Application of Bartlett and Morgan's standards to evaluate the impact of English textbooks at HSSC/A level in Sindh. Hamdard University Press.
Lynn, M. R. (1986). Determination and quantification of content validity. Nursing Research, 35(6), 382–385. https://doi.org/10.1097/00006199-198611000-00017
Martone, A., & Sireci, S. G. (2009). Evaluating alignment between curriculum, assessment, and instruction. Review of Educational Research, 79(4), 1332–1361. https://doi.org/10.3102/0034654309341375
Messick, S. (1996). Validity and washback in language testing. Language Testing, 13(3), 241–256. https://doi.org/10.1177/026553229601300302
Mirza, M. (1999). Examination system and teaching and practice of teachers at Secondary, Higher Secondary and O' Level. Bulletin of Education and Research, 1.
Muhammad, Z. (2016). Pakistani Government Secondary Schools Students' Attitudes towards Communicative Language Teaching and Grammar Translation in Quetta, Balochistan. English Language Teaching, 9(3), 258–270.
Nawani, D., & Goswami, R. (2021). Assessment of Student Learning in South Asia. Handbook of Education Systems in South Asia, 1729.
Retnawati, H. (2016). Proving content validity of self-regulated learning scale (The comparison of Aiken index and expanded Gregory index). REiD (Research and Evaluation in Education), 2(2), 155–164. https://doi.org/10.21831/reid.v2i2.11029
Rogers, W. T. (2010).
Educational psychology 507: The nature of validity. Unpublished manuscript, University of Alberta.
Shah, D., & Afzaal, M. (2004). The examination board as educational change agent: The influence of question choice on selective study. 30th Annual IAEA Conference. Philadelphia, United States of America.
Shah, S. M. H., & Saleem, S. (2010). Factors conducive for the purposeful use of libraries among university's students in Pakistan. International Journal on New Trends in Education and Their Implications, 1(2), 46–57.
Shamim, F. (2008). Trends, issues and challenges in English language education in Pakistan. Asia Pacific Journal of Education, 28(3), 235–249. https://doi.org/10.1080/02188790802267324
Siddiek, A. G. (2018). The impact of test content validity on language teaching and learning. Available at SSRN 3180269. http://dx.doi.org/10.2139/ssrn.3180269
Siddique, N. (2013). Evaluation of the assessment criteria of English language at higher secondary level in Pakistan. Evaluation, 5(4), 12–25.
Stoynoff, S. (2009). Recent developments in language assessment and the case of four large-scale tests of ESOL ability. Language Teaching, 42(1), 1–40. https://doi.org/10.1017/S0261444808005399
Vakili, M. M., & Jahangiri, N. (2018). Content validity and reliability of the measurement tools in educational, behavioral, and health sciences research. Journal of Medical Education Development, 10(28), 106–118. https://doi.org/10.29252/edcj.10.28.106
Warsi, J. (2004). Conditions under which English is taught in Pakistan: An applied linguistic perspective. Sarid Journal, 1(1), 1–9.
Xi, X. (2021). Validity and the automated scoring of performance tests. In The Routledge handbook of language testing (pp. 513–529). Routledge.
Zamanzadeh, V., Ghahramanian, A., Rassouli, M., Abbaszadeh, A., Alavi-Majd, H., & Nikanfar, A. R. (2015). Design and implementation content validity study: Development of an instrument for measuring patient-centered communication. Journal of Caring Sciences, 4(2), 165.

© 2021 by the author. Licensee University of Chitral, Journal of Linguistics & Literature, Pakistan. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) (http://creativecommons.org/licenses/by/4.0/).