Self- and Teacher-Assessment in an EFL Writing Class

Sasan Baleghizadeh and Tahereh Hajizadeh
Shahid Beheshti University, G.C., Allameh Tabataba’i University, Iran
sasanbaleghizadeh@yahoo.com, t.hajizade@yahoo.com

Received: October 4, 2013 / Accepted: April 18, 2014
GiST Education and Learning Research Journal. ISSN 1692-5777. No. 8 (January - June) 2014. pp. 99-117.

Abstract

The present study investigated how fifteen Iranian EFL learners developed the ability to self-assess their writing through having access to the rater’s scores. The participants were supervised for four weeks as they went through their first experience in self-assessment. They were provided with a detailed evaluation sheet for assessing their work, and after each self-evaluation they had access to the teacher-assigned scores. The results indicated that students’ self-assessment was highly correlated with the teacher-assessment throughout the study. It was also shown that the learners assessed different components of their writing in a manner comparable to that of the teacher. The findings confirmed that self-assessment could not only be viewed as a useful tool for evaluating learners’ performance but also be regarded as an efficient instrument for developing their writing skill.

Keywords: self-assessment, teacher assessment, writing

Introduction

Despite the numerous advantages of analytic scoring of productive language skills such as speaking and writing, there are still teachers who, due to time constraints and work pressure, prefer holistic scoring of their learners’ performance through summative assessment. When applied to evaluating students’ written compositions, this approach can result in potentially biased evaluation because more often than not teachers do not have clear criteria for marking the papers. Even worse, most of them tend to correct all grammatical and spelling errors with red pens which, as Peñaflorida (2002) aptly put it, “bleeds students’ papers to death” (p. 345).

In order to overcome the limitations of summative assessment, particularly when it is done holistically, alternative assessments or “alternatives in assessment” (Brown & Hudson, 1998, p. 57) have become the common practice of many teachers around the world. One such alternative is self-assessment, which is in line with current learner-centered education (Brown, 2001) and can help students become independent learners (Blanche & Merino, 1989) and, in turn, lessen the burden on teachers (Oscarson, 1989).

Literature Review

While self-assessment came hand in hand with learner-centered approaches such as the communicative approach, it was not until the late 1990s that the focus of language teaching shifted to promoting learner autonomy. According to Kumaravadivalu (1994), learner autonomy involves helping students to learn on their own, raising their awareness of their learning strategies, and encouraging them to self-direct their own assessment. This approach to assessment no longer places the teacher at the center of the evaluation process; rather, it prompts learners to take responsibility for assessing their own performance (Oscarson, 1989), paving the way for their improvement through reflection and action. In this way, assessment would play a positive role in the learners’ learning process (Roberts, 2006) and would help to increase their autonomy (Cresswell, 2000).

Brown and Hudson (1998) have argued that self-assessment is a type of “personal-response assessment” (p. 63) and define it as a kind of assessment that “require(s) students to rate their own language” (p. 65). Upshur (as cited in Heilenman, 1990) was one of the first to support the use of this kind of assessment in the measurement of second language abilities, since he believed that only learners know how successfully they can use the language.

Many advantages of self-assessment, such as speed, direct involvement of learners, encouragement of autonomous learning (Brown & Hudson, 1998), and the possibility of enlarging the domain of language behavior sampled without substantially increasing the time and cost involved (LeBlanc & Painchaud, 1985), have been discussed extensively in the literature. For example, Bachman and Palmer (1989) argued that “self-ratings can be reliable and valid measures of communicative language abilities” (p. 22).
Likewise, Huerta-Macias (as cited in Brown & Hudson, 1998) has claimed that “alternative assessment (not just self-assessment) consists of valid and reliable procedures that avoid problems inherent in traditional testing including norming, linguistic, and cultural biases” (p. 55). Thus, students can gain more insight into their strengths and weaknesses as writers if they monitor their own writing tasks (Myers, 2001).

Despite the foregoing advantages, not all research studies have confirmed the usefulness of this sort of evaluation. For example, Blanche and Merino (1989), who summarized major findings in self-assessment, argued that the accuracy of most students’ self-estimates often varied depending on the linguistic skills and materials involved in the evaluations. Similarly, Davidson and Henning (1985) indicated that although classical reliability estimates of students’ self-ratings might be reasonably high, “little confidence should be placed in these particular student self-ratings” (p. 176). This is mostly because students might not be well trained in self-rating. Moreover, Heilenman (1990), who investigated the role of response effects (tendencies to respond to factors other than item content) in the self-assessment of second language ability, found that both “acquiescence effects” and “overestimation effects” were present (p. 188). In particular, Heilenman found that those students who had been learning English for two years or more were more likely to overestimate their performance. On the other hand, Matsuno (2009) found that Japanese EFL learners underestimated their performance when they were asked to self-assess their own writing, which was particularly true for high-achieving students.
Hence, Matsuno (2009) concluded that “self-assessment was somehow idiosyncratic and therefore of limited utility as a part of formal assessment” (p. 75). Even when the criteria for assessment were set, the participants could not judge their performance in a manner comparable to that of the teachers (Patri, 2002).

In addition to the studies conducted to show whether self-assessment is a useful evaluation tool or not, many studies have been carried out to find out how teachers can help students become better evaluators of their own performance. For example, Roberts (2006) maintained that “in order to have higher correlation between self-assessment and teacher-assessment, we need to provide learners with guidance” (p. 3). Moreover, the result of Jafarpur and Yamini’s (1995) study showed that training with self-assessment questionnaires could improve learners’ ability to estimate their own language ability. Although students’ self-assessments have shown great improvement over time even without direct instruction (Chen, 2008), a number of researchers such as Oscarson (1989) emphasize that students do need training to improve their self-assessment.

The writing skill seems to be a good area for investigation when it comes to assessment either by teachers or by students themselves. This is mainly because even if raters assign the same score to a piece of writing, they might have arrived at it based on different criteria (Connor-Linton, 1995; Hamp-Lyons, 1995). Furthermore, even native English speakers may not have the same reaction as non-natives toward the same written paper. For example, the results of Khalil’s (1985) study revealed that native speakers found semantically deviant utterances more problematic than grammatically deviant ones.
Native raters proved to be stricter than non-native raters (Kobayashi, 1992), and they gave lower ratings to content than language (Santos, 1988). Even researchers like Brown (1991), who found no statistically significant difference between the ratings given by English and EFL raters, admitted that these two groups arrived at these scores from different perspectives, suggesting that native English speakers focused more on cohesion when assessing a piece of writing, whereas their non-native counterparts attended more to organization.

In order to overcome the problem of subjectivity in holistic assessment of writing, analytic scoring has been proposed as an alternative (Heaton, 1988). According to Stiggins, Richard, Nancy, and Bridgeford (as cited in Perkins, 1983), “Holistic scoring calls for the reader to rate overall writing proficiency on a single rating scale, (but) analytic scoring breaks performance down into component parts (e.g. organization, wording, idea)” (p. 652). Bacha (2001) confirmed that adopting either of these techniques depended on the purpose of writing in EFL programs. She maintained that to provide learners with more specific feedback, analytic scoring would be more appropriate since holistic scoring tends to be highly subjective and lacks internal consistency due to shifting standards (Perkins, 1983). Other scholars such as Hamp-Lyons (1995) have pointed out that “holistic scoring system is a closed system” (p. 760) and no one can have access to points for different parts since the raters do not have a certain criterion for scoring a piece of writing. Cumming (1990), who discussed biases in holistic evaluations of ESL writings, claimed that “analytic scales may have the advantage of drawing raters’ attention to specific aspects of students’ composition” (p. 42). Studies have indicated that high inter-rater reliability has been obtained from analytic scoring (Bachman, 1990; Jacobs, Zingraf, Wormuth, Hartfiel, & Hughey, 1981; Perkins, 1983).
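
The contrast between holistic and analytic scoring described above can be made concrete with a small sketch. It is only an illustration: the component names are borrowed from the example in the quotation (organization, wording, idea), and the 10-point maximum per component is a hypothetical choice, not a scale used in any of the studies cited. The sketch shows two raters whose overall scores are identical even though their component judgments differ, which is the kind of information a single holistic score hides.

```python
from dataclasses import dataclass

# Hypothetical components and maximum points, chosen only for illustration;
# they are not the scales used in any study cited in this paper.
MAX_POINTS = {"organization": 10, "wording": 10, "idea": 10}


@dataclass(frozen=True)
class AnalyticRating:
    """One rater's scores for a single composition, one score per component."""
    scores: dict

    def __post_init__(self):
        missing = set(MAX_POINTS) - set(self.scores)
        if missing:
            raise ValueError(f"missing components: {sorted(missing)}")
        for name, value in self.scores.items():
            if name not in MAX_POINTS:
                raise ValueError(f"unknown component: {name}")
            if not 0 <= value <= MAX_POINTS[name]:
                raise ValueError(f"{name} score out of range: {value}")

    def total(self):
        """Collapse the analytic profile into one overall score."""
        return sum(self.scores.values())


def component_gaps(a, b):
    """Per-component difference (a minus b) between two ratings of one paper."""
    return {name: a.scores[name] - b.scores[name] for name in MAX_POINTS}


# Two hypothetical raters scoring the same paper.
rater_a = AnalyticRating({"organization": 8, "wording": 5, "idea": 7})
rater_b = AnalyticRating({"organization": 5, "wording": 8, "idea": 7})

print(rater_a.total(), rater_b.total())   # the overall scores agree
print(component_gaps(rater_a, rater_b))   # the component judgments do not
```

Keeping the component scores separate, rather than storing only the total, is what makes it possible to compare raters criterion by criterion.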

Apart from having clear criteria for assessment, all raters need training in assessment. Lumley and McNamara (1995) argued that even teachers who wanted to act as raters needed some training courses to make them internally consistent or “self-consistent” (p. 57). Jacobs et al. (1981) argued that such guidance helped to neutralize the differences in raters’ judgments related to their backgrounds. Taking the above-mentioned points into account, it is evident that training is essential, particularly for learners who want to practice self-assessment for the first time.

Methodology

Research Design

As mentioned previously, the results obtained about the efficiency of self-assessment are inconclusive. Given the fact that very little research has been done in this respect in the Iranian context, this point needs further exploration since cultural backgrounds of the participants could affect self-assessment results (Blanche & Merino, 1989).

In the present study, the Pearson product-moment correlation coefficient was used to investigate the relationship between self-assessment and teacher-assessment of students’ writing. It was hoped that students’ self-assessment would improve over time. In other words, it was predicted that during the first cycle of assessment, the correlation between the teacher-assessment and student self-assessment would be relatively low, but that students would gradually learn how to assess themselves and their assessment of their own writing would become a better approximation of the teacher’s assessment. Hence, the correlation would be greater in the last cycle of assessment. To this end, the following research questions guided the study:

1. Is there a statistically significant relationship between teacher-assessment and learners’ self-assessment in each cycle of assessment?
2. Does learners’ self-assessment improve over time (as they have access to teacher-assigned scores)?
3. In the last cycle of assessment, do learners assess different components of their writing in the same way as the teacher did?
4. Does self-assessment lead to an improvement in learners’ writing?

Participants

Twenty Iranian female EFL learners at the upper-intermediate level of English language proficiency with an average TOEFL score of 550 participated in this study. They had an average age of 20 and were all taking a TOEFL preparation course at a private English language school in Tehran, Iran. Since the study lasted for several weeks, not all the participants were able to attend all the assessment sessions. Therefore, the researchers decided to include only the individuals who took part in all sessions, as a result of which the number of participants was reduced to fifteen.

In addition to the learners, a native Iranian teacher (the second researcher of the present study) with five years of experience in teaching English as a foreign language was in charge of teaching paragraph writing to the students and rating their written assignments. Inasmuch as analytic scoring offers high inter-rater and intra-rater reliability, the rater assessed each paper only once.

Data Collection Instruments

In this study both the teacher and the students assessed the writings through a detailed evaluation sheet (see Appendix A). This evaluation sheet with five subscales (see Table 1) was taken from Jacobs et al. (cited in Bacha, 2001). However, the researchers thought that students would need to know how these five components (content, organization, vocabulary, language, and mechanics) would be broken down into smaller sub-scales while scoring their papers. To this end, following Matsuno (2009), these five criteria were clearly defined.
Thus, for example, it became clear that content refers to sub-scales such as the amount of writing, the development of the topic, and the relevance of the students’ writing to the assigned topic. Similarly, organization refers to the opening, supporting sentences, closing, and logical sequence of ideas in writing (see Table 1 for further details).

Table 1. Jacobs et al.’s Composition Profile

Using this detailed evaluation sheet yielded interval-scale scores, which paved the way for using the Pearson correlation coefficient. Working with this checklist was quite easy for the participants because instead of a main category like content, they had access to the subcategories, which helped them in their assessment.

The second instrument was a questionnaire developed by the researchers (Appendix B). The questions were written in simple language to ensure they would be easily comprehended. The main goals behind utilizing this questionnaire were to elicit the learners’ attitude toward self-assessment, their beliefs about the usefulness of this kind of assessment, and their ideas on how they had self-assessed their performance. This instrument was primarily used to save time, since obtaining the same amount of information through interviewing all the participants would be rather time-consuming.

The last instrument used was a semi-structured interview that the researchers conducted with a few of the participants. Although the research was mainly quantitative, it was felt that the use of an interview might shed more light on some unclear points, such as the cases of participants who would either overestimate or underestimate their performance.

The study was conducted at a private language institute and lasted for one month. Although the participants were upper-intermediate learners, some of them had problems in developing a paragraph.
Therefore, their teacher (the second researcher) devoted the first session to providing them with some instruction on appropriate length, format, content, and organization of a paragraph. Then, during the second session, the participants were introduced to the evaluation sheet (Appendix A). The second researcher explained what each category and its related subcategories meant and made sure students understood what they were expected to do. After learning how to assign scores to different components, they were given a topic to write about on the same day and were asked to evaluate their writing two or three days later, trying to be as objective as possible. This interval would help the participants to detach themselves from their writing and enable them to be more critical of it. It is worth mentioning that the topics were related to what they had studied during the week, and an attempt was made to take the participants’ interests into consideration in the process of topic selection.

After each writing and self-assessment, the second researcher, who was both the teacher and the rater, collected the papers and returned them the next session along with her evaluation (she used the same evaluation sheet for assigning scores). She did not write any comments either in the margin of the papers or on the evaluation sheets. The participants were supposed to figure out for themselves why they had received a particular score. Furthermore, the second researcher asked them to read their writing one more time and reflect on the scores assigned by the rater. Then, the participants were given back their papers, which had been evaluated twice (once by the learner and once by the rater), so that they had a chance to compare their self-assessments with the teacher-assessment and to find out whether they had overestimated or underestimated their performance.
This cycle of writing, self-assessment, and rater’s assessment lasted for four weeks, during which the participants wrote about four different topics.

Data Analysis and Interpretation

The strength of the relationship between each self-assessment and the teacher-assessment was calculated using the Pearson product-moment correlation coefficient. Since the study consisted of four cycles of self-assessment and teacher-assessment, the results of four correlations are included in the paper. The correlations were used in order to find out whether there was a relationship between the two types of assessment and, if so, whether familiarity with the teacher’s assessment had positively affected the learners’ self-assessments.

A matched t-test was used to compare the learners’ first writings with the last ones. The result of this t-test could be used as an indicator of whether self-assessment had helped improve learners’ writing ability or not.

The means and standard deviations of different components of learners’ last writing were compared. The purpose of this comparison was to find out if there was a difference between the way the rater and the students assessed the papers.

Results

The Correlation

Table 2 shows the results of the four correlations. The magnitude of these correlations is larger than the critical value (p < .01). This indicates that the obtained results are statistically significant.

Table 2. Correlation between self-assessment and teacher-assessment in the four cycles of writing

Before discussing the result of these correlations at length, it is worth showing the scatter plots of the four sets of scores (Figure 1), since they make the interpretation of the results much easier.

Figure 1. The scatter plots of four sets of scores obtained from self-assessment (horizontal axes) and teacher-assessment (vertical axes)

The scatter plots clearly show that there is a positive correlation between self-assessment and teacher-assessment. The magnitude of this correlation remained constant (r = .63) during the first and second cycles of assessment. This might have been due to the fact that the experience was new to the students. As the first and second scatter plots reveal, there were some outliers that affected the results of the correlations to a great extent. All of these outliers overestimated their writing performance.

The third scatter plot indicates a significant change in the way the participants assessed their writing; here we had only one outlier who overestimated her performance, though the majority of the participants reported scores which were quite close to those of the rater. There were also four participants who underestimated their performance. Although the correlation was not a perfect positive one (r = .71), the improvement of the learners’ self-assessment cannot be overlooked.

The result of the fourth cycle of assessment was quite unexpected (r = .91); the participants seemed to have developed an excellent skill in assessing their writing. The researchers assume that this great improvement might have been due to the fact that the second researcher asked the students to have a second meticulous look at their third papers and compare the scores they had given to themselves with those that the rater had assigned them. The motivation for such a decision came from the second researcher’s observation. She realized that during the first two sessions, the participants took a quick look at the rater’s scores and returned the papers in a minute or two.
The extra time allocation evidently had a great effect on raising the learners’ awareness and consequently their self-assessment. This is rather surprising as the learners were not provided with any direct explanation or instruction.

The means and standard deviations of the scores assigned by learners and the rater to different components of writing are displayed in Table 3. A comparison of these scores can provide us with the answer to the third research question. The scores that the participants gave to different parts of their last writing were quite similar to those assigned by the teacher. Thus, we can conclude that in the last cycle of assessment, there was not only a high correlation between teacher-assessment and students’ self-assessment, but also a great similarity in the way learners and the rater assessed different components of writing.

Table 3. The means and standard deviations of the scores assigned by learners and the rater to different components of writing

The Questionnaire

After evaluating the last writing, the participants were asked to complete the questionnaire (Appendix B). The answers shed more light on the processes which students went through in order to evaluate their performance. It was revealed that the experience of self-assessment was quite new to all the participants. Some of them claimed that they self-evaluated their performance in most of the tests, and their obtained scores had been close to what they had expected. This was particularly true of the participants who claimed to have been objective in their self-assessment.

The learners’ attitudes toward the experiment varied. While all of them agreed that self-assessment was quite motivating, some of them found it a bit hard. Most of them believed that they were quite objective in their self-assessment.
Many of them found the rater’s score fair and indicated that having access to these scores had helped them a lot in their evaluations. Most of the participants claimed that they had learned a great deal about self-assessment during the experiment. Nevertheless, some of them maintained that teacher-assessment is more beneficial for the following reasons:

• When we learn a new grammar point and want to use it in our writing(s) we need someone to correct our mistake(s).
• My teacher is more knowledgeable and can correct my paper better.
• I’m still a student. I cannot judge my writing.

Others believed that when they evaluated their work objectively, they could gain a clearer picture of their strengths and weaknesses:

• I think it is better to check my writing myself, because I can realize my problems better.
• Self-assessment helps me to find out my problems myself.

Almost all the participants argued that self-assessment could be more useful when it comes hand in hand with teacher-assessment. The researchers suppose that this attitude stems from the way Iranian students have been treated in schools. They have learned to regard teachers as authority figures, and they respect them as people who are capable of judging their performance.

The interesting point was that most of the participants found having access to the rater’s scores more useful than having direct training in self-assessment. They believed that participating in this experiment had not only improved their self-assessment abilities but had also helped them develop their writing skill. This claim is supported by the result of a matched t-test (see Table 4).

Table 4. Matched t-test on improvement of students’ writing after self-assessment

The Interview

As mentioned earlier, a semi-structured interview was carried out with the participants who had influenced the result of the correlations in one way or the other, i.e., those who had either overestimated or underestimated their performance. One of the participants, who had assigned high scores to her writing, claimed that she did not pay much attention to the accuracy of her self-assessment and all she wanted was improvement of her writing skill. Her goal was to receive more or less the same scores from the rater that she had given to her own writing. She thought assigning higher scores to her writing meant that her writing was actually improving.

Another learner, who had underestimated her performance, believed that she had never been good at writing. She was quite modest and maintained that the rater had been quite generous; otherwise, her scores could not have been so high.

The most interesting part of the interview was talking to the learner whose self-assessment scores were always quite close to those of the rater. She said that she had tried to be quite objective in her self-assessment. She also told the second researcher that whenever she wanted to assess her writing she imagined that it was someone else’s paper, not hers.

Conclusions

This paper sought to take a closer look at self-assessment in the Iranian context. The study confirmed that there is a statistically significant positive correlation between teacher-assessment and self-assessment. This finding can be viewed as confirmation of the research carried out in different EFL contexts.
However, the main contribution of this study to the literature lies in showing that training learners simply through introducing them to checklists is not the only possible way to improve their ability to self-assess their performance. Unlike previous studies, this research took students through four cycles of writing, self-assessment, and teacher-assessment. After each self-assessment, the learners were provided with the scores that the teacher assigned to different components of their writing. The comparison that students made between their self-assessment and the teacher-assessment had a positive effect on the way they self-evaluated their subsequent writings.

The results of the correlational analyses confirmed the improvement in learners’ self-assessment. The magnitude of the correlation between self-assessment and teacher-assessment remained constant during the first two cycles of assessment, when the experience of self-assessment was still quite novel to the participants. However, the obtained correlation rose in the third cycle of assessment, since the learners had a better idea of what self-assessment was about and how they could assess their writing in the same way that the rater did. Assessing their last writing was much easier for the learners, since having access to the teacher-assigned scores for their previous writings had taught them how to be objective and critical toward their own work, so the last self-assessment was the closest to the teacher-assessment (r = .91). Consequently, it can be claimed that there is a direct relationship between training students in self-assessment, which is done by providing the learners with their teacher-assessed papers, and the accuracy of their self-assessment scores. That is to say, the more the students ponder the scores assigned by the rater, the better they tend to assess their subsequent writings.
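
The two statistics this study relies on, the Pearson product-moment correlation and the matched t-test, can be sketched in a few lines. The sketch below is a minimal illustration using only the Python standard library; the function names and all score lists are hypothetical and are not the study’s data. It computes r for one cycle of self- and teacher-assigned scores, converts r into a t statistic with n - 2 degrees of freedom, and computes a paired t statistic for first versus last writings.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length lists with at least 3 scores")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic (df = n - 2) for testing whether a correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def paired_t(first, last):
    """Matched (paired) t statistic, df = n - 1, on last-minus-first differences."""
    if len(first) != len(last) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    diffs = [b - a for a, b in zip(first, last)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical scores for five learners in one assessment cycle.
self_scores = [62, 75, 80, 68, 90]
teacher_scores = [60, 70, 82, 65, 85]
r = pearson_r(self_scores, teacher_scores)
print(f"r = {r:.2f}, t = {r_to_t(r, len(self_scores)):.2f}")

# Hypothetical teacher scores for the same learners' first and last writings.
first_writing = [60, 70, 82, 65, 85]
last_writing = [66, 74, 85, 70, 88]
print(f"paired t = {paired_t(first_writing, last_writing):.2f}")
```

Applied to a correlation of r = .63 with fifteen participants, `r_to_t` gives a t of roughly 2.9 on 13 degrees of freedom, which would then be compared with the tabled critical value for the chosen significance level and for a one- or two-tailed test.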
Another finding drawn from the students’ writings was that their writing skill improved significantly toward the end of the experiment. This improvement was not only noticed by the students themselves but was also supported statistically: the matched t-test indicated a statistically significant change in the learners’ writing ability.

The last, but by no means least, finding was that the participants evaluated their paragraphs in almost the same way that the teacher (rater) did. The scores that students assigned to the different components of their last writing were quite close to those assigned by the teacher.

This study confirmed the effectiveness of self-assessment, as a kind of alternative assessment, in the context of Iran. The learners’ assessment of their own writings not only correlated with that of the rater but also improved significantly over the course of the study. Nevertheless, the findings should be treated with caution, since the research was carried out with only fifteen participants, all from the same institute. Further research is therefore needed before we can be confident about the beneficial effects of training students in self-assessment.

Authors

*Sasan Baleghizadeh is Associate Professor of TEFL at Shahid Beheshti University (G.C.) in Tehran, Iran, where he teaches courses in applied linguistics, syllabus design, and materials development.
He is interested in investigating the role of interaction in English language teaching and issues related to materials development. His published articles appear in many international journals including TESL Reporter, TESL Canada Journal, ELT Journal, and Language Learning Journal.

*Tahereh Hajizadeh holds an M.A. degree in TEFL from Allameh Tabataba’i University in Tehran, Iran. She is an experienced EFL teacher and is interested in issues related to classroom assessment.

Appendix A. Evaluation Sheet

1. Content (1-10)
   Amount
   Development of the topic
   Relevance to the topic
2. Organization (1-5)
   Opening
   Supporting sentences
   Closing
   Logical sequencing
3. Vocabulary (1-10)
   Range
   Word form/word choice
4. Language (1-10)
   Grammar
   Use of variety of structures
5. Mechanics (1-5)
   Spelling
   Punctuation

Appendix B. The Questionnaire

1. How long have you been learning English?
2. Have you ever tried to evaluate your performance in the writing part of an exam? If yes, was your estimation close to the mark you received?
3. Was this experiment new to you? If not, when and how did you experience something like this?
4. How did you feel when you had to evaluate your own works?
5. Did you try to be objective in your self-assessments? If yes, what did you do?
6. Did you think that the scores that the rater assigned to your works were fair? Why?
7. Do you think that comparing your self-assessments with teacher-assessments helped you evaluate your writings better? If yes, how?
8. After this experiment, do you like to be given the chance to assess your writings yourself? Why?
9. Do you think that self-assessment can help you learn better? Why?
10. Do you think that students need training for self-assessment? Why?
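As an illustration of how the evaluation sheet in Appendix A could be applied consistently, the sketch below encodes the five components and their score ranges and compares a self-assessed sheet with a teacher-assessed one. It assumes that component scores are simply summed into a total (a maximum of 40), which the sheet itself does not state. The function names and the two sample sheets are hypothetical.

```python
# Maximum points per component, as listed on the evaluation sheet in Appendix A.
MAX_POINTS = {
    "content": 10,
    "organization": 5,
    "vocabulary": 10,
    "language": 10,
    "mechanics": 5,
}
MIN_POINTS = 1  # every component range on the sheet starts at 1


def total_score(sheet):
    """Validate one completed sheet and return the sum of its component scores.

    Summing the components is an assumption made for this sketch.
    """
    if set(sheet) != set(MAX_POINTS):
        raise ValueError("sheet must score exactly the five listed components")
    for name, score in sheet.items():
        if not MIN_POINTS <= score <= MAX_POINTS[name]:
            raise ValueError(f"{name} score {score} is outside its allowed range")
    return sum(sheet.values())


def component_gaps(self_sheet, teacher_sheet):
    """Self score minus teacher score per component (positive = overestimation)."""
    total_score(self_sheet)      # validation only
    total_score(teacher_sheet)   # validation only
    return {name: self_sheet[name] - teacher_sheet[name] for name in MAX_POINTS}


# Hypothetical sheets for one paragraph (NOT the study's data).
self_sheet = {"content": 8, "organization": 4, "vocabulary": 7,
              "language": 8, "mechanics": 5}
teacher_sheet = {"content": 7, "organization": 4, "vocabulary": 8,
                 "language": 7, "mechanics": 4}

print("self total:", total_score(self_sheet))
print("teacher total:", total_score(teacher_sheet))
print("gaps:", component_gaps(self_sheet, teacher_sheet))
```

Reporting the gap per component, rather than only the totals, mirrors the study's observation that learners came to score the individual components much as the teacher did.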