Language Value 

http://www.e-revistes.uji.es/languagevalue 

November 2018, Volume 10, Number 1 pp. 67-88 

ISSN 1989-7103 

 Articles are copyrighted by their respective authors 

DOI: http://dx.doi.org/10.6035/LanguageV.2018.10.5 
67 

Teacher’s feedback vs. computer-generated feedback: A focus on articles

Tamara Hernández Puertas        

tamaraeoi@gmail.com 
Escuela Oficial de Idiomas (Castellón), Spain 

ABSTRACT 

As attested by a vast number of studies, in the process of second/foreign language acquisition feedback 

plays an important role as it may trigger learners’ noticing of the mismatch between their interlanguage 

and the target language (Schmidt 1990). In foreign language classrooms, feedback on written production 

may not be properly provided due to a large number of students or time constraints (Chacón-Beltrán 

2017). In this sense, the use of new technologies in the classroom may help both the teacher in the 

correction process and the student in his/her language development. In the present study we aim to 

compare feedback provided by the teacher and feedback provided by the software Grammar Checker 

(Lawley 2015). One group of English-as-a-foreign-language (EFL) students received teacher’s feedback 

on their mistakes on articles in their written production whereas a second group obtained feedback on the 

same grammar aspect by means of the above-mentioned software. The control group did not obtain 

feedback on their errors. Results show statistically significant differences in the last composition for the 

group who received teacher’s feedback, although this feedback did not have a lasting effect in the tailor-

made delayed test. In light of these findings, we may claim that the use of Grammar Checker as a 

potential tool for self-correction and feedback may facilitate students’ language development, at least on 

the grammar aspect under analysis. 

Keywords: corrective feedback, teacher’s feedback, computer-generated feedback, writing, articles, 

errors 

I. INTRODUCTION 

Second language acquisition (SLA) is a complex process involving multiple variables 

along with natural elements such as errors, which should be regarded as part of the 

language learning process and not as something negative that has to be avoided. By 

means of errors, learners may test their hypotheses about how the target language works 

and teachers obtain information about learners’ progress and difficulties in their 

development. Traditionally, teachers (and sometimes, peers) have provided correction in 

the formal context in various ways to help learners overcome their errors (both oral and 

written) and further their learning. The issue of whether mistakes should be corrected, 

when and how, among other questions, has fuelled much research, together with the 

elaboration of different typologies accounting for corrective feedback (CF) types, 

ranging from most indirect to most direct. However, there seems to be some agreement 

on the fact that, although learners demand it, providing CF is a complex task.

Corrective feedback for oral mistakes may be obtrusive and thus interrupt the flow 

of conversation. In turn, CF for written errors may take much of the teacher’s time and 

sometimes it is only provided superficially. 

Over the past two decades, there have been efforts to develop software which aids in the 

process of student writing along with some other software which provides a score on 

students’ written production. The focus of this study is on the former, that is, we aim at 

contributing to the expanding body of research on computer-generated feedback in an 

attempt to examine whether this type of feedback has an impact on students’ linguistic 

accuracy when compared to teacher’s feedback. With this aim in mind, the software 

Grammar Checker was employed by one group of students as a source of feedback on 

errors, whereas another group obtained teacher’s feedback. 

  
II. CORRECTIVE FEEDBACK AND SLA 

Making mistakes is part of the natural process of learning a language. However, when 

producing output, students may not be aware of how successful they have been at 

conveying their messages if some kind of feedback is not offered. Corrective feedback 

becomes, then, a key factor in the SLA process since mere language exposure does not 

seem to be enough and second language (L2) speakers need some kind of corrective 

feedback to notice the discrepancies between their output and the L2.   

The term corrective feedback (Lyster 1998) has been given different names 

depending on the author: for example, ‘negative evidence’ (Long 1991), ‘interactional 

feedback’ (Lyster and Mori 2006) or ‘negative feedback’ (Ortega 2009). For the 

purposes of the present study, we will adhere to the definition provided by Russell and 

Spada (2006: 134): ‘Corrective feedback will refer to any feedback, provided to a 

learner, from any source, that contains evidence of learner error of language form’. In 

this sense, corrective feedback refers to the teacher’s reaction to a mistake, when this 

reaction draws attention to language forms and has a corrective aim. Much research has 

been carried out on CF, and most has employed different types of CF based on the 

learner’s reaction (i.e., uptake). For instance, Ellis (2009) classified CF types along the 

implicit-explicit dichotomy. Implicit feedback referred to recasts (i.e., reformulation of 

the learner’s incorrect utterance minus the error), repetition and clarification requests, in 

which the learner has to work harder in order to spot the mistake and self-repair. In turn, 

explicit feedback included explicit corrections, metalinguistic explanations, elicitations 

and paralinguistic signals which showed in a more direct way that the learner’s 

production was wrong. 

Although the effectiveness of CF on acquisition is a debatable issue, it is regarded as an 

intervening element in the process of SLA. In fact, since the early 1990s, a vast number of 

studies have demonstrated the beneficial role of CF in acquisition. Moreover, some 

meta-analyses and reviews of the literature (for example, Russell and Spada 2006, 

Spada 2011) point to the positive effects of CF for L2 grammar learning and its 

durability over time as long as it is noticeable, comprehensible and as individualized as 

possible. 

 
II.1. The effect of corrective feedback on written production 

In the current multimedia age, different modes of writing and image combine to make 

multimodal texts which communicate meanings and may be used for language learning. 

Images (including the use of colors) play an essential role in multimodal 

communication as attention-getters (Kress 2010), therefore maximizing the potential for 

learning. In this sense, a crucial condition for the effectiveness of CF is that the student 

notices the input features and the differences between his/her interlanguage and the 

target language forms. The notion of noticing was coined by Schmidt (1990) and 

supported by other researchers (e.g., Mackey et al. 2000, Philp 2003) as one of the 

crucial elements necessary for acquisition to take place, in the sense that noticing is 

essential for input to become intake. Intake has been defined by Ellis (1994: 708) as 

‘that portion of input that learners notice and therefore take into temporary memory’. 

Learners may notice input thanks to the CF provided to them in the language classroom. 

Indeed, research has shown that CF does occur in the classroom in a high proportion 

(e.g., Panova and Lyster 2002) as an intervening variable in the process of language 

learning. The benefits of CF in oral interaction point to learners’ noticing of problematic 

forms, opportunities to modify output and test hypotheses, and an increase in linguistic 

accuracy. Yet, the debate about the value of written corrective feedback (WCF, 

henceforth) has yielded conflicting results (Evans et al. 2010). For instance, in a much-cited 

study, Truscott (1996) argued that ‘correction is not only unhelpful but even 

counterproductive’ (1996: 354). In the same vein, Polio et al. (1998) and Fazio (2001) 

stated that CF can be discouraging and ineffective in improving subsequent writing due 

to the pressure it may create on learners. However, broadly speaking, research has found 

a beneficial effect of CF on writing accuracy (e.g., Bitchener 2008, Lee 2013). More 

specifically, Bitchener and Ferris (2012) claim that students' accuracy improves when 

they attend to feedback, as it draws their attention to linguistic inconsistencies or 

mistakes. Moreover, for ethical reasons, learners need to be provided with CF on their 

written production, all the more so since it has been shown that students want to improve 

their linguistic accuracy (Ferris and Hedgcock 2005) and that they expect to have their 

writing mistakes marked (Guénette 2007). 

Feedback may be delivered in a more direct (explicit) or indirect (implicit) way. Direct 

feedback is offered when the teacher provides the correct form straight away and the 

student is supposed to incorporate that correction in the final version. By contrast, in 

indirect feedback the teacher merely indicates in some way (underlining or highlighting 

the error, or marking in the margin of the text) that there is an error, without providing 

the correction. Thus, the student knows there is a mistake and he/she has to solve it. In 

this sense, some voices have claimed that indirect feedback is more desirable because it 

may engage students in problem solving and, eventually, in more progress in accuracy 

over time than direct feedback (Ferris et al. 2000). Different degrees of explicitness in 

feedback provision were examined in Ferris and Roberts’ (2001) study: Group A had 

their errors underlined and coded, Group B had their errors underlined but not coded 

and Group C (control group) had no error markings. No statistically significant 

differences were reported between Group A and B, suggesting that more explicit 

feedback (underlining and coding of errors) was not more advantageous than simple 

underlining. 

Some research has addressed the impact of different types of feedback on accuracy in 

student writing. Chandler (2003) had four treatments including (i) Correction, (ii) 

Underlining with Description, (iii) Description of error only, and (iv) Underline. 

Findings show that conditions (i) and (iv) resulted in more accurate pieces of writing in 

the next assignment, whereas treatments (ii) and (iii), which involved a description of 

the error type, had the opposite effect. Overall, the studies which have 

addressed the effectiveness of direct and indirect WCF show inconclusive findings. 

However, there seems to be a wider consensus on the fact that if feedback is provided, 

learners’ accuracy tends to improve when compared to control groups receiving no 

feedback, as reported by Ene and Upton (2014). 

 
II.2. Computer-generated feedback in writing 

When to provide feedback has been one of the main concerns in the field of language 

correction and feedback. Warschauer (2010) claimed that autonomous learning and 

revision could be enhanced by promptly delivered feedback. Indeed, when little time 

lapses between the student’s writing and the teacher’s CF, learning opportunities may 

be maximized. Along the same lines, Guichon et al. (2012) argued that if learners can get 

‘just in time’ feedback, they may self-correct almost immediately after their mistakes 

and possibly incorporate this feedback in subsequent writings. In this way, written CF 

may be more effective, since in traditional classrooms feedback is only provided by the 

teacher several days after the written production. 

As stated by Spada (2011), corrective feedback occurs both in natural learning contexts 

and in formal environments, although it is more frequent and presumably more 

beneficial and necessary in the latter. Yet, in large classes in which the students are 

required to perform written tasks, teachers need to lessen their workload by delegating 

work to their students, who may use electronic feedback to self-correct their written 

productions (Lee 2013). Therefore, more time could be devoted to other areas which 

need more attention in writing, such as content and organization (Chen and Cheng 

2008). In this sense, and especially in the education domain, the importance of 

technology and the benefits it may bring to the learning process are reflected in its 

growing presence in classrooms at all levels. The use of computer tools, known as 

‘computer-assisted language learning’ (CALL), applied to the classroom and the 

students' way of working represents an extra value and motivation. In fact, as Becker 

stated (1991: 385), ‘in the 1980s, no single medium of instruction or object of 

instructional attention produced as much excitement in the conduct of elementary and 

secondary education as did the computer.’ CALL is an approach that has many 

advantages: first, it adapts to the students’ learning, letting them control their own 

pace; second, it allows them to be more autonomous, since they are the ones who make 

their own choices; third, it offers them freedom and authenticity; and finally, it develops 

their critical thinking. In this vein, computer-mediated feedback may help 

students write more independently both inside and outside the classroom. Moreover, 

research from Tiene and Luft (2001) suggests that the use of technology fosters 

individualized communication between teacher and students more often and allows 

teachers to focus on higher-order aspects of writing, leaving common grammar or 

spelling mistakes to the program.  

As just stated, new technological implementations in the language classroom have 

influenced the skill of writing, especially the revision and editing processes by means of 

online tools. The interplay of a range of modes on screen (for example, image and 

writing) has resulted in a redesign of how students can receive feedback. As Jewitt put it 

(2002: 172), ‘communication and learning are multimodal’. This multimodality may be 

significant for writing improvement. In this sense, in the past twenty years, software 

aiming at scoring and/or providing feedback on students’ writings has been devised 

(e.g., Criterion, MyAccess, Grammarly, Summary Street, to mention but a few), with 

varying degrees of effectiveness regarding students’ satisfaction (Chen and Cheng 2006). Still, 

some voices (e.g., Ware and Warschauer 2005) claim that the amount of time a teacher 

may spend correcting students’ compositions may be dramatically reduced if teachers 

can rely on computer-generated feedback. Moreover, software which generates 

feedback on writing has been created providing either reports on grammatical errors or 

more holistic assessment on aspects such as content or organization of the piece of 

writing. In the case of grammar checkers, Potter and Fuller (2008) reported an increase 

in students’ motivation, proficiency and confidence in grammar rules through the use of 

English grammar checkers. In turn, Nadasdi and Sinclair (2007) argued that the French 

online grammar checking program BonPatron was as effective as teacher correction. 

Also, Burston (2008) investigated the accuracy of this grammar checker showing that 

88% of errors were spotted by the software. Mistakes were highlighted by means of 

color-coding: red indicated those grammatical aspects the student had to modify and 

orange was used to signal segments or words which needed to be verified. 

Despite the a priori benefits of grammar checkers, they are not without limitations. As 

argued by Davis (1989), any user of grammar checkers has to assess their perceived 

usefulness and ease of use, two key factors in Davis’ Technology Acceptance Model 

(TAM) determining the likelihood of acceptance of new technology. A second 

drawback refers to the fact that sometimes computer-generated feedback may not be 

specific or informative enough to guide learners in their revision process, eventually 

causing frustration or dissatisfaction (Chen and Cheng 2006). 

 
III. GRAMMAR CHECKER 

In 2001 the Universidad Nacional de Educación a Distancia (UNED) in Madrid started 

to work on the software Grammar Checker (GC, henceforth) in an attempt to detect 

errors made by English-as-a-foreign-language students. It provides written feedback on 

grammar, spelling, and words used incorrectly based on a corpus of eighty million 

words ‘taken from the written component of the British National Corpus’ (Lawley 

2015: 26). As explained by this author, the program divides the text into segments that 

are compared to that corpus and highlighted in red if they do not appear in it or have a 

threshold number between 0 and 0.1, in orange if they occur in the corpus fewer than 

500 times and their threshold numbers range between 0.1 and 0.5, or yellow if they 

occur fewer than 75 times and their threshold numbers lie between 0.5 and 0.9. 

Therefore, this program requires cognitive processing from students, as it only uses certain 

colors to show frequency but does not offer the possibility to receive corrections at the 

click of a mouse. Students are responsible for changing the segment or not upon 

reflection. In this way, it offers the opportunity to learn from mistakes. GC does not 

provide a score for the text; it merely alerts users to those combinations that are rare or 

do not occur. 
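
The color-coding logic just described can be illustrated with a minimal sketch. It is not GC’s actual implementation: the function name `highlight_color`, the use of `None` for a segment that is absent from the corpus, the reading of the red band as running from 0 to 0.1, and the treatment of the bands as half-open intervals are all assumptions made for illustration. Only the threshold numbers are modeled, not the raw corpus frequencies.

```python
from typing import Optional

# Band edges follow the ranges quoted from Lawley (2015) in the text.
# Treating them as half-open intervals [low, high) is an assumption.
BANDS = [
    (0.0, 0.1, "red"),
    (0.1, 0.5, "orange"),
    (0.5, 0.9, "yellow"),
]


def highlight_color(threshold: Optional[float]) -> Optional[str]:
    """Return the highlight color for a segment, or None if it stays unmarked.

    `threshold` is the segment's threshold number. None is a hypothetical
    convention standing for a segment that does not occur in the corpus.
    """
    if threshold is None:
        return "red"
    for low, high, color in BANDS:
        if low <= threshold < high:
            return color
    return None  # frequent enough to be left without highlighting


# Hypothetical threshold numbers for three segments of a learner text.
for value in (None, 0.3, 0.95):
    print(value, highlight_color(value))
```

Under these assumptions the program only ranks segments by rarity; deciding whether a highlighted segment is actually an error is left to the student.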

GC works as follows: after creating an account, the student has to write the text and 

press “Enter your text” and then “Start” to check if there are any mistakes. First, spelling 

mistakes are highlighted in yellow (also purple if it is a very rare word but not 

necessarily a mistake, e.g., proper names), and clicking on the highlighted words 

brings up useful feedback. By clicking on “Modify”, the previous spelling mistakes 

can be corrected and checked again by pressing “Check again”. Then the same 

procedure is followed for the “Incorrect sequences filter” that highlights grammar 

mistakes such as ‘These table’, and for the “Problem words filter” which refers to 

correct English and does not highlight any word but suggests words that are usually 

misused by students, e.g., ‘insano’ (unhealthy). Therefore, if after reading the 

suggestions the student thinks he/she has made a mistake, he/she can modify it. 

The most important step for the aims of the present study is the button “Pairs filters” 

which highlights phrases that do not usually occur, e.g., ‘had do’. In order to know the 

frequency with which those phrases occur and decide whether it is a mistake or not, the 

student can use the search engine at the top of the screen. Figure 1 below illustrates a 

screenshot of GC: 

 
Figure 1. Screenshot of Grammar Checker. 

 
GC was selected for the purposes of the present study for several reasons: firstly, it 

offers a cue (highlighting in colors) so that students can locate, reflect and self-correct, 

which, according to the literature, may be conducive to learning. Secondly, GC does not 

overwhelm language learners with metalinguistic terminology which may be at odds 

with some learners’ literacy (Dikli 2006). Thirdly, this software does not score learners’ 

written production, but provides them with feedback and possible suggestions for 

improvement. Finally, it is an affordable program for only €14 a year for students 

aiming, in the present study, for level B2 of the Common European Framework of 

Reference of Languages (CEFR). 

 
IV. THE STUDY 

Prior to this research, a pilot study to test the use of GC was conducted with a group of 

students with a similar level of proficiency as the participants in the present study and 

also enrolled in an Official School of Languages. The purpose of that pilot study was, 

on the one hand, to test the computer program, and on the other hand, to decide 

important aspects such as the level and the number of students participating and the 

targeted grammatical aspects (articles, verb tenses and prepositions in this case). One 

group of students received teacher’s feedback and another obtained feedback by means 

of GC. Analysis of the data collected in the pilot study revealed a higher number of 

corrections after computer feedback. Therefore, this program proved helpful in 

highlighting and correcting students’ mistakes. 

Taking into account previous research pointing at overall benefits of WCF in the 

development of students’ writing accuracy on the one hand (Bitchener 2008, Russell 

and Spada 2006), and the rapid growth of computerized feedback in educational 

contexts on the other (Ene and Upton 2014), in this study we address two research 

questions. The first research question aims at revealing what type of feedback (teacher 

vs. computer) will have a better effect on accuracy in the targeted grammar aspect 

(articles). The second research question aims at showing what type of 

feedback (teacher vs. computer) will have a lasting effect in the delayed tailor-made 

test. 

 
IV.1. Participants 

Three groups of Spanish students (n=27) participated in the present study. They were 

divided into two treatment groups and the control group. All participants were studying 

at an Official School of Languages in order to pass the B2 level for professional reasons 

and reported having studied English for over 6 years. Their mother tongue was Spanish 

and/or Catalan and their ages ranged from 20 to 50 years old (mean=39.3). 

The study was carried out as part of their formal EFL instruction and the compositions 

were regular assignments the students had to complete as part of their written 

homework. 

 
IV.2. Targeted grammar aspect: articles 

Errors on rule-governed forms allow for more focused correction than errors which are 

not rule-based (Lee 2013). Ferris (1999) termed the first type of errors ‘treatable errors’, 

as some grammar errors may be treatable through feedback. In this vein, articles fall 

under this ‘treatable’ category and for Spanish EFL students they may be a recurrent 

source of errors, especially the zero article. In fact, the English article system has been 

shown to be used inaccurately by foreign language learners, even with high proficiency. 

Despite the fact that article errors seldom cause misunderstanding, since they possess 

low communicative value, it is still necessary for learners to overcome their problems 

with this specific grammar form. On this account, Master (1995) pointed out that 

attention to the article system was important because this type of error may leave the 

impression that the learners have incomplete control of the target language. Some years 

later, Bitchener (2008) also argued that EFL learners across different language 

proficiency levels experience difficulties in their mastery of the English article system. 

These perceived difficulties, along with the fact that articles are potentially ‘treatable’, 

were the reasons to have articles as targeted grammatical forms for examination. 

 
IV.3. Types of feedback 

Group 1 (n=11) received teacher’s feedback, Group 2 (n=8) computer feedback and the 

Control Group (n=8) obtained no feedback on the targeted grammatical aspect. 

Computer feedback was provided by Grammar Checker by means of a color code (red, 

orange and yellow) as explained in Section III. It was an indirect type of feedback 

which only signaled potentially problematic segments in the compositions. For the sake of 

comparability, the teacher’s feedback had to be indirect as well, so she also used colors similar to 

the ones in the computer software to highlight the mistakes on articles. 

 
IV.4. Data collection procedure 

In a session prior to the data collection, participants belonging to Group 2 were trained 

in the use of Grammar Checker and were told what the color code meant and 

how they had to correct their mistakes. Afterwards, all participants were asked to write a 

180/200-word composition based on a comic strip (Abbey Time 1). In strip 1 someone is 

writing a letter to an old woman, in strip 2 Abbey appears next to an Elvis-looking man, 

in strip 3 the man is holding some flowers and a teddy bear, in strip 4 a woman different 

from the old woman and physically similar to Abbey is looking at the man with a 

menacing gaze, in strip 5 Abbey looks sad and in strip 6 someone who seems to be 

Abbey is writing a letter. As mentioned above, Group 1 received teacher’s feedback and 

Group 2 obtained computer feedback. The control group received feedback only on 

non-targeted grammatical aspects, not on articles. After this feedback, they wrote 

a second version of the composition based on the same comic strip (Abbey Time 2) to check whether correction 

had been effective. The time elapsed between Abbey 1 and teacher’s feedback was one 

week, and between teacher’s feedback and Abbey 2 also one week. 

Two weeks after Abbey 2, participants composed a second text based again on a similar 

graphic prompt but with different strips (Pam Time 1). In strip 1 someone is writing a 

letter while the image of Pam appears in the background. In strip 2 an old woman is 

holding a sheet of paper, and in strip 3 the woman who looks like Pam is looking at the 

Elvis-looking man with a menacing gaze. In strip 4 the man is showing the woman a 

cake he has just made, in strip 5 the old woman looks happy and in strip 6 the old 

woman is writing a letter.  

The same process as the one depicted above applied: after the first composition (Pam 

Time 1), feedback (either by the computer software or the teacher) was provided and 

students wrote a second version (Pam Time 2) two weeks after the first version. 

Therefore, four compositions (Abbey Time 1 and 2 and Pam Time 1 and 2) constitute the data for 

analysis. 

Six weeks after having written the last of the four compositions, the participants were 

asked to complete an individual tailor-made test (see a sample in Appendix 1) to check 

any long-term impact of the two types of feedback. The tailor-made tests included all 

the errors each student had made in Abbey Time 2 and Pam Time 2, that is, after having 

obtained feedback three times (either from the teacher or the computer). Table 1 

illustrates the timeline for data collection.   

 
Table 1. Timeline for the data collection procedure. 

Week 1     Abbey T1
Week 2     Teacher's or computer feedback
Week 3     Abbey T2
Week 4     Teacher's or computer feedback
Week 5     Pam T1
Week 6     Teacher's or computer feedback
Week 7     Pam T2
Week 8     Teacher's or computer feedback
Week 14    Tailor-made test

 
All four compositions belonged to the same genre, that of narrative story, in which a 

short story is described. The learners had to describe what was happening in the story 

according to the given pictures. Therefore, as stated by Bitchener (2008), valid text 

comparisons can be made because both storylines were related and even seemed a 

continuation and had similar characters. For this reason, similar tenses, structures and 

vocabulary for both comic strips were expected. 

 
IV.5. Results and discussion 

A Kruskal-Wallis test (see note i) was run to determine whether there were significant 

differences among the two experimental groups and the control group taking into account 

errors on articles in Abbey Time 1, that is, in the first composition the learners had to 

write. As can be seen in Table 2, results show no significant differences, a fact that, 

from a methodological point of view, is desirable as it indicates that all groups made an 

equivalent number of errors (p>0.05 in all three groups). 

 
Table 2. Means and standard deviations for Abbey Time 1. 

Group                          Mean (standard deviation)
Group 1: teacher’s feedback    .91 (2.07)
Group 2: computer feedback     1.13 (1.35)
Control group                  .50 (.75)
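
To make the baseline comparison concrete, the following sketch computes a Kruskal-Wallis H statistic with tie correction for three independent groups. The error counts are hypothetical (the study’s raw data are not reproduced here), the function names are illustrative, and the p-value relies on the chi-square approximation, whose survival function reduces to exp(-H/2) when there are exactly three groups (two degrees of freedom). With groups as small as those in this study, the approximation should be read with caution.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(group_a, group_b, group_c):
    """Return (H, p) for three independent samples, with tie correction."""
    groups = [group_a, group_b, group_c]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_sum = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_sum / (n ** 3 - n)
    if correction == 0:  # every observation identical: no evidence of difference
        return 0.0, 1.0
    h = max(h, 0.0) / correction
    # Chi-square survival function for 2 degrees of freedom.
    return h, min(1.0, math.exp(-h / 2))


# Hypothetical article-error counts per learner (not the study's data).
teacher_group = [0, 0, 1, 0, 3, 0, 0, 6, 0, 0, 0]
computer_group = [0, 1, 2, 0, 4, 1, 0, 1]
control_group = [0, 0, 1, 0, 2, 0, 1, 0]
h_stat, p_value = kruskal_wallis_three_groups(teacher_group, computer_group, control_group)
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")
```

A p-value above .05 in such a comparison would indicate, as in the study, that the groups started from a comparable number of article errors.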

 
As to the first research question, a first analysis was carried out to determine whether 

feedback had been useful when students had to write Abbey Time 2 and Pam Time 2 

(i.e., when they had obtained feedback after Abbey Time 1 and Pam Time 1). With that 

aim in mind, a Wilcoxon signed-rank test (see note ii) taking into account the number of errors on 

articles between Abbey Time 1 and 2, and between Pam Time 1 and 2, revealed 

statistically significant differences only between Pam 1 and 2 for the group who had been 

offered teacher’s feedback (Group 1; p=.026). For Group 2 (computer group) and the 

Control Group, no significant differences were observed, as Table 3 depicts: 

 
Table 3. Comparison between Time 1 and Time 2 in both compositions. 

                      Group 1 (teacher)   Group 2 (computer)   Control Group
                      Z (W)               Z (W)                Z (W)
Abbey Time 1 and 2    1.00                .00                  .81
Pam Time 1 and 2      2.23                .68                  .33
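
For readers unfamiliar with the test behind the Z (W) values, the sketch below shows one common way of computing a Wilcoxon signed-rank statistic for paired error counts (Time 1 vs. Time 2 for the same learners). It assumes the normal approximation without continuity correction, drops zero differences, and applies the usual tie correction to the variance; the paper does not state which variant was used, and the data and function name are hypothetical.

```python
import math


def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus, z, p) for paired samples.

    Zero differences are discarded, tied absolute differences share the
    mean of their ranks, and p is two-sided under the normal approximation.
    """
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        t = j - i + 1
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    expected = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, z, p


# Hypothetical article-error counts for eight learners (not the study's data).
errors_time_1 = [3, 2, 4, 1, 5, 2, 3, 0]
errors_time_2 = [1, 2, 2, 0, 3, 3, 1, 0]
print(wilcoxon_signed_rank(errors_time_1, errors_time_2))
```

With samples as small as those in this study, an exact version of the test may be preferable to the normal approximation used in this sketch.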

 
As stated above, a significant decrease in the number of errors in the use of articles 

occurs only between Pam 1 and 2 for Group 1. Although both treatment groups at the time of 

writing Pam 2 had received feedback three times, in light of our results teacher’s 

feedback appears to be more effective as far as linguistic accuracy is concerned, despite 

the fact that this feedback was as indirect as the one provided by the computer. In view 

of the above results, the effect size was calculated (Cohen’s d; see note ii). For Pam Time 1 and 2, 

the effect size was large (d=1.024), but the rest of the effect sizes ranged from medium to 

small. 
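
As an illustration of how such effect sizes are typically obtained and labeled, the sketch below computes Cohen’s d with a pooled standard deviation and applies Cohen’s conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The paper does not specify which variant of d was used, so the pooled-variance formula, the function names, and the “negligible” label for values below 0.2 are assumptions, and the sample data are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(sample_a, sample_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_variance = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled_variance)


def effect_size_label(d):
    """Label an effect size using Cohen's conventional benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical error counts at Time 1 and Time 2 (not the study's data).
time_1 = [3, 2, 4, 1, 5, 2, 3, 1]
time_2 = [1, 2, 2, 0, 3, 3, 1, 0]
d = cohens_d(time_1, time_2)
print(f"d = {d:.3f} ({effect_size_label(d)})")

# The value reported in the text for Pam Time 1 and 2 falls in the top band.
print(effect_size_label(1.024))
```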

A second test was used (Wilcoxon signed-rank test) to examine the effect of feedback in 

Abbey and Pam at Time 2. Again, as shown in Table 4 below, the analysis reveals 

statistically significant differences only for Group 1, that is, it seems that teacher’s feedback 

had a positive effect on reducing learners’ errors on articles. One possible explanation 

for this finding is that learners tried harder to self-repair before giving their revised 

compositions back to their teacher. Maybe they were not so confident about computer’s 

http://www.e-revistes.uji.es/languagevalue


Tamara Hernández Puertas  

 
Language Value 10 (1), 67–88  http://www.e-revistes.uji.es/languagevalue 80 

feedback and might have felt skeptical about this source of feedback. Still, that is the 

only significant difference, since Group 2 and the CG did not show any significant 

difference in reducing the number of errors. Our results seem to align with Sauro’s 

(2009) research on zero articles. Her two treatment groups received two types of 

computer feedback (recast and metalinguistic information). The indirect type of 

feedback (recast) in Sauro’s study and highlighting in the present investigation do not 

seem to have an impact on learners’ correction of their errors. 

 
Table 4. Wilcoxon signed-rank test results for Abbey and Pam at Time 2.

Abbey & Pam Time 2    Z (W)   p
Group 1 (teacher)     2.11    .035
Group 2 (computer)    .37     .70
Control Group         .81     .41

 
In an overview of the grammar checker Grammarly, Cavaleri and Dianati (2016) report that 22% of their students agreed that the feedback provided on their writing was not always helpful, as some of it made no sense to learners. Our participants may have been in the same situation, finding the feedback too indirect.

As for the second research question, a Wilcoxon test was run. In Group 1, there were no statistically significant differences (Z(W)=1.63; p=.10; d=.25) between the errors students had made in Abbey Time 2 and Pam Time 2 and those in the tailor-made tests, with a small effect size (calculated with Cohen’s d). The same pattern applies to the results for the computer group and the control group, as there were no significant differences between the mistakes at Time 2 in both compositions and those in the tailor-made tests (Z(W)=1.63; p=.10; d=.14 for Group 2, and Z(W)=1.89; p=.059; d=.40 for the CG), again with small to medium effect sizes.

Despite the fact that, as shown by the results of the first research question, there were significant differences in the number of errors after teacher’s feedback, this applied only to immediate gains, which were not maintained in the long term, as attested by the results for the second research question. Neither of the treatment groups showed gains in accuracy in the tailor-made post-tests. Again, one likely explanation for this result may be that the feedback was too indirect and the color codes were too vague, not showing the learners specifically what to focus on. In this sense, the multimodal combination of text and image (colors, in this case) did not seem to benefit the students’ self-correction process. Although it has been claimed that learners may benefit more from indirect CF because they need to engage in deeper language processing (van Beuningen et al. 2008), CF which is too indirect may not reach the desired goals in the long run. Indeed, Chandler (2003) found that direct feedback resulted in the largest accuracy gains, both in revisions of previous writings and in subsequent writing, whereas students who revised their compositions after indirect CF were unable to achieve such gains.

A second explanation points to the fact that the compositions learners had to write were not graded. As a result, their motivation could have been rather low, and they might have become bored with writing four compositions which were very similar and demanded little creativity.

 
V. CONCLUSION 

Many adult students may have to work autonomously on their language acquisition process. As shown by the findings of the present study, computer-assisted learning tools such as Grammar Checker may prove useful in that process, as ‘everything that can be done to facilitate accurate self-correction is positive’ (Lawley 2016: 879). Still, GC merely suggests potential problems by highlighting stretches of text, thus leaving it up to students to solve the error. In this vein, working with computer-generated feedback may have proved difficult for the students who received this type of feedback ‘due to their learned dependence on teacher-provided feedback’ (Peterson 2017: 48). Moreover, the effectiveness of computer-generated feedback in highlighting aspects such as content or organization of writings is questionable, as humans can assess writings more accurately than computers (Reiners et al. 2011).

The present study aimed at comparing the impact of teacher’s and computer feedback on students’ errors, as most errors are repeated among students, which makes the teacher correct the same error numerous times. In this sense, and despite the above-mentioned drawbacks of using technology for grammar correction, software such as Grammar Checker could improve this situation, encouraging students to be more independent of the teacher and more responsible for their own learning. Benefits may accrue to both the learners and the teacher.

Yet, taking into account the results of this study, we concur with Ware’s (2011) claim that computer-generated feedback should be seen as a supplement to writing instruction and not as a replacement, since teacher’s CF, although as indirect as that delivered by GC, seemed to work better in reducing the number of errors in the short run. We adhere to Heift and Hegelheimer’s (2017) recent claim that there is still scant evidence as to whether computer-generated feedback results in accuracy development and learning over time, pointing to a need for long-term research to settle these issues.

This piece of research was conducted in authentic classrooms as part of students’ ordinary classes. In this sense, it represents a realistic picture of EFL instruction, which strengthens its ecological validity, even though some factors, such as students’ commitment during the process, may be a handicap. As limitations to the study we can mention the small sample size, which poses questions of generalizability, and the fact that the feedback provided addressed errors on articles, that is, rule-governed forms which are more amenable to correction (Lee 2013). The extent to which other non-rule-governed aspects may benefit from the two types of CF has not been examined in the present study. Also, the type of indirect feedback offered (highlighting errors) may prove more useful for students at higher levels of proficiency. The small impact of this kind of feedback in the present study may be due to the proficiency level of the participants, who could have felt at a loss because of their limited linguistic competence. Finally, a further limitation concerns the effectiveness of Grammar Checker, which depends heavily on teachers’ and students’ attitudes toward computer-based feedback and on their skills in working with computer-based programs, as not all teachers and students may be equally skilled.

 
Notes 

i Non-parametric test that compares independent samples of equal or different sizes.
ii Non-parametric test used, in this case, to compare two related samples.


REFERENCES  

Becker, H. J. 1991. “How computers are used in United States schools: Basic data from 

the 1989 IEA Computers in Education survey”. Journal of Educational 

Computing Research, 7, 385-406. 

Bitchener, J. 2008. “Evidence in support of written corrective feedback”. Journal of 

Second Language Writing, 17 (2), 102-118. 

Bitchener, J. and Ferris, D. R. 2012. Written corrective feedback in second language 

acquisition and writing. New York: Routledge. 

Burston, J. 2008. “BonPatron: An online spelling, grammar, and expression checker”. 

Calico Journal, 25 (2), 337-347. 

Cavaleri, M. and Dianati, S. 2016. “You want me to check your grammar again? The 

usefulness of an online grammar checker as perceived by students”. Journal of 

Academic Language & Learning, 10 (1), 223-236. 

Chacón-Beltrán, R. 2017. “Free-form writing: computerized feedback for self-

correction”. ELT Journal, 71 (2), 141-149. 

Chandler, J. 2003. “The efficacy of various kinds of error feedback for improvement in 

the accuracy and fluency of L2 student writing”. Journal of Second Language 

Writing, 12, 267-296. 

Chen, C-F. and Cheng, W-Y. 2006. The use of a computer-based writing program: 

facilitation or frustration? Paper presented at the 23rd International Conference 

on English Teaching and Learning in the Republic of China. 

Chen, C-F. and Cheng, W-Y. 2008. “Beyond the design of automated writing 

evaluation: Pedagogical practices and perceived learning effectiveness in EFL 

writing classes”. Language Learning & Technology, 12 (2), 94-112. 

Davis, F. D. 1989. “Perceived usefulness, perceived ease of use, and user acceptance of 

information technology”. MIS Quarterly, 13 (3), 319-339. 

Dikli, S. 2006. “An overview of automated scoring of essays”. Journal of Technology, 

Learning, and Assessment, 5 (1), 1-35. 


Ellis, R. 1994. The Study of Second Language Acquisition. Oxford: Oxford University 

Press. 

Ellis, R. 2009. “A typology of written corrective feedback types”. ELT Journal, 63 (2), 

97-107. 

Ene, E. and Upton, T. A. 2014. “Learner uptake of teacher electronic feedback in ESL 

composition”. System, 46, 80-95. 

Evans, N. W., Hartshorn, K. J., McCollum, R. M. and Wolfersberger, M. 2010. 

“Contextualizing corrective feedback in second language writing pedagogy”. 

Language Teaching Research, 14 (4), 445-463. 

Fazio, L. 2001. “The effect of corrections and commentaries on the journal writing 

accuracy of minority- and majority-language students”. Journal of Second 

Language Writing, 10 (4), 235-249. 

Ferris, D. R. 1999. “The case for grammar correction in L2 writing classes: A response 

to Truscott (1996)”. Journal of Second Language Writing, 8 (1), 1-11. 

Ferris, D. R., Chaney, S. J., Komura, K., Roberts, B. J. and McKee, S. 2000. 

Perspectives, problems, and practices in treating written error. Colloquium 

presented at International TESOL Convention. (March 14-18, 2000). 

Ferris, D. R. and Hedgcock, J. 2005. Teaching ESL composition: Purpose, process, 

and practice. Mahwah, NJ: Erlbaum. 

Ferris, D. R. and Roberts, B. 2001. “Error feedback in L2 writing classes. How 

explicit does it need to be?” Journal of Second Language Writing, 10, 161-184. 

Grammar Checker. 2 October 2015. http://www.e-uned.es/subscription/subscriptionsInfo.php?subID=CM 

Guénette, D. 2007. “Is feedback pedagogically correct? Research design issues in 

studies of feedback on writing”. Journal of Second Language Writing, 16, 40-53. 

Guichon, N., Betrancourt, M. and Prié, Y. 2012. “Managing written and oral negative 

feedback in a synchronous online teaching situation”. Computer Assisted 

Language Learning, 25 (2), 181-197. 


Heift, T. and Hegelheimer, V. 2017. “Computer-assisted corrective feedback and 

language learning”. In Nassaji, H. and Kartchava, E. (Eds.) Corrective feedback 

in second language teaching and learning. New York: Routledge, 51-65. 

Jewitt, C. 2002. “The move from page to screen: The multimodal reshaping of school 

English”. Journal of Visual Communication, 1 (2), 171-196. 

Kress, G. 2010. Multimodality. A social semiotic approach to contemporary 

communication. London: Routledge. 

Lawley, J. 2015. “New software to help EFL students self-correct their writing”. 

Language Learning & Technology, 19 (1), 23-33. 

Lawley, J. 2016. “Spelling: computerised feedback for self-correction”. Computer 

Assisted Language Learning, 29 (5), 868-880. 

Lee, I. 2013. “Research into practice: Written corrective feedback”. Language 

Teaching, 46 (1), 108-119. 

Long, M. 1991. “Focus on form: A design feature in language teaching methodology”. 

In de Bot, K., C. Kramsch and R. Ginsburg (Eds.) Foreign language research in 

cross-cultural perspective. Amsterdam: John Benjamins, 39-52. 

Lyster, R. 1998. “Recasts, repetition and ambiguity in L2 classroom discourse”. Studies 

in Second Language Acquisition, 20 (1), 51-80. 

Lyster, R. and Mori, H. 2006. “Interactional feedback and instructional 

counterbalance”. Studies in Second Language Acquisition, 28, 269-300. 

Mackey, A., Gass, S. and McDonough, K. 2000. “How do learners perceive 

interactional feedback?”. Studies in Second Language Acquisition, 22, 471-497. 

Master, P. 1995. “Consciousness raising and article pedagogy”. In Belcher, D. and G. 

Braine (Eds.) Academic writing in a second language. Norwood, NJ: Ablex, 183-204. 

Nadasdi, T. and Sinclair, S. 2007. Anything I can do, CPU can do better: A 

comparison of human and computer grammar correction for L2 writing using 

BonPatron.com. Unpublished manuscript. 15 January 2018. 

https://sites.ualberta.ca/~tnadasdi/Dublin.htm. 


Ortega, L. 2009. “The linguistic environment”. In Ortega, L. (Ed.) Understanding 

second language acquisition. London: Hodder Arnold, 71-76. 

Panova, I. and Lyster, R. 2002. “Patterns of corrective feedback and uptake in an adult 

ESL classroom”. TESOL Quarterly, 36, 573-595. 

Peterson, E. K. 2017. “The impact of computer-generated feedback on student 

perceptions of revision process”. Masters of Arts in Education Action Research 

Papers, 247. 28 August 2018. https://sophia.stkate.edu/maed/247 

Philp, J. 2003. “Constraints on "noticing the gap": Nonnative speakers' noticing of 

recasts in NS-NNS interaction”. Studies in Second Language Acquisition, 25, 99-126. 

Polio, C., Fleck, C. and Leder, N. 1998. “‘If only I had more time’: ESL learners’ 

changes in linguistic accuracy on essay revisions”. Journal of Second Language 

Writing, 7, 43-68. 

Potter, R. and Fuller, D. 2008. “My new English partner? Using the Grammar Checker 

in writing instruction”. English Journal, 98 (1), 36-41. 

Reiners, T., Dreher, C. and Dreher, H. 2011. “Six key topics for automated 

assessment utilization and acceptance”. Informatics in Education, 10 (1), 47-64. 

Russell, J. and Spada, N. 2006. “The effectiveness of feedback for the acquisition of 

L2 grammar”. In Norris, J. M. and L. Ortega (Eds.) Synthesizing research on 

language learning and teaching. Amsterdam: John Benjamins, 133-164. 

Sauro, S. 2009. “Computer-mediated corrective feedback and the development of L2 

grammar”. Language Learning & Technology, 13 (1), 96-120. 

Schmidt, R. 1990. “The role of consciousness in second language learning”. Applied 

Linguistics, 11 (2), 129-158. 

Spada, N. 2011. “Beyond form-focused instruction: Reflections on past, present and 

future research”. Language Teaching, 44, 225-236. 

Tiene, D. and Luft, P. 2001. “Teaching in a technology-rich classroom”. Educational 

Technology, 41, 23-31. 


Truscott, J. 1996. “The case against grammar correction in L2 writing classes”. 

Language Learning, 46, 327-369. 

Van Beuningen, N., de Jong, N. H. and Kuiken, F. 2008. “The effect of direct and 

indirect corrective feedback on L2 learners’ written accuracy”. International 

Journal of Applied Linguistics, 156, 279-296. 

Ware, P. 2011. “Computer-generated feedback on student writing”. TESOL Quarterly, 

45 (4), 769-774. 

Ware, P. and Warschauer, M. 2005. “Electronic feedback and second language 

writing”. In Hyland, K. and F. Hyland (Eds.) Feedback and second language 

writing. Cambridge: Cambridge University Press, 1-29. 

Warschauer, M. 2010. “Invited commentary: New tools for teaching writing”. 

Language Learning & Technology, 14 (1), 3–8. 

 
APPENDIX 1: Sample tailor-made test 

 

Received: 15 March 2018 

Accepted: 23 July 2018 

 
Cite this article as:  

Hernández Puertas, Tamara 2018. “Teacher’s feedback vs. computer-generated feedback: A 
focus on articles”. Language Value 10 (1), 67-88. Jaume I University ePress: Castelló, Spain. 

http://www.e-revistes.uji.es/languagevalue.  

DOI: http://dx.doi.org/10.6035/LanguageV.2018.10.5 

ISSN 1989-7103 

Articles are copyrighted by their respective authors 
