Journal of Teaching and Learning with Technology, Vol. 2, No. 1, June 2013, pp. 15 - 30.  

 

Classroom clickers offer more than repetition: Converging evidence
for the testing effect and confirmatory feedback in clicker-assisted
learning
 

Amy M. Shapiro1 and Leamarie T. Gordon2 
 

Abstract: The present study used a methodology that controlled subject and item 
effects in a live classroom to demonstrate the efficacy of classroom clicker use for 
factual knowledge acquisition, and to explore the cognition underlying clicker 
learning effects. Specifically, we sought to rule out repetition as the underlying 
reason for clicker learning effects by capitalizing on a common cognitive 
phenomenon, the spacing effect. Because the spacing effect is a robust 
phenomenon that occurs when repetition is used to enhance memory, we proposed 
that spacing lecture content and clicker questions would improve retention if 
repetition is the root of clicker-enhanced memory. In experiment 1 we found that 
the spacing effect did not occur with clicker use. That is, students performed 
equally on clicker-targeted exam questions regardless of whether the clicker 
questions were presented immediately after presentation of the information 
during lecture or after a delay of several days.  Experiment 2 provided a more 
direct test of repetition, comparing test performance after clicker use with 
performance after a second presentation of the relevant material. Clicker 
questions promoted significantly higher performance on test questions than 
repetition of the targeted material. Thus, the present experiments failed to support 
repetition as the mechanism driving clicker effects. Further analyses support the 
testing effect and confirmatory feedback as the mechanisms through which 
clickers enhance student performance. The results indicate that clickers offer the 
possibility of real cognitive change in the classroom. 

 
 Keywords: clickers, feedback, clicker-assisted learning, knowledge acquisition

 
Personal response systems, commonly called clickers, are now used in thousands of
classrooms nationally. They allow instructors to assess comprehension and memory for material
by posing a question to the class (usually multiple-choice) that students answer with remote 
devices they bring to class. Questions and answers take as little as a minute or two to present 
and collect, and voting results can be displayed instantly in a bar graph. Understandably, 
educators and researchers have been interested in the technology’s educational effectiveness. 
Generally speaking, the majority of studies have shown that clickers are effective in boosting 
attendance and participation (Beekes, 2006; Poirier & Feldman, 2007; Shih, Rogers, Hart, 
Phillis, & Lavoie, 2008; Stowell & Nelson, 2007) and learning outcomes (Kennedy & Cutts, 
2005; Mayer et al., 2009; Morling et al., 2008; Ribbens, 2007; Shapiro, 2009; Shapiro &
Gordon, 2012). Few studies have explored the cognitive mechanism through which clickers 
increase retention of lecture content, however. The focus of the present work was to better 

1 University of Massachusetts Dartmouth, Psychology Department, 285 Old Westport Road, North Dartmouth, MA 02747-2300, 
ashapiro@umassd.edu  
2 Tufts University 




understand the cognition driving clicker effects.  Specifically, the experiments presented here 
were designed to rule out repetition as the basis of clicker effects in fact-based learning, thereby 
supporting the hypothesis that the testing effect and feedback drive clicker effects. 
 There are several ways clickers may work to enhance memory for classroom material: 
(1) directing students’ attention to material likely to be on exams, (2) repetition, and (3) the 
testing effect. The first possibility, that clickers “tip off” students about the instructor’s 
judgment of important material, and therefore the content of exam questions, is a reasonable 
hypothesis. One might expect students to attend to those topics more in class and focus study 
effort on those topics. Greater attention in class and increased studying would both enhance 
exam performance. If the attention-grabbing hypothesis is correct, it would mean clicker 
questions do not directly enhance memory or learning. It would mean only that they are an 
effective means of directing learners’ attention to particular topics. Repetition effects, the 
second possible way clickers enhance memory for classroom material, make the justification for 
clicker use similarly debatable.  Repetition can be accomplished through online resources, 
readings or other assignments outside of class, without using any class time and at no cost to 
students.  If clicker effects are attributable instead to the last possibility, the testing effect, it 
would indicate they have unique benefit in the classroom. Writing clicker questions and 
integrating them with class lectures does require a modest time investment. Once that is 
completed, however, clicker questions require little class time to administer, correct and enter to 
grade sheets. Indeed, the entire sequence of presentation, response, grading, recording and 
feedback all happens within seconds. As such, if clicker effects are due to the testing effect
rather than to repetition or attention-grabbing, clickers offer the unique benefit of enhanced
learning during class time with very little investment of time or money. 
 Shapiro and Gordon (2012) were able to rule out attention-grabbing and found modest 
support for the testing effect during clicker use in a live classroom. In their study, a series of 
exam questions were targeted over the semester in two classes. Half the items in one class were 
targeted with clicker questions when the information was taught in class. The other half of the 
questions were targeted by attention alerts. They assigned the same items to the opposite 
conditions in the other class. This counterbalanced the assignment of each question to the 
experimental and control conditions, and created a situation in which each item served in both 
the clicker and attention conditions. Students did not get clicker questions about the information 
assigned to the attention condition. Instead, they were told that the information was very 
important and would be covered on the next test. The relevant information on the PowerPoint 
slide was also highlighted in red and was animated to flash. At the end of the semester students 
were given a survey that asked what directed their decisions about what to study. In spite of the 
fact that they reported studying the information targeted by the alerts more than that targeted by 
clicker questions, students performed as well or better on questions when a clicker question 
was offered. In short, even when attention was explicitly drawn to specific information in class 
and studied more outside of class, answering a clicker question had an equal or greater effect on 
exam performance. That study did not rule out attention-grabbing as a contributing factor to 
clicker effects, but it did provide strong evidence that it is unlikely to be the sole source of 
clicker effects. The authors argued that the testing effect also underlies clicker effects. 
 Shapiro and Gordon (2012) were not able to rule out the possibility of repetition effects 
as the mechanism underlying clicker effects, however.  Because they compared a clicker group 
to a no-clicker control group that was exposed to one presentation of the material, repetition is 
confounded with clicker use.  Indeed, the majority of studies that report clicker effects compare 




clicker use with no clicker use, with no control for repetition effects (e.g., Mayer et al. 2009; 
Morling et al, 2008; Shapiro, 2009).  At present, then, it is unclear whether the testing effect or 
simple repetition effects are driving clicker effects in the classroom.  The present study was 
designed to address this question. We sought to determine whether the learning outcomes 
observed with clicker use are attributable to repetition.  Before explaining the methodology, we 
provide a brief review of the research that explains these phenomena.  
 
The Testing Effect and Repetition Learning 
 
Karpicke, Roediger and others have documented that testing memory can enhance later recall or 
recognition better than an equivalent amount of additional study (Butler, Karpicke, & Roediger, 
2007; Carrier & Pashler, 1992; Karpicke & Roediger, 2007a, 2007b, 2008; Roediger & 
Karpicke, 2006a; Szpunar, McDermott, & Roediger, 2008). In what has become the classic 
paradigm for investigating the testing effect, Thompson, Wenger, and Bartling (1978) gave one 
group 3 study sessions followed by a delayed test (SSST). Another group studied the same 
information once and was then tested 3 times (STTT), the final test serving as the dependent 
measure after a 48-hour delay. On the final test, the SSST group forgot 56% of the material, as 
opposed to just 13% by the STTT group. This basic effect has been demonstrated using free 
recall (Jacoby, 1978; Szpunar et al., 2008), short-answer (Agarwal, Karpicke, Kang, Roediger, 
& McDermott, 2006) and multiple-choice (Duchastel, 1981; Nungester & Duchastel, 1982) tests 
and has been demonstrated with memory for word lists (Karpicke & Roediger, 2007a; Tulving, 
1967), paired-associates (Allen, Mahler, & Estes, 1969), and text (Nungester & Duchastel, 1982;
Roediger & Karpicke, 2006a). 
 The cognition underlying the testing effect is not fully understood but some hypotheses 
have emerged and are currently under investigation. One possibility is that repeated testing 
creates conditions in which information is over-learned, a position argued by Thompson et al. 
(1978). Over-learning is an unlikely explanation of clicker effects, as it is improbable that 
offering a single clicker question in class can lead to over-learning. A more likely possibility is 
that testing strengthens the pathways leading to a stored memory more than additional study 
does (Bjork, 1975). Since study can be very passive (e.g., re-reading text passages or lecture 
notes), the more active nature of generating responses or comparing multiple-choice alternatives 
could reasonably offer greater opportunity for such enhancement. In other words, individuals 
are engaging in an activity that requires greater concentration during testing than some forms of 
study. Indeed, Bjork and Bjork (1992) have argued that there is a positive relationship between 
the level of effort required during testing and the strength of memory. As such, the effect may 
be a form of depth of processing (Craik & Lockhart, 1972).  

Alternatively, testing may generate new routes to the memory trace, thus multiplying 
possible access points to the material (McDaniel & Masson, 1985). When memories are formed, 
information about the context and activities relevant to the material are also formed. Testing 
offers new perspectives and links to the information that may be sensitive to different memory 
cues than the connections formed during study. The latter possibility would take advantage of 
encoding specificity, as a pathway generated through testing is likely to be more easily accessed 
during later testing. An excellent and more extensive review of the testing effect is provided by 
Roediger & Karpicke (2006b). 
 Although the mechanisms underlying the testing effect are not fully understood, 
numerous investigations have demonstrated that the effect seems to be enhanced by feedback 




(e.g., Butler & Roediger, 2007; Hattie & Timperley, 2007; Kulhavy, 1977; Pashler, Cepeda, 
Wixted, & Rohrer, 2005; Sassenrath & Gaverick, 1965; Thorndike, 1913). Feedback can be 
confirmatory or corrective, and there is evidence that both types enhance later test performance 
(Butler, Karpicke, & Roediger, 2007; Kluger & DeNisi, 1996; McDaniel et al., 2007; 
Vojdanoska et al., 2010). Because clickers allow instructors to provide feedback with a simple 
button click within seconds of voting, feedback is widely used among clicker-adopting 
instructors. As a consequence, feedback is an important facet of clicker use to consider when 
questioning the reasons underlying clicker-mediated learning effects, particularly the testing 
effect. 
 It is important to note that the testing effect has been demonstrated in many experiments 
that did not employ feedback (see Kang, McDermott, & Roediger, 2007, experiment 1; Marsh, 
Agarwal, & Roediger, 2009; Roediger & Karpicke, 2006a), so while there is the potential for the 
contribution of feedback effects during clicker-based learning, some other mechanism unique to 
testing appears to be working with or in addition to feedback. A study by Kang et al. (2007) 
underscores this point. After reading journal articles, subjects took either short answer or 
multiple-choice tests prior to a final memory test. Subjects did better on the final test when they 
took preliminary multiple-choice tests. When feedback was offered on the preliminary tests (in 
experiment 2), however, students taking the short answer tests did better on the final test. In 
sum, testing improved learning in Kang et al.’s study, but the addition of feedback altered 
something about the mechanism involved. The results are highly suggestive of some sort of 
interaction between the memory processes relevant during testing and feedback.  
 In spite of the fact that testing, especially with feedback, has been shown to enhance 
performance on tests more than study repetition, mere re-exposure to material alone can 
improve learning. The more times a student is exposed to a bit of information, the greater the 
likelihood he or she will retain it (e.g., Ebbinghaus, 1913; Raney, 2003; Scarborough, Cortese, & 
Scarborough, 1977; Tulving, 1967). As such, it is certainly possible that clicker questions may 
improve retention for classroom content merely by re-exposing students to the material. In other 
words, clicker effects may simply be repetition effects, and that is a potential criticism of any 
experiment that demonstrates clicker effects by comparing clicker use with a no-clicker control. 
Thus, it is important to rule out repetition as the cause of clicker effects in order to strengthen 
the argument for classroom clickers as effective and worthwhile pedagogical tools. 
 
The Present Study 
 
Shapiro and Gordon (2012) concluded that the testing effect, not attention-grabbing, is 
responsible for enhanced learning with clickers in their experiment. Because they compared 
clicker groups to non-clicker control groups, as do most published studies on the topic, clicker 
use was confounded with repetition in their investigation. In the present two-experiment study 
we tested whether clicker effects are due, at least in part, to repetition.    

Experiment 1 takes advantage of repetition learning in order to determine the role of 
repetition in clicker effects. Specifically, if repetition is a significant source of clicker effects, 
clicker use should be subject to the spacing effect. The spacing effect (also called distributed 
learning) refers to the phenomenon in which rehearsal or re-exposure to material results in 
greater memory when a period of time is allowed to intervene between presentations (Benjamin 
& Tullis, 2010; Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006; Glenberg, 1979; Hintzman, 
1974). If clicker questions are more effective when offered after a delay of several days, it will 




indicate the questions are likely serving as a method of repeating exposure to class material. If 
the spacing effect is not evident, it will indicate that repetition is unlikely to be a significant 
factor in clicker effects.  

In experiment 2, we compared a clicker group that received a single presentation of the 
material and a subsequent clicker question to a group that received a second presentation of the 
material in place of the clicker question. Because Shapiro and Gordon (2012) have found 
evidence against attention-grabbing as the reason for clicker effects, failure to support repetition 
in the present study would provide converging evidence that clicker effects are most likely 
attributable to the testing effect. We also took advantage of the clicker data to perform a 
secondary analysis on clicker performance to learn something about the role of feedback in 
clicker effects. 
 
Experiment 1 
 
The experiment was designed to determine whether the clicker learning effects demonstrated in 
prior studies are subject to the spacing effect, and thus attributable to repetition effects. We 
designed experiment 1 to compare exam question performance when clicker questions were 
asked immediately after in-class presentation of the material and when clicker questions were 
asked after a delay. If the spacing effect is in evidence, subjects should score higher on test 
items when clicker questions were offered 2-5 days after the material was taught in class, as 
compared with the same clicker questions offered the same day. Finding a spacing effect would 
indicate that clicker effects may be attributed, at least in part, to repetition. If the spacing effect 
does not emerge in the data, it would indicate that either feedback or the testing effect leads to 
cognitive change that cannot be attributed to simple rehearsal. For this reason, an analysis of 
clicker question performance was conducted to determine the role of feedback apart from 
repetition. 
 

Method 
 
Subjects   
 
Four hundred students enrolled in two sections of general psychology at the University of 
Massachusetts participated in the study. Students participated as part of their normal 
coursework, and earned participation points by correctly answering in-class questions. They 
ranged from freshmen to seniors and represented a range of  
disciplines offered at the institution. IRB approval was sought prior to beginning the study and a 
waiver was granted. 
 
Materials and Procedure 
 
The class covered 11 topics in general psychology and was taught as a typical lecture course 
with demonstrations and multimedia integrated into many of the lectures. PowerPoint 
presentations were projected onto a movie theater-sized screen. In-class clicker questions were 
integrated into the presentations, with individual slides dedicated to single questions. The 
iClicker system was used to allow students to make their responses to clicker questions. 




Students were required to purchase their clickers (for $20-40, depending on whether they were 
new or bundled with the required textbook).  
 Sixteen clicker question/test item pairs were used as stimuli in the present study. 
Each clicker question was written to tap the same information as its targeted exam question. All 
clicker and exam questions were multiple-choice and were taken from Shapiro and Gordon 
(2012). The clicker question/test item pairs were spread throughout the semester, and across the 
four exams administered during the semester. Performance on the exam questions was the 
dependent variable.  
 All the targeted exam questions were included in the exams for both classes. The clicker 
question written for each targeted exam question was also given to each class. The timing of the 
clicker question presentation was manipulated as the within-subjects independent variable. 
When assigned to the “immediate” condition, clicker questions were given in class directly after 
the material was presented and any student questions were answered. When assigned to the 
“delayed” condition, the questions were given at the start of another class meeting, 2-5 days 
after the material was taught. Half the items were included in each condition for one class, with 
the other half included in the opposite condition for the other class. As such, each of the 16 
experimental items was included in both the immediate and delayed conditions, and each 
subject contributed data to both conditions. Presentation of the relevant course material was the 
same in both conditions; the information was included on a PowerPoint slide. Identical “filler” 
clicker questions targeting material unrelated to the experimental items were offered to both 
classes, with the experimental items mixed randomly among them. Between one and five clicker
questions (filler and experimental) were asked in class each day. The instructor projected the 
clicker questions onto the screen after soliciting and answering any questions from the students. 
Students were given 30-90 seconds to answer each question and a bar chart showing the 
percentage of the class to respond with each option was projected to provide feedback after 
voting was closed. 
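The counterbalancing scheme above can be expressed as a short sketch. This is an illustration only, using hypothetical item IDs as placeholders for the 16 clicker question/exam item pairs; the names `half_a`, `half_b`, and `schedule` are ours, not the authors'.

```python
# Sketch of the counterbalanced design described above. Item IDs are
# hypothetical placeholders for the 16 clicker question/exam item pairs.
items = list(range(1, 17))            # 16 clicker/exam question pairs
half_a, half_b = items[:8], items[8:]

# Each class receives half the items in each condition; the halves are
# swapped between classes, so every item serves in both the immediate and
# delayed conditions and every subject contributes data to both conditions.
schedule = {
    "class_1": {"immediate": half_a, "delayed": half_b},
    "class_2": {"immediate": half_b, "delayed": half_a},
}

# Sanity check: each class sees all 16 items exactly once.
for conditions in schedule.values():
    assert sorted(conditions["immediate"] + conditions["delayed"]) == items
```

The swap between classes is what controls both item effects (every item appears in both conditions) and subject effects (every subject experiences both conditions).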
 
Exam and clicker question validation. Because a simple, no-clicker control condition would 
not allow discrimination between clicker and repetition effects (the purpose of this 
investigation), a no-clicker group was not included. For that reason, it was important to establish 
that the materials used in the present study do induce a basic learning effect. As mentioned, the 
sixteen clicker questions, and the corresponding exam questions for which they were written, 
were taken from Shapiro and Gordon (2012). The clicker question written for each exam 
question probed the same basic information as the test question, but was still unique. In their 
study, Shapiro and Gordon implemented a counterbalancing strategy wherein each of two 
classes was given clicker questions for half the targeted exam questions. For the other half of 
the questions, subjects were given no clicker question. For half of those in the control condition 
(see experiment 1), no special treatment was given to the material in class. For the other half, 
however, students were told the material was important and would be on the test (see 
experiment 2), creating a very conservative test of clicker learning effects. The methodology 
controlled for both item and subject effects, as each exam question was used in the control and 
clicker conditions and each subject contributed data to both conditions. Half the stimuli in the 
present experiment were taken from Shapiro and Gordon’s experiment 1 and half from 
experiment 2. Thus, in order to establish that the item subset chosen for the present study does 
produce the basic clicker learning effect, the analysis from that experiment was re-run including 
only the subset of items chosen for the present study. Analyzed by subjects, a paired t-test 




revealed a significant effect of clickers on performance, t(234) = 5.62, p < .0001, d=.37. 
Students scored a mean of 68.9% (SD=18.7) correct on items when no clicker question was 
offered and 76.8% (SD=18.1) correct when a question was offered, more than an 11% 
performance increase. The results were also significant when analyzed by items, t(15) = 4.29, p 
< .001, d=1.08, with items answered correctly by 69.4% (SD=12.1) of subjects when placed in 
the control condition and 76.0% (SD=10.8) answering the same questions correctly when 
clicker questions were asked, an increase of almost 10%. Again, this is a very conservative test 
of the stimuli because half the items in the control condition were identified to students as 
material that would be on the test. In spite of the warning, clicker questions still significantly 
boosted exam performance.  
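A by-subjects analysis of this kind can be sketched as follows. The scores below are simulated to resemble the reported means and are not the study's data; the paper's actual result was t(234) = 5.62, p < .0001, d = .37.

```python
# Hedged sketch of a by-subjects paired t-test with a paired-samples
# Cohen's d. Scores are simulated (hypothetical), drawn to resemble the
# reported condition means; they are not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 235
no_clicker = rng.normal(68.9, 18.7, size=n)           # control-item % correct
clicker = no_clicker + rng.normal(7.9, 15.0, size=n)  # clicker-item % correct

t, p = stats.ttest_rel(clicker, no_clicker)

# Cohen's d for paired samples: mean difference over SD of the differences
diff = clicker - no_clicker
d = diff.mean() / diff.std(ddof=1)
print(f"t({n - 1}) = {t:.2f}, p = {p:.4g}, d = {d:.2f}")
```

The paired (within-subjects) test is the appropriate choice here because each subject contributes scores to both the clicker and control conditions.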
 Other measures of the stimuli were taken to ensure stimulus validity. Two independent 
content experts provided validation ratings of the stimuli. Both are professors of psychology 
who routinely teach introductory psychology. They rated each clicker and exam question on a
7-point scale for the following dimensions: (1) overall quality of the question, (2) relevance of the 
information targeted by the clicker/exam item pairs to the content and goals of an introductory 
psychology course, (3) the relationship between each clicker item and each exam question. The 
questions used in the experiment all scored a minimum rating and minimum mean of 5.0 by 
each rater on questions 1 and 2. The clicker/exam pairs met the same criteria on survey question 
3. The relationship ratings between clicker questions and exam questions which were not 
intended as pairs were also analyzed. It was important that unpaired items were actually 
unrelated to ensure clicker questions were not enhancing memory for exam questions for which 
they were not written. All unrelated clicker/exam question pairs used in the present experiment 
scored a maximum rating of 2.0 among reviewers and had a mean rating of 1.5. The low ratings 
established the unlikelihood of “spillover” effects. That is, clicker questions were unlikely to 
affect performance on exam questions for which they were not intended. 
 
Results and Discussion 
 
Students who withdrew early from the course, those with attendance lower than 60%, and those 
who missed more than one exam were excluded from the data analysis. These students provided 
insufficient data for the within-subjects comparisons or were insufficiently exposed to the 
independent variable. The deletions yielded a total of 283 subjects in the analysis. Moreover, 
individual exam question data were removed from the analysis for students who were absent 
from class the day the targeted content was presented. Missing those critical classes meant 
missing the targeted content as well as their immediate clicker questions. Also, effects of the 
delayed clicker questions would be difficult to interpret for those cases. A maximum of 16 exam 
questions per subject was possible and these deletions resulted in a mean of 13.1 per subject. 
Out of a maximum of 283 student scores for each question, the deletions resulted in a mean of 
229.6. 
 Paired t-tests were performed to compare performance between the immediate and 
delayed conditions. The results did not reveal evidence of a spacing effect. When analyzed by 
subjects, there was no significant difference between performance on exam items when targeted 
by immediate (M = 67.5, SD = 24.1) or delayed (M = 70.0, SD = 21.8) clicker questions, t(282) 
= 1.73, p > .05. No significant difference between the immediate (M = 67.4, SD=9.3) and 
delayed (M = 69.8, SD=11.7) conditions was revealed in the item analysis, t(15) = 1.04, p > .05. 
The mean discrimination index for the exam questions was 50.4. 




 Since there was no spacing effect, the data argue against repetition as a significant 
mechanism underlying clicker effects. If repetition isn’t driving the effect, what is? A clue to the 
relevant processes may be gleaned by examining clicker question performance in the immediate 
versus delayed conditions. It makes intuitive sense that students would perform better on 
immediate clicker questions, as the information needed to answer the questions correctly has 
just been presented in lecture. In light of the fact that students performed equally well on later 
exam questions regardless of clicker question timing, however, if students did perform better on 
the immediate versus delayed clicker questions it would suggest corrective feedback is being 
used to improve test performance to some extent. Paired t-tests comparing clicker question
performance between the immediate and delayed conditions revealed just that. Students 
scored a significantly higher percent correct on immediate clicker questions (M = 94.7, SD=9.1) 
than delayed (M = 83.2, SD=15.8), t(281) = 10.99, p < .0001, d = .65. The same result was 
found when analyzed by items, with the same clicker questions answered correctly more often 
when asked in the immediate condition (M=94.7, SD=5.1) than in the delayed condition 
(M=82.0, SD=18.3), t(15) = 2.85, p < .01, d = .72. Not only are the t-tests significant, but the 
effect sizes are quite robust. Despite such clear differences between immediate and delayed 
clicker performance, exam performance was not affected by conditions. As such, it stands to 
reason students were able to make some use of their performance feedback in the delayed 
condition to improve test performance. 
 The clicker performance analysis provides only indirect evidence about the effect of 
corrective feedback, however. A more direct test is possible by comparing exam question 
performance when the clicker questions were answered correctly versus incorrectly. If feedback 
is a primary factor in clicker effects, students should score equally on exam questions regardless 
of clicker performance as long as they are given feedback, as they were in the present study. If 
there is a significant difference, it would mean the effect of corrective feedback is limited and 
unlikely to account for the entire effect. To run this test, all subjects and questions in the 
delayed clicker condition were combined to create groups based on clicker performance. 
Because clicker performance was quite high in the immediate condition (95%), there were 
insufficient incorrect responses to compare with the correct responses, so the analysis was done 
only on the delayed clicker questions. Moreover, since exam question performance was deleted 
when the critical content lecture was missed, there are no cases in the immediate clicker 
condition in which students attended the content lecture but missed the clicker questions. The 
delayed condition, however, provides an important comparison group. That is, students who 
attended the critical content lecture but were not exposed to the delayed clicker question.  
 The limitations of corrective feedback effects are seen when performance is compared 
on test items for which students correctly versus incorrectly answered the corresponding clicker 
questions, or did not see the clicker questions. The mean of the 1422 exam questions included in 
the analysis, for which the corresponding clicker questions were correctly answered3, was 72%. 
For the 286 exam questions, for which the corresponding clicker questions were incorrectly 
answered, the mean score was 63% correct. There were 191 unanswered, delayed clicker 
questions across subjects that did attend the critical content lecture (in other words, students 
who received the content in class but did not see the clicker question) and the mean score on the 
corresponding exam questions was 59%. Although the effect size was quite small, the difference 
was significant, F(2, 1896) = 9.83, p < .0001, η² = .01. A Scheffé post hoc analysis revealed that 
3 With 16 items and 283 subjects, there were 2284 possible clicker responses in the immediate and in the delayed conditions. 
The number in the analysis is lower due to student absences. 

Shapiro, A.M. and Gordon L.T. Journal of Teaching and Learning with Technology, Vol. 2, No. 1, June 2013. jotlt.indiana.edu 
exam question performance was significantly higher in the case of correctly answered clicker 
questions than incorrectly answered clicker questions, p < .05 (two-tailed), and in the case of 
correct versus missed clicker questions, p < .05 (two-tailed). The difference between exam 
question performance based on incorrect versus missed clicker questions was not significant, p 
> .05 (two-tailed).
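As a rough sanity check, the reported group sizes and means can be plugged into a short simulation. This is an illustrative sketch on simulated binary exam responses, not the authors' analysis: the group sizes (1,422, 286, 191) and means (72%, 63%, 59%) come from the text, but the simulated data are an assumption, so the exact F and p values will differ from the reported F(2, 1896) = 9.83.

```python
import numpy as np
from scipy import stats

# Group sizes and mean exam scores reported in the text (assumed exact)
sizes = {"correct": 1422, "incorrect": 286, "unseen": 191}
means = {"correct": 0.72, "incorrect": 0.63, "unseen": 0.59}

# Consistency check: total N minus 3 groups should equal the reported
# within-groups degrees of freedom in F(2, 1896)
n_total = sum(sizes.values())

# Simulate binary exam responses (1 = correct) with the reported group means
rng = np.random.default_rng(0)
groups = [rng.binomial(1, means[g], sizes[g]) for g in sizes]

# One-way ANOVA, plus eta-squared computed as SS_between / SS_total
F, p = stats.f_oneway(*groups)
pooled = np.concatenate(groups)
grand = pooled.mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_total = ((pooled - grand) ** 2).sum()
eta_sq = ss_between / ss_total
print(n_total - 3, round(eta_sq, 3))
```

Note that the group sizes sum to 1,899, matching the reported error degrees of freedom (1,896 = N − 3), and the simulated eta-squared lands near the reported .01, i.e., a small effect despite the significant F.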
 If corrective feedback were a primary mechanism through which clicker effects worked, 
there should be little or no difference in exam performance based on clicker performance. More 
importantly, incorrectly answered clicker questions should yield better performance than 
receiving no clicker question at all. After all, if students are using clicker questions primarily to 
gain corrective feedback on their performance, one would expect to see evidence of widespread 
self-correction on the exam questions. The significant performance advantage for students who 
answered correctly, combined with the comparable exam performance of students who answered 
a clicker question incorrectly and those never exposed to it, suggests corrective feedback was 
not particularly useful for students who got clicker questions wrong. The large differences in 
sample sizes and the rather low effect size, however, warrant caution about the strength of this 
conclusion.
 
Experiment 2 
 
The purpose of experiment 2 was to provide converging evidence with experiment 1 that 
repetition is not the major source of clicker learning effects.  The advantage of the methodology 
used in experiment 1 was that the presentation of immediate and delayed clicker questions 
seemed natural to students within the context of a live classroom. Taking advantage of the 
spacing effect in this way, however, only provided indirect evidence of the role of repetition. 
Experiment 2 addressed the question more directly by comparing exam question performance 
after the presentation of a clicker question versus a second presentation of the information. Moreover, since the main 
evidence refuting repetition effects in experiment 1 was a nonsignificant result, experiment 2 
was also designed to provide positive evidence (i.e., a significant statistical result) in support of 
our hypothesis. 
 

Method 
 
Subjects 
 
Three hundred twenty students enrolled in two sections of General Psychology at the University 
of Massachusetts participated in the study. Students participated as part of their normal 
coursework, and earned participation points by correctly answering in-class questions. They 
ranged from freshmen to seniors and represented all five colleges across campus. IRB approval 
was sought prior to beginning the study and a waiver was granted. 
 
Materials and Procedure 
 
The same materials and procedure were used as in experiment 1, but with 
one change. Instead of half the exam questions being targeted with delayed clicker questions in 
each semester, half were targeted with a second, immediate presentation of the material.  In the 
clicker and repetition conditions, the same slide was used to present the information for the first 
time.  In the clicker condition a clicker question followed the slide. In the repetition condition a 
second PowerPoint slide that presented the relevant information in a slightly different way from 
the first was presented in lieu of a clicker question. In this way, the effect of a second, novel 
presentation on exam question performance could be compared with the effect of a clicker 
question.  A sample stimulus set from each condition is provided in Appendix A. In both 
conditions, the targeted information was presented verbally along with an accompanying slide. 
(In the Appendix A example, the targeted information was the role of the hypothalamus in 
hormone regulation.) In the repetition condition, the information was repeated with a new visual 
aid, while in the clicker condition students answered a question in lieu of seeing the second 
slide. 
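The within-subjects item assignment described above can be sketched in a few lines. The item labels and the across-semester flip are illustrative assumptions (the text says half the exam questions were targeted in each condition each semester, but the exact materials are not reproduced here):

```python
# Illustrative sketch: 16 targeted items, half assigned to the clicker
# condition and half to the repetition condition, with the assignment
# flipped in the second semester (a counterbalancing assumption).
items = [f"item_{i:02d}" for i in range(16)]

def assign(items, flip=False):
    """Split items evenly into (clicker, repetition) condition lists."""
    half = len(items) // 2
    clicker, repetition = items[:half], items[half:]
    return (repetition, clicker) if flip else (clicker, repetition)

sem1_clicker, sem1_repetition = assign(items)
sem2_clicker, sem2_repetition = assign(items, flip=True)

# Under the flip, every item serves in each condition in exactly one semester,
# so item effects are controlled across the pooled data.
assert set(sem1_clicker) == set(sem2_repetition)
assert set(sem1_repetition) == set(sem2_clicker)
```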
 

Results and Discussion 
 
Students who withdrew early from the course, those with attendance lower than 60%, and those 
who missed more than one exam were excluded from the data analysis. This yielded a total of 
290 students in the analysis. Paired t-tests were performed to compare performance between the 
clicker and repetition conditions. The results indicated significantly better performance in the 
clicker condition (M = 61.2, SD = 21.6) than in the repetition condition (M = 56.2, SD = 20.6) 
when analyzed by subjects, t(289) = 3.417, p = .001, d = .20. The effect was also significant 
when analyzed by items, t(15) = 2.419, p = .029, d = .60, with students performing better on 
items when the relevant content was presented with a clicker question (M = 60.7, SD = 10.4) 
rather than with a second presentation (M = 55.2, SD = 12.0). 
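The by-subjects comparison above can be sketched on simulated data. The per-student scores below are simulated, not the study's data: the means and standard deviations loosely follow the text, while the correlation between conditions is an assumption, so the exact t, p, and d will differ from the reported values.

```python
import numpy as np
from scipy import stats

# Simulated per-student percent-correct scores for the two conditions.
# Parameters loosely follow the reported means/SDs; the within-subject
# correlation is an assumption.
rng = np.random.default_rng(1)
n = 290
repetition = np.clip(rng.normal(56.2, 20.6, n), 0, 100)
clicker = np.clip(repetition + rng.normal(5.0, 25.0, n), 0, 100)

# Paired (within-subjects) t-test, as used for the by-subjects analysis
t, p = stats.ttest_rel(clicker, repetition)

# Cohen's d for paired samples: mean difference / SD of the differences
diff = clicker - repetition
d = diff.mean() / diff.std(ddof=1)
print(round(t, 2), round(p, 4), round(d, 2))
```

The by-items analysis is the same computation run over the 16 item means rather than the 290 subject means.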
 The results of experiment 2 converge with those of experiment 1 to support the 
hypothesis that clicker questions do not enhance retention of classroom material merely because 
they act as a second presentation of information. The 5-point increase (from 56.2 to 61.2) in the 
subject analysis represents a performance increase of 8.9%, although the effect size is rather 
small. The 5.5-point increase in the item analysis represents a 10% increase and a moderate 
effect size. While these results cannot rule out any role of 
repetition in clicker effects, they do provide compelling evidence that repetition is not the major 
source of the effect. 
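The relative improvements quoted above follow directly from the reported condition means (repetition 56.2 vs. clicker 61.2 by subjects; 55.2 vs. 60.7 by items):

```python
# Relative improvement implied by the reported condition means
def pct_gain(baseline, treatment):
    """Percentage increase of treatment over baseline."""
    return 100 * (treatment - baseline) / baseline

by_subjects = pct_gain(56.2, 61.2)   # repetition vs. clicker, subject analysis
by_items = pct_gain(55.2, 60.7)      # repetition vs. clicker, item analysis
print(round(by_subjects, 1), round(by_items, 1))  # → 8.9 10.0
```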
 

General Discussion and Conclusions 
 
Shapiro and Gordon (2012) reported evidence that clicker effects are not attributable to drawing 
students’ attention to certain material. That study was not able to rule out repetition effects as an 
underlying cause of clicker-enhanced learning, however.  The present study addressed that 
possibility and demonstrated that repetition is unlikely to be a major contributor to the effect.  In 
doing so, it provides converging evidence with Shapiro and Gordon that the testing effect is 
likely to underlie clicker-enhanced learning.   

In a secondary analysis of experiment 1, we tried to determine whether feedback has a 
role in clicker effects, since feedback is an important variable in the testing effect.  The 
conclusions we were able to draw from those analyses are suggestive of some role of feedback, 
but do not paint a clear picture. The delayed clicker group performed worse on clicker questions 
than the immediate group but performed equivalently on exam questions, suggesting that 
corrective feedback helped.  However, a comparison of exam question performance when 
students correctly versus incorrectly answered the clicker questions revealed students performed 
better on exam questions when they got clicker questions right.  Indeed, students answering the 
clicker question incorrectly performed only as well on the exam questions as students that were 
not exposed to the clicker question at all. These results suggest corrective feedback had, at best, 
a weak effect on exam performance. Any conclusions drawn from the latter result, however, are 
tempered by the rather low effect size. On balance, then, the present results are suggestive of 
some role of corrective feedback in clicker-based learning. That conclusion is compatible with 
the large literature on the role of feedback in the testing effect.  Certainly, feedback should be an 
important area for future inquiry. 
 Regardless of the feedback question, the results do converge with Shapiro and Gordon 
(2012) to support the conclusion that the testing effect is the most likely mechanism underlying 
clicker effects. The notion of testing itself causing cognitive change is supported by the 
extensive work of Karpicke and colleagues (e.g., Karpicke & Roediger, 2007a; 2008) on the 
testing effect. As Bjork (1975) suggests, the act of retrieving memories may strengthen the 
memory trace. Moreover, it may create new routes to memories that are more easily invoked 
during exams, with the context common to testing situations acting as a retrieval cue.  
 The present experiment was designed to test clicker use for enhancing fact-based 
learning alone. As such, the results do not support clicker use for problem-solving, application, 
or deep-level understanding of the material.  Within the context of fact-based learning, however, 
the present results are of practical importance for educators and students. As such, we can offer 
some concrete suggestions for effective use of clickers in the classroom. Specifically, we 
suggest that important factual content be targeted with clicker questions. The questions should 
be written specifically to require memory retrieval of the targeted information. We also suggest 
the questions be worded clearly and in a way that maximizes students' ability to answer them 
correctly. After all, if the testing effect is at the heart of clicker-enhanced learning, the goal 
should be to encourage students to retrieve the correct information from memory, thereby 
activating the testing effect. 

Finally, clickers appear to invoke a unique form of cognitive change in the classroom. If 
clicker effects were attributable to repetition or attention-grabbing, their value might be 
dubious. After all, there are many avenues through which to provide repetition or enhance 
attention inside and outside the classroom. Having demonstrated that clicker use effects 
cognitive change attributable to the testing effect (and quite possibly to feedback as well), the 
present results support clickers as a unique and valuable pedagogical classroom tool. Given the 
relatively low cost in terms of classroom time and equipment expense, the evidence in support 
of their educational benefit suggests they offer real value to students and instructors. 
 

References 
 
Agarwal, P. K., Karpicke, J. D., Kang, S. H. K., Roediger, H. L., & McDermott, K. B. (2008). 
Examining the testing effect with open- and closed-book tests. Applied Cognitive Psychology, 
22, 861-876. doi:10.1002/acp.1391 
 
Allen, G. A., Mahler, W. A., & Estes, W. K. (1969). Effects of recall tests on long-term 
retention of paired associates. Journal of Verbal Learning & Verbal Behavior, 8(4), 463-470. 
doi:10.1016/S0022-5371(69)80090-3 
 
Beekes, W. (2006). The "Millionaire" method for encouraging participation. Active Learning in 
Higher Education: The Journal of the Institute for Learning and Teaching, 7, 25-36. 
doi:10.1177/1469787406061143 
 
Benjamin, A., & Tullis, J. (2010). What makes distributed practice effective? Cognitive 
Psychology, 61, 228-247. doi:10.1016/j.cogpsych.2010.05.004 
 
Bjork, R. A. (1975). Retrieval as a memory modifier: An interpretation of negative recency and 
related phenomena. In R.L. Solso (Ed.), Information Processing and Cognition: The Loyola  
Symposium (pp. 123-144). Hillsdale, NJ: Erlbaum.  
 
Bjork, R. A., & Bjork, E. L. (1992). A new theory of disuse and an old theory of stimulus 
fluctuation. In A. Healy, S. Kosslyn, & R. Shiffrin (Eds.), From Learning Processes to 
Cognitive Processes: Essays in Honor of William K. Estes Volume 2 (pp. 35-67). Hillsdale, NJ: 
Erlbaum. 
 
Butler, A. C., Karpicke, J. D., & Roediger, H. L. (2007). The effect of type and timing of 
feedback on learning from multiple-choice tests. Journal of Experimental Psychology: Applied, 
13, 273-281. doi:10.1037/1076-898X.13.4.273 
 
Butler, A. C., & Roediger, H. L. (2007). Testing improves long-term retention in a simulated 
classroom setting. European Journal of Cognitive Psychology, 19, 514-527. 
 
Carrier, M., & Pashler, H. (1992). The influence of retrieval on retention. Memory & Cognition, 
20, 633-642. doi:10.3758/BF03202713 
 
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J.T., & Rohrer, D. (2006). Distributed practice in 
verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132, 354-380. 
doi:10.1037/0033-2909.132.3.354 
 
Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. 
Journal of Verbal Learning & Verbal Behavior, 11, 671-684. doi:10.1016/S0022-5371(72)80001-X 
 
Duchastel, P. C. (1981). Retention of prose following testing with different types of tests. 
Contemporary Educational Psychology, 6, 217-226. doi:10.1016/0361-476X(81)90002-3 
 
Ebbinghaus, H. (1913). Memory: A contribution to experimental psychology. (H. A. Ruger & C. 
E. Bussenius, Trans.). New York: Teachers College Press. 
 
Glenberg, A. M. (1979). Component-levels theory of the effects of spacing of repetitions on 
recall and recognition. Memory & Cognition, 7, 95–112. 
 
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 
77, 81-112. doi:10.3102/003465430298487 
Hintzman, D. L. (1974). Theoretical implications of the spacing effect. In R. L. Solso (Ed.), 
Theories in cognitive psychology: The Loyola symposium (pp. 77–97). Potomac, MD: Erlbaum. 
 
Jacoby, L. L. (1978). On interpreting the effects of repetitions: Solving a problem versus 
remembering a solution. Journal of Verbal Learning and Verbal Behavior, 17, 649-667. doi: 
10.1016/S0022-5371(78)90393-6 
 
Kang, S. H. K., McDermott, K. B., & Roediger, H. L. (2007). Test format and corrective 
feedback modify the effect of testing on long-term retention. European Journal of Cognitive 
Psychology, 19, 528-558. doi: 10.1080/09541440601056620 
 
Karpicke, J. D., & Roediger, H. L. (2007a). Repeated retrieval during learning is the key to 
long-term retention. Journal of Memory and Language, 57, 151-162.  
doi: 10.1016/j.jml.2006.09.004 
 
Karpicke, J. D., & Roediger, H. L. (2007b). Expanding retrieval practice promotes short-term 
retention, but equally spaced retrieval enhances long-term retention. Journal of Experimental 
Psychology: Learning, Memory, and Cognition, 33, 704-719. doi: 10.1037/0278-7393.33.4.704 
 
Karpicke, J. D., & Roediger, H. L. (2008). The critical importance of retrieval for learning. 
Science, 319, 966-968. doi: 10.1126/science.1152408 
 
Kennedy, G. E., & Cutts, Q. I. (2005). The association between students' use of an electronic 
voting system and their learning outcomes. Journal of Computer Assisted Learning, 21, 260-
268. doi:10.1111/j.1365-2729.2005.00133.x 
 
Kluger, A., & DeNisi, A. (1996). The effects of feedback interventions on performance: A 
historical review, a meta-analysis, and a preliminary feedback intervention theory. 
Psychological Bulletin, 119, 254-284. doi:10.1037/0033-2909.119.2.254 
 
Kulhavy, R. W. (1977). Feedback in written instruction. Review of Educational Research, 47, 
211-232. doi:10.2307/1170128 
 
Marsh, E. J., Agarwal, P. K., & Roediger, H. L. (2009). Memorial consequences of answering 
SAT II questions. Journal of Experimental Psychology: Applied, 15, 1-11. 
doi:10.1037/a0014721 
 
Mayer, R. E., Stull, A., DeLeeuw, K., Almeroth, K., Bimber, B., Chun, D., Bulger, M., Campbell, 
J., Knight, A., & Zhang, H. (2009). Clickers in college classrooms: Fostering learning with 
questioning methods in large lecture classes. Contemporary Educational Psychology, 34, 
51-57. doi:10.1016/j.cedpsych.2008.04.002 

McDaniel, M. A., & Masson, M. E. J. (1985). Altering memory representations through 
retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 371-385.  
doi:10.1037//0278-7393.11.2.371 
McDaniel, M. A., Anderson, J. L., Derbish, M. H., & Morrisette, N. (2007). Testing the testing 
effect in the classroom. European Journal of Cognitive Psychology, 19, 494-513. 
doi:10.1080/09541440701326154 
 
Morling, B., McAuliffe, M., Cohen, L., & DiLorenzo, T. (2008). Efficacy of personal response 
systems ("clickers") in large, introductory psychology classes. Teaching of Psychology, 35, 
45-50. doi:10.1080/00986280701818516 
 
Nungester, R. J., & Duchastel, P. C. (1982). Testing versus review: Effects on retention. Journal 
of Educational Psychology, 74, 18-22. doi:10.1037/0022-0663.74.1.18 
 
Pashler, H., Cepeda, N. J., Wixted, J. T., & Rohrer, D. (2005). When does feedback facilitate 
learning of words? Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 
3-8. doi:10.1037/0278-7393.31.1.3 
 
Poirier, C. R., & Feldman, R.S. (2007). Promoting active learning using individual response 
technology in large introductory psychology classes. Teaching of Psychology, 34, 194-196. 
doi:10.1080/00986280701498665 
 
Raney, G. (2003). A context-dependent representation model for explaining text repetition 
effects. Psychonomic Bulletin & Review, 10, 15-28. doi:10.3758/BF03196466 
 
Ribbens, E. (2007). Why I like clicker personal response systems. Journal of College Science 
Teaching, 37, 60-62. 
 
Roediger, H. L., & Karpicke, J. D. (2006a). Test-enhanced learning: Taking memory tests 
improves long-term retention. Psychological Science, 17, 249-255.  
doi: 10.1111/j.1467-9280.2006.01693.x 
 
Roediger, H. L., & Karpicke, J. D. (2006b). The power of testing memory: Basic research and 
implications for educational practice. Perspectives on Psychological Science, 1, 181-210.  
doi:10.1111/j.1745-6916.2006.00012.x 
 
Sassenrath, J.M., & Garverick, C.M. (1965). Effects of differential feedback from examinations 
on retention and transfer. Journal of Educational Psychology, 56, 259-263. 
doi:10.1037/h0022474 
 
Scarborough, D. L., Cortese, C., & Scarborough, H. S. (1977). Frequency and repetition effects 
in lexical memory. Journal of Experimental Psychology: Human Perception & Performance, 3, 
1-17. doi:10.1037//0096-1523.3.1.1 
 
Shapiro, A. M. (2009). An empirical study of personal response technology for improving 
attendance and learning in a large class. Journal of the Scholarship of Teaching and Learning, 
9, 13-26. 
Shapiro, A.M., & Gordon, L.T. (2012). A controlled study of clicker-assisted memory 
enhancement in college classrooms. Applied Cognitive Psychology, 26, 635–643. doi: 
10.1002/acp.2843.  
 

Shih, M., Rogers, R., Hart, D., Phillis, R., & Lavoie, N. (2008, April). Community of practice: 
The use of personal response system technology in large lectures. Paper presented at the 
University of Massachusetts Conference on Information Technology, Boxborough, MA. 

Stowell, J., & Nelson, J. (2007). Benefits of electronic audience response systems on student 
participation, learning, and emotion. Teaching of Psychology, 34, 253-258. 
doi:10.1080/00986280701700391 
 
Szpunar, K. K., McDermott, K. B., & Roediger, H. L. (2008). Testing during study insulates 
against the buildup of proactive interference. Journal of Experimental Psychology: Learning, 
Memory, and Cognition, 34, 1392-1399. doi:10.1037/a0013082 
 
Thompson, C. P., Wenger, S. K., & Bartling, C. A. (1978). How recall facilitates subsequent 
recall: A reappraisal. Journal of Experimental Psychology: Human Learning and Memory, 4, 
210-221. doi:10.1037/0278-7393.4.3.210 
 
Thorndike, E. L. (1913). Educational psychology: Vol. 1. The original nature of man. New 
York: Columbia University. 
 
Tulving, E. (1967). The effects of presentation and recall of material in free-recall verbal 
learning. Journal of Verbal Learning and Verbal Behavior, 6, 175-184. doi:10.1016/S0022-
5371(67)80092-6 
 

Vojdanoska, M., Cranney, J., & Newell, B. (2010). The testing effect: The role of feedback and 
collaboration in a tertiary classroom setting. Applied Cognitive Psychology, 24, 1183-1195. 
doi:10.1002/acp.1630 

Appendix 

Appendix A. Sample Stimulus Set. 

Sample item in the clicker and repetition conditions, reproduced in grayscale.  

TARGETED EXAM QUESTION:   

Which brain structure exerts considerable influence over the secretion of hormones 
throughout the body? 

A. the hypothalamus 
B. the amygdala 
C. the hippocampus 
D. the thalamus 

 
SLIDES (slide content transcribed; the original figure showed the slides as images): 
 
Clicker condition 
 First presentation (slide): "Hypothalamus. Located deep in the brain; controls hormones 
 and regulates a number of functions." 
 Second presentation (iClicker question): "Which of the following is NOT a function of the 
 hypothalamus? 1. Hormone regulation; 2. Thirst; 3. Sleep; 4. All of these are hypothalamus 
 functions." 
 
Repetition condition 
 First presentation (slide): "Hypothalamus. Located deep in the brain; controls hormones 
 and regulates a number of functions." 
 Second presentation (slide): "Hypothalamus. Temperature regulation; controls hormones 
 (endocrine system); sexual activity; hunger; thirst; sleep." 