ALT-J, Research in Learning Technology, Vol. 12, No. 3, September 2004
ISSN 0968-7769 (print)/ISSN 1741-1629 (online)/04/030215-15
© 2004 Association for Learning Technology
DOI: 10.1080/0968776042000259546

Implementation of computer-assisted assessment: lessons from the literature

Gavin Sim*, Phil Holifield & Martin Brown
University of Central Lancashire, UK

This paper draws attention to literature surrounding the subject of computer-assisted assessment (CAA). A brief overview of traditional methods of assessment is presented, highlighting areas of concern in existing techniques. CAA is then defined, and instances of its introduction in various educational spheres are identified, with the main focus of the paper concerning the implementation of CAA. Through referenced articles, evidence is offered to inform practitioners and direct further research into CAA from a technological and pedagogical perspective. This includes issues relating to interoperability of questions, security, test construction and testing higher cognitive skills. The paper concludes by suggesting that an institutional strategy for CAA, coupled with staff development in test construction for a CAA environment, can increase the chances of successful implementation.

Introduction

This paper presents evidence that the more traditional methods of assessment within universities have their limitations. As a result of these limitations, and the continued increase in the use of technology to deliver the curriculum, the gap between assessment methods and learning is widening.
Students entering higher education directly from schools and colleges are likely to have been exposed to Information Technology as part of the UK National Curriculum. Pilot studies conducted within schools for the delivery of summative assessment via the web (Ashton et al., 2003; Nugent, 2003) and for basic key skills tests in both Learn Direct and army centres (Sealey et al., 2003) indicate that CAA can successfully assess students and provide timely feedback regarding class and individual progress. There is also empirical evidence to suggest students find CAA an acceptable assessment technique (Sambell et al., 1999; Croft et al., 2001; Ricketts & Wilks, 2002a). Therefore it could be argued that for many students CAA may become a more widely used method of assessment in schools, further education and universities. Many universities are now using technology in their assessment strategies (Stephens & Mascia, 1997) and, by examining the literature, lessons can be learned to facilitate the successful implementation of computer-assisted assessment.

Methodology

A comprehensive literature review was conducted primarily using online resources, although library resources were also used. The searching centred on the databases Ingenta, AACE and Science Direct, and conference proceedings such as those of the International Computer Assisted Assessment Conference. Keyword searching was problematic and time consuming; for example, a search using 'computer assessment' would produce a divergent array of articles in excess of one thousand. Other terminology used in the search included 'computer based testing', 'computer-based assessment', 'computer aided assessment' and 'e-assessment'.

* Corresponding author: Department of Computing, Computing and Technology Building, University of Central Lancashire, Preston PR1 2HE, UK. Email: grsim@uclan.ac.uk
Browsing through the contents of entire journals, such as Assessment and Evaluation in Higher Education, was also adopted.

Assessment in general

Academic assessment can be administered through various techniques. Fifty varied techniques have been identified and used within higher education for assessment purposes (Knight, 2001); the most commonly used are exams and essays (Graham, 2004). However, this does not include all the methods now available within CAA packages, which, for example, incorporate questions that make use of multimedia. New assessment techniques will continue to emerge as technology and teaching methods change and develop; therefore continuing research will be required to determine the effectiveness and appropriateness of these methods. Each form of assessment presents its own difficulties, whether computer based or traditional. Essays present the problem of double marking: in one study both markers agreed only 52% of the time (Powers et al., 2002). Additionally there are the problems of cheating, as Internet sites offer custom-written and off-the-shelf essays (Crisp, 2002). It has been suggested that exams tend to encourage surface learning (Race, 1995) and may cause increased anxiety, resulting in significantly lower scores (Cassady & Johnson, 2002). Multiple choice question (MCQ) styles are used in both offline and CAA exams and raise a number of concerns, for example grade deflation through not enabling partial credit (Baranchik & Cherkas, 2000), poorly designed questions (Paxton, 2000; Jafarpur, 2003) and guessing (Burton, 2001). However, for lecturers the advantages of using computers to deliver MCQs include automated marking (Pollock et al., 2000), and for formative purposes students have the opportunity to study at their own pace, repeat questions and receive instant feedback (Loewenberger & Bull, 2003). It is the potential advantages of CAA that have driven research into ways to overcome the difficulties.
Ultimately, in an academic environment, the marks from summative assessment are accumulated to award an overall grade, and there are concerns over comparability across subject domains. It has been suggested that the scientific subjects produce more First Class Degrees than the humanities because their marking criteria use the full range of marks, and subjectivity is eliminated where there is a predefined correct answer (Yorke et al., 2002; Horney, 2003). These findings would appear to be further corroborated by the Higher Education Statistics Agency (HESA) figures. Of the students graduating from UK universities in 2001/02 in Mathematical Science, 25.5% passed with a First Class Degree, compared to 10.4% in Humanities (HESA, 2002), and this trend was also evident in other years, for example 1994/95 (HESA, 1995). CAA, like mathematics and some science subjects, also tends to use the full range of marks; therefore the trend towards a high proportion of First Class Degrees may occur in other subject domains adopting this technique in the future. There is pressure on lecturers not to fail students, and one study found that in professional subjects there is a tendency to leave the award of a fail to the next assessor (Hawe, 2003). Lecturers are confronted with emotional and ethical dilemmas when a close working relationship is formed, increasing their reluctance to award a fail (Sabar, 2002). The emotional and subjectivity issues that are evident in human-centred marking may be removed via the automatic marking offered by CAA software. It is important to recognize that some of the issues discussed are still prevalent in CAA, along with new challenges. Adopting a diverse assessment strategy may lead to a fairer assessment of the student (Race, 1995).
Computer-assisted assessment defined

From the literature there is a lack of universal consensus regarding the terminology and its definition; however, Bull and McKenna (2001) argue that computer-assisted assessment is the common term for the use of computers in the assessment of students, while other terms tend to focus on particular activities. Therefore the definition of CAA used in this review will be that CAA encompasses the use of computers to deliver, mark or analyse assignments or exams.

Variations in CAA

Within higher education institutions the application of CAA has occurred in a number of varied ways. These include adaptive testing (Latu & Chapman, 2002; Mills et al., 2002), analysis of the content of discussion boards (Macdonald & Twining, 2002; Wiltfelt et al., 2002), automated essay marking (Christie, 1999; Burstein et al., 2001), delivery of exam papers (Sim et al., 2003) and objective testing (Walker & Thompson, 2001; Pain & Le Heron, 2003). These methods vary considerably; however, this review will centre on the issues relating to implementing objective tests via CAA.

Testing cognitive skills with CAA

There is concern in the literature relating to CAA and its ability to test higher cognitive skills across subject domains (Daly & Waldron, 2002; Paterson, 2002). The higher cognitive skills are often associated with 'Analysis, Synthesis, and Evaluation' as defined in Bloom's Taxonomy (Bloom, 1956). However, a revised taxonomy takes into consideration the 'Knowledge Dimension' (Anderson & Krathwohl, 2001), and this has also been used in CAA research for the classification of questions (King & Duke-Williams, 2002; Mayer, 2002). Paterson (2002) indicated that it is not feasible to test the higher-level cognitive skills using CAA within mathematics. Bloom states that in the majority of instances Synthesis and Evaluation promote divergent thinking, and answers cannot be determined in advance (Bloom et al., 1971).
Heinrich and Wang (2003) argue that objective testing is still not sophisticated enough to examine complex content and thinking patterns. However, other research in linguistics and computer programming concluded that the higher-level skills can be assessed via CAA through innovative approaches (Cox & Clark, 1998; Reid, 2002). In the study by Reid (2002) a new language was devised and students were required to apply linguistic techniques in order to answer MCQs. It has been suggested that CAA tests of higher-level skills are more complex and costly to produce (Dowsing, 1998), and this may be because more innovative approaches are needed.

Question styles

Objective testing has been used within assessment for over forty years (Wood, 1960), and computer programs delivering MCQs date back to the 1970s (Morgan, 1979). More sophisticated question styles have since emerged, enabling more diverse assessment methods. The question styles delivered by the TRIADS software developed at Derby University are evidence of this evolution, offering 17 question styles in 1999 (Mackenzie, 1999) and 39 in 2003 (CIAD, 2003). However, staff at the University of Liverpool using TRIADS found that this presented an additional problem, as they were unfamiliar with the new question styles and lacked confidence in writing suitable questions (McLaughlin et al., 2004). Staff development in writing suitable questions, together with guidelines, can be used to overcome these problems. For example, generic guidelines have been developed by Haladyna (1996); Herd and Clark (2002) present examples of the various question styles used in further education, whilst examples used within higher education can be found at http://www.caacentre.ac.uk. Although there are a large number of possible formats for CAA questions, it is possible to classify them into four distinct groups based on the human interaction technique required (CIAD, 2003). These groups are defined as point and click, move object, text entry and draw object.
Point and click

Point and click questions include Multiple Choice (MCQ) and Multiple Response (MRQ) items, which have both been used within assessment practice for a considerable time and as a result are often transformed into CAA (Ricketts & Wilks, 2002b). Ebel (1972) suggests that any understanding or ability that can be tested by means of any other technique, for instance essays, can also be tested by MCQ. More complex MCQ questions can be devised through assertion reasoning, resulting in the testing of higher cognitive skills (Bull & McKenna, 2001). Both MCQ and MRQ have inherent problems, such as reliance on true and false style questions, which students might perceive to be unfair (Wood, 1960). Davies also argues that the quality of an MCQ is dependent on the quality of the distracters and not the question (Davies, 2002).

Move object

Move object style questions focus on the movement of objects to predetermined positions on the screen. They are a variation of the MCQ format and are good for assessing students' understanding of relationships (Bull & McKenna, 2001). For example, in computing they could be used for the labelling of entity relationship diagrams, or in linguistics students could be presented with a poem and move the highlighted words to the appropriate word class. One problem is that when the number of moveable objects is equal to the number of targets, a student who knows all but one answer will automatically get full marks (Wood, 1960).

Text entry

Text entry questions consist of the input of short predefined answers, such as factual knowledge or syntax in computer programming. An advantage of this format is that students must supply the correct answer, removing the possibility of guessing (Bull & McKenna, 2001), and this style has been found to be the most demanding format for students (Reid, 2002).
There are problems associated with text entry within some subject domains such as mathematics, as mathematical expressions cannot easily be included in most commercial software (Croft et al., 2001; Paterson, 2002). Another problem associated with this question style is that an answer may be marked incorrect due to spelling mistakes, and the time-saving element may be reduced if lecturers need to manually check for spelling errors.

Draw object

This style is associated with drawing simple objects or lines. For example, students may be required to plot graphs, which can be automatically marked. This style of question is a high discriminator between strong and weak candidates (Mackenzie, 1999). There is little evidence in the literature concerning the effectiveness of this format, but this might be because commercial software such as Questionmark and I-Assess does not have this style in its templates.

Interoperability and question banks

Question banks which are authored and peer reviewed by academics are emerging, such as that of the Electrical and Electronic Engineering Assessment Network, which developed a database of questions in electrical and electronic engineering (Bull et al., 2002). One such bank will typically require 5,000 questions, making it unfeasible for a single institution to develop (Maughan et al., 2001). Constructing high quality questions is difficult, time consuming and expensive (Sclater et al., 2003), and issues arise in the interoperability of questions between CAA software (Lay & Sclater, 2001). There are several international standards established to enable the interoperability of questions between software applications (Herd & Clark, 2003). These specifications are based on a metadata structure for questions and their grouping together. Unless these interoperability standards are developed and utilized, question banks will have a limited life, as they cannot be used on a variety of delivery platforms (White & Davis, 2000).
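The interoperability specifications referred to above express each question as a structured XML item so that it can move between delivery platforms. The sketch below shows, in a hedged and simplified form, how a multiple-choice item might be serialised; the element names loosely follow the style of the IMS QTI 1.2 specification, but this is an illustration rather than a conformant implementation, and the question content is invented.

```python
import xml.etree.ElementTree as ET

def build_mcq_item(item_id, stem, options, correct_id):
    """Serialise one MCQ as a simplified, QTI-1.2-style XML item.
    Element names are illustrative, not guaranteed schema-valid."""
    item = ET.Element("item", ident=item_id)
    presentation = ET.SubElement(item, "presentation")
    material = ET.SubElement(presentation, "material")
    ET.SubElement(material, "mattext").text = stem
    response = ET.SubElement(presentation, "response_lid", ident="RESPONSE")
    choices = ET.SubElement(response, "render_choice")
    for opt_id, text in options:
        label = ET.SubElement(choices, "response_label", ident=opt_id)
        mat = ET.SubElement(label, "material")
        ET.SubElement(mat, "mattext").text = text
    # Response processing: one mark if the chosen option matches correct_id.
    scoring = ET.SubElement(item, "resprocessing")
    condition = ET.SubElement(scoring, "respcondition")
    conditionvar = ET.SubElement(condition, "conditionvar")
    ET.SubElement(conditionvar, "varequal", respident="RESPONSE").text = correct_id
    ET.SubElement(condition, "setvar", action="Set").text = "1"
    return ET.tostring(item, encoding="unicode")

xml_item = build_mcq_item(
    "Q001",
    "Which question style requires the student to supply the answer?",
    [("A", "Multiple choice"), ("B", "Text entry"), ("C", "Move object")],
    correct_id="B",
)
print(xml_item)
```

Because the item carries its own presentation and marking rules, any delivery engine that understands the shared vocabulary can render and score it, which is precisely why question banks depend on such standards being adopted.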
Systems are emerging that are IMS-QTI compliant (Instructional Management Systems – Question and Test Interoperability specification) to facilitate the exchange of questions (Daly, 2002; Bacon, 2003). The Centre for Educational Technology Interoperability Standards (www.cetis.ac.uk) offers comprehensive resources and information on the issues concerning interoperability, which may help direct further research.

Guessing

A number of the question styles associated with CAA can lead to artificially high marks through guessing (Bush, 1999), which has implications for setting the pass mark of the test. For example, setting a pass mark of 40% for an assessment of true/false answers would be inappropriate, as guessing alone would give an average of 50% (Harper, 2002). The problems of guessing may be addressed through various marking schemes, such as post-test correction (Bull & McKenna, 2001), negative marking (Bush, 1999), increasing the number of questions or combining the results from several tests (Burton & Miller, 1999), or increasing the number of distracters and the pass mark (Mackenzie & O'Hare, 2002). It has been suggested that negative marking is not generally implemented in the UK (McAlpine, 2002) and that post-test correction is only suitable with a single question style, because the formulae would vary depending on the number of distracters (Harper, 2003). Statistical analysis has resulted in various methods being developed to assist in test construction in order to reduce the effects of guessing. An empirical marking simulator to assist in scoring and test construction, based on a base-level guess factor, has been developed (Mackenzie & O'Hare, 2002); this program examines the mark distribution and measurement scale for a set of random answers, enabling tutors to establish the effects of guessing on their assessment.
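The effect of guessing, and the way negative marking counteracts it, can be illustrated with a small simulation in the spirit of the simulators described above. This is a generic sketch, not the Mackenzie and O'Hare program; the question counts and penalty value are illustrative assumptions.

```python
import random

def mean_guess_score(n_options, n_questions, penalty=0.0, trials=20000, seed=42):
    """Estimate the mean percentage mark achieved by answering every
    question with a uniformly random guess.

    penalty -- marks deducted per wrong answer (0 = no negative marking).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        marks = 0.0
        for _ in range(n_questions):
            if rng.randrange(n_options) == 0:   # the guess happens to be right
                marks += 1.0
            else:
                marks -= penalty
        total += marks
    return 100.0 * total / (trials * n_questions)

# True/false items with no penalty: a pure guesser averages about 50%,
# which is why a 40% pass mark would be inappropriate.
print(mean_guess_score(n_options=2, n_questions=40))

# Four-option MCQs with a penalty of 1/3 per wrong answer: the expected
# mark of a pure guesser falls to about 0%, since 1/4 - (3/4)(1/3) = 0.
print(mean_guess_score(n_options=4, n_questions=40, penalty=1/3))
```

Running such a simulation for a planned test lets a tutor see the mark distribution a cohort of pure guessers would produce, and set the pass mark or penalty accordingly.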
A formula awarding partial credit based on the mean score of an uneducated guesser has also been investigated (McCabe & Barrett, 2003). This allows MCQs to be unconstrained, similar to MRQ styles, enabling students to provide more than one answer, with their score weighted depending on the number of choices. For example, for an MCQ with one correct answer, four possible options and a maximum score of 3, a student who includes the correct answer but selects two options would score only 2 (2 = 3 - 1). Davies used a combination of predetermining the students' confidence in answering the question prior to seeing the distracters and negative marking, resulting in students perceiving this to be a fairer test of their abilities (Davies, 2002). There is a lack of evidence that any one specific technique generates more accurate results than any other. It could be argued that these techniques are unnecessary if the tests are well constructed (Bull & McKenna, 2001).

Accessibility

UK institutions now have to comply with the Special Educational Needs and Disability Act when preparing both teaching and assessment material (SENDA, 2001). The number of students in UK higher education registering a disability in 2000 was 22,290, and this has implications for CAA (Phipps & McCarthy, 2001). For example, a student with dyslexia may exert more cognitive resources in interpreting the question; ensuring the language is appropriate is therefore a necessity (Wiles & Ball, 2003). In addition, extra time may be required to complete the test, which may necessitate the publishing of two different assessments, one with a longer duration. Feedback from one dyslexic student regarding CAA indicated that they thought it provided a more level playing field in which they could demonstrate their knowledge (Jefferies et al., 2000).
Students with visual or physical impairments may struggle to answer move object and draw object style questions without the aid of assistive technology; they may need specially adapted input software and hardware such as touch screens, eyegaze systems or speech browsers. There are guidelines for general teaching; however, there is little evidence that guidelines for inclusive and accessible design in CAA are emerging (Wiles, 2002). For example, when multimedia elements such as video are used within the assessment, it may be necessary to provide an alternative paper-based version for students with sensory impairment. The introduction of an alternative, in this instance paper, poses the problem of ensuring comparability (Bennett et al., 1999). When identical tests are presented on computer and on paper they are not comparable (Clariana & Wallace, 2002), because there are numerous variables that impact on students' performance when questions are presented on a computer. These variables include the monitor (Schenkman et al., 1999), the way text is displayed on screen (Dyson & Kipping, 1997), the slower speed of reading from a monitor compared with paper (Mayes et al., 2001) and the problem of obtaining a feel for the exam when only a single question is presented at a time (Liu et al., 2001). The Web Accessibility Initiative (http://www.w3c.org/WAI/) has produced useful guidelines for promoting online accessibility which may be applicable to CAA, but this initiative does not address the issue of comparability between questions.

Institutional strategies for the adoption of CAA

The greatest barrier to the adoption of CAA by academics is lack of time, both to develop questions and to learn the software (Warburton & Conole, 2003). This may have contributed to the fact that the adoption of CAA has usually resulted from the impetus of enthusiastic individuals rather than strategic decisions (O'Leary & Cook, 2001; Daly & Waldron, 2002).
The perceived benefit of CAA in freeing lecturers' time can be elusive if no institutional strategy or support is offered (Stephens, 1994): successful implementation may be left to chance (Stephens et al., 1998) and CAA may be developed in an anarchic fashion (McKenna & Bull, 2000). Research conducted at the University of Portsmouth indicates that there is no time-saving benefit for courses with fewer than twenty students (Callear & King, 1997). In order to utilize the features within software packages, staff training and development are necessary (Boyle & O'Hare, 2003), and this may not be feasible without institutional support. Institutions adopting CAA are faced with the difficulty of evaluating and deciding upon the most appropriate CAA software. Without an institutional strategy, individual departments may adopt their own systems (O'Leary & Cook, 2001). This results in students having to cope with a number of different user interfaces and CAA formats, increased licence costs, and problems offering administrative and technical support. Even if an institution has a clear strategy, there are also problems in determining the selection criteria for software used to deliver assessment, and there is a lack of analysis within the literature (Valenti et al., 2002). Sclater and Howie (2003) contributed to this literature by defining the ultimate online assessment engine. This was achieved through a process of examining the user requirements of the system, and establishing the stakeholders and their functional requirements. This research may help institutions identify their needs and establish an appropriate evaluation methodology.
The following guidelines for an institutional strategy have been formulated by Loughborough University and the University of Luton: establish a coordinated CAA management policy for CAA unit(s) and each discipline on campus; establish a CAA unit; establish CAA discipline groups/committees; provide funding; organize staff development programmes; establish evaluation procedures; identify technical issues; and establish operational and administrative procedures (Stephens et al., 1998). BS7988 is a new British Standard code of practice that has been introduced to govern the use of information technology in the delivery of assessments (BS7988, 2002). The guidelines have various implications for the delivery of assessments; for example, it is recommended that students take a break after 1.5 hours, which has an impact on the invigilation process. If this recommendation is followed, procedures need to be established to prevent collusion between students during the break, or the tests need to be split into two separate sections. One of the difficulties for many institutions using CAA arises through the lack of resources to accommodate large cohorts of students sitting an exam simultaneously (Mackenzie et al., 2004). This problem can be alleviated through institutional support, and therefore, to fully utilize the benefits of CAA, an institutional strategy would appear necessary to increase the chance of successful implementation. These benefits are evident within a number of institutions with strategies, such as Ulster (Stevenson et al., 2002), Derby (Mackenzie et al., 2002), Coventry (Lloyd et al., 1996) and Loughborough (Croft et al., 2001).

Security

The move from traditional teaching environments and examination settings presents additional issues relating to security.
Frohlich (2000) states that in traditional environments it is possible to ensure the security of the exam papers and scripts, including their transportation to and from the exam venue. However, even under this system breaches in security do occur; for example, AQA had to replace 500,000 English and English Literature exam papers after a box had been tampered with (Curtis, 2003). Tannenbaum (1999) defines security in computer systems as consisting of procedures to ensure that individuals cannot access material for which they do not have authorisation. This is essential within a CAA environment, as questions and student details are stored in a database and the test data is usually sent over a local network or the Internet. Before computers were connected to the Internet it was relatively easy to have effective security measures (Mason, 2003), but the transmission of sensitive data over an insecure network requires additional security measures to be implemented. Encryption techniques can be used to ensure the security of the questions and answers when transmitting data over the Internet (Sim et al., 2003). To increase security, examinations can be loaded on to the server at the last minute (Whittington, 1999). If email is used to submit results there is a potential risk due to the lack of authentication (Hatton et al., 2002). Four security requirements have been identified by Luck and Joy: all submissions must be logged; it must be verified that a stored document used for the assessment is the same as the one used by the student; a feedback mechanism must inform students that their submission has been received; and the identity of the student must be established (Luck & Joy, 1999). With the majority of CAA software, students and administrators are required to have passwords, which are often the weakest link in terms of protection (Hindle, 2003).
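One standard mitigation for the password weakness noted above is to store only salted, iterated hashes of passwords, so that a compromised database does not directly reveal credentials. The sketch below uses Python's standard library; it is a generic illustration under assumed parameters (salt size, iteration count), and no cited CAA package is claimed to work this way.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Derive a salted hash suitable for storage; the plain-text
    password itself is never written to the database."""
    if salt is None:
        salt = os.urandom(16)                      # fresh random salt per account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, 100_000)    # iteration count is illustrative
    return salt, digest

def verify_password(password, salt, stored_digest):
    """Re-derive the hash and compare in constant time, which resists
    timing attacks on the comparison itself."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("letmein", salt, stored))                       # False
```

The per-account random salt ensures that two users with the same password produce different stored hashes, and the high iteration count slows brute-force attempts against a stolen database.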
Although it is an unlikely event, students could gain access to the administrator password and change their results, or gain access to the questions. Other concerns include authentication and invigilation of the students, which are particularly problematic in remote locations (Thomas et al., 2002). At present, students enrolled on distance learning courses overseas need to sit exams in a specific location, such as the British Council offices, to enable authentication and invigilation. Research is being conducted to overcome these problems, but unless solutions are found geographical barriers will remain, as students need access to the test centres. During the test, computers need to be locked down to remove the possibility of accessing other content, and secure browsers such as Questionmark Secure have been developed to enable this (Kleeman & Osborne, 2002). There are operational risks associated with CAA that have security implications, such as the server crashing, and these risks need to be identified and procedures established to minimize them (Zakrzewski & Steven, 2003). There are software standards for security, for example the British Standard on Information Security Management, BS7799, which has also been adopted as the International Standard IS17799. In addition, when data from the test has been collected, institutions within the UK should abide by the Data Protection Act 1998 (Mason, 2003). If security measures are in place, there is no evidence to suggest that the integrity of the examination is more compromised by delivery over the Internet than by paper.

Conclusion

The implementation of CAA from a technical and pedagogical perspective is a complex process. The first, and perhaps the most important, lesson that can be learned is that an institutional strategy would seem to greatly increase the chances of success. Recommendations have been made to assist policy makers in formulating an effective strategy.
Without institutional support, implementing security procedures such as locking down PCs may be more problematic. However, authentication and invigilation in remote locations is still an issue that has yet to be fully resolved. The other important lesson that can be learned relates to staff development and training in test construction within a CAA environment. Focused staff development may help alleviate a number of issues, such as guessing, testing various cognitive skills, using appropriate question styles and accessibility. The emergence of question banks may also address these issues, depending on their level of interoperability. Another issue is that, whilst there are guidelines relating to accessible online content, there are still no formal guidelines relating to CAA. The reliance on a single method of assessment is problematic, and a diverse assessment strategy is usually necessary. Within an environment of increasing student numbers and a reduction in the staff-to-student ratio, CAA would appear to be a partial solution. This study has highlighted the issues surrounding the implementation of CAA to both inform and direct further research in the field.

References

Anderson, L. W. & Krathwohl, D. R. (2001) A taxonomy for learning, teaching, and assessing: a revision of Bloom's taxonomy of educational objectives (New York, Longman).
Ashton, H. S., Schofield, D. K. & Woodger, S. C. (2003) Pilot summative web assessment in secondary education, Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 19–29.
Bacon, R. A. (2003) Assessing the use of a new QTI assessment tool within Physics, Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 33–44.
Baranchik, A. & Cherkas, B. (2000) Correcting grade deflation caused by multiple-choice scoring, International Journal of Mathematical Education in Science and Technology, 31(3), 371–380.
Bennett, R. E., Goodman, M., Hessinger, J., Kahn, H., Liggett, J., Marshall, G. & Zack, J. (1999) Using multimedia in large-scale computer-based testing programs, Computers in Human Behaviour, 15(3), 283–294.
Bloom, B. S. (1956) Taxonomy of educational objectives: the classification of educational goals. Handbook 1. Cognitive domain (New York, Longman).
Bloom, B. S., Hastings, J. T. & Madaus, G. F. (1971) Handbook on formative and summative evaluation of student learning (New York, McGraw-Hill Books).
Boyle, A. & O'Hare, D. (2003) Finding appropriate methods to assure quality computer-based assessment development in UK higher education, Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 67–82.
BS7988 (2002) Code of practice for the use of information technology (IT) in the delivery of assessments.
Bull, J., Conole, G., Davis, H. C., White, S., Danson, M. & Sclater, N. (2002) Rethinking assessment through learning technologies, Proceedings of ASCILITE 2002 (Auckland, UNITEC), 1–12.
Bull, J. & McKenna, C. (2001) Blueprint for computer-assisted assessment (Loughborough, Loughborough University).
Burstein, J., Leacock, C. & Swartz, R. (2001) Automated evaluation of essays and short answers, Proceedings of the 5th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Burton, R. F. (2001) Quantifying the effects of chance in multiple choice and true/false tests: question selection and guessing of answers, Assessment and Evaluation in Higher Education, 26(1), 41–50.
Burton, R. F. & Miller, D. J. (1999) Statistical modelling of multiple-choice and true/false tests: ways of considering, and of reducing, the uncertainties attributed to guessing, Assessment and Evaluation in Higher Education, 24(4), 399–411.
Bush, M.
(1999) Alternative marking schemes for on-line multiple choice tests, Proceedings of the 7th Annual Conference on the Teaching of Computing (Belfast, Elsevier). Callear, D. & King, T. (1997) Using computer-based tests, ALT-J, 5(1), 27–32. Cassady, J. C. & Johnson, R. E. (2002) Cognitive test anxiety and academic performance, Contem- porary Educational Psychology, 27(2), 270–295. Christie, J. R. (1999) Automated essay marking for both content and style, Proceedings of the 3rd Annual Computer Assisted Assessment Conference (Loughborough, Loughborough Univer- sity). CIAD (2003) Summary of question styles. Available online at: http://www.derby.ac.uk/ciad/ ciastyles.html (accessed 30 June 2003). Clariana, R. & Wallace, P. (2002) Paper-based versus computer-based assessment: key factors associated with test mode effect, British Journal of Educational Technology, 33(5), 593–602. Cox, K. & Clark, D. (1998) The use of formative quizzez for deep learning, Computers and Educa- tion, 30(3), 157–167. Crisp, B. R. (2002) Assessment methods in social work education: a review of the literature, Social Work Education, 21(2), 259–269. Croft, A. C., Danson, M., Dawson, B. R. & Ward, J. P. (2001) Experience of using computer assisted assessment in engineering mathematics, Computers and Education, 37(1), 53–66. Curtis, P. (2003) Missing paper sparks exam reprint (London, Guardian). Daly, C. & Waldron, J. (2002) Introductory programming, problem solving and computer assisted assessment, Proceedings of the 6th International Computer Assisted Assessment Conference, (Loughborough, Loughborough University), 95–106. Daly, J. (2002) An XML question bank using Microsoft Office, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 107–118. Davies, P. (2002) There’s no confidence in multiple-choice testing, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 119–132. 
Dowsing, R. D. (1998) Flexibility and the technology of computer aided assessment, Proceedings of the ASCILITE 1998 (Wollongong, University of Wollongong), 163–171.
Dyson, M. C. & Kipping, G. J. (1997) The legibility of screen formats: are three columns better than one? Computers and Graphics, 21(6), 703–712.
Ebel, R. L. (1972) Essentials of educational measurement (Englewood Cliffs, Prentice-Hall).
Frohlich, R. (2000) Keeping the wolves from the door, wolves in sheep clothing, that is, Proceedings of the 4th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Graham, D. (2004) A survey of assessment methods employed in UK higher education programmes for HCI courses, Proceedings of the 7th HCI Educators Workshop (Preston, LTSN), 66–69.
Haladyna, T. M. (1996) Writing test items to evaluate higher order thinking (Needham Heights, Allyn & Bacon).
Harper, R. (2002) Allowing for guessing and for expectations from the learning outcomes in computer-based assessments, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 139–150.
Harper, R. (2003) Correcting computer-based assessment for guessing, Journal of Computer Assisted Learning, 19(1), 2–8.
Hatton, S., Boyle, A., Byrne, S. & Wooff, C. (2002) The use of PGP to provide secure email delivery of CAA results, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 149–160.
Hawe, E. (2003) It's pretty difficult to fail: the reluctance of lecturers to award a fail grade, Assessment and Evaluation in Higher Education, 28(4), 371–382.
Heinrich, E. & Wang, Y. (2003) Online marking of essay-type assignments, Proceedings of the World Conference on Educational Multimedia, Hypermedia and Telecommunications (Hawaii, AACE), 768–772.
Herd, G. & Clark, G. (2002) Computer assisted assessment implementing CAA in FE sector in Scotland: question types (Glenrothes, Glenrothes College).
Herd, G. & Clark, G. (2003) CAA implementation in the FE sector in Scotland (Glenrothes, Glenrothes College).
HESA (1995) Students in higher education institutions 1994/95 (London, HMSO).
HESA (2002) Students in higher education institutions 2001/02 (London, HMSO).
Hindle, S. (2003) Careless about privacy, Computers and Security, 22(4), 284–288.
Hornby, W. (2003) Assessing using grade-related criteria: a single currency for universities? Assessment and Evaluation in Higher Education, 28(4), 435–454.
Jafarpur, A. (2003) Is the test constructor a facet? Language Testing, 20(1), 57–87.
Jefferies, P., Constable, I., Kiely, B., Richardson, D. & Abraham, A. (2000) Computer aided assessment using WebCT, Proceedings of the 4th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
King, T. & Duke-Williams, E. (2002) Using computer aided assessment to test higher level learning outcomes, Proceedings of the 5th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Kleeman, J. & Osborne, C. (2002) A practical look at delivering assessment to BS7988 recommendations, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 163–170.
Knight, P. (2001) A briefing on key concepts: formative and summative, criterion and norm-referenced assessment (York, LTSN Generic Centre).
Latu, E. & Chapman, E. (2002) Computerised adaptive testing, British Journal of Educational Technology, 33(5), 619–622.
Lay, S. & Sclater, N. (2001) Question and test interoperability: an update on national and international developments, Proceedings of the 5th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Liu, M., Papathanasiou, E. & Hao, Y. (2001) Exploring the use of multimedia examination formats in undergraduate teaching: results from the fielding testing, Computers in Human Behaviour, 17(3), 225–248.
Lloyd, D., Martin, J. G. & McCaffery, K. (1996) The introduction of computer based testing on an engineering technology course, Assessment and Evaluation in Higher Education, 21(1), 83–90.
Loewenberger, P. & Bull, J. (2003) Cost-effectiveness analysis of computer-based assessment, ALT-J, 11(2), 23–45.
Luck, M. & Joy, M. (1999) A secure on-line submission system, Software—Practice and Experience, 29(8), 721–740.
Macdonald, J. & Twining, P. (2002) Assessing activity-based learning for a networked course, British Journal of Educational Technology, 33(5), 603–618.
Mackenzie, D. (1999) Recent developments in the tripartite interactive assessment delivery system (TRIADS), Proceedings of the 3rd Annual Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Mackenzie, D., Hallam, B., Baggott, G. & Potts, J. (2002) TRIADS experiences and developments. A panel discussion, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Mackenzie, D. & O'Hare, D. (2002) Empirical prediction of the measurement scale and base level 'Guess Factor' for advanced computer-based assessment, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 187–201.
Mackenzie, D., O'Hare, D., Paul, C., Boyle, A., Edwards, D., Williams, D. & Wilkins, H. (2004) Assessment for learning: the TRIADS assessment of learning outcomes project and the development of a pedagogically friendly computer-based assessment system, in: D. O'Hare & D. Mackenzie (Eds) Advances in computer aided assessment (Birmingham, SEDA), 11–25.
Mason, S. (2003) Electronic security is a continuous process, Computer Fraud and Security, 2003(1), 13–15.
Maughan, S., Peet, D. & Willmot, A. (2001) On-line formative assessment item banking and learning support, Proceedings of the 5th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Mayer, R. E. (2002) A taxonomy for computer-based assessment of problem solving, Computers in Human Behaviour, 18(6), 623–632.
Mayes, D. K., Sims, V. K. & Koonce, J. M. (2001) Comprehension and workload differences for VDT and paper-based reading, International Journal of Industrial Ergonomics, 28(6), 367–378.
McAlpine, M. (2002) Principles of assessment (Luton, CAA Centre).
McCabe, M. & Barrett, D. (2003) CAA scoring strategies for partial credit and confidence levels, Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 209–219.
McKenna, C. & Bull, J. (2000) Quality assurance of computer-assisted assessment: practical and strategic issues, Quality Assurance in Education, 8(1), 24–31.
McLaughlin, P. J., Fowell, S. L., Dangerfield, P. H., Newton, D. J. & Perry, S. E. (2004) Development of computerised assessment (TRIADS) in an undergraduate medical school, in: D. O'Hare & D. Mackenzie (Eds) Advances in computer aided assessment (Birmingham, SEDA), 25–32.
Mills, C. N., Potenza, M. T., Fremer, J. J. & Ward, W. C. (2002) Computer-based testing. Building the foundation for future assessments (Mahwah, Lawrence Erlbaum Associates).
Morgan, M. R. J. (1979) MCQ: an interactive computer program for multiple-choice self testing, Biochemical Education, 7(3), 67–69.
Nugent, G. (2003) On-line multimedia assessment for K-4 students, Proceedings of the World Conference on Educational Multimedia, Hypermedia and Telecommunications (Hawaii, AACE), 1051–1057.
O'Leary, R. & Cook, J. (2001) Wading through treacle: CAA at the University of Bristol, Proceedings of the 5th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Pain, D. & Le Heron, J. (2003) WebCT and online assessment: the best thing since SOAP? Educational Technology and Society, 6(2), 62–71.
Paterson, J. S. (2002) Linking on-line assessment in mathematics to cognitive skills, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 295–306.
Paxton, M. (2000) A linguistic perspective on multiple choice questioning, Assessment and Evaluation in Higher Education, 25(2), 109–119.
Phipps, L. & McCarthy, D. (2001) Computer assisted assessment and disabilities, Proceedings of the 5th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Pollock, M. J., Whittington, C. D. & Doughty, G. F. (2000) Evaluating the costs and benefits of changing to CAA, Proceedings of the 4th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Powers, D. E., Burstein, J. C., Chodorow, M., Fowles, M. E. & Kukich, K. (2002) Stumping e-rater: challenging the validity of automated essay scoring, Computers in Human Behaviour, 18(2), 103–134.
Race, P. (1995) The art of assessing, The New Academic, 4(3).
Reid, N. (2002) Designing online quiz questions to assess a range of cognitive skills, Proceedings of the World Conference on Educational Multimedia, Hypermedia and Telecommunications (Denver, AACE), 1625–1630.
Ricketts, C. & Wilks, S. (2002a) What factors affect students' opinions of computer-assisted assessment? Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 307–316.
Ricketts, C. & Wilks, S. (2002b) Improving student performance through computer-based assessment: insights from recent research, Assessment and Evaluation in Higher Education, 27(5), 475–479.
Sabar, N. (2002) Towards principled practice in evaluation: learning from instructors' dilemmas in evaluating graduate students, Studies in Educational Evaluation, 28(4), 329–345.
Sambell, K., Sambell, A. & Sexton, G. (1999) Students' perception of the learning benefits of computer-assisted assessment: a case study in electronic engineering, in: S. Brown, J. Bull & P. Race (Eds) Computer-assisted assessment in higher education (Birmingham, SEDA), 179–191.
Schenkman, B., Fukuda, T. & Persson, B. (1999) Glare from monitors measured with subjective scales and eye movements, Displays, 20, 11–21.
Sclater, N., Davis, H. C., White, S. A., Conole, G. C. & Danson, M. (2003) Technologies for online interoperable assessment, Proceedings of the CAL03 (Belfast, Elsevier).
Sclater, N. & Howie, K. (2003) User requirements of the 'ultimate' online assessment engine, Computers and Education, 40(3), 285–306.
Sealey, C., Humphries, P. & Reppert, D. (2003) At the coal face: experience of computer based exams, Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 357–376.
SENDA (2001) Special Educational Needs and Disability Act 2001. Available online at: http://www.hmso.gov.uk/acts/acts2001/20010010.htm (accessed 10 March 2004).
Sim, G., Malik, N. A. & Holifield, P. (2003) Strategies for large-scale assessment: an institutional analysis of research and practice in a virtual university, Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 379–390.
Stephens, D. (1994) Using computer-assisted assessment: time saver or sophisticated distractor? Active Learning, 1, 11–15.
Stephens, D., Bull, J. & Wade, W. (1998) Computer-assisted assessment: suggested guidelines for an institutional strategy, Assessment and Evaluation in Higher Education, 23(3), 283–294.
Stephens, D. & Mascia, J. (1997) Results of a survey into the use of computer-assisted assessment in institutions of higher education (Loughborough, Loughborough University). Available online at: http://www.lboro.ac.uk/service/ltd/flicaa/downloads/survey.pdf (accessed 18 June 2004).
Stevenson, A., Sweeny, P., Greenan, K. & Alexander, S. (2002) Integrating CAA within the University of Ulster, Proceedings of the 6th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 329–340.
Tannenbaum, R. S. (1999) Theoretical foundations of multimedia (New York, W. H. Freeman).
Thomas, P., Price, B., Paine, C. & Richards, M. (2002) Remote electronic examinations: student experience, British Journal of Educational Technology, 33(5), 537–549.
Valenti, S., Cucchiarelli, A. & Panti, M. (2002) Computer based assessment systems evaluation via the ISO 9126 quality model, Journal of Information Technology Education, 1(3), 157–175.
Walker, D. M. & Thompson, J. S. (2001) A note on multiple choice exams, with respect to students' risk preference and confidence, Assessment and Evaluation in Higher Education, 26(3), 261–267.
Warburton, B. & Conole, G. (2003) CAA in UK HEIs—the state of the art? Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 433–441.
White, S. & Davis, H. C. (2000) Creating large scale test banks: a briefing for participative discussion and agendas, Proceedings of the 4th International Computer Assisted Assessment Conference (Loughborough, Loughborough University).
Whittington, D. (1999) Technical and security issues, in: S. Brown, J. Bull & P. Race (Eds) Computer assisted assessment in higher education (Birmingham, SEDA), 21–28.
Wiles, K. (2002) Accessibility and computer-based assessment: a whole new set of issues? in: L. Phipps, A. Sutherland & J. Seale (Eds) Access all areas: disability, technology and learning (Oxford and York, ALT/TechDis), 61–66.
Wiles, K. & Ball, S. (2003) Constructing accessible CBA: minor works or major renovations? Proceedings of the 7th International Computer Assisted Assessment Conference (Loughborough, Loughborough University), 445–451.
Wiltfelt, C., Philipsen, P. E. & Kaiser, B. (2002) Chat as media in exams, Education and Information Technologies, 7(4), 343–349.
Wood, D. A. (1960) Test construction (Columbus, Charles E. Merrill Books).
Yorke, M., Barnett, G., Bridges, P., Evanson, P., Haines, C., Jenkins, D., Knight, P., Scurry, D., Stowell, M. & Woolf, H. (2002) Does grading method influence honours degree classification? Assessment and Evaluation in Higher Education, 27(3), 269–279.
Zakrzewski, S. & Steven, S. (2003) Computer-based assessment: quality assurance issues, the hub of the wheel, Assessment and Evaluation in Higher Education, 28(6), 609–623.