Evaluation: Towards a definition and statement of purpose

Jane Marsden

This paper explores various interpretations of the term "evaluation" in the context of training and development. It looks specifically at definitions and purposes of evaluation. It is proposed that, in Australia's current economic state and in the environment of the Training Guarantee, training professionals need to address the need for comprehensive evaluation practices.

The purpose of this paper is to develop a definition and statement of purpose for evaluation in the context of training and development. Evaluation is given a low priority in the instructional process, a contention that is supported by the small number of articles in the literature that deal with it. However, in the current economic environment and in the light of the Training Guarantee, training personnel are going to be faced with hard economic decisions about the viability and value of the programs they offer. They are going to need evidence of the quality of their programs in order to make such decisions and to influence the decisions of organisational management. This evidence can only come about through evaluation. Thus, evaluation must be given a high priority and must be fully incorporated into the instructional development process. In order for this to happen the meaning of evaluation must be clarified and its purpose(s) clearly identified. In this paper it is my intention to report on a limited review of the literature and to explore some of the notions of evaluation that are presented.

Evaluation of the Literature

In 1989, Marguerite Foxon reported on some of the findings of a literature search that she undertook dealing with evaluation. She reviewed the Australian, American and British journals for a 15 year period to 1986 and expressed surprise at the relatively small number of articles (80) on the subject. Six of these articles were in the Australian literature, five of which were in Training and Development in Australia. Of the remaining 74 articles, 34% were published in the Training and Development Journal. The total contribution to the literature on evaluation in these two journals averaged two articles per year for the 15 year period.

I have confined my own search to these two sources for the period 1987 to 1990 (May). In this three and a half year period I identified a total of 13 articles dealing with evaluation, representing an average of three articles per year. Although an improvement, this still reflects the regard that the profession has for evaluation: it remains a low priority. Only two of the 13 articles located are in the Australian journal.

In a survey of Training and Development Journal readers, 30% of respondents identified "evaluation of training as the most difficult part of the job" (Galagan, 1983, and Del Faizo, 1984, cited in Foxon, 1989, p 90). This finding is not surprising, since evaluation is poorly defined and has different meanings for different people in many different contexts. There is a strong dependence in the profession on the determination of trainee reactions to programs as a major means of evaluation. Foxon (1989, p 91) makes the point that many trainers see the "development and delivery of training as their primary concern, and evaluation something of an afterthought."
She suggests that the reliance on post-course reactions results from an inability to deal with quantitative measurement techniques and a lack of finances, time and expertise in comprehensive evaluation. Further, she suggests that training practitioners are confused by the term and do not understand what its "essential features" are nor what "purpose it should serve".

In my own review, only one author attempts to define evaluation. Wigley (1988, p 21) defines it as "a data reduction process that involves the collection of large amounts of data which are analysed and synthesised into an overall judgement of worth or merit". The implication here is that the judgement of worth can be supported by the data. In her review, Foxon (1989, pp 91-92) found similar definitions referring to judgements of "value or worth". What is not clear in any of the definitions offered is what is entailed in the criteria of worth.

It has been suggested that a major problem in arriving at a definition of evaluation is confusion with related terms such as measurement, assessment and validation (Foxon, 1989, p 92). This suggestion is a reasonable one if the literature is any indication of the way the training population perceives evaluation. Only four of the articles in the current study (Wigley, 1988; Brinkerhoff, 1988; Birnbrauer, 1987; and Bushnell, 1990) refer to a comprehensive approach to evaluation, involving the collection of data from the beginning of program design through to program completion and post program evaluation techniques utilising a variety of data collection methods. Of the remaining articles, five (O'Donnell, 1988; Harbour, 1988; Erkut and Fields, 1987; Weatherby and Gorosh, 1989; and Newstrom, 1987) use evaluation when referring to post course trainee reaction assessment, and two (Alliger and Horowitz, 1989; and Hewitt, 1989) use it to refer to performance and/or knowledge testing. Two articles (Bowsher, 1990; and Lombardo, 1989) highlight the costs and benefits of training within the organisation but fail to refer to cost/benefit analysis in the context of course evaluation!

Evaluation in Practice

Wigley (1988) describes a "production approach" to training in which evaluation activities are seen as being isolated from the training itself. In this approach evaluation is focused on statistics that describe the number of training days per year, the number of courses per year and the number of trainees attending each course, among other things. Whilst these statistics are useful in providing data about how popular the programs offered by the training department are, they have little effect in showing whether the training department is fulfilling any useful purpose for the organisation - unless "bums on seats" is seen as a useful purpose.

Foxon (1989, p 90) reported the results of a number of surveys involving HRD professionals in the United States, Britain and Australia in which it was found that the major form of evaluation, in many cases the only form, was end-of-course trainee reactions, and that the data from these evaluations were seldom used. The current search revealed that a number of authors (Harbour, 1988; Erkut and Fields, 1987; Weatherby and Gorosh, 1989) continue to rely heavily on the trainee reaction form of evaluation. The trainee reaction questionnaire, often referred to as the "smile sheet", is relatively easy to construct and administer when compared to other forms of evaluation, and, if kept simple enough, it is easy to analyse and report on the findings.
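To illustrate just how little effort such analysis demands, the following is a minimal sketch - not drawn from any of the articles reviewed - of how responses to a hypothetical five-point reaction questionnaire might be tallied; the item names and ratings are invented for the example.

```python
# Hypothetical "smile sheet" summary. Each trainee rates several items on a
# five-point scale (1 = poor, 5 = excellent). Items and ratings are invented
# for illustration only.

from statistics import mean

responses = {
    "Trainer presentation": [5, 4, 4, 5, 3, 4],
    "Course materials":     [3, 4, 2, 3, 4, 3],
    "Relevance to my job":  [4, 5, 4, 4, 5, 4],
    "Training facilities":  [2, 3, 3, 2, 4, 3],
}

# Report the mean rating and the number of responses for each item.
print(f"{'Item':<22}{'Mean':>6}{'Responses':>11}")
for item, ratings in responses.items():
    print(f"{item:<22}{mean(ratings):>6.2f}{len(ratings):>11}")
```

A tabulation of this kind can be produced in minutes, which goes some way to explaining the popularity of the reaction sheet as the sole form of evaluation.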
The data obtained can be useful in determining which trainers, training methods, aids and resources are popular and therefore likely to affect trainee motivation and participation. However, its usefulness is limited in that the data obtained are subjective and give little or no information about whether the training program contributed to or achieved the goals and objectives of the organisation, the training department, the particular program or the individual trainees. Only one article (Newstrom, 1987) has referred to a statistical analysis to determine the reliability of evaluation - in this case applied to the trainee reaction form of evaluation.

The impression that one is left with is that, for the most part, evaluation is seen as an activity that occurs at the completion of a training program, and that the practice of evaluation is confined to a limited number of activities. Those articles that describe a comprehensive approach to evaluation generally speak of the principle rather than the practice. In the current study there are no reports of a training program in which various evaluative data have been collected from the outset and continued through to follow-up assessment, with collation and analysis of the data and compilation of a report with recommendations as the final outcome. This perhaps sounds like an idealistic expectation, and indeed it is, but I believe that it is not only possible but essential if we are to be responsible and accountable to our clients for effective training programs. This, I believe, is the ultimate goal of evaluation - to demonstrate the accountability of the training department/consultant to the client for effective training programs.

Purpose of Evaluation

If this is the ultimate goal, then what is/are the purpose(s) of evaluation? For it is only in determining the purpose(s) that we can develop a working definition of the term. Foxon (1989, p 93) reports that more than 20% of the articles reviewed "neither describe nor imply a purpose for the evaluation." Of those that did, 15% saw the purpose as "justifying the training department's existence and providing evidence of cost benefit to the organisation" (Foxon, 1989, p 93). Interestingly, only 2% of the articles in Foxon's study considered assessment of trainee reactions to be the purpose of evaluation. Given this evidence one cannot help but wonder at the proliferation of this form of evaluation.

In the current review three of the articles provide a clear statement of purpose for evaluation. Hewitt (1989, p 23) sees the purpose of evaluation as providing data demonstrating the program's effectiveness on targeted behaviour. Wigley (1988, p 21) has a broader view of the purpose: to improve the program and facilitate informed decision making. Far and away the most comprehensive view of the purpose of evaluation is given by Bushnell (1990, p 41), who identifies four purposes: "to determine whether training programs are achieving the right purposes... to detect the types of changes they [the trainers] should make to improve course design, content, and delivery... [to tell the trainer] whether students actually acquire the needed knowledge and skills."; the ultimate purpose being to "balance the cost and results of training." All of Bushnell's purposes involve end stage evaluation methods; that is, the data that will fulfil them can only be obtained at the completion of the program.
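Bushnell's ultimate purpose - balancing the cost and results of training - implies some form of cost/benefit comparison, although none of the articles reviewed sets one out. A minimal sketch of such a comparison, using invented figures and a simple return-on-investment calculation rather than any method described in the cited studies, might look like the following.

```python
# Hypothetical cost/benefit sketch for a single training program.
# All figures are invented; a real analysis would draw costs from program
# records and benefits from measured performance changes after training.

program_costs = {
    "design and development":           12000.0,
    "trainer time":                      8000.0,
    "trainee wages (time off the job)": 15000.0,
    "materials and facilities":          5000.0,
}

# Estimated annual benefit attributed to the training, e.g. the value of
# reduced errors or time saved (an assumption for this example).
estimated_annual_benefit = 55000.0

total_cost = sum(program_costs.values())
net_benefit = estimated_annual_benefit - total_cost
roi_percent = (net_benefit / total_cost) * 100

print(f"Total cost:  ${total_cost:,.0f}")
print(f"Benefit:     ${estimated_annual_benefit:,.0f}")
print(f"Net benefit: ${net_benefit:,.0f}")
print(f"ROI:         {roi_percent:.0f}%")
```

Even a rough calculation of this kind gives management firmer ground than attendance statistics when deciding whether a program should continue.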
Another four of the articles in the current study (Weatherby and Gorosh, 1989; Erkut and Fields, 1987; Alliger and Horowitz, 1989; and Brinkerhoff, 1988) imply a purpose to the evaluation of training. These implications range from very vague phrases like "gauging the effectiveness of training materials and programs" (Erkut and Fields, 1987, p 74) to very specific and comprehensive descriptions such as that given by Brinkerhoff (1988, p 67) when he provides "six stages of HRD program development and operation":

1. to determine that an identified problem represents a training need and to determine what the real goals are;
2. to determine the most appropriate training strategy;
3. to determine if the chosen strategy is successfully implemented;
4. to determine if learning occurred and to what extent;
5. to determine usage outcomes (at individual level);
6. to determine impacts and worth (at organisational level).

Implicit in these six stages are seven corresponding purposes. These seven purposes require the utilisation of evaluation methods to obtain data continually from the beginning of the instructional process - needs analysis - to the completion of the process - performance assessment. In comparing the six-stage model to the widely referenced Kirkpatrick model (cited in Brinkerhoff, 1988, p 66; Poulet and Moult, 1987, p 64; and Newstrom, 1987, p 57), it is clear that the latter falls short of an ideal model because it is entirely outcome-oriented, whereas the six stage model is integrated to include instructional activities in the planning, design and implementation stages of the instructional process.

On first looking at the purposes implicit in Brinkerhoff's model, one can easily conclude that the two purposes identified at (1) above relate directly to needs assessment and, indeed, could even be interpreted to be the same as needs assessment. Closer examination of the model reveals this to be true to a certain extent, but it goes further to incorporate appraisal of the needs assessment tools and methods to ensure that they are truly exposing the real nature of the problem and the correct solution options. Similarly with the purpose given at (2) above, which could rightly be said to be the property of instructional planning or instructional design; yet in choosing the appropriate strategy one or more training strategies need to be considered, trialled and/or analysed to ensure that the correct one is identified. Thus, an evaluative procedure must be employed.

The remaining purposes equate readily to frequently stated or implied purposes, namely to determine trainee/trainer reactions to the training methods and materials utilised in a program, to assess trainee acquisition of knowledge and attitudes, to assess trainee performance, and to determine if organisational goals are met. It is my view that evaluation should fulfil all these purposes, with the overriding aim being to influence decisions about the need for future programs, the need to modify future programs, the need to modify instructional tools at all stages of the instructional process, and the need to provide cost/benefit data regarding the programs offered by the training department/consultant.
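To make this continuous, whole-of-process view concrete, the following is a schematic sketch - not drawn from Brinkerhoff or any other cited author - of how evaluation questions and data collection methods might be mapped to each stage of a program; the stage names and methods are assumptions for the purpose of the example.

```python
# Illustrative sketch only: an evaluation plan that pairs each stage of a
# program with the question asked of the evaluation and the data collected
# there. Stage names and methods are assumptions, not a prescribed model.

from dataclasses import dataclass, field

@dataclass
class StageEvaluation:
    stage: str      # phase of the instructional process
    question: str   # what the evaluation seeks to determine at this phase
    methods: list   # how the data are to be collected
    findings: list = field(default_factory=list)  # recorded as the program proceeds

plan = [
    StageEvaluation("Needs analysis", "Is the problem a training need, and what are the real goals?",
                    ["interviews", "skills audit"]),
    StageEvaluation("Design", "Is the chosen training strategy the most appropriate?",
                    ["pilot session", "expert review"]),
    StageEvaluation("Implementation", "Is the strategy being implemented as intended?",
                    ["observation", "trainee reaction sheets"]),
    StageEvaluation("Learning", "Did learning occur, and to what extent?",
                    ["pre/post tests"]),
    StageEvaluation("Transfer", "Are the new skills being used on the job?",
                    ["supervisor reports"]),
    StageEvaluation("Organisational impact", "What is the impact and worth to the organisation?",
                    ["performance indicators", "cost/benefit analysis"]),
]

# The findings accumulated at each stage feed the final evaluation report.
for item in plan:
    print(f"{item.stage}: {item.question}  [{', '.join(item.methods)}]")
```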
It seems clear that, if decisions are to be influenced, then the culmination of any set of evaluation activities is the compilation of a report containing recommendations.

Evaluation in the context of training

Thus, any definition of evaluation in the context of training and development should include a number of elements: what it is, what it involves and what it leads to. Evaluation is an analytical process. Evaluation involves the collection of subjective and objective data about a training program from a number of sources, using a variety of techniques, and the reduction of such data. Evaluation leads to the synthesis of the data into a report containing a summary of results and recommendations, with validated rationales, about the program being evaluated.

This last element - the synthesis of a report - is one which has not been addressed in the literature reviewed. Yet, if the overriding aim is to influence decisions, then the omission of a report is a grave omission indeed. A well written report presents arguments clearly and concisely so that decision makers have the evidence before them when considering the trainer's recommendations. There is less reliance on verbal discussion and on what is often construed as the trainers' attempt to justify their continued existence within the organisation. Further, a written report provides a permanent record of the evaluation outcomes and the reasons for modifications to a program. If the reporting is not formalised, the reasons for modification of a program tend to become more and more subjectively interpreted over time, with the result that the program can eventually revert to its original form rather than being progressively improved and kept in line with organisational goals.

Conclusion

In summary, evaluation can be defined as an analytical process involving the collection and reduction of data from all (or some) phases of the instructional process and culminating in the synthesis of a report containing recommendations about the instructional program being evaluated. The overall aim of evaluation is to influence decisions about the need for the program in the future, the need for modifications to the program, and the need to provide cost/benefit data about the program. Therefore, evaluation can be said to have at least seven purposes:

1. to validate needs assessment tools and methods;
2. to confirm or revise solution options;
3. to confirm or revise training strategies;
4. to determine trainee/trainer reactions;
5. to assess trainee acquisition of knowledge and attitudes;
6. to assess trainee performance;
7. to determine if organisational goals are met.

In light of the Training Guarantee, which requires employers to contribute a minimum of one percent of payroll to training (in 1990-91), training practitioners need to become more cost conscious than they have ever been in the past. In my experience, trainers often complain that in hard economic times training is one of the first things to suffer budget cuts and staff stand downs. The Training Guarantee will ensure that organisations give careful consideration to the contribution that training makes to the organisation before taking such action in the future. However, in order for that to happen, training practitioners are going to be called upon to provide hard evidence of the value of the training programs they offer.

References

Alliger, G. M. & Horowitz, H. M. (1989). IBM takes the guessing out of testing. Training and Development Journal, April, 69-73.
Birnbrauer, H. (1987). Evaluation techniques that work. Training and Development Journal, July, 53-55.

Bowsher, J. (1990). Making the call on the COE. Training and Development Journal, May, 65-66.

Brinkerhoff, R. O. (1988). An integrated evaluation model for HRD. Training and Development Journal, February, 66-68.

Bumpass, S. & Wade, D. (1990). Measuring participant performance - An alternative. Australian Journal of Educational Technology, 6(2), 99-107. http://www.ascilite.org.au/ajet/ajet6/bumpass.html

Bushnell, D. S. (1990). Input, process, output: A model for evaluating training. Training and Development Journal, March, 41-43.

Erkut, S. & Fields, J. P. (1987). Focus groups to the rescue. Training and Development Journal, October, 74-76.

Foxon, M. (1989). Evaluation of training and development programs: A review of the literature. Australian Journal of Educational Technology, 5(1), 89-104.

Hewitt, B. (1989). Evaluation: A personal perspective. Training and Development in Australia, 16(3), 23-24.

Lombardo, C. A. (1989). Do the benefits of training justify the costs? Training and Development Journal, December, 60-64.

Newstrom, J. W. (1987). Confronting anomalies in evaluation. Training and Development Journal, July, 56-60.

O'Donnell, J. M. (1988). Focus groups: A habit-forming evaluation technique. Training and Development Journal, July, 71-73.

Poulet, R. & Moult, G. (1987). Putting values into evaluation. Training and Development Journal, July, 62-66.

Weatherby, N. L. & Gorosh, M. E. (1989). Rapid response with spreadsheets. Training and Development Journal, September, 75-79.

Wigley, J. (1988). Evaluating training: Critical issues. Training and Development in Australia, 15(3), 21-24.

Author: Jane Marsden is a Human Resource Development (HRD) professional with a range of expertise in policy development, personnel management, education, training and development, and professional writing. She has fourteen years' experience in training needs analysis, job analysis and skills audits, concept analysis, instructional systems design, formal and informal presentation strategies, and assessment and evaluation techniques. For the last three years, Jane has been involved in training needs analysis, skills audits, instructional systems design and training evaluation in the corporate sector.

Please cite as: Marsden, J. (1991). Evaluation: Towards a definition and statement of purpose. Australian Journal of Educational Technology, 7(1), 31-38. http://www.ascilite.org.au/ajet/ajet7/marsden.html