Australasian Journal of Educational Technology 2007, 23(3), 350-370

Applying adaptive variables in computerised adaptive testing

Evangelos Triantafillou, Elissavet Georgiadou
Center of Educational Technology, Greece

Anastasios A. Economides
University of Macedonia, Greece

Current research in computerised adaptive testing (CAT) focuses on applications, in small and large scale, that address self assessment, training, employment, teacher professional development for schools, industry, military, assessment of non-cognitive skills, etc. Dynamic item generation tools and automated scoring of complex, constructed response examinations are coming into use. Therefore it is important to extend CAT’s functionality to include more variables in its student model that define the examinee as an individual beyond the mastery level, for improved performance and more efficient test delivery. This paper examines variables that can prompt adaptation and discusses their potential use in a hypothetical student model for CAT. The objective of this effort is to provide researchers, designers, and developers of CAT with a perspective for exploiting research outcomes from the area of personalised hypermedia applications.

Introduction

Due to the advances in communication and information technology, the popularity of computer based testing has increased in recent years. Computer delivery of tests has become feasible for processes such as licensure, certification and admission. Moreover, computers can be used to increase the statistical accuracy of test scores using computerised adaptive testing (CAT). As an alternative to giving each examinee the same fixed test, CAT item selection adapts to the ability level of individual examinees, and after each response the ability estimate is updated and the next item is selected to have optimal properties at the new estimate (van der Linden & Glas, 2003).
The computer continuously re-evaluates the ability of the examinee until the accuracy of the estimate reaches a statistically acceptable level or until some limit is reached, such as a maximum number of test items presented. The score is determined from the level of difficulty, and as a result, while all examinees may answer the same percentage of questions correctly, high ability examinees will attain a better score as they answer more difficult items correctly. The vast majority of CAT systems rely on Item Response Theory as the underlying model (Lord, 1980; Wainer, 1990). However, Decision Theory provides an alternative underlying model for sequential testing (Rudner, 2002), and Knowledge Space Theory (Doignon & Falmagne, 1985) is a third basis for small scale construction of adaptive tests. Despite some disadvantages reported in the literature, for example, high cost of development, item calibration, item exposure control (Eggen, 2001; Boyd, 2003), effect of a flawed item (Abdullah, 2003), or the use of CAT for summative assessment (Lilley & Barker, 2002, 2003), CAT has several advantages. Testing on demand can be facilitated, so an examinee can take the test whenever and wherever he or she is ready. Multiple media can be used to create innovative item formats and more realistic testing environments. Other possible advantages are flexibility of test management, immediate availability of scores, increased test security, and increased motivation. However, the main advantage of CAT over any other computer based test is efficiency. Since fewer questions are needed to achieve a statistically acceptable level of accuracy, significantly less time is needed to administer a CAT compared to a fixed length computer based test (Rudner, 1998; Linacre, 2000).
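The adaptive loop just described (estimate ability, pick the most informative unused item, stop when the estimate is precise enough or an item limit is hit) can be sketched in a few lines. The sketch below is a minimal illustration, assuming a one-parameter (Rasch) item bank, maximum information item selection and a standard error stopping rule; the function names and parameter values are ours, not taken from any particular operational CAT system.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at ability estimate theta."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def estimate_theta(administered, lo=-4.0, hi=4.0):
    """Maximum likelihood ability estimate via bisection on the score
    function (sum of response residuals is zero at the MLE)."""
    def score(theta):
        return sum(u - rasch_prob(theta, b) for b, u in administered)
    # all-wrong / all-correct patterns have no finite MLE; clamp to bounds
    if score(lo) <= 0:
        return lo
    if score(hi) >= 0:
        return hi
    for _ in range(60):           # score() decreases in theta
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def run_cat(item_bank, answer, max_items=20, se_target=0.4):
    """Adaptive test loop: select the most informative unused item,
    re-estimate ability after each response, and stop when the standard
    error of the estimate is small enough or the item limit is reached."""
    theta = 0.0                   # start with an item of moderate difficulty
    administered = []             # (difficulty, response) pairs
    remaining = list(item_bank)
    while remaining and len(administered) < max_items:
        b = max(remaining, key=lambda d: item_information(theta, d))
        remaining.remove(b)
        administered.append((b, answer(b)))
        theta = estimate_theta(administered)
        info = sum(item_information(theta, d) for d, _ in administered)
        if info > 0 and 1.0 / math.sqrt(info) < se_target:
            break                 # estimate is precise enough
    return theta, len(administered)
```

Driving the loop with a simulated examinee (an `answer` function that returns a correct response with probability `rasch_prob(true_theta, b)`) typically shows the efficiency argument made above: the test converges on items near the examinee's ability and stops well short of the full bank.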
Since the mid 1980s, when the first CAT systems using adaptive techniques to administer multiple choice items became operational (the Armed Services Vocational Aptitude Battery for the US Department of Defence; van der Linden & Glas, 2003), much research and the overcoming of technical challenges have enabled new assessment tools. Currently, analysis of the results can go deeper than just calculating right and wrong answers. Contemporary research in profile scoring involves the design and generation of enhanced score reports, focusing on the interpretation of score report components, feedback about skills (e.g. most promising skills for the student to work on), and educational advice in the form of suggestions for improvement (Gitomer & Bennett, 2002). As research advances in the field, new item generation tools appear, to further increase the efficiency of test creation processes (e.g. Higgins, Futagi & Deane, 2005; Guzmán, Conejo & García-Hervás, 2005; Lilley, Barker & Britton, 2004; Gonçalves, Aluísio, de Oliveira & Oliveira, 2004; Bejar, Lawless, Morley, Wagner, Bennett & Revuelta, 2002). Most CAT systems include a student model. Paiva, Self and Hartley (1995:509) define a student model as “representations of some characteristics and attitudes of the learners, which are useful for achieving the adequate and individualised interaction established between computational environments and students”. Replacing the term student by user, this definition is also applicable to a user model. A user model is constituted by descriptions of what is considered relevant about the actual knowledge and/or aptitudes of a user, providing information for the system environment to adapt itself to the individual user (Koch, 2000). Student model variables describe characteristics of examinees, such as knowledge, skills and abilities, about which the user of the assessment wants to make inferences.
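As a concrete, deliberately simplified illustration of such a student model, the sketch below keeps only the proficiency estimate and response history that a typical CAT tracks, plus a free-form dictionary standing in for the richer adaptive variables this paper goes on to discuss. All names here are hypothetical, invented for illustration rather than drawn from any existing system.

```python
from dataclasses import dataclass, field

@dataclass
class StudentModel:
    """Illustrative student model for a CAT system (names are hypothetical).
    Operational systems usually keep only the proficiency estimate; the
    extra_variables field stands in for additional adaptive variables
    (preferences, cognitive style, environment, ...)."""
    examinee_id: str
    theta: float = 0.0                    # current ability estimate
    theta_se: float = float("inf")        # standard error of that estimate
    responses: list = field(default_factory=list)   # (item_id, correct) pairs
    extra_variables: dict = field(default_factory=dict)

    def record_response(self, item_id, correct, new_theta, new_se):
        """Store one response and the updated ability estimate."""
        self.responses.append((item_id, correct))
        self.theta, self.theta_se = new_theta, new_se
```

A system could then write `model.extra_variables["cognitive_style"] = "FD"` or similar alongside the usual `record_response` updates, which is the kind of extension beyond the mastery level that this paper argues for.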
However, the main goal of the vast majority of CAT systems is to arrange examinees on a problem complexity scale that is relevant for graduation or admission decisions. As a result, student models used by these systems do not include a large array of user variables. They usually contain variables representing the aspects of proficiency that are the targets of inference in the assessment. Current research in CAT encompasses applications, in small and large scale, that address self assessment, training, employment, teacher professional development for schools, industry, military, assessment of non-cognitive skills, etc. Moreover, dynamic item generation tools and automated scoring of complex, constructed response examinations are reaching operational status (Williamson, Bejar & Sax, 2004). Therefore, it is important to extend CAT’s functionality to include more variables in its student model that define the examinee as an individual beyond the mastery level, for improved performance and more efficient test delivery. Research on personalised hypermedia applications and especially adaptive educational hypermedia systems (AEHS) has identified a number of variables that can prompt adaptivity. Contributions from general areas such as user modelling, student modelling, and intelligent tutoring systems are also relevant to this issue. Evidence of the interconnection of these research fields with CAT is that AEHS incorporate CAT in their architecture in order to extend the adaptive capabilities of the systems and support learning, for example INSPIRE (Gouli, Papanikolaou & Grigoriadou, 2002), ELM-ART (Weber & Brusilovsky, 2001) and DCG (Vassileva, 1996). CAT is also used as a student modelling technique in intelligent tutoring systems (Dowling & Kaluscha, 1995; Ríos, Millan, Trella, Perez-de-la-Cruz & Conejo, 1999). This paper examines different variables that can prompt adaptation and discusses their potential use in a hypothetical student model for CAT.
The objective of this effort is to provide researchers, designers, and developers of CAT with a perspective for exploiting research outcomes from the area of personalised hypermedia applications.

Adaptive variables

Adaptive variables refer to the features of the user that are used as a source of the adaptation, i.e. the features of the user to which the system can adapt its behaviour. Brusilovsky (1996) identified the following features which are used by existing adaptive hypermedia systems: users’ goals, knowledge, background and hyperspace experience, and preferences. Furthermore, Brusilovsky (2001) added two more variables to this list, the user's interests and individual traits, and indicated the importance of adaptation to the user’s environment (user’s location and user’s platform). Kobsa, Koenemann and Pohl (2001) reviewed techniques for personalised hypermedia presentation and described the following categories of user data that have been the basis for adaptation in a number of systems: (a) demographic data, (b) user’s knowledge, (c) user’s skills and capabilities, (d) user’s interests and preferences, and (e) user’s goals and plans. In addition, they underlined the significance of computer usage (interaction behaviour, current task, and interaction history) and the physical environment (locale, software and hardware) that can be taken into account when adapting hypermedia pages to the needs of the current user. Rothrock, Koubek, Fuchs, Haas and Salvendy (2002), in reviewing adaptive interfaces, argued that “an adaptive interface autonomously adapts its displays and available actions to current goals and abilities of the user by monitoring user status, the system task, and the current situation” (p. 9). They identified the following variables calling for adaptation: 1. user performance, 2. user goals, 3. user workload, 4. user situation awareness, 5. user knowledge, 6. groups of users, 7.
user personality and cognitive style, and 8. task variables (situation variables and system variables). Magoulas and Dimakopoulos (2005) explored the dimensions of individual differences that should be included in a student model specification to meet personalisation services requirements and create personalised information access. They identified the following nine dimensions of a user data model for structured information spaces: (i) personal data, such as gender, age, language, and culture, (ii) cognitive or learning styles, (iii) device information, the hardware used for access, (iv) context related data, capturing the physical environment from which the user is accessing the information, which can be used to infer the user’s goals, (v) user history data, capturing the user's past interaction with the system, which can be used under the assumption that users’ future behaviour will closely resemble their past behaviour, (vi) user preferences and interests, (vii) goal related data, (viii) system experience, which indicates a particular user's knowledge about the information space, and (ix) domain expertise, relating to the existing level of understanding a particular user has about the knowledge domain. Table 1 lists thirteen different adaptive variables identified from the research discussed. Some of the variables appear under the same or similar terminology. For example, Brusilovsky (2001) referred to individual traits, which include user personality factors, cognitive factors and learning styles, while Rothrock et al. (2002) referred to user personality and cognitive style, and Magoulas and Dimakopoulos (2005) argued for learning or cognitive styles. Investigation of the thirteen adaptive variables included in Table 1 guided the authors of this paper to classify them under two broad categories: user dependent and user independent.
Table 1: Adaptive variables identified in the literature from 1996 to 2005
(B96 = Brusilovsky, 1996; B01 = Brusilovsky, 2001; KKP = Kobsa, Koenemann & Pohl, 2001; RKFHS = Rothrock, Koubek, Fuchs, Haas & Salvendy, 2002; MD = Magoulas & Dimakopoulos, 2005)

Variable                                          B96   B01   KKP   RKFHS   MD
Users’ goals                                       ✓     ✓     ✓      ✓     ✓
Knowledge of the domain                            ✓     ✓     ✓      ✓     ✓
Background and hyperspace experience               ✓     ✓                  ✓
Preferences                                        ✓     ✓     ✓            ✓
User's interests                                         ✓     ✓            ✓
Individual traits (cognitive or learning
  style, user personality)                               ✓            ✓     ✓
Environment (location, locale, software,
  hardware) (user situation awareness)                   ✓     ✓      ✓     ✓
Personal data                                                  ✓            ✓
User skills and capabilities                                   ✓
User performance                                                      ✓
Usage data (user history)                                      ✓            ✓
User cognitive workload                                               ✓
Groups of users                                                       ✓

User dependent and user independent variables

The user dependent variables are those directly related to the user and strictly defining him or her as an individual. These variables generally are concerned with individual user characteristics, such as the user’s knowledge state, background, demographics, mental model, etc. From the research reviewed in this paper the user dependent variables are identified as follows: (a) knowledge of the domain, (b) background and hyperspace experience, (c) preferences, (d) user interests, (e) individual traits, (f) personal data, (g) user skills and capabilities, (h) user performance, (i) usage data, (j) user cognitive workload, and (k) groups of users. The user independent variables generally affect the user indirectly and are related mainly to the context of a user’s work with a hypermedia application, rather than to the user as an individual. From the research reviewed in this paper the user independent variables are identified as follows: (a) user’s goal and (b) environment.

Dependent variables

a. Knowledge of the domain

User’s knowledge of the domain is a variable feature of a particular user: it changes over time.
This means that an adaptive hypermedia system which relies on the user’s knowledge has to recognise changes in the user’s knowledge state and update the student model accordingly. There are many established techniques for modelling student knowledge in relation to domain or course knowledge (for a detailed account see Abdullah, 2003). However, user’s knowledge of the subject is most often represented by an overlay model, which is based on the structural model of the subject domain. Generally, the structural domain model is represented as a network of domain concepts. The concepts are related to each other, thus forming a kind of semantic network which represents the structure of the subject domain. These concepts can be named differently in different systems - topics, knowledge elements, objects, learning outcomes - but in all cases they are just elementary pieces of knowledge for the given domain. In most existing CAT systems, user’s knowledge of the domain is the basic variable in the student model, since item selection adapts to the ability level of individual examinees.

b. Background and hyperspace experience

Background and hyperspace experience in the given hyperspace are two features of the user which are similar to user’s knowledge of the subject but functionally differ from it. User’s background describes all the information related to the user’s previous experience outside the subject of the hypermedia system which is relevant enough to be considered. This includes the user’s profession, experience of work in related areas, as well as the user’s point of view and perspective. User’s experience in the given hyperspace describes the familiarity of the user with the structure of the hyperspace and how easily the user can navigate in it. Sometimes, a user who is generally quite familiar with the subject itself is not familiar at all with the hyperspace structure.
Vice versa, the user can be quite familiar with the structure of the hyperspace whilst lacking deep knowledge of the subject. Background and experience are usually modelled using a stereotype model (e.g. an experience stereotype, or a background stereotype for profession).

c. Preferences

Preferences are user features that relate to the user’s likes and dislikes. This variable recognises that a user can prefer some types of nodes and links to others, or some parts of a page over others. Preferences can indicate interface elements such as preferred colours, fonts, ways of navigation, etc. User preferences are not assumed by the system; instead the user has to notify the system, directly or indirectly by providing feedback. Usually, through checklists the user can select preferred interface elements. Once the preferences are determined, the system generalises and applies them for adaptation in new contexts.

d. Interests

The interests variable is in a way similar to preferences, but it is not the same, as it refers mostly to web based information retrieval systems. It is concerned with the user’s long term interests, which are used in parallel with the user’s short term search goal in order to improve information filtering and recommendations. Interests can be modelled through navigation monitoring, for example, by noting which links the user visits more often.

e. Individual traits

User's individual traits is a group name for user features that together define a user as an individual. Examples are user personality factors (e.g. introvert/extrovert), cognitive factors, and learning styles. Like user background, individual traits are stable features of a user that either cannot be changed at all, or can be changed only over a long period of time. However, unlike user background, individual traits usually are extracted not by a simple interview, but by specially designed psychological tests.
User personality: Murray and Bevan (1985) argued that human-computer interaction would improve if computers were assigned personalities, as the best way for a human to interact with a computer should closely resemble the interaction between two humans. On that view, Richter and Salvendy (1995) compared the performance of introverted and extroverted users using “extroverted” and “introverted” interfaces. The extroverted interface they designed had more words, more “fun” pictures, more sounds, bold fonts and exclamation marks than the introverted interface. The subjects used in their empirical study were classified as introverted or extroverted according to their Eysenck Personality Inventory scores. The main findings from this study suggest that users perceive computer software as having personality attributes similar to those of humans, and that using software designed with an introverted personality generally results in the fastest performance for both extroverted and introverted individuals (Rothrock et al., 2002).

Cognitive style and learning style: Cognitive or learning styles refer to a user’s information processing behaviour and have an effect on the user’s skills and abilities, such as preferred modes of perceiving and processing information, and problem solving. They can be used to personalise the presentation and organisation of content, navigation support and search results (Magoulas & Dimakopoulos, 2005). Cognitive style is the way individuals organise and structure information from their surroundings, and its role is critically important: it is associated with student success in any learning situation. Cognitive style is usually described as a personality dimension which influences attitudes, values, and social interaction. It also refers to the preferred way an individual processes information. There are many different definitions of cognitive styles, as different researchers place emphasis on different aspects.
However, Witkin’s distinction between field dependent (FD) and field independent (FI) individuals is the best known division of cognitive styles and is more relevant to hypermedia research than others (Witkin, Moore, Goodenough & Cox, 1977). Many experimental studies have shown the impact of field dependence-independence on the learning process and academic achievement, and have identified a number of relationships between cognitive style and learning, including the ability to learn from social environments, the types of educational reinforcement needed to enhance learning, and the amount of structure preferred in an educational environment (Summerville, 1999; Ford & Chen, 2000; Weller, Repman & Rooze, 1994; Triantafillou, Demetriadis, Pombortsis & Georgiadou, 2004). Learning style is an important issue that affects the learning process and therefore its outcome. Many definitions and interpretations of learning styles have appeared in the literature in past decades (Bedford, 2006). However, in general terms, learning style refers to an individual's preferences for how to learn (Sternberg, 1997). When designing instructional material, it is imperative to accommodate elements that reflect individual differences in learning, as every learner has a unique way of learning.
Papanikolaou and Grigoriadou (2004) suggest that important decisions underlying the incorporation of learning style characteristics in educational adaptive hypermedia systems demand the synergy of computer science and instructional science, such as: (i) the selection of proper categorisations, which are suitable for the task of adaptation, (ii) the design of adaptation, including the selection of appropriate adaptation technologies for different learning style categorisations and of apposite techniques for their implementation, (iii) the design of the knowledge representation of such a system in terms of the domain and the learner model, (iv) the development of intelligent techniques for the dynamic adaptation of the system and the diagnosis process of learners’ learning style, including also the selection of specific measurements of learners’ observable behaviour, which are considered indicative of learners’ learning style and studying attitude.

f. Personal data

Personal data, such as gender, age, language and culture should be taken into account when designing adaptive educational interfaces to optimise learner’s potential to benefit from the system’s design in terms of knowledge acquisition. For example, males and females appear to have different preferences in terms of media presentation, navigation support, attitudes, and information seeking strategies (Magoulas & Dimakopoulos, 2005). An empirical study into gender differences in collaborative web searching revealed that males formulate queries comprising fewer keywords, spend less time on individual pages, click more hypertext links per minute and in general are more active while online than females (Large, Beheshti & Rahman, 2001). Research also suggests that males significantly outperform females in navigating virtual environments.
Tan, Robertson and Czerwinski (2001) suggested that special navigation techniques, when combined with a large display and wide field of view, appeared to reduce that gender bias. Kobsa, Koenemann and Pohl (2001) extended the term personal data to demographic data about the user, which are “objective facts” such as the following: record data (e.g. name, address, phone number), geographic data (area code, city, state, country), user characteristics (e.g. age, sex, education, disposable income), psychographic data (data indicating lifestyle), customer qualifying data (e.g. frequency of product/service usage), registration for information offerings, participation in raffles and so on, as their research focused on online customer relationships.

g. User skills and capabilities

Kobsa et al. (2001) suggest that besides “knowing what”, a user’s “knowing how” can play an important role in adapting systems to user needs. Adaptive help systems are typical representatives of this approach. For instance, the Unix Consultant (Chin, 1989) tailors its help messages and explanations to the user’s familiarity with Unix commands. Peter and Rösner (1994) tailor repair instructions to the user’s familiarity with the operations involved in the suggested repair plan. Küpper and Kobsa (1999) go further and distinguish between the actions a user is familiar with and the actions he or she is actually able to perform. It is possible that a user knows how to do something but is not able to perform the action due to lack of required permissions or to some physical handicap. Therefore, the tourist information system AVANTI (Fink, Kobsa & Nill, 1998), which takes into account the needs of different kinds of disabled people (wheelchair bound, motor impaired and vision impaired), recommends only actions that these users are actually able to perform.
This variable is important, as people with disabilities often find difficulty in using computer based systems, since the vast majority of these systems have no design considerations for them. These different users have varying needs regarding content and presentation of the information. For example, information for the blind should be presented in audio mode, and a Braille display and speech synthesiser are needed for interacting with the learning material; information for the deaf should never be presented in audio format.

h. User performance

Rothrock et al. (2002) consider adaptation useful not only in the correction, but also in the prevention, of poor performance. The user’s performance is mainly defined by the error rate in performing a task, as well as the time required to perform the task. If there are concurrent tasks, they must be assessed separately. Examples of inputs used to infer the user’s performance include computer data entry speed, latency of response to a verbal request, reaction time to capture a simple target, and tracking deviation. User performance is difficult to measure, as it is complex to specify accurately all user goals and reactions. For example, highly cognitive tasks, like decision making, are very difficult to measure, because the performance outcome does not necessarily reflect the complexity of the mental process.

i. Usage data

Kobsa et al. (2001) suggest that usage data can be used by the system to adapt to user preferences, habits and levels of expertise. Usage data may be directly observed and recorded, or acquired by analysing observable data (e.g. what pages and files have been requested from the server, mouse clicks and movements). In addition to interaction behaviour, the usage context may also be considered as a source for adaptation. Among the relevant items are the current task and the interaction history.
Magoulas and Dimakopoulos (2005) refer to user history data, which capture the user's past interaction with the system (for example, visited pages that contain pointers to specific keywords, or browsing habits) and can be used under the assumption that users’ future behaviour will closely resemble their past behaviour.

j. User cognitive workload

Rothrock et al. (2002) consider user cognitive workload as another variable that calls for adaptation. The class of input variables associated with workload is important because it provides a direct link to user performance. The predominant theory used to infer user workload in multitask processing is the multiple resource theory (Wickens, 1992). In the multiple resource theory, the user has multiple pools of resources at his or her disposal, from which to perceive, decide and act. A limitation of the multiple resource model is that it does not take into account the learning that takes place as the user gains experience. Thus, as the user becomes more experienced, the task becomes more automatic and will require fewer resources. A predictive workload measure can be calculated from models using timeline analysis. The objective of these models is to calculate global workload: the sum of the measurable workloads for each task spanning across all time intervals, weighted by the theoretical overlap between human resources. If the workload calculated is greater than 100%, the task can be reallocated or postponed.

k. Groups of users

Computer supported collaborative learning (CSCL) and groupware applications have attracted increased educational research attention. Group models are important for collaborative work, since a standard group model should serve as a starting point for interaction for the new member who enters a group (Brusilovsky, 1996).
As the new user starts to interact with the system, the user profile can be formed, including those characteristics that are in common with, and those that are different from, the group profile. To build the group profile, information from users can be acquired using techniques similar to those used for the individual student model: stereotypes, interviews and monitoring users’ behaviour. These techniques take into account adaptive variables such as individual traits in order to select users for construction of the group. The group profile is quite important for web based systems, as the web facilitates collaborative activities.

Independent variables

a. User’s goal

The most changeable user feature that activates adaptation is the user’s goal. It is related to the context of a user’s work with a hypermedia application rather than to the user as an individual. It indicates what the user wants to accomplish by using the application. For example, in information retrieval systems a user’s goal is a search goal; in educational systems, a learning goal; in testing systems it may be a problem solving goal. A user’s goal is not firm but may change from session to session, and frequently changes several times within a session. Goals may also be held simultaneously, that is, there can be simple, multiple and concurrent goals. General or high level goals are more stable than local or low level goals. For example, in educational systems the learning goal is a high level goal, while the problem solving goal is a low level goal which may change from one educational problem to another several times within a session.

b. Environment

The importance of adaptation to the user's environment is acknowledged by all the researchers cited above. It is a new kind of adaptation introduced by web based systems.
Users of web based systems can work irrespective of time and location, using different equipment, and as a result adaptation to the user’s environment can lead to better use of the system and better performance. Systems can adapt to the user platform, such as hardware, software and network bandwidth. Such adaptation usually involves selecting the type of material and media used to present the content, for example, still image versus movie, or text versus sound (Joerding, 1999). Kobsa et al. (2001) suggest that web usage may be influenced by both software (browser version and platform, availability of plugins, etc.) and hardware (bandwidth, processing speed, input device, etc.) of the individual user, and by the characteristics of the user’s current locale (current location and usage locale: noise level and brightness of the surroundings, and information about places and objects in the immediate environment). Magoulas and Dimakopoulos (2005) use the terms device information and context related data to describe the environment variable. Device information concerns the hardware used for access and affects personalisation services in terms of screen layout and bandwidth limitations, while context related data capture the physical environment from where the user is accessing the information and can be used to infer the user’s goals. Changes in the environment or changes in the system can call for an adaptation of the interface. Rothrock et al. (2002) use the term task variables, which includes situation and system variables. Situation variables that influence user abilities as well as task requirements include: time pressure; location in space and presence and location of targets; situation in time; weather conditions; visibility; and vibration and noise. Like the situation variables, some changes of the task represent critical system events. Moreover, variables that cause system changes (e.g.
loss of engine power and failures) are often interdependent with the user and the situation variables. The environment variable is closely associated with user situation awareness, also suggested by Rothrock et al. (2002). Situation awareness is the perception of elements in the environment within a volume of time, comprehension of their meaning, and projection of their status in the near future (Endsley, 1997). Current developments in information and communication technologies focus on mobile information technology that allows for mobility in the physical space. Provided the user and the information are connected to a network, this technology facilitates accessibility of information from any point in the physical space. For communication purposes the user employs different devices that have specific characteristics and limitations in terms of bandwidth and information presentation. For mobile information technology the particular challenge for adaptivity is the support of users at different locations. To achieve this, mobile information technology can be combined with technologies that identify the user’s working environment and his or her position in the physical space, such as infrared or global positioning systems (GPS) (Oppermann & Specht, 1999).

Discussion

In the previous section this paper examined different adaptive variables acknowledged by researchers in the area of personalised adaptive systems. This section discusses whether the application of these variables to the student model of a CAT system would benefit such systems in terms of increasing efficiency. In order to be more efficient than a fixed length computerised test, a CAT initially assesses each individual’s level by first presenting an item of moderate difficulty.
However, if the knowledge of the domain variable is modelled for each individual, then this initial question could be closer to the examinee’s estimated ability, and this may lead to reduced testing time, as fewer items need to be administered to evaluate the examinee’s aptitude. Self adaptive testing (SAT), a variation of CAT, can also be used to determine the starting difficulty level of the CAT (Frosini, Lazzerini & Marcelloni, 1998). In SAT the examinee, rather than a computerised algorithm, chooses the difficulty of the next item to be presented (Rocklin & O’Donnell, 1987). In item response theory (IRT) based CAT systems the item selection process adapts to the ability level of individual examinees. After each response the ability estimate is updated and the next item is selected to have optimal properties at the new estimate. The computer continuously re-evaluates the ability of the examinee until the accuracy of the estimate reaches a statistically acceptable level. If we consider the response to the previous item as an aspect of interaction behaviour, and take into account that as the user gains experience the task becomes more automatic, thus requiring fewer items for assessing performance, then we can suggest that most IRT based CAT systems, while modelling knowledge of the domain, do take into account the user performance and user cognitive workload variables described by Rothrock et al. (2002) and the usage data variable described by Kobsa et al. (2001). Modelling the background and hyperspace experience variable could result in simpler interfaces for examinees who are familiar with the information space, and more explanatory ones for those who are unfamiliar. This, combined with modelling of the preferences variable, which can indicate interface elements (preferred colours, fonts, ways of navigation, etc.), allows examinees to focus on the assessment process.
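The selection and re-estimation cycle just described can be illustrated with a minimal sketch. The code below is an assumption-laden simplification, not a production CAT: it uses a Rasch (one parameter logistic) model, represents the item pool by difficulty values alone, replaces a proper maximum likelihood or Bayesian estimator with a coarse grid search, and omits exposure control and stopping rules. The `start` parameter shows where a knowledge of the domain estimate from the student model could replace the conventional opening item of moderate difficulty.

```python
import math

def p_correct(theta, b):
    # Rasch (1PL) model: probability that an examinee of ability
    # theta answers an item of difficulty b correctly.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, pool):
    # Under the Rasch model, item information is maximal when the
    # item difficulty is closest to the current ability estimate.
    return min(pool, key=lambda b: abs(b - theta))

def estimate_theta(responses):
    # Crude maximum likelihood estimate via grid search over
    # theta in [-4, 4]; responses is a list of (difficulty, correct).
    grid = [g / 10.0 for g in range(-40, 41)]
    def log_lik(theta):
        ll = 0.0
        for b, correct in responses:
            p = p_correct(theta, b)
            ll += math.log(p) if correct else math.log(1.0 - p)
        return ll
    return max(grid, key=log_lik)

def administer(pool, answer, start=0.0, max_items=5):
    # start can be seeded from a modelled knowledge-of-domain
    # variable instead of defaulting to moderate difficulty (0.0).
    theta, responses, pool = start, [], list(pool)
    for _ in range(max_items):
        b = next_item(theta, pool)
        pool.remove(b)
        responses.append((b, answer(b)))
        theta = estimate_theta(responses)
    return theta
```

Here `answer` stands in for the examinee’s response process; with a prior ability estimate supplied as `start`, the first item administered is already close to the examinee’s level, which is exactly the efficiency argument made above.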
Further, clearer and more self-explanatory interfaces may be obtained by taking into account the personal data variable. For example, with regard to gender, males and females appear to have different preferences in terms of media presentation, navigation support, attitudes and information seeking strategies. Some examinees might feel frustrated or discouraged when they cannot work confidently with the assessment’s interface or when the interface is not designed to suit their individuality. In turn, this will result in poorer performance, as more time will be needed to process information. This is an important issue, as in most assessments time is an essential element in measuring performance. The individual traits variable refers to stable features of the user such as personality factors, cognitive factors and learning styles. To our knowledge, little research exists on user personality factors. Richter and Salvendy (1995) suggested that users perceive computer software as having personality attributes similar to those of humans. Interfaces designed with an introverted personality can, in most cases, result in faster performance for both extroverted and introverted individuals. Moreover, modelling of cognitive or learning styles for CAT can result in more efficient systems. In interface design terms, with regard to cognitive style for example, a rigid structure should be provided for field dependent (FD) users, as they need navigation and orientation support, while a more flexible (or customisable) interface should be made available for field independent (FI) users. Furthermore, studies have shown that FD learners are holistic and require external help, while FI learners are serialistic and possess internal cues to help them solve problems. FD learners are more likely to require externally defined goals and reinforcements, while FI learners tend to develop self defined goals and reinforcements (Witkin et al., 1977).
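As an illustration only, the style driven interface choices discussed above might be encoded as a simple lookup in the student model. All setting names and values below are hypothetical assumptions made for this sketch, not drawn from any cited system.

```python
# Hypothetical interface settings keyed by cognitive style; the
# field names and values are illustrative, not a cited design.
STYLE_DEFAULTS = {
    "field_dependent": {
        "navigation": "rigid",       # fixed, guided paths
        "orientation_aids": True,    # maps, progress indicators
        "guidance": "maximum",       # clear, explicit directions
        "feedback": "extensive",
    },
    "field_independent": {
        "navigation": "flexible",    # free, customisable movement
        "orientation_aids": False,
        "guidance": "minimal",
        "feedback": "reduced",
    },
}

def interface_settings(cognitive_style, preferences=None):
    # Start from the style defaults, then overlay the examinee's
    # individual preferences (colours, fonts, navigation, etc.).
    settings = dict(STYLE_DEFAULTS[cognitive_style])
    settings.update(preferences or {})
    return settings
```

Individual preferences from the preferences variable override the style defaults, so the two variables combine into one settings record for the assessment interface.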
These implications of style characteristics for CAT design could result in clear, explicit directions and a maximum amount of guidance and extensive feedback for FD examinees, compared with minimal guidance and direction and less feedback for FI examinees. Modelling of the interests variable for CAT systems can offer items closer to the long term interests of each individual examinee. By knowing what interests a particular user, adaptive algorithms can be set to rule out certain items. However, this could be problematic in some cases, for example general knowledge assessments, as examinees will not face items that represent the whole range of the domain. Kobsa et al. (2001) suggest that besides “knowing what”, a user’s “knowing how” can also play an important role in adapting systems to user needs. In a CAT system, modelling of the user skills and capabilities variable can, when needed, give examinees with different skills help messages and explanations according to their familiarity with the domain presented. Further, an examinee population often includes people with disabilities. If a mechanism exists to assist such individuals on demand, disabled people will feel less disadvantaged, as they can more readily take part in an examination process. Modelling of the groups of users variable will be important in cases of group adaptive testing systems. Although interest in computer supported collaborative learning is increasing, to our knowledge there are currently no examples of CAT systems for conducting group evaluations. The independent variables affect the user indirectly, in ways that do not define him or her as an individual. The most complicated variable to model is the user’s goal, as it may change from session to session, and in many cases there are simultaneous goals within the same session.
For example, the main goal of taking a test is to pass it; however, several goals exist simultaneously, one for each item included in the test. In simple CAT systems modelling of the user’s goal is not of particular weight, because it complicates the development of the test without any significant benefits for the examinee. However, in assessing non-cognitive skills, modelling of the user’s goal variable is important, as examinees will always face items that closely match their own individual goals, resulting in better individual performance. A user is not tied to a particular hardware platform. Users may work in one instance from a personal computer (PC) attached to a desk and in another from a mobile device such as a personal digital assistant (PDA). Dependent variables remain the same with regard to student modelling. The independent variable of environment cannot affect the content, yet it may affect the presentation mode significantly. Systems can adapt to the user platform by selecting appropriate ways, in terms of bandwidth, media, etc., of presenting the information. For educational courseware, modelling of the environment variable may facilitate teaching and learning for disciplines related to outdoor activities such as zoology, botany, sailing, etc. Nevertheless, it is quite unusual to model this variable for testing purposes, as there are not many situations in which an examinee will need to be assessed on the same subject using either a PC or a PDA. However, it is important to consider at this point the effort of Kinshuk and Lin (2004), who explored how to improve learning processes by adapting course content presentation to student learning styles in multi-platform environments such as PCs and PDAs.
They developed a framework and a mechanism to comprehensively model students’ learning styles and present the appropriate subject matter, including the content, format, media type, and so on, to suit individual students, based on the Felder-Silverman Learning Style Theory. Summarising, most IRT based CAT systems employ in their student model the knowledge of the domain variable. This variable is closely associated with user performance, usage data, and user cognitive workload. Besides these variables, modelling of background experience, preferences, personal data and individual traits can produce well-organised CAT systems, as fewer items will be needed to assess performance. Moreover, it could affect item quality, since items can be more complex, taking user characteristics into account. As a result, testing sessions would not be limited to measuring performance but could contribute to the learning process by using evidence of the examinee’s performance, gathered from complex tasks used to support learning activities. In advanced CAT, modelling of the user’s goal can also contribute to the test’s quality. Modelling of interests needs careful implementation, as it may result in false measurements, because examinees may be presented with items that always fall within their individual interests and not within the whole of the knowledge domain examined with a CAT.

Conclusion

Currently, research in CAT is moving beyond admission programs to address many aspects of measuring performance in education and training. This, combined with new dynamic item generation tools and advances in profile scoring, can facilitate computerised assessments that take into consideration more individual differences of the user than the mastery level alone, resulting in improved individual performance and more efficient test delivery.
Moreover, graphical modelling extends the IRT based CAT inferential framework to accommodate richer tasks and more complex student models (Almond & Mislevy, 1999). Modelling multiple variables is important because users have complex characteristics that ultimately affect their performance. Student models must incorporate multiple variables of the user, both dependent and independent. However, adding variables will not always increase the accuracy of the student model, but it will always increase its complexity and the requirement to collect additional user information (Carver, Hill & Pooch, 1999). Media elements are difficult to generate and are not as flexible for automatic recombination as text. Therefore, multimedia adaptation adds additional complexity and requires a greater implementation effort. There are many research questions related to multiple variables modelling, and several studies that attempted to address such questions are referenced in this paper. Nevertheless, the key issue is that taking into account individual characteristics in test design can benefit users, resulting in better performance. The essence of testing is to measure performance, and consequently an elaborated student model for CAT which includes a large array of variables must be the way ahead. The type and number of variables that each CAT would comprise in the student model depend heavily on the subject matter and the way that the test is implemented. Mislevy, Steinberg and Almond (1999, p. 7) argue that “the factors that determine the number and the nature of the student model variables in a particular application are the conception of competence in the domain and the intended use of the assessment”.
Reviewing and examining the different variables that can prompt adaptation, and discussing their potential use in a hypothetical student model for CAT, provides researchers, designers, and developers of CAT with a perspective for exploiting research outcomes from the area of personalised hypermedia applications.

Acknowledgments

The work presented in this paper has been funded partially by the General Secretariat for Research and Technology, Hellenic Republic, through the E-Learning, EL-51, FlexLearn project.

References

Abdullah, S. C. (2003). Student modelling by adaptive testing - A knowledge-based approach. Unpublished PhD Thesis, University of Kent at Canterbury, June. [verified 2 Jul 2007] http://www.cs.kent.ac.uk/pubs/2003/1719/index.html

Almond, R. G. & Mislevy, R. J. (1999). Graphical models and computerized adaptive testing. Applied Psychological Measurement, 23(3), 223-237.

Bedford, T. (2006). Learning styles: A review of the English language literature. In R. R. Sims & S. J. Sims (Eds.), Learning styles and learning: A key to meeting the accountability demands in education. New York: Nova Science Publishers Inc.

Bejar, I. I., Lawless, R., Morley, M., Wagner, M., Bennett, R. & Revuelta, J. (2002). A feasibility study of on-the-fly item generation in adaptive testing (ETS RR-02-23). Princeton, NJ: ETS. [verified 2 Jul 2007] http://www.ets.org/portal/site/ets/menuitem.c988ba0e5dd572bada20bc47c3921509/?vgnextoid=4391af5e44df4010VgnVCM10000022f95190RCRD&vgnextchannel=e15246f1674f4010VgnVCM10000022f95190RCRD

Boyd, A. M. (2003). Strategies for controlling testlet exposure rates in computerized adaptive testing systems. Unpublished PhD Thesis, The University of Texas at Austin, May 2003.

Brusilovsky, P. (1996). Methods and techniques of adaptive hypermedia. Journal of User Modeling and User Adapted Interaction, 6(2-3), 87-129. [verified 2 Jul 2007] http://www.sis.pitt.edu/~peterb/papers/UMUAI96.pdf

Brusilovsky, P.
(2001). Adaptive hypermedia. Journal of User Modeling and User Adapted Interaction, 11(1/2), 87-110. Ten Year Anniversary Issue (Alfred Kobsa, Ed.). [verified 2 Jul 2007] http://www2.sis.pitt.edu/~peterb/papers/brusilovsky-umuai-2001.pdf

Carver, C., Hill, M. & Pooch, U. (1999). Third generation adaptive hypermedia systems. Proceedings WebNet 99, AACE, Honolulu, Hawaii, 1999.

Chin, D. N. (1989). KNOME: Modeling what the user knows in UC. In A. Kobsa & W. Wahlster (Eds), User models in dialog systems, 74-107. Springer-Verlag: Berlin.

Doignon, J. P. & Falmagne, J. C. (1985). Spaces for the assessment of knowledge. International Journal of Man-Machine Studies, 23(2), 175-196.

Dowling, C. E. & Kaluscha, R. (1995). Prerequisite relationships for the adaptive assessment of knowledge. In J. Greer (Ed), Proceedings of AI-ED'95, 7th World Conference on Artificial Intelligence in Education, Washington DC, 16-19 August 1995, AACE, pp. 43-50.

Eggen, T. J. H. M. (2001). Overexposure and underexposure of items in computerized adaptive testing. Measurement and Research Department Reports 2001-1, Citogroep Arnhem. http://www.cito.nl/share/poc/reports/Report01-01.pdf

Endsley, M. R. (1997). Automation and situation awareness. In R. Parasuraman & M. Mouloua (Eds), Automation and human performance: Theory and applications. Mahwah, NJ: Lawrence Erlbaum.

Fink, J., Kobsa, A. & Nill, A. (1998). Adaptable and adaptive information provision for all users, including disabled and elderly people. The New Review of Hypermedia and Multimedia, 4, 163-188.

Ford, N. & Chen, S. (2000). Individual differences, hypermedia navigation, and learning: An empirical study. Journal of Educational Multimedia and Hypermedia, 9(4), 281-311.

Frosini, G., Lazzerini, B. & Marcelloni, F. (1998). Performing automatic exams. Computers & Education, 31(3), 281-300.

Gitomer, D. H. & Bennett, R. E. (2002). Unmasking constructs through new technology, measurement theory, and cognitive science.
Research Memorandum, February 2002, RM-02-01, Educational Testing Service, Princeton, NJ. [verified 2 Jul 2007] http://www.nap.edu/openbook/0309083206/html/1.html

Gonçalves, J. P., Aluísio, S. M., de Oliveira, L. H. M. & Oliveira, O. N. (2004). A learning environment for English for academic purposes based on adaptive tests and task-based systems. Lecture Notes in Computer Science, 3220, 1-11.

Gouli, E., Papanikolaou, K. & Grigoriadou, M. (2002). Personalizing assessment in adaptive educational hypermedia systems. In P. De Bra, P. Brusilovsky & R. Conejo (Eds), Proceedings of the Second International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems, Lecture Notes in Computer Science, 2347, 153-163. Springer-Verlag Berlin Heidelberg. [verified 19 May 2007] http://hermes.di.uoa.gr/papanikolaou/papers%5Cpapanikolaou%5Cgpg_AH2002.pdf

Guzmán, E., Conejo, R. & García-Hervás, E. (2005). An authoring environment for adaptive testing. Educational Technology & Society, 8(3), 66-76. http://www.ifets.info/others/download_pdf.php?j_id=28&a_id=558

Higgins, D., Futagi, Y. & Deane, P. (2005). Multilingual generalization of the Model Creator software for math item generation (ETS RR-05-02). Princeton, NJ: ETS.

Joerding, T. (1999). A temporary user modeling approach for adaptive shopping on the Web. Proceedings of Workshop on Adaptive Systems and User Modelling on World Wide Web at WWW '99 Conference, Toronto, Canada, 75-79. [verified 2 Jul 2007] http://wwwis.win.tue.nl/asum99/joerding/joerding.html

Kinshuk & Lin, T. (2004). Application of learning styles adaptivity in mobile learning environments. Third Pan Commonwealth Forum on Open Learning, 4-8 July 2004, Dunedin, New Zealand. http://www.col.org/pcf3/Papers/PDFs/Kinshuk_Lin_1.pdf

Kobsa, A., Koenemann, J. & Pohl, W. (2001). Personalised hypermedia presentation techniques for improving online customer relationships.
The Knowledge Engineering Review, 16(2), 111-155. [verified 2 Jul 2007] http://www.ics.uci.edu/~kobsa/papers/2001-KER-kobsa.pdf

Koch, N. (2000). Software engineering for adaptive hypermedia systems. Unpublished PhD thesis, Mathematics and Informatics Department, University of Munich. http://www.pst.informatik.uni-muenchen.de/~kochn/PhDThesisNoraKoch.pdf

Küpper, D. & Kobsa, A. (1999). User-tailored plan generation. In J. Kay (Ed), UM99 User Modeling: Proceedings of the Seventh International Conference. Springer-Verlag, 45-54.

Large, A., Beheshti, J. & Rahman, T. (2001). Gender differences in collaborative Web searching behavior: An elementary school study. Information Processing & Management, 38(3), 427-443.

Lilley, M. & Barker, T. (2002). The development and evaluation of a computer-adaptive testing application for English language. 6th Computer Assisted Assessment Conference, July 2002, Loughborough, UK. http://www.caaconference.com/pastConferences/2002/proceedings/lilley_m1.pdf

Lilley, M. & Barker, T. (2003). An evaluation of a computer adaptive test in a UK university context. 7th Computer Assisted Assessment Conference, 8-9 July, Loughborough. http://www.caaconference.com/pastConferences/2003/procedings/lilley.pdf

Lilley, M., Barker, T. & Britton, C. (2004). The development and evaluation of a software prototype for computer-adaptive testing. Computers & Education, 43, 109-123.

Linacre, J. M. (2000). Computer-adaptive testing: A methodology whose time has come. MESA Memorandum No. 69. Published in Sunhee Chae, Unson Kang, Eunhwa Jeon & J. M. Linacre, Development of Computerised Middle School Achievement Test (in Korean). Seoul, South Korea: Komesa Press. http://www.rasch.org/memo69.pdf

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Lawrence Erlbaum Associates, New Jersey.

Magoulas, G. D. & Dimakopoulos, D. N. (2005). Designing personalised information access to structured information spaces.
Proceedings of the 1st International Workshop on New Technologies for Personalized Information Access. [verified 2 Jul 2007] http://www.dcs.bbk.ac.uk/~gmagoulas/Designing%20Personalised%20Information%20Access.pdf

Mislevy, R. J., Steinberg, L. S. & Almond, R. G. (1999). On the roles of task model variables in assessment design. CSE Technical Report 500, January 1999, Educational Testing Service, Princeton, New Jersey. http://eric.ed.gov/ERICWebPortal/contentdelivery/servlet/ERICServlet?accno=ED431804

Murray, D. & Bevan, N. (1985). The social psychology of computer conversations. In B. Shackel (Ed.), Human-Computer Interaction – INTERACT 84. New York: Elsevier Science.

Oppermann, R. & Specht, M. (1999). Adaptive information for nomadic activity: A process oriented approach. Proceedings of the Software-Ergonomie ’99, Stuttgart: Teubner, 255-264. http://fit.fraunhofer.de/~oppi/publications/SE99.AdaptiveActivitySuppor.pdf

Paiva, A., Self, J. & Hartley, R. (1995). Externalizing learner models. In J. Greer (Ed), Proceedings of AIED95. AACE publication, 509-519.

Papanikolaou, K. A. & Grigoriadou, M. (2004). Accommodating learning style characteristics in adaptive educational hypermedia systems. Proceedings of the AH 2004 Workshop “Individual Differences in Adaptive Hypermedia”.

Peter, G. & Rösner, D. (1994). User-model-driven generation of instructions. User Modeling and User-Adapted Interaction, 3(4), 289-319.

Richter, L. A. & Salvendy, G. (1995). Effects of personality and task strength on performance in computerized tasks. Ergonomics, 38(2), 281-291.

Ríos, A., Millan, E., Trella, M., Perez-de-la-Cruz & Conejo, R. (1999). Internet based evaluation system. In S. P. Lajoie & M. Vivet (Eds), Artificial Intelligence in Education. Open Learning Environments: New Computational Technologies to Support Learning, Exploration and Collaboration. Volume 50 in Frontiers in Artificial Intelligence. IOS Press, Amsterdam, pp.
387-394.

Rocklin, T. & O’Donnell, A. M. (1987). Self-adapted testing: A performance-improving variant of computerized adaptive testing. Journal of Educational Psychology, 79(3), 315-319.

Rothrock, L., Koubek, R., Fuchs, F., Haas, M. & Salvendy, G. (2002). Review and reappraisal of adaptive interfaces: Toward biologically-inspired paradigms. Theoretical Issues in Ergonomic Science, 3(1), 47-84. [verified 3 Jul 2007] http://www2.ie.psu.edu/Rothrock/Research/HPAM/rothrock/rothrock_page_files/r005.pdf

Rudner, L. M. (1998). An online, interactive, computer adaptive testing tutorial. 11/98. http://EdRes.org/scripts/cat [viewed 15 May 2007]

Rudner, L. M. (2002). An examination of decision-theory adaptive testing procedures. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA, April 1-5, 2002. [verified 3 Jul 2007] http://edres.org/mdt/papers/aera2c.pdf

Sternberg, R. J. (1997). Thinking styles. New York: Cambridge University Press.

Summerville, J. (1999). Role of awareness of cognitive style in hypermedia. International Journal of Educational Technology, 1(1). [verified 2 Jul 2007] http://www.ascilite.org.au/ajet/ijet/v1n1/summerville/

Tan, D. S., Robertson, G. G. & Czerwinski, M. (2001). Exploring 3D navigation: Combining speed-coupled flying with orbiting. In Proceedings of CHI 2001, Human Factors in Computing Systems, Seattle, WA, 1-6 April. ACM, 418-424. [verified 3 Jul 2007] http://www.cs.cmu.edu/~desney/publications/CHI2001-final-color.pdf

Triantafillou, E., Demetriadis, S., Pombortsis, A. & Georgiadou, E. (2004). The value of adaptivity based on cognitive style: An empirical study. British Journal of Educational Technology, 35(1), 95-106.

van der Linden, W. J. & Glas, C. A. W. (2003). Preface. In W. J. van der Linden & C. A. W. Glas (Eds), Computerised adaptive testing: Theory and practice.
Dordrecht, Boston, London: Kluwer Academic Publishers, vi-xii.

Vassileva, J. (1996). A task-centered approach for user modeling in a hypermedia-based information system. In Proceedings of the 4th International Conference on User Modeling, Hyannis MA, 115-120.

Wainer, H. (1990). Computerized adaptive testing: A primer. Lawrence Erlbaum Associates, New Jersey.

Weber, G. & Brusilovsky, P. (2001). ELM-ART: An adaptive versatile system for web-based instruction. International Journal of Artificial Intelligence in Education, 12, 351-384. http://www2.sis.pitt.edu/~peterb/papers/JAIEDFinal.pdf

Weller, H. G., Repman, J. & Rooze, G. E. (1994). The relationship of learning behaviour and cognitive styles in hypermedia-based instruction: Implications for design of HBI. Computers in the Schools, 10, 401-420.

Wickens, C. D. (1992). Engineering psychology and human performance (2nd ed.). New York: Harper Collins.

Williamson, D. M., Bejar, I. I. & Sax, A. (2004). Automated tools for subject matter expert evaluation of automated scoring. Research and Development Report, March 2004, RR-04-14, ETS: Princeton, NJ.

Witkin, H. A., Moore, C. A., Goodenough, D. R. & Cox, P. W. (1977). Field-dependent and field-independent cognitive styles and their educational implications. Review of Educational Research, 47(1), 1-64.

Evangelos Triantafillou, Center of Educational Technology, Dodekanisou 21, Thessaloniki 55131, Greece. Email: vtrianta@edutech.gr
Elissavet Georgiadou, Center of Educational Technology, Karaoli 46, Thessaloniki 57001, Greece. Email: elisag@otenet.gr
Anastasios A. Economides, University of Macedonia, Department of Computer Networks, Egnatia 156, Thessaloniki 54006, Greece. Email: economid@uom.gr