SA JOURNAL OF PHYSIOTHERAPY 2010 VOL 66 NO 1

Research Article

Intra- and inter-rater reliability of the Knee Society Knee Score when used by two physiotherapists in patients post total knee arthroplasty

Gopal S, MSc1; Wood W, MSc1; Myezwa H, PhD1; Stewart A, PhD1
1 Division of Physiotherapy, University of the Witwatersrand.

Correspondence to: Wendy-Ann Wood, Department of Physiotherapy, Medical School, University of the Witwatersrand, 7 York Road, Parktown 2193, Johannesburg, South Africa. Email: bradwend@hotmail.com

ABSTRACT: Background and Purpose: It has yet to be shown whether routine physiotherapy plays a role in the rehabilitation of patients post total knee arthroplasty (Rajan et al 2004). Physiotherapists should be using valid outcome measures to provide evidence of the benefit of their intervention. The aim of this study was to establish the intra- and inter-rater reliability of the Knee Society Knee Score, a scoring system developed by Insall et al (1989). The Knee Society Knee Score can be used to assess the integrity of the knee joint of patients undergoing total knee arthroplasty. Since the score involves clinical testing, the intra-rater reliability of the clinician should be established prior to using the scores as data in clinical research. Where multiple clinicians are involved, inter-rater reliability should also be established.
Design: This was a correlation study.
Subjects: A sample of thirty patients post total knee arthroplasty attending the arthroplasty clinic at Johannesburg Hospital between six weeks and twelve months postoperatively.
Method: Recruited patients were evaluated twice with a time interval of one hour between each assessment.
Statistical Analysis: The intra- and inter-rater reliability were estimated using the Intraclass Correlation Coefficient (ICC).
Results: The intra-rater reliability showed excellent reliability (h = 0.95) for Examiner A and good reliability (h = 0.71) for Examiner B. The inter-rater reliability showed moderate reliability (h = 0.67 during test one and h = 0.66 during test two).
Conclusion: The KSKS has good intra-rater reliability when tested within a period of one hour. The KSKS demonstrated moderate agreement for inter-rater reliability.

KEY WORDS: TOTAL KNEE ARTHROPLASTY, KNEE SOCIETY KNEE SCORE, REHABILITATION OUTCOME MEASURES.

INTRODUCTION

Total knee arthroplasty is used in the treatment of osteoarthritis of the knee to bring about a decrease in pain (Hawker et al 1998; McAuley et al 2002) and improvement in function (Hawker et al 1998; Walsh et al 2001). Impairments acquired post total knee arthroplasty may include knee flexion contracture, limited range of motion, quadriceps weakness, instability and malalignment (Bhave et al 2005). Physiotherapy aims to prevent these impairments through appropriate treatment techniques which include continuous passive mobilisation, stretching, strengthening, active and passive mobilisations, functional electrical stimulation, gait training and patient education.

It has yet to be shown whether routine physiotherapy plays a role in the rehabilitation of patients post total knee arthroplasty (Rajan et al 2004). If patients are not routinely referred for physiotherapy, it becomes essential to continuously assess patients postoperatively to monitor for the development of such impairments. If patients are being routinely referred for outpatient physiotherapy, as is common practice in many facilities, then physiotherapists should be using valid outcome measures to provide evidence of the benefit of their intervention.

Whether patients are being referred for outpatient physiotherapy or not, the outcome measures used should be valid, reliable, responsive and standardized to facilitate the communication of results in the medical (between healthcare professionals) and scientific community (Kreibich et al 1996).
An outcome measure must provide the user with an objective measure (Davies 2002) of the subject's impairment which can be compared with other similar subjects and should be applicable before and after an intervention. An outcome measure should also be related to the intervention (APA position statement 2003).

The American Knee Society Clinical Rating System (AKSCRS) is one of the most commonly used outcome measures for total knee arthroplasty patients (Stavem and Arnesen 2005; Lingard et al 2001). It is a dual rating system developed by Insall et al (1989). It is also known as the Knee Society Rating System (KSRS) or Knee Society Clinical Rating System (KSCRS). For clarity, the rating system is hereafter referred to as the AKSCRS.

The AKSCRS has two components, the knee score and the functional score. The system was designed to score the knee joint itself (knee score) and its function (functional score) separately, thus avoiding the impact of functional and age-related health problems on the knee joint itself. The knee score is based on the subjective assessment of pain and objective measurement of stability, range of motion, flexion contracture, extension lag and alignment at the knee joint. The individual scores are combined to give the knee a score which ranges from 0 to 100. The functional score is a composite score of walking, climbing up and down stairs and use of assistive devices. The knee score has been shown to be valid and responsive (Lingard et al 2001). The functional score of the system has been shown to be less responsive (Lingard et al 2001) and is not explored further in this paper. For clarity, the knee score is hereafter referred to as the Knee Society Knee Score (KSKS).

Wright and Feinstein (1992) discussed the common causes of variability in orthopaedic measurements.
They stated that patient, procedure and clinician variability are the common causes of unreliable measures. Patient variability can be reduced by selecting measurement tools appropriate to patient conditions. Procedural variability can be reduced by using the same instruments and standardising measurement procedures. Clinician variability can be reduced by repeated practice and experience of the examiner in the measurement skills used.

The classification of some commonly used outcome measures, based on their type, validity, reliability and dual rating design (a design which measures structure and function of the joint separately), is shown in Table 1. From Table 1 it can be seen that if researchers are in need of a joint specific outcome measure that has been shown to be valid and reliable, the Knee Score component of the AKSCRS is a good option.

The aim of this study was to assess whether the KSKS can be reliably used by physiotherapists in evaluating the knee joint in post TKA patients. This was achieved by establishing the intra- and inter-rater reliability of two qualified physiotherapists using the KSKS.

MATERIALS AND METHODS

This was a correlational study. Ethical clearance for this study was obtained from the Human Research Ethics Committee (Medical) of the University of the Witwatersrand. Patients who agreed to take part in the study signed a consent form and were assigned numerical codes on the data sheet, ensuring anonymity.

Sample

Two qualified physiotherapists participated and are referred to as examiner A and examiner B. The study was conducted at the arthroplasty clinic in a Gauteng hospital. The Knee Society Knee Score was administered to patients who met the inclusion criteria and gave consent for participation.
Inclusion criteria:
• Patients aged between 45 and 75 years who were attending the clinic for their six weeks to one year postoperative follow up visit

Exclusion criteria:
• Subjects who were not walking (independently or without walking aids) prior to surgery / severe pain
• Pre-existing septic arthritis or conditions that may compromise TKA outcome (Charcot's joint, Paget's disease, severe osteoporosis)
• Patients with a neurological disorder that may affect the outcome of TKA
• Patients with infectious diseases or metastatic disease

A sample size of 30 was chosen on the assumption of normality, under which a minimum sample size of n = 30 is considered essential to be able to assess agreement.

Table 1: Classification of outcome measures based on their type, validity, reliability and dual rating design.

Outcome measure                                        Type of measure    Validity  Reliability  Dual rating system
WOMAC (Kreibich et al 1996; Davies 2002)               Disease specific   Yes       Yes          No
SF-36 (Aitken and Bohannon 2001; Lingard et al 2001)   General health     Yes       Yes          No
HSSKRS (Davies 2002)                                   Joint specific     No        Yes          No
Bristol Knee Score (Davies 2002)                       Joint specific     No        Yes          No
OKS (Davies 2002; Dawson et al 1998)                   Disease specific   No        Yes          No
AKSCRS (Insall et al 1989)                             Joint specific     Yes       Yes          Yes

Procedure

The principal researcher (Examiner A) and Examiner B evaluated the patients independently of one another. The patients were taken into a room with a plinth and a chair with back support along with the researcher (examiner A) and an observer. The observer was a qualified physiotherapist. The same room, chair, plinth and goniometer were used for all measurements and for the full duration of the study. Examiner A took all measurements, immediately followed by examiner B. The procedures were repeated by examiner A and examiner B with a stipulated time interval of not less than 45 minutes between their first and second measurements.
The examiners did not record the measures directly but gave the actual measures (e.g. degrees of ROM) to the independent observer, who entered them into the data sheet, minimising examiner bias. The principal researcher completed the scoring after data collection was complete.

PROCEDURES FOLLOWED FOR MEASURING EACH COMPONENT OF THE SCORE

Pain

The patients were asked, "Do you have any pain in your operated knee?" If they answered "yes" they were asked, "Is your pain mild, moderate or severe?" If they had mild pain, they were asked whether they had pain while using stairs and walking. In cases where they had moderate pain, they were asked whether the pain was continuous or occasional.

Range of motion (ROM)

A universal goniometer was used to measure the range of motion at the knee joint. The patients lay supine with the head supported by a pillow, the hip in neutral and the knee extended (Clarkson and Gilewich 1989). The goniometer axis was placed over the lateral condyle of the femur with its stationary arm parallel to the longitudinal axis of the femur pointing towards the greater trochanter and the movable arm parallel to the longitudinal axis of the fibula pointing to the lateral malleolus. The measurement was noted down as initial ROM. If the initial ROM was not 0º the reading was taken as the degree of flexion contracture. The patients were instructed to take their heel towards their buttock and the examiner assisted the movement to feel the end range and measured the range of motion. The patients were instructed to inform the examiner if they felt any pain or discomfort in their knee, and the movement was stopped at that point. The measurement was recorded in the data sheet by the observer.

Stability

The Lachman's test (Petty and Moore 1998) and Valgus-Varus stress test (Magee 1997) were used to assess the anteroposterior and mediolateral stability respectively.
The amount of translation of the tibia over the femur during Lachman's test, and the amount of angulation at the knee joint during the Valgus-Varus test, as experienced by the examiner, were conveyed to the observer and noted on the data sheet. These were clinical measurements of what the examiner experienced during the tests.

Extension lag

The patients were positioned supine at the end of the plinth, with the knee hanging flexed over the end of the plinth and a towel roll underneath the distal thigh. The patients were asked to actively extend the knee and the range was measured using the goniometer as active extension ROM. The difference between the active extension ROM and the passive extension ROM was recorded by the observer as the degree of extension lag (Stillman 2004).

Alignment

Measurements of the degree of valgus and varus at the knee joint were obtained from the surgeon.

STATISTICAL ANALYSIS

Reliability was assessed by making use of an Intraclass Correlation Coefficient (ICC) (John 2004). ICC (h) is the number obtained from the statistical analysis which ranges from zero to positive or negative one. The closer the value of ICC is to one, the closer the relationship between the two variables (Hicks 1995).

RESULTS

Thirty patients were initially included in this study. Two patients were excluded due to severe pain and one patient was excluded as she was unwilling to participate once the testing began. In three patients, both knees met the inclusion criteria, therefore 30 knees were examined. The alignment scores for two knees were missing from the database and therefore the following results are from the scores of 28 knees. The average time between the first and second measurements was one hour, and in no case was the time less than 45 minutes.

Intra-rater reliability

The total scores obtained by individual examiners during their assessments with the KSKS were used to establish the intra-rater reliability of the KSKS.
The first set of scores obtained by examiner A was compared and correlated with the second set of scores obtained by examiner A. The same procedure was followed for examiner B. Individual items on the KSKS were also subjected to analysis. The ICC (h) for intra-rater reliability for the individual items as well as the total score is shown in Table 2.

Table 2: The Intraclass Correlation Coefficient (h) of intra-rater reliability for the individual items from Examiner A and Examiner B.

Item on the KSKS       Examiner A   Examiner B
Knee ROM               0.96         0.94
AP stability           0.82         0.80
ML stability           0.72         0.82
Flexion contracture    0.95         0.89
Extension lag          0.65         0.87
Total                  0.95         0.71

* AP stability = anterior posterior stability, ML stability = medial lateral stability

Examiner A showed excellent correlation and examiner B showed good correlation for the KSKS (0.90 or above = excellent, 0.70 to 0.89 = good, 0.50 to 0.69 = moderate, below 0.50 = poor).

Inter-rater reliability

The total scores obtained by examiners A and B during their test 1 and test 2 were used to estimate the inter-rater reliability of the KSKS. The set of scores obtained from test 1 by examiner A and examiner B were correlated. Similarly, the set of scores obtained from test 2 by examiner A and examiner B were also correlated. Individual items on the KSKS were also subjected to analysis. The ICC (h) measuring inter-rater reliability of the individual items as well as the total score is shown in Table 3.

Most of the individual items in the scoring system showed a poor correlation between the two examiners. Overall, the examiners showed moderate correlation between the KSKS during test 1 and test 2.

DISCUSSION

Practice of assessment and evaluation in physiotherapy has been emphasised not only for the purpose of quality service, but also for audit and research advances (Stavem and Arnesen 2005; Kreibich et al 1996).
It has become essential for the physiotherapist to assess the effectiveness of a treatment using an outcome measure which is valid and reliable. Besides improving the quality of health care services, reliable outcome measures enhance the quality of trials in which they are used (John 2004). It is important for physiotherapists to communicate their findings in the same terms as other health professionals to facilitate the team approach to patient management.

The aim of this study was to establish the intra- and inter-rater reliability of two physiotherapists using the KSKS. The common method of test-retest reliability was implemented by administering the KSKS at two different times with an average time interval of 60 minutes between them. The time interval in this study was not so short that the memory of the previous test biased the performance of the examiner (Thomas and Stewart 2005) and not so long that the attributes being measured could have changed (Finch et al 2002; Campbell and Machin 1999).

It could be argued that the good overall reliability was biased by the memory of the previous measurement, as the time interval between the two tests (one hour) was shorter than that of a previous study (two hours) (Liow et al 2000). In the current study, the influence of memory was minimised by recruiting an independent observer to note down the measurements from the tests, with the intention that if the examiner was not actually writing down the measurement it would less easily be committed to memory.

In our study, examiner A showed excellent intra-rater reliability (h = 0.95) and examiner B showed good intra-rater reliability (h = 0.71). In a similar study (Liow et al 2000) the knee score was administered by six examiners with varying experience on 29 subjects.
The study showed considerable variations in intra-rater reliability, which were attributed to the limited experience of the examiners and a lack of training in administering the tool. They also found that the examiners with more than three years' experience showed relatively higher intra-rater reliability. In our study, both the physiotherapists had more than four years' experience and were trained in the assessment tool prior to the study, which may have contributed to the reliability.

The results of this study revealed a moderate inter-rater reliability between the examiners during test 1 (h = 0.67) and test 2 (h = 0.66). Ryd et al (1997) reported low inter-rater reliability with a standard deviation of 26 for the knee score, which is larger than that reported in this study (SD = 16). The physiotherapists in our study standardised the measurement procedures through repeated practice, training and discussion prior to the study. This is supported by Liow et al (2000).

To complete this discussion, individual components of the score will be discussed. Intra-rater reliability for ROM was excellent for both examiners (h = 0.96 and 0.94). Inter-rater reliability for ROM was also good (h = 0.85 and 0.82). Analysis of the flexion contracture component of the KSKS showed excellent (h = 0.95) and good (h = 0.89) intra-rater reliability. This is considerably higher than in the previous study by Liow et al 2000 (Kappa = 0.52). Moderate inter-rater reliability was found between the examiners during test 1 and test 2 (h = 0.58 and 0.64). This is relatively higher than that reported between experienced staff by Liow et al 2000 for the same component (Kappa = 0.19).

It is of interest that knee ROM and flexion contracture showed very little variation. This may be because both were controlled passively by the examiners and the measurements were taken using goniometry, which has been found to be a reliable measure (Smith and Walker 1983; Gajdosik and Bohannon 1987).
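The reliability coefficients and bands discussed here can be illustrated with a short calculation. The paper does not state which form of the ICC was used, so the sketch below assumes the two-way random effects, absolute agreement, single measurement form, often written ICC(2,1). The function names `icc_2_1` and `reliability_band` are hypothetical, and the example scores are invented for illustration; they are not data from this study.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ASSUMPTION: the paper does not say which ICC form was used; this is one
    common choice. `scores` has one row per knee and one column per rating
    (e.g. Examiner A and Examiner B, or test 1 and test 2).
    """
    n = len(scores)      # number of knees
    k = len(scores[0])   # number of ratings per knee
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 knees and 2 ratings per knee")

    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares: knees (rows), ratings (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denom


def reliability_band(icc):
    """Label an ICC using the bands quoted in the Results section."""
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.70:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Hypothetical total KSKS scores for five knees (columns: Examiner A, Examiner B).
example_scores = [[82, 78], [65, 70], [90, 88], [55, 61], [74, 73]]
value = icc_2_1(example_scores)
print(f"ICC(2,1) = {value:.2f} ({reliability_band(value)})")

# Bands for total-score coefficients reported in this paper.
for reported in (0.95, 0.71, 0.67, 0.66):
    print(reported, reliability_band(reported))
```

The same function serves both analyses: for intra-rater reliability the two columns are one examiner's test 1 and test 2 scores, and for inter-rater reliability they are the two examiners' scores from the same test.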
When analysing the stability components of the KSKS, both examiners showed good intra-rater reliability in antero-posterior stability and medio-lateral stability. A previous study (Liow et al 2000) demonstrated moderate correlation in an experienced examiner with a Kappa value of 0.50. In our study, poor inter-rater reliability was found between the examiners. It is postulated that this is due to the subjective nature of the testing procedure. In contrast, the measurements from a goniometer are more objective and showed the highest inter-rater reliability among the items in the KSKS.

Table 3: The Intraclass Correlation Coefficient (h) of inter-rater reliability for the individual scores during test 1 and test 2, between Examiners A and B.

Item on the KSKS       Test 1 (A & B)   Test 2 (A & B)
Knee ROM               0.85             0.82
AP stability           0.47             0.45
ML stability           0.32             0.24
Flexion contracture    0.58             0.64
Extension lag          0.54             0.76
Total                  0.67             0.66

* ROM = range of motion, AP stability = anterior posterior stability, ML stability = medial lateral stability

An interesting finding was that the inter-rater reliability of extension lag improved from test 1 (h = 0.54) to test 2 (h = 0.76). This may be attributed to a learning effect, or to repeated reinforcement from the examiners.

In conclusion, the results of this study showed good intra-rater reliability and moderate inter-rater reliability for the KSKS when conducted by two experienced physiotherapists. Physiotherapists working in the field of osteoarthritis or total knee replacement rehabilitation should consider using this measure in the clinical setting as well as in research.

REFERENCES

Aitken DM, Bohannon RW. 2001. Functional independence measure versus short form-36: relative responsiveness and validity. International Journal of Rehabilitation Research. 24(1): 65-68.
APA (Australian Physiotherapy Association) position statement. 2003. Clinical justification and outcome measures. November. 1-3.
Bhave A, Mont M, Tennis S, Michele N, Starr R, Etienne G. 2005. Functional problems and treatment solutions after total hip and knee joint arthroplasty. The Journal of Bone and Joint Surgery. 87: 9-21.
Campbell MJ, Machin D. 1999. Medical statistics: a common sense approach. Third edition. John Wiley and Sons Ltd. England: 28-29.
Clarkson HM, Gilewich GB. 1989. Musculoskeletal Assessment: Joint Range of Motion and Manual Muscle Strength. First edition. Williams and Wilkins. Baltimore: 286.
Davies AP. 2002. Rating systems for total knee replacement. The Knee. 9: 261-266.
Dawson J, Fitzpatrick R, Murray D, Carr A. 1998. Questionnaire on the perceptions of patients about total knee replacement. Journal of Bone and Joint Surgery (Br). 80-B: 63-69.
Finch E, Brooks D, Stratford P, Mayo N. 2002. Physical rehabilitation outcome measures: A guide to enhanced clinical decision making. Second edition. Lippincott, Williams & Wilkins. Ontario: 28-31.
Gajdosik RL, Bohannon RW. 1987. Clinical measurement of range of motion: Review of goniometry emphasizing reliability and validity. Physical Therapy. 67(12): 1867-1872.
Hawker G, Wright J, Coyte P, Paul J, Dittus R, Croxford R, Katz B, Bombardier C, Heck D, Freund D. 1998. Health-related quality of life after Knee Replacement. Journal of Bone and Joint Surgery (Am). 80(2): 163-173.
Hicks CM. 1995. Research for Physiotherapists: Project Design and Analysis. Second edition. Churchill Livingstone. Singapore: 57-58.
Insall JN, Dorr LD, Scott RD, Scott WN. 1989. Rationale of the Knee Society Clinical Rating System. Clinical Orthopaedics and Related Research. Nov (248): 13-14.
John LM. 2004. The role of measurement reliability in clinical trials. Clinical Trials. 1: 553-566.
Kreibich DN, Vaz M, Bourne RB, Rorabeck CH, Kim P, Hardie R, Kramer J, Kirkley A. 1996. What is the best way of assessing outcome after total knee replacement? Clinical Orthopaedics and Related Research. 331: 221-225.
Lingard EA, Katz JN, Wright RJ, Wright EA, Sledge CB. 2001. Validity and Responsiveness of the Knee Society Clinical Rating System in comparison with the SF-36 and WOMAC. Journal of Bone and Joint Surgery (Am). 83-A(12): 1856-1864.
Liow RYL, Walker K, Wajid MA, Bedi G, Lennox CME. 2000. The reliability of the American Knee Society Score. Acta Orthop Scand. 71(6): 603-608.
Magee DJ. 1997. Orthopedic Physical Assessment. Third edition. W.B. Saunders. Pennsylvania: 539-547.
McAuley J, Harrer M, Ammeen D, Engh G. 2002. Outcome after Total Knee Arthroplasty in Patients with Poor Preoperative Range. Clinical Orthopaedics and Related Research. 404: 203-207.
Petty NJ, Moore AP. 1998. Neuromusculoskeletal Examination and Assessment. A Handbook for Physiotherapist. First edition. Churchill Livingstone. London: 295.
Rajan R, Pack Y, Jackson H, Gillies C, Asirvatham R. 2004. No need for outpatient physiotherapy following total knee arthroplasty: A randomized trial of 120 patients. Acta Orthopaedica Scandinavica. 75(1): 71-73.
Ryd L, Karrholm J, Ahlvin P. 1997. Knee scoring systems in gonarthrosis: evaluation of interobserver variability and the envelope of bias. Acta Orthopaedica Scandinavica. 68: 41-46.
Stavem K, Arnesen Q. 2005. Use of hip and knee clinical scoring systems in prosthesis surgery in Norwegian hospitals. International Orthopaedics (SICOT). 29: 301-304.
Stillman BC. 2004. Physiological Quadriceps lag. Its nature and clinical significance. Australian Journal of Physiotherapy. 50: 237-241.
Thomas SJ, Stewart A. 2005. Test-retest reliability, inter rater reliability and internal consistency of the post operative physiotherapy discharge scoring tool. Unpublished research report. University of the Witwatersrand. Health Sciences Library. Johannesburg.
Walsh M, Kennedy D, Stratford P, Woodhouse L. 2001. Perioperative performance of women and men following total knee arthroplasty. Physiotherapy Canada (spring): 92-100.
Wright JG, Feinstein AR. 1992. Improving the reliability of orthopaedic measurements. Journal of Bone and Joint Surgery. 74B: 287-291.