1. Department of Orthopaedic Surgery, University Hospital of Heraklion, Heraklion, Crete, Greece
2. 4th TO.M.Y., Heraklion, Crete, Greece
Traumatic, degenerative, and inflammatory musculoskeletal conditions are extremely common causes of pain and disability that affect patients of all age groups. They are responsible for a large number of health-care visits, days of hospitalisation, and days of work lost.1 Proper clinical assessment and interpretation of imaging studies are crucial for accurate diagnosis. However, during the treatment of musculoskeletal conditions, an important factor in decision-making is the impact of the disorder on the patient’s functional status and everyday activity. Therefore, it is necessary to evaluate the patient’s perspective on their condition. The use of valid and reliable patient-reported outcome measures (PROMs) can offer a better and more detailed assessment of the patient’s experience and provide critical information about prognosis and further management. Furthermore, a more detailed and in-depth evaluation of patients’ experience is essential for improving the care provided by health-care facilities.
PROMs can be classified into three broad categories: generic, disease-specific, and condition-specific.2 Generic PROMs can be used for a broad spectrum of clinical conditions and measure single aspects of health or cover multiple dimensions of health status.2 Disease-specific PROMs are used to assess the outcome regarding a particular condition.2 Condition-specific PROMs are not used to assess a particular disease, but a broader health condition or state. They include a range of functional status or disability measures used to assess the health of a particular population group such as the elderly or those with mental health conditions.3 The selection of a PROM depends on the construct of interest and the measurement properties of the PROM.4 PROM measurement properties include reliability, validity, and responsiveness.5 However, the quality of the studies providing evidence about the instruments’ measurement properties is often overlooked. The COSMIN (COnsensus-based Standards for the selection of health status Measurement INstruments) initiative developed a consensus-based standard for assessing the quality of studies on measurement properties.5
The purpose of using PROMs in clinical practice and research is to achieve an accurate representation of the patients’ perspectives. For that reason, it is important that the patient completes the questionnaire unassisted. If the patient cannot comprehend the questionnaire because of language difficulties, the reliability of the data can be compromised. As a result, translation of PROMs into other languages and cross-cultural adaptation using well-accepted methodological standards are necessary for the development of appropriate questionnaires. The aim of this review is to systematically identify the validated Greek-language PROMs reported in the published literature that are used to assess musculoskeletal conditions, and to evaluate the psychometric properties of the identified instruments using the COSMIN Risk of Bias checklist.
Literature search
A structured search of PubMed/MEDLINE, Embase, Scopus, and the Cochrane Library was performed without time restriction in order to identify studies translating and validating a PROM into the Greek language. Only studies published in the English language were included. The electronic search was tailored to the individual database being searched and was based on the protocol suggested by the COSMIN group.6 The search strategy combined index terms and free-text words (including patient-reported outcome measures, quality of life, questionnaire, assessment tool, outcome tool, outcome measures, instrument, score, scale, cross cultural, Greek) with the Boolean operators ‘OR’ and ‘AND’. The final search was performed on 6 February 2021. Reference lists were hand-searched to identify potential additional relevant studies.
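As an illustration of how such free-text terms can be combined with ‘OR’ and ‘AND’, the following minimal Python sketch builds a generic query string. The grouping of the terms into three blocks and the names `INSTRUMENT_TERMS`, `ADAPTATION_TERMS`, `LANGUAGE_TERMS`, `or_block`, and `build_query` are assumptions made for illustration; the database-specific search strings actually used are not reproduced in this text.

```python
# Hypothetical grouping of the free-text terms listed in the text.
# The actual database-specific search strings are not reported here.
INSTRUMENT_TERMS = [
    "patient-reported outcome measures", "quality of life", "questionnaire",
    "assessment tool", "outcome tool", "outcome measures", "instrument",
    "score", "scale",
]
ADAPTATION_TERMS = ["cross cultural"]
LANGUAGE_TERMS = ["Greek"]


def or_block(terms):
    """Quote each term, join the terms with OR, and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*groups):
    """Combine the OR-block of each group with AND."""
    return " AND ".join(or_block(g) for g in groups)


query = build_query(INSTRUMENT_TERMS, ADAPTATION_TERMS, LANGUAGE_TERMS)
print(query)
```

In practice each database needs its own syntax (field tags, index terms), so a string like this would only be a starting point.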
Selection Criteria for Eligible Studies
After removal of duplicate studies, two reviewers (ID and IS) independently assessed all titles and abstracts. We included all studies that reported a translation and validation of at least one PROM, designed for the assessment of musculoskeletal conditions, into the Greek language. Clinical studies were eligible regardless of the presence or type of study intervention. Studies for de novo development of PROMs in Greek were also included. Any disagreement regarding eligibility of a study was resolved by consensus between the two reviewers, and if required, the senior author (T.T.) was consulted.
Data Extraction
Data were extracted by ID and IS. The following data were extracted from each publication: the PROM, the intended construct for measurement, measurement properties, study population and diagnosis, number of patients, patient demographics, country and language.
Assessment of the quality of studies and assessment of measurement properties
Two authors independently rated the methodological quality of the eligible studies using the COSMIN Risk of Bias checklist.7 Furthermore, the quality of measurement properties was assessed according to the COSMIN criteria for good measurement properties.6
The COSMIN Risk of Bias checklist consists of three sections. The first section involves content validity, which is the degree to which the content of a PROM is an adequate reflection of the construct to be measured.2 Content validity evaluation includes relevance (all items in a PROM should be relevant for the construct of interest within a specific population and context of use), comprehensiveness (no key aspects of the construct should be missing), and comprehensibility (the items should be understood by patients as intended).7 In this systematic review, only the comprehensibility of the translated versions of PROMs was assessed, as relevance and comprehensiveness are considered more applicable to the initial development of the instrument. For the tools developed de novo in Greek, the development checklist was utilised and all components of content validity were evaluated. The second section of the checklist evaluates internal structure, and it consists of structural validity, internal consistency, and cross-cultural validity/measurement invariance. The third section involves the remaining measurement properties: reliability, measurement error, criterion validity, hypotheses testing for construct validity, and responsiveness. The methodological quality of each measurement property is assessed with a box of questions, each scored as “Very good”, “Adequate”, “Doubtful”, “Inadequate”, or not applicable according to defined COSMIN criteria. A system of ‘worst score counts’ applies for each box: the overall rating of the box is the lowest rating given to any of its questions. The methodological quality of a measurement property can therefore only be rated “Very good” if all the applicable questions in its box are rated “Very good”.
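The ‘worst score counts’ rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the COSMIN group's own tooling; the function name `worst_score_counts` and the handling of a box in which every question is not applicable are assumptions.

```python
# Ratings ordered from best to worst, as named in the text.
RATING_ORDER = ["Very good", "Adequate", "Doubtful", "Inadequate"]
NOT_APPLICABLE = "Not applicable"


def worst_score_counts(question_ratings):
    """Return the overall rating of one checklist box.

    Questions rated "Not applicable" are ignored; the box takes the
    lowest (worst) rating among the remaining questions.
    Assumption: a box with no applicable question is "Not applicable".
    """
    applicable = [r for r in question_ratings if r != NOT_APPLICABLE]
    if not applicable:
        return NOT_APPLICABLE
    # The worst rating is the one furthest along RATING_ORDER.
    return max(applicable, key=RATING_ORDER.index)


# Example: a single "Doubtful" question determines the rating of the box.
box = ["Very good", "Very good", "Doubtful", "Not applicable"]
print(worst_score_counts(box))
```

Because a single weak question fixes the rating of the whole box, one flaw such as a short retest interval is enough to downgrade an otherwise well-conducted study.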
Search results
A total of 6743 studies were initially identified in the literature search. Removal of duplicates yielded 6612 studies. After screening, 43 full-text articles were retrieved, of which 32 met the inclusion criteria for this review. The study selection flow chart is shown in Figure 1.
The identified studies included 31 PROMs. Two of them were developed de novo in the Greek language, and 29 were translated versions. The characteristics of the identified studies are shown in Table 1 and the characteristics of the identified PROMs are shown in Table 2.
All the identified instruments regarding the musculoskeletal system were disease-specific. The majority of the questionnaires (16) involved the lower limb.8-23 Nine of them involved knee conditions,10-18 which was the entity with the largest number of PROMs translated into the Greek language. Four instruments were retrieved for the evaluation of hip conditions,8-10 while three questionnaires involved foot and ankle pathologies.19-21 Five instruments translated and validated in Greek were retrieved for upper limb conditions.24-29 Regarding spine conditions, six instruments were retrieved,30-35 with two of them being health-related quality of life measures.30,32,33 Two questionnaires involved other conditions: fibromyalgia37 and juvenile arthritis.38 Two questionnaires constructed de novo in Greek were also retrieved: the Functional Assessment Scale for Acute Hamstring Injuries (FASH)22 and the Brace Questionnaire (BrQ).36
Quality of the included studies
In the 32 identified studies, 31 PROMs were validated. The total number of reported measurement properties was 171. The methodological quality was inadequate for 37 of them (21%) and doubtful for 43 (25%). Many measurement properties were not reported. The methodological quality of the studies is summarised in Tables 3 and 4. The measurement properties of each PROM were rated according to the COSMIN criteria for good measurement properties (Table 5).
Summary of translated PROMs
PROMs about hip disorders
The modified Harris Hip Score (mHHS) was developed in 2000,39 as a modification of the original Harris Hip Score.40 It includes only assessments of pain and function; therefore, it can be used as a patient-reported outcome measure. The reliability of the Greek version of the mHHS8 received a sufficient rating. The rest of the measurement properties were indeterminate or were not reported.
The 12-item International Hip Outcome Tool (iHOT-12)41 was developed as a shorter version of the 33-item International Hip Outcome Tool questionnaire,42 and it is used for the assessment of the quality of life of patients with hip disorders. In the Greek version of iHOT-12, reliability was rated sufficient. The rest of the measurement properties were indeterminate or were not reported.
PROMs about knee disorders
The literature search yielded nine instruments for the evaluation of knee conditions translated into Greek. The Western Ontario and McMaster Osteoarthritis Index (WOMAC) is a 24-item questionnaire,43 designed for the assessment of patients with hip or knee osteoarthritis. It has been translated and validated in Greek in two studies.10,11 In the study of Konstantinidis et al.,10 comparative validation with the Lequesne Index44 was performed. The majority of the participants (68 of 97) were patients with knee osteoarthritis, with the rest being patients with hip osteoarthritis. The study of Papathanasiou et al.11 included only patients with knee osteoarthritis. In both studies, internal consistency and reliability received sufficient ratings. The rest of the measurement properties received adequate ratings.
Six instruments were retrieved that can be used for various knee pathologies: the Knee Outcome Survey-Activities of Daily Living Scale (KOS-ADLS),12 the Knee Injury and Osteoarthritis Outcome Score (KOOS)13 and KOOS-Child,14 the International Knee Documentation Committee Subjective Knee Form (IKDC),15 the Lysholm Knee Scoring Scale (LKSS),16 and the Tegner Activity Scale (TAS).16
The KOS-ADLS is a 14-item questionnaire assessing the symptoms and function during daily activities of patients with knee pathologies.45 The Greek version of KOS-ADLS12 received sufficient ratings for internal consistency, reliability, construct validity and responsiveness. The rest of the measurement properties were not reported or indeterminate.
The KOOS consists of 42 questions divided into five domains.46 It assesses Pain (9 items), Symptoms (7 items), Activity of Daily Living (ADL; 17 items), Sport and Recreation Function (Sports/Rec; 5 items) and Quality of Life (QoL; 4 items). The Greek version of KOOS13 received sufficient ratings for internal consistency, reliability, construct validity and responsiveness. The rest of the measurement properties were not reported or were indeterminate. Because some of the items were difficult for the paediatric population to understand, another version of KOOS, modified for children (KOOS-Child), was developed.47 Reliability, construct validity and responsiveness of the Greek version of KOOS-Child14 were rated sufficient. All other measurement properties were indeterminate or were not reported.
The IKDC48 is a 10-item instrument that evaluates symptoms and functional status in both daily life and sports activities. The Greek version of IKDC15 received sufficient ratings for internal consistency, reliability, construct validity, and responsiveness. The rest of the measurement properties were not reported or indeterminate.
The internal consistency and reliability of the Greek versions of the LKSS and the TAS16 were rated sufficient. All other measurement properties were indeterminate or not reported.
Two instruments were retrieved that were specifically designed for the assessment of anterior knee pain: the Kujala Anterior Knee Pain Scale (KAKPS)17 and the Victorian Institute of Sport Assessment scale-Patella (VISA-P) questionnaire.18 All measurement properties of the KAKPS Greek version17 were indeterminate or not reported, except for internal consistency and reliability. The measurement properties of the VISA-P Greek version18 were likewise indeterminate or not reported, except for construct validity and responsiveness.
PROMs about ankle disorders
The Achilles Tendon Rupture Score (ATRS) is the only outcome measure validated for Achilles tendon ruptures.52 Its purpose is the evaluation of symptoms and function after Achilles tendon rupture. Internal consistency, reliability, construct validity and responsiveness of the Greek version19 received sufficient ratings. The rest of the measurement properties were not reported.
The Cumberland Ankle Instability Tool (CAIT) is a questionnaire of nine independently-scored items, for the assessment of symptoms of ankle instability.53 The Greek version of the CAIT20 received sufficient ratings for internal consistency and reliability. All other measurement properties were indeterminate or were not reported.
The Manchester Foot Pain and Disability Index (MFPDI) was the only retrieved tool translated and validated in Greek21 that is designed for the assessment of disability caused by foot disorders. It consists of 19 items, each starting with the statement "Because of pain in my feet", divided into three subscales: functional limitation, pain intensity, and concern with personal appearance. Only the internal consistency of the Greek version was rated sufficient. Reliability and responsiveness were indeterminate. All other measurement properties were not reported.
PROMs about upper limb disorders
The Shoulder Pain and Disability Index (SPADI) has been translated and validated in Greek in two studies.24,25 In the study of Vrouva et al.,24 the participants were patients with rotator cuff tear, treated conservatively. All measurement properties were rated sufficient except for measurement error and criterion validity, which were not reported. In the study of Spanou et al.,25 the participants were patients who had had shoulder pain for at least four weeks. Internal consistency, construct validity and responsiveness were rated sufficient. All other measurement properties were indeterminate or not reported.
The Disabilities of the Arm, Shoulder, and Hand (DASH) Questionnaire is utilised for the assessment of a variety of symptoms associated with upper limb disorders. Only the internal consistency of the Greek version26 received a sufficient rating. Reliability was rated insufficient, and the rest of the measurement properties were indeterminate or not reported. The Hand20 questionnaire was also designed for the assessment of a variety of symptoms of upper limb disorders.58 Evaluation of the quality of its measurement properties showed sufficient internal consistency and reliability, with all other measurement properties being indeterminate or not reported.
One instrument was retrieved for the evaluation of hand conditions, the Boston Carpal Tunnel Questionnaire (BCTQ).59 All measurement properties of the Greek version28 received sufficient ratings, except for structural validity, cross-cultural validity and measurement error, which were not reported.
One questionnaire was retrieved about elbow disorders, the Patient-rated Tennis Elbow Evaluation (PRTEE) which is an updated version of the Patient-Rated Forearm Evaluation Questionnaire (PRFEQ).60 The Greek version of PRTEE29 received sufficient rating for reliability. All other measurement properties were indeterminate or not reported.
PROMs about spine disorders
Six instruments were retrieved for the evaluation of spine conditions. The Neck Disability Index is a short, condition-specific questionnaire used for patients with neck pain.61 It consists of 10 items concerning various activities. The Greek version of NDI31 received sufficient ratings for reliability and measurement error. All other measurement properties were indeterminate or not reported.
Regarding the assessment of patients with low back pain, three condition-specific tools were identified: the Quebec Back Pain Disability Scale (QBPDS),34 the Oswestry Disability Index (ODI) and the Roland-Morris Disability Questionnaire (RMDQ).35 Internal consistency of all three translated versions was rated sufficient. All other measurement properties were indeterminate or not reported.
Finally, two patient-reported health-related quality of life (HRQoL) measures were identified: the Ankylosing Spondylitis Quality of Life (ASQoL) questionnaire30 and the Scoliosis Research Society-22 (SRS-22) questionnaire.32,33 The internal consistency of the ASQoL Greek version30 was rated sufficient. All other measurement properties were indeterminate or not reported. The SRS-22 has been translated and validated in Greek in two studies. The study of Antonarakos et al.32 included surgically treated patients, while the study of Potoupnis et al.33 included conservatively treated patients. The ratings were similar: sufficient internal consistency and reliability, with the rest of the measurement properties being indeterminate or not reported.
PROMs constructed de novo in Greek
Two instruments were retrieved that were constructed de novo in Greek, the Brace Questionnaire (BrQ) and the Functional Assessment Scale for Acute Hamstring Injuries (FASH). The BrQ was constructed by Vasiliadis et al. in 2006,36 and it is an HRQoL measure for adolescents with idiopathic scoliosis treated conservatively. It consists of 34 items divided into 8 subdomains. The methodology for total PROM design received an “inadequate” rating, because the construct of interest was not clearly described according to the COSMIN criteria.7 A pilot test of the questionnaire was not performed; therefore, the content validity of the questionnaire was not assessed.
The FASH questionnaire was constructed in 2014 by Malliaropoulos et al.22 It is a condition-specific, 10-item questionnaire designed to evaluate the functional status of athletes with hamstring injuries. Total PROM design was rated “inadequate”, as the description of the construct was not clear. A pilot test was performed, and the sample was an accurate representation of the target population. However, the items were not tested in their final form; thus, the methodological quality of the comprehensibility assessment was rated “inadequate”. Comprehensiveness was not assessed. A summary of the PROM development checklist is presented in Table 6.
The purpose of this review was to summarise the PROMs involving musculoskeletal conditions that have been translated and validated in Greek, and to evaluate the methodological quality of the validation studies according to the COSMIN Risk of Bias checklist.7 Thirty-one PROMs were identified, 29 of which were translated versions. The methodological quality was adequate for 47.3% (n=81) of the measurement properties, and 45% (n=77) of the measurement properties received the “sufficient” rating. The remaining measurement properties were indeterminate or not reported.
Content validity is the degree to which the content of an instrument is an adequate reflection of the construct to be measured5 and is the most important measurement property of a PROM. Comprehensibility is a significant component of content validity, and it was rated “insufficient” in the majority of the studies, as cognitive debriefing was not performed during pre-testing or the process that was used was not clearly described. Other components of content validity (comprehensiveness and relevance) were not evaluated in this systematic review, as they are considered more applicable to the initial development of a PROM.
Structural validity refers to the degree to which the scores of a PROM are an adequate reflection of the dimensionality of the construct to be measured.3 It is usually assessed with factor analysis. In the majority of the studies, factor analysis was not performed,8,11,14,15,16-19,20,24,26-28,31-33,36 and the authors assumed models from other studies that evaluated the structural validity of the construct of interest.
Internal consistency is an important component of internal structure of a PROM. It represents the degree of interrelatedness among the items and is often assessed by Cronbach’s alpha.5 For the appropriate interpretation of internal consistency, the items should form a unidimensional scale or subscale. Unidimensionality means that the items in a scale or a subscale measure a single construct. Internal consistency was one of the most frequently reported measurement properties across the studies. The methodological quality was sound in the vast majority of them, with the calculation of Cronbach’s alpha.
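For readers less familiar with the statistic, Cronbach's alpha for a unidimensional scale of k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following is a minimal Python sketch of that standard formula; the function name `cronbach_alpha` and the example responses are made up for illustration and are not data from any of the included studies.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one unidimensional (sub)scale.

    responses: list of rows, one row per respondent, each row holding
    that respondent's score on every item of the scale.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(col) for col in zip(*responses))
    # Sample variance of the total (summed) score per respondent.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Illustrative, invented responses of four people to a two-item scale.
example = [[1, 2], [2, 1], [3, 3], [4, 4]]
print(round(cronbach_alpha(example), 3))
```

The value is only interpretable when the items really do form a single scale or subscale, which is why structural validity should be established before internal consistency is reported.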
Cross-cultural adaptation is the cornerstone of the comprehensibility of a PROM, and it is necessary for the accurate rendering of a PROM in another language. The translation process in most of the studies complied with international guidelines (such as those of Beaton et al.69). For further confirmation of cross-cultural validity, the COSMIN guidelines suggest performing comparisons between at least two groups that differ in characteristics such as gender, literacy or language. However, such comparisons were performed in only two studies, the validation of CAIT20 and the validation of SPADI by Vrouva et al.24
Reliability refers to the proportion of the total variance in the measurements which is due to “true” differences between patients. The “true” score is the average score that would be obtained if the scale was administered an infinite number of times to the same person.5 It does not concern the accuracy of an instrument, but only its consistency.70 Reliability also refers to the ability of a PROM to distinguish between patients.5 Reliability was reported in 27 of 32 studies. The methodology was inadequate in 20 of them, even though the results were sufficient (ICC > 0.70). The main reason for the inadequate rating of methodology was that the interval between the first and the second completion of the questionnaire was much shorter than the 2 weeks deemed acceptable by the COSMIN guidelines.7 The same shortcoming also applied to the calculation of measurement error.
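To make the test-retest statistic concrete, the sketch below computes an intraclass correlation coefficient from repeated administrations of a questionnaire. It assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the included studies may have used other ICC models, and the function name `icc_2_1` and the example scores are invented for illustration.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one per patient, each row holding that
    patient's total score at every administration (e.g. test, retest).
    """
    n = len(scores)       # number of patients
    k = len(scores[0])    # number of administrations
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between patients
    ms_cols = ss_cols / (k - 1)                 # between administrations
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented example: every patient scores 2 points higher at retest.
example = [[10, 12], [20, 22], [30, 32]]
print(round(icc_2_1(example), 3))
```

Under absolute agreement, a systematic shift between the two administrations lowers the coefficient even when patients keep the same rank order, which is one reason the retest interval and the stability of the patients in between matter.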
Hypotheses testing for construct validity refers to the consistency of the scores of a PROM with hypotheses, assuming that the PROM validly measures the construct to be measured. The more specific the hypotheses are and the more hypotheses are tested, the more evidence is gathered for construct validity.71 Many types of hypotheses can be tested to evaluate the construct validity of a PROM.5 Responsiveness refers to the ability of a PROM to detect change over time in the construct to be measured. The difference between construct validity and responsiveness is that construct validity refers to the validity of a single score, and responsiveness refers to the validity of a change score.5 The standards for evaluation of responsiveness are similar to the standards utilised for construct validity evaluation. Hypotheses testing for construct validity and responsiveness were reported in all the studies. The most common method of assessment was comparison with other outcome measures, and the methodological quality was adequate. However, in most of the studies, the results were deemed indeterminate because hypotheses had not been stated a priori.
Literature search did not yield any further validation studies for any of the translated versions of PROMs, besides the initial ones. However, it cannot be excluded that the instruments are utilized in everyday clinical practice. The BrQ36 has been translated and validated in Polish,71 Italian,72 French,73 Korean,74 and Persian.75 The results of the validation studies were satisfactory regarding reliability. The FASH scale has been validated in German76 and French.77 The validation studies reported satisfactory internal consistency and reliability results.
To the best of our knowledge, this is the first systematic review that summarises the PROMs related to the musculoskeletal system that have been translated into the Greek language and evaluates their measurement properties according to the COSMIN criteria. This review can be used by clinicians as an everyday reference guide to the available instruments translated into the Greek language. It also highlights the strengths and limitations of the studies conducted to validate PROMs in the Greek language. Therefore, it offers information for future researchers about the quality of the existing studies and how to avoid shortcomings in the future. The limitation of this study is that it only includes studies with PROMs constructed in the English language. Instruments constructed in other languages were not included.
Literature search for this review revealed that there is a lack of translated and validated instruments in Greek in several areas of musculoskeletal medicine, such as traumatology, paediatric orthopaedics, and orthopaedic oncology. Further research is encouraged with studies in compliance with the COSMIN criteria in order to translate and validate new outcome measures in Greek regarding those areas. In addition, further research is encouraged regarding the PROMs that have already been translated in Greek, in order to achieve further validation of their measurement properties and report the measurement properties that have not been previously reported.
A number of PROMs related to musculoskeletal conditions have been translated into the Greek language. The majority of them involve the lower limb, especially knee conditions. Further validation of these instruments is encouraged, with studies of good quality according to the COSMIN checklist. In addition, there are several fields of musculoskeletal medicine in which outcome measures have not yet been translated. Therefore, new tools covering those areas of clinical practice need to be translated into Greek in compliance with the COSMIN criteria.
The authors declare no conflict of interest.