SUPPLEMENT
Section of Community Psychiatry (PRiSM), Institute of Psychiatry, King's College London, UK
Department of Psychiatry, Academic Medical Centre, Amsterdam, The Netherlands
Institute of Preventive Medicine, Copenhagen University Hospital, Denmark
Clinical and Social Psychiatry Research Unit, University of Cantabria, Santander, Spain
Department of Medicine and Public Health, University of Verona, Italy
Medical Statistics Unit, London School of Hygiene and Tropical Medicine, London, UK
Correspondence: Paul McCrone, Section of Community Psychiatry (PRiSM), Institute of Psychiatry, King's College, London, De Crespigny Park, Denmark Hill, London SE5 8AF. Tel: 020 7848 0711; fax: 020 7277 1462
Declaration of interest No conflict of interest. Funding detailed in Acknowledgements.
ABSTRACT
Method The CAN-EU was administered in each country at two points in time to assess test-retest reliability, and was rated by two interviewers at the first administration. Cronbach's α, test-retest reliability and interrater reliability were compared between the five sites. Reliability coefficients and standard errors of measurement for summary scores were estimated.
Results Sites varied in levels and spread of needs. Alphas were 0.64, 0.48 and 0.58 for total, met and unmet needs respectively. Test-retest reliability estimates, pooled over sites, were 0.85 for total needs, 0.69 for met needs and 0.78 for unmet needs. Pooled estimates for interrater reliability were higher, at 0.94, 0.85 and 0.79 for total, met and unmet needs respectively. There were statistically significant differences in interrater reliability between sites.
Conclusion The results confirm the feasibility of using CAN-EU across sites in Europe and its psychometric adequacy.
INTRODUCTION
Within the domain of the assessment of individual patient outcomes, increasing importance has recently been attached to the needs of those who suffer from mental illness. This emphasises the active role of the users of mental health services, and also raises a series of important questions: How can needs be defined, and by whom? How can they be measured and compared? What importance should be accorded to both met and unmet needs in the assessment of individual patients, and in the planning and evaluation of mental health services as a whole? How should the needs of those suffering from schizophrenia be prioritised in relation to the needs of other diagnostic groups?
CAMBERWELL ASSESSMENT OF NEED - EUROPEAN VERSION (CAN-EU)
Each item of the CAN-EU contains the same question structure. The first question asks whether a need exists, and, if it does, whether it is a met or an unmet need. If there is no need in any particular area, then the interviewer proceeds straight to the next item. If a met or unmet need does exist, then further questions relating to service receipt for that item are asked. The first of these finds out how much care is received from friends or relatives (0=no help, 1=low help, 2=moderate help, 3=high help). The same question is asked about care received from formal services and also how much care is required from formal services. Finally, the person being interviewed is asked whether overall they receive the right sort of help, and whether they receive the right amount of help. Both of these are rated as zero (no) or one (yes).
Summary scores of the total number of needs (the number of 1s or 2s), the met needs (the number of 1s) and the unmet needs (the number of 2s) are computed. If the number of valid items (i.e. excluding missing values) is 18 or more, a prorated total is computed from the valid items; otherwise the summary score is regarded as missing.
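The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' scoring program: the function name `can_summary`, the use of `None` for a missing rating, the coding of 0 as "no need", and the prorating formula (scaling the count up to the full 22 items) are all assumptions made for the example.

```python
from typing import Optional, Sequence

N_ITEMS = 22     # number of CAN items, as stated in the Discussion
MIN_VALID = 18   # minimum number of valid items needed for a prorated score


def can_summary(ratings: Sequence[Optional[int]]) -> Optional[dict]:
    """Return prorated total, met and unmet need scores, or None if too few valid items.

    Each rating is assumed to be 0 (no need), 1 (met need), 2 (unmet need)
    or None (missing).
    """
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    valid = [r for r in ratings if r is not None]
    if len(valid) < MIN_VALID:
        return None  # summary score regarded as missing
    # Assumed prorating rule: scale counts up to the full number of items.
    scale = N_ITEMS / len(valid)
    met = sum(1 for r in valid if r == 1)
    unmet = sum(1 for r in valid if r == 2)
    return {"total": (met + unmet) * scale, "met": met * scale, "unmet": unmet * scale}
```

For a patient with five met needs, three unmet needs and two missing items, the sketch would scale each count by 22/20; with five missing items it returns `None`.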
AIMS OF THE EPSILON STUDY
Although it is now relatively common for the authors of outcome scales to publish details of scale reliability in the original language, it is rare for the authors of translated versions to repeat the reliability exercise in the new languages, or indeed to do more than undertake a literal translation. This study therefore aims to undertake the conversion and cultural adaptation of each of the five main study scales into all the study languages in a comprehensive and scientifically rigorous manner.
METHOD
Case identification
Cases included in the study were adults aged 18-65, selected as
representative of all people suffering from schizophrenia utilising mental
health services in each of the five study sites. Study samples were identified
either from psychiatric case registers (in Copenhagen and Verona) or
case-loads of local special mental health services (in-patient, out-patient
and community). Patients included had been in contact with mental health
services during the 3 months before the start of the study in 1997. Patients
with a clinical diagnosis of any ICD-10 categories F20-F25 were eligible to
enter screening, undertaken with the Item Group Checklist (IGC), which is part
of the Schedule for Clinical Assessment in Neuropsychiatry (SCAN) developed by
the World Health Organization
(1992). Only patients with an
ICD-10 F20 research diagnosis were finally included.
The exclusion criteria were: current residence in prison, secure residential services or hostels for long-term patients; co-existing mental retardation, primary dementia or other severe organic disorder; and extended in-patient treatment episodes longer than one year. The numbers of patients finally included in the study varied from 52 to 107 between the five sites, with a total of 404 for the study as a whole.
Outcome scales
The study included the conversion of five scales from their original
language into the other four study languages. The scales are: the Camberwell
Assessment of Need - European Version (CAN-EU), the Client Socio-Demographic
and Service Receipt Inventory - European Version (CSSRI-EU), the Involvement
Evaluation Questionnaire - European Version (IEQ-EU), the Lancashire Quality
of Life Profile - European Version (LQoLP-EU), and the Verona Service
Satisfaction Scale - European Version (VSSS-EU). These instruments were based
on the following original versions: the Camberwell Assessment of Need
(Phelan et al, 1995),
the Client Service Receipt Inventory
(Beecham & Knapp, 1992),
the Involvement Evaluation Questionnaire
(Schene et al, 1998),
the Lancashire Quality of Life Profile
(Oliver, 1991), and the Verona
Service Satisfaction Scale (Ruggeri &
Dall'Agnola, 1993). The CAN-EU reliability results are presented
in this paper, and the results for the other scales appear in this supplement
in the papers by Schene et al
(2000), Gaite et al
(2000) and Ruggeri et
al (2000). Reliability
tests were not appropriate for the CSSRI-EU, and its development is described
by Chisholm et al
(2000, this supplement).
Two other groups of questionnaires were used in the main study. The first group consisted of a number of instruments which had been developed previously by other authors. Local services were described using the European Service Mapping Schedule (ESMS) (Johnson et al, 1998). The Brief Psychiatric Rating Scale (Overall & Gorham, 1962) was used to measure symptomatology. Disability was measured by the Global Assessment of Functioning (American Psychiatric Association, 1987). These were not converted to different languages and were used or produced in English. Second, we also used instruments documenting the sampling process (Prevalence Cohort Data Sheet), area socio-demographic descriptors (Area Socio-Demographic Data Sheet) and patients' psychiatric history (Psychiatric History Data Sheet). These were developed for the purpose of the study in English, and all are available from the first author on request. Becker et al (1999) describe the study and the methodology employed.
Interviewing and data preparation
All interviewers received training at the Institute of Psychiatry, London,
UK, in the use of SCAN and the other study instruments. There were regular
contacts to ensure standard use of instruments and a series of study
co-ordinating meetings. Data consistency and homogeneity were ensured by the
co-ordinating centre (in London) preparing the SPSS templates used at all the
participating sites. Consistent data structures were adhered to.
Reliability assessment procedures
Reliability testing was conducted on several levels depending on the nature
of the responses involved and whether the instruments are administered as
interviews or questionnaires. Three kinds of reliability test were used: (a)
Cronbach's α statistic, to estimate the internal consistency of scales
and sub-scales consisting of more than one item; (b) Cohen's κ
statistic to estimate the interrater reliability and test-retest reliability
of single items where these are expressed as binary variables; and (c)
intraclass correlations, to estimate the interrater reliability and
test-retest reliability of scales and sub-scales. These statistics are
discussed in Streiner & Norman
(1995). Each step in the
analysis was described in an analysis protocol which was followed by all
sites.
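As an illustration of the first two statistics, the sketch below computes Cronbach's α from a patients-by-items table and Cohen's κ from two binary ratings of the same patients. It uses the standard textbook formulas; the function names and data layout are assumptions for the example, and the study itself used SPSS and the ALPHA.EXE program rather than this code.

```python
from statistics import pvariance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row per patient and one column per item."""
    k = len(rows[0])
    # Variance of each item across patients, and variance of the summed score.
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) ratings of the same patients."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    # Agreement expected by chance from the two marginal rates.
    p_exp = pa * pb + (1 - pa) * (1 - pb)
    return (p_obs - p_exp) / (1 - p_exp)
```

To apply `cohen_kappa` to a CAN item, the 0/1/2 rating would first be recoded as a binary variable, for example 1 where the rating indicates an unmet need and 0 otherwise. Note that κ is undefined when chance agreement equals 1 (both raters give every patient the same rating), which the sketch does not guard against.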
First, summary statistics were computed for each site, and differences in
sample variances were explored using the Levene test
(Levene, 1960). Cronbach's α
was computed for each site, and for the pooled sample, and a test for
differences in α values between sites was performed
(Feldt et al, 1987).
Intraclass correlation coefficients (ICCs) were computed by maximum likelihood
estimation of a variance components model, with patients entered as random
effects, and (in the case of pooled estimates), site entered as a fixed
effect. The data for each patient were either all time 1 ratings (for
interrater reliability), or all ratings by the first rater (for test-retest
reliability). All available data were used for these analyses, including cases
where only one rating was present; however, values of n quoted in
those tables which relate to reliability are the numbers of complete pairs.
Only at one site (Verona) were there sufficient raters to estimate a specific
interrater component of variance. However, for consistency in estimation
between the sites, rater was not specifically included in the model.
Interrater variance is thus reflected in the ICC by being incorporated in the
error variance.
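One way to write down the variance components model described above is given here. The notation is an assumption introduced for illustration and does not appear in the original analysis protocol: y_ij is the j-th rating of patient i, β is the fixed site effect (used only for pooled estimates), p_i is the random patient effect and e_ij is the error term, which absorbs any rater variance. Normality of the random terms is assumed, as is usual for maximum likelihood estimation.

```latex
y_{ij} = \mu + \beta_{s(i)} + p_i + e_{ij},
\qquad p_i \sim N(0, \sigma_p^2),
\qquad e_{ij} \sim N(0, \sigma_e^2)
```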
The ratio of the between-patient component of variance to total variance was used to estimate the ICC, and the delta technique (Dunn, 1989) was used to obtain standard errors for the ICC from the variance-covariance matrix for the components. Fisher's Z transformation was applied (Donner & Bull, 1983), and differences between sites were then tested for significance by the method of weighting (Armitage & Berry, 1994), before transforming back to the ICC scale. The standard error of measurement was obtained from the error component of variance. Pooled reliability of the individual items was estimated, without between-site testing. Finally, a paired t-test on the test-retest data was carried out in order to assess systematic changes from time 1 to time 2.
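The calculations in this paragraph can be sketched as follows. This is a simplified illustration under stated assumptions and not the procedure used in the study: the variance components are estimated here by a one-way analysis of variance on complete pairs of ratings instead of by maximum likelihood, the Fisher transformation is the generic inverse hyperbolic tangent (the form in Donner & Bull may differ in detail), and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def icc_and_sem(pairs: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """One-way ANOVA estimates of the ICC and the standard error of measurement."""
    n = len(pairs)
    grand = sum(a + b for a, b in pairs) / (2 * n)
    # Mean squares between and within patients (two ratings per patient).
    ms_between = 2 * sum(((a + b) / 2 - grand) ** 2 for a, b in pairs) / (n - 1)
    ms_within = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    var_patient = (ms_between - ms_within) / 2  # between-patient component
    var_error = ms_within                       # error component (includes rater variance)
    icc = var_patient / (var_patient + var_error)
    return icc, math.sqrt(var_error)


def pool_iccs(estimates: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Pool site ICCs on Fisher's z scale using inverse-variance weights.

    `estimates` holds one (icc, standard_error) pair per site. Returns the pooled
    ICC and a chi-squared homogeneity statistic with (number of sites - 1)
    degrees of freedom.
    """
    zs = [math.atanh(r) for r, _ in estimates]
    # Delta method: se(z) = se(icc) / (1 - icc^2), so weight = 1 / se(z)^2.
    ws = [((1 - r * r) / se) ** 2 for r, se in estimates]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    chi2 = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return math.tanh(z_bar), chi2  # transform back to the ICC scale
```

The homogeneity statistic would be compared with a chi-squared distribution to test for differences between sites. The sketch does not handle an ICC of exactly 1, for which the z transformation is undefined.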
For reasons of comparability, all sites used the same procedure and the
same software for all instruments: SPSS 7.5 or higher, the Amsterdam
α-testing program ALPHA.EXE based on Feldt et al
(1987), and EXCEL for tests of
the homogeneity of ICCs.
Retest interviews were conducted between 1 and 2 weeks after the first interview, although in a few cases up to 7 weeks elapsed, depending on the practicalities of contacting patients. The same rater interviewed at test and retest. For the CAN, patients' responses are rated by an interviewer, and therefore interrater reliability is an issue as well as test-retest reliability. For interrater reliability, a second rater present at the interviews rated in parallel with the primary interviewer, who asked the questions of the patient. The numbers of raters at time 1 and time 2 were as follows: Amsterdam: 4, 4; Copenhagen: 5, 5; London: 2, 5; Santander: 3, 3; Verona: 11, 13.
Answers to the service receipt part of the CAN-EU depend on answers to Part 1 (presence of a need), and therefore interrater reliability for the subsequent sections is hard to define, since the parallel rater has no control over the flow of questions. Furthermore, the service receipt sections are mainly useful in a clinical situation. For these reasons, and in common with reliability testing of the original CAN, these sections have not been analysed here for reliability purposes.
RESULTS
The α coefficients, which reflect correlations between individual CAN items, were moderate to low, as shown in Table 2. For total needs, the pooled α was 0.64 (95% CI 0.58-0.70). Only for met needs (pooled mean 0.48, 95% CI 0.40-0.56) was there strong evidence for differences between sites, with Santander having the lowest value at 0.16. For unmet needs (pooled mean 0.58, 95% CI 0.51-0.64) the differences were less marked, but Copenhagen showed a somewhat higher value than the other sites, at 0.70.
The ICCs between the two time points are given in Table 3, which shows that test-retest reliability is at an acceptable level. There were no significant differences between sites, except for unmet needs. Pooled values were 0.85 (95% CI 0.82-0.88) for total needs, 0.69 (95% CI 0.63-0.74) for met needs and 0.78 (95% CI 0.74-0.82) for unmet needs.
For estimating κ coefficients for the individual items, unmet, met and total needs were each expressed as binary variables in turn. Table 4 shows that κ coefficients for test-retest reliability were high for total needs (0.55-0.84), and moderately high for met needs (0.40-0.76, excluding κ for met needs for drugs, which was zero) and unmet needs (0.34-0.85). Standard errors for these κ estimates were typically 0.06, 0.08 and 0.09 respectively. Only one item had a κ coefficient below 0.4 (unmet needs for physical health).
There was evidence of site differences in interrater reliability (Table 5) for total, met and unmet needs. However, all the sites had coefficients for total and met needs above 0.8. In the case of unmet needs, coefficients for Amsterdam, Santander and Verona were under 0.8, although they were all still over 0.65. The pooled estimates were 0.93 (95% CI 0.92-0.95) for total needs, 0.85 (95% CI 0.81-0.87) for met needs and 0.79 (95% CI 0.75-0.83) for unmet needs.
Interrater reliability for individual items, pooled over sites (Table 6), was very good for total needs (0.75-0.99) and met needs (0.57-0.95), and moderately good for unmet needs (0.41-0.83). Standard errors were typically 0.04, 0.07 and 0.10 respectively. All items had κ coefficients over 0.4.
Paired sample t-tests revealed a tendency for a decrease in the rating of total needs over time, pooled across sites, but this was significant only at a borderline level (P=0.053). At individual sites, there were no significant differences between mean scores at test and retest, with the exception of total needs in Verona, where the time 2 total values were rated lower: 4.39 at time 2 compared with 5.11 at time 1, difference 0.72 (95% CI 0.52-0.92), P=0.001. This is most likely to be a chance finding, given the large number of tests employed.
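The paired comparison of time 1 and time 2 scores can be sketched as below. This is an illustrative calculation only, with a hypothetical function name; it returns the mean difference, the t statistic and the degrees of freedom, and leaves the P value to be read from a t distribution because the Python standard library does not provide one.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(time1: Sequence[float], time2: Sequence[float]) -> Tuple[float, float, int]:
    """Paired t-test on complete pairs: mean difference, t statistic, degrees of freedom."""
    diffs = [a - b for a, b in zip(time1, time2)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return d_bar, d_bar / se, n - 1
```

A positive mean difference here corresponds to lower ratings at time 2 than at time 1.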
DISCUSSION
The α coefficients, although moderate to low, are quite acceptable in this context. Indeed, they are not surprising, given the diverse range of needs assessed with the instrument, which were deliberately selected to cover the entire range of difficulties commonly encountered by people suffering from severe mental illnesses. In this context the α coefficients are not that informative, but have been reported so as to maintain consistency with the other papers in this supplement.
The very low value of α for met needs for Santander is interesting, and may be connected with the lower level and smaller degree of variation in met needs at that particular site, as shown in Table 1. Alternatively it may be connected with one particular item, "help with psychotic symptoms". When this item is removed, α is doubled to 0.32, more in keeping with its value at other sites.
Overall, the test-retest reliability is at least moderately good, although usually lower than interrater reliability. Lack of reliability may, in some cases, have been due to changes in patient status that occurred between the two time points, in addition to lack of consistency in a patient's responses from one time point to the next. However, interviews were generally made within intervals of 1-2 weeks, so real changes in status were unlikely.
Interrater reliability is excellent, with only a slight fall-off for unmet needs. Although there were significant differences between sites, all values of interrater reliability coefficients were over 0.65. The two slightly lower reliabilities for unmet needs (Amsterdam and Verona) are due to higher standard errors of measurement rather than the differences in variances between the samples shown in Table 1. It should be noted that Verona had a larger pool of primary raters and, in this respect, the data from that site may more realistically reflect the range of raters who might use the instrument in practice. The very small standard error of measurement in London might reflect a longer history of CAN training and use.
For individual items, both for test-retest and interrater reliability, the items with the lowest κ values tend to be those where there are low base rates for the need: for example, drugs. Two items showed both low κ and low percentage agreement in the test-retest comparison: psychological distress (item 9) and company (item 14). These two items do not have low base rates, and there seems to be no obvious pattern in the inconsistent responses over time. It may be that these two items are hard to rate because they are very much related to mood and reflect relatively transient situations.
A point which applies generally, both over time and also between raters, is that there are greater levels of agreement for total needs than for the component items. However, the very skewed nature of the data relating to individual items (i.e. the low base-rates in many cases) makes reliability tests problematic. Indeed it reduces the feasibility of analysing these variables individually, except in very large samples.
Mean scores did not differ significantly between test and retest, except for one score in one site. The pooled ratings for total needs did decrease slightly over time (at a borderline level of significance) but in general there is little evidence for substantial increase or decrease over time, a problem which might occur if patients tended to reflect on and modify their ideas following an interview. In these respects the CAN can be seen to be stable over time.
This analysis has concentrated on the three total needs scores, rather than individual items. This is because the 22 CAN items, while clearly important in considering the needs of individual patients, are of limited use for analytical purposes when treated in isolation, since most of them are encountered infrequently in individual cases. Similarly, the sections of the CAN relating to levels of formal and informal care received, and formal care required, are most relevant for clinical rather than research purposes. With large samples, the data on particular needs and on care required or received could be analysed, but such samples have hitherto been scarce.
Bearing in mind these caveats, we suggest that the summary scores for the CAN-EU (total, met and unmet needs) are generally reliable over time and between raters. Despite some evidence for differences in levels of reliability between sites for unmet needs at test-retest, and between raters for all three total scores, the results are good at each site, and encouraging for the use of this instrument in its five translations.
ACKNOWLEDGMENTS
This study was supported by the European Commission BIOMED-2 Programme (Contract BMH4-CT95-1151). We would also like to acknowledge the sustained and valuable assistance of the users, carers and the clinical staff of the services in the five study sites. In Amsterdam, the EPSILON Study was partly supported by a grant from the National Fonds Geestelijke Volksgezondheid and a grant from the Netherlands Organization for Scientific Research (940-32-007). In Santander the EPSILON Study was partially supported by the Spanish Institute of Health (FIS) (FIS Exp. No. 97/1240). In Verona additional funding for studying patterns of care and costs of a cohort of patients with schizophrenia was provided by the Regione del Veneto, Giunta Regionale, Ricerca Sanitaria Finalizzata, Venezia, Italia (Grant No. 723/01/96 to Professor M. Tansella).
REFERENCES
Armitage, P. & Berry, G. (1994) Statistical Methods in Medical Research (3rd edn). Oxford: Blackwell Scientific.
Becker, T., Knapp, M., Knudsen, H. C., et al (1999) The EPSILON study of schizophrenia in five European countries: design and methodology for standardising outcome measures and comparing patterns of care and service costs. British Journal of Psychiatry, 175, 514-521.
Becker, T., Knapp, M., Knudsen, H. C., et al (2000) Aims, outcome measures, study sites and patient sample. EPSILON Study 1. British Journal of Psychiatry, 177 (suppl. 39), s1-s7.
Beecham, J. & Knapp, M. (1992) Costing psychiatric interventions. In Measuring Mental Health Needs (eds G. Thornicroft, C. R. Brewin & J. Wing). London: Gaskell.
Chisholm, D., Knapp, M. R. J., Knudsen, H. C., et al (2000) The Client Socio-Demographic and Service Receipt Inventory: development of an instrument for international research. EPSILON Study 5. British Journal of Psychiatry, 177 (suppl. 39), s28-s33.
Donner, A. & Bull, S. (1983) Inferences concerning a common intraclass correlation coefficient. Biometrics, 39, 771-775.
Dowrick, C., Casey, P., Dalgard, O., et al (1998) Outcomes of Depression International Network (ODIN). Background, methods and field trials. British Journal of Psychiatry, 171, 359-363.
Dunn, G. (1989) Design and Analysis of Reliability Studies. London: Edward Arnold.
Feldt, L. S., Woodruff, D. J. & Salih, F. A. (1987) Statistical inference for coefficient alpha. Applied Psychological Measurement, 11, 93-103.
Gaite, L., Vázquez-Barquero, J. L., Arriaga Arrizabalaga, A., et al (2000) Quality of life in schizophrenia: development, reliability and internal consistency of the Lancashire Quality of Life Profile - European Version. EPSILON Study 8. British Journal of Psychiatry, 177 (suppl. 39), s49-s54.
Johnson, S., Salvador-Carulla, L. & the EPCAT group (1998) Description and classification of mental health services: a European perspective. European Psychiatry, 13, 333-341.
Knudsen, H. C., Vázquez-Barquero, J. L., Welcher, B., et al (2000) Translation and cross-cultural adaptation of outcome measurements for schizophrenia. EPSILON Study 2. British Journal of Psychiatry, 177 (suppl. 39), s8-s14.
Levene, H. (1960) Tests for equality of variances. In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling (eds I. Olkin, S. G. Ghurye, W. Hoeffding, et al), pp. 278-292. Stanford, CA: Stanford University Press.
Oliver, J. (1991) The social care directive: development of a quality of life profile for use in the community services for the mentally ill. Social Work and Social Sciences Review, 3, 5-45.
Overall, J. & Gorham, D. (1962) Brief Psychiatric Rating Scale. Psychological Reports, 10, 799-812.
Phelan, M., Slade, M., Thornicroft, G., et al (1995) The Camberwell Assessment of Need: the validity and reliability of an instrument to assess the needs of people with severe mental illness. British Journal of Psychiatry, 167, 589-595.
Ruggeri, M. & Dall'Agnola, R. (1993) The development and use of the Verona Expectations for Care Scale (VECS) and the Verona Service Satisfaction Scale (VSSS) for measuring expectations and satisfaction with community-based psychiatric services in patients, relatives and professionals. Psychological Medicine, 23, 511-523.
Ruggeri, M., Lasalvia, A., Dall'Agnola, R., et al (2000) Development, internal consistency and reliability of the Verona Service Satisfaction Scale - European Version. EPSILON Study 7. British Journal of Psychiatry, 177 (suppl. 39), s41-s48.
Schene, A. H., van Wijngaarden, B. & Koeter, M. W. J. (1998) Family caregiving in schizophrenia: domains and distress. Schizophrenia Bulletin, 24, 609-618.
Schene, A. H., Koeter, M., van Wijngaarden, B., et al (2000) Methodology of a multi-site reliability study. EPSILON Study 3. British Journal of Psychiatry, 177 (suppl. 39), s15-s20.
Streiner, D. & Norman, G. (1995) Health Measurement Scales: A Practical Guide to their Development and Use. Oxford: Oxford University Press.
van Wijngaarden, B., Schene, A. H., Koeter, M., et al (2000) Caregiving in schizophrenia: development, internal consistency and reliability of the Involvement Evaluation Questionnaire - European Version. EPSILON Study 4. British Journal of Psychiatry, 177 (suppl. 39), s21-s27.
World Health Organization (1992) Schedules for Clinical Assessment in Neuropsychiatry (ed.-in-chief J. K. Wing). Geneva: WHO.