The British Journal of Psychiatry (2000) 177: 89
© 2000 The Royal College of Psychiatrists
Hospital Anxiety and Depression Scale for use with adolescents
D Marchevsky
Department of Psychiatry, Campbell Centre, Standing Way, Eaglestone,
Milton Keynes MK6 5NG
I found it difficult to answer important questions about how well the data
support some of the conclusions that White et al (1999) draw in their
validation of the Hospital Anxiety and Depression Scale for use with
adolescents.
- It is not clear whether the non-clinical sample used in the study can be
considered independent since the method of selection is not mentioned.
- Of the 248 children (110 girls) who were tested first, 180 were re-tested.
However, the girl/boy ratio in the latter group is not indicated. Moreover,
the out-patient sample comprised 48 subjects (27 girls) and the deliberate
self-harm (DSH) sample had 38 subjects (30 girls). Considering the
disproportionate group sizes and gender distribution, it would be surprising
if the variances in the different groups were homogeneous, but this cannot be
determined from the published data. As a result, it is very difficult to
assess whether the fundamental requirements for the F-test are met.
- Girls are 44% of the non-clinical sample, 56% of the out-patient sample
and, more importantly, 79% of the DSH group. The authors conclude that there
is a statistically significant gender difference, with girls scoring higher
than boys in both depression and anxiety scales. Assuming that the
F-test's requirements are met, it is not surprising that the test
detects an overall significant difference, given the characteristics of the
DSH group.
- As the authors do not report multiple comparisons between the groups, it is
not possible to know whether the differences remain when the DSH group is
excluded.
- The analysis does not include techniques to control for gender, which
appears to be a very important confounder.
- The authors assessed test-retest reliability with Pearson's correlation
coefficient. As this technique does not take into account errors of
measurement, it does not measure agreement and its results are not
meaningful.
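
The gender proportions quoted above can be checked directly from the reported counts. The following is a minimal sketch in Python; the counts are those given in this letter, while the names `samples` and `percent_girls` are illustrative only.

```python
# Counts as reported in the letter: (girls, total) for each sample.
samples = {
    "non-clinical": (110, 248),
    "out-patient": (27, 48),
    "deliberate self-harm": (30, 38),
}


def percent_girls(girls, total):
    """Return the share of girls as a whole-number percentage."""
    return round(100 * girls / total)


# Percentage of girls in each sample, rounded to the nearest whole number.
shares = {name: percent_girls(g, n) for name, (g, n) in samples.items()}

for name, share in shares.items():
    print(f"{name}: {share}% girls")
```

The share of girls rises from the non-clinical sample to the DSH sample, which is the imbalance the points above refer to.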
REFERENCES
- White, D., Leach, C., Sims, R., et al (1999) Validation of the Hospital
Anxiety and Depression Scale for use with adolescents. British Journal of
Psychiatry, 175, 452-454.
Authors' reply
C. Leach, D. White and R. Sims
QED Department, Wakefield & Pontefract Community Health NHS Trust,
Fieldhead, Ouchthorpe Lane, Wakefield WF1 3SP
D. Cottrell
School of Medicine, University of Leeds, 12a Clarendon Road, Leeds LS2 9NN
We welcome the opportunity to clarify the points raised by D.
Marchevsky.
- The non-clinical sample was selected by asking the head teachers of each
school to choose a selection of mixed-ability classes from each of the school
year groups that fitted the age range we had selected.
- Of the 180 adolescents re-tested, 77 (43%) were girls, almost identical to
the ratio of the first test sample (44%). The variances in the different
groups are indeed heterogeneous, but the results of the analysis hold when the
analyses are corrected for this effect or when non-parametric analyses are
carried out. Limited space precluded us from reporting the full analyses.
- The results remain the same whether or not the deliberate self-harm group
is included in the analyses.
- Robust multiple comparisons show that, for the depression sub-scale, the
out-patients depressed group scores higher than the other three groups, with
the other two clinical groups not differing significantly from each other, but
both scoring significantly higher than the school sample. For the anxiety
sub-scale, the three clinical groups do not differ significantly from each
other, but all score significantly higher than the school sample.
- Analyses for each gender separately produce the same results.
- We see no problems with using the Pearson product-moment correlation as a
measure of test-retest reliability. There is a long history of using this
correlation as a measure of reliability in the psychometric test theory
literature. Note that we are not measuring agreement between raters here, for
which a measure such as kappa would be appropriate.
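
The difference between the two views of Pearson's coefficient can be illustrated with a minimal sketch in Python. The scores below are invented for illustration and are not data from the study; the function names are likewise hypothetical. The sketch assumes every retest score is exactly 3 points higher than the first score, an extreme case chosen only to separate the two quantities.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_difference(x, y):
    """Average of (retest - first) across subjects."""
    return sum(b - a for a, b in zip(x, y)) / len(x)


# Hypothetical scores: each retest score is the first score plus 3.
first = [4, 7, 9, 12, 15]
retest = [score + 3 for score in first]

r = pearson(first, retest)
shift = mean_difference(first, retest)

print(f"Pearson r = {r:.3f}, mean retest - first difference = {shift:.1f}")
```

In this constructed case the correlation is essentially perfect because the ordering and spacing of subjects is preserved, while the mean difference shows that no individual score was reproduced. Whether that distinction matters depends on whether reliability is taken to mean consistency of relative standing, as in the psychometric tradition cited above, or agreement of the actual scores.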