
Interpretation of screening implementation studies

Published online by Cambridge University Press:  02 January 2018

Alex J. Mitchell*
Affiliation:
Leicester General Hospital, Leicester LE5 4PW, UK. Email: alex.mitchell@leicspart.nhs.uk

Copyright © Royal College of Psychiatrists, 2009 

Baas et al.1 report some very valuable findings based on a screening implementation study in Dutch general practice. In particular, they document that converting detections into treatment success is difficult in clinical practice and that many individuals with depression are unable or unwilling to accept help. However, I must disagree with their interpretation that it is necessary to screen 118 (17 of 2005) ‘high-risk’ people to treat one new case.

Let me illustrate this with an analogy of a drug trial for drug X. Let's say that I conduct a trial of drug X in primary care among 2005 individuals. Of the 2005 approached, 780 consent to take X and, of these, 226 have an initial response. The main question I would be asked is how many of the 780 actually had depression. I don't have this figure, but I can say that of the 226 responders, 173 were given a Structured Clinical Interview for DSM–IV Axis I disorders (SCID) and, of these, 71 had depression. Further, unknown to me, 36 of the 71 were already receiving treatment (even though the protocol asked general practitioners to exclude people with depression already known to them) and ultimately only 17 accepted treatment. Can I conclude from my trial of X that it is not a successful drug because only 17 were newly treated? No. I have demonstrated the difficulty of conducting a pragmatic trial in primary care, but I don't really know the success of X and I have no comparative placebo (treatment-as-usual) arm. What does this mean for the interpretation of the paper from Baas et al? From the authors' data, the most critical step for useful interpretation of screening yield comes from those who received both (a) the screen and (b) the criterion reference (gold standard, i.e. the SCID). Thus I suggest that (a short worked calculation follows the list below):

  (a) the number of detected cases per screen (who had a criterion diagnosis) = 71/173 (41%);

  (b) the number of newly treated cases per screen (who had a criterion diagnosis) = 35/173 (20%);

  (c) the number of helped cases per screen (who had a criterion diagnosis) = 17/173 (10%).
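A minimal sketch (mine, not the authors') that recomputes these three ratios from the figures quoted above; the variable names are illustrative only:

```python
# Illustrative recalculation of the screening-yield ratios (a)-(c) above,
# using the counts quoted in this letter; variable names are my own.
screened_with_scid = 173   # had both the PHQ-9 screen and a SCID interview
scid_confirmed = 71        # met SCID criteria for depression
already_treated = 36       # already receiving treatment despite the exclusion protocol
accepted_treatment = 17    # ultimately accepted treatment

detected = scid_confirmed / screened_with_scid                           # (a) ~41%
newly_treated = (scid_confirmed - already_treated) / screened_with_scid  # (b) 35/173, ~20%
helped = accepted_treatment / screened_with_scid                         # (c) ~10%

print(f"(a) detected {detected:.0%}, (b) newly treated {newly_treated:.0%}, "
      f"(c) helped {helped:.0%}")
```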

There may be many more people with depression (with high or low Patient Health Questionnaire (PHQ) scores) who were unidentified, because a SCID was not applied to the 780 and the screen relies wholly on a single application of the PHQ–9. At a typical (medium-risk) prevalence of 20% there would be around 156 cases of depression in the group of 780, but in a high-risk group, where the prevalence may be 35%, this would mean around 273 true cases. Relying on a single application of the PHQ–9 on its own (either by algorithm or recommended cut-off) is probably insufficient. Assuming (like the authors) that the sensitivity of the PHQ–9 is a generous 0.88,2 there might be 33 missed cases in a high-risk sample. However, a meta-analysis from Wittkampf et al3 found a pooled sensitivity of 0.77 and Gilbody et al4 found 0.81 (both in primary care), which would translate into 52–63 missed cases. Of course, this is offset in part by the issue of false positives, which should also be examined in a screening implementation study. However, this remains speculation without SCID data from the parent sample of 780, which are not reported (but are perhaps available to the authors?).
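To make the arithmetic behind these missed-case estimates explicit, here is a brief sketch using the assumed prevalences (20% and 35%) and the published sensitivities cited above; the figures are rounded, as in the text:

```python
# Expected true and missed cases among the 780 who consented, under a single
# PHQ-9 application, for assumed prevalences and published sensitivities.
consented = 780

for prevalence in (0.20, 0.35):               # medium-risk vs high-risk scenario
    true_cases = prevalence * consented        # ~156 or ~273 true cases
    for sensitivity in (0.88, 0.81, 0.77):     # Kroenke et al; Gilbody et al; Wittkampf et al
        missed = true_cases * (1 - sensitivity)
        print(f"prevalence {prevalence:.0%}: ~{true_cases:.0f} true cases; "
              f"sensitivity {sensitivity:.2f} -> ~{missed:.0f} missed")
```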

In summary, I suggest this is a genuinely useful paper about the hazards of screening implementation, but it is not really about screening success, for which a screening randomised controlled trial or pre–post screen design is needed. A simple guide to the interpretation of screening studies can be downloaded from www.psycho-oncology.info/education.htm.

References

1 Baas, KD, Wittkampf, KA, van Weert, HC, Lucassen, P, Huyser, J, van den Hoogen, H, et al. Screening for depression in high-risk groups: prospective cohort study in general practice. Br J Psychiatry 2009; 194: 399–403.

2 Kroenke, K, Spitzer, RL, Williams, JB. The PHQ–9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16: 606–13.

3 Wittkampf, KA, Naeije, L, Schene, AH, Huyser, J, van Weert, HC. Diagnostic accuracy of the mood module of the Patient Health Questionnaire: a systematic review. Gen Hosp Psychiatry 2007; 29: 388–95.

4 Gilbody, S, Richards, D, Brealey, S, Hewitt, C. Screening for depression in medical settings with the Patient Health Questionnaire (PHQ): a diagnostic meta-analysis. J Gen Intern Med 2007; 22: 1596–602.