Amitriptyline v. the rest: still the leading antidepressant after 40 years of randomised controlled trials

CORRADO BARBUI, MATTHEW HOTOPF

Abstract

Background Tricyclic antidepressants have similar efficacy and slightly lower tolerability than selective serotonin reuptake inhibitors (SSRIs). However, there are no systematic reviews assessing amitriptyline, the reference tricyclic drug, v. other tricyclics and SSRIs directly.

Aims To review the tolerability and efficacy of amitriptyline in the management of depression.

Method A systematic review of randomised controlled trials (RCTs) comparing amitriptyline with other tricyclics/heterocyclics or with an SSRI.

Results We reviewed 186 RCTs. The overall estimate of the efficacy of amitriptyline revealed a standardised mean difference of 0.147 (95% CI 0.05-0.243), significantly favouring amitriptyline. The overall OR for dropping out was 0.99 (95% CI 0.91-1.08) and that for side-effects was 0.62 (95% CI 0.54-0.70), favouring the control drugs. With drop-outs included as treatment failures, the estimate of the effectiveness of amitriptyline v. tricyclics/heterocyclics and SSRIs showed a 2.5% difference in the proportion of responders in favour of amitriptyline (number needed to treat 40, CI 21-694; OR 1.12 (95% CI 1.01-1.24)).

Conclusions Amitriptyline is less well tolerated than tricyclics/heterocyclics and SSRIs, but slightly more patients treated with it recover than with alternative antidepressants.

Amitriptyline is one of the first ‘reference’ tricyclic antidepressants (TCAs). Over the past 40 years a number of newer tricyclics, heterocyclics and selective serotonin reuptake inhibitors (SSRIs) have been introduced (Garattini et al, 1998). Despite several large systematic reviews comparing tricyclics and SSRIs there is no clear agreement over first-line treatment of depression (Song et al, 1993; Anderson & Tomenson, 1995; Montgomery & Kasper, 1995; Hotopf et al, 1996; Canadian Coordinating Office for Health Technology Assessment, 1997a). Grouped as a whole, tricyclics appear to have similar efficacy to SSRIs, but are slightly less well tolerated. If tolerability is measured according to the numbers of drop-outs occurring in randomised controlled trials (RCTs), the number needed to treat (NNT) with SSRIs to prevent one tricyclic-related drop-out is estimated at 33 (Anderson & Tomenson, 1995). This modest advantage has to be set against the increased cost of SSRIs (Canadian Coordinating Office for Health Technology Assessment, 1997b). A meta-analysis which subdivided TCAs according to whether they were reference compounds (e.g. the oldest TCAs, amitriptyline and imipramine) or newer tricyclics or heterocyclics suggested that the higher drop-out rates associated with tricyclics could be attributed to the effect of amitriptyline and imipramine; newer tricyclics and heterocyclics were no worse than the SSRIs (Hotopf et al, 1997). However, there have not been any systematic reviews assessing amitriptyline v. other tricyclics and heterocyclics directly. We therefore aimed to test the hypothesis that amitriptyline would be less well tolerated than other tricyclics and SSRIs, and also to assess its effectiveness compared with the alternatives.

METHOD

Inclusion criteria

All RCTs comparing amitriptyline with any other tricyclic, heterocyclic or SSRI were included. Crossover studies were excluded. Studies adopting any criteria to define patients suffering from depression were included; a concurrent diagnosis of another psychiatric disorder was not considered an exclusion criterion. Trials in patients with depression with a concomitant medical illness were not included in this review.

Search strategy

Relevant studies were located by searching the Cochrane Collaboration Depression, Anxiety and Neurosis Controlled Trials Register (CCDANCTR). This specialised register is regularly updated by electronic (Medline, Embase, PsycINFO, LILACS, Psyndex, CINAHL, SIGLE) and non-electronic literature searches. The register was searched using the following terms: AMITRIPTYLIN* or AMITRIL or ELATROL or ELAVIL or EMITRIP or ENDEP or ENOVIL or LAROXYL or LENTIZOL or LEVATE or MEVARIL or NOVOTRIPTYN or SAROTEN or TRYPTAL or TRYPTIZOL or TRIPTAFEN*. A specific electronic search was also performed with Medline and Embase from 1966 to 1998. We used the search term: AMITRIPTYLINE and RANDOMISED CONTROLLED TRIAL or RANDOM ALLOCATION or DOUBLE-BLIND METHOD. Reference lists of relevant papers and previous systematic reviews were hand searched for published reports and citations of unpublished research. Finally, attempts were made to obtain data through direct contact with the pharmaceutical industry.

Outcomes

Efficacy was evaluated using the following outcome measures:

  1. Number of patients who responded to treatment out of the total number of randomised patients.

  2. Group mean scores at the end of the trial on Hamilton Depression Rating Scale (HDRS; Hamilton, 1960), or Montgomery and Åsberg Depression Scale (MADRS; Montgomery & Åsberg, 1979), or any other depression scale.

Tolerability was evaluated using the following outcome measures:

  1. Number of patients failing to complete the study as a proportion of the total number of randomised patients.

  2. Number of patients complaining of side-effects out of the total number of randomised patients.

Data extraction

Using a standard form, two reviewers independently extracted information on the year of publication, concealment of allocation, blindness, length of treatment, inclusion criteria, age range, country and setting of the study and type of pharmacological intervention. The number of patients undergoing the randomisation procedure, the number of patients who failed to complete the study (drop-outs) and that of patients complaining of side-effects were recorded. For dichotomous outcomes the number of patients showing a 50% reduction in score on the HDRS or MADRS scale was extracted; if these figures were not available, we extracted the number of patients categorised as ‘much improved’ and ‘improved’ on the Clinical Global Impression scale (CGI; Guy, 1976), or the number of patients in the corresponding categories of any other rating scale if the CGI was not used. For continuous outcomes the mean scores at end-point on the HDRS and the number of patients included in this analysis were recorded. If the HDRS was not employed, we extracted the mean scores at end-point on the MADRS or on any other rating scale. Mean scores were recorded with the standard deviation (s.d.) or standard error (s.e.) of these values. When only the s.e. was reported, it was converted into s.d. according to Altman & Bland (1996).
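The s.e.-to-s.d. conversion described above can be sketched as follows. This is a minimal illustration that assumes the usual relationship s.d. = s.e. × √n for a group of n patients; the function name and the example values are hypothetical and are not taken from any included trial.

```python
import math


def se_to_sd(se: float, n: int) -> float:
    """Convert a standard error of the mean to a standard deviation.

    Assumes the s.e. was reported for a single treatment group of n patients,
    so that s.d. = s.e. * sqrt(n).
    """
    if n <= 0:
        raise ValueError("n must be a positive number of patients")
    return se * math.sqrt(n)


# Hypothetical trial arm: end-point HDRS mean reported with s.e. 1.5 in 25 patients.
print(se_to_sd(1.5, 25))
```

The conversion applies per treatment arm, so a trial reporting s.e. values for both arms would need one call per arm with that arm's own sample size.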

Statistical analysis

Efficacy data were analysed in the following way. Responders to treatment were calculated on an intention-to-treat (ITT) basis: drop-outs were always included in this analysis. When data on drop-outs were carried forward and included in the efficacy evaluation (last observation carried forward, LOCF), they were analysed according to the primary studies; when drop-outs were excluded from any assessment in the primary studies they were considered as ‘drug failures’. Scores from continuous outcome scales could not be analysed on an ITT basis. This approach was not feasible as most studies performed only an end-point or LOCF analysis, which inevitably excluded most drop-out patients. Therefore, scores from continuous outcomes were analysed on an end-point basis, including only patients with a final assessment or with an LOCF to the final assessment. Tolerability data were analysed by calculating the proportion of patients who failed to complete the study and who experienced adverse reactions out of the total number of randomised patients. For each outcome measure three separate meta-analyses were planned. The first compared amitriptyline with tricyclic/heterocyclic antidepressants, the second amitriptyline with SSRIs and the third analysis summarised the overall comparison of amitriptyline with both tricyclic/heterocyclic drugs and SSRIs.

Dichotomous outcomes were summarised by calculating a Peto-weighted odds ratio for each study, together with the 95% CI. An overall odds ratio was then calculated as a summary measure. The number of patients who need to be treated (NNT) with amitriptyline rather than the control antidepressants for one additional patient to benefit (NNTB) or be harmed (NNTH) was calculated with the 95% CI (Altman, 1998). Heterogeneity of treatment effects between studies was tested using the χ² statistic. Continuous outcomes were analysed by calculating a standardised weighted mean difference (SMD) for each study. This measure gives the effect size of an intervention in units of standard deviation so that scores from different outcome scales can be combined into an overall estimate of effect. A random effects model, which takes into consideration any between-study variation, was adopted to combine the effect sizes. Calculations were performed using the RevMan software provided by the Cochrane Collaboration (Review Manager, 1999).
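As an illustration of how a Peto-weighted odds ratio is pooled across studies, the sketch below implements the standard one-step Peto method (observed minus expected events, with hypergeometric variance). It is not the RevMan implementation used in the review; the function name and the input data are hypothetical.

```python
import math


def peto_pooled_or(studies, z=1.96):
    """Pool 2x2 tables with the one-step Peto method.

    `studies` is an iterable of (events_treated, n_treated,
    events_control, n_control) tuples. Returns (odds ratio, lower, upper)
    for the confidence level implied by z (1.96 gives roughly 95%).
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for events_t, n_t, events_c, n_c in studies:
        n = n_t + n_c
        events = events_t + events_c
        # Expected events in the treated arm if treatment had no effect.
        expected = n_t * events / n
        # Hypergeometric variance of the observed count.
        variance = n_t * n_c * events * (n - events) / (n ** 2 * (n - 1))
        sum_o_minus_e += events_t - expected
        sum_v += variance
    log_or = sum_o_minus_e / sum_v
    se = 1 / math.sqrt(sum_v)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical responder counts: (responders on drug A, n, responders on drug B, n).
example = [(15, 20, 10, 20), (30, 50, 28, 50)]
odds_ratio, lower, upper = peto_pooled_or(example)
print(f"Peto OR {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With responders as the event, an odds ratio above 1 favours the first-listed drug; with drop-outs or side-effects as the event, the interpretation reverses.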

RESULTS

Characteristics of included studies

We identified 352 potentially relevant studies: 186 RCTs met the inclusion criteria and were considered in this review (see Appendix), while 166 studies were excluded for the reasons listed in Table 1. Of the 186 included studies, 146 compared amitriptyline with another TCA or heterocyclic antidepressant and 40 compared amitriptyline with one of the SSRIs. In six studies amitriptyline was administered in combination with perphenazine; in two of these studies the experimental drug was nortriptyline in combination with fluphenazine. One trial compared amitriptyline with nortriptyline plus fluphenazine.

Table 1

Studies identified by the electronic search but excluded from the meta-analysis, and reason for exclusion

Although all trials reported that patients had been randomly allocated, in six cases the concealment of allocation was inadequate with some bias possible. In four studies only physicians, but not patients, were blind to treatments, in nine cases neither physicians nor patients were blind, while the other 173 studies were double-blind. The median sample size was 50 patients (10th percentile 24, 25th percentile 40, 75th percentile 80, 90th percentile 153; range 10-531). The median length of trials was four weeks (25th percentile 4, 50th percentile 4, 75th percentile 6; range 3-12); the number of studies with more than four weeks of follow-up increased from 28 (30%) to 62 (67%) after 1980. In 67 trials (36%) authors adopted diagnostic criteria and a specification of severity of depression to enrol patients; in 55 trials (30%) authors adopted only a specification of severity, while in the remaining 34% of studies patients were enrolled on the basis of physicians' implicit criteria to define patients with depression or because they were judged to require antidepressant therapy. Fifty-nine per cent of studies published before 1980 used implicit criteria v. 9.6% of those published after this date. Overall, 108 trials (60%) used operational criteria for depression. Nearly half of the studies (47%) provided a comprehensive description of patients' side-effects, while 23 (12%) trials gave inadequate details. The outcome assessment was performed with valid and reliable instruments in 70% of the sample; the use of valid instruments in studies published before and after 1980 increased from 51 (55%) to 81 (86%).

Efficacy of amitriptyline

Data extracted from 82 RCTs showed that the proportion of patients who responded to amitriptyline was 2.4% higher than for control TCA/heterocyclic antidepressants (NNTB 42, 95% CI NNTH 357 to ∞ to NNTB 20) (see Table 2). This difference corresponds to an overall odds ratio which favoured amitriptyline (Peto odds ratio 1.11, 95% CI 0.99-1.25), but with only borderline statistical significance. The estimate of the efficacy of amitriptyline and control TCAs/heterocyclic antidepressants on a continuous outcome, performed on 699 and 661 patients respectively, revealed an effect size which also significantly favoured amitriptyline (SMD=0.177, 95% CI 0.005-0.350). Head-to-head comparisons indicated that amitriptyline, in comparison with imipramine, is associated with a greater proportion of responders; in comparison with dothiepin, however, the proportion of responders was significantly lower (see Table 2).
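The NNTB/NNTH notation used for these confidence intervals can be sketched as follows. This is a minimal illustration assuming the NNT is the reciprocal of the absolute risk difference and that a confidence interval spanning zero is written as passing through infinity, as in the intervals quoted above. The function names are hypothetical, results are rounded to the nearest integer (conventions differ), and the confidence limits in the example are approximate values back-calculated from the reported NNT limits rather than figures taken from the trial data.

```python
def nnt_label(risk_difference: float) -> str:
    """Express a risk difference (positive = benefit) as NNTB or NNTH."""
    if risk_difference == 0:
        return "∞"
    kind = "NNTB" if risk_difference > 0 else "NNTH"
    return f"{kind} {round(1 / abs(risk_difference))}"


def nnt_with_ci(rd: float, rd_low: float, rd_high: float) -> str:
    """Format an NNT and its interval from a risk difference and its CI limits."""
    if rd_low < 0 < rd_high:
        # The interval for the risk difference includes zero, so the NNT
        # interval runs from one direction, through infinity, to the other.
        first, last = (rd_low, rd_high) if rd >= 0 else (rd_high, rd_low)
        interval = f"{nnt_label(first)} to ∞ to {nnt_label(last)}"
    else:
        # Both limits point the same way: list the smaller NNT first.
        strong, weak = sorted((rd_low, rd_high), key=abs, reverse=True)
        interval = f"{nnt_label(strong)} to {nnt_label(weak)}"
    return f"{nnt_label(rd)} (95% CI {interval})"


# 2.4% more responders; limits are illustrative back-calculations (assumption).
print(nnt_with_ci(0.024, -0.0028, 0.05))
```

An interval that includes infinity signals that the data are compatible with no difference between treatments, which matches the borderline significance of the corresponding odds ratio.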

Table 2

Amitriptyline (AMI) in comparison with tricyclic (TCA) or heterocyclic antidepressants: proportion of responders, number of patients evaluated on a continuous outcome and estimates of efficacy

Data from 17 RCTs showed that the proportion of patients who responded to amitriptyline was 2.8% higher than for SSRIs (NNTB 35, 95% CI NNTH 53 to ∞ to NNTB 13) (see Table 3). This difference corresponded to an overall odds ratio which favoured amitriptyline (Peto odds ratio 1.14, 95% CI 0.92-1.38), but not significantly. The estimate of the efficacy of amitriptyline and SSRIs on a continuous outcome, performed on 1041 and 1061 patients, respectively, revealed a small effect size which significantly favoured amitriptyline (SMD=0.106, 95% CI 0.02-0.19). No significant differences emerged from direct comparisons between amitriptyline and one of the SSRIs (see Table 3).

Table 3

Amitriptyline (AMI) in comparison with selective serotonin reuptake inhibitors (SSRIs): proportion of responders, number of patients evaluated on a continuous outcome and estimates of efficacy

Tolerability of amitriptyline

Data from 125 RCTs showed that 20% of patients treated with amitriptyline failed to complete the study, in comparison with 21.5% of patients who received another tricyclic/heterocyclic antidepressant (NNTB=69, 95% CI NNTH 385 to ∞ to NNTB 32). This difference corresponded to an overall odds ratio non-significantly favouring amitriptyline (Peto odds ratio 1.09, 95% CI 0.98-1.22) (see Table 4). However, the estimate of the proportion of patients who experienced side-effects during the study was 13% higher for amitriptyline than for control TCAs/heterocyclic antidepressants (NNTH=7.6, 95% CI NNTH 6 to NNTH 11) (see Table 4), corresponding to an odds ratio which significantly favoured the control TCAs/heterocyclic antidepressants. Head-to-head comparisons failed to detect statistically significant differences in terms of drop-outs between amitriptyline and one of the TCA/heterocyclic antidepressants (see Table 4). However, amitriptyline was associated with more side-effects than dothiepin, maprotiline, mianserin, minaprine and nortriptyline (see Table 4).

Table 4

Amitriptyline (AMI) in comparison with tricyclic/heterocyclic antidepressants (TCAs): proportion of drop-outs, proportion of patients with side-effects and estimates of tolerability

Data from 40 RCTs comparing amitriptyline and SSRIs showed that 29.8% of patients treated with amitriptyline failed to complete the study, in comparison with 27.7% of patients treated with SSRIs (NNTH=49, 95% CI NNTB 180 to ∞ to NNTH 22). This difference corresponds to an overall odds ratio of 0.86 (95% CI 0.75-0.98), which significantly favoured SSRIs (see Table 5). The estimate of the proportion of patients who experienced side-effects during the study was 11.6% higher for amitriptyline than for SSRIs (NNTH=8.6, 95% CI NNTH 6 to NNTH 15) (see Table 5), corresponding to an odds ratio which significantly favoured the SSRIs.

Table 5

Amitriptyline (AMI) in comparison with selective serotonin reuptake inhibitors (SSRIs): proportion of drop-outs, proportion of patients with side-effects and estimates of tolerability

Overall efficacy and tolerability of amitriptyline in comparison with all antidepressant drugs

A funnel plot (Fig. 1) showed no evidence of publication bias being a problem in the data collected. The overall estimate of the efficacy of amitriptyline in comparison to TCAs/heterocyclic drugs and SSRIs showed a 2.5% difference in the proportion of responders in favour of amitriptyline (NNTB=40, 95% CI NNTB 21 to NNTB 694) (see Fig. 2), which corresponded to an intention-to-treat odds ratio of 1.12 (95% CI 1.01-1.24). The estimate of the efficacy of amitriptyline and control antidepressants on a continuous outcome confirmed the slightly superior efficacy profile of amitriptyline: the estimate of the SMD significantly favours amitriptyline (see Fig. 2).
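For readers unfamiliar with how an overall SMD of this kind is obtained, the sketch below computes a standardised mean difference per study and combines the studies with a random effects model. It rests on several assumptions not stated in the text: a small-sample (Hedges) correction, a common large-sample approximation for the variance, the DerSimonian-Laird estimator for between-study variance, and a sign convention in which positive values favour amitriptyline (lower end-point depression scores). All function names and input values are hypothetical.

```python
import math


def hedges_g(mean_control, mean_ami, sd_control, sd_ami, n_control, n_ami):
    """Standardised mean difference with small-sample correction.

    Positive values favour amitriptyline when lower scores are better
    (sign convention assumed for this sketch).
    """
    df = n_control + n_ami - 2
    pooled_sd = math.sqrt(
        ((n_control - 1) * sd_control ** 2 + (n_ami - 1) * sd_ami ** 2) / df
    )
    d = (mean_control - mean_ami) / pooled_sd
    g = d * (1 - 3 / (4 * df - 1))
    n = n_control + n_ami
    # Large-sample approximation to the variance of g.
    variance = n / (n_control * n_ami) + g ** 2 / (2 * n)
    return g, variance


def random_effects_pool(effects, variances, z=1.96):
    """DerSimonian-Laird random effects pooling of study effect sizes."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Between-study variance is added to each study's own variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se = 1 / math.sqrt(sum(w_star))
    return pooled, pooled - z * se, pooled + z * se, tau2


# Two hypothetical trials reporting end-point HDRS means, s.d. and group sizes.
g1, v1 = hedges_g(12.0, 10.0, 5.0, 5.0, 50, 50)
g2, v2 = hedges_g(14.0, 13.5, 6.0, 6.5, 80, 78)
pooled, lower, upper, tau2 = random_effects_pool([g1, g2], [v1, v2])
print(f"SMD {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f}), tau^2 {tau2:.3f}")
```

When the studies agree closely the between-study variance is estimated as zero and the result coincides with a fixed-effect analysis; greater disagreement widens the confidence interval.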

Fig. 1

Funnel plot of estimated logarithmic odds ratio against the size of the study. Broken horizontal line represents the overall estimate of the logarithmic odds ratio (0.11).

Fig. 2

Overall estimate of the efficacy and tolerability of amitriptyline (AMI) in comparison to all other antidepressant drugs.

The drop-out rate in patients taking amitriptyline and the control antidepressants was very similar, yielding an overall odds ratio of 0.99 (95% CI 0.91-1.08). However, the estimate of the proportion of patients who experienced side-effects during the study was 13.1% higher for amitriptyline than control antidepressants (NNTH=7.6, 95% CI NNTH 6 to NNTH 10), corresponding to an odds ratio which significantly favoured the control antidepressants (see Fig. 2).

DISCUSSION

Implications for research

This systematic review suggests that amitriptyline should remain in its position as the gold-standard antidepressant. Using a highly conservative approach to estimate efficacy — in which drop-outs were included in the analysis — we estimated that amitriptyline is slightly more efficacious than all other antidepressants grouped together. The same applied when the analysis was subdivided according to pharmacological class of the comparison drug — although the comparison with SSRIs failed to reach statistical significance. This measure of outcome takes into consideration drop-outs from therapy, so it cannot be explained by differential completion of the study protocol. The additional efficacy outcome — using effect sizes of continuous outcomes — showed a similar picture, but now with a statistically significant difference against the SSRIs. The tolerability data confirm that amitriptyline is associated with more side-effects than, but similar drop-outs to, other TCAs, and more side-effects and more drop-outs than SSRIs.

Methodological concerns

There are reasons for interpreting these results with caution. Included studies are heterogeneous in terms of selection criteria, allocation concealment, setting and outcome measures. A certain variability in the overall quality of the primary research might therefore have influenced the overall comparison. This systematic review did not investigate heterogeneity by grouping trials according to patient characteristics or trial quality and performing subgroup analyses. This approach was not adopted because it would have inevitably decreased the power of the analysis, thus providing ambiguous results; in addition, increasing the number of comparisons would have increased the possibility of detecting significant differences only by chance. The present analysis, which pools data from different trials carried out in many populations, has the advantage of generating information which can be applied to a very diverse range of patients (Oxman et al, 1994).

Implications for practice

How should these data be translated into clinical practice? It certainly seems reasonable to conclude that amitriptyline is as good as, if not better than, the other TCAs and heterocyclic antidepressants, with the possible exception of dothiepin (Eccles et al, 1999). It seems reasonable to suggest that either amitriptyline or dothiepin should remain the first-line TCA. More controversial is the role of TCAs alongside SSRIs. The results from randomised trials suggest that amitriptyline probably has the edge in terms of efficacy over SSRIs. Given that publication bias is likely to work in favour of newer compounds, it is possible that unpublished data would further improve amitriptyline's position. Those who advocate first-line use of an SSRI point to two additional strands of evidence: the danger of TCAs in overdose and the fact that they are often in practice prescribed at sub-therapeutic doses. Although the widespread prescribing of SSRIs has to be viewed as a public health measure to prevent suicide, it is likely to be prohibitively expensive; in addition, data showing that the widespread use of SSRIs decreases suicide rates are lacking (Barbui et al, 1999). The advice should probably remain that SSRIs are the first-line treatment to be given to patients at high risk of committing deliberate self-harm. The problem of TCAs being prescribed in low doses has attracted considerable attention, as evidence suggests that in real situations TCAs are rarely taken appropriately. However, the guidelines on ‘adequate’ dosing, which suggest that at least 125 mg of amitriptyline has to be prescribed for it to be effective, are themselves based on inadequate research. Recent systematic reviews indicate that low-dose TCAs are as effective as SSRIs in treating depression (Canadian Coordinating Office for Health Technology Assessment, 1997a), and studies directly comparing low- and high-dose TCAs show only very modest benefits of high doses (Bollini et al, 1999).

Clinical Implications and Limitations

CLINICAL IMPLICATIONS

  • Amitriptyline is at least as effective as the other tricyclic and heterocyclic antidepressants.

  • Slightly more patients treated with amitriptyline make a recovery than with selective serotonin reuptake inhibitors.

  • Amitriptyline is less well tolerated than selective serotonin reuptake inhibitors.

LIMITATIONS

  • Included trials are heterogeneous in terms of patients, settings and outcome measures.

  • Heterogeneity has not been investigated by performing subgroup analyses.

  • The variability in the quality of the original studies might have influenced the overall comparison.

Acknowledgments

We would like to thank Hugh McGuire, CCDANCTR Trial Search Coordinator, for assisting in developing the search strategy of this research. Thanks in addition to Nick Freemantle for sharing relevant references and to Jennifer Hillebrand for assistance in extracting data from non-English articles.

Footnotes

  • See editorial, pp. 99-100, this issue.

  • Received January 31, 2000.
  • Revision received May 19, 2000.
  • Accepted May 19, 2000.

References

APPENDIX: Potentially relevant studies identified by the electronic search and subsequently excluded from the meta-analysis

APPENDIX: Randomised controlled trials included in the meta-analysis
