Tricyclic antidepressants and serotonin reuptake inhibitors are considered to be equally effective, but differences may have been obscured by internally inconsistent measurement scales and inefficient statistical analyses.
To test the hypothesis that escitalopram and nortriptyline differ in their effects on observed mood, cognitive and neurovegetative symptoms of depression.
In a multicentre part-randomised open-label design (the Genome Based Therapeutic Drugs for Depression (GENDEP) study) 811 adults with moderate to severe unipolar depression were allocated to flexible dosage escitalopram or nortriptyline for 12 weeks. The weekly Montgomery–Åsberg Depression Rating Scale, Hamilton Rating Scale for Depression, and Beck Depression Inventory were scored both conventionally and in a novel way according to dimensions of observed mood, cognitive symptoms and neurovegetative symptoms.
Mixed-effect linear regression showed no difference between escitalopram and nortriptyline on the three original scales, but symptom dimensions revealed drug-specific advantages. Observed mood and cognitive symptoms improved more with escitalopram than with nortriptyline. Neurovegetative symptoms improved more with nortriptyline than with escitalopram.
The three symptom dimensions provided sensitive descriptors of differential antidepressant response and enabled identification of drug-specific effects.
Less than 50% of people with depression respond to the first prescribed antidepressant, but the majority eventually respond to a different treatment.1,2 The rate and magnitude of response appear to be similar for tricyclic antidepressants and selective serotonin reuptake inhibitors (SSRIs).3–5 Psychiatrists are unable to predict which drug will work for whom and the choice of first and subsequent treatments has to progress by trial and error. The present study addresses two major methodological challenges that may have precluded identification of drug-specific effects in previous studies: symptomatic heterogeneity and statistical power.
Although depression is conceived as a single condition, its defining symptoms do not necessarily co-occur and individual symptoms may differ in their distribution across individuals and their response to treatments.6 This heterogeneity of depressive symptoms complicates exploration of drug effects. For example, the early improvement of sleep with tricyclic antidepressants may be unrelated to sustained response, but early improvement in anxiety precedes and predicts overall improvement.7 Such cross-sectional and longitudinal dissociations between symptom dimensions decrease the correlations between items of scales that combine mood, anxiety and sleep items in a single score, i.e. impair their internal consistency, to a degree where a summed test score is uninformative.8,9 We have sought to remediate this problem and, using categorical item factor analysis, we identified three dimensions of depressive symptoms with good psychometric properties: observed mood, cognitive and neurovegetative symptoms.10 The present study tests the hypothesis that escitalopram and nortriptyline differ in their effects on these dimensions.
A second challenge concerns the efficiency of statistical analysis. Most previous trials were powered to compare active medication with placebo, but differences between active antidepressants are likely to be smaller.11 To maximise the power for a specified sample size, it is essential that all information on outcome is used in the analysis. Many previous investigations used dichotomised outcomes (e.g. responder/non-responder). However, response to antidepressants is a matter of degree of change rather than a yes/no qualitative transformation, and dichotomising a continuous outcome is associated with a substantial loss of power.12,13 Furthermore, temporal characteristics of antidepressant response are lost in end-point analysis and the commonly used last observation carried forward procedure for missing data produces biased results.14–16 In the present report, we apply mixed-effect modelling that permits the use of data measured at multiple time points, and provides unbiased estimates in the presence of missing data.14,16,17 This approach also separates inter-individual variation in antidepressant response from measurement error and unmeasured centre differences. This partitioning allows estimation of the proportion of variance attributable to unmeasured individual-specific characteristics, including genes.
Genome Based Therapeutic Drugs for Depression (GENDEP) is a partially randomised multicentre clinical and pharmacogenetic study comparing two active antidepressants with contrasting modes of action. The study was undertaken in nine European clinical centres. GENDEP is registered at EudraCT2004-001723-38 (http://eudract.emea.europa.eu) and ISRCTN03693000 (www.controlled-trials.com).
Pragmatic design features were adopted to make GENDEP inclusive and acceptable to a large proportion of people with depression.18 These included non-random allocation of participants who would otherwise not be eligible, no use of placebo, flexible dosage, no post-allocation masking and open communication with general practitioners.
Two antidepressants were selected that represent the two most common mechanisms of action among commonly used antidepressants and have a good efficacy record. Escitalopram is a highly selective inhibitor of the serotonin transporter with no effect on noradrenaline reuptake.19 Nortriptyline is a tricyclic antidepressant with a hundred times higher affinity for the noradrenaline transporter than for the serotonin transporter.20 Nortriptyline was used in preference to the even more selective reboxetine as it has better established efficacy and was considered to be clinically at equipoise with escitalopram.
Study medication was started immediately after the first assessment in antidepressant-free participants or participants on low doses of other antidepressants. A 2-week wash-out was required for people on fluoxetine or monoamine oxidase inhibitors. Escitalopram was initiated at 10 mg daily and increased to a target dose of 15 mg daily within the first 2 weeks unless adverse effects limited dose increase, and could be further increased to 20 mg daily (and up to 30 mg if there was clinical agreement that a higher dose was needed). Nortriptyline was initiated at 50 mg daily and titrated to a target dose of 100 mg daily within the first 2 weeks unless adverse effects limited dose increase, and could be further increased to 150 mg daily (and up to 200 mg if there was clinical agreement that a higher dose was needed). Use of plasma levels to guide dose titration has been suggested for nortriptyline, but it is of uncertain benefit21 and could introduce a systematic difference between the two antidepressants. Therefore, dose titration of both antidepressants was informed by assessments of depressive symptoms and adverse effects rather than plasma levels. Adherence was recorded weekly as self-reported pill count and plasma levels of antidepressants were measured at week 8. Other psychotropic medication was prohibited with the exception of occasional use of hypnotics.
Participants for whom the two antidepressants were clinically considered to be at equipoise were randomly allocated to receive escitalopram or nortriptyline using a random number generator, stratified by centre and performed independently of the assessing clinician. If there was a history of adverse effects, non-response or contraindications to one of the study medications, participants were allocated to the other drug non-randomly. Participants who could not tolerate the initially allocated medication or who did not experience sufficient improvement with adequate dosage within 8 weeks were offered the other antidepressant. Participants who swapped medication were then followed up for 12 weeks.
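The allocation rule described in this paragraph can be sketched in a few lines of Python. This is an illustration only and not the GENDEP randomisation procedure: the function name `allocate`, the argument names and the use of one seeded generator per centre are assumptions made for the example. The text states only that a random number generator was used, stratified by centre.

```python
import random

DRUGS = ("escitalopram", "nortriptyline")


def allocate(centre_rng, unsuitable=None):
    """Return (drug, allocation_type) for one participant.

    centre_rng -- a random.Random instance kept separately for each centre
                  (a hypothetical way of stratifying by centre)
    unsuitable -- the drug ruled out by a history of adverse effects,
                  non-response or contraindication; None means equipoise
    """
    if unsuitable is None:
        # Clinical equipoise: random allocation between the two drugs.
        return centre_rng.choice(DRUGS), "random"
    if unsuitable not in DRUGS:
        raise ValueError(f"unknown drug: {unsuitable}")
    # One drug is ruled out: allocate the other one non-randomly.
    other = DRUGS[1] if unsuitable == DRUGS[0] else DRUGS[0]
    return other, "non-random"


# One independent generator per centre (seeds are arbitrary example values).
centre_rngs = {"centre_A": random.Random(1), "centre_B": random.Random(2)}

print(allocate(centre_rngs["centre_A"]))
print(allocate(centre_rngs["centre_B"], unsuitable="escitalopram"))
```

Participants for whom both drugs were unsuitable were excluded from the study, so the sketch does not handle that case.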
The clinician-rated Montgomery–Åsberg Depression Rating Scale (MADRS),22 the 17-item Hamilton Rating Scale for Depression (HRSD–17)23 and the self-report Beck Depression Inventory (BDI)24 were administered at baseline and then weekly for 12 weeks. The week 0, 8 and 12 assessments were face-to-face interviews with a psychiatrist and a research assistant, both trained in the administration of the instruments. The remaining assessments were conducted by telephone or face-to-face interviews with a trained psychologist or psychiatrist. Psychometric properties and interrater reliability have been reported.10 Using factor analysis of ordered categorical variables with robust weighted least squares estimator and item response modelling, the items of the three scales were integrated into three dimensional scores of observed mood, cognitive symptoms and neurovegetative symptoms.10 The dimensional scores for the present analyses were estimated based on a graded-response model using the previously reported item parameters10 applied in the MULTILOG 7 software for Windows.25 The observed mood dimension comprised the symptoms of depressed mood, activity, anxiety and psychomotor disturbance rated by the clinician. The cognitive symptoms dimension consisted of guilt, pessimism, suicidal thoughts and most items of the self-report BDI. The neurovegetative factor included disturbed sleep, loss of appetite, weight loss and lack of libido. Full mapping of individual items to dimensions is available in a previous article.10 To facilitate interpretation, dimensional symptom scores have been converted to T-scores with a mean of 50 and standard deviation of 10, based on the baseline assessment. This makes a change of 10 on a dimensional score comparable with a change of 10 points on BDI, 7 points on MADRS or 5 points on HRSD–17.
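The T-score conversion mentioned at the end of this paragraph is a linear rescaling against the baseline distribution. A minimal sketch follows, assuming that the latent dimension scores are available as plain numbers. The function name `to_t_scores` and the example values are made up, and the use of the sample standard deviation is an assumption, as the text does not say which estimator was used.

```python
from statistics import mean, stdev


def to_t_scores(scores, baseline_scores):
    """Rescale scores so that the baseline sample has mean 50 and SD 10."""
    m = mean(baseline_scores)
    s = stdev(baseline_scores)  # sample SD (assumption, see lead-in)
    return [50 + 10 * (x - m) / s for x in scores]


# Made-up latent scores for five participants (not GENDEP data).
baseline = [-1.2, -0.4, 0.0, 0.5, 1.1]
week_12 = [-2.0, -1.5, -0.9, -0.2, 0.3]

t_baseline = to_t_scores(baseline, baseline)
t_week_12 = to_t_scores(week_12, baseline)

print([round(t, 1) for t in t_baseline])
print([round(t, 1) for t in t_week_12])
```

Because every later assessment is rescaled with the baseline mean and standard deviation, a fall of 10 points corresponds to an improvement of one baseline standard deviation.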
Sample size and recruitment of participants
The sample size of over 800 gives GENDEP 90% power to detect drug differences corresponding to an effect size (Cohen's d) as small as 0.06 at α=0.05.
Participants were recruited by generalist and specialist referrals and advertisement. Inclusion criteria were: White European ethnicity (to facilitate genetic association analyses), age 18 or older, onset of current depressive episode at age 65 or younger, and a diagnosis of major depressive episode of at least moderate severity defined by the ICD–1026 or DSM–IV27 and established using the Schedules for Clinical Assessment in Neuropsychiatry interview (SCAN version 2.1).28 The exclusion criteria were: family history of bipolar affective disorder or schizophrenia in a first-degree relative, a personal history of hypomanic or manic episode, schizophrenia, mood incongruent psychotic symptoms, primary substance misuse, primary organic disease and pregnancy. Participants were also excluded if they had contraindications or a history of lack of efficacy or adverse reaction to both study medications. The study protocol was approved by the research ethics boards of all participating centres. After explanation of study procedures, all participants provided written consent.
Baseline characteristics were compared using chi-squared tests, Kruskal–Wallis tests or ANOVA for categorical, ordered and continuous variables respectively. Predictors of time to drop out or switch from initially allocated treatment were assessed by Cox proportional hazard regression with drug, allocation (random v. non-random), gender, age, baseline severity, taking antidepressants and benzodiazepines at baseline and number of previous episodes as explanatory variables.
To assess fair dosage of the two antidepressants, we followed the recommendation of a consensus group on antidepressant comparisons,11 and used Cox proportional hazard regression to assess the impact of drug and allocation on time to reach a mid-range dose, which is half-way between the lowest effective and highest recommended dose, i.e. 15 mg for escitalopram and 100 mg for nortriptyline.
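The mid-range dose and the time taken to reach it can be illustrated with a short Python sketch. The dose ranges used here (10–20 mg for escitalopram and 50–150 mg for nortriptyline) are assumptions chosen to be consistent with the mid-range values given above. The function names and the weekly dose series are hypothetical.

```python
def mid_range_dose(lowest_effective_mg, highest_recommended_mg):
    """Dose half-way between the lowest effective and highest recommended dose."""
    return (lowest_effective_mg + highest_recommended_mg) / 2


def weeks_to_mid_range(weekly_doses_mg, threshold_mg):
    """First week (1-based) at which the dose reaches the threshold.

    Returns None if the threshold is never reached, which would be treated
    as a censored observation in a time-to-event analysis.
    """
    for week, dose in enumerate(weekly_doses_mg, start=1):
        if dose >= threshold_mg:
            return week
    return None


# Assumed dose ranges in mg per day (see lead-in).
escitalopram_mid = mid_range_dose(10, 20)
nortriptyline_mid = mid_range_dose(50, 150)

# Hypothetical titration record for one participant on escitalopram.
doses = [10, 10, 15, 20]
print(escitalopram_mid, nortriptyline_mid)
print(weeks_to_mid_range(doses, escitalopram_mid))
```

The resulting week numbers, together with a censoring indicator, form the outcome that a Cox proportional hazard regression would take as input.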
Outcomes were analysed using mixed models with individual random intercepts and slopes, and fitted with full maximum likelihood.17 Participants who swapped medication were included under both medications, with the last measurement on the first antidepressant serving as a baseline for the effect of the second antidepressant, a fixed covariate capturing systematic differences between first and second run of medication, and individual-level clustering being controlled by the random effect of the individual. Centre was included as a higher-level random effect. Model selection was performed by means of likelihood ratio tests. The best fitting model included fixed linear and quadratic effects of time, and fixed linear effects of baseline severity, drug, allocation and age.
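One way to write the model described in this paragraph is given below. The notation is introduced here for illustration and is not taken from the study: y is the symptom score at occasion t for individual i in centre j, "course" marks the first or second run of medication, and an unstructured covariance matrix for the individual random intercept and slope is assumed.

```latex
\begin{aligned}
y_{tij} &= \beta_0 + \beta_1 t + \beta_2 t^2
         + \beta_3\,\mathrm{baseline}_{tij} + \beta_4\,\mathrm{drug}_{tij}
         + \beta_5\,\mathrm{allocation}_{ij} + \beta_6\,\mathrm{age}_{ij}
         + \beta_7\,\mathrm{course}_{tij} \\
        &\quad + v_j + u_{0ij} + u_{1ij}\,t + e_{tij}, \\
v_j &\sim N(0, \sigma_v^2), \qquad
(u_{0ij}, u_{1ij})^\top \sim N(\mathbf{0}, \Sigma_u), \qquad
e_{tij} \sim N(0, \sigma_e^2).
\end{aligned}
```

In this form the centre effect is v_j, the individual random intercept and slope are u_{0ij} and u_{1ij}, and e_{tij} is the occasion-level residual.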
The mixed-effect models provide unbiased estimates, assuming the data are missing at random and the variables associated with missing values are included in the model.14,29 To assess the missing data mechanism, we explored the relationship between missingness and observed variables at baseline and at the last observed time point.
The combined analysis of randomised and non-randomised participants may be subject to confounding by baseline group differences on observed or unobserved variables. Therefore, to evaluate the sensitivity of our analysis to selection effects, the mixed-model analyses were repeated on the reduced sample of observations from randomised individuals while they were on their first course of medication.
All analyses were conducted in Stata 10 for Windows.30
Screening and reasons for non-inclusion
The flow of participants through the study is summarised in Figs 1 and 2. The reasons for exclusions at the screening stage were: not fulfilling diagnostic criteria for moderate or severe depressive episode (24%); bipolar disorder or psychotic symptoms (18%); unable to discontinue current psychotropic medication (16%); ethnicity (10%); primary alcohol or substance misuse (7%); family history of bipolar disorder or schizophrenia (7%); unable to attend the study centre (7%); contraindications (6%); age (3%); and pregnancy (2%).
Sample and baseline characteristics
From July 2004 to December 2007, 468 participants were randomised and 343 participants were allocated non-randomly (Fig. 1). More participants were non-randomly allocated to escitalopram than to nortriptyline. Sample characteristics at baseline are presented in Table 1 (full details are presented in online Table DS1). The non-randomly allocated participants differed from the randomised sample: fewer were married (χ2(3)=11.72, P=0.008) or employed (χ2(5)=13.86, P=0.017), they had later age at onset (F(1, 809)=10.56, P=0.001), fewer depressive episodes (Kruskal–Wallis χ2(1)=45.70, P<0.001) and less severe symptoms (MADRS F(1, 809)=7.22, P=0.007). Within the participants who could not be randomly allocated to treatment, those receiving nortriptyline had more previous episodes (Kruskal–Wallis χ2(1)=5.04, P=0.025) (Table 1) and were more likely to have a history of taking SSRI-type antidepressants (χ2(1)=7.36, P=0.007) than those non-randomly allocated to escitalopram (online Table DS1).
Retention of participants
Of the 811 participants, 628 (77%) completed 8 weeks and 527 (65%) completed 12 weeks on the originally allocated antidepressant (Fig. 1). Over the 12 weeks, 105 (13%) participants switched to the other antidepressant and an additional 4 switched after completing 12 weeks on the originally allocated drug. Reasons for switching were poor tolerance (39%), lack of effect (45%) or both (16%). Over the 12 weeks, 179 participants dropped out because of adverse reactions (31%), lack of effect (34%), improvement (8%), death (1%, see adverse events) and other reasons (25%). Of the 109 participants who switched antidepressant, 80 (73%) completed 8 weeks and 68 (62%) completed 12 weeks on the second antidepressant (Fig. 2).
The rate of drop out and switching was highest among participants randomly allocated to nortriptyline (hazard ratio (HR)=1.87, 95% CI 1.36–2.56, P=0.001 compared with random escitalopram; HR=1.47, 95% CI 1.02–2.13, P=0.041, compared with non-random nortriptyline; Fig. 3). There were no significant differences in drop-out and switching rate among the other three groups. Attrition was predicted by more severe baseline symptoms with a hazard ratio of 1.22 (95% CI 1.08–1.38, P=0.002) for one standard deviation increase in MADRS.
The weekly data on depression severity were 92.9% complete and the proportion of missing values did not differ between groups. Taking benzodiazepines at the time of recruitment was related to the proportion of missing values; 4% of data were missing in participants who were taking benzodiazepines at baseline compared with 9% in participants who were not taking benzodiazepines (β=–0.045, 95% CI –0.064 to –0.026, P<0.001). Younger age was associated with more missing values (β=–0.010, 95% CI –0.018 to –0.001, P=0.030). Other clinical and demographic variables were not related to missing data. Missing values at a specific time point (t) were not predicted by severity of depression at the preceding visit (t–1), for example for MADRS (β=–0.003, 95% CI –0.012 to 0.005, P>0.1).
Antidepressant dosage and adherence
For both antidepressants, the median time to reach mid-range dose was 3 weeks, and there was no significant effect of drug (HR=1.11, 95% CI 0.95–1.30, P=0.198), indicating similar rate of dose titration for both antidepressants. The mean dose by study group and week is presented in the online Table DS2. The self-reported adherence was high (98.4%) and did not differ between treatment groups (P>0.1). The average plasma levels at the eighth treatment week were nortriptyline 100.4 mcg/l (s.d.=57.9) and citalopram 30.7 mcg/l (s.d.=21.2), with no significant difference between randomly and non-randomly allocated participants (P>0.1).
Changes in depression symptoms
The weekly measurements of depressive symptoms on the three original scales and the three symptom dimensions are presented in Fig. 4. The mixed models included linear and quadratic functions of time, fixed effects of drug, randomisation status, baseline severity, age, gender, number of depressive episodes, history of taking antidepressants and benzodiazepines at baseline (the latter was included as it predicted missingness) and showed that drug did not affect the outcome measured by the HRSD–17, MADRS or BDI (all P>0.1, Table 2). However, there were significant effects of drug on outcome on each of the three symptom dimensions. The observed mood and cognitive symptoms improved more in escitalopram-treated participants. The neurovegetative symptoms improved more in those receiving nortriptyline (Table 2).
To control for selection bias, we performed a sensitivity analysis restricted to the first course of antidepressant treatment in the randomised participants. The results were very similar with all effect size estimates within one standard error of the whole sample estimates (Table 2). The degree of statistical certainty was reduced owing to the smaller sample size.
Younger age was associated with improvement on all measures (e.g. for MADRS: β=0.08, 95% CI 0.04–0.11 per 10 years of age, P<0.001). History of taking antidepressants predicted less improvement on all measures (e.g. for MADRS: β=0.13, 95% CI 0.04–0.23, P=0.005).
The fixed part of the models explained 35% of variability in antidepressant response on the observed mood dimension. Of the remaining variance, 8% was attributable to the unmeasured characteristics of centre, 69% was at the level of individual and 28% remained as level-one residuals, corresponding to measurement error and unmeasured time-varying factors.
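The partitioning of residual variance across levels amounts to expressing each variance component as a share of their total. A minimal sketch follows, assuming a simplified model with random intercepts only; with random slopes, as in the present models, the individual-level variance also depends on time. The function name and the variance values are hypothetical and are not GENDEP estimates.

```python
def variance_shares(components):
    """Return each variance component as a proportion of the total."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: value / total for level, value in components.items()}


# Hypothetical variance estimates in squared T-score units (not study results).
components = {"centre": 2.0, "individual": 46.0, "residual": 18.0}
shares = variance_shares(components)

for level, share in shares.items():
    print(f"{level}: {share:.0%}")
```

By construction the shares sum to one, so the individual-level share can be read as the proportion of unexplained variance that is stable within a person over time.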
Information on response and remission using last observation carried forward analysis is available in the online data supplement.
Adverse events and reactions
Two participants died during the study period. A woman randomised to nortriptyline died by suicide in the ninth week. A man randomly allocated to escitalopram died in a road traffic accident in the fifth week. Severe adverse events included two hospital admissions owing to suicide risk (ninth week on random escitalopram, third week on random nortriptyline), a manic episode in the third week of nortriptyline and an unintentional overdose of nortriptyline with full recovery. Commonly reported adverse reactions to escitalopram included nausea and vomiting (15%) and sexual dysfunction (30%). Common adverse effects of nortriptyline included dry mouth (80%), orthostatic dizziness (32%), drowsiness (27%) and constipation (24%).
Differential effects of antidepressants
The present results demonstrate the utility of dimensional symptom measures derived by psychometric analysis to identify relative advantages of individual antidepressants. Escitalopram was more effective than nortriptyline in relieving mood and cognitive symptoms of depression. Nortriptyline was more effective than escitalopram in improving neurovegetative symptoms such as disturbed sleep and poor appetite. None of these differences would have been revealed by summed scores on conventional depression rating scales that combine all three types of symptoms.
The observed mood dimension reflects the symptoms of depressed mood, anxiety, psychomotor retardation and activity. It has been noted that changes in core mood symptoms are more likely to reflect sustained antidepressant effect,7 differentiate active antidepressants from placebo,31 show a dose–response relationship32 and moderation by polymorphism in the serotonin transporter gene.33 The observed mood dimension contains information from most items that constitute the previously suggested core sub-scales of the HRSD,31,34 but has the advantages of using information from a larger number of items and not making indefensible assumptions about additivity and equal contribution of items.10,35 Therefore, the observed mood score is suitable for testing hypotheses related to pharmacological modulation of affect and biomarkers of the monoaminergic systems. The strong effect of escitalopram on observed mood indicates the utility of this antidepressant in people where core affective symptoms dominate the clinical picture.
The cognitive symptoms dimension comprises items reflecting dissatisfaction with oneself, pessimism, guilt and suicidal thoughts. It shows a modest advantage of escitalopram over nortriptyline. As suicidal ideation appears to lie on a continuum with cognitive symptoms,10 the cognitive dimension may be evaluated as a monitoring tool for treatment-emergent suicidality.36
The most robust finding of the present study was that neurovegetative symptoms improved significantly more with nortriptyline than with escitalopram. The neurovegetative symptom dimension includes disturbed sleep, decreased appetite, weight loss and lack of sexual interest. These symptoms are characteristic of melancholic depression and may indicate the need for antidepressants with a broader spectrum of pharmacological effects.37 It has been reported that the HRSD–17 with three sleep items may give an advantage to tricyclic antidepressants that improve sleep through their anticholinergic action over SSRIs that may disturb sleep, cause gastrointestinal discomfort and sexual dysfunction.38 Sleep improvement may be independent of antidepressant action on mood7 and moderated by genes regulating the circadian rhythm.39 The present findings add to the weight of evidence indicating that sleep and appetite should be measured separately from the core mood symptoms.
Our results suggest that failure to find differential efficacy of tricyclic antidepressants and SSRIs in previous studies3 may have been because such differences were obscured by the internal inconsistency of scales such as the HRSD–17.8 As the item response theory scoring is independent of the number of administered items,35 it could be used to derive equivalent scores for samples where either HRSD or MADRS is available.10 This raises the possibility of re-examining existing data-sets to attempt to replicate the present findings and extend them to placebo-controlled trials.
The size of the drug differences is comparatively small. However, it may be of clinical utility since it is approximately 25–50% of the size of the differences between antidepressants and placebo in contemporary trials.40,41 Increased efficiency of the item response theory-scored dimensions may also have substantial implications for the sample size and power of future comparisons between active drugs or between drugs and placebo.42 Moreover, small overall differences can point to large differences in subgroups of patients. A relatively small improvement in accuracy of symptom measurement can magnify the power to detect interactions between drug and individual characteristics, and facilitate identification of predictors of differential drug response.43 Dimensional symptom scores will allow testing of specific pharmacogenetic hypotheses concerning mood,33 neurovegetative39 or cognitive symptoms.36
The mixed-effect modelling estimated the sources of residual variability in symptom change over time. Although a number of predictors have been included in the models, these have jointly explained only 35% of the variance in the individual trajectories of depressive symptoms. Most of the residual variance is attributable to unmeasured individual characteristics that are stable over time. This large proportion of variance presents a challenge for future research, which should include exploration of genetic factors and early environmental influences.
Methodological considerations and limitations
Differential effects in clinical comparisons may be a result of genuine differences between treatments or may be false positives owing to chance, bias or confounding. Chance alone is unlikely to account for the present findings as the differential effects were identified with a high level of statistical certainty. Additional analyses excluded other potential sources of bias and confounding such as baseline differences between groups allocated to different drugs and inequality of dose titration.11
The attrition rate was higher among participants randomly allocated to nortriptyline. This is consistent with previous reports.44,45 Interestingly, the differential attrition was a result of switching rather than drop out and did not generalise to participants who were non-randomly allocated to nortriptyline. This suggests that a high discontinuation rate on nortriptyline is not inevitable, and that clinical assessment based on medication history improves the fit between the individual and the antidepressant.
Differential drop out can lead to bias, especially with the last observation carried forward procedure.14,16,45 We applied maximum likelihood estimation with observed predictors of missingness included in the model. This method is robust to differential rates of missing data.14,15,17
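To show why the last observation carried forward procedure is sensitive to differential drop out, the sketch below applies it to one made-up series of weekly scores. The function name `locf` and the data are hypothetical; the example shows only the imputation step and is not an analysis from the study.

```python
def locf(series):
    """Fill missing values (None) with the last observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Made-up weekly MADRS scores (weeks 0 to 6) for a participant
# who left the study after week 3.
observed = [30, 27, 24, 20, None, None, None]
print(locf(observed))  # [30, 27, 24, 20, 20, 20, 20]
```

The procedure freezes the trajectory at the score recorded before drop out. If participants leave one treatment arm earlier than the other, that arm's end-point scores are fixed at an earlier and typically less improved stage, which distorts the comparison. A likelihood-based mixed model instead uses the observed part of each trajectory without imputing the missing weeks.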
The GENDEP study aimed to include a sample representative of the treatment-seeking population of individuals with depression. Therefore, non-random allocation was allowed where the two antidepressants were not at equipoise and the participants and their general practitioners knew which medication they were receiving. These features increased the acceptability of the study to participants and to general practitioners and thus made the study more inclusive and externally valid. However, they have implications for the internal validity. The inclusion of non-randomly allocated participants introduced systematic differences at baseline. However, the findings were qualified by a sensitivity analysis that demonstrated that observed differential effects of drugs on symptom dimensions were not a result of selection bias. The lack of masking introduces a potential for biased reporting of symptoms. It is, however, unlikely that a reporting bias would operate in opposite directions for different categories of symptoms.
In conclusion, dimensional measures distinguishing between observed mood, cognitive and neurovegetative symptoms of depression allowed the identification of relative advantages of escitalopram and nortriptyline. The differential drug effects were not a result of baseline sample characteristics, unfair dosage or differential attrition. These dimensional symptom measures provide a powerful tool to facilitate drug comparisons and find predictors of differential drug response.
The GENDEP study was funded by the European Commission Framework 6 grant, EC Contract Ref.: LSHB-CT-2003-503428. Lundbeck provided both nortriptyline and escitalopram free of charge for the GENDEP study. GlaxoSmithKline contributed by funding an add-on project in the London centre. The sponsors had no role in the design and conduct of the study, in data collection, analysis, interpretation or writing the report. We would like to thank the following collaborators for their contribution: Helen Dean, Bhanu Gupta, Joanna Gray, Cerisse Gunasinghe, Desmond Campbell, Richard J Williamson, Julien Mendlewicz, Thomas Schulze, Jana Strohmaier, Susanne Höfels, Anna Schuhmacher, Ute Pfeiffer, Sandra Weber, Erik Roj Larsen, Anne Schinkel Stamp, Dejan Kozel, Mojca Zvezdana Dernovek, Alenka Tancic, Jerneja Sveticic, Zrnka Kovacic, Pawe Kapelski, Maria Skibiñska, Piotr M Czerski, Aleksandra Rajewska, Aleksandra Szczepankiewicz and Elzbieta Cegielska. We would like to specially acknowledge the contribution of Jorge Perez, who was the principal investigator at Brescia, Italy, and who passed away in October 2007. We also wish to acknowledge the important contribution made by Andrej Marušič, the principal investigator at Ljubljana, Slovenia, who passed away in June 2008.
- Received July 22, 2008.
- Revision received October 7, 2008.
- Accepted October 16, 2008.
- © 2009 Royal College of Psychiatrists