The British Journal of Psychiatry
Comparison of risk factors for the onset and maintenance of depression
Christian Bottomley, Irwin Nazareth, Francisco Torres-González, Igor Švab, Heidi-Ingrid Maaroos, Mirjam I. Geerlings, Miguel Xavier, Sandra Saldivia, Michael King



Factors associated with depression are usually identified from cross-sectional studies.


We explore the relative roles of onset and recovery in determining these associations.


Hazard ratios for onset and recovery were estimated for 39 risk factors from a cohort study of 10 045 general practice attendees whose depression status was assessed at baseline, 6 and 12 months.


Risk factors have a stronger relative effect on the rate of onset than recovery. The strongest risk factors for both onset and maintenance of depression tend to be time-dependent. With the exception of female gender the strength of a risk factor’s effect on onset is highly predictive of its impact on recovery.


Preventive measures will achieve a greater reduction in the prevalence of depression than measures designed to eliminate risk factors post onset. The strength of time-dependent risk factors suggests that it is more productive to focus on proximal rather than distal factors.

Depression typically runs a course characterised by relapse and remission. As with other transient illnesses, the prevalence of depression is determined by incidence (i.e. the rate of depression onset) and the duration of the disease. The latter is governed by the rates of recovery, and to a lesser extent, mortality. Although a number of studies have identified associations between risk factors and depression in general population samples,1 most are based on cross-sectional data and demonstrate a difference in the prevalence of depression among individuals with and without the risk factor. However, rates of onset and recovery are of greater clinical relevance than prevalence.2 Associations between risk factors and onset of depression can help identify people at risk of depression,3 whereas knowledge of risk factors associated with recovery can be used to guide treatment. A limited number of primary care based studies have examined onset or recovery separately,1,4–7 but to our knowledge none have simultaneously assessed risk factors for their influence on onset and recovery from major depression. This study compares the relative rank and magnitude of risk factors in the onset of and recovery from depression using data collected from a general practice cohort.


Study design

The study design is described in detail in King et al.8 Here we give a brief overview. General practice attendees were recruited from seven countries (UK, Spain, Portugal, Netherlands, Estonia, Slovenia, Chile). Risk factors and DSM–IV9 major depression status were assessed at baseline and 6 months, and a further assessment of depression status was made at 12 months. DSM–IV major depression was diagnosed using version 2.1 of the Composite International Diagnostic Interview (CIDI),10,11 which identifies depressive episodes within the previous 6 months.

Recruitment differed slightly in each country because of local service preferences. In the UK and The Netherlands, researchers spoke to people while they waited to see practice staff. In the other four European countries and Chile the doctors introduced the study before contact with the researcher. Participants who gave informed consent undertook a research evaluation within 2 weeks at their home or the general practice. The study design was approved by ethics committees within each of the participating countries.

A total of 10 045 people took part in the seven countries. Response to recruitment was high in Portugal (76%), Estonia (80%), Slovenia (80%) and Chile (97%) but lower in the UK (44%) and The Netherlands (45%).8 Across all countries response to follow-up was 89.5% at 6 months and 85.9% at 12 months.3 Recovery rates were estimated in a cohort (n = 1385) who were depressed at baseline, 86% (n = 1186) of whom had follow-up data at 6 months and 77% (n = 1070) of whom also had data at 12 months. For onset of depression the cohort consisted of 8517 individuals who were not depressed at baseline, 89% of whom (n = 7558) had data at 6 months and 83% (n = 7061) had full data at 6 and 12 months.

Risk factors were classified as being either time-dependent or time-independent. The latter remain constant over time and include variables fixed at birth (e.g. gender) and historic variables (e.g. childhood experience of abuse). All other risk factors were considered time-dependent. Estimates of effect for time-dependent risk factors were based on outcome and risk factor data collected at 6 months. We chose to use 6-month data for risk factors rather than baseline data as this better reflects an individual’s state between baseline and 6 months (the baseline measurement represents the individual’s state before baseline) and is consequently more predictive of both onset and recovery for this period. Estimates of effect for time-independent variables were based on 6-month risk factor data and 6- and 12-month outcome data because it can be assumed that these risk factors will not change between 6 and 12 months.

Risk factors

Risk factors were selected for this study based on a systematic review of the literature performed for the PredictD study.8 The review used both cross-sectional and longitudinal studies to identify personal, social and psychological factors (biological factors were not included) associated with depression. Neither severity of depression nor treatment were included as risk factors for recovery since they are not relevant to onset and the objective of this study is to compare the effects of risk factors on onset with their effect on recovery. Most risk factors were binary; where they were not, they were converted into binary variables for the analysis. Variables that were originally continuous were categorised as being below or above the median score. Where a variable had more than two categories, it was recoded so that the category with the highest prevalence of depression was compared with the remaining categories combined. The following 39 risk factors were used in the study.

  • (a) Sociodemographic factors and personal factors: age, gender, occupation, educational level, marital status, employment status, ethnicity, owner-occupier accommodation, living alone or with others, born in the country of residence or abroad, satisfied with their living conditions and presence of long-standing physical illness.

  • (b) A life-time screen for depression based on the first two questions of the CIDI. People answering yes to both questions screened positive.12

  • (c) Controls, demands and rewards for paid and unpaid work in the preceding 6 months were estimated by an adapted version of the Job Content Questionnaire.13 Participants were categorised as feeling in control in paid or unpaid work; as experiencing difficulties without support in paid or unpaid work; and experiencing distress without feeling respect for their paid or unpaid work.

  • (d) Financial strain: a single question that is commonly used in government and other UK social surveys.14

  • (e) Physical and mental well-being were assessed by the Short Form–12.15

  • (f) Alcohol misuse using the World Health Organization’s Alcohol Use Disorders Identification Test (AUDIT) questionnaire16 plus questions on whether or not the respondent had ever had an alcohol problem or received treatment for one.

  • (g) Use of recreational drugs ever, adapted from the relevant sections of the CIDI.

  • (h) Brief questions on the quality of sexual and emotional relationships with a partner were adapted from a standardised questionnaire.17

  • (i) Presence of serious physical, psychological or substance misuse problems or any serious disability, in people who were in a close relationship to participants.

  • (j) Difficulties in getting on with people and maintaining close relationships were assessed using questions from a social functioning scale.18

  • (k) Childhood experiences of physical and/or emotional abuse and sexual abuse.19

  • (l) Religious or spiritual beliefs.20

  • (m) Family psychiatric history: serious psychological problems in first-degree family members requiring pharmacological or psychological treatment in primary or secondary care, and suicide in first-degree relatives.21

  • (n) Anxiety and panic symptoms in the previous 6 months using the relevant sections of the Patient Health Questionnaire (PHQ).22

  • (o) The living environment, including satisfaction with neighbourhood and perception of safety inside/outside of the home, using questions from the Health Surveys for England.23

  • (p) Recent major life events using the List of Threatening Life Experiences Questionnaire.24

  • (q) Experiences of discrimination on the grounds of gender, age, ethnicity, appearance, disability or sexual orientation using questions from a recent European study.25

  • (r) Adequacy of social support from family and friends.26
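The two binarisation rules described before the list (a median split for continuous scores, and the highest-prevalence category against all others for multi-category variables) can be sketched as follows. This is only an illustration: the function names and the sample data are hypothetical, and coding scores equal to the median as 0 is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def median_split(scores):
    """Code each score as 1 if above the sample median, else 0.

    Assumption: scores equal to the median are coded 0.
    """
    cutoff = median(scores)
    return [int(s > cutoff) for s in scores]


def highest_prevalence_vs_rest(categories, depressed):
    """Recode a multi-category variable as binary.

    The category with the highest prevalence of depression is coded 1
    and all remaining categories are combined and coded 0.
    """
    counts = {}  # category -> (number of people, number depressed)
    for cat, dep in zip(categories, depressed):
        n, k = counts.get(cat, (0, 0))
        counts[cat] = (n + 1, k + int(dep))
    reference = max(counts, key=lambda c: counts[c][1] / counts[c][0])
    return reference, [int(c == reference) for c in categories]


# Hypothetical data, for illustration only
scores = [3, 7, 5, 9]
marital = ["married", "single", "divorced", "married", "divorced", "single"]
depressed = [0, 0, 1, 0, 1, 1]

binary_scores = median_split(scores)
reference, binary_marital = highest_prevalence_vs_rest(marital, depressed)
```

Each resulting 0/1 variable can then be entered into a model as a single risk factor, which keeps the hazard ratios comparable across the 39 factors.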

Statistical methods

Data on depression status at baseline, 6 months and 12 months were used to estimate the relative effect of each risk factor on the rates of recovery and onset. A model for interval-censored data was used since the time of recovery or onset could only be identified within a 6-month window. The model was originally proposed by Prentice and Gloeckler,27 and was fitted using glm in Stata version 9 for Windows. In the following the model for recovery is described; the model for onset of depression is similar. Further details of this model are given elsewhere.28,29

In an interval-censored model, an individual’s discrete time hazard of recovery during the interval j (i.e. the probability of recovery given that they have not recovered before the start of the interval) may be modelled by

$$h_j(X) = 1 - \exp\left[-\exp\left(\gamma_j + X'\beta\right)\right]$$

where X is a vector of covariates, β is a vector of parameters (log hazard ratios) and the parameter γj represents the log of the product of the baseline rate of recovery during interval j and the length of the interval.
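As a numerical illustration, the sketch below evaluates the discrete-time hazard in the complementary log–log form usually associated with the Prentice and Gloeckler model, h = 1 − exp(−exp(γj + X′β)), and recovers the hazard ratio from two hazards. The parameter values are made up for the example and are not estimates from the study.

```python
import math


def discrete_hazard(gamma_j, x, beta):
    """Probability of recovery in interval j, given no recovery before it."""
    linear_predictor = gamma_j + sum(xi * bi for xi, bi in zip(x, beta))
    return 1.0 - math.exp(-math.exp(linear_predictor))


def cloglog(p):
    """Complementary log-log transform: maps a hazard back to gamma_j + X'beta."""
    return math.log(-math.log(1.0 - p))


# Hypothetical parameters (not taken from the study)
gamma_j = math.log(1.1)   # log(baseline recovery rate x interval length)
beta = [math.log(0.5)]    # one binary risk factor that halves the recovery rate

h_unexposed = discrete_hazard(gamma_j, [0], beta)
h_exposed = discrete_hazard(gamma_j, [1], beta)

# On the cloglog scale the difference between groups is beta,
# so exponentiating it gives the hazard ratio.
hazard_ratio = math.exp(cloglog(h_exposed) - cloglog(h_unexposed))
```

The point of the transform is that a covariate shifts the complementary log–log of the interval probability by a constant, which is why exp(β) can be read as a hazard ratio even though only interval-level outcomes are observed.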

Models of this form were fitted by selecting individuals depressed at baseline and then organising the data such that each person–interval combination was a separate row in the data-set; the number of rows for an individual corresponded to the number of intervals up to, and including, the interval in which the individual recovered or was censored. A variable for interval and a binary variable for recovery were defined. The parameters were estimated by fitting a generalised linear model with complementary log–log link function to these data in which the outcome is recovery and the interval variable is included as a covariate.
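The person–interval layout described above can be sketched as follows. The record format (one True/False/None entry per follow-up interval) and the function name are hypothetical conveniences for this example; a person with no follow-up data for an interval is assumed to contribute no row for it.

```python
def expand_person_intervals(person_id, outcomes):
    """Return one row per interval at risk for a person depressed at baseline.

    outcomes holds one entry per follow-up interval, in order:
      True  = recovered by the end of the interval
      False = still depressed at the end of the interval
      None  = no follow-up data (assumed censored before this interval)
    """
    rows = []
    for interval, recovered in enumerate(outcomes, start=1):
        if recovered is None:   # censored: contributes no further rows
            break
        rows.append({"id": person_id,
                     "interval": interval,
                     "recovered": int(recovered)})
        if recovered:           # recovery observed: stop after this interval
            break
    return rows


# Hypothetical cohort of four people depressed at baseline
cohort = {
    "A": [True],          # recovered by 6 months
    "B": [False, True],   # recovered between 6 and 12 months
    "C": [False, False],  # still depressed at 12 months
    "D": [False, None],   # depressed at 6 months, no 12-month data
}

table = [row for pid, outcomes in cohort.items()
         for row in expand_person_intervals(pid, outcomes)]
```

A binary-outcome generalised linear model with a complementary log–log link can then be fitted to `table`, with `recovered` as the outcome and `interval` as a covariate alongside the risk factor of interest.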

Models were fitted for each of the 39 risk factors in turn. In addition to time period we included country in the models to account for the dependence between observations from the same country. We also estimated adjusted hazard ratios using models that included, as well as country and the risk factor of interest, the time-independent covariates. Time-dependent covariates were not included since these are potentially mediator variables rather than confounders. In models for time-independent risk factors, the proportional hazards assumption was tested by including an interaction between time period and the risk factor. Standard errors were based on the sandwich estimator of variance to incorporate clustering by practice.


Crude rates of onset and recovery

There were 7558 individuals who were not depressed at baseline with follow-up depression data at 6 months; of these, 4.87% (n = 368) became depressed by 6 months. Between 6 and 12 months the rate of onset fell to 3.43% (n = 231) among the 6733 non-depressed individuals followed over this period (Fig. 1).

Fig. 1

Follow-up in the two cohorts (defined by depression status at baseline) used to estimate rates of onset and recovery.

The rate of recovery during the first 6 months was 67.45% (n = 800) out of 1186 individuals with depression at baseline with follow-up data at 6 months. For the cohort of 338 individuals that remained depressed at 6 months and who had follow-up data at 12 months, the rate of recovery fell to 48.82% (n = 165).

A null model with the variables time and country fitted using depression data from 6 and 12 months revealed that the rate of onset declined during the 6- to 12-month interval to 71% (95% CI 61–84) of its rate between baseline and 6 months. A similar model for recovery showed a reduction in the rate of recovery between 6 and 12 months to 61% (95% CI 52–72) of its rate between baseline and 6 months.
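As a rough cross-check, the crude counts reported above can be converted to the same scale as the model. Under a complementary log–log model, −log(1 − p) for an interval probability p estimates the rate multiplied by the interval length, so with two 6-month intervals the ratio of these quantities approximates the rate ratio between periods. The sketch below ignores country, which the fitted null model adjusts for, so exact agreement with the reported figures should not be expected.

```python
import math


def cumulative_rate(p):
    """-log(1 - p): rate x interval length implied by an interval probability."""
    return -math.log(1.0 - p)


# (events, number at risk) for each 6-month interval, as reported in the text
onset = {"0-6": (368, 7558), "6-12": (231, 6733)}
recovery = {"0-6": (800, 1186), "6-12": (165, 338)}


def crude_rate_ratio(table):
    """Rate in the second interval relative to the first, unadjusted."""
    p_first = table["0-6"][0] / table["0-6"][1]
    p_second = table["6-12"][0] / table["6-12"][1]
    return cumulative_rate(p_second) / cumulative_rate(p_first)


onset_ratio = crude_rate_ratio(onset)        # roughly 0.70
recovery_ratio = crude_rate_ratio(recovery)  # roughly 0.60
```

These unadjusted ratios sit close to the country-adjusted estimates of 71% for onset and 61% for recovery.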

Proportional hazards assumption

For the 14 time-independent risk factors (online Table DS1) the outcome (recovery or onset) used data at 6 and 12 months. The models fitted with these variables assumed that the effect of the risk factor on the rate of onset or recovery remained constant over both time periods (i.e. an implicit proportional hazards assumption was made). This assumption was tested by including an interaction term between time period and risk factor. The interaction term was not significant at P<0.05 for any of the time-independent covariates. The data are therefore consistent with the proportional hazards assumption.

Hazard ratios for onset and recovery

Online Table DS1 reports the relative effect of each risk factor on the rate of onset and recovery. The hazard ratio (HR) is reported for onset, whereas (1/HR) is reported for recovery since a risk factor that slows recovery is presumed to increase the rate of onset. Viewed another way, it is the absence of the risk factor that is associated with faster recovery.

Three features in online Table DS1 are noteworthy; these are common to both unadjusted and adjusted hazard ratios. First, hazard ratios for onset are generally greater than hazard ratios for recovery. This suggests that the risk factors studied have a greater impact on onset than on recovery. To assess the impact of risk factors on either onset or recovery it is helpful to convert hazard ratios to incidence density fractions (IDF) (using the relationship IDF = 1 − 1/HR). Under certain biological models the incidence density fractions can be viewed as an estimate of the aetiological fraction – the proportion of exposed cases that are a result of the risk factor.30 For depression onset, 22 of the 39 risk factors have an incidence density fraction of more than 50% (for new cases of depression exposed to these risk factors more than 50% of the cases can be attributed to the risk factor). By contrast for recovery the absence of the risk factor produced an incidence density fraction greater than 50% in only 3 of 39 cases.
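The conversion from hazard ratio to incidence density fraction is simple arithmetic; a minimal sketch follows. The hazard ratios used here are illustrative values, not entries from Table DS1. Note that an incidence density fraction above 50% corresponds to a reported ratio (HR for onset, 1/HR for recovery) above 2.

```python
def incidence_density_fraction(ratio):
    """IDF = 1 - 1/HR, applied to the ratio as reported in the table
    (HR for onset, 1/HR for recovery)."""
    if ratio <= 0:
        raise ValueError("a hazard ratio must be positive")
    return 1.0 - 1.0 / ratio


# Illustrative ratios only
idf_at_2 = incidence_density_fraction(2.0)     # 0.5, the 50% threshold
idf_at_4 = incidence_density_fraction(4.0)     # 0.75
idf_at_1_25 = incidence_density_fraction(1.25) # about 0.2
```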

The second noteworthy feature of online Table DS1 is that the strongest risk factors both in terms of onset and recovery tend to be time-dependent. Based upon unadjusted hazard ratios nine of the top ten risk factors for both onset and recovery are time-dependent.

Third, the hazard ratios for onset and recovery in online Table DS1 exhibit a strong negative association (r = −0.71, Fig. 2). For example, among the 11 risk factors with the greatest effect on onset (as measured by the unadjusted hazard ratio), 10 were also among the top ten factors for reducing the rate of recovery.

Fig. 2

A comparison of the unadjusted hazard ratios for onset v. recovery: each point corresponds to a different risk factor.

The hazard ratio (HR) for onset is strongly predictive of HR for recovery and vice versa (r = −0.71). The data were collected from general practice attendees in six European countries and Chile who were recruited between April 2003 and September 2004.


Main findings

To our knowledge this is the first study that has simultaneously compared risk factors for onset of and recovery from depression in the same population. We have made three main observations from this analysis:

  • (a) risk factors have a greater relative effect on the rate of onset than on the rate of recovery;

  • (b) the strongest risk factors for onset and recovery are predominantly time-dependent; and

  • (c) the strength of a risk factor’s effect on onset is highly predictive of its impact on recovery.

An illness model of depression

Viewed in terms of the aetiological fraction, observation (a) above suggests that the proportion of cases among exposed individuals that are caused by a risk factor exceeds the proportion of recoveries among unexposed individuals that are attributable to the absence of the risk factor. This supports an illness model of depression in which the stimulus (i.e. time-dependent risk factor) continues to have an impact, even when no longer present. In this model there is an immediate increase in the risk of illness following application of a stimulus; once ill, however, removing the risk factor does not necessarily lead to restitution. Examples of such illness models are relatively common. Exposure to allergens in occupational asthma sufferers increases the risk of bronchial hypersensitivity, but removing the exposure does not reverse the symptoms.31

To interpret observation (a) in terms of the aetiological fraction we must assume that the aetiological fraction can be estimated from the hazard ratio by IDF = 1 − 1/HR; this is only true under certain assumptions that cannot be tested from the data.30,32 The assumption that is most commonly made is the independence-of-background assumption. Under this assumption the incidence of cases caused by the exposure is independent of the incidence of cases not caused by exposure.33 In practice the incidence density fraction is likely to underestimate the aetiological fraction. The aetiological fraction will be underestimated if, for instance, some cases caused by the exposure occur in individuals who would have become depressed during the follow-up period had they not been exposed (i.e. these individuals become depressed earlier as a result of the exposure). Under this scenario the rate ratio does not reflect these extra cases attributable to the exposure.

Prevalence of depression

For a rare disease in a population of constant size with no migration, the prevalence of disease is approximately the product of the incidence (rate of onset) and 1/(average recovery rate).32 Taken in conjunction with observation (a) above (risk factors have a greater impact on the rate of onset than recovery) this suggests that the risk factors influence the prevalence of depression predominantly through their effect on the rate of onset. Thus interventions that act preventively through these risk factors may be more successful at reducing the prevalence of depression than those designed to increase the rate of recovery.
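The argument can be made concrete with a small sketch of the approximation prevalence ≈ incidence × 1/(recovery rate). All rates and hazard ratios below are hypothetical and chosen only to mirror the pattern described (a larger relative effect on onset than on recovery); they are not estimates from the study.

```python
def approx_prevalence(onset_rate, recovery_rate):
    """Rare-disease approximation: incidence x average duration."""
    return onset_rate / recovery_rate


# Hypothetical rates per 6-month interval for people without the risk factor
baseline_onset = 0.05
baseline_recovery = 1.0

# Hypothetical effects of a risk factor
hr_onset = 3.0           # onset rate is tripled
inverse_hr_recovery = 1.5  # recovery rate is divided by 1.5

exposed_onset = baseline_onset * hr_onset
exposed_recovery = baseline_recovery / inverse_hr_recovery

prev_unexposed = approx_prevalence(baseline_onset, baseline_recovery)
prev_exposed = approx_prevalence(exposed_onset, exposed_recovery)
prevalence_ratio = prev_exposed / prev_unexposed  # hr_onset x inverse_hr_recovery

# Prevalence if only one of the two effects were removed
prev_without_onset_effect = approx_prevalence(baseline_onset, exposed_recovery)
prev_without_recovery_effect = approx_prevalence(exposed_onset, baseline_recovery)
```

Under this approximation the prevalence ratio is the product of the two relative effects, so when the effect on onset is the larger of the two, removing it lowers prevalence more than removing the effect on recovery.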

Proximal and distal risk factors

The risk factors that we have identified as time-independent tend to be distal in that they relate to historic events whereas time-dependent factors are more likely to be proximal to the outcome. We propose two explanations for the relative strength of time-dependent factors compared with time-independent ones (observation (b)) based on this distinction between proximal and distal factors. First, individuals may adjust to historical events with the passage of time so that the effects of distal risk factors weaken with time. Second, the effect of distal risk factors may be mediated by proximal factors. Causal pathways of this sort have been proposed by Bifulco et al.34,35 They suggest that negative childhood experiences (distal factors) are related to depression in adulthood because they may lead to a negative evaluation of self and difficulties with core relationships (proximal factors).

Correlation in the strength of risk factors for onset and recovery

The strong correlation between the size of a risk factor’s impact on onset and its impact on recovery is intuitive: a risk factor that greatly increases the risk of onset should also markedly slow recovery. It is perhaps most interesting therefore that there appear to be two exceptions to observation (c), namely gender and current harmful alcohol consumption. In the case of gender other studies have shown that whereas the rate of onset is greater in women, men and women both recover at the same rate.5,6,36–38 There are fewer data for harmful alcohol consumption. In keeping with our findings, harmful alcohol consumption has been identified as a risk factor for onset of depression in primary care,1 and in one large study of recovery there was no association with the rate of recovery.6 However, another study reported that frequent alcohol consumption was associated with slower recovery.5 In our analysis, the strength of the association between harmful alcohol consumption and onset was markedly reduced in the model that also incorporated time-independent risk factors. Thus the relationship between harmful alcohol consumption and the onset of depression may be attributable to the confounding effect of time-independent risk factors.

Strengths and limitations of the study

The risk factors considered in this study are limited to those examined in the original PredictD study. In a more recent study, poor sibling relationships during childhood were implicated in the onset of depression in adulthood.39 This was not one of the 39 risk factors examined in our study. In general, however, the 39 risk factors reflect the current state of knowledge.

This study used data collected from people attending general practices; these are not necessarily representative of the population as a whole. Although different age groups were equally represented, the proportion of women in the sample was high and Black and minority ethnic groups were underrepresented.3 The sample is also potentially biased towards people with psychological morbidity as this is more common among frequent attendees.40

Recruitment rates were high in all countries except the UK and The Netherlands, where the study may have been less actively endorsed by physicians.3 Together with the large sample size, high rates of response at follow-up made it possible to examine risk factors for both onset and recovery from depression within a single study. Unfortunately, there were insufficient data to make our results country-specific. Thus our findings are averaged across seven different countries. The extent to which this is significant depends on whether or not risk factors are mediated by their cultural setting.

Clinical implications

We are aware that our results arise from observational data and thus their application to the clinical management of depression in general practice needs to be treated with caution. Nonetheless, we feel that the data support the following conclusions. First, depression follows an illness model. Second, interventions aimed at prevention may be more successful at reducing the prevalence of depression than eliminating risk factors after onset. Third, attention to recent risk factors is likely to be more productive than focusing on past history.


This work was supported by The European Commission (PREDICT-QL4CT2002-00683).

  • Received April 9, 2009.
  • Revision received July 9, 2009.
  • Accepted August 5, 2009.

