Background The EURO–D, a 12-item self-report questionnaire for depression, was developed with the aim of facilitating cross-cultural research into late-life depression in Europe.
Aims To describe the national variation in depression symptoms and syndrome prevalence across ten European countries.
Method The EURO–D was administered to cross-sectional nationally representative samples of noninstitutionalised persons aged ≥50 years (n=22 777). The effects of age, gender, education and cognitive functioning on individual symptoms and EURO–D factor scores were estimated. Country-specific depression prevalence rates and mean factor scores were re-estimated, adjusted for these compositional effects.
Results The prevalence of all symptoms was higher in the Latin ethno-lingual group of countries, especially symptoms related to motivation. Women scored higher on affective suffering; older people and those with impaired verbal fluency scored higher on motivation.
Conclusions The prevalence of individual EURO–D symptoms and of probable depression (cut-off score ≥4) varied consistently between countries. Standardising for effects of age, gender, education and cognitive function suggested that these compositional factors did not account for the observed variation.
Depression is common in later life. However, there is considerable variation in reported prevalence between studies world-wide. There have been relatively few direct cross-national comparisons of the prevalence of depression using comparable methodology, particularly with respect to sampling, definition and assessment of outcome. Methodological differences between studies preclude firm conclusions about cross-cultural and geographical variation (Beekman et al, 1999). Improving the comparability of epidemiological research constitutes an important step forward. The EURO–D scale (Prince et al, 1999b) was developed to harmonise data on late-life depression from 11 European population-based studies (EURODEP). The EURO–D scale has more recently been administered in a large, collaborative household survey of nationally representative samples of people aged 50 years and over from ten European countries: the Survey of Health, Ageing and Retirement in Europe (SHARE). Observed differences in prevalence of depression in later life may be accounted for by methodological, compositional or contextual factors. In this analysis we sought to answer three questions:
Are there differences in the prevalence of depression between European countries, and are these consistent across the two factors that underlie the EURO–D measure and its 12 constituent symptom-based items?
Are there compositional differences between the older European populations in terms of age and gender (found previously to be associated with the ‘motivation’ and ‘affective suffering’ factors respectively) (Prince et al, 1999a), education, and cognitive function (previously hypothesised to be associated with the motivation factor) (Prince et al, 1999a)?
Do these compositional differences account, wholly or partly, for any observed differences in depression prevalence and EURO–D scale scores?
The SHARE study (Borsch-Supan et al, 2005) is a consortium survey of health in older people across Europe. In this study, national survey organisations were responsible for selecting household samples and conducting interviews in nationally representative samples of people aged 50 years and over from ten countries: Denmark, Sweden, The Netherlands, Germany, Austria, Switzerland, France, Spain, Italy and Greece.
The SHARE interview was specifically designed to cross-link with the US Health and Retirement Study (http://hrsonline.isr.umich.edu) and the English Longitudinal Study of Ageing (http://www.natcen.ac.uk/elsa), with the advantage that it encompasses international variation in culture, health and social welfare systems and public policy. Questions covered health variables (self-reported health, physical functioning, cognitive functioning, health behaviour, use of healthcare facilities), psychological variables (depression, well-being, life satisfaction), economic variables (current work activity, job characteristics, opportunities to work past retirement age, sources and composition of current income, wealth and consumption, housing and education) and social support variables (assistance within families, transfers of income and assets, social networks, volunteer activities). All of the above topics were rated in an interview conducted in the respondent’s home, with an average interview duration of around 90 min. Response rates were acceptable throughout. The data are freely available to the research community (http://www.share-project.org).
The EURO–D was originally developed to compare symptoms of depression in 11 European centres (Prince et al, 1999b). Its items are derived from the Geriatric Mental State examination (GMS; Copeland et al, 1986) and cover 12 symptom domains: depressed mood, pessimism, suicidality, guilt, sleep, interest, irritability, appetite, fatigue, concentration, enjoyment and tearfulness. Each item is scored 0 (symptom not present) or 1 (symptom present), and item scores are summed to produce a scale with a minimum score of zero and a maximum of 12.
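The scoring rule described above can be illustrated with a minimal sketch. The item keys below are hypothetical labels for the 12 symptom domains, not the variable names used in SHARE, and raising an error on missing items is an assumption, since the handling of missing responses is not described here. The cut-off of 4 or more is the one used for probable depression in this paper.

```python
from typing import Mapping

# Hypothetical labels for the 12 EURO-D symptom domains listed in the text.
EURO_D_ITEMS = (
    "depressed_mood", "pessimism", "suicidality", "guilt", "sleep",
    "interest", "irritability", "appetite", "fatigue", "concentration",
    "enjoyment", "tearfulness",
)

# A score of 4 or more is treated as probable (case-level) depression.
CASE_CUTOFF = 4


def euro_d_score(responses: Mapping[str, int]) -> int:
    """Sum the 12 binary items (0 = absent, 1 = present) into a 0-12 score."""
    missing = [item for item in EURO_D_ITEMS if item not in responses]
    if missing:
        # Assumption: incomplete questionnaires are rejected, not imputed.
        raise ValueError(f"missing items: {missing}")
    for item in EURO_D_ITEMS:
        if responses[item] not in (0, 1):
            raise ValueError(f"item {item!r} must be scored 0 or 1")
    return sum(responses[item] for item in EURO_D_ITEMS)


def is_probable_case(responses: Mapping[str, int]) -> bool:
    """Apply the cut-off of 4 or more to the summed score."""
    return euro_d_score(responses) >= CASE_CUTOFF
```

A respondent endorsing depressed mood, sleep disturbance and fatigue only would score 3 and fall below the cut-off; endorsing one further item would reach it.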
The psychometric properties of the EURO–D have been extensively investigated and criterion validity demonstrated in the cross-cultural context. Principal components analysis generated two factors (affective suffering and motivation) that were common to nearly every participating European country in the EURODEP studies (Prince et al, 1999b) and for Indian, Latin-American and Caribbean centres in the 10/66 Dementia Research Group pilot studies (Prince et al, 2004). Subsequent analysis of the EURO–D in the SHARE data-set using confirmatory factor analysis confirmed the two-factor solution of the EURO–D and suggested measurement invariance across the ten countries (common factor loadings and item calibrations), at least for the ‘affective suffering’ factor (Castro-Costa et al, 2007). Criterion validity for this measure was demonstrated in each of the EURODEP study sites, with an optimal cut-off point of a score of 4 or above against a variety of criteria for clinically significant depression (Prince et al, 1999b). The EURO–D was also found to be reliable and was validated against the criterion of DSM–III–R depression in older people in Spain (Larraga et al, 2006).
The following aspects of cognitive function were measured in all participants: memory, using delayed recall of a ten-word list in wide international use (Ganguli et al, 1996; Prince et al, 2003), the only difference being that in our study this was presented once only in the learning phase, as opposed to the conventional three presentations; and verbal fluency, measured using the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) animal naming task (Goodglass & Kaplan, 1983). Other factors considered in the analysis were age, gender and duration of education.
Country-specific prevalences of all 12 EURO–D items were derived, as were prevalence of EURO–D scores of 4 or more, and country-specific mean scores for the affective suffering and motivation sub-scales. The independent effects of gender, age (<75 years v. 75+ years), duration of education (11+ years v. <11 years), verbal fluency score (<10 v. 10+ animals named) and memory (3+ v. <3 words recalled) upon individual symptoms were estimated in each country as mutually adjusted prevalence ratios from Poisson regression models with 95% confidence intervals. For each covariate, an effect by country interaction term was added to the final model to test for heterogeneity. Associations with factor scores for the two sub-scales were estimated as eta-squared statistics derived from generalised linear modelling (GLM), again with effect by country interaction terms fitted in the final stage. Finally, ‘affective suffering’ and ‘motivation’ scores were estimated, adjusting for age, gender, education, verbal fluency and memory using GLM, and the country-specific prevalence of EURO–D depression (a score of 4 or over) was standardised separately for age, gender, education, verbal fluency and memory and for all effects simultaneously, using the direct method to the age, gender, education and verbal fluency distribution of the pooled data-set. All analyses were conducted with Stata version 9.1 for Windows.
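Direct standardisation to the pooled distribution, as described above, can be sketched as follows. This is an illustration under stated assumptions, not the Stata procedure used in the analysis: the function name and record layout are hypothetical, a stratum is any hashable label (for example a tuple of age group, gender, education and verbal fluency category), and strata that are empty in a given country are skipped with the remaining weights renormalised, which is one possible convention among several.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

# Hypothetical record layout: (country, stratum label, case indicator 0/1).
Record = Tuple[str, Hashable, int]


def direct_standardise(records: Iterable[Record]) -> Dict[str, float]:
    """Country prevalences standardised to the pooled stratum distribution."""
    records = list(records)
    pooled_n = defaultdict(int)  # stratum -> respondents in pooled data
    n = defaultdict(int)         # (country, stratum) -> respondents
    cases = defaultdict(int)     # (country, stratum) -> cases
    for country, stratum, case in records:
        pooled_n[stratum] += 1
        n[(country, stratum)] += 1
        cases[(country, stratum)] += case

    # Standard population weights: share of each stratum in the pooled data.
    total = len(records)
    weights = {s: k / total for s, k in pooled_n.items()}

    result = {}
    for country in {c for c, _, _ in records}:
        weighted_rate = 0.0
        covered = 0.0
        for stratum, w in weights.items():
            size = n[(country, stratum)]
            if size == 0:
                continue  # assumption: skip strata empty in this country
            weighted_rate += w * cases[(country, stratum)] / size
            covered += w
        # Renormalise over the strata actually observed in this country.
        result[country] = weighted_rate / covered
    return result
```

If two countries have identical stratum-specific prevalences but different age or education structures, their crude prevalences differ while their standardised prevalences coincide. Standardised estimates that remain as far apart as the crude ones therefore indicate that composition on the standardising variables does not explain the gap.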
Based on probability samples in all participating countries, the SHARE sampling strategy aimed to represent the noninstitutionalised population aged 50 years and older. The SHARE investigators have compared sample characteristics with other data sources, indicating that although there were some expected divergences, this objective has been broadly achieved (Borsch-Supan et al, 2005).
The characteristics of the sample by country, gender and age and also household and individual response rates have been reported in detail elsewhere (Borsch-Supan et al, 2005). The proportion of households responding was 57.4% overall, with the lowest response in Switzerland (37.6%) and the highest in France (69.4%). Individual response proportions ranged from 73.8% (Italy) to 93.0% (Denmark), with a rate of 86.0% overall. The principal characteristics of the respondents in each country are summarised in Table 1. The distribution of gender and age did not differ between countries. In the pooled data-set, 54.5% were female, and the mean age was 64.7 years (s.d.=10.0). Educational levels were lowest in the Latin countries (France, Italy and Spain) and in Greece. Participants from these countries also recorded the lowest (most impaired) scores on both the animal naming task and delayed recall of the ten-word list. In most countries more than half of the sample were retired, the exceptions being The Netherlands (32%), Spain (35%), Switzerland (45%) and Greece (45%). Mean EURO–D scores were statistically different between countries (F=68.79; P<0.00001) and were highest in France, Spain and Italy. The prevalence rates of individual depressive symptoms are displayed in Figs 1 and 2 for symptoms loading principally on affective suffering and motivation respectively. These varied consistently and significantly between countries (P<0.001) for all 12 symptoms, with a higher prevalence in France, Spain and Italy. Among affective suffering symptoms, depressed mood, tearfulness, fatigue and sleep disturbance were most common. Among motivation symptoms, poor concentration was reported most frequently.
Associations with affective suffering symptoms are summarised in Table 2: depression, tearfulness, suicidality and fatigue were selected because they have the highest factor loading for affective suffering (Castro-Costa et al, 2007). The four individual symptoms and the overall factor score were each consistently associated with gender, with higher prevalence of symptoms and higher factor scores among women, with negligible influence of age, education or cognitive function. None of these effects varied significantly between countries.
Associations with motivation symptoms are summarised in Table 3. Enjoyment, pessimism and interest were chosen because of their high factor loading for motivation (Castro-Costa et al, 2007). The three individual symptoms and the overall factor score were consistently and strongly associated with age, with a higher prevalence of symptoms and higher factor scores among older people. Effects of gender and education were negligible. Motivation symptoms and factor score were also strongly associated with lower verbal fluency, but were not associated with impaired memory. None of these associations varied significantly between countries.
The prevalence of case-level depression according to the EURO–D scale is summarised and compared between nations in Table 4. Consistent with the observations for individual EURO–D symptoms, the highest prevalence rates were found in France, Italy and Spain, with a difference of 9 percentage points between the lowest of these (33% in France) and the next highest country (24% in Greece). Prevalence in the remaining countries was 18–19%. Heterogeneity between countries was statistically significant (P<0.001). The pattern and extent of between-country differences were not affected by direct standardisation for gender, age, education, verbal fluency or memory.
In this study the prevalence of individual EURO–D symptoms varied consistently between countries, with a higher prevalence in three Latin countries: France, Italy and Spain. Prevalence of probable depression according to the EURO–D cut-off score of 4 or more therefore followed the same distribution. The distribution of age and gender was similar across the ten countries, but educational levels were lower and cognitive function more impaired in France, Italy, Spain and Greece. Standardising for the effects of age, gender, education and cognitive function suggested that compositional differences in these factors did not account for the observed variation in the prevalence of depression.
Strengths and weaknesses of the study design
This study has the advantage of using data from nationally representative samples of those aged 50 years and over from ten European countries. Strong conceptual validity, high internal consistency and a common factor structure among different European centres were previously demonstrated for the depression assessment, the EURO–D (Prince et al, 1999b). Furthermore, these favourable psychometric properties were confirmed in an earlier analysis of SHARE data using more advanced psychometric techniques – confirmatory factor analysis and Rasch modelling – to support the cross-cultural validity of the measure (Castro-Costa et al, 2007). These analyses provided further robust evidence to support a two-factor solution: affective suffering (well characterised, and invariant across cultures) and motivation (less well characterised and variable across cultures). As with the mental health survey undertaken by the European Study of the Epidemiology of Mental Disorders (ESEMeD; Alonso et al, 2004), the SHARE data are limited by the relatively low proportion of households and individuals responding. This may, unfortunately, represent a secular trend in more developed countries. The net effect may be an underestimation of the true prevalence of depression (Eaton et al, 1992; De Graaf et al, 2000). We used a simple scale-based assessment for depression rather than a comprehensive clinical diagnostic interview. Nevertheless, the EURO–D and its cut-off point of 4 or more have previously been validated against relevant clinical assessments in different European settings (Prince et al, 1999b).
Consistency with findings from other research
Findings from the SHARE survey are most directly comparable with those of the EURODEP consortium studies, in which the same outcome measure (the EURO–D) was administered to older adults in cross-sectional population-based surveys. In descending order, the mean EURO–D scores for each of the EURODEP centres that used the GMS were: Munich (Germany), 3.58; London (UK), 2.54; Berlin (Germany), 2.48; Iceland, 2.03; Amsterdam (The Netherlands), 1.98; Verona (Italy), 1.84; Liverpool (UK), 1.79; Zaragoza (Spain), 1.61; and Dublin (Ireland), 1.34 (Prince et al, 1999a). The EURODEP findings do not, therefore, support our finding of higher levels of reported depression symptoms in Latin countries. However, there are important differences between the SHARE and EURODEP studies. First, none of the EURODEP centres used nationally representative samples, and the age range was 65 years and over rather than 50 years and over as in SHARE. Second, the EURO–D items were nested within the more comprehensive GMS clinical interview. Finally, in several centres the GMS was administered by clinicians working for university research groups rather than by lay interviewers working for survey organisations as with SHARE. Nevertheless, several findings in our analysis are consistent with those of EURODEP: motivation factor scores and EURO–D scores increase with age, and affective suffering scores and EURO–D scores are higher in women than in men (Prince et al, 1999a). The EURODEP investigators postulated that some of the effect of age on motivation factor scores might have been accounted for by cognitive impairment (cognitive assessments were not available from most of the EURODEP centres, so this could not be tested directly). This hypothesis is supported in the current analysis, but it is impairment in verbal fluency rather than memory that seems to be mediating or confounding the effect of age on motivation factor scores.
As with the EURODEP analyses, the differences between SHARE countries in the distribution of age and gender could not account for between-country differences in depression symptoms. We have further demonstrated that compositional differences in education and cognitive functioning are also not relevant.
Comparisons with other European surveys are limited by the different age ranges and different outcomes studied. For instance, the ESEMeD study (Alonso et al, 2004), as part of the wider World Mental Health survey, used the World Health Organization’s Composite International Diagnostic Interview (CIDI) to estimate the prevalence of mood disorder (DSM–IV bipolar disorder, major depression and dysthymia) in nationally representative samples of all those aged 18 years and over in seven European countries. In descending order, the prevalence of mood disorder was: Ukraine, 9.1%; France, 8.5%; The Netherlands, 6.9%; Belgium, 6.2%; Spain, 4.9%; Italy, 3.8%; and Germany, 3.6% (Alonso et al, 2004). The Outcome of Depression International Network (ODIN) study used a two-phase design (the Beck Depression Inventory for screening in the first phase and the Schedule for Clinical Assessment in Neuropsychiatry for definitive clinical diagnoses in the second phase) in locally representative samples from five European countries: the prevalence of any ICD–10 depressive disorder varied from 17.1% in Liverpool (UK) and 12.3% in Dublin (Ireland) to 2.6% in Santander (Spain) (Ayuso-Mateos et al, 2001). Findings from surveys using structured clinical diagnostic assessments of predominantly younger adult samples are therefore not consistent with the ethno-cultural distribution of reported depression symptoms observed in the SHARE study.
Cultural and methodological effects on measurement
We have previously demonstrated, using the SHARE data (Castro-Costa et al, 2007), that the EURO–D had promisingly invariant measurement properties – that is, the factor structures, factor loadings and hierarchical measurement properties were similar across all ten centres. Similar characteristics were observed internationally for the CIDI major depression items in the WHO international study of psychological problems in general healthcare (Simon et al, 2002). The investigators in the latter study remind us that invariant measurement properties do not preclude the possibility of threshold effects, whereby the severity with which a symptom is experienced before the relevant item is endorsed may vary between cultural settings, and that this may account for cultural differences in prevalence (Simon et al, 2002). In other domains of health assessment innovative techniques are being developed to adjust for such effects, using vignettes to estimate and then adjust for threshold differences between populations (Salomon et al, 2004); these could in principle be applied to the assessment of depression. Earlier studies of patterns of responses to items of the Center for Epidemiologic Studies Depression Scale between minority ethnic groups in the USA (Iwata et al, 1995) and in undergraduate students in east Asia and in North and South America (Iwata & Buka, 2002) indicated greater cross-cultural variability in responses to positively worded compared with negatively worded items. Interestingly, the same pattern was observed for the EURO–D in both the EURODEP and SHARE studies. The motivation items (for which there was greater variation in item prevalence and factor scores) are all positively worded, whereas the affective suffering items are negatively worded. Some of this between-country variation may therefore reflect cultural and linguistic differences in appraising and responding to positively worded items, rather than national differences in psychological morbidity. 
This could not, however, have accounted for the heavier load of depressive symptoms in Latin countries observed in the SHARE study, given that the higher symptom prevalence in France, Italy and Spain was observed for both affective suffering and motivation symptoms (see Figs 1 and 2).
The specific association between verbal fluency (but not memory) and motivation (but not affective suffering) calls into question the construct validity of the motivation components of the EURO–D scale, which may be measuring the effects of subcortical brain damage (apathy and slowing) rather than depression per se. Given that the EURO–D items were originally selected because they were present in all or most of the five late-life depression assessments used in the EURODEP studies, this is likely to be a general problem. It would be interesting to re-explore the concept of ‘vascular depression’ (Alexopoulos et al, 1997), using the affective suffering and motivation components. It is tempting to conclude that the affective suffering component might be the more valid measure of psychological morbidity, as well as providing a more psychometrically appropriate tool for cross-cultural comparison (Castro-Costa et al, 2007).
Given the pattern of findings, it is tempting to conclude that variation in prevalence of depressive symptoms and syndromes in older European men and women may be best understood in terms of ethnocultural differences, with a higher prevalence recorded in the Latin nations (France, Italy and Spain) than in the Germanic (Sweden, Denmark, Germany, The Netherlands) and Hellenic (Greece) countries. However, although we have excluded here the compositional effects of major determinants of depression prevalence – age, gender, education and cognitive function – it remains possible that other risk exposures, differently distributed between countries, might have accounted for the observed variation. Even if compositional effects could be confidently excluded, it would remain difficult to identify the contextual effect or effects that might be responsible. Language and culture are contextual effects, in that they are the property of the population (country, in this case) with no meaningful individual-level variation. Other contextual effects may be important, for example income inequality (Muramatsu, 2003), social capital (LaGory, 1992) and religiosity (Braam et al, 2001). It is possible in principle to study contextual effects and to disentangle their impact from that of compositional effects, using multilevel modelling approaches. This will be the focus of a further analysis.
The SHARE survey was funded primarily by the European Commission through the Fifth Framework Programme (project QLK6-CT-2001-00360 in the thematic programme Quality of Life). Additional funding came from the US National Institute on Aging (U01 AG09740-13S2, P01 AG005842, P01 AG08291, P30 AG12815, Y1-AG-4553-01 and OGHA 04-064). Data collection in Austria (through the Austrian Science Fund, FWF), Belgium (through the Belgian Science Policy Office) and Switzerland (through BBW/OFES/UFES) was funded within those countries. E.C.-C. was supported for this analysis by the Social Psychiatry Research Trust.
- Received February 9, 2007.
- Revision received May 25, 2007.
- Accepted July 10, 2007.
- © 2007 Royal College of Psychiatrists