Declaration of interest
P.B. and M.B. own a company that has developed the online intervention used in this study and that develops and distributes computerised lifestyle interventions.
Brief interventions can be efficacious in changing alcohol consumption and increasingly take advantage of the internet to reach high-risk populations such as students.
To evaluate the effectiveness of a brief online intervention, controlling for the possible effects of the research process.
A three-arm parallel groups design was used to explore the magnitude of the feedback and assessment component effects. The three groups were: alcohol assessment and feedback (group 1); alcohol assessment only without feedback (group 2); and no contact, and thus neither assessment nor feedback (group 3). Outcomes were evaluated after 3 months via an invitation to participate in a brief cross-sectional lifestyle survey. The study was undertaken in two universities randomising the email addresses of all 14 910 students (the AMADEUS-1 study, trial registration: ISRCTN28328154).
Overall, 52% (n = 7809) of students completed follow-up, with small differences in attrition between the three groups. For each of the two primary outcomes, there was one statistically significant difference between groups, with group 1 having 3.7% fewer risky drinkers at follow-up than group 3 (P = 0.006) and group 2 scoring 0.16 points lower than group 3 on the three alcohol consumption questions from the Alcohol Use Disorders Identification Test (AUDIT-C) (P = 0.039).
This study provides some evidence of population-level benefit attained through intervening with individual students.
Alcohol causes huge problems, both for population health and for society more broadly.1 Heavy drinking among university students is a global phenomenon, and Swedish students drink heavily, with heavy episodic drinking normative.2 Effective interventions have the potential to alter acute risk of car crashes, violence and suicide, the leading causes of death among young people globally3 and chronic risk of the longer-term health and psychosocial consequences of alcohol consumption.4 Population-level interventions that seek to influence the price, availability and cultural acceptability of hazardous and harmful drinking are likely to be most effective in reducing these problems.5 These may be complemented by individual-level interventions delivered in health services and elsewhere.6
Brief feedback and computerised interventions have the capacity to alter behaviour and reduce alcohol problems in student populations.7-10 This research literature is, however, both recent and evolving quickly and many important questions have not yet been addressed. These include the effect sizes to be expected, the conditions under which they are obtained, population moderators and intervention content mediators (for example is individualised feedback required, and if so should it be normative), as well as questions to do with different delivery models and cross-cultural variability.8,11 A key limitation of existing evidence is the paucity of multisite large-scale effectiveness trials of internet interventions.12,13 Such studies must address significant methodological challenges common to other areas of e-health research including management of assessment reactivity in seeking to detect subtle effects on behaviour14 and attrition.15 Following an earlier Swedish effectiveness study,16 Sweden became the first country to implement a policy based on the accumulating international evidence-base in the form of a national system. It is now routine practice for university students to receive an email from student healthcare services inviting them to participate in a brief online alcohol intervention.11 New Zealand has recently decided to follow suit,17 and it is anticipated that other countries will implement local, regional or national systems. This study aimed to evaluate the effectiveness of a brief online intervention, part of the national strategic response in Sweden, controlling for the possible effects of the research process.
Study design and hypotheses
The lack of any standard timing of email delivery was exploited to undertake a randomised trial in the newly formed national system (trial registration: ISRCTN28328154).
This was a three-arm parallel group trial in which routine provision of assessment and feedback via email (group 1) was compared with assessment only (group 2) and a no-contact control (group 3). With this dismantling design we sought to assess possible overall and component effects, with equivalent numbers in each arm. Groups 1 and 2 completed identical assessments, the sole difference between them being that group 1 received normative and other feedback as usual, whereas group 2 did not. There were four pre-specified hypotheses tested, including three tests of universal prevention among university students in general and a per-protocol evaluation of the specific effects of feedback among baseline risky drinkers only.11 This design thus nests one conventional intervention effectiveness evaluation within a rare study of the effects of the universal provision of individualised intervention in the population as a whole. A key feature of this study design is that all participants were entirely unaware they were participating in a trial at any point (see Masking) specifically in order to minimise any possible effects of the research process.
The study was undertaken simultaneously in Linköping and Luleå universities in Sweden. These were selected on the basis of previously conducted research involving the local student healthcare services responsible for alcohol interventions. All students in semesters one, three or five of their studies during the autumn 2011 term were randomised via email addresses, through which all official mail is delivered. Randomisation was computerised (programmed by M.B.) and did not employ any strata or blocks and was not possible to subvert, as this and all subsequent study processes were fully automated. Study procedures are fully detailed in the study protocol.11
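As a rough illustration of computerised allocation without strata or blocks, the sketch below shuffles a list of email addresses and deals them into three arms. It is not the trial's actual randomisation program, which is not described at this level of detail. The function name `allocate`, the seed and the example addresses are hypothetical, and dealing the shuffled list round-robin to obtain near-equal arm sizes is an assumption.

```python
import random


def allocate(emails, n_arms=3, seed=None):
    """Randomly allocate email addresses to arms numbered 1..n_arms."""
    rng = random.Random(seed)
    shuffled = list(emails)
    rng.shuffle(shuffled)
    # Deal the shuffled addresses round-robin so arm sizes differ by at most one.
    return {email: (i % n_arms) + 1 for i, email in enumerate(shuffled)}


# Hypothetical addresses; 14 910 matches the number of students randomised.
emails = [f"student{i}@example.edu" for i in range(14910)]
allocation = allocate(emails, seed=2011)
sizes = {arm: sum(1 for a in allocation.values() if a == arm) for arm in (1, 2, 3)}
```

Because the whole procedure is a single automated step with no human input after the address list is supplied, there is no point at which allocation could be influenced by knowledge of the participants.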
Groups 1 and 2 received an email from the student healthcare services on 5 September 2011 and completed an alcohol assessment instrument comprising ten items. The only difference between the emails was that group 1 were advised they would also get feedback, which they then received, whereas group 2 were simply thanked for their participation and offered a link to a commonly used alcohol website without content understood to be effective in assisting behaviour change. A demonstration version of the assessment and feedback intervention can be viewed at http://demo.livsstilstest.nu.
Three months later all three groups were sent an identical email from the Swedish principal investigator (P.B.). This made no reference to alcohol nor to the previous email and comprised an invitation to participate in an online lifestyle survey with a 15-item questionnaire. Trial outcomes were derived from the three alcohol questions (see Outcome evaluation) in this survey. There were three reminders containing a link to the questionnaire. The final reminder also provided an option of completing three brief questions in the body of an email, only one of which measured drinking (heavy episodic drinking) in order to preserve masking. Follow-up data collection was completed on 21 December 2011.
All three groups were unaware they were participating in an intervention study and that they had been randomised. Instead, at follow-up they were invited to participate in a seemingly unrelated cross-sectional lifestyle survey without any particular focus on alcohol. The non-alcohol nature of this invitation stemmed from our large pilot trial in which we obtained higher rates of follow-up than had been observed previously in Sweden, but also found markedly higher participation rates in group 3.18 We thus chose to conceal the alcohol study focus at follow-up. The use of masking and deception in this trial raises ethical issues that were considered and approved by the Regional Ethical Committee in Linköping, Sweden (No. 2010/291-31). We later undertook a focus group study debriefing two groups of AMADEUS-1 participants as part of an ethical evaluation and subsequently debriefed all participants about the nature of the study.
The first three items of the Alcohol Use Disorders Identification Test (AUDIT-C19) were embedded within the 15-item lifestyle survey alongside questions on smoking, diet, physical activity and sociodemographic characteristics. The two primary outcomes were AUDIT-C scores and the prevalence of risky drinking. Risky drinking was defined according to Swedish criteria as either at least one heavy drinking episode in the past month (5 or more drinks, each containing 12 g of alcohol, for men; 4 or more drinks for women) or weekly consumption of more than 14 drinks for men or more than 9 drinks for women.20 The three secondary outcomes were the component items of the AUDIT-C: (1) frequency of drinking; (2) typical quantity consumed; and (3) number of heavy episodes of drinking per month. The psychometric properties of the AUDIT-C when administered online have been established in student populations.21
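The risky drinking definition combines two criteria, either of which is sufficient. A minimal sketch of that rule follows. The function and parameter names are hypothetical, and the inputs are assumed to be already derived from the survey responses (how weekly consumption was computed from the AUDIT-C items is not shown here).

```python
# Thresholds taken from the definition in the text (one drink = 12 g of alcohol).
HEAVY_EPISODE_DRINKS = {"male": 5, "female": 4}  # drinks on one occasion, at or above
WEEKLY_LIMIT_DRINKS = {"male": 14, "female": 9}  # drinks per week, strictly above


def is_heavy_episode(sex, drinks_on_occasion):
    """True if a single drinking occasion counts as a heavy episode."""
    return drinks_on_occasion >= HEAVY_EPISODE_DRINKS[sex]


def is_risky_drinker(sex, heavy_episodes_past_month, weekly_drinks):
    """True if either criterion of the risky drinking definition is met."""
    if sex not in WEEKLY_LIMIT_DRINKS:
        raise ValueError("sex must be 'male' or 'female'")
    return heavy_episodes_past_month >= 1 or weekly_drinks > WEEKLY_LIMIT_DRINKS[sex]
```

Note the asymmetry in the rule: the heavy episode threshold is inclusive (5 or more, 4 or more), whereas the weekly limit is exclusive (more than 14, more than 9).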
By necessity, inferences involving group 3 can only be drawn in entirely unselected populations, as there were no baseline data. We declared a priori our approach as highly conservative, unavoidably including data biased towards the null (both from non-participants at baseline and those who are not risky drinkers).11 We undertook an additional analysis (reported in online Table DS2) and acknowledge that testing the specific effects of feedback in the per-protocol analysis among baseline risky drinkers involves a departure from the intention-to-treat principle.11
The pilot study indicated that any between-group differences were likely to be very small.18 To detect an effect size of 0.08 standard deviations between any two groups with 5% significance level and 80% power, we required 2500 individuals analysed per group. Assuming a follow-up rate of 50%, we therefore aimed to recruit 5000 individuals per group.
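The figure of roughly 2500 per group can be checked with the standard normal-approximation formula for a two-sided comparison of two means, n = 2(z_{1-α/2} + z_{1-β})² / d², where d is the standardised effect size. The sketch below is illustrative only: whether the authors used this formula or dedicated software is an assumption, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Analysed sample size per group to detect a standardised difference d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


def n_to_recruit(n_analysed, follow_up_rate):
    """Inflate the analysed sample size for the expected follow-up rate."""
    return ceil(n_analysed / follow_up_rate)


n_analysed = n_per_group(0.08)
n_recruited = n_to_recruit(n_analysed, 0.50)
```

Under these assumptions the formula gives about 2450 analysed per group and about 4900 to recruit, consistent with the rounded targets of 2500 and 5000 stated above.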
All analyses were restricted to individuals reporting outcome data and thus assumed that missing data were missing at random. To assess this assumption, we compared AUDIT-C and heavy episodic drinking outcomes between individuals who responded via the online questionnaire after the initial email and after the first, second and third reminder emails, and compared heavy episodic drinking between those who responded via the online questionnaire and those who responded in the body of the final reminder email. Any trend across these groups would suggest a missing not at random mechanism, because missing outcomes are likely to resemble those of late responders more than those of early responders.22 We also undertook pre-specified analyses of possible effect modification by university, term, age, gender and baseline drinking. We added an unplanned sixth outcome measure, combining drinking frequency and quantity (AUDIT items 1 and 2) to form a total weekly consumption outcome.
Differences in proportions between groups were examined with chi-squared tests and mean differences by Student’s t-test. Logarithmic transformations were used to reduce skewness. Effect sizes were calculated as standardised mean differences (Cohen’s d23) and odds ratios as appropriate. Multivariate linear regression (for AUDIT-C outcomes) and multivariate logistic regression (for binary risky drinking outcomes) were used to adjust intervention effects for baseline covariates. In intention-to-treat analyses comparing groups 1, 2 and 3, baseline covariates were gender, age, term and university. In per-protocol analyses comparing groups 1 and 2, baseline covariates additionally included log-transformed weekly alcohol consumption. Interactions between intervention group and predictor variables were tested by comparing models excluding and including the interaction parameters using the F-test (for linear regression) or the likelihood ratio test (for logistic regression). For analysis of missing data, AUDIT-C and heavy episodic drinking (both log-transformed) were compared between the initial and first, second and third reminder emails using linear regression. A t-test further compared heavy episodic drinking among those who responded in the body of the final email with all those who responded via the online questionnaire. All tests were performed two-sided at P<0.05. The statistical analyses were performed using SPSS 19 and Stata 12 in Windows.
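For readers less familiar with the two effect size measures named above, the following sketch shows how a standardised mean difference (Cohen's d with a pooled standard deviation) and an odds ratio are conventionally computed. It does not reproduce the SPSS or Stata analyses. The function names are hypothetical, and in the study the continuous outcomes were log-transformed before analysis, which is not shown here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(x, y):
    """Standardised mean difference between samples x and y (pooled SD)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds of the event in group a divided by the odds in group b."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Made-up numbers for illustration only; these are not study data.
d_example = cohens_d([1, 2, 3], [2, 3, 4])
or_example = odds_ratio(50, 100, 40, 100)
```

These are unadjusted quantities. The adjusted estimates reported in the paper come from regression models that include the baseline covariates listed above.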
There were small statistically significant differences in the participation rates at baseline (36% and 33% in groups 1 and 2) and at follow-up (51%, 52% and 54% in groups 1, 2 and 3 respectively; see Table 1 and Fig. 1). The proportion who were risky drinkers at baseline was similar in groups 1 and 2 (64% and 63%) as were levels of attrition in these groups (70% and 72% respectively). The sociodemographic and university characteristics of the follow-up participants in all three groups did not differ (Table 2) nor were there differences at baseline between the two groups for the per-protocol analyses in sociodemographic, university or alcohol consumption data (online Table DS1).
The intention-to-treat analyses included those in all three groups who took part in the follow-up survey, regardless of earlier participation. The prevalence of risky drinking (one of two primary outcomes) was higher in group 3 compared with group 1 by approximately 3.7% (P = 0.006 in the multivariate model) and AUDIT item 3 (heavy episodic drinking) was statistically significantly higher in the multivariate model only (P = 0.044; Table 3). In addition to these assessment and feedback effects, assessment-only effects were apparent in the AUDIT-C total score primary outcome (P = 0.039 multivariate P-value) and AUDIT item 3 (P = 0.036 multivariate P-value) in comparisons between group 2 and group 3 (Table 3). There were consistently small differences in the anticipated direction in comparisons with group 3, which were possibly as a result of chance, with P<0.1 for four of five comparisons for both groups 1 and 2. Restricting the sample to those scoring 2+ on AUDIT item 1 at follow-up diminished any small between-group differences (online Table DS2). There were no differences between groups 1 and 2 in the intention-to-treat analyses, nor any which attained statistical significance in univariate comparisons.
Groups 1 and 2 did not differ to a statistically significant degree on either primary or secondary outcomes in the per-protocol analyses (Table 4). For the additional originally unplanned outcome of total weekly consumption, both the unadjusted test and the ANCOVA showed a statistically significant difference between the two groups (Table 4). There was no evidence of effect modification among five possible effect modifiers tested for both primary outcomes. Among those who responded via the online questionnaire, there was no evidence of differences in mean AUDIT-C scores or heavy episodic drinking at follow-up between responders after the initial and first, second and third reminder emails (AUDIT-C geometric mean: first email, 3.51; second email, 3.55; third email, 3.40; fourth email, 3.30; P = 0.204; heavy episodic drinking geometric mean: first email, 1.08; second email, 1.10; third email, 1.10; fourth email, 0.98; P = 0.447). However, there was evidence of a difference (P<0.001) in mean reported heavy episodic drinking between the 546 responders in the body of the email (geometric mean heavy episodic drinking: 1.46) and the 7809 responders via the online questionnaire (geometric mean heavy episodic drinking: 1.08).
This study found very small between-group differences in both primary and secondary outcomes favouring both assessment and feedback (group 1) and assessment only (group 2) in comparison with no contact (group 3). These differences were consistently in the anticipated direction, and some attained statistical significance. They provide evidence of a population-level effect obtained through very brief and simple individual-level interventions. However, this statement should not be interpreted to suggest that there is strong evidence of intervention benefit and any effects beyond those obtained after 3 months should be expected to deteriorate in the longer term.24 That many small differences were not statistically significant, even with such a large sample size, demonstrates just how small any such effects actually are. This study was highly naturalistic, comprising an unobtrusive evaluation of a national system in which student health services send emails to university students. We were aware at the outset that our evaluation strategy was conservative with populations randomised and compared regardless of their need for intervention, interest in participation and motivation to change, and also that the identification of even small effects is not without public health significance.11,18 This study design and the findings are novel not only for alcohol, but also for other health behaviours such as tobacco smoking for which it is desirable to intervene with individuals for population-level prevention purposes, now facilitated by the internet.25
Strengths and limitations
A highly pragmatic evaluation was designed to minimise interference by research artefacts stemming from intervention study participation. This naturalistic study context, although having certain methodological advantages, also imposed limitations, most notably in participation rates and consequent potential for selection biases. Offering feedback in the initial email led to greater uptake in group 1 compared with group 2. Differential participation rates at baseline could bias outcomes away from the null in comparisons of groups 1 and 2, increasing the risk of selection bias in the per-protocol analysis in particular, whereas differential participation rates at follow-up are more difficult to interpret but may bias comparisons of all three groups. The overall follow-up rate of 52% is fairly unremarkable in the context of e-health trials, being higher than many and not as high as some.26 Although other analyses suggested missing data were missing at random, the higher reported mean heavy episodic drinking among responders via email, compared with the large majority who responded via the online survey, provides further grounds for caution in interpreting these results. This suggests either a missing not at random mechanism, with heavier drinkers being disproportionately missing, or a mode effect.
The AUDIT-C provides limited capacity to identify intervention effects, although such a short instrument served masking well. Outcomes were self-reported and although computerised data collection may minimise social desirability bias,27 there is a need to study the validity of self-reported data in brief alcohol intervention trials as validity established in treatment contexts provides only limited reassurance.28,29 The multiplicity of analyses should be borne in mind when interpreting study findings. Heterogeneity of findings from previous studies partly results from the detailed content of interventions evaluated, for example in the study by Moreira and colleagues30 normative feedback arrived some weeks after the assessment. There are also differences among trial design and methods used, meaning that even the most rigorous estimates of intervention effects are contingent on the detailed characteristics of the evaluation studies.31
This study has randomised and retained many more participants than any other alcohol study of which we are aware. Other study strengths include complete automation and minimisation of the potential for subversion of randomisation and observer bias in ascertainment of study outcomes. No previous study has been undertaken to evaluate a possible population-level impact of brief alcohol interventions.32 Indeed, there are no randomised studies and meagre time series data on alcohol programmes of any duration for any target group having an impact at a population level.32 Any such impact is thus noteworthy, especially when the costs of obtaining it are so low (annual cost per university is 6000 SEK, approximately £600 or US$920). Even if one assumes that the intervention was only delivered to those in groups 1 and 2, who participated at both baseline and follow-up (and thus ignoring subsequent delivery to all groups), the costs per intervention delivered are lower than in all previous brief intervention studies.33 This calculation does not include the initial developmental costs,34 and although we did not include an economic evaluation (another study limitation to be borne in mind) it is reasonable to assume cost-effectiveness would be demonstrated for these effects given that the costs per brief intervention delivered are extremely low. Reductions in risky drinking and AUDIT scores can also be translated into possible effects on the prevalence of clinical diagnoses.35
The effects of assessment alone observed here are striking. These findings extend previous randomised evaluations of assessment reactivity, almost entirely obtained in studies of hazardous and harmful drinkers.14 It appears that students who are drinking at levels that are not hazardous may change their behaviour after thinking about it when prompted by answering questions. It also appears that feedback may be additionally useful to such thinking among those who are drinking at hazardous or harmful levels. These findings are compatible with self-regulation theory,36 and there is a need to develop conceptual frameworks to guide further studies of these effects.14,37
This study was also designed to address key methodological issues. The differences between groups 2 and 3 are important because it has been shown that effect estimation depends on whether comparison is made with an assessed or an unassessed control group. Previous brief alcohol intervention studies have all used control conditions similar to group 2 rather than group 3. Group 2 constitutes a control condition that suffers fundamentally from contamination, as the data gathered for group 2 are entirely necessary to the delivery of the intervention received by group 1.14,38
This study has attempted both to evaluate a part of a national programme and to address important methodological problems in behavioural intervention trials. It makes clear the need to overcome certain methodological challenges associated with rigorous evaluation in behavioural intervention trials where the sought effects are subtle and thus vulnerable to interference by the research process. From an intervention perspective, study findings point towards the need to further develop content, delivery and tailoring methods, applying insights into behaviour change gained elsewhere. Experimentation with intervention timing as well as media such as smartphone applications may also be useful. This study provides useful evidence of both the need for, and potential benefit of, so doing. It indicates also the likely need to integrate these individual-level e-health interventions with other means to address high levels of alcohol consumption and problems associated with student drinking at a population level.6
To address problems within the general population, increasing the price of alcohol, better controlling its availability and restricting marketing to change the cultural acceptability of heavy drinking are most likely to be effective.5 Student drinking may even be more price sensitive than that in adult populations, and university campuses provide environments that lend themselves to outlet and marketing controls. It is a key weakness of the existing literature that we do not possess multilevel studies that explore the synergy or otherwise of individual-level interventions with those implemented at the community or population level in any group. Although this particular intervention targeted university students, evaluation of universal applications for adult general populations may be warranted, for example in primary care. The global burden of alcohol-related harm is likely to grow in the coming years. Very modest interventions similar to those evaluated here appear capable of making a small but important contribution to moving the entire distribution of alcohol consumption and related problems to the left, and thus to the strategy of preventive medicine.39
This study was funded by The Swedish Council For Working Life and Social Research (FAS), grant number 2010-0024 and through a Wellcome Trust Research Career Development Fellowship in Basic Biomedical Science to the first author (WT086516MA). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
- Received March 5, 2013.
- Revision received May 14, 2013.
- Accepted May 22, 2013.
- Royal College of Psychiatrists
Royal College of Psychiatrists. This paper accords with the Wellcome Trust Open Access policy and is governed by the licence available at http://www.rcpsych.ac.uk/pdf/Wellcome%20Trust%20licence.pdf