Background Concern is widespread about potential sponsorship influence on research, especially in pharmacoeconomic studies. Quantitative analysis of possible bias in such studies is limited.
Aims To determine whether there is an association between sponsorship and quantitative outcomes in pharmacoeconomic studies of antidepressants.
Method Using all identifiable articles with original comparative quantitative cost or cost-effectiveness outcomes for antidepressants, we performed contingency table analyses of study sponsorship and design v. study outcome.
Results Studies sponsored by selective serotonin reuptake inhibitor (SSRI) manufacturers favoured SSRIs over tricyclic antidepressants more than non-industry-sponsored studies. Studies sponsored by manufacturers of newer antidepressants favoured these drugs more than did non-industry-sponsored studies. Among industry-sponsored studies, modelling studies favoured the sponsor's drug more than did administrative studies. Industry-sponsored modelling studies were more favourable to industry than were non-industry-sponsored ones.
Conclusions Pharmacoeconomic studies of antidepressants reveal clear associations of study sponsorship with quantitative outcome.
Long-standing concern exists about the potential influence of financial interests on medical decision-making (e.g. Hillman et al, 1990; Rennie & Flanagin, 1992). Especially vigorous discussion has centred on the conduct and reporting of pharmacoeconomic research (e.g. Hillman et al, 1991; Udvarhelyi et al, 1992; Gulati & Bitran, 1995; Siegel et al, 1996; Neumann, 1998; Hill et al, 2000; Jones & Cockrum, 2000; Neumann et al, 2000b). However, there has been little quantitative study of potential bias in pharmacoeconomic research throughout medicine. Reported studies have reached mixed conclusions (e.g. Sacristan et al, 1997; Azimi & Welch, 1998; Friedberg et al, 1999; Neumann et al, 2000a), perhaps in part because, with one exception (Friedberg et al, 1999), they investigated several drugs and in some cases included medical devices. We are unaware of any study focused on psychiatric medication.
We studied associations between sponsorship and study design with quantitative outcome in pharmacoeconomic studies by examining the test case of antidepressants. We asked the following primary questions. First, is there an association between industry v. non-industry sponsorship of studies and quantitative conclusions? Second, among industry-sponsored studies and between industry-sponsored v. non-industry-sponsored studies, is there an association between study design and quantitative conclusions?
We chose antidepressants licensed in the UK or the USA as our test case because of their large market share and the number of pharmacoeconomic studies. Antidepressants rank in the top three drug classes worldwide in terms of sales dollars. Their growth in sales ranks them among the top five drug classes worldwide (IMS Health, 2001). Additionally, these antidepressants are the subject of multiple cost-outcome studies reporting quantitative results.
To locate reports of pharmacoeconomic studies of antidepressant drugs we used the Cochrane Library, Medline and HealthSTAR databases supplemented by manual searches based on the references cited in the studies located through the databases. We searched for all articles published between 1987 - the year the first 'newer' antidepressant, fluoxetine, received US Food and Drug Administration (FDA) approval - and April 2001. The search terms we used in Medline and HealthSTAR were COST-BENEFIT ANALYSIS or COST SAVINGS or DRUG COSTS or COST-EFFECTIVENESS (text word) and ANTIDEPRESSIVE AGENTS or ANTIDEPRESSANT (text word). The search term we used in the Cochrane Library was ANTIDEPRESSIVE AGENTS. We identified 46 articles (Jonsson & Bebbington, 1993, 1994; Hatziandreu et al, 1994; Le Pen et al, 1994; McFarland, 1994; Sclar et al, 1994, 1995, 1998, 1999; Stewart, 1994; Anton & Revicki, 1995; Einarson et al, 1995, 1997; Lapierre et al, 1995; Nuijten et al, 1995; Revicki et al, 1995, 1997; Skaer et al, 1995; Bentkover & Feighner, 1996; Forder et al, 1996; Hylan et al, 1996, 1998; Montgomery et al, 1996; Smith & Sherrill, 1996; Croghan et al, 1997, 2000; Melton et al, 1997; Obenchain et al, 1997; Woods & Rizzo, 1997; Boyer et al, 1998; Canadian Coordinating Office for Health Technology Assessment, 1998; Crown et al, 1998; Simon & Fishman, 1998; Thompson et al, 1998; Brown et al, 1999a,b; Griffiths et al, 1999; Nurnberg et al, 1999; Russell et al, 1999; Simon et al, 1999; Borghi & Guest, 2000; Sullivan et al, 2000; Casciano et al, 2001; Doyle et al, 2001; Poret et al, 2001; Wan et al, 2002). We excluded two studies (Boyer et al, 1998; Simon et al, 1999) because they were randomised trials, unlike all the other studies, which were modelling studies or analyses of administrative databases. The remaining articles represent 45 separate studies. Two articles report the results of one study (Jonsson & Bebbington, 1993, 1994). Two articles report two studies each, one in-patient and one out-patient (Einarson et al, 1995, 1997). Two articles report slight variations on two studies, one in-patient and one out-patient (Casciano et al, 2001; Doyle et al, 2001).
Classification of studies
For the primary analysis we categorised each study according to whether it was industry-sponsored. The study was categorised as industry-sponsored if at least one author was listed as a pharmaceutical company employee, or an acknowledgement listed pharmaceutical company support; otherwise, it was categorised as non-industry-sponsored. For secondary analyses we categorised studies authored by industry employees separately from studies only listing financial support.
Study sponsors were categorised by product into those manufacturing selective serotonin reuptake inhibitors (SSRIs: fluoxetine, sertraline, paroxetine and citalopram) or ‘atypical’ antidepressant drugs (venlafaxine, bupropion and mirtazapine).
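The categorisation rules above can be expressed as a minimal Python sketch. The record type, field names and category labels are hypothetical, introduced only for illustration; the drug lists are those named in the text.

```python
from dataclasses import dataclass

# Drug classes as listed in the text.
SSRIS = {"fluoxetine", "sertraline", "paroxetine", "citalopram"}
ATYPICALS = {"venlafaxine", "bupropion", "mirtazapine"}


@dataclass
class Study:
    # Hypothetical record; field names are illustrative assumptions.
    has_industry_author: bool            # an author is listed as a company employee
    acknowledges_industry_support: bool  # an acknowledgement lists company support


def primary_category(study: Study) -> str:
    """Primary analysis: either criterion makes a study industry-sponsored."""
    if study.has_industry_author or study.acknowledges_industry_support:
        return "industry"
    return "non-industry"


def secondary_category(study: Study) -> str:
    """Secondary analyses: industry authorship is kept apart from funding alone."""
    if study.has_industry_author:
        return "industry-authored"
    if study.acknowledges_industry_support:
        return "industry-funded-only"
    return "non-industry"


def sponsor_class(drug: str) -> str:
    """Map a sponsor's product to the classes used in the analyses."""
    name = drug.lower()
    if name in SSRIS:
        return "SSRI"
    if name in ATYPICALS:
        return "atypical"
    return "other"
```

For example, a study with no industry author but an acknowledgement of company support would be 'industry' in the primary analysis and 'industry-funded-only' in the secondary analyses.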
Operationalisation of outcomes
For either of the questions posed in our study no single means of operationalising the issue of which antidepressant was favoured could be applied to all studies. Therefore, we performed separate analyses using alternative operationalisations. Specifically, for question one (the industry v. non-industry comparison), no single standard was applicable that allowed analysis of all 45 studies. Seemingly simple standards such as ‘sponsor’s antidepressant favoured’ could not apply: in non-industry-sponsored studies, there is no ‘sponsor’s antidepressant’. In our primary analysis of industry-sponsored v. non-industry-sponsored studies we examined whether the outcome favoured SSRIs or tricyclic antidepressants (TCAs), excluding studies sponsored by ‘atypical’ antidepressant manufacturers. To allow analysis of the latter studies, we performed an alternative analysis based on whether the outcome favoured the ‘newest antidepressant’ (‘newness’ was based upon date of FDA approval). In this analysis, studies in which the sponsor's drug was not the newest were excluded.
In addressing our second question, regarding the association of study design with bias on outcome, we examined the issue both within industry-sponsored trials and between industry-sponsored and non-industry-sponsored trials. Within the first group we looked at the association of modelling v. administrative study designs with outcome. We operationalised the outcomes and groups in two alternative ways: favouring the newest drug among all industry-sponsored studies, or favouring the sponsored drug among all industry-sponsored studies.
In examining the association of study design with outcome between industry v. non-industry sponsors, we compared the outcome patterns within modelling studies. We could not compare outcome patterns in administrative data studies given there was only one such non-industry-sponsored study. We operationalised outcomes in two alternative ways: favouring the newest drug, or favouring SSRIs v. TCAs.
Rating study outcomes
Initially two of the authors (C.B.B. and M.N.J.) independently categorised sponsorship and outcomes of each study. If their ratings were inconsistent, a third author (S.W.W.) rated the study. Initial ratings agreed in all cases but one.
Most studies contained several outcomes. However, we wished to rate a single outcome from each study and employed the following decision rules to select that outcome. First, we selected only quantitative outcomes. Second, among base case and variants, we selected the base case. Third, among outcomes adjusted for bias and unadjusted outcomes, we selected the adjusted outcome. Fourth, among outcomes for various time periods, we selected the longest period. Fifth, among multiple pharmacoeconomic indicators, we selected a single outcome on the basis of the following rules: if only cost outcomes were reported, we chose total costs over more limited costs; if cost and cost-effectiveness outcomes were reported, we chose cost-effectiveness outcomes; and if more than one type of cost-effectiveness ratio was reported, we chose incremental over average ratios. Sixth, if results were reported separately for individual countries, we selected the results for the UK and the USA.
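The decision rules above can be sketched as a filter followed by an ordered preference. This is an illustrative reading only: the record type, field names and indicator labels are hypothetical, and treating the rules as a strict priority order (rule 2 before rule 3, and so on) is an assumption not stated in the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Rule 5 preferences, lowest to highest (labels are illustrative).
INDICATOR_RANK = {
    "limited_cost": 0,
    "total_cost": 1,
    "average_ce_ratio": 2,
    "incremental_ce_ratio": 3,
}


@dataclass
class Outcome:
    # Hypothetical record describing one reported outcome of a study.
    quantitative: bool
    base_case: bool
    bias_adjusted: bool
    months: float                  # length of the time period analysed
    indicator: str                 # a key of INDICATOR_RANK
    country: Optional[str] = None  # None when results are not split by country


def select_outcome(outcomes: List[Outcome]) -> Optional[Outcome]:
    """Pick the single outcome to rate for one study."""
    candidates = [o for o in outcomes if o.quantitative]  # rule 1
    if not candidates:
        return None

    def preference(o: Outcome):
        return (
            o.base_case,                       # rule 2: base case over variants
            o.bias_adjusted,                   # rule 3: adjusted over unadjusted
            o.months,                          # rule 4: longest period
            INDICATOR_RANK[o.indicator],       # rule 5: indicator type
            o.country in (None, "UK", "USA"),  # rule 6: UK and USA results
        )

    return max(candidates, key=preference)
```

Under this reading, a bias-adjusted 6-month outcome would be chosen over an unadjusted 12-month outcome, because rule 3 is applied before rule 4.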
After selecting a single outcome for each study, the researchers rated each study as favourable, neutral or unfavourable for the drug of interest, depending on the particular analysis (e.g. SSRI in the SSRI v. TCA analysis, or newest antidepressant in the newest v. older antidepressant analysis): ‘favourable’ meant that a drug's quantitative cost-effectiveness results were unequalled by any of the other drugs in the study; ‘neutral’ meant that although other drugs' results might be equal to it, none surpassed the drug of interest; and ‘unfavourable’ meant that other drugs' results did surpass the drug of interest. Raters used all available information to judge differences in outcomes among drugs. If the study reported statistical significance, raters based their judgements on statistically significant differences. If the study did not report statistical significance, raters based their judgements on the reported numerical differences. With quality-adjusted life-years (QALYs), raters judged a treatment superior if marginal cost-effectiveness was less than US$20 000 per QALY, a commonly applied limit (Laupacis et al, 1992). Subsequently, we performed a sensitivity analysis by varying the marginal threshold between $20 000 and $100 000 per QALY.
The following is an example of how raters applied the rules noted above to designate a specific study as favourable, neutral or unfavourable. In the SSRI v. tricyclic or heterocyclic antidepressant analysis of the Hatziandreu study (Hatziandreu et al, 1994) the preceding rules led raters to judge that the study favoured the SSRI. The study reported the base case incremental cost-effectiveness ratio to be £2172 ($3692) for each QALY gained by using the SSRI rather than the TCA. This cost per QALY gained is less than the $20 000 per QALY cut-off noted in the raters' decision rules; therefore, the study was rated as favourable for the SSRI.
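The rating logic and the worked example above can be sketched as follows. This is a simplification: it compares raw numbers on a cost-type scale where lower is better, whereas the raters relied on statistical significance when it was reported. The function names are hypothetical.

```python
def rate(drug_result: float, other_results: list) -> str:
    """Rate the drug of interest against the other drugs in a study.

    Assumes a cost-type outcome, so a lower value is a better result.
    """
    best_other = min(other_results)
    if drug_result < best_other:
        return "favourable"    # unequalled by any other drug
    if drug_result == best_other:
        return "neutral"       # equalled but not surpassed
    return "unfavourable"      # surpassed by another drug


def superior_by_icer(icer_usd_per_qaly: float, threshold: float = 20_000) -> bool:
    """QALY rule: superior if the marginal cost per QALY is below the threshold."""
    return icer_usd_per_qaly < threshold


# Base case from Hatziandreu et al (1994) as quoted in the text: $3692 per QALY.
print(superior_by_icer(3692))  # True, so the study is rated favourable for the SSRI
```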
In addition to the planned analyses described above, we performed two exploratory analyses: one was based on the number of industry authors and the second was based on the ordinal position of any industry authors. Neither of these analyses yielded a significant association.
We analysed the association between sponsorship and outcome using Fisher's exact test as generalised for 2×3 tables. We chose contingency table analysis rather than a meta-analytic technique because of the qualitative heterogeneity of the pharmacoeconomic outcome types across studies, which ranged from direct costs per patient, to direct costs per treatment success, to direct costs per symptom-free day, to lifetime direct costs per discounted QALY. We judged it inappropriate to transform these qualitatively disparate types of outcomes into a common effect size. We selected the 0.05 α level, two-tailed.
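For readers who wish to reproduce this kind of analysis, the following is a minimal standard-library sketch of Fisher's exact test generalised to a 2×3 table (often called the Freeman-Halton extension). It enumerates every table with the observed margins and sums the probabilities of those no more likely than the observed table. It is an illustration of the technique, not the software used in the study, and the counts in the usage example are hypothetical rather than taken from Table 4.

```python
from math import comb


def _table_probability(row1, col_totals, n_row1, n_total):
    """Probability of a first row given fixed margins (multivariate hypergeometric)."""
    ways = 1
    for cell, col_total in zip(row1, col_totals):
        ways *= comb(col_total, cell)
    return ways / comb(n_total, n_row1)


def fisher_exact_2x3(table):
    """Two-sided exact p-value for a 2x3 contingency table."""
    (a1, a2, a3), (b1, b2, b3) = table
    cols = (a1 + b1, a2 + b2, a3 + b3)
    n_row1 = a1 + a2 + a3
    n_total = sum(cols)
    p_observed = _table_probability((a1, a2, a3), cols, n_row1, n_total)

    p_value = 0.0
    for x1 in range(min(cols[0], n_row1) + 1):
        for x2 in range(min(cols[1], n_row1 - x1) + 1):
            x3 = n_row1 - x1 - x2
            if x3 > cols[2]:
                continue  # not a valid table for these margins
            p = _table_probability((x1, x2, x3), cols, n_row1, n_total)
            # Small relative tolerance guards against floating-point ties.
            if p <= p_observed * (1 + 1e-9):
                p_value += p
    return p_value


# Hypothetical counts: rows are sponsorship groups, columns are
# favourable / neutral / unfavourable ratings.
hypothetical = [[14, 2, 1], [2, 1, 3]]
print(fisher_exact_2x3(hypothetical))
```

The result would be compared with the two-tailed 0.05 α level described above.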
Association of sponsorship
SSRI v. TCA analysis
For the primary analysis of industry v. non-industry sponsorship of SSRI v. TCA studies, six of seven non-industry-sponsored studies were eligible for analysis (see Table 3). Seventeen industry studies were eligible (see Tables 1 and 2).
Distribution and results for Fisher's exact test are noted in Table 4. The association between industry sponsorship and outcome favouring SSRIs v. TCAs was statistically significant. Each of the two secondary analyses contrasting studies with industry-employed authors v. non-industry-sponsored studies and contrasting studies with industry funding alone v. non-industry-sponsored studies demonstrated a statistically significant association between industry sponsorship and outcome favouring SSRIs v. TCAs, with probability values of 0.0420 and 0.0163 respectively.
New v. old antidepressant analysis
All non-industry-sponsored studies were eligible (see Table 3). Thirty-three industry-sponsored studies (see Tables 1 and 2) were eligible. Distribution and results of Fisher's exact test are noted in Table 4. The association between industry sponsorship and outcome favouring the newest antidepressant was statistically significant. Each of the two secondary analyses contrasting studies with industry-employed authors v. non-industry-sponsored studies and contrasting studies with industry funding alone v. non-industry-sponsored studies demonstrated a statistically significant association between industry sponsorship and outcome favouring the newest antidepressant, with probability values of 0.0047 and 0.0018 respectively.
Association between study design and sponsorship bias
Within industry-sponsored studies, is there a difference in tendency to favour the sponsor's drug over a competitor's drug or drug class, based on type of study design? For the principal analysis, ‘favouring the sponsor’s drug or drug class’ was defined based on favouring the newest drug among all manufacturer-sponsored studies. Thirty-three industry-sponsored studies were eligible (see Tables 1 and 2). Distribution and results of the Fisher's exact test are noted in Table 4. The association between modelling v. administrative study design and outcome favouring the newest drug was statistically significant.
For our alternative analysis based on whether the sponsor's drug or drug class won, regardless of whether it was newest, all 38 industry-sponsored studies were eligible (see Tables 1 and 2). This analysis yielded a probability value of 0.0011, consistent with the results in the primary analysis.
Between industry-sponsored and non-industry-sponsored modelling design studies, is there a difference in outcome patterns? For the principal analysis of this question we examined the patterns of favouring the newest drug. Nineteen industry studies (see Tables 1 and 2) and five non-industry studies (see Table 3) were eligible. The distribution and results of the Fisher's exact test are noted in Table 4. The association between industry v. non-industry sponsorship of modelling studies and outcome favouring the newest drug was statistically significant. Each of the two secondary analyses contrasting studies with industry-employed authors v. non-industry-sponsored studies and contrasting studies with industry funding alone v. non-industry-sponsored studies demonstrated a statistically significant association between industry sponsorship and outcome favouring the newest antidepressant in modelling studies, with probability values of 0.0010 and 0.0100 respectively.
In an alternative analysis we examined the patterns of favouring SSRIs v. favouring TCAs in modelling studies. We performed this analysis with the five eligible non-industry-sponsored studies (see Table 3) contrasted first with all twelve eligible industry-sponsored modelling studies (see Tables 1 and 2) that included SSRI v. TCA comparisons, and then with the six eligible modelling studies sponsored by SSRI manufacturers (see Table 1) that included SSRI v. TCA comparisons. The probability values from Fisher's exact test in the two cases were 0.0139 and 0.0151 respectively, indicating that the tendency for industry-sponsored simulations to favour SSRIs more often than non-industry-sponsored studies is unlikely to be due to chance.
The sensitivity analysis varying the marginal cost-effectiveness threshold from $20 000 to $100 000 per QALY did not change any of the results reported above.
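A threshold sweep of this kind can be sketched as below. The step size between thresholds is an assumption (the text gives only the two end-points), the function names are hypothetical, and the only cost-effectiveness ratio used is the Hatziandreu et al (1994) base case of $3692 per QALY quoted earlier.

```python
def judged_superior(icer: float, threshold: float) -> bool:
    """QALY rule: superior if the marginal cost per QALY is below the threshold."""
    return icer < threshold


def sweep(icers, thresholds):
    """Return the superiority judgements for every threshold."""
    return {t: [judged_superior(i, t) for i in icers] for t in thresholds}


thresholds = range(20_000, 100_001, 20_000)  # assumed step of $20 000
icers = [3692]                               # base case ratio quoted in the text
results = sweep(icers, thresholds)

# The judgements are stable if every threshold yields the same list.
stable = len({tuple(v) for v in results.values()}) == 1
print(stable)  # True
```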
Our analyses show that, regardless of how the question was operationalised, for each of our study questions there was greater than chance association between study sponsorship and outcome. Among industry-sponsored v. non-industry-sponsored studies, industry-sponsored studies more frequently reported results favourable to the industry sponsor than did non-industry-sponsored studies. This was true whether industry sponsorship was defined as industry authorship, industry financial support alone, or both. Among industry studies, modelling studies were more likely to report results favourable to the sponsor than administrative data studies. Between industry-sponsored and non-industry-sponsored modelling design studies, industry studies were more likely to report results favourable to industry.
Consistency with prior studies
Our overall finding of sponsorship bias is consistent with prior studies in that all three prior fully reported studies found some association between study sponsorship and outcomes (Azimi & Welch, 1998; Friedberg et al, 1999; Neumann et al, 2000a). However, any detailed comparison between our study and previous studies is necessarily limited given that the previous studies mixed drugs, devices and other health interventions, and mixed various classes of medicines (Azimi & Welch, 1998; Neumann et al, 2000a); focused on qualitative conclusions (Friedberg et al, 1999); and used various definitions to select the specific study outcomes to be analysed (Azimi & Welch, 1998; Friedberg et al, 1999; Neumann et al, 2000a). The study by Friedberg et al (1999) of oncology drugs is perhaps most comparable with our current study, given their focus on a single pharmaceutical class and their categorisation of study conclusions as favourable, neutral or unfavourable, although they focused on qualitative rather than quantitative conclusions. Like our study, that of Friedberg et al did find an association between study conclusion and funding source.
Support for concern about modelling studies
In addition to supporting the general concern about sponsorship bias in pharmacoeconomic studies, our findings support the more specific concerns that have been raised about the potential for bias in modelling studies (Luce, 1995; O'Brien, 1996; Sheldon, 1996; Maynard & Cookson, 1998; McCabe & Dixon, 2000). Such support stems from the combination of our two findings regarding study design: among industry studies, modelling studies are more favourable to the sponsor than administrative studies, and in a comparison of industry-sponsored and non-industry-sponsored modelling studies, studies sponsored by industry are significantly more favourable to industry.
Limitations of our study
Our study has clear limitations. Randomised pharmacoeconomic trials could not be compared on the basis of sponsorship because there were only two such trials in this area. Relatively few non-industry-sponsored studies were available. We examined only one class of medications; analyses of other classes of medications should be conducted.
Bias v. accuracy
Although we have demonstrated several associations between study sponsorship and outcome, these associations do not suggest which (if either) side presents a more accurate estimate of relative pharmacoeconomic outcome. Both industry-supported and non-industry-supported researchers may be subject to forces that could potentially bias their work (Yee & Hillman, 1997; Drummond, 1998; Rennie & Luft, 2000). Additionally, journal editorial processes can result in a biased sample of studies being published. It has been observed that journals tend to publish studies with ‘positive’ rather than ‘negative’ results (Freemantle & Mason, 1997).
Causes of bias
Many ideas have been offered to explain how sponsorship could result in biased reported outcomes (Udvarhelyi et al, 1992; Freemantle & Mason, 1997; Drummond, 1998; Cook, 1999; Neumann et al, 2000a; Rennie & Luft, 2000). Industry, motivated to enhance sales of its products, might only pursue studies on products and select comparators that would yield favourable results. It might select biased populations within administrative data-sets, overtly or subtly influence analytical methods or models, or veto submission for publication of studies yielding unfavourable results. Non-industry-sponsored researchers might bias the studies submitted for publication in similar ways, although perhaps from different motivations such as controlling formulary costs, personal or academic rivalries, or career promotion.
We are unable to pinpoint the causes of bias among the reports analysed here. Examination of the individual studies does not reveal a common element that differentiates industry-sponsored from non-industry-sponsored studies; rather, the methodological limitations in the studies vary widely. These limitations have been discussed extensively elsewhere (Hotopf et al, 1996; Woods & Baker, 1997, 2002). However, at least two suggested causes seem unlikely. First, some commentators have noted the potential role of selection bias - i.e. the tendency of researchers not to submit and of journals not to publish small studies or studies with negative statistical outcomes (Freemantle & Mason, 1997; Neumann, 1998). This would help to explain how an overall preponderance of statistically positive studies could exist even if there were true uncertainty about alternative medications (Djulbegovic et al, 2000). The difference we have shown between industry-sponsored and non-industry-sponsored studies suggests that submission or editorial selection bias based on statistical significance alone does not adequately explain the bias in the present case. Second, it has been suggested that a particular sponsor weeds out weak alternatives among its drugs in early preliminary processes; therefore, drugs that reach the stage of being marketed are strong competitors and likely to yield analyses that favour the sponsor's drug (Gagnon, 2000). However, these same strong competitors performed less well in non-industry-sponsored studies, as shown clearly in our analysis of outcomes favouring either SSRIs or TCAs. Moreover, it should be noted that in the 18 studies with head-to-head comparisons among such strong competitors, the sponsor's drug lost only once (Einarson et al, 1995).
Bias in efficacy v. pharmacoeconomic studies
It is not possible to comment about whether the bias revealed in the current study of pharmacoeconomic reports of antidepressants is any greater or less than the sponsorship bias that may exist in efficacy studies of antidepressants. There is no published report on sponsorship bias in efficacy studies in any medication category within psychiatry. The only published report devoted to such quantitative analysis of psychiatric medications is a letter reviewing efficacy studies of any psychiatric medication in one journal over a 1-year period (Mandelkern, 1999). This author reported a tally for industry-supported studies of 16 favourable to the manufacturer's drug and none unfavourable, and for unsupported studies 10 favourable and 6 unfavourable, concluding that there was a correlation between source of support and efficacy outcome. In other areas of medicine, bias has been demonstrated repeatedly in efficacy studies (Davidson, 1986; Rochon et al, 1994; Stelfox et al, 1998; Djulbegovic et al, 2000). A study of sponsorship bias in efficacy trials of antidepressants would provide a useful comparison for our study.
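As a purely illustrative check, the tally quoted above from Mandelkern (1999) can be arranged as a 2×2 table and tested with Fisher's exact test. This calculation is not reported by Mandelkern or in the present study; it is a sketch using only the four counts given in the text, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability that the top-left cell equals x.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # tolerance for floating-point ties
    )


# Rows: industry-supported, unsupported; columns: favourable, unfavourable.
p = fisher_exact_2x2(16, 0, 10, 6)
print(round(p, 4))  # 0.0177
```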
It is important for pharmacoeconomic studies to attempt to give estimates that are as accurate and uninfluenced by bias as possible, given the large and growing number of health care dollars spent on medications. Pharmaceutical sales for North America were reported to be US$153 billion in 2000, representing a 14% growth over the previous year (IMS Health, 2001). Owing to the importance of cost constraint in medicine, the volume of pharmacoeconomic research has been growing (Detsky, 1994) and is linked to governmental purchasing decisions in some jurisdictions (Canadian Coordinating Office for Health Technology Assessment, 1994; Ontario Ministry of Health, 1994; Australian Government, 1995). However, as we noted previously, financial and other incentives create strong motives for bias. Our results for antidepressants suggest that actual bias related to sponsorship appears to exist, although whether or how the bias and specific motives are related cannot be determined. Until the mechanisms producing the bias are better understood, interpretation of results from pharmacoeconomic studies should take sponsorship into account.
Clinical Implications and Limitations
Clinical implications
- Evaluation of pharmacoeconomic studies of antidepressants for treatment of major depression should take into account the source of study funding and of authorship.
- Evaluation of modelling studies in this area may warrant additional scrutiny.
- Evaluation of pharmacoeconomic studies should be informed by a thorough understanding of their methodology.
Limitations
- Randomised pharmacoeconomic trials could not be compared because of their scarcity.
- Few non-industry-sponsored trials were available.
- Only one class of medication was examined.
Consultancies, research grants or unrestricted grants were received from Pfizer, Bristol-Myers Squibb, Eli Lilly & Co. and Forest Laboratories. The work was supported in part by PHS KO8 MH-01718 (C.B.B.), NARSAD R03051 (C.B.B.), Robert Wood Johnson Foundation (M.L.C.), Substance Abuse and Mental Health Services Administration (M.L.C.), Meadows Foundation (M.L.C.), Houston Endowment (M.L.C.), Hogg Foundation (M.L.C.), Texas Department of Mental Health and Mental Retardation (M.L.C.) and PHS MH-54446 (S.W.W.).
- Received January 3, 2003.
- Revision received July 10, 2003.
- Accepted July 31, 2003.
- © 2003 Royal College of Psychiatrists