Background Veterans of the Persian Gulf War of 1991 have reported symptoms attributed to their military service.
Aims To review all studies comparing the prevalence of psychiatric disorders in Gulf War veterans and in a comparison group of service personnel not deployed to the Gulf War.
Method Studies of military personnel deployed to the Gulf published between 1990 and 2001 were identified from electronic databases. Reference lists and websites were searched and key researchers were contacted for information. A total of 2296 abstracts and 409 complete articles were reviewed and data were extracted independently by two members of the research team.
Results The prevalence of psychiatric disorder in 20 studies of Gulf War veterans was compared with the prevalence in the comparison group. Prevalence of post-traumatic stress disorder (PTSD) and common mental disorder were higher in the Gulf War veterans. Heterogeneity between studies was significant, but all reported this increased prevalence.
Conclusions Veterans of the Persian Gulf War reported an increased prevalence of PTSD and common mental disorder compared with other active service personnel not deployed to the Gulf. These findings are attributable to the increase in psychologically traumatic events in wartime.
Since the end of the Persian Gulf War of 1991, its veterans have reported a range of health complaints attributed to service during the war. These veterans report an increased prevalence of a whole range of common symptoms compared with other service personnel who were not deployed to the Gulf.
It is now widely recognised that exposure to combat and other wartime experiences can have both short-term and long-term psychological effects. These psychological consequences are varied, but the concept of post-traumatic stress disorder (PTSD) has arisen to describe the syndrome of intrusive thoughts, flashbacks, hyperarousal and numbing that can occur after exposure to any traumatic event, including those common in wartime.
The Persian Gulf War was brief and there were relatively few casualties among the troops deployed on behalf of the United Nations. Nevertheless, a number of aspects of the war exposed service personnel to traumatic and stressful events: these included the risk of chemical and biological warfare, exposure to combat, and dealing with prisoners and dead and wounded Iraqi soldiers. This paper describes a systematic review of studies that have compared the prevalence of psychiatric disorder in Gulf War veterans with its prevalence in a comparison group who were not deployed to the Gulf (non-Gulf veterans).
Studies published between January 1990 and May 2001 were identified from a range of electronic databases, including EMBASE, Medline, ASSIA, SIGLE, PsycINFO, CancerLit, HealthSTAR, Dissertation Abstracts, Current Contents, Health and Psychosocial Instruments, CINAHL and Biological Abstracts. Keywords used to identify the studies were: DESERT STORM or DESERT SHIELD or DESERTSHIELD or GULF WAR or GULF SYNDROME or GULF WAR SYNDROME or PERSIAN GULF WAR or PERSIAN GULF SYNDROME. References of identified studies were searched for further studies. Specialist Gulf veterans' illnesses research websites (US Department of Defense Center for Deployment Health Research site and the Walter Reed Army Medical Center Gulf War database) and more-general Gulf websites were also searched for any additional references. Researchers who had expressed an interest in Gulf veterans' illness research were contacted for any non-published information. There was no restriction on the identification of studies in terms of publication status or language. This search strategy was first applied to data published up to the end of 1998 (n=4156) and then repeated to the end of May 2001 (n=1231).
Studies were included if they contained data on veterans who had been deployed to the Gulf War on military, medical or peace-keeping grounds (i.e. those involved in operations Desert Shield, Desert Storm, Granby or Desert Peace). Any study design was eligible for inclusion provided that an appropriate control or comparison group was included to compare the prevalence of psychiatric disorder.
The 5387 abstracts identified by the original search were screened by N.J.S. and the 2296 that remained eligible were examined by two members of the research team to decide whether they might meet our inclusion criteria. Printed copies of 409 papers were then obtained and examined by two members of the research team to confirm eligibility and extract data.
In our original search we also included studies that compared ill and well Gulf War veterans, but these were excluded from the review reported here. Studies were also excluded if they measured simulated exposures, if they measured non-health-related outcomes, or if the study population included inhabitants of the Persian Gulf states rather than deployed military, medical or peace-keeping personnel.
All identified papers that fulfilled the pre-stated inclusion criteria were categorised by health outcome. Forty-nine studies included data on psychiatric disorder, 29 of which reported on Gulf War veterans and an external comparison group of non-Gulf War veterans. We further restricted the studies to those with a limited range of outcomes concerned with psychiatric disorder (20 studies). The outcomes we chose to include were as follows:
PTSD diagnosed using a recognised standardised assessment;
common mental disorder: depression or anxiety diagnosed using a recognised standardised assessment; or self-reported symptoms of depression recorded on a checklist;
problems related to alcohol misuse.
We have chosen to use the term ‘common mental disorder’ (Goldberg & Huxley, 1992) to refer to the common symptoms of depression and anxiety that are seen in the community and reflect the use of assessments such as the General Health Questionnaire (GHQ; Goldberg & Williams, 1988) and the Symptom Checklist (and its derivatives) (Derogatis et al, 1974; Derogatis, 1977; Derogatis & Spencer, 1982).
Data relating to the studies' main hypotheses and to methodological quality were extracted independently by two members of the research team. Information on the methodological quality of the individual studies included the response rate, the potential of selection bias in the sampling of the study participants, the potential bias in the measurement of outcomes, the availability of data on confounders, and any adjustment for such variables.
Summary odds ratios and risk ratios were calculated with a random-effects model using the inverse variance method. The degree of heterogeneity was assessed using the chi-squared test within a fixed-effects model. All analyses were performed using the METAN command (Bradburn et al, 1998) in Stata version 6 (StataCorp, 1999). We chose the random-effects approach because of the inherent heterogeneity in the data; in particular, we were combining studies with a variety of outcome measures. A random-effects model assumes that the studies in a meta-analysis are sampled from a distribution of effect sizes, whose mean and variance are estimated from the data in the meta-analysis. In contrast, a fixed-effects model assumes that all the studies are sampled from a population with the same effect estimate.
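The pooling procedure described above can be sketched in a few lines. This is a minimal illustration, assuming the DerSimonian-Laird method-of-moments estimator for the between-study variance; the text names only a random-effects model with inverse variance weighting, so the choice of estimator, the function name and the input values in the usage note are assumptions, not the authors' code or data.

```python
import math

def pool_random_effects(log_effects, variances):
    """Pool log odds (or risk) ratios; assumes the DerSimonian-Laird estimator."""
    # Fixed-effects inverse variance weights: precise studies count for more.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Heterogeneity chi-squared (Cochran's Q), computed within the
    # fixed-effects model and referred to k - 1 degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1

    # Method-of-moments estimate of the between-study variance, floored at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to every within-study variance,
    # which pulls the weights of large and small studies closer together.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    return {
        "estimate": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "q": q,
        "df": df,
        "tau2": tau2,
    }
```

Calling `pool_random_effects([0.0, 2.0], [0.1, 0.1])` with these made-up inputs gives a large Q relative to its single degree of freedom and a positive between-study variance, so the confidence interval widens relative to a fixed-effects analysis. When the studies agree closely, the between-study variance is estimated as zero and the two models coincide.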
We chose to perform analyses on dichotomous outcomes because the distribution of scores from continuous scales is often difficult to establish from published articles, and this — together with the wide variety of scales that were used — can introduce difficulties in performing a quantitative synthesis. Using ratio measures to estimate association should be less sensitive to the different case definitions and measures used in the constituent studies.
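For a single study with a dichotomous outcome, the two ratio measures and their log-scale variances follow directly from the counts of cases in each group. The sketch below uses the standard large-sample formulas; the function name and the counts in the usage note are hypothetical and are not taken from any of the reviewed studies.

```python
def ratio_measures(cases_deployed, n_deployed, cases_comparison, n_comparison):
    """Odds ratio and risk ratio from a 2x2 table, with log-scale variances."""
    a = cases_deployed
    b = n_deployed - cases_deployed          # deployed, not a case
    c = cases_comparison
    d = n_comparison - cases_comparison      # comparison group, not a case

    odds_ratio = (a * d) / (b * c)
    risk_ratio = (a / n_deployed) / (c / n_comparison)

    # Large-sample variances of the log ratios; their reciprocals are the
    # inverse variance weights used when studies are pooled.
    var_log_or = 1 / a + 1 / b + 1 / c + 1 / d
    var_log_rr = 1 / a - 1 / n_deployed + 1 / c - 1 / n_comparison

    return {
        "odds_ratio": odds_ratio,
        "risk_ratio": risk_ratio,
        "var_log_or": var_log_or,
        "var_log_rr": var_log_rr,
    }
```

With hypothetical counts of 30 cases among 100 deployed and 10 cases among 100 non-deployed personnel, `ratio_measures(30, 100, 10, 100)` returns a risk ratio of 3 and a somewhat larger odds ratio. When the outcome is not rare, the odds ratio lies further from 1 than the risk ratio, which is why both measures are worth reporting.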
The systematic review process is shown in Fig. 1. We identified 20 primary studies that investigated the association between deployment to the Gulf War and psychiatric disorder (Perconte et al, 1993; Sutker et al, 1993, 1994; Stretch et al, 1996a,b; Iowa Persian Gulf Study Group, 1997; Pierce, 1997; Stuart & Halverson, 1997; Goss Gilroy Inc., 1998; Holmes et al, 1998; Proctor et al, 1998; Stuart & Bliese, 1998; Gray et al, 1999; Ishoy et al, 1999; Unwin et al, 1999; Wolfe et al, 1999; Bartone, 2000; Kang et al, 2000; Steele, 2000; Cherry et al, 2001). We excluded nine other studies that included data on psychiatric disorder in Gulf War veterans but did not meet our inclusion criteria: five repeated results already included, three did not include any of the psychiatric outcomes defined above, and one compared Gulf veterans with reported illness with a comparison sample (further details available from the authors upon request).
Table 1 summarises the studies we identified. All are best described as cross-sectional surveys. Some studies, for example those by Kang et al (2000), Unwin et al (1999) and Ishoy et al (1999), resemble cohort studies, as the population was defined in terms of ‘exposure’ to the Gulf War. However, these studies had little or no information on health status before deployment and therefore share most of the methodological limitations of cross-sectional surveys.
The sampling design of the studies varied. For example, Unwin et al (1999), Kang et al (2000), Goss Gilroy Inc. (1998), Ishoy et al (1999) and Cherry et al (2001) identified samples of service personnel from military databases. The Unwin et al and Cherry et al studies were of two independent samples drawn from the same UK military database. They employed stratified random sampling in order to frequency-match the characteristics of Gulf War veterans with those who were on active duty at the time but were not deployed to the Gulf. These comparison groups are referred to as non-Gulf veterans; the proportion actually deployed to areas other than the Gulf varied between studies. An alternative sampling strategy used by two studies, the Iowa Persian Gulf Study Group (1997) and Steele (2000), identified all military service personnel who had served during the period of the Gulf War and who lived in one US state (Iowa and Kansas, respectively). Within this standard survey design the investigators then compared those who had been deployed to the Gulf with those who had not. Pierce (1997) also used a military database but selected only women from the US Air Force to study.
There were also more ad hoc sampling procedures that did not use the large national databases. For example, Holmes et al (1998), Gray et al (1999) and Sutker et al (1993) compared Gulf War veterans and non-Gulf veterans within a selection of units. Some studies also chose a small number of military bases without any apparent justification for inclusion (Proctor et al, 1998; Wolfe et al, 1999).
Response rates also varied considerably between studies (Table 1). Most importantly, in the studies that reported response rates separately, the response rate of the Gulf War veterans was higher than that of the non-Gulf veterans. This could bias the comparison. For example, Unwin et al (1999) had a 70% response rate in the Gulf War veterans and a 63% response rate in the non-Gulf War veterans sample. Goss Gilroy Inc. (1998) in the Canadian study reported response rates of 73% for Gulf War veterans and 60% for non-Gulf War veterans.
Most of the studies took place after there had been considerable publicity about illness in Gulf War veterans. However, four studies included here reported findings based upon surveys carried out within about a year of the end of the Gulf War: these studies were by Sutker et al (1993, 1994), Holmes et al (1998) and Stuart & Halverson (1997). All reported a significant excess of psychopathological disorder within the Gulf War veterans.
Many of the studies used the Mississippi scale (Keane et al, 1988) or modified versions thereof to assess symptoms of PTSD; this is a self-administered scale and it is generally assumed to be less valid than some of the more detailed questionnaires. Some studies used their own method for assessing PTSD based upon questions modelled on the DSM-III-R (American Psychiatric Association, 1987) criteria. A few studies used structured interviews administered by clinicians (Sutker et al, 1994; Proctor et al, 1998; Wolfe et al, 1999), but these assessments would have had the potential disadvantage of introducing possible observer bias, as the interviewers would not have been masked to the participants' deployment status.
We identified 17 studies that included data on common mental disorders. The self-administered questionnaire used most frequently to assess common mental disorder, in eight studies, was the Hopkins Symptom Checklist or Brief Symptom Inventory (Derogatis et al, 1974; Derogatis, 1977; Derogatis & Spencer, 1982; Derogatis & Melisaratos, 1983). This scale was reported either as a continuous outcome or used to define a ‘case’ of common mental disorder. The other studies used a variety of methods to assess common mental disorder, ranging from self-reported symptoms of depression (Proctor et al, 1998; Gray et al, 1999; Ishoy et al, 1999; Kang et al, 2000; Steele, 2000), through other self-administered scales such as the GHQ (Unwin et al, 1999), to lengthy clinician-administered structured interviews (Wolfe et al, 1999).
There was considerable variation in the extent to which the authors attempted to adjust for confounders. Many of the studies that selected from the military databases used a stratified sampling procedure and frequency-matched the non-Gulf veterans on some characteristics in order to adjust for confounding. Some studies included these variables in a multivariate model when analysing their results, which was probably necessary given the differential response rate between the Gulf War veterans and non-Gulf veterans. The most thorough adjustments were carried out by Unwin et al (1999). In particular, only Unwin et al and Stuart & Bliese (1998) adjusted for marital status. This is likely to be an important confounding variable, as single people usually have higher rates of common mental disorder and were more likely to be deployed to the Gulf War (although not in Unwin et al's study, possibly because the UK military have fewer members who are never deployed on active service). Unwin et al (1999) found that the odds ratio for being a case on the GHQ changed only from 2.0 to 2.1 after adjustment, indicating that there was little evidence of confounding by the variables identified in that study. Similar results were obtained using PTSD as the outcome.
Post-traumatic stress disorder
It was possible to conduct a meta-analysis of 9 of the 11 studies that reported dichotomous outcomes for PTSD. We were unable to use the data from Goss Gilroy Inc. (1998) and Bartone (2000). The results are summarised in Fig. 2. The overall summary estimate using a random-effects model was an odds ratio of 3.17 (95% CI 2.16-4.65), indicating an increased risk in Gulf War veterans. There was significant heterogeneity (χ²=29.4, d.f.=8, P<0.0001). In particular, the two large studies by Unwin et al and Gray et al differed: the former found an OR of 3.5 and the latter an OR of 1.8. The summary estimate for the risk ratios was 2.9 (95% CI 2.0-4.2).
Common mental disorder
We were able to perform a meta-analysis on 11 of the studies that reported on the prevalence of common mental disorder (Fig. 3). Two studies used the same sample, but one (Wolfe et al, 1999) reported results from the Structured Clinical Interview for DSM-III-R (Spitzer et al, 1990) and the other (Proctor et al, 1998) presented results from self-reported symptoms of depression. The summary estimate was an odds ratio of 2.04 (95% CI 1.94-2.15), whichever of these two studies was excluded, indicating an increased risk of common mental disorder in the deployed service personnel. Despite the variation between studies in the outcome used, there was no statistical evidence of heterogeneity in this sample using odds ratios (heterogeneity test χ²=9.39, d.f.=10, P=0.5). The summary estimate for risk ratio was 1.8 (95% CI 1.6-2.0). It should be noted that the studies by Kang et al (2000) and Unwin et al (1999) accounted for 90% of the variance weights in the meta-analysis. The other studies therefore had little influence on the summary estimate.
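The reported heterogeneity statistics and confidence intervals can be put on a common footing with two short calculations. The sketch below is illustrative only: the I² statistic is not reported in this review and is added here as a descriptive summary of the published χ² values, and the standard errors are back-calculated on the assumption that the 95% intervals are symmetric on the log scale (estimate ± 1.96 standard errors). The function names are hypothetical.

```python
import math

def i_squared(q, df):
    """Percentage of between-study variability beyond chance, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

def se_from_ci(low, high, z=1.96):
    """Log-scale standard error implied by a 95% CI for a ratio measure."""
    return (math.log(high) - math.log(low)) / (2.0 * z)

# Heterogeneity statistics as reported in the text.
ptsd_i2 = i_squared(29.4, 8)    # PTSD odds ratios
cmd_i2 = i_squared(9.39, 10)    # common mental disorder odds ratios

# Standard errors implied by the reported 95% confidence intervals.
ptsd_se = se_from_ci(2.16, 4.65)
cmd_se = se_from_ci(1.94, 2.15)

# If an interval is symmetric on the log scale, the geometric mean of its
# limits recovers the point estimate.
ptsd_mid = math.sqrt(2.16 * 4.65)
cmd_mid = math.sqrt(1.94 * 2.15)
```

Under these assumptions roughly three-quarters of the variability among the PTSD estimates is attributable to between-study differences, whereas the common mental disorder statistic is smaller than its degrees of freedom and I² is therefore zero. The implied standard error for common mental disorder is several times smaller than that for PTSD, consistent with the dominance of the two large studies.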
A funnel plot of the standard error of the estimate against the size of effect suggested that there were fewer small non-significant findings than would be expected. This would not have had much influence on the findings, given the presence of a number of large studies.
There was little evidence concerning alcohol misuse or dependence. Goss Gilroy Inc. (1998) stated that there was no statistically significant association between alcohol misuse and deployment. The Iowa study (Iowa Persian Gulf Study Group, 1997) reported an increased prevalence of alcohol misuse measured by the CAGE questionnaire (Ewing, 1984).
The results of our systematic review and meta-analysis indicate an increased prevalence of PTSD and common mental disorder in service personnel who had been deployed to the Persian Gulf War. The size of this effect was somewhat larger for PTSD, with an OR of 3.2 (95% CI 2.2-4.7) compared with 2.0 (95% CI 1.9-2.1) for common mental disorder.
We adopted a thorough search strategy but — as in all systematic reviews — may have failed to identify some studies. We are also aware that other studies on this topic are in progress and have yet to report their findings. It is difficult to assess the effect of any publication or citation bias in our data, given the small number of studies that reported data in a form permitting meta-analysis. A funnel plot for the ‘common mental disorder’ outcome suggested that there was an underrepresentation of small studies finding no association between deployment to the Gulf and disorder. However, these small studies would not have had a major impact on the summary odds ratio, despite suggesting that it might be a slight overestimate. The summary estimate was dominated by the two large studies.
A critical part of these designs is the comparability of the deployed and non-deployed troops. Some of the studies used military databases and took care to ensure that their sample was representative of both Gulf War veterans and the comparison group. It is likely that the characteristics of troops selected for deployment systematically differed from those of other active service personnel who were not deployed. This could be less marked for the UK military service, in which almost everyone is likely to be deployed on active duty. Potential confounding factors include gender, fitness level and marital status, along with other aspects (such as propensity for risk-taking) that are more difficult to measure. It is also likely that within individual units, the reasons for choosing people for deployment would lead to a greater selection bias than in studies sampling from national databases, in which whole units would have been selected.
It is difficult to be sure about the effect of selection on the results reported here. Some authors have suggested a ‘healthy warrior’ effect, that the deployed have better underlying health. On the other hand, single people, who are more likely to have been deployed (at least in some studies; Proctor et al, 1998), tend to have poorer mental health (Kessler et al, 1994; Jenkins et al, 1997). None of the studies had any independent information about the mental health of participants before the Gulf War and so were not able to take any account of this factor.
The studies that reported response rates according to deployment status all found that the Gulf War veterans had a higher response rate. It is likely that the publicity surrounding illnesses in Gulf War veterans increased the relevance of a questionnaire about health effects to respondents who had been deployed to the Gulf. This differential response rate could introduce a systematic bias.
Some studies have reported that non-responders tended to have poorer mental health than those who responded (Williams & Macdonald, 1986), although Unwin et al (1999), in a more intensive follow-up of non-respondents, did not find a statistically significant increased risk of common mental disorder. Kang et al (2000) also compared those who responded to the later mailings with those who returned the first mailshot. They did not find that the later respondents had poorer self-rated general health. On balance, it is unlikely that the differential response rate seen in these studies could have explained such a large association as that reported.
The majority of studies relied on self-reported symptoms to assess the prevalence of psychiatric disorder. Some of the studies used well-recognised and validated measures of psychiatric disorder, but others (including some of the larger studies) reported results from a single question asking about depressive symptoms. Despite this variation in measurement methods, there was little evidence of heterogeneity in the estimates for common mental disorder. Studies that used the longer semi-structured interviews might have introduced observer bias, given the difficulty in ‘blinding’ the interviewers. In contrast, there was evidence of heterogeneity for the PTSD estimates. In particular, Unwin et al (1999) reported a larger effect than did Gray et al (1999), although both reported a significant increase in prevalence in the Gulf War veterans. Gray et al restricted their sample to naval construction workers, so the different result might merely have reflected the different experiences of this group of service personnel. It should also be noted that the Unwin et al study used a UK military cohort in which almost all the non-Gulf War veterans comparison group would have been deployed on active service at one time or another.
Five of the 20 studies were carried out within 12 months of the end of the war and at a time when publicity concerning illness in Gulf War veterans was minimal. All these studies reported a statistically significant increase in psychopathological disorders in Gulf War veterans. These early studies tended to be less robust from a methodological point of view than the later ones: the samples were less representative, response rates were lower and the studies smaller in size. In contrast, the later and often more robust studies could have been subject to a reporting bias following publicity about illnesses in Gulf War veterans. In conclusion, it appears unlikely that a reporting bias could have led to the findings reported in the constituent studies.
Illnesses in Gulf War veterans
We found that veterans deployed to the Persian Gulf War reported more PTSD and more symptoms of common mental disorder than did service personnel who had not been deployed to this war. Increased rates of PTSD have often been reported after conflicts and can be attributed to the increased likelihood of psychologically traumatising events during wartime. The increased rates of other psychiatric symptoms might just be a consequence of the same process. There is evidence that psychologically traumatic events also lead to an increase in other psychological symptoms, particularly anxiety, in addition to the symptoms more specifically associated with the syndrome of PTSD. An increased rate of psychiatric disorders would therefore be expected in Gulf War veterans, although this does not diminish the importance of this morbidity in affecting veterans many years after returning from the conflict.
What is less clear is how these findings relate to the issue of Gulf War illnesses. Gulf War veterans have reported a wide variety of symptoms, aside from psychiatric symptoms. Unwin et al (1999), in the UK study, reported an increased prevalence of a whole range of symptoms after having adjusted for the increased prevalence of common mental disorder in the Gulf War veterans. This supports the view that some other factors must be contributing to illnesses in these veterans, in addition to any increase in psychiatric disorder.
Psychiatric disorder is common, disabling and burdensome. It is an important source of disability after war, yet this is often inadequately recognised and acknowledged. Developing more-effective means of preventing and treating psychiatric disorder in service personnel is an important priority for future research.
Clinical Implications
Veterans of the Gulf War report more post-traumatic stress disorder and more depression and anxiety than do war veterans not deployed to that conflict.
Service in a war zone leads to an increase in symptoms many years afterwards.
The presence of psychiatric symptoms probably does not explain the increased prevalence of other somatic symptoms reported by Gulf War veterans.
Limitations
Most of the large studies were conducted after public concern about illnesses in Gulf War veterans had been voiced.
The studies relied upon self-reported information about psychiatric symptoms.
Some of the larger studies used non-standard methods of assessing psychiatric symptoms.
We thank Simon Wessely and Matthew Hotopf for comments on an earlier draft of the manuscript. We thank the Department of Information Services at University of Wales College of Medicine for assistance in obtaining references for this review.
- Received May 23, 2002.
- Revision received October 18, 2002.
- Accepted October 30, 2002.
- © 2003 Royal College of Psychiatrists