The British Journal of Psychiatry
Stability of recall of military hazards over time: Evidence from the Persian Gulf War of 1991


Background Wartime traumatic events are related to subsequent psychological and physical health, but quantifying the association is problematic. Memory changes over time and is influenced by psychological status.

Aims To use a large, two-stage cohort study of members of the UK armed forces to study changes in recall of both traumatic and ‘toxic’ hazards.

Method A questionnaire-based follow-up study assessed 2370 UK military personnel, repeating earlier questions about exposure to military hazards.

Results The κ statistics for reporting of hazards were good for some exposures, but very low for others. Gulf veterans reported more exposures over time, whereas there was no significant rise in the Bosnia cohort. In the Gulf cohort only, reporting new exposures was associated with worsening health perception, and forgetting previously reported exposures with improved perception. We found no association between physical health, psychological morbidity or post-traumatic stress disorder symptoms and endorsement or non-endorsement of exposures.

Conclusions Reporting of military hazards after a conflict is not static, and is associated with current self-rated perception of health. Self-report of exposures associated with media publicity needs to be treated with caution.

Reporting of traumatic events is known to be associated with negative health outcomes, particularly post-traumatic stress disorder (PTSD) (Kaylor et al, 1987; Brewin et al, 2000). However, establishing the nature and magnitude of this association has been difficult, resulting in very different estimates. Most studies of the link between adversity and health are cross-sectional and rely on retrospective accounts of events and circumstances. There is consensus that retrospectively recalled accounts of trauma are problematic, and potentially subject to a variety of recall biases, but there is no consensus as to either the size of the problem or its implications (McFarlane, 1988).

Several studies have examined how retrospective recall of exposures at several time points relates to health outcomes, mainly PTSD. We have conducted a large-scale longitudinal study of the health of UK military personnel, based on three cohorts: those who saw service in the 1991 Persian Gulf conflict, those who were deployed on peacekeeping operations in Bosnia between 1992 and 1997, and those who were in the forces at the time of the 1991 Gulf conflict but were not deployed (Unwin et al, 1999). We now report on the results of a follow-up study, in which the same questions about specific military exposures related to the deployments on which the respondents had served were asked again. In this paper we examine the consistency of reporting of military traumas and hazards over the period. We also look at the predictors of any observed change in recall of traumatic events. In particular, we test the hypothesis that psychological distress prospectively increases the recall of traumatic events and hazards over time. This study is unique in that it permits the comparison of exposure consistency for both the Gulf War and the Bosnia deployments. We are thus able to compare the recall of military exposures relevant to both fighting and peacekeeping.


A follow-up descriptive study examined the health status of a stratified sample of participants who had completed the first phase of the King’s College London epidemiological health survey of military personnel (Unwin et al, 1999). The original study took place in 1997, which was 6 years after the end of the Gulf War, and 5 years after the start of the Bosnia deployment. This survey was succeeded by a series of detailed clinical case–control studies (stage 2: David et al, 2002; Higgins et al, 2002; Sharief et al, 2002). The follow-up study, stage 3, took place in 2000 and 2001, approximately 3 years later. During the follow-up study, participants were asked again about specific military exposures related to the deployments on which they had served. In this paper we compare responses between the two large epidemiological surveys of the same personnel at stage 1 and stage 3.


The target group was a stratified sample of the cohort who completed the stage 1 Health Survey of Military Personnel (n=8195). This cohort consisted of three groups: personnel who served in the Persian Gulf region between 1 September 1990 and 30 June 1991 (the Gulf cohort); personnel who had served in Bosnia between 1 April 1992 and 6 February 1997 (the Bosnia cohort); and personnel who were serving in the armed forces on 1 January 1991 but who were not deployed to the Gulf conflict (the Era cohort). Special forces were excluded for security reasons. Two stratification variables were used: fatigue and gender.

We were primarily interested in examining the health of Gulf veterans. At stage 1, fatigue was strongly associated with the other health outcomes measured and was one of the most consistently reported symptoms in Gulf veterans. It was our a priori principal outcome measure, and was therefore the basis for the stratified sampling strategy. In order to ensure that the most severely ill were well represented, all male veterans with a fatigue score of 9 or more (511 Gulf, 115 Bosnia and 120 Era) were included. A 1:2 sample of male Gulf veterans with mid-range fatigue scores of 4–8 (484 veterans) was selected, along with all Bosnia veterans (n=333) and Era veterans (n=364) scoring in this range. Finally, a 1:8 sample of veterans with fatigue scores less than 4 was selected in order to represent asymptomatic individuals (250 in each group). All female veterans who completed the stage 1 questionnaire (n=648) were contacted, as women were oversampled in the original cohort. This also allowed us to look for any gender differences in follow-up variables. The total sample size was 3322.

Ethical approvals were obtained for all stages of the study. All respondents at stage 1 gave signed consent to later follow-up.


The questionnaire mainly replicated the measures used at stage 1, including demographic details (age, gender, marital and educational status, alcohol and smoking habits), chronic fatigue scale, medical symptoms (50 items), self-reported medical disorders (39 items), the 12-item General Health Questionnaire (Goldberg, 1972), and the 36-item Medical Outcomes Study Short Form (SF-36) sub-scales for physical health, health perception and functional capacity (Stewart et al, 1988).

As detailed in the original study report (Unwin et al, 1999) we created a brief measure labelled ‘post-traumatic stress reaction’ (PTSR). This was embedded in the wider questionnaire because we did not wish to have an overt PTSD scale, given the social context of ‘Gulf War syndrome’ at the time among the UK service community and also because of the need to keep measures to a minimum. Full details of this are contained elsewhere, but in essence it consisted of four simple stem questions covering the basic psychopathological features of PTSD (Unwin et al, 1999).

Military exposure history was investigated using the same checklist as at stage 1, again tailored for the appropriate deployment. In practice this meant that the Bosnia and Era groups were not asked about the following exposures specific to the Gulf War: smoke from oil-well fires; mustard gas or other blistering agents; having a Scud missile explode in the air or on the ground within 1 mile; hearing chemical alarms sounding; and chemical/nerve gas attack. The questionnaires were tailored according to whether the participant was still in service or not, as ascertained at stage 1.


The reliability of the responses for each exposure at the two time points was quantified by the kappa (κ) statistic, which is a measure of the degree of non-random agreement between measurements of the same categorical variable. If the measurements agree more often than expected by chance, κ is positive; if agreement is complete, κ is 1; if they disagree more often than expected by chance, κ is negative (Last, 1995). A paired t-test was used to compare the number of endorsed exposures at the two time points.
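The definition of κ given above can be illustrated with a minimal sketch. It is not the SPSS routine used in the study; the function name `cohen_kappa` and the ten example answers are hypothetical, and the calculation assumes one yes/no item answered by the same respondents on two occasions.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two paired lists of yes/no (True/False) answers."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    n = len(first)
    # Observed agreement: proportion giving the same answer on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement, from the marginal proportion of 'yes' on each occasion.
    p1 = sum(first) / n
    p2 = sum(second) / n
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical answers of ten respondents to a single exposure item.
stage1 = [True, True, True, True, False, False, False, False, False, False]
stage3 = [True, True, True, False, True, False, False, False, False, False]
kappa = cohen_kappa(stage1, stage3)
print(round(kappa, 2))
```

Here eight of the ten respondents answer consistently (observed agreement 0.80) and chance agreement is 0.52, so κ is about 0.58.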

We followed the same analytical approach to exposure measurements over two time points as Southwick et al (1997). For each of the exposures asked about, variables were created indicating whether the exposure was:

  1. endorsed at both time points (YY);

  2. endorsed at neither time point (NN);

  3. endorsed at time 1 but not at time 2, i.e. no longer endorsed (YN);

  4. not endorsed at time 1 but endorsed at time 2, i.e. newly endorsed (NY).
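The four categories above can be derived mechanically from paired answers. The following is an illustrative sketch only: the function names, the shortened item names and the example respondent are all hypothetical and are not taken from the study data.

```python
from collections import Counter


def recall_category(stage1_yes, stage3_yes):
    """Return 'YY', 'NN', 'YN' or 'NY' for one exposure asked at two stages."""
    return ("Y" if stage1_yes else "N") + ("Y" if stage3_yes else "N")


def summarise(stage1, stage3):
    """Count the four recall categories across one respondent's items."""
    counts = Counter(recall_category(stage1[item], stage3[item]) for item in stage1)
    return {category: counts.get(category, 0) for category in ("YY", "NN", "YN", "NY")}


# Hypothetical respondent; item names are shortened for illustration.
stage1 = {"oil_fire_smoke": True, "small_arms_fire": True,
          "carc_paint": False, "heat_illness": False}
stage3 = {"oil_fire_smoke": True, "small_arms_fire": False,
          "carc_paint": True, "heat_illness": False}
summary = summarise(stage1, stage3)
print(summary)
```

The YN and NY counts per respondent correspond to the numbers of no longer endorsed and newly endorsed exposures used in the later analyses.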

Risk factors for number of newly endorsed and no longer endorsed exposures were explored by examining their median and interquartile range (IQR). Change in health status was examined by creating a change variable (stage 3 minus stage 1) for each health outcome. This was then used as the dependent variable when exploring the effect of newly endorsed and no longer endorsed exposures on health reporting over time. The data were analysed using the Statistical Package for the Social Sciences, version 10 for Windows.
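As a rough illustration of the two summaries described above, the sketch below computes a change score (stage 3 minus stage 1) and a median with interquartile range. The values are invented, the function names are hypothetical, and the quartile method is an assumption: statistical packages define quartiles in slightly different ways, so the figures need not match SPSS output exactly.

```python
from statistics import quantiles


def change_score(stage1_value, stage3_value):
    """Change variable as defined in the text: stage 3 minus stage 1."""
    return stage3_value - stage1_value


def median_iqr(values):
    """Median and (lower quartile, upper quartile) of a list of counts."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Hypothetical health perception scores for one respondent.
delta = change_score(stage1_value=60, stage3_value=50)

# Hypothetical numbers of newly endorsed exposures in one subgroup.
med, (lower, upper) = median_iqr([0, 1, 2, 3, 4])
print(delta, med, lower, upper)
```

A negative change score on a scale where higher values mean better health indicates worsening between the two stages.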


Address information was not available for 13 of the original participants. Valid responses were obtained from 2370 (72%) participants: 907 Gulf (response rate 73.0%), 638 Bosnia (70.2%), 643 Era (69.5%) and 182 Bosnia and Gulf (78.4%). There were 246 (7.4%) refusers. Owing to the absence of an accurate address, 259 (7.8%) never received the questionnaire, despite three mailing attempts, giving a true response rate of 78% (Gulf 79%; Bosnia 77%; Era 76%; Bosnia and Gulf 82%). For the purpose of analysis participants who had been deployed to both the Gulf and Bosnia were combined with the Gulf-only group, as in previous analyses of this cohort (Unwin et al, 1999, 2002; Ismail et al, 2000; Reid et al, 2001). Table 1 gives the distribution of demographic variables in the two study cohorts.
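The overall response rates can be checked with simple arithmetic. The choice of denominators in this sketch is an assumption rather than something stated in the text: it takes the crude rate over the 3322 sampled minus the 13 without address information, and the true rate after also removing the 259 who never received the questionnaire.

```python
sampled = 3322         # total stratified sample
no_address = 13        # no address information available
never_received = 259   # questionnaire never delivered
responders = {"Gulf": 907, "Bosnia": 638, "Era": 643, "Bosnia and Gulf": 182}

total_responders = sum(responders.values())

# Assumed denominators; the paper does not spell these out.
mailed = sampled - no_address
crude_rate = total_responders / mailed
true_rate = total_responders / (mailed - never_received)

print(total_responders, round(100 * crude_rate), round(100 * true_rate))
```

Under these assumptions the cohort counts sum to 2370, and the two rates round to the reported 72% and 78%.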

Table 1

Demographic variables for the Gulf and Bosnia cohorts

The mean number of reported exposures significantly increased over time for the Gulf cohort (Table 2), but the increase in the Bosnia cohort was modest and nonsignificant. The Pearson correlation between the number of reported events at both time points was moderate for both the Gulf cohort and Bosnia cohort (r=0.66 and 0.57 respectively). Table 3 shows the percentage responses of YY, NN, YN and NY, along with the κ values, for each of the variables in the questionnaires given to the Gulf and Bosnia cohorts. In the Gulf cohort the most reliably recalled exposures were: smoke from oil-well fires (κ=0.79); handled prisoners of war (κ=0.71); small arms fire (κ=0.68); Scud missile exploding within 1 mile (κ=0.67); and seeing dismembered bodies (κ=0.64). For the Bosnia cohort the most reliably recalled exposures were: small arms fire (κ=0.61); witnessing anyone dying (κ=0.58); artillery close by (κ=0.57); seeing dismembered bodies (κ=0.55); and diesel or petrochemical fuel on skin. For the 24 exposures common to the two cohorts, the Gulf cohort had higher κ values for all except 4 (bathed in/drank local water; bathed in local pond/river; heat illness; and witnessed anyone dying).

Table 2

Changes in mean number of exposures over time reported by the Gulf and Bosnia cohorts

Table 3

Frequency of recall categories for each exposure in the Gulf and Bosnia cohorts

On average the Gulf cohort had more newly endorsed (NY) than no longer endorsed (YN) exposures (mean NY=2.90, s.d.=2.39; mean YN=1.80, s.d.=1.75), a pattern repeated in the Bosnia cohort (mean NY=2.89, s.d.=2.36; mean YN=1.74, s.d.=1.78) and indicating an overall rise in the number of exposures recalled over time (Table 4). Table 5 gives the risk factors for numbers of newly endorsed and no longer endorsed exposures. For the number of no longer endorsed items, the most significant risk factor was serving status, with those in service having a higher median value for no longer endorsed exposures than those not in service. This pattern held for both the Gulf and Bosnia cohorts. For the newly endorsed exposures over time, being male and younger were associated with higher median values for both the Gulf and Bosnia cohorts, whereas living with a partner was associated with a higher median value in the Bosnia cohort only.

Table 4

Frequency of newly endorsed (‘no to yes’) and no longer endorsed (‘yes to no’) exposure recall in the Gulf and Bosnia cohorts

Table 5

Association between demographic factors and exposure change variables (YN, no longer endorsed; NY, newly endorsed)

Table 6 shows the mean changes in health outcomes for the Bosnia and Gulf cohorts and their association with newly endorsed or no longer endorsed exposures. For the purpose of these analyses, the no longer endorsed (YN) and newly endorsed (NY) exposure recall variables have been recoded to combine the tailed distribution into one group. There was a pattern of increased (i.e. improved) health perception and increased no longer endorsed (i.e. forgotten) exposures over time in the Gulf cohort, which was not replicated in the Bosnia cohort. Conversely, there was a pattern of worsening health perception and increasing new endorsement of exposure variables over time in the Gulf cohort but not the Bosnia cohort. There was no discernible pattern of association between physical health, psychological morbidity or PTSD symptoms (PTSR) and endorsement or non-endorsement of exposures, for either the Gulf or the Bosnia cohort. These analyses were repeated for the Gulf cohort, omitting the five Gulf-specific exposures (smoke from oil-well fires; mustard gas or other blistering agents; having a Scud missile explode in the air or on the ground within 1 mile; chemical/nerve gas attack; and hearing chemical alarms sounding), with no difference in findings (Table 7).

Table 6

Mean change in health outcomes categorised by no longer endorsed and newly endorsed exposures for the Bosnia and Gulf cohorts

Table 7

Mean change in health outcomes categorised by newly endorsed and no longer endorsed exposures for generic military exposures in the Gulf cohort

Table 8 shows the results of the regression analyses for the effects of newly endorsed (‘newly remembered’) and no longer endorsed (‘forgotten’) exposures on health, controlling for age, gender and number of endorsed exposures at stage 1. For the Gulf cohort, the total of newly endorsed exposures was associated with a reduction in health perception, whereas the total of no longer endorsed exposures was significantly associated with improved health perception. This pattern was not replicated in the Bosnia cohort (data not shown).

Table 8

Prediction of change in health status and psychological morbidity by newly endorsed and no longer endorsed exposure recall: hierarchical regression analysis controlling for age, gender and exposure at time 1 for the Gulf cohort


We already know that there is poor agreement between reporting of military events and contemporaneous records of the same events (Keane et al, 1989). In an ideal world we would have objective, independent, contemporary records of exposures and hazards, but this is rarely (if ever) possible, given the ‘friction’ of war, and the impossibility of monitoring all hazards, both known and unknown, at the time. For that reason it is likely that self-report of hazards and exposures will continue to be the basis of the assessment of the consequences of war and military deployments for the foreseeable future.

Roemer et al (1998) documented consistent increases in reports of exposure to seven specific war-related stressors over time in a sample of 460 service personnel deployed to Somalia in a peacekeeping operation. These men and women were assessed in the first year after their return, and then 1–3 years later. At the second assessment PTSD symptoms uniquely contributed to reported exposure scores. Southwick et al (1997) administered on two occasions a 19-item war zone exposure questionnaire and the Mississippi Scale for Combat Related PTSD to 59 members of the National Guard who had been activated for Gulf War duty. They analysed the extent to which recall and forgetting of exposures altered over time, and found that the number of ‘no’ to ‘yes’ changes was significantly and positively related to PTSD symptom severity at the later assessment. In contrast, Bramsen and colleagues, in a study of Dutch peacekeepers, did not find either an increase in reported items over time, or an association between number of changes between the first and second assessments and symptoms of PTSD (Bramsen et al, 2001). Meanwhile, other researchers have used the experience of the Gulf War to study consistency of reports of hazardous ‘toxic’ exposures over time (McCauley et al, 1999; Wolfe et al, 2002), but these studies did not consider the influence of psychological variables on changing patterns of recall.

We now consider two major findings. The first relates to the stability of recall of military hazards, and the second concerns the direction of any observed changes and the influence of psychosocial factors on those changes.

Stability of recall

There was relatively low agreement for reporting of war exposures over time, as shown by the majority of exposures having a κ under 0.6. In general our findings are very similar to those of the only study that used a similar design to look at consistency of recall in smaller numbers of US Gulf War veterans (McCauley et al, 1999). When the questions that we asked were almost identical to those asked in the US survey, the consistency of recall was likewise similar. Hence, in both studies, reporting exposure to smoke from oil fires was associated with good reliability, hearing Scuds detonate was also reasonably reliable, being aware of chemical alarms sounding was moderately reliable as was believing oneself exposed to chemical attack, whereas reporting drinking local water, exposure to chemical agent resistant coating (CARC) paint and exposure to depleted uranium was very unreliable in both studies. The recall of depleted uranium exposure is particularly problematic in the Bosnia cohort, indicated by the lowest κ value (0.05). Perhaps this is a reflection of the enormous publicity given to reports of cancers occurring in peacekeepers from several European nations that happened between the two phases of our study. Likewise, chemical exposures such as CARC paint, other paints/solvents and pesticides on clothing/bedding were associated with the greatest number of ‘no’ to ‘yes’ changes (increased recall) in both Gulf and Bosnia cohorts, and correspondingly low κ values. There has also been intense media concern over all these exposures in the British press in the past decade, including, but not restricted to, the military context.

Change in recall over time

Looking now at the general pattern of change, previous studies have shown that the mean number of events reported over time can either increase (Southwick et al, 1997; Roemer et al, 1998; King et al, 2000) or stay the same (Bramsen et al, 2001). Our study found an increase in the number of events reported over time in the Gulf cohort, but no significant increase in the Bosnia cohort. What this illustrates is the importance of not assuming that all conflicts are the same in terms of their social and psychological impact. Results in our peacekeeping cohort are similar to those of Bramsen and colleagues looking at Dutch peacekeepers, and our Gulf results, although different from those in the peacekeepers, are similar to the findings of Southwick and colleagues in US Gulf veterans. On the other hand, neither we nor Bramsen et al (2001) are able to confirm the substantial increase in reporting of events recorded by Roemer et al (1998) in US peacekeepers in Somalia. However, the US operation in Somalia was beset by difficulties, and involved rather more than peacekeeping, with periods of actual combat.

A second reason why the literature is not entirely consistent is that previous studies have been concerned with either post-traumatic type events and symptoms, or more ‘toxic’ hazards, but rarely with both. In our Gulf studies we have always taken a broader view of hazards and exposures, incorporating exposures such as vaccinations, smoke from oil fires, depleted uranium and so on, which are not traumatic in the customary use of the word, but certainly came to prominence after the 1991 Gulf War. This means that we included more measures of these kinds of hazards than the other studies, but conversely our measure of post-traumatic stress symptoms is less sophisticated.

Recall of exposures and current health perception

We found an association between health perception and both increased reporting and also forgetting of exposures, but this was not true of psychological morbidity or physical health. This association held for the Gulf cohort, but not for the Bosnia cohort. In general we found that the main pattern of change was of increased reporting (‘no’ to ‘yes’) rather than forgetting (‘yes’ to ‘no’).

There may be several explanations for changes in reporting of an event over time. The recall of events might simply become inflated over time; conversely, individuals might have underestimated their reports initially and later given more accurate appraisals. Over the specific interval of this study, there was considerable media attention to the Gulf War and its possible health effects. No doubt this information was incorporated to a lesser or greater degree into the participants’ perspectives on their experiences of the war. The acquisition of new knowledge, from whatever source, could explain both types of changes in item endorsement: elucidating and clarifying events and circumstances that did occur but were not previously known (accounting for ‘no’ to ‘yes’ changes), and delimiting details of experiences previously held to be true (accounting for ‘yes’ to ‘no’ changes). We should also be careful not to assume that changes in reporting equate with changes in memory – it might be that events previously seen as irrelevant and not endorsed on a questionnaire have increased in importance and salience over time, perhaps because of media coverage, rather than being newly remembered. On the other hand, there is evidence from this study that the reporting of events is influenced by current health perception. We found an association between changes in endorsement – both positive and negative – of hazards, and current health perception. Remembering more exposures over time was associated with worsening perception of health; conversely, improved perception of health was associated with forgetting previously recalled exposures. This pattern held after we had removed exposures specific to the Gulf War from the analysis, indicating that the finding was not due to certain key Gulf War exposures that might be strongly associated with health. This finding was not replicated with measures of mental health in general, or PTSD symptoms in particular.

Another important finding is that the association between health perception and recall or non-recall of hazards and exposures was found in the Gulf cohort but not in the Bosnia cohort. This was not because of a differential effect of exposures only encountered in the Gulf and also the subject of intense media scrutiny, since removing these Gulf-specific exposures did not alter the association.

We draw attention to the finding that, contrary to our original predictions, change in recall of exposures was not associated with changes in post-traumatic stress symptoms, but with health perception. However, this is in keeping with our own nested case–control study in which we interviewed both ill and healthy veterans, this time using a standardised psychiatric interview to enable us to make firm diagnoses of PTSD, which was not the case in the epidemiological study. Although psychological morbidity was increased, modestly, in the Gulf cohort, this was rarely due to PTSD (Ismail et al, 2002). This is unsurprising. The Gulf War was not particularly traumatic in the conventional sense for the coalition forces, particularly in the context of past military campaigns that were associated with high rates of classic war-related psychiatric injury. The perceived hazards of the Gulf tended to be those not usually associated with the military setting, and not encapsulated in the formulations of PTSD. Instead, most revolved around fears of environmental exposure and contamination. We have argued elsewhere that the indisputable increase in ill health seen after the Gulf War is better understood as part of the literature on unexplained symptoms and syndromes, rather than conventional PTSD. This may help to explain why changes in recall of exposures were associated more with changes in health perception than with symptoms of post-traumatic stress.

Clinical Implications and Limitations

Clinical implications


  1. Stability of recall of hazardous exposures during military operations differs according to the nature of the exposure – those extensively publicised in the media are particularly problematic.

  2. The number of exposures recalled has increased with the passage of time.

  3. The total number of hazardous exposures reported may reflect not only actual exposures but also current distress.

Limitations


  1. All military conflicts differ in various ways – it cannot be assumed that these findings extrapolate beyond the 1991 Gulf War.

  2. The measurement of self-reported exposures was fairly crude.

  3. Measurement of post-traumatic symptoms was by self-report and not by structured interview.


This study was funded by the US Department of Defense and the UK Medical Research Council. Neither organisation has had any input into the design, conduct, analysis or reporting of the study. The views expressed are the authors’ own.

We thank all the servicemen and service women who gave freely of their time. We also thank Dr Inge Bramsen and Professor Ariel Shalev for their helpful advice.

  • Received January 15, 2003.
  • Revision received April 22, 2003.
  • Accepted May 6, 2003.

