Recovery from psychotic illness: a 15- and 25-year international follow-up study


Background Poorly defined cohorts and weak study designs have hampered cross-cultural comparisons of course and outcome in schizophrenia.

Aims To describe long-term outcome in 18 diverse treated incidence and prevalence cohorts. To compare mortality, 15- and 25-year illness trajectory and the predictive strength of selected baseline and short-term course variables.

Method Historic prospective study. Standardised assessments of course and outcome.

Results About 75% traced. About 50% of surviving cases had favourable outcomes, but there was marked heterogeneity across geographic centres. In regression models, early (2-year) course patterns were the strongest predictor of 15-year outcome, but recovery varied by location; 16% of early unremitting cases achieved late-phase recovery.

Conclusions A significant proportion of treated incident cases of schizophrenia achieve favourable long-term outcome. Sociocultural conditions appear to modify long-term course. Early intervention programmes focused on social as well as pharmacological treatments may realise longer-term gains.

In the last quarter of the 20th century, evidence for a more promising long-term course of schizophrenia accumulated across Swiss, German and British studies (Huber et al, 1975; Ciompi, 1980; Shepherd et al, 1989). Reconciling such findings with the sometimes markedly different picture reported in other studies is complicated, however, by unresolved design and measurement problems, including sampling biases, poor case definition, inadequate standardisation of outcome measures and differential attrition in follow-up.

Building upon earlier groundwork, the recently completed International Study of Schizophrenia (ISoS), coordinated by the World Health Organization (WHO), attempted to address these and related problems in a long-term follow-up study of 14 culturally diverse treated incidence cohorts and four prevalence cohorts, totalling 1633 subjects. Reported here are: crude and adjusted mortality (standardised mortality ratios; SMRs); 15- and 25-year cross-sectional outcomes for symptoms, disability and resource utilisation; longitudinal patterning of aggregated 15-year course data for first-episode cases and assessment of a ‘late recovery’ effect; and (for a selected group of incidence cohorts) the predictive strength of selected baseline, centre and short-term (2-year) outcome variables. Comprehensive analyses and centre-specific reports will appear in the collaborative report on ISoS (Hopper et al, 2001). The investigators have also agreed to post the relevant research data on the internet (see


The study protocol and instruments used in ISoS are described in detail elsewhere (Sartorius et al, 1996).

Study cohorts

The International Study of Schizophrenia builds upon the results of earlier studies. The International Pilot Study of Schizophrenia (IPSS; WHO, 1973) reported better 2- and 5-year outcomes for patients in ‘developing’ centres. This finding was reinforced in the subsequent Determinants of Outcome of Severe Mental Disorders (DOSMeD) study (Jablensky et al, 1992), which traced treated incidence cohorts in geographically defined areas, using standardised methods of case finding and diagnosis. A third WHO study, the Assessment and Reduction of Psychiatric Disability (RAPyD; Wiersma, 1996), identified a further set of treated incidence case groups assembled from diverse catchment areas. These three cohorts offered a unique opportunity to carry out 15-year (DOSMeD, RAPyD) and 25-year (IPSS) follow-up studies. The cultural diversity of the sample was further enhanced by adding two incidence case groups and one prevalence case group to the field research centres (FRCs) recruited from the earlier studies (12 from DOSMeD and RAPyD, three from IPSS).

Table 1 lists the 14 treated incidence cohorts and four prevalence cohorts, totalling 1633 subjects. The treated incidence case group of 1171 subjects included 766 DOSMeD subjects, defined as “cases in the early stages of the illness, evaluated as closely as possible to the point of their first contact with any service or helping agency” (Jablensky et al, 1992); 205 RAPyD subjects, selected by screening administrative records for recent onset of non-affective psychotic disorder; 200 additional subjects, including 100 subjects in the Hong Kong FRC randomly selected through a record review of first admissions for schizophrenia in clinics serving a defined catchment area, and 100 subjects in Madras, who earlier met Feighner criteria for schizophrenia (Thara & Eaton, 1996).

Table 1

Subjects with baseline psychotic diagnosis by centre; subjects with non-psychotic or missing diagnosis for total International Study of Schizophrenia (ISoS) analysis group

The prevalence case group included 373 cases from three centres from the original IPSS and 89 subjects from Beijing, who had met criteria for schizophrenia in a community epidemiological survey (Shen et al, 1986).

Table 1 also shows the distribution of ‘living’ subjects (those with sufficient follow-up data for analysis), dead subjects, and those participants lost at follow-up (those with insufficient follow-up data for analysis).

Baseline diagnoses

The original IPSS and DOSMeD cohorts were identified before the widespread use of operationalised diagnostic criteria such as ICD-10-DCR (WHO, 1993) and DSM-IV (American Psychiatric Association, 1994). For the IPSS, DOSMeD and RAPyD case groups, project diagnoses were also made in terms of then-conventional criteria of ICD-8 (IPSS) or ICD-9 (DOSMeD and RAPyD) in accordance with the WHO glossary (WHO, 1978). The newly added case groups were identified through prior epidemiological studies (Beijing and Madras both used the Present State Examination (PSE), and the latter employed Feighner criteria) or by re-analysis of original case notes (Hong Kong).

Ideally, we would have wished to reassemble baseline and supplementary data for all ISoS cases and, blind to subsequent course and outcome, apply a diagnostic algorithm for one of the current operationalised research criteria. Resource constraints precluded such an exercise, however, and (apart from the three added centres) we decided to utilise the original project diagnosis made at baseline. In some FRCs, these were consensus diagnoses arrived at by two or more investigators; in others, diagnoses were made by an individual investigator without further review by a research team. The reliability of this process was investigated in the original DOSMeD study and found to be acceptable (Jablensky et al, 1992).

For ISoS, original ICD-8 or ICD-9 diagnoses were converted to ICD-10 diagnoses, as were case notes or project diagnoses provided by the three additional centres, using WHO cross-walk rules (WHO, 1994). This method provided the closest approximation to a standardised diagnostic system given the constraints of the study. For the descriptive analyses, study subjects in each of the groups were further divided into three analytical groups: schizophrenia only (F20.0-20.3; 20.5-20.9); psychotic disorders other than schizophrenia (F10.5, 22, 23, 24, 25, 28, 29, 30-34); and total psychoses (both groups combined).
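The grouping rule just described can be written as a small lookup. What follows is a minimal sketch in Python, not the procedure actually used in the study. The function name `analytical_group` is hypothetical, and the handling of codes the text does not mention (for example F20.4, or a bare F20 with no subtype) is an assumption: such codes are returned as unclassified.

```python
from typing import Optional

# Two-digit ICD-10 categories listed in the text under
# "psychotic disorders other than schizophrenia" (F10.5 is handled separately).
OTHER_PSYCHOSIS_MAJORS = {22, 23, 24, 25, 28, 29, 30, 31, 32, 33, 34}

# Fourth-character subtypes of F20 included in "schizophrenia only":
# F20.0-F20.3 and F20.5-F20.9 (F20.4 is not listed in the text).
SCHIZOPHRENIA_SUBTYPES = set("0123") | set("56789")


def analytical_group(icd10_code: str) -> Optional[str]:
    """Map an ICD-10 code such as 'F20.0' to an analytical group.

    Returns 'schizophrenia', 'other psychoses', or None when the code
    is not covered by the rule quoted in the text (assumption).
    """
    code = icd10_code.strip().upper()
    if not code.startswith("F"):
        return None
    major, _, minor = code[1:].partition(".")
    if len(major) != 2 or not major.isdigit():
        return None
    major_number = int(major)
    subtype = minor[:1]  # empty string when no subtype is given

    if major_number == 20:
        return "schizophrenia" if subtype in SCHIZOPHRENIA_SUBTYPES else None
    if major_number == 10:
        return "other psychoses" if subtype == "5" else None
    if major_number in OTHER_PSYCHOSIS_MAJORS:
        return "other psychoses"
    return None
```

The 'total psychoses' group is then simply every subject for whom the function returns a group at all.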

Tracing exercise

The ISoS research was approved through local FRC ethics committees (or their functional equivalents). Tracing methods varied considerably (and are described in detail in centre reports; Hopper et al, 2001), but efforts generally began with the last known address and worked forward in time, consulting clinical records, address directories, death registries, primary care practitioners and family contacts. Tracing was facilitated in those areas where psychiatric case registries were available or research subjects had been followed clinically. Logistical obstacles could prove formidable where communication facilities were primitive or in areas of high residential mobility.

Follow-up assessments

  1. The principal psychopathology assessment tool was the Present State Examination (PSE-9; Wing et al, 1974), supplemented in most cases by the Schedule for the Assessment of Negative Symptoms (SANS; Andreasen, 1989) and the Psychological Impairment Schedule (PIRS-2; WHO, 1992).

  2. Current functioning was assessed using a modified version of the WHO Disability Assessment Schedule (DAS; Jablensky et al, 1980) and the Global Assessment of Functioning Disability and Symptoms scales (GAF-D and GAF-S; adapted from Endicott et al, 1976).

  3. Course of illness — covering fluctuations in symptoms, treatment resource utilisation, residential status, involvement with work and kin — was constructed using the Life Chart Schedule (LCS), a newly developed instrument that drew upon the earlier work of Harding et al (1987).

  4. A global assessment of current clinical status, taking into account all information gathered with respect to course, symptoms and functioning, was rendered using Bleuler's criteria (Bleuler, 1978) for the past month only. This approach was taken in order to enable constructions of overall illness course patterns that could be compared with other longitudinal studies.

Training and reliability

Most of the principal investigators had extensive experience of using the PSE (and several other ISoS instruments) from prior IPSS, DOSMeD and RAPyD studies. Two of the three added FRCs had also collected PSE data at baseline. Because ISoS introduced new instruments and recruited some inexperienced fieldworkers, training and reliability were approached systematically. (Further details are available in Siegel et al, 2001.)

The ISoS follow-up instruments were combined in a single package, accompanied by procedural guidance for local training and implementation in fieldwork exercises. The full package was introduced at three training seminars for principal investigators spread over 2 years. At each seminar, experienced users demonstrated the instruments. Training also made use of videotapes and/or case vignettes.

Reliability was examined both ‘within centre’ and ‘between centres’. Five centres also carried out a maintenance reliability check midway through their follow-up procedures. Within centres, the rater-observer method used live or videotaped interviews, with observers making independent ratings. Each FRC aimed to assess 10 patients, a target met in nine centres. One further centre completed ratings for five cases, and a centre with only a single clinician carried out a test-retest assessment. No reliability assessments were carried out in Moscow, Honolulu and Rochester, where single investigators made nearly all ratings. Mannheim and Cali did not carry out within-centre reliability exercises because their data collection had been completed through existing studies before ISoS formally began.

The between-centre reliability exercise used five videotapes of patients with a history of psychosis comparable to the ISoS incidence cases. Videotaped interviews were done in English with English-speaking subjects; the tapes were accompanied by interview transcripts and structured case vignettes that approximated case notes. Videotapes and vignettes were circulated for reliability assessments of the PSE, DAS and LCS. An average of 29 researchers rated each case. All but one centre assessed at least one tape; the majority, three or more.

Because simple pairwise agreement between raters does not take chance agreement into account, the κ statistic is commonly used. Low variability for an item, however, may make the κ statistic imprecise. We therefore devised a customised method for assessing interrater reliability. We tested only items for which ratings were categorical and defined a range of variability sufficient for reporting κ. Variables were considered to have insufficient variation if the percentage scoring ‘absent’ for that item approached 0% or 100% (i.e. 0-10% or 90-100%).
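To make the two ideas in this paragraph concrete, the sketch below computes a chance-corrected κ for a pair of raters and screens an item for sufficient variation before κ is reported. It is an illustration only. The study's customised multi-rater method is not described in enough detail here to reproduce, so the two-rater (Cohen) form, the function names, and the treatment of exactly 10% and 90% as 'insufficient' are all assumptions.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on categorical ratings.

    Returns None when kappa is undefined (both raters used one category only).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return None
    return (observed - expected) / (1 - expected)


def has_sufficient_variation(ratings, absent="absent", low=10.0, high=90.0):
    """True when the percentage rated 'absent' lies strictly between low and high.

    The 10%/90% bounds follow the text; boundary handling is an assumption.
    """
    pct_absent = 100.0 * sum(r == absent for r in ratings) / len(ratings)
    return low < pct_absent < high


# Hypothetical ratings of one item by two raters on four cases.
rater_1 = ["absent", "absent", "present", "present"]
rater_2 = ["absent", "present", "present", "present"]
if has_sufficient_variation(rater_1):
    print(cohen_kappa(rater_1, rater_2))  # prints 0.5
```

In this toy example the raters agree on three of four cases (0.75) but would agree on half by chance, which is why κ is 0.5 and not 0.75.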

Analytical approach


Cause of death was determined from available data at each centre (death certificates, autopsies, etc). SMRs were calculated against a standard population that depended upon the location of the centres. For all but three centres in India, the standard was the corresponding national population; for the Indian centres, it was the national population up to 1990 and the relevant state-specific populations thereafter. For most countries, WHO mortality rates were used (WHO, 1996), although other sources were also used. Statistical significance of SMRs was determined by the method of Breslow & Day (1987). With the exception of Rochester (where no deaths were recorded), for each centre a Kaplan-Meier (Kaplan & Meier, 1958) survival distribution was computed for all deaths.
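For readers less familiar with these two quantities, the sketch below shows how a standardised mortality ratio and a Kaplan-Meier survival curve are computed in principle. It is a simplified illustration with hypothetical inputs, not the study's code: the function names and example figures are assumptions, and the significance test of Breslow & Day is not implemented.

```python
def standardised_mortality_ratio(observed_deaths, strata):
    """SMR = observed deaths / deaths expected at reference-population rates.

    strata: iterable of (person_years, reference_death_rate) pairs, one per
    age/sex/period stratum of the cohort.
    """
    expected = sum(person_years * rate for person_years, rate in strata)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed_deaths / expected


def kaplan_meier(times, died):
    """Kaplan-Meier survival estimates.

    times: follow-up time for each subject.
    died:  1 if the subject died at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each death time.
    Subjects censored at a death time are counted as still at risk.
    """
    data = sorted(zip(times, died))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical cohort: 12 deaths observed where 4 would be expected.
print(standardised_mortality_ratio(12, [(1000, 0.002), (500, 0.004)]))
# Hypothetical follow-up times in years (1 = died, 0 = censored).
print(kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 1, 0]))
```

An SMR above one means more deaths occurred in the cohort than the reference population's rates would predict for the same person-years at risk.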

Clinical outcomes and social disability in subjects traced alive

Outcome data were aggregated (unweighted) within each analytical group. Although data are not presented by individual FRCs, the range of values is reported for the principal outcome variables. Associations between diagnostic group and pattern of course variables were assessed using the χ² statistic.
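As an illustration of the χ² test of association mentioned above, the following sketch computes the Pearson χ² statistic and its degrees of freedom for a contingency table of diagnostic group by course pattern. The counts are invented for the example, the function name is hypothetical, and the sketch assumes every row and column total is positive. Converting the statistic to a P value requires the χ² distribution, which is left to a statistics package.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    table: list of rows of observed counts, e.g. rows = diagnostic groups,
    columns = course patterns.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows = schizophrenia / other psychoses,
# columns = episodic / continuous course.
statistic, dof = chi_square([[10, 20], [20, 10]])
print(round(statistic, 3), dof)  # prints 6.667 1
```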

Predictor variables

Predictive relationships between baseline and early course variables and longer-term outcome were examined for the DOSMeD cohorts only (461 subjects), because these were the only subjects for whom a sufficiently consistent data-set existed. Of those with a baseline diagnosis of psychosis, 60.2% were found alive and assessed; 30.4% were lost to follow-up; and 9.4% had died. A separate examination (Drake et al, 2001) uses a propensity score method (Rosenbaum & Rubin, 1983) to assess the potential bias introduced by basing our analysis on the alive cohort only, and finds it negligible.

Predictive analyses used two measures of outcome at follow-up derived from the GAF scale (symptoms and disability). Course of illness over the entire follow-up period was described in terms of patterns constructed from the data on symptom presence and strength over time obtained via the LCS. These were divided into ‘complete’ remission (no residual symptoms between episodes, return to premorbid functioning) and ‘incomplete’ remission (or continuous psychosis). In this, we followed the precedent of Jablensky et al (1992) and Craig et al (1997) to enable comparisons with earlier analyses of short-term outcome.

The set of variables chosen was based on a conceptual framework that links environmental, predisposing and clinical factors to outcomes, and takes account of factors that may mediate their impact. Selected items included variables reported to be related to outcome in the prior reanalysis of short-term outcome (Craig et al, 1997): age at first contact, gender, marital status, contacts with close friends, history of drug or alcohol use, type of onset and diagnosis. We also included others suggested in recent literature, for example symptoms available from the PSE which form part of the ‘deficit syndrome’ (Kirkpatrick et al, 1989). Duration of untreated psychosis, cited by several researchers (Larsen et al, 1996), could not be used because of insufficient reliable data.

The diagnosis was the baseline ICD-9 clinical consensus diagnosis, converted to ICD-10 and grouped into five categories (schizophrenia, schizoaffective disorder, acute schizophrenia disorder, bipolar disorder/depression and other psychoses). The short-term (2-year) outcome variables were ‘percentage of time experiencing psychosis’ and ‘pattern of course’. Only one subject-level mediating variable could be included in the model: family involvement with treatment over the follow-up period. Medication was not used because univariate analysis showed that in the first 2 years nearly all subjects received antipsychotic medication, while later in the follow-up period the odds of being on medication for those with poor outcome were much higher than for those with good outcome. To find that medication use predicted outcome would thus have been uninformative.

Coarse-grained area-level mediating variables were constructed for social stability, prevailing conventions of illness attribution, and configuration and strength of kinship ties, as well as for certain aspects of the treatment system (e.g., existence of national health insurance). Such constructions made use of qualitative data supplied in answer to open-ended questions by key informants as well as the descriptions of locales by investigators (Hopper et al, 2001). A Delphi-like procedure, employing five independent raters, was used to arrive at composite scores (Siegel et al, 2001).

A two-part statistical procedure was used for the two GAF outcome variables. First, stepwise linear regression was applied to distinguish the variables most highly related to outcome. In these models, sample sizes were considerably reduced from the full cohort size because subjects were excluded from the analysis if data were missing for any regression variable. A second model was fitted using only the variables distinguished in part one. The analyses presented from the clinical outcomes step used a data-set that is about 90% of the total (419/461). Two sets of analyses were carried out: in the first, centre was included as a categorical variable; in the second, centre was replaced by the set of descriptive locale variables.
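The reason for the two-part procedure is easier to see with a toy example of complete-case selection: a subject missing any one candidate variable drops out of the first model, whereas refitting on only the selected variables recovers most of the sample. The sketch below illustrates that bookkeeping only; it does not perform the regression itself, and the variable names and records are hypothetical.

```python
def complete_cases(records, variables):
    """Keep only records with a non-missing value for every listed variable."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]


# Hypothetical subject records; None marks a missing value.
records = [
    {"gaf_s": 70, "pct_time_psychotic": 10, "age": 24, "drug_use": None},
    {"gaf_s": 55, "pct_time_psychotic": 60, "age": None, "drug_use": 0},
    {"gaf_s": 80, "pct_time_psychotic": 5, "age": 31, "drug_use": 0},
    {"gaf_s": None, "pct_time_psychotic": 40, "age": 28, "drug_use": 1},
]

# Part one: every candidate variable must be present.
all_candidates = ["gaf_s", "pct_time_psychotic", "age", "drug_use"]
# Part two: only the outcome and the variables selected in part one.
selected = ["gaf_s", "pct_time_psychotic"]

print(len(complete_cases(records, all_candidates)))  # prints 1
print(len(complete_cases(records, selected)))        # prints 3
```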


Initial results from Nottingham (Mason et al, 1995; Harrison et al, 1996), Groningen (Wiersma et al, 1998), Sofia (Ganev et al, 1998) and disability data for a selected European cohort (Wiersma et al, 2000) have already been published.

Attrition in follow-up

Loss to follow-up across all case groups of ISoS ranged from an average 30% in the DOSMeD cohort to 10% in the RAPyD cohort, amounting to 399 subjects or about one-quarter of the total (Table 1). Of the 1005 living participants, the treated incidence (n=776) and prevalence (n=229) cohorts were sub-categorised by baseline ICD-10 diagnosis into the three analytical groups: schizophrenia (n=644); psychoses other than schizophrenia (n=361); and all psychoses (n=1005). Mean ages at follow-up were: schizophrenia incidence group, 41.4 years; schizophrenia prevalence group, 51.2 years; other psychoses incidence group, 42.8 years; and other psychoses prevalence group, 54.5 years.

The literature frequently cites gender, mode of onset and short-term course of illness as predictors of outcome (Jablensky et al, 1992; Harrison et al, 1996). To assess bias in the follow-up cohort, we compared their distribution in the living groups versus those subjects lost to follow-up. In the incidence cohorts, there was a higher percentage of females in the living groups compared with those subjects lost to follow-up in the total group with psychotic illnesses (P<0.10) and the group with schizophrenia (P<0.05) (51.9% v. 45.4% and 49.4% v. 38.4%, respectively). The opposite occurred in the prevalence cohorts, where a (non-significant) higher proportion of living males appeared in both analytical groups. Percentages were almost equal for the other psychoses group in both incidence and prevalence cohorts. Mode of onset was divided into sudden/acute and slow/insidious. In all six analytical groups, there was a higher (but non-significant) proportion of acute/sudden cases in the living group, although the power of our analysis may have been low. Favourable early course of illness was defined as complete remission between psychotic episodes. Five of the six analytical groups showed a higher (but non-significant) percentage of complete remissions in the living group compared with those lost at follow-up; in the other psychoses incidence group, the percentages were essentially equal. Overall, therefore, attrition in follow-up appears biased toward males and subjects with slower onsets and less favourable early pattern of course.

Completed assessments

For the main instruments, 88.6% of participants still living at follow-up were interviewed with the PSE. Seventy-nine per cent had completed DAS scales (with a further 14% rated on the basis of informants) and 85% had completed LCS (with a further 15% completed on the basis of information from informants and extracted from case records).


The range of agreement according to the κ statistic for each of the three principal instruments (PSE, DAS and LCS) is shown in Table 2. We adopted the convention that moderate to very good agreement is reflected by κ values above 0.4 (Fleiss, 1981).

Table 2

Inter- and intracentre reliability exercise

In the intracentre reliability exercise before the start of the study, only two variables for the three instruments had κ values less than 0.4; for 108 items, κ values were higher, indicating acceptable agreement between raters before additional training. For the intercentre exercise, the best split of κ values showing sufficient variation was set at 25-75% (i.e., between 25 and 75% of ratings were scored absent). Moderate to very good agreement was achieved for 86% of PSE items, 80% of DAS items and 77% of LCS items.


Table 3 lists the sample size, deaths and SMRs for each centre. Differences in selection criteria and length of follow-up preclude drawing specific conclusions, but a few patterns are discernible. In non-industrialised countries, the majority of known deaths were listed as natural, whereas the reverse was true in industrialised countries. Suicide accounted for the majority of deaths from unnatural causes. In contrast to reports from other studies, we did not find that infectious diseases contributed disproportionately to the natural death toll.

Table 3

Death counts by centre and cause

All but five centres had SMRs significantly greater than one. Of the five exceptions, two were located in non-industrialised countries (Cali and Madras), two were in Eastern European countries (Sofia and Moscow), and the fifth, Rochester, recorded no deaths. The eight highest SMRs were in industrialised centres (including Hong Kong). When examined by gender and age at study entry, a much more variable pattern emerged (see Craig et al, 2001): young-entry male patients tended to have significantly elevated SMRs among the industrialised centres, whereas females showed no significant pattern of mortality risk by centre type. Survival probability estimates at 5, 10 and 15 years showed few significant differences between centres.

Cross-sectional measures of outcome

Table 4 shows global outcome data for all psychoses, schizophrenia and other psychoses subjects in the incidence and the prevalence cohorts as defined above. Over half of the living ISoS participants (56% of the incidence cohort and 60% of the prevalence cohort) were rated ‘recovered’ on the Bleuler scale, and nearly half had not experienced psychotic episodes in the past 2 years. Ratings of current symptomatology and functioning closely tracked the more global assessments. Outcomes for schizophrenia were less favourable than for other psychoses across all domains, although the percentage rated as globally ‘recovered’ was still close to 50%. Wide variation is apparent across centres, although numbers for some cells were small.

Table 4

Symptoms and social disability at follow-up, percentages (range across centres)

Measures of most recent 2-year course of disorders

LCS — living arrangements

The majority of living participants, in both incidence and prevalence cohorts and across all diagnostic groups, had spent most of the past 2 years living with family or friends. Small percentages, ranging from 3.4% among people in the other psychoses prevalence cohort to 11.6% among people in the schizophrenia incidence cohort, had spent the majority of the past 2 years in institutional settings. Twelve people (1.5%) in the incidence cohort and four (1.7%) in the prevalence cohort had been homeless (defined as living on the street or in a designated shelter) at some point in the past 2 years; 4.6% of the incidence group had been homeless at some point during the follow-up period.

LCS — course type and negative symptoms

Table 5 shows course of illness over the past 2 years and presence of prominent negative symptoms by course type. Just over one-quarter (26.8%) of all people in the incidence cohort had been continuously ill over the past 2 years: 33.6% of those in the schizophrenia group and 14.6% of people with other psychoses. This pattern was more common in the prevalence group (37.5%), especially among people with schizophrenia (46.4%). Negative symptoms were prominent in almost half the people with continuous illness, across all analytical groups, except those people in the other psychoses incidence group; they were relatively infrequent in those with episodic psychoses.

Table 5

Course of illness over past 2 years (corresponding percentage of the group rated as having prominent negative symptoms)

LCS — employment

As Table 6 shows, employment figures (including paid work and housework) for people in the incidence and prevalence cohorts across the two analytical groups range from 56.8% among people in the schizophrenia incidence cohort, to 73.9% or better in both the schizophrenia and other psychoses groups in the prevalence cohort. Quality of work for full-time workers and those doing housework was rated as satisfactory in 80% or higher, excepting the schizophrenia incidence cohort for housework. Although men were much more likely to be employed at paid work, performance ratings for women who did such work were comparable to those of men.

Table 6

Working (full-time employment or housework) in eligible subjects

LCS — help seeking and sources of support

One-fifth (20.9%) of people in the incidence cohort had been hospitalised for psychiatric reasons at some point in the past 2 years. A considerably larger proportion (69.3%) had received some form of substantial psychiatric treatment (chiefly medication), and a minority (23.8%) had received other professional help. Over half (52.9%) had been on neuroleptic medication for most of the past 2 years. Rates for all forms of treatment were lower in the prevalence cohort.

Longitudinal course

Course of illness patterns were constructed (Table 7) using a modified version of Bleuler's typology, which combines mode of onset (acute versus insidious), overall trajectory (simple versus episodic), and end state (Bleuler recovered or minimal symptomatology (good), versus moderate or severe impairment (poor)). With the sole exception of the group with schizophrenia in the prevalence cohort, episodic illness (whether acute or insidious in onset) accounted for favourable outcomes in well over half of all participants. Of the episodic group, over two-thirds of people (68%) in the schizophrenia incidence cohort had at least two illness episodes.

Table 7

Modified Bleuler course types

Sixteen per cent (15.7%) of schizophrenia cases in the incidence cohort showed evidence of late improvement at the 15-year follow-up. That is, they were described as having continuous symptoms (simple course type) in the second epoch, but were rated as recovered in the third. A similar finding emerged (18.4%) in the prevalence cohort.

Stricter operationalisation of recovery

To be meaningful, the concept of recovery requires careful operationalisation. When defined as receiving both a Bleuler rating of recovered and a GAF-disability rating greater than 60, 37.8% of people with schizophrenia and 54.8% of people with other psychoses in the incidence cohort qualified. Because such rates include subjects who had received treatment or been hospitalised in the past 2 years, they may be viewed as ‘treated recovered’ rates. If we adopt a more stringent concept, excluding those with a recent (in the past 2 years) episode of treatment (as in Mason et al, 1995), the proportion of recovered subjects falls sharply, to only 16.3% of those people with schizophrenia in the incidence cohort and 35.8% of people without schizophrenia. Even so, subjects with a GAF score greater than 60 may still include those with ‘some difficulties’ in areas of social or occupational functioning. How such persisting difficulties may affect overall quality of life we were unable to assess.
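The two definitions of recovery discussed above differ by a single extra condition, which the following sketch makes explicit. It is an illustration under stated assumptions: the class and function names are hypothetical, the subjects are invented, and 'treatment in the past 2 years' is collapsed into a single flag covering both treatment and hospitalisation.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    bleuler_recovered: bool      # Bleuler global rating of 'recovered'
    gaf_disability: int          # GAF-disability score
    treated_past_2_years: bool   # any treatment or hospitalisation in past 2 years


def treated_recovered(subject: FollowUp) -> bool:
    """Bleuler rating of recovered AND GAF-disability greater than 60."""
    return subject.bleuler_recovered and subject.gaf_disability > 60


def stringent_recovered(subject: FollowUp) -> bool:
    """As above, and additionally no episode of treatment in the past 2 years."""
    return treated_recovered(subject) and not subject.treated_past_2_years


def rate(subjects, criterion) -> float:
    """Proportion of subjects meeting the given recovery criterion."""
    return sum(criterion(s) for s in subjects) / len(subjects)


# Four hypothetical subjects.
cohort = [
    FollowUp(True, 75, False),   # recovered under both definitions
    FollowUp(True, 70, True),    # recovered, but recently treated
    FollowUp(True, 55, False),   # Bleuler recovered, disability score too low
    FollowUp(False, 65, False),  # not rated recovered
]
print(rate(cohort, treated_recovered))    # prints 0.5
print(rate(cohort, stringent_recovered))  # prints 0.25
```

Because the stringent definition only ever removes subjects from the recovered group, its rate can never exceed the 'treated recovered' rate.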

Predictors of outcome

For GAF-S and GAF-D, in all regression models, the percentage of time experiencing psychotic symptoms in the first 2 years was the strongest predictor (first to enter stepwise regression models) of both symptom and disability scores. For symptom scores, the only additional variable that entered the model in which centre was included (Table 8) was ‘centre’ itself. In the analysis excluding centre but including area variables, two additional variables entered the model: in order of entry, baseline diagnosis and age at study entry. Those with a baseline diagnosis of schizophrenia had significantly poorer symptom scores than those in the acute schizophrenia and the bipolar groups, as did persons who were younger at study entry.

Table 8

Multiple regressions for Global Assessment of Functioning Symptoms and Disability (GAF-S and GAF-D) scales: analyses with centre included (in), area variables excluded; centre excluded, area variables included

For disability, the variables in addition to the percentage of time experiencing psychotic symptoms that entered the model in which the centre is included were, in order of entry: centre and diagnosis. In a comparison of all centres with Chandigarh Urban, only those in Dublin, Prague and Rochester had significantly poorer disability scores. The finding on diagnosis is similar to that for the symptom score when centre is excluded, except that those with schizoaffective disorder were also significantly better than those with schizophrenia. In the second analysis that excluded the variable centre but included area variables, variables entering the model were, by order of entry: diagnosis, blunted affect, national health insurance, history of drug use and family involvement in treatment. The finding for diagnosis indicated that those with acute schizophrenia and schizoaffective disorders had better outcomes than those with schizophrenia. Those without national health insurance had better disability scores. Family involvement with treatment decisions, presence of blunted affect from the baseline PSE, and premorbid drug use were associated with poorer disability scores.

In an earlier analysis, Craig et al (1997) re-analysed 2-year outcome data from the DOSMeD study in order to examine the role that the centre plays in predicting course. A statistical approach (recursive partitioning or Classification and Regression Trees (CARTs); Breiman et al, 1984) was used in which predictor variables are not pre-grouped. Their analysis confirmed earlier findings (Jablensky et al, 1992) that a strong predictor of ‘pattern of course’ was a certain grouping of centres. However, in the CART analysis that grouping, while resembling the ‘developing/developed’ dichotomy reported in earlier WHO studies, was not identical to it. Subjects in Nottingham and Prague joined the developing countries to form a sub-group with more favourable outcomes. A CART analysis of the present follow-up data of the DOSMeD cohort (Siegel et al, 2001) found that Chandigarh Rural and Nottingham had better outcome than other centres; however, this was true only for those subjects whose percentage of time experiencing psychotic symptoms in the early part of their illness (first 2 years) was 13.5% or less.


The ISoS results compare favourably with those of other longitudinal studies of schizophrenia, with rates of recovery for aggregated data ranking among the highest published to date. Global outcomes at 15 and 25 years were favourable for over half of all people followed up. Striking heterogeneity was seen across the different dimensions of outcome, however. Despite the limitations of the diagnostic criteria utilised, a baseline diagnosis of ICD-10 schizophrenia was consistently associated with poorer outcomes in symptoms, social disability and resource utilisation. Short-term course of illness strongly predicted long-term outcome, but local environment (centre effects) played a significant role in determining both symptoms and social disability.

Limitations of the data

In far-flung collaborative studies of this sort, data quality control is always of concern. Experience, repeated training, reliability checks, and centralised data management and monitoring served, we believe, to minimise this hazard. Success in follow-up was encouraging. Still, findings require qualification in view of certain methodological constraints. The marked heterogeneity in outcome across ISoS centres, for example, might be partially due to biases in case ascertainment at follow-up and assessment methods.

The ISoS living cohort differs from those lost to follow-up in ways that potentially bias outcome data in a more favourable direction. Making the most conservative assumption (that all cases lost to follow-up fell into the poor outcome categories), we recalculated results. Thus 'corrected', the percentage of Bleuler recovered cases falls from 56% to 41% for all psychoses, and from 48% to 35.7% for schizophrenia. A similar range of adjusted outcome values was obtained for the GAF-S and GAF-D global outcome variables. However, for the predictive analyses of the DOSMeD cohort, the Drake et al (2001) examination found that biasing effects introduced by basing analysis on the living cohort alone were negligible. We conclude that our findings, while attenuated somewhat, cannot be explained by systematic biases in follow-up.
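The arithmetic behind such a worst-case correction can be sketched in a few lines of Python. If every untraced case is counted as a poor outcome, the recovered proportion is simply multiplied by the fraction of cases whose outcome is known. The function names are hypothetical, and the traced fractions are back-calculated from the percentages quoted above rather than taken from the study tables; how deaths entered the denominator is not stated here, so this is an approximation.

```python
def worst_case_rate(observed_rate: float, traced_fraction: float) -> float:
    """Recovery rate when every untraced case is counted as a poor outcome."""
    if not 0.0 <= observed_rate <= 1.0:
        raise ValueError("observed_rate must lie between 0 and 1")
    if not 0.0 < traced_fraction <= 1.0:
        raise ValueError("traced_fraction must lie in (0, 1]")
    return observed_rate * traced_fraction


def implied_traced_fraction(observed_rate: float, corrected_rate: float) -> float:
    """Traced fraction implied by an observed rate and its worst-case correction."""
    if observed_rate <= 0.0:
        raise ValueError("observed_rate must be positive")
    return corrected_rate / observed_rate


# Reported percentages: 56% -> 41% (all psychoses), 48% -> 35.7% (schizophrenia).
traced_all = implied_traced_fraction(0.56, 0.41)
traced_schizophrenia = implied_traced_fraction(0.48, 0.357)
print(f"implied traced fraction, all psychoses: {traced_all:.2f}")
print(f"implied traced fraction, schizophrenia: {traced_schizophrenia:.2f}")
```

Both implied fractions lie close to three-quarters, in line with the proportion of the cohort reported as traced.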

Because they appear so consistently across the ISoS centres, the case attrition data are also informative concerning likely biases in similar follow-up studies. Intuitively, we might predict that those with better outcomes would be more likely to be lost to follow-up because of greater social and occupational mobility and reluctance to consent to interview about past events with strongly negative associations. We found a trend in the opposite direction, with males, those with slow illness onset and individuals with poor short-term course more likely to be lost to follow-up. The implication, that those most at risk of deteriorating are being lost from psychiatric aftercare, should concern mental health service providers.

Reliability of case definition at entry is another potential source of error. Although ICD-10 diagnoses were generated using crosswalk algorithms, the original diagnoses in ICD-8 or ICD-9 were not operationalised, and criteria may have been applied inconsistently. However, for all but one of the study cohorts (Hong Kong) baseline diagnoses were established using the same semi-structured mental state assessment tool (the PSE), followed by consensus research diagnosis meetings in most centres. This is likely to have substantially reduced information variance and the chances of systematic diagnostic bias operating between centres.

Although reliability in the application of follow-up instruments was tested by formal reliability exercises (assessment of videotaped interviews), implementation was incomplete, as not all investigators in every centre took part. The majority of investigators did, however, complete three or more exercises. Nor was there sufficient variability across all the items in the three principal instruments to ensure the sensitivity of κ as a measure of reliability, and therefore the number of test items was restricted. Nevertheless, we found a fair to very good measure of agreement both within and across the FRCs, even with videotaped interviews having been conducted in English, which was not the first language for many study participants. Further, the DAS requires knowledge of local norms and conventions in order to rate dimensions of social disability, which may have reduced intercentre reliability for this instrument.
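Why low item variability undermines κ can be shown with a small Python sketch. It assumes the two-rater (Cohen) form of the statistic; the study may have used a multi-rater variant, and the ratings below are invented for illustration. Both pairs of raters agree on 18 of 20 items, yet κ is high when the ratings are evenly spread and close to zero when nearly all items receive the same rating.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (1 = symptom present, 0 = absent); NOT study data.
# Evenly spread item: 18 of 20 ratings agree.
balanced_a = [1] * 10 + [0] * 10
balanced_b = [1] * 9 + [0] + [0] * 9 + [1]
# Item with little variability: again 18 of 20 ratings agree.
skewed_a = [0] * 18 + [1, 0]
skewed_b = [0] * 18 + [0, 1]

kappa_balanced = cohen_kappa(balanced_a, balanced_b)
kappa_skewed = cohen_kappa(skewed_a, skewed_b)
print(f"balanced item: kappa = {kappa_balanced:.2f}")
print(f"skewed item:   kappa = {kappa_skewed:.2f}")
```

Restricting the reliability analysis to items with adequate variability, as described above, avoids reporting low κ values that reflect skewed ratings rather than poor agreement.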

We also regret that we were unable to carry out a systematic needs assessment (e.g. for social support, residential care, treatment) using standardised measures. The LCS rating of living independently in the community proved an unreliable proxy for met need in terms of either the appropriateness of residential placement or the level of social support. We learned, for example, that although three-quarters of the 259 subjects from Beijing, Madras, Hong Kong and Sofia (the group for whom sufficient data were available) were coded as living with family, for a substantial percentage (39.5%) this amounted to a surrogate institutional arrangement in terms of the intensive support required. Similarly, in the judgement of the Nottingham field-worker, about 10% of cases rated as living with family or friends could not sustain independent function in the community were this level of support to be withdrawn (Harrison et al, 1994). Residential ratings of living independently must be interpreted with great caution, as the burden of care on many families may have been considerable. Rates of institutionalisation (hospitalisation and supervised residence) are also problematic, reflecting administrative policy and resource availability at least as much as need for care. Proportions in supported residential accommodation were higher for industrialised FRCs with the exception of Hong Kong and Moscow.

Working concepts of recovery require qualification as well. Our study relied heavily upon absence of symptoms, social disability and resource utilisation. This should not be equated with recovery of the level of function achieved before the onset of illness, and even less with the recovery of lost potential. Commentators advocating the user-perspective (e.g. Prior, 2000) focus on the individual's ‘recovering a meaningful and fulfilling life’ within the limitations of the disorder, an important judgement our data do not allow us to make. Nor did our methods permit investigation of the subtle, but potentially powerful effects of operant ‘cultural’ or environmental factors on course of illness and restoration of function. Finally, as noted earlier, data on the timing of onset — more precisely, on the emergence of early symptoms — that would have enabled us to estimate duration of untreated psychosis were not reported consistently enough to permit analysis of its potential effect.


For those few centres in which a substantial proportion were lost to follow-up, it is possible that the mortality risk was somewhat overstated, as the status of deceased individuals is probably ascertained more readily than that of living ones. This limitation is mitigated by the use of statistical methods that take into account censored follow-up data. It is also possible that cross-centre variation in the reliability and validity of routine death certificate data may have resulted in unnatural causes of death going undetected, particularly in non-industrialised centres.

Such limitations notwithstanding, the ISoS findings extend and refine the findings of earlier studies in populations from largely industrialised settings. First, despite marked variations in SMR across cultural settings, the absolute mortality risk for patients with schizophrenia and related psychoses is high and remarkably similar in all the countries examined. Second, differences between developed and non-industrialised countries may be pronounced. For centres in industrialised countries, our results tend to replicate earlier reports of increased mortality risk for unnatural death and for younger patients. For non-industrialised countries, however, they suggest different, setting-specific risk factors, reinforcing the need for future studies to examine mortality in a broader range of cultures (see Simpson & Tsuang, 1996).
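One way in which a similar absolute risk can coexist with different SMRs is sketched below in Python. The SMR divides observed deaths by the number expected if the cohort had died at the age-specific rates of the local general population, so the same cohort experience yields a lower ratio where background mortality is higher. All names and figures are hypothetical and are not ISoS data; the study's own calculation may have used finer strata.

```python
from typing import Dict


def expected_deaths(person_years: Dict[str, float], reference_rates: Dict[str, float]) -> float:
    """Deaths expected if the cohort died at general-population rates, by stratum."""
    return sum(py * reference_rates[stratum] for stratum, py in person_years.items())


def smr(observed: int, person_years: Dict[str, float], reference_rates: Dict[str, float]) -> float:
    """Standardised mortality ratio: observed deaths divided by expected deaths."""
    expected = expected_deaths(person_years, reference_rates)
    if expected <= 0.0:
        raise ValueError("expected deaths must be positive")
    return observed / expected


# Hypothetical cohort followed in two settings: identical deaths and person-years.
observed = 12
person_years = {"under_45": 1500.0, "45_plus": 500.0}

# Hypothetical general-population death rates per person-year.
low_background = {"under_45": 0.001, "45_plus": 0.005}
high_background = {"under_45": 0.003, "45_plus": 0.009}

crude_rate = observed / sum(person_years.values())
smr_low = smr(observed, person_years, low_background)
smr_high = smr(observed, person_years, high_background)
print(f"crude death rate in both settings: {crude_rate:.4f} per person-year")
print(f"SMR against low background mortality:  {smr_low:.2f}")
print(f"SMR against high background mortality: {smr_high:.2f}")
```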

Predictors of long-term course and outcome for the DOSMeD cohort

The robust finding of this analysis is that, of all the variables thought to relate to long-term outcome, the strongest predictors were measures of early illness course. Percentage of time spent experiencing psychotic symptoms in the 2 years following onset was the best predictor for all outcome measures: the lower the percentage of time with psychotic symptoms, the better the longer-term symptom and disability scores, as well as overall course of illness. A CART analysis of the data revealed that pattern of course in the first 2 years was also related to long-term pattern of course. Also, a presenting diagnosis of schizophrenia (in contrast to, depending on the analysis, either schizoaffective disorder, acute schizophrenia or bipolar disorder/depressive psychosis) enhanced the likelihood of poor symptom or poor disability scores. Although type of onset did not play an independent explanatory role for any study measures, it may be viewed as having an effect at one remove, since it was significantly related to percentage of time with psychotic symptoms in the 2-year follow-up study of this cohort (Jablensky et al, 1992). In addition, baseline variables of age (younger), family involvement in treatment, history of drug use, symptoms of blunted affect and, from the CART analysis, lack of close contact with friends, as well as loss of interest, were somewhat related to an increased likelihood of poor course or outcome.

The regression models also highlight the role that cultural variables may play in explaining outcome. The centre variable entered the stepwise regression models for both symptoms and disability at the second step, indicating that rates of recovery do vary by location. The finding that those with national health insurance had poorer outcomes may merely approximate a developing-country effect, since those without such insurance are in large part those in the two Indian centres (n=113). The other centres without national health insurance, Honolulu and Rochester, contribute only an additional 57 persons. Overall, the data show that early poor outcome predicts a continuation of incomplete remission and unfavourable long-term status in both symptom and disability assessments. Premorbid signs and symptoms suggesting poor social adjustment also enhance the likelihood of adverse long-term outcomes, as do symptoms of a deficit syndrome. That said, living in certain areas appears to improve chances of recovery, even for subjects with unfavourable early-illness course. The precise nature of these setting- or culture-specific effects remains to be unravelled.

Implications for long-term management of psychotic illness

The ISoS mortality findings suggest that, for industrialised countries, clinical (and public health) efforts to reduce the causes of unnatural death (e.g. suicide prevention strategies) could reduce the mortality risk, especially among young males. In non-industrialised countries, the predominance of natural deaths may be due to differences in access to medical care between persons with schizophrenia and the general population; if so, health services should aim at earlier and better diagnosis and at treatment of comorbid medical conditions. Our findings fail to replicate earlier studies whose results (increased deaths from infectious disease among psychiatric patients) suggested compromised immune systems. More recent studies, especially in settings heavily affected by the AIDS epidemic, may produce different findings.

The ISoS data underline the heterogeneous nature of recovery in schizophrenia across ‘linked but separate’ domains of outcome. This is nowhere more apparent than in the 20% or more of individuals who managed to sustain employment despite some persisting symptoms and/or disability. These data highlight the need for a better understanding of the unevenness of restored function, and the different schedules of recovery that may obtain across different domains, at varying paces, to disparate degrees of thoroughness (Strauss & Carpenter, 1972; Davidson & Strauss, 1992). Future research should examine how such discrepancies are handled by people, how they infiltrate identity (Weingarten, 1994) and how variant schedules of recovery might be supported or subverted by community-based interventions and treatment programmes.

Our data also give evidence of a 'late recovery' effect. If one allows for differences in measurement of mode of onset, there are striking parallels in the course of illness patterns found by Bleuler (1978) and the trajectories of incidence ISoS cases. Approximately half of each group was characterised by episodic course type with good global outcome. Episodic courses eventuating in poor outcomes, which accounted for nearly one-third of the Vermont cohort (Harding et al, 1987), amounted to only 9% of people in the Bleuler and ISoS incidence cohorts. Conversely, at the 15-year point of follow-up of the incidence cohort, we observed a pattern of good outcome following a simple (that is, continuous impairment) course type in 16% of incidence cases. Although our effect was weaker than that reported by Harding et al (1987), these data support the case for maintaining therapeutic optimism and for re-instituting employment and rehabilitation programmes despite failures earlier in the course of the illness.

Our data underscore but fail to elucidate the contribution of sociocultural factors to the longer-term course and outcome of psychosis. Unravelling the epigenetic puzzle of how complex gene-to-environment interactions influence different aspects of life course in schizophrenia may eventually require dimensional rather than categorical conceptualisations of psychosis. However, those interactions occurring at the time of early symptom development may prove most ‘toxic’ in terms of long-term outcome, offering a critical ‘window’ for therapeutic intervention. This will need to embrace social as well as pharmaceutical measures, but because our data are silent with respect to local cultural responses to early behavioural change, we cannot yet say how early intervention strategies might be designed to enhance their therapeutic impact.

The overarching message of ISoS is that schizophrenia and related psychoses are best seen developmentally as episodic disorders with a rather favourable outcome for a significant proportion of patients. Because expectation can be so powerful a factor in recovery, patients, families and clinicians need to hear this. At the same time, the hope these data represent should not be overdrawn. Subjects with poor prognostic indicators were overrepresented in those lost to follow-up, and mortality was elevated throughout. A relatively modest proportion (about one-sixth) was judged as having achieved complete recovery, in the sense of no longer requiring any form of treatment.

Despite these notes of caution, the ISoS findings join others in relieving patients, carers and clinicians of the chronicity paradigm which dominated thinking throughout much of the 20th century. They offer robust reasons for therapeutic optimism and point to a critical ‘window of opportunity’ in the early period of syndromal differentiation. If the course of these disorders depends upon short-term outcome and sociocultural setting, then early intervention programmes and intensive engagement strategies may have a favourable impact upon the evolution of symptoms over the next 15-25 years, at least for some patients. The dual challenge remaining is first to open up further the ‘black box’ of culture subsumed under centre effects and then to find ways of translating customary practices into interventions by design. Future studies will require both qualitative and quantitative methods to explore the characteristics of environment that promote recovery.

Clinical Implications and Limitations


  • Striking heterogeneity in the long-term course of schizophrenia challenges conventional notions of chronicity and therapeutic pessimism.

  • The predictive strengths of early pattern of course and sociocultural setting support the case for early intervention strategies that encompass social as well as pharmacological measures.

  • Evidence of late recovery in a significant minority of subjects should encourage innovative rehabilitation and employment programmes in those with long-term illness, despite earlier failures.


  • Cases lost to follow-up were biased toward males and subjects with slower onsets and less favourable early pattern of course.

  • The reliability study did not include all investigators in every centre, which may have led to some uncontrolled variability.

  • Although we report a number of administrative outcomes (such as residence at follow-up) we did not carry out a systematic needs assessment.


This paper is based on the data obtained in the ISoS, a project sponsored by the World Health Organization, the Laureate Foundation (USA) and the participating centres. The ISoS, a transcultural investigation coordinated by WHO in 18 centres in 14 countries, was designed to examine patterns of long-term course and outcome of severe mental disorders in different cultures, to develop further methods for the study of characteristics of mental disorders and their course in different settings, and to strengthen the scientific basis for future international multidisciplinary research on schizophrenia and other psychiatric disorders seen in a public health perspective.

The chief collaborating investigators in the field research centres are: Aarhus, A. Bertelsen; Agra, K. C. Dube; Beijing, Y. Shen; Cali, C. Leon; Chandigarh, V. Varma and (since 1994) S. Malhotra; Dublin, D. Walsh; Groningen, R. Giel and (since 1994) D. Wiersma; Hong Kong, P. W. H. Lee; Honolulu, A. J. Marsella; Madras, R. Thara; Mannheim, H. Hafner and (since 1989) W. an der Heiden; Moscow, S. J. Tsirkin; Nagasaki, Y. Nakane; Nottingham, G. Harrison; Prague, C. Skoda; Rochester, L. Wynne; and Sofia, K. Ganev. Coordination of the data collection, experimental design and data analysis was carried out by the WHO Collaborating Centre at the Nathan S. Kline Institute for Psychiatric Research under the direction of E. Laska. At WHO headquarters, Geneva, the study has been coordinated by N. Sartorius (until August 1993), by W. Gulbinat (September 1993–April 1996) and by A. Janca (since May 1996).

The authors also gratefully acknowledge the contribution of Ezra Susser, Sarah Conover and John Cooper to the early design and development of instruments for the ISoS.

  • Received May 15, 2000.
  • Revision received November 20, 2000.
  • Accepted December 12, 2000.

