Efficacy of an evidence-based cognitive stimulation therapy programme for people with dementia
Randomised controlled trial


Background A recent Cochrane review of reality orientation therapy identified the need for large, well-designed, multi-centre trials.

Aims To test the hypothesis that cognitive stimulation therapy (CST) for older people with dementia would benefit cognition and quality of life.

Method A single-blind, multi-centre, randomised controlled trial recruited 201 older people with dementia. The main outcome measures were change in cognitive function and quality of life. An intention-to-treat analysis used analysis of covariance to control for potential variability in baseline measures.

Results One hundred and fifteen people were randomised within centres to the intervention group and 86 to the control group. At follow-up the intervention group had significantly improved relative to the control group on the Mini-Mental State Examination (P=0.044), the Alzheimer’s Disease Assessment Scale – Cognition (ADAS–Cog) (P=0.014) and Quality of Life – Alzheimer’s Disease scales (P=0.028). Using criteria of 4 points or more improvement on the ADAS–Cog the number needed to treat was 6 for the intervention group.

Conclusion The results compare favourably with trials of drugs for dementia. CST groups may have worthwhile benefits for many people with dementia.

Psychological treatments for dementia, such as reality orientation, have been in use for nearly half a century (Taulbee & Folsom, 1966). Despite their longevity, their effects remain open to question and many studies have been either small, of poor methodological quality, or both (Orrell & Woods, 1996). Reality orientation operates through the presentation and repetition of orientation information, either throughout the day (‘24-hour’) or in groups meeting on a regular basis to engage in orientation-related activities (‘classroom’) (Brook et al, 1975). A recent Cochrane review found that reality orientation was associated with significant improvements in both cognition and behaviour, but also identified a need for large, well-designed, multi-centre trials (Spector et al, 1998, 2000). The results of the Cochrane review were used to develop a programme of evidence-based therapy focused on cognitive stimulation (Spector et al, 2001). The cognitive stimulation therapy was piloted in three care homes and one day centre, leading to improvements in cognition and depression for people participating in the programme compared with the control group (Spector et al, 2001). The aim of the study reported here was to evaluate the effects of cognitive stimulation therapy groups on cognition and quality of life for people with dementia, in a single-blind, multi-centre, randomised controlled trial (RCT).



A total of 169 day centres and residential homes with a minimum of 15 residents each (to maximise numbers of suitable participants) were contacted in the participating areas (the National Health Service Trusts for Barking, Havering and Brentwood, Tower Hamlets, Enfield, and Camden and Islington, as well as Quantum Care, a voluntary organisation in Hertfordshire). The researchers investigated all interested centres (day centres and residential homes) to determine whether there were adequate numbers of potential participants with dementia, by using an inclusion criteria flow chart. At least eight eligible people were required in each centre, because five were needed for the group, leaving three or more control participants.

Inclusion criteria

People were considered suitable for full assessment and participation if they:

  1. met the DSM–IV criteria for dementia (American Psychiatric Association, 1994);

  2. scored between 10 and 24 on the Mini-Mental State Examination (MMSE; Folstein et al, 1975);

  3. had some ability to communicate and understand communication – a score of 1 or 0 in questions 12 and 13 of the Clifton Assessment Procedures for the Elderly – Behaviour Rating Scale (CAPE–BRS; Pattie & Gilleard, 1979);

  4. were able to see and hear well enough to participate in the group and make use of most of the material in the programme, as determined by the researcher;

  5. did not have major physical illness or disability which could affect participation;

  6. did not have a diagnosis of a learning disability.

Design and process of randomisation

In residential homes and day centres with at least eight suitable participants, full assessments were conducted in the week prior to, and the week following, the intervention by a researcher masked to group membership. Groups were established in 23 centres (18 residential homes and 5 day centres). Of 292 people screened, 201 participants (115 treatment, 86 control) entered the study (Fig. 1). There were more people in the intervention group because frequently centres had only eight or nine suitable participants, and five of these had to be randomised to the intervention group.

Control group participants from each centre continued with usual activities while the group therapy was in progress. For most residential homes ‘usual activities’ consisted of doing nothing. For the other centres, usual activities included games such as bingo, music and singing, arts and crafts, and activity groups.

Within each centre, one researcher (the therapist) ran the group and the other (the assessor) conducted initial and follow-up assessments, ensuring masking. Participants were randomly allocated into treatment and control groups. The assessor ordered the names of the selected participants for each centre alphabetically and allocated numbers in sequence according to the total number to be randomised (8–10). The therapist independently placed identical numbered discs into a sealed container and the first five numbers to be drawn out formed the treatment group.

The appropriate multi-centre and local research ethics committees granted ethical approval. Informed consent was obtained from participants. After an explanation of the study, those who agreed to participate were asked to sign the consent form in the presence of a witness (usually a member of staff). People whom the staff felt were too impaired to understand the nature of the study were excluded, and it usually followed that they were too impaired to participate in the groups.
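
The within-centre allocation described above (alphabetical ordering, sequential numbering, then a blind draw of five discs) can be mimicked in a few lines. This is only an illustrative sketch: the function name `allocate_centre`, the `seed` argument and the placeholder participant labels are assumptions made here, and a pseudo-random draw stands in for the physical discs used in the trial.

```python
import random


def allocate_centre(names, group_size=5, seed=None):
    """Return (treatment, control) name lists for one centre.

    Hypothetical helper: a pseudo-random draw replaces the sealed
    container of numbered discs described in the text.
    """
    rng = random.Random(seed)
    ordered = sorted(names)  # assessor: alphabetical order
    numbered = {i + 1: name for i, name in enumerate(ordered)}  # numbers in sequence
    drawn = rng.sample(sorted(numbered), group_size)  # therapist: first five discs drawn
    treatment = [numbered[n] for n in sorted(drawn)]
    control = [numbered[n] for n in sorted(numbered) if n not in drawn]
    return treatment, control


# Placeholder labels for a centre with eight eligible people.
example_names = [f"Participant {c}" for c in "HBFACGED"]
treatment, control = allocate_centre(example_names, seed=1)
print(len(treatment), len(control))  # 5 3
```

With eight eligible people the split is always five to three, which is why the trial as a whole ended up with more treatment than control participants.
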
Using the results from our pilot study, we estimated that a sample size of 64 in each group was required to achieve 80% power to detect a difference in means of 2 points (MMSE). This assumed that the common standard deviation was 4.0, using a two-group t-test with a 0.05 (two-sided) significance level.
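
The figure of 64 per group can be roughly reproduced from the stated assumptions (difference of 2 MMSE points, standard deviation 4.0, two-sided significance level 0.05, power 80%). The sketch below uses the usual normal-approximation formula and then adds a commonly used small-sample correction for the t-test; both the function name `n_per_group` and the choice of correction are assumptions made here, not a description of the software the authors used.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-group comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    n_normal = 2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2
    # Assumed correction: adding z_alpha^2 / 4 approximates the t-test result.
    n_corrected = n_normal + z_alpha ** 2 / 4
    return n_normal, n_corrected


n_normal, n_corrected = n_per_group(delta=2.0, sd=4.0)
print(ceil(n_normal), ceil(n_corrected))  # 63 64
```

The normal approximation gives 63 per group; allowing for the t-distribution raises this to 64, matching the number quoted above.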

Fig. 1

Profile of trial and attrition. MMSE, Mini-Mental State Examination.

The programme

The 14-session programme ran twice a week for 45 min per session over 7 weeks. It was designed using the theoretical concepts of reality orientation and cognitive stimulation. It largely focused on a trial of cognitive stimulation (Breuil et al, 1994), which was identified through the systematic reviews as having the most significant results. Topics included using money, word games, the present day and famous faces. The programme included a ‘reality orientation board’, displaying both personal and orientation information, including the group name (chosen by participants). The board was to provide a focus, reminding people of the name and nature of the group, and creating continuity. Each session began with a warm-up activity, typically a softball game. This was a gentle, non-cognitive exercise, aiming to provide continuity and orientation by beginning all sessions in the same way. Sessions focusing on themes (such as childhood and food) allowed the natural process of reminiscence but had an additional focus on the current day. Multisensory stimulation was introduced when possible. Sessions encouraged the use of information processing rather than factual knowledge. For example, in the ‘faces’ activity, people were asked, ‘Who looks the youngest?’ ‘What do these people have in common?’, with factual information as an optional extra. A range of activities for each session enabled the facilitator to adapt the level of difficulty of the activities to take into account the group’s cognitive capabilities, interests and gender mix. The 14-session programme has been previously described in depth (Spector et al, 2001).

Assessment measures


Cognition

The primary outcome variable was the MMSE (Folstein et al, 1975). This is a brief, widely used test of cognitive function, with good reliability and validity. The secondary outcome variable was the Alzheimer’s Disease Assessment Scale – Cognition (ADAS–Cog; Rosen et al, 1984); this is a more sensitive scale measuring cognitive function and including more items that assess short-term memory. It is frequently used in drug trials as the principal cognitive measure, allowing the effects of cognitive stimulation therapy to be compared with antidementia drugs.

Quality of life

The Quality of Life – Alzheimer’s Disease scale (QoL–AD; Logsdon et al, 1999) was used as a secondary outcome variable; it has 13 items covering the domains of physical health, energy, mood, living situation, memory, family, marriage, friends, chores, fun, money, self, and life as a whole. This brief, self-report questionnaire has good internal consistency, validity and reliability (Thorgrimsen et al, 2003).


Communication

The Holden Communication Scale (Holden & Woods, 1995), which is completed by staff, covers a range of social behaviour and communication variables, including conversation, awareness, pleasure, humour and responsiveness.


Behaviour

The Clifton Assessment Procedures for the Elderly – Behaviour Rating Scale (CAPE–BRS; Pattie & Gilleard, 1979) covers general behaviour, personal care and behaviour towards others. It has good reliability and validity, and was included to assess the overall level of functional impairment and dependency.

Global functioning

The Clinical Dementia Rating scale (CDR; Hughes et al, 1982), completed by the researcher, provided a global rating of dementia severity at baseline.


Depression

The Cornell Scale for Depression in Dementia (Alexopoulos et al, 1988) rates depression in five broad categories (mood-related signs, behavioural disturbance, physical signs, biological functions and ideational disturbance) using information from interviews with staff and participants. Good reliability and validity have been demonstrated.


Anxiety

Anxiety was assessed using the Rating Anxiety in Dementia scale (RAID; Shankar et al, 1999); this rates anxiety in four main categories (worry, apprehension and vigilance, motor tension, and autonomic hypersensitivity) using interviews with staff and participants. It has good validity and reliability.


Analysis

Data were entered into the Statistical Package for the Social Sciences, version 10 for Windows (SPSS, 2001). An intention-to-treat analysis was conducted and analysis of covariance (ANCOVA) was chosen as the method of analysis because it controls for variability in pre-test scores (the ‘covariate’; Vickers & Altman, 2001). Age, gender and baseline score on the scale being examined were entered as covariates, together with ‘centre’ entered as a random factor, because treatment was defined as participation in the group programme within the confines of one of the 23 centres.
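
The paper does not state the model as an equation. One plausible way to write the analysis described above, offered only as a sketch with notation introduced here for illustration, is a linear model for the follow-up score of participant i in centre j:

```latex
Y^{\text{post}}_{ij} = \beta_0
  + \beta_1\, T_{ij}
  + \beta_2\, Y^{\text{pre}}_{ij}
  + \beta_3\, \text{age}_{ij}
  + \beta_4\, \text{gender}_{ij}
  + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Here T is 1 for the treatment group and 0 for control, so the coefficient on T is the adjusted between-group difference at follow-up, and u is the random centre effect. Whether the SPSS specification matched this form exactly is an assumption.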


Of the 115 participants in the treatment group 97 were assessed at follow-up, as were 70 of the 86 control participants (Fig. 1). The mean attendance was 11.6 sessions (s.d.=3.2, range 2–14) and 89% of people attended seven or more sessions. Table 1 compares treatment and control participants’ characteristics in terms of age, gender and baseline scores and provides information about the total participant group. We attempted to collect data on years of education but in the vast majority of instances this was not available. None of the participants had been prescribed an acetylcholinesterase inhibitor.

Table 1

Characteristics and scores of participants at baseline assessment

Difference between groups at follow-up

In Table 2, significance levels set at 5% are presented from the ANCOVA comparing groups (treatment and control) in all instances. Significant results for covariates (centre and/or gender) are included when they occurred. At follow-up, the treatment group had significantly higher scores on MMSE and ADAS–Cog and rated their quality of life (QoL–AD) more positively than the control group did, and the confidence intervals for the differences between groups were above zero for all three measures. There was a trend towards an improvement in communication in the treatment group (P=0.09) but no difference between the groups in terms of functional ability (CAPE–BRS), anxiety or depression. Centre emerged as a significant covariate in relation to ADAS–Cog, Holden Communication Scale, Cornell and RAID scales, and CAPE–BRS score. A number of gender differences emerged. Quality of life for women in the treatment group improved more than that for the men, whereas the quality of life for men in the control group deteriorated significantly more than it did for the women. Dependency levels (CAPE–BRS) and communication (Holden) also deteriorated for men in the treatment group (though less than for the men in the control group). In contrast, women in the treatment group improved on both measures whereas women in the control group deteriorated (though less than the men in the control group).

Table 2

Change from baseline in measures of efficacy at follow-up: intention-to-treat analysis

Numbers needed to treat

The number needed to treat (NNT) is the number of people who need to receive a particular intervention in order to achieve one additional favourable outcome. It is calculated as the reciprocal of the ‘absolute risk reduction’: the difference in the proportion experiencing a specified adverse outcome between the control and treatment groups. Using the formulae and framework provided in a previous study (Livingston & Katona, 2000) including acetylcholinesterase inhibitors, two NNT analyses using the ADAS–Cog scores were performed in this study (Table 3):

Table 3

Numbers needed to treat: comparison of cognitive stimulation therapy with antidementia drug trials

  1. when calculating no deterioration (score ≥0) as improvement and any deterioration (< 0) as adverse, 50% of the treatment group improved compared with 37% of the control group: thus eight people needed to be treated in order for one to benefit (95% CI 4–144);

  2. when calculating an increase in score of 4 or over as improvement and 3 or below as adverse, 30% of the treatment group improved compared with 13% of the control group: thus six people needed to be treated in order for one to benefit (95% CI 4–17).
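
The two NNT values above follow directly from the reported proportions. The short sketch below recomputes them; the function name `nnt` is introduced here for illustration, the inputs are the rounded percentages quoted in the text, and rounding up to the next whole person is the usual convention assumed here. The confidence intervals are not reproduced, since they depend on the underlying group counts.

```python
from math import ceil


def nnt(p_improved_treatment, p_improved_control):
    """Number needed to treat: reciprocal of the absolute difference in proportions."""
    difference = p_improved_treatment - p_improved_control
    if difference <= 0:
        raise ValueError("treatment shows no benefit; NNT is undefined")
    return ceil(1 / difference)  # round up to a whole person


print(nnt(0.50, 0.37))  # no deterioration counted as improvement -> 8
print(nnt(0.30, 0.13))  # improvement of 4 or more points -> 6
```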


Major findings

This evidence-based programme of cognitive stimulation therapy showed significant improvements in two measures of cognition, including the MMSE (the primary outcome measure), and also in the QoL–AD (a secondary outcome measure). The improvements in cognition are consistent with the findings of earlier studies (Woods, 1979; Breuil et al, 1994). The overall ADAS–Cog (a secondary outcome measure) change indicated improvement in a number of factors. With the exception of explicit rehearsal in place orientation, which is directly questioned, there was no obvious reason why participation in groups should have had a direct practice effect on any other tasks in the ADAS–Cog, such as word recall or recognition. This suggests that generalised cognitive benefits resulted from inclusion in the programme. Nevertheless, such groups probably need to be ongoing, at least weekly, to increase the chance of the relative benefits being sustained.

Contrary to the Cochrane review (Spector et al, 1998), we found no change in behaviour in this study, although the former review found only one individual trial that demonstrated a significant difference in behaviour (Baines et al, 1987). Changes in cognition might be unlikely to have any impact on areas of functional dependence described in the CAPE–BRS, such as feeding and dressing (Woods, 1996). Other authors (Zanetti et al, 1995) have suggested that behavioural outcome measures are often not sensitive enough to detect the functional impact of cognitive stimulation programmes. There were positive trends in communication, which had not been shown empirically in any of the earlier reality orientation trials. Communication is a factor that is likely to deteriorate in individuals moving into residential care, yet the small-group context was probably novel for many of the participants, perhaps exercising long unused communication skills. It is not known why women reacted more favourably to the programme. For men, being in the minority in most groups could have created discomfort and a reluctance to communicate.

Variation between centres

There was a significant variation between centres from baseline to follow-up in measures of cognition (ADAS–Cog), behaviour, mood and communication. Some centres appeared more institutionalised, and in these there were poor staff–patient relationships and functioning was not optimised. Thus, it might have been the case that the effects of groups were not strong enough to combat the effects of a negative environment. Moreover, in some centres with a better quality of social environment, perhaps including a local programme of activities, residents might have been functioning near their optimum, leaving little scope for improvement. Groups including people at different stages of dementia were sometimes difficult to run. People with milder dementia could become irritated by people with more severe cognitive impairment, and observing their confusion might have been off-putting and hence detrimental to the group process. Pitching the sessions at an appropriate level was clearly important. It is possible that the social interaction provided by the groups could have been of benefit, but our Cochrane review (Spector et al, 1998) found that in RCTs social groups appeared to be of no benefit to cognition.


Rigorous inclusion criteria were necessary to ensure a reasonably homogeneous participant group, and were aimed at recruiting people who were able to participate and less likely to leave the study. This meant many centres were excluded because of insufficient numbers. Cluster randomisation might have been useful in allowing centres with five to seven suitable candidates to be included, but would have had the disadvantage that large numbers of clusters would be needed to ensure statistical power and external validity (Bowling, 1997). More importantly, the significant difference between centres on many scales in this study shows that it would have been difficult to ensure the comparability of clusters. Outside the context of a research trial, groups would probably be selected through clinical judgement, considering how people would mix; and people with poorer vision or hearing, or with greater communication difficulties, might be included to make up numbers.

There were a number of other limitations. In the randomisation procedure ideally the generation of the allocation sequence, enrolment into the trial and allocation to group should be separate and performed by different, independent staff. Differences in control conditions between centres meant that the ‘control group’ was not homogeneous; however, ‘usual activities’ generally meant doing nothing. Last, in contrast to the results on the primary and secondary outcome measures which were rated directly with the participants, none of the scales rated by staff (e.g. mood, communication, behaviour) showed significant improvements for the cognitive stimulation therapy group. Staff perceptions about the therapy groups might have introduced a bias into the ratings of the scales. We took precautions to avoid this by ensuring that the local member of staff who acted as co-therapist was not involved in completion of the rating scales. However, it is likely that other staff could have been aware of which people were in the groups and this might have influenced their ratings.

Comparison with acetylcholinesterase inhibitors

Number-needed-to-treat analyses were previously performed for three acetylcholinesterase inhibitors: tacrine, rivastigmine and donepezil (Livingston & Katona, 2000). Analyses were performed identically in this study, considering two levels of change as improvement, so that a direct comparison could be made (Table 3). Calculations were also included for galantamine, using the results from another trial (Wilcock et al, 2000). These comparisons show that for small improvements or no deterioration, the programme was not quite as effective as rivastigmine, donepezil and galantamine. For greater improvements (4 or more points), cognitive stimulation therapy did as well as galantamine or tacrine and substantially better than rivastigmine or the lower dosage of donepezil (5 mg). Only the higher dosage of donepezil (10 mg) had a smaller NNT. These results are particularly interesting considering that the drug programmes lasted for 24 weeks, 26 weeks or 30 weeks compared with only 7 weeks of cognitive stimulation therapy. However, since these drug studies applied only to Alzheimer’s disease, and since drug therapy and psychological therapy are different forms of treatment, some caution is required when interpreting these comparisons.

Mechanisms for change

There are a number of possible mechanisms of change. The learning environment during sessions was designed to be optimal for people with dementia, for example by focusing on implicit memory and integrating reminiscence and multi-sensory stimulation throughout the programme. Stimulation in the group could improve cognition and might make participants feel more able to communicate. The groups could work against the excess disability due to the ‘malignant social psychology’ of a negative social environment (Kitwood, 1997) by improving self-esteem through social stimulation and encouragement. Finally, groups positively reinforced questioning, thinking and interacting with other people, objects and the environment. This effect might have extended beyond the groups, with people communicating more effectively and responding to the environment and to others.

Recent research has highlighted strategies that can involve memory training and cognitive stimulation programmes. Providing participants with ‘didactic training’ (forming mental images of words) and ‘problem solving’ (practical steps to manage daily problems, such as using notebooks and calendars) has been shown to result in small but short-lived changes in memory performance (Zarit et al, 1982). The use of external memory aids, such as diaries, calendars, large clocks and clear signposting, is becoming increasingly common for people with dementia. Research is also identifying ways of creating an optimal learning environment: for example, ‘errorless learning’ involves encouraging people, when learning new information, only to respond when they are sure that they are correct, thus avoiding interference effects; and ‘spaced retrieval’ involves learning and retaining information by recalling information over increasingly long periods (Clare & Woods, 2001).


This study found improvements in both the primary (MMSE) and secondary (ADAS–Cog and QoL–AD) outcome measures for people in the cognitive stimulation therapy group. Although there is a body of research on the various psychological interventions for dementia, much of it lacks methodological rigour and might not be considered ‘evidence-based’. The previous RCTs were small, with the largest having 56 participants (Breuil et al, 1994), and could be criticised for weaknesses such as lack of standardisation of groups, selection and detection biases, and absence of intention-to-treat analyses. Our study is the only major evidence-based trial examining the effectiveness of cognitive stimulation therapy for dementia. Some guidelines counsel against the use of cognitive stimulation programmes because of the possibility of adverse reactions such as frustration (American Psychiatric Association, 1997). This study has shown that cognitive improvements are associated with benefits to quality of life rather than deterioration. Indeed, this is the first study to show improvements in quality of life of people with dementia participating in such a programme. The findings suggest that reality orientation groups, which are widely used both throughout the UK and internationally, are likely to be beneficial for many people with dementia and should be regarded more positively by staff, carers and service providers. Future research needs to identify the most effective ways of teaching care staff to implement this programme, the possible benefits of a longer-term cognitive stimulation therapy programme, and the potential effects of combining cognitive stimulation therapy with drug therapy.

Clinical Implications and Limitations


Clinical implications

  1. Cognitive stimulation therapy groups appear to improve both cognitive function and quality of life for people with dementia.

  2. The degree of benefit for cognitive function appears similar to that attributable to acetylcholinesterase inhibitors.

  3. The groups were popular with the participants, and can be conducted in a variety of settings.

Limitations

  1. To maintain the benefits relative to the control group, it is likely that cognitive stimulation therapy would need to be continued on a regular basis long after the end of the 14-session programme.

  2. Staff ratings might have included an element of bias despite efforts to reduce this.

  3. Many centres were excluded because they had insufficient numbers of residents fitting the inclusion criteria.


This paper is dedicated to the memory of Margaret Butterworth, who died in December 2002 having worked tirelessly for the needs of carers and people with dementia over many years. The work was led by Dr Martin Orrell, who received funding from the NHS London Regional Office, Research and Development Programme, and Barking, Havering and Brentwood Community NHS Trusts. The views expressed in the publication are those of the authors and not necessarily those of the NHS or the Department of Health. We thank all the residents and staff of the residential homes and day centres who participated in the study. We also thank Professor Stephen Senn and Pasco Fearon for statistical advice.

  • Received January 13, 2003.
  • Revision received May 13, 2003.
  • Accepted May 13, 2003.

