Background The significant reductions in hospital admission demonstrated in US assertive community treatment (ACT) studies have not been replicated in the UK. Explanations cite poor UK ‘model fidelity’ and/or better UK standard care. No international model-fidelity comparisons exist.
Aims To compare high-fidelity US ACT teams with a UK team.
Method The UK700 ACT team (n=97) was compared with high-fidelity US ACT teams (n=73) using two measures: a forerunner of the Dartmouth Assertive Community Treatment schedule (to assess adherence to ACT principles) and 2-year prospective activity data.
Results The UK and US teams had similar high-fidelity scores. Although significant differences were found in the amount and type of activity, practice differences in areas central to ACT were not great.
Conclusions The failure of UK ACT studies to demonstrate the outcome differences of early US studies cannot be attributed entirely to the lack of ACT fidelity.
Differences in outcome between US and UK assertive outreach studies continue to generate controversy. Significant reductions in hospital care demonstrated in earlier US studies (Stein & Test, 1980) have generally not been replicated in UK studies (Holloway et al, 1995; Holloway & Carson, 1998; UK700 Group, 1999). Two possible explanations have been advanced: better-quality control services have disadvantaged the UK studies (Burns & Priebe, 1996; Tyrer, 2000); and UK trials do not replicate assertive community treatment (ACT) effectively (poor ‘model fidelity’) (Marshall & Creed, 2000). Drake and colleagues (Drake et al, 1998; McHugo et al, 1999) demonstrated that model fidelity was associated with outcome in New Hampshire ACT teams for dual-diagnosis patients. It is usually impossible to compare model fidelity between studies because adequate data are not published (Burns & Priebe, 1996). The St George's ACT team (UK-ACT) was one of the four sites in the UK700 trial that failed to achieve a significant reduction in hospitalisation but for which we have detailed care process and team structure data (Burns et al, 2000). We tested the model fidelity of this team against the four ‘high-fidelity’ New Hampshire teams by examining their apparent adherence to the ACT model and prospectively collected activity data.
Most mental health professionals have an understanding of what ACT consists of, although a precise definition has so far eluded researchers. Teague et al (1995) captured the ingredients that are widely accepted as essential features of the model: a multi-disciplinary ACT team with small case-loads (typically staff:patient ratios between 1:10 and 1:12) providing high-intensity services in vivo and a team approach to sharing responsibility for the whole case-load. The ACT team is assertive in its attempts to engage patients for whom the team has continuous responsibility 24 h a day, 7 days a week. Staff work across typical professional boundaries and endeavour to work closely with the patients' natural support networks. It has been noted that many of the components of ACT teams are not entirely dissimilar to those of UK community mental health teams (Burns & Firn, 2002).
Measuring fidelity to the ACT model was an explicit aim of both the UK and US studies, and fidelity was assessed prospectively at both sites. However, differences in the data collection protocols of the twin studies constrained what could be used and may also have introduced significant, consistent biases. A detailed examination of the New Hampshire protocol and data collection process was undertaken in an intensive week-long site visit (M.F.). Differences in the protocols meant that only nine process variables covering five distinct ACT areas of activity reported in the UK700 trial (Burns et al, 2000) could be compared reliably. The five areas were:
carers and support networks;
in vivo treatment;
assisting with basic needs;
increasing patients' functioning.
New Hampshire teams (US-ACT)
The four high-fidelity teams were identified as ‘strong ACT’ (McHugo et al, 1999) from seven modified ACT teams in a seven-site randomised controlled trial of ACT for patients with dual diagnosis (severe mental illness and substance misuse) (Drake et al, 1998).
Fidelity to ACT principles had been confirmed using an early development (Teague et al, 1995) of the Dartmouth ACT scale (Teague et al, 1998). Thirteen implementation criteria were identified, nine of which ‘reflected features of the PACT model’. These were services provided in the community, assertive engagement, intensity of service, small case-load, continuous responsibility, continuity of staffing, team approach, multi-disciplinary team and working closely with support networks. The four criteria specific to substance misuse are not included in this comparison.
Two of the authors rated each programme on each criterion on a scale from 1 (low fidelity) to 5 (high fidelity) in half-point steps (Teague et al, 1998). Anchor points were defined for each end-point, with values for intermediate points being allocated proportionally. Their ratings were made independently at one time-point towards the end of the study and were based on a variety of sources but principally their day-to-day knowledge of the programmes and clinicians' activity logs. They were then discussed by all three authors and these discussions ‘yielded a final consensus rating for each team’ (Teague et al, 1995). Overall scores for each programme were the mean of individual scores on all criteria.
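The scoring rule described above can be sketched as follows. This is only an illustration: the function name and the ratings are hypothetical and are not the study's data. The sketch checks that each criterion rating lies between 1 and 5 in half-point steps and returns the mean across criteria.

```python
from statistics import mean


def overall_fidelity(ratings):
    """Return the mean of criterion ratings (each 1-5 in half-point steps)."""
    for name, score in ratings.items():
        on_half_point = (score * 2) == int(score * 2)
        if not (1 <= score <= 5) or not on_half_point:
            raise ValueError(f"invalid rating for {name}: {score}")
    return mean(ratings.values())


# Hypothetical consensus ratings for the nine ACT criteria (illustrative only).
example_ratings = {
    "services_in_community": 4.5,
    "assertive_engagement": 4.0,
    "intensity_of_service": 3.5,
    "small_caseload": 5.0,
    "continuous_responsibility": 3.0,
    "continuity_of_staffing": 4.0,
    "team_approach": 4.5,
    "multidisciplinary_team": 4.0,
    "support_networks": 3.5,
}

overall = overall_fidelity(example_ratings)
```

With these made-up ratings the overall score is 4.0; a rating such as 4.2 would be rejected because it is not on a half-point step.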
St George's team (UK-ACT)
This team replicated the New Hampshire protocol as closely as possible. Two psychiatrists working clinically with the team (including T.B.) rated it on each of the items independently. M.F. also rated the team, although three components were rated exclusively on event-recording data (services provided in vivo, intensity of service and working closely with support networks).
Sample: patients and staff. Seventy-eight patients randomly allocated to the four US-ACT teams were recruited over 25 months from June 1989. The inclusion criteria were similar to those used in the UK700 trial, except that the US patients all had a second diagnosis of substance misuse disorder. The UK-ACT data are based on 97 patients. Substance misuse was not measured in the UK700 study, but a year after the study ended 23% of patients on the case-load had a co-occurring substance misuse diagnosis (Laugharne et al, 2002). Psychiatrists' activity data were not recorded in the US-ACT study, so UK psychiatrists' data were excluded to allow a more direct comparison. Staff from other disciplines participated in recording their activities at both sites (n=25 for US-ACT and n=49 for UK-ACT).
Process recording: US-ACT data. Activity was recorded for one week in six throughout the study. Staff completed a log sheet for each study patient for whom they performed any service in the sampled week. This recorded the time (in minutes) spent with each patient by ten categories of activity:
Activities of daily living.
Family (all family contacts).
For each category, staff recorded the location (‘centre’ or ‘community’) and the mode of the intervention (‘direct’ or ‘indirect’) (Teague et al, 1995). ‘Centre’ was defined as ‘in the mental health centre’, and ‘community’ as ‘anywhere else’. ‘Direct’ activity was defined as ‘activities done with or services provided to the client’. ‘Indirect’ activity was defined as ‘time spent on behalf of the client without the client present (doing paperwork, calling other agencies, driving time, etc.)’. Individual contacts or care events were not recorded, only the total time.
Activity data were used only for periods when the patient was in a position to receive care. Five patients were excluded and the analyses were based on 73 patients, two with truncated study periods.
For comparison with UK-ACT data, only each US-ACT patient's first 2 years in the study were utilised. Because the US-ACT data were collected only for one week in six, they were adjusted for comparison with the continuous UK-ACT data. An individual factor was calculated for each patient in order to inflate the proportion of their care for which activity-recording had taken place to 2-year totals.
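The exact form of the individual adjustment factor is not given here. One plausible reading, offered only as an assumption, is that each patient's recorded minutes were scaled by the ratio of days in care during the 2 years to days covered by sampled weeks. The function names and the example figures below are hypothetical.

```python
def inflation_factor(days_in_care, days_recorded):
    """Assumed factor: days in care divided by days covered by sampled weeks."""
    if days_recorded <= 0:
        raise ValueError("patient has no recorded days")
    return days_in_care / days_recorded


def estimated_total_minutes(recorded_minutes, days_in_care, days_recorded):
    """Scale minutes logged in sampled weeks up to the whole period in care."""
    return recorded_minutes * days_in_care / days_recorded


# Hypothetical patient: in care for 730 days, logged in 17 sampled weeks.
days_in_care = 730
days_recorded = 17 * 7           # 119 days of activity recording
recorded_minutes = 2380          # total minutes logged in those weeks

factor = inflation_factor(days_in_care, days_recorded)
estimate = estimated_total_minutes(recorded_minutes, days_in_care, days_recorded)
```

For this made-up patient the factor is about 6.1, close to the nominal 6 implied by one-week-in-six sampling, and the estimated 2-year total is 14 600 min.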
Comparison variables. Differences in data collection protocols meant that inter-site comparison was possible on only nine composite process variables (Table 1), reflecting five ACT components.
Variables are based on the duration of the activities performed in relation to each patient. ‘Duration’ variables are expressed as a mean rate (in minutes) per patient per 30 days. ‘Proportions’ express the time spent (in minutes) on a specific type of activity as a proportion either of total time performing all activities or of all ‘direct’ activities, calculated for each individual patient. The first two duration variables (‘direct contact’ and ‘carer activity’) are ‘headline’ variables because the remainder are derived, at least in part, from one or both. The precise composition of each variable was constrained by differences in data collection between the two sites. Table 2 describes the content of each variable with reference to the local (UK-ACT and US-ACT) definitions described above.
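The two kinds of variable can be illustrated with a short sketch. The helper names and the patient figures are hypothetical, and returning 0.0 for a patient with no recorded activity is an assumption of this sketch rather than a rule stated in the study.

```python
def rate_per_30_days(total_minutes, days_in_care):
    """'Duration' variable: minutes of activity per 30 days in care."""
    if days_in_care <= 0:
        raise ValueError("days_in_care must be positive")
    return total_minutes / days_in_care * 30


def proportion(part_minutes, whole_minutes):
    """'Proportion' variable: share of a patient's time spent on one activity."""
    if whole_minutes == 0:
        return 0.0  # assumption: no recorded activity gives a zero proportion
    return part_minutes / whole_minutes


# Hypothetical patient followed for 730 days.
direct_minutes = 6059            # all 'direct' activity over the period
direct_in_vivo_minutes = 5029    # the part of it performed in vivo

direct_rate = rate_per_30_days(direct_minutes, 730)
in_vivo_share = proportion(direct_in_vivo_minutes, direct_minutes)
```

This made-up patient has a direct-contact rate of about 249 min per 30 days, with about 83% of direct activity performed in vivo.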
To test for differences between these nine variables, group comparisons were made. Two-sample t-tests were performed to compare means for each variable. Within-group distributions were examined and skewness and kurtosis statistics were calculated. Where either the skewness or kurtosis statistic was significantly different from zero (at the 5% level), a non-normal distribution was assumed and the t-test was validated by bootstrap techniques. Levene's test of equivalence was used to indicate variables where it was appropriate to assume equivalence of variance. In the event, no variables were normally distributed and bootstrap analyses were implemented to check the validity of the t-test results (Efron & Tibshirani, 1993) for each of the nine variables. The bias-corrected accelerated confidence interval yielded by the bootstrap method was compared with that of the t-test. Where the two intervals were similar, the two-sample t-test results were presented. Where the t-tests were not appropriate, the bias-corrected (accelerated) confidence intervals produced using the bootstrap analyses were used.
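The idea behind the bootstrap check can be sketched with a simpler technique than the one used in the study. The code below computes a plain percentile bootstrap interval for a difference in means, not the bias-corrected accelerated interval described above, and the two samples are invented, skewed data rather than study data.

```python
import random
from statistics import mean


def bootstrap_mean_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(a) - mean(b).

    A simplification: the study used bias-corrected accelerated intervals.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented, right-skewed samples (minutes per 30 days), for illustration only.
group_a = [10, 12, 15, 20, 30, 45, 60, 90, 150, 400]
group_b = [1, 2, 2, 3, 4, 5, 6, 8, 10, 30]

observed = mean(group_a) - mean(group_b)
lower, upper = bootstrap_mean_diff_ci(group_a, group_b)
```

If an interval of this kind is close to the t-test interval, the t-test result can reasonably be reported; if not, the bootstrap interval is the safer summary.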
Table 3 shows the model-fidelity scores for UK-ACT, assessed by each rater, and Table 4 shows the aggregate score along with scores for the seven US-ACT teams. The US-ACT teams A-D were the ‘strong-ACT’ teams and E-G the ‘weak-ACT’ teams. It can be seen that the UK-ACT score rates as ‘strong ACT’ (i.e. it has high model fidelity as measured on the early Dartmouth ACT schedule).
The results of the group comparison of care activities performed in the UK-ACT and (strong) US-ACT sites are presented in Table 5. There are significant differences in eight of the nine variables tested. The US-ACT teams recorded significantly greater amounts of direct and overall activity than the UK-ACT team. For the activity performed, however, the UK-ACT team recorded higher proportions of in vivo care (variable 3), basic-needs activity (variables 5 and 6) and activities to increase patients' functioning (variables 8 and 9).
The US-ACT teams recorded more activity than the UK team in all four of the activity-rate areas measured. There is strong evidence of a difference between US and UK teams in the headline variables ‘duration of direct contact’ and ‘duration of carer activity’, as well as in ‘duration of activities to increase patients' functioning’, but there is no significant difference in ‘duration of basic-needs activity’. The average US-ACT patient received more than 400 min of ‘direct’ contact in each 30-day period, compared with 249 min for the average UK-ACT patient (P < 0.001). This is a difference of 36 min per week. The US-ACT patients received 37 min of carer activity, compared with 15 min for the UK-ACT patients (P < 0.001). Because UK-ACT carer activities were recorded only when a single event lasted for 15 min or more, this represents a maximum of only one carer visit per 30 days.
Proportion of types of activity
The proportions of activity concerning three ACT areas (in vivo care, basic-needs activity and activities to increase patients' functioning) were measured using five variables. A greater proportion of all these types of activity was recorded for the UK-ACT team than for the US-ACT teams. There is strong evidence that the UK team had a higher proportion of direct activity performed in vivo, a higher proportion (total and direct) of basic-needs activity and a higher proportion of direct activities to increase patients' functioning. There was also some evidence of a higher proportion of total activities to increase patients' functioning.
In the UK-ACT site a far higher proportion of all direct activity (83%) was performed in vivo, compared with only 58% in the US-ACT sites. The two pairs of variables, addressing the proportions of basic-needs activities and of activities to increase patients' functioning, followed similar patterns in each site, with the proportion of each being higher in the UK-ACT site. The proportion of activities to increase patients' functioning accounted for 19% (total) and 20% (direct) in the UK-ACT site, compared with 12% (total) and 14% (direct) in the US-ACT site.
In vivo activity
An additional variable was created (termed ‘duration of direct in vivo activity’) by taking all ‘direct’ activity that was performed at the patient's home or neighbourhood. The distributions were non-normal, and bootstrap analyses were implemented to verify the t-test result. The US-ACT patients received 32.1 min more direct in vivo activity every 30 days than did the UK-ACT patients (95% CI -28.0 to 92.2, P=0.29). The mean duration of direct in vivo activity was 244.2 min (s.d.=120.0) for US-ACT patients and 212.1 min (s.d.=266.3) for UK-ACT patients.
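As a rough consistency check on these figures, the interval can be approximated from the reported means and standard deviations, using the sample sizes given earlier (73 US-ACT and 97 UK-ACT patients). This sketch uses a normal approximation with unequal variances; it is not the bootstrap-validated procedure used in the study, so only approximate agreement should be expected.

```python
from math import sqrt


def mean_diff_ci(m1, sd1, n1, m2, sd2, n2, z=1.96):
    """Normal-approximation 95% interval for m1 - m2 with unequal variances."""
    diff = m1 - m2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, diff - z * se, diff + z * se


# Reported summary statistics for duration of direct in vivo activity.
diff, lower, upper = mean_diff_ci(244.2, 120.0, 73, 212.1, 266.3, 97)
```

The approximation gives a difference of 32.1 min with an interval of roughly -27.6 to 91.8 min, within about half a minute of the reported -28.0 to 92.2.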
Although the model-fidelity measure was applied rigorously in both sites, different raters were used and this may have biased the results. Three items, however, were based entirely on relatively objective activity-recording data. The similarity on model-fidelity measures suggests that practice was broadly similar. Although only the aggregated ‘consensus’ score of all three raters was available for each component for the US-ACT teams, it was possible to take the lowest score of any rater for the UK-ACT team. Using this conservative approach, it scored as having high fidelity. We would conclude that, despite the absence of 24-h direct care, UK-ACT falls well within the range of acceptable model fidelity.
Process of care
Despite having similar model-fidelity scores, there were major differences in the level of contact and proportion of the time spent on different activities. The average US-ACT patient received 62% more direct contact than the average UK-ACT patient (the equivalent of 36 extra minutes weekly). Such a large difference has the potential to accommodate real clinical advantages.
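The arithmetic behind these two figures can be checked directly. The sketch takes the US-ACT mean as 403 min per 30 days, the value quoted elsewhere in the text for direct contact, and the UK-ACT mean as 249 min; the variable names are illustrative.

```python
US_DIRECT_MIN_PER_30_DAYS = 403
UK_DIRECT_MIN_PER_30_DAYS = 249

difference = US_DIRECT_MIN_PER_30_DAYS - UK_DIRECT_MIN_PER_30_DAYS

# Relative excess of US-ACT direct contact over UK-ACT, as a percentage.
percent_more = difference / UK_DIRECT_MIN_PER_30_DAYS * 100

# Convert the 30-day difference to a weekly figure.
extra_minutes_per_week = difference * 7 / 30
```

The difference of 154 min per 30 days corresponds to about 62% more direct contact and about 36 extra minutes per week.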
Stein & Test's descriptions of ACT (Stein & Test, 1978) stress four areas of patient need, a deficiency in any of which may result in hospitalisation: ‘motivation to remain in the community’, ‘freedom from pathological dependent relationships’, ‘material resources’ and ‘coping skills’. The last two of these are addressed in this study. ‘Material resources’ equates to activity focused on basic needs and ‘coping skills’ equates to increasing patients' functioning. For Stein & Test, ‘material resources’ refers to food, shelter, clothing, medical care, recreation, etc. (Stein & Test, 1978), which equates to the housing and finance elements of the ‘basic-needs activity’ variables (variables 4, 5 and 6). Stein & Test's ‘coping skills’ equate to the daily living skills and occupation and leisure elements of the ‘patients' functioning’ variables (variables 7, 8 and 9).
Despite the UK-ACT team's lower overall activity levels, a greater proportion of their activity was focused on patients' basic needs and on increasing their functioning. This may suggest that the UK-ACT team was in fact adhering to a pattern of care specifically intended and expected to enhance patients' community tenure. Indeed, by combining the duration of direct activity (variable 1) with the proportion of direct activity that is focused on basic needs (variable 6) or patients' functioning (variable 9) we can obtain an approximate mean duration rate for each of these focuses of activity. This calculation indicates that very similar amounts of time were allocated to these activities on both sides of the Atlantic. For the direct basic-needs activity this was 40.13 min for the US-ACT (10% of 403 min) and 44.19 min for UK-ACT (18% of 249 min). For patient functioning activities the amounts were 54.98 min for US-ACT (14% of 403 min) and 50.35 min for UK-ACT (20% of 249 min). In both of these key areas the differences amount to less than 5 min per 30 days.
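The combination of variables described above can be sketched as a simple product of duration and proportion. Note that the percentages quoted in the text are rounded, so products computed from them differ slightly (by up to about 1.5 min) from the reported durations, which were presumably derived from unrounded proportions; that presumption is ours. The dictionary layout and names are illustrative.

```python
def focused_minutes(direct_minutes, proportion):
    """Approximate minutes per 30 days spent on one focus of direct activity."""
    return direct_minutes * proportion


# Products of the rounded figures quoted in the text.
approximate = {
    ("US-ACT", "basic needs"): focused_minutes(403, 0.10),
    ("UK-ACT", "basic needs"): focused_minutes(249, 0.18),
    ("US-ACT", "functioning"): focused_minutes(403, 0.14),
    ("UK-ACT", "functioning"): focused_minutes(249, 0.20),
}

# Durations as reported in the text.
reported = {
    ("US-ACT", "basic needs"): 40.13,
    ("UK-ACT", "basic needs"): 44.19,
    ("US-ACT", "functioning"): 54.98,
    ("UK-ACT", "functioning"): 50.35,
}

# Inter-site differences based on the reported durations.
basic_needs_gap = reported[("UK-ACT", "basic needs")] - reported[("US-ACT", "basic needs")]
functioning_gap = reported[("US-ACT", "functioning")] - reported[("UK-ACT", "functioning")]
```

On the reported durations the gaps are about 4.1 min (basic needs) and 4.6 min (functioning) per 30 days, both under 5 min.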
The additional variable, ‘duration of direct activity performed in vivo’, is at the core of Stein & Test's accounts of ACT practice (Stein & Test, 1978, 1980). If activity rates are crucial to outcome, then one might expect to find a significant difference between this practice in US-ACT, which achieved limited substance misuse gains, and UK-ACT, which demonstrated no outcome differences. However, there was no real difference on this variable, although the estimate is imprecise and the wide confidence interval suggests that the difference could be as big as 92.2 min per 30 days.
The small number of variables used for the comparison resulted from differences in data collection in the two sites, which also meant that we could compare only the duration of contact and not the contact frequency. Even within the variables tested, five systematic differences and two biases arising from definitions were identified. All the systematic differences maximise the potential difference and all variables are affected.
General systematic differences
The following systematic differences affect activity rates but not proportions. Thus, differences in proportions are more robust than differences in total duration.
(a) Potential to over- or under-record
UK-ACT staff recorded only specific ‘events’, making it impossible to identify how staff spent their working week. Consequently there was no incentive to inflate their recorded activities, but there was a risk that some contacts could be overlooked. The US-ACT staff were required to account for all their working time (e.g. for billing or performance management purposes) and this provided an incentive to ‘apportion’ the whole working week.
(b) Telephone contact, carer contact and care coordination
These activities were recorded in UK-ACT only when an event lasted for 15 min or more. The US-ACT data, however, include all activities of the same type (e.g. all telephone calls with a given patient, however short). The US-ACT data are thus more inclusive.
(c) Recording units
The US-ACT data were recorded in quarter-hour units. As a New Hampshire team leader explained:
‘Case management activity... is recorded in units equal to fifteen minutes, but they [case managers] may make four phone calls in a fifteen minute time frame and it would come out as four units.’
Thus, 15 min of activity would be recorded in US-ACT as a total of 1 h, whereas the same activity in UK-ACT would not have been recorded at all. In this respect, the US-ACT data are overstated and the UK-ACT data are understated.
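The effect of the two recording conventions on the same activity can be sketched as follows. The functions are illustrative: the US rule is modelled as one whole 15-min unit per event, rounded up, which is an assumption based on the team leader's description, and the UK rule as recording only events of 15 min or more. The call durations are invented.

```python
import math


def us_recorded_minutes(event_minutes, unit=15):
    """Assumed US-ACT rule: each event is logged in whole 15-min units."""
    return sum(math.ceil(minutes / unit) * unit for minutes in event_minutes)


def uk_recorded_minutes(event_minutes, threshold=15):
    """UK-ACT rule for telephone, carer and coordination events:
    only events lasting at least 15 min are recorded."""
    return sum(minutes for minutes in event_minutes if minutes >= threshold)


# Four short phone calls made within one 15-minute period (invented durations).
calls = [3, 4, 4, 4]

us_total = us_recorded_minutes(calls)
uk_total = uk_recorded_minutes(calls)
```

For these four calls, which total 15 min, the US convention records 60 min and the UK convention records nothing.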
(d) Special weeks
The US-ACT teams recorded data for one week in six, whereas the UK-ACT team recorded continuously. This could inflate the US-ACT recording because of a ‘Hawthorne effect’ (Arnold et al, 1991), or because more activity was saved for these ‘special weeks’.
(e) Indirect activity
In the US-ACT data all ‘indirect’ activity is identified as having a particular focus, whereas ‘attempted’ (but failed) face-to-face patient contact was not coded with a focus category in UK-ACT. Thus, ‘total’ activity for UK-ACT data, which comprises direct and indirect elements, will be understated.
The following definitional differences introduce bias into results. Although it was not possible to quantify the effect of these biases, they all act in the same direction: to increase activity recorded for US-ACT and/or to decrease that recorded for UK-ACT. This means that we can confidently assume that the duration variables represent the maximum order of inter-site differences. In all but one of these, maximum rates of activity in the USA are no more than twice those in the UK.
(a) ‘Family’ activity
The US-ACT activities were classified according to their ‘predominant theme’ unless time was divided between several activities, in which case it was apportioned accordingly. However, any family activity ‘ trumped’ (ranked higher than) any other activity, including the basic-needs activity (variables 4-6) and activities to increase patients' functioning (variables 7-9). Consequently, the UK-ACT data for those variables may have been understated.
(b) ‘Service setting’
In vivo activity is defined as that performed outside of a service setting (UK-ACT) or outside the mental health centre (US-ACT). The UK definition is wider in that other (non-mental) health and social service settings are treated as service settings. Consequently, more US-ACT activities will have been classified as in vivo.
Implications for UK practice
It has been proposed that differences in outcome between US-ACT and UK-ACT (Holloway et al, 1995; Marshall et al, 2001) may reflect failed model fidelity in the UK (Marshall & Creed, 2000). However, in the areas of practice central to ACT compared in this study, the maximum differences in practice between the high-fidelity US-ACT teams and the UK-ACT team are not great. If these small differences in activity rates do account for the failure of the St George's arm of the UK700 trial, then the differences in practice between successful and unsuccessful ACT (or between successfully and unsuccessfully implemented ACT) in the UK context are very small.
The US authors have explained their failure to demonstrate differences in hospitalisation rates (either between high- and low-fidelity ACT teams or between ACT and standard case management) by the quality of their control services. Mueser et al (1998) point out that ‘almost all the controlled studies have compared the ACT or ICM models with “practice as usual”’ and Drake et al (1998) point out that these usually comprise hospital- or clinic-based services or services with very high case-loads. In contrast, the US control groups were ‘exceptionally good’ (Drake et al, 1998), having incorporated ACT principles but with larger case-loads.
The same explanation has been proposed for the UK700 trial and UK studies generally (Tyrer, 2000). In light of this explanation, it is interesting that the two sites compared here differed most on the crude headline measure of intensity of service, yet almost not at all on the more ACT-specific ‘duration of direct in vivo activity’. There were also no discernible differences in in vivo direct activity focused on either ‘basic needs’ or ‘increasing patients' functioning’. This suggests that the UK-ACT team was more ACT-like than not and that, in terms of salient ACT activity, the failure of UK studies to demonstrate the outcome differences of early US studies cannot be attributed entirely to lack of model fidelity.
Clinical Implications and Limitations
Small differences in data collection procedures can exaggerate or distort perceived differences in clinical practice.
Prospective collection of service data is possible and can yield improved understanding of team functioning.
Failure of the St George's assertive community treatment (UK-ACT) team to reduce hospitalisation cannot be explained entirely by poor model fidelity.
Data were collected using different procedures and different categories in the two sites.
Model fidelity judgements in ACT are evolving and there is no scientifically validated consensus.
The professional context in which care data are collected may have a significant, but unquantified, impact on accuracy.
- Received May 14, 2002.
- Revision received October 9, 2002.
- Accepted October 29, 2002.
- © 2003 Royal College of Psychiatrists