The British Journal of Psychiatry
Cognitive analytic therapy for personality disorder: randomised controlled trial
Susan Clarke, Peter Thomas, Kirsty James




Cognitive analytic therapy (CAT) is a theoretically coherent approach developed to address common processes underlying personality disorders, but is supported by limited empirical evidence.


To investigate the effectiveness of time-limited CAT for participants with personality disorder.


A service-based randomised controlled trial (trial registration: ISRCTN79596618) comparing 24 sessions of CAT (n = 38) and treatment as usual (TAU) (n = 40) over 10 months for individuals with personality disorder. Primary outcomes were measures of psychological symptoms and interpersonal difficulties.


Participants receiving CAT showed reduced symptoms and experienced substantial benefits compared with TAU controls, who showed signs of deterioration during the treatment period.


Cognitive analytic therapy is more effective than TAU in improving outcomes associated with personality disorder. More elaborate and controlled evaluations of CAT are needed in the future.

The assessment and treatment of people with personality disorder1 are complicated by the heterogeneity of symptoms within individual disorders2 and high levels of comorbidity among disorders.3 A meta-analytic review of randomised controlled trial (RCT) evaluations of a range of specialist treatments for personality disorder4 showed encouraging results compared with standard care. Nevertheless, the strength of evidence was variable: trials were often underpowered or inadequately reported, or ignored relevant outcomes. Moreover, outcome trials have focused predominantly on borderline personality disorder,4,5 despite the fact that this accounts for only 10% of diagnoses,6 and empirically validated interventions for this group are complex7,8 and of long duration.9 Thus, effective but less resource-intensive interventions are required, not only for participants with borderline personality disorder who may not require complex or long-term programmes of care, but also for the broader range of participants with a personality disorder. Cognitive analytic therapy (CAT) may meet this need; it is integrative but theoretically coherent, unimodal, and brief (limited to 16–24 sessions). It uses a relational focus to target the intrapsychic and interpersonal problems common to all personality disorders.10,11 Despite CAT’s widespread adoption in the UK, evidence of efficacy to date remains limited.12–14 The present study was therefore designed to extend the evidence base.



An RCT was used to compare the effectiveness of 24-session CAT with treatment as usual (TAU) at a specialist personality disorder clinic in a public health setting (trial registration: ISRCTN79596618). For all participants, outcome and process measures (described below) were assessed at baseline. Participants in the CAT group were again assessed shortly after completing therapy. Because clinical audit had shown that the average duration of 24-session CAT was 10 months, the second assessment of TAU participants occurred 10 months after their baseline assessment. To balance the obligation to provide care against the assessment of long-term therapeutic impact, TAU participants were offered 24-session CAT at this time. Participants in the original CAT group were, however, further followed up 18 months after therapy (see online supplement).

Treatment allocation concealment was achieved using a telephone-based system of randomisation, administered by the Dorset Research and Development Support Unit. The random sequence was computer generated, using baseline scores on the primary outcome measure of personality disorder to stratify randomisation according to whether participants reached criteria for each of the clusters (A alone (n = 0); B alone (n = 18); C alone (n = 28)) or comorbid clusters (n = 53). Participants were randomised 1:1 within each stratum in (varying) block sizes to ensure approximately equal numbers in the CAT and TAU groups within each stratum. The study protocol was approved by the UK National Health Service (NHS) Research Ethics Committee (Dorset).


Participants who met diagnostic criteria for a personality disorder were drawn from referrals to a specialist out-patient service, the intensive psychological therapies service. As required for referral to the service, all had completed at least one previous episode of therapy. Exclusion criteria, based on DSM-IV,1 included psychotic illness, substance dependence and intellectual disability. Because dialectical behaviour therapy (DBT) is an evidence-based treatment for parasuicidal behaviour,15 participants who engaged in self-harming behaviour at least monthly16 were deemed not to be eligible for the trial and were referred directly to an established DBT programme in keeping with the service protocol.


Screening measure

The Millon Clinical Multiaxial Inventory (MCMI-III)17 is a 175-item self-report measure examining 14 personality patterns and 10 clinical syndromes. Each subscale produces a score between 0 and 115, with scores of 85 and above considered to indicate the presence of a clinical syndrome or personality problem.

Primary outcome measures

We used the Structured Clinical Interview for DSM-IV Axis II (SCID-II)18 to assess symptoms of personality disorder. This measure has good to excellent interrater reliability.19 To measure distress arising from interpersonal difficulties, we used a self-report questionnaire, The Inventory of Interpersonal Problems: 32-item (IIP).20

Secondary outcome measures

Adjustment was measured using the Clinical Outcomes in Routine Evaluation (CORE),21 a 34-item self-report measure that produces a global distress (total) score from four subscales (subjective well-being, common problems and symptoms, life and social functioning and risk). The Service Satisfaction Scale (SSS-30),22 a 30-item self-report questionnaire, was used to assess participant satisfaction across four domains: practitioner’s manner and skill, perceived outcome, office procedures and general access.

Dissociation was monitored using two measures: (a) the Dissociative Questionnaire (DisQ),23 a 63-item self-report questionnaire that provides an overall score based on four subscales (identity confusion, loss of control, amnesia and absorption), and (b) the Dissociative Experiences Scale (DES),24 a 28-item self-report questionnaire assessing the frequency of dissociative experiences.

The Symptom Checklist-90-Revised (SCL-90-R),25 a 90-item self-report measure, was used to evaluate a broad range of psychological problems and symptoms of psychopathology, providing scores on nine primary symptom dimensions and three global indices. Overall emotional distress was indexed by The Global Severity Index (GSI). These final two measures became available to the study after some participants had been recruited and randomised but were used to take advantage of their strong psychometric properties and frequent use within the psychotherapy literature. As a result, the data-set for these two measures is reduced.

In addition to psychometric outcome measures, the Dorset Healthcare Trust Participant Administration System was accessed to provide outcome-relevant data on the frequency and duration of all accident and emergency attendances and in-patient admissions, including those for general health difficulties.

Process measure

The Personality Structure Questionnaire (PSQ),26 an eight-item self-report measure designed to examine CAT theory-consistent process changes in personality integration during treatment, was used to assess identity disturbance and personality integration.


Cognitive analytic therapy

Following the principles outlined by Ryle & Kerr11 and guidelines developed by the Association of Cognitive Analytic Therapy (ACAT), participants in the CAT condition were offered 24 sessions of CAT and 3 follow-up sessions at 3, 6 and 12 months after termination of weekly therapy. All eight therapists had completed an ACAT accredited 2-year practitioner training course. Each case was allocated 15 minutes of weekly supervision with an ACAT-accredited supervisor. The CAT participants also received the usual benefits associated with standard care.

All CAT sessions were audio-taped and, following treatment, a randomly selected 4% (n = 37) were assessed for therapist competence by three independent raters. Ten domains of therapeutic practice were judged using the Competence in CAT (CCAT) measure, which demonstrates acceptable levels of reliability and validity.27 These data indicated that the delivery of CAT was satisfactory overall, with an average rating across trial therapists of 22 (range 13–38).

Treatment as usual

Participants in the TAU condition received the usual benefits associated with standard NHS care during the assessment period. This typically comprised care from a community mental health team, clinical services and contact with a general practitioner.


On referral, eligibility was assessed using the MCMI-III. Participants who reached psychometric criteria for personality difficulties and did not meet exclusion criteria were informed about the trial and, if willing, were further assessed using the SCID-II. These interviews were conducted by two independent psychiatrists and four psychologists who had been trained to a level of 80% agreement with other experienced and reliable raters. Pre-therapy (baseline) SCID-IIs, completed prior to randomisation, assessed participants’ lifetime experiences of personality disorder symptoms (diagnostic criterion). Those who met the inclusion criteria, and agreed to participate in the study, signed a consent form. After completing the baseline questionnaires, they were randomly assigned to CAT or TAU.

As in previous studies,28,29 post-therapy and post-TAU SCID-II interviews assessed symptoms for the interval since the baseline assessment (symptomatic criterion), and best endeavours were used to ensure that assessors were masked to treatment allocation (e.g. participants were asked not to mention any information that could allow assessors to guess their treatment condition). All SCID-II interviews (pre, post and follow-up) were audio-taped and a random 10% were rated by a second reliability assessor, who was also naive to treatment allocation. Interrater reliability across the three categories used (symptom present; threshold; absent) was high (kappa (κ) = 0.79; 90% agreement). Shortly after completing CAT, or 10 months after commencing TAU, participants provided outcome and process psychometric data. In addition, the CAT participants completed the same measures and the SCID-II 18 months after therapy.

Overview of statistical methods

As the SCID-II is a nominal variable, the data were dichotomised into two categories (‘personality disorder’ or ‘no personality disorder’) and Fisher’s exact test was used to compare the distribution of scores between the two study groups. Because other measures (Table 1) differed between groups at baseline, analyses of covariance (ANCOVAs) were used to evaluate treatment effectiveness, using the respective baseline scores for each measure as the covariate. Analyses were run using a conservative intention to treat (ITT) procedure,30 utilising data from all recruited participants who provided pre- and post-assessment data, regardless of whether they completed treatment. It should be noted that the number of data points for each psychometric test varied owing to occasional omission or completion errors by participants. As in previous trials with relatively small sample sizes,31,32 Cohen’s d between-group effect sizes (ESs), adjusted for baseline differences, were calculated by dividing the adjusted mean differences by the baseline pooled standard deviation,33 even when no significant group effects had been obtained at post-test.
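The effect size calculation described above can be illustrated with a minimal sketch. It assumes the conventional pooled standard deviation formula, which the text does not spell out, and the function names and input values are hypothetical illustrations rather than trial data.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled standard deviation of two groups (conventional formula, assumed here)."""
    return math.sqrt(
        ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    )


def adjusted_cohens_d(adjusted_mean_diff, sd_a, n_a, sd_b, n_b):
    """Baseline-adjusted mean difference divided by the pooled baseline s.d."""
    return adjusted_mean_diff / pooled_sd(sd_a, n_a, sd_b, n_b)


# Hypothetical inputs: an ANCOVA-adjusted difference of 0.5 scale points
# and baseline s.d. values of 0.5 in both groups.
d = adjusted_cohens_d(0.5, sd_a=0.5, n_a=38, sd_b=0.5, n_b=40)
print(round(d, 2))
```

Standardising by the baseline rather than the post-test standard deviation keeps the denominator unaffected by any treatment-related change in variability.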

Table 1

Means (s.d.) of demographic characteristics and outcome measures as a function of group and time

Because the distribution of hospital utilisation data was highly skewed, with very few participants having one or more admission, the data were dichotomised into ‘no admissions’ or ‘one or more admissions’, and Fisher’s exact test was used to analyse between-group differences pre- and post-therapy.

To assess whether group differences were reflected in outcomes for individual participants, Jacobson & Truax’s34 criteria for reliable and clinically significant change were computed using the published normative values for the IIP, CORE and GSI. Thomas & Truax’s35 recommended categories of change were then used: recovered (reliable and clinically significant change), improved (reliable change without significant clinical change), same (no change) and deteriorated (reliable change with worsening symptoms). After categorising participants as ‘recovered or improved’ or ‘same or deteriorated’, Fisher’s exact test was used to compare change between groups.
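The classification of individual change can be sketched as follows. This is an illustration under stated assumptions, not the trial's own analysis code: it uses the standard reliable change index, assumes that lower scores indicate less distress, and takes the normative s.d., reliability coefficient and clinical cut-off as hypothetical inputs rather than the published values for the IIP, CORE or GSI.

```python
import math


def reliable_change_index(pre, post, sd_norm, reliability):
    """Change score divided by the standard error of the difference."""
    se_measurement = sd_norm * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * se_measurement
    return (post - pre) / s_diff


def classify_change(pre, post, sd_norm, reliability, cutoff):
    """Assign one of the four change categories (lower score = less distress)."""
    rci = reliable_change_index(pre, post, sd_norm, reliability)
    if rci <= -1.96:
        # Reliable improvement; 'recovered' additionally requires crossing
        # from the clinical into the non-clinical range.
        if pre >= cutoff and post < cutoff:
            return "recovered"
        return "improved"
    if rci >= 1.96:
        return "deteriorated"
    return "same"


# Hypothetical participant and hypothetical psychometric values.
print(classify_change(pre=2.0, post=1.0, sd_norm=0.5, reliability=0.9, cutoff=1.5))
```

The 1.96 threshold corresponds to a change unlikely (P<0.05) to arise from measurement error alone.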

Analyses assessing whether changes in the PSQ were associated with changes in remaining outcome measures were conducted to explore CAT theory-consistent processes of change. Separate Pearson’s correlations for the CAT and TAU conditions were computed utilising residual gain scores for both the PSQ and outcome measures. These scores indexed pre- to post-therapy change, adjusted for the correlation between repeated tests.36
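One common way to compute residual gain scores is sketched below. It assumes the usual formulation, in which the standardised post-test score is reduced by the pre-post correlation multiplied by the standardised pre-test score; the text does not state the exact formula used, and all scores shown are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise scores using the population s.d. (requires non-zero variance)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def pearson_r(xs, ys):
    """Pearson correlation as the mean product of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def residual_gain(pre, post):
    """Standardised post score minus (pre-post correlation x standardised pre score)."""
    r = pearson_r(pre, post)
    return [z_post - r * z_pre for z_pre, z_post in zip(z_scores(pre), z_scores(post))]


# Hypothetical pre- and post-therapy scores for five participants.
psq_pre, psq_post = [30, 28, 35, 25, 32], [22, 27, 30, 20, 31]
iip_pre, iip_post = [2.1, 1.8, 2.5, 1.6, 2.2], [1.4, 1.7, 2.0, 1.1, 2.1]

association = pearson_r(residual_gain(psq_pre, psq_post),
                        residual_gain(iip_pre, iip_post))
print(round(association, 3))
```

Residual gain scores remove the part of the post-test score that is predictable from baseline, so the correlation reflects change rather than initial severity.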


Participant recruitment

Figure 1 shows the participant flow throughout the trial. Of 165 eligible participants, 128 consented to participate but 6 withdrew their consent, and 23 were subsequently excluded, 13 because they failed to meet personality disorder inclusion criteria after a more detailed assessment, and 10 for other reasons, including moving out of the area and imprisonment.

Fig. 1

CONSORT flow chart of participant recruitment to the trial. CAT, cognitive analytic therapy; ITT, intention to treat; TAU, treatment as usual.

Initial power calculations based on previous data37 indicated that the anticipated between-group standardised ES of 0.5 (medium effect33) could be detected with a sample of 64 participants per group with 80% power (using a two-tailed 5% significance level). Difficulties with recruitment and loss during or after treatment, however, resulted in an actual achieved final sample size of 78 participants at termination contributing to the analysis and therefore power to detect a standardised ES of 0.5 was reduced to 58%.
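The power figures can be approximated with a short sketch. It assumes a two-sample comparison of means under a normal approximation, which is not necessarily the method used for the original calculation, and it takes the achieved group sizes (38 and 40) from the abstract. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_power(effect_size, n_a, n_b, alpha=0.05):
    """Approximate two-tailed power for a standardised mean difference."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * math.sqrt(n_a * n_b / (n_a + n_b))
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


planned = two_sample_power(0.5, 64, 64)    # planned sample size
achieved = two_sample_power(0.5, 38, 40)   # achieved sample size
print(round(planned, 2), round(achieved, 2))
```

Under this approximation the planned design yields power of about 80% and the achieved sample a little under 60%. The normal approximation slightly overstates power relative to a calculation based on the t distribution, which is consistent with the reported 58%.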

Baseline data

The sample of 99 participants for whom baseline data were obtained consisted of 71 women (72%) and 28 men. Ages ranged from 19 to 59 years, with a mean of 36.0 years (s.d. = 9.5). An extensive range of cluster A, B and C personality disorder diagnoses was present in the sample and high levels of comorbidity were evident. In total 88% of the sample held a diagnosis of two or more disorders; 53% displayed diagnoses across two clusters; 28% across all three clusters. Finally, as a result of local referral criteria, 68% of participants had a diagnosis of borderline personality disorder.


Descriptive analyses established that the data for continuous primary and secondary outcomes were normally distributed.

Primary outcome measures

Based on the SCID-II, all participants met diagnostic criteria for at least one personality disorder at baseline (the median number of personality disorders and s.d. per condition are noted in Table 1). Post-therapy, 9/27 (33%) CAT participants no longer met symptomatic criteria for any personality disorder, whereas all 30 (100%) TAU participants met the criterion for at least one (P<0.001, Fisher’s exact test). Moreover, 16 (53%) TAU participants met symptomatic criteria for a greater number of personality disorders at post-test; no CAT participants showed deterioration (P<0.001, Fisher’s exact test). Table 1 also shows pre- and post-therapy data for the IIP, the second primary outcome measure (on an ITT basis). For this measure, ANCOVA indicated a significant between-group difference in favour of CAT (F(1,69) = 16.507, P<0.001) with a large ES (d = 1.00).
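The two Fisher’s exact tests reported above can be reproduced from the stated counts. The sketch below is a plain implementation of the two-sided test based on the hypergeometric distribution; the function name is illustrative, and the 2 × 2 tables are built from the figures given in the text (27 CAT and 30 TAU participants).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Rows: CAT, TAU. Columns: no longer meets criteria, still meets criteria.
p_remission = fisher_exact_two_sided(9, 18, 0, 30)
# Rows: CAT, TAU. Columns: deteriorated, did not deteriorate.
p_deterioration = fisher_exact_two_sided(0, 27, 16, 14)
print(p_remission < 0.001, p_deterioration < 0.001)
```

Both P values fall below 0.001, in line with the reported results.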

Secondary outcome and process measures

Secondary outcome measures are also shown in Table 1. The ANCOVAs showed an advantage for the CAT group over TAU for three of the five measures: the CORE (F(1,71) = 10.487, P = 0.002), the DisQ (F(1,72) = 11.410, P = 0.001) and the PSQ (F(1,70) = 9.136, P = 0.003). Between-group ESs, adjusted for baseline differences, were calculated for all secondary and process outcome measures. Large ES values favouring CAT over TAU were found for the CORE (d = 0.80) and GSI (d = 0.64), medium ES values for the DisQ (d = 0.60) and PSQ (d = 0.50), and a small ES value for the DES (d = 0.24).

Sensitivity analyses

Sensitivity analyses were conducted using last observation carried forward, on the assumption of no change in participants who did not complete the post-therapy assessment. In all cases, the significant differences reported above remained significant.

Healthcare utilisation and participant satisfaction

Fisher’s exact test suggested that, at baseline (Table 1) and post-intervention, there were no significant between-group differences in in-patient or accident and emergency admissions (P>0.05). Independent t-tests showed that, using the total satisfaction scale of the SSS-30, CAT participants (mean 5.5, s.d. = 3.3) were significantly more satisfied with treatment than TAU participants (mean 3.9, s.d. = 2.3) (t(53) = 2.01, P = 0.05).

Clinically significant individual change

The values used for change calculations were drawn from published psychometric data.20,21,25 Table 2 shows the percentage of participants who reliably recovered, improved, remained the same or deteriorated during the treatment period. More CAT than TAU participants achieved benefits (i.e. improved or recovered) in interpersonal relating (IIP), with a similar trend towards symptomatic relief (CORE and GSI). Between-group differences using the Fisher’s exact test were significant for the IIP (P<0.001) and CORE (P<0.001), but not for GSI (P = 0.083).

Table 2

Percentage of reliable and clinically significant change for both conditions

Exploratory mechanisms of change

Pearson’s correlations for the CAT group showed that PSQ residual gain scores were significantly associated with residual gain scores on the IIP (r = 0.778, P = 0.045), CORE (r = 0.496, P = 0.001), GSI (r = 0.315, P = 0.027) and DisQ (r = 0.469, P = 0.001), but were not significantly associated with DES residual gain (r = 0.159, P = 0.169) scores. Correlations for the TAU group showed that PSQ residual gain scores were significantly associated with DisQ residual gain (r = 0.280, P = 0.040) scores only. Overall, this suggests that reductions in personality fragmentation were significantly associated with improvements in interpersonal and symptomatic outcomes for CAT, but not TAU participants. Full details of the results of the uncontrolled 18-month follow-up are provided in the online supplement.


Pre–post group comparisons

This RCT provides evidence that CAT can be an effective therapeutic intervention for the self-management and interpersonal difficulties associated with a broad range of personality disorders. At post-therapy, a significantly higher proportion of CAT participants (9, 33%) no longer met symptomatic criteria for personality disorder; in contrast, all TAU participants remained symptomatic. Moreover, more than half of all TAU participants (16, 53%) showed deterioration at this time, meeting symptomatic criteria for more personality disorders. No CAT participants deteriorated. As predicted, group analysis indicated that CAT participants showed significant improvements in interpersonal functioning and significant reductions in symptomatic distress, in comparison with TAU participants. Furthermore, assessment of changes on an individual basis showed that a significantly higher proportion of CAT participants were classified as ‘recovered’ or ‘improved’ in measures of distress related to interpersonal functioning and psychological symptoms, whereas more TAU participants were classified as the ‘same’ or ‘deteriorated’. Although CAT did not have an impact on healthcare utilisation post-intervention, this was probably because participants with chronic self-harming behaviour were excluded from the study, resulting in a floor effect. Participants in the CAT intervention were more satisfied than those receiving TAU. Our uncontrolled 18-month follow-up suggested that improvements may have been sustained among those who were followed up. However, this within-participant comparison should be interpreted with caution, particularly given the high level of attrition.

It is notable that TAU not only failed to match CAT, but that some of its recipients showed signs of deterioration. This finding supports Tyrer & Simmonds’38 suggestion that social functioning of complex participants with comorbid personality disorders can deteriorate in the absence of specialist treatment, and highlights the importance of interventions targeting processes and difficulties associated with these disorders. The study thus adds to previous RCTs that have provided evidence of the benefits of specialist personality disorder treatments.4

To the best of our knowledge, this is the first RCT of CAT with an adult population displaying the complete range of personality disorders. As such, it suggests that the core theoretical principles of CAT can be applied effectively across a diagnostically heterogeneous group. Because CAT is time-limited and relatively brief, is broadly applicable and does not require a complex programme of care, it may have economic and practical benefits for service delivery. Thus, our study fills a niche in the evidence base by showing that effective specialist treatment can be provided without recourse to the resource-intensive, complex programmes of care often recommended for people with a severe personality disorder.39

Our outcomes may also suggest that focus, structure and collaboration – generic features of therapy identified as important for people with a borderline personality disorder39 and instantiated in CAT – are also useful for people with other personality disorder diagnoses. This factor may in part explain why other studies13,14 fail to show clear differences between CAT and good clinical care, where the latter is structured and focused. Likewise, the possibility remains that, in the present study, general aspects of patient contact and the structure – rather than the theoretical content of the CAT intervention – led to the group differences.14 The data, however, suggested that the CAT therapeutic process was related to outcome. Although the sample size was too small to assess mediation formally, the study provided some preliminary evidence that observed improvements following CAT, but not TAU, were associated with processes that are consistent with CAT theory.

Strengths and limitations

The use of an RCT methodology in a naturalistic setting created some research/practice tensions that contributed to both the strengths and limitations of the study. On the one hand, CAT was delivered by health service professionals in a public health setting, to a heterogeneous group of participants with personality disorder, representative of a characteristic out-patient population. These factors strengthen the external validity of our findings. On the other hand, clients with cluster A personality disorders alone did not present for treatment and two categories of participants were not included in our sample. Individuals who had no previous treatment were not accepted as referrals into the specialist service. By the same token, those who engaged in self-harm were excluded because they were referred directly to the DBT programme that the specialist service provided. Thus, our sample was not fully characteristic of a personality disordered population. Although self-harming behaviour can be formulated within the CAT model, the effectiveness of CAT with self-harming adults remains untested and requires further controlled evaluation.

Our study may also have been compromised by other factors. Although similar in sample size to previous trials,13 our final sample was somewhat underpowered. In addition, TAU was chosen to provide a reasonable comparison of what experimental participants would otherwise receive. Unlike CAT, however, TAU was not systematically monitored or assessed and so its quality remains unknown. Moreover, the fact that TAU participants knew they would be able to obtain CAT 10 months after providing baseline data may have had a negative impact on their motivation to change during the comparison period. Finally, although CAT therapists showed acceptable mean levels of competence, adherence was assessed retrospectively rather than controlled. Building on these findings, future research will need to assess the impact of CAT in comparison with either a well-defined and systematically assessed TAU condition, or a theoretically coherent comparison. Such comparisons would also benefit from a formal cost-effectiveness evaluation. Finally, future trials should aim to obtain controlled maintenance data, following the end of weekly treatment, to assess the sustainability of any benefits more systematically.

In conclusion, these findings provide preliminary evidence that CAT is more effective than standard public healthcare, and indicate its potential value as an intervention across much of the range of personality disorder. Our findings thus provide the preliminary data that are required before embarking on a more elaborate and controlled evaluation of CAT, designed to assess its efficacy and practicality as an economical intervention for individuals with personality disorder.


The authors thank all participants in this study and the staff of the Intensive Psychological Therapies Service, Dorset HealthCare University NHS Foundation Trust. Thanks also go to Glenys Parry, Liz Fawkes, Dawn Bennett, Jane Blunden, Jason Hepple and Bob Remington for their valuable contributions and encouragement.

  • Received January 6, 2012.
  • Revision received July 23, 2012.
  • Accepted August 13, 2012.
