The British Journal of Psychiatry
Neurocognitive dimensions characterising patients with first-episode psychosis


Background Assessment of neurocognitive dysfunction in schizophrenia is hampered by the multitude of tests used in the literature.

Aims We aimed to identify the main dimensions of an assessment battery for patients with first-episode psychosis and to estimate the relationship between dimension scores and gender, age, education, diagnosis and symptoms.

Method Eight frequently used neuropsychological tests were used. We tested 219 patients 3 months after start of therapy or at remission, whichever occurred first.

Results We identified five dimensions: working memory (WM); verbal learning (VL); executive function (EF); impulsivity (Im); and motor speed (MS). Significant findings were that the MS score was higher for men, and the WM and VL scores were correlated with years of education.

Conclusions Neurocognitive function in first-episode psychosis is described by at least five independent dimensions.

Subgroups of patients within the psychotic spectrum are characterised by differences in behavioural and affective, as well as cognitive, symptom profiles. Neuropsychological deficits are recognised as a core determinant of the illness (Green, 1998; Rund & Borg, 1999; Bilder et al, 2000) but concept terminology and assessment methods still remain unsettled issues. Factor analysis is a potent technique for reducing a number of measures into a smaller set of uncorrelated dimensions (Lieh-Mak & Lee, 1997). The complexity of the component structure depends on the number of tests included and the clinical characteristics of the group studied (Bechtoldt et al, 1962; Heaton et al, 1995).

In this paper we present the results of eight neuropsychological tests in a group of stabilised patients with first-episode psychosis. We used data-reduction techniques to identify the main dimensions of the assessment battery in this group. We wanted to answer the following questions:

  1. To what extent are the scores of different neurocognitive tests intercorrelated?

  2. Can we meaningfully combine the test scores into a fairly small number of dimension scores?

  3. To what extent are these dimension scores intercorrelated?

  4. To what extent are they related to patient age, gender, education, diagnosis or symptoms?


The study is part of a multi-site investigation of the relationship between duration of untreated psychosis (DUP) and outcome. The study is carried out in four sites, two (Stavanger and Haugesund in Norway) with an early detection programme and two (Ullevål sector in Oslo, Norway and Roskilde, Denmark) with an ordinary detection programme. All patients gave written informed consent. The regional ethics committee has approved the study. On admission, the patients were diagnosed according to DSM-IV (American Psychiatric Association, 1994) by special assessment teams, and rated on scales including a split Global Assessment of Functioning (GAF) scale that gave separate scores for symptoms and function. At 3 months GAF was scored again. Treatment was initiated according to a standard protocol, including neuroleptic medication, individual, supportive psychotherapy and multi-family groups. Further details are given by Johannessen et al (2001) and Larsen et al (2001).


This paper is based on a sample of 219 patients. Diagnostic and demographic characteristics and symptom scores at 3 months are presented in Table 1.

Table 1

Demographic variables and symptom scores

Because our intention was to measure neurocognitive traits and not be biased by acute effects of the psychotic episode, patients were tested 3 months after start of therapy or at remission, whichever occurred first. As seen from Table 1, this strategy seems to have been successful as the symptom level was fairly low at 3 months.

Neurocognitive tests

Eight neuropsychological tests were chosen for assessing neurocognitive function. We selected tests that are used frequently and have been shown to be sensitive to diagnostic and prognostic issues in schizophrenia. The tests were administered by a trained test technician or approved neuropsychologist, in the following order.

California Verbal Learning Test (CVLT)

The CVLT (Delis et al, 1987) measures capacity for explicit verbal memory. The test consists of oral presentation of a 16-word ‘shopping list’ (list A) for five immediate recall trials, followed by a single presentation and recall of a second 16-word ‘interference’ list (list B). The words on both lists consist of four items from each of four categories. Free- and category-cued recall of list A is elicited immediately after recall of list B (short delay) and again 20 minutes later (long delay). Finally, a recognition trial is run, involving oral presentation of 44 ‘shopping items’ of which subjects are asked to identify the 16 list A items. Scoring of the test involves computing several parameters of learning strategies in addition to the number of words recalled at the various stages of learning. Based on scores from 286 normal subjects and 113 neurological patients, Delis et al (1988) found support for a 6-component model of the CVLT. Subsequent research has confirmed the multi-dimensional nature of the test (e.g. Vanderploeg et al, 1994) and has suggested qualitative differences in the way psychiatric patients solve the task compared with normal controls (Kareken et al, 1996). Our study did not involve computerised scoring of test protocols and only measures of immediate and delayed recall, recognition, perseverations and intrusions are reported.

Backward Masking Test (BMT)

The BMT (Spaulding et al, 1981; Rund, 1993; Green et al, 1994a,b) assesses the earliest phases of visual information processing. A standard target duration procedure in which pairs of digits (target stimuli) are presented for 16.5 ms on the monitor was used. The stimuli are followed by a patterned mask of Xs of equal duration, covering the image of the digits on the monitor. The task consists of 30 stimulus presentations: 10 with a 33 ms stimulus onset mask (short); 10 with a 49.5 ms stimulus onset mask (long); and 10 with no mask. The three test trials are assigned randomly. Identification of each digit in the pair is scored separately, yielding a maximum score of 20 correct for each of the three conditions. In the present report, the no-mask condition is excluded from the final analysis and the mean of the two mask conditions is used in order to improve the reliability of the measure.

Finger Tapping Test (FTT)

The FTT (Lezak, 1995) requires that the subject tap as rapidly as possible with the index finger on a small lever, which is attached to a mechanical counter. The test is basically a test of simple motor speed, although some degree of coordination is required. The subject is given 5 consecutive 10-s trials with the preferred hand and then 5 consecutive trials with the non-preferred hand. The mean number of taps for each hand is computed. Because no lateralised motor deficits were expected, the mean score of the two hands is used in the component analysis.

Wisconsin Card Sorting Test (WCST)

The WCST (PC-version, Heaton et al, 1993) is a test of abstract thinking that requires the ability to form a hypothesis and test it. The test is the most commonly used measure of executive functioning in schizophrenia research (Green, 1998) and provides estimates of perseverative thinking and distractibility. The subject is asked to sort a series of cards to one of four key cards that vary in shape, colour and number of shapes. Feedback after each response indicates whether or not the correct matching rule is being followed. After 10 consecutive correct sorts, the test shifts without warning to reinforce a new sorting rule. The test terminates after 128 trials or when the subject has completed the three correct sorting rules twice. Studies by Bell et al (1997) and Koren et al (1998) find evidence for a three-factor structure in the WCST (perseveration, idiosyncratic sorting/non-perseverative errors and failure to maintain set). The same pattern is generally found in normal control subjects or subjects with traumatic brain injuries (Wiegner & Donders, 1999). Recent research has suggested that impaired scores may be explained by reduced intellectual capacity rather than executive dysfunction (Laws, 1999) but the cause-and-effect question has still to be resolved.

Controlled Oral Word Association task (COWA)

The COWA (Spreen & Strauss, 1998) is a measure of verbal fluency requiring the ability to generate words beginning with specific letters (F, A and S) for 1 minute each. The instructions followed are identical to those used by Spreen & Benton (1969).

Trail Making Test (TMT)

The TMT (Lezak, 1995) consists of two parts (A and B). Each part measures speed of visual scanning with a motor component. Part A requires the subject to connect series of numbered circles arrayed randomly on a sheet of paper using a pencil. In part B the array consists of both numbers and letters, and the subject must connect them in alternating order. Part B demands simultaneous processing capacity for two sets of mental operations (number and letter sequencing) as well as a rule-following instruction to alternate between the sets. It is a sensitive measure of disturbances in both attention and executive function.

Digit Span Distractibility Test (DSDT)

In the DSDT (Oltmanns & Neale, 1975; Rund, 1983) the subjects hear short strings of digits with and without distractors and are asked to recall the digits in correct order. The test measures short-term memory, selective attention and distractibility. Neutral and distractor items are interspersed randomly. The distraction and neutral digit strings are matched for difficulty level and reliability to avoid problems associated with differential discrimination power (Chapman & Chapman, 1978). The total number of correctly recalled digits for the neutral and distractor lists is divided by a maximum score for comparison between conditions. The score (percentage of correctly recalled digits) for each condition is used in the analysis.

Continuous Performance Test, Identical Pairs version (CPT-IP)

The CPT-IP is a multi-dimensional CPT task that systematically varies type of stimulus, distraction and stimulus exposure time (Cornblatt et al, 1989). Four stimulus conditions are used: numbers; shapes; numbers presented with distractors; and shapes presented with distractors. The test consists of both a slow and a fast condition for each of the four conditions. Computer-generated stimuli are presented on a monitor. The subject is asked to respond as fast as possible by lifting the index finger from a reaction time key whenever two identical stimuli follow each other. For each condition, a series of 150 trials is continually flashed on the screen, with stimulus onset time of 50 ms and dark interval between stimuli of 950 ms.

A subset of measures was selected from each test to be entered as input variables in an overall ‘second-generation’ principal component analysis. The selection of measures from each test was based on a combination of the theoretical foundation of the essential quality of the test, clinical experience and principal component analysis. For example, for the CPT a factor analysis indicated that the best solution was to use the average of hits, false alarms and reaction time across all conditions.


Table 2 gives the mean scores for the selected subset of variables. Compared with standard norms presented in the test manuals and available literature for the DSDT (Rund, 1983) and BMT (Rund et al, 1996), the sample's mean scores indicated a function clearly below normal for most of the tests. A prominent exception was the WCST, where most of the patients scored close to normal.

Table 2

The variables selected for the final factor analysis

The intercorrelations between the 17 variables are given in Table 3. As seen from this table, all four WCST variables were strongly to moderately intercorrelated, and so were the CVLT scores except for perseverations. TMT, COWA, DSDT and CPT hits were also strongly to moderately intercorrelated. CPT false alarms and CPT reaction time were moderately intercorrelated, whereas FTT was basically uncorrelated with all the other variables.

Table 3

Intercorrelations between the 17 variables

The 17 variables were included in a factor (principal component) analysis with varimax rotation. We wanted to be sure that the factor solution could account for a considerable proportion of the variance of the included variables. We therefore chose to exclude variables with communalities <0.50. The first analysis gave six factors with eigenvalue >1. Two variables (CVLT perseverations and WCST failure to maintain sets) had communalities <0.50. We reran the analysis with the 15 remaining variables and found five factors with eigenvalue >1. From this factor solution two more variables (BMT and TMT) had to be excluded because of communality <0.50. We finally ended with 13 variables, for which the factor analysis again gave five factors with eigenvalue >1. Together, the five factors explained nearly 72% of the variance. The communalities and the factor loadings are given in Table 4.
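The two retention rules used in this procedure (eigenvalue >1 for factors, communality ≥0.50 for variables) can be illustrated with a minimal Python sketch. It assumes that eigenvalues and factor loadings have already been obtained from some statistics package; the function names and all numbers below are hypothetical and are not the values in Table 4. Because varimax is an orthogonal rotation, communalities are the same before and after rotation.

```python
def kaiser_retained(eigenvalues):
    """Return the eigenvalues greater than 1 (the retention rule in the text)."""
    return [ev for ev in eigenvalues if ev > 1.0]


def explained_proportion(eigenvalues, n_variables):
    """Proportion of total variance explained by the retained components.

    For standardised variables the total variance equals the number of variables.
    """
    return sum(kaiser_retained(eigenvalues)) / n_variables


def communalities(loadings):
    """Communality of each variable: sum of squared loadings on retained factors."""
    return {name: sum(x * x for x in row) for name, row in loadings.items()}


def low_communality(loadings, threshold=0.50):
    """Names of variables whose communality falls below the threshold."""
    return [name for name, h2 in communalities(loadings).items() if h2 < threshold]


# Hypothetical example: 8 variables, eigenvalues invented for illustration.
example_eigenvalues = [3.0, 2.0, 1.5, 0.9, 0.6]
n_factors = len(kaiser_retained(example_eigenvalues))
share = explained_proportion(example_eigenvalues, n_variables=8)

# Hypothetical loadings on two retained factors.
example_loadings = {
    "var_a": [0.80, 0.10],  # communality 0.65, kept
    "var_b": [0.30, 0.40],  # communality 0.25, would be excluded
    "var_c": [0.20, 0.75],  # communality 0.6025, kept
}
to_drop = low_communality(example_loadings)
```

In the procedure described above, the variables flagged by such a check would be removed and the analysis rerun until every remaining variable had a communality of at least 0.50.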

Table 4

Communalities and the factor loadings of the 13 final variables

Based on the factor analysis we chose to make an index score for each of the five dimensions. A variable was included in an index if:

  1. it had a strong loading (≥0.50) on the corresponding factor;

  2. the strong loading was specific for this factor (the difference between the loading on the corresponding factor and the highest loading on a non-corresponding factor had to be >0.10).
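The two inclusion criteria can be written as a small decision rule. The sketch below is only an illustration: it assumes that loadings are compared by absolute value (so that reversed items with negative loadings qualify) and that the "corresponding" factor is the one with the largest absolute loading. Neither assumption is stated in the text, and the function name is hypothetical.

```python
def assign_to_index(loadings, min_loading=0.50, min_gap=0.10):
    """Return the position of the factor a variable is assigned to, or None.

    loadings: the variable's loadings on each retained factor.
    Assumption: absolute loadings are compared, and the candidate factor
    is the one with the largest absolute loading.
    """
    abs_l = [abs(x) for x in loadings]
    best = max(range(len(abs_l)), key=abs_l.__getitem__)
    others = [v for i, v in enumerate(abs_l) if i != best]

    strong = abs_l[best] >= min_loading                              # criterion 1
    specific = not others or (abs_l[best] - max(others)) > min_gap   # criterion 2
    return best if strong and specific else None


# Hypothetical loadings, not taken from Table 4.
clear_case = assign_to_index([0.72, 0.15, 0.05])     # strong and specific
cross_loaded = assign_to_index([0.55, 0.50, 0.10])   # strong but not specific
weak_case = assign_to_index([0.45, 0.10, 0.10])      # not strong enough
reversed_item = assign_to_index([-0.80, 0.20, 0.10]) # negative loading, reversed item
```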

We z-transformed the variables and calculated the mean of the items of each index (with negative sign if items were reversed).
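As an illustration of this step, the following sketch z-transforms each item, flips the sign of reversed items and averages the items per patient. It assumes complete data and the sample standard deviation; the text does not say how missing values were handled or which standard deviation was used. The item names and scores are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise a list of raw scores to mean 0 and SD 1 (sample SD assumed)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def index_scores(items, reversed_items=()):
    """Mean of z-transformed items per patient.

    items: dict mapping item name -> list of raw scores, one per patient,
           in the same patient order for every item.
    reversed_items: names of items whose z-scores get a negative sign.
    """
    z = {}
    for name, values in items.items():
        sign = -1.0 if name in reversed_items else 1.0
        z[name] = [sign * v for v in z_scores(values)]
    n_patients = len(next(iter(z.values())))
    return [mean(z[name][i] for name in z) for i in range(n_patients)]


# Hypothetical data for three patients on two WCST-like items.
executive = index_scores(
    {
        "categories_completed": [6, 4, 2],
        "perseverative_responses": [10, 20, 30],  # higher = worse, so reversed
    },
    reversed_items={"perseverative_responses"},
)
```

With these invented scores, the first patient has the highest index because both items point in the favourable direction once the perseverative responses are reversed.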

This gave us the following five index scores:

  1. Working memory (four items: COWA, Digit span with and without distractor and CPT hits).

  2. Executive function (three items, all from the WCST: categories completed, perseverative responses (reversed), number of attempts to first category (reversed)).

  3. Verbal learning (three items, all from the CVLT: immediate recall, delayed free recall, errors (reversed)).

  4. Impulsivity (two items, both from the CPT: false alarms and reaction time (reversed)).

  5. Motor speed (one item only: finger-tapping).

For the first four indices the internal consistency could be calculated. It had a median of 0.73 (range: 0.54 (impulsivity) to 0.82 (executive function)).
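The text does not name the internal consistency coefficient. A common choice is Cronbach's alpha, and the sketch below assumes that coefficient purely for illustration. It also shows why no value can be computed for a one-item index such as motor speed. The data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items scored on the same n patients.

    items: list of k lists, each holding one item's scores in the same
           patient order (reversed items should already be sign-flipped).
    """
    k = len(items)
    if k < 2:
        raise ValueError("internal consistency needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # per-patient sum score
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical two-item index for four patients (inter-item correlation 0.6).
alpha_example = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```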

The correlations between the index scores and the factor scores had a median of 0.95 (range 0.87 to 0.98), indicating that the index scores could replace the factor scores without substantial loss of information.

The intercorrelations between the index scores are shown in Table 5. As seen from the table, the five scores seemed to represent fairly independent dimensions.

Table 5

Intercorrelations between the five dimensions

We also looked at the relationship between the dimension scores and age, gender, education, diagnosis and GAF symptom and function scores. Because of multiple comparisons and the fairly high number of patients, we chose a significance level of 0.001. No dimensions were significantly correlated with age. The difference between genders was clearly significant for motor speed (P<0.0005), with women having lower scores. Years of education were significantly correlated with working memory (r=0.29, P<0.005) and verbal learning (r=0.30, P<0.0005).

We did not find any significant relations between any of the neurocognitive dimensions and core/non-core diagnosis, GAF symptom or GAF function scores.
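To give a feeling for how a correlation of about 0.30 relates to the 0.001 significance level, the sketch below uses Fisher's z approximation for the two-sided test of a zero correlation. This is a swapped-in approximation, not necessarily the test used in the study, and n=219 assumes that every patient contributed to the correlation, which the text does not confirm.

```python
from math import atanh, sqrt
from statistics import NormalDist


def pearson_p_approx(r, n):
    """Approximate two-sided p-value for H0: rho = 0 (Fisher's z method)."""
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


ALPHA = 0.001   # significance level stated in the text
N_ASSUMED = 219  # assumption: full sample, no missing data

# Correlation of the size reported for education and verbal learning.
education_vl_significant = pearson_p_approx(0.30, N_ASSUMED) < ALPHA

# A hypothetical weak correlation for comparison.
weak_significant = pearson_p_approx(0.10, N_ASSUMED) < ALPHA
```

Under these assumptions a correlation of 0.30 falls well below the 0.001 threshold, whereas a correlation of 0.10 does not.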


How many dimensions?

This study has identified five distinct dimensions that seem clinically meaningful and psychometrically sound. The five dimensions comprise information from six of the eight tests. Data from two of the tests, BMT and TMT, did not meet our criteria for inclusion and seem to assess neurocognitive dimensions of uncertain validity, at least in this sample. Four of the index scores (executive function, verbal learning, impulsivity and motor speed) include sub-tests from one test only. Working memory is more complex, as it comprises tests as different as COWA, Digit span (with and without distraction) and CPT hits. Our working memory index is a composite measure, combining verbal fluency, immediate memory and vigilance. The inclusion of CPT hits is not surprising as successful completion of the vigilance tasks clearly depends on immediate memory. However, the vigilance tasks involve more than what Perry et al (2001) call ‘transient online and retrieval’ working memory. They demand the ability to store, manipulate and retrieve data, and to sustain attention over time. The study of Conklin et al (2000) also indicates that forward and backward digit span tasks tap different cognitive abilities. Our working memory index seems to be most strongly related to immediate memory, probably indicating that in the present sample the variability of immediate memory was so large that it gave no room for an additional factor covering the more specific aspects of vigilance. A partly alternative explanation would be in line with the suggestion by Perry et al (2001) that patients with psychosis may have more general deficits that will influence both working memory and vigilance. In this connection it is worth noting that executive function came out as a separate dimension, which may indicate that the WCST taps a different underlying brain substrate. Such an interpretation is further supported by the fact that, even though most of the patients performed clearly worse than normals on most tests, the majority performed rather well on the WCST.

Can the CPT measure impulsivity?

The impulsivity sub-scale is a new construction. In the present sample there was a clear inverse relationship between the CPT false alarm and the CPT reaction time, and when combined they seemed to give a measure of impulsivity. However, the relationship between the two variables could prove to be a more complex one, as a comparison with normals seemed to indicate that the patients had both a higher percentage of false alarms and a longer reaction time.

Relationship with other variables

Our second main finding was that the dimension scores were weakly related to factors such as education, gender, age, diagnosis and symptom level. We cannot rule out the possibility that more specific findings may be obtained in future analyses of our data, when we go into details of the specific neurocognitive tests and look at diagnostic subgroups and variables such as DUP. By contrast, the five dimensions explained most of the variance in our data-set, and the fact that the group as a whole scored below average on most of the dimensions might imply that the level of neurocognitive functioning is compromised even in a basically remitted sample of patients with first-episode psychosis. If this finding is replicated in our total sample, it could indicate that neurocognitive deficiencies are vulnerability factors for psychosis, more than a result of the psychotic process. However, it might be that neurocognitive function can improve over time, but that such an improvement takes a longer time than symptomatic remission. Only a follow-up investigation can tell us whether this is the case or not. Such a study is under way as part of the Tidlig Intervensjon ved Psykoser (TIPS: Early Intervention in Psychosis) project.


Even if this study is based on a considerable number of patients, the results have to be regarded as preliminary. Replicatory studies are needed to demonstrate the robustness of the identified dimensions.

Clinical Implications and Limitations


  • Working memory, verbal learning and executive function are fairly weakly intercorrelated.

  • Even in a first-episode sample, many patients function poorly on one or more neurocognitive tests.

  • Neurocognitive test scores seem to be weakly related to education, gender, age, diagnosis and symptom level.


  • Several patients were tested more than 3 months after admission (some of them much later).

  • A considerable number of patients were lost to neurocognitive testing.

  • The paper is based on cross-sectional data only.


The authors thank the following for help with the collection of neurocognitive data: Julie Bjørnsen, Nils Johan Halleråker, Patricia Jansen, Per Knudsen, Liv Jæger Midbøe, Olaug Moi, Jan Mydland, Gerd Berit Mygland, Susanne Theill Møller, Lillian Møllerhaug, Caroline Ripa, Patricia Schultz, Kjersti Tvedt, Erik Tveterås, Torill Ueland, Wenche ten Velden, Mette Væver and Eystein Våpenstad. We also thank Jan Egil Norvik for invaluable help with structuring the database of the project.


* Presented in part at the European First Episode Schizophrenia Network Meeting, Whistler BC, Canada, 27 April, 2001.

