Background Evidence suggests a reversal of the normal left-lateralised response to speech in schizophrenia.
Aims To test the brain’s response to emotional prosody in schizophrenia and bipolar disorder.
Method BOLD contrast functional magnetic resonance imaging of subjects while they passively listened or attended to sentences that differed in emotional prosody.
Results Patients with schizophrenia exhibited normal right-lateralisation of the passive response to ‘pure’ emotional prosody and relative left-lateralisation of the response to unfiltered emotional prosody. When attending to emotional prosody, patients with schizophrenia activated the left insula more than healthy controls. When listening passively, patients with bipolar disorder demonstrated less activation of the bilateral superior temporal gyri in response to pure emotional prosody, and greater activation of the left superior temporal gyrus in response to unfiltered emotional prosody. In both passive experiments, the patient groups activated different lateral temporal lobe regions.
Conclusions Patients with schizophrenia and bipolar disorder may display some left-lateralisation of the normal right-lateralised temporal lobe response to emotional prosody.
Patients with schizophrenia demonstrate relatively greater activation of the right middle temporal gyrus, compared with the left, when listening to normal speech (Woodruff et al, 1997; Lennox et al, 2000; Sommer et al, 2001). The neural response to emotional prosody is normally seen in predominantly right hemisphere temporal lobe language regions (Heilman & Gilmore, 1998; Buchanan et al, 2000; Mitchell et al, 2003), which raises the possibility that the right middle temporal gyrus ‘hyper-responsivity’ in patients with schizophrenia may be explained as an exaggerated functional response to emotional prosody. We thus tested two specific hypotheses: in patients with schizophrenia the cortical areas mediating the response to emotional prosody (particularly the middle temporal gyrus) are more sensitive to (and hence show a higher basal level of responsivity to) emotional prosody; and patients with schizophrenia over-attend to emotional prosody compared with healthy controls (and hence show increased top-down modulation of middle temporal gyrus activation in response to emotional prosody) (Woodruff et al, 1996).
MATERIALS AND METHODS
Thirteen healthy controls were recruited from the staff and students of the University of Manchester. Twelve patients with schizophrenia (ICD-10 diagnosis; World Health Organization, 1993) and eleven patients with bipolar disorder (psychiatric controls, ICD-10 diagnosis; World Health Organization, 1993) were recruited from out-patients attending Cromwell House Community Mental Health Centre (Salford and Trafford NHS Trust) and the Psychiatry Department at Manchester Royal Infirmary (Central Manchester NHS Trust). The subjects’ demographic statistics are summarised in Table 1. Because gender and handedness each may influence lateralisation of the brain for language (Beaton, 1997), only male subjects who were right-handed (Annett, 1970) were studied. We excluded subjects with a history of hearing difficulties; those who did not speak English as their first language; any who were unable to consent; those who gave a self-report of neurological disorders; those with a history of alcohol or drug misuse, head injuries or long periods of unconsciousness; and those in whom magnetic resonance scanning was contra-indicated. Additionally, controls who gave a self-report of psychiatric disorders were excluded. The study was conducted with full ethical approval from the Manchester University Ethics Committee and the local hospital ethics committees.
The symptomatology of patients with schizophrenia was assessed using the Scale for the Assessment of Positive Symptoms (SAPS; Andreasen, 1979a) and the Scale for the Assessment of Negative Symptoms (SANS; Andreasen, 1979b). Patients were not experiencing auditory hallucinations on the day of scanning as indicated by self-report. On the day of scanning, patients’ delusion and hallucination scores were reassessed to determine whether there had been any changes in symptoms. Details of current medication and illness chronicity were obtained from patients’ medical records. Medication levels in chlorpromazine equivalents were calculated for patients with schizophrenia (Taylor et al, 1999). Nine of the bipolar patients were receiving lithium carbonate or sodium valproate, four were receiving antipsychotic medication and five were being treated with antidepressants.
A set of sentences was devised describing happy scenarios (e.g. She won the lottery jackpot) and sad scenarios (e.g. The dog had to be put down) that were approximately the same length and of a comparable style and format. The sentences were rated for perceived ‘happiness’ or ‘sadness’ by a group of control subjects (n=20). The 60 sentences rated closest to the happy (0) and sad (10) ends of the 0-10 rating scale were selected for recording by an experienced phonetician (A.C.) in three styles of emotional intonation: happy, sad and neutral. Audiocassette recordings were then digitised at 22 kHz/16 bits using an Apple Macintosh Centris 660 AV. A subset of sentences was low-pass filtered at 333 Hz to remove semantic information, thus creating ‘pure’ emotional prosodic stimuli. A second survey determined that subjects (n=27) could reliably distinguish the emotion intoned in the filtered sentences. Paired t-tests revealed significant differences in emotional intonation rating (same scale as first survey) between sentences recorded in happy, sad and neutral styles of intonation (all P<0.001).
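The effect of a 333 Hz low-pass filter on audio digitised at 22 kHz can be illustrated with a minimal sketch. The windowed-sinc design, the Hamming window, the 401-tap length and the function names below are all assumptions made for illustration; the text does not state which filter implementation was used to create the stimuli.

```python
import math

def lowpass_fir(cutoff_hz, sample_rate_hz, num_taps=401):
    """Windowed-sinc (Hamming) low-pass FIR coefficients with unit gain at 0 Hz.

    num_taps is an illustrative choice, not a value from the study.
    """
    fc = cutoff_hz / sample_rate_hz          # cutoff as a fraction of the sampling rate
    mid = (num_taps - 1) / 2                 # centre tap (num_taps is odd)
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def gain_at(taps, freq_hz, sample_rate_hz):
    """Magnitude of the filter's frequency response at a single frequency."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return math.hypot(re, im)

# 333 Hz cutoff and 22 kHz sampling rate are taken from the text.
taps = lowpass_fir(333, 22000)
for freq in (150, 333, 2000):
    print(f"{freq:>5} Hz: gain {gain_at(taps, freq, 22000):.3f}")
```

Frequencies well below the cutoff, where pitch contour is carried, pass almost unchanged, whereas the higher frequencies that carry most consonant and vowel detail are strongly attenuated. This is consistent with the stated aim of removing semantic information while retaining intonation.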
Stimuli were presented as 37.8-s epochs consisting of eight contiguous stimuli, approximately 4.7 s in length, in an alternating A/B block design. Each A/B block was repeated three times. To dissect out the components of the response to emotional prosody, three separate scanning studies were performed. In the first experiment, passive listening to filtered emotional prosody was compared with the response to background scanner noise. In the second experiment, passive listening to unfiltered emotional prosodic sentences was compared with the response to background scanner noise. Pilot studies had suggested that comparing emotional prosody with neutral prosody was too subtle to achieve sufficient contrast. For the first two experiments, no active response was required. Subjects were asked to lie as still as possible, with their eyes closed, and listen passively. In these two passive experiments, sentences describing happy or sad scenarios were randomised to avoid habituation.
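The timing of the alternating block design can be sketched as follows. The epoch length, the number of stimuli per epoch and the number of repetitions come from the text; the labels "A" and "B" are placeholders, and which condition came first is an assumption.

```python
EPOCH_S = 37.8            # epoch length stated in the text
STIMULI_PER_EPOCH = 8     # stimuli per epoch stated in the text
CYCLES = 3                # each A/B block was repeated three times

def build_schedule(epoch_s=EPOCH_S, cycles=CYCLES):
    """Return (condition, onset_s, offset_s) tuples for an alternating A/B design."""
    schedule = []
    onset = 0.0
    for _ in range(cycles):
        for condition in ("A", "B"):       # order of A and B is assumed
            schedule.append((condition, onset, onset + epoch_s))
            onset += epoch_s
    return schedule

schedule = build_schedule()
total_s = schedule[-1][2]
minutes, seconds = divmod(round(total_s), 60)
print(f"{len(schedule)} epochs, {total_s:.1f} s in total (about {minutes} min {seconds} s)")
print(f"time available per stimulus: {EPOCH_S / STIMULI_PER_EPOCH:.3f} s")
```

Six epochs of 37.8 s give a run length that agrees, to the nearest second, with the total scan time of 3 min 47 s reported for each experiment.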
In the final experiment, subjects were presented with phrases depicting happy and sad scenarios and, when prompted, were instructed to attend to either semantic content or emotional intonation. In condition A, subjects were asked to concentrate on what the speaker was saying, and if they thought that the speaker was describing a happy scenario they were to respond by squeezing the response bulb. In condition B, they were required to ignore semantic content and concentrate on the speaker’s emotional tone of voice, responding only when phrases were spoken in a happy tone of voice. Subjects performed this task in its entirety before entering the scanner to ensure that they were able to execute it properly. In this active experiment, we randomised statements where the emotion conveyed by intonation style was congruent with that conveyed by verbal content with those where the emotions conveyed by intonation and content were incongruent. In the passive listening experiments, the emotion provided by intonation and content was congruent throughout. The same series of sentences was used for all three experiments, i.e. verbal content remained the same. (All subjects performed all the scanning paradigms, in the order described above.)
Image acquisition and analyses
Auditory stimuli were presented using Macstim version 2.25 (David Darby, University of Melbourne) on a G3 Apple Macintosh PowerBook running operating system version 8.5 and played through a magnetic resonance-compatible patient MUSICBOX™ functional imaging sound system (Woodrow Premise, St Albans, UK). Before scanning, samples of the auditory stimuli were played to subjects positioned inside the scanner. If necessary, subject-specific volume adjustments were made to ensure that all stimuli were clearly heard. All functional magnetic resonance imaging (fMRI) experiments were performed on a Philips Gyroscan ACS NT 1.5 Tesla system (retrofitted with Powertrak 6000 gradients) operating at software level 6.1.2 (Hamburg, Germany).
A T1-weighted turbo inversion recovery data set was acquired as 28 contiguous 3.5-mm slices parallel to the body of the corpus callosum to achieve full brain coverage (echo time=18 ms, repetition time=6850 ms, in-plane resolution=0.89 mm, field of view=230 mm² and acquisition matrix=256×256 with one excitation). In the same orientation and position as for the structural scan (to capture all possible non-cerebellar haemodynamic responses), 72 T2*-weighted gradient echo, echo planar magnetic resonance images depicting BOLD contrast were acquired at each of 14 contiguous 7-mm thick slices. The single-shot imaging protocol comprised echo time=50 ms, repetition time=3.1 s, flip angle=90°, echoplanar imaging factor=63, matrix=128×128, field of view=230 mm² and in-plane resolution=1.8 mm². Total scan time was 3 min 47 s per individual experiment.
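The relation between the stated field of view, matrix size and in-plane resolution can be checked with a short calculation. The sketch assumes that the field of view of 230 refers to the side length in millimetres of a square field and that slices are contiguous with no gap; the function names are illustrative.

```python
def in_plane_resolution_mm(field_of_view_mm, matrix_size):
    """Side length of one pixel for a square field of view."""
    return field_of_view_mm / matrix_size

def slab_coverage_mm(n_slices, slice_thickness_mm):
    """Extent covered by contiguous slices with no inter-slice gap."""
    return n_slices * slice_thickness_mm

structural_res = in_plane_resolution_mm(230, 256)   # T1-weighted structural scan
functional_res = in_plane_resolution_mm(230, 128)   # T2*-weighted functional scan
structural_cov = slab_coverage_mm(28, 3.5)
functional_cov = slab_coverage_mm(14, 7)

print(f"structural: {structural_res:.3f} mm in-plane, {structural_cov:.0f} mm coverage")
print(f"functional: {functional_res:.3f} mm in-plane, {functional_cov:.0f} mm coverage")
```

Under these assumptions the two acquisitions cover the same slab, which is consistent with the functional images being acquired in the same orientation and position as the structural scan, and the computed pixel sizes are close to the reported values of 0.89 mm and 1.8 mm.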
Data were analysed using SPM99 (Friston et al, 1995) implemented in MATLAB version 5.2 (The Mathworks Inc., MA, USA) and run on an Ultra 2 Creator 3D SUN workstation (Sun Microsystems, CA, USA) operating through Solaris version 3.5 (Sun Microsystems). Realignment, spatial normalisation and smoothing were performed prior to statistical analysis. Random effects analyses were carried out using a general linear model with a delayed boxcar waveform convolved with the haemodynamic response function. A high-pass filter was applied to the data to isolate the high-frequency effects of interest and to minimise physiological noise. Subject-specific low-frequency signal drift was removed by modelling with low-frequency sine and cosine waves (low-pass filter). Global flow effects were removed by proportional scaling.
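The idea of fitting a boxcar convolved with a haemodynamic response function can be sketched for a single voxel. This is a simplified illustration and not the SPM99 implementation: the single-gamma response shape, the helper names and the synthetic noise-free voxel are assumptions, and realignment, normalisation, smoothing, filtering and proportional scaling are omitted.

```python
import math

TR_S = 3.1        # repetition time from the text
N_VOLUMES = 72    # volumes per experiment from the text
EPOCH_S = 37.8    # epoch length from the text

def boxcar(n_volumes=N_VOLUMES, tr_s=TR_S, epoch_s=EPOCH_S):
    """1 during odd-numbered epochs, 0 during even-numbered epochs, one value per volume."""
    return [1.0 if int((i * tr_s) // epoch_s) % 2 == 0 else 0.0 for i in range(n_volumes)]

def gamma_hrf(tr_s=TR_S, duration_s=30.0, shape=6.0):
    """Single-gamma response sampled at the TR and scaled to sum to 1 (assumed shape)."""
    times = [i * tr_s for i in range(int(duration_s / tr_s) + 1)]
    raw = [t ** (shape - 1) * math.exp(-t) / math.gamma(shape) for t in times]
    total = sum(raw)
    return [v / total for v in raw]

def convolve(signal, kernel):
    """Causal discrete convolution truncated to the length of the signal."""
    return [sum(kernel[k] * signal[i - k] for k in range(len(kernel)) if i - k >= 0)
            for i in range(len(signal))]

def fit_ols(regressor, data):
    """Least-squares slope and intercept for data = slope * regressor + intercept."""
    n = len(data)
    mean_x = sum(regressor) / n
    mean_y = sum(data) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, data))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

regressor = convolve(boxcar(), gamma_hrf())
# Synthetic voxel (not study data): baseline of 100 plus a task response of 2 units.
voxel = [100.0 + 2.0 * x for x in regressor]
beta, baseline = fit_ols(regressor, voxel)
print(f"estimated effect {beta:.3f}, baseline {baseline:.3f}")
```

Convolution with the response function delays and smooths the boxcar, so that the model rises and falls a few seconds after each change of condition. The fitted slope plays the role of the per-voxel effect estimate that is carried forward into the contrasts.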
Effects at each and every voxel were estimated and regionally specific effects were compared using linear contrasts. One mean contrast image was produced for each subject on the respective comparisons within the three experiments. For the group analyses, statistical inferences were based on the theory of random Gaussian fields, and a random effects model was applied. In the second-level analysis, two-sample t-tests were used to determine between-group differences. In view of the restricted activation differences predicted in the hypotheses (middle temporal gyrus), whole-brain correction was deemed too conservative. Thus, an uncorrected threshold of P<0.001 was adopted in regions of the middle temporal gyrus. Covariate analyses were performed to assess the relationship between chlorpromazine equivalent medication levels and functional activations in relation to all experiments (positive and negative correlations).
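The second-level comparison at a single voxel reduces to a two-sample t-test on one contrast value per subject. The sketch below uses the pooled-variance form; the function name and the numbers are illustrative only and are not data from the study.

```python
import math
from statistics import mean, variance

def two_sample_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical per-subject contrast values at one voxel (illustrative, not study data).
patients = [1.0, 2.0, 3.0]
controls = [2.0, 3.0, 4.0]
t_value, df = two_sample_t(patients, controls)
print(f"t = {t_value:.3f} on {df} degrees of freedom")
```

With the group sizes reported here (12 patients with schizophrenia and 13 healthy controls), the same formula would give 23 degrees of freedom for that comparison. A positive t indicates greater activation in the first group, and reversing the order of the groups reverses the sign.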
In summary, the interaction contrasts (i) to (vi) were performed in SPM for each of conditions (a) to (d).
(a) When comparing passive listening to ‘pure’ filtered emotional prosody relative to rest...
(b) When comparing passive listening to unfiltered emotional prosody relative to rest...
(c) When observing attention to emotional prosody (ignoring semantics)...
(d) When observing attention to semantics (ignoring emotional prosody)...
(i) ... which regions were activated more in patients with schizophrenia than in healthy controls?
(ii) ... which regions were activated less in patients with schizophrenia than in healthy controls?
(iii) ... which regions were activated more in patients with schizophrenia than in patients with bipolar disorder?
(iv) ... which regions were activated less in patients with schizophrenia than in patients with bipolar disorder?
(v) ... which regions were activated more in healthy controls than in patients with bipolar disorder?
(vi) ... which regions were activated less in healthy controls than in patients with bipolar disorder?
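The crossing of conditions (a) to (d) with contrasts (i) to (vi) can be written out explicitly. In the sketch below the group ordering and the use of +1 and -1 weights are assumptions made for illustration; the text does not describe how the contrasts were coded in SPM.

```python
from itertools import product

GROUPS = ("schizophrenia", "bipolar", "control")   # column order is an assumption

CONDITIONS = {
    "a": "passive listening to filtered emotional prosody v. rest",
    "b": "passive listening to unfiltered emotional prosody v. rest",
    "c": "attention to emotional prosody (ignoring semantics)",
    "d": "attention to semantics (ignoring emotional prosody)",
}

# Weights follow GROUPS order; +1 marks the group tested for greater activation.
CONTRASTS = {
    "i":   (1, 0, -1),    # schizophrenia > control
    "ii":  (-1, 0, 1),    # schizophrenia < control
    "iii": (1, -1, 0),    # schizophrenia > bipolar
    "iv":  (-1, 1, 0),    # schizophrenia < bipolar
    "v":   (0, -1, 1),    # control > bipolar
    "vi":  (0, 1, -1),    # control < bipolar
}

analyses = [(cond, name, CONTRASTS[name]) for cond, name in product(CONDITIONS, CONTRASTS)]
print(f"{len(analyses)} between-group comparisons")
for cond, name, weights in analyses[:6]:
    print(f"({cond})({name}) weights {weights}: {CONDITIONS[cond]}")
```

Each pair of contrasts, such as (i) and (ii), tests the same group difference in opposite directions, which is why the weights within a pair are sign-reversed copies of each other.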
Statistical analyses of the demographic and behavioural data are summarised in Table 1. Here we summarise the temporal lobe activations that were significant at a threshold of P<0.001. Supra-threshold activations about which we did not have an a priori hypothesis are summarised in Tables 2-6. Tables 2 and 3 summarise the within-group contrasts for both patient groups. A full report on the neural response to emotional prosody in our healthy control subjects can be found in Mitchell et al (2003). However, in light of our hypotheses, it was the between-group comparisons that were of primary interest (Tables 4, 5 and 6) and it is the between-group differences that are the focus of the discussion that follows.
Scanning session 1: filtered emotional prosody v. rest
Consistent with our hypothesis, patients with schizophrenia activated the right anterior middle temporal gyrus to a greater extent than patients with bipolar disorder in response to passive listening to filtered emotional prosody. However, patients with bipolar disorder demonstrated greater activation than those with schizophrenia on the same task in another region of right middle temporal gyrus, posterior to that shown to be more responsive in patients with schizophrenia.
Scanning session 2: emotional prosody in normal speech v. rest
In patients with schizophrenia relative to healthy controls, contrary to our prediction of a greater right middle temporal gyrus response to emotional prosody in schizophrenia, emotional prosody in normal speech induced greater activation of the left middle temporal gyrus. On the other hand, healthy controls did activate the left superior temporal gyrus to a greater extent than did patients with schizophrenia.
Patients with schizophrenia also demonstrated relatively greater activation bilaterally in the superior temporal gyrus and the left middle temporal gyrus compared with patients with bipolar disorder. However, patients with bipolar disorder demonstrated relatively greater activation of lateral and anterior portions of the superior temporal gyrus than did patients with schizophrenia.
Scanning session 3 (not tabulated)
Attention to semantics (ignoring emotional prosody)
No brain regions were activated to a greater extent in patients with schizophrenia than in controls. Healthy controls activated the left middle frontal (Brodmann area 46, 197 voxels, Z score 3.52, x=-36, y=33, z=21) and right inferior frontal gyri (Brodmann area 46, 19 voxels, Z score 3.66, x=48, y=30, z=12) to a greater extent than patients with schizophrenia.
Attention to emotional prosody (ignoring semantics)
When attending to emotional prosody, patients with schizophrenia activated the left insula to a greater extent than did healthy controls (180 voxels, Z score 3.71, x=-33, y=-18, z=15) but they did not demonstrate increased activation of the right middle temporal gyrus compared with healthy controls. No brain regions were activated to a greater extent in healthy controls than in patients with schizophrenia.
As the chlorpromazine equivalent medication levels in patients with schizophrenia attending to emotional prosody increased, so did the neural activity in the left superior temporal gyrus, insula and superior frontal gyrus. However, there was no overlap between areas in which activity was linked to medication levels and areas recruited by patients to attend to emotional prosody.
In conjunction with analyses of the main effects of the subject group described above, further statistical analyses were performed to assess between-group differences in the relative laterality of response to emotional prosody. For each of the three experiments, a general linear model analysis of peak lateral temporal lobe activation Z scores was performed to examine interactions between subject group and the hemisphere in which lateral temporal lobe activation was greatest. The region of interest was defined as including activations in the superior and middle temporal gyri. The results of these analyses are summarised in Table 7.
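A group by hemisphere interaction of this kind can be illustrated by comparing left-minus-right differences between groups. The sketch assumes one left and one right peak Z score per subject, which the text does not confirm, and the values are hypothetical rather than study data.

```python
from statistics import mean

def laterality_differences(left_z, right_z):
    """Left-minus-right peak Z score for each subject (positive means left-lateralised)."""
    return [left - right for left, right in zip(left_z, right_z)]

def interaction_effect(patients, controls):
    """Difference between groups in mean left-minus-right score.

    Each argument is a (left_z, right_z) pair of equal-length lists.
    """
    return mean(laterality_differences(*patients)) - mean(laterality_differences(*controls))

# Hypothetical peak Z scores (illustrative only, not taken from Table 7).
patients = ([3.2, 2.9, 3.5], [2.1, 2.4, 2.0])   # left peaks, right peaks
controls = ([2.0, 2.2, 1.9], [3.1, 3.0, 3.4])
effect = interaction_effect(patients, controls)
print(f"interaction effect (patients minus controls): {effect:.3f}")
```

A positive effect in this toy example means that the patient group is more left-lateralised than the control group. With two groups and two hemispheres, testing this difference between groups is equivalent to testing the interaction term.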
Patients with schizophrenia demonstrated some reversal of the normal right-lateralised temporal lobe response to emotional prosody, particularly in response to unfiltered emotional prosody.
Patients with bipolar disorder also displayed some evidence of a reversal of the normal temporal lobe response to emotional prosody.
When actively attending to emotional prosody, patients with schizophrenia activated the left insula to a greater extent than did healthy controls.
When listening to pure emotional prosody, patients with bipolar disorder activated the amygdala, bilateral superior temporal gyrus and right inferior frontal gyrus less than did controls.
Key abnormalities in patients with schizophrenia
We hypothesised that for both passive and active listening to emotional prosody patients with schizophrenia would demonstrate greater activation of the right middle temporal gyrus than healthy control subjects. This hypothesis presupposed that, like healthy controls, patients with schizophrenia would activate the right middle temporal gyrus to process emotional prosody, albeit to a greater degree. Contrary to our hypotheses, however, patients with schizophrenia demonstrated some reversal of the normal right-lateralised temporal lobe response to emotional prosody (i.e. left more than right). In response to both pure emotional prosody and emotional prosody in normal speech, the within-group Z scores (Tables 2 and 3) suggested that, compared with rest, healthy controls activated the right lateral temporal lobe more than the left, and that patients with schizophrenia activated the left temporal lobe more than the right. In response to unfiltered emotional prosody, the between-group comparisons demonstrated that patients with schizophrenia activated the left middle temporal gyrus significantly more than did healthy controls, and post hoc analyses established that there was a highly significant interaction between subject group and hemisphere, such that patients with schizophrenia showed greater activation of the left lateral temporal lobe than of the right, and healthy controls demonstrated greater activation of the right lateral temporal lobe than of the left (see Table 7).
Previous studies suggest that there may be some reversal of the normal left-lateralised temporal lobe response to semantics in patients with schizophrenia (Woodruff et al, 1997). This previously demonstrated right lateralisation of response to semantics in schizophrenia could feasibly be accompanied by a left lateralisation of the response to emotional prosody. Indeed, such a double reversal of dominance was described by Joseph (1986) in his post-surgical investigation of a patient who had undergone neurosurgery to alleviate early-onset left-hemisphere epilepsy. That case study supports the idea that the left hemisphere could ‘take over’ functions normally subserved by the right hemisphere if the right hemisphere is damaged sufficiently and early enough in brain development. However, our abnormal lateralisation findings are confined to emotional prosody. Our observation that emotional prosody is sometimes left lateralised in schizophrenia means that an excessive cortical response to emotional prosody in schizophrenia is unlikely to explain the right-sided bias in their cortical response to external speech (Woodruff et al, 1997). Furthermore, the similarity of our observation in patients with schizophrenia to that observed in individuals with bipolar disorder suggests that factors that lead to this right-left reversal or failure of normal lateralisation are common to some general ‘psychosis factor’, not necessarily specific to schizophrenia alone. Despite this similarity, patients with schizophrenia and bipolar disorder activated different temporal lobe regions from each other in response to filtered and unfiltered emotional prosody. The data suggest that, in our tasks, schizophrenia and bipolar disorder are associated with functional abnormalities in different regions of the middle and superior temporal gyri (Table 5). 
It is possible that the two diseases may differentially affect temporal lobe function, perhaps reflecting observed differences in language (Docherty et al, 1996; Thomas et al, 1996) and auditory event-related potential disturbances (Souza et al, 1995) between these two patient groups.
A similarly noteworthy result arose when attention to emotional prosody was compared with attention to semantics in patients with schizophrenia relative to healthy controls. In this active condition, patients with schizophrenia activated the left insula to a greater extent than did healthy controls. It has been reported previously (Shapleske et al, 2002) that in patients with schizophrenia the insula is reduced in size compared with healthy controls. Top-down attentional modulation is a means by which directing attention towards a stimulus in one sensory modality enhances cortical activation in sensory cortex of the same modality (Woodruff et al, 1996). Together, these findings could indicate that in patients with schizophrenia the insula is hyper-responsive to attentional modulation of the response to speech that contains emotional prosody. In this context, hyper-responsivity of the insula could reflect a reduction of the insula inhibition that normally occurs during self-regulation of emotional response (Beauregard et al, 2001), as part of the affect regulation difficulties (Shaw et al, 1999) or fronto-temporal disinhibition that characterise the disorder (Friston et al, 1992).
Other abnormalities in patients with bipolar disorder
Compared with healthy controls, patients with bipolar disorder listening to pure emotional prosody demonstrated significantly less activation of the amygdala, uncus, bilateral superior temporal gyrus and right inferior frontal gyrus. Previous neuroimaging studies have demonstrated that emotional prosody recognition normally activates the right prefrontal cortex (George et al, 1996; Buchanan et al, 2000) and the right anterior auditory cortex (Buchanan et al, 2000; Mitchell et al, 2003). We speculate that the relative lack of right superior temporal and inferior frontal gyrus activity in patients with bipolar disorder could indicate that this patient group has a reduced (neural) capacity to process the emotional prosody tested in our study, or that this capacity is under-utilised. Our finding of significantly decreased amygdala activation in patients with bipolar disorder compared with healthy controls listening to pure emotional prosody is consistent with structural amygdala abnormalities well documented in bipolar disorder (Strakowski et al, 1999; Altshuler et al, 2000). The existence of functional amygdala abnormalities in bipolar disorder in response to emotional stimuli has been demonstrated already, in facial emotion discrimination tasks (Kennedy et al, 1997; Yurgelun-Todd et al, 2000).
Because patients did not demonstrate a performance deficit, the patient-control differences in functional activations can be ascribed to neural processes in schizophrenia itself rather than to behavioural impairment alone. Furthermore, contrary to recent hypotheses (Heinrichs & Zakzanis, 1998; Bilder et al, 2000), patients with schizophrenia did not demonstrate global attenuation of brain activity, because when processing emotional prosody they activated several brain regions not activated by healthy controls. In this respect, it is important to note that, in response to emotional prosody, patients with schizophrenia actually activated some brain regions more than did healthy controls, suggesting that when performing certain processing tasks this patient group has the capacity to recruit brain regions and cognitions that healthy controls cannot.
In a future study, it would be interesting to examine whether, like prosodic comprehension, prosodic expression is mediated abnormally by patients with schizophrenia. It is not clear from the literature whether patients with schizophrenia have a tendency to be abnormally right lateralised for the production of speech as well as its comprehension.
Patients with bipolar affective disorder may experience several different phases of illness. Thus, it is likely that a patient’s cognitive dysfunctions and neural response to emotional prosody also may vary in phases. To support the hypothesis that emotional dysfunctions are related to emotional prosodic abnormalities in patients with bipolar disorder, it would be informative to determine whether this patient group also responds abnormally to non-emotional or linguistic prosody.
Clinical Implications and Limitations
▪ Owing to the importance of emotional interpretation for successful social integration, further attention should be paid to the role of emotional prosodic processing dysfunction in the communicative abilities of people with schizophrenia.
▪ The role of emotional prosody in misinterpretation of statements or reinforcement of paranoid activity as occurs in delusions should be considered in more depth.
▪ Our findings may increase understanding of the functional implications of abnormal cerebral lateralisation. The abnormal lateralisation of emotional prosodic processing in schizophrenia could predispose patients to symptoms such as auditory hallucinations and thought disorder.
▪ In common with functional imaging studies in general, this study did not specifically address the impact of any structural abnormalities on functional abnormalities.
▪ Because there was only a limited measure of output during passive listening to emotional prosody (debriefing), we cannot be specific about the additional cognitive operations, perhaps unrelated to the task, that patients may have been engaging while in the scanner.
▪ Although the covariate analyses suggested a lack of association between the response to emotional prosody and medication levels, the correlational approach cannot prove or rule out whether medication affects the emotional prosodic processing in schizophrenia.
- Received April 11, 2003.
- Revision received July 30, 2003.
- Accepted October 29, 2003.
- © 2004 Royal College of Psychiatrists