The British Journal of Psychiatry



The pathophysiology of auditory verbal hallucinations remains poorly understood.


To characterise the time course of regional brain activity leading to auditory verbal hallucinations.


During functional magnetic resonance imaging, 11 patients with schizophrenia or schizoaffective disorder signalled auditory verbal hallucination events by pressing a button. To control for effects of motor behaviour, regional activity associated with hallucination events was scaled against corresponding activity arising from random button-presses produced by 10 patients who did not experience hallucinations.


Immediately prior to the hallucinations, motor-adjusted activity in the left inferior frontal gyrus was significantly greater than corresponding activity in the right inferior frontal gyrus. In contrast, motor-adjusted activity in a right posterior temporal region overshadowed corresponding activity in the left homologous temporal region. Robustly elevated motor-adjusted activity in the left temporal region associated with auditory verbal hallucinations was also detected, but only subsequent to hallucination events. At the earliest time shift studied, the correlation between left inferior frontal gyrus and right temporal activity was significantly higher for the hallucination group compared with non-hallucinating patients.


Findings suggest that heightened functional coupling between the left inferior frontal gyrus and right temporal regions leads to coactivation in these speech processing regions that is hallucinogenic. Delayed left temporal activation may reflect impaired corollary discharge contributing to source misattribution of resulting verbal images.

The pathophysiological basis of auditory verbal hallucinations, experienced by 60–80% of people with schizophrenia, remains uncertain. Functional neuroimaging studies have associated these hallucinations with haemodynamic activity in diverse brain regions, including the left inferior frontal gyrus,1–4 right inferior frontal gyrus,2,4–6 middle and superior temporal gyri,2–5,7 primary and association auditory cortex,2,8 hippocampal and parahippocampal regions,2,3,5,9 and the thalamus,3,4,7,9 with much variation between patients and across studies. It is uncertain, moreover, which of these findings reflect core mechanistic processes producing auditory verbal hallucinations or are downstream consequences, such as registration of hallucinated content in verbal memory or propagated activity to non-essential areas.

The purpose of our study was to improve our understanding of the chain of brain events leading to auditory verbal hallucinations by determining the sequential organisation of corresponding regional blood oxygen level-dependent (BOLD) activity derived from functional magnetic resonance imaging (fMRI). Although temporal resolution of BOLD signals is relatively coarse (of the order of 1 s or more), our approach was encouraged by three prior reports of BOLD signal time course – including one report by our group – showing heightened activity prior to motorically signalled auditory verbal hallucination onset in the right middle temporal cortex,10–12 the left inferior frontal gyrus,11 and the neighbouring insula,12 suggesting precursor events leading to auditory verbal hallucinations. These reports were limited by small sample sizes (n = 1–6) and the possibility that BOLD activity, even in language processing regions, is altered by motor behaviour required to signal hallucination occurrences for time-course analysis.13 Since we completed our study, a report of BOLD time course preceding hallucination events based on a larger number of patients (n = 15) by Diederen et al has been published.14 In that study, hallucination events were signalled by squeezing a balloon. A control group of healthy participants who squeezed a balloon randomly during scanning was also studied. No statistically significant shift in regional activity prior to random balloon-squeezes was detected for this control group, so these data were not further considered. Their study detected sites of reduced activity prior to hallucination events, which was most pronounced in the parahippocampal gyrus.

Our study involved 11 patients who signalled hallucination events by button-presses during scanning, and a comparison sample of 10 patients with similar diagnoses and medication status who did not experience hallucinations. The latter group depressed a button at random intervals during scanning matching the frequency and duration of button-pressing by the hallucination group. Given that motor initiation programmes may produce BOLD signals, regional activity associated with hallucination events was statistically adjusted for activity arising from random button-pressing. Participants in the hallucination group were not themselves utilised for generating random button-press behaviour since the initiation of such behaviour could be influenced by frontal activation associated with their auditory verbal hallucinations,2–4,11 thereby altering timing and possibly reducing statistical power. In contrast to the study by Diederen et al, our primary emphasis was on delineating regional activation rather than deactivation while focusing on language processing regions.


Eleven patients with schizophrenia or schizoaffective disorder reporting severe auditory verbal hallucinations were studied together with a comparison group of ten patients with these diagnoses but without active auditory verbal hallucinations. Participants were selected for the former group if they experienced such hallucinations on average at least once every 3 min with intervening subjective silence, and could signal onset and offset of these hallucinations reliably. These patients did not overlap with our previous sample.12 All participants were right-handed and had no history of significant neurological disorder, head trauma, substance misuse during the prior month or substance dependence at any time. Written consent was obtained according to expectations of the Yale University School of Medicine institutional review board. Diagnostic assessments were conducted using the Structured Clinical Interview for DSM–IV – Patient Version (SCID–I/P) version 2.0.15 Symptom assessments were conducted using the Positive and Negative Syndrome Scale (PANSS).16 Comparison group patients were enrolled to match the hallucination group in terms of overall positive symptom severity.

Scanning protocol

Magnetic resonance imaging employed a 3 T Trio scanner (Siemens, Erlangen, Germany). Twenty-two 4 mm T1-weighted images were acquired parallel to the anterior–posterior commissural line with a 0.8 mm skip between images. Blood oxygen level-dependent data were collected at the same slice locations and were acquired in runs lasting 4 min 6 s each: 164 gradient-recalled, single-shot echoplanar images for each slice, repetition time (TR) 1500 ms, time to echo (TE) 30 ms, flip angle 80°, 64×64 acquisition matrix, voxel size 3.125 mm×3.125 mm×4.8 mm. The first four images of each slice in each run were discarded in order to allow magnetisation to reach a steady state. Participants were generally studied for six runs. Those experiencing hallucinations were asked to depress and release a button to signal the onset and offset respectively of each auditory verbal hallucination event during scanning. Control group participants were instructed to press the button spontaneously at random intervals during scanning; feedback was given between runs to adjust the frequency and duration of button-presses to match the hallucination group. Control participants were instructed not to count button-presses mentally or cue themselves with self-talk, to diminish the likelihood that these behaviours were accompanied by inner speech. Extra runs were required for three members of the hallucination group to ensure a sufficient sample of hallucination events. Extra runs were required for two control participants in order to match the button-press frequency of the hallucination group. Some runs were dropped from the analysis because of excessive head movement.

Image analyses


All slices were first corrected for the time of acquisition using sinc interpolation. The data were then motion-corrected using the statistical parametric mapping algorithm. A spatial Gaussian filter with 2 pixels full width at half maximum was applied to the data. Pixels with a median value below 5% of the maximum median pixel value were set to zero, and data were low-pass filtered with a cut-off of 0.2 Hz to focus on fluctuations related to the BOLD response and suppress the effects of higher-frequency noise.

Reference time course

For each participant the time course of haemodynamic activity arising from putative neural activity marked by button presses/releases was convolved with a standard gamma model of haemodynamic response where neural activation elicits a smoothed BOLD signal rise and fall spread predominantly over a 2–8 s delay.17
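The convolution step can be sketched as follows. This is a minimal illustration rather than the authors' code: the gamma shape (6) and scale (0.9 s), the 30 s kernel length and the function names are all assumptions, chosen only so that the modelled response rises and falls mainly within the 2–8 s window described above. The TR of 1.5 s and the button on/off coding come from the text.

```python
import math

TR = 1.5  # seconds per volume (from the scanning protocol)


def gamma_hrf(tr=TR, shape=6.0, scale=0.9, length_s=30.0):
    """Gamma-shaped haemodynamic response sampled every `tr` seconds.

    `shape`, `scale` and `length_s` are illustrative assumptions,
    not values reported in the paper.
    """
    n = int(length_s / tr)
    norm = math.gamma(shape) * scale ** shape
    h = [((i * tr) ** (shape - 1)) * math.exp(-(i * tr) / scale) / norm
         for i in range(n)]
    total = sum(h)
    return [v / total for v in h]  # unit sum keeps the signal scale


def reference_time_course(button_down, tr=TR):
    """Convolve a 0/1 button-state list (one value per volume) with the HRF."""
    h = gamma_hrf(tr)
    out = []
    for t in range(len(button_down)):
        # Causal convolution: only current and past button states contribute.
        out.append(sum(h[k] * button_down[t - k]
                       for k in range(min(t + 1, len(h)))))
    return out
```

With these assumed parameters, a button held down for four volumes yields a smoothed reference curve that peaks a few seconds after the press.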

Correlating time courses to characterise sequential event-related activity

For each participant and each scan, BOLD signal fluctuations in all pixels comprised by a given slice were correlated with the haemodynamically convolved reference time course at time shifts ranging from –9.0 s to 4.5 s at 1.5 s intervals (TR of scan protocol). At each time shift and pixel, correlations derived in this fashion were averaged across runs. These correlations were transformed into an approximately normal distribution yielding a map representing the strength of time-shifted correlations relative to the reference time course expressed as standardised z-values, which were then transformed to Talairach coordinates. This correlational method does not require that the BOLD signal returns to baseline from one hallucination event to the next in order to detect event-related activity. Consequently, patients were included in the study who signalled some hallucinations with very short inter-event intervals that might be excluded by more standard event-related BOLD signal averaging methods. Irregular temporal spacing of these events improves statistical power for this correlation-based method, just as ‘jittering’ allows closer event spacing in other event-related fMRI designs.18
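A rough sketch of the time-shifted correlation for a single pixel is given below. It rests on several assumptions: the normalising transform is taken to be Fisher's arctanh (the text does not name it), a negative shift is taken to mean BOLD activity preceding the reference, and all function names are hypothetical. The ten shifts from –9.0 s to +4.5 s in steps of one TR follow the text.

```python
import math

TR = 1.5
SHIFTS = list(range(-6, 4))  # -9.0 s to +4.5 s in steps of one TR


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shifted_correlation(bold, reference, shift):
    """Pearson r between bold[t + shift] and reference[t] over their overlap."""
    n = len(bold)
    if shift < 0:
        return pearson(bold[: n + shift], reference[-shift:])
    if shift > 0:
        return pearson(bold[shift:], reference[: n - shift])
    return pearson(bold, reference)


def shift_profile(runs):
    """Average r across runs at each shift, then apply Fisher's arctanh.

    `runs` is a list of (bold, reference) pairs, one pair per run.
    """
    profile = []
    for k in SHIFTS:
        rs = [shifted_correlation(bold, ref, k) for bold, ref in runs]
        profile.append(math.atanh(sum(rs) / len(rs)))
    return profile
```

Under this sign convention, a pixel whose signal leads the reference by two volumes (3 s) shows its largest value at the –3.0 s shift.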

Delineation of regions of interest

At each Talairach voxel and time shift, activity was assessed as the correlation of BOLD time course with the hallucination event on/off time course for the 11 patients with hallucinations and compared, using two-tailed t-tests, with the same activity measure calculated for random button-presses for the 10 patients in the control group. The contrast map of regional activity across the ten time shifts considered (online Fig. DS1) was used to demarcate four regions of interest (ROIs) for time-course analysis in cortical language centres in left and right posterior temporal regions and left and right inferior frontal regions based on prior fMRI and positron emission tomography activation studies of auditory verbal hallucinations;1–7,11,12 any voxel in these regions was included in the corresponding ROI if expressing a level of significance with a cut-off of P<0.005 (uncorrected) when contrasting groups for any of the time shifts considered. A midbrain ROI was also included using the same statistical cut-off in light of studies implicating midbrain dopamine and serotonin neuromodulatory groups in the pathophysiology of schizophrenia.19,20

Time-course analyses

Region of interest activity derived from BOLD signal–event time-course correlations at different time shifts was residualised by statistically removing effects of baseline variables showing trend-level differences between the two study groups. Residualised ROI activity data for hallucinations were then transformed into z′ scores by scaling against corresponding, time-shifted ROI activity data for random button-presses as follows:

$$z'_{R,i,j} = \frac{h_{R,i,j} - \mu_{R,j}}{\sigma_{R,j}}$$

where $z'_{R,i,j}$ is the residualised z′ score for region (R), individual (i) and time shift (j), $h_{R,i,j}$ is the corresponding residualised activity assessed relative to hallucination events, and $\mu_{R,j}$ and $\sigma_{R,j}$ are the mean and standard deviation of residualised, event-related activity relative to control on/off button-press behaviour calculated for the control group as a whole for the same region and time shift.
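The scaling step for one ROI might look like the sketch below: each hallucination-group value is expressed as its distance from the control-group mean at the same time shift, in units of the control-group standard deviation. The function name and data layout are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import statistics


def motor_adjusted_z(hallucination_activity, control_activity):
    """Scale hallucination-group activity against control button-press activity.

    Each argument is a list of per-participant lists holding one residualised
    activity value per time shift, for a single ROI.
    """
    n_shifts = len(control_activity[0])
    mu = [statistics.mean([row[j] for row in control_activity])
          for j in range(n_shifts)]
    # Sample SD is an assumption; the paper does not specify the estimator.
    sigma = [statistics.stdev([row[j] for row in control_activity])
             for j in range(n_shifts)]
    return [[(row[j] - mu[j]) / sigma[j] for j in range(n_shifts)]
            for row in hallucination_activity]
```

Because the control mean and standard deviation are computed separately at every time shift, a rise in control-group variance at a given shift shrinks the z′ scores there.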

Haemodynamic smoothing of BOLD signal response to neural activity will cause our correlational measure of event-related haemodynamic activity to spread across multiple positive and negative time shifts.17 Nonetheless, differences in sequential timing of peak neural activity associated with hallucination events in different regions should be detectable as differences in time shifts when peak haemodynamic activity is expressed, so long as their haemodynamic response functions are similar. Consequently, our primary statistical analysis compared motor-adjusted activity time course (represented as sequences of time-shifted residualised z′ scores) for left v. right homologous ROIs, since their haemodynamic response profiles are likely to derive from similar vascular physiology. To this end, a linear mixed model was employed with left and right homologous ROIs as a regional within-individual factor and the ten time shifts within each region as a repeated measure. The correlation structure of the data was modelled by random effects for participant and by structured variance–covariance matrix for repeated observations within each region. The latter variance–covariance structure was the best fitting according to information criterion. All data were analysed using SAS version 9.1 for Windows. Insofar as the midbrain ROI was a single bilateral cluster, motor-adjusted time course of activity for that ROI was characterised relative to time shift only.

In order to statistically compare level of functional coupling between ROIs for hallucination v. random button-press events, activity levels associated with signalled events were themselves correlated between different ROIs within the two groups. These group-specific correlations were then contrasted by means of Fisher r to Z transformations.
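A standard way to contrast two independent correlations with Fisher's r-to-Z transformation is sketched below. It is offered as an illustration of the general technique, not as the exact procedure run in SAS. The function name is hypothetical and the correlation values in the usage note are invented for demonstration; only the group sizes (11 and 10) come from the study.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Two-tailed test that two independent Pearson correlations differ.

    Returns the z statistic and its two-tailed p-value under a normal
    approximation.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal p-value
    return z, p
```

For instance, `compare_correlations(0.8, 11, -0.4, 10)` contrasts a hypothetical positive interregion correlation in the hallucination group with a hypothetical negative one in the control group. With groups this small the standard error is large, so only sizeable differences in correlation reach significance.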

All significance levels reported were two-tailed.


The two patient groups were well-matched in terms of gender, education and PANSS positive and total symptom scores, with members of the control group tending to be somewhat older and exhibiting greater negative symptoms at trend levels (Table 1). Motor-adjusted activity for each time shift relative to hallucination events was therefore residualised to remove variance due to the latter two variables. All patients apart from one in the non-hallucination group received antipsychotic drugs. The two patient groups were not statistically different in terms of the frequency and duration of button-presses, number of runs analysed and the number of button-presses occurring in an interval less than or equal to 13.5 s, the time domain of the analysis (Table 1).

Table 1

Demographic, clinical and data acquisition profile of the two study groups

Time course of residualised motor-adjusted activity associated with hallucination events

Residualised motor-adjusted activity associated with hallucination events was significantly greater for the left inferior frontal gyrus ROI than for the right inferior frontal gyrus ROI (F(1,190) = 7.3, P<0.008). Moreover, there was a significant effect of time shift (F(9,190) = 23.8, P<0.0001), and a significant region×time-shift interaction (F(9,190) = 22.3, P<0.0001). Figure 1 shows the trajectory of residualised, motor-adjusted activity prior to and subsequent to hallucination events for these two regions. Post hoc analyses using pooled variance found statistically greater residualised motor-adjusted activity for the left inferior frontal gyrus ROI compared with right inferior frontal gyrus ROI for time shifts ranging from –6.0 s to 0 s.

Fig. 1

Comparison of motor-adjusted activity assessed as residualised z′ scores corrected for age and negative symptoms for the left v. right inferior frontal gyrus (IFG) regions of interest (ROIs) averaged across the group of patients experiencing hallucinations.

The x-axis reflects time shift with 0 s corresponding to the interpolated event occurrence. Boxed significance levels reflect time shifts where the two sets of data were significantly different from each other at an uncorrected significance level cut-off of P<0.005. Error bars reflect standard error.

There was no regional difference overall for residualised motor-adjusted activity associated with hallucination events in a comparison between left and right temporal ROIs (F(1,190) = 0.52). However, there was a significant effect of time shift (F(9,190) = 18.8, P<0.0001) and a significant interaction of region×time shift (F(9,190) = 8.4, P<0.0001). Figure 2 shows the trajectory of these temporal region data at negative and positive time shifts relative to hallucination occurrences. Post hoc analyses using pooled variance found robustly greater residualised motor-adjusted activity for the right temporal ROI compared with the left temporal ROI at –4.5 s and –3.0 s time shifts relative to hallucination occurrences. Residualised motor-adjusted activity for the midbrain ROI was not significantly different from baseline levels until well after hallucination onset, namely at the +3 s time shift (one-sample t = 5.4, d.f. = 10, P<0.0001) and +4.5 s time shift (one-sample t = 10.9, d.f. = 10, P<0.0001).

Fig. 2

A parallel comparison of residualised z′ scores for motor-adjusted activity in the right v. left temporal regions of interest (ROIs) (other aspects of figure as in Fig. 1).

Correlations between activity in the left inferior frontal gyrus and the right temporal ROIs

To probe the functional relationship between left inferior frontal gyrus and right temporal ROIs prior to hallucination events, correlations between haemodynamic activity for these two regions were computed. For hallucination events, activity in the left inferior frontal gyrus at the earliest time shift considered (–9 s) was positively correlated with activity in the right temporal ROI for that same time shift and for subsequent time shifts up to and including the –4.5 s time shift (Fig. 3(a)). These interregion activity correlations were significantly greater than comparable interregion correlations for random button-presses by patients in the control group across time shifts ranging from –9 s to –6 s inclusive (Fig. 3(a)). Group differences were absent for correlations of activity between left inferior frontal gyrus and left temporal ROIs (Fig. 3(b)).

Fig. 3

Correlations between activity in brain regions of interest and hallucination/random button-press events.

(a) Maps of correlations calculated between activity in the left inferior frontal gyrus (IFG) region of interest (ROI) at the earliest (–9 s) time shift relative to hallucination/random button-press event occurrences (corresponding to the zero time shift) and activity for the right temporal ROI at this time shift and the next three time shifts. Boxed values demonstrate significantly greater correlations for patients experiencing hallucinations (n = 11) compared with patients who did not (n = 10). (b) The same correlations calculated between activity for the left IFG ROI at –9 s time shift and activity for the left temporal ROI at this time shift and the next three time shifts. No difference in these correlations comparing the two groups was detected.


Auditory verbal hallucinations occur discretely in time, thus providing an opportunity to map using fMRI a critical symptom of schizophrenia as a sequence of activations and deactivations in language-processing cortical areas. Magnetoencephalography and scalp/intracortical electroencephalography have demonstrated synchronisation of activity within a 150–400 ms window involving left inferior frontal gyrus and both left and right temporal regions following initiation of speech-processing tasks ordinarily, including speech perception and inner speech production.21–25 Although temporal resolution of fMRI data is much coarser, this measurement technique was nonetheless able to detect disrupted synchrony of these cortical language-processing areas associated with auditory verbal hallucinations: parallel coactivation emerged in the left inferior frontal gyrus and right temporal ROIs clearly prior to signalled hallucination events (Figs 1 and 2), and a dramatic delay in peak activity detected in the left temporal ROI of the order of 4.5–6 s (Fig. 2). These findings suggest a leading role of speech-processing regions in the left inferior frontal gyrus and right temporal ROIs in triggering auditory verbal hallucinations. This hypothesis is further supported by robustly positive correlations between activity levels in these two ROIs detected a full 9 s prior to hallucination events even when these levels were no different from baseline (Fig. 3(a)). These findings were specific to participants experiencing hallucinations – correlations between activity levels for these same two ROIs at the same time shifts were negative for control motor behaviour by patients in the non-hallucination group. These findings were also selective insofar as pre-event left inferior frontal gyrus/left superior temporal gyrus activity correlations comparing the two study groups were not significantly different (Fig. 3(b)).

Studies have shown that elevated synchronisation between cortical regions can prompt coactivation states that are experienced consciously as percepts (for review and supporting data see Cosmelli et al).26 This view suggests a mechanistic framework for our findings, namely elevated functional coupling between left inferior frontal gyrus and right temporal speech-processing regions inducing subsequent coactivation in these regions that becomes hallucinogenic.

Heightened activity in right posterior middle temporal and superior temporal gyri prior to auditory verbal hallucinations replicates and extends earlier case studies.10–12 Involvement of these regions in generating auditory verbal hallucinations may reflect their role in wilfully generating auditory imagery of non-self speech,27 and ascertaining speaker identity and intentions of environmental speech based on acoustic and prosodic characteristics.28,29 These latter functions may come online owing to enhanced activity in these regions immediately prior to hallucination events, thereby eliciting an uncanny sense that distinct, identifiable non-self individuals or agents are intentionally producing these experiences. It is likely that the full experience of auditory verbal hallucinations requires subsequent recruitment of other temporal regions.11

Heightened activity in the left inferior frontal gyrus was previously reported by Shergill et al for two cases a full 6–9 s prior to signalled auditory verbal hallucination onset.11 Activation in this region is characteristic of inner speech and auditory verbal imagery ordinarily, suggesting a common mechanism.26 Activation in the left inferior frontal gyrus is also associated with heightened auditory attention,30 which could facilitate emergence of activity in temporal regions as speech percepts.

The functional significance of delays in left superior temporal gyrus activation relative to left inferior frontal gyrus activation is suggested by studies indicating that the latter sends ‘feed-forward’ corollary discharge information to bilateral temporal areas in order to dampen perceptual consequences of self-generated inner and overt speech.31,32 Consequently, these experiences are more readily differentiated from speech originating from an environmental source. Corollary discharge propagation to temporal regions could plausibly be expressed as elevated haemodynamic activity. If so, the delay in peak activity in left superior temporal gyrus relative to emergence of hallucinated speech indicated by our data might reflect a partial breakdown of this feed-forward mechanism, thereby disrupting the capacity to experience these acoustic images as self-generated.

Although left inferior frontal gyrus and right posterior temporal regions are linked indirectly by homologous cortical regions and arcuate and callosal white matter pathways, no single pathway links them directly. Heightened functional coordination postulated to occur between these two regions would thus need to be mediated by a third region, probably subcortical. One possibility is a thalamic signal generator. The thalamus projects to inferior frontal gyrus and temporal regions,33 participates in language processing,21 and has been postulated to have a key role in a variety of hallucination syndromes.34 Elevated thalamic activity was not detected prior to auditory verbal hallucinations in our data. Nonetheless, it is possible that thalamic projections can coordinate activity across cortical regions without the thalamus itself showing increased BOLD activity. This possibility is suggested by a study demonstrating that shifts in BOLD activity for a given region reflect primarily its levels of input and local processing rather than its spike outputs.35 A thalamic signal generator, if firing relatively spontaneously without extensive input or local processing, might therefore not produce local BOLD signal activation. This formulation does not address why thalamic projections to right temporal regions might carry greater weight than projections to left homologous regions in people experiencing hallucinations. One possibility is suggested by structural abnormalities in the latter associated with auditory verbal hallucinations,36,37 which could ‘release’ right temporal processes.38

Three fMRI studies have examined functional connectivity associated with auditory verbal hallucinations linking frontal and temporal regions, and are therefore relevant to our findings. Consistent with our proposed coupling mechanism, Raij et al reported that the subjective reality of these hallucinations was correlated with functional connectivity enhancement between left inferior frontal gyrus and a right posterior temporal region elicited by the hallucination event.39 Shergill et al reported that when fMRI data were collected during inner speech production at different rates, functional connectivity between left inferior frontal gyrus and right middle/superior temporal gyri was less in patients with hallucinations than in healthy controls.31 This finding challenges our proposed model. However, functional connectivity underlying voluntary generation of inner speech may be distinct from that underlying auditory verbal hallucinations, which generally consist of involuntary auditory imagery of other people speaking. Finally, Lawrie et al found reduced functional connectivity linking left dorsolateral prefrontal cortex and left temporal cortex during a sentence completion task in three patients with auditory verbal hallucinations compared with five patients without such hallucinations.40 Impaired frontotemporal connectivity could produce the abnormal delay in left superior temporal gyral activation relative to left inferior frontal gyral activation detected in our study.

As noted earlier, a recent study by Diederen et al examined pre-hallucination BOLD time course in a relatively large patient sample, and found deactivation in the left parahippocampal gyrus as well as in the left superior temporal gyrus, right inferior frontal gyrus, left middle frontal gyri, right insula and left cerebellum.14 An examination of our data revealed some comparable findings. Although not a focus of this report, pre-hallucination activity reductions were detected in the right parahippocampal gyrus at –3 s and –1.5 s time shifts (online Fig. DS1), with a smaller left parahippocampal site of reduced activity detected at the –4.5 s time shift. Other sites of pre-hallucination deactivation in our data included bilateral superior temporal gyrus (anterior to superior temporal gyrus sites of elevated activity associated with auditory verbal hallucinations) and a region medial to the right insula overlapping the putamen. These pre-hallucination deactivation findings suggest either broadly distributed, interregion oscillatory instability or early-phase heightened activity in some regions producing secondary inhibition in others. In terms of the latter explanation, Diederen et al proposed that parahippocampal deactivation is induced by dopaminergic hyperactivity, causing aberrant verbal memories that trigger auditory verbal hallucinations.14 Another possibility is that pre-hallucination left inferior frontal gyrus and right temporal activation found in our study (see also Shergill et al)11 reflexively induces inhibition in other brain regions.

Our study had a number of limitations. First, scanner sounds during data acquisition probably obscured hallucination-specific activity in auditory processing brain regions (e.g. primary auditory cortex and thalamus) that may be involved in generating percept-like qualities of these experiences.2–4,7–9 Second, fMRI methods have limited temporal resolution, and differences in vascular physiology could distort time-course findings to some degree;18 for instance, it is possible that the vascular physiology of the midbrain is considerably different from that of cortical regions. Therefore, judgements regarding time-shift delays for midbrain v. cortical activity concurrent with auditory verbal hallucinations in absolute time based on BOLD activity must remain tentative. Third, motor initiation processes producing random button-presses may be different from those signalling auditory verbal hallucinations, since the former are spontaneous and the latter are in response to percept-like experiences. Moreover, motor behaviour for both conditions is likely to have lateralising effects that would have biased some of our findings. Along these lines, starting at the –4.5 s to –3 s time shift, variance of activity computed across participants increased for the right inferior frontal gyrus ROI – but not for the left inferior frontal gyrus ROI – for both control button-press behaviour and motorically signalled hallucinations. This shift in variance reduced our ability to detect motor-adjusted activity accompanying hallucinations in the right inferior frontal gyrus (see equation in Method section) and renders comparisons with left inferior frontal gyrus motor-adjusted activity inconclusive. However, variances for left/right temporal and left inferior frontal gyral activity remained comparable across time shifts and between conditions, thereby lending confidence to the relative sequencing of motor-adjusted activity associated with auditory verbal hallucinations for these regions.

In summary, our findings suggest that early-phase coupling and coactivation involving the left inferior frontal gyrus and right temporal regions – with dramatically delayed activation in the left superior temporal gyrus – play an important part in generating auditory verbal hallucinations. Pre-hallucination regional deactivation may also have a central mechanistic role.14 To test our model further we have begun a study of patients with auditory verbal hallucinations, combining low-frequency repetitive transcranial magnetic stimulation (rTMS) with fMRI assessed at baseline and after this intervention. We intend to test the prediction that the degree of improvement in hallucinations following rTMS delivered to posterior temporal regions41 positively correlates with the level of reduction in functional coordination between the right superior temporal gyrus, the left inferior frontal gyrus and subcortical regions such as the thalamus.


The study was funded by National Institute of Mental Health grants R01MH073673 and R01MH067073, the Dana Foundation, the National Alliance for Research on Schizophrenia and Depression (NARSAD), National Institutes of Health NIH/NCRR/GCRC program grant RR00125 and the Department of Mental Health and Addiction Services of the State of Connecticut.

  • Received September 8, 2010.
  • Accepted November 17, 2010.

