The British Journal of Psychiatry
Independent course of childhood auditory hallucinations: a sequential 3-year follow-up study


Background Childhood hallucinations of voices occur in a variety of contexts and have variable long-term outcomes.

Aim To study the course of experience of voices sequentially over a 3-year period in those with and those without a need for mental health care (patient status).

Method In a group of 80 children of mean age 12.9 years (s.d.=3.1), of whom around 50% were not receiving mental health care, baseline measurements of voice characteristics, voice attributions, psychopathology, stressful life events, coping mechanisms and receipt of professional care were used to predict 3-year course and patient status.

Results The rate of voice discontinuation over the 3-year period was 60%. Patient status was associated with more perceived influence on behaviour and feelings and more negative affective appraisals in relation to the voices. Predictors of persistence of voices were severity and frequency of the voices, associated anxiety/depression and lack of clear triggers in time and place.

Conclusions Need for care in the context of experience of voices is associated with appraisal of the voices in terms of intrusiveness and ‘omnipotence’. Persistence of voices is related to voice appraisals, suggesting that experience of voices by children should be the target of specific interventions.

Hallucinations of hearing voices by children and adolescents can occur in the context of a variety of psychiatric states, such as schizophrenia (Bettes & Walker, 1987; Green et al, 1992; Galdos et al, 1993; Galdos & van Os, 1995), symptoms of anxiety and depression (Chambers et al, 1982; Garralda, 1984a; Ryan et al, 1987), migraine (Schreier, 1998), trauma, dissociation processes and reactive psychoses (Famularo et al, 1992; Putnam & Peterson, 1994; Altman et al, 1997). All these studies concern selected clinical samples, and small numbers of long-term follow-ups of such groups suggest variable outcomes, reflecting the heterogeneity of the subject selection procedures in the different studies (Garralda, 1984b; Del Beccaro et al, 1988). A recent population-based epidemiological study found that the prevalence of at least sometimes experiencing hallucinatory phenomena in children was 8%, of which only a third had a DSM-III (American Psychiatric Association, 1980) diagnosis, about twice the diagnostic rate for children in the non-hallucinatory group (McGee et al, 2000). For the majority of children experiencing hallucinations in the general population, therefore, these experiences appear to be non-pathological. A small group, however, may develop persistent symptoms and subsequent psychotic disorder in adult life. In a recent follow-up of 761 children, self-reported psychotic symptoms at age 11 years increased the odds for psychotic illness at age 26 years by 16.4 times, but the actual number of children that developed a psychotic disorder was very small. In this study, childhood psychotic experiences had predictive value independent of childhood psychiatric diagnosis (Poulton et al, 2000). Other work also suggests that childhood hallucinations (especially persisting hallucinatory experiences) may increase the risk for later psychotic disorder (Fennig et al, 1997).

There is thus an extreme diversity of possible outcomes in children with pathological and non-pathological hallucinatory experiences, and although the risk is increased, only a very small number of children develop psychosis. Although there have been a number of prospective and retrospective long-term follow-up studies (Garralda, 1984b; Poulton et al, 2000), no shorter-term sequential follow-ups of children with hallucinations have been conducted to study the course of these experiences themselves and the possible factors that influence their short-term course. In this study, a group of children who were hearing voices were followed sequentially over a period of 3 years to establish the rate of discontinuation of voices in the short term. In addition, we wished to establish the predictive value of attributions associated with the voices, life events and social adversity, coping, professional help received, concurrent psychopathology and dissociation. We wished to recruit not only individuals who were in contact with mental health professionals but also children who had never sought professional help, in order to examine to what extent differences between ‘patients’ and ‘non-patients’ would influence the course of hallucinatory experiences (Romme & Escher, 1989, 1994; Romme et al, 1992).



Power calculations suggested that around 80 children were required at conventional alpha, given an effect size of 2 and a 40% overall rate of voice discontinuation over 3 years. In order to recruit children and adolescents who were hearing voices, we proceeded in two stages. First, extensive media contacts, formed in the course of a prior investigation (Romme et al, 1992), were used in an attempt to create awareness and reduce stigmatisation at the national level. For example, two of the research team (M.R. and S.E.) made an appearance on a popular TV talk show, where the experience of a child hearing voices, as well as the views of the child's parents, were discussed. At the end of the programme, viewers were invited to attend a special conference on the subject. This conference was attended by 40 children and their parents and received media coverage. In the second stage, the actual recruitment started. Press releases were sent out and several TV and radio appearances were made in which help in contacting the target population was requested. The local community paediatric health service in Maastricht, where all children aged 0-14 years are screened periodically, was contacted, as were several child and adolescent psychiatric services in the country. Over a period of approximately 1 year 80 children were recruited. The children were seen at baseline, and subsequently at 1-year intervals over a period of 3 years; each child was therefore interviewed up to four times. Whenever a child had indicated at the first follow-up that the voices had disappeared, no attempts were made to re-interview at the second follow-up, in order to avoid unnecessary focus on a past, and often upsetting, experience. However, in order to have an estimate of the rate of recurrence of voices after earlier discontinuation, an attempt was made to interview all 80 children again at the third follow-up, regardless of whether they heard voices or not. Thus, a total of 31 person-years of follow-up were available for children who on an earlier occasion had indicated that they had stopped hearing voices.

Research instruments and procedures

The research team consisted of two field workers who had extensive prior experience in interviewing individuals with hallucinations, using a format similar to the one in the present study. They also had prior experience in administering the other instruments used. Nearly all children were interviewed at home. The baseline interview was conducted in the presence of at least one parent or grandparent (with a few exceptions where adolescents had specifically asked that the parent not be present). All subjects and their parents, where appropriate, provided written informed consent, conforming to the local ethics committee guidelines. At each interview the same instruments were used. The main instrument was the Maastricht Voices Interview for Children. The interview consisted of the Maastricht Voices Interview for adults that we adapted for children with the help of a clinical child psychologist. It contains several items in relation to the experience of hearing voices which were included on the basis of extensive qualitative research involving many individuals hearing voices over a period of at least 10 years, including children (Romme, 1996; Romme & Escher, 1996).

The number of voices was coded as 1, 2-5, 6-10, >10 and ‘variable’. The frequency of the voices was coded as ‘continuous’, ‘hourly’, ‘daily’, ‘weekly’, ‘monthly’ and ‘variable’. The emotional tone of the voice was coded as ‘mainly positive’, ‘mainly negative’ and ‘variable’. Triggers of the voices were coded as ‘present’ or ‘absent’ in the analyses, and assessed in terms of time, place, and emotional trigger events (angry, scared, sad, tired, jealous, doubtful, guilty, insecure, in love, unhappy, lonely, happy).

The degree of coping mobilised by the voices was assessed by constructing a total score of 18 possible coping mechanisms that were enquired about in a structured way and could be scored as ‘present’ or ‘absent’. In addition, different dimensions of coping were constructed based on a previous classification (Bak et al, 2001a,b) by calculating the sum of the items divided by the number of items. We thus distinguished the coping styles: passive illness behaviour (using medication, using alcohol or drugs), passive problem-solving (ignoring voice, listening selectively to voice, doing something), active problem-solving (making a deal with voice, swearing at voice, seeking distraction, making a drawing of the voice, sending voice away, writing something about the voice), active problem-avoiding (thinking of something else, running away from the voice, calling someone over the telephone, visiting someone), symptomatic coping (listening carefully to voice, performing a ritual against the voice) and other coping (other way of coping).
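The scoring rule described above (a total count of endorsed mechanisms, plus a per-dimension score equal to the sum of the items divided by the number of items) can be sketched as follows. This is only an illustration: it assumes each item is coded 1 for ‘present’ and 0 for ‘absent’, and the item keys are hypothetical shorthand for the mechanisms listed in the text, not variable names from the instrument.

```python
# Hypothetical item keys, grouped by coping style as listed in the text.
COPING_DIMENSIONS = {
    "passive_illness_behaviour": ["medication", "alcohol_or_drugs"],
    "passive_problem_solving": ["ignore_voice", "listen_selectively", "do_something"],
    "active_problem_solving": [
        "make_deal", "swear_at_voice", "seek_distraction",
        "draw_voice", "send_voice_away", "write_about_voice",
    ],
    "active_problem_avoiding": [
        "think_of_something_else", "run_away", "phone_someone", "visit_someone",
    ],
    "symptomatic_coping": ["listen_carefully", "perform_ritual"],
    "other_coping": ["other"],
}


def coping_scores(responses):
    """Return (total score, per-dimension scores) for one child.

    responses maps item key -> 1 (present) or 0 (absent).
    Items that are missing from the mapping are treated as absent (assumption).
    """
    dimensions = {
        name: sum(responses.get(item, 0) for item in items) / len(items)
        for name, items in COPING_DIMENSIONS.items()
    }
    total = sum(
        responses.get(item, 0)
        for items in COPING_DIMENSIONS.values()
        for item in items
    )
    return total, dimensions


# Made-up responses for illustration only; not study data.
example = {"ignore_voice": 1, "do_something": 1, "seek_distraction": 1, "medication": 1}
total, dimensions = coping_scores(example)
```

Dividing by the number of items puts every dimension on a 0-1 scale, so that styles with different numbers of items can be compared directly.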

Attributions were assessed as follows. A secondary explanation was coded as ‘present’ if the child had indicated explanations of the voice being caused by a spirit or ghost, by a special gift, by a disease, coming from a different world, or another explanation. Attribution of perceived power of the voices in terms of negative influence on emotions and behaviour was expressed as the total score of the ‘yes’/‘no’ coded items: ‘voice is scaring me’; ‘voice makes me confused’; ‘voice interferes with paying attention at school’; ‘voice interferes with homework’; ‘I am getting into arguments because of the voice’; ‘I am getting punished because of voice’; ‘voice makes me run away’; ‘voice makes me do things I don't want’. A total score of life events in the past year was constructed by asking, in a structured way, about 22 common childhood events scored ‘present’ or ‘absent’ (e.g. death, illness, accidents, friends moving away, changing school, first menstruation, pregnancy, unanswered love, arguments, parental divorce, repeating school year, other life events). A total childhood adversity score was constructed by adding the scores of the items that were assessed in a structured way (scored ‘present’ or ‘absent’): childhood not pleasant; not feeling safe at home; not feeling safe on the street; not feeling safe at school; being punished in strange ways; being beaten regularly; being scolded regularly; feeling unwanted; witness of physical abuse; witness of sexual abuse; molested or abused sexually. In addition, the child was asked specifically whether he or she thought the voice was related to some past stressful life event or trauma.

Receipt of professional help and help-seeking behaviour were assessed by asking whether professional help was being received in relation to the voices, and/or whether a hospital admission had ever occurred in relation to the voices. In addition, after assessing the size of the child's social network in a structured way, the proportion of the total network to which the child had revealed the voices was assessed.

General and specific psychopathology were assessed using two instruments. Psychiatric symptoms were rated with the extended Brief Psychiatric Rating Scale (BPRS; Overall & Gorham, 1962; Lukoff et al, 1986), yielding scores for the dimensions anxiety/depression (anxiety, depressive mood, guilt feelings, suicidality), positive psychotic symptoms (suspiciousness, unusual thought content, hallucinations), negative psychotic symptoms (motor retardation, blunted affect, emotional withdrawal, self-neglect), disorganisation (disorientation, conceptual disorganisation, bizarre behaviour) and mania (excitement, euphoria, hyperactivity, distractibility). General problem behaviour was measured with the Youth Self-Report/11-18 (YSR) and expressed as the total score (Verhulst et al, 1996). Although the YSR and related Child Behaviour Checklist (CBCL; Verhulst et al, 1996) scales allow for the calculation of separate scores corresponding to several behavioural dimensions based on exploratory factor analysis, confirmatory factor analytic studies have shown inadequate empirical support for these syndromes and their differentiation (Greenbaum & Dedrick, 1998; Hartman et al, 1999). Instead, a general problem behaviour factor appears to underlie CBCL and related scales data across different age groups (Greenbaum & Dedrick, 1998; Hartman et al, 1999). We therefore examined the total amount of psychopathology, as measured by the total problem score. The Dissociative Experience Scale (DES; Bernstein & Putnam, 1986) is a 28-item self-report instrument measuring dissociative experiences. The respondent is asked to indicate agreement with a statement on a scale between 1 and 10. The sum of the amount of agreement, divided by the number of items, constitutes the DES score. A score exceeding 30 is considered deviant. Items of the DES were reworded to match the cognitive level of children, and all items were read aloud to the children, followed by assessment of whether they had understood the question.

Global level of functioning was assessed using the Children's Global Assessment Scale (CGAS; Shaffer et al, 1983). This scale was administered on the occasion of the first and last measurement (the last measurement being the last time the child was seen for follow-up).

During the interviews, care was taken to elicit and record the child's experiences rather than those of the parents. Researchers did not make therapeutic statements or suggestions (although they would, of course, answer any questions to the best of their ability), and did not comment on ongoing mental health care, if present. At the end of each interview, research staff made a report covering all the data collected. This was subsequently discussed with the research team (M.R., S.E. & A.B.) in order to identify problems and ambiguities and to create continuing consensus on how to conduct the interview and rate answers in a standardised way.


The 3 years were divided into three equal time-bands. During each of the three time-bands (baseline to year 1, year 1-2, and year 2-3) person-time was allocated to each patient. Person-time was the number of years between baseline and the interview at which the child indicated that the voices had disappeared, or between baseline and the last interview if the voices did not disappear. If a child who was eligible for interview missed an interview on a given occasion, but was seen on the next occasion, the most recently updated status was used to assess the presence of auditory hallucinations and person-time was assigned accordingly. For example, a child who was not interviewed at year 2, but was seen at year 3 was retained in the data-set on the basis of the year 3 interview outcome.
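The person-time rule above can be sketched in a few lines. This is an illustrative reading of the text, not the authors' own code: it assumes follow-up is counted in whole years, that a child contributes one year to every time-band up to the interview at which follow-up ends, and that a missed interview is simply absent from the record so that the next available interview decides the status. The function names are hypothetical.

```python
def person_time(interviews):
    """Return (person_years, discontinued) for one child.

    interviews maps follow-up year (1, 2 or 3) -> True if voices were still
    present at that interview, False if the child said they had disappeared.
    Years in which the child was not seen are left out of the mapping.
    """
    last_seen = 0
    for year in sorted(interviews):
        last_seen = year
        if not interviews[year]:
            # Voices reported as gone: follow-up ends at this interview.
            return year, True
    # Voices never reported as gone: follow-up ends at the last interview.
    return last_seen, False


def allocate_to_bands(person_years, n_bands=3):
    """Spread whole person-years over the time-bands (baseline to year 1, year 1-2, year 2-3)."""
    return [1 if band < person_years else 0 for band in range(n_bands)]


# Made-up records for illustration only; not study data.
seen_every_year = person_time({1: True, 2: True, 3: False})   # stops at year 3
missed_year_two = person_time({1: True, 3: True})              # status taken from year 3
stopped_early = person_time({1: False})                        # stops at year 1
```

Under this reading, the child who missed the year 2 interview still contributes three person-years, because the year 3 interview shows the voices were present throughout.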

Incidence rates (of discontinuation of hallucinations) were calculated for each time-band by dividing the number of incident discontinuations by the person-time of follow-up. Incidence rates were expressed as number of cases per 100 persons per year. Cumulative incidence was expressed as the number of discontinuations during the 3 years divided by the total baseline sample. Children who ceased to hear voices at one screening point were excluded for incidence rate analyses at the next screening points. The STCOX command in STATA (version 6; STATA Corporation, 1999) was used to estimate Cox maximum-likelihood proportional hazard models of discontinuation of hallucinations (Clayton & Hills, 1993). Children who had ceased to hear voices at one screening point were excluded for the Cox proportional hazard analyses at the next screening points (i.e. were ‘censored’ in statistical terms). Associations were expressed as the hazard ratio, a measure of relative risk (Kendler et al, 1994), with 95% confidence intervals. Tests were performed two-tailed with α of 0.05.
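The two summary measures defined above can be written out as a short sketch. The counts used here are invented purely to show the arithmetic and are not the study's data; the function names are hypothetical.

```python
def incidence_rate_per_100(events, person_years):
    """Incident discontinuations per 100 persons per year within one time-band."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100.0 * events / person_years


def cumulative_incidence(total_events, baseline_n):
    """Discontinuations over the whole follow-up divided by the baseline sample."""
    if baseline_n <= 0:
        raise ValueError("baseline_n must be positive")
    return total_events / baseline_n


# Invented (events, person-years) pairs for three time-bands; not study data.
bands = [(10, 50.0), (6, 40.0), (9, 30.0)]
rates = [incidence_rate_per_100(events, years) for events, years in bands]
overall = cumulative_incidence(sum(events for events, _ in bands), baseline_n=50)
```

The incidence rate uses only the person-time still at risk in each band, so it can rise across bands even when fewer children discontinue, whereas the cumulative incidence always refers back to the full baseline sample.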

To examine whether the effect of predictors of voice discontinuation varied as a function of receipt of mental health care or being a case in terms of high levels of psychopathology, problem behaviour, or low levels of social functioning, we fitted interactions between predictors on the one hand, and receipt of mental health care, case level of psychopathology (defined as score greater than 90th percentile on BPRS sum score), case level of problem behaviour (score greater than 90th percentile on the YSR) and case level of poor social functioning (score greater than 90th percentile on the CGAS) on the other. Interactions were assessed by likelihood ratio tests (LRS).
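A likelihood ratio test of an interaction compares the fit of a model with the interaction term against the same model without it. The sketch below shows the arithmetic under two assumptions that the text does not state: that each interaction adds a single parameter (1 degree of freedom), and that the log-likelihoods come from the two fitted Cox models. The log-likelihood values are invented for illustration.

```python
import math


def likelihood_ratio_test_1df(loglik_without, loglik_with):
    """Likelihood ratio test for one added interaction term (assumed 1 df).

    Returns (statistic, p_value). For a chi-square distribution with 1 degree
    of freedom, the upper-tail probability equals erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_with - loglik_without), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented log-likelihoods for illustration only; not study data.
statistic, p_value = likelihood_ratio_test_1df(loglik_without=-100.0, loglik_with=-98.08)
```

With these made-up values the statistic is about 3.84, which sits at the conventional two-tailed threshold of 0.05 used elsewhere in the analysis.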


Rate of voice discontinuation

The mean age was 12.9 years (s.d. 3.1; range 8-19 years). Around half of the children (53.8%) were female. Taking into account that children who ceased to hear voices were ‘censored’ in the analyses, a total of 80, 75, 47 and 35 children were available for analysis at baseline, year 1, year 2 and year 3, respectively, and the rate of voice discontinuation (or censoring in survival analytical terms) was 25.3%, 26.0% and 47.7% at the year 1, year 2 and year 3 interviews, respectively (Table 1). A total of 31 person-years of follow-up were available for children who on an earlier occasion had indicated that their voices had gone away. The rate of recurrence in these children was 13% (four children).

Table 1

Rate of change of auditory hallucinations in 80 children over a 3-year period

Patients v. non-patients

Children who received professional help generally had higher scores of social and psychopathological dysfunction than non-patients (Table 2). Thus, at baseline they had a higher rating on the BPRS hallucinations item, higher YSR problem behaviour scores, lower CGAS scores and more BPRS anxiety/depression and negative symptoms. Children who were receiving mental health care more often reported the presence of emotional triggers to the voices and more often reported childhood adversity. Emotional appraisals of the voices in this group were more often negative and they more often perceived an influence of the voices on emotions and behaviour. Finally, children receiving mental health care more often made use of passive problem-solving coping strategies.

Table 2

Baseline differences between children who did and who did not receive professional help

Predictors of voice discontinuation

Higher severity ratings on the BPRS hallucinations item and higher frequency predicted voice persistence (Tables 3-5). Voice persistence was also associated with older age and lack of clear triggers of time and place. The greater the proportion of individuals in the social network to whom children had revealed their voices, the greater the risk of voice persistence. This association remained when an adjustment was made for the number of people in the network (hazard ratio, HR=0.18, 95% CI 0.04-0.81). Adjustment for the case variables changed the parameters only minutely and did not change the pattern of results. Having mental health care in itself did not influence the probability of voice discontinuation.
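Because the Cox models describe the hazard of discontinuation, a hazard ratio below 1 corresponds to a lower chance of the voices stopping, that is, to greater persistence. The sketch below shows how a reported hazard ratio and its interval relate on the log scale. It assumes a Wald-type interval (estimate plus or minus 1.96 standard errors on the log hazard scale), which the text does not state explicitly; the function name is hypothetical.

```python
import math


def log_scale_summary(hr, lower, upper, z=1.96):
    """Recover the log hazard ratio, its approximate standard error and the
    geometric midpoint of the interval, assuming a Wald-type 95% CI."""
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    midpoint = math.exp((math.log(lower) + math.log(upper)) / 2.0)
    return log_hr, se, midpoint


# Values reported in the text for the proportion of the network told about the voices.
log_hr, se, midpoint = log_scale_summary(0.18, 0.04, 0.81)
```

The geometric midpoint of the reported interval coincides with the reported hazard ratio, which is what a symmetric interval on the log scale would produce.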

Table 3

Descriptive statistics of predictors of voice discontinuation

Table 4

Non-psychopathological factors influencing discontinuation of voices

Table 5

Psychopathological factors influencing voice discontinuation

Predictors and care status

To examine whether the effect of predictors of voice discontinuation varied as a function of being a case or not, we fitted interactions between predictors on the one hand, and receipt of mental health care, case level of psychopathology, case level of problem behaviour and case level of social functioning (all as defined above) on the other. There was no clear or suggestive evidence that the effect of predictors varied between those who were and those who were not a case by any of the variables used.


In a group of 80 children hearing voices, the cumulative rate of discontinuation over a 3-year period was 60%. In 31 person-years of follow-up of children whose voices had disappeared on one occasion, the experience recurred on a later occasion in four cases. Children hearing voices who received professional help differed from children who did not, in that they had a higher rating on the BPRS hallucinations item, higher YSR problem behaviour scores, lower CGAS scores and more BPRS anxiety/depression and negative symptoms. These children also reported more perceived influence on behaviour and feelings and more negative affective appraisals in relation to the voice. This combination of results suggests that need for care is to a large part associated with the child's (and parents') appraisal of the voices rather than the perception itself.

Predictors of persistence of voices, independent of receipt of mental health care, severity of psychopathology, severity of problem behaviour or level of impairment of social functioning, were severity and frequency of the voices, as well as associated anxiety/depression, lack of clear triggers of time and place and having told more people in the child's network about the voices. However, being in receipt of mental health care did not influence voice persistence.

Methodological issues

In order to examine the influence of possible misclassification of children whose voices recurred after earlier discontinuation, the four children whose voices recurred were reclassified as voice persistence on all earlier measurement occasions. This did not affect the overall pattern of results. The effect of age was reduced (HR=0.94, 95% CI 0.85-1.04; P=0.20) and the effect of a place trigger became statistically more imprecise (HR=1.70, 95% CI 0.93-3.11; P=0.085). The effect of sleep-related occurrence of voices became more marked (HR=1.99, 95% CI 1.07-3.70; P=0.029), as did the effect of life events (HR=0.56, 95% CI 0.34-0.94; P=0.027), indicating that the higher the number of reported life events, the higher the risk of voice persistence.

We did not conduct detailed diagnostic interviews yielding ICD or DSM diagnoses. Instead, we collected dimensional measures of problem behaviour, psychopathology and social functioning. This was carried out for several reasons. First, the variable of interest in this study was experience of voices, and the psychopathological and social functioning context of these experiences is arguably better described by global dimensional measures than diagnostic constructs of uncertain validity. Second, many of our subjects would not have met the distress/dysfunction criteria for disorder and therefore remained ‘undiagnosable’ using traditional measurements.

Some of the subjects were quite young and the age range was quite large. It is possible that the older children were better able to understand questions about, for example, coping and childhood trauma. Care was taken, however, that all questions were understood by the children during interview, and to the extent that this bias operated it is unlikely to have produced the current results.

Receipt of professional help

Although receipt of professional help cannot be equated with need for care, and absence of professional help with absence of need, the mean level of need for care, given the fact that mental health services are freely available for all in The Netherlands, can be safely assumed to be much higher in the receipt-of-care group. The result suggests that need for care in the context of experience of voices was associated not only with higher levels of problem behaviour and related negative symptoms of psychosis, but also with characteristics of the voices themselves and associated attributions.

Children who were in care not only had more severe ratings of the BPRS hallucinations item and more severe ratings of anxiety/depression, negative symptoms and problem behaviour, but also had more negative affective appraisals and felt that the voices had more influence on emotions and behaviour. These children also more frequently reported earlier traumatic experiences and precipitation by emotional triggers. It has been suggested that appraisal of characteristics such as volume, tone and frequency of the voices is the result of appraisals of the voices' ranking and power, and that the distress associated with the voices is linked to the underlying social and interpersonal cognitive processes that drive these appraisals (Birchwood et al, 2000). Dysfunctional cognitions could thus contribute to the development of need for mental health care and increased likelihood of voice persistence. The fact that the perceived influence of the voice over emotions and behaviour and negative affective appraisals were also associated with need for care similarly points to the importance of underlying beliefs and appraisal processes. The higher frequency of reports of feeling traumatised by early events and recent triggers may also be interpreted in terms of a tendency to feel ‘over-powered’ by events, and/or may represent repeated exposures to trauma with a reduced capability to assert oneself in the face of intrusive experiences or events. Some individuals may develop a need for care as a result of a ‘power imbalance’ between the individual and ‘intruders’ or intruding events, the origin of which may lie in the cognitive processes that regulate adaptation to the voices. There may be an association between early childhood trauma and psychotic experiences (Greenfield et al, 1994; Ross et al, 1994; Berenbaum, 1999), also according to investigations using modern epidemiological approaches (Window et al, 2000).

There was no difference between those with and without mental health care in terms of total coping score. However, those with mental health care less often displayed passive problem-solving (ignoring voice, listening selectively to voice, doing something), which agrees with the view that need for care is related to the way the person interacts with their experience of voices, rather than just the experience by itself (Romme & Escher, 1994; Claridge, 1997).

Predictors of voice persistence

Predictors of voice persistence only concurred partially with those that discriminate between those who did and those who did not receive mental health care. However, severity of voices assessed by the BPRS, as well as associated anxiety and depression, influenced both receipt of care and voice persistence, confirming the possible role of cognitive processes regulating appraisals of volume and frequency of experience of voices and the ensuing affective response. The favourable course associated with the child being able to identify triggers of time and place (for example, only hearing voices at school or when alone in one's bedroom at night) suggests that individuals whose voices are not omnipresent (and therefore not ‘omnipotent’) but instead limited to a circumscribed situation are more likely to overcome the experience.

Higher scores on the DES were associated with an increased likelihood of voice persistence. High levels of dissociation measured with the DES are associated with higher levels of proneness to psychosis, and the current findings suggest that the DES could be identifying those individuals who are most liable to develop enduring psychotic symptoms (Allen & Coyne, 1995; Spitzer et al, 1997; Pope & Kwapil, 2000). High scores on the DES in the context of psychosis may be associated with childhood adversity (Goff et al, 1991), and although reported childhood adversity did not predict voice persistence, it could be that part of the effect of the DES is mediated through its association with childhood adversity, individuals with high DES scores representing the group most prone to enduring experiences of voices given exposure to early adversity.

The greater the proportion of people in the network that the child had revealed the experience of voices to, the greater the likelihood that the voices would persist, independent of the number of people in the network. One explanation of this finding is that the more severe and the more stressful the experience of voices is, the more likely it becomes that the child will reveal the experience to significant others in his or her network.

Being in receipt of mental health care did not influence voice persistence, possibly because, at the time of this investigation, most care appeared to be directed towards suppressing voices rather than coping with them.

Influence of caseness variables on course of voices

A remarkable finding was that voice persistence or discontinuation was independent of the various caseness variables used. Thus, the predictive effect of severity, anxiety/depression, triggers of time and place and other variables was the same in cases and non-cases, regardless of whether caseness was defined in terms of need for mental health care, problem behaviour, psychopathology or social functioning. This suggests that the longitudinal course of voices is a relatively autonomous process, influenced by a range of factors that are independent of the presence of diagnosis and, if a diagnosis is present, severity of psychopathology and problem behaviour. Thus, the cognitive processes underlying volume and frequency of the voices and the associated distress could be more important in predicting longitudinal course than caseness context. This suggests that therapeutic interventions for the experience of voices should be targeted specifically at the experience itself, rather than the presence of diagnosis.

Clinical Implications and Limitations


  • The majority of children hearing voices stop reporting the experience over the course of 3 years.

  • The children hearing voices who need professional help are more likely to feel overpowered by the voices.

  • Voice appraisals and associated anxiety and depression are better predictors of voice persistence than ‘traditional’ measures of psychotic symptomatology, problem behaviour and global functioning.


  • Lack of detailed information.

  • The age range was quite wide, introducing a large degree of sample heterogeneity.

  • The use of self-reports of life events and early trauma.


* Presented in part at the European First Episode Schizophrenia Network Meeting, Whistler BC, Canada, 27 April 2001.

