The British Journal of Psychiatry
Approaches to gene mapping in complex disorders and their application in child psychiatry and psychology


Background Twin studies demonstrate the importance of genes and environment in the aetiology of childhood psychiatric and neurodevelopmental disorders. Advances in molecular genetics enable the identification of genes involved in complex disorders and enable the study of molecular mechanisms and gene-environment interactions.

Aims To review the role of molecular genetics studies in childhood behavioural and developmental traits.

Method Molecular approaches to complex disorders are reviewed, with examples from autism, reading disability and attention-deficit hyperactivity disorder (ADHD).

Results The most robust finding in ADHD is the association with a variable number tandem repeat polymorphism in exon 3 of the DRD4 gene. Other replicated associations with ADHD are outlined in the text. In autism, there is a replicated linkage finding on chromosome 7. Linkage studies in reading disability have confirmed a locus on chromosome 6 and strongly suggest one on chromosome 15.

Conclusions In the next 5-10 years susceptibility genes for these disorders will be established. Describing their relationship to biological and behavioural function will be a far greater challenge.

The classic genetic approaches of family, twin and adoption studies have provided considerable evidence that genetic influences play an important role in the development of child behaviour and cognition. There have also been remarkable advances in the application of molecular methods in medicine, together with technological progress in mapping and sequencing the human genome. As a result, many are now persuaded that the time is right to focus on the identification of genes that may give rise to childhood psychiatric and behavioural disorders. Expectations aroused by work in this area are high, so that molecular genetic strategies now have a pivotal role in research aimed at elucidating both the biological basis (nature) and environmental basis (nurture) of childhood behavioural and neurodevelopmental disorders, with the hope of major developments in prevention and treatment. In this review we focus on contemporary approaches to gene discovery in complex disorders and describe how they have been applied in child psychiatry and psychology.


Molecular genetic research in the field of child development and psychopathology is comparatively new. In part, this is because of difficulties in defining childhood disorders and the scarcity of sufficiently detailed family, twin and adoption studies demonstrating familial resemblance between close family members and the relative contribution of genes and environment. In many clinical areas the question of familial risk is only now starting to be addressed using more rigorous protocols. Indeed, the relative ease with which molecular genetic studies can be performed has in many cases led to molecular approaches being pursued in advance of good phenotypic definition. As a result many groups are developing both quantitative and molecular approaches side by side, since they are complementary in the investigation of complex behavioural disorders (for general reviews see McGuffin et al, 1994; Plomin et al, 1994, 2000; Rutter et al, 1999b).

One of the key issues for those engaged in molecular genetic studies is the identification of heritable phenotypes. Traditional diagnostic categories are useful in practice, but their validity in terms of their underlying aetiology is uncertain. Nevertheless, the general approach so far has been to use categorical criteria based on definitions such as those in DSM-IV (American Psychiatric Association, 1994) or ICD-10 (World Health Organization, 1992).

Alternative approaches aimed at identifying quantitative trait loci (QTLs) can be applied whenever a behavioural or developmental trait is continuously distributed throughout the population. These approaches may be particularly applicable to child psychopathology, since with few exceptions, behavioural disorders of childhood can be conceptualised as extremes on continuously distributed dimensions. Many of the conditions encountered in childhood seem to have close parallels in normal variations in human behaviour. Questions therefore arise about the relationship between quantitative dimensions of behaviour and extreme diagnostic categories, and whether genes that exert an influence across a normal range of behaviour also exert their influence at the pathological extremes.

Quantitative genetic studies using twin samples have approached this issue by considering the relationship between diagnostic categories or cut-offs on dimensional scores in twin probands and quantitative scores in their twin partners (DeFries & Fulker, 1985, 1988). A simple multiple regression procedure is used to test the hypothesis that quantitative trait scores of twin partners (co-twins) should be more similar to those of the probands for identical co-twins compared with non-identical co-twins. Using this approach, mild learning difficulties (Plomin et al, 1991), hyperactivity and spelling disability (Stevenson et al, 1993) appear to represent extremes of continuously distributed traits, whereas severe learning difficulties (Plomin et al, 1991) and early language delay (Dale et al, 1998) appear to be distinct disorders.
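The regression procedure described above can be sketched as follows. This is a minimal illustration, assuming the basic DeFries-Fulker form in which the co-twin score is regressed on the proband score and on a coefficient of relationship coded 1.0 for identical and 0.5 for non-identical pairs; the function names and the toy scores are hypothetical and are not taken from any of the studies cited.

```python
def solve_linear(a, b):
    """Solve a small square system a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def df_regression(pairs):
    """Ordinary least squares fit of: cotwin = a + b1 * proband + b2 * relatedness.

    pairs: iterable of (proband_score, cotwin_score, relatedness),
    with relatedness coded 1.0 (identical) or 0.5 (non-identical) by assumption.
    Returns [a, b1, b2]; a positive b2 means identical co-twins resemble
    their probands more closely than non-identical co-twins do.
    """
    rows = [(1.0, p, r) for p, _, r in pairs]
    y = [c for _, c, _ in pairs]
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)


# Hypothetical toy data constructed so that cotwin = 0.1 * proband + 0.6 * relatedness.
toy_pairs = [(p, 0.1 * p + 0.6 * r, r) for r in (1.0, 0.5) for p in (2.0, 2.5, 3.0)]
intercept, b_proband, b_relatedness = df_regression(toy_pairs)
```

In practice the coefficient on relatedness would be tested against zero with its standard error, which this sketch does not compute.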


Molecular genetic research has had a dramatic impact on the study of simple genetic disorders with Mendelian patterns of inheritance. Over 100 genes that are the sole or main cause of such disorders have been identified using a variety of positional cloning and targeted candidate gene strategies (see the Online Mendelian Inheritance in Man (OMIM) database). In most cases this has laid the foundation for further functional studies resulting in considerable insights into the molecular and biochemical basis of these disorders. One of the most exciting aspects of this work has been the ability to discover genes involved in disease processes in the absence of a priori hypotheses, by first identifying chromosomal regions containing disease genes and then screening genes within the region for functional mutations. Conditions such as autism, attention-deficit hyperactivity disorder (ADHD), reading disability and mild mental impairment, however, are said to show complex inheritance. They do not conform to Mendelian patterns of segregation and are thought to result from the combined effects of several genes (oligogenic) or perhaps many genes (polygenic), each of which, on its own, has only a small effect. In these cases, variations of single genes are neither sufficient nor necessary to cause the disorder, but such genes act as susceptibility genes, increasing risk for the disorder. Mapping and identifying the genes responsible for such complex disorders represents a greater challenge than that posed by rarer Mendelian diseases, but one that is becoming rapidly more tractable.

Linkage analysis using multiply affected families

Early genetic studies of common behavioural disorders such as schizophrenia and bipolar disorder were based on the assumption of single-gene inheritance. Large, multiply affected families were identified, and these appeared to show Mendelian inheritance. In some cases results using this approach suggested the identification of rare familial forms segregating single genes of major effect. For example, there are three independent reports of linkage between markers on chromosome 4p near to the dopamine D5 receptor gene, and bipolar and schizo-affective disorder (Blackwood et al, 1996; Asherson et al, 1998; Ewald et al, 1998).

However, analyses of such families using traditional linkage approaches have in general been unsuccessful and are unlikely to identify genetic risk factors for common forms of these disorders. As a result, more recent linkage studies of behavioural phenotypes have focused on using sibling pairs and small nuclear families rather than multiplex pedigrees (reviewed by Craddock & Owen, 1996; Risch, 2000).

Linkage analysis using affected sibling pairs

Pairs of siblings affected with the same disorder are presumed to share susceptibility genes inherited from the same parent. This hypothesis can be tested by ascertaining a series of affected sibling pairs and genotyping the sample with markers spread evenly throughout the genome. Where affected siblings share parental alleles more often than by chance alone, this indicates linkage between a susceptibility gene and the marker alleles. These studies can be robust in the sense that diagnostic criteria can be specified in an attempt to reduce genetic and aetiological heterogeneity, and no assumptions are required about the underlying mode of inheritance. The major limitation is the power of the technique, so that large numbers of affected pairs are needed to detect genes of moderate to small effect.
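The test of excess allele sharing can be illustrated with a simple mean-sharing statistic. The sketch below assumes fully informative markers, so that each pair can be scored as sharing 0, 1 or 2 parental alleles identical by descent; under the null hypothesis of no linkage these occur with probabilities 1/4, 1/2 and 1/4, giving an expected proportion shared of 0.5 with variance 1/8 per pair. The function name and the counts are hypothetical.

```python
import math


def mean_sharing_test(n_share0, n_share1, n_share2):
    """Mean allele-sharing test for affected sibling pairs at one marker.

    Arguments are the numbers of pairs sharing 0, 1 and 2 parental alleles.
    Returns (mean proportion shared, z statistic, one-sided P value) using a
    normal approximation to the null distribution of the mean.
    """
    n = n_share0 + n_share1 + n_share2
    mean_prop = (0.5 * n_share1 + 1.0 * n_share2) / n
    z = (mean_prop - 0.5) / math.sqrt(0.125 / n)
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))
    return mean_prop, z, p_one_sided


# Hypothetical sample of 100 affected pairs with modest excess sharing.
mean_prop, z, p_value = mean_sharing_test(20, 50, 30)
```

With real data the sharing status is often ambiguous and is estimated from the marker genotypes, which reduces power further.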

A measure frequently used to evaluate the power of affected sibling pair linkage is the ratio of the risk to the sibling of an affected proband to the population prevalence, a parameter known as λs. Low λs values may be due to a variety of factors such as polygenic transmission, genetic heterogeneity, phenocopies and low penetrance, which may require unfeasibly large sample sizes to overcome. Disorders such as autism may be more amenable to this approach since the estimated λs is very large (100-200), well within the theoretical resolution of linkage strategies. On the other hand, disorders such as ADHD have an estimated λs somewhere between 2 and 5 (Biederman et al, 1990, 1992). Indeed, if more than one gene causes ADHD, then the λ value for any single gene (the gene-specific λ or λg) must be very low.

Association strategies

Association studies compare the frequencies of marker alleles in a group of affected individuals with those in a sample of control subjects without the disorder or drawn from the general population. A statistically significant difference suggests either tight linkage, resulting in linkage disequilibrium between a marker allele and the susceptibility locus, or that the marker allele itself confers susceptibility to disorder. Linkage disequilibrium (LD) describes the phenomenon where two loci are so close together on a chromosome that they are not separated by recombination events over many generations. Loci linked together in such a way reflect fragments of ancestral chromosomes that remain intact despite many meiotic events over multiple generations and therefore appear to be associated even in individuals from different families.
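Why only very closely linked loci remain associated can be seen from the standard population-genetic result that, under random mating, disequilibrium between two loci decays by a factor of (1 - θ) per generation, where θ is the recombination fraction. This result is not stated in the text and is used here as an assumption; the function name and the parameter values are illustrative only.

```python
def ld_remaining(recombination_fraction, generations, initial_d=1.0):
    """Disequilibrium left after a number of generations of random mating.

    Uses D_t = D_0 * (1 - theta) ** t, with theta the recombination
    fraction between the two loci.
    """
    return initial_d * (1.0 - recombination_fraction) ** generations


# Tightly linked loci (theta = 0.001) retain much of their initial
# disequilibrium after 1000 generations; loosely linked loci (theta = 0.1)
# have lost almost all of it after only 100.
tight = ld_remaining(0.001, 1000)
loose = ld_remaining(0.1, 100)
```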

The power of association to detect genes of small effect is well known. Risch & Merikangas (1996), using stringent criteria that took into account the large number of loci required to screen the entire genome, estimated that a sample of 340 unrelated cases would detect association with a susceptibility gene with a frequency of 0.5 and gene-specific λ of 2.0. In comparison, 2498 affected sibling pairs would be required to detect the same gene by linkage analysis. Despite this, the usefulness of the approach has been limited by the fact that thousands of markers are required to perform a whole genome search. For this reason association approaches are still in their infancy and in most cases have only been applied to the analysis of a few candidate genes. Fortunately, the development of a new-generation high-density marker map and the technology to screen these in large numbers is under way, so that LD mapping strategies are fast becoming the focus of current interest in the study of complex disorders (Chakravarti, 1999; Kruglyak & Nickerson, 2001).

Association studies in behavioural disorders have unfortunately thrown up a number of contradictory results (for example, see O'Donovan & Owen, 1999). This has been in part because of the problems of diagnosis and comparability of patient populations from different centres, and inadequate sample sizes in many studies. A confounding factor in case-control studies is the selection of control subjects, which can result in stratification effects. The solution to this problem is to sample both parents of affected probands and use the non-transmitted parental alleles as control genotypes (haplotype relative risk analysis; Falk & Rubinstein, 1987), or look for increased transmission of a specific allele from heterozygote parents (transmission disequilibrium test; Spielman & Ewens, 1996).
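The transmission disequilibrium test reduces to a simple comparison of counts. The sketch below assumes a biallelic marker and the usual McNemar-type statistic, (b - c)² / (b + c), where b and c are the numbers of heterozygous parents who did and did not transmit the allele of interest to an affected child; the statistic is referred to a χ² distribution with 1 d.f. The function name and the counts are hypothetical.

```python
import math


def tdt(transmitted, not_transmitted):
    """Transmission disequilibrium test for one allele.

    transmitted / not_transmitted: counts of heterozygous parents who did or
    did not pass the allele to an affected child.
    Returns (chi-square statistic with 1 d.f., two-sided P value).
    """
    b, c = transmitted, not_transmitted
    statistic = (b - c) ** 2 / (b + c)
    # Upper tail of a 1 d.f. chi-square, written in terms of erfc.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical example: 60 transmissions against 40 non-transmissions.
statistic, p_value = tdt(60, 40)
```

Because transmitted and non-transmitted alleles come from the same parents, the comparison is protected from the stratification effects that can arise with separately sampled controls.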

Quantitative trait loci mapping

An alternative approach to mapping genes for disorders using categorical criteria is the analysis of traits that are continuously distributed in the population (Plomin et al, 1994, 2000). Such quantitative traits are influenced by the action and co-action of multiple genes or QTLs. Developmental traits such as general cognitive ability (g) and reading ability, and behavioural traits such as hyperactivity, may be better perceived in this way. In QTL linkage, the difference in trait values, measured on a dimensional scale for pairs of siblings, is squared and examined as a function of the number of shared parental alleles (Haseman & Elston, 1972; Kruglyak & Lander, 1995). An alternative approach selects one sibling from the top few per cent (diagnosed cases) and regresses the phenotypic score of the other sibling onto the number of shared parental alleles (Fulker et al, 1995; Fulker & Cherney, 1996). Considerable additional power can be gained by the use of phenotypically discordant as well as concordant sibling pairs and the selection of siblings within the top and bottom deciles (Risch & Zhang, 1995; Purcell et al, 2001). Furthermore, both concordant and discordant siblings provide a powerful resource for QTL association mapping as well as linkage, using variance component approaches (Fulker et al, 1999).
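The squared-difference method can be sketched as a simple linear regression. This is a minimal illustration, assuming that the proportion of parental alleles shared identical by descent is known for each pair; a negative slope indicates that siblings who share more alleles at the marker are more alike on the trait, which is the pattern expected under linkage. The function name and the trait values are hypothetical.

```python
def squared_difference_slope(sib_pairs):
    """Slope from regressing squared sib-pair trait differences on allele sharing.

    sib_pairs: iterable of (trait_sib1, trait_sib2, proportion_shared), where
    proportion_shared is 0.0, 0.5 or 1.0 for fully informative markers.
    """
    x = [shared for _, _, shared in sib_pairs]
    y = [(a - b) ** 2 for a, b, _ in sib_pairs]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical pairs: differences shrink as sharing increases.
toy_pairs = [(0.0, 2.0, 0.0), (0.0, 1.5, 0.5), (0.0, 1.0, 1.0)]
slope = squared_difference_slope(toy_pairs)
```

A full analysis would test the slope against zero with a one-sided test, which is omitted here.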

Genetic maps and high-throughput genotyping

Investigators working before DNA markers became readily accessible were restricted to the use of ‘classic’ genetic markers such as red blood cell antigens (ABO, MNS and Rh) and the human leucocyte antigens (HLA). The use of DNA markers began with the discovery of techniques for measuring variation within genomic DNA. Modern maps were introduced following the discovery that within non-coding regions of genomic DNA there are simple sequence repeats (SSRs) of short (two to four) nucleotide sequences (Weber & May, 1990). The usefulness of these markers lies in the ease with which they can be typed, following introduction of the polymerase chain reaction (PCR) and the use of efficient gel electrophoresis systems that allow multiple SSRs to be analysed together. Newer machines using capillaries instead of conventional slab gels enable the processing of 6000-18 000 individual genotypes on one machine in a day. This rate of production is adequate for genome-wide linkage, but for association mapping a much closer grid of markers is required, so that even more rapid methods are needed.

New approaches to very rapid genotyping

New approaches that are expected to have far-reaching consequences for mapping genes in complex disorders depend upon the detection of single-nucleotide polymorphisms (Anon., 1999; Craig et al, 2000). Single-nucleotide polymorphisms (SNPs) consist mainly of single base substitutions and are the most frequent type of variation in the human genome. The SNP Consortium and International Human Genome Sequencing Consortium have identified and mapped 1.42 million SNPs, which are distributed throughout the human genome at an average density of one SNP every 1900 base pairs (International SNP Map Working Group, 2001). Further developments aim at screening gene sequences directly, so that instead of searching for associations with anonymous markers, genome scans will be able to focus on variation within genes that affects protein sequence and expression, or lies very close to such functional variants. So far 60 000 of the identified SNPs fall within protein coding sequences and 85% of coding regions are within 5000 base pairs of the nearest SNP. It has been estimated that a systematic approach to cataloguing all the variation that may be relevant to human behaviour would involve around 10 000 genes, an achievable goal within the next few years. Such polymorphisms will then be available for use in behavioural genetics association studies.

In parallel with the identification of SNPs, methods are under development for efficient SNP genotyping (see Landegren et al, 1998, for review). The approaches which have gained widest publicity are the development of DNA micro-arrays, commonly referred to as DNA chips (Lander, 1999), and, more recently, matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF; Griffin et al, 1999). The impact of these technologies on gene mapping is likely to be profound, since it will be possible to screen very large samples for linkage and association with extremely dense marker maps, greatly increasing the power to detect genes of small effect in complex disorders.


The search for genetic variations that exert an influence on child behaviour and development has only just begun. Nevertheless, there has been considerable progress in the study of autism, ADHD and reading disability. Here we will review the most prominent of these findings, which will also serve to illustrate the range of gene mapping strategies being applied.

Molecular studies of autism

Autism is a pervasive developmental disorder with onset by 3 years of age and is defined by the presence of a triad of social and communication impairments with restricted, repetitive or stereotyped behaviours. The disorder is interesting from a genetic perspective because, unlike most complex disorders, it is relatively rare and has a high degree of familiality. Nevertheless, it shows a complex mode of inheritance best explained by the action of several genes. Familial clustering is high, with an estimated λs of about 100, calculated from a population prevalence of around 2-4 per 10 000 and a risk to siblings of around 3%. Such clustering may be the result of shared environmental factors, but findings from twin studies provide overwhelming evidence for the importance of genetic influences: monozygotic (MZ) concordance rates are 60-90%, compared with less than 5% in dizygotic (DZ) twins, giving an estimated broad heritability of over 90% (reviewed by Rutter et al, 1999a,b). Twin studies also suggest that traditional diagnostic boundaries are far too restricted, since a higher MZ:DZ concordance ratio is found using a broad autistic phenotype, consisting of a combination of cognitive and social deficits similar to autism but in milder form (Bailey et al, 1998).
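The λs figure quoted above follows directly from the sibling risk and the prevalence range given in the text. The short calculation below reproduces it; the function name is illustrative.

```python
def lambda_s(sibling_risk, population_prevalence):
    """Sibling relative risk: risk to siblings of a proband divided by prevalence."""
    return sibling_risk / population_prevalence


# Figures from the text: sibling risk of about 3%, prevalence of 2-4 per 10 000.
lambda_at_high_prevalence = lambda_s(0.03, 4 / 10_000)  # about 75
lambda_at_low_prevalence = lambda_s(0.03, 2 / 10_000)   # about 150
```

The two values bracket the estimate of about 100, and show how sensitive λs is to the prevalence figure used.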

In the light of these findings it is not surprising that there has been a concerted effort to pursue genetic linkage studies using affected sibling pairs. This was first achieved by the International Molecular Genetic Study of Autism Consortium, which brought together groups from across Europe and the USA. In large collaborative projects the reliability of clinical assessments and subsequent diagnosis across multiple groups is always a key issue, especially with behavioural phenotypes, which are difficult to measure accurately and are generally quantitative rather than qualitative. However, this was facilitated by the adoption by all participating groups of the same assessment instruments: the Autism Diagnostic Interview (ADI; Le Couteur et al, 1989) and the Autism Diagnostic Observation Schedule (ADOS; Lord et al, 1989). A public information resource and DNA data-bank known as the Autism Genetic Resource Exchange has been established.

The first report of a full genome linkage screen for autism identified three chromosomal regions showing evidence suggestive of linkage. The most significant of these was on the long arm of chromosome 7, which gave rise to a gene-specific λ of 5.0 (International Molecular Genetic Study of Autism Consortium, 1998). Consistent evidence for the chromosome 7 locus and another locus on chromosome 2 has come from several published and unpublished data-sets (Barrett et al, 1999; Philippe et al, 1999; Risch et al, 1999; Auranen et al, 2000; reviewed by Lamb et al, 2000). If the identified genetic loci acted in a simple additive fashion they would contribute less than 10% to the overall λs value of around 100, suggesting that there must be multiple genes of very small effect or (perhaps more likely) important gene-gene and gene-environment interactions. The next step is to identify the genes themselves, by genotyping dense maps of SSR and SNP markers to refine the linkage regions and search for associations with particular genes and functional variants.
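The size of the additive contribution can be illustrated with a rough calculation. The sketch assumes the usual additive formulation in which the excess sibling risk sums across loci, so that λs - 1 equals the sum of (λg - 1); this formulation is an assumption and is not spelled out in the text. Only the chromosome 7 value of 5.0 is given, so it is the only locus entered here.

```python
def additive_share(locus_lambdas, overall_lambda_s):
    """Fraction of the excess sibling risk explained by the listed loci.

    Assumes an additive model: lambda_s - 1 = sum(lambda_g - 1) over loci.
    """
    explained_excess = sum(lam - 1.0 for lam in locus_lambdas)
    return explained_excess / (overall_lambda_s - 1.0)


# Chromosome 7 locus (gene-specific lambda of 5.0) against an overall lambda_s of 100.
share = additive_share([5.0], 100.0)
```

On these figures the single locus accounts for only a few per cent of the excess risk, in line with the statement that the identified loci contribute less than 10%.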

Molecular studies of reading disability

Familial transmission of reading disability has been recognised for a long time and twin studies demonstrate a substantial heritable component, estimated to be between 50% and 70% (DeFries et al, 1987; Gillis et al, 1992; Alarcon et al, 1998). Furthermore, twin studies suggest that the QTL perspective, which views genetic factors involved in reading disability as the same as those contributing to the quantitative dimension of the disability, is valid and should inform the design of molecular genetic studies. In fact, molecular studies of reading disability began in 1983 using traditional linkage approaches, giving suggestive but inconclusive results on chromosomes 6 and 15 (Smith et al, 1983; Bisgaard et al, 1987; Gross-Glenn et al, 1991; Rabin et al, 1993).

Cardon et al (1994) were the first to apply a QTL approach to linkage mapping. This was achieved by deriving a continuous measure of reading ability from a battery of psychometric tests used to test the siblings of probands with reading disability. They found considerable evidence for a gene in the chromosome 6 region implicated earlier, with a moderate to strong influence on reading ability. Further evidence for the chromosome 6 and chromosome 15 loci came from an analysis of six large families (Grigorenko et al, 1997), which found that these loci were linked to two distinct reading-related phenotypes: phonological awareness with chromosome 6, and single-word reading with chromosome 15. Final confirmation for the chromosome 6 locus has come from two linkage studies (Fisher et al, 1999; Gayan et al, 1999). Since then, association mapping with simple sequence repeat markers has been used to screen the chromosome 6 and 15 linkage regions. Association was detected with chromosome 15 markers in two series of probands giving an overall significance value of P=0.000 000 08 (Morris et al, 2000). Current work is focused on identifying the specific genes involved.

Molecular studies of ADHD

Attention-deficit hyperactivity disorder is characterised by a persistent pattern of overactivity, inattention and impulsivity which is pervasive across social situations and accompanied by substantial social impairments. The disorder is common, occurring in 2-5% of children, affecting boys 2-3 times more frequently than girls, and is one of the major causes of childhood behavioural problems (Taylor et al, 1996). Hyperactivity is known to aggregate within families (Cantwell, 1972; Biederman et al, 1990, 1992) and twin studies have consistently shown it to be among the most highly heritable behaviours in childhood (reviewed in Thapar et al, 1999). Although ADHD is diagnosed using operational criteria to define diagnostic categories, measures of hyperactivity are continuously distributed in the general population. Recent twin studies have used dimensional rating scales, with clinical cut-offs applied when diagnostic categories were required. These studies all show high heritabilities regardless of where these cut-offs had been made and regardless of whether diagnostic or continuous criteria had been applied. This suggests strongly that a dimensional perspective on hyperactivity is a valid and powerful approach to the identification of QTLs and should be considered complementary to the study of defined clinical types. Despite this, most studies to date have applied categorical definitions based on current DSM definitions of ADHD.

Molecular genetic studies in ADHD began by focusing on candidate genes within the dopamine system, based on a priori hypotheses from neurochemical and neuropharmacological research. Replicated associations have been reported with variations in genes for the dopamine receptors 4 (DRD4) and 5 (DRD5) and the dopamine transporter (DAT1) (Collier et al, 2000). The most robust of these findings is the association between ADHD and the 7-repeat of a 48 bp sequence within the coding region of DRD4, which is commonly repeated two, three, four or seven times (reviewed by Mill et al, 2001). A recent meta-analysis of seven case-control and fourteen within-family studies of this association, including both published and unpublished data, supports this finding, with odds ratios of 1.8 (P=4 × 10⁻⁸) and 1.3 (P=2 × 10⁻²), respectively (Faraone et al, 2001).

An interesting feature of the DRD4 data is the wide range of sampling procedures giving rise to positive findings. Swanson et al (1998) used a highly selected group of children who all responded to methylphenidate and had the combined subtype of ADHD without significant comorbidity. This contrasts greatly with the studies by Rowe et al (1998), who assessed children attending a behavioural clinic and applied diagnostic criteria following completion of DSM-IV rating scales, and Smalley et al (1998), who applied broader DSM-III-R and DSM-IV criteria following diagnostic interview. An alternative approach adopted by Curran et al (2001a) was to take a QTL perspective in which ADHD was conceptualised as the extreme of a normally distributed trait. In this study, they examined the relationship of the DRD4 polymorphism in a sample of children selected from the general population on the basis of high and low scores on five ADHD items of the Strengths and Difficulties Questionnaire as rated by the parents, and found a significant association with high-scoring individuals (χ²=8.63, P=0.003; odds ratio 2.09).

The DAT1 findings are of particular interest since stimulant drugs interact directly with the transporter protein. To date, there have been nine published association studies of ADHD with a 480 bp allele of a variable number tandem repeat (VNTR) polymorphism in the 3′-untranslated region of the gene: five support an association and four do not (summarised by Curran et al, 2001b). Meta-analysis of these data is consistent with a very small main effect for the 480 bp allele and is not yet convincing (χ²=3.45, P=0.06, OR=1.15). However, there is significant evidence of heterogeneity between the combined data-sets (χ²=22.64, d.f.=8, P=0.004), suggesting that the studies may divide into two groups: those in which the associated DAT1 allele has a main effect and those in which the allele does not. In this case, failure to replicate the association in some studies may result from variation in the strength of the genetic influence in different populations. The cause of such heterogeneity remains unknown and requires further investigation.
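The P values quoted for the main effect and for heterogeneity can be checked from the χ² statistics and their degrees of freedom. The sketch below assumes that the main-effect test has 1 d.f. (the text does not state this) and uses the closed-form upper tail of the χ² distribution, which is available for 1 d.f. and for even d.f.; the function name is illustrative.

```python
import math


def chi_square_upper_tail(statistic, df):
    """Upper-tail probability of a chi-square variable.

    Handles df == 1 and even df only, where simple closed forms exist.
    """
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df % 2 == 0:
        half = statistic / 2.0
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch handles only df == 1 or even df")


# Main effect of the 480 bp allele (assumed 1 d.f.) and heterogeneity (8 d.f.).
p_main_effect = chi_square_upper_tail(3.45, 1)
p_heterogeneity = chi_square_upper_tail(22.64, 8)
```

Both results round to the values reported in the text, 0.06 and 0.004 respectively.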

Finally, association and linkage to a marker near to DRD5 has been reported by several groups (Daly et al, 1999; Barr et al, 2000; Tahir et al, 2000).


The identification of susceptibility genes is only an initial step. Describing the molecular mechanisms involved and their relationship to biological and behavioural function will be a far greater challenge. Bridges will need to be built between structure and function, between molecular mechanisms and behaviour, and between social, genetic and developmental psychiatry. This area of functional genomics will be the real challenge in the post-genomic era if we are to see tangible benefits from current progress in mapping out the genetic and environmental influences on child behavioural and neurodevelopmental disorder.

Clinical Implications and Limitations


  • Both genes and environment are implicated in many child psychiatric conditions and psychological traits.

  • Molecular genetic studies aim to identify common genetic risk factors.

  • Long-term benefits are expected from an improved understanding of the genetic and environmental mechanisms involved.


  • Very large samples are required to reliably detect genes for such complex human traits.

  • Further advances are required in the rate at which genetic marker data can be generated.

  • Bridging the gap between gene structure and function in relation to behavioural outcomes remains a major challenge.


The authors thank the UK Medical Research Council for their support of this research. Dr Sarah Curran is a Wellcome Trust Training Fellow.


  • Received January 17, 2000.
  • Revision received April 5, 2000.
  • Accepted April 5, 2000.

