Rating scales in old age psychiatry


Background There is a vast array of scales available to assess all aspects of mental and physical health in older people which may be of relevance to the work of old age psychiatrists.

Aims To summarise some of the scales that may be commonly used in clinical and research practice and to give the reader guidelines as to where further information can be obtained.

Method The scales were selected on the basis of the authors' own clinical and research knowledge and information was gathered from a comprehensive text on assessment scales in old age psychiatry.

Results The selected scales are described in brief and a table outlines the purposes for which they are most suitable.

Conclusions Although many scales are available, the choice of the individual scale relies specifically on the question that is to be asked. The ideal scale does not exist.


A multitude of scales are available to assess the effects of mental and physical problems in older people. A recent compendium of scales (Burns et al, 1999b) contained 162, all of which are available to the old age psychiatrist interested in applying them in a clinical setting. This wide choice presents a formidable challenge to the clinician or researcher in deciding which scale is the most appropriate to use. The purpose of this paper is to help readers sift through the plethora of published scales and enable them to move towards making an informed choice of what to use and when.

What is the purpose of applying the rating scale?

Determining which scale should be selected must always follow an analysis of the underlying purpose. It is remarkable how often this simple step is ignored and this frequently leads to the wrong choice. Is the scale to be used to screen a population, to assess severity of symptoms, to help with diagnosis or to monitor change?

What is to be measured?

There are five major clinical domains that are relevant to the old age psychiatrist: mood; behaviour; functioning; cognition; and quality of life and carer burden. Each can be measured separately using a specific scale, or alternatively can be assessed as part of a multi-dimensional instrument.

Who is to carry out the rating?

Ratings can be self-reported, observer-rated, or based on information from an informant. The choice of instrument is often based on a combination of the user's familiarity with the scale, the time available for its application, and the presence and reliability of an informant. Subjective ratings are highly dependent on the cooperation of patients and their ability to understand either written or verbal instructions. Observer-based ratings can be time-consuming, and can misinterpret the severity and impact of the illness because they reflect a ‘snapshot’ rather than a ‘video’ of the patient's illness. Informant-based ratings are commonly used for patients with dementia, who may not be reliable observers of their own functioning or behaviour; such ratings may be subject to bias, influenced by the informant's mood state or perceptions. Often, a combination of proxy reporting followed by direct patient interview gives the best result.

What resources are available?

The time available and the person who is to carry out the rating are key factors in determining choice of scale. For a scale to be used as part of routine clinical practice it has to be brief and easy to administer. Many instruments require specific training although, generally speaking, scales can be completed by any competent clinician. There is rarely a need for independent assessment of interrater and test-retest reliability, unless the scale is being applied to a population different from that in the original description.

Which scale to use?

A brief description of the scales most frequently used in old age psychiatry is provided below and is summarised in Table 1, which also lists the time needed for the rating procedure, and whether it is to be done by an observer, the caregiver or the patient. Some instruments have been developed specifically for elderly patients, whereas others have been adapted for use in the elderly. For example, the Brief Psychiatric Rating Scale was developed for use in young patients with schizophrenia but is often used to measure agitation in elderly patients with dementia. Scales developed specifically for, and standardised in, older people are preferable to scales developed for younger people, which may not translate well to older populations. The Hamilton Rating Scale for Depression may underestimate depression in older patients because of the atypical nature of depressive symptoms in the elderly. Even where scales are designed for the elderly population, some have been developed with a specific disease entity in mind and may not be appropriate for use in all situations. The Geriatric Depression Scale is a self-report rating scale for depression in older people, but may not be useful following stroke, or in patients with dementia and depression.

Table 1. Summary of scales of use in old age psychiatry


Geriatric Depression Scale

The Geriatric Depression Scale (GDS) is a self-report scale designed to be simple to administer and not to require the skills of a trained interviewer (Yesavage et al, 1983). Each of the 30 questions has a yes/no answer, with the scoring dependent on the answer given. A sensitivity of 84% and specificity of 95% have been documented with a cut-off score of 11/12; a cut-off of 14/15 decreased the sensitivity rate to 80% but increased specificity to 100%. A 15-item version of the GDS has been devised by Sheikh & Yesavage (1986), and is probably the most common version currently used. The shortened version has a cut-off score of 6/7 and correlates significantly with the parent scale. Logistic regression analysis has been used to derive a four-item version which has a specificity of 88% with a cut-off of 1/2, and sensitivity of 93% with a cut-off of 0/1 (Katona, 1994). For the assessment of depression in older people, it is the scale against which others should be rated.
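Cut-offs written as two adjacent numbers (for example 11/12) are conventionally read as follows: a total up to the first number is treated as a non-case, and a total at or above the second as a probable case. The short Python sketch below illustrates that reading. It assumes this convention applies to the figures quoted here, and the function name `screens_positive` is hypothetical rather than part of any published GDS material.

```python
def screens_positive(total_score: int, cutoff: str) -> bool:
    """Return True if total_score is on the 'case' side of a cut-off written 'a/b'.

    Assumed convention: scores <= a are non-cases, scores >= b are probable cases.
    """
    below, above = (int(part) for part in cutoff.split("/"))
    if above != below + 1:
        raise ValueError("cut-off must be two consecutive integers, e.g. '11/12'")
    return total_score >= above


# 30-item GDS with the 11/12 cut-off
print(screens_positive(11, "11/12"))  # False
print(screens_positive(12, "11/12"))  # True
# 15-item GDS with the 6/7 cut-off
print(screens_positive(7, "6/7"))     # True
```

Raising the cut-off (for example from 11/12 to 14/15) moves borderline scorers to the non-case side, which is why specificity rises while sensitivity falls.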

Brief Assessment Schedule Depression Cards

The Brief Assessment Schedule Depression Cards (BASDEC) system is based on the Brief Assessment Schedule with the novel development that, because of the difficulties of questions being overheard on geriatric wards, patients choose answers from a deck of cards (Adshead et al, 1992). The scale is administered by an interviewer and takes 2-8 minutes to complete. The pack consists of 19 cards with enlarged black print on a white background, which are presented one at a time. The GDS and the BASDEC performed equally well in the original study, with a sensitivity of 71% and negative predictive value of 86% against a psychiatric diagnosis, using a BASDEC cut-off score of 6/7.
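Sensitivity, specificity and predictive values such as those quoted for these scales are all derived from a two-by-two table of screening result against reference diagnosis. The Python sketch below shows the standard definitions. The function name and the counts are hypothetical and for illustration only; they are not data from any study cited here.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 screening statistics.

    tp/fn: people with the diagnosis who screen positive/negative.
    fp/tn: people without the diagnosis who screen positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),  # proportion of true cases detected
        "specificity": tn / (tn + fp),  # proportion of non-cases correctly cleared
        "ppv": tp / (tp + fp),          # probability a positive screen is a case
        "npv": tn / (tn + fn),          # probability a negative screen is a non-case
    }


# Hypothetical counts, chosen only to show the arithmetic.
metrics = screening_metrics(tp=20, fp=10, fn=5, tn=65)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the disorder is in the sample, so figures from one setting may not carry over to another.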

Cornell Scale for Depression in Dementia

The Cornell Scale (Alexopoulos et al, 1988) is specifically for the assessment of depression in dementia and is administered by a clinician. It takes 20 minutes with the carer and 10 minutes with the patient.

It differs from other depression scales in the method of administration rather than in analysis of any different symptom profile seen in depression with dementia compared with depression alone (Purandare et al, 2001). The 19-item scale is rated on a three-point score of ‘absent’, ‘mild or intermittent’ and ‘severe’ symptoms, with a note when the score is unevaluable. A score of 8 or more suggests significant depressive symptoms. It is the best scale available to assess mood in the presence of cognitive impairment.

Geriatric Mental State Schedule

The Geriatric Mental State Schedule (GMSS) is one of the most widely used instruments for measuring a wide range of psychopathology in older people in all settings, but most importantly in community surveys (Copeland et al, 1976). Literature on the GMSS is extensive, and a number of different factors can be derived from the results. There is a computerised algorithm of proven reliability and validity, AGECAT, which provides standardised diagnoses. The GMSS can be administered via a laptop computer, has been translated into a number of different languages, has to be administered by a trained interviewer, and takes about 45 minutes to deliver. The use of the GMSS is limited to research, where it represents the gold standard.

Centre for Epidemiological Studies – Depression scale

The Centre for Epidemiological Studies – Depression (CES-D) scale is a self-administered scale, taking 5 minutes to complete. Originally developed for a general population study (Radloff, 1977), the instrument has been found to be particularly useful in older people. The scale consists of 20 items and the scoring range is from 0 to 60. A cut-off score of 16 has been suggested to differentiate patients with mild depression from normal subjects, with a score of 23 and over indicating significant depression.

Hamilton Rating Scale for Depression

The Hamilton Rating Scale for Depression (Hamilton, 1960) is the gold standard of observer-rated depression rating scales. It is a semi-structured interview, requires training to complete, and takes 20-30 minutes to administer. It is used in all age groups, for both clinical and research purposes, to assess the severity of depression rather than as a diagnostic tool. A cut-off score of 10/11 is generally regarded as appropriate for the diagnosis of depression.

Montgomery–Åsberg Depression Rating Scale

The Montgomery–Åsberg Depression Rating Scale (MADRS) is administered by a trained interviewer, takes 20 minutes to complete and was designed as a measure of change in studies of the treatment of depression (Montgomery & Åsberg, 1979). It was developed by taking items from a longer scale. It is widely used in treatment trials, in both young and older patients. Specific instructions are given regarding the ratings and there is a comparative lack of emphasis on somatic symptoms, making it useful for the assessment of depression in people with physical illness. Cut-off scores have been suggested by Snaith et al (1986): 0-6 indicates the absence of depression (or recovery in the setting of a clinical trial); 7-19, mild depression; 20-34, moderate depression; and 35 and above, severe depression.
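The severity bands suggested by Snaith et al (1986) amount to a simple lookup on the total score, sketched below in Python. The function name `madrs_band` is hypothetical, and the 0-60 total range (ten items each scored 0-6) is an assumption that is not stated in the text above.

```python
def madrs_band(total: int) -> str:
    """Map a MADRS total score to the bands suggested by Snaith et al (1986)."""
    # Assumption: ten items scored 0-6, so totals range from 0 to 60.
    if not 0 <= total <= 60:
        raise ValueError("MADRS total is assumed to lie between 0 and 60")
    if total <= 6:
        return "absent"    # or recovery, in the setting of a clinical trial
    if total <= 19:
        return "mild"
    if total <= 34:
        return "moderate"
    return "severe"


print(madrs_band(5))   # absent
print(madrs_band(19))  # mild
print(madrs_band(20))  # moderate
print(madrs_band(35))  # severe
```

A band is only a summary of severity; in a treatment trial the change in total score is usually the quantity of interest.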


Mini-Mental State Examination

The Mini-Mental State Examination (MMSE) is a rating of cognitive function and takes a trained interviewer 10 minutes to administer (Folstein et al, 1975). It is the most widely used measure of cognitive function, and users need some training and familiarisation with the instrument. Much has been written about the MMSE and amendments have been suggested such as the Standardised Mini-Mental State Examination (Molloy et al, 1991) and the Modified Mini-Mental State (Teng et al, 1987). The original validity and reliability of the MMSE were based on 206 patients with a variety of psychiatric disorders, the scale successfully separating those with dementia, depression, or a combination of the two. Details of extensive subsequent validity and reliability studies are described by Tombaugh & McIntyre (1992). A cut-off score of 23 for the presence of cognitive impairment has been suggested, with variations depending on level of education.

Mental Test Score and Abbreviated Mental Test Score

The Mental Test Score (MTS) and its abbreviated version are brief questionnaires to assess the degree of cognitive function, particularly memory and orientation; the MTS takes 10 minutes to administer, and the abbreviated form 3 minutes (Hodkinson, 1972). The MTS was developed from the Blessed Dementia Scale and was used in a study of over 700 patients carried out under the auspices of the Royal College of Physicians in the 1970s. A score of 25 and above (out of 34) is within normal range. From it, the Abbreviated Mental Test Score (AMTS) was developed, scored out of 9 or 10 (depending on whether the optional recognition questionnaire is completed). A cut-off score of 7/8 out of 10 (or 6/7 out of 9) is suggested to discriminate between cognitive impairment and normality. Qureshi & Hodkinson (1974) further validated the shorter questionnaire.

Clock drawing test

The clock drawing test takes only 2 minutes to administer and reflects frontal and temporoparietal functioning (Brodaty & Moore, 1997; Shulman et al, 1986). The main advantages are its simplicity of administration and the non-threatening nature of the task. The patient is asked to draw a clock face marking the hours and then draw the hands to indicate a particular time (e.g. 10 minutes to 2). Standardised methods of scoring have been described with sensitivities of up to 86% and specificity of up to 96% compared with diagnosis using the MMSE. This test is particularly useful in the general practice setting.

Seven-minute neurocognitive screening battery

The 7-minute neurocognitive screening battery is a test for cognitive impairment which aims to distinguish between patients with dementia and normal controls (Solomon et al, 1998). It takes a trained interviewer a mean of 7 minutes 42 seconds (range 6-11 minutes) to administer. The 7-minute screen consists of four tests representing four cognitive areas affected in Alzheimer's disease: memory, verbal fluency, visuoconstruction and orientation for time. The screening instrument was designed so that it could be rapidly administered by a technician, requiring no clinical judgement or training. It distinguishes patients with early Alzheimer's disease from those with normal ageing. It is a relatively new instrument and its exact use has still to be established.

Alzheimer's Disease Assessment Scale

The Alzheimer's Disease Assessment Scale (ADAS) takes 45 minutes administered by a trained observer and is a standardised assessment of cognitive function and non-cognitive features (Rosen et al, 1984). The cognitive section of the scale (ADAS-Cog) is the gold standard for measuring change in cognitive function in drug trials. Deterioration of about 10% per year in cognitive tests in patients with Alzheimer's disease is regarded as average. The cognitive domains include components of memory, language and praxis, while the non-cognitive features include mood state and behavioural changes. There are 11 main sections testing cognitive function and 10 assessing non-cognitive features.


Clinical Dementia Rating

The Clinical Dementia Rating (CDR) scale is used as a global measure of dementia (Hughes et al, 1982; Berg, 1984) and is usually completed by a clinician in the setting of detailed knowledge of the individual patient. Much of the information will therefore already have been gathered, either as part of normal clinical practice or as part of a research study. If a specific interview is carried out, about 40 minutes is needed to gather the relevant information. The CDR has become one of the main methods by which the degree of dementia is quantified into stages. Six domains are assessed: memory; orientation; judgement and problem-solving; community affairs; home and hobbies; and personal care. Ratings are 0 for healthy people, 0.5 for questionable dementia and 1, 2 and 3 for mild, moderate and severe dementia as defined in the CDR scale.

Clinicians' Global Impression of Change

The Clinicians' Global Impression of Change scale is administered by a trained rater and takes 10-40 minutes (Guy, 1976). The ratings depend on the ability of the clinician to detect change, and any change that is clinically detectable is significant. By definition, these measures are global ratings of a patient's clinical condition, and inevitably draw information from a wide variety of sources. The scale has been used extensively in clinical trials of antidementia drugs where a global assessment of the degree of dementia is required, and can usefully assess change from a specified baseline (Knopman et al, 1994; Schneider & Olin, 1996).


Neuropsychiatric Inventory

The Neuropsychiatric Inventory (NPI) evaluates a wider range of psychopathology than comparable instruments (Cummings et al, 1994). It may help distinguish between different causes of dementia, records severity and frequency separately, and takes 10 minutes to administer. The NPI assesses ten domains: delusions; hallucinations; dysphoria; anxiety; agitation/aggression; euphoria; disinhibition; irritability/lability; apathy; and aberrant motor behaviour. A screening strategy is used to cut down the length of time the instrument takes to administer, but obviously it takes longer if replies are positive. It is scored from 1 to 144 and severity and frequency are independently assessed. The NPI has been translated into a number of languages and it is now used widely in drug trials.


The BEHAVE-AD (Reisberg et al, 1987) takes a clinician 20 minutes to administer and was designed particularly to be useful in prospective studies of behavioural symptoms and in pharmacological trials to document behavioural symptoms in patients with Alzheimer's disease. The BEHAVE-AD is the original behaviour rating scale in Alzheimer's disease. It is in two parts: the first part concentrates on symptomatology, and the second requires a global rating of the symptoms, on a four-point scale of severity. The domains covered are paranoid and delusional ideation; hallucinations; activity disturbances; aggression; diurnal variation; mood; and anxieties and phobias.


The Manchester and Oxford Universities Scale for the Psychopathological Assessment of Dementia (MOUSEPAD) is administered to carers by an experienced clinician, and takes 15-30 minutes, most items being given a three-point severity score (Allen et al, 1996). The main indication for use of the scale is the measurement of psychiatric symptoms and behavioural changes in patients with dementia.

The MOUSEPAD is based on the longer Present Behavioural Examination (Hope & Fairburn, 1992), and was developed as a shorter instrument and one with an equal emphasis on psychiatric symptomatology and behavioural changes.

Cohen-Mansfield Agitation Inventory

The seven-point rating system of the Cohen-Mansfield Agitation Inventory (CMAI) assesses 29 different agitated behaviours in patients with cognitive impairment (Cohen-Mansfield, 1989). It takes 10-15 minutes and is carried out by carers. Training is essential. The agitated behaviours include wandering, aggression, inappropriate vocalisation, hoarding items, sexual disinhibition and negativism, and are rated on a seven-point scale of frequency. The CMAI is useful for the assessment of agitation in residents of nursing and residential homes.

Revised Memory and Behaviour Problems Checklist

The Revised Memory and Behaviour Problems Checklist assesses behavioural problems in dementia, taken from caregiver reports (Teri et al, 1992). It is a 24-item list that provides one total score and three subscores for memory-related problems, depression and disruptive behaviours, assessing both the frequency of the behaviour and the caregiver's reaction.


Bristol Activities of Daily Living Scale

The Bristol Activities of Daily Living Scale was designed specifically for use in patients with dementia (Bucks et al, 1996). The scale assesses 20 daily living abilities. Face validity was measured by way of carer agreement that the items were important, construct validity was confirmed by principal components analysis and concurrent validity by assessment with observed performance, and there is good test-retest reliability. Three phases in the design of the scale are described, and researchers designing their own scale should read the account of this development, which is a model of clarity.

Alzheimer's Disease Functional Assessment and Change Scale

The Alzheimer's Disease Functional Assessment and Change Scale (ADFACS) is used for the assessment of activities of daily living in patients with Alzheimer's disease with particular reference to outcomes in clinical trials (Galasko et al, 1997). It is informant-based and takes 20 minutes. The scale has been used in drug trials, and consists of ten items for instrumental activities of daily living: ability to use the telephone; performing household tasks; using household appliances; handling money; shopping; preparing food; ability to get around both inside and outside the home; pursuing hobbies and leisure activities; handling personal mail; and grasping situations or explanations. These are rated from no impairment to severe impairment.

Basic activities of daily living are assessed on a six-point scale (an additional rating, very severe impairment, is included). These are: toileting, dressing, personal hygiene and grooming, physical ambulation and bathing. The scale was developed from 45 activities of daily living items, with the chosen items having been shown to be sensitive to change over 12 months, to correlate with the MMSE and to have good test-retest reliability (Galasko et al, 1997).

Interview for Deterioration in Daily Living Activities in Dementia

The Interview for Deterioration in Daily Living Activities in Dementia (IDDD) assesses activities of daily living, taking 15 minutes to administer with a caregiver (Teunisse et al, 1991). The scale covers 33 self-care activities such as washing, dressing and eating, as well as more complex activities such as shopping, writing and answering the telephone, tasks performed equally by men and women (earlier scales of activities of daily living tended to rely more heavily on female-dominated and less complex tasks). Both the initiative to perform activities and the performance itself are evaluated.

Disability Assessment for Dementia

The Disability Assessment for Dementia (DAD) scale (Gelinas et al, 1999) is rated by a trained observer and takes 20 minutes. It is a new functional scale specifically developed for patients with Alzheimer's disease and assesses basic and instrumental activities of daily living.


Psychogeriatric Assessment Scale

The Psychogeriatric Assessment Scale (PAS) provides an assessment of the clinical changes of dementia and depression (Jorm et al, 1995). The package is easy to administer and score, and can be used by lay interviewers. It is intended for use both in research and service evaluation, taking about 10 minutes to administer by a trained lay interviewer or clinician. There are three scales derived from an interview with the subject (cognitive impairment, depression, stroke) and three derived from an interview with an informant (cognitive decline, behavioural change, stroke).

Brief Psychiatric Rating Scale

The Brief Psychiatric Rating Scale (BPRS) takes about 20 minutes and is administered by a trained interviewer. The BPRS is a 16-item, seven-point ordered category rating scale which has been developed through previous versions (Overall & Gorham, 1962). The domains assessed are somatic concern; anxiety; emotional withdrawal; conceptual disorganisation; guilt feelings; tension; mannerisms and posturing; grandiosity; depressive mood; hostility; suspiciousness; hallucinatory behaviour; motor retardation; uncooperativeness; unusual thought content; and blunted affect. The questions are completed in 2-3 minutes following the interview.

Health of the Nation Outcome Scales 65+

The Health of the Nation Outcome Scales 65+ (HoNOS 65+) are an adaptation of the equivalent scale for younger people (Burns et al, 1999a). It is a 12-item score dealing with the following aspects of the mental state and living situation: aggression; self-harm; drug and alcohol use; cognitive problems; physical illness and disability; hallucinations and delusions; depression; other symptoms; relationships; activities of daily living; residential environment; and daytime activities.

Its main use is in the provision of the global assessment of a patient. Its administration takes about 10 minutes and requires some training. The HoNOS 65+ is becoming a useful tool in defining the characteristics of populations of older people with mental health problems.

Cambridge Mental Disorders of the Elderly Examination

The Cambridge Mental Disorders of the Elderly Examination (CAMDEX) is a structured instrument made up of eight sections: an interview with the subject, a cognitive section (the CAMCOG), the interviewer's observations of the subject, a physical examination, results of investigations, a note of medication, any additional information and an interview with an informant (Roth et al, 1986). The resulting information provides a formal diagnosis in a number of categories: four types of dementia, delirium, depression, anxiety, paranoid disorder, and other psychiatric disorders. Interrater reliability is excellent and a cut-off score of 79/80 gives a 92% sensitivity and 96% specificity in relation to a diagnosis of dementia. The CAMDEX has been used extensively in research studies.


General Health Questionnaire

The General Health Questionnaire (GHQ) is a self-administered screening test used for detecting psychiatric disorders in community settings and non-psychiatric clinical settings (Goldberg & Williams, 1988). A number of versions are available; the commonly used 12-item one takes 5 minutes. It is not normally used as a screening measure in older people, but has been used as a measure of psychological distress and psychiatric morbidity in carers of patients with dementia (Marriott et al, 2000) and seems to be sensitive to change in that situation.

Quality of Life in Alzheimer's Disease Patient and Caregiver Report

The Quality of Life in Alzheimer's Disease Patient and Caregiver Report (QoL-AD) is used for the assessment of quality of life in dementia and is taken from self and caregiver reports (Logsdon et al, 1999). This 13-item assessment relates to the domains of mood, physical health, memory, relationships, self-esteem and current situation. Each is marked on a four-point scale.


Confusion Assessment Method

The Confusion Assessment Method (CAM) instrument (Inouye et al, 1990) consists of nine operationalised criteria from DSM-III-R (American Psychiatric Association, 1987) including the four cardinal features of delirium: acute onset and fluctuation, inattention, disorganised thinking and altered level of consciousness. Both the first and second features, and either the third or fourth feature, are required for the diagnosis. The results have been validated against psychiatric diagnosis.
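The diagnostic rule for the four cardinal features can be written as a single logical expression, sketched below in Python. The function and parameter names are hypothetical, and the sketch covers only the combination rule, not how each feature is elicited or rated.

```python
def cam_delirium(
    acute_onset_and_fluctuation: bool,
    inattention: bool,
    disorganised_thinking: bool,
    altered_consciousness: bool,
) -> bool:
    """Apply the CAM rule: features 1 AND 2, plus EITHER feature 3 OR feature 4."""
    return (
        acute_onset_and_fluctuation
        and inattention
        and (disorganised_thinking or altered_consciousness)
    )


# Features 1, 2 and 4 present: criteria met.
print(cam_delirium(True, True, False, True))   # True
# Inattention absent: criteria not met, whatever else is present.
print(cam_delirium(True, False, True, True))   # False
```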

Cognitive Failures Questionnaire

The Cognitive Failures Questionnaire (CFQ) is used as a measure of self-reported failures in perception, memory and motor function (Broadbent et al, 1982) and takes about 10 minutes to complete. This questionnaire may be of use in screening different memory complaints in a population or clinical sample. Its use has not been validated against the presence or absence of dementia, but it gives a useful overview of which aspects of memory loss are giving rise to problems.


In old age psychiatry, as in general psychiatry, the art of practice involves making judgements about the presence or absence of psychiatric illness and the assessment of its impact and severity. A good psychiatrist may make these judgements automatically, but a better psychiatrist supplements clinical judgement by making sure all the right questions have been asked and by rating the severity of the illness or impairment. The use of rating scales helps formalise the assessment approach, ensures thoroughness, may clarify the presence or absence of mental illness, gives an index of severity, and facilitates the determination of response to treatment and disease course over time.

The use of rating scales in old age psychiatry has to a large extent been restricted to the academic and research arenas. Although there are many complex and unwieldy scales that could only be used in research settings, the majority of the scales described here are suitable for clinical use to complement and improve our assessment of patients. Old age psychiatrists should become more comfortable with routine use of such scales, and training in and exposure to the various rating scales that can be used in elderly people should be incorporated into undergraduate, postgraduate and specialist training programmes. Rating scales can be as useful a clinical tool to the old age psychiatrist as the stethoscope and patella hammer are to the physician.

Clinical Implications

  • Take care to choose the correct scale.

  • Make sure it is suited to the purpose intended.

  • Always check source material when using a scale.

Limitations

  • It is difficult to choose a scale because of the large number available.

  • Many scales have poorly documented validity and reliability.

  • Many scales are used for a purpose for which they are not intended.


  • Received November 7, 2000.
  • Revision received February 9, 2001.
  • Accepted May 23, 2001.

