Journal of Public Health Medicine Vol. 21, No. 3, pp. 255–270 Printed in Great Britain

Short Form 36 (SF-36) Health Survey questionnaire: which normative data should be used? Comparisons between the norms provided by the Omnibus Survey in Britain, the Health Survey for England and the Oxford Healthy Life Survey

Ann Bowling, Matthew Bond, Crispin Jenkinson and Donna L. Lamping

Abstract

Background Population norms for the attributes included in measurement scales are required to provide a standard with which scores from other study populations can be compared. This study aimed to obtain population norms for the Short Form 36 (SF-36) Health Survey Questionnaire, derived from a random sample of the population in Britain who were interviewed at home, and to make comparisons with other commonly used norms.

Methods The method was a face-to-face interview survey of a random sample of 2056 adults living at home in Britain (response rate 78 per cent). Comparisons of the SF-36 scores derived from this sample were made with the Health Survey for England and the Oxford Healthy Life Survey.

Results Controlling for age and sex, many of the mean scores on the SF-36 dimensions differed between the three datasets. The British interview sample had better total means for Physical Functioning, Social Functioning, Mental Health, Energy/Vitality, and General Health Perceptions. The Health (interview) Survey for England had the lowest (worst) total mean scores for Physical Functioning, Social Functioning, Role Limitations (physical), Bodily Pain, and Health Perceptions. The postal sample in central England had the lowest (worst) total mean scores for Role Limitations (emotional), Mental Health and Energy/Vitality.

Conclusion Responses obtained from interview methods may suffer more from social desirability bias (resulting in inflated SF-36 scores) than those obtained from postal surveys.
Differences in SF-36 means between surveys are also likely to reflect question order and contextual effects of the questionnaires. This indicates the importance of providing mode-specific population norms for the various methods of questionnaire administration.

Keywords: SF-36, health status, methodology, questionnaires

CHIME/Population Studies and Primary Care, Royal Free and University College London Medical School, Whittington Campus, London N19 5NF.
Ann Bowling, Professor of Health Services Research
Matthew Bond, Lecturer in Health Services Research

Health Services Research Unit, Department of Public Health, University of Oxford, Institute of Health Sciences, Old Road, Headington, Oxford OX3 7LF.
Crispin Jenkinson, Deputy Director

Health Services Research Unit, Department of Public Health and Policy, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT.
Donna L. Lamping, Senior Lecturer in Health Services Research

Address correspondence to Professor A. Bowling.

© Faculty of Public Health Medicine 1999

Introduction

The SF-36 is a generic measure of health status, which was derived from the batteries developed for the Rand Medical Outcomes Study in the USA.1 It can be used to provide a population-based measure of broader health status, for use in service planning and monitoring and in making comparisons with the health of populations elsewhere, and in measuring the health outcomes of clinical interventions.2

All measurement scales need to define population norms for the attribute of interest, to provide a standard with which scores from other study populations can be compared – the investigator needs to know what scores would be expected from a comparable group to that under study. Thus normative data are essential to interpret a scale’s scores for particular study populations. For example, a judgement of ‘healthiness’ or ‘unhealthiness’ on the basis of scores on a health status scale needs to relate to some norm (e.g. statistical, as in a group average). It is essential that norms are valid scores derived from members of a clearly defined, random sample of the relevant population, which, in turn, has been taken from a representative and unbiased sampling frame. This comparison enables investigators to determine whether scores from the sample under study are above or below those for the general population.

Both national and regional norms are required. National norms are essential for making comparisons with national datasets, as well as for making comparisons at local levels. However, regional norms are also required for the latter given variations in health status by geographical area, independently of the socio-demographic composition of the population.

Most commonly used norms are based on random samples of patients registered with general practitioners (GPs). Centralized lists of these patients are commonly used in research, given that 98 per cent of the British population are registered with National Health Service (NHS) GPs, although they do present problems of ‘blanks’ (people who have died or moved and whose records have not been updated).
The currently used normative data for the SF-36 in the UK were derived from postal surveys of people registered with GPs in central England and Sheffield.3–5 The central England study norms (the Oxford Healthy Life Survey), which was restricted to people aged 18–64, are the most widely used.3,4 The sampling frame for the Sheffield survey was limited to lists of patients aged 16–74 registered with just two general practices, and which under-represented people in social class II and overrepresented those in social class III and employed women.5 Given the need for regional norms for making local comparisons, as well as for norms relating to older people, regional postal and interview surveys based on the SF-36 have also been carried out, for example, in Aberdeen, Scotland,6 and in West Glamorgan in Wales, and Dudley and North Staffordshire in England.7–10 All of these regional norms were based on lists of GPs’ patients. The postal survey in Aberdeen sampled GPs’ patients with pre-defined conditions who had been referred to out-patients departments, and who were aged between 16 and 86 years registered with four training practices, although they did subsequently compare them with a sample from the electoral register.6 More recently, normative data for England have been provided by the English Health Survey for 1996, which was based on personal interviews in respondents’ homes. The provision of norms from a mixture of different research designs (e.g. postal and interview surveys) in different areas creates difficulties when comparing data.11 SF-36 scores have been reported to vary significantly between samples in different areas, perhaps reflecting the known variation in health status and mortality by region, even when controlling for social class (e.g. 
SF-36 scores have been reported to be lower (worse) in West Glamorgan than in Oxford or Aberdeen),8 and higher on most dimensions in East Anglia and Oxford than in other regions.11 One further limitation of the commonly used UK and regional norms in the UK is that, as well as being area specific, they take different younger age ranges as the starting point for sampling (e.g. ages 16, 18 or 20) and take different older age cut-off points (e.g. 65, 75, 89), or they sample only very old age groups (e.g. 70 years and over). The question these surveys collectively pose is: are the differences in norms a result of: sample selection bias, geographical area differences, differences in mode of administration, question order effects or contextual differences (the nature of the survey and the positioning of the SF-36 scale in the wider questionnaire can both influence response bias)? This questioning has led to some investigators carrying out local surveys with the aim of providing more appropriate norms,7–10 or to investigators even making comparisons with US norms, given the unavailability of national British norms.12 The SF-36 has been tested for use in the USA and in 12 other countries, as well as the UK (Australia, the Netherlands, France, Belgium, Canada, Denmark, Italy, Japan, Norway, Spain, Sweden, and a Chinese (HK) version has been developed). These studies formed the International Quality of Life Assessment (IQOLA) project. However, despite evidence on the validity and reliability of the norms for each country, published together in a special journal issue,13 these norms suffer from the same geographical constrictions as the UK norms. 
Also, just one of the groups of researchers in this collection of international papers on the SF-36 questioned the mode of administration of the instrument.14 Perkins and Sanson-Fisher14 reported on their Australian study in which they randomly allocated community sample members to the postal or telephone interview mode of administration of the SF-36. Overall response, item response and internal consistency (Cronbach’s alpha) were all higher with the telephone interview mode (although more people aged 75 and over refused to participate with this mode). In addition, mean SF-36 scores for Bodily Pain, Social Functioning, Role Limitations (emotional) and Mental Health (four of the eight scale dimensions) were significantly higher for the interview mode of administration than for the postal mode. Unfortunately, the researchers did not explore the implications of the latter difference.

Because of the age restrictions of the surveys providing UK norms, and the need for regional norms for comparison, co-investigators in West Glamorgan, Dudley and North Staffordshire carried out interview surveys of random samples of people aged 65–89 years7,9 or included only people aged 70 and over.9 They conducted further postal and interview surveys in West Glamorgan, based on a random sample of people aged 20–89.8,10 Thus, these omitted young adults aged 16–20, and very elderly people aged 90+. The scarcity of SF-36 norms for older people has been experienced in other countries and is not unique to the UK.13 The suitability of the SF-36 for use with this population is still uncertain.
Relatively poor levels of item response with increasing age in surveys involving self-completion of the SF-36 questionnaire5,6,10 have led to questions about its value in older age groups and criticism of the relevance of the items to older people.6,16,17 The high total and item non-response to the SF-36 among older people, particularly when cognitive impairment and physical disability were present, has been confirmed in recent surveys of older hospital and ambulatory care patients, which has again led to serious doubts about its utility as a health status measure for self-administration among older people.18 Given that there is an inverse association between health status and age, SF-36 scores are likely to inflate the ‘healthiness’ of populations. Moreover, as Lyons et al.19 have pointed out, as older people are the main users of health services, a health status measure that is unsuitable for them has limited practical use.

More recently, a Health Survey for England was designed to provide annual data about the nation’s health. The 1996 survey included the SF-36.11 The survey was based on a random sample of 720 postcode sectors, and a random sample of 12 960 addresses from the postcode file. This provides the largest dataset for the provision of norms to date, and involved administration of the SF-36 to adults aged 16+. However, figures for item-response and non-response to the SF-36 have not been released and the published data have not been widely accessed.

Norms based on different modes of questionnaire administration are also required. The SF-36, like many measures of health status, can be self-administered or interviewer administered. Each mode of administration has advantages and disadvantages. Self-administered questionnaires are more economical in staff time and money, and allow respondents to complete the instrument in their own time.
However, problems include bias or lack of full completion because of respondents’ lack of comprehension, illness or frailty, the researchers’ lack of control over question order effects (respondents can read through the questionnaire before completing it and their response is therefore biased), lack of motivation and variable time periods of completion. The best quality data are usually derived from face to face interviews with people where the interviewer can motivate the respondent, explain questions where appropriate, control the order in which the questions are asked, maximize response, and minimize item non-response. However, personal interviews are expensive and the interviewer can also introduce biasing effects (interviewer bias) and may lead to social desirability bias, particularly in sensitive areas, including physical and mental health.20–22 Studies comparing the results of questionnaires by mode of administration have reported inconsistent results. Whereas some report no or few differences in results,23,24 others have shown that telephone interviews yield more positive response patterns than self-administered questionnaires mailed to respondents.19,21,22 McHorney et al.22 provided national norms for the SF-36 in the USA for both a postal and telephone survey approach. They reported that all of the SF-36 dimension scores were more favourable by about 3–10 points in the telephone interviews, and the reporting of chronic conditions was more frequent for postal than telephone interview respondents. Younger people were more likely to refuse to participate in the postal mode, thus non-response bias cannot be ruled out as one explanation for more negative health status scores in the postal mode. However, this explanation is unlikely, as the findings of McHorney et al. have been replicated by Perkins and Sanson-Fisher,14 as described earlier, and by Lyons et al.19 in a randomized cross-over study of outpatient attenders. 
Patients were randomly assigned to either an initial out-patient postal questionnaire followed up by an interview-based questionnaire while attending out-patients departments, or an initial interview-based questionnaire while in out-patients departments, with a postal follow-up questionnaire. Lyons et al. compared SF-36 profiles of the same respondents’ clinic-based interviews and postal questionnaires self-completed at home, and reported that seven of the eight SF-36 dimension scores were lower (indicating worse health status) for self-completion than interview-based formats, with the largest differences in Role Limitations due to emotional problems and Social Functioning. The implication of these studies is that the mode of administration of the SF-36 leads to systematic differences in health status ratings, with self-completed modes providing more negative profiles. Lyons et al. pointed to the implications for considerable error in the interpretation of data from study designs based on baseline interviews and postal follow-up questionnaires to assess outcomes of health care interventions. As they also pointed out, there are problems in interpretation of data where the differences in scores related to mode of administration are as large as the effect of the therapies or medical condition under investigation. They illustrated how the difference in scores by mode of administration can equate to 20–50 per cent of the impact of having a condition.7,19

Two further problems in interpretation may be presented by question order effects (position of the SF-36 in the wider questionnaire, before or after other questions on health) and contextual effects (sensitizing effects of the health status scale being included within a questionnaire on health status, medical effectiveness, or in a generic questionnaire). These issues have not been fully examined for the SF-36, or for other health status questionnaires.
One principle of questionnaire design is that general questions should be placed before specific ones, to minimize bias from order effects.25 It is possible that if disease-specific questions are asked before generic health status items, then respondents’ generic health status ratings would be more favourable. This is because the disease items would already have been considered by respondents and therefore excluded in replies to the generic items. Accordingly, Keller and Ware26 recommended that the SF-36 should be presented to respondents before more specific health and disease items or scales, and that there should be a clear break between scales, to remove potential bias from order effects. This can only be controlled for in interviewer modes of administration. Consistent with this possibility, Barry et al.,27 in their study of benign prostatic hyperplasia, reported that SF-36 scores were better when disease-specific modules were administered first, although the differences were not statistically significant.

This issue is pertinent when comparing results from the SF-36 with normative data. It is necessary to look, not only at mode of administration, but also at the positioning of the SF-36 within interview-based questionnaires and at the context of the study. For example, the central England study3 was based on a Healthy Life Survey, which may have sensitized respondents to health and lifestyle issues from the outset. The central England study team correctly placed the SF-36 at the beginning of the survey questionnaire, but this would minimize order effects only if respondents did not read through the (postal) questionnaire before responding. In contrast, the Office for National Statistics (ONS; formerly known as the Office of Population Censuses and Surveys) Omnibus Survey investigators administered the SF-36 to respondents in the middle of a lengthy interview about a range of non-health topics (e.g. savings, household characteristics).
The 1996 Health Survey for England11 involved the administration of the SF-36 towards the end of a lengthy interview about negative health (e.g. respiratory and other specific medical problems). This positioning carried the potential for bias from order effects.

Just as investigators require appropriate geographical norms with which to compare their data, researchers undertaking a personal interview survey cannot necessarily rely on norms derived from a postal survey and vice versa. In the absence of appropriate norms, at the very least the comparisons will require adjustment. This step requires the provision of information on area, contextual, question order and mode of administration effects. These effects are examined here.

SF-36 norms based on a random sample of adults interviewed at home in Britain are presented. The availability of total population norms extends previously published norms based on regional samples in the UK and national norms for England only, and provides normative data for adults of all ages in Britain, including elderly people. Comparisons with the regional norms from the central England survey – the Oxford Healthy Life Survey3,4 (the most commonly used normative dataset) – and results from the Health Survey for England11 are presented in this paper, with the aim of furthering debate on the issue of which norms are most appropriate for use.

Aims and methods

The aims of the analyses presented here were to provide population norms, and information on internal consistency, for the Short Form 36 Health Survey Questionnaire (SF-36), based on a random sample of the population in Britain.
A further aim was to make comparisons between the norms presented and the commonly used UK regional norms based on the sample from central England, and the more recent results from the 1996 Health Survey for England, which were described earlier.3,4,11

The study design for the British survey presented here was a face-to-face interview survey, in respondents’ homes, of a national random sample of people aged 16 and over in Great Britain. The survey was carried out in November 1992, and was commissioned by the King’s Fund Institute in London. The dataset has recently been released by the Office for National Statistics to the Data Archive at the University of Essex, with approved access for authorized, registered users. The survey was previously unpublished.

The vehicle for the study was the ONS Omnibus Survey in Great Britain. The sampling frame was the British postcode address file of ‘small users’, which includes all private household addresses. The sample was a multi-stage stratified random sample. The postcode address file was stratified by region, the proportion of households renting from local authorities, and the proportion in which the head of the household is in socio-economic group 1–5 or 13 (i.e. a professional, employer or manager). One hundred postal sectors were selected with probability proportional to size. Within each postal sector, 30 addresses were selected randomly. The number of sampled addresses was 3000, with the aim of achieving a target of 2000 completed interviews. If an address contained more than one household, the interviewer was instructed to use a standard procedure to select just one household randomly. In households with more than one adult member, just one person aged 16 or over was selected for interview with the use of a Kish grid. Because only one household member was interviewed, people in households that contained few adults had a better chance of selection than those in households with many.
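The unequal chance of selection described above can be illustrated with a short sketch. It assumes the conventional correction, in which a respondent chosen at random from n eligible adults receives a weight proportional to n; the paper does not state the exact formula used, so the function names, the rescaling step and the example households below are all hypothetical.

```python
def selection_weight(adults_in_household: int) -> float:
    """Relative weight for one adult picked at random from a household.

    Assumption (not stated in the paper): the chance of selection is
    1 / adults_in_household, so the weight is its inverse.
    """
    if adults_in_household < 1:
        raise ValueError("a sampled household must contain at least one adult")
    return float(adults_in_household)


def rescale_to_sample_size(raw_weights):
    """Rescale weights so that they sum to the number of respondents."""
    scale = len(raw_weights) / sum(raw_weights)
    return [w * scale for w in raw_weights]


# Design arithmetic taken from the text: 100 postal sectors x 30 addresses.
SECTORS = 100
ADDRESSES_PER_SECTOR = 30
sampled_addresses = SECTORS * ADDRESSES_PER_SECTOR

# Hypothetical respondents from households with 1, 2, 2 and 4 adults.
example_households = [1, 2, 2, 4]
weights = rescale_to_sample_size(
    [selection_weight(n) for n in example_households]
)
```

Under this assumption, the respondent from the four-adult household counts four times as much as the respondent living alone, which offsets the lower chance of being selected.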
A weighting factor was applied to correct for this unequal probability. The individual adult, rather than the household, was the unit of analysis in the results presented here. Of the 3000 selected addresses, 356 were ineligible (e.g. non-domestic). At the remaining 2644 addresses, 327 (12 per cent) people refused to take part, 40 (2 per cent) were incapable of interview, 221 (8 per cent) were non-contactable. Consequently, 2056 people aged 16 and over were interviewed in person in their own homes, giving a response rate of 78 per cent. Details of the design of the central England Survey and the Health Survey for England 1996, with which comparisons were made, are briefly described next. The sampling frame for the central England study, known as the Oxford Healthy Life Survey, was computerized registers of GPs’ patients aged 18–64 in four family health services authorities (now merged with district health authorities) in central England: Berkshire, Buckinghamshire, Northamptonshire and Oxfordshire. The study was a postal survey in 1991– 1992 of 9332 randomly sampled people (representing 72 per cent who responded).3 The central England sample did not aim to include anyone aged 65 or over. The investigators compared the characteristics of their sample with 1981 Census data and 1991 population estimates, and reported that their sample mirrored closely the characteristics of the general population (for age, sex and social class), although it slightly overrepresented those in the higher social classes I, II and IIInm (65 per cent fell into these groups, in comparison with 56 per cent for the British population in the 1991 Census). They did note the limitations of their data. Most users make comparisons with the central England normative data. 
The investigators of the latter have developed a user’s handbook,4 and also developed a scoring algorithm for the two sub-scales that can be derived from the SF-36 – the SF-36 Physical and Mental Component Summary Scores.28

The Health Survey for England was designed to provide annual data about the nation’s health. The 1996 survey included the SF-36 and focused on respiratory disease.11 The survey was based on a random sample of 720 postcode sectors, and a random sample of 12 960 addresses from the postcode file, after stratification for socio-demographic factors. Within each household, all persons aged two and over were eligible for inclusion in the survey. Interviews were obtained with 20 328 people; 16 443 were with those aged 16+ (75 per cent response rate for these adults), and this group were asked to complete the SF-36. The characteristics of respondents were broadly similar in age, sex and social class, to those in the 1991 Census, although the sample slightly under-represented men. Comparison with mid-1995 population estimates shows that men aged 16–34 were slightly under-represented. A major advantage of this survey over earlier datasets is that, apart from covering England as a whole, the sample included sizeable numbers of older people: in the households that agreed to co-operate with the survey, 928 men and 1121 women (n = 2049) were aged 65–74, and 573 men and 903 women (n = 1476) were aged 75+; between 96 and 98 per cent of these groups of older people were interviewed, and between 91 and 96 per cent of people in these age groups in the co-operating households completed the SF-36. This sample, then, represents the largest representative dataset for older people in England.
Measures

The questions for the Omnibus Survey included the anglicized, UK version of the SF-36,4 the ONS Omnibus standard sociodemographic items, and items on self-reported illness or injury that restricted usual activities during the last 2 weeks, long-term illness that limited daily activities, together with single items on health service use (in last 2 weeks, respondent talked to a doctor, attended casualty or A&E department (apart from antenatal or postnatal visits), or has been an in-patient).

The SF-36 contains 36 items: 35 items within eight dimensions, namely Physical Functioning (ten items); Social Functioning (two items); Role Limitations due to physical problems (four items); Role Limitations due to emotional problems (three items); Mental Health (five items); Energy/Vitality (four items); Pain (two items) and General Health Perceptions (five items); and one item on perceived changes in health status in the past 12 months. The scoring method for the SF-36 involves recoding, summing and transforming dichotomous (‘yes’ or ‘no’) and ranked (e.g. ‘none’ to ‘very severe’) response categories for the eight dimensions using a scoring algorithm, into a scale ranging from zero (worst possible health state) to 100 (best possible health state). The results for the eight dimensions have conventionally been reported as means, rather than frequency distributions.2,29 This has been done for pragmatic reasons, to optimize the ability to make easier comparisons of results across studies, and on the grounds that the treatment of the data as interval level has minimal effects on most statistical procedures, although this has been the subject of debate.30,31 Two summary scores – Physical and Mental Health Component Summary Scores – can also be calculated, in each case using a formula that involves multiplying each SF-36 scale z-score by its respective factor score coefficient.4,28,29

The ONS version of the SF-36 contained one difference.
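The transformation of a summed dimension score onto the zero to 100 scale can be sketched as follows. This is a minimal illustration that assumes a simple linear rescaling between the lowest and highest possible raw scores; the function name is hypothetical, the item responses are invented, and the ten-item, three-category coding used for Physical Functioning is an illustrative assumption rather than a reproduction of the published scoring algorithm.

```python
def transform_to_0_100(raw_score, lowest_possible, highest_possible):
    """Linearly rescale a summed raw score to 0 (worst) .. 100 (best)."""
    if highest_possible <= lowest_possible:
        raise ValueError("highest_possible must exceed lowest_possible")
    if not lowest_possible <= raw_score <= highest_possible:
        raise ValueError("raw_score lies outside the possible range")
    return 100.0 * (raw_score - lowest_possible) / (
        highest_possible - lowest_possible
    )


# Hypothetical respondent: ten Physical Functioning items, each assumed to be
# recoded from 1 (limited a lot) to 3 (not limited at all).
pf_items = [3, 3, 2, 3, 3, 2, 3, 3, 3, 3]
pf_raw = sum(pf_items)                            # summed raw score
pf_score = transform_to_0_100(pf_raw, 10, 30)     # possible raw range 10..30
```

A respondent with the lowest possible raw score would receive 0 and one with the highest possible raw score would receive 100, which is what allows the eight dimensions to be compared on a common scale.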
In the Mental Health dimension question asking ‘How much time during the past month ... have you felt calm and peaceful?’, the word ‘peaceful’ was changed to ‘cheerful’. Given that question wording can affect response, it is possible that this may lead to some differences in response between this and the original item included in other surveys, and interpretation of results should be cautious for this item.

In the tables presented, the authors have focused on the magnitude of the score differences, rather than statistical significance, given that very small differences are likely to be statistically significant with such large sample sizes.

Results

Table 1 shows the socio-demographic characteristics of the 1992 Omnibus Survey sample, which are comparable with the characteristics of the adult sample for the 1992 General Household Survey (GHS) in Britain.32 Checks which the ONS makes on non-response bias for the GHS indicate that this is small.33 The characteristics of respondents are similar across each data source, except for reporting of a long-term health problem, which can probably be explained by differences in question wording. The characteristics of the respondents also compare well with 1991 Census figures (mid-term 1992 population estimates) for Britain (case estimates based on a 10 per cent random sample).

Table 2 confirms the results of the earlier research in the UK and USA2,3,5,6,22 showing good internal consistency of the SF-36 dimension scores, with Cronbach’s alphas exceeding 0.80 for each dimension except Social Functioning and Health Perceptions. The table presents the Cronbach alpha coefficients for the sample compared with those reported from five other sources. Results were similar across all six samples. Kolmogorov–Smirnov tests were also highly significant for each of the eight domains, and are shown in Table 2. These results confirm the highly skewed nature of the distributions (see Fig.
1), which is a problematic feature of all health status scales.

Table 3 shows the mean and standard deviations for the eight SF-36 dimensions for the Omnibus sample by their age, sex, social class, limitations on activities, long-term health problems and health service use. Item response was high for all SF-36 dimensions, as would be expected in an interview-based survey. The eight dimension scores were able to distinguish between those in the highest and lowest social classes; those in younger and older age groups; males and females; those with and without a longstanding illness; and users and non-users of health services; thus supporting the construct validity of the SF-36. These results support earlier research.2,4,5

Table 1 Socio-demographic characteristics of sample*

Columns: 1992 Omnibus sample, Adults %; 1992 Omnibus sample, Adults (no.); 1992 GHS sample, Adults %; 1991 Census mid-term population estimates 1992, Adults %.

                                        Omnibus %   Omnibus (no.)   GHS %     Census %
Sex
  Male                                  45          (929)           47        49
  Female                                55          (1122)          53        51
Age
  16<25                                 10          (204)           14        16
  25<45                                 36          (735)           36        36
  45<55                                 15          (302)           16        15
  55<65                                 15          (297)           13        15
  65<75                                 13          (281)           12        11
  75+                                   11          (217)           8         9
Household size
  1                                     25          (513)           26        26
  2                                     38          (770)           33        34
  3–11                                  37          (773)           40        40
Ethnic group
  White                                 97          (1981)          95        95
  Black and other                       3           (68)            5         5
Region of residence
  The North and North West              25          (520)           26        26
  Midlands and East Anglia              20          (416)           20        20
  Greater London                        12          (236)           11        12
  South East                            20          (407)           19        19
  South West & Wales                    14          (288)           14        14
  Scotland                              9           (189)           9         9
Social class†
  I professional                        4           (73)            4         5
  II intermediate                       24          (483)           24        28
  IIInm skilled non-manual              24          (476)           25        23
  IIIm skilled manual                   23          (448)           21        22
  IV semi-skilled                       17          (341)           17        16
  V unskilled                           8           (151)           8         6
  I–IIInm combined                      52                          53
Housing tenure
  Owner occupier                        26          (528)           25        –
  Owner–mortgage                        44          (902)           42        –
  Rents local/housing authority         22          (452)           25        –
  Rents privately                       8           (167)           7         –
Marital status
  Married/cohabiting                    59          (1124)          65        –
  Single                                18          (368)           20        –
  Widowed                               13          (272)           9         –
  Divorced/separated                    9           (191)           6         –
Age completed full-time education
  <14                                   23          (463)           13        –
  15<19                                 64          (1319)          69        –
  19+                                   13          (274)           18        –
Economic status
  Working full time                     37          (747)           41        –
  Working part time                     13          (254)           15        –
  Unemployed                            5           (94)            6         –
  Inactive                              45          (900)           37        –
Contact with GP in last 2 weeks
  Yes                                   20          (420)           16        –
  No                                    80          (1629)          84        –
Out-patient in last 3 months
  Yes                                   17          (355)           15        –
  No                                    83          (1694)          85        –
In-patient in last year
  Yes                                   13          (257)           10        –
  No                                    87          (1792)          90        –
Long-term health problem‡
  Yes                                   22          (452)           37        –
  No                                    78          (1594)          63        –
Acute health problem
  Yes                                   16          (320)           13        –
  No                                    84          (172)           87        –

Number of respondents*: Omnibus 1972–2056; GHS 11 385–19 274; Census 45 110 470.

*Totals do not all equal 100% as a result of weighting.
†Social class in Census was derived from a question on paid job in last 10 years; social class in Omnibus and GHS was derived on ‘current main job’ and ‘last main job’.
‡GHS: ‘Longstanding illness, disability or infirmity’.

Table 4 compares the Omnibus (interview) Survey SF-36 dimension scores by sex and age group with those from the central England postal survey (the normative dataset most often used for comparisons in the UK) and with the Health (interview) Survey for England. The total scores for the three samples show that the postal central England survey had the lowest (worst) mean scores for Role Limitations (emotional), Mental Health and Energy/Vitality. The Health (interview) Survey for England had the lowest (worst) total mean scores for Physical Functioning, Social Functioning, Role Limitations (physical), Bodily Pain and Health Perceptions. Finally, the Omnibus Survey had the highest mean scores for Physical Functioning, Social Functioning, Mental Health, Energy/Vitality and General Health Perceptions.
The Health Survey for England had lower (worse) total mean scores than both of the other surveys on five of the eight dimensions. Tri-variate analyses within age groups showed that, for most age groups, the SF-36 scores for the three dimensions of Physical Functioning, Bodily Pain and General Health Perceptions were lowest (worst health) in the Health (interview) Survey for England, intermediate in the central England postal survey, and highest (best health) in the Omnibus (interview) Survey. Patterns for Social Functioning were in the same direction but became less consistent with increasing age. For younger age groups, the central England postal survey scores were lower (worse health) than the scores for the two interview surveys on Energy/Vitality, Role Limitations (emotional) and Mental Health. Differences were less evident or consistent for Role Limitations (physical).

Multiple linear regression analyses showed that the slight social class differences between the samples did not account for the variance in the SF-36 dimension scores between the studies; nor did any variation in the age and sex distributions in the samples (tables available from the authors).

Discussion

Regional norms are useful for comparing with local datasets, given that health status varies by area of residence. National norms are also required for making comparisons with national and with (as a yardstick) local datasets. The ONS Omnibus Survey data provided the opportunity to calculate norms for the SF-36 based on a random sample of the population in Britain, and to make comparisons with existing norms for the UK derived from the regional central England survey and the Health Survey for England 1996.
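As an illustration of how a study group might be compared with a norm, the following minimal Python sketch computes a one-sample z statistic and a standardized difference. The norm values are taken from Table 4 (Physical Functioning, males aged 55–64: Omnibus Survey mean 79.0, SD 25.6; central England postal survey mean 80.0, SD 22.1). The study group (mean 70.0, n = 100) and the function name `compare_with_norm` are hypothetical, and the calculation assumes simple random sampling from a population with the norm's mean and SD; it is not the analysis used in this study.

```python
import math


def compare_with_norm(sample_mean, sample_n, norm_mean, norm_sd):
    """Return (z, standardized difference) for a study mean against a norm.

    Assumes the study group is a simple random sample and that the norm SD
    is a reasonable stand-in for the population SD (illustrative only).
    """
    # Standard error of the study mean under the norm population
    se = norm_sd / math.sqrt(sample_n)
    z = (sample_mean - norm_mean) / se
    # Difference expressed in units of the norm SD
    d = (sample_mean - norm_mean) / norm_sd
    return z, d


# Hypothetical study group: Physical Functioning, males aged 55-64
STUDY_MEAN, STUDY_N = 70.0, 100

# Norms (mean, SD) for the same age-sex group, from Table 4
NORMS = {
    "Omnibus (interview)": (79.0, 25.6),
    "Central England (postal)": (80.0, 22.1),
}

results = {
    name: compare_with_norm(STUDY_MEAN, STUDY_N, mean, sd)
    for name, (mean, sd) in NORMS.items()
}

for name, (z, d) in results.items():
    print(f"{name}: z = {z:.2f}, standardized difference = {d:.2f}")
```

With these inputs the same hypothetical group lies about 0.35 SD below the Omnibus norm and about 0.45 SD below the postal norm, which shows how the choice of norm can change the apparent size of a difference.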
Table 2 Internal consistency and skewness of SF-36 and comparison of Cronbach’s alpha coefficients with other studies

Column headings: Dimension; No. of items; ONS (OS) 1992 (Cronbach’s α); SF-36 Manual2 Table 8.2 (Cronbach’s α); Jenkinson et al.3 (Cronbach’s α); Brazier et al.5 (Cronbach’s α); McHorney et al.22 (Cronbach’s α)*; Garratt et al.6 (Cronbach’s α); Skewness, Kolmogorov–Smirnov, ONS (OS) 1992; No. of respondents, ONS (OS) 1992

Dimension (no. of items): Physical Functioning (10); Role Limitations (phys) (4); Bodily Pain (2); General Health Perceptions (5); Energy/Vitality (4); Social Functioning (2); Role Limitations (emotional) (3); Mental Health (5)

K–S Z = 22.10, p < 0.001, 2036; K–S Z = 6.15, p < 0.001, 2040; 0.89 0.81 0.82 0.84; K–S Z = 5.59, p < 0.001, 2019; K–S Z = 5.13, p < 0.001, 2040; K–S Z = 19.89, p < 0.0001, 2037; 0.82 0.81 0.73 0.81 0.86 0.68; K–S Z = 16.45, p < 0.001, 2035; K–S Z = 11.71, p < 0.001, 2041; 0.92 0.84 0.89 0.90; 2047, K–S Z = 11.75, p < 0.01; 0.93 0.93
Jenkinson et al.3: 0.80 0.83 0.80 0.85 0.76 0.88 0.82 0.90
Brazier et al.5: 0.96 0.95 0.95 0.96 0.73 0.96 0.85 0.93
McHorney et al.22: 0.81/0.85 0.82/0.84 0.82/0.76 0.87/0.84 0.63/0.78 0.89/0.89 0.88/0.89 0.93/0.92
0.89 0.86 0.83 0.86 0.80 0.86 0.86 0.92

* Postal/interview survey.

The data show that, controlling for age and sex, many of the SF-36 dimension means differed between the three datasets, with the Omnibus Survey sample having better total health status means for Physical Functioning, Social Functioning, Mental Health, Energy/Vitality and General Health Perceptions. The Health (interview) Survey for England had the lowest (worst) total mean scores for Physical Functioning, Social Functioning, Role Limitations (physical), Bodily Pain and Health Perceptions. The postal central England survey had the lowest (worst) total mean scores for Role Limitations (emotional), Mental Health, and Energy/Vitality. Analyses by age showed that the differences were greatest for younger age groups.
These may be more sensitive areas for younger people, and thus interviews may suffer more from social desirability bias (resulting in inflated SF-36 scores) than postal surveys. Although these observations are based on three different samples, with potential for the effects of selection and response bias, regression analyses suggested that such bias is unlikely to account substantially for the observed differences.

One possible explanation for the differences in SF-36 dimension means between these surveys is the different methods of questionnaire administration used. The central England survey was a postal survey, whereas the ONS Omnibus Survey and the Health Survey for England were personal interview surveys. It is possible that the more anonymous postal approach, free from contamination by interviewer effects, may have led to higher (and more accurate) reporting of morbidity, particularly in the more sensitive areas of mental and emotional health. This is consistent with the literature showing that under-reporting of health problems (perceived as undesirable characteristics) is more likely in interview situations than with self-administered questionnaires.34 It is also consistent with the findings of Perkins and Sanson-Fisher,14 Lyons et al.19 and McHorney et al.22 that interview surveys using the SF-36 (face-to-face and telephone) lead to under-reporting of problems in comparison with postal approaches, particularly for Mental Health and Role Limitations (emotional).

Thus it appears that social desirability bias during the Omnibus interviews may partly explain the better reported health status of the Omnibus Survey interview sample in comparison with the central England survey postal sample on most SF-36 dimensions. However, this explanation does not account for the even poorer reported health status of the respondents interviewed for the Health Survey for England 1996, in comparison with both the postal survey and the other interview survey.
This sample had the lowest (worst) total means for Physical Functioning, Social Functioning, Role Limitations (physical), Bodily Pain and Health Perceptions. The explanation for this inconsistency is likely to be found in both contextual and question order effects. The Omnibus Survey was unlikely to sensitize respondents to health at the outset of the questionnaire because the context of the survey was generic. Thus it would be expected that self-reported health status would be better than that in the central England study, which placed the SF-36 in the context of a healthy life survey, and the Health Survey for England, which focused on disease (respiratory conditions), thereby sensitizing respondents to health issues from the outset. This explanation is supported by the observation that in the Health Survey for England, the number of respondents who reported a longstanding illness was 3–4 per cent higher than in the annual GHSs in Britain, conducted by the ONS.32,33 Investigators at the ONS have reported that more positive responses to this item are obtained when asked in the context of health surveys, in comparison with the GHSs, supporting a contextual bias; this is also supported by data from their Omnibus Surveys.35,36 This explanation is supported by the methodological literature,20 and is likely to apply to the differences in SF-36 scores.

Question order effects (position of the SF-36 in the wider questionnaire; e.g. before or after other questions on health) may also account for some of the observed differences. It was pointed out in the Introduction that the principles of questionnaire design recommend asking general questions before specific ones, to minimize bias from order effects. If disease-specific questions are asked before a generic health status scale, then the generic health status ratings are likely to be more favourable because the disease items had already been considered by respondents and therefore excluded in replies to the generic items. The central England study correctly placed the SF-36 at the beginning of the survey questionnaire, but this would minimize order effects only if respondents did not read through the (postal) questionnaire before responding. The ONS Omnibus Survey administered the SF-36 to respondents in the middle of a lengthy interview about a range of non-health topics (e.g. savings, household characteristics). In contrast, the Health Survey for England in 1996 administered the SF-36 towards the end of a lengthy interview about respiratory and other specific problems with health. This positioning had the potential for creating bias from order effects but does not explain the poorer health status scores obtained in this survey.

In conclusion, it is necessary to assess not only the mode of administration, but also the positioning of the SF-36, as with any health status scale, within interview-based questionnaires, and the context of the study. These results indicate the importance of providing population norms for the various modes of questionnaire administration, and also of taking account of contextual and question order effects. As McHorney et al.22 concluded, the varying norms for the different modes of questionnaire administration should not be regarded as any more or less accurate or valid; they are simply different, and should be regarded as relative rather than absolute data. The provision of different norms for different modes of administration for scales is essential so that investigators can make comparisons with appropriate norms and not wrongly interpret their data. Where appropriate norms are not available, awareness of this problem is required so that researchers may consider whether to make adjustments when making comparisons with existing norms.

Acknowledgements

We are grateful to members of the King’s Fund Institute for commissioning the study, and to the Office for National Statistics for conducting it and depositing it on the Data Archive, as well as to the staff of the Data Archive for granting us access as authorized users. We are particularly grateful to Fiona Dawe of the ONS and Cathy Cooper at the Data Archive for their help in accessing the dataset, Lee Marriott for typing the tables and Dr Ronan Lyons at the Welsh Combined Centres for Public Health for helpful advice. Crown Copyright 1992. Used with permission of the Office for National Statistics.

Figure 1 Histograms of SF-36 dimensions with normal plot.

Table 3 Mean (SD) scores for the SF-36 dimensions by age and sex and social class and health variables

                  Physical Functioning   Role Limitations (Physical)   Bodily Pain           General Health Perceptions
Total score       Mean (SD) (n)          Mean (SD) (n)                 Mean (SD) (n)         Mean (SD) (n)
Age
  16–24           95.5 (12.1) (204)      90.7 (24.5) (202)             86.0 (22.6) (203)     78.6 (17.7) (204)
  25–34           94.5 (13.5) (415)      87.5 (29.8) (414)             86.7 (22.6) (414)     79.4 (17.9) (416)
  35–44           93.3 (13.4) (319)      88.2 (28.0) (318)             84.9 (21.8) (318)     75.5 (19.5) (319)
  45–54           87.2 (20.9) (297)      83.5 (33.8) (296)             80.7 (25.5) (298)     71.3 (23.4) (297)
  55–64           78.0 (26.3) (297)      72.7 (40.6) (295)             74.4 (28.6) (296)     65.3 (26.5) (295)
  65–74           72.7 (26.7) (281)      72.6 (39.5) (279)             75.5 (27.2) (279)     65.0 (24.1) (279)
  75–84           57.9 (28.6) (296)      62.2 (42.1) (176)             70.3 (29.5) (178)     58.7 (24.5) (175)
  85+             39.3 (31.5) (36)       65.0 (42.5) (35)              65.7 (30.5) (35)      60.3 (22.2) (34)
Sex
  Male            86.3 (22.5) (925)      82.7 (33.7) (921)             83.2 (24.4) (922)     71.9 (23.0) (921)
  Female          81.8 (25.7) (1117)     79.0 (36.5) (1109)            78.1 (26.8) (1114)    71.3 (22.8) (1113)
Social class
  I Prof          89.9 (18.0) (73)       83.5 (33.3) (71)              83.7 (21.0) (73)      76.7 (18.9) (72)
  II Semi-prof    85.5 (19.2) (481)      86.4 (30.1) (480)             85.2 (22.1) (481)     75.5 (19.4) (479)
  IIInm           85.8 (22.6) (474)      82.2 (33.8) (473)             80.4 (24.5) (474)     72.9 (22.0) (473)
  IIIm            82.0 (26.0) (447)      77.1 (38.2) (443)             79.0 (27.2) (444)     69.7 (23.9) (446)
  IV              79.9 (27.3) (339)      77.2 (36.8) (337)             75.5 (28.7) (337)     68.4 (25.2) (338)
  V               75.7 (29.5) (151)      75.7 (40.1) (149)             75.6 (28.7) (151)     65.6 (25.2) (150)
Cut down activities because of illness (in last 2 weeks)
  Yes             63.0 (33.4) (319)      27.7 (37.4) (316)             49.8 (29.8) (318)     51.5 (27.5) (316)
  No              87.7 (20.2) (1722)     90.4 (24.7) (1713)            86.1 (20.6) (1717)    75.3 (19.9) (1717)
Long-term health problem
  Yes             52.3 (28.9) (449)      44.1 (42.5) (444)             56.0 (30.2) (447)     44.8 (23.5) (444)
  No              92.7 (13.1) (1590)     90.9 (24.7) (1583)            87.4 (19.6) (1586)    79.0 (16.1) (1587)
Attended A&E or hospital out-patient (in past 3 months)
  Yes             71.5 (30.8) (353)      60.4 (43.0) (353)             65.3 (30.9) (353)     60.4 (27.0) (351)
  No              86.4 (22.1) (1689)     84.9 (31.9) (1677)            83.6 (23.5) (1683)    73.9 (21.2) (1683)
Been hospital in-patient (in past 12 months)
  Yes             70.2 (31.3) (256)      58.2 (44.0) (255)             67.6 (31.8) (256)     59.7 (27.4) (256)
  No              85.8 (22.7) (1786)     83.9 (32.7) (1775)            82.3 (24.4) (1780)    73.2 (21.7) (1778)

Table 3 contd

                  Energy/Vitality        Social Functioning            Role Limitations (emotional)   Mental Health
Total score       Mean (SD) (n)          Mean (SD) (n)                 Mean (SD) (n)                  Mean (SD) (n)
Age
  16–24           68.3 (19.0) (203)      91.5 (17.2) (203)             91.1 (24.6) (203)              79.6 (15.2) (203)
  25–34           66.8 (18.4) (415)      91.1 (16.9) (413)             90.3 (26.2) (414)              77.2 (16.2) (415)
  35–44           64.9 (18.2) (319)      90.5 (19.4) (317)             88.2 (28.7) (317)              76.0 (18.3) (319)
  45–54           63.0 (23.1) (296)      87.7 (22.8) (297)             88.4 (28.6) (297)              75.6 (20.2) (297)
  55–64           61.8 (24.6) (297)      84.7 (25.6) (297)             81.8 (35.8) (296)              76.0 (20.2) (297)
  65–74           63.0 (24.1) (279)      85.8 (25.6) (279)             88.1 (29.8) (280)              80.0 (18.3) (279)
  75–84           54.2 (24.9) (176)      75.9 (29.1) (176)             82.8 (34.7) (174)              75.6 (19.5) (176)
  85+             52.1 (20.8) (35)       73.7 (24.7) (35)              87.6 (32.4) (35)               73.9 (17.8) (34)
Sex
  Male            67.2 (21.7) (923)      88.8 (21.4) (920)             90.4 (26.6) (912)              79.5 (17.7) (920)
  Female          60.6 (21.6) (1112)     86.2 (23.4) (1112)            85.3 (32.0) (1110)             75.1 (18.6) (1115)
Social class
  I Prof          69.0 (18.5) (73)       90.7 (17.8) (73)              93.4 (17.5) (73)               81.2 (15.3) (73)
  II Semi-prof    65.9 (19.3) (480)      91.4 (18.3) (479)             89.7 (26.2) (481)              79.0 (16.7) (481)
  IIInm           62.8 (19.7) (474)      88.0 (21.1) (473)             88.7 (28.2) (472)              76.6 (16.6) (475)
  IIIm            64.3 (23.1) (445)      86.2 (23.0) (445)             88.2 (29.5) (443)              78.7 (18.6) (444)
  IV              60.5 (24.8) (338)      84.3 (26.0) (337)             83.1 (34.3) (338)              74.1 (20.7) (338)
  V               60.7 (25.2) (149)      81.9 (27.7) (149)             82.8 (36.5) (149)              73.8 (21.1) (149)
Cut down activities because of illness (in last 2 weeks)
  Yes             44.0 (24.8) (315)      60.8 (30.9) (316)             66.4 (45.1) (316)              66.3 (21.8) (316)
  No              67.1 (19.3) (1719)     92.3 (16.6) (1715)            91.6 (23.9) (1714)             79.1 (16.9) (1718)
Long-term health problem
  Yes             44.1 (23.9) (446)      65.3 (30.5) (445)             71.6 (41.7) (444)              67.5 (22.1) (446)
  No              69.0 (17.9) (1586)     93.5 (14.9) (1584)            92.1 (23.5) (1584)             79.8 (16.1) (1586)
Attended A&E or hospital out-patient (in past 3 months)
  Yes             54.2 (25.0) (351)      74.7 (30.2) (352)             79.5 (37.7) (352)              72.5 (20.3) (351)
  No              65.5 (20.7) (1684)     90.0 (19.7) (1680)            89.3 (27.5) (1679)             78.1 (77.7) (1684)
Been hospital in-patient (in past 12 months)
  Yes             50.9 (26.1) (254)      72.1 (31.9) (255)             79.9 (37.2) (255)              71.3 (21.5) (255)
  No              65.4 (20.6) (1781)     89.5 (20.1) (1777)            88.7 (28.4) (1777)             77.9 (17.7) (1780)
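Table 4 reports dispersion as the standard error of the mean (SEM) for the Health Survey for England but as the standard deviation (SD) for the other two surveys, so the bracketed figures are not directly comparable. A minimal Python sketch of the usual conversion, SD = SEM × √n, is given here. It assumes that the published SEM was not adjusted for survey design, that the total mean for Physical Functioning is based on the sum of the male and female bases shown in Table 4 (7294 + 8760), and that rounding of the SEM to two decimals is acceptable. The helper name `sd_from_sem` is hypothetical.

```python
import math


def sd_from_sem(sem, n):
    """Approximate SD from a reported SEM, assuming SEM = SD / sqrt(n)."""
    return sem * math.sqrt(n)


# HSE Physical Functioning, total sample: mean 81, SEM 0.21 (Table 4)
hse_n = 7294 + 8760  # male + female bases; assumed base for the total mean
hse_sd = sd_from_sem(0.21, hse_n)

print(f"HSE Physical Functioning: n = {hse_n}, approximate SD = {hse_sd:.1f}")
```

Under these assumptions the approximate SD is about 26.6, which can then be set beside the SDs reported for the postal and Omnibus samples.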
Table 4 Comparison of SF-36 dimension norms in Britain

Means for dimensions by age and sex

                      Health Survey for England    Oxford (Central England) Healthy    British ONS Survey 1992
                      (HSE) 1996 (ages 16+)        Life Survey 1991–1992 (ages 18–64)  (ages 16+)
                      Mean (SEM)                   Mean (SD)                           Mean (SD)

Physical Functioning
Males
  16–24               92 (0.59)                    92.8 (16.8)                         94.8 (15.0)
  25–34               94 (0.40)                    93.9 (14.2)                         96.3 (10.6)
  35–44               91 (0.50)                    91.9 (14.5)                         93.5 (14.8)
  45–54               87 (0.61)                    87.9 (17.4)                         89.7 (19.1)
  55–64               76 (0.94)                    80.0 (22.1)                         79.0 (25.6)
  65–74               70 (1.00)                    n/a                                 76.2 (26.4)
  75–84               58 (1.28)                    n/a                                 65.4 (27.5)
  85+                 –                            n/a                                 52.3 (30.5)
  n                   7294                         3963                                916
Females
  16–24               90 (0.56)                    90.1 (16.4)                         96.1 (8.9)
  25–34               90 (0.47)                    92.9 (13.3)                         93.1 (15.2)
  35–44               87 (0.54)                    89.4 (16.1)                         93.2 (12.2)
  45–54               82 (0.61)                    84.8 (18.3)                         85.3 (21.3)
  55–64               72 (0.86)                    74.8 (23.5)                         77.2 (27.1)
  65–74               62 (0.92)                    n/a                                 70.0 (26.8)
  75–84               48 (1.10)                    n/a                                 52.9 (28.4)
  85+                 –                            n/a                                 32.0 (30.2)
  n                   8760                         4838                                1109
Total sample mean     81 (0.21)                    88.4 (17.9)                         89.6 (19.3)

Role Limitations (Physical)
Males
  16–24               92 (0.72)                    91.8 (22.6)                         90.6 (24.1)
  25–34               91 (0.70)                    92.0 (23.2)                         90.2 (26.0)
  35–44               89 (0.72)                    89.5 (25.5)                         88.4 (28.3)
  45–54               84 (0.91)                    87.6 (28.3)                         88.7 (28.5)
  55–64               75 (1.28)                    78.8 (36.1)                         70.0 (41.9)
  65–74               68 (1.42)                    n/a                                 74.4 (38.6)
  75–84               57 (1.89)                    n/a                                 68.7 (41.1)
  85+                 –                            n/a                                 75.0 (38.2)
  n                   7308                         4051                                917
Females
  16–24               90 (0.77)                    86.6 (25.5)                         90.8 (25.0)
  25–34               88 (0.67)                    86.9 (29.2)                         85.4 (32.3)
  35–44               84 (0.80)                    84.0 (32.0)                         88.1 (27.8)
  45–54               80 (0.94)                    82.4 (32.0)                         78.9 (37.3)
  55–64               72 (1.25)                    76.6 (36.9)                         75.2 (39.3)
  65–74               64 (1.30)                    n/a                                 71.2 (40.2)
  75–84               55 (1.51)                    n/a                                 57.9 (42.4)
  85+                 –                            n/a                                 59.1 (44.7)
  n                   8747                         5007                                1101
Total sample mean     80 (0.28)                    85.8 (29.9)                         84.2 (32.7)

Bodily Pain
Males
  16–24               82 (0.67)                    86.6 (17.9)                         86.8 (21.9)
  25–34               84 (0.60)                    87.5 (17.7)                         88.0 (21.6)
  35–44               82 (0.63)                    85.6 (19.7)                         86.0 (21.3)
  45–54               78 (0.71)                    81.8 (22.2)                         86.3 (22.5)
  55–64               74 (0.90)                    78.8 (23.6)                         75.6 (28.3)
  65–74               75 (0.93)                    n/a                                 78.8 (26.0)
  75–84               73 (1.19)                    n/a                                 75.7 (26.8)
  85+                 –                            n/a                                 82.1 (29.6)
  n                   7351                         5064                                916
Females
  16–24               80 (0.70)                    81.7 (20.8)                         85.4 (23.2)
  25–34               80 (0.59)                    82.1 (21.1)                         85.7 (23.2)
  35–44               78 (0.63)                    79.4 (22.0)                         84.0 (22.1)
  45–54               74 (0.68)                    77.4 (22.3)                         75.6 (26.6)
  55–64               69 (0.89)                    75.0 (25.1)                         73.2 (28.9)
  65–74               69 (0.87)                    n/a                                 72.9 (27.8)
  75–84               65 (1.05)                    n/a                                 66.7 (30.7)
  85+                 –                            n/a                                 56.1 (27.3)
  n                   8809                         5041                                1106
Total sample mean     77 (0.21)                    81.5 (21.6)                         82.5 (24.8)

General Health Perceptions
Males
  16–24               74 (0.63)                    77.2 (17.4)                         79.8 (17.9)
  25–34               74 (0.52)                    76.7 (17.7)                         79.7 (17.5)
  35–44               73 (0.53)                    74.1 (18.5)                         75.8 (19.4)
  45–54               69 (0.62)                    72.0 (20.1)                         72.7 (22.2)
  55–64               64 (0.80)                    68.1 (22.9)                         63.1 (26.9)
  65–74               62 (0.81)                    n/a                                 64.8 (24.2)
  75–84               61 (0.97)                    n/a                                 61.3 (26.7)
  85+                 –                            n/a                                 65.2 (23.9)
  n                   7301                         4031                                912
Females
  16–24               70 (0.58)                    72.1 (20.3)                         77.9 (17.6)
  25–34               74 (0.48)                    77.3 (18.5)                         79.1 (18.2)
  35–44               73 (0.54)                    74.1 (20.3)                         75.1 (19.6)
  45–54               70 (0.56)                    73.1 (19.9)                         70.3 (23.9)
  55–64               65 (0.72)                    68.0 (22.0)                         67.4 (26.1)
  65–74               64 (0.69)                    n/a                                 65.1 (24.1)
  75–84               61 (0.80)                    n/a                                 58.7 (23.0)
  85+                 –                            n/a                                 57.3 (21.0)
  n                   8715                         4959                                1105
Total sample mean     69 (0.17)                    73.5 (19.9)                         74.0 (21.9)

Energy/Vitality
Males
  16–24               69 (0.59)                    66.4 (17.1)                         73.0 (16.6)
  25–34               68 (0.49)                    64.5 (17.3)                         71.3 (16.8)
  35–44               66 (0.51)                    63.5 (18.6)                         67.8 (18.2)
  45–54               66 (0.53)                    62.9 (19.9)                         68.5 (22.4)
  55–64               64 (0.72)                    62.9 (20.3)                         63.4 (25.0)
  65–74               64 (0.75)                    n/a                                 65.0 (25.0)
  75–84               58 (0.94)                    n/a                                 57.7 (26.5)
  85+                 –                            n/a                                 56.9 (23.4)
  n                   7339                         4025                                914
Females
  16–24               63 (0.57)                    59.8 (19.4)                         64.1 (20.1)
  25–34               62 (0.47)                    58.3 (19.5)                         63.4 (18.9)
  35–44               61 (0.50)                    58.2 (19.9)                         62.4 (17.8)
  45–54               60 (0.53)                    59.4 (20.3)                         57.7 (22.4)
  55–64               60 (0.67)                    59.0 (21.4)                         60.5 (24.3)
  65–74               59 (0.66)                    n/a                                 61.5 (23.1)
  75–84               53 (0.82)                    n/a                                 51.9 (23.7)
  85+                 –                            n/a                                 49.3 (19.2)
  n                   8800                         4973                                1104
Total sample mean     63 (0.16)                    61.1 (19.6)                         64.7 (20.8)

Social Functioning
Males
  16–24               89 (0.61)                    90.2 (16.4)                         91.7 (21.5)
  25–34               90 (0.52)                    91.3 (16.3)                         93.2 (14.7)
  35–44               88 (0.56)                    90.5 (17.0)                         91.9 (18.0)
  45–54               87 (0.64)                    89.8 (18.7)                         91.2 (19.4)
  55–64               84 (0.83)                    86.9 (22.6)                         84.4 (26.1)
  65–74               83 (0.91)                    n/a                                 86.0 (25.1)
  75–84               79 (1.21)                    n/a                                 77.0 (28.0)
  85+                 –                            n/a                                 80.3 (24.1)
  n                   7354                         4073                                916
Females
  16–24               85 (0.64)                    90.2 (16.4)                         91.4 (17.4)
  25–34               86 (0.52)                    91.3 (16.3)                         89.5 (18.3)
  35–44               86 (0.57)                    90.5 (17.0)                         89.3 (20.5)
  45–54               85 (0.60)                    89.8 (18.7)                         84.9 (25.4)
  55–64               83 (0.79)                    86.9 (22.6)                         85.0 (25.2)
  65–74               82 (0.81)                    n/a                                 85.7 (26.1)
  75–84               78 (0.99)                    n/a                                 75.2 (29.9)
  85+                 –                            n/a                                 69.7 (24.8)
  n                   8813                         5051                                1104
Total sample mean     85 (0.19)                    88.0 (19.5)                         89.0 (20.8)

Role Limitations (emotional)
Males
  16–24               90 (0.80)                    82.9 (31.1)                         93.4 (19.7)
  25–34               90 (0.69)                    87.1 (27.9)                         93.2 (23.2)
  35–44               89 (0.74)                    86.0 (28.6)                         90.6 (26.2)
  45–54               86 (0.87)                    85.7 (29.5)                         91.7 (24.9)
  55–64               84 (1.07)                    85.8 (29.9)                         86.3 (31.5)
  65–74               81 (1.20)                    n/a                                 89.5 (28.0)
  75–84               77 (1.65)                    n/a                                 82.9 (35.3)
  85+                 –                            n/a                                 100.0 (0.0)
  n                   7294                         4056                                917
Females
  16–24               84 (0.93)                    78.8 (33.0)                         89.1 (28.1)
  25–34               85 (0.76)                    80.6 (34.0)                         88.0 (28.2)
  35–44               85 (0.79)                    80.3 (33.6)                         86.2 (30.6)
  45–54               84 (0.85)                    80.8 (33.6)                         85.1 (31.7)
  55–64               81 (1.09)                    83.3 (32.5)                         77.7 (38.7)
  65–74               78 (1.16)                    n/a                                 87.0 (31.1)
  75–84               75 (1.36)                    n/a                                 82.7 (34.4)
  85+                 –                            n/a                                 80.3 (39.4)
  n                   8732                         4011                                1002
Total sample mean     84 (0.25)                    82.9 (31.8)                         88.0 (29.1)

Mental Health
Males
  16–24               77 (0.53)                    74.8 (15.4)                         81.2 (15.5)
  25–34               78 (0.43)                    75.8 (15.2)                         80.1 (15.0)
  35–44               77 (0.44)                    75.0 (16.1)                         77.7 (18.1)
  45–54               77 (0.47)                    76.0 (16.7)                         79.9 (18.1)
  55–64               78 (0.56)                    78.0 (17.3)                         78.3 (19.5)
  65–74               78 (0.59)                    n/a                                 81.4 (18.0)
  75–84               79 (0.75)                    n/a                                 77.1 (19.4)
  85+                 –                            n/a                                 77.0 (24.3)
  n                   7333                         3984                                912
Females
  16–24               73 (0.50)                    70.2 (17.4)                         78.2 (14.9)
  25–34               73 (0.43)                    71.6 (15.2)                         75.0 (16.7)
  35–44               73 (0.43)                    71.6 (17.8)                         74.6 (18.3)
  45–54               74 (0.44)                    73.2 (18.2)                         71.5 (21.4)
  55–64               73 (0.56)                    74.4 (18.5)                         74.0 (20.5)
  65–74               74 (0.57)                    n/a                                 78.8 (18.5)
  75–84               75 (0.67)                    n/a                                 74.6 (19.6)
  85+                 –                            n/a                                 72.2 (13.4)
  n                   8794                         4946                                1107
Total sample mean     75 (0.14)                    73.8 (17.2)                         76.6 (18.3)

Sample sizes: CEHLS: completed mail questionnaires = 9332 adults aged 18–64 (72% response rate). Sample mirrored closely the characteristics of general population in 1981 Census and 1991 population estimates, although it slightly under-represented those in social classes I, II and IIInm. HSE: interviewed = 16443 adults aged 16+ (75% response rate).
Characteristics broadly similar to age, sex and social class of population in 1991 Census, although it slightly under-represented men. In comparison with mid-1995 population estimates men aged 16–34 were slightly under-represented. HSE provided data on ages 75+ collectively with no older age breakdown. ONS: interviewed = 2056 adults aged 16+ (78% response rate). The sample compared well with 1991 Census figures and mid-term 1992 population estimates. n/a, not applicable.

References

1 Ware JE, Sherbourne CD. The MOS 36 item short-form health survey (SF-36). 1: Conceptual framework and item selection. Med Care 1992; 30: 473–480.
2 Ware JE, Snow KK, Kosinski MA, Gandek MS. SF-36 health survey. Manual and interpretation guide. Boston, MA: New England Medical Center, The Health Institute, 1993.
3 Jenkinson C, Coulter A, Wright L. Short form 36 (SF 36) health survey questionnaire: normative data for adults of working age. Br Med J 1993; 306: 1437–1440.
4 Jenkinson C, Layte R, Wright L, Coulter A. The UK SF-36: an analysis and interpretation manual. A guide to health status measurement with particular reference to the Short Form 36 Health Survey. Oxford: University of Oxford, Department of Public Health and Primary Care, Health Services Research Unit, 1996.
5 Brazier JE, Harper R, Jones NMB, et al. Validating the SF36 health survey questionnaire: new outcome measure for primary care. Br Med J 1992; 305: 160–164.
6 Garratt AM, Ruta DA, Abdalla MI, Buckingham JK, Russell IT. The SF 36 health survey questionnaire: an outcome measure suitable for routine use within the NHS? Br Med J 1993; 306: 1440–1444.
7 Lyons RA, Lo SV, Littlepage BNC. Comparative health status of patients with 11 common illnesses in Wales. J Epidemiol Commun Hlth 1994; 48: 388–390.
8 Lyons RA, Fielder H, Littlepage BNC. Measuring health status with the SF-36: the need for regional norms. J Publ Hlth Med 1995; 17: 46–50.
9 Lyons RA, Crome P, Monaghan S, Killalea D, Daley JA. Health status and disability among elderly people in three UK districts. Age Ageing 1997; 26: 203–209.
10 Lyons RA, Perry HM, Littlepage BNC. Evidence for the validity of the short-form 36 questionnaire (SF-36) in an elderly population. Age Ageing 1994; 23: 182–184.
11 Prescott-Clarke P, Primatesta P, eds. Health survey for England, 1996. Vols 1 and 2. London: The Stationery Office, 1998.
12 Lamping D. When is a norm a norm? The representativeness of population norms for UK version of SF-36 (abstract). Qual Life Res 1997; 6: 675.
13 Gandek B, Ware JE, eds. Translating functional health and well-being: International Quality of Life Assessment (IQOLA) project studies of the SF-36 Health Survey. J Clin Epidemiol, Special Issue 1998; 51: 891–1214.
14 Perkins JJ, Sanson-Fisher RW. An examination of self- and telephone-administered modes of administration for the Australian SF-36. J Clin Epidemiol, Special Issue 1998; 51: 969–973.
15 Ware JE, Kosinski M, Keller SD. SF-36 physical and mental summary scales: a user’s manual. Boston, MA: The Health Institute, 1994.
16 Hayes V, Morris J, Wolfe C, Morgan M. The SF-36 Health Survey Questionnaire: is it suitable for use with older adults? Age Ageing 1995; 24: 120–125.
17 Hill S, Harries U. Assessing the outcome of health care for the older person in community settings: should we use the SF-36? Outcomes briefing. UK Clearing House for Health Outcomes 1994; 4: 26–27.
18 Parker SG, Peet SM, Jagger C, Farhan M, Castleden CM. Measuring health status in older patients. The SF-36 in practice. Age Ageing 1998; 27: 13–18.
19 Lyons RA, Wareham K, Lucas M, et al. SF-36 scores vary by method of administration: implications for study design. J Publ Hlth Med 1999; 21: 41–45.
20 Locander W, Sudman S, Bradburn N. An investigation of interview method, threat, and response distortion. J Am Statist Assoc 1976; 71: 269–274.
21 Siemiatycki M. A comparison of mail, telephone, and home interview strategies for household health surveys. Am J Publ Hlth 1979; 69: 238–245.
22 McHorney CA, Kosinski M, Ware JE. Comparisons of the costs and quality of norms for the SF-36 Health Survey collected by mail versus telephone interview: results from a national survey. Med Care 1994; 32: 551–567.
23 Hochstim JA. A critical comparison of three strategies of collecting data from households. J Am Statist Assoc 1967; 62: 976–989.
24 Wu AW, Jacobson DL, Berzon RA, et al. The effect of mode of administration on Medical Outcomes Study health ratings and EuroQol scores in AIDS. Qual Life Res 1997; 6: 3–10.
25 Bowling A. Research methods in health. Investigating health and health services. Buckingham: Open University Press, 1997.
26 Keller SD, Ware JE. Questions and answers about the SF-36 and SF-12. Med Outcomes Trust Bull 1996; 4: 3.
27 Barry MJ, Walker-Corkery E, Chang Y, et al. Measurement of overall and disease-specific health status: does the order of questionnaires make a difference? J Hlth Serv Res Policy 1996; 1: 20–27.
28 Jenkinson C, Layte R, Lawrence K. Development and testing of the SF-36 summary scale scores in the United Kingdom: results from a large scale survey and clinical trial. Med Care 1997; 35: 410–416.
29 Ware JE, Kosinski M, Bayliss MS, et al. Comparison of methods for scoring and statistical analysis of SF-36 health profiles and summary measures: summary of results from the Medical Outcomes Study. Med Care 1995; 33(Suppl. 4): AS264–AS279.
30 Julious SA, George S, Campbell J. Sample sizes for studies using the short form 36 (SF-36). J Epidemiol Commun Hlth 1995; 49: 642–644.
31 Lamping DL, Campbell KA, Schroter S. Methodological issues in combining items to form scales and analysing interval data (abstract). Qual Life Res 1998; 7: 621.
32 Thomas M, Goddard E, Hickman M, Hunter P. General household survey 1992. Office of Population Censuses and Surveys, Social Survey Division. London: HMSO, 1994.
33 Foster K, Jackson B, Thomas M, Hunter P, Bennett N. General household survey 1993. Office of Population Censuses and Surveys. London: HMSO, 1995.
34 Cannell CC, Groves RM, Miller PV. The effects of mode of data collection on health survey data. J Am Statist Assoc, Proc Section Social Statist 1981; 1: 1–6 (suppl.).
35 Breeze E, Maidment A, Bennett N, Flatley J, Carey S. Health survey for England 1992. Office of Population Censuses and Surveys. London: HMSO, 1994.
36 Bowling A. Health care rationing: the public’s debate. Br Med J 1996; 312: 670–674.

Accepted on 16 March 1999