Summary of results from testing of experimental Subjective Well-being questions - December 2012
Measuring Subjective Well-being, Measuring National Well-being Programme, ONS
www.ons.gov.uk/well-being

Background
From April 2011, ONS included four subjective well-being (SWB) questions covering evaluative, eudemonic and affect measures of well-being on the Annual Population Survey (APS) and the Opinions & Lifestyle Survey (OPN) [1]. The four questions are as follows:
• Overall, how satisfied are you with your life nowadays? (evaluative)
• Overall, to what extent do you feel the things you do in your life are worthwhile? (eudemonic)
• Overall, how happy did you feel yesterday? (affect)
• Overall, how anxious did you feel yesterday? (affect)
All are answered on a scale of 0 to 10 where 0 is ‘not at all’ and 10 is ‘completely’.
For more information on the four questions and the different ways of measuring subjective well-being please see Measuring Subjective Well-being in the UK on the ONS website: http://www.ons.gov.uk/ons/index.html
1. The questions included on the APS have not changed since their introduction in 2011. All testing work involves the OPN, with the exception of the month on month analysis, which used the APS.

Experimental Statistics
• As with all survey based questions, responses to subjective well-being questions can be affected by various methodological issues such as mode of interview, question order and context effects, question wording, and response scales.
• ONS is undertaking an iterative process of both quantitative and qualitative testing in order to investigate how well the subjective well-being questions capture the required information.
• In taking this work forward it should be recognised that measuring subjective well-being is still an ‘emerging’ science. Our aim at ONS is to develop a suite of questions which are both statistically robust and practical for wider roll-out across policy relevant surveys.
• ONS statistics have been labelled ‘experimental’ to signal their on-going refinement and development.
• This presentation reports the results of the second phase of quantitative testing carried out on the OPN survey. The first results can be found in section 8 of ‘Initial investigation into Subjective Well-being from the Opinions Survey’ (ONS, December 2011). For a full copy of the March 2012 cognitive testing report, see http://www.ons.gov.uk/ons/guide-method/user-guidance/well-being/advisory-groups/well-being-technicaladvisory-group/index.html

OPN Survey Testing
• The OPN Survey offers a flexible testing vehicle (1,000 respondents monthly; new module each month).
• Split trials carried out on the OPN in this phase of testing include: question order, alternative prompts/preambles for the ‘yesterday’ questions, question wording, show cards, and response variation across days of the week.
• Data from the APS were used to provide information on the extent to which responses varied across months of the year.
• We have previously looked at interviewer led versus self-completion interviews and face to face versus telephone interviews. For further information please see Annexes A, B and C.
• Additionally, Dolan et al (2012) investigated the differences between interviewer administered and telephone administered responses to the APS. http://eprints.lse.ac.uk/45273/

Question Order Analysis
• Responses to SWB questions can be affected both by the respondent’s current mood and the survey context.
• It is important to find out whether the order in which the different subjective well-being questions are asked affects the responses to the questions.
• For this reason, split trials were conducted in which the four subjective well-being questions were asked in different orders.

Question Order – 4 Samples

Testing Question Order
• Sample 1 replicates the order in which the SWB questions are asked in ONS surveys.
Key reasons for the existing question order include:
o ONS cognitive testing found that respondents preferred positive questions to be placed first, as they were seen as being easier to answer.
o The ‘life satisfaction’ question appears first because it is considered easiest to answer.
o The ‘satisfaction’ and ‘worthwhile’ questions both ask about respondents’ lives overall, and are therefore placed next to each other.
o The ‘happy’ and ‘anxious’ questions ask about the respondent’s emotions the previous day and are therefore asked next to each other.
o The ‘anxious’ question is considered to be the most negative and is therefore placed last. Its scale is also the reverse of the others, so putting it last avoids confusion.
• As the context in which the questions are asked can affect responses, it was important to test the questions in different orders. However, it should be noted that the ‘satisfaction’ question was asked first 3 out of 4 times and the ‘anxiety’ question was never asked first, for the reasons described above.

Question Order – Results
[Chart: Subjective well-being mean scores by question order sample (Samples 1 to 4) [1]]
• The differences between question orders were small, but the differences we do see are as we might have expected.
• When ‘life satisfaction’ followed the affect questions (Sample 3) its mean rating was lower (7.3) compared with the other orders (7.5).
• For the ‘worthwhile’ question the lowest mean of 7.6 was observed when it was asked after the ‘anxious’ question.
• The ‘happy yesterday’ question had the lowest mean (7.1) when asked after the ‘anxious yesterday’ question (Sample 4).
• However, none of these differences were found to be statistically significant. This suggests that the 4 different question orders tested here did not result in any systematic difference in how people answered the questions.
• For further information see annex D.
[Chart panels: ‘Overall, how satisfied are you with your life nowadays?’, ‘Overall, to what extent do you feel that the things you do in your life are worthwhile?’ and ‘Overall, how happy did you feel yesterday?’]
[Chart: Anxious mean scores by question order sample (Samples 1 to 4), ‘Overall, how anxious did you feel yesterday?’ [1]]
1. Data compiled using five months’ data: September, October and December 2011, and January and April 2012.

Testing Show Cards
• We tested show cards in the OPN testing, although these are not normally used in ONS surveys generally or for the well-being questions in the APS.
• The key reason for testing show cards was that ONS cognitive testing found that the scale for the ‘anxiety’ question was misunderstood by some respondents: they would give a high rating (indicating high anxiety) to this question when they had meant to give a low rating (indicating low anxiety).
Why was the scale for the ‘anxiety’ question misunderstood?
• For the ‘life satisfaction’, ‘worthwhile’ and ‘happy yesterday’ questions a 10 out of 10 is the best possible result and would indicate ‘complete satisfaction’ etc. For the ‘anxiety’ question, however, a rating of 10 out of 10 would indicate the most negative result, i.e. that the respondent was ‘completely anxious’.
• Cognitive testing indicated that when show cards were introduced they helped respondents to reverse the scale and therefore interpret the anxiety question correctly.
• The split trial testing was used to see what difference show cards would make to survey responses on a larger scale.
• Show cards were presented with the scale horizontally across the show card.

Show Cards Versus No Show Cards
[Chart: Life Satisfaction, Worthwhile and Happy Yesterday: Show cards and No Show cards, February/March 2012 [1]]
[Chart: Anxious Yesterday: Show cards and No Show cards, February/March 2012 [1]]
• Comparing responses when show cards were used versus when they were not, the means were significantly higher for ‘life satisfaction’ (at the 5% level) and ‘happy yesterday’ (at the 10% level) when show cards were used.
• There were no significant differences found between show cards and no show cards for either the ‘worthwhile’ or ‘anxious yesterday’ questions.
• The ‘worthwhile’ question seems to be the most stable with or without show cards (with a mean of 7.7 in both cases).
• These are surprising results given the previous ONS cognitive testing work (see previous slide). However, the sample used here was small, so further testing may be helpful.
• We have also examined the distribution of responses to each question in cases where statistically significant differences were found between the means with and without show cards. The patterns of response distributions did not differ in any notable way regardless of whether show cards were used.
• For further information see annex E.
1. Only two months’ data available and therefore based on small sample sizes.

Testing Different Versions Of The ‘Yesterday’ Questions
• Previous ONS cognitive testing found that some respondents did not like the reference to ‘yesterday’ when answering the ‘happy yesterday’ and ‘anxious yesterday’ questions, as it was not seen as being representative of their general state.
• This led to the trial of an additional instruction from the interviewer: ‘Please think about yesterday even if it was not a typical day’. This instruction was tested in two alternative ways:
o Sample 1 tested the instruction immediately after the question: ‘Overall, how happy did you feel yesterday? Please think about ‘yesterday’ even if it was not a typical day.’
o Sample 2 tested the instruction as a preamble to the two affect questions: ‘The next two questions ask about how you felt yesterday. Please think about ‘yesterday’ even if it was not a typical day. Overall, how happy did you feel yesterday?’

Happy Yesterday Variation Results
[Chart: Happy yesterday question variation, July/August/September 2012 [1]. Bars: instruction after the question (Sample 1) and instruction as preamble (Sample 2); mean 7.3 for both versions.]
• There was no significant difference between the mean scores for the ‘happy yesterday’ question regardless of which version of the preamble to the question was used.
• This indicates that the alternative ways of administering the preamble do not affect the way that respondents understand or answer the questions.
• We have not yet tested the ‘anxious yesterday’ question with the alternative preambles, or conducted a split trial of the ‘happy’ and ‘anxious’ yesterday questions with the addition of ‘please think about ‘yesterday’ even if it was not a typical day’ versus the original wording. This is planned for ongoing split trial testing work.
• For further information see annex F.
1. Data compiled using three months’ data.

Testing Alternatives To The ‘Life Satisfaction’ Question
No specific time frame is intended for the ‘life satisfaction’ question; rather, the aim of the question is to capture information from ‘more recent times’ in people’s lives. Three different endings of the question were tested:
• Overall, how satisfied are you with your life nowadays?
• Overall, how satisfied are you with your life these days?
• Overall, how satisfied are you with your life?
Why were these alternatives tested?
• ONS cognitive testing had found that some younger respondents felt the term ‘nowadays’ used in the original question is old fashioned.
• Additionally, it is important to test alternative versions of the ‘life satisfaction’ question because an international standard harmonised version of the question may not have exactly the same wording as the ONS question. It is therefore helpful to know whether other versions of the question will elicit similar responses or not.

Life Satisfaction ‘These Days’ Versus ‘Nowadays’ Results
[Chart: Life satisfaction question wording - November 2011 [1]. Bars: ‘Overall, how satisfied are you with your life nowadays?’ and ‘Overall, how satisfied are you with your life these days?’]
• ‘These days’ had a higher mean score of 7.5 compared to ‘nowadays’ at 7.4. This difference was not statistically significant.
• This finding would indicate that the change from ‘nowadays’ to ‘these days’ does not alter the way in which respondents answer this question, but the sample sizes were small so the findings should be interpreted cautiously.
• This would suggest that, even if respondents understand these terms in slightly different ways as indicated by cognitive testing, the two wordings do not on average produce very different results.
• For further information see annex G.
1.
Only one month’s data available and therefore based on small sample sizes.

Life Satisfaction ‘These Days’ Versus ‘No Ending’ Results
[Chart: Life satisfaction question wording - July/August 2012 [1]. Bars: ‘Overall, how satisfied are you with your life these days?’ and ‘Overall, how satisfied are you with your life?’]
• Asking the question with no time reference (i.e. no ending) resulted in a significantly higher mean score of 7.7 compared to 7.5 when a time reference of ‘these days’ was used.
• This is a significant difference at the 5% level, but is based on a small sample size and therefore should be interpreted cautiously.
• This finding suggests that the two versions of the question may collect slightly different information from respondents.
• However, the distributions of both variations follow a similar pattern.
• This difference could be due to the fact that the use of ‘these days’ focuses respondents more on recent times in their lives, whereas in the version with no time reference respondents may be considering their whole lives.
• Further investigations may be required in the future with larger sample sizes.
• For further information see annex G.
1. Only two months’ data available and therefore based on small sample sizes.

Life Satisfaction ‘Nowadays’ Versus ‘No Ending’ Results
• There was no difference found in the mean scores when the ‘life satisfaction’ question was asked with no time reference versus the ‘nowadays’ time reference: both versions of the question had a mean score of 7.7.
• Although the sample sizes were small here, this appears to suggest that there is no important difference between the way that respondents answer these two versions of this question.
• This is surprising given the significant difference in response to the ‘these days’ and ‘no ending’ versions of the question (see previous slide).
Again, this suggests there would be value in replicating this test to increase the sample sizes and enable greater confidence in the findings.
• For further information see annex G.
[Chart: Life Satisfaction question wording - September 2012 [1]. Bars: ‘Overall, how satisfied are you with your life nowadays?’ and ‘Overall, how satisfied are you with your life?’]
1. Only one month’s data available and therefore based on small sample sizes.

Testing Alternatives To The ‘Worthwhile’ Question
• The ‘eudemonic’ approach draws on self-determination theory and tends to measure such things as people’s sense of meaning and purpose in life. The original ONS eudemonic question is as follows: ‘Overall, to what extent do you feel that the things you do in your life are worthwhile?’
• Previous ONS cognitive testing suggested that answers to this question may be affected by social desirability bias, which could lead to people giving inflated scores.
• Additionally, some respondents with low levels of education did not understand the term ‘worthwhile’.
• For this reason the words ‘meaningful’ and ‘purpose’ were tested as alternatives to ‘worthwhile’. The question variations trialled were:
o Overall, to what extent do you feel that the things you do in your life are meaningful?
o Overall, to what extent do you feel that the things you do in your life have purpose?

‘Meaningful’ Versus ‘Purpose’ Question Results
[Chart: Testing alternatives to ‘worthwhile’ - July/August/September 2012 [1]]
• ‘Purpose’ had a higher mean score than ‘meaningful’ (7.7 and 7.5 respectively); this difference is significant at the 5% level. The distribution of responses followed a similar pattern for both versions of the question.
• These findings suggest that people interpret ‘purpose’ and ‘meaningful’ differently and are more likely to give a higher rating out of 10 when asked about the things they do in their life having ‘purpose’, compared with when they are asked about the things they do in their life being ‘meaningful’.
• No testing has yet been carried out comparing ‘meaningful’ with ‘worthwhile’.
• The terms ‘purpose’ and ‘worthwhile’ have previously been tested on the OPN Survey [2].
• For further information see annex G.
[Chart bars: ‘Overall, to what extent do you feel that the things you do in your life are meaningful?’ and ‘Overall, to what extent do you feel that the things you do in your life have purpose?’]
1. Data compiled using three months’ data.
2. http://www.ons.gov.uk/ons/guide-method/user-guidance/well-being/publications/previouspublications/index.html

Alternatives To The ‘Anxious Yesterday’ Question
• Previous ONS cognitive testing found differences in the way that respondents interpreted the terms ‘anxious’ and ‘worried’.
• Qualitative evidence suggested that ‘worry’ was considered by respondents to be less serious than ‘anxious’.
• The cognitive testing also found that some respondents felt that the term ‘anxious’ had a level of stigma attached to it, which could potentially lead to lower scores (indicating lower anxiety) being given for this question (due to social acceptability).
Two versions of the question were therefore trialled during OPN testing:
o Overall, how anxious did you feel yesterday?
o Overall, how worried did you feel yesterday?

Alternatives To ‘Anxious Yesterday’ Results
• ‘Worried yesterday’ had a higher mean score than ‘anxious yesterday’ (3.3 and 3.0 respectively); however, this difference was not statistically significant.
• This suggests that there is no systematic difference in how people respond to these questions in a survey context.
• ‘Anxious’ and ‘worry’ were previously compared in the ‘Initial investigation into subjective well-being from the Opinions Survey’, December 2011 [2]. In that analysis, however, the ‘anxiety’ question had a higher mean score than the ‘worry’ question (in December 2011 anxious was 3.6 and worried 3.1) [3].
• This difference could be due to the fact that only one month’s worth of data was used in the current analysis, or that previously the questions were administered as part of a suite of negative affect questions rather than a split trial.
• For further information see annex G.
[Chart: Anxious yesterday question wording - November 2011 [1]. Bars: ‘Overall, how anxious did you feel yesterday?’ and ‘Overall, how worried did you feel yesterday?’]
1. Only one month’s data available and therefore based on small sample sizes.
2. http://www.ons.gov.uk/ons/rel/wellbeing/measuring-subjective-wellbeing-in-theuk/investigation-of-subjective-well-being-data-from-the-ons-opinions-survey/initial-investigationinto-subjective-well-being-from-the-opinions-survey.html
3. The UK average for ‘anxious’ is 3.1; in the testing reported above it is 3.0.

Month On Month Analysis (APS)
• It is interesting to compare month by month subjective well-being data in order to investigate whether there are any significant fluctuations caused by seasons, weather or other national events which may have affected the whole population.
• For monthly analysis, the APS data were used rather than OPN data because of the larger sample size of the APS and the consistency in the administration of the four subjective well-being questions over time.
• Our expectation was that the ‘yesterday’ questions would show greater fluctuations month on month than the longer term assessments entailed in the ‘life satisfaction’ and ‘worthwhile’ questions.

Month On Month Results For Life Satisfaction, Worthwhile And Happy Yesterday
[Chart: Mean satisfaction with life, worthwhile, happy yesterday by month [1]]
1.
Data compiled using one year’s data from April 2011 to March 2012.
• A significant difference at the 5% level was found between months of the year for ‘life satisfaction’, although the mean scores varied very little across months (between 7.4 and 7.5).*
• No significant difference was found between months of the year for the ‘worthwhile’ question. For all months the mean scores for ‘worthwhile’ varied from 7.6 to 7.7.*
• A significant difference at the 1% level was found between months for the ‘happy yesterday’ question.
• For all months the mean scores for ‘happy yesterday’ fluctuated between 7.2 and 7.4 over the period.
• For further information see annexes H and I.
*Please note, whole (unrounded) numbers were used in significance tests.

Month On Month Results For Anxious Yesterday
[Chart: Mean anxious yesterday by month [1]]
• Responses to the ‘anxious yesterday’ question are considered separately to show differences in responses more clearly than would be possible if the full 11 point scale was used in the chart.
• A significant difference at the 1% level was found between months for the ‘anxious yesterday’ question.
• For all months the mean scores for ‘anxious yesterday’ ranged from 3.0 to 3.3 over the year.
• For further information see annexes H and I.
1. Data compiled using one year’s data from April 2011 to March 2012.

Events/Weather Conditions Which May Have Impacted Month On Month Results
There were significant differences for the ‘life satisfaction’, ‘happy yesterday’ and ‘anxious yesterday’ questions.
• Though there was a significant difference for ‘life satisfaction’, no national or global events were found to help explain the peaks or troughs over the 12 month period.
• There was a significant difference at the 1% level for the ‘happy yesterday’ question, with April 2011 and March 2012 being the ‘happiest’ months with a score of 7.4.
Possible reasons for this could include:
o A heat wave in April 2011 brought the warmest UK April for more than 100 years.
o The royal wedding resulted in an extra bank holiday for many.
o March 2012 was the 3rd warmest, 3rd sunniest and 5th driest since records began.
• The least happy month was November 2011 with a score of 7.2. Possible reasons for this are:
o Tuition fees demonstration in Westminster.
o Pension strikes across the UK.
• There was a significant difference at the 1% level for the ‘anxious yesterday’ question, with May 2011 being the most ‘anxious’ month with a score of 3.3. Possible reasons for this could include:
o In May 2011 Osama Bin Laden was killed.
o In May 2011 Iceland’s volcano erupted, resulting in an ash cloud and cancelled airline flights in the UK.
• December 2011 and March 2012 were the least ‘anxious’ months with a score of 3.0. A possible reason for this could be:
o In December 2011 it was announced that the UK economy grew slightly faster than expected in the third quarter.
• There were no significant differences found for the ‘worthwhile’ question.
In order to investigate monthly variation in more detail, several years’ worth of APS data would be needed. This is something ONS will re-visit when more data are available.

Days Of The Week Results
• Interviews are not normally conducted on a Sunday, therefore respondents answering on a Sunday were excluded from this analysis due to a very small sample size.
[Chart: ‘Overall, how satisfied are you with your life nowadays?’ mean scores by day of the week, Monday to Saturday [1]]
[Chart: ‘Overall, to what extent do you feel that the things you do in your life are worthwhile?’ mean scores by day of the week, Monday to Saturday [1]]
• Other studies have found significant differences in responses depending on which day of the week the questions were asked. Therefore we looked at whether that also arises in the OPN data.
• There were no significant differences found between days of the week for life satisfaction ratings. Little fluctuation was present over the week.
• There was a significant difference at the 5% level between days of the week for worthwhile ratings. The Tuesday mean score was significantly lower than those for Monday, Wednesday and Friday, and the Wednesday mean score was also significantly higher than Thursday’s.
• For more information on the significant differences between days of the week see annex J (Days Of The Week Analysis – Day On Day Significance Testing).
• For further information see annexes J and K.
1. Data compiled using one year’s data from September 2011 to June 2012.

Days Of The Week Results
The ‘happy’ and ‘anxious’ questions ask ‘how happy/anxious did you feel yesterday’. Therefore, when answering, respondents are thinking of the previous day. These charts show the day of the week that their answer referred to and not the day of the interview.
[Chart: ‘Overall, how happy did you feel yesterday?’ mean scores by day of the week referred to [1]]
• A significant difference at the 1% level was found between days of the week for the ‘happy yesterday’ question.
• People were significantly happier on a Sunday compared to other days of the week; however, this difference was not as strong between Sundays and Fridays.
• A significant difference at the 1% level was also found between days of the week for the ‘anxious yesterday’ question.
• The findings were similar to the ‘happy yesterday’ question: mean scores were significantly lower on a Sunday (meaning people were less anxious) compared to other days of the week. The difference was significant at the 1% level.
• Respondents were the most anxious on a Friday (3.3).
• For further information on significant differences see annexes J and K.
[Chart values, ‘Overall, how happy did you feel yesterday?’ [1]: Sunday 7.5; Monday, Tuesday, Wednesday and Thursday 7.2 each; Friday 7.3]
[Chart values, ‘Overall, how anxious did you feel yesterday?’ [1]: Sunday 2.7; Monday to Thursday between 3.1 and 3.2; Friday 3.3]
1. Data compiled using one year’s data from September 2011 to June 2012.

Days Of The Week – Overview of Results
• The findings showed that the affect questions are more volatile than the evaluative (life satisfaction) or eudemonic (worthwhile) questions.
• There is evidence of a weekend/weekday effect suggesting:
o that people are happier and less anxious at the weekend than during the week. However, there are fewer respondents on weekends (due to interviewing practices), which may have affected results.
• Interviews are not normally carried out on a Sunday, with a few exceptions. Respondents answering on a Sunday were therefore excluded from the days of the week analysis due to a very small sample size. This means that no data are available for respondents answering affect questions in relation to Saturdays.
• Our results replicate those of others showing a significantly different pattern of affect responses at the weekend than on weekdays; see the links below:
http://www.tandfonline.com/doi/abs/10.1080/17439760.2012.691980?tab=permissions
and
http://www.bbc.co.uk/news/health-19316104

Annexes

Alternative Question Overview – Annex A

Interviewer Led Versus Self-Completion Interviews Means and Response Rates – Annex B
For further information and a full copy of the Initial investigation into Subjective Well-being from the Opinions Survey see http://www.ons.gov.uk/ons/rel/wellbeing/measuring-subjective-wellbeing-in-the-uk/investigation-of-subjective-well-being-data-from-the-onsopinions-survey/index.html

Face To Face Versus Telephone Interviews By Average Means – Annex C
For further information and a full copy of the First ONS Annual Experimental Subjective Well-being Results see http://www.ons.gov.uk/ons/rel/wellbeing/measuring-subjective-wellbeing-in-the-uk/firstannual-ons-experimental-subjective-well-being-results/index.html
• Dolan, Paul and Kavetsos, Georgios (2012) analysed ONS data from the APS and found that individuals consistently reported higher subjective well-being over the phone than in face-to-face interviews.
‘Happy talk: mode of administration effects on subjective well-being’ http://eprints.lse.ac.uk/45273/

Question Order Analysis Response Rates and Sample Sizes – Annex D

Show Cards Analysis Response Rates and Sample Sizes – Annex E

‘Yesterday’ Question Analysis Response Rates and Sample Sizes – Annex F

Question Wording Analysis Response Rates and Sample Sizes – Annex G

Month On Month Analysis Response Rates and Sample Sizes – Annex H

Significant Differences Highlighted in the Month on Month Data – Annex I

Days Of The Week Analysis – Day On Day Significance Testing – Annex J

Days Of The Week Analysis Response Rates and Sample Sizes – Annex K

Notes on Significance Testing

Significance Tests used for Well-being Split Trial Analysis
Parametric statistical tests have been used to test the significance of the differences between different conditions. Parametric tests rely on assumptions about the shape of the distribution (i.e. assume a normal distribution) in the underlying population and about the form or parameters (i.e. means and standard deviations) of the assumed distribution. Nonparametric statistical procedures have the advantage of making no or few assumptions about the shape or parameters of the population distribution from which the sample was drawn; however, it is difficult to take account of the survey design when using these procedures. Therefore, parametric tests were used.

Studentised t-test
The t-test assesses whether the means of two groups are statistically different from each other.

F Test - ANOVA
The F-test in one-way analysis of variance (ANOVA) is used to assess whether the means of three or more groups are statistically different from each other. The advantage of the ANOVA F-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons.
The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others. In this situation it is necessary to carry out post hoc analysis. The formula for the one-way ANOVA F-test statistic is
$F = \frac{\text{explained variance}}{\text{unexplained variance}}$
or
$F = \frac{\text{between-group variability}}{\text{within-group variability}}$
Note that when there are only two groups for the one-way ANOVA F-test, $F = t^2$, where $t$ is the Studentised t statistic.

Two way, between subjects ANOVA
This is an extension of the F test ANOVA which deals with situations in which there is more than one condition. It was used to analyse the June data, in which there was a wording condition and a showcard condition. In this situation, there may also be an interaction effect. The analysis of variance can be presented in terms of a linear model,
$X_{ijk} = \mu + \alpha_i + \beta_j + \alpha_i\beta_j + \varepsilon_{ijk}$
where $\mu$ denotes the overall mean, $\alpha_i$ denotes the main effect of condition A (i.e. wording), $\beta_j$ denotes the main effect of condition B (i.e. showcards), and $\alpha_i\beta_j$ denotes the interaction effect, which captures an extra difference due to the combination of condition A and condition B that is not accounted for in the main effects.

Assumptions
All of these tests have the same assumptions:
• Independence of observations - the samples for the different conditions are independent of each other
• Normally distributed data
• Homoscedasticity - variances of the different populations are assumed to be equal.
The independence assumption has been met. Although the data are not normally distributed, the tests are robust with respect to non-normality if the skew in the data is not too extreme (as with these data). The variances of the different conditions are similar for most of the analysis. In situations where they were not, for example the days of the week analysis (where the variance of Sunday is different from the other days), that condition was removed for some of the analysis.
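
The relationship between the two tests described in these notes can be checked numerically. The following is a minimal sketch only, not the ONS procedure: it ignores survey weights and the survey design, uses made-up ratings, and the function names `pooled_t` and `one_way_f` are hypothetical. It computes the equal-variance two-sample t statistic and the one-way ANOVA F statistic (between-group mean square divided by within-group mean square) and shows that, for two groups, F equals t squared.

```python
from statistics import mean


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances (pooled variance)."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical 0-10 ratings for two wording conditions (illustrative only,
# not ONS data).
wording_a = [7, 8, 6, 9, 7, 8, 5, 7]
wording_b = [8, 8, 7, 9, 6, 9, 7, 8]

t_stat = pooled_t(wording_a, wording_b)
f_stat = one_way_f([wording_a, wording_b])
print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

With more than two groups, `one_way_f` still applies, but a significant F would then need post hoc comparisons to identify which groups differ, as noted above.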
Notes on Significance Testing continued

Sample sizes

Sample sizes vary across the different split trials. Some split trials were only tested for one month, meaning sample sizes were small and significance test results should therefore be interpreted cautiously. The larger the sample used in a significance test, the more robust the result. A larger sample also allows the tests to detect very small differences as statistically significant.

Whole numbers

Whole (unrounded) numbers were used in significance testing in this analysis.

Notes on the Opinions & Lifestyle Survey (OPN)

• The OPN uses a random probability sample stratified by: region, the proportion of households with no car, the proportion in National Statistics Socio-economic categories one to three and the proportion of people aged over 65 years. In common with other ONS social surveys, it uses the Royal Mail’s small user postcode address file to draw the sample from across Great Britain.
• An initial sample of 2,100 addresses is drawn each month and advance letters are sent to all addresses giving a brief account of the survey. Participation is purely voluntary and interviewers only call at addresses where no refusal has been made to the advance letter. The interviewer will make up to 20 calls at an address at different times in the day and the week to try to make contact, after which the address is marked as a non-contact.
• The interviewing period starts in the first week of the calendar month and continues for the duration of the month in question. The interviewer uses a Kish grid to randomly select one of the adults (aged 16 and over) living within the household for interview. All interviews are carried out face-to-face (except for a very small number of telephone reissues) by ONS interviewers trained to carry out National Statistics surveys.
• The final achieved sample is around 1,100 adults (aged 16 and over) per month, with an overall survey response rate of around 60 per cent.
• The allocation of the split trials in the OPN was based on address number: each address number is allocated to a sample. All interviewers have all versions of the trial questions; the version used depends on the addresses that they have been allocated.

Notes on the Opinions & Lifestyle Survey (OPN) continued

• All estimates in this report are weighted. Weighting makes the estimates more representative of the population, on the assumption that people who did not respond to the survey would, on average, give the same ratings of subjective well-being as those who did.
• There are two weights in the Opinions Survey: first, a weight that adjusts for differences in the probability of an individual being selected due to different household sizes and sample design; and second, a weight that calibrates the sample so that it is representative of the overall population levels in Great Britain by age, sex and region. We used the second of these two weights when carrying out this analysis.
• For more information on the methodology of the Opinions and Lifestyle Survey, please see the survey user guide: http://www.ons.gov.uk/ons/about-ons/who-we-are/services/opinions-andlifestyle-survey/opinions-and-lifestyle-survey--opn-.html

Feedback and Further Information

www.ons.gov.uk/well-being
[email protected]
Discuss National Well-being http://www.statsusernet.org.uk