Analytics: Advanced Tools
October 2014
Jane Tang, SVP, Advanced Analytics, Vision Critical

Driver Analysis

Driver Analysis Motivation: Don’t Ask Why
• Why not just ask respondents directly why they purchase a particular product?
• Consumers are generally unaware of why they do what they do when it comes to product purchase decisions.
• Respondents will give you the answers they think you want to hear.
• You get their justifications for the purchase, not their motivations.
Reference: Nisbett, Richard, & Wilson, Timothy. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231-259.

Driver Analysis Motivation: Stated Importance
• Why not ask respondents to state the importance of product/service attributes?
• The traditional approach of asking respondents to rate the importance of attributes on a scale requires no tradeoff:
  • “Everything is really important”
  • No differentiation between what is truly important and marginal items
  • Prosocial bias
• Tradeoff methods such as Conjoint and/or MaxDiff are suitable, but are often more time consuming and require additional questionnaire real estate.
We recommend using a derived importance method through driver analysis.

Driver Analysis Background
In driver analysis, we seek to understand the motivation behind consumer behaviours by observing the pattern of associations and correlations between their decisions and their perception of, or experience with, the product/service being offered.
• If we are interested in what drives consumer purchase decisions, we look for correlations between the purchase decision and consumer perception of and experience with the product.
• If we are interested in what drives customer satisfaction, we look for correlations between the overall customer satisfaction rating and satisfaction with key service points.

Driver Analysis: Correlation & Causation
Despite the name, a “driver” analysis is an analysis of relationships and correlations.
A driver analysis does not establish causality.

Driver Analysis Issues
• Requires complete data for every variable.
  • Missing data must be replaced with the mean or another value.
  • Alternatively, a reduced base size must be used in the analysis.
  • The majority of respondents need to be able to rate the attributes/services.
• Does not distinguish between drivers of satisfaction and drivers of dissatisfaction.
• Not predictive. The analysis is based on observation of the current behavior of consumers only; it does not predict their future behavior.

Driver Analysis Issues (continued)
As in any form of key driver analysis, variables with little variation will not show up as key drivers.
• Table-stakes attributes are unlikely to show up as key drivers.
  • All airlines are safe, so safety is not a key driver of airline choice among travelers.
• Components that affect only a very small proportion of customers will not show up as key drivers.
  • Satisfaction with claims is not a key driver, as only a very small proportion of insurance policies result in claims.
• Consideration in questionnaire design: be consistent in how you measure potential drivers.

Unstructured vs. Structured
Dependent Variable: Overall Satisfaction
Independent Variables: Satisfaction with service attributes
1. Ease of making your purchase
2. The financing and payment terms associated with your purchase
3. Consistently approving my purchases
4. Communicating how the purchase approval process works
5. Having enough information to understand how Bill Me Later works
6. Stating clearly the financing and payment terms associated with your purchase
7. Adding Bill Me Later to your PayPal funding options
8. Paying your Bill Me Later bill
9. Having enough available credit to make your purchases
10. Communicating your available line of credit
11. Ease of managing your account online

Shapley Value Regression
Penalty & Reward

SV Regression Background
• Shapley Value Regression is a type of Unstructured Driver Analysis.
• SV Regression deals effectively with the multicollinearity that is often present in market research data.
• Multicollinearity occurs when the correlations between the various aspects of consumer perception are strong enough to affect the stability of driver analysis results obtained via traditional approaches.
• http://www.visioncritical.com/blog/untangling-which-attributes-drive-purchase

SV Regression Methodology
• Model Specification
  • The dependent variable is the response variable you want to study.
    • It must be at least ordinal, i.e. three or more ordered response categories.
    • Interval or ratio scales are preferred.
  • The independent variables are the potential drivers that could influence the response variable.
    • They can be metric or non-metric.
    • Example: brand association questions (“Which of the following brands do you associate with these attributes?”)
• What sample size do you need?
  • We require at least 10 cases of data for each potential driver.
  • If there are 20 potential drivers, we need at least n=200 cases of data.
  • As the ratio falls below 10:1, we risk overfitting the model to the sample, making the results too specific to the sample and lacking generalizability.

SV Regression Example Output
• R²: how much the potential drivers together explain the response variable
• Relative Importance: the relative importance of the potential drivers (sums to 100%)

SV Regression Output - Details
• R²: Coefficient of Determination
  • Measures the proportion of the variance of the response variable that is explained by all the potential drivers. It varies between 0 and 1. The higher the R², the stronger the association between the potential drivers and the response variable.
• SV: Shapley Value
  • The contribution of each potential driver to the overall R² of the regression
• sdSV: Standard deviations of the Shapley Values
• Relative Importance
  • The Shapley Values rebased so that they sum to 100%.
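The Shapley Value output described here can be illustrated with a minimal sketch. It assumes ordinary least squares, a small synthetic data set, and a brute-force pass over every subset of drivers; the function names `r_squared` and `shapley_values` and the simulated drivers are illustrative only and are not the tool used to produce the output in this deck.

```python
import itertools
import math

import numpy as np


def r_squared(X, y, cols):
    """R² of an OLS fit of y on the chosen columns of X (with an intercept)."""
    if not cols:
        return 0.0  # an intercept-only model explains no variance
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def shapley_values(X, y):
    """Average marginal contribution of each driver to R² over all subsets."""
    p = X.shape[1]
    sv = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            # Shapley weight for a subset of this size that excludes driver j
            weight = (math.factorial(size) * math.factorial(p - size - 1)
                      / math.factorial(p))
            for subset in itertools.combinations(others, size):
                gain = r_squared(X, y, subset + (j,)) - r_squared(X, y, subset)
                sv[j] += weight * gain
    return sv


# Synthetic data (an assumption for illustration): x1 and x2 are strongly
# correlated, which is the multicollinearity situation described above.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.2 * x3 + rng.normal(size=n)

sv = shapley_values(X, y)
overall_r2 = r_squared(X, y, (0, 1, 2))
relative_importance = sv / overall_r2  # rebased so the shares sum to 100%

for name, value, share in zip(["x1", "x2", "x3"], sv, relative_importance):
    print(f"{name}: SV={value:.3f}, relative importance={share:.1%}")
print(f"Overall R2 = {overall_r2:.3f}")
```

The Shapley Values add up to the overall R², which is why dividing by R² gives shares that sum to 100%. The brute-force loop fits on the order of 2^p regressions per driver, so this sketch is only practical for a modest number of drivers.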
The relative importance of an item = SV / Overall R².

Performance vs. Importance

Performance Impact

Performance Impact: Base Case
- A single-choice question with a minimum of 3 ordered categories, used as the overall performance indicator for ONE brand.
- 8-14 attributes that measure distinct aspects of performance and are directly and positively related to that brand.
- An absolute minimum of n=10 cases of data for each performance attribute; a lot more is preferred.

Penalty & Reward Analysis
• Shapley Value regression assumes the relationship is symmetrical:
  • Good product perception is associated with higher purchase intent AND poor product perception is associated with lower purchase intent.
• What if the relationship is NOT symmetric?
  • Poor product performance on certain attributes may be associated with lower purchase intent, while good performance on the same attribute is NOT associated with higher purchase intent. This is called a penalty.
  • Good perception on an attribute is associated with higher purchase intent, while poor perception on that attribute is not associated with lower purchase intent. This is called a reward.
• Penalty & Reward analysis is a TUR-related counting process.
  • It uses the same Shapley Value principle to understand the contribution of each attribute, as a reward and separately as a penalty.

Penalty/Reward Output
[Chart: penalty and reward for each attribute. Attributes shown: Flavor of meat; Overall amount of food; Overall flavor; Value for the money; Quality of pasta or rice; Quality of vegetables; Appearance of meat; Overall appearance of food; Quality of nutrition information.]

Structural Equation Model
PLS/Path Model

SEM/PLS
• Structural Equation Models (SEM) combine several analyses: regression, confirmatory factor analysis and analysis of variance.
• We use a Partial Least Squares (PLS) algorithm to estimate SEM models.
• SEM allows (and requires) you to presuppose a structure for your data, which means you have to think very specifically about what you want your data to explain and what you want the explanation to look like.
• Because you have presupposed, or hypothesized, a structure that you expect your data to fit, SEMs provide statistical validation of your theory. That is, the output tells you in a variety of ways whether or not your theory is a plausible representation of reality (the data).
• A key feature of SEMs is latent, or unobserved, variables (measured indirectly through the Measurement Model).
• The drivers are the loadings from the latent variables to the dependent variable, for example overall satisfaction.

SEM Methodology
• Forces the researcher to think about the analysis in advance of doing it
• Provides simple interpretations of complex data
• Can be useful in small sample situations

SEM Output
[Path diagram. Measurement Model for the Latent Variables, with loadings as shown:
QUALITY: Durable 0.82, Cutting edge 0.95, Trustworthy 0.66, High tech 0.76, Highest quality 1.12
SERVICE: Responsive 1.2, Knowledgeable 0.98, Helpful 0.88, Friendly 0.65
VALUE: Price evaluate 0.85, Price compare 0.87, Value for $ 1.02
Other elements: OVERALL SATISFACTION; BRAND LOYALTY (Brand usage, Recommend, Use again). Remaining coefficients in the diagram: 0.92, 0.93, 0.43, 0.86, 1.15, 0.81, 0.75.]
• Quality is the strongest driver of satisfaction, more than twice as important as service.
• Satisfaction is an important determinant of loyalty.

SEM Output II

Conjoint Analysis:
Rating Based Conjoint
Choice Based Conjoint (CBC)
Discrete Choice Model (DCM)

Overview of Conjoint Analysis:
• Conjoint analysis is a popular marketing research technique that marketers use to determine what features a new product should have and how it should be priced.
• Conjoint analysis became popular because it was a far less expensive (smaller sample size) and more flexible way to address these issues than concept testing.
  • When there are just too many potential product combinations for concept testing
  • When you need to understand the tradeoffs respondents make
  • When you need to understand the competitive context
  • When you need to test respondents’ reactions to price changes in a realistic setting

Overview of Conjoint Analysis (continued):
• Conjoint analysis involves showing respondents potential product combinations.
• Products can be broken down into parts, called factors. The different options within each factor are the factor levels.
• The basic premise of conjoint analysis is that a respondent makes purchase decisions based on the inherent value they place on factor levels.
• Respondents trade off levels across factors, e.g. giving up a favourite colour for a lower price.
• Non-compensatory rules are allowed.
  • For a vegetarian, the meatless burger patty is a “must have”.

Overview of Conjoint Analysis (continued):
These three steps form the basics of conjoint analysis:
1. Collecting trade-offs: a questionnaire with a statistical design shows various options of the product, and respondents give their input in terms of product preference.
2. Estimating buyer value systems: modeling by the analytics team.
3. Making predictions: simulation based on the model developed.
The analytics team works with you to produce the results best suited to answering your client’s marketing question.

Rating Based Conjoint
• We design conjoint cards that represent possible products based on factor levels. Respondents are asked to rate each card in terms of purchase intent (or, as in this example, likelihood to vote for this candidate).
Candidate A
Health Care: The government should get out of health care
Foreign Affairs: Overseas, America should focus on leading the world and promoting our values, and not listen to the UN
Size Of Government: The federal government is bloated, corrupt and wasteful; spending needs to be cut dramatically
Energy/Environment: Jobs, a strong economy and energy independence are more important than the environment
Education: The best way to improve the education system is by giving more resources to our public school teachers.
For a family of four with a household income of $85,000: Increase tax by $1,000.
For a single person with an income of $35,000: Increase tax by $400.

• Alternatively, we can show respondents a stack of cards and ask them to rank all the cards in order of preference.

Rating Based Conjoint
• Analysis: based on regression. Linear (ratings), Logistic (ranking).
• Individual-level estimation is possible, i.e. each respondent has a model based on their own data, but it requires collecting a lot of information from each individual, so most models are at the aggregate level.
• Output:
  • Preference for the various product options on the same rating scale
    • Simulated preference rating
  • Relative preference for the various levels within each factor
    • Isotherm
• Problems:
  • Ratings: scale usage issues, “yeah”ers vs. “nay”ers
  • Ranking: only works for very small problems
• New applications: Media Impact tool, Virtual Menu Board

Choice Based Conjoint
• Choice Based Conjoint: we design conjoint cards that represent possible products based on factor levels. Products are grouped into options within a card, and respondents are asked to choose within the group.
• Over the last decade, academics and practitioners have favored choice-based over ratings-based methods:
  • Stronger mathematical theory (McFadden: MNL theory)
  • Stronger psychological underpinnings
  • Argued to be more accurate (comparison to market data)

Discrete Choice Model
• DCM is really just one type of CBC, where the focus is less on optimizing the product offer and more on the competitive market context.

CBC
• Uses multiple factors (6-10) to describe products
• Respondents are shown a limited number of options per card (4-6)
• Usually comes at an earlier stage in product development, for:
  • Market potential
  • Best feature combination
  • Rough price level

DCM
• Mostly uses Brand/Price combinations to describe products
• Respondents are shown many options that represent most of the market
• Usually comes at a later stage in product development, to:
  • Test various marketing inputs, such as packaging and POS
  • Determine pricing scenarios and product lineup vs. the competition

CBC Choice Tasks

DCM Choice Tasks

CBC & DCM
• Output: the basic output is similar to that from rating-based conjoint
• CBC:
  • Factor importance/Level preference: Isotherm
  • Simulation: simulator, product optimization
  • Individual-level estimation allows you to further segment the respondents.
    • Potentially developing a different optimized product for each segment. Caution: there are no simple typing tools for these segments.
  • Feature optimization
• DCM:
  • Usually no isotherm, except for the impact of packaging changes and sales/promotions
  • Simulator: line optimization, pricing optimization

The CBC Simulator

Factor Importance/Level Preference
ISOTHERM EXPLANATION
• Each feature is shown as a vertical line: the longer the line, the greater its strength in driving choice. Features are displayed in descending order of importance, so features on the left are more important than features on the right.
• The options within each feature are shown as tick marks along each vertical line.
The higher up on the vertical line, the stronger the preference for that option.
• Let’s use Shape & display size as an example. After price, it is the most important attribute driving choice, so it should be an important focus area when designing the new device.
[Isotherm chart. Vertical axis: PREFERENCE (High to Low). Horizontal axis: ATTRIBUTE IMPORTANCE (More Important to Less Important). Attribute importance as shown: Price 37%, Shape & display size 18%, Manufacturer brand 12%, Speed 6%, Feature1 5%, Battery Life 5%, Camera 4%, Feature4 4%, Feature5 3%, Thickness 2%, Feature3 2%, Display 2%, Feature2 0%.]

Choice Share Pricing Scenarios Through Simulation

Supporting Materials
• For documentation, proposals, reporting, questionnaires, learning materials, etc.: Advanced Analytics Intranet / Knowledge Center / Statistical Methods / CBC PROGRAMMING CUSTOM CONJOINT SHOWCASE
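As a supplement to the simulation step mentioned under CBC & DCM, the following is a minimal sketch of how a choice share simulator can turn estimated utilities into shares using the multinomial logit (MNL) rule. The part-worth values, the two-product scenario, and the names `mnl_shares` and `product_utility` are hypothetical and chosen only for illustration; a production simulator would typically work from the estimated model rather than hand-entered values.

```python
import numpy as np


def mnl_shares(utilities):
    """Multinomial logit shares: exp(u_i) / sum_j exp(u_j)."""
    u = np.asarray(utilities, dtype=float)
    exp_u = np.exp(u - u.max())  # subtracting the max avoids overflow
    return exp_u / exp_u.sum()


# Hypothetical aggregate part-worths for two factors (illustrative values).
partworths = {
    "brand": {"A": 0.5, "B": 0.0},
    "price": {"$199": 0.8, "$249": 0.2, "$299": -0.6},
}


def product_utility(product):
    """Total utility of a product = sum of the part-worths of its levels."""
    return sum(partworths[factor][level] for factor, level in product.items())


# Base scenario: Brand A at $299 competes with Brand B at $249.
base_scenario = [
    {"brand": "A", "price": "$299"},
    {"brand": "B", "price": "$249"},
]
base_shares = mnl_shares([product_utility(p) for p in base_scenario])

# Pricing scenario: Brand A drops its price to $249.
cut_scenario = [
    {"brand": "A", "price": "$249"},
    {"brand": "B", "price": "$249"},
]
cut_shares = mnl_shares([product_utility(p) for p in cut_scenario])

print("Base shares (A, B):", np.round(base_shares, 3))
print("After price cut (A, B):", np.round(cut_shares, 3))
```

Comparing the two share vectors shows the kind of question a pricing scenario answers: how much share Brand A gains from the price cut, given the competitor's offer.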