Social Networking for Churn Analysis
Predictive Analytics World 2011 SF
David Katz
Dataspora
David Katz Consulting

Outline of Case Study
• The Project
• The Goals
• Methods
• Findings

The Project
• Mobile phone provider concerned about losing subscribers
• Uses Predictive Analysis (Logistic Regression) to target subscribers with increased risk of churn
• Looking for ways to improve prediction and retention
• Pilot Project identified “Social Networking” type data

Goals
• Can Social Networking improve the existing churn model?
• This is a means to an end
  – The ultimate goal is to reduce churn.

Reducing Churn
• 1) Identify a group of subscribers with a high risk of unsubscribing.
• 2) Target effective intervention to this group without offering expensive incentives to many who will not unsubscribe.
• This can be a high bar!

Predicting Unsubscribe Rate Can Help Target Cost-Effective Interventions
[Figure: Churn Rate (y-axis, 0% to 70%) plotted against x-axis values 1 to 20]

Preliminary Univariate Analysis
• Hypothesis – subscribers who have been calling unsubscribers are more likely to unsubscribe.
• Calling network – who is calling whom?
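The hypothesis above can be checked with a simple univariate comparison: the cancellation rate of the whole cohort against the rate among subscribers who called someone that churned in the prior month. The following is a minimal sketch on invented toy data. The function names, the (caller, callee) record layout, and the subscriber ids are all assumptions for illustration, not the pipeline used in the project.

```python
from typing import Iterable, Set, Tuple


def callers_of(calls: Iterable[Tuple[str, str]], targets: Set[str]) -> Set[str]:
    """Subscribers who placed at least one call to someone in `targets`."""
    return {caller for caller, callee in calls if callee in targets}


def cancellation_rate(subscribers: Set[str], churners: Set[str]) -> float:
    """Fraction of `subscribers` that also appear in `churners`."""
    if not subscribers:
        return 0.0
    return len(subscribers & churners) / len(subscribers)


# Hypothetical toy data: (caller, callee) pairs observed during May.
may_calls = [("a", "x"), ("b", "x"), ("c", "d"), ("d", "e"), ("e", "y")]
may_churners = {"x", "y"}                      # cancelled during May
active_june = {"a", "b", "c", "d", "e", "f"}   # subscribed at start of June
june_churners = {"a", "e"}                     # cancelled during June

# Subscribers still active in June who called a May churner.
exposed = callers_of(may_calls, may_churners) & active_june

overall_rate = cancellation_rate(active_june, june_churners)
exposed_rate = cancellation_rate(exposed, june_churners)

print(f"overall cohort:      {overall_rate:.1%}")
print(f"called May churners: {exposed_rate:.1%}")
```

A gap between the two rates is only suggestive at this stage, since callers of churners may differ from the cohort on other predictors as well.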
Social Network Analysis
More Connections found between Cancellers than expected by chance

Dynamics of cancellation in a selected customer call network
[Figure series: call network snapshots]
• April 2009: 21 customers
• May 2009: 2 deactivations
• June 2009: 4 deactivations
• July 2009: 7 deactivations

Cancellation Rates Differ for customers calling cancellers
Subscriptions at Start of June:             1980582
Cancellations among this group in June:     26294
Cancellation Rate:                          1.3%
Subset with known calls to May churners:    35387
Subset of these churning in June:           837
Cancellation Rate:                          2.4%

And this persists over time…
[Figure: cancellation rate (y-axis, 0.0% to 3.0%) for June, July, August; series: Overall Cohort, Called Cancellers During May]

Social Networking Data
• Association
  – Relatively Stable
  – May be correlated with other predictors
  – How much can it add to existing churn model?
• Influence
  – Related in time
  – May be transient

Variable Selection
• Initial variable selection done using “earth”, the R implementation of Multivariate Adaptive Regression Splines
• Can recognize nonlinear predictors effectively
• Extremely fast algorithm reduces computation by computing the cross products incrementally

Social Network Variables
• Inbound callers “closer” than outbound callees.
• Voice calls are much better measures of association than SMS/texting.
• Counting Calls slightly better than overall length of calls.

Main Social Networking Variable Selected
• Percentage of incoming voice calls in the last 90 days which were from customers cancelling in the last 30 days.
• Those with 10% or higher on this variable were designated as churn-chatters – those speaking with recent unsubscribers relatively frequently.
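The churn-chatter variable described above can be sketched as follows. This is an illustrative reconstruction, not the project's code: the `VoiceCall` record, the function names, and the exact window boundaries (both windows measured back from a single `as_of` date, start exclusive, end inclusive) are assumptions, since the slides give only the 90-day, 30-day, and 10% figures.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass(frozen=True)
class VoiceCall:
    caller: str
    callee: str
    day: date


def churn_chat_pct(calls: List[VoiceCall], cancel_dates: Dict[str, date],
                   subscriber: str, as_of: date,
                   call_window_days: int = 90,
                   cancel_window_days: int = 30) -> Optional[float]:
    """Share of the subscriber's incoming voice calls in the call window
    that came from customers who cancelled in the cancel window.
    Returns None when the subscriber received no calls in the window."""
    call_start = as_of - timedelta(days=call_window_days)
    cancel_start = as_of - timedelta(days=cancel_window_days)
    total = 0
    from_cancellers = 0
    for call in calls:
        # Only inbound calls to this subscriber inside the call window count.
        if call.callee != subscriber or not (call_start < call.day <= as_of):
            continue
        total += 1
        cancelled = cancel_dates.get(call.caller)
        if cancelled is not None and cancel_start < cancelled <= as_of:
            from_cancellers += 1
    return from_cancellers / total if total else None


def is_churn_chatter(pct: Optional[float], threshold: float = 0.10) -> bool:
    """Flag subscribers at 10% or higher, as stated on the slide."""
    return pct is not None and pct >= threshold


# Hypothetical toy data: subscriber "s1" gets 10 in-window calls, 1 from a recent canceller.
as_of = date(2009, 6, 30)
calls = [VoiceCall("x", "s1", date(2009, 6, 1)),
         VoiceCall("x", "s1", date(2009, 1, 1))]  # outside the 90-day window
calls += [VoiceCall(f"friend{i}", "s1", date(2009, 5, 20)) for i in range(9)]
cancel_dates = {"x": date(2009, 6, 15)}

pct_s1 = churn_chat_pct(calls, cancel_dates, "s1", as_of)
pct_s2 = churn_chat_pct(calls, cancel_dates, "s2", as_of)
print(pct_s1, is_churn_chatter(pct_s1))
print(pct_s2, is_churn_chatter(pct_s2))
```

Counting calls rather than summing call minutes follows the slide's observation that call counts were a slightly better measure of association than call length.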
Generalized Additive Models
• Generalization of Generalized Linear Models
• Using family=“binomial” makes it a generalization of Logistic Regression
• Allows nonlinear predictors (without transformations or binning)
• Used mgcv::gam – the implementation in R
• Logit(p) = ∑ f(x)
• Automatic plotting of each smooth f(x) term

Smooth Term Example
[Figure: smooth term for Days to Contract End]

Timeline
• Timing is an essential dimension
  – When can we obtain the data?
  – When will the intervention impact the customer?
  – What time window should we use for evaluating the cancellation rate?
• Prescribed by the client:
  – 60 day lead time
  – 60 day window for measuring resulting churn.

The baseline model seems to predict churn reasonably well when we look at the complete set of subscribers…
[Figure: Fit 60-120 day churn with no social network data; x-axis: Predicted w/o Social Networking; y-axis: Actual Pct Churn. Each dot represents one semidecile of the subscribers.]

…but misses the systematic bias when we plot the churn-chatters.
[Figure: Subset with > 10% churn.chat; 30 days of calls predicting churn over a 60-120 day period; x-axis: Predicted w/o Social Networking; y-axis: Actual Pct Churn; series: Actual, Predicted w/o social networking]
* Churn-chatters are those who have more than 10% incoming calls coming from churners

The impact of social networking on churn-chatters is larger if we look at the churn window for the immediate next 30 days
[Figure: 30 days of calls predicting next 30 days of churn; x-axis: Predicted w/o Social Networking; y-axis: Actual Pct Churn; series: Actual, Predicted w/o social networking]

The 4-month window gives us the most comprehensive look at churn-chatters
[Figure: 30 days of calls predicting next 120 days of churn; x-axis: Predicted w/o Social Networking; y-axis: Actual Pct Churn; series: Actual, Predicted w/o social networking]

There is great value in reacting faster as churn-rate declines rapidly over time, especially in high risk groups identified by model and churn chat
[Figure: Comparing Churn Rates Over Time; x-axis: Feb, Mar, Apr, May; y-axis: Pct Churning in Each 30-day period; series: All EM/EM+ Subs, Top 5% Per Base Model (No Social Networking in the Base Model), Top Churn Chatters]

Impact of Social Networking Highly Dependent on Time to end of contract
[Figure: Percentage impact on churn propensity against Days to Contract End; Red: churn chatters; Black: all others]

Not a Proportional Hazard
• The Social Networking effect is markedly different depending on “Days to Contract End”
• The greatest SN effect is closer to the actual end of contract.
• Additional SN variables are also in the model

Lessons Learned
• Those receiving calls recently are less likely to churn.

Lessons Learned
• Social Networking is composed of at least two effects:
• Shows affiliation
  – In our model this is an indicator variable with a relatively small value, constant over time.
• Has influence
  – Immediate influence decays over time
  – Higher at the critical time near end of contract

Lessons Learned
• Adding social networking variables does help identify more churners
• Especially in the high-risk group of churn-chatters: those who get more than 10% of the calls in the last 90 days from people who have churned in the last 30 days.
[Figure: 30 days of calls predicting next 120 days of churn; x-axis: Predicted w/o Social Networking; y-axis: Actual Churn Pct]

Lessons Learned
• Time is of the essence!
• Experimenting beyond the box of current timing constraints reveals new features in the data.
• SN has the strongest immediate effect (days) and then decays over a longer period (several months)
[Figure: Time since Caller Churn, Regardless of Association Measure]

Lessons Learned
• Time is of the essence!
• Interventions must happen right away or many customers will already be gone.
[Figure: Comparing Churn Rates Over Time; x-axis: Feb, Mar, Apr, May; y-axis: Pct Churning in Each 30-day period; series: All EM/EM+ Subs, Top 5% Per Base Model (No Social Networking in the Base Model), Top Churn Chatters]