Under the bonnet: Mosaic data, methodology and build

Paul Cresswell
Matt Holgate
Kevin Smith
Experian Marketing Services
New Mosaic
Launch event
1st April 2014 #OneMosaic
© 2014 Experian Limited. All rights reserved. Experian and the marks used herein are service marks or registered trademarks of
Experian Limited. Other product and company names mentioned herein are the trademarks of their respective owners. No part of this
copyrighted work may be reproduced, modified, or distributed in any form or manner without the prior written permission of Experian.
Experian Public.
Under the bonnet: what we’ll cover
1. The Mosaic Segmentation Process
2. New Data to Identify New Trends
3. Innovation in Methodology
4. Optimising Mosaic and Measuring Uplift
5. Bringing Mosaic to Life: Interpretation & Description
6. The Mosaic DNA and Mosaic Data Exhaust: other Mosaic data sets
Our Mosaic build team were charged with creating a segmentation for today and tomorrow, identifying the significant new trends we’ve seen in society…
• UK Economy now in recovery phase
• New rental and owning patterns
• New immigration trends
• Trends in urban/rural migration, household composition, family structures
Societal changes mean we need new clusters relevant to understanding the UK consumer of today and tomorrow
And we’re certainly well equipped!
• Our segmentation product knowledge and expertise ensure we continue to develop innovative segmentation solutions. The Mosaic UK development team has over 100 years’ combined experience delivering global and regional solutions, segmenting over 2.3 billion consumers and 700 million households
• Experienced architects developing new techniques to fully exploit depth of consumer data available
• Mosaic optimises balance of geo-demographics and individual data, delivering a stable, robust segmentation
• Experian invested significantly more than we’ve ever done to completely redesign our interpretation and user tools to ensure Mosaic is easy to understand and use
1. The Mosaic Segmentation Process
Building Mosaic: Step by Step
1. Design Requirements
2. Data Sourcing/Gathering
3. Data Processing
4. Clustering
5. Additional Reporting with Interpretative Data
6. Interpretation
7. Visualisation
8. Additional Outputs
9. Implementation
Building a segmentation isn’t just about clustering!
2. New Data to Identify New Trends
Depth and quality of data is at the heart of Mosaic
Through combined Experian proprietary, public and
trusted 3rd party sourced data, Mosaic condenses
billions of pieces of information into a concise, easy
to understand consumer classification
• Greatest ever volume and type of data
• Full assessment of new and existing data
sources
• Granular household data
• Key Census trends
• New data identifying new segments
• Investment in new research sources for
description and visualisation
2. New Data to Identify New Trends
Our person, household & neighbourhood data drives our most robust household level solution ever
Consumers
Our ConsumerView picture of all UK adults created using hundreds of millions of input records with actual data at individual and household level:
• Edited Electoral register
• Large scale contributor files
• Lifestyle survey responses
• Experian movers information
• Decades of archived data on years at address
• Family/personal names linked to ethnicity
• Directors from Companies House
• Experian Credit non-CAIS risk indicators
• Land Registry data from 1995 onwards
• Registers of Scotland Transaction data
• Council Tax band
• pH data/National Business Database
• PAF including multiple residences
Neighbourhoods & Places
Neighbourhood information
• ConsumerView variables accumulated to postcode level
• Rurality/Urbanisation
• Shopping accessibility measures
• Commercial/Residential mix
Census data
Accurate, universal source of information still very relevant in classifying consumers. Covers household composition, employment, ethnicity, dwelling type, tenure, health, qualifications, industry, occupations.
850m+ source records with 450+ variables to fully understand consumer characteristics, lifestyles & locations
2. New Data to Identify New Trends
Just some of our new data sources directly allowing us to identify new segments…
• ConsumerView Variable Improvements using new data:
► Tenure & Property - Rightmove and National Register Social Housing
► Children using Emma’s diary
► Northern Ireland detailed property data available for first time
► Improvements from refreshed census data used in calibrations
• Higher Education Statistics Agency (HESA) college and university students database linked to home & term time postcodes
• Transience & attractiveness - Experian movers data indicates how populations are changing
• OS Streetview - Mapping data about property area, ratio of gardens to buildings, cul-de-sacs, sea/park/lake view
• Census 2011 - a new snapshot of entire UK population
• Accessibility to shopping centres, high streets; distance to motorways, schools, railway stations, coast, GPs, bus stops…
• Level of urbanisation/rurality based on population densities
• Open Data - Police.UK crime data, GCSE results, Gas & Electricity consumption, DWP child benefits, tax credits, income support by Census LSOA areas
• Broadband speeds (OFCOM) - Postcode level
2. New Data to Identify New Trends
New Tenure data identifies significant changes
in property ownership at household level
• Data from Rightmove and Government Open Data enable more accurate household level tenure data. We can confidently differentiate owners, private renters and social renters at household level.
• Example – identifying consumers with different tenures living alongside one another in an estate that was once all social housing:
► Budget owners: M54 Down-to-earth Owners
► Private renting: J40 Renting a Room
► Social renting: I39 Families in Need
2. New Data to Identify New Trends
Census data – still a strong contextual data asset so why not use it
• Census 2011 is a strong contextual data asset that accurately shows key demographic characteristics of whole population. Gives valuable neighbourhood information such as ethnic diversity, employment and industry, levels of education and health
• 28% of information used to build Mosaic UK is from Census. Experian produce ongoing Current Year Estimates of key census variables which feed into the build
Best health: G28 Modern Parents
Worst health: L49 Dependent Greys
Highest education: A03 Penthouse Chic
Lowest education: L49 Dependent Greys
Most cars: C10 Wealthy Landowners
Fewest cars: K46 High Rise Residents
Most professionals: A05 Uptown Elite
Most manufacturing: J40 Make Do & Move On
2. New Data to Identify New Trends
Higher Education Statistics Agency (HESA) identifying students
[Figure: Previous Mosaic compared with New Mosaic]
• More accurate assignment of students
• Most “studenty” type (O66 Student Scene) consists
of 77% students vs just 34% in previous Mosaic
3. Innovation in Methodology
Learning from over a hundred collective years
of segmentation building…..
• Experience-driven design of effective and unique methodologies for variable selection, smoothing, transformation & weighting
• Optimisation of household level solution whilst retaining benefits of postcode data
• Maximising discrimination in larger urban areas
3. Innovation in Methodology
Example - Smoothing Household Data using advanced analytics techniques
• Singular Value Decomposition (SVD) is an advanced analytics technique to reduce “noise” in big data, making it easier to spot patterns.
• Method used for analysing other huge data sets in our business - Hitwise and Cheetahmail.
• Allows us to use all household and person information about a household, converting binary response data into “probability” scores
• Scores have distributions, so better suited to clustering
[Figure: chart with vertical axis from 0 to 0.9 (%) and horizontal axis from -0.6 to 1.2]
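The slide does not say how the SVD is applied, so the following is only a minimal sketch of one common approach: keep the largest singular values of a households-by-variables 0/1 matrix and rebuild it, which replaces each binary flag with a continuous score. The function name `svd_smooth`, the toy matrix and the choice of `rank=2` are all assumptions for illustration, not details of the Mosaic build. Note that scores rebuilt this way can fall slightly below 0 or above 1, so they are scores rather than true probabilities.

```python
import numpy as np

def svd_smooth(binary_matrix, rank):
    """Rebuild a households x variables 0/1 matrix from its top `rank` singular values.

    Illustrative helper (hypothetical name): a plain truncated SVD, not
    necessarily the procedure used in the Mosaic build.
    """
    X = np.asarray(binary_matrix, dtype=float)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Scale the first `rank` left singular vectors by their singular values,
    # then project back to the original variable space.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

# Toy data (made up): 6 households, 4 yes/no survey-style responses.
households = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
])

scores = svd_smooth(households, rank=2)
print(np.round(scores, 2))  # continuous scores in place of 0/1 flags
```

Because the rebuilt values vary continuously, distance-based clustering has more to work with than it would on raw 0/1 flags, which appears to be the point the slide makes.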
3. Innovation in Methodology
Optimising Mosaic at a household level
New methodology clusters at household level using household & neighbourhood data to exploit best clusters from both levels to drive optimum discrimination in all areas
[Figure: Postcode DE75 7JA, household clustering using smoothed data compared with postcode allocation]
Household clustering using smoothed data
• Subgroups found even where they don’t group together distinctly at postcode level
• Household characteristics well defined across all age groups
• Consumers in mixed neighbourhoods well defined by household characteristics
Postcode allocation
• Neighbourhood characteristics still highly significant
• Postcode allocation still retains optimum discrimination
• Types dominated by locational factors such as rurality well defined
Nearly 60% of households have a different
Mosaic Type to their postcode
3. Innovation in Methodology
Optimising household characteristics
Improved identification of consumer types who don’t always live together in the same postcodes
Take this street of Suburban homeowners where mixed lifestages often live in same road…
[Figure: images © Google, labelled J42 Midlife Stopgap, H30 Primary Ambition, E21 Solo Retirees and F22 Boomerang Boarders]
3. Innovation in Methodology
Optimising household characteristics
Improved identification of consumer types in rural locations
Consumers in rural locations can get grouped together because of location but in reality they have very different incomes and lifestyles. Optimised use of household level data allows these differences to be uncovered…
[Figure: images © Google, labelled D15 Local Folk and D16 Outlying Seniors]
3. Innovation in Methodology
Optimising household characteristics
Improved identification of emerging family types
• Families supporting adult children - Given financial pressures on young people from high house prices/rents, student loans and finding employment, more couples are continuing to provide a home for their children into adulthood
• New types where adult children are living at home - from Bank of Mum & Dad, who can afford to support their kids, to Budget Generations, where older children living at home are more likely to cause financial strain to parents. Cultural Comfort & Asian Heritage also have high incidences of adult children living at home.
[Figure: B08 Bank of Mum & Dad; I37 Budget Generations]
3. Innovation in Methodology
Maximising discrimination in larger urban areas
• Large urban areas very different to rest of UK and have seen big changes since last census
• Household level characteristics key in differentiating consumers in these locations - these differences can be lost at postcode level
• Use tried & tested methods recently deployed for Mosaic Japan in Tokyo - one of world’s largest urban conurbations
• Allows creation of more diverse clusters, potentially new clusters and better discrimination
[Figure: Mosaic Japan - Metropolitan Elites]
3. Innovation in Methodology
Example - we’re seeing significant improvements in discrimination in London
• In inner London number of types with >2,000 households increased from 10 to 13
• Renters / Homeowners / Council tenants clearly differentiated
• Tiers of wealth, age, household composition, property type well differentiated within renting majority in these areas
• Household clustering distinguishes much more variation in old O63 Urban Cool and O64 Bright Young Things segments
[Figure: Main types old O63 & O64 are allocated to in new Mosaic]
3. Innovation in Methodology
Mosaic identifying micro-variations in urban
areas
Bloomsbury in London shows how new Mosaic picks out tiny pockets of council housing, upper-class residences and students much better.
[Figure labels: O66: Student Scene; A01: World Class Wealth; K44: Inner City Stalwarts & K45: Crowded Kaleidoscope]
4. Optimising Mosaic & Measuring Uplift
Settling on a Solution:
Selecting the best from the rest
In our segmentation process we create hundreds of potential solutions. We use a wide range of diagnostics to test, select, validate and interpret clusters created.
For those who understand these things!
• Average distance to cluster/total error in cluster
• Relative sizes of clusters
• Loss of variance of first join in dendrogram
• Variance explained in individual variables
• Regional analysis
• Characteristics of clusters make sense
But a key “acid test” is that characteristics of clusters make sense and they’re identifying key trends
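Three of the diagnostics named under "Settling on a Solution" (relative sizes of clusters, average distance to cluster, and variance explained in individual variables) are standard quantities that can be computed from a data matrix and a set of cluster labels. The sketch below shows textbook versions of them; the function name `cluster_diagnostics` and the toy data are assumptions, and the exact definitions used in the Mosaic build are not given in the slides.

```python
import numpy as np

def cluster_diagnostics(X, labels):
    """Return (relative sizes, mean distance to centroid, variance explained per variable).

    Illustrative helper (hypothetical name) using textbook definitions.
    Assumes no input variable is constant, otherwise total_ss would be zero.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total_ss = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within_ss = np.zeros(X.shape[1])
    sizes, mean_dist = {}, {}
    for c in np.unique(labels):
        members = X[labels == c]
        centroid = members.mean(axis=0)
        within_ss += ((members - centroid) ** 2).sum(axis=0)
        key = c.item()  # plain Python value as dictionary key
        sizes[key] = len(members) / len(X)
        mean_dist[key] = float(np.linalg.norm(members - centroid, axis=1).mean())
    # Share of each variable's variance accounted for by the cluster split.
    explained = 1 - within_ss / total_ss
    return sizes, mean_dist, explained

# Toy data (made up): two clusters separated on the first variable only.
X = [[0, 0], [0, 2], [10, 0], [10, 2]]
labels = [0, 0, 1, 1]
sizes, mean_dist, explained = cluster_diagnostics(X, labels)
print(sizes, mean_dist, explained)
```

In this toy case the split explains all of the variance in the first variable and none in the second, which is the kind of per-variable view the "variance explained in individual variables" diagnostic gives.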
4. Optimising Mosaic & Measuring Uplift
Overall Performance improvement: 20% Uplift
• Measure performance of Mosaic to discriminate client customer bases using
► Total Weighted Deviation - how much client customer file differs from UK in terms of its Mosaic Distribution
► Lift Score – % of maximum possible lift gained above random assignment of types
• Calculated these metrics across >50 behaviour files covering many different vertical markets.
• On average seeing 20% uplift in these metrics compared to Mosaic 2009
By targeting 5% of our ConsumerView universe
you’d reach 21% of client’s customer base with
new Mosaic, but 16% in previous Mosaic
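A statement of the form "target 5% of the universe, reach 21% of customers" reads like a point on a gains curve. One common way to compute such a point is sketched below: rank types by customer penetration, then work down the list until the chosen share of the universe is covered. This is a generic calculation and not the Lift Score or Total Weighted Deviation formula, which the slides do not define; the function name `reach_at_depth`, the type codes and all counts are made up.

```python
def reach_at_depth(universe, customers, depth):
    """Share of customers reached when targeting the best `depth` share of the universe.

    `universe` and `customers` map a type code to a household count.
    Illustrative helper (hypothetical name); the last type taken is
    assumed to be targeted pro rata.
    """
    ranked = sorted(universe, key=lambda t: customers.get(t, 0) / universe[t], reverse=True)
    target = depth * sum(universe.values())
    covered = 0.0
    reached = 0.0
    for t in ranked:
        take = min(universe[t], target - covered)
        if take <= 0:
            break
        reached += customers.get(t, 0) * take / universe[t]
        covered += take
    return reached / sum(customers.values())

# Toy counts (made up): three types in a universe of 1,000 households.
universe = {"A": 50, "B": 150, "C": 800}
customers = {"A": 20, "B": 30, "C": 50}

for depth in (0.05, 0.10):
    print(depth, reach_at_depth(universe, customers, depth))
```

Running the same calculation on two classifications of the same customer file gives a like-for-like comparison of reach at a fixed targeting depth, which is how the two percentages quoted above can be read.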
5. Bringing Mosaic to Life: Interpretation & Description
New Interpretive data & visualisation tools
• Rich depth of interpretative data paints detailed picture of who target audiences and customers are
• More than 4,300 profiles considered while writing descriptions
• Completely new visualisation portal
5. Bringing Mosaic to Life: Interpretation & Description
Experian commissioned Research Now survey around technology and digital behaviour
[Figure: Smartphone Usage and Multiscreen Content (watching content on smartphone, tablet, laptop, PC: Every Day), labelled A03 Penthouse Chic, N59 Asian Heritage, J41 Disconnected Youth and O66 Student Scene]
6. Other Useful Mosaic Data Sets
Making the most of segmentation assets
Mosaic DNA!
• Mosaic Segments - the DNA or basic building blocks of Mosaic. Can be combined on basis of specific data to create a bespoke segmentation to meet your needs
• Mosaic Factors distil underlying data used to build each Mosaic, summarised into continuous non-correlated variables
Mosaic Data Exhaust!
• “Distance” to Type measures extent to which household or postcode fits its allocated Type
• 2nd Best Type indicates which Mosaic Type is next best fit
• Affinity Measures indicate how close a household or postcode is to ‘tipping point’ between Mosaic Types
• Geographic aggregations of Mosaic mix to many geographical areas for use in location insight
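The distance-to-Type, 2nd Best Type and affinity ideas above can be illustrated with a simple nearest-centroid calculation. This is a sketch under stated assumptions: Euclidean distance, the function name `type_fit`, the type codes T1 to T3 and the use of the gap between best and second-best distance as a tipping-point indicator are all hypothetical, and the slides do not say how Experian computes these measures.

```python
import numpy as np

def type_fit(point, centroids):
    """Best type, distance to it, second-best type and the gap between the two.

    `centroids` maps a type code to its centre in the same variable space
    as `point`. Illustrative helper (hypothetical name).
    """
    names = list(centroids)
    p = np.asarray(point, dtype=float)
    dists = np.array([np.linalg.norm(p - np.asarray(centroids[n], dtype=float)) for n in names])
    order = np.argsort(dists)
    best, second = order[0], order[1]
    return {
        "type": names[best],
        "distance": float(dists[best]),       # how well the record fits its type
        "second_best": names[second],         # next best fitting type
        "margin": float(dists[second] - dists[best]),  # small gap = near a tipping point
    }

# Toy centroids (made up) in a two-variable space.
centroids = {"T1": [0, 0], "T2": [4, 0], "T3": [0, 10]}
fit = type_fit([1, 0], centroids)
print(fit)
```

A record with a small margin sits close to the boundary between two types, so a modest change in its data could move it to the second-best type.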
6. Other Useful Mosaic Data Sets
Example: Segments – the DNA of Mosaic
Build customised clusters with even more discriminatory power for clients – from Mosaic DNA
• Detailed household consumer sub-types created using input data from Mosaic.
• Target organisations with niche audiences not adequately identified by Mosaic classification
• Allows clients to capitalise on input data used to build Mosaic & build own bespoke classification that can be linked back to Mosaic
Our Mosaic Custom tool developed to do exactly this!
[Figure: Mosaic Types and Mosaic Segments feeding a Bespoke Customer Segmentation made up of Custom Segment A, Custom Segment B and Custom Segment C]
Mosaic UK Segments are 238 subtypes split
out of 66 Types, providing further level of
discrimination for profiling & analysis
New Mosaic
Understanding the UK consumer today
and tomorrow, through:
New data identifying new trends
Innovation in methodology
Bringing Mosaic to life through new
visualisation tools and customised
solutions
Questions
Ask our Mosaic build experts
[email protected]
@ExperianMkt_UK
0845 234 0391
www.experian.co.uk/mosaic