1 Mads Krogh Nielsen

Using data mining to collect taxes
- development and implementation of automated collection in the
Danish public sector
Mads Krogh Nielsen
Danish Ministry of Taxation
[email protected]
More efficient collection in SKAT
The motivation for a shift in regime
A need for improved efficiency in the public sector
• National Audit Office 2003: "The IT based structures are ineffective".
• New structure in SKAT - centralized collection, 30 centres -> 6 regions
• Reduction - 4 year plan: 2006, 12000 FTE, 2010, 7000 FTE
Also - great potential:
• Shift in strategy - OECD Compliance approach rather than solely focusing on control.
• Denmark already has one single ID code for all individuals and companies. This gives a great opportunity for data analysis.
• Established cooperation with many 3rd party partners such as banks, other authorities, employers, mortgage institutions.
[Chart: vertical axis from 1.000 to 10.000; horizontal axis years 2007 to 2013]
27 January 2012
What do we do to achieve this?
1. Higher rate of efficiency
- Doing the right things at the right time
2. Maximum use of automation
- Intelligent systems – Knowledge based models
- Using data we already have available
3. Automated collection
The EFI system with graphic representation
The Nervous system
Decisions
Dialogue
Adjustment of the comprehensive machinery
Collection Engine
Ensures a uniform and fair collection
Approved collection strategies
Proper manning at the right time and place
So how does it work, this modelling?
\[ \log(\text{odds of being a good account}) = B_0 + B_1 \cdot Var_1 + B_2 \cdot Var_2 + \dots + B_n \cdot Var_n \]
\[ Score(B_0) = E + \frac{D \cdot B_0}{\log(2)} \]
\[ Score(B_i) = \frac{D \cdot B_i}{\log(2)} \]
\[ Score = \sum_i Score(B_i) \]
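The scaling formulas can be sketched in Python as follows. This is one reading of the slide, not SKAT's implementation: it assumes that D is the number of points that double the odds (40, from a later slide), that E is the score at even odds (500, also from a later slide), and that the coefficients are on the natural-log odds scale. The coefficient values are hypothetical.

```python
import math

# Assumed scaling constants: a later slide states that 40 points double the
# odds and that 500 points mean a fifty-fifty chance. Mapping these values to
# D and E is an interpretation, not something the slides spell out.
D = 40.0   # points that double the odds
E = 500.0  # score at odds 1:1


def intercept_points(b0: float) -> float:
    """Score(B0) = E + D * B0 / log(2)."""
    return E + D * b0 / math.log(2)


def coefficient_points(b: float) -> float:
    """Score(Bi) = D * Bi / log(2)."""
    return D * b / math.log(2)


def total_score(b0: float, coefficients: list[float]) -> float:
    """Add the intercept points to the points of every active coefficient."""
    return intercept_points(b0) + sum(coefficient_points(b) for b in coefficients)


# Hypothetical coefficients, chosen only for illustration.
b0 = -3.656
active_coefficients = [math.log(2), 0.5]
print(round(intercept_points(b0)))
print(round(total_score(b0, active_coefficients)))
```

Under these assumptions a coefficient of log(2), which doubles the odds, is worth exactly D points, which is why the division by log(2) appears in the formulas.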
Risk assessment - the projection of risk based on events and/or states that have occurred.
First we must agree on a "Definition of Bad".
Then we find what characterizes such a person/company:
[12 months of non-payment] = [Properties of the individual]
Subsequently we are able to score individuals/companies who have not yet failed, based on the results of other individuals.
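As a rough illustration of how past outcomes can characterize a group, the sketch below estimates the log odds of being good for each value of a single characteristic. The data and the characteristic are hypothetical, and the actual model combines many parameters rather than one.

```python
import math
from collections import defaultdict

# Hypothetical observations: (value of one characteristic, turned out bad?)
history = [
    ("married", False), ("married", False), ("married", False), ("married", True),
    ("single", False), ("single", True), ("single", True), ("single", True),
]


def log_odds_good_by_group(rows):
    """Return log(good / bad) per group, estimated from past outcomes."""
    counts = defaultdict(lambda: [0, 0])  # group -> [good count, bad count]
    for group, is_bad in rows:
        counts[group][1 if is_bad else 0] += 1
    # Groups with a zero count are skipped to avoid log(0) or division by zero.
    return {
        group: math.log(good / bad)
        for group, (good, bad) in counts.items()
        if good > 0 and bad > 0
    }


print(log_odds_good_by_group(history))
```

A new customer who has not yet failed would then be scored from the log odds of the groups they belong to.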
Scoring the Danes
The toughest decision!
BAD
Definition of Bad - the very core of the technology:
"Customers who did not pay on their debts in the last 12 months".
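A minimal sketch of this definition as a labelling rule, assuming payments are available as twelve monthly amounts. That representation is hypothetical, and the slides do not say how partial payments are treated.

```python
def is_bad(payments_last_12_months: list[float]) -> bool:
    """Flag a customer as bad if nothing was paid in the last 12 months."""
    if len(payments_last_12_months) != 12:
        raise ValueError("expected exactly 12 monthly payment amounts")
    # Assumption: any positive payment in the window makes the customer good.
    return all(amount <= 0 for amount in payments_last_12_months)


print(is_bad([0.0] * 12))
print(is_bad([0.0] * 11 + [250.0]))
```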
Daily sequences of the scoring
1. Segmenting
- Grouping of customers in segments by simple queries
- Segments: Companies, Sole proprietorships, Persons
2. Scorecard
- Calculation of the debtor's score based on the scorecard of the segment
3. Cut off
- Calculated score placed in intervals that give a grouping of debtors
- Intervals: 0 - 25, 26 - 40, 41 - 90, 91 - 100
4. Tracks
- The group is attached to a number which indicates the collection effort
- Placement on track 1, track 2, track 3 or track 4
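The cut-off step can be sketched as a lookup from a value to a track number. The sketch assumes that the first interval maps to track 1, the second to track 2, and so on, as the slide layout suggests. The slides do not state what scale the 0 to 100 values are on (the scores shown elsewhere are in the hundreds), so the input is simply taken as given.

```python
from bisect import bisect_left

# Upper bounds of the first three intervals from the slide: 0-25, 26-40, 41-90.
# Everything from 91 to 100 falls into the last interval.
UPPER_BOUNDS = [25, 40, 90]


def track_for(value: int) -> int:
    """Map a cut-off value in 0..100 to a track number 1..4.

    Assumption: the i-th interval maps to track i.
    """
    if not 0 <= value <= 100:
        raise ValueError("value must lie between 0 and 100")
    return bisect_left(UPPER_BOUNDS, value) + 1


for value in (10, 26, 90, 95):
    print(value, "-> track", track_for(value))
```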
After this, it is the tracks that execute - and save resources:
DW
Various payment strategies
"Toughness": telephone, debt collection (incasso), "a nice letter"
Scoring the Danes
What are we looking for?
DW
Analysis of 200+ parameters.
From 70 to 50:
Significant parameters on B2C:
Employment_CAT = 6
MARITAL_STATE
Arrears_amt
Nbr_claims_last_4_agreements
N_cars_owned
N_houses_08
Assets_08
AGE_YRS
COMMUNE_CODE
Debt_National_Train_Company
Debt_National_Broadcast_license
AVG_OWNERSHIP_SHARE
RELATIONSHIP_TO_COMPANY
TOTAL_ARREAR_OPEN_BOD
LARGESTUDENTDEBT
TOTAL_ARREAR_OPEN_BOD
TOTAL_AGREEMENT_BOD_2
TOTAL_HOUSES_VALUE
ARREAR
Univariate – followed by Multivariate (Least Angle Regression)
Scorecard Persons
Intercept information: 289 points
Demographic information (Max points = 98)
Company involvement (Max points = 13)
Income and asset information (Max points = 197)
Special debt information (Max points = 79)
Total Score
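The structure of this scorecard can be sketched as an intercept plus points per category. The category keys are hypothetical names, and the sketch assumes that the total is a plain sum. The slide only gives the intercept and the maximum points per category.

```python
INTERCEPT = 289

# Maximum points per category, as listed on the slide.
MAX_POINTS = {
    "demographic": 98,
    "company_involvement": 13,
    "income_and_assets": 197,
    "special_debt": 79,
}


def scorecard_total(points: dict[str, int]) -> int:
    """Add category points to the intercept, rejecting values above the maximum."""
    for category, value in points.items():
        if category not in MAX_POINTS:
            raise KeyError(f"unknown category: {category}")
        if value > MAX_POINTS[category]:
            raise ValueError(f"{category} exceeds its maximum of {MAX_POINTS[category]}")
    return INTERCEPT + sum(points.values())


# Highest total reachable if every category is at its maximum.
print(scorecard_total(MAX_POINTS))
```

With every category at its maximum, the total is 289 + 98 + 13 + 197 + 79 = 676 points.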
Gunnar
Dorthe
Yvonne
Moped Mullen
Willem Jr.
Jane
Portrait of a good guy
Willem Jr.
Age 23 years
Lives in Allerød community (201)
Married to Dorthe
They live together
Works in a bank (not involved in any owner relationship)
Owes: "600 kr. in overpaid wages"
Has never before owed money
Has no unpaid TV license, train fines or large student debt
Willem Jr. score = 619 points
Portrait of another guy
Moped Mullen
Age 28 years
Lives alone on Lolland (rural area)
Single
Is co-owner of an MLM company
Has many payment agreements, which he keeps up very poorly.
His latest challenge in a long series is alimony to Jane (even though he claims the kid is not his).
Has an unpaid TV license and a train fine, but no student debt to pay back.
Moped Mullen Score = 328 points
Portrait of Yvonne
Yvonne
Age 27 years
Lives in Lemvig community
Married to and lives with Gunnar
Works at Matas
Has an earlier payment agreement (3 parking tickets), on which she pays every month.
Her latest challenge is a debt on personal tax.
She has a large student debt, as she studied music therapy in Aalborg.
No unpaid DSB fine or TV license.
Yvonne score = 584 points
Connection between the score and probability of keeping a payment agreement
[Chart: probability to keep the agreement (vertical axis: 11%, 33%, 50%, 67%, 89%) against score (horizontal axis: 300, 380, 460, 500, 540, 620, 700)]
Points that double the odds = 40
Intercept = 289
500 points = fifty-fifty chance
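The relation between score and probability can be sketched from the two facts stated on this slide: 500 points mean even odds, and every 40 points double the odds. The function name is hypothetical, and the sketch assumes the relation holds exactly over the whole score range.

```python
EVEN_ODDS_SCORE = 500.0        # "500 points = fifty-fifty chance"
POINTS_TO_DOUBLE_ODDS = 40.0   # "Points that double the odds = 40"


def probability_of_keeping_agreement(score: float) -> float:
    """Convert a score to a probability via odds = 2 ** ((score - 500) / 40)."""
    odds = 2.0 ** ((score - EVEN_ODDS_SCORE) / POINTS_TO_DOUBLE_ODDS)
    return odds / (1.0 + odds)


# Scores from the three portraits earlier in the presentation.
for name, score in [("Willem Jr.", 619), ("Yvonne", 584), ("Moped Mullen", 328)]:
    print(f"{name}: {probability_of_keeping_agreement(score):.0%}")
```

Under these assumptions the chart's tick values line up: 540 points give odds of 2:1 (67%), 620 points give 8:1 (89%), and 380 points give 1:8 (11%). The three portrait scores come out at roughly 89%, 81% and 5%.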
Connection between the score and probability of keeping a payment agreement
[Chart: same axes as before, with three persons marked: Willem Jr. (labelled Søren on the chart) at 619, Yvonne at 584, Moped Mullen (labelled Kaj on the chart) at 328]
Score Card
Persons
Bad rate in Danish municipalities
Conclusion
These improvements materialize as follows:
• A promising automated handling of the collection process.
• An effective and swift iterative process thanks to the modelling software (being able to do ETL and analysis in one operation).
• Striking reductions in collection costs and process cycle time due to the automation.
• Higher service level due to standardization and better resource distribution.
Thank you for your attention…
[email protected]