Danny Hendler
Advanced Topics in On-line Social Networks Analysis
Social networks analysis seminar
Second introductory lecture
Presentation prepared by Yehonatan Cohen
Some of the slides are based on the online book “Social Media Mining” by R. Zafarani, M. A. Abbasi & H. Liu.
Talk outline
Node centrality
• Degree
• Eigenvector
• Closeness
• Betweenness
Transitivity measures
Data mining & machine learning concepts
Decision trees
Naïve Bayes classifier
Node centrality
Name the most central/significant node:
[Figure: an example graph with nodes 1–13]
Node centrality (continued)
Name it now!
[Figure: another example graph with nodes 1–13]
Node centrality: Applications
Detection of the most popular actors in a network
Advertising
Identification of “super spreader” nodes
Health care / Epidemics
Identification of vulnerabilities in the network structure
Network design
…
Node centrality (continued)
What makes a node central?
• Number of connections
• It is central if its removal disconnects the graph
• High number of shortest paths passing through the node
• Proximity to all other nodes
• A node is central if its neighbors themselves are central
• …
Degree centrality
Degree centrality is the number of a node’s neighbours:
$C_d(v_i) = d_i$
Alternative definitions are possible
• Take into account connection strengths
• Take into account connection directions
• …
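A quick sketch (my own toy graph, not the one from the slides) of computing degree centrality with networkx:

```python
import networkx as nx

# Hypothetical toy graph for illustration
G = nx.Graph([(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)])

# Raw degree centrality: the number of neighbours of each node
print(dict(G.degree()))  # {1: 3, 2: 2, 3: 2, 4: 2, 5: 1}

# networkx also offers a normalized variant (degree / (n - 1))
print(nx.degree_centrality(G))
```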
Degree centrality: an example
12
13
10
11
8
9
7
6
1
4
2
5
3
Node
Degree
4
4
6
3
7
3
8
3
9
3
10
3
11
2
12
2
Eigenvector centrality
Not all neighbours are equal
• Popular ones (with high degree) should weigh more!
Eigenvector centrality of node $v_i$, with adjacency matrix $A$:
$c_e(v_i) = \frac{1}{\lambda} \sum_{j} A_{j,i} \, c_e(v_j)$, where $\lambda$ is the largest eigenvalue of $A$ (in matrix form, $A^{T} c_e = \lambda c_e$)
Choosing the maximum eigenvalue guarantees all vector values are positive
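A minimal sketch (again a made-up toy graph) of the underlying computation by power iteration, which converges to the eigenvector of the largest eigenvalue:

```python
import numpy as np

# Hypothetical 4-node undirected toy graph (adjacency matrix)
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

# Power iteration: repeatedly apply A and renormalize
c = np.ones(A.shape[0])
for _ in range(100):
    c = A @ c
    c = c / np.linalg.norm(c)

print(c)  # higher values = more central; all entries come out positive
```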
Eigenvector centrality: an example
Closeness centrality
If a node is central, it can reach other nodes “quickly”
• Smaller average shortest paths
$C_c(v) = \frac{1}{\bar{l}_v}$, where $\bar{l}_v = \frac{1}{n-1} \sum_{u \neq v} l_{v,u}$ is the average length of the shortest paths from $v$ to all other nodes
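Closeness on the same hypothetical toy graph; networkx’s built-in implements the inverse-average-distance definition above:

```python
import networkx as nx

# Same hypothetical toy graph as before
G = nx.Graph([(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)])

# Closeness = 1 / (average shortest-path distance to all other nodes)
print(nx.closeness_centrality(G))
```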
Closeness centrality: an example
[Figure: the 13-node example graph]

Node | Closeness
4 | 0.353
6 | 0.438
7 | 0.444
8 | 0.4
9 | 0.428
10 | 0.342
Betweenness centrality
A node is central if many shortest paths pass through it:
$C_b(v) = \sum_{s \neq t \neq v} \frac{\sigma_{st}(v)}{\sigma_{st}}$, where $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through $v$
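Betweenness on the same hypothetical toy graph:

```python
import networkx as nx

# Same hypothetical toy graph as before
G = nx.Graph([(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)])

# normalized=False gives the raw sum over shortest-path pairs
print(nx.betweenness_centrality(G, normalized=False))
```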
Betweenness centrality: an example
[Figure: the 13-node example graph]

Node | Betweenness
4 | 30
6 | 39
7 | 36
8 | 21.5
9 | 7.5
10 | 20.5
Talk outline
Node centrality
• Degree
• Eigenvector
• Closeness
• Betweenness
Transitivity measures
Data mining & machine learning concepts
Decision trees
Naïve Bayes classifier
Transitivity measures
Link prediction: which links are more likely to appear?
Transitivity is typical in social networks (a friend of my friend is likely my friend)
We need measures for such link-formation behaviour
(Global) Clustering Coefficient
$C = \frac{3 \times \text{number of triangles}}{\text{number of connected triplets}}$

Triangles: {v1,v2,v3}, {v1,v3,v4}
Triplets: (v1,v2,v3), (v2,v3,v1), (v3,v1,v2),
(v1,v3,v4), (v3,v4,v1), (v4,v1,v3),
(v1,v2,v4), (v2,v3,v4)

$C = \frac{6}{8}$
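A short check of the example, assuming its graph has exactly the edges implied by the listed triangles and triplets (v1-v2, v2-v3, v1-v3, v3-v4, v1-v4):

```python
import networkx as nx

# Edge set inferred from the triangles/triplets above (an assumption)
G = nx.Graph([("v1", "v2"), ("v2", "v3"), ("v1", "v3"),
              ("v3", "v4"), ("v1", "v4")])

# Global clustering coefficient: 3 * triangles / connected triplets
print(nx.transitivity(G))  # 0.75 == 6/8
```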
Local Clustering Coefficient
$C(v_i) = \frac{|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i(k_i - 1)/2}$

Numerator: number of connected neighbors (edges among $v_i$’s neighbors)
Denominator: number of neighbor pairs
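The per-node variant on the same assumed example graph:

```python
import networkx as nx

# Same assumed example graph as above
G = nx.Graph([("v1", "v2"), ("v2", "v3"), ("v1", "v3"),
              ("v3", "v4"), ("v1", "v4")])

# Local clustering coefficient per node:
# edges among a node's neighbours / pairs of its neighbours
print(nx.clustering(G))
# e.g. v1 has neighbours {v2, v3, v4}; 2 of the 3 pairs are
# connected (v2-v3 and v3-v4), so C(v1) = 2/3
```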
Talk outline
Node centrality
• Degree
• Eigenvector
• Closeness
• Betweenness
Transitivity measures
Data mining & machine learning concepts
Decision trees
Naïve Bayes classifier
Big Data
The rate of data production has increased dramatically
o Social media data, mobile phone data, healthcare data, purchase data…
Image taken from “Data Science and Prediction”, CACM, December 2013
Data mining / Knowledge Discovery in Databases (KDD)
Infer actionable knowledge/insights from data
o When men buy diapers on Fridays, they also buy beer
o Email spamming accounts tend to cluster in communities
o Both love & hate drive reality ratings
Involves several tasks
o Anomaly detection
o Association rule learning
o Classification
o Regression
o Summarization
o Clustering
Data mining process
Data instances
[Figure: a feature vector without a class label (unlabeled example) vs. one with a class label (labeled example)]
Example task: predict whether an individual who visits an online bookseller will buy a specific book
Categories of ML algorithms
Supervised learning algorithms
• Classification (class attribute is discrete)
  Assign data into predefined classes
  e.g., spam detection, fraudulent credit card detection
• Regression (class attribute takes real values)
  Predict a real value for a given data instance
  e.g., predict the price of a given house
Unsupervised learning algorithms
• Clustering
  Group similar items together into clusters
  e.g., detect communities in a given social network
Supervised learning process
We are given a set of labeled examples
These examples are records/instances of the form (x, y), where x is a feature vector and y is the class attribute, commonly a scalar
The supervised learning task is to build a model that maps x to y (i.e., find a mapping m such that m(x) = y)
Given an unlabeled instance (x′, ?), we compute m(x′)
E.g., fraud/non-fraud prediction
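A minimal sketch of this flow with scikit-learn; the data and the choice of a decision tree as the model m are made up for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Labeled examples (x, y): toy feature vectors and binary labels
X_train = [[0, 1], [1, 1], [1, 0], [0, 0]]
y_train = [1, 1, 0, 0]

# Build the model m that maps x to y
m = DecisionTreeClassifier().fit(X_train, y_train)

# Classify an unlabeled instance (x', ?)
print(m.predict([[1, 1]]))  # predicted class for x'
```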
Talk outline
Node centrality
• Degree
• Eigenvector
• Closeness
• Betweenness
Transitivity measures
Data mining & machine learning concepts
Decision trees
Naïve Bayes classifier
Decision tree learning - an example

Training Data:

Tid | Refund | Marital status | Taxable income | Cheat
1 | Yes | Single | 125K | No
2 | No | Married | 100K | No
3 | No | Single | 70K | No
4 | Yes | Married | 120K | No
5 | No | Divorced | 95K | Yes
6 | No | Married | 60K | No
7 | Yes | Divorced | 220K | No
8 | No | Single | 85K | Yes
9 | No | Married | 75K | No
10 | No | Single | 90K | Yes

Model (Refund, MarSt, and TaxInc are the splitting attributes; the leaf values are the class labels):

Refund
├─ Yes → NO
└─ No → MarSt
    ├─ Married → NO
    └─ Single, Divorced → TaxInc
        ├─ < 80K → NO
        └─ > 80K → YES
Purity is measured by entropy
Features are selected based on the purity of the sets they induce
To measure purity we can use (and minimize) entropy
Over a subset of training instances, T, with a binary class attribute (values in {+,−}), the entropy of T is defined as:
$\text{entropy}(T) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$
$p_{+}$ is the proportion of positive examples in T
$p_{-}$ is the proportion of negative examples in T
Entropy example
Assume there is a subset T containing 10 instances: seven have a positive class attribute value and three have a negative class attribute value [7+, 3−]. The entropy of subset T is
$\text{entropy}(T) = -0.7 \log_2 0.7 - 0.3 \log_2 0.3 \approx 0.881$
What is the range of entropy values? [0, 1]: 0 for a pure subset, 1 for a balanced one
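A direct translation of the formula, verifying the example:

```python
import math

def entropy(p_pos: float) -> float:
    """Entropy of a binary class distribution with P(+) = p_pos."""
    if p_pos in (0.0, 1.0):  # a pure subset has zero entropy
        return 0.0
    p_neg = 1.0 - p_pos
    return -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)

print(entropy(0.7))  # [7+, 3-] subset: ~0.881
print(entropy(0.5))  # balanced subset: 1.0
```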
Information gain (IG)
We select the feature that is most useful in separating between the classes to be learnt, based on IG
IG is the difference between the entropy of the parent node and the weighted average entropy of the child nodes:
$IG(T, a) = \text{entropy}(T) - \sum_{v \in \text{values}(a)} \frac{|T_v|}{|T|} \, \text{entropy}(T_v)$
We select the feature that maximizes IG
Information gain calculation example
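As a sketch of such a calculation (my own reconstruction), here is the information gain of splitting the training data above on the Refund attribute:

```python
import math

def entropy(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    h = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        h -= p * math.log2(p)
    return h

# (Refund, Cheat) pairs from the training data above
data = [("Yes", "No"), ("No", "No"), ("No", "No"), ("Yes", "No"),
        ("No", "Yes"), ("No", "No"), ("Yes", "No"), ("No", "Yes"),
        ("No", "No"), ("No", "Yes")]

parent = [cheat for _, cheat in data]
children = {"Yes": [c for r, c in data if r == "Yes"],
            "No":  [c for r, c in data if r == "No"]}

# IG = parent entropy - weighted average entropy of the children
ig = entropy(parent) - sum(len(ch) / len(parent) * entropy(ch)
                           for ch in children.values())
print(round(ig, 3))  # ~0.192
```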
Decision tree construction: example
The tree is grown over the training data above, one splitting attribute at a time:
1. Refund is chosen as the first splitting attribute. All instances with Refund = Yes have Cheat = No, so that branch becomes a NO leaf.
2. Among the Refund = No instances, the next split is on MarSt. All Married instances have Cheat = No, so that branch becomes a NO leaf.
3. For the remaining Single/Divorced instances, the final split is on TaxInc: < 80K becomes a NO leaf and > 80K a YES leaf.

Model: Decision Tree

Refund
├─ Yes → NO
└─ No → MarSt
    ├─ Married → NO
    └─ Single, Divorced → TaxInc
        ├─ < 80K → NO
        └─ > 80K → YES
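A sketch of fitting a decision tree to this training data with scikit-learn. The numeric encoding of the categorical features is my own choice, and scikit-learn’s tree uses binary numeric splits, so it may not reproduce the exact tree above:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Training data from the table above, encoded numerically:
# Refund: Yes=1, No=0; Marital status: Married=0, Single=1, Divorced=2
refund  = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
marital = [1, 0, 1, 0, 2, 0, 2, 1, 0, 1]
income  = [125, 100, 70, 120, 95, 60, 220, 85, 75, 90]
cheat   = ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"]

X = list(zip(refund, marital, income))
clf = DecisionTreeClassifier(criterion="entropy").fit(X, cheat)

# Inspect the learnt splits
print(export_text(clf, feature_names=["Refund", "Marital", "TaxInc"]))
```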
Talk outline
Node centrality
• Degree
• Eigenvector
• Closeness
• Betweenness
Transitivity measures
Data mining & machine learning concepts
Decision trees
Naïve Bayes classifier
Naïve Bayes classifier
Let Y represent the class variable with class values $(y_1, y_2, \ldots, y_n)$
Let $X = (x_1, x_2, \ldots, x_m)$ be an unclassified instance (feature vector)
The Naïve Bayes classifier estimates: $y = \operatorname{argmax}_{y_i} P(y_i \mid X)$
From Bayes’ formula: $P(y_i \mid X) = \frac{P(X \mid y_i)\, P(y_i)}{P(X)}$
Independence assumption: $P(X \mid y_i) = \prod_{j=1}^{m} P(x_j \mid y_i)$
Hence: $P(y_i \mid X) = \frac{\left(\prod_{j=1}^{m} P(x_j \mid y_i)\right) P(y_i)}{P(X)}$
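A from-scratch sketch of these formulas on made-up categorical data (the dataset and feature values are hypothetical; since P(X) is a shared denominator, comparing the numerators is enough):

```python
# Hypothetical training set: two categorical features and a class label
rows = [("sunny", "no", "N"), ("sunny", "yes", "N"), ("rainy", "no", "Y"),
        ("rainy", "yes", "N"), ("sunny", "no", "Y"), ("rainy", "no", "Y")]

def nb_score(x, y, rows):
    """P(y) * prod_j P(x_j | y), both estimated by counting."""
    in_class = [r for r in rows if r[-1] == y]
    score = len(in_class) / len(rows)          # prior P(y)
    for j, xj in enumerate(x):
        score *= sum(r[j] == xj for r in in_class) / len(in_class)
    return score

x = ("sunny", "no")                            # unclassified instance
scores = {y: nb_score(x, y, rows) for y in ("Y", "N")}
print(max(scores, key=scores.get), scores)     # argmax over the classes
```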
Naïve Bayes classifier: example
[Worked example: the figures applied $P(y_i \mid X) = \frac{P(X \mid y_i)\, P(y_i)}{P(X)}$ to a training table and an unclassified instance X; comparing the two posteriors gives the prediction $y(i_8) = N$]
Classification quality metrics
Binary classification
(Instances, Class labels): (x1, y1), (x2, y2), ..., (xn, yn)
yi is {1, −1}-valued
Classifier: provides class prediction Ŷ for an instance
Outcomes for a prediction:

Predicted \ True class | 1 | −1
1 | True positive (TP) | False positive (FP)
−1 | False negative (FN) | True negative (TN)
Classification quality metrics (cont'd)
P(Ŷ = Y): accuracy ((TP+TN) / (TP+TN+FP+FN))
P(Ŷ = 1 | Y = 1): true positive rate / recall / sensitivity (TP / (TP+FN))
P(Ŷ = 1 | Y = −1): false positive rate (FP / (FP+TN))
P(Y = 1 | Ŷ = 1): precision (TP / (TP+FP))
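A small sketch computing these metrics from the four confusion-matrix counts (toy numbers):

```python
def metrics(tp, fp, fn, tn):
    """Quality metrics derived from a binary confusion matrix."""
    return {
        "accuracy":  (tp + tn) / (tp + fp + fn + tn),
        "recall":    tp / (tp + fn),   # true positive rate / sensitivity
        "fpr":       fp / (fp + tn),   # false positive rate
        "precision": tp / (tp + fp),
    }

print(metrics(tp=40, fp=10, fn=5, tn=45))
```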
Classification quality metrics: example
Consider a diagnostic test for a disease
The test has 2 possible outcomes:
‘positive’ = suggesting presence of the disease
‘negative’ = suggesting its absence
An individual can test either positive or negative for the disease
Classification quality metrics: example (continued)
[Figure: two overlapping distributions of test results, one for individuals without the disease and one for individuals with the disease. Patients left of the decision threshold are called “negative” and those to the right “positive”, partitioning the outcomes into true positives, false positives, true negatives, and false negatives]
Machine Learning: Cross-Validation
What if we don’t have enough data to set aside a test dataset?
Cross-Validation: each data point is used both as training and as test data
Basic idea:
Fit the model on 90% of the data; test on the other 10%
Now do this on a different 90/10 split
Cycle through all 10 cases
10 “folds” is a common rule of thumb
Machine Learning: Cross-Validation
Divide the data into 10 equal pieces P1…P10
Fit 10 models, each on 90% of the data
Each data point is treated as an out-of-sample data point by exactly one of the models

Model | Test fold | Training folds
1 | P10 | P1-P9
2 | P9 | P1-P8, P10
3 | P8 | P1-P7, P9-P10
4 | P7 | P1-P6, P8-P10
5 | P6 | P1-P5, P7-P10
6 | P5 | P1-P4, P6-P10
7 | P4 | P1-P3, P5-P10
8 | P3 | P1-P2, P4-P10
9 | P2 | P1, P3-P10
10 | P1 | P2-P10
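A sketch of the same scheme with scikit-learn’s KFold (10 toy data points, one per fold for illustration; the fold order may differ from the table above):

```python
import numpy as np
from sklearn.model_selection import KFold

data = np.arange(10)  # stand-ins for P1...P10
for model_id, (train_idx, test_idx) in enumerate(KFold(n_splits=10).split(data), 1):
    print(f"model {model_id}: train on {train_idx}, test on {test_idx}")
```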