Bayesian Networks

Abdelrahman Hassan







What are Bayes nets?
What are Bayes nets used for?
What is a naïve Bayes net?
Bayes net applications
Personalized News Recommendation Based on Click Behavior
Modeling the Effectiveness of Curriculum in Educational Systems Using Bayesian Networks
Learning Discrete Bayesian Networks from Continuous Data

An expert system that captures all existing
knowledge.

Consists of
 DAG
▪ Directed acyclic graph.
 CPD
▪ Conditional probability distribution.
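[Figure: example network over nodes A, B, C, D, E, F, showing the DAG together with its CPDs P(C|A,B), P(F|C), P(D|B), P(E|B).]

[Figure: a Bayes net is not only a tree. A tree over nodes A to F is shown next to a general DAG over the same nodes.]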

$P(A) = \prod_{i=1}^{n} P(A_i \mid pa(A_i))$

$A = \{A_1, A_2, \dots, A_n\}$ are the nodes

$P(A)$ is the joint probability of the nodes A

$pa(A_i)$ are the parents of node $A_i$
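
To make the factorization concrete, here is a minimal Python sketch for the six-node example network, with edges read off the CPDs P(C|A,B), P(F|C), P(D|B), P(E|B). All probability values in `p_true` are made up for illustration and do not come from the slides; the names `parents`, `p_true`, and `joint` are likewise hypothetical.

```python
from itertools import product

# Parents of each node, read off the CPDs P(C|A,B), P(F|C), P(D|B), P(E|B).
parents = {"A": (), "B": (), "C": ("A", "B"), "D": ("B",), "E": ("B",), "F": ("C",)}

# Hypothetical CPDs: P(node = True | parent values). These numbers are invented.
p_true = {
    "A": {(): 0.3},
    "B": {(): 0.6},
    "C": {(True, True): 0.9, (True, False): 0.5,
          (False, True): 0.4, (False, False): 0.1},
    "D": {(True,): 0.2, (False,): 0.7},
    "E": {(True,): 0.8, (False,): 0.3},
    "F": {(True,): 0.75, (False,): 0.05},
}

def joint(assignment):
    """Joint probability as the product of P(A_i | pa(A_i)) over all nodes."""
    prob = 1.0
    for node, pa in parents.items():
        p = p_true[node][tuple(assignment[q] for q in pa)]
        prob *= p if assignment[node] else 1.0 - p
    return prob

nodes = list(parents)
# Summing the factorized joint over all 2^6 assignments should give 1.
total = sum(joint(dict(zip(nodes, values)))
            for values in product([True, False], repeat=len(nodes)))
all_true = joint({n: True for n in nodes})
```

Because each factor is a proper conditional distribution, the product sums to 1 over all assignments without any extra normalization.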

Serial

Diverging

Converging
[Figure: the three connection types (serial, diverging, converging) drawn over nodes A to F.]

Assume every variable has a domain of size 2: (T, F) or (H, T).

Full joint table
P(A,B,C,D,E,F)
=> 2^6 = 64 entries

Factorized with the Bayes net
P(A,B,C,D,E,F)
= P(A)P(B)P(F|C)P(C|A,B)P(D|B)P(E|B)
=> 1+1+2+4+2+2 = 12 entries
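
The two counts can be reproduced with a short Python sketch. It assumes the edge structure implied by the factorization (A and B are roots; C depends on A and B; D and E depend on B; F depends on C) and counts one free number per parent configuration for a binary variable. The function names are hypothetical.

```python
# Parents of each node, read off the factorization P(A)P(B)P(F|C)P(C|A,B)P(D|B)P(E|B).
parents = {"A": (), "B": (), "C": ("A", "B"), "D": ("B",), "E": ("B",), "F": ("C",)}

def full_joint_entries(n_vars, k=2):
    """Size of the full joint table over n_vars variables with k values each."""
    return k ** n_vars

def factored_entries(parents, k=2):
    """Numbers needed by the CPDs: (k - 1) per parent configuration of each node."""
    return sum((k - 1) * k ** len(pa) for pa in parents.values())

full = full_joint_entries(len(parents))   # 2^6
factored = factored_entries(parents)      # 1 + 1 + 4 + 2 + 2 + 2
```

The saving grows with the number of variables: the full table is exponential in the number of nodes, while the factored form is exponential only in the largest number of parents.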

Known structure
Predict outcomes or diagnose causal effects.

Unknown structure
Discover relationships.
        B=T    B=F
C=T     0.9    0.1
C=F     0.4    0.6

        B=T    B=F
D=T     0.2    0.8
D=F     0.7    0.3

[Figure: nodes B, C, D.]

Information
 30% of the US population smokes.
 Lung cancer can be found in about 70 people per 100,000.
 Tuberculosis (TB) occurs in about 10 people per 100,000.
 Bronchitis can be found in about 800 people per 100,000.
 Dyspnea can be found in about 10% of people, but most
of that is due to asthma and causes other than TB, lung
cancer, or bronchitis.

More Info
 50% of your patients smoke.
 1% have TB.
 5.5% have lung cancer.
 45% have some form of mild or chronic bronchitis.

APRI system developed at AT&T Bell Labs

NASA Vista system

Office Assistant in MS Office 97/ MS Office 95

Microsoft Pregnancy and Child-Care

Introduction

Methods

User Interest

Approach Steps

Live Traffic Impact
Steps

Analyze Google News click logs
Study the drawbacks of existing methods
Design a new method that covers all of them
Implement the new method
Combine the methods

Collaborative filtering
It recommends news stories that were read by users with
similar click history

Drawbacks
 The system cannot recommend stories that have not
yet been read by other users
 Stories on Google News are usually published within the past hour.
However, collaborative filtering has to wait several hours
to collect enough clicks before it can recommend a story
to users.

WebClipping2
Uses a Bayesian classifier to estimate the probability that a
specific article will be interesting to the user, rather than
requiring users to provide explicit feedback.

Depends on
 Reading time
 Number of lines read
 Some other characteristics

Short term
Usually related to hot news events and changes quickly.

Long term
In contrast, often reflects the user's actual interest.

Captures the dynamic changes of user
interest

Keeps the context of news trends

Discovers the genuine interest of users

Combines the genuine interest with the
current news trend to predict the user’s
current news interest.
Click distribution over a large sample of data
$D(u,t) = \left(\frac{N_1}{N_{total}}, \frac{N_2}{N_{total}}, \dots, \frac{N_n}{N_{total}}\right)$
N_i is the number of clicks on articles classified into
category c_i made by user u in month t.
N_total is the total number of clicks made by the user in
that time period.
D(u,t) represents the proportion of time the user spent
reading about each topic category and reflects the
interest distribution of the user in that month.
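
A minimal Python sketch of the click distribution for one user and one month, following the definitions of N_i and N_total above. The function name `click_distribution` and the sample clicks are hypothetical.

```python
from collections import Counter

def click_distribution(clicked_categories):
    """Map each category c_i to N_i / N_total for one user in one month."""
    counts = Counter(clicked_categories)      # N_i per category
    n_total = sum(counts.values())            # N_total
    if n_total == 0:
        return {}                             # no clicks, no distribution
    return {category: n / n_total for category, n in counts.items()}

# Hypothetical click log: the category of each article the user clicked this month.
clicks = ["sports", "sports", "world", "tech", "sports"]
dist = click_distribution(clicks)
```

Categories with no clicks are simply absent from the result, which downstream code can treat as a proportion of zero.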
Change of user's news interests over time
The d1 and d∞ distances are measured between the click
distribution of the month to be predicted and those of
previous months.
A larger d1 or d∞ distance for a month implies a bigger
difference between the click distribution of that month
and that of the month to be predicted.
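
As a sketch, and assuming d1 and d∞ denote the usual L1 distance (sum of absolute differences) and L∞ distance (largest absolute difference) between two click distributions, the comparison could look like this in Python. The function names and the two example distributions are hypothetical.

```python
def d1_distance(p, q):
    """Sum of absolute differences between two click distributions (assumed L1)."""
    categories = set(p) | set(q)
    return sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)

def dinf_distance(p, q):
    """Largest absolute difference over all categories (assumed L-infinity)."""
    categories = set(p) | set(q)
    return max((abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories), default=0.0)

# Hypothetical distributions: month to be predicted vs. one earlier month.
target_month = {"sports": 0.6, "world": 0.2, "tech": 0.2}
earlier_month = {"sports": 0.3, "world": 0.5, "tech": 0.2}

d1 = d1_distance(target_month, earlier_month)
dinf = dinf_distance(target_month, earlier_month)
```

Here the user moved 30 percentage points of clicks from world news to sports, so d1 counts the shift twice (once per category) while d∞ reports the single largest change.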

The system predicts user’s genuine news
interests regardless of the news trend, using
the user’s clicks in each past time period.

The predictions made with data in a series of
past time periods are combined to gain an
accurate prediction of the user’s genuine news
interests.

The system predicts the user’s current interests
by combining her genuine news interests and
the current news trend in her location.
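
The three steps can be sketched in Python under stated assumptions. The slides do not give the formulas, so the following are assumptions of this sketch: genuine interest in one period is taken as the ratio of the user's click share to the general population's click share in a category, periods are combined by a click-count weighted average, and a hypothetical smoothing constant `virtual_clicks` pulls users with little history toward the current trend.

```python
def current_interest(periods, current_trend, category, virtual_clicks=10):
    """Score a category for a user by combining past interest with the current trend.

    periods: list of (n_clicks, user_dist, population_dist), one per past time period
    current_trend: click distribution of the general population right now
    virtual_clicks: hypothetical smoothing constant (assumption of this sketch)
    """
    weighted_sum = 0.0
    n_total = 0
    for n_clicks, user_dist, population_dist in periods:
        population_share = population_dist.get(category, 0.0)
        if population_share <= 0.0:
            continue                      # no population data for this period
        # Step 1: per-period genuine interest, independent of the news trend.
        ratio = user_dist.get(category, 0.0) / population_share
        # Step 2: accumulate a click-weighted average over periods.
        weighted_sum += n_clicks * ratio
        n_total += n_clicks
    genuine = (weighted_sum + virtual_clicks) / (n_total + virtual_clicks)
    # Step 3: combine genuine interest with the current news trend.
    return current_trend.get(category, 0.0) * genuine

# Hypothetical data: two past periods and the current trend.
periods = [
    (10, {"sports": 0.6}, {"sports": 0.3}),
    (30, {"sports": 0.4}, {"sports": 0.2}),
]
trend = {"sports": 0.25}
score = current_interest(periods, trend, "sports")
cold_start = current_interest([], trend, "sports")
```

With no click history the score falls back to the trend alone, which is one plausible way to handle new users.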

Predicting User’s Genuine News Interest

Combining Predictions of Past Time Periods

Predicting User’s Current News Interest
Live Traffic Impact
Click-through rate (CTR) of the recommended news section
CTR of the Google News homepage
Frequency of visiting the Google News website

Introduction

Related Works

Proposed Methods

Curriculum Modeling

Modeling Results

Online education could turn into one of the
leading IT services.

A Bayesian network is built from the parameters
that affect curriculum success, treated as
variables. The network makes it possible to
assess the success of curriculum planning and
specifies the effectiveness rate of each
variable, based on the data set used in this
research.

Scientist Apprenticeship

The educational model of Germany's dual
system

The model of the Australian TAFE curriculum

The model of DACUM

The International model of the MES curriculum

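Each sample x is described by a set of attribute values.

The target function f(x) takes its values from a set V.

Given a set of training data with the corresponding
target function outputs, the goal is to predict the
target value, or class, of a new sample.

Identify the most probable class (v_MAP) given the
attribute values <a1, a2, …, an> describing the new sample:

$v_{MAP} = \arg\max_{v_i \in V} P(v_i \mid a_1, a_2, \dots, a_n)$

By Bayes' rule, this equation can be rewritten as

$v_{MAP} = \arg\max_{v_i \in V} P(a_1, a_2, \dots, a_n \mid v_i) \, P(v_i)$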

Estimating P(v_i) from the training data is easy:
count how often each v_i occurs.

Estimating the different terms P(a1, a2, …, an | v_i) is hard.

The problem is that the number of these terms
equals the number of possible samples
multiplied by the number of target function
values.

Naïve Bayes assumption: given the output of the
target function, the probability of observing the
attributes a1, a2, …, an is equal to the product of
the probabilities of each attribute separately:

$P(a_1, a_2, \dots, a_n \mid v_i) = \prod_{j} P(a_j \mid v_i)$
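
A minimal naïve Bayes classifier in Python illustrates the idea: probabilities are estimated by relative frequency in the training data, and the class with the largest product is chosen. The attribute values and class labels below are invented for illustration, and the sketch uses no smoothing, so an attribute value never seen with a class drives that class's score to zero.

```python
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (attribute_tuple, class_label) pairs."""
    class_counts = Counter(label for _, label in samples)
    # (label, attribute position) -> counts of the values seen at that position
    attr_counts = defaultdict(Counter)
    for attrs, label in samples:
        for j, value in enumerate(attrs):
            attr_counts[(label, j)][value] += 1
    return class_counts, attr_counts

def predict(model, attrs):
    """Return the class v_i maximizing P(v_i) * product over j of P(a_j | v_i)."""
    class_counts, attr_counts = model
    n = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / n                                   # P(v_i)
        for j, value in enumerate(attrs):
            score *= attr_counts[(label, j)][value] / count  # P(a_j | v_i)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: (GPA level, has publication) -> outcome.
samples = [
    (("high", "yes"), "pass"), (("high", "no"), "pass"), (("low", "yes"), "pass"),
    (("low", "no"), "fail"), (("low", "no"), "fail"), (("high", "no"), "fail"),
]
model = train(samples)
first = predict(model, ("high", "yes"))
second = predict(model, ("low", "no"))
```

Only one count per attribute value and class is stored, rather than one per combination of attribute values, which is exactly the reduction the independence assumption buys.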

Determine the parameters that affect the
success or failure of a selected curriculum
(in the terminology of Bayesian networks,
these parameters are the variables).

Modeling using the Bayesian network
▪ A set of relevant parameters, along with their values,
should be derived.
▪ The network structure should be a graph without cycles
whose nodes are the variables.
▪ For every network variable, a CPD (conditional
probability distribution) is determined.

The value of P(AG, S, A, NumC, RBG,
Pub, G, RecL, satisfaction), by the chain
rule of Bayesian networks and the network
structure for the curriculum, is equal to

The data set covers 117 M.S. students of IT from the E-Learning Center of Amirkabir University of Technology over three
consecutive terms.

According to the data set, the variables with the greatest effect on
recommendation letter approval are the current term's GPA
and published papers, with effectiveness of 1.95 and 0.93,
respectively. For the case of not receiving a recommendation
letter, the same variables matter, but with different
effectiveness: the current term's GPA and published papers,
with values of 1.05 and 0.93 respectively, again obtain the
highest values.