Title of Presentation

Aim :
to contribute to the building of effective and efficient
methods to predict clinical outcome that can be
constructed from routinely collected data
Tessy Badriyah
Healthy Computing, 1nd June 2011
Methodology
Dataset
Model was built from Biochemistry and Haematology Outcome Model (BHOM)
dataset, during 12-month study period => 17,417 patients.
The fields are : death - at discharge - F=alive, T =dead (class attribute), age at
admission, mode of admission (mostly emergency, but some elective), gender,
haemoglobin, white cell count, urea, serum sodium, serum potassium,
creatinine, urea / creatinine
Statistical Analysis
The statistical analysis to asses the overall performance of the model are
discrimination (area under ROC curve or c-index) and calibration (chi-test)
Design Experiment
We conducted our experiment using Logistic Regression as standard method
(‘gold standard’) in the Health Care data and several methods in machine
learning techniques (Decision Trees, Neural Networks, K-Nearest Neighbour,).
We used 10-fold cross validation method and the process was repeated 10
times to avoid bias during the formation of the cross validation data splits.
0.9
0.85
0.8
0.75
0.7
Logistic
Regression
Decision Trees
Neural Network
K-Nearest
Neighbour
Both Logistic regression and Machine Learning methods
have shown a good results
Machine Learning methods is worth to looking at to
predict clinical outcome
In another experiment, we were increasing the number of
non-survivors in the dataset to test the hypothesis that
the discrimination can be improved when the proportion
of non-survivors has been increased => hypothesis
proved.
Further investigation and model development to predict
another clinical outcome (e.g. readmission rates) which
has different types of target data with previous clinical
outcome.