2 - Information Services and Technology

A Collaborative Filtering Approach to Assess Individual Condition Risk Based on Patients’ Social Network Data
Xiang Ji¹, Soon Ae Chun², James Geller¹
¹ New Jersey Institute of Technology, Newark, NJ, 07032
² City University of New York, College of Staten Island, Staten Island, NY, 10314
Abstract
§  Comorbidity refers to the phenomenon that conditions are correlated with each other. For example, some persons may develop depression that is secondary to alcohol dependence [1].
§  We propose a prediction approach that uses patients’ social network data to model comorbidity.
§  The model is simple. It generated comprehensible features as well as good results in our experiment.

Condition Risk Assessment (CRA) Approach
Patient   Conditions (time ->)
P1        C1, C2, C3, C4, C7
P2        C1, C3, C7, C8
P3        C2, C4, C8, C7
P4        C1, C5, C6
P5        C5, C7
The goal is to assess condition risk for P0:
P0 conditions (time ->): C1, C3 (head), C4, C8 (tail)

An example:
Step 1: Calculate the Target for P0:
Target = ConditionUnion – head = {C2, C4, C5, C6, C7, C8}
Step 2: Consider the first unprocessed condition, C2, in the Target (T), and get the set of all the patients who also have C2. Mark this set as Nc2. Nc2 = {P1, P3}
Step 3: Compute the similarities between the head of P0 and the patients in Nc2 → s(P0, P1) = 1 and s(P0, P3) = 0
Step 4: Calculate the utility and support of P0 getting C2, then repeat from Step 2 for the next unprocessed condition.
U0,c2 = (1 + 0)/2 = 0.5 (mean similarity over Nc2), Sc2 = 2/5 = 0.4 (share of the 5 patients who have C2)
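
The worked example (Steps 1 to 4) can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation. The poster does not define the similarity function, so `similarity` below assumes the fraction of head conditions shared with the neighbor, which is one measure that reproduces s(P0, P1) = 1 and s(P0, P3) = 0. The function names, the output tuple layout, and the ordering by utility and then support are also assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Toy data from the worked example: conditions of the five known patients.
patients: Dict[str, Set[str]] = {
    "P1": {"C1", "C2", "C3", "C4", "C7"},
    "P2": {"C1", "C3", "C7", "C8"},
    "P3": {"C2", "C4", "C8", "C7"},
    "P4": {"C1", "C5", "C6"},
    "P5": {"C5", "C7"},
}
head: Set[str] = {"C1", "C3"}  # conditions of P0 that are already known


def similarity(head: Set[str], other: Set[str]) -> float:
    """ASSUMPTION: share of the head conditions that the other patient has."""
    return len(head & other) / len(head)


def assess_risk(
    head: Set[str], patients: Dict[str, Set[str]]
) -> List[Tuple[str, float, float]]:
    """Return (condition, utility, support) tuples, highest utility first."""
    # Step 1: Target = union of all observed conditions minus the head.
    union: Set[str] = set().union(*patients.values())
    target = sorted(union - head)

    ranked = []
    for condition in target:
        # Step 2: patients who have this condition (the set N_c).
        neighbors = [p for p, conds in patients.items() if condition in conds]
        # Step 3: similarity between the head of P0 and each neighbor.
        sims = [similarity(head, patients[p]) for p in neighbors]
        # Step 4: utility = mean similarity, support = |N_c| / number of patients.
        utility = sum(sims) / len(neighbors)
        support = len(neighbors) / len(patients)
        ranked.append((condition, utility, support))

    # ASSUMPTION: order by utility, break ties by support, then by name.
    ranked.sort(key=lambda t: (-t[1], -t[2], t[0]))
    return ranked


if __name__ == "__main__":
    for condition, utility, support in assess_risk(head, patients):
        print(condition, utility, support)
```

For the toy data, C2 receives utility 0.5 and support 0.4, matching Step 4. Under the assumed ordering, C7 comes first because four of the five patients have it.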

Introduction
§  Patients’ profile and diagnosis data from the social network site PatientsLikeMe* are used as the primary data source.
§  The CRA model extends the collaborative filtering technique used in recommender systems [2].
§  The ranked list contains tuples, each representing the user, the probability of getting a certain condition, and the condition’s support value.
§  Prediction accuracy and coverage were evaluated on individual predictions.
* PatientsLikeMe, http://www.patientslikeme.com
[Figure: Now (EHR in Hospital); Future (EHR on Social Network); Condition Risk Assessment (CRA) Model; Ranked List of Future Conditions; Medication Suggestions; Evaluation of Prediction Results]

Dataset
§  There are two datasets: a patient dataset and a diagnosis dataset.*
§  The patient dataset contains basic information on 17,407 patients, including each patient’s id, username, gender, age and location.
[Figures: Gender; Age]

Evaluation Results
§  The coverage measures what percentage of the conditions diagnosed for patients in the tail is covered in the prediction list.
§  The half-life decay accuracy is the ratio of the predicted ranked list's score to that of the perfectly ranked list.
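
The coverage and half-life decay accuracy measures can be sketched as follows. This is an illustrative sketch only. The poster gives no formula for the half-life decay, so the code assumes that a hit's weight halves every `half_life` ranks, in the spirit of half-life utility scores used to evaluate recommender systems. The function names, the default `half_life = 5.0`, and the sample ranked list are hypothetical.

```python
from typing import List, Set


def coverage(predicted: List[str], tail: Set[str], k: int) -> float:
    """Share of the tail conditions that appear among the top-k predictions."""
    if not tail:
        return 0.0
    return len(set(predicted[:k]) & tail) / len(tail)


def decayed_hits(ranking: List[str], tail: Set[str], half_life: float) -> float:
    """Sum of hit weights; ASSUMED decay: the weight halves every `half_life` ranks."""
    return sum(
        0.5 ** (rank / half_life)
        for rank, condition in enumerate(ranking)  # rank 0 is the top position
        if condition in tail
    )


def half_life_accuracy(
    predicted: List[str], tail: Set[str], half_life: float = 5.0
) -> float:
    """Decayed score of the predicted list divided by that of a perfect list."""
    if not tail:
        return 0.0
    # A perfect ranking lists every tail condition before anything else.
    perfect = sorted(tail)
    return decayed_hits(predicted, tail, half_life) / decayed_hits(
        perfect, tail, half_life
    )


# Hypothetical ranked list for P0, whose tail is {C4, C8}.
predicted = ["C7", "C2", "C4", "C8", "C6", "C5"]
tail = {"C4", "C8"}

if __name__ == "__main__":
    print(coverage(predicted, tail, k=3))
    print(half_life_accuracy(predicted, tail))
```

With this sample list, coverage at k = 3 is 0.5 because only C4 of the two tail conditions is in the top three. The accuracy equals 1.0 only when every tail condition is ranked ahead of all other conditions.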

Prediction Results
Top k              5       10      20      50      100
Average Coverage   0.220   0.298   0.401   0.517   0.578
Average Accuracy   0.267   0.280   0.299   0.302   0.302

Example of new patient prediction
Patient Id   Diagnosed Conditions                                     Top 2 Predicted Conditions
296          Chronic Fatigue Syndrome, Generalized Anxiety Disorder   Migraine, Fibromyalgia
42           Eating Disorder, Phobic disorder                         Social Anxiety Disorder, PTSD
50           HIV, Seborrheic Dermatitis                               Bipolar Disorder, Lactose Intolerance

Discussion
§  Is a non-temporal or a temporal similarity measure better?
§  How to utilize users’ demographic information?

Working Directions
§  The prediction performance of the CRA approach will be compared with that of the CARE approach proposed by Davis et al. [3].
§  Prediction system implementation using the CRA model.

Acknowledgment
§  This work was funded by a PSC-CUNY Research Grant.

Conditions with Most Patients
Condition                       # of Patients
MS (Multiple Sclerosis)         3459
Fibromyalgia                    3164
Major Depressive Disorder       1624
Generalized Anxiety Disorder    1106
Chronic Fatigue Syndrome        914
ALS                             894

References
[1] Schuckit, M. A., Tipp, J. E., Bergman, M., Reich, W., Hesselbrock, V. M., & Smith, T. L. 1997. Comparison of induced and independent major depressive disorders in 2,945 alcoholics. American Journal of Psychiatry, 154(7), 948–957.
[2] Su, X. and Khoshgoftaar, T. M. 2009. A survey of collaborative filtering techniques. Adv. in Artif. Intell., (2009, 2-2).
[3] Davis, D. A., Chawla, N. V., Christakis, N. A. and Barabasi, A. L. 2010. Time to CARE: a collaborative engine for practical disease prediction. Data Min. Knowl. Discov., 20, 3 (2010), 388-415.