Harbin Institute of Technology

Collaborative Friendship Networks in Online
Healthcare Communities: An Exponential
Random Graph Model Analysis
Xiaolong Song, Shan Jiang, Xiangbin Yan, Hsinchun Chen
Outline
• Introduction
• Literature Review
• Theory and Research Hypotheses
• Research Design
• Results and Discussions
• Conclusion
Introduction
Healthcare Is Getting Social
As will be seen, our go-to source for health
and medical information is moving away
from our doctor—it is increasingly by
crowdsourcing and friendsourcing our
entrusted social network.
—The Creative Destruction of Medicine:
How the Digital Revolution Will Create
Better Health Care by Eric Topol
Health 2.0
How and Why Various Patients Form
Collaborative Friendship Online
• Relative to subscription and communication relationships,
friendships express some “strong ties” features, such as
reciprocity and emotional support.
• Unlike in the real world and on Facebook, patients in online
healthcare communities are strangers.
• Patients are motivated to find others who have experienced
or are suffering from similar health problems.
• Friendship in online healthcare communities is created
based on a common goal: patients collaborate in
expectation of positive changes in health condition.
How and Why Various Patients Form
Collaborative Friendship Online
• In this study, we use a theory-grounded
statistical modeling approach, the Exponential
Random Graph Model (ERGM), to investigate
which individual characteristics affect
the formation of collaborative friendship
networks (CFNs) in online healthcare
communities.
Literature Review
Literature Review
• Our work draws from two streams of literature.
- We first review the recent related studies on
patient networks in online healthcare
communities.
- Then we review the analytical methodology,
which builds on the literature on ERGMs.
Patient Networks in Online Healthcare
Communities
Table 1. Summary of Selected Patient Networks in Online Health Communities Studies

Literature | Types of Relationships | Research Direction | Analytical Approaches
-----------|------------------------|--------------------|----------------------
Chang (2009) | Communication | Network characteristics | Network structural analysis
Ma et al. (2010) | Friendship | Network characteristics | Network structural analysis
Durant et al. (2010) | Communication | Network characteristics | Network structural analysis
Centola (2010) | Friendship | Network influence | Social experiment
Centola (2011) | Friendship | Network influence | Social experiment
Yan et al. (2011) | Subscription | Tie formation | Logistic regression
Stewart et al. (2012) | Communication | Network characteristics | Network structural analysis
Durant et al. (2012) | Communication | Tie formation | Network structural analysis
Chomutare et al. (2013) | Communication | Network characteristics | Network structural analysis
Chuang et al. (2013) | Communication | Network characteristics | Blockmodel
Research Gaps
• With the exceptions of Durant et al. (2012) on
communication networks and Yan et al. (2011)
on subscription networks, no
research has examined the formation
mechanism of patient networks, especially
patient CFNs.
• A network-based approach that has the
capability to deal with multiple parameters
simultaneously and the interdependency of
network ties is needed.
Exponential Random Graph (p*) Models
• Strengths:
- ERGMs can estimate multiple parameters simultaneously and compare
relative importance of different generative processes.
- ERGMs assume network ties are interdependent. The approach can
explicitly capture the interdependencies of relational data [1].
• The general mathematical form of exponential random graph models is
as follows:

$$P(Y = y) = \frac{1}{\kappa} \exp\Big\{ \sum_{A} \eta_A \, g_A(y) \Big\} \qquad (1)$$

where the summation is over all configurations A, η_A is the
parameter corresponding to configuration A, g_A(y) is the network
statistic for configuration A, and κ is a normalizing quantity that
ensures a proper probability distribution. A configuration is a subset
of possible network ties [2].
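To make the roles of the parameters, the statistics g_A(y), and the normalizing quantity κ in equation (1) concrete, the following is a minimal sketch that enumerates every undirected graph on a handful of nodes and normalizes the exponential weights by brute force. The choice of statistics (edge and triangle counts), the parameter values, and all function names are illustrative assumptions, not the configurations used in this study; enumeration is only feasible for very small networks.

```python
import itertools
import math


def ergm_distribution(n_nodes, eta, stats_fn):
    """Return {graph: probability} for every undirected graph on n_nodes.

    eta      -- list of parameters, one per configuration (illustrative values)
    stats_fn -- maps a frozenset of edges to the list of statistics g_A(y)
    """
    dyads = list(itertools.combinations(range(n_nodes), 2))
    weights = {}
    for r in range(len(dyads) + 1):
        for edges in itertools.combinations(dyads, r):
            y = frozenset(edges)
            g = stats_fn(y)
            # Unnormalized weight: exp(sum_A eta_A * g_A(y))
            weights[y] = math.exp(sum(e * s for e, s in zip(eta, g)))
    kappa = sum(weights.values())  # normalizing quantity
    return {y: w / kappa for y, w in weights.items()}


def edge_and_triangle_counts(y, n_nodes=4):
    """Two hypothetical configurations: number of edges and number of triangles."""
    triangles = sum(
        1
        for a, b, c in itertools.combinations(range(n_nodes), 3)
        if {(a, b), (a, c), (b, c)} <= y
    )
    return [len(y), triangles]


# Assumed parameter values, chosen only for illustration.
dist = ergm_distribution(4, eta=[-1.0, 0.5], stats_fn=edge_and_triangle_counts)
uniform = ergm_distribution(4, eta=[0.0, 0.0], stats_fn=edge_and_triangle_counts)
```

With all parameters set to zero every graph is equally likely, which is a quick sanity check on the normalization. For networks of realistic size κ cannot be enumerated, which is why estimation relies on simulation.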
[1] Handcock, M. S., Hunter, D. R., Butts, C. T., Goodreau, S. M., & Morris, M.: Statnet: Software tools for the
representation, visualization, analysis and simulation of network data. Journal of Statistical Software. 24, 1548
(2008)
[2] Robins, G.: Exponential random graph models for social networks. Handbook of Social Network Analysis.
Sage (2011)
Exponential Random Graph (p*) Models
• In recent studies, Wimmer and Lewis use ERGMs to
investigate racial homophily in a friendship network
on Facebook [1]. Direct reciprocity and indirect
reciprocity in network exchange in online communities have
been found using ERGMs [2]. A “performance-based
clustering” phenomenon is observed within a large open-source
community by examining strategic selection and
homophily [3].
• These studies suggest that ERGMs are a promising
class of models for studying tie formation in social
media contexts, which meets our need.
[1] Wimmer A and Lewis K. Beyond and below racial homophily: ERG Models of a friendship network documented on
Facebook. American Journal of Sociology. 2010; 116: 583-642.
[2] Faraj S and Johnson SL. Network exchange patterns in online communities. Organization Science. 2011; 22: 1464-1480.
[3] Shen C and Monge P. Who connects with whom? A social network analysis of an online open source software community.
First Monday. 2011; 16.
Theory and Research Hypotheses
Homophily
Social ties are more likely to occur
between individuals with common
features or similar attributes [1].
[1] McPherson, M., Smith-Lovin, L., & Cook, J. M.: Birds of a feather: Homophily in
social networks. Annual Review of Sociology. 415-444 (2001)
Homophily
• Patients generally have a variety of individual attributes and
health interests, so we need to identify which types of shared
characteristics lead to homophily.
• Regarding friendship, existing evidence reveals that
similarity in some demographic attributes, such as gender, can
affect friendship formation in the real world.
• Besides demographic attributes, patients also have
health-relevant attributes, which are related to their information
needs.
• Prior studies have shown that treatments and health
condition are the most popular topics online, so we also
take them into account.
Hypotheses
Gender-homophily
• As mentioned above, sharing the same gender can affect
friendship formation in the real world. However,
a growing number of studies have found no
gender homophily in online settings.
Hypothesis 1: The effect of same gender on
collaborative friendship formation is insignificant.
Hypotheses
Treatment-homophily
• Individuals want information about the risks and benefits of their
treatments.
• The treatment experience of others can serve as an important
reference.
Hypothesis 2.1: Patients with the same treatments tend to establish
collaborative friendships.
• The bigger the difference in the number of treatments between two
patients, the more likely they are to differ in health condition.
• Recent research has found that a similar number of treatments has a
positive influence on the formation of subscription relationships.
Hypothesis 2.2: Patients with a similar number of treatments tend to
establish collaborative friendships.
Hypotheses
Health condition-homophily
• Health condition includes disease severity and disease duration.
• Existing research suggests that individuals prefer not to connect with
others in worse condition than themselves. For example, nonoverweight
people prefer to befriend nonoverweight peers.
Hypothesis 3.1: Patients in good health status are more likely to
develop collaborative friendships.
• Disease duration is often related to complications. Similarity
in disease duration makes patients more comparable.
Hypothesis 3.2: Patients with similar disease duration are more
likely to develop collaborative friendships.
Research Design
Research Framework
Figure 1. ERGM analysis of collaborative friendship networks in online healthcare communities.
Datasets
• We collect data from TuDiabetes (http://www.tudiabetes.org/), a
worldwide online diabetes community launched in 2007.
The website allows members to make friends, providing a
fitting context for our purpose.
• Two sets of information were collected: (a) all
the profile information of users who joined the
community by July 5, 2013; and (b) all the
friendship data of these users during that period. The
process resulted in a dataset of 2118 users and 4134
friendship ties.
Measure
• We identify a patient as having good health status if her
HbA1C% is less than or equal to 7%. This value is
suggested by the American Diabetes Association as the
target for most non-pregnant adults with diabetes*.
• We used the difference in the year of diagnosis to measure
the similarity of disease duration. Because year-level
differences vary widely in scale, we applied a log
transformation.
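The two measures described on this slide can be sketched as small helper functions. This is only an illustration under stated assumptions: the slide does not say exactly how the log transformation was applied, so the form log(1 + |difference|) used here is a guess that keeps identical diagnosis years well defined, and the function names are hypothetical.

```python
import math

HBA1C_TARGET = 7.0  # percent; threshold stated on the slide


def good_health_status(hba1c_percent):
    """True if HbA1C% is less than or equal to the 7% target."""
    return hba1c_percent <= HBA1C_TARGET


def diagnosis_duration_difference(year_a, year_b):
    """Log-transformed difference in year of diagnosis between two patients.

    Assumption: log(1 + |year_a - year_b|); the exact transformation used
    in the study is not given in the slides.
    """
    return math.log1p(abs(year_a - year_b))
```

Under this assumed form, two patients diagnosed in the same year have a difference of zero, and the measure grows slowly as the gap in diagnosis years widens.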
*http://www.diabetes.org/living-with-diabetes/treatment-and-care/blood-glucosecontrol/checking-your-blood-glucose.html/.
ERGM Analysis
Table 2. Research Hypothesis, Parameter and Configuration

Hypothesis | Parameter | Configuration
-----------|-----------|--------------
Hypothesis 1: The effect of same gender on collaborative friendship formation is insignificant. | [gender]-interaction | [diagram]
Hypothesis 2.1: Patients with same treatments tend to establish collaborative friendships. | [treatment]-matching | [diagram]
Hypothesis 2.2: A similar number of treatment increases the probability for patients to establish collaborative friendships. | [number of treatment]-difference | [diagram]
Hypothesis 3.1: Patients in good health status are likely to build collaborative friendships. | [good health status]-interaction | [diagram]
Hypothesis 3.2: Patients with similar disease duration are likely to form collaborative friendships. | [diagnosis duration]-difference | [diagram]
ERGM Analysis
• In ERGMs, configurations are constructed to represent
different hypotheses.
• Among others, the parameter [Attr]-difference measures the
absolute difference between two continuous attributes.
• So for hypotheses 2.2 and 3.2, if the corresponding
parameters are negative and significant, we can say the
hypotheses are supported.
• We estimate the parameters by Markov Chain Monte Carlo
maximum likelihood estimation, as suggested by prior
studies.
- The ERGMs generate random networks and compare with the observed
network in network statistics.
- The more similar the two networks are, the better the ERGM estimations
are.
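As a rough illustration of what the attribute-based parameters count, the sketch below computes matching, interaction, and difference statistics on a tiny made-up network. The definitions follow a common convention (matching counts ties between nodes with the same categorical value, interaction counts ties where both endpoints have a binary attribute equal to 1, difference sums absolute differences over ties); treating these as the exact definitions behind the parameters in this study is an assumption, and all data and names are hypothetical.

```python
def matching_stat(edges, attr):
    """Number of ties whose endpoints share the same categorical value."""
    return sum(1 for i, j in edges if attr[i] == attr[j])


def interaction_stat(edges, attr):
    """Number of ties whose endpoints both have the binary attribute set to 1."""
    return sum(1 for i, j in edges if attr[i] == 1 and attr[j] == 1)


def difference_stat(edges, attr):
    """Sum over ties of the absolute difference in a continuous attribute."""
    return sum(abs(attr[i] - attr[j]) for i, j in edges)


# Hypothetical four-patient network (not data from the study).
edges = [(0, 1), (1, 2), (2, 3)]
treatment = ["insulin", "insulin", "diet", "insulin"]
good_health = [1, 1, 0, 1]
n_treatments = [2, 3, 1, 1]

treatment_matching = matching_stat(edges, treatment)
health_interaction = interaction_stat(edges, good_health)
treatment_count_difference = difference_stat(edges, n_treatments)
```

Because the difference statistic grows when tied patients are dissimilar, a negative and significant difference parameter indicates that ties favor similar patients, which is why hypotheses 2.2 and 3.2 are read from the sign of the estimate.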
Evaluation
Goodness-of-fit test
• To assess how well the ERGM
fits the observed network, we conduct a
goodness-of-fit test.
- We generate 100,000,000 simulated
networks and draw 1,000 samples to
compare with the observed network on a series
of network statistics.
- If the differences between them are small, we
can conclude that our model fits well.
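The simulate-then-sample procedure described above can be sketched with a toy Metropolis sampler. This is not the software used in the study: it assumes an ERGM with a single edge-count statistic, proposes one tie toggle per step, and keeps every `thin`-th network after a burn-in, so the chain lengths, the parameter value, and the function name are all illustrative.

```python
import itertools
import math
import random


def sample_edge_counts(n_nodes, theta, n_samples, thin, burn_in, seed=0):
    """Simulate an edge-only ERGM and return edge counts of thinned samples."""
    rng = random.Random(seed)
    dyads = list(itertools.combinations(range(n_nodes), 2))
    edges = set()
    counts = []
    total_steps = burn_in + n_samples * thin
    for step in range(1, total_steps + 1):
        dyad = rng.choice(dyads)
        delta = -1 if dyad in edges else 1  # change in the edge statistic
        # Metropolis acceptance probability: min(1, exp(theta * delta))
        if rng.random() < math.exp(min(0.0, theta * delta)):
            edges.symmetric_difference_update({dyad})
        # Keep one network every `thin` steps once burn-in is over.
        if step > burn_in and (step - burn_in) % thin == 0:
            counts.append(len(edges))
    return counts


simulated = sample_edge_counts(
    n_nodes=6, theta=0.0, n_samples=200, thin=37, burn_in=500
)
mean_edges = sum(simulated) / len(simulated)
```

The sampled statistics play the role of the 1,000 picked networks: their mean and spread are what the observed network's statistics are compared against.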
Results and Discussion
Results and Discussion
Table 3. Results of ERGM Estimates

Type | Hypothesis | Estimate | Std dev | t-statistics | Result
-----|------------|----------|---------|--------------|-------
Demographic homophily | H1 | 0.086700 | 0.17196 | 0.04368 | Supported
Treatment experience homophily | H2.1 | 0.255650* | 0.05300 | -0.05804 | Supported
Treatment experience homophily | H2.2 | 0.209109* | 0.01900 | -0.00381 | NOT supported
Health condition homophily | H3.1 | 0.506711* | 0.15458 | -0.03431 | Supported
Health condition homophily | H3.2 | -2.094521 | 2.68512 | -0.04568 | NOT supported

Notes:
t-statistics = (observation - sample mean)/standard error
* means statistical significance
Results and Discussion
• Following prior studies, a parameter is considered significant if the absolute
value of the estimate is at least twice the standard error.
• H1 is supported. This result is consistent with the previous finding that gender
homophily does not appear in social media [1,2].
• H2.1 is supported, while H2.2 is not supported. Patients with shared treatments
tend to face similar problems and have the same information needs [3].
• H3.1 is supported. This finding provides new evidence to support prior research
[4] that health status similarity can also affect friend selection in social media.
• H3.2 is not supported. One possible explanation is that patients with short
disease duration also need to learn how to prevent complications and make
friends with others who have longer illness experience.
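The significance rule stated above can be checked directly against the estimates reported in Table 3. The sketch below is only a worked illustration: it treats the "Std dev" column as the standard error, which is an assumption, and the dictionary and function names are hypothetical.

```python
# (estimate, standard error) pairs copied from Table 3.
table3 = {
    "H1": (0.086700, 0.17196),
    "H2.1": (0.255650, 0.05300),
    "H2.2": (0.209109, 0.01900),
    "H3.1": (0.506711, 0.15458),
    "H3.2": (-2.094521, 2.68512),
}


def is_significant(estimate, std_err):
    """Rule of thumb from the slide: |estimate| is at least twice the standard error."""
    return abs(estimate) >= 2 * std_err


significant = {h: is_significant(est, se) for h, (est, se) in table3.items()}
```

Under this rule the parameters for H2.1, H2.2, and H3.1 are significant, matching the asterisks in Table 3. H2.2 is still reported as not supported because its difference parameter is positive rather than negative, and H1 is supported precisely because its parameter is not significant.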
[1] Thelwall, M.: Homophily in myspace. Journal of the American Society for Information Science and Technology. 60,
219-231 (2008)
[2] Yan, L., Tan, Y., & Peng, J.: Network dynamics: How can we find patients like us?, Available at SSRN.
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1820748 (2011)
[3] Charnock, D., Shepperd, S., Needham, G., & Gann, R.: DISCERN: an instrument for judging the quality of written
consumer health information on treatment choices. Journal of Epidemiology and Community Health. 53, 105-111 (1999)
[4] La Haye, K., Robins, G., Mohr, P., & Wilson, C.: Homophily and contagion as explanations for weight similarities
among adolescent friends. Journal of Adolescent Health. 49, 421-427 (2011)
In Sum
• First, patients make friends in online healthcare
communities without regard to gender,
which could help healthcare knowledge
flow across a wider range of patients.
• Second, in order to acquire effective and
consistent peer support, patients build
collaborative friendships with others who have
similar health status and current treatments.
Goodness of Fit Tests
Table 4. Results of Goodness-of-fit Test

Parameter | Observed value | Mean | Std dev | t-Ratio
----------|----------------|------|---------|--------
edge | 3183 | 3182.971 | 40.949 | 0.001
[gender]-interaction | 43 | 42.876 | 6.899 | 0.018
[treatment]-matching | 580 | 581.523 | 21.783 | -0.070
[number of treatment]-difference | 4233.000 | 4206.319 | 74.253 | 0.359
[good health status]-interaction | 61 | 60.223 | 6.898 | 0.113
[diagnosis duration]-difference | 23.993 | 23.978 | 0.493 | 0.031

Notes:
t-Ratio = (observation - sample mean)/standard error
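The t-ratio formula in the note can be applied to the reported observed values, means, and standard deviations as a consistency check. This sketch uses the "Std dev" column as the denominator, which is an assumption about how the note's "standard error" maps onto the table; the names are hypothetical.

```python
def t_ratio(observed, sample_mean, std_dev):
    """(observation - sample mean) / standard error, as defined in the note."""
    return (observed - sample_mean) / std_dev


# (observed, mean, std dev) triples copied from the goodness-of-fit table.
gof_rows = {
    "edge": (3183, 3182.971, 40.949),
    "[gender]-interaction": (43, 42.876, 6.899),
    "[treatment]-matching": (580, 581.523, 21.783),
    "[number of treatment]-difference": (4233.000, 4206.319, 74.253),
    "[good health status]-interaction": (61, 60.223, 6.898),
    "[diagnosis duration]-difference": (23.993, 23.978, 0.493),
}

t_ratios = {name: round(t_ratio(*row), 3) for name, row in gof_rows.items()}
```

The recomputed values agree with the table to three decimals for the first five rows; the last row differs in the third decimal, presumably because the published inputs are themselves rounded. All ratios are small in absolute value, which is the sense in which the simulated networks resemble the observed one.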
Conclusion
Implications
• Health-homophily such as treatment homophily
and health-status homophily can increase the
likelihood of collaborative friendship formation.
Taking account of these factors can help health
social media improve the friend-seeking service
and promote users’ socialization.
Limitations
• A limitation of this study is that we focus
only on a diabetes setting, so we are less
confident that our findings generalize
to other illnesses. Further work will
extend to broader contexts.
Thank You