tutorial3updated

Contents
0. Processing the Tie Data and Create a VNA File for Binary Network
1. Appearance of UCINET
2. UCINET dataset
2.1 Set Default Folder
2.2. Data Format
2.3 Save as UCINET files in Netdraw
2.4 Import from VNA format
3. Node Level Analysis
3.1 Degree
3.2 Betweenness
3.3 Closeness
3.4 Eigenvector
4. Link and Group Level Analysis
4.1 Length and Distance
4.2 Network Components
4.3 Structural Holes
5. Network Level Analysis
5.1 Average Path Length
5.2 Clustering coefficient
5.3 Degree distribution
0. Processing the Tie Data and Create a VNA File for Binary Network
Create a table named relation in MySQL Workbench by executing the SQL query:
create table tutorial.relation(fromid nvarchar(50), toid nvarchar(50));
Insert the binary relation data into this table by executing the query:
-- Pair every two distinct accounts that contributed to the same project.
-- Each pair is stored in both directions: (a, b) and (b, a).
insert into tutorial.relation
select distinct p.account_id as a, q.account_id as b
from tutorial.contributor as p, tutorial.contributor as q
where p.projectid = q.projectid and p.account_id != q.account_id;
Select and copy the account data as the node data.
Select and copy the relation data as the tie data.
Save the result as a VNA file.
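The steps above can also be sketched in Python. The sketch below is an illustration, not part of the original workflow: the sample rows are hypothetical, the function names are made up for this example, and the VNA layout (a "*node data" section with an ID column followed by a "*tie data" section with from, to and strength columns) is assumed from the commonly used NetDraw format. Ids containing spaces would need quoting, which is not handled here.

```python
from itertools import permutations


def co_membership_ties(rows):
    """Return distinct directed (from, to) pairs for accounts sharing a project.

    rows: iterable of (account_id, project_id) tuples.
    """
    members_by_project = {}
    for account, project in rows:
        members_by_project.setdefault(project, set()).add(account)
    ties = set()
    for members in members_by_project.values():
        # permutations yields both (a, b) and (b, a), like the SQL self-join
        ties.update(permutations(members, 2))
    return sorted(ties)


def to_vna(nodes, ties):
    """Render node ids and binary ties as VNA text (assumed layout)."""
    lines = ["*node data", "ID"]
    lines += [str(node) for node in nodes]
    lines += ["*tie data", "from to strength"]
    lines += [f"{a} {b} 1" for a, b in ties]
    return "\n".join(lines) + "\n"


# Hypothetical contributor rows: (account_id, projectid)
rows = [("alice", "p1"), ("bob", "p1"), ("bob", "p2"), ("carol", "p2")]
ties = co_membership_ties(rows)
nodes = sorted({account for account, _ in rows})
vna_text = to_vna(nodes, ties)
print(vna_text)
```

Writing vna_text to a file with a .vna extension should give a file that can be opened through the VNA import described in section 2.4.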
1. Appearance of UCINET
When UCINET is started, the following window appears.
The submenu buttons give access to all of the routines in UCINET; they are grouped into File,
Data, Transform, Tools, Network, Visualize, Options and Help. The buttons located below
these are simply shortcuts to routines in the submenus. The default directory shown at the
bottom is where UCINET reads data and stores files unless otherwise specified. This
directory can be changed by clicking on the button to its right.
2. UCINET dataset
2.1 Set Default Folder
It's often a good idea to set up a new directory for each project, and to set the default to this new
directory using the file-cabinet icon on the toolbar, or File>Change default folder.
2.2. Data Format
UCINET datasets are stored in a special (Pascal) format, but can be created and manipulated using
both UCINET's own tools and other software (text editors and spreadsheets). Each UCINET dataset
consists of two separate files: one contains the header information (e.g. myfile.##h) and the other
the data lines (e.g. myfile.##d). UCINET stores attribute data and network data in separate files.
Because of this somewhat unusual way of storing data, it is best to create datasets with the internal
spreadsheet editor or the DL language tools, or to import text or spreadsheet files and save the
results as UCINET files.
2.3 Save as UCINET files in Netdraw
Click on File/Save Data As/Ucinet. You will see:
• Binary Network: saves the network data as binary, i.e., the value of each tie will be 1.
• Valued Network: saves the network data with the value of each tie (e.g., strength).
• Attribute: saves the attribute data (e.g., country, kudo_rank...).
2.4 Import from VNA format
Click on Data in the menu and choose Import text file/VNA.
3. Node Level Analysis
3.1 Degree
PURPOSE Calculates the degree and normalized degree centrality of each vertex and gives the overall
network degree centralization.
Click on Network/Centrality and Power/Degree.... You will see:
PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued Graph
Treat data as symmetric: (Default = Yes).
If Yes, directed data is automatically converted to undirected by taking the underlying graph.
No gives a separate analysis for in-degrees and out-degrees.
Count reflexive ties (diagonal values)? (Default = No).
No means that self-loops are ignored.
Output dataset: (Default = 'FreemanDegree').
After execution, a text file appears with the results of the degree analysis.
[Screenshot: degree output log, with callouts marking the top 5 nodes and the average degree]
A table which lists the degree and the normalized degree (n Degree) centrality, expressed as a
percentage, for each vertex, together with the share. The share is the centrality measure of the actor divided by
the sum of all the actor centralities in the network. The rows are ordered so that the actor with the highest
centrality appears first. Note that the stored UCINET output file retains the original order.
Descriptive statistics which give the mean, standard deviation, variance, minimum value and maximum value for
each list generated. These are followed by the degree network centralization index, expressed as a percentage.
When the data is treated as symmetric, a heterogeneity measure is also produced: the sum of the squares of the
proportion of the total centrality held by each actor.
For directed data the tables are the same as for undirected data, except that separate values are calculated for
in-degrees and out-degrees.
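To make the quantities in this output concrete, here is a minimal Python sketch for a binary, undirected network without self-loops. The share and heterogeneity follow the definitions given above; the normalized degree (degree divided by n - 1) and the centralization index use the standard Freeman formulas, which are assumed here to match what UCINET reports. The function name and the star-shaped example network are made up for illustration.

```python
def degree_report(adj):
    """adj: dict mapping each node to the set of its neighbours (undirected)."""
    n = len(adj)
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    total = sum(degree.values())
    # Normalized degree as a percentage of the maximum possible degree, n - 1
    n_degree = {v: 100.0 * d / (n - 1) for v, d in degree.items()}
    # Share: an actor's centrality divided by the sum of all centralities
    share = {v: d / total for v, d in degree.items()}
    # Heterogeneity: sum of squared shares
    heterogeneity = sum(s * s for s in share.values())
    # Freeman centralization: observed spread relative to that of a star network
    d_max = max(degree.values())
    centralization = 100.0 * sum(d_max - d for d in degree.values()) / ((n - 1) * (n - 2))
    return degree, n_degree, share, heterogeneity, centralization


# Hypothetical star network: "a" is tied to "b", "c" and "d"
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
degree, n_degree, share, heterogeneity, centralization = degree_report(star)
# List the actors with the highest centrality first, as in the log
for node in sorted(degree, key=degree.get, reverse=True):
    print(node, degree[node], round(n_degree[node], 2), round(share[node], 3))
```

A star network is the most centralized case, so its centralization index is 100 percent, which makes it a convenient check.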
3.2 Betweenness
PURPOSE Calculates the betweenness and normalized betweenness centrality of each vertex and gives the
overall network betweenness centralization.
Click on Network/Centrality and Power/Freeman Betweenness/Node Betweenness. You will see:
PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph.
Output dataset: (Default = 'FreemanBetweenness').
Name of file which will contain betweenness and normalized betweenness centrality of each vertex.
After execution, you will see:
[Screenshot: betweenness output log, with callouts marking the top 5 nodes and the average betweenness]
A table which lists the betweenness and the normalized betweenness centrality, expressed as a
percentage, for each vertex. The rows are ordered so that the actor with the highest centrality appears first.
Note that the stored UCINET output file retains the original order.
Descriptive statistics which give the mean, standard deviation, variance, minimum value and maximum value for
both lists. These are followed by the betweenness network centralization index, expressed as a percentage.
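As an illustration of what the betweenness table contains, the sketch below computes node betweenness for a small unweighted, undirected network using Brandes' algorithm. This is a swapped-in, well-known technique chosen for the example; the tutorial does not say how UCINET computes the values internally. The normalization (dividing by (n - 1)(n - 2) / 2, the maximum for an undirected graph) is the standard Freeman one and is an assumption here, as are the function names and the example network.

```python
from collections import deque


def betweenness(adj):
    """Node betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                      # nodes in order of discovery
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints
    return {v: b / 2 for v, b in score.items()}


def normalized_betweenness(bet):
    """Betweenness as a percentage of the undirected maximum."""
    n = len(bet)
    maximum = (n - 1) * (n - 2) / 2
    return {v: 100.0 * b / maximum for v, b in bet.items()}


# Hypothetical path network a - b - c - d
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
bet = betweenness(path)
n_bet = normalized_betweenness(bet)
print(bet, n_bet)
```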
3.3 Closeness
PURPOSE Calculates the farness and normalized closeness centrality and variants of each vertex and gives the
overall network closeness centralization.
Click on Network/Centrality and Power/Closeness. You will see:
PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph
Type:
Choices are:
Sum of geodesic distances (Freeman): distances are lengths of shortest paths, the standard Freeman measure.
Sum of reciprocal distances: distances are the reciprocals of the lengths of the geodesic paths.
Avg of reversed distances (Valente-Foreman): the reversed distance is the diameter minus the geodesic distance.
All paths: distances between actors are the sums of the distances on all paths connecting them.
All trails: distances between actors are the sums of the distances on all trails connecting them.
Output Dataset: (Default = 'Closeness')
Name of file which will contain farness and normalized closeness centrality of each vertex.
After execution, you will see:
[Screenshot: closeness output log, with callouts marking the top 5 nodes and the average closeness]
A table which lists the farness (or closeness) and the normalized closeness centrality, expressed as a
percentage, for each vertex. The rows are ordered so that the actor with the highest centrality appears first.
Note that the stored UCINET output file retains the original order. If the data is directed then in-closeness and
out-closeness are calculated separately. Note that in-closeness for the reversed closeness is called integration and
the out-closeness is called radiality.
Descriptive statistics which give the mean, standard deviation, variance, minimum value and maximum value for
both lists. These are followed by the closeness network centralization index, expressed as a percentage. If the data
is directed then separate in and out values are calculated.
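The sketch below illustrates the first option, the sum of geodesic distances (Freeman), for a connected, unweighted, undirected network. Farness is the sum of shortest-path lengths from a node to all others; the normalized closeness is taken here as (n - 1) divided by farness, expressed as a percentage, which is the standard Freeman normalization and is assumed to correspond to the value in the table. The function names and the example network are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def freeman_closeness(adj):
    """Return {node: (farness, normalized closeness in percent)}."""
    n = len(adj)
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        if len(dist) < n:
            # Farness is undefined when some node cannot be reached
            raise ValueError("this sketch requires a connected network")
        farness = sum(dist.values())
        result[v] = (farness, 100.0 * (n - 1) / farness)
    return result


# Hypothetical path network a - b - c - d
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
closeness = freeman_closeness(path)
for node, (farness, n_closeness) in closeness.items():
    print(node, farness, round(n_closeness, 2))
```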
3.4 Eigenvector
PURPOSE Calculates the eigenvector of the largest positive eigenvalue as a measure of centrality.
Click on Network/Centrality and Power/Eigenvector.... You will see:
PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued Graph (Symmetric data only).
Output dataset: (Default = 'BonacichCentrality').
Name of file which will contain eigenvector centrality measure for each vertex.
After execution, you will see:
[Screenshot: eigenvector output log, with callouts marking the nodes with high eigenvector centrality and the average eigenvector centrality]
A table of positive eigenvalues. The eigenvalues are placed in descending order under the heading VALUE. The
table gives information on 'how dominant' the largest eigenvalue is. The table gives the percentage and
cumulative percentage of the total eigenvalue sum for each eigenvalue. The ratio of each eigenvalue to the next
largest is also presented.
This is followed by a list of vertices which contains the eigenvector and normalized eigenvector centrality
measure for every vertex. These values should be interpreted in terms of an interval scale.
Finally, the network eigenvector centralization index, expressed as a percentage, is given.
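For intuition, the eigenvector belonging to the largest eigenvalue of a symmetric, binary network can be approximated by power iteration, as in the sketch below. This is an illustrative technique, not a description of UCINET's own routine. The iteration multiplies by A + I rather than A (a common shift that keeps the same eigenvectors and avoids oscillation on bipartite networks), and the result is scaled to unit length; UCINET's normalized values may be scaled differently. All names and the example network are assumptions.

```python
import math


def eigenvector_centrality(adj, iterations=200, tol=1e-12):
    """Power iteration on A + I for a connected, undirected network."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # (A + I) x: each node keeps its own score and adds its neighbours'
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {v: value / norm for v, value in new.items()}
        change = max(abs(new[v] - x[v]) for v in adj)
        x = new
        if change < tol:
            break
    # Rayleigh quotient x' A x gives the largest eigenvalue of A (x has unit length)
    eigenvalue = sum(x[v] * sum(x[w] for w in adj[v]) for v in adj)
    return x, eigenvalue


# Hypothetical star network: "a" is tied to "b", "c" and "d"
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
scores, largest = eigenvector_centrality(star)
print(scores, largest)
```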
4. Link and Group Level Analysis
4.1 Length and Distance
PURPOSE Trace all paths between specified nodes in the network.
Click on Network/Paths. You will see:
PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph
From node
Label or id number of starting node.
To node
Label or id number of terminating node.
Type of paths desired
Length<=K gives all paths up to length K.
Avg strength>=K gives paths whose average value of weights is greater than or equal to K.
Value of K
The value of K to be used. Auto will find the minimum value for which a path exists and hence this will be set to
the distance between the nodes and all paths will be geodesics.
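The "Length<=K" option and the Auto setting can be sketched as a depth-first enumeration of simple paths, shown below for an unweighted network. This is a minimal illustration under the assumption that "paths" means paths without repeated nodes; the function names and the example network are hypothetical, and the "Avg strength>=K" option is not covered.

```python
def paths_up_to(adj, source, target, max_len):
    """All simple paths from source to target with at most max_len ties."""
    found = []

    def extend(path):
        node = path[-1]
        if node == target:
            found.append(list(path))
            return
        if len(path) - 1 == max_len:    # the path already uses max_len ties
            return
        for nxt in sorted(adj[node]):
            if nxt not in path:         # never revisit a node
                path.append(nxt)
                extend(path)
                path.pop()

    extend([source])
    return found


def auto_k(adj, source, target):
    """Smallest K for which a path exists, i.e. the geodesic distance."""
    for k in range(len(adj)):
        if paths_up_to(adj, source, target, k):
            return k
    return None


# Hypothetical network with ties a-b, a-c, b-c, b-d, c-d
net = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"}, "d": {"b", "c"}}
k = auto_k(net, "a", "d")
geodesics = paths_up_to(net, "a", "d", k)
longer = paths_up_to(net, "a", "d", 3)
print(k, geodesics, longer)
```

With K set by auto_k, every path returned is a geodesic, which mirrors the behaviour described for Auto.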
After execution, you will see:
4.2 Network Components
PURPOSE Identifies the components of an undirected graph, or the weak or strong components of a directed
graph, and finds the main component.
Click on Network/Regions/Components/Simple Graphs. You will see:
PARAMETERS
Input dataset:
Name of file containing network data to be analyzed. Data type: Directed graph.
Minimum Size to save: (Default = 3)
Size of smallest component which is to be saved in the component by actor incidence matrix specified below.
Kind of components: (Default = Strong)
For directed data specify whether Strong or Weak components are required. For undirected data either choice
will yield the components.
Output sets: (Default = 'Components_Sets')
Name of file which will contain a component by actor incidence matrix. A 1 in row i column j means that node j
is in component i. This file is not displayed in the LOG FILE.
Output Partition matrix: (Default = 'Components_Partition')
Name of file which will contain a partition vector. A j in the ith position means that node i is a member of
component j. This file is not displayed in the LOG FILE.
Output Main Comp. Vec: (Default = 'InMainComp')
Name of file which contains an indicator vector where a 1 in row i indicates that i is a member of the main
component and a 0 indicates that it is not a member. This file is not displayed in the LOG FILE.
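The partition vector and the main-component indicator described above can be illustrated with a short Python sketch for an undirected network (for directed data this corresponds to weak components of the underlying graph; strong components are not covered). The function name and the example network are hypothetical, and the main component is taken to be the largest one.

```python
def find_components(adj):
    """Connected components of an undirected network, as sorted node lists."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            v = stack.pop()
            members.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(members))
    return components


# Hypothetical network: a-b-c form one component, d-e another, f is isolated
net = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"},
    "f": set(),
}
components = find_components(net)
# Partition vector: node i belongs to component j (numbered from 1)
partition = {v: j for j, members in enumerate(components, start=1) for v in members}
# Indicator vector: 1 if the node is in the largest (main) component, else 0
main = max(components, key=len)
in_main = {v: int(v in main) for v in net}
print(components, partition, in_main)
```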
After execution, you will see:
4.3 Structural Holes
Click on Network/Ego Networks/Structural Holes. You will see:
PARAMETERS
Input dataset:
Name of file containing network to analyze. Data type: Directed Graph.
Method
Choices are:
Ego network model: ties beyond the ego network have no effect; only the alters' ties with other alters are considered.
Whole network model: includes the alters' ties outside the ego network when calculating the measures; all of an
alter's ties are taken into account, whether or not they connect to ego.
Undefined values (Default =na)
What value to use for any undefined measure, the default is to report these as missing values but the user can
select others such as zero.
Output dyadic redundancy (Default=<inputfilename>-DR)
Name of actor-by-actor matrix that indicates the extent to which the column actor (an alter) is a redundant
contact for the row actor (ego).
Output dyadic constraint (Default=<inputfilename>-DC)
Name of actor-by-actor matrix that indicates the extent to which the row actor (ego) is constrained by each other
actor in its ego network.
Node level measures: (Default=<inputfilename>-SH)
Name of actor-by-variable matrix to hold structural hole measures.
How to define ego net
Radio buttons that determine whether the selected alters are those reached by outgoing ties from ego, by incoming
ties to ego, or by both (i.e., all ties).
After execution, you will see:
[Screenshot: structural holes output log, with a callout noting that the measures show how much investment the
network has in a specific individual and how constrained the network is by that node]
Three tables are output. First is the set of monadic (nodal) structural hole measures based on redundancy and
constraint. The following measures are displayed:
effsize. Burt's measure of the effective size of ego's network (essentially, the number of alters minus the average
degree of alters within the ego network, not counting ties to ego).
efficiency. The effective size divided by the number of alters in ego's network.
constraint. Burt's constraint measure (equation 2.4, pg. 55 of Burt, 1992). Essentially a measure of the extent to
which ego is invested in people who are invested in other of ego's alters.
hierarchy. Burt's adjustment of constraint (equation 2.9, pg 71), indicating the extent to which constraint on ego
is concentrated in a single alter.
The second table is the dyadic redundancy matrix. For each ego (rows) it gives the extent to which each of its
alters is tied to all of ego's other alters (i.e., the extent to which the alter is redundant).
The third table is the dyadic constraint matrix. For each ego (rows) it gives the extent to which it is constrained
by each of its alters. Ego is constrained by alter j if (a) j represents a large proportion of ego's relational
investment, and (b) ego is heavily invested in other people who are in turn heavily invested in j. In short,
j constrains ego if ego is heavily invested in j both directly and indirectly.
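A small worked sketch may help with the nodal measures. For a binary, undirected network, effective size follows the description above (number of alters minus the average degree of alters within the ego network, not counting ties to ego), and efficiency divides it by the number of alters. The constraint function uses Burt's formula with each actor's investment spread equally over its ties, computed over the whole network; this is one reading of the whole network model and is an assumption, so values may differ from UCINET's output under other settings. The function names and the example network are hypothetical.

```python
def effective_size(adj, ego):
    """Alters minus the average number of ties each alter has to other alters."""
    alters = adj[ego]
    n = len(alters)
    if n == 0:
        return None                     # undefined for isolates
    # Every tie among alters is seen from both ends, so halve the count
    ties = sum(1 for a in alters for b in adj[a] if b in alters) // 2
    return n - 2 * ties / n


def efficiency(adj, ego):
    size = effective_size(adj, ego)
    return None if size is None else size / len(adj[ego])


def constraint(adj, ego):
    """Burt's constraint with equal investment in each tie (binary data)."""
    def p(i, j):
        # Proportion of i's relations invested in j
        return 1 / len(adj[i]) if j in adj[i] else 0.0

    total = 0.0
    for j in adj[ego]:
        indirect = sum(p(ego, q) * p(q, j) for q in adj[ego] if q != j)
        total += (p(ego, j) + indirect) ** 2
    return total


# Hypothetical network: ego "e" has alters a, b, c; a and b are tied to each other
net = {"e": {"a", "b", "c"}, "a": {"e", "b"}, "b": {"e", "a"}, "c": {"e"}}
size = effective_size(net, "e")
eff = efficiency(net, "e")
con = constraint(net, "e")
print(size, eff, con)
```

Returning None for an isolate plays the role of the "Undefined values" parameter described above.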
5. Network Level Analysis
5.1 Average Path Length
PURPOSE Constructs a distance or generalized distance matrix between all nodes of a graph and gives some
cohesion measures based on this matrix. The routine also provides the facility to
transform this matrix from distance to nearness.
Click on Network/Cohesion/Geodesic Distance. You will see:
PARAMETERS
Input dataset
Name of file containing dataset to be analyzed.
Output dataset
Name of data file containing distance matrix.
Please choose missing values to represent undefined distances.
After execution, you will see:
The average distance, compactness and distance-weighted fragmentation. A table of frequencies giving each
geodesic length, the number of times it occurred and the proportion of pairs at that distance, followed by the
matrix of distances (or nearness, if the transformation was applied) between all pairs of nodes.
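The sketch below shows one way these summary values could be derived from a geodesic distance matrix for an unweighted network. The average is taken over reachable ordered pairs. Compactness is taken to be the mean reciprocal distance over all ordered pairs (unreachable pairs contributing zero) and distance-weighted fragmentation to be one minus compactness; these definitions are assumptions for the example and are not stated in the tutorial. Function names and the example network are hypothetical.

```python
from collections import Counter, deque


def geodesic_distances(adj):
    """Distance matrix as nested dicts; unreachable pairs are simply absent."""
    matrix = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        matrix[source] = dist
    return matrix


def cohesion_summary(adj):
    n = len(adj)
    matrix = geodesic_distances(adj)
    # Distances for all reachable ordered pairs of distinct nodes
    reachable = [d for s in adj for t, d in matrix[s].items() if t != s]
    average = sum(reachable) / len(reachable)
    compactness = sum(1 / d for d in reachable) / (n * (n - 1))
    fragmentation = 1 - compactness
    counts = Counter(reachable)
    proportions = {d: c / len(reachable) for d, c in sorted(counts.items())}
    return average, compactness, fragmentation, proportions


# Hypothetical path network a - b - c - d
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
average, compactness, fragmentation, proportions = cohesion_summary(path)
print(average, compactness, fragmentation, proportions)
```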
5.2 Clustering coefficient
PURPOSE Calculate the clustering coefficient of every actor and the clustering and weighted
clustering coefficient of the whole network.
Click on Network/Cohesion/Clustering Coefficient. You will see:
PARAMETERS
Input network dataset:
Name of file containing dataset to be analyzed. Data type: Digraph.
(output) Node-level coefficients (Default = 'ClusteringCoefficients')
Name of UCINET file that will contain the clustering coefficients for each actor together with their
degree.
After execution, you will see a log file with:
The overall clustering coefficient and the weighted overall clustering coefficient.
A table with the actor level clustering coefficient together with their degree
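As a rough illustration, the sketch below computes the actor-level clustering coefficient as the density of ties among an actor's neighbours, the overall coefficient as the plain mean of those values, and the weighted overall coefficient as the mean weighted by each actor's number of neighbour pairs. Excluding actors with fewer than two neighbours is a choice made for this example; how UCINET treats such actors is not stated in the tutorial. Names and the example network are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Density of ties among v's neighbours, or None with fewer than two."""
    neighbours = adj[v]
    k = len(neighbours)
    if k < 2:
        return None
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)


def clustering_summary(adj):
    coefficients = {v: local_clustering(adj, v) for v in adj}
    # (coefficient, number of neighbour pairs) for actors where it is defined
    defined = [
        (c, len(adj[v]) * (len(adj[v]) - 1) / 2)
        for v, c in coefficients.items()
        if c is not None
    ]
    overall = sum(c for c, _ in defined) / len(defined)
    weighted = sum(c * pairs for c, pairs in defined) / sum(pairs for _, pairs in defined)
    return coefficients, overall, weighted


# Hypothetical network: triangle a-b-c with d attached to c
net = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
coefficients, overall, weighted = clustering_summary(net)
print(coefficients, overall, weighted)
```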
5.3 Degree distribution
Execute the following statement in the MySQL workbench.
-- Inner query: degree of each node; outer query: number of nodes per degree.
select A.degree, count(*)
from (
    select fromid, count(distinct toid) as degree
    from tutorial.relation
    group by fromid
) as A
group by A.degree
order by A.degree asc;
Copy and paste the results into Excel.
Compute the sum of the count(*) column.
Compute the p(k) column as frequency / sum(frequency), where frequency is the count(*) value for degree k.
Compute the ln(k) and ln(p(k)) columns.
Highlight the cells of ln(k) and ln(p(k)), then select Insert from the ribbon and choose
Scatter from the Charts category. Select the chart which does not connect the data with lines.
To add a line of best fit (linear regression in this case), click on Trendline in the Layout tab. Select
More Trendline options.
In the window that appears, select linear for the Trend/Regression type and click on the
checkboxes for “Display Equation on chart” and “Display R-squared value on chart”.
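The same calculation can be sketched in Python as a cross-check on the spreadsheet. The code below computes p(k) from a list of (fromid, toid) pairs, takes ln(k) and ln(p(k)), and fits a straight line by ordinary least squares, returning the slope, intercept and R-squared that the Excel trendline displays. The tie list is hypothetical, the function names are made up for this example, and the fit needs at least two distinct degrees with differing p(k).

```python
import math
from collections import Counter


def degree_distribution(ties):
    """p(k): fraction of nodes with k distinct partners, from (fromid, toid) pairs."""
    partners = {}
    for source, target in ties:
        partners.setdefault(source, set()).add(target)
    frequency = Counter(len(targets) for targets in partners.values())
    total = sum(frequency.values())
    return {k: count / total for k, count in sorted(frequency.items())}


def loglog_fit(pk):
    """Least-squares line through (ln k, ln p(k)): slope, intercept, R-squared."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical tie data in both directions, as stored in tutorial.relation
ties = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]
pk = degree_distribution(ties)
slope, intercept, r_squared = loglog_fit(pk)
print(pk, slope, intercept, r_squared)
```

A slope close to a negative constant with a high R-squared on real data would suggest an approximately power-law degree distribution, which is what the log-log scatter plot is meant to reveal.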