PNet
Program for the Simulation and Estimation of Exponential Random Graph (p*) Models

USER MANUAL

Peng Wang, Garry Robins, Philippa Pattison
Department of Psychology, School of Behavioural Science
University of Melbourne, Australia
September, 2009

Table of Contents

Introduction
    Acknowledgements
    System Requirements
    Setup PNet
    Update PNet
Using PNet
    Start PNet
    Simulation
        Simulation Setup
        Simulation Output
    Estimation
        Estimation Setup
        Estimation Options
        Estimation Output
    Goodness of Fit
        Goodness of Fit Setup
        Goodness of Fit Output
    Approximate Bayesian goodness of fit
PNet Extensions
    BPNet
        Introduction
        Simulation
        Estimation
        Goodness of Fit
Appendix A – Sample Files
    Sample Input Files
    Sample Output Files
        “start_statistics_[session name].txt”, “end_statistics_[session name].txt” and “sample_statistics_[session name].txt”
        “simulation_[session name].txt”
        “parameter_[session name].txt”
        “estimation_[session name].txt”
        “covariance_[session name].txt”
        “gof_[session name].txt”
Appendix B – Model Parameter Description
    Non-directed Graphs
    Directed Graphs
    XPNet Graph Statistics
    BPNet Graph Statistics
    IPNet Graph Statistics

Introduction

PNet is a program for statistical analysis of exponential random graph (p*) models (ERGMs). It has three major functionalities:

Simulation: Simulating network distributions with specified model parameter values.
Estimation: Estimating specified ERGM parameters for a given network.
Goodness of Fit: Testing the goodness of fit of a specified model to a given network with a particular set of parameters.

Acknowledgements

PNet contains code and ideas from many people. We would like to thank the following people for contributing to this program: Carter Butts, Galina Daraganova, Steve M. Goodreau, Mark S. Handcock, Nicholas Harrigan, David Hunter, Tasuku Igarashi, Johan Koskinen, Dean Lusher, Martina Morris, Ken Sharpe, Tom A.B. Snijders, Christian E.G. Steglich, Lei Xing, Yu Zhao.

System Requirements

Operating system: Microsoft® Windows operating systems
Software: Microsoft .NET Framework version 1.1+
          Java™ 2 Platform Standard Edition 5.0+

The required software is freely available from the Microsoft and Sun web sites.

Microsoft: www.microsoft.com (under Download, search for .NET Framework Version 1.1 Redistributable Package)
Java™ 2 Platform Standard Edition 6.0: http://java.sun.com/javase/downloads/index.jsp

Setup PNet

PNet consists of two components: a user interface developed in Java, “PNet.jar”, and a simulation/estimation engine, “pnet.dll”, developed in C to achieve good performance. Before installing PNet, make sure your system meets the system requirements described above. Copy PNet.jar and pnet.dll into the same folder; you can then start the program by double-clicking the PNet.jar icon. Note that PNet.jar and pnet.dll must be located in the same folder for the Java interface to call the library functions in pnet.dll.

Update PNet

Newer versions of PNet will be made available and can be downloaded from www.sna.unimelb.edu.au/pnet/pnet.html. To finish the update, replace your current PNet.jar and pnet.dll files and update the Java runtime environment.

Using PNet

To set up a simulation, estimation or goodness of fit run, you need to choose the relevant options in the user interface and specify several program settings. The program requires input files and produces text file output.
Samples of input files and output files can be found in Appendix A.

Start PNet

PNet can be started from the Windows Start menu under Program Files, PNet. At the top of the PNet main window, both Session Name and Session Folder are required; they determine the names and the location of the output files.

Session Name
Provide a name for the current simulation, estimation, goodness of fit or approximate Bayesian goodness of fit session. All output files will have file names that end with the Session Name provided here (e.g. with the session name MySession under simulation, you will have an output file named “simulation_MySession.txt”).

Session Folder
All program output files will be located in the Session Folder selected here. You can browse through your system and select the folder by clicking the Browse button.

Simulation, Estimation, Goodness of fit and Approximate Bayesian goodness of fit each have their own tab, with similar structures. Under each tab, several settings need to be specified to configure your p* model.

Simulation

Simulation Setup

To configure a simulation correctly, you need to specify several settings.

Number of Actors
Type in the number of actors in the network.

Starting Graph Density
Type in the density of the random graph used to generate the starting network for the simulation, as a floating point number between 0.0 and 1.0.

Select Network Type
Models for directed and non-directed networks can be simulated. Choose the network type here. If your model has a constraint on the maximum number of ties that an actor can have, specify it here as well by ticking the checkbox and typing in the maximum degree.

Select Structural Parameters
Click the Structural Parameters checkbox to enable the selection button. Clicking the Select Parameters button opens the structural parameter dialog.
Select parameters for your simulation model and specify their values, and their lambda values if they are higher order parameters. The “Clear All” button deselects all parameters and resets their values to 0; it also resets lambda to the default value of 2.0. The “Select All” button selects all parameters. Finish the structural parameter selection by clicking the OK button.

Select Dyadic Attribute Parameters
Select the dyadic attribute parameters if you have one or more fixed setting networks to use as network covariates. Clicking the Browse button opens a file dialog; select the covariate network file and click OK. The dyadic attribute file is a plain text file listing the dyadic attributes in adjacency matrix format.

Select Actor Attribute Parameters
Actor attribute parameters are used in social selection models. There are three types of attributes: binary, continuous and categorical. They are all selected in a similar manner. The number of attributes must be specified before the actual parameters are selected. Attribute files are specified in the same way as the dyadic attribute file. Please check Appendix A for the attribute file format.

Simulation Options

Fix out-degree distribution
For directed networks only. With this option, all simulated samples have an identical out-degree distribution.

Fix the graph density
Fixes the density of the graph, i.e. the number of arcs/edges in the network does not change throughout the simulation. Note that because the number of arcs/edges is fixed, the arc/edge parameter should not be selected for simulation.

Structural “0” File
Structural zeros are indicators for tie variables that are held fixed throughout the simulation. Part of the network can be fixed by applying a structural-zero file to the simulation. The file should contain a binary adjacency matrix with as many rows and columns as there are actors.
In the matrix, “1” indicates that the corresponding tie in the network is NOT fixed, and “0” that it is fixed. Please check Appendix A for the structural-zero file format.

Pick up Sample graph Files, Sample degree distribution, Sample geodesic distribution and Sample clustering coefficient
If selected, the corresponding samples are written to separate files as part of the program output.

Burn in
Burn-in is the starting period of a simulation, during which the network evolves and adapts to the specified parameter values. The burn-in required varies widely with the size of the network and the number of parameters: the larger the network, or the more parameters involved, the longer the burn-in needed. K-statistics tend to need a longer burn-in.

Number of iterations
Type in the number of iterations for the simulation after burn-in.

Number of samples to pick up
Type in the number of samples of graph statistics to be picked up during the simulation.

Click the Start button to start the simulation. PNet will notify you once the simulation has finished.

Simulation Output

File Output

“start_statistics_session.txt”
This file contains the starting graph with the selected statistics.

“simulation_graph.sps” or “simulation_digraph.sps”
This file contains the SPSS script for plotting scatter plots and histograms of the simulated graph statistics using SPSS version 12.0 and above.

“simulation_graph.txt” or “simulation_digraph.txt”
This file contains the list of sample statistics collected during the simulation. Using the SPSS script file, you can plot the statistics as scatter plots and histograms.

“parameter_graph.txt” or “parameter_digraph.txt”
This file shows the parameter values used in the simulation.

Estimation

Estimation Setup

To set up an estimation run correctly, several settings need to be configured. As in Simulation, Session Name and Number of Actors must be provided. The Network File can be selected by clicking the Browse button. Network Type is selected in the same way as in Simulation.
By setting the maximum degree, the model is made conditional on the maximum degree of each actor. Structural, covariate and the different kinds of actor attribute parameters can be selected as in Simulation. See the detailed parameter descriptions in Appendix B. Starting parameter values can also be specified in the parameter selection dialog. If they are not specified, all starting parameter values are set to 0.0, except for the edge or arc parameter, which is calculated from the density of the network.

Estimation Options

Fix out-degree distribution
For directed networks only. This option estimates conditional models in which the out-degree distribution is fixed throughout the estimation.

Fix the graph density
When the graph density is fixed, the number of arcs/edges in the network does not change during estimation. Fixing the graph density may help convergence of the parameter estimation, especially for large networks. Note that because the number of arcs/edges is fixed, the arc/edge parameter is not estimable and should not be selected for estimation.

Structural “0” File
By applying a structural “0” file, part of the network under estimation can be fixed. The file should contain a binary matrix in which “1” indicates that the corresponding tie in the network is NOT fixed, and “0” that it is fixed. Please check Appendix A for the format of the structural-zero file.

Number of Sub-phases
Each sub-phase refines the parameter values, but more sub-phases do not guarantee convergence. The default value is 5. If a good set of starting parameter values is available, a small number of sub-phases may reduce the time required for the estimation.

Gaining Factor (a-value)
The a-value is halved after each sub-phase. The default a-value is 0.01.
A smaller a-value may be used if a good set of starting parameter values is available.

Multiplication Factor
The larger the multiplication factor, the longer the estimation takes, but a larger factor may help convergence, especially for some large networks. The default value is 10. Setting it to the number of parameters may be helpful, and K-statistics tend to need factor values of 20 or more (e.g. 20 to 100).

Number of steps in phase 3
In phase 3, the program simulates network graphs using the parameters estimated in phase 2, and produces t-statistics from the simulated and observed statistics. The default value is 500 steps.

Maximum number of estimation runs
By default, the program performs one estimation run and stops. Multiple runs can be performed one after the other; each run starts from the parameter values at the end of the previous run, so a better parameter estimate may be obtained. The program stops once the model has converged or the maximum number of estimation runs has been reached.

Do GOF @ model convergence
PNet can perform an automatic goodness of fit test once the model under estimation has converged. The GOF output file will be located in the session folder.

Update
After the first estimation run, the Update button is enabled. Use it to start the next estimation run from the previously estimated parameters, so that the run starts from a better set of parameter values. Note: PNet always loads the previous estimation session. Please do NOT use Update if the session name, session folder or network file has been changed.

Estimation Output

File Output

“start_statistics_graph.txt”
This file contains the starting graph with graph statistics.

“estimation_graph.txt”
The estimation result shows the starting parameter values, the starting graph statistics and the parameter updates through phase 2 of the estimation. The final estimates and the estimated covariance matrix are shown at the bottom of the file.
“covariance_graph.txt”
This file contains the estimated covariance matrix by itself. It can be used as the covariance file in Approximate Bayesian goodness of fit.

Goodness of Fit

Goodness of Fit Setup

Most settings for Goodness of Fit are the same as in Simulation, except that the observed network and the parameter values are also required. The observed network file is specified as in Estimation. Make sure that all parameters are selected; you can do this with the “Select All” button in the parameter selection panel. The parameter values from your model must also be specified. You can type in the parameter values as in Simulation, or you can use the “Update” button.

Note: the Update button only works once all parameters have been selected (you may use the “Select All” button in the parameter selection panels). It always loads the parameters from the immediately preceding estimation session, so please use it only immediately after a successful estimation.

Goodness of Fit Output

File Output

“start_statistics_graph.txt”
This file contains the observed graph and its graph statistics.

“simulation_graph.sps”
This file contains the SPSS script for plotting scatter plots and histograms of the simulated graph statistics using SPSS version 15.0 and above.

“accept_graph.txt”
This file shows the ratio of accepted simulated tie changes within each simulation interval between two consecutive sample graphs.

“gof_graph.txt”
The goodness of fit file contains the observed statistics for the given network, and the goodness of fit of the specified model for all available graph statistics.

Approximate Bayesian goodness of fit

In terms of program setup, the difference between approximate Bayesian goodness of fit and the Goodness of Fit described in the previous section is that approximate Bayesian goodness of fit requires the estimated covariance matrix as part of the input. The covariance file is a text file containing the estimated covariance matrix only.
You can use the covariance file generated by the immediately preceding estimation session, or copy the estimated covariance matrix from the estimation result file and paste it into a new text file. As the covariance matrix relates only to the model estimates, please select ONLY parameters that are included in the model in the parameter selection panels. Other options for approximate Bayesian goodness of fit are identical to the Goodness of Fit settings described in the previous section.

PNet Extensions

BPNet

Introduction

BPNet is a program designed for exponential random graph models for bipartite networks, where network ties are only defined between two sets of actors. The network statistics include both purely structural configurations and configurations involving actor attributes. The general setup and use of the program are similar to PNet. It has a Java user interface and a C simulation engine. Modifications have been made to accommodate features of bipartite networks. The following user instructions accompany the BPNet screen shots.

Simulation

As in PNet, the Session Name and Session Folder need to be specified first; output files have file names ending with the session name and are located in the session folder. Number of actors (A) and Number of actors (P) are the numbers of nodes in set A and set P. Simulations can be started from a random bipartite network with a specified density. Structural parameters can be selected by ticking the check boxes. Details of the network configurations can be found in Appendix B.

Parameter values for simulations can be specified during parameter selection. The default values are 0. Actor attribute parameters are selected by ticking the check boxes and typing in the number of attributes of each type (binary, continuous or categorical). The available parameters are shown after clicking the “Select Parameters…” button.
Using a continuous attribute as an example, the attribute file name must be specified first; parameters and their values can then be selected. The attribute file format is the same as in PNet: attribute names are listed on the first line, separated by commas (,), followed by the attributes in space- or tab-separated columns. Attribute file format examples are listed in Appendix A. Since two sets of actors are involved in BPNet, separate attribute files are required for parameters involving only one set of actors, either A or P. For interaction actor attribute effects, the attribute file should list the attributes for nodes in set A first, followed by the attributes for nodes in set P. Putting them in the other order will produce wrong modeling results.

Simulation options are similar to PNet: you can fix the graph density, or use structural-zero files to fix part of the network and treat it as exogenous. Sample graphs, degree distributions and clustering coefficients can be collected in separate output files. Burn in, number of iterations and number of samples to pick up are the same as in PNet.

Estimation

The Network File is a text file containing a binary rectangular matrix. The number of rows of the matrix must equal the number of Actors (A), and the number of columns must equal the number of Actors (P). Note: putting the numbers of actors in the other order will produce wrong modeling results. Other settings for estimation are the same as in PNet.

Goodness of Fit

Goodness of fit settings are the same as in simulation, except that the network file needs to be specified in the same way as in estimation.

Appendix A – Sample Files

Sample Input Files

Sample network or dyadic attribute file:
Network files and dyadic attribute setting files contain the observed or covariate network of interest in adjacency matrix format.

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0

Sample structural zero file:
The file contains a binary matrix in which ‘1’ indicates changeable ties and ‘0’ indicates fixed ties. Applying this example structural zero file fixes all the tie variables related to nodes 2 and 5. Ties between nodes 1 and 13, and between nodes 1 and 14, are also fixed.

0 0 1 1 0 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 1 1 1 1 1 1 1 1 1
1 0 1 0 0 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 1 1 1 1 1 1 1 1
1 0 1 1 0 1 0 1 1 1 1 1 1 1
1 0 1 1 0 1 1 0 1 1 1 1 1 1
1 0 1 1 0 1 1 1 0 1 1 1 1 1
1 0 1 1 0 1 1 1 1 0 1 1 1 1
1 0 1 1 0 1 1 1 1 1 0 1 1 1
1 0 1 1 0 1 1 1 1 1 1 0 1 1
0 0 1 1 0 1 1 1 1 1 1 1 0 1
0 0 1 1 0 1 1 1 1 1 1 1 1 0

Attribute file format

Each column represents an attribute.
Each row corresponds to the same row in the adjacency matrix.
Attribute names should be listed on the first line, delimited by commas (,).
    Note that attribute names should not start with numbers, to meet the SPSS script requirements for variable names.

Sample binary actor attribute file:

    member,gender
    1 1
    1 1
    0 1
    1 0
    1 1
    0 0
    1 0
    0 0
    1 1
    1 0
    0 1
    0 0
    0 1
    1 0

Sample categorical actor attribute file:

    department,club
    1 1
    3 2
    2 3
    3 2
    1 3
    2 1
    1 2
    2 3
    3 1
    3 3
    2 2
    3 2
    1 1
    1 2

Sample continuous actor attribute file:

    income,age,performance
    1.0 23 2
    1.1 34 6
    1.1 42 5
    0.5 23 4
    0.3 24 1
    1.1 19 1
    1.5 38 2
    0.2 49 1
    0.1 58 1
    0.2 47 2
    1.0 24 3
    0.2 36 2
    0.1 19 4
    0.5 20 3

Sample Output Files

Estimation and goodness of fit output files are tab delimited, which makes it easy to create tables in Excel. Examples of output files follow.

“start_statistics_[session name].txt”, “end_statistics_[session name].txt” and “sample_statistics_[session name].txt”

    vertices 14
    matrix
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0
    ***This graph contains:****
    vertices             14
    arc                  24
    reciprocity          2
    AinS(2.00)           17.62500
    AoutS(2.00)          15.25000
    AT-T(2.00)           6.50000
    A2P-T(2.00)          40.50000
    member_interaction   6
    gender_interaction   2
    member_sender        11
    gender_sender        7
    member_receiver      13
    gender_receiver      12
    Digraph Density = 0.13187
    In-degree Distribution:(range[0..n-1])
    4 2 4 3 0 1 0 0 0 0 0 0 0 0
    Standard deviation of in-degree distribution = 1.435697
    Skewness of in-degree distribution = 0.508356
    Out-degree Distribution:(range[0..n-1])
    2 6 1 4 1 0 0 0 0 0 0 0 0 0
    Standard deviation of out-degree distribution = 1.220572
    Skewness of out-degree distribution = 0.322264
    Corr. Coef. between in and out degree distributions = 0.238744
    Mean degree = 1.71429
    Global Clustering Coefficients:
    Cto = 0.18421
    Cti = 0.15217
    Ctm = 0.16279
    Ccm = 0.20930
    AKC-T = 0.16049
    AKC-D = 0.20000
    AKC-U = 0.13953
    AKC-C = 0.20988
    Geodesic Distribution:(range[1..n-1,inf])
    Note: geodesic = shortest path between two nodes. The geodesic distribution is not based on semi-paths.
    24 32 29 20 5 0 0 0 0 0 0 0 0 72
    Quartiles of the geodesic distribution.
    Note: Quartiles equal to the number of nodes refer to infinite geodesics.
    2 4 14 14
    Triad    300  210  120C  120D  120U  201  111D  111U  030T  030C  102  021D  021C  021U  012  003
    Census:  0    0    1     0     2     0    6     2     2     2     13   10    20    12    130  164

“simulation_[session name].txt”

    id      arc  recip  in2star  out2star  in3star  out3star  uktri
    1000    16   3      6        5         0        0         0.00
    2000    12   1      3        4         0        1         2.00
    3000    14   1      5        3         1        0         1.00
    4000    14   2      3        2         0        0         0.00
    5000    15   2      4        4         0        0         3.00
    6000    15   1      4        3         1        0         1.00
    …       …    …
    97000   18   2      6        7         0        1         3.00
    98000   13   1      2        1         0        0         0.00
    99000   17   1      4        5         0        0         0.00
    100000  19   4      5        6         0        0         1.00

“parameter_[session name].txt”

    Simulation result for digraph with 14 of vertices.
    Parameter Values of:
    arc          -1.50000
    reciprocity  1.20000
    2-in-star    -1.30000
    2-out-star   -1.20000
    3-in-star    -1.10000
    3-out-star   -1.30000
    AT-U(2.00)   1.50000
    Proposed 1000000 digraphs.
    Samples are picked up at 1 per 1000 digraphs.
    Accepted 127343 proposed digraphs.
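The sample statistics in a file such as “simulation_[session name].txt” can also be summarised without SPSS. The following is a minimal Python sketch and is not part of PNet. It assumes the file is whitespace-delimited with a header row, as in the excerpt above, and uses only the first six sample rows of that excerpt as inline data. The function name summarize_samples is hypothetical.

```python
import io
import statistics

# First rows of the "simulation_[session name].txt" excerpt above, used as
# inline data. Assumption: whitespace-delimited columns with a header row.
SAMPLE_TEXT = """\
id arc recip in2star out2star in3star out3star uktri
1000 16 3 6 5 0 0 0.00
2000 12 1 3 4 0 1 2.00
3000 14 1 5 3 1 0 1.00
4000 14 2 3 2 0 0 0.00
5000 15 2 4 4 0 0 3.00
6000 15 1 4 3 1 0 1.00
"""


def summarize_samples(handle):
    """Return {statistic: (mean, sample standard deviation)} per column."""
    header = handle.readline().split()
    names = header[1:]  # the first column is the sample id, not a statistic
    columns = {name: [] for name in names}
    for line in handle:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank, elided or malformed lines
        for name, value in zip(names, fields[1:]):
            columns[name].append(float(value))
    return {
        name: (statistics.mean(values), statistics.stdev(values))
        for name, values in columns.items()
    }


# With a real output file, replace io.StringIO(...) by open(path).
summary = summarize_samples(io.StringIO(SAMPLE_TEXT))
for name, (mean, sd) in summary.items():
    print(f"{name:10s} mean={mean:8.3f} sd={sd:8.3f}")
```

Skipping lines whose field count differs from the header is a design choice that lets the sketch tolerate blank or elided lines; a stricter reader might raise an error instead.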
“estimation_[session name].txt”

    STOCHASTIC ESTIMATION FOR NETWORK example
    ESTIMATION SETTINGS
    Number of sub-phases in estimation (phase 2) = 5
    starting a-value in estimation (phase 2) = 0.010000
    Multiplication factor for estimation (phase 2) = 10
    Number of steps in final simulation (phase 3) = 500
    Number of estimation runs = 10
    STOCHASTIC APPROXIMATION RUN 1
    original statistics:24.000000 2.000000 17.625000 15.250000 6.500000 40.500000
    starting parameters:-2.902123 0.200148 0.546233 -0.013692 -0.046386 0.143345
    Phase1 started with the following setup:
    a = 0.010000
    num of steps = 25
    num of iterations in each step = 224.378698
    ************************************
    mean statistics in phase1:22.880000 1.760000 16.556250 14.147500 5.880000 37.625000
    END PHASE1 parameter:-2.902123 0.200148 0.546233 -0.013692 -0.046386 0.143345
    Phase 2 started
    Subphase 0 started with a valued 0.010000
    Subphase 0 has gone up to 213 steps
    Parameter after Subphase 0:-2.90351 0.33987 0.55254 -0.01341 -0.03369 0.12749
    Subphase 1 started with a valued 0.010000
    Subphase 1 has gone up to 233 steps
    Parameter after Subphase 1:-2.86694 0.26247 0.54200 -0.01491 -0.04047 0.12810
    Subphase 2 started with a valued 0.005000
    …
    Subphase 4 started with a valued 0.001250
    Subphase 4 has gone up to 725 steps
    Parameter after Subphase 4:-2.90684 0.20529 0.55291 -0.00977 -0.04367 0.14037
    END PHASE2 parameter:-2.906837 0.205294 0.552914 -0.009773 -0.043674 0.140375
    Phase3 started with the following setup:
    num of steps = 500
    num of iterations in each step = 224.378698
    ***************************************
    mean statistics in phase3:24.616000 2.096000 18.135141 15.876000 6.834750 42.064875
    Estimation Result for Network
    SUMMARY (parameter, standard error, tstatistics)
    NOTE: t-statistics = (observation - sample mean)/standard error
    effects       estimates   stderr    t-ratio
    arc           -2.906837   0.98133   -0.08919  *
    reciprocity   0.205294    0.81794   -0.05820
    AinS(2.00)    0.552914    0.52603   -0.06192
    AoutS(2.00)   -0.009773   0.59406   -0.07868
    AT-T(2.00)    -0.043674   0.51150   -0.06069
    A2P-T(2.00)   0.140375    0.19825   -0.07173
    Estimated Covariance Matrix
    0.963005   -0.009254  -0.318091  -0.384405  0.264326   -0.107297
    -0.009254  0.669034   -0.027916  -0.058325  0.007124   -0.001399
    -0.318091  -0.027916  0.276712   0.073059   -0.151908  0.009170
    -0.384405  -0.058325  0.073059   0.352912   -0.136727  0.002928
    0.264326   0.007124   -0.151908  -0.136727  0.261629   -0.038435
    -0.107297  -0.001399  0.009170   0.002928   -0.038435  0.039304

“covariance_[session name].txt”

    0.963005   -0.009254  -0.318091  -0.384405  0.264326   -0.107297
    -0.009254  0.669034   -0.027916  -0.058325  0.007124   -0.001399
    -0.318091  -0.027916  0.276712   0.073059   -0.151908  0.009170
    -0.384405  -0.058325  0.073059   0.352912   -0.136727  0.002928
    0.264326   0.007124   -0.151908  -0.136727  0.261629   -0.038435
    -0.107297  -0.001399  0.009170   0.002928   -0.038435  0.039304

“gof_[session name].txt”

    GOODNESS OF FIT
    Parameter Values:
    arc                  -1.19735
    reciprocity          0.28248
    2-in-star            0.00000
    …
    Isolates             0.00000
    AinS(2.00)           0.61944
    AoutS(2.00)          -0.92446
    AinS(2.00)           0.00000
    …
    K-L-star(2.00)       0.00000
    AT-T(2.00)           -0.02294
    …
    AT-TDU(2.00)         0.00000
    A2P-T(2.00)          0.21751
    A2P-D(2.00)          0.00000
    …
    A2P-TDU(2.00)        0.00000
    member_interaction   0.39270
    gender_interaction   -1.01727
    member_sender        -1.12695
    gender_sender        -1.33001
    member_receiver      -0.18683
    gender_receiver      0.49272
    …
    gender_out2star      0.00000
    Simulated 1000000 digraphs.
    Statistic samples are picked up at 1 per 1000 digraphs.
    Accepted 262407 proposed digraphs.
observation, sample mean (standard error), t-statistic
t-statistics = (observation - sample mean)/standard deviation
effects                          observed  mean     stddev   t-ratio
arc                              24        24.135   4.888    -0.028
reciprocity                      2         2.063    1.467    -0.043
2-in-star                        23        23.997   9.764    -0.102
2-out-star                       19        20.436   9.849    -0.146
3-in-star                        13        15.866   12.423   -0.231
3-out-star                       8         12.216   11.910   -0.354
path2                            43        43.605   18.853   -0.032
T1                               0         0.014    0.133    -0.105
T2                               0         0.357    1.109    -0.322
T3                               1         1.535    2.261    -0.237
T4                               0         0.819    1.266    -0.647
T5                               2         0.687    1.231    1.067
T6                               0         0.918    1.694    -0.542
T7                               7         9.042    8.190    -0.249
T8                               7         7.819    7.847    -0.104
T9(030T)                         7         7.216    5.313    -0.041
T10(030C)                        3         2.416    2.082    0.280
Sink                             0         1.430    1.101    -1.299
Source                           2         2.742    1.311    -0.566
Isolates                         2         0.685    0.824    1.596
AinS(2.00)                       17.625    17.646   5.961    -0.004
AoutS(2.00)                      15.250    15.500   6.049    -0.041
AinS(2.00)                       17.625    17.646   5.961    -0.004
AoutS(2.00)                      15.250    15.500   6.049    -0.041
K-1-star(2.00)                   30.125    27.346   9.801    0.284
1-L-star(2.00)                   30.500    28.742   9.674    0.182
K-L-star(2.00)                   20.750    18.188   5.091    0.503
AT-T(2.00)                       6.500     6.737    4.665    -0.051
AT-C(2.00)                       8.500     6.736    5.469    0.323
AT-D(2.00)                       7.000     6.656    4.608    0.075
AT-U(2.00)                       6.000     6.678    4.577    -0.148
AT-TD(2.00)                      6.750     6.697    4.627    0.012
AT-TU(2.00)                      6.250     6.708    4.612    -0.099
AT-DU(2.00)                      6.500     6.667    4.579    -0.036
AT-TDU(2.00)                     6.500     6.690    4.602    -0.041
A2P-T(2.00)                      40.500    41.202   16.830   -0.042
A2P-D(2.00)                      17.500    18.975   8.679    -0.170
A2P-U(2.00)                      21.500    22.538   8.644    -0.120
A2P-TD(2.00)                     29.000    30.089   12.376   -0.088
A2P-TU(2.00)                     31.000    31.870   12.187   -0.071
A2P-DU(2.00)                     19.500    20.756   7.982    -0.157
A2P-TDU(2.00)                    26.500    27.572   10.689   -0.100
member_interaction               6         5.895    2.458    0.043
gender_interaction               2         1.993    1.417    0.005
member_sender                    11        10.940   2.902    0.021
gender_sender                    7         7.055    2.243    -0.025
member_receiver                  13        12.828   3.821    0.045
gender_receiver                  12        12.135   3.445    -0.039
member_interaction_reciprocity   1         0.393    0.655    0.927
gender_interaction_reciprocity   0         0.060    0.242    -0.248
member_activity_reciprocity      2         1.295    1.139    0.619
gender_activity_reciprocity      1         1.346    1.107    -0.313
member_in2star                   15        11.970   7.073    0.428
gender_in2star                   16        11.759   6.633    0.639
member_path2                     18        17.830   10.092   0.017
gender_path2                     13        12.390   6.845    0.089
member_out2star                  7         6.453    4.351    0.126
gender_out2star                  3         2.338    2.058    0.322
Std Dev in-degree dist           1.436     1.408    0.275    0.100
Skew in-degree dist              0.508     0.555    0.490    -0.094
Std Dev out-degree dist          1.221     1.215    0.270    0.019
Skew out-degree dist             0.322     0.600    0.528    -0.526
CorrCoef in-out-degree dists     0.239     0.163    0.293    0.258
Global Clustering Cto            0.184     0.166    0.079    0.233
Global Clustering Cti            0.152     0.138    0.066    0.219
Global Clustering Ctm            0.163     0.152    0.072    0.144
Global Clustering Ccm            0.209     0.147    0.096    0.643
Global Clustering AKC-T          0.160     0.152    0.070    0.124
Global Clustering AKC-D          0.200     0.166    0.077    0.441
Global Clustering AKC-U          0.140     0.137    0.064    0.036
Global Clustering AKC-C          0.210     0.146    0.093    0.683
ACCEPTANCE RATE: 0.2624
SAMPLE GEODESIC DISTRIBUTION
Note: geodesic = shortest path between two nodes. The geodesic distribution is not based on semi-paths.
FIRST QUARTILES
Median of sample G25s: 2
Interquartile range: 1
Observed first quartile geodesic: 2
in model samples, 0.00% of graphs have lower G25.
in model samples, 27.50% of graphs have higher G25.
SECOND QUARTILES
Median of sample G50s: 4
Interquartile range: 11
Observed median geodesic: 4
in model samples, 36.80% of graphs have lower G50.
in model samples, 48.40% of graphs have higher G50.
THIRD QUARTILES
Median of sample G75s: 14
Interquartile range: 0
Observed first quartile geodesic: 14
in model samples, 14.00% of graphs have lower G75.
in model samples, 0.00% of graphs have higher G75.
GOF on Triad Census
Triad  observed  mean     stddev  t-ratio
300    0         0.014    0.133   -0.105
210    0         0.273    0.636   -0.429
120C   1         0.905    1.138   0.083
120D   0         0.504    0.800   -0.630
120U   2         0.372    0.693   2.350
201    0         0.603    1.198   -0.503
111D   6         5.020    3.855   0.254
111U   2         4.061    3.415   -0.603
030T   2         3.656    2.788   -0.594
030C   2         1.210    1.327   0.596
102    13        12.100   8.607   0.105
021D   10        9.375    4.749   0.132
021C   20        20.389   8.096   -0.048
021U   12        11.845   4.778   0.032
012    130       129.376  16.257  0.038
003    164       164.297  29.134  -0.010
Mahalanobis distance =7.168743 (51.390873)
50% simulated samples have smaller Mahalanobis distances than the observed network.
effects                          observed  mean     stddev   t-ratio
arc                              24        24.135   4.888    -0.028
reciprocity                      2         2.063    1.467    -0.043
…
Global Clustering AKC-C          0.21      0.146    0.093    0.683

Appendix B – Model Parameter Description

Non-directed Graphs
Parameters Without Actor Attributes
Edge (L), Isolate, 2-Star (S2), 3-Star (S3), Triangle (T1), Alt-Triangle (AT), Alt-Star (AS), Alt-2-Path (A2P), 2-Triangle (T2), Bow-Tie, 3-Path, 4-Cycle, 1-Edge-Triangle (1-ET), 2-Edge-Triangle (2-ET), Alt-Edge-Triangle (AET), 4-Clique, 5-Clique, 6-Clique, 7-Clique, Alt-Clique (AC)
Parameters with Actor Attributes
Legend: [Attr] – attribute name. The diagram symbols distinguish actors with the attribute from actors with or without the attribute.
[Attr]-interaction, [Attr]-activity, [Attr]-T3u, [Attr]-T2u, [Attr]-T1u, [Attr]-O3u, [Attr]-O2au, [Attr]-O1au, [Attr]-O2bu, [Attr]-O1bu
Parameters for Continuous Attributes
[Attr]-Sum (+), [Attr]-interaction (x), [Attr]-difference¹
Parameters for Categorical Attributes
[Attr]-Matching, [Attr]-Mismatch
Parameters for Dyadic Attributes
Dyadic covariate
[Attr]-Edge, [Attr]-S21, [Attr]-S22, [Attr]-T1, [Attr]-T2, [Attr]-T3
¹ Absolute difference between two actor attributes

Directed Graphs
Parameters Without Actor Attributes
Arc, sink, Reciprocity, source, In-2-star, Out-2-star, In-3-star, Out-3-star, 2-path, T7, T8, T4, T5, T3, T6, T2, Transitive Triad (T9), Cyclic Triad (T10), T1, isolate, Alt-in-star (AinS), Alt-out-star (AoutS), Alt-in-1-out-star (Ain1outS), 1-in-alt-out-star (1inAoutS), Alt-in-alt-out-star (AinAoutS), AT-T, AT-C, AT-D, AT-U, A2P-T, A2P-U, A2P-D
Parameters with Actor Attributes
Legend: [Attr] – attribute name. The diagram symbols distinguish actors with the attribute from actors with or without the attribute.
[Attr]-Interaction, [Attr]-Interaction-reciprocity, [Attr]-Sender-missing, [Attr]-Receiver-missing, [Attr]-Activity-reciprocity, [Attr]-in-2-star, [Attr]-2-path, [Attr]-Sender, [Attr]-Receiver, [Attr]-out-2-star
Parameters for Continuous Attributes
[Attr]-Sender, [Attr]-Receiver, [Attr]-Receiver-missing, [Attr]-Sender-missing, [Attr]-Sum (+), [Attr]-Difference (-), [Attr]-Product (x), [Attr]-Difference-reciprocity (-), [Attr]-Sum-reciprocity (+), [Attr]-Product-reciprocity (x), [Attr]-in-2-star, [Attr]-2-path, [Attr]-out-2-star
Parameters for Categorical Attributes
[Attr]-Matching, [Attr]-Mismatch, [Attr]-Mismatch-reciprocity, [Attr]-Matching-reciprocity
Parameters for Dyadic Attributes
Dyadic covariate
[Attr]-Arc

XPNet Graph Statistics
Parameters for two nondirected networks (A and B)
Network A, Network B
EdgeAB, 2-StarAB, 3-Star-AAB, 3-Star-ABB, TriangleAAB, TriangleABB
Binary Attributes
Rab, Rbab
Continuous Attributes
SumAB (+), DifferenceAB (-)
Categorical attributes
Same-category-AB, Diff-category-AB

Parameters for two directed networks (A and B)
Network A, Network B
ArcAB, ReciprocityAB, ReciprocityAAB, ReciprocityABB, ReciprocityAABB, In-2-StarAB, Out-2-StarAB, Mixed-2-StarAB, In-3-Star-AAB, Out-3-Star-AAB, In-3-Star-ABB, Out-3-Star-ABB, T-ABB, T-BAA, T-AAB, T-BBA, T-ABA, T-BAB, C-AAB, C-ABB
Binary Attributes
Mrs, Mrr, Mrb, Mrbm, Mrm
Continuous Attributes
Msum (+), Mdiff (-), Msumm (+), Mdiffm (-)
Categorical attributes
Same-cate-arcAB, Diff-cate-arcAB, Same-cate-reciAB, Diff-cate-reciAB

BPNet Graph Statistics
Set-P, Set-A
L, Sp2, Sa2, Sp3, Sa3, L3, C4, K-Sp, K-Sa, K-Cp, K-Ca
Binary Attributes
Legend: [Attr] – attribute name. The diagram symbols distinguish actors with the attribute from actors with or without the attribute.
[Attr]_RA, [Attr]_RP, [Attr]_TSCA, [Attr]_TSCP, [Attr]_TSOA1, [Attr]_TSOP1, [Attr]_TSOA2, [Attr]_TSOP2, [Attr]_C4A1, [Attr]_C4P1, [Attr]_C4A2, [Attr]_C4P2, [Attr]_rAP
Continuous Attributes
[Attr]_RAC, [Attr]_RPC, [Attr]_TSCAC, [Attr]_TSCPC, [Attr]_TSOACS, [Attr]_TSOPCS, [Attr]_TSOACD, [Attr]_TSOPCD, [Attr]_C4ACS, [Attr]_C4PCS, [Attr]_C4ACD, [Attr]_C4PCD, [Attr]_RAPC
Categorical Attributes
[Attr]_2path_match_A, [Attr]_2path_match_P, [Attr]_2path_mismatch_A, [Attr]_2path_mismatch_P, [Attr]_4cycle_match_A, [Attr]_4cycle_match_P, [Attr]_4cycle_mismatch_A, [Attr]_4cycle_mismatch_P

IPNet Graph Statistics
Legend: the diagram symbols distinguish actors with the attribute from actors with or without the attribute.
Attribute, Density, Star2, Activity, Star3, Two-Path-Equivalence, Partner-Resource, Contagion, Partner-Activity, T1, T2, T3, Setting matrix, Setting-Homophily, Distance matrix, Geographic-Homophily, Contagion-among-partners, Remoteness, Remoteness-to-partners
Parameters for Binary Attributes
oOb, o_Ob
Parameters for Continuous Attributes
oOc, o_Oc
Parameters for Categorical Attributes
oO_Osame, oO_Odiff