Title: Gephi
Name: Sébastien Heymann
Affil./Addr.: Université Pierre et Marie Curie, LIP6, ComplexNetworks team, 4 place Jussieu, 75005 Paris, France
Phone: +33 (0)1 44 27 88 88
E-mail: [email protected]

Gephi

Synonyms
exploratory network analysis, network visualization, visual analytics, open source

Glossary
• API: an Application Programming Interface is an interface through which software components communicate with each other while remaining clearly separated.
• Homophily: tendency to link to similar others.
• Layout: algorithm which calculates the position of elements in a graphic space.
• Raster image: image encoded by a two-dimensional matrix of pixels.
• Shortest path: path of minimal length between two nodes of a network.
• Sparkline: small, intense, simple, word-sized graphic with typographic resolution.
• Subgraph: graph whose nodes and edges are subsets of the nodes and edges of another graph.
• Vector image: image encoded by a set of geometrical functions.

Definition
Gephi was created in 2008 by Mathieu Bastian, Sébastien Heymann, and Mathieu Jacomy, and extended by Eduardo Ramos Ibañez, Cezary Bartosiak, Julian Bilcke, Patrick McSweeney, André Panisson, Jérémy Subtil, Helder Suzuki, Martin Skurla, and Antonio Patriarca. It is suitable for the analysis of all kinds of complex networks, although it is mostly used for social network analysis. It is distributed under a dual licensing scheme: the GNU General Public License (GNU GPL) v3 and the Common Development and Distribution License (CDDL) v1. Gephi can be used as a stand-alone desktop application, and as a Java library for embedding some of its features in third-party programs. It scales to 10,000 nodes and edges with 1GB RAM and 1 CPU, and up to 1 million nodes and edges with 32GB RAM and 8 CPUs. It runs on Linux, Windows, and Mac OS X. It is written in Java 6 and uses OpenGL 1.2.

Introduction
Gephi is open source software for the visual exploration of networks (also called graphs).
A network is made of a set of entities, called the nodes, and a set of relationships between entities, called the edges. While various software tools exist to visualize and analyse networks, Gephi is particularly suited for networks with node attributes. Attributes are key-value pairs associated with each node or each edge. For example, individuals in a social network may have attributes such as gender, language, and age.

[ Video introducing Gephi: https://vimeo.com/9726202 ]

Gephi users interact with the visualization in real time to position the nodes in a two- or three-dimensional space using layout algorithms, or by manually moving nodes (see Fig.1). They use node attributes to change the color and size of the nodes, in order to find groups and individuals. The goal is to study the correlation of node attributes and network structure by using visual patterns. Classic metrics of social network analysis, such as node degree or betweenness centrality measures, can be computed and used in the visualization as well (see Fig.2). The network can also be filtered based on attributes.

Fig. 1. Overview of Gephi 0.8.

Fig. 2. Network visualization example. Node size is proportional to the betweenness centrality value of the node.

Gephi is not limited to social networks. Any kind of network can be analysed, such as the internet topology (i.e. connections between machines), peer-to-peer file-sharing networks, biological networks, on-line social networks (e.g. Twitter, Facebook), communication networks (e.g. email) and financial networks, but also semantic networks, organizational networks and more.

Gephi aims at covering the entire process from data import to aesthetic refinement and interaction. Data can be imported and exported in various file formats, and can be retrieved from databases. Once the visual exploration is over, the user refines the aesthetics and exports graphics in vector file formats to ensure readability and publication quality, both in print and in interactive graphics.
This project is supported by an international community, which is led by a French non-profit corporation called the Gephi Consortium.

Key Points
The strengths of Gephi are real-time visual feedback, performance, modularity, and its community.

The Gephi user interface is focused on the creation of network visuals in real time. The key innovation is to ease interaction with the network. The user can literally play with the visual representation of the network. By playing, we mean experimenting with various visual configurations and seeing the outcome of any action instantaneously. This is made possible by the following features.

The user applies layout algorithms to shape the network structure in 2-D or 3-D, for instance using force-directed layouts. Such algorithms calculate the layout of a network using repulsive forces between all nodes, and attractive forces between adjacent nodes. Each layout iteration calculates the forces applied to each node, and updates each node position. The visualization is refreshed at each iteration, therefore providing real-time feedback for users. Some layouts are implemented with no stopping condition. The user can therefore tweak the layout parameters in real time, until they decide to stop its execution. Interaction during layout calculation is made technically possible by multi-threaded processing, and by using the GPU to render the visualization.

Gephi is stable and scales to networks of up to 1 million nodes and edges. The rendering engine is able to handle large networks while remaining responsive. The low minimum technical requirements of the software make small networks actionable on low-end configurations such as netbooks. Large networks of around a million nodes and edges can also be analysed on visualization servers.
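
The force-directed iteration described in the Key Points above (repulsion between all nodes, attraction between adjacent nodes, then a position update) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Gephi's implementation: the function name, the parameters `k` and `max_move`, and the Fruchterman-Reingold-style force formulas are choices made here for the example.

```python
import math


def _apply(disp, a, b, dx, dy, dist, force):
    """Add `force` to a along the b->a direction, and the opposite to b."""
    fx, fy = dx / dist * force, dy / dist * force
    disp[a][0] += fx
    disp[a][1] += fy
    disp[b][0] -= fx
    disp[b][1] -= fy


def force_directed_step(pos, edges, k=1.0, max_move=0.05):
    """Run one layout iteration and return the new positions.

    pos      -- dict mapping node -> (x, y)
    edges    -- iterable of (source, target) pairs
    k        -- hypothetical ideal distance between nodes
    max_move -- hypothetical cap on the displacement per iteration
    """
    disp = {node: [0.0, 0.0] for node in pos}
    nodes = list(pos)

    # Repulsive forces between all pairs of nodes.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            _apply(disp, a, b, dx, dy, dist, k * k / dist)

    # Attractive forces between adjacent nodes only (negative = pull).
    for a, b in edges:
        dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
        dist = math.hypot(dx, dy) or 1e-9
        _apply(disp, a, b, dx, dy, dist, -(dist * dist / k))

    # Move each node along its net force, capped at max_move.
    new_pos = {}
    for node in nodes:
        fx, fy = disp[node]
        length = math.hypot(fx, fy) or 1e-9
        move = min(length, max_move)
        new_pos[node] = (pos[node][0] + fx / length * move,
                         pos[node][1] + fy / length * move)
    return new_pos


if __name__ == "__main__":
    # A layout with no stopping condition would loop until the user stops it;
    # here the loop simply runs a fixed number of iterations.
    positions = {"a": (0.0, 0.0), "b": (0.3, 0.1), "c": (0.1, 0.4)}
    for _ in range(100):
        positions = force_directed_step(positions, [("a", "b"), ("b", "c")])
```

Because each call returns a complete set of positions, a viewer could redraw after every iteration, which is the kind of real-time feedback the text describes. The all-pairs repulsion loop is what makes such a naive step quadratic in the number of nodes.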
In addition to the interactive exploration of large networks, Gephi provides efficient implementations of classic metrics used in Social Network Analysis, including Betweenness Centrality, Clustering Coefficient, PageRank, and Louvain Modularity for community detection.

Gephi is a stand-alone application, built with Java SE 6 on top of the NetBeans Platform, which is a software framework for creating applications (see Fig.3). An installer makes Gephi available on all platforms that run the Java Virtual Machine. A graphics card supporting OpenGL 1.2 is required. The features are extensible through Java plug-ins which use the Gephi APIs. For instance, the OpenOrd layout algorithm is a plug-in which implements the Layout API. This modular source code structure makes the software maintainable. A version of the software without the user interface is also distributed under the name Gephi Toolkit. It is used as a Java library to create novel applications on the desktop or on a server.

Finally, key benefits are provided by the Gephi community: members answer questions on the forum and fix the most common bugs. They organize meet-ups in their cities, and provide training seminars to newcomers.

Fig. 3. Architecture of Gephi (left) and Gephi Toolkit (right).

Historical Background
Gephi has been developed since 2008. It was primarily created to enable researchers in social sciences to study the Web at Fondation Maison des Sciences de l’Homme in Paris, France. Today, the Gephi Consortium aims at creating a sustainable software and technical ecosystem, driven by an international open-source community which shares common interests in networks and complex systems.

Since the beginning, a non-profit organization called Association Gephi has provided a legal entity to support, protect and promote the Gephi project.
Hosted in turn by Association WebAtlas, Linkfluence SAS and SciencesPo Medialab, the initial contributors Mathieu Bastian, Sébastien Heymann and Mathieu Jacomy have progressively set up an international community of users and contributors. They notably participated in the Google Summer of Code program each year since 2009, and won the Oracle 2010 Duke’s Choice Award for best Innovative Technical Data Visualization. In 2011 they launched the Gephi Consortium, a non-profit corporation created to join the efforts of industry, laboratories and civil society in building Gephi. Created under the French law of July 1st, 1901, it is governed by a board of directors. The Gephi Consortium makes an R&D effort to build generic and reusable parts of Gephi, improves the competitiveness of the technology at low cost, and creates standards to ensure interoperability.

Research partners include Inria, Sciences-Po Medialab, Fondation Maison des Sciences de l’Homme TIC-Migrations, UPMC-CNRS LIP6 ComplexNetworks, Université de Technologie de Compiègne COSTECH, ISI Foundation, Indiana University Center for Complex Networks and Systems Research, and Stanford Mapping the Republic of Letters. Private partners include Quid Inc, Linkfluence SAS, and Neo Technology Inc.

Features

Input/output data formats

File input
While many file formats exist to encode network data, Gephi supports the most common ones: CSV, GDF, GEXF, GML, GraphML, Graphviz DOT, Pajek NET, Tulip TLP, Ucinet DL, XGMML, and RDF. The latter is available through the Semantic Web plugin, developed by the Inria WIMMICS team. A spreadsheet importer helps users to model tabular data as a network.

File output
Data can be saved in Gephi sessions, but also in other formats like node and edge lists, GDF, GEXF, GraphML, Pajek NET, Ucinet DL, and CXF. Users can therefore migrate data from one software tool to another in order to benefit from the specific features of each.
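
To make the idea of modeling tabular data as a network concrete, the sketch below turns rows of an edge list into a node list and an edge list using only the Python standard library. It is a hedged illustration and does not reproduce Gephi's importer: the function name and the assumed column layout (source, target, optional weight, no header row) are hypothetical.

```python
import csv
import io


def read_edge_list(text):
    """Parse CSV rows of the form source,target[,weight] into a network.

    Returns (nodes, edges): a sorted list of node identifiers and a list of
    (source, target, weight) tuples. The column layout is an assumption made
    for this example; a missing weight defaults to 1.0.
    """
    nodes = set()
    edges = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        source, target = row[0].strip(), row[1].strip()
        has_weight = len(row) > 2 and row[2].strip()
        weight = float(row[2]) if has_weight else 1.0
        nodes.update((source, target))
        edges.append((source, target, weight))
    return sorted(nodes), edges


if __name__ == "__main__":
    table = "alice,bob,2\nbob,carol\n"
    node_list, edge_list = read_edge_list(table)
    print(node_list)
    print(edge_list)
```

The node list is derived from the edge rows here, so isolated nodes would need a separate node table, which is why node lists and edge lists are usually exchanged as a pair.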
Graphical output
Network visuals can be exported in PDF or SVG for printing. Designers can edit them using third-party tools. Raster graphics such as PNG are also available, as well as the TikZ format for embedding figures in LaTeX documents. Interactive graphics exporters are available as plug-ins, such as Microsoft Seadragon graphics, and KMZ for exporting nodes with geographical coordinates. Data exported in GEXF with visual attributes (i.e. node position, color, size) can be re-used in browser-based viewers like Sigma.js.

GEXF
The file format working group of the Gephi Consortium created the Graph Exchange XML Format (GEXF), which is the standard used in Gephi to encode network data. This format is an XML language for describing network nodes and edges, attributes, hierarchies and their temporal evolution. GEXF improves on GraphML in its capability to encode dynamic networks. XML namespaces allow anyone to extend the format for specific purposes, for instance to add application data, without disrupting other applications. Libraries made by the community enable the reading and writing of GEXF files in C++, R, Python, Java, Perl and JavaScript. They facilitate the adoption of the format, and improve interoperability between Gephi and third-party tools. The GEXF format is also supported by other software such as NetworkX, Tulip, and GraphStream, and by on-line services like Issuecrawler.

Databases
Gephi can retrieve data from relational databases such as MySQL, SQL Server, PostgreSQL, SQLite and Teradata. The community creates plug-ins to support graph databases such as Neo4j, OrientDB, and InfiniteGraph.

Streaming
Real-world structures are constantly changing, and file formats are not suitable for exchanging this type of dynamic data. Many well-established on-line systems already stream data to their users using a streaming API. Twitter, for example, defined a Streaming API to allow near real-time access to its data.
Inspired by the GraphStream Java Library, the Graph Streaming API of Gephi provides a unified framework for streaming network events in a JSON format, such as the addition, modification and removal of nodes and edges over time. A client can receive data from a master, but the specifications allow more flexibility: clients can interact with the master by pushing data to it. In the case of two Gephi instances connected through this API, a change in the network on the master’s Gephi causes a change in the client’s Gephi, and a change on the client’s Gephi causes it to send requests to the master to update its network accordingly. Both instances work in a distributed mode. Different people can therefore work collaboratively to study a network.

[ Video of the Graph Streaming in action: https://www.youtube.com/watch?v=7SW_FDiY0sg ]

Layouts
Layouts are algorithms which position the nodes in the 2-D or 3-D graphic space. Choosing the right layout and tuning its parameters requires skills at the crossroads of art and science. The readability of network visualizations is indeed a matter of individual perception, knowledge of the data, and analytic skills. Layouts are used to help navigate the network. The various patterns created emphasize different properties of the network structure.

Force-directed algorithms
Gephi provides layouts of the class called force-directed algorithms. These layouts rely on a physical metaphor to position the nodes according to the position of the others. Roughly speaking, connected nodes tend to be closer, while disconnected nodes tend to be further apart. They are usually described as spring embedders [Kobourov, 2012] due to the way the forces are computed. Choosing a layout is a trade-off between the capability of the algorithm to handle the given data set, the user's time constraints, and the structural properties to be emphasised. Layouts may take edge weight into account in calculating forces.
They may prevent nodes from overlapping, thus increasing readability. Finally, some implementations can run faster on multi-core CPUs. The following table provides the technical capabilities of the available layouts:

Table 1. Layouts technical capabilities. The number of nodes and the time complexity give an order of magnitude.

layout               | # nodes     | time complexity | edge weight | node overlap | multi-cpu
Fruchterman-Reingold | 1 to 1,000  | O(N^2)          | no          | prevent      | no
ForceAtlas           | 1 to 10k    | O(N^2)          | yes         | prevent      | plug-in
ForceAtlas 2         | 1 to 1M     | O(N log(N))     | yes         | prevent      | no
OpenOrd              | 100 to 1M   | O(N log(N))     | yes         | cluttered    | native
Yifan Hu Multilevel  | 100 to 100k | O(N log(N))     | no          | cluttered    | no

Fruchterman-Reingold
This layout [Fruchterman, 1991] simulates the graph as a system of mass particles. The nodes are the mass particles and the edges are springs between the particles. The algorithm tries to minimize the energy of this physical system. It has become a standard but remains very slow (see Fig.4).

Fig. 4. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by Fruchterman-Reingold.

ForceAtlas
ForceAtlas is the home-grown layout of Gephi. It is made to lay out real-world networks, which have the following properties: scale-free distribution of node degree, and small-world effect (i.e. small distance between all nodes). It is focused on readability but it is slow (see Fig.5).

Fig. 5. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by ForceAtlas.

ForceAtlas 2
This is an improved version of ForceAtlas that handles large networks while keeping good readability. Node repulsion is approximated with a Barnes-Hut calculation [Barnes, 1986], which reduces the algorithm complexity. It replaces the attraction and repulsion force settings of ForceAtlas with a scaling parameter (see Fig.6).

[ Video of the layout on a grid: https://vimeo.com/24682771 ]

Fig. 6. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by ForceAtlas 2.

OpenOrd
OpenOrd is one of the few force-directed layout algorithms that can scale to over 1 million nodes, making it well suited to large graphs [Martin, 2011]. However, small graphs (i.e. a few hundred nodes or fewer) do not always end up looking good. The algorithm is originally based on Fruchterman-Reingold and works with a fixed number of iterations controlled via a simulated annealing type schedule (liquid, expansion, cool-down, crunch, and simmer). Long edges are cut to allow clusters to separate. This algorithm expects undirected weighted graphs and aims at better distinguishing clusters. It can be run in parallel on multiple processors to speed up computing. It stops automatically (see Fig.7).

[ Video of the layout on a grid: https://vimeo.com/24731034 ]

Fig. 7. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by OpenOrd.

Yifan Hu Multilevel
It is a very fast algorithm that gives good quality on large graphs. It combines a force-directed model with a graph coarsening technique to reduce the complexity [Hu, 2005]. The repulsive forces on one node from a cluster of distant nodes are approximated by a Barnes-Hut calculation, which treats them as one super-node. It stops automatically (see Fig.8).

[ Video of the layout on a grid: https://vimeo.com/24731449 ]

Fig. 8. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by Yifan Hu Multilevel.

Other layouts

Circular
It draws nodes in a circle ordered by any node attribute. It is useful to show a distribution of nodes with their links (see Fig.9).

Fig. 9. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by the Circular layout.

Radial Axis
It is provided with the Circular Layout plug-in. It groups nodes and draws the groups in axes (or spars) radiating outwards from a central circle.
Groups are generated using a metric (degree, betweenness centrality...) or an attribute. It is useful to study homophily by showing distributions of nodes inside groups with their links (see Fig.10).

Fig. 10. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by the Radial Axis layout.

Geographical
The GeoLayout uses latitude and longitude coordinates to set node positions in the graphic space. Several projections are available, including Mercator, which is used by Google Maps and other on-line services.

Graphviz binding
All Graphviz layouts are made available through a Gephi plug-in.

Metrics
Gephi provides classic statistics for the study of social networks. Network metrics are statistics related to the whole network. Node metrics are statistics related to each node. Edge metrics are statistics related to each edge.

Network metrics

Diameter
It is the maximal distance between any pair of nodes [Brandes, 2001].

Density
It is a measure of how close the network is to being complete. A complete graph has all possible edges and a density equal to 1.

Louvain Modularity
It is a non-overlapping community detection algorithm based on modularity optimization, able to run on large networks [Blondel, 2008]. Intuitively, it shows how the network divides naturally into groups of nodes with dense connections within groups and sparser connections between groups.

Number of Connected Components
Connected Components are subgraphs in which a path exists between all pairs of nodes, and no path exists from a node of the subgraph to a node not in the subgraph [Tarjan, 1972].

Clustering Coefficient
The Watts-Strogatz clustering coefficient, when applied to a single node, is a measure of how complete the neighborhood of a node is. When applied to an entire network, it is the average clustering coefficient over all of the nodes in the network [Latapy, 2008].
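
Three of the network metrics defined above (density, connected components, and the clustering coefficient) are simple enough to sketch directly. The Python below is a minimal illustration for undirected graphs without self-loops; it is not Gephi's code, and all function names are chosen here for the example. Nodes are taken from the edge list, so isolated nodes are not represented.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def density(adj):
    """Fraction of all possible edges that are present (1.0 for a complete graph)."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def connected_components(adj):
    """Return the connected components as a list of node sets (depth-first search)."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(component)
    return components


def local_clustering(adj, node):
    """How complete the neighborhood of `node` is: links among neighbors / possible links."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i, u in enumerate(nbrs) for v in nbrs[i + 1:] if v in adj[u])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Average of the local clustering coefficient over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj) if adj else 0.0


if __name__ == "__main__":
    # A triangle plus a separate pair of nodes.
    graph = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")])
    print(density(graph), len(connected_components(graph)), average_clustering(graph))
```

Nodes with fewer than two neighbors are given a local coefficient of 0 in this sketch; that convention is an assumption, and other tools may exclude such nodes from the average instead.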
Node metrics

Degree Centrality
The degree of a node is the number of edges that are incident to that node.

Betweenness Centrality
It measures how often a node appears on shortest paths between nodes in the network [Brandes, 2001].

Closeness Centrality
It is the average distance from a given node to all other nodes in the network [Brandes, 2001].

Eigenvector Centrality
It measures node importance in a network based on a node’s connections. A node is central to the extent that it is connected to other nodes that are central.

PageRank
It measures the importance of a Web page within the network by considering the probability that a user reaches this page by following hyperlinks. It is a variant of Eigenvector Centrality [Page, 1999].

HITS
Hyperlink-Induced Topic Search (HITS) is a link analysis algorithm that rates Web pages, developed by Jon Kleinberg [Kleinberg, 1999]. The HITS metric determines two values for a page: its authority, which estimates the value of the content of the page, and its hub value, which estimates the value of its links to other pages.

Edge metrics

Average Path Length
The average distance between all pairs of nodes. Connected nodes have distance 1. The diameter is the longest distance between any two nodes in the network (i.e. how far apart the two most distant nodes are) [Brandes, 2001].

Dynamic metrics
Some metrics can be computed over time: the number of nodes, the number of edges, the average degree, and the clustering coefficient.

Methods
Gephi has its roots in the Exploratory Data Analysis field of research. Promoted by John Tukey in the book Exploratory Data Analysis (1977) to visualize data sets and statistical results, this approach emphasizes the importance of curiosity and serendipity (i.e. discoveries made while searching for something else) in data analysis. As John Tukey says, “the greatest value of a picture is when it forces us to notice what we never expected to see”. The main benefit is the generation of novel questions and research hypotheses.
As depicted by Ben Fry in Computational Information Design (2004), one has to acquire and clean data, filter and compute statistics on it, represent and interact with it. But this process involves much back-and-forth between the different steps of data analysis. Visualizing the data may indeed reveal the need to acquire more data, or to filter it in another way; interacting with it may require changing visual variables and aesthetics. Gephi is designed to facilitate this non-linear process.

In particular, Gephi is focused on the visualization of the network, the real-time interaction with the data (e.g. node grouping, filtering, use of statistical results in the visualization), and the building of a visual language [Bertin, 1999]. This language makes use of circles and lines, colors and sizes to create informative visuals, which aim at being the network equivalent of geographical maps [Boyack, 2005].

Visualization
Gephi is focused on the creation of node-link diagrams, which are graphics of dots joined by lines as a representation of nodes (the dots) and edges (the lines). Users interact with the visualization to explore the network structure and raise hypotheses based on the visual patterns. Beyond layouts, the mapping of data attributes to visual attributes makes it possible to set node color and size, label color and size, and edge color and thickness. Interaction techniques available in Gephi include zooming and panning, node selection, node dragging, and tools like the node painter and shortest path discovery (see Fig.11).

Fig. 11. Visualization window.

Interoperability
With the support of various file formats, Gephi can exchange data with at least these network analysis tools: Cytoscape, CuttleFish, GraphStream, Graphviz, GUESS, IGraph, JUNG, Network Workbench, NodeXL, Pajek, Sonivis, Tulip, UCINET, and Visone.

Special options

Filters
A key aspect of social network analysis is the identification of groups and the study of the connections between them.
For instance, the study of homophily in networks relies on the correlation between the linkage structure and node attributes. One can ask “do people who like the same film tend to connect more with each other, and less with the rest of the network?” The discovery of relevant groups is made easier with filters, which are conditions on nodes or edges applied to view a subgraph. Gephi provides a user interface to create filter queries based on metrics and attributes. For example:
• Show only the nodes with degree between 38 and 125.
• Show only the nodes with gender attribute equal to “female”.
• Show only the nodes connected to a given node, and the relations between them.
Filters can be combined using boolean operators to create complex queries. A Python scripting plug-in enables the creation of scripts in a similar fashion to the GUESS software.

Network spreadsheet
The network can be seen as a list of nodes and edges. A node table and an edge table are available in the Data Laboratory. Users can add nodes and edges, and create or delete attributes. Each table has different features for searching, sorting, and editing data, like the merging of nodes, and the removal of duplicates.

Vector graphics maker
The publication of visual results requires control over rendering details and aesthetics, especially for printing. The user can tune the rendering of nodes and edges, and see the result before exporting it in a vector file format, either SVG or PDF (see Fig.12 and 13).

Fig. 12. Interface of the vector graphics maker.

Fig. 13. Examples of aesthetics improvements: fonts (left), colors and sizes (right).

Timeline
Dynamic networks are networks which evolve over time with the addition and removal of nodes and edges. They have been the subject of increasing interest, given their potential as a theoretical model and their promising applications. Following this trend, Gephi has incorporated tools to study dynamic networks.
From a visualization perspective, a critical tool is the Timeline component, which allows users to select pertinent time intervals to display and explore the corresponding network. The timeline component features a sparkline chart in the background of the interval selection drawer. This feature helps users to focus on particular periods of the evolution of the dynamic network, like bursts of connections or changes in network density or other simple metrics. The timeline animation enables the selected time frame to slide while the corresponding network is displayed on the screen, like a movie player (see Fig.14).

Fig. 14. Timeline component, where the sparkline shows the number of edges over time. The selected period is from December to April.

Software updates
When fixes are deployed, the users are notified in the Gephi interface, where they can apply them in a few clicks. Gephi automatically gets the list of available plug-ins from the Gephi Plugin portal.

Documentation

Information
• Website: https://gephi.org/
• Wiki: http://wiki.gephi.org/
• Video introduction: http://vimeo.com/9726202

Help
• Forum: https://forum.gephi.org/ (5000+ posts, 1300+ topics, 800+ active members)
• Mailing-lists: [email protected], [email protected]
• Individual contact: [email protected]

News
• Blog: https://gephi.org/blog/
• Twitter: https://twitter.com/gephi
• Facebook: https://www.facebook.com/groups/gephi/
• Video channel: https://vimeo.com/channels/gephi/

Key Applications

Web analysis: the e-Diasporas Atlas
The Digital Diasporas Atlas project aims at mapping and analyzing the occupation (in a quasi-geopolitical sense) of digital territories by the “connected migrants” [Diminescu, 2008]. In the context of the e-Diasporas Atlas, the network serves primarily to allow the formulation of research hypotheses. “Networks serve as an embodiment of the construction of an interpretation of data.
They thus all have a heuristic function, their interpretation being an aspect of visual analytics.” Gephi has been used to visualise and interpret the structuring and distribution of actors in migrant-community networks on the Web. “By handling the network, by observing its evolution (timeline), by visualizing the place and the connections of a given website, by identifying clusters, by filtering the data by categories, in brief by interpreting the graph, the researcher produces various representations (or views) of the corpus that allow him to formulate hypothesis of research that will be supported (or not) by other online/offline fieldwork investigations.” [Diminescu, 2011]

Social media analysis: Truthy
“Truthy is a system to analyze and visualize the diffusion of information on Twitter. It evaluates thousands of tweets an hour to identify new and emerging bursts of activity around memes of various flavors. The data and statistics provided by Truthy are designed to aid in the study of social epidemics: How do memes propagate through the Twittersphere? What causes a burst of popularity?” [Ratkiewicz, 2011]

This system helps users to identify suspicious memes which might deliver disinformation or propaganda. These memes are first selected by the system by analysing the diffusion network of the information (i.e. retweets and mentions). Then users classify the meme by exploring the related statistics, timeline and visualizations of the diffusion network. The Gephi Toolkit is used for rendering the visualizations.

Dynamic network analysis: Face-to-Face Contact Patterns
“Describing and understanding contacts between children at school would help quantify the transmission opportunities of respiratory infections and identify situations within schools where the risk of transmission is higher. The measurements were carried out in a French school (6–12 years children).
Data were collected on the time-resolved face-to-face proximity of children and teachers.” [Stehlé, 2011] The dynamical evolution of the contacts was visualized using Gephi. The video is available at https://vimeo.com/31490438.

Future Directions
The Gephi Consortium identifies strategic needs from industry and research, creates standards to ensure interoperability, and organizes the contributors to produce generic and reusable parts of Gephi. A stable Gephi 1.0 is under study, in parallel with developments to include Dynamic Network Analysis, improvements to visualization capabilities using shader techniques on the GPU, and customizable renderers for information visualization research. A web marketplace is currently being developed to facilitate the exchange of services between members of the community, like professional training, consulting and private development.

Cross-References
Visualization of Networks, Visualization of Large Networks, Data Mining, Large Networks Analysis of, Mapping Online Networks, Network Representations of Complex Data, Temporal Networks, Formats, Linked Open Data

References
[Barnes and Hut (1986)] Barnes J, Hut P (1986) A hierarchical O(N log N) force-calculation algorithm. In: Nature (ISSN 0028-0836), vol. 324, Dec. 4, 1986, pp. 446–449
[Bastian et al. (2009)] Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Proceedings of the Third International AAAI Conference on Weblogs and Social Media (ICWSM’09), in American Journal of Sociology, pp. 361–362
[Bertin J (1999)] Bertin J (1999) Sémiologie graphique: les diagrammes, les réseaux, les cartes. Editions de l’Ecole des Hautes Etudes en Sciences Sociales
[Blondel et al. (2008)] Blondel V, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. In: Journal of Statistical Mechanics: Theory and Experiment 2008 (10), P10008
[Boyack et al. (2005)] Boyack K W, Klavans R, Börner K (2005) Mapping the backbone of science. In: Scientometrics 64(3), pp. 351–374
[Brandes U (2001)] Brandes U (2001) A faster algorithm for betweenness centrality. In: Journal of Mathematical Sociology, vol. 25, pp. 163–177
[Diminescu D (2008)] Diminescu D (2008) The Connected Migrant: an Epistemological Manifest. In: Social Sciences Information, vol 47
[Diminescu et al. (2011)] Diminescu D, Bourgeois M, Renault M, Jacomy M (2011) Digital Diasporas Atlas Exploration and Cartography of Diasporas in Digital Networks. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media (ICWSM’11)
[Fruchterman and Reingold (1991)] Fruchterman T M J, Reingold E M (1991) Graph Drawing by Force-Directed Placement. In: Software: Practice and Experience, 21(11)
[Fry B (2004)] Fry B (2004) Computational Information Design. Ph.D. Thesis
[Hu Y F (2005)] Hu Y F (2005) Efficient and high quality force-directed graph drawing. In: The Mathematica Journal, 10 (37-71)
[Kleinberg J (1999)] Kleinberg J (1999) Authoritative sources in a hyperlinked environment. In: Journal of the ACM 46 (5): 604–632
[Kobourov S G (to appear in 2012)] Kobourov S G (to appear in 2012) Force-Directed Drawing Algorithms. In: Handbook of Graph Drawing and Visualization, CRC Press
[Knuth D E (1993)] Knuth D E (1993) The Stanford GraphBase: A Platform for Combinatorial Computing. Addison-Wesley, Reading, MA
[Latapy M (2008)] Latapy M (2008) Main-memory Triangle Computations for Very Large (Sparse (Power-Law)) Graphs. In: Theoretical Computer Science (TCS) 407 (1-3), pp. 458–473
[Martin et al. (2011)] Martin S, Brown W M, Klavans R, Boyack K (2011) OpenOrd: An Open-Source Toolbox for Large Graph Layout. In: SPIE Conference on Visualization and Data Analysis (VDA)
[Page et al. (1999)] Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: Bringing order to the Web. Technical Report. Stanford InfoLab
[Ratkiewicz et al. (2011)] Ratkiewicz J, Conover M, Meiss M, Gonçalves B, Patil S, Flammini A, Menczer F (2011) Truthy: mapping the spread of astroturf in microblog streams. In: Proceedings of the 20th international conference companion on World wide web (WWW ’11)
[Stehlé et al. (2011)] Stehlé J, Voirin N, Barrat A, Cattuto C, Isella L, Pinton J-F, Quaggiotto M, Van den Broeck W, Régis C, Lina B, Vanhems P (2011) High-Resolution Measurements of Face-to-Face Contact Patterns in a Primary School. In: PLoS One, August 16, 2011
[Tarjan R (1972)] Tarjan R (1972) Depth-First Search and Linear Graph Algorithms. In: SIAM Journal on Computing 1 (2): 146–160
[Tukey J (1977)] Tukey J (1977) Exploratory Data Analysis, 1 edn., Addison-Wesley

Recommended Reading
[Conway and White (2012)] Conway D, White J M (2012) Machine Learning for Hackers. Chapter Analyzing Social Graphs, Visualizing the Clustered Twitter Network with Gephi
[Bulik-Sullivan and Sullivan (2012)] Bulik-Sullivan B, Sullivan P (2012) The authorship network of genome-wide association studies. In: Nature Genetics 44, 113
[De Maeyer J (2010)] De Maeyer J (2010) Methods for mapping hyperlink networks: Examining the environment of Belgian news websites. In: 11th International Symposium on Online Journalism
[Helmond and Weltevrede (2012)] Helmond A, Weltevrede E (2012) Where do bloggers blog? Platform transitions within the historical Dutch blogosphere. In: First Monday, vol 17, number 2
[Kelly et al. (2012)] Kelly J, Barash V, Alexanyan K, Etling B, Faris R, Gasser U, Palfrey J (2012) Mapping Russian Twitter. In: Berkman Center Research Publication No. 2012-3
[Latour et al. (2012)] Latour B, Jensen P, Venturini T, Grauwin S, Boullier D (2012) The Whole is Always Smaller Than Its Parts. In: British Journal of Sociology
[Oldham et al. (2012)] Oldham P, Hall S, Burton G (2012) Synthetic Biology: Mapping the Scientific Landscape. In: PLoS ONE 7(4): e34368
[Teng et al. (2011)] Teng C-Y, Lin Y-R, Adamic L (2011) Recipe recommendation using ingredient networks
[Barabasi A-L (2003)] Barabasi A-L (2003) Linked: How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life. Plume
[Börner K (2010)] Börner K (2010) Atlas of Science: Visualizing What We Know. The MIT Press
[Easley and Kleinberg (2010)] Easley D, Kleinberg J (2010) Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press
[Newman et al. (2006)] Newman M, Barabasi A-L, Watts D J (2006) The Structure and Dynamics of Networks. Princeton University Press
[Watts D J (2003)] Watts D J (2003) Six Degrees: The Science of a Connected Age. W. W. Norton & Company