Title:
Gephi
Name:
Sébastien Heymann
Affil./Addr.:
Université Pierre et Marie Curie, LIP6, ComplexNetworks team
4 place Jussieu, 75005 Paris, France
Phone: +33 (0)1 44 27 88 88
E-mail: [email protected]
Gephi
Synonyms
exploratory network analysis, network visualization, visual analytics, open source
Glossary
• API: an Application Programming Interface is an interface that lets software components communicate with each other while keeping the components clearly separated.
• Homophily: tendency to link to similar others.
• Layout: algorithm which calculates the position of elements in a graphic space.
• Raster image: image encoded by a two-dimensional matrix of pixels.
• Shortest path: path of minimal length between two nodes of a network.
• Sparkline: small, intense, simple, word-sized graphic with typographic resolution.
• Subgraph: graph whose nodes and edges are subsets of the nodes and edges of another graph.
• Vector image: image encoded by a set of geometrical functions.
Definition
Gephi was created in 2008 by Mathieu Bastian, Sébastien Heymann, and Mathieu
Jacomy, and extended by Eduardo Ramos Ibañez, Cezary Bartosiak, Julian Bilcke,
Patrick McSweeney, André Panisson, Jérémy Subtil, Helder Suzuki, Martin Skurla,
and Antonio Patriarca. It is suitable for the analysis of all kinds of complex networks,
although it is mostly used for social network analysis. It is distributed using a dual
licensing scheme under the GNU General Public License (GNU GPL) v3 and the
Common Development and Distribution License (CDDL) v1. Gephi can be used as a
stand-alone desktop application, and as a Java library for embedding some of its
features in third-party programs. It scales to 10,000 nodes and edges with 1 GB of RAM
and 1 CPU, and up to 1 million nodes and edges with 32 GB of RAM and 8 CPUs. It
runs on Linux, Windows, and Mac OS X. It is written in Java 6 and uses OpenGL 1.2.
Introduction
Gephi is open-source software for the visual exploration of networks (also called
graphs). A network is made of a set of entities, called the nodes, and a set of relationships
between entities, called the edges. While various software packages exist to visualize and analyse
networks, Gephi is particularly suited for networks with node attributes. Attributes are
key-value pairs associated with each node or each edge. For example, individuals in a social
network may have attributes such as gender, language, and age.
[ Video introducing Gephi: https://vimeo.com/9726202 ]
Gephi users interact with the visualization in real time to position the nodes in
a two- or three-dimensional space using layout algorithms, or by manually moving nodes
(see Fig.1). They use node attributes to change the color and size of the nodes, in order
to find groups and individuals. The goal is to study the correlation of node attributes
and network structure by using visual patterns. Classic metrics of social network analysis, such as node degree or betweenness centrality measures, can be computed and
used in the visualization as well (see Fig.2). The network can also be filtered based on
attributes.
Fig. 1. Overview of Gephi 0.8.
Fig. 2. Network visualization example. Node size is proportional to the betweenness centrality value
of the node.
Gephi is not limited to social networks. Any kind of network can be analysed,
like the internet topology (i.e. connections between machines), peer-to-peer file-sharing
networks, biological networks, on-line social networks (e.g. Twitter, Facebook), communication (e.g. email) and financial networks, but also semantic networks, organizational
networks and more.
Gephi aims at covering the entire process from data import to aesthetic
refinement and interaction. Data can be imported and exported in various file formats,
and can be retrieved from databases. Once the visual exploration is over, the user refines
the aesthetics and exports graphics in vector file formats, which ensures readability and quality
both in print and in interactive graphics.
This project is supported by an international community, which is led by the
French non-profit corporation called the Gephi Consortium.
Key Points
The strengths of Gephi are real-time visual feedback, performance, modularity, and its
community.
The Gephi user interface is focused on the creation of network visuals in real time. The key innovation is to ease the interactions with the network. The user can
literally play with the visual representation of the network. By playing, we mean experimenting with various visual configurations and seeing the outcome of each
action instantaneously. This is made possible by the following features. The user applies layout algorithms to shape the network structure in 2-D or 3-D, for instance using
force-directed layouts. Such algorithms calculate the layout of a network using repulsive forces between all nodes, and attractive forces between adjacent nodes. Each layout iteration calculates the forces applied to each node, and updates
each node position. The visualization is refreshed at each iteration, therefore providing
real-time feedback for users. Some layouts are implemented with no stopping condition.
The user can therefore tweak the layout parameters in real time, until they decide to
stop its execution. Interaction during layout calculation is made technically possible by
multi-threaded processing, and by using the GPU to render the visualization.
Gephi is stable and scales to networks of up to 1 million nodes
and edges. The rendering engine handles large networks while remaining
responsive. The minimum technical requirements of the software make small networks actionable on low-end configurations such as netbooks. Large networks of around a
million nodes and edges can also be analysed on visualization servers. In addition
to the interactive exploration of large networks, Gephi provides efficient implementations
of classic metrics used in Social Network Analysis, including Betweenness Centrality,
Clustering Coefficient, PageRank, and Louvain Modularity for community detection.
Gephi is a stand-alone application, built with Java SE 6 on top of the NetBeans
Platform, which is a framework for creating applications (see Fig.3). An installer makes
Gephi available on all platforms that run the Java Virtual Machine. A graphics
card supporting OpenGL 1.2 is required. The features are extensible through Java plug-ins
which use the Gephi APIs. For instance, the OpenOrd layout algorithm is a plug-in
which implements the Layout API. This modular source code structure makes the software
maintainable. A version of the software without the user interface is also distributed
under the name of Gephi Toolkit. It is used as a Java library to create novel applications
on the desktop or on a server.
Finally, key benefits are provided by the Gephi community: members answer
questions on the forum and fix the most common bugs. They organize meet-ups in
their cities, and provide training seminars to newcomers.
Fig. 3. Architecture of Gephi (left) and Gephi Toolkit (right).
Historical Background
Gephi has been developed since 2008. It was primarily created to enable researchers
in social sciences to study the Web at Fondation Maison des Sciences de l’Homme in
Paris, France. Today, the Gephi Consortium aims at creating a sustainable software and
technical ecosystem, driven by an international open-source community which shares
common interests in networks and complex systems.
Since the beginning, a non-profit organization called Association Gephi has provided
a legal entity to support, protect and promote the Gephi project. Hosted in turn
by Association WebAtlas, Linkfluence SAS and SciencesPo Medialab, the initial contributors Mathieu Bastian, Sébastien Heymann and Mathieu Jacomy have progressively
set up an international community of users and contributors. They notably participated
in the Google Summer of Code program each year since 2009, and won the Oracle 2010
Duke’s Choice Award for best Innovative Technical Data Visualization. They launched
the Gephi Consortium in 2011, which is a non-profit corporation created to join the
efforts of industry, laboratories and civil society in building Gephi. Created under the
French law of July 1st, 1901, it is governed by a board of directors. The Gephi Consortium makes an R&D effort to build generic and reusable parts of Gephi, improves the
technology at low cost, and creates standards to ensure interoperability.
Research partners include Inria, Sciences-Po Medialab, Fondation Maison des
Sciences de l’Homme TIC-Migrations, UPMC-CNRS LIP6 ComplexNetworks, Université de Technologie de Compiègne COSTECH, ISI Foundation, Indiana University Center for Complex Networks and Systems Research, and Stanford Mapping the Republic
of Letters. Private partners include Quid Inc, Linkfluence SAS, and Neo Technology Inc.
Features
Input/output data formats
File input
While many file formats exist to encode network data, Gephi supports the most common ones: CSV, GDF, GEXF, GML, GraphML, Graphviz DOT, Pajek NET, Tulip
TLP, Ucinet DL, XGMML, and RDF. The latter is available through the Semantic
Web plugin, developed by the Inria WIMMICS team. A spreadsheet importer helps
users to model tabular data as a network.
File Output
Data can be saved in Gephi sessions, but also in other formats like node and edge
lists, GDF, GEXF, GraphML, Pajek NET, Ucinet DL, and CXF. Therefore, users
can migrate data from one software to another in order to benefit from their specific
features.
Graphical Output
Network visuals can be exported in PDF or SVG for printing. Designers can edit them
using third-party tools. Raster formats such as PNG are also available, as well as the
TikZ format for embedding figures in LaTeX documents. Interactive graphics exporters
are available as plug-ins, such as Microsoft Seadragon graphics, and KMZ for exporting
nodes with geographical coordinates. Data exported in GEXF with visual attributes
(i.e. node position, color, size) can be re-used in browser-based viewers like Sigma.js.
GEXF
The file format working group of the Gephi Consortium created the Graph Exchange
XML Format (GEXF), which is the standard used in Gephi to encode network data.
This format is an XML language for describing network nodes and edges, attributes,
hierarchies and their temporal evolution. GEXF is an improvement compared to
GraphML for the capability to encode dynamic networks. The namespaces allow anyone
to extend the format for specific purposes, for instance the addition of application data,
without disrupting other applications. The libraries made by the community enable the
reading and writing of GEXF files in C++, R, Python, Java, Perl and Javascript. They
facilitate the adoption of the format, and improve interoperability between Gephi and
third-party tools. The GEXF format is also supported by other software such as NetworkX, Tulip, and GraphStream, and by on-line services like Issuecrawler.
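As an illustration of the format, the following Python sketch writes a minimal static GEXF document using only the standard library. The namespace URI and the version number correspond to the GEXF 1.2 draft and are an assumption here, since the text does not say which version is meant. The helper name build_gexf and the sample labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: GEXF 1.2 draft namespace; adjust to the version you target.
GEXF_NS = "http://www.gexf.net/1.2draft"


def build_gexf(nodes, edges):
    """Serialize a small undirected network to a GEXF string.

    nodes: dict mapping node id -> label
    edges: list of (source id, target id) pairs
    """
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(
        root, "graph", {"mode": "static", "defaultedgetype": "undirected"}
    )
    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, label in nodes.items():
        ET.SubElement(nodes_el, "node", {"id": str(node_id), "label": label})
    edges_el = ET.SubElement(graph, "edges")
    for index, (source, target) in enumerate(edges):
        ET.SubElement(
            edges_el,
            "edge",
            {"id": str(index), "source": str(source), "target": str(target)},
        )
    return ET.tostring(root, encoding="unicode")


document = build_gexf({"0": "Valjean", "1": "Javert"}, [("0", "1")])
print(document)
```

Node attributes, hierarchies and time intervals are left out of this sketch; they would be added as further elements and attributes inside the same graph element.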
Databases
Gephi can retrieve data from relational databases such as MySQL, SQL Server, PostgreSQL, SQLite and Teradata. The community creates plug-ins to support graph
databases such as Neo4j, OrientDB, and InfiniteGraph.
Streaming
Real-world structures are constantly changing, and file formats are not suited to exchanging this type of dynamic data. Many well-established on-line systems already
stream data to their users using a streaming API. Twitter, for example, defined a Streaming API to allow near real-time access to its data. Inspired by the GraphStream Java
Library, the Graph Streaming API of Gephi provides a unified framework for streaming
network events in a JSON format, such as the addition, modification and removal
of nodes and edges over time. A client can receive data from a master, but the specifications allow more flexibility: clients can also interact with the master by pushing data
to it. When two Gephi instances are connected through this API, a change in
the network on the master instance causes the same change on the client instance, and a
change on the client instance makes it send requests to the master to update its
network accordingly. Both instances work in a distributed mode. Different people can
therefore work collaboratively to study a network.
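To make the idea of streamed network events concrete, here is a small Python sketch of a client that applies JSON events to an in-memory network. The two-letter event keys ("an", "cn", "dn" for adding, changing and deleting a node, and "ae", "ce", "de" for edges) are an assumption about the event vocabulary and should be checked against the Graph Streaming plug-in documentation. The function name apply_event and the sample events are hypothetical.

```python
import json


def apply_event(graph, line):
    """Apply one JSON event line to graph = {"nodes": {...}, "edges": {...}}."""
    event = json.loads(line)
    for kind, items in event.items():
        action, target = kind[0], kind[1]  # e.g. "an" -> add, node
        store = graph["nodes"] if target == "n" else graph["edges"]
        for ident, attrs in items.items():
            if action == "a":
                store[ident] = dict(attrs)
            elif action == "c":
                store.setdefault(ident, {}).update(attrs)
            elif action == "d":
                store.pop(ident, None)
                if target == "n":
                    # Removing a node also removes the edges attached to it.
                    graph["edges"] = {
                        e: a
                        for e, a in graph["edges"].items()
                        if ident not in (a.get("source"), a.get("target"))
                    }
    return graph


# Hypothetical stream: one JSON object per line, applied in order.
stream = [
    '{"an": {"A": {"label": "Node A"}}}',
    '{"an": {"B": {"label": "Node B"}}}',
    '{"ae": {"AB": {"source": "A", "target": "B"}}}',
    '{"cn": {"A": {"size": 2}}}',
    '{"dn": {"B": {}}}',
]
network = {"nodes": {}, "edges": {}}
for event_line in stream:
    apply_event(network, event_line)
print(network)
```

A master would emit such lines over a long-lived connection, and a client pushing changes back would send the same kind of objects in the other direction.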
[ Video of the Graph Streaming in action: https://www.youtube.com/watch?v=7SW_FDiY0sg ]
Layouts
Layouts are algorithms which position the nodes in the 2-D or 3-D graphic space.
Choosing the right layout and tuning its parameters requires skills at the intersection of
art and science. The readability of network visualizations is indeed a matter of
individual perception, knowledge of the data, and analytic skills. Layouts are used to
help navigate the network. The various patterns they create emphasize different properties
of the structure of networks.
Force-directed algorithms
Gephi provides layouts of the class called force-directed algorithms. These layouts rely
on a physical metaphor to position each node according to the positions of the others.
Roughly speaking, connected nodes tend to be closer together, while disconnected nodes tend to
be farther apart. They are usually described as spring embedders [Kobourov, 2012] due to the
way the forces are computed. Choosing a layout is a trade-off between the capability of
the algorithm to handle the given data set, the user’s time constraints, and the structural
properties to be emphasised. Layouts may take edge weight into account when calculating
forces. They may prevent nodes from overlapping, thus increasing readability. Finally, some
implementations can run faster on multi-core CPUs. The following table provides the
technical capabilities of available layouts:
Table 1. Layout technical capabilities. The number of nodes and the time complexity give orders of
magnitude.
layout                 # nodes       time complexity  edge weight  node overlap  multi-cpu
Fruchterman-Reingold   1 to 1,000    O(N^2)           no           prevent       no
ForceAtlas             1 to 10k      O(N^2)           yes          prevent       plug-in
ForceAtlas 2           1 to 1m       O(N log(N))      yes          prevent       no
OpenOrd                100 to 1m     O(N log(N))      yes          cluttered     native
Yifan Hu Multilevel    100 to 100k   O(N log(N))      no           cluttered     no
Fruchterman-Reingold
This layout [Fruchterman, 1991] simulates the graph as a system of mass particles.
The nodes are the mass particles and the edges are springs between the particles. The
algorithm tries to minimize the energy of this physical system. It has become a standard
but remains very slow (see Fig.4).
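The particle-and-spring idea can be sketched in a few lines of Python. The sketch below uses the force model usually attributed to Fruchterman and Reingold (repulsion k^2/d between every pair of nodes, attraction d^2/k along each edge, displacement capped by a decreasing temperature). The cooling factor, the initial temperature and the function name are assumptions chosen for illustration, frame clamping is omitted, and this is not Gephi's implementation. The double loop over node pairs is what gives the O(N^2) cost per iteration.

```python
import math
import random


def fruchterman_reingold(nodes, edges, iterations=50, width=1.0, height=1.0, seed=0):
    """Return a dict node -> [x, y] after a fixed number of iterations."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    temperature = width / 10  # assumption: initial cap on the displacement
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsive force k^2 / d between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                force = k * k / d
                disp[u][0] += dx / d * force
                disp[u][1] += dy / d * force
                disp[v][0] -= dx / d * force
                disp[v][1] -= dy / d * force
        # Attractive force d^2 / k along every edge.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            force = d * d / k
            disp[u][0] -= dx / d * force
            disp[u][1] -= dy / d * force
            disp[v][0] += dx / d * force
            disp[v][1] += dy / d * force
        # Move each node, limiting the step by the current temperature.
        for v in nodes:
            dx, dy = disp[v]
            d = math.hypot(dx, dy) or 1e-9
            step = min(d, temperature)
            pos[v][0] += dx / d * step
            pos[v][1] += dy / d * step
        temperature *= 0.95  # assumption: simple geometric cooling
    return pos


positions = fruchterman_reingold(
    ["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
)
print(positions)
```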
ForceAtlas
ForceAtlas is the home-brew layout of Gephi. It is made to lay out real-world networks,
which have the following properties: scale-free distribution of node degree, and small-world effect (i.e. small distance between all nodes). It is focused on readability but it
is slow (see Fig.5).
Fig. 4. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by
Fruchterman-Reingold.
Fig. 5. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by
ForceAtlas.
ForceAtlas 2
ForceAtlas 2 is an improved version of ForceAtlas that handles large networks while keeping good
readability. Node repulsion is approximated with a Barnes-Hut calculation [Barnes,
1986], which reduces the algorithm complexity. It replaces the attraction and
repulsion forces of ForceAtlas by a scaling parameter (see Fig.6).
[ Video of the layout on a grid: https://vimeo.com/24682771 ]
Fig. 6. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by
ForceAtlas 2.
OpenOrd
OpenOrd is one of the few force-directed layout algorithms that can scale to over 1 million
nodes, making it ideal for large graphs [Martin, 2011]. However, small graphs (i.e.
a few hundred nodes or fewer) do not always end up looking good. The algorithm is originally
based on Fruchterman-Reingold and works with a fixed number of iterations controlled
via a simulated-annealing type of schedule (liquid, expansion, cool-down, crunch, and
simmer). Long edges are cut to allow clusters to separate. This algorithm expects
undirected weighted graphs and aims at better distinguishing clusters. It can be run
in parallel on multiple processors to speed up computing. It stops automatically (see
Fig.7).
[ Video of the layout on a grid: https://vimeo.com/24731034 ]
Yifan Hu Multilevel
It is a very fast algorithm with good quality on large graphs. It combines a force-directed model with a graph coarsening technique to reduce the complexity [Hu, 2005].
The repulsive forces on one node from a cluster of distant nodes are approximated by
a Barnes-Hut calculation, which treats them as one super-node. It stops automatically
(see Fig.8).
Fig. 7. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by
OpenOrd.
Fig. 8. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by
Yifan Hu Multilevel.
[ Video of the layout on a grid: https://vimeo.com/24731449 ]
Other layouts
Circular
It draws nodes in a circle ordered by any node attribute. It is useful to show a distribution of nodes with their links (see Fig.9).
Fig. 9. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by
the Circular layout.
Radial Axis
It is provided with the Circular Layout plug-in. It groups nodes and draws the groups
in axes (or spars) radiating outwards from a central circle. Groups are generated using a
metric (degree, betweenness centrality...) or an attribute. It is useful to study homophily
by showing distributions of nodes inside groups with their links (see Fig.10).
Fig. 10. Coappearance network of characters in the novel Les Miserables [Knuth, 1993], laid out by
the Radial Axis layout.
Geographical
The GeoLayout uses latitude and longitude coordinates to set node positions in the
graphic space. Several projections are available, including Mercator, which is used by
Google Maps and other on-line services.
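As a hedged illustration of how such a projection turns coordinates into node positions, the following Python sketch applies the standard spherical Mercator formulas. Whether the GeoLayout plug-in uses exactly this variant, and with which scaling, is an assumption. The function name mercator and the unit radius are hypothetical choices.

```python
import math


def mercator(latitude_deg, longitude_deg, radius=1.0):
    """Project latitude/longitude in degrees to planar (x, y) coordinates."""
    lat = math.radians(latitude_deg)
    lon = math.radians(longitude_deg)
    x = radius * lon
    # Valid for latitudes strictly between -90 and 90 degrees.
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# Hypothetical node with geographical attributes (Paris, approximately).
node = {"label": "Paris", "lat": 48.85, "lon": 2.35}
node["x"], node["y"] = mercator(node["lat"], node["lon"])
print(node)
```

The projection stretches distances near the poles, so node positions at high latitudes are spread further apart than their geographical distance would suggest.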
Graphviz binding
All Graphviz layouts are made available through a Gephi plug-in.
Metrics
Gephi provides classic statistics for the study of social networks. Network metrics are
statistics related to the whole network. Node metrics are statistics related to each node.
Edge metrics are statistics related to each edge.
Network metrics
Diameter
It is the maximal shortest-path distance over all pairs of nodes [Brandes, 2001].
Density
It is a measure of how close the network is to complete. A complete graph has all
possible edges and density equal to 1.
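For a simple graph with n nodes and m edges, the usual definition is the ratio of m to the number of possible edges: n(n-1)/2 in the undirected case and n(n-1) in the directed case. A minimal Python sketch, with a hypothetical function name and assuming no self-loops or parallel edges:

```python
def density(node_count, edge_count, directed=False):
    """Ratio of existing edges to possible edges in a simple graph."""
    if node_count < 2:
        return 0.0
    possible = node_count * (node_count - 1)
    if not directed:
        possible /= 2
    return edge_count / possible


# A complete undirected graph on 4 nodes has 6 edges.
print(density(4, 6))
```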
Louvain Modularity
It is a non-overlapping community detection algorithm based on modularity optimization able to run on large networks [Blondel, 2008]. Intuitively, it shows how the network
divides naturally into groups of nodes with dense connections within groups and sparser
connections between groups.
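The quantity being optimized can be illustrated in Python. The sketch below only computes the modularity score Q of a given partition of an unweighted, undirected graph, using the standard form Q = sum over communities of (L_c / m - (d_c / 2m)^2), where L_c is the number of edges inside community c, d_c the sum of the degrees of its nodes, and m the total number of edges. It does not implement the Louvain optimization itself, and the function name is hypothetical.

```python
def modularity(edges, community):
    """Modularity of a partition.

    edges: list of undirected (u, v) pairs
    community: dict mapping each node to its community label
    """
    m = len(edges)
    internal = {}    # number of edges inside each community
    degree_sum = {}  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single edge, split into their natural groups.
two_triangles = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
groups = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
print(modularity(two_triangles, groups))
```

Putting every node in a single community gives a score of zero, which is why a positive score indicates more intra-group edges than expected at random.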
Number of Connected Components
Connected Components are subgraphs in which a path exists between all pairs of nodes,
and no path exists from a node of the subgraph to a node not in the subgraph [Tarjan,
1972].
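The definition above can be sketched for an undirected graph with a breadth-first search from each unvisited node. This is a simple illustration and not the depth-first algorithm of the cited reference; the function name is hypothetical.

```python
from collections import deque


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(component))
    return components


parts = connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
print(len(parts), parts)
```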
Clustering Coefficient
The Watts-Strogatz clustering coefficient, when applied to a single node, is a measure
of how complete the neighborhood of a node is. When applied to an entire network, it
is the average clustering coefficient over all of the nodes in the network [Latapy, 2008].
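A minimal Python sketch of both variants, assuming an undirected graph stored as a dictionary of neighbor sets (the representation and the function names are hypothetical). The local coefficient of a node with k neighbors is the number of edges among those neighbors divided by k(k-1)/2, and nodes with fewer than two neighbors are given a coefficient of zero here.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Average of the local coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# A triangle a-b-c with a pendant node d attached to c.
triangle_with_tail = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(triangle_with_tail))
```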
Node metrics
Degree Centrality
The degree of a node is the number of edges that are incident to that node.
Betweenness Centrality
It measures how often a node appears on shortest paths between nodes in the network
[Brandes, 2001].
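The following Python sketch follows the accumulation scheme of the cited algorithm for an unweighted, undirected graph: one breadth-first search per source node counts shortest paths, and dependencies are then accumulated in reverse order of distance. It is an illustrative version rather than Gephi's code; the adjacency-dictionary representation and the function name are assumptions, and scores are left unnormalized.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality of every node of an undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: value / 2 for v, value in score.items()}


star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
print(betweenness(star))
```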
Closeness Centrality
It is the average distance from a given node to all other nodes in the network [Brandes,
2001].
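With the definition given here (an average distance, so lower values mean a more central node), the metric can be sketched with one breadth-first search. Note that many libraries report the inverse of this quantity instead. The function name is hypothetical, and only nodes reachable from the source are averaged, which is an assumption about how disconnected graphs are handled.

```python
from collections import deque


def average_distance(adj, source):
    """Average shortest-path distance from `source` to the nodes it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    others = [d for node, d in dist.items() if node != source]
    return sum(others) / len(others) if others else 0.0


chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print({v: average_distance(chain, v) for v in chain})
```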
Eigenvector Centrality
It measures node importance in a network based on the node’s connections. A node is central to the
extent that it is connected to other nodes that are central.
PageRank
It measures the importance of a Web page within the network, considering the probability that a user
reaches this page by following hyperlinks. It is a variant of the Eigenvector Centrality
[Page, 1999].
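The random-surfer idea can be sketched as a power iteration in Python. The damping factor of 0.85, the fixed number of iterations, and the uniform redistribution of the rank of pages without outgoing links are common conventions assumed here, not details taken from the text. The function name is hypothetical.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    out_links: dict mapping each node to the list of nodes it links to
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


# Two pages link to "a", which links to nothing.
ranks = pagerank({"a": [], "b": ["a"], "c": ["a"]})
print(ranks)
```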
HITS
Hyperlink-Induced Topic Search (HITS) is a link analysis algorithm that rates Web
pages, developed by Jon Kleinberg [Kleinberg, 1999]. The HITS metric determines two
values for a page: its authority, which estimates the value of the content of the page,
and its hub value, which estimates the value of its links to other pages.
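The mutual reinforcement between the two values can be sketched as an iteration in Python: authority scores are summed from the hub scores of the pages linking in, and hub scores are summed from the authority scores of the pages linked to. Normalizing by the sum at each step and using a fixed number of iterations are assumptions made for simplicity, and the function name is hypothetical.

```python
def hits(out_links, iterations=50):
    """Return (hub, authority) score dictionaries for a directed graph."""
    nodes = list(out_links)
    hub = {v: 1.0 for v in nodes}
    authority = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of pages pointing to the page.
        authority = {v: 0.0 for v in nodes}
        for u in nodes:
            for v in out_links[u]:
                authority[v] += hub[u]
        total = sum(authority.values()) or 1.0
        authority = {v: a / total for v, a in authority.items()}
        # Hub: sum of the authority scores of the pages it points to.
        hub = {u: sum(authority[v] for v in out_links[u]) for u in nodes}
        total = sum(hub.values()) or 1.0
        hub = {u: h / total for u, h in hub.items()}
    return hub, authority


hub_scores, authority_scores = hits({"a": ["c"], "b": ["c"], "c": []})
print(hub_scores, authority_scores)
```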
Edge metrics
Average Path Length
The average distance between all pairs of nodes. Connected nodes have distance 1. The
diameter is the longest distance between any two nodes in the network (i.e. how far
apart are the two most distant nodes) [Brandes, 2001].
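Both quantities can be obtained from the same set of breadth-first searches, as in the Python sketch below. It assumes an unweighted graph stored as a dictionary of neighbor sets and ignores unreachable pairs. The function name is hypothetical.

```python
from collections import deque


def path_statistics(adj):
    """Return (average path length, diameter) over all reachable pairs."""
    total = 0
    pairs = 0
    diameter = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target != source:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    average = total / pairs if pairs else 0.0
    return average, diameter


line = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(path_statistics(line))
```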
Dynamic metrics
Some metrics can be computed over time: the number of nodes, the number of edges,
the average degree, and the clustering coefficient.
Methods
Gephi takes its roots in the Exploratory Data Analysis field of research. Promoted by
John Tukey in the book Exploratory Data Analysis (1977) to visualize data sets and
statistical results, this approach emphasizes the importance of curiosity and serendipity
(i.e. discoveries made while searching for something else) in data analysis. As John
Tukey says, “the greatest value of a picture is when it forces us to notice what we
never expected to see”. The main benefit is the generation of novel questions and research
hypotheses.
As depicted by Ben Fry in Computational Information Design (2004), one has
to acquire and clean data, filter and compute statistics on it, represent and interact
with it. But this process involves much back-and-forth between the different steps of
data analysis. Visualizing the data may indeed reveal the need to acquire more data,
or filter it in another way; interacting with it may require changing visual variables
and aesthetics.
Gephi is designed to facilitate this non-linear process. In particular, Gephi is
focused on the visualization of the network, the real-time interaction with the data
(e.g. node grouping, filtering, use of statistical results in the visualization), and the
building of a visual language [Bertin, 1999]. This language makes use of circles and
lines, colors and sizes to create informative visuals, which aim at being the network
equivalent of geographical maps [Boyack, 2005].
Visualization
Gephi is focused on the creation of node-link diagrams, which are graphics of dots
joined by lines as a representation of nodes (the dots) and edges (the lines). Users
interact with the visualization to explore the network structure and raise hypotheses
based on the visual patterns. Beyond layouts, the mapping of data attributes to
visual attributes allows users to set node color and size, label color and size, and edge color and
thickness. Interaction techniques available in Gephi include zooming and panning,
node selection, node dragging, and tools like node painter and shortest path discovery
(see Fig.11).
Fig. 11. Visualization window.
Interoperability
With the support of various file formats, Gephi can exchange data with at least these
network analysis tools: Cytoscape, CuttleFish, GraphStream, Graphviz, GUESS, IGraph, JUNG, Network Workbench, NodeXL, Pajek, Sonivis, Tulip, UCINET, and
Visone.
Special options
Filters
A key aspect of social network analysis is the identification of groups and the study of
the connections between them. For instance, the study of homophily in networks relies
on the correlation between the linkage structure and node attributes. One can ask “do
people who like the same film tend to connect more with each other, and less with
the rest of the network?” The discovery of relevant groups is made easier with filters,
which are conditions on nodes or edges applied to view a subgraph. Gephi provides a
user interface to create filter queries based on metrics and attributes. For example:
• Show only the nodes with degree between 38 and 125.
• Show only the nodes with gender attribute equal to “female”.
• Show only the nodes connected to a given node, and the relations between them.
Filters can be combined using boolean operators to create complex queries. A
scripting plug-in in Python enables the creation of scripts in a fashion similar to the
GUESS software.
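The filter queries listed above can be sketched as composable predicates in Python. This is only an illustration of the idea and not the syntax of Gephi's filter interface or of its scripting plug-in. The graph representation and all function names are hypothetical, and the degree condition is evaluated on the unfiltered graph.

```python
def degree_between(low, high):
    """Keep nodes whose degree in the full graph lies in [low, high]."""
    return lambda graph, node: low <= len(graph["adj"][node]) <= high


def attribute_equals(key, value):
    """Keep nodes whose attribute `key` equals `value`."""
    return lambda graph, node: graph["attrs"][node].get(key) == value


def all_of(*predicates):
    """Boolean AND of several filters."""
    return lambda graph, node: all(p(graph, node) for p in predicates)


def any_of(*predicates):
    """Boolean OR of several filters."""
    return lambda graph, node: any(p(graph, node) for p in predicates)


def apply_filter(graph, predicate):
    """Return the subgraph induced by the nodes that satisfy the predicate."""
    kept = {n for n in graph["adj"] if predicate(graph, n)}
    return {
        "attrs": {n: graph["attrs"][n] for n in kept},
        "adj": {n: graph["adj"][n] & kept for n in kept},
    }


social = {
    "attrs": {
        "ann": {"gender": "female"},
        "bob": {"gender": "male"},
        "eve": {"gender": "female"},
        "dan": {"gender": "male"},
    },
    "adj": {
        "ann": {"bob", "eve", "dan"},
        "bob": {"ann", "eve"},
        "eve": {"ann", "bob"},
        "dan": {"ann"},
    },
}
women = apply_filter(social, attribute_equals("gender", "female"))
query = all_of(attribute_equals("gender", "female"), degree_between(2, 2))
print(sorted(women["adj"]), sorted(apply_filter(social, query)["adj"]))
```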
Network spreadsheet
The network can be seen as a list of nodes and edges. A node table and an edge table
are available in the Data Laboratory. Users can add nodes and edges, create or delete
attributes. Each table has different features for searching, sorting, and editing data,
like the merging of nodes, and the removal of duplicates.
Vector graphics maker
The publication of visual results requires control over rendering details and aesthetics,
especially for printing. The user can tune the rendering of nodes and edges, and see
the result before exporting it in a vector file format, either SVG or PDF (see Fig.12
and 13).
Fig. 12. Interface of the vector graphics maker.
Fig. 13. Examples of aesthetics improvements: fonts (left), colors and sizes (right).
Timeline
Dynamic networks are networks which evolve over time with the addition and removal
of nodes and edges. They have been the subject of increasing interest, given their potential as a theoretical model and their promising applications. Following this trend,
Gephi has incorporated tools to study dynamic networks. From a visualization perspective, a critical tool is the Timeline component, which allows users to select pertinent
time intervals to display and explore the corresponding network. The timeline component features a sparkline chart in the background of the interval selection drawer. This
feature helps users to focus on particular periods of the evolution of the dynamic network, like bursts of connections or changes in network density or other simple metrics.
The timeline animation enables the selected time frame to slide while the corresponding
network is displayed on the screen, like a movie player (see Fig.14).
Fig. 14. Timeline component, where the sparkline shows the number of edges over time. The selected
period is from December to April.
Software updates
When fixes are deployed, the users are notified in the Gephi interface, where they can
apply them in a few clicks. Gephi automatically gets the list of plug-ins available from
the Gephi Plugin portal.
Documentation
Information
• Website: https://gephi.org/
• Wiki: http://wiki.gephi.org/
• Video introduction: http://vimeo.com/9726202
Help
• Forum: https://forum.gephi.org/ (5000+ posts, 1300+ topics, 800+ active members)
• Mailing-lists: [email protected], [email protected]
• Individual contact: [email protected]
News
• Blog: https://gephi.org/blog/
• Twitter: https://twitter.com/gephi
• Facebook: https://www.facebook.com/groups/gephi/
• Video channel: https://vimeo.com/channels/gephi/
Key Applications
Web analysis: the e-Diasporas Atlas
The Digital Diasporas Atlas project aims at mapping and analyzing the occupation
(in a quasi-geopolitical sense) of digital territories by the “connected migrants” [Diminescu, 2008]. In the context of the eDiasporas Atlas, the network serves primarily to
allow formulation of research hypotheses. “Networks serve as an embodiment of the
construction of an interpretation of data. They thus all have a heuristic function, their
interpretation being an aspect of visual analytics.” Gephi has been used to visualise
and interpret the structuring and distribution of actors in migrant-community networks
on the Web.
“By handling the network, by observing its evolution (timeline), by visualizing
the place and the connections of a given website, by identifying clusters, by filtering the
data by categories, in brief by interpreting the graph, the researcher produces various
representations (or views) of the corpus that allow him to formulate hypothesis of research that will be supported (or not) by other online/offline fieldwork investigations.”
[Diminescu, 2011]
Social media analysis: Truthy
“Truthy is a system to analyze and visualize the diffusion of information on Twitter.
It evaluates thousands of tweets an hour to identify new and emerging bursts of activity around memes of various flavors. The data and statistics provided by Truthy are
designed to aid in the study of social epidemics: How do memes propagate through the
Twittersphere? What causes a burst of popularity?” [Ratkiewicz, 2011]
This system helps users to identify suspicious memes which might deliver disinformation or propaganda. These memes are first selected by the system by analysing
the diffusion network of the information (i.e. retweets and mentions). Then users classify the meme by exploring the related statistics, timeline and visualizations of the
diffusion network. The Gephi Toolkit is used for rendering the visualizations.
Dynamic network analysis: Face-to-Face Contact Patterns
“Describing and understanding contacts between children at school would help quantify
the transmission opportunities of respiratory infections and identify situations within
schools where the risk of transmission is higher. The measurements were carried out in
a French school (6–12 years children). Data were collected on the time-resolved face-to-face proximity of children and teachers.” [Stehlé, 2011]
The dynamical evolution of the contacts was visualized using Gephi. The video
is available at https://vimeo.com/31490438.
Future Directions
The Gephi Consortium identifies strategic needs from industry and research, creates
standards to ensure interoperability, and organizes the contributors to produce generic
and reusable parts of Gephi.
A stable Gephi 1.0 is under study, in parallel with developments to include Dynamic Network Analysis and improvements to visualization capabilities using shader
techniques on the GPU, and customizable renderers for information visualization research.
A web marketplace is currently being developed to facilitate the exchange of services between members of the community, like professional training, consulting and private
development.
Cross-References
Visualization of Networks, Visualization of Large Networks, Data Mining, Large Networks Analysis of, Mapping Online Networks, Network Representations of Complex
Data, Temporal Networks, Formats, Linked Open Data
References
[Barnes and Hut (1986)] Barnes J, Hut P (1986) A hierarchical O(N log N) force-calculation algorithm. In: Nature (ISSN 0028-0836), vol. 324, Dec. 4, 1986, pp. 446–449
[Bastian et al. (2009)] Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for
exploring and manipulating networks. In: Proceedings of the Third International AAAI Conference
on Weblogs and Social Media (ICWSM’09), in American Journal of Sociology, pp.361-362
[Bertin J (1999)] Bertin J (1999) Sémiologie graphique: les diagrammes, les réseaux, les cartes. Editions de l’Ecole des Hautes Etudes en Sciences Sociales
[Blondel et al. (2008)] Blondel V, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of
communities in large networks, In: Journal of Statistical Mechanics: Theory and Experiment 2008
(10), P10008
[Boyack et al. (2005)] Boyack K W, Klavans R, Börner K (2005) Mapping the backbone of science. In:
Scientometrics 64(3), pp. 351–374
[Brandes U (2001)] Brandes U (2001) A faster algorithm for betweenness centrality, In: Journal of
Mathematical Sociology, vol. 25, pp. 163–177
[Diminescu D (2008)] Diminescu D (2008) The Connected Migrant: an Epistemological Manifest. In:
Social Sciences Information, vol 47
[Diminescu et al. (2011)] Diminescu D, Bourgeois M, Renault M, Jacomy M (2011) Digital Diasporas
Atlas Exploration and Cartography of Diasporas in Digital Networks. In: Proceedings of the Fifth
International AAAI Conference on Weblogs and Social Media (ICWSM’11)
[Fruchterman and Reingold (1991)] Fruchterman T M J, Reingold E M (1991) Graph Drawing by
Force-Directed Placement. In: Software: Practice and Experience, 21(11)
[Fry B (2004)] Fry B (2004) Computational Information Design. Ph.D. Thesis
[Hu Y F (2005)] Hu Y F (2005) Efficient and high quality force-directed graph drawing. In: The
Mathematica Journal, 10 (37-71)
[Kleinberg J (1999)] Kleinberg J (1999) Authoritative sources in a hyperlinked environment. In: Journal of the ACM 46 (5): 604–632
[Kobourov S G (to appear in 2012)] Kobourov S G (to appear in 2012) Force-Directed Drawing Algorithms. In: Handbook of Graph Drawing and Visualization, CRC Press
[Knuth D E (1993)] Knuth D E (1993) The Stanford GraphBase: A Platform for Combinatorial Computing. Addison-Wesley, Reading, MA
[Latapy M (2008)] Latapy M (2008) Main-memory Triangle Computations for Very Large (Sparse
(Power-Law)) Graphs. In: Theoretical Computer Science (TCS) 407 (1-3), pp.458-473
[Martin et al. (2011)] Martin S, Brown W M, Klavans R, Boyack K (2011) OpenOrd: An Open-Source
Toolbox for Large Graph Layout. In: SPIE Conference on Visualization and Data Analysis (VDA)
[Page et al. (1999)] Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking:
Bringing order to the Web. Technical Report. Stanford InfoLab.
[Ratkiewicz et al. (2011)] Ratkiewicz J, Conover M, Meiss M, Gonçalves B, Patil S, Flammini A,
Menczer F (2011) Truthy: mapping the spread of astroturf in microblog streams, In: Proceedings
of the 20th international conference companion on World wide web (WWW ’11)
[Stehlé et al. (2011)] Stehlé J, Voirin N, Barrat A, Cattuto C, Isella L, Pinton J-F, Quaggiotto M, Van
den Broeck W, Régis C, Lina B, Vanhems P (2011) High-Resolution Measurements of Face-to-Face
Contact Patterns in a Primary School. In: PLoS One, August 16, 2011.
[Tarjan R (1972)] Tarjan R (1972) Depth-First Search and Linear Graph Algorithms. In: SIAM Journal on Computing 1 (2): 146–160
[Tukey J (1977)] Tukey J (1977) Exploratory Data Analysis, 1 edn., Addison-Wesley
Recommended Reading
[Conway and White (2012)] Conway D, White J M (2012) Machine Learning for Hackers. Chapter
Analyzing Social Graphs, Visualizing the Clustered Twitter Network with Gephi
[Bulik-Sullivan and Sullivan (2012)] Bulik-Sullivan B, Sullivan P (2012) The authorship network of
genome-wide association studies. In: Nature Genetics 44, 113
[De Maeyer J (2010)] De Maeyer J (2010) Methods for mapping hyperlink networks: Examining the
environment of Belgian news websites. In: 11th International Symposium on Online Journalism
[Helmond and Weltevrede (2012)] Helmond A, Weltevrede E (2012) Where do bloggers blog? Platform transitions within the historical Dutch blogosphere. In: First Monday, vol 17, number 2
[Kelly et al. (2012)] Kelly J, Barash V, Alexanyan K, Etling B, Faris R, Gasser U, Palfrey J (2012)
Mapping Russian Twitter. In: Berkman Center Research Publication No. 2012-3
[Latour et al. (2012)] Latour B, Jensen P, Venturini T, Grauwin S, Boullier D (2012) The Whole is
Always Smaller Than Its Parts. In: British Journal of Sociology
[Oldham et al. (2012)] Oldham P, Hall S, Burton G (2012) Synthetic Biology: Mapping the Scientific
Landscape. In: PLoS ONE 7(4): e34368.
[Teng et al. (2011)] Teng C-Y, Lin Y-R, Adamic L (2011) Recipe recommendation using ingredient
networks
[Barabasi A-L (2003)] Barabasi A-L (2003) Linked: How Everything Is Connected to Everything Else
and What It Means for Business, Science, and Everyday Life. Plume
[Börner K (2010)] Börner K (2010) Atlas of Science: Visualizing What We Know. The MIT Press
[Easley and Kleinberg (2010)] Easley D, Kleinberg J (2010) Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press
[Newman et al. (2006)] Newman M, Barabasi A-L, Watts D J (2006) The Structure and Dynamics of
Networks. Princeton University Press
[Watts D J (2003)] Watts D J (2003) Six Degrees: The Science of a Connected Age. W. W. Norton
& Company