Graphs

CS 162
Introduction to Computer Science
Chapter 12
Graphs in C++
Herbert G. Mayer, PSU
Status 9/11/2014
Syllabus
• Goal
• Heap Space
• Definition of Graph
• Building a Graph
• Graph Data Structure
• Graph Traversal
• References
Goal
Improve CS 162 students’ understanding and use of dynamic memory management: using the system heap to allocate and free data space
Heap use is often explained by implementing linked lists that can grow and shrink dynamically
For a systems programmer that is too simple, so we shall exercise heap management by building directed, cyclic graphs
Our graphs may be unconnected, and each node (AKA vertex) may have any number of successor nodes
Undirected edges are interpreted as bi-directional
Heap Space
Static space is data space that exists during the whole execution of a program; it is created at program load time and returned at program exit() time. In Unix the .bss section is initialized to 0 by default; the program can overwrite it
Automatic space is data space created by the run-time system. Each time a new function scope is entered, automatic space is allocated off the stack. Part of the return operation is to free such previously granted automatic space. AKA dynamic data space
Controlled space is data space allocated by a specific request to allocate, e.g. the C function malloc() or the C++ new operator. Some languages (Ada, Java) use the built-in new operator to acquire heap space. Space can be de-allocated automatically (Java) or by an explicit command, such as free() in C or delete in C++. On a run-time system that requires the programmer to release space explicitly, it is possible to “forget” the call; such memory space is then never returned until main() exits, which creates holes (memory leaks)
We shall build graphs using heap space
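The three kinds of data space can be seen side by side in a few lines. This is only a sketch; the names counter and make_controlled are hypothetical and do not come from the course material.

```cpp
static int counter;   // static space: exists for the whole run, zero by default

// Returns a pointer to a heap object holding v + 1.
int * make_controlled( int v )
{ // make_controlled
    int local = v + 1;    // automatic space: on the stack, freed on return
    int * p = new int;    // controlled space: lives until an explicit delete
    *p = local;
    counter++;            // count allocations in static space
    return p;             // the heap object survives the return; local does not
} //end make_controlled
```

The caller receives the only pointer to the heap object and is therefore responsible for calling delete on it.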
Heap Space
The lifetime of an object in the heap (controlled space) is the time span from its explicit birth (malloc() or new) to the moment of its explicit death (free() or delete, or exit() from main()); this is program controlled
Each such heap object is identified by a pointer of suitable type
The lifetime of the pointer itself is defined by its own storage attribute: static, automatic, or controlled
Thus it can happen that an object continues to live on the heap, i.e. consumes heap space, while its identifying pointer has long ceased to exist
This creates another kind of hole in memory
If this happens in system SW that runs forever, available memory gradually vanishes and the computer eventually crashes
Hole in Heap Space
#include <iostream>
typedef int * pi_type;
void hole( void )
{ // hole
    pi_type pi = new int;   // C++; malloc( sizeof( int ) ) in C
    *pi = 109;              // just to show space exists!
    // forgot: delete pi;   // free( pi ) in C
    return;
    // hole is left
} //end hole
int main()
{ // main
    hole();
    // lots of other stuff . . .
    return 0;
} //end main
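For contrast, here is a minimal sketch of the same function without the hole, assuming C++ where new is paired with delete. The name no_hole is hypothetical, and the function returns the stored value only so that its effect can be observed.

```cpp
typedef int * pi_type;

// Like hole(), but the heap object is released before its only
// identifying pointer, pi, goes out of scope.
int no_hole( void )
{ // no_hole
    pi_type pi = new int;   // birth: heap object is created
    *pi = 109;              // use the space
    int value = *pi;        // copy the content out before release
    delete pi;              // death: space is returned to the heap
    return value;
} //end no_hole
```

After delete, pi must not be dereferenced again; here it ceases to exist at the return anyway.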
Formal Definition of Graph
Graph: A graph G is a data structure G = { V, E } consisting of a set V of vertices, AKA nodes, and a set E of edges. Any node vi ∈ V may be connected to any other node vj ∈ V. Such a connection is called an edge. Edges may be directed, even bi-directed. Different from a tree, a node in G may have any number of predecessors, i.e. incoming edges; that is THE main difference of graph vs. tree
Empty Graph: For expediency we ignore the possibility of a graph G being empty; in an empty graph the pointer that identifies the graph is simply NIL
Connected Graph: If all n > 0 nodes of G are connected somehow, the graph G is called connected, regardless of edge directions
Strongly Connected Component (SCC): A subset SG ⊆ G is strongly connected if every node vi in SG can reach every other node vj in SG along directed edges
Directed Acyclic Graph (DAG): A DAG is a graph with directed edges but no cycles. A node may still have multiple predecessors
When programming graphs, it is convenient to add fields to the node type for auxiliary functions; e.g. it is possible to process all nodes in a linear fashion by adding a link field, often called a “finger” or “link”
Possible use: traversing all nodes in G, though G may be unconnected!
Building a Graph
Graph Data Structure
A graph G( v, e ) consists of nodes v and edges e
Implemented via some node_type data structure
G is often identified, and thus accessible, via one select node, called entry node, or simply entry, AKA head
Head is of type pointer to node_type
G is not necessarily connected
If parts of G are unconnected, how can they be reached when a complete graph traversal is necessary?
Several methods of forcing complete traversal:
• Either create a super-node, not part of G proper, that points at each unconnected region, making it reachable
• Or have a linked list (LL) meandering through each node of G, without this link field being part of G proper; e.g. “finger”
Sample Graph G0
[Figure: sample graph G0, six nodes named 1 to 6 with attributes R, Y, G, B, O, P]
Graph Data Structure
Sample Graph G0 above has 6 nodes
The ID of a node, AKA its name, is shown next to each node, e.g. 1 2 3 4 5 …
The graph’s node type data structure includes such name information as part of node_type
In addition, each node in G0 has attributes, such as R, G, Y etc. in the sample above; such attributes can be arbitrarily complex, consisting of many fields
There may be more attributes belonging to each node, depending on what the graph is used for
All of these attributes must also be declared in the node_type data structure
Successors, if any, of each node must be encoded in the node; there is no upper limit on their number!
G0 has 3 SCCs; 2 of those are trivial, i.e. not interesting!
Graph Data Structure
Since there is no inherent upper bound on the number of successor nodes, a suitable way to define successors is via a linked list of tuples, AKA link nodes
Thus a possible data type for the successors of a node is a pointer to a link node
Link nodes can be allocated off the heap, as needed; they are not of type node_type, but of link_type
And each link is a tuple, i.e. consists of just 2 fields:
• One field pointing to the next link; the type is pointer to link_type, in some languages expressed as *link_type
• The other field pointing to the successor node; the type is pointer to node_type
For convenience, the last link inserted is added at the head of the list, which avoids searching for the list end
Graph Traversal
Graph G with i nodes may be unconnected, and yet each unconnected region is part of G
An algorithm may require visiting each node ni ∈ G
This requires an additional data structure through which each node ni can be reached
Moreover, any full traversal of G must start with some node of G; which one?
1. Graph Traversal: Fixed Array
Specify a fixed-size array a[K] of K >= i node pointers, all initially null and not necessarily all used, in which each used element a[j], j < i, points to one node of G
The used elements are filled in increasing order of indices, starting at index 0
As soon as a[j] yields a null pointer, no more nodes are identified in a[]
If the first null pointer is found at index i, G is known to have i nodes
Advantage:
• Simple to implement
Disadvantage:
• Wasted space almost always, since a[] needs to be large
• Too limited; the method fails if i > K
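The fixed-array method can be sketched as follows. The capacity K = 8, the reduced node type, and the helper names add_node and count_nodes are assumptions made for illustration only.

```cpp
struct node_tp { int name; };        // hypothetical, reduced node type

const int K = 8;                     // assumed fixed capacity
node_tp * a[ K ] = { nullptr };      // all elements initially null

// Enter a node into the first unused element; false if a[] is full.
bool add_node( node_tp * n )
{ // add_node
    for ( int j = 0; j < K; j++ ) {
        if ( a[ j ] == nullptr ) {
            a[ j ] = n;
            return true;
        } //end if
    } //end for
    return false;                    // the method fails: more than K nodes
} //end add_node

// Scan in increasing index order up to the first null pointer.
int count_nodes( void )
{ // count_nodes
    int j = 0;
    while ( j < K && a[ j ] != nullptr ) {
        j++;
    } //end while
    return j;                        // number of nodes identified in a[]
} //end count_nodes
```

The check j < K in count_nodes covers the case of a completely filled array, which contains no terminating null pointer.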
1. Graph Traversal: Fixed Array
[Figure: fixed array a[] of node pointers; labels shown in the figure: m, a, z, b, k]
2. Graph Traversal: Linked List
Instead of a fixed array, use a linked list of link elements
Each element has 2 fields:
• Field next points to the next link element, if any
• Field finger points to its node ni ∈ G
A link element is allocated dynamically off the heap only when another node ni is added to G
Advantage:
• Also simple to implement
• Space consumption is directly proportional to the number of nodes in G; thus no wasted space
• Runs out of space only when all memory space is exhausted
Disadvantage:
• Is a separate data structure from graph G, so you end up with 2 data structures that must be kept synchronized
• Per link element, i.e. per node, 2 scalar fields are consumed
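A minimal sketch of the separate linked list, using the field names next and finger from the description above. The type elem_tp, the head pointer all_nodes, and the three helper functions are hypothetical names.

```cpp
struct node_tp { int name; };        // hypothetical, reduced node type

struct elem_tp {                     // one link element per node of G
    elem_tp * next;                  // next link element, if any
    node_tp * finger;                // the node this element identifies
};

elem_tp * all_nodes = nullptr;       // head of the list, initially empty

// Called whenever another node is added to G.
void register_node( node_tp * n )
{ // register_node
    elem_tp * e = new elem_tp;       // allocated only when needed
    e->finger = n;
    e->next   = all_nodes;           // insert at head
    all_nodes = e;
} //end register_node

// Full traversal: reaches every registered node, connected or not.
int sum_names( void )
{ // sum_names
    int sum = 0;
    for ( elem_tp * e = all_nodes; e; e = e->next ) {
        sum += e->finger->name;
    } //end for
    return sum;
} //end sum_names

// Frees the link elements only; the nodes belong to the graph.
void release_list( void )
{ // release_list
    while ( all_nodes ) {
        elem_tp * e = all_nodes;
        all_nodes = e->next;
        delete e;
    } //end while
} //end release_list
```

The need for release_list illustrates the listed disadvantage: the list is a second structure whose lifetime must be managed alongside the graph.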
3. Graph Traversal: Added Field
Extend the data structure of the graph G
Keep all fields needed for the graph proper, but add a finger field, of type pointer to node
Like the other 2 methods, this requires an outside variable of type pointer to node, initially nil, that points to one of the nodes ni ∈ G, the head of the list
Thus all finger fields form a linked list of graph nodes
Advantage:
• Also simple to implement
• Space consumption is directly proportional to the number of nodes in G
• Runs out of space only when all memory space is exhausted
Disadvantage:
• One more field, but the cost is minimal and contained!
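A sketch of the added-field method with a reduced node type that omits the successor list. The function visit_all is a hypothetical name; make_node here uses new rather than malloc() and is only an illustration.

```cpp
struct node_tp {
    node_tp * finger;    // linear link through all nodes, not part of G proper
    int       name;      // ID of this node
    bool      visited;   // set during traversal
};

node_tp * finger = nullptr;   // outside pointer, initially nil: head of the list

// Create a node and hook it in at the head of the finger list.
node_tp * make_node( int name )
{ // make_node
    node_tp * node = new node_tp;
    node->finger  = finger;   // old head becomes the second element
    node->name    = name;
    node->visited = false;
    finger        = node;     // new node is now the head
    return node;
} //end make_node

// Visit every node of G, connected or not; returns the number visited.
int visit_all( void )
{ // visit_all
    int count = 0;
    for ( node_tp * n = finger; n; n = n->finger ) {
        n->visited = true;
        count++;
    } //end for
    return count;
} //end visit_all
```

No second data structure has to be kept in step with the graph; creating a node is the same act as entering it into the list.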
Graph Data Structure, Link
// node may have any number of successors
// all need to be retrievable.
// so each node in G has a link pointer,
// pointing to LL of all successor nodes.
// Last node connected will be inserted at head
typedef struct link_tp * link_ptr_tp;   // forward ref
typedef struct node_tp * node_ptr_tp;   // forward ref
typedef struct link_tp
{
    link_ptr_tp next_link;   // point to next link
    node_ptr_tp next_node;   // point to successor node
} str_link_tp;
#define LINK_SIZE sizeof( str_link_tp )
Graph Data Structure, Node
// "name" is arbitrary number given during creation
// "link" is head of LList of successor nodes, while
// "finger" is linear link through all nodes
// "visited" only true after visit; initially FALSE
typedef struct node_tp
{
    link_ptr_tp link;      // points to LL of successors
    node_ptr_tp finger;    // finger through all nodes
    int         name;      // name=ID given at creation
    bool        visited;   // to check connectivity
    // others ...          // other fields: attributes
} str_node_tp;
#define NODE_SIZE sizeof( str_node_tp )
Building a Graph, one Node
// create a node in graph G, identified by "name"
// connect to the global "finger" at head of LList
// assumption: no such node "name" exists in graph
// assume: global "finger", initially NULL
node_ptr_tp make_node( int name )
{ // make_node
    node_ptr_tp node = (node_ptr_tp) malloc( NODE_SIZE );
    // check once non-Null, not repeatedly on user side!
    ASSERT( node, "space for node missing" );
    node->finger  = finger;   // re-assign glob finger!!
    node->link    = NULL;     // pointer type
    node->name    = name;     // IDs this node
    node->visited = FALSE;    // initially
    finger        = node;     // now link to "this"
    return node;
} //end make_node
Building a Graph from Pairs
// input is list of pairs, each element being a node name
// craft edge from first node name=a to second node name=b;
// if a node is new: create it; else use ptr = exists()
// scanf() returns EOF, which is non-zero, at end of input,
// so compare with the number of items expected
while ( scanf( "%d%d", &a, &b ) == 2 ) {    // a, b are ints
    if ( ! ( first = exists( a ) ) ) {      // 'a' new node?
        first = make_node( a );             // allocate 'a'
    } //end if
    if ( ! ( second = exists( b ) ) ) {     // 'b' new node?
        second = make_node( b );            // allocate 'b'
    } //end if
    // both exist. Either created, or pre-existed: Connect!
    if ( new_link( first, second ) ) {
        link = make_link( first->link, second );
        ASSERT( link, "no space for link node" );
        first->link = link;
    }else{
        // link was there already, no need to add again!
        printf( "<><> skip duplicate link %d->%d\n", a, b );
    } //end if
} //end while
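The loop above calls exists() and make_link(), which these slides do not show. The following is one possible sketch of both, an assumption rather than the course's own implementation. The types mirror the ones declared earlier, make_node is repeated in compact form using new, and make_link uses new (std::nothrow) so that a failed allocation yields a null pointer the caller can check.

```cpp
#include <new>       // std::nothrow

typedef struct link_tp * link_ptr_tp;
typedef struct node_tp * node_ptr_tp;
struct link_tp { link_ptr_tp next_link; node_ptr_tp next_node; };
struct node_tp { link_ptr_tp link; node_ptr_tp finger; int name; bool visited; };

node_ptr_tp finger = nullptr;    // global head of the list of all nodes

// Compact make_node: new node goes to the head of the finger list.
node_ptr_tp make_node( int name )
{ // make_node
    node_ptr_tp node = new node_tp;
    node->link    = nullptr;
    node->finger  = finger;
    node->name    = name;
    node->visited = false;
    finger        = node;
    return node;
} //end make_node

// Search all nodes via finger; return the node named "name", else null.
node_ptr_tp exists( int name )
{ // exists
    for ( node_ptr_tp n = finger; n; n = n->finger ) {
        if ( n->name == name ) {
            return n;
        } //end if
    } //end for
    return nullptr;              // no such node yet
} //end exists

// Allocate a link to "successor", placed in front of list "next".
link_ptr_tp make_link( link_ptr_tp next, node_ptr_tp successor )
{ // make_link
    link_ptr_tp l = new (std::nothrow) link_tp;
    if ( l ) {
        l->next_link = next;
        l->next_node = successor;
    } //end if
    return l;                    // null if no space was available
} //end make_link
```

Since exists() walks the finger list, it finds a node no matter which unconnected region of the graph the node belongs to.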
Building a Graph from Pairs
// check whether link between 2 nodes already exists
// if not, return true: New! Else return false, NOT new!
bool new_link( node_ptr_tp first, node_ptr_tp second )
{ // new_link
    int target = second->name;
    link_ptr_tp link = first->link;
    while ( link ) {
        if ( target == link->next_node->name ) {
            return FALSE;   // it is an existing link, NOT new
        } //end if
        // check next link, if any
        link = link->next_link;
    } //end while
    // none of the successors equals the second node's name
    return TRUE;            // is a new link
} //end new_link
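Once links are in place, the visited field can guard a traversal that follows successor links in a cyclic graph. The sketch below is a plain recursive depth-first marking, not taken from the slides; the name mark_reachable and the reduced node type without a finger field are assumptions.

```cpp
struct node_tp;                                   // forward ref
struct link_tp { link_tp * next_link; node_tp * next_node; };
struct node_tp { link_tp * link; int name; bool visited; };

// Mark every node reachable from "node" along directed edges.
// Returns the number of nodes newly marked; "visited" stops the
// recursion at nodes seen before, so cycles terminate.
int mark_reachable( node_tp * node )
{ // mark_reachable
    if ( node == nullptr || node->visited ) {
        return 0;                                 // nothing new here
    } //end if
    node->visited = true;
    int count = 1;
    for ( link_tp * l = node->link; l; l = l->next_link ) {
        count += mark_reachable( l->next_node );  // visit each successor
    } //end for
    return count;
} //end mark_reachable
```

Nodes still unvisited after a call from the entry node are not reachable from it; a walk over the finger list can then find them.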
References
1. Control Flow Graph, in: Mayer, H. “Parallel
Execution Enabled by Refined Source Analysis:
Cost and Benefits in a Supercompiler”, R.
Oldenbourg Verlag München/Wien, March 1997
2. Graphs in: C. Berge, “Graphs and Hypergraphs”,
North-Holland, Amsterdam 1973
3. SCCs: Robert Tarjan, "Depth-First Search and Linear
Graph Algorithms". SIAM J. Computing, Vol. 1, No. 2,
June 1972