Chapter 2
The Big Picture
Overview
● The big picture answers several questions:
  ● What are data structures?
  ● What data structures do we study?
  ● What are Abstract Data Types?
  ● Why Object-Oriented Programming (OOP) and Java for data structures?
  ● How do I choose the right data structures?
2.1 What Are Data Structures
How many cities with more than 250,000 people lie within 500 miles of
Dallas, Texas? How many people in my company make over
$100,000 per year? Can we connect all of our telephone customers
with less than 1,000 miles of cable?
To answer questions like these, it is not enough to have the necessary
information.
We must organize that information in a way that allows us to find the
answers in time to satisfy our needs.
Representing information is fundamental to computer science. The
primary purpose of most computer programs is not to perform
calculations, but to store and retrieve information — usually as fast as
possible.
For this reason, the study of data structures and the algorithms that
manipulate them is at the heart of computer science. It helps you
understand how to structure information to support efficient processing.
● A data structure is an aggregation of data components that, together, constitute a meaningful whole.
  ● The components themselves may be data structures.
  ● The decomposition stops at some “atomic” unit.
● Atomic or primitive type: A data type whose elements are single, non-decomposable data items (they cannot be broken into parts)
● Composite type: A data type whose elements are composed of multiple data items (ex: take two integers (simple elements) x, y to form a point (x, y))
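The distinction between atomic and composite types can be sketched in Java. The class name `Point` and its accessors are assumptions made for this illustration; the slides only describe forming a point (x, y) from two integers.

```java
// A composite type built from two atomic (primitive) ints.
// Point, getX and getY are illustrative names, not from the slides.
final class Point {
    private final int x; // atomic component
    private final int y; // atomic component

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }

    int getY() { return y; }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}

class PointDemo {
    public static void main(String[] args) {
        int x = 3;                 // atomic: a single, non-decomposable value
        int y = 4;                 // atomic
        Point p = new Point(x, y); // composite: aggregates two ints
        System.out.println(p);     // prints (3, 4)
    }
}
```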
2.2 What Data Structures Do We
Study?

● Data structure categories:
  ● Linear
  ● Non-linear
● The category is based on how the data is conceptually organized or aggregated.
Linear structures
● List, Queue and Stack are linear collections; each of them serves as a repository in which entries may be added or removed at will.
  ● They differ in how these entries may be accessed once they are added.
● List
  ● The List is a linear collection of entries in which entries may be added, removed, and searched for without restrictions.
  ● Two kinds of list:
    ● Ordered List
    ● Unordered List
● Queue
  ● Entries may only be removed in the order in which they are added.
    ● First In, First Out (FIFO) data structure
    ● No search for an entry in the Queue
● Stack
  ● Entries may only be removed in the reverse of the order in which they were added.
    ● Last In, First Out (LIFO)
    ● No search for an entry in the Stack.
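The difference between FIFO and LIFO removal can be seen with a short sketch. It assumes the standard `java.util.ArrayDeque` as the underlying collection; the class and method names (`LinearOrderDemo`, `drainQueue`, `drainStack`) are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

class LinearOrderDemo {
    // Adds the entries to a queue, then removes them all: FIFO order.
    static String drainQueue(String... entries) {
        Queue<String> queue = new ArrayDeque<>();
        for (String entry : entries) {
            queue.add(entry);           // enqueue at the back
        }
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) {
            out.append(queue.remove()); // dequeue from the front
        }
        return out.toString();
    }

    // Pushes the entries onto a stack, then pops them all: LIFO order.
    static String drainStack(String... entries) {
        Deque<String> stack = new ArrayDeque<>();
        for (String entry : entries) {
            stack.push(entry);          // push on top
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop());    // pop from the top
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(drainQueue("A", "B", "C")); // prints ABC
        System.out.println(drainStack("A", "B", "C")); // prints CBA
    }
}
```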
● Trees:
  ● A tree is a nonlinear structure with a unique starting node (the root), in which each node is capable of having many child nodes, and in which a unique path exists from the root to every other node. Trees are useful for representing hierarchical relationships among data items.
  ● Root: The top node of a tree structure; a node with no parent
[Figure: Tree]
[Figure: Not Tree]
● Binary Tree
  ● Binary tree: A tree in which each node can have at most two child nodes, a left child node and a right child node
  ● Leaf: A tree node that has no children
[Figure: A Binary Tree]
Binary Search Tree
● Binary Search Tree: A binary tree in which the key value in any node is greater than the key value in its left child and any of its descendants (the nodes in the left subtree) and less than the key value in its right child and any of its descendants (the nodes in the right subtree)
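The ordering property above is what makes searching efficient: at each node, the comparison tells us which subtree to continue in. The following is a minimal sketch, assuming integer keys with no duplicates; `BstSketch`, `insert` and `contains` are names chosen for this illustration, and no balancing is performed.

```java
class BstSketch {
    private static final class Node {
        final int key;
        Node left;   // subtree with smaller keys
        Node right;  // subtree with larger keys

        Node(int key) {
            this.key = key;
        }
    }

    private Node root;

    void insert(int key) {
        root = insert(root, key);
    }

    // Walks down the tree and attaches the key where the ordering property holds.
    private Node insert(Node node, int key) {
        if (node == null) {
            return new Node(key);
        }
        if (key < node.key) {
            node.left = insert(node.left, key);
        } else if (key > node.key) {
            node.right = insert(node.right, key);
        }
        // Equal keys are ignored (assumption: keys are unique).
        return node;
    }

    // Each comparison discards one whole subtree.
    boolean contains(int key) {
        Node node = root;
        while (node != null) {
            if (key == node.key) {
                return true;
            }
            node = (key < node.key) ? node.left : node.right;
        }
        return false;
    }

    public static void main(String[] args) {
        BstSketch tree = new BstSketch();
        for (int key : new int[] {50, 30, 70, 20, 40}) {
            tree.insert(key);
        }
        System.out.println(tree.contains(40)); // prints true
        System.out.println(tree.contains(60)); // prints false
    }
}
```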
[Figure: Binary Search Tree]
AVL Tree
● AVL Tree
  ● A height-balanced binary search tree.
    ● The AVL Tree derives its importance from the fact that keeping the tree height-balanced speeds up the search process to a remarkable degree.
Heap
● Heap as a Priority Queue
  ● A priority queue is a specialization of the FIFO Queue.
    ● Entries are assigned priorities.
    ● The entry with the highest priority is the one to leave first.
    ● The heap is a special type of binary tree.
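A short sketch of priority-based removal, using the standard `java.util.PriorityQueue` (which is backed by a binary heap). The `Job` class, its fields and the sample priorities are assumptions made for this example; a larger number is taken to mean a higher priority.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityQueueDemo {
    // Hypothetical entry type: a name plus an integer priority.
    static final class Job {
        final String name;
        final int priority;

        Job(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Returns the job names in the order in which they leave the queue.
    static List<String> serveAll(List<Job> jobs) {
        // PriorityQueue serves the smallest element first, so the comparator
        // is reversed to let the highest priority leave first.
        PriorityQueue<Job> queue = new PriorityQueue<>(
                Comparator.comparingInt((Job j) -> j.priority).reversed());
        queue.addAll(jobs);

        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
                new Job("backup", 1), new Job("urgent", 9), new Job("email", 5));
        System.out.println(serveAll(jobs)); // prints [urgent, email, backup]
    }
}
```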
Complete binary tree
● A complete binary tree is one in which every level but the last must have the maximum number of nodes possible at that level.
  ● The last level may have fewer than the maximum possible nodes, but they should be arranged from left to right without any empty spots.
[Figure: Heap]
Hash Table
● Hash Table:
  ● Hash function: A function used to manipulate the key of an element in a list to identify its location in the list
  ● Hashing: The technique for ordering and accessing elements in a list in a relatively constant amount of time by manipulating the key to identify its location in the list
  ● Hash table: Term used to describe the data structure used to store and retrieve elements using hashing
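The idea of computing a location from a key can be sketched as follows. This is a deliberately simplified illustration: the hash function (key modulo table size) is an assumption, not necessarily the one used in the course, and collisions are not handled, so a key that hashes to an occupied slot overwrites the entry stored there.

```java
class SimpleHashTable {
    private final Integer[] keys;  // null marks an empty slot
    private final String[] values;

    SimpleHashTable(int capacity) {
        keys = new Integer[capacity];
        values = new String[capacity];
    }

    // Hash function: maps a key to an array index in 0..capacity-1.
    int hash(int key) {
        return Math.floorMod(key, keys.length);
    }

    // Stores the value at the slot computed from the key.
    // Assumption: no collision handling; an existing entry is overwritten.
    void put(int key, String value) {
        int slot = hash(key);
        keys[slot] = key;
        values[slot] = value;
    }

    // Looks only at the one slot the key hashes to.
    String get(int key) {
        int slot = hash(key);
        return (keys[slot] != null && keys[slot] == key) ? values[slot] : null;
    }

    public static void main(String[] args) {
        SimpleHashTable table = new SimpleHashTable(7);
        table.put(1234, "part A");
        System.out.println(table.hash(1234)); // prints 2
        System.out.println(table.get(1234));  // prints part A
        System.out.println(table.get(5));     // prints null
    }
}
```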
[Figure: Using a hash function to determine the location of the element in an array]
Graphs

● Graph: A data structure that consists of a set of nodes and a set of edges that relate the nodes to each other
● Vertex: A node in a graph
● Edge (arc): A pair of vertices representing a connection between two nodes in a graph
● Two kinds of graphs:
  ● Undirected graph: A graph in which the edges have no direction
  ● Directed graph (digraph): A graph in which each edge is directed from one vertex to another (or the same) vertex
● A general tree is a special kind of graph, since a hierarchy is a special system of relationships among entities.
  ● Graphs may be used to model systems of physical connections such as computer networks, airline routes, etc., as well as abstract relationships such as course prerequisite structures.
  ● Standard graph algorithms answer certain questions we may ask of the system.
[Figure: Graphs]
● Adjacent vertices: Two vertices in a graph that are connected by an edge
● Path: A sequence of vertices that connects two nodes in a graph
● Complete graph: A graph in which every vertex is directly connected to every other vertex
● Weighted graph: A graph in which each edge carries a value
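These terms can be made concrete with a small sketch of an undirected, weighted graph stored as adjacency maps. The representation, the class name `GraphSketch`, and the vertex labels and weights are all assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class GraphSketch {
    // vertex -> (adjacent vertex -> weight of the connecting edge)
    private final Map<String, Map<String, Integer>> adjacency = new HashMap<>();

    // Adds an undirected edge, so it is recorded in both directions.
    void addEdge(String u, String v, int weight) {
        adjacency.computeIfAbsent(u, k -> new HashMap<>()).put(v, weight);
        adjacency.computeIfAbsent(v, k -> new HashMap<>()).put(u, weight);
    }

    // Two vertices are adjacent when an edge connects them.
    boolean areAdjacent(String u, String v) {
        return adjacency.containsKey(u) && adjacency.get(u).containsKey(v);
    }

    // A sequence of vertices is a path when every consecutive pair is adjacent.
    boolean isPath(String... vertices) {
        for (int i = 0; i + 1 < vertices.length; i++) {
            if (!areAdjacent(vertices[i], vertices[i + 1])) {
                return false;
            }
        }
        return vertices.length > 0;
    }

    public static void main(String[] args) {
        GraphSketch graph = new GraphSketch();
        graph.addEdge("A", "B", 5); // made-up weights
        graph.addEdge("B", "C", 7);
        System.out.println(graph.areAdjacent("A", "B")); // prints true
        System.out.println(graph.areAdjacent("A", "C")); // prints false
        System.out.println(graph.isPath("A", "B", "C")); // prints true
    }
}
```

This graph is not complete, because A and C are not directly connected.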
Data Type
● Meaningful data is organized into primitive data types such as integer, real, and boolean and into more complex data structures such as arrays and binary trees.
● So the idea of a data type includes a specification of the possible values of that type together with the operations that can be performed on those values.
● Abstract data type (ADT): A data type whose properties (domain and operations) are specified independently of any particular implementation
  ● An abstract data type specifies the values of the type, but not how those values are represented as collections of bits, and it specifies operations on those values in terms of their inputs, outputs, and effects rather than as particular algorithms or program code.
  ● The primitive data types are also abstract data types.
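In Java, an interface is a natural way to write down an ADT: it names the operations without fixing a representation. The sketch below is an illustration only; the ADT (`SmallIntSet`, a set of integers in the range 0 to 99) and both implementing classes are hypothetical and do not come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// The ADT: what the operations are, not how the values are stored.
interface SmallIntSet {
    void add(int value);          // assumption: 0 <= value <= 99
    boolean contains(int value);
    int size();
}

// Implementation 1: one boolean flag per possible value.
class FlagArraySet implements SmallIntSet {
    private final boolean[] present = new boolean[100];
    private int count;

    public void add(int value) {
        if (!present[value]) {
            present[value] = true;
            count++;
        }
    }

    public boolean contains(int value) {
        return present[value];
    }

    public int size() {
        return count;
    }
}

// Implementation 2: a list holding only the values that were added.
class ListSet implements SmallIntSet {
    private final List<Integer> items = new ArrayList<>();

    public void add(int value) {
        if (!items.contains(value)) {
            items.add(value);
        }
    }

    public boolean contains(int value) {
        return items.contains(value);
    }

    public int size() {
        return items.size();
    }
}

class AdtDemo {
    public static void main(String[] args) {
        // Client code depends only on the interface.
        for (SmallIntSet set : new SmallIntSet[] {new FlagArraySet(), new ListSet()}) {
            set.add(7);
            set.add(7);
            set.add(42);
            System.out.println(set.size() + " " + set.contains(42)); // prints 2 true
        }
    }
}
```

Both classes behave identically from the client's point of view, even though one stores 100 flags and the other stores only the added values.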
2.4 Why OOP and Java for Data
Structures?
● A stack may be built using a List ADT. The stack object contains a List object which implements its state, and the behavior of the Stack object is implemented in terms of the List object's behavior.
● A stack interface defines four operations:
  ● push
  ● pop
  ● getSize
  ● isEmpty
● OR reuse the List ADT by inheritance.
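Building a stack on top of a list by composition might look like the following sketch. The interface and class names (`SimpleStack`, `ListBackedStack`) are assumptions, and `java.util.ArrayList` stands in for the List ADT mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// The four stack operations named on the slide.
interface SimpleStack<T> {
    void push(T entry);
    T pop();
    int getSize();
    boolean isEmpty();
}

// Composition: the stack contains a List that holds its state.
class ListBackedStack<T> implements SimpleStack<T> {
    private final List<T> entries = new ArrayList<>();

    public void push(T entry) {
        entries.add(entry); // the end of the list is the top of the stack
    }

    public T pop() {
        if (entries.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return entries.remove(entries.size() - 1);
    }

    public int getSize() {
        return entries.size();
    }

    public boolean isEmpty() {
        return entries.isEmpty();
    }

    public static void main(String[] args) {
        SimpleStack<String> stack = new ListBackedStack<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop());     // prints b
        System.out.println(stack.getSize()); // prints 1
    }
}
```

With composition, clients see only the four stack operations. Reusing the list by inheritance would also expose every list operation, including ones that break the LIFO discipline.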
2.5 How Do I choose the Right Data
Structures?
● The interface of operations that is supported by a data structure is one factor to consider when choosing between several available data structures.
● The efficiency of the data structures:
  ● How much space does the data structure occupy?
  ● What are the running times of the operations in its interface?
● Example
  ● Implementing print job storage for a printer:
    ● requires a queue data structure.
  ● Maintaining a collection of entries in no particular order:
    ● requires an unordered list data structure.
● The running time of each operation in the interface.
  ● A data structure whose interface is the best fit may not necessarily be the best overall fit if the running times of its operations are not up to the mark.
  ● When we have more than one data structure implementation whose interfaces satisfy our requirements, we may have to select one based on comparing the running times of the interface operations.
● Time is traded off for space,
  ● i.e. more space is consumed to increase speed, or a reduction in speed is traded for a reduction in space consumption.
● Time-space tradeoff
  ● We are looking to “buy” the best implementation of a stack.
    ● StackA. Does not provide a getSize operation.
      ● i.e. there is no single operation that a client can use to get the number of entries in StackA.
    ● StackB. Provides a getSize operation, implemented in the manner we discussed earlier, transferring entries back and forth between two stacks.
    ● StackC. Provides a getSize operation, implemented as follows: a variable called size is maintained that is incremented every time an entry is pushed, and decremented every time an entry is popped.
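The three candidates can be sketched as below. This is one plausible reading of the descriptions above, not the course's actual code: integer entries and `java.util.ArrayDeque` storage are assumptions, and `StackB.getSize` is written as "move everything to a temporary stack while counting, then move it back".

```java
import java.util.ArrayDeque;
import java.util.Deque;

// StackA: push, pop and isEmpty only; no getSize.
class StackA {
    private final Deque<Integer> entries = new ArrayDeque<>();

    void push(int entry) {
        entries.push(entry);
    }

    int pop() {
        return entries.pop();
    }

    boolean isEmpty() {
        return entries.isEmpty();
    }
}

// StackB: getSize costs time proportional to the number of entries,
// but needs no extra field per stack.
class StackB extends StackA {
    int getSize() {
        StackA temp = new StackA();
        int count = 0;
        while (!isEmpty()) {        // transfer to the temporary stack, counting
            temp.push(pop());
            count++;
        }
        while (!temp.isEmpty()) {   // transfer back, restoring the original order
            push(temp.pop());
        }
        return count;
    }
}

// StackC: getSize is a single field read, at the cost of one extra
// variable per stack that every push and pop must update.
class StackC extends StackA {
    private int size;

    @Override
    void push(int entry) {
        super.push(entry);
        size++;
    }

    @Override
    int pop() {
        int entry = super.pop();
        size--;
        return entry;
    }

    int getSize() {
        return size;
    }
}

class StackChoiceDemo {
    public static void main(String[] args) {
        StackB b = new StackB();
        StackC c = new StackC();
        for (int i = 1; i <= 3; i++) {
            b.push(i);
            c.push(i);
        }
        System.out.println(b.getSize()); // prints 3
        System.out.println(c.getSize()); // prints 3
        System.out.println(b.pop());     // prints 3
    }
}
```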
● Three situations:
  ● Need to maintain a large number of stacks, with no need to find the number of entries.
  ● Need to maintain only one stack, with frequent need to find the number of entries.
    ● Ex: Stack array of length 10, 1000 calls for getSize
  ● Need to maintain a large number of stacks, with infrequent need to find the number of entries.
    ● Ex: Stack array of length 10, 3 calls for getSize, 1000 stacks
● Situation 1, StackA fits the bill.
  ● It is tempting to pick StackC, simply because we may want to play it safe: what if we need getSize in the future?
● Situation 2, StackB or StackC.
  ● Need to use getSize.
  ● getSize in StackB is more time-consuming than that in StackC.
  ● We need only one stack, so the additional size variable used by StackC is not an issue.
  ● Since we need to use getSize frequently, it is better to go with StackC.
● Situation 3 presents a choice between StackB and StackC.
  ● If getSize calls are infrequent, we may choose to go with StackB and suffer a loss in speed.
  ● The faster getSize delivered by StackC is at the expense of an extra variable per stack, which may add up to considerable space consumption since we plan to maintain a number of stacks.
● getSize in StackB is more time-consuming than that in StackC.
  ● How can we quantify the time taken in either case?
  ● For each data structure we study, we present the running time of each operation in its interface.
Time complexity
● Use of time complexity makes it easy to estimate the running time of a program.
● Complexity can be viewed as the maximum number of primitive operations that a program may execute. Primitive operations are single additions, multiplications, assignments, etc. We may leave some operations uncounted and concentrate on those that are performed the largest number of times. Such operations are referred to as dominant.
The operation in line 4 is dominant and will be executed n
times. The complexity is described in Big-O notation: in this
case O(n) — linear complexity.
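A loop of the kind described might look like the following sketch. It is an illustration written for this purpose, not the listing from the slide, so its line numbering need not match the reference to line 4; the method name `dominant` is an assumption.

```java
class DominantOperationDemo {
    // Counts up to n one step at a time.
    static int dominant(int n) {
        int result = 0;             // executed once
        for (int i = 0; i < n; i++) {
            result += 1;            // dominant operation: executed n times
        }
        return result;              // executed once
    }

    public static void main(String[] args) {
        System.out.println(dominant(1000)); // prints 1000
    }
}
```

The assignments outside the loop run once regardless of n, so only the statement inside the loop determines how the running time grows.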
● The complexity specifies the order of magnitude within which the program will perform its operations. More precisely, in the case of O(n), the program may perform c · n operations, where c is a constant; however, it may not perform n^2 operations.
● When calculating the complexity we omit constants: i.e. regardless of whether the loop is executed 20 · n times or n / 5 times, we still have a complexity of O(n), even though the running time of the program may vary.
● When analyzing the complexity we must look for specific, worst-case examples of data that the program will take a long time to process.
[Figure: Comparison of different time complexities]
Exponential and factorial time
● It is worth knowing that there are other types of time complexity such as factorial time O(n!) and exponential time O(2^n). Algorithms with such complexities can solve problems only for very small values of n, because they would take too long to execute for large values of n.
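To get a feel for how quickly these functions grow, the following sketch prints n^2, 2^n and n! for a few small values of n. The class and method names are assumptions; `long` arithmetic is used, which is only safe here up to n = 20.

```java
class GrowthDemo {
    // n! computed iteratively; fits in a long only for n <= 20.
    static long factorial(int n) {
        long product = 1;
        for (int i = 2; i <= n; i++) {
            product *= i;
        }
        return product;
    }

    // 2^n via a bit shift; valid for 0 <= n <= 62.
    static long powerOfTwo(int n) {
        return 1L << n;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 10, 20}) {
            System.out.println("n=" + n
                    + " n^2=" + (long) n * n
                    + " 2^n=" + powerOfTwo(n)
                    + " n!=" + factorial(n));
        }
        // prints:
        // n=5 n^2=25 2^n=32 n!=120
        // n=10 n^2=100 2^n=1024 n!=3628800
        // n=20 n^2=400 2^n=1048576 n!=2432902008176640000
    }
}
```

Already at n = 20 the factorial is on the order of 10^18, while n^2 is only 400.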
[Figure: Comparison of rates of growth for different time complexities]