module 4 - Notesvillage

MODULE 4
Searching
A searching algorithm accepts an argument a and tries to find a record whose key is a. The algorithm may return the entire record or a pointer to that record. If the search is unsuccessful, the algorithm may return a special null record or a null pointer.
A successful search is called a retrieval, and the table of records in which a key is used for retrieval is called a search table or a dictionary.
Searches in which the entire table of records is held in main memory are called internal searches, whereas those in which most of the table is kept in auxiliary memory are called external searches.
Sequential searching
This is the simplest form of search. It is applicable to a table organized either as an array or as a linked list.
Let us assume that k is an array of n keys, k(0) through k(n-1), and r is an array of records, r(0) through r(n-1), such that k(i) is the key of r(i). The algorithm returns i if k(i) equals key; otherwise it returns -1.
for (i = 0; i < n; i++)
    if (key == k(i))
        return (i);
return (-1);
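Search with insertion: a record rec with key key is added to the table if key is not already there.
for (i = 0; i < n; i++)
    if (key == k(i))
        return (i);
k(n) = key;     /* key not found: append the new record */
r(n) = rec;
n++;
return (n - 1);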
Storing the table as a linked list has the advantage that the size of the table can be increased dynamically as needed.
Assume that the table is organized as a linear linked list pointed to by table and linked by the pointer field next.
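q = null;
for (p = table; p != null && k(p) != key; p = next(p))
    q = p;              /* q trails one node behind p */
if (p != null)          /* key found */
    return (p);
/* key not found: insert a new node at the end of the list */
s = getnode();
k(s) = key;
r(s) = rec;
next(s) = null;
if (q == null)          /* the table was empty */
    table = s;
else
    next(q) = s;
return (s);
File structures and algorithms
Melvin C Varkey, Amal Jyothi College of Engg.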
Tree searching
We now look at search operations on a file that is organized as a binary search tree. In a binary search tree all the left descendants of a node with key k1 have keys that are less than k1, and all the right descendants have keys that are greater than or equal to k1.
The inorder traversal of such a tree yields the file in ascending key order.
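The ordering property can be checked with a small sketch. It is only an illustration: the Node type and the names insert, inorder and sortedKeys are assumptions made for this example and are not part of the notes, which use k(p), left(p) and right(p).

```cpp
#include <vector>

// Hypothetical node type for illustration only.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Insert key into the tree rooted at root and return the (possibly new) root.
// Smaller keys go left; greater or equal keys go right, as stated in the text.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = insert(root->left, key);
    else                 root->right = insert(root->right, key);
    return root;
}

// Append the keys to out in inorder: left subtree, node, right subtree.
void inorder(const Node* p, std::vector<int>& out) {
    if (p == nullptr) return;
    inorder(p->left, out);
    out.push_back(p->key);
    inorder(p->right, out);
}

// Build a tree from keys and return its inorder key sequence.
// The nodes are never freed; this is acceptable only in a short sketch.
std::vector<int> sortedKeys(const std::vector<int>& keys) {
    Node* root = nullptr;
    for (int k : keys) root = insert(root, k);
    std::vector<int> out;
    inorder(root, out);
    return out;
}
```

Calling sortedKeys with the keys in any order returns them in ascending order, with duplicates kept next to each other.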
Binary search in trees
p = tree;
while (p != null && key != k(p))
    if (key < k(p))
        p = left(p);
    else
        p = right(p);
return (p);
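Insertion into a binary search tree
q = null;
p = tree;
while (p != null)
{
    if (key == k(p))
        return (p);     /* key is already in the tree */
    q = p;              /* q trails one node behind p */
    if (key < k(p))
        p = left(p);
    else
        p = right(p);
}
v = maketree(rec, key);
if (q == null)
    tree = v;           /* the tree was empty */
else if (key < k(q))
    left(q) = v;
else
    right(q) = v;
return (v);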
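Height balanced trees (AVL trees)
An empty tree is height balanced. A non-empty binary tree T with left and right subtrees TL and TR is height balanced iff
1) TL and TR are height balanced, and
2) |hL - hR| <= 1, where hL and hR are the heights of TL and TR.
The balance factor BF(T) of a node T is hL - hR. For every node of an AVL tree, BF(T) is -1, 0 or 1.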
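The definition can be turned into a short recursive check. This is a minimal sketch, assuming the balance factor is the height of the left subtree minus the height of the right subtree; the Node type and the function names are hypothetical. It recomputes heights on every call, so it is meant to illustrate the definition, not to be efficient.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical node type for illustration only.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Height of a tree: 0 for an empty tree, otherwise 1 + height of the taller subtree.
int height(const Node* t) {
    return t ? 1 + std::max(height(t->left), height(t->right)) : 0;
}

// Balance factor: height of the left subtree minus height of the right subtree.
int balanceFactor(const Node* t) {
    return t ? height(t->left) - height(t->right) : 0;
}

// A tree is height balanced iff both subtrees are height balanced
// and their heights differ by at most 1.
bool isHeightBalanced(const Node* t) {
    if (!t) return true;
    return isHeightBalanced(t->left) && isHeightBalanced(t->right) &&
           std::abs(balanceFactor(t)) <= 1;
}
```

A chain of three nodes linked only through right children has balance factor -2 at its root and is therefore not height balanced; making the middle node the root gives balance factor 0.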
Example: insert the keys MARCH, MAY, NOV, AUG, APR, JAN, DEC, JULY and FEB, in that order, into an initially empty AVL tree. Each tree is written as key(BF)[left subtree, right subtree], with - for an empty subtree and the brackets omitted for a leaf. Y is the new node and A is the nearest ancestor of Y whose balance factor becomes +2 or -2.
1) MARCH
After insertion: MAR(0)
No rebalancing
2) MAY
After insertion: MAR(-1)[-, MAY(0)]
No rebalancing
3) NOV
After insertion: MAR(-2)[-, MAY(-1)[-, NOV(0)]]
RR: Y is inserted in the right subtree of the right subtree of A. Rotate MAR to the left.
After rebalancing: MAY(0)[MAR(0), NOV(0)]
4) AUG
After insertion: MAY(1)[MAR(1)[AUG(0), -], NOV(0)]
No rebalancing
5) APR
After insertion: MAY(2)[MAR(2)[AUG(1)[APR(0), -], -], NOV(0)]
LL: Y is inserted in the left subtree of the left subtree of A. Rotate MAR to the right.
After rebalancing: MAY(1)[AUG(0)[APR(0), MAR(0)], NOV(0)]
6) JAN
After insertion: MAY(2)[AUG(-1)[APR(0), MAR(1)[JAN(0), -]], NOV(0)]
LR: Y is inserted in the right subtree of the left subtree of A.
After rebalancing: MAR(0)[AUG(0)[APR(0), JAN(0)], MAY(-1)[-, NOV(0)]]
7) DEC
After insertion: MAR(1)[AUG(-1)[APR(0), JAN(1)[DEC(0), -]], MAY(-1)[-, NOV(0)]]
No rebalancing
8) JULY
After insertion: MAR(1)[AUG(-1)[APR(0), JAN(0)[DEC(0), JULY(0)]], MAY(-1)[-, NOV(0)]]
No rebalancing
9) FEB
After insertion: MAR(2)[AUG(-2)[APR(0), JAN(1)[DEC(-1)[-, FEB(0)], JULY(0)]], MAY(-1)[-, NOV(0)]]
RL: FEB is inserted in the left subtree of the right subtree of A (here A is AUG).
After rebalancing: MAR(1)[DEC(0)[AUG(1)[APR(0), -], JAN(0)[FEB(0), JULY(0)]], MAY(-1)[-, NOV(0)]]
PROGRAM FOR INSERTION OF A NODE INTO AN AVL TREE
/* Inserting a node in an AVL tree */
class Avlnode
{
    friend class AVL;       // AVL needs access to the private fields
    int data;
    Avlnode *leftchild, *rightchild;
    int balfact;
};
class AVL
{
private:
    Avlnode *root;
public:
    AVL() : root(0) {}
    bool insert(int x);
};
bool AVL::insert(int x)     // x is to be inserted
{
    Avlnode *a, *b, *c, *f, *p, *q, *y;
    bool found, unbalanced;
    int d;
    if (!root)              // special case: empty tree
    {
        y = new Avlnode;
        y->data = x; root = y; root->balfact = 0;
        root->leftchild = root->rightchild = 0;
        return true;
    }
    /* Phase 1: locate the insertion point for x. a keeps track of the most recent node
       with balfact +-1 and f is the parent of a. q follows p through the tree. */
    f = 0; a = p = root; q = 0; found = false;
    while (p && !found)     // search for the insertion point
    {
        if (p->balfact) { a = p; f = q; }
        if (x < p->data) { q = p; p = p->leftchild; }
        else if (x > p->data) { q = p; p = p->rightchild; }
        else { y = p; found = true; }
    }                       // end of while
    if (!found)
    {
        /* Phase 2: insert and rebalance. x is not in the tree and may be inserted as the
           appropriate child of q. */
        y = new Avlnode;
        y->data = x;
        y->leftchild = y->rightchild = 0;
        y->balfact = 0;
        if (x < q->data) q->leftchild = y;
        else q->rightchild = y;
        /* Adjust the balfact of all nodes on the path from a to q. By the definition of a,
           all nodes on this path must have balfact 0, so they change to +-1. d = +1 implies
           that x is inserted in the left subtree of a; d = -1 implies that x is inserted in
           the right subtree of a. */
        if (x > a->data)
        {
            p = a->rightchild; b = p; d = -1;
        }
        else
        {
            p = a->leftchild; b = p; d = 1;
        }
        while (p != y)
        {
            if (x > p->data)    // height of the right subtree increases
            {
                p->balfact = -1;
                p = p->rightchild;
            }
            else                // height of the left subtree increases
            {
                p->balfact = 1; p = p->leftchild;
            }
        }
        // Is the tree unbalanced?
        unbalanced = true;
        if (!(a->balfact) || !(a->balfact + d))   // tree still balanced
        {
            a->balfact += d;
            unbalanced = false;
        }
        if (unbalanced)         // tree unbalanced, determine rotation type
        {
            if (d == 1)         // left imbalance
            {
                if (b->balfact == 1)    // rotation type is LL
                {
                    a->leftchild = b->rightchild;
                    b->rightchild = a;
                    a->balfact = 0; b->balfact = 0;
                }
                else                    // rotation type is LR
                {
                    c = b->rightchild;
                    b->rightchild = c->leftchild;
                    a->leftchild = c->rightchild;
                    c->leftchild = b;
                    c->rightchild = a;
                    switch (c->balfact)
                    {
                    case 1:     // LR(b)
                        a->balfact = -1; b->balfact = 0;
                        break;
                    case -1:    // LR(c)
                        b->balfact = 1; a->balfact = 0;
                        break;
                    case 0:     // LR(a)
                        b->balfact = 0; a->balfact = 0;
                        break;
                    }
                    c->balfact = 0; b = c;   // b is the new root of the subtree
                }               // end of LR
            }                   // end of left imbalance
            else                // right imbalance: symmetric to left imbalance
            {
                if (b->balfact == -1)   // rotation type is RR
                {
                    a->rightchild = b->leftchild;
                    b->leftchild = a;
                    a->balfact = 0; b->balfact = 0;
                }
                else                    // rotation type is RL
                {
                    c = b->leftchild;
                    b->leftchild = c->rightchild;
                    a->rightchild = c->leftchild;
                    c->rightchild = b;
                    c->leftchild = a;
                    switch (c->balfact)
                    {
                    case -1:
                        a->balfact = 1; b->balfact = 0;
                        break;
                    case 1:
                        b->balfact = -1; a->balfact = 0;
                        break;
                    case 0:
                        b->balfact = 0; a->balfact = 0;
                        break;
                    }
                    c->balfact = 0; b = c;   // b is the new root of the subtree
                }               // end of RL
            }                   // end of right imbalance
            // the subtree with root b has been rebalanced
            if (!f) root = b;
            else if (a == f->leftchild) f->leftchild = b;
            else if (a == f->rightchild) f->rightchild = b;
        }                       // end of if (unbalanced)
        return true;
    }                           // end of if (!found)
    return false;               // x is already in the tree
}                               // end of insert
Multiway search trees
A multiway search tree of order n is a general tree in which each node has n or fewer subtrees and contains one fewer key than it has subtrees.
E.g.:- If a node has 4 keys, then it has 5 subtrees.
If s0, s1, ..., sm-1 are the m subtrees of a node containing keys k0, k1, ..., km-2 in ascending order, then all the keys in subtree s0 are less than or equal to k0, all the keys in subtree sj (1 <= j <= m-2) are greater than kj-1 and less than or equal to kj, and all the keys in subtree sm-1 are greater than km-2.
The subtree sj is called the left subtree of key kj and its root is called the left son of key kj.
Fig 1: a multiway search tree with nodes A through H (figure not reproduced)
The tree in Fig 1 is a multiway search tree of order 4. The nodes A, D, E and G contain the maximum number of subtrees (4) and the maximum number of keys (3). Such nodes are called full nodes.
Searching a multiway search tree
Each node contains
1) an integer field numtrees(p): the number of subtrees of node(p)
2) pointer fields son(p, 0) through son(p, numtrees(p)-1), which point to the subtrees of node(p)
3) key fields k(p, 0) through k(p, numtrees(p)-2), the keys contained in node(p) in ascending order.
Assume that the function nodesearch(p, key) returns the smallest integer j such that key <= k(p, j), or numtrees(p)-1 if key is greater than all the keys in node(p).
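One possible way to write nodesearch is a linear scan of the keys of the node. The sketch below is an assumption about the implementation, not the version used in the notes; the MNode layout (a vector of keys and a vector of son pointers) is hypothetical and stands in for the fields numtrees(p), son(p, i) and k(p, i).

```cpp
#include <vector>

// Hypothetical in-memory node: keys are in ascending order and
// a node with m keys has m + 1 sons.
struct MNode {
    std::vector<int> keys;
    std::vector<MNode*> sons;
};

// Number of subtrees of the node: one more than the number of keys.
int numtrees(const MNode& p) {
    return static_cast<int>(p.keys.size()) + 1;
}

// Smallest j such that key <= keys[j]; returns numtrees(p) - 1
// when key is greater than every key in the node.
int nodesearch(const MNode& p, int key) {
    int j = 0;
    while (j < numtrees(p) - 1 && key > p.keys[j])
        ++j;
    return j;
}
```

For a node holding the keys 12, 50 and 85, the key 37 gives j = 1 (the subtree between 12 and 50) and the key 100 gives j = 3 (the last subtree). For nodes with many keys a binary search over the keys could replace the linear scan.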
Algorithm search(tree)
p = tree;
if (p == null)
{
    position = -1;
    return (-1);
}
i = nodesearch(p, key);
if (i < numtrees(p) - 1 && key == k(p, i))
{
    position = i;       /* key found at position i of node(p) */
    return (p);
}
return (search(son(p, i)));
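Traversing a multiway search tree
/* traverse(tree): prints the keys of the tree in ascending order */
if (tree != null)
{
    nt = numtrees(tree);
    for (i = 0; i < nt - 1; i++)
    {
        traverse(son(tree, i));
        printf("%d", k(tree, i));
    }
    traverse(son(tree, nt - 1));    /* the last subtree has index nt - 1 */
}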
Insertion of a node into a multiway search tree
q = null;
p = tree;
while (p != null)
{
    i = nodesearch(p, key);
    q = p;
    if (i < numtrees(p) - 1 && key == k(p, i))
    {
        found = true;
        position = i;
        return (p);     /* key is already in the tree */
    }
    p = son(p, i);
}
/* key not found: q is the last node visited and position is where key belongs in it */
found = false;
position = i;
insrec(q, position, rec);
Semi leaf
A semi leaf is a node with at least one empty subtree. In Fig 1 the nodes B, C, D, E, F, G and H are semi leaves.
Top down multiway search trees
A top down multiway search tree is characterized by the fact that any non-full node is a leaf.
Eg:- Fig 2: a top down multiway search tree with nodes A through R (figure not reproduced)
Here the order of the tree is 3.
1) Full nodes (having 3 subtrees and 2 keys) are A, B, C, D, E, F, H, K, M.
2) Non-full nodes are G, I, J, L, N, O, P, Q, R. They are also leaves (all their subtrees are empty).
Balanced multiway search trees
If all the semi leaves of a multiway search tree are at the same level, the tree is called balanced.
Eg:- Fig 3: a balanced multiway search tree (figure not reproduced)
All its semi leaves are at the same level, and all its semi leaves are leaves.
The trees in Fig 1 and Fig 2 are not balanced.
B-trees
A balanced order-n multiway search tree in which each non-root node contains at least (n-1)/2 keys is called a B-tree of order n. The division is integer division, so any remainder is discarded.
Eg:- A B-tree of order 12 contains at least 5 keys in each non-root node, as does a B-tree of order 11. A B-tree of order n is also called an n-(n-1) tree or an (n-1)-n tree, i.e. each node in the tree has a maximum of n-1 keys and n sons.
Thus a 4-5 tree is a B-tree of order 5, as is a 5-4 tree.
Note
A 2-3 (or 3-2) tree is the most elementary non-trivial (i.e. non-binary) B-tree.
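The key limits for a given order can be computed directly. This is a small sketch of the arithmetic only; the function names minKeys and maxKeys are made up for the example.

```cpp
// Minimum number of keys in a non-root node of a B-tree of order n.
// Integer division discards the remainder, as in the definition (n-1)/2.
int minKeys(int n) {
    return (n - 1) / 2;
}

// Maximum number of keys in any node of a B-tree of order n.
int maxKeys(int n) {
    return n - 1;
}
```

Orders 11 and 12 both give a minimum of 5 keys, and a 2-3 tree (order 3) has between 1 and 2 keys in each non-root node.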
Fig: an example of a B-tree of order 5 (figure not reproduced)
Insertion of a node into a B-tree (of order n, n odd)
1. Locate the leaf into which the key should be inserted.
2. If the leaf is not full, add the key to that leaf.
3. If the located leaf is full, instead of creating a new node with only one key, split the full leaf into 2:
a) Left leaf
b) Right leaf
E.g.:- Assume that there are n-1 keys in the full leaf.
The n keys (the n-1 keys in the full leaf plus the one key to be inserted) are divided into 3 groups:
1) the lowest n/2 keys are placed in the left leaf
2) the highest n/2 keys are placed in the right leaf
3) the middle key is placed in the father node
Here n/2 is integer division, so for n = 5 each new leaf receives 2 keys.
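The three-way division of the keys can be sketched as follows. The code works on a plain sorted vector of keys and ignores records and son pointers, so it only illustrates how the keys are distributed; SplitResult and splitLeaf are hypothetical names, not procedures from the notes.

```cpp
#include <algorithm>
#include <vector>

// Result of splitting a full leaf: keys for the left leaf, the key that
// moves up to the father, and keys for the right leaf.
struct SplitResult {
    std::vector<int> left;
    int middle;
    std::vector<int> right;
};

// leafKeys holds the n - 1 keys of a full leaf in ascending order.
SplitResult splitLeaf(std::vector<int> leafKeys, int newKey) {
    // Put the new key in its sorted position, giving n keys in all.
    leafKeys.insert(std::upper_bound(leafKeys.begin(), leafKeys.end(), newKey), newKey);
    const int half = static_cast<int>(leafKeys.size()) / 2;   // n / 2, integer division
    SplitResult r;
    r.left.assign(leafKeys.begin(), leafKeys.begin() + half);       // lowest n/2 keys
    r.middle = leafKeys[half];                                      // middle key
    r.right.assign(leafKeys.begin() + half + 1, leafKeys.end());    // remaining keys
    return r;
}
```

For odd n both leaves receive n/2 keys. If the same function is applied when n is even, the left leaf receives one key more than the right leaf.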
Consider the following B-tree (only the nodes involved are shown)
Root: (320 540)
Son of the root between 320 and 540: (430 480)
Its leaves: F = (380 395 406 412), (451 472), (493 506 511)
Now let us add the new key 382.
The key 382 is to be inserted into the leaf F, which is full.
The keys in ascending order are 380 382 395 406 412
The lowest n/2 keys = 380 382
The highest n/2 keys = 406 412
The middle key = 395 (taken to the parent node)
The tree after inserting the key 382
Root: (320 540)
Son of the root between 320 and 540: (395 430 480)
Its leaves: (380 382), (406 412), (451 472), (493 506 511)
Insertion of a node into a B-tree of order n, n even
A) LEFT BIAS
If the order is even, the n keys are divided into unequal sized groups
1) One of size n/2, placed in the left leaf
2) The second of size (n-1)/2, placed in the right leaf
3) The middle key is taken to the parent node
Let us insert 102 into the following tree of order 4
Fig a:1
Root: (87 140)
Leaves: (23 61 74), (90 100 106), (152 186 194)
Take the leaf (90 100 106). With the new key, the keys in order are 90 100 102 106
1) n/2 = 2 keys (90 100) form the left leaf
2) (n-1)/2 = 1 key (106) forms the right leaf
3) 102 goes to the parent node
Root: (87 102 140)
Leaves: (23 61 74), (90 100), (106), (152 186 194)
B) RIGHT BIAS
The n keys are divided into 3 groups
1) One of size (n-1)/2, placed in the left leaf
2) The second of size n/2, placed in the right leaf
3) The middle key is taken to the parent node
Let us insert 102 into the tree of Fig a:1
Left leaf: (90)
To the parent: 100
Right leaf: (102 106)
Root: (87 100 140)
Leaves: (23 61 74), (90), (102 106), (152 186 194)
Insertion of a node when the father node is also full
Root A: (87 102 140)
Leaves: (23 61 74), C = (89 91 101), (104 110 120), (145 152 180)
Now let us insert 90. Since the leaf C is full, we split it into 2
Left: (89 90)
To the parent: 91
Right: (101)
91 is taken to the parent node A, which is also full, so A is split as well
Left: (87 91)
To the parent: 102
Right: (140)
So 102 becomes the new root. The resultant tree is
Root: (102)
Sons of the root: (87 91), (140)
Leaves of (87 91): (23 61 74), (89 90), (101)
Leaves of (140): (104 110 120), (145 152 180)
Algorithm for insertion of a node into a B-tree
The procedures and variables used in the algorithm
1) insert(key, rec, s, position): inserts a record into a B-tree
2) nd: the node where the key is to be inserted
3) father(nd): the father of node nd
4) index(nd): the position of the pointer nd in node(father(nd))
5) pos: the position in node nd where the key and the record are to be inserted
6) newkey and newrec: the key and record being inserted
7) newnode: pointer to the subtree that contains the keys greater than the new key
8) maketree(key, rec): creates a new node containing the single key key and the record rec, with all the pointers null
Algorithm
nd = s;
pos = position;
newnode = null;
newrec = rec;
newkey = key;
f = father(nd);
while (f != null && numtrees(nd) == n)
{
    /* nd is full and is not the root: split it and move the middle key up */
    split(nd, pos, newkey, newrec, newnode, nd2, midkey, midrec);
    newnode = nd2;
    pos = index(nd);
    nd = f;
    f = father(nd);
    newkey = midkey;
    newrec = midrec;
}
if (numtrees(nd) < n)
{
    insnode(nd, pos, newkey, newrec, newnode);
    return;
}
/* nd is the root and it is full: split it and create a new root */
split(nd, pos, newkey, newrec, newnode, nd2, midkey, midrec);
tree = maketree(midkey, midrec);
son(tree, 0) = nd;
son(tree, 1) = nd2;
Algorithm for split
/* ndiv2 = n/2 */
nd2 = getnode();
if (pos > ndiv2)
{
    /* the new key belongs in the new right node nd2 */
    copy(nd, ndiv2 + 1, n - 2, nd2);
    insnode(nd2, pos - ndiv2 - 1, newkey, newrec, newnode);
    numtrees(nd) = ndiv2 + 1;
    midkey = k(nd, ndiv2);
    midrec = r(nd, ndiv2);
    return;
}
if (pos == ndiv2)
{
    /* the new key is the middle key */
    copy(nd, ndiv2, n - 2, nd2);
    numtrees(nd) = ndiv2 + 1;
    son(nd2, 0) = newnode;
    midkey = newkey;
    midrec = newrec;
    return;
}
if (pos < ndiv2)
{
    /* the new key belongs in the left node nd */
    copy(nd, ndiv2, n - 2, nd2);
    numtrees(nd) = ndiv2;
    insnode(nd, pos, newkey, newrec, newnode);
    midkey = k(nd, ndiv2 - 1);
    midrec = r(nd, ndiv2 - 1);
    return;
}
Algorithm for insnode (nd, pos, newkey, newrec, newnode)
/* shift the keys, records and sons to the right of pos one place to the right */
for (i = numtrees(nd) - 1; i >= pos + 1; i--)
{
    son(nd, i + 1) = son(nd, i);
    k(nd, i) = k(nd, i - 1);
    r(nd, i) = r(nd, i - 1);
}
son(nd, pos + 1) = newnode;
k(nd, pos) = newkey;
r(nd, pos) = newrec;
numtrees(nd) += 1;
Algorithm for copy (nd1, first, last, nd2)
Set numkeys = last - first + 1;
It copies the key fields
k(nd1, first) -> k(nd2, 0)
:
k(nd1, last) -> k(nd2, numkeys - 1)
and the record fields
r(nd1, first) -> r(nd2, 0)
:
r(nd1, last) -> r(nd2, numkeys - 1)
and the son fields
son(nd1, first) -> son(nd2, 0)
:
son(nd1, last + 1) -> son(nd2, numkeys)
and sets numtrees(nd2) = numkeys + 1;
Deletion of a node from a B-tree
In a strict B-tree we must preserve the requirement that each non-root node contains at least (n-1)/2 keys.
Deleting a key from a non leaf node
If a key is being deleted from a non leaf node, its successor is moved to the deleted position.
Consider the following B-tree
Root A: (50)
Sons of A: B = (30 40), C = (55 70)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), H = (56 60), I = (73 75 78 80)
Deleting key 70 results in
Root A: (50)
Sons of A: B = (30 40), C = (55 73)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), H = (56 60), I = (75 78 80)
73, which is the successor of 70, is moved to its position. After this movement every node still contains at least (n-1)/2 keys.
Deleting a key from a leaf node
If a key is being deleted from a leaf node and the node still contains at least (n-1)/2 keys afterwards, the key can simply be deleted.
Deleting 75 (75 is in a leaf) from the tree above gives
Root A: (50)
Sons of A: B = (30 40), C = (55 73)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), H = (56 60), I = (78 80)
If, on deletion of a key, the number of keys in a node drops below (n-1)/2, the situation is called underflow.
Underflow case 1
When an underflow occurs, the solution is to examine the leaf's younger or elder brother. If the brother contains more than (n-1)/2 keys, the key ks in the father node that separates the 2 brothers is added to the underflow node, and the last or first key of the brother (last if the brother is younger, first if the brother is elder) is moved to the father in place of ks.
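The movement of keys in case 1 can be sketched on plain key vectors. The sketch assumes that the nodes involved are leaves, so no son pointers have to be moved, and that the separator key is passed by reference; the function names are hypothetical.

```cpp
#include <vector>

// Borrow from the elder (right) brother: the separator moves down to the end
// of the underflow node and the brother's first key moves up in its place.
void borrowFromElder(std::vector<int>& node, int& separator, std::vector<int>& elder) {
    node.push_back(separator);
    separator = elder.front();
    elder.erase(elder.begin());
}

// Borrow from the younger (left) brother: the separator moves down to the front
// of the underflow node and the brother's last key moves up in its place.
void borrowFromYounger(std::vector<int>& node, int& separator, std::vector<int>& younger) {
    node.insert(node.begin(), separator);
    separator = younger.back();
    younger.pop_back();
}
```

With an underflow leaf holding 56, separator 70 and an elder brother holding 73 75 78 80, borrowFromElder leaves the node with 56 70, the separator as 73 and the brother with 75 78 80. The caller is expected to check first that the brother has more than (n-1)/2 keys.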
Root A: (50)
Sons of A: B = (30 40), C = (55 70)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), H = (56 60), I = (73 75 78 80)
Let us delete 60.
H is left with fewer than (n-1)/2 keys. Its younger brother is G and its elder brother is I, which contains more than (n-1)/2 keys. So 70 is moved down to the node H and 73 is moved up.
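Root A: (50)
Sons of A: B = (30 40), C = (55 73)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), H = (56 70), I = (75 78 80)
Let us delete 56.
H underflows again. The parent key 73 is moved down to H and the first key of the elder brother I (75) is moved up.
Root A: (50)
Sons of A: B = (30 40), C = (55 75)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), H = (70 73), I = (78 80)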
Underflow case 2
If both the brothers contain exactly (n-1)/2 keys, no keys can be shifted. In such a situation the underflow node, one of its brothers and the separator key from their father are consolidated (concatenated) into a single node.
Consider the tree
Root A: (50)
Sons of A: B = (30 40), C = (55 75)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), H = (70 73), I = (78 80)
Let us delete 73.
Both the brothers G and I contain exactly (n-1)/2 keys.
So 73 is removed, and node H, node I and their separator key 75 are consolidated.
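Root A: (50)
Sons of A: B = (30 40), C = (55)
Leaves of B: (10 20), (33 35), (42 45)
Leaves of C: G = (52 53), (70 75 78 80)
Now node C (a non leaf node), which is the father of the node H, has fewer than (n-1)/2 keys, so it is in underflow. It cannot borrow from its brother B, because B contains exactly (n-1)/2 keys. So C is consolidated with B and their separator key 50 from A, and the combined node becomes the new root.
Root: (30 40 50 55)
Leaves: (10 20), (33 35), (42 45), (52 53), (70 75 78 80)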
Threaded binary tree
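In a linked representation of a binary tree there are more null links than actual pointers: a tree with n nodes has 2n link fields, of which n+1 are null.
The tree T below has 9 nodes and 10 null links (each 0 is a null link)
A: LCHILD = B, RCHILD = C
B: LCHILD = D, RCHILD = E
C: LCHILD = F, RCHILD = G
D: LCHILD = H, RCHILD = I
E, F, G, H, I: LCHILD = 0, RCHILD = 0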
A method to make use of these null links is to replace them by threads to other nodes in the tree.
Concept of threads
A node P can be represented as follows
P: LCHILD | DATA | RCHILD
If RCHILD(P) would normally be null, we replace it by a pointer to the node that would be printed after P when traversing the tree in inorder. A null LCHILD link at node P is replaced by a pointer to the node that immediately precedes node P in inorder. The inorder traversal of the above figure yields the nodes in the order
H, D, I, B, E, A, F, C, G
Here the nodes H, I, E, F and G have null pointers.
Take H
a) LCHILD(H) points to the node which precedes it in inorder. Since there is no such node, it stays null
b) RCHILD(H) points to the node which succeeds it in inorder (D)
Take I
a) LCHILD(I) points to the node which precedes it in inorder (D)
b) RCHILD(I) points to the node which succeeds it in inorder (B)
Take E
a) LCHILD(E) points to B
b) RCHILD(E) points to A
Take F
a) LCHILD(F) points to A
b) RCHILD(F) points to C
Take G
a) LCHILD(G) points to C
b) RCHILD(G) stays null, since no node succeeds G in inorder
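With threads in place, an inorder traversal needs neither recursion nor a stack. The following is a minimal sketch under one common assumption that the notes do not state: each node carries two flags, lthread and rthread, that tell whether a link is a thread or a real child pointer. The TNode type and the function names are hypothetical.

```cpp
#include <string>

// Hypothetical threaded node. A flag set to true means the link is a thread
// (or null); a flag set to false means the link is a real child pointer.
struct TNode {
    char data;
    TNode* lchild = nullptr;
    TNode* rchild = nullptr;
    bool lthread = true;
    bool rthread = true;
    explicit TNode(char d) : data(d) {}
};

// Inorder successor of p, or null if p is the last node in inorder.
TNode* successor(TNode* p) {
    TNode* s = p->rchild;
    if (p->rthread) return s;            // a thread points straight at the successor
    while (!s->lthread) s = s->lchild;   // otherwise take the leftmost node of the right subtree
    return s;
}

// Return the node labels in inorder by following successors.
std::string inorder(TNode* root) {
    std::string out;
    if (!root) return out;
    TNode* p = root;
    while (!p->lthread) p = p->lchild;   // the leftmost node comes first in inorder
    for (; p != nullptr; p = successor(p))
        out += p->data;
    return out;
}
```

When the example tree is built with the threads listed above (H to D, I to D and B, E to B and A, F to A and C, G to C), the traversal visits the nodes in the order H, D, I, B, E, A, F, C, G.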
So the threaded binary tree T has the same nodes A through I as before, with every null link replaced by the thread given above; only LCHILD(H) and RCHILD(G) remain null (figure not reproduced).
B+ tree
A B+ tree is a variation of the basic B-tree structure. In a B+ tree all the keys are maintained in the leaves, and keys are replicated in non leaf nodes to define paths for locating individual records. The leaves are linked together to provide a sequential path for traversing the keys in the tree.
Eg:-
Root: (98)
Sons of the root: B = (36 53 81), (104 119)
Leaves of B: (8 36), (42 53), (56 81), (83 96 98)
Leaves of (104 119): (102 104), (107 119), (125 128)
Each key in a leaf is stored with a pointer to its record (R8 for key 8, R36 for key 36, ..., R125 for key 125, R128 for key 128), and each leaf is linked to the next leaf.
To locate the record associated with the key 53, the key is first compared with 98. Since it is less, we proceed to node B. 53 is then compared with 36 and then with 53 in node B. Since it is less than or equal to 53, we proceed to the leaf that contains 42 and 53. The search does not halt when the key is found in node B, as it would in a B-tree. In a B-tree a pointer to the record corresponding to a key is contained with each key in the tree, whereas in a B+ tree the record pointers are kept only in the leaves.
Insertion of a key
Insertion proceeds much as in a B-tree, except that when a node is split the middle key is retained in the left half node as well as being promoted to the father.
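The difference from the B-tree split can be sketched on a sorted vector of leaf keys. This is an illustration only: the names LeafSplit and splitBPlusLeaf are made up, records and leaf links are left out, and the choice of the middle index for an even number of keys is an assumption.

```cpp
#include <algorithm>
#include <vector>

// Result of splitting a B+ tree leaf. The middle key stays in the left leaf
// and a copy of it is promoted to the father as the separator.
struct LeafSplit {
    std::vector<int> left;
    std::vector<int> right;
    int promoted;
};

// keys holds the keys of a full leaf in ascending order.
LeafSplit splitBPlusLeaf(std::vector<int> keys, int newKey) {
    // Put the new key in its sorted position.
    keys.insert(std::upper_bound(keys.begin(), keys.end(), newKey), newKey);
    const int mid = (static_cast<int>(keys.size()) - 1) / 2;   // index of the middle key
    LeafSplit s;
    s.left.assign(keys.begin(), keys.begin() + mid + 1);   // includes the middle key
    s.right.assign(keys.begin() + mid + 1, keys.end());
    s.promoted = keys[mid];                                 // copy sent to the father
    return s;
}
```

Every key in the left leaf is less than or equal to the promoted key, which matches the rule used when searching down from a non leaf node.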
Deletion of a key
When a key is deleted from a leaf, it can be retained in the non leaf nodes, since it is still a valid separator between the keys in the nodes below.