Introduction
Hello Readers, in my this article I am going to explain about the Trees in data structure. No doubt, pointers (dynamic variables) have great advantages of flexibility over the array implementation of lists. We know that lists have one limitation. They are arranged in such a way that it is necessary to move through them only one position at a time, that is there is no way to move through the list other than one node at a time. In this article we overcome this limitation by studying trees, a valuable non-linear data structure that is useful in many applications.
Definitions and Terminology
Tree is one of the most important non-linear data structures. Generally a tree structure is used to organize the data so that items of information are related by branches. One very common place where such a situation arise is ‘parent-child’ relationship. Figure-1 shows parent-child relationship among person in a family. Such a tree is known as a hypothetical family tree.
Figure 1
In figure1, each node represents a person name and each line connecting two nodes denotes a relation namely ‘parent-child’ relation between the connected nodes. Often a node is represented or identified by its label and this convention will be used in the remaining part of this book. This figure shows that the node labeled ‘Radhey’ is a parent of nodes labeled ‘Mohan’, ‘Deepak’ and ‘Sunita’. Similarly the node labeled ‘Rahul’ and ‘Kiran’ are children of ‘Simmy’.
In other words we can say that if a node ‘X’ is connected to node ‘Y’ and ‘X’ is above ‘Y’ in the tree then ‘X’ is a parent of ‘Y’ and ‘Y’ is a child of ‘X’. Similarly if a node ‘X’ is connected to node ‘Y’ and ‘Y’ is connected to ‘Z’, as shown in figure-2 and ‘X’ is above the ‘Y’ and ‘Y’ is above the ‘Z’ then ‘Y’ is a parent of ‘Z’ and ‘X’ is a grand father of ‘Z’ and ‘Z’ is a children of ‘Y’ and a grand children of ‘X’.
Also note that each tree originates from a unique node. In figure-1, the tree originates from a node ‘Radhey’. And such a node, which does not have any parent is called the ‘root’ node. In figure-1, ‘Radhey’ is the root node of the tree.
Figure 2
We can also have a tree as shown in figure- 3.
Figure 3
This tree originates from a unique node, ‘A’, which is the root node of the tree. Such types of trees are normally two-way branching. Before the detail discussion of trees, let us define a tree and some terminology that are often used during the discussion of any type of tree. Consider the figure-4.
Tree
A tree, ‘T’, may be defined as a finite set of one or more nodes such that
- There is a specially designated node, called the root, ‘R’, of the tree.
- The remaining nodes (excluding the root node) are partitioned into n ≥ 0 disjoint sets T1, T2, …., Tn and each of these sets is a tree in turn. The trees T1, T2, …., Tn are called the subtrees of the root.
Node
Each element of a tree is called a node of the tree. A node stands for the item of information plus the branches to other nodes. A tree, shown in figure-4, has 12 nodes and each item of data represent a single letter for convenience.
Degree of a node
The number of subtrees of a node is called its degree. The degree of ‘A’ is 3, of ‘C’ is 1, of ‘D’ is 1 and of ‘G’ is 2.
Leaf Nodes
Nodes that have degree zero are called leaf nodes. ‘F’, ‘I’, ‘J’, ‘K’ and ‘L’ are leaf nodes. Leaf nodes are also called as terminal nodes.
Non-terminal Nodes
Except leaf nodes, all other nodes are referred to as non-terminal nodes. ‘A’, ‘B’, ‘C’, ‘D’, ‘E’, ‘G’ and ‘H’ are non-terminal nodes.
Parent/Child
The subtrees of a node ‘X’ are children of ‘X’. If ‘X’ has two children ‘Y’ and ‘Z’ then ‘X’ is said to be a parent of ‘Y’ and ‘Z’.
Sibling
Children of the same parent are said to be sibling. In figure-4, ‘B’, ‘C’ and ‘D’ are siblings. Similarly ‘J’ and ‘K’ are siblings.
Figure 4
Edge
The links between a parent and its children are also referred to as edges or branches. If ‘X’ is a father of ‘Y’ then we can define an edge as a tuple (X, Y).
Path
A collection of edges connected one node to another is often termed as a path. For example, the collection of edges {(A, C), (C, G), (G, J)} denotes a path from the node ‘A’ to ‘node ‘J’ in the tree of figure-4.
Ancestors of a node
All nodes occurring in the path from a given node ‘n’ to the root of a tree are called ancestors of ‘n’. The ancestors of a node ‘K’ are ‘A’, ‘C’ and ‘G’ and node ‘J’ is descendant of ‘C’.
Level of a node
The level of a node in a tree is defined as – the root of the tree has level 0 and level of any other node in the tree is one more than the level of its father. If a node is at level ‘i’ then its children are at level ‘i+1’. For example in the tree of figure-5, the node ‘E’ is at level 2 and ‘K’ is at level 3.
Figure 5
Depth of a Tree
The depth of a tree is the maximum level of any leaf in the tree. This equals the length of the longest path from the root to any leaf. Thus the depth of the tree shown in figure-5 is 3.
Binary Trees
In fact, binary trees are important types of tree structures that occur very often. In binary trees, any node can have at most two children (which are subtrees). Moreover, children of a node of a binary tree are called left child and right child. Figure-6 shows a typical binary tree.
In figure-6, nodes ‘A’, ‘B’, and ‘C’ have two children each. On the other hand nodes ‘D’ and ‘F’ have only one child each. Node ‘D’ has only right subtree and node ‘F’ has only left subtree. The nodes ‘E’, ‘G’, ‘H’ and ‘I’ have no subtrees and they are leaf nodes in the tree.
Figure 6
Thus a binary tree ‘T’ may also be defined as a finite set of nodes that is either empty or consists of one or more nodes such that there is a specially designated node, called the root of the tree, R, and two disjoint binary trees called the left subtree and the right subtree. A left or right subtree can be empty.
Figure-7 shows some binary trees.
(a) (b) (c) (d)
Figure 7
In figure-7(b), the root of the binary tree has an empty right subtree and in figure-7(c), the root of the binary tree has an empty left subtree. If every non-leaf node in a binary tree has non-empty left and right subtrees, the tree is termed as a strictly binary tree. Figure-7(d) shows a strictly binary tree. Similarly if a binary tree is called a completely binary tree of depth ‘d’ is strictly binary tree all of whose leaves are at level ‘d’. Figure-8 shows a complete binary tree of depth 3. This type of binary tree will be discussed later on.
Figure 8
We can also have some special kinds of binary trees, known as skewed trees. Figure-9(a) and figure-9(b) shows a skewed left binary tree and skewed right subtree respectively.
Figure 9
In a binary tree we can easily find the maximum number of nodes on any level. We know that the root node is the only node on level 1. So the maximum number of nodes on level 1 is 1. Similarly on level 2 we can have maximum two nodes and on level 3 we can have maximum four nodes. Thus on any level, say ‘i’, we can maximum 2i-1 number of nodes for i≥1.
Representation of Binary Trees in C
A binary tree can be represented in following two ways:
(i) Sequential Representation of Binary Trees
(ii) Linked List Representation of Binary Trees
Sequential Representation of Binary Trees
A binary tree can also be represented by using arrays. Consider the binary tree of Figure 10. For this we will use one array of characters ‘data’ and two array of integers ‘lchild’ and ‘rchild’. The array data[] is used to contain the data fields of the nodes of this tree, as shown in Table-1. For example, if data[i] denotes the data field of node, say ‘m’, the lchild[i] would contain the index value of the left child of node ‘m’ on array data. Similarly the rchild[i] would contain the index value of the right child of node ‘m’ on array data.
Figure 10
Table 1
For example, if data[1] is ‘B’, then lchild[1] =3 and rchild[1] = 4. This signifies that the 1st node contains ‘B’ as its data; its left child is the node indexed by ‘3’, that is ‘D’, and right child is the node indexed by ‘4’, that is ‘E’. Assume that by convention, the root node is always obtained at an index 0 in this array.
We can also represent a binary tree using a single array. In this we use the numbering scheme where the nodes are numbered sequentially level by level. Nodes in the same level are numbered from left to right. Figure 11 shows the level-wise numbering of the nodes of the tree of figure10. The numbers are enclosed in circular figure.
Figure 11
While numbering nodes one should also number the empty nodes. This figure shows that for i=0, the left child is at 1 and the right child is at 2. Similarly for i=1, the left child is at 3 and the right child is at 4. Thus we can say that if the number appearing against one node ‘n’ is ‘i’ then the number appearing against the left and right children of ‘n’ are ‘2i+1’ and ‘2i+2’ respectively. Now if we store these nodes in an array tree[] then the number appearing against the nodes may act as indices of the nodes in this array. Table-2 shows the array representation for tree shown in figure-11.10.
Table 2
In table-2, unutilized space is marked by dash entries. For complete binary trees, the array representation is ideal, as no space is wasted. However if the tree is not complete binary tree then there will be a lot of unutilized space. Therefore array representation will not always be efficient in terms of space. Another limitation of array representation of binary tree is that insertion and deletion of nodes from the middle of a tree require a lot of movement of many nodes to reflect the change in level of these nodes. These problems are overcome by linked representation of binary trees.
Linked List Representation of Binary Trees
We can also represent a binary tree using a linked list. In a linked list representation of binary trees, each node has three fields – lchild, data and rchild, as shown in figure-12. The ‘lchild’ pointer points to the left subtree and the ‘rchild’ pointer points to the right subtree.
Figure 12
A straightforward implementation of this representation can be done by using pointers to implement the links.
Here is the ‘C’ declaration of a binary node:
struct TreeNode
{
struct TreeNode *lchild;
int data;
struct TreeNode *rchild;
};
typedef struct TreeNode TreeNode;
Now we will assume that ‘Root’ will be point to the root node of the tree. If ‘Root’ is Null then the tree is empty. The root node of a binary tree is declared as:
TreeNode *Root;
The linked list representation of the binary tree shown in figure-10 is given in figure-13.
However we can also implement the linked representation using array elements. Using array implementation, we may declare it as:
struct TreeNode
{
int lchild;
int data;
int rchild;
};
struct TreeNode Tree[N];
where ‘N’ specifies the total number of nodes in a binary tree. But unfortunately this array representation suffers from its static image. It means that we can neither shrink nor expand its size during the execution of a program. For the further discussion of binary tree we will consider the linked list representation of binary tree using dynamic nodes (pointers).
Figure 13
Binary Tree Traversals
There are many operations that we want to perform on binary trees. Traversing of a binary tree is one of the most important operations required to be done on a binary tree. Traversing of a binary tree is a process of visiting each node in the tree exactly once. If there are ‘n’ nodes in a binary tree then there are n! different orders in which they could be visited, but most of these have little regulatory or pattern. However a good traversal technique visits the nodes of a tree in some linear sequence. At a given node, then, there are three tasks we shall wish to do in some order: we may visit the node itself, we may traverse its left subtree; and we may traverse its right subtree. If we name these three tasks V, L and R for visiting the node, moving left and moving right respectively, then there are six possible combinations of traversal:
LVR LRV VLR VRL RVL RLV
If it is conventionally assumed that visiting the left subtree always precedes visiting the right subtree, then only three traversal techniques remain, LVR, LRV and VLR, which are known as inorder, postorder and preorder respectively. These names are chosen according to the step at which the given node is visited. With inorder traversal, the node is visited between the subtrees, with postorder it is visited after both the subtrees and with preorder the node is visited before the subtrees.
Actually the choice of the name inorder, posrorder and preorder is not accidental, rather relates closely to producing the infix, postfix, and prefix forms of an expression. A tress representing an expression is known as an expression tree. An expression tree is constituted from the simple operands as the leaf nodes of a binary tree, and the operators as the non-terminal nodes. Figure-14 shows a simple expression tree.
Figure 14
This expression tree consists of binary expression. However for a unary operator, one subtree will be empty. At this stage we will not worry for how this binary tree was formed but assume that it is available. Now we will use this tree to illustrate each of three familiar traversal techniques.
Inorder Traversal
If a binary tree ‘T’ is traversed in an inorder form then we move down the tree towards the left until we can go no farther. Then we visit the node, move one node to the right and continue the same process for the left subtree of right node. However if we can not move to the right, go back one more node.
Here is the algorithm which traverses a binary tree in an inorder fashion.
Algorithm : Inorder
Inorder(Root)
Here ‘Root’ contains the address of the root node.
Perform these steps if Root is not equal to Null.
Step-1 : Call Inorder(Root->lchild)
Step-2 : Print the value of data of Root node
Step-3 : Call Inorder(Root->rchild)
Step-4 : Exit
The ‘C’ implementation of this algorithm is as:
Implementation : Inorder
Inorder(TreeNode *Root)
{
if (Root != NULL)
{
Inorder(Root->lchild);
printf(“\n%c”, Root->data);
Inorder(Root->rchild);
}
}
If this function is executed on the tree of figure-14 then the following output would be printed:
A + B * C / D
Since this traversal gives the infix form of an expression, that’s why it is called as inorder traversal.
Postorder Traversal
If a binary tree ‘T’ is traversed in a postorder form then the left subtree of ‘T’ is traversed first, then the right subtree of ‘T’ is traversed and finally the root node of ‘T’ is visited.
Here is the algorithm which traverses a binary tree in postorder fashion.
Algorithm : Postorder
Postorder(Root)
Here ‘Root’ contains the address of the root node.
Perform these steps if Root is not equal to Null.
Step-1 : Call Postorder(Root->lchild)
Step-2 : Call Postorder(Root->rchild)
Step-3 : Print the value of data of Root node
Step-4 : Exit
The ‘C’ implementation of this algorithm is as:
Implementation : Postorder
Postorder(TreeNode *Root)
{
if (Root != NULL)
{
Postorder(Root->lchild);
Postorder(Root->rchild);
printf(“\n%c”, Root->data);
}
}
If this function is executed on the tree of figure-11.14 then the following output would be printed:
A B C * D / +
Since this traversal gives the postfix form of an expression, that’s why it is called as postorder traversal.
Preorder Traversal
If a binary tree ‘T’ is traversed in a preorder form then root node is visited first, then the left subtree of ‘T’ is traversed first, and then finally the right subtree of ‘T’ is traversed.
Here is the algorithm which traverses a binary tree in preorder fashion.
Algorithm : Preorder
Preorder(Root)
Here ‘Root’ contains the address of the root node.
Perform these steps if Root is not equal to Null.
Step-1 : Print the value of data of Root node
Step-2 : Call Preorder(Root->lchild)
Step-3 : Call Preorder(Root->rchild)
Step-4 : Exit
The ‘C’ implementation of this algorithm is as:
Implementation : Preorder
Preorder(TreeNode *Root)
{
if (Root != NULL)
{
printf(“\n%c”, Root->data);
Preorder(Root->lchild);
Preorder(Root->rchild);
}
}
If this function is executed on the tree of figure-14 then the following output would be printed:
/ + A * B C D
Since this traversal gives the prefix form of an expression, that’s why it is called as preorder traversal.
Non-recursive Traversal of a Binary Tree
We can also implement a non-recursive traversal of a binary tree using a stack. First we consider transformation of the recursive function Inorder() in which we traverse the binary tree in an ‘inorder’ fashion.
Here is the non-recursive version of the inorder traversal.
NonRecInorder(TreeNode *Root)
{
TreeStackNode *s;
Initialize(&s);
while (1)
{
while (Root != NULL)
{
PushNode(&s, Root);
Root = Root->lchild;
}
if (IsEmpty(s))
return;
temp = PopNode(&s);
printf(“%c ”, temp->data;
Root = Root->rchild;
}
}
In this function we have used a linked stack ‘s’ and a few functions, such as Initialize, PushNode, PopNode and IsEmpty, while implementing a linked stack.
Here is the ‘C’ declaration of a node for above linked stack:
struct TreeStackNode
{
struct TreeNode *address;
char data;
struct TreeStackNode *link;
};
typedef struct TreeStackNode TreeStackNode;
The operations to be done on such a stack are defined as:
Initialize(TreeStackNode **s)
{
*s = NULL;
}
PushNode(TreeStackNode **s, TreeNode *Root)
{
TreeStackNode *temp;
temp = getnode();
temp->address = Root;
temp->link = (*s);
}
PopNode(TreeStackNode **s)
{
TreeStackNode *temp;
temp = (*s);
(*s) = (s)->link;
return (temp->address);
}
IsEmpty(TreeStackNode *s)
{
if (s == NULL)
return 1;
else
return 0;
}
TreeStackNode *getnode()
{
TreeStackNode *temp;
temp = (TreeStackNode *)malloc(sizeof(TreeStackNode));
return temp;
}
Similarly we can write non-recursive traversal functions for preorder and postorder. Since a recursive tree traversal techniques also implicitly use a stack, therefore now the question arises – “Is binary tree traversal possible without the use of extra space for a stack? Of course we can have a binary tree traversal algorithm without the use of stack by adding a new ‘parent’ field to each node. The ‘parent’ field is used to contain the address of parent of current node. Using this field we can trace our way back to any root and down again.