Unit - 4
Trees
Each node in the below structure is labeled with a number. Each arrow in the diagram above represents a link between the two nodes.
Fig: tree
Root - In a tree hierarchy, the root node is the highest node. To put it another way, the root node is the one that has no children. The root node of the tree is numbered 1 in the above structure. A parent-child relationship exists when one node is directly linked to another node.
Child node - If a node is a descendant of another node, it is referred to as a child node.
Parent - If a node has a sub-node, the parent of that sub-node is said to be that node.
Sibling - Sibling nodes are those that have the same parent.
Leaf node - A leaf node is a node in the tree that does not have any descendant nodes. A leaf node is the tree's bottommost node. In a generic tree, there can be any number of leaf nodes. External nodes are also known as leaf nodes.
Internal node - An internal node is a node that has at least one child node.
Ancestor node - Any previous node on a path from the root to that node is an ancestor of that node. There are no ancestors for the root node. Nodes 1, 2, and 5 are the ancestors of node 10 in the tree depicted above.
Descendant node - A descendant of a node is the direct successor of the supplied node. In the diagram above, node 10 is a descendent of node 5.
Binary trees
The Binary tree means that the node can have a maximum of two children. Here, the binary name itself suggests 'two'; therefore, each node can have either 0, 1 or 2 children.
Let's understand the binary tree through an example.
Fig: Example
The above tree is a binary tree because each node contains the utmost two children. The logical representation of the above tree is given below:
Fig: Logical representation
In the above tree, node 1 contains two pointers, i.e., left and a right pointer pointing to the left and right node respectively. The node 2 contains both the nodes (left and right node); therefore, it has two pointers (left and right). The nodes 3, 5 and 6 are the leaf nodes, so all these nodes contain NULL pointer on both left and right parts.
Properties of Binary Tree
● At each level of i, the maximum number of nodes is 2i.
● The height of the tree is defined as the longest path from the root node to the leaf node. The tree which is shown above has a height equal to 3. Therefore, the maximum number of nodes at height 3 is equal to (1+2+4+8) = 15. In general, the maximum number of nodes possible at height h is (20 + 21 + 22+….2h) = 2h+1 -1.
● The minimum number of nodes possible at height h is equal to h+1.
● If the number of nodes is minimum, then the height of the tree would be maximum. Conversely, if the number of nodes is maximum, then the height of the tree would be minimum.
If there are 'n' number of nodes in the binary tree.
The minimum height can be computed as:
As we know that,
n = 2h+1 -1
n+1 = 2h+1
Taking log on both the sides,
Log2(n+1) = log2(2h+1)
Log2(n+1) = h+1
h = log2(n+1) - 1
The maximum height can be computed as:
As we know that,
n = h+1
h= n-1
Types of Binary Tree
There are four types of Binary tree:
● Full/ proper/ strict Binary tree
● Complete Binary tree
● Perfect Binary tree
● Degenerate Binary tree
● Balanced Binary tree
Key takeaway
The Binary tree means that the node can have a maximum of two children. Here, binary name itself suggests 'two'; therefore, each node can have either 0, 1 or 2 children.
Traversal is a process to visit all the nodes of a tree and may print their values too. Because all nodes are connected via edges (links) we always start from the root (head) node. That is, we cannot randomly access a node in a tree. There are three ways which we use to traverse a tree −
● In-order Traversal
● Pre-order Traversal
● Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree or to print all the values it contains.
In-order Traversal
In this traversal method, the left subtree is visited first, then the root and later the right subtree. We should always remember that every node may represent a subtree itself.
If a binary tree is traversed in-order, the output will produce sorted key values in an ascending order.
Fig: Inorder
We start from A, and following in-order traversal, we move to its left subtree B. B is also traversed in-order. The process goes on until all the nodes are visited. The output of inorder traversal of this tree will be −
D → B → E → A → F → C → G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally the right subtree.
Fig: Pre order
We start from A, and following pre-order traversal, we first visit A itself and then move to its left subtree B. B is also traversed pre-order. The process goes on until all the nodes are visited. The output of pre-order traversal of this tree will be −
A → B → D → E → C → F → G
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we traverse the left subtree, then the right subtree and finally the root node.
Fig: Post order
We start from A, and following Post-order traversal, we first visit the left subtree B. B is also traversed post-order. The process goes on until all the nodes are visited. The output of post-order traversal of this tree will be −
D → E → B → F → G → C → A
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
When only one thread is used in a threaded binary tree, it is referred to as a one way threaded tree, and when both threads are used, it is referred to as a two way threaded tree. When the pointer is set to the root node, If neither an in-order predecessor nor an in-order successor can be found.
Here we will see the threaded binary tree data structure. We know that the binary tree nodes may have at most two children. But if they have only one child, or no children, the link part in the linked list representation remains null. Using threaded binary tree representation, we can reuse that empty link by making some threads.
If one node has some vacant left or right child area, that will be used as thread. There are two types of threaded binary tree. The single threaded tree or fully threaded binary tree. In single threaded mode, there are another two variations. Left threaded and right threaded.
In the left threaded mode if some node has no left child, then the left pointer will point to its inorder predecessor, similarly in the right threaded mode if some node has no right child, then the right pointer will point to its inorder successor. In both cases, if no successor or predecessor is present, then it will point to the header node.
For fully threaded binary tree, each node has five fields. Three fields like normal binary tree node, another two fields to store Boolean value to denote whether link of that side is actual link or thread.
One way threaded tree
● A node's empty left child field can be used to reference its inorder predecessor.
● In the same way, a node's empty right child field can be used to point to its in-order successor.
● A one-way threaded binary tree is one such form of binary tree.
● Thread is a field that stores the address of its in-order successor.
A single-threaded tree is also known as a one-way threaded tree. If the thread is found in the left field, the left field will be changed to point to the node's in-order predecessor. A left-threaded binary tree is a one-way threaded tree with only one direction of threading.
Example
A binary tree with one-way threading and its linked form are shown in the diagram.
Because node5's RIGHT field has a NULL pointer, it will be replaced to point to node1, which is its in-order successor. Similarly, node 8's RIGHT field will point to node 4, node 9's RIGHT field will point to node 2, node 10's RIGHT field will point to node 6, node 11's RIGHT field will point to node 3, and node 12's RIGHT field will contain NULL because it has no in order successor.
Fig: one way threaded tree
Two way binary threaded tree
● A threaded binary tree is one example of this sort of binary tree.
● Thread is a field that stores the address of its inorder successor or predecessor.
● A node's empty left child field can be used to reference its inorder predecessor.
● In the same way, a node's empty right child field can be used to point to its inorder successor.
Figure shows a binary tree without threading and its corresponding linked representation.
The in-order traversal of the tree is given as 8, 4, 9, 2, 5, 1, 10, 6, 11,3, 7, 12
Fig: two way binary tree
Key takeaway
Here we will see the threaded binary tree data structure. We know that the binary tree nodes may have at most two children.
But if they have only one children, or no children, the link part in the linked list representation remains null.
Using threaded binary tree representation, we can reuse that empty links by making some threads.
A binary search tree is a type of binary tree in which the nodes are organized in a predetermined sequence. This is referred to as an ordered binary tree.
The value of all nodes in the left sub-tree is less than the value of the root in a binary search tree. Similarly, the value of all nodes in the right subtree is larger than or equal to the root's value. This rule will be applied recursively to all of the root's left and right subtrees.
Fig: Binary search tree
The image above depicts a binary search tree. We can see that the root node 30 doesn't have any values more than or equal to 30 in its left sub-tree, and it also doesn't have any values less than 30 in its right subtree, thanks to the restriction given to the BST.
Valid BSTs:
Fig. Examples of Valid BSTs
Invalid BSTs:
Fig. Examples of Invalid BSTs
Examine the BSTs in Fig. The first tree, Fig. (a) breaks the second rule: The left node 30 is greater than that of its root node 20. The second tree, Fig. (b) breaks the third rule: The right node 20 is less than that of its root node 50. The third tree, Fig. (c) breaks the second rule: The left node 25 is greater than that of its root node 10. It also breaks the third rule: The right node 08 is less than that of its root node 15.
A Binary Search Tree is a binary tree with the following properties:
1. It may be empty and every node has an identifier.
2. All nodes in left subtree are less than the root.
2. All nodes in right subtree are greater than the root.
4. Each subtree is itself a BST.
Advantages
- In a binary search tree, searching becomes incredibly efficient since we get a hint at each step about which sub-tree has the sought element.
- In comparison to arrays and linked lists, the binary search tree is a more efficient data structure. Every stage of the search procedure destroys half of the sub-tree. In a binary search tree, searching for an element takes o(log2n) time.
- It also speed up the insertion and deletion operations as compared to that in array and linked list.
Operation on binary search tree
Searching - In a binary search tree, finding the position of a certain element.
Insertion - Adding a new element to the binary search tree in the proper location to ensure that the BST property is not violated.
Deletion - In a binary search tree, removing a specific node. However, depending on the amount of children a node has, there can be a variety of deletion scenarios.
Key takeaway
A binary search tree is a type of binary tree in which the nodes are organized in a predetermined sequence. This is referred to as an ordered binary tree.
The value of all nodes in the left sub-tree is less than the value of the root in a binary search tree.
A height balanced binary tree is another name for a balanced binary tree. When the height difference between the left and right subtrees is less than m, where m is usually equal to 1, it is called a binary tree. The number of edges on the longest path between the root node and the leaf node determines a tree's height.
A binary search tree is one in which each node on the left side has a lower value than its parent node, while the node on the right side has a higher value. The root node in the above tree is n1, while the leaf nodes are n4, n6, and n7. The n7 node is the node that is the furthest away from the root node. There are two edges on the n4 and n6 nodes, and three edges between the root node and the n7 node. The height of the above tree is 3 since n7 is the farthest from the root node.
We'll now see if the tree above is balanced or not. The nodes n2, n4, n5, and n7 are found on the left subtree, while the nodes n3 and n6 are found on the right subtree. Two leaf nodes, n4 and n7, are found in the left subtree. Because there is just one edge connecting nodes n2 and n4 and two between nodes n7 and n2, node n7 is the furthest away from the root node.
The left subtree has a height of 2. The right subtree has only one leaf node, n6, and only one edge; thus, the right subtree's height is 1. The difference in height between the left and right subtrees is one. We may say that the above tree is a height-balanced tree because we received the value 1. This method of computing the height difference should be repeated for each node, such as n2, n3, n4, n5, n6, and n7. We may claim that the above tree is a balanced binary tree because the value of k is not greater than one when we process each node.
The leaf nodes in the above tree are n6, n4, and n3, with n6 being the farthest node from the root node. Because there are three edges between the root node and the leaf node, the height of the above tree is three. When we take n1 to be the root node, the nodes n2, n4, n5, and n6 are found in the left subtree, whereas the node n3 is found in the right subtree. n2 is a root node in the left subtree, while n4 and n6 are leaf nodes. Because n6 is the farthest node from its root node among the n4 and n6 nodes, and n6 has two edges, the height of the left subtree is 2.
Because there are no children on the left and right sides of the right subtree, its height is 0. Because the left subtree's height is 2 and the right subtree's height is 0, the height difference between the left and right subtrees is 2. The difference in height between the left and right subtrees must not exceed one, according to the definition. The difference in this situation is 2, which is more than 1, making the binary tree above an imbalanced binary search tree.
Key takeaway
A height balanced binary tree is another name for a balanced binary tree. When the height difference between the left and right subtrees is less than m, where m is usually equal to 1, it is called a binary tree.
AVL Trees (Adelsion Velski & Landis):
AVL Tree was invented by GM Adelson - Velsky and EM Landis in 1962. The tree is named AVL in honour of its inventors.
AVL Tree can be defined as a height balanced binary search tree in which each node is associated with a balance factor which is calculated by subtracting the height of its right subtree from that of its left sub-tree.
Tree is said to be balanced if the balance factor of each node is in between -1 to 1, otherwise, the tree will be unbalanced and need to be balanced.
Balance Factor (k) = height (left(k)) - height (right(k))
If the balance factor of any node is 1, it means that the left sub-tree is one level higher than the right subtree.
If the balance factor of any node is 0, it means that the left subtree and right subtree contain equal height.
If the balance factor of any node is -1, it means that the left sub-tree is one level lower than the right subtree.
An AVL tree is given in the following figure. We can see that, balance factor associated with each node is in between -1 and +1. Therefore, it is an example of AVL tree.
Fig: AVL Tree
Complexity
Algorithm | Average case | Worst case |
Space | o(n) | o(n) |
Search | o(log n) | o(log n) |
Insert | o(log n) | o(log n) |
Delete | o(log n) | o(log n) |
Operations on AVL tree
Due to the fact that, AVL tree is also a binary search tree therefore, all the operations are performed in the same way as they are performed in a binary search tree. Searching and traversing do not lead to the violation in property of AVL tree. However, insertion and deletion are the operations which can violate this property and therefore, they need to be revisited.
SN | Operation | Description |
1 | Insertion | Insertion in AVL tree is performed in the same way as it is performed in a binary search tree. However, it may lead to violation in the AVL tree property and therefore the tree may need balancing. The tree can be balanced by applying rotations. |
2 | Deletion | Deletion can also be performed in the same way as it is performed in a binary search tree. Deletion may also disturb the balance of the tree therefore, various types of rotations are used to rebalance the tree. |
Why AVL Tree?
AVL tree controls the height of the binary search tree by not letting it to be skewed. The time taken for all operations in a binary search tree of height h is O(h). However, it can be extended to O(n) if the BST becomes skewed (i.e. worst case). By limiting this height to log n, AVL tree imposes an upper bound on each operation to be O(log n) where n is the number of nodes.
AVL Rotations
We perform rotation in AVL tree only in case if Balance Factor is other than -1, 0, and 1. There are basically four types of rotations which are as follows:
- L L rotation: Inserted node is in the left subtree of left subtree of A
- R R rotation: Inserted node is in the right subtree of right subtree of A
- L R rotation: Inserted node is in the right subtree of left subtree of A
- R L rotation: Inserted node is in the left subtree of right subtree of A
Where node A is the node whose balance Factor is other than -1, 0, 1.
The first two rotations LL and RR are single rotations and the next two rotations LR and RL are double rotations. For a tree to be unbalanced, minimum height must be at least 2, Let us understand each rotation
1. RR Rotation
When BST becomes unbalanced, due to a node is inserted into the right subtree of the right subtree of A, then we perform RR rotation, RR rotation is an anticlockwise rotation, which is applied on the edge below a node having balance factor -2
Fig 12: RR rotation
In above example, node A has balance factor -2 because a node C is inserted in the right subtree of A right subtree. We perform the RR rotation on the edge below A.
2. LL Rotation
When BST becomes unbalanced, due to a node is inserted into the left subtree of the left subtree of C, then we perform LL rotation, LL rotation is clockwise rotation, which is applied on the edge below a node having balance factor 2.
Fig 13: LL rotations
In above example, node C has balance factor 2 because a node A is inserted in the left subtree of C left subtree. We perform the LL rotation on the edge below A.
3. LR Rotation
Double rotations are bit tougher than single rotation which has already explained above. LR rotation = RR rotation + LL rotation, i.e., first RR rotation is performed on subtree and then LL rotation is performed on full tree, by full tree we mean the first node from the path of inserted node whose balance factor is other than -1, 0, or 1.
Let us understand each and every step very clearly:
State | Action |
A node B has been inserted into the right subtree of A the left subtree of C, because of which C has become an unbalanced node having balance factor 2. This case is L R rotation where: Inserted node is in the right subtree of left subtree of C | |
As LR rotation = RR + LL rotation, hence RR (anticlockwise) on subtree rooted at A is performed first. By doing RR rotation, node A, has become the left subtree of B. | |
After performing RR rotation, node C is still unbalanced, i.e., having balance factor 2, as inserted node A is in the left of left of C | |
Now we perform LL clockwise rotation on full tree, i.e. on node C. Node C has now become the right subtree of node B, A is left subtree of B | |
Balance factor of each node is now either -1, 0, or 1, i.e. BST is balanced now. |
4. RL Rotation
As already discussed, that double rotations are bit tougher than single rotation which has already explained above. R L rotation = LL rotation + RR rotation, i.e., first LL rotation is performed on subtree and then RR rotation is performed on full tree, by full tree we mean the first node from the path of inserted node whose balance factor is other than -1, 0, or 1.
State | Action |
A node B has been inserted into the left subtree of C the right subtree of A, because of which A has become an unbalanced node having balance factor - 2. This case is RL rotation where: Inserted node is in the left subtree of right subtree of A | |
As RL rotation = LL rotation + RR rotation, hence, LL (clockwise) on subtree rooted at C is performed first. By doing RR rotation, node C has become the right subtree of B. | |
After performing LL rotation, node A is still unbalanced, i.e. having balance factor -2, which is because of the right-subtree of the right-subtree node A. | |
Now we perform RR rotation (anticlockwise rotation) on full tree, i.e. on node A. Node C has now become the right subtree of node B, and node A has become the left subtree of B. | |
Balance factor of each node is now either -1, 0, or 1, i.e., BST is balanced now. |
Q: Construct an AVL tree having the following elements
H, I, J, B, A, E, C, F, D, G, K, L
1. Insert H, I, J
On inserting the above elements, especially in the case of H, the BST becomes unbalanced as the Balance Factor of H is -2. Since the BST is right-skewed, we will perform RR Rotation on node H.
The resultant balance tree is:
2. Insert B, A
On inserting the above elements, especially in case of A, the BST becomes unbalanced as the Balance Factor of H and I is 2, we consider the first node from the last inserted node i.e. H. Since the BST from H is left-skewed, we will perform LL Rotation on node H.
The resultant balance tree is:
3. Insert E
On inserting E, BST becomes unbalanced as the Balance Factor of I is 2, since if we travel from E to I we find that it is inserted in the left subtree of right subtree of I, we will perform LR Rotation on node I. LR = RR + LL rotation
3 a) We first perform RR rotation on node B
The resultant tree after RR rotation is:
3b) We first perform LL rotation on the node I
The resultant balanced tree after LL rotation is:
4. Insert C, F, D
On inserting C, F, D, BST becomes unbalanced as the Balance Factor of B and H is -2, since if we travel from D to B we find that it is inserted in the right subtree of left subtree of B, we will perform RL Rotation on node I. RL = LL + RR rotation.
4a) We first perform LL rotation on node E
The resultant tree after LL rotation is:
4b) We then perform RR rotation on node B
The resultant balanced tree after RR rotation is:
5. Insert G
On inserting G, BST become unbalanced as the Balance Factor of H is 2, since if we travel from G to H, we find that it is inserted in the left subtree of right subtree of H, we will perform LR Rotation on node I. LR = RR + LL rotation.
5 a) We first perform RR rotation on node C
The resultant tree after RR rotation is:
5 b) We then perform LL rotation on node H
The resultant balanced tree after LL rotation is:
6. Insert K
On inserting K, BST becomes unbalanced as the Balance Factor of I is -2. Since the BST is right-skewed from I to K, hence we will perform RR Rotation on the node I.
The resultant balanced tree after RR rotation is:
7. Insert L
On inserting the L tree is still balanced as the Balance Factor of each node is now either, -1, 0, +1. Hence the tree is a Balanced AVL tree
Key takeaway
AVL Tree was invented by GM Adelson - Velsky and EM Landis in 1962. The tree is named AVL in honour of its inventors.
AVL Tree can be defined as height balanced binary search tree in which each node is associated with a balance factor which is calculated by subtracting the height of its right subtree from that of its left sub-tree.
An extended binary tree is a sort of binary tree in which all of the original tree's null subtrees are replaced with special nodes known as external nodes, while other nodes are known as internal nodes.
The internal nodes are represented by circles, whereas the external nodes are represented by boxes.
- Internal nodes are those from the original tree, whereas external nodes are those from the special tree.
- Internal nodes are non-leaf nodes, while all external nodes are leaf nodes.
- Every exterior node is a leaf, and every internal node has precisely two children.
- It shows the outcome, which is a full binary tree.
It is not required that all nodes have the same number of children, but each node must include m/2 nodes.
The following graphic depicts a B tree of order 4.
Fig: B tree
Any attribute of B Tree may be violated while performing some operations on it, such as the number of minimum children a node can have. The tree may split or unite in order to keep the features of B Tree.
Application of B tree
Because accessing values held in a huge database that is saved on a disk is a very time consuming activity, B tree is used to index the data and enable fast access to the actual data stored on the disks.
In the worst situation, searching an unindexed and unsorted database with n key values takes O(n) time. In the worst-case scenario, if we use B Tree to index this database, it will be searched in O(log n) time.
Operations
Searching
B Trees are comparable to Binary Search Trees in terms of searching. For example, let's look for item 49 in the B Tree below. The procedure will go somewhat like this:
- Compare item 49 to node 78, which is the root node. Move to the left sub-tree since 49 78 onwards.
- Since 404956, move to the right subtree of 40. 49>45, move to the right subtree of 40.
- Compare and contrast 49.
- Return if a match is found.
The height of the tree affects the search in a B tree. To search any element in a B tree, the search technique requires O(log n) time.
Inserting
At the leaf node level, insertions are made. In order to place an item into B Tree, the following algorithm must be followed.
- Navigate the B Tree until you find the suitable leaf node to insert the node at.
- If there are fewer than m-1 keys in the leaf node, insert the element in ascending order.
- Otherwise, if the leaf node contains m-1 keys, proceed to the next step.
● In the rising sequence of elements, add the new element.
● At the middle, divide the node into two nodes.
● The median element should be pushed up to its parent node.
● If the parent node has the same m-1 number of keys as the child node, split it as well.
Example:
Insert the node 8 into the B Tree of order 5 shown in the following image.
8 will be inserted to the right of 5, therefore insert 8.
The node, now contain 5 keys which is greater than (5 -1 = 4 ) keys. Therefore split the node from the median i.e. 8 and push it up to its parent node shown as follows.
Deletion
At the leaf nodes, deletion is also conducted. It can be a leaf node or an inside node that needs to be destroyed. In order to delete a node from a B tree, use the following algorithm.
- Locate the leaf node.
- If the leaf node has more than m/2 keys, delete the required key from the node.
- If the leaf node lacks m/2 keys, fill in the gaps with the element from the eighth or left sibling.
● If the left sibling has more than m/2 elements, shift the intervening element down to the node where the key is deleted and push the largest element up to its parent.
● If the right sibling has more than m/2 items, shift the intervening element down to the node where the key is deleted and push the smallest element up to the parent.
4. Create a new leaf node by merging two leaf nodes and the parent node's intervening element if neither sibling has more than m/2 elements. If the parent has less than m/2 nodes, repeat the operation on the parent as well.
5. If the node to be destroyed is an internal node, its in-order successor or predecessor should be used instead. The process will be same as the node being deleted from the leaf node because the successor or predecessor will always be on the leaf node.
Example
Delete the node 53 from the B Tree of order 5 shown in the following figure.
53 is present in the right child of element 49. Delete it.
Now, 57 is the only element which is left in the node, the minimum number of elements that must be present in a B tree of order 5, is 2. It is less than that, the elements in its left and right sub-tree are also not sufficient therefore, merge it with the left sibling and intervening element of parent i.e. 49.
The final B tree is shown as follows.
Key takeaway
An extended binary tree is a sort of binary tree in which all of the original tree's null subtrees are replaced with special nodes known as external nodes, while other nodes are known as internal nodes.
A B+ tree is an extension of a B tree that improves the efficiency of search, insert, and delete operations. We know that B trees enable data pointers and key values in both internal and leaf nodes; nevertheless, this is clearly a disadvantage for B trees because it reduces the ability to insert nodes at a certain level, resulting in an increase in node levels, which is clearly undesirable. The data pointers are only stored at the leaf node level in B+ trees, while the key values are only stored in the internal nodes.
It's also worth noting that the nodes at the leaf level are linked to one another, making traversing the data points simple and efficient.
When we need to store a huge amount of data in the main memory, B+ trees come in helpful. Because we know the main memory is limited, we'll utilise B+ trees, in which the internal nodes that store the key (to access the records) are kept in the main memory, while the leaf nodes that hold the data pointers are kept in the secondary memory.
The following is a visual illustration of a B+ tree:
Why B+ trees?
● B+ trees keep track of records that can be retrieved with an equal number of disc accesses later.
● Even if the amount of data stored in them is the same, the height of the B+ tree remains balanced and quite low when compared to B trees.
● Accessing records is much easier when there are fewer tiers.
● We can simply search elements in a sequential manner because the leaf nodes are connected to each other like a linked list.
Fig: Example of B+ trees
Basic operations such as searching, inserting, and deleting are supported. The item will be sorted in each node. Before and after the element in position I there is a child element. As a result, children who were present earlier will have lesser values, whereas children who are present now will have larger values.
Advantages
● In an equal number of disc accesses, records can be retrieved.
● In comparison to B tree, the tree's height maintains balanced and is lower.
● The data stored in a B+ tree can be accessed both sequentially and directly.
● Indexing is done via keys.
● Because the data is only stored on the leaf nodes, search queries are faster.
Inserting in B+ trees
● Conduct a search operation in the B+ tree to determine the best bucket position for this new node.
● If the bucket isn't full (and it doesn't contradict the B+ tree property), put that node in it.
● Otherwise, split the nodes into two and push the middle node (the median node) to the parent node before inserting the new node.
● If the parent node is present but the current node is still full, repeat the procedures above.
Example:
Insert the value 195 into the B+ tree of order 5 shown in the following figure.
195 will be inserted in the right sub-tree of 120 after 190. Insert it at the desired position.
The node contains greater than the maximum number of elements i.e. 4, therefore split it and place the median node up to the parent.
Now, the index node contains 6 children and 5 keys which violates the B+ tree properties, therefore we need to split it, shown as follows.
Deletion in B+ Tree
Step 1: Delete the key and data from the leaves.
Step 2: if the leaf node contains less than minimum number of elements, merge down the node with its sibling and delete the key in between them.
Step 3: if the index node contains less than minimum number of elements, merge the node with the sibling and move down the key in between them.
Example
Delete the key 200 from the B+ Tree shown in the following figure.
200 is present in the right sub-tree of 190, after 195. Delete it.
Merge the two nodes by using 195, 190, 154 and 129.
Now, element 120 is the single element present in the node which is violating the B+ Tree properties. Therefore, we need to merge it by using 60, 78, 108 and 120.
Now, the height of B+ tree will be decreased by 1.
Searching in B+ tree:
In a B+ tree, searching is equivalent to searching in a BST tree. If the current value is less than the searching key, go to the left subtree; if it is greater, go to the current bucket(node) and see where the optimum placement is.
To grasp the searching technique, consider the diagram of a B+ tree below. Let's say we want to find a key in the B+ tree below that has a value of 59.
B Tree VS B+ Tree
SN | B Tree | B+ Tree |
1 | Search keys can not be repeatedly stored. | Redundant search keys can be present. |
2 | Data can be stored in leaf nodes as well as internal nodes | Data can only be stored on the leaf nodes. |
3 | Searching for some data is a slower process since data can be found on internal nodes as well as on the leaf nodes. | Searching is comparatively faster as data can only be found on the leaf nodes. |
4 | Deletion of internal nodes are so complicated and time consuming. | Deletion will never be a complexed process since element will always be deleted from the leaf nodes. |
5 | Leaf nodes cannot be linked together. | Leaf nodes are linked together to make the search operations more efficient. |
References:
- Data Structures and Algorithms in C, M.T. Goodrich, R. Tamassia and D. Mount, Wiley India.
- Data Structures Using C – A.M. Tenenbaum (PHI)
- Data Structures and Algorithm Analysis in C, 3rd Edition, M.A. Weiss, Pearson.
- Classic Data Structures, D. Samanta, 2nd Edition, PHI.
- Data Structure Using C by Pankaj Kumar Pandey.