Unit 1 | unit 1 data structures

DSA

Unit - 1

Data Structures

1.1 Introduction to Data Structures

A data structure is a collection of data pieces that provides an efficient method for storing and organizing data in a computer so that it may be used effectively. Arrays, Linked Lists, Stacks, Queues, and other Data Structures are examples. Data Structures are employed in practically every element of computer science, including operating systems, compiler design, artificial intelligence, graphics, and many more applications.

Data Structures are an important aspect of many computer science algorithms because they allow programmers to efficiently handle data. It is critical to improving the performance of a software or program because the software's primary duty is to save and retrieve user data as quickly as feasible.

Basic terminology

The building blocks of every program or software are data structures. The most challenging challenge for a programmer is deciding on a suitable data structure for a program. When it comes to data structures, the following terminology is utilized.

Data: The data on a student can be defined as an elementary value or a set of values, for example, the student's name and id.

Group items: Group items are data items that include subordinate data items, such as a student's name, which can have both a first and a last name.

Record: A record can be characterized as a collection of several data objects. For example, if we consider the student entity, the name, address, course, and grades can all be gathered together to constitute the student's record.

File: A file is a collection of various records of one type of entity; for example, if there are 60 employees in the class, the relevant file will contain 20 records, each containing information about each employee.

Attribute and Entity: A class of objects is represented by an entity. It has a number of characteristics. Each attribute represents an entity's specific property.

Key takeaway

A data structure is a collection of data pieces that provides an efficient method for storing and organizing data in a computer so that it may be used effectively.

1.2 Need of Data Structure

As applications become more complicated and the amount of data collected grows, the following issues may arise:

● Processor speed: High-speed processing is essential to manage vast amounts of data, however as data grows by the day to billions of files each entity, processors may be unable to handle that much data.

● Data Search: Consider a store with a 106-item inventory. If our programme needs to search for a specific item, it must traverse all 106 items each time, which slows down the search process.

● Multiple requests: If thousands of users are searching for data on a web server at the same time, there is a potential that a very large server will fail during the process.

Data structures are used to solve the difficulties listed above. Data is grouped into a data structure in such a way that not all things must be searched and essential data can be found quickly.

Advantages of Data Structures

● Efficiency: The efficiency of a programme is determined by the data structures used. Consider the following scenario: we have certain data and need to find a specific record. If we structure our data in an array, we will have to search element by element in such a scenario. As a result, employing an array may not be the most efficient solution in this case. There are more efficient data structures for searching, such as an ordered array, a binary search tree, or hash tables.

● Reusability: Data structures are reusable, which means that after we've implemented one, we can use it somewhere else. Data structure implementations can be compiled into libraries that can be utilized by a variety of clients.

● Abstraction: The ADT specifies the data structure, which provides a level of abstraction. The client software only interacts with the data structure through an interface, thus it isn't concerned with the implementation details.

1.3 Abstract Data type

The abstract data type is a special kind of datatype, whose behavior is defined by a set of values and set of operations. The keyword “Abstract” is used as we can use these data types, we can perform different operations. But how those operations are working is totally hidden from the user. The ADT is made of primitive data types, but operation logics are hidden.

Here we will see the stack ADT. These are a few operations or functions of the Stack ADT.

● isFull(), This is used to check whether stack is full or not

● isEmpty(), This is used to check whether stack is empty or not

● push(x), This is used to push x into the stack

● pop(), This is used to delete one element from top of the stack

● peek(), This is used to get the top most element of the stack

● size(), this function is used to get number of elements present into the stack

Example

#include<iostream>

#include<stack>

Using namespace std;

Main(){

stack<int> stk;

if(stk.empty()){

cout << "Stack is empty" << endl;

} else {

cout << "Stack is not empty" << endl;

}

//insert elements into stack

stk.push(10);

stk.push(20);

stk.push(30);

stk.push(40);

stk.push(50);

cout << "Size of the stack: " << stk.size() << endl;

//pop and display elements

while(!stk.empty()){

int item = stk.top(); // same as peek operation

stk.pop();

cout << item << " ";

}

Output

Stack is empty

Size of the stack: 5

50 40 30 20 10

Key takeaway

The abstract data type is a special kind of datatype, whose behavior is defined by a set of values and set of operations.

The keyword “Abstract” is used as we can use these data types, we can perform different operations. But how those operations are working is totally hidden from the user.

The ADT is made of primitive data types, but operation logics are hidden.

1.4 Types of Data Structures Algorithms: Algorithm, Efficiency of an Algorithm

What is an Algorithm?

An algorithm is a process or a set of rules required to perform calculations or some other problem-solving operations especially by a computer. The formal definition of an algorithm is that it contains the finite set of instructions which are being carried in a specific order to perform the specific task. It is not the complete program or code; it is just a solution (logic) of a problem, which can be represented either as an informal description using a Flowchart or Pseudocode.

Characteristics of an Algorithm

The following are the characteristics of an algorithm:

● Input: An algorithm has some input values. We can pass 0 or some input value to an algorithm.

● Output: We will get 1 or more output at the end of an algorithm.

● Unambiguity: An algorithm should be unambiguous which means that the instructions in an algorithm should be clear and simple.

● Finiteness: An algorithm should have finiteness. Here, finiteness means that the algorithm should contain a limited number of instructions, i.e., the instructions should be countable.

● Effectiveness: An algorithm should be effective as each instruction in an algorithm affects the overall process.

● Language independent: An algorithm must be language-independent so that the instructions in an algorithm can be implemented in any of the languages with the same output.

Dataflow of an Algorithm

● Problem: A problem can be a real-world problem or any instance from the real-world problem for which we need to create a program or the set of instructions. The set of instructions is known as an algorithm.

● Algorithm: An algorithm will be designed for a problem which is a step by step procedure.

● Input: After designing an algorithm, the required and the desired inputs are provided to the algorithm.

● Processing unit: The input will be given to the processing unit, and the processing unit will produce the desired output.

● Output: The output is the outcome or the result of the program.

Why do we need Algorithms?

We need algorithms because of the following reasons:

● Scalability: It helps us to understand scalability. When we have a big real-world problem, we need to scale it down into small-small steps to easily analyze the problem.

● Performance: The real-world is not easily broken down into smaller steps. If the problem can be easily broken into smaller steps means that the problem is feasible.

Let's understand the algorithm through a real-world example. Suppose we want to make a lemon juice, so following are the steps required to make a lemon juice:

Step 1: First, we will cut the lemon into half.

Step 2: Squeeze the lemon as much as you can and take out its juice in a container.

Step 3: Add two tablespoons of sugar in it.

Step 4: Stir the container until the sugar gets dissolved.

Step 5: When sugar gets dissolved, add some water and ice in it.

Step 6: Store the juice in a fridge for 5 to minutes.

Step 7: Now, it's ready to drink.

The above real-world can be directly compared to the definition of the algorithm. We cannot perform step 3 before step 2, we need to follow the specific order to make lemon juice. An algorithm also says that each and every instruction should be followed in a specific order to perform a specific task.

Now we will look at an example of an algorithm in programming.

We will write an algorithm to add two numbers entered by the user.

The following are the steps required to add two numbers entered by the user:

Step 1: Start

Step 2: Declare three variables a, b, and sum.

Step 3: Enter the values of a and b.

Step 4: Add the values of a and b and store the result in the sum variable, i.e., sum=a+b.

Step 5: Print sum

Step 6: Stop

Key takeaway

An algorithm is a process or a set of rules required to perform calculations or some other problem-solving operations especially by a computer.

The formal definition of an algorithm is that it contains the finite set of instructions which are being carried in a specific order to perform the specific task.

Efficiency of an Algorithm

Following are the conditions where we can analyze the algorithm:

● Time efficiency: indicates how fast algorithm is running

● Space efficiency: how much extra memory the algorithm needs to complete its execution

● Simplicity: - generating sequences of instruction which are easy to understand.

● Generality: - It fixes the range of inputs it can accept. The generality of problems which algorithms can solve.

The time efficiencies of a large number of algorithms fall into only a few classes

Fig 1: Time efficiency

1.5 Time and Space Complexity

Time complexity:

The time complexity of an algorithm is the amount of time required to complete the execution. The time complexity of an algorithm is denoted by the big O notation. Here, big O notation is the asymptotic notation to represent the time complexity. The time complexity is mainly calculated by counting the number of steps to finish the execution.

Let's understand the time complexity through an example.

Sum=0;
// Suppose we have to calculate the sum of n numbers.
For i=1 to n
Sum=sum+i;
// when the loop ends then sum holds the sum of the n numbers
Return sum;

In the above code, the time complexity of the loop statement will be at least n, and if the value of n increases, then the time complexity also increases. While the complexity of the code, i.e., return sum will be constant as its value is not dependent on the value of n and will provide the result in one step only. We generally consider the worst-time complexity as it is the maximum time taken for any given input size.

Space complexity: An algorithm's space complexity is the amount of space required to solve a problem and produce an output. Similar to the time complexity, space complexity is also expressed in big O notation.

For an algorithm, the space is required for the following purposes:

To store program instructions
To store constant values
To store variable values

To track the function calls, jumping statements, etc.

Key takeaway

The time complexity of an algorithm is the amount of time required to complete the execution.

An algorithm's space complexity is the amount of space required to solve a problem and produce an output.

1.6 Asymptotic notations (Big O, Omega Ω, Theta)

Big Oh (O) Notation:

It is the formal way to express the time complexity.
It is used to define the upper bound of an algorithm’s running time.
It shows the worst-case time complexity or largest amount of time taken by the algorithm to perform a certain task.

Fig 2: Big Oh notation

Here, f (n) <= g (n) …………….(eq1)

where n>= k, C>0 and k>=1

If any algorithm satisfies the eq1 condition then it becomes f (n)= O (g (n))

Big Omega (ῼ) Notation

It is the formal way to express the time complexity.
It is used to define the lower bound of an algorithm’s running time.
It shows the best-case time complexity or best of time taken by the algorithm to perform a certain task.

Fig 3: Big omega notation

Here, f (n) >= g (n)

Where n>=k and c>0, k>=1

If any algorithm satisfies the eq1 condition then it becomes f (n)= ῼ (g (n))

Theta (Θ) Notation

It is the formal way to express the time complexity.
It is used to define the lower as well as upper bound of an algorithm’s running time.
It shows the average case time complexity or average of time taken by the algorithm to perform a certain task.

Fig 4: Theta notation

Let C1 g(n) be upper bound and C2 g(n) be lower bound.

C1 g(n)<= f (n)<=C2 g(n)

Where C1, C2>0 and n>=k

Key takeaway

Big Oh is used to define the upper bound of an algorithm’s running time.
Big Omega is used to define the lower bound of an algorithm’s running time.
Theta is used to define the lower as well as upper bound of an algorithm’s running time.

1.7 Time-Space trade- off

It is also known as time memory tradeoff.
It is a way of solving problems in less time by using more storage space.
It can be used with the problem of data storage.
For example: time complexity of merge sort is O(n log n)in all cases. But requires an auxiliary array.
In quick sort complexity is O(n log n) for best and average case but it doesn’t require auxiliary space.
Two Space-for-Time Algorithm Varieties:
Input enhancement - To store some info, preprocess the input (or its portion) to be used in solving the issue later
Counting sorts
String searching algorithms
Pre-structuring-preprocessing the input to access its components

1) hashing 2) indexing schemes simpler (e.g., B-trees)

Key takeaway

It is also known as time memory tradeoff.
It is a way of solving problems in less time by using more storage space.

1.8 Searching- Linear & Binary Search

Searching is the process of looking for a certain item in a list. If the element is found in the list, the process is considered successful, and the location of that element is returned; otherwise, the search is considered unsuccessful.

Linear Search and Binary Search are two prominent search strategies. So, we'll talk about a popular search strategy called the Linear Search Algorithm.

The linear search algorithm is also known as the sequential search algorithm. It is the most basic search algorithm. We just traverse the list in linear search and match each element of the list with the object whose position has to be discovered. If a match is found, the algorithm delivers the item's location; otherwise, NULL is returned.

It's commonly used to find an element in an unordered list, or one in which the items aren't sorted. The time complexity of linear search in the worst-case scenario is O(n).

The following are the steps involved in implementing Linear Search:

● We must first use a for loop to traverse the array elements.

● Compare the search element with the current array element in each iteration of the for loop, and -

○ Return the index of the related array entry if the element matches.

○ If the element you're looking for doesn't match, move on to the next one.

● Return -1 if there is no match or the search element is not found in the given array.

Let's look at the linear search algorithm now.

Algorithm

Linear_Search(a, n, val) // 'a' is the given array, 'n' is the size of given array, 'val' is the value to search

Step 1: set pos = -1

Step 2: set i = 1

Step 3: repeat step 4 while i <= n

Step 4: if a[i] == val

Set pos = i

Print pos

Go to step 6

[end of if]

Set ii = i + 1

[end of loop]

Step 5: if pos = -1

Print "value is not present in the array "

[end of if]

Step 6: exit

Time Complexity

Case	Time Complexity
Best Case	O(1)
Average Case	O(n)
Worst Case	O(n)

Space Complexity

O(1)

Application of Linear Search Algorithm

The linear search algorithm is useful for the following tasks:

● Both single-dimensional and multi-dimensional arrays can benefit from linear search.

● When the array has only a few elements, linear search is simple to implement and effective.

● When performing a single search in an unordered-List, Linear Search is also quite efficient.

Binary Search

Binary search is the search technique which works efficiently on the sorted lists. Hence, in order to search an element into some list by using binary search technique, we must ensure that the list is sorted.

Binary search follows a divide and conquer approach in which the list is divided into two halves and the item is compared with the middle element of the list. If the match is found then, the location of the middle element is returned; otherwise, we search into either of the halves depending upon the result produced through the match.

Binary search algorithm is given below.

BINARY_SEARCH(A, lower_bound, upper_bound, VAL)

● Step 1: [INITIALIZE] SET BEG = lower_bound

END = upper_bound, POS = - 1

● Step 2: Repeat Steps 3 and 4 while BEG <=END

● Step 3: SET MID = (BEG + END)/2

● Step 4: IF A[MID] = VAL

SET POS = MID

PRINT POS

Go to Step 6

ELSE IF A[MID] > VAL

SET END = MID - 1

ELSE

SET BEG = MID + 1

[END OF IF]

[END OF LOOP]

● Step 5: IF POS = -1

PRINT "VALUE IS NOT PRESENT IN THE ARRAY"

[END OF IF]

● Step 6: EXIT

Complexity

SN	Performance	Complexity
1	Worst case	O(log n)
2	Best case	O(1)
3	Average Case	O(log n)
4	Worst case space complexity	O(1)

Example

Let us consider an array arr = {1, 5, 7, 8, 13, 19, 20, 23, 29}. Find the location of the item 23 in the array.

In 1st step:

BEG = 0
END = 8ron
MID = 4
a[mid] = a[4] = 13 < 23, therefore

In Second step:

Beg = mid +1 = 5
End = 8
Mid = 13/2 = 6
a[mid] = a[6] = 20 < 23, therefore;

In third step:

Beg = mid + 1 = 7
End = 8
Mid = 15/2 = 7
a[mid] = a[7]
a[7] = 23 = item;
Therefore, set location = mid;
The location of the item will be 7.

Fig 3 - Example

1.9 Sorting- Bubble Sort, Insertion Sort, Selection Sort

Bubble sort

Bubble Sort is a simple algorithm for sorting a collection of n elements that is provided in the form of an array with n elements. Bubble Sort compares each element individually and sorts them according to their values.

If an array must be sorted in ascending order, bubble sort will begin by comparing the first and second members of the array, swapping both elements if the first element is greater than the second, and then moving on to compare the second and third elements, and so on.

If there are n elements in total, we must repeat the method n-1 times.

It's called bubble sort because the largest element in the given array bubbles up to the last position or highest index with each full iteration, similar to how a water bubble rises to the sea surface.

Although it is simple to use, bubble sort is largely employed as an educational tool because its performance in the actual world is low. It isn't designed to handle huge data collections. Bubble sort has an average and worst-case complexity of O(n2), where n is the number of elements.

Sorting is done by going through all of the items one by one, comparing them to the next closest element, and swapping them if necessary.

Algorithm

Assume arr is an array of n elements in the algorithm below. The algorithm's presumed swap function will swap the values of specified array members.

Begin BubbleSort(arr)

for all array elements

if arr[i] > arr[i+1]

swap(arr[i], arr[i+1])

end if

end for

return arr

End BubbleSort

Working of Bubble Sort

Assume we're attempting to arrange the elements in ascending order.

1. First Iteration (Compare and Swap)

● Compare the first and second elements starting with the first index.

● The first and second elements are swapped if the first is bigger than the second.

● Compare and contrast the second and third elements now. If they aren't in the right order, swap them.

● The process continues until the last piece is added.

2. Remaining Iteration

For the remaining iterations, the same procedure is followed.

The largest element among the unsorted items is placed at the conclusion of each iteration.

The comparison takes place up to the last unsorted element in each iteration.

When all of the unsorted elements have been placed in their correct placements, the array is said to be sorted.

Key takeaway

Compare and swap adjacent elements of the list if they are out of order.
After repeating the procedure, we end up 'bubbling up' the largest element to the last place on the list by doing it repeatedly.

Insertion sort

Insertion style is a decrease & conquer technique application. It is a sort based on comparison in which the sorted array is constructed on one entry at a time.

Insertion sorting is a very basic process for sorting numbers in an ascending or descending order. The gradual method is accompanied by this method. It can be contrasted with the method of sorting cards at the time of playing a game.

The numbers that are needed to be sorted are known as keys. Here is the insertion sort method algorithm.

Algorithm:

Algorithm: Insertion-Sort (A)

For j = 2 to A.length

Key = A[j]

// Insert A[j] into the sorted sequence A[1.. j - 1]

i = j - 1

While i > 0 and A[i] > key

A[i + 1] = A[i]

i = i -1

A[i + 1] = key

Analysis:

This algorithm's run time is very dependent on the input given.

If the numbers given are sorted, this algorithm runs in O(n) time. If the numbers given are in reverse order, the algorithm runs in O(n2) time.

Example:

Using insertion sort to sort the following list of elements:

89, 45, 68, 90, 29, 34, 17

89 |45 68 90 29 34 17

45 89 |68 90 29 34 17

45 68 89 |90 29 34 17

45 68 89 90 |29 34 17

29 45 68 89 90 |34 17

29 34 45 68 89 90 |17

17 29 34 45 68 89 90

Application:

In the following instances, the insertion sort algorithm is used:

When only a few elements are in the list.
When there are few elements for sorting.

Advantages:

Straightforward implementation. Three variants exist.

● Left to right scan

● Right to left scan

● Binary insertion sort

2. Effective on a small list of components, on a nearly sorted list.

3. In the best example, running time is linear.

4. This algorithm is stable.

5. Is the on-the-spot algorithm

Key takeaways

Insertion sorting is a very basic process for sorting numbers in an ascending or descending order.
Insertion style is a decrease & conquer technique application.
It is a sort based on comparison in which the sorted array is constructed on one entry at a time.

Selection sort

In the selection sort, the smallest value among the array's unsorted items is chosen in each pass and inserted into the proper spot.

To begin, locate the array's smallest element and set it in the first position. Then locate the array's second smallest element and set it in the second spot. The procedure is repeated till the sorted array is obtained.

The n-1 pass of the selection sort algorithm is used to sort the array of n entries.

● The smallest element of the array, as well as its index pos, must be identified in the first run. Swap A[0] and A[pos] after that. As a result of sorting A[0], we now have n -1 elements to sort.

● The location pos of the smallest element in the sub-array A[n-1] is obtained in the second pass. Swap A[1] and A[pos] after that. After sorting A[0] and A[1], we're left with n-2 unsorted elements.

● The position pos of the smaller element between A[n-1] and A[n-2] must be found in the n-1st pass. Swap A[pos] and A[n-1] after that.

As a result, the items A[0], A[1], A[2],...., A[n-1] are sorted using the procedure described above.

Example

Consider the following six-element array. Using selection sort, sort the array's elements.

A = {10, 2, 3, 90, 43, 56}.

Pass	Pos	A[0]	A[1]	A[2]	A[3]	A[4]	A[5]
1	1	2	10	3	90	43	56
2	2	2	3	10	90	43	56
3	2	2	3	10	90	43	56
4	4	2	3	10	43	90	56
5	5	2	3	10	43	56	90

Sorted A = {2, 3, 10, 43, 56, 90}

Algorithm

SELECTION SORT(ARR, N)

● Step 1: Repeat Steps 2 and 3 for K = 1 to N-1

● Step 2: CALL SMALLEST(ARR, K, N, POS)

● Step 3: SWAP A[K] with ARR[POS]

[END OF LOOP]

● Step 4: EXIT

SMALLEST (ARR, K, N, POS)

● Step 1: [INITIALIZE] SET SMALL = ARR[K]

● Step 2: [INITIALIZE] SET POS = K

● Step 3: Repeat for J = K+1 to N -1

IF SMALL > ARR[J]

SET SMALL = ARR[J]

SET POS = J

[END OF IF]

[END OF LOOP]

● Step 4: RETURN POS

Key takeaway

In the selection sort, the smallest value among the array's unsorted items is chosen in each pass and inserted into the proper spot.
To begin, locate the array's smallest element and set it in the first position.
Then locate the array's second smallest element and set it in the second spot.

1.10 Algorithm design strategies - Divide and Conquer strategy, Merge Sort, Quick Sort

It becomes easier to solve the complete problem if we can separate a single major problem into smaller sub-problems, solve the smaller sub-problems, and integrate their solutions to find the solution for the original huge problem.

Let's look into Divide and Rule as an example.

When the British arrived in India, they encountered a country where many religions coexisted peacefully, hardworking but ignorant individuals, and unity in variety, making it difficult for them to establish their empire. As a result, they chose the Divide and Rule strategy. Where the population of India was a single large difficulty for them, they divided it into smaller ones by inciting rivalries amongst local monarchs and pitting them against one another, which worked extremely well for them.

That was history and a socio-political policy (Break and Rule), but the notion here is that if we can divide a problem into smaller sub-problems, it becomes easier to tackle the total problem in the end.

The Divide and Conquer strategy consists of three steps:

● Break the problem down into smaller chunks.

● Solve the subproblems to defeat them. The goal is to break down the problem into atomic subproblems that can be solved individually.

● To obtain the solution to the actual problem, combine the solutions of the subproblems.

Merge sort

Merge Sort is based on a divide and conquer strategy. It divides the unsorted list into n sublists such that each sublist contains one element. Each sublist is then sorted and merged until there is only one sublist.

Step 1: Divide the list into sublists such that each sublist contains only one element.

Step 2: Sort each sublist and Merge them.

Step 3: Repeat Step 2 till the entire list is sorted.

Example: Apply Merge Sort on array A= {85,24,63,45,17,31,96,50}

Step1: A contains 8 elements. First we divide the list into two sublists such that each sublist contains 4 elements. Then we divide each sublist of 4 elements into sublist of two elements each. Finally every sublist contains only one element.

Fig. 4: Example

Step 2: Now we start from the bottom. Compare 1st and 2nd element. Sort and merge them. Apply this for all bottom elements.

Thus we got 4 sublists as follows

Step 3: Repeat this sorting and merging until the entire list is sorted. The above four sublists are sorted and merged to form 2 sublists.

Finally, the above 2 sublists are again sorted and merged in one sublist.

We will see one more example on merge sort.

Example: Apply merge Sort on array A={10,5,7,6,1,4,8,3,2,9,12,11}

Step 1:

Fig 5: Merge sort

Algorithm for Merge Sort:

Let x = no. Of elements in array A

y = no. Of elements in array B

C = sorted array with no. Of elements n

n = x+y

Following algorithm merges a sorted x element array A and sorted y element array B into a sorted array C, with n = x + y elements. We must always keep track of the locations of the smallest element of A and the smallest element of B which have not yet been placed in C. Let LA and LB denote these locations, respectively. Also, let LC denote the location in C to be filled. Thus, initially, we set

LA : = 1,

LB : = 1 and

LC : = 1.

At each step, we compare A[LA] and B[LB] and assign the smaller element to C[LC].

Then

LC:= LC + 1,

LA: = LA + 1, if a new element has come from A .

LB: = LB + 1, if a new element has come from B.

Furthermore, if LA> x, then the remaining elements of B are assigned to C;

LB >y, then the remaining elements of A are assigned to C.

Thus the algorithm becomes

MERGING ( A, x, B, y, C)

1. [Initialize ] Set LA : = 1 , LB := 1 AND LC : = 1

2. [Compare] Repeat while LA <= x and LB <= y

If A[LA] < B[LB] , then

(a) [Assign element from A to C ] set C[LC] = A[LA]

(b) [Update pointers ] Set LC := LC +1 and LA = LA+1

Else

(a) [Assign element from B to C] Set C[LC] = B[LB]

(b) [Update Pointers] Set LC := LC +1 and LB = LB +1

[End of loop]

3. [Assign remaining elements to C]

If LA > x , then

Repeat for K= 0 ,1,2,........,y–LB

Set C[LC+K] = B[LB+K]

[End of loop]

Else

Repeat for K = 0,1,2,......,x–LA

Set C[LC+K] = A[LA+K]

[End of loop]

4. Exit

Quick sort

Quick sort is the most efficient method of sorting .It uses the divide and conquer strategy to sort the elements. In quicksort, we divide the array of items to be sorted into two partitions and then call the quicksort procedure recursively to sort the two partitions. For this we choose a pivot element and arrange all elements such that all the elements in the left part are less than the pivot and all those in the right part are greater than it.

Algorithm to implement Quick Sort is as follows.

Consider array A of size n to be sorted. p is the pivot element. Index positions of the first and last element of array A are denoted by ‘first’ and ‘last’.

Initially first=0 and last=n–1.
Increment first till A[first]<=p
Decrement last till A[last]>=p
If first < last then swap A[first] with A[last].
If first > last then swap p with A[last] and thus the position of p is found. Pivot p is placed such that all elements to its left are less than it and all elements right to it are greater than it. Let us assume that j is the final position of p. Then A[0] to A[j–1] are less than p and A [j + 1] to A [n–1]are greater than p.
Now we consider the left part of array A i.e A[0] to A[j–1].Repeat step 1,2,3 and 4. Apply the same procedure for the right part of the array.
We continue with the above steps till no more partitions can be made for every subarray.

Algorithm

PARTITION (ARR, BEG, END, LOC)

● Step 1: [INITIALIZE] SET LEFT = BEG, RIGHT = END, LOC = BEG, FLAG =

● Step 2: Repeat Steps 3 to 6 while FLAG =

● Step 3: Repeat while ARR[LOC] <=ARR[RIGHT]

AND LOC != RIGHT

SET RIGHT = RIGHT – 1

[END OF LOOP]

● Step 4: IF LOC = RIGHT

SET FLAG = 1

ELSE IF ARR[LOC] > ARR[RIGHT]

SWAP ARR[LOC] with ARR[RIGHT]

SET LOC = RIGHT

[END OF IF]

● Step 5: IF FLAG = 0

Repeat while ARR[LOC] >= ARR[LEFT] AND LOC != LEFT

SET LEFT = LEFT + 1

[END OF LOOP]

● Step 6:IF LOC = LEFT

SET FLAG = 1

ELSE IF ARR[LOC] < ARR[LEFT]

SWAP ARR[LOC] with ARR[LEFT]

SET LOC = LEFT

[END OF IF]

● Step 7: [END OF LOOP]

● Step 8: END

QUICK_SORT (ARR, BEG, END)

● Step 1: IF (BEG < END)

CALL PARTITION (ARR, BEG, END, LOC)

CALL QUICKSORT(ARR, BEG, LOC - 1)

CALL QUICKSORT(ARR, LOC + 1, END)

[END OF IF]

● Step 2: END

Key takeaways

Merge sort consisting two phases as divide and merge.
In divide phase, each time it divides the given unsorted list into two sub list until each sub list get single element then sorts the elements.
In merge phase, it merges all the solution of sub lists and find the original sorted list.
Quick sort is the most efficient method of sorting .It uses the divide and conquer strategy to sort the elements.
In quicksort, we divide the array of items to be sorted into two partitions and then call the quicksort procedure recursively to sort the two partitions.

1.11 Complexity analysis of sorting methods

Bubble Sort, Selection Sort, Insertion Sort, Merge Sort, Quick Sort, Heap Sort, Counting Sort, Radix Sort, and Bucket Sort have all had their time and space difficulties discussed.

Bubble sort

Each nearby pair is compared in bubble sort. We swap them if they aren't in the right order. We repeat the preceding procedure until no more swaps are required, indicating that all of the components have been sorted.

Worse case: O(n2)

We iterate through the array (n - 1) times once it is reverse-sorted. We do (n - 1) swaps in the first iteration, (n - 2) swaps in the second, and so on until we only do one swap in the last iteration. As a result, the total number of swaps equals n * (n - 1) / 2.

Average case: O(n2)

The total number of swaps for a truly random array averages out to roughly n2 / 4, which is again O(n2).

Best case: O(n)

In the first iteration of the array, if no swaps are performed, we know that the array is already sorted and hence cease sorting, resulting in a linear time complexity.

Space Complexity:

The space complexity is O(1) since we only use a constant amount of additional memory away from the input array.

Selection Sort

Selection sort is a straightforward sorting method that divides an array into two parts: a subarray of already sorted elements and a subarray of unsorted elements. Initially, the sorted subarray is empty. The array is iterated over (n - 1) times. We find the smallest element from the unsorted subarray and set it at the end of the sorted subarray in each iteration.

Time Complexity:

Worst case = Average Case = Best Case = O(n2)

For an array of whatever size, we conduct the same number of comparisons.

We execute (n - 1) comparisons in the first iteration, (n - 2) in the second, and so on until the last iteration, when we only perform one comparison. As a result, the total number of comparisons is n * (n - 1) / 2. The number of swaps performed is at most n - 1. So the overall time complexity is quadratic.

Space Complexity:

The space complexity is O(1) because we don't use any other data structures except the input array.

Insertion Sort

The insertion sort method, like selection sort, separates the array into two parts: a subarray of already sorted elements and a subarray of unsorted elements. Only the first member of the array is originally stored in the sorted subarray. We go over each of the remaining elements one by one, putting them in their proper places in the sorted subarray.

Time Complexity:

Worse case: O(n2)

When we use insertion sort on a reverse-sorted array, each element is inserted at the beginning of the sorted subarray, which has the highest temporal complexity.

Average case: O(n2)

The average running time is O(n2 / 4) = O(n2). When the array members are in random order.

Best case: O(n)

When we run insertion sort on an array that has previously been sorted, it merely compares each element to its predecessor, needing n steps to sort the already sorted array of n elements.

Space Complexity:

The space complexity is O(1) since we only use a constant amount of additional memory away from the input array.

Merge Sort

We partition the unsorted array into n subarrays, each of size one, that can be considered sorted trivially in merge sort. Then, until there is only one sorted array of size n left, we merge the sorted subarrays to make new sorted subarrays.

Time Complexity:

Worst case = Average Case = Best Case = O(n log n)

For any input array of a certain size, merge sort performs the same number of operations.

We divide the array into two subarrays repeatedly in this approach, resulting in O(log n) rows with each element appearing exactly once in each row.

Merging every pair of subarrays takes O(n) time for each row. As a result, the entire time complexity is O(n log n).

Space Complexity:

The space complexity is O(n) since we store the merged subarray in an auxiliary array of size at most n.

Quick Sort

Quick sort is a more complicated algorithm. It divides the array into two subarrays using a divide-and-conquer approach. We pick an element called pivot and place it at its proper index before reordering the array by moving all elements less than pivot to the left and all elements greater than it to the right.

The subarrays of these subarrays are then sorted recursively until the entire array is sorted. The pivot element's selection has a significant impact on the quick sort algorithm's efficiency.

Time Complexity:

Worse case: O(n2)

When the array is sorted and the leftmost element is chosen as pivot, or when the array is reverse-sorted and the rightmost element is chosen as pivot, the time complexity increases quadratically because dividing the array produces extremely unbalanced subarrays.

When the array has a high number of identical members, splitting becomes difficult, resulting in quadratic time complexity.

Average case and best case: O(n log n)

The best case scenario for quick-sort is when we consistently use the median element for partitioning. As a result of this partitioning, we can always divide the array in half.

Space Complexity:

Although rapid sort does not need auxiliary space to store array elements, recursive calls require additional space to create stack frames.

Worst case: O(n)

This occurs when the pivot element in each recursive call is the largest or smallest element of the array. After partitioning, the size of the subarray will be n-1 and 1. The recursive tree will be n in size in this situation.

Best case: O(log n)

This occurs when the right position of the pivot element in the partitioned array is always in the middle. Subarrays will have a size of half that of the parent array. The size of the recursive tree will be O in this situation (log n).

References:

Data Structures with C, Seymour Lipschutz, Schaum's Outlines, Tata McGraw Hill Education.
Fundamentals of Computer Algorithms by Horowitz, Sahni,Galgotia Pub. 2001 ed.
Data Structures and program design in C by Robert Kruse, Bruce Leung & Clovis Tondo.
Data Structures: A Pseudocode Approach with C by Richard F. Gilberg and

Behrouz Forouzan.

Unit - 1

Data Structures

1.1 Introduction to Data Structures

Basic terminology

Data: The data on a student can be defined as an elementary value or a set of values, for example, the student's name and id.

Group items: Group items are data items that include subordinate data items, such as a student's name, which can have both a first and a last name.

Attribute and Entity: A class of objects is represented by an entity. It has a number of characteristics. Each attribute represents an entity's specific property.

Key takeaway

A data structure is a collection of data pieces that provides an efficient method for storing and organizing data in a computer so that it may be used effectively.

1.2 Need of Data Structure

As applications become more complicated and the amount of data collected grows, the following issues may arise:

● Data Search: Consider a store with a 106-item inventory. If our programme needs to search for a specific item, it must traverse all 106 items each time, which slows down the search process.

● Multiple requests: If thousands of users are searching for data on a web server at the same time, there is a potential that a very large server will fail during the process.

Data structures are used to solve the difficulties listed above. Data is grouped into a data structure in such a way that not all things must be searched and essential data can be found quickly.

Advantages of Data Structures

1.3 Abstract Data type

Here we will see the stack ADT. These are a few operations or functions of the Stack ADT.

● isFull(), This is used to check whether stack is full or not

● isEmpty(), This is used to check whether stack is empty or not

● push(x), This is used to push x into the stack

● pop(), This is used to delete one element from top of the stack

● peek(), This is used to get the top most element of the stack

● size(), this function is used to get number of elements present into the stack

Example

#include<iostream>

#include<stack>

Using namespace std;

Main(){

stack<int> stk;

if(stk.empty()){

cout << "Stack is empty" << endl;

} else {

cout << "Stack is not empty" << endl;

}

//insert elements into stack

stk.push(10);

stk.push(20);

stk.push(30);

stk.push(40);

stk.push(50);

cout << "Size of the stack: " << stk.size() << endl;

//pop and display elements

while(!stk.empty()){

int item = stk.top(); // same as peek operation

stk.pop();

cout << item << " ";

}

Output

Stack is empty

Size of the stack: 5

50 40 30 20 10

Key takeaway

The abstract data type is a special kind of datatype, whose behavior is defined by a set of values and set of operations.

The keyword “Abstract” is used as we can use these data types, we can perform different operations. But how those operations are working is totally hidden from the user.

The ADT is made of primitive data types, but operation logics are hidden.

1.4 Types of Data Structures Algorithms: Algorithm, Efficiency of an Algorithm

What is an Algorithm?

Characteristics of an Algorithm

The following are the characteristics of an algorithm:

● Input: An algorithm has some input values. We can pass 0 or some input value to an algorithm.

● Output: We will get 1 or more output at the end of an algorithm.

● Unambiguity: An algorithm should be unambiguous which means that the instructions in an algorithm should be clear and simple.

● Finiteness: An algorithm should have finiteness. Here, finiteness means that the algorithm should contain a limited number of instructions, i.e., the instructions should be countable.

● Effectiveness: An algorithm should be effective as each instruction in an algorithm affects the overall process.

● Language independent: An algorithm must be language-independent so that the instructions in an algorithm can be implemented in any of the languages with the same output.

Dataflow of an Algorithm

● Algorithm: An algorithm will be designed for a problem which is a step by step procedure.

● Input: After designing an algorithm, the required and the desired inputs are provided to the algorithm.

● Processing unit: The input will be given to the processing unit, and the processing unit will produce the desired output.

● Output: The output is the outcome or the result of the program.

Why do we need Algorithms?

We need algorithms because of the following reasons:

● Scalability: It helps us to understand scalability. When we have a big real-world problem, we need to scale it down into small-small steps to easily analyze the problem.

● Performance: The real-world is not easily broken down into smaller steps. If the problem can be easily broken into smaller steps means that the problem is feasible.

Let's understand the algorithm through a real-world example. Suppose we want to make a lemon juice, so following are the steps required to make a lemon juice:

Step 1: First, we will cut the lemon into half.

Step 2: Squeeze the lemon as much as you can and take out its juice in a container.

Step 3: Add two tablespoons of sugar in it.

Step 4: Stir the container until the sugar gets dissolved.

Step 5: When sugar gets dissolved, add some water and ice in it.

Step 6: Store the juice in a fridge for 5 to minutes.

Step 7: Now, it's ready to drink.

Now we will look at an example of an algorithm in programming.

We will write an algorithm to add two numbers entered by the user.

The following are the steps required to add two numbers entered by the user:

Step 1: Start

Step 2: Declare three variables a, b, and sum.

Step 3: Enter the values of a and b.

Step 4: Add the values of a and b and store the result in the sum variable, i.e., sum=a+b.

Step 5: Print sum

Step 6: Stop

Key takeaway

An algorithm is a process or a set of rules required to perform calculations or some other problem-solving operations especially by a computer.

The formal definition of an algorithm is that it contains the finite set of instructions which are being carried in a specific order to perform the specific task.

Efficiency of an Algorithm

Following are the conditions where we can analyze the algorithm:

● Time efficiency: indicates how fast algorithm is running

● Space efficiency: how much extra memory the algorithm needs to complete its execution

● Simplicity: - generating sequences of instruction which are easy to understand.

● Generality: - It fixes the range of inputs it can accept. The generality of problems which algorithms can solve.

The time efficiencies of a large number of algorithms fall into only a few classes

Fig 1: Time efficiency

1.5 Time and Space Complexity

Time complexity:

Let's understand the time complexity through an example.

Sum=0;
// Suppose we have to calculate the sum of n numbers.
For i=1 to n
Sum=sum+i;
// when the loop ends then sum holds the sum of the n numbers
Return sum;

For an algorithm, the space is required for the following purposes:

To store program instructions
To store constant values
To store variable values

To track the function calls, jumping statements, etc.

Key takeaway

The time complexity of an algorithm is the amount of time required to complete the execution.

An algorithm's space complexity is the amount of space required to solve a problem and produce an output.

1.6 Asymptotic notations (Big O, Omega Ω, Theta)

Big Oh (O) Notation:

It is the formal way to express the time complexity.
It is used to define the upper bound of an algorithm’s running time.
It shows the worst-case time complexity or largest amount of time taken by the algorithm to perform a certain task.

Fig 2: Big Oh notation

Here, f (n) <= g (n) …………….(eq1)

where n>= k, C>0 and k>=1

If any algorithm satisfies the eq1 condition then it becomes f (n)= O (g (n))

Big Omega (ῼ) Notation

It is the formal way to express the time complexity.
It is used to define the lower bound of an algorithm’s running time.
It shows the best-case time complexity or best of time taken by the algorithm to perform a certain task.

Fig 3: Big omega notation

Here, f (n) >= g (n)

Where n>=k and c>0, k>=1

If any algorithm satisfies the eq1 condition then it becomes f (n)= ῼ (g (n))

Theta (Θ) Notation

It is the formal way to express the time complexity.
It is used to define the lower as well as upper bound of an algorithm’s running time.
It shows the average case time complexity or average of time taken by the algorithm to perform a certain task.

Fig 4: Theta notation

Let C1 g(n) be upper bound and C2 g(n) be lower bound.

C1 g(n)<= f (n)<=C2 g(n)

Where C1, C2>0 and n>=k

Key takeaway

Big Oh is used to define the upper bound of an algorithm’s running time.
Big Omega is used to define the lower bound of an algorithm’s running time.
Theta is used to define the lower as well as upper bound of an algorithm’s running time.

1.7 Time-Space trade- off

It is also known as time memory tradeoff.
It is a way of solving problems in less time by using more storage space.
It can be used with the problem of data storage.
For example: time complexity of merge sort is O(n log n)in all cases. But requires an auxiliary array.
In quick sort complexity is O(n log n) for best and average case but it doesn’t require auxiliary space.
Two Space-for-Time Algorithm Varieties:
Input enhancement - To store some info, preprocess the input (or its portion) to be used in solving the issue later
Counting sorts
String searching algorithms
Pre-structuring-preprocessing the input to access its components

1) hashing 2) indexing schemes simpler (e.g., B-trees)

Key takeaway

It is also known as time memory tradeoff.
It is a way of solving problems in less time by using more storage space.