This is going to be a huge step and a true challenge for me. For the next thirty consecutive days, I’ll try to learn an algorithm well and write about it each day. This is a challenge because I’ve always been known as a lazy guy, and maybe I am! Today is day 0, a preparatory day. So let’s try to get some basic ideas related to computer algorithms! From tomorrow, we’ll jump into the algorithms themselves.
What is an algorithm?
When discussing algorithms, the very first question that arises is: what is an algorithm? Well, in plain words, an algorithm is an ordered set of procedures that produces at least one output from some inputs. For example, if you feel sleepy and want to make a cup of coffee, you’ll first boil water, add some coffee mate, sugar, and coffee, stir well, and pour it into your mug. Done! This specific sequence of tasks that gives you a cup of coffee is an algorithm!
In computer science, an algorithm is a finite sequence of well-defined, computer-implementable instructions that, when followed, accomplishes a particular class of tasks. An algorithm has the following properties:
- Input: An algorithm has zero or more inputs supplied externally.
- Output: An algorithm has one or more outputs, which have a specified relationship with the inputs.
- Definiteness: Each instruction is precisely defined and unambiguous.
- Finiteness: The algorithm terminates after a finite number of steps.
- Effectiveness: All the operations must be sufficiently basic that they can be performed exactly and in a finite length of time.
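To make these properties concrete, here’s a minimal Python sketch (my own illustration, not part of any formal definition) of Euclid’s algorithm for the greatest common divisor, annotated with the five properties:

```python
def gcd(a, b):
    """Euclid's algorithm: greatest common divisor of two non-negative integers."""
    # Input: two non-negative integers a and b, not both zero.
    while b != 0:          # Definiteness: every step is precise and unambiguous.
        a, b = b, a % b    # Effectiveness: each operation is a basic arithmetic step.
    return a               # Output: one value, with a specified relationship to the inputs.

# Finiteness: b strictly decreases on every iteration, so the loop must terminate.
print(gcd(48, 18))  # 6
```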
That’s the essence of what an algorithm is. Often, the same task can be performed by several different algorithms. For example, we can generate the prime numbers within a specific range by using the basic rule (checking each number to see whether it is prime) or by using the sieve of Eratosthenes. Now the question arises: how do we select the better one? This is done by analyzing the performance of each algorithm.
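As a rough sketch of the two approaches (in Python, with function names of my own choosing):

```python
def primes_trial_division(n):
    """Check each number below n individually for primality."""
    primes = []
    for candidate in range(2, n):
        # A number is prime if no divisor up to its square root divides it.
        if all(candidate % d != 0 for d in range(2, int(candidate ** 0.5) + 1)):
            primes.append(candidate)
    return primes

def primes_sieve(n):
    """Sieve of Eratosthenes: cross out multiples of each prime below n."""
    is_prime = [True] * n
    is_prime[0:2] = [False, False]              # 0 and 1 are not prime (assumes n >= 2)
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n, p):  # start at p*p; smaller multiples are done
                is_prime[multiple] = False
    return [i for i in range(n) if is_prime[i]]

print(primes_trial_division(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(primes_sieve(30))           # same list, much faster for large n
```

Both produce the same output; the interesting question is how their costs grow as the range gets larger, which is exactly what performance analysis answers.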
Performance analysis
Performance analysis of an algorithm means evaluating how well an algorithm performs, which lets us compare different algorithms against each other. There are several ways to analyze an algorithm’s performance; time complexity and space complexity are the most commonly used.
Space Complexity:
Space complexity is the amount of space or memory taken by an algorithm to run as a function of the input length or size. It includes both auxiliary space and the space required by the input.
Auxiliary space is the temporary or extra space that is used while the algorithm is running.
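A small hypothetical Python example of the distinction: both functions below receive an O(n) input, but the first uses only O(1) auxiliary space while the second allocates O(n) extra.

```python
def reverse_in_place(arr):
    """O(1) auxiliary space: only two index variables beyond the input."""
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1
    return arr

def reverse_with_copy(arr):
    """O(n) auxiliary space: builds an entirely new list."""
    return arr[::-1]
```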
Time Complexity:
Time complexity is the amount of time required by an algorithm to run and complete its process, as a function of the input length or size.
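For example (a toy sketch of my own), the first function below does work proportional to n, while the second does work proportional to n²:

```python
def sum_of_list(arr):
    """O(n) time: touches each element exactly once."""
    total = 0
    for x in arr:
        total += x
    return total

def has_duplicate(arr):
    """O(n^2) time: compares every pair of elements."""
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] == arr[j]:
                return True
    return False
```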
Both time and space complexity are crucial considerations when designing an algorithm. Measuring and expressing this complexity in a machine-independent way is also important; for this, asymptotic analysis and asymptotic notations are used.
Cases to Analyze an Algorithm
To analyze an algorithm we consider three cases: the best case, the average case, and the worst case.
Best case:
The best case describes the algorithm’s behaviour under optimal conditions; it happens when the algorithm has to perform the minimum number of operations. For example, in linear search, if the element we are searching for is at the first index, that is the best case.
The choice of an algorithm is rarely based on best-case analysis.
Average case:
The average case of an algorithm is the average of the time (or space) over all possible inputs to the algorithm. For linear search, we sum the time required for each possible input and divide by the total number of inputs, and thus we get the average case.
Average-case analysis is often harder, because we must know or predict the distribution of the inputs, which is often difficult to do.
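For instance, if we assume the searched element is always present and equally likely to be at any of the \(n\) positions (an assumption about the input distribution, not something linear search guarantees), the expected number of comparisons is

\[
\frac{1}{n}\sum_{i=1}^{n} i \;=\; \frac{1}{n}\cdot\frac{n(n+1)}{2} \;=\; \frac{n+1}{2},
\]

which still grows linearly with \(n\).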
Worst case:
In the worst case, we calculate the upper bound on the running time or space required by an algorithm; it describes the maximum resources the algorithm can ever need. For linear search, the worst case is when the element we are searching for is not present in the array at all, so every element must be checked.
Generally, the choice of an algorithm is based on worst-case analysis, because it tells us the most resources that might be needed during execution.
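Here is a small Python sketch of linear search with the three cases marked in comments (the example values are mine):

```python
def linear_search(arr, target):
    """Return the index of target in arr, or -1 if it is absent."""
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1

arr = [7, 3, 9, 1, 5]
print(linear_search(arr, 7))   # best case: found at index 0 after 1 comparison
print(linear_search(arr, 9))   # a middling case: found after 3 comparisons
print(linear_search(arr, 4))   # worst case: absent, all 5 elements checked
```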
Asymptotic Analysis: The Growth of Functions
In asymptotic analysis, we evaluate the performance of an algorithm in terms of the input length or size rather than the actual running time. But why? Because an algorithm may take different running times for the same input on different machines: a fast algorithm may take longer on a slower machine, whereas a slower algorithm may finish sooner on a faster machine. So instead we calculate how the time or space taken increases as the input size grows.
Let’s consider the following scenario. We have two machines, a faster one and a slower one, and we want to sort a given array. We’ll use two algorithms: bubble sort and merge sort. We know merge sort is faster than bubble sort, so we will run merge sort on the slower machine and bubble sort on the faster machine.
For a small input, bubble sort may beat merge sort, since it is running on the faster machine. But if we keep increasing the input size, at some point merge sort will outperform bubble sort, even though it is running on the slower machine. This happens because bubble sort’s running time grows quadratically, as \(O(n^2)\), while merge sort’s grows only as \(O(n \log n)\).
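We can see this difference directly by counting comparisons; here is a quick Python experiment (my own sketch):

```python
import random

def bubble_sort_comparisons(arr):
    """Sort a copy with bubble sort and return the number of comparisons."""
    a, count = arr[:], 0
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            count += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return count

def merge_sort(arr):
    """Return a sorted copy of arr together with the comparison count."""
    if len(arr) <= 1:
        return arr[:], 0
    mid = len(arr) // 2
    left, left_count = merge_sort(arr[:mid])
    right, right_count = merge_sort(arr[mid:])
    merged, count = [], left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        count += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])   # one side may still have leftovers
    merged.extend(right[j:])
    return merged, count

for n in (100, 500, 2000):
    data = random.sample(range(10 * n), n)
    print(n, bubble_sort_comparisons(data), merge_sort(data)[1])
```

For small arrays the counts are comparable, but bubble sort’s count grows roughly with \(n^2\) while merge sort’s grows with \(n \log n\).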
But what is the growth of a function? It is the rate at which the running time grows with the input size. We often need to simplify the function to find the part that contributes most to its growth. Let’s say that an algorithm, running on an input of size n, takes \(3n^2 + 50n + 100\) instructions to execute. If we plot this equation, it looks like:
Now, if we break this expression into two parts, \(3n^2\) and \(50n + 100\), we’ll notice that \(3n^2\) is larger than the remaining part and plays the main role in the growth of the function, as we can see in the following figure:
Again, it can be said that \(n^2\) is the main growing factor here. Notice the following figure:
We can clearly see that \(n^2\) is the main cause of the function’s growth. So by dropping the less significant terms (here \(50n + 100\)) and the coefficient, we can focus on the important part: the rate of growth. When we drop the less significant terms and the coefficient, we use the asymptotic notations.
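We can also check this dominance numerically (a quick sketch of my own):

```python
# How much of 3n^2 + 50n + 100 comes from the 3n^2 term?
for n in (10, 100, 1000, 10000):
    quadratic = 3 * n ** 2
    rest = 50 * n + 100
    share = quadratic / (quadratic + rest)
    print(f"n = {n:>5}: 3n^2 = {quadratic:>11}, 50n + 100 = {rest:>7}, share = {share:.1%}")
```

At \(n = 10\) the quadratic term contributes only about a third of the total, but by \(n = 10000\) it accounts for over 99% of it.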
Asymptotic Notations
The asymptotic notations are mathematical tools used to represent the complexity of an algorithm in asymptotic analysis. The following three notations are the most commonly used:
Big-Theta (Θ) notation:
The big-theta notation bounds a function from both above and below. As we saw in the previous section, the highest-order term plays the main role in the growth of a function, so the big-theta notation of the previous expression \(3n^2 + 50n + 100\) is \(\Theta\left(n^2\right)\). It is often loosely associated with the average-case running time, though strictly speaking it simply gives a tight bound.
But what does the big-theta notation actually indicate? We write \(f(n) = \Theta\left(g(n)\right)\) if there exist positive constants \(c_{1}\), \(c_{2}\), and \(n_{0}\) such that for all \(n \geq n_{0}\), \(f(n)\) lies between \(c_{1} \, g(n)\) and \(c_{2} \, g(n)\).
When we use the big-theta notation, we say that \(g(n)\) is an asymptotically tight bound for \(f(n)\).
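As a concrete check with the expression from earlier, for \(f(n) = 3n^2 + 50n + 100\) we can pick \(c_1 = 3\), \(c_2 = 4\), and \(n_0 = 52\):

\[
3n^2 \;\le\; 3n^2 + 50n + 100 \;\le\; 4n^2 \quad \text{for all } n \ge 52,
\]

since \(50n + 100 \le n^2\) once \(n \ge 52\). Hence \(f(n) = \Theta\left(n^2\right)\).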
Big-O notation:
The big-O notation bounds a function from above. The big-O notation of the expression \(3n^2 + 50n + 100\) is \(O(n^{2})\). It is commonly used to describe the worst-case running time.
We write \(f(n) = O(g(n))\) if there exist positive constants \(c\) and \(n_{0}\) such that for all \(n \geq n_{0}\), \(f(n) \le c \, g(n)\).
When we use big-O notation, we say that \(g(n)\) is an asymptotic upper bound for \(f(n)\). It only gives an upper bound, not necessarily a tight one. For example, it is perfectly valid to say that the previous expression is \(O(n^3)\), because its running time grows no faster than \(n^3\). So the big-O notation can sometimes be imprecise.
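Concretely, the same constants from the big-theta example also witness the looser bound:

\[
3n^2 + 50n + 100 \;\le\; 4n^2 \;\le\; 4n^3 \quad \text{for all } n \ge 52,
\]

so \(O(n^2)\) and \(O(n^3)\) are both valid upper bounds, but only \(O(n^2)\) is tight.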
Big-Ω (Big-Omega) notation:
Big-omega notation bounds a function from below. It is commonly used to describe the best-case running time.
We write \(f(n) = \Omega(g(n))\) if there exist positive constants \(c\) and \(n_{0}\) such that for all \(n \geq n_{0}\), \(f(n) \ge c \, g(n)\).
When we use big-Ω notation, we say that \(g(n)\) is an asymptotic lower bound for \(f(n)\). Like big-O, big-Ω is not necessarily a tight bound either.
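Concretely, for the same \(f(n) = 3n^2 + 50n + 100\):

\[
3n^2 + 50n + 100 \;\ge\; 3n^2 \;\ge\; 3n \quad \text{for all } n \ge 1,
\]

so \(f(n) = \Omega(n^2)\) with \(c = 3\) and \(n_0 = 1\), and the looser bound \(f(n) = \Omega(n)\) also holds.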
That’s all for day 0 of this series. From tomorrow, we’ll dive into the algorithms! Let me know your thoughts. Thank you!