The lower bound n log n does not apply to algorithms that do not compare array
elements but use some other information. An example of such an algorithm is
counting sort, which sorts an array in O(n) time assuming that every element in
the array is an integer between 0 and c, and c = O(n).
The algorithm creates a bookkeeping array, whose indices are elements of the
original array. The algorithm iterates through the original array and calculates
how many times each element appears in the array.
For example, the array

    1 3 6 9 9 3 5 9

corresponds to the following bookkeeping array:

    index: 1 2 3 4 5 6 7 8 9
    count: 1 0 2 0 1 1 0 0 3
For example, the value at position 3 in the bookkeeping array is 2, because
the element 3 appears 2 times in the original array.
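As a concrete check, here is a small C++ sketch (the names arr and count are my own choices) that builds this bookkeeping array and prints it:

    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> arr = {1, 3, 6, 9, 9, 3, 5, 9};  // the example array

        // Bookkeeping array indexed 0..9 (index 0 stays unused here,
        // since all elements are between 1 and 9).
        std::vector<int> count(10, 0);
        for (int x : arr) count[x]++;

        // Prints: 1 0 2 0 1 1 0 0 3 (the counts for indices 1..9).
        for (int i = 1; i <= 9; i++) std::cout << count[i] << " ";
        std::cout << "\n";
    }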
Construction of the bookkeeping array takes O(n) time. After this, the sorted
array can be created in O(n) time because the number of occurrences of each
element can be retrieved from the bookkeeping array. Thus, the total time
complexity of counting sort is O(n).
Counting sort is a very efficient algorithm but it can only be used when the
constant c is small enough, so that the array elements can be used as indices in
the bookkeeping array.
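As a sketch, the whole procedure in C++ (the function name counting_sort is my own choice; it assumes the elements are integers between 0 and c):

    #include <vector>

    // Sorts arr in O(n + c) time, assuming every element is an
    // integer between 0 and c.
    void counting_sort(std::vector<int>& arr, int c) {
        // Bookkeeping array: count[x] = number of occurrences of x.
        std::vector<int> count(c + 1, 0);
        for (int x : arr) count[x]++;

        // Rewrite arr in sorted order: each value x is written count[x] times.
        int pos = 0;
        for (int x = 0; x <= c; x++)
            for (int i = 0; i < count[x]; i++)
                arr[pos++] = x;
    }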
- Iterate over the input array and find the maximum value present in it.
- Declare an auxiliary count array of size max+1, initialised with 0's.
- For each element of the input array, increment the count at the corresponding index of the auxiliary array.
- Compute cumulative sums in the auxiliary array by adding each count to the previous one.
- Each cumulative value now signifies the position of the last occurrence of that element in the sorted array.
- Iterate over the input array from the last element to the first; this keeps the sort stable.
- Place each element at the index given by its cumulative count in an output array, then reduce that count by 1, so the next occurrence of the same element (if any) lands one position earlier.
- Finally, transfer the output array back into the actual input array.
countingSort(arr, n)
    maximum <- largest element in arr
    create a count array of size maximum+1, initialised with all 0's
    for i <- 1 to n
        count[arr[i]] <- count[arr[i]] + 1    // frequency of each element
    for i <- 1 to maximum
        count[i] <- count[i] + count[i-1]     // cumulative sum of current and previous counts
    for j <- n down to 1                      // backwards, so the sort is stable
        output[count[arr[j]]] <- arr[j]       // place arr[j] at its position in the sorted array
        count[arr[j]] <- count[arr[j]] - 1
    copy output back into arr
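The pseudocode translates to C++ roughly as follows (a sketch assuming non-negative integer elements; it uses 0-indexed arrays, so the 1-indexed positions of the pseudocode become 0-indexed by decrementing the count before using it):

    #include <algorithm>
    #include <vector>

    void countingSort(std::vector<int>& arr) {
        if (arr.empty()) return;
        int maximum = *std::max_element(arr.begin(), arr.end());

        // Frequency of each element.
        std::vector<int> count(maximum + 1, 0);
        for (int x : arr) count[x]++;

        // Cumulative sums: count[x] becomes the number of elements <= x.
        for (int i = 1; i <= maximum; i++) count[i] += count[i - 1];

        // Walk the input backwards so equal elements keep their
        // relative order (the sort is stable).
        std::vector<int> output(arr.size());
        for (int j = (int)arr.size() - 1; j >= 0; j--)
            output[--count[arr[j]]] = arr[j];

        // Copy the result back into the input array.
        arr = output;
    }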
The best case time complexity occurs when all elements lie in the same narrow range, that is, when k is equal to 1.
In this case, counting the occurrence of each element in the input range takes constant time, and finding
the correct index of each element in the sorted output array takes O(n) time, so the total time complexity reduces to O(1 + n), i.e. O(n), which is linear. Here:
- n is the number of elements
- k is the range of elements (k = largest element - smallest element)
The worst case time complexity occurs when the data is skewed, that is, when the largest element is significantly larger than the other elements; this increases the range k. For instance, sorting the three-element array 1 2 1000000 forces a count array of about a million entries.
Since the time complexity of the algorithm is O(n+k), when k is of the order O(n^2) the time complexity becomes O(n + n^2), which essentially
reduces to O(n^2), and if k is of the order O(n^3), it becomes O(n + n^3), which essentially reduces to O(n^3).
Hence, in this case, the time complexity becomes O(k) for such large values of k, and it only gets worse as k grows further.
Thus the worst case of counting sort occurs when the range k of the elements is significantly larger than the number of elements n.
In the above algorithm we used an auxiliary count array of size k+1, where k is the maximum element of the given array. Therefore the space complexity of the counting sort algorithm is O(k) (or O(n + k) if the output array is counted as well).
- Space Complexity: O(k)

The larger the range of elements in the given array, the larger the space complexity; hence the space complexity of counting sort is bad if the range of the integers is very large, since an auxiliary array of that size has to be allocated.