The quick-sort algorithm
Quick-sort is an efficient (in-place, unstable) sorting algorithm. Although its worst-case time complexity is O(n²), the average complexity is O(n log n). On real-world (≈ random) inputs, quick-sort is often even more efficient than other sorting algorithms with O(n log n) worst-case time complexity, like merge-sort.
I am currently reviewing my sorting algorithms and after all these years, quick-sort still baffles me. It is the least intuitive sorting algorithm that I know of. Now, there is a ton of content about the way quick-sort works and how to implement it. But this time, I wanted to prove to myself that it does work, and I had the most trouble finding a complete and rigorous correctness proof. In this post, I will formalize and gather the elements I have found here and there.
Here is the implementation we’re working with:

```python
def quick_sort(array, lo=0, hi=None):
    if hi is None:
        hi = len(array) - 1
    if lo < hi:
        idx = partition(array, lo, hi)   # partitioning step
        quick_sort(array, lo, idx - 1)   # sort the left part
        quick_sort(array, idx + 1, hi)   # sort the right part

def partition(array, lo, hi):
    p = array[hi]   # pivot: last element of the range
    l = lo          # left pointer
    for i in range(lo, hi):
        if array[i] < p:
            array[l], array[i] = array[i], array[l]
            l += 1
    array[l], array[hi] = array[hi], array[l]   # put the pivot in place
    return l
```

Note: l stands for ‘left pointer’ and p stands for ‘pivot’
Why the partitioning part works
The partitioning step (the for loop in the code above) selects a pivot p, say the last element of the input array. Then, the array is rearranged as follows: [ elements < p …, p, … elements ≥ p ]. Most likely, the partitions on ‘each side’ of p are not ordered. However, p is already in ‘the right place’: once the entire array is sorted, p will be at exactly this index.
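To see this on a concrete input, here is a quick trace of such a partitioning pass (a Lomuto-style scheme with the last element as pivot; the sample array is my own):

```python
def partition(array, lo, hi):
    p = array[hi]   # pivot: last element
    l = lo          # left pointer
    for i in range(lo, hi):
        if array[i] < p:
            array[l], array[i] = array[i], array[l]
            l += 1
        print(f"after step i={i}: l={l}, array={array}")
    array[l], array[hi] = array[hi], array[l]   # put the pivot in place
    return l

data = [3, 7, 1, 5, 4]
pos = partition(data, 0, len(data) - 1)
print(pos, data)   # → 2 [3, 1, 4, 5, 7]: the pivot 4 landed at its sorted index
```

Note that the two sides, [3, 1] and [5, 7], are not necessarily sorted; only the pivot is guaranteed to be at its final position.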
How come the partition works? Let’s show that the following invariants hold throughout the for loop. In what follows, n is the size of the array being partitioned, the pivot is p = array[n-1], and l is initialized to 0.

At the beginning of each step i of the for loop, call H(i) the conjunction of the three statements:

1) array[k] < p, for every index 0 ≤ k ≤ l-1
2) array[k] ≥ p, for every index l ≤ k ≤ i-1
3) l ≤ i
Intuition

H(i) means that at the beginning of step i in the for loop:

- All the elements strictly on the left of pointer l are < p
- The elements on the right of pointer l (included) are ≥ p, up until index i-1

These properties are important because they mean that the array is almost partitioned as we want it to be, up to index i-1. The exception is the pivot, which stays at the end of the array throughout the loop. That is why it is swapped with array[l] after the loop is completed.
Before diving into the proof of these statements, how can we have an intuition that they are true?
- The pointer l only moves to the right after an element < p is found at step i, and put at index l. So, the elements on the left of l can only be < p.
- Initially, l = i. i is incremented at each step of the for loop, while l can be incremented or not. So, l ≤ i. When does l < i? When i moves further ahead but not l, which happens when array[i] ≥ p. So, the elements in between l and i (if any) can only be ≥ p.
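This intuition can also be stress-tested empirically. Below is a sketch of the partition routine instrumented with assertions for the three statements of H(i); the helper name partition_checked and the random inputs are my own:

```python
import random

def partition_checked(array, lo, hi):
    """Lomuto partition that asserts the loop invariants H(i) at each step."""
    p = array[hi]
    l = lo
    for i in range(lo, hi):
        # Statement 1): everything strictly on the left of l is < p
        assert all(array[k] < p for k in range(lo, l))
        # Statement 2): everything at indices l, ..., i-1 is >= p
        assert all(array[k] >= p for k in range(l, i))
        # Statement 3): l <= i
        assert l <= i
        if array[i] < p:
            array[l], array[i] = array[i], array[l]
            l += 1
    array[l], array[hi] = array[hi], array[l]
    return l

# Stress the invariants on many random arrays (duplicates included).
for _ in range(1000):
    a = [random.randint(0, 9) for _ in range(random.randint(1, 12))]
    partition_checked(a, 0, len(a) - 1)
print("invariants held on all runs")
```

Of course, passing tests is evidence, not proof; the proof follows.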
Proof
The following is not a proof by induction per se, because the input array is finite. The logic is similar though: we are going to chain a (finite) number of implications together.
Base case
At the beginning of the first step of the for loop, l = i = 0, so statements 1) and 2) are vacuous truths (no index k exists in either case). Statement 3) is trivially true, as we’ve just defined l = 0 just before entering the loop. Therefore, H(0) holds.
‘Induction’
Let 0 ≤ i ≤ n-2 and assume H(i). Now, let’s enter inside the loop and see what happens at step i:

- A. If array[i] ≥ p. Nothing happens, the pointer l is not moved to the right: l keeps its value.
Statement 1): we can replace i by i+1 at no cost (the elements on the left of l are untouched): statement 1) still holds at step i+1.
Statement 2): same thing for indices l, …, i-1; moreover, since array[i] ≥ p, index i joins the range: array[k] ≥ p for l ≤ k ≤ i, i.e. statement 2) holds at step i+1.
Statement 3): since l ≤ i, then l ≤ i+1.
- B. Else, array[i] < p. Then array[l] and array[i] are swapped, and l is incremented:
Before the swap: array[i] < p (B.) and array[l] ≥ p as per statement 2) of H(i), provided l < i (if l = i, the swap does nothing and the ranges below are empty, so the argument still goes through).
So after the swap, array[l] < p and array[i] ≥ p.
Also, the pointer l is moved to the right: l becomes l+1.
Statement 1): array[l] < p (after the swap), hence all elements at indices 0, …, l are < p, which is exactly statement 1) at step i+1 since l is incremented to l+1.
Statement 2): which elements are ≥ p since the swap? Not array[l] anymore. But the elements at indices l+1, …, i-1 (if any) are still ≥ p as per statement 2) (step i). Moreover, array[i] has now joined the club, since the swap. Then array[k] ≥ p for l+1 ≤ k ≤ i, i.e. statement 2) holds at step i+1 with the new value l+1 of the pointer.
Statement 3): l+1 ≤ i+1, since l ≤ i.

In both possible cases A. and B., H(i+1) holds.
Conclusion
We have proven:

- H(0) holds
- H(i) ⇒ H(i+1), for every 0 ≤ i ≤ n-2

From the above, we can conclude that H(n-1) holds:

- array[k] < p, for every 0 ≤ k ≤ l-1
- array[k] ≥ p, for every l ≤ k ≤ n-2
- l ≤ n-1

H(n-1) is true at the beginning of step n-1, or more precisely at the end of step n-2, since the loop stops before i = n-1. At this point, one more instruction is executed: swapping array[l] and array[n-1], i.e. putting the pivot at index l. So, after the final swap, array[l] = p and array[n-1] ≥ p (from H(n-1), statement 2; if l = n-1, the swap is a no-op and array[n-1] = p). Finally:

array[0], …, array[l-1] < p = array[l] ≤ array[l+1], …, array[n-1]
Q.E.D.
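The post-condition we just proved can itself be phrased as a runnable check. A small sketch (the partition routine as described above; the test array is my own, with duplicates of the pivot to exercise the ≥ side):

```python
def partition(array, lo, hi):
    p = array[hi]
    l = lo
    for i in range(lo, hi):
        if array[i] < p:
            array[l], array[i] = array[i], array[l]
            l += 1
    array[l], array[hi] = array[hi], array[l]
    return l

a = [4, 1, 4, 9, 0, 4]
p = a[-1]                                # pivot value: 4
idx = partition(a, 0, len(a) - 1)
# Exactly what was established above:
assert a[idx] == p                       # the pivot sits at its final index
assert all(x < p for x in a[:idx])       # strictly smaller on the left
assert all(x >= p for x in a[idx + 1:])  # greater or equal on the right
```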
Why the whole algorithm works
Now that the correctness of the partitioning part is established, we need to prove that recursively calling quick_sort(array) sorts array. This part is much easier than the correctness of the partition, and better documented as well.
- When the size of the input array is 0 or 1, quick_sort() does nothing and the array is sorted.
- Suppose that quick_sort() adequately sorts any array of size at most n.
- Then, let’s see what happens upon calling quick_sort() on an array of size n+1. First, the array will be partitioned in three parts: [left_part, pivot, right_part]. We have shown in the previous section that pivot is already at the right index. Since all the elements of left_part are less than pivot, if the recursive call quick_sort(left_part) sorts the left part, then the job is done on the left side. Same logic for the right part. Well, the size of [left_part, pivot, right_part] is n+1, so neither left_part nor right_part can have more than n elements! So, by the inductive hypothesis, quick_sort(left_part) and quick_sort(right_part) do sort the subarrays, and the array [left_part, pivot, right_part] of size n+1 ends up being sorted. The inductive statement holds at step n+1.
The above proves by strong induction that quick_sort sorts an array of any size n ≥ 0.
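As a final sanity check (evidence, not a substitute for the proof), the whole algorithm can be compared against Python’s built-in sorted on random inputs:

```python
import random

def partition(array, lo, hi):
    p = array[hi]
    l = lo
    for i in range(lo, hi):
        if array[i] < p:
            array[l], array[i] = array[i], array[l]
            l += 1
    array[l], array[hi] = array[hi], array[l]
    return l

def quick_sort(array, lo=0, hi=None):
    if hi is None:
        hi = len(array) - 1
    if lo < hi:
        idx = partition(array, lo, hi)
        quick_sort(array, lo, idx - 1)
        quick_sort(array, idx + 1, hi)

# Property-based check: quick_sort must agree with sorted() on random arrays,
# including empty arrays and arrays with duplicates.
for _ in range(1000):
    a = [random.randint(-50, 50) for _ in range(random.randint(0, 30))]
    expected = sorted(a)
    quick_sort(a)
    assert a == expected
print("all random tests passed")
```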