Why quick-sort works

The quick-sort algorithm

Quick-sort is an efficient (in-place, unstable) sorting algorithm. Although its worst-case time complexity is O(n2)O(n^{2}), the average complexity is O(nlog(n))O(nlog(n)). On real-world (≈ random) inputs, quick-sort is even more efficient than other sorting algorithms with worst-case O(nlog(n))O(nlog(n)) time complexity like merge-sort.

I am currently reviewing my sorting algorithms and after all these years, quick-sort still baffles me. It is the less intuitive sorting algorithms that I know of. Now, there is a ton of content about the way quick-sort works and how to implement it. But this time, I wanted to prove to myself that it does work, and I had the most trouble finding a complete and rigorous correctness proof. In this post, I will formalize and gather the elements I have found here and there.

Here is the implementation we’re working with:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22

def quick_sort(array, start, end):
    '''
    Sorts input `array` between indexes `start` and `end`.
    Returns None, sorting is made in-place.
    '''
    # Base case: array of 0 or 1 elements can only be sorted
    if start >= end:
        return None
    
    # Partition input array
    l = start
    p = array[end]
    for i in range(start, end):
        if array[i] < p:
            array[l], array[i] = array[i], array[l]
            l += 1
    array[l], array[end] = array[end], array[l]

    # Recursive calls on left and right partitions
    quick_sort(array, start, l - 1)
    quick_sort(array, l + 1, end)

Note: l stand for ’left pointer’ and p stands for ‘pivot’

Why the partitioning part works

The partitioning step (lines 12-18 above) selects a pivot p, say the last element of the input array. Then, the array is rearranged as follows: [ elements < p …, p, … elements > p ]. Most likely, the partitions on ’each side’ of p are not ordered. However, p is already in ’the right place’: when the entire array will be sorted, p will be at exactly the same index.

How come the partition works? Let’s show the following invariants hold throughout the for loop:

At the beginning of each i{start,,end1}i \in \{ start, …, end - 1 \},

P(i):{1)&ThinSpace;array[k]&lt;p&ThinSpace;,&ThinSpace;startk&lt;li2)&ThinSpace;array[k]p&ThinSpace;,&ThinSpace;lik&lt;i \mathcal{P} (i): \begin{cases} 1 ) \, array[k] &lt; p \, , \, start \leq k &lt; l_i \\[6pt] 2 ) \, array[k] \geq p \, , \, l_i \leq k &lt; i \\ \end{cases}

Intuition

P(i)\mathcal{P} (i) means that at the beginning of step i in the for loop:

  1. All the elements strictly on the left of pointer l are < p
  2. The elements on the right of pointer l (included) are ≥ p, up until index i-1

These properties are important because the mean that the array is almost partitioned as we want it to be, up to index i-1. The exception is the pivot, which stays at the end of the array throughout the loop. That is why it is swapped with array[l] after the loop is completed.

Before diving into the proof of these two statements, how can we have an intuition that they are true?

  1. The pointer l only moves to the right after an element < p is found at step i, and put at index l. So, the elements on the left of l can only be < p.

  2. Initially, l = i. i is incremented at each step of the for loop, while l can be incremented or not. So, li. When does l < i? When i moves further away but not l, which happens when array[i]p. So, the elements in between l and i (if any) can only be ≥ p.

Proof

The following is not a proof by induction per say because the input array is finite. The logic is similar though: we are going to chain a (finite) number of implications together.

Base case

At the beginning of the first step of the for loop, i=s=lii = s = l_i, so statements 1) and 2) are vacuous truths (no kk exists in both cases). Statement 3) is trivially true as we’ve just defined p=array[end]p=array[end] just before entering the loop. Therefore, P(start)\mathcal{P} (start).

‘Induction’

Let i{start,,end1}i \in \{ start, …, end - 1 \} and assume P(i)\mathcal{P} (i). Now, let’s enter inside the loop and see what happens at step ii:

 

  • A. If array[i]parray[i] \geq p.

Nothing happens, the pointer l is not moved to right: li+1=lil_{i+1} = l_i.

Statement 1): We can replace lil_i by li+1l_{i+1} at no cost: statement 1) still holds at step i+1i+1.

Statement 2): same thing, array[k]p&ThinSpace;,&ThinSpace;li+1k&lt;iarray[k] \geq p \, , \, l_{i+1} \leq k &lt; i since li+1=lil_{i+1} = l_i. Moreover, since array[i]parray[i] \geq p, then array[k]p&ThinSpace;,&ThinSpace;li+1k&lt;i+1array[k] \geq p \, , \, l_{i+1} \leq k &lt; i+1, i.e statement 2) holds at step i+1i+1

 

  • B. Else if array[i]&lt;parray[i] &lt; p. Then:

Before the swap: array[i]&lt;parray[i] &lt; p (B.) and array[li]parray[l_i] \geq p as per statement 2) of P(i)\mathcal{P} (i) with k=lik=l_i.

So after the swap, array[i]parray[i] \geq p and array[li]&lt;parray[l_i] &lt; p

Also, the pointer l is moved to right: li+1=li+1l_{i+1} = l_i+1.

Statement 1): array[li]&lt;parray[l_i] &lt; p (after the swap), hence statement 1) is valid for startklistart \leq k \leq l_i, which is equivalent to startk&lt;li+1start \leq k &lt; l_{i+1} since li+1=li+1l_{i+1} = l_i+1

Statement 2): which elements are greater than pp since the swap? Not array[li]array[l_i] anymore. But elements at index li+1l_i + 1, …, i1i-1 (if any) are still p\geq p as per statement 2) (step ii). Moreover, array[i]array[i] has now joined the club, since the swap. Then array[k]p&ThinSpace;,&ThinSpace;li+1k&lt;i+1array[k] \geq p \, , \, l_i + 1 \leq k &lt; i + 1, i.e. array[k]p&ThinSpace;,&ThinSpace;li+1k&lt;i+1array[k] \geq p \, , \, l_{i+1} \leq k &lt; i + 1

In both possible cases A. and B., P(i+1)\mathcal{P} (i+1) holds.

Conclusion

We have proven:

  • P(0)\mathcal{P} (0)
  • P(0)&ThickSpace;&ThickSpace;P(1)\mathcal{P} (0) \implies \mathcal{P} (1)
  • P(1)&ThickSpace;&ThickSpace;P(2)\mathcal{P} (1) \implies \mathcal{P} (2)
  • P(end1)&ThickSpace;&ThickSpace;P(end)\mathcal{P} (end-1) \implies \mathcal{P} (end)

From the above, we can conclude that P(end)\mathcal{P} (end) holds:

P(end):{1)&ThinSpace;array[k]&lt;p&ThinSpace;,&ThinSpace;startk&lt;lend2)&ThinSpace;array[k]p&ThinSpace;,&ThinSpace;lendk&lt;end \mathcal{P} (end): \begin{cases} 1 ) \, array[k] &lt; p \, , \, start \leq k &lt; l_{end} \\[6pt] 2 ) \, array[k] \geq p \, , \, l_{end} \leq k &lt; end \\ \end{cases}

The above is true at the beginning of step i=endi=end, or more precisely at the end of step i=end1i=end-1 since the loop stops before i=endi=end. At this point, one more instruction is executed: swapping array[lend]array[l_{end}] and array[end]=parray[end]=p. So, after the final swap, array[lend]=parray[l_{end}] = p and array[end]parray[end] \geq p (from P(end)\mathcal{P} (end) - Statement 2). Finally:

{array[k]&lt;p&ThinSpace;,&ThinSpace;startk&lt;lendarray[lend]=parray[k]p&ThinSpace;,&ThinSpace;lendkend \begin{cases} array[k] &lt; p \, , \, start \leq k &lt; l_{end} \\[6pt] array[l_{end}] = p \\[6pt] array[k] \geq p \, , \, l_{end} \leq k \leq end \\ \end{cases}

Q.E.D.

Why the whole algorithm works

Now that the correctness of the partitioning part is established, we need to prove that recursively calling quick_sort(array) sorts array. This part is much easier than the correctness of the partition, and more documented as well.

  • When the size of the input array is 0 or 1, quick_sort() does nothing and the array is sorted
  • Suppose that quick_sort() adequately sorts any array of size k{0,1,,n},n1k \in \{ 0, 1, …, n \}, n \geq 1
  • Then, let’s see what happens upon calling quick_sort() on an array of size n+1n+1. First, the array will be partitioned in three parts: [left_part, pivot, right_part]. We have show in the previous section that pivot is already at the right index. Since all the elements of left_part are less than pivot, if the recursive call quick_sort(left_part) sorts the left part, then the job is done on the left side. Same logic for the right part. Well, the size of [left_part, pivot, right_part] is sl+1+sr=n+1s_l+1+s_r=n+1, so sl+srns_l+s_r \leq n: left_part and left_part cannot be more than nn elements! So, by the inductive hypothesis, quick_sort(left_part) and quick_sort(left_part) do sort the arrays, and the array [left_part, pivot, right_part] of size n+1n+1 ends up being sorted. The inductive statement holds at step n+1n+1.

The above proves by strong induction that quick_sort sorts an array of any size n0n \geq 0.

References