Consider the sorting problem discussed in another chapter. Namely, consider the problem of permuting the elements of an array v[0 .. n-1] to put them in increasing order, i.e., rearrange the elements of the array so that v[0] ≤ . . . ≤ v[n-1].
That previous chapter analysed some simple algorithms for the problem. The present chapter looks at a more sophisticated algorithm — the Quicksort — invented by C. A. R. Hoare in 1962. In some rare instances, Quicksort is as slow as the simple algorithms, but in general it is much faster. More precisely, the algorithm is linearithmic on average but only quadratic in the worst case.
Here is the basic ideia of the algorithm: if the left half of the array contains the small elements and the right half contains the large elements, then all we have to do is rearrange each half in increasing order independently.
We shall use two shorthands to discuss the algorithm.
The expression
v[i..k] ≤ x
will be used as a shorthand for
v[j] ≤ x for all j
in the set of indices i..k
.
The expression
v[i..k] ≤ v[p..r]
will be interpreted as
v[j] ≤ v[q]
for all j in the set i..k
and all q in the set p..r
.
Table of contents:
The core of the Quicksort algorithm is a partition subproblem, that we shall formulate rather vaguely on purpose:
rearrange an array v[p..r] so that all the small elements fall into the
leftpart of the array and all the large elements fall into therightpart.
The begin solving this problem, we must choose a pivot, say c . The elements of the array that are greater than c will be considered large and the remaining ones will be considered small. It is important to choose c so that each part of the rearranged array is strictly smaller than the whole array. The challenge is to solve the partition subproblem quickly and without resorting to an auxiliary array for workspace.
There are many concrete formulations of the partition problem. Here is the first one: rearrange v[p..r] so that v[p..j] ≤ v[j+1..r] for some j in p..r-1. (In this formulation, the pivot is not explicit.) Here is a second formulation: rearrange v[p..r] so that
v[p..j-1] ≤ v[j] < v[j+1..r]
for some j in p..r. (Here, v[j] is the pivot.) An example:
j | |||||||||
666 | 222 | 111 | 777 | 555 | 444 | 555 | 777 | 999 | 888 |
In this chapter, we shall use the second formulation of the partition problem.
v[j..r] > 0into
v[j..r] ≥ 0. Does it make sense to require that v[j] == 0?
int prttn (int v[], int p, int r) { int j = r; for (int i = r-1; i >= p; i--) if (v[i] > v[r]) { int t = v[r]; v[r] = v[i]; v[i] = t; j = i; } return j; }
int prttn (int v[], int p, int r) {
int w[1000], i = p, j = r, c = v[r];
for (int k = p; k < r; ++k)
if (v[k] <= c) w[i++] = v[k];
else w[j--] = v[k];
// now i == j
w[j] = c;
for (int k = p; k <= r; ++k) v[k] = w[k];
return j; }
int prttn (int v[], int p, int r) { int i = p, j = r; int q = (p + r)/2; do { while (v[i] < v[q]) ++i; while (v[j] > v[q]) --j; if (i <= j) { int t = v[i], v[i] = v[j], v[j] = t; ++i, --j; } } while (i <= j); return i; }
Show an example in which this function
does not produce the promised result.
What if we replace return i
by
return i-1
?
Is it possible to make a few changes to the code
to make the function work as promised?
yescase, your algorithm should return the index j.
Here is how the book by Cormen, Leiserson, Rivest and Stein solves the partition problem. (The algorithm is attributed to N. Lomuto.) The function solves the second formulation of the partition problem:
// Rearranges the elements of the array // v[p..r], with p <= r, and returns j // in p..r such that v[p..j-1] <= v[j] < // v[j+1..r]. static int partition (int v[], int p, int r) { int c = v[r]; // pivot int t, j = p; for (int k = p; /*A*/ k < r; ++k) if (v[k] <= c) { t = v[j], v[j] = v[k], v[k] = t; ++j; } t = v[j], v[j] = v[r], v[r] = t; return j; }
(The keyword static indicates that partition is an auxiliary function and prevents the user of Quicksort from calling it directly.)
To show that the function partition is correct, just check that at the start of each iteration — that is, every time the execution transits through point A — we have the following configuration:
p | j | k | r | ||||||
≤ c | ≤ c | ≤ c | > c | > c | ? | ? | ? | ? | c |
(Note that we can have j == p as well as k == j, that is, the left or the right half of the array can be empty.) More precisely, the following invariants hold: at the start of each iteration,
At the last transit through point A, we shall have k == r and therefore the following configuration:
p | j | k | |||||||
≤ c | ≤ c | ≤ c | ≤ c | ≤ c | > c | > c | > c | > c | c |
Hence, when the execution of the function comes to an end, we shall have p ≤ j ≤ r and v[p..j-1] ≤ v[j] < v[j+1..r], as promised.
p | j | r | |||||||
≤ c | ≤ c | ≤ c | ≤ c | ≤ c | c | > c | > c | > c | > c |
We would like j to be halfway between p and r, but we must accept the extreme cases in which j is p or r.
Performance. How much time does the partition function take? The time consumption is proportional to the number of comparisons between elements of the array and this number is proportional to the size r - p + 1 of the array. The function is, therefore, linear.
int partition_G (int v[], int p, int r) { int c = v[p], i = p+1, j = r; while (true) { while (i <= r && v[i] <= c) ++i; while (c < v[j]) --j; if (i >= j) break; int t = v[i], v[i] = v[j], v[j] = t; ++i; --j; } v[p] = v[j], v[j] = c; return j; }
int prtG (int v[], int p, int r) { int c = v[p], i = p+1, j = r; while (i <= j) { if (v[i] <= c) ++i; else { int t = v[i], v[i] = v[j], v[j] = t; --j; } } v[p] = v[j], v[j] = c; return j; }
int partition_R (int v[], int p, int r) { int c = v[p], i = p+1, j = r; while (1) { while (i < j && c < v[j]) --j; while (i < j && v[i] <= c) ++i; if (i == j) break; int t = v[i], v[i] = v[j], v[j] = t; } if (v[j] > c) return p; v[p] = v[j], v[j] = c; return j; }
Now that the partition problem is solved, we can take care of Quicksort proper. The algorithm uses the divide and conquer strategy and looks like an upended Mergesort:
// This function rearranges an array // v[p..r] in increasing order. void quicksort (int v[], int p, int r) { if (p < r) { // 1 int j = partition (v, p, r); // 2 quicksort (v, p, j-1); // 3 quicksort (v, j+1, r); // 4 } }
If p ≥ r (this is the basis of the recursion), there is nothing to do. If p < r, the code reduces the instance v[p..r] of the problem to the pair of instances v[p..j-1] and v[j+1..r]. Since p ≤ j ≤ r, these two instances are strictly smaller than the original instance. Hence, by induction hypothesis, v[p..j-1] will be in increasing order at the end of line 3 and v[j+1..r] will be in increasing order at the end of line 4. Since v[p..j-1] ≤ v[j] < v[j+1..r] due to the partition function, the whole array will be in increasing order at end of line 4. This argument proves the function correct.
We can delete the second
recursive call (line 4)
and replace the if
by a while
.
The resulting code is equivalent to quicksort:
void qcksrt (int v[], int p, int r) { while (p < r) { int j = partition (v, p, r); qcksrt (v, p, j-1); p = j + 1; } }
p < rby
p <= ron line 1 of quicksort? What happens if we replace
p < rby
p != ron line 1?
j-1by
jon line 3 of quicksort? What happens if we replace
j+1by
jon line 4?
quicksort (v,1,4) quicksort (v,1,2) quicksort (v,1,0) quicksort (v,2,2) quicksort (v,4,4)
Repeat the exercise for the array v[1..6] = 55 44 22 11 66 33 .
void qsrt (int v[], int p, int r) { int j = partition (v, p, r); if (p < j-1) qsrt (v, p, j-1); if (j+1 < r) qsrt (v, j+1, r); }
The animation at right (copied from Simon Waldherr / Golang Sorting Visualization) shows Quicksort running on an array v[0..79] of positive numbers. Each element v[i] of the array is represented by the point (i, v[i]).
The first figure below (copied from Wikimedia) shows Quicksort running on a permutation v[0..249] of the set 0..249. The figure shows snapshots of the array at 8 moments during the execution of the algorithm. The second figure (copied from Wikipedia) shows Quicksort running on a permutation v[0..32] of the set 0..32.
There are many other animations of Quicksort. Here are some examples:
broom. Each hair of the
broomis an element of the array and the inclination of the hair is the value of the element. Here are some standalone presentations of the animation:
The function quicksort consumes time proportional to the number of comparisons between elements of the array. If the index j returned by partition is always more or less halfway between p and r, the number of comparisons will be approximately n log n , where n is the size, r-p+1, of the array. On the other hand, if the array is already sorted or nearly sorted, the number of comparisons is approximately
n2 .
Therefore, the worst case of quicksort is no better than that of the basic algorithms. Fortunately, the worst case is very rare: on average, the time consumption of quicksort is proportional to
n log n .
(But the proof of this statement is not easy.) Therefore, the algorithm is linearithmic on average but quadratic in the worst case.
void psort (int v[], int p, int r) { if (p >= r) return; if (v[p] > v[r]) { int t = v[p], v[p] = v[r], v[r] = t; } psort (v, p, r-1); psort (v, p+1, r); }
In the basic version of Quicksort, the code takes care immediately of the segment v[p..j-1] of the array and deals with the segment v[j+1..r] only after v[p..j-1] has been sorted. Depending on the value of j in the successive calls to the function, the execution stack can increase a lot, reaching a height equal to the size of the array. (This happens, for example, if the array is originally in decreasing order.) This phenomenon can exhaust the internal memory. To avoid the excessive growth of the execution stack, we must make two patches:
ifwith a
while.
After these two patches, the size of the array segment at the top of the execution stack will be smaller than one half of the size of the segment which is immediately below the top in the stack. More generally, the segment which is at any position of the execution stack will be smaller than one half of the segment immediately below. In this way, when the function processes an array with n elements, the height of the stack will stay below log n.
// This function rearranges the array
// v[p..r] in increasing order.
void
quickSort (int v[], int p, int r)
{
while (p < r) {
int j = partition (v, p, r);
if (j - p < r - j) {
quickSort (v, p, j-1);
p = j + 1;
} else {
quickSort (v, j+1, r);
r = j - 1;
}
}
}
void quicks (int v[], int p, int r) { if (p < r) { int j = partition (v, p, r); if (j - p < r - j) { quicks (v, p, j-1); quicks (v, j+1, r); } else { quicks (v, j+1, r); quicks (v, p, j-1); } } }