7.2 Performance of quicksort
7.2-1
Use the substitution method to prove that the recurrence $T(n) = T(n - 1) + \Theta(n)$ has the solution $T(n) = \Theta(n^2)$, as claimed at the beginning of section 7.2.
Proof
We are given the recurrence
$$ T(n) = T(n - 1) + \Theta(n), $$
and we aim to prove $T(n) = \Theta(n^2)$.
The $\Theta(n)$ term stands for some function $f(n) = \Theta(n)$. By the definition of $\Theta$-notation, there exist positive constants $c_1$, $c_2$, and $n_0$ such that for all $n \ge n_0$,
$$ c_1 n \le f(n) \le c_2 n. $$
Thus, the recurrence becomes
$$ T(n) = T(n - 1) + f(n), \quad \text{with} \quad c_1 n \le f(n) \le c_2 n. $$
Upper Bound
We conjecture $T(n) \le c_3 n^2$ for some constant $c_3$. Assume this holds for $T(k)$, i.e., $T(k) \le c_3 k^2$. Then,
$$ T(k+1) = T(k) + f(k+1) \le c_3 k^2 + c_2 (k+1). $$
We want to show
$$ c_3 k^2 + c_2 (k+1) \le c_3 (k+1)^2. $$
Expanding both sides:
$$ c_3 (k+1)^2 = c_3 k^2 + 2c_3 k + c_3, $$
and
$$ c_3 k^2 + c_2 (k+1) = c_3 k^2 + c_2 k + c_2. $$
Thus, we need
$$ c_2 k + c_2 \le 2c_3 k + c_3. $$
This holds whenever $c_2 \le c_3$ (which also gives $c_2 \le 2c_3$), so choosing $c_3 \ge c_2$, and large enough that the base case $T(n_0) \le c_3 n_0^2$ holds, completes the inductive step.
Lower Bound
The lower bound follows symmetrically by applying the lower bound $c_1 n$ for $f(n)$: assuming $T(k) \ge c_4 k^2$, we get $T(k + 1) \ge c_4 k^2 + c_1 (k + 1) \ge c_4 (k + 1)^2$, since $c_1 k + c_1 \ge 2 c_4 k + c_4$ whenever $c_4 \le c_1 / 2$. Thus, we can conclude that
$$ T(n) = \Theta(n^2). $$
This completes the proof.
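As a quick sanity check (not part of the proof), the recurrence can also be unrolled numerically. Below is a minimal Python sketch, assuming the $\Theta(n)$ term is exactly $n$ and $T(0) = 0$; the ratio $T(n) / n^2$ should approach $1/2$.

```python
# Numerical sanity check, taking the Theta(n) term to be exactly n.
# Unrolling T(n) = T(n - 1) + n with T(0) = 0 gives T(n) = n(n + 1)/2.

def T(n):
    total = 0
    for i in range(1, n + 1):   # T(i) = T(i - 1) + i at step i
        total += i
    return total

for n in [10, 100, 1000, 10000]:
    print(n, T(n), T(n) / n**2)  # the ratio tends to 0.5, consistent with Theta(n^2)
```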
7.2-2
What is the running time of $\text{QUICKSORT}$ when all elements of the array $A$ have the same value?
It is $\Theta(n^2)$, since one of the two subproblems produced by $\text{PARTITION}$ is always empty (see Exercise 7.1-2), which yields the same recurrence as in Exercise 7.2-1.
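To see the empty side concretely, here is a minimal Python sketch of a Lomuto-style $\text{PARTITION}$ (0-indexed; the function name and the array are just for illustration). On an all-equal array every comparison $A[j] \le x$ succeeds, so the returned index is $r$ and the right subarray is empty.

```python
def partition(A, p, r):
    """Lomuto-style partition around the last element (0-indexed sketch)."""
    x = A[r]                        # pivot
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:               # always true when all elements are equal
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

A = [5] * 8
q = partition(A, 0, len(A) - 1)
print("q =", q, "left size:", q, "right size:", len(A) - 1 - q)  # q = 7: right side empty
```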
7.2-3
Show that the running time of $\text{QUICKSORT}$ is $\Theta(n^2)$ when the array $A$ contains distinct elements and is sorted in decreasing order.
If the array is already sorted in decreasing order, the pivot element is less than all the other elements. The partition step takes $\Theta(n)$ time and leaves one subproblem of size $n - 1$ and one of size $0$. (In the subsequent calls the pivot alternates between being the smallest and the largest element of its subarray, so one side of the split stays empty at every level.) This gives the recurrence considered in Exercise 7.2-1, which we showed has the solution $\Theta(n^2)$.
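A short sketch (the helper name `quicksort_comparisons` is assumed for illustration) that counts the key comparisons made when sorting a decreasing array with last-element pivots confirms the quadratic total, which equals $n(n - 1) / 2$.

```python
def quicksort_comparisons(A, p, r):
    """Sort A[p..r] with last-element pivots and return the number of A[j] <= x comparisons."""
    if p >= r:
        return 0
    x, i = A[r], p - 1
    count = r - p                              # the partition loop makes r - p comparisons
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    q = i + 1
    return (count
            + quicksort_comparisons(A, p, q - 1)
            + quicksort_comparisons(A, q + 1, r))

for n in [100, 200, 400]:
    A = list(range(n, 0, -1))                  # distinct elements in decreasing order
    print(n, quicksort_comparisons(A, 0, n - 1), n * (n - 1) // 2)  # the two counts agree
```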
7.2-4
Banks often record transactions on an account in order of the times of the transactions, but many people like to receive their bank statements with checks listed in order by check numbers. People usually write checks in order by check number, and merchants usually cash them with reasonable dispatch. The problem of converting time-of-transaction ordering to check-number ordering is therefore the problem of sorting almost-sorted input. Argue that the procedure $\text{INSERTION-SORT}$ would tend to beat the procedure $\text{QUICKSORT}$ on this problem.
The more sorted the array is, the less work insertion sort does: $\text{INSERTION-SORT}$ runs in $\Theta(n + d)$ time, where $d$ is the number of inversions in the array. In the scenario described above, the number of inversions tends to be small, so insertion sort runs in close to linear time.
On the other hand, if $\text{PARTITION}$ picks a pivot that does not participate in any inversion, that pivot is the largest element of its subarray, and the split has an empty side. Since there are few inversions, $\text{QUICKSORT}$ is very likely to produce empty partitions, and its running time degrades toward the quadratic worst case.
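A small experiment along these lines is sketched below, under assumed parameters: a sorted array of $n = 500$ "check numbers" with ten random adjacent swaps, insertion sort measured by element shifts (one per inversion), and quicksort with last-element pivots measured by comparisons.

```python
import random

def insertion_sort_shifts(A):
    """Return the number of element shifts INSERTION-SORT performs (exactly one per inversion)."""
    A = A[:]                                   # sort a copy
    shifts = 0
    for j in range(1, len(A)):
        key, i = A[j], j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
            shifts += 1
        A[i + 1] = key
    return shifts

def quicksort_comparisons(A, p, r):
    """Sort A[p..r] with last-element pivots and return the number of key comparisons."""
    if p >= r:
        return 0
    x, i, count = A[r], p - 1, r - p
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    q = i + 1
    return count + quicksort_comparisons(A, p, q - 1) + quicksort_comparisons(A, q + 1, r)

n = 500
A = list(range(n))                             # check numbers already in order
for _ in range(10):                            # almost sorted: ten random adjacent swaps
    k = random.randrange(n - 1)
    A[k], A[k + 1] = A[k + 1], A[k]

print("insertion sort shifts: ", insertion_sort_shifts(A))               # small: one per inversion
print("quicksort comparisons: ", quicksort_comparisons(A[:], 0, n - 1))  # on the order of n^2 / 2
```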
7.2-5
Suppose that the splits at every level of quicksort are in proportion $1 - \alpha$ to $\alpha$, where $0 < \alpha \le 1 / 2$ is a constant. Show that the minimum depth of a leaf in the recursion tree is approximately $-\lg n / \lg\alpha$ and the maximum depth is approximately $-\lg n / \lg(1 - \alpha)$. (Don't worry about integer round-off.)
The minimum depth corresponds to repeatedly taking the smaller subproblem, that is, the branch whose size is proportional to $\alpha$. The subproblem size falls to $1$ after $k$ steps, where $\alpha^k n \approx 1$. Therefore, $k \approx \log_\alpha(1 / n) = -\frac{\lg n}{\lg\alpha}$. The maximum depth corresponds to always taking the larger subproblem; the same argument with $1 - \alpha$ in place of $\alpha$ gives $k \approx -\frac{\lg n}{\lg(1 - \alpha)}$.
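A quick numeric check of both formulas, with illustrative values $\alpha = 1/4$ and $n = 10^6$ and ignoring round-off:

```python
import math

def depth(n, shrink):
    """Number of levels before a subproblem of size n, shrunk by `shrink` each level, reaches size 1."""
    d = 0
    while n > 1:
        n *= shrink
        d += 1
    return d

alpha, n = 0.25, 10**6                          # illustrative values
print("min depth:", depth(n, alpha),     " formula:", -math.log2(n) / math.log2(alpha))
print("max depth:", depth(n, 1 - alpha), " formula:", -math.log2(n) / math.log2(1 - alpha))
```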
7.2-6 $\star$
Argue that for any constant $0 < \alpha \le 1 / 2$, the probability is approximately $1 - 2\alpha$ that on a random input array, $\text{PARTITION}$ produces a split more balanced than $1 - \alpha$ to $\alpha$.
To produce a split no more balanced than $1 - \alpha$ to $\alpha$, $\text{PARTITION}$ must pick a pivot that lies either among the smallest $\alpha n$ elements or among the largest $\alpha n$ elements. Each of these events has probability approximately $\alpha n / n = \alpha$, and since they are disjoint, the probability that one or the other occurs is $2\alpha$. Thus, the probability of a more balanced split is the complement, $1 - 2\alpha$.
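Since the pivot's rank is uniformly distributed on a random input array with distinct elements, the claim can also be checked by a short Monte Carlo sketch (the helper name and the values of $n$ and $\alpha$ are illustrative).

```python
import random

def balanced_fraction(n, alpha, trials=100_000):
    """Estimate the probability that a uniformly random pivot rank yields a split
    whose smaller side has at least alpha * n elements."""
    hits = 0
    for _ in range(trials):
        k = random.randint(1, n)          # pivot rank: uniform on a random input array
        left, right = k - 1, n - k        # sizes of the two sides PARTITION produces
        if min(left, right) >= alpha * n:
            hits += 1
    return hits / trials

n, alpha = 1000, 0.25                     # illustrative values
print(balanced_fraction(n, alpha), "vs", 1 - 2 * alpha)   # both approximately 0.5
```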