7-5 Median-of-3 partition

One way to improve the $\text{RANDOMIZED-QUICKSORT}$ procedure is to partition around a pivot that is chosen more carefully than by picking a random element from the subarray. One common approach is the median-of-3 method: choose the pivot as the median (middle element) of a set of 3 elements randomly selected from the subarray. (See exercise 7.4-6.) For this problem, let us assume that the elements of the input array $A[1..n]$ are distinct and that $n \ge 3$. We denote the sorted output array by $A'[1..n]$. Using the median-of-3 method to choose the pivot element $x$, define $p_i = \Pr\{x = A'[i]\}$.

a. Give an exact formula for $p_i$ as a function of $n$ and $i$ for $i = 2, 3, \ldots, n - 1$. (Note that $p_1 = p_n = 0$.)

b. By what amount have we increased the likelihood of choosing the pivot as $x = A'[\lfloor (n + 1) / 2 \rfloor]$, the median of $A[1..n]$, compared with the ordinary implementation? Assume that $n \to \infty$, and give the limiting ratio of these probabilities.

c. If we define a "good" split to mean choosing the pivot as $x = A'[i]$, where $n / 3 \le i \le 2n / 3$, by what amount have we increased the likelihood of getting a good split compared with the ordinary implementation? ($\textit{Hint:}$ Approximate the sum by an integral.)

d. Argue that in the $\Omega(n\lg n)$ running time of quicksort, the median-of-3 method affects only the constant factor.

a. $p_i$ is the probability that a randomly selected subset of size three has the $A'[i]$ as it's middle element. There are 6 possible orderings of the three elements selected. So, suppose that $S'$ is the set of three elements selected.

We will compute the probability that the second element of $S'$ is $A'[i]$ among all possible $3$-sets we can pick, since there are exactly six ordered $3$-sets corresponding to each $3$-set, these probabilities will be equal. We will compute the probability that $S'[2] = A[i]$. For any such $S'$, we would need to select the first element from $[i - 1]$ and the third from ${i + 1, \ldots , n}$. So, there are $(i - 1)(n - i)$ such $3$-sets. The total number of $3$-sets is $\binom{n}{3} = \frac{n(n - 1)(n - 2)}{6}$. So,

$$p_i = \frac{6(n - i)(i - 1)}{n(n - 1)(n - 2)}.$$

b. If we let $i = \lfloor \frac{n + 1}{2} \rfloor$, the previous result gets us an increase of

$$\frac{6(\lfloor\frac{n - 1}{2}\rfloor)(n - \lfloor\frac{n + 1}{2}\rfloor)}{n(n - 1)(n - 2)} - \frac{1}{n}$$

in the limit $n$ going to infinity, we get

$$\lim_{n \to \infty} \frac{\frac{6(\lfloor \frac{n - 1}{2} \rfloor)(n - \lfloor \frac{n + 1}{2} \rfloor)}{n(n - 1)(n - 2)}}{\frac{1}{n}} = \frac{3}{2}.$$

c. To save the messiness, suppose $n$ is a multiple of $3$. We will approximate the sum as an integral, so,

$$ \begin{aligned} \sum_{i = n / 3}^{2n / 3} & \approx \int_{n / 3}^{2n / 3} \frac{6(-x^2 + nx + x - n)}{n(n - 1)(n - 2)}dx \\ & = \frac{6(-7n^3 / 81 + 3n^3 / 18 + 3n^2 / 18 - n^2 / 3)}{n(n - 1)(n - 2)}, \end{aligned} $$

which, in the limit $n$ goes to infinity, is $\frac{13}{27}$ which is a constant that $>\frac{1}{3}$ as it was in the original randomized quicksort implementation.

d. Even though we always choose the middle element as the pivot (which is the best case), the height of the recursion tree will be $\Theta(\lg n)$. Therefore, the running time is still $\Omega(n\lg n)$.