16-2 Scheduling to minimize average completion time

Suppose you are given a set $S = \{a_1, a_2, \ldots, a_n\}$ of tasks, where task $a_i$ requires $p_i$ units of processing time to complete, once it has started. You have one computer on which to run these tasks, and the computer can run only one task at a time. Let $c_i$ be the completion time of task $a_i$ , that is, the time at which task $a_i$ completes processing. Your goal is to minimize the average completion time, that is, to minimize $(1 / n) \sum_{i = 1}^n c_i$. For example, suppose there are two tasks, $a_1$ and $a_2$, with $p_1 = 3$ and $p_2 = 5$, and consider the schedule in which $a_2$ runs first, followed by $a_1$. Then $c_2 = 5$, $c_1 = 8$, and the average completion time is $(5 + 8) / 2 = 6.5$. If task $a_1$ runs first, however, then $c_1 = 3$, $c_2 = 8$, and the average completion time is $(3 + 8) / 2 = 5.5$.

a. Give an algorithm that schedules the tasks so as to minimize the average completion time. Each task must run non-preemptively, that is, once task $a_i$ starts, it must run continuously for $p_i$ units of time. Prove that your algorithm minimizes the average completion time, and state the running time of your algorithm.

b. Suppose now that the tasks are not all available at once. That is, each task cannot start until its release time $r_i$. Suppose also that we allow preemption, so that a task can be suspended and restarted at a later time. For example, a task $a_i$ with processing time $p_i = 6$ and release time $r_i = 1$ might start running at time $1$ and be preempted at time $4$. It might then resume at time $10$ but be preempted at time $11$, and it might finally resume at time $13$ and complete at time $15$. Task $a_i$ has run for a total of $6$ time units, but its running time has been divided into three pieces. In this scenario, $a_i$'s completion time is $15$. Give an algorithm that schedules the tasks so as to minimize the average completion time in this new scenario. Prove that your algorithm minimizes the average completion time, and state the running time of your algorithm.

a. Order the tasks by processing time from smallest to largest and run them in that order (the shortest-job-first rule). To see that this greedy solution is optimal, first observe that the problem exhibits optimal substructure: once the first task of an optimal schedule is fixed, the remaining tasks must themselves be scheduled so as to minimize their average completion time. Now let $O$ be an optimal schedule, let $a$ be a task with the smallest processing time, and let $b$ be the first task run in $O$. Let $G$ be the schedule obtained from $O$ by exchanging $a$ and $b$. In $G$, the completion time of $a$ equals its processing time, which is at most the processing time of $b$ and hence at most $b$'s completion time in $O$; the completion time of $b$ in $G$ equals the completion time of $a$ in $O$; and every task run between $b$ and $a$ completes earlier by the (nonnegative) difference of the two processing times. All other completion times are unchanged, so the average completion time of $G$ is at most that of $O$. Combined with optimal substructure, induction on the number of tasks shows that the greedy schedule is optimal. The running time is $O(n\lg n)$, dominated by sorting the tasks.
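A minimal sketch of this rule in Python (function and variable names are my own, not from the text); it schedules the tasks in order of increasing processing time and reports the average completion time:

```python
def schedule_spt(processing_times):
    """Run tasks non-preemptively in order of increasing processing time.

    Returns the (task_index, completion_time) pairs in run order and the
    average completion time.  Task indices refer to the input order.
    """
    order = sorted(range(len(processing_times)), key=lambda i: processing_times[i])
    t = 0
    completions = []
    for i in order:
        t += processing_times[i]          # task i runs to completion
        completions.append((i, t))
    avg = sum(c for _, c in completions) / len(processing_times)
    return completions, avg

# The two-task example from the problem statement: p_1 = 3, p_2 = 5.
print(schedule_spt([3, 5]))   # ([(0, 3), (1, 8)], 5.5)
```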

b. Because preemption is allowed, we may view each task as a collection of unit-time pieces and think of the schedule as choosing which piece to run in each unit time slot (assuming integer release and processing times). Apply the same strategy as in part (a), except that a task which has not yet been released must be skipped over: at each decision point, run the available (released and unfinished) task with the least remaining processing time. Optimality follows from the same exchange argument as in part (a): if at some time an available task has strictly less remaining work than the task being run, then rescheduling the two tasks' remaining work within the same time slots, so that all of the shorter task's work comes first, does not increase either completion time and leaves every other task untouched. Preemptions need only occur when a new task is released or the running task completes, so there are $O(n)$ decision points; spending $O(n)$ time at each to find the available task with the least remaining processing time gives a running time of $O(n^2)$.
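A sketch of this strategy in Python, simulating unit time steps (the function name and the two-task example are my own, and integer release and processing times are assumed, as above):

```python
import heapq

def schedule_srpt(tasks):
    """Preemptive scheduling: tasks is a list of (release_time, processing_time)
    pairs.  At each unit time step, run one unit of the released, unfinished
    task with the least remaining processing time.

    Returns the completion times in input order and their average.
    """
    n = len(tasks)
    order = sorted(range(n), key=lambda i: tasks[i][0])  # indices by release time
    remaining = [p for _, p in tasks]
    completion = [0] * n
    ready = []            # min-heap of (remaining_time, task_index)
    next_release = 0      # next entry of 'order' not yet released
    t = 0
    done = 0
    while done < n:
        # Release every task whose release time has arrived.
        while next_release < n and tasks[order[next_release]][0] <= t:
            i = order[next_release]
            heapq.heappush(ready, (remaining[i], i))
            next_release += 1
        if not ready:
            t = tasks[order[next_release]][0]   # idle until the next release
            continue
        # Run one unit of the available task with least remaining time.
        _, i = heapq.heappop(ready)
        remaining[i] -= 1
        t += 1
        if remaining[i] == 0:
            completion[i] = t
            done += 1
        else:
            heapq.heappush(ready, (remaining[i], i))
    return completion, sum(completion) / n

# The task from the problem statement (p = 6, r = 1) plus a short task
# released at time 2, which preempts it.
print(schedule_srpt([(1, 6), (2, 2)]))   # ([9, 4], 6.5)
```

Here a heap keyed on remaining processing time stands in for the $O(n)$ scan described above; an event-driven version that only makes decisions at releases and completions would avoid stepping one time unit at a time.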