26-3 Algorithmic consulting
Professor Gore wants to open up an algorithmic consulting company. He has identified n important subareas of algorithms (roughly corresponding to different portions of this textbook), which he represents by the set $A = \{A_1, A_2, \ldots, A_n\}$. In each subarea $A_k$, he can hire an expert in that area for $c_k$ dollars. The consulting company has lined up a set $J = \{J_1, J_2, \ldots, J_m\}$ of potential jobs. In order to perform job $J_i$, the company needs to have hired experts in a subset $R_i \subseteq A$ of subareas. Each expert can work on multiple jobs simultaneously. If the company chooses to accept job $J_i$, it must have hired experts in all subareas in $R_i$, and it will take in revenue of $p_i$ dollars.
Professor Gore's job is to determine which subareas to hire experts in and which jobs to accept in order to maximize the net revenue, which is the total income from jobs accepted minus the total cost of employing the experts.
Consider the following flow network $G$. It contains a source vertex $s$, vertices $A_1, A_2, \ldots, A_n$, vertices $J_1, J_2, \ldots, J_m$, and a sink vertex $t$. For $k = 1, 2, \ldots, n$, the flow network contains an edge $(s, A_k)$ with capacity $c(s, A_k) = c_k$, and for $i = 1, 2, \ldots, m$, the flow network contains an edge $(J_i, t)$ with capacity $c(J_i, t) = p_i$. For $k = 1, 2, \ldots, n$ and $i = 1, 2, \ldots, m$, if $A_k \in R_i$, then $G$ contains an edge $(A_k, J_i)$ with capacity $c(A_k, J_i) = \infty$.
a. Show that if $J_i \in T$ for a finite-capacity cut $(S, T)$ of $G$, then $A_k \in T$ for each $A_k \in R_i$.
b. Show how to determine the maximum net revenue from the capacity of a minimum cut of $G$ and the given $p_i$ values.
c. Give an efficient algorithm to determine which jobs to accept and which experts to hire. Analyze the running time of your algorithm in terms of $m$, $n$, and $r = \sum_{i = 1}^m |R_i|$.
a. Suppose to a contradiction that there were some $J_i \in T$, and some $A_k \in R_i$ so that $A_k \notin T$. However, by the definition of the flow network, there is an edge of infinite capacity going from $A_k$ to $J_i$ because $A_k \in R_i$. This means that there is an edge of infinite capacity that is going across the given cut. This means that the capacity of the cut is infinite, a contradiction to the given fact that the cut was finite capacity.
b. Though tempting, it doesn't suffice to just look at the experts that are on the $s$ side of the cut. To see why this doesn't work, imagine there's one specialized skill areal, such as "Computer power switch operator", that is required for every job. Then, any finite cut that would include any job getting done would requiring that this expert be hired. However, since there is an infinite capacity edge coming from him to every other job, then all of the experts need for all the other jobs would also need to be hired. So, if we have this obiquitously required employee, any minimum cut would have to be all or nothing, but it is trivial to find a counterexample to this being optimal.
In order for this problem to be solvable, one must assume that for every expert you've hired, you do all of the jobs that he is required for. If this is the case, then let $S_k \subseteq [n]$ be the indices of the experts that lie on the source side of the cut, and let $S_i \subseteq [m]$ be the indices of jobs that lie on the source side of the cut, then the net revenue is just
$$\sum_{S_i} p_i − \sum_{S_k} c_k$$
To see this is minimum, transferring over some set of experts and tasks from the sink side to the source side causes the capacity to go down by the cost of those experts and go up by the revenue of those jobs. If the cut was minimal than this must be a positive change, so the revenue isn't enough to justify the hire, meaning that those jobs that were on the source side in the minimal cut are exactly the jobs to attempt.
c. Again, to get a solution, we must make the assumption that for every expert that is hired, all jobs that that expert is required for must be completed. Basically just run either the $O(V^3)$ relabel-to-front algorithm described in section 26.5 on the flow network, and hire the experts that are on the source side of the cut. By the previous part, we know that this gets us the best outcome. The number of edges in the flow network is $m + n + r$, and the number of vertices is $2 + m + n$, so the runtime is just $O((2 + m + n)^3)$, so it's cubic in $\max(m, n)$. There is no dependence on $R$ using this algorithm, but this is reasonable since we have the inherent bound that $r < mn$, which is a lower order term. Without this unstated assumption, I suspect that there isn't an efficient solution possible, but cannot think of what NP-complete problem you would use for the reduction.