26-2 Minimum path cover

A path cover of a directed graph $G = (V, E)$ is a set $P$ of vertex-disjoint paths such that every vertex in $V$ is included in exactly one path in $P$. Paths may start and end anywhere, and they may be of any length, including $0$. A minimum path cover of $G$ is a path cover containing the fewest possible paths.

a. Give an efficient algorithm to find a minimum path cover of a directed acyclic graph $G = (V, E)$. ($\textit{Hint:}$ Assuming that $V = \{1, 2, \ldots, n\}$, construct the graph $G' = (V', E')$, where

$$ \begin{aligned} V' & = \{x_0, x_1, \ldots, x_n\} \cup \{y_0, y_1, \ldots, y_n\}, \\ E' & = \{(x_0, x_i): i \in V\} \cup \{(y_i, y_0): i \in V\} \cup \{(x_i, y_j): (i, j) \in E\}, \end{aligned} $$

and run a maximum-flow algorithm.)

b. Does your algorithm work for directed graphs that contain cycles? Explain.

a. Set up the graph $G'$ as defined in the problem, give each edge capacity $1$, and run a maximum-flow algorithm. I claim that if $(x_i, y_j)$ has flow $1$ in the maximum flow and we set $(i, j)$ to be an edge in our path cover, then the result is a minimum path cover. First observe that no vertex appears twice in the same path. If it did, then we would have $f(x_i, y_j) = f(x_k, y_j)$ for some $i \ne k \ne j$. However, this contradicts the conservation of flow, since the capacity leaving $y_j$ is only $1$. Moreover, since the capacity from $s$ to $x_i$ is $1$, we can never have two edges of the form $(x_i, y_j)$ and $(x_i, y_k)$ for $k \ne j$. We can ensure every vertex is included in some path by asserting that if there is no edge $(x_i, y_j)$ or $(x_j, y_i)$ for some $j$, then $j$ will be on a path by itself. Thus, we are guaranteed to obtain a path cover. If there are $k$ paths in a cover of $n$ vertices, then they will consist of $n − k$ edges in total. Given a path cover, we can recover a flow by assigning edge $(x_i, y_j)$ flow $1$ if and only if $(i, j)$ is an edge in one of the paths in the cover. Suppose that the maximum flow algorithm yields a cover with $k$ paths, and hence flow $n − k$, but a minimum path cover uses strictly fewer than $k$ paths. Then it must use strictly more than $n − k$ edges, so we can recover a flow which is larger than the one previously found, contradicting the fact that the previous flow was maximal. Thus, we find a minimum path cover. Since the maximum flow in the graph corresponds to finding a maximum matching in the bipartite graph obtained by considering the induced subgraph of $G'$ on $\{1, 2, \dots, n\}$, section 26.3 tells us that we can find a maximum flow in $O(V E)$.

b. This doesn't work for directed graphs which contain cycles. To see this, consider the graph on $\{1, 2, 3, 4\}$ which contains edges $(1, 2)$, $(2, 3)$, $(3, 1)$, and $(4, 3)$. The desired output would be a single path $4$, $3$, $1$, $2$ but flow which assigns edges $(x_1, y_2)$, $(x_2, y_3)$, and $(x_3, y_1)$ flow $1$ is maximal.