28.2 Inverting matrices

28.2-1¶

Let $M(n)$ be the time to multiply two $n \times n$ matrices, and let $S(n)$ denote the time required to square an $n \times n$ matrix. Show that multiplying and squaring matrices have essentially the same difficulty: an $M(n)$-time matrix-multiplication algorithm implies an $O(M(n))$-time squaring algorithm, and an $S(n)$-time squaring algorithm implies an $O(S(n))$-time matrix-multiplication algorithm.

Showing that being able to multiply matrices in time $M(n)$ implies being able to square matrices in time $M(n)$ is trivial because squaring a matrix is just multiplying it by itself.

The more tricky direction is showing that being able to square matrices in time $S(n)$ implies being able to multiply matrices in time $O(S(n))$.

As we do this, we apply the same regularity condition that $S(2n) \in O(S(n))$. Suppose that we are trying to multiply the matrices, $A$ and $B$, that is, find $AB$. Then, define the matrix

$$ C = \begin{pmatrix} I & A \\ 0 & B \end{pmatrix} $$

Then, we can find $C^2$ in time $S(2n) \in O(S(n))$. Since

$$ C^2 = \begin{pmatrix} I & A + AB \\ 0 & B \end{pmatrix} $$

Then we can just take the upper right quarter of $C^2$ and subtract $A$ from it to obtain the desired result. Apart from the squaring, we've only done work that is $O(n^2)$. Since $S(n)$ is $\Omega(n^2)$ anyways, we have that the total amount of work we've done is $O(n^2)$.

28.2-2¶

Let $M(n)$ be the time to multiply two $n \times n$ matrices, and let $L(n)$ be the time to compute the LUP decomposition of an $n \times n$ matrix. Show that multiplying matrices and computing LUP decompositions of matrices have essentially the same difficulty: an $M(n)$-time matrix-multiplication algorithm implies an $O(M(n))$-time LUP-decomposition algorithm, and an $L(n)$-time LUP-decomposition algorithm implies an $O(L(n))$-time matrix-multiplication algorithm.

Let $A$ be an $n \times n$ matrix. Without loss of generality we'll assume $n = 2^k$, and impose the regularity condition that $L(n / 2) \le cL(n)$ where $c < 1 / 2$ and $L(n)$ is the time it takes to find an LUP decomposition of an $n \times n$ matrix. First, decompose $A$ as

$$ A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} $$

where $A_1$ is $n / 2$ by $n$. Let $A_1 = L_1U_1P_1$ be an LUP decomposition of $A_1$, where $L_1$ is $ / 2$ by $n / 2$, $U_1$ is $n / 2$ by $n$, and $P_1$ is $n$ by $n$. Perform a block decomposition of $U_1$ and $A_2P_1^{-1}$ as $U_1 = [\overline U_1|B]$ and $A_2P_1^{-1} = [C|D]$ where $\overline U_1$ and $C$ are $n / 2$ by $n / 2$ matrices. Since we assume that $A$ is nonsingular, $\overline U_1$ must also be nonsingular.

Set $F = D - C\overline U_1^{-1}B$. Then we have

$$ A = \begin{pmatrix} L_1 & 0 \\ C\overline U_1^{-1} & I_{n / 2} \end{pmatrix} \begin{pmatrix} \overline U_1 & B \\ 0 & F \end{pmatrix} P_1. $$

Now let $F = L_2U_2P_2$ be an LUP decomposition of $F$, and let $\overline P = \begin{pmatrix} I_{n / 2} & 0 \\ 0 & P_2 \end{pmatrix}$. Then we may write

$$ A = \begin{pmatrix} L_1 & 0 \\ C\overline U_1^{-1} & L_2 \end{pmatrix} \begin{pmatrix} \overline U_1 & BP_2^{-1} \\ 0 & U_2 \end{pmatrix} \overline P P_1. $$

This is an LUP decomposition of $A$. To achieve it, we computed two LUP decompositions of half size, a constant number of matrix multiplications, and a constant number of matrix inversions. Since matrix inversion and multiplication are computationally equivalent, we conclude that the runtime is $O(M(n))$.

28.2-3¶

Let $M(n)$ be the time to multiply two $n \times n$ matrices, and let $D(n)$ denote the time required to find the determinant of an $n \times n$ matrix. Show that multiplying matrices and computing the determinant have essentially the same difficulty: an $M(n)$-time matrix-multiplication algorithm implies an $O(M(n))$-time determinant algorithm, and a $D(n)$-time determinant algorithm implies an $O(D(n))$-time matrix-multiplication algorithm.

(Omit!)

28.2-4¶

Let $M(n)$ be the time to multiply two $n \times n$ boolean matrices, and let $T(n)$ be the time to find the transitive closure of an $n \times n$ boolean matrix. (See Section 25.2.) Show that an $M(n)$-time boolean matrix-multiplication algorithm implies an $O(M(n)\lg n)$-time transitive-closure algorithm, and a $T(n)$-time transitive-closure algorithm implies an $O(T(n))$-time boolean matrix-multiplication algorithm.

Suppose we can multiply boolean matrices in $M(n)$ time, where we assume this means that if we're multiplying boolean matrices $A$ and $B$, then $(AB)_{ij} = (a_{i1} \wedge b_{1j}) \vee \dots \vee (a_{in} \wedge b_{nj})$. To find the transitive closure of a boolean matrix $A$ we just need to find the $n^{\text{th}}$ power of $A$. We can do this by computing $A^2$, then $(A^2)^2$, then $((A^2)^2)^2$ and so on. This requires only $\lg n$ multiplications, so the transitive closure can be computed in $O(M(n)\lg n)$.

For the other direction, first view $A$ and $B$ as adjacency matrices, and impose the regularity condition $T(3n) = O(T(n))$, where $T(n)$ is the time to compute the transitive closure of a graph on $n$ vertices. We will define a new graph whose transitive closure matrix contains the boolean product of $A$ and $B$. Start by placing $3n$ vertices down, labeling them $1, 2, \dots, n, 1', 2', \dots, n', 1'', 2'', \dots, n''$.

Connect vertex $i$ to vertex $j'$ if and only if $A_{ij} = 1$. Connect vertex $j'$ to vertex $k''$ if and only if $B_{jk} = 1$. In the resulting graph, the only way to get from the first set of $n$ vertices to the third set is to first take an edge which "looks like" an edge in $A$, then take an edge which "looks like" an edge in $B$. In particular, the transitive closure of this graph is:

$$ \begin{pmatrix} I & A & AB \\ 0 & I & B \\ 0 & 0 & I \end{pmatrix} . $$

Since the graph is only of size $3n$, computing its transitive closure can be done in $O(T(3n)) = O(T(n))$ by the regularity condition. Therefore multiplying matrices and finding transitive closure are are equally hard.

28.2-5¶

Does the matrix-inversion algorithm based on Theorem 28.2 work when matrix elements are drawn from the field of integers modulo $2$? Explain.

It does not work necessarily over the field of two elements. The problem comes in in applying theorem D.6 to conclude that $A^{\text T}A$ is positive definite. In the proof of that theorem they obtain that $||Ax||^2 \ge 0$ and only zero if every entry of $Ax$ is zero. This second part is not true over the field with two elements, all that would be required is that there is an even number of ones in $Ax$. This means that we can only say that $A^{\text T}A$ is positive semi-definite instead of the positive definiteness that the algorithm requires.

28.2-6 $\star$¶

Generalize the matrix-inversion algorithm of Theorem 28.2 to handle matrices of complex numbers, and prove that your generalization works correctly. ($\textit{Hint:}$ Instead of the transpose of $A$, use the conjugate transpose $A^*$, which you obtain from the transpose of $A$ by replacing every entry with its complex conjugate. Instead of symmetric matrices, consider Hermitian matrices, which are matrices $A$ such that $A = A^*$.)

We may again assume that our matrix is a power of $2$, this time with complex entries. For the moment we assume our matrix $A$ is Hermitian and positivedefinite. The proof goes through exactly as before, with matrix transposes replaced by conjugate transposes, and using the fact that Hermitian positivedefinite matrices are invertible. Finally, we need to justify that we can obtain the same asymptotic running time for matrix multiplication as for matrix inversion when $A$ is invertible, but not Hermitian positive-definite.

For any nonsingular matrix $A$, the matrix $A^*A$ is Hermitian and positive definite, since for any $x$ we have $x^*A^*Ax = \langle Ax, Ax \rangle > 0$ by the definition of inner product. To invert $A$, we first compute $(A^*A)^{-1} = A^{-1}(A^*)^{-1}$. Then we need only multiply this result on the right by $A^*$. Each of these steps takes $O(M(n))$ time, so we can invert any nonsingluar matrix with complex entries in $O(M(n))$ time.