11-4 Hashing and authentication

Let $\mathcal H$ be a class of hash functions in which each hash function $h \in \mathcal H$ maps the universe $U$ of keys to $\{0, 1, \ldots, m - 1 \}$. We say that $\mathcal H$ is k-universal if, for every fixed sequence of $k$ distinct keys $\langle x^{(1)}, x^{(2)}, \ldots, x^{(k)} \rangle$ and for any $h$ chosen at random from $\mathcal H$, the sequence $\langle h(x^{(1)}), h(x^{(2)}), \ldots, h(x^{(k)}) \rangle$ is equally likely to be any of the $m^k$ sequences of length $k$ with elements drawn from $\{0, 1, \ldots, m - 1 \}$.

a. Show that if the family $\mathcal H$ of hash functions is $2$-universal, then it is universal.

b. Suppose that the universe $U$ is the set of $n$-tuples of values drawn from $\mathbb Z_p = \{0, 1, \ldots, p - 1\}$, where $p$ is prime. Consider an element $x = \langle x_0, x_1, \ldots, x_{n - 1} \rangle \in U$. For any $n$-tuple $a = \langle a_0, a_1, \ldots, a_{n - 1} \rangle \in U$, define the hash function $h_a$ by

$$h_a(x) = \Bigg(\sum_{j = 0}^{n - 1} a_j x_j \Bigg) \mod p.$$

Let $\mathcal H = \{h_a\}$. Show that $\mathcal H$ is universal, but not $2$-universal. ($\textit{Hint:}$ Find a key for which all hash functions in $\mathcal H$ produce the same value.)

c. Suppose that we modify $\mathcal H$ slightly from part (b): for any $a \in U$ and for any $b \in \mathbb Z_p$, define

$$h'_{ab}(x) = \Bigg(\sum_{j = 0}^{n - 1} a_j x_j + b \Bigg) \mod p$$

and $\mathcal h' = \{h'_{ab}\}$. Argue that $\mathcal H'$ is $2$-universal. ($\textit{Hint:}$ Consider fixed $n$-tuples $x \in U$ and $y \in U$, with $x_i \ne y_i$ for some $i$. What happens to $h'_{ab}(x)$ and $h'_{ab}(y)$ as $a_i$ and $b$ range over $\mathbb Z_p$?)

d. Suppose that Alice and Bob secretly agree on a hash function $h$ form $2$-universal family $\mathcal H$ of hash functions. Each $h \in \mathcal H$ maps from a universe of keys $u$ to $\mathbb Z_p$, where $p$ is aprime. Later, Alice sends a message $m$ to Bob over the Internet, where $m \in U$. She authenticates this message to Bob by also sending an authentication tag $t = h(m)$, and Bob checks that the pair $(m, t)$ he receives indeed satisfies $t = h(m)$. Suppose that an adversary intercepts $(m, t)$ en route and tries to fool Bob by replacing the pair $(m, t)$ with a different pair $(m', t')$. Argue that the probability that the adversary succeeds in fooling Bob into accepting $(m', t')$ is at most $1 / p$, no matter how much computing power the adversary has, and even if the adversary knows the family $\mathcal H$ of hash functions used.

a. The number of hash functions for which $h(k) = h(l)$ is $\frac{m}{m^2}|\mathcal H| = \frac{1}{m}|\mathcal H|$, therefore the family is universal.

b. For $x = \langle 0, 0, \ldots, 0 \rangle$, $\mathcal H$ could not be $2$-universal.

c. Let $x, y \in U$ be fixed, distinct $n$-tuples. As $a_i$ and $b$ range over $\mathbb Z_p, h'_{ab}(x)$ is equally likely to achieve every value from $1$ to $p$ since for any sequence $a$, we can let $b$ vary from $1$ to $p - 1$.

Thus, $\langle h'_{ab}(x), h'_{ab}(y) \rangle$ is equally likely to be any of the $p^2$ sequences, so $\mathcal H$ is $2$-universal.

d. Since $\mathcal H$ is $2$-universal, every pair of $\langle t, t' \rangle$ is equally likely to appear, thus $t'$ could be any value from $\mathbb Z_p$. Even the adversary knows $\mathcal H$, since $\mathcal H$ is $2$-universal, then $\mathcal H$ is universal, the probability of choosing a hash function that $h(k) = h(l)$ is at most $1 / p$, therefore the probability is at most $1 / p$.