27-1 Implementing parallel loops using nested parallelism
Consider the following multithreaded algorithm for performing pairwise addition on $n$-element arrays $A[1..n]$ and $B[1..n]$, storing the sums in $C[1..n]$:
a. Rewrite the parallel loop in $\text{SUM-ARRAYS}$ using nested parallelism (spawn and sync) in the manner of $\text{MAT-VEC-MAIN-LOOP}$. Analyze the parallelism of your implementation.
Consider the following alternative implementation of the parallel loop, which contains a value $grain\text-size$ to be specified:
SUM-ARRAYS'(A, B, C) n = A.length grain-size = ? // to be determined r = ceil(n / grain-size) for k = 0 to r - 1 spawn ADD-SUBARRAY(A, B, C, k * grain-size + 1, min((k + 1) * grain-size, n)) sync
b. Suppose that we set $grain\text -size = 1$. What is the parallelism of this implementation?
c. Give a formula for the span of $\text{SUM-ARRAYS}'$ in terms of $n$ and $grain\text-size$. Derive the best value for grain-size to maximize parallelism.
a. See the algorithm $\text{SUM-ARRAYS}(A, B, C)$. The parallelism is $O(n)$ since it's work is $n\lg n$ and the span is $\lg n$.
b. If grainsize is $1$, this means that each call of $\text{ADD-SUBARRAY}$ just sums a single pair of numbers. This means that since the for loop on line 4 will run $n$ times, both the span and work will be $O(n)$. So, the parallelism is just $O(1)$.
SUM-ARRAYS(A, B, C)
n = floor(A.length / 2)
if n == 0
C[1] = A[1] + B[1]
else
spawn SUM-ARRAYS(A[1..n], B[1..n], C[1..n])
SUM-ARRAYS(A[n + 1..A.length], B[n + 1..A..length], C[n + 1..A.length])
c. Let $g$ be the grainsize. The runtime of the function that spawns all the other functions is $\left\lceil \frac{n}{g} \right\rceil$. The runtime of any particular spawned task is $g$. So, we want to minimize
$$\frac{n}{g} + g.$$
To do this we pull out our freshman calculus hat and take a derivative, we have
$$0 = 1 − \frac{n}{g^2}.$$
To solve this, we set $g = \sqrt n$. This minimizes the quantity and makes the span $O(n / g + g) = O(\sqrt n)$. Resulting in a parallelism of $O(\sqrt n)$.