Work Span Analysis Notes|downey.io

Some loosely structured notes on the IHPC “Work and Span” lectures to help me prepare for the midterm. Explaining things in my own words helps me focus on the material and gives me a searchable artifact to find later. The contents of this one probably won’t be that useful for others though and there’s a lot of messily written non-LaTeX equations. I recommend checking out the official notes for this section.

Work Span Analysis

Formalism for analyzing parallel algorithms

Work W(n)

Total number of vertices (work) in a DAG (directed acyclic graph)

Span D(n) (D represents depth)

Total number of vertices in the longest path through a DAG

The average available parallelism for an algorithm can be represented by W(n)/D(n).

Span Law

Tp(n) >= D(n)

The time to execute given p processors is greater than or equal to the span of the algorithm.

Work Law

Tp(n) >= ceil(W(n)/p)

The time to execute given p processors is greater than or equal to the work of the algorithm divided by the number of processors. Ceiling cause we’re dealing with integers.

Work-Span Law

Tp(n) >= max(D(n), ceil(W(n)/p))

Both laws hold so can be combined. Simply take the max of the two! :)

Brent’s Theorem

The Work-Span Law gives us a lower bound for the time to execute a parallel algorithm. Brent’s Theorem helps us find the upper bound! 👋 😳 👍 The paper describing this theorem can be found here.

Brent’s Theorem

Tp(n) <= (W(n) - D(n))/p + D(n)

The time to execute the DAG is no more than the time to execute the critical path (longest path – the span) plus the time to schedule the rest of the work across the p processors.

Putting it all together…

max(D(n), ceil(W(n)/p)) <= Tp(n) <= (W(n) - D(n))/p + D(n)

Speedup & Work-Optimality

How much faster the parallel algorithm is compared to the best sequential time. Linear speedup means Tp scales linearly with the number of available processors. Note: the * in W*(n) denotes the work of the best sequential algorithm.

Speedup

speedup = best sequential time / parallel time
Sp(n) = W(n) / Tp(n) >= p / (W(n)/W(n) + (p - 1)/(W*(n) / D(n)))

Work-Optimality

W(n) / W*(n) should hopefully equal 1

If you get a highly parallel algorithm by dramatically increasing the work relative to the best sequential algorithm, this is bad for speedup.

Master Theorem

I found this video helpful.