Problem
Consider two processor pipelines. One is in-order and 4-way superscalar (call it P1). The other is scalar but out-of-order (call it P2). It may help that we have now covered caches, so we can talk about code with long load latencies (i.e., code whose loads often miss in the cache).
a) Describe code that would run faster on P1 than on P2. (You can write code, but a good, precise description will suffice.)
b) Describe code that would run faster on P2 than on P1.
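
As an illustration of the kind of code the two parts have in mind, here is a minimal sketch in C. It is one possible pair of examples, not the only valid answer. The function names sum_ilp and sum_gather are hypothetical, and the reasoning in the comments rests on assumptions the problem does not state: that P1 stalls at the first instruction that uses a missing load's result, that P2 has a non-blocking cache and a large enough instruction window to reach later loads, and that the compiler emits roughly the instruction stream the source suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* (a) Likely faster on P1 (in-order, 4-wide).
 * Four independent accumulators over an array assumed small enough to hit
 * in the cache. Adjacent instructions do not depend on each other, so P1
 * can issue several per cycle, while scalar P2 issues at most one per cycle
 * no matter how it reorders them.
 * Assumption: n is a multiple of 4. */
uint64_t sum_ilp(const uint32_t *a, size_t n)
{
    assert(n % 4 == 0);
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (size_t i = 0; i < n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    return s0 + s1 + s2 + s3;
}

/* (b) Likely faster on P2 (scalar, out-of-order).
 * A gather through an index array. If table is much larger than the cache
 * and idx is scattered, most loads of table[idx[i]] miss. The add right
 * after each load uses its result, so an in-order pipeline stalls there for
 * the full miss latency and its issue width goes unused. An out-of-order
 * pipeline can keep going, start the loads for later iterations, and
 * overlap several misses. */
uint64_t sum_gather(const uint32_t *table, const uint32_t *idx, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t v = table[idx[i]]; /* long-latency load when it misses */
        s += v;                     /* first use of the loaded value */
    }
    return s;
}
```

The contrast is the point: in sum_ilp the limiting factor is issue width, which only P1 has, and in sum_gather the limiting factor is miss latency, which only P2 can hide by overlapping independent loads.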