Show how MapReduce can be used to efficiently solve the following problem:
Given a collection of input documents, output all pairs of keywords that co-occur in at least 1000 of the documents.
Write pseudocode for map and reduce functions.
Full points will be awarded for the most efficient implementation.
Hint: is multi-phase MapReduce useful here?
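
One possible approach, sketched here as a small in-memory Python simulation rather than a definitive answer: a pair can co-occur in at least 1000 documents only if each of its keywords appears in at least 1000 documents. A first phase can therefore count per-keyword document frequency, and a second phase can emit pairs built only from the surviving keywords, which shrinks the number of intermediate pairs. The sketch assumes each document is already a list of keywords and that the set of frequent keywords is small enough to share with every phase-2 mapper. The names `run_mapreduce` and `frequent_pairs` are hypothetical and do not belong to any real framework.

```python
from collections import defaultdict
from itertools import combinations


def run_mapreduce(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key, values in groups.items():
        output.extend(reducer(key, values))
    return output


def frequent_pairs(documents, threshold=1000):
    """Return {(kw1, kw2): count} for pairs co-occurring in >= threshold documents."""

    # Phase 1: count the number of documents containing each keyword.
    def map_keywords(doc):
        for word in set(doc):  # set() so a keyword counts once per document
            yield word, 1

    def reduce_keywords(word, counts):
        if sum(counts) >= threshold:
            yield word

    # Assumption: the frequent keywords fit in memory on every mapper.
    frequent = set(run_mapreduce(documents, map_keywords, reduce_keywords))

    # Phase 2: count pairs, but only among the frequent keywords.
    def map_pairs(doc):
        kept = sorted(set(doc) & frequent)  # sorted so each pair has one canonical order
        for pair in combinations(kept, 2):
            yield pair, 1

    def reduce_pairs(pair, counts):
        total = sum(counts)
        if total >= threshold:
            yield pair, total

    return dict(run_mapreduce(documents, map_pairs, reduce_pairs))


if __name__ == "__main__":
    docs = [["a", "b", "c"], ["a", "b"], ["a", "c", "a"], ["b", "d"]]
    print(frequent_pairs(docs, threshold=2))
```

In a real deployment a combiner that sums the 1s locally in both phases would likely cut shuffle traffic further. The threshold is lowered to 2 in the usage example only so that the toy input produces output.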