A. Write down the entries in the permuterm index dictionary that are generated by the term mama.
B. If you wanted to search for s*ng in a permuterm wildcard index, what key(s) would one do the lookup on?
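To check answers to A and B, the permuterm construction can be sketched in a few lines. This is a minimal sketch, assuming the usual convention that `$` marks the end of a term and that a query contains exactly one `*`; the function names `permuterm_rotations` and `wildcard_lookup_key` are hypothetical helpers, not part of any library.

```python
def permuterm_rotations(term: str) -> list[str]:
    """Return every rotation of term + '$' ('$' is the assumed end-of-term marker)."""
    augmented = term + "$"
    return [augmented[i:] + augmented[:i] for i in range(len(augmented))]


def wildcard_lookup_key(query: str) -> str:
    """Rotate a single-wildcard query X*Y so the '*' would fall at the end.

    The returned string Y$X is the prefix to look up in the permuterm index.
    Assumption: the query contains exactly one '*'.
    """
    if query.count("*") != 1:
        raise ValueError("this sketch handles exactly one '*' per query")
    head, _, tail = query.partition("*")
    return tail + "$" + head
```

For example, `permuterm_rotations("hello")` yields the six rotations of `hello$`, and `wildcard_lookup_key("m*n")` returns the prefix `n$m`.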
C. If |S| denotes the length of string S, show that the edit distance between s1 and s2 is never more than max{|s1|, |s2|}.
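An empirical check can build intuition for C before writing the proof; it does not replace the proof. The sketch below assumes Levenshtein distance with unit-cost insertions, deletions, and substitutions; `edit_distance` is a hypothetical helper name.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Levenshtein distance, assuming unit cost for insert, delete, and substitute."""
    m, n = len(s1), len(s2)
    # prev[j] holds the distance between the first i-1 characters of s1
    # and the first j characters of s2.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            substitution = 0 if s1[i - 1] == s2[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,                 # delete s1[i-1]
                curr[j - 1] + 1,             # insert s2[j-1]
                prev[j - 1] + substitution,  # copy or substitute
            )
        prev = curr
    return prev[n]


# Spot-check the claimed bound on a few pairs (a sanity check, not a proof).
for a, b in [("kitten", "sitting"), ("", "abc"), ("cat", "dog"), ("a", "")]:
    assert edit_distance(a, b) <= max(len(a), len(b))
```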
D. Compute the Jaccard coefficients between the query bord and each of the terms in Figure 3.7 that contain the bigram or.
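The computation in D can be verified with a short sketch. It assumes that each term is reduced to its set of distinct character bigrams with no boundary markers, and that the Jaccard coefficient is |A ∩ B| / |A ∪ B|; the helper names `bigrams` and `jaccard` are hypothetical.

```python
def bigrams(term: str) -> set[str]:
    """Set of distinct character bigrams (assumption: no boundary symbols added)."""
    return {term[i:i + 2] for i in range(len(term) - 1)}


def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient between the bigram sets of two terms."""
    set_a, set_b = bigrams(a), bigrams(b)
    union = set_a | set_b
    if not union:
        return 0.0  # both terms too short to have any bigram
    return len(set_a & set_b) / len(union)
```

As a usage example on a term outside the exercise, `jaccard("bord", "boardroom")` shares the bigrams bo and rd out of 9 distinct bigrams in total, giving 2/9.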
E. For n = 2 and 1 ≤ T ≤ 30 (where T is the number of postings), perform a step-by-step simulation of the algorithm in Figure 4.7 on page 72 of the textbook. Create a table that shows, for each point in time at which T = 2 × k tokens have been processed (1 ≤ k ≤ 15), which of the four indexes I0, ..., I3 are in use. The first three lines of the table are given below.
T = 2 × k | I3 | I2 | I1 | I0
2 | 0 | 0 | 0 | 0
4 | 0 | 0 | 0 | 1
6 | 0 | 0 | 1 | 0
F. Compute variable byte codes for the numbers (both document numbers and gaps) in Table 5.3 of the textbook.
G. Compute variable byte and γ codes for the postings list 777, 17743, 294068, 31251336. Use gaps instead of doc IDs where possible. Write binary codes in 8-bit blocks.
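Hand-computed answers to F and G can be checked against a short sketch of both encodings. It assumes the common textbook conventions: variable byte codes carry 7 payload bits per byte with the high bit set only on the last byte of a number, and a γ code is the unary length of the offset followed by the offset (the binary representation without its leading 1). The helper names are hypothetical.

```python
def to_gaps(doc_ids: list[int]) -> list[int]:
    """Keep the first doc ID and replace each later one by its gap to the previous."""
    if not doc_ids:
        return []
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]


def vb_encode_number(n: int) -> list[int]:
    """Variable byte code of n as a list of byte values (assumed convention:
    high bit set on the final byte only)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    out = []
    while True:
        out.insert(0, n % 128)  # prepend the low 7 bits
        if n < 128:
            break
        n //= 128
    out[-1] += 128  # mark the last byte of this number
    return out


def gamma_encode(n: int) -> str:
    """Gamma code of n >= 1 as a bit string: unary(len(offset)) + offset."""
    if n < 1:
        raise ValueError("gamma codes are defined for n >= 1")
    offset = bin(n)[3:]  # strip '0b' and the leading 1 bit
    return "1" * len(offset) + "0" + offset


def as_bits(byte_values: list[int]) -> str:
    """Render byte values as space-separated 8-bit blocks."""
    return " ".join(f"{b:08b}" for b in byte_values)
```

For instance, `as_bits(vb_encode_number(824))` gives `00000110 10111000`, and `gamma_encode(13)` gives `1110101`. Applying `to_gaps` first and then encoding each gap matches the "use gaps where possible" instruction.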