You should be able to answer each of the following
questions.
- Why is that type of data structure called an
inverted file?
- Given the binary document/query matrix below,
what would the document file look like, and what would
the inverted file look like? You may use simple lists,
properly labeled, one per line. Refer to text Figure 3.9
or other examples. The matrix has a row for each
document, and a column for each term.
  Doc.         Terms
  No.   1  2  3  4  5  6  7
   1    1  0  0  0  1  1  1
   2    1  0  0  0  0  0  0
   3    1  0  0  0  0  1  0
   4    0  1  1  0  0  1  0
- The FAST-INV algorithm essentially carries out
what common matrix operation? What are its time and
space complexities?
- What would be the space requirements for
bit vector and hash table representations of a collection
with 1 million documents and 200,000 terms?
- What would be the special advantages or
disadvantages of MMM vs. Paice vs. P-norm, if the
queries involved were all very, very long?
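
One way to check your work on the matrix question and on the bit
vector part of the space question is a short script. The sketch
below is only an illustration, not the method from the text: the
names MATRIX, document_file, inverted_file and bit_matrix_bytes are
made up here, and the size calculation assumes the simplest layout,
one bit per (document, term) pair with no compression or overhead.

```python
# Rows are documents 1-4, columns are terms 1-7, copied from the matrix above.
MATRIX = [
    [1, 0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
]


def document_file(matrix):
    """Map each document number to the term numbers it contains."""
    return {
        doc_no: [term_no for term_no, bit in enumerate(row, start=1) if bit]
        for doc_no, row in enumerate(matrix, start=1)
    }


def inverted_file(matrix):
    """Map each term number to the document numbers that contain it."""
    n_terms = len(matrix[0])
    postings = {term_no: [] for term_no in range(1, n_terms + 1)}
    for doc_no, row in enumerate(matrix, start=1):
        for term_no, bit in enumerate(row, start=1):
            if bit:
                postings[term_no].append(doc_no)
    return postings


def bit_matrix_bytes(n_docs, n_terms):
    """Bytes for an uncompressed bit matrix (assumed: 1 bit per cell)."""
    return (n_docs * n_terms + 7) // 8


if __name__ == "__main__":
    for doc_no, terms in document_file(MATRIX).items():
        print(f"doc {doc_no}: terms {terms}")
    for term_no, docs in inverted_file(MATRIX).items():
        print(f"term {term_no}: docs {docs}")
    print(bit_matrix_bytes(1_000_000, 200_000), "bytes")
```

Note that a term occurring in no document (term 4 here) still gets
an entry with an empty list; whether to keep such entries is a
design choice. A hash table estimate is left out on purpose, since
it depends on assumptions about postings per term and pointer sizes
that the question leaves to you.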