Study Questions

You should be able to answer each of the following questions.

  1. Why is this type of data structure called an "inverted file"?
  2. Given the binary document/term matrix below, what would the document file look like, and what would the inverted file look like? You may use simple lists, properly labeled, one per line. Refer to text Figure 3.9 or other examples. The matrix has a row for each document and a column for each term.

    Doc.   Terms
    No.    1  2  3  4  5  6  7
    1      1  0  0  0  1  1  1
    2      1  0  0  0  0  0  0
    3      1  0  0  0  0  1  0
    4      0  1  1  0  0  1  0

  3. The FAST-INV algorithm essentially carries out what common matrix operation? What are its time and space complexities?
  4. What would be the space requirements for bit vector and hash table representations of a collection with 1 million documents and 200,000 terms?
  5. What would be the special advantages or disadvantages of MMM vs. Paice vs. P-norm, if the queries involved were all very, very long?
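
To check a hand-worked answer to question 2, the following Python sketch builds both files from the matrix given there. It is only an illustration under stated assumptions: the function names document_file and inverted_file are hypothetical, the whole matrix is assumed to fit in memory, and this is not the FAST-INV algorithm from question 3, just a direct row-by-row scan.

```python
# Rows are documents 1..4, columns are terms 1..7 (the matrix from question 2).
matrix = [
    [1, 0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
]


def document_file(matrix):
    """Map each document number to the list of term numbers it contains."""
    return {
        doc_no: [term_no for term_no, bit in enumerate(row, start=1) if bit]
        for doc_no, row in enumerate(matrix, start=1)
    }


def inverted_file(matrix):
    """Map each term number to the list of document numbers containing it."""
    n_terms = len(matrix[0])
    # Start every term with an empty list so unused terms still appear.
    postings = {term_no: [] for term_no in range(1, n_terms + 1)}
    for doc_no, row in enumerate(matrix, start=1):
        for term_no, bit in enumerate(row, start=1):
            if bit:
                postings[term_no].append(doc_no)
    return postings


if __name__ == "__main__":
    for doc_no, terms in document_file(matrix).items():
        print(f"doc {doc_no}: terms {terms}")
    for term_no, docs in inverted_file(matrix).items():
        print(f"term {term_no}: docs {docs}")
```

Both functions number documents and terms from 1, matching the labels in the matrix, so the printed lists can be compared line by line with a written answer.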


fox@cs.vt.edu
Tue Sep 6 04:51:11 EDT 1994