Questions for Form C

If one thinks of the creation of an inverted file as a matrix operation, which matrix operation is it most similar to?
a) finding eigenvalues.
b) inversion.
c) multiplication.
d) transpose.
e) none of the above.

Given that only root or stem forms of English words are indexed, and nothing else (no identifiers, numbers, names, etc.), how is the size of the dictionary likely to increase as more documents (say, news stories) are added? How rapidly does it increase initially? How rapidly after 100,000 documents have been added? Would there be an upper bound, and if so, what might it be?

Please match the Boolean operators (EITHER-BUT-NOT-BOTH, AND, OR, AND-NOT) with the corresponding set operations (INTERSECTION, UNION, SET-DIFFERENCE, XOR). [Give a list of the matching pairs.]

An implementation of the Boolean operators using bit vectors will usually assume that each bit vector represents a term or concept. What does each position in a bit vector represent?

For a P-norm query, explain the effect on an AND operator of a P-value of 2, as opposed to infinity (e.g., does it make the AND stricter, retrieve more documents, etc.).


fox@cs.vt.edu
Tue Aug 30 04:41:34 EDT 1994