- If one thinks of the creation of an inverted file as a matrix
operation, which of the following matrix operations does it most resemble?
  - a) finding eigenvalues.
  - b) inversion.
  - c) multiplication.
  - d) transpose.
  - e) none of the above.
- Given that only root or stem forms of English words are
indexed, and nothing else (no identifiers, numbers, names,
etc.), how is the size of the dictionary likely to increase as
more documents (say, news stories) are added? How
rapidly does it increase initially? How rapidly after
100,000 documents have been added? Would there be an
upper bound, and if so, what might it be?
- Please match the Boolean operators
(EITHER-BUT-NOT-BOTH, AND, OR, AND-NOT) with the corresponding
set operations (INTERSECTION, UNION,
SET-DIFFERENCE, XOR).
[Give a list of the matching pairs.]
- An implementation of the Boolean operators using bit
vectors will usually assume that each bit vector represents
a term or concept. What does each position in a bit vector
represent?
- For a P-norm query, explain the effect on an AND operator
of a P-value of 2, as opposed to infinity (e.g., does it
make the AND stricter, retrieve more documents, etc.).
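
For the dictionary-size exercise above, one way to explore the question empirically is to track how many distinct terms have been seen after each document is added. The following is only a minimal sketch: the function name `vocabulary_growth` and the sample `docs` are hypothetical, and lowercased alphabetic tokens stand in for stems, whereas a real indexer would apply a stemmer.

```python
import re


def vocabulary_growth(documents):
    """Return the cumulative number of distinct terms after each document."""
    seen = set()
    sizes = []
    for text in documents:
        # Assumption: lowercased alphabetic tokens approximate stems.
        # Numbers, punctuation, and mixed identifiers are dropped.
        seen.update(re.findall(r"[a-z]+", text.lower()))
        sizes.append(len(seen))
    return sizes


# Tiny illustrative corpus (made up for this sketch).
docs = [
    "markets rose as traders bought shares",
    "traders sold shares as markets fell",
    "markets rose again",
]
print(vocabulary_growth(docs))
```

Running the same function over a large collection of news stories and plotting the returned sizes against the number of documents is one way to check an answer to the exercise.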
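
For the exercise on Boolean operators and set operations, it may help to see the four set operations applied to two postings sets. This sketch uses Python's built-in `set` type; the document ids are made up, and matching each result to a Boolean operator is left to the exercise.

```python
# Hypothetical postings: ids of the documents that contain each term.
postings_a = {1, 2, 3, 5}
postings_b = {2, 3, 4}

intersection = postings_a & postings_b    # ids present in both sets
union = postings_a | postings_b           # ids present in at least one set
set_difference = postings_a - postings_b  # ids in the first set but not the second
xor = postings_a ^ postings_b             # ids in exactly one of the two sets

for name, result in [
    ("INTERSECTION", intersection),
    ("UNION", union),
    ("SET-DIFFERENCE", set_difference),
    ("XOR", xor),
]:
    print(name, sorted(result))
```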
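
For the P-norm exercise, a small function makes it possible to compare P-values on the same document. This is a sketch under stated assumptions: it uses the usual extended Boolean formulation of the AND similarity with all query-term weights equal to 1, and the name `pnorm_and` is hypothetical. Each entry of `weights` is a document's weight in [0, 1] for one query term.

```python
import math


def pnorm_and(weights, p):
    """P-norm AND similarity of a document, assuming unit query-term weights.

    weights: the document's weight in [0, 1] for each query term.
    p: the P-value, 1 <= p <= infinity.
    """
    n = len(weights)
    if math.isinf(p):
        # Limit of the formula below as p grows without bound.
        return 1.0 - max(1.0 - d for d in weights)
    return 1.0 - (sum((1.0 - d) ** p for d in weights) / n) ** (1.0 / p)


# A document that matches the first query term fully and the second not at all.
partial_match = [1.0, 0.0]
print(pnorm_and(partial_match, 2))
print(pnorm_and(partial_match, math.inf))
```

Trying a few weight vectors with `p=2` and `p=math.inf` shows how the two settings treat documents that satisfy only some of the ANDed terms.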