Questions for Form C

  1. Consider the search schemes listed below but limit your selections to those that support direct text searching (i.e., without an index). i) Which one will work the best for regular expression type searching involving sets of characters and don't cares? ii) Which would work best for very short strings (i.e., shorter than 3 characters)? a) inverted file. b) Patricia tree. c) Boyer-Moore. d) Karp-Rabin. e) Shift-Or.

  2. Assume an ideal pattern, of length m, and a target string of length n. Consider the choices below. i) What is the lower bound on the search time for Boyer-Moore? ii) If we assume a worst-case pattern and target combination, what is the upper bound for the naive search algorithm? a) O(n/m). b) O(n). c) O(n+m). d) O(nlogm). e) O(n*m).

  3. Why are skipped bits used in PAT trees? What complications do they cause?

  4. Explain how PAT carries out the signif type of query, in terms of operations on a PAT tree.

  5. What is a typical value for the number of internal nodes that will fit into a page when a PAT tree is represented? How many disk accesses are likely to be needed to find the first sistring that will satisfy a simple string matching query?


fox@cs.vt.edu
Tue Aug 30 04:43:17 EDT 1994