Summary of Key Concepts

  1. Lexical analysis techniques can be applied to tokenization and stopword removal, both of which are carried out as part of the automatic indexing process.

  2. There are many approaches to shrinking the vocabulary used for indexing. Stemming can save a good deal of index space, may slightly improve retrieval effectiveness, and can be carried out with algorithms that are fast and require little space.
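
The two points above can be sketched together as a small indexing pipeline. This is a minimal illustration, not the chapter's own implementation: the stoplist, the suffix list, and the function names (tokenize, remove_stopwords, strip_suffix, index_terms) are all assumptions made for the example, and the suffix stripper is a crude stand-in for a real stemming algorithm.

```python
import re

# Hypothetical, deliberately tiny stoplist; real stoplists hold hundreds of words.
STOPWORDS = {"a", "an", "and", "are", "as", "be", "can", "of", "the", "to", "which"}

# Illustrative suffix list, checked in order. This is NOT a published stemmer.
SUFFIXES = ("ing", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stoplist."""
    return [t for t in tokens if t not in STOPWORDS]


def strip_suffix(token):
    """Remove the first matching suffix, keeping a stem of at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Return the sorted set of conflated terms that would be indexed for the text."""
    return sorted({strip_suffix(t) for t in remove_stopwords(tokenize(text))})
```

Calling index_terms("Indexing the indexes and the index") yields a single term, which shows how conflating word variants shrinks the indexing vocabulary. A stripper this simple will also conflate unrelated words and miss many variants, so it should be read only as a picture of the idea.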


fox@cs.vt.edu
Thu Oct 27 02:57:58 EDT 1994