- Lexical analysis techniques can be applied to tokenization and
stopword removal, both of which are carried out as part of the
automatic indexing process.
- There are many approaches to shrinking the vocabulary used for
indexing. Stemming can save a good deal of space, may slightly improve
retrieval effectiveness, and can be carried out with algorithms that
are fast and require little space.
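
The two points above can be illustrated with a minimal sketch, written here in Python. It assumes a tiny hypothetical stoplist and a naive suffix stripper. It is not the Porter algorithm or any other published stemmer, and the names `STOPWORDS`, `SUFFIXES`, `tokenize`, `strip_suffix`, and `index_terms` are invented for this illustration.

```python
import re

# Hypothetical, tiny stoplist for illustration; real stoplists are much longer.
STOPWORDS = frozenset({"a", "an", "and", "are", "as", "can", "of", "the", "to", "which"})

# Naive suffix list, checked longest first. This is NOT the Porter algorithm.
SUFFIXES = ("ing", "es", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def strip_suffix(token):
    """Remove the first matching suffix, keeping a stem of at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Tokenize, drop stopwords, then conflate the remaining tokens."""
    return [strip_suffix(t) for t in tokenize(text) if t not in STOPWORDS]


if __name__ == "__main__":
    terms = index_terms("The indexing of the indexed indexes")
    print(terms)            # three tokens survive the stoplist
    print(len(set(terms)))  # distinct index terms after conflation
```

In this example the three surviving tokens are conflated to a single index term, which is the kind of vocabulary reduction stemming provides. A crude stripper like this one over-stems and under-stems many words, so a production indexer would more plausibly use an established stemming algorithm.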