Edward A. Fox
Department of Computer Science,
Virginia Tech, Blacksburg, VA 24061-0106
Clustering algorithms vary widely in their space and time requirements, their stability, the tightness of the resulting clusters, whether they produce cluster hierarchies, and their utility for information retrieval applications. Indeed, some collections are not really amenable to clustering, especially when similar documents are not relevant to the same queries, or when terms with similar document occurrence characteristics are not really searchonyms.
This Unit covers these issues, drawing on one textbook chapter, two lectures, and an exercise.