In large collections, it is natural to group similar items into clusters. The grouping can be based on citation links or, more commonly, on computed pairwise similarity between items. Users often browse within a cluster, and disk performance improves when items that are used together are stored together.
Clustering is the process of determining which items should be grouped together. The idea is easiest to grasp through illustrations.
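
As a minimal sketch of grouping by pairwise similarity, the Python below clusters a handful of items. Several choices here are assumptions made for illustration and are not taken from the text: each item is represented as a set of terms, similarity is measured with the Jaccard coefficient, and two items fall into the same cluster whenever a chain of sufficiently similar pairs connects them (single-link grouping). The names `jaccard`, `cluster`, `docs`, and the threshold value are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(items, threshold):
    """Group item indices so that any pair with similarity >= threshold
    ends up in the same cluster (single-link grouping via union-find)."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once and merge the clusters of similar items.
    for i, j in combinations(range(len(items)), 2):
        if jaccard(items[i], items[j]) >= threshold:
            parent[find(j)] = find(i)

    # Collect the members of each cluster by representative.
    groups = {}
    for i in range(len(items)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical items, each described by a set of terms.
docs = [
    {"disk", "storage", "cache"},
    {"disk", "storage", "latency"},
    {"citation", "link", "graph"},
    {"citation", "graph", "rank"},
]

print(cluster(docs, 0.4))
```

Comparing every pair takes time quadratic in the number of items, so this approach is only practical for small collections; it is meant to show the idea, not to scale.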