Unit RR, Part B1: Basic Vector Space Model
Vector Space
- Model of feature space as a t-dimensional vector space
- Each dimension represents a term or concept (found in documents)
- Each document and query is represented as a point in the space
- Groups of documents are represented by their centroid point
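The representation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how term weights are chosen, so raw term counts are assumed here, and the helper names (build_vocabulary, to_vector, centroid) and the toy documents are hypothetical.

```python
from collections import Counter

def build_vocabulary(docs):
    """Assign each distinct term one dimension (sorted so the order is stable)."""
    terms = sorted({term for doc in docs for term in doc.split()})
    return {term: i for i, term in enumerate(terms)}

def to_vector(text, vocab):
    """Map a document or query to a point in the t-dimensional term space.

    Assumption: each component is a raw term count; the slides do not
    specify a weighting scheme.
    """
    counts = Counter(text.split())
    vec = [0.0] * len(vocab)
    for term, n in counts.items():
        if term in vocab:  # terms outside the vocabulary have no dimension
            vec[vocab[term]] = float(n)
    return vec

def centroid(vectors):
    """Represent a group of documents by the component-wise mean of their vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Toy data, invented for illustration only.
docs = ["cat sat mat", "cat cat dog", "dog sat"]
vocab = build_vocabulary(docs)          # t = 4 dimensions: cat, dog, mat, sat
doc_vectors = [to_vector(d, vocab) for d in docs]
query_vector = to_vector("cat dog", vocab)
group_centroid = centroid(doc_vectors)
```

Documents and queries go through the same to_vector mapping, so they are points in the same space and can be compared directly.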
Vector Similarity
- Similarity between two points is measured by the cosine of the angle
between their two vectors.
- It is computed as the inner product of the two vectors,
normalized (divided) by the product of the two vector lengths
(each length is the square root of the sum of the squared components).
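A minimal sketch of that computation, using plain Python lists as vectors. The function name cosine_similarity and the example vectors are illustrative, and returning 0.0 for a zero-length vector is an assumption, since the cosine is undefined in that case and the slides do not cover it.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (same dimensionality)."""
    # Inner product of the two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    # Vector lengths: square root of the sum of squared components.
    length_a = math.sqrt(sum(x * x for x in a))
    length_b = math.sqrt(sum(y * y for y in b))
    if length_a == 0.0 or length_b == 0.0:
        # Assumption: treat an all-zero vector as having no similarity.
        return 0.0
    # Normalize the inner product by the product of the lengths.
    return dot / (length_a * length_b)

# Illustrative vectors in a 4-term space.
doc = [2.0, 1.0, 0.0, 0.0]
query = [1.0, 1.0, 0.0, 0.0]
score = cosine_similarity(doc, query)   # 3 / (sqrt(5) * sqrt(2))
```

Because of the normalization, the score depends only on the direction of the vectors, not on their lengths, so a long document and a short one with the same term proportions score identically against a query.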