IR Course topics ... from a SYSTEMS perspective
Systems and Techniques (Core IR)
Each of the following sub-headings (Applications, Media, etc.) represents an
independent dimension of the landscape on which the core IR topics
can be painted. SOME of the sub-points from different dimensions can be
combined to yield specific instances of "information retrieval".
Applications
Information retrieval is more than just retrieval, and this group reflects the variety of commonplace IR tasks:
- indexing
- filtering / routing
- categorization
- retrieval
- abstracting/summarization
- clustering
- classification
Media
Many different media can be used for IR applications. Some of these media (text, for example) have received a lot of attention, but IR applications on non-text media are becoming core:
- text / documents
- - types (newspaper, bibliographic records, structured, long/short, etc.)
- - dependent (e.g., news threads, email messages, versions)
- - multi/monolingual
- image
- - direct/indirect attributes
- video
- - direct/indirect attributes
- audio
Representation Frameworks
This dimension raises issues of representation, both of the corpus to be searched
and of the queries, as well as issues of the matching operation:
- statistical
- VSM
- Probabilistic
- Bayesian
- Clustering
- Boolean
- NLP tools, techniques and resources
- Logics
- Combinations
- - Data fusion
- - Collection merging ... broadcasting search(es) to subsets of available collections
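
To make the VSM entry above concrete, here is a minimal sketch of vector-space matching: documents and the query are represented as term-weight vectors and compared by cosine similarity. It is an illustration under stated assumptions, not a reference implementation. The function names (`tokenize`, `build_idf`, `vectorize`, `cosine`, `rank`) are hypothetical, tokenization is plain whitespace splitting, and the tf-idf weighting shown is just one common variant among many.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    return text.lower().split()


def build_idf(docs):
    # Document frequency: number of documents in which each term occurs.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    # One common idf variant (log(N/df) + 1); other formulas are equally valid.
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def vectorize(text, idf):
    # Sparse tf-idf vector as a dict; terms unseen in the corpus are dropped.
    tf = Counter(tokenize(text))
    return {term: freq * idf[term] for term, freq in tf.items() if term in idf}


def cosine(u, v):
    # Cosine of the angle between two sparse vectors.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    # Return (score, doc_id) pairs, best match first; ties broken by doc_id.
    idf = build_idf(docs)
    query_vec = vectorize(query, idf)
    scores = [(cosine(query_vec, vectorize(doc, idf)), i)
              for i, doc in enumerate(docs)]
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    corpus = [
        "signature files and inverted files",
        "video and audio retrieval",
        "inverted files for text retrieval",
    ]
    for score, doc_id in rank("inverted files", corpus):
        print(f"{score:.3f}  {corpus[doc_id]}")
```

The same representation question applies to both sides of the match: the query goes through the same `vectorize` step as the corpus, which is what makes the comparison meaningful.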
Engineering Issues
This covers the actual implementation techniques used in building systems ... the real "nuts and bolts" stuff:
- Parallel
- Hardware
- Implementation "smarts", tricks or methods (signature files, inverted files, string searching)
- Distributed / Networked
- Performance evaluation
- - effectiveness (precision/recall)
- - efficiency (speed, response time)
- Scalability
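
Two of the items above, inverted files and precision/recall, are easy to show in a few lines. The sketch below builds a toy in-memory inverted file, answers a Boolean AND query against it, and scores the result against a hand-picked relevant set. The function names (`build_inverted_index`, `boolean_and`, `precision_recall`) are hypothetical, tokenization is assumed to be whitespace splitting, and a real system would keep compressed, disk-resident postings rather than Python sets.

```python
from collections import defaultdict


def build_inverted_index(docs):
    # Map each term to the set of document ids (postings) that contain it.
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, terms):
    # Intersect the postings of all query terms; a missing term yields no hits.
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


def precision_recall(retrieved, relevant):
    # Precision: fraction of retrieved documents that are relevant.
    # Recall: fraction of relevant documents that were retrieved.
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    corpus = [
        "signature files and inverted files",
        "video and audio retrieval",
        "inverted files for text retrieval",
    ]
    index = build_inverted_index(corpus)
    retrieved = boolean_and(index, ["inverted", "files"])
    # Assumption for illustration: only document 2 is judged relevant.
    print(retrieved, precision_recall(retrieved, {2}))
```

Effectiveness here is measured against relevance judgements, while efficiency would be measured separately, for example by timing `boolean_and` as the collection grows.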
User Aspects
- HCI / user interface
- Visualisation
- Information needs
- User modelling
Note: we still have to find homes for relevance feedback, query expansion and term
weighting functions!