The objective of this class is to introduce students to the fundamentals of modern information retrieval systems. The course will start with classic textual information retrieval systems, then move to distributed and multimedia systems. The first half of the course will be lecture- and assignment-oriented; the second half will be seminar-oriented. Students will be expected to read papers on a research topic of their choice, present a summary to the class, and complete an independent project.
Document Processing, Text Preprocessing, Boolean, Vector-Space and Probabilistic Retrieval Systems, Intelligent Search Agents, Natural Language Processing for IR, Corpus Linguistics for IR, Multimedia, Networking, World Wide Web Search Engines.
Programming experience in a high-level language and experience with the UNIX operating system.
The required textbook for this class is "Information Retrieval: Data Structures and Algorithms" by Frakes and Baeza-Yates, Prentice-Hall, 1992. This will be supplemented by research papers.
Final grades in this class will be assigned based on the following scale: A>=90%, B>=80%, C>=70%, D>=60%, and F<60%. There will be four programming assignments (10% of the grade each), an open-book midterm exam (25%), a class presentation (10%), and a final literature survey or programming project (25%). Late assignments will be penalized 5% for each day late.
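As an illustration of how these weights combine, the following is a minimal sketch, not an official grade calculator. The weights and letter cutoffs come from the paragraph above. The function names are hypothetical, and the late penalty is assumed to be 5 percentage points deducted from the assignment's score per day with a floor of zero, since the syllabus does not say how the penalty is applied.

```python
# Weights and cutoffs are taken from the grading paragraph; all names are illustrative.
WEIGHTS = {
    "assignment": 0.10,     # each of the four programming assignments
    "midterm": 0.25,
    "presentation": 0.10,
    "final_project": 0.25,  # literature survey or programming project
}
LATE_PENALTY_PER_DAY = 5.0  # assumption: percentage points off the assignment score


def apply_late_penalty(score, days_late):
    """Deduct 5 points per late day, never going below zero (assumed floor)."""
    return max(0.0, score - LATE_PENALTY_PER_DAY * days_late)


def final_percentage(assignments, midterm, presentation, final_project):
    """Combine component scores (0-100 scale) into a weighted course percentage.

    assignments is a list of four (score, days_late) pairs.
    """
    if len(assignments) != 4:
        raise ValueError("expected four programming assignments")
    total = sum(
        WEIGHTS["assignment"] * apply_late_penalty(score, days)
        for score, days in assignments
    )
    total += WEIGHTS["midterm"] * midterm
    total += WEIGHTS["presentation"] * presentation
    total += WEIGHTS["final_project"] * final_project
    return total


def letter_grade(percentage):
    """Map a course percentage to a letter using the stated scale."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percentage >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    # Second assignment is two days late, so 80 becomes 70 under the assumed penalty.
    scores = [(90, 0), (80, 2), (100, 0), (70, 0)]
    pct = final_percentage(scores, midterm=85, presentation=90, final_project=88)
    print(round(pct, 2), letter_grade(pct))
```

With the sample scores shown, the weighted total comes to roughly 85.25%, which falls in the B range.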
Programming assignments represent a significant fraction of the required work for this course. Students are expected to submit original work on the specified due date. Copying of assignments is not permitted and will result in a grade of zero on the assignment and referral to the Dean. Students are allowed to ask fellow students questions regarding C++ syntax, isolated program bugs, and general problem solving principles. Cheating on exams is a serious case of academic misconduct and may result in expulsion from the class and referral to the Dean.
01/17 Basics of IR; Chapter 1
01/19 Basics of IR
01/22 Basics of IR
01/24 Data Structures for IR; Chapter 2
01/26 Data Structures for IR
01/29 Data Structures for IR
01/31 Inverted Files; Chapter 3
02/02 Inverted Files
02/05 Lexical Analysis; Chapter 7
02/07 Lexical Analysis
02/09 Stemming Algorithms; Chapter 8
02/12 Stemming Algorithms
02/14 Boolean Retrieval; Chapter 12
02/16 Vector Space Retrieval; Chapter 14
02/19 Probabilistic Retrieval; Chapter 14
03/11 Midterm Exam
03/25 - 03/29 Spring Break
Advanced Topics and Student Presentations
05/13 Review