Information Retrieval
Instructor: Venkatesh Vinayakarao
Term: Aug - Nov 2020



Welcome to the Information Retrieval (IR) course! It is difficult to imagine living without search engines. The availability of big data has necessitated a systematic study of retrieval techniques, and the principles and practices of information retrieval have been a focus of researchers and practitioners alike. This course is not just about search engines. It is about dealing with big data and retrieving information, which opens up several interesting applications. The course will introduce students to key parts of IR such as indexing techniques, challenges in query processing, and well-known retrieval models.

Key Learning Objectives

At the end of this course, you should be able to:
  • Understand and apply text indexing techniques to big data.
  • Understand and apply text ranking techniques.
  • Analyze and evaluate existing retrieval systems.

Lecture Resources

The lecture schedule of the 2019 offering (which was a 2-credit course) is available here.

Lecture # | Topic | Readings | Slides/Material

Part 1: Building a Search System - An Overview
1 | Introduction to Information Retrieval | Chapter 1 from CPS | [Video] [Slides]
2 | Building a Simple Retrieval System | Chapter 1 from CPS | [Video] [Slides]
3 | Query Processing with Inverted Index | Chapter 2 from CPS | [Video] [Slides]
Assignment 1 released. Please visit Moodle for details.
4 | Evaluating Retrieval Systems | Chapter 8 from CPS | [Video] [Slides]
- | Tutorial: Lucene Demo | Lucene Tutorial | [Video] [LuceneDemo.zip]
Bonus Task 1 released. Please visit Moodle for details. A preview of a bonus task from an earlier course offering is here.

Part 2: Components of a Retrieval System

2.1 Indexing
5 | Indexing: Query Processing Order | Section 1.3 from CPS | [Video] [Slides]
6 | Indexing: Challenges | Chapter 2 from CPS | [Video] [Slides]
[Video] [Slides] The Power of Indexing
Bonus Task 2 released. Please visit Moodle for details.

2.2 Query Understanding
7 | Query Understanding: Segmentation and Spelling Correction | Chapter 3 from CPS | [Video] [Slides]
8 | Query Understanding: Phonetic Correction | Chapter 3 from CPS | [Video] [Slides]
Handling Wildcard Queries
Assignment 2 released. Please visit Moodle for details.
9 | Tutorial: Solr | - | [Video] [Slides] [Code]

2.3 Index Compression
10 | Index Compression | Chapter 5 from CPS | [Video] [Slides]

2.4 Crawling
11 | Crawlers | Chapter 20 from CPS, Chapter 3 from BDT | [Video] [Slides]

2.5 Evaluation
12 | Evaluation | Chapter 8 from CPS | -

Part 3: Advanced Topics in Information Retrieval


Evaluation
Instrument | Max Marks
Final Exam | 30%
Mid-Semester Exam | 20%
Assignments (4 * 10% each) | 40%
Mini-Project | 10%

Pre-requisites
Familiarity with Java will help in coding with Lucene. You may use your favourite programming language for assignments as long as the objectives of the assignment are met. A basic understanding of linear algebra, set theory, and probability will be useful in understanding the IR models. However, there are no specific pre-requisites for this course. We will revise the fundamentals wherever necessary.

Resources

Text Reference
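  • [CPS] Introduction to Information Retrieval. Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze
  • [BDT] Search Engines: Information Retrieval in Practice. Bruce Croft, Donald Metzler, Trevor Strohman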


If you are not having fun, you are not the best student you can be!