Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
This is the textbook for the course. The PDF and HTML versions of the book are available from the linked website.
A simple system able to answer Boolean queries on the movie summaries dataset.
Code for spelling correction using edit distance.
Code to explore the vector space representation of the TIME dataset.
The dataset used in the code for lecture 6.
Ranking of the results with tf-idf.
File for the coding part of lecture 10.
A simple implementation of the PageRank algorithm.
A simple implementation of latent semantic indexing.
Updated on December 19th.