CSCE 670 :: Information Storage and Retrieval :: Spring 2018

Tues/Thurs 9:35-10:50am in ETB 2005

Instructor: James Caverlee, HRBB 403
Office Hours: Wednesday 2-4pm
Department of Computer Science and Engineering
Texas A&M University

TA: Parisa Kaghazgaran, 408D
Office Hours: Monday 2-4pm and Wednesday 9-11am


Course Summary

In this course, we'll study the theory, design, and implementation of text-based and Web-based information retrieval systems, including algorithms and techniques at the core of modern search and recommender systems. By the end of the semester you will be able to:


All course communication will be via Piazza. We will post often to Piazza, so you should plan to check it often (every day).


I expect all students to have had some previous exposure to basic probability, statistics, algorithms, and data structures. You should be able to design and develop large programs and learn new software libraries on your own.


The primary textbook is IIR: Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, Cambridge University Press, 2008. Available at Cambridge University Press, at Amazon, and other fine booksellers.

We may also read some selections from:

You may find some of these optional textbooks helpful, though none are required:

It is critically important that you study the relevant course readings before class so that we can make the most of our limited class time together. I treat our class meetings as opportunities to highlight significant aspects of the material, to answer questions, to engage in discussions about particular topics, and so on. We cannot cover all of the material in class, so it is up to you to stay on top of the readings and the assignments.


The grading scale is A: 90-100, B: 80-89, C: 70-79, D: 60-69, F: 0-59. The course grading policy is as follows:

In-Class Quick Quizzes (10%). Keeping up with the readings and attendance in class are both important to your success in the course. Hence we will have 10 in-class quick quizzes (5 to 10 minutes each) spread across the semester. Of those 10 quizzes, we will count just 7, meaning that you can miss up to 3 with no negative repercussions. Note that we do not expect you to memorize the readings; we just expect that you have done them and given them a reasonable pass.

Discussion Posts (5%). We expect you to participate in online discussions on Piazza. Over the course of the semester, you should post at least two times; posts can cover topics related to the course, help with a classmate's homework question, and so on. The final day to post for credit toward your participation grade is April 18. (Of course you are welcome to continue to post afterwards, but these posts will not count toward your participation grade.)

Exams (40%). We will have two in-class exams, each worth 20% of your overall grade. Both exams are closed book; however, you may bring two standard 8.5" by 11" sheets of paper (front and back) with any notes you deem appropriate or significant. No calculators, iPads, iPhones, Android phones/tablets, or abacuses are allowed.

Homework (20%). We will have several homework assignments. These will be a mix of programming assignments and problem sets. All programming assignments will be in Python; we do not assume that you have been exposed to Python before, but we do expect you to come up to speed rapidly.

All homework assignments must be submitted by 11:59pm Central time on the due date. For the homework assignments, you may talk to any other class member or work in groups to discuss the problems in a general way. However, your actual detailed solution must be yours alone. If you do talk to other students, you must note on your assignment whom you discussed the problems with. Your submitted work must be written solely by you and must not contain work directly copied from others.

Homework Collaboration Clarification: To clarify, your homework is yours alone and you are expected to complete each homework independently. Your solution should be written by you without the direct aid or help of anyone else. However, we believe that collaboration and team work are important for facilitating learning, so we encourage you to discuss problems and general problem approaches (but not actual solutions) with your classmates. If you do have a chat with another student about a homework problem, you must inform us by writing a note on your homework submission (e.g., Bob pointed me to the relevant section for problem 3). The basic rule is that no student should explicitly share a solution with another student (and thereby circumvent the basic learning process), but it is okay to share general approaches, directions, and so on. If you feel like you have an issue that needs clarification, feel free to contact either me or the TA.

Homework Plagiarism Policy: We will use the Stanford Moss system to check homework submissions for plagiarism. Students found to have engaged in plagiarism will be punished severely, typically earning an automatic F in the course and being reported to the Aggie Honor System.

Homework Late Days: For the homework assignments, you have a total of 5 late days that you can use during the semester. However, a single assignment can be submitted at most 3 days late, so that we can post solutions in a timely fashion. For the purposes of the class, a late day is an indivisible 24-hour unit. Once you exhaust your 5 late days, we will not accept any late submissions.
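To make the late-day arithmetic concrete, here is a minimal Python sketch of the rules above. It is an illustration only, not an official tool: the function names are hypothetical, and it assumes that a submission needing more late days than you have left is simply not accepted.

```python
import math
from typing import Tuple

TOTAL_LATE_DAYS = 5      # semester-wide budget stated in the policy
MAX_PER_ASSIGNMENT = 3   # per-assignment cap stated in the policy


def late_days_needed(hours_late: float) -> int:
    """Round lateness up to whole days (a late day is an indivisible 24-hour unit)."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def accept_submission(hours_late: float, days_used_so_far: int) -> Tuple[bool, int]:
    """Return (accepted, late days used afterwards) for one submission.

    Assumption: a submission that needs more late days than remain is rejected.
    """
    needed = late_days_needed(hours_late)
    remaining = TOTAL_LATE_DAYS - days_used_so_far
    if needed > MAX_PER_ASSIGNMENT or needed > remaining:
        return False, days_used_so_far
    return True, days_used_so_far + needed
```

For example, under these assumptions a submission that is 30 hours late costs 2 late days, and a submission that is 80 hours late (4 late days) is not accepted even if your budget is untouched.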

Project (25%). For the project, you will work in teams of three or four on a problem of your choosing that is interesting, significant, and relevant to this course.
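The components above sum to 100%. As a rough sketch of how they combine, the Python below computes a weighted score and letter grade. It rests on several assumptions that the policy does not spell out: every component is scored on a 0-100 scale, the 7 counted quizzes are your 7 highest (a missed quiz scores 0), and the letter cutoffs are applied without rounding. All names are hypothetical.

```python
# Component weights in percent, taken from the grading policy above.
WEIGHTS = {"quizzes": 10, "discussion": 5, "exams": 40, "homework": 20, "project": 25}


def quiz_component(quiz_scores, counted=7):
    """Average the highest `counted` quiz scores (assumption: best 7 of 10 count)."""
    best = sorted(quiz_scores, reverse=True)[:counted]
    return sum(best) / counted


def overall_score(components):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 score to a letter (assumption: cutoffs applied without rounding)."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return "F"
```

A usage example: `overall_score({"quizzes": quiz_component(scores), "discussion": 100, "exams": 85, "homework": 90, "project": 88})`, where `scores` is your list of quiz results, followed by `letter_grade(...)` on the result.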

Regrade Policy: If you feel that we have made an error in grading, you may resubmit the assignment for a regrade. You must include a brief written statement describing what portion has been graded in error. (Hey, if you are reading this, here's a hint: we will have our first quick quiz on Thursday of the first week of class -- the answer to the first question is "The Beatles"). Note that we reserve the right to examine the entire assignment, so there is a chance we may find errors in your assignment that we missed before.

Americans with Disabilities Act (ADA) Policy Statement

The Americans with Disabilities Act (ADA) is a federal anti-discrimination statute that provides comprehensive civil rights protection for persons with disabilities. Among other things, this legislation requires that all students with disabilities be guaranteed a learning environment that provides for reasonable accommodation of their disabilities. If you believe you have a disability requiring an accommodation, please contact Disability Services, currently located in the Disability Services building at the Student Services at White Creek complex on west campus or call 979-845-1637. For additional information, visit

Academic Integrity Statements

AGGIE HONOR CODE: "An Aggie does not lie, cheat, or steal or tolerate those who do." Upon accepting admission to Texas A&M University, a student immediately assumes a commitment to uphold the Honor Code, to accept responsibility for learning, and to follow the philosophy and rules of the Honor System. Students will be required to state their commitment on examinations, research papers, and other academic work. Ignorance of the rules does not exclude any member of the TAMU community from the requirements or the processes of the Honor System. For additional information please visit: