In this course, we'll study the theory, design, and implementation of text-based and Web-based information retrieval systems. By the end of the semester you will be able to:
We're going to use Google Groups for all course communication, so you should check our Google Group often. If you've got a homework question, post to the group. If you've found a cool link you want to share, post to the group! If you're looking for a study partner, post to the group!! Basically, the Google Group should be your best, first choice for all class-related concerns. We will monitor the group and provide feedback. But everyone is encouraged to contribute.
I expect all students to have had some previous exposure to basic probability, statistics, algorithms, and data structures. You should be able to design and develop large programs and learn new software libraries on your own. All homework assignments are in Python, but we do not assume that you are already proficient in Python.
The primary textbook is IIR: Introduction to Information Retrieval, by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, Cambridge University Press, 2008. It is available from Cambridge University Press, Amazon, and other fine booksellers.
We'll also read some selections from:
You may find some of these optional textbooks helpful, though none are required:
It is critically important that you study the relevant course readings before class so that we can make the most of our limited class time together. I treat our class meetings as opportunities to highlight significant aspects of the material, to answer questions, to engage in discussions about particular topics, and so on. We cannot cover all of the material in class, so it is up to you to stay on top of the readings and the assignments.
The course grading policy is as follows: 4% Class participation, 24% Homework, 48% Quizzes, 24% Project. The grading scale is A: 90-100, B: 80-89, C: 70-79, D: 60-69, F: 0-59.
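To see how the weights and the grading scale combine, here is a minimal Python sketch. The weights and letter cutoffs come from the policy above; the function names and the sample scores are hypothetical, and the sketch assumes no rounding of the final score, which the policy does not specify.

```python
# Weights from the grading policy above (percent of the final grade).
WEIGHTS = {"participation": 4, "homework": 24, "quizzes": 48, "project": 24}

# Lower bound of each letter grade, from the grading scale above.
SCALE = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (0, "F")]


def final_score(component_scores):
    """Combine per-component scores (each on a 0-100 scale) into a weighted total."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a final score to a letter. Assumption: the score is not rounded first."""
    for cutoff, letter in SCALE:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical student, for illustration only.
example = {"participation": 100, "homework": 90, "quizzes": 80, "project": 85}
print(final_score(example), letter_grade(final_score(example)))
```

Because the quizzes carry 48% of the grade, a 10-point change in your quiz average moves the final score by 4.8 points, twice the effect of the same change in homework or the project.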
Class participation. Attendance in class and participation in the discussion are both important to your success in the course. As one crude measure of your participation, we will have around 3 to 5 low-stress ungraded quick quizzes (less than 5 minutes each) spread across the semester. These quick quizzes will not be graded for correctness. I will use them to gauge what topics we need to devote more time to and as an indicator that you were in class. Additionally, we expect you to participate in online discussions at our Google Group. Over the course of the semester, you should make at least three substantive, interesting posts to the discussion forum (either initiating a new topic or responding to someone else).
Quizzes. We'll have four in-class quizzes, each counting for 12% of your final grade. All quizzes are closed book. You may bring one standard 8.5" by 11" piece of paper with any notes you deem appropriate or significant (front and back). No calculators, iPads, iPhones, Android phones/tablets, or abacuses are allowed.
Project. For the project, you will work in teams of either two or three students on a problem of your choosing that is interesting, significant, and relevant to Information Storage & Retrieval. The ultimate goal of your course project is to develop a new tool to tackle some interesting real-world problem.
Homework assignments. We will have four homework assignments over the course of the semester, each worth 6% of your final grade. Update: We are splitting the first homework into two, so we will have a Homework 0 worth 3% and a Homework 1 worth 3%.
All homework assignments must be submitted by 11:59pm Central time on the due date. For the homework assignments, you may talk to any other class member or work in groups to discuss the problems in a general way. However, your actual detailed solution must be yours alone. If you do talk to other students, you must note on your assignment whom you discussed the problems with. Your submitted work must be written solely by you and must not contain work directly copied from others.
Homework Collaboration Clarification: To clarify, your homework is yours alone and you are expected to complete each homework independently. Your solution should be written by you without the direct aid or help of anyone else. However, we believe that collaboration and teamwork are important for facilitating learning, so we encourage you to discuss problems and general problem approaches (but not actual solutions) with your classmates. If you do have a chat with another student about a homework problem, you must inform us by writing a note on your homework submission (e.g., Bob pointed me to the relevant section for problem 3). The basic rule is that no student should explicitly share a solution with another student (and thereby circumvent the basic learning process), but it is okay to share general approaches, directions, and so on. If you have an issue that needs clarification, feel free to contact either me or the TA.
Homework Plagiarism Policy: We will use the Stanford Moss system to check homework submissions for plagiarism. Students found to have engaged in plagiarism will be punished severely, typically earning an automatic F in the course and being reported to the Aggie Honor System.
Homework Late Days: For the homework assignments, you have a total of 5 late days that you can use during the semester. However, a single assignment can be submitted at most 2 days late, so that we can post solutions in a timely fashion. For the purposes of the class, a late day is an indivisible 24-hour unit. Once you exhaust your 5 late days, we will not accept any late submissions.
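The late-day rules above can be sketched in a few lines of Python. This is only an illustration of one reading of the policy: the limits (5 total, 2 per assignment, 24-hour units) come from the text, while the function names are hypothetical and the sketch assumes that any fraction of a 24-hour period past the deadline consumes a full late day.

```python
import math

MAX_LATE_DAYS_TOTAL = 5           # late days available over the whole semester
MAX_LATE_DAYS_PER_ASSIGNMENT = 2  # cap for any single assignment


def late_days_needed(hours_late):
    """Late days consumed by a submission that is `hours_late` hours past the deadline.

    Assumption: a late day is indivisible, so a partial day counts as a full day.
    """
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def can_submit(hours_late, late_days_used):
    """Return True if the submission is acceptable under the late-day rules."""
    needed = late_days_needed(hours_late)
    if needed > MAX_LATE_DAYS_PER_ASSIGNMENT:
        return False
    return late_days_used + needed <= MAX_LATE_DAYS_TOTAL
```

Under this reading, a submission 30 hours late costs 2 late days, so it is accepted if you have used 3 late days so far but rejected if you have already used 4.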
Regrade Policy: If you feel that we have made an error in grading either a homework or a quiz, you may resubmit it for a regrade. You must include a brief written statement describing which portion of your solution has been graded in error. Note that we reserve the right to examine the entire assignment, so there is a chance we may find errors in your assignment that we missed before.