This course is a broad overview of data mining that integrates related concepts from machine learning and statistics. Topics include exploratory data analysis, pattern mining, clustering, and classification, with applications to scientific and online data. By the end of the semester you will be able to:
All course communication will be via Piazza. We will post there often, so plan to check it every day.
I expect all students to have had some previous exposure to basic probability, statistics, algorithms, and data structures. You should be able to design and develop large programs and learn new software libraries on your own.
We will read some selections from:
The grading scale is A: 90-100, B: 80-89, C: 70-79, D: 60-69, F: 0-59. The course grading policy is as follows:
Participation (5%). Attendance in class and participation in the discussion are both important to your success in the course. We also expect you to participate in online discussions on Piazza. Over the course of the semester, you should make at least one substantive, interesting post to the discussion forum and participate in at least three threads. The final day to post for participation credit is November 27. (Of course you are welcome to continue posting afterwards, but those posts will not count toward your participation grade.)
Quiz (10%). We will have one quiz during our regular class period on October 11, worth 10%. The quiz is closed book, but you may bring one standard 8.5" by 11" piece of paper with anything you deem appropriate or significant (front and back). No devices allowed.
Final Exam (30%). We will have a comprehensive two-hour final exam on Monday, December 9, from 8am to 10am. For the final, you may bring two standard 8.5" by 11" pieces of paper with anything you deem appropriate or significant (front and back). No devices allowed.
Homework (25%). We will have several homework assignments. These will be a mix of programming assignments and problem sets. Programming assignments will be in Python; we do not assume that you have used Python before, but we do expect you to come up to speed rapidly.
All homework assignments must be submitted by 11:59pm Central time on the due date. For the homework assignments, you may talk to any other class member or work in groups to discuss the problems in a general way. However, your actual detailed solution must be yours alone. If you do talk to other students, you must note on your assignment whom you discussed the problems with. Your submitted work must be written solely by you and must not contain work directly copied from others.
Homework Collaboration Clarification: Your homework is yours alone, and you are expected to complete each homework independently. Your solution should be written by you without the direct aid or help of anyone else. However, we believe that collaboration and teamwork are important for facilitating learning, so we encourage you to discuss problems and general problem approaches (but not actual solutions) with your classmates. If you do have a chat with another student about a homework problem, you must inform us by writing a note on your homework submission (e.g., "Bob pointed me to the relevant section for problem 3"). The basic rule is that no student should explicitly share a solution with another student (and thereby circumvent the basic learning process), but it is okay to share general approaches, directions, and so on. If you have an issue that needs clarification, feel free to contact either me or the TA.
Homework Plagiarism Policy: We will use the Stanford Moss system to check homework submissions for plagiarism. Students found to have engaged in plagiarism will be punished severely, typically earning an automatic F in the course and being reported to the Aggie Honor System.
Homework Late Days: For the homework assignments, you have a total of 5 late days that you can use during the semester. However, a single assignment can be submitted at most 3 days late, so that we can post solutions in a timely fashion. For the purposes of the class, a late day is an indivisible 24-hour unit. Once you exhaust your 5 late days, we will not accept any late submissions.
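As an illustration only, the late-day rules above can be sketched in Python. This is not an official tool: the function names and constants are hypothetical, and the sketch assumes that any fraction of a 24-hour period past the deadline consumes a full late day (our reading of "indivisible") and that both timestamps are in Central time.

```python
import math
from datetime import datetime

TOTAL_LATE_DAYS = 5               # semester-wide budget stated in the policy
MAX_LATE_DAYS_PER_ASSIGNMENT = 3  # per-assignment cap stated in the policy
SECONDS_PER_DAY = 24 * 60 * 60


def late_days_needed(deadline: datetime, submitted: datetime) -> int:
    """Return the number of whole late days a submission consumes.

    Assumption: a partial 24-hour period counts as a full late day.
    """
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / SECONDS_PER_DAY)


def is_accepted(deadline: datetime, submitted: datetime,
                late_days_already_used: int) -> bool:
    """Check a submission against the per-assignment cap and the semester budget."""
    needed = late_days_needed(deadline, submitted)
    if needed == 0:
        return True
    within_cap = needed <= MAX_LATE_DAYS_PER_ASSIGNMENT
    within_budget = late_days_already_used + needed <= TOTAL_LATE_DAYS
    return within_cap and within_budget


# Hypothetical deadline, chosen only for illustration.
deadline = datetime(2024, 10, 1, 23, 59)
one_minute_late = datetime(2024, 10, 2, 0, 0)
print(late_days_needed(deadline, one_minute_late))
print(is_accepted(deadline, one_minute_late, late_days_already_used=4))
```

Under these assumptions, a submission one minute past the deadline uses one late day, and a student who has already used 3 late days cannot submit another assignment 3 days late.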
Project (30%). For the project, you will work in teams of three or four on a problem related to data mining for social good.
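To see how the component weights above combine, here is a minimal Python sketch. It is illustrative only: the function names are hypothetical, each component is assumed to be scored on a 0-100 scale, and the letter cutoffs assume no rounding (so 89.5 maps to a B), which the grading scale above does not specify.

```python
# Component weights in percent, taken from the grading policy above.
WEIGHTS = {
    "participation": 5,
    "quiz": 10,
    "final_exam": 30,
    "homework": 25,
    "project": 30,
}


def course_score(scores: dict) -> float:
    """Combine per-component scores (each 0-100) into a weighted course score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score: float) -> str:
    """Map a course score to a letter; assumes cutoffs at 90/80/70/60, no rounding."""
    for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical student, for illustration only.
example = {
    "participation": 100,
    "quiz": 80,
    "final_exam": 85,
    "homework": 90,
    "project": 95,
}
total = course_score(example)
print(total, letter_grade(total))  # 89.5 B
```

Because the final exam and the project each carry 30%, a 10-point change in either one moves the course score by 3 points, while the same change on the quiz moves it by 1 point.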
Regrade Policy: If you feel that we have made an error in grading, you may resubmit the assignment for a regrade within 7 days of receiving your graded assignment. You must include a brief written statement describing what portion has been graded in error. Note that we reserve the right to examine the entire assignment, so there is a chance we may find errors in your assignment that we missed before.