CSCE 689 :: Internet-Scale Data Management :: Fall 2010

Instructor: James Caverlee, HRBB 403
Office Hours: 10-11am Tues/Thurs, or by appointment
Department of Computer Science and Engineering
Texas A&M University

TA: Brian Eoff, HRBB 408
Office Hours: 10-11am Mon/Wed, or by appointment

This course introduces advanced research topics in Internet-scale data management. It covers the relevant theoretical foundations, methods, and tools across a wide spectrum of areas, including (i) large-scale distributed information management; (ii) data and text mining techniques and algorithms; and (iii) data privacy and security issues in large-scale Internet systems.

Course Syllabus

August 31 / September 2: Introduction to the course, administrivia, etc.

September 7 / 9: MapReduce

September 14 / 16: Link Analysis

September 21 / 23: Text/Web/Blog Mining

September 28 / 30: Communities

October 5 / 7: Social Media

October 12 / 14: Real-Time Web

October 19 / 21: Duplicate Detection

October 26 / 28: Guest Lectures on Visualizing Large Datasets

November 2 / 4: Social Spam

November 9 / 11: Geography and the Social Web

November 16 / 18: Influence in Social Information Networks

November 23: Short Text

November 30 / December 2

December 7
