George Mason University

Department of Computer Science

CS 504: Principles of Data Management and Mining

Fall 2018

Professor Jessica Lin


News & Announcements

8/23: Welcome to class!
8/30: HW1 posted. Due 9/17 at 4:30pm.
9/20: HW2 posted. Due 10/1 at 4:30pm.
10/20: HW4 posted. Due 11/05 at 4:30pm.

Course Description

Techniques to store, manage, and use data including databases, relational model, schemas, queries and transactions. On Line Transaction Processing, Data Warehousing, star schema, On Line Analytical Processing. MOLAP, HOLAP, and hybrid systems. Overview of Data Mining principles, models, supervised and unsupervised learning, pattern finding. Massively parallel architectures and Hadoop.

Class Time and Location

Monday 4:30-7:10pm
Off-campus

Instructor

Dr. Jessica Lin
Office: Engineering Building 4419
Phone: 703-993-4693
Email: jessica [AT] gmu [DOT] edu
Office Hours: Tuesday 1:30-3:30pm or by appointment

Prerequisites

Graduate Standing

Note: This course cannot be taken for credit by students of the MS CS, MS ISA, MS SWE, MS IS, CS PhD or IT PhD programs.

Grading

Assignments: 30%
Project: 30%
Midterm: 30%
Quizzes: 10%

Exam

There will be 4 or 5 quizzes and a midterm exam covering lectures and readings. Quizzes must be taken at the time they are given. If you cannot make it to the midterm exam, you must make arrangements with the instructor in advance. Missed exams cannot be made up.

Textbooks

Data Science for Business: What You Need To Know About Data Mining and Data-Analytic Thinking by Foster Provost and Tom Fawcett

NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence by Pramod J. Sadalage and Martin Fowler

Both books are available on Safari Books for free with your GMU account. More reading materials will be given in class.

Topics

Honor Code Statement

The GMU Honor Code is in effect at all times. In addition, the CS Department has further honor code policies regarding programming projects, which are detailed here. Any deviation from the GMU or CS Department Honor Code is considered an Honor Code violation. All assignments for this class are individual unless otherwise specified.

Learning Disability Accommodation

If you have a documented learning disability or other condition that may affect academic performance, make sure this documentation is on file with the Office of Disability Services, and then discuss accommodations with the professor.

Tentative Schedule

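| Week | Date  | Topic                                                 | Assigned | Due                   | Note                             |
|------|-------|-------------------------------------------------------|----------|-----------------------|----------------------------------|
| 1    | 8/27  | Introduction / ER Modeling                            | HW1      |                       |                                  |
| 2    | 9/3   | No Class - Labor Day                                  |          |                       |                                  |
| 3    | 9/10  | ER Model                                              |          |                       |                                  |
| 4    | 9/17  | Relational Model                                      | HW2      | HW1                   |                                  |
| 5    | 9/24  | Relational Model / SQL                                |          |                       |                                  |
| 6    | 10/1  | SQL                                                   | HW3      | HW2                   | Quiz 1                           |
| 7    | 10/8  | SQL                                                   |          |                       | Class meets on Tuesday this week |
| 8    | 10/15 | Data Mining                                           | HW4      | HW3                   |                                  |
| 9    | 10/22 | Data Mining (Classification)                          |          |                       | Quiz 2                           |
| 10   | 10/29 | Data Mining (Model Evaluation, Clustering)            |          |                       | Quiz 3                           |
| 11   | 11/5  | Data Mining (Association Rule Mining); Midterm review |          | Project Proposal, HW4 | Quiz 4                           |
| 12   | 11/12 | Midterm                                               |          |                       |                                  |
| 13   | 11/19 | Post-midterm review; Data Warehouse                   |          |                       |                                  |
| 14   | 11/26 | NoSQL/MapReduce                                       |          |                       |                                  |
| 15   | 12/3  | Class Review                                          |          |                       |                                  |
| 16   | 12/10 | Project due                                           |          |                       |                                  |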