

Gediz University, Computer Engineering Department
Spring 2016
Lecture: Tuesday: 09:00-11:45, A117

Instructor: Halûk Gümüşkaya
Office: D107
Office Hours: Mon-Wed: 13:00-14:00
email: haluk.gumuskaya@gediz.edu.tr

Teaching Assistant:
Office:
Office Hours:
email:









Course Description
Introduction to data mining. Descriptions of data. Data preprocessing: data
cleaning, integration, and reduction. Data warehousing and online analytical
processing. Association and correlation analysis. Classification: decision
trees, naïve Bayesian classification, support vector machines, neural
networks, rule-based classification, pattern-based classification, and
logistic regression. Cluster analysis. Outlier analysis.
Prerequisite
The course assumes only a modest background in statistics or mathematics; no
database knowledge is needed.
Lecture Schedule (tentative)

W   Date   Lec    Topics Covered
0   23/02  Lec 0  Course Overview
                  Data Deluge and Technical Challenges, Definitions for Data
                  Mining and Related Sciences, Data Science Related Jobs and
                  University Graduate Programs, DIKW Process Examples, Tools
                  and Languages for Data Science, Course Description and
                  Objectives, Requirements and Assumptions, Course Outline,
                  Text Books, Other Lecture Materials and Tools, Course
                  Activities and Grading
1   01/03  Lec 1  Introduction to Data Mining
                  Why Data Mining? What is Data Mining? What Kinds of Data
                  can be Mined? Data Mining Functions, Which Sciences and
                  Technologies are Used? Which Kinds of Applications are
                  Targeted? Major Issues in Data Mining, A Brief History of
                  Data Mining and Data Mining Society
2   08/03  Lec 2  Getting to Know Your Data
                  Data Objects and Attribute Types, Basic Statistical
                  Descriptions of Data, Data Visualization, Measuring Data
                  Similarity and Dissimilarity
3   15/03  Lec 3  Data Preprocessing
                  Data Preprocessing: An Overview, Data Cleaning, Data
                  Integration, Data Reduction, Data Transformation and Data
                  Discretization
4   22/03  Lab 1  Getting to Know Your Data and Data Preprocessing using WEKA
5   29/03  Lec 4  Association Analysis - Basic Concepts and Methods
                  Introduction, Frequent Itemsets, Scalable Mining Methods
                  and Apriori Algorithm, Finding Association Rules with
                  Apriori, Learning Association Rules using WEKA
6   05/04  Lec 5  Classification: Basic Concepts, Decision Trees, and Model
                  Evaluation
                  Basic Concepts, Decision Tree Based Classification,
                  Practical Issues of Classification, Model Evaluation,
                  Classification using WEKA
7   12/04         Midterm Exam I
8   19/04         Classification - Advanced Methods
                  Rule-Based Classification, Bayes Classification Methods,
                  Neural Networks and Classification by Backpropagation,
                  Support Vector Machines, Classification by Using Frequent
                  Patterns, Lazy Learners (or Learning from Your Neighbors)
9   26/04         Classification - Additional Topics
                  Other Classification Methods, Model Evaluation and
                  Selection, Techniques to Improve Classification Accuracy:
                  Ensemble Methods
10  03/05         Classification Applications
11  10/05         Cluster Analysis - Basic Concepts and Methods
                  Cluster Analysis: Basic Concepts, Partitioning Methods,
                  Hierarchical Methods, Density-Based Methods, Grid-Based
                  Methods, Evaluation of Clustering
12  17/05         Cluster Analysis - Advanced Methods
                  Probability Model-Based Clustering, Clustering
                  High-Dimensional Data, Clustering Graphs and Network Data,
                  Clustering with Constraints
13  24/05         Outlier Analysis
                  Outlier and Outlier Analysis, Outlier Detection Methods,
                  Statistical Approaches, Proximity-Based Approaches,
                  Clustering-Based Approaches, Classification Approaches,
                  Mining Contextual and Collective Outliers, Outlier
                  Detection in High-Dimensional Data
14  31/05         Midterm Exam II
Textbooks
Main Textbooks and Materials

Data Mining: Concepts and Techniques, 3rd Edition, J. Han, M.
Kamber, J. Pei, Morgan Kaufmann, 769 pp, 2011. 

Introduction to Data Mining, P. Tan, M. Steinbach, V. Kumar,
Addison-Wesley, 769 pp, 2006. 

Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition,
Ian H. Witten, E. Frank, M. A. Hall, Morgan Kaufmann, 629 pp, 2011. 

Two Data Mining with Weka courses (YouTube channel), Ian H. Witten, MOOCs
(Massive Open Online Courses) from the University of Waikato, New Zealand. 

Weka Tutorial by Rushdi Shams 
Recommended

Mining Massive Datasets, A. Rajaraman and J. Ullman, 2nd Edition, Cambridge
University Press, 2014. Available as a free download (511 pages, 3 MB). 

Introduction to Machine Learning, 2nd Edition, Ethem Alpaydın, The MIT Press,
2010. 

Python for Data Analysis, W. McKinney, O'Reilly, 2013. 

Machine Learning in Action, P. Harrington, Manning Publications, 2012. 

Mahout in Action, S. Owen, R. Anil, T. Dunning, E. Friedman, Manning
Publications, 2012. 
Tools and Development Environments

Weka: Data Mining Software in Java 

Matlab

Grading
20 % : HW Assignments
25 % : Midterm 1
25 % : Midterm 2
30 % : Final
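
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every component is scored on a 0-100 scale, which the syllabus does not state; the names `WEIGHTS` and `course_grade` and the sample scores are hypothetical.

```python
# Weights in percent, taken from the grading breakdown above.
WEIGHTS = {
    "homework": 20,
    "midterm_1": 25,
    "midterm_2": 25,
    "final": 30,
}


def course_grade(scores):
    """Return the weighted course grade.

    Assumes each score is on a 0-100 scale (an assumption, not stated
    in the syllabus).
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must have exactly the keys in WEIGHTS")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {"homework": 80, "midterm_1": 70, "midterm_2": 90, "final": 85}
print(course_grade(example))  # 81.5
```

With these sample scores the grade is (20*80 + 25*70 + 25*90 + 30*85) / 100 = 81.5.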
