COM 448 Cloud Computing


 

 


      Lecture Schedule

W

Lec

 Topics Covered

Lab

Technology Training

Supplementary

HW

0
10/02
  Course Introduction
  HW1
1
17/02
Lec 1 Course Motivation (1/2)
Emerging Technologies, Data Deluge, Industry Trends, Computing Model: Clouds, Data Centers, Virtualization.
       
2
24/02


 
  Course Motivation (2/2)
Research Model: 4th Paradigm, Data Science Process: DIKW, Features of Data Deluge, Data Analytics, Cloud Applications: Physics-Informatics, Recommender Systems, Information Retrieval, Cloud Applications in Research: Science Clouds, Internet of Things, Parallel Computing and MapReduce.
Lab 1


 

Lab 2

Python for Big Data and X-Informatics, and
NumPy, SciPy, Matplotlib (powerful tools that every data scientist who uses Python must know). This training also covers Canopy, which is an IDE for Python.

FutureSystems
 
HW2
 



 
3
03/03


 
Lec 2

 
Recommender Systems and Algorithms
Recommender Systems as an Optimization Problem, Kaggle Competitions, Examples of Recommender Systems: Netflix, Google News Personalization Engine, Yahoo Recommender Systems
Lab 3


 
Using Plotviz
Plotviz is a data visualization tool developed at Indiana University for displaying point distributions in 3D.
 
  HW3
4
09/03



 
  Recommender Systems and Algorithms
Algorithms: User-based Nearest-Neighbor Collaborative Filtering, Vector Space Formulation of Recommender Systems, Item-based Collaborative Filtering, k Nearest Neighbors and High Dimensional Spaces
 
Lab 4



Lab 5
 
kNN  
Recommender Systems - K-Nearest Neighbors (Python & Java Track)

Clustering
Clustering and heuristic methods.
  HW4
5
17/03

 

 

 

 



 

Lec 3.1

Lec 3.2




 

Cloud Computing Technology Part I: Introduction, Software and Systems
Cyberinfrastructure, What is Cloud Computing: Introduction, What and Why is Cloud Computing: Several Other Views, Simple Examples of Use of Cloud Computing, Value of Cloud Computing, Public, Private and Hybrid Clouds, Cloud Software Architecture: IaaS and PaaS, Using the HPC-ABDS Software Stack

Cloud Computing Technology Part II: Architectures, Applications and Systems
Cloud (Data Center) Architectures, Analysis of Major Cloud Providers, Commercial Cloud Storage Trends, Cloud Applications, Science Clouds: Science Applications and Internet of Things, Security, Comments on Fault Tolerance and Synchronicity Constraints
 

 

 

 

 

   
6
24/03

 

Lec 3.3

 

Cloud Computing Technology Part III: Data Systems
The 10 Interaction Scenarios (access patterns) I, The 10 Interaction Scenarios – Science Examples, Remaining General Access Patterns, Data in the Cloud Applications, Processing Big Data
       
7
31/03
  Midterm Exam        
8
07/04
Lec 4.1





 
Cloud Programming and Software Environments:
MapReduce and Hadoop Framework

Big Data and Parallel Computing, History of MapReduce, New Parallel Programming Paradigm: MapReduce, The MapReduce Programming Model, Hadoop Framework, Writing Jobs for Hadoop, Hadoop Distributed File System (HDFS), Hadoop Internals, Hadoop 1.0 vs 2.0, MapReduce Cloud Service
 

Hadoop installation and configuration on notebooks: 1-, 2- and 4-node clusters on notebooks using Cloudera 4.1.1 and 5.3 Hadoop Distributions


Hadoop installation and configuration on FutureSystems - Indiana University Clusters, our project portal address

 
  HW5
9
14/04
Lec 4.2

 

Lec 4.3

Lec 4.4

Cloud Programming and Software Environments:
Introduction to YARN and MapReduce 2
Overview of MapReduce 1 and 2, YARN Architecture, MapReduce v2, Managing a YARN Cluster, Cloudera and MR2

Hadoop MapReduce 2 Tutorial

Hadoop Ecosystem and HPC Integration

       
10
21/04
Lec 5








 
Big Data Applications and Analytics Case Study:
Web Search and Text Mining
Web and Document/Text Search: The Problem, Information Retrieval, Web Search Solution in General Starting with History, Key Fundamental Principles behind Web Search, Information Retrieval (Web Search) Components, Search Engines, Boolean and Vector Space Model, Web Crawling and Document Preparation, Indices, TF-IDF and Probabilistic Models, Data Analytics for Web Search, Link Structure Analysis including PageRank, Web Advertising and Search, Clustering and Topic Models
       
11
28/04
Lec 6

 
Technology for Big Data Applications & Analytics
K-Means, Analysis of 4 Artificial Clusters, K-Means in Java using Mahout, MapReduce Revisited: Advanced Topics, K-Means and MapReduce Parallelism, PageRank
       
12
05/05
Lec 7

 
How to Store Data (NoSQL)
RDBMS vs NoSQL, NoSQL Characteristics, BigTable, HBase, HBase Coding, Indexing Technologies, Related Work, Social Media Searches, Analysis Algorithms
       
13
12/05
Lec 8

 

How to Build a Search Engine (SaaS)
Architecture for a Search Engine, Google Architecture, Evolution of Google’s Search Systems
       
14
19/05
  Project Demonstrations        

 
