COM 561 Cloud Computing




      Lecture Schedule (tentative; see Important Notes below)



 Topics Covered



  Course Overview   HW1

Lec 1


Distributed System Models and Enabling Technologies
Scalable Computing over the Internet, Technologies for Network-Based Systems, System Models for Distributed and Cloud Computing, Software Environments for Distributed Systems and Clouds, Performance, Security

- How to Read a Paper, S. Keshav, 2012.

- Above the Clouds: A Berkeley View of Cloud Computing, Technical Report, 2009.




Lec 2

Computer Clusters for Scalable Computing
Clustering for Massive Parallelism, Computer Clusters and MPP Architectures, Design Principles of Computer Clusters, Cluster Job and Resource Management, Case Studies of Top Supercomputer Systems

- What is Parallel Computing?



Lec 3

Virtual Machines and Virtualization of Clusters and Datacenters
Implementation Levels of Virtualization, Virtualization Structures/Tools and Mechanisms, Virtualization of CPU, Memory, and I/O Devices, Virtual Clusters and Resource Management, Virtualization for Data-Center Automation

- Xen and the Art of Virtualization, 2003.

- A Comparison of Software and Hardware Techniques for x86 Virtualization, 2006.



Lec 4

Cloud Platform Architecture over Virtualized Data Centers:
Data Center Design and Networking
What is a Data Center? What does a Data Center Look Like? Warehouse-Scale Data Center Design, Power and Cooling Requirements, Data-Center Interconnection Networks, Design Considerations for WSC


- The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, L. A. Barroso, U. Hölzle, Google Inc., 2009.

- High Performance Datacenter Networks, Architectures, Algorithms, and Opportunities, D. Abts, J. Kim, 2011.

- A Guided Tour through Data-center Networking, D. Abts, B. Felderman, ACM Queue, May 3, 2012.

- A Scalable, Commodity Data Center Network Architecture, M. Al-Fares, A. Loukissas, A. Vahdat, SIGCOMM’08, August 17–22, 2008.

Videos on Data Centers:

- Explore a Google Data Center with Street View

- Google Container Data Center


Lec 5

Cloud Platform Architecture over Virtualized Data Centers:
Cloud Computing Service Models
Cloud Computing Services Stack, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), Today’s Cloud Services Stack, Public, Private & Hybrid Clouds, Market-Oriented Cloud Architecture, Inter-Cloud Resource Management, Cloud Security and Trust Management

- Amazon Web Services (AWS)
- Getting Started with AWS
- Introduction to Amazon Web Services (video tutorial)

- Google App Engine

- Introduction to Google App Engine For Developers (video tutorial)

- Microsoft Azure

Lec 6

Cloud Platform Architecture over Virtualized Data Centers:
Major Cloud Service Providers
Public Clouds, Amazon Web Services (AWS), Google App Engine, Microsoft Azure
Midterm Exam
Lec 7.1

Cloud Programming and Software Environments:
MapReduce and Hadoop Framework

Big Data and Parallel Computing, History of MapReduce, New Parallel Programming Paradigm: MapReduce, The MapReduce Programming Model, Hadoop Framework, Writing Jobs for Hadoop, Hadoop Distributed File System (HDFS), Hadoop Internals, Hadoop 1.0 vs 2.0, MapReduce Cloud Service


- The Google File System, S. Ghemawat et al., SOSP, 2003.

- MapReduce: Simplified Data Processing on Large Clusters, J. Dean, S. Ghemawat, OSDI, 2004.

- Hadoop home page

- Beyond Batch: The Evolution of the Hadoop Ecosystem, Doug Cutting

- HDFS-Comics

- MapReduce Tutorial (Apache Hadoop 1.2.1) 
- MapReduce Tutorial (Apache Hadoop 2.6.0)
- Google MapReduce Tutorial

Lec 7.2

Cloud Programming and Software Environments:
Introduction to YARN and MapReduce 2
Overview of MapReduce 1 and 2, YARN Architecture, MapReduce v2, Managing a YARN Cluster, Cloudera and MR2


- Hadoop Tutorial: Introducing Apache Hadoop (17 minutes) 
- Hadoop Tutorial: Intro To Hadoop Developer Training | Cloudera (1 hour)
- MapReduce Programming Demo - Global Climate Analysis Example from Hadoop: The Definitive Guide
- Hadoop - Just the Basics for Big Data Rookies (1 hour 25 minutes) 
- Big Data and Hadoop Tutorials (28 videos, 20 hours)
- Hadoop MapReduce Fundamentals 1 of 5 
- Intro To MapReduce 

Lec 7.3

Lec 7.4


Cloud Programming and Software Environments:
Hadoop MapReduce 2 Tutorial

Hadoop Ecosystem and HPC Integration

- Hadoop installation and configuration on notebooks: 1-, 2-, and 4-node clusters using the Cloudera 4.1.1 and 5.3 Hadoop distributions
- Hadoop installation and configuration on FutureSystems - Indiana University Clusters, our project portal address

Lec 8


Big Data Applications & Analytics Case Study
K-Means, Analysis of 4 Artificial Clusters, K-Means in Java using Mahout, MapReduce Revisited: Advanced Topics, K-Means and MapReduce Parallelism, PageRank
Lec 9

How to Store Data (NoSQL)
RDBMS vs NoSQL, NoSQL Characteristics, Bigtable, HBase, HBase Coding, Indexing Technologies, Related Work, Social Media Searches, Analysis Algorithms


Lec 10
How to Build a Search Engine (SaaS)
Architecture for a Search Engine, Google Architecture, Evolution of Google’s Search Systems


  Project Demonstrations    

  Important Notes


The lecture schedule given in the syllabus is tentative and is updated here weekly. Check this table once a week.


Almost all the slides used during the semester will be available here.


I may skip several slides during the lecture (the full set is generally more than we can cover). They are included in the course material for completeness and as a good reference for your future professional engineering life.


To follow the lecture and better understand the material presented in class, get the lecture slides, print them out, and bring them to class.


Purposes for bringing slides to class: 1) To allow better concentration in lecture by reducing note-taking pressure, and to provide a study aid before and after lecture.


2) You can take your notes on these slides and be active during the lecture. You digest material much better when you actively take notes from the step-by-step demonstrations given by your instructor than when you just sit and watch slides.


Disclaimers: (a) I may not follow these slides exactly in class. (b) As needed, I may also use the whiteboard and give extra notes in class that will not be posted here. (c) Students are responsible for what I say and teach in class. (d) Reading these slides is not a substitute for attending lecture.