Cloud Computing Seminar Fall 2009

“If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry.” (John McCarthy, MIT Centennial, 1961)

1 Introduction

Cloud Computing: what is it? Here are some "answers" from places you might not expect.

  1. From Curtis Clark in S&D's Public Sector, 'Yesterday, in his 2010 Budget, President Obama highlighted cloud computing as a key tool for improving innovation, efficiency and effectiveness in Federal IT. "Cloud-computing is a convenient, on-demand model for network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. The cloud element of cloud-computing derives from a metaphor used for the Internet, from the way it is often depicted in computer network diagrams. Conceptually it refers to a model of scalable, real-time, internet-based information technology services and resources, satisfying the computing needs of users, without the users incurring the costs of maintaining the underlying infrastructure. Examples in the private sector involve providing common business applications online, which are accessed from a web browser, with software and data stored on the “cloud” provider’s servers."'

  2. NPR, "Do you have a Yahoo e-mail account? Maybe a Gmail account? Do you put up pictures on Flickr? Perhaps you've started keeping your schedule online. If so, then you are using cloud computing - that's what tech companies call it when people work and store information on the Internet." http://www.npr.org/templates/story/story.php?storyId=93841182

  3. CNN, "What it is? A service that lets anyone rent unlimited hard drive space and/or processing power via the Internet. Also known as, Grid computing, Utility computing, Elastic computing and Software as a service (SaaS)" http://money.cnn.com/2009/02/03/smallbusiness/cloud_computing.fsb/index.htm?postversion=2009020405

  4. Chicago Tonight, a nightly news program, "Cloud Computing is just a new way to think about where the computation is happening. Instead of happening on your machine it is happening on a lot of machines all at once all over the world and so you don't have to worry about what is going on on your machine so much as somebody else has to worry about a service for you where all the computation is going on" http://video.whyy.org/video/1184344044/subject/957383701#

  5. Irving Wladawsky-Berger, "Our speakers put cloud computing in its historical context - from the Cambrian explosion all the way through the rise of electric utilities. They showed how businesses today can use this emerging technology for cost-effective and powerful computing solutions. And they also gave us a feel for the real paradigm shift the cloud could bring to the computing world, especially how large software companies might find themselves vulnerable to disruption (although some firms are still in denial about this)." In other words, something big and profound seems to be going on, although we are not totally sure what it is yet. http://alwayson.goingon.com/permalink/post/28058

Some industry experts are telling us that Cloud Computing is the next "Big Thing":

  1. "Some of us feel that cloud computing may very well be The Next Big Thing - one of those massive changes that the IT industry goes through from time to time that really shake things up - like the advent of personal computers in the 1980s and the Web in the next decade. Others - a minority in this meeting - feel that this is the IT industry engaged in one of its periodic hype cycles." Irving Wladawsky-Berger summarizing an Economist article.

  2. Gartner, "Cloud computing heralds an evolution of business that is no less influential than e-business, according to Gartner Inc. Gartner maintains that the very confusion and contradiction that surrounds the term "cloud computing" signifies its potential to change the status quo in the IT market.", http://www.gartner.com/it/page.jsp?id=707508

So what do you think?

As you can tell from the above, whether you are surfing the net, listening to the radio or watching TV, you will invariably hear the words “Cloud Computing”. But what is it? For that matter, is it really “something”, or is it a pop culture interpretation of technology, or just hype from companies? Where did it come from and when? Are there technologies that actually compose it? What should I tell my friends when they ask me about Cloud Computing? What does it mean to me as a computer scientist? What does it mean to me as a citizen? What does it mean to me as a netizen? What does it mean to me as an entrepreneur? Is it the realization of the dreams of McCarthy and other pioneers?

These are some of the questions that we will explore in this seminar course through readings, discussions, presentations and projects. Please come join us, but be ready to participate.

2 Formal Description

CAS CS 591 : Special Topics : Cloud Computing Seminar Fall 2009


This is an interactive seminar course exploring Cloud Computing, including its definition, history, and realization from a computer science perspective. We will focus on systems-related topics, including production and transmission of computational capacity, physical and virtual consolidation, centralized and distributed ownership, costs, metering, usage models, impact on software development, and efficiency. We will also touch upon the technical aspects of the attendant socioeconomic issues raised by Cloud Computing that were identified by the early computer scientists who pioneered these ideas. The course will require the reading, review and critique of literature from various sources, predominantly drawn from systems-related research literature. Projects and presentations will give students the opportunity to explore one or more of the topics discussed in class in a hands-on fashion.

Prerequisites: The course is open to advanced undergraduates and graduate students who are comfortable reading academic literature. Students should have taken CS 350 and CS 330 (or equivalents) and be comfortable with operating system and networking topics. Exceptions will be made at the discretion of the instructor.

WARNING: Implementation-based projects may require considerable “on-the-job training” and an affinity for hacking.

5 Class Organization and Details


This is an interactive seminar course in which each week we will read several papers that we will discuss as a group. Each student will prepare and give a presentation during a class session on a specific topic of their choice, agreed upon with the instructor. Each student will also coordinate with the instructor on a project to be submitted by the end of the semester.

6 Evaluation

The following will probably have some minor adjustments

7 Class Format


Each student will submit, at the beginning of class, a brief one-page summary of each paper read. The summary must include 3 questions that the paper raised for you. The class will then proceed by reviewing the papers as a group with a summary leader and a discussion leader.

8 Class Schedule and Topics

9 Presentation and Project Ideas/Topics


  1. Build a basic EC2-style environment using Eucalyptus

  2. AMI Development Service

  3. Build a FUSE-based file system that permits overloading of S3 names

  4. Conduct a sensitivity analysis of two-node LINPACK running virtualized

  5. Develop a Ruby on Rails site for XYZ.

  6. Do a written study of third-party EC2-based offerings

  7. Do a written study comparing and contrasting the Google App Engine cloud model to the Microsoft Azure model

  8. Develop an iPhone app and associated RoR backend for personal image, audio and GPS track publishing (using Google Maps)

  9. Develop an iPhone app and associated RoR site for CafePress storefront creation and maintenance.

  10. Construct a BU Linux Host factory for EC2

  11. Construct an Archive Print Service which produces a website that archives all documents and links a user "prints"

  12. Construct a transcription service which utilizes Sphinx

  13. Utilize your favorite image/video software and construct a service (optionally construct an iPhone client app)

  14. Do a study of VMware's vCloud and compare it to Eucalyptus, EC2 and OpenQRM

  15. Security and Privacy Study / Presentation

  16. Study global energy and environmental impacts of Cloud Computing

  17. Do a presentation on RoR

  18. Do a study or presentation comparing Seaside (http://www.seaside.st/) and RoR

  19. Present AJAX and discuss the class of technology it represents and how it relates to Cloud Computing.

10 Reading List Pool


Many of the readings will be taken from the following papers.

Overview

  1. "Above the Clouds: A Berkeley View of Cloud Computing", Michael Armbrust et al., Electrical Engineering and Computer Sciences, University of California at Berkeley, Technical Report No. UCB/EECS-2009-28, February 10, 2009

  2. "Toward a Unified Ontology of Cloud Computing", Lamia Youseff et al., Grid Computing Environments Workshop (GCE08), Nov 2008

  3. "Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities", Rajkumar Buyya, 2008

History

General Internet

  1. "A Brief History of the Internet", Leiner et al., Dec 2003

  2. "Netizens: On the History and Impact of Usenet and the Internet", Hauben et al., 1996

Licklider

  1. “Man-Computer Symbiosis”, Licklider, 1960

  2. "ON-LINE MAN-COMPUTER COMMUNICATION", Licklider et al., 1962

  3. Memorandum For Members and Affiliates of the Intergalactic Computer Network, Licklider, 1963

  4. “The Computer as a Communication Device”, Licklider et al., 1968

Dennis

  1. Toward the Computer Utility: A Career in Computer System Architecture, Jack B. Dennis, 2008

McCarthy

  1. "THE HOME INFORMATION TERMINAL -- A 1970 View", McCarthy, 2001 (original 1970)

Greenberger

  1. "The Computers of Tomorrow", Greenberger, 1964

Frankston

  1. THE COMPUTER UTILITY AS A MARKETPLACE FOR COMPUTER SERVICES, Robert M. Frankston, 1973 MIT MSc. Thesis

Fano et al.

  1. "Some Thoughts About the Social Implications of Accessible Computing", David, 1965

Corbató

  1. "AN EXPERIMENTAL TIME-SHARING SYSTEM", Corbató et al., 1962

Resource Allocation

"Why Markets Could (But Don't Currently) Solve Resource Allocation Problems in Systems", Shneidman et al., June 2005

"Two Auction-Based Resource Allocation Environments: Design and Experience", Alvin AuYoung et al., 2009

Not yet categorized

"Integrated Risk Analysis for a Commercial Computing Service in Utility Computing", Yeo et al., 2008

"The Cloud Is The Computer", Paul McFedries, IEEE Spectrum, 2008

"A break in the clouds: towards a cloud definition", Vaquero et al., 2008

"SOME FUNDAMENTALS OF PRICE THEORY FOR COMPUTER SERVICES", Cotton, 1976

"Considerations for computer utility pricing policies", Daniel S. Diamond (MIT) and Lee L. Selwyn (Boston University), 1968

"The Working Set Model for Program Behaviour", Denning, 1968

"A Futures Market in Computer Time", Sutherland, 1968

"The Allocation of Computer Resources -- Is Pricing the Answer?", Nielsen, 1970

"Microeconomics and the Market for Computer Services", Cotton, 1975

"Project Kittyhawk: Building a Global-Scale Computer Blue Gene/P as a Generic Computing Platform", Appavoo et al., 2008

"IT Doesn't Matter", Carr, 2003

"The Big Switch to Clouds", Video address Structures 2008 Conference, Carr

"Open Cloud Manifesto", 2008

"Google App Engine"

"Amazon EC2"

"IBM Cloud"

"Microsoft Azure"

"VMware vCloud"

Eucalyptus

"SOSP LADIS 2009"

"Challenge of the Computer Utility", D F Parkhill, 1966

"Ruby on Rails (RoR)"

"php"

"Web 2.0"

"Rightscale"

"MapReduce: Simplified Data Processing on Large Clusters", Dean et al., OSDI 2004

"Bigtable: A Distributed Storage System for Structured Data", Chang et al., OSDI 2006

"Dynamo: Amazon’s Highly Available Key-value Store", DeCandia et al., SOSP 2007

"AJAX"

"The Google Platform"

WEBSEARCH FOR A PLANET: THE GOOGLE CLUSTER ARCHITECTURE, Barroso et al., IEEE Micro, 2003

"Google uncloaks once-secret server", Stephen Shankland, CNet 2009

"Internet-Scale Service Infrastructure Efficiency", James Hamilton, Keynote ISCA 2009

"Where Does the Power Go in High-Scale Data Centers", Talk by James Hamilton, Sigmetrics 2009

"Cloud Computing Economies of Scale", James Hamilton, Keynote Self managed database systems, 2009

"The Cost of a Cloud: Research Problems in Data Center Networks", Greenberg et al., SIGCOMM 2009

"Autopilot: Automatic Data Center Management", Isard, OSR 2007

"Toward a Doctrine of Containment: Grid Hosting with Adaptive Resource Control", Ramakrishnan et al., Supercomputing 2006

"Sharing Networked Resources with Brokered Leases", Irwin et al., USENIX 2006

"Correlating instrumentation data to system states: A building block for automated diagnosis and control", Cohen et al., OSDI 2004

"A Scalable, Commodity, Data Center Network Architecture", Mohammad Al-Fares et al., SIGCOMM 2008

"Difference Engine: Harnessing Memory Redundancy in Virtual Machines", Gupta et al., OSDI 08

"Revealed: the environmental impact of Google searches", Times online 2009

"Worldwide Server Power and Cooling Expense 2006-2010 Forecast", Jed Scaramella, IDC 2006

"ESTIMATING TOTAL POWER CONSUMPTION BY SERVERS IN THE U.S. AND THE WORLD", Jonathan G. Koomey, Feb 2007, Study commissioned by AMD

"Google App Engine Run your web applications on Google's infrastructure", Stanford 2008 ee380 talk, 2008

"A Head in the Cloud - The Power of Infrastructure as a Service", Vogels, Stanford ee380 talk, 2008

"Scalable Privacy-Friendly Client Cloud Computing: a gathering Perfect Disruption", Hewitt, Stanford ee380 talk, 2008

"Enernet: Internet Lessons for Solving Energy", Metcalfe, Stanford ee380 talk, 2008

"The Connection Machine", Hillis, 1987.

“Can Cloud Computing Reach The TOP500”, Napper et al., Workshop on UnConventional High Performance Computing, 2009

“VEE’09”

“Kittyhawk: Enabling cooperation and competition in a global, shared computational system”, Appavoo et al. , IBM Sys Journal, 2009.

“The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines”, Synthesis Lectures on Computer Architecture,  Luiz André Barroso; Google Inc. and Urs Hölzle; Google Inc., 2009.

“From Niches to Riches: The Anatomy of the Long Tail”, Brynjolfsson et al., Sloan Management Review, Summer 2006, Vol. 47, No. 4, pp. 67-71.

“The Long Tail”, Chris Anderson, Wired, 2004.

“Centralized versus decentralized computing: organizational considerations and management options”,  King, ACM Computing Surveys, 1983.

“The Real Cost of a CPU Hour”, Walker 2009.

“Data Security in the World of Cloud Computing”, Harauz et al. Aug 2009.

“Post-Copy Based Live Virtual Machine Migration Using Adaptive Pre-Paging and Dynamic Self-Ballooning”, Hines et al. VEE 2009.

“HotCloud 2009”

“The Case for Enterprise-Ready Virtual Private Clouds”, Wood et al., HotCloud 2009.

“PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric”,  Mysore et al., SIGCOMM 2009.

“VL2: A Scalable and Flexible Data Center Network”, Greenberg et al., SIGCOMM 2009.

“BCube: A High Performance, Server-centric Network Architecture for Modular Data Centers”, Guo et al., SIGCOMM 2009.

“Xen and the Art of Virtualization”,  Barham et al., SOSP 2003.

“Are Virtual Machine Monitors Microkernels Done Right?”,  Steven Hand et al., HotOS 2005.

“The Impact of Virtualization on Computing Systems.”, Rosenblum, 2007.

“Applications Programming in Smalltalk-80(TM): How to use Model-View-Controller (MVC)”, Burbeck, 1987.

“What Is Ruby on Rails”, Hibbs, 2005.

“True value: assessing and optimizing the cost of computing at the data center level”, Karidis et al., Conference on Computing Frontiers, 2009.

“The Architecture of Virtual Machines”, Smith et al., IEEE 2005.

“Hey, You, Get Off of My Cloud! Exploring Information Leakage in Third-Party Compute Clouds”, Ristenpart et al., proc. ACM Conference on Computer and Communications Security 2009.

“Intel® Virtualization Technology: Hardware support for efficient processor virtualization”, Neiger et al., Intel® Technology Journal, 2006.

“Intel® Virtualization Technology for Directed I/O”, Abramson et al., Intel® Technology Journal, 2006.

“Understanding Full Virtualization, Paravirtualization, and Hardware Assist”, VMware White Paper.

“Cloud9: A Software Testing Service”, Ciortea et al. LADIS 2009.

4 Logistics

Location: MCS 137
Times: Fridays 10:00am - 1:00pm

Instructor: Jonathan Appavoo
Office: MCS 284
Office Hours: Tuesdays 10-12 & Thursdays 10-12

3 Announcements & News


Presentation Guidelines and Suggestions:

Example outline/breakdown:

  1. Introductions: The Problem/The Idea

  2. Cloud Computing Context

  3. Design and Overview

  4. Educate/Teach 3-5 Technical Aspects unique to your topic

  5. Problems Encountered

  6. Current Status

  7. Discussion: Present 3-5 questions or observations that you have for discussion.

Things I will be looking for:

  A) Content: 5 Technical Points and Observations with respect to: Scalability, Performance, Resource Management, Security, Ease of Use, Compatibility, Composability, Design.

  B) Comprehension: Show that you can comment on how your topic relates to the following aspects: General Cloud Computing Relationship, Economics, Data Centers, Virtualization, Programming, Social and Political implications. Note that your topic may not relate to all aspects, but I expect you to be able to identify which ones it does and comment on how it does.

  C) Clarity: Use of Examples, Presentation of Methodology, Identification and articulation of Critical Questions, Evaluation of the technologies, Expression of your personal ideas and identification of the implications of your topic or of things you learnt about while exploring your topic.

I encourage you to walk through technical aspects with examples and source code.  I also encourage you to come and go over your presentation with me.  Be sure to send me email to give me a heads up.

---------------------------------------

WARNING: THE IMAGE REFERENCED BELOW WAS BUILT FROM A BASE IMAGE FROM THE INTERNET (http://chrysaor.info/?page=ubuntu), SO USE IT CAUTIOUSLY!

Added some resources to help with Ruby on Rails development: http://www.cs.bu.edu/fac/jappavoo/Resources/ror-vmware-app. Here you will find a VMware image that you should be able to boot with VMware Player and log in to with userid: user and the password discussed in class. You can then sudo any root commands you need to run.

You may find it a little tricky to get the networking configured. You can of course configure things any way you like, but I suggest that you set VMware Player to host-only networking and reboot the virtual machine. The tricky part is that the network adaptor may not end up being eth0; if it is not, you will need to manually change the configuration in the virtual machine’s Linux instance. Specifically: determine the name of the adaptor via invocations of ifconfig ethX. Start your search with X=0 and continue up; if you don’t find an adaptor with X<=5, get help from me. Once you find the adaptor, edit /etc/network/interfaces, adding lines for the interface you found. Base the new lines on the lines for eth0. (Don’t forget that you will need to sudo to edit the file, e.g. ‘sudo vi /etc/network/interfaces’.) Then bring the interface up with ‘sudo ifup ethX’. You should then be able to access the machine from the host that you are running VMware Player on.
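The adaptor search described above can be scripted. What follows is only a rough sketch, not something tested on the class image: find_adaptor and stanza_for are made-up helper names. It assumes that ifconfig exits with a non-zero status for an interface that does not exist, and that the eth0 entry in /etc/network/interfaces is a simple one in which every line names the interface (for example "auto eth0" and "iface eth0 inet dhcp").

```shell
#!/bin/sh
# Sketch only: find_adaptor and stanza_for are hypothetical helper names.

# Print the first of eth0..eth5 that the probe command recognises.
# The probe defaults to ifconfig; another command can be passed in.
find_adaptor() {
    probe=${1:-ifconfig}
    for x in 0 1 2 3 4 5; do
        if "$probe" "eth$x" >/dev/null 2>&1; then
            echo "eth$x"
            return 0
        fi
    done
    return 1   # nothing found with X<=5
}

# Print the lines of an interfaces file that mention eth0, rewritten
# for the adaptor named in $1. The file defaults to
# /etc/network/interfaces. Lines that do not mention eth0 are skipped,
# so a multi-line static configuration would need manual editing.
stanza_for() {
    sed -n "s/eth0/$1/gp" "${2:-/etc/network/interfaces}"
}

# Possible use inside the virtual machine when the adaptor is not eth0:
#   ifc=$(find_adaptor) || echo "no adaptor found, ask for help"
#   stanza_for "$ifc" | sudo tee -a /etc/network/interfaces
#   sudo ifup "$ifc"
```

If find_adaptor prints eth0, the existing configuration already matches and nothing needs to be appended.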

There is an example Ruby on Rails application in ~user/www/mynewapp. To try it, ‘cd ~user/www/mynewapp’ and then issue ‘ruby scripts/server’. This starts a built-in webserver and makes the app available. You can then try accessing the app either from within the virtual machine or, if you configured the networking correctly, from the host that you are running VMware Player on. Note that the server will by default be listening on port 3000; e.g. to access the app from within the virtual machine you should be able to issue the command ‘firefox http://localhost:3000’. HAVE FUN.