CSC 591D-006

Special Topics on Distributed Systems

Spring 2008
Credits: 3
Meeting Times: Tuesday/Thursday, 3:50 - 5:05pm
Meeting Location: EBII 1220
Wolfware Course Web

Instructor Information:

Instructor: Xiaohui (Helen) Gu
Contact: EB-II 3296, gu@csc.ncsu.edu
Office Hours: Day hours in EB-II 3296, or by appointment

Course Objectives: 

This course explores design and implementation principles in modern distributed systems. In particular, the course will emphasize the automatic management of complex computer systems. Students will learn the state of the art in distributed system architectures, algorithms, and performance evaluation methodologies. Topics include peer-to-peer/overlay/Grid systems, autonomic computing, distributed data processing, robust/secure distributed systems, and distributed system debugging. On completing this course, the student should be able to do the following:

 

  • Identify research problems and challenges in distributed systems (assessed by review and presentation);
  • List the state-of-the-art tools and techniques for addressing research problems and challenges in distributed systems (assessed by review and presentation);
  • Develop and implement new ideas to solve open problems in distributed systems (assessed by project);
  • Conduct technical reviews, technical writing, and technical presentations (assessed by review, project, paper, presentation).

Text Books:

There are no assigned textbooks for this course. Topics will be covered during in-class lectures, and through course notes made available on this web page.

Links to supplementary material, in the form of research papers related to each topic, are included in this syllabus. PDFs of most papers are available through the NCSU library web site, which has full-text access to most recent ACM and IEEE journals and conferences. A number of supplemental distributed systems textbooks are also available:

 

Distributed Systems: Concepts and Design, (4th Edition), G. Coulouris, J. Dollimore, and T. Kindberg

Distributed Systems (2nd Edition), Sape Mullender

Distributed Systems: Principles and Paradigms, Andrew S. Tanenbaum, Maarten van Steen

Course Description

Distributed systems have become the fundamental computing infrastructure for many important real-world applications such as Web search engines, media streaming servers, information analytics, and scientific exploration. This course explores design and implementation principles in modern distributed systems. In particular, the course will emphasize the automatic management of complex computer systems. Students will learn the state of the art in distributed system architectures, algorithms, and performance evaluation methodologies. Topics include peer-to-peer systems, overlay systems, Grid, middleware, autonomic computing, data stream processing, system reliability, and system security. Students will have opportunities not only to learn the common design methodology of many important distributed systems, but also to gain hands-on experience through project implementations.

The majority of course materials will be drawn from classic papers and current state-of-the-art work. The instructor will lecture for the first half of the semester, and students will present papers and projects in the second half. Students will read and review papers ahead of time, participate in class discussions, present at least one research topic during the course, and do a term project individually or in a two-member team. Students will also write a paper describing their project, review other students' papers, and present their work at the end of the course in a "conference" format designed to give students an experience similar to that of participating in a professional conference.

Prerequisites:

CSC 501, CSC 246, or equivalents. Programming experience in C++ or Java in a Unix environment. If you are not sure whether you are prepared to take this course, please consult the instructor.

Tentative Grading Policy

Written reviews 20%, class participation 20% (presentation 10%, discussion 10%), project 60% (proposal 10%, demo 20%, presentation 10%, final write-up 20%).

Late policy: Lateness is calculated from the time recorded on the assignment email received by the instructor. Students will lose 25% for each 24-hour period they are late on reviews, project, or paper.
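As a rough illustration of how the weights and the late penalty above combine, here is a minimal sketch in Python. It assumes that each started 24-hour period deducts 25% of an item's full credit and that credit never drops below zero; the syllabus does not spell out these details, so please confirm with the instructor. The names WEIGHTS, late_penalty_factor, and final_grade are hypothetical and used only for illustration.

```python
import math

# Component weights taken from the grading policy above
# (fractions of the final grade).
WEIGHTS = {
    "reviews": 0.20,
    "paper_presentation": 0.10,
    "discussion": 0.10,
    "project_proposal": 0.10,
    "project_demo": 0.20,
    "project_presentation": 0.10,
    "project_writeup": 0.20,
}


def late_penalty_factor(hours_late):
    """Return the fraction of credit kept after the late penalty.

    Assumption: each started 24-hour period costs 25% of full credit,
    and credit never drops below zero.
    """
    if hours_late <= 0:
        return 1.0
    periods = math.ceil(hours_late / 24)
    return max(0.0, 1.0 - 0.25 * periods)


def final_grade(scores):
    """Return the weighted sum of component scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

Under these assumptions, a review submitted 30 hours late falls in its second 24-hour period, so late_penalty_factor(30) keeps half of the credit.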

 

Paper Review:

 

Review guidelines: Provide a one-paragraph summary of the paper, a paragraph on 2-3 strong points of the paper (i.e., why the paper should be accepted), a paragraph on 2-3 weak points of the paper (i.e., why the paper should be rejected), and, optionally, brainstormed ideas for new research related to the work described in the paper.

 

Project:

 

Suggested Term Project Topics (NCSU unity ID required).

 

Both the project proposal and the final report should follow typical paper requirements and use the ACM double-column paper format. The project proposal should include an abstract, an introduction, and proposed approaches. The final project report should be a full paper, including an abstract, introduction, design and algorithms, experimental evaluation, related work, and conclusion. We will organize a mini-conference for students to present their project work. The three best papers will be selected during the mini-conference.

Tentative Class Schedule:

Week

 Date

Topic

Assigned Readings

Assignments

1

01/10

Introduction

[slides]

 

Chapter 1, Distributed Systems: Concepts and Design

 

 

Investigate your term project idea and prepare for it. A list of candidate project topics will also be provided in class. Talk to the instructor about your project idea, and talk to other students about forming a two- to three-member group. Email the instructor to set up an appointment.

 

Paper presentation signup. Please send an email to the instructor to bid on three papers from the list below, listing your choices in decreasing order of preference. You will be allocated one paper to present on a first-come, first-served (FCFS) basis, subject to paper availability.

2

01/15

RPC, Distributed Objects, Middleware

[slides]

 

Chapter 5, Distributed Systems: Concepts and Design

 

Investigate your term project idea and prepare for it. Talk to the instructor about your project idea, and talk to other students about forming a group if you would like to work in a two-member group.

Sunday midnight: reviews due for "Time, clocks, and the ordering of events in a distributed system", L. Lamport, Communications of the ACM, 1978, and "Distributed snapshots: determining global states of distributed systems", Chandy and Lamport, ACM TOCS, 1985.

Please read the assigned papers on how to review papers and follow the review format described above.

01/17

Replication

[slides]

Chapter 14, Distributed Systems: Concepts and Design

3

01/22

Distributed System Security

[slides]

Chapter 9, Distributed Systems: Concepts and Design

 

Investigate your term project idea and prepare for it. Talk to the instructor about your project idea, and talk to other students about forming a group if you would like to work in a two-member group.

Sunday midnight: reviews due for A. Rowstron and P. Druschel, "Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems", Middleware, 2001, and Jin Liang, Xiaohui Gu, Klara Nahrstedt, "Self-Configuring Information Management for Large-Scale Service Overlays", Proc. of IEEE INFOCOM, Anchorage, Alaska, May 2007.

 

01/24

Classical Distributed System Concepts

[slides]

4

01/29

Peer-to-Peer Systems

[slides]

 

Sunday midnight: project proposal due.

 

No reviews due this Sunday. You should spend time on preparing your project proposal presentation.

01/31

Service Overlay Networks

[slides]

  • D. Andersen and H. Balakrishnan and F. Kaashoek and R. Morris, Resilient Overlay Networks, Proc. 18th ACM SOSP, 2001.
  • Y. Chu and S. G. Rao and S. Seshan and H. Zhang, A Case For End System Multicast, IEEE Journal on Selected Areas in Communication (JSAC), Special Issue on Networking Support for Multicast, 2002.

5

02/05

System Research Methodology

[slides]

Sunday midnight: reviews due for P. Barham and A. Donnelly and R. Isaacs and R. Mortier, Using Magpie for request extraction and workload modeling, Proc. of OSDI, 2004 and C. Killian and J. W. Anderson and R. Jhala and A. Vahdat, Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code, Proc. of NSDI, 2007.

 

02/07

Student Project Proposal Presentation

 

6

02/12

Student Project Proposal Presentation

 

 

Sunday midnight: reviews due for I. Cohen and M. Goldszmidt and T. Kelly and J. Symons and J. S. Chase, Correlating Instrumentation Data to System States: A Building Block for Automated Diagnosis and Control, Proc. of OSDI, 2004, and I. Cohen and S. Zhang and M. Goldszmidt and J. Symons and T. Kelly and A. Fox, Capturing, indexing, clustering, and retrieving system history, Proc. of SOSP, 2005.

 

02/14

Component Composition

[slides]

7

02/19

Large-Scale Distributed System Management

[slides]

 

Sunday midnight: reviews due for N. Tatbul and U. Cetintemel and S. Zdonik and M. Cherniack and M. Stonebraker, Load Shedding in a Data Stream Manager, Proc. of VLDB, 2003, and Xiaohui Gu, Philip S. Yu, Haixun Wang, "Adaptive Load Diffusion for Multiway Windowed Stream Joins", IEEE International Conference on Data Engineering (ICDE), Istanbul, Turkey, April 2007.

 

02/21

Autonomic Computing/
System Mining

[slides]

8

02/26

Autonomic Computing/
System Debugging

[slides]

Sunday midnight: reviews due for B. Urgaonkar and G. Pacifici and P. Shenoy and M. Spreitzer and A. Tantawi, An analytical model for multi-tier Internet services and its applications, Proc. of SIGMETRICS, 2005, and C. Stewart and K. Shen, Performance modeling and system management for multi-component online services, Proc. of NSDI, 2005.

 

02/28

Look inside Google

[slides]

9

03/04

Spring break

  • No class

 

 

03/06

Spring break

  • No class

10

03/11

Distributed Stream Processing System (DSPS)

[slides]

Sunday midnight: reviews due for Ian Rose, Rohan Murty, Peter Pietzuch, Jonathan Ledlie, Mema Roussopoulos, Matt Welsh, "Cobra: Content-Based Filtering and Aggregation of Blogs and RSS Feeds", Proc. of Symposium on Networked Systems Design & Implementation (NSDI), April 2007, and Nikolaos Michalakis, Robert Soulé, and Robert Grimm, Ensuring Content Integrity for Untrusted Peer-to-Peer Content Distribution Networks, Proc. of NSDI, 2007.

 

03/13

Student Presentation

 

11

03/18

Student Presentation

 

 

No paper reading assigned. You should spend time on your term projects.

 

03/20

Student Presentation

 

12

03/25

Student Presentation

 



No paper reading assigned. You should spend time on your term projects.

 

 

03/27

Student Presentation

 

13

04/01

Student Presentation

 

 

 

No paper reading assigned. You should spend time on your term projects.

 

04/03

Student Presentation

 

14

04/08

Student Presentation

 

 

No paper reading assigned. You should spend time on your term projects.

 

04/10

Student Presentation

 

15

04/15

Spring holidays

  • No class 

 

 

No paper reading assigned. You should spend time on your term projects.

 

04/17

Student Presentation

 

16

04/22

Mini-Conference for Project Presentation

 

 

 

No paper reading assigned. You should spend time on your term projects.

 

04/24

Mini-Conference for Project Presentation

 

17

04/29

Mini-Conference for Project Presentation

 

 

Before Friday midnight: final project report, project source code, and documentation due.

 

Your project source code and documentation should be submitted as a single zip file. The zip file should include your system source code with all dependent packages, the experimental subjects used in the project report, instructions on how to set up and use the system to reproduce the experimental results, and any other documents that help others understand your source code.

05/01

Mini-Conference for Project Presentation

 

 


 

Suggested Topics for Student Presentations (you may also propose to the instructor papers that are not on this list but that you would like to present):

Peer-to-Peer Systems


Failure Management

Distributed Systems in Real World

Performance Modeling

Large-Scale Data Analytics System

Multi-Core System

Application Decomposition

Distributed System Security

System Monitoring

Sensor Network Systems

Grid Computing

Content Distribution Systems

 

Academic Integrity

The university provides a detailed policy on academic integrity. This policy can be found in the Code of Student Conduct. It is understood that when you submit your homework, you are implicitly agreeing to the university honor pledge: "I have neither given nor received unauthorized aid on this test or assignment."

Academic dishonesty (e.g., cheating or plagiarism) will not be tolerated under any circumstances. If you are having difficulty with any part of the course material, please see me as soon as possible. I will do everything I can to help you with any course-related problems you may be having. If you are found to be guilty of academic dishonesty, however, I will then do everything I can to see that you are punished as forcefully as possible. This may include asking to have you suspended or expelled from the course, the program, and/or the university. At a minimum, you will receive -50% for the assignment in question, and your name will be placed on record with the university as having committed an academic offence. Multiple offences during your academic career will result in suspension or expulsion from the university. I take absolutely no pleasure in pursuing cases of academic misconduct, and would ask that you please not put me in this position.

Students With Disabilities

Every effort will be made to ensure that no student with a disability is denied any opportunity to successfully complete this course. If you have specific requirements that need to be addressed, please contact me immediately. Possible changes can include (but are not necessarily limited to) rescheduling classes from inaccessible to accessible buildings, or providing access to auxiliary aids such as tape recorders, special lab equipment, or other services such as readers, note takers, or interpreters. This may also include oral or taped tests, readers, scribes, separate testing rooms, or extension of time limits.

Lab Safety Issues

None.

Pass-Through Costs

None.