This class teaches design and implementation techniques for building fast, scalable, fault-tolerant distributed systems.
Distributed systems let developers leverage computing and storage resources across many networked computers to supply services that are fast, highly available, and scalable. The course covers the fundamental concepts behind the design and implementation of such systems. Topics include communication models for distributed machines (e.g., sockets, RPCs); synchronization (e.g., clock synchronization, logical clocks, vector clocks); failure recovery (e.g., snapshotting, primary-backup); consistency models (e.g., linearizability, eventual, causal); consensus protocols (e.g., Paxos, Raft); distributed transactions; and locking. Students gain hands-on, practical experience with these topics through a sequence of programming assignments that work through the steps of creating a fault-tolerant, sharded key/value store. The course also explores how these concepts appear in several real-world, large-scale distributed systems run by Internet companies such as Google, Facebook, and Amazon.
Programming Language: Go
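The assignments are written in Go, whose standard net/rpc package maps directly onto the RPC communication model mentioned above. A minimal, self-contained round trip looks roughly like the sketch below; the KV type, its Get method, and the port number are invented for illustration and are not part of the course code.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

type GetArgs struct{ Key string }
type GetReply struct{ Value string }

// KV is a trivial RPC server: exported methods with an
// (args, *reply) error signature become remotely callable.
type KV struct{ data map[string]string }

func (kv *KV) Get(args *GetArgs, reply *GetReply) error {
	reply.Value = kv.data[args.Key]
	return nil
}

func main() {
	// Server side: register the service and accept connections.
	kv := &KV{data: map[string]string{"a": "1"}}
	rpc.Register(kv)
	l, err := net.Listen("tcp", "127.0.0.1:12345")
	if err != nil {
		panic(err)
	}
	go rpc.Accept(l)

	// Client side: dial the server and issue a synchronous call.
	client, err := rpc.Dial("tcp", "127.0.0.1:12345")
	if err != nil {
		panic(err)
	}
	var reply GetReply
	if err := client.Call("KV.Get", &GetArgs{Key: "a"}, &reply); err != nil {
		panic(err)
	}
	fmt.Println(reply.Value) // prints "1"
}
```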
- P1 - MapReduce
- Built a MapReduce library as a way to learn the Go programming language and to learn about fault tolerance in distributed systems (a rough scheduling sketch follows the part list below).
- Part I: Word Count - wrote a simple MapReduce program
- Part II: Distributed MapReduce jobs - wrote a Master that hands out jobs to workers
- Part III: Handling worker failures - modified the Master to handle failure of workers
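The heart of Parts II and III is the master's scheduling loop: hand each map or reduce task to an idle worker, and if that worker fails, re-issue the task to another one. Below is a rough, self-contained Go sketch of that idea, using a channel as the pool of idle workers; callWorker, the worker names, and the task count are invented for illustration and are not the lab's actual RPC interface.

```go
package main

import (
	"fmt"
	"sync"
)

// callWorker stands in for an RPC to a worker; it returns false when the
// worker has failed and the task must be handed to someone else.
func callWorker(worker string, task int) bool {
	return worker != "w3" // pretend worker w3 has crashed
}

func main() {
	const nTasks = 5
	idle := make(chan string, 8) // pool of idle workers
	for _, w := range []string{"w1", "w2", "w3"} {
		idle <- w
	}

	var wg sync.WaitGroup
	for t := 0; t < nTasks; t++ {
		wg.Add(1)
		go func(task int) {
			defer wg.Done()
			for {
				w := <-idle // block until some worker is idle
				if callWorker(w, task) {
					fmt.Printf("task %d done by %s\n", task, w)
					idle <- w // the worker is idle again
					return
				}
				// The worker failed: leave it out of the pool and retry the
				// task with whichever worker frees up next.
			}
		}(t)
	}
	wg.Wait()
}
```

Pulling workers from a channel turns "wait for an idle worker" into a simple blocking receive, and never returning a failed worker to the pool is what keeps crashed workers from receiving further tasks.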
- P2 - Primary/Backup Key/Value Service
- Built a key/value service using primary/backup replication, assisted by a view service that decides which machines are alive. The view service allows the primary/backup service to work correctly in the presence of network partitions. The view service itself is not replicated and is a single point of failure (a simplified view-service sketch follows the part list below).
- Part A: The Viewservice
- Part B: The primary/backup key/value service
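Part A's view service is essentially a small failure detector plus a monotonically numbered sequence of (primary, backup) views. The sketch below is a simplified, single-process illustration of that idea with invented names; among other things, it omits the real service's rule that the view may only advance after the current primary has acknowledged the view it is in.

```go
package main

import (
	"fmt"
	"time"
)

// View is a numbered assignment of a primary and a backup.
type View struct {
	Viewnum uint
	Primary string
	Backup  string
}

// ViewServer tracks the most recent ping from each server and the current view.
type ViewServer struct {
	current  View
	lastPing map[string]time.Time
}

// Ping records that a server is alive and returns the current view.
func (vs *ViewServer) Ping(server string) View {
	vs.lastPing[server] = time.Now()
	switch {
	case vs.current.Primary == "":
		// The first server to ping becomes the primary of a new view.
		vs.current = View{Viewnum: vs.current.Viewnum + 1, Primary: server}
	case vs.current.Backup == "" && server != vs.current.Primary:
		vs.current = View{Viewnum: vs.current.Viewnum + 1,
			Primary: vs.current.Primary, Backup: server}
	}
	return vs.current
}

// tick runs periodically; if the primary has not pinged recently, the backup
// is promoted in a new view.
func (vs *ViewServer) tick(deadline time.Duration) {
	p := vs.current.Primary
	if p != "" && time.Since(vs.lastPing[p]) > deadline {
		vs.current = View{Viewnum: vs.current.Viewnum + 1,
			Primary: vs.current.Backup}
	}
}

func main() {
	vs := &ViewServer{lastPing: map[string]time.Time{}}
	fmt.Println(vs.Ping("server-1")) // server-1 becomes primary
	fmt.Println(vs.Ping("server-2")) // server-2 becomes backup
	time.Sleep(50 * time.Millisecond)
	vs.tick(10 * time.Millisecond) // server-1 has "missed" its pings
	fmt.Println(vs.current)        // server-2 is promoted to primary
}
```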
- P3 - Paxos-Based Key/Value Service
- Built a key/value service that uses the Paxos protocol to replicate the key/value database, so there is no single point of failure and network partitions are handled correctly. This key/value service is slower than a non-replicated key/value server would be, but it is fault tolerant (a stripped-down Paxos sketch follows the part list below).
- Part A: Paxos
- Part B: Paxos-based Key/Value Server
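A useful way to internalize Part A is to strip Paxos down to its two phases for a single instance: prepare/promise, then accept. The sketch below does exactly that on one machine with in-memory acceptors; it ignores RPCs, competing proposers retrying with higher numbers, and the per-instance bookkeeping the lab layers on top, and all names are illustrative.

```go
package main

import "fmt"

// acceptor holds the durable Paxos state: the highest prepare number promised
// and the highest-numbered proposal accepted so far.
type acceptor struct {
	promised  int
	acceptedN int
	acceptedV string
}

func (a *acceptor) prepare(n int) (ok bool, prevN int, prevV string) {
	if n > a.promised {
		a.promised = n
		return true, a.acceptedN, a.acceptedV
	}
	return false, 0, ""
}

func (a *acceptor) accept(n int, v string) bool {
	if n >= a.promised {
		a.promised, a.acceptedN, a.acceptedV = n, n, v
		return true
	}
	return false
}

// propose runs one round of the two-phase protocol for value v with proposal
// number n, returning the value chosen if a majority accepted it.
func propose(acceptors []*acceptor, n int, v string) (string, bool) {
	promises, bestN, bestV := 0, 0, ""
	for _, a := range acceptors {
		if ok, pn, pv := a.prepare(n); ok {
			promises++
			if pn > bestN {
				bestN, bestV = pn, pv
			}
		}
	}
	if promises <= len(acceptors)/2 {
		return "", false // no majority promised; retry later with a higher n
	}
	if bestV != "" {
		v = bestV // must re-propose a value an acceptor already accepted
	}
	accepts := 0
	for _, a := range acceptors {
		if a.accept(n, v) {
			accepts++
		}
	}
	return v, accepts > len(acceptors)/2
}

func main() {
	acceptors := []*acceptor{{}, {}, {}}
	if v, ok := propose(acceptors, 1, "Put(x, 1)"); ok {
		fmt.Println("chosen:", v)
	}
}
```

The key safety step is re-proposing bestV when any acceptor has already accepted a value: that is what prevents two different values from ever being chosen.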
- P4 - Sharded Key/Value Service
- Built a key/value storage system that "shards" the keys over a set of replica groups. Each replica group handles puts and gets for just a few of the shards, and the groups operate in parallel, so total system throughput increases in proportion to the number of groups. The sharded key/value store consists of two components: (1) a set of replica groups, each responsible for a subset of the shards and made up of a handful of servers that use Paxos to replicate the group's shards, and (2) a shard master that decides which replica group should serve each shard (i.e., the configuration). There is a single shard master for the whole system, implemented as a fault-tolerant service using Paxos (a minimal sharding sketch follows the part list below).
- Part A: The Shard Master
- Part B: Sharded Key/Value Server
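The central data structure here is the configuration: a fixed number of shards, a mapping from shard to replica group, and a deterministic rule for assigning keys to shards. The sketch below illustrates those pieces only; key2shard, rebalance, and the group ids are invented, the hash function is arbitrary, and the real shard master replicates configuration changes through Paxos and rebalances by moving as few shards as possible.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

const NShards = 10

// Config maps each shard to a replica group (gid) and lists the servers in
// each group; Num is the configuration number handed out by the shard master.
type Config struct {
	Num    int
	Shards [NShards]int64
	Groups map[int64][]string
}

// key2shard decides which shard a key lives in by hashing the key.
func key2shard(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32()) % NShards
}

// rebalance assigns shards round-robin over the groups, in sorted gid order so
// every replica that recomputes the configuration gets the same answer.
func rebalance(cfg *Config) {
	gids := make([]int64, 0, len(cfg.Groups))
	for gid := range cfg.Groups {
		gids = append(gids, gid)
	}
	sort.Slice(gids, func(i, j int) bool { return gids[i] < gids[j] })
	for s := 0; s < NShards; s++ {
		cfg.Shards[s] = gids[s%len(gids)]
	}
}

func main() {
	cfg := &Config{Num: 1, Groups: map[int64][]string{
		100: {"a1", "a2", "a3"},
		101: {"b1", "b2", "b3"},
	}}
	rebalance(cfg)
	key := "some-key"
	shard := key2shard(key)
	fmt.Printf("key %q -> shard %d -> group %d\n", key, shard, cfg.Shards[shard])
}
```

Sorting the gids before assigning shards matters because every replica of a Paxos-replicated shard master must recompute exactly the same configuration.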