A series of labs accompanying the coursework on distributed systems, including a Raft implementation. This repo holds my solutions to these labs.
This lab involves building a key/value server for a single machine that ensures that each operation is executed exactly once despite network failures and that the operations are linearizable.
The solution is present in `kvsrv/{client.go,server.go,common.go}` and passes all tests.
This lab involves implementing Raft, a replicated state machine protocol.
The solution is present in `raft/raft.go`.
Implement Raft leader election and heartbeats. The goal for Part 3A is for a single leader to be elected, for the leader to remain the leader if there are no failures, and for a new leader to take over if the old leader fails or if packets to/from the old leader are lost.
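As a rough sketch of how elections can be triggered, a background ticker can start an election whenever no heartbeat has arrived within a randomized timeout. The fields and helpers below (`state`, `lastHeartbeat`, `electionTimeout`, `startElection`) are illustrative assumptions, not the exact identifiers in raft/raft.go:

```go
package raft

import (
	"math/rand"
	"sync"
	"time"
)

// Sketch only: field and helper names are assumptions, not the real raft.go.
const (
	Follower = iota
	Candidate
	Leader
)

type Raft struct {
	mu              sync.Mutex
	state           int
	lastHeartbeat   time.Time     // reset whenever a valid heartbeat arrives
	electionTimeout time.Duration // randomized per election cycle
	// ... other Raft state ...
}

func (rf *Raft) killed() bool   { return false } // the lab skeleton checks rf.dead atomically
func (rf *Raft) startElection() { /* become candidate, bump term, send RequestVote RPCs */ }

func (rf *Raft) ticker() {
	for !rf.killed() {
		rf.mu.Lock()
		timedOut := rf.state != Leader &&
			time.Since(rf.lastHeartbeat) > rf.electionTimeout
		rf.mu.Unlock()
		if timedOut {
			rf.startElection()
		}
		// Randomized poll interval keeps peers' timeouts from staying in lockstep.
		time.Sleep(time.Duration(50+rand.Intn(100)) * time.Millisecond)
	}
}
```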
The solution passes all tests:
- ✅ Initial election
- ✅ Election after network failure
- ✅ Multiple elections
Implement the leader and follower code to append new log entries.
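For reference, the AppendEntries RPC shape from Figure 2 of the Raft paper, which the leader uses both for replication and (with empty `Entries`) as a heartbeat; the structs in raft/raft.go may name things differently:

```go
// One entry in the replicated log.
type LogEntry struct {
	Term    int         // term in which the entry was created
	Command interface{} // client command for the state machine
}

// AppendEntries RPC arguments, following Figure 2 of the Raft paper.
type AppendEntriesArgs struct {
	Term         int        // leader's current term
	LeaderId     int        // so followers can redirect clients
	PrevLogIndex int        // index of the entry immediately preceding the new ones
	PrevLogTerm  int        // term of the entry at PrevLogIndex
	Entries      []LogEntry // entries to store (empty for heartbeat)
	LeaderCommit int        // leader's commitIndex
}

type AppendEntriesReply struct {
	Term    int  // currentTerm, for the leader to update itself
	Success bool // true if the follower matched PrevLogIndex/PrevLogTerm
}
```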
The solution passes the following tests:
- ✅ Basic agreement
- ✅ RPC byte count
- ✅ Progressive failure of followers
- ✅ Failure of leaders
- ✅ Agreement after follower reconnects
- ✅ No agreement if too many followers disconnect
- ✅ Concurrent Start()s
- ✅ Rejoin of partitioned leader
- ✅ Leader backs up quickly over incorrect follower logs
- ✅+❌ RPC counts aren't too high (Fails sometimes)
Store Raft's persistent state to survive a reboot and resume service where it left off.
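A minimal sketch of how this can be done with the lab's `labgob` encoder, assuming a persister with a `Save(raftstate, snapshot)` method (older skeletons call it `SaveRaftState`) and the usual `currentTerm`/`votedFor`/`log` fields:

```go
package raft

import (
	"bytes"

	"6.5840/labgob"
)

// Persist Raft's durable state (currentTerm, votedFor, log). Field names
// are assumptions about raft.go; the encoding pattern follows the skeleton.
func (rf *Raft) persist() {
	w := new(bytes.Buffer)
	e := labgob.NewEncoder(w)
	e.Encode(rf.currentTerm)
	e.Encode(rf.votedFor)
	e.Encode(rf.log)
	rf.persister.Save(w.Bytes(), nil) // no snapshot yet in this part
}

// Restore previously persisted state on reboot.
func (rf *Raft) readPersist(data []byte) {
	if len(data) == 0 {
		return // first boot: nothing to restore
	}
	r := bytes.NewBuffer(data)
	d := labgob.NewDecoder(r)
	var currentTerm, votedFor int
	var log []LogEntry
	if d.Decode(&currentTerm) == nil &&
		d.Decode(&votedFor) == nil &&
		d.Decode(&log) == nil {
		rf.currentTerm, rf.votedFor, rf.log = currentTerm, votedFor, log
	}
}
```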
The solution passes the following tests:
- ✅ Basic Persistence
- ✅ More Persistence
- ✅ Partitioned leader and one follower crash, leader restarts
- ✅+❌ Figure 8 (Fails sometimes)
- ✅ Unreliable agreement
- ❌ Figure 8 (unreliable)
- ✅+❌ Churn
- ✅+❌ Unreliable churn
This lab involves building a fault-tolerant key/value storage service using the Raft library from Lab 3.
The solution is present in `kvraft/{client.go, common.go, server.go, kvraft_debug.go}`.
The solution passes the following tests:
- ✅ One Client
- ✅ Ops complete fast enough
- ❌ Remaining Tests
`labrpc` and `labgob` are packages provided by MIT for performing RPCs. The tester can tell `labrpc` to delay RPCs, re-order them, and discard them to simulate various network failures. `porcupine` and `models` are packages used for testing the labs.
- Attach a logical clock and a unique identifier to each Clerk.
- Store a map on the server from each Clerk's ID to its logical clock.
- On every state update, such as a `Put` or `Append` request, increment the logical clock on the client and send the request.
- If the received request carries the timestamp immediately following the one stored on the server, perform the update and increment the clock on the server.
- Otherwise, classify the request as a duplicate and handle it accordingly.
- `Get` requests do not involve a state change and hence do not require attaching or incrementing logical clocks.
- If a duplicate `Append` request is received, use `strings.Split` to split the current value of the given key and return the value expected before this `Append` (a concurrent update might have appended another value, so the value expected before this `Append` might differ from the current value on the server).
- This solution was suitable here because `Append` requests involved unique strings; dealing with non-unique `Append`s might require a better solution. A sketch of this scheme is given after this list.
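A minimal sketch of the server side of this scheme, under the assumptions above; identifiers such as `ClerkId`, `Clock`, and `clerkClocks` are illustrative, not the actual names in `kvsrv/server.go`:

```go
package kvsrv

import (
	"strings"
	"sync"
)

// Illustrative argument/reply shapes; field names are assumptions.
type PutAppendArgs struct {
	Key     string
	Value   string
	ClerkId int64 // unique identifier per Clerk
	Clock   int64 // Clerk's logical clock, incremented per state-changing request
}

type PutAppendReply struct {
	Value string // for Append: the value of the key before this Append applied
}

type KVServer struct {
	mu          sync.Mutex
	data        map[string]string // the key/value store
	clerkClocks map[int64]int64   // last applied logical clock per Clerk
}

func (kv *KVServer) Append(args *PutAppendArgs, reply *PutAppendReply) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if args.Clock == kv.clerkClocks[args.ClerkId]+1 {
		// New request: reply with the old value, apply the append,
		// and advance this Clerk's clock on the server.
		reply.Value = kv.data[args.Key]
		kv.data[args.Key] += args.Value
		kv.clerkClocks[args.ClerkId] = args.Clock
	} else {
		// Duplicate: the append already happened. Recover the value as it was
		// before this Append by splitting on the (unique) appended string; the
		// current value may already include later concurrent appends.
		reply.Value = strings.Split(kv.data[args.Key], args.Value)[0]
	}
}
```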