Introduction
A key/value server is one of the most common components in distributed systems for storing and retrieving data. It is a simple storage system that keeps data as key-value pairs: the key is a unique identifier for the data, and the value is the data itself. Key/value servers are used in many applications, such as caching, session management, and distributed storage systems.
In this lab, we will implement a simple key/value server that supports the following operations:
Put(key, value, version)
: Store the key-value pair in the server.

Get(key)
: Retrieve the value of the key from the server.
It runs on a single server and can be accessed by multiple clients concurrently.
Lab
Key/value server with reliable network
In this part, we will implement a key/value server that does not need to handle network failures.
When the server receives a Put request, it performs the following actions based on the version number:
- If the provided version number matches the current version of the key, the server updates the value and increments the version number.
- If the provided version number is 0 and the key does not exist, the server creates a new key-value pair.
- If the provided version number does not match the current version of the key, the server returns an error.
When the server receives a Get request, it returns the value and version number of the key if it exists, or an error if the key does not exist.
All operations should lock the key-value store to ensure that they are atomic.
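A minimal sketch of the server side, assuming a mutex-protected map from key to a (value, version) pair; the rpc.PutArgs/PutReply/GetArgs/GetReply types, rpc.Tversion, and the rpc.ErrNoKey/rpc.ErrVersion/rpc.OK values are assumptions modeled on the lab's rpc package:

```go
import "sync"

// valueVersion is the per-key state: the current value and its version.
type valueVersion struct {
	Value   string
	Version rpc.Tversion
}

type KVServer struct {
	mu   sync.Mutex
	data map[string]valueVersion
}

func (kv *KVServer) Get(args *rpc.GetArgs, reply *rpc.GetReply) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	vv, ok := kv.data[args.Key]
	if !ok {
		reply.Err = rpc.ErrNoKey
		return
	}
	reply.Value, reply.Version, reply.Err = vv.Value, vv.Version, rpc.OK
}

func (kv *KVServer) Put(args *rpc.PutArgs, reply *rpc.PutReply) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	vv, ok := kv.data[args.Key]
	switch {
	case ok && args.Version == vv.Version:
		// Version matches: update the value and bump the version.
		kv.data[args.Key] = valueVersion{args.Value, vv.Version + 1}
		reply.Err = rpc.OK
	case !ok && args.Version == 0:
		// Version 0 and the key does not exist: create it at version 1.
		kv.data[args.Key] = valueVersion{args.Value, 1}
		reply.Err = rpc.OK
	case !ok:
		// Non-zero version for a key that does not exist.
		reply.Err = rpc.ErrNoKey
	default:
		// Key exists but the supplied version does not match.
		reply.Err = rpc.ErrVersion
	}
}
```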
Here is the client Get implementation:
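A sketch of what the clerk-side Get can look like; the Clerk fields and the ck.clnt.Call helper are assumptions borrowed from the lab's tester framework, and with a reliable network a single RPC suffices:

```go
type Clerk struct {
	clnt   *tester.Clnt // RPC client supplied by the lab framework (assumed)
	server string       // address of the single key/value server
}

func (ck *Clerk) Get(key string) (string, rpc.Tversion, rpc.Err) {
	args := rpc.GetArgs{Key: key}
	var reply rpc.GetReply
	// With a reliable network the call always reaches the server and the
	// reply always comes back, so no retry loop is needed.
	ck.clnt.Call(ck.server, "KVServer.Get", &args, &reply)
	return reply.Value, reply.Version, reply.Err
}
```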
Implementing a lock using key/value clerk
In this part, we will implement a distributed lock. Here is the constructor of the lock:
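A sketch of the lock type and its constructor; kvtest.IKVClerk (the clerk interface used by the lab's tests) is an assumption, and randomID is a hypothetical helper shown in a later snippet:

```go
type Lock struct {
	ck kvtest.IKVClerk // clerk used to talk to the key/value server
	l  string          // key under which the lock state is stored
	id string          // unique ID identifying this client as a lock holder
}

func MakeLock(ck kvtest.IKVClerk, l string) *Lock {
	return &Lock{ck: ck, l: l, id: randomID()}
}
```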
The l parameter specifies the key used to store the “lock state”. For example, if clients A and B both use lock1 as l, the server will store the lock state under the key lock1, indicating that they share the same lock. If a client uses lock1 and lock2, the server will store the lock state under the keys lock1 and lock2, indicating that these are two different locks.
To identify each client, we can generate a unique client ID by randomly generating an 8-byte string.
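One possible randomID helper, using only the standard library (the lab's test harness may already provide something similar):

```go
import (
	"crypto/rand"
	"encoding/hex"
)

// randomID returns 8 random bytes encoded as a hex string, which is
// enough to make collisions between clients vanishingly unlikely.
func randomID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}
```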
For the lock, we need to implement the following methods:
Acquire
: Acquire the lock. If the lock is not held by any client, the client can acquire it. If the lock is held by another client, the client cannot acquire it.

Release
: Release the lock. If the lock is held by the client, the client can release it. If the lock is not held by the client, the client cannot release it.
The “lock state” is stored in the key-value store. The value of the key is the client ID that holds the lock. If the lock is not held, the value is an empty string.
When the client attempts to acquire the lock, it should keep retrying until it successfully acquires it. Since the network is reliable, we don’t need to handle the scenario where the lock was acquired by the client but the reply message was lost.
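A minimal sketch of Acquire and Release for the reliable-network case, building on the Lock and clerk sketches above; the 10 ms retry delay is an arbitrary choice:

```go
import "time"

func (lk *Lock) Acquire() {
	for {
		val, ver, err := lk.ck.Get(lk.l)
		switch {
		case err == rpc.ErrNoKey:
			// No lock state yet: create it with ourselves as the holder.
			if lk.ck.Put(lk.l, lk.id, 0) == rpc.OK {
				return
			}
		case err == rpc.OK && val == "":
			// The lock is free: claim it at the version we just observed.
			if lk.ck.Put(lk.l, lk.id, ver) == rpc.OK {
				return
			}
		}
		// Someone else holds the lock, or our conditional Put lost a race;
		// wait briefly and retry.
		time.Sleep(10 * time.Millisecond)
	}
}

func (lk *Lock) Release() {
	val, ver, err := lk.ck.Get(lk.l)
	if err == rpc.OK && val == lk.id {
		// Only the current holder clears the lock state.
		lk.ck.Put(lk.l, "", ver)
	}
}
```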
Key/value server with dropped messages
In this part, we will implement a key/value server that can handle dropped messages in unreliable networks.
It’s easy to handle dropped messages in the Get operation because it does not change the state of the server. However, it’s more challenging to handle dropped messages in the Put operation because it changes the state of the server.
In scenarios where messages may be dropped, the key/value server must handle the following cases:
- Successful Put Request: If the initial Put request is received and processed by the server without any message loss, the server’s reply is reliable.
- Lost Request Message: If the server does not receive the Put request, the client will retry the request until it is successfully received and processed.
- Lost Reply Message: If the server processes the Put request but the reply message is lost, the client will retry the request. In this case, the server will return rpc.ErrVersion due to the incremented version number. The client cannot determine whether its Put request was processed or whether another client’s Put request caused the version increment. To address this ambiguity, the client should return rpc.ErrMaybe, indicating that it is uncertain whether the Put request was processed.
These measures ensure that the key/value server can handle dropped messages effectively, maintaining consistency and reliability in a distributed environment.
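A sketch of the clerk-side Put retry loop under these rules; it assumes ck.clnt.Call returns false when the request or its reply was dropped (as in the lab's tester), and the retry interval is arbitrary:

```go
import "time"

func (ck *Clerk) Put(key, value string, version rpc.Tversion) rpc.Err {
	args := rpc.PutArgs{Key: key, Value: value, Version: version}
	first := true // no earlier attempt of this Put can have reached the server
	for {
		var reply rpc.PutReply
		if ck.clnt.Call(ck.server, "KVServer.Put", &args, &reply) {
			if reply.Err == rpc.ErrVersion && !first {
				// An earlier attempt may already have been applied, so we
				// cannot tell whether our Put or someone else's bumped the
				// version: report the uncertainty to the caller.
				return rpc.ErrMaybe
			}
			return reply.Err
		}
		// The request or its reply was dropped; remember that the server
		// might have processed it, then retry.
		first = false
		time.Sleep(100 * time.Millisecond)
	}
}
```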
Implementing a lock using key/value clerk and unreliable network
Applying the same principles as in the previous part, we can implement a distributed lock that handles dropped messages in unreliable networks. Here is the Acquire method for the client:
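A sketch of such an Acquire, reusing the types from the earlier snippets: after rpc.ErrMaybe the Put may or may not have taken effect, so the loop re-reads the lock key and checks whether the stored holder is this client's own ID:

```go
func (lk *Lock) Acquire() rpc.Err {
	for {
		val, ver, err := lk.ck.Get(lk.l)
		switch {
		case err == rpc.OK && val == lk.id:
			// We already hold the lock, possibly because an earlier Put
			// reported rpc.ErrMaybe but actually succeeded.
			return rpc.OK
		case err == rpc.ErrNoKey:
			// No lock state yet: try to create it with ourselves as holder.
			if lk.ck.Put(lk.l, lk.id, 0) == rpc.OK {
				return rpc.OK
			}
			// rpc.ErrMaybe or rpc.ErrVersion: re-check on the next pass.
		case err == rpc.OK && val == "":
			// The lock is free: try to claim it at the observed version.
			if lk.ck.Put(lk.l, lk.id, ver) == rpc.OK {
				return rpc.OK
			}
		}
		time.Sleep(10 * time.Millisecond)
	}
}
```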
The Acquire method should keep retrying until it successfully acquires the lock. If the lock turns out to be already held by this client (for example, because an earlier Put reported rpc.ErrMaybe but actually succeeded), Acquire should return rpc.OK to indicate that the lock is acquired.
Conclusion
In this lab, we implemented a key/value server that supports Put and Get operations. We also implemented a distributed lock using the key/value clerk. By handling dropped messages in unreliable networks, we ensured that the key/value server and lock maintain consistency and reliability in distributed systems.
This lab provides valuable insights into building distributed systems that can handle network failures and maintain data consistency. By understanding the challenges and solutions presented in this lab, we can design robust and reliable distributed systems that meet the demands of modern applications.
My implementation of this lab can be found on GitHub; I hope it helps you understand the concepts better. If you have any questions or feedback, feel free to send me an email. Thank you for reading!