Quorum Requests
Note
The content of this chapter is in the 03-quorum branch.
How it Works
Quorum requests allow sending a command to more than one vnode and waiting until a given number of responses has been received before considering the request successful.
To implement this, we need a process that distributes the request to the first N vnodes in the preference list and waits for at least W responses to arrive before returning to the requester.
We use a gen_fsm to implement this process, which looks like this:
+------+    +---------+    +---------+    +---------+              +------+
|      |    |         |    |         |    |         |remaining = 0 |      |
| Init +--->| Prepare +--->| Execute +--->| Waiting +------------->| Stop |
|      |    |         |    |         |    |         |              |      |
+------+    +---------+    +---------+    +-------+-+              +------+
                                              ^   | |
                                              |   | |        +---------+
                                              +---+ +------->|         |
                                                             | Timeout |
                                      remaining > 0  timeout |         |
                                                             +---------+
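The Execute and Waiting states can be approximated with plain processes to see the "wait for W out of N" logic in isolation. This is a minimal sketch and not the tutorial's gen_fsm: the module name quorum_sketch, the function run/4 and the spawned workers standing in for vnodes are all assumptions made for illustration.

```erlang
-module(quorum_sketch).
-export([run/4]).

%% Hypothetical stand-in for the FSM: run Fun on N workers and return
%% once W of them have replied, or {error, timeout} after Timeout ms.
run(Fun, N, W, Timeout) when W =< N ->
    ReqId = make_ref(),
    Parent = self(),
    %% "Execute": one process per simulated vnode.
    [spawn(fun() -> Parent ! {ReqId, I, Fun(I)} end) || I <- lists:seq(1, N)],
    %% A single timer bounds the whole request, not each reply.
    Timer = erlang:send_after(Timeout, Parent, {ReqId, timeout}),
    Result = wait(ReqId, W, []),
    %% Late replies may still reach the mailbox; they carry ReqId,
    %% so they cannot be confused with replies to another request.
    erlang:cancel_timer(Timer),
    Result.

%% "Waiting": loop while remaining > 0, stop when remaining = 0.
wait(_ReqId, 0, Acc) ->
    {ok, lists:reverse(Acc)};
wait(ReqId, Remaining, Acc) ->
    receive
        {ReqId, timeout} -> {error, timeout};
        {ReqId, I, Reply} -> wait(ReqId, Remaining - 1, [{I, Reply} | Acc])
    end.
```

Calling quorum_sketch:run(fun(I) -> {pong, I} end, 3, 2, 1000) returns an {ok, Replies} tuple with two entries; which two workers answer first depends on scheduling.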
Implementing it
To implement it, we need to change the code in tanodb.erl to instantiate an FSM that handles the request, instead of sending the command directly to one vnode.
get(Bucket, Key, Opts) ->
    K = {Bucket, Key},
    Params = K,
    run_quorum(get, K, Params, Opts).

put(Bucket, Key, Value, Opts) ->
    K = {Bucket, Key},
    Params = {Bucket, Key, Value},
    run_quorum(put, K, Params, Opts).

delete(Bucket, Key, Opts) ->
    K = {Bucket, Key},
    Params = K,
    run_quorum(delete, K, Params, Opts).
We are going to generalize that logic in a function called run_quorum, where we can pass options for N, W and Timeout to play with different values:
run_quorum(Action, K, Params, Opts) ->
    N = maps:get(n, Opts, ?N),
    W = maps:get(w, Opts, ?W),
    Timeout = maps:get(timeout, Opts, ?TIMEOUT),
    ReqId = make_ref(),
    tanodb_write_fsm:run(Action, K, Params, N, W, self(), ReqId),
    wait_for_reqid(ReqId, Timeout).

wait_for_reqid(ReqId, Timeout) ->
    receive
        {ReqId, Val} -> Val
    after
        Timeout -> {error, timeout}
    end.
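The option handling in run_quorum relies on maps:get/3, which returns the third argument when the key is missing. A small sketch of that behaviour follows; the module name opts_sketch and the default values 3, 3 and 5000 are assumptions, since the text does not show how ?N, ?W and ?TIMEOUT are defined.

```erlang
-module(opts_sketch).
-export([resolve/1]).

%% Assumed defaults, for illustration only.
-define(N, 3).
-define(W, 3).
-define(TIMEOUT, 5000).

%% Return {N, W, Timeout}, taking each value from Opts when present
%% and falling back to the macro otherwise.
resolve(Opts) ->
    {maps:get(n, Opts, ?N),
     maps:get(w, Opts, ?W),
     maps:get(timeout, Opts, ?TIMEOUT)}.
```

With these assumed defaults, opts_sketch:resolve(#{w => 2}) overrides only W and keeps the defaults for N and Timeout.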
To wait for the right answer we need to generate a unique identifier for each request and send it with the request itself. The identifier will come back in the message sent by the FSM once the request finishes.
If the response does not arrive within Timeout milliseconds, we treat it as an error and return without it. Here is the same function with an explicit clause for error replies:
wait_for_reqid(ReqId, Timeout) ->
    receive
        {ReqId, {error, Reason}} -> {error, Reason};
        {ReqId, Val} -> Val
    after
        Timeout -> {error, timeout}
    end.
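To see why the request identifier matters, the following sketch puts a stale reply in the mailbox before waiting. The module name reqid_sketch and the function demo/0 are hypothetical; wait_for_reqid/2 has the same shape as the one shown in the text.

```erlang
-module(reqid_sketch).
-export([demo/0]).

%% Same shape as wait_for_reqid/2 in the text.
wait_for_reqid(ReqId, Timeout) ->
    receive
        {ReqId, Val} -> Val
    after
        Timeout -> {error, timeout}
    end.

demo() ->
    Stale = make_ref(),
    Current = make_ref(),
    %% A late reply from an earlier request is already in the mailbox.
    self() ! {Stale, old_value},
    self() ! {Current, new_value},
    %% ReqId is already bound, so only the message tagged with Current
    %% matches; the stale one is skipped and stays in the mailbox.
    First = wait_for_reqid(Current, 100),
    %% No message carries this fresh reference, so the wait times out.
    Second = wait_for_reqid(make_ref(), 100),
    {First, Second}.
```

reqid_sketch:demo() returns {new_value, {error, timeout}}: the stale reply never leaks into the current request, and a request with no reply ends in a timeout.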
There are two new files:
- tanodb_write_fsm.erl: the FSM logic
- tanodb_write_fsm_sup.erl: the supervisor for the FSMs
Finally, we need to add tanodb_write_fsm_sup to our top-level supervisor in tanodb_sup.
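The entry added to tanodb_sup could look roughly like the child specification below. This is a sketch under assumptions: the text does not show the real spec, so the map fields and the existence of tanodb_write_fsm_sup:start_link/0 are guesses, and sup_sketch is a hypothetical module used only to validate the spec's shape with supervisor:check_childspecs/1.

```erlang
-module(sup_sketch).
-export([write_fsm_sup_spec/0, check/0]).

%% Assumed child spec for the FSM supervisor; start_link/0 is a guess
%% at the interface of tanodb_write_fsm_sup.
write_fsm_sup_spec() ->
    #{id => tanodb_write_fsm_sup,
      start => {tanodb_write_fsm_sup, start_link, []},
      restart => permanent,
      shutdown => infinity,      % usual shutdown value for supervisor children
      type => supervisor,
      modules => [tanodb_write_fsm_sup]}.

%% Checks only that the spec is syntactically valid; it does not
%% start anything or verify that the module exists.
check() ->
    supervisor:check_childspecs([write_fsm_sup_spec()]).
```

In the real tanodb_sup, a spec like this would be appended to the list of children returned from init/1, next to the children that are already there.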
Trying it
To test it, we are going to make some calls to the API and observe that each result now contains more than one response, one per vnode that replied:
(tanodb@127.0.0.1)1> B1 = b1.
(tanodb@127.0.0.1)2> K1 = k1.
(tanodb@127.0.0.1)3> V1 = v1.
First let’s try to get a key that doesn’t exist:
(tanodb@127.0.0.1)4> tanodb:get(B1, K1).
{ok,[{[1073290264914881830555831049026020342559825461248,
'tanodb@127.0.0.1'],
{not_found,{b1,k1}}},
{[1050454301831586472458898473514828420377701515264,
'tanodb@127.0.0.1'],
{not_found,{b1,k1}}},
{[1096126227998177188652763624537212264741949407232,
'tanodb@127.0.0.1'],
{not_found,{b1,k1}}}]}
Let’s make a similar call, this time passing options: we want to run the command on 5 vnodes and wait for all 5 responses, and the request should finish in under a second. Note that this call uses k1 as the bucket and v1 as the key:
(tanodb@127.0.0.1)5> tanodb:get(k1, v1, #{n => 5, w => 5, timeout => 1000}).
{ok,[{[456719261665907161938651510223838443642478919680,
'tanodb@127.0.0.1'],
{not_found,{k1,v1}}},
{[433883298582611803841718934712646521460354973696,
'tanodb@127.0.0.1'],
{not_found,{k1,v1}}},
{[411047335499316445744786359201454599278231027712,
'tanodb@127.0.0.1'],
{not_found,{k1,v1}}},
{[388211372416021087647853783690262677096107081728,
'tanodb@127.0.0.1'],
{not_found,{k1,v1}}},
{[365375409332725729550921208179070754913983135744,
'tanodb@127.0.0.1'],
{not_found,{k1,v1}}}]}
Let’s try deleting a key that doesn’t exist:
(tanodb@127.0.0.1)6> tanodb:delete(B1, K1).
{ok,[{[1050454301831586472458898473514828420377701515264,
'tanodb@127.0.0.1'],
ok},
{[1073290264914881830555831049026020342559825461248,
'tanodb@127.0.0.1'],
ok},
{[1096126227998177188652763624537212264741949407232,
'tanodb@127.0.0.1'],
ok}]}
Let’s put a value:
(tanodb@127.0.0.1)7> tanodb:put(B1, K1, V1).
{ok,[{[1096126227998177188652763624537212264741949407232,
'tanodb@127.0.0.1'],
ok},
{[1073290264914881830555831049026020342559825461248,
'tanodb@127.0.0.1'],
ok},
{[1050454301831586472458898473514828420377701515264,
'tanodb@127.0.0.1'],
ok}]}
Now let’s get the value:
(tanodb@127.0.0.1)8> tanodb:get(B1, K1).
{ok,[{[1096126227998177188652763624537212264741949407232,
'tanodb@127.0.0.1'],
{found,{{b1,k1},v1}}},
{[1050454301831586472458898473514828420377701515264,
'tanodb@127.0.0.1'],
{found,{{b1,k1},v1}}},
{[1073290264914881830555831049026020342559825461248,
'tanodb@127.0.0.1'],
{found,{{b1,k1},v1}}}]}
Let’s delete it:
(tanodb@127.0.0.1)9> tanodb:delete(B1, K1).
{ok,[{[1073290264914881830555831049026020342559825461248,
'tanodb@127.0.0.1'],
ok},
{[1096126227998177188652763624537212264741949407232,
'tanodb@127.0.0.1'],
ok},
{[1050454301831586472458898473514828420377701515264,
'tanodb@127.0.0.1'],
ok}]}
And try to get it back:
(tanodb@127.0.0.1)10> tanodb:get(B1, K1).
{ok,[{[1073290264914881830555831049026020342559825461248,
'tanodb@127.0.0.1'],
{not_found,{b1,k1}}},
{[1096126227998177188652763624537212264741949407232,
'tanodb@127.0.0.1'],
{not_found,{b1,k1}}},
{[1050454301831586472458898473514828420377701515264,
'tanodb@127.0.0.1'],
{not_found,{b1,k1}}}]}
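Since every call now returns a list of {[Partition, Node], Result} entries, a caller may want to check whether the vnodes agree. The tutorial does not do this at this stage; the helper below is a hypothetical sketch (module replies_sketch, functions distinct_results/1 and agreed/1) that assumes the response shape shown in the shell sessions above.

```erlang
-module(replies_sketch).
-export([distinct_results/1, agreed/1]).

%% Drop the [Partition, Node] part and keep each distinct result once.
distinct_results({ok, Replies}) ->
    lists:usort([Result || {[_Partition, _Node], Result} <- Replies]).

%% true when there is at least one reply and all replies are equal.
agreed(Response) ->
    length(distinct_results(Response)) =:= 1.
```

For the responses shown above, where every vnode returns the same value, replies_sketch:agreed/1 returns true; a mix of found and not_found replies would make it return false.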