Listing Keys from a Bucket

Note

While the content of this book is still valid, the code may not run with the latest versions of the tools and libraries. For an updated version of the code, check the Riak Core Tutorial.

Since we have already implemented some commands, you may be asking yourself why we need a full chapter for one more. The reason is that listing keys works differently from the commands we have seen so far.

Bucket and key are hashed together to decide which vnode a request goes to, so the keys for a given bucket may be spread across multiple vnodes. If you are running a cluster, this means your keys are spread across multiple physical nodes.

To list all the keys from a bucket, we therefore have to ask every vnode for the keys it holds for that bucket, then put the responses together and return the combined result.

For this, Riak Core provides coverage calls, a way to run a command on all vnodes and gather the responses.
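
To make the idea concrete, here is a minimal sketch in plain Erlang that does not use riak_core at all. It assumes each vnode is simply a map from {Bucket, Key} to a value. The module name coverage_idea and its functions are hypothetical and exist only for illustration.

```erlang
-module(coverage_idea).
-export([keys/2, demo/0]).

%% Each simulated vnode is {Partition, Data}, where Data is a map
%% from {Bucket, Key} to Value. Ask every vnode for the keys it holds
%% for Bucket and return one {Partition, Keys} entry per vnode.
keys(Bucket, VNodes) ->
    [{Partition, [K || {B, K} <- maps:keys(Data), B =:= Bucket]}
     || {Partition, Data} <- VNodes].

%% Three made-up vnodes; only two of them hold keys for the bucket.
demo() ->
    VNodes = [{0, #{{<<"mybucket">>, <<"k1">>} => 42}},
              {1, #{{<<"other">>, <<"x">>} => 1}},
              {2, #{{<<"mybucket">>, <<"k2">>} => 43}}],
    keys(<<"mybucket">>, VNodes).
```

Calling coverage_idea:demo() returns one entry per simulated vnode, including an empty list for the vnode that holds nothing for the bucket. A real coverage call produces a result of similar shape, but the vnodes run as separate processes, possibly on different nodes.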

In this chapter we are going to implement the tanodb:keys(Bucket) function using coverage calls.

Implementing the Core API

We start as usual by adding the metric for the keys function.

Then we implement tanodb:keys/1. As you may notice, it does not look like the previous functions, because of what we discussed in the introduction.

In this case we call tanodb_coverage_fsm:start({keys, Bucket}, Timeout). tanodb_coverage_fsm is a new module that implements a behavior called riak_core_coverage_fsm (short for riak_core_coverage finite state machine). The behavior defines callbacks that are called on different states of a finite state machine.

The start function calls tanodb_coverage_fsm_sup:start_fsm([ReqId, self(), Request, Timeout]), which asks the coverage supervisor to start this new process.

We also need to register the supervisor in the supervisor tree.

As a side note, tanodb_coverage_fsm uses a module called time_compat to avoid problems with deprecated uses of time in Erlang, so we need to add that module as a dependency.

When we start the fsm with a command ({keys, Bucket}) and a timeout in milliseconds, the supervisor starts the finite state machine process. The process first calls the init function, which initializes its state and returns some information to riak_core so it knows what kind of coverage call we want to make. riak_core then calls the handle_coverage function on each vnode, and for each response it calls process_results in our process. When all the results have been received, or if an error happens (such as a timeout), it calls the finish callback, where we send the results to the calling process that is waiting for them.
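
The flow above can be imitated with plain Erlang processes. The following is only a sketch of the pattern, not the riak_core_coverage_fsm API: the module coverage_flow, its run/2 function, and the use of funs as stand-ins for vnodes are all assumptions made for illustration.

```erlang
-module(coverage_flow).
-export([run/2]).

%% VNodes is a list of {Partition, Fun}, where Fun() returns the keys
%% that simulated vnode holds. Timeout is in milliseconds.
run(VNodes, Timeout) ->
    ReqId = make_ref(),
    Caller = self(),
    %% Stand-in for handle_coverage: each vnode computes its reply
    %% in its own process and sends it back tagged with ReqId.
    lists:foreach(
      fun({Partition, Fun}) ->
              spawn(fun() -> Caller ! {ReqId, Partition, Fun()} end)
      end, VNodes),
    collect(ReqId, length(VNodes), Timeout, []).

%% Stand-in for finish: every vnode has answered.
collect(_ReqId, 0, _Timeout, Acc) ->
    {ok, Acc};
%% Stand-in for process_results: accumulate one reply at a time.
%% Note that in this sketch the timeout applies to each wait,
%% not to the request as a whole.
collect(ReqId, Pending, Timeout, Acc) ->
    receive
        {ReqId, Partition, Keys} ->
            collect(ReqId, Pending - 1, Timeout, [{Partition, Keys} | Acc])
    after Timeout ->
        {error, timeout}
    end.
```

For example, coverage_flow:run([{0, fun() -> [<<"k1">>] end}, {1, fun() -> [] end}], 5000) returns {ok, Results} with one {Partition, Keys} tuple per simulated vnode, in whatever order the replies arrived.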

The handle_coverage implementation is really simple. It uses the ets:match/2 function to match all the entries with the given bucket and returns the key from each matched result.

You can read more about ets match specs in the match spec chapter in the Erlang documentation.
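
As a standalone illustration of that pattern, the sketch below runs the same ets:match/2 call against a throwaway table. The module match_demo and the table contents are made up for this example. The entry layout {{Bucket, Key}, Value} follows the pattern used in handle_coverage.

```erlang
-module(match_demo).
-export([keys/2, demo/0]).

%% Return the keys stored under Bucket in Table.
keys(Table, Bucket) ->
    %% '$1' binds the key and '_' ignores the value. Each match comes
    %% back as a list of bindings, for example [<<"k1">>].
    Matches = ets:match(Table, {{Bucket, '$1'}, '_'}),
    [Key || [Key] <- Matches].

demo() ->
    Table = ets:new(match_demo_table, [set]),
    true = ets:insert(Table, [{{<<"mybucket">>, <<"k1">>}, 42},
                              {{<<"mybucket">>, <<"k2">>}, 43},
                              {{<<"other">>, <<"k9">>}, 0}]),
    %% A set table has no defined order, so sort for stable output.
    Keys = lists:sort(keys(Table, <<"mybucket">>)),
    true = ets:delete(Table),
    Keys.
```

Calling match_demo:demo() returns only the two keys stored under <<"mybucket">>. The entry in the other bucket does not match the pattern and is skipped.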

Relevant code from tanodb.erl:

keys(Bucket) ->
    tanodb_metrics:core_keys(),
    Timeout = 5000,
    tanodb_coverage_fsm:start({keys, Bucket}, Timeout).

Relevant code from tanodb_vnode.erl:

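handle_coverage({keys, Bucket}, _KeySpaces, {_, RefId, _},
                State=#state{table_name=TableName}) ->
    %% '$1' binds the key, '_' ignores the value; ets:match/2 returns
    %% one list of bindings per matching entry, e.g. [[Key1], [Key2]]
    Keys0 = ets:match(TableName, {{Bucket, '$1'}, '_'}),
    %% first/1 is a helper defined elsewhere in the module (not shown here)
    Keys = lists:map(fun first/1, Keys0),
    {reply, {RefId, Keys}, State};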

Test It

Let’s start by checking keys on an empty bucket.

(tanodb@127.0.0.1)1> tanodb:keys(<<"mybucket">>).

{ok,[{1347321821914426127719021955160323408745312813056,
      'tanodb@127.0.0.1',[]},

     ...

     {959110449498405040071168171470060731649205731328,
      'tanodb@127.0.0.1',...},
     {411047335499316445744786359201454599278231027712,...},
     {...}|...]}

The output is quite verbose and is abbreviated here for clarity, but we get back:

{ok, [{Partition, Node, ListOfKeys}*64]}

That is 64 three-element tuples (one per vnode), each with the partition id, the node where the partition lives, and the list of keys for that vnode. In this case all the lists are empty, and in the following examples most of them will be, so we will filter out the empty ones to clean up the output.

Now let’s put a value:

(tanodb@127.0.0.1)2> tanodb:put({<<"mybucket">>, <<"k1">>}, 42).

{ok,228359630832953580969325755111919221821239459840}

And try again listing keys but this time filtering the empty results:

(tanodb@127.0.0.1)3> lists:filter(fun ({_, _, []}) -> false;
                                      (_) -> true
                                  end,
                                  element(2, tanodb:keys(<<"mybucket">>))).

[{228359630832953580969325755111919221821239459840,
  'tanodb@127.0.0.1', [<<"k1">>]}]

We get one partition that returns the key we just inserted. You can also check that the partition id is the same as the one returned by the earlier put call.

Now let’s insert another value:

(tanodb@127.0.0.1)4> tanodb:put({<<"mybucket">>, <<"k2">>}, 43).

{ok,1210306043414653979137426502093171875652569137152}

And list again, now we get two partitions with keys:

(tanodb@127.0.0.1)5> lists:filter(fun ({_, _, []}) -> false;
                                      (_) -> true
                                  end,
                                  element(2, tanodb:keys(<<"mybucket">>))).

[{1210306043414653979137426502093171875652569137152,
  'tanodb@127.0.0.1', [<<"k2">>]},
 {228359630832953580969325755111919221821239459840,
  'tanodb@127.0.0.1', [<<"k1">>]}]

Yet another value:

(tanodb@127.0.0.1)6> tanodb:put({<<"mybucket">>, <<"k3">>}, 44).

{ok,1073290264914881830555831049026020342559825461248}

And the list again:

(tanodb@127.0.0.1)7> lists:filter(fun ({_, _, []}) -> false;
                                      (_) -> true
                                  end,
                                  element(2, tanodb:keys(<<"mybucket">>))).

[{1210306043414653979137426502093171875652569137152,
  'tanodb@127.0.0.1', [<<"k2">>]},
 {1073290264914881830555831049026020342559825461248,
  'tanodb@127.0.0.1', [<<"k3">>]},
 {228359630832953580969325755111919221821239459840,
  'tanodb@127.0.0.1', [<<"k1">>]}]

Implementing the REST API

The REST API is quite straightforward. We add a new cowboy route that allows GET /store/:bucket without specifying a key. We interpret this as a request to “get the bucket”, which for us means returning its keys.

Then, when handling a GET where key is undefined, we assume it is a request to list the bucket’s keys. We request the keys and deduplicate them by using them as keys in a map with every value set to true, and then collect the keys of the map.
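
The deduplication step could look roughly like the sketch below. It is not the handler code itself: the module unique_keys and the function from_coverage/1 are hypothetical names, and the input is assumed to have the {Partition, Node, Keys} shape shown earlier in this chapter.

```erlang
-module(unique_keys).
-export([from_coverage/1]).

%% CoverageResult is a list of {Partition, Node, Keys} tuples, the
%% shape carried inside the {ok, ...} reply of tanodb:keys/1.
from_coverage(CoverageResult) ->
    %% Flatten the per-vnode key lists into a single list.
    AllKeys = lists:append([Keys || {_Partition, _Node, Keys} <- CoverageResult]),
    %% Using the keys as map keys drops duplicates.
    Unique = maps:from_list([{Key, true} || Key <- AllKeys]),
    %% Sorting is an addition of this sketch, for stable output.
    lists:sort(maps:keys(Unique)).
```

Given the results from three vnodes where <<"patrick">> appears twice, from_coverage/1 returns each key once.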

Test It

As in the previous test, let’s start by listing an empty bucket:

$ http localhost:8080/store/mybucket
HTTP/1.1 200 OK
content-length: 2
content-type: application/json
date: Sat, 31 Oct 2015 14:12:52 GMT
server: Cowboy

[]

Let’s put a value in that bucket:

$ http post localhost:8080/store/mybucket/bob name=bob color=yellow
HTTP/1.1 204 No Content
content-length: 0
content-type: application/json
date: Sat, 31 Oct 2015 14:12:58 GMT
server: Cowboy

And list it again:

$ http localhost:8080/store/mybucket
HTTP/1.1 200 OK
content-length: 7
content-type: application/json
date: Sat, 31 Oct 2015 14:13:00 GMT
server: Cowboy

[
    "bob"
]

Yet another one:

$ http post localhost:8080/store/mybucket/patrick name=patrick color=pink
HTTP/1.1 204 No Content
content-length: 0
content-type: application/json
date: Sat, 31 Oct 2015 14:13:18 GMT
server: Cowboy

List again:

$ http localhost:8080/store/mybucket
HTTP/1.1 200 OK
content-length: 17
content-type: application/json
date: Sat, 31 Oct 2015 14:13:20 GMT
server: Cowboy

[
    "bob",
    "patrick"
]