Metrics
=======

API Metrics
-----------

Since this is meant to be a production system, we shouldn't get much further
without adding metrics. For this we will use `exometer <https://github.com/Feuerlabs/exometer>`_, which is already a dependency of riak_core, so we don't need to add it.

We start by defining a `module named tanodb_metrics <https://github.com/marianoguerra/tanodb/blob/0ea3595aefce0f9098cb651eb33263933ce9d6e7/apps/tanodb/src/tanodb_metrics.erl>`_.

The main functions we care about are:

`init/0 <https://github.com/marianoguerra/tanodb/blob/0ea3595aefce0f9098cb651eb33263933ce9d6e7/apps/tanodb/src/tanodb_metrics.erl#L16>`_
    initializes all the metrics when the app starts. We will add more metrics here as we add more features.

`core_ping/0 <https://github.com/marianoguerra/tanodb/blob/0ea3595aefce0f9098cb651eb33263933ce9d6e7/apps/tanodb/src/tanodb_metrics.erl#L14>`_
    should be called to register metrics about calls to `tanodb:ping/0 <https://github.com/marianoguerra/tanodb/blob/0ea3595aefce0f9098cb651eb33263933ce9d6e7/apps/tanodb/src/tanodb.erl#L15>`_.

`all/0 <https://github.com/marianoguerra/tanodb/commit/0ea3595aefce0f9098cb651eb33263933ce9d6e7#diff-afa3f67ec87f742d64ee9ed311455777R8>`_
    returns the current status of all metrics.

To make the metrics actually work we need to call `tanodb_metrics:init/0 <https://github.com/marianoguerra/tanodb/commit/0ea3595aefce0f9098cb651eb33263933ce9d6e7#diff-4477d4dd0aa2db0e274a56c9158207bdR13>`_ when we start the application and `tanodb_metrics:core_ping/0 <https://github.com/marianoguerra/tanodb/commit/0ea3595aefce0f9098cb651eb33263933ce9d6e7#diff-6f7251bf9e224ebabd766f0331b848adR16>`_ each time tanodb:ping/0 is called.
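
As a rough idea of what such a module can look like, here is a minimal sketch
(not the exact code from the repository linked above). It assumes the ping
metric is an exometer ``spiral`` named ``[tanodb, core, ping]`` with a 60
second time span. The module name ``tanodb_metrics_sketch`` and the helper
``metric_value/1`` are made up for illustration, and the riak_core metrics are
left out.

.. code-block:: erlang

    -module(tanodb_metrics_sketch).
    -export([init/0, core_ping/0, all/0]).

    %% Assumed metric name; exometer metric names are lists of terms.
    -define(METRIC_CORE_PING, [tanodb, core, ping]).

    %% Create the metrics; meant to be called once when the app starts.
    init() ->
        exometer:new(?METRIC_CORE_PING, spiral, [{time_span, 60000}]).

    %% Record one call to ping.
    core_ping() ->
        exometer:update(?METRIC_CORE_PING, 1).

    %% Return the current value of every metric defined in init/0.
    all() ->
        [{core, [{ping, metric_value(?METRIC_CORE_PING)}]}].

    %% Hypothetical helper: fall back to an empty list when the metric
    %% cannot be read.
    metric_value(Name) ->
        case exometer:get_value(Name) of
            {ok, Value} -> Value;
            {error, _Reason} -> []
        end.

A ``spiral`` reports the total as ``count`` and the sum over the last time span
as ``one``, which are the two data points that show up in the shell sessions of
this chapter.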

Test it
.......

Stop, build a release and run the server. I won't tell you how from now on; check the previous chapters if you need a reminder.

On the server shell run:

.. code-block:: erlang

    (tanodb@127.0.0.1)1> tanodb_metrics:all().
    [{tanodb,[
    
     ...

     {core,[{ping,[{count,0},{one,0}]}]}]

    (tanodb@127.0.0.1)2> tanodb:ping().
    {pong,593735040165679310520246963290989976735222595584}

    (tanodb@127.0.0.1)3> tanodb_metrics:all().
    [{tanodb,[

     ...

     {core,[{ping,[{count,1},{one,1}]}]}]

    (tanodb@127.0.0.1)4>

The ``...`` stands in for a lot of metrics about riak_core itself that
are quite useful but not important at this point.

Let's go through the shell session step by step. First we call
``tanodb_metrics:all()`` and get the core ping metrics. In this case count and
one are both 0 since we haven't called ping yet.

.. code-block:: erlang

    (tanodb@127.0.0.1)1> tanodb_metrics:all().
    [{tanodb,[
    
     ...

     {core,[{ping,[{count,0},{one,0}]}]}]

Then we call ping once.

.. code-block:: erlang

    (tanodb@127.0.0.1)2> tanodb:ping().
    {pong,593735040165679310520246963290989976735222595584}

Then we ask for the metrics again and can see that our call was registered.

.. code-block:: erlang

    (tanodb@127.0.0.1)3> tanodb_metrics:all().
    [{tanodb,[

     ...

     {core,[{ping,[{count,1},{one,1}]}]}]

Erlang Runtime Metrics
----------------------

So far we have metrics for riak_core and for our API. It would also be useful
to have some metrics about the Erlang runtime, such as memory, GC, processes
and schedulers. For that we will use a really nice library called `recon <https://github.com/ferd/recon>`_, which unifies all the information gathering behind
a nice API.

We start by `adding recon as a dependency <https://github.com/marianoguerra/tanodb/commit/8d6535f360d24a1486bd7b1ed14d7fcde8c465bb#diff-31d7a50c99c265ca2793c20961b60979R6>`_, then we `create the function tanodb_metrics:node_stats/0 <https://github.com/marianoguerra/tanodb/commit/8d6535f360d24a1486bd7b1ed14d7fcde8c465bb#diff-afa3f67ec87f742d64ee9ed311455777R24>`_ and add it to `tanodb_metrics:all/0 <https://github.com/marianoguerra/tanodb/commit/8d6535f360d24a1486bd7b1ed14d7fcde8c465bb#diff-afa3f67ec87f742d64ee9ed311455777R10>`_.
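
A minimal sketch of what ``node_stats/0`` can look like, assuming it relies on
``recon:node_stats_list/2``, which returns one ``{Absolutes, Increments}`` pair
per sample. Taking a single sample with no wait is an assumption of this
sketch and the module name is hypothetical; the linked commit has the actual
code.

.. code-block:: erlang

    -module(node_stats_sketch).
    -export([node_stats/0]).

    %% Take one sample (repeat = 1, interval = 0 ms) and tag the two
    %% lists recon returns: absolute values and increments.
    node_stats() ->
        [{Abs, Inc}] = recon:node_stats_list(1, 0),
        [{abs, Abs}, {inc, Inc}].

The ``abs`` and ``inc`` tags match the two groups under ``node`` in the output
of ``tanodb_metrics:all()``.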

Test it
.......

Stop, build a release and run. In the shell run:

.. code-block:: erlang

    (tanodb@127.0.0.1)1> tanodb_metrics:all().
    [{tanodb,[

        ...

     {node,[{abs,[{process_count,377},
                  {run_queue,0},
                  {error_logger_queue_len,0},
                  {memory_total,30418240},
                  {memory_procs,11745496},
                  {memory_atoms,458994},
                  {memory_bin,232112},
                  {memory_ets,1470872}]},
            {inc,[{bytes_in,11737},
                  {bytes_out,2470},
                  {gc_count,7},
                  {gc_words_reclaimed,29948},
                  {reductions,2601390},
                  {scheduler_usage,[{1,0.9291112866248371},
                                    {2,0.04754016011809648},
                                    {3,0.04615958261183974},
                                    {4,0.03682005933534583}]}]}]},
     {core,[{ping,[{count,0},{one,0}]}]}]

The metrics should be self-explanatory; check `the recon documentation <http://ferd.github.io/recon/>`_ for details.

Web Server Metrics (Cowboy)
---------------------------

We will start with some generic web server metrics. You can add more specific
ones using what you have learned in this chapter and by reading `the exometer docs <https://github.com/Feuerlabs/exometer/tree/master/doc>`_.

For the generic metrics we will use `cowboy_exometer <https://github.com/marianoguerra/cowboy_exometer>`_ which is a module I just wrote since it was quite generic :)

We start by adding the `cowboy_exometer dependency <https://github.com/marianoguerra/tanodb/commit/8fb792bc01ac58fbdc709a0c9d2f960605255e54#diff-31d7a50c99c265ca2793c20961b60979R7>`_. This module exposes a middleware and a response hook
to register metrics on all requests. To use it we need to `initialize it providing the endpoints we care about <https://github.com/marianoguerra/tanodb/commit/8fb792bc01ac58fbdc709a0c9d2f960605255e54#diff-afa3f67ec87f742d64ee9ed311455777R20>`_, and when we want to collect the metrics we `call cowboy_exometer:stats/1 passing the same endpoints we passed on init <https://github.com/marianoguerra/tanodb/commit/8fb792bc01ac58fbdc709a0c9d2f960605255e54#diff-afa3f67ec87f742d64ee9ed311455777R11>`_.

Finally we need to tell cowboy that we will `add a middleware and a response hook <https://github.com/marianoguerra/tanodb/commit/8fb792bc01ac58fbdc709a0c9d2f960605255e54#diff-4477d4dd0aa2db0e274a56c9158207bdR38>`_.
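
To make the idea of a middleware plus a response hook more concrete, here is a
generic sketch that does not use cowboy_exometer at all. It assumes Cowboy 1.x
(where listeners accept the ``middlewares`` and ``onresponse`` protocol
options). The module name, the metric names and the listener name are all
hypothetical, and the real wiring for tanodb is in the commit linked above.

.. code-block:: erlang

    -module(http_metrics_sketch).
    -behaviour(cowboy_middleware).
    -export([start/1, execute/2, response_hook/4]).

    %% Hypothetical metric names, used only in this sketch.
    -define(REQ_COUNT, [sketch, http, req, count]).
    -define(RESP_COUNT, [sketch, http, resp, count]).

    %% Create the metrics and start a listener that runs our middleware
    %% before the standard ones and calls our hook on every response.
    start(Dispatch) ->
        ok = exometer:new(?REQ_COUNT, spiral, [{time_span, 60000}]),
        ok = exometer:new(?RESP_COUNT, spiral, [{time_span, 60000}]),
        cowboy:start_http(sketch_http, 100, [{port, 8080}],
                          [{env, [{dispatch, Dispatch}]},
                           {middlewares, [?MODULE, cowboy_router,
                                          cowboy_handler]},
                           {onresponse, fun ?MODULE:response_hook/4}]).

    %% Middleware callback: count the request, then let it continue.
    execute(Req, Env) ->
        exometer:update(?REQ_COUNT, 1),
        {ok, Req, Env}.

    %% Response hook: count the response and return the request object.
    response_hook(_Status, _Headers, _Body, Req) ->
        exometer:update(?RESP_COUNT, 1),
        Req.

The middleware sees every request on the way in and the hook sees every
response on the way out, which is what makes per-endpoint counts, timings and
per-status-code counts possible.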

Test it
.......

After all of this, stop, build, run and make some requests:

.. code-block:: sh

    http localhost:8080/ping

.. raw:: latex

    \newpage

Then, on the node shell, ask for the metrics:

.. code-block:: erlang

    (tanodb@127.0.0.1)1> tanodb_metrics:all().
    [{tanodb,[

        ...

     {http,[{resp,[{by_code,[{200,[{count,1},{one,1}]},
                             {201,[{count,0},{one,0}]},
                             {202,[{count,0},{one,0}]},
                             {203,[{count,0},{one,0}]},
                             {204,[{count,0},{one,0}]},
                             {205,[{count,0},{one,0}]},
                             {206,[{count,0},{one,0}]},
                             {300,[{count,0},{one,0}]},
                             {301,[{count,0},{one,0}]},
                             {302,[{count,0},{one,0}]},
                             {303,[{count,0},{one,0}]},
                             {304,[{count,0},{one,0}]},
                             {305,[{count,0},{one,0}]},
                             {306,[{count,0},{one,...}]},
                             {307,[{count,...},{...}]},
                             {308,[{...}|...]},
                             {400,[...]},
                             {401,...},
                             {...}|...]}]},
            {req,[{time,[{<<"ping">>,
                          [{n,3},
                           {mean,44126},
                           {min,44126},
                           {max,44126},
                           {median,44126},
                           {50,0},
                           {75,44126},
                           {90,44126},
                           {95,44126},
                           {99,44126},
                           {999,44126}]}]},
                  {active,[{value,0},{ms_since_reset,11546}]},
                  {count,[{<<"ping">>,[{count,1},{one,1}]}]}]}]},
     {node,[{abs,[{process_count,428},
                  {run_queue,0},
                  {error_logger_queue_len,0},
                  {memory_total,50301760},
                  {memory_procs,30854096},
                  {memory_atoms,471201},
                  {memory_bin,222648},
                  {memory_ets,1574728}]},
            {inc,[{bytes_in,11737},
                  {bytes_out,2470},
                  {gc_count,6},
                  {gc_words_reclaimed,29747},
                  {reductions,2848780},
                  {scheduler_usage,[{1,0.05329944038387727},
                                    {2,0.8991375098414373},
                                    {3,0.03932163131802264},
                                    {4,0.05719991628720056}]}]}]},
     {core,[{ping,[{count,1},{one,1}]}]}]

You can see on this line that I made one request to ping and it returned 200:

.. code-block:: erlang

     {http,[{resp,[{by_code,[{200,[{count,1},{one,1}]},

You can also see request time stats per endpoint:

.. code-block:: erlang

            {req,[{time,[{<<"ping">>,
                          [{n,3},
                           {mean,44126},
                           {min,44126},
                           {max,44126},
                           {median,44126},
                           {50,0},
                           {75,44126},
                           {90,44126},
                           {95,44126},
                           {99,44126},
                           {999,44126}]}]},

And request count by endpoint:

.. code-block:: erlang

                  {count,[{<<"ping">>,[{count,1},{one,1}]}]}]}]},

Exposing Metrics as a REST resource
...................................

This one will be simple. First we `add the route to cowboy <https://github.com/marianoguerra/tanodb/commit/de3dde8187ceefdeb787eb835a6e36e80528de6f#diff-4477d4dd0aa2db0e274a56c9158207bdR33>`_, then we `add the metrics endpoint to the list of endpoints we want to collect metrics <https://github.com/marianoguerra/tanodb/commit/de3dde8187ceefdeb787eb835a6e36e80528de6f#diff-afa3f67ec87f742d64ee9ed311455777R6>`_ for (metricception), and finally `we implement the cowboy handler to return the JSON <https://github.com/marianoguerra/tanodb/blob/de3dde8187ceefdeb787eb835a6e36e80528de6f/apps/tanodb/src/tanodb_http_metrics.erl>`_.
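
A handler for this can be quite small. The following is a sketch, assuming
Cowboy 1.x REST callbacks and `jsx <https://github.com/talentdeficit/jsx>`_ for
JSON encoding (both are assumptions of this example; the linked handler is the
actual implementation). The module name is hypothetical.

.. code-block:: erlang

    -module(http_metrics_handler_sketch).
    -export([init/3, rest_init/2, allowed_methods/2,
             content_types_provided/2, to_json/2]).

    %% Cowboy 1.x: switch this handler to the REST protocol.
    init(_Transport, _Req, _Opts) ->
        {upgrade, protocol, cowboy_rest}.

    %% No per-request state is needed.
    rest_init(Req, _Opts) ->
        {ok, Req, no_state}.

    %% The resource is read only.
    allowed_methods(Req, State) ->
        {[<<"GET">>], Req, State}.

    %% Serve JSON using to_json/2.
    content_types_provided(Req, State) ->
        {[{<<"application/json">>, to_json}], Req, State}.

    %% Encode the metrics proplist as a JSON object.
    to_json(Req, State) ->
        {jsx:encode(tanodb_metrics:all()), Req, State}.

With a route such as ``{"/metrics", http_metrics_handler_sketch, []}`` this
would answer ``GET /metrics``. Depending on the jsx version, some terms in the
metrics (for example integer keys such as status codes) may need to be
converted before encoding.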

Test it
:::::::

Stop, build, start and make some requests:

.. code-block:: sh

    http localhost:8080/ping

And then make a request for the metrics (result edited since it's quite big):

.. code-block:: sh

    http localhost:8080/metrics

.. code-block:: http

    HTTP/1.1 200 OK
    content-length: 8079
    content-type: application/json
    date: Fri, 30 Oct 2015 10:39:27 GMT
    server: Cowboy

    {
        "core": {
            "ping": { "count": 2, "one": 1 }
        },
        "http": {
            "req": {
                "active": { "ms_since_reset": 279958, "value": 1 },
                "count": {
                    "metrics": { "count": 1, "one": 0 },
                    "ping": { "count": 2, "one": 1 }
                },
                "time": {
                    "metrics": {
                        "50": 0,
                        "75": 0,
                        "90": 0,
                        "95": 0,
                        "99": 0,
                        "999": 0,
                        "max": 0,
                        "mean": 0,
                        "median": 0,
                        "min": 0,
                        "n": 0
                    },
                    "ping": {
                        "50": 0,
                        "75": 349,
                        "90": 349,
                        "95": 349,
                        "99": 349,
                        "999": 349,
                        "max": 349,
                        "mean": 349,
                        "median": 349,
                        "min": 349,
                        "n": 3
                    }
                }
            },
            "resp": {
                "by_code": {
                    "200": { "count": 3, "one": 1 },
                    "201": { "count": 0, "one": 0 },
                    ...
                    "400": { "count": 0, "one": 0 },
                    "401": { "count": 0, "one": 0 },
                    ...
                    "404": { "count": 0, "one": 0 },
                    ...
                    "500": { "count": 0, "one": 0 },
                    ...
                }
            }
        },
        "node": {
            "abs": {
                "error_logger_queue_len": 0,
                "memory_atoms": 471362,
                "memory_bin": 224392,
                "memory_ets": 1579592,
                "memory_procs": 31886248,
                "memory_total": 51342840,
                "process_count": 432,
                "run_queue": 0
            },
            "inc": {
                "bytes_in": 0,
                "bytes_out": 0,
                "gc_count": 2,
                "gc_words_reclaimed": 6624,
                "reductions": 695770,
                "scheduler_usage": {
                    "1": 0.16108125753314584,
                    "2": 0.5187896583972728,
                    "3": 0.18046079477682214,
                    "4": 0.15292436095407036
                }
            }
        },
        "tanodb": {
            ...
        }
    }