Metrics support proposal

(Timon) #1

Metrics

Metrics are a useful thing to have in a game engine. We came up with this idea while working on networking: @fletcher, @LucioFranco, and @TimonPost decided that metrics would be useful not only for networking but for the entire engine.

There are different ways to generate metrics, different ways to collect them, and different ways to output them.

Let's put the requirements in a list:

  • Should not impact the performance of systems
  • Should not be blocking
  • Configurable: enabling, disabling, and choosing a service to output the metrics to
  • Statsd support
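
To make the "not blocking" and "configurable" requirements concrete, here is a minimal sketch, assuming a bounded channel between game systems and an output thread. `Sample` and `Recorder` are hypothetical names, not part of Amethyst or any of the crates discussed here. A full buffer drops the sample instead of stalling the caller, and a disabled recorder does nothing.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// A single measurement. The fields are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct Sample {
    pub name: &'static str,
    pub value: f64,
}

/// Handle that game systems would use to record samples.
#[derive(Clone)]
pub struct Recorder {
    // `None` means metrics are disabled and `record` does nothing.
    tx: Option<SyncSender<Sample>>,
}

impl Recorder {
    /// Creates an enabled recorder plus the receiving end, which an output
    /// thread would drain and forward to the configured service.
    pub fn enabled(capacity: usize) -> (Recorder, Receiver<Sample>) {
        let (tx, rx) = sync_channel(capacity);
        (Recorder { tx: Some(tx) }, rx)
    }

    /// Creates a recorder that silently discards everything.
    pub fn disabled() -> Recorder {
        Recorder { tx: None }
    }

    /// Never blocks. Returns `false` when the sample was dropped because
    /// metrics are disabled, the buffer is full, or the receiver is gone.
    pub fn record(&self, name: &'static str, value: f64) -> bool {
        match &self.tx {
            Some(tx) => tx.try_send(Sample { name, value }).is_ok(),
            None => false,
        }
    }
}
```

Dropping samples under load is a design choice in this sketch: it trades completeness of the data for a guarantee that recording never stalls a frame.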

There are a ton of monitoring platforms out there, and we want to support at least some of the popular ones.

Popular platforms:

  1. InfluxDB
  2. Prometheus
  3. Grafana
  4. Graphite

Are there any more platforms we should consider, and which ones have priority?

We also want to support some simpler ways of getting metrics out of your game, like:

  1. Console
  2. Files (csv, xml, etc)
  3. Database (SQLite)

The question here is: which metric outputs should we support at a minimum?

There are a few crates of interest:

  1. Hotmic (475 downloads, last active: 5 months ago, version 0.2.1)
     • Based on crossbeam-channel / mio, so it’s blazingly fast (faster than tic; see rough numbers here)
     • Supports counters, gauges, and histograms
     • Provides dynamic faceting: what portion of metric data should be recorded, and in what way
     • Control mechanism to allow any caller to retrieve metric snapshots at any time
     • Fork of tic, but faster (at least according to its authors)
  2. Dipstick (13,813 downloads, last active: 6 months ago, version 0.6.11)
     • Send metrics to console, log, statsd, graphite or prometheus (one or many)
     • Serve metrics over HTTP
     • Locally aggregate the count, sum, mean, min, max and rate of metric values
     • Publish aggregated metrics, on schedule or programmatically
     • Customize output statistics and formatting
     • Define global or scoped (e.g. per request) metrics
     • Statistically sample metrics (statsd)
     • Choose between sync or async operation
     • Choose between buffered or immediate output
     • Switch between metric backends at runtime
  3. Cadence (18,390 downloads, last active: 13 days ago, version 0.16.11)
     • Support for emitting counters, timers, histograms, gauges, meters, and sets to Statsd over UDP.
     • Support for alternate backends via the MetricSink trait.
     • Support for Datadog style metric tags.
     • A simple yet flexible API for sending metrics.

Which crate to use?

  • Hotmic is fast and has a tokio backend. It is very basic, which makes it easier to customize for our needs. It focuses only on receiving and sending metrics. It does not support statsd, so we would have to use a statsd parser before we can send data.

  • Dipstick already provides backends for Graphite, basic Prometheus, HTTP, and the console, and it works with statsd.

  • Cadence also provides a customizable backend over UDP and is heavily based on statsd.
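
For context on what "statsd support" means on the wire: a StatsD datagram is a short text line of the form `name:value|type`, usually sent over UDP. The sketch below formats such lines by hand and sends them with the standard library only. The `Kind` enum and both function names are hypothetical and do not come from Hotmic, Dipstick, or Cadence.

```rust
use std::io;
use std::net::{ToSocketAddrs, UdpSocket};

/// Metric kinds covered by this sketch, with their StatsD type suffix.
#[derive(Debug, Clone, Copy)]
pub enum Kind {
    Counter,
    Gauge,
    Timer,
}

impl Kind {
    fn suffix(self) -> &'static str {
        match self {
            Kind::Counter => "c",
            Kind::Gauge => "g",
            Kind::Timer => "ms", // timer values are in milliseconds
        }
    }
}

/// Formats one metric as a StatsD line: `name:value|type`.
/// Only unsigned integer values are handled here, to keep the sketch short.
pub fn statsd_line(name: &str, value: u64, kind: Kind) -> String {
    format!("{}:{}|{}", name, value, kind.suffix())
}

/// Sends one line as a single UDP datagram to a StatsD daemon at `addr`.
/// UDP is fire-and-forget: a daemon that is down does not stall the game.
pub fn send_statsd<A: ToSocketAddrs>(
    socket: &UdpSocket,
    addr: A,
    line: &str,
) -> io::Result<usize> {
    socket.send_to(line.as_bytes(), addr)
}
```

For example, `statsd_line("amethyst.frame_time", 16, Kind::Timer)` yields `amethyst.frame_time:16|ms`. The metric name is made up for illustration.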

Any ideas on how to integrate metrics into Amethyst, and on a possible design?

We think it would be a good idea to put metrics in a separate Amethyst crate, like ‘amethyst_metrics’.

(Fletcher) #2

Yay!

I’m going to ignore the crate stuff for the moment and focus on this:

There are a few things to consider:

  1. Metrics within the engine itself
  2. How developers can add their own metrics
  3. What metrics to track

Latency

Latency in a system is almost always the earliest and most reliable indicator that something is wrong somewhere.

Starting Off

To start us off, I would use conditional compilation, pick some function calls, and time their execution. Start a timer when a function is called, stop it just before the function returns, and send the result to a metrics sink.

I suggest conditional compilation because this will have a performance impact.
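
A minimal sketch of that approach, assuming a Cargo feature named `metrics` (hypothetical) and a hypothetical `TimingSink` trait. With the feature off, `timed` compiles down to a plain call of the closure, so release builds do not pay for the timer.

```rust
use std::time::Duration;

/// Hypothetical sink. A real one would forward to a metrics backend.
pub trait TimingSink {
    fn record(&mut self, name: &'static str, elapsed: Duration);
}

/// Test-friendly sink that just collects the measurements.
#[derive(Default)]
pub struct VecSink(pub Vec<(&'static str, Duration)>);

impl TimingSink for VecSink {
    fn record(&mut self, name: &'static str, elapsed: Duration) {
        self.0.push((name, elapsed));
    }
}

/// With the `metrics` feature: start a timer, run `f`, stop the timer just
/// before returning, and send the elapsed time to the sink.
#[cfg(feature = "metrics")]
pub fn timed<S: TimingSink, T>(sink: &mut S, name: &'static str, f: impl FnOnce() -> T) -> T {
    let start = std::time::Instant::now();
    let result = f();
    sink.record(name, start.elapsed());
    result
}

/// Without the feature: run `f` directly and record nothing.
#[cfg(not(feature = "metrics"))]
#[inline(always)]
pub fn timed<S: TimingSink, T>(_sink: &mut S, _name: &'static str, f: impl FnOnce() -> T) -> T {
    f()
}
```

A call site would look like `timed(&mut sink, "physics_step", || step_physics())`, where `step_physics` stands in for whichever function we decide to measure.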

Systems

Once we get that working, I’d suggest we implement OpenTracing, so we can follow requests through Systems. The UI system with @happenslol's re-trigger system would be a good first candidate, because we can then follow each UI draw request and the latency at each step.
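
To illustrate the idea behind tracing (this is not the OpenTracing API, only a hand-rolled sketch with hypothetical names): every step of a request gets a span, child spans share the trace ID of the request and point at their parent, and each span records how long it took.

```rust
use std::time::{Duration, Instant};

/// One timed step of a request.
#[derive(Debug)]
pub struct Span {
    pub trace_id: u64,          // shared by all spans of one request
    pub span_id: u64,           // unique per span
    pub parent_id: Option<u64>, // `None` for the root span
    pub operation: &'static str,
    pub duration: Option<Duration>, // set when the span is finished
    start: Instant,
}

/// Hands out IDs and collects finished spans.
#[derive(Default)]
pub struct Tracer {
    next_id: u64,
    pub finished: Vec<Span>,
}

impl Tracer {
    fn fresh_id(&mut self) -> u64 {
        self.next_id += 1;
        self.next_id
    }

    /// Starts the root span of a new trace, e.g. one UI draw request.
    pub fn start_trace(&mut self, operation: &'static str) -> Span {
        let id = self.fresh_id();
        Span { trace_id: id, span_id: id, parent_id: None, operation, duration: None, start: Instant::now() }
    }

    /// Starts a span for a step that happens inside `parent`.
    pub fn start_child(&mut self, parent: &Span, operation: &'static str) -> Span {
        Span {
            trace_id: parent.trace_id,
            span_id: self.fresh_id(),
            parent_id: Some(parent.span_id),
            operation,
            duration: None,
            start: Instant::now(),
        }
    }

    /// Stops the span's timer and stores it for later export.
    pub fn finish(&mut self, mut span: Span) {
        span.duration = Some(span.start.elapsed());
        self.finished.push(span);
    }
}
```

Following a UI draw request would then mean starting a trace when the request is created and a child span in each System that handles it.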

(Lucio Franco) #3

I think this is a great idea! There are a few things to think about, similar to what Fletcher said. One goal would be to get metrics from your bundles, and to provide traits that bundle/system users can use.

I think there are some crates out there that we might be able to take inspiration from:

will post more later

I think it would be a good idea to present metrics up front on how well your systems are running. That would be very powerful.

As for metrics, I would like to do some more research, but I wonder if we can provide support without even thinking about which crate to use. I would like to spend some time coming up with some traits that could be injected into amethyst_core.

On a basic level, I would like to define what types of metrics we want, and then define either one large trait for them or smaller traits that can be combined.
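
A rough sketch of the "smaller traits that can be combined" option. All names here are hypothetical and nothing like this exists in amethyst_core yet. The combined `Metrics` trait is implemented automatically for anything that implements the three small ones, so a backend crate only has to provide the pieces.

```rust
use std::collections::HashMap;
use std::time::Duration;

// Small, single-purpose traits (hypothetical names).
pub trait Counter {
    fn increment(&mut self, name: &'static str, by: u64);
}
pub trait Gauge {
    fn set_gauge(&mut self, name: &'static str, value: f64);
}
pub trait Timer {
    fn record_time(&mut self, name: &'static str, elapsed: Duration);
}

/// The "one large trait" view, built from the small ones.
pub trait Metrics: Counter + Gauge + Timer {}
impl<T: Counter + Gauge + Timer> Metrics for T {}

/// In-memory backend, handy for tests. A real backend could wrap any crate.
#[derive(Default)]
pub struct InMemoryMetrics {
    pub counters: HashMap<&'static str, u64>,
    pub gauges: HashMap<&'static str, f64>,
    pub timings: Vec<(&'static str, Duration)>,
}

impl Counter for InMemoryMetrics {
    fn increment(&mut self, name: &'static str, by: u64) {
        *self.counters.entry(name).or_insert(0) += by;
    }
}
impl Gauge for InMemoryMetrics {
    fn set_gauge(&mut self, name: &'static str, value: f64) {
        self.gauges.insert(name, value);
    }
}
impl Timer for InMemoryMetrics {
    fn record_time(&mut self, name: &'static str, elapsed: Duration) {
        self.timings.push((name, elapsed));
    }
}

/// Example system code that only asks for the capabilities it needs.
pub fn spawn_wave<M: Counter + Gauge>(metrics: &mut M, enemies: u64) {
    metrics.increment("enemies_spawned", enemies);
    metrics.set_gauge("last_wave_size", enemies as f64);
}
```

Because `spawn_wave` is generic over the traits, the choice of crate behind them can be made later without touching system code.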

(Fletcher) #4

This drastically expands the scope. We need to stop doing that and ending up with half-finished projects lying around.

(Lucio Franco) #5

:+1: all for it, let's do tracing first.
