util.openmetrics

OpenMetrics

This module provides a (currently partial) implementation of the OpenMetrics Internet Draft. The intent is for it to be used with community modules such as mod_prometheus to provide rich metrics, while still allowing existing code to provide measurements without any changes.

User/admin documentation

This documentation is relevant for developers of Prosody or developers of community modules. For documentation for users of Prosody (i.e. admins), please refer to our documentation on configuring statistics.

Implementation coverage

Currently, the following concepts of the OpenMetrics Internet Draft are supported:

  • Gauge metric families
  • Counter metric families
  • Histogram metric families

All metric families support labels. Exemplars and explicit timestamping are not supported.

Exposition serialisation is out of scope for this module.

API reference

Note: Module developers should look at the module:metric() method, which provides a nicer interface and automatically scopes metric names to the module.

Helper functions

timed(metric)

Helper function to measure the time it takes to do something.

timed(metric) starts timing immediately and returns a function that you must call to stop timing. If the returned function is never called, the time is not recorded in the metric.

metric should be either a histogram, a summary or a gauge. Using timed with a counter is a bad idea.

local timed = require "util.openmetrics".timed
local metric = module:metric("histogram", "event_duration", "seconds"):with_labels()

function foo_event()
    local mark_done = timed(metric)
    -- do some heavy lifting here
    mark_done()
end
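
The pattern in the example above can be modelled in plain Lua without Prosody. This is only a minimal sketch under stated assumptions, not the module's code: timed_sketch, the injected now clock and the stub fake_histogram are hypothetical names, and the sketch assumes that the metric exposes either a sample or a set method.

```lua
-- Hypothetical stand-in for timed(); not the util.openmetrics implementation.
-- `now` is an injected clock function so the sketch runs without Prosody.
-- os.clock (CPU time) is only a placeholder default for this sketch.
local function timed_sketch(metric, now)
    now = now or os.clock
    -- Histograms and summaries take samples, gauges are set directly.
    local submit = assert(metric.sample or metric.set, "metric cannot record a duration")
    local started = now()
    return function()
        submit(metric, now() - started)
    end
end

-- Stub metric that just remembers the samples it was given.
local fake_histogram = { samples = {} }
function fake_histogram:sample(v)
    self.samples[#self.samples + 1] = v
end

-- Fake clock that advances by 0.25 on every call, for a predictable result.
local ticks = 0
local function fake_clock()
    ticks = ticks + 0.25
    return ticks
end

local mark_done = timed_sketch(fake_histogram, fake_clock)
-- Nothing is recorded until mark_done() is called.
local recorded_before = #fake_histogram.samples
mark_done()
```

Injecting the clock keeps the sketch testable; inside Prosody you would simply use the timed helper from util.openmetrics.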

Metric registry

local registry = require "util.openmetrics".new();

registry.<type>(name, unit, description, labels, extra)

Create a new metric family of the given type. For allowed types, see below.

  • name must be the metric name.
  • unit may be the unit of the metric or an empty string if no unit should be announced.
  • description may be a human-readable tooltip-like description of the metric.
  • labels must be an array of label keys to announce. May be nil/empty if no labels are needed on the metric.
  • extra is type-specific extra configuration; see below.

Together, name and unit must be unique within a registry.

type may be one of:

  • counter: Creates a Counter type metric family

  • gauge: Creates a Gauge type metric family

  • histogram: Creates a Histogram type metric family

    Histograms require an array of bucket boundaries in extra.buckets. The implicit +Inf bucket boundary must not be included. The boundaries MUST be sorted in ascending order.

Returns the metric family.
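
Because the bucket rules for histograms are easy to get wrong, it may help to check a bucket list before passing it as extra.buckets. The helper below is a hypothetical sketch: check_buckets is not part of util.openmetrics, and rejecting duplicate boundaries is an assumption of this sketch rather than a documented rule.

```lua
-- Hypothetical helper, not part of util.openmetrics: checks the bucket rules
-- stated above before the array is passed as extra.buckets.
local function check_buckets(buckets)
    if type(buckets) ~= "table" then
        return false, "buckets must be an array"
    end
    for i, bound in ipairs(buckets) do
        if type(bound) ~= "number" then
            return false, "bucket boundaries must be numbers"
        end
        if bound == math.huge then
            return false, "the implicit +Inf boundary must not be included"
        end
        -- Assumption: duplicates are treated like a sorting error.
        if i > 1 and bound <= buckets[i - 1] then
            return false, "bucket boundaries must be sorted in ascending order"
        end
    end
    return true
end

local ok = check_buckets({ 0.001, 0.01, 0.1, 1.0, 10.0, 100.0 })
local unsorted_ok, unsorted_err = check_buckets({ 1.0, 0.1 })
local inf_ok, inf_err = check_buckets({ 1.0, math.huge })
```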

registry.get_metric_families()

Return the table which maps metric names to metric family objects.

This can be used to implement exposition.

Metric Family

local family = module:metric("counter", "stanzas_received", "")

All metric family objects implement at least the following methods:

family:with_labels(…)

Obtain the metric with the given set of labels.

Warning: Each label value will create a new metric internally and in downstream monitoring systems. You MUST NOT use attacker-controlled strings (such as Service Discovery features, XML namespaces, …) or high-cardinality values (such as user names) as label values, unless you like resource exhaustion denial of service attacks.

The number of arguments must be equal to the number of label keys passed to the constructor of the metric initially, otherwise an error is raised.

Returns a Metric object matching the type of the family.

Note: Even if no label keys are associated with a metric family, it is still required to call :with_labels() (without any arguments) to obtain the metric object.

family:with_partial_label(value)

Return a wrapper around the metric family which has the first label set to the given value.

In practice, this means that family:with_partial_label(foo):with_labels(...) expects one fewer argument in the call to with_labels. Calls to with_partial_label can be chained.

Note that the wrapper only supports the with_labels and with_partial_label methods; the other methods of the family are not supported.
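
The partial-label behaviour can be modelled in plain Lua. The following is a hedged sketch, not the module's implementation: new_stub_family and partial are hypothetical names, the stub family merely records the label values it receives, and the label keys mentioned in the comments are illustrative.

```lua
-- Stub family used only for this sketch; it records the label values it gets
-- and raises an error if the number of values does not match.
local function new_stub_family(label_count)
    local family = {}
    function family:with_labels(...)
        assert(select("#", ...) == label_count, "wrong number of label values")
        return { labels = { ... } }
    end
    return family
end

-- Hypothetical model of with_partial_label(): the wrapper remembers the
-- leading label values and prepends them when with_labels() is finally called.
local function partial(family, fixed)
    local wrapper = {}
    function wrapper:with_labels(...)
        local values = {}
        for i = 1, #fixed do values[i] = fixed[i] end
        local n = select("#", ...)
        for i = 1, n do values[#fixed + i] = (select(i, ...)) end
        local unpack = table.unpack or unpack -- Lua 5.1 compatibility
        return family:with_labels(unpack(values, 1, #fixed + n))
    end
    function wrapper:with_partial_label(value)
        local extended = {}
        for i = 1, #fixed do extended[i] = fixed[i] end
        extended[#fixed + 1] = value
        return partial(family, extended)
    end
    return wrapper
end

-- Three label keys, e.g. host, ip_family and type (illustrative names).
local family = new_stub_family(3)
local metric = partial(family, {})
    :with_partial_label("example.com")
    :with_partial_label("IPv4")
    :with_labels("incoming")
```

Note that the wrapper in this sketch, like the one described above, only offers with_labels and with_partial_label.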

family:reset()

Reset all metrics in the metric family to their initial value.

The main use case for this method is when metrics are used as accumulators during periodic statistics collection. Example: mod_s2s counts connections by host, IP family and type. To avoid having to gather them in a separate table before submitting the numbers to the metric, it can use :reset() and then :add() on the respective metrics to count them up to the desired number.

Note: When using metrics as accumulators, make sure to cork() the backend and uncork it afterwards to avoid spamming push-based backends with many small updates and the increased runtime cost coming with that.

Note: Avoid using :reset() with non-Gauge metric families to avoid loss of precision in derived calculations.

family:iter_metrics()

Return an iterator which yields all metrics as pairs of their label values and the Metric object.

Metrics

All metrics share a common API. However, data ingestion is specialized for each metric type; see the sections on the individual metric types below.

This section only describes the common API.

local family = module:metric("counter", "stanzas_received", "")
local metric = family:with_labels()

metric:iter_samples()

Return an iterator which yields all samples of the metric. This iterates over the contents of the current MetricPoint.

Note: Not all backends buffer the metric data in-memory (e.g. statsd does not). Those backends do not offer iteration; the iterator is implemented, but does not provide any samples.

The values yielded by the iterator are tuples of suffix, labels, value.

  • suffix: Is a string which should be appended to the metric family name (e.g. _total and _created for counter metrics)
  • labels: Is a table of fixed key/value pairs which should be added to the labels of the metric on exposition (used by histograms for the le label)
  • value: The current numeric value of the sample.
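
To illustrate how the suffix, labels, value tuples might be consumed, here is a minimal sketch that turns the samples of one metric into text lines. It makes several assumptions: stub_metric is a hand-written stand-in for a real metric, render_labels and render_metric are hypothetical helpers, the sample values are made up, and neither the metric's own label values nor label value escaping are handled.

```lua
-- Stub counter metric for this sketch; a real metric comes from
-- family:with_labels(). It yields (suffix, labels, value) tuples as described.
local stub_metric = {}
function stub_metric:iter_samples()
    local samples = {
        { "_total", {}, 42 },
        { "_created", {}, 1600000000 },
    }
    local i = 0
    return function()
        i = i + 1
        local s = samples[i]
        if s then return s[1], s[2], s[3] end
    end
end

-- Render a labels table as {k="v",...} with keys sorted for stable output.
-- Label value escaping is omitted for brevity.
local function render_labels(labels)
    local keys = {}
    for k in pairs(labels) do keys[#keys + 1] = k end
    if #keys == 0 then return "" end
    table.sort(keys)
    local parts = {}
    for i, k in ipairs(keys) do
        parts[i] = string.format('%s="%s"', k, tostring(labels[k]))
    end
    return "{" .. table.concat(parts, ",") .. "}"
end

-- Turn every sample of one metric into a text line.
local function render_metric(family_name, metric)
    local lines = {}
    for suffix, labels, value in metric:iter_samples() do
        lines[#lines + 1] = family_name .. suffix
            .. render_labels(labels or {}) .. " " .. tostring(value)
    end
    return lines
end

local lines = render_metric("stanzas_received", stub_metric)
```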

Counter Metrics

local family = module:metric("counter", "stanzas_received", "")
local counter = family:with_labels()

counter:set(v)

Set the counter value to v.

This is offered as a compatibility API for cases where absolute counter values are taken from other sources (such as OS monotonic clocks, CPU usage, interface byte counters, etc.).

counter:add(v)

Add v to the counter value.

v MUST NOT be negative.

Gauge Metrics

local family = module:metric("gauge", "connections", "")
local gauge = family:with_labels()

gauge:set(v)

Set the gauge to v.

gauge:add(delta)

Add delta to the value of the gauge.

delta MAY be negative.

Histogram Metrics

-- Arguments: type, name, unit, description, label keys, extra configuration
local family = module:metric("histogram", "event_duration", "seconds", "Duration of events", {}, {
    buckets = { 0.001, 0.01, 0.1, 1.0, 10.0, 100.0 },
})
local histogram = family:with_labels()

The buckets must be provided and must be sorted in ascending order. The implicit +Inf bucket must not be provided.

histogram:sample(v)

Add the sample v to the histogram.

This increases the counters of the corresponding buckets by one, as well as the overall sum by v.
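
The bucket update can be sketched as follows, assuming the OpenMetrics convention of cumulative buckets, where every bucket whose upper boundary is at least v counts the sample. new_histogram_sketch and its fields are hypothetical; the internal representation in util.openmetrics may differ.

```lua
-- Hypothetical model of a histogram with cumulative buckets; names and layout
-- are illustrative and not taken from util.openmetrics.
local function new_histogram_sketch(buckets)
    local h = { buckets = buckets, counts = {}, inf = 0, sum = 0 }
    for i = 1, #buckets do h.counts[i] = 0 end
    function h:sample(v)
        -- Every bucket whose upper bound is >= v counts the sample.
        for i, bound in ipairs(self.buckets) do
            if v <= bound then
                self.counts[i] = self.counts[i] + 1
            end
        end
        -- The implicit +Inf bucket counts every sample.
        self.inf = self.inf + 1
        self.sum = self.sum + v
    end
    return h
end

local h = new_histogram_sketch({ 0.1, 1.0, 10.0 })
h:sample(0.05) -- counted in all three buckets
h:sample(0.5)  -- counted in the 1.0 and 10.0 buckets
h:sample(50)   -- counted only in the implicit +Inf bucket
```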

Summary Metrics

-- Arguments: type, name, unit, description, label keys, extra configuration
local family = module:metric("summary", "words", "", "Number of words", {}, {
    quantiles = { 1.0, 0.5, 0.0 },
    interval = 60,
})
local summary = family:with_labels()

Note: Currently, no backend implementation supports quantiles and they are silently ignored. Passing quantiles is optional, but if quantiles are passed, the interval must be passed, too.

summary:sample(v)

Add the sample v to the summary.

The quantiles, if supported and given, are automatically recalculated once per interval.