
* Metrics format: {code}<metric name>{<label name>=<label value>, ...}{code}
Be careful when passing strings as label values (quote from [Pierre Vincent's blog|https://pierrevincent.github.io/2017/12/prometheus-blog-series-part-1-metrics-and-labels/]):
{quote}
A word on label cardinality
Labels are really powerful so it can be tempting to annotate each metric with very specific information, however there are some important limitations to what should be used for labels.
Prometheus considers each unique combination of labels and label value as a different time series. As a result if a label has an unbounded set of possible values, Prometheus will have a very hard time storing all these time series. In order to avoid performance issues, labels should not be used for high cardinality data sets (e.g. Customer unique ids).
{quote}
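To make the cardinality point concrete, here is a small illustrative Python sketch (not Prometheus code, just a model of its storage behavior): every distinct combination of metric name and label values becomes its own time series.

{code}
# Illustrative sketch: Prometheus stores one time series per unique
# combination of metric name and label values.
series = {}

def record(metric, value, **labels):
    # The series key is the metric name plus the sorted label pairs.
    key = (metric, tuple(sorted(labels.items())))
    series.setdefault(key, []).append(value)

# A bounded label ("component") yields a handful of series...
for component in ["bc-soap", "bc-rest", "se-camel"]:
    record("queued_requests", 0.0, component=component)

# ...but an unbounded label (e.g. a customer id) yields one series
# per distinct value, which does not scale.
for customer_id in range(10_000):
    record("queued_requests", 0.0, customer=str(customer_id))

print(len(series))  # 3 + 10000 = 10003 distinct series
{code}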
h2. Configuration samples:
The following samples are produced monitoring a Petals ESB single container topology hosting 3 components (SOAP, REST and Camel).
Raw metrics can be hard to exploit when the exporter creates them automatically. For example, with a wildcard pattern rule:
{code}
rules:
- pattern: ".*"
{code}
Raw metrics sample:
{code}
# metric: java.lang<type=OperatingSystem><>SystemCpuLoad
java_lang_OperatingSystem_SystemCpuLoad 0.10240228944418933
# metric: java.lang<type=OperatingSystem><>ProcessCpuLoad
java_lang_OperatingSystem_ProcessCpuLoad 3.158981547513337E-4
# metrics: org.ow2.petals<type=custom, name=monitoring_petals-(se-camel | bc-soap | bc-rest)><>MessageExchangeProcessorThreadPoolQueuedRequestsMax
org_ow2_petals_custom_MessageExchangeProcessorThreadPoolQueuedRequestsMax{name="monitoring_petals-se-camel",} 0.0
org_ow2_petals_custom_MessageExchangeProcessorThreadPoolQueuedRequestsMax{name="monitoring_petals-bc-soap",} 0.0
org_ow2_petals_custom_MessageExchangeProcessorThreadPoolQueuedRequestsMax{name="monitoring_petals-bc-rest",} 0.0
{code}
In this case, we cannot tell later in Prometheus where the metrics originated or which Petals ESB container is concerned. By adding a few generic rules, we can add labels and control the metric names.
h3. Adding generic rules
In this example, the point of our rules is to:
* gather _java.lang_ metrics, name each metric after the explicit MBean attribute, and label it by type.
* gather component metrics, name each metric after the explicit MBean attribute, and label it in a usable way with the component and the type (monitoring or runtime_configuration).
Generic rules samples:
{code}
rules:
  - pattern: 'java.lang<type=(.+)><>(.+): (.+)'
    name: "$2"
    value: "$3"
    labels:
      type: "$1"
  - pattern: 'org.ow2.petals<type=custom, name=monitoring_(.+)><>(.+): (.+)'
    name: "$2"
    value: "$3"
    labels:
      type: "monitoring"
      component: "$1"
  - pattern: 'org.ow2.petals<type=custom, name=runtime_configuration_(.+)><>(.+): (.+)'
    name: "$2"
    value: "$3"
    labels:
      type: "runtime_config"
      component: "$1"
{code}
Metrics parsed by generic rules:
{code}
ProcessCpuLoad{type="OperatingSystem",} 2.5760609293017057E-4
SystemCpuLoad{type="OperatingSystem",} 0.10177234194298118
MessageExchangeProcessorThreadPoolQueuedRequestsMax{component="petals-bc-soap",type="monitoring",} 0.0
MessageExchangeProcessorThreadPoolQueuedRequestsMax{component="petals-se-camel",type="monitoring",} 0.0
MessageExchangeProcessorThreadPoolQueuedRequestsMax{component="petals-bc-rest",type="monitoring",} 0.0
{code}
h3. Adding specific rules
You can go further by adding rules for specific MBeans. Here we will:
* group *SystemCpuLoad* and *ProcessCpuLoad* into a single metric.
* rename *MessageExchangeProcessorThreadPoolQueuedRequestsMax* into a shorter metric, while keeping the full name as a label and in the help text.
{code}
  - pattern: 'java.lang<type=OperatingSystem><>SystemCpuLoad: (.*)'
    name: CpuLoad
    value: "$1"
    labels:
      type: "OperatingSystem"
      target: "system"
  - pattern: 'java.lang<type=OperatingSystem><>ProcessCpuLoad: (.*)'
    name: CpuLoad
    value: "$1"
    labels:
      type: "OperatingSystem"
      target: "process"
  - pattern: 'org.ow2.petals<type=custom, name=monitoring_(.+)><>MessageExchangeProcessorThreadPoolQueuedRequestsMax: (.+)'
    name: "MEPTP_QueuedRequests_Max"
    value: "$2"
    help: "MessageExchangeProcessorThreadPoolQueuedRequestsMax"
    labels:
      type: "monitoring"
      mbean: "MessageExchangeProcessorThreadPoolQueuedRequestsMax"
      component: "$1"
{code}
Metrics parsed by specific rules:
{code}
CpuLoad{target="system",type="OperatingSystem",} 0.10234667681404555
CpuLoad{target="process",type="OperatingSystem",} 2.655985589352835E-4
MEPTP_QueuedRequests_Max{component="petals-bc-soap",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring",} 0.0
MEPTP_QueuedRequests_Max{component="petals-se-camel",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring",} 0.0
MEPTP_QueuedRequests_Max{component="petals-bc-rest",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring",} 0.0
{code}
You can mix generic and specific patterns, but remember that they are applied in order, so *always put specific rules first\!*
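For instance, a combined configuration keeping the specific _CpuLoad_ rules ahead of the generic _java.lang_ rule could look like this (a sketch reusing the rules above; the generic rule then only catches attributes the specific ones did not match):

{code}
rules:
  # Specific rules first: they take precedence for the CPU load attributes
  - pattern: 'java.lang<type=OperatingSystem><>SystemCpuLoad: (.*)'
    name: CpuLoad
    value: "$1"
    labels:
      type: "OperatingSystem"
      target: "system"
  - pattern: 'java.lang<type=OperatingSystem><>ProcessCpuLoad: (.*)'
    name: CpuLoad
    value: "$1"
    labels:
      type: "OperatingSystem"
      target: "process"
  # Generic rule last: catches every other java.lang attribute
  - pattern: 'java.lang<type=(.+)><>(.+): (.+)'
    name: "$2"
    value: "$3"
    labels:
      type: "$1"
{code}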
h1. Configuring Prometheus
h2. Configuration file
Prometheus can be configured to connect statically or dynamically to metrics sources; these configurations live under the *scrape_configs* section of the YAML configuration file.
Depending on how you manage your machines, Prometheus can be connected dynamically to several service discovery systems, including Azure, Consul, EC2, OpenStack, GCE, Kubernetes, Marathon, AirBnB's Nerve, Zookeeper Serverset and Triton.
You can also rely on DNS-based service discovery, which lets you specify a set of DNS domain names that are periodically queried to discover a list of targets.
Here we are going to demonstrate static configuration ([static_configs|https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cstatic_config%3E]), specifying a set of targets with direct connection. Note that targets can also be factored out into a file, using [file_sd_config|https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cfile_sd_config%3E].
For the following sample, we are connecting to two Petals container instances, both running locally, on ports 8585 and 8686.
Sample static config:
{code}
scrape_configs:
  - job_name: 'petals monitoring'
    static_configs:
      - targets: ['localhost:8585']
        labels:
          container: 'petals-sample-0'
      - targets: ['localhost:8686']
        labels:
          container: 'petals-sample-1'
{code}
We label each target individually to help differentiate them. Prometheus adds these labels and the job name from the configuration, plus an _instance_ label for each source. Keeping on with our previous examples, this produces the following metrics in the Prometheus interface:
{code}
CpuLoad{container="petals-sample-0",instance="localhost:8585",job="petals monitoring",target="process",type="OperatingSystem"} 0.007285089849441476
CpuLoad{container="petals-sample-0",instance="localhost:8585",job="petals monitoring",target="system",type="OperatingSystem"} 0.2049538610976202
CpuLoad{container="petals-sample-1",instance="localhost:8686",job="petals monitoring",target="process",type="OperatingSystem"} 0.022037218413320275
CpuLoad{container="petals-sample-1",instance="localhost:8686",job="petals monitoring",target="system",type="OperatingSystem"} 0.22624877571008814
MEPTP_QueuedRequests_Max{component="petals-bc-rest",container="petals-sample-0",instance="localhost:8585",job="petals monitoring",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring"} 0
MEPTP_QueuedRequests_Max{component="petals-bc-rest",container="petals-sample-1",instance="localhost:8686",job="petals monitoring",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring"} 0
MEPTP_QueuedRequests_Max{component="petals-bc-soap",container="petals-sample-0",instance="localhost:8585",job="petals monitoring",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring"} 0
MEPTP_QueuedRequests_Max{component="petals-bc-soap",container="petals-sample-1",instance="localhost:8686",job="petals monitoring",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring"} 0
MEPTP_QueuedRequests_Max{component="petals-se-camel",container="petals-sample-0",instance="localhost:8585",job="petals monitoring",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring"} 0
MEPTP_QueuedRequests_Max{component="petals-se-camel",container="petals-sample-1",instance="localhost:8686",job="petals monitoring",mbean="MessageExchangeProcessorThreadPoolQueuedRequestsMax",type="monitoring"} 0
{code}
There is also the option to define multiple instances in the same targets list:
{code}
scrape_configs:
  - job_name: 'petals monitoring'
    static_configs:
      - targets: ['localhost:8585','localhost:8686']
        labels:
          container: 'petals-samples'
{code}
{code}
CpuLoad{container="petals-samples",instance="localhost:8585",job="petals monitoring",target="process",type="OperatingSystem"} 0.007285089849441476
CpuLoad{container="petals-samples",instance="localhost:8585",job="petals monitoring",target="system",type="OperatingSystem"} 0.2049538610976202
CpuLoad{container="petals-samples",instance="localhost:8686",job="petals monitoring",target="process",type="OperatingSystem"} 0.022037218413320275
CpuLoad{container="petals-samples",instance="localhost:8686",job="petals monitoring",target="system",type="OperatingSystem"} 0.22624877571008814
{code}
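With these labels in place, PromQL queries can slice and aggregate across containers. A couple of illustrative queries based on the metrics above:

{code}
# Average system CPU load across all scraped containers
avg(CpuLoad{target="system"})

# Maximum queued requests per component, across containers
max by (component) (MEPTP_QueuedRequests_Max)
{code}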
More information in the [Prometheus documentation|https://prometheus.io/docs/prometheus/latest/configuration/configuration/].
h2. Reload configuration
If the scraping configuration is not set dynamically, you can change the configuration and make Prometheus reload the file.
h3. Remote command
There are two ways to ask Prometheus to reload its configuration remotely:
Send a *SIGHUP*: determine the [process id|https://www.digitalocean.com/community/tutorials/how-to-use-ps-kill-and-nice-to-manage-processes-in-linux] of Prometheus (look in _'var/run/prometheus.pid'_, or use tools such as _'pgrep'_ or _'ps aux \| grep prometheus'_). Then use the kill command to send the signal:
{code}kill -HUP 1234{code}
Or, send a *HTTP POST* to the Prometheus web server _'/-/reload'_ handler:
{code}curl -X POST http://localhost:9090/-/reload{code}
Note: as of Prometheus 2.0, to reload over HTTP the _'--web.enable-lifecycle'_ command-line flag must be set.
In any case, Prometheus should acknowledge the reload:
{code}
level=info ts=2018-10-01T14:57:17.292032129Z caller=main.go:624 msg="Loading configuration file" filename=prometheus.yml
level=info ts=2018-10-01T14:57:17.293868363Z caller=main.go:650 msg="Completed loading of configuration file" filename=prometheus.yml
{code}
h3. File configuration
As mentioned in the [documentation on file configuration|https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cfile_sd_config%3E], using this method allows the targets to be reloaded automatically and periodically.
{quote}
Changes to all defined files are detected via disk watches and applied immediately. Files may be provided in YAML or JSON format. Only changes resulting in well-formed target groups are applied.
\[. . .\]
As a fallback, the file contents are also re-read periodically at the specified refresh interval.
{quote}
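A minimal sketch of such a setup (file names are hypothetical): the scrape configuration references a targets file, which can then be edited without touching the main configuration:

{code}
scrape_configs:
  - job_name: 'petals monitoring'
    file_sd_configs:
      - files:
          - 'targets/petals-*.json'
        refresh_interval: 5m
{code}
With a targets file such as _targets/petals-sample.json_:
{code}
[
  {
    "targets": ["localhost:8585"],
    "labels": { "container": "petals-sample-0" }
  }
]
{code}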
h1. Visualizing monitored metrics
h2. Prometheus API
The Prometheus server is reachable through its [HTTP API|https://prometheus.io/docs/prometheus/latest/querying/api]. It allows querying metrics directly and can be useful in specific cases.
For instance, by requesting */api/v1/targets* you can get an overview of configured targets and their health in JSON format.
request: {code}curl -X GET http://localhost:9090/api/v1/targets{code}
response: [^prometheus_get-api-targets.json].
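As a sketch of how such a response can be post-processed, the following Python snippet lists each target's container label, scrape URL and health from a simplified payload shaped like the _/api/v1/targets_ response (the values here are made up for illustration):

{code}
import json

# Simplified, hypothetical payload in the shape returned by /api/v1/targets
payload = json.loads("""
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"scrapeUrl": "http://localhost:8585/metrics", "health": "up",
       "labels": {"container": "petals-sample-0", "job": "petals monitoring"}},
      {"scrapeUrl": "http://localhost:8686/metrics", "health": "down",
       "labels": {"container": "petals-sample-1", "job": "petals monitoring"}}
    ]
  }
}
""")

# One line per target: container label, scrape URL and health
for target in payload["data"]["activeTargets"]:
    print(target["labels"]["container"], target["scrapeUrl"], target["health"])
{code}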
However, there are simpler solutions, such as the web UI already built into the Prometheus server, or open-source software natively compatible with this API (like [Grafana|https://grafana.com]).
h2. Prometheus web UI
This UI is accessible by connecting to the Prometheus server at */graph*; in our example:
{code}http://localhost:9090/graph{code}
This web UI allows you to enter any expression and see its result either in a table or graphed over time. This is primarily useful for ad-hoc queries and debugging.
But you can also view various Prometheus server configuration details (targets, rules, alerts, service discovery, etc.).
h2. Grafana
h3. Installing
[Grafana|http://grafana.com/] installation is documented on the [Grafana website|http://docs.grafana.org/installation/] and its setup for Prometheus is documented on the [Prometheus website|https://prometheus.io/docs/visualization/grafana/]. It is advised to rely on these sources for an up-to-date installation.
* In short, install and run as standalone:
{code}
wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-5.2.4.linux-amd64.tar.gz
tar -zxvf grafana-5.2.4.linux-amd64.tar.gz
cd grafana-5.2.4
./bin/grafana-server web
{code}
* Or as package:
Add the following line to your */etc/apt/sources.list* file (even if you are on Ubuntu or another Debian version).
{code}
deb https://packagecloud.io/grafana/stable/debian/ stretch main
{code}
Then run:
{code}
curl https://packagecloud.io/gpg.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install grafana
sudo service grafana-server start
{code}
By default, the Grafana web UI is available at *localhost:3000*; the default credentials are *admin/admin*.
h3. Connecting to Prometheus
For exhaustive documentation, go to the [Grafana website|http://docs.grafana.org/features/datasources/prometheus/].
In short, once logged in as admin:
# Open the side menu by clicking the Grafana icon in the top header.
# In the side menu, under the Dashboards link, you should find a link named Data Sources.
# Click the + Add data source button in the top header.
# Select _Prometheus_ from the Type dropdown.
# Give a name to the data source.
# Set the URL of the Prometheus server, in our example the default: localhost:9090.
# Click _Save & Test_.
*Note:* Grafana data sources can also be [configured by files|http://docs.grafana.org/administration/provisioning/#datasources]
h3. Creating a graph
Follow instructions from [Grafana documentation|http://docs.grafana.org/guides/getting_started/] to create a dashboard and add panels.
While editing a graph, in the metrics tab, you can use the same queries tested in Prometheus. For instance:
{code}
CpuLoad{container="petals-sample-0"}
{code}
This will display the _CpuLoad_ metric only for the _petals-sample-0_ container:
!Screenshot from 2018-10-03 16-44-25.png!
You can add different panel types that suit your needs to create your own tailored dashboard:
!Screenshot from 2018-10-05 15-48-54.png|width=1112!