Understanding telemetry services in Nectar cloud

Intro

The OpenStack Telemetry service reliably collects data on the utilization of the physical and virtual resources comprising deployed clouds, persists these data for subsequent retrieval and analysis, and triggers actions when defined criteria are met. The Ceilometer project used to be the only piece of the telemetry services in OpenStack, but because too many functions had been piled into it, it was split into several subprojects: Aodh (alarm evaluation), Panko (event storage) and Ceilometer (data collection - more about the Ceilometer history from the ex-PTL). The original storage and API functions of Ceilometer have been deprecated and moved to the other projects. For the latest telemetry architecture, refer to the OpenStack system architecture documentation.

In the Nectar cloud, we use Ceilometer for metric data collection, Aodh for the alarm service, Gnocchi for the metric API service, and InfluxDB as the time-series database backend of Gnocchi. The InfluxDB backend driver is a Nectar implementation which has not yet been merged upstream; you can check out the Gnocchi code in the Nectar repo.

An instance metric as an example

To better understand the telemetry services as a whole, we will use an instance metric as an example and follow it through the whole process.

How Ceilometer collects meters

When a user instance is created, instance metrics are created for it and their measurements are collected by the Ceilometer services (see the other article about metric collection).

Ceilometer consists of two types of services, ceilometer-polling.service and ceilometer-agent-notification.service. The polling service runs on both the central management servers (the ceilometer servers) and the compute nodes: on a compute node (compute namespace) it polls the resource utilization of the instances and the compute node itself, while on the ceilometer servers (central namespace) it polls resources that are not tied to an instance or compute node, such as metering for other OpenStack services. Both polling services send the metering data to the message queues as producers. The ceilometer-agent-notification service, which runs on the central ceilometer servers, consumes the data from the message queues, converts it into Ceilometer samples and publishes them through the pipeline as required. In our case we use gnocchi as the publisher, which means the data is stored through Gnocchi.

Compute node

On the compute node, the message queue transport_url and the polling service namespace are configured in /etc/ceilometer/ceilometer.conf. In addition, the polling.yaml file defines which meters are collected and at what sample interval.
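A minimal sketch of the relevant compute-node ceilometer.conf settings (the rabbit URL below is a placeholder, not our real value):

> cat /etc/ceilometer/ceilometer.conf
[DEFAULT]
# message queue the samples are published to (placeholder URL)
transport_url = rabbit://ceilometer:secret@rabbit-host:5672/
# run the compute pollsters on this node
polling_namespaces = compute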

> cat polling.yaml
---
sources:
- name: cpu_pollsters
  interval: 60
  meters:
    - cpu (1)
- name: instance_pollsters
  interval: 60
  meters:
    - memory.usage
    - network.incoming.bytes
    - network.incoming.packets
    - network.outgoing.bytes
    - network.outgoing.packets
    - disk.device.capacity
    - disk.device.usage
    - disk.device.read.bytes
    - disk.device.read.requests
    - disk.device.write.bytes
    - disk.device.write.requests

Central ceilometer server

Similarly, some configuration is needed in ceilometer.conf and polling.yaml on the ceilometer servers. The messaging_urls option defines the queues the notification agent listens on, which is needed if you have a dedicated message queue for each service, or a cell-level message queue when using the cells architecture in a large deployment. In addition, the dispatcher_gnocchi section is configured for the Gnocchi integration and points to the Gnocchi resource definition file, which in our case is called gnocchi_resources.yaml.
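A hedged sketch of the corresponding sections on the ceilometer servers (the queue URLs are placeholders):

> cat /etc/ceilometer/ceilometer.conf
[notification]
# one entry per queue the notification agent should listen on,
# e.g. a dedicated queue per service or per cell (placeholder URLs)
messaging_urls = rabbit://ceilometer:secret@cell1-rabbit:5672/
messaging_urls = rabbit://ceilometer:secret@cell2-rabbit:5672/

[dispatcher_gnocchi]
# the Gnocchi resource definition file mentioned above
resources_definition_file = gnocchi_resources.yaml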
The ceilometer-agent-notification service relies on two other configuration files: pipeline.yaml and the gnocchi_resources.yaml configured above.

> cat pipeline.yaml
---
sources:
....
    - name: cpu_source
      meters:
        - "cpu" (2)
      sinks:
        - cpu_sink (3)
        - cpu_delta_sink
....
sinks:
....
    - name: cpu_sink
      transformers:
        - name: "rate_of_change"
          parameters:
            target:
              name: "cpu_util" (4)
              unit: "%"
              type: "gauge"
              scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
      publishers:
        - gnocchi://?archive_policy=ceilometer&filter_project=service

The pipeline.yaml file defines how Ceilometer transforms the meter data (transformers) and where it publishes the result (publishers). In the example above, the cpu meter collected from the compute node is transformed into the gauge-type metric cpu_util, which is pushed to Gnocchi using the archive policy called ceilometer. According to the configuration in gnocchi_resources.yaml, the cpu_util metric is then created with some required attributes. That is why you can see this cpu_util metric when you query the Gnocchi resource in this article[link to be updated].

> cat gnocchi_resources.yaml
---
resources:
...
  - resource_type: instance
    archive_policy: ceilometer
    metrics:
      - 'memory'
      - 'memory.usage'
      - 'memory.resident'
      - 'vcpus'
      - 'cpu_util' (5)
      - 'disk.root.size'
      - 'disk.ephemeral.size'
      - 'compute.instance.booting.time'
    attributes:
      host: resource_metadata.(instance_host|host)
      image_ref: resource_metadata.image_ref
      display_name: resource_metadata.display_name
      flavor_id: resource_metadata.(instance_flavor_id|(flavor.id)|flavor_id)
      flavor_name: resource_metadata.(instance_type|(flavor.name)|flavor_name)
      server_group: resource_metadata.user_metadata.server_group
      availability_zone: resource_metadata.availability_zone
    event_delete: compute.instance.delete.start
    event_create: compute.instance.create.end
    event_attributes:
      id: instance_id
      display_name: display_name
      host: host
      flavor_id: instance_type_id
      flavor_name: instance_type
      availability_zone: availability_zone
      user_id: user_id
      project_id: project_id
    event_associated_resources:
      instance_network_interface: '{"=": {"instance_id": "%s"}}'
      instance_disk: '{"=": {"instance_id": "%s"}}'

In the configuration extracts above, the markings (1) (2) (3) (4) (5) indicate how the meter we are interested in is mapped from the compute node to a metric in Gnocchi.
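The mapping can be confirmed with the Gnocchi CLI by listing the instance resource (a hedged example; the UUID is a placeholder instance ID):

gnocchi resource show --type instance <instance uuid>

The metrics field of the output should list cpu_util together with the other metrics defined in gnocchi_resources.yaml, and the attributes (host, flavor_name, availability_zone, ...) are filled from the resource metadata as mapped above.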

How Gnocchi stores and queries data

Gnocchi is the metric service and exposes the REST API to the outside. Hence, all resource/metric creation, measure writes and measure queries go through Gnocchi to the persistent databases. Gnocchi decouples metric indexing from data storage, so it uses two databases: one for the indexer, for which we use MySQL, and one for the storage, for which we use the time-series database InfluxDB.
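A minimal sketch of the corresponding gnocchi.conf (the database URL is a placeholder, and the driver name is an assumption about how the out-of-tree Nectar InfluxDB driver registers itself):

> cat /etc/gnocchi/gnocchi.conf
[indexer]
# MySQL holds the resource/metric index (placeholder credentials)
url = mysql+pymysql://gnocchi:secret@gnocchi-db/gnocchi

[storage]
# assumption: the Nectar out-of-tree InfluxDB driver is selected here
driver = influxdb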

Influxdb backend

As a time-series database, InfluxDB has a few concepts you should know before going on to the next chapters. If we make an analogy with MySQL, a measurement in InfluxDB is like a MySQL table. InfluxDB also has a very important concept called the retention policy: every measurement is tied to a retention policy, and we have to specify it since we don't use the default retention policy autogen. For example, when we query the measurement data for a metric on the InfluxDB server, we use InfluxQL like:
SELECT * FROM <Database>.<Retention Policy>.<Measurement> WHERE metric_id = 'METRIC ID'

We can only use metric_id in the WHERE clause because show tag keys in InfluxDB returns only one tag key, metric_id; it works roughly like describe table in MySQL, as shown below.
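Both points can be checked from the influx console connected to the gnocchi database (the retention policies listed will depend on the deployment; samples_ceilometer is the measurement used by our ceilometer archive policy, as seen in the continuous query further down):

> influx -database gnocchi
SHOW RETENTION POLICIES
SHOW TAG KEYS FROM samples_ceilometer

The first statement lists all retention policies (including the _incoming ones described in the next section), and the second should return the single tag key metric_id.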

Storing raw data

For a time-series database it is important to know that stores (writes) and queries (reads) go through different retention policies. A raw-data write looks like the following POST call to InfluxDB:

Nov 01 22:10:51 influxdb influxd[1007]: [httpd] - root [01/Nov/2019:22:10:51 +1100] "POST /write?db=gnocchi&precision=n&rp=rp_ceilometer_incoming HTTP/1.1" 204 0 "-" "python-requests/2.18.4" 43a3dc75-fc98-11e9-8d6b-fa163e8baccc 2091

The retention policy (RP) for raw data writes is assembled as rp_<archive policy name>_incoming. In the Gnocchi InfluxDB driver implementation, every RP with the _incoming suffix is used for raw data writes. The archive policy name comes from the pipeline.yaml config file in Ceilometer, or can be queried through the Gnocchi API as this article shows[link to be updated].

So here is a checkpoint when troubleshooting telemetry issues: check whether the measurement under the _incoming RP is really receiving the raw data. That proves that collecting the statistics and writing the data is working.
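The check can be done from the InfluxDB console with a query like the following (the metric ID is a placeholder; samples_ceilometer is the measurement used by our ceilometer archive policy, as seen in the continuous query in the next section), which should return recent raw points:

SELECT * FROM gnocchi.rp_ceilometer_incoming.samples_ceilometer WHERE metric_id = '<metric id>' AND time > now() - 15m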

Querying via continuous query

When we want to get the statistics for a specific resource, we just need to use the Gnocchi CLI to query them.
While InfluxDB writes go into the raw data, reads come from the continuous query results. A continuous query (CQ) is an InfluxDB-internal mechanism that periodically and automatically runs an InfluxQL command over the raw data and stores the result into a specific measurement. Running show continuous queries in the InfluxDB console shows how the new (continuous query) measurement is built:

samples_ceilometer_mean_300 CREATE CONTINUOUS QUERY samples_ceilometer_mean_300 ON gnocchi RESAMPLE FOR 10m BEGIN SELECT mean(value) AS value INTO gnocchi.rp_86400.samples_ceilometer_mean_300 FROM gnocchi.rp_ceilometer_incoming.samples_ceilometer GROUP BY time(5m), metric_id END

The example above shows that this CQ resamples raw data from the rp_ceilometer_incoming RP and the samples_ceilometer measurement, and writes the result into the rp_86400 RP and the samples_ceilometer_mean_300 measurement. The number in an RP name indicates that RP's duration in seconds, i.e. how long the data is kept in InfluxDB before it gets discarded. That number can also be calculated from the archive policy: for example, the archive policy ceilometer for the metric cpu_util is granularity: 0:05:00, points: 8640, timespan: 30 days, which means the RP's duration should be 5 * 60 * 8640 = 2592000 seconds, giving the RP name rp_2592000.
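The archive policy definition used in that calculation can be checked with the Gnocchi CLI, which prints the granularity, points and timespan:

gnocchi archive-policy show ceilometer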

So when we query the measures of, say, one instance's cpu_util metric, we can use the following Gnocchi CLI command:
gnocchi measures show 407eede1-e1cf-41ec-9778-398076c461c8

Under the hood, Gnocchi issues the following query to InfluxDB; the RP name and measurement name are calculated exactly as described above:
Nov 01 14:40:08 influxdb influxd[14362]: ts=2019-11-01T03:40:08.614363Z lvl=info msg="Executing query" log_id=0F80ScCG000 service=query query="SELECT value AS mean FROM gnocchi.rp_2592000.samples_ceilometer_mean_300 WHERE metric_id = '407eede1-e1cf-41ec-9778-398076c461c8' ORDER BY time DESC LIMIT 8640"
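The aggregation method and granularity can also be passed explicitly, which is handy when checking that a reader (for example an Aodh alarm) asks for a granularity that actually exists in the archive policy (a hedged example reusing the same metric ID):

gnocchi measures show --aggregation mean --granularity 300 407eede1-e1cf-41ec-9778-398076c461c8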

How Aodh defines alarms and triggers actions

Aodh is the alarm service; it queries the telemetry data via the Gnocchi API, evaluates the alarm criteria and triggers the corresponding actions. (Install the Aodh client python-aodhclient to use the aodh CLI.)

The alarm queries Gnocchi based on its definition. For example, we could create an alarm of the gnocchi_aggregation_by_metrics_threshold type, with mean as the aggregation_method and a granularity of 300 seconds.
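A hedged sketch of creating such an alarm with the aodh CLI (the alarm name, threshold, evaluation periods and webhook URL are placeholders; the metric ID is simply the one that appears in the log excerpts below):

aodh alarm create --name high-cpu-alarm \
  --type gnocchi_aggregation_by_metrics_threshold \
  -m dc241e5a-9b90-454c-8435-383a544b4a88 \
  --aggregation-method mean --granularity 300 \
  --threshold 80 --comparison-operator gt \
  --evaluation-periods 2 \
  --alarm-action 'https://example.com/webhook'

With such an alarm in place, the aodh-evaluator process periodically calls the Gnocchi aggregation API like: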

GNOCCHI:
[04/Nov/2019:12:32:20 +1100] "GET /v1/aggregation/metric?start=2019-11-04T01%3A22%3A20.330651&stop=2019-11-04T01%3A32%3A20.330651&aggregation=mean&granularity=300&needed_overlap=0&refresh=False&metric=dc241e5a-9b90-454c-8435-383a544b4a88 HTTP/1.1" 200 78 "-" "aodh-evaluator keystoneauth1/3.13.1 python-requests/2.21.0 CPython/3.6.8"

which eventually results in the following InfluxDB query:
Nov 04 12:32:20 influxdb influxd[14362]: ts=2019-11-04T01:32:20.326973Z lvl=info msg="Executing query" log_id=0F80ScCG000 service=query query="SELECT value AS mean FROM gnocchi.rp_2592000.samples_ceilometer_mean_300 WHERE metric_id = 'dc241e5a-9b90-454c-8435-383a544b4a88' AND time >= '2019-11-04T01:20:00.000000000Z' AND time < '2019-11-04T01:32:20.299446000Z' ORDER BY time DESC LIMIT 8640"

Based on the retrieved measures and the alarm attributes threshold and comparison_operator, the alarm state can be ok, alarm or insufficient data. The aodh-notifier then acts in accordance with ok_actions, alarm_actions or insufficient_data_actions. An alarm action can be an HTTP callback, a log entry or a Zaqar message. This matters because it closes the telemetry loop, on which features like auto scaling are based. Another typical HTTP-callback alarm action is a webhook for custom use.

Insufficient Data

One of the most common errors an Aodh alarm runs into is the insufficient data state. When troubleshooting this issue, a second checkpoint is to verify that the Aodh query granularity is the same as the one in the Gnocchi archive policy, i.e. that the read granularity is in sync with the write granularity.
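A quick way to compare the two is to look at the granularity attribute of the alarm and the granularity defined in the archive policy (the alarm ID below is a placeholder):

aodh alarm show <alarm id>
gnocchi archive-policy show ceilometer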

How the auto scaling server group is implemented

The telemetry service can implement an end-to-end solution such as auto-scaling. That is to say, the statistics of the stack's instances are collected by Ceilometer, stored through Gnocchi, and read by Aodh. When a load-indicating metric gets high, the Aodh alarm triggers the action that signals the OpenStack Heat service to update the stack, for example by adding more instances to the stack to increase capacity. And vice versa, the stack is scaled down when the alarm is cleared.

We can achieve that using the Nectar Heat template at the link. The Heat resources OS::Heat::AutoScalingGroup, OS::Heat::ScalingPolicy and OS::Aodh::GnocchiAggregationByResourcesAlarm together make this happen. The alarm type used is GnocchiAggregationByResourcesAlarm, which means the alarm is based on the aggregated measurements of a metric (cpu_util) across a set of resources (the server_group comprising all the instances in the stack).
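A hedged fragment of what such a template could look like (the image, flavor, network, thresholds and resource names are placeholders, and the exact property names and attributes should be checked against the Nectar template linked above):

resources:
  asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 3
      resource:
        type: OS::Nova::Server
        properties:
          image: <image>          # placeholder
          flavor: <flavor>        # placeholder
          networks:
            - network: <network>  # placeholder
          metadata:
            # assumption: metadata prefixed with "metering." ends up in
            # resource_metadata.user_metadata, which fills server_group
            metering.server_group: {get_param: "OS::stack_id"}

  scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: asg}
      scaling_adjustment: 1
      cooldown: 120

  cpu_alarm_high:
    type: OS::Aodh::GnocchiAggregationByResourcesAlarm
    properties:
      metric: cpu_util
      aggregation_method: mean
      granularity: 300
      evaluation_periods: 1
      threshold: 80
      comparison_operator: gt
      resource_type: instance
      # select all instances belonging to this stack's server group
      query:
        str_replace:
          template: '{"=": {"server_group": "stack_id"}}'
          params:
            stack_id: {get_param: "OS::stack_id"}
      alarm_actions:
        # signal the scaling policy when the alarm fires
        - {get_attr: [scaleup_policy, signal_url]}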





