State of the Cloud 2019

Now that 2020 is upon us, I thought it might be a good idea to generate some statistics about the Nectar Cloud for 2019.

Instances

In 2019, the Nectar Cloud ran a total of 70,371 instances.

VCPU time

These instances ran for a total of 9,203,375 days, 19 hours, 17 minutes and 59 seconds of VCPU time [1]. That is around 25,214 VCPU years! The mean VCPU time is about 130 days. The mode VCPU time is 365 days, which means lots of single-VCPU instances ran for the whole year.

Flavour

The most popular flavour is m2.large (4 VCPU). There were 26,750 such instances.

End

Statistics were generated from Gnocchi. Nectar logs the start/end times of each instance in Gnocchi, as well as a host of other data. As a Nectar user, you can use the Gnocchi API to access metrics for your resources. Let me know if this has been interesting, or if there are any other stats you want to see!

Footnote

1. VCPU time is (Number of VCPUs) * (Running time). For example, if an instance with 2 VCPUs runs for 10 days, it accrues 20 days of VCPU time.
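The figures above can be sanity-checked with a few lines of Python. The arithmetic follows the footnote's definition of VCPU time, and the numbers are the ones quoted in the post:

```python
# Sanity-check of the VCPU-time figures quoted above.

def vcpu_time_days(vcpus, running_days):
    """VCPU time is (number of VCPUs) * (running time)."""
    return vcpus * running_days

# e.g. an instance with 2 VCPUs that runs for 10 days accrues 20 VCPU-days
print(vcpu_time_days(2, 10))                  # → 20

total_vcpu_days = 9_203_375                   # total VCPU time for 2019, in days
instances = 70_371                            # total instances in 2019

print(total_vcpu_days // 365)                 # → 25214, i.e. ~25,214 VCPU years
print(round(total_vcpu_days / instances, 1))  # → 130.8, the "about 130 day" mean
```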

Migrating to Nova Cells v2

On 15th October 2019, the Nectar Research Cloud finally migrated to Nova cells v2. This completed a long journey of running multiple patches on Nova to make cells v1 work.

First, a little history. The Nectar Research Cloud began its journey with cells back in the Essex release. We heard about this new thing called cells (now known as cells v1) and were aware that Rackspace were using them in a forked version of Nova. Chris Behrens was the main developer of cells, so I got a conversation going and soon we were running a version of Nova from a personal branch on Chris's GitHub. This helped us scale the Nectar Research Cloud to the scale it is now, and it was also helpful for our distributed architecture.

I think it was Grizzly where cells v1 was merged into the main Nova codebase. Finally we were back onto the mainline codebase... however, there were multiple things that didn't work, including security groups and aggregates, to name a few. That led me down the path of nova de

Understanding telemetry services in Nectar cloud

Intro

The OpenStack Telemetry service reliably collects data on the utilization of the physical and virtual resources comprising deployed clouds, persists this data for subsequent retrieval and analysis, and triggers actions when defined criteria are met.

The Ceilometer project was once the only piece of the Telemetry service in OpenStack, but since it had taken on too many functions, it was split into several subprojects: Aodh (alarm evaluation functionality), Panko (events storage) and Ceilometer (data collection - more about Ceilometer's history from its ex-PTL). The original storage and API functions of Ceilometer have been deprecated and moved to the other projects. For the latest telemetry architecture, refer to the OpenStack system architecture documentation.

In the Nectar cloud, we use Ceilometer for metric data collection, Aodh for the alarm service, Gnocchi for the metric API service, and InfluxDB as the time series database backend of Gnocchi. The InfluxDB backend driver is Nectar impleme
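To make the data flow concrete, here is a minimal Ceilometer pipeline.yaml fragment that publishes all collected meters to Gnocchi. This is an illustrative sketch of the upstream pipeline format, not Nectar's actual configuration:

```yaml
sources:
  - name: meter_source
    meters:
      - "*"              # collect all meters
    sinks:
      - meter_sink
sinks:
  - name: meter_sink
    publishers:
      - gnocchi://       # hand samples to Gnocchi, which writes measures
                         # to its time-series backend (InfluxDB at Nectar)
```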

Passing entropy to virtual machines

Recently, when we were working on testing new images with Magnum, I found that the newest Fedora Atomic 29 images were taking a long time to boot up. A closer look using  nova console-log  revealed that they were getting stuck at boot with the following error.

[   12.220574] audit: type=1130 audit(1555723526.895:78): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-machine-id-commit comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.248050] audit: type=1130 audit(1555723526.906:79): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journal-catalog-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 1061.103725] random: crng init done
[ 1061.108094] random: 7 urandom warning(s) missed due to ratelimiting
[ 1063.306231] IPv6: ADDRCONF(NETDEV_UP): eth0: li
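The gap between the 12-second and 1061-second timestamps (roughly 17 minutes) is the kernel waiting for its entropy pool to initialise ("crng init done"). On any Linux guest you can inspect the pool directly; this snippet just reads the standard procfs counter:

```python
# Read the kernel's current entropy estimate (standard Linux interface).
# A value near zero during early boot explains stalls in services that
# block on /dev/random.
with open("/proc/sys/kernel/random/entropy_avail") as f:
    print(int(f.read()))
```

The usual remedy in OpenStack is to give the guest a virtio-rng device so the host can feed it entropy, for example by setting the hw_rng_model=virtio property on the image.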

Nectar Identity service now upgraded to Queens

We're pleased to announce that the Nectar Identity service (Keystone) has now been upgraded to the Queens release. Normally an upgrade like this wouldn't require a blog post, but there are a couple of significant changes that users should be aware of.

Keystone API v2.0 deprecation

The Keystone v2.0 API has actually been removed from the Queens release, but we're aware that many users are still using it. We are now actively requesting that any users still using the v2.0 API move over to the v3 API, which has been available since 2016. We plan to keep the v2.0 API running until April 2019, at which time it will no longer be available. See our Keystone v2.0 to v3 migration guide on how you can switch over.

Application Credentials

The long-awaited application credentials are finally available in the Queens release. Application credentials allow users to generate their own OpenStack credentials suitable for applications to authenticate to the Identity service, with
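As an example of what this enables: once an application credential exists, a client can authenticate with it through a clouds.yaml entry along these lines. The endpoint URL and credential values below are placeholders, not Nectar's real ones:

```yaml
clouds:
  nectar:
    auth_type: v3applicationcredential             # keystoneauth plugin name
    auth:
      auth_url: https://keystone.example.com:5000/v3   # placeholder endpoint
      application_credential_id: "<credential-id>"
      application_credential_secret: "<credential-secret>"
```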

Kubernetes now available on Nectar!

We’ve just deployed OpenStack Magnum (Container Infrastructure as a Service) on the Nectar cloud. This allows a user to spin up a container cluster (Kubernetes or Docker Swarm) on Nectar. We are in the process of producing official documentation, but in the meantime, if you would like to test drive it, here are the steps to do so.

First of all, you need quota for the following:
- Floating IP
- Network
- Subnet

If you have requested floating IPs for your project, you will be fine. If not, request floating IP quota.

Creating a Cluster

You can create a cluster using either the Dashboard or the CLI tools.

Using the Dashboard

1. Log on to the dashboard.
2. Navigate to Container Infra.
3. Click on Clusters, then Create Cluster.
4. Give your cluster a name.
5. Choose a cluster template. We have pre-defined global templates (in the format kubernetes-{az}) to help you get started more easily. Choose the template that you want your cluster to be in.
6. Go to the Misc tab, and se
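For the CLI route, the dashboard steps above map onto the coe commands of python-openstackclient (with the Magnum plugin installed) roughly as follows. These commands need a Nectar account to run, and the cluster name and node count here are just examples; substitute a real template name from the list:

```shell
# List the pre-defined cluster templates (kubernetes-{az} format)
openstack coe cluster template list

# Create a Kubernetes cluster from one of them
openstack coe cluster create my-cluster \
    --cluster-template kubernetes-<az> \
    --node-count 2

# Check on the cluster until its status reaches CREATE_COMPLETE
openstack coe cluster show my-cluster
```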

Getting capacity and usage information out of Openstack Placement

One of the problems we have in Nectar is reporting on the capacity and usage of Nova. We have ~1000 compute nodes, and these come and go. The OpenStack Placement service seems like a good place to extract capacity and usage information from Nova, so I thought I'd see what it could do.

The first problem is that there is no Placement client. I found the osc-placement project, but this isn't that useful as it is just an openstack client plugin and not a Python library like the other clients. I've written a custom client before for an internal project, so I created a very simple client that can get resource provider usage and capacity. In Placement, a resource provider maps directly to a compute node in Nova.

Now that we had the means to query this information, I wrote a custom Ceilometer poller to collect it periodically. Once this was collecting data, I needed to configure Ceilometer and Gnocchi to ingest the notifications that my new poller was generating. In Ceilometer I ne
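As a sketch of the kind of calculation this enables: for each resource provider, Placement exposes its inventories (GET /resource_providers/{uuid}/inventories) and usages (GET /resource_providers/{uuid}/usages), and capacity headroom falls out with a little arithmetic. The field names (total, reserved, allocation_ratio) follow the Placement API; the sample numbers below are invented for illustration:

```python
# Compute remaining capacity per resource class from Placement-style
# inventory and usage payloads (sample data invented for illustration).

def headroom(inventories, usages):
    """Effective capacity minus current usage, per resource class."""
    result = {}
    for rc, inv in inventories.items():
        # Effective capacity accounts for overcommit and reserved resources.
        capacity = inv["total"] * inv.get("allocation_ratio", 1.0) - inv.get("reserved", 0)
        result[rc] = capacity - usages.get(rc, 0)
    return result

inventories = {
    "VCPU": {"total": 32, "allocation_ratio": 2.0, "reserved": 2},
    "MEMORY_MB": {"total": 65536, "allocation_ratio": 1.5, "reserved": 4096},
}
usages = {"VCPU": 40, "MEMORY_MB": 50000}

print(headroom(inventories, usages))  # → {'VCPU': 22.0, 'MEMORY_MB': 44208.0}
```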