This post advocates that quality teams across OpenStack push their tests upstream whenever possible. As you will see throughout this post, this approach has several advantages.
First of all, why should we bother to submit our tests upstream and go through the well-known OpenStack review process? There are some obvious (and some not so obvious) reasons:
The obvious reason: tests that live upstream "automatically" have community support. This is one of the greatest advantages of open source: lots of people (paid, most of the time) maintain what is upstream, so if anything changes and fixes are needed, or if bugs are found, anyone in the community can make the fix, not necessarily the person who originally proposed the test patches.
The not so obvious reason: being upstream means the tests are automated! The advantages of automated tests are well known, especially for a project with a release pace as fast as OpenStack's; new releases come with a bunch of new features twice a year, and automated tests make sure these new features don't break the old ones.
Also, having the opportunity to test a new feature early in the release cycle, or even before it merges (via patch dependencies), can produce much cleaner releases with fewer bug-fix backports.
QA in OpenStack
Aware of the importance of QA, OpenStack has had upstream QA efforts since its earliest days, making it possible to verify the project's quality.
Generally speaking, in OpenStack we have two different test queues: a check gate and a merge gate. The check gate is responsible for running the first set of tests against a newly proposed change; this is done independently of the change's current review score (-2, -1, 0, +1 or +2). After the change is approved by core reviewers, the merge gate runs the tests against the change again. Only if all the voting tests pass is the change finally merged into the repository.
Here we give a brief description of some of the OpenStack projects related to QA and infra. This is not intended to be an exhaustive list with details about every QA/infra project in OpenStack, but rather a short list highlighting some of the most important ones.
For more details, everyone is welcome to reach out to the community on IRC (freenode): #openstack-qa, #openstack-infra, #rally, #zuul... :)
DevStack is a project that provides a set of scripts to easily bring up a complete, developer-friendly OpenStack environment. The OpenStack integration tests usually run in a dsvm (DevStack virtual machine). We also have the DevStack Gate project, which is responsible for setting up DevStack in the virtual machines prior to running the gate tests.
DevStack also provides a plugin interface, which allows custom DevStack deployments to be constructed.
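As a concrete sketch of the mechanism (the plugin name and repository URL below are placeholders, not a real plugin), a DevStack plugin is enabled by adding an enable_plugin line to DevStack's local.conf:

```ini
[[local|localrc]]
# enable_plugin <plugin-name> <git-repo-url> [branch]
enable_plugin my-plugin https://example.com/my-plugin.git master
```

DevStack then clones the repository and calls the plugin's hooks at the appropriate points of the deployment.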
Tempest is a project that offers a way to validate OpenStack clouds via integration tests that exercise the components' APIs. Tests range from simple API verifications to complex feature checks.
Tempest also provides a plugin interface, so you can write tests and keep them in external repositories.
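For illustration (the package and class names below are hypothetical), Tempest discovers a plugin through a Python entry point registered in the tempest.test_plugins group of the plugin repository's setup.cfg:

```ini
[entry_points]
tempest.test_plugins =
    my_service_tests = my_service_tempest_plugin.plugin:MyServiceTempestPlugin
```

The class referenced there implements Tempest's plugin interface and tells Tempest where to find the tests and which config options to register.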
Grenade is OpenStack's tool for testing upgrades. Besides regular database upgrades, it also checks that the upgrades do not destroy valuable resources (like servers and images).
Rally is a benchmarking tool for OpenStack clouds. As you can check on the project's page, Rally can deploy a cloud, run Tempest tests, and run benchmarks with the intent of checking the cloud's scalability.
Zuul is OpenStack's CI tool. It is a great project that acts as a gate for all changes pushed to Gerrit: it figures out their dependencies and checks the output of the tests, and only changes with passing tests are merged. You can check Zuul's current status at http://status.openstack.org/zuul/.
Zuul is also responsible for handling the well-known recheck command. Sometimes unpredictable things happen, causing the gate tests to break. If you see a test failing and the error is unrelated to the change itself, the OpenStack community works hard to fix the issue (you can help here too), and you can rerun the tests by leaving a "recheck" comment on the Gerrit change.
This is not a project but rather a feature of OpenStack infra. Using it, you can test a change in Gerrit on systems that are not officially supported by OpenStack and vote on the change. Here you can find more details about this feature.
Creating a new CI job and new tests in OpenStack by example
Only describing the projects and their intent is not enough: there are too many "moving parts". In this section, we will show how to contribute a new CI job that runs a new set of tests from Keystone's Tempest plugin.
The main goal behind the creation of a Tempest plugin for Keystone was to add tests for the Federated Identity feature, specifically for its APIs and authentication mechanisms. This feature requires a custom deployment that doesn't make sense to run in the other services' gates, since it is very specific to Keystone's authentication process: the type of user authentication doesn't interfere with other OpenStack services.
For that, we have followed the steps below:
1. Add the plugin base structure
Tempest offers a plugin cookiecutter; we used it to create a new folder in Keystone's tree to serve as the home of the plugin's tests.
The plugin was introduced in this change. To run the tests we have to execute the following command inside a Tempest directory:
tox -e all-plugin -- keystone
At this point, the command above doesn't run any tests yet, but it is useful to check that the plugin can be properly found and doesn't contain any errors.
2. Add a non-voting job to run the plugin's tests
Now we have the base structure to write tests for the Keystone plugin, but the tests aren't run anywhere yet. For that, we need to create a new CI job that executes the command described above in a DevStack environment.
New jobs are added in the project-config repository, and this change created the new job. We created a non-voting, Keystone-only job so we could first check whether it works properly; if it didn't, we would be blocking Keystone's gate for no reason. To make a job non-voting, we append -nv to the job name. We also had to modify the jenkins/jobs/keystone.yaml file, specifying some environment variables that select the correct set of tests for the job.
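The exact lines added to keystone.yaml are in the change linked above; purely as an illustration of the mechanism, devstack-gate jobs select which tests to run through environment variables along these lines (the values here are hypothetical, not copied from the actual change):

```shell
# Illustrative sketch of a devstack-gate job definition fragment.
# Enable Tempest in the gate job:
export DEVSTACK_GATE_TEMPEST=1
# Restrict the run to tests whose names match the plugin's namespace:
export DEVSTACK_GATE_TEMPEST_REGEX="keystone_tempest_plugin"
```

When the job runs, devstack-gate deploys DevStack and invokes Tempest with the configured filter, so only the plugin's tests are executed.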
3. Add a first set of tests
At this point, we have a base structure for the tests and a CI job that can execute them. We can write new tests for the plugin and check the job output to see whether they pass.
Since our main goal is to test the Federated Identity feature, we started by adding tests for the OS-FEDERATION API. With these tests working, we can feel confident about adding tests for the feature itself. These changes added the new set of tests to Keystone's plugin.
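The real tests live in the plugin and use Tempest's REST clients; as a standalone sketch of their shape (all names here are ours, and the fake client stands in for a real Tempest client talking to Keystone over HTTP), an OS-FEDERATION identity-provider test looks roughly like this:

```python
# Standalone sketch of an OS-FEDERATION API test's shape. All names are
# illustrative; in the real plugin, FakeIdentityClient would be a Tempest
# REST client issuing requests against a live Keystone.
import unittest


class FakeIdentityClient:
    """Stands in for a Tempest REST client (hypothetical interface)."""

    def __init__(self):
        self._idps = {}

    def create_identity_provider(self, idp_id, **kwargs):
        body = {"id": idp_id, "enabled": kwargs.get("enabled", False)}
        self._idps[idp_id] = body
        return {"identity_provider": body}

    def show_identity_provider(self, idp_id):
        return {"identity_provider": self._idps[idp_id]}


class IdentityProvidersTest(unittest.TestCase):
    def test_create_and_show_identity_provider(self):
        client = FakeIdentityClient()
        created = client.create_identity_provider("my-idp", enabled=True)
        # GET should return exactly the representation that was created.
        self.assertEqual(created, client.show_identity_provider("my-idp"))
```

The pattern is the same as in real Tempest tests: create a resource through the API, read it back, and assert the two representations agree.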
4. Make the job voting
After we added the first tests, we saw that everything was running fine, so we could finally make the job that runs Keystone's plugin tests voting in the gate. This was done by simply removing the -nv suffix from the job name: https://review.openstack.org/#/c/321890/
5. Next steps
With the OS-FEDERATION API now tested in a voting gate job, we still have some missing pieces before we can test the Federated Identity authentication mechanism itself. For that, the tests need to call the correct sequence of endpoints, passing the correct set of headers and credentials to the Keystone server. We also need a DevStack plugin to handle the deployment of the custom environment.
The tests for federated authentication can be found here, and the Keystone DevStack plugin here (thanks to Kristi Nikolla, who has been driving the efforts here too). Both were still under review at the time this post was written.
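To make the "correct sequence of endpoints" concrete, here is a minimal, standalone sketch (standard library only; the helper names and base URL are ours, not the plugin's) of how the federated authentication request for an unscoped token is composed, following the OS-FEDERATION API layout:

```python
# Hedged sketch: composes the URL and headers used to request an unscoped
# federated token from Keystone. Helper names are illustrative.

def federated_auth_url(base_url, idp_id, protocol_id):
    """URL of the federated auth endpoint for an identity provider /
    protocol pair, per the OS-FEDERATION API layout."""
    return ("%s/v3/OS-FEDERATION/identity_providers/%s/protocols/%s/auth"
            % (base_url.rstrip("/"), idp_id, protocol_id))


def federated_auth_headers():
    # Keystone returns the unscoped token in the X-Subject-Token response
    # header; the request itself carries the assertion negotiated with the
    # identity provider (omitted here).
    return {"Accept": "application/json"}


url = federated_auth_url("http://keystone.example.com:5000",
                         "my-idp", "saml2")
# url is:
# http://keystone.example.com:5000/v3/OS-FEDERATION/identity_providers/my-idp/protocols/saml2/auth
```

The real tests then follow this call with the scoped-token request, reusing the unscoped token returned by Keystone.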
Once the DevStack plugin merges, we need to update the job (or create a new one) to handle the settings that will configure the environment for the tests.
The tests were merged and are running upstream! Am I done?
The picture above (from a shirt given out at a previous OpenStack Summit) sums everything up. Unfortunately, from a QA point of view, having the tests merged upstream is not sufficient. There are several reasons why additional testing is needed, and they may vary with each type of product your team provides or uses.
Generally speaking, the following are the most common reasons we should do some extra "downstream" work:
1. "Manual only" tests scenarios
Sometimes, it is not possible to automate the tests for a given feature. Consider the example where you need to use Horizon (the OpenStack dashboard) to verify it. Although it is possible to add some automation in the UX side, we usually test the APIs in an automate fashion and perform the UX actions manually to check the outcomes.
2. Custom environments
Usually, if you have a product derived from an open source project (like Red Hat Enterprise OpenStack Platform being the product version of the OpenStack project), you will also have a downstream CI responsible for running the tests again against the product. Why is that? During the project-to-product conversion, a lot of factors may differ, from the build tools and package versions to major things like the operating system. Besides that, there are tests that don't make sense as third-party tests upstream: imagine tests that take a full day to run; that doesn't scale to upstream standards. Additionally, not every brand new feature of the upstream project enters the "supported" area of the downstream product, so it is simpler to run those kinds of tests on a smaller set of changes.
You may also have proprietary environments that are private to a customer or to your business. In this case, you can't disclose the systems you are testing, which also prevents you from adding a third-party job for them.
In this post, we highlighted some characteristics of QA work in OpenStack, giving special attention to the reasons why you should try to push your tests upstream. We did this through a real example of how to contribute tests and jobs, along with more details about the QA practice in general.
If you are interested in this topic, join me at the OpenStack Barcelona Summit, where I will be giving a talk on the same subject as this post :)
Also, if you are not attending the Summit and want to discuss QA, Keystone, OpenStack, or even cool engineering stuff in general, reach out to me:
- rodrigods (IRC - freenode)
- @rodrigdsousa (Twitter)
- rodrigodsousa at gmail.com (email)