New openQA tests: update installer tests and desktop app start/stop test

It's been a while since I wrote about significant developments in Fedora openQA, so today I'll be writing about two! I wrote about one of them a bit in my last post, but that was primarily about a bug I ran into along the way, so now let's focus on the changes themselves.

Testing of install media built from packages in updates-testing

We have long had a problem in Fedora: we could not always properly test installer changes. This is most significant during the period of development after a new release has branched from Rawhide but before it is released as the new stable Fedora release (we use the name 'Branched' for a release in this state; in a month or so, Fedora 30 will branch from Rawhide and become the current Branched release).

During most of this time, the Bodhi update system is enabled for the release. New packages built for the release do not immediately appear in any repositories, but - as with stable releases - must be submitted as "updates", sometimes together with related packages. Once submitted as an update, the package(s) are sent to the "updates-testing" repository for the release. This repository is enabled on installed Branched systems by default (this is a difference from stable releases), so testers who have already installed Branched will receive the package(s) at this point (unless they disable the "updates-testing" repository, which some do). However, the package is still not truly a part of the release at this point. It is not included in the nightly testing composes, nor will it be included in any Beta or Final candidate composes that may be run while it is in updates-testing. That means that if the actual release media were composed while the package was still in updates-testing, it would not be a part of the release proper. Packages only become part of these composes once they pass through Bodhi and are 'pushed stable'.

This system allows us to back out packages that turn out to be problematic and, hopefully, to keep them from destabilizing the test and release composes, since a problematic update simply never gets pushed stable. It also gives more conservative testers the option of disabling the "updates-testing" repository and avoiding some destabilizing updates - though of course if all the testers did this, no-one would be finding the problems. In the last few years we have also been running several automated tests on updates (via Taskotron, openQA and the CI pipeline) and reporting the results to Bodhi, allowing packagers to pull an update if the tests find problems.
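For the curious, all of this state is visible through Bodhi's public JSON API. Here's a minimal sketch (using the `requests` library; the exact parameter and field names are my reading of the current API, so treat them as assumptions) that lists the updates currently sitting in updates-testing for a release:

```python
# Sketch: list updates currently in updates-testing for a release via
# Bodhi's public JSON API. Parameter and field names are assumptions
# based on the current API, not guaranteed stable.
import requests

BODHI = "https://bodhi.fedoraproject.org/updates/"

def updates_in_testing(release="F30"):
    """Yield (title, URL) for each update in 'testing' for a release."""
    page = 1
    while True:
        resp = requests.get(
            BODHI,
            params={"releases": release, "status": "testing", "page": page},
            headers={"Accept": "application/json"},
            timeout=60,
        )
        resp.raise_for_status()
        data = resp.json()
        for update in data["updates"]:
            yield update["title"], f"{BODHI}{update['alias']}"
        if page >= data["pages"]:
            break
        page += 1

if __name__ == "__main__":
    for title, url in updates_in_testing():
        print(f"{title}: {url}")
```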

However, there has long been a gap in this process: if an update works fine on an installed system but causes problems when included in (for example) an installer image or live image, we had no good way to find that out. There was no system for automatically building such media with the updates currently in testing included, so they could be tested. The only way to find this sort of problem was for testers to create test media manually - a process that is not widely understood, is time-consuming, and can be somewhat difficult. And without media to test, we of course could not do any automated testing either.
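To give a flavour of what that manual process involves, here is a rough sketch built around the lorax image-building tool, which composes a network install image from a set of package repositories. The repository URLs are placeholders and a real run needs root on a host of the matching release; this is an illustration, not a recipe:

```python
# Sketch: build a network install image that includes updates-testing,
# roughly the way a tester would do it by hand with lorax. Repo URLs
# are illustrative placeholders; lorax must run as root.
import subprocess

MIRROR = "https://dl.fedoraproject.org/pub/fedora/linux"
RELEASE = "30"

cmd = [
    "lorax",
    "--product", "Fedora",
    "--version", RELEASE,
    "--release", RELEASE,
    # The base package repository for the Branched release...
    "--source", f"{MIRROR}/development/{RELEASE}/Everything/x86_64/os/",
    # ...plus updates-testing, so pending updates land in the image.
    "--source", f"{MIRROR}/updates/testing/{RELEASE}/Everything/x86_64/",
    "./results/",
]
subprocess.run(cmd, check=True)
```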

We've looked at different ways of addressing this in the past, but ultimately none of them came to much (yet), so last year I decided to just go ahead and do something. And after a bit of a roadblock (see that last post), that something is now done!

Our openQA now has two new tests that run on every update it tests. The first test - here's an example run - builds a network install image, and the second - example run - tests it. Most importantly, any packages from the update under test are both used in the process of building the install image (if they are relevant to that process) and included in the installer image itself (if they are packages that would usually be in such an image). Thus if the update breaks either the production of the image or the basic functionality of the image itself, this will be caught. This (finally) means we have some idea whether a new anaconda, lorax, pykickstart, systemd, dbus, blivet, dracut or any one of dozens of other key packages might break the installer. If you're a packager and you see that one of these two tests has failed for your update, we should look into that! If you're not sure how to go about it, you can poke me, bcl, or the anaconda developers in Freenode #anaconda, and we should be able to help.
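If you'd rather check these results from a script than from the web UI, openQA exposes them through a JSON API. A minimal sketch follows; the `Update-<advisory>` BUILD naming is my assumption about how our scheduler tags update jobs, so double-check it against a real job before relying on it:

```python
# Sketch: look up the openQA results for a given update via openQA's
# JSON API. The 'Update-<advisory>' BUILD value is an assumption about
# how Fedora's scheduler names update jobs.
import requests

OPENQA = "https://openqa.fedoraproject.org/api/v1/jobs"

def update_job_results(advisory):
    """Return {test name: result} for all openQA jobs run on an update."""
    resp = requests.get(
        OPENQA, params={"build": f"Update-{advisory}"}, timeout=60
    )
    resp.raise_for_status()
    return {job["test"]: job["result"] for job in resp.json()["jobs"]}

# Hypothetical usage, with a made-up advisory ID:
# results = update_job_results("FEDORA-2019-0123456789")
# failed = [test for test, result in results.items() if result == "failed"]
```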

It is also possible for a human tester to download the image produced by the first test and run more in-depth tests on it manually; I haven't yet done anything to make that possibility more visible or easier, but will try to look into ways of doing that over the next few weeks.
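In the meantime it can be done by hand, since every openQA job lists its assets, and ISOs are served from a predictable path. A rough sketch (the `assets` field in the job JSON and the `/assets/iso/` URL layout are assumptions based on how openQA serves assets today):

```python
# Sketch: find and download the installer ISO produced by an image-build
# job. The /assets/iso/ layout reflects how openQA serves assets today,
# but treat it as an assumption.
import requests

OPENQA = "https://openqa.fedoraproject.org"

def download_job_iso(job_id):
    """Download the first ISO asset attached to an openQA job."""
    resp = requests.get(f"{OPENQA}/api/v1/jobs/{job_id}", timeout=60)
    resp.raise_for_status()
    isos = resp.json()["job"]["assets"].get("iso", [])
    if not isos:
        raise RuntimeError(f"job {job_id} has no ISO assets")
    name = isos[0]
    with requests.get(
        f"{OPENQA}/assets/iso/{name}", stream=True, timeout=60
    ) as dl:
        dl.raise_for_status()
        with open(name, "wb") as out:
            for chunk in dl.iter_content(chunk_size=1 << 20):
                out.write(chunk)
    return name
```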

GNOME application start/stop testing

My colleague Lukáš Růžička has recently been looking into what we might do to streamline and improve our desktop application testing, something I'd honestly been avoiding because it seemed quite intractable! After some great work by Lukáš, one major fruit of that work is now visible in Fedora openQA: a GNOME application start/stop test suite. Here's an example run of it - note that more recent runs have a ton of failures caused by a change in GNOME; Lukáš has proposed a change to the test to address that, but I have not yet reviewed it.

This big test suite simply starts and then exits a large number of the applications installed by default on the Fedora Workstation edition, making sure each one both launches and exits successfully. This is of course pretty easy for a human to do - but it's extremely tedious and time-consuming, so we don't do it very often (usually only a handful of times per release cycle). That means a critical bug in an application we don't commonly use - like failing to launch at all - could go unnoticed for quite some time.
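The real suite drives a full GNOME session through openQA's screen matching, but the core idea is simple enough to sketch standalone. The following is a drastically simplified stand-in that just runs each application's Exec command from its desktop file, checks the process is still alive after a few seconds, and then terminates it; it has none of the real tests' visual verification, and the desktop-file glob is purely an illustrative assumption:

```python
# Sketch: a drastically simplified stand-in for the start/stop suite.
# The real openQA tests drive a full GNOME session and verify each
# application's window visually; this just runs each Exec command and
# checks the process survives a few seconds.
import configparser
import glob
import shlex
import subprocess
import time

def exec_command(desktop_file):
    """Extract the Exec command from a desktop file, minus field codes."""
    # Interpolation must be off: Exec lines contain literal % field codes.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(desktop_file)
    exec_line = parser["Desktop Entry"]["Exec"]
    return [arg for arg in shlex.split(exec_line) if not arg.startswith("%")]

for desktop_file in sorted(
    glob.glob("/usr/share/applications/org.gnome.*.desktop")
):
    try:
        proc = subprocess.Popen(exec_command(desktop_file))
    except (KeyError, FileNotFoundError) as err:
        print(f"{desktop_file}: could not launch ({err})")
        continue
    time.sleep(5)  # give the app time to start (or to crash)
    if proc.poll() is None:
        print(f"{desktop_file}: started OK")
        proc.terminate()  # the 'stop' half of the check, crudely
        proc.wait()
    else:
        print(f"{desktop_file}: exited early (code {proc.returncode})")
```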

Making an automated system like openQA do this is actually quite a lot of work, so it was a great job by Lukáš to get it working. By monitoring the results of this test on the nightly composes closely, we should now find out much more quickly if one of the tested applications is completely broken (or has gone missing entirely).
