Fedora 24 and Rawhide: What's goin' on (aka why is everything awful)

Hi folks!

Welp, I was doing a Fedora 24 status update in the QA meeting this morning, and figured a quick(ish) summary of what all is going on in Fedora 24 and Rawhide right now might also be of interest to a wider audience.

So, uh, the executive summary is: stuff's busted. Lots of stuff is busted. We are aware of this, and fixing it. Hold onto your hats.

glibc langpacks

A rather big change landed in Fedora 24 and Rawhide last week. glibc locales are now split into subpackages using the 'langpack' mechanism that yum introduced and dnf supports. This lets us drop a somewhat ugly hack we were using to remove unneeded locales from space-sensitive images (Cloud and container images). However, it broke...quite a lot of stuff.

Locales lost on update/upgrade

As it initially landed, Fedora 24 / Rawhide users who updated to the new glibc packages lost all their locales; after the update you may have no locales except C and C.UTF-8. Various apps have trouble when a locale is configured but not available, including but probably not limited to ssh and gnome-terminal. If you've hit this, you'll probably want to run something like dnf install glibc-langpack-en at least (substitute your actual locale group for en). If you just want all locales back (so you can test apps in other locales and so on), you can run dnf install glibc-all-langpacks.
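As a rough illustration of the "substitute your actual locale group" step, here's a minimal Python sketch that guesses the langpack package name from a LANG-style value. The helper name langpack_for is made up for this example, and it assumes the package is simply named glibc-langpack- plus the language code before the underscore, as in the glibc-langpack-en example above; check with dnf before trusting the guess for your locale.

```python
import os


def langpack_for(lang):
    """Guess the glibc langpack package for a LANG-style value.

    Hypothetical helper: assumes the package is named
    glibc-langpack-<language code>, following the glibc-langpack-en example.
    Returns None for C / POSIX (or an empty value), which need no langpack.
    """
    # Strip any encoding (".UTF-8") and modifier ("@euro") suffixes.
    base = lang.split(".", 1)[0].split("@", 1)[0]
    if base in ("", "C", "POSIX"):
        return None
    # "de_DE" -> "de"
    language = base.split("_", 1)[0].lower()
    return "glibc-langpack-" + language


if __name__ == "__main__":
    package = langpack_for(os.environ.get("LANG", ""))
    if package:
        print("dnf install " + package)
    else:
        print("No langpack needed for the current LANG")
```

Run on an affected system, this just prints the dnf command to try; it doesn't install anything itself.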

Installer doesn't run any more

anaconda tries to set os.environ["LANG"] as the default locale when starting up, but there's no dependency or lorax configuration to pull any glibc langpacks into the installer environment. The result is that in recent Rawhide and F24 nightly installer images, anaconda blows up during startup, trying to set a locale that isn't installed.
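To make the failure mode concrete, here's a small Python sketch of setting a locale from LANG and falling back when it isn't installed. This is not anaconda's actual code (and not necessarily how the bug will be fixed); set_locale_with_fallback is a hypothetical name, and it only uses the standard library's locale.setlocale, which raises locale.Error for a locale that isn't available.

```python
import locale
import os


def set_locale_with_fallback(requested, fallback="C"):
    """Try to activate `requested`; use `fallback` if it isn't installed.

    Returns the name of the locale that was successfully set.
    """
    try:
        locale.setlocale(locale.LC_ALL, requested)
        return requested
    except locale.Error:
        # The locale is configured but not available, which is the
        # situation after the langpack split. C should always exist.
        locale.setlocale(locale.LC_ALL, fallback)
        return fallback


if __name__ == "__main__":
    wanted = os.environ.get("LANG") or "C"
    used = set_locale_with_fallback(wanted)
    print("requested %s, using %s" % (wanted, used))
```

Without the except branch, the first setlocale call is where a program would blow up on a system with no langpacks installed.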

Live and cloud images don't build

This is actually a consequence of the previous issue. Cloud images are built via anaconda. As of last week, with the Pungi 4 switchover, live images are also now built using anaconda (via livemedia-creator). anaconda hits the locale bug in both those workflows, blows up, and consequently we get no live or cloud images in current Rawhide and Fedora 24 nightly composes.

Pungi 4

Speaking of Pungi 4...yep, as of the middle of last week, Fedora 24 and Rawhide composes are being done with that tool, as I've been talking about for a while. You can see evidence of this in the Rawhide and Branched trees. They now look more like release trees, with variant directories at the top level and all the regular images produced daily (well, they should be, except see above for why half of them are missing). If you've got scripts or anything which expect a certain layout of these trees, you're probably going to have to update them.

Up until glibc threw a spanner in the works this seems to have turned out quite well, but there are a few known consequences so far. There was a bug with the Server DVD installer image not booting properly due to an incorrect inst.stage2 kernel parameter, but that seems to be fixed now.

No name resolution on live images

If you manage to find a Pungi 4-created live image (from the few days before glibc broke 'em) and get it to boot, you'll probably find networking is busted. In fact basic connectivity works, but name resolution doesn't. This is because /etc/resolv.conf is a dangling symlink. This is the latest incarnation of a longstanding...disagreement among the systemd developers, NetworkManager developers, and everyone else unfortunate enough to get caught up in the crossfire. No doubt it'll get bodged up somehow this time, too, soon enough.

You can easily resolve the problem manually with rm -f /etc/resolv.conf; ln -s /var/run/NetworkManager/resolv.conf /etc/resolv.conf.

The change here isn't Pungi 4 per se, but the fact that under the Pungi 4 regime, live images are now created by livemedia-creator rather than livecd-creator. livecd-creator stuffed a /etc/resolv.conf into the live image it created, which avoided this bug by preventing systemd-tmpfiles from creating it as a dangling symlink on boot. livemedia-creator does not do this, so when the live image boots /etc/resolv.conf does not exist, systemd creates it as a dangling symlink, and NetworkManager refuses to replace the dangling symlink with its own symlink.
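If you'd rather check for the dangling symlink before blowing it away, the test and the rm/ln workaround can be sketched in Python like this. The function names are made up for the example, the paths in the main block are the ones from the workaround above, and you'd need to run it as root for it to actually touch /etc/resolv.conf.

```python
import os


def is_dangling_symlink(path):
    """True if `path` is a symlink whose target does not exist."""
    # islink() looks at the link itself; exists() follows it.
    return os.path.islink(path) and not os.path.exists(path)


def relink(path, target):
    """Rough equivalent of: rm -f path; ln -s target path"""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
    os.symlink(target, path)


if __name__ == "__main__":
    resolv = "/etc/resolv.conf"
    if is_dangling_symlink(resolv):
        relink(resolv, "/var/run/NetworkManager/resolv.conf")
        print("relinked " + resolv)
    else:
        print(resolv + " is not a dangling symlink, leaving it alone")
```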

Rawhide / Branched reports missing depcheck information

The 'Rawhide report' and 'Branched report' emails are still going out, but they're now generated by a new tool and look a bit different. I kinda like the added information, but some people don't like the new format so much; send patches ;) It is known that at present the new reports are missing information on broken dependencies, and releng are working to get this back ASAP.

compose check report emails not appearing

I've mentioned this before, but briefly, the 'compose check report' emails sent out by my tool aren't happening at all at the moment. The process for producing them runs through fedfind and needed rather a lot of rework for the new Pungi 4-ish world. I have code that works now and am aiming to get it deployed this week. Right now all the reports would basically say "all the tests failed and half the images are missing" due to the above-mentioned problems anyhow.

Long term I'd like to move the image checks from check-compose into compose-utils and thus have the 'missing expected images' and 'image diff to previous compose' bits appear in the 'Rawhide report' / 'Branched report' emails; check-compose would then just generate an openQA test report, basically. Doing that cleanly requires a change to the productmd metadata format, though, which I need to work through with pungi and productmd folks.

Release validation test events not happening

Also due to the compose process changes, we can't really create release validation events at present. Well, we could create nightly ones, but the image download tables would be missing, and we'd have to do it manually; the stuff for creating them automatically is kind of outdated now (it relied on some assumptions about the compose process which no longer really hold true). We can't do Alpha TCs and RCs (and thus the events for them) until we work out with releng how we want to handle TCs and RCs with Pungi 4.

This week I'm aiming to at least update python-wikitcms and relval so we can have proper nightly validation events again, with correct download links. Probably this will just involve three things: changing the page names a bit to add the 'respin' component of Pungi 4 nightly compose IDs (so we'll have e.g. Test Results:Fedora 24 Branched 20160301.0 Installation or Test Results:Fedora 24 Branched 20160301.n.0 Installation instead of Test Results:Fedora 24 Branched 20160301 Installation); tweaking wikitcms a bit to add the 'respin' concept to its event/page versioning design; and writing a fedmsg consumer that creates the nightly events every so often, replacing the relval nightly --if-needed mode.
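For the page naming part, here's a tiny Python sketch of what adding the respin to the page name could look like. To be clear, nightly_page_name is a hypothetical helper written for this post, not python-wikitcms API, and the type_marker switch only exists because it's not yet decided which of the two forms above will be used.

```python
def nightly_page_name(release, date, respin, testtype,
                      milestone="Branched", type_marker=True):
    """Build a nightly validation page name that includes the respin.

    Hypothetical helper, not python-wikitcms API. `type_marker` picks
    between the two candidate forms: 20160301.n.0 (True) or
    20160301.0 (False).
    """
    if type_marker:
        version = "%s.n.%d" % (date, respin)
    else:
        version = "%s.%d" % (date, respin)
    return "Test Results:Fedora %s %s %s %s" % (
        release, milestone, version, testtype)


if __name__ == "__main__":
    print(nightly_page_name(24, "20160301", 0, "Installation"))
    print(nightly_page_name(24, "20160301", 0, "Installation",
                            type_marker=False))
```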

It'll probably take a bit longer to figure out what we want to do for non-nightly composes.

Other bits: Wayland and SELinux

We also have a couple of other fairly prominent issues related to other changes.

Lives don't boot or don't work properly with SELinux in enforcing mode

A change to systemd seems to result in several things in /run being mislabelled in Fedora live images. (Yeah, yeah, systemd and SELinux...please put down the comment box and step away from the keyboard, trolls; moderation is in effect.) With SELinux in enforcing mode (the default), this seems to result in Workstation lives not booting (it sits there looping over failing to set up the live user's session, basically). KDE lives boot, but then lots of stuff is broken (you can't reboot, for instance, and probably lots of other bits; that's just the one our tests noticed). I haven't checked other desktops yet.

You can work around this one quite easily by booting with enforcing=0.

Installer doesn't run on Workstation lives

Workstation live images for F24 and Rawhide were flipped over to running on Wayland by default (in most cases) quite recently. Unfortunately, the live installer relies on using consolehelper to run as root, but consolehelper doesn't work on Wayland. So if you find a recent Rawhide / F24 Workstation nightly live, and you get it to boot, and you ignore the fact that networking is busted, you won't be able to install it (bet you were just dying to do that after this blog post, weren't you?) unless you just run /usr/sbin/liveinst directly as root. Well, I mean, I'm not guaranteeing you'll actually be able to install it if you do that. I haven't got that far in testing yet.

IN CONCLUSION

So, um, yeah. We know everything's busted. We know! We're sorry. It's all gettin' fixed. Return to your homes, and your Fedora 23 installs. :)

Comments

adamw wrote on 2016-03-01 00:34:

For extra credit: there is also GCC 6. That's probably busting things too! Software is fun.

PATO wrote on 2016-03-05 15:59:

Completely agree with you, I have been facing the same as you this week. It is a front collision with a truck. Rolling back this weekend to Fedora 23 as my work laptop is not unusable but useless (my fault... Dumb & Dumber).