On Fedora 11 installation

I have been, as I mentioned, following up on Fedora 11 release stuff. Sadly, there've been a few very negative reviews and comments, based on problems with the partitioning stage of installation.

So, here's the deal on that. I've been explaining it in comments and so on, but I thought it would be worth a recap blog post. The reason there's an unusually high number of bugs in the storage code in anaconda is simple: it was entirely rewritten for Fedora 11. Here's the feature page explaining this.

As you can see from the feature page, this is a pretty complex area: there are multiple filesystems, things like RAID and LVM, hardware issues, and desired and pre-existing partition layouts to consider. There's a ton of variables, in other words, many of the combinations of which we are unlikely ever to hit in internal testing. We (the QA group) could test installs forever and a day and probably not hit some of the situations that get hit very rapidly once code gets out to the real world.
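
To put a rough number on that, here's a toy sketch in Python. The option lists below are made up purely for illustration (they are not anaconda's real support matrix or our actual test matrix), and `count_scenarios` is a hypothetical helper, but they show how quickly a handful of independent variables multiply out.

```python
from itertools import product

# Hypothetical option lists, chosen only for illustration; the real set of
# storage variables the installer has to cope with is larger and messier.
FILESYSTEMS = ["ext3", "ext4", "xfs"]
VOLUME_LAYERS = ["plain", "lvm", "raid", "lvm-on-raid"]
ENCRYPTION = ["none", "luks"]
EXISTING_LAYOUTS = ["blank disk", "windows", "older fedora", "other linux", "multi-boot"]
DISK_COUNTS = [1, 2, 4]


def count_scenarios(*dimensions):
    """Count every combination of one choice from each dimension."""
    return sum(1 for _ in product(*dimensions))


total = count_scenarios(
    FILESYSTEMS, VOLUME_LAYERS, ENCRYPTION, EXISTING_LAYOUTS, DISK_COUNTS
)
print(total)  # 3 * 4 * 2 * 5 * 3 = 360
```

That's 360 distinct installs from five very coarse variables, before you even start on specific hardware or the exact contents of the existing partitions.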

Some questions arise. Why was a rewrite needed in the first place? This is explained on the feature page. Basically, the existing code is very old and was not written to be easily extensible; it makes it very hard to add support for interesting new things like LUKS and iSCSI. We wanted a more modular code base so these and future desirable storage-related innovations could be handled properly. The longer you delay a necessary rewrite, the more pain is involved, so it made sense to do it as soon as the need was identified.

Why was the rewrite put into the main Fedora release so soon? It would have been possible to keep the 'old' and 'new' tracks separate, maintain both, and ship Fedora 11 with the 'old' code, and maybe only bring the 'new' code into Fedora 12. There's a couple of reasons not to do this.

First, as noted above, many situations just aren't tested until the code gets out. Pretty much all the situations we can test internally do work in F11's anaconda. Our test matrix is full of green check marks. So even if we'd delayed the new storage code until F12, it probably wouldn't have had many more problems fixed than it does at present. We have to find out about the problems before we can fix them.

Second, this would have used up (or, in many ways, wasted) rather a lot of developer resources. We would have had to split the anaconda team and have some of them work on maintaining the old code for the F11 release. This work would have been essentially wasted as that code was destined for the trash, and it would have been that many man hours diverted from work on the new code. So we decided to push the new branch into the main codebase relatively quickly and have it released in F11.

Final question is more of a rant I'm seeing a lot of: why do you guys suck so much? Why are there so many bugs? Surely the coders must just be lazy / incompetent if they couldn't fix $MY_ISSUE before release?

Short answer - well, no, they're not. First, here's a solid number for you: from January 1st to now, a total of 332 bugs were filed on anaconda in Rawhide (so F11 at the time) and fixed. Here's the list. Just as a quick ballpark comparison, the number for the kernel over the same period is 122. So it's certainly not the case that the anaconda team are a bunch of lazy asses; they've been working their behinds off on bug fixing throughout the F11 cycle.

Are they incompetent? No, they're not that either. As I mentioned at the top, storage during installation is an innately tricky area, and Fedora has to support a lot of different scenarios here (especially as more or less the same code is used in RHEL, which has a lot of fairly robust requirements in this area). A lot of the variables have a huge range of possible values (previously existing partition layout, for example) - so what looks like just a 'perfectly standard installation' could actually have several thousand variations for different sets of data that previously exist on the disk and different hardware. When you're working from scratch to write code to handle a situation this icky, it's just inevitable that bugs happen. No set of coders could likely have produced a significantly different result.

In conclusion - we knew this was going to happen, and we went into it with our eyes open. We knew there'd be regressions. That's regrettable, and it's fine to criticize this, mark F11 down for it in reviews, and warn readers that the installer's storage code is a bit problematic and they may hit issues here. That's all perfectly true and valid and fair. What I wanted to address with this are just the questions of why this is the case, why it's not because we just suck at what we do, and why we went ahead and did it even though we knew it would cause some level of pain. Hope it's been useful.

And of course: FILE BUGS! When Anaconda fails, it usually gives you a dialog box with a traceback of the issue, and lets you save a copy. Please do so, and file a bug explaining where it failed, what choices you made during installation, and ideally the previous partition layout of your system. Include the traceback as an attachment. If you don't report the problems, they don't get fixed. Thanks!

Comments

jspaleta wrote on 2009-06-16 16:46:
A few thoughts: 1) Can we expose the testing matrix used as part of the marketing materials as a nudge for reviewers to do a review based on an install inside that matrix? 2) What more can we do to encourage people interested in more exotic situations outside the existing testing matrix to extend it prior to release? 3) And this is the toughest question of the bunch.... If you know there are going to be a significant number of regressions associated with extensive installer development going into a release cycle that are only going to be picked up at release... would it make sense to plan for and roll up installer updates post-release and plan for re-releasing ISOs with the fixed installer? -jef
grangerx wrote on 2009-06-16 21:03:
Hello, I have a fairly heavily formatted main PC. I immediately hit a show-stopper when trying to install Fedora 11 during partitioning. I haven't filed a bug. Back in the days of Fedora 4 or 6, I put in an Anaconda bug report, with as much information as I could get, and I followed up on it if the developers asked questions. The result: In a month, someone generated an updates.img to fix the bug ... but *only* for the RHEL version that was based on the same code. But, for the Fedora that was having trouble installing, they basically answered with a politically correct "Screw you on getting an updates.img that works with Fedora. Just wait until Fedora N+1." Since then, I've followed along with installer bugs and their resolutions as a morbid hobby. The "you're screwed for six months" seems to be standard Fedora policy for any bug like this, and it massively soured my personal desire to put in bug reports for Fedora. I even started trying to figure out how to "fix it myself", but Anaconda was unusually inadequately documented back then. Even the updates.img format itself was apparently three different things, depending on what version of Fedora was being used. I haven't ventured back into those waters since. Is it better documented now? I understand the need for re-writes, etc, on this code, but Fedora needs to do better about things like updates.img files. They often exist for various installation issues, but they're *never* official, instead put out by some-random-developer at some random URL, and even their existence is poorly publicized. I've often wondered why there isn't some nearly single-click mechanism in Fedora to combine a standard Fedora release ISO with a latest/official updates.img and write it out to a DVD, so that the burden of installer bugs is minimized with a 1.44MB download that can work with whatever ISO you've got already, instead of "wait until Fedora N+1".
I started looking at using mkisofs to do a multi-session dvd with an iso and an updates.img, but I haven't gotten too much further. I'd like to see some form of post updates "Fedora 11.i2" installer nomenclature/capability be added, but I'm not actually sure to whom to suggest that. Any pointers would be appreciated. Just my thoughts, GrangerX
adamw wrote on 2009-06-16 22:18:
As far as I know, we still don't really do anything along those lines. Although Fedora Unity re-spins will pick up installer fixes that are released between the initial release and when they do their respin, I think, assuming they're pushed into the Anaconda package in the tree of the release in question. I believe this is mostly a manpower issue. RHEL customers pay for support, so we commit to fix their installation issues - there are staff who more or less just work on things like creating updates.img files for particular RHEL customers who suffer from install issues. But the Fedora Anaconda team would rather spend its limited resources on working on the code than on building and releasing update images. Of course, this could well be a 'patches welcome' situation. I'll try and remember to ask the Anaconda devs if there's decent documentation for the process of creating updates.img. If so, I'm sure it would be possible for someone to volunteer to build updates.img with collected Anaconda fixes, publish and document these somewhere...it would be a useful project for sure.
jspaleta wrote on 2009-06-17 00:31:
Adam, The updates.img is more complicated than just anaconda as it is also a releng issue. It's a complicated and tense issue and it definitely puts manpower resource constraints front and center. It's sort of a chicken and egg problem. There are priorities established by existing available manpower, so we build processes and policies which reinforce those prioritized choices that support the existing contributors. Once those policies are established it then becomes more difficult to incorporate a new team of people to work on something that was deemed a low priority for existing manpower if that low priority work intersects with the work the existing team is doing. I don't have a suggestion to solve that without disruption of existing work, and even then you risk short term negative impact until new manpower can be recruited. -jef
adamw wrote on 2009-06-17 00:49:
Jef: can you comment more specifically? What about our current "policies and processes" would make it tough for a volunteer (or volunteers) who wanted to create and distribute updates.img files incorporating the latest anaconda fixes?
Paul Frields wrote on 2009-06-18 04:14:
If you have an updates.img, you actually don't need to combine it with anything -- IIRC there's an Anaconda switch to simply point at the new updates.img via a URL and use it. So if we could make the process of making an updates.img a bit more transparent and then show people how easy it is to use that switch, it would be a big win. Adam, thanks for making a point of patrolling some sites and offering information and assistance for people who've had a nonstandard experience with Anaconda. It sucks when we disenfranchise people with the installer, but people do need to know that the devs really are awesome and work spectacularly hard.
amcnabb wrote on 2009-06-22 21:08:
I appreciate the hard work that the developers have done for Anaconda, but I think it was very unfortunate that Fedora 11 was released in such an incomplete state. When the installer is incomplete, people have every right to complain about the quality of the release. I was trying to test as much as I could before the release, but it's really hard when you run into three or four bugs at a time. I've run into at least a dozen Anaconda bugs since the beta (maybe 2 dozen), many of which exist in the final Fedora 11 release. I even noticed three or four new ones this weekend. The developers were doing a great job fixing the bugs, but they really needed a few more months to get everything done. Rewriting the storage system is a huge task, and it's not the sort of thing that you push out before it's done. When you do, you have to expect to see negative reviews and comments. I actually think that people have been remarkably positive, all things considered. At this point, I think a Fedora 11.1 release, or at least a new updates.img, would be extremely helpful. Naturally, it would be good for the users because they would have a more installable release. However, it would be good for the developers because they would get a new round of testing and fewer duplicate reports.