Gaming

When I was coding at IBM, we had pretty clear quality metrics that had to be met before a product went out the door.  We had to execute all of our tests, and pass 95%, for instance.  No, not 100%, because good developers ought to write tests even if they know the current code won’t be able to pass them – that’s far better than not writing the test, and someone at IBM got that.  We also couldn’t ship with any P1 defects, and all P2 defects had to have a “disposition” – a workaround, or at least clear documentation on alternatives.  We were, after all, IBM.

I remember one product cycle where things were particularly tight.  Maybe they’re all “particularly tight.”  In this case anyhow, some teams had fallen far behind, to the point that our team was being brought in to do triage and QA on their code as well.  It was a stressful time for the product managers, for the whole department.

We were also not meeting our quality goals.  There were significant P1s that still didn’t have fixes, and our pass rate on tests was mid-80s.  We were asked to “focus.”

Whether it was encouraging “focus” per se, or just competent, dedicated people trying to do their job, we made some headway.  Tests-passed got into the high-80s, not many P1s got fixed but a couple more P2s had workarounds written.  Not enough, but better.  Still, we were about to run out of time.  That’s when we got an email.

“We test our code to make sure that the intended functionality succeeds,” it started (or words to that effect).  “Obviously, it wouldn’t make sense to test functionality we never expected to have.  If we were releasing a word processor, and wanted to get inline spellcheck in, but just couldn’t do it, well then it would hardly be sensible to wring our hands about failing the inline spellcheck tests, would it?”

Oh…kaaaaay… we thought, all of us together.

“So if there are tests failing that we know we can’t fix in time, then that’s functionality we don’t intend to ship.  So it doesn’t make sense to include those in our tests.”

With those tests removed, of course, our pass rate went way up.  Ahem.
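For anyone who wants the arithmetic spelled out, here is a minimal sketch of how that works. The numbers are invented for illustration (I don't have the real suite size or failure counts), and the function names are hypothetical; the only figure taken from the story is the 95% threshold.

```python
def pass_rate(passed: int, total: int) -> float:
    """Fraction of executed tests that passed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed / total


def meets_gate(passed: int, total: int, threshold: float = 0.95) -> bool:
    """True if the pass rate clears the ship threshold (95% in the story)."""
    return pass_rate(passed, total) >= threshold


def failing_tests_to_drop(passed: int, total: int, threshold: float = 0.95) -> int:
    """Smallest number of failing tests to delete so the gate passes.

    Deleting a failing test shrinks the denominator and leaves the
    numerator alone, so the rate rises without any code getting better.
    """
    dropped = 0
    while not meets_gate(passed, total - dropped, threshold):
        dropped += 1
    return dropped


# Hypothetical suite: 1000 tests, 870 passing ("high-80s").
total, passed = 1000, 870

before = pass_rate(passed, total)           # 870 / 1000
needed = failing_tests_to_drop(passed, total)
after = pass_rate(passed, total - needed)   # same 870 passes, fewer tests

print(f"before: {before:.1%}, drop {needed} failing tests, after: {after:.1%}")
```

Nothing about the product changes between "before" and "after"; only the denominator does, which is exactly why the metric stopped meaning anything.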

There was still the matter of the wayward P1s and P2s, but every developer in the room knows how those were fixed.  One morning we all came in to a bunch of bugmail saying that our P2s were now, coincidentally and en masse, P3s; our P1s were all either P2s or P3s depending on how plausibly a workaround could be written.

And the product shipped.  And customers complained.  And tech sales wept.  And a year after shipping we had no active, deployed, reference customers.  And we did that thing, where we taught our customers not to trust our X.0 software, to wait for at least two service packs before trusting us.  I hate doing that thing.

This isn’t about me throwing stones at IBM, it’s about underscoring how hard metrics are to get right, and how prone people are to gaming them when their incentives are misaligned.  I bet the product managers got congratulated for shipping Another On-Time Release. I’m sure, too, that the blame for the market failures was spread broadly enough to be much less impactful, so it’s hardly surprising that PMs would act this way.  I know that’s not novel insight, but I’ve always held on to that story as one of my own favourite examples.

The Mozilla community has amazed and impressed me with its active awareness of, and resistance to, these kinds of games, but it’s a never-ending battle.  We, too, will second-guess our decision to mark some feature as P1 when we get down to it, or our decision to mark some bug as blocking.  But I feel like there’s a cultural difference in game-awareness that’s important; those decisions generally seem to have “Are we gaming things here?” as part of the discussion.  Can anyone tell me how we get there?  IBM is not full of idiots nor of self-serving cynics.  If someone can tell me how to bottle that awareness, and cultivate it in software companies, and make it stick, I’ll write the book and give you a cut.

3 comments

  1. Well, that one is pretty easy, IMHO: The difference is actually openness and the community.

    Mozilla has, in its over 9 years of existence, learned the community is there and it’s watching. If you start gaming, it will call your bluff relentlessly, and even before you are shipping a product based on those games. You downgrade someone’s pet bug and he’ll shout out loudly. You ditch someone’s pet RFE from the list of things to ship and he’ll blog about it and tell others who might have wanted this. You can never indulge yourself in thinking the knowledge of that gamble will stay in-house. You know it won’t. And because you know that, you won’t dare to play the game. If you are ditching something, you first have to have a pretty strong argument for doing this. If you downgrade the priority of something, remove blocker status or such, you need a pretty good explanation or it’ll come back to you. You can’t just keep it in the team because you’re always in the public in some way. And the public, i.e. the community, won’t believe a cheap excuse.

    I don’t even think that people in those software companies don’t know about their gaming; they just make themselves believe that it’s OK to do so as the users won’t see this or that problem or won’t care about this or that feature. You only need to satisfy yourself, your team, your managers – for the moment.

    In an open community, you need to account for every such step to actual users in the community, and not only after shipping but actually right when you take the step. Once you learn that, it changes your thinking. And Mozilla has learned that quite well over the years. We never would have been able to call what Netscape shipped as their 6.0 release a final release of Mozilla. Community members back then knew it was a nice testing version but not ready for the casual user. And Netscape6’s non-success told us once again that the community was right.

    So, what companies can learn here is:
    1) Make your processes open, have the community actively participate, and quality will profit.
    2) Believe testers in the community. Sure, they are talking about their pet feature/bug, so take feedback with a grain of salt. But at least re-think if what you’re doing is really what you should do when the community disagrees. There’s some possibility that they are actually right.
    3) Never ship on a fixed schedule and never rush a release. Only, ever, ship when it’s ready to be shipped, never before that point. It’s better to tell users you need to ship a month or two later to meet your high internal quality standards and have some press about people eagerly waiting on your release than to ship right on time and have even more press about how your product sucks.

  2. Bibs on everyone – let’s eat our own dogfood (this could be messy, but bear with it). If the developer (and ensuing manager) are customers, I’m pretty sure the game will change.
