Vacuums and You (or, Estimating Like an Astronaut)

I’m going to teach you a surprisingly effective trick for estimating better, but first I need to talk about dressing up vacuum cleaners.

Ze Frank is a pretty creative guy, but what makes him really interesting to me is his ability to make other people creative. It’s what he does. He catalyzes creativity, frequently among those who don’t consider themselves creative. And when he talks about how he does it, he talks about the value of constraint.

Asked to go and “be creative,” he notes, most people shut down. So, instead, he asks for something more specific. He asked them to make a whole earth sandwich; they made a few. He asked people to send in pictures of vacuum cleaners dressed as people. He got 215. Constraining people, forcing them to solve a smaller problem, made them better at it.

Creativity isn’t the only thing that benefits from constraint. Asking engineers (or, really, anyone) for “an estimate” is akin to asking them to “be creative.” They know what examples of the thing in question look like, and they understand that it’s a reasonable request. They just don’t actually know how to get there from here, much less how to be accurate about it.

Back in the sixties, NASA and the US DoD were spending a great deal of money on engineering. They therefore took a keen interest in improving planning and estimation, not unlike the interest you might take if someone were setting all of your money on fire. Out of this interest sprang the mellifluously titled “PERT/COST SYSTEMS DESIGN,” which, on the subject of estimation, made this central observation:

If you ask engineers for 3 estimates (Best Case, Most Likely, Worst Case) instead of 1, you get different answers.

That’s pretty exciting! Constraints get us different answers, and different answers mean more bits of information. If you’re not convinced that this is brilliant, though, here comes some next-level awesome: a (weighted) average of these 3 estimates is a better predictor of actual completion time than any one of them. Specifically,

(Best + 4*Most Likely + Worst) / 6

turns out to work pretty well in the general case. These so-called “PERT Estimates” or “3-point Estimates” give engineers credit for their assessment of “most likely” by weighting it heavily, but still allow optimism and pessimism to pull the average. I dare you to argue with this graph:

[Figure: Likelihood of project completion date vs estimates (Science, bitches!)]

Having 3 data points actually helps in other ways, too. It means you can more clearly quantify the uncertainty of a project by comparing best and worst case estimates, and watching to see if the distance between them shrinks over time. It means you can produce “optimistic” and “pessimistic” schedules. And, most importantly, it means that everyone is saying the same thing when they estimate.
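
If you want to play with the numbers, here is a minimal sketch of the weighted average and the best-to-worst spread described above. The task names and day counts are made up for illustration, and the class and property names (ThreePointEstimate, expected, spread) are my own labels, not anything from the PERT documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreePointEstimate:
    """Best, most likely, and worst case for one task, in days."""

    best: float
    most_likely: float
    worst: float

    def __post_init__(self):
        # The three points only make sense if they are in order.
        if not (self.best <= self.most_likely <= self.worst):
            raise ValueError("expected best <= most_likely <= worst")

    @property
    def expected(self) -> float:
        # (Best + 4*Most Likely + Worst) / 6
        return (self.best + 4 * self.most_likely + self.worst) / 6

    @property
    def spread(self) -> float:
        # Distance between worst and best: a rough gauge of uncertainty.
        return self.worst - self.best


# Hypothetical tasks with invented numbers, purely for illustration.
tasks = {
    "parser rewrite": ThreePointEstimate(best=2, most_likely=4, worst=12),
    "sync UI": ThreePointEstimate(best=3, most_likely=5, worst=7),
}

for name, est in tasks.items():
    print(f"{name}: expected {est.expected:.1f} days, spread {est.spread} days")

# Optimistic and pessimistic schedules are just sums of the extremes.
optimistic = sum(est.best for est in tasks.values())
pessimistic = sum(est.worst for est in tasks.values())
expected_total = sum(est.expected for est in tasks.values())
print(f"schedule: {optimistic} to {pessimistic} days, expected {expected_total:.1f}")
```

Both invented tasks come out to the same weighted estimate of 5 days, but their spreads (10 days vs 4) tell you which one deserves more worry. Re-run it as estimates are revised and you can watch whether that gap shrinks.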

Best, Worst, Most Likely. Try it for your next project, and see how it works. As we finish Firefox 4 and start looking at what comes next, there will be plenty of estimation happening, and I’m keen to see us bringing more science to the table. This may not be the right model for us, or we may discover that the coefficients need changing in our version of the equation; that’s fine. That would actually be a great result. My interest isn’t in pushing a particular tool; my interest is in getting better at planning, and in getting more awesome out to our users faster. I think we do that by looking for systems that have worked for others, and seeing how well they adapt to us.

And then we dress up the vacuum cleaners.

7 comments

  1. [...] This post was mentioned on Twitter by Christopher Blizzard and Planet Mozilla, Thomas Roessler. Thomas Roessler said: An great lesson from @johnath: http://t.co/rtubACR [...]

  2. Insightful read – thank you :-)

  3. Excellent. I’ve always given a (Best Case, Worst Case) pair as an answer whenever I’m asked for an estimate, and now I know why I did that instinctively. Now I can try giving a (Best Case, Likely Case, Worst Case) triplet as well.

    Business types do not like to hear a range when they ask for an estimate though. Now I have more reasons to tell them why it makes more sense to have a range.

  4. Majken "Lucy" Connor

    Very cool! The way I interpret this is along the lines of something I was already thinking – people want to do a good job. The more constraints you give them the easier it is for them to be confident that they’re doing a good job, and the right job.

    To relate it back to estimates, people know that sooner is better, so if they only give one estimate they’ll weight it towards sooner to make a better impression (and because they want to do it sooner, sooner means better). By asking them for all three you’re showing your priority is a good estimate, rather than “how soon can it be done?” So they do a better job at estimating, knowing that’s the right job.

  5. Yeah, most of the time when engineers are asked for estimates it’s done with a simple question like “Can you have this done by the code freeze we are imposing at the end of today | the week | some arbitrary cut-off?”

    The usual response comes back (sometimes with arm twisting) that of course it can…

    An interesting exercise that we used to do more of is to ask for schedule estimates in terms of 1 day, 1 week, and 1 month increments. It gets people thinking more about the complexity of the task, and comparing it to similar tasks that took those same kinds of intervals.

    When estimates are formed in this way, interesting numbers start to come back that help us go from zero schedule information to the beginnings of some numbers to work with. Over time, as estimating skills get better, more precision can be added, but it’s best just to start with something simple.

  6. I think it’s important to note that having three points is irrelevant if the task isn’t well specified. When I was reading background information on Spolsky’s evidence-based scheduling post a few years back ( http://www.joelonsoftware.com/items/2007/10/26.html ), it seemed generally agreed that poorly defined tasks (or tasks which have no historical precedent) could not be accurately estimated, period.

    At that point, all you can do is guess at the right amount of buffer time to allow for your lack of knowledge, at which point you end up relying on folklore like, “Estimate, then double it, then double it again!” This combines in unfortunate ways with the “slow down to meet the deadline exactly on time” phenomenon. (Is there a name for that?)

  7. If the three points are

    * “this looks easy” (1 day)
    * “this looks hard” (1 week)
    * and “this looks really hard” (1 month)

    I think there *is* some value gained. You can start to triage tasks to find out where continued work will bring the greatest impact. You can do things like order the tasks by hardest, or tasks most easily completed. And you can start to do the next round of work to define what needs to be done for each task.

    There definitely is a trade-off between understanding what work needs to be done and actually doing the work vs. developing schedules for the work.

    If we want to move fast we can’t get caught up with trying to develop detailed specification of the work solely for the purposes of having great estimates; but we do need detailed enough specifications so we can get the work done, and we do need some high level estimates to help coordinate the work and make rough cuts at scheduling releases.

    Just having information available that says engineering group A has Y really hard tasks ahead, and group B has Z simple things to wrap up, is the first really easy step toward understanding what a schedule might look like.