Saturday, September 14, 2013

How confident are you with your estimates?

I know you are a busy reader, but before going any further I would like you to answer ten questions. The questions are taken from the book How to Measure Anything: Finding the Value of "Intangibles" in Business by Douglas W. Hubbard. You should be 90% confident in your answers. This means that if you are perfectly calibrated, you should get 9 correct answers and 1 wrong.


How did it go? (I'd appreciate it if you posted your result in the comments.) When I did the test myself, I got 6 correct answers. When a couple of my colleagues did it, they got 2-5 correct. Two weeks ago I was at the ALE 2013 unconference in Bucharest and held an open session there, where I asked a dozen people the questions. Most of them got 2-4, some got 5-6, and only one person got 7 correct answers.
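To make the "9 out of 10" claim concrete: if every interval you give really does contain the true answer with probability 0.9, your score on a ten-question test follows a binomial distribution. A short Python sketch (my own illustration, not from Hubbard's book) shows why scores of 6 or below point to overconfidence rather than bad luck:

```python
from math import comb

def score_probability(correct, n=10, p=0.9):
    """Binomial probability of getting exactly `correct` answers
    right out of n, when each 90% interval truly has 90% coverage."""
    return comb(n, correct) * p**correct * (1 - p)**(n - correct)

p9 = score_probability(9)                            # ~0.39: the most likely score
p10 = score_probability(10)                          # ~0.35
p_low = sum(score_probability(k) for k in range(7))  # ~0.013: 6 or fewer correct
```

So a genuinely calibrated person scores 6 or lower only about once in 80 tests; widespread scores of 2-5 are almost certainly overconfidence, not questions that are "too difficult".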


Were the questions too difficult?


When people get rather low results from the test, they typically react by saying that the questions were too difficult. This is the key point: if you don't have enough information, you should not give too narrow a range. Only when you get more information should you make the range narrower.

So how about software development? Have you ever been in a situation where your boss asks you for a quick "educated guess" about a new feature because there is a management meeting tomorrow where the estimate is needed? And although you don't feel very confident, you say, "well, maybe 30 days". The boss then takes your (educated?) guess to the meeting and lets it become a fact that is used to make an important business decision.

I think this is exactly the kind of situation where you don't have enough information to give an estimate like that. You are, after all, giving a range of 30-30, i.e. a single point! Instead, if you are a calibrated person (1), you could say: "with 90% confidence, I think it is 10-300 days". Or, if you are not, like me, you should probably just say: "with this amount of information, I don't know".


Unprofessional answers?


When I gave that advice in the ALE13 open session, two people asked whether it wouldn't be unprofessional to give an answer like that. I replied that it is actually quite the opposite.

I think it is unprofessional to pretend that you have information when you don't. I think it is actually unethical. It is much more professional to be honest and say: "I don't know".


How to narrow the range?


If you have a smart boss, he probably wants to know how to make the initial range narrower. The first thing is to spend a bit more time thinking about the problem you are about to solve. You can probably identify a couple of parts that are especially uncertain. Maybe you can build a prototype or run a technical spike for them?

Another approach is to ask: what is the most important thing we need to solve? If your initial estimate for the whole thing was 10-300 days, maybe your calibrated guess for the most important part is 2-20 days. That is still a wide range, but perhaps small enough that you can just do it and see how it goes. And, most importantly, learn by doing.

You will actually learn many things. You will learn technical details. You will understand the domain better. And you can start measuring your progress so that you can stop guessing and start forecasting.
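One simple way to turn measured progress into a forecast is to bootstrap-resample your own completed-task durations. The sketch below is a hypothetical illustration (the `forecast_days` function and the sample data are mine, not from the post): it resamples history to produce an empirical 90% interval for the remaining work.

```python
import random

def forecast_days(history, n_tasks, simulations=10_000, seed=1):
    """Monte Carlo forecast: repeatedly resample observed task
    durations to simulate the total time for n_tasks more tasks,
    then report an empirical 90% interval (5th-95th percentile)."""
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.choice(history) for _ in range(n_tasks))
        for _ in range(simulations)
    )
    return totals[int(0.05 * simulations)], totals[int(0.95 * simulations)]

done = [2, 5, 3, 8, 4, 6, 3]   # hypothetical durations of finished tasks, in days
low, high = forecast_days(done, n_tasks=10)
```

The interval narrows by itself as real data accumulates, which is exactly the shift from guessing to forecasting.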

Or you may learn that what you actually need to build is something else entirely. If that happens, what was the value of estimating the whole project beforehand? Another important aspect is that the smaller the slices of work you can create, the less need you have for estimating their size.


Cost and value


If we go back to the initial question about the project size, it may actually be the wrong question for another reason as well. When we want to figure out whether to start a new project, we tend to focus on estimating the project cost because that is the "easy" part. We don't try to estimate the value because that would be too difficult.

In his book, Douglas Hubbard criticizes this behaviour. As the title says, he claims that anything can be measured, including the value of a project, and he provides many tools for doing so. I really recommend reading the book to find out more.


PS. In Finland they are planning to build a health record system that would cost 1.2-1.8 billion euros. I wonder if that estimate has a 90% confidence interval?


(1) A calibrated person is one who regularly gets 9/10 right when asked to answer with a 90% confidence interval. When Douglas Hubbard has a measuring challenge, he trains the key people so that they become calibrated. One tool for that is answering other, similar questions. Once the key people are calibrated, they can give reliable initial estimates for the questions derived from the measuring challenge. Based on those initial estimates, Hubbard uses statistical tools to determine which part of the challenge should be measured further in order to provide the most valuable additional information with the least effort.

8 comments:

  1. I got 6 out of 10, because I'm a tech and history geek. But what's missing - and it's a serious error - is "variance bands" on the answers. Each answer is simply counted as right or wrong (0%/100%).

    A question like "estimate the top speed of the 1938 British locomotive to within an 85% confidence (±15%)" is how "basis of estimate" processes work in our domain. Early in the project that interval may even be too narrow. After early deliverables for assessing the risks and maturity, those "BOEs" are updated as part of the rolling wave process.

    The 1.2 to 1.8 B euros "must" have a confidence interval and an error band on the confidence interval. If they are using that range without stating the confidence on it, in words like "the cost of this project is 1.2 B or less with an 85% confidence," then they'd be cited as non-compliant under the US DoD cost estimating guidelines http://www.gao.gov/new.items/d093sp.pdf

    Bad estimating processes are no excuse for not doing good estimating. There are many guidelines, texts, tools, and processes out there for estimating software, hardware, pouring concrete, installing ERP systems, etc. Those railing against estimating - present company excluded - need to do their homework before proceeding with their rants.

    Google "software cost estimating" for a small sample of approaches, tools, and processes.

    These may or may not be applicable in your domain. But overgeneralizing that software cost estimates are flawed is simply uninformed without a very specific domain and context.

    Here is a sample of what we use for software-intensive, high-risk, discovery-design, rapidly-emerging-requirements (it's a science experiment) programs:

    http://goo.gl/nToZv2
    http://goo.gl/DGZ62t
    http://goo.gl/nrwgvQ
    http://goo.gl/olBBlq

    The list could go on for hundreds of sources. Do we get it right? Do we have really good estimates? No, there are "wicked problems" around estimating almost all aspects of large complex systems. http://goo.gl/G5Skpk is a high-level summary of the problem.

    I work at IDA (on another topic), but this is a sample of the overall problem http://goo.gl/DdqgI

    But "not doing estimates" is not the answer either. Deciding "how" to estimate starts with assessing the "value at risk." How much are you willing to risk by not having an estimate - at some level of confidence and some level of error (confidence on the confidence) for the basis of estimate? $100K, no one cares. Billions? Those Finnish estimators had better have done a credible job.

    We've learned much from Brian's book http://goo.gl/Y2vw6O - maybe those Finnish health care estimators should be asked if they have read his book as well.

    ReplyDelete
    Replies
    1. Thanks for your comment, Glen!

      I think you missed the point of the test. Your 6/10 is obviously much better than e.g. 2, but it still isn't very close to a 90% confidence interval. Your answers, although this is just one sample, indicate rather that your typical CI is 60%. And it doesn't actually matter how well you know the domain.

      Let's take an example. I have no idea what the Gilligan's Island TV show is. I have never heard of it, nor ever watched it. But I know that there weren't many televisions before World War II. Also, I would probably have heard of it if it had been on since I was a teenager. So my guess would be 1950-1990. On the other hand, if I were an American who used to watch that show, I could relate it to my own life and give a much narrower range. So no matter how well you know the domain beforehand, you are able to give your answer with a 90% CI; it just means you give much wider ranges than those who know the domain. I may not be able to explain this very clearly because I don't have as much experience as Hubbard. For that reason I really recommend that you read his book.

      I'm sorry, I couldn't find any English articles about the planned health record system. The 1.2-1.8 is based on the actual costs of similar investments elsewhere in the world. There were also two vendors providing their own estimates: one said 1 billion and the other 1.5 billion. I don't know if there were CIs around the estimates, but at least the report doesn't say anything about them.

      My solution is not to just stop estimating without making any other changes to the system. I think that would be horrible. Imagine if they just started building the health record system the way similar things have been built before, without considering the cost! Now that they have said it aloud, the public can at least react to it. Instead, I think that in order to reduce the importance of estimates, and perhaps eventually in some cases leave them out completely, we have to change the way we work. My first blog post (http://blog.karhatsu.com/2013/08/from-hour-estimates-gradually-to.html) described that on a small scale (I know your scale is much, much bigger), and in this post I gave one idea as well. So, I'm not so sure about this 'estimate better' approach. I would rather try to find better ways to build software, and if the by-product is that we can stop estimating, that's great.

      Delete
  2. Sorry for the run-on response. As far back as 1985, this problem of software estimating has plagued our industry. Tversky, A., and Kahneman, D., "Judgment Under Uncertainty: Heuristics and Biases," Science, v. 185, pp. 1124-1130, 1974, goes back even further.

    http://goo.gl/RjJVVy is a sample of how old the problem is and how little progress we've made since then. Good estimating is possible, but it does require that we understand some core concepts:

    - no point estimate can be credible without variance.
    - no estimate can be credible without a "reference class" basis for that estimate.

    Thanks for the thoughtful post. I was just in Stavanger, NO, speaking on project controls and cost estimating at an O&G conference.

    ReplyDelete
  3. I got effectively 9/10 (actually 8/10 due to giving fractions instead of percentage points as answers). The one that did go wrong was off by an order of magnitude (train speed). Just goes to show that I have strong opinions on some things, such as technically related matters, and I feel that I am qualified to guess a narrow range for them without actual knowledge.

    I had the advantage of having done a similar test on confidence ranges before, and of having a rather large number of estimating experiences from the last few years.

    Like you suggested in your post, the professional thing to do when estimating under extreme uncertainty is to come out and say that you have no idea how long solving that particular problem is going to take.
    Everyone is happier for me having said that a given integration will take anything between 2 and 20 days to complete with satisfactory quality. Even the budget managers who retrospectively wonder why it took 20 days even though they figured we just might get it done in 2.

    Most parts of a common, well-defined problem can be estimated with a fairly low margin of error. If a fixed estimate is required, I would split the problem domain into well-known areas for which I give a fixed estimate, and anything beyond that well-known domain must by necessity have an estimate range to account for uncertainties.

    ReplyDelete
    Replies
    1. I have noticed that doing similar tests again really does give you better results. Like I wrote, that's one thing Hubbard does when he calibrates people. At ALE 2013, three guys did another ten questions and their results went from 2-4 to 6-8. However, on average they didn't have any more domain knowledge than before. They just learned to reduce the overconfidence they had.

      Delete
  4. Henri: Excellent post - I got 2/10, mostly by falling into the trap of having my estimates in too narrow a band.

    With respect to how confident the estimators are for the Finnish health record system, I'd suggest that if they're anything like what we have experienced in Ontario, Canada[1], they will be off by a significant amount.

    When systems are so large and contain so many interdependent variables, it becomes an exercise in alchemy to derive an estimate that can be useful for making decisions. What we do have an abundant amount of data on is that software, in the large and in the small, takes the time it takes. The only thing we can really be confident in saying is that it will adhere to Hofstadter's Law.


    [1] http://www.auditor.on.ca/en/reports_en/ehealth_en.pdf

    ReplyDelete
  5. I got 8. I'm really overconfident about trains and my understanding of the costume of Isaac Newton in the drawings I've seen. :)

    ReplyDelete
  6. Ha, I just took this and got 0! :-) I purposely stayed narrow with my answers thinking I was confident in many of them. For the ones I was very unsure I still didn't widen the range. A big fat 0. :-) Awesome exercise to illustrate your points!

    ReplyDelete