Initial estimation method: hour estimates
The development team I joined as a Scrum Master had a long history. However, during the previous year about half of the team had changed. I was told that it at least used to be a great Scrum team, but talking with anyone on the team made it clear that they were struggling. I don't know whether it had been like that a year or two earlier, but at that time they were far from well-performing.
During the first sprint with them I just observed how they worked. Among many other things I noticed:
- The sprint planning meeting was neither very efficient nor effective. The group of 12 people was divided into two teams and both teams spent typically 5-7 hours on sprint planning.
- During the sprint planning meeting the Scrum Masters were using an electronic tool that contained user stories with initial story point estimates. The team was discussing the stories and the Scrum Master wrote tasks based on the discussion. For each task the team provided an hour estimate using the planning poker method.
- The atmosphere during the meetings was, how should I put it, not very energized. They weren't events where the team would eagerly try to find the best possible solution to the problem at hand. For example, I noticed that some people were so bored and frustrated that every once in a while they would just ignore the discussion and spend time on Facebook or similar.
- One of the Scrum Masters' tasks was to print out the user stories and the tasks from the electronic tool. During the sprint the developers of course noticed new things to do but you couldn't see that on the Scrum boards since nobody wanted to write new tasks into the tool and print them out. It was thus difficult to follow the actual progress during the sprint.
- Even though the planning was very detailed, the teams weren't able to finish the user stories during the sprint. One team finished half of its stories completely while the other finished none. The teams created burndown charts based on tasks and their hour estimates. This meant that if they had 80% of the original tasks done, they had a pretty “successful” sprint. It didn't matter if the initial tasks were irrelevant or if none of the stories were completely done.
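The gap between task-based and story-based progress can be shown with a small sketch. The story names, task hours, and done flags below are hypothetical, chosen only to reproduce the "80% of the tasks done, no story finished" situation; they are not data from the team.

```python
# Hypothetical sprint data: each story maps to a list of (estimated_hours, done) tasks.
sprint = {
    "Story A": [(8, True), (8, True), (4, False)],
    "Story B": [(6, True), (10, True), (4, False)],
}


def task_hours_progress(stories):
    """Fraction of estimated task hours marked done (the task-based burndown view)."""
    tasks = [task for story_tasks in stories.values() for task in story_tasks]
    total = sum(hours for hours, _ in tasks)
    done = sum(hours for hours, is_done in tasks if is_done)
    return done / total if total else 0.0


def finished_stories(stories):
    """Names of stories whose tasks are all done (the story-based view)."""
    return [
        name
        for name, tasks in stories.items()
        if tasks and all(is_done for _, is_done in tasks)
    ]


print(f"Task-hour progress: {task_hours_progress(sprint):.0%}")
print(f"Finished stories: {len(finished_stories(sprint))}")
```

With this made-up data the task-hour measure looks healthy even though no story is complete, which is the effect described in the last bullet.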
Transition from hour estimates to story points
The next sprint planning was quite different from the previous ones. We stopped doing hour estimates. We threw away the electronic tool. We stepped away from the oppressive meeting rooms and used the team space instead. We didn't try to do all of the work with the whole team but instead did some of it in groups of 2-3 people. And although we had printed the user stories, we wrote the tasks by hand.
First we checked the product backlog, picked the top four user stories, and discussed them briefly all together. Then we split the team into four small groups, and each group was responsible for writing the tasks for one story. As a detail, I remember someone suggesting that we write a couple of tasks together so that everyone would see what it is like to write them and how to pick them out of the discussion. This was interesting because I realized afterwards how the “Scrum Master uses the tool” approach had made them passive in this sense too. After 15 minutes or so we gathered together and each group explained what they had done. Others made comments and asked questions. Based on these the team fine-tuned the tasks.
The same was repeated until we finally had about ten stories planned. The only thing missing was the estimates. I asked the team which of the stories was the smallest. It was easy to find, and that story got one story point. Then I took a random story and asked if it was the same size and, if not, how many times bigger. That way we got story point estimates for each of the stories.
The team had used story points before as well, but they were derived from hours with some formula that I don't recall. Since one story point now had a new meaning, we didn't have comparable data from the previous sprints. Instead I asked the team: do you think that you can completely finish all the stories during the sprint? Although they were not very confident, they decided to commit to all of them. So we were done. We had spent about three hours, went for lunch, and started writing some code.
Story points era
One of the changes we made was that we stopped drawing burndown charts based on tasks. Instead, we used completely finished stories. Below you can see how it looked in the new sprint #1.
This was something I had witnessed
before. It goes like this: In the beginning everybody can choose what
they start to work on. Since it is the most efficient way (right?),
almost everyone picks a story of their own. In the middle of the sprint none
of the stories are completely done. At the end of the sprint magic may
or may not happen. In this particular case they got pretty close but from an
earlier team I remember how there were five developers, five user
stories, all of the stories work in progress, and only one of them
completely finished on the last day of the sprint.
So during the first sprints we had a lot more to
improve than just making the sprint planning more effective
and efficient. One thing was to start working more in pairs or small
groups. Another important thing was that the developers tried to get something for the tester sooner instead of waiting for the whole
story to be coded. This way the user stories were ready sooner. It
also made the tester happier since he didn't have to wait until the end
of the sprint to get something new for testing.
However, that wasn't enough. The team
wasn't able to reach their goal during the first couple of sprints.
At the end of one sprint planning one of the team members asked how
many points the team had completed in the previous sprint. I said about
30. Then he asked the team: If we have managed to do 30, why
should we commit to 40 again? A good question, I would say. So they
decided to drop a couple of stories.
Little by little the team learned to
commit to a reasonable amount of work and also get the work
completely done in the sprint. After 2-3 months the charts
started to look like this (we changed from burndown to burnup at some
stage):
An important thing that the team learned was that if they commit to stories that are too big, there is a high risk that they won't be able to finish them. The team created a rule that if a user story is estimated to be more than five points, they have to split it into smaller pieces. I believe this was a crucial lesson towards the next step.
S/M/L estimating
The duration of a typical planning session had dropped from 5-7 hours to 2 hours or even less. The team was able to finish the sprint goal almost every time. But I still felt that we could do even better.
I remember that sometimes we were spending
too much energy discussing whether a story was one or two points. I
even remember a case when time was spent arguing whether a story
was zero points or one point.
We also discussed if it made sense to estimate bugs and include finished bugs in the burnup chart. It felt like cheating: what if you finish a 3-point story in sprint n, find three bugs in sprint n+1, and fix 1+1+1 points in sprint n+2? From the commitment perspective (how much we'll be able to do) it made sense but from the value perspective it didn't.
There were also situations where we
couldn't know beforehand whether we would be able to start working on a
certain story since it was blocked by an external party. Or we didn't
know exactly what we needed to do since we first needed to find that out by
doing another story. However, since those were important tasks that
should be done if possible, we reserved space for them in the
sprint backlog: “These are the stories we have selected and besides them we
have 3 points for these unknown stories.”
Since all of that felt kind of like
waste, I proposed the next step for the team. Let's drop the story
points and instead use sizes S, M, and L. S means 1-3 old points, M
means 5, and L is bigger than that. If a story was S, it required no further discussion about its size. If it was M, it was a warning that further
discussion might be needed - can we really complete the story or could we perhaps split it?
If it was L, we had to split it. The sprint commitment was made based
on the gut feeling using the question: from 1 to 5, how confident are
you that we will be able to complete all the stories we have chosen?
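The size rule above can be written down as a small sketch. The function and dictionary names are hypothetical, and because the text only names 5 as M, treating a 4-point story as M is an assumption made here for completeness.

```python
def size_bucket(old_points):
    """Map an old story point estimate to S, M, or L."""
    if old_points <= 3:
        return "S"
    if old_points <= 5:
        # Assumption: the text only names 5 as M; 4 is treated as M as well.
        return "M"
    return "L"


# What each size meant for the planning discussion, as described in the text.
NEXT_STEP = {
    "S": "no further discussion about size",
    "M": "discuss: can we really complete it, or should we split it?",
    "L": "split it",
}

for points in (1, 3, 5, 8):
    bucket = size_bucket(points)
    print(points, bucket, NEXT_STEP[bucket])
```

The point of the coarse buckets is that only the M and L cases trigger any conversation at all, so the "one or two points" debates disappear.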
An interesting thing was that we never
actually used those sizes. The team had learned to split stories so
small that all of them were of size S. At that time our typical process was such that we had
enough stories on the whiteboard waiting for the next sprint. We spent
10 minutes on them on the last day of the sprint. We started
the next sprint with about an hour-long sprint planning meeting where we made
sure that the whole team knew what we were going to do and checked if
there was something important that was missing from the backlog. The
developers wrote the tasks when they picked a story and rewrote them whenever needed. It felt like we were getting closer and closer to
a nice flow.
#NoEstimates
The team I was in decided to take the next step towards #NoEstimates, although at that time I hadn't heard of the term. We decided not to have sprints anymore but instead to choose the next most important thing every time. Of course this meant that we tried to keep the amount of work in progress as low as possible, although we didn't have explicit WIP limits written on our board. It was important to have stories as small as possible, but we didn't spend any time on estimating them (well, intuitively perhaps). We just considered whether the story made sense and whether we should and could split it. Sometimes we noticed during development that it made sense to split the story, and then we just wrote a new story.
Instead of sprint plannings we started
to have weekly meetings with all the relevant people from this business area in the company. That included, of
course, the development team and the so-called business people. We
didn't have a Product Owner anymore since there was no need for one.
In the weekly meetings we as a group talked about the big picture,
checked what was going on, and decided together what we should do next. We
used another whiteboard that showed the work at a higher level than
the development team's board.
Instead of calculating velocity based on story points we started to count finished stories per week. Below you
can see how our throughput statistics looked during the first 20
weeks. Notice especially the last eleven weeks: every week 2 or 3
finished stories. When the throughput is so stable, why would you
need any size estimates?
It was week 20 or so when we realized
that we needed to do a major refactoring in order to meet a certain
important business need. It was the first time in this new team that
we needed to do estimation of some kind. Our approach was the
following: Try to understand what needs to be done. Split the work
into user stories or similar. Count the stories. Use the throughput statistics
to forecast the probability of having this done before date
X, or the date by which all of the stories would be done with decent certainty.
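One common way to turn throughput statistics into such a forecast is a Monte Carlo simulation that resamples past weeks. The sketch below is only an illustration of that idea, not necessarily the calculation we used at the time. The weekly numbers are hypothetical (eleven weeks of 2 or 3 finished stories), and the function names are made up for this example.

```python
import math
import random


def simulate_weeks(weekly_throughput, stories_remaining, simulations=10_000, seed=1):
    """Resample past weekly throughput to simulate weeks needed; returns a sorted list."""
    if max(weekly_throughput) <= 0:
        raise ValueError("need at least one week with finished stories")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = []
    for _ in range(simulations):
        done, weeks = 0, 0
        while done < stories_remaining:
            done += rng.choice(weekly_throughput)  # pick one past week at random
            weeks += 1
        results.append(weeks)
    return sorted(results)


def probability_done_within(results, weeks):
    """Share of simulated futures that finish in at most `weeks` weeks."""
    return sum(1 for w in results if w <= weeks) / len(results)


def weeks_for_confidence(results, confidence):
    """Smallest number of weeks that covers `confidence` of the simulated futures."""
    index = max(0, math.ceil(confidence * len(results)) - 1)
    return results[index]


# Hypothetical history: finished stories per week over the last eleven weeks.
history = [2, 3, 2, 2, 3, 3, 2, 3, 2, 2, 3]
results = simulate_weeks(history, stories_remaining=12)

print("P(done within 5 weeks):", probability_done_within(results, 5))
print("Weeks needed for 85% confidence:", weeks_for_confidence(results, 0.85))
```

The only inputs are a story count and past throughput, so no size estimates are needed. The forecast is only as good as the assumption that the coming weeks resemble the sampled ones.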
We were a bit skeptical about how the
business owner would deal with our non-traditional approach of
forecasting when the project would be ready and in production instead
of estimating in man-days. Luckily we worked with a smart
guy and after asking a couple of questions he just said: ok, go
for it.
What really happened was that the
required changes were in production pretty much when we expected them
to be. However, we didn't finish all of the dozen stories we had planned initially. Instead we realized that half of them could be done
later and replaced those with other, more important tasks. The
throughput was as expected but the content was something different,
more valuable.
#estwaste and euros
Before the #NoEstimates hashtag I
remember that at least Vasco Duarte was using #estwaste in his
tweets. I like the word waste since it is an easy word to throw out on many occasions but let me provide you with some numbers that
should make the word more concrete in this case.
I guess you are now saying that I
forgot the value part of those sprint plannings or that I forgot the
cost of the one-hour weekly meeting. Well, I didn't.
First of all, the old sprint plannings produced very little or even negative value. Surely the developers
discussed the upcoming work there but I would say that the
discussion wasn't very useful. One of the purposes of the plannings
was to provide visibility for the Product Owner but it was hard to
see such an effect. And the use of the electronic tool caused problems during the
sprints since the team had difficulty acting on the new
information they learned while working. By negative value I mean
the drop in people's motivation.
In contrast, the weekly meetings really
produced value. They helped us to share information very efficiently and
make useful prioritization decisions. So the cost calculations above
really refer to the waste (=no value added), although they even
ignore things like opportunity cost, cost of delay, and so on.
Lessons learned
Let me choose the two most important
#NoEstimates lessons that I learned during this journey. The first one
is that at least in this kind of context the #NoEstimates approach is
perfectly valid and can bring huge improvements for the organization.
By “this kind of context” I mean ongoing product
development. Unfortunately I don't have experience in making
business decisions before starting to develop a large-scale product. I would love to read your post about that topic.
The second one is that working without estimates requires a certain maturity level from the team. If your team doesn't have that yet, you need to work hard (smart) in order to get there and enjoy the benefits of #NoEstimates. That is what we did and I recommend it to you as well.