This is my first blog post, welcome! I would like to share with you my
real-life experience related to #NoEstimates. I've been following the
discussion on Twitter this year but learned most of what I know by reading
blogs. So maybe you can learn something from my post as well. The
post tells the story of a team that was spending plenty of time on
estimation but gradually moved towards #NoEstimates. Let's see how
that happened.
Initial estimation method: hour estimates
The development team I joined as a Scrum Master had a long
history. However, during the previous year about half of the team had
changed. I was told that it had at least once been a great Scrum team, but
when talking with anyone on the team it was clear that they were
struggling. I don't know whether things had been different a year or two
earlier, but at that point they were far from well-performing.
During the first sprint with them I just observed how they worked.
Among many other things I noticed:
- The sprint planning meeting was neither very efficient nor effective.
The group of 12 people was divided into two teams and both
teams spent typically 5-7 hours on sprint planning.
- During the sprint planning meeting the Scrum Masters were using an
electronic tool that contained user stories with initial story point
estimates. The team was discussing the stories and the Scrum
Master wrote tasks based on the discussion. For each task the team provided an
hour estimate using the planning poker method.
- The atmosphere during the meetings was, how should I put it,
not very energized. They weren't events where the team would eagerly try to find the best possible solution to the problem at hand. I
noticed e.g. how some people were so bored and frustrated that every once in
a while they would just ignore the discussion and spend time on Facebook or
similar.
- One of the Scrum Masters' tasks was to print out the user
stories and the tasks from the electronic tool. During the sprint
the developers of course noticed new things to do but you couldn't see that
on the Scrum boards since nobody wanted to write new tasks into the tool and print them out. It
was thus difficult to follow the actual progress during the sprint.
- Even though the planning was very detailed, the teams weren't
able to finish the user stories during the sprint. One team
finished half of the stories completely while the other finished none.
The teams created burndown charts based on tasks and their hour estimates. This meant
that if they had 80% of the original tasks done, they had a
pretty “successful” sprint. It didn't matter if the initial
tasks were irrelevant or if none of the stories were completely
done.
The first sprint ended with a retrospective where many of the team
members pointed out the problems I listed above. The team decided to
try out something new.
Transition from hour estimates to story points
The next sprint planning was quite different from the previous
ones. We stopped doing hour estimates. We threw away the electronic
tool. We stepped away from the stuffy meeting rooms and used the
team space instead. We didn't try to do all of the work with the whole
team but instead did some of it in groups of 2-3 people.
And although we had printed the user stories, we wrote the tasks by
hand.
First we checked the product backlog and picked the top four user
stories and discussed them briefly all together. Then we split the team into
four small groups, and each group was responsible for providing the
tasks for one story. As a detail I remember how someone suggested
that we should write a couple of tasks together so that everyone would see
what it is like to write them, how to pick them from the discussion.
This was an interesting detail since I realized afterwards how the
“Scrum Master uses the tool” approach had made them passive also
in this sense. After 15 minutes or so we gathered together and
each group explained what they had done. Others made comments and
asked some questions. Based on these the team fine-tuned the tasks.
The same was repeated until finally we had about ten stories
planned. The only thing missing was the estimates. I asked
the team which one of the stories was the smallest. It was easy to
find and that story got one story point. Then I took a random
story and asked if it was the same size and if not, how many times
bigger. That way we got story point estimates for each of the stories.
The team had used story points before, but those had been derived
from hours via some formula that I don't recall. Since we now
had a new meaning for one story point, we didn't have comparable data
from the previous sprints. Instead I asked the team: do you think
that you can
completely finish all the stories during the sprint? Although they were not very
confident, they decided to commit to all of them. So we were done. We
had spent about three hours, went for lunch, and started writing
some code.
Story points era
One of the changes we made was that we
stopped drawing burndown charts based on tasks. Instead, we used
completely finished stories. Below you can see how it looked in the
new sprint #1.
This was something I had witnessed
before. It goes like this: In the beginning everybody can choose what
they start to work on. Since it is the most efficient way (right?),
almost everyone picks a story of their own. In the middle of the sprint none
of the stories are completely done. At the end of the sprint magic may
or may not happen. In this particular case they got pretty close but from an
earlier team I remember how there were five developers, five user
stories, all of the stories work in progress, and only one of them
completely finished on the last day of the sprint.
So during the first sprints we had a lot more to
improve than just make the sprint planning more effective
and efficient. One thing was to start working more in pairs or small
groups. Another important thing was that the developers tried to get something for the tester sooner instead of waiting for the whole
story to be coded. This way the user stories were ready sooner. It
also made the tester happier since he didn't have to wait until the end
of the sprint to get something new for testing.
However, that wasn't enough. The team
wasn't able to reach their goal during the first couple of sprints.
At the end of one sprint planning one of the team members asked how
many points the team had completed in the previous sprint. I said about
30. Then he asked the team: if we have managed to do 30, why
should we commit to 40 again? A good question, I would say. So they
decided to drop a couple of stories.
Little by little the team learned to
commit to a reasonable amount of work and also get the work
completely done in the sprint. After 2-3 months the charts
started to look like this (we changed from burndown to burnup at some
stage):
An important thing that the team
learned was that if they commit to stories that are too big, there is
a high risk that they won't be able to finish them. The team created
a rule that if a user story is estimated to be more than five points,
they have to split it into smaller pieces. I believe this was a
crucial lesson towards the next step.
S/M/L estimating
The duration of a typical planning session
had dropped from 5-7 hours to 2 hours or even less. The team was able
to finish the sprint goal almost every time. But I still felt that we
could do even better.
I remember that sometimes we were using
too much energy on discussing if a story was one or two points. I
even remember a case where time was spent arguing whether a story
was worth zero points or one.
We also discussed if it made sense to estimate bugs
and include finished bugs in the burnup chart. It felt like cheating:
what if you finish a 3-point story in sprint n, find three bugs in
sprint n+1, and fix 1+1+1 points in sprint n+2? From the commitment
perspective (how much we'll be able to do) it made sense but from the
value perspective it didn't.
There were also situations where we couldn't know beforehand whether we
would be able to start working on a certain story, since it was blocked
by an external party. Or we didn't
know exactly what we needed to do since we first needed to find that out by
doing another story. However, since those were important tasks that
should be done if possible, we reserved space for them in the
sprint backlog: “These are the stories we have selected and besides them we
have 3 points for these unknown stories.”
Since all of that felt kind of like
waste, I proposed the next step for the team. Let's drop the story
points and instead use sizes S, M, and L. S means 1-3 old points, M
means 5, and L is bigger than that. If a story was S, it required no further discussion about its size. If it was M, it was a warning that further
discussion might be needed - can we really complete the story or could we perhaps split it?
If it was L, we had to split it. The sprint commitment was made based
on the gut feeling using the question: from 1 to 5, how confident are
you that we will be able to complete all the stories we have chosen?
An interesting thing was that we never
actually used those sizes. The team had learned to split stories so
small that all of them were of size S. At that time our typical process was such that we had
enough stories on the whiteboard waiting for the next sprint. We spent
10 minutes on them on the last day of the sprint. We started
the next sprint with about an hour-long sprint planning meeting where we made
sure that the whole team knew what we were going to do and checked if
there was something important that was missing from the backlog. The
developers wrote the tasks when they picked a story and rewrote them whenever needed. It felt like we were getting closer and closer to
a nice flow.
#NoEstimates
At some stage we decided to split
the team into two. The reason for this was that even though there was
one code base, there were two clearly distinct businesses using it.
This caused a major challenge of how to prioritize stories. So one
component team became two feature teams, each business having its
own.
The team I was in decided to take the
next step towards #NoEstimates, although at that time I hadn't heard
of the term. We decided not to have sprints anymore but instead every
time choose the next most important thing. Of course this meant that
we tried to keep the amount of work in progress as low as possible,
although we didn't have explicit WIP limits written on our board. It
was important to have as small stories as possible but we didn't
spend any time on estimating them (well, intuitively perhaps). We
were just thinking if this story made sense and should and could we
split it. Sometimes we noticed during the development that it made
sense to split the story and then we just wrote a new story.
Instead of sprint plannings we started
to have weekly meetings with all the relevant people from this business
area in the company. That of course included the development team and the
so-called business people. We
didn't have a Product Owner anymore since there was no need for such.
In the weekly meetings we as a group talked about the big picture,
checked what was going on, and decided together what we should do next. We
used another whiteboard that operated at a higher level than the
development team's.
Instead of calculating velocity based on story points we started to count finished stories per week. Below you
can see how our throughput statistics looked during the first 20
weeks. Notice especially the last eleven weeks: every week 2 or 3
finished stories. When the throughput is so stable, why would you
need any size estimates?
It was around week 20 when we realized
that we needed to do a major refactoring in order to meet a certain
important business need. It was the first time in this new team when
we needed to do estimation of some kind. Our approach was the
following: Try to understand what needs to be done. Split the work
into user stories or similar. Count the stories. Use the statistics
to forecast the probability of having this done before date X,
or when all of the stories would be done with decent certainty.
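The forecasting step above can be sketched as a small Monte Carlo simulation. This is only an illustration of the idea: the weekly throughput history and the story count below are my assumed numbers (echoing the stable 2-3 finished stories per week and the "dozen stories" mentioned in the post), not the team's actual data.

```python
import random

# Assumed throughput history: finished stories per week (illustrative).
weekly_throughput = [2, 3, 2, 3, 2, 2, 3, 3, 2, 3, 2]

remaining_stories = 12   # the "dozen stories" planned for the refactoring
simulations = 10_000

def weeks_to_finish(history, remaining):
    """Resample past weeks at random until the remaining stories are done."""
    weeks = done = 0
    while done < remaining:
        done += random.choice(history)
        weeks += 1
    return weeks

results = sorted(weeks_to_finish(weekly_throughput, remaining_stories)
                 for _ in range(simulations))

# 85th percentile: a completion time we can quote with decent certainty.
p85 = results[int(0.85 * simulations)]
print(f"85% of simulations finished within {p85} weeks")
```

Reading a percentile off the simulated outcomes answers both questions at once: "when will all stories be done with decent certainty" and, by checking where a given date falls in the sorted results, "what is the probability of being done before date X".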
We were a bit skeptical about how the
business owner would deal with our non-traditional approach of
forecasting when the project would be ready and in production instead
of estimating in man-days. Fortunately we were working with a smart
guy, and after asking a couple of questions he just said: OK, go
for it.
What really happened was that the
required changes were in production pretty much when we expected them
to be. However, we didn't finish all of the dozen stories we had planned initially. Instead we realized that half of them could be done
later and replaced those with other, more important tasks. The
throughput was as expected but the content was something different,
more valuable.
#estwaste and euros
Before the #NoEstimates hashtag I
remember that at least Vasco Duarte was using #estwaste in his
tweets. I like the word waste since it is an easy word to throw
around, but let me provide you with some numbers that should make it
more concrete in this case.
If you read the whole story, you
noticed that we started with sprint planning sessions
that lasted about 6 hours and in the end we didn't have them at all. If we assume that there
are 22 sprints per year and the team has an average of ten members, it
means 1320 saved hours per year. I really don't know what the average
hourly cost of the team members was but let's pick two numbers: 50 or
100 EUR/hour. On a yearly level this means savings of 66,000 or 132,000
euros. Besides that you probably noticed that we didn't need the
Product Owner anymore. So you can add the cost of one manager above
that.
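The arithmetic above can be checked in a few lines, using only the figures already stated in the post (6-hour plannings, 22 sprints a year, ten team members, and the two illustrative hourly rates):

```python
# Rough waste calculation from the post's own numbers.
sprints_per_year = 22
hours_per_planning = 6    # hours spent by each team member per planning
team_size = 10            # average number of team members

saved_hours = sprints_per_year * hours_per_planning * team_size
print(f"Saved hours per year: {saved_hours}")  # 1320

for rate_eur in (50, 100):  # illustrative hourly costs
    print(f"At {rate_eur} EUR/hour: {saved_hours * rate_eur:,} EUR per year")
```

This reproduces the 1320 hours and the 66,000 / 132,000 EUR yearly figures; the Product Owner's cost would come on top of that.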
I guess you are now saying that I
forgot the value part of those sprint plannings or that I forgot the
cost of the one-hour weekly meeting. Well, I didn't.
First of all, the old sprint plannings produced very little or even
negative value. Sure, the developers discussed the upcoming work there,
but I would say that the discussion wasn't very useful. One of the
purposes of the plannings
was to provide visibility for the Product Owner but it was hard to
see such an effect. And the use of the electronic tool caused problems
during the sprints, since the team had difficulty acting on the new
information they learned while working. With negative value I refer
to the drop in people's motivation.
Instead, the weekly meetings really
produced value. They helped us to share information very efficiently and
make useful prioritization decisions. So the cost calculations above
really refer to the waste (=no value added), although they even
ignore things like opportunity cost, cost of delay, and so on.
Lessons learned
Let me choose the two most important
#NoEstimates lessons that I learned during this journey. The first one
is that at least in this kind of context the #NoEstimates approach is
perfectly valid and can bring huge improvements for the organization.
With “this kind of context” I mean an ongoing product
development. Unfortunately I don't have experience of making
business decisions before starting to develop a large-scale product. I would love to read your post about that topic.
The second one is that if you
start from the situation described above, you cannot just jump to
#NoEstimates. Instead, you have to find your own path and take small
and sometimes bigger steps towards it. Vasco Duarte claims that story points are harmful. I understand what he means
but that statement depends on the context as well. In this post I
described how we gradually moved from hour estimates to story
points, to S/M/L sizes, and finally to #NoEstimates. The story points
helped the team to learn how big stories cause problems and split the
stories into smaller ones. I feel it was a necessary step to take.
I think that working without
estimates requires that the team has a certain maturity level. If
your team doesn't have that yet, you need to work hard (smart) in order to
get there and enjoy the benefits of #NoEstimates. That is what we did
and I recommend it for you as well.