Why We Need Buffers

Allan X
5 min read · May 22, 2017

I think everyone knows Murphy’s Law:

Everything that can go wrong, will go wrong.

I agree, though, that in reality it isn’t so absolute. But unless we recognize and proactively mitigate the factors that make things go wrong, the probability of failure is very high.

So why am I bringing this up now? Because today, I witnessed something blow up, once again!

More specifically, almost every time there’s a release, something goes wrong.

Why?

One reason: those directly responsible, who should’ve foreseen this and prevented it, didn’t. They’re not the ones feeling the pain and dealing with the consequences, so they have no incentive or motivation to improve.

The other reason, and the focus of this piece, is that there are no buffers.

A Simple Example

If it takes you 30 minutes to get to an appointment and you leave home 30 minutes before the scheduled time, what are the chances you will be late?

Well, pretty high if we’re just talking about a few minutes… but once in a while you’re going to hit a traffic jam and be late by 15 minutes… and on rare occasions, your car completely breaks down and you have to cancel.

What we are interested in is the expected cost:

Sum(Probability(x) * Cost(x))

In summary,

  • The possible costs are: trivial, bad, very bad
  • But the probabilities of the scenarios are very different. Something like 80%, 19%, 1%

In the above case, it’s probably advisable to leave just 40 minutes before the appointment.

But what if the first 2 probabilities are flipped? Then you’d probably want to leave at least 60 minutes before the appointment.
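To make the expected cost formula concrete, here is a minimal sketch in Python. The probabilities are the ones from the example; the cost numbers (1, 10, 100) are made up purely for illustration, and the function and variable names are my own.

```python
# Hypothetical costs in arbitrary units. The example above only calls them
# "trivial", "bad", and "very bad", so these numbers are an assumption.
COSTS = {"trivial": 1, "bad": 10, "very bad": 100}


def expected_cost(probabilities, costs=COSTS):
    """Return Sum(Probability(x) * Cost(x)) over all scenarios x."""
    if abs(sum(probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * costs[name] for name, p in probabilities.items())


# Probabilities from the example: 80%, 19%, 1%.
original = {"trivial": 0.80, "bad": 0.19, "very bad": 0.01}
# The "flipped" case: the first two probabilities swapped.
flipped = {"trivial": 0.19, "bad": 0.80, "very bad": 0.01}

print(f"original: {expected_cost(original):.2f}")
print(f"flipped:  {expected_cost(flipped):.2f}")
```

With these assumed costs, flipping the first two probabilities raises the expected cost from about 3.7 to about 9.2, which is why the flipped case calls for a much bigger buffer.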

That’s common sense, right? Well, what about picking a time when you’re pretty sure there will be no traffic jams, and making sure your car is well maintained?

For this case it’s sort of overkill, but it leads into my point:

You can take proactive steps to change the probabilities and lower the chances that the expensive cases will occur. You can lower the expected cost.

And what is one way to do this? Allocating enough time to make sure things are done correctly and checked.

A Real Problem

So let’s get into the real problem. I work as a developer on a mid-sized team that follows Agile practices.

In short, what I see daily is that we accept too much work and the requirements keep changing.

The Work

Every sprint, everyone is given several tasks to complete. We have approximately 3 weeks/sprint, and a task includes design, coding, and testing.

So in terms of time needed, usually:

  • 80%: coding and fixing bugs
  • 15%: testing, including setup overhead
  • 5%: thinking about the approach/design (this percentage is actually a bit generous)

This, I think, already starts to show a problem: there’s not much time spent thinking.

Consider 5 changes per sprint: (5 days * 3 weeks) / 5 changes * 7 hours per day = 21 hours per change…

So in the ideal case, a person spends:

  • 17 hours coding
  • 3 hours testing
  • 1 hour to design
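The arithmetic above can be sketched in a few lines of Python. All the numbers come from the text; the constant and function names are just my own labels.

```python
DAYS_PER_WEEK = 5
WEEKS_PER_SPRINT = 3
CHANGES_PER_SPRINT = 5
HOURS_PER_DAY = 7

# Share of time per activity, as listed above.
SHARES = {"coding": 0.80, "testing": 0.15, "design": 0.05}


def hours_per_change():
    """Working hours available for a single change in one sprint."""
    working_days = DAYS_PER_WEEK * WEEKS_PER_SPRINT
    return working_days / CHANGES_PER_SPRINT * HOURS_PER_DAY


def breakdown(total_hours, shares=SHARES):
    """Split the total hours across activities according to their share."""
    return {activity: total_hours * share for activity, share in shares.items()}


total = hours_per_change()
print(f"total: {total:.0f} h per change")
for activity, hours in breakdown(total).items():
    print(f"{activity}: {hours:.2f} h (about {round(hours)} h)")
```

The unrounded split is 16.8, 3.15, and 1.05 hours, which rounds to the 17, 3, and 1 hours listed above.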

Vague and Changing Requirements

We’re off to a great start, aren’t we?

Well, let’s add on unclear requirements that keep changing until the last minute, literally… So what happens now to the coding, testing, and design?

  • Mostly OK, may need some ugly hacks to make it in time for the release
  • Well I tested it before… after the last change, I spot checked it; seems to work so it’s good to go!
  • What design? All these unplanned changes sort of messed this up

But wait!

We actually don’t have 21 hours either. We need to deal with production issues on an almost daily basis! There’s always some sort of unplanned work, in addition to constant distractions that pull me out of my groove!

That 1 hour we had for design just became 10 minutes.

So what does this add up to?

Poorly designed and inefficient code that is hard to change, with a high probability of bugs. And even though it’s hard to change, changes are needed… at the last minute… so we have to resort to hacks. Hacks that are probably not well tested, given the lack of time.

Basically everything is rushed and extremely fragile… if not already broken.

What could possibly go wrong?

What do you think the probabilities would be for a release that:

  1. is completely smooth
  2. has some minor problems
  3. is a total catastrophe

And yet somehow managers don’t see this.

They expect everyone to give 100% all the time and perform magic under tight deadlines, high pressure, and constantly changing requirements.

So work overload and vague requirements already make producing quality code pretty much impossible.

And based on their deadlines, it seems fixing problems doesn’t even need time at all…

Why do we have all these issues anyway?

In addition, they keep bending the rules/making exceptions. In particular, they:

  • ignore hard cut-off dates with their last, last-minute changes
  • have Sprint planning meetings that don’t involve any developers; it’s a managers-only affair… WTF? Don’t even get me started on this, but you know, in Agile, PMs should only set the backlog; developers decide what to actually do in a Sprint

Why do you think these rules were made in the first place?

Everything Comes Back to Having a Buffer

The answers to the 2 questions are:

  • everything is rushed because there is no buffer and therefore quality drops like a rock
  • to ensure there is enough buffer

In short, buffers prevent the Vicious Cycle.

http://theagileexecutive.com/2010/09/20/how-to-break-the-vicious-cycle-of-technical-debt/

Given enough time (used properly; that’s a topic for another day), you reduce the probability of mistakes and leave more room for proper planning and thinking.

Epilogue

Why doesn’t management understand this?!?!

It seems, at least in organizations that don’t see tech as a core competency, managers are optimists… but not realists. They just can’t seem to learn…

They aren’t feeling the pain and they don’t take input from the people who do. Admittedly, in certain organizations, a lot of people don’t even recognize they have pain. But that again is for another day. I will say though, that’s one reason why technology companies have technical interviews.

So whenever there is a problem, they tell the developers to fix it (see the cycle?). And then they just hope it doesn’t happen again. They don’t really take any steps to prevent this stuff from happening again. There’s no post-mortem or introspection…

and we all suffer for it… it will keep getting worse and worse.

It’s sort of ironic that I am even writing this. I’M NOT A MANAGER!

But I’ve felt the pain and wanted to do something about it.

I am also an avid reader, and several years ago I came across The Phoenix Project.

This book pretty much covers all the points I’ve mentioned in this post. It made me realize and understand the pain I was feeling and gave me the perspective to tackle it. Being a problem solver also helped, more on that in another post.

I’ve actually recommended it to many people, including my managers. They say they’ve read it, love it, and completely understand it, but still… nothing has changed…
