Killring

Rick Dillon's Weblog

Perils of Software Estimation

Software engineers often work on teams. The team writes software that has value. In order to plan when that value will be realized, it is useful to know when the software will be ready to ship. Failing that, it is useful to measure how the team is doing so that management can take action if the plan isn’t going to work out. Both of these endeavors involve estimating the complexity, time, or risk of various software features. This is harder than it sounds.

Project Scope

When projects are conceived, the ideas behind them are usually vague. Whether as complex as a self-driving car, or as simple as a dashboard app to keep clients in the loop, it’s necessary to first scope the project. What exactly will the project do? Maybe more importantly, what will it not do? Perhaps the self-driving car need only drive on highways when in automatic mode, avoiding the need to address automated driving on surface streets (with pedestrians and bicycles!). These scoping decisions may seem easy to make, but if the project spans weeks or months, the commitment to stick to the original project scope, despite changing circumstances, can be very hard to keep.

As soon as the project’s scope changes, estimates for how long things take have to change too. If any other products or features depended on this project’s timeline, they will have to be adjusted as well. Team dynamics will change, and optimism and sense of purpose can easily be replaced by annoyance and frustration. Commitments have value, even despite their cost.

But let’s suppose we get the project scope just right.

Project Requirements

Our next step is to look closely at the components of the project and what we need them to do. Should the car accommodate passengers taller than six feet? Should our client dashboard work in Firefox and Chrome, or IE 6 as well? Small requirements changes like these can easily lead the team to decide they need to start from scratch, reworking everything done between the start of the project and the requirements change. More often, the cost is smaller than that, but still substantial. Anticipating these requirements is difficult, especially in the face of new data and feedback from clients. Entire philosophies of software development, like agile, have been developed to help compensate for this dynamic.

Let’s suppose that we got the requirements correct the first time, and all the data and feedback we gather during development only reinforces the requirements already established.

Project Decomposition

We now need to decompose the project into a set of features, or milestones. This requires a broad and deep understanding of both the problem space and the technologies that are relevant to it. If that understanding is not present, risk increases, since it becomes very hard to tell what direction the team might settle on. Even with that understanding, whole aspects of the project can get missed. Whether it’s a feature to get two systems to talk to each other correctly when we assumed it would ‘just work’, or that we didn’t realize we’d need an extra fleet of machines to do distributed compilation to turn around the nightly builds in time for QA, new, time-consuming aspects of the project often emerge just in time to make a feature or milestone late.

For the moment, though, let’s assume that we decomposed the project perfectly, and that we didn’t include any unneeded features, and that we didn’t miss any, either.

Team Dynamics

We now turn our attention to the team that will be working on the project. Have they worked together before? Are they familiar with the problem space? Are they familiar with the technologies being used to solve the problem? Are they all available for the duration of the project? A naive estimation model assumes engineers are interchangeable, but if we stick to small, autonomous teams, we might find that engineers play different roles, and the loss of a role disproportionately impacts the effectiveness of the team as a whole. This can be a single engineer that knows a lot about a particular technology, or a team member that’s very good at brokering information between other engineers that work in parallel. On a long enough timeline, however, people get sick, move teams, go on vacation, quit, and get hired. As the team changes, so does its performance in various ways, making estimates that much harder.

For purposes of our investigation into estimation, however, we’ll assume our team is stable, knows the problem space, and has expertise in the technologies we’ll use to solve the problem.

Feature Estimation

We now have a list of tasks for the team to complete. The first task is to put them in order. We can place the tasks that lay a foundation for collaborative development first, since we want to parallelize the work as early as possible. Alternatively, we can place the tasks that have the highest risk first, so we can discover pitfalls and adjust our plans early.
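To make the two orderings concrete, here is a minimal sketch that combines them: a task only becomes eligible once the tasks it depends on are done, and among eligible tasks the riskiest goes first. The task names, the 1 to 10 risk scores, and the `order_tasks` helper are all made up for illustration; this is one way to do it, not a recommendation of a particular tool.

```python
import heapq


def order_tasks(tasks):
    """Order tasks so dependencies come first and, among ready tasks, risk is highest first.

    `tasks` maps a task name to (risk, [names of tasks it depends on]).
    """
    unmet = {name: set(deps) for name, (_, deps) in tasks.items()}
    dependents = {name: [] for name in tasks}
    for name, deps in unmet.items():
        for dep in deps:
            dependents[dep].append(name)

    # Min-heap keyed on negative risk, so the riskiest ready task pops first.
    ready = [(-tasks[name][0], name) for name, deps in unmet.items() if not deps]
    heapq.heapify(ready)

    ordered = []
    while ready:
        _, name = heapq.heappop(ready)
        ordered.append(name)
        for child in dependents[name]:
            unmet[child].discard(name)
            if not unmet[child]:
                heapq.heappush(ready, (-tasks[child][0], child))

    if len(ordered) != len(tasks):
        raise ValueError("tasks contain a dependency cycle")
    return ordered


# Hypothetical backlog: (risk score, dependencies).
backlog = {
    "api contract": (2, []),
    "video streaming": (9, ["api contract"]),
    "settings page": (1, ["api contract"]),
    "calendar sync": (7, []),
}
plan = order_tasks(backlog)
print(plan)
```

With this backlog, the risky but independent calendar work comes first, the contract unblocks the rest, and the low-risk settings page lands last.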

Once tasks are ordered, it’s very tempting to ask a simple question: ‘Here’s all the stuff we have to do. How long will it take?’ This is the moment when everything falls apart. Projects don’t get to this stage without managers already having negotiated for a budget in terms of time, people, and money. That means there are already expectations in place for how long the project should take. If those expectations didn’t exist, the project might not either. After all, no good business can succeed by funding a bunch of projects with no time or cost budget.

Unexpected Complexity

There are broadly two categories of complexity: intrinsic and incidental.

Some features are intrinsically more complex than they first appear. It’s useful to keep a log and notice patterns. For example, my own list of ‘simple’ features that always end up being more complex than they appear includes calendaring, displaying video over a network connection, chat functionality, distributed locking, and syncing data between two or more stores. There are thousands of these pitfalls, and they are often associated with edge cases that are not considered during the planning phases of a project. Sometimes, it’s a thoughtful engineer that runs across them as the edge cases in the code are being handled. Other times, the issue can be represented as an unanticipated state the application finds itself in that it must recover from.

Some features are incidentally complex. There’s nothing that should be difficult about them, but due to a requirement to support an older browser or device, or because of a decision made in legacy code to use a particular abstraction, the new feature will be much harder to implement than it seems like it should. You can always tell you’re dealing with incidental complexity when an engineer starts off by saying ‘If we were building it from scratch, I would…’.

No matter the source, though, there are always pockets of complexity that need to be quarantined, avoided, or removed, and any approach takes time.

Yak Shaving

A feature, taken alone, might not be a big deal to implement. But inevitably, from time to time, a feature drags the implementing developer down a rabbit hole of unrelated (but often necessary) maintenance work. One example that I recently saw led to a change of GeoJSON toolchains because of the recent enforcement of the ‘right hand rule’, which dictates in what order points of a multi-polygon must be specified in GeoJSON. The stored polygons didn’t conform to the rule, but the new version of the tooling did. Rather than tackle the actual feature, the engineer working the ticket found himself wrangling the toolchain to compensate for the unexpected change.
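For the curious, the rule itself is small. In the current GeoJSON specification (RFC 7946), a polygon’s exterior ring runs counterclockwise and any holes run clockwise. Below is a rough sketch of how stored polygons could be checked and rewound. It is not what the engineer in this story actually did; the function names are my own, and it treats longitude and latitude as flat x/y coordinates, which is only a reasonable approximation for small polygons that don’t cross the antimeridian.

```python
def signed_area(ring):
    """Shoelace formula on a closed ring of [x, y] positions.

    Positive means counterclockwise, negative means clockwise.
    Assumes the ring is closed (first position equals last), as GeoJSON requires.
    """
    total = 0.0
    for p, q in zip(ring, ring[1:]):
        total += p[0] * q[1] - q[0] * p[1]
    return total / 2.0


def rewind_polygon(rings):
    """Return the rings of one polygon with the exterior counterclockwise and holes clockwise."""
    fixed = []
    for index, ring in enumerate(rings):
        want_ccw = index == 0  # the first ring is the exterior
        is_ccw = signed_area(ring) > 0
        fixed.append(list(ring) if is_ccw == want_ccw else list(reversed(ring)))
    return fixed


# A unit square stored clockwise, which violates the rule.
clockwise_square = [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]]
rewound = rewind_polygon(clockwise_square)
print(signed_area(clockwise_square[0]), signed_area(rewound[0]))
```

A few lines of code, in other words, but finding out that you need them is where the afternoon goes.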

Complexity and yak shaving both have the capacity to derail the estimates for any given feature. But, as before, let’s assume they don’t, and that we assign exactly the right complexity points to each feature in our list.

Planning Fallacy

All of these factors roll up to one massive effect: the planning fallacy. The planning fallacy is essentially the observation that people are terrible at accurately estimating task completion. It holds equally well for individual tasks and group tasks, and the numbers are fascinating. The single most interesting finding in these studies is that, most of the time, people take longer to complete tasks than their most conservative estimate when starting out. The Wikipedia article is worth a read, but for a taste, consider that, when graduate students were asked to estimate the length of time an academic project would take, a mere 45% of them completed the project within their 99% probability estimate. Put another way: the majority of the time, things take radically longer than we expect.

Critical Chain

Why is it that we’re so bad at estimation? There are many possible reasons, but at the group level, I’ve found none to be as compelling as Eliyahu M. Goldratt’s Theory of Constraints as explained in his novel Critical Chain. The entire book is insightful, but one section in particular discusses how organizations undermine themselves in an effort to boost efficiency.

Consider a project (we’ll call it Project 1) that requires three teams to work together. We’ll call them Team A, Team B, and Team C, and let’s say Project 1 involves writing some software for a website. Team A has to write the backend logic, and make the API endpoints available. Team B has to write the front-end code to provide a nice user experience, and Team C has to design the A/B tests and analytics. While Team A is working on the API for Project 1, Team B and Team C are doing another project (say, Project 2). After all, we don’t want to pay them to stand around and wait! Let’s consider two scenarios.

First, a pessimistic scenario: Team A is not able to finish the API when they said they would. Team B and Team C are waiting for work, and pick up Project 3 in the meantime. When Team A does finish, Team B and Team C have the option to stop Project 3 and incur switching cost to start Project 1, or they can continue Project 3 and further delay Project 1. Either way, Project 1 is late because Team A took longer than expected.

Next, an optimistic scenario: Team A finishes the API for Project 1 early. Team B and Team C aren’t ready, though, since they are still working on Project 2. Again, we face a dilemma of switching costs. Do Team B and Team C start work on Project 1 early, dropping their work on Project 2 (leaving it incomplete), or do they finish Project 2 and begin Project 1 only once Project 2 is done?

Unless Project 1 is the most important project across all the teams, these dilemmas are most often solved by minimizing switching costs (again, efficiency). This means that every time a part of a project takes longer than expected (which we know from the planning fallacy is most of the time), the delays accumulate as the project proceeds. The flip side of this is that when a part of a project is earlier than expected, the extra time does not accumulate. This means that as a project suffers delays, there is a ratchet effect: late projects almost never get back on track.
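The ratchet is easy to see in a toy model. The sketch below is my own simplification, not something taken from Critical Chain: stages run in sequence, and each downstream team is assumed to be booked on other work until its planned start date, so a stage begins at whichever is later, its planned start or the moment the previous stage actually finishes. The function name and the numbers are invented for illustration.

```python
def project_finish(planned, actual):
    """Return the finish day of a chain of sequential stages.

    `planned` and `actual` are per-stage durations in days. A stage cannot
    start before its planned start (the team is busy elsewhere) or before
    the previous stage has actually finished.
    """
    planned_start = 0
    finish = 0
    for plan, real in zip(planned, actual):
        start = max(planned_start, finish)  # early handoffs are wasted, late ones propagate
        finish = start + real
        planned_start += plan
    return finish


plan = [10, 10, 10]                                  # three stages, 30 days planned
on_time = project_finish(plan, [10, 10, 10])
early_first = project_finish(plan, [7, 10, 10])      # first stage 3 days early
late_first = project_finish(plan, [13, 10, 10])      # first stage 3 days late
early_then_late = project_finish(plan, [7, 13, 10])  # 3 days early, then 3 days late
print(on_time, early_first, late_first, early_then_late)
```

Finishing the first stage three days early still yields a 30-day project, finishing it three days late yields 33, and an early stage followed by an equally late one also yields 33: the gain was spent waiting, so it was never available to offset the loss.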

What Now?

Here are my closing thoughts for how to mitigate some of these effects in software projects.

  1. Decompose projects into small tickets.
  2. Don’t assume the project will be worked consecutively: design tickets assuming they will be shuffled in with other projects’ tickets.
  3. Write software that is modular. Avoid side-effects. Small, testable components reduce complexity and enable parallel development.
  4. Establish contracts early to avoid pipelines. Get teams working in parallel, rather than in sequence.
  5. After contracts are established, tackle the riskiest part of the project first.
  6. If time estimates are required, multiply them by between 2 and 5, depending on how well understood the requirements are.
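As a rough illustration of the last point, here is one way the multiplier could be applied. The linear mapping from a ‘clarity’ score to a multiplier is purely my assumption for the sake of the example; the only part that comes from the advice above is the range of 2 to 5.

```python
def padded_estimate(raw_days, clarity):
    """Scale a raw estimate by a factor between 2 and 5.

    `clarity` is a hypothetical score from 0.0 (requirements barely
    understood) to 1.0 (requirements well understood). The linear
    interpolation between the two ends is an assumption, not a rule.
    """
    if not 0.0 <= clarity <= 1.0:
        raise ValueError("clarity must be between 0.0 and 1.0")
    multiplier = 5.0 - 3.0 * clarity
    return raw_days * multiplier


print(padded_estimate(10, 1.0))  # well understood: 10 days becomes 20
print(padded_estimate(10, 0.0))  # poorly understood: 10 days becomes 50
```

The exact curve matters far less than the habit of writing down how well the requirements are understood before quoting a number.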