The harsh reality of a domain breakthrough

One of my favorite stories from the Blue Book is the description of how changing the model of syndicated loans greatly simplified the code and resolved many bugs, not to mention that it made communication with business experts way easier.

I’ve read this story many times and even though in my experience it’s been quite accurate and helpful, every time I can’t help thinking that one aspect that deserves more coverage is the harsh emotional reality of such a breakthrough.

So how do you know you came across something worth calling a breakthrough?

I think there are no strict rules, but I’ve noticed the following pattern in my experience:

  • Over time I have a growing sense that something is not quite right in a particular area of code. I find it really hard to get on the same page with business experts, the code is getting more and more complex, the bugs start piling up… But there’s no easy solution in sight for quite a while and I can’t put my finger on what exactly is wrong.
  • Then one moment it hits me. I look back, connect all the dots and suddenly can reduce the whole mess to something trivial like: “we should represent the loan as a pie chart”, “we only use these 5 types of promotions, we don’t need this uber-flexible form with 10 checkboxes”, “we should track accounting period for each co-author of the book, rather than the book”.
  • I verify the new assumptions. I run a few scenarios in my head or rewrite a small piece of code and confirm that this change will likely resolve those few nasty bugs and make code simpler. Suddenly it seems like adding this long-awaited feature will become trivial, even though until this point it was estimated to be months of work.
  • Finally, I double-check with the business experts and they look at me like I’m an alien. Of course that’s right, they’ve been telling me this for the last 6 months, why do I never listen? Is that my “big discovery”? Are you kidding me?

And this is where the difficulty lies. The initial excitement of the discovery may become clouded by the fact that, trivial as it sounds, in code it’s quite a large change. It may be so fundamental that you’ll also need to rewrite your tests, because the model in code is significantly misaligned with how the expert sees it and that has ripple effects throughout various parts of the code. You may come to the conclusion that it’ll be easier to “run in red” for a while, instead of struggling to make the tests pass. In a complex codebase, this may be a scary thought. And good luck explaining this to your team of software craftsmen…

Eric hints at that when he says that upon this discovery we have two choices, both hard: either bite the bullet and change the code (it will probably take closer to weeks than hours and may be hard to estimate) or not take the risk and suffer the consequences, at least for a little longer.

I’ve seen it play out both ways and I think both choices have merit, depending on the circumstances. For example, if the problem is in a rarely used feature, or the business consequences of mistakes are very low compared to the cost of fixing them, or there’s a bunch of things with a way bigger impact that need to be done sooner, then it may make sense to delay the change or just decide to live with a suboptimal model forever.

That was the case with the promotions code I mentioned. The calculations were extremely complex and we knew there were mistakes in some scenarios when promotions overlapped, but on the rare occasion when a customer noticed the problem and complained, the helpdesk issued a small voucher for their next purchase.

On the other hand, in the case of two co-authors having separate accounting periods, the issue was critical because any inaccuracy in royalty calculations would undermine trust that the system works correctly. It didn’t matter if the mistake was tiny in the monetary sense. Even if it was just a couple of dollars, the consequences for reputation would be huge.
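
To make the co-author example a bit more concrete, here is a minimal sketch of what “track the accounting period for each co-author, rather than the book” could look like in code. It is not the model from the actual project: every name in it (`AccountingPeriod`, `CoAuthor`, `Sale`, `Book`, `royalties_due`, `royalty_share`) is hypothetical, and it assumes a deliberately simplified rule where a sale counts for a co-author only if it falls inside that co-author’s own period.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import Dict, List


@dataclass(frozen=True)
class AccountingPeriod:
    start: date
    end: date  # inclusive

    def contains(self, day: date) -> bool:
        return self.start <= day <= self.end


@dataclass(frozen=True)
class CoAuthor:
    name: str
    royalty_share: Decimal    # hypothetical: fraction of the book's royalties
    period: AccountingPeriod  # the key change: owned by the co-author, not the book


@dataclass(frozen=True)
class Sale:
    day: date
    royalty: Decimal  # total royalty the book earned on this sale


@dataclass
class Book:
    title: str
    co_authors: List[CoAuthor] = field(default_factory=list)

    def royalties_due(self, sales: List[Sale]) -> Dict[str, Decimal]:
        """Royalties per co-author, counting only sales within that co-author's own period."""
        return {
            author.name: sum(
                (s.royalty * author.royalty_share
                 for s in sales if author.period.contains(s.day)),
                Decimal("0"),
            )
            for author in self.co_authors
        }


# Made-up data: two co-authors of the same book with different periods.
book = Book("Example Book", [
    CoAuthor("Alice", Decimal("0.5"), AccountingPeriod(date(2020, 1, 1), date(2020, 6, 30))),
    CoAuthor("Bob", Decimal("0.5"), AccountingPeriod(date(2020, 4, 1), date(2020, 9, 30))),
])
sales = [
    Sale(date(2020, 2, 15), Decimal("10.00")),  # only in Alice's period
    Sale(date(2020, 5, 1), Decimal("20.00")),   # in both periods
    Sale(date(2020, 8, 1), Decimal("30.00")),   # only in Bob's period
]
due = book.royalties_due(sales)
```

With a single period stored on the book, the two co-authors above could not be represented at all, which is the kind of misalignment that stays invisible until somebody states it out loud.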

A few important lessons I’ve learned in practice:

  • The difficulty of a code change related to a domain breakthrough is not caused by a lack of code quality; it’s an orthogonal aspect. It’s really important to make this clear at the very beginning to avoid misunderstandings. You can have the cleanest and best-tested code in the world, but if the model is misaligned with the business reality then breaking changes or rewrites of some modules are unavoidable. On the other hand, having high-quality code makes the change easier. In fact, I personally think it’s a prerequisite to actually doing anything with your insight.
  • Domain breakthroughs are expected and a good sign of getting familiar with the domain. I’m not sure if that makes them any easier to handle, but I don’t think any amount of up-front design or analysis can prevent such events. They simply show that both developers and business stakeholders are learning more about the system and its priorities, and are gradually coming to a better alignment.
  • The so-called “soft” aspects of a change are way more important than code. It’s really hard to avoid falling into the trap of looking for people “responsible” for the misalignment, because in hindsight the problem looks very obvious and business stakeholders may feel like they have been communicating it for a long time. It may take a conscious effort and a fair amount of explicit communication to focus everybody on moving forward instead of debating how this could have been avoided. Most likely it couldn’t, because this is how you learn anything complex: by occasionally making mistakes.
  • This is a moment where you may need to use all the trust capital you’ve amassed with the business over time. If you’ve spent the last few weeks or months trying to tame weird bugs and exploding code complexity that were very hard to explain, people may be skeptical about the potential of finally addressing the problem with such a “trivial” change. If it was so obvious, why didn’t you do it months ago?
  • Talking about a breakthrough requires a lot of courage. It may be hard to even admit that the problem exists, it may feel embarrassing that you haven’t noticed it earlier, or you may unintentionally make your colleagues feel like you’re attacking their design decisions. If the relationships in the project are not based on trust, mutual respect, and egoless programming principles, then instead of the opportunity for improvement you wanted to discuss, you may end up with lots of misunderstandings.
  • Last but not least, the hardest part for me was that I wasn’t certain whether the proposed changes would deliver the results I hoped for. Timeboxing, some initial analysis to support an estimate, or a small POC were quite helpful in minimizing the risk, but the only way to know for sure was to make the change and verify it in practice. On the one hand, I didn’t want to promise the impossible; on the other, I needed some enthusiasm for the change to get a chance to make it. The solution for me was to be very clear and explicit about assumptions and expectations and to provide frequent updates about the progress. It turns out that in practice visibility (regular updates) is more important and helpful than predictability (accurate estimates and detailed plans) or having direct control over the process (micromanagement).

Those are a few observations regarding domain breakthroughs based on my experience. What are yours? Have you ever come across a domain breakthrough? Which way did you go: make the change or ignore it? What have you learned? Let me know in the comments, I’m curious.

 

 
