In the world of software development, the term “over-engineering” gets used more than I like. In my experience, “over-engineering” is a label used to express a negative opinion of code that has an undesirable amount of complexity. I agree with the sentiment, but very much disagree with the wording, as it does not capture the actual objection.
Let’s briefly look at a hypothetical situation we likely have all experienced: you are tasked with fixing a bug in a piece of code, let’s say the implementation of a class called Foo. The issue is that Foo has 8 constructors and has tons of state mutating methods which may not work if called out of order. Somewhere in one of those methods, some internal state becomes invalid and it’s difficult to track down the right fix because there are so many entry points into the class. It’s easy to imagine reading such code and thinking “Wow, this code is so over-engineered! A much simpler design would be easier to debug!” We’ve all been there, right?
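To make the hypothetical concrete, here is a minimal sketch of what such a Foo might look like. Everything in it is invented for illustration: the method names (open, set_name, write), the members, and the two constructors shown stand in for the eight.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical Foo: several ways in, and methods that only work in one order.
class Foo {
public:
    Foo() = default;
    explicit Foo(std::string name) : name_(std::move(name)) {}
    // ...imagine six more constructors, each initializing a different subset
    // of the members below.

    void open() { open_ = true; }
    void set_name(std::string name) { name_ = std::move(name); }

    // Only succeeds after open() has been called AND a name has been set,
    // but nothing in the interface tells the caller that.
    bool write(const std::string& data) {
        if (!open_ || name_.empty()) {
            return false;  // internal state is not valid for writing
        }
        bytes_written_ += data.size();
        return true;
    }

    std::size_t bytes_written() const { return bytes_written_; }

private:
    std::string name_;
    bool open_ = false;
    std::size_t bytes_written_ = 0;
};
```

With this sketch, a Foo built with the name constructor still fails on write until open is called, and a default-constructed Foo fails until both open and set_name have run. Multiply that by eight constructors and it is easy to see why the bug is hard to find.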
So we should agree to call it “over-engineered”? Not so fast. While simpler code is indeed easier to debug, there is a crucial piece of information which has been omitted: the code’s functional requirements. This is too often an aspect of software engineering that goes ignored, resulting in bad code in both extremes.
Before moving on, it is important to identify a common definition for what it means to “engineer” a solution. I am using the definition (straight from Dictionary.com) of “[verb]
So why are the “methods of engineering” so important to consider? It’s simple: if you do not define the problem you are trying to solve, you’ll never implement a proper solution. In other words, by definition you cannot solve an undefined problem at all, let alone solve (or “engineer”) it well. Every formalized method of software engineering (Agile, XP, waterfall, etc.) shares some form of requirements gathering or feature selection for a given development period. The only thing they disagree on is the order in which some actions happen or how long they last, such as tests being written before the implementation or development being done in “sprints” instead of all at once.
With this in mind, to “engineer” a solution is to come up with a design (and implementation) that meets a set of requirements during a development cycle. Tracking with me so far?
This leads me to the view that to “over-engineer” something is to perform better than required on one metric of success (cost/effort/time spent, performance, or functionality) without sacrificing another. From this perspective, “over-engineering” would be considered a positive outcome. Example statements at the end of a project would sound something like:
- “In the time allocated for implementing Foo, the implementation ended up running faster than the original performance requirement.”
- “The performance and functionality of Foo were implemented faster than anticipated.”
- “Without having to take any additional development time or have any performance hit, Foo was implemented with additional functionality beyond what was originally required.”
Perhaps the last one hits the closest to the problematic definition of “over-engineering”, but I argue that all three of those statements mark an implementation that delivered beyond its requirements, making the solution measure better than others in accordance with common “methods of [software] engineering”.
Contrast this with the common use of “over-engineering” as a descriptor: code described this way was often implemented without any software engineering methodology to begin with. In other words, “over-engineered” code was commonly implemented:
- …with no aforementioned requirements
- …with no heuristics for comparing design choices
- …with no context for complexity of implementation
- …as a thought experiment or toy with no intention for production use
I believe most code that ends up this way should instead be called “over-complicated” or “over-built”, as this captures the sentiment: we look at an implementation and can’t make heads or tails of it because it is more complex than is necessary, given the requirements.
Perhaps this makes more sense when looking at an example in another domain. Take the problem “build a vehicle to go down the street to the store and buy a loaf of bread for the next 10 years”. What is the right solution? My mind instantly thinks of various car makes I could just go buy, types of buggies I could construct based on some things I’ve seen on YouTube, or simply buying some parts for a bicycle. Interestingly, any of those ideas would require me to plan some aspect of the process. Some require more planning than others, but all of them call for some deliberation before getting started, depending on the type of solution being targeted.
When writing code, bad designs are the ones made by someone who (unfortunately) starts writing code right after seeing “build a vehicle” and stops reading the requirements at that point. That’s the equivalent of walking into the garage, seeing a few planks of wood, grabbing a saw, and starting to cut parts for the first design that popped into my head.
Fortunately, with code we don’t pay a big penalty for throwing out stuff we no longer need (i.e. we essentially have no materials cost), but the two thought processes are not too far off. In this little example, an “over-engineered” vehicle may be one that lasts 20 years (!), or was cheaper than the original project budget (!). I’m fairly sure that if we saw a wooden buggy made from thousands of tiny blocks of wood nailed together, falling apart after a year, nobody would think “someone really over-engineered that buggy”. Or if I went out and bought a Ferrari, it would be because I want to have a super-car and not because I determined it would be the best solution for going to the store.
This happens way too often with C++. One of the strengths of C++ is that it has lots of features to work with, enabling multiple programming paradigms all in the same language. It is a very powerful language, but with great power comes great responsibility. Features are introduced into the language because they solve a particular problem: this is something that I’ve found many C++ dissenters fail to understand fully. The great difference in opinions about certain C++ features comes down to the different sets of requirements we have for the problems we are solving.
A simple example in C++ is run-time polymorphism (virtual functions). If I have an implementation with a tight inner loop that repeatedly calls a virtual function, and I have measured that too much of my run-time is spent in that loop, then it makes sense to hand-optimize it by using templates to get compile-time polymorphism and remove the virtual function call overhead in that spot. The alternative solution does introduce a certain amount of code complexity, which in that case may be a worthy trade-off to make.
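As a rough sketch of that trade-off, the two approaches might look like the following. The names here (Op, Doubler, DoublerFn, sum_virtual, sum_static) are made up for this illustration, and whether the template version is actually faster depends on the compiler and the workload, so it still has to be measured.

```cpp
#include <cassert>
#include <vector>

// Run-time polymorphism: the operation is selected through a virtual call.
struct Op {
    virtual ~Op() = default;
    virtual double apply(double x) const = 0;
};

struct Doubler final : Op {
    double apply(double x) const override { return 2.0 * x; }
};

// The inner loop makes an indirect call per element, unless the compiler
// manages to devirtualize it.
inline double sum_virtual(const Op& op, const std::vector<double>& xs) {
    double total = 0.0;
    for (double x : xs) {
        total += op.apply(x);
    }
    return total;
}

// Compile-time polymorphism: the operation is a template parameter, so the
// call is resolved statically and can be inlined into the loop.
struct DoublerFn {
    double operator()(double x) const { return 2.0 * x; }
};

template <typename F>
double sum_static(F op, const std::vector<double>& xs) {
    double total = 0.0;
    for (double x : xs) {
        total += op(x);
    }
    return total;
}
```

Both functions compute the same result for the same input. The cost of the template version is that sum_static now has to be visible wherever it is instantiated (typically in a header), each distinct F produces its own copy of the loop, and the behavior can no longer be swapped at run time.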
Does that mean I should always default to using templates whenever I need to select different behaviors for a particular method or function? No! You cannot make an assertion about the efficacy of a solution without the context of a problem and a requirement on a solution to that problem.
So where does this leave the term “over-engineered”? I think the best way to characterize the sentiment is to use a term which captures the disapproval of code that is more complicated than necessary. “Over-complicated”, “bloated”, “over-generalized”, and “overly verbose” are just some terms that fit better than “over-engineered”. If you meet requirements better than anticipated, it’s probably good code, not bad code. And remember: you can’t measure the quality of an implementation without knowing what problem it is trying to solve and what requirements the solution must satisfy.
Therefore, the next time you are looking at “over-engineered” code, take a moment to think about it and realize that it is actually “under-engineered”.