Premature Abstraction

We often hear about "premature optimization" in the context of performance. We also hear about the virtues of "clean code", which means different things to different people: immutability, single responsibility, decoupling, and so on.

This last one, decoupling, says that two interacting components should know as little as possible about each other. This is typically achieved via an abstraction. A recent post on LinkedIn reminded me that some folks really love abstractions.

Many years ago, I heard a good rule of thumb, which stuck with me:

A well-designed abstraction needs at least three implementations.

The issue with an abstraction that is backed by only a single implementation is that it will almost inevitably be over-fitted. The abstraction will closely follow the shape of its one implementation and will therefore not be very generic, making it awkward to plug in alternative implementations under the hood. Leaky abstractions are one example of this phenomenon.
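To make the over-fitting concrete, here is a minimal Python sketch. Every name in it (User, OverfittedUserStore, UserStore, InMemoryUserStore, SqliteUserStore, rename) is hypothetical and invented for illustration. The first interface is shaped like the one SQL-backed implementation it was presumably extracted from; the second is shaped around what callers need, which is what lets two quite different backends fit behind it.

```python
from __future__ import annotations

import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: int
    name: str


# Over-fitted (hypothetical): the interface mirrors its single SQL-backed
# implementation, so any other backend would have to pretend to speak SQL.
class OverfittedUserStore(ABC):
    @abstractmethod
    def run_sql(self, query: str, params: tuple) -> list[tuple]: ...


# Less over-fitted (hypothetical): expressed in terms of what callers need.
class UserStore(ABC):
    @abstractmethod
    def get(self, user_id: int) -> User | None: ...

    @abstractmethod
    def save(self, user: User) -> None: ...


class InMemoryUserStore(UserStore):
    def __init__(self) -> None:
        self._users: dict[int, User] = {}

    def get(self, user_id: int) -> User | None:
        return self._users.get(user_id)

    def save(self, user: User) -> None:
        self._users[user.user_id] = user


class SqliteUserStore(UserStore):
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def get(self, user_id: int) -> User | None:
        row = self._conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return None if row is None else User(row[0], row[1])

    def save(self, user: User) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user.user_id, user.name),
        )
        self._conn.commit()


# Caller code depends only on UserStore, never on a particular backend.
def rename(store: UserStore, user_id: int, new_name: str) -> bool:
    user = store.get(user_id)
    if user is None:
        return False
    store.save(User(user.user_id, new_name))
    return True
```

With the second interface, rename works unchanged against either store. With the first, an in-memory backend would have to parse SQL strings just to satisfy the contract, which is the awkwardness described above.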

This does not necessarily mean we should never write abstractions with just one or two implementations, just that it might be a red flag. Do we really need it, or is it a premature abstraction? Do we need to think extra carefully to avoid over-fitting? What future implementations that don't exist yet might we want to plug in, and do those inform the design of the abstraction?
