Staying clean

There are lots of articles on the web about writing clean code. Searching Google for “writing clean code” produces more than 300 million hits. Even limiting the search to the past year still lists dozens of articles. It is safe to say that this is a popular topic. But what is clean code?

Clean code is, arguably, readable code. Robert Martin spends a chapter of his book “Clean Code: A Handbook of Agile Software Craftsmanship” surveying the opinions of influential (or maybe even seminal) developers on the matter. To describe clean code, these experienced programmers use nouns like readability, clarity, elegance, focused purpose, brevity, efficiency, and expressiveness (Martin, 2009). Out of all those, readability is the word most often used.

Whenever one says readable code, it is in the sense of humans reading it. However, what the heck is readable code? Well, readable code is clean code. Yeah, I know, not funny. The thing is, defining what makes code readable or clean is not trivial. It is a bit like describing what art is. You know it when you see it. Or, more likely, when you don’t see it. In fact, Martin expands this discussion into a whole book. I strongly suggest reading it. A great companion is Martin Fowler’s Refactoring book (Fowler, 2000), which shows ways to identify “bad” code and how to tackle it.

I have been a victim of messy code that I wrote myself. I doubt there are any developers who have not come back to a piece of their own code three or six months after writing it and had absolutely no idea what it does or how it does it. This is a sign of poor code. Clean code should be explicit, self-explanatory, and almost boring in its simplicity.

“Don’t be clever”

Over the years, I have developed several personal rules from principles I’ve read or battles I’ve fought. One of them is not to be clever. I avoid that like the plague. The fireup.pro team puts it nicely:

Don’t try to be too clever! Code can’t be a complex riddle that only its author can solve. Some say that clean code is the one that doesn’t stand out, and it actually should be a bit boring
(Team, 2022)

It feels cool to play acrobatics with a language. However, readability trumps it all. Why? Because when someone else comes back to read the code, the clever bit of logic will become an obstacle. Any time wasted trying to understand code affects the bottom line. And, in the world we live in, the bottom line reigns supreme.

As an example of clever code, let’s look at the quintessential strcpy function from the C library, taken from the Android source code (Google, 2005):

char *
strcpy(char *to, const char *from)
{
	char *save = to;
	for (; (*to = *from) != '\0'; ++from, ++to);
	return(save);
}

This function has nice names and is very simple. However, it is not very readable. Sure, one can spend a couple of minutes and eventually discern how the language features are exploited. Still, few could glance at it and divine what it is doing unless they are already familiar with it. This is a contrived example, since efficiency is paramount in this case and the function is well known and understood. But it serves as an example of how clever code can reduce readability.

And here is another version picked from a StackOverflow thread:

char* strcpy(char *a, char *b)
{
   while ( *b++ = *a++ ) {}
   return b;
}

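Ironically, this function is easier to read despite using single-character variable names and relying on the value of an assignment expression as the loop condition. (It also reverses the parameter order, copying from a into b, and returns a pointer past the end of the copy rather than the start of the destination, but that is a different story.)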

My point is not to criticize the implementation of strcpy. That has been done ad nauseam. What I want to highlight is that cleverness leads to poor readability. With today’s advanced compilers and interpreters, a clever implementation rarely improves efficiency.
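The same trade-off shows up outside C. Here is a minimal sketch in Python, my own illustration with made-up function names, not code from any library. Both functions group a list into consecutive pairs and drop a trailing leftover item. The first leans on a well-known iterator trick; the second is the boring version I would rather find six months later.

```python
from typing import List, Sequence, Tuple


def pairs_clever(items: Sequence[int]) -> List[Tuple[int, int]]:
    # Clever: the list holds the SAME iterator twice, so zip() pulls
    # two consecutive items from it for every tuple it builds.
    return list(zip(*[iter(items)] * 2))


def pairs_plain(items: Sequence[int]) -> List[Tuple[int, int]]:
    # Boring: walk the indices two at a time and build each pair explicitly.
    # Stopping at len(items) - 1 skips a trailing item that has no partner.
    pairs = []
    for start in range(0, len(items) - 1, 2):
        pairs.append((items[start], items[start + 1]))
    return pairs
```

The plain version is longer, but nothing in it needs to be decoded before it can be changed.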

Yet, stay concise

There is, though, a delicate balance between cleverness and succinctness. Consider, for example, list comprehensions in Python. Are they readable? To a person new to the language, a list comprehension may seem weird, but to someone who knows the language, it is as expressive as an explicit loop and much briefer. Below is a snippet from some code I wrote a couple of days ago:

def map_header_text(incoming_sto: PurchaseOrder) -> Optional[List[StoHeaderText]]:
    if not incoming_sto.purchaseOrderHeader.text:
        return None

    return [
        StoHeaderText(
            languageName=text.textLanguageCode,
            longTextId=text.textIdentifier,
            longTextIdDescription=text.textDetails,
        )
        for text in incoming_sto.purchaseOrderHeader.text
    ]

The function returns a list of objects of type StoHeaderText that map certain values from a list of similar objects. This function uses a list comprehension. Now compare with the loop version:

def map_header_text(incoming_sto: PurchaseOrder) -> Optional[List[StoHeaderText]]:
    if not incoming_sto.purchaseOrderHeader.text:
        return None

    headers = []
    for text in incoming_sto.purchaseOrderHeader.text:
        header = StoHeaderText(
            languageName=text.textLanguageCode,
            longTextId=text.textIdentifier,
            longTextIdDescription=text.textDetails,
        )
        headers.append(header)
        
    return headers

The difference is slight. A few more lines to create the list, generate each object, and append it to the list. Which one is more readable? I’d argue that the extra lines in the second version do not provide additional information. They are boilerplate, while the first version is just as readable but more concise. Could the first version be characterized as clever? I’d say no because it uses a language construct precisely as it was designed to be used. But I will leave it up to you to decide.

All this is to say that writing clean code is not easy. It takes effort and a lot of thinking. Just deciding how to name things can be a challenge, especially as the code base grows and the number of entities explodes. Other aspects of clean code, like coupling, cohesion, and expressiveness, also require careful consideration.

How to Stay Clean

Because of how hard it is, we will inevitably add blemishes even when we, as developers, intend to write clean code. It is human nature to be inconsistent. Or we may take a shortcut here and there, thinking we will return and fix it later. Yet later rarely comes. This gradual deterioration of the code as it grows is usually called rot. Code can rot to the point that it becomes an impenetrable, brittle mess. And the only way to avoid it is to be proactive in grooming it, picking out the nits as frequently as possible.

This is where code refactoring comes in. As Fowler puts it: “Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are improving the design of the code after it has been written” (Fowler, 2000). Thus, the key to clean code is relentless refactoring.

Note, though, that a crucial aspect of refactoring is that the code does not alter its external behavior. From the outside, the software’s functionality remains identical, even if the innards are thoroughly shuffled. But how do we guarantee that the program behaves the same after altering it? By testing it! This means that before doing any refactoring, we need a suite of tests that characterizes the system’s behavior. These can be automated tests, such as unit, integration, or end-to-end tests, or manual tests in which a user verifies the expected functionality.
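To make this concrete, here is a minimal sketch of such a test for the header-text mapping shown earlier. The dataclasses are hypothetical stand-ins for the real models (I am assuming only the fields the mapping touches), and the names map_header_text_loop and check_mapper are made up for the illustration. The idea is that the same checks run against the code before and after the refactoring.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


# Hypothetical stand-ins for the real models; only the fields used by the
# mapping are included.
@dataclass
class HeaderText:
    textLanguageCode: str
    textIdentifier: str
    textDetails: str


@dataclass
class PurchaseOrderHeader:
    text: List[HeaderText] = field(default_factory=list)


@dataclass
class PurchaseOrder:
    purchaseOrderHeader: PurchaseOrderHeader


@dataclass
class StoHeaderText:
    languageName: str
    longTextId: str
    longTextIdDescription: str


def map_header_text_loop(incoming_sto: PurchaseOrder) -> Optional[List[StoHeaderText]]:
    """Version before refactoring: explicit loop."""
    if not incoming_sto.purchaseOrderHeader.text:
        return None
    headers = []
    for text in incoming_sto.purchaseOrderHeader.text:
        header = StoHeaderText(
            languageName=text.textLanguageCode,
            longTextId=text.textIdentifier,
            longTextIdDescription=text.textDetails,
        )
        headers.append(header)
    return headers


def map_header_text(incoming_sto: PurchaseOrder) -> Optional[List[StoHeaderText]]:
    """Version after refactoring: list comprehension."""
    if not incoming_sto.purchaseOrderHeader.text:
        return None
    return [
        StoHeaderText(
            languageName=text.textLanguageCode,
            longTextId=text.textIdentifier,
            longTextIdDescription=text.textDetails,
        )
        for text in incoming_sto.purchaseOrderHeader.text
    ]


Mapper = Callable[[PurchaseOrder], Optional[List[StoHeaderText]]]


def check_mapper(mapper: Mapper) -> None:
    """Pin down the behavior that any implementation must keep."""
    # No header texts: the mapping returns None, not an empty list.
    assert mapper(PurchaseOrder(PurchaseOrderHeader())) is None

    # One header text: each field lands in its renamed counterpart.
    order = PurchaseOrder(
        PurchaseOrderHeader(text=[HeaderText("EN", "F01", "Handle with care")])
    )
    assert mapper(order) == [StoHeaderText("EN", "F01", "Handle with care")]


check_mapper(map_header_text_loop)  # passes before the refactoring
check_mapper(map_header_text)       # and must still pass after it
```

In a real project these checks would live in a test runner rather than at module level, but the principle is the same: the test describes behavior, not implementation, so it survives the rewrite.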

The implication here is that since software rot is inevitable, implementing tests is as important as implementing the software itself. The tests are the scaffolding enabling the developers to aggressively groom the code to remove any rot. With a battery of tests, developers can be confident that any regressions caused by refactoring will be discovered quickly. Having a suite of tests is liberating. I know. I’ve been there.

Nonetheless, this is one of the many aspects of my career where I need to improve. Although I understand the importance of writing tests, I find it tedious and difficult. In fact, most developers I know share the same feeling. What is worse, spending time writing tests can, at times, be a hard sell to management. Yet, tests are essential for the long-term success of any project. It behooves me to champion them and make an effort to always write tests for the code I produce.

References

Fowler, M. (2000). Refactoring: Improving the design of existing code. Addison-Wesley.

Google. (2005). libc/string/strcpy.c – platform/bionic. Google Git. Retrieved January 23, 2023, from https://android.googlesource.com/platform/bionic/+/ics-mr0/libc/string/strcpy.c

Martin, R. C. (2009). Clean code: A handbook of agile software craftsmanship. Prentice Hall.

Team, F. (2022, August 10). Clean code – how to code like a pro? fireup.pro. Retrieved January 23, 2023, from https://fireup.pro/blog/how-to-keep-your-code-clean

Eventual Consistency

Ruminations on the Software Development Process
