This week, my project team and I are working hard on getting the first few pieces of our project up and functional. Even though we are still a few weeks away from a working Minimum Viable Product, we are making good progress, chugging along with our pull requests and merges. It’s a really great feeling, seeing things come to life.
One of the most satisfying aspects of working together is that our code reviews are going pretty smoothly. My teammates’ code is very easy to read through and digest. Because we are sticking to our style guidelines, a teammate’s code can take quite a different approach from mine and still be easy to read and make sense of. This, in turn, makes for efficient and productive reviews.
In contrast, I’m facing a different sort of code quality scenario at work, while trying to read through some very open-ended API documentation. Unlike the efficient and productive teamwork on my school project, my current work project has me needlessly spending time trying to make sense of what the writers of the documentation mean and what I can actually expect from these vague specifications.
In all fairness, though I am not at all an expert on API best practices, I can see that the team behind this documentation tried to follow good API writing standards: using self-describing names in their JSON schemas, having their data model mirror the business or domain logic, and not mixing data types for any given request or response property.
I can see that it all started out with sound design. But, somewhere along the way, things started getting hairy. As a consumer of this API I’ve identified some key (anti-)patterns that make it difficult for me to use the product:
- Optional properties or, as I call them, GHOST fields: There are a LOT of optional properties in the response JSON that may or may not exist in any given HTTP response. The documentation does not specify when to expect a certain property, so I have no context to help me anticipate whether a property will be present or omitted. I am told that these properties were intentionally kept optional to give the app developers flexibility to address different, or changing, scenarios, and to make the codebase extensible. Well, there is a fine line between flexibility and vagueness that, once crossed, leads to confused developer end users who can’t depend on the API. My recommendation: For people who use the API, required properties with null or empty strings as values are far preferable to vanishing or sometimes-present properties, even with the added bulk of sending more information across the wire.
- Repeated data in the deep, dark nested parts of the JSON: Some of the same properties and values that are available in the outer, more accessible layers of the JSON sometimes reappear 10 levels deep, somewhere. I can only guess that this repeated data is useful in its immediate context, 10 levels deep, so it was copied there. But this really smells off to me. It makes me think that the JSON schema is incorrect. My recommendation: If the two separate areas in the schema share a common set of properties, then maybe the JSON isn’t capturing that relationship correctly.
- Non-unique property names that are eventually changed: This one is related to the previous bullet. There are repeated property names in different scopes because there was initially a “need” for that setup. But now the API devs have decided that they want to make the names unique because consumers of the API are “grepping” the wrong property. So they send out a notification alerting people of the name changes. The folks using the API in production are hopping mad (or, at least, put out). My recommendation: while having unique names seems like the right answer, the real problem is the one mentioned in the previous bullet: the relationship is not correctly constructed. Maybe there is a better way to structure the schema so that there are no repeating properties. Maybe the solution lies in representing only the differences between those two similar sections of the JSON (literally, the diffs).
- The documentation doesn’t note expected values when it should: For many properties, there is a finite, known list of values that can be expected. This API documentation is missing many of these expected values. If I see a new value being returned, I immediately question it: What does it mean? Is it really new? Was this value returned in error? My recommendation: be explicit in the documentation about what values people can expect for a given property. Being explicit always trumps inference, at least in technical documentation.
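
To make the first and last recommendations a bit more concrete, here is a minimal Python sketch of what consuming a stricter API could look like from my side of the wire. Everything in it is invented for illustration: the "order" resource, its three property names, and the list of status values are assumptions, not anything taken from the API I'm describing. The idea is simply that every key is always present (null when there is nothing to report) and that the allowed values for a property are written down in one place.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class OrderStatus(Enum):
    """Hypothetical closed list of values, the kind the docs should spell out."""
    PENDING = "pending"
    SHIPPED = "shipped"
    CANCELLED = "cancelled"


# Hypothetical contract: these keys are ALWAYS in the response.
# "Nothing to report" is sent as null, never as a vanished key.
REQUIRED_KEYS = ("id", "status", "tracking_number")


@dataclass
class Order:
    id: str
    status: OrderStatus
    tracking_number: Optional[str]  # value may be null, but the key never disappears


def parse_order(payload: Dict[str, Any]) -> Order:
    """Convert a decoded JSON object into an Order, failing loudly on surprises."""
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError(f"response is missing required keys: {missing}")
    try:
        status = OrderStatus(payload["status"])
    except ValueError:
        # A value outside the documented list is treated as an error, not guessed at.
        raise ValueError(f"undocumented status value: {payload['status']!r}") from None
    return Order(
        id=payload["id"],
        status=status,
        tracking_number=payload["tracking_number"],
    )


# Example: a response where the tracking number is not known yet.
order = parse_order(
    json.loads('{"id": "A1", "status": "pending", "tracking_number": null}')
)
```

With a contract like this, a missing key or an unfamiliar status stops being something I have to interpret and becomes a clear signal that either the API or its documentation has changed.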
So, in writing these thoughts out, I have to curb my censure and say that I believe that this particular team behind the API is really trying their best. And, like I said, there is evidence of solid methodologies that speak of good beginnings. But, as it often goes with scaling out a product, with complexity comes the opportunity for bad design choices to seep in. The motivations were good: flexibility, extensibility, accounting for detailed real-life data models… but the danger lies in losing too much structure, leading to end-user confusion and discontent.
My final thought is self-reflective: whether at work or with my current school project, it is easy to write code that I, as the author, can understand, but it is much more difficult to keep the reader of my code in mind (or the consumer of my endpoint, as the case may be). Putting myself in the recipient’s shoes makes it easier to see how my design decisions or coding styles might affect them, a practice that I should do more often.