Identifying bad code with function usage patterns
Coding Jan 13, 2022
As software engineers, we aim to write code that is easy to understand and change. But it is not always obvious what kind of code achieves that. Let me explain how I look at function usage patterns to identify and refactor bad code.
Let me first introduce some vocabulary so that we are talking about the same things.
If I'm saying "a function is used in a place", then I'm talking about it being referred to in a specific static spot.
function f() {
g();
}
The function g
is used in the function f
. If it appears in a for
-loop, then it is still only "used in one place".
function f() {
for (const i = 0; i < 2; i++) {
g();
}
}
It appears in two places, if its name appears multiple times in the code. It doesn't matter if its usage is in the same or another function. In the following example g
is used two times.
function f() {
g();
g();
}
If a function is passed to another function as an argument or used in a lambda, then only its appearance where its passed is a usage. In the following example g
is used one time.
function f() {
[1, 2, 3].map((x) => g(x));
}
So I'm talking purely about static analysis, not the dynamic call patterns at runtime.
Hence, as an example, in the following code the function g
is used three times.
function f() {
g(); // 1
if (false) {
g(); // 2
}
}
function h() {
return [1].map(g); // 3
}
It's always about the program as a whole, independently from if the functions are distributed over multiple files.
Note For all the object-oriented folks out there, you can apply these function usages similarly to methods within objects instead of functions.
So much for the boring part, let's talk about how I use usage patterns to evaluate the quality of code.
1 to 1 usage 🔗
That's the default you want to have in your program. A function is used in exactly one place.
function createBooks(count) {
for (const i = 0; i < count; i++) {
createBook();
}
}
There are multiple advantages of 1-to-1 usages.
- If
createBook
is changed, then onlycreateBooks
might break or change in behavior. - A signature change to
createBook
only affects this one place increateBooks
. - The input parameters and performance of
createBook
can be optimized for exactly this one usagecreateBooks
. - If used exclusively, then your function usages form a tree which is easier to reason about than a graph.
- Because the usage of
createBook
is so limited, you don't necessarily need to test it in isolation but can testcreateBooks
instead. - You only need to look at the last item of a stack trace and know in which context the function is used.
However, in some cases, you don't want to limit yourself to 1 to 1
usages.
2-5 to 1 usage 🔗
This usage pattern refers to a function being used in a limited number of places. I choose the numbers 2-5
because above that number it can be overwhelming looking at all the usages of this function.
// On `POST /books`
function postBook(book) {
synchronizeBook(book);
}
// Scheduled cron job
function refreshAllBooks(books) {
for (const book of books) {
synchronizeBook(book);
}
}
function synchronizeBook(book) {
// ...
}
A usage pattern like this can be very powerful, but also introduces risk since there are now multiple places that would break if synchronizeBook
is faulty. Hence, if this pattern is used, there absolutely need to be tests for synchronizeBook
.
Furthermore, this function needs to have a meaningful and short name. If its name is something like createOrUpdateBookIfAuthorized
, then its naming can either be improved or the 2-5 to 1
usage should rather be higher or lower in the call chain. On the other hand, the name needs to be so specific such that people would not want to use it in additional places. In the example, there are only a finite number of places like putBook
where someone would want to use synchronizeBook
as well.
n to 1 usage 🔗
This usage pattern is super dangerous and software engineers love it. It is the pattern for functions exposed by a library. And who does not love to write a library?
// On `POST /books`
function postBook(book) {
synchronizeEntity(book);
}
// On `POST /animal`
function postAnimal(animal) {
synchronizeEntity(animal);
}
// Scheduled cron job
function refreshAllBooks(books) {
for (const book of books) {
synchronizeEntity(book);
}
}
// Scheduled cron job
function refreshAllAnimals(animals) {
for (const animal of animals) {
synchronizeEntity(animal);
}
}
function synchronizeEntity(entity) {
// ...
}
In this example, the function synchronizeEntity
is used n to 1
.
My take on this: Use this usage pattern very, very rarely within a software product developed in a single team.
There are multiple disadvantages of n to 1
usages.
- If
synchronizeEntity
is changed, then god beware what might break. - A signature change to
synchronizeEntity
affects your whole program. - For individual usages, the performance of
synchronizeEntity
is close to impossible to optimize without risking bugs in other usages. - Because
synchronizeEntity
is so generic, you will have trouble to test it properly. You might achieve 100% test coverage for that function and it's still not properly tested for actual usages.
This pattern should mostly be reserved for libraries with clear and well-designed APIs. The library needs to be at least maintained by a different team, better by a different company or as an open-source project.
However, in some cases this pattern can be fine to use. Consider an API with endpoints which all check for authorization.
function postBook(user, book) {
if (!isAdmin(user)) {
throw new UnauthorizedException();
}
// ...
}
function postAnimal(user, animal) {
if (!isAdmin(user)) {
throw new UnauthorizedException();
}
// ...
}
In this example, isAdmin
is used n to 1
. However, it is always used in the same way. Some endpoint uses it and uses the return value to leave the function early with an exception. The implementation of isAdmin
can be rather simple compared to a function synchronizeEntity
and it can be expected to stay stable.
Some middleware frameworks (like Express) also allow you to centralize these usages.
app.post("/books", onlyAdmins, postBook);
app.post("/animals", onlyAdmins, postAnimal);
This way, you don't have the function usages of onlyAdmins
spread across your whole program if you gather the endpoint definitions in a single app
file.
1 to n usage 🔗
This usage pattern is in the vast majority of cases bad but luckily also the easiest to spot. It is a single function that uses many more other functions and has a tendency to grow over time. It sometimes comes in combination with an n to 1
usage as a 1 to n to 1
usage as the generality of the function is often the problem that leads to it being overly huge.
function synchronizeEntity(
user,
config,
db,
http,
/* ..., */ entityId,
entityType
) {
if (config.requireAdmin && !isAdmin(user)) {
throw new UnauthorizedException();
}
const entity = db.read(entityType, entityId);
let mappedEntity;
if (entityType === "book") {
mappedEntity = mapBook(entity);
} else if (entityType === "animal") {
mappedEntity = mapAnimal(entity);
} else {
throw new InvalidArgumentException();
}
let httpPath;
if (entityType === "book") {
httpPath = "/books";
} else if (entityType === "animal") {
httpPath = "/animals";
} else {
throw new InvalidArgumentException();
}
http.post(httpPath, mappedEntity);
}
Such a function will continue to grow. With more entity types being added, more function usages like mapBook
and mapAnimal
will come up. In this case, this problem can be resolved with introducing separate functions synchronizeBook
and synchronizeAnimal
. Such functions would not continue to grow unexpectedly and are rather 2-5 to 1 to 2-5
usages.
function synchronizeBook(user, db, http, bookId) {
if (!isAdmin(user)) {
throw new UnauthorizedException();
}
const book = db.read("book", entityId);
const mappedBook = mapBook(book);
http.post("/books", mappedBook);
}
Conclusion 🔗
If you make yourself conscious about the usage patterns of the functions within your program, you can use this knowledge to identify code smells and a direction on how to get rid of them. While 1 to 1
usages should be your default, 2-5 to 1
usages can be powerful if used consciously with care, n to 1
usages should be rare or reserved for libraries maintained outside of the team and 1 to n
usages are to be avoided.