Meat memory management
Why code is hard to read and how to fix it
Consider this code:
{const chars = stringconst initialChunks = char: chars0 count: 0const chunks = charsreturn chunks}
Imagine your job is to find out what it does and how it does it. How do you go about it? You will undoubtedly draw on vast reserves of knowledge, experience, strategies and rules of thumb. Eventually, by an awe-inspiringly complex orchestration of activities, you will come to an explanation.
As software engineers we are faced with this job every day. Sometimes it is so easy we don't notice, others it is so hard that it takes days of our lives. What causes that difference? In other words — what makes some code easy to understand, and other code hard?
The answer lies, unsurprisingly, in our brains and their limitations. We will see what these are, and learn how to transform our code from difficult to comprehend forms into easily cognisable ones.
Mentally representing code
Consider this code:
""endencode("aaaa") # => "4a"
Let's say I wanted to figure out how this works, to change or debug it.
First I read the code and transform it into a sentence a bit like this: "it returns a string with the length of the string and its first character".
We can diagram this out something like this:
I apply this to an example
"aaaa"
— the length of the string is 4, its first character is a, so the string is "4a".
Notice how my mental representation involves things and relationships, not operations. We're reasoning about properties.
Forgetting
Let's try something more complex. This one is in Ruby:
string.chars.chunk_while {|a, b| a == b }.map {|list| "" }.joinendencode("shhhh") # => "1s4h"
We can interpret this as a pipeline:
- Convert a string to an array of characters
-
["s", "h", "h", "h"]
-
- Convert that into chunks of identical characters
-
[["s"], ["h", "h", "h"]]
-
- Convert each of those into a string denoting run length and character
-
["1s", "3h"]
-
- Join them all up into one big string.
-
"1s3h"
-
While previously we were reasoning about properties, we're now reasoning about transformations.
This way of laying out code has a big advantage. As each stage goes by, I can forget about the previous stages.
Imagine I listed out 10 numbers for you and then asked you to add them all up — this would be quite hard. But if instead I read them slowly and let you add them up to a running total as we go, the task is much easier. You can forget the numbers as soon as they become irrelevant. That's what makes chains of transformations so pleasant to reason about.
There is, however, one disadvantage — you have to evaluate the whole chain to figure out what it does as a whole.
Intermezzo: a practical exercise with working memory
Try to generate the Fibonacci sequence in your head.
Don't use paper — brain only. Don't worry about remembering the sequence — just generate the next number.
See how far you can get, then read on.
Questions:
- When you started to slow down, what was the reason?
- When you failed, what was the cause?
- How far could you go on paper? Why is paper such an effective aid here?
- When programming, at what point do you break out the notebook? Why?
- Suppose you did this exercise in something less familiar like hexadecimal — would it be harder? Why? What does this mean for learning?
State
Here's the same code from the last exercise expressed in a different way.
{const chars = stringconst chunked = char: chars0 count: 0charsreturn chunked}
Here's roughly what I think when I'm trying to figure this one out:
So with the first character,
"a"
, it looks up the last item inchunked
and sees it is the same because we set that up there, then increments that count and returns. Next,"b"
it looks up the last item inchunked
and finds..."a"
which is not the same, so it pushes a new object,{char: "b", count: 1}
and then returns, then I guess with"b"
it increments, then with"a"
it adds a new one. Then it'll map them all to run length strings and join the lot together. 1
I'd describe this as reasoning via state. It makes use of our ability to maintain several pieces of information in short-term memory while performing operations on them. But it feels more taxing than the transformation version — and even worse, the complexity here is unbounded. If our method just chains transformations we can scale it indefinitely (though it might get tedious), but this state-y function could expand to have many pieces of state involved — effectively removing it from the category of things we can reason about in our heads. 2
Refactoring state
Luckily for us, we're programmers so we can refactor into easier forms. Let's try to refactor the above state solution into transformations and properties.
I started off extracting a grouping operation:
{const chars = stringreturn}{return items}
This is good for
encode
— it is now all transformations and we don't have to keep track of any of the intermediate steps. But
groupConsecutiveItems
itself is still pretty challenging. There's less named state, but we still have to keep track of the last and next
and all the rest. Let's try refactoring that.
After a few intermediate steps, I ended up with this:
{const chars = stringreturn}{if == -1return listconst before after =return before ...}{return list list}{const firstChar = list0return list}
This is a little easier to follow.
groupConsecutive
is now fairly straightforward — find the first change, split the string there, then do the same with the second chunk.
Novelty
Two scenarios:
One. You are sat in your company's cafeteria attempting to learn how the IO monad works in Haskell by building a BitTorrent client.
Two. You are sat at home on a lazy Sunday when everyone is out, attempting to learn how the IO monad works by building a simple command line quiz app.
In which situation are you likely to learn more about the IO monad?
We can describe cognitive load as being made up of three components:
- Intrinsic: the essential effort to complete a task. Building a BitTorrent client is more challenging than a quiz app.
- Extraneous: the effort due to the presentation or environment. Your company's cafeteria full of work-related chat and people you know consumes more cognitive load than your lazy Sunday.
-
Germane: the effort devoted to learning. The IO monad
:)
We have a budget of cognitive load that we can spend. Intrinsic can't be avoided, extraneous tends to be intrusive — and so germane often gets the squeeze.
When designing educational material we try to reduce the extraneous and manage the intrinsic, in order to leave maximum room for the germane.
So what does this mean for programming?
Let's say you really like my recursive solution above and you want to put it in your production software where all the other programmers will have to deal with it. This will involve some of them getting more familiar with recursion. This isn't necessarily a bad thing — it's good for your team to learn new solutions to problems.
But it is bad if they give up or don't work with it because it is too hard to reason about. So it's especially important, if you want to start solving problems this way, that you pay special attention to providing the programmer with a good environment to learn. Don't put it in crucial often-modified bits of the program where people might be under pressure to make changes quickly (minimising extraneous load), and make sure to use all the tools in your toolbox to make it easy to understand (minimising intrinsic load).
What is reading code about?
Consider these three functionally identical methods.
# Example 1""end# Example 2""end# Example 3""end
Which is easier to understand? Which is more readable?
Reading code is, literally, reading: decoding symbols to derive meaning. 3 Undoubtedly with code this is a multilayered process — we decode the actual symbols to derive the operations, and then derive the meaning (purpose, intention, behaviour) of the method, the program, etc.
Given this, we can say that readable code facilitates this process of deriving meaning.
Let's compare our three expressions above to their equivalent meaning:
-
string[0]
Get the first character of the string -
string.first
Get the first character of the string. -
string.chars.first
Get the first character of the string.
Each additional cue reduces the distance between the code and its meaning, and thereby the cognitive load required to read it. 4
Refactoring for meaning
So, one handy heuristic — try explaining how your algorithm works and then structure your code like that. 5
When I tried this with run-length encoding, I came up with:
Go through a string and replace all sequences of the same character with the length of the sequence followed by one of that character.
And then came up with a new solution:
{const CONSECUTIVE_CHARACTERS = /\1+/return}{const match = patternif matchconst before matched after =return before + +return string}{const originalString = matchinputreturnoriginalStringmatch0originalString}
A concise, communicative solution — though admittedly not without its rough spots. The regex requires a little parsing,
and the core activity of 'converting' is hidden in the middle of the
replaceMatching
function.
6
After a bit more searching around the
String
prototype, I found a solution so straightforward as to be a bit embarrassing.
{const ALL_CONSECUTIVE_IDENTICAL_CHARACTERS = /\1+/greturn string}
The decode function turned out pretty straightforward too.
{const ALL_RUN_LENGTH_SYMBOLS = /[0-9]+[a-z]/greturn string}
Why care?
When you're coaching junior developers some things come up again and again. One is the developer who massively prioritises extremely concise solutions using the minimum number of lines and characters.
The line I've settled on is that the conciseness to aim for is conciseness of understanding. Shortening
deposit
to
dpst
might make the line physically shorter, but take longer to read.
This is a novice error, but us more experienced types are often guilty of the same thing. We optimise for the algorithm we have discovered, not for the discovery of our algorithm by others. A design that flows effortlessly through the mind — even where the mind is changed by it — tends to be preserved for future generations. Unremarkable code, leaving a silence behind it, as there is nothing left to say.
If you liked this, perhaps you will like other things I make. Like this newsletter about becoming a better developer, or my tweets at @neoeno, or my website — you're on it right now!
Notes
- Ever find yourself talking out loud when figuring out a particularly tricky piece of code? Ever find yourself reading a number out loud again and again when you have to take it from one place to another? We have an extra piece of short-term memory that operates with audio — it's called the phonological loop, and it's a bit like a closed tape loop that just repeats. That's why you can remember longer strings of numbers if you repeat them — you say them aloud, listen to yourself saying them, and then repeat what you just said. ↩
- Catastrophically — different people have different numbers of things they can maintain in their short-term memory at a time, and a part of this is down to how comfortable someone is in the area. It doesn't take much to shift from 'just about manageable' to 'totally unmanageable', and when two developers can be on either side, this goes some way to explaining why we write unreadable code. ↩
-
In a rare display of sheer awe, Wikipedia describes reading like this:
Reading is a complex "cognitive process" of decoding symbols in order to construct or derive meaning (reading comprehension). Reading is a means of language acquisition, communication, and of sharing information and ideas. Like all languages, it is a complex interaction between the text and the reader which is shaped by the reader’s prior knowledge, experiences, attitude, and language community which is culturally and socially situated. The reading process requires continuous practice, development, and refinement. In addition, reading requires creativity and critical analysis. Consumers of literature make ventures with each piece, innately deviating from literal words to create images that make sense to them in the unfamiliar places the texts describe. Because reading is such a complex process, it cannot be controlled or restricted to one or two interpretations. There are no concrete laws in reading, but rather allows readers an escape to produce their own products introspectively. This promotes deep exploration of texts during interpretation.[1] Readers use a variety of reading strategies to assist with decoding (to translate symbols into sounds or visual representations of speech) and comprehension. Readers may use context clues to identify the meaning of unknown words. Readers integrate the words they have read into their existing framework of knowledge or schema (schemata theory). ↩
- Admittedly, this might induce some 'wtf' cognitive load in an expert, who might wonder why such inefficiency was tolerated. Tradeoffs! ↩
- Thanks to Mary Rose Cook for this one. ↩
- I noticed that whenever I explained what methods did they were often either something starting with a verb representing a transformation, like "it splits a string around a regex match", or something like a noun representing a property, like "the index of the first change". I've started naming my methods verbs or nouns accordingly, which saves me some effort thinking of names while also guiding me towards communicative naming. ↩