November 27, 2023

Two households, both alike in dignity, In fair Verona¹, where we lay our scene, From ancient grudge break to new mutiny, Where civil blood makes civil hands unclean.

Untitled

New Lines in the Sand

We all take the fact that pressing Enter puts our text at the beginning of the next line for granted. But in computing there are two² methods for actually accomplishing this, and they sometimes clash nearly as violently as the Montagues and the Capulets.

Every time you use a CSV or TSV file, or a multiline JSON file (which still underlies a LOT of electronic data exchange in 2023) you are using one or two ancient, invisible characters. These “non-printable” characters are special letters that drive functionality instead of visibly appearing. Unix-based systems and Windows-based systems use different combinations of characters to indicate a newline.

These invisible characters are the Carriage Return (CR, also written as an escaped “r” or \\r) and the Line Feed (LF or \\n). On teletype machines, the Carriage Return character moved the cursor to the beginning of the line horizontally, while the LF character moved the cursor vertically to the next line. Today, half the world indicates one of these with CR (\\r) and half with CR+LF (\\r\\n). The divergence between them takes us through the history of different operating systems, different international standards organizations, and through the full lineage of telecommunications technology all the way back to the telegraph.

Consequences

Problems caused by the divergence are incredibly common. In fact, there’s an entire page of Git documentation on the subject.

This is just the first result from Googling “line feed carriage return problem.” They are legion.

Untitled

This is not a dunk on this person commenting publicly on a codebase in the slightest — we’ve all been there.

Eternal Carriage Return

You might ask: where does the “carriage” in CR come from? The carriage is the component of a typewriter that holds the paper and moves back and forth horizontally. The original ‘carriage return’ was a physical lever on the side of the typewriter that, when pulled, caused the “carriage” to return to its original position. This set up the with “strikers” to be ready on the left side of the page for the new line of text.

The first incidence of the term “carriage” I found in the context of typewriters is in a book from 1843. Presumably, it is called the carriage because it carries the paper back and forth as the typing takes place, much like a horse-propelled carriage. Even in the 1840s, technology’s present was perpetually haunted by its past.

Untitled

CRLF as Metaphor and Warning

The case of \\r\\n vs. \\n encapsulates and symbolizes the state of today’s technology, particularly around data.

Vicky Boykis’ fantastic 2022 essay, “The cloudy layers of modern-day programming” lays this out in detail at the macro level and sparked my thinking on this topic. Please read it.

As modern technology users, we sit perched on a massive stack of tangled, conflicting modules, frameworks, components, platforms, APIs, and protocols. The cloud and layers of code and infrastructure abstraction heighten these issues.