Break Where You Breathe
I’ve been writing text files for a living for over a decade now. Code, documentation, READMEs, commit messages, config files. All plain text, all version controlled. And for most of that time I treated prose and code as fundamentally different things.
Code got careful formatting. Every indentation had meaning, every line break was intentional. I’d refactor a function to make its structure clearer before I’d ever think about adding a comment.
Then I’d open a Markdown file and just… type. Paragraphs as long unbroken lines that wrapped wherever the terminal decided. Or worse, hard-wrapped at 80 columns like I was writing Fortran punch cards. Either way, the moment I needed to revise something, version control made it look like I’d rewritten the entire paragraph. Don’t get me started on vim gitdiff horizontal split.
At some point I started putting each sentence on its own line. Then I started breaking the text with intention. It felt weird at first, like writing poetry (maybe because I don’t write poetry). But the diffs got cleaner. The edits got easier. And I never went back.
Years later I learned this technique has a name and a history that goes back further than most programming languages in active use today.
The diff problem
Here’s what a typical diff looks like when you reword a single phrase in a paragraph that’s hard-wrapped at 80 columns:
-The beauteous scheme is that now, if you change
-your mind about what a paragraph should look
-like, you can change the formatted output merely
-by changing the definition of ''.PP'' and
-re-running the formatter.
+The beauty of this scheme is that now, if you
+change your mind about what a paragraph should
+look like, you can change the formatted output
+merely by changing the definition of ''.PP''
+and re-running the formatter.
Every line is marked as changed. A reviewer has to read the entire block to figure out what actually happened. Was it a rewrite? A word swap? A typo fix? Impossible to tell at a glance.
Now the same paragraph, with each clause on its own line:
-The beauteous scheme is that now,
+The beauty of this scheme is that now,
if you change your mind
about what a paragraph should look like,
you can change the formatted output
merely by changing
the definition of ''.PP''
and re-running the formatter.
One small edit, one changed line in the diff. The signal-to-noise ratio goes from terrible to “ok, I’ll review it”.
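To check the difference in numbers rather than by eye, here is a minimal Python sketch that counts how many lines a unified diff marks as added or removed for each layout. It uses the standard library's difflib rather than git itself, and the helper name changed_lines is made up for this illustration.

```python
import difflib


def changed_lines(before: str, after: str) -> int:
    """Count the lines a unified diff marks as removed or added."""
    diff = list(
        difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    )
    body = diff[2:]  # skip the two file-header lines ("---" and "+++")
    return sum(1 for line in body if line.startswith(("+", "-")))


# The paragraph from above, hard-wrapped, before and after the edit.
WRAPPED_OLD = """\
The beauteous scheme is that now, if you change
your mind about what a paragraph should look
like, you can change the formatted output merely
by changing the definition of ''.PP'' and
re-running the formatter."""

WRAPPED_NEW = """\
The beauty of this scheme is that now, if you
change your mind about what a paragraph should
look like, you can change the formatted output
merely by changing the definition of ''.PP''
and re-running the formatter."""

# The same paragraph with one clause per line.
SEMANTIC_OLD = """\
The beauteous scheme is that now,
if you change your mind
about what a paragraph should look like,
you can change the formatted output
merely by changing
the definition of ''.PP''
and re-running the formatter."""

SEMANTIC_NEW = SEMANTIC_OLD.replace("beauteous scheme", "beauty of this scheme")

print("hard-wrapped:", changed_lines(WRAPPED_OLD, WRAPPED_NEW))    # 10
print("semantic:    ", changed_lines(SEMANTIC_OLD, SEMANTIC_NEW))  # 2
```

Same edit, same words, and the reviewer has two lines to look at instead of ten.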
Kernighan said it first
In 1974, Brian Kernighan wrote a Bell Labs memo titled “UNIX for Beginners.” In it, he offered this advice:
Most documents go through several versions (always more than you expected) before they are finally finished. Accordingly, you should do whatever possible to make the job of changing them easy.
Start each sentence on a new line. Make lines short, and break lines at natural places, such as after commas and semicolons, rather than randomly. Since most people change documents by rewriting phrases and adding, deleting and rearranging sentences, these precautions simplify any editing you have to do later.
This was written for people using ed and troff on terminals
that probably couldn’t display more than 24 lines at a time.
And yet it applies perfectly, maybe even more so, to a world
where we review prose in pull requests
and collaborate on documentation asynchronously.
Kernighan hit the nail on the head: text that will be revised should be structured for revision, not for reading. The rendered output is for readers. The source is for writers.
The rules
The SemBr specification formalizes this into a clean set of conventions:
- A line break must occur after a sentence (., !, ?).
- A line break should occur after an independent clause (,, ;, :, —).
- A line break may occur after a dependent clause to clarify structure or stay under 80 characters.
- A line break must not occur within a hyphenated word.
- A line break must not alter the rendered output.
The key insight is rule 5. In Markdown, reStructuredText, AsciiDoc, LaTeX, and Org Mode, consecutive lines within the same block are joined into a single paragraph. Line breaks are invisible to the reader. They exist only to serve the writer.
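As a rough illustration of how mechanical the first two rules are, here is a minimal Python sketch of a line breaker. It is not part of the SemBr specification or any official tool: the function name semantic_breaks and the max_len parameter are invented here, and it takes a shortcut by treating every comma, semicolon, colon, or em dash as a clause boundary, and only when a sentence runs past max_len. It will also stumble on abbreviations like "e.g." that contain a period.

```python
import re

# Rule 1: break after sentence-ending punctuation followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# Rule 2 (simplified): break after clause punctuation followed by whitespace.
# \u2014 is the em dash.
CLAUSE_END = re.compile(r"(?<=[,;:\u2014])\s+")


def semantic_breaks(paragraph: str, max_len: int = 80) -> str:
    """Re-break a paragraph: one sentence per line, long sentences per clause."""
    text = " ".join(paragraph.split())  # collapse existing wrapping
    lines = []
    for sentence in SENTENCE_END.split(text):
        if len(sentence) <= max_len:
            lines.append(sentence)
        else:
            lines.extend(CLAUSE_END.split(sentence))
    return "\n".join(lines)


sample = (
    "Start each sentence on a new line. Make lines short, and break lines "
    "at natural places, such as after commas and semicolons, rather than randomly."
)
broken = semantic_breaks(sample, max_len=40)
print(broken)
# Start each sentence on a new line.
# Make lines short,
# and break lines at natural places,
# such as after commas and semicolons,
# rather than randomly.

# Rule 5: joining the lines back with spaces gives the original paragraph.
assert " ".join(broken.splitlines()) == sample
```

Because the sketch only ever breaks at whitespace, it cannot split a hyphenated word (rule 4), and the final assertion is rule 5 in miniature: a renderer that joins the lines sees exactly the same paragraph. A human ear still does a better job of choosing the breaks.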
Compatible formats
Semantic line breaks work in any markup language where consecutive lines are joined into a single paragraph:
- Markdown / CommonMark
- reStructuredText
- AsciiDoc
- LaTeX
- Org Mode
- MediaWiki
If you write plain text where newlines are significant (or preformatted content such as an HTML pre block), this doesn’t apply.
In practice
You don’t need to reformat an entire repository. Just start using semantic breaks in new text and revisions. Over time the style propagates naturally, and the distinctive shape of semantically-broken text makes it obvious which sections have been touched recently.
For reviewing diffs, git diff --word-diff pairs well with this approach,
highlighting exactly which words changed within a line.
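To show what that word-level view buys you, here is a small Python sketch that marks removed and added words in the same [-removed-]{+added+} notation git uses for its plain word diff. It is an approximation built on difflib, not git's actual algorithm, and word_diff is a name made up for this example.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Mark word-level changes between two lines as [-removed-]{+added+}."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        piece = ""
        if tag in ("delete", "replace"):
            piece += "[-" + " ".join(a[i1:i2]) + "-]"
        if tag in ("insert", "replace"):
            piece += "{+" + " ".join(b[j1:j2]) + "+}"
        out.append(piece)
    return " ".join(out)


print(word_diff(
    "The beauteous scheme is that now,",
    "The beauty of this scheme is that now,",
))
# The [-beauteous-]{+beauty of this+} scheme is that now,
```

With one clause per line, the line-level diff tells you where the change is and the word-level view tells you what it was.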
Read the text out loud. Wherever you pause for breath, for emphasis, or to let the idea land before the next one starts, that’s where the line break goes.
I’ve been writing this way for years now. It’s one of those small practices that costs nothing and quietly makes everything around it better.