How Much Text Can You Handle? Line Length, Gutenberg, and Henry James

Line length is a major reason why easily available online transcriptions of Henry James’s major texts tend to overwhelm me. Click for a free full text, and you face a forbidding wall of words, like a curtain covering up some magical text lurking beneath it. Words crowd upon one another, and your eyes slip and slide as you try to train them on the same line all the way across the screen. We know that, grammatically, Jamesian sentences build up tension and dependent clauses until they threaten to collapse in on themselves, but that’s no reason why the glyphs themselves should do so. To avoid this feeling of chaos and doom, I want to explore the line length appropriate for full texts of James.

Conventional recommendations about digital line lengths typically range from 45 to 100 characters, often settling on a recommendation of no more than 70 characters. As Richard Rutter’s digital adaptation of Robert Bringhurst’s The Elements of Typographic Style advises,

Anything from 45 to 75 characters is widely regarded as a satisfactory length of line for a single-column page set in a serifed text face in a text size. The 66-character line (counting both letters and spaces) is widely regarded as ideal. For multiple column work, a better average is 40 to 50 characters.

As we will see, it really is a matter of having, or not having, multiple columns. It isn’t simply a matter of encouraging ease of reading James: his tendency to alternate dialogue-heavy chapters with description-heavy chapters causes his scores on reading-difficulty scales to swing wildly according to which part of a text you test. Add to this the fluctuations of his style across the so-called early, middle, and late phases, and there can be no statistically significant measure of his overall difficulty that could determine matters of “page” design once and for all. This is why I regard industry best practices as very relevant for any digital James project.

Let’s use The Golden Bowl (1904) as our example, comparing the first lines among editions, all three of which use justified rather than ragged right edges. The New York Edition (NYE) of the novel (1909) contains 47 characters, including spaces, in its first line:

NYE First Line

This line presents us with less than one sentence. It contains an independent main clause and hints as to the content of a coming dependent clause (“when it”). The line contains two primary nouns, the subject (“The Prince”) and the object (“his London”), linked by one verb, the simple “liked” with its modifications (the past perfect “had…liked” and the intensifier “always”). It is a simple declaration, easy to absorb, elegantly setting us in time and space, as well as giving us a person to think about, ending with a promise of drama from the enjambed dependent clause (“when it…”). The amount of information contained in the line is accessible yet grabs the reader’s attention. (Wow, a Prince, amiright.) It presents us with a lovely, microcosmic sample of narration: an atom of a novel.

From my calculations, the NYE as a whole averages 51.2 characters per line. The first line’s 47 characters are shortish but still representative, just within one standard deviation of that average. Meanwhile, in Gutenberg’s ebook (.epub file), the first line of the novel accords with current best practices and contains the same number of characters as the New York Edition:

Epub First Line

Subsequent lines reveal, however, that the ebook contains on average fewer characters per line than the NYE. A further difference between the NYE and the .epub display is the number of lines constituting a single page (24 lines in the NYE, 17 in the ebook). Because the line length is permanently set in the ebook, it will remain consistent across users (so, if you assign free online texts in your classroom, I urge you to give the students links straight to the .epub file). The ebook “reads” as less weighty, airier, than the print book. It feels, in fact, almost too airy, for my eyes keep straying to the white margins that dominate the text block.
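For anyone who wants to check such figures against their own transcription, here is a minimal Python sketch of the calculation: count characters per line (spaces included), take the mean and standard deviation, and ask whether a given line falls within one standard deviation of the mean. The function names are my own, and the sample lengths are hypothetical placeholders, not measurements from the NYE; I chose them only so that their mean lands on 51.2.

```python
from statistics import mean, stdev


def chars_per_line(lines):
    """Count characters (letters and spaces) in each non-blank line."""
    return [len(line.rstrip("\n")) for line in lines if line.strip()]


def summarize(lines):
    """Return (mean, sample standard deviation) of characters per line."""
    lengths = chars_per_line(lines)
    return mean(lengths), stdev(lengths)


def is_representative(length, avg, sd):
    """True if a line length lies within one standard deviation of the mean."""
    return abs(length - avg) <= sd


# The first printed line of The Golden Bowl as it breaks in the NYE.
first_line = "The Prince had always liked his London, when it"

# HYPOTHETICAL line lengths standing in for a real transcription of a page.
sample = ["x" * n for n in (44, 47, 52, 55, 58)]

avg, sd = summarize(sample)
print(len(first_line))
print(round(avg, 1), round(sd, 1))
print(is_representative(len(first_line), avg, sd))
```

To run this on a real text, replace `sample` with the lines of a transcription that preserves the printed line breaks; a reflowed plain-text file will measure the transcriber's wrapping, not the edition's.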

By contrast, in Gutenberg’s in-browser .html file—temptingly listed first among Gutenberg’s full-text options and far more visible in Google search results than the .epub file—the line length is an astonishing 138 characters:

Gutenberg First Line

We must keep in mind that there is a man, “the Prince,” who is in “his London,” that his London has “come to him,” that he was born a Roman, and that he has an opinion about the Thames, which we are about to learn once our eyes complete a carriage return. Only at such a line length does James’s late style appear immediately as a burden on the readers’ memories. The Prince has been effectively buried. (This is, by way of disclosure, the view without changing the window size from its default in my Chrome browser on my 11-inch MacBook Air. Keep in mind that, though I could force a change in line length by shrinking my browser window, doing so would also expose my peripheral vision to whatever lies underneath on my desktop.)

If I scroll down this .html version so that the first line is at the top of my browser, I see a gut-punching total of 4,228 characters that seems to confirm those greatly exaggerated rumors of James’s impenetrability. The .epub display uses a facing-page design (showing two pages at a time), but the first screen is a title page, so the reader’s first view encompasses a more manageable, almost meager, 1,051 characters: a quarter of what the .html file displays in the browser. The next facing-page combination yields 2,111 characters. In the NYE, 1,249 total characters greet the reader on the first page, and on turning to the next page, the two facing pages encompass 3,288 characters.

We have line lengths from 47 to 138, and character totals from 2,111 (.epub) to 3,288 (NYE) to 4,228 (.html). The .html text’s absurdity, ensured by its lack of organizational features, results directly from its strengths as a simple, small file, easily transferable. Of course it is the .epub version that is meant to succeed where the .html file does not, and vice versa. But this .epub version still bothers me. My instinct that it too is absurd, despite its excellent management of character overload, stems from my agreement with Johanna Drucker in her essay “The Virtual Codex from Page Space to E-space.” Drucker explains, “Electronic presentations often mimic the most kitsch elements of book iconography while for the longest time the newer features of electronic functionality seemed not to have found their place in the interface at all.” Instead, Drucker advises,

Understanding the way the basic spatio-temporal structure of the codex undergirds the conceptual organization of reading spaces is still important as we move forward with designing new environments for publication.

This would be an understanding based less on a formal grasp of layout, graphic, and physical features and more on an analysis of how those format features effect the functional operation and activity of the work done by a traditional book. Or, to put it more simply, rather than think about simulating the way a book looks, we might consider extending the ways a book works as we shift into digital instruments.

For me, this means creating options for restricting character overload beyond the use of columns, which feels literalist and even tacky.

And it means that every design decision for a digital edition reflects an ontology of the book. Is it an historical artifact, in which case you would want large-scale facsimiles and zoomable images of manuscripts and individual book-objects? In the case of re-archiving James online, do the tactile and visual qualities of font, paper, color, and binding matter? As long as HathiTrust continues to host so many high-resolution images, I certainly don’t have to worry about it myself. Preservation in this sense is in the capable hands of librarians, though of course it would be exciting to be a part of technological advances in preservation. What I would like are far smaller files that encode the text so that it can be displayed on a multitude of platforms and can be manipulated for data visualization, data mining, and all sorts of distant reading and statistical analysis. But what matters in the context of design and line length is that I’m going to define a James text as a conceptual entity, a way to organize thoughts, a method of presenting narrative information.

As a start, I’d like to preserve the quantity of information facing a reader. Per Drucker, I can throw out the facing-page design that, on a screen, effectively makes a two-column document, so that I can reduce character overload. Throwing out the columns will avoid the airy overemphasis on white margins in the .epub document. I’d start from the 3,288 characters of the NYE’s visual “load” as a benchmark for the maximum number of characters seen on the default screen of a digital edition. Line length will then be deduced from this character load in the process of ensuring proper balance between margin and text, line height, and any navigational bars/menus around the text (although, shudder, I’m dreaming of an austere design whose beauty derives from proportionality itself, courtesy of Devin Hunt’s typebase.css).
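To make that deduction concrete, here is a rough Python sketch of the arithmetic, under my own simplifying assumptions: every line is treated as full (which justified text only approximates), paragraph breaks and navigation are ignored, and the candidate line lengths are taken from the 45-to-75-character range quoted earlier. The function names are hypothetical; only the 3,288-character budget comes from the count above.

```python
import math

# Character "load" of the NYE's first facing pages, as counted above.
CHAR_BUDGET = 3288


def lines_needed(char_budget, chars_per_line):
    """Lines required to show char_budget characters at a given line length."""
    return math.ceil(char_budget / chars_per_line)


def layout_options(char_budget, min_cpl=45, max_cpl=75, step=5):
    """Map each candidate line length to the lines per screen it implies."""
    return {
        cpl: lines_needed(char_budget, cpl)
        for cpl in range(min_cpl, max_cpl + 1, step)
    }


for cpl, lines in layout_options(CHAR_BUDGET).items():
    print(f"{cpl} characters per line -> {lines} lines per screen")

# Bringhurst's "ideal" 66-character line under the same budget:
print(lines_needed(CHAR_BUDGET, 66))  # 50
```

The point of the table is the trade-off: a single column at the "ideal" 66 characters needs about 50 lines to carry the NYE's load, roughly twice the 24 lines of one printed page, so line height and margins have to be chosen with that depth in mind.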

What remains is the issue of responsive design, ensuring that these proportions remain no matter what platform a reader uses. Richard Rutter concludes his meditation on line length by advising designers to specify proportions rather than inelastic pixel sizes, using a “liquid layout” that preserves the designer’s ratios between white space and text (margins, line height, et cetera) but allows the user to adjust line length “to suit his or her comfort”:

Relinquishing such control makes some designers quake in their boots, but the beauty and advantage of the Web as a medium is that readers are able to adjust their reading environment to suit their own needs. This is a concept that should be acknowledged & embraced, and built into website designs from the ground up.

Already, the question of control seeps in. The NYE was a marvel of control: of James trying to control his artistic legacy by revising his texts and demanding veto power over the design of the volumes themselves. If James’s tight control over the NYE means that the resulting volumes reflect his ontology of the book (spoilers: I think they do), this control is a part of the reading process. And it’s not merely a literalist, hammy, conservative reproduction of traditional bookishness, as Drucker warns us against, but a crucial element of understanding the book as a conceptual structure.

