Category Archives: The Voynich Alphabet

Investigations of the shapes that are used within the Voynich to render textlike material.

Letter Patterns, EVA-m (the “j” Shape)

There’s a glyph in the Voynich Manuscript ~~EVA~~ font-set that is mapped to the “j” key, because it resembles a j to contemporary eyes (note, it turns out this is mapped to “m” in EVA).

In the 15th century, however, the letter j barely existed. Many European languages used a soft “j” (similar to a “y” as in “you”) and it was written as an “i” preceding another vowel, as in IOANNES (Johannes) and IVLIVS (Julius).

The “j” wasn’t even part of the alphabet—it evolved gradually from an embellished capital “i” that was used for names.

To the medieval eye, the “j” shape was not a letter, it was a Latin abbreviation written as a ligature (two shapes combined together for comfortable writing—something I’ve mentioned in previous blogs about the Voynich glyphs). Here’s an example of -ris, from a 14th-century manuscript, decomposed into its parts.

The letter “r” on the left is combined with the shape on the right, a common Latin abbreviation for “-is”, to create the suffix “-ris”.

Depending on the shape of the first stroke, this can stand for “-ris”, “-tis”, or “-cis” and, in some contexts, it was also used for the suffix “-rum”, instead of the more common 4-shaped “-rum”.

Origins of VMS Glyph Shapes

The Voynich Manuscript borrows many conventions from Latin, so it’s reasonable to assume that the inspiration for the EVA-j glyph-shape was probably the Latin -ris. It’s also interesting to note that in Latin, -ris occurs more frequently than -cis, and this is also true in the VMS. Whether this has anything to do with the meaning of the glyph or whether it is a case of misdirection (mimicry of Latin shapes without intending the same meaning) is not known but it’s noteworthy that -ris can occur at the end of a word almost anywhere in a Latin sentence, whereas it tends to occur at or near the ends of lines in the Voynich manuscript. The shape is the same; the positional patterns are different.

It’s also noteworthy that almost any letter can occur before -ris/-tis/-cis in Latin, whereas in the Voynich Manuscript it is usually preceded by the EVA-a glyph, as in the following examples:

But EVA-j is not limited to following the a-glyph. It doesn’t happen often, but it can follow other shapes:

The aj combination is the most frequent, but many other glyphs can precede the EVA-j shape, some of which are unclear as to whether they are “o”, “a”, or something else.

It’s difficult to tell which VMS glyphs are 1) combined shapes meant to be read as one glyph, or 2) combined shapes intended as multiple-glyph ligatures, but there’s some evidence that the Latin -is shape (the right-hand side of the -ris) might be a separate glyph in the Voynich manuscript. There are instances where the -is loop is completely disconnected from the previous stroke and some where it is preceded by other glyphs besides the “r”, thus suggesting it may be able to stand alone:

In these examples, the -is glyph is separated from the previous glyph and is preceded by something other than the “r” shape, thus suggesting it may be a separate glyph and possibly used as a ligature.

In Latin, it’s uncommon for the -ris shape to appear anywhere other than the end of a word and even more unusual for two of them to occur in sequence unless they happen to be variations (e.g., -ris followed by -tis). Midword positions are infrequent in the VMS, as well, but they do occur:

In the VMS, “aj” is usually found at the ends of words, usually at the ends of lines, but it is sometimes written midword, as in these examples.

Many transcriptions of the VMS text do not recognize the distinction between the straight “aj” and the curved “aj” (which is part of the reason I created my own transcription), but it might be important to acknowledge the difference partly because they are separate suffixes in Latin, but also because they appear to be clearly distinguished from one another in adjacent examples in the same VMS word-tokens. For example, here we see the -ris shape both preceding and following the -cis shape:

In the first example, there are two -ris shapes and one that may be either -cis with a short stem or a different character entirely. The second and third examples are less ambiguous, however. In both, the -cis glyph precedes the -ris glyph and it appears that the distinction is deliberate, as was the custom in medieval Latin.

Summary

If we assume that the looped part of the aj glyph is the right-hand side of a ligature, and could potentially be combined with other glyphs, then we have to look for other instances of its use.

As I illustrated back in January (and mentioned in even earlier blogs), the gallows character on the right may be composed of two parts, as well. Even if it is, what the glyph means is anyone’s guess. This shape has different interpretations in different languages—it can be “Il” in French, “lis” in Latin, “Item” in German, and sometimes even a very abbreviated “peri” in Greek. It’s also possible that it’s a capitulum, modifier, or marker, and the similarity to the looped shape in “aj” is coincidental.

Note that the gallows glyph also has certain positional peculiarities that differ from “aj”. It’s frequently preceded by “o” rather than “a”, it’s not usually found at the ends of words or the ends of lines, and might be a counterpart to the gallows glyph with two loops.

One other detail worth noting is that some of the EVA-d characters have a straight rather than looping stem. Is it possible this shape is a short-stemmed -cis or “j” rather than a “d”? In some places the distinction between them is more dramatic than in this example but are they different enough to be considered different glyphs?

Questions like this can’t be answered by shape alone. Position and frequency have to be considered, as well, to see if they behave differently. I’ve done this kind of analysis on some of the other morphologically similar glyphs, but I haven’t had time to evaluate the short-stemmed -cis to see if it’s different from EVA-d.
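A minimal sketch of what such a positional check could look like is given below. It assumes a transcription stored as lines of word-tokens in which every glyph is a single character; the sample lines are made-up stand-ins, not readings from any folio, and `position_profile` is a hypothetical helper name.

```python
def position_profile(lines, glyph):
    """Count where `glyph` falls: anywhere, word-final, and line-final.

    `lines` is a list of lines, each line a list of word-tokens (strings).
    Assumes one character per glyph, which a real transcription may not satisfy.
    """
    counts = {"total": 0, "word_final": 0, "line_final": 0}
    for line in lines:
        for index, token in enumerate(line):
            for pos, char in enumerate(token):
                if char != glyph:
                    continue
                counts["total"] += 1
                if pos == len(token) - 1:
                    counts["word_final"] += 1
                    # Line-final means word-final in the last token of the line.
                    if index == len(line) - 1:
                        counts["line_final"] += 1
    return counts


# Hypothetical lines, chosen only to exercise the function.
sample_lines = [
    ["daiin", "okam"],
    ["qokam", "chedy", "dam"],
    ["amol", "shedy"],
]

profile_m = position_profile(sample_lines, "m")
profile_d = position_profile(sample_lines, "d")
print("m:", profile_m)
print("d:", profile_d)
```

If two similar-looking shapes produce clearly different profiles over a large sample, that is a hint they may be different glyphs; on a handful of lines like these, the numbers only show how the tally works.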

J.K. Petersen

Entering the Entropy Zone

Entropy and the VMS

In the Voynich world, there is an oft-quoted statistic that the text exhibits low entropy compared to natural languages. It has been said that only one or two languages come close (with Hawaiian being one of them).

This comes as no surprise if one looks closely at the Voynich text. I created my own transcription of the entire manuscript several years ago, so I had no choice but to examine and evaluate every letter, every space, and one can’t help noticing how certain combinations repeat, and how certain letters re-occur in the same positions with surprising frequency. Line structure follows patterns also, with specific glyphs falling at the beginnings or ends of lines more often than one might expect.

How does the entropy of Voynich text compare to other 15th-century manuscripts? This is a broad and complex question, far beyond the scope of a blog whose purpose is to introduce the idea without all the math, but it probably wouldn’t hurt to show one example (note that entropy and repetition are related but not identical concepts—I’ll deal with repetition more specifically in a separate blog).

Comparing Two Snippets

Here’s an example from folio 81r I chose because the page layout reminds me of a song or poem and it’s not too hard to find 15th-century poetry for comparison. Poetry tends to be more repetitious and regimented than regular text, so I thought a medieval poem might resemble VMS text more than regular narrative text.

Excluding the fragments beginning with “o” on the right, and assuming the “9” and the “o” on the left are single characters, there are 23 word-tokens, and 20 repeated sequences of three characters (I was bleary-eyed from lack of sleep when I first wrote this, so I corrected this paragraph Nov. 10th).

Note that the repeating 3-glyph sequences are always in the same positions at the beginnings or ends of word-tokens. This is not a pattern we typically associate with natural languages except in specific forms of text such as prayers, poetry, or lists.
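For readers who want to repeat this kind of count, here is a rough sketch that finds repeated 3-character sequences in a list of word-tokens and records whether each occurrence sits at the start, middle, or end of its token. The tokens are Latin-letter stand-ins picked for illustration, not a transcription of folio 81r, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def trigram_positions(tokens):
    """Map each 3-character sequence to the positions where it occurs.

    A position is 'start', 'middle', 'end', or 'whole' (token is exactly 3 long).
    """
    positions = defaultdict(list)
    for token in tokens:
        for i in range(len(token) - 2):
            gram = token[i:i + 3]
            if len(token) == 3:
                where = "whole"
            elif i == 0:
                where = "start"
            elif i == len(token) - 3:
                where = "end"
            else:
                where = "middle"
            positions[gram].append(where)
    return positions


def repeated_trigrams(tokens):
    """Keep only sequences that occur more than once, with position counts."""
    return {
        gram: Counter(where)
        for gram, where in trigram_positions(tokens).items()
        if len(where) > 1
    }


# Stand-in tokens for illustration only.
sample = ["qokedy", "qokain", "shedy", "chedy", "okain"]
repeats = repeated_trigrams(sample)
for gram, where in sorted(repeats.items()):
    print(gram, dict(where))
```

A sequence whose occurrences all share one position label is positionally constrained in the sense described above; one that appears under several labels (as “chi” does in the Italian poem) is not.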

Compare this to a 22-word snippet from a 15th-century cosmology-themed rhyming poem in Italian that includes 6 repeated sequences:

In this example, there are also three 3-character sequences, but each one repeats only twice. Since this is a rhyming poem, two sequences are at the ends of words (and lines) but, unlike the VMS, the “chi” sequence appears in the middle of one word and at the beginning of another—it’s not positionally constrained.

Here’s another example, from one of the large-plant pages:

I colorized the sample to make it easier to see the patterns. Note that for the purposes of this example, I made the assumption that the “4o” sequence is intended to be together (this appears to be the case in most of the manuscript, but there are exceptions where the “4” appears without the “o”).

Even though the formatting and apparent subject matter of this plant page is quite different from the previous example, there are clearly many similarities, such as a high percentage of repeating sequences: the “4o” combination is almost always followed by a gallows character, the “c” and “r” shapes with tails are at the ends of words, the “9” is usually at the ends of words and frequently follows EVA-ch or EVA-sh, and the Latin “-ris, -cis” abbreviation (EVA-j) is always at the ends of lines (in other parts of the manuscript “j” appears elsewhere, but not as frequently as at the ends of lines). As I’ve mentioned on previous blogs, the structure is quite rigid.

Entropy is measured in a number of ways—it is not limited to repeating glyph sequences. Measures of word-length, character variability, and individual character combinations are all taken into consideration. Notice that the position of characters in relation to each other is more variable in the Italian example and the character set is larger. Most of the VMS text is expressed with about 17 to 20 of the more common glyph-shapes. The old Italic alphabet had only 17 characters, so it’s not an unworkable number but it’s fewer than most alphabets of the time and significantly less if you consider the various diacritical marks and abbreviation symbols that were in regular use. It’s also significantly less if any of the VMS glyphs are markers, nulls, or modifiers.
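As a hedged illustration of two of the simpler measures, the sketch below computes first-order character entropy (how unpredictable a single character is) and conditional entropy (how unpredictable a character is once the previous one is known). It treats the text as a plain string with one character per glyph, which is an assumption; real studies also have to decide how to handle spaces, ligatures, and uncertain readings.

```python
import math
from collections import Counter


def char_entropy(text):
    """First-order Shannon entropy in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conditional_entropy(text):
    """Entropy in bits of a character given the character before it."""
    pairs = Counter(zip(text, text[1:]))   # adjacent character pairs
    firsts = Counter(text[:-1])            # how often each character leads a pair
    total = len(text) - 1
    h = 0.0
    for (first, _second), n in pairs.items():
        # p(pair) * log2 p(second | first)
        h -= (n / total) * math.log2(n / firsts[first])
    return h


# Toy strings: strict alternation is fully predictable from the previous
# character, while "aabb" is not.
print(char_entropy("abab"), conditional_entropy("abab"))
print(char_entropy("aabb"), conditional_entropy("aabb"))
```

Rigid positional rules of the kind described above mostly show up in the second number: if knowing one glyph narrows down what can follow it, the conditional entropy drops even when the individual glyph frequencies look unremarkable.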

Summary

These snippets are only examples—they don’t mean anything by themselves. Genuine research requires hundreds or sometimes hundreds-of-thousands of samples and many different kinds of comparisons. For a draft tutorial on entropy as it applies to the Voynich manuscript, you can read Anton’s post on the Voynich forum. For mathematical studies of entropy, you can consult scientific journals and blogs, and books such as CryptoSchool by Joachim von zur Gathen. For a basic introduction, however, you can look through the VMS and see that the above patterns are common to the text as a whole—glyph-groups tend to repeat, and the same glyph-groups end up in the same positions much of the time, with variation in letter-position being very constrained, all of which tend to lower the entropy.

Does this argue against the VMS being natural language?

Maybe.

But that’s a subject for another blog.

J.K. Petersen

Reconsidering the Columns

The Mystery of the Columns

In May 2016, I posted a follow-up blog about the faint letters visible on the right-hand side of folio 1r and speculated that it might be a failed attempt at decoding the manuscript. That was a guess based on seeing the Latin alphabet in the first column paired with Voynich shapes in the second, and the fact that it was later erased. Two more columns are also faintly visible, but there’s not enough detail to discuss them in depth.

In my previous blogs, I was reluctant to guess the date of the columnar writing because only a few letters are clearly visible, but I went out on a limb and estimated that it might be late-16th- or possibly 17th-century script, based on the small round shapes, the long unlooped ascenders, the slant, and the overall look and feel. I wasn’t completely sure, however, because important clues about how the writer connected the letters and spaced the lines aren’t available.

As soon as I posted the May 2016 blog, I started this blog, to describe the writing further, but was pulled away by other interests and responsibilities. The column text is a sideline for me, but studying it might reveal a few details about the VMS’s provenance, so I come back to it from time-to-time.

Who Added the Columns to the Voynich Manuscript?

My paleographic collection includes thousands of writing samples, but most are focused on Carolingian or Gothic time-frames and the VMS columnar writing is different. It looks more recent than other parts of the VMS, and more like a casual or correspondence hand than a scribal book hand, and most of it has been erased. Nevertheless, there is enough to sample some of the letters.

To recap: on folio 1r, the first column (to the right of the main text) is moderately clear. An alphabet has been written from top to bottom in a tidy script with small, relatively smooth curves and unlooped ascenders/descenders. I have colorized the letters to make them easier to see.

The second column starts with the VMS figure-8 glyph, followed by a small c-shape, and then some shapes that resemble the “red weirdo” at the top of the columns. I’ve colorized the “weirdos” red to distinguish them from the regular Latin alphabet in Column 1 and the VMS characters above them. Columns 3 and 4 are almost completely erased and crowded by wormholes, and column 4 appears incomplete (it’s even possible that columns 3 and 4 are one column worked in around the holes), so this blog focuses on the letters in column 1.

A Brief Background on Writing Styles

From a paleographical point of view, the style of writing in Column 1 is quite distinct from the angular looped ascenders and proportions of 15th-century Gothic scripts. Gothic book and cursive hands (and those that closely resemble them, like Anglicana) were predominant in the Holy Roman Empire in the 15th century and were in use all the way north to Scotland and Sweden and south to the area around Naples, partly through the influence of Benedictine and Franciscan monasteries, and partly due to commercial scriptoria that offered handwriting lessons.

Gothic cursive styles were less common in the central Italian states and western reaches of Portugal and Spain, but were used in Flanders, eastern France, and Bohemia.

Gothic handwriting is relevant to Beinecke 408 because the labels on the zodiacs, and the marginalia on the last page and a few of the other pages, are in Gothic cursive hands. The latter appears to be in an older transitional style, between a Gothic book hand and Gothic cursive (I have a detailed paper on this that I will upload in a future blog).

The folio page numbers also appear to be different from both the main text and the last-page marginalia, and it has been suggested that John Dee may have added the numbers. I have not read the prior research on Dee and the folio numbers because I wanted to determine for myself whether there is a match, so that I could independently corroborate or refute existing opinions. I will post my observations on a separate blog. For this blog, I thought it might be interesting to ask the question…

Did John Dee Write the Marginal Columns?

John Dee was a pious family man with a thirst for learning. His broad interests included mathematics, medicine, astrology, and many other subjects. He avidly collected books, dreamed of establishing a national library, and was eager to communicate with angels in the hope of uncovering universal truths.

Dee is often described as an alchemist but he did not engage in alchemical experiments to any great degree, except in a secondary role if they were related to angelic communication. He was interested enough, however, to read about alchemy, to have some lab materials, and to leave marginal notes in this handwritten manuscript that may have been from his library:

Dee’s margin note about “the grene lyon” (the green lion) is a reference to one of the ingredients of alchemical distillation processes. Interestingly, something I noticed as I looked at page after page of Dee’s writing, is that he appears to have picked up scribal ideas for ligatures and flourishes from some of the texts that he read or copied. I noticed the scribe on the left used a ligature for “th” and, in some places, a flourished “e” that are not found in Dee’s marginal notes for this page, but which show up in Dee’s later notes in adapted form.

In note form, Dee’s hand can be scrawly and difficult but is elegant and comprehensible when applied to finished charts and formal correspondence. Dee could draw reasonably well, valued good handwriting, and is said to have encouraged his sons to write well so as to make a good impression. (Image detail of Dee’s autobiographical notes courtesy of the Royal College of Physicians exhibit.)

In his search for knowledge, Dee ardently tried to communicate with angels and kept profuse notes of these sessions. He made efforts, sometimes on a daily basis, to contact these heavenly messengers. As a consequence, his notes, diary, and correspondence provide enough samples to get a good sense of his handwriting.

Evolution of Handwriting

By the 17th century, handwriting in academic circles had evolved from the upright, heavy, angular Gothic styles of the 15th century to a lighter, quicker, more slanted script. Compared to early 15th-century scripts, Dee’s 16th-century lower-case letters are small and rounded, the space moderately wide between letters, and the ascenders and descenders long and not always looped, more similar to the example on the right.

On the left is a typical example of mid-15th century Gothic script from a commercial scriptorium that taught handwriting. By the 16th century, paper was more widely available, making it easier to engage in correspondence and quicker, lighter hands became prevalent in academic circles, as in the French example on the right. Dee’s hand also reflects this change in style and bears similarities to the hands of a number of scholars and nobles in France, distant parts of the Holy Roman Empire, and what is now northern Italy.

With regard to the VMS, Dee’s script is distinctively different from the Gothic cursive on folio 116v and a few other folios, so I think we can rule out Dee as the author of the last page and the zodiac wheels marginalia. It also doesn’t seem likely that he was one of the primary scribes for the VMS—the slant and spacing don’t match, the time-frame is wrong, and he handles the pen differently from the main text (more about that and the folio numbers in separate blogs).

Overall Impression

As I collected samples of Dee’s handwriting, it struck me that it was similar to Marcus Marci’s correspondence about the VMS, penned by a scribe on Marci’s behalf several decades later. I haven’t seen this similarity mentioned anywhere else in connection with Voynich studies, so I sampled one of Marci’s letters, as well, based on the image at http://www.voynich.nu. As far as I am aware, the identity of Marci’s scribe has never been determined.

Most of Dee’s available notes were written between 1550 and 1600, almost a century earlier than Marci’s letter, and yet you will see the similarities in style in the image below. The only significant differences are the following:

Dee sometimes wrote “e” with an ascending tail rather than a loop,
Dee’s “g” descender is shorter (although not always), and
the starting leg of the “h” is frequently truncated so it doesn’t reach the baseline—in combination with the flourished “e”, this is a distinctive marker in Dee’s handwriting but the pattern can be found in a few others, including that of Isabella d’Este who was raised in Ferrara, far from Dee’s London, England.

It was necessary to hunt through several hundred documents to find a few hands that closely resembled the style of writing on folio 1r and this is still a work in progress. It may require hundreds more to get a sense of when and where the columnar letters were written. As it is, Dee’s handwriting is somewhat close, and he sometimes wrote the “e” with a hook as in the columnar text, but the slant and pressure dynamics differ, so it’s not an exact match (click to see a larger version).

The hand of Isabella d’Este (far right) is surprisingly similar to Dee’s (with the exception of the “g”), which demonstrates not only that geographically distant writers can end up with similar letter forms, but that it’s unwise to jump to conclusions when finding something that “almost” looks the same. There might be others that match even more closely that lie undiscovered.

Summary

When I first saw Dee’s handwriting, I noticed similarities between it and the VMS columnar text, but after sampling the handwriting of other writers, it appears that this style of script was widespread geographically even if it was not entirely common (I encountered many other styles in the search for this handful of samples).

My gut feeling, until more data is available, is that the columnar text was probably added sometime between the late 15th century and the mid-16th century. This is very tentative, as there is so little to go on, and certainly will be revised if additional examples that match more closely are found.

J.K. Petersen

Colorizing the Columns

Making Sense of the Columns

On folio 1r, on the right-hand side, there are three columns of letters and a few shapes that somewhat resemble the “red weirdos” on the same page. The first column appears to be the alphabet in Latin characters. The second column doesn’t follow an alphabetic pattern and may have been someone’s attempt at decoding the manuscript with a substitution code. The third column is faint and doesn’t appear to have as many characters as the others.

I had considerable difficulty trying to determine which marks were worm marks, which were variations in the parchment, which were chemical abrasions, and which were letters or other glyphs, but I did my best to colorize the forms so they can be more easily seen.

This is a very subjective process based only on scans, since I have never seen the original document, but I thought it might be helpful or at least of interest. I used a different color for the two shapes that look more like “red weirdos” than the other letters (the upper one, at least, the one that resembles a Y shape and is next to the “c”, looks like it was filled with a brush rather than a pen).

I asked myself what would motivate a person to scrape or chemically remove the columns on the right. The two most likely explanations seemed to be 1) to hide the original code or 2) to remove an unsuccessful attempt at decipherment.

Since the column script doesn’t match the handwriting for the marginal notes or the zodiac labels, and has a different look and feel than the original VMS text, I’ve been assuming for now that it was written by someone else and may be a failed attempt at decipherment. The problem with this idea is that some of the shapes in the second column are not regular shapes in the VMS and the Y shape that resembles a red weirdo may have been part of the original document, considering there’s an oddly placed red weirdo above it. Is it possible there were shapes in the margin before the columnar text was added?

The age of the column text is difficult to determine. The ink appears to be old, but the style is not Gothic cursive, as are the marginal notes. Gothic cursive was especially prevalent in the 15th century, which suggests that the marginal notes might be as old as the VMS, or almost as old, but the columnar text is different—it could range from about the 16th century to perhaps the 20th century, depending on the region. If I were forced to guess, I’d probably guess late 16th or 17th century.

You might notice something interesting toward the bottom of the second column—the shape at the bottom is rounder and more elaborate. It’s difficult to tell if it was written at a different time or by a different hand (or whether the column writer switched to a different style of writing, which seems less likely). To the left of it, in the first column, are a pretty standard y and z and possibly an x above the y, but it’s very faded and hard to tell. Above the curly letter in the second column is a shape I can’t make out.

A Little Dessert

I have one more image that strikes me as interesting. It was difficult to adjust the colors because it’s very small.

If you turn your head to the left next to the top right weirdo, there are three lines of what look like erased text. Nothing is clear except perhaps the shapes at the end of the second line, which look a bit like a modern era capital-F followed by an a (or maybe a g), but it’s scrawly, so I really can’t tell. It doesn’t look like Arabic, Hebrew, or any other language I recognize. The problem with identifying scrawls is that there are some shapes, like n or r, that look like letters in many languages (Anglo-Saxon, Latin, Greek, Hebrew, Italian, German, French, and many others) even if they mean something different in another language.

The VMS has gone through many hands, so who knows who may have added notes. Why the note would be so tiny is a bit of a puzzlement. Maybe you can make it out.

J.K. Petersen

More About Those Puzzling Pilcrows

Are There Pilcrows in the Voynich Manuscript?

In the last article, I diverged from the Voynich theme to illustrate a brief history of the symbol we know as the pilcrow. I realized the article would be too long if I tried to tackle both the history of the pilcrow and its relevance to the VMS in one go, so this is a continuation of the previous blog.

What does a typographical symbol have to do with the Voynich Manuscript? Maybe nothing. The VMS has enough space on most pages to visually separate the paragraphs, and yet there’s something odd about the behavior of the tall glyphs, popularly called “gallows characters”, a clue that might be important in interpreting the text.

The Duplicitous Gallows

The first time I looked at the Voynich manuscript, I noticed that a tall shape that looks like a Greek pi with a loop (or two loops), and one that looks like a P, were often at the beginning of paragraphs. Sometimes they are embellished with extra loops that appear to be more decorative than meaningful (although that is not known for certain).

At the time, I knew very little about medieval scripts, nothing at all about capitula (although I was familiar with pilcrows from word-processing programs) and I further didn’t know that capitula could occur in the middle of a line, as well as at the beginning of paragraphs.

What I did know was that the VMS had been called a “cipher” manuscript and I noticed immediately that the gallows characters at the beginnings of sections would sometimes alternate (see Folio 3r as an example), so I entertained the notion that different gallows chars signaled a different encryption method, or perhaps a paragraph that required a different decryption key. It hadn’t occurred to me yet that the explanation might be simpler.

After paging through several of the VMS scans, it became apparent that gallows characters weren’t always at the beginning of paragraphs and didn’t all behave as pilcrows—some of them were midline and too close together to reasonably expect them to signify a new section.

This pattern prompted me to do some research on paragraph markers and I discovered capitula (section markers often used to separate units smaller than a paragraph) and noted that the C-shape in old manuscripts served two functions—it could represent a capital C or it could represent the capitulum symbol, depending on context.

Could some of the VMS glyphs have more than one purpose?

This example from folio 58r illustrates how gallows characters are frequently included within the main text, often close together, which provides an argument against them being section markers, but what if the gallows serve two purposes, as with medieval capitula, which can stand as section markers but also represent the letter C? Notice also how frequently EVA-k and EVA-t are followed by the glyphs that look like “ar” or “al” (more often than would be expected in natural languages and something I touched on briefly in another article). Is it possible that even the midline gallows is a marker rather than a letter?

Could the gallows characters be pilcrows or capitula in some situations and letter glyphs in others? If so, a computational attack would have to adjust for this possibility when estimating letter frequencies.
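One way such an adjustment might be sketched: count glyph frequencies twice, once treating every glyph as a letter and once setting aside a gallows that opens a paragraph as a possible marker. The code below assumes paragraphs stored as lists of word-tokens with one character per glyph, and it only handles the paragraph-initial case; deciding which midline gallows are markers is the hard part and is not attempted here. The sample paragraphs are invented.

```python
from collections import Counter

# EVA letters for the four gallows shapes (pi-like k, t; P-like f, p).
GALLOWS = set("ktfp")


def glyph_frequencies(paragraphs, initial_gallows_as_marker=False):
    """Relative glyph frequencies over a list of paragraphs.

    Each paragraph is a list of word-tokens. When `initial_gallows_as_marker`
    is True, a gallows that opens a paragraph is excluded from the count.
    """
    counts = Counter()
    for paragraph in paragraphs:
        for i, token in enumerate(paragraph):
            if initial_gallows_as_marker and i == 0 and token[:1] in GALLOWS:
                token = token[1:]  # set the possible marker aside
            counts.update(token)
    total = sum(counts.values())
    return {glyph: n / total for glyph, n in counts.items()}


# Invented paragraphs for illustration only.
sample_paragraphs = [["pchedy", "okal"], ["tshol", "kar"]]
as_letters = glyph_frequencies(sample_paragraphs)
as_markers = glyph_frequencies(sample_paragraphs, initial_gallows_as_marker=True)
print(as_letters.get("p"), as_markers.get("p"))
print(as_letters["k"], as_markers["k"])
```

Comparing the two tables shows how sensitive a frequency-based attack would be to the marker hypothesis: glyphs that only ever open paragraphs vanish from the second table, and every other glyph's share shifts slightly.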

Maybe the glyphs aren’t doing double duty. Maybe they only represent section markers, proper names, or something else in the text that needs to stand out, but if that’s the case then there’s a problem… if the midline gallows are capitula or markers, it reduces the number of VMS glyphs that potentially correspond to an alphabet. The VMS character set is already rather restricted and reducing it further would make it even less like a natural language.

Counting the Pees and ~~Ques~~ Tees

In the EVA font, a character set that helps you write Voynich glyphs, the EVA-k and EVA-t stand for the pi-like gallows with one loop or two. Similarly, EVA-f and EVA-p represent the P-like gallows with one loop or two. There are also some glyphs that combine the gallows with the bench char and are numerous enough to be significant, but it’s a more complex subject, so I considered both benched and unbenched at the beginning of sections as gallows for the purpose of this tally.

I counted only the text groupings that could be identified as discrete sections, what we would call paragraphs. Most of them are fairly clear—either there is space between them or the last line is shorter and a new group begins. Here are the tallies for the first section from folio 1r to 57r, which consists entirely of plant drawings except for the first page.

Out of 219 paragraphs, two were preceded by the red “weirdo” characters that resemble seagulls on the first folio. There were 206 groups preceded by gallows characters, most of them without bench chars, but a few had a full or partial bench character attached. A very small number were preceded by a gallows character with a leading “o”. Including the very small number with a leading “o”, 94% of paragraphs began with a gallows character.

The gallows were distributed as follows:

EVA-p 85 (double loop P-shape)
EVA-t 66 (double loop pi-shape)
EVA-k 40 (single-loop pi-shape)
EVA-f 15 (single-loop P-shape)

Gallows with two loops thus occur more frequently than those with single loops at the beginnings of paragraphs, with the most visually ornate of the four being the most numerous (whether by coincidence or design is not known).

Of the 11 paragraphs that began with neither a red seagull-shape nor a gallows, all were secondary paragraphs (not the first one on the page). Their opening glyphs were distributed as follows:

EVA-q 6 (all were followed by the “o”)
EVA-y 2
EVA-ch 2 (both were bench chars with caps)
EVA-s 1

Whether the lack of a gallows character at the beginning of some paragraphs was intentional or accidental is difficult to know without further interpretation or decryption of the text. Most of the exceptions were “4o” combinations that almost invariably fall at the beginnings of word-tokens throughout the manuscript.
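The arithmetic behind these tallies can be checked in a few lines. The counts are the ones given above; only the variable names are mine.

```python
# Paragraph-initial glyph counts for folios 1r to 57r, as tallied above.
gallows_starts = {"EVA-p": 85, "EVA-t": 66, "EVA-k": 40, "EVA-f": 15}
other_starts = {"EVA-q": 6, "EVA-y": 2, "EVA-ch": 2, "EVA-s": 1}
red_weirdo_starts = 2

gallows_total = sum(gallows_starts.values())
other_total = sum(other_starts.values())
paragraphs = gallows_total + other_total + red_weirdo_starts

# Share of paragraphs that open with a gallows character.
gallows_share = gallows_total / paragraphs

# Double-loop shapes (p, t) versus single-loop shapes (k, f).
double_loop = gallows_starts["EVA-p"] + gallows_starts["EVA-t"]
single_loop = gallows_starts["EVA-k"] + gallows_starts["EVA-f"]

print(f"{gallows_total} of {paragraphs} paragraphs start with a gallows ({gallows_share:.0%})")
print(f"double-loop: {double_loop}, single-loop: {single_loop}")
```

The four gallows counts sum to 206, the exceptions to 11, and together with the two red “weirdo” paragraphs they account for all 219 paragraphs.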

A simple count cannot reveal whether a gallows at the beginning of a section is a paragraph marker, especially when there are four different symbols used for this purpose, all of which also show up in the main text. It does seem unusual, however, to have only four letters of an alphabet at the beginnings of paragraphs for such an extended number of pages. Even in medieval books of lists, calendars, and indexes, there is more variation than this when the length of the document exceeds 50 or 100 pages.

On folio 58r, the first full page of text after the plant section, the pattern of leading gallows characters continues, as does that of gallows characters occurring midline.

Perplexing Paragraphs

In many respects, both styles of gallows characters (the ones with two legs and the P-shaped ones) resemble pilcrows. Look again at the marker on the right from a manuscript written in medieval Lombardy: it’s two slashes with a horizontal bar. It’s a bit like EVA-k or -t, and the regular pilcrow that resembles a P or backwards-P could be represented by EVA-p. Maybe the VMS gallows characters are a hierarchy of pilcrows, like the red and blue capitula where one is used for greater emphasis than the other. If one were a pilcrow and the other a capitulum, you would expect the “capitulum” to show up more often and perhaps even double for a letter, as it did in most medieval western languages.

But what about the strange behavior of gallows glyphs where they stretch over several characters? Is each leg standing in for the same character so it doesn’t have to be written twice? Is it an embellishment? Is it a different letter-glyph, one that’s only occasionally needed? Capitula never stretch over letters like that, do they?

I didn’t think so until recently, and then I found the capitulum illustrated on the right. The manuscript had many traditional capitula and a few like this, where the stem came down several letters later. I don’t know if it’s because the scribe didn’t leave enough room for the stem and compensated by adding it farther on or if it was to give the capitulum greater emphasis. Perhaps it was an embellishment, but whatever its significance, apparently capitula can stretch over several letters.

Summary

Could the gallows characters be capitula, name or title markers, letter-glyphs or possibly both? At least some of the glyphs behave like capitula or pilcrows. Assuming that a natural language were behind the VMS, it seems unlikely that so many sections would start with the same two (or four) characters unless their function went beyond an alphabetic character. Maybe they aren’t letters at all. Maybe some or all of them are intended to provide emphasis, serve as modifiers, or as some kind of semantic break between words and perhaps that’s why they’re never placed two in a row.

J.K. Petersen

Pointing out Pilcrows

You’ve probably seen it in word-processed text, that funny backwards-P sometimes visible at the beginnings of paragraphs. It’s an ancient symbol that originated in the days when words were often broken across lines without a hyphen and sometimes run together without spaces so that it could be difficult to tell where one thought ended and another began.

The concept of the pilcrow is related to the Greek word paragraphos, the origin of our word “paragraph” from para (beside, next to, apart from) and grapho (write) but no one has given a really satisfying etymology for the word pilcrow. It has been suggested that pargrafte or pylcrafte somehow mutated into pilcrow but that seemed a bit of an aural stretch, so I looked up pilcrow in dictionaries from 1700 and earlier.

Looking at many definitions gave me the feeling pilcrow may have existed alongside the Greek-Latin paragraphus because old dictionaries listed it next to paragraphus as though they were synonyms, rather than as one leading to the other. In A Dictionarie of the French and English Tongues (1611), it is spelled Pill-crow which suggests it might have evolved from two words combined.

This 13th century paragraph marker in a Greek document consists of two curved slashes.

I tried looking up pill-crow, pylcrow and pull-crow and then remembered that many older languages would not have added “w” to the end. It then occurred to me that the Spanish pvlcro/pulcro, which means neat or tidy, might be related. I haven’t seen anyone propose pvlcro as a possible forerunner for pilcrow, but sound-wise it’s more tenable than pargrafte or even pylcrafte and the idea of tidying up or summing up a neat group of text might fit the sense of it, as well. So, it’s possible that there is a forerunner to pilcrow (perhaps pulcro or two words combined, or something else) that is not directly descended from paragraphos.

Whatever the origin of the name, the symbol was used to mark a new section, just as it is now.

The Pliable Pilcrow

The symbol has a very interesting visual history. Sometimes it was little more than a horizontal slash, or a vertical one, as in this Latin text on the right, from around 1100, or as in the example shown above with double slashes.

Sometimes a loop was added to the slash, making it look more like a contemporary pilcrow. That’s not to say every pilcrow was roughly P-shaped. Many didn’t resemble this shape at all.

If we look at medieval documents, there is a symbol called the “capitulum” (the diminutive of the Latin word caput for “head”). The capitulum or little-caput is a C-shape that was used more liberally than our current concept of chapter or paragraph. It could mark a page, a paragraph, or even a sentence, and would sometimes occur mid-text, as well as at the beginning of the line.

In documents with only one color ink, sometimes the capitulum was drawn larger, to distinguish it from the letter C. To further distinguish it from the rest of the text, sometimes that extra vertical slash on the right, used to embellish the character, was extended below the line and superficially resembled the backwards-P.

When colored pigments were available, it was commonly drawn in red and sometimes blue. Alternating the blue and red made it easier to find certain passages and eventually scribes figured out that the colors could have meaning.

In one old ecclesiastical manuscript, there is a legend in the margin that designates blue and red capitula as 1) noteworthy, or 2) as biblical miracles. When used in this way, a capitulum can function as a combination paragraph marker and manicule.

The simple double-slash capitulum was still in use in the 15th century and is shown in its more basic form to the right. Sometimes a stem was added across the top, which makes it look more like an F than a double-slash or a C.

Variations on a Theme

Many of the section markers in the document to the right are drawn with the simple double-slash with an upper stem, but the two at the bottom look more like the letter C except that the bottom stem has a gap. The thicker back to the slash-shape that makes it resemble a C suggests greater emphasis.

Sometimes the shape is in between a backwards-P and a C as in the text to the left. These examples illustrate that scribes weren’t specifically trying to make these symbols look like Pees and Cees and were more concerned with their function than their exact form. When they are larger than the other letters and colored, there’s no problem recognizing their intended purpose.

Why did old manuscripts use these shapes instead of extra blank lines or indents?

Because parchment was expensive. It’s very labor intensive to kill a goat or calf, strip its hide, scrape off the hairs, and then prepare the parchment or vellum so it’s thin enough and smooth enough to use as a writing surface. Cramming the words together and using capitula for the breaks allowed more words to fit on the page.

End Markers

Sometimes a paragraph marker is put at the end, instead of the beginning of a paragraph, similar to the way Fin (end, finished) is used at the end of a story. Most of the time, in old documents, a pilcrow or capitulum symbol is used, but a simple P can also suffice (right).

Languages will sometimes borrow shapes from each other but assign them different values. The “-ris” abbreviation often used in old Latin texts shows up as an end-paragraph marker in German texts.

What does the pilcrow have to do with the Voynich Manuscript? It was too long for one blog, so I continue the topic here.

J.K. Petersen

Postscript 13 May 2020: I have continued to study the enigmatic glyphs that head up each paragraph in the VMS and I am more convinced than ever that they have a pilcrow-like function rather than a letter-like function. In other words, in the VMS, the first glyph in each paragraph is not part of a word, it is a marker-glyph.

I have posted more than one blog on pilcrows and have already published some statistics for how the beginning-glyphs behave, plus I have run more tests since then. These include statistics on the makeup of the beginning-paragraph tokens with and without the first glyph.

This marker-like function is not limited to the VMS. It also occurs in medieval manuscripts, as in the example below.

In the Wellcome Apocalypse (MS49), the two letter-shapes to the left of each paragraph-group more-or-less alternate throughout the manuscript in essentially the same way as the begin-paragraph gallows glyphs in the Voynich Manuscript. They guide the eye to major sections.

But at the same time, there are also capital letters for the beginnings of words and major sections, a pattern that is also characteristic of the VMS. In other words, in MS49, the same letter-shape can serve a marker function and a letter function, something that may also be true for the VMS:

With thanks to Arca Librarian, who alerted us that the Wellcome Apocalypse (MS49) is now available for viewing online.

How to Write Voynichese

26 February 2016

A Quick Course in Writing VMS-Style

Have you ever had the urge to write something in Voynichese that looks reasonably authentic, not just the letters, but the word-structure as well?

You can download the EVA font here (near the bottom of the page) so that you can reproduce the letter-shapes, but if you string them together any-which-way, it won’t come out looking like Voynichese. Keep in mind also that there may be two “dialects” in the Voynich manuscript, two styles of letter-combinations called Currier-A and Currier-B and I’m only going to cover one of them in this blog.

If you would like to read more about the two VMS dialects, you can look here. I haven’t had a chance to read it yet (there’s a lot of good information on the site and not enough hours in the day) but I’m sure it’s informative and I WILL get to it soon (I hope). What follows is based on my own observations.

Character Order

I randomly chose a section of text in the biological section (f77r), and created a simplified version of rules you can follow that will result in a pretty good representation of Voynich glyph-order. It might even give you some sense of how the text is constructed. Note that this applies to the big blocks of text, not the labels.

Let’s start with one of the most common characters, the bench glyph. The bench char has some special properties that allow it to be split apart or to cross over other characters, but I’ll be describing a light version of Voynichese that works well for most of the text, so your brain doesn’t explode from trying to incorporate all the VMS idiosyncrasies on the first go.

If you’re not familiar with the EVA font-set, you can consult a chart here on René Zandbergen’s site. Scroll about halfway down to see the Basic Eva Characters.

In this version of VMS-Write-Lite, based on a limited text selection, the capped bench char and the bench char both behave the same way, but the capped bench char is used slightly more often.

The Rules of Engagement

The bench chars can only appear at the beginnings of words unless preceded by the Gallows P or by EVA-l (ell). There is one exception, but the ink is blobbed and I suspect it was intended to be “cc” rather than a bench char. The bench chars are almost always followed by EVA-e but are occasionally followed by “a”, “o”, EVA-d, or a bench char that straddles a gallows character.
The EVA-l (ell) must be preceded by an “o” or occasionally an “a” unless it’s the first letter in a word. It usually occurs with “o” or “a” about one to four times per line. I found only two exceptions to this on a full page of text. Near the bottom, there is one preceded by a bench char and one preceded by EVA-r.
The EVA-d char (that looks like a figure-8) is always followed by EVA-y and placed at the end of words or by itself UNLESS it spells out “dar” or “d ar” or “dain”. I saw only one exception to this where the text butted against a drawing and there was no room to add EVA-y. The combination EVA-dy is always preceded by EVA-e (it looks like a “c”) unless the EVA-dy stands alone. I found only one exception to this and I strongly suspect the missing EVA-e is a transcription error.
The EVA-r is not quite as common and only occurs at the end of words preceded by “a” “i” or “o” or, occasionally, by itself. It occurs on average about once every couple of lines. Sometimes it is doubled up, but each instance still needs a vowel-shape in front of it and they can be different ones as in the example on the right. There is a peculiar exception that shows up further down the page. If EVA-r occurs at the end of a line, sometimes EVA-y or EVA-ol is added and I suspect this may be to pad out the line length, but I can’t be sure without more study.
EVA-s can easily be mistaken for EVA-r so look carefully to make sure you don’t confuse them. After a while you get a feel for which one it is. The EVA-s occurs at the beginnings and ends of words and sometimes by itself (it’s by itself more often than most VMS characters). Whether it’s at the beginning of a word or by itself, it is almost always followed by EVA-a or occasionally EVA-o. In fact, if you aren’t completely sure if a shape is EVA-s or EVA-r, knowing this can help you figure it out.
EVA-q (which looks like the number 4) is almost invariably at the beginning of words (there are a few rare exceptions), and is almost always followed by “o”. In fact, the “o” is usually connected to the 4 which is notable if you consider that most letters are unconnected or only loosely and inconsistently connected. The 4o is almost always followed by a gallows character and in the uncommon instances where it’s not, it’s usually followed by EVA-d or EVA-l. There are a couple of exceptions in the last paragraph, where it is followed by “cc” and EVA-y but I will say more about these later, because I discovered something about the last paragraph on the page in another section as well.
EVA-i is a rare character and almost always follows “a”. There are a few exceptions but you have to hunt for them.
Gallows-P, Gallows-k and Gallows-t are almost never at the beginnings or ends of words. The exception is the start of a paragraph or a line, where they sometimes begin a word. There are a few rare instances of Gallows-k at the beginnings of words mid-line (remember this is VMS-Write-Lite, which documents only the more common patterns of one dialect as they appear in large blocks of text).
Gallows-k and -t are almost always followed by EVA-e (or EVA-e doubled) or by “ain” but are sometimes followed by EVA-al, -ar, or -y. Oddly, when they are followed by -y, it’s usually near the end of a line.
Gallows-P is usually followed by EVA-ol (at the beginnings of lines) or the bench char.
EVA-e is always midword and is often doubled (“cc”). I found only one exception in the last paragraph on f77r, and it’s a section that has been written over in darker pen (possibly by a different hand) where an “a” or bench char may have been misinterpreted as “cc”.
Gallows characters straddled by the bench char are usually midword, but are sometimes at the beginning. They’re not very common, occurring only four times on f77r. They are usually followed by EVA-e.

So that’s the basic structure in a nutshell for the most common characters.
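
If you want to test a word against these rules, here is a rough sketch of a checker. It assumes words are typed in lowercase EVA transliteration, and it covers only a reduced subset of the rules above (EVA-q, -l, -d, -r and -e). The function name and the rule wording are mine, and the "almost always" rules are treated as strict, so real text from f77r will trip it now and then.

```python
def violations(word):
    """Return the simplified VMS-Write-Lite rules that an EVA word breaks."""
    found = []
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # EVA-q starts a word and is followed by "o".
        if ch == "q" and (i != 0 or nxt != "o"):
            found.append("q must start a word and be followed by o")
        # EVA-l follows "o" or "a" unless it is the first letter.
        if ch == "l" and i != 0 and prev not in ("o", "a"):
            found.append("l must follow o or a unless word-initial")
        # EVA-d ends a word as "dy", except in "dar" and "dain".
        if ch == "d" and word[i:] not in ("dy", "dar", "dain"):
            found.append("d must end a word as dy, dar or dain")
        # EVA-r needs a vowel-shape in front of it unless it stands alone.
        # (End-of-word position is not enforced, to allow doubled forms.)
        if ch == "r" and len(word) > 1 and prev not in ("a", "i", "o"):
            found.append("r must follow a, i or o unless alone")
        # EVA-e is always mid-word.
        if ch == "e" and (i == 0 or nxt == ""):
            found.append("e must be mid-word")
    return found


for sample in ["qokedy", "dain", "oqe", "kl"]:
    print(sample, violations(sample))
```

The bench and gallows rules are left out because they involve multi-character EVA codes and line position, which would need a proper tokenizer and line-aware input.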

Summary

You may have noticed that VMS glyph position is quite rigid. The order and position of glyphs rarely varies from a strict set of rules, rules that are not characteristic of natural languages, rules that apply not only to letter-glyphs and glyph combinations but to their position in lines, as well.

This is one of the reasons why one-to-one substitution codes are unlikely to get good results. It also brings up the question of whether glyph combinations represent one letter rather than two, but if the manuscript is interpreted this way, then the content would be very sparse and the word-lengths so short that spaces would have to be considered either contrived or arbitrary or, if they are abbreviations, a key to the abbreviations would have to be in the head of the writer or in another document.

In English, we have a few conventions that one can relate to the VMS… for example, the word “the” typically precedes a noun, and the letter “q” is almost always followed by a “u”, and there are some common letter combinations like “sh” and “th” that might correspond to double letters in a ciphered script, but English is a mongrel language with many loan words from French, Norse, and other languages, and thus has considerable variation in how letters can be combined and where they may be positioned in a word.

It’s probably more fruitful to look at ancient languages and some of the Asian languages to find text with a structure similar to the VMS. In terms of letter order and positioning, the syllabic languages (where a predetermined set of syllables is combined in specific ways) and abjads (languages that are typically written without vowels) have more in common with the VMS than English, German, French, Latin, Spanish or other European languages. It’s also possible that the VMS text has mathematical underpinnings rather than a direct relationship to a natural language.

This rigidity, and a character set that is constrained by positional rules, may also account for the extreme level of repetition that is found in the VMS. Significant repetition is not uncommon in medieval documents (it happens frequently in recipes, calendars, charms, and chants), but it’s also possible that the repetition in the VMS results partly from the textual structure.

If you like to code, you might enjoy generating some text algorithmically based on this rule-set combined with some study of the referenced folio. Even if you don’t like to code, studying the glyph-grammar of the VMS text might yield some insights on how it was constructed.
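
As a starting point, here is a minimal sketch of such a generator. It assumes EVA transliteration with "ch" and "sh" standing for the plain and capped bench chars, and it uses only a handful of word templates loosely drawn from the rules above. The template lists, the equal weighting, and the function names are all my own simplifications, not measured from the folio.

```python
import random

# Building blocks in EVA transliteration (assumed: ch = bench, sh = capped bench).
GALLOWS = ["k", "t"]
BENCHES = ["ch", "sh"]
GALLOWS_TAILS = ["edy", "eedy", "ain", "al", "ar"]
BENCH_TAILS = ["edy", "eedy", "ol"]
SHORT_WORDS = ["dar", "dain", "ol", "ar", "or"]


def make_word(rng):
    """Build one word-token from one of three simplified templates."""
    kind = rng.choice(["qo-gallows", "bench", "short"])
    if kind == "qo-gallows":
        # "4o" followed by a gallows, then e/ee + dy, or ain/al/ar.
        return "qo" + rng.choice(GALLOWS) + rng.choice(GALLOWS_TAILS)
    if kind == "bench":
        # Bench char at the start of a word, usually followed by EVA-e.
        return rng.choice(BENCHES) + rng.choice(BENCH_TAILS)
    return rng.choice(SHORT_WORDS)


def make_line(n_words, seed=None):
    """Return a line of n_words generated tokens separated by spaces."""
    rng = random.Random(seed)
    return " ".join(make_word(rng) for _ in range(n_words))


print(make_line(8, seed=77))
```

Pasting the output into a document set in the EVA font gives a rough visual impression. Line-position effects, such as gallows at the starts of paragraphs or padding at line ends, are not modeled here.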

J.K. Petersen

The Strong Solution 6 Feb. 2016

The Strange Story of Leonell Strong

Antiquarian Wilfrid Voynich rediscovered the VMS in a cache of old books in Italy but never solved the mystery of the text.

In 1945, Leonell Strong claimed to have solved the mysterious text of the Voynich Manuscript. He was not the first to attempt to decipher it after antiquarian Wilfrid Voynich acquired it and brought it to America as the Great War broke out in Europe.

In his lifetime, Wilfrid Voynich, a book dealer, corresponded with many people in an effort to decode the VMS and solidify its provenance. If it could be connected with important historical figures, the value would increase and Voynich, a businessman, would profit from his investment.

Voynich died in 1930, no wiser about the contents of the manuscript than when he began. After his death, his wife, Ethel Voynich, continued to try to unlock its secrets, to no avail. William Friedman, an eminent cryptologist, initiated a study group to decipher it in 1944 but, with the war looming large (and perhaps because of lack of progress), the study group was disbanded, in 1946.

You can read an extensive history and ongoing research at voynich.nu.

The manuscript was eventually sold to Hans P. Kraus, who also failed to decode it or sell it at his asking price of $160,000. Kraus eventually donated it to the Beinecke Library, in 1969, where it remains to this day. Before this happened however, Leonell Strong, cancer scientist and amateur cryptographer, came into the picture around the same time Friedman’s study group was trying to decode the manuscript.

The Strong Approach

Leonell Strong claimed to have decrypted the text based on analyzing photostats of two of the VMS folios, which he refers to as Folio 78 and Folio 93. There had already been articles about the manuscript published by John M. Manly and Hugh O’Neill in Speculum, in 1921 and 1944, so he was not starting from a blank slate. Based on its format and illustrations, it was already assumed by the 1940s that it might be an herbal and medical text with a particular emphasis on women’s health.

Strong was eager to publish the medical-related information he felt he had uncovered, but he didn’t explain his solution because he wanted to decode more of the pages and was earnestly trying to acquire more photostats.

Strong claimed the reason he didn’t want to reveal his decryption method was because of “present war conditions”. My guess is that he felt the information in the manuscript, if any of it provided unique insights into medieval remedies, would constitute a treasure trove of publishable articles and if he was the first to decipher it, he could benefit from writing up his discoveries. If he revealed his decryption scheme too soon, others might get the data first.

Despite considerable efforts—that were apparently rebuffed—he never received any additional pages. It has been said that Strong died without revealing his methods, but there are notes to his thought process and if you follow those notes you can puzzle out what he did and where he went wrong and why we are still trying to decode the VMS.

Publications

Strong described some of his findings in an article in Science (June 1945), in which he summarizes the background of the manuscript, including the assumption, by O’Neill (1944) that the manuscript must post-date the journeys of Columbus because the VMS includes New World plants (a theme revived in January 2014 by Tucker and Talbot in HerbalGram).

Strong claimed that the VMS was based on “… a double system of arithmetical progressions of a multiple alphabet…” and that the VMS author was familiar with ciphers discussed by Trithemius, Porta, and Selenius as well as one of Leonardo da Vinci’s documents. These historic treatises date from the late 1400s to the 1600s, long after the VMS is thought to have been penned.

Strong also claimed that certain of the “peculiar” glyphs in the VMS are mirror images of Italian letters but doesn’t explain exactly which VMS letters he means.

Given that Strong wasn’t very good at reproducing the VMS characters himself (the slants, connections, and pen sequence are mostly wrong), his analysis of what inspired the shapes is questionable—VMS shapes are found in many alphabets, including those around the Mediterranean and those in ancient documents recording dead languages.

Strong made further assumptions about what constitutes the VMS “alphabet”. In his chart, he excluded “j” and “z” and included both “u” and “v”. This works for some languages, but not for others. Clearly his assumptions were already influencing his choice of how the information was encoded when he had barely begun, and his charts further indicate that he never looked beyond a substitution code, even if approached in a reverse numeric fashion.

Anthony Askham—the VMS Author

Many have criticized Strong’s decryption scheme based on his contention that the author of the VMS is Anthony Askham, an English academic active in the mid-1500s. I think the more important question is whether Strong’s decryption process was viable and accurate. Conjecture about who wrote it can come later and the decryption itself shouldn’t be discounted because the hypothesis about who wrote it may be wrong.

I won’t go through Strong’s entire process here because it’s too long for one article (and there’s no point in detailing a method that doesn’t work). In brief, he created a series of frequency analyses of characters and mapped them to similar analyses of a few European languages. After assuming which language most closely matched the VMS, he created charts trying to relate various Latin characters to VMS characters for that language, dating each attempt over a series of weeks.
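
For readers unfamiliar with this kind of frequency matching, here is a minimal sketch of the general idea: rank the symbols in each text by how often they occur, then pair them off by rank. This is a generic illustration, not a reconstruction of Strong's charts. The sample strings are placeholders I made up, and the function names are hypothetical.

```python
from collections import Counter


def ranked_symbols(text):
    """List symbols from most to least frequent (ties broken alphabetically)."""
    counts = Counter(ch for ch in text if not ch.isspace())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [sym for sym, _ in ordered]


def rank_mapping(cipher_text, reference_text):
    """Pair the n-th most common cipher symbol with the n-th most common letter."""
    return dict(zip(ranked_symbols(cipher_text), ranked_symbols(reference_text)))


cipher = "qokedy qokedy chedy dar"   # EVA-style placeholder, not a real transcription
reference = "the tree then there"    # stand-in for a reference-language sample
mapping = rank_mapping(cipher, reference)
print(mapping)
```

A mapping like this depends entirely on which reference language is chosen and on the assumption that one glyph equals one letter, which is the weak point discussed in this post.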

Where Strong Becomes Weak

And now we get to the important part and the reason Strong’s method, already based on a series of possibly incorrect assumptions, doesn’t work. But first, what were the results of his decryption? Here’s a sample of the decrypted text which he describes as medieval English:

WIT SEEK TO EDIT NOT IDLE/IDEL? FOKLUORE FIT ES ME I MEATH TRUNNG IQUERI SELFLI O’ER IT NICLY RUTEN GLAVE QUIR ONGI SEM TE BELI’D

Apparently, Strong was told in no uncertain terms that this was not medieval English and made some later efforts to map the text to Gaelic, apparently without success (or maybe he just gave up).

So why is the text above not medieval English?

To list a few of the more obvious examples:

They don’t have the word “seek” in Old English. In the sense of searching for something, they say áséc or sēċan or, if you’re seeking out something, you can say gitan or begeten. In old Norse and Dutch it’s søk/soek and German, suchan. In Middle English, sēċan became seken.
Meath isn’t a word, nor is trunng, although -rung was a common suffix in Old English (e.g., clatrung describes a clattering sound).
Iqueri isn’t a word in medieval English. It looks more like Latin and while Latin was often mixed in with Old English, it was not usually done in this way and doesn’t mean anything unless you break it into two words.
Selfli isn’t a word, although self– can be used as a prefix (as in selflicne which can mean self-centered or self-satisfied). If the words around it made sense, you could argue that selfli was an abbreviation for selflicne, but the context doesn’t appear to support this interpretation.

Taken together, there are too many words that aren’t really words, they just look familiar (I’ll explain why below), and the grammar doesn’t pan out either. Even if you evaluate it as “note form” writing, it doesn’t appear to have coherent meaning.

Let’s take another passage, quoted by Strong in his article submission, that seems more credible:

HSAWE-TRE APLE ETTEN VNLICH ARUMS CAN DRAVE WICKS AIR FROM SPLEEN: LIKE SISLE HE DRIS GAS AUT OVARI.

This seems as though it might be real medical information, about eating apples and using arum (which Strong interprets as alum without explaining why it might be alum rather than arum lily) and driving air from one’s spleen as well as driving gas from the ovary.

To understand why this isn’t any more credible than the previous quotation, you have to look at how Strong arrived at these words. Did he really decrypt the letters or did he look at many possible combinations of letters and simply guess, for each individual word, what it might be?

The Madness in the Method

How did Strong arrive at these tokens that look so much like real words?

Once he had a system worked out for mapping the VMS letters to Latin letters, he began evaluating each VMS word-token on its own against a list of “alphabets” he had developed for decipherment. In other words, he had several rows of letters (based on letter frequencies) that each VMS letter might represent. Note the column numbers on the far left. He was saying that A could be any of several VMS glyphs, B could be any of several glyphs, etc., on through the alphabet.

Even if you ignore all of his previous assumptions about language and which glyphs constitute the “alphabet”, and his assumptions about character frequency (based on already deciding on the underlying language), even if all those assumptions were correct, here’s where Strong over-reaches in his eagerness to find meaning in the VMS characters.

Strong created a set of index cards with the possible letter correspondences to each VMS glyph. You can see three of the word-tokens recorded in this example in terms of possible letters from the chart mentioned above.

The first has eight different possible interpretations of the six glyphs in the word token, the second has eight interpretations for five glyphs and the third he wasn’t so sure of (it may comprise less common glyphs) and thus he only proposed five for the five glyphs in the third example.

Under each one is the decrypted word. Strong has written ciphre, swais and lunar. How did he arrive at these? From what I can see, he took a letter from each column and combined them with the others until it became something that looked like a word.
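
To get a feel for how much freedom this leaves, here is a small sketch that counts the possible readings for the three cards described above. It assumes every glyph on a card has the same number of candidate letters and that one letter is picked per glyph. That is my reading of the cards, not something Strong states.

```python
from math import prod


def reading_count(candidates_per_glyph):
    """Number of letter strings obtainable by picking one candidate per glyph."""
    return prod(candidates_per_glyph)


cards = {
    "six glyphs, eight candidates each": [8] * 6,
    "five glyphs, eight candidates each": [8] * 5,
    "five glyphs, five candidates each": [5] * 5,
}

for label, columns in cards.items():
    print(label, reading_count(columns))
```

Under these assumptions the first card alone allows 262,144 letter strings, so finding a few that look like words is not surprising.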

He doesn’t appear to be following a mathematical model even though he described it as a mathematical cipher. In fact, examining all the available index cards, it looks like he inserted letters when he couldn’t create a word in a linear fashion. I have no proof of this, but based on the words noted on 13 index cards, it strongly appears as though his word formation process was subjective. There’s no sign of him uncovering a key, as would be needed for the Porta cypher, or of him necessarily having the alphabetic sequence correct, an important aid in deciphering double ciphers with this structure.

If Strong could come up with a word by using a letter from each column, he did so. If he couldn’t get all of them to work together, he made something up to fill in the gaps. The words themselves surely came from his own vocabulary, since other word combinations are possible but he didn’t list them. For example, a token he interpreted as “childe” (which works for the first three columns but not the remaining two) could also be deciphered as POLLIS, DOGFAR, COWHAG, PURPLO, SOWGAS, LOGLAD, LOWGAS, FORLAG, OWLPAR, or several others, using only the letters listed and not adding anything that isn’t (and that’s only if you look for English-sounding tokens).

The next one, interpreted as YOV (YOU?), can just as easily be read as YOR, TOR, POT, GOT, GLO, PIT, TIT, GOO, POO, or POX using his system, so he’s not only subjectively creating the words, he’s subjectively choosing which, out of many possible words, might fit with the words that precede or follow it and then fitting those into his assumption that the text was about plants and medicine.
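
The ambiguity is easy to demonstrate. In this sketch the three columns are rebuilt only from the alternative readings listed above, so they are a lower bound on what Strong's card actually offered, not a copy of it. The variable names are mine.

```python
from itertools import product

# Readings mentioned in this post for the three-glyph token.
listed = ["YOV", "YOR", "TOR", "POT", "GOT", "GLO", "PIT", "TIT", "GOO", "POO", "POX"]

# Candidate letters per glyph position, inferred from the listed readings only.
columns = [sorted({word[i] for word in listed}) for i in range(3)]

# Every string obtainable by taking one letter from each column.
candidates = ["".join(letters) for letters in product(*columns)]

print(columns)
print(len(candidates))
```

Even with these reduced columns there are 60 possible strings for a three-glyph token, and choosing among them is left to the decipherer.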

It’s easy to assume from the drawings that the text is about medical folklore, and that might be the simplest explanation, but we don’t know for certain if the person who created the drawings also added the text. There are herbals from that period that contain only images because the text was never added, so it’s possible the text was added to the VMS by someone else and is sensitive political commentary or historical, rather than relating to plants. Maybe an unfinished herbal compendium was taken into enemy territory as a ruse (the way a botanist was included in one of the European spying expeditions to the Ottoman palace). Perhaps spy observations were added around the drawings.

Summary

Strong assumed English was the underlying language of the VMS based on creating frequency charts for only a few languages and on the assumption that each VMS glyph represented one character. From that very significant assumption, he tried to create English-sounding words by juggling his letter frequency charts and their derived possible alphabets.

Unfortunately, even with a subjective infusion of natural-sounding syllables, most of the decrypted text is nonsense and none of it fits any known version of medieval English from the 14th to 17th centuries.

Strong will be remembered for his contributions to oncology and the study of genetics in mice, but his status as a cryptographer will have to remain in the amateur category—a hobby, which means we still have a mystery to solve.

J.K. Petersen

On the Bench 3 Feb 2016

Holding Hands

Bench courtesy of Creation Woods.

There’s a character in the VMS sometimes referred to as the “bench” character. It may seem odd to us, but in medieval documents, this is quite a common ligature, sometimes representing “ce”, sometimes “cr”, sometimes “er”. It depends on the language and the context. It was written as one character to follow the flow of the hand. It also somewhat resembles the Greek letter Pi (although it’s a bit curvier).

No one is certain what it represents in the Voynich manuscript and it’s not entirely clear if it’s a ligature or a character on its own. Sometimes it is plain (as shown above) and sometimes it has a cap (see right). The shape and position of the cap varies quite a bit but the bench underneath tends to behave in fairly consistent ways, ways that are similar to the plain bench.

The bench is a common character. It’s found throughout the manuscript, often multiple times per line, and it is frequently at the beginning of glyph groups. Benches don’t usually sit next to each other, but there are exceptions (right).

Friends on the Bench

The bench glyph has an interesting property that distinguishes it from other shapes. Sometimes it stretches over other characters that are tall, with straight stems, commonly known as “gallows” characters. This creates a combination shape (or perhaps a ligature or a shape with an entirely different meaning).

Sometimes the connection between the left and right sides of the bench is broken and appears to have been intentionally written this way (rather than it being a slip of the pen). There are numerous examples of separated bench characters, but the majority are joined, so it’s difficult to tell whether it’s meant to represent one character or two (or something else). The disconnect happens with both plain and cap benches.

Apparently, the bench can cross any gallows character, although some combinations are less common than others—a bench crossing a one-loop “P” is quite rare.

Sometimes glyph combinations that look similar differ in whether the bench character has a cap.

In some parts of the manuscript (e.g., some of the plant sections), it’s uncommon to see a bench and a gallows-bench next to each other. In other parts, like the bathing nymph sections, it’s not uncommon. Capt. Prescott H. Currier suggested that there may be two “languages” (two different glyph-combination systems) underlying the VMS. These have been named Currier A and Currier B.

You can see how frequently benches are used in the example on the left from one of the plant pages. In this small selection of text, there are three gallows benches, two plain benches and a cap bench. Note how several of the bench characters are followed by small c-shaped glyphs. This is a common pattern. Note also that the P-bench and the gallows-bench following it are not usually combined this close together, in the same “word” token.

There’s a curious half-bench that appears in some of the combination glyphs. Sometimes the scribe drew only the left or right side of the bench, but it does appear to be distinct from the curved “c” shape in that the top is longer and straighter than the VMS “c” (sometimes even longer and straighter than this example on the right).

Sometimes the half-bench stretches across a gallows character and attaches itself to another bench on the other side (or perhaps it’s a full bench followed by a half-bench—there’s no way to tell). This long string of cap and plain benches is not common.

There are many bench characters on Folio 1r, including gallows benches, and near the bottom is this character (right) with only the right side of a bench. This half-bench is attached to a cap bench and then what may be another half bench. You have to examine it carefully to try to puzzle out which parts belong to which because the line attaching the two parts of the cap bench is very faint.

I wrote earlier that a bench can only cross a gallows character, and this is generally true, but there are a few rare instances in which a small glyph is inserted under the bench. It looks as though this is intentional since the leg of the gallows character is shifted to the left to make room.

To the right are examples of plain benches followed directly by gallows benches. The glyph combinations of the two words are very similar except for the additional glyph in the character group on the right. This form of repetition, where the following word differs from the previous by only one character (either by changing a character, or by adding or subtracting one) happens frequently in the VMS and is one of the reasons people have questioned whether there is sense or nonsense underlying the unusual glyphs.
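
For anyone who wants to hunt for this pattern in a transcription, here is a minimal Python sketch. It checks whether two neighboring tokens differ by exactly one character (one changed, one added, or one removed). The tokens are invented placeholders, not readings from the manuscript, and the function name is my own.

```python
def differs_by_one(a: str, b: str) -> bool:
    """True if b is a with exactly one character changed, added, or removed."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    shorter, longer = sorted((a, b), key=len)
    # Dropping one character from the longer token must give the shorter one.
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))

# Invented placeholder tokens standing in for one transcribed line.
line = ["bokar", "bokary", "dokary", "sain", "bokar"]

near_repeats = [(x, y) for x, y in zip(line, line[1:]) if differs_by_one(x, y)]
print(near_repeats)
```

The result depends heavily on the transcription scheme, since a shape counted as one character in one scheme may be split into two in another.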

Are There Other Bench Oddities?

There is a very different bench (right), in the naked nymphs section, that stands out as fractured, globby, and unconventional. There are anomalies in the VMS that suggest someone may have tampered with several parts of the manuscript, so it’s possible this bench-gallows, which is at the end of a line, was added by someone else and may not mean anything at all.

Convenient to Write or a Different Character?

Is the bench character a ligature, a combination character, or a convenient way to write a sequence with less movement of the hand?

I found this intriguing example on the right that has a plain bench on either side of the gallows but does not cross the gallows. If it’s intentional that there are two separate benches enclosing the gallows-P without crossing it (and not misdirection or a lapse of habit) then it might suggest that the stretched bench represents something other than a quick way to write a gallows with a bench on either side.

One of the difficulties in trying to crack the Voynich code is determining how much meaning might be attached to each shape. If you’re not sure whether a shape represents one, two, three, or more characters (or concepts), then creating algorithms in your mind or on a computer entails a lot more trial and error.
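
As a back-of-the-envelope illustration, the Python sketch below counts the ways a short glyph group could expand if each shape might stand for one, two, or three letters. The cap of three and the group size of five are arbitrary numbers chosen for the example.

```python
from itertools import product

def length_patterns(n_glyphs: int, max_letters: int = 3):
    """Every way to assign 1..max_letters letters to each of n_glyphs shapes."""
    return list(product(range(1, max_letters + 1), repeat=n_glyphs))

patterns = length_patterns(5)
totals = [sum(p) for p in patterns]

print(len(patterns))             # distinct length assignments for 5 glyphs
print(min(totals), max(totals))  # shortest and longest possible plaintext
```

Even before any letters are guessed, a five-glyph group already has hundreds of possible shapes for its plaintext, and the count multiplies with every added glyph.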

I’ll leave you to ponder that example and decide whether the dynamics of the bench character can help us better understand the VMS “alphabet”.

J.K. Petersen

Voynich Script – The Leaning Letter and Why I Never Use the Eva Font

9 Jan 2016

Delving into the Details

I am constantly asking myself what I can learn about a person who lived almost 600 years ago when all that remains are enigmatic pictures and about 200 pages of inscrutable text, text that has resisted the efforts of thousands (perhaps tens of thousands) of eager amateur and professional code-breakers.

It probably took a long time to create the VMS—hundreds of drawings, a couple of hundred pages of text. Writing and drawing with a quill takes considerable effort—something a generation of keyboarders might not fully appreciate.

There were no ballpoint pens in the 15th century. Ink was hand-mixed from iron particles, vinegar, and oak galls, and getting the right consistency was important. To create the pen, someone had to pluck the feathers of a goose (preferably one that’s been given last rites—live geese will bite your knees off).

Even if you braved muddy streets crawling with rats and bought the ink and quills premade from a local craftsman, you had to trim your quill with a knife every few pages as the end of the nib wore away. If the angle or width of the nib changed, the text would be inconsistent. If you didn’t scrape enough ink off the nib right after dipping, the ink would drip or blob on the page. If you waited too long to redip, the ink would run out, the text would be too light and a few letters would have to be overdrawn without smearing.

Unlike a fountain pen, which provides a smooth flow of ink, the medieval scribe had to control the flow of ink using rhythmic movements of hand and wrist—a process similar to coaxing good sound out of a musical instrument.

Given the labor involved, writing nonsense-text would take almost as long as writing meaningful text.

What Penmanship Says about the Penman

15th c Lombardy calligraphy

Looking at the physical balance of the VM letters and lines, one would have to call it handwriting rather than calligraphy. The VM scribe was not an expert penman. As I mentioned in a previous post, the angle of the pen is not optimal for enhancing the aesthetic qualities of the pen strokes and the VM letter forms are not consistent enough to qualify for professional penmanship in an age when jaw-droppingly beautiful illuminated manuscripts were crafted by expert scribes.

Similarly, the “second script” on the last page of the VM (which may be in a different hand) lacks the artful shapes and consistency of high-quality penmanship.

Even if it’s not up to calligraphic standards, the handwriting in the VM is careful, measured, and clear. In this respect, and in the fairly broad spacing, the VM script is more like 13th century Carolingian than Gothic cursive. That’s not to say it resembles Carolingian style (other than the broad spacing), but it is more readable than many medieval documents of the 15th century that were penned by amateur scribes—a detail that may say something about the personality of the author.

Discerning the Devil in the Details

This article doesn’t focus in depth on the writing style of the VM. I’ll do that in a separate post (there’s plenty to say about the spacing, the size of the letters, and how they are written). Instead I’d like to get to something more crucial: a clue that the VM author had a working knowledge of Latin scribal conventions, which suggests classical training.

The Structure of the Text

There have been a number of computational attacks on the Voynich—attempts to discern its structure through computer analysis. Some of these are quite interesting (and probably worthwhile), others are based on faulty premises, but at least they yielded some funky graphs.

I’m in favor of computational attacks—they further our understanding of computer analysis, even if nothing else. I’m also in favor of computational attacks on the VM, even though many are based on the assumption that there’s a one-to-one correlation between VM glyphs and actual letters/sounds which, in my opinion, is a very shaky assumption.

One important detail about the VM text that I mentioned in my July 2013 zodiac post, and which I’d like to elaborate further, is the use of Latin abbreviations. The entire text incorporates writing conventions that were common in the 15th century, including the abbreviations in the zodiac labels written by another hand next to each animal.

Latin Conventions

Latin scribal abbreviations are shapes that stand in for letters. Some are used within words, some at the beginnings and ends of words. I’m not going to give a full tutorial on Latin sigla, since many of the conventions are outside the scope of the VM, but I’ll mention ones that are directly relevant.

The “9” abbreviation. The shape that resembles the number nine is used at the beginnings and ends of words. At the beginning, it typically stands for con– or com-. At the end, it is usually –cum or –cun but can also mean –us or –os or occasionally -is (particularly if it is superscripted). The 9 as a suffix is common to many manuscripts. The 9 is used as a prefix as well, but less often.

In the Voynich manuscript, the 9 is used in contexts similar to those in Latin and Germanic texts: it shows up frequently at the ends of “words” and occasionally at the beginning. If one hunts through the VM, one can find exceptions where it appears midtext, but that can happen in Latin as well, where it usually stands for -er or r.
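
A positional claim like this is easy to check against a transcription. The Python sketch below tallies where a chosen shape falls within each token. The token list is invented for illustration (with "9" standing in for the 9 shape), and the four position labels are my own.

```python
from collections import Counter

def position_counts(tokens, glyph):
    """Tally where glyph occurs within tokens: alone, initial, medial, final."""
    counts = Counter()
    for token in tokens:
        last = len(token) - 1
        for i, ch in enumerate(token):
            if ch != glyph:
                continue
            if len(token) == 1:
                counts["alone"] += 1
            elif i == last:
                counts["final"] += 1
            elif i == 0:
                counts["initial"] += 1
            else:
                counts["medial"] += 1
    return counts

# Invented placeholder tokens, not taken from any real transcription.
tokens = ["oca9", "9tor", "cco9", "da9r", "o9", "9"]

counts = position_counts(tokens, "9")
print(counts)
```

Run over a full transcription, the same tally could be compared against counts from a Latin manuscript to see how closely the positional habits match.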

c/e with a tail. A shape that looks like a c or e with a tail has meaning similar to the suffix 9 (con, cum) except that the shape usually stands alone rather than being attached at the beginning or end. In the VM, it’s sometimes difficult to tell from the spacing whether a character is intended to be read by itself or is associated with nearby glyphs. In the VM example to the right, the spacing is similar to the 14th century Latin document above it, but the distinction between this character and others is not always so clear.

It perplexes me when I see “decodings” from Voynich researchers who rigidly assume a one-to-one relationship between character glyphs and their underlying meaning (assuming there is an underlying meaning). Even if you put aside the possibility of 1) ligatures, 2) medieval abbreviations, and 3) null characters, you still should not assume one VM glyph equals one letter. It could be one, two, three, or more. If it were a one-to-one substitution code (a simple substitution cipher, of which the Caesar cipher is the most familiar special case), the mystery would surely have been solved centuries ago.
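
The reason a one-to-one code is so fragile can be shown in a few lines of Python. In this sketch a random one-to-one key is applied to an English sentence, and the sorted letter counts are compared before and after. The sentence and the key are arbitrary, chosen only for the demonstration.

```python
import random
import string
from collections import Counter

def substitute(text, key):
    """Replace each letter by its partner in key; leave other characters alone."""
    return "".join(key.get(ch, ch) for ch in text)

def profile(text):
    """Letter counts sorted from most to least frequent, ignoring which letter."""
    return sorted(Counter(ch for ch in text if ch.isalpha()).values(),
                  reverse=True)

# Build an arbitrary one-to-one key over the lowercase alphabet.
rng = random.Random(0)
letters = list(string.ascii_lowercase)
shuffled = letters[:]
rng.shuffle(shuffled)
key = dict(zip(letters, shuffled))

plain = "the quick brown fox jumps over the lazy dog and the cat"
cipher = substitute(plain, key)

print(cipher)
print(profile(plain) == profile(cipher))
```

The letters change but the shape of the frequency distribution does not, which is the handle frequency analysis grabs. Once a glyph can stand for two or three letters, or for nothing at all, that handle is gone.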

Other Numbers

The number 9 is not the only number used in Latin manuscripts. The numbers 2 and 4 have significance as well (as does the number 3, but it’s not found in the VM and won’t be discussed in this post).

The 4 on the left is paired with a superscripted o in a 14th century Latin manuscript. Voynich fans might recognize the similarity to the Voynich 4o. The context is a little different, however. In Latin documents, 4o usually stands alone, while in the VM, it’s typically at the beginnings of glyph groups—it doesn’t follow Latin positioning conventions as closely as the number 9.

In Latin, the 2 is contextually similar to suffix 9—usually at the end of the word, often superscripted, and typically means -ur. A character that resembles a 7 is also commonly used to denote et or e (and is the basis of the abbreviation etc. which started life as a ligature between the 7 shape and a c). Less often it stands for –us or –que. The 7 shape does not appear to be represented in the VMS.

The Curling Tails

The tails of some of the VM characters swoop up and over the letter, and do so in a consistent way. In the 15th century this wasn’t a mere embellishment, the tail carried meaning. In Latin and Germanic text, the curved tail usually represented m or sometimes n. If it’s midtext, it appears as a line above the letters. At the end, it’s easier to swoop back the tail rather than lifting the pen.

In Latin documents, the swooped-up tails often follow the letter u to create –um. In Germanic texts, they often follow ai to form ain (an old form of ein). Many of the German scribes also wrote in Latin and retained some of these conventions when writing in their native tongue, as illustrated in the two examples on the right.

The Caps

In September 2014, Stephen Bax asked me the meaning of the curved caps that appear over some of the VM glyphs. I’m sure many people are wondering the same thing.

I didn’t answer right away, because the question can’t be answered in a few sentences. It depends on context. In fact, a blog post can only scratch the surface of VM conventions.

In Latin, the cap has a variety of shapes. Sometimes a different shape has a different meaning and sometimes different shapes have the same meaning (but vary stylistically based on individual hands).

In most cases a closed cap (an “o” shape) represents something different from an open cap. This is also true of German script, in which the open cap usually symbolizes –er- (or sometimes an old-style umlaut) and the closed cap the sound “oo” (and is placed over a u the same way as an umlaut in modern text). I’m not going to go into details on the cap in this post, because it needs a full post of its own and I want to concentrate on another shape that may be more important.

The “J” Character

One of the less common characters in the VM alphabet is the J-like character found at the ends of words. In Latin, this is easily recognized as the suffix –cis, –ris, or –tis and sometimes is used as –is when attached to a different beginning stroke. Note how the connection to the beginning stroke is sometimes blunt and sometimes rounded (the Eva font acknowledges this difference but neglects the third variation). That’s how it’s done in Latin, as well. I’ve seen some pretty odd proposals for what the VM J-shape represents but the shape (if not its meaning) is standard medieval Latin script.

All three suffixes can be found throughout the VMS and it’s possible that the one-loop gallows character may be an example, as well. Picture it as a ligature comprising a letter and the abbreviation –is. I’m not proposing this as a theory, just encouraging people to remain open to possibilities. In Latin, the shape of an abbreviation may change depending on its position in a word.

Putting aside the gallows character for now, the VM J character is almost always at the end of glyph-groups, but there are exceptions (you can find some on Folio 58r). Occasionally, two can be found combined, but even this is not particularly exceptional, since it’s possible that valid syllable combinations analogous to –ristis exist in whatever language underlies the VMS.

The Funny “r” Character

This brings us to one of the shapes that might be overlooked due to its resemblance to contemporary alphabets. Since there are several VM glyphs that resemble familiar western European characters, like the a and the o, it might be easy to perceive this as an embellished r. In fact the Eva font maps it as a lower-case “r”. I’ve also seen people refer to it as a question mark, but that loop is definitely a tail, not the primary stroke of the letter.

There’s a Latin abbreviation similar to this that represents -ter. It’s a combination of a “t” (which often looks like a “c” in old manuscripts) with a curled tail, representing er. It looks like the VM “r”, but the stem is usually upright. Which brings us to an important detail I haven’t seen mentioned anywhere else…

Have you ever asked yourself why this funny character that somewhat resembles an “r” leans backward when all the other “stem” characters (with the possible exception of a character that resembles an “i”) are straight up and down? Maybe not. Just as you may not have considered a possible relationship between the suffix-J character and the loop on the gallows character.

Context is everything when interpreting a 600-year-old document.

I’ll explain why I think the lean in the “r” is important and why it helps confirm the idea that the VM scribe was familiar with classical Latin conventions. I’m proposing that it may be based on another Latin abbreviation that is usually found by itself, between words, although sometimes in other positions.

The letter that resembles a 2 in Latin manuscripts usually stands by itself, between words, but it can also be found in the end position (usually as a superscript).

Now use your imagination and picture this character rotated 60 degrees clockwise. Then you get a character that not only resembles the “r” with a tail but which appears in the VM by itself, between words, and sometimes at the ends of words. Even when it appears to be part of another word, the space between it and that word is sometimes greater than the distance between individual letters, which makes you wonder if it’s intended to stand alone or form part of the nearby group. In other words, the context of the VM “r” is similar to the way the “2” is used in Latin even if the shape, in this example, is not in the same orientation.

You don’t always have to rotate the character to recognize it. Here’s the same abbreviation in a different Latin document (14th c) in more angular handwriting. If you imagine the tail more smoothly curved, it resembles the VM glyph’s orientation without rotating it. One can also find examples where the tail (the backward swoosh) is more curved.

In the superscripted suffix position, the “2” or “r” (which sometimes looks like a leaning S) may represent –ur or –er. When it’s written as a suffix in-line with previous characters, it usually represents –re or –ri or sometimes –er, but this form would normally have an upright stem. In Latin, it’s less common to see it midword, but when it is, it can mean almost anything (and sometimes represents as many as five letters).

Darn Those Details

Learning medieval abbreviations is not as easy as looking at a chart. Some symbols are fairly consistent (mainly the suffixes) but many can only be interpreted in relation to the letters around them, which means you have to know the underlying language to understand whether the symbol stands for one character or many and to determine which ones they are.

The example above-right (from a 12th century Italian manuscript) is a relatively small snippet and yet is packed with Latin abbreviations, including pro-, der-, -us, -os, n, m, -er, -uo, -s-, con-, prae- and others.

Just when you think you’re getting the hang of it, another snafu comes along and you discover a symbol you thought you understood has other functions, as well. Like the J character that stands for -cis, -ris, or -tis… it sometimes doesn’t stand for characters in the preceding word at all. Sometimes it’s a paragraph-end marker.

Knowing the language is especially important for interpreting 15th-century manuscripts like the one on the right from the mid-1400s because writing was taught to a larger segment of the population and was no longer the exclusive domain of those chosen for their literacy and handwriting skills. Later documents are often not as tidy as 12th and 13th century hands, and superscripted symbols were not always written directly over the position where the letters are missing.

Implications for the Voynich Manuscript

I’m confident that the VM scribe knew Latin conventions beyond simply copying the shapes. Many of them are contextually applied in the same way one would see them in medieval Latin and German manuscripts. Whether there’s any Latin in the VMS is a completely different subject. I have my own ideas about whether the VM author used Latin conventions to represent single letters, groups of letters, or… as a smoke-screen to hide the underlying contents of the manuscript by crafting the text to look like Latin.

J.K. Petersen