Category Archives: The Voynich Alphabet

Investigations of the shapes that are used within the Voynich to render textlike material.

Checking Out Chechen

7 September 2020

Speakers of Chechen sometimes have difficulty reading and writing their own language. Currently there are about 1.4 million Chechen speakers, mostly in the Caucasus, but also in scattered colonies in the eastern Mediterranean, western Russia, and Bavaria/Tirol. The Chechens live in the mountains, in a linguistically diverse region that includes some very old languages.

In July 2018, I posted a blog on Tischlbong, a Slavic/Bavarian blended language spoken in the village of Timau on the Bavaria/Italian border. This blog takes us further east, to the region between the Black and Caspian seas, where a surprisingly diverse group of languages, some of which are nearly extinct, are still spoken in cultures that are thousands of years old.

It was actually the Azerbaijani language that attracted my attention first, for a number of reasons, but after I began to appreciate the diversity of languages in this region, I learned of some unusual aspects of Chechen and decided to look into this, as well.

Chechen and Nearby Languages

Chechen is spoken by a little more than a million people in a culturally ancient and linguistically diverse region between the Black and Caspian seas, bordering Georgia, Azerbaijan, and Russia. [Source: Google maps; Vyacheslav Argenberg, Wikipedia]

Ubykh, one of the languages in the Abkhaz-Circassian language group, became extinct in 1992. This remarkable language had 82 consonants and only two vowels (Coene, 2009).

In general, minority languages and even some of the majority languages in the northern Caucasus region did not have their own alphabets until the 19th and 20th centuries. Chechen has a longer written history than most of the minority languages. Some of the minority languages are spoken by only a few thousand people and may be gone in a generation or two.

The Avar or Azerbaijani languages are used bilingually for economic transactions by a number of people in this region. Russian is also spoken and mandated in some areas.

In some ways, the Caucasians and Basques have characteristics in common, not in terms of their language specifics or background (although both languages are agglutinative), but in resistance to outside influences. This is largely due to cultural isolation: mountain strongholds are harder to conquer. Historically, these cultural groups retained a certain autonomy that is reflected in their languages.

More recently, however, technology, Soviet expansion, and wars have left their mark and have wiped out a sizable portion of native literature. When orthography changes, books in previous alphabets become obsolete and are destroyed. With them goes the link to ancestral history.

History and Orthography

Chechen and Ingush are closely related. Together they are known as the Vainakh languages, part of the northeast Caucasian family.

Like several Middle Eastern and Central Asian languages, Chechen exemplifies synchronic digraphia, in which one language is written with several alphabets, usually Arabic, Cyrillic, or Latin. Historically, the Arabic alphabet was used for Chechen, but since 1862, a Cyrillic-based alphabet has been the dominant script, with recurring and politically controversial attempts to convert to Latin. In 2002, the Russian language was mandated for education, which may threaten the future of numerous local languages.

Members of the Chechen diaspora who settled in Bavaria and the eastern Mediterranean sometimes use Latin characters because they are familiar, but their efforts are not standardized. The number of books published in Chechen is small and some of these were destroyed in recent wars.

Chechen literature has received very little study but is worthy of attention because of its unique poetic characteristics and the position of this region in an important crossroad between Christian and Muslim cultures.

Some Interesting Aspects of Chechen

Chechen is an agglutinative language with some interesting characteristics. Literacy levels were not historically high, so it is difficult to chart changes between current usage and older versions of the language.

Here are some general characteristics…

Numbers (in the singular) and names of the seasons usually end in a vowel. Dal is the word for God, Seli for the traditional thunderer, and Eter for the ancient underground god (the Chechens were traditionally polytheistic).

There are many words composed of simple 2- or 3-letter syllables, and some that repeat a syllable, such as zaza (flower), or that repeat a consonant together with a different vowel or vice versa, as in qoqa (dove) or adam (person).

Letters like j tend to be at the beginnings of words.

One spelling can have different pronunciations and serve multiple purposes. To take an example cited by E. Komen, the single word деза (deza) can be interpreted as four very different concepts:

dieza (to love), deza (valuable), diexa (to request), and deexa (long)

Does it look like Voynichese?

No, there is more variety in the positions of letters within Chechen words than in VMS tokens. But it demonstrates that natural languages can have orthographies in which different sounds are represented by the same shape, where vowel representation is limited, and within which the same linguistic unit can be repeated several times with different meanings for each iteration.

J.K. Petersen

© Copyright September 2020 J.K. Petersen, All Rights Reserved

Ma Me My Mo Mu

25 February 2020

I found the series Ma Me My Mo Mu in a mid-15th-century German manuscript. This surprised me. If you know east Asian languages, you will recognize the syllabic nature of this series. Another sequence in the German codex is Ba Be Bi Bl Bo Be Bu.

So which language is it? It has elements of Japanese or Filipino but isn’t quite a perfect match for the order or the components. It’s unlikely that Japanese was known in the 1460s in Europe. Could east Asian languages have been recorded earlier than we realized? Or is it an African language (some of which are similar to Asian languages)?

Syllables and Numerals

First I’ll introduce you to the manuscript. If you glance through the chart on Barth 24, f1v and you know Japanese, this sequence jumps out: ma me my mo mu (note that medieval languages often substitute “y” shape for “i”)…

Ma Me My Mo Mu sequence in medieval German manuscript.
Series of two-character syllables beginning with “m” and “n” [Source: Ms. Barth. 24, c. 1460s, Rhein region].

If you read the fragments in this order: black, black, black, red, red, you get ma, my, mu, me, mo which is the correct order for Japanese syllables. Here is the Japanese, with Hiragana equivalents:

But the syllables in the German manuscript are out of order. You have to read the black ones first, followed by the red ones, to get the correct sequence in Japanese. Is this because a medieval scribe or missionary got it wrong? Or because it’s not Japanese but perhaps a related language with a slightly different order?

It turns out it’s not a language at all. It’s a system based on language components and, even more surprising, it is remarkably consistent across unrelated languages. The same system is used in German, Spanish, English, and (believe it or not) Malaysian. Could this be relevant to the VMS, perhaps in more than one way?

It turns out that the German manuscript is a dictionary but not a Romanized-Japanese dictionary. The numbers paired with syllables in the above example refer to folios, and when I looked up an unfamiliar word in the “M” section on Google search, it took me to a word in Tagalog. Once again, I thought, did missionaries compile this? And yet the rest of it looked like Latin (and read as Latin).

The word I selected turned out to be one very big coincidence. It is Latin. The manuscript is the Catholicon, and I coincidentally picked a word that is also valid in Latinized Tagalog.

So what are these syllables if they are not Japanese or Tagalog?

Here is a larger screensnap so you can get a sense of the overall system. The numbers above the syllables are folio numbers:

Barth medieval indexing system based on leading syllables

It took a bit of research to find answers, but I learned that this is a medieval indexing system, one that was designed for large datasets.

We’re used to indexes with numbers accompanying short words and phrases. The one above is a little different and reaches us from the minds of people who lived more than 500 years ago, and it’s still valid! In the post-medieval centuries, it was adapted by schools to teach writing, and by American companies to sell filing systems and insurance services. It is still in use today for a wide variety of purposes.

The system is based on the lookup characteristics of common syllables at the beginnings of words and it’s almost spooky the way it generalizes across unrelated languages. It appears that basic and common sounds at the beginnings of words are somewhat universal despite dramatic differences between western and eastern languages.
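As a rough illustration of the lookup principle (not a reconstruction of the Barth index itself), here is a minimal Python sketch that groups words under their leading two letters and records where they occur. The words, the folio numbers, and the function names `guide_key` and `build_index` are all made up for the example.

```python
from collections import defaultdict

def guide_key(word: str, length: int = 2) -> str:
    """Return the leading letters used as a guide key, e.g. 'Ma' for 'mater'."""
    letters = [ch for ch in word if ch.isalpha()]
    return "".join(letters[:length]).capitalize()

def build_index(entries):
    """Map each guide key to the sorted folio numbers where such words occur."""
    index = defaultdict(set)
    for word, folio in entries:
        index[guide_key(word)].add(folio)
    # Sort the keys so the index reads Ma, Me, Mo, Mu ...
    return {key: sorted(folios) for key, folios in sorted(index.items())}

# Hypothetical (word, folio) pairs; the folio numbers are invented.
entries = [
    ("mater", 201), ("magister", 198), ("mensa", 205),
    ("mors", 214), ("murus", 217), ("mensis", 205),
]

index = build_index(entries)
for key, folios in index.items():
    print(key, folios)
```

A reader looking for a word starting with "Me" only has to scan the short list of guide keys to find the right folio, which is why the scheme scales to large word lists.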

Here are some examples. The first one is an indexing system used in American accounting systems in the 19th century. Note the M and B sequences:

American accounting indexing system syllable lookup system.
Indexing lookup chart for common syllables at the beginnings of words or names (such as cities or clients) from American Counting-room, Volumes 7–8, 1883. Note the sequences listed in the text above the chart.

Here is another example of indexing for large sets of names (companies with 500 or more members). Note Ba Be Bi Bo Br Bu (not identical to the German example Ba Be Bi Bl Bo Be Bu, but close, and also close to the Japanese Ma Mi Mu Me Mo alphabet sequence):

Indexing system for insurance companies for large datasets, based on common syllables at the beginnings of words [A System of Records for Local Farmers’ Mutual Fire Insurance Companies, Valgren, USGPO, 1920].

The instructions for this system say to write the “guide letters” near the upper outside corners of the relevant pages (similar to folio numbers). It should probably be emphasized that even though medieval manuscripts were sometimes annotated with quire numbers prior to being sold, they were usually foliated by the purchaser, his heirs, or the bookbinder’s assistant when it was taken in for binding (sometimes decades or centuries after it was created).

Indexing didn’t always happen when a book was bound; sometimes the index was added weeks or decades later. But when it was professionally indexed, the indexers took their jobs very seriously. It could take months to critically analyze the manuscript, to annotate the margins and, finally, to create the index (as an example of this process, see BNF Latin 15754). In a sense, the index was like a CliffsNotes version of the manuscript.

So how could this indexing system possibly relate to east Asia? Well, take a look at this 21st-century sequence for indexing street names in Malaysia:

I have removed “J” because it was generally non-existent in medieval Europe (what looks like a “j” is usually an embellished “i”) and also “k” because there are many more “k” syllables in 21st-century Malaysian names than in most western medieval languages. It is not a complete match by any means, but considering that German and Malaysian languages are very different, there are a remarkable number of matches in content and sequence.
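One way to put a number on “matches in content and sequence” is a longest common subsequence, which counts the items two lists share in the same order. The Malaysian list is not reproduced in the text here, so this sketch (my own illustration, with a hypothetical function name `lcs`) compares the two B sequences quoted earlier: the German one from Barth 24 and the American one from the insurance indexing example.

```python
def lcs(a, b):
    """Longest common subsequence of two lists (classic dynamic programming)."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover one longest subsequence.
    result, i, j = [], rows, cols
    while i and j:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]

german = "Ba Be Bi Bl Bo Be Bu".split()
american = "Ba Be Bi Bo Br Bu".split()

shared = lcs(german, american)
print(shared, f"{len(shared)} of {len(german)} German keys appear in order")
```

The same function could be pointed at any two guide-key lists, including a transcription of the Malaysian street index, to see how much of the order is shared.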

This unexpected linguistic continuity gave me food for thought. I wondered… can this characteristic of languages have any relevance to the VMS?

Are There Indexes in the VMS?

Maybe. Here are some things to consider…

  • Some manuscripts were almost entirely indexes, which means the word patterns don’t match full sentences and numbers are frequent.
  • Some manuscripts, even long ones, had no indexes at all.
  • Some had brief section indexes (note the folios in the VMS that resemble “key” pages).
  • Some depended on an index as a separate volume.
  • Some had long indexes, extending for several folios (not unlike the dense text at the end of the Voynich Manuscript). Sometimes each entry was notated by a symbol such as a cross or flower.

Summary

Numerous insights can be gleaned from this. First of all, it shows there are aspects of language that are similar among western and Asian languages. The sample posted above demonstrates this with startling clarity.

Maybe it explains why Voynich “solutions” have been offered in a dozen different languages with many solvers (and statistical analysts) feeling strongly that it matches their language of choice. Perhaps we are seeing fragments (as in an index or as in words that have been broken into syllables with extra spaces) that follow patterns common to a number of languages.

Or perhaps the VMS (or portions of it) comprises an index which, in the Middle Ages, could sometimes look like a student notebook, with many note-style annotations interspersed with numbers.

The concept of multiple volumes existed in the Middle Ages. There are a number of medieval herbals designed with separate text and illustrations. Bibliographers and historians have suggested that certain specific books, in a variety of subjects, may once have had a companion volume.

But does this apply to the Voynich Manuscript?

It’s my opinion that many of the VMS “labels” are not words, at least not if space boundaries are retained. Maybe they are references rather than names. It seems intuitively obvious to look for label matches in the main text (and I, of course, have done this as well), but this isn’t the only way to cross-reference. Label text doesn’t have to match the exact pattern of glyphs in the main text to function as a reference. It just has to “point” in some way (e.g., referencing a folio number, section, paragraph or quadrant, or perhaps a separate volume), a process that would result in a high degree of repetition and self-similarity.

I have seen cross-referencing in medieval manuscripts. There is an herbal in an English repository that cross-references the same plant in another manuscript, with a short annotation near the root. It is also very common in Greek herbals for illustrations in the margins to include an indexed number (written as letters) that references a formal index or some part of the text.

Even so, it should probably be noted that the VMS has quite a lot of text, most of it carefully integrated with the illustrations, which seems to speak against a companion volume, but if the VMS glyphs represent a verbose code, as one possibility, then the information content could be much lower than it appears.

J.K. Petersen

© Copyright Feb. 2020, J.K. Petersen, All Rights Reserved

Cheshire Reprised

16 May 2019

A week ago I posted commentary on Gerard Cheshire’s “proto-Italic” and “proto-Romance” solution for the VMS. At the time, his most recent paper was pay-to-view, so I had to restrict my comments to the previous open-access paper. Now the most recent version is open-access. Unfortunately, not much has changed from the previous version. You can see his April 2019 proto-Romance theory here.

What exactly do the terms “proto-Romance” and “proto-Italic” mean?

Proto-Romance

If you search for “proto-Romance”, you will find many references to “vulgar Latin” (also called colloquial Latin)—variations of Latin spoken by the common people (most of whom were illiterate) during the classical period of the Roman Empire.

The “classical period” of the Greeks and Romans spanned approximately 14 centuries up to about the 6th century C.E., when the Roman Empire was no longer dominant. As Rome lost its grip, vernacular languages and local versions of Latin had the opportunity to evolve into modern languages such as Italian/Sardinian, Spanish, Portuguese, French (with Gaulish influence), and Romanian.

Extinct Languages and Undocumented Scripts

The prefix “proto-” comes from Greek πρωτο-. This refers to the first, or to something that comes before. So proto-Romance means before the Romance languages had fully emerged (from vulgar Latin), and proto-Italian script means an alphabet that was used before the script that became standard for writing medieval Italian. Medieval Italian script is essentially the same alphabet we use now except that the letterforms are more calligraphic than modern computer users are accustomed to seeing.

This brings us back to Cheshire, who is claiming that Voynichese is an extinct proto-Romance language in an undocumented proto-Italian script… something that existed about 1,000 years before the creation of the VMS.

How is that possible when the radiocarbon-dating and many of the iconographical and palaeographical features of the VMS point to the early 15th century?

Cheshire’s Interpretation of Medieval Characters

Cheshire’s descriptions of individual glyphs, and his interpretations of the annotations on folio 116v, suggest that he is not familiar with medieval scripts.

It also seems that he hasn’t studied the frequency or distribution of the Voynich glyphs in the larger body of the main text, because he associates common letters and letter combinations with glyphs that are rare, or that have unusual positional characteristics. This point is so important, it bears repeating… Cheshire assigned substitution values for common letters to rare VMS glyphs, or glyphs that have positional characteristics that are not consistent with Romance languages.

Is it possible he never tested his system to see if it would generalize to larger chunks of text? Did he prematurely assume he had solved it?

Let’s look at some examples…

Cheshire’s Analysis and Transliteration of Voynich Glyphs

In his first example, Cheshire takes a glyph-shape that is known to palaeographers as the Latin “-cis” abbreviation (the letter c plus a loop that usually represents “is” and its homonyms). This shape is both a ligature and an abbreviation in languages that use Latin scribal conventions. It has not yet been determined what it means in the VMS, but its positional characteristics are similar to texts that use the Latin alphabet.

VMS researchers know this shape as EVA-g.

Cheshire transliterates it as a “ta” diphthong. It’s not a diphthong. A diphthong is a combination of two vowel sounds and “t” is clearly not a vowel. The terminology is wrong.

He then gives an explanation of the shape that doesn’t mesh with medieval interpretations of letter shapes. This is figure 26 from his paper (Source: tandfonline):

To say that this can be confused with the letter r and the letter n makes no sense to anyone accustomed to reading medieval manuscripts. It looks nothing like r or n. If Cheshire means it can be confused with his transliterated r or n, he should clarify and provide examples.

To get a sense of how this character was used in the medieval period, I have created a chart with examples of the “-cis” ligature/abbreviation that was common to languages that used Latin scribal conventions. I have sorted them by date.

This is not to imply that the Latin meaning and the VMS meaning are the same. The VMS designer may only have borrowed the shape, but it is important to note that the position of this glyph in the VMS is very similar to how it is positioned in Latin languages:

More important than the mistakes in reading medieval characters and linguistic terminology is that Cheshire did not address the basic statistics of VMS text and the fact that this glyph occurs primarily at the ends of words and sometimes the ends of lines. Thus, transliterating EVA-g as “ta” is highly questionable.
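The positional point is easy to check for any glyph once you have a transcript. Here is a minimal sketch of such a tally. The tokens below are invented EVA-style strings for illustration only (they are not taken from a real transcription), and `position_profile` is a hypothetical helper name.

```python
from collections import Counter

def position_profile(tokens, glyph):
    """Count where `glyph` falls inside tokens: initial, medial, or final.

    A glyph that forms a one-character token is counted as final.
    """
    profile = Counter()
    for token in tokens:
        for i, ch in enumerate(token):
            if ch != glyph:
                continue
            if i == len(token) - 1:
                profile["final"] += 1
            elif i == 0:
                profile["initial"] += 1
            else:
                profile["medial"] += 1
    return profile

# Invented EVA-style tokens, for illustration only.
tokens = ["okag", "dag", "gol", "cheg", "daiin", "otag"]

profile = position_profile(tokens, "g")
total = sum(profile.values())
for place in ("initial", "medial", "final"):
    print(f"{place}: {profile[place]} of {total}")
```

Running the same tally on the proposed plaintext letters (for example, how often “ta” ends a word in a Romance-language sample) would show whether the positional profiles are compatible.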

Perhaps Cheshire can justify this mismatch between letter frequency and position by saying that separate glyphs also exist for “t” and “a”, but when you put the various transliterations together, you find that the character distributions of Romance-language texts and of Cheshire’s transliterations are significantly out of sync.

For example, as in his previous paper, he chose one of the rarest glyphs in the VMS repertoire (EVA-x) to represent the letter “v”. In classical Latin and Romance languages, the letters “u” and “v” are essentially synonymous and very frequent. In this brief excerpt in modern characters, from Pliny the Younger, note how often u/v occurs:

Pic of letter frequency of U/V in classic Latin text by Pliny the Younger

If Voynichese were a proto-Romance language (some form of classical vulgar Latin), and EVA-x were transliterated to U/V and also F/PH, as per Cheshire’s system, one would expect to see this character more than 40,000 times in 200+ pages. Instead, this character occurs fewer than 50 times. That alone should create doubt in people’s minds about Cheshire’s “solution”.
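The reasoning behind that kind of estimate can be sketched in a few lines: measure how often u/v occurs in a Latin sample, then scale by the size of the text. This is only an illustration of the method, under loud assumptions. The sample is the opening sentence of Caesar’s Gallic War (substituted because the Pliny excerpt is an image), the glyph total is an assumed round figure, and F/PH is left out, so the result is not meant to reproduce the number quoted above.

```python
def letter_rate(text, letters):
    """Fraction of alphabetic characters in `text` that are in `letters`."""
    alpha = [ch for ch in text.lower() if ch.isalpha()]
    hits = sum(ch in letters for ch in alpha)
    return hits / len(alpha)

# Substitute sample (Caesar, not the Pliny excerpt shown in the image).
sample = (
    "Gallia est omnis divisa in partes tres, quarum unam incolunt Belgae, "
    "aliam Aquitani, tertiam qui ipsorum lingua Celtae, nostra Galli appellantur."
)

# Assumption: a round figure for the number of glyphs in the manuscript.
ASSUMED_GLYPH_COUNT = 170_000
OBSERVED_UPPER_BOUND = 50  # "fewer than 50 times", per the text above

rate = letter_rate(sample, "uv")
expected = round(rate * ASSUMED_GLYPH_COUNT)

print(f"u/v rate in sample: {rate:.1%}")
print(f"expected occurrences: about {expected:,}")
print(f"shortfall factor: at least {expected // OBSERVED_UPPER_BOUND}x")
```

Whatever sample and glyph total you plug in, the expected count lands orders of magnitude above 50, which is the point of the comparison.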

So what has Cheshire done? He has assigned a different letter to represent “u”, but we know that in classical Latin, Etruscan, and Old Italic, “v” and “u” did not represent different letters even if both shapes were used (which they usually weren’t).

Even in the Middle Ages, when there were different shapes for “u” and “v”, most scribes used them interchangeably. In other words, “verba” might be written with the “v” shape in one phrase and with a “u” shape (uerba) in the next, just as “s” was written with several different shapes (without indicating any difference in sound).

This is the 23-character Latin alphabet in use around the time vulgar Latin was evolving into Romance languages:

Example of Roman alphabet

Perhaps Cheshire didn’t know that they were interchangeable shapes rather than two different letters when he created his transcription system. But if he did know, if he actually believes that “u” and “v” were distinct letters in proto-Romance languages, he will have to provide evidence, because historians, palaeographers, and linguists are going to be skeptical.

Beginning-Paragraph Glyphs

Voynich scholars have noticed there are disproportionate numbers of EVA-p/r and EVA-t/k characters at the beginnings of paragraphs. There is a possibility that some are pilcrows, or serve some other special function when found in this position.

Cheshire doesn’t appear to have noticed this unusual distribution (at least he doesn’t comment on this important dynamic in his paper) and translates the leading glyph in the same way as the others. In his system, a very large number of paragraphs inexplicably begin with the letter “P”.

Some of his translations cannot be verified. For example, he used a drawing on f75r to demonstrate a single transliterated word “palina” on f79v. There’s no apparent relationship between them (other than what he contends), so how does an independent party determine if the translation is correct?

Tenuous Assertions

On f70r, he uses a circular argument to explain the transliteration of “opat” (which he says is “abbot”). He says the use of “opat” indicates “that proto-Romance reached as far as eastern Europe” because “opát survives to mean abbot in Polish, Czech and Slovak”.

We don’t need a dubious transliteration to tell us that proto-Romance languages reached eastern Europe. The existence of Romania demonstrates this rather well—it borders the Ukraine, and used to encompass parts of Bohemia. Bohemia included Hungary, Czech, and parts of eastern Germany, so transmission of vulgar Latin to Polish through Czech was a natural process.

Palaeographical Interpretations

There are problems with the way Cheshire describes the text on folio 116v. He refers to the script as “conventional Italics”. It is, in fact, a fairly conventional Gothic script, not “conventional Italics”.

Then he makes a strange statement that the second line on 116v is hybrid writing, that it is Voynichese symbols mixed with “prototype Italic symbols, as if the calligrapher had been experimenting with a crossover writing system”. It’s hard to respond to that because his statement is based on misreading the letters. Here is the text he referenced in his paper:

anchiton mehiton VMS 116v

Cheshire interprets this as “mériton o’pasaban + mapeós”

He misread a normal Gothic “h” as the letter “r” and a normal Gothic “l” as the letter “P”. In Gothic scripts, the figure-8 character is variously used to represent “s”, “d”, and the number 8, so it’s very familiar to medieval eyes, but he doesn’t seem to know that and interpreted it as a Voynich character that he transliterated to “n”.

If his reading of the letters is wrong, then his transliteration is going to be wrong, as well.

Zodiac Gemini Figures

Cheshire mentions the Gemini zodiac figures (the male/female pair), and states: “Both figures are wearing typical aristocratic attire from the mid 15th century Mediterranean.”

It takes research to determine the location and time period for specific clothing styles. It’s not something people just automatically know. Since Cheshire didn’t credit a source for this reference, I will. It’s possible he got the information from K. Gheuens’ blog. Even if he didn’t, Gheuens’ blog is worth reading.

Flora and Fauna

I’m not going to deal with Cheshire’s fish identification. It’s just as dubious as the Janick and Tucker alligator gar. There are fish that are more similar to the VMS Pisces than Cheshire’s sea bass, and pointing out the fact that sea bass has “scales” is like pointing out that a bird has wings.

I was hopeful that Cheshire’s latest paper would be an improvement over his previous efforts, but I was disappointed.

Summary

It’s possible there is a Romance language buried somewhere in the cryptic VMS text (it was, after all, discovered in Italy, and the binding is probably Italian), but that is not what Cheshire is suggesting. He’s saying it’s an extinct proto-Romance language, without providing a credible explanation of how this information could have been transmitted a thousand years into the future.

There is a relentless publicity campaign going on right now to catapult Cheshire into the limelight. I’m not going to repeat the claims in the news release (they’re pretty outrageous), but even Superman would blush at the accolades being heaped on this unverified theory.

When I checked Cheshire’s doctoral research, I discovered it was in belief systems. Somehow that seems fitting.

J.K. Petersen

© Copyright 2019 J.K. Petersen, All Rights Reserved

Postscript 16 May 2019: The University of Bristol has retracted the Cheshire news release. You can see the retraction here for as long as they decide to make it available.

Maximizing the Minims

19 April 2019

There are two pattern groups in the VMS that could be related, maybe. They have traits in common that might help us understand Voynichese.

I’ve blogged about double-cee shapes (EVA-ee), but felt it would be too long if I included relationships between cee patterns and the more familiar aiin patterns, so I’ll continue the discussion here…

The Double-Cee Question

As I’ve posted before, there are many places in the VMS where cee shapes (EVA-e) look like they might be joined. There are even places where double-cee and single-cee are adjacent:

Examples of cee shapes in Voynich Manuscript text

I strongly suspect that double-cee (the one that is tightly coupled) is intended as one meaning-block.

  • In Visigothic manuscripts, the letter “t” was often written as a double-cee shape.
  • In early and mid-medieval manuscripts, a double-cee stood for “a”.
  • In early and mid-medieval manuscripts, a superscripted double-cee stood for what we would call “u” (it was often next to a “q” character).

Thus, many scribes perceived tightly coupled cees as a unit.

Of course, nothing is easy with the VMS. Here is an example of overlapping cee-shapes next to ones that are separate. Do we interpret them as different or the same?

Note also how the bench joins with the row of connected cees, which brings us to the next point…

Is The VMS Deliberately Deceptive?

It’s very difficult to tell if the VMS is designed to deceive. Patterns like the following are hard to interpret.

Are the tails on these glyphs added to hide the length of a sequence? Or are they genuinely different glyphs?

In the same vein, are EVA-ch and EVA-sh cee-shapes in disguise? Could the cap on EVA-sh be yet another cee?

Here’s an example where two cee-shapes are topped with a macron-like cap (a shape that is usually associated with the benched char):

EVA-ee with cap

For that matter, is the 9-shape a hidden cee?

I don’t know for sure, but based on the behavior of the glyphs (in terms of position and proximity), I get the feeling (so far) that EVA-ch and EVA-sh might be related to cee-shapes, even if they mean something different (they frequently occur together), while EVA-y dances to a different drummer.

Positionality

Cee shapes frequently cluster in the middles of tokens, just as minim patterns are frequently at the ends, but are they somehow related? They are the only two groups of glyphs that repeat many times in a row.

These examples from f4v and f7v are provocative because they suggest that cee shapes and minims might be related. Rather than being word-medial, the cees on the right are word-final and have long tails from the bottom rather than the top:

Now, let’s examine the -aiin patterns…

Aiin not Daiin, and maybe not even Aiin

I think it was a big mistake for early researchers to cinch the idea of “daiin” in people’s minds. The aiin sequences are frequently (yes, frequently) preceded by glyphs other than EVA-d.

Stephen Bax wrote a paper in 2012 (revised Nov. 2013), in which he summarized one of the most common ideas for interpreting the glyph sequence called “daiin” (e.g., that it might mean “and”). Here is a quote and a link to the PDF file:

It is argued from this analysis that the element transcribed as ‘daiin’, the most frequently occurring item in the manuscript as a whole, is in fact a discourse marker separating out sense units, functioning like a comma or the word ‘and’, and analogous to the use of crosses in folio 116v.

Stephen Bax

The Voynich manuscript—informal observations on some linguistic patterns.

And here are some of my observations…

First, let’s start with the crosses on folio 116v. There is a strong precedent in medieval manuscripts for including the plus sign in charms and medical remedies in places where the reader or speaker (or healer) genuflects. The plus sign is sometimes also used like “and”, just as we use it now (nothing new about that). However, I doubt that the plus- or cross-symbol on 116v is related to “daiin”.

Now back to the paper…

On page 3, Bax noted instances of word-final daiin, but he examined them out of context. He recorded instances of aiin that are preceded by EVA-d and basically ignored the other glyphs that precede -aiin in the same sample (as well as daii- that occurs at the beginning). I have marked the patterns that were not mentioned in red:

Studying the “daiin” pattern this way is like examining -tally patterns in English while ignoring related patterns like -ly, -lly, -ally, -aly, and -dly. He also failed to account for the fact that aiin is not a homogeneous glyph pattern. It includes an/ain/aiin/aiiin and even sometimes iiin.
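To show what a less selective count might look like, here is a small sketch that tallies every token ending in an/ain/aiin/aiiin (or bare iiin), along with whatever glyph comes before the ending. The token list is made up for illustration and is not a transcription sample; the names `MINIM_ENDING` and `tally` are hypothetical.

```python
import re
from collections import Counter

# Matches a token ending in an/ain/aiin/aiiin, or in a bare iiin.
# The lazy prefix keeps the ending as long as possible (d + aiiin, not da + iiin).
MINIM_ENDING = re.compile(r"^(.*?)(ai{0,3}n|iiin)$")

def tally(tokens):
    """Tally which glyph precedes the minim ending and which variant it is."""
    preceding, variants = Counter(), Counter()
    for token in tokens:
        match = MINIM_ENDING.match(token)
        if not match:
            continue
        prefix, ending = match.groups()
        # Simplification: each EVA letter is treated as one glyph, so a
        # benched glyph such as "ch" would be reported by its last letter.
        preceding[prefix[-1] if prefix else "(none)"] += 1
        variants[ending] += 1
    return preceding, variants

# Made-up EVA-style tokens, for illustration only.
tokens = ["daiin", "okaiin", "qokain", "saiin", "dain",
          "aiiin", "chol", "daiin", "oiiin"]

preceding, variants = tally(tokens)
print("preceding glyph:", dict(preceding))
print("ending variant:", dict(variants))
```

Counting this way keeps “daiin” in view but also shows how much of the family it leaves out.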

He further makes no mention of the tail patterns. If the length of the tail is meaningful then, like so many before him, Bax might have overestimated the frequency of daiin.

Tail Coverage

Most transcripts treat the many versions of daiin as if they are the same. They count only the number of minims (and they don’t always get that right). But there is another dynamic that gets little attention, and that is the length of the tails.

Tail coverage varies. Thus, a glyph with three minims might have three different versions of tail coverage and perhaps three different meanings:

VMS tails in minim sequences

Here is the text sample color-coded for different tail patterns, with green for one and red for two:

About half the instances of “daiin” look like dauv and the others look like daiw, if you pay attention to the length of the tail. They are not necessarily the same. If you include aiin sequences not preceded by EVA-d, it varies even more. Normally I wouldn’t consider tail length to be important. In Latin, the length of tails (a form of apostrophe or ligature) is pretty arbitrary. Some scribes lengthened the tail if more letters were left out, but this was not the norm. In the VMS, when you create a transcript and examine every token, tail-length feels deliberate.

Nick Pelling pointed out to me in a blog comment that there are dots at the ends of tails. I’m not sure I had noticed that (he’s right, there are). I had noticed the varying tail lengths. After Pelling called my attention to the dots it occurred to me that maybe the dots were to help the scribe accurately craft the length of the tail.

Tail lengths might turn out to be trivial rather than meaningful, but it’s still important to document their patterns as part of the research process. If they are significant, then vanilla-flavored “daiin” is not nearly as frequent as claimed.
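As a rough sketch of why this matters for frequency counts, suppose each token were recorded with its tail coverage as well as its EVA spelling. The observations below are made up purely for illustration, and the tail_coverage field is a hypothetical annotation that existing transcripts do not provide:

```python
from collections import Counter
from typing import NamedTuple

class MinimToken(NamedTuple):
    eva: str            # conventional EVA spelling
    tail_coverage: int  # hypothetical annotation: how many minims the tail sweeps over

# Invented observations, for illustration only.
observations = [
    MinimToken("daiin", 1),
    MinimToken("daiin", 2),
    MinimToken("daiin", 1),
    MinimToken("daiin", 3),
    MinimToken("aiin", 2),
    MinimToken("daiin", 2),
]

# Counting by spelling alone lumps every tail variant together.
by_spelling = Counter(token.eva for token in observations)

# Counting by spelling plus tail coverage keeps the variants apart.
by_spelling_and_tail = Counter(observations)

print(by_spelling["daiin"])
for token, count in sorted(by_spelling_and_tail.items()):
    print(token.eva, token.tail_coverage, count)
```

If tail length turns out to be trivial, the second tally can simply be collapsed back into the first. The reverse is not possible once the detail has been left out of a transcript.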

Forget about the “d”…

Minim sequences don’t require EVA-d and don’t always need EVA-a. Here’s a minim sequence that stands alone (four minims with one covered, or perhaps three minims and another glyph entirely):

I think future research would be more fruitful if transcripts and descriptions of the text were more aligned with reality. Calling them minim sequences carries fewer assumptions than “daiin”.

Interpreting Minims

I’m not sure minim sequences are intended as separate characters. Just as some of the cee shapes look like they belong together as a block, the iii sequences do so as well. There are numerous instances where they resemble uiv rather than iiv.

In this example from folio 8r, a curved macron has been placed over two minims in aiiin (I prefer to call the shapes aiiiv rather than aiiin, but I’ll respect the existing EVA system for now). It is almost as though the scribe were explicitly associating two minims:

Maybe the cap is a macron in the Latin sense (apostrophe for missing glyphs), or maybe it’s a way to say, this is a “u” shape, don’t confuse it with “ii”. Note that there is a slight gap between the first “u” shape and the second (or between the “u” shape and the “iv” shape):

In this example from f8v, the first two minims resemble a “u” shape and are distinctly separated from the final glyph (which resembles “v” or “i-tail”, and yet there is a 3-coverage tail):

As for the length of the tail, in Latin it usually doesn’t matter, but there were a few scribes who pointed the tail at the particular spot where letters were missing (the tail is an apostrophe attached to the end so the scribe doesn’t have to lift the quill). What it means in the VMS is still a mystery.

Maybe progress in understanding the VMS is slow because many transcripts don’t include these details.

I have an enormous chart that documents these patterns, but it’s not yet finished and ready to interpret. This is only the merest snippet—part of the top-left corner:

Snippet from very large Minim-Sequence Chart


Minims and Cee Shapes

This is getting long, so I’ll end with one last question (possibly an important one). Is there some connection between minims and cee shapes?

Minims are more frequently at the ends of tokens (but not always). Cee shapes are more often in the middle. Both tend to cluster. Both have tails of varying lengths.

It’s fairly obvious that they both repeat, but I don’t know if anyone has offered a practical explanation (other than the possibility of Roman numerals). Here are examples that illustrate the similarities:

And here is an example that is particularly enigmatic. Is it EVA-ochaien or EVA-ocheiien or ochaiin or something else? Did the scribe slip and draw one of the minims as a cee-shape, or is this a uniquely structured token?

J.K. Petersen

© Copyright 2019 J.K. Petersen, All Rights Reserved


The Ligature Legacy

“…some symbols in the Voynich Codex show similarities to letters found in sixteenth century codices from New Spain (Tucker and Talbert 2013; Comegys 2013) particularly the Codex Osuna (Valderrama 1600; Chávez Orozco 1947).”  —Janick and Tucker, Aug. 2018

The authors are talking about shapes that roughly resemble EVA-k and EVA-t. The following statement is much more surprising:

“We thus conclude that the author of the Voynich Codex made up his syllabary/alphabet, and the letters were borrowed from contemporary post-Conquest MesoAmerican manuscripts such as the Codex Osuma.”

In scholarly circles, “conclude” is a strong word—a word that needs to be backed up with solid evidence. Unfortunately, I find this conclusion highly questionable. The examples the authors use in their arguments are conventions that originated in Old World Latin scripts long before the 16th century. How can one use Old World scribal conventions to argue for a New World conclusion?

Is There a Preponderance of Evidence to Support the Conclusion?

Perhaps the authors felt that if the glyph shapes are taken together with botanical and biological identifications, there is enough evidence to support a New World origin, but the botanical and biological identifications of Tucker, Talbert, Janick, and Flaherty are highly questionable, as well. If you haven’t been following this discussion, then at least scan-read the previous blogs:

Even though the VMS 93r “sunflower” has a number of possible identifications (both New and Old World), Janick bases broad conclusions on this unproven ID as though it were fact:

“Simply put, there is no way a manuscript written on vellum that contains a sunflower and an armadillo could have been written before 1492,” —quoted on Purdue The Exponent news site, 10 Sep. 2018

There isn’t any proof of the identity of the “armadillo” either. It looks more like an Old World pangolin than a New World armadillo, but even this identification can be contested.

What I see in the papers and book by these authors is a collection of inadequately researched suppositions combined in a circular argument to support a New World theory. They pick out a few similarities and ignore the larger body of contrary evidence. They identify two completely different fish as the same fish. One of the plants they identified doesn’t even grow in MesoAmerica. They ignore numerous significant details like the cloudband under the “armadillo”. They ignore alternate IDs for the sunflower.

Unfortunately, the authors’ identification of Voynich-like glyphs suffers the same lack of critical evaluation as the plant and animal IDs, so let’s take a closer look at those.

The VMS-like Letters in the Codex Osuna

Here are examples of the Voynich-like glyphs cited by Janick and Tucker (and by Tucker and Talbert in a previous publication) in the Codex Osuna.

EVA-k is at the top, and EVA-t is at the bottom. Note that the handwriting is different:

Before you say, “Oh, those are similar”, make sure you read the rest of this blog. Bats and owls might look similar to a visitor from another planet, but one is a mammal, the other is a bird, and they are not closely related.

Visual Similarity is Not Enough (especially when they’re not actually that similar)

Something important Janick and Tucker did not mention is that the letters that appear to resemble EVA-k and EVA-t exist in two different scripts in two different languages. Failing to mention this distinction obscures the origin of these shapes, so I will fill in the missing pieces:

  • The EVA-k shape is in the sections written in Nahuatl.
  • The EVA-t shape is in the sections written in Spanish.

There are simple reasons for this, but they are important ones because there is no specific relationship between the Spanish and Nahuatl shapes. The similarities are coincidental, but some background might be necessary to make this clear…

Nahuatl Version of EVA-k

If you’ve heard the Bushmen click language, you know it can be very difficult to express this with Latin letters.

Similarly, there is a sound in Nahuatl that is hard to write. It’s made with the tongue against the back of the teeth, so the Spanish missionaries chose to represent the sound as the letters t + l and they wrote it as a ligature tl, with the crossbar of the “t” connecting to the loop of the “l”.

This ligature is not specific to Nahuatl or to the New World. It exists in Old World words like “atlas”, “battle”, “gatling”, etc. Note that the crossbar in the first letter “t” always extends some distance to the left of the stem, which is different from the way EVA-k is written:

It’s possible EVA-k is a ligature (two shapes combined) but if it is, then it follows age-old scribal conventions that are not specific to the New World (or to Nahuatl script). It doesn’t seem likely that VMS EVA-k was copied directly from Nahuatl if one goes by shape alone. It is more similar to some of the European ligatures and abbreviations such as “Il” (French) or “Item” (Italian, German, Latin) than the ligature on the left.

What About EVA-t?

Another common ligature in Old World languages that used Latin characters was the “d” + “e” and since the letter “d” was written a dozen different ways, the “de” ligature is quite variable. A similar ligature combines “d” and “l” as in words like “headless”. Sometimes they are hard to tell apart from each other and from ligatures like “il”, but the concept is the same: two letters are combined so they can be written faster or in less space.

Here are examples of how “de” and related ligatures were sometimes written in Spanish scripts from the 14th to 16th centuries. The two on the right are from the Codex Osuna. The faster and loopier the writing, the more it resembles EVA-t (sort of):

The examples on the right illustrate how loose a ligature could be and how combinations like “de” or “dl” or “Il” or “Ie” need to be seen in context to be distinguished from one another, especially if it is an open-loop “d” followed by a very round “e” or “l”.

It has been suggested by Janick and Tucker that the glyphs above-right inspired EVA-t in the VMS, but this seems unlikely. EVA-t has long straight stems:

It’s possible the VMS char is a ligature, but even if it was inspired by “de” (I highly doubt that “de” was the inspiration but let’s pretend for a moment that it was), this ligature was common in many Old World languages.

The authors of Unraveling the Voynich Codex didn’t mention that the two shapes that resemble VMS glyphs are taken from two different sets of scripts (one in Nahuatl, the other in Spanish) and, more importantly, that these shapes were part of the normal scribal repertoire of Old World Europe and thus might have been seen by the creator of the Voynich Manuscript long before the conquest of MesoAmerica.

Summary

The authors didn’t provide any solid evidence that the inspiration for these shapes was specifically New World sources. In fact, the position of EVA-k and EVA-t within VMS tokens doesn’t match well to Nahuatl letter order, either, which further weakens the authors’ interpretation of the VMS script as a Nahuatl substitution code.

I’m not entirely opposed to New World interpretations. I think the VMS is probably Old World, but I will listen to New World arguments, as long as they are good ones. Unfortunately, many New World theories are marred by faulty logic and hasty conclusions.

J.K. Petersen

© 2018 J.K. Petersen, All Rights Reserved

 

The Chameleon Quality of Scribal Conventions

Medieval alphabets, numbers, and abbreviations are often the same shape. For example, the glyph identified in the VMS as EVA-l (ell) was used as both a number and as a scribal abbreviation. In the previous blog, I described the “is” glyph, which is used to create syllables such as ris, tis, or cis. This time I’ll illustrate the flexibility of the EVA-l shape.

Something I noticed, when reading early medieval texts, is that many basic abbreviation symbols were based on Indic-Arabic numbers, long before these shapes came into general use. I’m assuming this was to help distinguish abbreviations from regular letters.

Numerals used as scribal abbreviations

Thus, the number 1 (the old style with a slight wiggle), and the lightning-bolt style of 5 were used for er/ir/re/ri and other sound-alikes that usually include “r”. Number 2 represented ur/tur, number 3 (often written like a zee) was at the ends of words to stand for rem or us/um. Number 4 (as shown above) was a general-purpose abbreviation, 7 became et, and the number 9 was commonly used at the beginnings of words for con/cum, and at the ends for us/um. These conventions continued from the early medieval period until the 16th century. The only significant change was that the number 4 gradually fell out of use  by the 15th century.

The 4 had lapsed as an abbreviation by the time the Voynich Manuscript was created, but it had become common to use it as a numeral.

This clip from a legal charter illustrates the flexible nature of 4. It represents whatever letters are missing, similar to a short macron or curved macron in late medieval texts. It usually stood for one- or two-letter omissions (a bar was more commonly used if several letters were missing). Here it variously stands for m, n, and er:

Scribal abbreviation 4 symbol representing a number of different letters

Examples of the flexible nature of the “4” abbreviation standing in for several different letters or more than one letter.  [Image credit: Stiftsarchiv Reichersberg German legal document from 1231]

You might wonder why a single glyph would be substituted for another single glyph rather than writing the missing letter. Part of the reason is space. There is plenty of space between lines that goes unused, so substituting a superscripted letter shortens the overall length of the document (and the amount of parchment needed). Vanity may also play a part—those who could write and read abbreviated text probably moved up in the social hierarchy.

Paper began to replace parchment in the 14th century, and was less expensive, so some of the superscripted abbreviations, like 9, were lowered to the main text (some scribes wrote it both ways). The 4 continued to be superscripted until it became more strongly associated with numbers rather than with abbreviations:

Illustration of flexibility of scribal abbreviation symbols

I think it’s important to understand how scribes made the distinction between letters, abbreviations, and embellishments if one is to analyze anything written in the medieval period.

I still encounter considerable skepticism about the VMS glyph-shapes being inspired by Latin. They are not as unusual as many people have suggested.

You can search all over the world for that elusive “alphabet” without finding it. In fact, I did exactly that. Even though I recognized these shapes as Latin, I wanted to be sure I had not overlooked anything and spent two years learning dozens of foreign alphabets (Armenian, Syriac, Gujarati, Georgian, Sanskrit, Hebrew, Greek, etc., in addition to the ones I already knew… Korean, Japanese, Russian and a tiny bit of Chinese), well enough to read simple words… and then came back to where I started—almost all the VMS glyphs are normal Latin scribal repertoire, and the few that are questionable are similar to Greek conventions or could reasonably be constructed from Latin scribal building blocks put together in a slightly unconventional and yet acceptable way.

Understanding scribal conventions might help sort out which variations in VMS shapes are meaningful and which are not. For example, in Latin, you can draw the tail on EVA-m in any direction without changing the meaning, but if you change the left-hand side, it becomes another syllable. In Latin, the 9 shape can be drawn any way you want, as long as it is vaguely 9-shaped, but if you move it from the end of the word to the beginning, you change the meaning. VMS glyphs might follow similar concepts even if different meanings have been assigned.

Summary

If some of the VMS glyphs are abbreviations, it creates one-to-many relationships of varying lengths. If this were a substitution code, this, in itself, would not be unduly difficult for cryptanalysts to unravel, but in medieval times there was another twist: scribal abbreviations commonly represented not only several letters, but often different letters in each word. In this way, scribal abbreviations diverge from typical one-to-many/many-to-one diplomatic ciphers, in that the interpretation of a specific shape can change from one word to the next.

In Latin, and possibly also in the VMS, two words can look the same, but mean something different.
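Here is a minimal sketch of that one-to-many, position-dependent behavior, using only two of the conventions described above: the “9” (con/cum at the beginning of a word, us/um at the end) and the general-purpose “4” (m, n, or er). The function name, the plain-text encoding of abbreviated words, and the tiny word list are assumptions made for illustration, not a real paleographic tool:

```python
from itertools import product

# Tiny illustrative word list; a real test would need a proper Latin lexicon.
LEXICON = {"contra", "dominus", "dominum", "omnis"}

def candidates(word):
    """List every expansion of an abbreviated word under two toy rules."""
    last = len(word) - 1
    options = []
    for i, ch in enumerate(word):
        if ch == "9" and i == 0:
            options.append(["con", "cum"])      # "9" at the beginning of a word
        elif ch == "9" and i == last:
            options.append(["us", "um"])        # "9" at the end of a word
        elif ch == "4":
            options.append(["m", "n", "er"])    # general-purpose abbreviation
        else:
            options.append([ch])                # ordinary letter, kept as written
    return ["".join(parts) for parts in product(*options)]

for abbreviated in ["9tra", "domin9", "o4nis"]:
    readings = candidates(abbreviated)
    plausible = [r for r in readings if r in LEXICON]
    print(abbreviated, readings, plausible)
```

Note that “domin9” yields two valid words (dominus and dominum). Nothing in the shape itself decides between them, so the reader has to rely on context, which is what makes this harder than a fixed substitution table.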

J.K. Petersen

Copyright © 2018 J.K. Petersen, All Rights Reserved

Le’go my Lego

I’ve frequently posted that Latin scribal conventions are flexible. They are like Lego® blocks that can be taken apart and recombined in many ways. This is essential to understanding Voynich text. Looking for similar shapes is not enough; one has to consider the combinatorial and positional dynamics of medieval text in relation to the VMS.

Basic Building Blocks

I’ve already discussed these shapes in previous blogs, but I’m going to post again with additional examples. This is one of the most common Latin scribal conventions, frequently used to represent the syllables “ris”, “tis”, and “cis”.

The glyphs below are from two scribes with different handwriting styles so you can see how the abbreviations vary from hand to hand, but can still be understood by the reader:

Some parts of the ris/tis/cis abbreviation are meaningful and some can be changed to suit the person’s handwriting or aesthetic sensibilities. This is one of the reasons why it’s important for the dynamics of scribal abbreviations to be understood—meaningful and non-meaningful variations in glyph-shapes need to be differentiated to create accurate VMS transcripts.

For ris/tis/cis, the left side of the glyph defines the meaning (although ris and tis can sometimes be hard to tell apart in 15th-century documents and tis and cis can be hard to tell apart in 13th-century documents). In the example below, it’s quite clear that the left side is “r” for ris, but the shape of the right side can be varied, as long as there is a small loop with some kind of tail. In this 15th-century addition to an early medieval calendar, the tail is long and swings left:

 

A further example (c. 16th century), shows that the ris/tis/cis abbreviations were still used during the Renaissance and early modern period, and could vary in shape, as long as the meaning was clear from the context. Note here that “tis” looks like a small EVA-k at the end of the word hereditatis. It has a straight stem (no foot as in earlier script styles) and a short tail, and yet is perfectly understandable:

Here’s an example in a different 15th-century hand in which ris has a foot and so, to distinguish it from tis, the scribe has given tis a bigger foot and a subtly more rounded back. Also notice how cis is rounded to the point that it looks like EVA-d with a tail. This cis-variation is also found in the VMS and might be a hand-variant of the cis-shaped glyph or might be a separate glyph:

This example, from yet another 15th-century scribe, shows tis and cis in context. Even if you don’t know medieval Latin, you can probably figure out that the word on the right is “dulcis”. Note how the tis and cis are at the ends of the words. This is how they are usually positioned in languages that use Latin conventions and in the VMS, but they can be placed elsewhere if desired:

Similar Shapes in the VMS

Now I’d like to point out a snippet from the VMS that suggests the Latin “is” convention was probably known to the scribe or designer, because it is not always written like EVA-k and EVA-m in the VMS. A right-facing  loop with a stem is sometimes attached to other letters.

On the left, an “is” shape has been attached to EVA-a or, alternately, it might be a straight-leg EVA-d attached to the right side of a bench. Either way, EVA-d has a variant form with a straight leg that is similar to Latin short-leg cis. I don’t have a clip handy, but  “is” also appears next to other glyphs such as EVA-y.

In the example below-right, both ris and cis shapes are visible, but what you might overlook if you are not familiar with Latin scribal variations, is that the character in the middle of the second line is drawn a little differently. This might be a differently-written ris (different from others in the VMS), OR if one were reading this as a Latin scribal glyph, it could also be interpreted as EVA-i with the looped shape added:

Summary

After I wrote a series of blogs on Latin scribal conventions there was a mini-flood of Voynich “solutions” that made the headlines, all claiming that the VMS was abbreviated Latin. They looked suspicious to me because the researchers clearly didn’t understand how to use these conventions correctly and thus could not have come up with the idea independently.

Here are some of the major problems with these “solutions”:

  • Latin scribal abbreviations don’t automatically mean Latin language. Latin conventions were used in all major European languages. Even some of the west Asian languages apply some of the same concepts. The VMS might turn out to be Latin, but not in the simplistic way these researchers translated the VMS text. And it might not be Latin—the same scribal conventions are used in Spanish, French, English, German, Czech, Italian, and other languages. Sometimes the same shape means the same thing in different languages, and sometimes it is adjusted to fit the spelling or grammar of the host language (for example, a squiggle-apostrophe might mean -er in English and -re in French).
  • Most of the proposed “solutions” are nonsense words, not Latin. They might include one correct Latin word out of every 20 or 30 words (more by accident than by design) and they might look a bit like Latin, but it’s not Latin vocabulary, it is certainly not Latin word-order or word-frequency, and it definitely isn’t Latin grammar (not even note-form grammar).
  • Those offering the solutions expanded the abbreviations incorrectly—that was the give-away that they hadn’t really done their homework and probably got the idea from somewhere else. Saying that it’s Latin is easy, anyone can do it—demonstrating that it’s Latin when you don’t know scribal conventions makes it really obvious to those who can read medieval script that the person offering the solution is not even superficially familiar with the basics. One of the solutions, published in a prominent newspaper, expanded one of the most simple and common Latin abbreviations the wrong way (he forgot to take into consideration the position of the abbreviation in the word). Another solution didn’t recognize the fact that Latin scribal glyphs have many different meanings, not just one, and that the meaning is inferred from context. The solver expanded them all the same way which, once again, generates nonsense rather than natural language.

The VMS might be natural language and it might not.  If it is natural language, it’s possible scribal shapes are different from alphabetical shapes. The majority of computational attacks do not take this into consideration, or the fact that if there are scribal shapes, they might need to be expanded.

So, I will continue to post examples, as I have time. Perhaps a better familiarity with medieval conventions will help researchers understand which glyphs might be combination glyphs, which parts might need to be expanded (if the glyphs do, in fact, represent abbreviations and ligatures), and how VMS transcripts could best  be adjusted to provide a more accurate rendering of the text.

J.K. Petersen

© Copyright 2018 J.K. Petersen, All Rights Reserved

Running the Numbers

Lambert St. Omer’s Liber Floridus is a delightful volume that I saw years ago, but which always draws me back because of a number of haunting resonances with the VMS.

Liber Floridus rotum with text and face

Guelf 1 Gud Lat [Wolfenbüttel Digital Library]

Note, for example, the simple palette, rotum-with-face, and text written in circles. It’s a book of knowledge that covers many subjects, including Creation, sun and moon charts, ancient myths, former rulers, a wonderfully decorative (but very short) section on plants, real and mythical beasts (including a road-kill lion intended to represent a crocodile), Jerusalem, a lapidary, a history of nations, genealogies, alphabets, and number systems.

If you glance through it, you will see that the calendars and timelines are laboriously indexed with Roman numerals:

index written in Roman numerals

In previous blogs, I’ve written about ancient languages that used letters for numbers instead of separate glyphs for numbers. Similarly, the Romans used the letters M, D, C, X, V, I as both letters and numbers. Sometimes dots or a line were drawn above the characters so they would not be mistaken for letters. In some languages, the placement of the dots (or the number of dots) indicated the place-value of the character (e.g., hundreds or thousands).

Most people probably zoom past the Liber Floridus chart on folio 55v, but it’s worth a second look because it describes the correspondence between Greek and Roman numerals.

Glance through it and you can see that the Greek system is more compact. For example, the single character tau (“t”) represents three hundred, which would be written as three characters (CCC) in Roman. It’s a trade-off… Greek takes less space to write (and probably saves wear and tear on the scribal hand), but requires more memorization.
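The trade-off can be sketched in a few lines. This is only an illustration, assuming the standard Greek alphabetic numerals for 1 to 999 (including the archaic stigma, koppa, and sampi) and a purely additive Roman notation. The function names are my own:

```python
# Greek alphabetic numerals: one letter each for units, tens, and hundreds.
UNITS = "αβγδεϛζηθ"      # 1-9  (ϛ = stigma, 6)
TENS = "ικλμνξοπϙ"       # 10-90 (ϙ = koppa, 90)
HUNDREDS = "ρστυφχψωϡ"   # 100-900 (ϡ = sampi, 900)

ROMAN = [(1000, "M"), (500, "D"), (100, "C"), (50, "L"), (10, "X"), (5, "V"), (1, "I")]

def to_greek(n):
    """Write 1..999 with at most one letter per decimal place."""
    if not 1 <= n <= 999:
        raise ValueError("this sketch only covers 1 to 999")
    h, t, u = n // 100, (n // 10) % 10, n % 10
    return ((HUNDREDS[h - 1] if h else "")
            + (TENS[t - 1] if t else "")
            + (UNITS[u - 1] if u else ""))

def to_roman_additive(n):
    """Write n in purely additive Roman style (9 becomes VIIII, not IX)."""
    out = []
    for value, letter in ROMAN:
        count, n = divmod(n, value)
        out.append(letter * count)
    return "".join(out)

for n in (300, 888):
    print(n, to_greek(n), to_roman_additive(n))
```

Greek needs 27 distinct symbols to cover 1 to 999 but never more than three characters per number. The additive Roman form needs only a handful of symbols but can run to a dozen characters.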

Note that early Roman Numerals were “additive” in the sense that the number 9 was sometimes written as VIIII (rather than as IX):

additive, positional character of Roman numerals
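For anyone tallying such indexes, a reader has to accept both spellings. The following is a minimal sketch (the function name is my own, and it makes no attempt to reject malformed numerals) that treats a smaller symbol before a larger one as subtractive and everything else as additive:

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Read a Roman numeral written in either additive or subtractive style."""
    values = [VALUES[ch] for ch in numeral.upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller symbol directly before a larger one is subtracted (IX);
        # otherwise the symbol is simply added (VIIII).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total

print(roman_to_int("VIIII"), roman_to_int("IX"))
```

Both spellings come out as 9, which shows that the position of a symbol relative to its neighbors can change its contribution to the total.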

As a small detail of interest to VMS researchers, the “a” in the left column is written 1) with a crossbar on the top rather than in the middle, and 2) without a crossbar. These forms are quite common in Coptic Greek and also in a number of early Latin texts. This shape was sometimes used for the letter “a” and was almost indistinguishable from the Indo-Arabic number 7 or Greek lappa/lambda. This shape is found in the VMS, as well, as one of the uncommon characters.

I’ve already pointed out similarities between the VMS and Roman Numerals in a previous blog, and how benched characters represented numbers in Greek and early medieval Latin, so I won’t repeat them here, but I would like to reiterate that patterns c, cc, ccc, and even cccc all occur in the VMS. Here are a few examples:

c-shapes in the Voynich manuscript Voynichese glyphs

Linguistically, one doesn’t usually expect to see the same letter three or four times in a row, but there could be a number of explanations for the uncommon “ccc” and “cccc” patterns. They might represent

  • the end of one token concatenated to the beginning of another, or
  • symbols, modifiers, or numbers (similar to Roman numerals), or
  • minims disguised as curves, or
  • biglyphs. I have mentioned in previous blogs that “cc” represented the letter “a” in early medieval texts and, if superscripted, sometimes referred to the “u” sound (especially when combined with “q”), so biglyphs that stood for a single letter did exist in early Latin. Thus, “ccc” could represent “ac” or “ca” in early medieval Latin, and “cccc” could represent “cac” or “aa”. Note how some of the VMS “cc” combinations are more tightly coupled than others.
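The biglyph possibility in the last point can be sketched as a small enumeration. This assumes only the single rule mentioned above (a “cc” pair may be read as “a”, while a lone “c” stays “c”). The function name is my own and it says nothing about which reading is correct in any given token:

```python
def readings(run):
    """List every way to read a run of c-shapes if any "cc" pair may stand for "a"."""
    if not run:
        return [""]
    # Option 1: read the first shape on its own.
    out = [run[0] + rest for rest in readings(run[1:])]
    # Option 2: read the first two shapes together as the biglyph "a".
    if run.startswith("cc"):
        out += ["a" + rest for rest in readings(run[2:])]
    return out

print(readings("ccc"))
print(readings("cccc"))
```

Even under this one rule, “ccc” has three possible readings and “cccc” has five, so how tightly the shapes are coupled on the page may be the only clue to the intended grouping.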

A Century of Transition

By the late 14th century, Roman Numerals were giving way to Indic-Arabic numerals, the system that was used to write the VMS quire numbers. Several of the VMS glyphs in the main text are the same shapes as Latin numerals, including 4, 7, 8, and 9:

Voynich Manuscript similarity to Indic-Arabic numerals

In Latin, all of the VMS “numeral” shapes can double as letters:

  • The “7” that looks like a caret sometimes stood for “a”.
  • The “8” often stood for “s” or “d”.
  • The “9” symbol could be a number or a very common Latin abbreviation used at the beginnings and ends of words.
  • In early medieval Latin texts, EVA-l (numeral “4”) was used as an abbreviation symbol, a convention that had mostly disappeared by the 15th century.

Thus, three of these four glyphs were in common use as both letters and as numbers in the 15th century, which makes it more difficult to guess how to interpret them in the VMS. When you factor in the position-dependency of Roman numerals, the similarity to the construction of VMS tokens cannot be easily dismissed.

Transitional Greco-Latin Forms

For centuries, Latin manuscripts made use of Greek scribal conventions.

Here is an example of a notation that might help relate some of the examples I have shown in previous articles to those I have posted above. Other writers almost never comment on these marks, partly because they are concentrating on pictures and the primary text, and partly because most people can’t read them. I am, however, fascinated by snippets in the margins because many of them shed light on medieval conventions.

From a 14th-century book of statutes (Berkeley HM 923), note the marginal notation on the left:

If you are unsure of what this represents, I’ll break it down for you…

That’s not a “y” at the end, the resemblance is superficial—these marks stand for a Greco-Latin numeral (I say Greco-Latin because the Roman numerals are organized in the Greek style). The strokes at the end represent four “i” characters (the number 4)—it was tradition to lengthen the tail of the last “i”.

Now look at the left side. Notice how there are three characters that resemble “r”? Each one stands for Greek rho, the symbol for 100, Romanized to look like a Latin “r”. Note the line through the characters similar to the crossbar on benched gallows in the VMS.

benched rho character on Greek coin

Benched rho char on a Greek coin.

I have shown examples of this form of benching in earlier blogs and I include an additional example of a benched-rho on a Greek coin (right). Note also the slash on the right.

Adaptation of Greek forms wasn’t always perfect. In a few Latin manuscripts of the 13th century, a benched-P-shape sometimes diverged from its Greek roots to represent the number 4 instead of 100. This may have been a corruption of Greek delta (4) which was sometimes written with extra descending hooks that resembled a bench. Thus benched-rho and delta with hooks might have looked confusingly similar to a Latin scribe.  Greek theta (number 9) was also sometimes written with hooks to resemble a benched character. In other words, when manuscripts were translated from Greek to Latin, sometimes the original meaning was retained, and sometimes only the shape remained. In either case, the result was a traditional bench shape associated with a number.

Latin manuscripts of the 13th and 14th centuries sometimes had a single “c” on the left (rather than on both sides) for the number 4 (Ref. Gerlandus De Abaco and BNF Ms Lat Fds St Victor 522). In fact, it’s hard to tell if they are meant to be benched rho, or a Latinized rho with only the left part of the bench. Note these examples in the VMS that show a similar dynamic in terms of the half-bench:

half-bench glyphs or long-cee in Voynichese

In languages that use Latin characters, crossbars and benching were applied to both numbers and words.

The Mysterious Foot on Some of the VMS Ascenders

Here comes the important part… As far as I know, there is no previous explanation of this glyph-shape by any VMS researcher…

Look at those tiny tails on the bottom of the “r” chars in the Huntington marginal example. The manuscript was written in early Anglicana script, which has many right-swooping flourishes on descenders, so this would be easy to overlook as an embellishment, but if you are familiar with the Greco-Latin evolution of numerals, you know that these are not flourishes, but “c” characters, the Roman glyph for 100.

The two most commonly benched chars in Latin manuscripts using Greek conventions were rho (which resembles EVA-p when written in the Greek style) and tau (which could possibly be the inspiration for EVA-t). Here are examples of the same “c” shape in the same position in the VMS.

Voynich Manuscript gallows characters with unusual stems

As has been mentioned, Latin scribes did not always retain the full meaning of the text they were copying. In Greek, when numbers were combined, they usually had additive or multiplicative effects. In Latin, the combination of characters didn’t always add up mathematically in the same way as they might in Greek. You have to look at context to figure out whether the combination of rho and “c” is simply a way of relating the Greek and Latin numerals as the same thing, or whether they must be multiplied to represent a larger number.

Greek method of benching and stacking characters: Prodromos

The Greek convention of combining shapes wasn’t limited to numbers. The system was used for letters as well, especially when a name or word had to fit in a constrained space. As an example, look at the characters on the left.

The first letter is “o” and you might think the second one that looks like a “p” is rho, but in fact the next letter down (the one that looks like a bench but which represents pi when read as a letter) goes first, and the letter it crosses comes next, so that we have “o pr odro” finished off with an angular slash across the base of the bottom rho, which represents a common ending (in this case “mos”). Together it reads “o Prodromos”, which refers to John the Baptist. When writing Greek acrophonic numbers, certain rules of precedence were followed, so the reader would know which parts stood for tens, hundreds, or thousands. The rules for text were looser, however, since a name or word could usually be puzzled out by context, and aesthetics sometimes came into play, as well.

Summary

I have much more information on this topic but unfortunately, I’ve run out of time and this is already long, so I will continue later.

J.K. Petersen

© 2018 J.K. Petersen, All Rights Reserved

The Pitfalls of Unhinging Pairs

An article posted today by Marco Ponzi alerted me to some VMS text analyses I have never seen before. A Google search revealed that several researchers have studied vowel placement in natural languages and some of those concepts were used to analyze VMS text (e.g., W.F. Bennett and linguist Jacques Guy).

So I looked up Guy’s papers, read through the first one, and am stating right up front that no matter how often one runs the numbers, vowel-identification analysis of individual glyphs in the VMS is not going to work as it does for natural languages. In this blog, I’ll explain why.

Background

For those interested in computational attacks and historical precedents, here is a summary of my Google search for the VMS-related work of Jacques Guy:

  • Cryptologia 1991 (Issue 3), “Statistical Properties of Two Folios of the Voynich Manuscript” by Jacques B. M. Guy, in which he analyzes folios 79v and 80r as to letter frequency in terms of both word and line placement, and co-occurrence, with a “tentative phonetic categorization of letters into vowels and consonants using Sukhotin’s algorithm”. This was republished online in June 2010.
  • Cryptologia 1992 (Issue 2), “The Application of Sukhotin’s Algorithm to Certain Non-English Languages” by George T. Sassoon, in which he applies Sukhotin’s vowel-finding algorithm to a number of languages, including Georgian, Croatian, and Hebrew. This was republished online in June 2010.
  • Cryptologia 1992 (Issue 3), “A Comparison of Vowel Identification Methods” by Caxton C. Foster, in which he compares four methods of vowel identification. This was republished online in June 2010.
  • Cryptologia 1997 (Issue 1), “The Distribution of Signs c and o in the Voynich manuscript: Evidence for a Real Language?” by Jacques B. M. Guy, in which he references Currier A and B and builds on the Sukhotin identification of c and o as vowels and speculates as to whether they might represent o and e, “which they resemble in shape”, “a phenomenon similar to that shown by Standard English and Scots English…”. This was republished online in June 2010.

There isn’t room to analyze all these papers in a single blog, so I will confine this article to vowel identification. Sukhotin’s algorithm is given by Guy in the first paper as follows:

“Given a text in a supposed unknown language written in some alphabetical system, Sukhotin’s algorithm identifies which symbols of the alphabet are likely to denote consonants and which vowels.”
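For readers who want to experiment, here is a minimal Python sketch of Sukhotin’s algorithm as it is commonly described. The quote above gives only the purpose of the algorithm, so the steps below are not taken from Guy’s paper. The function name `sukhotin_vowels` is made up for illustration, and ties between equally scored symbols are broken alphabetically, which is an arbitrary choice.

```python
from collections import defaultdict

def sukhotin_vowels(words):
    """Return the symbols Sukhotin's algorithm classifies as vowels.

    Commonly described procedure (a sketch, not Guy's own code):
    1. Count how often each pair of different symbols is adjacent.
    2. Start with every symbol classed as a consonant.
    3. Repeatedly promote the consonant with the largest positive
       adjacency sum to vowel, then subtract twice its adjacency
       count from every remaining consonant.
    """
    words = list(words)
    adjacency = defaultdict(lambda: defaultdict(int))
    for word in words:
        for left, right in zip(word, word[1:]):
            if left != right:  # doubled symbols are ignored (zero diagonal)
                adjacency[left][right] += 1
                adjacency[right][left] += 1

    symbols = {ch for word in words for ch in word}
    sums = {ch: sum(adjacency[ch].values()) for ch in symbols}

    vowels = []
    while sums:
        # sorted() makes the tie-break alphabetical (an arbitrary assumption)
        best = max(sorted(sums), key=lambda ch: sums[ch])
        if sums[best] <= 0:
            break
        vowels.append(best)
        del sums[best]
        for ch in sums:
            sums[ch] -= 2 * adjacency[best][ch]
    return vowels

print(sukhotin_vowels(["papa", "mama", "pipi"]))  # ['a', 'i']
```

Note that the procedure looks only at which symbols sit next to each other. It knows nothing about where a symbol falls within a word or whether two symbols always travel together.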

Guy then illustrates glyph-assignments in W. F. Bennett’s transcription system from the 1970s from which his own glyph-assignments are derived:

As you can see, it’s very similar to the current EVA system.

Guy then illustrates his transcription system, which is very similar to Bennett’s, except the VMS double-cee shape is acknowledged, and some ligature-like VMS glyphs are transcribed as single characters (a decision that can be debated but may not have a significant impact on vowel-analysis statistics):

Is the Above Research Based on a Flawed Premise?

Guy’s research into VMS vowels, and that of many others, neglects important co-relationships in VMS glyphs. It’s not enough to come up with a transcription alphabet and then run frequency analysis software on individual glyphs (whether it is specific to vowels or not) if there is evidence of biglyphs or positional dependence.

Many researchers accept the spaces in the VMS as word boundaries and, for the most part, assume 1-to-1 correspondences between alphabetical letters and glyphs perceived to be vowels (either by humans or by the software). Guy does not acknowledge biglyphs in any significant way other than VMS “cc”, which has a historical precedent in early Medieval Latin texts of representing the vowel “a”.

Side note: In early medieval text, “cc” usually stood for “a”, but if “cc” was superscripted, the vowel “u” was usually intended. The “cc” = “u” reading is not mentioned in Guy’s paper. He only references “a” and may not have known medieval Latin well enough to know that “cc” could also stand for “u” or the consonant “t”. He does note, on page 210, that what we call EVA-e “is always followed by {t}” but he’s referring to EVA-ch, not the “c” shape in general, which is followed by a number of glyphs.

Guy must have been working from a flawed transcript because he acknowledges “cc” and “ccc” but completely neglects “cccc”, a pattern that unambiguously occurs a number of times in the VMS, but which is almost entirely ignored in most transcripts (including the popular Takahashi transcript). But even this may not be a significant hindrance to analyzing text patterns.

A perplexing statement in Guy’s paper is that he acknowledges the ligature-like nature of some of the VMS glyphs but then partially negates this by stating: “It is certain, then, that {ct} and {et} represent single letters.” By “ct” and “et”, he is referring to EVA-ch and EVA-sh and I think there are many who would debate the certainty of this assertion.

Pairing-Patterns in VMS Text

I have described glyph co-relationships in previous blogs, including biglyphs, Janus pairs, positional constraints, and the “rules” for reproducing samples of VMS text from a conceptual basis and with examples, but this time I have decided to keep it simple and use just one glyph to get the message across in a more explicit way.

As an example of co-relationships that could significantly alter the results of computational attacks, I will refer to the backleaning glyph expressed in most transcripts as the Latin letter “i” (this glyph also appears on the third line of folio 116v in the middle of a word that resembles Latin “vix”, but this appears to be an exception).

Guy analyzed two different transcripts and, in his results charts, identifies the “i” glyph as vowel #6.

Here is a detail-clip of Guy’s results from one of the transcripts he used for analysis. The chart was published in Cryptologia, 1991, Issue 3, p. 211. Glyphs that have been statistically analyzed as candidates for vowels are listed in the left-hand column. You can see quite clearly that EVA-i appears only in the medial position:

The Crux of the Matter

The problem with analyzing (or perceiving) the “i”-glyph as a vowel (or any individual letter) is that it is only positioned in one way in the VMS.

This is in stark contrast to natural languages, where vowels are found in many positions, both 1) in relation to other letters and 2) within a word.

I call EVA-i the “pivot” glyph due to its position in tokens and the way specific glyphs precede or follow it and I have not been able to find any natural language in which any vowel is always preceded by the same letter, and is always in the medial position, as in the VMS. This is why I constantly refer to the “rule-based” and “positionally constrained” nature of VMS text.

Looking More Closely at the Rules for EVA-i

With the exception of folio 116v (which may be marginalia in another hand), the backleaning-i cannot stand alone and must be preceded by “a”. It can only be followed by certain specific glyphs or glyph-groups.

VMS “i” differs from “o” and “a” (identified as vowels #1 and 3 in Guy’s chart) in that “o” and “a” can be paired with other glyphs and can move around somewhat (and can appear at the beginnings of tokens), whereas the “i” glyph cannot.
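The positional contrast between “i”, “o”, and “a” can be checked against any transcript with a few lines of Python. This is only a sketch: the function name `positional_profile` is hypothetical, and the token list is a made-up toy sample in EVA-style spelling, not real transcript data.

```python
from collections import Counter

def positional_profile(tokens, glyph):
    """Count where `glyph` falls within each token: initial, medial, or final."""
    profile = Counter()
    for token in tokens:
        last = len(token) - 1
        for index, ch in enumerate(token):
            if ch != glyph:
                continue
            if index == 0:
                profile["initial"] += 1  # includes one-glyph tokens
            elif index == last:
                profile["final"] += 1
            else:
                profile["medial"] += 1
    return profile

# Hypothetical toy sample, for illustration only.
sample = ["daiin", "okaiin", "dair", "otol", "aror", "chol", "qoky"]
for glyph in "ioa":
    print(glyph, dict(positional_profile(sample, glyph)))
```

On a real transcript, a vowel in a natural language would normally show counts in more than one of the three positions.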

Before one argues that “i” can be preceded by another “i”, consider this…

The minims that appear after “ai” in “dain” are commonly transcribed as “n”, “m”, “v”, and there are many places in the VMS where the scribes have written them with a curved connector similar to the shape “u” (in contrast to most glyphs in the VMS, which are not connected), so the general feeling is that the minims that follow “ai” are not necessarily additional “i” shapes. Guy also acknowledges this ambiguity on page 210, and Bennett transcribes the minims as “m”, “n” or “u”.

After going through every glyph in the VMS numerous times when I was creating my transcripts, I record them as follows:

The glyph-pair “ai” occurs almost 6,000 times in the VMS. If you run the numbers on popular transcripts, you may get different results, because the distinction between minims is not recognized in many of them (as discussed in the above chart).

But even if the minims are interpreted in a completely different way from the way I have drawn them above, even if one supposes all the minims are the same glyphs, the essential problem remains… the minims, as a group, must be preceded by “a” and do not occur at the beginnings of tokens, only together, and only at the end. The argument for “i” being a vowel becomes even weaker!

Guy acknowledges the possibility that the “i” strokes might be minims (or something else), but nowhere does he address the important fact that however one transcribes them, individually or as a group, “a” must precede them and they are never at the beginnings of words. These characteristics, taken together, are why we must question their interpretation as vowels.
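One way to test the “as a group” reading is to treat each unbroken run of minims as a single unit and record what comes immediately before it. The Python sketch below assumes an EVA-style transcript in which minims are written as “i”. The pattern, the function name `minim_run_contexts`, and the toy tokens are all assumptions for illustration. Adjust the pattern if your transcript records the trailing minims differently.

```python
import re
from collections import Counter

# Assumption: a minim run is one or more consecutive "i" characters.
MINIM_RUN = re.compile(r"i+")

def minim_run_contexts(tokens):
    """Count the glyph that precedes each run of minims.

    A run at the very start of a token is counted under "^".
    """
    before = Counter()
    for token in tokens:
        for match in MINIM_RUN.finditer(token):
            start = match.start()
            before["^" if start == 0 else token[start - 1]] += 1
    return before

# Hypothetical toy sample, for illustration only.
sample = ["daiin", "aiin", "okaiin", "dair", "chol", "saiin"]
print(dict(minim_run_contexts(sample)))  # {'a': 5}
```

If the description above holds for a full transcript, the count under “a” should dwarf every other key, and the “^” key should be empty or nearly so.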

Summary

Early researchers like Bennett and Guy were working with low-resolution B&W photostats, so some of their misconceptions can be forgiven, but ignoring glyph-placement is hard to excuse. Even if you can’t see fine details of individual letters, there’s no mistaking the following properties of EVA-i:

  • EVA-i is virtually always preceded by “a” (there are 7 rare instances, only 1/10th of 1%, of an “r” being inserted between two minims, and one is especially strange as it is in different handwriting, is out of line, and is proportioned differently),
  • EVA-i never appears at the beginnings of words,
  • EVA-i rarely appears at the ends of words unless all the minims are assumed to be the same, and then they are always at the end, and
  • only certain specific glyphs follow “i” (I’ll include further details in a future blog).

The only way to resolve “i” into a natural-language vowel is to consider additional ways in which the text might be manipulated. For example, one would have to manipulate spaces or letter-order, assign the same letter to multiple glyphs, or perhaps assume there are “hidden” minims, which is starting to stretch things a bit far. As it stands, if the spaces and letter-arrangement are taken as literal, and the glyph-assignments are considered consistent throughout the document, there’s not much evidence to support EVA-i as a vowel using the form of analysis proposed by Guy.

One might try to relate “ai” to “qu” in the sense that they are often found together and “q” rarely exists without “u”, but “qu” occurs in many different locations in a word, as do most common pairs of letters in most languages. The analogy doesn’t hold.

The Consequences

The behavior of EVA-i also affects the interpretation of EVA-a. If “ai” turns out to be a biglyph, then other instances of “a” have to be evaluated separately from “a” + “i” and the statistics will change, as will the number of glyphs that form the Voynichese “alphabet”. If there are other biglyphs (which I believe there are, as described in my blog about Janus pairs), then all the single-character computational attacks and early “vowel-assignment” research needs to be re-visited.
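To see how much the statistics shift, you can count glyphs twice, once character by character and once with “ai” treated as a single unit. The Python sketch below uses a hypothetical function name (`glyph_counts`) and a made-up toy sample. It illustrates the bookkeeping only and is not a result from any real transcript.

```python
from collections import Counter

def glyph_counts(tokens, biglyphs=()):
    """Count glyphs in `tokens`, treating any pair listed in `biglyphs` as one unit."""
    counts = Counter()
    for token in tokens:
        i = 0
        while i < len(token):
            pair = token[i:i + 2]
            if pair in biglyphs:
                counts[pair] += 1
                i += 2
            else:
                counts[token[i]] += 1
                i += 1
    return counts

# Hypothetical toy sample, for illustration only.
sample = ["daiin", "okaiin", "dar", "otal", "aiin"]
single = glyph_counts(sample)
merged = glyph_counts(sample, biglyphs=("ai",))

print("a alone:", single["a"], "->", merged["a"])   # a alone: 5 -> 2
print("alphabet size:", len(single), "->", len(merged))  # alphabet size: 9 -> 10
```

In this toy sample, more than half the instances of “a” disappear into the pair and the alphabet grows by one symbol, so any single-glyph frequency or adjacency table built on the unmerged text would have to be recomputed.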

If, on the other hand, “ai” is not a biglyph, EVA-i is still problematic because one has to ask, “What is the purpose of the preceding “a”? or of “i” itself?” It’s possible that neither EVA-i, nor some instances of EVA-a, are vowels. They might not even be letters, but even if they are, they cannot be statistically evaluated without taking into consideration the relationship of “i” to its companion.

J.K. Petersen

Copyright © 2018 J.K. Petersen, All Rights Reserved

Latin’s “Om-age” to Indic Numerals

5 November 2017

Most people don’t think of Indic and Latin scripts as similar, but the links between east and west are old and deep and medieval Latin script is not the same as modern Latin.

When I first discovered VMS glyphs, I scoured foreign alphabets for the origins of some of the less familiar characters. I already knew the Latin alphabet, some of the runic scripts, the Cyrillic and Hebrew alphabets, the rudiments of Korean, a little bit of Russian and Japanese (and a tiny bit of Chinese), some Coptic Greek, a few Greek numeral systems, and a smattering of Malaysian alphabets, but no matter how hard I searched, none of them, except Latin (combined with a small percentage of Greek), seemed to match a high proportion of the VMS glyphs.

I also searched plant-related words in Baltic and Turkic languages. Unfortunately, I haven’t had time to study Finnish, Czech, or Silesian, but they’re on my list.

Just to be sure I hadn’t missed anything, I explored several other alphabets from languages I thought had potential, including Georgian, Armenian, Amharic/Ge’ez, Syrian, and Sanskrit/Gujarati/Nagari (the word Devanagari did not exist in the middle ages) and… once again was led back to Latin, but with a better understanding of how Latin, Greek, and Indic script were more similar in the Middle Ages than they are now.

Western Presence in Eastern Lands

In ancient times, the Greeks and Romans occupied Pakistan and made forays into northern India. Alexander the Great, the Kushana peoples, and the Persians all left their mark, and absorbed certain aspects of Indic culture. There were numerous Indic coins that included Greek letters and numbers long after Greek occupation had subsided.

I couldn’t help noticing that “Arabic” numerals, as they were used by Latin scribes in the 14th and 15th centuries, resemble Indic numerals more than Arabic, and I subsequently saw the credit line in Latin, in the Codex Vigilanus (Spain, 976), attributing the number system to the Indians.

The earliest-known Indian numerals in a European manuscript are in the Codex Vigilanus (976 CE). It’s possible the manuscript reached the Spaniards through Arabic traders, thus leading to the “Arabic” moniker.

Leonardo of Pisa, now known as Fibonacci, appears to have independently discovered the Indic number system that was documented in Spain two centuries earlier. While traveling in Bugia, North Africa, with his father, he observed the notation system and calculations used by Muslim traders. When he returned to Pisa, he wrote Liber abbaci “Book of Calculation”, which included the Indic numerals. There are no copies of the original, completed in 1202, but a number of copies of Fibonacci’s enlarged 1228 edition survive.

The following is from a copy of Fibonacci’s book, believed to be from the late 13th century (BAV Pal. Lat. 1343). Like the Spanish manuscript, it introduced the numeral system that became popular until the 15th century, when slightly rotated glyphs for 4 and 7 and a more curled 5 evolved into our modern system:

Despite widespread acclaim for Fibonacci’s 13th-century manuscript on computation, change occurred slowly, and Roman numerals did not significantly give way until the 15th century when more flexible calculations were needed for scientific studies.

 

Latin Conventions in Medieval Scripts

Researchers often miss similarities between VMS glyphs and Latin because medieval scribes used many ligatures and abbreviations that are not taught in modern Latin. These were as integral as the letters themselves, and it’s hard to find late-medieval manuscripts without them.

Before describing similarities between Latin and Indic scripts, it’s important to understand how Latin is more than just an alphabet. You’ll note in the examples that follow that several of these scribal conventions are apparent in Voynichese.

Example #1

The first sample (BNF Lat 731) is lightly abbreviated. It uses some of the more common Latin conventions, including quibus, per, et, tails on the ends of words that loop back over the previous letters to indicate missing letters (it’s like an attached apostrophe), and caps over other letters to serve much the same purpose when the missing letters are closer to the middle of the word than the end.

Notice that loop-back tails and caps are common in the VMS, and that the abbreviation symbol that resembles a “2” or back-leaning “r” is, as well.

Example #2

The second example (BSB CLM 29505) also uses very common conventions, but not identical to the previous example. Scribes were free to pick and choose what was convenient because they were interpreted by context.

In this example, we see the common symbol for “Item” (at the beginnings of lines)—it resembles EVA-k; the macron or “cap” that indicates missing letters; the swooped-back tail at the ends of words (also missing letters); g° to stand for degree (grado/grade); a squiggly line over the “e”, which usually indicates a missing “r” or “er” “ir” or “re” (again, depending on context). Note that this is similar to the squiggle on the red weirdo on VMS 1r.

The loop on “item” is also used at the ends of words to represent “is” with the Latin suffixes -ris/-cis/-tis being drawn like EVA-m.

Notice also the tail on the “r” on the last line. This tail wasn’t always added to “r”, sometimes it was added to “i”, so one has to read for context to know which letter was intended. Take note that the shape of the tail sometimes indicates specifically which letters are missing (I’ll come back to that later), but not all scribes distinguished the missing letters by shape.

Thus, there are four scribal conventions in this small sample that are found as VMS glyphs:

Example #3

The third example (Ms San 827) makes slightly more frequent use of abbreviations, but they are still very common ones and easily readable.

In sample #3, note the lines and caps over the letters to indicate missing letters, the curled tail on the “p” to stand for “pro”, and the symbol that resembles a “2”, which sometimes means “et” (and) but often means -ur or -tur.

On the fourth and fifth lines, you will see the “9” symbol at the beginning of one word and the end of another. At the beginning, in this example, it stands for “con-“. At the end it is usually “-us” or “-um”. This is one of the most common glyph-shapes in the VMS and, as in Latin, it is usually at the end, but sometimes at the beginning:

Example #4

The above examples are all from the 15th century, but conventions were similar in the 11th to 14th centuries, leading up to the creation of the VMS. The following earlier text (OBV SG 21), uses all of the same concepts and most of the same conventions:

Thus, with four brief samples, and the numerals that evolved from Greek that were mentioned in a previous blog, we can account for the majority of glyphs in the VMS.

The problem is not in relating the VMS glyph-shapes to Latin letters, ligatures, and abbreviations—the similarities are numerous and obvious—the difficulty is in determining their meaning because VMS tokens do not, in general, behave like Latin or the majority of natural languages in terms of the variability of the words or the characters within the words. Here are some important differences:

  • In Latin scripts used for a variety of languages, abbreviation symbols can be associated with many different letters. In the VMS we see caps only on EVA-sh and occasionally EVA-q.
  • In Latin, the swept-back tail is found on almost any character where letters have been omitted near the end of a word. In the VMS, it is specific to EVA-e, EVA-r, and the last glyph in “daiin”.
  • The “9” symbol is shaped and positioned the same in both Latin and Voynichese, but in Voynichese it’s much too frequent to mean the same thing as it means in Latin (or other common languages).

So the shapes are similar to Latin, but the extreme repetition and positional rigidity are not.

After the 15th century, abbreviations and ligatures fell out of use, as Latin scholarship was replaced by local languages, and the printing press (and, much later, the typewriter) introduced mechanical limitations that made it difficult to mimic these scribal traditions.

Ties with the Eastern World

So what does all this have to do with the Indian scripts mentioned at the beginning?

Dozens of languages have been mentioned in connection with the VMS, but claiming it’s a specific language is easy. I saw one person claim five different languages in the same week, and another claimed three more in the course of three months. Proving that it’s a specific language is the real challenge, and so far no one has provided a convincing translation of even one paragraph.

I think I know why so many different languages have been proposed for the VMS. It’s partly because expanding or anagramming text expressly turns it into readable text or, if Voynichese is based on natural language, it may be partly because words related to disciplines like science are often loanwords and thus similar in many languages. But this bewildering array of suggested languages might not be entirely imaginary… certain languages did, in fact, have more in common with one another in the Middle Ages than they do now.

As an example, Indo-Iranian writing styles are more similar to medieval Latin than east-Asian character-based scripts like Chinese—both come from proto-Indo-European roots.

The Indo-Greeks and others who subsequently ruled Pakistan kept some of their native customs and adapted others from local culture. They blended pagan gods with Buddhist beliefs and minted bilingual Indo-Greek coins, as in the following example from c. 100 BCE:

[Image courtesy of the Classical Numismatic Group, Inc.]

The Kushana, nomadic peoples from central Asia, at one time ruled a large region that included Afghanistan, parts of Pakistan, and northern India, and almost shared a border with the Romans during Trajan’s and Hadrian’s rules (a coin mould featuring Emperor Hadrian was found in excavations of c. 2 CE artifacts in Rairh, near New Delhi). The Kushana were Indo-Europeans who actively traded with both Rome and China.

This gold coin, probably of Kushan origin, is a testament to multicultural interaction. It was minted in India, inscribed with Greek letters with the ruler on one side and “Boddo” (Buddha) on the other, and was unearthed in Afghanistan. Sometimes Zeus was substituted for Buddha on this style of coin.

[Image courtesy of the British Museum.]

Commonalities with Indo-Iranian Scripts

Please note that I have used Gujarati as an example of glyph similarities, even though it is more recent than Nagari, because it does not have the line across the top (thus making it easier for westerners to read). It is very similar to other Indic scripts if you ignore the top-line and look specifically at the shapes underneath. The following observations apply to a group of related Indic scripts descended from Sanskrit, not specifically to Gujarati.

I’ll start with some of the simpler and more familiar shapes, followed by glyphs with ascenders (gallows characters), because the majority of VMS glyphs are Latin. Only a few that are rarely used (or which show up only once) are distinctly eastern and will be described later.

 

Glyphs with Tails

Voynichese has a number of glyphs with tails, a ubiquitous convention in medieval Latin. Adding a tail to a glyph wasn’t just an embellishment, it was a way to indicate missing letters. In the VMS, the r, c, and minim shapes at the end of the word “daiin” all have distinctive tails. Certain Indic glyphs also have tails, and the shape or length of the tail can change the sound or meaning of a letter.

Here are some interesting patterns in Latin and certain Indic scripts, that may have some relevance to the VMS:

  • EVA-r. In Latin, when a tail is added to “r”, it can mean “rus”, but it often means “re”, “er”, “ra”, “ar”, “ir”, or “ri”. In other words, a vowel is inherently indicated by a tail added to a consonant, as in some of the abugida scripts. Similarly, in the later 13th- and 14th-century Nagari scripts, and in Gujarati, you will see an “r” shape with a curved tail to represent “r” or “ar” or “ra”. There are several places in the VMS where two forms of tails are apparent in the same block of text. In Voynichese, Latin, and Gujarati, the curved tail is more frequent than the extended-loop tail. If Voynichese is anything like Latin, Gujarati, or some of the Malaysian scripts, and not just a smokescreen to make the text look like Latin, then extending the tail and changing its shape changes the meaning of the glyph:
  • EVA-s. In many older Latin scripts, the “t” was written like a “c”, rather than with a straight stem. It can be a struggle to tell them apart. Adding a tail to this c-like tee stood for “te” or “ta” or most combinations of “t” plus a vowel (it can also mean “ter” or “tus”). In Gujarati, the symbol for “ta” is a c with a tail (note that both “r” and “c” shapes with tails are found in the VMS) and some are ambiguous, with a slight hook on the foot, perhaps denoting a third character. In Greek, a c-shape was used as an abbreviation for “kai” (and). Once again, if you look at it from a Latin point of view, the c-shape can also be “e” (many early medieval e-shapes didn’t have a crossbar or hook), and adding the tail turns it into “eius” or “et” for “and” (in fact, if you extend the tail a little more, it becomes an ampersand). Thus, we have a glyph with many meanings. C-tail can be the abbreviation for te, ta, or ter, or for et, eius, or er. In the VMS, as in Latin, this tailed shape, which sometimes resembles c-tail, sometimes e-tail, and sometimes t-tail, is found both individually and within other words.
  • EVA-d. If you look at variations of the thorn character, which is usually associated with northern European scripts, you’ll see some of them are written like a curvy “d” or a Greek sigma with a small bar through the ascender. It may be coincidental, but the Gujarati shape for “tha” is a curvy “d” shape. There’s no line through the stem, but many Latin scribes wrote it that way, and there is a strong association between “d” and “th” sounds in various Indo-European languages. If you round the top loop a little farther, as some scribes did with Latin “d”, thorn, and Greek sigma, it resembles a figure-8. This is why many researchers read the figure-8 on folio 116v as an “s” or “d”, but perhaps “th” should also be considered.

There are analogs to VMS shapes in both medieval Latin and some of the Indic scripts. The “a” and “o” shapes need no explanation—they are distinctly Latin, and “o” is common to many languages.

The simple “c” shape doesn’t tell us much either, because it is found in most alphabets, but two c-shapes tightly joined were used in early-medieval Latin to express “a”, “t”, and sometimes “u”. The double-c is also found in the VMS (right)—a distinction that might be meaningful but is not recognized in most VMS transcripts. In fact, in the Takahashi transcript, which is probably the most widely used, the extra c-shapes are sometimes omitted.

But tails are meaningful in both Latin and Indic languages, and ligatures are common to both. Sometimes the tail changes the letter, sometimes it extends a sound, and sometimes it specifies which vowel is used. Note that Nagari and Gujarati are syllabic languages which might not seem to have much in common with Latin, but medieval Latin script has its share of implied vowels.

A sidenote on abugida scripts… Gujarati is a syllabic language, but not entirely an abugida script (neither is Hebrew). Both Hebrew and Gujarati include a shape for alpha, so it is explicit rather than implied (it’s possible that in ancient languages alpha was more of a glottal stop than a vowel), but most of the time the most common vowel (alpha) is rolled in with the consonant, as it is in a number of Asian and African languages.

In Gujarati, several of the syllables are written as though they were ligatures, with a vertical stem on the right (as in sa, pa, na, and numerous other glyphs). This is technically part of the syllable but can also be thought of as the implied vowel. This vertical line has an additional function: it can be added to the preceding vowel or syllable to lengthen it into a long vowel, as in the following example:

Note how the vertical bar changes a short-a to long-a, a symbolic concept that was mentioned in the previous relative notation blog. A similar convention exists in Modi, another Indic script that is first recorded in the late 14th century.

Some of the commonalities between Latin and Indic scripts disappeared when Latin abbreviations were dropped and Latin was reduced to a simple alphabet.

Summary

I have much more information on this subject and was going to try to cover the Voynichese ascenders and some of the rare characters in the same blog because they also have their roots in scribal conventions, but this is becoming too long, so I will continue with the less common characters in a future installment.

… to be continued…

J.K. Petersen

 

© Copyright 2017 J.K. Petersen, All Rights Reserved