Did Cicco Simonetta Bomb at Code-Breaking?

First a Few Words…

Those who know me know that I actively avoid looking at previous research about the VMS and have probably only read about 1/50th of what is out there. I hate spoilers and movie trailers; I enjoy the journey and the element of surprise.

If a new puzzle or game comes out, something like a Rubik’s cube, then lock me in a room and I’m happy. If you give me a book on how to solve it, or even the smallest of hints, I’m not happy—I want to solve it myself.

If I have an hour to spend reading someone’s analysis of the VMS or looking at the VMS itself, I usually choose the VMS. I like primary sources. If I have to learn a new language or other skills to understand it, that’s fine. It’s hard to find the time, but the effort is worth it.

Then along comes the Voynich forum and a personal dilemma… I want to support the forum. It’s a good thing because not everyone has blog-space and it provides them a more neutral environment to publish their findings than someone else’s blog. But it’s difficult to actively support a forum without reading it and if I’m reading it, I should be contributing, as well—to give something back. So… the peaceful days in my little cave are over and I’m now part of the “Voynich community”.

It’s not a bad thing, times change and we have to adapt, and I’ve met people I like and respect, but I’m in this weird twilight zone—I’ve only read a small portion of the prior research, which means I have no idea what people are talking about on some of their blogs!

Which brings us to the topic of today’s blog…

Enticed by a blogosphere note on the Voynich forum, I visited Nick Pelling’s Cipher Mysteries site today, where he posted a summary of Philip Neal’s translation of Cicco Simonetta’s treatise on decipherment.

I’ve barely heard of Philip Neal and I know nothing about Cicco Simonetta, so I was happy to see a summary, but I had a what-the-heck? reaction as soon as I started reading it. Who was this Cicco Simonetta dude and where did he get this information? I couldn’t believe my eyes and had to look up the full translation to confirm my impression… and then was even more surprised. It wasn’t some cockamamie 20th-century misunderstanding of 15th-century code-breaking, this was written in the 15th century!

The only way I can think of to explain my reaction is to go through the major points. It’s dated 1474, Pavia, as a treatise on extracting ciphered writings.

Note that Simonetta appears to be describing only Italian or Latin as possible languages for the ciphered text, even though there were many ciphered documents in German, Spanish, and French in the general region of northern Italy. At least I hope he’s only talking about Italian when he says “vulgar language”, because the generalizations only make sense in that light.

Simonetta’s Suggestions for 15th-century Code-Breaking

Evaluate the Word Endings

First, Simonetta suggests looking at word endings to determine whether the code is in Latin or “the vulgar tongue”, and counsels that five or fewer variations indicate the vulgar tongue.

Right away we know Simonetta must be assuming that there are no null characters, that the spaces are real (not contrived or arbitrary), and that this is a one-to-one substitution code, otherwise it’s impossible, without significant analysis (and a little bit of luck) to determine which parts of the code are word endings.
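To make the test concrete, here is a minimal sketch of the ending count as I read it, under exactly the assumptions listed above: real spaces, no nulls, and a one-to-one substitution. The function names, the toy key, the two sample sentences, and the use of five as a hard cutoff are all my own illustration, not anything taken from the treatise.

```python
import string

# Hypothetical one-to-one key: a plain alphabet shift stands in for any simple substitution.
ALPHABET = string.ascii_lowercase
KEY = str.maketrans(ALPHABET, ALPHABET[7:] + ALPHABET[:7])

def encipher(plaintext):
    """Substitute letter for letter; spaces are left where they are."""
    return plaintext.translate(KEY)

def distinct_endings(ciphertext):
    """Collect the symbols that occur in word-final position."""
    return {word[-1] for word in ciphertext.split()}

def guess_language(ciphertext, threshold=5):
    """Apply the 'five or fewer endings' rule as a hard cutoff (an assumption)."""
    if len(distinct_endings(ciphertext)) <= threshold:
        return "vulgar tongue"
    return "Latin"

# Made-up sample sentences, chosen only to illustrate the rule.
italian = "la bella donna canta una dolce canzone nel giardino"
latin = "puella rosam amat et poeta carmen scribit cum amicis in horto"

print(guess_language(encipher(italian)))
print(guess_language(encipher(latin)))
```

Because the substitution is one-to-one and the spaces are untouched, the number of distinct endings in the ciphertext is the same as in the plaintext. That is the only reason the rule has anything to work with.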

Is it valid for Simonetta to make this assumption in the 15th century?


Many codes were, in fact, one-to-one substitution codes, but it’s certainly not a given—it’s an extremely low level of encipherment. If there’s enough text, you can simply stare at it for a while and the word-structure starts to become clear (you begin to see where the vowels and consonants are) and then the general language group becomes easier to recognize and, if you can narrow it down to a language group, after a while words start popping out at you.

This is what happened when I recently read a long manuscript in a dead simple substitution code based on astrological symbols. After a few pages, it was clear that it was probably Latin, and then words like “frigida” and “elleborus niger” started popping out. It’s like playing a game where they show you three out of nine letters, but you get to see a whole paragraph, not just one word, and the brain puts the pieces together. After a couple of dozen pages, you can simply read it.

But not all codes are one-to-one substitution codes. In 15th-century Italy, one-to-many/many-to-one/with-null codes were common. In the 1400s, Tranchedino collected many such codes. Several symbols could stand for one letter, several letters could be expressed with one symbol, and several null characters were often included, all in a single cipher. In addition to the alphabetic rules, many names were ciphered from a glossary, rather than following the rules for the rest of the text. In other words, there’s no consistency in the way glyphs correspond to letters that can be used to analyze the text. And thus, there’s no way to evaluate word endings or any individual letter in the manner Simonetta suggests.
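A toy example shows how quickly the ending count falls apart. The sketch below is not a reconstruction of any cipher in Tranchedino's collection: the three symbols per letter, the two null symbols, and the rule of appending a null to every second word are all assumptions made up for illustration.

```python
import string
from itertools import cycle

def make_homophones(variants=3):
    """Give every letter several interchangeable symbols, e.g. a -> a1, a2, a3."""
    return {
        ch: cycle([f"{ch}{i}" for i in range(1, variants + 1)])
        for ch in string.ascii_lowercase
    }

def encipher(plaintext):
    """Encipher word by word; every second word gets a meaningless trailing null."""
    homophones = make_homophones()
    nulls = cycle(["#", "%"])  # hypothetical null symbols
    cipher_words = []
    for i, word in enumerate(plaintext.split()):
        symbols = [next(homophones[ch]) for ch in word]
        if i % 2 == 1:
            symbols.append(next(nulls))
        cipher_words.append(symbols)
    return cipher_words

def distinct_endings(words):
    """Collect whatever sits in word-final position (letters or cipher symbols)."""
    return {word[-1] for word in words}

italian = "la bella donna canta una dolce canzone nel giardino"
plain_endings = distinct_endings(italian.split())
cipher_endings = distinct_endings(encipher(italian))

print(len(plain_endings), "distinct endings in the plaintext")
print(len(cipher_endings), "distinct endings in the ciphertext")
```

The same final letter now shows up under several symbols, and some words end in a null that stands for nothing at all, so the count of "endings" no longer says anything about the underlying language.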

Look for One-Character Words

Simonetta goes on to say that if there are many words represented by one cipher, the code is in the vulgar tongue (Italian) and is rarely Latin because in Latin “there be no words presented by one only letter or cipher saving four words…” Again, this presupposes that the spaces are real but is also deeply perplexing coming from someone with a “fine education” in classical languages, because it’s not true.

Simonetta’s generalization completely ignores the multitude of abbreviations that were regularly used in Latin. Sometimes whole sentences were written with one-character abbreviations. “Et” was frequently written with the character 7. D stood for domine or dominus, A for anno. I could go on for two paragraphs citing all the examples. There’s no basis for assuming ciphered text would be written out in full Latin when use of abbreviations was so ingrained.

And guess what… I almost snorted my drink when I noticed, in Simonetta’s own treatise, that he uses common one-character Latin abbreviations such as q for “qui” or “quo” and p for “per” or “pro”, thus contradicting himself in his own writings. In a cipher it’s easy to create a distinction between “per” or “pro” by the length or slant of part of the glyph (it’s difficult for decrypters to know which variations of the pen are part of the handwriting and which ones carry meaning, as Voynich researchers themselves have surely noticed).

Pay Attention to Letter Endings

After some details about “vulgar language” word patterns, Simonetta counsels the Latin decrypters to examine letters at the ends of words, pointing out that “the most part of Latin words conclude either in a vowel, or in s, or in m, or in t…”. Once again, this completely ignores the way Latin was commonly written. Word endings were often omitted entirely, sometimes with a line over the word or a swoop of the tail standing in for the missing letters. There are also many terminal ligatures. The letters “is” might be spelled out at the end of one word and then abbreviated with a simple stroke on another—the meaning is the same, only space (or habit) dictates which one is chosen. Simonetta’s writing uses this convention as well, so it’s odd that he would not consider this possibility.
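A rough way to see the effect of dropped endings is sketched below. It measures the share of words ending in a vowel, s, m, or t, first on a sentence written out in full and then on the same sentence with the ends of longer words cut off. The truncation rule is a crude stand-in for scribal suspension that I made up for the example; real abbreviation habits were far less regular, and the sample sentence is invented.

```python
# Endings that Simonetta says close "the most part of Latin words".
LIKELY_ENDINGS = set("aeiou") | set("smt")

def share_with_expected_ending(words):
    """Fraction of words whose last letter is a vowel, s, m, or t."""
    hits = sum(1 for word in words if word[-1] in LIKELY_ENDINGS)
    return hits / len(words)

def suspend(word, min_length=6, dropped=2):
    """Hypothetical suspension: drop the last letters of longer words."""
    return word[:-dropped] if len(word) >= min_length else word

latin = "puella rosam amat et poeta carmen scribit cum amicis in horto".split()
abbreviated = [suspend(word) for word in latin]

print(abbreviated)
print(round(share_with_expected_ending(latin), 2))
print(round(share_with_expected_ending(abbreviated), 2))
```

Even this mild truncation pushes stem consonants into final position, which is exactly where Simonetta tells the decrypter to look.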


I’d like to try to redeem Simonetta by saying that his advice might be useful for decoding simple substitution codes in Italian, but Italian, German, French, and Spanish scribes used many of the same abbreviation conventions as Latin, which means the same caveats apply.

Even simple substitution codes sometimes manipulate the position of the spaces. As I mentioned near the bottom of a previous blog post, Pal. Germ. 597 (a manuscript that includes a number of paragraphs in code) has a page of plaintext broken into syllables. Even a simple adjustment to the spacing, one of the easiest ways to manipulate a substitution code, makes it difficult to determine word length or to find word endings as per Simonetta; other methods are more effective.
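The spacing trick is easy to imitate. The sketch below re-spaces a sentence with a crude vowel-based split; it is not real syllabification and I am not claiming it is the rule used in Pal. Germ. 597. The pattern, the function names, and the sample sentence are assumptions for illustration only.

```python
import re

# Crude chunk: optional consonants, then vowels, plus any consonants that close the word.
CHUNK = re.compile(r"[^aeiou]*[aeiou]+(?:[^aeiou]+$)?")

def split_word(word):
    """Break one word into syllable-like chunks; vowel-less words stay whole."""
    return CHUNK.findall(word) or [word]

def respace(plaintext):
    """Put a space after every chunk, so the original word boundaries disappear."""
    return " ".join(chunk for word in plaintext.split() for chunk in split_word(word))

latin = "puella rosam amat et poeta carmen scribit cum amicis in horto"
respaced = respace(latin)

print(respaced)
print(len(latin.split()), "words before,", len(respaced.split()), "groups after")
```

No letter has changed, but the groups no longer line up with words, and some of them are a single character long. Run this before even the simplest one-to-one substitution and both the word-ending test and the one-character-word test are looking at the wrong units.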

As food for thought, I’ll leave a typical example from Tranchedino’s collection and you can judge for yourself whether any of Simonetta’s advice is useful for decrypting 15th-century ciphers. You may also notice a few glyphs are similar to VMS glyphs but I think it’s probably because they are common symbols, not because they’re directly related:

J.K. Petersen

© Copyright 2016 J.K. Petersen, All Rights Reserved

4 thoughts on “Did Cicco Simonetta Bomb at Code-Breaking?”

  1. Koen Gheuens

    I understand what you mean about wanting to solve the thing by yourself. I too love puzzles, and once I sink my teeth into any problem I never let go until it’s solved.

    The Voynich though… each day I grow more convinced that it is much more complex and much less trivial than people think. A large degree of cooperation will be required – it’s simply not a one player game. That’s why I share my analyses once I deem them ready enough, and why I actively encourage all kinds of meaningful interaction between researchers.

    There is also a more selfish reason why I don’t keep my ideas to myself. It really sucks when suddenly someone else comes along and says the thing you knew already, and then it looks like you’re copying them. 😀

    I also read Nick’s post and found Simonetta’s scope rather limited. But that probably means that these were the kinds of codes he was most often confronted with. Makes sense. It’s interesting to see the strategies he apparently used, even though they are clearly of no use for our purpose.

    This is a bit vague – I can’t exactly word it – but the VM breathes a completely different atmosphere than these 15th century codes. It’s something else. But I don’t know yet what exactly.

  2. Nick Pelling

    People definitely did use all manner of abbreviation in their own writing, but seem not to have done this in their enciphered writings. In Curse, I could point to only two cipher keys incorporating abbreviatory marks in the Tranchedino ledger, and then only one apiece.

    And there’s evidence – also in Curse – that Simonetta copied this text without understanding the details. And given that (as you point out) the level of detail seems a bit rudimentary compared to ciphers circa 1474, it seems a fair bet that it was copied from an older Chancellors document, say circa 1460.

    I’ve blogged about CPG 597 back in 2009, but it looked like a simple substitution cipher, nothing special. Which page were you talking about?

  3. D.N. O'Donovan

    -JKP –
    I’m glad to have noticed this post. I get a much clearer idea of your character and your approach to the manuscript here than at voynich.ninja.

    I really would recommend your bookmarking Philip Neal’s page. I’ve found it to be one of the most solid, relevant, meticulous and (yes, I’ll say it) intellectually honest Voynich-related web-pages ever mounted. A little of it is out of date, I’d say, but not much. Chiefly his expectation of an all Latin European (i.e. western European Christian) character.


  4. J.K. Petersen Post author

    Hello, Nick, thanks for dropping by.

    The plaintext page that has been broken into syllables in Pal. Germ. 597 is on folios 2r and 2v.

    I could be wrong, but I cannot think of any reason someone would break plaintext into syllables, in a book with many blocks of ciphers, unless they were preparing it to be ciphered. It doesn’t look like a grammar lesson—there’s a block of ciphered text right above it. The content has nothing to do with grammar either.

    It’s mostly, but not entirely Latin, includes a number of abbreviations (e.g., qz for quibus), and a number of one-character syllables. These adjustments would make a simple code (or any code) harder to break, without having to extensively manipulate the text. Once ciphered (even in a simple one-to-one substitution code), Simonetta’s advice would be useless for determining the language or evaluating the letters at the ends of words.

    Pal. Germ. 597

