Determining the possibility of deciphering an unintelligible text by text clustering: The case of the Voynich Manuscript

Teru Agata, Mari Agata

Research output: Article, peer-reviewed

Abstract

Purpose: One of the most common approaches to understanding an undeciphered text is to identify and then decipher the underlying code. If a document remains unintelligible or undeciphered for a long period of time even after many attempts at decoding it, the possibility of it being "gibberish" must be considered. This study proposes a method to detect the existence, or non-existence, of a coherent structure within a previously non-translated text in order to determine the possibility of deciphering it. Methods: The present method begins with the assumption that natural language processing methods that are commonly employed in analyzing known languages can be applied to an undeciphered text. To detect a coherent structure in a text, the similarity of every pair of partial documents is measured, and then the similarity matrix is analyzed by clustering methods. The next step is to compare the detected structure with the sections suggested by other clues such as illustrations and the page order. Thus, it is determined whether an undeciphered text contains an identifiable structure which corresponds to the latter, or whether it is "gibberish" containing no order or structure. Results: We applied the proposed method to the Voynich Manuscript, which is a renowned undeciphered text. The results clearly demonstrate that the text of the Voynich Manuscript possesses an identifiable structure, and that the structure corresponds to the existing sections of the manuscript suggested by the accompanying illustrations. Thus, the results strongly suggest that the Voynich Manuscript is not "gibberish"; additional attempts to decipher its contents would be justified. The present experiment proves the usefulness of applying this method to a previously non-deciphered text.
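
The pipeline described under Methods (pairwise similarity of partial documents, clustering of the similarity matrix, comparison with externally suggested sections) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or clustering algorithm the authors used, so the choices here are assumptions made only for illustration: cosine similarity over raw term counts, average-linkage agglomerative clustering, and cluster purity as the agreement score. The function names, the toy pages, and the section labels "A" and "B" are hypothetical and are not taken from the manuscript or the paper.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count Counters (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_matrix(pages):
    """Similarity of every pair of partial documents (whitespace-tokenized pages)."""
    vecs = [Counter(p.split()) for p in pages]
    n = len(vecs)
    return [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]


def average_linkage(sim, k):
    """Repeatedly merge the two most similar clusters until k clusters remain."""
    clusters = [[i] for i in range(len(sim))]

    def link(c1, c2):
        # Mean pairwise similarity between the members of two clusters.
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda ab: link(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b
    return clusters


def purity(clusters, sections):
    """Fraction of pages that belong to the majority section of their cluster."""
    hits = sum(Counter(sections[i] for i in c).most_common(1)[0][1] for c in clusters)
    return hits / len(sections)


# Toy stand-in data, invented for illustration (not a manuscript transcription).
pages = [
    "ab cd ab ef",
    "cd ab ab cd",
    "xy zz xy qq",
    "zz qq zz xy",
]
sections = ["A", "A", "B", "B"]  # hypothetical sections suggested by illustrations

sim = similarity_matrix(pages)
clusters = average_linkage(sim, k=2)
score = purity(clusters, sections)
print(sorted(sorted(c) for c in clusters), score)
```

In this sketch, a purity close to 1 means the clusters found from the text alone line up with the externally suggested sections, while text with no internal structure would be expected to yield clusters that mix sections freely.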

Original language: English
Pages (from-to): 1-23
Number of pages: 23
Journal: Library and Information Science
61
Publication status: Published - 14 Oct 2009

ASJC Scopus subject areas

  • Library and Information Sciences
