Latin WordNet 2.0

About

Under the direction of Dr William Michael Short, the University of Exeter has launched the development of a comprehensive WordNet for Latin. A WordNet is a lexico-semantic database, in which a language's open-class words -- its nouns, verbs, adjectives, adverbs and sometimes prepositions -- are assigned to sets of cognitive synonyms (synsets), which represent discrete concepts and characterize the senses of words. Synsets are linked together through different kinds of semantic relations, such as antonymy, hypernymy, and hyponymy, and lexemes are linked through derivational relations. Synsets are also grouped together into larger conceptual fields (semfields). In the Latin WordNet, the word sica, for example, is connected through the synset glossed as 'a short stabbing weapon with a pointed blade' (within the 'Arms & Armour' semfield) to gladiolus, parazonium, sicula, pugio, and clunaculum, as well as to sicilicula, sicilimenta, sicilius, sicilio, and sicarius through derivational linkages. Each of these items is in turn linked to others through semantic and lexical relations, creating a densely interconnected and multi-layered conceptual network. For instance, sica is connected to gladius, scalprum, and cultellus via the superordinate category 'a weapon with a handle and blade with a sharp point'.
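To make the structure described above more concrete, the sica example could be modelled roughly as follows. This is only an illustrative sketch, not the Latin WordNet's actual schema or API: the class Synset, the names SYNSETS and DERIVED, and the function synonyms are all hypothetical, and placing gladius, scalprum, and cultellus directly inside the superordinate synset is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Synset:
    """A hypothetical synset record: a gloss plus the lemmas that share it."""
    gloss: str
    lemmas: Set[str]
    semfield: Optional[str] = None         # larger conceptual field, if known
    hypernym: Optional["Synset"] = None    # superordinate synset, if any


# Toy data taken from the sica example in the text.
# Assumption: gladius, scalprum and cultellus are stored as lemmas of the
# superordinate synset; the real resource may organise them differently.
bladed_weapon = Synset(
    gloss="a weapon with a handle and blade with a sharp point",
    lemmas={"gladius", "scalprum", "cultellus"},
)
dagger = Synset(
    gloss="a short stabbing weapon with a pointed blade",
    lemmas={"sica", "gladiolus", "parazonium", "sicula", "pugio", "clunaculum"},
    semfield="Arms & Armour",
    hypernym=bladed_weapon,
)

SYNSETS = [bladed_weapon, dagger]

# Derivational links connect lexemes rather than synsets.
DERIVED = {
    "sica": {"sicilicula", "sicilimenta", "sicilius", "sicilio", "sicarius"},
}


def synonyms(word: str) -> Set[str]:
    """Return every lemma that shares at least one synset with `word`."""
    found: Set[str] = set()
    for synset in SYNSETS:
        if word in synset.lemmas:
            found |= synset.lemmas
    return found - {word}
```

In this sketch, synonyms("sica") collects the other lemmas of the 'short stabbing weapon' synset, dagger.hypernym leads to the superordinate category, and DERIVED["sica"] holds the derivationally related lexemes.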

In this sense, the WordNet is like any lexicographic resource classicists may already be familiar with. Certainly, the WordNet can be viewed – and utilized – as a simple dictionary, or as something like Döderlein's Handbook of Latin Synonyms. Through its synsets, it captures subtle distinctions in words' meanings and can include information about how these meanings change diachronically. However, the WordNet goes beyond the printed lexicon by representing the meanings of the Latin language in machine-interpretable form (not primarily as strings of letters), and by providing a means of 'traversing' the lexicon programmatically by following determined paths of semantic and lexical interconnection. It is envisioned as a wide-ranging and comprehensive knowledge-bank that will aggregate information not only about Latin words and their meanings, but also about etymological relations and word formation – as well as the kinds of large-scale figurative patterns that organize meanings in this language at a level above any particular word's semantic structure.
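As a rough illustration of what 'traversing' the lexicon programmatically might look like, the sketch below runs a breadth-first search over a handful of labelled links drawn from the sica example. It does not reflect the project's real data format or interface: the edge list EDGES, the node identifiers beginning with "syn:", the relation labels, and the functions neighbours and find_path are all assumptions made for the example.

```python
from collections import deque

# Hypothetical labelled edges: (source, relation, target).
# Identifiers starting with "syn:" stand for synsets; the rest are words.
EDGES = [
    ("sica", "in_synset", "syn:dagger"),
    ("pugio", "in_synset", "syn:dagger"),
    ("syn:dagger", "hypernym", "syn:bladed-weapon"),
    ("gladius", "in_synset", "syn:bladed-weapon"),
    ("sica", "derivation", "sicarius"),
]


def neighbours(node):
    """Yield (relation, node) pairs reachable in one step, in either direction."""
    for src, rel, dst in EDGES:
        if src == node:
            yield rel, dst
        elif dst == node:
            yield f"inverse {rel}", src


def find_path(start, goal):
    """Breadth-first search; returns alternating nodes and relations, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for rel, nxt in neighbours(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [rel, nxt])
    return None


print(find_path("sica", "gladius"))
```

On this toy data, the path from sica to gladius passes through the synset of sica, up a hypernym link to the superordinate synset, and back down to gladius, which is the kind of route a printed lexicon leaves implicit.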

The original WordNet for Latin, created for the Fondazione Bruno Kessler's MultiWordNet Project in 2008, contained about 9,000 words. It is now being expanded to include over 70,000 words covering the archaic through medieval periods of the language, along with rich figurative information. Enquiries are welcomed from anyone interested in taking advantage of the WordNet as part of another project, or in contributing to its development.