Dr Fiona McNeill & Dr Gábor Bella
The Unified Scottish Gaelic Wordnet: Lexicon Extension, Gaps, and Integration into a Large Multilingual Lexical Database.
We present the Unified Scottish Gaelic Wordnet, a large computer-readable lexicon for Scottish Gaelic comprising 11 thousand words and 15 thousand word senses. Most of it was created through an expert-driven lexicographic effort, and a smaller part was derived from Wiktionary data. The result is a freely downloadable lexicon in which each entry is meaning-aligned with the widely used English Princeton WordNet. The lexicon is also integrated into the Universal Knowledge Core (UKC), a large-scale online database that interconnects the vocabularies of a thousand languages and contains over two million words. Because language diversity and untranslatability are among our primary research interests, an important outcome of the project was the collection and formal representation of about 700 lexical gaps (linguistically and culturally specific words) between English and Gaelic. As future work, we foresee a larger-scale and more systematic cataloguing of such gaps, possibly in the more general context of the Celtic language family.
Meeting ID: 608 657 0437
Seminars begin at 1pm - please join the waiting room ten minutes before to allow us time to admit you (and say hello!); the room will be locked ten minutes after the start of the seminar. The full programme of seminars can be found here.