The Institute for the Basque Language (IBL) is working on a scientific project aimed at creating an online encyclopaedia of the Basque language: grammar, dictionary, historical information, etc. Some parts of this project are already partially or wholly available online, namely:
In addition to these projects, the IBL has conducted workshops on terminology (2002, 2005, 2008, 2010, 2013) and has compiled a range of materials for teaching university courses in Basque. Also available is a grammar written in English.
Given their nature, most of our projects are available in Basque. This explanation in English is intended to give the reader an idea of the type of work undertaken by the Institute.
What we call our Reference Corpus is a corpus of prose writings that appeared in print between 2000 and 2007. Altogether it contains some 25.1 million words: 13.1 million drawn from books chosen for their quality (287 volumes) and 12 million from newspaper articles published in Spain (Berria) and in France (Herria).
It is a closed corpus. After numerous trials, our team of researchers concluded that, given our objectives, extending it beyond 25 million words would yield no significant additional information and would only add to the work and time spent on data processing. In any event, other linguistic corpora with different features will be added to the Institute for the Basque Language's website in the future.
This corpus makes it possible to look up words as they are used today by authors writing in Basque. The word being looked up appears in context, in a full sentence. Also indicated are its frequency (the number of times the word appears in books and in newspapers), the writer concerned, and the title and page of the publication.
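As a rough illustration of this kind of lookup, the following Python sketch returns every sentence containing a word, together with separate frequency counts for books and newspapers. It is not the Institute's implementation: the `Sentence` record, the `lookup` function and the sample sentences, authors and titles are all invented for the example, and matching is on the exact word form only, which is a strong simplification for an inflected language such as Basque.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Sentence:
    """One corpus sentence with its source details (hypothetical record layout)."""
    text: str
    author: str
    title: str
    page: int
    source: str  # "book" or "newspaper"


def lookup(word, corpus):
    """Return the sentences containing `word` and its frequency per source type.

    Matches whole words only, ignoring case; inflected forms are not matched.
    """
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits = []
    frequency = Counter()
    for sentence in corpus:
        count = len(pattern.findall(sentence.text))
        if count:
            hits.append(sentence)
            frequency[sentence.source] += count
    return hits, frequency


# Invented sample data, for illustration only.
SAMPLE = [
    Sentence("Etxe hori handia da.", "Author A", "Title A", 12, "book"),
    Sentence("Gaur etxe berria erosi dute.", "Author B", "Title B", 3, "newspaper"),
    Sentence("Liburua mahai gainean dago.", "Author A", "Title A", 40, "book"),
]

hits, frequency = lookup("etxe", SAMPLE)
for s in hits:
    # Show the word in its full sentence, with writer, title and page.
    print(f"{s.text}  ({s.author}, {s.title}, p. {s.page})")
print(dict(frequency))
```

A real search tool would also need lemmatisation, so that the many inflected forms of a Basque word are found under a single entry.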
The information in this corpus has made possible various academic works, including the Dictionary of Standard Basque in Contemporary Prose (Hiztegi Batua Euskal Prosan), http://www.ehu.es/ehg/; the Dictionary of Contemporary Basque (Egungo Euskararen Hiztegia), http://www.ehu.es/eeh/; and The Lexicon, Past and Present (Lexikoa, Atzo eta Gaur), http://www.ehu.es/lag/, all available on the Institute for the Basque Language’s website.
This part of the project was partially funded by the City Council of San Sebastián and the Gipuzkoa Provincial Council.
CRP (in Basque)
Last modified: 25/01/2009