Euskara Institutua
Patterns of Frequency in the Basque Lexicon (PFBL)

The Institute for the Basque Language (IBL) is working on a scientific project aimed at creating an online encyclopaedia of the Basque language: grammar, dictionary, historical information, etc. Some parts of this project are already partially or wholly available online, namely:

  • Contemporary Reference Prose (CRP)
  • Dictionary of Standard Basque in Contemporary Prose (DSBCP)
  • The Lexicon, past and present (LPP)
  • Dictionary of Contemporary Basque (DCB)

In addition to these projects, the IBL has conducted workshops on terminology (2002, 2005, 2008) and has compiled a range of materials for teaching university courses in Basque. A grammar written in English is also available.

Given their nature, most of our projects are available in Basque. This explanation in English is intended to give the reader an idea of the type of work undertaken by the Institute.

This online application lets the public search for the frequency of structural patterns in the Basque lexicon, including the following:

  • Word frequency
  • Syllabic structure of Basque words: number of letters, number of syllables, and frequency of patterns such as CV, VV, VC, and so forth.
  • Similar words: words that differ by an added letter, removed letters, transposed letters, and so forth.
  • Repeated syllables, groupings of two or three letters, and their location in the word.
  • Morphology of each lemma, its frequency, its grammatical category, and so forth.
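
The pattern types listed above can be illustrated with a minimal Python sketch. It is not the PFBL implementation: the function names are hypothetical, the vowel set is assumed to be the five letters a, e, i, o, u, and each letter is treated separately, so a digraph such as tx counts as two consonants (the application itself may handle these differently). The similarity check also accepts a single substituted letter, which is an assumption about what "and so forth" covers.

```python
VOWELS = set("aeiou")  # assumption: the five Basque vowel letters


def cv_pattern(word: str) -> str:
    """Map each letter to V (vowel) or C (anything else)."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def letter_ngrams(word: str, n: int) -> list[tuple[str, int]]:
    """Return every grouping of n letters with its 0-based start position."""
    return [(word[i:i + n], i) for i in range(len(word) - n + 1)]


def one_edit_apart(a: str, b: str) -> bool:
    """True if a and b differ by one added, removed, or substituted letter,
    or by two adjacent letters being transposed."""
    if a == b:
        return False
    if len(a) == len(b):
        diffs = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diffs) == 1:
            return True  # one substituted letter (assumed to count as similar)
        return (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and a[diffs[0]] == b[diffs[1]]
            and a[diffs[1]] == b[diffs[0]]
        )  # two adjacent letters swapped
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = sorted((a, b), key=len)
    # Removing exactly one letter from the longer word must give the shorter.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


print(cv_pattern("etxe"))               # VCCV
print(letter_ngrams("etxe", 2))         # [('et', 0), ('tx', 1), ('xe', 2)]
print(one_edit_apart("etxe", "etxea"))  # True
```

Counting how often each value of `cv_pattern` or each letter grouping occurs across a word list would then give frequencies of the kind the application reports.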

The database has been drawn from the corpus Ereduzko Prosa Gaur (EPG). Only common Basque words have been included, that is to say, true Basque lemmas. Of the 25.1 million words in this corpus, 22.7 million have been included in the database; proper names, words in other languages, and errors have been left out.

There are three possible types of searches.

  • A data search: general information from the database.
  • A word search based on criteria: the user can select the criteria that he or she wants to use to limit the search, and the database will provide lists of words matching those criteria.
  • A data search based on words: the user provides a list of words (or a piece of text) in a file, and the application will analyze each word.
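
As a rough illustration of the third type of search, the sketch below takes a piece of text, splits it into words, and reports a few facts about each distinct word. It is only an assumption about what such an analysis could look like: `analyze_text` is a hypothetical name, and the counts come from the supplied text itself, whereas the PFBL application looks each word up in its own database.

```python
import re
from collections import Counter


def analyze_text(text: str) -> list[dict]:
    """Report each distinct word with its count in the text and its length."""
    # [^\W\d_]+ matches runs of letters, including accented ones and ñ.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    return [
        {"word": word, "count": count, "letters": len(word)}
        for word, count in counts.most_common()  # most frequent first
    ]


for row in analyze_text("Etxe bat eta etxe handi bat eta etxe"):
    print(row)
```

The first row printed is the entry for "etxe", which occurs three times in the sample text.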

PFBL (in Basque)

Last modified: 30/03/2011