Description of the project

RESTAURE : RESsources informatisées et Traitement AUtomatique pour les langues REgionales /
Computational Resources and Processing for Regional Languages

Research aimed at producing resources and tools for regional languages is currently experiencing a resurgence of interest, in particular through the creation of corpora and dictionaries. The ultimate goal is to help preserve and disseminate cultural heritage. The regional languages of France can be considered low-resourced. What all low-resourced languages have in common is that computerising them offers low financial returns, which do not offset the considerable development costs. However, endowing these languages with electronic resources (corpora, lexicons, dictionaries) and tools is essential for their dissemination, protection and teaching (including to new speakers). More broadly, doing so would help preserve the diversity of the world's languages and would increase the amount of data available to researchers in the human and social sciences (linguistics, sociology, anthropology, literature, history, ...).

The overall objective of the RESTAURE project is to provide computational resources and processing tools for three regional languages of France: Alsatian, Occitan and Picard. To achieve this goal, it will be necessary to develop new computational models suitable for low-resourced and poorly standardized languages. The initial choice of these three languages is motivated by two main reasons: they cover various language families, and significant work has already been done on them in the areas covered by the project. It will thus be possible to build on existing work and to share the approaches, experiences and tools developed in previous projects.


  • Project start: January 1st, 2015
  • Duration: 42 months


Project funded by the ANR, grant ANR-14-CE24-0003