The aim of the RefLex project is to test a set of fundamental hypotheses about the structure and evolution of African languages which are often referred to in the literature, but whose validity has never been demonstrated in practice. These include, among others, the alleged existence of phonological, morphosyntactic and lexical phenomena that are distinctive to Africa, the hypothesis that the morphosyntax of Niger-Congo languages is strongly influenced by prosodic constraints on noun and verb stems, and various hypotheses about the genetic classification of African languages. All these hypotheses have something in common: they can be tested quantitatively, but this in turn presupposes fairly complete documentation. At present, however, only a minority of African languages have been the subject of thorough descriptive study. RefLex was born from the observation that lexical data exist for about two-thirds of African languages, but that this wealth of data is largely under-exploited because it is scattered and often difficult to access.
The aim is to create a comprehensive corpus of lexical data on the languages of Africa and a toolkit to exploit them. The creation of the corpus will be a truly collaborative effort in which researchers bring lexical data from the languages in which they specialize. In return, they will have access to standardized and reliable lexical data, which they can then manipulate and exploit for specific scientific purposes. All African language specialists will be asked to provide tools and lexical data for RefLex and, of course, to exploit this resource in their own research.
Thanks to its innovative approach, RefLex solves many of the methodological problems faced by other comparable projects. First, all the bibliographic sources that make up the RefLex database will be accessible to users in digital form (e.g. PDF), so that everyone can verify the reliability of data entry, report errors and, above all, reproduce experimental measurements from reliable data. The lexical corpus is thus designed as a true reference lexicon (hence the name RefLex). Second, we will tackle data standardization. Adopting strict transcription rules (see the Data Entry Manual) will smooth out variations due to the diversity of the source materials, and will facilitate direct comparison of very disparate documents. Finally, the handling and exploitation of data will be optimized through the development and availability of a variety of tools. The pooling of technical specifications will allow each participant to develop their own tools, which the entire community can then take advantage of. Thus, apart from the corpus itself, the RefLex website will offer a veritable library of general and specific tools.
The RefLex corpus will be notable for its unprecedented size. In principle, there is no limit to the number of documents that can be integrated into it.