RICOTERM 3

Several projects on Information Retrieval, Knowledge Management, Machine Translation and Automatic Terminology Extraction include linguistically based strategies to improve the quality of their results (coverage, accuracy and thematic relevance). Prominent among these strategies is the use of ontologies and lexical hierarchies, which represent the semantic information of lexical units (semantic classes and lexical relations).

In the previous project RICOTERM-2 (HUM2004-056658-00), oriented towards Information Retrieval for Catalan, Spanish, Galician and Basque, we have been testing the robustness of this strategy for the expansion of multilingual queries in Economics. During the development of the automatic terminology extractor YATE (Vivaldi 2001), carried out within the linked projects TEXTERM2 (BFF2003-02111) and RICOTERM2, the enrichment of EuroWordNet with terminological units from different thematic areas has proved essential for increasing coverage and accuracy in the identification of term candidates.

The basic aim of this new project is to adapt the YATE tool to several specialised areas and to different languages (Spanish, Catalan and Basque) through the enrichment of EuroWordNet (EWN). We also aim to improve the platform that provides access to the tool by incorporating better features and auxiliary applications, such as the automatic detection of metalinguistic expressions that reflect conceptual relations in texts.

Specifically, subproject 1 (UPF) will complete the enrichment of EWN in Law, Environment and Computer Science for Spanish and Catalan. Subproject 2 (EHU), in addition to continuing to build textual and lexical resources in Basque for some specialised areas, will complete the enrichment of EWN in Medicine, Economics, Law, Environment and Computer Science by searching for Basque equivalents of the Spanish and Catalan entries.