-
Corpus for Complex Word Identification in Medical Spanish Texts (CWI-Med-Sp)
The corpus statistics and methods are explained in the following article: Federico Ortega-Riba, Leonardo Campillos-Llanos, Doaa Samy (2025) "Lexical Simplification in Spanish Texts For...
Institute: Instituto de Lengua, Literatura y Antropología (ILLA), CSIC
-
CLARA-MeD corpus
A collection of 24,298 pairs of professional and simplified texts (>96 million tokens): 1) Drug leaflets and summaries of product characteristics (10,211 pairs of texts, >82M words); 2) Cancer-related information summaries (201 pairs of texts,...
Institute: Instituto de Lengua, Literatura y Antropología (ILLA), CSIC
