-
Qualitative and quantitative data from contexts of use for the analysis of si...
This dataset compiles examples of use of the following terms: covid-19, coronavirus, confinamiento, SARS-CoV-2, pandemia and virus. These are selected using a combined quantitative and qualitative methodology from the linguistic corpora in Spanish of...
Institute: Centro de Ciencias Humanas y Sociales (CCHS), CSIC
-
CLARA-MeD simplified sentences
This dataset contains 1200 manually simplified sentences (144 019 tokens) from clinical trials in Spanish. A total of 1040 announcements from the European Clinical Trials Register (EudraCT) were analyzed to select sentences with ambiguities or...
Institute: Instituto de Lengua, Literatura y Antropología (ILLA), CSIC
-
Medical Lexicon for Spanish (MedLexSp)
MedLexSp is a unified medical lexicon for Medical Natural Language Processing in Spanish. It includes 100 887 lemmas, 302 543 inflected forms (conjugated verbs and number/gender variants), and 42 958 Unified Medical Language System (UMLS) Concept...
Institute: Instituto de Lengua, Literatura y Antropología (ILLA), CSIC