MedTitles - A multimodal dataset of Spanish medical videos and aligned transcripts
Additional Info
Field | Value |
---|---|
Identifier | http://hdl.handle.net/10261/398113 |
Author | |
Project | |
Name | MedTitles - A multimodal dataset of Spanish medical videos and aligned transcripts |
Description | MedTitles is a dataset of 30 hours of Spanish-language medical video and audio recordings, time-aligned with their subtitles. The videos were obtained from authorized online medical providers. It contains: (1) a gold standard of 290 video and audio recordings (20 hours), each revised by two human annotators; (2) a silver standard of 76 video and audio recordings (10 hours), in which only the medical transcriptions were revised; and (3) a pronunciation dictionary of Spanish medical words, for use with the Montreal Forced Aligner. The dataset contains recordings from 402 different speakers (200 male and 202 female). This repository contains only the audio files and transcriptions; please contact the authors to obtain the corresponding videos. The files will be made publicly available on August 2, 2026. |
Themes | Science and technology |
Tags | |
Creation date | 2025-08-25T00:00:00 |
Last updated | 2025-08-26T07:15:07 |
Refresh rate | |
Languages | English |
Geographic coverage | |
Geographic coverage (International) | |
Time coverage | |
Effective resource | |
Related resources | |
Normative | |
Institute | |
Publisher | Publicador - Digital.CSIC |
Observations | Recommended citation: Campillos-Llanos, Leonardo; 2025; MedTitles - A multimodal dataset of Spanish medical videos and aligned transcripts [Dataset]; Zenodo; Version v1; https://doi.org/10.5281/zenodo.16729213 |