
Goud-sum (HuggingFace) — Darija Summarization Dataset

About

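Goud-sum contains 158,282 article-headline pairs collected from the Moroccan news site Goud.ma. The headlines are written in Darija (Moroccan Arabic), while the articles are in Darija, Modern Standard Arabic (MSA), or a code-switched mix of the two. The intended task is summarization, with each headline serving as the target summary of its article. The data is split into train (about 139k), validation (about 9.5k), and test (about 9.5k) sets, and the total size is 326 MB. Citation: Issam & Mrini, 3rd Workshop on African NLP, 2022.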

https://huggingface.co/datasets/Goud/Goud-sum
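
For readers who want to try the dataset, the following is a minimal sketch rather than an official recipe. It assumes the third-party `datasets` package is installed and that the repository id `Goud/Goud-sum` (taken from the URL above) can be passed to `load_dataset`. The names `APPROX_SPLIT_SIZES`, `split_fractions`, and `load_goud_sum` are hypothetical helpers introduced here for illustration, and the split sizes are the rounded figures from the description, not exact counts.

```python
# Rounded split sizes as stated in the description (not exact counts).
APPROX_SPLIT_SIZES = {"train": 139_000, "validation": 9_500, "test": 9_500}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_goud_sum():
    """Download the dataset from the Hugging Face Hub (needs network access).

    Assumes the third-party `datasets` package is installed.
    """
    # Imported lazily so the helper above works without the package.
    from datasets import load_dataset

    return load_dataset("Goud/Goud-sum")


if __name__ == "__main__":
    # Show roughly how the examples are distributed across splits.
    for name, share in split_fractions(APPROX_SPLIT_SIZES).items():
        print(f"{name}: {share:.1%}")

    # Uncomment to download the data and inspect the real splits and columns:
    # dataset = load_goud_sum()
    # print(dataset)
```

Printing the returned object is a reasonable first step, since it lists the actual split names, column names, and row counts, which should be checked before writing any training code against them.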