Isharah Continuous Sign Language Recognition and Translation Dataset

The first large-scale continuous Saudi Sign Language (SSL) dataset

🧾 About

Isharah is a large-scale dataset for continuous Saudi Sign Language (SSL) recognition and translation. It contains over 30,000 video samples signed by deaf and hearing-impaired individuals and recorded with smartphones in varied settings.

The dataset supports both Continuous Sign Language Recognition (CSLR) and Sign Language Translation (SLT), and includes sentence-level gloss annotations and the corresponding Arabic translations. Three benchmark subsets are included: Isharah-500, Isharah-1000, and Isharah-2000.

🔗 Download

We are currently preparing the dataset files for public release. Download links will be added here once they are available.

📄 Citation

If you use Isharah in your work, please cite:

@misc{alyami2025isharahlargescalemultiscenedataset,
      title={Isharah: A Large-Scale Multi-Scene Dataset for Continuous Sign Language Recognition}, 
      author={Sarah Alyami and Hamzah Luqman and Sadam Al-Azani and Maad Alowaifeer and Yazeed Alharbi and Yaser Alonaizan},
      year={2025},
      eprint={2506.03615},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.03615}, 
}