The first large-scale continuous Saudi Sign Language (SSL) dataset
Isharah is a large-scale dataset for continuous Saudi Sign Language (SSL) recognition and translation. It features over 30,000 video samples, signed by deaf and hearing-impaired individuals and recorded with smartphones in varied settings.
The dataset supports both Continuous Sign Language Recognition (CSLR) and Sign Language Translation (SLT). Each sample includes sentence-level gloss annotations and the corresponding Arabic translation. Three benchmark subsets are included: Isharah-500, Isharah-1000, and Isharah-2000.
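Because the dataset files are not yet released, the on-disk format is unknown. The following is only a minimal sketch of how one sample might be represented so that the same record serves both tasks: the gloss sequence as the CSLR target and the Arabic sentence as the SLT target. The class name `SignSample`, its field names, the helper `target_for`, the file path, and the placeholder values are all assumptions made for illustration, not part of Isharah.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SignSample:
    """Hypothetical record layout; the released Isharah files may differ."""
    video_path: str      # assumed: path to one smartphone-recorded clip
    glosses: List[str]   # sentence-level gloss annotation (CSLR target)
    translation: str     # corresponding Arabic translation (SLT target)


def target_for(sample: SignSample, task: str) -> Union[List[str], str]:
    """Return the supervision target for the requested task."""
    if task == "cslr":
        return sample.glosses
    if task == "slt":
        return sample.translation
    raise ValueError(f"unknown task: {task!r}")


# Placeholder values only; these are not real Isharah annotations.
sample = SignSample(
    video_path="videos/example_0001.mp4",
    glosses=["GLOSS_A", "GLOSS_B", "GLOSS_C"],
    translation="<Arabic sentence placeholder>",
)
```

Keeping both annotations on one record lets a single data loader feed either a recognition model or a translation model by switching the `task` argument.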
We are currently preparing the dataset files for public release. The download links will be available soon.
If you use Isharah in your work, please cite:
@misc{alyami2025isharahlargescalemultiscenedataset,
  title={Isharah: A Large-Scale Multi-Scene Dataset for Continuous Sign Language Recognition},
  author={Sarah Alyami and Hamzah Luqman and Sadam Al-Azani and Maad Alowaifeer and Yazeed Alharbi and Yaser Alonaizan},
  year={2025},
  eprint={2506.03615},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.03615},
}