Abstract:
Automatic Speech Recognition (ASR) systems have advanced significantly in recent years owing to the emergence
of deep learning techniques and the availability of large speech datasets. With the increasing demand for Arabic voice-enabled
technologies, a high-quality and representative dataset for the Arabic language has become crucial. This paper presents
the development of ArabAlg, a new dataset designed specifically for Arabic Speech Commands Recognition (ASCR), to support
the integration of Arabic voice recognition systems into smart devices in the Internet of Things (IoT). This research focuses on collecting
and annotating a diverse range of Arabic speech commands, encompassing various domains and applications. The dataset construction
process involves recording and preprocessing several utterances from native Arabic speakers. To ensure precision and reliability, quality
control measures are implemented during data collection and annotation. The resulting dataset provides a valuable resource for training
and evaluating machine learning and deep learning ASCR systems tailored to Arabic speakers.