ArabAlg: A new Dataset for Arabic Speech Commands Recognition for Machine Learning Purposes

OUKAS, Nourredine; HABOUSSI, Samia; MAIZA, Chafik; BENSLIMANE, Nassim

DOI: http://dx.doi.org/10.12785/ijcds/150170

UOB Journals: International Journal of Computing and Digital Systems, Volume 15, Issue 01

ISSN: 2210-142X

Date: 2024-02-05

Abstract:

Automatic Speech Recognition (ASR) systems have advanced significantly in recent years, driven by deep learning techniques and the availability of large speech datasets. With growing demand for Arabic voice-enabled technologies, a high-quality, representative dataset for the Arabic language has become crucial. This paper presents the development of a new dataset, ArabAlg, designed specifically for Arabic Speech Commands Recognition (ASCR) to support the integration of Arabic voice recognition into smart devices in the Internet of Things (IoT). The research focuses on collecting and annotating a diverse range of Arabic speech commands spanning various domains and applications. The dataset construction process involves recording and preprocessing utterances from native Arabic speakers. To ensure precision and reliability, quality control measures are applied during data collection and annotation. The resulting dataset provides a valuable resource for training and evaluating ASCR systems tailored to Arabic speakers using Machine Learning and Deep Learning.
