Abstract:
Understanding the topics of Quran verses is a main interest of Islamic scholars, specialists in Quran studies,
and others. The traditional classification of Quran verses can be simplified and improved using automated techniques such as
Natural Language Processing (NLP) and Machine Learning (ML). While the majority of current studies have used traditional ML
approaches with small datasets, we used Deep Learning (DL) algorithms with a larger dataset to classify Quran verses. This
paper proposes a multi-label classification method that uses DL to assign Quran verses to 12 predefined main topics.
We designed a structured, multi-step method to achieve this objective. First, a dataset of
labeled Quran verses is collected, organized, and converted to sequences of numbers that the DL models can process. The skip-gram
algorithm of Word2Vec is used to capture the semantics of the text and improve the models’ performance. The embedding vectors
are then fed to two different DL models, an RNN and a CNN, to classify the verses. The DL classifiers are evaluated on
accuracy, precision, recall, F1-score, and Hamming loss, with cross-validation used to obtain more reliable estimates. The
best results were 90.38%, 96.98%, 92.49%, 93.81%, and 0.0126 for accuracy, precision, recall, F1-score, and Hamming loss,
respectively. The findings of this study can help specialists in Quran studies gain more insight into the topics discussed in
Quran verses and study and teach them more easily.
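
To make the multi-label evaluation concrete, the following is a minimal sketch of how Hamming loss can be computed when each verse may carry several of the 12 topics. It is an illustration only, not the paper's implementation: the helper names (`to_multi_hot`, `hamming_loss`, `subset_accuracy`) and the toy labels are hypothetical, and reading "accuracy" as exact-match (subset) accuracy is an assumption, since the abstract does not say which variant was used.

```python
from typing import List, Sequence

NUM_TOPICS = 12  # the abstract's 12 predefined main topics


def to_multi_hot(topic_ids: Sequence[int], num_topics: int = NUM_TOPICS) -> List[int]:
    """Encode a set of topic indices as a 0/1 vector of length num_topics."""
    vec = [0] * num_topics
    for t in topic_ids:
        vec[t] = 1
    return vec


def hamming_loss(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of (verse, topic) label slots that are predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return wrong / total


def subset_accuracy(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of verses whose whole label vector is predicted exactly.

    Assumption: exact-match accuracy; the abstract does not define "accuracy".
    """
    exact = sum(row_true == row_pred for row_true, row_pred in zip(y_true, y_pred))
    return exact / len(y_true)


# Toy, made-up labels for three verses (not data from the paper).
y_true = [to_multi_hot([0, 3]), to_multi_hot([5]), to_multi_hot([2, 7, 11])]
y_pred = [to_multi_hot([0, 3]), to_multi_hot([5, 6]), to_multi_hot([2, 7])]

print(hamming_loss(y_true, y_pred))     # 2 wrong slots out of 3 * 12
print(subset_accuracy(y_true, y_pred))  # 1 of 3 verses matched exactly
```

Because Hamming loss counts errors per label slot rather than per verse, it can be small even when exact-match accuracy is noticeably lower, which is one reason multi-label studies tend to report both.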