Abstract:
Semantic search is an information retrieval technique that seeks to understand the contextual meaning of words to find
more accurate results. It remains an open challenge, especially for the Holy Quran, as this sacred book encodes crucial religious
meanings with a high level of semantics and eloquence beyond human capacities. This paper presents a new semantic search approach
for the Holy Quran. The presented approach leverages the power of contextualized word representation models and discourse analysis
to retrieve verses that are semantically relevant to the user's query, even when the query's wording does not appear verbatim in the Quranic text. It consists of
three modules. The first module segments the Quranic text into discourse units. The second module
aims to identify the most effective word representation model for mapping the Quranic discourse units into semantic vectors. To this
end, the performance of five state-of-the-art word representation models in assessing semantic relatedness in the Holy Quran at the
verse level is investigated. The third module concerns the semantic search model. The evaluation results of the proposed approach are
promising: the average precision and recall are 90.79% and 79.57%, respectively, which demonstrates the strength of the proposed
approach and the ability of contextualized word representation models to capture the semantic information of the Quran.
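
To make the retrieval and evaluation steps concrete, the following is a minimal sketch, not the implementation described in this paper. It assumes that discourse units and the query have already been mapped to vectors by some word representation model; the toy vectors, the unit identifiers, the cosine similarity measure, and the threshold of 0.8 are all hypothetical choices made for illustration. It ranks units by similarity to the query and computes precision and recall for a single query.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def search(query_vec, unit_vecs, threshold=0.8):
    """Return (unit_id, score) pairs scoring at least `threshold`, best first.

    `threshold` is an illustrative cut-off, not a value taken from the paper.
    """
    scored = [(uid, cosine(query_vec, vec)) for uid, vec in unit_vecs.items()]
    hits = [(uid, score) for uid, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Toy vectors standing in for the semantic vectors of four discourse units.
unit_vecs = {
    "unit_a": [1.0, 0.0, 0.0],
    "unit_b": [0.9, 0.1, 0.0],
    "unit_c": [0.0, 1.0, 0.0],
    "unit_d": [0.6, 0.8, 0.0],
}
query_vec = [1.0, 0.0, 0.0]

results = search(query_vec, unit_vecs, threshold=0.8)
retrieved_ids = [uid for uid, _ in results]

# Hypothetical relevance judgments for this one query.
precision, recall = precision_recall(
    retrieved_ids, relevant={"unit_a", "unit_b", "unit_d"}
)
print(retrieved_ids, precision, recall)
```

In this toy case the two units above the threshold are both relevant, while one relevant unit falls below it, so precision is higher than recall. Averaging such per-query values over a set of queries is one common way to obtain average precision and recall figures; the exact evaluation protocol behind the numbers reported above is not specified in this abstract.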