Exploring Sentence Embedding Representation for Arabic Question/Answering

Lahbari, Imane; El Alaou, Saïd Ouatik

DOI: https://dx.doi.org/10.12785/ijcds/150187

UOB Journals: International Journal of Computing and Digital Systems, Volume 15, Issue 01

ISSN: 2210-142X

Date: 2024-03-01

Abstract:

Question Answering Systems (QAS) are designed to automatically answer user questions phrased in natural language with precise information. Arabic QAS is particularly challenging because of the language's rich and intricate morphology. Text representation is a critical step in natural language processing tasks such as information retrieval, text summarization, and question answering. Compared with more traditional approaches such as bag-of-words and word embedding, sentence embedding (SE) representation has shown encouraging results. In this study, we introduce a novel QA approach for the Arabic language that is based on passage retrieval and SE representation. It consists of three steps: "Question classification and query formulation", "Documents and passages retrieval", and "Answers extraction". We adopt the AraBERT pre-trained model to compute vector representations, which allows us to take into account implicit semantics and the context of words within the text. Furthermore, to collect candidate passages for user questions, we investigate a method for retrieving Arabic passages using the BM25 model, a query expansion process, and SE representation. The final answer is obtained by fine-tuning AraBERT parameters to rank passages and extract the most relevant ones. We carry out a number of tests with the CLEF and TREC datasets, following two different taxonomies. The outcomes demonstrate the efficacy of our methodology.
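
The passage retrieval step described in the abstract (a BM25 shortlist followed by sentence embedding similarity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the whitespace tokenization, the parameter values k1 = 1.5 and b = 0.75, the function names, and the `embed` callable are all assumptions made for illustration. In the described system, `embed` would presumably wrap a sentence encoder such as AraBERT, and query expansion (omitted here) would run before scoring.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, passages_tokens, k1=1.5, b=0.75):
    """Score each tokenized passage against the query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n = len(passages_tokens)
    if n == 0:
        return []
    avg_len = sum(len(p) for p in passages_tokens) / n or 1.0
    # Document frequency: number of passages that contain each term.
    df = Counter(term for p in passages_tokens for term in set(p))
    scores = []
    for p in passages_tokens:
        tf = Counter(p)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(p) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_and_rerank(question, passages, embed, top_k=2):
    """Shortlist passages with BM25, then re-rank the shortlist by
    cosine similarity between question and passage embeddings.

    `embed` is a hypothetical callable mapping a string to a vector.
    Returns passage indices, most similar first.
    """
    q_tokens = question.split()  # naive whitespace tokenization (assumption)
    tokenized = [p.split() for p in passages]
    scores = bm25_scores(q_tokens, tokenized)
    shortlist = sorted(range(len(passages)),
                       key=lambda i: scores[i], reverse=True)[:top_k]
    q_vec = embed(question)
    return sorted(shortlist,
                  key=lambda i: cosine(q_vec, embed(passages[i])),
                  reverse=True)
```

Calling `retrieve_and_rerank(question, passages, embed)` returns the indices of the shortlisted passages ordered by embedding similarity. Real Arabic text would need proper tokenization and normalization in place of `str.split`.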