Abstract:
Few-shot learning, a specialized branch of machine learning, tackles the challenge of constructing accurate models with
minimal labeled data. This is particularly pertinent in text classification, where annotated samples are often scarce, especially in
niche domains or certain languages. Our survey offers an updated synthesis of the latest developments in few-shot learning for text
classification, delving into core techniques such as metric-based, model-based, and optimization-based approaches, and their
suitability for textual data. We pay special attention to transfer learning and pre-trained language models, which have demonstrated
exceptional capabilities in comprehending and categorizing text with few examples. Additionally, our review extends to the
exploration of few-shot learning in Arabic text classification, including both datasets and existing research efforts. We evaluated 32
studies that met our inclusion criteria, summarizing benchmarks and datasets, discussing few-shot learning's real-world impacts, and
suggesting future research avenues. Our survey aims to provide a thorough groundwork for those working at the nexus of few-shot
learning and text classification, with an added focus on Arabic text. It emphasizes the creation of versatile models that can learn
effectively from limited data while sustaining high performance, and identifies key challenges in applying Few-Shot Learning (FSL),
namely data sparsity, domain specificity, and language constraints, which call for innovative solutions for robust model adaptation
and generalization across diverse textual domains.