*Result*: Enhancing sentiment analysis of Moroccan dialect through transformer-based language models architectures and active learning strategies.

Title:
Enhancing sentiment analysis of Moroccan dialect through transformer-based language models architectures and active learning strategies.
Authors:
Amnay, Meriem (AUTHOR) amnay.meriem@gmail.com; Jabrane, Mourad (AUTHOR) mourad.jabrane@usms.ac.ma; Ourdou, Amal (AUTHOR) a.ourdou@usms.ma; Hafidi, Imad (AUTHOR) i.hafidi@usms.ma
Source:
Language Resources & Evaluation. Mar2026, Vol. 60 Issue 1, p1-29. 29p.
Database:
Academic Search Index

*Further Information*

*The effectiveness of machine learning models, particularly in natural language processing, depends heavily on the quality and quantity of labeled training data, and labeling is time-consuming and expensive. This challenge is especially pronounced for Moroccan Darija, a dialect with rich linguistic diversity but few labeled datasets. In this paper, we introduce a novel approach that integrates active learning with transformer-based architectures to improve text classification for Moroccan Darija. Active learning strategically selects the most informative data points for labeling. We combine it with four transformer-based models, DarijaBert, CamelBert, MarBert, and Qarib, whose attention mechanisms preserve the context of words within sentences, which is essential for accurately interpreting the varied meanings of words in Moroccan Darija. Our evaluation indicates that this methodology markedly improves sentiment analysis and surpasses conventional models. This underscores the advantages of combining language models with active learning for Moroccan Darija to address the complexities arising from diverse word meanings. [ABSTRACT FROM AUTHOR]*
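
*Illustrative sketch (not from the paper)*

The abstract does not say which query strategy is used to pick "the most informative data points". One common choice, assumed here purely for illustration, is entropy-based uncertainty sampling: the classifier scores each unlabeled example, and the examples whose predicted class distribution is closest to uniform are sent for labeling first. The function names and the probability values below are hypothetical, and the probabilities stand in for the softmax output of a fine-tuned transformer classifier.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, k):
    """Return the indices of the k pool examples with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical (negative, positive) probabilities for four unlabeled sentences.
# In a real pipeline these would come from the current model's predictions.
pool_probs = [
    [0.98, 0.02],  # confident prediction, low labeling value
    [0.55, 0.45],  # uncertain
    [0.10, 0.90],  # fairly confident
    [0.50, 0.50],  # maximally uncertain
]

# Indices of the two examples an annotator would be asked to label next.
to_label = select_most_uncertain(pool_probs, k=2)
print(to_label)
```

In a full active learning loop, the selected examples would be labeled, added to the training set, and the model retrained before the next round of selection.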