Abstract:
The rise of online hate speech, particularly speech directed at women, has raised serious concerns in digital environments. Women frequently face abuse through disparaging remarks, body shaming, and explicit content, making the automated detection of such harmful speech an essential task. The challenge is greater in low-resource languages such as Malayalam, especially when the text is code-mixed with English. This study examines the effectiveness of synthetic data augmentation methods, namely Machine Translation (MT), Masked Language Modeling (MLM), and Few-Shot Learning (FSL), in improving hate speech detection for Malayalam-English code-mixed text. We train a multilingual BERT (mBERT) classifier that combines real data with synthetic data generated by MLM and FSL. Our results show that this approach markedly improves classification performance, achieving an F1-score of 86.42%. Through LIME-based explainability analysis, we demonstrate that contextual meaning plays a pivotal role in the model's decision-making. Furthermore, a comparative analysis of false positive and false negative rates between the authentic and synthetic datasets highlights the capacity of MLM and FSL to support a more balanced and unbiased classifier. To the best of our knowledge, this is the first study of synthetic data generation for hate speech detection in code-mixed languages that also addresses fairness concerns, particularly regarding online abuse directed at women.