Generative AI for Synthetic Data Augmentation in Imbalanced Classification Tasks: Boosting Performance and Robustness
DOI:
https://doi.org/10.70849/IJSCI
Keywords:
Generative AI
Abstract
Imbalanced classification tasks, characterized by disproportionate class distributions, pose significant challenges in machine learning applications across healthcare, finance, and cybersecurity domains. Traditional oversampling techniques such as SMOTE often generate noisy synthetic samples that inadequately capture minority class complexity. This study investigates the efficacy of generative AI models, specifically Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), for synthesizing realistic minority class samples to mitigate data imbalance. We evaluate conditional GAN-based augmentation on benchmark datasets including credit card fraud detection (imbalance ratio 1:580) and medical diagnosis tasks. Experimental results demonstrate that generative augmentation achieves 22-28% improvements in F1-scores for minority classes compared to baseline methods, with AUC-ROC scores increasing from 0.82 to 0.91 in fraud detection scenarios. Generated samples exhibit statistical fidelity to original distributions, validated through Fréchet Inception Distance (FID) and Maximum Mean Discrepancy (MMD) metrics. These findings suggest that integrating generative AI into data preprocessing pipelines offers a robust solution for enhancing classification performance in imbalanced, high-stakes applications.
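The abstract names Maximum Mean Discrepancy (MMD) as one of the fidelity checks but does not say how it was computed. As a rough illustration only, the sketch below estimates squared MMD between two sample sets with NumPy. The RBF kernel, the `gamma` value, the biased estimator, the function names, and the toy Gaussian data are all assumptions made for this example, not details taken from the study.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b."""
    # Squared Euclidean distance between every row of a and every row of b.
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2_biased(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x and y (rows are samples)."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Toy data (assumed): "real" minority samples and two candidate synthetic sets.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 5))
close = rng.normal(0.0, 1.0, size=(200, 5))    # same distribution as real
shifted = rng.normal(2.0, 1.0, size=(200, 5))  # mean shifted away from real

mmd_close = mmd2_biased(real, close, gamma=0.1)
mmd_shifted = mmd2_biased(real, shifted, gamma=0.1)
```

Under these assumptions, a synthetic set drawn from the same distribution as the real minority samples should score a smaller value than one drawn from a shifted distribution. In practice the kernel bandwidth strongly affects the estimate, so it is usually tuned or set by a heuristic rather than fixed as it is here.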
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.








