Title: Adversarial Robustness and Generalization
Advisor: İnci Baytaş
Abstract: Adversarial robustness has become increasingly critical as state-of-the-art deep architectures remain vulnerable to adversarial attacks. Adversarial training is widely considered one of the most effective defenses against such attacks. However, the trade-off between adversarial robustness and generalization has emerged as a key challenge in the adversarial robustness literature. The generalizability of adversarially trained models may degrade when the model overfits to the attack patterns seen during training and the diversity of the training perturbations is inadequate. This thesis investigates the factors that affect the success of adversarial training and proposes solutions that mitigate some of these factors through new attack augmentation and generation methods. In particular, it proposes an adversarial training framework that improves attack diversity and robustness by augmenting principal adversarial directions during training, without requiring extra backpropagation steps. Experimental results on popular benchmark datasets indicate that the proposed framework consistently improves adversarial robustness without sacrificing natural accuracy.
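The abstract does not specify how the principal adversarial directions are obtained or applied; the sketch below is one plausible, hypothetical reading, not the thesis's actual algorithm. It assumes the directions are the top singular vectors of a matrix of previously collected adversarial perturbations, and that augmentation means shifting clean inputs along a randomly chosen direction within an epsilon ball. Because the directions are reused from perturbations computed earlier, generating these augmentations requires no additional backpropagation. All shapes, names, and constants here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of adversarial perturbations collected during
# training (e.g. from earlier PGD steps), flattened to vectors (n, d).
deltas = rng.normal(size=(64, 32)).astype(np.float32)

# "Principal adversarial directions": top-k right singular vectors of
# the perturbation matrix (PCA without mean-centering, for brevity).
k = 4
_, _, vt = np.linalg.svd(deltas, full_matrices=False)
directions = vt[:k]                      # (k, d), orthonormal rows

# Augment a clean batch along a random principal direction, scaled to
# the epsilon ball; no extra backpropagation is needed because the
# directions come from perturbations that were already computed.
eps = 0.03
x = rng.normal(size=(8, 32)).astype(np.float32)
idx = rng.integers(0, k, size=len(x))
signs = rng.choice([-1.0, 1.0], size=(len(x), 1)).astype(np.float32)
x_aug = x + eps * signs * directions[idx]
```

Since each direction is a unit vector, every augmented sample lies exactly at distance eps from its clean counterpart, which keeps the perturbation budget comparable to a standard attack while diversifying its direction.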