• Laura Montesdeoca Fenoy
  • Andreea-Emilia Ciontos
Recent work in machine learning has yielded algorithms with high performance and accuracy. However, in critical domains such as medicine, finance, and law, these algorithms are not yet fully trusted because of their "black-box" nature: when they fail, there is no clear reason for the failure. To address this issue, explainable AI (XAI) algorithms have been developed to add a layer of explainability to AI systems. Yet even these algorithms are vulnerable to adversarial attacks. The aim of this paper is to study the effect of the Fast Gradient Sign Method (FGSM) adversarial attack on two recent XAI algorithms, namely Similarity Difference and Uniqueness (SIDU) and Gradient-weighted Class Activation Mapping (Grad-CAM). Furthermore, by employing an eye tracker, we analyse how human eye fixations on natural images can be recorded and compared to the XAI saliency maps. Our findings show that although Grad-CAM initially performs better than SIDU when human fixation maps are used as ground truth, the results switch once adversarial noise is introduced: SIDU is in fact more robust to adversarial attacks.
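For context, FGSM perturbs each input component by a small step in the direction that increases the model's loss, using only the sign of the loss gradient. The following is a minimal NumPy sketch of that update rule, not the paper's implementation; the function name `fgsm_perturb` and the toy linear gradient are illustrative assumptions.

```python
import numpy as np

def fgsm_perturb(x, grad, eps=0.03):
    """FGSM step: x_adv = x + eps * sign(dL/dx), clipped to valid pixels.

    x    : input (e.g. image pixels scaled to [0, 1])
    grad : gradient of the loss with respect to x
    eps  : attack strength (maximum per-pixel change)
    """
    x_adv = x + eps * np.sign(grad)
    # Keep the adversarial example in the valid pixel range [0, 1].
    return np.clip(x_adv, 0.0, 1.0)

# Toy example: for a linear "loss" L(x) = w . x, the gradient dL/dx is w.
w = np.array([0.5, -2.0, 0.0])   # illustrative gradient
x = np.array([0.2, 0.8, 0.5])    # illustrative input
x_adv = fgsm_perturb(x, grad=w, eps=0.1)
print(x_adv)  # approximately [0.3, 0.7, 0.5]
```

In a real attack the gradient comes from backpropagation through the network under attack (e.g. via an autograd framework); the update itself is exactly this one-line sign step, which is what makes FGSM cheap enough to apply to every test image.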
Publication date: 4 Jun 2020
ID: 333599359