Recently, there has been a large body of work on fooling deep-learning-based
classifiers, particularly image classifiers, via adversarial inputs that are
visually similar to benign examples. However, researchers
usually use Lp-norm minimization as a proxy for imperceptibility, which
oversimplifies the diversity and richness of real-world images and human visual
perception. In this work, we propose a novel perceptual metric that exploits
the well-established connection between low-level image feature fidelity and
human visual sensitivity, which we call Perceptual Feature Fidelity Loss. We
show that our metric robustly reflects and describes the imperceptibility of
the generated adversarial images, as validated under various conditions.
Moreover, we
demonstrate that this metric is highly flexible and can be conveniently
integrated into different existing optimization frameworks to guide the noise
distribution toward better imperceptibility. The metric is particularly useful
in the challenging black-box attack setting with limited queries, where
imperceptibility is hard to achieve because of the non-trivial perturbation
power required.
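
To make the integration idea concrete, below is a minimal sketch, not the
paper's actual formulation: it assumes PyTorch and uses finite-difference
image gradients as a hypothetical stand-in for the low-level features, with
perceptual_feature_fidelity_loss, attack_objective, and the weight lam as
illustrative names. The perceptual term simply enters a generic attack
objective as a regularizer:

    import torch
    import torch.nn.functional as F

    def image_gradients(x):
        # Finite-difference horizontal/vertical gradients as a stand-in for
        # "low-level image features" (an assumption; the paper's features
        # may differ).
        dx = x[..., :, 1:] - x[..., :, :-1]
        dy = x[..., 1:, :] - x[..., :-1, :]
        return dx, dy

    def perceptual_feature_fidelity_loss(x_benign, x_adv):
        # Penalize discrepancy between low-level features of the two images;
        # lower values should correlate with less perceptible perturbations.
        bx, by = image_gradients(x_benign)
        ax, ay = image_gradients(x_adv)
        return F.mse_loss(ax, bx) + F.mse_loss(ay, by)

    # Sketch of integration into an existing attack objective: the perceptual
    # term steers the perturbation toward imperceptibility. `model`, `label`,
    # and `lam` are illustrative placeholders.
    def attack_objective(model, x_benign, x_adv, label, lam=1.0):
        misclassify = -F.cross_entropy(model(x_adv), label)
        return misclassify + lam * perceptual_feature_fidelity_loss(x_benign, x_adv)

    # Toy check with random inputs (no real classifier needed):
    x = torch.rand(1, 3, 32, 32)
    x_adv = (x + 0.03 * torch.randn_like(x)).clamp(0.0, 1.0)
    print(perceptual_feature_fidelity_loss(x, x_adv))

Because the term is just an additive loss, it can in principle be dropped into
white-box gradient attacks or used as the scoring signal in query-limited
black-box attacks, which is the flexibility the abstract describes.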