Interpreting Image Classifiers by Generating Discrete Masks.

abstract

Deep models are commonly treated as black boxes and lack interpretability. Here, we propose a novel approach to interpret deep image classifiers by generating discrete masks. Our method follows the generative adversarial network formalism: the deep model to be interpreted serves as the discriminator, while we train a generator to explain it. The generator is trained to capture discriminative image regions that should convey the same or similar meaning as the original image from the model's perspective. It produces a probability map from which a discrete mask can be sampled. The discriminator is then used to measure the quality of the sampled mask and provide feedback for updating the generator. Because of the sampling operation, the generator cannot be trained directly by back-propagation, so we propose to update it using policy gradient. Furthermore, we propose to incorporate gradients as auxiliary information to reduce the search space and facilitate training. We conduct both quantitative and qualitative experiments on the ILSVRC dataset. Experimental results indicate that our method provides reasonable explanations for predictions and outperforms existing approaches. In addition, our method passes the model randomization test, indicating that it reasons about the attribution of network predictions.
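
The sampling-plus-policy-gradient idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the learnable per-pixel logits stand in for the generator network, `toy_classifier_score` is a hypothetical stand-in for the fixed classifier, and the sparsity penalty, moving-average baseline, and all hyperparameters are assumptions made only for illustration. It shows how a Bernoulli mask sampled from a probability map can be trained with a REINFORCE-style update even though sampling is not differentiable.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def toy_classifier_score(masked_image):
    # Hypothetical stand-in for the fixed model being interpreted:
    # it only responds to the top-left 2x2 patch of the input.
    return masked_image[:2, :2].mean()

def train_mask_probs(image, steps=2000, lr=0.5, sparsity=0.05, seed=0):
    """Learn a per-pixel probability map with a REINFORCE-style update.

    The logits play the role of the generator output (an assumption:
    a real generator would be a network conditioned on the image).
    """
    rng = np.random.default_rng(seed)
    logits = np.zeros_like(image, dtype=float)
    baseline = 0.0  # moving-average baseline to reduce variance (assumed)
    for _ in range(steps):
        probs = sigmoid(logits)
        # Sample a discrete (0/1) mask from the probability map.
        mask = (rng.random(image.shape) < probs).astype(float)
        # Reward: classifier score on the masked image, minus an assumed
        # penalty that discourages keeping every pixel.
        reward = toy_classifier_score(image * mask) - sparsity * mask.mean()
        # For a Bernoulli sample, d log p(mask) / d logits = mask - probs.
        logits += lr * (reward - baseline) * (mask - probs)
        baseline = 0.9 * baseline + 0.1 * reward
    return sigmoid(logits)

image = np.ones((4, 4))
probs = train_mask_probs(image)
```

In this toy setting, the learned probabilities should end up higher on the patch the stand-in classifier depends on than elsewhere, which is the behaviour an explanation mask is meant to have.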

authors

Ji, Shuiwang

published proceedings

IEEE Trans Pattern Anal Mach Intell

altmetric score

1.25

author list (cited authors)

Yuan, H., Cai, L., Hu, X., Wang, J., & Ji, S.

citation count

8

complete list of authors

Yuan, Hao||Cai, Lei||Hu, Xia||Wang, Jie||Ji, Shuiwang

publication date

April 2022

publisher

Institute of Electrical and Electronics Engineers (IEEE)

published in

IEEE Transactions on Pattern Analysis and Machine Intelligence

keywords

Computational Modeling
Computer Science
Convolutional Neural Networks
Discrete Masks
Electronic Mail
Generators
Image Classification
Interpretability
Neurons
Predictive Models
Reinforcement Learning
Training

PubMed ID

33021938

Digital Object Identifier (DOI)

10.1109/TPAMI.2020.3028783

start page

2019

end page

2030

volume

44

issue

4

URL

http://dx.doi.org/10.1109/tpami.2020.3028783
