This talk advances DNN interpretability via inverse classification by introducing frameworks and algorithms for structured, sparse, and plausible adversarial and counterfactual examples. We develop group-wise attacks via nonconvex proximal methods, generate efficient data-aligned counterfactuals via accelerated proximal gradient methods with non-smooth ℓₚ regularization, and show that training on such counterfactuals improves robustness, fairness, and generalization. Together, these contributions unify explanation and learning for transparent, reliable models.
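As an illustration of the proximal-gradient machinery behind these contributions, the following is a minimal sketch of sparse counterfactual search with ℓ₁ regularization. It assumes a toy logistic classifier and a plain ISTA-style iteration with soft-thresholding as the proximal step; the classifier, loss, and all parameter names are illustrative, not the talk's actual algorithms.

```python
import numpy as np

# Minimal sketch: sparse counterfactual via proximal gradient (ISTA-style).
# Assumptions (not from the talk): a linear logistic classifier sigmoid(w.x + b),
# a cross-entropy loss pushing the prediction toward `target`, and an l1 penalty
# on the perturbation delta to encourage sparsity.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_counterfactual(x, w, b, target=1, lam=0.05, step=0.1, iters=500):
    """Search for a sparse delta moving sigmoid(w.(x+delta)+b) toward `target`."""
    delta = np.zeros_like(x)
    for _ in range(iters):
        z = w @ (x + delta) + b
        # Gradient of the loss -log p(target | x + delta) with respect to delta.
        grad = (sigmoid(z) - target) * w
        # Gradient step followed by the l1 proximal operator.
        delta = soft_threshold(delta - step * grad, step * lam)
    return delta

rng = np.random.default_rng(0)
w, b = rng.normal(size=20), 0.0
x = rng.normal(size=20)
delta = sparse_counterfactual(x, w, b, target=1)
print("nonzeros in delta:", np.count_nonzero(delta),
      "new score:", sigmoid(w @ (x + delta) + b))
```

An accelerated (FISTA-style) variant would add a momentum extrapolation between iterations; the proximal step itself is unchanged.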
We explore the brittleness of DNNs through adversarial attacks and counterfactual explanations.
We identify three promising approaches to generate sparse and plausible counterfactual explanations.
We provide a new technique to generate highly sparse and explainable adversarial attacks.
We revisit Calculus fundamentals essential for Data Science applications.
We revisit Linear Algebra fundamentals essential for Data Science applications.
We revisit the Backpropagation algorithm, widely used by practitioners to train Deep Neural Networks.
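For concreteness, here is a minimal sketch of backpropagation for a two-layer network with tanh hidden units and a squared-error loss, trained by plain gradient descent. The architecture, loss, and hyperparameters are illustrative assumptions, not the specific material of the lecture.

```python
import numpy as np

# Minimal backpropagation sketch for a 2-layer network (tanh hidden layer,
# linear output, squared-error loss). Shapes and hyperparameters are illustrative.

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                  # batch of inputs
y = rng.normal(size=(32, 1))                  # regression targets
W1, b1 = 0.1 * rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = 0.1 * rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.01

for _ in range(100):
    # Forward pass.
    h_pre = X @ W1 + b1
    h = np.tanh(h_pre)
    y_hat = h @ W2 + b2
    loss = 0.5 * np.mean((y_hat - y) ** 2)

    # Backward pass: propagate dLoss/d(output) back through each layer.
    d_yhat = (y_hat - y) / len(X)
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_h = d_yhat @ W2.T
    d_hpre = d_h * (1.0 - h ** 2)             # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_hpre
    db1 = d_hpre.sum(axis=0)

    # Gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final loss:", loss)
```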
We provide new insights into vulnerabilities of deep learning models by showing that training-based and basis-manipulation defense methods are significantly less effective if we restrict the generation of adversarial attacks to the low-frequency discrete wavelet transform domain.
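The sketch below illustrates the idea of restricting an adversarial perturbation to the low-frequency discrete wavelet subband. It assumes PyWavelets (pywt) for the 2-D Haar transform and uses a toy linear scorer in place of a trained DNN; the FGSM-style step, ε, and all names are illustrative, not the actual attack from this work.

```python
import numpy as np
import pywt  # PyWavelets; an assumed dependency, not a tool named in the original

# Minimal sketch of an adversarial step restricted to the low-frequency DWT subband.
# The "model" is a toy linear scorer on the flattened image; in practice the gradient
# would come from a trained DNN.

rng = np.random.default_rng(0)
img = rng.uniform(size=(28, 28))              # stand-in input image in [0, 1]
w = rng.normal(size=img.size)                 # toy linear model weights

# Gradient of the score w.r.t. the image (for the linear scorer it is just w).
grad = w.reshape(img.shape)

# Single-level 2-D Haar decomposition of the gradient.
cA, (cH, cV, cD) = pywt.dwt2(grad, "haar")

# Keep only the low-frequency (approximation) subband; zero the detail subbands.
low_freq_grad = pywt.idwt2(
    (cA, (np.zeros_like(cH), np.zeros_like(cV), np.zeros_like(cD))), "haar"
)

# FGSM-like step using only the low-frequency component of the gradient.
eps = 0.03
adv = np.clip(img + eps * np.sign(low_freq_grad), 0.0, 1.0)

print("score before:", float(w @ img.ravel()), "score after:", float(w @ adv.ravel()))
```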
We review classical and modern results in Neural Network Approximation Theory.
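As an example of a classical result in this area, the universal approximation theorem (Cybenko, 1989; Hornik et al., 1989) can be stated as follows; the formulation below is one common variant, chosen for illustration.

```latex
% Universal approximation theorem (one common formulation): a one-hidden-layer
% network with a sigmoidal activation approximates any continuous function on
% the unit cube uniformly to arbitrary accuracy.
\begin{theorem}[Universal approximation]
Let $\sigma:\mathbb{R}\to\mathbb{R}$ be a continuous sigmoidal function. For every
$f \in C([0,1]^d)$ and every $\varepsilon > 0$ there exist $N \in \mathbb{N}$,
$\alpha_i, b_i \in \mathbb{R}$, and $w_i \in \mathbb{R}^d$ such that
\[
  \sup_{x \in [0,1]^d} \Big| f(x) - \sum_{i=1}^{N} \alpha_i\, \sigma\!\left(w_i^{\top} x + b_i\right) \Big| < \varepsilon .
\]
\end{theorem}
```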