A philosophical theory of AI explanations

Presentation to academic audience, 6 July 2020

The social and ethical implications of prediction-based decision systems in sensitive contexts have generated lively debates among multiple stakeholders, including computer scientists, ethicists, social scientists, policy makers, and end users. Yet the lack of a common language and of a multi-dimensional framework for bridging the technical, ethical, and legal aspects of the debate prevents the discussion from being as effective as it could be. Drawing on philosophy, this paper offers a multi-faceted unifying theory of the varieties of data and non-data analytical explanations of why a prediction-based decision is obtained. The theory identifies the existence and significance of dependencies between different kinds of AI explanations, as well as the role of normative and pragmatic values in making sense of these explanations. This framework lays the groundwork for establishing the relevant connections between the technical, moral, and legal aspects of artificially intelligent decision making.

This paper was presented at the Interpretable Machine Learning workshop, Simons Institute for the Theory of Computing, University of California, Berkeley.