1 Executive Summary
A persistent weakness of machine learning (ML) models is the lack of explainability for individual decisions. These models are often described as “black boxes” that provide neither the operator nor end users with a justification for any single decision or outcome. Explainability is not only a necessity for mission-critical situations; it is a form of accountability, which is required in every situation, however mundane.
It is relatively simple to tell whether an ML model is interpretable. As a user or an operator, do you clearly understand why an individual result is what it is? Do you know why the result was not something else? Do you have any insight into whether the model is making decisions based on information a human would consider correct? If the answer to any of these questions is no, then you are dealing with a model that offers very little assurance that it is accurate, fair, logical, or unbiased.
Having an explainable model does not mean that it is trustworthy. Sometimes the successful application of an explainability technique reveals exactly why you should not trust a particular model trained on a particular data set. A well-known example is a model that appears to distinguish huskies from wolves but actually bases its decision on the presence of snow in the image. Accuracy on training data does not always transfer to the real world, and A/B testing in production can be prohibitively expensive, not to mention the risk of exposing an ill-prepared model to the world. Explanations let you see whether a model can be trusted before that happens.
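To make the idea concrete, the sketch below shows a minimal, perturbation-based local explanation in the spirit of tools such as LIME. The husky-versus-wolf setup, the feature names, and the synthetic data are invented for illustration; only NumPy and scikit-learn are assumed.

```python
# Minimal sketch of a perturbation-based local explanation. The features,
# data, and "husky vs. wolf" framing are illustrative assumptions; only
# NumPy and scikit-learn are required.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000
features = ["ear_shape", "snout_length", "snow_in_background"]

# Synthetic training set in which the spurious feature (snow) is strongly
# correlated with the label "wolf", mimicking a biased photo collection.
y = rng.integers(0, 2, n)                                   # 0 = husky, 1 = wolf
X = np.column_stack([
    rng.normal(y * 0.3, 1.0),                               # weak real signal
    rng.normal(y * 0.3, 1.0),                               # weak real signal
    (y == 1).astype(float) * 0.9 + rng.normal(0, 0.2, n),   # spurious shortcut
])

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

def local_attribution(model, X_background, x, n_samples=200):
    """Estimate each feature's contribution to one prediction by replacing it
    with values drawn from background data and measuring the probability drop."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    scores = []
    for j in range(x.shape[0]):
        perturbed = np.tile(x, (n_samples, 1))
        perturbed[:, j] = rng.choice(X_background[:, j], size=n_samples)
        scores.append(base - model.predict_proba(perturbed)[:, 1].mean())
    return base, scores

x = X[y == 1][0]                      # explain one "wolf" prediction
base, scores = local_attribution(model, X, x)
print(f"P(wolf) = {base:.2f}")
for name, score in zip(features, scores):
    print(f"  {name:20s} contribution = {score:+.2f}")
```

If the snow feature dominates the attribution, the explanation has done its job: it shows that the model is accurate on the training data for the wrong reason and should not be trusted in the field.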
For this reason, explainability is hugely helpful in debugging a model. It has been repeatedly demonstrated that many openly available training data sets contain sexist, racist, and other problematic patterns that are learned by the models trained on them. With the help of explanations for individual decisions, some of these problematic learned assumptions can be identified and removed before public release, preventing harm to end users and avoiding reputational damage.
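As a hypothetical illustration of that debugging step, the sketch below uses scikit-learn's permutation importance as a simple stand-in for aggregating per-decision explanations, flagging a sensitive attribute that drives a toy loan-approval model. The feature names, synthetic data, and threshold are all invented.

```python
# Hypothetical pre-release audit: check whether a trained model leans on
# sensitive attributes. The loan-approval feature names, the toy data, and
# the 0.01 threshold are illustrative assumptions; only scikit-learn's
# permutation_importance utility is assumed.
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
feature_names = ["income", "debt_ratio", "applicant_gender"]
SENSITIVE = {"applicant_gender"}

# Toy data set in which the label leaks through the sensitive column.
n = 1000
X = np.column_stack([rng.normal(size=n), rng.normal(size=n), rng.integers(0, 2, n)])
y = ((X[:, 0] > 0) | (X[:, 2] == 1)).astype(int)

model = LogisticRegression().fit(X, y)

result = permutation_importance(model, X, y, n_repeats=20, random_state=0)
flagged = [
    name
    for name, importance in zip(feature_names, result.importances_mean)
    if name in SENSITIVE and importance > 0.01
]
print("sensitive features driving decisions:", flagged or "none")
# A non-empty list is a signal to fix the data or the model before release.
```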
Explanations should be fully auditable across the entire lifecycle of an ML model. That means recording who the human participants, trainers, and operators were, which training data sets were used, and all inputs and outputs. Explanations should also be understandable by humans and tailored to their audience: the requirements of a business user may differ from those of the end user, whether that is a banker, a police officer, or a private consumer.
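One way to make such an audit trail concrete is to log a structured record alongside every explained prediction. The sketch below is a hypothetical schema, not a prescribed standard; every field name is an assumption.

```python
# Hypothetical audit record illustrating the metadata an explanation trail
# could carry for each prediction. All field names and values are invented.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Dict, List
import json

@dataclass
class ExplanationAuditRecord:
    model_version: str              # which trained artifact produced the result
    training_data_ref: str          # pointer to the data set snapshot used
    operators: List[str]            # humans who trained, approved, or ran the model
    inputs: Dict[str, float]        # the features the model actually saw
    output: str                     # the decision returned to the user
    attributions: Dict[str, float]  # per-feature contribution to this decision
    audience: str                   # e.g. "business_user" or "end_user"
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

record = ExplanationAuditRecord(
    model_version="credit-risk-2.4.1",
    training_data_ref="s3://datasets/loans/2023-10-snapshot",
    operators=["j.doe (trainer)", "a.smith (reviewer)"],
    inputs={"income": 52000.0, "debt_ratio": 0.41},
    output="approved",
    attributions={"income": 0.32, "debt_ratio": -0.11},
    audience="end_user",
)
print(json.dumps(asdict(record), indent=2))
```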
Explainable AI has made significant progress in the last two years. Although most research is coming from the academic sector, private firms are adopting these techniques and building them into enterprise AI management consoles and platforms. The burden of ensuring explainability still lies primarily with the organization deploying the model, but soon some degree of interpretability will be a standard feature of any ML deployment.