Organisations face growing legal and social responsibilities to explain decisions made using autonomous systems. Though much attention is given to how these decisions impact the public, the decisions must also be clear and interpretable internally for employees. In many sectors, this means provisioning textual explanations of decisions based on technical or expertise-driven information in a form that non-expert users can understand, thereby supporting problem-solving in real time. As an example, our current work with a telecommunications organisation is centred on empowering desk-based agents to better understand autonomous decision-making based on specialist field-engineer notes. In this domain we have implemented a range of low-level (word-matching between problem and solution, confidence metrics), high-level (summarisation of similarities/differences), and co-created (hazard identification) textual explanation methods.
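To make the low-level explanation method concrete, a minimal sketch of word-matching between a problem description and a field-engineer solution note might look as follows (the function name, tokenisation, and overlap-based confidence score are illustrative assumptions, not the deployed system):

```python
def word_match_explanation(problem: str, solution: str) -> dict:
    """Explain a retrieved solution by listing terms it shares with the problem.

    Returns the shared terms and a simple overlap-based confidence score:
    the fraction of problem terms that also appear in the solution note.
    """
    problem_terms = set(problem.lower().split())
    solution_terms = set(solution.lower().split())
    shared = problem_terms & solution_terms
    confidence = len(shared) / len(problem_terms) if problem_terms else 0.0
    return {"shared_terms": sorted(shared), "confidence": round(confidence, 2)}

# Hypothetical problem report and engineer note for illustration only.
explanation = word_match_explanation(
    "line fault at cabinet after storm",
    "engineer replaced damaged line card at cabinet",
)
# explanation["shared_terms"] -> ["at", "cabinet", "line"]
```

In practice such a sketch would need domain-aware preprocessing (stop-word removal, stemming of engineering jargon), but it illustrates why these explanations are easy to generate yet hard to evaluate for subjective quality.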
Increasingly we face difficulties in empirically evaluating the quality of these explanations, a problem which becomes even more challenging as the complexity of the provisioned explanation grows. Though we can easily examine whether an explanation contains the necessary content, it is more difficult to determine whether this content is placed in a suitable context to answer the user's need for an explanation (i.e. its subjective quality). In this talk we will discuss our current work on eXplainable AI (XAI) and position it within the state of the art by examining the output of several national and international workshops on the subject. In particular, we will highlight an important gap in current XAI research: the ability to empirically evaluate the quality of an explanation. We will present our findings in this domain and argue that empirical evaluation of explanation quality is key to the future growth of XAI methods.