Developing an explainable artificial intelligence tool for training novices

by Neil Barigye

This thesis investigates the effectiveness of different explainable artificial intelligence (XAI) explanation types for training novices in dermatological image classification. A comprehensive, reusable web-based experimentation suite was designed and deployed, featuring automated random assignment of explanation types, timed training and testing phases, dynamic feedback, and a linked post-experiment Google Forms survey. The study engaged a notably large cohort of 219 participants, placing it among the largest XAI user studies to date. Participants were randomly assigned to one of five conditions: prototype-based, example-based, saliency-map-based, ChatGPT-based, or a non-XAI control. Performance was assessed across training and testing trials, alongside self-reported measures from the survey. A permutational multivariate analysis of variance (PERMANOVA) found no statistically significant differences in accuracy or response time among explanation types. Interestingly, participants’ confidence increased despite persistently low classification accuracy, even after they had viewed their poor aggregate performance data, suggesting either a disconnect between perceived and actual understanding or an unwillingness to acknowledge suboptimal performance. Survey data revealed a clear preference for textual or combined textual-visual explanations over saliency maps alone, providing actionable guidance for instructional XAI design. These results suggest that while XAI explanations may enhance engagement and self-assurance, their immediate impact on learning accuracy remains limited. The study contributes to XAI research in educational settings by introducing scalable experimental infrastructure and offering practical design recommendations for improving the instructional effectiveness of explainable models.
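
For readers unfamiliar with PERMANOVA, the following is a minimal sketch of how such a test could be run on per-participant outcome profiles (accuracy and response time) grouped by explanation condition. The synthetic data, the group sizes, the Euclidean distance metric, and the use of scikit-bio's permanova function are illustrative assumptions; this is not the thesis's actual analysis pipeline.

# Illustrative PERMANOVA sketch (assumptions: synthetic data, Euclidean
# distances, scikit-bio's permanova) -- not the thesis's actual analysis.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from skbio import DistanceMatrix
from skbio.stats.distance import permanova

rng = np.random.default_rng(0)
conditions = ["prototype", "example", "saliency", "chatgpt", "control"]

n_per_group = 44                      # roughly 219 participants over 5 conditions
groups = np.repeat(conditions, n_per_group)

# Per-participant outcome profile: [classification accuracy, mean response time in s]
accuracy = rng.uniform(0.3, 0.6, size=groups.size)
resp_time = rng.normal(12.0, 3.0, size=groups.size)
profiles = np.column_stack([accuracy, resp_time])

# Standardize both outcomes so neither dominates the distance metric,
# then build a Euclidean distance matrix between participants.
z = (profiles - profiles.mean(axis=0)) / profiles.std(axis=0)
dm = DistanceMatrix(squareform(pdist(z, metric="euclidean")))

# Pseudo-F statistic and permutation p-value for whether condition explains
# any of the variation in the outcome profiles.
result = permanova(dm, grouping=groups, permutations=999)
print(result)

A non-significant p-value from such a test, as reported above, indicates that the multivariate spread of accuracy and response time does not differ detectably across the five explanation conditions.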