Téo Sanchez, Oleksandra Vereschak, and Ophelia Deroy. 2026. Mental Models in Human-AI Interaction: Systematic Review of Empirical Methodologies and Guidelines. In Proceedings of the 31st International Conference on Intelligent User Interfaces (IUI '26), pp. 663–682. DOI: 10.1145/3742413.3789223

Abstract: The notion of a mental model has long been used in HCI to capture people’s understanding and reasoning about computing systems. Eliciting users’ mental models can explain their behaviors and attitudes toward a system: why and how they use, rely on, trust, or reject it. However, use of the concept remains conceptually fragmented and methodologically diverse, and it has not been revisited in light of modern AI systems, whose opacity and newfound abilities may challenge human understanding. To address this gap, we systematically review 88 empirical studies that elicit humans’ mental models of AI systems. We extracted and analyzed how studies define and elicit mental models, the type of mental model their method presupposes, and how these vary across AI system types. Drawing on the framing of mental models in cognitive psychology and HCI, and based on descriptive and relational analyses of the extracted variables, we find that (1) the goals of mental model elicitation bifurcate into system-specific evaluation and class-level probes that surface lay theories; (2) epistemic assumptions extend beyond the classic functional-structural lens (how the system behaves / how it works internally) to include analogical and anthropomorphic framings of AI systems; (3) elicitation methods are shaped more by system characteristics and community-specific practices than by theoretical commitments, particularly for predictive and explainable AI systems and autonomous or driver-assist vehicles. We derive 9 practical guidelines to support more deliberate and reflective methods for eliciting mental models of AI systems. In doing so, we aim to reestablish continuity between the cognitive theory of mental models and their empirical use in HCI, improving the transparency and comparability of research surrounding the concept.