26 September 2025
The ability to recognize and interpret causal relations is fundamental to building robust intelligent systems. Recent research has focused on developing benchmarks and tasks that evaluate the inferential and causal reasoning capabilities of large language models (LLMs), such as the Pairwise Causal Discovery (PCD) task. However, most of these resources are limited to English.
In this work, EMERGE partners from the University of Pisa present ExpliCITA, an Italian translation of the English ExpliCa dataset and the first publicly available dataset for joint temporal-causal reasoning in Italian. It enables the evaluation of LLMs on Italian PCD. The authors conduct an extensive empirical study across 20 Italian and multilingual models of varying sizes and training strategies, combining a perplexity-based evaluation of causal reasoning competence with multiple-choice prompting tasks in both zero-shot and few-shot settings.
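For readers unfamiliar with perplexity-based evaluation, the general idea can be illustrated with a minimal sketch. It is not the authors' protocol: the scoring rule (pick the candidate with the lowest perplexity), the labels, and the log-probability values below are all assumptions made for illustration. In practice the per-token log-probabilities would come from a language model scoring each candidate sentence.

```python
import math


def perplexity(token_logprobs):
    """Perplexity: exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_lowest_perplexity(candidates):
    """Return the label whose token log-probabilities yield the lowest perplexity."""
    return min(candidates, key=lambda label: perplexity(candidates[label]))


# Hypothetical per-token log-probabilities for two versions of the same
# sentence pair, one joined by a causal connective and one by a temporal one.
# These numbers are invented and are not taken from the paper.
candidates = {
    "causal": [-1.2, -0.8, -1.0],
    "temporal": [-2.0, -1.5, -2.5],
}

best = pick_lowest_perplexity(candidates)
print(best, round(perplexity(candidates[best]), 3))
```

A lower perplexity means the model finds that version of the sentence less surprising, which is why this kind of scoring can probe a model's preferences without asking it to generate an answer.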
Read the paper at the link below.

