Publication: Standard vs. Modular Sampling: Best Practices for Reliable LLM Unlearning

01 April 2026

A conventional LLM unlearning setting consists of two subsets, “forget” and “retain”, with the objective of removing the undesired knowledge in the forget set while preserving the remaining knowledge in the retain set. In privacy-focused unlearning research, the retain set is often further divided into neighbour sets, which contain data either directly or indirectly connected to the forget targets, and is augmented by a general-knowledge set. A common practice in existing benchmarks is to employ only a single neighbour set alongside general knowledge, which fails to reflect real-world data complexities and relationships. LLM unlearning also typically involves 1:1 sampling or cyclic iteration sampling. However, the efficacy and stability of these de facto standards have not been critically examined.
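To make the two sampling schemes concrete, here is a minimal Python sketch. It rests on assumptions that this summary does not confirm: "1:1 sampling" is read as pairing every forget example with one randomly drawn retain example, and "cyclic iteration" as walking the larger set once while the smaller set repeats. The function names and toy data are hypothetical and are not taken from the paper.

```python
import random
from itertools import cycle


def one_to_one_pairs(forget, retain, seed=0):
    """Assumed reading of 1:1 sampling: one random retain example per forget example."""
    rng = random.Random(seed)
    return [(f, rng.choice(retain)) for f in forget]


def cyclic_pairs(forget, retain):
    """Assumed reading of cyclic iteration: traverse the longer set once,
    repeating the shorter set as often as needed."""
    if len(forget) >= len(retain):
        return list(zip(forget, cycle(retain)))
    return list(zip(cycle(forget), retain))


# Hypothetical toy data: a small forget set and a larger retain set.
forget_set = ["f1", "f2"]
retain_set = ["r1", "r2", "r3", "r4", "r5"]

pairs_1to1 = one_to_one_pairs(forget_set, retain_set)
pairs_cyclic = cyclic_pairs(forget_set, retain_set)
```

Under these assumptions, 1:1 sampling sees only as many retain examples per pass as there are forget examples, whereas cyclic iteration covers the whole retain set but revisits each forget example several times. That difference is one way the choice of sampling scheme could affect the forget/retain trade-off.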

In this study, EMERGE partners from the University of Pisa systematically evaluate these common practices. Their findings reveal that relying on a single neighbour set is suboptimal and that a standard sampling approach can obscure performance trade-offs. Based on this analysis, they propose the Modular Entity-Level Unlearning (MELU) strategy as an alternative to cyclic sampling. They demonstrate that this modular approach, combined with robust algorithms, provides a clear and stable path towards effective unlearning.

Read the paper in the link below.

More Information
