
DUSK: Do Not Unlearn Shared Knowledge

Wonje Jeung, Sangyeon Yoon, Hyesoo Hong, Soeun Kim, Seungju Han, Youngjae Yu, Albert No

Yonsei University, Hongik University, Stanford University

Abstract

Machine unlearning aims to remove the influence of specific data from a trained model, often in response to privacy regulations or user deletion requests. While prior benchmarks typically assume that the forget set and retain set are disjoint in both data and content, real-world deletion requests often involve documents that contain a mixture of private and general knowledge. Simply forgetting entire documents in these cases can lead to unintended removal of shared information that should be preserved. However, existing benchmarks fail to capture this critical overlap, limiting their ability to evaluate selective forgetting in practical scenarios. To address this, we introduce DUSK, a new benchmark that explicitly models overlap between forget and retain sets. DUSK provides a controlled synthetic corpus of documents with clearly separated unique and shared content, enabling fine-grained evaluation. We propose comprehensive evaluation metrics that jointly assess forgetting effectiveness, retention of shared and retain-specific knowledge, preservation of downstream capabilities, and privacy leakage. Through experiments across nine unlearning methods, we reveal fundamental trade-offs between effective forgetting and utility preservation, demonstrating that achieving both remains a significant challenge.


Data Construction

  • Step 1: Generate 120 Q&A profiles for fictional professors
  • Step 2: Convert each profile into 5 writing styles
  • Step 3: Select 1 document as the Forget set, 4 as the Retain set
  • Step 4: Introduce shared and unique knowledge across sets
  • Step 5: Evaluate fine-grained unlearning with overlapping content

To ensure diversity and balance, the dataset construction process controlled the distribution of key attributes such as gender, religion, nationality, and institutional affiliation. This was achieved by iteratively prompting GPT-4 to generate profiles that collectively reflect a wide range of demographic and professional backgrounds.
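
As a rough illustration of this balance-aware generation loop, the sketch below tracks attribute counts and prompts GPT-4 for the least-represented combination at each step. The attribute lists, prompt wording, and loop structure are illustrative assumptions rather than the exact DUSK pipeline, and the later steps (style conversion, forget/retain split) are omitted.

```python
# Illustrative sketch of balance-aware profile generation (assumed workflow,
# not the exact DUSK pipeline). Requires the `openai` package and an API key.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical attribute pools; DUSK also balances religion and affiliation.
ATTRIBUTES = {
    "gender": ["female", "male", "non-binary"],
    "nationality": ["Korean", "Brazilian", "Nigerian", "German", "Indian"],
}
counts = {attr: Counter() for attr in ATTRIBUTES}

def underrepresented(attr):
    """Pick the least-used value of an attribute so the corpus stays balanced."""
    return min(ATTRIBUTES[attr], key=lambda v: counts[attr][v])

profiles = []
for _ in range(120):  # DUSK uses 120 fictional professor profiles
    target = {attr: underrepresented(attr) for attr in ATTRIBUTES}
    prompt = (
        "Write a Q&A profile for a fictional professor. "
        f"Gender: {target['gender']}. Nationality: {target['nationality']}. "
        "Cover biography, research area, and institutional affiliation."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    profiles.append(resp.choices[0].message.content)
    for attr, value in target.items():
        counts[attr][value] += 1
```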


Experimental Results


Forget & Retain Assessments

In our evaluation of unlearning methods on the DUSK benchmark, we report four knowledge metrics that separate what must be forgotten from what must be preserved; a minimal scoring sketch follows the definitions below.

  • UFK (Unique Forget Knowledge): Measures whether knowledge exclusive to the forget set is effectively removed.
  • SK (Shared Knowledge): Assesses if knowledge shared between the forget and retain sets is preserved.
  • URK (Unique Retain Knowledge): Verifies if knowledge exclusive to the retain set is maintained.
  • DK (Downstream Knowledge): Evaluates the model’s overall performance to ensure utility is preserved.
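
As a rough illustration (not DUSK's official evaluation code), the first three metrics can be viewed as QA accuracy over disjoint question splits; the sketch below scores a model's answers against references with a lenient containment check. The `generate` callable, the example splits, and the matching rule are assumptions made for illustration only.

```python
# Hedged sketch: knowledge metrics as QA accuracy over question splits.
import re

def normalize(text: str) -> str:
    """Lowercase and strip punctuation for lenient answer matching."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def qa_accuracy(generate, qa_pairs):
    """Fraction of questions whose reference answer appears in the model output.

    `generate` is any callable mapping a question string to an answer string
    (e.g. a wrapper around model.generate); it is an assumption of this sketch.
    """
    hits = sum(normalize(ref) in normalize(generate(q)) for q, ref in qa_pairs)
    return hits / max(len(qa_pairs), 1)

def knowledge_report(generate, splits):
    """`splits` maps metric names ('UFK', 'SK', 'URK') to lists of (question, answer)."""
    return {name: qa_accuracy(generate, pairs) for name, pairs in splits.items()}

# Example with a dummy generator; in practice `generate` would call the unlearned LLM.
splits = {
    "UFK": [("Where did Prof. Example study?", "Seoul")],
    "SK":  [("What field does Prof. Example work in?", "robotics")],
    "URK": [("What award did Prof. Example win in 2020?", "Best Paper")],
}
print(knowledge_report(lambda q: "I do not recall.", splits))
```

After unlearning, UFK should drop while SK and URK stay close to a retain-only model; DK is measured separately on general downstream benchmarks.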


Figure 4 presents a two-dimensional analysis of unlearning dynamics, focusing on how verbatim memorization and knowledge-level forgetting interact over the course of unlearning.

1. Verbatim vs. Forget Knowledge (a):
This panel contrasts Verbatim Memorization, which assesses how well the model removes the exact text of the forget set, with Forget Knowledge, which assesses whether the underlying facts are removed (a simple prefix-continuation probe of verbatim memorization is sketched after this figure description).

2. Forget Knowledge vs. Shared Knowledge (b):
This panel analyzes the relationship between Forget Knowledge (content that should be removed) and Shared Knowledge (content that appears in both the forget and retain documents). Unlearning must remove knowledge unique to the forget set while preserving the shared knowledge that also lives in the retain set.

This figure illustrates the challenge of removing verbatim content while maintaining important shared knowledge during unlearning.
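
A common way to probe verbatim memorization, used here purely as an illustrative assumption rather than DUSK's exact protocol, is a prefix-continuation check: prompt the model with the opening tokens of a forget-set passage and measure how much of the true continuation it reproduces exactly. The model name and the 50/50 token split below are placeholders.

```python
# Hedged sketch: prefix-continuation probe for verbatim memorization.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; swap in the unlearned model under evaluation
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def verbatim_match_rate(passage: str, prefix_tokens: int = 50, cont_tokens: int = 50):
    """Fraction of the true continuation reproduced exactly after a prefix prompt."""
    ids = tokenizer(passage, return_tensors="pt").input_ids[0]
    prefix = ids[:prefix_tokens]
    target = ids[prefix_tokens:prefix_tokens + cont_tokens]
    with torch.no_grad():
        out = model.generate(
            prefix.unsqueeze(0),
            max_new_tokens=len(target),
            do_sample=False,  # greedy decoding for a deterministic probe
            pad_token_id=tokenizer.eos_token_id,
        )
    generated = out[0, len(prefix):]
    n = min(len(generated), len(target))
    if n == 0:
        return 0.0
    return (generated[:n] == target[:n]).sum().item() / len(target)

# A high match rate on forget-set passages indicates residual verbatim memorization.
```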


Distributional Assessments


Figure 5 illustrates Privacy Leakage and Retain Deviation throughout the unlearning process.

1. Privacy Leakage: Measures whether any residual information from the forget set remains in the model after unlearning, emphasizing the need to ensure that sensitive data is not inadvertently retained.

2. Retain Deviation: Evaluates how much the model's behavior deviates from its original performance on the retain set, ensuring that the unlearning process does not disrupt the model’s ability to perform on non-forgotten data.

In multi-source settings, monitoring these metrics is crucial as they highlight a key challenge: selective forgetting becomes inherently difficult when the forget and retain sets share overlapping information. This overlap complicates the unlearning process, making it harder to remove only the targeted information while preserving the knowledge that should remain.
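
As a rough sketch of how such distributional checks might be instrumented, the code below probes privacy leakage with a loss-based membership-inference AUC (separating forget-set losses from holdout losses) and retain deviation with a Kolmogorov-Smirnov distance between per-example retain losses before and after unlearning. Both metric choices are assumptions of this illustration; DUSK's actual definitions may differ.

```python
# Hedged sketch: loss-based proxies for privacy leakage and retain deviation.
import numpy as np
from scipy.stats import ks_2samp

def mia_auc(forget_losses, holdout_losses):
    """AUC of a threshold attack that flags low-loss examples as 'seen in training'.

    ~0.5 means the unlearned model no longer distinguishes forget-set text
    from unseen text; values near 1.0 indicate residual privacy leakage.
    """
    scores = np.concatenate([-np.asarray(forget_losses), -np.asarray(holdout_losses)])
    labels = np.concatenate([np.ones(len(forget_losses)), np.zeros(len(holdout_losses))])
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def retain_deviation(losses_before, losses_after):
    """KS distance between retain-set loss distributions before and after unlearning."""
    return ks_2samp(losses_before, losses_after).statistic

# Example with synthetic losses; in practice these come from per-example
# negative log-likelihoods of the original and unlearned models.
rng = np.random.default_rng(0)
print(mia_auc(rng.normal(1.0, 0.3, 200), rng.normal(2.0, 0.3, 200)))
print(retain_deviation(rng.normal(2.0, 0.3, 200), rng.normal(2.1, 0.3, 200)))
```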

BibTeX

@article{jeung2025dusk,
  title={DUSK: Do Not Unlearn Shared Knowledge},
  author={Jeung, Wonje and Yoon, Sangyeon and Hong, Hyesoo and Kim, Soeun and Han, Seungju and Yu, Youngjae and No, Albert},
  journal={arXiv preprint arXiv:2505.15209},
  year={2025}
}