Repo: OpenBMB (MiniCPM), published Feb 18, 2025

OpenBMB/ParamMute

Python


Description: ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation

Language: Python

Stars: 58

Forks: 5

Open issues: 0

Created: 2025-02-18T08:45:45Z

Pushed: 2026-02-02T11:12:44Z

Default branch: main

Fork: no

Archived: no

README:

ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation

🎉 News

  • 20250919: Our paper has been accepted to NeurIPS 2025! Congratulations! 🎉
  • 20250615: Our work received the Highlight Poster Award 🏆 at YSSNLP 2025! Congratulations! 🎉
  • 20250529: We updated our paper on Paper.
  • 20250226: Released our train data and test data on Hugging Face.
  • 20250219: Released our Paper on arXiv. Released our Model on Hugging Face. Released our Code on GitHub.

🎯 1. Introduction

We investigate the internal mechanisms behind unfaithful generation and identify a subset of mid-to-deep (70%–90% relative depth range) FFNs that are disproportionately activated in such cases. Building on this insight, we propose Parametric Knowledge Muting through FFN Suppression (ParamMute), a framework that improves contextual faithfulness by suppressing the activation of unfaithfulness-associated FFNs and calibrating the model toward retrieved knowledge. Experimental results on CoConflictQA and ConFiQA demonstrate that ParamMute significantly reduces knowledge conflicts and improves context fidelity.

⚡ 2. ParamMute Pipeline

2.1. Setup

2.1.1. Installation

(1) Use git clone to download this project:

git clone git@github.com:OpenBMB/ParamMute.git
cd ParamMute

(2) Install the following packages with pip or conda in your environment:

Python=3.10.16
torch=2.5.1
tqdm
jsonlines
rouge
datasets
tensorboardX
vllm==0.6.6.post1
accelerate==1.3.0
deepspeed==0.16.3
peft==0.14.0

(3) Install our modified transformers located in src/transformers to enable ParamMute functionality:

cd src/transformers
pip install -e .

2.1.2. Download the necessary resources:

The testing data can be downloaded from CoConflictQA. After downloading, place the files into the data directory using the following structure:

test/
├── hotpotq_kc.jsonl
├── NaturalQuestionsShort_kc.jsonl
├── NewsQA_kc.jsonl
...

2.2. Identifying Unfaithfulness-Associated FFNs (UA-FFNs)

First, we visualize the activation differences between faithful and unfaithful responses and select the Top-K layers with the largest differences as Unfaithfulness-Associated FFNs (UA-FFNs). Our analysis in the paper (§2) shows that over-activation of these FFNs is strongly and causally linked to the model's unfaithful generations.

bash 1_visualize.sh

Running the command above generates the visualization results. (More figures for different models are available in the assets directory.)

![method](assets/activations_llama3_8b_instruct.png)

Based on the visualization results, we select the Top-K layers exhibiting the largest activation differences as the Unfaithfulness-Associated FFNs (UA-FFNs) for subsequent activation suppression. For LLaMA3-8B-Instruct, we set K to 8.
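
The selection step can be sketched as follows, assuming a per-layer mean FFN activation has already been computed for faithful and for unfaithful responses. The function name `select_ua_ffn_layers`, the ranking rule (unfaithful minus faithful), and the toy numbers are assumptions for illustration; `1_visualize.sh` may compute and rank the differences differently.

```python
def select_ua_ffn_layers(faithful, unfaithful, k):
    """Return the indices of the k layers with the largest activation gap.

    faithful / unfaithful: per-layer mean FFN activation values (same length).
    The gap is unfaithful minus faithful, so layers that are over-activated in
    unfaithful responses rank first. This ranking rule is an assumption.
    """
    if len(faithful) != len(unfaithful):
        raise ValueError("both inputs must cover the same layers")
    gaps = [u - f for f, u in zip(faithful, unfaithful)]
    ranked = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
    return sorted(ranked[:k])


# Toy values for a 6-layer model (made up for illustration).
faithful = [0.10, 0.12, 0.11, 0.20, 0.22, 0.15]
unfaithful = [0.11, 0.12, 0.13, 0.35, 0.40, 0.16]
print(select_ua_ffn_layers(faithful, unfaithful, k=2))  # [3, 4]
```

The returned indices are sorted so they can be passed on directly as a layer list.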

Note: The scripts require the data to be in JSONL format and include the following fields:

  • context: The context provided to the model.
  • question: The question being asked.
  • parametric_answer: The model's parametric knowledge for the given question.
  • prompt_w_context: The prompt with context.
  • is_parametric_answer_right: Whether the model's parametric knowledge is correct.
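
Before running the scripts, it may help to confirm that every record carries these fields. The following is a minimal sketch using only the standard library; `missing_fields` is a hypothetical helper, and the sample values (including the boolean type for `is_parametric_answer_right`) are made up for illustration.

```python
import json

REQUIRED_FIELDS = (
    "context",
    "question",
    "parametric_answer",
    "prompt_w_context",
    "is_parametric_answer_right",
)


def missing_fields(lines):
    """Yield (line_number, missing_field_names) for each incomplete record."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            yield number, missing


# Two in-memory records: the first is complete, the second is not.
sample = [
    json.dumps({
        "context": "Paris is the capital of France.",
        "question": "What is the capital of France?",
        "parametric_answer": "Paris",
        "prompt_w_context": "Context: ... Question: ...",
        "is_parametric_answer_right": True,
    }),
    json.dumps({"context": "...", "question": "..."}),
]
for number, missing in missing_fields(sample):
    print(f"line {number}: missing {missing}")
```

To check a file on disk, pass the open file object instead of a list, for example `list(missing_fields(open(path, encoding="utf-8")))`.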

2.3. Knowledge-Augmented Adaptation (Tuning)

After identifying the UA-FFNs, we train the LLM with these UA-FFNs suppressed so that it adapts faithfully to the retrieved knowledge, using the following script:

bash tune.sh

Key parameters include:

  • `train_mode`: Choose either `sft` (standard supervised fine-tuning) or `input_contrastive` (preference optimization as described in §3.2). We recommend using `input_contrastive` when higher faithfulness is required. For general scenarios, `sft` is preferred.
  • `model_type`: Specify the model type. Options include `llama`, `LlamaForCausalLM_w_act_inhibit`, or `LlamaForInputContrastivew_act_inhibit`, which correspond to different architectures matching the selected `train_mode`.
  • `inhibit_strength`: Controls the suppression strength for UA-FFN activations.
  • `inhibit_layer_list`: Specifies which layers are designated as UA-FFNs.

📃 3. Evaluation

For any model, you can perform inference using the script located at scripts/Evaluation/evaluate.sh.

bash evaluate.sh

Key parameters include:

  • `act_inhibit_layer_list`: The same layer list used in the training scripts.
  • `act_inhibit_ratio`: The same value used in the training scripts.

*Note: Our design allows you to adjust `act_inhibit_ratio` dynamically during inference to control the model’s reliance on parametric knowledge. Setting the suppression coefficient to a value greater than 1 can instead increase the model’s dependence on parametric knowledge.*
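
To make the effect of the coefficient concrete, here is a toy sketch of a residual stack in which the FFN output of selected layers is multiplied by a coefficient. The README does not spell out how `act_inhibit_ratio` is applied inside the modified transformers, so the plain multiplicative rule, the `forward` function, and the constant-valued toy FFNs are all assumptions made for illustration.

```python
def forward(hidden, ffns, inhibit_layers, coefficient):
    """Run a toy residual stack, scaling FFN outputs in the selected layers.

    hidden: list of floats (a toy hidden state).
    ffns: one callable per layer, mapping a hidden state to an FFN output.
    inhibit_layers: indices of layers treated as UA-FFNs.
    coefficient: multiplier applied to those layers' FFN outputs
        (< 1 suppresses, 1 leaves unchanged, > 1 amplifies). Assumed rule.
    """
    inhibit = set(inhibit_layers)
    for index, ffn in enumerate(ffns):
        scale = coefficient if index in inhibit else 1.0
        update = ffn(hidden)
        hidden = [h + scale * u for h, u in zip(hidden, update)]
    return hidden


# Three toy "FFNs" that each add a constant contribution.
ffns = [lambda h, c=c: [c] * len(h) for c in (1.0, 2.0, 4.0)]

print(forward([0.0, 0.0], ffns, inhibit_layers=[1, 2], coefficient=1.0))  # [7.0, 7.0]
print(forward([0.0, 0.0], ffns, inhibit_layers=[1, 2], coefficient=0.5))  # [4.0, 4.0]
print(forward([0.0, 0.0], ffns, inhibit_layers=[1, 2], coefficient=0.0))  # [1.0, 1.0]
```

Under this assumed rule, lowering the coefficient shrinks the contribution of the selected layers while leaving the other layers untouched, and a coefficient above 1 enlarges it.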

🛫 4. Usage

Our model and data can be found in our Hugging Face collection, `ParamMute`:

| Resource | Description | Link |
|------------------|-----------------------------------------------------|-----------------------------------------------------------|
| ParamMute-8B-SFT | Based on LLaMA3-8B-Instruct, trained via supervised fine-tuning (SFT) with activation suppression applied to layers 19–26. | 🤗ParamMute-8B-SFT |
| ParamMute-8B-KTO | Based on LLaMA3-8B-Instruct, trained via KTO with activation suppression applied to layers 19–26. | 🤗ParamMute-8B-KTO |
| CoConflictQA | A benchmark specifically designed to evaluate faithfulness in scenarios where the internal knowledge of LLaMA3-8B-Instruct conflicts with accurate external evidence. | 🤗CoConflictQA |

# Please install src/transformers first!
from…
