8/22/2025

Memory Decoder: Efficient Plug-and-Play Domain Adaptation for LLMs

10 tweets
2 min read

Thrummarise

@summarizer

Large Language Models (LLMs) excel at general tasks but struggle to adapt to specialized domains. Traditional approaches such as Domain-Adaptive Pretraining (DAPT) are costly and prone to catastrophic forgetting, while Retrieval-Augmented Generation (RAG) slows inference with expensive nearest-neighbor searches.

Memory Decoder (MemDec) offers a novel plug-and-play pretrained memory module that adapts LLMs efficiently without modifying their parameters. It uses a small transformer decoder pretrained to mimic non-parametric retrievers, enabling seamless integration with any model sharing the same tokenizer.
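A minimal sketch of what this plug-and-play pairing could look like, assuming Hugging Face-style causal LMs; the MemoryAugmentedLM wrapper and the checkpoint arguments are illustrative assumptions, not the authors' released code:

```python
# Illustrative sketch: pair a frozen base LLM with a small pretrained memory decoder
# that shares its tokenizer. The checkpoint names passed in are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

class MemoryAugmentedLM:
    def __init__(self, base_name: str, memory_name: str):
        self.tokenizer = AutoTokenizer.from_pretrained(base_name)
        self.base = AutoModelForCausalLM.from_pretrained(base_name)      # left unchanged
        self.memory = AutoModelForCausalLM.from_pretrained(memory_name)  # small domain memory
        self.base.eval()
        self.memory.eval()
        # Plug-and-play requirement: both components must share one vocabulary,
        # so the same memory decoder can be reused across base models of any size.
        assert self.base.config.vocab_size == self.memory.config.vocab_size
```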

Unlike RAG, which requires costly nearest neighbor searches during inference, Memory Decoder eliminates retrieval overhead by internalizing retrieval behavior into a compact parametric model, achieving domain adaptation with minimal latency increase and no datastore maintenance.

Experiments on biomedicine, finance, and law domains show MemDec reduces perplexity by an average of 6.17 points across multiple Qwen and Llama models, from 0.5B to 72B parameters, demonstrating scalability and consistent performance gains without retraining each model separately.

MemDec’s pretraining aligns its output distribution with that of a kNN retriever, using a hybrid loss that combines a KL-divergence term against the retriever’s distribution with a standard language-modeling objective, capturing diverse domain knowledge while preserving general language capabilities.
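A compact sketch of this training objective, assuming a PyTorch setup in which the kNN retriever’s next-token distribution has been precomputed for each position; the function name, the `alpha` weight, and the tensor layout are assumptions rather than the paper’s exact formulation:

```python
import torch
import torch.nn.functional as F

def hybrid_memory_loss(memory_logits, knn_target_dist, target_ids, alpha=0.5):
    """memory_logits:   [batch, seq, vocab] logits from the small memory decoder
    knn_target_dist: [batch, seq, vocab] precomputed kNN retrieval distribution
    target_ids:      [batch, seq] gold next tokens from the domain corpus
    alpha:           assumed weight balancing the two terms
    """
    log_probs = F.log_softmax(memory_logits, dim=-1)

    # KL(p_kNN || p_mem): pull the decoder's distribution toward the retriever's.
    kl_term = F.kl_div(log_probs, knn_target_dist, reduction="batchmean")

    # Standard language-modeling cross-entropy on the domain text.
    ce_term = F.cross_entropy(
        memory_logits.reshape(-1, memory_logits.size(-1)), target_ids.reshape(-1)
    )

    return alpha * kl_term + (1.0 - alpha) * ce_term
```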

During inference, MemDec runs in parallel with the base LLM, interpolating output probabilities to enhance domain-specific predictions. This approach achieves up to 10× speedup over kNN-LM on large models, making it practical for production deployment where efficiency is critical.
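A minimal sketch of this inference-time interpolation, assuming Hugging Face-style models that return `.logits` and share a tokenizer; `lam` is an illustrative mixing weight, not a value reported in the paper:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def interpolated_next_token_probs(base_model, memory_decoder, input_ids, lam=0.3):
    # Both models score the same prefix; they can run in parallel since neither
    # depends on the other's output.
    base_logits = base_model(input_ids).logits[:, -1, :]
    mem_logits = memory_decoder(input_ids).logits[:, -1, :]

    p_base = F.softmax(base_logits, dim=-1)
    p_mem = F.softmax(mem_logits, dim=-1)

    # kNN-LM-style mixture, but with a parametric memory instead of a datastore:
    # p(x_t | x_<t) = lam * p_mem + (1 - lam) * p_base
    return lam * p_mem + (1.0 - lam) * p_base
```

Because the memory decoder is small relative to the base model, this extra forward pass adds little latency compared with a nearest-neighbor lookup over a large datastore.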

Beyond single-tokenizer adaptation, MemDec can transfer domain knowledge across tokenizers and architectures with minimal additional training, enabling flexible cross-model and cross-vocabulary domain adaptation.

MemDec also excels on knowledge-intensive reasoning tasks, improving factual recall without the degradation in reasoning ability that commonly affects retrieval-augmented methods, and maintaining semantic coherence and fluency.

Compared to parameter-efficient fine-tuning methods like LoRA and full DAPT, MemDec consistently outperforms or matches them while preserving original model parameters, avoiding catastrophic forgetting and enabling zero-shot generalization across downstream tasks.

In summary, Memory Decoder redefines domain adaptation by decoupling domain expertise from model parameters through a pretrained memory component, delivering a modular, efficient, and scalable solution to specialize LLMs across diverse domains and model families.
