Awesome-Forgetting-in-Deep-Learning

A comprehensive list of papers accompanying the survey 'A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning'.

Abstract

Forgetting refers to the loss or deterioration of previously acquired information or knowledge. While the existing surveys on forgetting have primarily focused on continual learning, forgetting is a prevalent phenomenon observed in various other research domains within deep learning. Forgetting manifests in research fields such as generative models due to generator shifts, and federated learning due to heterogeneous data distributions across clients. Addressing forgetting encompasses several challenges, including balancing the retention of old task knowledge with fast learning of new tasks, managing task interference with conflicting goals, and preventing privacy leakage, etc. Moreover, most existing surveys on continual learning implicitly assume that forgetting is always harmful. In contrast, our survey argues that forgetting is a double-edged sword and can be beneficial and desirable in certain cases, such as privacy-preserving scenarios. By exploring forgetting in a broader context, we aim to present a more nuanced understanding of this phenomenon and highlight its potential advantages. Through this comprehensive survey, we aspire to uncover potential solutions by drawing upon ideas and approaches from various fields that have dealt with forgetting. By examining forgetting beyond its conventional boundaries, in future work, we hope to encourage the development of novel strategies for mitigating, harnessing, or even embracing forgetting in real applications.

Citation

If you find our paper or this resource helpful, please consider citing:

@article{Forgetting_Survey_2024,
  title={A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning},
  author={Wang, Zhenyi and Yang, Enneng and Shen, Li and Huang, Heng},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2024},
  publisher={IEEE}
}

Thanks!

Framework

Harmful Forgetting
Beneficial Forgetting
- Forgetting Irrelevant Information to Achieve Better Performance
  - Combat Overfitting Through Forgetting
  - Learning New Knowledge Through Forgetting Previous Knowledge
- Machine Unlearning

Harmful Forgetting

Harmful forgetting occurs when we desire the machine learning model to retain previously learned knowledge while adapting to new tasks, domains, or environments. In such cases, it is important to prevent and mitigate knowledge forgetting.

Problem Setting	Goal	Source of forgetting
Continual Learning	learn non-stationary data distribution without forgetting previous knowledge	data-distribution shift during training
Foundation Model	unsupervised learning on large-scale unlabeled data	data-distribution shift in pre-training, fine-tuning
Domain Adaptation	adapt to target domain while maintaining performance on source domain	target domain shifts sequentially over time
Test-time Adaptation	mitigate the distribution gap between training and testing	adaptation to the test data distribution during testing
Meta-Learning	learn knowledge that adapts to new tasks	incrementally meta-learn new classes / task-distribution shift
Generative Model	learn a generator to approximate real data distribution	generator shift/data-distribution shift
Reinforcement Learning	maximize accumulated rewards	shifts in state, action, reward and state transition dynamics
Federated Learning	decentralized training without sharing data	model averaging; non-i.i.d. data; data-distribution shift

Forgetting in Continual Learning

[Back to top]

The goal of continual learning (CL) is to learn from a sequence of tasks without forgetting the knowledge acquired on previous tasks.

Survey and Book

Paper Title	Year	Conference/Journal
Latest Advancements Towards Catastrophic Forgetting under Data Scarcity: A Comprehensive Survey on Few-Shot Class Incremental Learning	2025	Arxiv
Federated Continual Learning: Concepts, Challenges, and Solutions	2025	Arxiv
Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks	2025	Arxiv
A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning	2024	TPAMI
Class-Incremental Learning: A Survey	2024	TPAMI
A Comprehensive Survey of Continual Learning: Theory, Method and Application	2024	TPAMI
Unleashing the Power of Continual Learning on Non-Centralized Devices: A Survey	2024	Arxiv
Federated Continual Learning for Edge-AI: A Comprehensive Survey	2024	Arxiv
Continual Learning with Neuromorphic Computing: Theories, Methods, and Applications	2024	Arxiv
Recent Advances of Multimodal Continual Learning: A Comprehensive Survey	2024	Arxiv
Towards General Industrial Intelligence: A Survey on Industrial IoT-Enhanced Continual Large Models	2024	Arxiv
Towards Lifelong Learning of Large Language Models: A Survey	2024	Arxiv
Recent Advances of Foundation Language Models-based Continual Learning: A Survey	2024	Arxiv
Continual Learning of Large Language Models: A Comprehensive Survey	2024	Arxiv
Continual Learning on Graphs: Challenges, Solutions, and Opportunities	2024	Arxiv
Continual Learning on Graphs: A Survey	2024	Arxiv
Continual Learning for Large Language Models: A Survey	2024	Arxiv
Continual Learning with Pre-Trained Models: A Survey	2024	IJCAI
A Survey on Few-Shot Class-Incremental Learning	2024	Neural Networks
Sharpness and Gradient Aware Minimization for Memory-based Continual Learning	2023	SOICT
A Survey on Incremental Update for Neural Recommender Systems	2023	Arxiv
Continual Graph Learning: A Survey	2023	Arxiv
Towards Label-Efficient Incremental Learning: A Survey	2023	Arxiv
Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation	2023	Arxiv
How to Reuse and Compose Knowledge for a Lifetime of Tasks: A Survey on Continual Learning and Functional Composition	2023	Transactions on Machine Learning Research
Online Continual Learning in Image Classification: An Empirical Survey	2022	Neurocomputing
Class-incremental learning: survey and performance evaluation on image classification	2022	TPAMI
Towards Continual Reinforcement Learning: A Review and Perspectives	2022	Journal of Artificial Intelligence Research
An Introduction to Lifelong Supervised Learning	2022	Arxiv
Continual Learning for Real-World Autonomous Systems: Algorithms, Challenges and Frameworks	2022	Arxiv
A continual learning survey: Defying forgetting in classification tasks	2021	TPAMI
Recent Advances of Continual Learning in Computer Vision: An Overview	2021	Arxiv
Continual Lifelong Learning in Natural Language Processing: A Survey	2020	COLING
A Comprehensive Study of Class Incremental Learning Algorithms for Visual Tasks	2020	Neural Networks
Continual Lifelong Learning with Neural Networks: A Review	2019	Neural Networks
Three scenarios for continual learning	2018	NeurIPSW
Lifelong Machine Learning	2016	Book

Task-aware CL

[Back to top]

Task-aware CL focuses on addressing scenarios where explicit task definitions, such as task IDs or labels, are available during the CL process. Existing methods on task-aware CL have explored five main branches: Memory-based Methods | Architecture-based Methods | Regularization-based Methods | Subspace-based Methods | Bayesian Methods.

Memory-based Methods

[Back to top]

Memory-based (or rehearsal-based) methods keep a memory buffer that stores examples or knowledge from previous tasks and replay them while learning new tasks.
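
As a rough illustration of the replay idea described above, here is a minimal sketch of a fixed-size memory filled by reservoir sampling, one common way to populate a buffer from a data stream. The names `ReservoirBuffer` and `replay_batches` are hypothetical helpers made up for this example and are not taken from any paper listed below; real methods differ in how they select, store, and use the replayed examples.

```python
import random


class ReservoirBuffer:
    """Fixed-size memory filled by reservoir sampling (illustrative helper)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []   # stored examples from past data
        self.seen = 0     # number of examples offered so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one example; each example seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Pick a slot in [0, seen); overwrite only if it falls inside the buffer.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples without replacement."""
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def replay_batches(stream, buffer, replay_size):
    """Pair each new example with examples replayed from memory.

    A training loop (not shown) would compute its loss on both parts.
    """
    for example in stream:
        yield example, buffer.sample(replay_size)
        buffer.add(example)
```

In use, a learner would iterate over `replay_batches(task_stream, buffer, replay_size)` and take one gradient step per yielded pair, so old examples keep contributing to the loss while new tasks arrive.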

Paper Title	Year	Conference/Journal
Prior-free Balanced Replay: Uncertainty-guided Reservoir Sampling for Long-Tailed Continual Learning	2024	MM
FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning	2024	MM
Multi-layer Rehearsal Feature Augmentation for Class-Incremental Learning	2024	ICML
Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method	2024	ICML
Accelerating String-Key Learned Index Structures via Memoization based Incremental Training	2024	VLDB
DSLR: Diversity Enhancement and Structure Learning for Rehearsal-based Graph Continual Learning	2024	WWW
Exemplar-based Continual Learning via Contrastive Learning	2024	IEEE Transactions on Artificial Intelligence
Saving 100x Storage: Prototype Replay for Reconstructing Training Sample Distribution in Class-Incremental Semantic Segmentation	2023	NeurIPS
Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models	2023	NeurIPS
A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm	2023	NeurIPS
An Efficient Dataset Condensation Plugin and Its Application to Continual Learning	2023	NeurIPS
Augmented Memory Replay-based Continual Learning Approaches for Network Intrusion Detection	2023	NeurIPS
Bilevel Coreset Selection in Continual Learning: A New Formulation and Algorithm	2023	NeurIPS
FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning	2023	NeurIPS
Distributionally Robust Memory Evolution with Generalized Divergence for Continual Learning	2023	TPAMI
Improving Replay Sample Selection and Storage for Less Forgetting in Continual Learning	2023	ICCV
Masked Autoencoders are Efficient Class Incremental Learners	2023	ICCV
Error Sensitivity Modulation based Experience Replay: Mitigating Abrupt Representation Drift in Continual Learning	2023	ICLR
A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning	2023	ICLR
DualHSIC: HSIC-Bottleneck and Alignment for Continual Learning	2023	ICML
DDGR: Continual Learning with Deep Diffusion-based Generative Replay	2023	ICML
BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning	2023	ICML
Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal	2023	ICML
Poisoning Generative Replay in Continual Learning to Promote Forgetting	2023	ICML
Regularizing Second-Order Influences for Continual Learning	2023	CVPR
Class-Incremental Exemplar Compression for Class-Incremental Learning	2023	CVPR
A closer look at rehearsal-free continual learning	2023	CVPRW
Continual Learning by Modeling Intra-Class Variation	2023	TMLR
Class-Incremental Learning using Diffusion Model for Distillation and Replay	2023	Arxiv
On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning	2022	NeurIPS
Exploring Example Influence in Continual Learning	2022	NeurIPS
Navigating Memory Construction by Global Pseudo-Task Simulation for Continual Learning	2022	NeurIPS
Learning Fast, Learning Slow: A General Continual Learning Method based on Complementary Learning System	2022	ICLR
Information-theoretic Online Memory Selection for Continual Learning	2022	ICLR
Memory Replay with Data Compression for Continual Learning	2022	ICLR
Improving Task-free Continual Learning by Distributionally Robust Memory Evolution	2022	ICML
GCR: Gradient Coreset based Replay Buffer Selection for Continual Learning	2022	CVPR
On the Convergence of Continual Learning with Adaptive Methods	2022	UAI
RMM: Reinforced Memory Management for Class-Incremental Learning	2021	NeurIPS
Rainbow Memory: Continual Learning with a Memory of Diverse Samples	2021	CVPR
Prototype Augmentation and Self-Supervision for Incremental Learning	2021	CVPR
Class-incremental experience replay for continual learning under concept drift	2021	CVPRW
Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning	2021	ICCV
Using Hindsight to Anchor Past Knowledge in Continual Learning	2021	AAAI
Improved Schemes for Episodic Memory-based Lifelong Learning	2020	NeurIPS
Dark Experience for General Continual Learning: a Strong, Simple Baseline	2020	NeurIPS
La-MAML: Look-ahead Meta Learning for Continual Learning	2020	NeurIPS
GAN Memory with No Forgetting	2020	NeurIPS
Brain-inspired replay for continual learning with artificial neural networks	2020	Nature Communications
LAMOL: LAnguage MOdeling for Lifelong Language Learning	2020	ICLR
Mnemonics Training: Multi-Class Incremental Learning without Forgetting	2020	CVPR
GDumb: A Simple Approach that Questions Our Progress in Continual Learning	2020	ECCV
Episodic Memory in Lifelong Language Learning	2019	NeurIPS
Continual Learning with Tiny Episodic Memories	2019	ICML
Efficient lifelong learning with A-GEM	2019	ICLR
Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference	2019	ICLR
Large Scale Incremental Learning	2019	CVPR
On Tiny Episodic Memories in Continual Learning	2019	Arxiv
Memory Replay GANs: learning to generate images from new categories without forgetting	2018	NeurIPS
Progress & Compress: A scalable framework for continual learning	2018	ICML
Gradient Episodic Memory for Continual Learning	2017	NeurIPS
Continual Learning with Deep Generative Replay	2017	NeurIPS
iCaRL: Incremental Classifier and Representation Learning	2017	CVPR
Catastrophic forgetting, rehearsal and pseudorehearsal	1995	Connection Science

Architecture-based Methods

[Back to top]

The architecture-based approach avoids forgetting by reducing parameter sharing between tasks or adding parameters for new tasks.
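
To make the "adding parameters" idea concrete, the following is a minimal sketch under simplifying assumptions: each task gets its own linear head over fixed features, and training a task only touches that task's head. The class `TaskHeads` and its methods are hypothetical names for this example; the methods listed below use far richer mechanisms (pruning, masking, network expansion).

```python
class TaskHeads:
    """One linear head per task; heads of earlier tasks are never modified."""

    def __init__(self, feature_dim):
        self.feature_dim = feature_dim
        self.heads = {}  # task_id -> list of weights

    def add_task(self, task_id):
        """Allocate fresh parameters for a new task."""
        if task_id in self.heads:
            raise ValueError(f"task {task_id!r} already exists")
        self.heads[task_id] = [0.0] * self.feature_dim

    def predict(self, task_id, features):
        """Linear prediction using only the head of the given task."""
        return sum(w * x for w, x in zip(self.heads[task_id], features))

    def update(self, task_id, features, target, lr=0.1):
        """One SGD step on squared error, applied to the selected head only."""
        error = self.predict(task_id, features) - target
        self.heads[task_id] = [
            w - lr * error * x for w, x in zip(self.heads[task_id], features)
        ]
```

Because parameters are isolated per task, learning a later task cannot overwrite an earlier head; the cost is that the task identity must be known at prediction time and the parameter count grows with the number of tasks.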

Paper Title	Year	Conference/Journal
CEAT: Continual Expansion and Absorption Transformer for Non-Exemplar Class-Incremental Learning	2024	TCSVT
Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning	2024	ICML
Revisiting Neural Networks for Continual Learning: An Architectural Perspective	2024	IJCAI
Recall-Oriented Continual Learning with Generative Adversarial Meta-Model	2024	AAAI
Divide and not forget: Ensemble of selectively trained experts in Continual Learning	2024	ICLR
A Probabilistic Framework for Modular Continual Learning	2024	ICLR
Incorporating neuro-inspired adaptability for continual learning in artificial intelligence	2023	Nature Machine Intelligence
TriRE: A Multi-Mechanism Learning Paradigm for Continual Knowledge Retention and Promotion	2023	NeurIPS
ScrollNet: Dynamic Weight Importance for Continual Learning	2023	ICCV
CLR: Channel-wise Lightweight Reprogramming for Continual Learning	2023	ICCV
Parameter-Level Soft-Masking for Continual Learning	2023	ICML
Continual Learning on Dynamic Graphs via Parameter Isolation	2023	SIGIR
Heterogeneous Continual Learning	2023	CVPR
Dense Network Expansion for Class Incremental Learning	2023	CVPR
Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning	2023	CVPR
Forget-free Continual Learning with Winning Subnetworks	2022	ICML
NISPA: Neuro-Inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks	2022	ICML
Continual Learning with Filter Atom Swapping	2022	ICLR
SparCL: Sparse Continual Learning on the Edge	2022	NeurIPS
Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning	2022	CVPR
FOSTER: Feature Boosting and Compression for Class-Incremental Learning	2022	ECCV
BNS: Building Network Structures Dynamically for Continual Learning	2021	NeurIPS
DER: Dynamically Expandable Representation for Class Incremental Learning	2021	CVPR
Adaptive Aggregation Networks for Class-Incremental Learning	2021	CVPR
BatchEnsemble: an Alternative Approach to Efficient Ensemble and Lifelong Learning	2020	ICLR
Calibrating CNNs for Lifelong Learning	2020	NeurIPS
Continual Learning of a Mixed Sequence of Similar and Dissimilar Tasks	2020	NeurIPS
Compacting, Picking and Growing for Unforgetting Continual Learning	2019	NeurIPS
Superposition of many models into one	2019	NeurIPS
Reinforced Continual Learning	2018	NeurIPS
Progress & Compress: A scalable framework for continual learning	2018	ICML
Overcoming Catastrophic Forgetting with Hard Attention to the Task	2018	ICML
Lifelong Learning with Dynamically Expandable Networks	2018	ICLR
PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning	2018	CVPR
Expert Gate: Lifelong Learning with a Network of Experts	2017	CVPR
Progressive Neural Networks	2016	Arxiv

Regularization-based Methods

[Back to top]

Regularization-based approaches avoid forgetting by penalizing updates to important parameters or by distilling knowledge from the previous model, which acts as a teacher.
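
The two ingredients named above can be sketched as follows. The first function is an importance-weighted quadratic penalty in the general style of EWC-like methods; the second is a temperature-softened KL term between a previous model's outputs and the current model's outputs. The function names, the `strength` and `temperature` parameters, and the way importance is supplied are assumptions for illustration only; each listed paper defines its own importance estimate and loss.

```python
import math


def quadratic_penalty(params, old_params, importance, strength=1.0):
    """0.5 * strength * sum_i importance_i * (param_i - old_param_i)^2."""
    return 0.5 * strength * sum(
        w * (p - q) ** 2 for p, q, w in zip(params, old_params, importance)
    )


def quadratic_penalty_grad(params, old_params, importance, strength=1.0):
    """Gradient of the penalty; it pulls important parameters back to old values."""
    return [strength * w * (p - q) for p, q, w in zip(params, old_params, importance)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs; zero when both agree."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
```

Either term would be added to the loss on the new task, so that the optimizer trades off fitting new data against staying close to the previous model.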

Paper Title	Year	Conference/Journal
Rehearsal-Free Continual Federated Learning with Synergistic Regularization	2024	Arxiv
A Statistical Theory of Regularization-Based Continual Learning	2024	ICML
IMEX-Reg: Implicit-Explicit Regularization in the Function Space for Continual Learning	2024	TMLR
Contrastive Continual Learning with Importance Sampling and Prototype-Instance Relation Distillation	2024	AAAI
Elastic Feature Consolidation for Cold Start Exemplar-free Incremental Learning	2024	ICLR
Fine-Grained Knowledge Selection and Restoration for Non-Exemplar Class Incremental Learning	2024	AAAI
Prototype-Sample Relation Distillation: Towards Replay-Free Continual Learning	2023	ICML
Continual Learning via Sequential Function-Space Variational Inference	2022	ICML
Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation	2022	CVPR
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation	2022	CVPR
Class-Incremental Learning via Knowledge Amalgamation	2022	PKDD
Natural continual learning: success is a journey, not (just) a destination	2021	NeurIPS
Distilling Causal Effect of Data in Class-Incremental Learning	2021	CVPR
On Learning the Geodesic Path for Incremental Learning	2021	CVPR
CPR: Classifier-Projection Regularization for Continual Learning	2021	ICLR
Few-Shot Class-Incremental Learning via Relation Knowledge Distillation	2021	AAAI
Continual Learning with Node-Importance based Adaptive Group Sparse Regularization	2020	NeurIPS
PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning	2020	ECCV
Topology-Preserving Class-Incremental Learning	2020	ECCV
Uncertainty-based Continual Learning with Adaptive Regularization	2019	NeurIPS
Learning a Unified Classifier Incrementally via Rebalancing	2019	CVPR
Learning Without Memorizing	2019	CVPR
Efficient Lifelong Learning with A-GEM	2019	ICLR
Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence	2018	ECCV
Lifelong Learning via Progressive Distillation and Retrospection	2018	ECCV
Memory Aware Synapses: Learning what (not) to forget	2018	ECCV
Overcoming catastrophic forgetting in neural networks	2017	Arxiv
Continual Learning Through Synaptic Intelligence	2017	ICML
Learning without Forgetting	2017	TPAMI

Subspace-based Methods

[Back to top]

Subspace-based methods perform CL in multiple disjoint subspaces to avoid interference between multiple tasks.
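
One way to picture the interference-avoidance idea is gradient projection: directions that matter for earlier tasks are stored, and each new gradient is projected onto their orthogonal complement before the update. The sketch below uses plain Gram-Schmidt on Python lists; the function names and the assumption that "important directions" are given as explicit vectors are illustrative choices, not the procedure of any specific paper listed below.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, tol=1e-10):
    """Gram-Schmidt: build an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis


def project_out(grad, basis):
    """Remove from `grad` every component lying in the span of `basis`."""
    g = list(grad)
    for b in basis:
        c = dot(g, b)
        g = [gi - c * bi for gi, bi in zip(g, b)]
    return g
```

For example, with stored directions `[1, 0, 0]` and `[1, 1, 0]`, a new gradient `[3, 4, 5]` is reduced to its component along the third axis, so the update leaves the protected two-dimensional subspace untouched.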

Paper Title	Year	Conference/Journal
Revisiting Flatness-aware Optimization in Continual Learning with Orthogonal Gradient Projection	2025	TPAMI
Introducing Common Null Space of Gradients for Gradient Projection Methods in Continual Learning	2024	ACM MM
Improving Data-aware and Parameter-aware Robustness for Continual Learning	2024	Arxiv
Prompt Gradient Projection for Continual Learning	2024	ICLR
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks	2024	ICLR
Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding	2024	AAAI
Orthogonal Subspace Learning for Language Model Continual Learning	2023	EMNLP
Data Augmented Flatness-aware Gradient Projection for Continual Learning	2023	ICCV
Rethinking Gradient Projection Continual Learning: Stability / Plasticity Feature Space Decoupling	2023	CVPR
Building a Subspace of Policies for Scalable Continual Learning	2023	ICLR
Continual Learning with Scaled Gradient Projection	2023	AAAI
SketchOGD: Memory-Efficient Continual Learning	2023	Arxiv
Continual Learning through Networks Splitting and Merging with Dreaming-Meta Weighted Model Fusion	2023	Arxiv
Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer	2022	NeurIPS
TRGP: Trust Region Gradient Projection for Continual Learning	2022	ICLR
Continual Learning with Recursive Gradient Optimization	2022	ICLR
Class Gradient Projection For Continual Learning	2022	MM
Balancing Stability and Plasticity through Advanced Null Space in Continual Learning	2022	ECCV
Adaptive Orthogonal Projection for Batch and Online Continual Learning	2022	AAAI
Natural continual learning: success is a journey, not (just) a destination	2021	NeurIPS
Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning	2021	NeurIPS
Gradient Projection Memory for Continual Learning	2021	ICLR
Training Networks in Null Space of Feature Covariance for Continual Learning	2021	CVPR
Defeating Catastrophic Forgetting via Enhanced Orthogonal Weights Modification	2021	Arxiv
Continual Learning in Low-rank Orthogonal Subspaces	2020	NeurIPS
Orthogonal Gradient Descent for Continual Learning	2020	AISTATS
Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent	2020	Arxiv
Generative Feature Replay with Orthogonal Weight Modification for Continual Learning	2020	Arxiv
Continual Learning of Context-dependent Processing in Neural Networks	2019	Nature Machine Intelligence

Bayesian Methods

[Back to top]

Bayesian methods provide a principled probabilistic framework for addressing forgetting.

Paper Title	Year	Conference/Journal
Learning to Continually Learn with the Bayesian Principle	2024	ICML
A Probabilistic Framework for Modular Continual Learning	2023	Arxiv
Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference	2022	ICLR
Continual Learning via Sequential Function-Space Variational Inference	2022	ICML
Generalized Variational Continual Learning	2021	ICLR
Variational Auto-Regressive Gaussian Processes for Continual Learning	2021	ICML
Bayesian Structural Adaptation for Continual Learning	2021	ICML
Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors	2021	AISTATS
Posterior Meta-Replay for Continual Learning	2021	NeurIPS
Natural continual learning: success is a journey, not (just) a destination	2021	NeurIPS
Continual Learning with Adaptive Weights (CLAW)	2020	ICLR
Uncertainty-guided Continual Learning with Bayesian Neural Networks	2020	ICLR
Functional Regularisation for Continual Learning with Gaussian Processes	2020	ICLR
Continual Deep Learning by Functional Regularisation of Memorable Past	2020	NeurIPS
Variational Continual Learning	2018	ICLR
Online Structured Laplace Approximations for Overcoming Catastrophic Forgetting	2018	NeurIPS
Overcoming Catastrophic Forgetting by Incremental Moment Matching	2017	NeurIPS

Task-free CL

[Back to top]

Task-free CL refers to the scenario in which the learning system does not have access to any explicit task information.

Paper Title	Year	Conference/Journal
Task-Free Continual Generation and Representation Learning via Dynamic Expansionable Memory Cluster	2024	AAAI
Task-Free Dynamic Sparse Vision Transformer for Continual Learning	2024	AAAI
Doubly Perturbed Task-Free Continual Learning	2024	AAAI
Loss Decoupling for Task-Agnostic Continual Learning	2023	NeurIPS
Online Bias Correction for Task-Free Continual Learning	2023	ICLR
Task-Free Continual Learning via Online Discrepancy Distance Learning	2022	NeurIPS
Improving Task-free Continual Learning by Distributionally Robust Memory Evolution	2022	ICML
VariGrow: Variational architecture growing for task-agnostic continual learning based on Bayesian novelty	2022	ICML
Gradient-based Editing of Memory Examples for Online Task-free Continual Learning	2021	NeurIPS
Continuous Meta-Learning without Tasks	2020	NeurIPS
A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning	2020	ICLR
Online Continual Learning with Maximally Interfered Retrieval	2019	NeurIPS
Gradient based sample selection for online continual learning	2019	NeurIPS
Efficient lifelong learning with A-GEM	2019	ICLR
Task-Free Continual Learning	2019	CVPR
Continual Learning with Tiny Episodic Memories	2019	Arxiv

Online CL

[Back to top]

In online CL, the learner is only allowed to process the data for each task once.

Paper Title	Year	Conference/Journal
Online Curvature-Aware Replay: Leveraging 2nd Order Information for Online Continual Learning	2025	Arxiv
Dealing with Synthetic Data Contamination in Online Continual Learning	2024	NeurIPS
Random Representations Outperform Online Continually Learned Representations	2024	NeurIPS
Forgetting, Ignorance or Myopia: Revisiting Key Challenges in Online Continual Learning	2024	NeurIPS
Mitigating Catastrophic Forgetting in Online Continual Learning by Modeling Previous Task Interrelations via Pareto Optimization	2024	ICML
ER-FSL: Experience Replay with Feature Subspace Learning for Online Continual Learning	2024	MM
Dual-Enhanced Coreset Selection with Class-wise Collaboration for Online Blurry Class Incremental Learning	2024	CVPR
Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation	2024	CVPR
Learning Equi-angular Representations for Online Continual Learning	2024	CVPR
Online Continual Learning for Interactive Instruction Following Agents	2024	ICLR
Summarizing Stream Data for Memory-Constrained Online Continual Learning	2024	AAAI
Online Class-Incremental Learning For Real-World Food Image Classification	2024	WACV
Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?	2023	ICCV
CBA: Improving Online Continual Learning via Continual Bias Adaptor	2023	ICCV
Online Continual Learning on Hierarchical Label Expansion	2023	ICCV
New Insights for the Stability-Plasticity Dilemma in Online Continual Learning	2023	ICLR
Real-Time Evaluation in Online Continual Learning: A New Hope	2023	CVPR
PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning	2023	CVPR
Dealing with Cross-Task Class Discrimination in Online Continual Learning	2023	CVPR
Online continual learning through mutual information maximization	2022	ICML
Online Coreset Selection for Rehearsal-based Continual Learning	2022	ICLR
New Insights on Reducing Abrupt Representation Change in Online Continual Learning	2022	ICLR
Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference	2022	ICLR
Information-theoretic Online Memory Selection for Continual Learning	2022	ICLR
Continual Normalization: Rethinking Batch Normalization for Online Continual Learning	2022	ICLR
Navigating Memory Construction by Global Pseudo-Task Simulation for Continual Learning	2022	NeurIPS
Not Just Selection, but Exploration: Online Class-Incremental Continual Learning via Dual View Consistency	2022	CVPR
Online Task-free Continual Learning with Dynamic Sparse Distributed Memory	2022	ECCV
Mitigating Forgetting in Online Continual Learning with Neuron Calibration	2021	NeurIPS
Online class-incremental continual learning with adversarial shapley value	2021	AAAI
Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data	2021	ICCV
Continual Prototype Evolution: Learning Online from Non-Stationary Data Streams	2021	ICCV
La-MAML: Look-ahead Meta Learning for Continual Learning	2020	NeurIPS
Online Learned Continual Compression with Adaptive Quantization Modules	2020	ICML
Online Continual Learning under Extreme Memory Constraints	2020	ECCV
Online Continual Learning with Maximally Interfered Retrieval	2019	NeurIPS
Gradient based sample selection for online continual learning	2019	NeurIPS
On Tiny Episodic Memories in Continual Learning	2019	Arxiv
Progress & Compress: A scalable framework for continual learning	2018	ICML

The presence of imbalanced data streams in CL (especially online CL) has drawn significant attention, primarily due to its prevalence in real-world application scenarios.

Paper Title	Year	Conference/Journal
Towards Macro-AUC oriented Imbalanced Multi-Label Continual Learning	2025	IJCAI
Joint Input and Output Coordination for Class-Incremental Learning	2024	IJCAI
Imbalance Mitigation for Continual Learning via Knowledge Decoupling and Dual Enhanced Contrastive Learning	2024	TNNLS
Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation	2023	NeurIPS
Online Bias Correction for Task-Free Continual Learning	2023	ICLR
Information-theoretic Online Memory Selection for Continual Learning	2022	ICLR
SS-IL: Separated Softmax for Incremental Learning	2021	ICCV
Online Continual Learning from Imbalanced Data	2020	ICML
Maintaining Discrimination and Fairness in Class Incremental Learning	2020	CVPR
Semantic Drift Compensation for Class-Incremental Learning	2020	CVPR
Imbalanced Continual Learning with Partitioning Reservoir Sampling	2020	ECCV
GDumb: A Simple Approach that Questions Our Progress in Continual Learning	2020	ECCV
Large scale incremental learning	2019	CVPR
IL2M: Class Incremental Learning With Dual Memory	2019	ICCV
End-to-end incremental learning	2018	ECCV

Semi-supervised CL

[Back to top]

Semi-supervised CL is an extension of traditional CL that allows each task to incorporate unlabeled data as well.

Paper Title	Year	Conference/Journal
Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation	2024	ICLR
Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning	2024	AAAI
Semi-supervised drifted stream learning with short lookback	2022	SIGKDD
Ordisco: Effective and efficient usage of incremental unlabeled data for semi-supervised continual learning	2021	CVPR
Memory-Efficient Semi-Supervised Continual Learning: The World is its Own Replay Buffer	2021	IJCNN
Overcoming Catastrophic Forgetting with Unlabeled Data in the Wild	2019	ICCV

Few-shot CL

[Back to top]

Few-shot CL refers to the scenario where a model needs to learn new tasks with only a limited number of labeled examples per task while retaining knowledge from previously encountered tasks.

Paper Title	Year	Conference/Journal
Wearable Sensor-Based Few-Shot Continual Learning on Hand Gestures for Motor-Impaired Individuals via Latent Embedding Exploitation	2024	IJCAI
A Bag of Tricks for Few-Shot Class-Incremental Learning	2024	Arxiv
Analogical Learning-Based Few-Shot Class-Incremental Learning	2024	IEEE TCSVT
Few-Shot Class-Incremental Learning via Training-Free Prototype Calibration	2023	NeurIPS
Few-shot Class-incremental Learning: A Survey	2023	Arxiv
Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning	2023	ICLR
Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning	2023	ICLR
Few-Shot Class-Incremental Learning by Sampling Multi-Phase Tasks	2022	TPAMI
Dynamic Support Network for Few-Shot Class Incremental Learning	2022	TPAMI
Subspace Regularizers for Few-Shot Class Incremental Learning	2022	ICLR
MetaFSCIL: A Meta-Learning Approach for Few-Shot Class Incremental Learning	2022	CVPR
Forward Compatible Few-Shot Class-Incremental Learning	2022	CVPR
Constrained Few-shot Class-incremental Learning	2022	CVPR
Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay	2022	ECCV
MgSvF: Multi-Grained Slow vs. Fast Framework for Few-Shot Class-Incremental Learning	2021	TPAMI
Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning	2021	CVPR
Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning	2021	CVPR
Few-Shot Incremental Learning with Continually Evolved Classifiers	2021	CVPR
Synthesized Feature based Few-Shot Class-Incremental Learning on a Mixture of Subspaces	2021	ICCV
Few-Shot Lifelong Learning	2021	AAAI
Few-Shot Class-Incremental Learning via Relation Knowledge Distillation	2021	AAAI
Few-shot Continual Learning: a Brain-inspired Approach	2021	Arxiv
Few-Shot Class-Incremental Learning	2020	CVPR

Unsupervised CL

[Back to top]

Unsupervised CL (UCL) assumes that only unlabeled data is provided to the CL learner.

Paper Title	Year	Conference/Journal
Class-Incremental Unsupervised Domain Adaptation via Pseudo-Label Distillation	2024	TIP
Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning	2024	WACV
Unsupervised Continual Learning in Streaming Environments	2023	TNNLS
Representational Continuity for Unsupervised Continual Learning	2022	ICLR
Probing Representation Forgetting in Supervised and Unsupervised Continual Learning	2022	CVPR
Unsupervised Continual Learning for Gradually Varying Domains	2022	CVPRW
Co2L: Contrastive Continual Learning	2021	ICCV
Unsupervised Progressive Learning and the STAM Architecture	2021	IJCAI
Continual Unsupervised Representation Learning	2019	NeurIPS

Theoretical Analysis

[Back to top]

Theoretical analyses of continual learning.

Paper Title	Year	Conference/Journal
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning	2024	Arxiv
An analysis of best-practice strategies for replay and rehearsal in continual learning	2024	CVPRW
Provable Contrastive Continual Learning	2024	ICML
A Statistical Theory of Regularization-Based Continual Learning	2024	ICML
Efficient Continual Finite-Sum Minimization	2024	ICLR
Understanding Forgetting in Continual Learning with Linear Regression: Overparameterized and Underparameterized Regimes	2024	ICLR
The Joint Effect of Task Similarity and Overparameterization on Catastrophic Forgetting -- An Analytical Model	2024	ICLR
A Unified and General Framework for Continual Learning	2024	ICLR
Continual Learning in the Presence of Spurious Correlations: Analyses and a Simple Baseline	2024	ICLR
On the Convergence of Continual Learning with Adaptive Methods	2023	UAI
Does Continual Learning Equally Forget All Parameters?	2023	ICML
The Ideal Continual Learner: An Agent That Never Forgets	2023	ICML
Continual Learning in Linear Classification on Separable Data	2023	ICML
Theory on Forgetting and Generalization of Continual Learning	2023	Arxiv
A Theoretical Study on Solving Continual Learning	2022	NeurIPS
Learning Curves for Continual Learning in Neural Networks: Self-Knowledge Transfer and Forgetting	2022	ICLR
Continual Learning in the Teacher-Student Setup: Impact of Task Similarity	2022	ICML
Formalizing the Generalization-Forgetting Trade-off in Continual Learning	2021	NeurIPS
A PAC-Bayesian Bound for Lifelong Learning	2014	ICML

Forgetting in Foundation Models

[Back to top]

Foundation models are large machine learning models trained on a vast quantity of data at scale, such that they can be adapted to a wide range of downstream tasks.

Links: Forgetting in Fine-Tuning Foundation Models | Forgetting in One-Epoch Pre-training | CL in Foundation Model

Forgetting in Fine-Tuning Foundation Models

[Back to top]

When fine-tuning a foundation model, there is a tendency to forget the pre-trained knowledge, resulting in sub-optimal performance on downstream tasks.

Paper Title	Year	Conference/Journal
Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting	2025	Arxiv
AURORA-M: Open Source Continual Pre-training for Multilingual Language and Code	2025	COLING
Continual Learning Using a Kernel-Based Method Over Foundation Models	2024	Arxiv
A Practitioner’s Guide to Continual Multimodal Pretraining	2024	Arxiv
SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training	2024	Arxiv
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning	2024	Arxiv
Towards Effective and Efficient Continual Pre-training of Large Language Models	2024	Arxiv
Revisiting Catastrophic Forgetting in Large Language Model Tuning	2024	Arxiv
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models	2024	Arxiv
Dissecting learning and forgetting in language model finetuning	2024	ICLR
Understanding Catastrophic Forgetting in Language Models via Implicit Inference	2024	ICLR
Two-stage LLM Fine-tuning with Less Specialization and More Generalization	2024	ICLR
What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement	2024	Arxiv
Scaling Laws for Forgetting When Fine-Tuning Large Language Models	2024	Arxiv
TOFU: A Task of Fictitious Unlearning for LLMs	2024	Arxiv
Self-regulating Prompts: Foundational Model Adaptation without Forgetting	2023	ICCV
Speciality vs Generality: An Empirical Study on Catastrophic Forgetting in Fine-tuning Foundation Models	2023	Arxiv
Continual Pre-Training of Large Language Models: How to (re)warm your model?	2023	ICMLW
Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting	2023	ACL
On The Role of Forgetting in Fine-Tuning Reinforcement Learning Models	2023	ICLRW
Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models	2023	Arxiv
Reinforcement Learning with Action-Free Pre-Training from Videos	2022	ICML
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos	2022	NeurIPS
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting	2022	NeurIPS
How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?	2021	NeurIPS
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models	2020	ICLR
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting	2020	EMNLP
Universal Language Model Fine-tuning for Text Classification	2018	ACL

Forgetting in One-Epoch Pre-training

[Back to top]

Foundation models often undergo training on a dataset for a single pass. As a result, the earlier examples encountered during pre-training may be overwritten or forgotten by the model more quickly than the later examples.
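
The recency effect described above can be seen in a toy setting. The sketch below is an illustrative assumption, not taken from any paper in this list: it runs single-pass SGD on one scalar parameter with a squared loss and a constant learning rate, where the contribution of each example to the final parameter has a closed form that shrinks geometrically the earlier the example was seen. The names `one_pass_sgd` and `influence_weights` are hypothetical.

```python
# Toy sketch (assumption): single-pass SGD on a scalar w with loss 0.5 * (w - x)^2.
# Each update is w <- (1 - lr) * w + lr * x, so earlier examples are
# progressively overwritten by later ones.

def one_pass_sgd(stream, lr=0.1, w0=0.0):
    """Visit every example exactly once, in order, with a constant learning rate."""
    w = w0
    for x in stream:
        w -= lr * (w - x)  # gradient of 0.5 * (w - x)^2 with respect to w
    return w


def influence_weights(n, lr=0.1):
    """Weight of the t-th example (0-indexed) in the final w after n updates."""
    return [lr * (1 - lr) ** (n - 1 - t) for t in range(n)]


stream = [1.0, 2.0, 3.0, 4.0, 5.0]
lr = 0.1
w_final = one_pass_sgd(stream, lr=lr)
weights = influence_weights(len(stream), lr=lr)

# With w0 = 0, the final parameter equals the influence-weighted sum of the stream.
closed_form = sum(a * x for a, x in zip(weights, stream))
print(w_final, closed_form)
print(weights)  # increasing: the first example has the smallest influence
```

Large models are of course not scalar averages; the sketch only illustrates why, under one pass, the data order alone can make early examples less well retained.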

Paper Title	Year	Conference/Journal
Efficient Continual Pre-training of LLMs for Low-resource Languages	2024	Arxiv
Exploring Forgetting in Large Language Model Pre-Training	2024	Arxiv
Measuring Forgetting of Memorized Training Examples	2023	ICLR
Quantifying Memorization Across Neural Language Models	2023	ICLR
Analyzing leakage of personally identifiable information in language models	2023	S&P
How Well Does Self-Supervised Pre-Training Perform with Streaming Data?	2022	ICLR
The challenges of continuous self-supervised learning	2022	ECCV
Continual contrastive learning for image classification	2022	ICME

CL in Foundation Model or Pretrained Model

[Back to top]

By leveraging the powerful feature extraction capabilities of foundation models, researchers have been able to explore new avenues for advancing continual learning techniques.

Paper Title	Year	Conference/Journal
SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs	2025	Arxiv
S-LoRA: Scalable Low-Rank Adaptation for Class Incremental Learning	2025	ICLR
Spurious Forgetting in Continual Learning of Language Models	2025	ICLR
PEARL: Input-Agnostic Prompt Enhancement with Negative Feedback Regulation for Class-Incremental Learning	2025	AAAI
MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning	2025	AAAI
Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models	2024	ACL
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal	2024	ACL
Mixture of Experts Meets Prompt-Based Continual Learning	2024	NeurIPS
SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models	2024	NeurIPS
Incremental Learning of Retrievable Skills For Efficient Continual Task Adaptation	2024	NeurIPS
Vector Quantization Prompting for Continual Learning	2024	NeurIPS
Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models	2024	Arxiv
Is Parameter Collision Hindering Continual Learning in LLMs	2024	Arxiv
Does RoBERTa Perform Better than BERT in Continual Learning: An Attention Sink Perspective	2024	COLM
Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning	2024	Arxiv
ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models	2024	Arxiv
Adaptive Adapter Routing for Long-Tailed Class-Incremental Learning	2024	Machine Learning Journal
CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model	2024	Arxiv
Continual Instruction Tuning for Large Multimodal Models	2024	Arxiv
Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective	2024	Arxiv
Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion	2024	ECCV
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models	2024	ICML
One Size Fits All for Semantic Shifts: Adaptive Prompt Tuning for Continual Learning	2024	ICML
HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning	2024	Arxiv
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation	2024	Arxiv
Mitigate Negative Transfer with Similarity Heuristic Lifelong Prompt Tuning	2024	ACL
Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models	2024	CoLLAs
Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need	2024	Arxiv
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction	2024	ACL
Gradient Projection For Parameter-Efficient Continual Learning	2024	Arxiv
Continual Learning of Large Language Models: A Comprehensive Survey	2024	Arxiv
Prompt Customization for Continual Learning	2024	MM
Dynamically Anchored Prompting for Task-Imbalanced Continual Learning	2024	IJCAI
InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning	2024	CVPR
Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer	2024	CVPR
Evolving Parameterized Prompt Memory for Continual Learning	2024	AAAI
Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning	2024	CVPR
Consistent Prompting for Rehearsal-Free Continual Learning	2024	CVPR
Interactive Continual Learning: Fast and Slow Thinking	2024	CVPR
HOP to the Next Tasks and Domains for Continual Learning in NLP	2024	AAAI
OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning	2024	ICLR
Continual Learning for Large Language Models: A Survey	2024	Arxiv
Continual Learning with Pre-Trained Models: A Survey	2024	Arxiv
INCPrompt: Task-Aware incremental Prompting for Rehearsal-Free Class-incremental Learning	2024	ICASSP
P2DT: Mitigating Forgetting in task-incremental Learning with progressive prompt Decision Transformer	2024	ICASSP
Scalable Language Model with Generalized Continual Learning	2024	ICLR
Prompt Gradient Projection for Continual Learning	2024	ICLR
TiC-CLIP: Continual Training of CLIP Models	2024	ICLR
Hierarchical Prompts for Rehearsal-free Continual Learning	2024	Arxiv
KOPPA: Improving Prompt-based Continual Learning with Key-Query Orthogonal Projection and Prototype-based One-Versus-All	2023	Arxiv
RanPAC: Random Projections and Pre-trained Models for Continual Learning	2023	NeurIPS
Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality	2023	NeurIPS
A Unified Continual Learning Framework with General Parameter-Efficient Tuning	2023	ICCV
Generating Instance-level Prompts for Rehearsal-free Continual Learning	2023	ICCV
Introducing Language Guidance in Prompt-based Continual Learning	2023	ICCV
Space-time Prompting for Video Class-incremental Learning	2023	ICCV
When Prompt-based Incremental Learning Does Not Meet Strong Pretraining	2023	ICCV
Online Class Incremental Learning on Stochastic Blurry Task Boundary via Mask and Visual Prompt Tuning	2023	ICCV
SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model	2023	ICCV
Progressive Prompts: Continual Learning for Language Models	2023	ICLR
Continual Pre-training of Language Models	2023	ICLR
Continual Learning of Language Models	2023	ICLR
CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning	2023	CVPR
PIVOT: Prompting for Video Continual Learning	2023	CVPR
Do Pre-trained Models Benefit Equally in Continual Learning?	2023	WACV
Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need	2023	Arxiv
First Session Adaptation: A Strong Replay-Free Baseline for Class-Incremental Learning	2023	Arxiv
Memory Efficient Continual Learning with Transformers	2022	NeurIPS
S-Prompts Learning with Pre-trained Transformers: An Occam’s Razor for Domain Incremental Learning	2022	NeurIPS
Pretrained Language Model in Continual Learning: A Comparative Study	2022	ICLR
Effect of scale on catastrophic forgetting in neural networks	2022	ICLR
LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5	2022	ICLR
Learning to Prompt for Continual Learning	2022	CVPR
Class-Incremental Learning with Strong Pre-trained Models	2022	CVPR
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning	2022	ECCV
ELLE: Efficient Lifelong Pre-training for Emerging Data	2022	ACL
Fine-tuned Language Models are Continual Learners	2022	EMNLP
Continual Training of Language Models for Few-Shot Learning	2022	EMNLP
Continual Learning with Foundation Models: An Empirical Study of Latent Replay	2022	Conference on Lifelong Learning Agents
Rational LAMOL: A Rationale-Based Lifelong Learning Framework	2021	ACL
Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning	2021	NeurIPS
An Empirical Investigation of the Role of Pre-training in Lifelong Learning	2021	Arxiv
LAnguage MOdeling for Lifelong Language Learning	2020	ICLR

Forgetting in Domain Adaptation

[Back to top]

The goal of domain adaptation is to transfer the knowledge from a source domain to a target domain.

Paper Title	Year	Conference/Journal
Towards Cross-Domain Continual Learning	2024	ICDE
Continual Source-Free Unsupervised Domain Adaptation	2023	International Conference on Image Analysis and Processing
CoSDA: Continual Source-Free Domain Adaptation	2023	Arxiv
Lifelong Domain Adaptation via Consolidated Internal Distribution	2022	NeurIPS
Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions	2022	ECCV
FRIDA -- Generative Feature Replay for Incremental Domain Adaptation	2022	CVIU
Unsupervised Continual Learning for Gradually Varying Domains	2022	CVPRW
Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning	2021	CVPR
Gradient Regularized Contrastive Learning for Continual Domain Adaptation	2021	AAAI
Learning to Adapt to Evolving Domains	2020	NeurIPS
AdaGraph: Unifying Predictive and Continuous Domain Adaptation through Graphs	2019	CVPR
ACE: Adapting to Changing Environments for Semantic Segmentation	2019	ICCV
Adapting to Continuously Shifting Domains	2018	ICLRW

Forgetting in Test-Time Adaptation

[Back to top]

Test-time adaptation (TTA) refers to the process of adapting a pre-trained model on-the-fly to unlabeled test data during inference or testing.

Paper Title	Year	Conference/Journal
Conformal Uncertainty Indicator for Continual Test-Time Adaptation	2025	Arxiv
PCoTTA: Continual Test-Time Adaptation for Multi-Task Point Cloud Understanding	2024	NeurIPS
Adaptive Cascading Network for Continual Test-Time Adaptation	2024	CIKM
Controllable Continual Test-Time Adaptation	2024	Arxiv
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation	2024	ICLR
Continual Momentum Filtering on Parameter Space for Online Test-time Adaptation	2024	ICLR
A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts	2023	Arxiv
MECTA: Memory-Economic Continual Test-Time Model Adaptation	2023	ICLR
Decorate the Newcomers: Visual Domain Prompt for Continual Test Time Adaptation	2023	AAAI (Outstanding Student Paper Award)
Robust Mean Teacher for Continual and Gradual Test-Time Adaptation	2023	CVPR
A Probabilistic Framework for Lifelong Test-Time Adaptation	2023	CVPR
EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization	2023	CVPR
AUTO: Adaptive Outlier Optimization for Online Test-Time OOD Detection	2023	Arxiv
Efficient Test-Time Model Adaptation without Forgetting	2022	ICML
MEMO: Test time robustness via adaptation and augmentation	2022	NeurIPS
Continual Test-Time Domain Adaptation	2022	CVPR
Improving test-time adaptation via shift-agnostic weight regularization and nearest source prototypes	2022	ECCV
Tent: Fully Test-Time Adaptation by Entropy Minimization	2021	ICLR

Forgetting in Meta-Learning

[Back to top]

Meta-learning, also known as learning to learn, focuses on developing algorithms and models that can learn from previous learning experiences to improve their ability to learn new tasks or adapt to new domains more efficiently and effectively.

Links: Incremental Few-Shot Learning | Continual Meta-Learning

Incremental Few-Shot Learning

[Back to top]

Incremental few-shot learning (IFSL) focuses on the challenge of learning new categories with limited labeled data while retaining knowledge about previously learned categories.

Paper Title	Year	Conference/Journal
Controllable Forgetting Mechanism for Few-Shot Class-Incremental Learning	2025	ICASSP
AnchorInv: Few-Shot Class-Incremental Learning of Physiological Signals via Feature Space-Guided Inversion	2024	Arxiv
On Distilling the Displacement Knowledge for Few-Shot Class-Incremental Learning	2024	Arxiv
Few-Shot Class-Incremental Learning via Training-Free Prototype Calibration	2023	NeurIPS
Constrained Few-shot Class-incremental Learning	2022	CVPR
Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions	2022	ECCV
Overcoming Catastrophic Forgetting in Incremental Few-Shot Learning by Finding Flat Minima	2021	NeurIPS
Incremental Few-shot Learning via Vector Quantization in Deep Embedded Space	2021	ICLR
XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning	2020	ICML
Incremental Few-Shot Learning with Attention Attractor Networks	2019	NeurIPS
Dynamic Few-Shot Visual Learning without Forgetting	2018	CVPR

Continual Meta-Learning

[Back to top]

The goal of continual meta-learning (CML) is to address the challenge of forgetting in non-stationary task distributions.

Paper Title	Year	Conference/Journal
Meta Continual Learning Revisited: Implicitly Enhancing Online Hessian Approximation via Variance Reduction	2024	ICLR
Recasting Continual Learning as Sequence Modeling	2023	NeurIPS
Adaptive Compositional Continual Meta-Learning	2023	ICML
Learning to Learn and Remember Super Long Multi-Domain Task Sequence	2022	CVPR
Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions	2022	ECCV
Variational Continual Bayesian Meta-Learning	2021	NeurIPS
Meta Learning on a Sequence of Imbalanced Domains with Difficulty Awareness	2021	ICCV
Addressing Catastrophic Forgetting in Few-Shot Problems	2020	ICML
Continuous meta-learning without tasks	2020	NeurIPS
Reconciling meta-learning and continual learning with online mixtures of tasks	2019	NeurIPS
Fast Context Adaptation via Meta-Learning	2019	ICML
Online meta-learning	2019	ICML

Forgetting in Generative Models

[Back to top]

The goal of a generative model is to learn a generator that can generate samples from a target distribution.

Links: GAN Training is a Continual Learning Problem | Lifelong Learning of Generative Models

GAN Training is a Continual Learning Problem

[Back to top]

Treating GAN training as a continual learning problem.

Paper Title	Year	Conference/Journal
Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation	2023	CVPR
Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation	2022	NeurIPS
Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay	2022	AAAI
Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data	2022	WACV
On Catastrophic Forgetting and Mode Collapse in Generative Adversarial Networks	2020	IJCNN
Generative adversarial network training is a continual learning problem	2018	Arxiv

Lifelong Learning of Generative Models

[Back to top]

The goal is to develop generative models that can continually generate high-quality samples for both new and previously encountered tasks.

Paper Title	Year	Conference/Journal
KFC: Knowledge Reconstruction and Feedback Consolidation Enable Efficient and Effective Continual Generative Learning	2024	ICLR
The Curse of Recursion: Training on Generated Data Makes Models Forget	2023	Arxiv
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models	2023	Arxiv
Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models	2023	Arxiv
Lifelong Generative Modelling Using Dynamic Expansion Graph Model	2022	AAAI
Continual Variational Autoencoder Learning via Online Cooperative Memorization	2022	ECCV
Hyper-LifelongGAN: Scalable Lifelong Learning for Image Conditioned Generation	2021	CVPR
Lifelong Twin Generative Adversarial Networks	2021	ICIP
Lifelong Mixture of Variational Autoencoders	2021	TNNLS
Lifelong Generative Modeling	2020	Neurocomputing
GAN Memory with No Forgetting	2020	NeurIPS
Lifelong GAN: Continual Learning for Conditional Image Generation	2019	ICCV

Forgetting in Reinforcement Learning

[Back to top]

Reinforcement learning is a machine learning technique that allows an agent to learn how to behave in an environment by trial and error, through rewards and punishments.

Paper Title	Year	Conference/Journal
Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation	2024	Arxiv
Data Augmentation for Continual RL via Adversarial Gradient Episodic Memory	2024	Arxiv
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem	2024	Arxiv
Hierarchical Continual Reinforcement Learning via Large Language Model	2024	Arxiv
Augmenting Replay in World Models for Continual Reinforcement Learning	2024	Arxiv
CPPO: Continual Learning for Reinforcement Learning with Human Feedback	2024	ICLR
Prediction and Control in Continual Reinforcement Learning	2023	NeurIPS
Replay-enhanced Continual Reinforcement Learning	2023	TMLR
A Definition of Continual Reinforcement Learning	2023	Arxiv
Continual Task Allocation in Meta-Policy Network via Sparse Prompting	2023	ICML
Building a Subspace of Policies for Scalable Continual Learning	2023	ICLR
Continual Model-based Reinforcement Learning for Data Efficient Wireless Network Optimisation	2023	ECML
Modular Lifelong Reinforcement Learning via Neural Composition	2022	ICLR
Disentangling Transfer in Continual Reinforcement Learning	2022	NeurIPS
Towards continual reinforcement learning: A review and perspectives	2022	Journal of Artificial Intelligence Research
Reinforced continual learning for graphs	2022	CIKM
Model-Free Generative Replay for Lifelong Reinforcement Learning: Application to Starcraft-2	2022	Conference on Lifelong Learning Agents
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning	2021	ICLR
Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer	2021	ICML
Pseudo-rehearsal: Achieving deep reinforcement learning without catastrophic forgetting	2021	Neurocomputing
Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting	2020	NeurIPS
Policy Consolidation for Continual Reinforcement Learning	2019	ICML
Exploiting Hierarchy for Learning and Transfer in KL-regularized RL	2019	Arxiv
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks	2017	ICML
Progressive neural networks	2016	Arxiv
Learning a synaptic learning rule	1991	IJCNN

Forgetting in Federated Learning

[Back to top]

Federated learning (FL) is a decentralized machine learning approach where the training process takes place on local devices or edge servers instead of a centralized server.

Links: Forgetting Due to Non-IID Data in FL | Federated Continual Learning

Forgetting Due to Non-IID Data in FL

[Back to top]

This branch pertains to the forgetting problem caused by the inherent non-IID (not independent and identically distributed) data among different clients participating in FL.
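
To make the non-IID setting concrete, here is a minimal sketch of one common way to simulate label skew across clients: sort the samples by label, cut them into shards, and give each client a few shards. The function name `label_skew_partition` and its parameters are hypothetical, and real benchmarks may use other schemes (for example, Dirichlet-based splits).

```python
import random
from collections import Counter


def label_skew_partition(labels, num_clients, shards_per_client=2, seed=0):
    """Sort-and-shard split: each client receives a few label-sorted shards.

    Samples left over when len(labels) is not divisible by the number of
    shards are dropped in this sketch.
    """
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    num_shards = num_clients * shards_per_client
    shard_size = len(order) // num_shards
    shards = [order[s * shard_size:(s + 1) * shard_size] for s in range(num_shards)]
    rng.shuffle(shards)  # random assignment of shards to clients
    return [
        [i for shard in shards[c * shards_per_client:(c + 1) * shards_per_client]
         for i in shard]
        for c in range(num_clients)
    ]


# Synthetic labels (assumption): 10 balanced classes with 100 samples each.
labels = [i % 10 for i in range(1000)]
clients = label_skew_partition(labels, num_clients=5)

for c, idx in enumerate(clients):
    # Each client only sees a small subset of the classes.
    print(c, dict(Counter(labels[i] for i in idx)))
```

Because each client's local data covers only a few classes, local training drifts toward those classes, which is the source of the forgetting studied in the papers below.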

Paper Title	Year	Conference/Journal
Exemplar-condensed Federated Class-incremental Learning	2024	Arxiv
Flashback: Understanding and Mitigating Forgetting in Federated Learning	2024	Arxiv
How to Forget Clients in Federated Online Learning to Rank?	2024	ECIR
GradMA: A Gradient-Memory-based Accelerated Federated Learning with Alleviated Catastrophic Forgetting	2023	CVPR
Acceleration of Federated Learning with Alleviated Forgetting in Local Training	2022	ICLR
Preservation of the Global Knowledge by Not-True Distillation in Federated Learning	2022	NeurIPS
Learn from Others and Be Yourself in Heterogeneous Federated Learning	2022	CVPR
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning	2022	CVPR
Model-Contrastive Federated Learning	2021	CVPR
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning	2020	ICML
Overcoming Forgetting in Federated Learning on Non-IID Data	2019	NeurIPSW

Federated Continual Learning

[Back to top]

This branch addresses the issue of continual learning within each individual client in the federated learning process, which results in forgetting at the overall FL level.

Paper Title	Year	Conference/Journal
Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting	2025	ICLR
Resource-Constrained Federated Continual Learning: What Does Matter?	2025	Arxiv
Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning	2024	ECCV
PIP: Prototypes-Injected Prompt for Federated Class Incremental Learning	2024	CIKM
Personalized Federated Continual Learning via Multi-granularity Prompt	2024	KDD
Federated Continual Learning via Prompt-based Dual Knowledge Transfer	2024	ICML
Text-Enhanced Data-free Approach for Federated Class-Incremental Learning	2024	CVPR
Federated Continual Learning via Knowledge Fusion: A Survey	2024	TKDE
Accurate Forgetting for Heterogeneous Federated Continual Learning	2024	ICLR
Federated Orthogonal Training: Mitigating Global Catastrophic Forgetting in Continual Federated Learning	2024	ICLR
A Data-Free Approach to Mitigate Catastrophic Forgetting in Federated Class Incremental Learning for Vision Tasks	2023	NeurIPS
Federated Continual Learning via Knowledge Fusion: A Survey	2023	Arxiv
TARGET: Federated Class-Continual Learning via Exemplar-Free Distillation	2023	ICCV
FedET: A Communication-Efficient Federated Class-Incremental Learning Framework Based on Enhanced Transformer	2023	IJCAI
Better Generative Replay for Continual Federated Learning	2023	ICLR
Don’t Memorize; Mimic The Past: Federated Class Incremental Learning Without Episodic Memory	2023	ICMLW
Addressing Catastrophic Forgetting in Federated Class-Continual Learning	2023	Arxiv
Federated Class-Incremental Learning	2022	CVPR
Continual Federated Learning Based on Knowledge Distillation	2022	IJCAI
Federated Continual Learning with Weighted Inter-client Transfer	2021	ICML
A distillation-based approach integrating continual learning and federated learning for pervasive services	2021	Arxiv

Beneficial Forgetting

[Back to top]

Beneficial forgetting arises when the model contains private information that could lead to privacy breaches or when irrelevant information hinders the learning of new tasks. In these situations, forgetting becomes desirable as it helps protect privacy and facilitate efficient learning by discarding unnecessary information.

Problem Setting	Goal
Mitigate Overfitting	mitigate memorization of training data through selective forgetting
Debias and Forget Irrelevant Information	forget biased information to achieve better performance or remove irrelevant information to learn new tasks
Machine Unlearning	forget some specified training data to protect user privacy

Links: Combat Overfitting Through Forgetting | Learning New Knowledge Through Forgetting Previous Knowledge | Machine Unlearning

Forgetting Irrelevant Information to Achieve Better Performance

[Back to top]

Combat Overfitting Through Forgetting

[Back to top]

Overfitting in neural networks occurs when the model excessively memorizes the training data, leading to poor generalization. To address overfitting, it is necessary to selectively forget irrelevant or noisy information.

Paper Title	Year	Conference/Journal
"Forgetting" in Machine Learning and Beyond: A Survey	2024	Arxiv
The Effectiveness of Random Forgetting for Robust Generalization	2024	ICLR
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier	2023	ICLR
The Primacy Bias in Deep Reinforcement Learning	2022	ICML
The Impact of Reinitialization on Generalization in Convolutional Neural Networks	2021	Arxiv
Learning with Selective Forgetting	2021	IJCAI
SIGUA: Forgetting May Make Learning with Noisy Labels More Robust	2020	ICML
Invariant Representations through Adversarial Forgetting	2020	AAAI
Forget a Bit to Learn Better: Soft Forgetting for CTC-based Automatic Speech Recognition	2019	Interspeech

Learning New Knowledge Through Forgetting Previous Knowledge

[Back to top]

"Learning to forget" suggests that not all previously acquired prior knowledge is helpful for learning new tasks.

Paper Title	Year	Conference/Journal
"Forgetting" in Machine Learning and Beyond: A Survey	2024	Arxiv
Improving Language Plasticity via Pretraining with Active Forgetting	2023	NeurIPS
ReFactor GNNs: Revisiting Factorisation-based Models from a Message-Passing Perspective	2022	NeurIPS
Fortuitous Forgetting in Connectionist Networks	2022	ICLR
Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification	2022	ICML
Near-Optimal Task Selection for Meta-Learning with Mutual Information and Online Variational Bayesian Unlearning	2022	AISTATS
AFEC: Active Forgetting of Negative Transfer in Continual Learning	2021	NeurIPS
Knowledge Evolution in Neural Networks	2021	CVPR
Active Forgetting: Adaptation of Memory by Prefrontal Control	2021	Annual Review of Psychology
Learning to Forget for Meta-Learning	2020	CVPR
The Forgotten Part of Memory	2019	Nature
Learning Not to Learn: Training Deep Neural Networks with Biased Data	2019	CVPR
Inhibiting your native language: the role of retrieval-induced forgetting during second-language acquisition	2007	Psychological Science

Machine Unlearning

[Back to top]

Machine unlearning, a recent area of research, addresses the need to forget previously learned training data in order to protect user data privacy.
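
As a reference point for what "forgetting a training sample" means, the sketch below shows exact unlearning in a deliberately simple model. It is an illustrative assumption, not a method from the papers listed here: a nearest-centroid classifier keeps only per-class sums and counts, so removing one sample's contribution gives the same model as retraining without it. The class name `CentroidClassifier` and its methods are hypothetical. Deep networks do not have this property, which is why approximate and certified unlearning methods are studied.

```python
class CentroidClassifier:
    """Nearest-centroid model backed by per-class feature sums and counts."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def learn(self, x, y):
        """Add one sample's contribution to its class statistics."""
        s = self.sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            s[j] += v
        self.counts[y] = self.counts.get(y, 0) + 1

    def unlearn(self, x, y):
        """Subtract a previously learned sample's contribution."""
        s = self.sums[y]
        for j, v in enumerate(x):
            s[j] -= v
        self.counts[y] -= 1
        if self.counts[y] == 0:
            del self.sums[y], self.counts[y]

    def centroids(self):
        return {y: [v / self.counts[y] for v in s] for y, s in self.sums.items()}

    def predict(self, x):
        cents = self.centroids()
        return min(cents, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, cents[y])))


data = [([0.0, 0.0], "a"), ([2.0, 0.0], "a"), ([10.0, 10.0], "b"), ([12.0, 10.0], "b")]
to_forget = data[1]

# Model trained on everything, then asked to forget one sample.
unlearned = CentroidClassifier()
for x, y in data:
    unlearned.learn(x, y)
unlearned.unlearn(*to_forget)

# Reference model retrained from scratch without that sample.
retrained = CentroidClassifier()
for x, y in data:
    if (x, y) != to_forget:
        retrained.learn(x, y)

print(unlearned.centroids())
print(retrained.centroids())
```

The two printed models can be compared directly; matching the retrained model is the usual target against which unlearning methods are judged.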

Paper Title	Year	Conference/Journal
Unlearning during Learning: An Efficient Federated Machine Unlearning Method	2024	IJCAI
Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models	2024	ICLR
Machine Unlearning: A Survey	2023	ACM Computing Surveys
Deep Unlearning via Randomized Conditionally Independent Hessians	2022	CVPR
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks	2022	CVPR
PUMA: Performance Unchanged Model Augmentation for Training Data Removal	2022	AAAI
ARCANE: An Efficient Architecture for Exact Machine Unlearning	2022	IJCAI
Learn to Forget: Machine Unlearning via Neuron Masking	2022	IEEE TDSC
Backdoor Defense with Machine Unlearning	2022	IEEE INFOCOM
Markov Chain Monte Carlo-Based Machine Unlearning: Unlearning What Needs to be Forgotten	2022	ASIA CCS
Machine Unlearning	2021	S&P
Remember What You Want to Forget: Algorithms for Machine Unlearning	2021	NeurIPS
Machine Unlearning via Algorithmic Stability	2021	COLT
Variational Bayesian Unlearning	2020	NeurIPS
Rapid retraining of machine learning models	2020	ICML
Certified Data Removal from Machine Learning Models	2020	ICML
Making AI Forget You: Data Deletion in Machine Learning	2019	NeurIPS
Lifelong Anomaly Detection Through Unlearning	2019	CCS
The EU Proposal for a General Data Protection Regulation and the Roots of the ‘Right to Be Forgotten’	2013	Computer Law & Security Review

Star History

Contact

We welcome all researchers to contribute to this repository 'forgetting in deep learning'.

Email: [email protected] | [email protected]
