Commit

fix figures
LFhase committed Feb 17, 2023
1 parent d901448 commit 9f649e7
Showing 2 changed files with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -23,7 +23,7 @@ In fact, the optimization process of the OOD objectives turns out to be substant
When optimizing the ERM and OOD objectives,
$$\min_f (L_\text{ERM},L_\text{OOD})^T$$
there often exists an **<ins>optimization dilemma</ins>** in the training of the OOD objectives:
<!-- <p align="center">
<img alt="Light" src="figures/Fail_IRMS_Sqls.png" width="30%">
<img alt="Dark" src="figures/grad_conflicts.png" width="22.5%">
<img alt="Dark" src="figures/bad_scalar.png" width="24%">
@@ -34,15 +34,15 @@ there often exists an **<ins>optimization dilemma</ins>** in the training of the
<em>(b).</em> Gradient conflicts. &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
<em>(c).</em> Unreliable Opt. Scheme. &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
<em>(d).</em> Exhaustive tunning.

</p> -->
<p align="center"><img src="./figures/pair_motivation.png"></p>

1. The original OOD objectives are often hard to optimize directly (e.g., IRM), hence they are **<ins>relaxed into regularization terms</ins>** of ERM (e.g., IRMv1), i.e., $\min_f L_\text{ERM}+\lambda \widehat{L}\_\text{OOD}$, which can behave very differently and introduce large gaps from the original objective. As shown in figure *(a)*, the ellipsoids denote solutions that satisfy the invariance constraints of the practical IRM variant IRMv1. When optimized together with ERM, IRMv1 prefers $f_1$ instead of $f_\text{IRM}$ (the predictor produced by IRM).

2. The **<ins>intrinsic conflicts</ins>** between the ERM and OOD objectives bring conflicts in their gradients that further increase the optimization difficulty, as shown in figure *(b)*. Consequently, it often requires careful tuning of the penalty weight (the $\lambda$). Figure *(d)* shows an example where IRMv1 usually requires exhaustive tuning of hyperparameters ($y$-axis: penalty weights; $x$-axis: ERM pre-training epochs before applying the IRMv1 penalty).
In particular, Multi-Objective Optimization (MOO) theory shows that the typically used linear weighting scheme, i.e., $\min_f L_\text{ERM}+\lambda \widehat{L}\_\text{OOD}$, cannot reach any solutions in the non-convex part of the Pareto front, as shown in figure *(c)*, which leads to suboptimal OOD generalization performance.

3. Along with the optimization dilemma comes another challenge, i.e., **<ins>model selection</ins>** during training with the OOD objectives. As we lack access to a validation set that has a distribution similar to the test data, <a href="https://github.com/facebookresearch/DomainBed">DomainBed</a> provides 3 options for constructing a validation set: training domain data; leave-one-out validation data; test domain data. However, all three validation set construction approaches have their own limitations, as they essentially posit different **<ins>assumptions on the test distribution</ins>**.
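The relaxed objective and the gradient conflict discussed in points *1* and *2* can be sketched numerically. Below is a minimal, self-contained PyTorch sketch (synthetic data and a linear model, not this repository's code; the penalty follows the standard IRMv1 dummy-classifier formulation) that builds $L_\text{ERM}+\lambda \widehat{L}\_\text{OOD}$ and measures the conflict between the two gradient directions via cosine similarity:

```python
# A toy sketch of the linearly weighted objective L_ERM + lambda * L_OOD with
# an IRMv1-style penalty, plus a check of the gradient conflict between the
# two terms. Synthetic data and model; not the repo's implementation.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def make_env(n, flip_prob):
    # One feature causally tied to y, one spuriously correlated with y,
    # where the spurious correlation strength differs per environment.
    y = torch.randint(0, 2, (n, 1)).float()
    causal = y + 0.5 * torch.randn(n, 1)
    flip = (torch.rand(n, 1) < flip_prob).float()
    spurious = y * (1 - flip) + (1 - y) * flip + 0.5 * torch.randn(n, 1)
    return torch.cat([causal, spurious], dim=1), y

envs = [make_env(500, 0.1), make_env(500, 0.3)]
model = torch.nn.Linear(2, 1)

def irmv1_penalty(logits, y):
    # Squared gradient of the risk w.r.t. a dummy classifier scale (IRMv1).
    scale = torch.tensor(1.0, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, y)
    (grad,) = torch.autograd.grad(loss, [scale], create_graph=True)
    return grad.pow(2)

erm_terms, ood_terms = [], []
for x, y in envs:
    logits = model(x)
    erm_terms.append(F.binary_cross_entropy_with_logits(logits, y))
    ood_terms.append(irmv1_penalty(logits, y))
L_erm = torch.stack(erm_terms).mean()
L_ood = torch.stack(ood_terms).mean()

def flat_grad(loss):
    grads = torch.autograd.grad(loss, list(model.parameters()), retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

g_erm, g_ood = flat_grad(L_erm), flat_grad(L_ood)
cos = F.cosine_similarity(g_erm, g_ood, dim=0)
print(f"cos(grad L_ERM, grad L_OOD) = {cos.item():.3f}")  # negative => conflict

lam = 100.0  # the penalty weight that typically needs exhaustive tuning
total = L_erm + lam * L_ood
```

The cosine similarity is exactly the quantity behind figure *(b)*: a negative value means one step on the combined loss moves against one of the two objectives, which is why the choice of $\lambda$ matters so much.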
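The claim in point *2* that linear weighting cannot reach the non-convex part of the Pareto front can also be verified on a toy front. The quarter-circle front below is an illustrative assumption (not from the paper): for every weight $\lambda$, the scalarized minimum lands on an endpoint, so interior Pareto-optimal trade-offs are unreachable by tuning $\lambda$ alone.

```python
# Toy illustration: on a concave (non-convex) Pareto front, minimizing the
# linear scalarization L1 + lam * L2 only ever selects an endpoint, so the
# interior trade-off solutions can never be reached by weight tuning alone.
import numpy as np

t = np.linspace(0.0, 1.0, 201)
# Quarter circle bulging away from the origin: a concave Pareto front.
front = np.stack([t, np.sqrt(1.0 - t**2)], axis=1)  # columns: (L1, L2)

for lam in [0.1, 0.5, 2.0, 10.0]:
    i = int(np.argmin(front[:, 0] + lam * front[:, 1]))
    # The scalarized optimum is always one of the two endpoints (index 0 or 200).
    print(f"lambda={lam:>4}: picked point {front[i]} (index {i})")
```

A Pareto-aware scheme sidesteps this by steering along the front directly instead of relying on a single fixed weighting.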

This work provides understanding of and solutions to the aforementioned challenges from the MOO perspective, leading to a new optimization scheme for OOD generalization, called PAreto Invariant Risk Minimization (`PAIR`), which includes an optimizer `PAIR-o` and a new model selection criterion `PAIR-s`.

Binary file added figures/pair_motivation.png
