
Hypergrad_for_constrained_bilevel

This repository computes hypergradients for constrained bilevel optimization using the Hypergrad package. Please first see how to use Hypergrad; a brief introduction to Hypergrad is attached at the end.

Hypergrad provides an approach to compute $\nabla f(\lambda)$ for the following problem:

$$\min_{\lambda}\; f(\lambda) = E(w(\lambda), \lambda) \quad \text{such that} \quad w(\lambda) = \Phi(w(\lambda), \lambda)$$

Thanks to Hypergrad, we can cast a constrained bilevel optimization problem into this form. Consider the following constrained bilevel problem:

$$\min_{x}\; f(x, y^{\star}(x)) \quad \text{such that} \quad y^{\star}(x) = \mathop{\arg\min}_{y} \{\, g(x,y) : p(x,y) \leq 0,\; q(x,y) = 0 \,\}$$

  • First, solve the inner-level problem $y^{\star}(x)={\arg\min}_y \{\, g(x,y): p(x,y)\leq 0,\; q(x,y)=0 \,\}$ to obtain the solution $y^{\star}(x)$ together with $\mu(x)$ and $\nu(x)$. Here, $(\mu(x), \nu(x))$ are the Lagrange multipliers for the given $x$, with $\mu(x) \geq 0$.

  • The Lagrangian is $L(y^{\star}(x), \mu(x), \nu(x)) = g(x, y^{\star}(x))+\mu(x)\,p(x,y^{\star}(x))+\nu(x)\,q(x,y^{\star}(x))$. From the KKT conditions, which hold at the point $(y^{\star}(x), \mu(x), \nu(x))$, we have

$\nabla_1 L(y^{\star}(x), \mu(x), \nu(x))=0$,

$\nabla_3 L(y^{\star}(x), \mu(x), \nu(x))=0$ (usually written as $q(x,y^{\star}(x))=0$ in the KKT conditions).

Moreover, if $\mu(x)>0$, we have $\nabla_2 L(y^{\star}(x), \mu(x), \nu(x))=0$ (usually written as the complementary-slackness condition $\mu(x)\, p(x,y^{\star}(x))=0$ in the KKT conditions).

Combining these, if $\mu(x)>0$ we have $\nabla_{(y,\mu,\nu)} L(y^{\star}(x), \mu(x), \nu(x))=0$, i.e., the gradient of $L$ with respect to the stacked variable $[y, \mu, \nu]^{\top}$ vanishes.

  • To fit the form used in Hypergrad, we rewrite this stationarity condition as the fixed-point equation $[y^{\star}(x), \mu(x), \nu(x)]^{\top}=[y^{\star}(x), \mu(x), \nu(x)]^{\top}-\nabla_{(y,\mu,\nu)} L(y^{\star}(x), \mu(x), \nu(x))$; this map is spelled out componentwise after this list.

  • Fit this form to Hypergrad. Here,

$x$ corresponds to the notation $\lambda$;

$[y^{\star}(x), \mu(x), \nu(x)]^{\top}$ corresponds to the notation $w(\lambda)$;

$[y^{\star}(x), \mu(x), \nu(x)]^{\top}-\nabla_{(y,\mu,\nu)} L(y^{\star}(x), \mu(x), \nu(x))$ corresponds to the map $\Phi(w(\lambda),\lambda)$.

  • If $\mu_i(x)=0$ (the $i$-th inequality constraint is inactive), we can remove $\mu_i(x)$ and the corresponding constraint $p_i(x,y^{\star}(x))$ from the computation.
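
Spelled out componentwise (this is just the map from the rewriting step above, with $\nabla_1 L$, $\nabla_2 L$, $\nabla_3 L$ written out), the fixed point map is

$$\Phi\!\left(\begin{bmatrix} y \\ \mu \\ \nu \end{bmatrix},\, x\right) = \begin{bmatrix} y \\ \mu \\ \nu \end{bmatrix} - \begin{bmatrix} \nabla_y g(x,y) + \mu\,\nabla_y p(x,y) + \nu\,\nabla_y q(x,y) \\ p(x,y) \\ q(x,y) \end{bmatrix},$$

so that the KKT point $(y^{\star}(x), \mu(x), \nu(x))$ is a fixed point of $\Phi(\cdot, x)$.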

The details of the computation and the theorems are given in our paper on constrained bilevel optimization.

An example of how to use the code is given in example.py.
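
For orientation, the sketch below shows one way the KKT-based fixed-point map above can be wired into Hypergrad. It is a minimal illustration, not the repository's example.py: the functions g, p, q, f, all numerical values, and the choice of hg.CG are assumptions made for this sketch; the call follows the hypergrad package's (params, hparams, K, fp_map, outer_loss) interface, and which solver in the package is appropriate depends on the problem.

```python
import torch
import hypergrad as hg  # the Hypergrad package from https://github.com/prolearner/hypertorch

# --- illustrative problem data (placeholders only) ---
def g(x, y):   # inner objective
    return 0.5 * ((y - x) ** 2).sum()

def p(x, y):   # inequality constraint p(x, y) <= 0 (scalar here)
    return y.sum() - 1.0

def q(x, y):   # equality constraint q(x, y) = 0 (scalar here)
    return y[0] - 0.5 * x[0]

def f(x, y):   # outer objective
    return ((y - 1.0) ** 2).sum() + 0.01 * (x ** 2).sum()

def lagrangian(y, mu, nu, x):
    # L = g(x, y) + mu * p(x, y) + nu * q(x, y)
    return g(x, y) + mu * p(x, y) + nu * q(x, y)

# w = [y, mu, nu], lambda = x, and Phi(w, x) = w - grad_w L(w, x)
def fp_map(params, hparams):
    y, mu, nu = params
    x = hparams[0]
    L = lagrangian(y, mu, nu, x)
    grads = torch.autograd.grad(L, params, create_graph=True)
    return [w - gw for w, gw in zip(params, grads)]

def outer_loss(params, hparams):
    return f(hparams[0], params[0])

# The inner solution and multipliers should come from a constrained solver run
# beforehand at the current x; the numbers below are placeholders, with mu > 0.
x = torch.tensor([0.3, 0.7], requires_grad=True)
y_star = torch.tensor([0.15, 0.55], requires_grad=True)
mu_star = torch.tensor(0.2, requires_grad=True)
nu_star = torch.tensor(0.1, requires_grad=True)

# Approximate the hypergradient; the result is accumulated into x.grad.
hg.CG([y_star, mu_star, nu_star], [x], K=20, fp_map=fp_map, outer_loss=outer_loss)
print(x.grad)
```

Here hg.CG approximates the hypergradient by solving the implicit linear system with conjugate gradient and accumulating the result into x.grad; hg.fixed_point is an alternative solver in the package. Please refer to example.py for the setup actually used in this repository.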

If you have more questions, feel free to contact me at [email protected].

If you find our code useful, please consider citing our work. Here is the citation information for our paper on constrained bilevel optimization.

@article{Xu_Zhu_2023,
  title={Efficient Gradient Approximation Method for Constrained Bilevel Optimization},
  author={Xu, Siyuan and Zhu, Minghui},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={37},
  number={10},
  pages={12509--12517},
  month={Jun.},
  year={2023}
}

Hypergrad

https://github.com/prolearner/hypertorch

A lightweight, flexible, research-oriented package to compute hypergradients in PyTorch.

What is a hypergradient?

Given the following bilevel problem:

$$\min_{\lambda}\; f(\lambda) = E(w(\lambda), \lambda) \quad \text{such that} \quad w(\lambda) = \Phi(w(\lambda), \lambda)$$

We call the following quantity the hypergradient:

$$\nabla f(\lambda) = \nabla_{\lambda} E(w(\lambda), \lambda) + \partial_{\lambda} \Phi(w(\lambda), \lambda)^{\top} \bigl(I - \partial_{w} \Phi(w(\lambda), \lambda)^{\top}\bigr)^{-1} \nabla_{w} E(w(\lambda), \lambda)$$

Where:

  • $E$ is called the outer objective (e.g., the validation loss).
  • $\Phi$ is called the fixed point map (e.g., a gradient descent step or the state update function in a recurrent model).
  • Finding the solution of the fixed point equation $w(\lambda) = \Phi(w(\lambda), \lambda)$ is referred to as the inner problem. This can be solved by repeatedly applying the fixed point map or by using a different inner algorithm; a minimal sketch of the former is shown below.
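
As a purely illustrative sketch of that last point (the toy loss, step size, and iteration count below are assumptions, not part of Hypergrad), repeatedly applying a gradient-descent fixed point map drives $w$ to the inner solution:

```python
import torch

lam = torch.tensor([1.0, -2.0])   # hyperparameter lambda

def fp_map(w, lam, lr=0.1):
    # One gradient-descent step on the toy inner loss 0.5 * ||w - lam||^2,
    # whose fixed point is w(lam) = lam.
    return w - lr * (w - lam)

w = torch.zeros(2)
for _ in range(200):   # iterate the map until w is (approximately) a fixed point
    w = fp_map(w, lam)
print(w)               # close to lam, i.e. the inner solution w(lam)
```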

Cite

Here is the citation information for the Hypergrad paper.

@inproceedings{grazzi2020iteration,
  title={On the Iteration Complexity of Hypergradient Computation},
  author={Grazzi, Riccardo and Franceschi, Luca and Pontil, Massimiliano and Salzo, Saverio},
  booktitle={Thirty-seventh International Conference on Machine Learning (ICML)},
  year={2020}
}
