Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactors of f-divergences #154

Open
RemyDegenne opened this issue Oct 1, 2024 · 1 comment · May be fixed by #174
Open

Refactors of f-divergences #154

RemyDegenne opened this issue Oct 1, 2024 · 1 comment · May be fixed by #174

Comments

@RemyDegenne
Copy link
Owner

RemyDegenne commented Oct 1, 2024

Current definition

def fDiv (f : ℝ → ℝ) (μ ν : Measure α) : EReal :=
  if ¬ Integrable (fun x ↦ f ((∂μ/∂ν) x).toReal) ν thenelse ∫ x, f ((∂μ/∂ν) x).toReal ∂ν + derivAtTop f * μ.singularPart ν .univ

Issues

  • The integral uses the value of f at zero, f 0 : ℝ. The math definition requires that the value at 0 should be equal to the limit of f at 0 from the right. If the limit is finite that's fine, we can simply require that f should be continuous at 0. However, if the limit at 0 is infinite, our current definition cannot encode the math definition. That was not an issue until now but it prevents us from writing desirable statements like the invariance of fDiv by taking the "dual" of f (see the issue about generalizing skew symmetry).
  • fDiv takes values in EReal, which is a pain to work with. We don't need the negative infinity though.
  • The use of a Bochner integral forces us to deal with integrability conditions.

Constraints

f has to have domain , and not the more natural ℝ≥0∞, because otherwise we can't talk about its derivatives and other basic things break.

Refactor 1

def rightLimZero (f : ℝ → ℝ) : EReal := Function.rightLim (fun x ↦ (f x : EReal)) (0 : ℝ)

def fDiv' (f : ℝ → ℝ) (μ ν : Measure α) : EReal :=
  if ¬ IntegrableOn (fun x ↦ f (μ.rnDeriv ν x).toReal) (ν.singularPartSet μ)ᶜ ν thenelse ∫ x in (ν.singularPartSet μ)ᶜ, f (μ.rnDeriv ν x).toReal ∂ν
    + derivAtTop f * μ.singularPart ν .univ + rightLimZero f * ν.singularPart μ univ

This does not change fundamentally the definition. It simply splits a third term, corresponding to the set where the Radon-Nikodym derivative is zero, and uses the right limit of f at 0.

This is a rather light refactor from a conceptual point of view. It's more a fix than a refactor. However, it will make all computations about fDiv heavier.

Refactor 2

def fDiv (f : ℝ → ℝ≥0∞) (μ ν : Measure α) : ℝ≥0∞ :=
    ∫⁻ x, f ((∂μ/∂ν) x).toReal ∂ν + derivAtTop f * μ.singularPart ν .univ

(in which derivAtTop f would be changed to be ℝ≥0∞)

This much more radical refactor does the following:

  • only allow nonnegative functions f : ℝ → ℝ≥0∞
  • enforce continuity at 0 (in lemmas). Since f can take value , we can deal with the case of infinite right limit.
  • use a Lebesgue integral, which avoids talking about integrability
  • change the type of fDiv to have values in ℝ≥0∞

That definition was not used from the start because it seemed to me that allowing f to take negative values was important. After all, every book writes that the KL divergence is an f-divergence for x * log x, which takes negative values. However on probability measures it is always possible to add an affine function without changing the divergence while making the function nonnegative: x * log x + 1 - x gives the same f-divergence on probability measures.

We would have to do one of those two options:

  1. change the definition of KL to be that f-divergence. The KL definition would be something like
def kl (μ ν : Measure α) : ℝ≥0∞ :=
  if μ ≪ ν ∧ Integrable (llr μ ν) μ then ENNReal.ofReal (∫ x, llr μ ν x ∂μ) + ν .univ - μ .univ  else
  1. write that KL is equal to an f-divergence only on probability measures

We could change the type of all divergences to use ℝ≥0∞ if we go for option 1 for all of them.

An advantage of that second refactor is that it would be presumably better linked to integrals of statistical informations (which are necessarily nonnegative).

@RemyDegenne
Copy link
Owner Author

RemyDegenne commented Oct 1, 2024

There is an issue with f : ℝ → ℝ≥0∞, which is that we can't talk about its right derivative since ℝ≥0∞ is not a real normed space. We have to go to then back to ℝ≥0∞.
And everything related to slopes and convexity breaks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant