You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
deffDiv (f : ℝ → ℝ) (μ ν : Measure α) : EReal :=
if ¬ Integrable (fun x ↦ f ((∂μ/∂ν) x).toReal) ν then ⊤
else ∫ x, f ((∂μ/∂ν) x).toReal ∂ν + derivAtTop f * μ.singularPart ν .univ
Issues
The integral uses the value of f at zero, f 0 : ℝ. The math definition requires that the value at 0 should be equal to the limit of f at 0 from the right. If the limit is finite that's fine, we can simply require that f should be continuous at 0. However, if the limit at 0 is infinite, our current definition cannot encode the math definition. That was not an issue until now but it prevents us from writing desirable statements like the invariance of fDiv by taking the "dual" of f (see the issue about generalizing skew symmetry).
fDiv takes values in EReal, which is a pain to work with. We don't need the negative infinity though.
The use of a Bochner integral forces us to deal with integrability conditions.
Constraints
f has to have domain ℝ, and not the more natural ℝ≥0∞, because otherwise we can't talk about its derivatives and other basic things break.
Refactor 1
defrightLimZero (f : ℝ → ℝ) : EReal := Function.rightLim (fun x ↦ (f x : EReal)) (0 : ℝ)
deffDiv' (f : ℝ → ℝ) (μ ν : Measure α) : EReal :=
if ¬ IntegrableOn (fun x ↦ f (μ.rnDeriv ν x).toReal) (ν.singularPartSet μ)ᶜ ν then ⊤
else ∫ x in (ν.singularPartSet μ)ᶜ, f (μ.rnDeriv ν x).toReal ∂ν
+ derivAtTop f * μ.singularPart ν .univ + rightLimZero f * ν.singularPart μ univ
This does not change fundamentally the definition. It simply splits a third term, corresponding to the set where the Radon-Nikodym derivative is zero, and uses the right limit of f at 0.
This is a rather light refactor from a conceptual point of view. It's more a fix than a refactor. However, it will make all computations about fDiv heavier.
(in which derivAtTop f would be changed to be ℝ≥0∞)
This much more radical refactor does the following:
only allow nonnegative functions f : ℝ → ℝ≥0∞
enforce continuity at 0 (in lemmas). Since f can take value ∞, we can deal with the case of infinite right limit.
use a Lebesgue integral, which avoids talking about integrability
change the type of fDiv to have values in ℝ≥0∞
That definition was not used from the start because it seemed to me that allowing f to take negative values was important. After all, every book writes that the KL divergence is an f-divergence for x * log x, which takes negative values. However on probability measures it is always possible to add an affine function without changing the divergence while making the function nonnegative: x * log x + 1 - x gives the same f-divergence on probability measures.
We would have to do one of those two options:
change the definition of KL to be that f-divergence. The KL definition would be something like
write that KL is equal to an f-divergence only on probability measures
We could change the type of all divergences to use ℝ≥0∞ if we go for option 1 for all of them.
An advantage of that second refactor is that it would be presumably better linked to integrals of statistical informations (which are necessarily nonnegative).
The text was updated successfully, but these errors were encountered:
There is an issue with f : ℝ → ℝ≥0∞, which is that we can't talk about its right derivative since ℝ≥0∞ is not a real normed space. We have to go to ℝ then back to ℝ≥0∞.
And everything related to slopes and convexity breaks.
Current definition
Issues
f
at zero,f 0 : ℝ
. The math definition requires that the value at 0 should be equal to the limit off
at 0 from the right. If the limit is finite that's fine, we can simply require thatf
should be continuous at 0. However, if the limit at 0 is infinite, our current definition cannot encode the math definition. That was not an issue until now but it prevents us from writing desirable statements like the invariance offDiv
by taking the "dual" off
(see the issue about generalizing skew symmetry).fDiv
takes values inEReal
, which is a pain to work with. We don't need the negative infinity though.Constraints
f
has to have domainℝ
, and not the more naturalℝ≥0∞
, because otherwise we can't talk about its derivatives and other basic things break.Refactor 1
This does not change fundamentally the definition. It simply splits a third term, corresponding to the set where the Radon-Nikodym derivative is zero, and uses the right limit of
f
at 0.This is a rather light refactor from a conceptual point of view. It's more a fix than a refactor. However, it will make all computations about
fDiv
heavier.Refactor 2
(in which
derivAtTop f
would be changed to beℝ≥0∞
)This much more radical refactor does the following:
f : ℝ → ℝ≥0∞
f
can take value∞
, we can deal with the case of infinite right limit.fDiv
to have values inℝ≥0∞
That definition was not used from the start because it seemed to me that allowing
f
to take negative values was important. After all, every book writes that the KL divergence is an f-divergence forx * log x
, which takes negative values. However on probability measures it is always possible to add an affine function without changing the divergence while making the function nonnegative:x * log x + 1 - x
gives the same f-divergence on probability measures.We would have to do one of those two options:
We could change the type of all divergences to use
ℝ≥0∞
if we go for option 1 for all of them.An advantage of that second refactor is that it would be presumably better linked to integrals of statistical informations (which are necessarily nonnegative).
The text was updated successfully, but these errors were encountered: