Eliminate less than and greater than comparisons for `UFloat` #278

jagerber48 · 2024-12-29T02:00:18Z

The user guide already calls out this strange behavior:

>>> a = ufloat(25, 10)
>>> b = ufloat(25, 8)
>>> a >= b
False
>>> a > b
False
>>> a == b
False
>>> a.nominal_value >= b.nominal_value
True

That is, the order established on UFloat objects does not obey the law of trichotomy that you would expect to be followed by a strict total order.

In the new, post-#262 framing, UFloats are considered to model random variables. For random variables we talk about equality in distribution. This means that comparing nominal values, or even nominal values and standard deviations, does not suffice to establish equality. Rather, two UFloats should only be equal if they give the same weights to the same atomic units of uncertainty (UAtoms).

This means that comparing nominal values shouldn't suffice to establish greater than or less than relations either. When working with random variables I don't think it is typical to establish an ordering on random variables. So I propose uncertainties doesn't try to do so. If users want to know if the mean of some random variable passes some fixed float threshold (including the mean of some other UFloat) then I suggest they explicitly indicate that by extracting the nominal_value from the UFloat they are working with.
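As a rough sketch of what that explicit style looks like in user code: `Measurement` below is a hypothetical stand-in for a UFloat (it is not the library class), and `THRESHOLD` is an illustrative value that does not come from this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical stand-in for a UFloat: just a mean and a standard deviation."""

    nominal_value: float
    std_dev: float


THRESHOLD = 20.0  # illustrative value, not from the issue

a = Measurement(25, 10)
b = Measurement(25, 8)

# Each comparison states explicitly that only the means are being compared.
a_passes = a.nominal_value > THRESHOLD
a_at_least_b = a.nominal_value >= b.nominal_value

# Sorting works the same way: order by an explicit key, not by the objects.
ordered = sorted(
    [Measurement(3, 1), a, Measurement(12, 4)],
    key=lambda m: m.nominal_value,
)
```

Nothing here needs an ordering on the objects themselves, which is the point of the proposal.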

Thoughts? Opinions?


newville · 2024-12-29T07:24:26Z

> That is, the order established on UFloat objects does not obey the law of trichotomy that you would expect to be followed by a strict total order.

That law applies to real numbers. UFloats are not real numbers. One should not expect that to apply to complex numbers, classes, or UFloats.

Nominal values (and std_dev) are real numbers. That law should apply. And it does.

I would assume that most people would expect ufloat1 == ufloat2 to mean
(ufloat1.n == ufloat2.n) and (ufloat1.s == ufloat2.s). "not equal" is similarly easy.

Less than and greater than are more troublesome. It seems okay for these to always return False. It would also be OK to raise a TypeError, as 8 + 2j > 8 + 1j does.
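For reference, the complex-number precedent can be checked with plain Python. This is a small sketch; `try_greater` is a hypothetical helper written only for this illustration.

```python
def try_greater(left, right):
    """Return left > right, or a string describing the TypeError if unordered."""
    try:
        return left > right
    except TypeError as exc:
        return f"TypeError: {exc}"


# Complex numbers define no ordering, so ">" raises TypeError.
complex_result = try_greater(8 + 2j, 8 + 1j)

# Floats are ordered, so the same comparison returns a bool.
float_result = try_greater(8.0, 7.0)

# Equality is still defined for complex numbers.
equal_result = (8 + 2j) == (8 + 2j)
```

So for complex numbers Python keeps `==` and `!=` while refusing `<`, `<=`, `>`, and `>=`.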

jagerber48 · 2024-12-29T13:10:10Z

> I would assume that most people would expect ufloat1 == ufloat2 to mean
> (ufloat1.n == ufloat2.n) and (ufloat1.s == ufloat2.s). "not equal" is similarly easy.

This expectation is incorrect (not sure if that is what you were trying to point out?). From the docs:

>>> x = ufloat(5, 0.5)
>>> y = ufloat(5, 0.5)
>>> x == x
True
>>> x == y
False

because $x$ and $y$ are uncorrelated, despite having the same standard deviation. This is what I meant above in reference to "equality in distribution". x and y, thought of as random variables, have different distributions because they are correlated differently, so they should not be equal. By contrast, x and 1*x have the same distributions, so they are equal. While I don't love comparing UFloat to float, if I think about this distribution idea I can see that a random variable with zero variance follows the same distribution as a float, so this helps justify the current behavior on comparison with float.
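To make the correlation argument concrete, here is a toy model, not the library's implementation. `ToyUFloat`, its `independent` constructor, and the atom-id scheme are all hypothetical; the only idea taken from the discussion is that a value is a nominal value plus weights on independent sources of uncertainty, and that equality means the difference has zero nominal value and zero standard deviation.

```python
import math
from itertools import count

_atom_ids = count()  # each independent source of uncertainty gets a fresh id


class ToyUFloat:
    """Hypothetical, simplified model: nominal value plus weights on atoms."""

    def __init__(self, nominal, weights):
        self.nominal = nominal
        self.weights = dict(weights)  # atom id -> weight on that atom

    @classmethod
    def independent(cls, nominal, std_dev):
        # A brand-new variable depends on exactly one new atom.
        return cls(nominal, {next(_atom_ids): std_dev})

    @property
    def std_dev(self):
        return math.sqrt(sum(w * w for w in self.weights.values()))

    def __sub__(self, other):
        weights = dict(self.weights)
        for atom, w in other.weights.items():
            weights[atom] = weights.get(atom, 0.0) - w
        return ToyUFloat(self.nominal - other.nominal, weights)

    def __rmul__(self, factor):
        scaled = {atom: factor * w for atom, w in self.weights.items()}
        return ToyUFloat(factor * self.nominal, scaled)

    def __eq__(self, other):
        if not isinstance(other, ToyUFloat):
            return NotImplemented
        diff = self - other
        return diff.nominal == 0 and diff.std_dev == 0


x = ToyUFloat.independent(5, 0.5)
y = ToyUFloat.independent(5, 0.5)

same_object = x == x        # weights cancel exactly in x - x
scaled_copy = x == 1 * x    # 1*x depends on the same atom with the same weight
uncorrelated = x == y       # x - y keeps both atoms, so its std_dev is nonzero
```

In this sketch `x` and `y` have identical nominal values and standard deviations, yet only the first two comparisons are true.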

> Less than and greater than are more troublesome. It seems okay for these to always return False. It would also be OK to raise a TypeError, as 8 + 2j > 8 + 1j does.

Yeah so right now these return True if the nominal_values follow the requested order. I think this behavior is weird. I would prefer TypeError (like your complex example) to always returning False. Again, guided by the idea that UFloat should model random variables, and we don't associate any ordering with random variables (just like there isn't an ordering on the complex numbers). But I'm open to being convinced on always returning False if there are arguments/people in favor.

One big question is how painful would this change be for users? It's hard for me to know... Recovery would probably be pretty simple if what the user is really wanting to do is a comparison on the nominal value. We could deprecate the `__lt__` function on some schedule.
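One possible shape for such a deprecation period, sketched on a hypothetical `ToyUFloat` class rather than the real one. The warning text and the choice of `DeprecationWarning` are assumptions; no schedule or message has been decided in this thread.

```python
import warnings


class ToyUFloat:
    """Hypothetical stand-in; only the nominal value matters for this sketch."""

    def __init__(self, nominal_value, std_dev):
        self.nominal_value = nominal_value
        self.std_dev = std_dev

    def __lt__(self, other):
        # Keep the existing nominal-value behavior during the deprecation
        # period, but tell the caller how to migrate.
        warnings.warn(
            "Ordering comparisons on ToyUFloat are deprecated; "
            "compare nominal_value explicitly instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        other_nominal = getattr(other, "nominal_value", other)
        return self.nominal_value < other_nominal


# Record warnings so the effect of the comparison can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = ToyUFloat(4.9, 3.3) < ToyUFloat(5.2, 1.5)
```

After the deprecation period the method would simply be removed, at which point `<` raises a TypeError.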

newville · 2024-12-29T16:21:04Z

> > I would assume that most people would expect ufloat1 == ufloat2 to mean
> > (ufloat1.n == ufloat2.n) and (ufloat1.s == ufloat2.s). "not equal" is similarly easy.
>
> This expectation is incorrect (not sure if that is what you were trying to point out?). From the docs:

Yes, yes, I do understand that. I think that most people will find this current behavior of "==" to be confusing. The hidden correlation with self is not obvious. Any comparison of UFloats is going to struggle with "obvious".

Worse, any comparison with ">" or "<" is basically "impossible to decide". It could be "always False", or it could raise a TypeError (citing the precedent of complex numbers).

With

>>> ufloat(5.2, 1.5) > ufloat(4.9, 3.3)
False

the ambiguity would be pretty clear, but

>>> ufloat(8000, 4) > ufloat(-2500, 200)
False

seems weird.

That probably argues for preferring "raise TypeError". If that were the case, then "raise TypeError" for "==" and "!=" would then also be defensible....and probably no more confusing than the current situation ;).

But to come back to the main topic: yeah, the law of trichotomy does not apply. It does not apply to complex numbers, sets, vectors, etc. There is no reason to expect it to apply to UFloats.

newville · 2024-12-29T19:10:17Z

... and also ... or maybe to summarize: raise TypeError for all comparisons, including equality, seems reasonable.
I think that might be what @jagerber48 is saying too.

jagerber48 · 2024-12-30T00:11:16Z

@newville ok, I can't tell if I'm fully following your points or not but I think I am. You are pointing out

- Trichotomy applies to the real numbers but not other classes, so we shouldn't necessarily expect it to apply to e.g. UFloat. Yes, I agree with this.
- In this case, we shouldn't really be guided by what is surprising or not since the appropriate behavior for equality is already surprising. This is a good point.

So I agree with everything you are saying. Here is what I'm proposing for UFloat:

- `__eq__` works as is. Two UFloat objects are equal if they have the same nominal value and their uncertainties are equal (as linear combinations of uncertain elements). In the current code this is implemented as a check that the difference between self and other has zero nominal value and zero standard deviation, which is a necessary and sufficient condition for the same.
- `__lt__`, `__gt__`, `__le__`, `__ge__` are all not defined on UFloat. This will have the effect that comparison using <, <=, >, >= involving UFloat objects gives a TypeError.

This proposal is fully guided by the principle that UFloat should model a random variable. On random variables it is very common to consider equality of two random variables. For mathematical random variables there are a few ways to do it. See Equivalence of random variables. For UFloat it is clear that any of these tests for equality corresponds to equality of UFloat.nominal_value and UFloat.error_components.

It is NOT standard to define any partial or total order < on random variables. Also, I don't see any programming case for wanting an ordering on UFloat. For the cases I can think of you would really want an ordering of the nominal values of UFloat, which are of course float and which of course already have an ordering. So for this reason I'm proposing eliminating `__lt__`, `__gt__`, `__le__`, and `__ge__` (making attempts to use them raise a TypeError).

Technical note: If `__eq__` is defined then in our case `__ne__` will, by Python default, be defined as the logical negation of `__eq__`. In many cases, including ours, this is the behavior we want. So no need to define or discuss `__ne__`.
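A quick sketch of what a class with only `__eq__` behaves like under `!=` and `<`. `ToyUFloat` is a hypothetical stand-in, and its equality is deliberately simplified to comparing the two numbers (it ignores correlations, unlike the check described above), since only the method-resolution behavior matters here.

```python
class ToyUFloat:
    """Hypothetical stand-in that defines __eq__ only, with no ordering methods."""

    def __init__(self, nominal_value, std_dev):
        self.nominal_value = nominal_value
        self.std_dev = std_dev

    def __eq__(self, other):
        # Simplified equality for illustration only; it ignores correlations.
        if not isinstance(other, ToyUFloat):
            return NotImplemented
        return (self.nominal_value, self.std_dev) == (
            other.nominal_value,
            other.std_dev,
        )


a = ToyUFloat(25, 10)
b = ToyUFloat(25, 8)

# No __ne__ is defined, so Python negates the result of __eq__.
not_equal = a != b

# No __lt__ or reflected __gt__ is defined, so "<" raises TypeError.
try:
    a < b
    ordering_error = None
except TypeError as exc:
    ordering_error = exc

# The explicit alternative still works because floats are ordered.
nominal_ge = a.nominal_value >= b.nominal_value
```

The TypeError comes from Python itself once both operands decline the comparison, so no explicit `raise` is needed in the class.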

newville · 2024-12-31T02:29:28Z

@jagerber48 I am OK with `__eq__` working as is. But, I also think that the link you point to illustrates the inherent confusion - there is not a single obvious answer. The view that "a == b" is equivalent to "a - b == 0" is defensible. It is the current behavior, so any surprise can at least be explained, even if it is not "obvious" to lots of people.

wshanks · 2024-12-31T21:34:56Z

It seems like the consensus is to keep the current `__eq__` behavior. I don't think it is common for users to rely on doing different calculations and generating different UFloat instances with the same nominal value and same error coefficients and variables (outside of the 0 error float-like case), but when considering the `__eq__` case we should keep in mind that `__eq__` is also used, together with `__hash__`, for dict keys and set membership. I think there could be a use for that, so I wouldn't make `__eq__` raise an exception.
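For context on the dict/set point, this is how plain Python ties `__eq__` and `__hash__` together. Both classes below are hypothetical and unrelated to the library's actual implementation.

```python
class EqOnly:
    """Hypothetical class that overrides __eq__ but not __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.value == other.value


class EqAndHash:
    """Hypothetical class whose __hash__ agrees with its __eq__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, EqAndHash) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


# Overriding __eq__ alone sets __hash__ to None, so instances are unhashable.
try:
    {EqOnly(1)}
    unhashable = False
except TypeError:
    unhashable = True

# With a consistent __hash__, equal objects find each other as dict keys.
lookup = {EqAndHash(1): "found"}
hit = lookup[EqAndHash(1)]
```

Dict and set lookups use `__hash__` to pick a bucket and `__eq__` to confirm the match, so an `__eq__` that raised would break lookups even for hashable objects.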

I mentioned in #283 that there is the question of how much a UFloat should act like a float. We could consider letting `__gt__`/etc. work when the UFloat has 0 standard deviation and is being compared to a float or other UFloat with 0 standard deviation. Maybe that is also a pedantic case not likely to come up anyway though. In general, I think the users will be happiest if they just apply math transformations to UFloats and then use the nominal value and standard deviation to do comparisons.

wshanks mentioned this issue Dec 31, 2024

Problem with special casing std_dev = 0 #283

Open
