Merge pull request hgrecco#1615 from MichaelTiemannOSC/parse-uncertainties

This commit allows parsing of uncertain numbers, e.g. (1.0+/-0.2)e+03

Enable Pint to consume uncertain quantities.

Signed-off-by: [email protected]

* Fix problems identified by python -m pre_commit run --all-files

Signed-off-by: MichaelTiemann <[email protected]>

* Enhance support for `uncertainties`.  See hgrecco#1611, hgrecco#1614.

Signed-off-by: MichaelTiemann <[email protected]>

* Fix up failures and errors found by test suite.

Signed-off-by: MichaelTiemann <[email protected]>

* Copy in changes from PR1596

Signed-off-by: [email protected]

* Create modular uncertainty parser layer

Based on feedback, tokenize uncertainties on top of default tokenizer, not instead of default tokenizer.

Signed-off-by: MichaelTiemann <[email protected]>
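
To illustrate what "on top of the default tokenizer" can mean, here is a minimal sketch that runs the standard-library tokenizer first and then fuses the `+`, `/`, `-` operator run into a single `+/-` token in a second pass. The names `default_tokens` and `uncertainty_tokens` are hypothetical and this is not the code in pint_eval.py; it only shows the layering idea.

```python
import io
import tokenize
from typing import Iterator, List


def default_tokens(text: str) -> Iterator[tokenize.TokenInfo]:
    """Tokenize with the standard library, skipping the ENCODING token."""
    for tok in tokenize.tokenize(io.BytesIO(text.encode("utf-8")).readline):
        if tok.type != tokenize.ENCODING:
            yield tok


def uncertainty_tokens(text: str) -> List[str]:
    """Hypothetical second layer: fuse '+', '/', '-' into one '+/-' token."""
    # Keep only the token kinds an arithmetic expression needs.
    keep = (tokenize.NUMBER, tokenize.NAME, tokenize.OP)
    strings = [tok.string for tok in default_tokens(text) if tok.type in keep]
    merged: List[str] = []
    i = 0
    while i < len(strings):
        # Simplification: any consecutive '+', '/', '-' operators are fused,
        # even if the input had whitespace between them.
        if strings[i : i + 3] == ["+", "/", "-"]:
            merged.append("+/-")
            i += 3
        else:
            merged.append(strings[i])
            i += 1
    return merged
```

Because the second pass only consumes the output of the first, expressions without `+/-` come out exactly as the default tokenizer produced them.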

* Fix conflict merge error

Signed-off-by: Michael Tiemann <[email protected]>

* Update util.py

Fixes problems parsing currency symbols that also show up when dealing with uncertainties.

Signed-off-by: Michael Tiemann <[email protected]>

* Update pint_eval.py

Handle negative numbers using uncertainty parenthesis notation.

Signed-off-by: Michael Tiemann <[email protected]>

* Update pint_eval.py

Ahem...use walrus operator for side-effect, not truth value.

Signed-off-by: Michael Tiemann <[email protected]>

* Fixed to work with both + and - e notation in the actual processing of the exponent, not just in the parsing of the exponent.

i.e., (5.01+/-0.07)e+04

Signed-off-by: Michael Tiemann <[email protected]>
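
As a rough illustration of the parenthesis notation discussed above (signed nominal value, unsigned uncertainty, optional shared exponent with either sign), the following standalone sketch uses a regular expression. The function `parse_uncertain` and its pattern are assumptions made for this example; Pint itself handles the notation at the token level rather than with a regex.

```python
import re
from typing import Tuple

_UNCERTAIN = re.compile(
    r"""^\(\s*(?P<value>[+-]?\d+(?:\.\d*)?)   # nominal value, optionally signed
        \s*\+/-\s*
        (?P<error>\d+(?:\.\d*)?)\s*\)         # uncertainty, never signed
        (?:[eE](?P<exp>[+-]?\d+))?$           # optional exponent shared by both
    """,
    re.VERBOSE,
)


def parse_uncertain(text: str) -> Tuple[float, float]:
    """Return (nominal value, uncertainty) with the exponent applied to both."""
    match = _UNCERTAIN.match(text.strip())
    if match is None:
        raise ValueError(f"not in (value+/-error)eNN form: {text!r}")
    exp = match.group("exp") or "0"
    # Re-attach the exponent to each number and let float() do the scaling,
    # which avoids rounding noise from multiplying by 10**exp.
    value = float(f"{match.group('value')}e{exp}")
    error = float(f"{match.group('error')}e{exp}")
    return value, error
```

For example, `parse_uncertain("(5.01+/-0.07)e+04")` yields a nominal value of 50100.0 and an uncertainty of 700.0.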

* Fix test suite failures

Manually fix test_issue_1400.  Let other failures (which are not related to uncertainties) fail.

Signed-off-by: Michael Tiemann <[email protected]>

* Fix tokenizer merge error in pint/util.py

When using pint_eval.tokenizer don't try to import tokenizer from pint.compat.

Signed-off-by: Michael Tiemann <[email protected]>

* Merge cleanup: pint_eval.py needs tokenize

Clean up merge import error.

Signed-off-by: Michael Tiemann <[email protected]>

* Make black happier

Run `black` with default arguments to try to match whatever `black` wants to see in the CI/CD world.

Signed-off-by: Michael Tiemann <[email protected]>

* Make ruff happy

Remove unused redefinition of tokenizer in toktest.py.  Also remove unnecessary import of pint_eval from top-level (it's imported inside the function definition that needs it).

Signed-off-by: Michael Tiemann <[email protected]>

* Make ruff happier

Fix ruff errors missed in previous commit.

Signed-off-by: Michael Tiemann <[email protected]>

* Update toktest.py

Fix whitespace error created by `ruff --fix` that `black` didn't like.

Signed-off-by: Michael Tiemann <[email protected]>

* Update test_util.py

Follow deprecation of use_decimal from pint/util.py

Signed-off-by: Michael Tiemann <[email protected]>

* Fix additional regressions in test suite

If we have the uncertainties library loaded, go ahead and use the uncertainty_tokenizer by default.  This fixes problems with standard Pandas tests that expect the tokenizer to do the right thing without any special setup.

Also, prevent an exception when a loop in consensus_name_attr (pandas-dev/pandas/core/common.py(86)) tests equality with a None argument. Otherwise the zero_or_nan test raises an exception.

Signed-off-by: Michael Tiemann <[email protected]>
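
The None guard described above can be sketched in isolation. The class `Qty` and this simplified `zero_or_nan` are hypothetical stand-ins, not Pint's PlainQuantity or its helper; the point is only that the None check has to come before any numeric test.

```python
import math
from typing import Any


def zero_or_nan(value: Any) -> bool:
    """Simplified stand-in for the zero-or-NaN check; raises TypeError on None."""
    return value == 0 or math.isnan(value)


class Qty:
    """Hypothetical minimal quantity; not Pint's PlainQuantity."""

    def __init__(self, magnitude: float, units: str) -> None:
        self.magnitude = magnitude
        self.units = units

    def __eq__(self, other: Any) -> bool:
        if isinstance(other, Qty):
            return (self.magnitude, self.units) == (other.magnitude, other.units)
        if other is None:
            # Guard first: math.isnan(None) below would raise TypeError.
            return False
        if zero_or_nan(other):
            # Comparing against a bare zero or NaN ignores the units.
            return self.magnitude == other
        return False
```

With the guard in place, `Qty(0.0, "m") == None` evaluates to False instead of propagating a TypeError from the numeric check.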

* Update quantity.py

Teach Pint's PlainQuantity about the Pandas pd.NA value so that ndim works.  Otherwise, it naively delegates to NumpyQuantity, which is the road to perdition for PintArrays.

Signed-off-by: Michael Tiemann <[email protected]>
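
A minimal sketch of the ndim idea follows, written so it runs without pandas. `FakeNA` and `FakeArray` are hypothetical stand-ins (the former mimics only the `"<NA>"` string form that the diff checks for), and `ndim_of` is an illustrative free function rather than the actual property on PlainQuantity.

```python
import numbers
from typing import Any


class FakeNA:
    """Hypothetical stand-in for pandas' pd.NA, so the sketch needs no pandas."""

    def __repr__(self) -> str:
        return "<NA>"


class FakeArray:
    """Hypothetical array-like magnitude exposing only ndim."""

    ndim = 2


def ndim_of(magnitude: Any) -> int:
    if isinstance(magnitude, numbers.Number):
        return 0  # plain scalars have no dimensions
    if str(magnitude) == "<NA>":
        return 0  # treat the missing-value sentinel as a scalar too
    return magnitude.ndim  # otherwise defer to the magnitude itself
```

Checking the string form keeps the sketch free of a hard pandas import, at the cost of also matching any other object that happens to print as `<NA>`.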

* Make `babel` a dependency for testbase

Here's hoping this fixes the CI/CD problem with test_1400.

Signed-off-by: Michael Tiemann <[email protected]>

* Update .readthedocs.yaml

Removing `system_packages: false` as suggested by @keewis

Signed-off-by: Michael Tiemann <[email protected]>

* Fix failing tests

Fix isnan to use unp.isnan as appropriate for both duck_array_type and objects of UFloat types.

Fix a minor typo in pint/facets/__init__.py comment.

In test_issue_1400, use decorators to ensure babel library is loaded when needed.

pyproject.toml: revert change to testbase; we fixed with decorators instead.

Signed-off-by: Michael Tiemann <[email protected]>

---------

Signed-off-by: [email protected]
Signed-off-by: MichaelTiemann <[email protected]>
Signed-off-by: Michael Tiemann <[email protected]>
hgrecco authored Sep 15, 2023
2 parents 2852f36 + 00f08f3 commit 07646d0
Showing 16 changed files with 446 additions and 51 deletions.
1 change: 0 additions & 1 deletion .readthedocs.yaml
@@ -11,4 +11,3 @@ python:
- requirements: requirements_docs.txt
- method: pip
path: .
system_packages: false
6 changes: 6 additions & 0 deletions CHANGES
@@ -105,6 +105,12 @@ Pint Changelog
(Issue #1030, #574)
- Added angular frequency documentation page.
- Move ASV benchmarks to dedicated folder. (Issue #1542)
- An ndim attribute has been added to Quantity and DataFrame has been added to upcast
types for pint-pandas compatibility. (#1596)
- Fix a recursion error that would be raised when passing quantities to `cond` and `x`.
(Issue #1510, #1530)
- Update test_non_int tests for pytest.
- Better support for uncertainties (See #1611, #1614)
- Implement `numpy.broadcast_arrays` (#1607)
- An ndim attribute has been added to Quantity and DataFrame has been added to upcast
types for pint-pandas compatibility. (#1596)
64 changes: 36 additions & 28 deletions pint/compat.py
@@ -12,14 +12,21 @@

import sys
import math
import tokenize
from decimal import Decimal
from importlib import import_module
from io import BytesIO
from numbers import Number
from collections.abc import Mapping
from typing import Any, NoReturn, Callable, Optional, Union
from collections.abc import Generator, Iterable
from collections.abc import Iterable

try:
from uncertainties import UFloat, ufloat
from uncertainties import unumpy as unp

HAS_UNCERTAINTIES = True
except ImportError:
UFloat = ufloat = unp = None
HAS_UNCERTAINTIES = False


if sys.version_info >= (3, 10):
@@ -58,19 +65,6 @@ def _inner(*args: Any, **kwargs: Any) -> NoReturn:
return _inner


def tokenizer(input_string: str) -> Generator[tokenize.TokenInfo, None, None]:
"""Tokenize an input string, encoded as UTF-8
and skipping the ENCODING token.
See Also
--------
tokenize.tokenize
"""
for tokinfo in tokenize.tokenize(BytesIO(input_string.encode("utf-8")).readline):
if tokinfo.type != tokenize.ENCODING:
yield tokinfo


# TODO: remove this warning after v0.10
class BehaviorChangeWarning(UserWarning):
pass
@@ -83,7 +77,10 @@ class BehaviorChangeWarning(UserWarning):

HAS_NUMPY = True
NUMPY_VER = np.__version__
NUMERIC_TYPES = (Number, Decimal, ndarray, np.number)
if HAS_UNCERTAINTIES:
NUMERIC_TYPES = (Number, Decimal, ndarray, np.number, UFloat)
else:
NUMERIC_TYPES = (Number, Decimal, ndarray, np.number)

def _to_magnitude(value, force_ndarray=False, force_ndarray_like=False):
if isinstance(value, (dict, bool)) or value is None:
@@ -92,6 +89,11 @@ def _to_magnitude(value, force_ndarray=False, force_ndarray_like=False):
raise ValueError("Quantity magnitude cannot be an empty string.")
elif isinstance(value, (list, tuple)):
return np.asarray(value)
elif HAS_UNCERTAINTIES:
from pint.facets.measurement.objects import Measurement

if isinstance(value, Measurement):
return ufloat(value.value, value.error)
if force_ndarray or (
force_ndarray_like and not is_duck_array_type(type(value))
):
@@ -144,16 +146,13 @@ def _to_magnitude(value, force_ndarray=False, force_ndarray_like=False):
"lists and tuples are valid magnitudes for "
"Quantity only when NumPy is present."
)
return value
elif HAS_UNCERTAINTIES:
from pint.facets.measurement.objects import Measurement

if isinstance(value, Measurement):
return ufloat(value.value, value.error)
return value

try:
from uncertainties import ufloat

HAS_UNCERTAINTIES = True
except ImportError:
ufloat = None
HAS_UNCERTAINTIES = False

try:
from babel import Locale
@@ -326,16 +325,25 @@ def isnan(obj: Any, check_all: bool) -> Union[bool, Iterable[bool]]:
Always return False for non-numeric types.
"""
if is_duck_array_type(type(obj)):
if obj.dtype.kind in "if":
if obj.dtype.kind in "ifc":
out = np.isnan(obj)
elif obj.dtype.kind in "Mm":
out = np.isnat(obj)
else:
# Not a numeric or datetime type
out = np.full(obj.shape, False)
if HAS_UNCERTAINTIES:
try:
out = unp.isnan(obj)
except TypeError:
# Not a numeric or UFloat type
out = np.full(obj.shape, False)
else:
# Not a numeric or datetime type
out = np.full(obj.shape, False)
return out.any() if check_all else out
if isinstance(obj, np_datetime64):
return np.isnat(obj)
elif HAS_UNCERTAINTIES and isinstance(obj, UFloat):
return unp.isnan(obj)
try:
return math.isnan(obj)
except TypeError:
2 changes: 1 addition & 1 deletion pint/facets/__init__.py
@@ -7,7 +7,7 @@
keeping each part small enough to be hackable.
Each facet contains one or more of the following modules:
- definitions: classes describing an specific unit related definiton.
- definitions: classes describing specific unit-related definitons.
These objects must be immutable, pickable and not reference the registry (e.g. ContextDefinition)
- objects: classes and functions that encapsulate behavior (e.g. Context)
- registry: implements a subclass of PlainRegistry or class that can be
19 changes: 10 additions & 9 deletions pint/facets/measurement/objects.py
@@ -52,7 +52,7 @@ class Measurement(PlainQuantity):
"""

def __new__(cls, value, error, units=MISSING):
def __new__(cls, value, error=MISSING, units=MISSING):
if units is MISSING:
try:
value, units = value.magnitude, value.units
@@ -64,17 +64,18 @@ def __new__(cls, value, error, units=MISSING):
error = MISSING # used for check below
else:
units = ""
try:
error = error.to(units).magnitude
except AttributeError:
pass

if error is MISSING:
# We've already extracted the units from the Quantity above
mag = value
elif error < 0:
raise ValueError("The magnitude of the error cannot be negative")
else:
mag = ufloat(value, error)
try:
error = error.to(units).magnitude
except AttributeError:
pass
if error < 0:
raise ValueError("The magnitude of the error cannot be negative")
else:
mag = ufloat(value, error)

inst = super().__new__(cls, mag, units)
return inst
15 changes: 15 additions & 0 deletions pint/facets/numpy/quantity.py
@@ -29,6 +29,16 @@
set_units_ufuncs,
)

try:
import uncertainties.unumpy as unp
from uncertainties import ufloat, UFloat

HAS_UNCERTAINTIES = True
except ImportError:
unp = np
ufloat = Ufloat = None
HAS_UNCERTAINTIES = False


def method_wraps(numpy_func):
if isinstance(numpy_func, str):
@@ -224,6 +234,11 @@ def __getattr__(self, item) -> Any:
)
else:
raise exc
elif (
HAS_UNCERTAINTIES and item == "ndim" and isinstance(self._magnitude, UFloat)
):
# Dimensionality of a single UFloat is 0, like any other scalar
return 0

try:
return getattr(self._magnitude, item)
23 changes: 22 additions & 1 deletion pint/facets/plain/quantity.py
@@ -55,6 +55,17 @@
if HAS_NUMPY:
import numpy as np # noqa

try:
import uncertainties.unumpy as unp
from uncertainties import ufloat, UFloat

HAS_UNCERTAINTIES = True
except ImportError:
unp = np
ufloat = Ufloat = None
HAS_UNCERTAINTIES = False


MagnitudeT = TypeVar("MagnitudeT", bound=Magnitude)
ScalarT = TypeVar("ScalarT", bound=Scalar)

@@ -133,6 +144,8 @@ class PlainQuantity(Generic[MagnitudeT], PrettyIPython, SharedRegistryObject):
def ndim(self) -> int:
if isinstance(self.magnitude, numbers.Number):
return 0
if str(self.magnitude) == "<NA>":
return 0
return self.magnitude.ndim

@property
@@ -256,7 +269,12 @@ def __bytes__(self) -> bytes:
return str(self).encode(locale.getpreferredencoding())

def __repr__(self) -> str:
if isinstance(self._magnitude, float):
if HAS_UNCERTAINTIES:
if isinstance(self._magnitude, UFloat):
return f"<Quantity({self._magnitude:.6}, '{self._units}')>"
else:
return f"<Quantity({self._magnitude}, '{self._units}')>"
elif isinstance(self._magnitude, float):
return f"<Quantity({self._magnitude:.9}, '{self._units}')>"

return f"<Quantity({self._magnitude}, '{self._units}')>"
@@ -1288,6 +1306,9 @@ def bool_result(value):
# We compare to the plain class of PlainQuantity because
# each PlainQuantity class is unique.
if not isinstance(other, PlainQuantity):
if other is None:
# A loop in pandas-dev/pandas/core/common.py(86)consensus_name_attr() can result in OTHER being None
return bool_result(False)
if zero_or_nan(other, True):
# Handle the special case in which we compare to zero or NaN
# (or an array of zeros or NaNs)
5 changes: 3 additions & 2 deletions pint/facets/plain/registry.py
@@ -63,8 +63,9 @@
Handler,
)

from ... import pint_eval
from ..._vendor import appdirs
from ...compat import babel_parse, tokenizer, TypeAlias, Self
from ...compat import babel_parse, TypeAlias, Self
from ...errors import DimensionalityError, RedefinitionError, UndefinedUnitError
from ...pint_eval import build_eval_tree
from ...util import ParserHelper
@@ -1324,7 +1325,7 @@ def parse_expression(
for p in self.preprocessors:
input_string = p(input_string)
input_string = string_preprocessor(input_string)
gen = tokenizer(input_string)
gen = pint_eval.tokenizer(input_string)

def _define_op(s: str):
return self._eval_token(s, case_sensitive=case_sensitive, **values)
10 changes: 7 additions & 3 deletions pint/formatting.py
@@ -375,9 +375,13 @@ def formatter(
# Don't remove this positional! This is the format used in Babel
key = pat.replace("{0}", "").strip()
break
division_fmt = compound_unit_patterns.get("per", {}).get(
babel_length, division_fmt
)

tmp = compound_unit_patterns.get("per", {}).get(babel_length, division_fmt)

try:
division_fmt = tmp.get("compound", division_fmt)
except AttributeError:
division_fmt = tmp
power_fmt = "{}{}"
exp_call = _pretty_fmt_exponent
if value == 1: