Use `sciform` as a formatting backend for `uncertainties` #192
-
The subject is interesting. As a matter of principle, I'm for avoiding duplication of efforts (in this case, maintaining and possibly extending the formatting of numbers with uncertainty). Formatting obviously belongs in a library that manipulates numbers with uncertainty, because users will often want to print them at some point. Now, how formatting is implemented can indeed be delegated to another library (with the advantage cited above). Back around 2009, I did check best scientific practices in terms of number printing (including NIST's and the Particle Data Group's) and took them into account while writing the formatting code, so the situation shouldn't be too bad. Now, I fully appreciate that some users might want additional options, and again, being able to maintain the formatting of numbers with uncertainty in a single place will help with that. Now, at this stage, it looks to me like a better option is to offer
And for reference: the formatting code was actually much harder to write than the rest! There are in fact many corner cases that come from the interaction between formatting options (LaTeX, uncertainties in parentheses, the handling of NaN and inf, all within the constraints of padding, etc.). This also makes the code probably the hardest part of the package to read. This is where avoiding effort duplication can pay off, if it can be put in place. Again, I feel that this is not so hard to do, for the reasons above. At this stage, offering
-
@lebigot, thanks for your thoughtful responses. I've been working out how to respond. I'll try to mainly focus on proposed paths forward. Note, I'm assuming these changes would be implemented on a 3.x -> 4.x changeover so that breaking changes are considered acceptable. But first, yes, Trying to figure out exactly what you mean by:
By this do you mean there would be some package-wide global flag that selects either existing Here's what I consider to be my favorite idea after brainstorming for a little. Rather than importing the One big advantage of this is that the The other big advantage is that That said, this approach wouldn't be fully backwards compatible without a lot of custom formatting still happening in
One more point I'll make. It looks like For existing incompatibilities see:
For the latter I would have expected Since the existing FSML isn't fully compatible with the built-in FSML, and because I consider the built-in FSML to not have made the best choices for scientific formatting, I suggest
or
If you want to include left padding by
-
Some other comments:
I'm hoping the answer to "can we make breaking changes" is "yes". If that's the case then maybe a more productive direction for this conversation is "How should formatting be updated/redesigned for I would be interested in the latter discussion I mentioned and I could list out some ideas I have.
-
Yes, I was suggesting that users could set a global option in I still like your favorite idea ( As for some of the formatting behavior choices:
Good point, about the I personally don't mind breaking changes, but only as long as they can be mostly painless to users. This is a relevant point, for Concretely, one can, in this case:
How does this sound?
-
This will be the trickiest to support with But, about getting the
This would also be challenging to support with
It would be very easy to support this the way I'm imagining it.
This seems fine for a transition involving breaking changes. The only thing I would add is to have a deprecation schedule even for the painless option to use the old formatting. The only reason we'd make a breaking change to the current formatting is that we think we've designed something that is improved. If it really is improved, I think the goal should be for all users to move to it eventually, with enough warning.
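
To make the deprecation idea a bit more concrete, here is a minimal sketch of an opt-in legacy switch that warns on use. Everything in it is an assumption for illustration: `use_legacy_formatting` and `format_pair` are hypothetical names, not existing `uncertainties` API, and the two output styles are stand-ins for whatever the old and new formatting end up being.

```python
import warnings

# Hypothetical module-level switch (an assumption, not an existing option).
_USE_LEGACY_FORMATTING = False


def use_legacy_formatting(enabled: bool = True) -> None:
    """Opt in to (or out of) the old formatting, warning that it will go away."""
    global _USE_LEGACY_FORMATTING
    if enabled:
        warnings.warn(
            "Legacy formatting is deprecated and will be removed in a future "
            "release; please migrate to the new formatting.",
            DeprecationWarning,
            stacklevel=2,
        )
    _USE_LEGACY_FORMATTING = enabled


def format_pair(value: float, uncertainty: float) -> str:
    """Format a value/uncertainty pair with whichever style is selected."""
    if _USE_LEGACY_FORMATTING:
        return f"{value}+/-{uncertainty}"  # stand-in for the old output
    return f"{value} ± {uncertainty}"  # stand-in for the new output
```

With something like this, users who need the old output pay one line of code and get a reminder on every run, which gives a deprecation schedule something to hang on.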
-
However, with everything said, I would like to shift this discussion from "how can So for my next post I'll try to make a list of improvements I would like to see for
-
How can Note that this conversation is irrelevant if
probably also exposing something like But, in the likely event First some features I consider to be relatively important
Next some features I could take or leave
Then there's a question that concerns a few possible changes (don't know if they're improvements or not).
If users really want alignment then If uncertainties were to take these points of view then I would suggest:
-
@jagerber48 @lebigot Sorry that I have not had time to keep up with this detailed conversation. I agree with what @jagerber48 says in the original post:
There is sort of a lot of code in Aside from how to best express precision and the number of significant digits in the printed form, there is a related question of what should
That suggests that

    >>> from uncertainties import ufloat
    >>> x = ufloat(3, 0.2)
    >>> repr(x)
    '3.0+/-0.2'

could return I don't think anyone is complaining about the current rendering, but it is worth re-deciding whether With
-
This is something I'd like to do too, unrelated to sciform, as it reduces file size somewhat
-
I use the `uncertainties` package for two things.

The former strikes me as the core functionality of the package. From the short description of the package:
The former functionality is also a mathematically and computationally challenging problem to get done well.
The latter, regarding the display of value/uncertainty pairs, strikes me as a useful nice-to-have feature while working with value/uncertainty pairs. The code to realize value/uncertainty formatting is not as algorithmically demanding as the core error propagation code. Rather, the formatting code is sprinkled with parsing of many user options, various rounding and string manipulations, adherence to published standards, and watching out for various edge cases like `nan`, or rounding changing the number of significant figures.

In short, I view the error propagation code and the formatting code as being very different types of code. My general suggestion is that the formatting code should be moved out of `uncertainties` and delegated to a separate package which can be dedicated to implementing scientific formatting according to published standards and `uncertainties`. I suggest that this is a better separation of responsibilities.

Heavily motivated/inspired by `uncertainties` formatting, I wrote the sciform package. `sciform` is a package dedicated to formatting values and value/uncertainty pairs according to published scientific standards. It provides many relevant formatting options and a few workflows for implementing those options to format numbers into strings. Some relevant features for `uncertainties`:
- `sciform` supports ± formatting like `123.45 ± 0.67` and parentheses uncertainty like `123.45(0.67)` or `123.45(67)`.
- `sciform` supports explicit control over the number of significant figures of uncertainty to display and formats the corresponding value accordingly.
- `sciform` can do this both for value/uncertainty pairs (as can be done with the `ufloat` now) and for values alone.
- `sciform` supports engineering notation, in which the exponent in scientific notation is coerced to be a multiple of 3. This is valuable for presenting data more compatibly with SI standards and can improve readability.
- `sciform` exposes a format specification mini-language (FSML) similar to the built-in FSML and similar to the `uncertainties` FSML.
- `0.000_012_345 ± 0.000_000_012` can be formatted as `(12.345 ± 0.012)e-06`, but `sciform` can also present it as `(12.345 ± 0.012) μ`. Users can then easily append relevant units for simple dimensional quantities to make `μm` or `μs` etc.
- `sciform` will behave as desired.

Many more features can be found in the documentation.
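
For readers who want to see what a couple of these features amount to, here is a rough standard-library sketch of rounding a value/uncertainty pair to a number of significant figures on the uncertainty, rendering it in ± or parentheses form, and coercing the exponent to a multiple of 3. It does not use `sciform` and is not how `sciform` is implemented; the function names are made up for illustration, and it ignores edge cases such as `nan`, zero uncertainty, or rounding that bumps the uncertainty to an extra digit.

```python
from decimal import Decimal


def round_pair(value, uncertainty, sig_figs=2):
    """Round the uncertainty to sig_figs digits and the value to the same place."""
    val = Decimal(str(value))
    unc = Decimal(str(uncertainty))
    # adjusted() is the power of ten of the most significant digit.
    bottom_place = unc.adjusted() - (sig_figs - 1)
    quantum = Decimal(1).scaleb(bottom_place)
    return val.quantize(quantum), unc.quantize(quantum), bottom_place


def format_pair(value, uncertainty, sig_figs=2, paren=False):
    """Render as '123.45 ± 0.67' or, with paren=True, as '123.45(67)'."""
    val, unc, bottom_place = round_pair(value, uncertainty, sig_figs)
    if not paren:
        return f"{val:f} ± {unc:f}"
    # When the last shown digit is after the decimal point, show bare digits.
    digits = unc.scaleb(-bottom_place) if bottom_place < 0 else unc
    return f"{val:f}({digits:f})"


def format_engineering(value, uncertainty, sig_figs=2):
    """Factor out a shared exponent that is a multiple of 3."""
    val = Decimal(str(value))
    exp = 0 if val == 0 else 3 * (val.adjusted() // 3)
    body = format_pair(
        val.scaleb(-exp), Decimal(str(uncertainty)).scaleb(-exp), sig_figs
    )
    return f"({body})e{exp:+03d}"


print(format_pair(123.45, 0.67))                    # 123.45 ± 0.67
print(format_pair(123.45, 0.67, paren=True))        # 123.45(67)
print(format_engineering(0.000012345, 0.000000012)) # (12.345 ± 0.012)e-06
```

Even this toy version hints at why the real formatting code is full of corner cases: each option interacts with the rounding step.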
I would like to explore the idea of whether `uncertainties` should migrate to using `sciform` as a formatting backend. Right now the `ufloat` object, at format time, calls formatting functions that appear in `core.py` in `uncertainties`. See `AffineScalarFunc.__format__()`. The suggestion would be that instead of calling internal `uncertainties` code, the `ufloat` or `AffineScalarFunc` would import and call `sciform` code. A simple example implementation taking advantage of the `sciform` FSML would look like

where `format_spec` is a valid `sciform` format specification string. A quick reference example to demonstrate how this might work is

In addition, `sciform` allows global configuration of formatting settings to allow users to modify settings not accessible through the FSML (like upper/lower/decimal separator characters). `uncertainties` could expose something like `set_uncertainties_formatting_options` to allow users even more flexible formatting control if desired. Under the hood, `uncertainties` would use a local `sciform` context or `Formatter` object to implement the user-selected global options. Note that the global options apply for any options not explicitly specified by the user's format specification string. Any options selected by the format specification string override the corresponding option in the global configuration.

The main caveat I will point out is the following. I imagine the main way users will access formatting will be via f-string formatting on the `ufloat` object. If `uncertainties` were to migrate to the `sciform` FSML, I have to point out that there are some backwards-incompatible differences. See this section for a comparison between the `sciform` and Python built-in FSMLs. The main points I will highlight:

- `sciform` always opts towards positive explicit control of all formatting parameters. For this reason `sciform` opted to not support e.g. the `g` formatting option, which does some automatic work in the background to select exactly what options will be used when formatting numbers.
- `sciform` ONLY supports presenting value/uncertainty pairs by specifying the number of significant digits to display on the uncertainty. The value and uncertainty are always rounded to the same least-significant decimal place. Specifying digits-past-the-decimal-point is not supported because that sort of formatting is not recommended by official standards such as NIST or BIPM. `sciform` does support formatting individual numbers according to a number of digits-past-the-decimal-point but that is less relevant for `uncertainties`.
- `sciform` supports left padding either zeros or spaces between the most significant digit and the sign symbol so that a number is left-padded up to a certain user-specified decimal place (e.g. the millions place). `sciform` does not support padding by arbitrary symbols, and `sciform` does not support left, right, or center padding. `sciform` takes the view that these types of formatting can be applied to `sciform` output strings afterwards if the user has some need for the string to be a certain length or justified a certain way. `sciform` only takes responsibility for the numerical and scientific parts of the formatting, whereas `sciform` considers left/right/center padding to be a task generic to all Python string objects and not specific to scientific/numeric string objects.
- `nan` or `inf` inputs.

Adoption of the `sciform` FSML would likely mean breaking changes for `uncertainties` formatting, so it probably would not be suitable for an `uncertainties` 3.x release and would need to wait for a 4.x release.

That said, there is another way `uncertainties` could use `sciform` that is not just directly inheriting the `sciform` FSML. `uncertainties` could retain its same FSML and, instead of using the format string to fully format the string, the format string would be parsed and used to construct a `sciform` `Formatter` object with the appropriate options, which could in the end be used to format the string using `sciform`. In this case there would still be breaking changes, or `uncertainties` would still need to do a lot of the formatting legwork to cover cases which are impossible for `sciform` to handle.

All of this said, I was intentional about excluding features from the `sciform` FSML. If a feature isn't included it is probably because it goes against scientific formatting best practices and should probably be avoided anyway. So, despite the differences, if `uncertainties` can bear backwards-incompatible changes to formatting, I think it would be a net benefit to `uncertainties` to outsource formatting to `sciform`.

Two final notes:
- `sciform` only supports Python >= 3.9 right now because it internally uses some modern typing features. If it is important for `sciform` to support lower Python versions this can likely be done pretty easily.
- `uncertainties`, I am perfectly happy/open to discussing modifications to `sciform` for better compatibility.

Please let me know what you think or if you have any questions about `sciform` or how it could be used with `uncertainties`.
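
To illustrate the overall shape of the proposal (delegating `__format__` to a backend, letting the format specification override global options, and leaving alignment to ordinary string methods), here is a small self-contained sketch. It is only a toy under stated assumptions: `Measurement`, `set_formatting_options`, `parse_spec` and `backend_format` are hypothetical names, the spec syntax (optional digit count, optional `p` for parentheses) is invented here and is neither the `sciform` nor the `uncertainties` FSML, and `backend_format` merely stands in for the external library.

```python
from dataclasses import dataclass
from decimal import Decimal

# Package-wide defaults; a per-call format spec overrides any of these.
_GLOBAL_OPTIONS = {"sig_figs": 2, "paren": False}


def set_formatting_options(**options):
    """Hypothetical global-configuration hook."""
    unknown = set(options) - set(_GLOBAL_OPTIONS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    _GLOBAL_OPTIONS.update(options)


def parse_spec(format_spec):
    """Parse a toy spec: an optional digit count followed by an optional 'p'."""
    options = {}
    digits = format_spec.rstrip("p")
    if digits:
        options["sig_figs"] = int(digits)
    if format_spec.endswith("p"):
        options["paren"] = True
    return options


def backend_format(value, uncertainty, *, sig_figs, paren):
    """Stand-in for the external formatting library."""
    unc = Decimal(str(uncertainty))
    place = unc.adjusted() - (sig_figs - 1)
    quantum = Decimal(1).scaleb(place)
    val, unc = Decimal(str(value)).quantize(quantum), unc.quantize(quantum)
    if paren and place < 0:
        return f"{val:f}({unc.scaleb(-place):f})"
    return f"{val:f} ± {unc:f}"


@dataclass
class Measurement:
    value: float
    uncertainty: float

    def __format__(self, format_spec):
        # Options given in the spec win over the global configuration.
        options = {**_GLOBAL_OPTIONS, **parse_spec(format_spec)}
        return backend_format(self.value, self.uncertainty, **options)


x = Measurement(123.456, 0.789)
print(f"{x}")            # 123.46 ± 0.79
print(f"{x:1p}")         # 123.5(8)
print(f"{x}".rjust(20))  # alignment applied afterwards, as for any string
```

The point of the last line is the division of labor argued for above: the numeric formatting is done once, and justification is left to generic string tools.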