Request for rounding exponent to multiple of 3 for "engineering notation" #159
This is a good idea: it looks like this can indeed be useful. Some work is needed to precisely define the convention that would make sense, as you describe. Maybe there are some standards or common practices? |
A little bit of research doesn't yield standards or common practices. I think anything that has an exponent divisible by 3 counts as engineering notation, so for 1200, all of 1200e0, 1.2e3, 0.0012e6, and 0.0000012e9 count as engineering notation. I can continue to look for common practices, but my intuitive expectation is that the base b in b*10^e satisfies either 1.0 <= b < 1000 or 0.1 <= b < 100. I'd prefer the latter, but I could imagine others, or myself in some situations, preferring the former. I could imagine a preference setting somewhere like eng_notation_small_base=True or eng_notation_large_base=True that toggles between the two modes. I haven't worked much with the uncertainties package other than ufloat, but for that case I could imagine something like ufloat(value, sigma, eng_notation_small_base=True), or, if there's somewhere you can set global preferences/settings, it could appear there. |
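To make the two candidate conventions concrete, here is a minimal sketch of how the exponent could be chosen. The function name `eng_exponent` and the `small_base` flag are hypothetical and are not part of the uncertainties package; this only illustrates the arithmetic.

```python
import math


def eng_exponent(value, small_base=False):
    """Return an exponent divisible by 3 for `value` (hypothetical helper).

    small_base=False keeps the base in [1, 1000).
    small_base=True keeps the base in [0.1, 100).
    """
    if value == 0:
        return 0
    # Power of ten of the leading digit, e.g. 3 for 1200.
    exp10 = math.floor(math.log10(abs(value)))
    if small_base:
        # Shifting by one decade moves the allowed base window down by 10x.
        exp10 += 1
    # Round down to the nearest multiple of 3.
    return 3 * (exp10 // 3)


for v in (1200, 120, 12, 0.5):
    for small in (False, True):
        e = eng_exponent(v, small_base=small)
        print(v, small, v / 10.0 ** e, e)
```

With this sketch, 120 is written as 120e0 in the 1 to 1000 convention and as 0.12e3 in the 0.1 to 100 convention, while 1200 is 1.2e3 in both.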
This is as close as we're going to get to an official standard I think: https://www.nist.gov/pml/special-publication-811/nist-guide-si-chapter-7-rules-and-style-conventions-expressing-values Thinking about this raised another question, is it possible to coerce the exponent to be a specific value in uncertainties? May open a separate issue about that. |
Thanks for the research. I think it's a good idea to use NIST's convention. The natural place for defining the chosen format is Python's string formatting. We could add a new format letter for this engineering format (uppercase for 1–1000, lowercase for 0.1–100 maybe?). |
Yes, that sounds great to me! I like upper case for 1-1000 and lower for 0.1 to 100. Maybe |
I like this idea. There is the question of possible interactions with other format indicators (precision control…). At this stage I'm not seeing any problem: I'm thinking they can simply be applied to the mantissa part. |
For reference: the string parsing function |
I've begun work on this. It was pretty straightforward to get Three questions. (1) Should I open a PR now with the partial code updates or wait until it is more complete and ready for review and (2) I haven't had a look at updating documentation anyways. I can update it in the nearby docstrings but if there are other places it needs to be updated I would need to be pointed there. (3) anywhere in the package other than |
Thanks. Yes, About your questions: (1) I'm thinking that it's more explicit to only open a Pull Request when you're satisfied with what you can do (so that nobody reads code / documentation that you may update later, as it's more efficient if you touch what you wrote). (2) Both the docstrings and the documentation should be updated (it's at https://github.com/lebigot/uncertainties/blob/master/doc/user_guide.rst#printing). (3) I don't think that anything besides |
Sounds good! I'll continue to work on this and if I have more questions of this nature I'll ask them here! |
Ok, I learned regex now. Do you anticipate a specific issue with At first blush and my first tests there is no change needed to For now I plan to start adding to the testing script to capture and test different possible edge cases, as is already done thoroughly for existing features. My initial tests show that the new format plays well with other formatting instructions, but I want to confirm it more thoroughly. That plus documentation, and I'll set up a PR. |
Perhaps you were imagining using other delimiters than |
Consider value 123.456 with uncertainty 0.012. This was the easiest way for it to happen with the |
Sorry for the barrage of comments. Unfortunately I think it will take me longer to work this out than I realized. The formatting is failing for a pair like: 12.3 +/- 456.78 formatted with |
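One way to think about pairs where the uncertainty exceeds the value is to derive a single shared exponent from whichever of the two is larger in magnitude. The sketch below is only an illustration under that assumption; `format_eng` is a hypothetical helper, not the code from the branch discussed here and not the package's formatting routine.

```python
import math


def format_eng(value, std_dev, decimals=0):
    """Format value +/- std_dev with one shared exponent divisible by 3.

    Assumption: the exponent is driven by the larger of |value| and
    |std_dev|, so neither base overflows the 1 to 1000 window.
    """
    ref = max(abs(value), abs(std_dev))
    exp10 = math.floor(math.log10(ref)) if ref else 0
    exp3 = 3 * (exp10 // 3)
    scale = 10.0 ** exp3
    return "({:.{d}f}+/-{:.{d}f})e{:+03d}".format(
        value / scale, std_dev / scale, exp3, d=decimals
    )


print(format_eng(1.2e4, 1e3))
print(format_eng(12.3, 456.78, decimals=1))
```

Here `decimals` is applied to the scaled bases only, in line with the earlier suggestion that other format indicators simply act on the mantissa part.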
@lebigot Hello, do you have any suggestions for where would be the best place in the |
Thank you for the last 3 messages (which somehow escaped me until now). Let me respond in order:
|
@lebigot thanks for the responses.
2 + 3. OK, so what you suggest in 3. is what I attempted, but it leads to the failure discussed in 2. Here's my code: master...jagerber48:uncertainties:feature/engineering_notation. I've spent more time looking at FWIW I wrote my own val +/- uncertainty string formatting function inspired by the code I've been looking at here. https://github.com/jagerber48/strunc/blob/main/strunc/val_unc_formatting.py It uses a custom "specification language" that's a little more targeted towards the use case of val/uncertainty printing than Python's native formatting, which relies heavily on "digits-past-decimal-point" precision as opposed to sig figs. Basically you can explicitly specify (1) whether to use sig figs or digits-past-decimal precision, (2) whether val or std drives the precision, i.e. whether (123.45 +/- 67.1) with 1 sig fig should appear as (100 +/- 0) or (120 +/- 70), and (3) whether the val or std drives the exponent, i.e. whether (120 +/- 70) should go to (1.2 +/- 0.7)e+2 or (12 +/- 7)e+1. For standard scientific printing you'd always want to use sig figs driven by the uncertainty (and the number of sig figs should probably be 1 or 2; you can follow the PDG recommendation if you like). I think usually you'd want the exponent driven by the value, but there may be cases where you want it driven by the uncertainty. Then there are some other features motivated by what I saw here. I don't think this code is any kind of replacement for what's in this package. Just sharing in case it jogs any ideas. I think extending the native Python string formatting to support val +/- unc formatting is one of the slickest parts of this package. It's just unfortunate that that approach has to inherit Python's digits-past-the-decimal approach to precision.
|
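The "sig figs driven by the uncertainty" rule mentioned above can be sketched in a few lines. This is a simplified illustration, not the strunc implementation; `round_by_uncertainty` is a hypothetical name, and the sketch ignores the edge case where rounding pushes the uncertainty into the next decade (for example 0.96 becoming 1.0).

```python
import math


def round_by_uncertainty(value, std_dev, sig_figs=1):
    """Round value and std_dev to the decimal place set by std_dev.

    The uncertainty keeps `sig_figs` significant figures, and the value
    is rounded to the same decimal place.
    """
    if std_dev == 0:
        return value, std_dev
    # Decimal position of the uncertainty's leading digit (1 for 67.1).
    top = math.floor(math.log10(abs(std_dev)))
    ndigits = sig_figs - 1 - top
    return round(value, ndigits), round(std_dev, ndigits)


print(round_by_uncertainty(123.45, 67.1, sig_figs=1))
print(round_by_uncertainty(123.45, 67.1, sig_figs=2))
```

For 123.45 +/- 67.1 with one significant figure this gives 120 +/- 70, which is the uncertainty-driven result quoted in the comment.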
@lebigot I'm still curious to get the engineering notation feature in, but it's challenging for me to get my head around what the expected behavior is for various combinations of inputs. I feel I need this information to help me understand the code and to write unit tests for the new feature. But there are existing unit test features blocking this. (1) #162 some unit tests are not actually being run because of a bug in the unit test code. I address issue (1) in this PR #167, but I think issue (2) blocks the PR from passing build. I see these issues as blocking this one, I'm curious to know how you think it best to handle these 2 issues. Should one PR fix the "unit tests not being run" bug AS WELL AS the "some unit tests fail" issue? Or should it be handled in separate PRs? If so, how? |
@lebigot I'm still interested in getting this engineering notation feature into this package if possible. But I'm taking a very long way around. The onion I'm looking at looks like this:
But I think the last two points can be worked on in parallel. I'm curious if you would be able to address my comments in other issues and PRs in this repo about unit tests pointed out in my previous comment? ALTERNATIVELY, if taking the "long way around" is a big waste of time, and you see an easy way to get engineering notation in, I'd be very open to tips/suggestions for how to do that! |
Thanks for the updates. I'm quite busy right now and cannot dig into this, but I'll see after the current bout of activity if I can whip up something. |
For many scientific applications it is nice to have the exponent rounded to a multiple of 3, so that a result like
(1.2+/-0.1)e+04
appears as
(12+/-1)e+03
and can be quickly interpreted, for example, as 12 +/- 1 kHz.
How the rounding should happen is not totally unambiguous. For example, should 1.2, 12, 120 or 0.12, 1.2, 12 be the preferred decimal representations? I'd suggest that this somehow be optional.