-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Finishing the manipulatons.py Rewrite #47
base: release_0.9.0
Are you sure you want to change the base?
Conversation
…data distributions with analytical solutions. This file will eventually replace utest_averaging.py
…ng changes to the unit tests
…l binning upgrade easier
…heir respective parent classes: _Slab and _Sector (which will be removed soon)
…fers a generalised method of calculating the directional average. All the averagers from the old manipulations.py have been rewritten to use the DirecitonalAverage class.
…test. The old SectorPhi now links to WedgePhi.
…been identical anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good. Locally, I am seeing two errors during testing related to zero division, but the ci succeeds, so likely a local environment issue. I think replacing the scipy requirement with something already included in the package would be good, but not a show-stopper.
Keeping the existing manipulations file is a good idea now that I've had a minute to think about it, but a deprecation message would be helpful. That is something that can be added later.
My fractional binning stuff definitely needs scipy
…On Wed, 27 Sept 2023, 19:13 Jeff Krzywon, ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
On sasdata/data_util/new_manipulations.py
<#47 (comment)>:
Disregard. See my main note. Keeping this is good for backward
compatibility.
—
Reply to this email directly, view it on GitHub
<#47 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACKU4SU644FHYZIDH4AOI43X4RUF7ANCNFSM6AAAAAA43YDHVE>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
Required for 6.0.0 |
@lucas-wilkins will take a look at it |
… into 46-manipulations-dot-py-rewrite
@smk78 will try to take a look |
Testing https://github.com/SasView/sasview/actions/runs/7079272736 on W10/x64: After loading 2D datasets (one .dat, one .h5) & doing Create New > Right-click > Slicers, I see:
But files (obviously) have integer numbers of bins.
More to follow... |
This tool is also persistent; ie, once it has appeared, if you change the slicer it is not cleared on the plot. Also, weirdly, the top 14% of the 2D image gets erased with this slicer!
But in the Slicer Parameters box they are called: And the plot labels are: I think some consistency of naming would be helpful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have performed functionality review on W10/x64. Unfortunately, there are some issues. Please see comments on this PR.
Are you sure its not 18% |
Note to @krzywon - I'm working on this now, that's why everything is failing. Will finish fixing soon. |
I was going by the ticks on the axis! A seventh being ~14%. |
Description
This pull request covers the last part of my summer project: the
manipulations.py
rewrite. It also includes the new analytical unit tests, from branch44-new-unit-tests-for-slicer-averagers
, except in this branch they point towards the new manipulations module instead of the old one.The details of the
manipulations.py
rewrite are as follows:DirectionalAverage
class for the bulk of their computation. This too reduces repeated code and increases maintainability.nbins
parameter now, whereas before some neededbin_width
instead.The main functionality change is how
DirectionalAverage
considers the weights of the datapoints. In both the new and oldmanipulations.py
these weights are either 0 or 1, though now it should be a lot easier to implement fractional weights. Previously, the weight assigned to a datapoint was calculated on-the-fly as part of a for loop which ran over each datapoint and was also responsible for part of the averaging process. The new methodology separates out the weighting and averaging processes, creating ann
xm
weights matrix wheren
is the number of bins the data is sorted into, andm
is the number of datapoints. The averaging is then done using matrix multiplication and numpy functions. This new method will certainly be more memory heavy than the old method, and it could be slower for large datasets. My hope is however that the switch from python functions to numpy functions will mitigate this, and the improvement to the quality of results once fractional weights are added should more than justify the resource usage.Dependencies needed: scipy - needed for new unit tests.
Note that merging this pull request requires that a sister pull request in the sasview repo also be merged: SasView/sasview#2615
Fixes #46 (issue announcing rewrite plan) and fixes #44 (issue about dodgy unit tests) and fixes #43 (wedge averaging had issues with full-circle ROIs).
How Has This Been Tested?
There are two ways this rewrite has been tested. The less rigorous method involved using this branch plus the sasview branch mentioned above to confirm that the plotted results look the same before and after using the same data file, slicer and slicer parameters. The more rigorous method involves the new unit tests. When the file
utest_averaging_analytical.py
from the branch44-new-unit-tests-for-slicer-averagers
is run, you see that averagers from the previous version ofmanipulations.py
all pass these tests. The version ofutest_averaging_analytical.py
included in this branch tests the new averagers, and these also pass the tests. Note there are other minor changes made to the tests for compatibility with the new averagers.Work still to be done
At a superficial level, there are some files which need renaming. The new version of
manipulations.py
goes by the namenew_manipulations.py
. Once everyone is satisfied with this new version, it should be renamed to replace the old one. There are references tonew_manipulations
inutest_averaging_analytical.py
on the sasdata side, as well as inPlotter2D.py
,AnnulusSlicer.py
,BoxSlicer.py
,BoxSum.py
,WedgeSlicer.py
, andSectorSlicer.py
on the sasview side. These should also be changed accordingly.There is also some more serious work to do before this rewrite can truly be considered complete. For the most part we've already reached feature parity with the old
manipulations.py
, but there are some blind spots which need acknowledging.manipulations.py
, the_Sector
class had the option to bin the data logarithmically. As far as I can tell this functionality was never called upon by sasview. There may be users who use it in custom scripts however, so it would still be worth implementing here. As a bonus, the feature would now be available to all slicers.manipulations.py
,CircularAverage
and_Sector
call a function by the name ofget_dq_data
. I don't quite understand the maths, but I get the impression from Lucas that it doesn't function as it should. The last thing needed to complete this rewrite would be proper consideration of thedqx_data
anddqy_data
properties of the supplied Data2D object to compute thedx
property of the returned Data1D object. The maths involved here is a little outside my comfort zone, so I'd be happy to delegate this responsibility to someone else.Review Checklist (please remove items if they don't apply):