Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueError raised by reduce_chunk in PyActiveStorage - induced by a chunk of all zeros being treated as all missing? #194

Closed
bnlawrence opened this issue Mar 24, 2024 · 6 comments · Fixed by #196
Assignees
Labels
bug Something isn't working

Comments

@bnlawrence
Copy link
Collaborator

An attempt to do some work on small chunks using PyActiveStorage version 1 raises a ValueError if the values of the array in the chunk is all zeros.

(mampy5) [15:09:00:sci4 bnl]$Traceback (most recent call last):
 File "/home/users/lawrence/repos/PyActiveStorage/bnl/bnl_timeseries_runner.py", line 15, in <module>
  timeseries(location, b, v, t)
 File "/home/users/lawrence/repos/PyActiveStorage/bnl/bnl_timeseries_ral.py", line 65, in timeseries
  ts.append(active[i,0,0:960,:][0])
       ~~~~~~^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 213, in __getitem__
  return self._get_selection(index, __metric_name='selection 1 time (s)')
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 57, in timed
  result = method(self,*args, **kw)
       ^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 328, in _get_selection
  return self._from_storage(ds, indexer, array._chunks, out_shape, dtype, compressor, filters, drop_axes)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 401, in _from_storage
  out = method(out)
     ^^^^^^^^^^^
 File "/home/users/lawrence/.conda/envs/mampy5/lib/python3.12/site-packages/numpy/core/fromnumeric.py", line 2313, in sum
  return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/.conda/envs/mampy5/lib/python3.12/site-packages/numpy/core/fromnumeric.py", line 88, in _wrapreduction
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (180,) + inhomogeneous part.

Tracking this down I find the bug occurring here:

if method:
        if missing != (None, None, None, None):
            tmp = remove_missing(tmp, missing)
        # check on size of tmp; method(empty) returns nan
        if tmp.any():
            return method(tmp), tmp.size # wanted this one
        else:
            return tmp, None.  # wrong thing returned
    else:
        return tmp, None

because although in this case the method is sum, the check on tmp.any() is returning False because tmp is all zeros, and so we get execution going through the line with #wrong thing returned. I think this test using tmp.any is not doing what the author intended. We need another solution.

@bnlawrence bnlawrence added the bug Something isn't working label Mar 25, 2024
@bnlawrence
Copy link
Collaborator Author

I think the new unit test should ensure that some chunks are all zeros ... but some are not ... and we get the right answer for version 1 and 2 ... and that it does raise this error with the current code.

@davidhassell
Copy link
Collaborator

np.ma.count(tmp) could be the answer.

@bnlawrence
Copy link
Collaborator Author

bnlawrence commented Mar 25, 2024

There are three issues in play here:

  1. Even if we go through the "wrong" thing route, we want it to return one missing value and a zero count, not a missing array.
  2. When they are all zeros, we want it to give the right answer.
  3. It occurs in both reduce_chunk and reduce_opens3_chunk

I think your suggestion will address the second ...

@davidhassell
Copy link
Collaborator

Your write right,. I should have said if tmp.size: could be the answer! (I hadn't realized that the tmp is np.ma.compressed).

@davidhassell
Copy link
Collaborator

A small put important change pointed out by @bnlawrence that should have made it int the original PR: #197

@valeriupredoi
Copy link
Collaborator

closed via #197 - very good catch and fix, gents 🍺

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants