-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
converting panstarrs read_hipscat to read_hats #382
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments and questions, mostly for cleanup purposes.
# !pip install -r requirements_light_curve_generator.txt | ||
!pip install -r requirements_light_curve_generator.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make CI happy :)
The other failures have already been fixed on the default branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything we can do about this because I need it to run the notebooks, and you need it not to be there for CI? Somehow remembering to comment it out isn't sticking in my brain. Can we wrap this in an if statement or something? It is not worth a lot of effort, but if there is a workaround that keeps you from having to remind me every single time, that might be worth it. And sorry I can't seem to remember this.
display_name: science_demo | ||
display_name: notebook | ||
language: python | ||
name: conda-env-science_demo-py | ||
name: python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these the new names on fornax?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't know, this all happens for me in the background when I open a notebook on Fornax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, we can change them, python3
was the default for CI anyway, but it didn't make Fornax happy back then.
|
||
```{code-cell} ipython3 | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for the empty cell at the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
lsdb | ||
lsdb==0.4.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a minimum requirement or strictly only this version works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do I test this? It doesn't work without the ==0.4.5 which is the current version. Is it best practice to make this a >= and assume future versions will work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeap, add >=
as we go with the assumption future versions work, too. If not, then we introduce an upper pin or change our code.
@@ -4,7 +4,7 @@ | |||
import lsdb | |||
from dask.distributed import Client | |||
from data_structures import MultiIndexDFObject | |||
|
|||
from upath import UPath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this coming from? Do we really need the extra dependency? (upath is not a direct dependency until this point)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think lsdb.read_hats() will be able to deal with raw s3 strings, so no need to wrap it up as a path object, directly use the urls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now required by lsdb
. Sending a raw string for s3 now throws an error:
FileNotFoundError: Failed to read HATS at location s3://stpubdata/panstarrs/ps1/public/hats/otmo
It also will not accept a filesystem object, only a UPath
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Before I commented I tried it locally and it worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, I think it's an upstream issue, their API should be user friendly enough to accept reasonable inputs like an s3 uri string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the one todo here thus is to add universal_pathlib
to the requirements file as we directly use it (although it's already an indirect requirement via hats/lsdb)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened astronomy-commons/hats#462
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
astronomy-commons/hats#466 added support for submitting the s3 uri as a string when only an anonymous connection is needed. So this kind of call should work now:
hats.read_hats('s3://stpubdata/panstarrs/ps1/public/hats/otmo')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It hasn't been released though, so I would just open a reminder issue to do the cleanup once we have it installed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, thanks. I see you opened #383.
"nStackDetections", # some other data to use | ||
] | ||
) | ||
panstarrs_object = lsdb.read_hats(UPath('s3://stpubdata/panstarrs/ps1/public/hats/otmo', anon=True), margin_cache=UPath('s3://stpubdata/panstarrs/ps1/public/hats/otmo_10arcs', anon=True), columns=[ "objID", # PS1 ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please cut up this very long line (and the other similar one below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good resource on how to do this? I tried cutting these up and ended up with syntax errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
I think I have addressed all comments, so merging this now. |
note: the failing CI doesn't seem to be related to the changes here, and at least a fraction of them is fixed in |
This is #375 |
panstarrs module in light_curve_generator was broken maybe because read_hipscat is gone, or the hipscat files themselves were gone. So, the switch to hats format is in this PR.
I mostly just changed lsdb.read_hipscat to lsdb.read_hats. Some warnings pop up now in this panstarrs module, so I have added a line of text to warn people that the warnings are expected.
This closes #380
This also closes #376